* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-05-09 12:36 Mike Pagano
From: Mike Pagano @ 2023-05-09 12:36 UTC
To: gentoo-commits
commit: 1ca22dd1b6489786de3fdc6fee2514082b263993
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue May 9 12:31:07 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue May 9 12:35:39 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=1ca22dd1
Remove patch on security/selinux/Kconfig
As CONFIG_SECURITY_SELINUX_DISABLE was removed upstream,
remove our corresponding patch on it
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
4567_distro-Gentoo-Kconfig.patch | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/4567_distro-Gentoo-Kconfig.patch b/4567_distro-Gentoo-Kconfig.patch
index 9cb1eb0c..bd7b76ca 100644
--- a/4567_distro-Gentoo-Kconfig.patch
+++ b/4567_distro-Gentoo-Kconfig.patch
@@ -300,18 +300,6 @@
+ See the settings that become available for more details and fine-tuning.
+
+endmenu
-diff --git a/security/selinux/Kconfig b/security/selinux/Kconfig
-index 9e921fc72..f29bc13fa 100644
---- a/security/selinux/Kconfig
-+++ b/security/selinux/Kconfig
-@@ -26,6 +26,7 @@ config SECURITY_SELINUX_BOOTPARAM
- config SECURITY_SELINUX_DISABLE
- bool "NSA SELinux runtime disable"
- depends on SECURITY_SELINUX
-+ depends on !GENTOO_KERNEL_SELF_PROTECTION
- select SECURITY_WRITABLE_HOOKS
- default n
- help
--
2.31.1
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-05-09 12:38 Mike Pagano
From: Mike Pagano @ 2023-05-09 12:38 UTC
To: gentoo-commits
commit: cf50677ae64b3a639f18d380c84cd142f86330c3
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue May 9 12:36:37 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue May 9 12:36:37 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=cf50677a
Create the 6.4 branch with genpatches
Bluetooth: Check key sizes only when Secure Simple Pairing is
enabled. See bug #686758
tmp513 requires REGMAP_I2C to build. Select it by default in Kconfig.
See bug #710790. Thanks to Phil Stracchino
sign-file: full functionality with modern LibreSSL
Kernel Self Protection patch
CPU Optimization patch
Print firmware info (Reqs CONFIG_GENTOO_PRINT_FIRMWARE_INFO)
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 36 +
1500_XATTR_USER_PREFIX.patch | 66 ++
...ble-link-security-restrictions-by-default.patch | 17 +
1700_sparc-address-warray-bound-warnings.patch | 17 +
...zes-only-if-Secure-Simple-Pairing-enabled.patch | 37 +
...3-Fix-build-issue-by-selecting-CONFIG_REG.patch | 30 +
2910_bfp-mark-get-entry-ip-as--maybe-unused.patch | 11 +
2920_sign-file-patch-for-libressl.patch | 16 +
3000_Support-printing-firmware-info.patch | 14 +
5010_enable-cpu-optimizations-universal.patch | 789 +++++++++++++++++++++
10 files changed, 1033 insertions(+)
diff --git a/0000_README b/0000_README
index 90189932..8bb95e22 100644
--- a/0000_README
+++ b/0000_README
@@ -43,6 +43,42 @@ EXPERIMENTAL
Individual Patch Descriptions:
--------------------------------------------------------------------------
+Patch: 1500_XATTR_USER_PREFIX.patch
+From: https://bugs.gentoo.org/show_bug.cgi?id=470644
+Desc: Support for namespace user.pax.* on tmpfs.
+
+Patch: 1510_fs-enable-link-security-restrictions-by-default.patch
+From: http://sources.debian.net/src/linux/3.16.7-ckt4-3/debian/patches/debian/fs-enable-link-security-restrictions-by-default.patch/
+Desc: Enable link security restrictions by default.
+
+Patch: 1700_sparc-address-warray-bound-warnings.patch
+From: https://github.com/KSPP/linux/issues/109
+Desc: Address -Warray-bounds warnings
+
+Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
+From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
+Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
+
+Patch: 2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch
+From: https://bugs.gentoo.org/710790
+Desc: tmp513 requires REGMAP_I2C to build. Select it by default in Kconfig. See bug #710790. Thanks to Phil Stracchino
+
+Patch: 2910_bfp-mark-get-entry-ip-as--maybe-unused.patch
+From: https://www.spinics.net/lists/stable/msg604665.html
+Desc: bpf: mark get_entry_ip as __maybe_unused
+
+Patch: 2920_sign-file-patch-for-libressl.patch
+From: https://bugs.gentoo.org/717166
+Desc: sign-file: full functionality with modern LibreSSL
+
+Patch: 3000_Support-printing-firmware-info.patch
+From: https://bugs.gentoo.org/732852
+Desc: Print firmware info (Reqs CONFIG_GENTOO_PRINT_FIRMWARE_INFO). Thanks to Georgy Yakovlev
+
Patch: 4567_distro-Gentoo-Kconfig.patch
From: Tom Wijsman <TomWij@gentoo.org>
Desc: Add Gentoo Linux support config settings and defaults.
+
+Patch: 5010_enable-cpu-optimizations-universal.patch
+From: https://github.com/graysky2/kernel_compiler_patch
+Desc: Kernel >= 5.15 patch enables gcc >= v11.1 optimizations for additional CPUs.
diff --git a/1500_XATTR_USER_PREFIX.patch b/1500_XATTR_USER_PREFIX.patch
new file mode 100644
index 00000000..fac3eed7
--- /dev/null
+++ b/1500_XATTR_USER_PREFIX.patch
@@ -0,0 +1,66 @@
+From: Anthony G. Basile <blueness@gentoo.org>
+
+This patch adds support for a restricted user-controlled namespace on
+tmpfs filesystem used to house PaX flags. The namespace must be of the
+form user.pax.* and its value cannot exceed a size of 8 bytes.
+
+This is needed even on all Gentoo systems so that XATTR_PAX flags
+are preserved for users who might build packages using portage on
+a tmpfs system with a non-hardened kernel and then switch to a
+hardened kernel with XATTR_PAX enabled.
+
+The namespace is added to any user with Extended Attribute support
+enabled for tmpfs. Users who do not enable xattrs will not have
+the XATTR_PAX flags preserved.
+
+
+--- a/include/uapi/linux/xattr.h 2022-11-22 05:56:58.175733644 -0500
++++ b/include/uapi/linux/xattr.h 2022-11-22 06:04:26.394834989 -0500
+@@ -81,5 +81,9 @@
+ #define XATTR_POSIX_ACL_DEFAULT "posix_acl_default"
+ #define XATTR_NAME_POSIX_ACL_DEFAULT XATTR_SYSTEM_PREFIX XATTR_POSIX_ACL_DEFAULT
+
++/* User namespace */
++#define XATTR_PAX_PREFIX XATTR_USER_PREFIX "pax."
++#define XATTR_PAX_FLAGS_SUFFIX "flags"
++#define XATTR_NAME_PAX_FLAGS XATTR_PAX_PREFIX XATTR_PAX_FLAGS_SUFFIX
+
+ #endif /* _UAPI_LINUX_XATTR_H */
+--- a/mm/shmem.c 2022-11-22 05:57:29.011626215 -0500
++++ b/mm/shmem.c 2022-11-22 06:03:33.165939400 -0500
+@@ -3297,6 +3297,14 @@ static int shmem_xattr_handler_set(const
+ struct shmem_inode_info *info = SHMEM_I(inode);
+ int err;
+
++
++ if (!strncmp(name, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN)) {
++ if (strcmp(name, XATTR_NAME_PAX_FLAGS))
++ return -EOPNOTSUPP;
++ if (size > 8)
++ return -EINVAL;
++ }
++
+ name = xattr_full_name(handler, name);
+ err = simple_xattr_set(&info->xattrs, name, value, size, flags, NULL);
+ if (!err) {
+@@ -3312,6 +3320,12 @@ static const struct xattr_handler shmem_
+ .set = shmem_xattr_handler_set,
+ };
+
++static const struct xattr_handler shmem_user_xattr_handler = {
++ .prefix = XATTR_USER_PREFIX,
++ .get = shmem_xattr_handler_get,
++ .set = shmem_xattr_handler_set,
++};
++
+ static const struct xattr_handler shmem_trusted_xattr_handler = {
+ .prefix = XATTR_TRUSTED_PREFIX,
+ .get = shmem_xattr_handler_get,
+@@ -3325,6 +3339,7 @@ static const struct xattr_handler *shmem
+ #endif
+ &shmem_security_xattr_handler,
+ &shmem_trusted_xattr_handler,
++ &shmem_user_xattr_handler,
+ NULL
+ };
+
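As an illustration of the rules described in the patch header above, and not part of the patch itself, a userspace program could set PaX flags on a tmpfs-backed file as sketched below. The file path and flag string are hypothetical; per the handler shown above, any other user.* name is rejected with EOPNOTSUPP and values larger than 8 bytes with EINVAL.

/* Sketch only: assumes a tmpfs mount with CONFIG_TMPFS_XATTR enabled and
 * an existing file at the hypothetical path below. */
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>

int main(void)
{
	const char *path  = "/tmp/example";   /* hypothetical file on tmpfs */
	const char *flags = "em";             /* PaX flags, well under the 8-byte limit */

	/* Accepted by the handler above: name is user.pax.flags, value <= 8 bytes. */
	if (setxattr(path, "user.pax.flags", flags, strlen(flags), 0) != 0) {
		perror("setxattr");
		return 1;
	}
	return 0;
}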
diff --git a/1510_fs-enable-link-security-restrictions-by-default.patch b/1510_fs-enable-link-security-restrictions-by-default.patch
new file mode 100644
index 00000000..e8c30157
--- /dev/null
+++ b/1510_fs-enable-link-security-restrictions-by-default.patch
@@ -0,0 +1,17 @@
+--- a/fs/namei.c 2022-01-23 13:02:27.876558299 -0500
++++ b/fs/namei.c 2022-03-06 12:47:39.375719693 -0500
+@@ -1020,10 +1020,10 @@ static inline void put_link(struct namei
+ path_put(&last->link);
+ }
+
+-static int sysctl_protected_symlinks __read_mostly;
+-static int sysctl_protected_hardlinks __read_mostly;
+-static int sysctl_protected_fifos __read_mostly;
+-static int sysctl_protected_regular __read_mostly;
++static int sysctl_protected_symlinks __read_mostly = 1;
++static int sysctl_protected_hardlinks __read_mostly = 1;
++int sysctl_protected_fifos __read_mostly = 1;
++int sysctl_protected_regular __read_mostly = 1;
+
+ #ifdef CONFIG_SYSCTL
+ static struct ctl_table namei_sysctls[] = {
diff --git a/1700_sparc-address-warray-bound-warnings.patch b/1700_sparc-address-warray-bound-warnings.patch
new file mode 100644
index 00000000..f9393555
--- /dev/null
+++ b/1700_sparc-address-warray-bound-warnings.patch
@@ -0,0 +1,17 @@
+--- a/arch/sparc/mm/init_64.c 2022-05-24 16:48:40.749677491 -0400
++++ b/arch/sparc/mm/init_64.c 2022-05-24 16:55:15.511356945 -0400
+@@ -3052,11 +3052,11 @@ static inline resource_size_t compute_ke
+ static void __init kernel_lds_init(void)
+ {
+ code_resource.start = compute_kern_paddr(_text);
+- code_resource.end = compute_kern_paddr(_etext - 1);
++ code_resource.end = compute_kern_paddr(_etext) - 1;
+ data_resource.start = compute_kern_paddr(_etext);
+- data_resource.end = compute_kern_paddr(_edata - 1);
++ data_resource.end = compute_kern_paddr(_edata) - 1;
+ bss_resource.start = compute_kern_paddr(__bss_start);
+- bss_resource.end = compute_kern_paddr(_end - 1);
++ bss_resource.end = compute_kern_paddr(_end) - 1;
+ }
+
+ static int __init report_memory(void)
diff --git a/2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch b/2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
new file mode 100644
index 00000000..394ad48f
--- /dev/null
+++ b/2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
@@ -0,0 +1,37 @@
+The encryption is only mandatory to be enforced when both sides are using
+Secure Simple Pairing and this means the key size check only makes sense
+in that case.
+
+On legacy Bluetooth 2.0 and earlier devices like mice the encryption was
+optional and thus causing an issue if the key size check is not bound to
+using Secure Simple Pairing.
+
+Fixes: d5bb334a8e17 ("Bluetooth: Align minimum encryption key size for LE and BR/EDR connections")
+Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
+Cc: stable@vger.kernel.org
+---
+ net/bluetooth/hci_conn.c | 9 +++++++--
+ 1 file changed, 7 insertions(+), 2 deletions(-)
+
+diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
+index 3cf0764d5793..7516cdde3373 100644
+--- a/net/bluetooth/hci_conn.c
++++ b/net/bluetooth/hci_conn.c
+@@ -1272,8 +1272,13 @@ int hci_conn_check_link_mode(struct hci_conn *conn)
+ return 0;
+ }
+
+- if (hci_conn_ssp_enabled(conn) &&
+- !test_bit(HCI_CONN_ENCRYPT, &conn->flags))
++ /* If Secure Simple Pairing is not enabled, then legacy connection
++ * setup is used and no encryption or key sizes can be enforced.
++ */
++ if (!hci_conn_ssp_enabled(conn))
++ return 1;
++
++ if (!test_bit(HCI_CONN_ENCRYPT, &conn->flags))
+ return 0;
+
+ /* The minimum encryption key size needs to be enforced by the
+--
+2.20.1
diff --git a/2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch b/2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch
new file mode 100644
index 00000000..43356857
--- /dev/null
+++ b/2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch
@@ -0,0 +1,30 @@
+From dc328d75a6f37f4ff11a81ae16b1ec88c3197640 Mon Sep 17 00:00:00 2001
+From: Mike Pagano <mpagano@gentoo.org>
+Date: Mon, 23 Mar 2020 08:20:06 -0400
+Subject: [PATCH 1/1] This driver requires REGMAP_I2C to build. Select it by
+ default in Kconfig. Reported at gentoo bugzilla:
+ https://bugs.gentoo.org/710790
+Cc: mpagano@gentoo.org
+
+Reported-by: Phil Stracchino <phils@caerllewys.net>
+
+Signed-off-by: Mike Pagano <mpagano@gentoo.org>
+---
+ drivers/hwmon/Kconfig | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
+index 47ac20aee06f..530b4f29ba85 100644
+--- a/drivers/hwmon/Kconfig
++++ b/drivers/hwmon/Kconfig
+@@ -1769,6 +1769,7 @@ config SENSORS_TMP421
+ config SENSORS_TMP513
+ tristate "Texas Instruments TMP513 and compatibles"
+ depends on I2C
++ select REGMAP_I2C
+ help
+ If you say yes here you get support for Texas Instruments TMP512,
+ and TMP513 temperature and power supply sensor chips.
+--
+2.24.1
+
diff --git a/2910_bfp-mark-get-entry-ip-as--maybe-unused.patch b/2910_bfp-mark-get-entry-ip-as--maybe-unused.patch
new file mode 100644
index 00000000..a75b90c8
--- /dev/null
+++ b/2910_bfp-mark-get-entry-ip-as--maybe-unused.patch
@@ -0,0 +1,11 @@
+--- a/kernel/trace/bpf_trace.c 2022-11-09 13:30:24.192940988 -0500
++++ b/kernel/trace/bpf_trace.c 2022-11-09 13:30:59.029810818 -0500
+@@ -1027,7 +1027,7 @@ static const struct bpf_func_proto bpf_g
+ };
+
+ #ifdef CONFIG_X86_KERNEL_IBT
+-static unsigned long get_entry_ip(unsigned long fentry_ip)
++static unsigned long __maybe_unused get_entry_ip(unsigned long fentry_ip)
+ {
+ u32 instr;
+
diff --git a/2920_sign-file-patch-for-libressl.patch b/2920_sign-file-patch-for-libressl.patch
new file mode 100644
index 00000000..e6ec017d
--- /dev/null
+++ b/2920_sign-file-patch-for-libressl.patch
@@ -0,0 +1,16 @@
+--- a/scripts/sign-file.c 2020-05-20 18:47:21.282820662 -0400
++++ b/scripts/sign-file.c 2020-05-20 18:48:37.991081899 -0400
+@@ -41,9 +41,10 @@
+ * signing with anything other than SHA1 - so we're stuck with that if such is
+ * the case.
+ */
+-#if defined(LIBRESSL_VERSION_NUMBER) || \
+- OPENSSL_VERSION_NUMBER < 0x10000000L || \
+- defined(OPENSSL_NO_CMS)
++#if defined(OPENSSL_NO_CMS) || \
++ ( defined(LIBRESSL_VERSION_NUMBER) \
++ && (LIBRESSL_VERSION_NUMBER < 0x3010000fL) ) || \
++ OPENSSL_VERSION_NUMBER < 0x10000000L
+ #define USE_PKCS7
+ #endif
+ #ifndef USE_PKCS7
diff --git a/3000_Support-printing-firmware-info.patch b/3000_Support-printing-firmware-info.patch
new file mode 100644
index 00000000..a630cfbe
--- /dev/null
+++ b/3000_Support-printing-firmware-info.patch
@@ -0,0 +1,14 @@
+--- a/drivers/base/firmware_loader/main.c 2021-08-24 15:42:07.025482085 -0400
++++ b/drivers/base/firmware_loader/main.c 2021-08-24 15:44:40.782975313 -0400
+@@ -809,6 +809,11 @@ _request_firmware(const struct firmware
+
+ ret = _request_firmware_prepare(&fw, name, device, buf, size,
+ offset, opt_flags);
++
++#ifdef CONFIG_GENTOO_PRINT_FIRMWARE_INFO
++ printk(KERN_NOTICE "Loading firmware: %s\n", name);
++#endif
++
+ if (ret <= 0) /* error or already assigned */
+ goto out;
+
diff --git a/5010_enable-cpu-optimizations-universal.patch b/5010_enable-cpu-optimizations-universal.patch
new file mode 100644
index 00000000..7a1b717a
--- /dev/null
+++ b/5010_enable-cpu-optimizations-universal.patch
@@ -0,0 +1,789 @@
+From 70d4906b87983ed2ed5da78930a701625d881dd0 Mon Sep 17 00:00:00 2001
+From: graysky <therealgraysky@proton.me>
+Date: Thu, 5 Jan 2023 14:29:37 -0500
+
+FEATURES
+This patch adds additional CPU options to the Linux kernel accessible under:
+ Processor type and features --->
+ Processor family --->
+
+With the release of gcc 11.1 and clang 12.0, several generic 64-bit levels are
+offered which are good for supported Intel or AMD CPUs:
+• x86-64-v2
+• x86-64-v3
+• x86-64-v4
+
+Users of glibc 2.33 and above can see which level is supported by current
+hardware by running:
+ /lib/ld-linux-x86-64.so.2 --help | grep supported
+
+Alternatively, compare the flags from /proc/cpuinfo to this list.[1]
+
+CPU-specific microarchitectures include:
+• AMD Improved K8-family
+• AMD K10-family
+• AMD Family 10h (Barcelona)
+• AMD Family 14h (Bobcat)
+• AMD Family 16h (Jaguar)
+• AMD Family 15h (Bulldozer)
+• AMD Family 15h (Piledriver)
+• AMD Family 15h (Steamroller)
+• AMD Family 15h (Excavator)
+• AMD Family 17h (Zen)
+• AMD Family 17h (Zen 2)
+• AMD Family 19h (Zen 3)†
+• AMD Family 19h (Zen 4)§
+• Intel Silvermont low-power processors
+• Intel Goldmont low-power processors (Apollo Lake and Denverton)
+• Intel Goldmont Plus low-power processors (Gemini Lake)
+• Intel 1st Gen Core i3/i5/i7 (Nehalem)
+• Intel 1.5 Gen Core i3/i5/i7 (Westmere)
+• Intel 2nd Gen Core i3/i5/i7 (Sandybridge)
+• Intel 3rd Gen Core i3/i5/i7 (Ivybridge)
+• Intel 4th Gen Core i3/i5/i7 (Haswell)
+• Intel 5th Gen Core i3/i5/i7 (Broadwell)
+• Intel 6th Gen Core i3/i5/i7 (Skylake)
+• Intel 6th Gen Core i7/i9 (Skylake X)
+• Intel 8th Gen Core i3/i5/i7 (Cannon Lake)
+• Intel 10th Gen Core i7/i9 (Ice Lake)
+• Intel Xeon (Cascade Lake)
+• Intel Xeon (Cooper Lake)*
+• Intel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake)*
+• Intel 4th Gen 10nm++ Xeon (Sapphire Rapids)‡
+• Intel 11th Gen i3/i5/i7/i9-family (Rocket Lake)‡
+• Intel 12th Gen i3/i5/i7/i9-family (Alder Lake)‡
+• Intel 13th Gen i3/i5/i7/i9-family (Raptor Lake)§
+• Intel 14th Gen i3/i5/i7/i9-family (Meteor Lake)§
+• Intel 5th Gen 10nm++ Xeon (Emerald Rapids)§
+
+Notes: If not otherwise noted, gcc >=9.1 is required for support.
+ *Requires gcc >=10.1 or clang >=10.0
+ †Required gcc >=10.3 or clang >=12.0
+ ‡Required gcc >=11.1 or clang >=12.0
+ §Required gcc >=13.0 or clang >=15.0.5
+
+It also offers to compile passing the 'native' option which, "selects the CPU
+to generate code for at compilation time by determining the processor type of
+the compiling machine. Using -march=native enables all instruction subsets
+supported by the local machine and will produce code optimized for the local
+machine under the constraints of the selected instruction set."[2]
+
+Users of Intel CPUs should select the 'Intel-Native' option and users of AMD
+CPUs should select the 'AMD-Native' option.
+
+MINOR NOTES RELATING TO INTEL ATOM PROCESSORS
+This patch also changes -march=atom to -march=bonnell in accordance with the
+gcc v4.9 changes. Upstream is using the deprecated -march=atom flag when I
+believe it should use the newer -march=bonnell flag for atom processors.[3]
+
+It is not recommended to compile on Atom-CPUs with the 'native' option.[4] The
+recommendation is to use the 'atom' option instead.
+
+BENEFITS
+Small but real speed increases are measurable using a make endpoint comparing
+a generic kernel to one built with one of the respective microarchs.
+
+See the following experimental evidence supporting this statement:
+https://github.com/graysky2/kernel_gcc_patch
+
+REQUIREMENTS
+linux version 5.17+
+gcc version >=9.0 or clang version >=9.0
+
+ACKNOWLEDGMENTS
+This patch builds on the seminal work by Jeroen.[5]
+
+REFERENCES
+1. https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
+2. https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-x86-Options
+3. https://bugzilla.kernel.org/show_bug.cgi?id=77461
+4. https://github.com/graysky2/kernel_gcc_patch/issues/15
+5. http://www.linuxforge.net/docs/linux/linux-gcc.php
+---
+ arch/x86/Kconfig.cpu | 427 ++++++++++++++++++++++++++++++--
+ arch/x86/Makefile | 44 +++-
+ arch/x86/include/asm/vermagic.h | 74 ++++++
+ 3 files changed, 528 insertions(+), 17 deletions(-)
+
+diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
+index 542377cd419d..f589971df2d3 100644
+--- a/arch/x86/Kconfig.cpu
++++ b/arch/x86/Kconfig.cpu
+@@ -157,7 +157,7 @@ config MPENTIUM4
+
+
+ config MK6
+- bool "K6/K6-II/K6-III"
++ bool "AMD K6/K6-II/K6-III"
+ depends on X86_32
+ help
+ Select this for an AMD K6-family processor. Enables use of
+@@ -165,7 +165,7 @@ config MK6
+ flags to GCC.
+
+ config MK7
+- bool "Athlon/Duron/K7"
++ bool "AMD Athlon/Duron/K7"
+ depends on X86_32
+ help
+ Select this for an AMD Athlon K7-family processor. Enables use of
+@@ -173,12 +173,106 @@ config MK7
+ flags to GCC.
+
+ config MK8
+- bool "Opteron/Athlon64/Hammer/K8"
++ bool "AMD Opteron/Athlon64/Hammer/K8"
+ help
+ Select this for an AMD Opteron or Athlon64 Hammer-family processor.
+ Enables use of some extended instructions, and passes appropriate
+ optimization flags to GCC.
+
++config MK8SSE3
++ bool "AMD Opteron/Athlon64/Hammer/K8 with SSE3"
++ help
++ Select this for improved AMD Opteron or Athlon64 Hammer-family processors.
++ Enables use of some extended instructions, and passes appropriate
++ optimization flags to GCC.
++
++config MK10
++ bool "AMD 61xx/7x50/PhenomX3/X4/II/K10"
++ help
++ Select this for an AMD 61xx Eight-Core Magny-Cours, Athlon X2 7x50,
++ Phenom X3/X4/II, Athlon II X2/X3/X4, or Turion II-family processor.
++ Enables use of some extended instructions, and passes appropriate
++ optimization flags to GCC.
++
++config MBARCELONA
++ bool "AMD Barcelona"
++ help
++ Select this for AMD Family 10h Barcelona processors.
++
++ Enables -march=barcelona
++
++config MBOBCAT
++ bool "AMD Bobcat"
++ help
++ Select this for AMD Family 14h Bobcat processors.
++
++ Enables -march=btver1
++
++config MJAGUAR
++ bool "AMD Jaguar"
++ help
++ Select this for AMD Family 16h Jaguar processors.
++
++ Enables -march=btver2
++
++config MBULLDOZER
++ bool "AMD Bulldozer"
++ help
++ Select this for AMD Family 15h Bulldozer processors.
++
++ Enables -march=bdver1
++
++config MPILEDRIVER
++ bool "AMD Piledriver"
++ help
++ Select this for AMD Family 15h Piledriver processors.
++
++ Enables -march=bdver2
++
++config MSTEAMROLLER
++ bool "AMD Steamroller"
++ help
++ Select this for AMD Family 15h Steamroller processors.
++
++ Enables -march=bdver3
++
++config MEXCAVATOR
++ bool "AMD Excavator"
++ help
++ Select this for AMD Family 15h Excavator processors.
++
++ Enables -march=bdver4
++
++config MZEN
++ bool "AMD Zen"
++ help
++ Select this for AMD Family 17h Zen processors.
++
++ Enables -march=znver1
++
++config MZEN2
++ bool "AMD Zen 2"
++ help
++ Select this for AMD Family 17h Zen 2 processors.
++
++ Enables -march=znver2
++
++config MZEN3
++ bool "AMD Zen 3"
++ depends on (CC_IS_GCC && GCC_VERSION >= 100300) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ help
++ Select this for AMD Family 19h Zen 3 processors.
++
++ Enables -march=znver3
++
++config MZEN4
++ bool "AMD Zen 4"
++ depends on (CC_IS_GCC && GCC_VERSION >= 130000) || (CC_IS_CLANG && CLANG_VERSION >= 160000)
++ help
++ Select this for AMD Family 19h Zen 4 processors.
++
++ Enables -march=znver4
++
+ config MCRUSOE
+ bool "Crusoe"
+ depends on X86_32
+@@ -270,7 +364,7 @@ config MPSC
+ in /proc/cpuinfo. Family 15 is an older Xeon, Family 6 a newer one.
+
+ config MCORE2
+- bool "Core 2/newer Xeon"
++ bool "Intel Core 2"
+ help
+
+ Select this for Intel Core 2 and newer Core 2 Xeons (Xeon 51xx and
+@@ -278,6 +372,8 @@ config MCORE2
+ family in /proc/cpuinfo. Newer ones have 6 and older ones 15
+ (not a typo)
+
++ Enables -march=core2
++
+ config MATOM
+ bool "Intel Atom"
+ help
+@@ -287,6 +383,212 @@ config MATOM
+ accordingly optimized code. Use a recent GCC with specific Atom
+ support in order to fully benefit from selecting this option.
+
++config MNEHALEM
++ bool "Intel Nehalem"
++ select X86_P6_NOP
++ help
++
++ Select this for 1st Gen Core processors in the Nehalem family.
++
++ Enables -march=nehalem
++
++config MWESTMERE
++ bool "Intel Westmere"
++ select X86_P6_NOP
++ help
++
++ Select this for the Intel Westmere formerly Nehalem-C family.
++
++ Enables -march=westmere
++
++config MSILVERMONT
++ bool "Intel Silvermont"
++ select X86_P6_NOP
++ help
++
++ Select this for the Intel Silvermont platform.
++
++ Enables -march=silvermont
++
++config MGOLDMONT
++ bool "Intel Goldmont"
++ select X86_P6_NOP
++ help
++
++ Select this for the Intel Goldmont platform including Apollo Lake and Denverton.
++
++ Enables -march=goldmont
++
++config MGOLDMONTPLUS
++ bool "Intel Goldmont Plus"
++ select X86_P6_NOP
++ help
++
++ Select this for the Intel Goldmont Plus platform including Gemini Lake.
++
++ Enables -march=goldmont-plus
++
++config MSANDYBRIDGE
++ bool "Intel Sandy Bridge"
++ select X86_P6_NOP
++ help
++
++ Select this for 2nd Gen Core processors in the Sandy Bridge family.
++
++ Enables -march=sandybridge
++
++config MIVYBRIDGE
++ bool "Intel Ivy Bridge"
++ select X86_P6_NOP
++ help
++
++ Select this for 3rd Gen Core processors in the Ivy Bridge family.
++
++ Enables -march=ivybridge
++
++config MHASWELL
++ bool "Intel Haswell"
++ select X86_P6_NOP
++ help
++
++ Select this for 4th Gen Core processors in the Haswell family.
++
++ Enables -march=haswell
++
++config MBROADWELL
++ bool "Intel Broadwell"
++ select X86_P6_NOP
++ help
++
++ Select this for 5th Gen Core processors in the Broadwell family.
++
++ Enables -march=broadwell
++
++config MSKYLAKE
++ bool "Intel Skylake"
++ select X86_P6_NOP
++ help
++
++ Select this for 6th Gen Core processors in the Skylake family.
++
++ Enables -march=skylake
++
++config MSKYLAKEX
++ bool "Intel Skylake X"
++ select X86_P6_NOP
++ help
++
++ Select this for 6th Gen Core processors in the Skylake X family.
++
++ Enables -march=skylake-avx512
++
++config MCANNONLAKE
++ bool "Intel Cannon Lake"
++ select X86_P6_NOP
++ help
++
++ Select this for 8th Gen Core processors
++
++ Enables -march=cannonlake
++
++config MICELAKE
++ bool "Intel Ice Lake"
++ select X86_P6_NOP
++ help
++
++ Select this for 10th Gen Core processors in the Ice Lake family.
++
++ Enables -march=icelake-client
++
++config MCASCADELAKE
++ bool "Intel Cascade Lake"
++ select X86_P6_NOP
++ help
++
++ Select this for Xeon processors in the Cascade Lake family.
++
++ Enables -march=cascadelake
++
++config MCOOPERLAKE
++ bool "Intel Cooper Lake"
++ depends on (CC_IS_GCC && GCC_VERSION > 100100) || (CC_IS_CLANG && CLANG_VERSION >= 100000)
++ select X86_P6_NOP
++ help
++
++ Select this for Xeon processors in the Cooper Lake family.
++
++ Enables -march=cooperlake
++
++config MTIGERLAKE
++ bool "Intel Tiger Lake"
++ depends on (CC_IS_GCC && GCC_VERSION > 100100) || (CC_IS_CLANG && CLANG_VERSION >= 100000)
++ select X86_P6_NOP
++ help
++
++ Select this for third-generation 10 nm process processors in the Tiger Lake family.
++
++ Enables -march=tigerlake
++
++config MSAPPHIRERAPIDS
++ bool "Intel Sapphire Rapids"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ select X86_P6_NOP
++ help
++
++ Select this for fourth-generation 10 nm process processors in the Sapphire Rapids family.
++
++ Enables -march=sapphirerapids
++
++config MROCKETLAKE
++ bool "Intel Rocket Lake"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ select X86_P6_NOP
++ help
++
++ Select this for eleventh-generation processors in the Rocket Lake family.
++
++ Enables -march=rocketlake
++
++config MALDERLAKE
++ bool "Intel Alder Lake"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ select X86_P6_NOP
++ help
++
++ Select this for twelfth-generation processors in the Alder Lake family.
++
++ Enables -march=alderlake
++
++config MRAPTORLAKE
++ bool "Intel Raptor Lake"
++ depends on (CC_IS_GCC && GCC_VERSION >= 130000) || (CC_IS_CLANG && CLANG_VERSION >= 150500)
++ select X86_P6_NOP
++ help
++
++ Select this for thirteenth-generation processors in the Raptor Lake family.
++
++ Enables -march=raptorlake
++
++config MMETEORLAKE
++ bool "Intel Meteor Lake"
++ depends on (CC_IS_GCC && GCC_VERSION >= 130000) || (CC_IS_CLANG && CLANG_VERSION >= 150500)
++ select X86_P6_NOP
++ help
++
++ Select this for fourteenth-generation processors in the Meteor Lake family.
++
++ Enables -march=meteorlake
++
++config MEMERALDRAPIDS
++ bool "Intel Emerald Rapids"
++ depends on (CC_IS_GCC && GCC_VERSION > 130000) || (CC_IS_CLANG && CLANG_VERSION >= 150500)
++ select X86_P6_NOP
++ help
++
++ Select this for fifth-generation 10 nm process processors in the Emerald Rapids family.
++
++ Enables -march=emeraldrapids
++
+ config GENERIC_CPU
+ bool "Generic-x86-64"
+ depends on X86_64
+@@ -294,6 +596,50 @@ config GENERIC_CPU
+ Generic x86-64 CPU.
+ Run equally well on all x86-64 CPUs.
+
++config GENERIC_CPU2
++ bool "Generic-x86-64-v2"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ depends on X86_64
++ help
++ Generic x86-64 CPU.
++ Run equally well on all x86-64 CPUs with min support of x86-64-v2.
++
++config GENERIC_CPU3
++ bool "Generic-x86-64-v3"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ depends on X86_64
++ help
++ Generic x86-64-v3 CPU with v3 instructions.
++ Run equally well on all x86-64 CPUs with min support of x86-64-v3.
++
++config GENERIC_CPU4
++ bool "Generic-x86-64-v4"
++ depends on (CC_IS_GCC && GCC_VERSION > 110000) || (CC_IS_CLANG && CLANG_VERSION >= 120000)
++ depends on X86_64
++ help
++ Generic x86-64 CPU with v4 instructions.
++ Run equally well on all x86-64 CPUs with min support of x86-64-v4.
++
++config MNATIVE_INTEL
++ bool "Intel-Native optimizations autodetected by the compiler"
++ help
++
++ Clang 3.8, GCC 4.2 and above support -march=native, which automatically detects
++ the optimum settings to use based on your processor. Do NOT use this
++ for AMD CPUs. Intel Only!
++
++ Enables -march=native
++
++config MNATIVE_AMD
++ bool "AMD-Native optimizations autodetected by the compiler"
++ help
++
++ Clang 3.8, GCC 4.2 and above support -march=native, which automatically detects
++ the optimum settings to use based on your processor. Do NOT use this
++ for Intel CPUs. AMD Only!
++
++ Enables -march=native
++
+ endchoice
+
+ config X86_GENERIC
+@@ -318,9 +664,17 @@ config X86_INTERNODE_CACHE_SHIFT
+ config X86_L1_CACHE_SHIFT
+ int
+ default "7" if MPENTIUM4 || MPSC
+- default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
++ default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || MK8SSE3 || MK10 \
++ || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER \
++ || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MZEN4 || MNEHALEM || MWESTMERE || MSILVERMONT \
++ || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL \
++ || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE \
++ || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE \
++ || MEMERALDRAPIDS || MNATIVE_INTEL || MNATIVE_AMD || X86_GENERIC || GENERIC_CPU || GENERIC_CPU2 \
++ || GENERIC_CPU3 || GENERIC_CPU4
+ default "4" if MELAN || M486SX || M486 || MGEODEGX1
+- default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
++ default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII \
++ || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
+
+ config X86_F00F_BUG
+ def_bool y
+@@ -332,15 +686,27 @@ config X86_INVD_BUG
+
+ config X86_ALIGNMENT_16
+ def_bool y
+- depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MELAN || MK6 || M586MMX || M586TSC || M586 || M486SX || M486 || MVIAC3_2 || MGEODEGX1
++ depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MELAN || MK6 || M586MMX || M586TSC \
++ || M586 || M486SX || M486 || MVIAC3_2 || MGEODEGX1
+
+ config X86_INTEL_USERCOPY
+ def_bool y
+- depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC || MK8 || MK7 || MEFFICEON || MCORE2
++ depends on MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M586MMX || X86_GENERIC \
++ || MK8 || MK7 || MEFFICEON || MCORE2 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT \
++ || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX \
++ || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS \
++ || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS || MNATIVE_INTEL
+
+ config X86_USE_PPRO_CHECKSUM
+ def_bool y
+- depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX || MCORE2 || MATOM
++ depends on MWINCHIP3D || MWINCHIPC6 || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM \
++ || MPENTIUMIII || MPENTIUMII || M686 || MK8 || MVIAC3_2 || MVIAC7 || MEFFICEON || MGEODE_LX \
++ || MCORE2 || MATOM || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER \
++ || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MZEN4 || MNEHALEM \
++ || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE \
++ || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE \
++ || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE \
++ || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS || MNATIVE_INTEL || MNATIVE_AMD
+
+ #
+ # P6_NOPs are a relatively minor optimization that require a family >=
+@@ -356,32 +722,63 @@ config X86_USE_PPRO_CHECKSUM
+ config X86_P6_NOP
+ def_bool y
+ depends on X86_64
+- depends on (MCORE2 || MPENTIUM4 || MPSC)
++ depends on (MCORE2 || MPENTIUM4 || MPSC || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT \
++ || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE \
++ || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE \
++ || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS \
++ || MNATIVE_INTEL)
+
+ config X86_TSC
+ def_bool y
+- depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 || MGEODE_LX || MCORE2 || MATOM) || X86_64
++ depends on (MWINCHIP3D || MCRUSOE || MEFFICEON || MCYRIXIII || MK7 || MK6 || MPENTIUM4 || MPENTIUMM \
++ || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || MK8 || MVIAC3_2 || MVIAC7 || MGEODEGX1 \
++ || MGEODE_LX || MCORE2 || MATOM || MK8SSE3 || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER \
++ || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MZEN4 || MNEHALEM \
++ || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL \
++ || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE \
++ || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS \
++ || MNATIVE_INTEL || MNATIVE_AMD) || X86_64
+
+ config X86_CMPXCHG64
+ def_bool y
+- depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || M586TSC || M586MMX || MATOM || MGEODE_LX || MGEODEGX1 || MK6 || MK7 || MK8
++ depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 \
++ || M586TSC || M586MMX || MATOM || MGEODE_LX || MGEODEGX1 || MK6 || MK7 || MK8 || MK8SSE3 || MK10 \
++ || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR || MZEN \
++ || MZEN2 || MZEN3 || MZEN4 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT || MGOLDMONTPLUS \
++ || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX || MCANNONLAKE \
++ || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE \
++ || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS || MNATIVE_INTEL || MNATIVE_AMD
+
+ # this should be set for all -march=.. options where the compiler
+ # generates cmov.
+ config X86_CMOV
+ def_bool y
+- depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX)
++ depends on (MK8 || MK7 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 \
++ || MVIAC3_2 || MVIAC7 || MCRUSOE || MEFFICEON || X86_64 || MATOM || MGEODE_LX || MK8SSE3 || MK10 \
++ || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER || MEXCAVATOR \
++ || MZEN || MZEN2 || MZEN3 || MZEN4 || MNEHALEM || MWESTMERE || MSILVERMONT || MGOLDMONT \
++ || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL || MSKYLAKE || MSKYLAKEX \
++ || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE || MTIGERLAKE || MSAPPHIRERAPIDS \
++ || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MMETEORLAKE || MEMERALDRAPIDS || MNATIVE_INTEL || MNATIVE_AMD)
+
+ config X86_MINIMUM_CPU_FAMILY
+ int
+ default "64" if X86_64
+- default "6" if X86_32 && (MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MVIAC3_2 || MVIAC7 || MEFFICEON || MATOM || MCRUSOE || MCORE2 || MK7 || MK8)
++ default "6" if X86_32 && (MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 \
++ || MVIAC3_2 || MVIAC7 || MEFFICEON || MATOM || MCRUSOE || MCORE2 || MK7 || MK8 || MK8SSE3 \
++ || MK10 || MBARCELONA || MBOBCAT || MJAGUAR || MBULLDOZER || MPILEDRIVER || MSTEAMROLLER \
++ || MEXCAVATOR || MZEN || MZEN2 || MZEN3 || MZEN4 || MNEHALEM || MWESTMERE || MSILVERMONT \
++ || MGOLDMONT || MGOLDMONTPLUS || MSANDYBRIDGE || MIVYBRIDGE || MHASWELL || MBROADWELL \
++ || MSKYLAKE || MSKYLAKEX || MCANNONLAKE || MICELAKE || MCASCADELAKE || MCOOPERLAKE \
++ || MTIGERLAKE || MSAPPHIRERAPIDS || MROCKETLAKE || MALDERLAKE || MRAPTORLAKE || MRAPTORLAKE \
++ || MNATIVE_INTEL || MNATIVE_AMD)
+ default "5" if X86_32 && X86_CMPXCHG64
+ default "4"
+
+ config X86_DEBUGCTLMSR
+ def_bool y
+- depends on !(MK6 || MWINCHIPC6 || MWINCHIP3D || MCYRIXIII || M586MMX || M586TSC || M586 || M486SX || M486) && !UML
++ depends on !(MK6 || MWINCHIPC6 || MWINCHIP3D || MCYRIXIII || M586MMX || M586TSC || M586 \
++ || M486SX || M486) && !UML
+
+ config IA32_FEAT_CTL
+ def_bool y
+diff --git a/arch/x86/Makefile b/arch/x86/Makefile
+index 415a5d138de4..17b1e039d955 100644
+--- a/arch/x86/Makefile
++++ b/arch/x86/Makefile
+@@ -151,8 +151,48 @@ else
+ # FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu)
+ cflags-$(CONFIG_MK8) += -march=k8
+ cflags-$(CONFIG_MPSC) += -march=nocona
+- cflags-$(CONFIG_MCORE2) += -march=core2
+- cflags-$(CONFIG_MATOM) += -march=atom
++ cflags-$(CONFIG_MK8SSE3) += -march=k8-sse3
++ cflags-$(CONFIG_MK10) += -march=amdfam10
++ cflags-$(CONFIG_MBARCELONA) += -march=barcelona
++ cflags-$(CONFIG_MBOBCAT) += -march=btver1
++ cflags-$(CONFIG_MJAGUAR) += -march=btver2
++ cflags-$(CONFIG_MBULLDOZER) += -march=bdver1
++ cflags-$(CONFIG_MPILEDRIVER) += -march=bdver2 -mno-tbm
++ cflags-$(CONFIG_MSTEAMROLLER) += -march=bdver3 -mno-tbm
++ cflags-$(CONFIG_MEXCAVATOR) += -march=bdver4 -mno-tbm
++ cflags-$(CONFIG_MZEN) += -march=znver1
++ cflags-$(CONFIG_MZEN2) += -march=znver2
++ cflags-$(CONFIG_MZEN3) += -march=znver3
++ cflags-$(CONFIG_MZEN4) += -march=znver4
++ cflags-$(CONFIG_MNATIVE_INTEL) += -march=native
++ cflags-$(CONFIG_MNATIVE_AMD) += -march=native
++ cflags-$(CONFIG_MATOM) += -march=bonnell
++ cflags-$(CONFIG_MCORE2) += -march=core2
++ cflags-$(CONFIG_MNEHALEM) += -march=nehalem
++ cflags-$(CONFIG_MWESTMERE) += -march=westmere
++ cflags-$(CONFIG_MSILVERMONT) += -march=silvermont
++ cflags-$(CONFIG_MGOLDMONT) += -march=goldmont
++ cflags-$(CONFIG_MGOLDMONTPLUS) += -march=goldmont-plus
++ cflags-$(CONFIG_MSANDYBRIDGE) += -march=sandybridge
++ cflags-$(CONFIG_MIVYBRIDGE) += -march=ivybridge
++ cflags-$(CONFIG_MHASWELL) += -march=haswell
++ cflags-$(CONFIG_MBROADWELL) += -march=broadwell
++ cflags-$(CONFIG_MSKYLAKE) += -march=skylake
++ cflags-$(CONFIG_MSKYLAKEX) += -march=skylake-avx512
++ cflags-$(CONFIG_MCANNONLAKE) += -march=cannonlake
++ cflags-$(CONFIG_MICELAKE) += -march=icelake-client
++ cflags-$(CONFIG_MCASCADELAKE) += -march=cascadelake
++ cflags-$(CONFIG_MCOOPERLAKE) += -march=cooperlake
++ cflags-$(CONFIG_MTIGERLAKE) += -march=tigerlake
++ cflags-$(CONFIG_MSAPPHIRERAPIDS) += -march=sapphirerapids
++ cflags-$(CONFIG_MROCKETLAKE) += -march=rocketlake
++ cflags-$(CONFIG_MALDERLAKE) += -march=alderlake
++ cflags-$(CONFIG_MRAPTORLAKE) += -march=raptorlake
++ cflags-$(CONFIG_MMETEORLAKE) += -march=meteorlake
++ cflags-$(CONFIG_MEMERALDRAPIDS) += -march=emeraldrapids
++ cflags-$(CONFIG_GENERIC_CPU2) += -march=x86-64-v2
++ cflags-$(CONFIG_GENERIC_CPU3) += -march=x86-64-v3
++ cflags-$(CONFIG_GENERIC_CPU4) += -march=x86-64-v4
+ cflags-$(CONFIG_GENERIC_CPU) += -mtune=generic
+ KBUILD_CFLAGS += $(cflags-y)
+
+diff --git a/arch/x86/include/asm/vermagic.h b/arch/x86/include/asm/vermagic.h
+index 75884d2cdec3..02c1386eb653 100644
+--- a/arch/x86/include/asm/vermagic.h
++++ b/arch/x86/include/asm/vermagic.h
+@@ -17,6 +17,54 @@
+ #define MODULE_PROC_FAMILY "586MMX "
+ #elif defined CONFIG_MCORE2
+ #define MODULE_PROC_FAMILY "CORE2 "
++#elif defined CONFIG_MNATIVE_INTEL
++#define MODULE_PROC_FAMILY "NATIVE_INTEL "
++#elif defined CONFIG_MNATIVE_AMD
++#define MODULE_PROC_FAMILY "NATIVE_AMD "
++#elif defined CONFIG_MNEHALEM
++#define MODULE_PROC_FAMILY "NEHALEM "
++#elif defined CONFIG_MWESTMERE
++#define MODULE_PROC_FAMILY "WESTMERE "
++#elif defined CONFIG_MSILVERMONT
++#define MODULE_PROC_FAMILY "SILVERMONT "
++#elif defined CONFIG_MGOLDMONT
++#define MODULE_PROC_FAMILY "GOLDMONT "
++#elif defined CONFIG_MGOLDMONTPLUS
++#define MODULE_PROC_FAMILY "GOLDMONTPLUS "
++#elif defined CONFIG_MSANDYBRIDGE
++#define MODULE_PROC_FAMILY "SANDYBRIDGE "
++#elif defined CONFIG_MIVYBRIDGE
++#define MODULE_PROC_FAMILY "IVYBRIDGE "
++#elif defined CONFIG_MHASWELL
++#define MODULE_PROC_FAMILY "HASWELL "
++#elif defined CONFIG_MBROADWELL
++#define MODULE_PROC_FAMILY "BROADWELL "
++#elif defined CONFIG_MSKYLAKE
++#define MODULE_PROC_FAMILY "SKYLAKE "
++#elif defined CONFIG_MSKYLAKEX
++#define MODULE_PROC_FAMILY "SKYLAKEX "
++#elif defined CONFIG_MCANNONLAKE
++#define MODULE_PROC_FAMILY "CANNONLAKE "
++#elif defined CONFIG_MICELAKE
++#define MODULE_PROC_FAMILY "ICELAKE "
++#elif defined CONFIG_MCASCADELAKE
++#define MODULE_PROC_FAMILY "CASCADELAKE "
++#elif defined CONFIG_MCOOPERLAKE
++#define MODULE_PROC_FAMILY "COOPERLAKE "
++#elif defined CONFIG_MTIGERLAKE
++#define MODULE_PROC_FAMILY "TIGERLAKE "
++#elif defined CONFIG_MSAPPHIRERAPIDS
++#define MODULE_PROC_FAMILY "SAPPHIRERAPIDS "
+#elif defined CONFIG_MROCKETLAKE
++#define MODULE_PROC_FAMILY "ROCKETLAKE "
++#elif defined CONFIG_MALDERLAKE
++#define MODULE_PROC_FAMILY "ALDERLAKE "
++#elif defined CONFIG_MRAPTORLAKE
++#define MODULE_PROC_FAMILY "RAPTORLAKE "
++#elif defined CONFIG_MMETEORLAKE
++#define MODULE_PROC_FAMILY "METEORLAKE "
++#elif defined CONFIG_MEMERALDRAPIDS
++#define MODULE_PROC_FAMILY "EMERALDRAPIDS "
+ #elif defined CONFIG_MATOM
+ #define MODULE_PROC_FAMILY "ATOM "
+ #elif defined CONFIG_M686
+@@ -35,6 +83,32 @@
+ #define MODULE_PROC_FAMILY "K7 "
+ #elif defined CONFIG_MK8
+ #define MODULE_PROC_FAMILY "K8 "
++#elif defined CONFIG_MK8SSE3
++#define MODULE_PROC_FAMILY "K8SSE3 "
++#elif defined CONFIG_MK10
++#define MODULE_PROC_FAMILY "K10 "
++#elif defined CONFIG_MBARCELONA
++#define MODULE_PROC_FAMILY "BARCELONA "
++#elif defined CONFIG_MBOBCAT
++#define MODULE_PROC_FAMILY "BOBCAT "
++#elif defined CONFIG_MBULLDOZER
++#define MODULE_PROC_FAMILY "BULLDOZER "
++#elif defined CONFIG_MPILEDRIVER
++#define MODULE_PROC_FAMILY "PILEDRIVER "
++#elif defined CONFIG_MSTEAMROLLER
++#define MODULE_PROC_FAMILY "STEAMROLLER "
++#elif defined CONFIG_MJAGUAR
++#define MODULE_PROC_FAMILY "JAGUAR "
++#elif defined CONFIG_MEXCAVATOR
++#define MODULE_PROC_FAMILY "EXCAVATOR "
++#elif defined CONFIG_MZEN
++#define MODULE_PROC_FAMILY "ZEN "
++#elif defined CONFIG_MZEN2
++#define MODULE_PROC_FAMILY "ZEN2 "
++#elif defined CONFIG_MZEN3
++#define MODULE_PROC_FAMILY "ZEN3 "
++#elif defined CONFIG_MZEN4
++#define MODULE_PROC_FAMILY "ZEN4 "
+ #elif defined CONFIG_MELAN
+ #define MODULE_PROC_FAMILY "ELAN "
+ #elif defined CONFIG_MCRUSOE
+--
+2.39.0
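As a side note to the x86-64-v2/v3/v4 feature levels listed in the header of 5010_enable-cpu-optimizations-universal.patch, and not part of the patch itself, the running CPU's level can also be probed from C. This is a sketch that assumes GCC 12 or newer, which accepts the micro-architecture level names in __builtin_cpu_supports(); Clang support for these names may differ. It complements the ld-linux --help check quoted in the patch header.

/* Sketch only: requires a compiler that understands the x86-64 level names. */
#include <stdio.h>

int main(void)
{
	__builtin_cpu_init();   /* harmless here; needed only before constructors run */

	if (__builtin_cpu_supports("x86-64-v4"))
		puts("x86-64-v4");
	else if (__builtin_cpu_supports("x86-64-v3"))
		puts("x86-64-v3");
	else if (__builtin_cpu_supports("x86-64-v2"))
		puts("x86-64-v2");
	else
		puts("baseline x86-64");
	return 0;
}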
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-01 18:19 Mike Pagano
From: Mike Pagano @ 2023-07-01 18:19 UTC
To: gentoo-commits
commit: 6959edbe18a7e37becc72959166c73b82c0cf790
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sat Jul 1 18:19:34 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sat Jul 1 18:19:34 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=6959edbe
Linux patch 6.4.1
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/0000_README b/0000_README
index 8bb95e22..22f83174 100644
--- a/0000_README
+++ b/0000_README
@@ -43,6 +43,10 @@ EXPERIMENTAL
Individual Patch Descriptions:
--------------------------------------------------------------------------
+Patch: 1000_linux-6.4.1.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.1
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-01 18:48 Mike Pagano
From: Mike Pagano @ 2023-07-01 18:48 UTC
To: gentoo-commits
commit: 198268461f128858296efdf0c11e8d49bf1d4869
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sat Jul 1 18:48:07 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sat Jul 1 18:48:07 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=19826846
Linux patch 6.4.1
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
1000_linux-6.4.1.patch | 2433 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 2433 insertions(+)
diff --git a/1000_linux-6.4.1.patch b/1000_linux-6.4.1.patch
new file mode 100644
index 00000000..70bfbcf8
--- /dev/null
+++ b/1000_linux-6.4.1.patch
@@ -0,0 +1,2433 @@
+diff --git a/Makefile b/Makefile
+index e51e4d9174ab3..9f6376cbafebe 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 0
++SUBLEVEL = 1
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
+index a5c2b1aa46b02..d6968d090d49a 100644
+--- a/arch/alpha/Kconfig
++++ b/arch/alpha/Kconfig
+@@ -30,6 +30,7 @@ config ALPHA
+ select HAS_IOPORT
+ select HAVE_ARCH_AUDITSYSCALL
+ select HAVE_MOD_ARCH_SPECIFIC
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_RELA
+ select ODD_RT_SIGACTION
+ select OLD_SIGSUSPEND
+diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
+index 7b01ae4f3bc6c..8c9850437e674 100644
+--- a/arch/alpha/mm/fault.c
++++ b/arch/alpha/mm/fault.c
+@@ -119,20 +119,12 @@ do_page_fault(unsigned long address, unsigned long mmcsr,
+ flags |= FAULT_FLAG_USER;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+ retry:
+- mmap_read_lock(mm);
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (!vma)
+- goto bad_area;
+- if (vma->vm_start <= address)
+- goto good_area;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+- if (expand_stack(vma, address))
+- goto bad_area;
++ goto bad_area_nosemaphore;
+
+ /* Ok, we have a good vm_area for this memory access, so
+ we can handle it. */
+- good_area:
+ si_code = SEGV_ACCERR;
+ if (cause < 0) {
+ if (!(vma->vm_flags & VM_EXEC))
+@@ -192,6 +184,7 @@ retry:
+ bad_area:
+ mmap_read_unlock(mm);
+
++ bad_area_nosemaphore:
+ if (user_mode(regs))
+ goto do_sigsegv;
+
+diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
+index ab6d701365bb0..96cf8720bb939 100644
+--- a/arch/arc/Kconfig
++++ b/arch/arc/Kconfig
+@@ -41,6 +41,7 @@ config ARC
+ select HAVE_PERF_EVENTS
+ select HAVE_SYSCALL_TRACEPOINTS
+ select IRQ_DOMAIN
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_RELA
+ select OF
+ select OF_EARLY_FLATTREE
+diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
+index 5ca59a482632a..f59e722d147f9 100644
+--- a/arch/arc/mm/fault.c
++++ b/arch/arc/mm/fault.c
+@@ -113,15 +113,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+ retry:
+- mmap_read_lock(mm);
+-
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (!vma)
+- goto bad_area;
+- if (unlikely(address < vma->vm_start)) {
+- if (!(vma->vm_flags & VM_GROWSDOWN) || expand_stack(vma, address))
+- goto bad_area;
+- }
++ goto bad_area_nosemaphore;
+
+ /*
+ * vm_area is good, now check permissions for this memory access
+@@ -161,6 +155,7 @@ retry:
+ bad_area:
+ mmap_read_unlock(mm);
+
++bad_area_nosemaphore:
+ /*
+ * Major/minor page fault accounting
+ * (in case of retry we only land here once)
+diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
+index 0fb4b218f6658..9ed7f03ba15a3 100644
+--- a/arch/arm/Kconfig
++++ b/arch/arm/Kconfig
+@@ -125,6 +125,7 @@ config ARM
+ select HAVE_UID16
+ select HAVE_VIRT_CPU_ACCOUNTING_GEN
+ select IRQ_FORCED_THREADING
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_REL
+ select NEED_DMA_MAP_STATE
+ select OF_EARLY_FLATTREE if OF
+diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
+index 2418f1efabd87..0860eeba8bd34 100644
+--- a/arch/arm/mm/fault.c
++++ b/arch/arm/mm/fault.c
+@@ -232,37 +232,11 @@ static inline bool is_permission_fault(unsigned int fsr)
+ return false;
+ }
+
+-static vm_fault_t __kprobes
+-__do_page_fault(struct mm_struct *mm, unsigned long addr, unsigned int flags,
+- unsigned long vma_flags, struct pt_regs *regs)
+-{
+- struct vm_area_struct *vma = find_vma(mm, addr);
+- if (unlikely(!vma))
+- return VM_FAULT_BADMAP;
+-
+- if (unlikely(vma->vm_start > addr)) {
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- return VM_FAULT_BADMAP;
+- if (addr < FIRST_USER_ADDRESS)
+- return VM_FAULT_BADMAP;
+- if (expand_stack(vma, addr))
+- return VM_FAULT_BADMAP;
+- }
+-
+- /*
+- * ok, we have a good vm_area for this memory access, check the
+- * permissions on the VMA allow for the fault which occurred.
+- */
+- if (!(vma->vm_flags & vma_flags))
+- return VM_FAULT_BADACCESS;
+-
+- return handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
+-}
+-
+ static int __kprobes
+ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
+ {
+ struct mm_struct *mm = current->mm;
++ struct vm_area_struct *vma;
+ int sig, code;
+ vm_fault_t fault;
+ unsigned int flags = FAULT_FLAG_DEFAULT;
+@@ -301,31 +275,21 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+
+- /*
+- * As per x86, we may deadlock here. However, since the kernel only
+- * validly references user space from well defined areas of the code,
+- * we can bug out early if this is from code which shouldn't.
+- */
+- if (!mmap_read_trylock(mm)) {
+- if (!user_mode(regs) && !search_exception_tables(regs->ARM_pc))
+- goto no_context;
+ retry:
+- mmap_read_lock(mm);
+- } else {
+- /*
+- * The above down_read_trylock() might have succeeded in
+- * which case, we'll have missed the might_sleep() from
+- * down_read()
+- */
+- might_sleep();
+-#ifdef CONFIG_DEBUG_VM
+- if (!user_mode(regs) &&
+- !search_exception_tables(regs->ARM_pc))
+- goto no_context;
+-#endif
++ vma = lock_mm_and_find_vma(mm, addr, regs);
++ if (unlikely(!vma)) {
++ fault = VM_FAULT_BADMAP;
++ goto bad_area;
+ }
+
+- fault = __do_page_fault(mm, addr, flags, vm_flags, regs);
++ /*
++ * ok, we have a good vm_area for this memory access, check the
++ * permissions on the VMA allow for the fault which occurred.
++ */
++ if (!(vma->vm_flags & vm_flags))
++ fault = VM_FAULT_BADACCESS;
++ else
++ fault = handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
+
+ /* If we need to retry but a fatal signal is pending, handle the
+ * signal first. We do not need to release the mmap_lock because
+@@ -356,6 +320,7 @@ retry:
+ if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
+ return 0;
+
++bad_area:
+ /*
+ * If we are in kernel mode at this point, we
+ * have no context to handle this fault with.
+diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
+index 343e1e1cae10a..92f3fff2522b0 100644
+--- a/arch/arm64/Kconfig
++++ b/arch/arm64/Kconfig
+@@ -225,6 +225,7 @@ config ARM64
+ select IRQ_DOMAIN
+ select IRQ_FORCED_THREADING
+ select KASAN_VMALLOC if KASAN
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_RELA
+ select NEED_DMA_MAP_STATE
+ select NEED_SG_DMA_LENGTH
+diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
+index 6045a5117ac15..8a169bdb4d534 100644
+--- a/arch/arm64/mm/fault.c
++++ b/arch/arm64/mm/fault.c
+@@ -483,27 +483,14 @@ static void do_bad_area(unsigned long far, unsigned long esr,
+ #define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
+ #define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)
+
+-static vm_fault_t __do_page_fault(struct mm_struct *mm, unsigned long addr,
++static vm_fault_t __do_page_fault(struct mm_struct *mm,
++ struct vm_area_struct *vma, unsigned long addr,
+ unsigned int mm_flags, unsigned long vm_flags,
+ struct pt_regs *regs)
+ {
+- struct vm_area_struct *vma = find_vma(mm, addr);
+-
+- if (unlikely(!vma))
+- return VM_FAULT_BADMAP;
+-
+ /*
+ * Ok, we have a good vm_area for this memory access, so we can handle
+ * it.
+- */
+- if (unlikely(vma->vm_start > addr)) {
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- return VM_FAULT_BADMAP;
+- if (expand_stack(vma, addr))
+- return VM_FAULT_BADMAP;
+- }
+-
+- /*
+ * Check that the permissions on the VMA allow for the fault which
+ * occurred.
+ */
+@@ -617,31 +604,15 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
+ }
+ lock_mmap:
+ #endif /* CONFIG_PER_VMA_LOCK */
+- /*
+- * As per x86, we may deadlock here. However, since the kernel only
+- * validly references user space from well defined areas of the code,
+- * we can bug out early if this is from code which shouldn't.
+- */
+- if (!mmap_read_trylock(mm)) {
+- if (!user_mode(regs) && !search_exception_tables(regs->pc))
+- goto no_context;
++
+ retry:
+- mmap_read_lock(mm);
+- } else {
+- /*
+- * The above mmap_read_trylock() might have succeeded in which
+- * case, we'll have missed the might_sleep() from down_read().
+- */
+- might_sleep();
+-#ifdef CONFIG_DEBUG_VM
+- if (!user_mode(regs) && !search_exception_tables(regs->pc)) {
+- mmap_read_unlock(mm);
+- goto no_context;
+- }
+-#endif
++ vma = lock_mm_and_find_vma(mm, addr, regs);
++ if (unlikely(!vma)) {
++ fault = VM_FAULT_BADMAP;
++ goto done;
+ }
+
+- fault = __do_page_fault(mm, addr, mm_flags, vm_flags, regs);
++ fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
+
+ /* Quick path to respond to signals */
+ if (fault_signal_pending(fault, regs)) {
+@@ -660,9 +631,7 @@ retry:
+ }
+ mmap_read_unlock(mm);
+
+-#ifdef CONFIG_PER_VMA_LOCK
+ done:
+-#endif
+ /*
+ * Handle the "normal" (no error) case first.
+ */
+diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
+index 4df1f8c9d170b..03e9f66661570 100644
+--- a/arch/csky/Kconfig
++++ b/arch/csky/Kconfig
+@@ -96,6 +96,7 @@ config CSKY
+ select HAVE_REGS_AND_STACK_ACCESS_API
+ select HAVE_STACKPROTECTOR
+ select HAVE_SYSCALL_TRACEPOINTS
++ select LOCK_MM_AND_FIND_VMA
+ select MAY_HAVE_SPARSE_IRQ
+ select MODULES_USE_ELF_RELA if MODULES
+ select OF
+diff --git a/arch/csky/mm/fault.c b/arch/csky/mm/fault.c
+index e15f736cca4b4..a885518ce1dd2 100644
+--- a/arch/csky/mm/fault.c
++++ b/arch/csky/mm/fault.c
+@@ -97,13 +97,12 @@ static inline void mm_fault_error(struct pt_regs *regs, unsigned long addr, vm_f
+ BUG();
+ }
+
+-static inline void bad_area(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
++static inline void bad_area_nosemaphore(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
+ {
+ /*
+ * Something tried to access memory that isn't in our memory map.
+ * Fix it, but check if it's kernel or user first.
+ */
+- mmap_read_unlock(mm);
+ /* User mode accesses just cause a SIGSEGV */
+ if (user_mode(regs)) {
+ do_trap(regs, SIGSEGV, code, addr);
+@@ -238,20 +237,9 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
+ if (is_write(regs))
+ flags |= FAULT_FLAG_WRITE;
+ retry:
+- mmap_read_lock(mm);
+- vma = find_vma(mm, addr);
++ vma = lock_mm_and_find_vma(mm, addr, regs);
+ if (unlikely(!vma)) {
+- bad_area(regs, mm, code, addr);
+- return;
+- }
+- if (likely(vma->vm_start <= addr))
+- goto good_area;
+- if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
+- bad_area(regs, mm, code, addr);
+- return;
+- }
+- if (unlikely(expand_stack(vma, addr))) {
+- bad_area(regs, mm, code, addr);
++ bad_area_nosemaphore(regs, mm, code, addr);
+ return;
+ }
+
+@@ -259,11 +247,11 @@ retry:
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it.
+ */
+-good_area:
+ code = SEGV_ACCERR;
+
+ if (unlikely(access_error(regs, vma))) {
+- bad_area(regs, mm, code, addr);
++ mmap_read_unlock(mm);
++ bad_area_nosemaphore(regs, mm, code, addr);
+ return;
+ }
+
+diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
+index 54eadf2651786..6726f4941015f 100644
+--- a/arch/hexagon/Kconfig
++++ b/arch/hexagon/Kconfig
+@@ -28,6 +28,7 @@ config HEXAGON
+ select GENERIC_SMP_IDLE_THREAD
+ select STACKTRACE_SUPPORT
+ select GENERIC_CLOCKEVENTS_BROADCAST
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_RELA
+ select GENERIC_CPU_DEVICES
+ select ARCH_WANT_LD_ORPHAN_WARN
+diff --git a/arch/hexagon/mm/vm_fault.c b/arch/hexagon/mm/vm_fault.c
+index 4b578d02fd01a..7295ea3f8cc8d 100644
+--- a/arch/hexagon/mm/vm_fault.c
++++ b/arch/hexagon/mm/vm_fault.c
+@@ -57,21 +57,10 @@ void do_page_fault(unsigned long address, long cause, struct pt_regs *regs)
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+ retry:
+- mmap_read_lock(mm);
+- vma = find_vma(mm, address);
+- if (!vma)
+- goto bad_area;
++ vma = lock_mm_and_find_vma(mm, address, regs);
++ if (unlikely(!vma))
++ goto bad_area_nosemaphore;
+
+- if (vma->vm_start <= address)
+- goto good_area;
+-
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+-
+- if (expand_stack(vma, address))
+- goto bad_area;
+-
+-good_area:
+ /* Address space is OK. Now check access rights. */
+ si_code = SEGV_ACCERR;
+
+@@ -143,6 +132,7 @@ good_area:
+ bad_area:
+ mmap_read_unlock(mm);
+
++bad_area_nosemaphore:
+ if (user_mode(regs)) {
+ force_sig_fault(SIGSEGV, si_code, (void __user *)address);
+ return;
+diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
+index 85c4d9ac8686d..5458b52b40099 100644
+--- a/arch/ia64/mm/fault.c
++++ b/arch/ia64/mm/fault.c
+@@ -110,10 +110,12 @@ retry:
+ * register backing store that needs to expand upwards, in
+ * this case vma will be null, but prev_vma will ne non-null
+ */
+- if (( !vma && prev_vma ) || (address < vma->vm_start) )
+- goto check_expansion;
++ if (( !vma && prev_vma ) || (address < vma->vm_start) ) {
++ vma = expand_stack(mm, address);
++ if (!vma)
++ goto bad_area_nosemaphore;
++ }
+
+- good_area:
+ code = SEGV_ACCERR;
+
+ /* OK, we've got a good vm_area for this memory area. Check the access permissions: */
+@@ -177,35 +179,9 @@ retry:
+ mmap_read_unlock(mm);
+ return;
+
+- check_expansion:
+- if (!(prev_vma && (prev_vma->vm_flags & VM_GROWSUP) && (address == prev_vma->vm_end))) {
+- if (!vma)
+- goto bad_area;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+- if (REGION_NUMBER(address) != REGION_NUMBER(vma->vm_start)
+- || REGION_OFFSET(address) >= RGN_MAP_LIMIT)
+- goto bad_area;
+- if (expand_stack(vma, address))
+- goto bad_area;
+- } else {
+- vma = prev_vma;
+- if (REGION_NUMBER(address) != REGION_NUMBER(vma->vm_start)
+- || REGION_OFFSET(address) >= RGN_MAP_LIMIT)
+- goto bad_area;
+- /*
+- * Since the register backing store is accessed sequentially,
+- * we disallow growing it by more than a page at a time.
+- */
+- if (address > vma->vm_end + PAGE_SIZE - sizeof(long))
+- goto bad_area;
+- if (expand_upwards(vma, address))
+- goto bad_area;
+- }
+- goto good_area;
+-
+ bad_area:
+ mmap_read_unlock(mm);
++ bad_area_nosemaphore:
+ if ((isr & IA64_ISR_SP)
+ || ((isr & IA64_ISR_NA) && (isr & IA64_ISR_CODE_MASK) == IA64_ISR_CODE_LFETCH))
+ {
+diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
+index d38b066fc931b..73519e13bbb39 100644
+--- a/arch/loongarch/Kconfig
++++ b/arch/loongarch/Kconfig
+@@ -130,6 +130,7 @@ config LOONGARCH
+ select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
+ select IRQ_FORCED_THREADING
+ select IRQ_LOONGARCH_CPU
++ select LOCK_MM_AND_FIND_VMA
+ select MMU_GATHER_MERGE_VMAS if MMU
+ select MODULES_USE_ELF_RELA if MODULES
+ select NEED_PER_CPU_EMBED_FIRST_CHUNK
+diff --git a/arch/loongarch/mm/fault.c b/arch/loongarch/mm/fault.c
+index 449087bd589d3..da5b6d518cdb1 100644
+--- a/arch/loongarch/mm/fault.c
++++ b/arch/loongarch/mm/fault.c
+@@ -169,22 +169,18 @@ static void __kprobes __do_page_fault(struct pt_regs *regs,
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+ retry:
+- mmap_read_lock(mm);
+- vma = find_vma(mm, address);
+- if (!vma)
+- goto bad_area;
+- if (vma->vm_start <= address)
+- goto good_area;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+- if (!expand_stack(vma, address))
+- goto good_area;
++ vma = lock_mm_and_find_vma(mm, address, regs);
++ if (unlikely(!vma))
++ goto bad_area_nosemaphore;
++ goto good_area;
++
+ /*
+ * Something tried to access memory that isn't in our memory map..
+ * Fix it, but check if it's kernel or user first..
+ */
+ bad_area:
+ mmap_read_unlock(mm);
++bad_area_nosemaphore:
+ do_sigsegv(regs, write, address, si_code);
+ return;
+
+diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
+index 228128e45c673..c290c5c0cfb93 100644
+--- a/arch/m68k/mm/fault.c
++++ b/arch/m68k/mm/fault.c
+@@ -105,8 +105,9 @@ retry:
+ if (address + 256 < rdusp())
+ goto map_err;
+ }
+- if (expand_stack(vma, address))
+- goto map_err;
++ vma = expand_stack(mm, address);
++ if (!vma)
++ goto map_err_nosemaphore;
+
+ /*
+ * Ok, we have a good vm_area for this memory access, so
+@@ -196,10 +197,12 @@ bus_err:
+ goto send_sig;
+
+ map_err:
++ mmap_read_unlock(mm);
++map_err_nosemaphore:
+ current->thread.signo = SIGSEGV;
+ current->thread.code = SEGV_MAPERR;
+ current->thread.faddr = address;
+- goto send_sig;
++ return send_fault_sig(regs);
+
+ acc_err:
+ current->thread.signo = SIGSEGV;
+diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c
+index 687714db6f4d0..d3c3c33b73a6e 100644
+--- a/arch/microblaze/mm/fault.c
++++ b/arch/microblaze/mm/fault.c
+@@ -192,8 +192,9 @@ retry:
+ && (kernel_mode(regs) || !store_updates_sp(regs)))
+ goto bad_area;
+ }
+- if (expand_stack(vma, address))
+- goto bad_area;
++ vma = expand_stack(mm, address);
++ if (!vma)
++ goto bad_area_nosemaphore;
+
+ good_area:
+ code = SEGV_ACCERR;
+diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
+index 675a8660cb85a..6796d839bcfdf 100644
+--- a/arch/mips/Kconfig
++++ b/arch/mips/Kconfig
+@@ -91,6 +91,7 @@ config MIPS
+ select HAVE_VIRT_CPU_ACCOUNTING_GEN if 64BIT || !SMP
+ select IRQ_FORCED_THREADING
+ select ISA if EISA
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_REL if MODULES
+ select MODULES_USE_ELF_RELA if MODULES && 64BIT
+ select PERF_USE_VMALLOC
+diff --git a/arch/mips/mm/fault.c b/arch/mips/mm/fault.c
+index a27045f5a556d..d7878208bd3fa 100644
+--- a/arch/mips/mm/fault.c
++++ b/arch/mips/mm/fault.c
+@@ -99,21 +99,13 @@ static void __do_page_fault(struct pt_regs *regs, unsigned long write,
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+ retry:
+- mmap_read_lock(mm);
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (!vma)
+- goto bad_area;
+- if (vma->vm_start <= address)
+- goto good_area;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+- if (expand_stack(vma, address))
+- goto bad_area;
++ goto bad_area_nosemaphore;
+ /*
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it..
+ */
+-good_area:
+ si_code = SEGV_ACCERR;
+
+ if (write) {
+diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
+index e5936417d3cd3..d54464021a618 100644
+--- a/arch/nios2/Kconfig
++++ b/arch/nios2/Kconfig
+@@ -16,6 +16,7 @@ config NIOS2
+ select HAVE_ARCH_TRACEHOOK
+ select HAVE_ARCH_KGDB
+ select IRQ_DOMAIN
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_RELA
+ select OF
+ select OF_EARLY_FLATTREE
+diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
+index ca64eccea5511..e3fa9c15181df 100644
+--- a/arch/nios2/mm/fault.c
++++ b/arch/nios2/mm/fault.c
+@@ -86,27 +86,14 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
+- if (!mmap_read_trylock(mm)) {
+- if (!user_mode(regs) && !search_exception_tables(regs->ea))
+- goto bad_area_nosemaphore;
+ retry:
+- mmap_read_lock(mm);
+- }
+-
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (!vma)
+- goto bad_area;
+- if (vma->vm_start <= address)
+- goto good_area;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+- if (expand_stack(vma, address))
+- goto bad_area;
++ goto bad_area_nosemaphore;
+ /*
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it..
+ */
+-good_area:
+ code = SEGV_ACCERR;
+
+ switch (cause) {
+diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
+index 6734fee3134f4..a9dcd4381d1a1 100644
+--- a/arch/openrisc/mm/fault.c
++++ b/arch/openrisc/mm/fault.c
+@@ -127,8 +127,9 @@ retry:
+ if (address + PAGE_SIZE < regs->sp)
+ goto bad_area;
+ }
+- if (expand_stack(vma, address))
+- goto bad_area;
++ vma = expand_stack(mm, address);
++ if (!vma)
++ goto bad_area_nosemaphore;
+
+ /*
+ * Ok, we have a good vm_area for this memory access, so
+diff --git a/arch/parisc/mm/fault.c b/arch/parisc/mm/fault.c
+index 6941fdbf25173..a4c7c7630f48b 100644
+--- a/arch/parisc/mm/fault.c
++++ b/arch/parisc/mm/fault.c
+@@ -288,15 +288,19 @@ void do_page_fault(struct pt_regs *regs, unsigned long code,
+ retry:
+ mmap_read_lock(mm);
+ vma = find_vma_prev(mm, address, &prev_vma);
+- if (!vma || address < vma->vm_start)
+- goto check_expansion;
++ if (!vma || address < vma->vm_start) {
++ if (!prev_vma || !(prev_vma->vm_flags & VM_GROWSUP))
++ goto bad_area;
++ vma = expand_stack(mm, address);
++ if (!vma)
++ goto bad_area_nosemaphore;
++ }
++
+ /*
+ * Ok, we have a good vm_area for this memory access. We still need to
+ * check the access permissions.
+ */
+
+-good_area:
+-
+ if ((vma->vm_flags & acc_type) != acc_type)
+ goto bad_area;
+
+@@ -347,17 +351,13 @@ good_area:
+ mmap_read_unlock(mm);
+ return;
+
+-check_expansion:
+- vma = prev_vma;
+- if (vma && (expand_stack(vma, address) == 0))
+- goto good_area;
+-
+ /*
+ * Something tried to access memory that isn't in our memory map..
+ */
+ bad_area:
+ mmap_read_unlock(mm);
+
++bad_area_nosemaphore:
+ if (user_mode(regs)) {
+ int signo, si_code;
+
+@@ -449,7 +449,7 @@ handle_nadtlb_fault(struct pt_regs *regs)
+ {
+ unsigned long insn = regs->iir;
+ int breg, treg, xreg, val = 0;
+- struct vm_area_struct *vma, *prev_vma;
++ struct vm_area_struct *vma;
+ struct task_struct *tsk;
+ struct mm_struct *mm;
+ unsigned long address;
+@@ -485,7 +485,7 @@ handle_nadtlb_fault(struct pt_regs *regs)
+ /* Search for VMA */
+ address = regs->ior;
+ mmap_read_lock(mm);
+- vma = find_vma_prev(mm, address, &prev_vma);
++ vma = vma_lookup(mm, address);
+ mmap_read_unlock(mm);
+
+ /*
+@@ -494,7 +494,6 @@ handle_nadtlb_fault(struct pt_regs *regs)
+ */
+ acc_type = (insn & 0x40) ? VM_WRITE : VM_READ;
+ if (vma
+- && address >= vma->vm_start
+ && (vma->vm_flags & acc_type) == acc_type)
+ val = 1;
+ }
+diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
+index bff5820b7cda1..a243fcdf346de 100644
+--- a/arch/powerpc/Kconfig
++++ b/arch/powerpc/Kconfig
+@@ -278,6 +278,7 @@ config PPC
+ select IRQ_DOMAIN
+ select IRQ_FORCED_THREADING
+ select KASAN_VMALLOC if KASAN && MODULES
++ select LOCK_MM_AND_FIND_VMA
+ select MMU_GATHER_PAGE_SIZE
+ select MMU_GATHER_RCU_TABLE_FREE
+ select MMU_GATHER_MERGE_VMAS
+diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c
+index 7c507fb48182b..f49fd873df8da 100644
+--- a/arch/powerpc/mm/copro_fault.c
++++ b/arch/powerpc/mm/copro_fault.c
+@@ -33,19 +33,11 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
+ if (mm->pgd == NULL)
+ return -EFAULT;
+
+- mmap_read_lock(mm);
+- ret = -EFAULT;
+- vma = find_vma(mm, ea);
++ vma = lock_mm_and_find_vma(mm, ea, NULL);
+ if (!vma)
+- goto out_unlock;
+-
+- if (ea < vma->vm_start) {
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto out_unlock;
+- if (expand_stack(vma, ea))
+- goto out_unlock;
+- }
++ return -EFAULT;
+
++ ret = -EFAULT;
+ is_write = dsisr & DSISR_ISSTORE;
+ if (is_write) {
+ if (!(vma->vm_flags & VM_WRITE))
+diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
+index 531177a4ee088..5bfdf6ecfa965 100644
+--- a/arch/powerpc/mm/fault.c
++++ b/arch/powerpc/mm/fault.c
+@@ -84,11 +84,6 @@ static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
+ return __bad_area_nosemaphore(regs, address, si_code);
+ }
+
+-static noinline int bad_area(struct pt_regs *regs, unsigned long address)
+-{
+- return __bad_area(regs, address, SEGV_MAPERR);
+-}
+-
+ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
+ struct vm_area_struct *vma)
+ {
+@@ -515,40 +510,12 @@ lock_mmap:
+ * we will deadlock attempting to validate the fault against the
+ * address space. Luckily the kernel only validly references user
+ * space from well defined areas of code, which are listed in the
+- * exceptions table.
+- *
+- * As the vast majority of faults will be valid we will only perform
+- * the source reference check when there is a possibility of a deadlock.
+- * Attempt to lock the address space, if we cannot we then validate the
+- * source. If this is invalid we can skip the address space check,
+- * thus avoiding the deadlock.
++ * exceptions table. lock_mm_and_find_vma() handles that logic.
+ */
+- if (unlikely(!mmap_read_trylock(mm))) {
+- if (!is_user && !search_exception_tables(regs->nip))
+- return bad_area_nosemaphore(regs, address);
+-
+ retry:
+- mmap_read_lock(mm);
+- } else {
+- /*
+- * The above down_read_trylock() might have succeeded in
+- * which case we'll have missed the might_sleep() from
+- * down_read():
+- */
+- might_sleep();
+- }
+-
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (unlikely(!vma))
+- return bad_area(regs, address);
+-
+- if (unlikely(vma->vm_start > address)) {
+- if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
+- return bad_area(regs, address);
+-
+- if (unlikely(expand_stack(vma, address)))
+- return bad_area(regs, address);
+- }
++ return bad_area_nosemaphore(regs, address);
+
+ if (unlikely(access_pkey_error(is_write, is_exec,
+ (error_code & DSISR_KEYFAULT), vma)))
+diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
+index 5966ad97c30c3..a11b1c038c6d1 100644
+--- a/arch/riscv/Kconfig
++++ b/arch/riscv/Kconfig
+@@ -126,6 +126,7 @@ config RISCV
+ select IRQ_DOMAIN
+ select IRQ_FORCED_THREADING
+ select KASAN_VMALLOC if KASAN
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_RELA if MODULES
+ select MODULE_SECTIONS if MODULES
+ select OF
+diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
+index 8685f85a7474e..35a84ec69a9fd 100644
+--- a/arch/riscv/mm/fault.c
++++ b/arch/riscv/mm/fault.c
+@@ -84,13 +84,13 @@ static inline void mm_fault_error(struct pt_regs *regs, unsigned long addr, vm_f
+ BUG();
+ }
+
+-static inline void bad_area(struct pt_regs *regs, struct mm_struct *mm, int code, unsigned long addr)
++static inline void
++bad_area_nosemaphore(struct pt_regs *regs, int code, unsigned long addr)
+ {
+ /*
+ * Something tried to access memory that isn't in our memory map.
+ * Fix it, but check if it's kernel or user first.
+ */
+- mmap_read_unlock(mm);
+ /* User mode accesses just cause a SIGSEGV */
+ if (user_mode(regs)) {
+ do_trap(regs, SIGSEGV, code, addr);
+@@ -100,6 +100,15 @@ static inline void bad_area(struct pt_regs *regs, struct mm_struct *mm, int code
+ no_context(regs, addr);
+ }
+
++static inline void
++bad_area(struct pt_regs *regs, struct mm_struct *mm, int code,
++ unsigned long addr)
++{
++ mmap_read_unlock(mm);
++
++ bad_area_nosemaphore(regs, code, addr);
++}
++
+ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long addr)
+ {
+ pgd_t *pgd, *pgd_k;
+@@ -287,23 +296,10 @@ void handle_page_fault(struct pt_regs *regs)
+ else if (cause == EXC_INST_PAGE_FAULT)
+ flags |= FAULT_FLAG_INSTRUCTION;
+ retry:
+- mmap_read_lock(mm);
+- vma = find_vma(mm, addr);
++ vma = lock_mm_and_find_vma(mm, addr, regs);
+ if (unlikely(!vma)) {
+ tsk->thread.bad_cause = cause;
+- bad_area(regs, mm, code, addr);
+- return;
+- }
+- if (likely(vma->vm_start <= addr))
+- goto good_area;
+- if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
+- tsk->thread.bad_cause = cause;
+- bad_area(regs, mm, code, addr);
+- return;
+- }
+- if (unlikely(expand_stack(vma, addr))) {
+- tsk->thread.bad_cause = cause;
+- bad_area(regs, mm, code, addr);
++ bad_area_nosemaphore(regs, code, addr);
+ return;
+ }
+
+@@ -311,7 +307,6 @@ retry:
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it.
+ */
+-good_area:
+ code = SEGV_ACCERR;
+
+ if (unlikely(access_error(cause, vma))) {
+diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
+index b65144c392b01..dbe8394234e2b 100644
+--- a/arch/s390/mm/fault.c
++++ b/arch/s390/mm/fault.c
+@@ -457,8 +457,9 @@ retry:
+ if (unlikely(vma->vm_start > address)) {
+ if (!(vma->vm_flags & VM_GROWSDOWN))
+ goto out_up;
+- if (expand_stack(vma, address))
+- goto out_up;
++ vma = expand_stack(mm, address);
++ if (!vma)
++ goto out;
+ }
+
+ /*
+diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
+index 9652d367fc377..393023d092450 100644
+--- a/arch/sh/Kconfig
++++ b/arch/sh/Kconfig
+@@ -59,6 +59,7 @@ config SUPERH
+ select HAVE_STACKPROTECTOR
+ select HAVE_SYSCALL_TRACEPOINTS
+ select IRQ_FORCED_THREADING
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_RELA
+ select NEED_SG_DMA_LENGTH
+ select NO_DMA if !MMU && !DMA_COHERENT
+diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
+index acd2f5e50bfcd..06e6b49529245 100644
+--- a/arch/sh/mm/fault.c
++++ b/arch/sh/mm/fault.c
+@@ -439,21 +439,9 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs *regs,
+ }
+
+ retry:
+- mmap_read_lock(mm);
+-
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (unlikely(!vma)) {
+- bad_area(regs, error_code, address);
+- return;
+- }
+- if (likely(vma->vm_start <= address))
+- goto good_area;
+- if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
+- bad_area(regs, error_code, address);
+- return;
+- }
+- if (unlikely(expand_stack(vma, address))) {
+- bad_area(regs, error_code, address);
++ bad_area_nosemaphore(regs, error_code, address);
+ return;
+ }
+
+@@ -461,7 +449,6 @@ retry:
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it..
+ */
+-good_area:
+ if (unlikely(access_error(error_code, vma))) {
+ bad_area_access_error(regs, error_code, address);
+ return;
+diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
+index 8535e19062f65..8c196990558b2 100644
+--- a/arch/sparc/Kconfig
++++ b/arch/sparc/Kconfig
+@@ -57,6 +57,7 @@ config SPARC32
+ select DMA_DIRECT_REMAP
+ select GENERIC_ATOMIC64
+ select HAVE_UID16
++ select LOCK_MM_AND_FIND_VMA
+ select OLD_SIGACTION
+ select ZONE_DMA
+
+diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
+index 179295b14664a..86a831ebd8c8e 100644
+--- a/arch/sparc/mm/fault_32.c
++++ b/arch/sparc/mm/fault_32.c
+@@ -143,28 +143,19 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
+ if (pagefault_disabled() || !mm)
+ goto no_context;
+
++ if (!from_user && address >= PAGE_OFFSET)
++ goto no_context;
++
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
+ retry:
+- mmap_read_lock(mm);
+-
+- if (!from_user && address >= PAGE_OFFSET)
+- goto bad_area;
+-
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (!vma)
+- goto bad_area;
+- if (vma->vm_start <= address)
+- goto good_area;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+- if (expand_stack(vma, address))
+- goto bad_area;
++ goto bad_area_nosemaphore;
+ /*
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it..
+ */
+-good_area:
+ code = SEGV_ACCERR;
+ if (write) {
+ if (!(vma->vm_flags & VM_WRITE))
+@@ -321,17 +312,9 @@ static void force_user_fault(unsigned long address, int write)
+
+ code = SEGV_MAPERR;
+
+- mmap_read_lock(mm);
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, NULL);
+ if (!vma)
+- goto bad_area;
+- if (vma->vm_start <= address)
+- goto good_area;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+- if (expand_stack(vma, address))
+- goto bad_area;
+-good_area:
++ goto bad_area_nosemaphore;
+ code = SEGV_ACCERR;
+ if (write) {
+ if (!(vma->vm_flags & VM_WRITE))
+@@ -350,6 +333,7 @@ good_area:
+ return;
+ bad_area:
+ mmap_read_unlock(mm);
++bad_area_nosemaphore:
+ __do_fault_siginfo(code, SIGSEGV, tsk->thread.kregs, address);
+ return;
+
+diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
+index d91305de694c5..69ff07bc6c07d 100644
+--- a/arch/sparc/mm/fault_64.c
++++ b/arch/sparc/mm/fault_64.c
+@@ -383,8 +383,9 @@ continue_fault:
+ goto bad_area;
+ }
+ }
+- if (expand_stack(vma, address))
+- goto bad_area;
++ vma = expand_stack(mm, address);
++ if (!vma)
++ goto bad_area_nosemaphore;
+ /*
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it..
+@@ -487,8 +488,9 @@ exit_exception:
+ * Fix it, but check if it's kernel or user first..
+ */
+ bad_area:
+- insn = get_fault_insn(regs, insn);
+ mmap_read_unlock(mm);
++bad_area_nosemaphore:
++ insn = get_fault_insn(regs, insn);
+
+ handle_kernel_fault:
+ do_kernel_fault(regs, si_code, fault_code, insn, address);
+diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
+index d3ce21c4ca32a..6d8ae86ae978f 100644
+--- a/arch/um/kernel/trap.c
++++ b/arch/um/kernel/trap.c
+@@ -47,14 +47,15 @@ retry:
+ vma = find_vma(mm, address);
+ if (!vma)
+ goto out;
+- else if (vma->vm_start <= address)
++ if (vma->vm_start <= address)
+ goto good_area;
+- else if (!(vma->vm_flags & VM_GROWSDOWN))
++ if (!(vma->vm_flags & VM_GROWSDOWN))
+ goto out;
+- else if (is_user && !ARCH_IS_STACKGROW(address))
+- goto out;
+- else if (expand_stack(vma, address))
++ if (is_user && !ARCH_IS_STACKGROW(address))
+ goto out;
++ vma = expand_stack(mm, address);
++ if (!vma)
++ goto out_nosemaphore;
+
+ good_area:
+ *code_out = SEGV_ACCERR;
+diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
+index 53bab123a8ee4..cb1031018afa5 100644
+--- a/arch/x86/Kconfig
++++ b/arch/x86/Kconfig
+@@ -276,6 +276,7 @@ config X86
+ select HAVE_GENERIC_VDSO
+ select HOTPLUG_SMT if SMP
+ select IRQ_FORCED_THREADING
++ select LOCK_MM_AND_FIND_VMA
+ select NEED_PER_CPU_EMBED_FIRST_CHUNK
+ select NEED_PER_CPU_PAGE_FIRST_CHUNK
+ select NEED_SG_DMA_LENGTH
+diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
+index 78796b98a5449..9ba3c3dec6f30 100644
+--- a/arch/x86/include/asm/cpu.h
++++ b/arch/x86/include/asm/cpu.h
+@@ -98,4 +98,6 @@ extern u64 x86_read_arch_cap_msr(void);
+ int intel_find_matching_signature(void *mc, unsigned int csig, int cpf);
+ int intel_microcode_sanity_check(void *mc, bool print_err, int hdr_type);
+
++extern struct cpumask cpus_stop_mask;
++
+ #endif /* _ASM_X86_CPU_H */
+diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
+index 4e91054c84be9..d4ce5cb5c9534 100644
+--- a/arch/x86/include/asm/smp.h
++++ b/arch/x86/include/asm/smp.h
+@@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
+ int wbinvd_on_all_cpus(void);
+ void cond_wakeup_cpu0(void);
+
++void smp_kick_mwait_play_dead(void);
++
+ void native_smp_send_reschedule(int cpu);
+ void native_send_call_func_ipi(const struct cpumask *mask);
+ void native_send_call_func_single_ipi(int cpu);
+diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
+index f5fdeb1e3606e..46a679388d19b 100644
+--- a/arch/x86/kernel/cpu/microcode/amd.c
++++ b/arch/x86/kernel/cpu/microcode/amd.c
+@@ -705,7 +705,7 @@ static enum ucode_state apply_microcode_amd(int cpu)
+ rdmsr(MSR_AMD64_PATCH_LEVEL, rev, dummy);
+
+ /* need to apply patch? */
+- if (rev >= mc_amd->hdr.patch_id) {
++ if (rev > mc_amd->hdr.patch_id) {
+ ret = UCODE_OK;
+ goto out;
+ }
+diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
+index dac41a0072ea1..ff9b80a0e3e3b 100644
+--- a/arch/x86/kernel/process.c
++++ b/arch/x86/kernel/process.c
+@@ -759,15 +759,26 @@ bool xen_set_default_idle(void)
+ }
+ #endif
+
++struct cpumask cpus_stop_mask;
++
+ void __noreturn stop_this_cpu(void *dummy)
+ {
++ struct cpuinfo_x86 *c = this_cpu_ptr(&cpu_info);
++ unsigned int cpu = smp_processor_id();
++
+ local_irq_disable();
++
+ /*
+- * Remove this CPU:
++ * Remove this CPU from the online mask and disable it
++ * unconditionally. This might be redundant in case that the reboot
++ * vector was handled late and stop_other_cpus() sent an NMI.
++ *
++ * According to SDM and APM NMIs can be accepted even after soft
++ * disabling the local APIC.
+ */
+- set_cpu_online(smp_processor_id(), false);
++ set_cpu_online(cpu, false);
+ disable_local_APIC();
+- mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
++ mcheck_cpu_clear(c);
+
+ /*
+ * Use wbinvd on processors that support SME. This provides support
+@@ -781,8 +792,17 @@ void __noreturn stop_this_cpu(void *dummy)
+ * Test the CPUID bit directly because the machine might've cleared
+ * X86_FEATURE_SME due to cmdline options.
+ */
+- if (cpuid_eax(0x8000001f) & BIT(0))
++ if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
+ native_wbinvd();
++
++ /*
++ * This brings a cache line back and dirties it, but
++ * native_stop_other_cpus() will overwrite cpus_stop_mask after it
++ * observed that all CPUs reported stop. This write will invalidate
++ * the related cache line on this CPU.
++ */
++ cpumask_clear_cpu(cpu, &cpus_stop_mask);
++
+ for (;;) {
+ /*
+ * Use native_halt() so that memory contents don't change
+diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
+index 375b33ecafa27..174d6232b87fd 100644
+--- a/arch/x86/kernel/smp.c
++++ b/arch/x86/kernel/smp.c
+@@ -21,12 +21,14 @@
+ #include <linux/interrupt.h>
+ #include <linux/cpu.h>
+ #include <linux/gfp.h>
++#include <linux/kexec.h>
+
+ #include <asm/mtrr.h>
+ #include <asm/tlbflush.h>
+ #include <asm/mmu_context.h>
+ #include <asm/proto.h>
+ #include <asm/apic.h>
++#include <asm/cpu.h>
+ #include <asm/idtentry.h>
+ #include <asm/nmi.h>
+ #include <asm/mce.h>
+@@ -146,34 +148,47 @@ static int register_stop_handler(void)
+
+ static void native_stop_other_cpus(int wait)
+ {
+- unsigned long flags;
+- unsigned long timeout;
++ unsigned int cpu = smp_processor_id();
++ unsigned long flags, timeout;
+
+ if (reboot_force)
+ return;
+
+- /*
+- * Use an own vector here because smp_call_function
+- * does lots of things not suitable in a panic situation.
+- */
++ /* Only proceed if this is the first CPU to reach this code */
++ if (atomic_cmpxchg(&stopping_cpu, -1, cpu) != -1)
++ return;
++
++ /* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
++ if (kexec_in_progress)
++ smp_kick_mwait_play_dead();
+
+ /*
+- * We start by using the REBOOT_VECTOR irq.
+- * The irq is treated as a sync point to allow critical
+- * regions of code on other cpus to release their spin locks
+- * and re-enable irqs. Jumping straight to an NMI might
+- * accidentally cause deadlocks with further shutdown/panic
+- * code. By syncing, we give the cpus up to one second to
+- * finish their work before we force them off with the NMI.
++ * 1) Send an IPI on the reboot vector to all other CPUs.
++ *
++ * The other CPUs should react on it after leaving critical
++ * sections and re-enabling interrupts. They might still hold
++ * locks, but there is nothing which can be done about that.
++ *
++ * 2) Wait for all other CPUs to report that they reached the
++ * HLT loop in stop_this_cpu()
++ *
++ * 3) If #2 timed out send an NMI to the CPUs which did not
++ * yet report
++ *
++ * 4) Wait for all other CPUs to report that they reached the
++ * HLT loop in stop_this_cpu()
++ *
++ * #3 can obviously race against a CPU reaching the HLT loop late.
++ * That CPU will have reported already and the "have all CPUs
++ * reached HLT" condition will be true despite the fact that the
++ * other CPU is still handling the NMI. Again, there is no
++ * protection against that as "disabled" APICs still respond to
++ * NMIs.
+ */
+- if (num_online_cpus() > 1) {
+- /* did someone beat us here? */
+- if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
+- return;
+-
+- /* sync above data before sending IRQ */
+- wmb();
++ cpumask_copy(&cpus_stop_mask, cpu_online_mask);
++ cpumask_clear_cpu(cpu, &cpus_stop_mask);
+
++ if (!cpumask_empty(&cpus_stop_mask)) {
+ apic_send_IPI_allbutself(REBOOT_VECTOR);
+
+ /*
+@@ -183,24 +198,22 @@ static void native_stop_other_cpus(int wait)
+ * CPUs reach shutdown state.
+ */
+ timeout = USEC_PER_SEC;
+- while (num_online_cpus() > 1 && timeout--)
++ while (!cpumask_empty(&cpus_stop_mask) && timeout--)
+ udelay(1);
+ }
+
+ /* if the REBOOT_VECTOR didn't work, try with the NMI */
+- if (num_online_cpus() > 1) {
++ if (!cpumask_empty(&cpus_stop_mask)) {
+ /*
+ * If NMI IPI is enabled, try to register the stop handler
+ * and send the IPI. In any case try to wait for the other
+ * CPUs to stop.
+ */
+ if (!smp_no_nmi_ipi && !register_stop_handler()) {
+- /* Sync above data before sending IRQ */
+- wmb();
+-
+ pr_emerg("Shutting down cpus with NMI\n");
+
+- apic_send_IPI_allbutself(NMI_VECTOR);
++ for_each_cpu(cpu, &cpus_stop_mask)
++ apic->send_IPI(cpu, NMI_VECTOR);
+ }
+ /*
+ * Don't wait longer than 10 ms if the caller didn't
+@@ -208,7 +221,7 @@ static void native_stop_other_cpus(int wait)
+ * one or more CPUs do not reach shutdown state.
+ */
+ timeout = USEC_PER_MSEC * 10;
+- while (num_online_cpus() > 1 && (wait || timeout--))
++ while (!cpumask_empty(&cpus_stop_mask) && (wait || timeout--))
+ udelay(1);
+ }
+
+@@ -216,6 +229,12 @@ static void native_stop_other_cpus(int wait)
+ disable_local_APIC();
+ mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
+ local_irq_restore(flags);
++
++ /*
++ * Ensure that the cpus_stop_mask cache lines are invalidated on
++ * the other CPUs. See comment vs. SME in stop_this_cpu().
++ */
++ cpumask_clear(&cpus_stop_mask);
+ }
+
+ /*
+diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
+index 352f0ce1ece42..483df04276784 100644
+--- a/arch/x86/kernel/smpboot.c
++++ b/arch/x86/kernel/smpboot.c
+@@ -53,6 +53,7 @@
+ #include <linux/tboot.h>
+ #include <linux/gfp.h>
+ #include <linux/cpuidle.h>
++#include <linux/kexec.h>
+ #include <linux/numa.h>
+ #include <linux/pgtable.h>
+ #include <linux/overflow.h>
+@@ -101,6 +102,20 @@ EXPORT_PER_CPU_SYMBOL(cpu_die_map);
+ DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
+ EXPORT_PER_CPU_SYMBOL(cpu_info);
+
++struct mwait_cpu_dead {
++ unsigned int control;
++ unsigned int status;
++};
++
++#define CPUDEAD_MWAIT_WAIT 0xDEADBEEF
++#define CPUDEAD_MWAIT_KEXEC_HLT 0x4A17DEAD
++
++/*
++ * Cache line aligned data for mwait_play_dead(). Separate on purpose so
++ * that it's unlikely to be touched by other CPUs.
++ */
++static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);
++
+ /* Logical package management. We might want to allocate that dynamically */
+ unsigned int __max_logical_packages __read_mostly;
+ EXPORT_SYMBOL(__max_logical_packages);
+@@ -162,6 +177,10 @@ static void smp_callin(void)
+ {
+ int cpuid;
+
++ /* Mop up eventual mwait_play_dead() wreckage */
++ this_cpu_write(mwait_cpu_dead.status, 0);
++ this_cpu_write(mwait_cpu_dead.control, 0);
++
+ /*
+ * If waken up by an INIT in an 82489DX configuration
+ * cpu_callout_mask guarantees we don't get here before
+@@ -1758,10 +1777,10 @@ EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
+ */
+ static inline void mwait_play_dead(void)
+ {
++ struct mwait_cpu_dead *md = this_cpu_ptr(&mwait_cpu_dead);
+ unsigned int eax, ebx, ecx, edx;
+ unsigned int highest_cstate = 0;
+ unsigned int highest_subcstate = 0;
+- void *mwait_ptr;
+ int i;
+
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
+@@ -1796,12 +1815,9 @@ static inline void mwait_play_dead(void)
+ (highest_subcstate - 1);
+ }
+
+- /*
+- * This should be a memory location in a cache line which is
+- * unlikely to be touched by other processors. The actual
+- * content is immaterial as it is not actually modified in any way.
+- */
+-	mwait_ptr = &current_thread_info()->flags;
++ /* Set up state for the kexec() hack below */
++ md->status = CPUDEAD_MWAIT_WAIT;
++ md->control = CPUDEAD_MWAIT_WAIT;
+
+ wbinvd();
+
+@@ -1814,16 +1830,63 @@ static inline void mwait_play_dead(void)
+ * case where we return around the loop.
+ */
+ mb();
+- clflush(mwait_ptr);
++ clflush(md);
+ mb();
+- __monitor(mwait_ptr, 0, 0);
++ __monitor(md, 0, 0);
+ mb();
+ __mwait(eax, 0);
+
++ if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
++ /*
++ * Kexec is about to happen. Don't go back into mwait() as
++ * the kexec kernel might overwrite text and data including
++ * page tables and stack. So mwait() would resume when the
++ * monitor cache line is written to and then the CPU goes
++ * south due to overwritten text, page tables and stack.
++ *
++ * Note: This does _NOT_ protect against a stray MCE, NMI,
++ * SMI. They will resume execution at the instruction
++ * following the HLT instruction and run into the problem
++ * which this is trying to prevent.
++ */
++ WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
++ while(1)
++ native_halt();
++ }
++
+ cond_wakeup_cpu0();
+ }
+ }
+
++/*
++ * Kick all "offline" CPUs out of mwait on kexec(). See comment in
++ * mwait_play_dead().
++ */
++void smp_kick_mwait_play_dead(void)
++{
++ u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
++ struct mwait_cpu_dead *md;
++ unsigned int cpu, i;
++
++ for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
++ md = per_cpu_ptr(&mwait_cpu_dead, cpu);
++
++ /* Does it sit in mwait_play_dead() ? */
++ if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
++ continue;
++
++ /* Wait up to 5ms */
++ for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
++ /* Bring it out of mwait */
++ WRITE_ONCE(md->control, newstate);
++ udelay(5);
++ }
++
++ if (READ_ONCE(md->status) != newstate)
++ pr_err_once("CPU%u is stuck in mwait_play_dead()\n", cpu);
++ }
++}
++
+ void __noreturn hlt_play_dead(void)
+ {
+ if (__this_cpu_read(cpu_info.x86) >= 4)
+diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
+index e4399983c50c0..e8711b2cafaf7 100644
+--- a/arch/x86/mm/fault.c
++++ b/arch/x86/mm/fault.c
+@@ -880,12 +880,6 @@ __bad_area(struct pt_regs *regs, unsigned long error_code,
+ __bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
+ }
+
+-static noinline void
+-bad_area(struct pt_regs *regs, unsigned long error_code, unsigned long address)
+-{
+- __bad_area(regs, error_code, address, 0, SEGV_MAPERR);
+-}
+-
+ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
+ struct vm_area_struct *vma)
+ {
+@@ -1366,51 +1360,10 @@ void do_user_addr_fault(struct pt_regs *regs,
+ lock_mmap:
+ #endif /* CONFIG_PER_VMA_LOCK */
+
+- /*
+- * Kernel-mode access to the user address space should only occur
+- * on well-defined single instructions listed in the exception
+- * tables. But, an erroneous kernel fault occurring outside one of
+- * those areas which also holds mmap_lock might deadlock attempting
+- * to validate the fault against the address space.
+- *
+- * Only do the expensive exception table search when we might be at
+- * risk of a deadlock. This happens if we
+- * 1. Failed to acquire mmap_lock, and
+- * 2. The access did not originate in userspace.
+- */
+- if (unlikely(!mmap_read_trylock(mm))) {
+- if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
+- /*
+- * Fault from code in kernel from
+- * which we do not expect faults.
+- */
+- bad_area_nosemaphore(regs, error_code, address);
+- return;
+- }
+ retry:
+- mmap_read_lock(mm);
+- } else {
+- /*
+- * The above down_read_trylock() might have succeeded in
+- * which case we'll have missed the might_sleep() from
+- * down_read():
+- */
+- might_sleep();
+- }
+-
+- vma = find_vma(mm, address);
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (unlikely(!vma)) {
+- bad_area(regs, error_code, address);
+- return;
+- }
+- if (likely(vma->vm_start <= address))
+- goto good_area;
+- if (unlikely(!(vma->vm_flags & VM_GROWSDOWN))) {
+- bad_area(regs, error_code, address);
+- return;
+- }
+- if (unlikely(expand_stack(vma, address))) {
+- bad_area(regs, error_code, address);
++ bad_area_nosemaphore(regs, error_code, address);
+ return;
+ }
+
+@@ -1418,7 +1371,6 @@ retry:
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it..
+ */
+-good_area:
+ if (unlikely(access_error(error_code, vma))) {
+ bad_area_access_error(regs, error_code, address, vma);
+ return;
+diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
+index 3c6e5471f025b..2d0d6440b9796 100644
+--- a/arch/xtensa/Kconfig
++++ b/arch/xtensa/Kconfig
+@@ -49,6 +49,7 @@ config XTENSA
+ select HAVE_SYSCALL_TRACEPOINTS
+ select HAVE_VIRT_CPU_ACCOUNTING_GEN
+ select IRQ_DOMAIN
++ select LOCK_MM_AND_FIND_VMA
+ select MODULES_USE_ELF_RELA
+ select PERF_USE_VMALLOC
+ select TRACE_IRQFLAGS_SUPPORT
+diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
+index faf7cf35a0ee3..d1eb8d6c5b826 100644
+--- a/arch/xtensa/mm/fault.c
++++ b/arch/xtensa/mm/fault.c
+@@ -130,23 +130,14 @@ void do_page_fault(struct pt_regs *regs)
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
+
+ retry:
+- mmap_read_lock(mm);
+- vma = find_vma(mm, address);
+-
++ vma = lock_mm_and_find_vma(mm, address, regs);
+ if (!vma)
+- goto bad_area;
+- if (vma->vm_start <= address)
+- goto good_area;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- goto bad_area;
+- if (expand_stack(vma, address))
+- goto bad_area;
++ goto bad_area_nosemaphore;
+
+ /* Ok, we have a good vm_area for this memory access, so
+ * we can handle it..
+ */
+
+-good_area:
+ code = SEGV_ACCERR;
+
+ if (is_write) {
+@@ -205,6 +196,7 @@ good_area:
+ */
+ bad_area:
+ mmap_read_unlock(mm);
++bad_area_nosemaphore:
+ if (user_mode(regs)) {
+ force_sig_fault(SIGSEGV, code, (void *) address);
+ return;
+diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
+index ddd346a239e0b..a5764946434c6 100644
+--- a/drivers/cpufreq/amd-pstate.c
++++ b/drivers/cpufreq/amd-pstate.c
+@@ -1356,7 +1356,7 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
+ .online = amd_pstate_epp_cpu_online,
+ .suspend = amd_pstate_epp_suspend,
+ .resume = amd_pstate_epp_resume,
+- .name = "amd_pstate_epp",
++ .name = "amd-pstate-epp",
+ .attr = amd_pstate_epp_attr,
+ };
+
+diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c
+index 5e1a412fd28fa..f7e06d433a915 100644
+--- a/drivers/hid/hid-logitech-hidpp.c
++++ b/drivers/hid/hid-logitech-hidpp.c
+@@ -4553,7 +4553,7 @@ static const struct hid_device_id hidpp_devices[] = {
+ { /* wireless touchpad T651 */
+ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH,
+ USB_DEVICE_ID_LOGITECH_T651),
+- .driver_data = HIDPP_QUIRK_CLASS_WTP },
++ .driver_data = HIDPP_QUIRK_CLASS_WTP | HIDPP_QUIRK_DELAYED_INIT },
+ { /* Mouse Logitech Anywhere MX */
+ LDJ_DEVICE(0x1017), .driver_data = HIDPP_QUIRK_HI_RES_SCROLL_1P0 },
+ { /* Mouse logitech M560 */
+diff --git a/drivers/hid/hidraw.c b/drivers/hid/hidraw.c
+index 93e62b161501c..e63c56a0d57fb 100644
+--- a/drivers/hid/hidraw.c
++++ b/drivers/hid/hidraw.c
+@@ -272,7 +272,12 @@ static int hidraw_open(struct inode *inode, struct file *file)
+ goto out;
+ }
+
+- down_read(&minors_rwsem);
++ /*
++ * Technically not writing to the hidraw_table but a write lock is
++ * required to protect the device refcount. This is symmetrical to
++ * hidraw_release().
++ */
++ down_write(&minors_rwsem);
+ if (!hidraw_table[minor] || !hidraw_table[minor]->exist) {
+ err = -ENODEV;
+ goto out_unlock;
+@@ -301,7 +306,7 @@ static int hidraw_open(struct inode *inode, struct file *file)
+ spin_unlock_irqrestore(&hidraw_table[minor]->list_lock, flags);
+ file->private_data = list;
+ out_unlock:
+- up_read(&minors_rwsem);
++ up_write(&minors_rwsem);
+ out:
+ if (err < 0)
+ kfree(list);
+diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c
+index 2ccf838371343..174bf03908d7c 100644
+--- a/drivers/hid/wacom_wac.c
++++ b/drivers/hid/wacom_wac.c
+@@ -1314,7 +1314,7 @@ static void wacom_intuos_pro2_bt_pen(struct wacom_wac *wacom)
+ struct input_dev *pen_input = wacom->pen_input;
+ unsigned char *data = wacom->data;
+ int number_of_valid_frames = 0;
+- int time_interval = 15000000;
++ ktime_t time_interval = 15000000;
+ ktime_t time_packet_received = ktime_get();
+ int i;
+
+@@ -1348,7 +1348,7 @@ static void wacom_intuos_pro2_bt_pen(struct wacom_wac *wacom)
+ if (number_of_valid_frames) {
+ if (wacom->hid_data.time_delayed)
+ time_interval = ktime_get() - wacom->hid_data.time_delayed;
+- time_interval /= number_of_valid_frames;
++ time_interval = div_u64(time_interval, number_of_valid_frames);
+ wacom->hid_data.time_delayed = time_packet_received;
+ }
+
+@@ -1359,7 +1359,7 @@ static void wacom_intuos_pro2_bt_pen(struct wacom_wac *wacom)
+ bool range = frame[0] & 0x20;
+ bool invert = frame[0] & 0x10;
+ int frames_number_reversed = number_of_valid_frames - i - 1;
+- int event_timestamp = time_packet_received - frames_number_reversed * time_interval;
++ ktime_t event_timestamp = time_packet_received - frames_number_reversed * time_interval;
+
+ if (!valid)
+ continue;
+diff --git a/drivers/hid/wacom_wac.h b/drivers/hid/wacom_wac.h
+index 1a40bb8c5810c..ee21bb260f22f 100644
+--- a/drivers/hid/wacom_wac.h
++++ b/drivers/hid/wacom_wac.h
+@@ -324,7 +324,7 @@ struct hid_data {
+ int ps_connected;
+ bool pad_input_event_flag;
+ unsigned short sequence_number;
+- int time_delayed;
++ ktime_t time_delayed;
+ };
+
+ struct wacom_remote_data {
+diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
+index 864e4ffb6aa94..261352a232716 100644
+--- a/drivers/iommu/amd/iommu_v2.c
++++ b/drivers/iommu/amd/iommu_v2.c
+@@ -485,8 +485,8 @@ static void do_fault(struct work_struct *work)
+ flags |= FAULT_FLAG_REMOTE;
+
+ mmap_read_lock(mm);
+- vma = find_extend_vma(mm, address);
+- if (!vma || address < vma->vm_start)
++ vma = vma_lookup(mm, address);
++ if (!vma)
+ /* failed to get a vma in the right range */
+ goto out;
+
+diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
+index 9821bc44f5ac1..3ebd4b6586b3e 100644
+--- a/drivers/iommu/iommu-sva.c
++++ b/drivers/iommu/iommu-sva.c
+@@ -175,7 +175,7 @@ iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)
+
+ mmap_read_lock(mm);
+
+- vma = find_extend_vma(mm, prm->addr);
++ vma = vma_lookup(mm, prm->addr);
+ if (!vma)
+ /* Unmapped area */
+ goto out_put_mm;
+diff --git a/drivers/thermal/mediatek/auxadc_thermal.c b/drivers/thermal/mediatek/auxadc_thermal.c
+index 0b5528804bbd6..f59d36de20a09 100644
+--- a/drivers/thermal/mediatek/auxadc_thermal.c
++++ b/drivers/thermal/mediatek/auxadc_thermal.c
+@@ -1222,12 +1222,7 @@ static int mtk_thermal_probe(struct platform_device *pdev)
+ return -ENODEV;
+ }
+
+- auxadc_base = devm_of_iomap(&pdev->dev, auxadc, 0, NULL);
+- if (IS_ERR(auxadc_base)) {
+- of_node_put(auxadc);
+- return PTR_ERR(auxadc_base);
+- }
+-
++ auxadc_base = of_iomap(auxadc, 0);
+ auxadc_phys_base = of_get_phys_base(auxadc);
+
+ of_node_put(auxadc);
+@@ -1243,12 +1238,7 @@ static int mtk_thermal_probe(struct platform_device *pdev)
+ return -ENODEV;
+ }
+
+- apmixed_base = devm_of_iomap(&pdev->dev, apmixedsys, 0, NULL);
+- if (IS_ERR(apmixed_base)) {
+- of_node_put(apmixedsys);
+- return PTR_ERR(apmixed_base);
+- }
+-
++ apmixed_base = of_iomap(apmixedsys, 0);
+ apmixed_phys_base = of_get_phys_base(apmixedsys);
+
+ of_node_put(apmixedsys);
+diff --git a/drivers/video/fbdev/core/sysimgblt.c b/drivers/video/fbdev/core/sysimgblt.c
+index 335e92b813fc4..665ef7a0a2495 100644
+--- a/drivers/video/fbdev/core/sysimgblt.c
++++ b/drivers/video/fbdev/core/sysimgblt.c
+@@ -189,7 +189,7 @@ static void fast_imageblit(const struct fb_image *image, struct fb_info *p,
+ u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
+ u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
+ u32 bit_mask, eorx, shift;
+- const char *s = image->data, *src;
++ const u8 *s = image->data, *src;
+ u32 *dst;
+ const u32 *tab;
+ size_t tablen;
+diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
+index 1033fbdfdbec7..befa93582ed79 100644
+--- a/fs/binfmt_elf.c
++++ b/fs/binfmt_elf.c
+@@ -320,10 +320,10 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
+ * Grow the stack manually; some architectures have a limit on how
+ * far ahead a user-space access may be in order to grow the stack.
+ */
+- if (mmap_read_lock_killable(mm))
++ if (mmap_write_lock_killable(mm))
+ return -EINTR;
+- vma = find_extend_vma(mm, bprm->p);
+- mmap_read_unlock(mm);
++ vma = find_extend_vma_locked(mm, bprm->p);
++ mmap_write_unlock(mm);
+ if (!vma)
+ return -EFAULT;
+
+diff --git a/fs/exec.c b/fs/exec.c
+index a466e797c8e2e..b84b4fee0f82f 100644
+--- a/fs/exec.c
++++ b/fs/exec.c
+@@ -200,33 +200,39 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
+ int write)
+ {
+ struct page *page;
++ struct vm_area_struct *vma = bprm->vma;
++ struct mm_struct *mm = bprm->mm;
+ int ret;
+- unsigned int gup_flags = 0;
+
+-#ifdef CONFIG_STACK_GROWSUP
+- if (write) {
+- ret = expand_downwards(bprm->vma, pos);
+- if (ret < 0)
++ /*
++ * Avoid relying on expanding the stack down in GUP (which
++ * does not work for STACK_GROWSUP anyway), and just do it
++ * by hand ahead of time.
++ */
++ if (write && pos < vma->vm_start) {
++ mmap_write_lock(mm);
++ ret = expand_downwards(vma, pos);
++ if (unlikely(ret < 0)) {
++ mmap_write_unlock(mm);
+ return NULL;
+- }
+-#endif
+-
+- if (write)
+- gup_flags |= FOLL_WRITE;
++ }
++ mmap_write_downgrade(mm);
++ } else
++ mmap_read_lock(mm);
+
+ /*
+ * We are doing an exec(). 'current' is the process
+- * doing the exec and bprm->mm is the new process's mm.
++ * doing the exec and 'mm' is the new process's mm.
+ */
+- mmap_read_lock(bprm->mm);
+- ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
++ ret = get_user_pages_remote(mm, pos, 1,
++ write ? FOLL_WRITE : 0,
+ &page, NULL, NULL);
+- mmap_read_unlock(bprm->mm);
++ mmap_read_unlock(mm);
+ if (ret <= 0)
+ return NULL;
+
+ if (write)
+- acct_arg_size(bprm, vma_pages(bprm->vma));
++ acct_arg_size(bprm, vma_pages(vma));
+
+ return page;
+ }
+@@ -853,7 +859,7 @@ int setup_arg_pages(struct linux_binprm *bprm,
+ stack_base = vma->vm_end - stack_expand;
+ #endif
+ current->mm->start_stack = bprm->p;
+- ret = expand_stack(vma, stack_base);
++ ret = expand_stack_locked(vma, stack_base);
+ if (ret)
+ ret = -EFAULT;
+
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index 27ce77080c79c..6cbcc55a80b02 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -2314,6 +2314,9 @@ void pagecache_isize_extended(struct inode *inode, loff_t from, loff_t to);
+ void truncate_pagecache_range(struct inode *inode, loff_t offset, loff_t end);
+ int generic_error_remove_page(struct address_space *mapping, struct page *page);
+
++struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
++ unsigned long address, struct pt_regs *regs);
++
+ #ifdef CONFIG_MMU
+ extern vm_fault_t handle_mm_fault(struct vm_area_struct *vma,
+ unsigned long address, unsigned int flags,
+@@ -3190,16 +3193,11 @@ extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf);
+
+ extern unsigned long stack_guard_gap;
+ /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
+-extern int expand_stack(struct vm_area_struct *vma, unsigned long address);
++int expand_stack_locked(struct vm_area_struct *vma, unsigned long address);
++struct vm_area_struct *expand_stack(struct mm_struct * mm, unsigned long addr);
+
+ /* CONFIG_STACK_GROWSUP still needs to grow downwards at some places */
+-extern int expand_downwards(struct vm_area_struct *vma,
+- unsigned long address);
+-#if VM_GROWSUP
+-extern int expand_upwards(struct vm_area_struct *vma, unsigned long address);
+-#else
+- #define expand_upwards(vma, address) (0)
+-#endif
++int expand_downwards(struct vm_area_struct *vma, unsigned long address);
+
+ /* Look up the first VMA which satisfies addr < vm_end, NULL if none. */
+ extern struct vm_area_struct * find_vma(struct mm_struct * mm, unsigned long addr);
+@@ -3294,7 +3292,8 @@ unsigned long change_prot_numa(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end);
+ #endif
+
+-struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
++struct vm_area_struct *find_extend_vma_locked(struct mm_struct *,
++ unsigned long addr);
+ int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
+ unsigned long pfn, unsigned long size, pgprot_t);
+ int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
+diff --git a/lib/maple_tree.c b/lib/maple_tree.c
+index 8ebc43d4cc8c5..35264f1936a37 100644
+--- a/lib/maple_tree.c
++++ b/lib/maple_tree.c
+@@ -4263,11 +4263,13 @@ done:
+
+ static inline void mas_wr_end_piv(struct ma_wr_state *wr_mas)
+ {
+- while ((wr_mas->mas->last > wr_mas->end_piv) &&
+- (wr_mas->offset_end < wr_mas->node_end))
+- wr_mas->end_piv = wr_mas->pivots[++wr_mas->offset_end];
++ while ((wr_mas->offset_end < wr_mas->node_end) &&
++ (wr_mas->mas->last > wr_mas->pivots[wr_mas->offset_end]))
++ wr_mas->offset_end++;
+
+- if (wr_mas->mas->last > wr_mas->end_piv)
++ if (wr_mas->offset_end < wr_mas->node_end)
++ wr_mas->end_piv = wr_mas->pivots[wr_mas->offset_end];
++ else
+ wr_mas->end_piv = wr_mas->mas->max;
+ }
+
+@@ -4424,7 +4426,6 @@ static inline void *mas_wr_store_entry(struct ma_wr_state *wr_mas)
+ }
+
+ /* At this point, we are at the leaf node that needs to be altered. */
+- wr_mas->end_piv = wr_mas->r_max;
+ mas_wr_end_piv(wr_mas);
+
+ if (!wr_mas->entry)
+diff --git a/mm/Kconfig b/mm/Kconfig
+index 7672a22647b4a..e3454087fd31a 100644
+--- a/mm/Kconfig
++++ b/mm/Kconfig
+@@ -1206,6 +1206,10 @@ config PER_VMA_LOCK
+ This feature allows locking each virtual memory area separately when
+ handling page faults instead of taking mmap_lock.
+
++config LOCK_MM_AND_FIND_VMA
++ bool
++ depends on !STACK_GROWSUP
++
+ source "mm/damon/Kconfig"
+
+ endmenu
+diff --git a/mm/gup.c b/mm/gup.c
+index bbe4162365933..94102390b273a 100644
+--- a/mm/gup.c
++++ b/mm/gup.c
+@@ -1096,7 +1096,11 @@ static long __get_user_pages(struct mm_struct *mm,
+
+ /* first iteration or cross vma bound */
+ if (!vma || start >= vma->vm_end) {
+- vma = find_extend_vma(mm, start);
++ vma = find_vma(mm, start);
++ if (vma && (start < vma->vm_start)) {
++ WARN_ON_ONCE(vma->vm_flags & VM_GROWSDOWN);
++ vma = NULL;
++ }
+ if (!vma && in_gate_area(mm, start)) {
+ ret = get_gate_page(mm, start & PAGE_MASK,
+ gup_flags, &vma,
+@@ -1265,9 +1269,13 @@ int fixup_user_fault(struct mm_struct *mm,
+ fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+
+ retry:
+- vma = find_extend_vma(mm, address);
+- if (!vma || address < vma->vm_start)
++ vma = find_vma(mm, address);
++ if (!vma)
++ return -EFAULT;
++ if (address < vma->vm_start ) {
++ WARN_ON_ONCE(vma->vm_flags & VM_GROWSDOWN);
+ return -EFAULT;
++ }
+
+ if (!vma_permits_fault(vma, fault_flags))
+ return -EFAULT;
+diff --git a/mm/khugepaged.c b/mm/khugepaged.c
+index 2d0d58fb4e7fa..47b59f2843f60 100644
+--- a/mm/khugepaged.c
++++ b/mm/khugepaged.c
+@@ -1918,9 +1918,9 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
+ }
+ } while (1);
+
+- xas_set(&xas, start);
+ for (index = start; index < end; index++) {
+- page = xas_next(&xas);
++ xas_set(&xas, index);
++ page = xas_load(&xas);
+
+ VM_BUG_ON(index != xas.xa_index);
+ if (is_shmem) {
+@@ -1935,7 +1935,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
+ result = SCAN_TRUNCATED;
+ goto xa_locked;
+ }
+- xas_set(&xas, index + 1);
+ }
+ if (!shmem_charge(mapping->host, 1)) {
+ result = SCAN_FAIL;
+@@ -2071,7 +2070,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
+
+ xas_lock_irq(&xas);
+
+- VM_BUG_ON_PAGE(page != xas_load(&xas), page);
++ VM_BUG_ON_PAGE(page != xa_load(xas.xa, index), page);
+
+ /*
+ * We control three references to the page:
+diff --git a/mm/memory.c b/mm/memory.c
+index f69fbc2511984..5ce82a76201d5 100644
+--- a/mm/memory.c
++++ b/mm/memory.c
+@@ -5262,6 +5262,125 @@ out:
+ }
+ EXPORT_SYMBOL_GPL(handle_mm_fault);
+
++#ifdef CONFIG_LOCK_MM_AND_FIND_VMA
++#include <linux/extable.h>
++
++static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
++{
++ /* Even if this succeeds, make it clear we *might* have slept */
++ if (likely(mmap_read_trylock(mm))) {
++ might_sleep();
++ return true;
++ }
++
++ if (regs && !user_mode(regs)) {
++ unsigned long ip = instruction_pointer(regs);
++ if (!search_exception_tables(ip))
++ return false;
++ }
++
++ return !mmap_read_lock_killable(mm);
++}
++
++static inline bool mmap_upgrade_trylock(struct mm_struct *mm)
++{
++ /*
++ * We don't have this operation yet.
++ *
++ * It should be easy enough to do: it's basically a
++ * atomic_long_try_cmpxchg_acquire()
++ * from RWSEM_READER_BIAS -> RWSEM_WRITER_LOCKED, but
++ * it also needs the proper lockdep magic etc.
++ */
++ return false;
++}
++
++static inline bool upgrade_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
++{
++ mmap_read_unlock(mm);
++ if (regs && !user_mode(regs)) {
++ unsigned long ip = instruction_pointer(regs);
++ if (!search_exception_tables(ip))
++ return false;
++ }
++ return !mmap_write_lock_killable(mm);
++}
++
++/*
++ * Helper for page fault handling.
++ *
++ * This is kind of equivalend to "mmap_read_lock()" followed
++ * by "find_extend_vma()", except it's a lot more careful about
++ * the locking (and will drop the lock on failure).
++ *
++ * For example, if we have a kernel bug that causes a page
++ * fault, we don't want to just use mmap_read_lock() to get
++ * the mm lock, because that would deadlock if the bug were
++ * to happen while we're holding the mm lock for writing.
++ *
++ * So this checks the exception tables on kernel faults in
++ * order to only do this all for instructions that are actually
++ * expected to fault.
++ *
++ * We can also actually take the mm lock for writing if we
++ * need to extend the vma, which helps the VM layer a lot.
++ */
++struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
++ unsigned long addr, struct pt_regs *regs)
++{
++ struct vm_area_struct *vma;
++
++ if (!get_mmap_lock_carefully(mm, regs))
++ return NULL;
++
++ vma = find_vma(mm, addr);
++ if (likely(vma && (vma->vm_start <= addr)))
++ return vma;
++
++ /*
++ * Well, dang. We might still be successful, but only
++ * if we can extend a vma to do so.
++ */
++ if (!vma || !(vma->vm_flags & VM_GROWSDOWN)) {
++ mmap_read_unlock(mm);
++ return NULL;
++ }
++
++ /*
++ * We can try to upgrade the mmap lock atomically,
++ * in which case we can continue to use the vma
++ * we already looked up.
++ *
++ * Otherwise we'll have to drop the mmap lock and
++ * re-take it, and also look up the vma again,
++ * re-checking it.
++ */
++ if (!mmap_upgrade_trylock(mm)) {
++ if (!upgrade_mmap_lock_carefully(mm, regs))
++ return NULL;
++
++ vma = find_vma(mm, addr);
++ if (!vma)
++ goto fail;
++ if (vma->vm_start <= addr)
++ goto success;
++ if (!(vma->vm_flags & VM_GROWSDOWN))
++ goto fail;
++ }
++
++ if (expand_stack_locked(vma, addr))
++ goto fail;
++
++success:
++ mmap_write_downgrade(mm);
++ return vma;
++
++fail:
++ mmap_write_unlock(mm);
++ return NULL;
++}
++#endif
++
+ #ifdef CONFIG_PER_VMA_LOCK
+ /*
+ * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be
+@@ -5594,6 +5713,14 @@ int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
+ if (mmap_read_lock_killable(mm))
+ return 0;
+
++ /* We might need to expand the stack to access it */
++ vma = vma_lookup(mm, addr);
++ if (!vma) {
++ vma = expand_stack(mm, addr);
++ if (!vma)
++ return 0;
++ }
++
+ /* ignore errors, just check how much was successfully transferred */
+ while (len) {
+ int bytes, ret, offset;
+diff --git a/mm/mmap.c b/mm/mmap.c
+index d600404580b28..bc510361acec2 100644
+--- a/mm/mmap.c
++++ b/mm/mmap.c
+@@ -1935,7 +1935,7 @@ static int acct_stack_growth(struct vm_area_struct *vma,
+ * PA-RISC uses this for its stack; IA64 for its Register Backing Store.
+ * vma is the last one with address > vma->vm_end. Have to extend vma.
+ */
+-int expand_upwards(struct vm_area_struct *vma, unsigned long address)
++static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
+ {
+ struct mm_struct *mm = vma->vm_mm;
+ struct vm_area_struct *next;
+@@ -2027,6 +2027,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
+
+ /*
+ * vma is the first one with address < vma->vm_start. Have to extend vma.
++ * mmap_lock held for writing.
+ */
+ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
+ {
+@@ -2035,16 +2036,20 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
+ struct vm_area_struct *prev;
+ int error = 0;
+
++ if (!(vma->vm_flags & VM_GROWSDOWN))
++ return -EFAULT;
++
+ address &= PAGE_MASK;
+- if (address < mmap_min_addr)
++ if (address < mmap_min_addr || address < FIRST_USER_ADDRESS)
+ return -EPERM;
+
+ /* Enforce stack_guard_gap */
+ prev = mas_prev(&mas, 0);
+ /* Check that both stack segments have the same anon_vma? */
+- if (prev && !(prev->vm_flags & VM_GROWSDOWN) &&
+- vma_is_accessible(prev)) {
+- if (address - prev->vm_end < stack_guard_gap)
++ if (prev) {
++ if (!(prev->vm_flags & VM_GROWSDOWN) &&
++ vma_is_accessible(prev) &&
++ (address - prev->vm_end < stack_guard_gap))
+ return -ENOMEM;
+ }
+
+@@ -2124,13 +2129,12 @@ static int __init cmdline_parse_stack_guard_gap(char *p)
+ __setup("stack_guard_gap=", cmdline_parse_stack_guard_gap);
+
+ #ifdef CONFIG_STACK_GROWSUP
+-int expand_stack(struct vm_area_struct *vma, unsigned long address)
++int expand_stack_locked(struct vm_area_struct *vma, unsigned long address)
+ {
+ return expand_upwards(vma, address);
+ }
+
+-struct vm_area_struct *
+-find_extend_vma(struct mm_struct *mm, unsigned long addr)
++struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm, unsigned long addr)
+ {
+ struct vm_area_struct *vma, *prev;
+
+@@ -2138,20 +2142,23 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
+ vma = find_vma_prev(mm, addr, &prev);
+ if (vma && (vma->vm_start <= addr))
+ return vma;
+- if (!prev || expand_stack(prev, addr))
++ if (!prev)
++ return NULL;
++ if (expand_stack_locked(prev, addr))
+ return NULL;
+ if (prev->vm_flags & VM_LOCKED)
+ populate_vma_page_range(prev, addr, prev->vm_end, NULL);
+ return prev;
+ }
+ #else
+-int expand_stack(struct vm_area_struct *vma, unsigned long address)
++int expand_stack_locked(struct vm_area_struct *vma, unsigned long address)
+ {
++ if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
++ return -EINVAL;
+ return expand_downwards(vma, address);
+ }
+
+-struct vm_area_struct *
+-find_extend_vma(struct mm_struct *mm, unsigned long addr)
++struct vm_area_struct *find_extend_vma_locked(struct mm_struct *mm, unsigned long addr)
+ {
+ struct vm_area_struct *vma;
+ unsigned long start;
+@@ -2162,10 +2169,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
+ return NULL;
+ if (vma->vm_start <= addr)
+ return vma;
+- if (!(vma->vm_flags & VM_GROWSDOWN))
+- return NULL;
+ start = vma->vm_start;
+- if (expand_stack(vma, addr))
++ if (expand_stack_locked(vma, addr))
+ return NULL;
+ if (vma->vm_flags & VM_LOCKED)
+ populate_vma_page_range(vma, addr, start, NULL);
+@@ -2173,7 +2178,91 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
+ }
+ #endif
+
+-EXPORT_SYMBOL_GPL(find_extend_vma);
++/*
++ * IA64 has some horrid mapping rules: it can expand both up and down,
++ * but with various special rules.
++ *
++ * We'll get rid of this architecture eventually, so the ugliness is
++ * temporary.
++ */
++#ifdef CONFIG_IA64
++static inline bool vma_expand_ok(struct vm_area_struct *vma, unsigned long addr)
++{
++ return REGION_NUMBER(addr) == REGION_NUMBER(vma->vm_start) &&
++ REGION_OFFSET(addr) < RGN_MAP_LIMIT;
++}
++
++/*
++ * IA64 stacks grow down, but there's a special register backing store
++ * that can grow up. Only sequentially, though, so the new address must
++ * match vm_end.
++ */
++static inline int vma_expand_up(struct vm_area_struct *vma, unsigned long addr)
++{
++ if (!vma_expand_ok(vma, addr))
++ return -EFAULT;
++ if (vma->vm_end != (addr & PAGE_MASK))
++ return -EFAULT;
++ return expand_upwards(vma, addr);
++}
++
++static inline bool vma_expand_down(struct vm_area_struct *vma, unsigned long addr)
++{
++ if (!vma_expand_ok(vma, addr))
++ return -EFAULT;
++ return expand_downwards(vma, addr);
++}
++
++#elif defined(CONFIG_STACK_GROWSUP)
++
++#define vma_expand_up(vma,addr) expand_upwards(vma, addr)
++#define vma_expand_down(vma, addr) (-EFAULT)
++
++#else
++
++#define vma_expand_up(vma,addr) (-EFAULT)
++#define vma_expand_down(vma, addr) expand_downwards(vma, addr)
++
++#endif
++
++/*
++ * expand_stack(): legacy interface for page faulting. Don't use unless
++ * you have to.
++ *
++ * This is called with the mm locked for reading, drops the lock, takes
++ * the lock for writing, tries to look up a vma again, expands it if
++ * necessary, and downgrades the lock to reading again.
++ *
++ * If no vma is found or it can't be expanded, it returns NULL and has
++ * dropped the lock.
++ */
++struct vm_area_struct *expand_stack(struct mm_struct *mm, unsigned long addr)
++{
++ struct vm_area_struct *vma, *prev;
++
++ mmap_read_unlock(mm);
++ if (mmap_write_lock_killable(mm))
++ return NULL;
++
++ vma = find_vma_prev(mm, addr, &prev);
++ if (vma && vma->vm_start <= addr)
++ goto success;
++
++ if (prev && !vma_expand_up(prev, addr)) {
++ vma = prev;
++ goto success;
++ }
++
++ if (vma && !vma_expand_down(vma, addr))
++ goto success;
++
++ mmap_write_unlock(mm);
++ return NULL;
++
++success:
++ mmap_write_downgrade(mm);
++ return vma;
++}
+
+ /*
+ * Ok - we have the memory areas we should free on a maple tree so release them,
+diff --git a/mm/nommu.c b/mm/nommu.c
+index f670d9979a261..fdc392735ec6d 100644
+--- a/mm/nommu.c
++++ b/mm/nommu.c
+@@ -631,23 +631,31 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
+ EXPORT_SYMBOL(find_vma);
+
+ /*
+- * find a VMA
+- * - we don't extend stack VMAs under NOMMU conditions
++ * At least xtensa ends up having protection faults even with no
++ * MMU.. No stack expansion, at least.
+ */
+-struct vm_area_struct *find_extend_vma(struct mm_struct *mm, unsigned long addr)
++struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
++ unsigned long addr, struct pt_regs *regs)
+ {
+- return find_vma(mm, addr);
++ mmap_read_lock(mm);
++ return vma_lookup(mm, addr);
+ }
+
+ /*
+ * expand a stack to a given address
+ * - not supported under NOMMU conditions
+ */
+-int expand_stack(struct vm_area_struct *vma, unsigned long address)
++int expand_stack_locked(struct vm_area_struct *vma, unsigned long addr)
+ {
+ return -ENOMEM;
+ }
+
++struct vm_area_struct *expand_stack(struct mm_struct *mm, unsigned long addr)
++{
++ mmap_read_unlock(mm);
++ return NULL;
++}
++
+ /*
+ * look up the first VMA exactly that exactly matches addr
+ * - should be called with mm->mmap_lock at least held readlocked
+diff --git a/net/can/isotp.c b/net/can/isotp.c
+index 84f9aba029017..ca9d728d6d727 100644
+--- a/net/can/isotp.c
++++ b/net/can/isotp.c
+@@ -1112,8 +1112,9 @@ wait_free_buffer:
+ if (err)
+ goto err_event_drop;
+
+- if (sk->sk_err)
+- return -sk->sk_err;
++ err = sock_error(sk);
++ if (err)
++ return err;
+ }
+
+ return size;
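The lock_mm_and_find_vma() / expand_stack() rework quoted above follows one shape throughout: look the faulting address up under the read side of the mmap lock, fall back to the write lock only when the stack actually has to grow, re-check the lookup, then drop back to reading. Below is a rough userspace model of that retry pattern using POSIX rwlocks; the function names, the fake VMA bounds and the page mask are invented for the sketch, and because pthreads cannot downgrade a write lock atomically the model re-takes the read lock where the kernel would call mmap_write_downgrade().

  #include <pthread.h>
  #include <stdbool.h>
  #include <stdio.h>

  static pthread_rwlock_t maplock = PTHREAD_RWLOCK_INITIALIZER;
  static unsigned long stack_start = 0x7000;  /* pretend stack VMA bounds */
  static unsigned long stack_end   = 0x8000;

  static bool covered(unsigned long addr)
  {
          return addr >= stack_start && addr < stack_end;
  }

  /* Look up addr under the read lock; grow the region under the write
   * lock only when the fast path misses, then go back to reading.
   * Returns with the read lock held on both success paths. */
  static bool lookup_and_maybe_grow(unsigned long addr)
  {
          pthread_rwlock_rdlock(&maplock);
          if (covered(addr))
                  return true;                    /* fast path */
          pthread_rwlock_unlock(&maplock);

          pthread_rwlock_wrlock(&maplock);
          if (!covered(addr))                     /* re-check after re-locking */
                  stack_start = addr & ~0xfffUL;  /* "expand" downwards */
          pthread_rwlock_unlock(&maplock);        /* no atomic downgrade here, */
          pthread_rwlock_rdlock(&maplock);        /* so re-take the read lock  */
          return covered(addr);
  }

  int main(void)
  {
          printf("0x6800 covered: %d\n", lookup_and_maybe_grow(0x6800));
          pthread_rwlock_unlock(&maplock);
          return 0;
  }

The kernel version additionally consults the exception tables before taking the lock at all (get_mmap_lock_carefully() in the hunk above), which the model leaves out.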
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-03 16:59 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-03 16:59 UTC (permalink / raw
To: gentoo-commits
commit: 330a960e8ef0c8c0f12c0fc4d668e36aa8e64600
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Mon Jul 3 16:59:11 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Mon Jul 3 16:59:11 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=330a960e
wireguard: queueing: use saner cpu selection wrapping
Bug: https://bugs.gentoo.org/909066
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
2400_wireguard-queueing-cpu-sel-wrapping-fix.patch | 116 +++++++++++++++++++++
2 files changed, 120 insertions(+)
diff --git a/0000_README b/0000_README
index 22f83174..bda29555 100644
--- a/0000_README
+++ b/0000_README
@@ -63,6 +63,10 @@ Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
+Patch: 2400_wireguard-queueing-cpu-sel-wrapping-fix.patch
+From: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=7387943fa35516f6f8017a3b0e9ce48a3bef9faa
+Desc: wireguard: queueing: use saner cpu selection wrapping
+
Patch: 2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch
From: https://bugs.gentoo.org/710790
Desc: tmp513 requies REGMAP_I2C to build. Select it by default in Kconfig. See bug #710790. Thanks to Phil Stracchino
diff --git a/2400_wireguard-queueing-cpu-sel-wrapping-fix.patch b/2400_wireguard-queueing-cpu-sel-wrapping-fix.patch
new file mode 100644
index 00000000..fa199039
--- /dev/null
+++ b/2400_wireguard-queueing-cpu-sel-wrapping-fix.patch
@@ -0,0 +1,116 @@
+From 7387943fa35516f6f8017a3b0e9ce48a3bef9faa Mon Sep 17 00:00:00 2001
+From: "Jason A. Donenfeld" <Jason@zx2c4.com>
+Date: Mon, 3 Jul 2023 03:27:04 +0200
+Subject: wireguard: queueing: use saner cpu selection wrapping
+
+Using `% nr_cpumask_bits` is slow and complicated, and not totally
+robust toward dynamic changes to CPU topologies. Rather than storing the
+next CPU in the round-robin, just store the last one, and also return
+that value. This simplifies the loop drastically into a much more common
+pattern.
+
+Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
+Cc: stable@vger.kernel.org
+Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
+Tested-by: Manuel Leiner <manuel.leiner@gmx.de>
+Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+---
+ drivers/net/wireguard/queueing.c | 1 +
+ drivers/net/wireguard/queueing.h | 25 +++++++++++--------------
+ drivers/net/wireguard/receive.c | 2 +-
+ drivers/net/wireguard/send.c | 2 +-
+ 4 files changed, 14 insertions(+), 16 deletions(-)
+
+diff --git a/drivers/net/wireguard/queueing.c b/drivers/net/wireguard/queueing.c
+index 8084e7408c0ae..26d235d152352 100644
+--- a/drivers/net/wireguard/queueing.c
++++ b/drivers/net/wireguard/queueing.c
+@@ -28,6 +28,7 @@ int wg_packet_queue_init(struct crypt_queue *queue, work_func_t function,
+ int ret;
+
+ memset(queue, 0, sizeof(*queue));
++ queue->last_cpu = -1;
+ ret = ptr_ring_init(&queue->ring, len, GFP_KERNEL);
+ if (ret)
+ return ret;
+diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
+index 125284b346a77..1ea4f874e367e 100644
+--- a/drivers/net/wireguard/queueing.h
++++ b/drivers/net/wireguard/queueing.h
+@@ -117,20 +117,17 @@ static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
+ return cpu;
+ }
+
+-/* This function is racy, in the sense that next is unlocked, so it could return
+- * the same CPU twice. A race-free version of this would be to instead store an
+- * atomic sequence number, do an increment-and-return, and then iterate through
+- * every possible CPU until we get to that index -- choose_cpu. However that's
+- * a bit slower, and it doesn't seem like this potential race actually
+- * introduces any performance loss, so we live with it.
++/* This function is racy, in the sense that it's called while last_cpu is
++ * unlocked, so it could return the same CPU twice. Adding locking or using
++ * atomic sequence numbers is slower though, and the consequences of racing are
++ * harmless, so live with it.
+ */
+-static inline int wg_cpumask_next_online(int *next)
++static inline int wg_cpumask_next_online(int *last_cpu)
+ {
+- int cpu = *next;
+-
+- while (unlikely(!cpumask_test_cpu(cpu, cpu_online_mask)))
+- cpu = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
+- *next = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
++ int cpu = cpumask_next(*last_cpu, cpu_online_mask);
++ if (cpu >= nr_cpu_ids)
++ cpu = cpumask_first(cpu_online_mask);
++ *last_cpu = cpu;
+ return cpu;
+ }
+
+@@ -159,7 +156,7 @@ static inline void wg_prev_queue_drop_peeked(struct prev_queue *queue)
+
+ static inline int wg_queue_enqueue_per_device_and_peer(
+ struct crypt_queue *device_queue, struct prev_queue *peer_queue,
+- struct sk_buff *skb, struct workqueue_struct *wq, int *next_cpu)
++ struct sk_buff *skb, struct workqueue_struct *wq)
+ {
+ int cpu;
+
+@@ -173,7 +170,7 @@ static inline int wg_queue_enqueue_per_device_and_peer(
+ /* Then we queue it up in the device queue, which consumes the
+ * packet as soon as it can.
+ */
+- cpu = wg_cpumask_next_online(next_cpu);
++ cpu = wg_cpumask_next_online(&device_queue->last_cpu);
+ if (unlikely(ptr_ring_produce_bh(&device_queue->ring, skb)))
+ return -EPIPE;
+ queue_work_on(cpu, wq, &per_cpu_ptr(device_queue->worker, cpu)->work);
+diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
+index 7135d51d2d872..0b3f0c8435509 100644
+--- a/drivers/net/wireguard/receive.c
++++ b/drivers/net/wireguard/receive.c
+@@ -524,7 +524,7 @@ static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)
+ goto err;
+
+ ret = wg_queue_enqueue_per_device_and_peer(&wg->decrypt_queue, &peer->rx_queue, skb,
+- wg->packet_crypt_wq, &wg->decrypt_queue.last_cpu);
++ wg->packet_crypt_wq);
+ if (unlikely(ret == -EPIPE))
+ wg_queue_enqueue_per_peer_rx(skb, PACKET_STATE_DEAD);
+ if (likely(!ret || ret == -EPIPE)) {
+diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c
+index 5368f7c35b4bf..95c853b59e1da 100644
+--- a/drivers/net/wireguard/send.c
++++ b/drivers/net/wireguard/send.c
+@@ -318,7 +318,7 @@ static void wg_packet_create_data(struct wg_peer *peer, struct sk_buff *first)
+ goto err;
+
+ ret = wg_queue_enqueue_per_device_and_peer(&wg->encrypt_queue, &peer->tx_queue, first,
+- wg->packet_crypt_wq, &wg->encrypt_queue.last_cpu);
++ wg->packet_crypt_wq);
+ if (unlikely(ret == -EPIPE))
+ wg_queue_enqueue_per_peer_tx(first, PACKET_STATE_DEAD);
+ err:
+--
+cgit
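The rewritten wg_cpumask_next_online() above reduces the round robin to: advance from the last id used, and wrap to the first online id when the walk runs off the end. The same pattern in a self-contained C sketch, with a fixed boolean array standing in for cpu_online_mask; the array contents, NR_IDS and the helper names are made up for the illustration.

  #include <stdbool.h>
  #include <stdio.h>

  /* Stand-in for cpu_online_mask: true means "online". */
  static const bool online[] = { true, false, true, true, false, false, true, false };
  #define NR_IDS 8

  /* First online id strictly after prev, or NR_IDS if there is none,
   * mirroring how cpumask_next() can run past the last set bit. */
  static int next_online(int prev)
  {
          for (int id = prev + 1; id < NR_IDS; id++)
                  if (online[id])
                          return id;
          return NR_IDS;
  }

  /* Round-robin pick: advance from *last and wrap to the first online id. */
  static int pick_next_online(int *last)
  {
          int id = next_online(*last);

          if (id >= NR_IDS)
                  id = next_online(-1);   /* wrap, like cpumask_first() */
          *last = id;
          return id;
  }

  int main(void)
  {
          int last = -1;                  /* matches queue->last_cpu = -1 */

          for (int i = 0; i < 8; i++)
                  printf("%d ", pick_next_online(&last));
          printf("\n");                   /* prints: 0 2 3 6 0 2 3 6 */
          return 0;
  }

As in the driver, two racing callers could occasionally pick the same id because *last is updated without locking; the commit message above treats that race as harmless.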
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-04 12:57 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-04 12:57 UTC (permalink / raw
To: gentoo-commits
commit: 89834e8979d29abb5525573e5fb3456092f77ef8
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue Jul 4 12:56:35 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue Jul 4 12:56:35 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=89834e89
mm: disable CONFIG_PER_VMA_LOCK by default until its fixed
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +++
1805_mm-disable-CONFIG-PER-VMA-LOCK-by-def.patch | 35 ++++++++++++++++++++++++
2 files changed, 39 insertions(+)
diff --git a/0000_README b/0000_README
index 38ee98e3..a09a44a9 100644
--- a/0000_README
+++ b/0000_README
@@ -63,6 +63,10 @@ Patch: 1800_mm-execve-mark-stack-as-growing-down.patch
From: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Desc: execve: always mark stack as growing down during early stack setup
+Patch: 1805_mm-disable-CONFIG-PER-VMA-LOCK-by-def.patch
+From: https://lore.kernel.org/all/20230703182150.2193578-1-surenb@google.com/
+Desc: mm: disable CONFIG_PER_VMA_LOCK by default until its fixed
+
Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
diff --git a/1805_mm-disable-CONFIG-PER-VMA-LOCK-by-def.patch b/1805_mm-disable-CONFIG-PER-VMA-LOCK-by-def.patch
new file mode 100644
index 00000000..c98255a6
--- /dev/null
+++ b/1805_mm-disable-CONFIG-PER-VMA-LOCK-by-def.patch
@@ -0,0 +1,35 @@
+Subject: [PATCH 1/1] mm: disable CONFIG_PER_VMA_LOCK by default until its fixed
+Date: Mon, 3 Jul 2023 11:21:50 -0700 [thread overview]
+Message-ID: <20230703182150.2193578-1-surenb@google.com> (raw)
+
+A memory corruption was reported in [1] with bisection pointing to the
+patch [2] enabling per-VMA locks for x86.
+Disable per-VMA locks config to prevent this issue while the problem is
+being investigated. This is expected to be a temporary measure.
+
+[1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
+[2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
+
+Reported-by: Jiri Slaby <jirislaby@kernel.org>
+Reported-by: Jacob Young <jacobly.alt@gmail.com>
+Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
+Signed-off-by: Suren Baghdasaryan <surenb@google.com>
+---
+ mm/Kconfig | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/mm/Kconfig b/mm/Kconfig
+index 09130434e30d..de94b2497600 100644
+--- a/mm/Kconfig
++++ b/mm/Kconfig
+@@ -1224,7 +1224,7 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
+ def_bool n
+
+ config PER_VMA_LOCK
+- def_bool y
++ bool "Enable per-vma locking during page fault handling."
+ depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
+ help
+ Allow per-vma locking during page fault handling.
+--
+2.41.0.255.g8b1d071c50-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-04 12:57 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-04 12:57 UTC (permalink / raw
To: gentoo-commits
commit: cb463ff8b2189881a8ec5d3b2aecc283383b6227
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue Jul 4 12:46:06 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue Jul 4 12:46:06 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=cb463ff8
execve: always mark stack as growing down during early stack setup
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 ++
1800_mm-execve-mark-stack-as-growing-down.patch | 82 +++++++++++++++++++++++++
2 files changed, 86 insertions(+)
diff --git a/0000_README b/0000_README
index bda29555..38ee98e3 100644
--- a/0000_README
+++ b/0000_README
@@ -59,6 +59,10 @@ Patch: 1700_sparc-address-warray-bound-warnings.patch
From: https://github.com/KSPP/linux/issues/109
Desc: Address -Warray-bounds warnings
+Patch: 1800_mm-execve-mark-stack-as-growing-down.patch
+From: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
+Desc: execve: always mark stack as growing down during early stack setup
+
Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
diff --git a/1800_mm-execve-mark-stack-as-growing-down.patch b/1800_mm-execve-mark-stack-as-growing-down.patch
new file mode 100644
index 00000000..07daf228
--- /dev/null
+++ b/1800_mm-execve-mark-stack-as-growing-down.patch
@@ -0,0 +1,82 @@
+From d7a7655b29081c053b1abe71f64a2928638dafc6 Mon Sep 17 00:00:00 2001
+From: Linus Torvalds <torvalds@linux-foundation.org>
+Date: Sun, 2 Jul 2023 23:20:17 -0700
+Subject: execve: always mark stack as growing down during early stack setup
+
+commit f66066bc5136f25e36a2daff4896c768f18c211e upstream.
+
+While our user stacks can grow either down (all common architectures) or
+up (parisc and the ia64 register stack), the initial stack setup when we
+copy the argument and environment strings to the new stack at execve()
+time is always done by extending the stack downwards.
+
+But it turns out that in commit 8d7071af8907 ("mm: always expand the
+stack with the mmap write lock held"), as part of making the stack
+growing code more robust, 'expand_downwards()' was now made to actually
+check the vma flags:
+
+ if (!(vma->vm_flags & VM_GROWSDOWN))
+ return -EFAULT;
+
+and that meant that this execve-time stack expansion started failing on
+parisc, because on that architecture, the stack flags do not contain the
+VM_GROWSDOWN bit.
+
+At the same time the new check in expand_downwards() is clearly correct,
+and simplified the callers, so let's not remove it.
+
+The solution is instead to just codify the fact that yes, during
+execve(), the stack grows down. This not only matches reality, it ends
+up being particularly simple: we already have special execve-time flags
+for the stack (VM_STACK_INCOMPLETE_SETUP) and use those flags to avoid
+page migration during this setup time (see vma_is_temporary_stack() and
+invalid_migration_vma()).
+
+So just add VM_GROWSDOWN to that set of temporary flags, and now our
+stack flags automatically match reality, and the parisc stack expansion
+works again.
+
+Note that the VM_STACK_INCOMPLETE_SETUP bits will be cleared when the
+stack is finalized, so we only add the extra VM_GROWSDOWN bit on
+CONFIG_STACK_GROWSUP architectures (ie parisc) rather than adding it in
+general.
+
+Link: https://lore.kernel.org/all/612eaa53-6904-6e16-67fc-394f4faa0e16@bell.net/
+Link: https://lore.kernel.org/all/5fd98a09-4792-1433-752d-029ae3545168@gmx.de/
+Fixes: 8d7071af8907 ("mm: always expand the stack with the mmap write lock held")
+Reported-by: John David Anglin <dave.anglin@bell.net>
+Reported-and-tested-by: Helge Deller <deller@gmx.de>
+Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
+Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
+Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
+---
+ include/linux/mm.h | 4 +++-
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index 6cbcc55a80b02..9e10485f37e7f 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -377,7 +377,7 @@ extern unsigned int kobjsize(const void *objp);
+ #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
+
+ /* Bits set in the VMA until the stack is in its final location */
+-#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ)
++#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY)
+
+ #define TASK_EXEC ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0)
+
+@@ -399,8 +399,10 @@ extern unsigned int kobjsize(const void *objp);
+
+ #ifdef CONFIG_STACK_GROWSUP
+ #define VM_STACK VM_GROWSUP
++#define VM_STACK_EARLY VM_GROWSDOWN
+ #else
+ #define VM_STACK VM_GROWSDOWN
++#define VM_STACK_EARLY 0
+ #endif
+
+ #define VM_STACK_FLAGS (VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT)
+--
+cgit
+
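The one-line mm.h change above works because VM_STACK_INCOMPLETE_SETUP now carries VM_GROWSDOWN during early execve() on CONFIG_STACK_GROWSUP architectures, so the expand_downwards() flag check quoted in the message keeps passing while the argument and environment strings are copied in. A standalone sketch of that flag arithmetic; the bit values are illustrative, not the kernel's real VM_* definitions.

  #include <stdio.h>

  #define VM_GROWSDOWN    0x0100UL
  #define VM_GROWSUP      0x0200UL
  #define VM_RAND_READ    0x0001UL
  #define VM_SEQ_READ     0x0002UL

  /* 1 models a CONFIG_STACK_GROWSUP architecture such as parisc. */
  #define STACK_GROWSUP 1

  #if STACK_GROWSUP
  #define VM_STACK        VM_GROWSUP
  #define VM_STACK_EARLY  VM_GROWSDOWN
  #else
  #define VM_STACK        VM_GROWSDOWN
  #define VM_STACK_EARLY  0UL
  #endif

  #define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY)

  /* The check that expand_downwards() now enforces. */
  static int may_grow_down(unsigned long vm_flags)
  {
          return (vm_flags & VM_GROWSDOWN) != 0;
  }

  int main(void)
  {
          unsigned long execve_stack = VM_STACK | VM_STACK_INCOMPLETE_SETUP;

          printf("early execve stack may grow down: %d\n",
                 may_grow_down(execve_stack));
          return 0;
  }

With STACK_GROWSUP set to 0 the sketch models a normal architecture, where VM_STACK already contains VM_GROWSDOWN and VM_STACK_EARLY stays empty; either way the early stack passes the check, which is the point of the fix.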
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-05 20:26 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-05 20:26 UTC (permalink / raw
To: gentoo-commits
commit: 0b0b12cabf9c32c90ee268fd7db419a634e26f61
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Jul 5 20:26:23 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Jul 5 20:26:23 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=0b0b12ca
Linux patch 6.4.2
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1001_linux-6.4.2.patch | 554 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 558 insertions(+)
diff --git a/0000_README b/0000_README
index a09a44a9..f9224452 100644
--- a/0000_README
+++ b/0000_README
@@ -47,6 +47,10 @@ Patch: 1000_linux-6.4.1.patch
From: https://www.kernel.org
Desc: Linux 6.4.1
+Patch: 1001_linux-6.4.2.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.2
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1001_linux-6.4.2.patch b/1001_linux-6.4.2.patch
new file mode 100644
index 00000000..a72ad77b
--- /dev/null
+++ b/1001_linux-6.4.2.patch
@@ -0,0 +1,554 @@
+diff --git a/Documentation/process/changes.rst b/Documentation/process/changes.rst
+index ef540865ad22e..a9ef00509c9b1 100644
+--- a/Documentation/process/changes.rst
++++ b/Documentation/process/changes.rst
+@@ -60,6 +60,7 @@ openssl & libcrypto 1.0.0 openssl version
+ bc 1.06.95 bc --version
+ Sphinx\ [#f1]_ 1.7 sphinx-build --version
+ cpio any cpio --version
++gtags (optional) 6.6.5 gtags --version
+ ====================== =============== ========================================
+
+ .. [#f1] Sphinx is needed only to build the Kernel documentation
+@@ -174,6 +175,12 @@ You will need openssl to build kernels 3.7 and higher if module signing is
+ enabled. You will also need openssl development packages to build kernels 4.3
+ and higher.
+
++gtags / GNU GLOBAL (optional)
++-----------------------------
++
++The kernel build requires GNU GLOBAL version 6.6.5 or later to generate
++tag files through ``make gtags``. This is due to its use of the gtags
++``-C (--directory)`` flag.
+
+ System utilities
+ ****************
+diff --git a/Makefile b/Makefile
+index 9f6376cbafebe..bcac81556b569 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 1
++SUBLEVEL = 2
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
+index 8a169bdb4d534..df1386a60d521 100644
+--- a/arch/arm64/mm/fault.c
++++ b/arch/arm64/mm/fault.c
+@@ -522,9 +522,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
+ unsigned long vm_flags;
+ unsigned int mm_flags = FAULT_FLAG_DEFAULT;
+ unsigned long addr = untagged_addr(far);
+-#ifdef CONFIG_PER_VMA_LOCK
+ struct vm_area_struct *vma;
+-#endif
+
+ if (kprobe_page_fault(regs, esr))
+ return 0;
+diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
+index 67f4ab6daa34f..74962b18e3b21 100644
+--- a/drivers/cxl/core/pci.c
++++ b/drivers/cxl/core/pci.c
+@@ -308,36 +308,17 @@ static void disable_hdm(void *_cxlhdm)
+ hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+ }
+
+-int devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm)
++static int devm_cxl_enable_hdm(struct device *host, struct cxl_hdm *cxlhdm)
+ {
+- void __iomem *hdm;
++ void __iomem *hdm = cxlhdm->regs.hdm_decoder;
+ u32 global_ctrl;
+
+- /*
+- * If the hdm capability was not mapped there is nothing to enable and
+- * the caller is responsible for what happens next. For example,
+- * emulate a passthrough decoder.
+- */
+- if (IS_ERR(cxlhdm))
+- return 0;
+-
+- hdm = cxlhdm->regs.hdm_decoder;
+ global_ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+-
+- /*
+- * If the HDM decoder capability was enabled on entry, skip
+- * registering disable_hdm() since this decode capability may be
+- * owned by platform firmware.
+- */
+- if (global_ctrl & CXL_HDM_DECODER_ENABLE)
+- return 0;
+-
+ writel(global_ctrl | CXL_HDM_DECODER_ENABLE,
+ hdm + CXL_HDM_DECODER_CTRL_OFFSET);
+
+- return devm_add_action_or_reset(&port->dev, disable_hdm, cxlhdm);
++ return devm_add_action_or_reset(host, disable_hdm, cxlhdm);
+ }
+-EXPORT_SYMBOL_NS_GPL(devm_cxl_enable_hdm, CXL);
+
+ int cxl_dvsec_rr_decode(struct device *dev, int d,
+ struct cxl_endpoint_dvsec_info *info)
+@@ -511,7 +492,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
+ if (info->mem_enabled)
+ return 0;
+
+- rc = devm_cxl_enable_hdm(port, cxlhdm);
++ rc = devm_cxl_enable_hdm(&port->dev, cxlhdm);
+ if (rc)
+ return rc;
+
+diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
+index f93a285389621..044a92d9813e2 100644
+--- a/drivers/cxl/cxl.h
++++ b/drivers/cxl/cxl.h
+@@ -710,7 +710,6 @@ struct cxl_endpoint_dvsec_info {
+ struct cxl_hdm;
+ struct cxl_hdm *devm_cxl_setup_hdm(struct cxl_port *port,
+ struct cxl_endpoint_dvsec_info *info);
+-int devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm);
+ int devm_cxl_enumerate_decoders(struct cxl_hdm *cxlhdm,
+ struct cxl_endpoint_dvsec_info *info);
+ int devm_cxl_add_passthrough_decoder(struct cxl_port *port);
+diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c
+index c23b6164e1c0f..07c5ac598da1c 100644
+--- a/drivers/cxl/port.c
++++ b/drivers/cxl/port.c
+@@ -60,17 +60,13 @@ static int discover_region(struct device *dev, void *root)
+ static int cxl_switch_port_probe(struct cxl_port *port)
+ {
+ struct cxl_hdm *cxlhdm;
+- int rc, nr_dports;
+-
+- nr_dports = devm_cxl_port_enumerate_dports(port);
+- if (nr_dports < 0)
+- return nr_dports;
++ int rc;
+
+- cxlhdm = devm_cxl_setup_hdm(port, NULL);
+- rc = devm_cxl_enable_hdm(port, cxlhdm);
+- if (rc)
++ rc = devm_cxl_port_enumerate_dports(port);
++ if (rc < 0)
+ return rc;
+
++ cxlhdm = devm_cxl_setup_hdm(port, NULL);
+ if (!IS_ERR(cxlhdm))
+ return devm_cxl_enumerate_decoders(cxlhdm, NULL);
+
+@@ -79,7 +75,7 @@ static int cxl_switch_port_probe(struct cxl_port *port)
+ return PTR_ERR(cxlhdm);
+ }
+
+- if (nr_dports == 1) {
++ if (rc == 1) {
+ dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
+ return devm_cxl_add_passthrough_decoder(port);
+ }
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+index 3c0310576b3bf..5b3a70becbdf4 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+@@ -2368,6 +2368,10 @@ int amdgpu_vm_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+ struct amdgpu_fpriv *fpriv = filp->driver_priv;
+ int r;
+
++ /* No valid flags defined yet */
++ if (args->in.flags)
++ return -EINVAL;
++
+ switch (args->in.op) {
+ case AMDGPU_VM_OP_RESERVE_VMID:
+ /* We only have requirement to reserve vmid from gfxhub */
+diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c
+index 7d5c9c582ed2d..0d2fa7f86a544 100644
+--- a/drivers/md/dm-ioctl.c
++++ b/drivers/md/dm-ioctl.c
+@@ -1830,30 +1830,36 @@ static ioctl_fn lookup_ioctl(unsigned int cmd, int *ioctl_flags)
+ * As well as checking the version compatibility this always
+ * copies the kernel interface version out.
+ */
+-static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
++static int check_version(unsigned int cmd, struct dm_ioctl __user *user,
++ struct dm_ioctl *kernel_params)
+ {
+- uint32_t version[3];
+ int r = 0;
+
+- if (copy_from_user(version, user->version, sizeof(version)))
++ /* Make certain version is first member of dm_ioctl struct */
++ BUILD_BUG_ON(offsetof(struct dm_ioctl, version) != 0);
++
++ if (copy_from_user(kernel_params->version, user->version, sizeof(kernel_params->version)))
+ return -EFAULT;
+
+- if ((version[0] != DM_VERSION_MAJOR) ||
+- (version[1] > DM_VERSION_MINOR)) {
++ if ((kernel_params->version[0] != DM_VERSION_MAJOR) ||
++ (kernel_params->version[1] > DM_VERSION_MINOR)) {
+ DMERR("ioctl interface mismatch: kernel(%u.%u.%u), user(%u.%u.%u), cmd(%d)",
+ DM_VERSION_MAJOR, DM_VERSION_MINOR,
+ DM_VERSION_PATCHLEVEL,
+- version[0], version[1], version[2], cmd);
++ kernel_params->version[0],
++ kernel_params->version[1],
++ kernel_params->version[2],
++ cmd);
+ r = -EINVAL;
+ }
+
+ /*
+ * Fill in the kernel version.
+ */
+- version[0] = DM_VERSION_MAJOR;
+- version[1] = DM_VERSION_MINOR;
+- version[2] = DM_VERSION_PATCHLEVEL;
+- if (copy_to_user(user->version, version, sizeof(version)))
++ kernel_params->version[0] = DM_VERSION_MAJOR;
++ kernel_params->version[1] = DM_VERSION_MINOR;
++ kernel_params->version[2] = DM_VERSION_PATCHLEVEL;
++ if (copy_to_user(user->version, kernel_params->version, sizeof(kernel_params->version)))
+ return -EFAULT;
+
+ return r;
+@@ -1879,7 +1885,10 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
+ const size_t minimum_data_size = offsetof(struct dm_ioctl, data);
+ unsigned int noio_flag;
+
+- if (copy_from_user(param_kernel, user, minimum_data_size))
++ /* check_version() already copied version from userspace, avoid TOCTOU */
++ if (copy_from_user((char *)param_kernel + sizeof(param_kernel->version),
++ (char __user *)user + sizeof(param_kernel->version),
++ minimum_data_size - sizeof(param_kernel->version)))
+ return -EFAULT;
+
+ if (param_kernel->data_size < minimum_data_size) {
+@@ -1991,7 +2000,7 @@ static int ctl_ioctl(struct file *file, uint command, struct dm_ioctl __user *us
+ * Check the interface version passed in. This also
+ * writes out the kernel's interface version.
+ */
+- r = check_version(cmd, user);
++	r = check_version(cmd, user, &param_kernel);
+ if (r)
+ return r;
+
+diff --git a/drivers/nubus/proc.c b/drivers/nubus/proc.c
+index 1fd667852271f..cd4bd06cf3094 100644
+--- a/drivers/nubus/proc.c
++++ b/drivers/nubus/proc.c
+@@ -137,6 +137,18 @@ static int nubus_proc_rsrc_show(struct seq_file *m, void *v)
+ return 0;
+ }
+
++static int nubus_rsrc_proc_open(struct inode *inode, struct file *file)
++{
++ return single_open(file, nubus_proc_rsrc_show, inode);
++}
++
++static const struct proc_ops nubus_rsrc_proc_ops = {
++ .proc_open = nubus_rsrc_proc_open,
++ .proc_read = seq_read,
++ .proc_lseek = seq_lseek,
++ .proc_release = single_release,
++};
++
+ void nubus_proc_add_rsrc_mem(struct proc_dir_entry *procdir,
+ const struct nubus_dirent *ent,
+ unsigned int size)
+@@ -152,8 +164,8 @@ void nubus_proc_add_rsrc_mem(struct proc_dir_entry *procdir,
+ pded = nubus_proc_alloc_pde_data(nubus_dirptr(ent), size);
+ else
+ pded = NULL;
+- proc_create_single_data(name, S_IFREG | 0444, procdir,
+- nubus_proc_rsrc_show, pded);
++ proc_create_data(name, S_IFREG | 0444, procdir,
++ &nubus_rsrc_proc_ops, pded);
+ }
+
+ void nubus_proc_add_rsrc(struct proc_dir_entry *procdir,
+@@ -166,9 +178,9 @@ void nubus_proc_add_rsrc(struct proc_dir_entry *procdir,
+ return;
+
+ snprintf(name, sizeof(name), "%x", ent->type);
+- proc_create_single_data(name, S_IFREG | 0444, procdir,
+- nubus_proc_rsrc_show,
+- nubus_proc_alloc_pde_data(data, 0));
++ proc_create_data(name, S_IFREG | 0444, procdir,
++ &nubus_rsrc_proc_ops,
++ nubus_proc_alloc_pde_data(data, 0));
+ }
+
+ /*
+diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
+index 052a611081ecd..a05350a4e49cb 100644
+--- a/drivers/pci/pci-acpi.c
++++ b/drivers/pci/pci-acpi.c
+@@ -1043,6 +1043,16 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
+ return false;
+ }
+
++static void acpi_pci_config_space_access(struct pci_dev *dev, bool enable)
++{
++ int val = enable ? ACPI_REG_CONNECT : ACPI_REG_DISCONNECT;
++ int ret = acpi_evaluate_reg(ACPI_HANDLE(&dev->dev),
++ ACPI_ADR_SPACE_PCI_CONFIG, val);
++ if (ret)
++ pci_dbg(dev, "ACPI _REG %s evaluation failed (%d)\n",
++ enable ? "connect" : "disconnect", ret);
++}
++
+ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+ {
+ struct acpi_device *adev = ACPI_COMPANION(&dev->dev);
+@@ -1053,32 +1063,49 @@ int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
+ [PCI_D3hot] = ACPI_STATE_D3_HOT,
+ [PCI_D3cold] = ACPI_STATE_D3_COLD,
+ };
+- int error = -EINVAL;
++ int error;
+
+ /* If the ACPI device has _EJ0, ignore the device */
+ if (!adev || acpi_has_method(adev->handle, "_EJ0"))
+ return -ENODEV;
+
+ switch (state) {
+- case PCI_D3cold:
+- if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
+- PM_QOS_FLAGS_ALL) {
+- error = -EBUSY;
+- break;
+- }
+- fallthrough;
+ case PCI_D0:
+ case PCI_D1:
+ case PCI_D2:
+ case PCI_D3hot:
+- error = acpi_device_set_power(adev, state_conv[state]);
++ case PCI_D3cold:
++ break;
++ default:
++ return -EINVAL;
++ }
++
++ if (state == PCI_D3cold) {
++ if (dev_pm_qos_flags(&dev->dev, PM_QOS_FLAG_NO_POWER_OFF) ==
++ PM_QOS_FLAGS_ALL)
++ return -EBUSY;
++
++ /* Notify AML lack of PCI config space availability */
++ acpi_pci_config_space_access(dev, false);
+ }
+
+- if (!error)
+- pci_dbg(dev, "power state changed by ACPI to %s\n",
+- acpi_power_state_string(adev->power.state));
++ error = acpi_device_set_power(adev, state_conv[state]);
++ if (error)
++ return error;
+
+- return error;
++ pci_dbg(dev, "power state changed by ACPI to %s\n",
++ acpi_power_state_string(adev->power.state));
++
++ /*
++ * Notify AML of PCI config space availability. Config space is
++ * accessible in all states except D3cold; the only transitions
++ * that change availability are transitions to D3cold and from
++ * D3cold to D0.
++ */
++ if (state == PCI_D0)
++ acpi_pci_config_space_access(dev, true);
++
++ return 0;
+ }
+
+ pci_power_t acpi_pci_get_power_state(struct pci_dev *dev)
+diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
+index ecfdfb2529a36..581201818ed8f 100644
+--- a/fs/hugetlbfs/inode.c
++++ b/fs/hugetlbfs/inode.c
+@@ -821,7 +821,6 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
+ */
+ struct folio *folio;
+ unsigned long addr;
+- bool present;
+
+ cond_resched();
+
+@@ -845,10 +844,9 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
+ mutex_lock(&hugetlb_fault_mutex_table[hash]);
+
+ /* See if already present in mapping to avoid alloc/free */
+- rcu_read_lock();
+- present = page_cache_next_miss(mapping, index, 1) != index;
+- rcu_read_unlock();
+- if (present) {
++ folio = filemap_get_folio(mapping, index);
++ if (!IS_ERR(folio)) {
++ folio_put(folio);
+ mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ hugetlb_drop_vma_policy(&pseudo_vma);
+ continue;
+diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
+index a910b9a638c5e..8172dd4135a1d 100644
+--- a/fs/nfs/inode.c
++++ b/fs/nfs/inode.c
+@@ -845,7 +845,7 @@ int nfs_getattr(struct mnt_idmap *idmap, const struct path *path,
+
+ request_mask &= STATX_TYPE | STATX_MODE | STATX_NLINK | STATX_UID |
+ STATX_GID | STATX_ATIME | STATX_MTIME | STATX_CTIME |
+- STATX_INO | STATX_SIZE | STATX_BLOCKS | STATX_BTIME |
++ STATX_INO | STATX_SIZE | STATX_BLOCKS |
+ STATX_CHANGE_COOKIE;
+
+ if ((query_flags & AT_STATX_DONT_SYNC) && !force_sync) {
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index 6cbcc55a80b02..9e10485f37e7f 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -377,7 +377,7 @@ extern unsigned int kobjsize(const void *objp);
+ #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
+
+ /* Bits set in the VMA until the stack is in its final location */
+-#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ)
++#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY)
+
+ #define TASK_EXEC ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0)
+
+@@ -399,8 +399,10 @@ extern unsigned int kobjsize(const void *objp);
+
+ #ifdef CONFIG_STACK_GROWSUP
+ #define VM_STACK VM_GROWSUP
++#define VM_STACK_EARLY VM_GROWSDOWN
+ #else
+ #define VM_STACK VM_GROWSDOWN
++#define VM_STACK_EARLY 0
+ #endif
+
+ #define VM_STACK_FLAGS (VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT)
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index f154019e6b840..f791076da157c 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -5731,13 +5731,13 @@ static bool hugetlbfs_pagecache_present(struct hstate *h,
+ {
+ struct address_space *mapping = vma->vm_file->f_mapping;
+ pgoff_t idx = vma_hugecache_offset(h, vma, address);
+- bool present;
+-
+- rcu_read_lock();
+- present = page_cache_next_miss(mapping, idx, 1) != idx;
+- rcu_read_unlock();
++ struct folio *folio;
+
+- return present;
++ folio = filemap_get_folio(mapping, idx);
++ if (IS_ERR(folio))
++ return false;
++ folio_put(folio);
++ return true;
+ }
+
+ int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
+diff --git a/mm/nommu.c b/mm/nommu.c
+index fdc392735ec6d..c072a660ec2cf 100644
+--- a/mm/nommu.c
++++ b/mm/nommu.c
+@@ -637,8 +637,13 @@ EXPORT_SYMBOL(find_vma);
+ struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
+ unsigned long addr, struct pt_regs *regs)
+ {
++ struct vm_area_struct *vma;
++
+ mmap_read_lock(mm);
+- return vma_lookup(mm, addr);
++ vma = vma_lookup(mm, addr);
++ if (!vma)
++ mmap_read_unlock(mm);
++ return vma;
+ }
+
+ /*
+diff --git a/scripts/tags.sh b/scripts/tags.sh
+index ea31640b26715..f6b3c7cd39c7c 100755
+--- a/scripts/tags.sh
++++ b/scripts/tags.sh
+@@ -32,6 +32,13 @@ else
+ tree=${srctree}/
+ fi
+
++# gtags(1) refuses to index any file outside of its current working dir.
++# If gtags indexing is requested and the build output directory is not
++# the kernel source tree, index all files in absolute-path form.
++if [[ "$1" == "gtags" && -n "${tree}" ]]; then
++ tree=$(realpath "$tree")/
++fi
++
+ # Detect if ALLSOURCE_ARCHS is set. If not, we assume SRCARCH
+ if [ "${ALLSOURCE_ARCHS}" = "" ]; then
+ ALLSOURCE_ARCHS=${SRCARCH}
+@@ -131,7 +138,7 @@ docscope()
+
+ dogtags()
+ {
+- all_target_sources | gtags -i -f -
++ all_target_sources | gtags -i -C "${tree:-.}" -f - "$PWD"
+ }
+
+ # Basic regular expressions with an optional /kind-spec/ for ctags and
+diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
+index f7f2a11d4c3b0..f52725f51fca6 100644
+--- a/tools/include/nolibc/arch-x86_64.h
++++ b/tools/include/nolibc/arch-x86_64.h
+@@ -190,7 +190,7 @@ const unsigned long *_auxv __attribute__((weak));
+ * 2) The deepest stack frame should be zero (the %rbp).
+ *
+ */
+-void __attribute__((weak,noreturn,optimize("omit-frame-pointer"))) _start(void)
++void __attribute__((weak,noreturn,optimize("omit-frame-pointer"),no_stack_protector)) _start(void)
+ {
+ __asm__ volatile (
+ #ifdef NOLIBC_STACKPROTECTOR
+diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
+index 6f9347ade82cd..fba7bec96acd1 100644
+--- a/tools/testing/cxl/Kbuild
++++ b/tools/testing/cxl/Kbuild
+@@ -6,7 +6,6 @@ ldflags-y += --wrap=acpi_pci_find_root
+ ldflags-y += --wrap=nvdimm_bus_register
+ ldflags-y += --wrap=devm_cxl_port_enumerate_dports
+ ldflags-y += --wrap=devm_cxl_setup_hdm
+-ldflags-y += --wrap=devm_cxl_enable_hdm
+ ldflags-y += --wrap=devm_cxl_add_passthrough_decoder
+ ldflags-y += --wrap=devm_cxl_enumerate_decoders
+ ldflags-y += --wrap=cxl_await_media_ready
+diff --git a/tools/testing/cxl/test/mock.c b/tools/testing/cxl/test/mock.c
+index 2844165276440..de3933a776fdb 100644
+--- a/tools/testing/cxl/test/mock.c
++++ b/tools/testing/cxl/test/mock.c
+@@ -149,21 +149,6 @@ struct cxl_hdm *__wrap_devm_cxl_setup_hdm(struct cxl_port *port,
+ }
+ EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_setup_hdm, CXL);
+
+-int __wrap_devm_cxl_enable_hdm(struct cxl_port *port, struct cxl_hdm *cxlhdm)
+-{
+- int index, rc;
+- struct cxl_mock_ops *ops = get_cxl_mock_ops(&index);
+-
+- if (ops && ops->is_mock_port(port->uport))
+- rc = 0;
+- else
+- rc = devm_cxl_enable_hdm(port, cxlhdm);
+- put_cxl_mock_ops(index);
+-
+- return rc;
+-}
+-EXPORT_SYMBOL_NS_GPL(__wrap_devm_cxl_enable_hdm, CXL);
+-
+ int __wrap_devm_cxl_add_passthrough_decoder(struct cxl_port *port)
+ {
+ int rc, index;
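Among the 6.4.2 changes above, the dm-ioctl hunk is a classic TOCTOU repair: check_version() keeps the version words it validated in the kernel-side struct, and copy_params() copies only the bytes after them, so a version field rewritten from another thread between the check and the use can no longer be seen by the kernel. A userspace sketch of that copy layout, using a stand-in struct rather than the real struct dm_ioctl; the field names and the fake version rule are invented for the illustration.

  #include <stddef.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  struct ioctl_hdr {
          uint32_t version[3];    /* must stay the first member */
          uint32_t data_size;
          char     data[32];
  };

  _Static_assert(offsetof(struct ioctl_hdr, version) == 0,
                 "version must be the first member");

  /* Step 1: validate the version and keep the copy that was validated. */
  static int check_version(const struct ioctl_hdr *user, struct ioctl_hdr *kern)
  {
          memcpy(kern->version, user->version, sizeof(kern->version));
          return kern->version[0] == 4 ? 0 : -1;  /* pretend major must be 4 */
  }

  /* Step 2: copy the rest of the header, starting after the version, so the
   * validated copy in *kern is never overwritten from "user" memory again. */
  static void copy_params(const struct ioctl_hdr *user, struct ioctl_hdr *kern)
  {
          memcpy((char *)kern + sizeof(kern->version),
                 (const char *)user + sizeof(kern->version),
                 sizeof(*kern) - sizeof(kern->version));
  }

  int main(void)
  {
          struct ioctl_hdr user = { .version = { 4, 47, 0 }, .data_size = 7 };
          struct ioctl_hdr kern = { 0 };

          if (check_version(&user, &kern))
                  return 1;
          user.version[0] = 99;   /* a racing rewrite no longer matters */
          copy_params(&user, &kern);
          printf("kernel still sees major %u, data_size %u\n",
                 kern.version[0], kern.data_size);
          return 0;
  }

The _Static_assert plays the role of the BUILD_BUG_ON in the patch: the layout trick only holds while version stays the first member.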
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-05 20:40 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-05 20:40 UTC (permalink / raw
To: gentoo-commits
commit: dc9e253c22b2d3fec7a59f68cb454d4fe849d773
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Jul 5 20:40:24 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Jul 5 20:40:24 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=dc9e253c
Remove redundant patch
Removed:
1800_mm-execve-mark-stack-as-growing-down.patch
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 --
1800_mm-execve-mark-stack-as-growing-down.patch | 82 -------------------------
2 files changed, 86 deletions(-)
diff --git a/0000_README b/0000_README
index f9224452..7e83110e 100644
--- a/0000_README
+++ b/0000_README
@@ -63,10 +63,6 @@ Patch: 1700_sparc-address-warray-bound-warnings.patch
From: https://github.com/KSPP/linux/issues/109
Desc: Address -Warray-bounds warnings
-Patch: 1800_mm-execve-mark-stack-as-growing-down.patch
-From: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
-Desc: execve: always mark stack as growing down during early stack setup
-
Patch: 1805_mm-disable-CONFIG-PER-VMA-LOCK-by-def.patch
From: https://lore.kernel.org/all/20230703182150.2193578-1-surenb@google.com/
Desc: mm: disable CONFIG_PER_VMA_LOCK by default until its fixed
diff --git a/1800_mm-execve-mark-stack-as-growing-down.patch b/1800_mm-execve-mark-stack-as-growing-down.patch
deleted file mode 100644
index 07daf228..00000000
--- a/1800_mm-execve-mark-stack-as-growing-down.patch
+++ /dev/null
@@ -1,82 +0,0 @@
-From d7a7655b29081c053b1abe71f64a2928638dafc6 Mon Sep 17 00:00:00 2001
-From: Linus Torvalds <torvalds@linux-foundation.org>
-Date: Sun, 2 Jul 2023 23:20:17 -0700
-Subject: execve: always mark stack as growing down during early stack setup
-
-commit f66066bc5136f25e36a2daff4896c768f18c211e upstream.
-
-While our user stacks can grow either down (all common architectures) or
-up (parisc and the ia64 register stack), the initial stack setup when we
-copy the argument and environment strings to the new stack at execve()
-time is always done by extending the stack downwards.
-
-But it turns out that in commit 8d7071af8907 ("mm: always expand the
-stack with the mmap write lock held"), as part of making the stack
-growing code more robust, 'expand_downwards()' was now made to actually
-check the vma flags:
-
- if (!(vma->vm_flags & VM_GROWSDOWN))
- return -EFAULT;
-
-and that meant that this execve-time stack expansion started failing on
-parisc, because on that architecture, the stack flags do not contain the
-VM_GROWSDOWN bit.
-
-At the same time the new check in expand_downwards() is clearly correct,
-and simplified the callers, so let's not remove it.
-
-The solution is instead to just codify the fact that yes, during
-execve(), the stack grows down. This not only matches reality, it ends
-up being particularly simple: we already have special execve-time flags
-for the stack (VM_STACK_INCOMPLETE_SETUP) and use those flags to avoid
-page migration during this setup time (see vma_is_temporary_stack() and
-invalid_migration_vma()).
-
-So just add VM_GROWSDOWN to that set of temporary flags, and now our
-stack flags automatically match reality, and the parisc stack expansion
-works again.
-
-Note that the VM_STACK_INCOMPLETE_SETUP bits will be cleared when the
-stack is finalized, so we only add the extra VM_GROWSDOWN bit on
-CONFIG_STACK_GROWSUP architectures (ie parisc) rather than adding it in
-general.
-
-Link: https://lore.kernel.org/all/612eaa53-6904-6e16-67fc-394f4faa0e16@bell.net/
-Link: https://lore.kernel.org/all/5fd98a09-4792-1433-752d-029ae3545168@gmx.de/
-Fixes: 8d7071af8907 ("mm: always expand the stack with the mmap write lock held")
-Reported-by: John David Anglin <dave.anglin@bell.net>
-Reported-and-tested-by: Helge Deller <deller@gmx.de>
-Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
-Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
----
- include/linux/mm.h | 4 +++-
- 1 file changed, 3 insertions(+), 1 deletion(-)
-
-diff --git a/include/linux/mm.h b/include/linux/mm.h
-index 6cbcc55a80b02..9e10485f37e7f 100644
---- a/include/linux/mm.h
-+++ b/include/linux/mm.h
-@@ -377,7 +377,7 @@ extern unsigned int kobjsize(const void *objp);
- #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
-
- /* Bits set in the VMA until the stack is in its final location */
--#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ)
-+#define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_EARLY)
-
- #define TASK_EXEC ((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0)
-
-@@ -399,8 +399,10 @@ extern unsigned int kobjsize(const void *objp);
-
- #ifdef CONFIG_STACK_GROWSUP
- #define VM_STACK VM_GROWSUP
-+#define VM_STACK_EARLY VM_GROWSDOWN
- #else
- #define VM_STACK VM_GROWSDOWN
-+#define VM_STACK_EARLY 0
- #endif
-
- #define VM_STACK_FLAGS (VM_STACK | VM_STACK_DEFAULT_FLAGS | VM_ACCOUNT)
---
-cgit
-
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-11 11:45 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-11 11:45 UTC (permalink / raw
To: gentoo-commits
commit: 03abf70bafbcb23d545a721da42583adb48800bd
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue Jul 11 11:45:11 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue Jul 11 11:45:11 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=03abf70b
Linux patch 6.4.3
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +++
1002_linux-6.4.3.patch | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 98 insertions(+)
diff --git a/0000_README b/0000_README
index 7e83110e..42e15858 100644
--- a/0000_README
+++ b/0000_README
@@ -51,6 +51,10 @@ Patch: 1001_linux-6.4.2.patch
From: https://www.kernel.org
Desc: Linux 6.4.2
+Patch: 1002_linux-6.4.3.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.3
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1002_linux-6.4.3.patch b/1002_linux-6.4.3.patch
new file mode 100644
index 00000000..b0f9b896
--- /dev/null
+++ b/1002_linux-6.4.3.patch
@@ -0,0 +1,94 @@
+diff --git a/Makefile b/Makefile
+index bcac81556b569..56abbcac061d4 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 2
++SUBLEVEL = 3
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
+index cc35d010fa949..e1a3c9c9754c5 100644
+--- a/include/linux/bootmem_info.h
++++ b/include/linux/bootmem_info.h
+@@ -3,6 +3,7 @@
+ #define __LINUX_BOOTMEM_INFO_H
+
+ #include <linux/mm.h>
++#include <linux/kmemleak.h>
+
+ /*
+ * Types for free bootmem stored in page->lru.next. These have to be in
+@@ -59,6 +60,7 @@ static inline void get_page_bootmem(unsigned long info, struct page *page,
+
+ static inline void free_bootmem_page(struct page *page)
+ {
++ kmemleak_free_part(page_to_virt(page), PAGE_SIZE);
+ free_reserved_page(page);
+ }
+ #endif
+diff --git a/kernel/fork.c b/kernel/fork.c
+index 41c964104b584..8103ffd217e97 100644
+--- a/kernel/fork.c
++++ b/kernel/fork.c
+@@ -690,6 +690,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
+ for_each_vma(old_vmi, mpnt) {
+ struct file *file;
+
++ vma_start_write(mpnt);
+ if (mpnt->vm_flags & VM_DONTCOPY) {
+ vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
+ continue;
+diff --git a/mm/memory.c b/mm/memory.c
+index 5ce82a76201d5..07bab1e774994 100644
+--- a/mm/memory.c
++++ b/mm/memory.c
+@@ -3932,6 +3932,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
+ }
+ }
+
++ /*
++ * Some architectures may have to restore extra metadata to the page
++ * when reading from swap. This metadata may be indexed by swap entry
++ * so this must be called before swap_free().
++ */
++ arch_swap_restore(entry, folio);
++
+ /*
+ * Remove the swap entry and conditionally try to free up the swapcache.
+ * We're already holding a reference on the page but haven't mapped it
+diff --git a/mm/mmap.c b/mm/mmap.c
+index bc510361acec2..30bf7772d4ac1 100644
+--- a/mm/mmap.c
++++ b/mm/mmap.c
+@@ -1975,6 +1975,8 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
+ return -ENOMEM;
+ }
+
++ /* Lock the VMA before expanding to prevent concurrent page faults */
++ vma_start_write(vma);
+ /*
+ * vma->vm_start/vm_end cannot change under us because the caller
+ * is required to hold the mmap_lock in read mode. We need the
+@@ -2062,6 +2064,8 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
+ return -ENOMEM;
+ }
+
++ /* Lock the VMA before expanding to prevent concurrent page faults */
++ vma_start_write(vma);
+ /*
+ * vma->vm_start/vm_end cannot change under us because the caller
+ * is required to hold the mmap_lock in read mode. We need the
+@@ -2797,6 +2801,8 @@ cannot_expand:
+ if (vma_iter_prealloc(&vmi))
+ goto close_and_free_vma;
+
++ /* Lock the VMA since it is modified after insertion into VMA tree */
++ vma_start_write(vma);
+ if (vma->vm_file)
+ i_mmap_lock_write(vma->vm_file->f_mapping);
+
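Two of the 6.4.3 hunks above are pure ordering fixes: expand_upwards() and expand_downwards() now take vma_start_write() before the bounds change, and arch_swap_restore() is moved in front of swap_free() because the metadata it restores is indexed by the swap entry that swap_free() releases. A minimal sketch of that second constraint, with an invented handle-indexed table standing in for the architecture's swap metadata.

  #include <stdio.h>

  #define NSLOTS 4

  static int slot_used[NSLOTS];
  static int slot_metadata[NSLOTS];   /* indexed by the swap "entry" */

  static int swap_alloc(int metadata)
  {
          for (int i = 0; i < NSLOTS; i++)
                  if (!slot_used[i]) {
                          slot_used[i] = 1;
                          slot_metadata[i] = metadata;
                          return i;
                  }
          return -1;
  }

  static void swap_free_entry(int entry)
  {
          slot_used[entry] = 0;
          slot_metadata[entry] = 0;    /* the slot may be reused immediately */
  }

  static int restore_metadata(int entry)
  {
          return slot_metadata[entry];
  }

  int main(void)
  {
          int entry = swap_alloc(42);

          /* Correct order: read what the entry indexes, then release it.
           * Swapping the two calls would read back 0, the bug class the
           * do_swap_page() reordering above avoids. */
          int meta = restore_metadata(entry);

          swap_free_entry(entry);
          printf("restored %d\n", meta);
          return 0;
  }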
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-19 17:04 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-19 17:04 UTC (permalink / raw
To: gentoo-commits
commit: cad5fd068ed2bfce525b0b8bf3b60cc4610d8839
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Jul 19 17:04:11 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Jul 19 17:04:11 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=cad5fd06
Linux patch 6.4.4
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1003_linux-6.4.4.patch | 39171 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 39175 insertions(+)
diff --git a/0000_README b/0000_README
index 42e15858..2532d9e5 100644
--- a/0000_README
+++ b/0000_README
@@ -55,6 +55,10 @@ Patch: 1002_linux-6.4.3.patch
From: https://www.kernel.org
Desc: Linux 6.4.3
+Patch: 1003_linux-6.4.4.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.4
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1003_linux-6.4.4.patch b/1003_linux-6.4.4.patch
new file mode 100644
index 00000000..bd7fd5f3
--- /dev/null
+++ b/1003_linux-6.4.4.patch
@@ -0,0 +1,39171 @@
+diff --git a/Documentation/ABI/testing/sysfs-driver-eud b/Documentation/ABI/testing/sysfs-driver-eud
+index 83f3872182a40..2bab0db2d2f0f 100644
+--- a/Documentation/ABI/testing/sysfs-driver-eud
++++ b/Documentation/ABI/testing/sysfs-driver-eud
+@@ -1,4 +1,4 @@
+-What: /sys/bus/platform/drivers/eud/.../enable
++What: /sys/bus/platform/drivers/qcom_eud/.../enable
+ Date: February 2022
+ Contact: Souradeep Chowdhury <quic_schowdhu@quicinc.com>
+ Description:
+diff --git a/Documentation/devicetree/bindings/crypto/qcom-qce.yaml b/Documentation/devicetree/bindings/crypto/qcom-qce.yaml
+index e375bd9813009..90ddf98a6df92 100644
+--- a/Documentation/devicetree/bindings/crypto/qcom-qce.yaml
++++ b/Documentation/devicetree/bindings/crypto/qcom-qce.yaml
+@@ -24,6 +24,12 @@ properties:
+ deprecated: true
+ description: Kept only for ABI backward compatibility
+
++ - items:
++ - enum:
++ - qcom,ipq4019-qce
++ - qcom,sm8150-qce
++ - const: qcom,qce
++
+ - items:
+ - enum:
+ - qcom,ipq6018-qce
+diff --git a/Documentation/devicetree/bindings/iio/adc/adi,ad7192.yaml b/Documentation/devicetree/bindings/iio/adc/adi,ad7192.yaml
+index d521d516088be..16def2985ab4f 100644
+--- a/Documentation/devicetree/bindings/iio/adc/adi,ad7192.yaml
++++ b/Documentation/devicetree/bindings/iio/adc/adi,ad7192.yaml
+@@ -47,6 +47,9 @@ properties:
+ avdd-supply:
+ description: AVdd voltage supply
+
++ vref-supply:
++ description: VRef voltage supply
++
+ adi,rejection-60-Hz-enable:
+ description: |
+ This bit enables a notch at 60 Hz when the first notch of the sinc
+@@ -89,6 +92,7 @@ required:
+ - interrupts
+ - dvdd-supply
+ - avdd-supply
++ - vref-supply
+ - spi-cpol
+ - spi-cpha
+
+@@ -115,6 +119,7 @@ examples:
+ interrupt-parent = <&gpio>;
+ dvdd-supply = <&dvdd>;
+ avdd-supply = <&avdd>;
++ vref-supply = <&vref>;
+
+ adi,refin2-pins-enable;
+ adi,rejection-60-Hz-enable;
+diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+index ba677d401e240..6cb04f35642aa 100644
+--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
++++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+@@ -80,6 +80,7 @@ properties:
+ items:
+ - enum:
+ - qcom,sc7280-smmu-500
++ - qcom,sc8280xp-smmu-500
+ - qcom,sm6115-smmu-500
+ - qcom,sm6125-smmu-500
+ - qcom,sm8150-smmu-500
+@@ -331,7 +332,9 @@ allOf:
+ properties:
+ compatible:
+ contains:
+- const: qcom,sc7280-smmu-500
++ enum:
++ - qcom,sc7280-smmu-500
++ - qcom,sc8280xp-smmu-500
+ then:
+ properties:
+ clock-names:
+@@ -416,7 +419,6 @@ allOf:
+ - qcom,sa8775p-smmu-500
+ - qcom,sc7180-smmu-500
+ - qcom,sc8180x-smmu-500
+- - qcom,sc8280xp-smmu-500
+ - qcom,sdm670-smmu-500
+ - qcom,sdm845-smmu-500
+ - qcom,sdx55-smmu-500
+diff --git a/Documentation/devicetree/bindings/power/reset/qcom,pon.yaml b/Documentation/devicetree/bindings/power/reset/qcom,pon.yaml
+index d96170eecbd22..0b1eca734d3b1 100644
+--- a/Documentation/devicetree/bindings/power/reset/qcom,pon.yaml
++++ b/Documentation/devicetree/bindings/power/reset/qcom,pon.yaml
+@@ -56,7 +56,6 @@ required:
+ unevaluatedProperties: false
+
+ allOf:
+- - $ref: reboot-mode.yaml#
+ - if:
+ properties:
+ compatible:
+@@ -66,6 +65,9 @@ allOf:
+ - qcom,pms405-pon
+ - qcom,pm8998-pon
+ then:
++ allOf:
++ - $ref: reboot-mode.yaml#
++
+ properties:
+ reg:
+ maxItems: 1
+diff --git a/Documentation/devicetree/bindings/sound/mediatek,mt8188-afe.yaml b/Documentation/devicetree/bindings/sound/mediatek,mt8188-afe.yaml
+index 82ccb32f08f27..9e877f0d19fbb 100644
+--- a/Documentation/devicetree/bindings/sound/mediatek,mt8188-afe.yaml
++++ b/Documentation/devicetree/bindings/sound/mediatek,mt8188-afe.yaml
+@@ -63,15 +63,15 @@ properties:
+ - const: apll12_div2
+ - const: apll12_div3
+ - const: apll12_div9
+- - const: a1sys_hp_sel
+- - const: aud_intbus_sel
+- - const: audio_h_sel
+- - const: audio_local_bus_sel
+- - const: dptx_m_sel
+- - const: i2so1_m_sel
+- - const: i2so2_m_sel
+- - const: i2si1_m_sel
+- - const: i2si2_m_sel
++ - const: top_a1sys_hp
++ - const: top_aud_intbus
++ - const: top_audio_h
++ - const: top_audio_local_bus
++ - const: top_dptx
++ - const: top_i2so1
++ - const: top_i2so2
++ - const: top_i2si1
++ - const: top_i2si2
+ - const: adsp_audio_26m
+
+ mediatek,etdm-in1-cowork-source:
+@@ -193,15 +193,15 @@ examples:
+ "apll12_div2",
+ "apll12_div3",
+ "apll12_div9",
+- "a1sys_hp_sel",
+- "aud_intbus_sel",
+- "audio_h_sel",
+- "audio_local_bus_sel",
+- "dptx_m_sel",
+- "i2so1_m_sel",
+- "i2so2_m_sel",
+- "i2si1_m_sel",
+- "i2si2_m_sel",
++ "top_a1sys_hp",
++ "top_aud_intbus",
++ "top_audio_h",
++ "top_audio_local_bus",
++ "top_dptx",
++ "top_i2so1",
++ "top_i2so2",
++ "top_i2si1",
++ "top_i2si2",
+ "adsp_audio_26m";
+ };
+
+diff --git a/Documentation/fault-injection/provoke-crashes.rst b/Documentation/fault-injection/provoke-crashes.rst
+index 3abe842256139..1f087e502ca6d 100644
+--- a/Documentation/fault-injection/provoke-crashes.rst
++++ b/Documentation/fault-injection/provoke-crashes.rst
+@@ -29,7 +29,7 @@ recur_count
+ cpoint_name
+ Where in the kernel to trigger the action. It can be
+ one of INT_HARDWARE_ENTRY, INT_HW_IRQ_EN, INT_TASKLET_ENTRY,
+- FS_DEVRW, MEM_SWAPOUT, TIMERADD, SCSI_QUEUE_RQ, or DIRECT.
++ FS_SUBMIT_BH, MEM_SWAPOUT, TIMERADD, SCSI_QUEUE_RQ, or DIRECT.
+
+ cpoint_type
+ Indicates the action to be taken on hitting the crash point.
+diff --git a/Documentation/filesystems/autofs-mount-control.rst b/Documentation/filesystems/autofs-mount-control.rst
+index bf4b511cdbe85..b5a379d25c40b 100644
+--- a/Documentation/filesystems/autofs-mount-control.rst
++++ b/Documentation/filesystems/autofs-mount-control.rst
+@@ -196,7 +196,7 @@ information and return operation results::
+ struct args_ismountpoint ismountpoint;
+ };
+
+- char path[0];
++ char path[];
+ };
+
+ The ioctlfd field is a mount point file descriptor of an autofs mount
+diff --git a/Documentation/filesystems/autofs.rst b/Documentation/filesystems/autofs.rst
+index 4f490278d22fc..3b6e38e646cd8 100644
+--- a/Documentation/filesystems/autofs.rst
++++ b/Documentation/filesystems/autofs.rst
+@@ -467,7 +467,7 @@ Each ioctl is passed a pointer to an `autofs_dev_ioctl` structure::
+ struct args_ismountpoint ismountpoint;
+ };
+
+- char path[0];
++ char path[];
+ };
+
+ For the **OPEN_MOUNT** and **IS_MOUNTPOINT** commands, the target
+diff --git a/Documentation/filesystems/directory-locking.rst b/Documentation/filesystems/directory-locking.rst
+index 504ba940c36c1..dccd61c7c5c3b 100644
+--- a/Documentation/filesystems/directory-locking.rst
++++ b/Documentation/filesystems/directory-locking.rst
+@@ -22,12 +22,11 @@ exclusive.
+ 3) object removal. Locking rules: caller locks parent, finds victim,
+ locks victim and calls the method. Locks are exclusive.
+
+-4) rename() that is _not_ cross-directory. Locking rules: caller locks
+-the parent and finds source and target. In case of exchange (with
+-RENAME_EXCHANGE in flags argument) lock both. In any case,
+-if the target already exists, lock it. If the source is a non-directory,
+-lock it. If we need to lock both, lock them in inode pointer order.
+-Then call the method. All locks are exclusive.
++4) rename() that is _not_ cross-directory. Locking rules: caller locks the
++parent and finds source and target. We lock both (provided they exist). If we
++need to lock two inodes of different type (dir vs non-dir), we lock directory
++first. If we need to lock two inodes of the same type, lock them in inode
++pointer order. Then call the method. All locks are exclusive.
+ NB: we might get away with locking the source (and target in exchange
+ case) shared.
+
+@@ -44,15 +43,17 @@ All locks are exclusive.
+ rules:
+
+ * lock the filesystem
+- * lock parents in "ancestors first" order.
++ * lock parents in "ancestors first" order. If one is not ancestor of
++ the other, lock them in inode pointer order.
+ * find source and target.
+ * if old parent is equal to or is a descendent of target
+ fail with -ENOTEMPTY
+ * if new parent is equal to or is a descendent of source
+ fail with -ELOOP
+- * If it's an exchange, lock both the source and the target.
+- * If the target exists, lock it. If the source is a non-directory,
+- lock it. If we need to lock both, do so in inode pointer order.
++ * Lock both the source and the target provided they exist. If we
++ need to lock two inodes of different type (dir vs non-dir), we lock
++ the directory first. If we need to lock two inodes of the same type,
++ lock them in inode pointer order.
+ * call the method.
+
+ All ->i_rwsem are taken exclusive. Again, we might get away with locking
+@@ -66,8 +67,9 @@ If no directory is its own ancestor, the scheme above is deadlock-free.
+
+ Proof:
+
+- First of all, at any moment we have a partial ordering of the
+- objects - A < B iff A is an ancestor of B.
++ First of all, at any moment we have a linear ordering of the
++ objects - A < B iff (A is an ancestor of B) or (B is not an ancestor
++ of A and ptr(A) < ptr(B)).
+
+ That ordering can change. However, the following is true:
+
+diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
+index c57745375edbc..9359978a5af26 100644
+--- a/Documentation/filesystems/f2fs.rst
++++ b/Documentation/filesystems/f2fs.rst
+@@ -351,6 +351,22 @@ age_extent_cache Enable an age extent cache based on rb-tree. It records
+ data block update frequency of the extent per inode, in
+ order to provide better temperature hints for data block
+ allocation.
++errors=%s Specify f2fs behavior on critical errors. This supports modes:
++ "panic", "continue" and "remount-ro", respectively, trigger
++ panic immediately, continue without doing anything, and remount
++ the partition in read-only mode. By default it uses "continue"
++ mode.
++ ====================== =============== =============== ========
++ mode continue remount-ro panic
++ ====================== =============== =============== ========
+ access ops normal normal N/A
++ syscall errors -EIO -EROFS N/A
++ mount option rw ro N/A
++ pending dir write keep keep N/A
++ pending non-dir write drop keep N/A
++ pending node write drop keep N/A
++ pending meta write keep keep N/A
++ ====================== =============== =============== ========
+ ======================== ============================================================
+
+ Debugfs Entries
+diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
+index 247c6c4127e94..1cc35de336a41 100644
+--- a/Documentation/networking/af_xdp.rst
++++ b/Documentation/networking/af_xdp.rst
+@@ -433,6 +433,15 @@ start N bytes into the buffer leaving the first N bytes for the
+ application to use. The final option is the flags field, but it will
+ be dealt with in separate sections for each UMEM flag.
+
++SO_BINDTODEVICE setsockopt
++--------------------------
++
++This is a generic SOL_SOCKET option that can be used to tie AF_XDP
++socket to a particular network interface. It is useful when a socket
++is created by a privileged process and passed to a non-privileged one.
++Once the option is set, kernel will refuse attempts to bind that socket
++to a different interface. Updating the value requires CAP_NET_RAW.
++
+ XDP_STATISTICS getsockopt
+ -------------------------
+
+diff --git a/Makefile b/Makefile
+index 56abbcac061d4..d5041f7daf689 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 3
++SUBLEVEL = 4
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/arc/include/asm/linkage.h b/arch/arc/include/asm/linkage.h
+index c9434ff3aa4ce..8a3fb71e9cfad 100644
+--- a/arch/arc/include/asm/linkage.h
++++ b/arch/arc/include/asm/linkage.h
+@@ -8,6 +8,10 @@
+
+ #include <asm/dwarf.h>
+
++#define ASM_NL ` /* use '`' to mark new line in macro */
++#define __ALIGN .align 4
++#define __ALIGN_STR __stringify(__ALIGN)
++
+ #ifdef __ASSEMBLY__
+
+ .macro ST2 e, o, off
+@@ -28,10 +32,6 @@
+ #endif
+ .endm
+
+-#define ASM_NL ` /* use '`' to mark new line in macro */
+-#define __ALIGN .align 4
+-#define __ALIGN_STR __stringify(__ALIGN)
+-
+ /* annotation for data we want in DCCM - if enabled in .config */
+ .macro ARCFP_DATA nm
+ #ifdef CONFIG_ARC_HAS_DCCM
+diff --git a/arch/arm/boot/dts/bcm53015-meraki-mr26.dts b/arch/arm/boot/dts/bcm53015-meraki-mr26.dts
+index 14f58033efeb9..ca2266b936ee2 100644
+--- a/arch/arm/boot/dts/bcm53015-meraki-mr26.dts
++++ b/arch/arm/boot/dts/bcm53015-meraki-mr26.dts
+@@ -128,7 +128,7 @@
+
+ fixed-link {
+ speed = <1000>;
+- duplex-full;
++ full-duplex;
+ };
+ };
+ };
+diff --git a/arch/arm/boot/dts/bcm53016-meraki-mr32.dts b/arch/arm/boot/dts/bcm53016-meraki-mr32.dts
+index 46c2c93b01d88..a34e1746a6c59 100644
+--- a/arch/arm/boot/dts/bcm53016-meraki-mr32.dts
++++ b/arch/arm/boot/dts/bcm53016-meraki-mr32.dts
+@@ -187,7 +187,7 @@
+
+ fixed-link {
+ speed = <1000>;
+- duplex-full;
++ full-duplex;
+ };
+ };
+ };
+diff --git a/arch/arm/boot/dts/bcm5301x.dtsi b/arch/arm/boot/dts/bcm5301x.dtsi
+index 5fc1b847f4aa5..787a0dd8216b7 100644
+--- a/arch/arm/boot/dts/bcm5301x.dtsi
++++ b/arch/arm/boot/dts/bcm5301x.dtsi
+@@ -542,7 +542,6 @@
+ "spi_lr_session_done",
+ "spi_lr_overread";
+ clocks = <&iprocmed>;
+- clock-names = "iprocmed";
+ num-cs = <2>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+diff --git a/arch/arm/boot/dts/iwg20d-q7-common.dtsi b/arch/arm/boot/dts/iwg20d-q7-common.dtsi
+index 03caea6fc6ffa..4351c5a02fa59 100644
+--- a/arch/arm/boot/dts/iwg20d-q7-common.dtsi
++++ b/arch/arm/boot/dts/iwg20d-q7-common.dtsi
+@@ -49,7 +49,7 @@
+ lcd_backlight: backlight {
+ compatible = "pwm-backlight";
+
+- pwms = <&pwm3 0 5000000 0>;
++ pwms = <&pwm3 0 5000000>;
+ brightness-levels = <0 4 8 16 32 64 128 255>;
+ default-brightness-level = <7>;
+ enable-gpios = <&gpio5 14 GPIO_ACTIVE_HIGH>;
+diff --git a/arch/arm/boot/dts/lan966x-kontron-kswitch-d10-mmt.dtsi b/arch/arm/boot/dts/lan966x-kontron-kswitch-d10-mmt.dtsi
+index 0097e72e3fb22..f4df4cc1dfa5e 100644
+--- a/arch/arm/boot/dts/lan966x-kontron-kswitch-d10-mmt.dtsi
++++ b/arch/arm/boot/dts/lan966x-kontron-kswitch-d10-mmt.dtsi
+@@ -18,6 +18,8 @@
+
+ gpio-restart {
+ compatible = "gpio-restart";
++ pinctrl-0 = <&reset_pins>;
++ pinctrl-names = "default";
+ gpios = <&gpio 56 GPIO_ACTIVE_LOW>;
+ priority = <200>;
+ };
+@@ -39,7 +41,7 @@
+ status = "okay";
+
+ spi3: spi@400 {
+- pinctrl-0 = <&fc3_b_pins>;
++ pinctrl-0 = <&fc3_b_pins>, <&spi3_cs_pins>;
+ pinctrl-names = "default";
+ status = "okay";
+ cs-gpios = <&gpio 46 GPIO_ACTIVE_LOW>;
+@@ -59,6 +61,12 @@
+ function = "miim_c";
+ };
+
++ reset_pins: reset-pins {
++ /* SYS_RST# */
++ pins = "GPIO_56";
++ function = "gpio";
++ };
++
+ sgpio_a_pins: sgpio-a-pins {
+ /* SCK, D0, D1 */
+ pins = "GPIO_32", "GPIO_33", "GPIO_34";
+@@ -71,6 +79,12 @@
+ function = "sgpio_b";
+ };
+
++ spi3_cs_pins: spi3-cs-pins {
++ /* CS# */
++ pins = "GPIO_46";
++ function = "gpio";
++ };
++
+ usart0_pins: usart0-pins {
+ /* RXD, TXD */
+ pins = "GPIO_25", "GPIO_26";
+diff --git a/arch/arm/boot/dts/meson8.dtsi b/arch/arm/boot/dts/meson8.dtsi
+index 4f22ab451aae2..59932fbfd5d5f 100644
+--- a/arch/arm/boot/dts/meson8.dtsi
++++ b/arch/arm/boot/dts/meson8.dtsi
+@@ -769,13 +769,13 @@
+
+ &uart_B {
+ compatible = "amlogic,meson8-uart";
+- clocks = <&xtal>, <&clkc CLKID_UART0>, <&clkc CLKID_CLK81>;
++ clocks = <&xtal>, <&clkc CLKID_UART1>, <&clkc CLKID_CLK81>;
+ clock-names = "xtal", "pclk", "baud";
+ };
+
+ &uart_C {
+ compatible = "amlogic,meson8-uart";
+- clocks = <&xtal>, <&clkc CLKID_UART0>, <&clkc CLKID_CLK81>;
++ clocks = <&xtal>, <&clkc CLKID_UART2>, <&clkc CLKID_CLK81>;
+ clock-names = "xtal", "pclk", "baud";
+ };
+
+diff --git a/arch/arm/boot/dts/meson8b.dtsi b/arch/arm/boot/dts/meson8b.dtsi
+index 5979209fe91ef..5198f5177c2c1 100644
+--- a/arch/arm/boot/dts/meson8b.dtsi
++++ b/arch/arm/boot/dts/meson8b.dtsi
+@@ -740,13 +740,13 @@
+
+ &uart_B {
+ compatible = "amlogic,meson8b-uart";
+- clocks = <&xtal>, <&clkc CLKID_UART0>, <&clkc CLKID_CLK81>;
++ clocks = <&xtal>, <&clkc CLKID_UART1>, <&clkc CLKID_CLK81>;
+ clock-names = "xtal", "pclk", "baud";
+ };
+
+ &uart_C {
+ compatible = "amlogic,meson8b-uart";
+- clocks = <&xtal>, <&clkc CLKID_UART0>, <&clkc CLKID_CLK81>;
++ clocks = <&xtal>, <&clkc CLKID_UART2>, <&clkc CLKID_CLK81>;
+ clock-names = "xtal", "pclk", "baud";
+ };
+
+diff --git a/arch/arm/boot/dts/omap3-gta04a5one.dts b/arch/arm/boot/dts/omap3-gta04a5one.dts
+index 9db9fe67cd63b..95df45cc70c09 100644
+--- a/arch/arm/boot/dts/omap3-gta04a5one.dts
++++ b/arch/arm/boot/dts/omap3-gta04a5one.dts
+@@ -5,9 +5,11 @@
+
+ #include "omap3-gta04a5.dts"
+
+-&omap3_pmx_core {
++/ {
+ model = "Goldelico GTA04A5/Letux 2804 with OneNAND";
++};
+
++&omap3_pmx_core {
+ gpmc_pins: pinmux_gpmc_pins {
+ pinctrl-single,pins = <
+
+diff --git a/arch/arm/boot/dts/qcom-apq8060-dragonboard.dts b/arch/arm/boot/dts/qcom-apq8060-dragonboard.dts
+index 8e4b61e4d4b17..e8fe321f3d89b 100644
+--- a/arch/arm/boot/dts/qcom-apq8060-dragonboard.dts
++++ b/arch/arm/boot/dts/qcom-apq8060-dragonboard.dts
+@@ -451,7 +451,7 @@
+ * PM8901 supplies "preliminary regulators" whatever
+ * that means
+ */
+- pm8901-regulators {
++ regulators-0 {
+ vdd_l0-supply = <&pm8901_s4>;
+ vdd_l1-supply = <&vph>;
+ vdd_l2-supply = <&vph>;
+@@ -537,7 +537,7 @@
+
+ };
+
+- pm8058-regulators {
++ regulators-1 {
+ vdd_l0_l1_lvs-supply = <&pm8058_s3>;
+ vdd_l2_l11_l12-supply = <&vph>;
+ vdd_l3_l4_l5-supply = <&vph>;
+diff --git a/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts b/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
+index 1345df7cbd002..6b047c6793707 100644
+--- a/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
++++ b/arch/arm/boot/dts/qcom-apq8074-dragonboard.dts
+@@ -23,6 +23,10 @@
+ status = "okay";
+ };
+
++&blsp2_dma {
++ qcom,controlled-remotely;
++};
++
+ &blsp2_i2c5 {
+ status = "okay";
+ clock-frequency = <200000>;
+diff --git a/arch/arm/boot/dts/qcom-ipq4019-ap.dk04.1-c1.dts b/arch/arm/boot/dts/qcom-ipq4019-ap.dk04.1-c1.dts
+index 79b0c6318e527..0993f840d1fc7 100644
+--- a/arch/arm/boot/dts/qcom-ipq4019-ap.dk04.1-c1.dts
++++ b/arch/arm/boot/dts/qcom-ipq4019-ap.dk04.1-c1.dts
+@@ -11,9 +11,9 @@
+ dma-controller@7984000 {
+ status = "okay";
+ };
+-
+- qpic-nand@79b0000 {
+- status = "okay";
+- };
+ };
+ };
++
++&nand {
++ status = "okay";
++};
+diff --git a/arch/arm/boot/dts/qcom-ipq4019-ap.dk04.1.dtsi b/arch/arm/boot/dts/qcom-ipq4019-ap.dk04.1.dtsi
+index a63b3778636d4..468ebc40d2ad3 100644
+--- a/arch/arm/boot/dts/qcom-ipq4019-ap.dk04.1.dtsi
++++ b/arch/arm/boot/dts/qcom-ipq4019-ap.dk04.1.dtsi
+@@ -102,10 +102,10 @@
+ status = "okay";
+ perst-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+ };
+-
+- qpic-nand@79b0000 {
+- pinctrl-0 = <&nand_pins>;
+- pinctrl-names = "default";
+- };
+ };
+ };
++
++&nand {
++ pinctrl-0 = <&nand_pins>;
++ pinctrl-names = "default";
++};
+diff --git a/arch/arm/boot/dts/qcom-ipq4019-ap.dk07.1.dtsi b/arch/arm/boot/dts/qcom-ipq4019-ap.dk07.1.dtsi
+index 0107f552f5204..7ef635997efa4 100644
+--- a/arch/arm/boot/dts/qcom-ipq4019-ap.dk07.1.dtsi
++++ b/arch/arm/boot/dts/qcom-ipq4019-ap.dk07.1.dtsi
+@@ -65,11 +65,11 @@
+ dma-controller@7984000 {
+ status = "okay";
+ };
+-
+- qpic-nand@79b0000 {
+- pinctrl-0 = <&nand_pins>;
+- pinctrl-names = "default";
+- status = "okay";
+- };
+ };
+ };
++
++&nand {
++ pinctrl-0 = <&nand_pins>;
++ pinctrl-names = "default";
++ status = "okay";
++};
+diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
+index 7ed0d925a4e99..a22616491dc0e 100644
+--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
++++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
+@@ -301,7 +301,7 @@
+ qcom,ipc = <&apcs 8 0>;
+ qcom,smd-edge = <15>;
+
+- rpm_requests: rpm_requests {
++ rpm_requests: rpm-requests {
+ compatible = "qcom,rpm-msm8974";
+ qcom,smd-channels = "rpm_requests";
+
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcom-pdk2.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcom-pdk2.dtsi
+index 4709677151aac..46b87a27d8b37 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcom-pdk2.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcom-pdk2.dtsi
+@@ -137,10 +137,13 @@
+
+ sound {
+ compatible = "audio-graph-card";
+- routing =
+- "MIC_IN", "Capture",
+- "Capture", "Mic Bias",
+- "Playback", "HP_OUT";
++ widgets = "Headphone", "Headphone Jack",
++ "Line", "Line In Jack",
++ "Microphone", "Microphone Jack";
++ routing = "Headphone Jack", "HP_OUT",
++ "LINE_IN", "Line In Jack",
++ "MIC_IN", "Microphone Jack",
++ "Microphone Jack", "Mic Bias";
+ dais = <&sai2a_port &sai2b_port>;
+ status = "okay";
+ };
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
+index 50af4a27d6be4..7d5d6d4360385 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
+@@ -87,7 +87,7 @@
+
+ sound {
+ compatible = "audio-graph-card";
+- label = "STM32MP1-AV96-HDMI";
++ label = "STM32-AV96-HDMI";
+ dais = <&sai2a_port>;
+ status = "okay";
+ };
+@@ -321,6 +321,12 @@
+ };
+ };
+ };
++
++ dh_mac_eeprom: eeprom@53 {
++ compatible = "atmel,24c02";
++ reg = <0x53>;
++ pagesize = <16>;
++ };
+ };
+
+ &ltdc {
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcor-drc-compact.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcor-drc-compact.dtsi
+index c32c160f97f20..39af79dc654cc 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcor-drc-compact.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcor-drc-compact.dtsi
+@@ -192,6 +192,12 @@
+ reg = <0x50>;
+ pagesize = <16>;
+ };
++
++ dh_mac_eeprom: eeprom@53 {
++ compatible = "atmel,24c02";
++ reg = <0x53>;
++ pagesize = <16>;
++ };
+ };
+
+ &sdmmc1 { /* MicroSD */
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi
+index bb40fb46da81d..bba19f21e5277 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi
+@@ -213,12 +213,6 @@
+ status = "disabled";
+ };
+ };
+-
+- eeprom@53 {
+- compatible = "atmel,24c02";
+- reg = <0x53>;
+- pagesize = <16>;
+- };
+ };
+
+ &ipcc {
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcor-testbench.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcor-testbench.dtsi
+index 5fdb74b652aca..faed31b6d84a1 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcor-testbench.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcor-testbench.dtsi
+@@ -90,6 +90,14 @@
+ };
+ };
+
++&i2c4 {
++ dh_mac_eeprom: eeprom@53 {
++ compatible = "atmel,24c02";
++ reg = <0x53>;
++ pagesize = <16>;
++ };
++};
++
+ &sdmmc1 {
+ pinctrl-names = "default", "opendrain", "sleep";
+ pinctrl-0 = <&sdmmc1_b4_pins_a &sdmmc1_dir_pins_b>;
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dkx.dtsi b/arch/arm/boot/dts/stm32mp15xx-dkx.dtsi
+index cefeeb00fc228..aa2e92f1e63d3 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dkx.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dkx.dtsi
+@@ -435,7 +435,7 @@
+ i2s2_port: port {
+ i2s2_endpoint: endpoint {
+ remote-endpoint = <&sii9022_tx_endpoint>;
+- format = "i2s";
++ dai-format = "i2s";
+ mclk-fs = <256>;
+ };
+ };
+diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
+index 505a306e0271a..aebe2c8f6a686 100644
+--- a/arch/arm/include/asm/assembler.h
++++ b/arch/arm/include/asm/assembler.h
+@@ -394,6 +394,23 @@ ALT_UP_B(.L0_\@)
+ #endif
+ .endm
+
++/*
++ * Raw SMP data memory barrier
++ */
++ .macro __smp_dmb mode
++#if __LINUX_ARM_ARCH__ >= 7
++ .ifeqs "\mode","arm"
++ dmb ish
++ .else
++ W(dmb) ish
++ .endif
++#elif __LINUX_ARM_ARCH__ == 6
++ mcr p15, 0, r0, c7, c10, 5 @ dmb
++#else
++ .error "Incompatible SMP platform"
++#endif
++ .endm
++
+ #if defined(CONFIG_CPU_V7M)
+ /*
+ * setmode is used to assert to be in svc mode during boot. For v7-M
+diff --git a/arch/arm/include/asm/sync_bitops.h b/arch/arm/include/asm/sync_bitops.h
+index 6f5d627c44a3c..f46b3c570f92e 100644
+--- a/arch/arm/include/asm/sync_bitops.h
++++ b/arch/arm/include/asm/sync_bitops.h
+@@ -14,14 +14,35 @@
+ * ops which are SMP safe even on a UP kernel.
+ */
+
++/*
++ * Unordered
++ */
++
+ #define sync_set_bit(nr, p) _set_bit(nr, p)
+ #define sync_clear_bit(nr, p) _clear_bit(nr, p)
+ #define sync_change_bit(nr, p) _change_bit(nr, p)
+-#define sync_test_and_set_bit(nr, p) _test_and_set_bit(nr, p)
+-#define sync_test_and_clear_bit(nr, p) _test_and_clear_bit(nr, p)
+-#define sync_test_and_change_bit(nr, p) _test_and_change_bit(nr, p)
+ #define sync_test_bit(nr, addr) test_bit(nr, addr)
+-#define arch_sync_cmpxchg arch_cmpxchg
+
++/*
++ * Fully ordered
++ */
++
++int _sync_test_and_set_bit(int nr, volatile unsigned long * p);
++#define sync_test_and_set_bit(nr, p) _sync_test_and_set_bit(nr, p)
++
++int _sync_test_and_clear_bit(int nr, volatile unsigned long * p);
++#define sync_test_and_clear_bit(nr, p) _sync_test_and_clear_bit(nr, p)
++
++int _sync_test_and_change_bit(int nr, volatile unsigned long * p);
++#define sync_test_and_change_bit(nr, p) _sync_test_and_change_bit(nr, p)
++
++#define arch_sync_cmpxchg(ptr, old, new) \
++({ \
++ __typeof__(*(ptr)) __ret; \
++ __smp_mb__before_atomic(); \
++ __ret = arch_cmpxchg_relaxed((ptr), (old), (new)); \
++ __smp_mb__after_atomic(); \
++ __ret; \
++})
+
+ #endif
+diff --git a/arch/arm/lib/bitops.h b/arch/arm/lib/bitops.h
+index 95bd359912889..f069d1b2318e6 100644
+--- a/arch/arm/lib/bitops.h
++++ b/arch/arm/lib/bitops.h
+@@ -28,7 +28,7 @@ UNWIND( .fnend )
+ ENDPROC(\name )
+ .endm
+
+- .macro testop, name, instr, store
++ .macro __testop, name, instr, store, barrier
+ ENTRY( \name )
+ UNWIND( .fnstart )
+ ands ip, r1, #3
+@@ -38,7 +38,7 @@ UNWIND( .fnstart )
+ mov r0, r0, lsr #5
+ add r1, r1, r0, lsl #2 @ Get word offset
+ mov r3, r2, lsl r3 @ create mask
+- smp_dmb
++ \barrier
+ #if __LINUX_ARM_ARCH__ >= 7 && defined(CONFIG_SMP)
+ .arch_extension mp
+ ALT_SMP(W(pldw) [r1])
+@@ -50,13 +50,21 @@ UNWIND( .fnstart )
+ strex ip, r2, [r1]
+ cmp ip, #0
+ bne 1b
+- smp_dmb
++ \barrier
+ cmp r0, #0
+ movne r0, #1
+ 2: bx lr
+ UNWIND( .fnend )
+ ENDPROC(\name )
+ .endm
++
++ .macro testop, name, instr, store
++ __testop \name, \instr, \store, smp_dmb
++ .endm
++
++ .macro sync_testop, name, instr, store
++ __testop \name, \instr, \store, __smp_dmb
++ .endm
+ #else
+ .macro bitop, name, instr
+ ENTRY( \name )
+diff --git a/arch/arm/lib/testchangebit.S b/arch/arm/lib/testchangebit.S
+index 4ebecc67e6e04..f13fe9bc2399a 100644
+--- a/arch/arm/lib/testchangebit.S
++++ b/arch/arm/lib/testchangebit.S
+@@ -10,3 +10,7 @@
+ .text
+
+ testop _test_and_change_bit, eor, str
++
++#if __LINUX_ARM_ARCH__ >= 6
++sync_testop _sync_test_and_change_bit, eor, str
++#endif
+diff --git a/arch/arm/lib/testclearbit.S b/arch/arm/lib/testclearbit.S
+index 009afa0f5b4a7..4d2c5ca620ebf 100644
+--- a/arch/arm/lib/testclearbit.S
++++ b/arch/arm/lib/testclearbit.S
+@@ -10,3 +10,7 @@
+ .text
+
+ testop _test_and_clear_bit, bicne, strne
++
++#if __LINUX_ARM_ARCH__ >= 6
++sync_testop _sync_test_and_clear_bit, bicne, strne
++#endif
+diff --git a/arch/arm/lib/testsetbit.S b/arch/arm/lib/testsetbit.S
+index f3192e55acc87..649dbab65d8d0 100644
+--- a/arch/arm/lib/testsetbit.S
++++ b/arch/arm/lib/testsetbit.S
+@@ -10,3 +10,7 @@
+ .text
+
+ testop _test_and_set_bit, orreq, streq
++
++#if __LINUX_ARM_ARCH__ >= 6
++sync_testop _sync_test_and_set_bit, orreq, streq
++#endif
+diff --git a/arch/arm/mach-ep93xx/timer-ep93xx.c b/arch/arm/mach-ep93xx/timer-ep93xx.c
+index dd4b164d18317..a9efa7bc2fa12 100644
+--- a/arch/arm/mach-ep93xx/timer-ep93xx.c
++++ b/arch/arm/mach-ep93xx/timer-ep93xx.c
+@@ -9,6 +9,7 @@
+ #include <linux/io.h>
+ #include <asm/mach/time.h>
+ #include "soc.h"
++#include "platform.h"
+
+ /*************************************************************************
+ * Timer handling for EP93xx
+@@ -60,7 +61,7 @@ static u64 notrace ep93xx_read_sched_clock(void)
+ return ret;
+ }
+
+-u64 ep93xx_clocksource_read(struct clocksource *c)
++static u64 ep93xx_clocksource_read(struct clocksource *c)
+ {
+ u64 ret;
+
+diff --git a/arch/arm/mach-omap1/board-ams-delta.c b/arch/arm/mach-omap1/board-ams-delta.c
+index 9108c871d129a..ac47ab9fe0964 100644
+--- a/arch/arm/mach-omap1/board-ams-delta.c
++++ b/arch/arm/mach-omap1/board-ams-delta.c
+@@ -11,7 +11,6 @@
+ #include <linux/gpio/driver.h>
+ #include <linux/gpio/machine.h>
+ #include <linux/gpio/consumer.h>
+-#include <linux/gpio.h>
+ #include <linux/kernel.h>
+ #include <linux/init.h>
+ #include <linux/input.h>
+diff --git a/arch/arm/mach-omap1/board-nokia770.c b/arch/arm/mach-omap1/board-nokia770.c
+index a501a473ffd68..5ea27ca26abf2 100644
+--- a/arch/arm/mach-omap1/board-nokia770.c
++++ b/arch/arm/mach-omap1/board-nokia770.c
+@@ -6,17 +6,18 @@
+ */
+ #include <linux/clkdev.h>
+ #include <linux/irq.h>
+-#include <linux/gpio.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/gpio/machine.h>
++#include <linux/gpio/property.h>
+ #include <linux/kernel.h>
+ #include <linux/init.h>
+ #include <linux/mutex.h>
+ #include <linux/platform_device.h>
++#include <linux/property.h>
+ #include <linux/input.h>
+ #include <linux/omapfb.h>
+
+ #include <linux/spi/spi.h>
+-#include <linux/spi/ads7846.h>
+ #include <linux/workqueue.h>
+ #include <linux/delay.h>
+
+@@ -35,6 +36,25 @@
+ #include "clock.h"
+ #include "mmc.h"
+
++static const struct software_node nokia770_mpuio_gpiochip_node = {
++ .name = "mpuio",
++};
++
++static const struct software_node nokia770_gpiochip1_node = {
++ .name = "gpio-0-15",
++};
++
++static const struct software_node nokia770_gpiochip2_node = {
++ .name = "gpio-16-31",
++};
++
++static const struct software_node *nokia770_gpiochip_nodes[] = {
++ &nokia770_mpuio_gpiochip_node,
++ &nokia770_gpiochip1_node,
++ &nokia770_gpiochip2_node,
++ NULL
++};
++
+ #define ADS7846_PENDOWN_GPIO 15
+
+ static const unsigned int nokia770_keymap[] = {
+@@ -85,40 +105,47 @@ static struct platform_device *nokia770_devices[] __initdata = {
+ &nokia770_kp_device,
+ };
+
+-static void mipid_shutdown(struct mipid_platform_data *pdata)
+-{
+- if (pdata->nreset_gpio != -1) {
+- printk(KERN_INFO "shutdown LCD\n");
+- gpio_set_value(pdata->nreset_gpio, 0);
+- msleep(120);
+- }
+-}
+-
+-static struct mipid_platform_data nokia770_mipid_platform_data = {
+- .shutdown = mipid_shutdown,
+-};
++static struct mipid_platform_data nokia770_mipid_platform_data = { };
+
+ static const struct omap_lcd_config nokia770_lcd_config __initconst = {
+ .ctrl_name = "hwa742",
+ };
+
++static const struct property_entry nokia770_mipid_props[] = {
++ PROPERTY_ENTRY_GPIO("reset-gpios", &nokia770_gpiochip1_node,
++ 13, GPIO_ACTIVE_LOW),
++ { }
++};
++
++static const struct software_node nokia770_mipid_swnode = {
++ .name = "lcd_mipid",
++ .properties = nokia770_mipid_props,
++};
++
+ static void __init mipid_dev_init(void)
+ {
+- nokia770_mipid_platform_data.nreset_gpio = 13;
+ nokia770_mipid_platform_data.data_lines = 16;
+
+ omapfb_set_lcd_config(&nokia770_lcd_config);
+ }
+
+-static struct ads7846_platform_data nokia770_ads7846_platform_data __initdata = {
+- .x_max = 0x0fff,
+- .y_max = 0x0fff,
+- .x_plate_ohms = 180,
+- .pressure_max = 255,
+- .debounce_max = 10,
+- .debounce_tol = 3,
+- .debounce_rep = 1,
+- .gpio_pendown = ADS7846_PENDOWN_GPIO,
++static const struct property_entry nokia770_ads7846_props[] = {
++ PROPERTY_ENTRY_STRING("compatible", "ti,ads7846"),
++ PROPERTY_ENTRY_U32("touchscreen-size-x", 4096),
++ PROPERTY_ENTRY_U32("touchscreen-size-y", 4096),
++ PROPERTY_ENTRY_U32("touchscreen-max-pressure", 256),
++ PROPERTY_ENTRY_U32("touchscreen-average-samples", 10),
++ PROPERTY_ENTRY_U16("ti,x-plate-ohms", 180),
++ PROPERTY_ENTRY_U16("ti,debounce-tol", 3),
++ PROPERTY_ENTRY_U16("ti,debounce-rep", 1),
++ PROPERTY_ENTRY_GPIO("pendown-gpios", &nokia770_gpiochip1_node,
++ ADS7846_PENDOWN_GPIO, GPIO_ACTIVE_LOW),
++ { }
++};
++
++static const struct software_node nokia770_ads7846_swnode = {
++ .name = "ads7846",
++ .properties = nokia770_ads7846_props,
+ };
+
+ static struct spi_board_info nokia770_spi_board_info[] __initdata = {
+@@ -128,13 +155,14 @@ static struct spi_board_info nokia770_spi_board_info[] __initdata = {
+ .chip_select = 3,
+ .max_speed_hz = 12000000,
+ .platform_data = &nokia770_mipid_platform_data,
++ .swnode = &nokia770_mipid_swnode,
+ },
+ [1] = {
+ .modalias = "ads7846",
+ .bus_num = 2,
+ .chip_select = 0,
+ .max_speed_hz = 2500000,
+- .platform_data = &nokia770_ads7846_platform_data,
++ .swnode = &nokia770_ads7846_swnode,
+ },
+ };
+
+@@ -156,27 +184,23 @@ static struct omap_usb_config nokia770_usb_config __initdata = {
+
+ #if IS_ENABLED(CONFIG_MMC_OMAP)
+
+-#define NOKIA770_GPIO_MMC_POWER 41
+-#define NOKIA770_GPIO_MMC_SWITCH 23
+-
+-static int nokia770_mmc_set_power(struct device *dev, int slot, int power_on,
+- int vdd)
+-{
+- gpio_set_value(NOKIA770_GPIO_MMC_POWER, power_on);
+- return 0;
+-}
+-
+-static int nokia770_mmc_get_cover_state(struct device *dev, int slot)
+-{
+- return gpio_get_value(NOKIA770_GPIO_MMC_SWITCH);
+-}
++static struct gpiod_lookup_table nokia770_mmc_gpio_table = {
++ .dev_id = "mmci-omap.1",
++ .table = {
++ /* Slot index 0, VSD power, GPIO 41 */
++ GPIO_LOOKUP_IDX("gpio-32-47", 9,
++ "vsd", 0, GPIO_ACTIVE_HIGH),
++ /* Slot index 0, switch, GPIO 23 */
++ GPIO_LOOKUP_IDX("gpio-16-31", 7,
++ "cover", 0, GPIO_ACTIVE_HIGH),
++ { }
++ },
++};
+
+ static struct omap_mmc_platform_data nokia770_mmc2_data = {
+ .nr_slots = 1,
+ .max_freq = 12000000,
+ .slots[0] = {
+- .set_power = nokia770_mmc_set_power,
+- .get_cover_state = nokia770_mmc_get_cover_state,
+ .ocr_mask = MMC_VDD_32_33|MMC_VDD_33_34,
+ .name = "mmcblk",
+ },
+@@ -186,20 +210,7 @@ static struct omap_mmc_platform_data *nokia770_mmc_data[OMAP16XX_NR_MMC];
+
+ static void __init nokia770_mmc_init(void)
+ {
+- int ret;
+-
+- ret = gpio_request(NOKIA770_GPIO_MMC_POWER, "MMC power");
+- if (ret < 0)
+- return;
+- gpio_direction_output(NOKIA770_GPIO_MMC_POWER, 0);
+-
+- ret = gpio_request(NOKIA770_GPIO_MMC_SWITCH, "MMC cover");
+- if (ret < 0) {
+- gpio_free(NOKIA770_GPIO_MMC_POWER);
+- return;
+- }
+- gpio_direction_input(NOKIA770_GPIO_MMC_SWITCH);
+-
++ gpiod_add_lookup_table(&nokia770_mmc_gpio_table);
+ /* Only the second MMC controller is used */
+ nokia770_mmc_data[1] = &nokia770_mmc2_data;
+ omap1_init_mmc(nokia770_mmc_data, OMAP16XX_NR_MMC);
+@@ -212,14 +223,16 @@ static inline void nokia770_mmc_init(void)
+ #endif
+
+ #if IS_ENABLED(CONFIG_I2C_CBUS_GPIO)
+-static struct gpiod_lookup_table nokia770_cbus_gpio_table = {
+- .dev_id = "i2c-cbus-gpio.2",
+- .table = {
+- GPIO_LOOKUP_IDX("mpuio", 9, NULL, 0, 0), /* clk */
+- GPIO_LOOKUP_IDX("mpuio", 10, NULL, 1, 0), /* dat */
+- GPIO_LOOKUP_IDX("mpuio", 11, NULL, 2, 0), /* sel */
+- { },
+- },
++
++static const struct software_node_ref_args nokia770_cbus_gpio_refs[] = {
++ SOFTWARE_NODE_REFERENCE(&nokia770_mpuio_gpiochip_node, 9, 0),
++ SOFTWARE_NODE_REFERENCE(&nokia770_mpuio_gpiochip_node, 10, 0),
++ SOFTWARE_NODE_REFERENCE(&nokia770_mpuio_gpiochip_node, 11, 0),
++};
++
++static const struct property_entry nokia770_cbus_props[] = {
++ PROPERTY_ENTRY_REF_ARRAY("gpios", nokia770_cbus_gpio_refs),
++ { }
+ };
+
+ static struct platform_device nokia770_cbus_device = {
+@@ -238,22 +251,29 @@ static struct i2c_board_info nokia770_i2c_board_info_2[] __initdata = {
+
+ static void __init nokia770_cbus_init(void)
+ {
+- const int retu_irq_gpio = 62;
+- const int tahvo_irq_gpio = 40;
+-
+- if (gpio_request_one(retu_irq_gpio, GPIOF_IN, "Retu IRQ"))
+- return;
+- if (gpio_request_one(tahvo_irq_gpio, GPIOF_IN, "Tahvo IRQ")) {
+- gpio_free(retu_irq_gpio);
+- return;
++ struct gpio_desc *d;
++ int irq;
++
++ d = gpiod_get(NULL, "retu_irq", GPIOD_IN);
++ if (IS_ERR(d)) {
++ pr_err("Unable to get CBUS Retu IRQ GPIO descriptor\n");
++ } else {
++ irq = gpiod_to_irq(d);
++ irq_set_irq_type(irq, IRQ_TYPE_EDGE_RISING);
++ nokia770_i2c_board_info_2[0].irq = irq;
++ }
++ d = gpiod_get(NULL, "tahvo_irq", GPIOD_IN);
++ if (IS_ERR(d)) {
++ pr_err("Unable to get CBUS Tahvo IRQ GPIO descriptor\n");
++ } else {
++ irq = gpiod_to_irq(d);
++ irq_set_irq_type(irq, IRQ_TYPE_EDGE_RISING);
++ nokia770_i2c_board_info_2[1].irq = irq;
+ }
+- irq_set_irq_type(gpio_to_irq(retu_irq_gpio), IRQ_TYPE_EDGE_RISING);
+- irq_set_irq_type(gpio_to_irq(tahvo_irq_gpio), IRQ_TYPE_EDGE_RISING);
+- nokia770_i2c_board_info_2[0].irq = gpio_to_irq(retu_irq_gpio);
+- nokia770_i2c_board_info_2[1].irq = gpio_to_irq(tahvo_irq_gpio);
+ i2c_register_board_info(2, nokia770_i2c_board_info_2,
+ ARRAY_SIZE(nokia770_i2c_board_info_2));
+- gpiod_add_lookup_table(&nokia770_cbus_gpio_table);
++ device_create_managed_software_node(&nokia770_cbus_device.dev,
++ nokia770_cbus_props, NULL);
+ platform_device_register(&nokia770_cbus_device);
+ }
+ #else /* CONFIG_I2C_CBUS_GPIO */
+@@ -262,8 +282,33 @@ static void __init nokia770_cbus_init(void)
+ }
+ #endif /* CONFIG_I2C_CBUS_GPIO */
+
++static struct gpiod_lookup_table nokia770_irq_gpio_table = {
++ .dev_id = NULL,
++ .table = {
++ /* GPIO used by SPI device 1 */
++ GPIO_LOOKUP("gpio-0-15", 15, "ads7846_irq",
++ GPIO_ACTIVE_HIGH),
++ /* GPIO used for retu IRQ */
++ GPIO_LOOKUP("gpio-48-63", 15, "retu_irq",
++ GPIO_ACTIVE_HIGH),
++ /* GPIO used for tahvo IRQ */
++ GPIO_LOOKUP("gpio-32-47", 8, "tahvo_irq",
++ GPIO_ACTIVE_HIGH),
++ /* GPIOs used by serial wakeup IRQs */
++ GPIO_LOOKUP_IDX("gpio-32-47", 5, "wakeup", 0,
++ GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP_IDX("gpio-16-31", 2, "wakeup", 1,
++ GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP_IDX("gpio-48-63", 1, "wakeup", 2,
++ GPIO_ACTIVE_HIGH),
++ { }
++ },
++};
++
+ static void __init omap_nokia770_init(void)
+ {
++ struct gpio_desc *d;
++
+ /* On Nokia 770, the SleepX signal is masked with an
+ * MPUIO line by default. It has to be unmasked for it
+ * to become functional */
+@@ -273,8 +318,16 @@ static void __init omap_nokia770_init(void)
+ /* Unmask SleepX signal */
+ omap_writew((omap_readw(0xfffb5004) & ~2), 0xfffb5004);
+
++ software_node_register_node_group(nokia770_gpiochip_nodes);
+ platform_add_devices(nokia770_devices, ARRAY_SIZE(nokia770_devices));
+- nokia770_spi_board_info[1].irq = gpio_to_irq(15);
++
++ gpiod_add_lookup_table(&nokia770_irq_gpio_table);
++ d = gpiod_get(NULL, "ads7846_irq", GPIOD_IN);
++ if (IS_ERR(d))
++ pr_err("Unable to get ADS7846 IRQ GPIO descriptor\n");
++ else
++ nokia770_spi_board_info[1].irq = gpiod_to_irq(d);
++
+ spi_register_board_info(nokia770_spi_board_info,
+ ARRAY_SIZE(nokia770_spi_board_info));
+ omap_serial_init();
+diff --git a/arch/arm/mach-omap1/board-osk.c b/arch/arm/mach-omap1/board-osk.c
+index df758c1f92373..463687b9ca52a 100644
+--- a/arch/arm/mach-omap1/board-osk.c
++++ b/arch/arm/mach-omap1/board-osk.c
+@@ -25,7 +25,8 @@
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+-#include <linux/gpio.h>
++#include <linux/gpio/consumer.h>
++#include <linux/gpio/driver.h>
+ #include <linux/gpio/machine.h>
+ #include <linux/kernel.h>
+ #include <linux/init.h>
+@@ -64,13 +65,12 @@
+ /* TPS65010 has four GPIOs. nPG and LED2 can be treated like GPIOs with
+ * alternate pin configurations for hardware-controlled blinking.
+ */
+-#define OSK_TPS_GPIO_BASE (OMAP_MAX_GPIO_LINES + 16 /* MPUIO */)
+-# define OSK_TPS_GPIO_USB_PWR_EN (OSK_TPS_GPIO_BASE + 0)
+-# define OSK_TPS_GPIO_LED_D3 (OSK_TPS_GPIO_BASE + 1)
+-# define OSK_TPS_GPIO_LAN_RESET (OSK_TPS_GPIO_BASE + 2)
+-# define OSK_TPS_GPIO_DSP_PWR_EN (OSK_TPS_GPIO_BASE + 3)
+-# define OSK_TPS_GPIO_LED_D9 (OSK_TPS_GPIO_BASE + 4)
+-# define OSK_TPS_GPIO_LED_D2 (OSK_TPS_GPIO_BASE + 5)
++#define OSK_TPS_GPIO_USB_PWR_EN 0
++#define OSK_TPS_GPIO_LED_D3 1
++#define OSK_TPS_GPIO_LAN_RESET 2
++#define OSK_TPS_GPIO_DSP_PWR_EN 3
++#define OSK_TPS_GPIO_LED_D9 4
++#define OSK_TPS_GPIO_LED_D2 5
+
+ static struct mtd_partition osk_partitions[] = {
+ /* bootloader (U-Boot, etc) in first sector */
+@@ -174,11 +174,20 @@ static const struct gpio_led tps_leds[] = {
+ /* NOTE: D9 and D2 have hardware blink support.
+ * Also, D9 requires non-battery power.
+ */
+- { .gpio = OSK_TPS_GPIO_LED_D9, .name = "d9",
+- .default_trigger = "disk-activity", },
+- { .gpio = OSK_TPS_GPIO_LED_D2, .name = "d2", },
+- { .gpio = OSK_TPS_GPIO_LED_D3, .name = "d3", .active_low = 1,
+- .default_trigger = "heartbeat", },
++ { .name = "d9", .default_trigger = "disk-activity", },
++ { .name = "d2", },
++ { .name = "d3", .default_trigger = "heartbeat", },
++};
++
++static struct gpiod_lookup_table tps_leds_gpio_table = {
++ .dev_id = "leds-gpio",
++ .table = {
++ /* Use local offsets on TPS65010 */
++ GPIO_LOOKUP_IDX("tps65010", OSK_TPS_GPIO_LED_D9, NULL, 0, GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP_IDX("tps65010", OSK_TPS_GPIO_LED_D2, NULL, 1, GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP_IDX("tps65010", OSK_TPS_GPIO_LED_D3, NULL, 2, GPIO_ACTIVE_LOW),
++ { }
++ },
+ };
+
+ static struct gpio_led_platform_data tps_leds_data = {
+@@ -192,29 +201,34 @@ static struct platform_device osk5912_tps_leds = {
+ .dev.platform_data = &tps_leds_data,
+ };
+
+-static int osk_tps_setup(struct i2c_client *client, void *context)
++/* The board just hold these GPIOs hogged from setup to teardown */
++static struct gpio_desc *eth_reset;
++static struct gpio_desc *vdd_dsp;
++
++static int osk_tps_setup(struct i2c_client *client, struct gpio_chip *gc)
+ {
++ struct gpio_desc *d;
+ if (!IS_BUILTIN(CONFIG_TPS65010))
+ return -ENOSYS;
+
+ /* Set GPIO 1 HIGH to disable VBUS power supply;
+ * OHCI driver powers it up/down as needed.
+ */
+- gpio_request(OSK_TPS_GPIO_USB_PWR_EN, "n_vbus_en");
+- gpio_direction_output(OSK_TPS_GPIO_USB_PWR_EN, 1);
++ d = gpiochip_request_own_desc(gc, OSK_TPS_GPIO_USB_PWR_EN, "n_vbus_en",
++ GPIO_ACTIVE_HIGH, GPIOD_OUT_HIGH);
+ /* Free the GPIO again as the driver will request it */
+- gpio_free(OSK_TPS_GPIO_USB_PWR_EN);
++ gpiochip_free_own_desc(d);
+
+ /* Set GPIO 2 high so LED D3 is off by default */
+ tps65010_set_gpio_out_value(GPIO2, HIGH);
+
+ /* Set GPIO 3 low to take ethernet out of reset */
+- gpio_request(OSK_TPS_GPIO_LAN_RESET, "smc_reset");
+- gpio_direction_output(OSK_TPS_GPIO_LAN_RESET, 0);
++ eth_reset = gpiochip_request_own_desc(gc, OSK_TPS_GPIO_LAN_RESET, "smc_reset",
++ GPIO_ACTIVE_HIGH, GPIOD_OUT_LOW);
+
+ /* GPIO4 is VDD_DSP */
+- gpio_request(OSK_TPS_GPIO_DSP_PWR_EN, "dsp_power");
+- gpio_direction_output(OSK_TPS_GPIO_DSP_PWR_EN, 1);
++ vdd_dsp = gpiochip_request_own_desc(gc, OSK_TPS_GPIO_DSP_PWR_EN, "dsp_power",
++ GPIO_ACTIVE_HIGH, GPIOD_OUT_HIGH);
+ /* REVISIT if DSP support isn't configured, power it off ... */
+
+ /* Let LED1 (D9) blink; leds-gpio may override it */
+@@ -232,15 +246,22 @@ static int osk_tps_setup(struct i2c_client *client, void *context)
+
+ /* register these three LEDs */
+ osk5912_tps_leds.dev.parent = &client->dev;
++ gpiod_add_lookup_table(&tps_leds_gpio_table);
+ platform_device_register(&osk5912_tps_leds);
+
+ return 0;
+ }
+
++static void osk_tps_teardown(struct i2c_client *client, struct gpio_chip *gc)
++{
++ gpiochip_free_own_desc(eth_reset);
++ gpiochip_free_own_desc(vdd_dsp);
++}
++
+ static struct tps65010_board tps_board = {
+- .base = OSK_TPS_GPIO_BASE,
+ .outmask = 0x0f,
+ .setup = osk_tps_setup,
++ .teardown = osk_tps_teardown,
+ };
+
+ static struct i2c_board_info __initdata osk_i2c_board_info[] = {
+@@ -263,11 +284,6 @@ static void __init osk_init_smc91x(void)
+ {
+ u32 l;
+
+- if ((gpio_request(0, "smc_irq")) < 0) {
+- printk("Error requesting gpio 0 for smc91x irq\n");
+- return;
+- }
+-
+ /* Check EMIFS wait states to fix errors with SMC_GET_PKT_HDR */
+ l = omap_readl(EMIFS_CCS(1));
+ l |= 0x3;
+@@ -279,10 +295,6 @@ static void __init osk_init_cf(int seg)
+ struct resource *res = &osk5912_cf_resources[1];
+
+ omap_cfg_reg(M7_1610_GPIO62);
+- if ((gpio_request(62, "cf_irq")) < 0) {
+- printk("Error requesting gpio 62 for CF irq\n");
+- return;
+- }
+
+ switch (seg) {
+ /* NOTE: CS0 could be configured too ... */
+@@ -308,18 +320,17 @@ static void __init osk_init_cf(int seg)
+ seg, omap_readl(EMIFS_CCS(seg)), omap_readl(EMIFS_ACS(seg)));
+ omap_writel(0x0004a1b3, EMIFS_CCS(seg)); /* synch mode 4 etc */
+ omap_writel(0x00000000, EMIFS_ACS(seg)); /* OE hold/setup */
+-
+- /* the CF I/O IRQ is really active-low */
+- irq_set_irq_type(gpio_to_irq(62), IRQ_TYPE_EDGE_FALLING);
+ }
+
+ static struct gpiod_lookup_table osk_usb_gpio_table = {
+ .dev_id = "ohci",
+ .table = {
+ /* Power GPIO on the I2C-attached TPS65010 */
+- GPIO_LOOKUP("tps65010", 0, "power", GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP("tps65010", OSK_TPS_GPIO_USB_PWR_EN, "power",
++ GPIO_ACTIVE_HIGH),
+ GPIO_LOOKUP(OMAP_GPIO_LABEL, 9, "overcurrent",
+ GPIO_ACTIVE_HIGH),
++ { }
+ },
+ };
+
+@@ -341,8 +352,32 @@ static struct omap_usb_config osk_usb_config __initdata = {
+
+ #define EMIFS_CS3_VAL (0x88013141)
+
++static struct gpiod_lookup_table osk_irq_gpio_table = {
++ .dev_id = NULL,
++ .table = {
++ /* GPIO used for SMC91x IRQ */
++ GPIO_LOOKUP(OMAP_GPIO_LABEL, 0, "smc_irq",
++ GPIO_ACTIVE_HIGH),
++ /* GPIO used for CF IRQ */
++ GPIO_LOOKUP("gpio-48-63", 14, "cf_irq",
++ GPIO_ACTIVE_HIGH),
++ /* GPIO used by the TPS65010 chip */
++ GPIO_LOOKUP("mpuio", 1, "tps65010",
++ GPIO_ACTIVE_HIGH),
++ /* GPIOs used for serial wakeup IRQs */
++ GPIO_LOOKUP_IDX("gpio-32-47", 5, "wakeup", 0,
++ GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP_IDX("gpio-16-31", 2, "wakeup", 1,
++ GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP_IDX("gpio-48-63", 1, "wakeup", 2,
++ GPIO_ACTIVE_HIGH),
++ { }
++ },
++};
++
+ static void __init osk_init(void)
+ {
++ struct gpio_desc *d;
+ u32 l;
+
+ osk_init_smc91x();
+@@ -359,10 +394,31 @@ static void __init osk_init(void)
+
+ osk_flash_resource.end = osk_flash_resource.start = omap_cs3_phys();
+ osk_flash_resource.end += SZ_32M - 1;
+- osk5912_smc91x_resources[1].start = gpio_to_irq(0);
+- osk5912_smc91x_resources[1].end = gpio_to_irq(0);
+- osk5912_cf_resources[0].start = gpio_to_irq(62);
+- osk5912_cf_resources[0].end = gpio_to_irq(62);
++
++ /*
++ * Add the GPIOs to be used as IRQs and immediately look them up
++ * to be passed as an IRQ resource. This is ugly but should work
++ * until the day we convert to device tree.
++ */
++ gpiod_add_lookup_table(&osk_irq_gpio_table);
++
++ d = gpiod_get(NULL, "smc_irq", GPIOD_IN);
++ if (IS_ERR(d)) {
++ pr_err("Unable to get SMC IRQ GPIO descriptor\n");
++ } else {
++ irq_set_irq_type(gpiod_to_irq(d), IRQ_TYPE_EDGE_RISING);
++ osk5912_smc91x_resources[1] = DEFINE_RES_IRQ(gpiod_to_irq(d));
++ }
++
++ d = gpiod_get(NULL, "cf_irq", GPIOD_IN);
++ if (IS_ERR(d)) {
++ pr_err("Unable to get CF IRQ GPIO descriptor\n");
++ } else {
++ /* the CF I/O IRQ is really active-low */
++ irq_set_irq_type(gpiod_to_irq(d), IRQ_TYPE_EDGE_FALLING);
++ osk5912_cf_resources[0] = DEFINE_RES_IRQ(gpiod_to_irq(d));
++ }
++
+ platform_add_devices(osk5912_devices, ARRAY_SIZE(osk5912_devices));
+
+ l = omap_readl(USB_TRANSCEIVER_CTRL);
+@@ -372,13 +428,15 @@ static void __init osk_init(void)
+ gpiod_add_lookup_table(&osk_usb_gpio_table);
+ omap1_usb_init(&osk_usb_config);
+
++ omap_serial_init();
++
+ /* irq for tps65010 chip */
+ /* bootloader effectively does: omap_cfg_reg(U19_1610_MPUIO1); */
+- if (gpio_request(OMAP_MPUIO(1), "tps65010") == 0)
+- gpio_direction_input(OMAP_MPUIO(1));
+-
+- omap_serial_init();
+- osk_i2c_board_info[0].irq = gpio_to_irq(OMAP_MPUIO(1));
++ d = gpiod_get(NULL, "tps65010", GPIOD_IN);
++ if (IS_ERR(d))
++ pr_err("Unable to get TPS65010 IRQ GPIO descriptor\n");
++ else
++ osk_i2c_board_info[0].irq = gpiod_to_irq(d);
+ omap_register_i2c_bus(1, 400, osk_i2c_board_info,
+ ARRAY_SIZE(osk_i2c_board_info));
+ }
+diff --git a/arch/arm/mach-omap1/board-palmte.c b/arch/arm/mach-omap1/board-palmte.c
+index f79c497f04d57..49b7757cb2fd3 100644
+--- a/arch/arm/mach-omap1/board-palmte.c
++++ b/arch/arm/mach-omap1/board-palmte.c
+@@ -13,7 +13,8 @@
+ *
+ * Copyright (c) 2006 Andrzej Zaborowski <balrog@zabor.org>
+ */
+-#include <linux/gpio.h>
++#include <linux/gpio/machine.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/kernel.h>
+ #include <linux/init.h>
+ #include <linux/input.h>
+@@ -187,23 +188,6 @@ static struct spi_board_info palmte_spi_info[] __initdata = {
+ },
+ };
+
+-static void __init palmte_misc_gpio_setup(void)
+-{
+- /* Set TSC2102 PINTDAV pin as input (used by TSC2102 driver) */
+- if (gpio_request(PALMTE_PINTDAV_GPIO, "TSC2102 PINTDAV") < 0) {
+- printk(KERN_ERR "Could not reserve PINTDAV GPIO!\n");
+- return;
+- }
+- gpio_direction_input(PALMTE_PINTDAV_GPIO);
+-
+- /* Set USB-or-DC-IN pin as input (unused) */
+- if (gpio_request(PALMTE_USB_OR_DC_GPIO, "USB/DC-IN") < 0) {
+- printk(KERN_ERR "Could not reserve cable signal GPIO!\n");
+- return;
+- }
+- gpio_direction_input(PALMTE_USB_OR_DC_GPIO);
+-}
+-
+ #if IS_ENABLED(CONFIG_MMC_OMAP)
+
+ static struct omap_mmc_platform_data _palmte_mmc_config = {
+@@ -231,8 +215,23 @@ static void palmte_mmc_init(void)
+
+ #endif /* CONFIG_MMC_OMAP */
+
++static struct gpiod_lookup_table palmte_irq_gpio_table = {
++ .dev_id = NULL,
++ .table = {
++ /* GPIO used for TSC2102 PINTDAV IRQ */
++ GPIO_LOOKUP("gpio-0-15", PALMTE_PINTDAV_GPIO, "tsc2102_irq",
++ GPIO_ACTIVE_HIGH),
++ /* GPIO used for USB or DC input detection */
++ GPIO_LOOKUP("gpio-0-15", PALMTE_USB_OR_DC_GPIO, "usb_dc_irq",
++ GPIO_ACTIVE_HIGH),
++ { }
++ },
++};
++
+ static void __init omap_palmte_init(void)
+ {
++ struct gpio_desc *d;
++
+ /* mux pins for uarts */
+ omap_cfg_reg(UART1_TX);
+ omap_cfg_reg(UART1_RTS);
+@@ -243,9 +242,21 @@ static void __init omap_palmte_init(void)
+
+ platform_add_devices(palmte_devices, ARRAY_SIZE(palmte_devices));
+
+- palmte_spi_info[0].irq = gpio_to_irq(PALMTE_PINTDAV_GPIO);
++ gpiod_add_lookup_table(&palmte_irq_gpio_table);
++ d = gpiod_get(NULL, "tsc2102_irq", GPIOD_IN);
++ if (IS_ERR(d))
++ pr_err("Unable to get TSC2102 IRQ GPIO descriptor\n");
++ else
++ palmte_spi_info[0].irq = gpiod_to_irq(d);
+ spi_register_board_info(palmte_spi_info, ARRAY_SIZE(palmte_spi_info));
+- palmte_misc_gpio_setup();
++
++ /* We are getting this just to set it up as input */
++ d = gpiod_get(NULL, "usb_dc_irq", GPIOD_IN);
++ if (IS_ERR(d))
++ pr_err("Unable to get USB/DC IRQ GPIO descriptor\n");
++ else
++ gpiod_put(d);
++
+ omap_serial_init();
+ omap1_usb_init(&palmte_usb_config);
+ omap_register_i2c_bus(1, 100, NULL, 0);
+diff --git a/arch/arm/mach-omap1/board-sx1-mmc.c b/arch/arm/mach-omap1/board-sx1-mmc.c
+index f1c160924dfe4..f183a8448a7b0 100644
+--- a/arch/arm/mach-omap1/board-sx1-mmc.c
++++ b/arch/arm/mach-omap1/board-sx1-mmc.c
+@@ -9,7 +9,6 @@
+ * Copyright (C) 2007 Instituto Nokia de Tecnologia - INdT
+ */
+
+-#include <linux/gpio.h>
+ #include <linux/platform_device.h>
+
+ #include "hardware.h"
+diff --git a/arch/arm/mach-omap1/board-sx1.c b/arch/arm/mach-omap1/board-sx1.c
+index 0c0cdd5e77c79..a13c630be7b7f 100644
+--- a/arch/arm/mach-omap1/board-sx1.c
++++ b/arch/arm/mach-omap1/board-sx1.c
+@@ -11,7 +11,8 @@
+ * Maintainters : Vladimir Ananiev (aka Vovan888), Sergge
+ * oslik.ru
+ */
+-#include <linux/gpio.h>
++#include <linux/gpio/machine.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/kernel.h>
+ #include <linux/init.h>
+ #include <linux/input.h>
+@@ -304,8 +305,23 @@ static struct platform_device *sx1_devices[] __initdata = {
+
+ /*-----------------------------------------*/
+
++static struct gpiod_lookup_table sx1_gpio_table = {
++ .dev_id = NULL,
++ .table = {
++ GPIO_LOOKUP("gpio-0-15", 1, "irda_off",
++ GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP("gpio-0-15", 11, "switch",
++ GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP("gpio-0-15", 15, "usb_on",
++ GPIO_ACTIVE_HIGH),
++ { }
++ },
++};
++
+ static void __init omap_sx1_init(void)
+ {
++ struct gpio_desc *d;
++
+ /* mux pins for uarts */
+ omap_cfg_reg(UART1_TX);
+ omap_cfg_reg(UART1_RTS);
+@@ -320,15 +336,25 @@ static void __init omap_sx1_init(void)
+ omap_register_i2c_bus(1, 100, NULL, 0);
+ omap1_usb_init(&sx1_usb_config);
+ sx1_mmc_init();
++ gpiod_add_lookup_table(&sx1_gpio_table);
+
+ /* turn on USB power */
+ /* sx1_setusbpower(1); can't do it here because i2c is not ready */
+- gpio_request(1, "A_IRDA_OFF");
+- gpio_request(11, "A_SWITCH");
+- gpio_request(15, "A_USB_ON");
+- gpio_direction_output(1, 1); /*A_IRDA_OFF = 1 */
+- gpio_direction_output(11, 0); /*A_SWITCH = 0 */
+- gpio_direction_output(15, 0); /*A_USB_ON = 0 */
++ d = gpiod_get(NULL, "irda_off", GPIOD_OUT_HIGH);
++ if (IS_ERR(d))
++ pr_err("Unable to get IRDA OFF GPIO descriptor\n");
++ else
++ gpiod_put(d);
++ d = gpiod_get(NULL, "switch", GPIOD_OUT_LOW);
++ if (IS_ERR(d))
++ pr_err("Unable to get SWITCH GPIO descriptor\n");
++ else
++ gpiod_put(d);
++ d = gpiod_get(NULL, "usb_on", GPIOD_OUT_LOW);
++ if (IS_ERR(d))
++ pr_err("Unable to get USB ON GPIO descriptor\n");
++ else
++ gpiod_put(d);
+
+ omapfb_set_lcd_config(&sx1_lcd_config);
+ }
+diff --git a/arch/arm/mach-omap1/devices.c b/arch/arm/mach-omap1/devices.c
+index 5304699c7a97e..8b2c5f911e973 100644
+--- a/arch/arm/mach-omap1/devices.c
++++ b/arch/arm/mach-omap1/devices.c
+@@ -6,7 +6,6 @@
+ */
+
+ #include <linux/dma-mapping.h>
+-#include <linux/gpio.h>
+ #include <linux/module.h>
+ #include <linux/kernel.h>
+ #include <linux/init.h>
+diff --git a/arch/arm/mach-omap1/gpio15xx.c b/arch/arm/mach-omap1/gpio15xx.c
+index 61fa26efd8653..6724af4925f24 100644
+--- a/arch/arm/mach-omap1/gpio15xx.c
++++ b/arch/arm/mach-omap1/gpio15xx.c
+@@ -8,7 +8,6 @@
+ * Charulatha V <charu@ti.com>
+ */
+
+-#include <linux/gpio.h>
+ #include <linux/platform_data/gpio-omap.h>
+ #include <linux/soc/ti/omap1-soc.h>
+ #include <asm/irq.h>
+diff --git a/arch/arm/mach-omap1/gpio16xx.c b/arch/arm/mach-omap1/gpio16xx.c
+index cf052714b3f8a..55acec22fef4e 100644
+--- a/arch/arm/mach-omap1/gpio16xx.c
++++ b/arch/arm/mach-omap1/gpio16xx.c
+@@ -8,7 +8,6 @@
+ * Charulatha V <charu@ti.com>
+ */
+
+-#include <linux/gpio.h>
+ #include <linux/platform_data/gpio-omap.h>
+ #include <linux/soc/ti/omap1-io.h>
+
+diff --git a/arch/arm/mach-omap1/irq.c b/arch/arm/mach-omap1/irq.c
+index bfc7ab010ae28..af06a8753fdc3 100644
+--- a/arch/arm/mach-omap1/irq.c
++++ b/arch/arm/mach-omap1/irq.c
+@@ -35,7 +35,6 @@
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+-#include <linux/gpio.h>
+ #include <linux/init.h>
+ #include <linux/module.h>
+ #include <linux/sched.h>
+diff --git a/arch/arm/mach-omap1/serial.c b/arch/arm/mach-omap1/serial.c
+index c7f5906457748..3adceb97138fb 100644
+--- a/arch/arm/mach-omap1/serial.c
++++ b/arch/arm/mach-omap1/serial.c
+@@ -4,7 +4,8 @@
+ *
+ * OMAP1 serial support.
+ */
+-#include <linux/gpio.h>
++#include <linux/gpio/machine.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/module.h>
+ #include <linux/kernel.h>
+ #include <linux/init.h>
+@@ -196,39 +197,38 @@ void omap_serial_wake_trigger(int enable)
+ }
+ }
+
+-static void __init omap_serial_set_port_wakeup(int gpio_nr)
++static void __init omap_serial_set_port_wakeup(int idx)
+ {
++ struct gpio_desc *d;
+ int ret;
+
+- ret = gpio_request(gpio_nr, "UART wake");
+- if (ret < 0) {
+- printk(KERN_ERR "Could not request UART wake GPIO: %i\n",
+- gpio_nr);
++ d = gpiod_get_index(NULL, "wakeup", idx, GPIOD_IN);
++ if (IS_ERR(d)) {
++ pr_err("Unable to get UART wakeup GPIO descriptor\n");
+ return;
+ }
+- gpio_direction_input(gpio_nr);
+- ret = request_irq(gpio_to_irq(gpio_nr), &omap_serial_wake_interrupt,
++ ret = request_irq(gpiod_to_irq(d), &omap_serial_wake_interrupt,
+ IRQF_TRIGGER_RISING, "serial wakeup", NULL);
+ if (ret) {
+- gpio_free(gpio_nr);
+- printk(KERN_ERR "No interrupt for UART wake GPIO: %i\n",
+- gpio_nr);
++ gpiod_put(d);
++ pr_err("No interrupt for UART%d wake GPIO\n", idx + 1);
+ return;
+ }
+- enable_irq_wake(gpio_to_irq(gpio_nr));
++ enable_irq_wake(gpiod_to_irq(d));
+ }
+
++
+ int __init omap_serial_wakeup_init(void)
+ {
+ if (!cpu_is_omap16xx())
+ return 0;
+
+ if (uart1_ck != NULL)
+- omap_serial_set_port_wakeup(37);
++ omap_serial_set_port_wakeup(0);
+ if (uart2_ck != NULL)
+- omap_serial_set_port_wakeup(18);
++ omap_serial_set_port_wakeup(1);
+ if (uart3_ck != NULL)
+- omap_serial_set_port_wakeup(49);
++ omap_serial_set_port_wakeup(2);
+
+ return 0;
+ }
+diff --git a/arch/arm/mach-omap2/board-generic.c b/arch/arm/mach-omap2/board-generic.c
+index 1610c567a6a3a..10d2f078e4a8e 100644
+--- a/arch/arm/mach-omap2/board-generic.c
++++ b/arch/arm/mach-omap2/board-generic.c
+@@ -13,6 +13,7 @@
+ #include <linux/of_platform.h>
+ #include <linux/irqdomain.h>
+ #include <linux/clocksource.h>
++#include <linux/clockchips.h>
+
+ #include <asm/setup.h>
+ #include <asm/mach/arch.h>
+diff --git a/arch/arm/mach-omap2/board-n8x0.c b/arch/arm/mach-omap2/board-n8x0.c
+index 3353b0a923d96..564bf80a26212 100644
+--- a/arch/arm/mach-omap2/board-n8x0.c
++++ b/arch/arm/mach-omap2/board-n8x0.c
+@@ -10,7 +10,8 @@
+
+ #include <linux/clk.h>
+ #include <linux/delay.h>
+-#include <linux/gpio.h>
++#include <linux/gpio/machine.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/init.h>
+ #include <linux/io.h>
+ #include <linux/irq.h>
+@@ -28,13 +29,12 @@
+
+ #include "common.h"
+ #include "mmc.h"
++#include "usb-tusb6010.h"
+ #include "soc.h"
+ #include "common-board-devices.h"
+
+ #define TUSB6010_ASYNC_CS 1
+ #define TUSB6010_SYNC_CS 4
+-#define TUSB6010_GPIO_INT 58
+-#define TUSB6010_GPIO_ENABLE 0
+ #define TUSB6010_DMACHAN 0x3f
+
+ #define NOKIA_N810_WIMAX (1 << 2)
+@@ -61,37 +61,6 @@ static void board_check_revision(void)
+ }
+
+ #if IS_ENABLED(CONFIG_USB_MUSB_TUSB6010)
+-/*
+- * Enable or disable power to TUSB6010. When enabling, turn on 3.3 V and
+- * 1.5 V voltage regulators of PM companion chip. Companion chip will then
+- * provide then PGOOD signal to TUSB6010 which will release it from reset.
+- */
+-static int tusb_set_power(int state)
+-{
+- int i, retval = 0;
+-
+- if (state) {
+- gpio_set_value(TUSB6010_GPIO_ENABLE, 1);
+- msleep(1);
+-
+- /* Wait until TUSB6010 pulls INT pin down */
+- i = 100;
+- while (i && gpio_get_value(TUSB6010_GPIO_INT)) {
+- msleep(1);
+- i--;
+- }
+-
+- if (!i) {
+- printk(KERN_ERR "tusb: powerup failed\n");
+- retval = -ENODEV;
+- }
+- } else {
+- gpio_set_value(TUSB6010_GPIO_ENABLE, 0);
+- msleep(10);
+- }
+-
+- return retval;
+-}
+
+ static struct musb_hdrc_config musb_config = {
+ .multipoint = 1,
+@@ -102,39 +71,36 @@ static struct musb_hdrc_config musb_config = {
+
+ static struct musb_hdrc_platform_data tusb_data = {
+ .mode = MUSB_OTG,
+- .set_power = tusb_set_power,
+ .min_power = 25, /* x2 = 50 mA drawn from VBUS as peripheral */
+ .power = 100, /* Max 100 mA VBUS for host mode */
+ .config = &musb_config,
+ };
+
++static struct gpiod_lookup_table tusb_gpio_table = {
++ .dev_id = "musb-tusb",
++ .table = {
++ GPIO_LOOKUP("gpio-0-15", 0, "enable",
++ GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP("gpio-48-63", 10, "int",
++ GPIO_ACTIVE_HIGH),
++ { }
++ },
++};
++
+ static void __init n8x0_usb_init(void)
+ {
+ int ret = 0;
+- static const char announce[] __initconst = KERN_INFO "TUSB 6010\n";
+-
+- /* PM companion chip power control pin */
+- ret = gpio_request_one(TUSB6010_GPIO_ENABLE, GPIOF_OUT_INIT_LOW,
+- "TUSB6010 enable");
+- if (ret != 0) {
+- printk(KERN_ERR "Could not get TUSB power GPIO%i\n",
+- TUSB6010_GPIO_ENABLE);
+- return;
+- }
+- tusb_set_power(0);
+
++ gpiod_add_lookup_table(&tusb_gpio_table);
+ ret = tusb6010_setup_interface(&tusb_data, TUSB6010_REFCLK_19, 2,
+- TUSB6010_ASYNC_CS, TUSB6010_SYNC_CS,
+- TUSB6010_GPIO_INT, TUSB6010_DMACHAN);
++ TUSB6010_ASYNC_CS, TUSB6010_SYNC_CS,
++ TUSB6010_DMACHAN);
+ if (ret != 0)
+- goto err;
++ return;
+
+- printk(announce);
++ pr_info("TUSB 6010\n");
+
+ return;
+-
+-err:
+- gpio_free(TUSB6010_GPIO_ENABLE);
+ }
+ #else
+
+@@ -170,22 +136,32 @@ static struct spi_board_info n800_spi_board_info[] __initdata = {
+ * GPIO23 and GPIO9 slot 2 EMMC on N810
+ *
+ */
+-#define N8X0_SLOT_SWITCH_GPIO 96
+-#define N810_EMMC_VSD_GPIO 23
+-#define N810_EMMC_VIO_GPIO 9
+-
+ static int slot1_cover_open;
+ static int slot2_cover_open;
+ static struct device *mmc_device;
+
+-static int n8x0_mmc_switch_slot(struct device *dev, int slot)
+-{
+-#ifdef CONFIG_MMC_DEBUG
+- dev_dbg(dev, "Choose slot %d\n", slot + 1);
+-#endif
+- gpio_set_value(N8X0_SLOT_SWITCH_GPIO, slot);
+- return 0;
+-}
++static struct gpiod_lookup_table nokia8xx_mmc_gpio_table = {
++ .dev_id = "mmci-omap.0",
++ .table = {
++ /* Slot switch, GPIO 96 */
++ GPIO_LOOKUP("gpio-80-111", 16,
++ "switch", GPIO_ACTIVE_HIGH),
++ { }
++ },
++};
++
++static struct gpiod_lookup_table nokia810_mmc_gpio_table = {
++ .dev_id = "mmci-omap.0",
++ .table = {
++ /* Slot index 1, VSD power, GPIO 23 */
++ GPIO_LOOKUP_IDX("gpio-16-31", 7,
++ "vsd", 1, GPIO_ACTIVE_HIGH),
++ /* Slot index 1, VIO power, GPIO 9 */
++ GPIO_LOOKUP_IDX("gpio-0-15", 9,
++ "vsd", 1, GPIO_ACTIVE_HIGH),
++ { }
++ },
++};
+
+ static int n8x0_mmc_set_power_menelaus(struct device *dev, int slot,
+ int power_on, int vdd)
+@@ -256,31 +232,13 @@ static int n8x0_mmc_set_power_menelaus(struct device *dev, int slot,
+ return 0;
+ }
+
+-static void n810_set_power_emmc(struct device *dev,
+- int power_on)
+-{
+- dev_dbg(dev, "Set EMMC power %s\n", power_on ? "on" : "off");
+-
+- if (power_on) {
+- gpio_set_value(N810_EMMC_VSD_GPIO, 1);
+- msleep(1);
+- gpio_set_value(N810_EMMC_VIO_GPIO, 1);
+- msleep(1);
+- } else {
+- gpio_set_value(N810_EMMC_VIO_GPIO, 0);
+- msleep(50);
+- gpio_set_value(N810_EMMC_VSD_GPIO, 0);
+- msleep(50);
+- }
+-}
+-
+ static int n8x0_mmc_set_power(struct device *dev, int slot, int power_on,
+ int vdd)
+ {
+ if (board_is_n800() || slot == 0)
+ return n8x0_mmc_set_power_menelaus(dev, slot, power_on, vdd);
+
+- n810_set_power_emmc(dev, power_on);
++ /* The n810 power will be handled by GPIO code in the driver */
+
+ return 0;
+ }
+@@ -418,13 +376,6 @@ static void n8x0_mmc_shutdown(struct device *dev)
+ static void n8x0_mmc_cleanup(struct device *dev)
+ {
+ menelaus_unregister_mmc_callback();
+-
+- gpio_free(N8X0_SLOT_SWITCH_GPIO);
+-
+- if (board_is_n810()) {
+- gpio_free(N810_EMMC_VSD_GPIO);
+- gpio_free(N810_EMMC_VIO_GPIO);
+- }
+ }
+
+ /*
+@@ -433,7 +384,6 @@ static void n8x0_mmc_cleanup(struct device *dev)
+ */
+ static struct omap_mmc_platform_data mmc1_data = {
+ .nr_slots = 0,
+- .switch_slot = n8x0_mmc_switch_slot,
+ .init = n8x0_mmc_late_init,
+ .cleanup = n8x0_mmc_cleanup,
+ .shutdown = n8x0_mmc_shutdown,
+@@ -463,14 +413,9 @@ static struct omap_mmc_platform_data mmc1_data = {
+
+ static struct omap_mmc_platform_data *mmc_data[OMAP24XX_NR_MMC];
+
+-static struct gpio n810_emmc_gpios[] __initdata = {
+- { N810_EMMC_VSD_GPIO, GPIOF_OUT_INIT_LOW, "MMC slot 2 Vddf" },
+- { N810_EMMC_VIO_GPIO, GPIOF_OUT_INIT_LOW, "MMC slot 2 Vdd" },
+-};
+-
+ static void __init n8x0_mmc_init(void)
+ {
+- int err;
++ gpiod_add_lookup_table(&nokia8xx_mmc_gpio_table);
+
+ if (board_is_n810()) {
+ mmc1_data.slots[0].name = "external";
+@@ -483,20 +428,7 @@ static void __init n8x0_mmc_init(void)
+ */
+ mmc1_data.slots[1].name = "internal";
+ mmc1_data.slots[1].ban_openended = 1;
+- }
+-
+- err = gpio_request_one(N8X0_SLOT_SWITCH_GPIO, GPIOF_OUT_INIT_LOW,
+- "MMC slot switch");
+- if (err)
+- return;
+-
+- if (board_is_n810()) {
+- err = gpio_request_array(n810_emmc_gpios,
+- ARRAY_SIZE(n810_emmc_gpios));
+- if (err) {
+- gpio_free(N8X0_SLOT_SWITCH_GPIO);
+- return;
+- }
++ gpiod_add_lookup_table(&nokia810_mmc_gpio_table);
+ }
+
+ mmc1_data.nr_slots = 2;
+diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
+index 4afa2f08e6681..fca7869c8075a 100644
+--- a/arch/arm/mach-omap2/omap_device.c
++++ b/arch/arm/mach-omap2/omap_device.c
+@@ -244,7 +244,6 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
+ case BUS_NOTIFY_ADD_DEVICE:
+ if (pdev->dev.of_node)
+ omap_device_build_from_dt(pdev);
+- omap_auxdata_legacy_init(dev);
+ fallthrough;
+ default:
+ od = to_omap_device(pdev);
+diff --git a/arch/arm/mach-omap2/pdata-quirks.c b/arch/arm/mach-omap2/pdata-quirks.c
+index 04208cc52784e..c1c0121f478d6 100644
+--- a/arch/arm/mach-omap2/pdata-quirks.c
++++ b/arch/arm/mach-omap2/pdata-quirks.c
+@@ -6,8 +6,8 @@
+ */
+ #include <linux/clk.h>
+ #include <linux/davinci_emac.h>
++#include <linux/gpio/machine.h>
+ #include <linux/gpio/consumer.h>
+-#include <linux/gpio.h>
+ #include <linux/init.h>
+ #include <linux/kernel.h>
+ #include <linux/of_platform.h>
+@@ -41,7 +41,6 @@ struct pdata_init {
+ };
+
+ static struct of_dev_auxdata omap_auxdata_lookup[];
+-static struct twl4030_gpio_platform_data twl_gpio_auxdata;
+
+ #ifdef CONFIG_MACH_NOKIA_N8X0
+ static void __init omap2420_n8x0_legacy_init(void)
+@@ -98,52 +97,43 @@ static struct iommu_platform_data omap3_iommu_isp_pdata = {
+ };
+ #endif
+
+-static int omap3_sbc_t3730_twl_callback(struct device *dev,
+- unsigned gpio,
+- unsigned ngpio)
++static void __init omap3_sbc_t3x_usb_hub_init(char *hub_name, int idx)
+ {
+- int res;
++ struct gpio_desc *d;
+
+- res = gpio_request_one(gpio + 2, GPIOF_OUT_INIT_HIGH,
+- "wlan pwr");
+- if (res)
+- return res;
+-
+- gpiod_export(gpio_to_desc(gpio), 0);
+-
+- return 0;
+-}
+-
+-static void __init omap3_sbc_t3x_usb_hub_init(int gpio, char *hub_name)
+-{
+- int err = gpio_request_one(gpio, GPIOF_OUT_INIT_LOW, hub_name);
+-
+- if (err) {
+- pr_err("SBC-T3x: %s reset gpio request failed: %d\n",
+- hub_name, err);
++ /* This asserts the RESET line (reverse polarity) */
++ d = gpiod_get_index(NULL, "reset", idx, GPIOD_OUT_HIGH);
++ if (IS_ERR(d)) {
++ pr_err("Unable to get T3x USB reset GPIO descriptor\n");
+ return;
+ }
+-
+- gpiod_export(gpio_to_desc(gpio), 0);
+-
++ gpiod_set_consumer_name(d, hub_name);
++ gpiod_export(d, 0);
+ udelay(10);
+- gpio_set_value(gpio, 1);
++ /* De-assert RESET */
++ gpiod_set_value(d, 0);
+ msleep(1);
+ }
+
+-static void __init omap3_sbc_t3730_twl_init(void)
+-{
+- twl_gpio_auxdata.setup = omap3_sbc_t3730_twl_callback;
+-}
++static struct gpiod_lookup_table omap3_sbc_t3x_usb_gpio_table = {
++ .dev_id = NULL,
++ .table = {
++ GPIO_LOOKUP_IDX("gpio-160-175", 7, "reset", 0,
++ GPIO_ACTIVE_LOW),
++ { }
++ },
++};
+
+ static void __init omap3_sbc_t3730_legacy_init(void)
+ {
+- omap3_sbc_t3x_usb_hub_init(167, "sb-t35 usb hub");
++ gpiod_add_lookup_table(&omap3_sbc_t3x_usb_gpio_table);
++ omap3_sbc_t3x_usb_hub_init("sb-t35 usb hub", 0);
+ }
+
+ static void __init omap3_sbc_t3530_legacy_init(void)
+ {
+- omap3_sbc_t3x_usb_hub_init(167, "sb-t35 usb hub");
++ gpiod_add_lookup_table(&omap3_sbc_t3x_usb_gpio_table);
++ omap3_sbc_t3x_usb_hub_init("sb-t35 usb hub", 0);
+ }
+
+ static void __init omap3_evm_legacy_init(void)
+@@ -187,31 +177,59 @@ static void __init am35xx_emac_reset(void)
+ omap_ctrl_readl(AM35XX_CONTROL_IP_SW_RESET); /* OCP barrier */
+ }
+
+-static struct gpio cm_t3517_wlan_gpios[] __initdata = {
+- { 56, GPIOF_OUT_INIT_HIGH, "wlan pwr" },
+- { 4, GPIOF_OUT_INIT_HIGH, "xcvr noe" },
++static struct gpiod_lookup_table cm_t3517_wlan_gpio_table = {
++ .dev_id = NULL,
++ .table = {
++ GPIO_LOOKUP("gpio-48-53", 8, "power",
++ GPIO_ACTIVE_HIGH),
++ GPIO_LOOKUP("gpio-0-15", 4, "noe",
++ GPIO_ACTIVE_HIGH),
++ { }
++ },
+ };
+
+ static void __init omap3_sbc_t3517_wifi_init(void)
+ {
+- int err = gpio_request_array(cm_t3517_wlan_gpios,
+- ARRAY_SIZE(cm_t3517_wlan_gpios));
+- if (err) {
+- pr_err("SBC-T3517: wl12xx gpios request failed: %d\n", err);
+- return;
+- }
++ struct gpio_desc *d;
++
++ gpiod_add_lookup_table(&cm_t3517_wlan_gpio_table);
+
+- gpiod_export(gpio_to_desc(cm_t3517_wlan_gpios[0].gpio), 0);
+- gpiod_export(gpio_to_desc(cm_t3517_wlan_gpios[1].gpio), 0);
++ /* This asserts the RESET line (reverse polarity) */
++ d = gpiod_get(NULL, "power", GPIOD_OUT_HIGH);
++ if (IS_ERR(d)) {
++ pr_err("Unable to get CM T3517 WLAN power GPIO descriptor\n");
++ } else {
++ gpiod_set_consumer_name(d, "wlan pwr");
++ gpiod_export(d, 0);
++ }
+
++ d = gpiod_get(NULL, "noe", GPIOD_OUT_HIGH);
++ if (IS_ERR(d)) {
++ pr_err("Unable to get CM T3517 WLAN XCVR NOE GPIO descriptor\n");
++ } else {
++ gpiod_set_consumer_name(d, "xcvr noe");
++ gpiod_export(d, 0);
++ }
+ msleep(100);
+- gpio_set_value(cm_t3517_wlan_gpios[1].gpio, 0);
+-}
++ gpiod_set_value(d, 0);
++}
++
++static struct gpiod_lookup_table omap3_sbc_t3517_usb_gpio_table = {
++ .dev_id = NULL,
++ .table = {
++ GPIO_LOOKUP_IDX("gpio-144-159", 8, "reset", 0,
++ GPIO_ACTIVE_LOW),
++ GPIO_LOOKUP_IDX("gpio-96-111", 2, "reset", 1,
++ GPIO_ACTIVE_LOW),
++ { }
++ },
++};
+
+ static void __init omap3_sbc_t3517_legacy_init(void)
+ {
+- omap3_sbc_t3x_usb_hub_init(152, "cm-t3517 usb hub");
+- omap3_sbc_t3x_usb_hub_init(98, "sb-t35 usb hub");
++ gpiod_add_lookup_table(&omap3_sbc_t3517_usb_gpio_table);
++ omap3_sbc_t3x_usb_hub_init("cm-t3517 usb hub", 0);
++ omap3_sbc_t3x_usb_hub_init("sb-t35 usb hub", 1);
+ am35xx_emac_reset();
+ hsmmc2_internal_input_clk();
+ omap3_sbc_t3517_wifi_init();
+@@ -393,21 +411,6 @@ static struct ti_prm_platform_data ti_prm_pdata = {
+ .clkdm_lookup = clkdm_lookup,
+ };
+
+-/*
+- * GPIOs for TWL are initialized by the I2C bus and need custom
+- * handing until DSS has device tree bindings.
+- */
+-void omap_auxdata_legacy_init(struct device *dev)
+-{
+- if (dev->platform_data)
+- return;
+-
+- if (strcmp("twl4030-gpio", dev_name(dev)))
+- return;
+-
+- dev->platform_data = &twl_gpio_auxdata;
+-}
+-
+ #if defined(CONFIG_ARCH_OMAP3) && IS_ENABLED(CONFIG_SND_SOC_OMAP_MCBSP)
+ static struct omap_mcbsp_platform_data mcbsp_pdata;
+ static void __init omap3_mcbsp_init(void)
+@@ -427,9 +430,6 @@ static struct pdata_init auxdata_quirks[] __initdata = {
+ { "nokia,n800", omap2420_n8x0_legacy_init, },
+ { "nokia,n810", omap2420_n8x0_legacy_init, },
+ { "nokia,n810-wimax", omap2420_n8x0_legacy_init, },
+-#endif
+-#ifdef CONFIG_ARCH_OMAP3
+- { "compulab,omap3-sbc-t3730", omap3_sbc_t3730_twl_init, },
+ #endif
+ { /* sentinel */ },
+ };
+diff --git a/arch/arm/mach-omap2/usb-tusb6010.c b/arch/arm/mach-omap2/usb-tusb6010.c
+index 18fa52f828dc7..b46c254c2bc41 100644
+--- a/arch/arm/mach-omap2/usb-tusb6010.c
++++ b/arch/arm/mach-omap2/usb-tusb6010.c
+@@ -11,12 +11,12 @@
+ #include <linux/errno.h>
+ #include <linux/delay.h>
+ #include <linux/platform_device.h>
+-#include <linux/gpio.h>
+ #include <linux/export.h>
+ #include <linux/platform_data/usb-omap.h>
+
+ #include <linux/usb/musb.h>
+
++#include "usb-tusb6010.h"
+ #include "gpmc.h"
+
+ static u8 async_cs, sync_cs;
+@@ -132,10 +132,6 @@ static struct resource tusb_resources[] = {
+ { /* Synchronous access */
+ .flags = IORESOURCE_MEM,
+ },
+- { /* IRQ */
+- .name = "mc",
+- .flags = IORESOURCE_IRQ,
+- },
+ };
+
+ static u64 tusb_dmamask = ~(u32)0;
+@@ -154,9 +150,9 @@ static struct platform_device tusb_device = {
+
+ /* this may be called only from board-*.c setup code */
+ int __init tusb6010_setup_interface(struct musb_hdrc_platform_data *data,
+- unsigned ps_refclk, unsigned waitpin,
+- unsigned async, unsigned sync,
+- unsigned irq, unsigned dmachan)
++ unsigned int ps_refclk, unsigned int waitpin,
++ unsigned int async, unsigned int sync,
++ unsigned int dmachan)
+ {
+ int status;
+ static char error[] __initdata =
+@@ -192,14 +188,6 @@ int __init tusb6010_setup_interface(struct musb_hdrc_platform_data *data,
+ if (status < 0)
+ return status;
+
+- /* IRQ */
+- status = gpio_request_one(irq, GPIOF_IN, "TUSB6010 irq");
+- if (status < 0) {
+- printk(error, 3, status);
+- return status;
+- }
+- tusb_resources[2].start = gpio_to_irq(irq);
+-
+ /* set up memory timings ... can speed them up later */
+ if (!ps_refclk) {
+ printk(error, 4, status);
+diff --git a/arch/arm/mach-omap2/usb-tusb6010.h b/arch/arm/mach-omap2/usb-tusb6010.h
+new file mode 100644
+index 0000000000000..d210ff6238c26
+--- /dev/null
++++ b/arch/arm/mach-omap2/usb-tusb6010.h
+@@ -0,0 +1,12 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++
++#ifndef __USB_TUSB6010_H
++#define __USB_TUSB6010_H
++
++extern int __init tusb6010_setup_interface(
++ struct musb_hdrc_platform_data *data,
++ unsigned int ps_refclk, unsigned int waitpin,
++ unsigned int async_cs, unsigned int sync_cs,
++ unsigned int dmachan);
++
++#endif /* __USB_TUSB6010_H */
+diff --git a/arch/arm/mach-orion5x/board-dt.c b/arch/arm/mach-orion5x/board-dt.c
+index e3736ffc83477..be47492c6640d 100644
+--- a/arch/arm/mach-orion5x/board-dt.c
++++ b/arch/arm/mach-orion5x/board-dt.c
+@@ -60,6 +60,9 @@ static void __init orion5x_dt_init(void)
+ if (of_machine_is_compatible("maxtor,shared-storage-2"))
+ mss2_init();
+
++ if (of_machine_is_compatible("lacie,d2-network"))
++ d2net_init();
++
+ of_platform_default_populate(NULL, orion5x_auxdata_lookup, NULL);
+ }
+
+diff --git a/arch/arm/mach-orion5x/common.h b/arch/arm/mach-orion5x/common.h
+index f2e0577bf50f4..8df70e23aa82a 100644
+--- a/arch/arm/mach-orion5x/common.h
++++ b/arch/arm/mach-orion5x/common.h
+@@ -73,6 +73,12 @@ extern void mss2_init(void);
+ static inline void mss2_init(void) {}
+ #endif
+
++#ifdef CONFIG_MACH_D2NET_DT
++void d2net_init(void);
++#else
++static inline void d2net_init(void) {}
++#endif
++
+ /*****************************************************************************
+ * Helpers to access Orion registers
+ ****************************************************************************/
+diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c
+index 4325bdc2b9ff8..28e376e06fdc8 100644
+--- a/arch/arm/mach-pxa/spitz.c
++++ b/arch/arm/mach-pxa/spitz.c
+@@ -506,10 +506,18 @@ static struct ads7846_platform_data spitz_ads7846_info = {
+ .x_plate_ohms = 419,
+ .y_plate_ohms = 486,
+ .pressure_max = 1024,
+- .gpio_pendown = SPITZ_GPIO_TP_INT,
+ .wait_for_sync = spitz_ads7846_wait_for_hsync,
+ };
+
++static struct gpiod_lookup_table spitz_ads7846_gpio_table = {
++ .dev_id = "spi2.0",
++ .table = {
++ GPIO_LOOKUP("gpio-pxa", SPITZ_GPIO_TP_INT,
++ "pendown", GPIO_ACTIVE_LOW),
++ { }
++ },
++};
++
+ static void spitz_bl_kick_battery(void)
+ {
+ void (*kick_batt)(void);
+@@ -594,6 +602,7 @@ static void __init spitz_spi_init(void)
+ else
+ gpiod_add_lookup_table(&spitz_lcdcon_gpio_table);
+
++ gpiod_add_lookup_table(&spitz_ads7846_gpio_table);
+ gpiod_add_lookup_table(&spitz_spi_gpio_table);
+ pxa2xx_set_spi_info(2, &spitz_spi_info);
+ spi_register_board_info(ARRAY_AND_SIZE(spitz_spi_devices));
+diff --git a/arch/arm/probes/kprobes/checkers-common.c b/arch/arm/probes/kprobes/checkers-common.c
+index 4d720990cf2a3..eba7ac4725c02 100644
+--- a/arch/arm/probes/kprobes/checkers-common.c
++++ b/arch/arm/probes/kprobes/checkers-common.c
+@@ -40,7 +40,7 @@ enum probes_insn checker_stack_use_imm_0xx(probes_opcode_t insn,
+ * Different from other insn uses imm8, the real addressing offset of
+ * STRD in T32 encoding should be imm8 * 4. See ARMARM description.
+ */
+-enum probes_insn checker_stack_use_t32strd(probes_opcode_t insn,
++static enum probes_insn checker_stack_use_t32strd(probes_opcode_t insn,
+ struct arch_probes_insn *asi,
+ const struct decode_header *h)
+ {
+diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
+index 9090c3a74dcce..d8238da095df7 100644
+--- a/arch/arm/probes/kprobes/core.c
++++ b/arch/arm/probes/kprobes/core.c
+@@ -233,7 +233,7 @@ singlestep(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb)
+ * kprobe, and that level is reserved for user kprobe handlers, so we can't
+ * risk encountering a new kprobe in an interrupt handler.
+ */
+-void __kprobes kprobe_handler(struct pt_regs *regs)
++static void __kprobes kprobe_handler(struct pt_regs *regs)
+ {
+ struct kprobe *p, *cur;
+ struct kprobe_ctlblk *kcb;
+diff --git a/arch/arm/probes/kprobes/opt-arm.c b/arch/arm/probes/kprobes/opt-arm.c
+index dbef34ed933f2..7f65048380ca5 100644
+--- a/arch/arm/probes/kprobes/opt-arm.c
++++ b/arch/arm/probes/kprobes/opt-arm.c
+@@ -145,8 +145,6 @@ __arch_remove_optimized_kprobe(struct optimized_kprobe *op, int dirty)
+ }
+ }
+
+-extern void kprobe_handler(struct pt_regs *regs);
+-
+ static void
+ optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
+ {
+diff --git a/arch/arm/probes/kprobes/test-core.c b/arch/arm/probes/kprobes/test-core.c
+index c562832b86272..171c7076b89f4 100644
+--- a/arch/arm/probes/kprobes/test-core.c
++++ b/arch/arm/probes/kprobes/test-core.c
+@@ -720,7 +720,7 @@ static const char coverage_register_lookup[16] = {
+ [REG_TYPE_NOSPPCX] = COVERAGE_ANY_REG | COVERAGE_SP,
+ };
+
+-unsigned coverage_start_registers(const struct decode_header *h)
++static unsigned coverage_start_registers(const struct decode_header *h)
+ {
+ unsigned regs = 0;
+ int i;
+diff --git a/arch/arm/probes/kprobes/test-core.h b/arch/arm/probes/kprobes/test-core.h
+index 56ad3c0aaeeac..c7297037c1623 100644
+--- a/arch/arm/probes/kprobes/test-core.h
++++ b/arch/arm/probes/kprobes/test-core.h
+@@ -454,3 +454,7 @@ void kprobe_thumb32_test_cases(void);
+ #else
+ void kprobe_arm_test_cases(void);
+ #endif
++
++void __kprobes_test_case_start(void);
++void __kprobes_test_case_end_16(void);
++void __kprobes_test_case_end_32(void);
+diff --git a/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nand.dtso b/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nand.dtso
+index 15ee8c568f3c3..543c13385d6e3 100644
+--- a/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nand.dtso
++++ b/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nand.dtso
+@@ -29,13 +29,13 @@
+
+ partition@0 {
+ label = "bl2";
+- reg = <0x0 0x80000>;
++ reg = <0x0 0x100000>;
+ read-only;
+ };
+
+- partition@80000 {
++ partition@100000 {
+ label = "reserved";
+- reg = <0x80000 0x300000>;
++ reg = <0x100000 0x280000>;
+ };
+
+ partition@380000 {
+diff --git a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
+index 63952c1251dfd..8892b2f64a0f0 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8183-kukui.dtsi
+@@ -292,6 +292,10 @@
+ };
+ };
+
++&gic {
++ mediatek,broken-save-restore-fw;
++};
++
+ &gpu {
+ mali-supply = <&mt6358_vgpu_reg>;
+ };
+diff --git a/arch/arm64/boot/dts/mediatek/mt8192-asurada.dtsi b/arch/arm64/boot/dts/mediatek/mt8192-asurada.dtsi
+index 5a440504d4f9b..0e8b341170907 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8192-asurada.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8192-asurada.dtsi
+@@ -275,6 +275,10 @@
+ remote-endpoint = <&anx7625_in>;
+ };
+
++&gic {
++ mediatek,broken-save-restore-fw;
++};
++
+ &gpu {
+ mali-supply = <&mt6315_7_vbuck1>;
+ status = "okay";
+diff --git a/arch/arm64/boot/dts/mediatek/mt8192.dtsi b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+index 5c30caf740265..75eeba539e6fe 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8192.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+@@ -70,7 +70,8 @@
+ d-cache-line-size = <64>;
+ d-cache-sets = <128>;
+ next-level-cache = <&l2_0>;
+- capacity-dmips-mhz = <530>;
++ performance-domains = <&performance 0>;
++ capacity-dmips-mhz = <427>;
+ };
+
+ cpu1: cpu@100 {
+@@ -87,7 +88,8 @@
+ d-cache-line-size = <64>;
+ d-cache-sets = <128>;
+ next-level-cache = <&l2_0>;
+- capacity-dmips-mhz = <530>;
++ performance-domains = <&performance 0>;
++ capacity-dmips-mhz = <427>;
+ };
+
+ cpu2: cpu@200 {
+@@ -104,7 +106,8 @@
+ d-cache-line-size = <64>;
+ d-cache-sets = <128>;
+ next-level-cache = <&l2_0>;
+- capacity-dmips-mhz = <530>;
++ performance-domains = <&performance 0>;
++ capacity-dmips-mhz = <427>;
+ };
+
+ cpu3: cpu@300 {
+@@ -121,7 +124,8 @@
+ d-cache-line-size = <64>;
+ d-cache-sets = <128>;
+ next-level-cache = <&l2_0>;
+- capacity-dmips-mhz = <530>;
++ performance-domains = <&performance 0>;
++ capacity-dmips-mhz = <427>;
+ };
+
+ cpu4: cpu@400 {
+@@ -138,6 +142,7 @@
+ d-cache-line-size = <64>;
+ d-cache-sets = <256>;
+ next-level-cache = <&l2_1>;
++ performance-domains = <&performance 1>;
+ capacity-dmips-mhz = <1024>;
+ };
+
+@@ -155,6 +160,7 @@
+ d-cache-line-size = <64>;
+ d-cache-sets = <256>;
+ next-level-cache = <&l2_1>;
++ performance-domains = <&performance 1>;
+ capacity-dmips-mhz = <1024>;
+ };
+
+@@ -172,6 +178,7 @@
+ d-cache-line-size = <64>;
+ d-cache-sets = <256>;
+ next-level-cache = <&l2_1>;
++ performance-domains = <&performance 1>;
+ capacity-dmips-mhz = <1024>;
+ };
+
+@@ -189,6 +196,7 @@
+ d-cache-line-size = <64>;
+ d-cache-sets = <256>;
+ next-level-cache = <&l2_1>;
++ performance-domains = <&performance 1>;
+ capacity-dmips-mhz = <1024>;
+ };
+
+@@ -403,6 +411,12 @@
+ compatible = "simple-bus";
+ ranges;
+
++ performance: performance-controller@11bc10 {
++ compatible = "mediatek,cpufreq-hw";
++ reg = <0 0x0011bc10 0 0x120>, <0 0x0011bd30 0 0x120>;
++ #performance-domain-cells = <1>;
++ };
++
+ gic: interrupt-controller@c000000 {
+ compatible = "arm,gic-v3";
+ #interrupt-cells = <4>;
+diff --git a/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi b/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi
+index 8ac80a136c371..f2d0726546c77 100644
+--- a/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi
++++ b/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi
+@@ -255,6 +255,10 @@
+ };
+ };
+
++&gic {
++ mediatek,broken-save-restore-fw;
++};
++
+ &gpu {
+ status = "okay";
+ mali-supply = <&mt6315_7_vbuck1>;
+diff --git a/arch/arm64/boot/dts/microchip/sparx5.dtsi b/arch/arm64/boot/dts/microchip/sparx5.dtsi
+index 0367a00a269b3..5eae6e7fd248e 100644
+--- a/arch/arm64/boot/dts/microchip/sparx5.dtsi
++++ b/arch/arm64/boot/dts/microchip/sparx5.dtsi
+@@ -61,7 +61,7 @@
+ interrupt-affinity = <&cpu0>, <&cpu1>;
+ };
+
+- psci {
++ psci: psci {
+ compatible = "arm,psci-0.2";
+ method = "smc";
+ };
+diff --git a/arch/arm64/boot/dts/microchip/sparx5_pcb_common.dtsi b/arch/arm64/boot/dts/microchip/sparx5_pcb_common.dtsi
+index 9d1a082de3e29..32bb76b3202a0 100644
+--- a/arch/arm64/boot/dts/microchip/sparx5_pcb_common.dtsi
++++ b/arch/arm64/boot/dts/microchip/sparx5_pcb_common.dtsi
+@@ -6,6 +6,18 @@
+ /dts-v1/;
+ #include "sparx5.dtsi"
+
++&psci {
++ status = "disabled";
++};
++
++&cpu0 {
++ enable-method = "spin-table";
++};
++
++&cpu1 {
++ enable-method = "spin-table";
++};
++
+ &uart0 {
+ status = "okay";
+ };
+diff --git a/arch/arm64/boot/dts/qcom/apq8016-sbc.dts b/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
+index 59860a2223b83..3ec449f5cab78 100644
+--- a/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
++++ b/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
+@@ -447,21 +447,21 @@
+ vdd_l7-supply = <&pm8916_s4>;
+
+ s3 {
+- regulator-min-microvolt = <375000>;
+- regulator-max-microvolt = <1562000>;
++ regulator-min-microvolt = <1250000>;
++ regulator-max-microvolt = <1350000>;
+ };
+
+ s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
++ regulator-min-microvolt = <1850000>;
++ regulator-max-microvolt = <2150000>;
+
+ regulator-always-on;
+ regulator-boot-on;
+ };
+
+ l1 {
+- regulator-min-microvolt = <375000>;
+- regulator-max-microvolt = <1525000>;
++ regulator-min-microvolt = <1225000>;
++ regulator-max-microvolt = <1225000>;
+ };
+
+ l2 {
+@@ -470,13 +470,13 @@
+ };
+
+ l4 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <2050000>;
++ regulator-max-microvolt = <2050000>;
+ };
+
+ l5 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+
+ l6 {
+@@ -485,60 +485,68 @@
+ };
+
+ l7 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
+ };
+
+ l8 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <2900000>;
++ regulator-max-microvolt = <2900000>;
+ };
+
+ l9 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
+ };
+
+ l10 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <2800000>;
++ regulator-max-microvolt = <2800000>;
+ };
+
+ l11 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
+ regulator-allow-set-load;
+ regulator-system-load = <200000>;
+ };
+
+ l12 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
+ };
+
+ l13 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <3075000>;
++ regulator-max-microvolt = <3075000>;
+ };
+
+ l14 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <3300000>;
+ };
+
+- /**
+- * 1.8v required on LS expansion
+- * for mezzanine boards
++ /*
++ * The 96Boards specification expects a 1.8V power rail on the low-speed
++ * expansion connector that is able to provide at least 0.18W / 100 mA.
++ * L15/L16 are connected in parallel to provide 55 mA each. A minimum load
++ * must be specified to ensure the regulators are not put in LPM where they
++ * would only provide 5 mA.
+ */
+ l15 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-system-load = <50000>;
++ regulator-allow-set-load;
+ regulator-always-on;
+ };
+
+ l16 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-system-load = <50000>;
++ regulator-allow-set-load;
++ regulator-always-on;
+ };
+
+ l17 {
+@@ -547,8 +555,8 @@
+ };
+
+ l18 {
+- regulator-min-microvolt = <1750000>;
+- regulator-max-microvolt = <3337000>;
++ regulator-min-microvolt = <2700000>;
++ regulator-max-microvolt = <2700000>;
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/qcom/apq8096-ifc6640.dts b/arch/arm64/boot/dts/qcom/apq8096-ifc6640.dts
+index 71e0a500599c8..ed2e2f6c6775a 100644
+--- a/arch/arm64/boot/dts/qcom/apq8096-ifc6640.dts
++++ b/arch/arm64/boot/dts/qcom/apq8096-ifc6640.dts
+@@ -26,7 +26,7 @@
+
+ v1p05: v1p05-regulator {
+ compatible = "regulator-fixed";
+- reglator-name = "v1p05";
++ regulator-name = "v1p05";
+ regulator-always-on;
+ regulator-boot-on;
+
+@@ -38,7 +38,7 @@
+
+ v12_poe: v12-poe-regulator {
+ compatible = "regulator-fixed";
+- reglator-name = "v12_poe";
++ regulator-name = "v12_poe";
+ regulator-always-on;
+ regulator-boot-on;
+
+diff --git a/arch/arm64/boot/dts/qcom/ipq6018.dtsi b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
+index f531797f26195..c58eeb4376abe 100644
+--- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi
++++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
+@@ -302,7 +302,7 @@
+ status = "disabled";
+ };
+
+- prng: qrng@e1000 {
++ prng: qrng@e3000 {
+ compatible = "qcom,prng-ee";
+ reg = <0x0 0x000e3000 0x0 0x1000>;
+ clocks = <&gcc GCC_PRNG_AHB_CLK>;
+diff --git a/arch/arm64/boot/dts/qcom/ipq9574.dtsi b/arch/arm64/boot/dts/qcom/ipq9574.dtsi
+index 0ed19fbf7d87d..6e3a88ee06152 100644
+--- a/arch/arm64/boot/dts/qcom/ipq9574.dtsi
++++ b/arch/arm64/boot/dts/qcom/ipq9574.dtsi
+@@ -173,14 +173,14 @@
+ intc: interrupt-controller@b000000 {
+ compatible = "qcom,msm-qgic2";
+ reg = <0x0b000000 0x1000>, /* GICD */
+- <0x0b002000 0x1000>, /* GICC */
++ <0x0b002000 0x2000>, /* GICC */
+ <0x0b001000 0x1000>, /* GICH */
+- <0x0b004000 0x1000>; /* GICV */
++ <0x0b004000 0x2000>; /* GICV */
+ #address-cells = <1>;
+ #size-cells = <1>;
+ interrupt-controller;
+ #interrupt-cells = <3>;
+- interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
++ interrupts = <GIC_PPI 9 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>;
+ ranges = <0 0x0b00c000 0x3000>;
+
+ v2m0: v2m@0 {
+diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
+index 834e0b66b7f2e..bf88c10ff55b0 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
+@@ -1162,7 +1162,7 @@
+ };
+ };
+
+- camss: camss@1b00000 {
++ camss: camss@1b0ac00 {
+ compatible = "qcom,msm8916-camss";
+ reg = <0x01b0ac00 0x200>,
+ <0x01b00030 0x4>,
+@@ -1554,7 +1554,7 @@
+ #sound-dai-cells = <1>;
+ };
+
+- sdhc_1: mmc@7824000 {
++ sdhc_1: mmc@7824900 {
+ compatible = "qcom,msm8916-sdhci", "qcom,sdhci-msm-v4";
+ reg = <0x07824900 0x11c>, <0x07824000 0x800>;
+ reg-names = "hc", "core";
+@@ -1572,7 +1572,7 @@
+ status = "disabled";
+ };
+
+- sdhc_2: mmc@7864000 {
++ sdhc_2: mmc@7864900 {
+ compatible = "qcom,msm8916-sdhci", "qcom,sdhci-msm-v4";
+ reg = <0x07864900 0x11c>, <0x07864000 0x800>;
+ reg-names = "hc", "core";
+@@ -1871,7 +1871,7 @@
+ };
+ };
+
+- wcnss: remoteproc@a21b000 {
++ wcnss: remoteproc@a204000 {
+ compatible = "qcom,pronto-v2-pil", "qcom,pronto";
+ reg = <0x0a204000 0x2000>, <0x0a202000 0x1000>, <0x0a21b000 0x3000>;
+ reg-names = "ccu", "dxe", "pmu";
+diff --git a/arch/arm64/boot/dts/qcom/msm8953.dtsi b/arch/arm64/boot/dts/qcom/msm8953.dtsi
+index d44cfa0471e9a..d1d6f80bb2e6b 100644
+--- a/arch/arm64/boot/dts/qcom/msm8953.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8953.dtsi
+@@ -1002,7 +1002,7 @@
+ };
+ };
+
+- apps_iommu: iommu@1e00000 {
++ apps_iommu: iommu@1e20000 {
+ compatible = "qcom,msm8953-iommu", "qcom,msm-iommu-v1";
+ ranges = <0 0x01e20000 0x20000>;
+
+@@ -1425,7 +1425,7 @@
+ status = "disabled";
+ };
+
+- wcnss: remoteproc@a21b000 {
++ wcnss: remoteproc@a204000 {
+ compatible = "qcom,pronto-v3-pil", "qcom,pronto";
+ reg = <0x0a204000 0x2000>, <0x0a202000 0x1000>, <0x0a21b000 0x3000>;
+ reg-names = "ccu", "dxe", "pmu";
+diff --git a/arch/arm64/boot/dts/qcom/msm8976.dtsi b/arch/arm64/boot/dts/qcom/msm8976.dtsi
+index f47fb8ea71e20..753b9a2105edd 100644
+--- a/arch/arm64/boot/dts/qcom/msm8976.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8976.dtsi
+@@ -822,7 +822,7 @@
+ #interrupt-cells = <4>;
+ };
+
+- sdhc_1: mmc@7824000 {
++ sdhc_1: mmc@7824900 {
+ compatible = "qcom,msm8976-sdhci", "qcom,sdhci-msm-v4";
+ reg = <0x07824900 0x500>, <0x07824000 0x800>;
+ reg-names = "hc", "core";
+@@ -838,7 +838,7 @@
+ status = "disabled";
+ };
+
+- sdhc_2: mmc@7864000 {
++ sdhc_2: mmc@7864900 {
+ compatible = "qcom,msm8976-sdhci", "qcom,sdhci-msm-v4";
+ reg = <0x07864900 0x11c>, <0x07864000 0x800>;
+ reg-names = "hc", "core";
+@@ -957,7 +957,7 @@
+ #reset-cells = <1>;
+ };
+
+- sdhc_3: mmc@7a24000 {
++ sdhc_3: mmc@7a24900 {
+ compatible = "qcom,msm8976-sdhci", "qcom,sdhci-msm-v4";
+ reg = <0x07a24900 0x11c>, <0x07a24000 0x800>;
+ reg-names = "hc", "core";
+diff --git a/arch/arm64/boot/dts/qcom/msm8994.dtsi b/arch/arm64/boot/dts/qcom/msm8994.dtsi
+index bdc3f2ba1755e..c5cf01c7f72e1 100644
+--- a/arch/arm64/boot/dts/qcom/msm8994.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8994.dtsi
+@@ -747,7 +747,7 @@
+ reg = <0xfc4ab000 0x4>;
+ };
+
+- spmi_bus: spmi@fc4c0000 {
++ spmi_bus: spmi@fc4cf000 {
+ compatible = "qcom,spmi-pmic-arb";
+ reg = <0xfc4cf000 0x1000>,
+ <0xfc4cb000 0x1000>,
+diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
+index 30257c07e1279..25fe2b8552fc7 100644
+--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
+@@ -2069,7 +2069,7 @@
+ };
+ };
+
+- camss: camss@a00000 {
++ camss: camss@a34000 {
+ compatible = "qcom,msm8996-camss";
+ reg = <0x00a34000 0x1000>,
+ <0x00a00030 0x4>,
+diff --git a/arch/arm64/boot/dts/qcom/pm7250b.dtsi b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
+index d709d955a2f5a..daa6f1d30efa0 100644
+--- a/arch/arm64/boot/dts/qcom/pm7250b.dtsi
++++ b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
+@@ -3,6 +3,7 @@
+ * Copyright (C) 2022 Luca Weiss <luca.weiss@fairphone.com>
+ */
+
++#include <dt-bindings/iio/qcom,spmi-vadc.h>
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/spmi/spmi.h>
+
+diff --git a/arch/arm64/boot/dts/qcom/pm8998.dtsi b/arch/arm64/boot/dts/qcom/pm8998.dtsi
+index 340033ac31860..695d79116cde2 100644
+--- a/arch/arm64/boot/dts/qcom/pm8998.dtsi
++++ b/arch/arm64/boot/dts/qcom/pm8998.dtsi
+@@ -55,7 +55,7 @@
+
+ pm8998_resin: resin {
+ compatible = "qcom,pm8941-resin";
+- interrupts = <GIC_SPI 0x8 1 IRQ_TYPE_EDGE_BOTH>;
++ interrupts = <0x0 0x8 1 IRQ_TYPE_EDGE_BOTH>;
+ debounce = <15625>;
+ bias-pull-up;
+ status = "disabled";
+diff --git a/arch/arm64/boot/dts/qcom/qdu1000.dtsi b/arch/arm64/boot/dts/qcom/qdu1000.dtsi
+index fb553f0bb17aa..6a6830777d8a8 100644
+--- a/arch/arm64/boot/dts/qcom/qdu1000.dtsi
++++ b/arch/arm64/boot/dts/qcom/qdu1000.dtsi
+@@ -1252,6 +1252,7 @@
+ qcom,tcs-config = <ACTIVE_TCS 2>, <SLEEP_TCS 3>,
+ <WAKE_TCS 3>, <CONTROL_TCS 0>;
+ label = "apps_rsc";
++ power-domains = <&CLUSTER_PD>;
+
+ apps_bcm_voter: bcm-voter {
+ compatible = "qcom,bcm-voter";
+diff --git a/arch/arm64/boot/dts/qcom/qrb4210-rb2.dts b/arch/arm64/boot/dts/qcom/qrb4210-rb2.dts
+index dc80f0bca7676..5554b3b9aaf32 100644
+--- a/arch/arm64/boot/dts/qcom/qrb4210-rb2.dts
++++ b/arch/arm64/boot/dts/qcom/qrb4210-rb2.dts
+@@ -199,7 +199,8 @@
+ };
+
+ &sdhc_2 {
+- cd-gpios = <&tlmm 88 GPIO_ACTIVE_HIGH>; /* card detect gpio */
++ cd-gpios = <&tlmm 88 GPIO_ACTIVE_LOW>; /* card detect gpio */
++
+ vmmc-supply = <&vreg_l22a_2p96>;
+ vqmmc-supply = <&vreg_l5a_2p96>;
+ no-sdio;
+diff --git a/arch/arm64/boot/dts/qcom/sdm630.dtsi b/arch/arm64/boot/dts/qcom/sdm630.dtsi
+index eaead2f7beb4e..ab04903fa3ff3 100644
+--- a/arch/arm64/boot/dts/qcom/sdm630.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm630.dtsi
+@@ -1894,7 +1894,7 @@
+ };
+ };
+
+- camss: camss@ca00000 {
++ camss: camss@ca00020 {
+ compatible = "qcom,sdm660-camss";
+ reg = <0x0ca00020 0x10>,
+ <0x0ca30000 0x100>,
+diff --git a/arch/arm64/boot/dts/qcom/sdm670.dtsi b/arch/arm64/boot/dts/qcom/sdm670.dtsi
+index b61e13db89bd5..a1c207c0266da 100644
+--- a/arch/arm64/boot/dts/qcom/sdm670.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm670.dtsi
+@@ -1282,6 +1282,7 @@
+ <SLEEP_TCS 3>,
+ <WAKE_TCS 3>,
+ <CONTROL_TCS 1>;
++ power-domains = <&CLUSTER_PD>;
+
+ apps_bcm_voter: bcm-voter {
+ compatible = "qcom,bcm-voter";
+diff --git a/arch/arm64/boot/dts/qcom/sdm845-xiaomi-polaris.dts b/arch/arm64/boot/dts/qcom/sdm845-xiaomi-polaris.dts
+index 8ae0ffccaab22..576f0421824f4 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845-xiaomi-polaris.dts
++++ b/arch/arm64/boot/dts/qcom/sdm845-xiaomi-polaris.dts
+@@ -483,6 +483,7 @@
+ };
+
+ rmi4-f12@12 {
++ reg = <0x12>;
+ syna,rezero-wait-ms = <0xc8>;
+ syna,clip-x-high = <0x438>;
+ syna,clip-y-high = <0x870>;
+diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+index cdeb05e95674e..1bfb938e284fb 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+@@ -4238,7 +4238,7 @@
+ #reset-cells = <1>;
+ };
+
+- camss: camss@a00000 {
++ camss: camss@acb3000 {
+ compatible = "qcom,sdm845-camss";
+
+ reg = <0 0x0acb3000 0 0x1000>,
+@@ -5137,6 +5137,7 @@
+ <SLEEP_TCS 3>,
+ <WAKE_TCS 3>,
+ <CONTROL_TCS 1>;
++ power-domains = <&CLUSTER_PD>;
+
+ apps_bcm_voter: bcm-voter {
+ compatible = "qcom,bcm-voter";
+diff --git a/arch/arm64/boot/dts/qcom/sm6115.dtsi b/arch/arm64/boot/dts/qcom/sm6115.dtsi
+index 43f31c1b9d5a7..ea71249bbdf3f 100644
+--- a/arch/arm64/boot/dts/qcom/sm6115.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm6115.dtsi
+@@ -700,7 +700,7 @@
+ #interrupt-cells = <4>;
+ };
+
+- tsens0: thermal-sensor@4410000 {
++ tsens0: thermal-sensor@4411000 {
+ compatible = "qcom,sm6115-tsens", "qcom,tsens-v2";
+ reg = <0x0 0x04411000 0x0 0x1ff>, /* TM */
+ <0x0 0x04410000 0x0 0x8>; /* SROT */
+diff --git a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi
+index 2f22d348d45d7..dcabb714f0f35 100644
+--- a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi
+@@ -26,9 +26,10 @@
+ framebuffer: framebuffer@9c000000 {
+ compatible = "simple-framebuffer";
+ reg = <0 0x9c000000 0 0x2300000>;
+- width = <1644>;
+- height = <3840>;
+- stride = <(1644 * 4)>;
++ /* pdx203 BL initializes in 2.5k mode, not 4k */
++ width = <1096>;
++ height = <2560>;
++ stride = <(1096 * 4)>;
+ format = "a8r8g8b8";
+ /*
+ * That's a lot of clocks, but it's necessary due
+diff --git a/arch/arm64/boot/dts/qcom/sm8350.dtsi b/arch/arm64/boot/dts/qcom/sm8350.dtsi
+index 3efdc03ed0f11..425af2c38a37f 100644
+--- a/arch/arm64/boot/dts/qcom/sm8350.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8350.dtsi
+@@ -907,7 +907,7 @@
+ };
+ };
+
+- gpi_dma0: dma-controller@900000 {
++ gpi_dma0: dma-controller@9800000 {
+ compatible = "qcom,sm8350-gpi-dma", "qcom,sm6350-gpi-dma";
+ reg = <0 0x09800000 0 0x60000>;
+ interrupts = <GIC_SPI 244 IRQ_TYPE_LEVEL_HIGH>,
+@@ -1638,7 +1638,7 @@
+ status = "disabled";
+ };
+
+- pcie1_phy: phy@1c0f000 {
++ pcie1_phy: phy@1c0e000 {
+ compatible = "qcom,sm8350-qmp-gen3x2-pcie-phy";
+ reg = <0 0x01c0e000 0 0x2000>;
+ clocks = <&gcc GCC_PCIE_1_AUX_CLK>,
+@@ -2140,7 +2140,7 @@
+ resets = <&gcc GCC_QUSB2PHY_SEC_BCR>;
+ };
+
+- usb_1_qmpphy: phy@88e9000 {
++ usb_1_qmpphy: phy@88e8000 {
+ compatible = "qcom,sm8350-qmp-usb3-dp-phy";
+ reg = <0 0x088e8000 0 0x3000>;
+
+diff --git a/arch/arm64/boot/dts/qcom/sm8550.dtsi b/arch/arm64/boot/dts/qcom/sm8550.dtsi
+index 558cbc4307080..d2b404736a8e4 100644
+--- a/arch/arm64/boot/dts/qcom/sm8550.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8550.dtsi
+@@ -1858,7 +1858,7 @@
+ <&apps_smmu 0x481 0x0>;
+ };
+
+- crypto: crypto@1de0000 {
++ crypto: crypto@1dfa000 {
+ compatible = "qcom,sm8550-qce", "qcom,sm8150-qce", "qcom,qce";
+ reg = <0x0 0x01dfa000 0x0 0x6000>;
+ dmas = <&cryptobam 4>, <&cryptobam 5>;
+@@ -2769,6 +2769,10 @@
+
+ resets = <&gcc GCC_USB30_PRIM_BCR>;
+
++ interconnects = <&aggre1_noc MASTER_USB3_0 0 &mc_virt SLAVE_EBI1 0>,
++ <&gem_noc MASTER_APPSS_PROC 0 &config_noc SLAVE_USB3_0 0>;
++ interconnect-names = "usb-ddr", "apps-usb";
++
+ status = "disabled";
+
+ usb_1_dwc3: usb@a600000 {
+@@ -2883,7 +2887,7 @@
+ #interrupt-cells = <4>;
+ };
+
+- tlmm: pinctrl@f000000 {
++ tlmm: pinctrl@f100000 {
+ compatible = "qcom,sm8550-tlmm";
+ reg = <0 0x0f100000 0 0x300000>;
+ interrupts = <GIC_SPI 208 IRQ_TYPE_LEVEL_HIGH>;
+@@ -3597,6 +3601,7 @@
+ qcom,drv-id = <2>;
+ qcom,tcs-config = <ACTIVE_TCS 3>, <SLEEP_TCS 2>,
+ <WAKE_TCS 2>, <CONTROL_TCS 0>;
++ power-domains = <&CLUSTER_PD>;
+
+ apps_bcm_voter: bcm-voter {
+ compatible = "qcom,bcm-voter";
+diff --git a/arch/arm64/boot/dts/renesas/ulcb-kf.dtsi b/arch/arm64/boot/dts/renesas/ulcb-kf.dtsi
+index efc80960380f4..c78b7a5c2e2aa 100644
+--- a/arch/arm64/boot/dts/renesas/ulcb-kf.dtsi
++++ b/arch/arm64/boot/dts/renesas/ulcb-kf.dtsi
+@@ -367,7 +367,7 @@
+ };
+
+ scif1_pins: scif1 {
+- groups = "scif1_data_b", "scif1_ctrl";
++ groups = "scif1_data_b";
+ function = "scif1";
+ };
+
+@@ -397,7 +397,6 @@
+ &scif1 {
+ pinctrl-0 = <&scif1_pins>;
+ pinctrl-names = "default";
+- uart-has-rtscts;
+
+ status = "okay";
+ };
+diff --git a/arch/arm64/boot/dts/rockchip/rk3566-anbernic-rgxx3.dtsi b/arch/arm64/boot/dts/rockchip/rk3566-anbernic-rgxx3.dtsi
+index 8fadd8afb1906..ad43fa199ca55 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3566-anbernic-rgxx3.dtsi
++++ b/arch/arm64/boot/dts/rockchip/rk3566-anbernic-rgxx3.dtsi
+@@ -716,7 +716,7 @@
+ status = "okay";
+
+ bluetooth {
+- compatible = "realtek,rtl8821cs-bt", "realtek,rtl8822cs-bt";
++ compatible = "realtek,rtl8821cs-bt", "realtek,rtl8723bs-bt";
+ device-wake-gpios = <&gpio4 4 GPIO_ACTIVE_HIGH>;
+ enable-gpios = <&gpio4 3 GPIO_ACTIVE_HIGH>;
+ host-wake-gpios = <&gpio4 5 GPIO_ACTIVE_HIGH>;
+diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
+index 3e4aee8f70c1b..30cdd366813fb 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
++++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
+@@ -133,6 +133,8 @@
+ reg = <0x11>;
+ clocks = <&cru I2S0_8CH_MCLKOUT>;
+ clock-names = "mclk";
++ assigned-clocks = <&cru I2S0_8CH_MCLKOUT>;
++ assigned-clock-rates = <12288000>;
+ #sound-dai-cells = <0>;
+
+ port {
+diff --git a/arch/arm64/boot/dts/ti/k3-am69-sk.dts b/arch/arm64/boot/dts/ti/k3-am69-sk.dts
+index bc49ba534790e..f364b7803115d 100644
+--- a/arch/arm64/boot/dts/ti/k3-am69-sk.dts
++++ b/arch/arm64/boot/dts/ti/k3-am69-sk.dts
+@@ -23,7 +23,7 @@
+ aliases {
+ serial2 = &main_uart8;
+ mmc1 = &main_sdhci1;
+- i2c0 = &main_i2c0;
++ i2c3 = &main_i2c0;
+ };
+
+ memory@80000000 {
+diff --git a/arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts b/arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts
+index 0d39d6b8cc0ca..63633e4f6c59f 100644
+--- a/arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts
++++ b/arch/arm64/boot/dts/ti/k3-j7200-common-proc-board.dts
+@@ -83,25 +83,25 @@
+ &wkup_pmx2 {
+ mcu_cpsw_pins_default: mcu-cpsw-pins-default {
+ pinctrl-single,pins = <
+- J721E_WKUP_IOPAD(0x0068, PIN_OUTPUT, 0) /* MCU_RGMII1_TX_CTL */
+- J721E_WKUP_IOPAD(0x006c, PIN_INPUT, 0) /* MCU_RGMII1_RX_CTL */
+- J721E_WKUP_IOPAD(0x0070, PIN_OUTPUT, 0) /* MCU_RGMII1_TD3 */
+- J721E_WKUP_IOPAD(0x0074, PIN_OUTPUT, 0) /* MCU_RGMII1_TD2 */
+- J721E_WKUP_IOPAD(0x0078, PIN_OUTPUT, 0) /* MCU_RGMII1_TD1 */
+- J721E_WKUP_IOPAD(0x007c, PIN_OUTPUT, 0) /* MCU_RGMII1_TD0 */
+- J721E_WKUP_IOPAD(0x0088, PIN_INPUT, 0) /* MCU_RGMII1_RD3 */
+- J721E_WKUP_IOPAD(0x008c, PIN_INPUT, 0) /* MCU_RGMII1_RD2 */
+- J721E_WKUP_IOPAD(0x0090, PIN_INPUT, 0) /* MCU_RGMII1_RD1 */
+- J721E_WKUP_IOPAD(0x0094, PIN_INPUT, 0) /* MCU_RGMII1_RD0 */
+- J721E_WKUP_IOPAD(0x0080, PIN_OUTPUT, 0) /* MCU_RGMII1_TXC */
+- J721E_WKUP_IOPAD(0x0084, PIN_INPUT, 0) /* MCU_RGMII1_RXC */
++ J721E_WKUP_IOPAD(0x0000, PIN_OUTPUT, 0) /* MCU_RGMII1_TX_CTL */
++ J721E_WKUP_IOPAD(0x0004, PIN_INPUT, 0) /* MCU_RGMII1_RX_CTL */
++ J721E_WKUP_IOPAD(0x0008, PIN_OUTPUT, 0) /* MCU_RGMII1_TD3 */
++ J721E_WKUP_IOPAD(0x000c, PIN_OUTPUT, 0) /* MCU_RGMII1_TD2 */
++ J721E_WKUP_IOPAD(0x0010, PIN_OUTPUT, 0) /* MCU_RGMII1_TD1 */
++ J721E_WKUP_IOPAD(0x0014, PIN_OUTPUT, 0) /* MCU_RGMII1_TD0 */
++ J721E_WKUP_IOPAD(0x0020, PIN_INPUT, 0) /* MCU_RGMII1_RD3 */
++ J721E_WKUP_IOPAD(0x0024, PIN_INPUT, 0) /* MCU_RGMII1_RD2 */
++ J721E_WKUP_IOPAD(0x0028, PIN_INPUT, 0) /* MCU_RGMII1_RD1 */
++ J721E_WKUP_IOPAD(0x002c, PIN_INPUT, 0) /* MCU_RGMII1_RD0 */
++ J721E_WKUP_IOPAD(0x0018, PIN_OUTPUT, 0) /* MCU_RGMII1_TXC */
++ J721E_WKUP_IOPAD(0x001c, PIN_INPUT, 0) /* MCU_RGMII1_RXC */
+ >;
+ };
+
+ mcu_mdio_pins_default: mcu-mdio1-pins-default {
+ pinctrl-single,pins = <
+- J721E_WKUP_IOPAD(0x009c, PIN_OUTPUT, 0) /* (L1) MCU_MDIO0_MDC */
+- J721E_WKUP_IOPAD(0x0098, PIN_INPUT, 0) /* (L4) MCU_MDIO0_MDIO */
++ J721E_WKUP_IOPAD(0x0034, PIN_OUTPUT, 0) /* (L1) MCU_MDIO0_MDC */
++ J721E_WKUP_IOPAD(0x0030, PIN_INPUT, 0) /* (L4) MCU_MDIO0_MDIO */
+ >;
+ };
+ };
+diff --git a/arch/arm64/boot/dts/ti/k3-j721e-beagleboneai64.dts b/arch/arm64/boot/dts/ti/k3-j721e-beagleboneai64.dts
+index 37c24b077b6aa..8a62ac263b89a 100644
+--- a/arch/arm64/boot/dts/ti/k3-j721e-beagleboneai64.dts
++++ b/arch/arm64/boot/dts/ti/k3-j721e-beagleboneai64.dts
+@@ -936,6 +936,7 @@
+ };
+
+ &mailbox0_cluster0 {
++ status = "okay";
+ interrupts = <436>;
+
+ mbox_mcu_r5fss0_core0: mbox-mcu-r5fss0-core0 {
+@@ -950,6 +951,7 @@
+ };
+
+ &mailbox0_cluster1 {
++ status = "okay";
+ interrupts = <432>;
+
+ mbox_main_r5fss0_core0: mbox-main-r5fss0-core0 {
+@@ -964,6 +966,7 @@
+ };
+
+ &mailbox0_cluster2 {
++ status = "okay";
+ interrupts = <428>;
+
+ mbox_main_r5fss1_core0: mbox-main-r5fss1-core0 {
+@@ -978,6 +981,7 @@
+ };
+
+ &mailbox0_cluster3 {
++ status = "okay";
+ interrupts = <424>;
+
+ mbox_c66_0: mbox-c66-0 {
+@@ -992,6 +996,7 @@
+ };
+
+ &mailbox0_cluster4 {
++ status = "okay";
+ interrupts = <420>;
+
+ mbox_c71_0: mbox-c71-0 {
+diff --git a/arch/arm64/boot/dts/ti/k3-j784s4-evm.dts b/arch/arm64/boot/dts/ti/k3-j784s4-evm.dts
+index f33815953e779..34e9bc89ac663 100644
+--- a/arch/arm64/boot/dts/ti/k3-j784s4-evm.dts
++++ b/arch/arm64/boot/dts/ti/k3-j784s4-evm.dts
+@@ -23,7 +23,7 @@
+ serial2 = &main_uart8;
+ mmc0 = &main_sdhci0;
+ mmc1 = &main_sdhci1;
+- i2c0 = &main_i2c0;
++ i2c3 = &main_i2c0;
+ };
+
+ memory@80000000 {
+@@ -141,28 +141,28 @@
+ };
+ };
+
+-&wkup_pmx0 {
++&wkup_pmx2 {
+ mcu_cpsw_pins_default: mcu-cpsw-pins-default {
+ pinctrl-single,pins = <
+- J784S4_WKUP_IOPAD(0x094, PIN_INPUT, 0) /* (A35) MCU_RGMII1_RD0 */
+- J784S4_WKUP_IOPAD(0x090, PIN_INPUT, 0) /* (B36) MCU_RGMII1_RD1 */
+- J784S4_WKUP_IOPAD(0x08c, PIN_INPUT, 0) /* (C36) MCU_RGMII1_RD2 */
+- J784S4_WKUP_IOPAD(0x088, PIN_INPUT, 0) /* (D36) MCU_RGMII1_RD3 */
+- J784S4_WKUP_IOPAD(0x084, PIN_INPUT, 0) /* (B37) MCU_RGMII1_RXC */
+- J784S4_WKUP_IOPAD(0x06c, PIN_INPUT, 0) /* (C37) MCU_RGMII1_RX_CTL */
+- J784S4_WKUP_IOPAD(0x07c, PIN_OUTPUT, 0) /* (D37) MCU_RGMII1_TD0 */
+- J784S4_WKUP_IOPAD(0x078, PIN_OUTPUT, 0) /* (D38) MCU_RGMII1_TD1 */
+- J784S4_WKUP_IOPAD(0x074, PIN_OUTPUT, 0) /* (E37) MCU_RGMII1_TD2 */
+- J784S4_WKUP_IOPAD(0x070, PIN_OUTPUT, 0) /* (E38) MCU_RGMII1_TD3 */
+- J784S4_WKUP_IOPAD(0x080, PIN_OUTPUT, 0) /* (E36) MCU_RGMII1_TXC */
+- J784S4_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /* (C38) MCU_RGMII1_TX_CTL */
++ J784S4_WKUP_IOPAD(0x02c, PIN_INPUT, 0) /* (A35) MCU_RGMII1_RD0 */
++ J784S4_WKUP_IOPAD(0x028, PIN_INPUT, 0) /* (B36) MCU_RGMII1_RD1 */
++ J784S4_WKUP_IOPAD(0x024, PIN_INPUT, 0) /* (C36) MCU_RGMII1_RD2 */
++ J784S4_WKUP_IOPAD(0x020, PIN_INPUT, 0) /* (D36) MCU_RGMII1_RD3 */
++ J784S4_WKUP_IOPAD(0x01c, PIN_INPUT, 0) /* (B37) MCU_RGMII1_RXC */
++ J784S4_WKUP_IOPAD(0x004, PIN_INPUT, 0) /* (C37) MCU_RGMII1_RX_CTL */
++ J784S4_WKUP_IOPAD(0x014, PIN_OUTPUT, 0) /* (D37) MCU_RGMII1_TD0 */
++ J784S4_WKUP_IOPAD(0x010, PIN_OUTPUT, 0) /* (D38) MCU_RGMII1_TD1 */
++ J784S4_WKUP_IOPAD(0x00c, PIN_OUTPUT, 0) /* (E37) MCU_RGMII1_TD2 */
++ J784S4_WKUP_IOPAD(0x008, PIN_OUTPUT, 0) /* (E38) MCU_RGMII1_TD3 */
++ J784S4_WKUP_IOPAD(0x018, PIN_OUTPUT, 0) /* (E36) MCU_RGMII1_TXC */
++ J784S4_WKUP_IOPAD(0x000, PIN_OUTPUT, 0) /* (C38) MCU_RGMII1_TX_CTL */
+ >;
+ };
+
+ mcu_mdio_pins_default: mcu-mdio-pins-default {
+ pinctrl-single,pins = <
+- J784S4_WKUP_IOPAD(0x09c, PIN_OUTPUT, 0) /* (A36) MCU_MDIO0_MDC */
+- J784S4_WKUP_IOPAD(0x098, PIN_INPUT, 0) /* (B35) MCU_MDIO0_MDIO */
++ J784S4_WKUP_IOPAD(0x034, PIN_OUTPUT, 0) /* (A36) MCU_MDIO0_MDC */
++ J784S4_WKUP_IOPAD(0x030, PIN_INPUT, 0) /* (B35) MCU_MDIO0_MDIO */
+ >;
+ };
+ };
+diff --git a/arch/arm64/boot/dts/ti/k3-j784s4-mcu-wakeup.dtsi b/arch/arm64/boot/dts/ti/k3-j784s4-mcu-wakeup.dtsi
+index f04fcb614cbe4..ed2b40369c59a 100644
+--- a/arch/arm64/boot/dts/ti/k3-j784s4-mcu-wakeup.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j784s4-mcu-wakeup.dtsi
+@@ -50,7 +50,34 @@
+ wkup_pmx0: pinctrl@4301c000 {
+ compatible = "pinctrl-single";
+ /* Proxy 0 addressing */
+- reg = <0x00 0x4301c000 0x00 0x178>;
++ reg = <0x00 0x4301c000 0x00 0x034>;
++ #pinctrl-cells = <1>;
++ pinctrl-single,register-width = <32>;
++ pinctrl-single,function-mask = <0xffffffff>;
++ };
++
++ wkup_pmx1: pinctrl@4301c038 {
++ compatible = "pinctrl-single";
++ /* Proxy 0 addressing */
++ reg = <0x00 0x4301c038 0x00 0x02c>;
++ #pinctrl-cells = <1>;
++ pinctrl-single,register-width = <32>;
++ pinctrl-single,function-mask = <0xffffffff>;
++ };
++
++ wkup_pmx2: pinctrl@4301c068 {
++ compatible = "pinctrl-single";
++ /* Proxy 0 addressing */
++ reg = <0x00 0x4301c068 0x00 0x120>;
++ #pinctrl-cells = <1>;
++ pinctrl-single,register-width = <32>;
++ pinctrl-single,function-mask = <0xffffffff>;
++ };
++
++ wkup_pmx3: pinctrl@4301c190 {
++ compatible = "pinctrl-single";
++ /* Proxy 0 addressing */
++ reg = <0x00 0x4301c190 0x00 0x004>;
+ #pinctrl-cells = <1>;
+ pinctrl-single,register-width = <32>;
+ pinctrl-single,function-mask = <0xffffffff>;
+diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
+index cd03819a3b686..cdf6a35e39944 100644
+--- a/arch/arm64/include/asm/fpsimdmacros.h
++++ b/arch/arm64/include/asm/fpsimdmacros.h
+@@ -316,12 +316,12 @@
+ _for n, 0, 15, _sve_str_p \n, \nxbase, \n - 16
+ cbz \save_ffr, 921f
+ _sve_rdffr 0
+- _sve_str_p 0, \nxbase
+- _sve_ldr_p 0, \nxbase, -16
+ b 922f
+ 921:
+- str xzr, [x\nxbase] // Zero out FFR
++ _sve_pfalse 0 // Zero out FFR
+ 922:
++ _sve_str_p 0, \nxbase
++ _sve_ldr_p 0, \nxbase, -16
+ mrs x\nxtmp, fpsr
+ str w\nxtmp, [\xpfpsr]
+ mrs x\nxtmp, fpcr
+diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
+index 2cfc810d0a5b1..10b407672c427 100644
+--- a/arch/arm64/kernel/signal.c
++++ b/arch/arm64/kernel/signal.c
+@@ -398,7 +398,7 @@ static int restore_tpidr2_context(struct user_ctxs *user)
+
+ __get_user_error(tpidr2_el0, &user->tpidr2->tpidr2, err);
+ if (!err)
+- current->thread.tpidr2_el0 = tpidr2_el0;
++ write_sysreg_s(tpidr2_el0, SYS_TPIDR2_EL0);
+
+ return err;
+ }
+diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
+index a27e264bdaa5a..63a637fdf6c28 100644
+--- a/arch/loongarch/Makefile
++++ b/arch/loongarch/Makefile
+@@ -107,7 +107,7 @@ KBUILD_CFLAGS += -isystem $(shell $(CC) -print-file-name=include)
+ KBUILD_LDFLAGS += -m $(ld-emul)
+
+ ifdef CONFIG_LOONGARCH
+-CHECKFLAGS += $(shell $(CC) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
++CHECKFLAGS += $(shell $(CC) $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
+ grep -E -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \
+ sed -e "s/^\#define /-D'/" -e "s/ /'='/" -e "s/$$/'/" -e 's/\$$/&&/g')
+ endif
+diff --git a/arch/mips/Makefile b/arch/mips/Makefile
+index a7a4ee66a9d37..ef7b05ae92ceb 100644
+--- a/arch/mips/Makefile
++++ b/arch/mips/Makefile
+@@ -346,7 +346,7 @@ KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
+ KBUILD_LDFLAGS += -m $(ld-emul)
+
+ ifdef CONFIG_MIPS
+-CHECKFLAGS += $(shell $(CC) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
++CHECKFLAGS += $(shell $(CC) $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
+ grep -E -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \
+ sed -e "s/^\#define /-D'/" -e "s/ /'='/" -e "s/$$/'/" -e 's/\$$/&&/g')
+ endif
+diff --git a/arch/mips/alchemy/devboards/db1000.c b/arch/mips/alchemy/devboards/db1000.c
+index 2c52ee27b4f25..79d66faa84828 100644
+--- a/arch/mips/alchemy/devboards/db1000.c
++++ b/arch/mips/alchemy/devboards/db1000.c
+@@ -381,13 +381,21 @@ static struct platform_device db1100_mmc1_dev = {
+ static struct ads7846_platform_data db1100_touch_pd = {
+ .model = 7846,
+ .vref_mv = 3300,
+- .gpio_pendown = 21,
+ };
+
+ static struct spi_gpio_platform_data db1100_spictl_pd = {
+ .num_chipselect = 1,
+ };
+
++static struct gpiod_lookup_table db1100_touch_gpio_table = {
++ .dev_id = "spi0.0",
++ .table = {
++ GPIO_LOOKUP("alchemy-gpio2", 21,
++ "pendown", GPIO_ACTIVE_LOW),
++ { }
++ },
++};
++
+ static struct spi_board_info db1100_spi_info[] __initdata = {
+ [0] = {
+ .modalias = "ads7846",
+@@ -474,6 +482,7 @@ int __init db1000_dev_setup(void)
+ pfc |= (1 << 0); /* SSI0 pins as GPIOs */
+ alchemy_wrsys(pfc, AU1000_SYS_PINFUNC);
+
++ gpiod_add_lookup_table(&db1100_touch_gpio_table);
+ spi_register_board_info(db1100_spi_info,
+ ARRAY_SIZE(db1100_spi_info));
+
+diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
+index 6aaf8dc60610d..2a54fadbeaf51 100644
+--- a/arch/powerpc/Kconfig.debug
++++ b/arch/powerpc/Kconfig.debug
+@@ -240,7 +240,7 @@ config PPC_EARLY_DEBUG_40x
+
+ config PPC_EARLY_DEBUG_CPM
+ bool "Early serial debugging for Freescale CPM-based serial ports"
+- depends on SERIAL_CPM
++ depends on SERIAL_CPM=y
+ help
+ Select this to enable early debugging for Freescale chips
+ using a CPM-based serial port. This assumes that the bootwrapper
+diff --git a/arch/powerpc/boot/dts/turris1x.dts b/arch/powerpc/boot/dts/turris1x.dts
+index 6612160c19d59..dff1ea074d9d9 100644
+--- a/arch/powerpc/boot/dts/turris1x.dts
++++ b/arch/powerpc/boot/dts/turris1x.dts
+@@ -476,12 +476,12 @@
+ * channel 1 (but only USB 2.0 subset) to USB 2.0 pins on mPCIe
+ * slot 1 (CN5), channels 2 and 3 to connector P600.
+ *
+- * P2020 PCIe Root Port uses 1MB of PCIe MEM and xHCI controller
++ * P2020 PCIe Root Port does not use PCIe MEM and xHCI controller
+ * uses 64kB + 8kB of PCIe MEM. No PCIe IO is used or required.
+- * So allocate 2MB of PCIe MEM for this PCIe bus.
++ * So allocate 128kB of PCIe MEM for this PCIe bus.
+ */
+ reg = <0 0xffe08000 0 0x1000>;
+- ranges = <0x02000000 0x0 0xc0000000 0 0xc0000000 0x0 0x00200000>, /* MEM */
++ ranges = <0x02000000 0x0 0xc0000000 0 0xc0000000 0x0 0x00020000>, /* MEM */
+ <0x01000000 0x0 0x00000000 0 0xffc20000 0x0 0x00010000>; /* IO */
+
+ pcie@0 {
+diff --git a/arch/powerpc/include/asm/nmi.h b/arch/powerpc/include/asm/nmi.h
+index c3c7adef74de0..43bfd4de868f8 100644
+--- a/arch/powerpc/include/asm/nmi.h
++++ b/arch/powerpc/include/asm/nmi.h
+@@ -5,10 +5,10 @@
+ #ifdef CONFIG_PPC_WATCHDOG
+ extern void arch_touch_nmi_watchdog(void);
+ long soft_nmi_interrupt(struct pt_regs *regs);
+-void watchdog_nmi_set_timeout_pct(u64 pct);
++void watchdog_hardlockup_set_timeout_pct(u64 pct);
+ #else
+ static inline void arch_touch_nmi_watchdog(void) {}
+-static inline void watchdog_nmi_set_timeout_pct(u64 pct) {}
++static inline void watchdog_hardlockup_set_timeout_pct(u64 pct) {}
+ #endif
+
+ #ifdef CONFIG_NMI_IPI
+diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
+index e34c72285b4e9..f3fc5fe919d96 100644
+--- a/arch/powerpc/kernel/interrupt.c
++++ b/arch/powerpc/kernel/interrupt.c
+@@ -368,7 +368,6 @@ void preempt_schedule_irq(void);
+
+ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
+ {
+- unsigned long flags;
+ unsigned long ret = 0;
+ unsigned long kuap;
+ bool stack_store = read_thread_flags() & _TIF_EMULATE_STACK_STORE;
+@@ -392,7 +391,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
+
+ kuap = kuap_get_and_assert_locked();
+
+- local_irq_save(flags);
++ local_irq_disable();
+
+ if (!arch_irq_disabled_regs(regs)) {
+ /* Returning to a kernel context with local irqs enabled. */
+diff --git a/arch/powerpc/kernel/ppc_save_regs.S b/arch/powerpc/kernel/ppc_save_regs.S
+index 49813f9824681..a9b9c32d0c1ff 100644
+--- a/arch/powerpc/kernel/ppc_save_regs.S
++++ b/arch/powerpc/kernel/ppc_save_regs.S
+@@ -31,10 +31,10 @@ _GLOBAL(ppc_save_regs)
+ lbz r0,PACAIRQSOFTMASK(r13)
+ PPC_STL r0,SOFTE(r3)
+ #endif
+- /* go up one stack frame for SP */
+- PPC_LL r4,0(r1)
+- PPC_STL r4,GPR1(r3)
++ /* store current SP */
++ PPC_STL r1,GPR1(r3)
+ /* get caller's LR */
++ PPC_LL r4,0(r1)
+ PPC_LL r0,LRSAVE(r4)
+ PPC_STL r0,_LINK(r3)
+ mflr r0
+diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
+index c114c7f25645c..7a718ed32b277 100644
+--- a/arch/powerpc/kernel/signal_32.c
++++ b/arch/powerpc/kernel/signal_32.c
+@@ -264,8 +264,9 @@ static void prepare_save_user_regs(int ctx_has_vsx_region)
+ #endif
+ }
+
+-static int __unsafe_save_user_regs(struct pt_regs *regs, struct mcontext __user *frame,
+- struct mcontext __user *tm_frame, int ctx_has_vsx_region)
++static __always_inline int
++__unsafe_save_user_regs(struct pt_regs *regs, struct mcontext __user *frame,
++ struct mcontext __user *tm_frame, int ctx_has_vsx_region)
+ {
+ unsigned long msr = regs->msr;
+
+@@ -364,8 +365,9 @@ static void prepare_save_tm_user_regs(void)
+ current->thread.ckvrsave = mfspr(SPRN_VRSAVE);
+ }
+
+-static int save_tm_user_regs_unsafe(struct pt_regs *regs, struct mcontext __user *frame,
+- struct mcontext __user *tm_frame, unsigned long msr)
++static __always_inline int
++save_tm_user_regs_unsafe(struct pt_regs *regs, struct mcontext __user *frame,
++ struct mcontext __user *tm_frame, unsigned long msr)
+ {
+ /* Save both sets of general registers */
+ unsafe_save_general_regs(¤t->thread.ckpt_regs, frame, failed);
+@@ -444,8 +446,9 @@ failed:
+ #else
+ static void prepare_save_tm_user_regs(void) { }
+
+-static int save_tm_user_regs_unsafe(struct pt_regs *regs, struct mcontext __user *frame,
+- struct mcontext __user *tm_frame, unsigned long msr)
++static __always_inline int
++save_tm_user_regs_unsafe(struct pt_regs *regs, struct mcontext __user *frame,
++ struct mcontext __user *tm_frame, unsigned long msr)
+ {
+ return 0;
+ }
+diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
+index 265801a3e94cf..6903a72222732 100644
+--- a/arch/powerpc/kernel/smp.c
++++ b/arch/powerpc/kernel/smp.c
+@@ -1605,6 +1605,7 @@ static void add_cpu_to_masks(int cpu)
+ }
+
+ /* Activate a secondary processor. */
++__no_stack_protector
+ void start_secondary(void *unused)
+ {
+ unsigned int cpu = raw_smp_processor_id();
+diff --git a/arch/powerpc/kernel/vdso/Makefile b/arch/powerpc/kernel/vdso/Makefile
+index 4c3f34485f08f..23d3caf27d6d4 100644
+--- a/arch/powerpc/kernel/vdso/Makefile
++++ b/arch/powerpc/kernel/vdso/Makefile
+@@ -54,7 +54,7 @@ KASAN_SANITIZE := n
+ KCSAN_SANITIZE := n
+
+ ccflags-y := -fno-common -fno-builtin
+-ldflags-y := -Wl,--hash-style=both -nostdlib -shared -z noexecstack
++ldflags-y := -Wl,--hash-style=both -nostdlib -shared -z noexecstack $(CLANG_FLAGS)
+ ldflags-$(CONFIG_LD_IS_LLD) += $(call cc-option,--ld-path=$(LD),-fuse-ld=lld)
+ # Filter flags that clang will warn are unused for linking
+ ldflags-y += $(filter-out $(CC_AUTO_VAR_INIT_ZERO_ENABLER) $(CC_FLAGS_FTRACE) -Wa$(comma)%, $(KBUILD_CFLAGS))
+diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
+index dbcc4a793f0b9..edb2dd1f53ebc 100644
+--- a/arch/powerpc/kernel/watchdog.c
++++ b/arch/powerpc/kernel/watchdog.c
+@@ -438,7 +438,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
+ {
+ int cpu = smp_processor_id();
+
+- if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
++ if (!(watchdog_enabled & WATCHDOG_HARDLOCKUP_ENABLED))
+ return HRTIMER_NORESTART;
+
+ if (!cpumask_test_cpu(cpu, &watchdog_cpumask))
+@@ -479,7 +479,7 @@ static void start_watchdog(void *arg)
+ return;
+ }
+
+- if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
++ if (!(watchdog_enabled & WATCHDOG_HARDLOCKUP_ENABLED))
+ return;
+
+ if (!cpumask_test_cpu(cpu, &watchdog_cpumask))
+@@ -546,7 +546,7 @@ static void watchdog_calc_timeouts(void)
+ wd_timer_period_ms = watchdog_thresh * 1000 * 2 / 5;
+ }
+
+-void watchdog_nmi_stop(void)
++void watchdog_hardlockup_stop(void)
+ {
+ int cpu;
+
+@@ -554,7 +554,7 @@ void watchdog_nmi_stop(void)
+ stop_watchdog_on_cpu(cpu);
+ }
+
+-void watchdog_nmi_start(void)
++void watchdog_hardlockup_start(void)
+ {
+ int cpu;
+
+@@ -566,7 +566,7 @@ void watchdog_nmi_start(void)
+ /*
+ * Invoked from core watchdog init.
+ */
+-int __init watchdog_nmi_probe(void)
++int __init watchdog_hardlockup_probe(void)
+ {
+ int err;
+
+@@ -582,7 +582,7 @@ int __init watchdog_nmi_probe(void)
+ }
+
+ #ifdef CONFIG_PPC_PSERIES
+-void watchdog_nmi_set_timeout_pct(u64 pct)
++void watchdog_hardlockup_set_timeout_pct(u64 pct)
+ {
+ pr_info("Set the NMI watchdog timeout factor to %llu%%\n", pct);
+ WRITE_ONCE(wd_timeout_pct, pct);
+diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
+index 2297aa764ecdb..e8db8c8efe359 100644
+--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
++++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
+@@ -745,9 +745,9 @@ static void free_pud_table(pud_t *pud_start, p4d_t *p4d)
+ }
+
+ static void remove_pte_table(pte_t *pte_start, unsigned long addr,
+- unsigned long end)
++ unsigned long end, bool direct)
+ {
+- unsigned long next;
++ unsigned long next, pages = 0;
+ pte_t *pte;
+
+ pte = pte_start + pte_index(addr);
+@@ -769,13 +769,16 @@ static void remove_pte_table(pte_t *pte_start, unsigned long addr,
+ }
+
+ pte_clear(&init_mm, addr, pte);
++ pages++;
+ }
++ if (direct)
++ update_page_count(mmu_virtual_psize, -pages);
+ }
+
+ static void __meminit remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
+- unsigned long end)
++ unsigned long end, bool direct)
+ {
+- unsigned long next;
++ unsigned long next, pages = 0;
+ pte_t *pte_base;
+ pmd_t *pmd;
+
+@@ -793,19 +796,22 @@ static void __meminit remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
+ continue;
+ }
+ pte_clear(&init_mm, addr, (pte_t *)pmd);
++ pages++;
+ continue;
+ }
+
+ pte_base = (pte_t *)pmd_page_vaddr(*pmd);
+- remove_pte_table(pte_base, addr, next);
++ remove_pte_table(pte_base, addr, next, direct);
+ free_pte_table(pte_base, pmd);
+ }
++ if (direct)
++ update_page_count(MMU_PAGE_2M, -pages);
+ }
+
+ static void __meminit remove_pud_table(pud_t *pud_start, unsigned long addr,
+- unsigned long end)
++ unsigned long end, bool direct)
+ {
+- unsigned long next;
++ unsigned long next, pages = 0;
+ pmd_t *pmd_base;
+ pud_t *pud;
+
+@@ -823,16 +829,20 @@ static void __meminit remove_pud_table(pud_t *pud_start, unsigned long addr,
+ continue;
+ }
+ pte_clear(&init_mm, addr, (pte_t *)pud);
++ pages++;
+ continue;
+ }
+
+ pmd_base = pud_pgtable(*pud);
+- remove_pmd_table(pmd_base, addr, next);
++ remove_pmd_table(pmd_base, addr, next, direct);
+ free_pmd_table(pmd_base, pud);
+ }
++ if (direct)
++ update_page_count(MMU_PAGE_1G, -pages);
+ }
+
+-static void __meminit remove_pagetable(unsigned long start, unsigned long end)
++static void __meminit remove_pagetable(unsigned long start, unsigned long end,
++ bool direct)
+ {
+ unsigned long addr, next;
+ pud_t *pud_base;
+@@ -861,7 +871,7 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
+ }
+
+ pud_base = p4d_pgtable(*p4d);
+- remove_pud_table(pud_base, addr, next);
++ remove_pud_table(pud_base, addr, next, direct);
+ free_pud_table(pud_base, p4d);
+ }
+
+@@ -884,7 +894,7 @@ int __meminit radix__create_section_mapping(unsigned long start,
+
+ int __meminit radix__remove_section_mapping(unsigned long start, unsigned long end)
+ {
+- remove_pagetable(start, end);
++ remove_pagetable(start, end, true);
+ return 0;
+ }
+ #endif /* CONFIG_MEMORY_HOTPLUG */
+@@ -920,7 +930,7 @@ int __meminit radix__vmemmap_create_mapping(unsigned long start,
+ #ifdef CONFIG_MEMORY_HOTPLUG
+ void __meminit radix__vmemmap_remove_mapping(unsigned long start, unsigned long page_size)
+ {
+- remove_pagetable(start, start + page_size);
++ remove_pagetable(start, start + page_size, false);
+ }
+ #endif
+ #endif
+diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
+index 05b0d584e50b8..fe1b83020e0df 100644
+--- a/arch/powerpc/mm/init_64.c
++++ b/arch/powerpc/mm/init_64.c
+@@ -189,7 +189,7 @@ static bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long star
+ unsigned long nr_pfn = page_size / sizeof(struct page);
+ unsigned long start_pfn = page_to_pfn((struct page *)start);
+
+- if ((start_pfn + nr_pfn) > altmap->end_pfn)
++ if ((start_pfn + nr_pfn - 1) > altmap->end_pfn)
+ return true;
+
+ if (start_pfn < altmap->base_pfn)
+diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
+index 7195133b26bb9..59882da3e7425 100644
+--- a/arch/powerpc/platforms/powernv/pci-sriov.c
++++ b/arch/powerpc/platforms/powernv/pci-sriov.c
+@@ -594,12 +594,12 @@ static void pnv_pci_sriov_disable(struct pci_dev *pdev)
+ struct pnv_iov_data *iov;
+
+ iov = pnv_iov_get(pdev);
+- num_vfs = iov->num_vfs;
+- base_pe = iov->vf_pe_arr[0].pe_number;
+-
+ if (WARN_ON(!iov))
+ return;
+
++ num_vfs = iov->num_vfs;
++ base_pe = iov->vf_pe_arr[0].pe_number;
++
+ /* Release VF PEs */
+ pnv_ioda_release_vf_PE(pdev);
+
+diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
+index 0072682531d80..b664838008c12 100644
+--- a/arch/powerpc/platforms/powernv/vas-window.c
++++ b/arch/powerpc/platforms/powernv/vas-window.c
+@@ -1310,8 +1310,8 @@ int vas_win_close(struct vas_window *vwin)
+ /* if send window, drop reference to matching receive window */
+ if (window->tx_win) {
+ if (window->user_win) {
+- put_vas_user_win_ref(&vwin->task_ref);
+ mm_context_remove_vas_window(vwin->task_ref.mm);
++ put_vas_user_win_ref(&vwin->task_ref);
+ }
+ put_rx_win(window->rxwin);
+ }
+diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
+index 6f30113b5468e..cd632ba9ebfff 100644
+--- a/arch/powerpc/platforms/pseries/mobility.c
++++ b/arch/powerpc/platforms/pseries/mobility.c
+@@ -750,7 +750,7 @@ static int pseries_migrate_partition(u64 handle)
+ goto out;
+
+ if (factor)
+- watchdog_nmi_set_timeout_pct(factor);
++ watchdog_hardlockup_set_timeout_pct(factor);
+
+ ret = pseries_suspend(handle);
+ if (ret == 0) {
+@@ -766,7 +766,7 @@ static int pseries_migrate_partition(u64 handle)
+ pseries_cancel_migration(handle, ret);
+
+ if (factor)
+- watchdog_nmi_set_timeout_pct(0);
++ watchdog_hardlockup_set_timeout_pct(0);
+
+ out:
+ vas_migration_handler(VAS_RESUME);
+diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
+index 513180467562b..9a44a98ba3420 100644
+--- a/arch/powerpc/platforms/pseries/vas.c
++++ b/arch/powerpc/platforms/pseries/vas.c
+@@ -507,8 +507,8 @@ static int vas_deallocate_window(struct vas_window *vwin)
+ vascaps[win->win_type].nr_open_windows--;
+ mutex_unlock(&vas_pseries_mutex);
+
+- put_vas_user_win_ref(&vwin->task_ref);
+ mm_context_remove_vas_window(vwin->task_ref.mm);
++ put_vas_user_win_ref(&vwin->task_ref);
+
+ kfree(win);
+ return 0;
+diff --git a/arch/riscv/kernel/hibernate-asm.S b/arch/riscv/kernel/hibernate-asm.S
+index effaf5ca5da0e..f3e62e766cb29 100644
+--- a/arch/riscv/kernel/hibernate-asm.S
++++ b/arch/riscv/kernel/hibernate-asm.S
+@@ -28,7 +28,6 @@ ENTRY(__hibernate_cpu_resume)
+
+ REG_L a0, hibernate_cpu_context
+
+- suspend_restore_csrs
+ suspend_restore_regs
+
+ /* Return zero value. */
+diff --git a/arch/riscv/kernel/hibernate.c b/arch/riscv/kernel/hibernate.c
+index 264b2dcdd67e3..671b686c01587 100644
+--- a/arch/riscv/kernel/hibernate.c
++++ b/arch/riscv/kernel/hibernate.c
+@@ -80,7 +80,6 @@ int pfn_is_nosave(unsigned long pfn)
+
+ void notrace save_processor_state(void)
+ {
+- WARN_ON(num_online_cpus() != 1);
+ }
+
+ void notrace restore_processor_state(void)
+diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c
+index c976a21cd4bd5..194f166b2cc40 100644
+--- a/arch/riscv/kernel/probes/uprobes.c
++++ b/arch/riscv/kernel/probes/uprobes.c
+@@ -67,6 +67,7 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+ struct uprobe_task *utask = current->utask;
+
+ WARN_ON_ONCE(current->thread.bad_cause != UPROBE_TRAP_NR);
++ current->thread.bad_cause = utask->autask.saved_cause;
+
+ instruction_pointer_set(regs, utask->vaddr + auprobe->insn_size);
+
+@@ -102,6 +103,7 @@ void arch_uprobe_abort_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
+ {
+ struct uprobe_task *utask = current->utask;
+
++ current->thread.bad_cause = utask->autask.saved_cause;
+ /*
+ * Task has received a fatal signal, so reset back to probbed
+ * address.
+diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
+index 445a4efee267d..6765f1ce79625 100644
+--- a/arch/riscv/kernel/smpboot.c
++++ b/arch/riscv/kernel/smpboot.c
+@@ -161,10 +161,11 @@ asmlinkage __visible void smp_callin(void)
+ mmgrab(mm);
+ current->active_mm = mm;
+
+- riscv_ipi_enable();
+-
+ store_cpu_topology(curr_cpuid);
+ notify_cpu_starting(curr_cpuid);
++
++ riscv_ipi_enable();
++
+ numa_add_cpu(curr_cpuid);
+ set_cpu_online(curr_cpuid, 1);
+ probe_vendor_features(curr_cpuid);
+diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
+index 4fa420faa7808..1306149aad57a 100644
+--- a/arch/riscv/mm/init.c
++++ b/arch/riscv/mm/init.c
+@@ -267,7 +267,6 @@ static void __init setup_bootmem(void)
+ dma_contiguous_reserve(dma32_phys_limit);
+ if (IS_ENABLED(CONFIG_64BIT))
+ hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+- memblock_allow_resize();
+ }
+
+ #ifdef CONFIG_MMU
+@@ -1370,6 +1369,9 @@ void __init paging_init(void)
+ {
+ setup_bootmem();
+ setup_vm_final();
++
++ /* Depend on that Linear Mapping is ready */
++ memblock_allow_resize();
+ }
+
+ void __init misc_mem_init(void)
+diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
+index 807fa9da1e721..3c65b8258ae67 100644
+--- a/arch/s390/kvm/diag.c
++++ b/arch/s390/kvm/diag.c
+@@ -166,6 +166,7 @@ static int diag9c_forwarding_overrun(void)
+ static int __diag_time_slice_end_directed(struct kvm_vcpu *vcpu)
+ {
+ struct kvm_vcpu *tcpu;
++ int tcpu_cpu;
+ int tid;
+
+ tid = vcpu->run->s.regs.gprs[(vcpu->arch.sie_block->ipa & 0xf0) >> 4];
+@@ -181,14 +182,15 @@ static int __diag_time_slice_end_directed(struct kvm_vcpu *vcpu)
+ goto no_yield;
+
+ /* target guest VCPU already running */
+- if (READ_ONCE(tcpu->cpu) >= 0) {
++ tcpu_cpu = READ_ONCE(tcpu->cpu);
++ if (tcpu_cpu >= 0) {
+ if (!diag9c_forwarding_hz || diag9c_forwarding_overrun())
+ goto no_yield;
+
+ /* target host CPU already running */
+- if (!vcpu_is_preempted(tcpu->cpu))
++ if (!vcpu_is_preempted(tcpu_cpu))
+ goto no_yield;
+- smp_yield_cpu(tcpu->cpu);
++ smp_yield_cpu(tcpu_cpu);
+ VCPU_EVENT(vcpu, 5,
+ "diag time slice end directed to %d: yield forwarded",
+ tid);
+diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
+index 17b81659cdb20..6700196964648 100644
+--- a/arch/s390/kvm/kvm-s390.c
++++ b/arch/s390/kvm/kvm-s390.c
+@@ -2156,6 +2156,10 @@ static unsigned long kvm_s390_next_dirty_cmma(struct kvm_memslots *slots,
+ ms = container_of(mnode, struct kvm_memory_slot, gfn_node[slots->node_idx]);
+ ofs = 0;
+ }
++
++ if (cur_gfn < ms->base_gfn)
++ ofs = 0;
++
+ ofs = find_next_bit(kvm_second_dirty_bitmap(ms), ms->npages, ofs);
+ while (ofs >= ms->npages && (mnode = rb_next(mnode))) {
+ ms = container_of(mnode, struct kvm_memory_slot, gfn_node[slots->node_idx]);
+diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
+index 8d6b765abf29b..0333ee482eb89 100644
+--- a/arch/s390/kvm/vsie.c
++++ b/arch/s390/kvm/vsie.c
+@@ -177,7 +177,8 @@ static int setup_apcb00(struct kvm_vcpu *vcpu, unsigned long *apcb_s,
+ sizeof(struct kvm_s390_apcb0)))
+ return -EFAULT;
+
+- bitmap_and(apcb_s, apcb_s, apcb_h, sizeof(struct kvm_s390_apcb0));
++ bitmap_and(apcb_s, apcb_s, apcb_h,
++ BITS_PER_BYTE * sizeof(struct kvm_s390_apcb0));
+
+ return 0;
+ }
+@@ -203,7 +204,8 @@ static int setup_apcb11(struct kvm_vcpu *vcpu, unsigned long *apcb_s,
+ sizeof(struct kvm_s390_apcb1)))
+ return -EFAULT;
+
+- bitmap_and(apcb_s, apcb_s, apcb_h, sizeof(struct kvm_s390_apcb1));
++ bitmap_and(apcb_s, apcb_s, apcb_h,
++ BITS_PER_BYTE * sizeof(struct kvm_s390_apcb1));
+
+ return 0;
+ }
+diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
+index 5b22c6e24528a..b9dcb4ae6c59a 100644
+--- a/arch/s390/mm/vmem.c
++++ b/arch/s390/mm/vmem.c
+@@ -667,7 +667,15 @@ static void __init memblock_region_swap(void *a, void *b, int size)
+
+ #ifdef CONFIG_KASAN
+ #define __sha(x) ((unsigned long)kasan_mem_to_shadow((void *)x))
++
++static inline int set_memory_kasan(unsigned long start, unsigned long end)
++{
++ start = PAGE_ALIGN_DOWN(__sha(start));
++ end = PAGE_ALIGN(__sha(end));
++ return set_memory_rwnx(start, (end - start) >> PAGE_SHIFT);
++}
+ #endif
++
+ /*
+ * map whole physical memory to virtual memory (identity mapping)
+ * we reserve enough space in the vmalloc area for vmemmap to hotplug
+@@ -737,10 +745,8 @@ void __init vmem_map_init(void)
+ }
+
+ #ifdef CONFIG_KASAN
+- for_each_mem_range(i, &base, &end) {
+- set_memory_rwnx(__sha(base),
+- (__sha(end) - __sha(base)) >> PAGE_SHIFT);
+- }
++ for_each_mem_range(i, &base, &end)
++ set_memory_kasan(base, end);
+ #endif
+ set_memory_rox((unsigned long)_stext,
+ (unsigned long)(_etext - _stext) >> PAGE_SHIFT);
+diff --git a/arch/sh/boards/mach-dreamcast/irq.c b/arch/sh/boards/mach-dreamcast/irq.c
+index cc06e4cdb4cdf..0eec82fb85e7c 100644
+--- a/arch/sh/boards/mach-dreamcast/irq.c
++++ b/arch/sh/boards/mach-dreamcast/irq.c
+@@ -108,13 +108,13 @@ int systemasic_irq_demux(int irq)
+ __u32 j, bit;
+
+ switch (irq) {
+- case 13:
++ case 13 + 16:
+ level = 0;
+ break;
+- case 11:
++ case 11 + 16:
+ level = 1;
+ break;
+- case 9:
++ case 9 + 16:
+ level = 2;
+ break;
+ default:
+diff --git a/arch/sh/boards/mach-highlander/setup.c b/arch/sh/boards/mach-highlander/setup.c
+index 533393d779c2b..01565660a6695 100644
+--- a/arch/sh/boards/mach-highlander/setup.c
++++ b/arch/sh/boards/mach-highlander/setup.c
+@@ -389,10 +389,10 @@ static unsigned char irl2irq[HL_NR_IRL];
+
+ static int highlander_irq_demux(int irq)
+ {
+- if (irq >= HL_NR_IRL || irq < 0 || !irl2irq[irq])
++ if (irq >= HL_NR_IRL + 16 || irq < 16 || !irl2irq[irq - 16])
+ return irq;
+
+- return irl2irq[irq];
++ return irl2irq[irq - 16];
+ }
+
+ static void __init highlander_init_irq(void)
+diff --git a/arch/sh/boards/mach-r2d/irq.c b/arch/sh/boards/mach-r2d/irq.c
+index e34f81e9ae813..d0a54a9adbce2 100644
+--- a/arch/sh/boards/mach-r2d/irq.c
++++ b/arch/sh/boards/mach-r2d/irq.c
+@@ -117,10 +117,10 @@ static unsigned char irl2irq[R2D_NR_IRL];
+
+ int rts7751r2d_irq_demux(int irq)
+ {
+- if (irq >= R2D_NR_IRL || irq < 0 || !irl2irq[irq])
++ if (irq >= R2D_NR_IRL + 16 || irq < 16 || !irl2irq[irq - 16])
+ return irq;
+
+- return irl2irq[irq];
++ return irl2irq[irq - 16];
+ }
+
+ /*
+diff --git a/arch/sh/cchips/Kconfig b/arch/sh/cchips/Kconfig
+index efde2edb56278..9659a0bc58dec 100644
+--- a/arch/sh/cchips/Kconfig
++++ b/arch/sh/cchips/Kconfig
+@@ -29,9 +29,9 @@ endchoice
+ config HD64461_IRQ
+ int "HD64461 IRQ"
+ depends on HD64461
+- default "36"
++ default "52"
+ help
+- The default setting of the HD64461 IRQ is 36.
++ The default setting of the HD64461 IRQ is 52.
+
+ Do not change this unless you know what you are doing.
+
+diff --git a/arch/sh/drivers/dma/dma-sh.c b/arch/sh/drivers/dma/dma-sh.c
+index 96c626c2cd0a4..306fba1564e5e 100644
+--- a/arch/sh/drivers/dma/dma-sh.c
++++ b/arch/sh/drivers/dma/dma-sh.c
+@@ -18,6 +18,18 @@
+ #include <cpu/dma-register.h>
+ #include <cpu/dma.h>
+
++/*
++ * Some of the SoCs feature two DMAC modules. In such a case, the channels are
++ * distributed equally among them.
++ */
++#ifdef SH_DMAC_BASE1
++#define SH_DMAC_NR_MD_CH (CONFIG_NR_ONCHIP_DMA_CHANNELS / 2)
++#else
++#define SH_DMAC_NR_MD_CH CONFIG_NR_ONCHIP_DMA_CHANNELS
++#endif
++
++#define SH_DMAC_CH_SZ 0x10
++
+ /*
+ * Define the default configuration for dual address memory-memory transfer.
+ * The 0x400 value represents auto-request, external->external.
+@@ -29,7 +41,7 @@ static unsigned long dma_find_base(unsigned int chan)
+ unsigned long base = SH_DMAC_BASE0;
+
+ #ifdef SH_DMAC_BASE1
+- if (chan >= 6)
++ if (chan >= SH_DMAC_NR_MD_CH)
+ base = SH_DMAC_BASE1;
+ #endif
+
+@@ -40,13 +52,13 @@ static unsigned long dma_base_addr(unsigned int chan)
+ {
+ unsigned long base = dma_find_base(chan);
+
+- /* Normalize offset calculation */
+- if (chan >= 9)
+- chan -= 6;
+- if (chan >= 4)
+- base += 0x10;
++ chan = (chan % SH_DMAC_NR_MD_CH) * SH_DMAC_CH_SZ;
++
++ /* DMAOR is placed inside the channel register space. Step over it. */
++ if (chan >= DMAOR)
++ base += SH_DMAC_CH_SZ;
+
+- return base + (chan * 0x10);
++ return base + chan;
+ }
+
+ #ifdef CONFIG_SH_DMA_IRQ_MULTI
+@@ -250,12 +262,11 @@ static int sh_dmac_get_dma_residue(struct dma_channel *chan)
+ #define NR_DMAOR 1
+ #endif
+
+-/*
+- * DMAOR bases are broken out amongst channel groups. DMAOR0 manages
+- * channels 0 - 5, DMAOR1 6 - 11 (optional).
+- */
+-#define dmaor_read_reg(n) __raw_readw(dma_find_base((n)*6))
+-#define dmaor_write_reg(n, data) __raw_writew(data, dma_find_base(n)*6)
++#define dmaor_read_reg(n) __raw_readw(dma_find_base((n) * \
++ SH_DMAC_NR_MD_CH) + DMAOR)
++#define dmaor_write_reg(n, data) __raw_writew(data, \
++ dma_find_base((n) * \
++ SH_DMAC_NR_MD_CH) + DMAOR)
+
+ static inline int dmaor_reset(int no)
+ {
+diff --git a/arch/sh/include/asm/hd64461.h b/arch/sh/include/asm/hd64461.h
+index afb24cb034b11..d2c485fa333b5 100644
+--- a/arch/sh/include/asm/hd64461.h
++++ b/arch/sh/include/asm/hd64461.h
+@@ -229,7 +229,7 @@
+ #define HD64461_NIMR HD64461_IO_OFFSET(0x5002)
+
+ #define HD64461_IRQBASE OFFCHIP_IRQ_BASE
+-#define OFFCHIP_IRQ_BASE 64
++#define OFFCHIP_IRQ_BASE (64 + 16)
+ #define HD64461_IRQ_NUM 16
+
+ #define HD64461_IRQ_UART (HD64461_IRQBASE+5)
+diff --git a/arch/sh/include/mach-common/mach/highlander.h b/arch/sh/include/mach-common/mach/highlander.h
+index fb44c299d0337..b12c795584225 100644
+--- a/arch/sh/include/mach-common/mach/highlander.h
++++ b/arch/sh/include/mach-common/mach/highlander.h
+@@ -176,7 +176,7 @@
+ #define IVDR_CK_ON 4 /* iVDR Clock ON */
+ #endif
+
+-#define HL_FPGA_IRQ_BASE 200
++#define HL_FPGA_IRQ_BASE (200 + 16)
+ #define HL_NR_IRL 15
+
+ #define IRQ_AX88796 (HL_FPGA_IRQ_BASE + 0)
+diff --git a/arch/sh/include/mach-common/mach/r2d.h b/arch/sh/include/mach-common/mach/r2d.h
+index 0d7e483c7d3f5..69bc1907c5637 100644
+--- a/arch/sh/include/mach-common/mach/r2d.h
++++ b/arch/sh/include/mach-common/mach/r2d.h
+@@ -47,7 +47,7 @@
+
+ #define IRLCNTR1 (PA_BCR + 0) /* Interrupt Control Register1 */
+
+-#define R2D_FPGA_IRQ_BASE 100
++#define R2D_FPGA_IRQ_BASE (100 + 16)
+
+ #define IRQ_VOYAGER (R2D_FPGA_IRQ_BASE + 0)
+ #define IRQ_EXT (R2D_FPGA_IRQ_BASE + 1)
+diff --git a/arch/sh/include/mach-dreamcast/mach/sysasic.h b/arch/sh/include/mach-dreamcast/mach/sysasic.h
+index ed69ce7f20301..3b27be9a527ea 100644
+--- a/arch/sh/include/mach-dreamcast/mach/sysasic.h
++++ b/arch/sh/include/mach-dreamcast/mach/sysasic.h
+@@ -22,7 +22,7 @@
+ takes.
+ */
+
+-#define HW_EVENT_IRQ_BASE 48
++#define HW_EVENT_IRQ_BASE (48 + 16)
+
+ /* IRQ 13 */
+ #define HW_EVENT_VSYNC (HW_EVENT_IRQ_BASE + 5) /* VSync */
+diff --git a/arch/sh/include/mach-se/mach/se7724.h b/arch/sh/include/mach-se/mach/se7724.h
+index 1fe28820dfa95..ea6c46633b337 100644
+--- a/arch/sh/include/mach-se/mach/se7724.h
++++ b/arch/sh/include/mach-se/mach/se7724.h
+@@ -37,7 +37,7 @@
+ #define IRQ2_IRQ evt2irq(0x640)
+
+ /* Bits in IRQ012 registers */
+-#define SE7724_FPGA_IRQ_BASE 220
++#define SE7724_FPGA_IRQ_BASE (220 + 16)
+
+ /* IRQ0 */
+ #define IRQ0_BASE SE7724_FPGA_IRQ_BASE
+diff --git a/arch/sh/kernel/cpu/sh2/probe.c b/arch/sh/kernel/cpu/sh2/probe.c
+index d342ea08843f6..70a07f4f2142f 100644
+--- a/arch/sh/kernel/cpu/sh2/probe.c
++++ b/arch/sh/kernel/cpu/sh2/probe.c
+@@ -21,7 +21,7 @@ static int __init scan_cache(unsigned long node, const char *uname,
+ if (!of_flat_dt_is_compatible(node, "jcore,cache"))
+ return 0;
+
+- j2_ccr_base = (u32 __iomem *)of_flat_dt_translate_address(node);
++ j2_ccr_base = ioremap(of_flat_dt_translate_address(node), 4);
+
+ return 1;
+ }
+diff --git a/arch/sh/kernel/cpu/sh3/entry.S b/arch/sh/kernel/cpu/sh3/entry.S
+index e48b3dd996f58..b1f5b3c58a018 100644
+--- a/arch/sh/kernel/cpu/sh3/entry.S
++++ b/arch/sh/kernel/cpu/sh3/entry.S
+@@ -470,9 +470,9 @@ ENTRY(handle_interrupt)
+ mov r4, r0 ! save vector->jmp table offset for later
+
+ shlr2 r4 ! vector to IRQ# conversion
+- add #-0x10, r4
+
+- cmp/pz r4 ! is it a valid IRQ?
++ mov #0x10, r5
++ cmp/hs r5, r4 ! is it a valid IRQ?
+ bt 10f
+
+ /*
+diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
+index 060fff95a305c..9d9e29b75c43a 100644
+--- a/arch/sparc/kernel/nmi.c
++++ b/arch/sparc/kernel/nmi.c
+@@ -282,11 +282,11 @@ __setup("nmi_watchdog=", setup_nmi_watchdog);
+ * sparc specific NMI watchdog enable function.
+ * Enables watchdog if it is not enabled already.
+ */
+-int watchdog_nmi_enable(unsigned int cpu)
++void watchdog_hardlockup_enable(unsigned int cpu)
+ {
+ if (atomic_read(&nmi_active) == -1) {
+ pr_warn("NMI watchdog cannot be enabled or disabled\n");
+- return -1;
++ return;
+ }
+
+ /*
+@@ -295,17 +295,15 @@ int watchdog_nmi_enable(unsigned int cpu)
+ * process first.
+ */
+ if (!nmi_init_done)
+- return 0;
++ return;
+
+ smp_call_function_single(cpu, start_nmi_watchdog, NULL, 1);
+-
+- return 0;
+ }
+ /*
+ * sparc specific NMI watchdog disable function.
+ * Disables watchdog if it is not disabled already.
+ */
+-void watchdog_nmi_disable(unsigned int cpu)
++void watchdog_hardlockup_disable(unsigned int cpu)
+ {
+ if (atomic_read(&nmi_active) == -1)
+ pr_warn_once("NMI watchdog cannot be enabled or disabled\n");
+diff --git a/arch/um/Makefile b/arch/um/Makefile
+index 8186d4761bda6..da4d5256af2f0 100644
+--- a/arch/um/Makefile
++++ b/arch/um/Makefile
+@@ -149,7 +149,7 @@ export CFLAGS_vmlinux := $(LINK-y) $(LINK_WRAPS) $(LD_FLAGS_CMDLINE) $(CC_FLAGS_
+ # When cleaning we don't include .config, so we don't include
+ # TT or skas makefiles and don't clean skas_ptregs.h.
+ CLEAN_FILES += linux x.i gmon.out
+-MRPROPER_FILES += arch/$(SUBARCH)/include/generated
++MRPROPER_FILES += $(HOST_DIR)/include/generated
+
+ archclean:
+ @find . \( -name '*.bb' -o -name '*.bbg' -o -name '*.da' \
+diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
+index e146b599260f8..64f1343df062f 100644
+--- a/arch/x86/coco/tdx/tdx.c
++++ b/arch/x86/coco/tdx/tdx.c
+@@ -840,6 +840,30 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
+ return true;
+ }
+
++static bool tdx_enc_status_change_prepare(unsigned long vaddr, int numpages,
++ bool enc)
++{
++ /*
++ * Only handle shared->private conversion here.
++ * See the comment in tdx_early_init().
++ */
++ if (enc)
++ return tdx_enc_status_changed(vaddr, numpages, enc);
++ return true;
++}
++
++static bool tdx_enc_status_change_finish(unsigned long vaddr, int numpages,
++ bool enc)
++{
++ /*
++ * Only handle private->shared conversion here.
++ * See the comment in tdx_early_init().
++ */
++ if (!enc)
++ return tdx_enc_status_changed(vaddr, numpages, enc);
++ return true;
++}
++
+ void __init tdx_early_init(void)
+ {
+ u64 cc_mask;
+@@ -867,9 +891,30 @@ void __init tdx_early_init(void)
+ */
+ physical_mask &= cc_mask - 1;
+
+- x86_platform.guest.enc_cache_flush_required = tdx_cache_flush_required;
+- x86_platform.guest.enc_tlb_flush_required = tdx_tlb_flush_required;
+- x86_platform.guest.enc_status_change_finish = tdx_enc_status_changed;
++ /*
++ * The kernel mapping should match the TDX metadata for the page.
++ * load_unaligned_zeropad() can touch memory *adjacent* to that which is
++ * owned by the caller and can catch even _momentary_ mismatches. Bad
++ * things happen on mismatch:
++ *
++ * - Private mapping => Shared Page == Guest shutdown
++ * - Shared mapping => Private Page == Recoverable #VE
++ *
++ * guest.enc_status_change_prepare() converts the page from
++ * shared=>private before the mapping becomes private.
++ *
++ * guest.enc_status_change_finish() converts the page from
++ * private=>shared after the mapping becomes private.
++ *
++ * In both cases there is a temporary shared mapping to a private page,
++ * which can result in a #VE. But, there is never a private mapping to
++ * a shared page.
++ */
++ x86_platform.guest.enc_status_change_prepare = tdx_enc_status_change_prepare;
++ x86_platform.guest.enc_status_change_finish = tdx_enc_status_change_finish;
++
++ x86_platform.guest.enc_cache_flush_required = tdx_cache_flush_required;
++ x86_platform.guest.enc_tlb_flush_required = tdx_tlb_flush_required;
+
+ pr_info("Guest detected\n");
+ }
+diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
+index bccea57dee81e..abadd5f234254 100644
+--- a/arch/x86/events/amd/core.c
++++ b/arch/x86/events/amd/core.c
+@@ -374,7 +374,7 @@ static int amd_pmu_hw_config(struct perf_event *event)
+
+ /* pass precise event sampling to ibs: */
+ if (event->attr.precise_ip && get_ibs_caps())
+- return -ENOENT;
++ return forward_event_to_ibs(event);
+
+ if (has_branch_stack(event) && !x86_pmu.lbr_nr)
+ return -EOPNOTSUPP;
+diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
+index 64582954b5f67..3710148021916 100644
+--- a/arch/x86/events/amd/ibs.c
++++ b/arch/x86/events/amd/ibs.c
+@@ -190,7 +190,7 @@ static struct perf_ibs *get_ibs_pmu(int type)
+ }
+
+ /*
+- * Use IBS for precise event sampling:
++ * core pmu config -> IBS config
+ *
+ * perf record -a -e cpu-cycles:p ... # use ibs op counting cycle count
+ * perf record -a -e r076:p ... # same as -e cpu-cycles:p
+@@ -199,25 +199,9 @@ static struct perf_ibs *get_ibs_pmu(int type)
+ * IbsOpCntCtl (bit 19) of IBS Execution Control Register (IbsOpCtl,
+ * MSRC001_1033) is used to select either cycle or micro-ops counting
+ * mode.
+- *
+- * The rip of IBS samples has skid 0. Thus, IBS supports precise
+- * levels 1 and 2 and the PERF_EFLAGS_EXACT is set. In rare cases the
+- * rip is invalid when IBS was not able to record the rip correctly.
+- * We clear PERF_EFLAGS_EXACT and take the rip from pt_regs then.
+- *
+ */
+-static int perf_ibs_precise_event(struct perf_event *event, u64 *config)
++static int core_pmu_ibs_config(struct perf_event *event, u64 *config)
+ {
+- switch (event->attr.precise_ip) {
+- case 0:
+- return -ENOENT;
+- case 1:
+- case 2:
+- break;
+- default:
+- return -EOPNOTSUPP;
+- }
+-
+ switch (event->attr.type) {
+ case PERF_TYPE_HARDWARE:
+ switch (event->attr.config) {
+@@ -243,22 +227,37 @@ static int perf_ibs_precise_event(struct perf_event *event, u64 *config)
+ return -EOPNOTSUPP;
+ }
+
++/*
++ * The rip of IBS samples has skid 0. Thus, IBS supports precise
++ * levels 1 and 2 and the PERF_EFLAGS_EXACT is set. In rare cases the
++ * rip is invalid when IBS was not able to record the rip correctly.
++ * We clear PERF_EFLAGS_EXACT and take the rip from pt_regs then.
++ */
++int forward_event_to_ibs(struct perf_event *event)
++{
++ u64 config = 0;
++
++ if (!event->attr.precise_ip || event->attr.precise_ip > 2)
++ return -EOPNOTSUPP;
++
++ if (!core_pmu_ibs_config(event, &config)) {
++ event->attr.type = perf_ibs_op.pmu.type;
++ event->attr.config = config;
++ }
++ return -ENOENT;
++}
++
+ static int perf_ibs_init(struct perf_event *event)
+ {
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_ibs *perf_ibs;
+ u64 max_cnt, config;
+- int ret;
+
+ perf_ibs = get_ibs_pmu(event->attr.type);
+- if (perf_ibs) {
+- config = event->attr.config;
+- } else {
+- perf_ibs = &perf_ibs_op;
+- ret = perf_ibs_precise_event(event, &config);
+- if (ret)
+- return ret;
+- }
++ if (!perf_ibs)
++ return -ENOENT;
++
++ config = event->attr.config;
+
+ if (event->pmu != &perf_ibs->pmu)
+ return -ENOENT;
+diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
+index cc92388b7a999..6f7c1b5606ad4 100644
+--- a/arch/x86/hyperv/ivm.c
++++ b/arch/x86/hyperv/ivm.c
+@@ -17,6 +17,7 @@
+ #include <asm/mem_encrypt.h>
+ #include <asm/mshyperv.h>
+ #include <asm/hypervisor.h>
++#include <asm/mtrr.h>
+
+ #ifdef CONFIG_AMD_MEM_ENCRYPT
+
+@@ -372,6 +373,9 @@ void __init hv_vtom_init(void)
+ x86_platform.guest.enc_cache_flush_required = hv_vtom_cache_flush_required;
+ x86_platform.guest.enc_tlb_flush_required = hv_vtom_tlb_flush_required;
+ x86_platform.guest.enc_status_change_finish = hv_vtom_set_host_visibility;
++
++ /* Set WB as the default cache mode. */
++ mtrr_overwrite_state(NULL, 0, MTRR_TYPE_WRBACK);
+ }
+
+ #endif /* CONFIG_AMD_MEM_ENCRYPT */
+diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
+index f0eeaf6e5f5f7..1bae790a553a5 100644
+--- a/arch/x86/include/asm/mtrr.h
++++ b/arch/x86/include/asm/mtrr.h
+@@ -23,14 +23,43 @@
+ #ifndef _ASM_X86_MTRR_H
+ #define _ASM_X86_MTRR_H
+
++#include <linux/bits.h>
+ #include <uapi/asm/mtrr.h>
+
++/* Defines for hardware MTRR registers. */
++#define MTRR_CAP_VCNT GENMASK(7, 0)
++#define MTRR_CAP_FIX BIT_MASK(8)
++#define MTRR_CAP_WC BIT_MASK(10)
++
++#define MTRR_DEF_TYPE_TYPE GENMASK(7, 0)
++#define MTRR_DEF_TYPE_FE BIT_MASK(10)
++#define MTRR_DEF_TYPE_E BIT_MASK(11)
++
++#define MTRR_DEF_TYPE_ENABLE (MTRR_DEF_TYPE_FE | MTRR_DEF_TYPE_E)
++#define MTRR_DEF_TYPE_DISABLE ~(MTRR_DEF_TYPE_TYPE | MTRR_DEF_TYPE_ENABLE)
++
++#define MTRR_PHYSBASE_TYPE GENMASK(7, 0)
++#define MTRR_PHYSBASE_RSVD GENMASK(11, 8)
++
++#define MTRR_PHYSMASK_RSVD GENMASK(10, 0)
++#define MTRR_PHYSMASK_V BIT_MASK(11)
++
++struct mtrr_state_type {
++ struct mtrr_var_range var_ranges[MTRR_MAX_VAR_RANGES];
++ mtrr_type fixed_ranges[MTRR_NUM_FIXED_RANGES];
++ unsigned char enabled;
++ bool have_fixed;
++ mtrr_type def_type;
++};
++
+ /*
+ * The following functions are for use by other drivers that cannot use
+ * arch_phys_wc_add and arch_phys_wc_del.
+ */
+ # ifdef CONFIG_MTRR
+ void mtrr_bp_init(void);
++void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var,
++ mtrr_type def_type);
+ extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
+ extern void mtrr_save_fixed_ranges(void *);
+ extern void mtrr_save_state(void);
+@@ -48,6 +77,12 @@ void mtrr_disable(void);
+ void mtrr_enable(void);
+ void mtrr_generic_set_state(void);
+ # else
++static inline void mtrr_overwrite_state(struct mtrr_var_range *var,
++ unsigned int num_var,
++ mtrr_type def_type)
++{
++}
++
+ static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
+ {
+ /*
+@@ -121,7 +156,8 @@ struct mtrr_gentry32 {
+ #endif /* CONFIG_COMPAT */
+
+ /* Bit fields for enabled in struct mtrr_state_type */
+-#define MTRR_STATE_MTRR_FIXED_ENABLED 0x01
+-#define MTRR_STATE_MTRR_ENABLED 0x02
++#define MTRR_STATE_SHIFT 10
++#define MTRR_STATE_MTRR_FIXED_ENABLED (MTRR_DEF_TYPE_FE >> MTRR_STATE_SHIFT)
++#define MTRR_STATE_MTRR_ENABLED (MTRR_DEF_TYPE_E >> MTRR_STATE_SHIFT)
+
+ #endif /* _ASM_X86_MTRR_H */
+diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
+index abf09882f58b6..f1a46500a2753 100644
+--- a/arch/x86/include/asm/perf_event.h
++++ b/arch/x86/include/asm/perf_event.h
+@@ -478,8 +478,10 @@ struct pebs_xmm {
+
+ #ifdef CONFIG_X86_LOCAL_APIC
+ extern u32 get_ibs_caps(void);
++extern int forward_event_to_ibs(struct perf_event *event);
+ #else
+ static inline u32 get_ibs_caps(void) { return 0; }
++static inline int forward_event_to_ibs(struct perf_event *event) { return -ENOENT; }
+ #endif
+
+ #ifdef CONFIG_PERF_EVENTS
+diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
+index 7929327abe009..a629b1b9f65a6 100644
+--- a/arch/x86/include/asm/pgtable_64.h
++++ b/arch/x86/include/asm/pgtable_64.h
+@@ -237,8 +237,8 @@ static inline void native_pgd_clear(pgd_t *pgd)
+
+ #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) })
+ #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val((pmd)) })
+-#define __swp_entry_to_pte(x) ((pte_t) { .pte = (x).val })
+-#define __swp_entry_to_pmd(x) ((pmd_t) { .pmd = (x).val })
++#define __swp_entry_to_pte(x) (__pte((x).val))
++#define __swp_entry_to_pmd(x) (__pmd((x).val))
+
+ extern void cleanup_highmap(void);
+
+diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
+index 13dc2a9d23c1e..7ca5c9ec8b52e 100644
+--- a/arch/x86/include/asm/sev.h
++++ b/arch/x86/include/asm/sev.h
+@@ -192,12 +192,12 @@ struct snp_guest_request_ioctl;
+
+ void setup_ghcb(void);
+ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+- unsigned int npages);
++ unsigned long npages);
+ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+- unsigned int npages);
++ unsigned long npages);
+ void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
+-void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
+-void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
++void snp_set_memory_shared(unsigned long vaddr, unsigned long npages);
++void snp_set_memory_private(unsigned long vaddr, unsigned long npages);
+ void snp_set_wakeup_secondary_cpu(void);
+ bool snp_init(struct boot_params *bp);
+ void __init __noreturn snp_abort(void);
+@@ -212,12 +212,12 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
+ static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
+ static inline void setup_ghcb(void) { }
+ static inline void __init
+-early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
++early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
+ static inline void __init
+-early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
++early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned long npages) { }
+ static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
+-static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
+-static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
++static inline void snp_set_memory_shared(unsigned long vaddr, unsigned long npages) { }
++static inline void snp_set_memory_private(unsigned long vaddr, unsigned long npages) { }
+ static inline void snp_set_wakeup_secondary_cpu(void) { }
+ static inline bool snp_init(struct boot_params *bp) { return false; }
+ static inline void snp_abort(void) { }
+diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
+index 88085f369ff6f..1ca9701917c55 100644
+--- a/arch/x86/include/asm/x86_init.h
++++ b/arch/x86/include/asm/x86_init.h
+@@ -150,7 +150,7 @@ struct x86_init_acpi {
+ * @enc_cache_flush_required Returns true if a cache flush is needed before changing page encryption status
+ */
+ struct x86_guest {
+- void (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
++ bool (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
+ bool (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
+ bool (*enc_tlb_flush_required)(bool enc);
+ bool (*enc_cache_flush_required)(void);
+diff --git a/arch/x86/include/uapi/asm/mtrr.h b/arch/x86/include/uapi/asm/mtrr.h
+index 376563f2bac1f..ab194c8316259 100644
+--- a/arch/x86/include/uapi/asm/mtrr.h
++++ b/arch/x86/include/uapi/asm/mtrr.h
+@@ -81,14 +81,6 @@ typedef __u8 mtrr_type;
+ #define MTRR_NUM_FIXED_RANGES 88
+ #define MTRR_MAX_VAR_RANGES 256
+
+-struct mtrr_state_type {
+- struct mtrr_var_range var_ranges[MTRR_MAX_VAR_RANGES];
+- mtrr_type fixed_ranges[MTRR_NUM_FIXED_RANGES];
+- unsigned char enabled;
+- unsigned char have_fixed;
+- mtrr_type def_type;
+-};
+-
+ #define MTRRphysBase_MSR(reg) (0x200 + 2 * (reg))
+ #define MTRRphysMask_MSR(reg) (0x200 + 2 * (reg) + 1)
+
+diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c b/arch/x86/kernel/cpu/mtrr/cleanup.c
+index b5f43049fa5f7..ca2d567e729e2 100644
+--- a/arch/x86/kernel/cpu/mtrr/cleanup.c
++++ b/arch/x86/kernel/cpu/mtrr/cleanup.c
+@@ -173,7 +173,7 @@ early_param("mtrr_cleanup_debug", mtrr_cleanup_debug_setup);
+
+ static void __init
+ set_var_mtrr(unsigned int reg, unsigned long basek, unsigned long sizek,
+- unsigned char type, unsigned int address_bits)
++ unsigned char type)
+ {
+ u32 base_lo, base_hi, mask_lo, mask_hi;
+ u64 base, mask;
+@@ -183,7 +183,7 @@ set_var_mtrr(unsigned int reg, unsigned long basek, unsigned long sizek,
+ return;
+ }
+
+- mask = (1ULL << address_bits) - 1;
++ mask = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
+ mask &= ~((((u64)sizek) << 10) - 1);
+
+ base = ((u64)basek) << 10;
+@@ -209,7 +209,7 @@ save_var_mtrr(unsigned int reg, unsigned long basek, unsigned long sizek,
+ range_state[reg].type = type;
+ }
+
+-static void __init set_var_mtrr_all(unsigned int address_bits)
++static void __init set_var_mtrr_all(void)
+ {
+ unsigned long basek, sizek;
+ unsigned char type;
+@@ -220,7 +220,7 @@ static void __init set_var_mtrr_all(unsigned int address_bits)
+ sizek = range_state[reg].size_pfn << (PAGE_SHIFT - 10);
+ type = range_state[reg].type;
+
+- set_var_mtrr(reg, basek, sizek, type, address_bits);
++ set_var_mtrr(reg, basek, sizek, type);
+ }
+ }
+
+@@ -680,7 +680,7 @@ static int __init mtrr_search_optimal_index(void)
+ return index_good;
+ }
+
+-int __init mtrr_cleanup(unsigned address_bits)
++int __init mtrr_cleanup(void)
+ {
+ unsigned long x_remove_base, x_remove_size;
+ unsigned long base, size, def, dummy;
+@@ -742,7 +742,7 @@ int __init mtrr_cleanup(unsigned address_bits)
+ mtrr_print_out_one_result(i);
+
+ if (!result[i].bad) {
+- set_var_mtrr_all(address_bits);
++ set_var_mtrr_all();
+ pr_debug("New variable MTRRs\n");
+ print_out_mtrr_range_state();
+ return 1;
+@@ -786,7 +786,7 @@ int __init mtrr_cleanup(unsigned address_bits)
+ gran_size = result[i].gran_sizek;
+ gran_size <<= 10;
+ x86_setup_var_mtrrs(range, nr_range, chunk_size, gran_size);
+- set_var_mtrr_all(address_bits);
++ set_var_mtrr_all();
+ pr_debug("New variable MTRRs\n");
+ print_out_mtrr_range_state();
+ return 1;
+@@ -802,7 +802,7 @@ int __init mtrr_cleanup(unsigned address_bits)
+ return 0;
+ }
+ #else
+-int __init mtrr_cleanup(unsigned address_bits)
++int __init mtrr_cleanup(void)
+ {
+ return 0;
+ }
+@@ -890,7 +890,7 @@ int __init mtrr_trim_uncached_memory(unsigned long end_pfn)
+ return 0;
+
+ rdmsr(MSR_MTRRdefType, def, dummy);
+- def &= 0xff;
++ def &= MTRR_DEF_TYPE_TYPE;
+ if (def != MTRR_TYPE_UNCACHABLE)
+ return 0;
+
+diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
+index ee09d359e08f0..e81d832475a1f 100644
+--- a/arch/x86/kernel/cpu/mtrr/generic.c
++++ b/arch/x86/kernel/cpu/mtrr/generic.c
+@@ -8,10 +8,12 @@
+ #include <linux/init.h>
+ #include <linux/io.h>
+ #include <linux/mm.h>
+-
++#include <linux/cc_platform.h>
+ #include <asm/processor-flags.h>
+ #include <asm/cacheinfo.h>
+ #include <asm/cpufeature.h>
++#include <asm/hypervisor.h>
++#include <asm/mshyperv.h>
+ #include <asm/tlbflush.h>
+ #include <asm/mtrr.h>
+ #include <asm/msr.h>
+@@ -38,6 +40,9 @@ u64 mtrr_tom2;
+ struct mtrr_state_type mtrr_state;
+ EXPORT_SYMBOL_GPL(mtrr_state);
+
++/* Reserved bits in the high portion of the MTRRphysBaseN MSR. */
++u32 phys_hi_rsvd;
++
+ /*
+ * BIOS is expected to clear MtrrFixDramModEn bit, see for example
+ * "BIOS and Kernel Developer's Guide for the AMD Athlon 64 and AMD
+@@ -69,10 +74,9 @@ static u64 get_mtrr_size(u64 mask)
+ {
+ u64 size;
+
+- mask >>= PAGE_SHIFT;
+- mask |= size_or_mask;
++ mask |= (u64)phys_hi_rsvd << 32;
+ size = -mask;
+- size <<= PAGE_SHIFT;
++
+ return size;
+ }
+
+@@ -171,7 +175,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
+ for (i = 0; i < num_var_ranges; ++i) {
+ unsigned short start_state, end_state, inclusive;
+
+- if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11)))
++ if (!(mtrr_state.var_ranges[i].mask_lo & MTRR_PHYSMASK_V))
+ continue;
+
+ base = (((u64)mtrr_state.var_ranges[i].base_hi) << 32) +
+@@ -223,7 +227,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
+ if ((start & mask) != (base & mask))
+ continue;
+
+- curr_match = mtrr_state.var_ranges[i].base_lo & 0xff;
++ curr_match = mtrr_state.var_ranges[i].base_lo & MTRR_PHYSBASE_TYPE;
+ if (prev_match == MTRR_TYPE_INVALID) {
+ prev_match = curr_match;
+ continue;
+@@ -240,6 +244,62 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
+ return mtrr_state.def_type;
+ }
+
++/**
++ * mtrr_overwrite_state - set static MTRR state
++ *
++ * Used to set MTRR state via different means (e.g. with data obtained from
++ * a hypervisor).
++ * Is allowed only for special cases when running virtualized. Must be called
++ * from the x86_init.hyper.init_platform() hook. It can be called only once.
++ * The MTRR state can't be changed afterwards. To ensure that, X86_FEATURE_MTRR
++ * is cleared.
++ */
++void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var,
++ mtrr_type def_type)
++{
++ unsigned int i;
++
++ /* Only allowed to be called once before mtrr_bp_init(). */
++ if (WARN_ON_ONCE(mtrr_state_set))
++ return;
++
++ /* Only allowed when running virtualized. */
++ if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
++ return;
++
++ /*
++ * Only allowed for special virtualization cases:
++ * - when running as Hyper-V, SEV-SNP guest using vTOM
++ * - when running as Xen PV guest
++ * - when running as SEV-SNP or TDX guest to avoid unnecessary
++ * VMM communication/Virtualization exceptions (#VC, #VE)
++ */
++ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP) &&
++ !hv_is_isolation_supported() &&
++ !cpu_feature_enabled(X86_FEATURE_XENPV) &&
++ !cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
++ return;
++
++ /* Disable MTRR in order to disable MTRR modifications. */
++ setup_clear_cpu_cap(X86_FEATURE_MTRR);
++
++ if (var) {
++ if (num_var > MTRR_MAX_VAR_RANGES) {
++ pr_warn("Trying to overwrite MTRR state with %u variable entries\n",
++ num_var);
++ num_var = MTRR_MAX_VAR_RANGES;
++ }
++ for (i = 0; i < num_var; i++)
++ mtrr_state.var_ranges[i] = var[i];
++ num_var_ranges = num_var;
++ }
++
++ mtrr_state.def_type = def_type;
++ mtrr_state.enabled |= MTRR_STATE_MTRR_ENABLED;
++
++ mtrr_state_set = 1;
++}
++
+ /**
+ * mtrr_type_lookup - look up memory type in MTRR
+ *
+@@ -422,10 +482,10 @@ static void __init print_mtrr_state(void)
+ }
+ pr_debug("MTRR variable ranges %sabled:\n",
+ mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED ? "en" : "dis");
+- high_width = (__ffs64(size_or_mask) - (32 - PAGE_SHIFT) + 3) / 4;
++ high_width = (boot_cpu_data.x86_phys_bits - (32 - PAGE_SHIFT) + 3) / 4;
+
+ for (i = 0; i < num_var_ranges; ++i) {
+- if (mtrr_state.var_ranges[i].mask_lo & (1 << 11))
++ if (mtrr_state.var_ranges[i].mask_lo & MTRR_PHYSMASK_V)
+ pr_debug(" %u base %0*X%05X000 mask %0*X%05X000 %s\n",
+ i,
+ high_width,
+@@ -434,7 +494,8 @@ static void __init print_mtrr_state(void)
+ high_width,
+ mtrr_state.var_ranges[i].mask_hi,
+ mtrr_state.var_ranges[i].mask_lo >> 12,
+- mtrr_attrib_to_str(mtrr_state.var_ranges[i].base_lo & 0xff));
++ mtrr_attrib_to_str(mtrr_state.var_ranges[i].base_lo &
++ MTRR_PHYSBASE_TYPE));
+ else
+ pr_debug(" %u disabled\n", i);
+ }
+@@ -452,7 +513,7 @@ bool __init get_mtrr_state(void)
+ vrs = mtrr_state.var_ranges;
+
+ rdmsr(MSR_MTRRcap, lo, dummy);
+- mtrr_state.have_fixed = (lo >> 8) & 1;
++ mtrr_state.have_fixed = lo & MTRR_CAP_FIX;
+
+ for (i = 0; i < num_var_ranges; i++)
+ get_mtrr_var_range(i, &vrs[i]);
+@@ -460,8 +521,8 @@ bool __init get_mtrr_state(void)
+ get_fixed_ranges(mtrr_state.fixed_ranges);
+
+ rdmsr(MSR_MTRRdefType, lo, dummy);
+- mtrr_state.def_type = (lo & 0xff);
+- mtrr_state.enabled = (lo & 0xc00) >> 10;
++ mtrr_state.def_type = lo & MTRR_DEF_TYPE_TYPE;
++ mtrr_state.enabled = (lo & MTRR_DEF_TYPE_ENABLE) >> MTRR_STATE_SHIFT;
+
+ if (amd_special_default_mtrr()) {
+ unsigned low, high;
+@@ -574,7 +635,7 @@ static void generic_get_mtrr(unsigned int reg, unsigned long *base,
+
+ rdmsr(MTRRphysMask_MSR(reg), mask_lo, mask_hi);
+
+- if ((mask_lo & 0x800) == 0) {
++ if (!(mask_lo & MTRR_PHYSMASK_V)) {
+ /* Invalid (i.e. free) range */
+ *base = 0;
+ *size = 0;
+@@ -585,8 +646,8 @@ static void generic_get_mtrr(unsigned int reg, unsigned long *base,
+ rdmsr(MTRRphysBase_MSR(reg), base_lo, base_hi);
+
+ /* Work out the shifted address mask: */
+- tmp = (u64)mask_hi << (32 - PAGE_SHIFT) | mask_lo >> PAGE_SHIFT;
+- mask = size_or_mask | tmp;
++ tmp = (u64)mask_hi << 32 | (mask_lo & PAGE_MASK);
++ mask = (u64)phys_hi_rsvd << 32 | tmp;
+
+ /* Expand tmp with high bits to all 1s: */
+ hi = fls64(tmp);
+@@ -604,9 +665,9 @@ static void generic_get_mtrr(unsigned int reg, unsigned long *base,
+ * This works correctly if size is a power of two, i.e. a
+ * contiguous range:
+ */
+- *size = -mask;
++ *size = -mask >> PAGE_SHIFT;
+ *base = (u64)base_hi << (32 - PAGE_SHIFT) | base_lo >> PAGE_SHIFT;
+- *type = base_lo & 0xff;
++ *type = base_lo & MTRR_PHYSBASE_TYPE;
+
+ out_put_cpu:
+ put_cpu();
+@@ -644,9 +705,8 @@ static bool set_mtrr_var_ranges(unsigned int index, struct mtrr_var_range *vr)
+ bool changed = false;
+
+ rdmsr(MTRRphysBase_MSR(index), lo, hi);
+- if ((vr->base_lo & 0xfffff0ffUL) != (lo & 0xfffff0ffUL)
+- || (vr->base_hi & (size_and_mask >> (32 - PAGE_SHIFT))) !=
+- (hi & (size_and_mask >> (32 - PAGE_SHIFT)))) {
++ if ((vr->base_lo & ~MTRR_PHYSBASE_RSVD) != (lo & ~MTRR_PHYSBASE_RSVD)
++ || (vr->base_hi & ~phys_hi_rsvd) != (hi & ~phys_hi_rsvd)) {
+
+ mtrr_wrmsr(MTRRphysBase_MSR(index), vr->base_lo, vr->base_hi);
+ changed = true;
+@@ -654,9 +714,8 @@ static bool set_mtrr_var_ranges(unsigned int index, struct mtrr_var_range *vr)
+
+ rdmsr(MTRRphysMask_MSR(index), lo, hi);
+
+- if ((vr->mask_lo & 0xfffff800UL) != (lo & 0xfffff800UL)
+- || (vr->mask_hi & (size_and_mask >> (32 - PAGE_SHIFT))) !=
+- (hi & (size_and_mask >> (32 - PAGE_SHIFT)))) {
++ if ((vr->mask_lo & ~MTRR_PHYSMASK_RSVD) != (lo & ~MTRR_PHYSMASK_RSVD)
++ || (vr->mask_hi & ~phys_hi_rsvd) != (hi & ~phys_hi_rsvd)) {
+ mtrr_wrmsr(MTRRphysMask_MSR(index), vr->mask_lo, vr->mask_hi);
+ changed = true;
+ }
+@@ -691,11 +750,12 @@ static unsigned long set_mtrr_state(void)
+ * Set_mtrr_restore restores the old value of MTRRdefType,
+ * so to set it we fiddle with the saved value:
+ */
+- if ((deftype_lo & 0xff) != mtrr_state.def_type
+- || ((deftype_lo & 0xc00) >> 10) != mtrr_state.enabled) {
++ if ((deftype_lo & MTRR_DEF_TYPE_TYPE) != mtrr_state.def_type ||
++ ((deftype_lo & MTRR_DEF_TYPE_ENABLE) >> MTRR_STATE_SHIFT) != mtrr_state.enabled) {
+
+- deftype_lo = (deftype_lo & ~0xcff) | mtrr_state.def_type |
+- (mtrr_state.enabled << 10);
++ deftype_lo = (deftype_lo & MTRR_DEF_TYPE_DISABLE) |
++ mtrr_state.def_type |
++ (mtrr_state.enabled << MTRR_STATE_SHIFT);
+ change_mask |= MTRR_CHANGE_MASK_DEFTYPE;
+ }
+
+@@ -708,7 +768,7 @@ void mtrr_disable(void)
+ rdmsr(MSR_MTRRdefType, deftype_lo, deftype_hi);
+
+ /* Disable MTRRs, and set the default type to uncached */
+- mtrr_wrmsr(MSR_MTRRdefType, deftype_lo & ~0xcff, deftype_hi);
++ mtrr_wrmsr(MSR_MTRRdefType, deftype_lo & MTRR_DEF_TYPE_DISABLE, deftype_hi);
+ }
+
+ void mtrr_enable(void)
+@@ -762,9 +822,9 @@ static void generic_set_mtrr(unsigned int reg, unsigned long base,
+ memset(vr, 0, sizeof(struct mtrr_var_range));
+ } else {
+ vr->base_lo = base << PAGE_SHIFT | type;
+- vr->base_hi = (base & size_and_mask) >> (32 - PAGE_SHIFT);
+- vr->mask_lo = -size << PAGE_SHIFT | 0x800;
+- vr->mask_hi = (-size & size_and_mask) >> (32 - PAGE_SHIFT);
++ vr->base_hi = (base >> (32 - PAGE_SHIFT)) & ~phys_hi_rsvd;
++ vr->mask_lo = -size << PAGE_SHIFT | MTRR_PHYSMASK_V;
++ vr->mask_hi = (-size >> (32 - PAGE_SHIFT)) & ~phys_hi_rsvd;
+
+ mtrr_wrmsr(MTRRphysBase_MSR(reg), vr->base_lo, vr->base_hi);
+ mtrr_wrmsr(MTRRphysMask_MSR(reg), vr->mask_lo, vr->mask_hi);
+@@ -817,7 +877,7 @@ static int generic_have_wrcomb(void)
+ {
+ unsigned long config, dummy;
+ rdmsr(MSR_MTRRcap, config, dummy);
+- return config & (1 << 10);
++ return config & MTRR_CAP_WC;
+ }
+
+ int positive_have_wrcomb(void)
+diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
+index 783f3210d5827..be35a0b09604d 100644
+--- a/arch/x86/kernel/cpu/mtrr/mtrr.c
++++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
+@@ -67,8 +67,6 @@ static bool mtrr_enabled(void)
+ unsigned int mtrr_usage_table[MTRR_MAX_VAR_RANGES];
+ static DEFINE_MUTEX(mtrr_mutex);
+
+-u64 size_or_mask, size_and_mask;
+-
+ const struct mtrr_ops *mtrr_if;
+
+ /* Returns non-zero if we have the write-combining memory type */
+@@ -117,7 +115,7 @@ static void __init set_num_var_ranges(bool use_generic)
+ else if (is_cpu(CYRIX) || is_cpu(CENTAUR))
+ config = 8;
+
+- num_var_ranges = config & 0xff;
++ num_var_ranges = config & MTRR_CAP_VCNT;
+ }
+
+ static void __init init_table(void)
+@@ -619,77 +617,46 @@ static struct syscore_ops mtrr_syscore_ops = {
+
+ int __initdata changed_by_mtrr_cleanup;
+
+-#define SIZE_OR_MASK_BITS(n) (~((1ULL << ((n) - PAGE_SHIFT)) - 1))
+ /**
+- * mtrr_bp_init - initialize mtrrs on the boot CPU
++ * mtrr_bp_init - initialize MTRRs on the boot CPU
+ *
+ * This needs to be called early; before any of the other CPUs are
+ * initialized (i.e. before smp_init()).
+- *
+ */
+ void __init mtrr_bp_init(void)
+ {
++ bool generic_mtrrs = cpu_feature_enabled(X86_FEATURE_MTRR);
+ const char *why = "(not available)";
+- u32 phys_addr;
+
+- phys_addr = 32;
+-
+- if (boot_cpu_has(X86_FEATURE_MTRR)) {
+- mtrr_if = &generic_mtrr_ops;
+- size_or_mask = SIZE_OR_MASK_BITS(36);
+- size_and_mask = 0x00f00000;
+- phys_addr = 36;
++ phys_hi_rsvd = GENMASK(31, boot_cpu_data.x86_phys_bits - 32);
+
++ if (!generic_mtrrs && mtrr_state.enabled) {
+ /*
+- * This is an AMD specific MSR, but we assume(hope?) that
+- * Intel will implement it too when they extend the address
+- * bus of the Xeon.
++ * Software overwrite of MTRR state, only for generic case.
++ * Note that X86_FEATURE_MTRR has been reset in this case.
+ */
+- if (cpuid_eax(0x80000000) >= 0x80000008) {
+- phys_addr = cpuid_eax(0x80000008) & 0xff;
+- /* CPUID workaround for Intel 0F33/0F34 CPU */
+- if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+- boot_cpu_data.x86 == 0xF &&
+- boot_cpu_data.x86_model == 0x3 &&
+- (boot_cpu_data.x86_stepping == 0x3 ||
+- boot_cpu_data.x86_stepping == 0x4))
+- phys_addr = 36;
+-
+- size_or_mask = SIZE_OR_MASK_BITS(phys_addr);
+- size_and_mask = ~size_or_mask & 0xfffff00000ULL;
+- } else if (boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR &&
+- boot_cpu_data.x86 == 6) {
+- /*
+- * VIA C* family have Intel style MTRRs,
+- * but don't support PAE
+- */
+- size_or_mask = SIZE_OR_MASK_BITS(32);
+- size_and_mask = 0;
+- phys_addr = 32;
+- }
++ init_table();
++ pr_info("MTRRs set to read-only\n");
++
++ return;
++ }
++
++ if (generic_mtrrs) {
++ mtrr_if = &generic_mtrr_ops;
+ } else {
+ switch (boot_cpu_data.x86_vendor) {
+ case X86_VENDOR_AMD:
+- if (cpu_feature_enabled(X86_FEATURE_K6_MTRR)) {
+- /* Pre-Athlon (K6) AMD CPU MTRRs */
++ /* Pre-Athlon (K6) AMD CPU MTRRs */
++ if (cpu_feature_enabled(X86_FEATURE_K6_MTRR))
+ mtrr_if = &amd_mtrr_ops;
+- size_or_mask = SIZE_OR_MASK_BITS(32);
+- size_and_mask = 0;
+- }
+ break;
+ case X86_VENDOR_CENTAUR:
+- if (cpu_feature_enabled(X86_FEATURE_CENTAUR_MCR)) {
++ if (cpu_feature_enabled(X86_FEATURE_CENTAUR_MCR))
+ mtrr_if = &centaur_mtrr_ops;
+- size_or_mask = SIZE_OR_MASK_BITS(32);
+- size_and_mask = 0;
+- }
+ break;
+ case X86_VENDOR_CYRIX:
+- if (cpu_feature_enabled(X86_FEATURE_CYRIX_ARR)) {
++ if (cpu_feature_enabled(X86_FEATURE_CYRIX_ARR))
+ mtrr_if = &cyrix_mtrr_ops;
+- size_or_mask = SIZE_OR_MASK_BITS(32);
+- size_and_mask = 0;
+- }
+ break;
+ default:
+ break;
+@@ -703,7 +670,7 @@ void __init mtrr_bp_init(void)
+ /* BIOS may override */
+ if (get_mtrr_state()) {
+ memory_caching_control |= CACHE_MTRR;
+- changed_by_mtrr_cleanup = mtrr_cleanup(phys_addr);
++ changed_by_mtrr_cleanup = mtrr_cleanup();
+ } else {
+ mtrr_if = NULL;
+ why = "by BIOS";
+diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h
+index 02eb5871492d0..59e8fb26bf9dd 100644
+--- a/arch/x86/kernel/cpu/mtrr/mtrr.h
++++ b/arch/x86/kernel/cpu/mtrr/mtrr.h
+@@ -51,7 +51,6 @@ void fill_mtrr_var_range(unsigned int index,
+ u32 base_lo, u32 base_hi, u32 mask_lo, u32 mask_hi);
+ bool get_mtrr_state(void);
+
+-extern u64 size_or_mask, size_and_mask;
+ extern const struct mtrr_ops *mtrr_if;
+
+ #define is_cpu(vnd) (mtrr_if && mtrr_if->vendor == X86_VENDOR_##vnd)
+@@ -59,6 +58,7 @@ extern const struct mtrr_ops *mtrr_if;
+ extern unsigned int num_var_ranges;
+ extern u64 mtrr_tom2;
+ extern struct mtrr_state_type mtrr_state;
++extern u32 phys_hi_rsvd;
+
+ void mtrr_state_warn(void);
+ const char *mtrr_attrib_to_str(int x);
+@@ -70,4 +70,4 @@ extern const struct mtrr_ops cyrix_mtrr_ops;
+ extern const struct mtrr_ops centaur_mtrr_ops;
+
+ extern int changed_by_mtrr_cleanup;
+-extern int mtrr_cleanup(unsigned address_bits);
++extern int mtrr_cleanup(void);
+diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+index 6ad33f355861f..61cdd9b1bb6d8 100644
+--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
++++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+@@ -726,11 +726,15 @@ unlock:
+ static void show_rdt_tasks(struct rdtgroup *r, struct seq_file *s)
+ {
+ struct task_struct *p, *t;
++ pid_t pid;
+
+ rcu_read_lock();
+ for_each_process_thread(p, t) {
+- if (is_closid_match(t, r) || is_rmid_match(t, r))
+- seq_printf(s, "%d\n", t->pid);
++ if (is_closid_match(t, r) || is_rmid_match(t, r)) {
++ pid = task_pid_vnr(t);
++ if (pid)
++ seq_printf(s, "%d\n", pid);
++ }
+ }
+ rcu_read_unlock();
+ }
+diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
+index 16babff771bdf..0cccfeb67c3ad 100644
+--- a/arch/x86/kernel/setup.c
++++ b/arch/x86/kernel/setup.c
+@@ -1037,6 +1037,8 @@ void __init setup_arch(char **cmdline_p)
+ /*
+ * VMware detection requires dmi to be available, so this
+ * needs to be done after dmi_setup(), for the boot CPU.
++ * For some guest types (Xen PV, SEV-SNP, TDX) it is required to be
++ * called before cache_bp_init() for setting up MTRR state.
+ */
+ init_hypervisor_platform();
+
+diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
+index b031244d6d2df..108bbae59c35a 100644
+--- a/arch/x86/kernel/sev.c
++++ b/arch/x86/kernel/sev.c
+@@ -645,7 +645,7 @@ static u64 __init get_jump_table_addr(void)
+ return ret;
+ }
+
+-static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool validate)
++static void pvalidate_pages(unsigned long vaddr, unsigned long npages, bool validate)
+ {
+ unsigned long vaddr_end;
+ int rc;
+@@ -662,7 +662,7 @@ static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool valid
+ }
+ }
+
+-static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
++static void __init early_set_pages_state(unsigned long paddr, unsigned long npages, enum psc_op op)
+ {
+ unsigned long paddr_end;
+ u64 val;
+@@ -701,7 +701,7 @@ e_term:
+ }
+
+ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+- unsigned int npages)
++ unsigned long npages)
+ {
+ /*
+ * This can be invoked in early boot while running identity mapped, so
+@@ -723,7 +723,7 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
+ }
+
+ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+- unsigned int npages)
++ unsigned long npages)
+ {
+ /*
+ * This can be invoked in early boot while running identity mapped, so
+@@ -879,7 +879,7 @@ static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+ }
+
+-static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
++static void set_pages_state(unsigned long vaddr, unsigned long npages, int op)
+ {
+ unsigned long vaddr_end, next_vaddr;
+ struct snp_psc_desc *desc;
+@@ -904,7 +904,7 @@ static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
+ kfree(desc);
+ }
+
+-void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
++void snp_set_memory_shared(unsigned long vaddr, unsigned long npages)
+ {
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+@@ -914,7 +914,7 @@ void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
+ set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED);
+ }
+
+-void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
++void snp_set_memory_private(unsigned long vaddr, unsigned long npages)
+ {
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
+index d82f4fa2f1bfe..f230d4d7d8eb4 100644
+--- a/arch/x86/kernel/x86_init.c
++++ b/arch/x86/kernel/x86_init.c
+@@ -130,7 +130,7 @@ struct x86_cpuinit_ops x86_cpuinit = {
+
+ static void default_nmi_init(void) { };
+
+-static void enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool enc) { }
++static bool enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool enc) { return true; }
+ static bool enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return false; }
+ static bool enc_tlb_flush_required_noop(bool enc) { return false; }
+ static bool enc_cache_flush_required_noop(void) { return false; }
+diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
+index e0b51c09109f6..4f95c449a406e 100644
+--- a/arch/x86/mm/mem_encrypt_amd.c
++++ b/arch/x86/mm/mem_encrypt_amd.c
+@@ -319,7 +319,7 @@ static void enc_dec_hypercall(unsigned long vaddr, int npages, bool enc)
+ #endif
+ }
+
+-static void amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool enc)
++static bool amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool enc)
+ {
+ /*
+ * To maintain the security guarantees of SEV-SNP guests, make sure
+@@ -327,6 +327,8 @@ static void amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool
+ */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && !enc)
+ snp_set_memory_shared(vaddr, npages);
++
++ return true;
+ }
+
+ /* Return true unconditionally: return value doesn't matter for the SEV side */
+diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
+index 7159cf7876130..b8f48ebe753c7 100644
+--- a/arch/x86/mm/pat/set_memory.c
++++ b/arch/x86/mm/pat/set_memory.c
+@@ -2151,7 +2151,8 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
+ cpa_flush(&cpa, x86_platform.guest.enc_cache_flush_required());
+
+ /* Notify hypervisor that we are about to set/clr encryption attribute. */
+- x86_platform.guest.enc_status_change_prepare(addr, numpages, enc);
++ if (!x86_platform.guest.enc_status_change_prepare(addr, numpages, enc))
++ return -EIO;
+
+ ret = __change_page_attr_set_clr(&cpa, 1);
+
+diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
+index 232acf418cfbe..77f7ac3668cb4 100644
+--- a/arch/x86/platform/efi/efi_64.c
++++ b/arch/x86/platform/efi/efi_64.c
+@@ -853,9 +853,9 @@ efi_set_virtual_address_map(unsigned long memory_map_size,
+
+ /* Disable interrupts around EFI calls: */
+ local_irq_save(flags);
+- status = efi_call(efi.runtime->set_virtual_address_map,
+- memory_map_size, descriptor_size,
+- descriptor_version, virtual_map);
++ status = arch_efi_call_virt(efi.runtime, set_virtual_address_map,
++ memory_map_size, descriptor_size,
++ descriptor_version, virtual_map);
+ local_irq_restore(flags);
+
+ efi_fpu_end();
+diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
+index 093b78c8bbec0..8732b85d56505 100644
+--- a/arch/x86/xen/enlighten_pv.c
++++ b/arch/x86/xen/enlighten_pv.c
+@@ -68,6 +68,7 @@
+ #include <asm/reboot.h>
+ #include <asm/hypervisor.h>
+ #include <asm/mach_traps.h>
++#include <asm/mtrr.h>
+ #include <asm/mwait.h>
+ #include <asm/pci_x86.h>
+ #include <asm/cpu.h>
+@@ -119,6 +120,54 @@ static int __init parse_xen_msr_safe(char *str)
+ }
+ early_param("xen_msr_safe", parse_xen_msr_safe);
+
++/* Get MTRR settings from Xen and put them into mtrr_state. */
++static void __init xen_set_mtrr_data(void)
++{
++#ifdef CONFIG_MTRR
++ struct xen_platform_op op = {
++ .cmd = XENPF_read_memtype,
++ .interface_version = XENPF_INTERFACE_VERSION,
++ };
++ unsigned int reg;
++ unsigned long mask;
++ uint32_t eax, width;
++ static struct mtrr_var_range var[MTRR_MAX_VAR_RANGES] __initdata;
++
++ /* Get physical address width (only 64-bit cpus supported). */
++ width = 36;
++ eax = cpuid_eax(0x80000000);
++ if ((eax >> 16) == 0x8000 && eax >= 0x80000008) {
++ eax = cpuid_eax(0x80000008);
++ width = eax & 0xff;
++ }
++
++ for (reg = 0; reg < MTRR_MAX_VAR_RANGES; reg++) {
++ op.u.read_memtype.reg = reg;
++ if (HYPERVISOR_platform_op(&op))
++ break;
++
++ /*
++ * Only called in dom0, which has all RAM PFNs mapped at
++ * RAM MFNs, and all PCI space etc. is identity mapped.
++ * This means we can treat MFN == PFN regarding MTRR settings.
++ */
++ var[reg].base_lo = op.u.read_memtype.type;
++ var[reg].base_lo |= op.u.read_memtype.mfn << PAGE_SHIFT;
++ var[reg].base_hi = op.u.read_memtype.mfn >> (32 - PAGE_SHIFT);
++ mask = ~((op.u.read_memtype.nr_mfns << PAGE_SHIFT) - 1);
++ mask &= (1UL << width) - 1;
++ if (mask)
++ mask |= MTRR_PHYSMASK_V;
++ var[reg].mask_lo = mask;
++ var[reg].mask_hi = mask >> 32;
++ }
++
++ /* Only overwrite MTRR state if any MTRR could be got from Xen. */
++ if (reg)
++ mtrr_overwrite_state(var, reg, MTRR_TYPE_UNCACHABLE);
++#endif
++}
++
+ static void __init xen_pv_init_platform(void)
+ {
+ /* PV guests can't operate virtio devices without grants. */
+@@ -135,6 +184,9 @@ static void __init xen_pv_init_platform(void)
+
+ /* pvclock is in shared info area */
+ xen_init_time_ops();
++
++ if (xen_initial_domain())
++ xen_set_mtrr_data();
+ }
+
+ static void __init xen_pv_guest_late_init(void)
+diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
+index dce1548a7a0c3..fc49be622e05b 100644
+--- a/block/blk-cgroup.c
++++ b/block/blk-cgroup.c
+@@ -624,8 +624,13 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
+ struct blkg_iostat_set *bis =
+ per_cpu_ptr(blkg->iostat_cpu, cpu);
+ memset(bis, 0, sizeof(*bis));
++
++ /* Re-initialize the cleared blkg_iostat_set */
++ u64_stats_init(&bis->sync);
++ bis->blkg = blkg;
+ }
+ memset(&blkg->iostat, 0, sizeof(blkg->iostat));
++ u64_stats_init(&blkg->iostat.sync);
+
+ for (i = 0; i < BLKCG_MAX_POLS; i++) {
+ struct blkcg_policy *pol = blkcg_policy[i];
+@@ -762,6 +767,13 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
+ return -ENODEV;
+ }
+
++ mutex_lock(&bdev->bd_queue->rq_qos_mutex);
++ if (!disk_live(bdev->bd_disk)) {
++ blkdev_put_no_open(bdev);
++ mutex_unlock(&bdev->bd_queue->rq_qos_mutex);
++ return -ENODEV;
++ }
++
+ ctx->body = input;
+ ctx->bdev = bdev;
+ return 0;
+@@ -906,6 +918,7 @@ EXPORT_SYMBOL_GPL(blkg_conf_prep);
+ */
+ void blkg_conf_exit(struct blkg_conf_ctx *ctx)
+ __releases(&ctx->bdev->bd_queue->queue_lock)
++ __releases(&ctx->bdev->bd_queue->rq_qos_mutex)
+ {
+ if (ctx->blkg) {
+ spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock);
+@@ -913,6 +926,7 @@ void blkg_conf_exit(struct blkg_conf_ctx *ctx)
+ }
+
+ if (ctx->bdev) {
++ mutex_unlock(&ctx->bdev->bd_queue->rq_qos_mutex);
+ blkdev_put_no_open(ctx->bdev);
+ ctx->body = NULL;
+ ctx->bdev = NULL;
+@@ -2072,6 +2086,9 @@ void blk_cgroup_bio_start(struct bio *bio)
+ struct blkg_iostat_set *bis;
+ unsigned long flags;
+
++ if (!cgroup_subsys_on_dfl(io_cgrp_subsys))
++ return;
++
+ /* Root-level stats are sourced from system-wide IO stats */
+ if (!cgroup_parent(blkcg->css.cgroup))
+ return;
+@@ -2102,8 +2119,7 @@ void blk_cgroup_bio_start(struct bio *bio)
+ }
+
+ u64_stats_update_end_irqrestore(&bis->sync, flags);
+- if (cgroup_subsys_on_dfl(io_cgrp_subsys))
+- cgroup_rstat_updated(blkcg->css.cgroup, cpu);
++ cgroup_rstat_updated(blkcg->css.cgroup, cpu);
+ put_cpu();
+ }
+
+diff --git a/block/blk-core.c b/block/blk-core.c
+index 1da77e7d62894..3fc68b9444791 100644
+--- a/block/blk-core.c
++++ b/block/blk-core.c
+@@ -420,6 +420,7 @@ struct request_queue *blk_alloc_queue(int node_id)
+ mutex_init(&q->debugfs_mutex);
+ mutex_init(&q->sysfs_lock);
+ mutex_init(&q->sysfs_dir_lock);
++ mutex_init(&q->rq_qos_mutex);
+ spin_lock_init(&q->queue_lock);
+
+ init_waitqueue_head(&q->mq_freeze_wq);
+diff --git a/block/blk-iocost.c b/block/blk-iocost.c
+index 285ced3467abb..6084a9519883e 100644
+--- a/block/blk-iocost.c
++++ b/block/blk-iocost.c
+@@ -2455,6 +2455,7 @@ static u64 adjust_inuse_and_calc_cost(struct ioc_gq *iocg, u64 vtime,
+ u32 hwi, adj_step;
+ s64 margin;
+ u64 cost, new_inuse;
++ unsigned long flags;
+
+ current_hweight(iocg, NULL, &hwi);
+ old_hwi = hwi;
+@@ -2473,11 +2474,11 @@ static u64 adjust_inuse_and_calc_cost(struct ioc_gq *iocg, u64 vtime,
+ iocg->inuse == iocg->active)
+ return cost;
+
+- spin_lock_irq(&ioc->lock);
++ spin_lock_irqsave(&ioc->lock, flags);
+
+ /* we own inuse only when @iocg is in the normal active state */
+ if (iocg->abs_vdebt || list_empty(&iocg->active_list)) {
+- spin_unlock_irq(&ioc->lock);
++ spin_unlock_irqrestore(&ioc->lock, flags);
+ return cost;
+ }
+
+@@ -2498,7 +2499,7 @@ static u64 adjust_inuse_and_calc_cost(struct ioc_gq *iocg, u64 vtime,
+ } while (time_after64(vtime + cost, now->vnow) &&
+ iocg->inuse != iocg->active);
+
+- spin_unlock_irq(&ioc->lock);
++ spin_unlock_irqrestore(&ioc->lock, flags);
+
+ TRACE_IOCG_PATH(inuse_adjust, iocg, now,
+ old_inuse, iocg->inuse, old_hwi, hwi);
+diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
+index d23a8554ec4ae..7851e149d365f 100644
+--- a/block/blk-mq-debugfs.c
++++ b/block/blk-mq-debugfs.c
+@@ -399,7 +399,7 @@ static void blk_mq_debugfs_tags_show(struct seq_file *m,
+ seq_printf(m, "nr_tags=%u\n", tags->nr_tags);
+ seq_printf(m, "nr_reserved_tags=%u\n", tags->nr_reserved_tags);
+ seq_printf(m, "active_queues=%d\n",
+- atomic_read(&tags->active_queues));
++ READ_ONCE(tags->active_queues));
+
+ seq_puts(m, "\nbitmap_tags:\n");
+ sbitmap_queue_show(&tags->bitmap_tags, m);
+diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
+index dfd81cab57888..cc57e2dd9a0bb 100644
+--- a/block/blk-mq-tag.c
++++ b/block/blk-mq-tag.c
+@@ -38,6 +38,7 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
+ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
+ {
+ unsigned int users;
++ struct blk_mq_tags *tags = hctx->tags;
+
+ /*
+ * calling test_bit() prior to test_and_set_bit() is intentional,
+@@ -55,9 +56,11 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
+ return;
+ }
+
+- users = atomic_inc_return(&hctx->tags->active_queues);
+-
+- blk_mq_update_wake_batch(hctx->tags, users);
++ spin_lock_irq(&tags->lock);
++ users = tags->active_queues + 1;
++ WRITE_ONCE(tags->active_queues, users);
++ blk_mq_update_wake_batch(tags, users);
++ spin_unlock_irq(&tags->lock);
+ }
+
+ /*
+@@ -90,9 +93,11 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
+ return;
+ }
+
+- users = atomic_dec_return(&tags->active_queues);
+-
++ spin_lock_irq(&tags->lock);
++ users = tags->active_queues - 1;
++ WRITE_ONCE(tags->active_queues, users);
+ blk_mq_update_wake_batch(tags, users);
++ spin_unlock_irq(&tags->lock);
+
+ blk_mq_tag_wakeup_all(tags, false);
+ }
+diff --git a/block/blk-mq.c b/block/blk-mq.c
+index 850bfb844ed2f..b9f4546139894 100644
+--- a/block/blk-mq.c
++++ b/block/blk-mq.c
+@@ -2711,6 +2711,7 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
+ struct request *requeue_list = NULL;
+ struct request **requeue_lastp = &requeue_list;
+ unsigned int depth = 0;
++ bool is_passthrough = false;
+ LIST_HEAD(list);
+
+ do {
+@@ -2719,7 +2720,9 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
+ if (!this_hctx) {
+ this_hctx = rq->mq_hctx;
+ this_ctx = rq->mq_ctx;
+- } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx) {
++ is_passthrough = blk_rq_is_passthrough(rq);
++ } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx ||
++ is_passthrough != blk_rq_is_passthrough(rq)) {
+ rq_list_add_tail(&requeue_lastp, rq);
+ continue;
+ }
+@@ -2731,7 +2734,13 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
+ trace_block_unplug(this_hctx->queue, depth, !from_sched);
+
+ percpu_ref_get(&this_hctx->queue->q_usage_counter);
+- if (this_hctx->queue->elevator) {
++ /* passthrough requests should never be issued to the I/O scheduler */
++ if (is_passthrough) {
++ spin_lock(&this_hctx->lock);
++ list_splice_tail_init(&list, &this_hctx->dispatch);
++ spin_unlock(&this_hctx->lock);
++ blk_mq_run_hw_queue(this_hctx, from_sched);
++ } else if (this_hctx->queue->elevator) {
+ this_hctx->queue->elevator->type->ops.insert_requests(this_hctx,
+ &list, 0);
+ blk_mq_run_hw_queue(this_hctx, from_sched);
+diff --git a/block/blk-mq.h b/block/blk-mq.h
+index e876584d35163..890fef9796bf9 100644
+--- a/block/blk-mq.h
++++ b/block/blk-mq.h
+@@ -417,8 +417,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
+ return true;
+ }
+
+- users = atomic_read(&hctx->tags->active_queues);
+-
++ users = READ_ONCE(hctx->tags->active_queues);
+ if (!users)
+ return true;
+
+diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
+index d8cc820a365e3..167be74df4eec 100644
+--- a/block/blk-rq-qos.c
++++ b/block/blk-rq-qos.c
+@@ -288,11 +288,13 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
+
+ void rq_qos_exit(struct request_queue *q)
+ {
++ mutex_lock(&q->rq_qos_mutex);
+ while (q->rq_qos) {
+ struct rq_qos *rqos = q->rq_qos;
+ q->rq_qos = rqos->next;
+ rqos->ops->exit(rqos);
+ }
++ mutex_unlock(&q->rq_qos_mutex);
+ }
+
+ int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
+@@ -300,6 +302,8 @@ int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
+ {
+ struct request_queue *q = disk->queue;
+
++ lockdep_assert_held(&q->rq_qos_mutex);
++
+ rqos->disk = disk;
+ rqos->id = id;
+ rqos->ops = ops;
+@@ -307,18 +311,13 @@ int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
+ /*
+ * No IO can be in-flight when adding rqos, so freeze queue, which
+ * is fine since we only support rq_qos for blk-mq queue.
+- *
+- * Reuse ->queue_lock for protecting against other concurrent
+- * rq_qos adding/deleting
+ */
+ blk_mq_freeze_queue(q);
+
+- spin_lock_irq(&q->queue_lock);
+ if (rq_qos_id(q, rqos->id))
+ goto ebusy;
+ rqos->next = q->rq_qos;
+ q->rq_qos = rqos;
+- spin_unlock_irq(&q->queue_lock);
+
+ blk_mq_unfreeze_queue(q);
+
+@@ -330,7 +329,6 @@ int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
+
+ return 0;
+ ebusy:
+- spin_unlock_irq(&q->queue_lock);
+ blk_mq_unfreeze_queue(q);
+ return -EBUSY;
+ }
+@@ -340,21 +338,15 @@ void rq_qos_del(struct rq_qos *rqos)
+ struct request_queue *q = rqos->disk->queue;
+ struct rq_qos **cur;
+
+- /*
+- * See comment in rq_qos_add() about freezing queue & using
+- * ->queue_lock.
+- */
+- blk_mq_freeze_queue(q);
++ lockdep_assert_held(&q->rq_qos_mutex);
+
+- spin_lock_irq(&q->queue_lock);
++ blk_mq_freeze_queue(q);
+ for (cur = &q->rq_qos; *cur; cur = &(*cur)->next) {
+ if (*cur == rqos) {
+ *cur = rqos->next;
+ break;
+ }
+ }
+- spin_unlock_irq(&q->queue_lock);
+-
+ blk_mq_unfreeze_queue(q);
+
+ mutex_lock(&q->debugfs_mutex);
+diff --git a/block/blk-throttle.c b/block/blk-throttle.c
+index 9d010d867fbf4..7397ff199d669 100644
+--- a/block/blk-throttle.c
++++ b/block/blk-throttle.c
+@@ -2178,12 +2178,6 @@ bool __blk_throtl_bio(struct bio *bio)
+
+ rcu_read_lock();
+
+- if (!cgroup_subsys_on_dfl(io_cgrp_subsys)) {
+- blkg_rwstat_add(&tg->stat_bytes, bio->bi_opf,
+- bio->bi_iter.bi_size);
+- blkg_rwstat_add(&tg->stat_ios, bio->bi_opf, 1);
+- }
+-
+ spin_lock_irq(&q->queue_lock);
+
+ throtl_update_latency_buckets(td);
+diff --git a/block/blk-throttle.h b/block/blk-throttle.h
+index ef4b7a4de987d..d1ccbfe9f7978 100644
+--- a/block/blk-throttle.h
++++ b/block/blk-throttle.h
+@@ -185,6 +185,15 @@ static inline bool blk_should_throtl(struct bio *bio)
+ struct throtl_grp *tg = blkg_to_tg(bio->bi_blkg);
+ int rw = bio_data_dir(bio);
+
++ if (!cgroup_subsys_on_dfl(io_cgrp_subsys)) {
++ if (!bio_flagged(bio, BIO_CGROUP_ACCT)) {
++ bio_set_flag(bio, BIO_CGROUP_ACCT);
++ blkg_rwstat_add(&tg->stat_bytes, bio->bi_opf,
++ bio->bi_iter.bi_size);
++ }
++ blkg_rwstat_add(&tg->stat_ios, bio->bi_opf, 1);
++ }
++
+ /* iops limit is always counted */
+ if (tg->has_rules_iops[rw])
+ return true;
+diff --git a/block/blk-wbt.c b/block/blk-wbt.c
+index 9ec2a2f1eda38..7a87506ff8e1c 100644
+--- a/block/blk-wbt.c
++++ b/block/blk-wbt.c
+@@ -944,7 +944,9 @@ int wbt_init(struct gendisk *disk)
+ /*
+ * Assign rwb and add the stats callback.
+ */
++ mutex_lock(&q->rq_qos_mutex);
+ ret = rq_qos_add(&rwb->rqos, disk, RQ_QOS_WBT, &wbt_rqos_ops);
++ mutex_unlock(&q->rq_qos_mutex);
+ if (ret)
+ goto err_free;
+
+diff --git a/block/disk-events.c b/block/disk-events.c
+index aee25a7e1ab7d..450c2cbe23d56 100644
+--- a/block/disk-events.c
++++ b/block/disk-events.c
+@@ -307,6 +307,7 @@ bool disk_force_media_change(struct gendisk *disk, unsigned int events)
+ if (!(events & DISK_EVENT_MEDIA_CHANGE))
+ return false;
+
++ inc_diskseq(disk);
+ if (__invalidate_device(disk->part0, true))
+ pr_warn("VFS: busy inodes on changed media %s\n",
+ disk->disk_name);
+diff --git a/block/genhd.c b/block/genhd.c
+index 1cb489b927d50..bb895397e9385 100644
+--- a/block/genhd.c
++++ b/block/genhd.c
+@@ -25,8 +25,9 @@
+ #include <linux/pm_runtime.h>
+ #include <linux/badblocks.h>
+ #include <linux/part_stat.h>
+-#include "blk-throttle.h"
++#include <linux/blktrace_api.h>
+
++#include "blk-throttle.h"
+ #include "blk.h"
+ #include "blk-mq-sched.h"
+ #include "blk-rq-qos.h"
+@@ -1171,6 +1172,8 @@ static void disk_release(struct device *dev)
+ might_sleep();
+ WARN_ON_ONCE(disk_live(disk));
+
++ blk_trace_remove(disk->queue);
++
+ /*
+ * To undo the all initialization from blk_mq_init_allocated_queue in
+ * case of a probe failure where add_disk is never called we have to
+diff --git a/block/partitions/amiga.c b/block/partitions/amiga.c
+index 5c8624e26a54c..5069210954129 100644
+--- a/block/partitions/amiga.c
++++ b/block/partitions/amiga.c
+@@ -11,10 +11,18 @@
+ #define pr_fmt(fmt) fmt
+
+ #include <linux/types.h>
++#include <linux/mm_types.h>
++#include <linux/overflow.h>
+ #include <linux/affs_hardblocks.h>
+
+ #include "check.h"
+
++/* magic offsets in partition DosEnvVec */
++#define NR_HD 3
++#define NR_SECT 5
++#define LO_CYL 9
++#define HI_CYL 10
++
+ static __inline__ u32
+ checksum_block(__be32 *m, int size)
+ {
+@@ -31,8 +39,12 @@ int amiga_partition(struct parsed_partitions *state)
+ unsigned char *data;
+ struct RigidDiskBlock *rdb;
+ struct PartitionBlock *pb;
+- int start_sect, nr_sects, blk, part, res = 0;
+- int blksize = 1; /* Multiplier for disk block size */
++ u64 start_sect, nr_sects;
++ sector_t blk, end_sect;
++ u32 cylblk; /* rdb_CylBlocks = nr_heads*sect_per_track */
++ u32 nr_hd, nr_sect, lo_cyl, hi_cyl;
++ int part, res = 0;
++ unsigned int blksize = 1; /* Multiplier for disk block size */
+ int slot = 1;
+
+ for (blk = 0; ; blk++, put_dev_sector(sect)) {
+@@ -40,7 +52,7 @@ int amiga_partition(struct parsed_partitions *state)
+ goto rdb_done;
+ data = read_part_sector(state, blk, &sect);
+ if (!data) {
+- pr_err("Dev %s: unable to read RDB block %d\n",
++ pr_err("Dev %s: unable to read RDB block %llu\n",
+ state->disk->disk_name, blk);
+ res = -1;
+ goto rdb_done;
+@@ -57,12 +69,12 @@ int amiga_partition(struct parsed_partitions *state)
+ *(__be32 *)(data+0xdc) = 0;
+ if (checksum_block((__be32 *)data,
+ be32_to_cpu(rdb->rdb_SummedLongs) & 0x7F)==0) {
+- pr_err("Trashed word at 0xd0 in block %d ignored in checksum calculation\n",
++ pr_err("Trashed word at 0xd0 in block %llu ignored in checksum calculation\n",
+ blk);
+ break;
+ }
+
+- pr_err("Dev %s: RDB in block %d has bad checksum\n",
++ pr_err("Dev %s: RDB in block %llu has bad checksum\n",
+ state->disk->disk_name, blk);
+ }
+
+@@ -78,11 +90,16 @@ int amiga_partition(struct parsed_partitions *state)
+ }
+ blk = be32_to_cpu(rdb->rdb_PartitionList);
+ put_dev_sector(sect);
+- for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) {
+- blk *= blksize; /* Read in terms partition table understands */
++ for (part = 1; (s32) blk>0 && part<=16; part++, put_dev_sector(sect)) {
++ /* Read in terms partition table understands */
++ if (check_mul_overflow(blk, (sector_t) blksize, &blk)) {
++ pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n",
++ state->disk->disk_name, blk, part);
++ break;
++ }
+ data = read_part_sector(state, blk, &sect);
+ if (!data) {
+- pr_err("Dev %s: unable to read partition block %d\n",
++ pr_err("Dev %s: unable to read partition block %llu\n",
+ state->disk->disk_name, blk);
+ res = -1;
+ goto rdb_done;
+@@ -94,19 +111,70 @@ int amiga_partition(struct parsed_partitions *state)
+ if (checksum_block((__be32 *)pb, be32_to_cpu(pb->pb_SummedLongs) & 0x7F) != 0 )
+ continue;
+
+- /* Tell Kernel about it */
++ /* RDB gives us more than enough rope to hang ourselves with,
++ * many times over (2^128 bytes if all fields max out).
++ * Some careful checks are in order, so check for potential
++ * overflows.
++ * We are multiplying four 32 bit numbers to one sector_t!
++ */
++
++ nr_hd = be32_to_cpu(pb->pb_Environment[NR_HD]);
++ nr_sect = be32_to_cpu(pb->pb_Environment[NR_SECT]);
++
++ /* CylBlocks is total number of blocks per cylinder */
++ if (check_mul_overflow(nr_hd, nr_sect, &cylblk)) {
++ pr_err("Dev %s: heads*sects %u overflows u32, skipping partition!\n",
++ state->disk->disk_name, cylblk);
++ continue;
++ }
++
++ /* check for consistency with RDB defined CylBlocks */
++ if (cylblk > be32_to_cpu(rdb->rdb_CylBlocks)) {
++ pr_warn("Dev %s: cylblk %u > rdb_CylBlocks %u!\n",
++ state->disk->disk_name, cylblk,
++ be32_to_cpu(rdb->rdb_CylBlocks));
++ }
++
++ /* RDB allows for variable logical block size -
++ * normalize to 512 byte blocks and check result.
++ */
++
++ if (check_mul_overflow(cylblk, blksize, &cylblk)) {
++ pr_err("Dev %s: partition %u bytes per cyl. overflows u32, skipping partition!\n",
++ state->disk->disk_name, part);
++ continue;
++ }
++
++ /* Calculate partition start and end. Limit of 32 bit on cylblk
++ * guarantees no overflow occurs if LBD support is enabled.
++ */
++
++ lo_cyl = be32_to_cpu(pb->pb_Environment[LO_CYL]);
++ start_sect = ((u64) lo_cyl * cylblk);
++
++ hi_cyl = be32_to_cpu(pb->pb_Environment[HI_CYL]);
++ nr_sects = (((u64) hi_cyl - lo_cyl + 1) * cylblk);
+
+- nr_sects = (be32_to_cpu(pb->pb_Environment[10]) + 1 -
+- be32_to_cpu(pb->pb_Environment[9])) *
+- be32_to_cpu(pb->pb_Environment[3]) *
+- be32_to_cpu(pb->pb_Environment[5]) *
+- blksize;
+ if (!nr_sects)
+ continue;
+- start_sect = be32_to_cpu(pb->pb_Environment[9]) *
+- be32_to_cpu(pb->pb_Environment[3]) *
+- be32_to_cpu(pb->pb_Environment[5]) *
+- blksize;
++
++ /* Warn user if partition end overflows u32 (AmigaDOS limit) */
++
++ if ((start_sect + nr_sects) > UINT_MAX) {
++ pr_warn("Dev %s: partition %u (%llu-%llu) needs 64 bit device support!\n",
++ state->disk->disk_name, part,
++ start_sect, start_sect + nr_sects);
++ }
++
++ if (check_add_overflow(start_sect, nr_sects, &end_sect)) {
++ pr_err("Dev %s: partition %u (%llu-%llu) needs LBD device support, skipping partition!\n",
++ state->disk->disk_name, part,
++ start_sect, end_sect);
++ continue;
++ }
++
++ /* Tell Kernel about it */
++
+ put_partition(state,slot++,start_sect,nr_sects);
+ {
+ /* Be even more informative to aid mounting */
+diff --git a/crypto/jitterentropy.c b/crypto/jitterentropy.c
+index 22f48bf4c6f57..227cedfa4f0ae 100644
+--- a/crypto/jitterentropy.c
++++ b/crypto/jitterentropy.c
+@@ -117,7 +117,6 @@ struct rand_data {
+ * zero). */
+ #define JENT_ESTUCK 8 /* Too many stuck results during init. */
+ #define JENT_EHEALTH 9 /* Health test failed during initialization */
+-#define JENT_ERCT 10 /* RCT failed during initialization */
+
+ /*
+ * The output n bits can receive more than n bits of min entropy, of course,
+@@ -762,14 +761,12 @@ int jent_entropy_init(void)
+ if ((nonstuck % JENT_APT_WINDOW_SIZE) == 0) {
+ jent_apt_reset(&ec,
+ delta & JENT_APT_WORD_MASK);
+- if (jent_health_failure(&ec))
+- return JENT_EHEALTH;
+ }
+ }
+
+- /* Validate RCT */
+- if (jent_rct_failure(&ec))
+- return JENT_ERCT;
++ /* Validate health test result */
++ if (jent_health_failure(&ec))
++ return JENT_EHEALTH;
+
+ /* test whether we have an increasing timer */
+ if (!(time2 > time))
+diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c
+index b778cf764a68a..5539c84ee7171 100644
+--- a/drivers/accel/habanalabs/gaudi2/gaudi2.c
++++ b/drivers/accel/habanalabs/gaudi2/gaudi2.c
+@@ -7231,7 +7231,7 @@ static bool gaudi2_get_tpc_idle_status(struct hl_device *hdev, u64 *mask_arr, u8
+
+ gaudi2_iterate_tpcs(hdev, &tpc_iter);
+
+- return tpc_idle_data.is_idle;
++ return *tpc_idle_data.is_idle;
+ }
+
+ static bool gaudi2_get_decoder_idle_status(struct hl_device *hdev, u64 *mask_arr, u8 mask_len,
+diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
+index 34ad071a64e96..4382fe13ee3e4 100644
+--- a/drivers/acpi/apei/ghes.c
++++ b/drivers/acpi/apei/ghes.c
+@@ -1544,6 +1544,8 @@ struct list_head *ghes_get_devices(void)
+
+ pr_warn_once("Force-loading ghes_edac on an unsupported platform. You're on your own!\n");
+ }
++ } else if (list_empty(&ghes_devs)) {
++ return NULL;
+ }
+
+ return &ghes_devs;
+diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
+index 32084e38b73d0..5cb2023581d4d 100644
+--- a/drivers/base/power/domain.c
++++ b/drivers/base/power/domain.c
+@@ -1632,9 +1632,6 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
+
+ dev_dbg(dev, "%s()\n", __func__);
+
+- if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(dev))
+- return -EINVAL;
+-
+ gpd_data = genpd_alloc_dev_data(dev, gd);
+ if (IS_ERR(gpd_data))
+ return PTR_ERR(gpd_data);
+@@ -1676,6 +1673,9 @@ int pm_genpd_add_device(struct generic_pm_domain *genpd, struct device *dev)
+ {
+ int ret;
+
++ if (!genpd || !dev)
++ return -EINVAL;
++
+ mutex_lock(&gpd_list_lock);
+ ret = genpd_add_device(genpd, dev, dev);
+ mutex_unlock(&gpd_list_lock);
+@@ -2523,6 +2523,9 @@ int of_genpd_add_device(struct of_phandle_args *genpdspec, struct device *dev)
+ struct generic_pm_domain *genpd;
+ int ret;
+
++ if (!dev)
++ return -EINVAL;
++
+ mutex_lock(&gpd_list_lock);
+
+ genpd = genpd_get_from_provider(genpdspec);
+@@ -2939,10 +2942,10 @@ static int genpd_parse_state(struct genpd_power_state *genpd_state,
+
+ err = of_property_read_u32(state_node, "min-residency-us", &residency);
+ if (!err)
+- genpd_state->residency_ns = 1000 * residency;
++ genpd_state->residency_ns = 1000LL * residency;
+
+- genpd_state->power_on_latency_ns = 1000 * exit_latency;
+- genpd_state->power_off_latency_ns = 1000 * entry_latency;
++ genpd_state->power_on_latency_ns = 1000LL * exit_latency;
++ genpd_state->power_off_latency_ns = 1000LL * entry_latency;
+ genpd_state->fwnode = &state_node->fwnode;
+
+ return 0;
+diff --git a/drivers/base/property.c b/drivers/base/property.c
+index f6117ec9805c4..8c40abed78524 100644
+--- a/drivers/base/property.c
++++ b/drivers/base/property.c
+@@ -987,12 +987,18 @@ EXPORT_SYMBOL(fwnode_iomap);
+ * @fwnode: Pointer to the firmware node
+ * @index: Zero-based index of the IRQ
+ *
+- * Return: Linux IRQ number on success. Other values are determined
+- * according to acpi_irq_get() or of_irq_get() operation.
++ * Return: Linux IRQ number on success. Negative errno on failure.
+ */
+ int fwnode_irq_get(const struct fwnode_handle *fwnode, unsigned int index)
+ {
+- return fwnode_call_int_op(fwnode, irq_get, index);
++ int ret;
++
++ ret = fwnode_call_int_op(fwnode, irq_get, index);
++ /* We treat mapping errors as invalid case */
++ if (ret == 0)
++ return -EINVAL;
++
++ return ret;
+ }
+ EXPORT_SYMBOL(fwnode_irq_get);
+
+diff --git a/drivers/bus/fsl-mc/dprc-driver.c b/drivers/bus/fsl-mc/dprc-driver.c
+index 4c84be378bf27..ec5f26a45641b 100644
+--- a/drivers/bus/fsl-mc/dprc-driver.c
++++ b/drivers/bus/fsl-mc/dprc-driver.c
+@@ -45,6 +45,9 @@ static int __fsl_mc_device_remove_if_not_in_mc(struct device *dev, void *data)
+ struct fsl_mc_child_objs *objs;
+ struct fsl_mc_device *mc_dev;
+
++ if (!dev_is_fsl_mc(dev))
++ return 0;
++
+ mc_dev = to_fsl_mc_device(dev);
+ objs = data;
+
+@@ -64,6 +67,9 @@ static int __fsl_mc_device_remove_if_not_in_mc(struct device *dev, void *data)
+
+ static int __fsl_mc_device_remove(struct device *dev, void *data)
+ {
++ if (!dev_is_fsl_mc(dev))
++ return 0;
++
+ fsl_mc_device_remove(to_fsl_mc_device(dev));
+ return 0;
+ }
+diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
+index 6c49de37d5e90..21fe9854703f9 100644
+--- a/drivers/bus/ti-sysc.c
++++ b/drivers/bus/ti-sysc.c
+@@ -1791,7 +1791,7 @@ static u32 sysc_quirk_dispc(struct sysc *ddata, int dispc_offset,
+ if (!ddata->module_va)
+ return -EIO;
+
+- /* DISP_CONTROL */
++ /* DISP_CONTROL, shut down lcd and digit on disable if enabled */
+ val = sysc_read(ddata, dispc_offset + 0x40);
+ lcd_en = val & lcd_en_mask;
+ digit_en = val & digit_en_mask;
+@@ -1803,7 +1803,7 @@ static u32 sysc_quirk_dispc(struct sysc *ddata, int dispc_offset,
+ else
+ irq_mask |= BIT(2) | BIT(3); /* EVSYNC bits */
+ }
+- if (disable & (lcd_en | digit_en))
++ if (disable && (lcd_en || digit_en))
+ sysc_write(ddata, dispc_offset + 0x40,
+ val & ~(lcd_en_mask | digit_en_mask));
+
+diff --git a/drivers/cdx/cdx.c b/drivers/cdx/cdx.c
+index 38511fd363257..d2cad4c670a07 100644
+--- a/drivers/cdx/cdx.c
++++ b/drivers/cdx/cdx.c
+@@ -62,6 +62,8 @@
+ #include <linux/mm.h>
+ #include <linux/xarray.h>
+ #include <linux/cdx/cdx_bus.h>
++#include <linux/iommu.h>
++#include <linux/dma-map-ops.h>
+ #include "cdx.h"
+
+ /* Default DMA mask for devices on a CDX bus */
+@@ -257,6 +259,7 @@ static void cdx_shutdown(struct device *dev)
+
+ static int cdx_dma_configure(struct device *dev)
+ {
++ struct cdx_driver *cdx_drv = to_cdx_driver(dev->driver);
+ struct cdx_device *cdx_dev = to_cdx_device(dev);
+ u32 input_id = cdx_dev->req_id;
+ int ret;
+@@ -267,9 +270,23 @@ static int cdx_dma_configure(struct device *dev)
+ return ret;
+ }
+
++ if (!ret && !cdx_drv->driver_managed_dma) {
++ ret = iommu_device_use_default_domain(dev);
++ if (ret)
++ arch_teardown_dma_ops(dev);
++ }
++
+ return 0;
+ }
+
++static void cdx_dma_cleanup(struct device *dev)
++{
++ struct cdx_driver *cdx_drv = to_cdx_driver(dev->driver);
++
++ if (!cdx_drv->driver_managed_dma)
++ iommu_device_unuse_default_domain(dev);
++}
++
+ /* show configuration fields */
+ #define cdx_config_attr(field, format_string) \
+ static ssize_t \
+@@ -405,6 +422,7 @@ struct bus_type cdx_bus_type = {
+ .remove = cdx_remove,
+ .shutdown = cdx_shutdown,
+ .dma_configure = cdx_dma_configure,
++ .dma_cleanup = cdx_dma_cleanup,
+ .bus_groups = cdx_bus_groups,
+ .dev_groups = cdx_dev_groups,
+ };
+diff --git a/drivers/char/hw_random/st-rng.c b/drivers/char/hw_random/st-rng.c
+index 15ba1e6fae4d2..6e9dfac9fc9f4 100644
+--- a/drivers/char/hw_random/st-rng.c
++++ b/drivers/char/hw_random/st-rng.c
+@@ -42,7 +42,6 @@
+
+ struct st_rng_data {
+ void __iomem *base;
+- struct clk *clk;
+ struct hwrng ops;
+ };
+
+@@ -85,26 +84,18 @@ static int st_rng_probe(struct platform_device *pdev)
+ if (IS_ERR(base))
+ return PTR_ERR(base);
+
+- clk = devm_clk_get(&pdev->dev, NULL);
++ clk = devm_clk_get_enabled(&pdev->dev, NULL);
+ if (IS_ERR(clk))
+ return PTR_ERR(clk);
+
+- ret = clk_prepare_enable(clk);
+- if (ret)
+- return ret;
+-
+ ddata->ops.priv = (unsigned long)ddata;
+ ddata->ops.read = st_rng_read;
+ ddata->ops.name = pdev->name;
+ ddata->base = base;
+- ddata->clk = clk;
+-
+- dev_set_drvdata(&pdev->dev, ddata);
+
+ ret = devm_hwrng_register(&pdev->dev, &ddata->ops);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to register HW RNG\n");
+- clk_disable_unprepare(clk);
+ return ret;
+ }
+
+@@ -113,15 +104,6 @@ static int st_rng_probe(struct platform_device *pdev)
+ return 0;
+ }
+
+-static int st_rng_remove(struct platform_device *pdev)
+-{
+- struct st_rng_data *ddata = dev_get_drvdata(&pdev->dev);
+-
+- clk_disable_unprepare(ddata->clk);
+-
+- return 0;
+-}
+-
+ static const struct of_device_id st_rng_match[] __maybe_unused = {
+ { .compatible = "st,rng" },
+ {},
+@@ -134,7 +116,6 @@ static struct platform_driver st_rng_driver = {
+ .of_match_table = of_match_ptr(st_rng_match),
+ },
+ .probe = st_rng_probe,
+- .remove = st_rng_remove
+ };
+
+ module_platform_driver(st_rng_driver);
+diff --git a/drivers/char/hw_random/virtio-rng.c b/drivers/char/hw_random/virtio-rng.c
+index f7690e0f92ede..e41a84e6b4b56 100644
+--- a/drivers/char/hw_random/virtio-rng.c
++++ b/drivers/char/hw_random/virtio-rng.c
+@@ -4,6 +4,7 @@
+ * Copyright (C) 2007, 2008 Rusty Russell IBM Corporation
+ */
+
++#include <asm/barrier.h>
+ #include <linux/err.h>
+ #include <linux/hw_random.h>
+ #include <linux/scatterlist.h>
+@@ -37,13 +38,13 @@ struct virtrng_info {
+ static void random_recv_done(struct virtqueue *vq)
+ {
+ struct virtrng_info *vi = vq->vdev->priv;
++ unsigned int len;
+
+ /* We can get spurious callbacks, e.g. shared IRQs + virtio_pci. */
+- if (!virtqueue_get_buf(vi->vq, &vi->data_avail))
++ if (!virtqueue_get_buf(vi->vq, &len))
+ return;
+
+- vi->data_idx = 0;
+-
++ smp_store_release(&vi->data_avail, len);
+ complete(&vi->have_data);
+ }
+
+@@ -52,7 +53,6 @@ static void request_entropy(struct virtrng_info *vi)
+ struct scatterlist sg;
+
+ reinit_completion(&vi->have_data);
+- vi->data_avail = 0;
+ vi->data_idx = 0;
+
+ sg_init_one(&sg, vi->data, sizeof(vi->data));
+@@ -88,7 +88,7 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait)
+ read = 0;
+
+ /* copy available data */
+- if (vi->data_avail) {
++ if (smp_load_acquire(&vi->data_avail)) {
+ chunk = copy_data(vi, buf, size);
+ size -= chunk;
+ read += chunk;
+diff --git a/drivers/clk/bcm/clk-raspberrypi.c b/drivers/clk/bcm/clk-raspberrypi.c
+index eb399a4d141ba..829406dc44a20 100644
+--- a/drivers/clk/bcm/clk-raspberrypi.c
++++ b/drivers/clk/bcm/clk-raspberrypi.c
+@@ -356,9 +356,9 @@ static int raspberrypi_discover_clocks(struct raspberrypi_clk *rpi,
+ while (clks->id) {
+ struct raspberrypi_clk_variant *variant;
+
+- if (clks->id > RPI_FIRMWARE_NUM_CLK_ID) {
++ if (clks->id >= RPI_FIRMWARE_NUM_CLK_ID) {
+ dev_err(rpi->dev, "Unknown clock id: %u (max: %u)\n",
+- clks->id, RPI_FIRMWARE_NUM_CLK_ID);
++ clks->id, RPI_FIRMWARE_NUM_CLK_ID - 1);
+ return -EINVAL;
+ }
+
+diff --git a/drivers/clk/clk-cdce925.c b/drivers/clk/clk-cdce925.c
+index 6350682f7e6d2..87890669297d8 100644
+--- a/drivers/clk/clk-cdce925.c
++++ b/drivers/clk/clk-cdce925.c
+@@ -701,6 +701,10 @@ static int cdce925_probe(struct i2c_client *client)
+ for (i = 0; i < data->chip_info->num_plls; ++i) {
+ pll_clk_name[i] = kasprintf(GFP_KERNEL, "%pOFn.pll%d",
+ client->dev.of_node, i);
++ if (!pll_clk_name[i]) {
++ err = -ENOMEM;
++ goto error;
++ }
+ init.name = pll_clk_name[i];
+ data->pll[i].chip = data;
+ data->pll[i].hw.init = &init;
+@@ -742,6 +746,10 @@ static int cdce925_probe(struct i2c_client *client)
+ init.num_parents = 1;
+ init.parent_names = &parent_name; /* Mux Y1 to input */
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.Y1", client->dev.of_node);
++ if (!init.name) {
++ err = -ENOMEM;
++ goto error;
++ }
+ data->clk[0].chip = data;
+ data->clk[0].hw.init = &init;
+ data->clk[0].index = 0;
+@@ -760,6 +768,10 @@ static int cdce925_probe(struct i2c_client *client)
+ for (i = 1; i < data->chip_info->num_outputs; ++i) {
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.Y%d",
+ client->dev.of_node, i+1);
++ if (!init.name) {
++ err = -ENOMEM;
++ goto error;
++ }
+ data->clk[i].chip = data;
+ data->clk[i].hw.init = &init;
+ data->clk[i].index = i;
+diff --git a/drivers/clk/clk-renesas-pcie.c b/drivers/clk/clk-renesas-pcie.c
+index 10d31c222a1cb..6060cafe1aa22 100644
+--- a/drivers/clk/clk-renesas-pcie.c
++++ b/drivers/clk/clk-renesas-pcie.c
+@@ -392,8 +392,8 @@ static const struct rs9_chip_info renesas_9fgv0441_info = {
+ };
+
+ static const struct i2c_device_id rs9_id[] = {
+- { "9fgv0241", .driver_data = RENESAS_9FGV0241 },
+- { "9fgv0441", .driver_data = RENESAS_9FGV0441 },
++ { "9fgv0241", .driver_data = (kernel_ulong_t)&renesas_9fgv0241_info },
++ { "9fgv0441", .driver_data = (kernel_ulong_t)&renesas_9fgv0441_info },
+ { }
+ };
+ MODULE_DEVICE_TABLE(i2c, rs9_id);
+diff --git a/drivers/clk/clk-si5341.c b/drivers/clk/clk-si5341.c
+index 0e528d7ba656e..c7d8cbd22bacc 100644
+--- a/drivers/clk/clk-si5341.c
++++ b/drivers/clk/clk-si5341.c
+@@ -1553,7 +1553,7 @@ static int si5341_probe(struct i2c_client *client)
+ struct clk_init_data init;
+ struct clk *input;
+ const char *root_clock_name;
+- const char *synth_clock_names[SI5341_NUM_SYNTH];
++ const char *synth_clock_names[SI5341_NUM_SYNTH] = { NULL };
+ int err;
+ unsigned int i;
+ struct clk_si5341_output_config config[SI5341_MAX_NUM_OUTPUTS];
+@@ -1697,6 +1697,10 @@ static int si5341_probe(struct i2c_client *client)
+ for (i = 0; i < data->num_synth; ++i) {
+ synth_clock_names[i] = devm_kasprintf(&client->dev, GFP_KERNEL,
+ "%s.N%u", client->dev.of_node->name, i);
++ if (!synth_clock_names[i]) {
++ err = -ENOMEM;
++ goto free_clk_names;
++ }
+ init.name = synth_clock_names[i];
+ data->synth[i].index = i;
+ data->synth[i].data = data;
+@@ -1705,6 +1709,7 @@ static int si5341_probe(struct i2c_client *client)
+ if (err) {
+ dev_err(&client->dev,
+ "synth N%u registration failed\n", i);
++ goto free_clk_names;
+ }
+ }
+
+@@ -1714,6 +1719,10 @@ static int si5341_probe(struct i2c_client *client)
+ for (i = 0; i < data->num_outputs; ++i) {
+ init.name = kasprintf(GFP_KERNEL, "%s.%d",
+ client->dev.of_node->name, i);
++ if (!init.name) {
++ err = -ENOMEM;
++ goto free_clk_names;
++ }
+ init.flags = config[i].synth_master ? CLK_SET_RATE_PARENT : 0;
+ data->clk[i].index = i;
+ data->clk[i].data = data;
+@@ -1735,7 +1744,7 @@ static int si5341_probe(struct i2c_client *client)
+ if (err) {
+ dev_err(&client->dev,
+ "output %u registration failed\n", i);
+- goto cleanup;
++ goto free_clk_names;
+ }
+ if (config[i].always_on)
+ clk_prepare(data->clk[i].hw.clk);
+@@ -1745,7 +1754,7 @@ static int si5341_probe(struct i2c_client *client)
+ data);
+ if (err) {
+ dev_err(&client->dev, "unable to add clk provider\n");
+- goto cleanup;
++ goto free_clk_names;
+ }
+
+ if (initialization_required) {
+@@ -1753,11 +1762,11 @@ static int si5341_probe(struct i2c_client *client)
+ regcache_cache_only(data->regmap, false);
+ err = regcache_sync(data->regmap);
+ if (err < 0)
+- goto cleanup;
++ goto free_clk_names;
+
+ err = si5341_finalize_defaults(data);
+ if (err < 0)
+- goto cleanup;
++ goto free_clk_names;
+ }
+
+ /* wait for device to report input clock present and PLL lock */
+@@ -1766,32 +1775,31 @@ static int si5341_probe(struct i2c_client *client)
+ 10000, 250000);
+ if (err) {
+ dev_err(&client->dev, "Error waiting for input clock or PLL lock\n");
+- goto cleanup;
++ goto free_clk_names;
+ }
+
+ /* clear sticky alarm bits from initialization */
+ err = regmap_write(data->regmap, SI5341_STATUS_STICKY, 0);
+ if (err) {
+ dev_err(&client->dev, "unable to clear sticky status\n");
+- goto cleanup;
++ goto free_clk_names;
+ }
+
+ err = sysfs_create_files(&client->dev.kobj, si5341_attributes);
+- if (err) {
++ if (err)
+ dev_err(&client->dev, "unable to create sysfs files\n");
+- goto cleanup;
+- }
+
++free_clk_names:
+ /* Free the names, clk framework makes copies */
+ for (i = 0; i < data->num_synth; ++i)
+ devm_kfree(&client->dev, (void *)synth_clock_names[i]);
+
+- return 0;
+-
+ cleanup:
+- for (i = 0; i < SI5341_MAX_NUM_OUTPUTS; ++i) {
+- if (data->clk[i].vddo_reg)
+- regulator_disable(data->clk[i].vddo_reg);
++ if (err) {
++ for (i = 0; i < SI5341_MAX_NUM_OUTPUTS; ++i) {
++ if (data->clk[i].vddo_reg)
++ regulator_disable(data->clk[i].vddo_reg);
++ }
+ }
+ return err;
+ }
+diff --git a/drivers/clk/clk-versaclock5.c b/drivers/clk/clk-versaclock5.c
+index fa71a57875ce8..e9a7f3c91ae0e 100644
+--- a/drivers/clk/clk-versaclock5.c
++++ b/drivers/clk/clk-versaclock5.c
+@@ -1028,6 +1028,11 @@ static int vc5_probe(struct i2c_client *client)
+ }
+
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.mux", client->dev.of_node);
++ if (!init.name) {
++ ret = -ENOMEM;
++ goto err_clk;
++ }
++
+ init.ops = &vc5_mux_ops;
+ init.flags = 0;
+ init.parent_names = parent_names;
+@@ -1042,6 +1047,10 @@ static int vc5_probe(struct i2c_client *client)
+ memset(&init, 0, sizeof(init));
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.dbl",
+ client->dev.of_node);
++ if (!init.name) {
++ ret = -ENOMEM;
++ goto err_clk;
++ }
+ init.ops = &vc5_dbl_ops;
+ init.flags = CLK_SET_RATE_PARENT;
+ init.parent_names = parent_names;
+@@ -1057,6 +1066,10 @@ static int vc5_probe(struct i2c_client *client)
+ /* Register PFD */
+ memset(&init, 0, sizeof(init));
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.pfd", client->dev.of_node);
++ if (!init.name) {
++ ret = -ENOMEM;
++ goto err_clk;
++ }
+ init.ops = &vc5_pfd_ops;
+ init.flags = CLK_SET_RATE_PARENT;
+ init.parent_names = parent_names;
+@@ -1074,6 +1087,10 @@ static int vc5_probe(struct i2c_client *client)
+ /* Register PLL */
+ memset(&init, 0, sizeof(init));
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.pll", client->dev.of_node);
++ if (!init.name) {
++ ret = -ENOMEM;
++ goto err_clk;
++ }
+ init.ops = &vc5_pll_ops;
+ init.flags = CLK_SET_RATE_PARENT;
+ init.parent_names = parent_names;
+@@ -1093,6 +1110,10 @@ static int vc5_probe(struct i2c_client *client)
+ memset(&init, 0, sizeof(init));
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.fod%d",
+ client->dev.of_node, idx);
++ if (!init.name) {
++ ret = -ENOMEM;
++ goto err_clk;
++ }
+ init.ops = &vc5_fod_ops;
+ init.flags = CLK_SET_RATE_PARENT;
+ init.parent_names = parent_names;
+@@ -1111,6 +1132,10 @@ static int vc5_probe(struct i2c_client *client)
+ memset(&init, 0, sizeof(init));
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.out0_sel_i2cb",
+ client->dev.of_node);
++ if (!init.name) {
++ ret = -ENOMEM;
++ goto err_clk;
++ }
+ init.ops = &vc5_clk_out_ops;
+ init.flags = CLK_SET_RATE_PARENT;
+ init.parent_names = parent_names;
+@@ -1137,6 +1162,10 @@ static int vc5_probe(struct i2c_client *client)
+ memset(&init, 0, sizeof(init));
+ init.name = kasprintf(GFP_KERNEL, "%pOFn.out%d",
+ client->dev.of_node, idx + 1);
++ if (!init.name) {
++ ret = -ENOMEM;
++ goto err_clk;
++ }
+ init.ops = &vc5_clk_out_ops;
+ init.flags = CLK_SET_RATE_PARENT;
+ init.parent_names = parent_names;
+@@ -1271,14 +1300,14 @@ static const struct vc5_chip_info idt_5p49v6975_info = {
+ };
+
+ static const struct i2c_device_id vc5_id[] = {
+- { "5p49v5923", .driver_data = IDT_VC5_5P49V5923 },
+- { "5p49v5925", .driver_data = IDT_VC5_5P49V5925 },
+- { "5p49v5933", .driver_data = IDT_VC5_5P49V5933 },
+- { "5p49v5935", .driver_data = IDT_VC5_5P49V5935 },
+- { "5p49v60", .driver_data = IDT_VC6_5P49V60 },
+- { "5p49v6901", .driver_data = IDT_VC6_5P49V6901 },
+- { "5p49v6965", .driver_data = IDT_VC6_5P49V6965 },
+- { "5p49v6975", .driver_data = IDT_VC6_5P49V6975 },
++ { "5p49v5923", .driver_data = (kernel_ulong_t)&idt_5p49v5923_info },
++ { "5p49v5925", .driver_data = (kernel_ulong_t)&idt_5p49v5925_info },
++ { "5p49v5933", .driver_data = (kernel_ulong_t)&idt_5p49v5933_info },
++ { "5p49v5935", .driver_data = (kernel_ulong_t)&idt_5p49v5935_info },
++ { "5p49v60", .driver_data = (kernel_ulong_t)&idt_5p49v60_info },
++ { "5p49v6901", .driver_data = (kernel_ulong_t)&idt_5p49v6901_info },
++ { "5p49v6965", .driver_data = (kernel_ulong_t)&idt_5p49v6965_info },
++ { "5p49v6975", .driver_data = (kernel_ulong_t)&idt_5p49v6975_info },
+ { }
+ };
+ MODULE_DEVICE_TABLE(i2c, vc5_id);
+diff --git a/drivers/clk/clk-versaclock7.c b/drivers/clk/clk-versaclock7.c
+index 8e4f86e852aa0..0ae191f50b4b2 100644
+--- a/drivers/clk/clk-versaclock7.c
++++ b/drivers/clk/clk-versaclock7.c
+@@ -1282,7 +1282,7 @@ static const struct regmap_config vc7_regmap_config = {
+ };
+
+ static const struct i2c_device_id vc7_i2c_id[] = {
+- { "rc21008a", VC7_RC21008A },
++ { "rc21008a", .driver_data = (kernel_ulong_t)&vc7_rc21008a_info },
+ {}
+ };
+ MODULE_DEVICE_TABLE(i2c, vc7_i2c_id);
+diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
+index 27c30a533759a..8c13bcf57f1ae 100644
+--- a/drivers/clk/clk.c
++++ b/drivers/clk/clk.c
+@@ -1549,6 +1549,7 @@ void clk_hw_forward_rate_request(const struct clk_hw *hw,
+ parent->core, req,
+ parent_rate);
+ }
++EXPORT_SYMBOL_GPL(clk_hw_forward_rate_request);
+
+ static bool clk_core_can_round(struct clk_core * const core)
+ {
+@@ -4695,6 +4696,7 @@ int devm_clk_notifier_register(struct device *dev, struct clk *clk,
+ if (!ret) {
+ devres->clk = clk;
+ devres->nb = nb;
++ devres_add(dev, devres);
+ } else {
+ devres_free(devres);
+ }
+diff --git a/drivers/clk/imx/clk-composite-8m.c b/drivers/clk/imx/clk-composite-8m.c
+index cbf0d7955a00a..7a6e3ce97133b 100644
+--- a/drivers/clk/imx/clk-composite-8m.c
++++ b/drivers/clk/imx/clk-composite-8m.c
+@@ -119,10 +119,41 @@ static int imx8m_clk_composite_divider_set_rate(struct clk_hw *hw,
+ return ret;
+ }
+
++static int imx8m_divider_determine_rate(struct clk_hw *hw,
++ struct clk_rate_request *req)
++{
++ struct clk_divider *divider = to_clk_divider(hw);
++ int prediv_value;
++ int div_value;
++
++ /* if read only, just return current value */
++ if (divider->flags & CLK_DIVIDER_READ_ONLY) {
++ u32 val;
++
++ val = readl(divider->reg);
++ prediv_value = val >> divider->shift;
++ prediv_value &= clk_div_mask(divider->width);
++ prediv_value++;
++
++ div_value = val >> PCG_DIV_SHIFT;
++ div_value &= clk_div_mask(PCG_DIV_WIDTH);
++ div_value++;
++
++ return divider_ro_determine_rate(hw, req, divider->table,
++ PCG_PREDIV_WIDTH + PCG_DIV_WIDTH,
++ divider->flags, prediv_value * div_value);
++ }
++
++ return divider_determine_rate(hw, req, divider->table,
++ PCG_PREDIV_WIDTH + PCG_DIV_WIDTH,
++ divider->flags);
++}
++
+ static const struct clk_ops imx8m_clk_composite_divider_ops = {
+ .recalc_rate = imx8m_clk_composite_divider_recalc_rate,
+ .round_rate = imx8m_clk_composite_divider_round_rate,
+ .set_rate = imx8m_clk_composite_divider_set_rate,
++ .determine_rate = imx8m_divider_determine_rate,
+ };
+
+ static u8 imx8m_clk_composite_mux_get_parent(struct clk_hw *hw)
+diff --git a/drivers/clk/imx/clk-imx8mn.c b/drivers/clk/imx/clk-imx8mn.c
+index 4b23a46486004..4bd1ed11353b3 100644
+--- a/drivers/clk/imx/clk-imx8mn.c
++++ b/drivers/clk/imx/clk-imx8mn.c
+@@ -323,7 +323,7 @@ static int imx8mn_clocks_probe(struct platform_device *pdev)
+ void __iomem *base;
+ int ret;
+
+- clk_hw_data = kzalloc(struct_size(clk_hw_data, hws,
++ clk_hw_data = devm_kzalloc(dev, struct_size(clk_hw_data, hws,
+ IMX8MN_CLK_END), GFP_KERNEL);
+ if (WARN_ON(!clk_hw_data))
+ return -ENOMEM;
+@@ -340,10 +340,10 @@ static int imx8mn_clocks_probe(struct platform_device *pdev)
+ hws[IMX8MN_CLK_EXT4] = imx_get_clk_hw_by_name(np, "clk_ext4");
+
+ np = of_find_compatible_node(NULL, NULL, "fsl,imx8mn-anatop");
+- base = of_iomap(np, 0);
++ base = devm_of_iomap(dev, np, 0, NULL);
+ of_node_put(np);
+- if (WARN_ON(!base)) {
+- ret = -ENOMEM;
++ if (WARN_ON(IS_ERR(base))) {
++ ret = PTR_ERR(base);
+ goto unregister_hws;
+ }
+
+diff --git a/drivers/clk/imx/clk-imx8mp.c b/drivers/clk/imx/clk-imx8mp.c
+index f26ae8de4cc6f..1469249386dd8 100644
+--- a/drivers/clk/imx/clk-imx8mp.c
++++ b/drivers/clk/imx/clk-imx8mp.c
+@@ -414,25 +414,22 @@ static int imx8mp_clocks_probe(struct platform_device *pdev)
+ struct device *dev = &pdev->dev;
+ struct device_node *np;
+ void __iomem *anatop_base, *ccm_base;
++ int err;
+
+ np = of_find_compatible_node(NULL, NULL, "fsl,imx8mp-anatop");
+- anatop_base = of_iomap(np, 0);
++ anatop_base = devm_of_iomap(dev, np, 0, NULL);
+ of_node_put(np);
+- if (WARN_ON(!anatop_base))
+- return -ENOMEM;
++ if (WARN_ON(IS_ERR(anatop_base)))
++ return PTR_ERR(anatop_base);
+
+ np = dev->of_node;
+ ccm_base = devm_platform_ioremap_resource(pdev, 0);
+- if (WARN_ON(IS_ERR(ccm_base))) {
+- iounmap(anatop_base);
++ if (WARN_ON(IS_ERR(ccm_base)))
+ return PTR_ERR(ccm_base);
+- }
+
+- clk_hw_data = kzalloc(struct_size(clk_hw_data, hws, IMX8MP_CLK_END), GFP_KERNEL);
+- if (WARN_ON(!clk_hw_data)) {
+- iounmap(anatop_base);
++ clk_hw_data = devm_kzalloc(dev, struct_size(clk_hw_data, hws, IMX8MP_CLK_END), GFP_KERNEL);
++ if (WARN_ON(!clk_hw_data))
+ return -ENOMEM;
+- }
+
+ clk_hw_data->num = IMX8MP_CLK_END;
+ hws = clk_hw_data->hws;
+@@ -722,7 +719,12 @@ static int imx8mp_clocks_probe(struct platform_device *pdev)
+
+ imx_check_clk_hws(hws, IMX8MP_CLK_END);
+
+- of_clk_add_hw_provider(np, of_clk_hw_onecell_get, clk_hw_data);
++ err = of_clk_add_hw_provider(np, of_clk_hw_onecell_get, clk_hw_data);
++ if (err < 0) {
++ dev_err(dev, "failed to register hws for i.MX8MP\n");
++ imx_unregister_hw_clocks(hws, IMX8MP_CLK_END);
++ return err;
++ }
+
+ imx_register_uart_clocks();
+
+diff --git a/drivers/clk/imx/clk-imx93.c b/drivers/clk/imx/clk-imx93.c
+index 07b4a043e4495..b6c7c2725906c 100644
+--- a/drivers/clk/imx/clk-imx93.c
++++ b/drivers/clk/imx/clk-imx93.c
+@@ -264,7 +264,7 @@ static int imx93_clocks_probe(struct platform_device *pdev)
+ void __iomem *base, *anatop_base;
+ int i, ret;
+
+- clk_hw_data = kzalloc(struct_size(clk_hw_data, hws,
++ clk_hw_data = devm_kzalloc(dev, struct_size(clk_hw_data, hws,
+ IMX93_CLK_END), GFP_KERNEL);
+ if (!clk_hw_data)
+ return -ENOMEM;
+@@ -288,10 +288,12 @@ static int imx93_clocks_probe(struct platform_device *pdev)
+ "sys_pll_pfd2", 1, 2);
+
+ np = of_find_compatible_node(NULL, NULL, "fsl,imx93-anatop");
+- anatop_base = of_iomap(np, 0);
++ anatop_base = devm_of_iomap(dev, np, 0, NULL);
+ of_node_put(np);
+- if (WARN_ON(!anatop_base))
+- return -ENOMEM;
++ if (WARN_ON(IS_ERR(anatop_base))) {
++ ret = PTR_ERR(base);
++ goto unregister_hws;
++ }
+
+ clks[IMX93_CLK_ARM_PLL] = imx_clk_fracn_gppll_integer("arm_pll", "osc_24m",
+ anatop_base + 0x1000,
+@@ -304,8 +306,8 @@ static int imx93_clocks_probe(struct platform_device *pdev)
+ np = dev->of_node;
+ base = devm_platform_ioremap_resource(pdev, 0);
+ if (WARN_ON(IS_ERR(base))) {
+- iounmap(anatop_base);
+- return PTR_ERR(base);
++ ret = PTR_ERR(base);
++ goto unregister_hws;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(root_array); i++) {
+@@ -345,7 +347,6 @@ static int imx93_clocks_probe(struct platform_device *pdev)
+
+ unregister_hws:
+ imx_unregister_hw_clocks(clks, IMX93_CLK_END);
+- iounmap(anatop_base);
+
+ return ret;
+ }
+diff --git a/drivers/clk/imx/clk-imxrt1050.c b/drivers/clk/imx/clk-imxrt1050.c
+index fd5c51fc92c0e..08d155feb035a 100644
+--- a/drivers/clk/imx/clk-imxrt1050.c
++++ b/drivers/clk/imx/clk-imxrt1050.c
+@@ -42,7 +42,7 @@ static int imxrt1050_clocks_probe(struct platform_device *pdev)
+ struct device_node *anp;
+ int ret;
+
+- clk_hw_data = kzalloc(struct_size(clk_hw_data, hws,
++ clk_hw_data = devm_kzalloc(dev, struct_size(clk_hw_data, hws,
+ IMXRT1050_CLK_END), GFP_KERNEL);
+ if (WARN_ON(!clk_hw_data))
+ return -ENOMEM;
+@@ -53,10 +53,12 @@ static int imxrt1050_clocks_probe(struct platform_device *pdev)
+ hws[IMXRT1050_CLK_OSC] = imx_get_clk_hw_by_name(np, "osc");
+
+ anp = of_find_compatible_node(NULL, NULL, "fsl,imxrt-anatop");
+- pll_base = of_iomap(anp, 0);
++ pll_base = devm_of_iomap(dev, anp, 0, NULL);
+ of_node_put(anp);
+- if (WARN_ON(!pll_base))
+- return -ENOMEM;
++ if (WARN_ON(IS_ERR(pll_base))) {
++ ret = PTR_ERR(pll_base);
++ goto unregister_hws;
++ }
+
+ /* Anatop clocks */
+ hws[IMXRT1050_CLK_DUMMY] = imx_clk_hw_fixed("dummy", 0UL);
+@@ -104,8 +106,10 @@ static int imxrt1050_clocks_probe(struct platform_device *pdev)
+
+ /* CCM clocks */
+ ccm_base = devm_platform_ioremap_resource(pdev, 0);
+- if (WARN_ON(IS_ERR(ccm_base)))
+- return PTR_ERR(ccm_base);
++ if (WARN_ON(IS_ERR(ccm_base))) {
++ ret = PTR_ERR(ccm_base);
++ goto unregister_hws;
++ }
+
+ hws[IMXRT1050_CLK_ARM_PODF] = imx_clk_hw_divider("arm_podf", "pll1_arm", ccm_base + 0x10, 0, 3);
+ hws[IMXRT1050_CLK_PRE_PERIPH_SEL] = imx_clk_hw_mux("pre_periph_sel", ccm_base + 0x18, 18, 2,
+@@ -149,8 +153,12 @@ static int imxrt1050_clocks_probe(struct platform_device *pdev)
+ ret = of_clk_add_hw_provider(np, of_clk_hw_onecell_get, clk_hw_data);
+ if (ret < 0) {
+ dev_err(dev, "Failed to register clks for i.MXRT1050.\n");
+- imx_unregister_hw_clocks(hws, IMXRT1050_CLK_END);
++ goto unregister_hws;
+ }
++ return 0;
++
++unregister_hws:
++ imx_unregister_hw_clocks(hws, IMXRT1050_CLK_END);
+ return ret;
+ }
+ static const struct of_device_id imxrt1050_clk_of_match[] = {
+diff --git a/drivers/clk/imx/clk-scu.c b/drivers/clk/imx/clk-scu.c
+index 1e6870f3671f6..db307890e4c16 100644
+--- a/drivers/clk/imx/clk-scu.c
++++ b/drivers/clk/imx/clk-scu.c
+@@ -707,11 +707,11 @@ struct clk_hw *imx_clk_scu_alloc_dev(const char *name,
+
+ void imx_clk_scu_unregister(void)
+ {
+- struct imx_scu_clk_node *clk;
++ struct imx_scu_clk_node *clk, *n;
+ int i;
+
+ for (i = 0; i < IMX_SC_R_LAST; i++) {
+- list_for_each_entry(clk, &imx_scu_clks[i], node) {
++ list_for_each_entry_safe(clk, n, &imx_scu_clks[i], node) {
+ clk_hw_unregister(clk->hw);
+ kfree(clk);
+ }
+diff --git a/drivers/clk/keystone/sci-clk.c b/drivers/clk/keystone/sci-clk.c
+index 910ecd58c4ca2..6c1df4f11536d 100644
+--- a/drivers/clk/keystone/sci-clk.c
++++ b/drivers/clk/keystone/sci-clk.c
+@@ -294,6 +294,8 @@ static int _sci_clk_build(struct sci_clk_provider *provider,
+
+ name = kasprintf(GFP_KERNEL, "clk:%d:%d", sci_clk->dev_id,
+ sci_clk->clk_id);
++ if (!name)
++ return -ENOMEM;
+
+ init.name = name;
+
+diff --git a/drivers/clk/mediatek/clk-mt8173-apmixedsys.c b/drivers/clk/mediatek/clk-mt8173-apmixedsys.c
+index 8c2aa8b0f39ea..307c24aa1fb41 100644
+--- a/drivers/clk/mediatek/clk-mt8173-apmixedsys.c
++++ b/drivers/clk/mediatek/clk-mt8173-apmixedsys.c
+@@ -148,11 +148,13 @@ static int clk_mt8173_apmixed_probe(struct platform_device *pdev)
+
+ base = of_iomap(node, 0);
+ if (!base)
+- return PTR_ERR(base);
++ return -ENOMEM;
+
+ clk_data = mtk_alloc_clk_data(CLK_APMIXED_NR_CLK);
+- if (IS_ERR_OR_NULL(clk_data))
++ if (IS_ERR_OR_NULL(clk_data)) {
++ iounmap(base);
+ return -ENOMEM;
++ }
+
+ fhctl_parse_dt(fhctl_node, pllfhs, ARRAY_SIZE(pllfhs));
+ r = mtk_clk_register_pllfhs(node, plls, ARRAY_SIZE(plls),
+@@ -186,6 +188,7 @@ unregister_plls:
+ ARRAY_SIZE(pllfhs), clk_data);
+ free_clk_data:
+ mtk_free_clk_data(clk_data);
++ iounmap(base);
+ return r;
+ }
+
+diff --git a/drivers/clk/mediatek/clk-mtk.c b/drivers/clk/mediatek/clk-mtk.c
+index fd2214c3242f2..affaf52c82bd4 100644
+--- a/drivers/clk/mediatek/clk-mtk.c
++++ b/drivers/clk/mediatek/clk-mtk.c
+@@ -469,7 +469,7 @@ static int __mtk_clk_simple_probe(struct platform_device *pdev,
+ const struct platform_device_id *id;
+ const struct mtk_clk_desc *mcd;
+ struct clk_hw_onecell_data *clk_data;
+- void __iomem *base;
++ void __iomem *base = NULL;
+ int num_clks, r;
+
+ mcd = device_get_match_data(&pdev->dev);
+@@ -483,8 +483,8 @@ static int __mtk_clk_simple_probe(struct platform_device *pdev,
+ return -EINVAL;
+ }
+
+- /* Composite clocks needs us to pass iomem pointer */
+- if (mcd->composite_clks) {
++ /* Composite and divider clocks needs us to pass iomem pointer */
++ if (mcd->composite_clks || mcd->divider_clks) {
+ if (!mcd->shared_io)
+ base = devm_platform_ioremap_resource(pdev, 0);
+ else
+@@ -500,8 +500,10 @@ static int __mtk_clk_simple_probe(struct platform_device *pdev,
+ num_clks += mcd->num_mux_clks + mcd->num_divider_clks;
+
+ clk_data = mtk_alloc_clk_data(num_clks);
+- if (!clk_data)
+- return -ENOMEM;
++ if (!clk_data) {
++ r = -ENOMEM;
++ goto free_base;
++ }
+
+ if (mcd->fixed_clks) {
+ r = mtk_clk_register_fixed_clks(mcd->fixed_clks,
+@@ -599,6 +601,7 @@ unregister_fixed_clks:
+ mcd->num_fixed_clks, clk_data);
+ free_data:
+ mtk_free_clk_data(clk_data);
++free_base:
+ if (mcd->shared_io && base)
+ iounmap(base);
+ return r;
+diff --git a/drivers/clk/qcom/camcc-sc7180.c b/drivers/clk/qcom/camcc-sc7180.c
+index e2b4804695f37..8a4ba7a19ed12 100644
+--- a/drivers/clk/qcom/camcc-sc7180.c
++++ b/drivers/clk/qcom/camcc-sc7180.c
+@@ -1480,12 +1480,21 @@ static struct clk_branch cam_cc_sys_tmr_clk = {
+ },
+ };
+
++static struct gdsc titan_top_gdsc = {
++ .gdscr = 0xb134,
++ .pd = {
++ .name = "titan_top_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++};
++
+ static struct gdsc bps_gdsc = {
+ .gdscr = 0x6004,
+ .pd = {
+ .name = "bps_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
++ .parent = &titan_top_gdsc.pd,
+ .flags = HW_CTRL,
+ };
+
+@@ -1495,6 +1504,7 @@ static struct gdsc ife_0_gdsc = {
+ .name = "ife_0_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
++ .parent = &titan_top_gdsc.pd,
+ };
+
+ static struct gdsc ife_1_gdsc = {
+@@ -1503,6 +1513,7 @@ static struct gdsc ife_1_gdsc = {
+ .name = "ife_1_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
++ .parent = &titan_top_gdsc.pd,
+ };
+
+ static struct gdsc ipe_0_gdsc = {
+@@ -1512,15 +1523,9 @@ static struct gdsc ipe_0_gdsc = {
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+ .flags = HW_CTRL,
++ .parent = &titan_top_gdsc.pd,
+ };
+
+-static struct gdsc titan_top_gdsc = {
+- .gdscr = 0xb134,
+- .pd = {
+- .name = "titan_top_gdsc",
+- },
+- .pwrsts = PWRSTS_OFF_ON,
+-};
+
+ static struct clk_hw *cam_cc_sc7180_hws[] = {
+ [CAM_CC_PLL2_OUT_EARLY] = &cam_cc_pll2_out_early.hw,
+diff --git a/drivers/clk/qcom/dispcc-qcm2290.c b/drivers/clk/qcom/dispcc-qcm2290.c
+index e9cfe41c04426..44dd5cfcc1504 100644
+--- a/drivers/clk/qcom/dispcc-qcm2290.c
++++ b/drivers/clk/qcom/dispcc-qcm2290.c
+@@ -24,9 +24,11 @@
+
+ enum {
+ P_BI_TCXO,
++ P_BI_TCXO_AO,
+ P_DISP_CC_PLL0_OUT_MAIN,
+ P_DSI0_PHY_PLL_OUT_BYTECLK,
+ P_DSI0_PHY_PLL_OUT_DSICLK,
++ P_GPLL0_OUT_DIV,
+ P_GPLL0_OUT_MAIN,
+ P_SLEEP_CLK,
+ };
+@@ -82,8 +84,8 @@ static const struct clk_parent_data disp_cc_parent_data_1[] = {
+ };
+
+ static const struct parent_map disp_cc_parent_map_2[] = {
+- { P_BI_TCXO, 0 },
+- { P_GPLL0_OUT_MAIN, 4 },
++ { P_BI_TCXO_AO, 0 },
++ { P_GPLL0_OUT_DIV, 4 },
+ };
+
+ static const struct clk_parent_data disp_cc_parent_data_2[] = {
+@@ -151,9 +153,9 @@ static struct clk_regmap_div disp_cc_mdss_byte0_div_clk_src = {
+ };
+
+ static const struct freq_tbl ftbl_disp_cc_mdss_ahb_clk_src[] = {
+- F(19200000, P_BI_TCXO, 1, 0, 0),
+- F(37500000, P_GPLL0_OUT_MAIN, 8, 0, 0),
+- F(75000000, P_GPLL0_OUT_MAIN, 4, 0, 0),
++ F(19200000, P_BI_TCXO_AO, 1, 0, 0),
++ F(37500000, P_GPLL0_OUT_DIV, 8, 0, 0),
++ F(75000000, P_GPLL0_OUT_DIV, 4, 0, 0),
+ { }
+ };
+
+diff --git a/drivers/clk/qcom/gcc-ipq5332.c b/drivers/clk/qcom/gcc-ipq5332.c
+index bdb4a0a11d07b..a75ab88ed14c6 100644
+--- a/drivers/clk/qcom/gcc-ipq5332.c
++++ b/drivers/clk/qcom/gcc-ipq5332.c
+@@ -20,8 +20,8 @@
+ #include "reset.h"
+
+ enum {
+- DT_SLEEP_CLK,
+ DT_XO,
++ DT_SLEEP_CLK,
+ DT_PCIE_2LANE_PHY_PIPE_CLK,
+ DT_PCIE_2LANE_PHY_PIPE_CLK_X1,
+ DT_USB_PCIE_WRAPPER_PIPE_CLK,
+@@ -366,7 +366,7 @@ static struct clk_rcg2 gcc_adss_pwm_clk_src = {
+ };
+
+ static const struct freq_tbl ftbl_gcc_apss_axi_clk_src[] = {
+- F(480000000, P_GPLL4_OUT_MAIN, 2.5, 0, 0),
++ F(480000000, P_GPLL4_OUT_AUX, 2.5, 0, 0),
+ F(533333333, P_GPLL0_OUT_MAIN, 1.5, 0, 0),
+ { }
+ };
+@@ -963,7 +963,7 @@ static struct clk_rcg2 gcc_sdcc1_apps_clk_src = {
+ .name = "gcc_sdcc1_apps_clk_src",
+ .parent_data = gcc_parent_data_9,
+ .num_parents = ARRAY_SIZE(gcc_parent_data_9),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_floor_ops,
+ },
+ };
+
+diff --git a/drivers/clk/qcom/gcc-ipq6018.c b/drivers/clk/qcom/gcc-ipq6018.c
+index 3f9c2f61a5d93..cde62a11f5736 100644
+--- a/drivers/clk/qcom/gcc-ipq6018.c
++++ b/drivers/clk/qcom/gcc-ipq6018.c
+@@ -1654,7 +1654,7 @@ static struct clk_rcg2 sdcc1_apps_clk_src = {
+ .name = "sdcc1_apps_clk_src",
+ .parent_data = gcc_xo_gpll0_gpll2_gpll0_out_main_div2,
+ .num_parents = 4,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_floor_ops,
+ },
+ };
+
+@@ -4517,24 +4517,24 @@ static const struct qcom_reset_map gcc_ipq6018_resets[] = {
+ [GCC_PCIE0_AHB_ARES] = { 0x75040, 5 },
+ [GCC_PCIE0_AXI_MASTER_STICKY_ARES] = { 0x75040, 6 },
+ [GCC_PCIE0_AXI_SLAVE_STICKY_ARES] = { 0x75040, 7 },
+- [GCC_PPE_FULL_RESET] = { 0x68014, 0 },
+- [GCC_UNIPHY0_SOFT_RESET] = { 0x56004, 0 },
++ [GCC_PPE_FULL_RESET] = { .reg = 0x68014, .bitmask = 0xf0000 },
++ [GCC_UNIPHY0_SOFT_RESET] = { .reg = 0x56004, .bitmask = 0x3ff2 },
+ [GCC_UNIPHY0_XPCS_RESET] = { 0x56004, 2 },
+- [GCC_UNIPHY1_SOFT_RESET] = { 0x56104, 0 },
++ [GCC_UNIPHY1_SOFT_RESET] = { .reg = 0x56104, .bitmask = 0x32 },
+ [GCC_UNIPHY1_XPCS_RESET] = { 0x56104, 2 },
+- [GCC_EDMA_HW_RESET] = { 0x68014, 0 },
+- [GCC_NSSPORT1_RESET] = { 0x68014, 0 },
+- [GCC_NSSPORT2_RESET] = { 0x68014, 0 },
+- [GCC_NSSPORT3_RESET] = { 0x68014, 0 },
+- [GCC_NSSPORT4_RESET] = { 0x68014, 0 },
+- [GCC_NSSPORT5_RESET] = { 0x68014, 0 },
+- [GCC_UNIPHY0_PORT1_ARES] = { 0x56004, 0 },
+- [GCC_UNIPHY0_PORT2_ARES] = { 0x56004, 0 },
+- [GCC_UNIPHY0_PORT3_ARES] = { 0x56004, 0 },
+- [GCC_UNIPHY0_PORT4_ARES] = { 0x56004, 0 },
+- [GCC_UNIPHY0_PORT5_ARES] = { 0x56004, 0 },
+- [GCC_UNIPHY0_PORT_4_5_RESET] = { 0x56004, 0 },
+- [GCC_UNIPHY0_PORT_4_RESET] = { 0x56004, 0 },
++ [GCC_EDMA_HW_RESET] = { .reg = 0x68014, .bitmask = 0x300000 },
++ [GCC_NSSPORT1_RESET] = { .reg = 0x68014, .bitmask = 0x1000003 },
++ [GCC_NSSPORT2_RESET] = { .reg = 0x68014, .bitmask = 0x200000c },
++ [GCC_NSSPORT3_RESET] = { .reg = 0x68014, .bitmask = 0x4000030 },
++ [GCC_NSSPORT4_RESET] = { .reg = 0x68014, .bitmask = 0x8000300 },
++ [GCC_NSSPORT5_RESET] = { .reg = 0x68014, .bitmask = 0x10000c00 },
++ [GCC_UNIPHY0_PORT1_ARES] = { .reg = 0x56004, .bitmask = 0x30 },
++ [GCC_UNIPHY0_PORT2_ARES] = { .reg = 0x56004, .bitmask = 0xc0 },
++ [GCC_UNIPHY0_PORT3_ARES] = { .reg = 0x56004, .bitmask = 0x300 },
++ [GCC_UNIPHY0_PORT4_ARES] = { .reg = 0x56004, .bitmask = 0xc00 },
++ [GCC_UNIPHY0_PORT5_ARES] = { .reg = 0x56004, .bitmask = 0x3000 },
++ [GCC_UNIPHY0_PORT_4_5_RESET] = { .reg = 0x56004, .bitmask = 0x3c02 },
++ [GCC_UNIPHY0_PORT_4_RESET] = { .reg = 0x56004, .bitmask = 0xc02 },
+ [GCC_LPASS_BCR] = {0x1F000, 0},
+ [GCC_UBI32_TBU_BCR] = {0x65000, 0},
+ [GCC_LPASS_TBU_BCR] = {0x6C000, 0},
+diff --git a/drivers/clk/qcom/gcc-qcm2290.c b/drivers/clk/qcom/gcc-qcm2290.c
+index 096deff2ba257..48995e50c6bd7 100644
+--- a/drivers/clk/qcom/gcc-qcm2290.c
++++ b/drivers/clk/qcom/gcc-qcm2290.c
+@@ -650,7 +650,7 @@ static struct clk_rcg2 gcc_usb30_prim_mock_utmi_clk_src = {
+ .name = "gcc_usb30_prim_mock_utmi_clk_src",
+ .parent_data = gcc_parents_0,
+ .num_parents = ARRAY_SIZE(gcc_parents_0),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -686,7 +686,7 @@ static struct clk_rcg2 gcc_camss_axi_clk_src = {
+ .name = "gcc_camss_axi_clk_src",
+ .parent_data = gcc_parents_4,
+ .num_parents = ARRAY_SIZE(gcc_parents_4),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -706,7 +706,7 @@ static struct clk_rcg2 gcc_camss_cci_clk_src = {
+ .name = "gcc_camss_cci_clk_src",
+ .parent_data = gcc_parents_9,
+ .num_parents = ARRAY_SIZE(gcc_parents_9),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -728,7 +728,7 @@ static struct clk_rcg2 gcc_camss_csi0phytimer_clk_src = {
+ .name = "gcc_camss_csi0phytimer_clk_src",
+ .parent_data = gcc_parents_5,
+ .num_parents = ARRAY_SIZE(gcc_parents_5),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -742,7 +742,7 @@ static struct clk_rcg2 gcc_camss_csi1phytimer_clk_src = {
+ .name = "gcc_camss_csi1phytimer_clk_src",
+ .parent_data = gcc_parents_5,
+ .num_parents = ARRAY_SIZE(gcc_parents_5),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -764,7 +764,7 @@ static struct clk_rcg2 gcc_camss_mclk0_clk_src = {
+ .parent_data = gcc_parents_3,
+ .num_parents = ARRAY_SIZE(gcc_parents_3),
+ .flags = CLK_OPS_PARENT_ENABLE,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -779,7 +779,7 @@ static struct clk_rcg2 gcc_camss_mclk1_clk_src = {
+ .parent_data = gcc_parents_3,
+ .num_parents = ARRAY_SIZE(gcc_parents_3),
+ .flags = CLK_OPS_PARENT_ENABLE,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -794,7 +794,7 @@ static struct clk_rcg2 gcc_camss_mclk2_clk_src = {
+ .parent_data = gcc_parents_3,
+ .num_parents = ARRAY_SIZE(gcc_parents_3),
+ .flags = CLK_OPS_PARENT_ENABLE,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -809,7 +809,7 @@ static struct clk_rcg2 gcc_camss_mclk3_clk_src = {
+ .parent_data = gcc_parents_3,
+ .num_parents = ARRAY_SIZE(gcc_parents_3),
+ .flags = CLK_OPS_PARENT_ENABLE,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -830,7 +830,7 @@ static struct clk_rcg2 gcc_camss_ope_ahb_clk_src = {
+ .name = "gcc_camss_ope_ahb_clk_src",
+ .parent_data = gcc_parents_6,
+ .num_parents = ARRAY_SIZE(gcc_parents_6),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -854,7 +854,7 @@ static struct clk_rcg2 gcc_camss_ope_clk_src = {
+ .parent_data = gcc_parents_6,
+ .num_parents = ARRAY_SIZE(gcc_parents_6),
+ .flags = CLK_SET_RATE_PARENT,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -888,7 +888,7 @@ static struct clk_rcg2 gcc_camss_tfe_0_clk_src = {
+ .name = "gcc_camss_tfe_0_clk_src",
+ .parent_data = gcc_parents_7,
+ .num_parents = ARRAY_SIZE(gcc_parents_7),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -912,7 +912,7 @@ static struct clk_rcg2 gcc_camss_tfe_0_csid_clk_src = {
+ .name = "gcc_camss_tfe_0_csid_clk_src",
+ .parent_data = gcc_parents_8,
+ .num_parents = ARRAY_SIZE(gcc_parents_8),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -926,7 +926,7 @@ static struct clk_rcg2 gcc_camss_tfe_1_clk_src = {
+ .name = "gcc_camss_tfe_1_clk_src",
+ .parent_data = gcc_parents_7,
+ .num_parents = ARRAY_SIZE(gcc_parents_7),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -940,7 +940,7 @@ static struct clk_rcg2 gcc_camss_tfe_1_csid_clk_src = {
+ .name = "gcc_camss_tfe_1_csid_clk_src",
+ .parent_data = gcc_parents_8,
+ .num_parents = ARRAY_SIZE(gcc_parents_8),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -963,7 +963,7 @@ static struct clk_rcg2 gcc_camss_tfe_cphy_rx_clk_src = {
+ .parent_data = gcc_parents_10,
+ .num_parents = ARRAY_SIZE(gcc_parents_10),
+ .flags = CLK_OPS_PARENT_ENABLE,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -984,7 +984,7 @@ static struct clk_rcg2 gcc_camss_top_ahb_clk_src = {
+ .name = "gcc_camss_top_ahb_clk_src",
+ .parent_data = gcc_parents_4,
+ .num_parents = ARRAY_SIZE(gcc_parents_4),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -1006,7 +1006,7 @@ static struct clk_rcg2 gcc_gp1_clk_src = {
+ .name = "gcc_gp1_clk_src",
+ .parent_data = gcc_parents_2,
+ .num_parents = ARRAY_SIZE(gcc_parents_2),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -1020,7 +1020,7 @@ static struct clk_rcg2 gcc_gp2_clk_src = {
+ .name = "gcc_gp2_clk_src",
+ .parent_data = gcc_parents_2,
+ .num_parents = ARRAY_SIZE(gcc_parents_2),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -1034,7 +1034,7 @@ static struct clk_rcg2 gcc_gp3_clk_src = {
+ .name = "gcc_gp3_clk_src",
+ .parent_data = gcc_parents_2,
+ .num_parents = ARRAY_SIZE(gcc_parents_2),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -1054,7 +1054,7 @@ static struct clk_rcg2 gcc_pdm2_clk_src = {
+ .name = "gcc_pdm2_clk_src",
+ .parent_data = gcc_parents_0,
+ .num_parents = ARRAY_SIZE(gcc_parents_0),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -1082,7 +1082,7 @@ static struct clk_init_data gcc_qupv3_wrap0_s0_clk_src_init = {
+ .name = "gcc_qupv3_wrap0_s0_clk_src",
+ .parent_data = gcc_parents_1,
+ .num_parents = ARRAY_SIZE(gcc_parents_1),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ };
+
+ static struct clk_rcg2 gcc_qupv3_wrap0_s0_clk_src = {
+@@ -1098,7 +1098,7 @@ static struct clk_init_data gcc_qupv3_wrap0_s1_clk_src_init = {
+ .name = "gcc_qupv3_wrap0_s1_clk_src",
+ .parent_data = gcc_parents_1,
+ .num_parents = ARRAY_SIZE(gcc_parents_1),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ };
+
+ static struct clk_rcg2 gcc_qupv3_wrap0_s1_clk_src = {
+@@ -1114,7 +1114,7 @@ static struct clk_init_data gcc_qupv3_wrap0_s2_clk_src_init = {
+ .name = "gcc_qupv3_wrap0_s2_clk_src",
+ .parent_data = gcc_parents_1,
+ .num_parents = ARRAY_SIZE(gcc_parents_1),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ };
+
+ static struct clk_rcg2 gcc_qupv3_wrap0_s2_clk_src = {
+@@ -1130,7 +1130,7 @@ static struct clk_init_data gcc_qupv3_wrap0_s3_clk_src_init = {
+ .name = "gcc_qupv3_wrap0_s3_clk_src",
+ .parent_data = gcc_parents_1,
+ .num_parents = ARRAY_SIZE(gcc_parents_1),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ };
+
+ static struct clk_rcg2 gcc_qupv3_wrap0_s3_clk_src = {
+@@ -1146,7 +1146,7 @@ static struct clk_init_data gcc_qupv3_wrap0_s4_clk_src_init = {
+ .name = "gcc_qupv3_wrap0_s4_clk_src",
+ .parent_data = gcc_parents_1,
+ .num_parents = ARRAY_SIZE(gcc_parents_1),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ };
+
+ static struct clk_rcg2 gcc_qupv3_wrap0_s4_clk_src = {
+@@ -1162,7 +1162,7 @@ static struct clk_init_data gcc_qupv3_wrap0_s5_clk_src_init = {
+ .name = "gcc_qupv3_wrap0_s5_clk_src",
+ .parent_data = gcc_parents_1,
+ .num_parents = ARRAY_SIZE(gcc_parents_1),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ };
+
+ static struct clk_rcg2 gcc_qupv3_wrap0_s5_clk_src = {
+@@ -1219,7 +1219,7 @@ static struct clk_rcg2 gcc_sdcc1_ice_core_clk_src = {
+ .name = "gcc_sdcc1_ice_core_clk_src",
+ .parent_data = gcc_parents_0,
+ .num_parents = ARRAY_SIZE(gcc_parents_0),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -1266,7 +1266,7 @@ static struct clk_rcg2 gcc_usb30_prim_master_clk_src = {
+ .name = "gcc_usb30_prim_master_clk_src",
+ .parent_data = gcc_parents_0,
+ .num_parents = ARRAY_SIZE(gcc_parents_0),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -1280,7 +1280,7 @@ static struct clk_rcg2 gcc_usb3_prim_phy_aux_clk_src = {
+ .name = "gcc_usb3_prim_phy_aux_clk_src",
+ .parent_data = gcc_parents_13,
+ .num_parents = ARRAY_SIZE(gcc_parents_13),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -1303,7 +1303,7 @@ static struct clk_rcg2 gcc_video_venus_clk_src = {
+ .parent_data = gcc_parents_14,
+ .num_parents = ARRAY_SIZE(gcc_parents_14),
+ .flags = CLK_SET_RATE_PARENT,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+diff --git a/drivers/clk/qcom/mmcc-msm8974.c b/drivers/clk/qcom/mmcc-msm8974.c
+index 4273fce9a4a4c..82f6bad144a9a 100644
+--- a/drivers/clk/qcom/mmcc-msm8974.c
++++ b/drivers/clk/qcom/mmcc-msm8974.c
+@@ -485,7 +485,7 @@ static struct clk_rcg2 mdp_clk_src = {
+ .name = "mdp_clk_src",
+ .parent_data = mmcc_xo_mmpll0_dsi_hdmi_gpll0,
+ .num_parents = ARRAY_SIZE(mmcc_xo_mmpll0_dsi_hdmi_gpll0),
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_shared_ops,
+ },
+ };
+
+@@ -2204,23 +2204,6 @@ static struct clk_branch ocmemcx_ocmemnoc_clk = {
+ },
+ };
+
+-static struct clk_branch oxili_ocmemgx_clk = {
+- .halt_reg = 0x402c,
+- .clkr = {
+- .enable_reg = 0x402c,
+- .enable_mask = BIT(0),
+- .hw.init = &(struct clk_init_data){
+- .name = "oxili_ocmemgx_clk",
+- .parent_data = (const struct clk_parent_data[]){
+- { .fw_name = "gfx3d_clk_src", .name = "gfx3d_clk_src" },
+- },
+- .num_parents = 1,
+- .flags = CLK_SET_RATE_PARENT,
+- .ops = &clk_branch2_ops,
+- },
+- },
+-};
+-
+ static struct clk_branch ocmemnoc_clk = {
+ .halt_reg = 0x50b4,
+ .clkr = {
+@@ -2401,7 +2384,7 @@ static struct gdsc mdss_gdsc = {
+ .pd = {
+ .name = "mdss",
+ },
+- .pwrsts = PWRSTS_RET_ON,
++ .pwrsts = PWRSTS_OFF_ON,
+ };
+
+ static struct gdsc camss_jpeg_gdsc = {
+@@ -2512,7 +2495,6 @@ static struct clk_regmap *mmcc_msm8226_clocks[] = {
+ [MMSS_MMSSNOC_AXI_CLK] = &mmss_mmssnoc_axi_clk.clkr,
+ [MMSS_S0_AXI_CLK] = &mmss_s0_axi_clk.clkr,
+ [OCMEMCX_AHB_CLK] = &ocmemcx_ahb_clk.clkr,
+- [OXILI_OCMEMGX_CLK] = &oxili_ocmemgx_clk.clkr,
+ [OXILI_GFX3D_CLK] = &oxili_gfx3d_clk.clkr,
+ [OXILICX_AHB_CLK] = &oxilicx_ahb_clk.clkr,
+ [OXILICX_AXI_CLK] = &oxilicx_axi_clk.clkr,
+@@ -2670,7 +2652,6 @@ static struct clk_regmap *mmcc_msm8974_clocks[] = {
+ [MMSS_S0_AXI_CLK] = &mmss_s0_axi_clk.clkr,
+ [OCMEMCX_AHB_CLK] = &ocmemcx_ahb_clk.clkr,
+ [OCMEMCX_OCMEMNOC_CLK] = &ocmemcx_ocmemnoc_clk.clkr,
+- [OXILI_OCMEMGX_CLK] = &oxili_ocmemgx_clk.clkr,
+ [OCMEMNOC_CLK] = &ocmemnoc_clk.clkr,
+ [OXILI_GFX3D_CLK] = &oxili_gfx3d_clk.clkr,
+ [OXILICX_AHB_CLK] = &oxilicx_ahb_clk.clkr,
+diff --git a/drivers/clk/renesas/rzg2l-cpg.c b/drivers/clk/renesas/rzg2l-cpg.c
+index 93b02cdc98c25..ca8b921c77625 100644
+--- a/drivers/clk/renesas/rzg2l-cpg.c
++++ b/drivers/clk/renesas/rzg2l-cpg.c
+@@ -603,10 +603,8 @@ static int rzg2l_cpg_sipll5_set_rate(struct clk_hw *hw,
+ }
+
+ /* Output clock setting 1 */
+- writel(CPG_SIPLL5_CLK1_POSTDIV1_WEN | CPG_SIPLL5_CLK1_POSTDIV2_WEN |
+- CPG_SIPLL5_CLK1_REFDIV_WEN | (params.pl5_postdiv1 << 0) |
+- (params.pl5_postdiv2 << 4) | (params.pl5_refdiv << 8),
+- priv->base + CPG_SIPLL5_CLK1);
++ writel((params.pl5_postdiv1 << 0) | (params.pl5_postdiv2 << 4) |
++ (params.pl5_refdiv << 8), priv->base + CPG_SIPLL5_CLK1);
+
+ /* Output clock setting, SSCG modulation value setting 3 */
+ writel((params.pl5_fracin << 8), priv->base + CPG_SIPLL5_CLK3);
+diff --git a/drivers/clk/renesas/rzg2l-cpg.h b/drivers/clk/renesas/rzg2l-cpg.h
+index eee780276a9e2..6cee9e56acc72 100644
+--- a/drivers/clk/renesas/rzg2l-cpg.h
++++ b/drivers/clk/renesas/rzg2l-cpg.h
+@@ -32,9 +32,6 @@
+ #define CPG_SIPLL5_STBY_RESETB_WEN BIT(16)
+ #define CPG_SIPLL5_STBY_SSCG_EN_WEN BIT(18)
+ #define CPG_SIPLL5_STBY_DOWNSPREAD_WEN BIT(20)
+-#define CPG_SIPLL5_CLK1_POSTDIV1_WEN BIT(16)
+-#define CPG_SIPLL5_CLK1_POSTDIV2_WEN BIT(20)
+-#define CPG_SIPLL5_CLK1_REFDIV_WEN BIT(24)
+ #define CPG_SIPLL5_CLK4_RESV_LSB (0xFF)
+ #define CPG_SIPLL5_MON_PLL5_LOCK BIT(4)
+
+diff --git a/drivers/clk/tegra/clk-tegra124-emc.c b/drivers/clk/tegra/clk-tegra124-emc.c
+index 219c80653dbdb..2a6db04342815 100644
+--- a/drivers/clk/tegra/clk-tegra124-emc.c
++++ b/drivers/clk/tegra/clk-tegra124-emc.c
+@@ -464,6 +464,7 @@ static int load_timings_from_dt(struct tegra_clk_emc *tegra,
+ err = load_one_timing_from_dt(tegra, timing, child);
+ if (err) {
+ of_node_put(child);
++ kfree(tegra->timings);
+ return err;
+ }
+
+@@ -515,6 +516,7 @@ struct clk *tegra124_clk_register_emc(void __iomem *base, struct device_node *np
+ err = load_timings_from_dt(tegra, node, node_ram_code);
+ if (err) {
+ of_node_put(node);
++ kfree(tegra);
+ return ERR_PTR(err);
+ }
+ }
+diff --git a/drivers/clk/ti/clkctrl.c b/drivers/clk/ti/clkctrl.c
+index b6fce916967ce..8c40f10280b74 100644
+--- a/drivers/clk/ti/clkctrl.c
++++ b/drivers/clk/ti/clkctrl.c
+@@ -258,6 +258,9 @@ static const char * __init clkctrl_get_clock_name(struct device_node *np,
+ if (clkctrl_name && !legacy_naming) {
+ clock_name = kasprintf(GFP_KERNEL, "%s-clkctrl:%04x:%d",
+ clkctrl_name, offset, index);
++ if (!clock_name)
++ return NULL;
++
+ strreplace(clock_name, '_', '-');
+
+ return clock_name;
+@@ -586,6 +589,10 @@ static void __init _ti_omap4_clkctrl_setup(struct device_node *node)
+ if (clkctrl_name) {
+ provider->clkdm_name = kasprintf(GFP_KERNEL,
+ "%s_clkdm", clkctrl_name);
++ if (!provider->clkdm_name) {
++ kfree(provider);
++ return;
++ }
+ goto clkdm_found;
+ }
+
+diff --git a/drivers/clk/xilinx/clk-xlnx-clock-wizard.c b/drivers/clk/xilinx/clk-xlnx-clock-wizard.c
+index e83f104fad029..d56822ce6126c 100644
+--- a/drivers/clk/xilinx/clk-xlnx-clock-wizard.c
++++ b/drivers/clk/xilinx/clk-xlnx-clock-wizard.c
+@@ -525,7 +525,7 @@ static struct clk *clk_wzrd_register_divider(struct device *dev,
+ hw = &div->hw;
+ ret = devm_clk_hw_register(dev, hw);
+ if (ret)
+- hw = ERR_PTR(ret);
++ return ERR_PTR(ret);
+
+ return hw->clk;
+ }
+@@ -648,6 +648,11 @@ static int clk_wzrd_probe(struct platform_device *pdev)
+ }
+
+ clkout_name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s_out0", dev_name(&pdev->dev));
++ if (!clkout_name) {
++ ret = -ENOMEM;
++ goto err_disable_clk;
++ }
++
+ if (nr_outputs == 1) {
+ clk_wzrd->clkout[0] = clk_wzrd_register_divider
+ (&pdev->dev, clkout_name,
+diff --git a/drivers/clocksource/timer-cadence-ttc.c b/drivers/clocksource/timer-cadence-ttc.c
+index 4efd0cf3b602d..0d52e28fea4de 100644
+--- a/drivers/clocksource/timer-cadence-ttc.c
++++ b/drivers/clocksource/timer-cadence-ttc.c
+@@ -486,10 +486,10 @@ static int __init ttc_timer_probe(struct platform_device *pdev)
+ * and use it. Note that the event timer uses the interrupt and it's the
+ * 2nd TTC hence the irq_of_parse_and_map(,1)
+ */
+- timer_baseaddr = of_iomap(timer, 0);
+- if (!timer_baseaddr) {
++ timer_baseaddr = devm_of_iomap(&pdev->dev, timer, 0, NULL);
++ if (IS_ERR(timer_baseaddr)) {
+ pr_err("ERROR: invalid timer base address\n");
+- return -ENXIO;
++ return PTR_ERR(timer_baseaddr);
+ }
+
+ irq = irq_of_parse_and_map(timer, 1);
+@@ -513,20 +513,27 @@ static int __init ttc_timer_probe(struct platform_device *pdev)
+ clk_ce = of_clk_get(timer, clksel);
+ if (IS_ERR(clk_ce)) {
+ pr_err("ERROR: timer input clock not found\n");
+- return PTR_ERR(clk_ce);
++ ret = PTR_ERR(clk_ce);
++ goto put_clk_cs;
+ }
+
+ ret = ttc_setup_clocksource(clk_cs, timer_baseaddr, timer_width);
+ if (ret)
+- return ret;
++ goto put_clk_ce;
+
+ ret = ttc_setup_clockevent(clk_ce, timer_baseaddr + 4, irq);
+ if (ret)
+- return ret;
++ goto put_clk_ce;
+
+ pr_info("%pOFn #0 at %p, irq=%d\n", timer, timer_baseaddr, irq);
+
+ return 0;
++
++put_clk_ce:
++ clk_put(clk_ce);
++put_clk_cs:
++ clk_put(clk_cs);
++ return ret;
+ }
+
+ static const struct of_device_id ttc_timer_of_match[] = {
+diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
+index 2548ec92faa28..f29182512b982 100644
+--- a/drivers/cpufreq/intel_pstate.c
++++ b/drivers/cpufreq/intel_pstate.c
+@@ -824,6 +824,8 @@ static ssize_t store_energy_performance_preference(
+ err = cpufreq_start_governor(policy);
+ if (!ret)
+ ret = err;
++ } else {
++ ret = 0;
+ }
+ }
+
+diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
+index 9a39a7ccfae96..fef68cb2b38f7 100644
+--- a/drivers/cpufreq/mediatek-cpufreq.c
++++ b/drivers/cpufreq/mediatek-cpufreq.c
+@@ -696,9 +696,16 @@ static const struct mtk_cpufreq_platform_data mt2701_platform_data = {
+ static const struct mtk_cpufreq_platform_data mt7622_platform_data = {
+ .min_volt_shift = 100000,
+ .max_volt_shift = 200000,
+- .proc_max_volt = 1360000,
++ .proc_max_volt = 1350000,
+ .sram_min_volt = 0,
+- .sram_max_volt = 1360000,
++ .sram_max_volt = 1350000,
++ .ccifreq_supported = false,
++};
++
++static const struct mtk_cpufreq_platform_data mt7623_platform_data = {
++ .min_volt_shift = 100000,
++ .max_volt_shift = 200000,
++ .proc_max_volt = 1300000,
+ .ccifreq_supported = false,
+ };
+
+@@ -734,7 +741,7 @@ static const struct of_device_id mtk_cpufreq_machines[] __initconst = {
+ { .compatible = "mediatek,mt2701", .data = &mt2701_platform_data },
+ { .compatible = "mediatek,mt2712", .data = &mt2701_platform_data },
+ { .compatible = "mediatek,mt7622", .data = &mt7622_platform_data },
+- { .compatible = "mediatek,mt7623", .data = &mt7622_platform_data },
++ { .compatible = "mediatek,mt7623", .data = &mt7623_platform_data },
+ { .compatible = "mediatek,mt8167", .data = &mt8516_platform_data },
+ { .compatible = "mediatek,mt817x", .data = &mt2701_platform_data },
+ { .compatible = "mediatek,mt8173", .data = &mt2701_platform_data },
+diff --git a/drivers/cpufreq/tegra194-cpufreq.c b/drivers/cpufreq/tegra194-cpufreq.c
+index c8d03346068ab..36dad5ea59475 100644
+--- a/drivers/cpufreq/tegra194-cpufreq.c
++++ b/drivers/cpufreq/tegra194-cpufreq.c
+@@ -686,8 +686,10 @@ static int tegra194_cpufreq_probe(struct platform_device *pdev)
+
+ /* Check for optional OPPv2 and interconnect paths on CPU0 to enable ICC scaling */
+ cpu_dev = get_cpu_device(0);
+- if (!cpu_dev)
+- return -EPROBE_DEFER;
++ if (!cpu_dev) {
++ err = -EPROBE_DEFER;
++ goto err_free_res;
++ }
+
+ if (dev_pm_opp_of_get_opp_desc_node(cpu_dev)) {
+ err = dev_pm_opp_of_find_icc_paths(cpu_dev, NULL);
+diff --git a/drivers/crypto/intel/qat/qat_common/qat_asym_algs.c b/drivers/crypto/intel/qat/qat_common/qat_asym_algs.c
+index 935a7e012946e..4128200a90329 100644
+--- a/drivers/crypto/intel/qat/qat_common/qat_asym_algs.c
++++ b/drivers/crypto/intel/qat/qat_common/qat_asym_algs.c
+@@ -170,15 +170,14 @@ static void qat_dh_cb(struct icp_qat_fw_pke_resp *resp)
+ }
+
+ areq->dst_len = req->ctx.dh->p_size;
++ dma_unmap_single(dev, req->out.dh.r, req->ctx.dh->p_size,
++ DMA_FROM_DEVICE);
+ if (req->dst_align) {
+ scatterwalk_map_and_copy(req->dst_align, areq->dst, 0,
+ areq->dst_len, 1);
+ kfree_sensitive(req->dst_align);
+ }
+
+- dma_unmap_single(dev, req->out.dh.r, req->ctx.dh->p_size,
+- DMA_FROM_DEVICE);
+-
+ dma_unmap_single(dev, req->phy_in, sizeof(struct qat_dh_input_params),
+ DMA_TO_DEVICE);
+ dma_unmap_single(dev, req->phy_out,
+@@ -521,12 +520,14 @@ static void qat_rsa_cb(struct icp_qat_fw_pke_resp *resp)
+
+ err = (err == ICP_QAT_FW_COMN_STATUS_FLAG_OK) ? 0 : -EINVAL;
+
+- kfree_sensitive(req->src_align);
+-
+ dma_unmap_single(dev, req->in.rsa.enc.m, req->ctx.rsa->key_sz,
+ DMA_TO_DEVICE);
+
++ kfree_sensitive(req->src_align);
++
+ areq->dst_len = req->ctx.rsa->key_sz;
++ dma_unmap_single(dev, req->out.rsa.enc.c, req->ctx.rsa->key_sz,
++ DMA_FROM_DEVICE);
+ if (req->dst_align) {
+ scatterwalk_map_and_copy(req->dst_align, areq->dst, 0,
+ areq->dst_len, 1);
+@@ -534,9 +535,6 @@ static void qat_rsa_cb(struct icp_qat_fw_pke_resp *resp)
+ kfree_sensitive(req->dst_align);
+ }
+
+- dma_unmap_single(dev, req->out.rsa.enc.c, req->ctx.rsa->key_sz,
+- DMA_FROM_DEVICE);
+-
+ dma_unmap_single(dev, req->phy_in, sizeof(struct qat_rsa_input_params),
+ DMA_TO_DEVICE);
+ dma_unmap_single(dev, req->phy_out,
+diff --git a/drivers/crypto/marvell/cesa/cipher.c b/drivers/crypto/marvell/cesa/cipher.c
+index c6f2fa753b7c0..0f37dfd42d850 100644
+--- a/drivers/crypto/marvell/cesa/cipher.c
++++ b/drivers/crypto/marvell/cesa/cipher.c
+@@ -297,7 +297,7 @@ static int mv_cesa_des_setkey(struct crypto_skcipher *cipher, const u8 *key,
+ static int mv_cesa_des3_ede_setkey(struct crypto_skcipher *cipher,
+ const u8 *key, unsigned int len)
+ {
+- struct mv_cesa_des_ctx *ctx = crypto_skcipher_ctx(cipher);
++ struct mv_cesa_des3_ctx *ctx = crypto_skcipher_ctx(cipher);
+ int err;
+
+ err = verify_skcipher_des3_key(cipher, key);
+diff --git a/drivers/crypto/nx/Makefile b/drivers/crypto/nx/Makefile
+index d00181a26dd65..483cef62acee8 100644
+--- a/drivers/crypto/nx/Makefile
++++ b/drivers/crypto/nx/Makefile
+@@ -1,7 +1,6 @@
+ # SPDX-License-Identifier: GPL-2.0
+ obj-$(CONFIG_CRYPTO_DEV_NX_ENCRYPT) += nx-crypto.o
+ nx-crypto-objs := nx.o \
+- nx_debugfs.o \
+ nx-aes-cbc.o \
+ nx-aes-ecb.o \
+ nx-aes-gcm.o \
+@@ -11,6 +10,7 @@ nx-crypto-objs := nx.o \
+ nx-sha256.o \
+ nx-sha512.o
+
++nx-crypto-$(CONFIG_DEBUG_FS) += nx_debugfs.o
+ obj-$(CONFIG_CRYPTO_DEV_NX_COMPRESS_PSERIES) += nx-compress-pseries.o nx-compress.o
+ obj-$(CONFIG_CRYPTO_DEV_NX_COMPRESS_POWERNV) += nx-compress-powernv.o nx-compress.o
+ nx-compress-objs := nx-842.o
+diff --git a/drivers/crypto/nx/nx.h b/drivers/crypto/nx/nx.h
+index c6233173c612e..2697baebb6a35 100644
+--- a/drivers/crypto/nx/nx.h
++++ b/drivers/crypto/nx/nx.h
+@@ -170,8 +170,8 @@ struct nx_sg *nx_walk_and_build(struct nx_sg *, unsigned int,
+ void nx_debugfs_init(struct nx_crypto_driver *);
+ void nx_debugfs_fini(struct nx_crypto_driver *);
+ #else
+-#define NX_DEBUGFS_INIT(drv) (0)
+-#define NX_DEBUGFS_FINI(drv) (0)
++#define NX_DEBUGFS_INIT(drv) do {} while (0)
++#define NX_DEBUGFS_FINI(drv) do {} while (0)
+ #endif
+
+ #define NX_PAGE_NUM(x) ((u64)(x) & 0xfffffffffffff000ULL)
+diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
+index f822de44bee0a..bfdd424d68970 100644
+--- a/drivers/cxl/core/region.c
++++ b/drivers/cxl/core/region.c
+@@ -125,10 +125,38 @@ static struct cxl_region_ref *cxl_rr_load(struct cxl_port *port,
+ return xa_load(&port->regions, (unsigned long)cxlr);
+ }
+
++static int cxl_region_invalidate_memregion(struct cxl_region *cxlr)
++{
++ if (!cpu_cache_has_invalidate_memregion()) {
++ if (IS_ENABLED(CONFIG_CXL_REGION_INVALIDATION_TEST)) {
++ dev_warn_once(
++ &cxlr->dev,
++ "Bypassing cpu_cache_invalidate_memregion() for testing!\n");
++ return 0;
++ } else {
++ dev_err(&cxlr->dev,
++ "Failed to synchronize CPU cache state\n");
++ return -ENXIO;
++ }
++ }
++
++ cpu_cache_invalidate_memregion(IORES_DESC_CXL);
++ return 0;
++}
++
+ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
+ {
+ struct cxl_region_params *p = &cxlr->params;
+- int i;
++ int i, rc = 0;
++
++ /*
++ * Before region teardown attempt to flush, and if the flush
++ * fails cancel the region teardown for data consistency
++ * concerns
++ */
++ rc = cxl_region_invalidate_memregion(cxlr);
++ if (rc)
++ return rc;
+
+ for (i = count - 1; i >= 0; i--) {
+ struct cxl_endpoint_decoder *cxled = p->targets[i];
+@@ -136,7 +164,6 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
+ struct cxl_port *iter = cxled_to_port(cxled);
+ struct cxl_dev_state *cxlds = cxlmd->cxlds;
+ struct cxl_ep *ep;
+- int rc = 0;
+
+ if (cxlds->rcd)
+ goto endpoint_reset;
+@@ -155,14 +182,19 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
+ rc = cxld->reset(cxld);
+ if (rc)
+ return rc;
++ set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
+ }
+
+ endpoint_reset:
+ rc = cxled->cxld.reset(&cxled->cxld);
+ if (rc)
+ return rc;
++ set_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
+ }
+
++ /* all decoders associated with this region have been torn down */
++ clear_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags);
++
+ return 0;
+ }
+
+@@ -256,9 +288,19 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
+ goto out;
+ }
+
+- if (commit)
++ /*
++ * Invalidate caches before region setup to drop any speculative
++ * consumption of this address space
++ */
++ rc = cxl_region_invalidate_memregion(cxlr);
++ if (rc)
++ return rc;
++
++ if (commit) {
+ rc = cxl_region_decode_commit(cxlr);
+- else {
++ if (rc == 0)
++ p->state = CXL_CONFIG_COMMIT;
++ } else {
+ p->state = CXL_CONFIG_RESET_PENDING;
+ up_write(&cxl_region_rwsem);
+ device_release_driver(&cxlr->dev);
+@@ -268,18 +310,20 @@ static ssize_t commit_store(struct device *dev, struct device_attribute *attr,
+ * The lock was dropped, so need to revalidate that the reset is
+ * still pending.
+ */
+- if (p->state == CXL_CONFIG_RESET_PENDING)
++ if (p->state == CXL_CONFIG_RESET_PENDING) {
+ rc = cxl_region_decode_reset(cxlr, p->interleave_ways);
++ /*
++ * Revert to committed since there may still be active
++ * decoders associated with this region, or move forward
++ * to active to mark the reset successful
++ */
++ if (rc)
++ p->state = CXL_CONFIG_COMMIT;
++ else
++ p->state = CXL_CONFIG_ACTIVE;
++ }
+ }
+
+- if (rc)
+- goto out;
+-
+- if (commit)
+- p->state = CXL_CONFIG_COMMIT;
+- else if (p->state == CXL_CONFIG_RESET_PENDING)
+- p->state = CXL_CONFIG_ACTIVE;
+-
+ out:
+ up_write(&cxl_region_rwsem);
+
+@@ -1674,7 +1718,6 @@ static int cxl_region_attach(struct cxl_region *cxlr,
+ if (rc)
+ goto err_decrement;
+ p->state = CXL_CONFIG_ACTIVE;
+- set_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags);
+ }
+
+ cxled->cxld.interleave_ways = p->interleave_ways;
+@@ -2803,30 +2846,6 @@ out:
+ }
+ EXPORT_SYMBOL_NS_GPL(cxl_add_to_region, CXL);
+
+-static int cxl_region_invalidate_memregion(struct cxl_region *cxlr)
+-{
+- if (!test_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags))
+- return 0;
+-
+- if (!cpu_cache_has_invalidate_memregion()) {
+- if (IS_ENABLED(CONFIG_CXL_REGION_INVALIDATION_TEST)) {
+- dev_warn_once(
+- &cxlr->dev,
+- "Bypassing cpu_cache_invalidate_memregion() for testing!\n");
+- clear_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags);
+- return 0;
+- } else {
+- dev_err(&cxlr->dev,
+- "Failed to synchronize CPU cache state\n");
+- return -ENXIO;
+- }
+- }
+-
+- cpu_cache_invalidate_memregion(IORES_DESC_CXL);
+- clear_bit(CXL_REGION_F_INCOHERENT, &cxlr->flags);
+- return 0;
+-}
+-
+ static int is_system_ram(struct resource *res, void *arg)
+ {
+ struct cxl_region *cxlr = arg;
+@@ -2854,7 +2873,12 @@ static int cxl_region_probe(struct device *dev)
+ goto out;
+ }
+
+- rc = cxl_region_invalidate_memregion(cxlr);
++ if (test_bit(CXL_REGION_F_NEEDS_RESET, &cxlr->flags)) {
++ dev_err(&cxlr->dev,
++ "failed to activate, re-commit region and retry\n");
++ rc = -ENXIO;
++ goto out;
++ }
+
+ /*
+ * From this point on any path that changes the region's state away from
+diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
+index 044a92d9813e2..dcebe48bb5bb5 100644
+--- a/drivers/cxl/cxl.h
++++ b/drivers/cxl/cxl.h
+@@ -462,18 +462,20 @@ struct cxl_region_params {
+ int nr_targets;
+ };
+
+-/*
+- * Flag whether this region needs to have its HPA span synchronized with
+- * CPU cache state at region activation time.
+- */
+-#define CXL_REGION_F_INCOHERENT 0
+-
+ /*
+ * Indicate whether this region has been assembled by autodetection or
+ * userspace assembly. Prevent endpoint decoders outside of automatic
+ * detection from being added to the region.
+ */
+-#define CXL_REGION_F_AUTO 1
++#define CXL_REGION_F_AUTO 0
++
++/*
++ * Require that a committed region successfully complete a teardown once
++ * any of its associated decoders have been torn down. This maintains
++ * the commit state for the region since there are committed decoders,
++ * but blocks cxl_region_probe().
++ */
++#define CXL_REGION_F_NEEDS_RESET 1
+
+ /**
+ * struct cxl_region - CXL region
+diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
+index 227800053309f..e7c61358564e1 100644
+--- a/drivers/dax/bus.c
++++ b/drivers/dax/bus.c
+@@ -446,18 +446,34 @@ static void unregister_dev_dax(void *dev)
+ put_device(dev);
+ }
+
++static void dax_region_free(struct kref *kref)
++{
++ struct dax_region *dax_region;
++
++ dax_region = container_of(kref, struct dax_region, kref);
++ kfree(dax_region);
++}
++
++void dax_region_put(struct dax_region *dax_region)
++{
++ kref_put(&dax_region->kref, dax_region_free);
++}
++EXPORT_SYMBOL_GPL(dax_region_put);
++
+ /* a return value >= 0 indicates this invocation invalidated the id */
+ static int __free_dev_dax_id(struct dev_dax *dev_dax)
+ {
+- struct dax_region *dax_region = dev_dax->region;
+ struct device *dev = &dev_dax->dev;
++ struct dax_region *dax_region;
+ int rc = dev_dax->id;
+
+ device_lock_assert(dev);
+
+- if (is_static(dax_region) || dev_dax->id < 0)
++ if (!dev_dax->dyn_id || dev_dax->id < 0)
+ return -1;
++ dax_region = dev_dax->region;
+ ida_free(&dax_region->ida, dev_dax->id);
++ dax_region_put(dax_region);
+ dev_dax->id = -1;
+ return rc;
+ }
+@@ -473,6 +489,20 @@ static int free_dev_dax_id(struct dev_dax *dev_dax)
+ return rc;
+ }
+
++static int alloc_dev_dax_id(struct dev_dax *dev_dax)
++{
++ struct dax_region *dax_region = dev_dax->region;
++ int id;
++
++ id = ida_alloc(&dax_region->ida, GFP_KERNEL);
++ if (id < 0)
++ return id;
++ kref_get(&dax_region->kref);
++ dev_dax->dyn_id = true;
++ dev_dax->id = id;
++ return id;
++}
++
+ static ssize_t delete_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t len)
+ {
+@@ -560,20 +590,6 @@ static const struct attribute_group *dax_region_attribute_groups[] = {
+ NULL,
+ };
+
+-static void dax_region_free(struct kref *kref)
+-{
+- struct dax_region *dax_region;
+-
+- dax_region = container_of(kref, struct dax_region, kref);
+- kfree(dax_region);
+-}
+-
+-void dax_region_put(struct dax_region *dax_region)
+-{
+- kref_put(&dax_region->kref, dax_region_free);
+-}
+-EXPORT_SYMBOL_GPL(dax_region_put);
+-
+ static void dax_region_unregister(void *region)
+ {
+ struct dax_region *dax_region = region;
+@@ -635,10 +651,12 @@ EXPORT_SYMBOL_GPL(alloc_dax_region);
+ static void dax_mapping_release(struct device *dev)
+ {
+ struct dax_mapping *mapping = to_dax_mapping(dev);
+- struct dev_dax *dev_dax = to_dev_dax(dev->parent);
++ struct device *parent = dev->parent;
++ struct dev_dax *dev_dax = to_dev_dax(parent);
+
+ ida_free(&dev_dax->ida, mapping->id);
+ kfree(mapping);
++ put_device(parent);
+ }
+
+ static void unregister_dax_mapping(void *data)
+@@ -778,6 +796,7 @@ static int devm_register_dax_mapping(struct dev_dax *dev_dax, int range_id)
+ dev = &mapping->dev;
+ device_initialize(dev);
+ dev->parent = &dev_dax->dev;
++ get_device(dev->parent);
+ dev->type = &dax_mapping_type;
+ dev_set_name(dev, "mapping%d", mapping->id);
+ rc = device_add(dev);
+@@ -1295,12 +1314,10 @@ static const struct attribute_group *dax_attribute_groups[] = {
+ static void dev_dax_release(struct device *dev)
+ {
+ struct dev_dax *dev_dax = to_dev_dax(dev);
+- struct dax_region *dax_region = dev_dax->region;
+ struct dax_device *dax_dev = dev_dax->dax_dev;
+
+ put_dax(dax_dev);
+ free_dev_dax_id(dev_dax);
+- dax_region_put(dax_region);
+ kfree(dev_dax->pgmap);
+ kfree(dev_dax);
+ }
+@@ -1324,6 +1341,7 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
+ if (!dev_dax)
+ return ERR_PTR(-ENOMEM);
+
++ dev_dax->region = dax_region;
+ if (is_static(dax_region)) {
+ if (dev_WARN_ONCE(parent, data->id < 0,
+ "dynamic id specified to static region\n")) {
+@@ -1339,13 +1357,11 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
+ goto err_id;
+ }
+
+- rc = ida_alloc(&dax_region->ida, GFP_KERNEL);
++ rc = alloc_dev_dax_id(dev_dax);
+ if (rc < 0)
+ goto err_id;
+- dev_dax->id = rc;
+ }
+
+- dev_dax->region = dax_region;
+ dev = &dev_dax->dev;
+ device_initialize(dev);
+ dev_set_name(dev, "dax%d.%d", dax_region->id, dev_dax->id);
+@@ -1386,7 +1402,6 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
+ dev_dax->target_node = dax_region->target_node;
+ dev_dax->align = dax_region->align;
+ ida_init(&dev_dax->ida);
+- kref_get(&dax_region->kref);
+
+ inode = dax_inode(dax_dev);
+ dev->devt = inode->i_rdev;
+diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
+index 1c974b7caae6e..afcada6fd2eda 100644
+--- a/drivers/dax/dax-private.h
++++ b/drivers/dax/dax-private.h
+@@ -52,7 +52,8 @@ struct dax_mapping {
+ * @region - parent region
+ * @dax_dev - core dax functionality
+ * @target_node: effective numa node if dev_dax memory range is onlined
+- * @id: ida allocated id
++ * @dyn_id: is this a dynamic or statically created instance
++ * @id: ida allocated id when the dax_region is not static
+ * @ida: mapping id allocator
+ * @dev - device core
+ * @pgmap - pgmap for memmap setup / lifetime (driver owned)
+@@ -64,6 +65,7 @@ struct dev_dax {
+ struct dax_device *dax_dev;
+ unsigned int align;
+ int target_node;
++ bool dyn_id;
+ int id;
+ struct ida ida;
+ struct device dev;
+diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
+index 7b36db6f1cbdc..898ca95057547 100644
+--- a/drivers/dax/kmem.c
++++ b/drivers/dax/kmem.c
+@@ -99,7 +99,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
+ if (!data->res_name)
+ goto err_res_name;
+
+- rc = memory_group_register_static(numa_node, total_len);
++ rc = memory_group_register_static(numa_node, PFN_UP(total_len));
+ if (rc < 0)
+ goto err_reg_mgid;
+ data->mgid = rc;
+diff --git a/drivers/extcon/extcon-usbc-tusb320.c b/drivers/extcon/extcon-usbc-tusb320.c
+index b408ce989c223..10dff1c512c41 100644
+--- a/drivers/extcon/extcon-usbc-tusb320.c
++++ b/drivers/extcon/extcon-usbc-tusb320.c
+@@ -78,6 +78,7 @@ struct tusb320_priv {
+ struct typec_capability cap;
+ enum typec_port_type port_type;
+ enum typec_pwr_opmode pwr_opmode;
++ struct fwnode_handle *connector_fwnode;
+ };
+
+ static const char * const tusb_attached_states[] = {
+@@ -391,27 +392,25 @@ static int tusb320_typec_probe(struct i2c_client *client,
+ /* Type-C connector found. */
+ ret = typec_get_fw_cap(&priv->cap, connector);
+ if (ret)
+- return ret;
++ goto err_put;
+
+ priv->port_type = priv->cap.type;
+
+ /* This goes into register 0x8 field CURRENT_MODE_ADVERTISE */
+ ret = fwnode_property_read_string(connector, "typec-power-opmode", &cap_str);
+ if (ret)
+- return ret;
++ goto err_put;
+
+ ret = typec_find_pwr_opmode(cap_str);
+ if (ret < 0)
+- return ret;
+- if (ret == TYPEC_PWR_MODE_PD)
+- return -EINVAL;
++ goto err_put;
+
+ priv->pwr_opmode = ret;
+
+ /* Initialize the hardware with the devicetree settings. */
+ ret = tusb320_set_adv_pwr_mode(priv);
+ if (ret)
+- return ret;
++ goto err_put;
+
+ priv->cap.revision = USB_TYPEC_REV_1_1;
+ priv->cap.accessory[0] = TYPEC_ACCESSORY_AUDIO;
+@@ -422,10 +421,25 @@ static int tusb320_typec_probe(struct i2c_client *client,
+ priv->cap.fwnode = connector;
+
+ priv->port = typec_register_port(&client->dev, &priv->cap);
+- if (IS_ERR(priv->port))
+- return PTR_ERR(priv->port);
++ if (IS_ERR(priv->port)) {
++ ret = PTR_ERR(priv->port);
++ goto err_put;
++ }
++
++ priv->connector_fwnode = connector;
+
+ return 0;
++
++err_put:
++ fwnode_handle_put(connector);
++
++ return ret;
++}
++
++static void tusb320_typec_remove(struct tusb320_priv *priv)
++{
++ typec_unregister_port(priv->port);
++ fwnode_handle_put(priv->connector_fwnode);
+ }
+
+ static int tusb320_probe(struct i2c_client *client)
+@@ -438,7 +452,9 @@ static int tusb320_probe(struct i2c_client *client)
+ priv = devm_kzalloc(&client->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
++
+ priv->dev = &client->dev;
++ i2c_set_clientdata(client, priv);
+
+ priv->regmap = devm_regmap_init_i2c(client, &tusb320_regmap_config);
+ if (IS_ERR(priv->regmap))
+@@ -489,10 +505,19 @@ static int tusb320_probe(struct i2c_client *client)
+ tusb320_irq_handler,
+ IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
+ client->name, priv);
++ if (ret)
++ tusb320_typec_remove(priv);
+
+ return ret;
+ }
+
++static void tusb320_remove(struct i2c_client *client)
++{
++ struct tusb320_priv *priv = i2c_get_clientdata(client);
++
++ tusb320_typec_remove(priv);
++}
++
+ static const struct of_device_id tusb320_extcon_dt_match[] = {
+ { .compatible = "ti,tusb320", .data = &tusb320_ops, },
+ { .compatible = "ti,tusb320l", .data = &tusb320l_ops, },
+@@ -502,6 +527,7 @@ MODULE_DEVICE_TABLE(of, tusb320_extcon_dt_match);
+
+ static struct i2c_driver tusb320_extcon_driver = {
+ .probe_new = tusb320_probe,
++ .remove = tusb320_remove,
+ .driver = {
+ .name = "extcon-tusb320",
+ .of_match_table = tusb320_extcon_dt_match,
+diff --git a/drivers/extcon/extcon.c b/drivers/extcon/extcon.c
+index d43ba8e7260dd..370b5b26d10b7 100644
+--- a/drivers/extcon/extcon.c
++++ b/drivers/extcon/extcon.c
+@@ -206,6 +206,14 @@ static const struct __extcon_info {
+ * @attr_name: "name" sysfs entry
+ * @attr_state: "state" sysfs entry
+ * @attrs: the array pointing to attr_name and attr_state for attr_g
++ * @usb_propval: the array of USB connector properties
++ * @chg_propval: the array of charger connector properties
++ * @jack_propval: the array of jack connector properties
++ * @disp_propval: the array of display connector properties
++ * @usb_bits: the bit array of the USB connector property capabilities
++ * @chg_bits: the bit array of the charger connector property capabilities
++ * @jack_bits: the bit array of the jack connector property capabilities
++ * @disp_bits: the bit array of the display connector property capabilities
+ */
+ struct extcon_cable {
+ struct extcon_dev *edev;
+diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
+index 1e0203d74691f..732984295295f 100644
+--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
++++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
+@@ -378,6 +378,9 @@ efi_status_t efi_exit_boot_services(void *handle, void *priv,
+ struct efi_boot_memmap *map;
+ efi_status_t status;
+
++ if (efi_disable_pci_dma)
++ efi_pci_disable_bridge_busmaster();
++
+ status = efi_get_memory_map(&map, true);
+ if (status != EFI_SUCCESS)
+ return status;
+@@ -388,9 +391,6 @@ efi_status_t efi_exit_boot_services(void *handle, void *priv,
+ return status;
+ }
+
+- if (efi_disable_pci_dma)
+- efi_pci_disable_bridge_busmaster();
+-
+ status = efi_bs_call(exit_boot_services, handle, map->map_key);
+
+ if (status == EFI_INVALID_PARAMETER) {
+diff --git a/drivers/gpio/gpio-twl4030.c b/drivers/gpio/gpio-twl4030.c
+index c1bb2c3ca6f29..446599ac234a9 100644
+--- a/drivers/gpio/gpio-twl4030.c
++++ b/drivers/gpio/gpio-twl4030.c
+@@ -17,7 +17,9 @@
+ #include <linux/interrupt.h>
+ #include <linux/kthread.h>
+ #include <linux/irq.h>
++#include <linux/gpio/machine.h>
+ #include <linux/gpio/driver.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/platform_device.h>
+ #include <linux/of.h>
+ #include <linux/irqdomain.h>
+@@ -465,8 +467,7 @@ static int gpio_twl4030_debounce(u32 debounce, u8 mmc_cd)
+ REG_GPIO_DEBEN1, 3);
+ }
+
+-static struct twl4030_gpio_platform_data *of_gpio_twl4030(struct device *dev,
+- struct twl4030_gpio_platform_data *pdata)
++static struct twl4030_gpio_platform_data *of_gpio_twl4030(struct device *dev)
+ {
+ struct twl4030_gpio_platform_data *omap_twl_info;
+
+@@ -474,9 +475,6 @@ static struct twl4030_gpio_platform_data *of_gpio_twl4030(struct device *dev,
+ if (!omap_twl_info)
+ return NULL;
+
+- if (pdata)
+- *omap_twl_info = *pdata;
+-
+ omap_twl_info->use_leds = of_property_read_bool(dev->of_node,
+ "ti,use-leds");
+
+@@ -504,9 +502,18 @@ static int gpio_twl4030_remove(struct platform_device *pdev)
+ return 0;
+ }
+
++/* Called from the registered devm action */
++static void gpio_twl4030_power_off_action(void *data)
++{
++ struct gpio_desc *d = data;
++
++ gpiod_unexport(d);
++ gpiochip_free_own_desc(d);
++}
++
+ static int gpio_twl4030_probe(struct platform_device *pdev)
+ {
+- struct twl4030_gpio_platform_data *pdata = dev_get_platdata(&pdev->dev);
++ struct twl4030_gpio_platform_data *pdata;
+ struct device_node *node = pdev->dev.of_node;
+ struct gpio_twl4030_priv *priv;
+ int ret, irq_base;
+@@ -546,9 +553,7 @@ no_irqs:
+
+ mutex_init(&priv->mutex);
+
+- if (node)
+- pdata = of_gpio_twl4030(&pdev->dev, pdata);
+-
++ pdata = of_gpio_twl4030(&pdev->dev);
+ if (pdata == NULL) {
+ dev_err(&pdev->dev, "Platform data is missing\n");
+ return -ENXIO;
+@@ -585,17 +590,32 @@ no_irqs:
+ goto out;
+ }
+
+- platform_set_drvdata(pdev, priv);
++ /*
++ * Special quirk for the OMAP3 to hog and export a WLAN power
++ * GPIO.
++ */
++ if (IS_ENABLED(CONFIG_ARCH_OMAP3) &&
++ of_machine_is_compatible("compulab,omap3-sbc-t3730")) {
++ struct gpio_desc *d;
+
+- if (pdata->setup) {
+- int status;
++ d = gpiochip_request_own_desc(&priv->gpio_chip,
++ 2, "wlan pwr",
++ GPIO_ACTIVE_HIGH,
++ GPIOD_OUT_HIGH);
++ if (IS_ERR(d))
++ return dev_err_probe(&pdev->dev, PTR_ERR(d),
++ "unable to hog wlan pwr GPIO\n");
++
++ gpiod_export(d, 0);
++
++ ret = devm_add_action_or_reset(&pdev->dev, gpio_twl4030_power_off_action, d);
++ if (ret)
++ return dev_err_probe(&pdev->dev, ret,
++ "failed to install power off handler\n");
+
+- status = pdata->setup(&pdev->dev, priv->gpio_chip.base,
+- TWL4030_GPIO_MAX);
+- if (status)
+- dev_dbg(&pdev->dev, "setup --> %d\n", status);
+ }
+
++ platform_set_drvdata(pdev, priv);
+ out:
+ return ret;
+ }
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+index 2eb2c66843a88..5612caf77dd65 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+@@ -133,9 +133,6 @@ static int amdgpu_cs_p1_user_fence(struct amdgpu_cs_parser *p,
+ bo = amdgpu_bo_ref(gem_to_amdgpu_bo(gobj));
+ p->uf_entry.priority = 0;
+ p->uf_entry.tv.bo = &bo->tbo;
+- /* One for TTM and two for the CS job */
+- p->uf_entry.tv.num_shared = 3;
+-
+ drm_gem_object_put(gobj);
+
+ size = amdgpu_bo_size(bo);
+@@ -882,15 +879,19 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
+
+ mutex_lock(&p->bo_list->bo_list_mutex);
+
+- /* One for TTM and one for the CS job */
++ /* One for TTM and one for each CS job */
+ amdgpu_bo_list_for_each_entry(e, p->bo_list)
+- e->tv.num_shared = 2;
++ e->tv.num_shared = 1 + p->gang_size;
++ p->uf_entry.tv.num_shared = 1 + p->gang_size;
+
+ amdgpu_bo_list_get_list(p->bo_list, &p->validated);
+
+ INIT_LIST_HEAD(&duplicates);
+ amdgpu_vm_get_pd_bo(&fpriv->vm, &p->validated, &p->vm_pd);
+
++ /* Two for VM updates, one for TTM and one for each CS job */
++ p->vm_pd.tv.num_shared = 3 + p->gang_size;
++
+ if (p->uf_entry.tv.bo && !ttm_to_amdgpu_bo(p->uf_entry.tv.bo)->parent)
+ list_add(&p->uf_entry.tv.head, &p->validated);
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
+index 4fa019c8aefc4..fb9251d9c899e 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c
+@@ -251,7 +251,8 @@ int amdgpu_jpeg_ras_late_init(struct amdgpu_device *adev, struct ras_common_if *
+
+ if (amdgpu_ras_is_supported(adev, ras_block->block)) {
+ for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) {
+- if (adev->jpeg.harvest_config & (1 << i))
++ if (adev->jpeg.harvest_config & (1 << i) ||
++ !adev->jpeg.inst[i].ras_poison_irq.funcs)
+ continue;
+
+ r = amdgpu_irq_get(adev, &adev->jpeg.inst[i].ras_poison_irq, 0);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+index a70103ac0026a..46557bbbc18a2 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+@@ -1266,8 +1266,12 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
+ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
+ struct amdgpu_mem_stats *stats)
+ {
+- unsigned int domain;
+ uint64_t size = amdgpu_bo_size(bo);
++ unsigned int domain;
++
++ /* Abort if the BO doesn't currently have a backing store */
++ if (!bo->tbo.resource)
++ return;
+
+ domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
+ switch (domain) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+index a150b7a4b4aae..e4757a2807d9a 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+@@ -1947,6 +1947,8 @@ static int psp_securedisplay_initialize(struct psp_context *psp)
+ psp_securedisplay_parse_resp_status(psp, securedisplay_cmd->status);
+ dev_err(psp->adev->dev, "SECUREDISPLAY: query securedisplay TA failed. ret 0x%x\n",
+ securedisplay_cmd->securedisplay_out_message.query_ta.query_cmd_ret);
++ /* don't try again */
++ psp->securedisplay_context.context.bin_desc.size_bytes = 0;
+ }
+
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+index 3ab8a88789c8f..dcca63019ea76 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+@@ -171,8 +171,7 @@ static int amdgpu_reserve_page_direct(struct amdgpu_device *adev, uint64_t addre
+
+ memset(&err_rec, 0x0, sizeof(struct eeprom_table_record));
+ err_data.err_addr = &err_rec;
+- amdgpu_umc_fill_error_record(&err_data, address,
+- (address >> AMDGPU_GPU_PAGE_SHIFT), 0, 0);
++ amdgpu_umc_fill_error_record(&err_data, address, address, 0, 0);
+
+ if (amdgpu_bad_page_threshold != 0) {
+ amdgpu_ras_add_bad_pages(adev, err_data.err_addr,
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
+index 73516abef662f..b779ee4bbaa7b 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring_mux.c
+@@ -423,6 +423,9 @@ void amdgpu_sw_ring_ib_mark_offset(struct amdgpu_ring *ring, enum amdgpu_ring_mu
+ struct amdgpu_ring_mux *mux = &adev->gfx.muxer;
+ unsigned offset;
+
++ if (ring->hw_prio > AMDGPU_RING_PRIO_DEFAULT)
++ return;
++
+ offset = ring->wptr & ring->buf_mask;
+
+ amdgpu_ring_mux_ib_mark_offset(mux, ring, offset, type);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+index 2d94f1b63bd6c..b46a5771c3ec1 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+@@ -1191,7 +1191,8 @@ int amdgpu_vcn_ras_late_init(struct amdgpu_device *adev, struct ras_common_if *r
+
+ if (amdgpu_ras_is_supported(adev, ras_block->block)) {
+ for (i = 0; i < adev->vcn.num_vcn_inst; i++) {
+- if (adev->vcn.harvest_config & (1 << i))
++ if (adev->vcn.harvest_config & (1 << i) ||
++ !adev->vcn.inst[i].ras_poison_irq.funcs)
+ continue;
+
+ r = amdgpu_irq_get(adev, &adev->vcn.inst[i].ras_poison_irq, 0);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+index 5b3a70becbdf4..ac44b6774352b 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+@@ -920,42 +920,51 @@ error_unlock:
+ return r;
+ }
+
++static void amdgpu_vm_bo_get_memory(struct amdgpu_bo_va *bo_va,
++ struct amdgpu_mem_stats *stats)
++{
++ struct amdgpu_vm *vm = bo_va->base.vm;
++ struct amdgpu_bo *bo = bo_va->base.bo;
++
++ if (!bo)
++ return;
++
++ /*
++ * For now ignore BOs which are currently locked and potentially
++ * changing their location.
++ */
++ if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv &&
++ !dma_resv_trylock(bo->tbo.base.resv))
++ return;
++
++ amdgpu_bo_get_memory(bo, stats);
++ if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
++ dma_resv_unlock(bo->tbo.base.resv);
++}
++
+ void amdgpu_vm_get_memory(struct amdgpu_vm *vm,
+ struct amdgpu_mem_stats *stats)
+ {
+ struct amdgpu_bo_va *bo_va, *tmp;
+
+ spin_lock(&vm->status_lock);
+- list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status) {
+- if (!bo_va->base.bo)
+- continue;
+- amdgpu_bo_get_memory(bo_va->base.bo, stats);
+- }
+- list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status) {
+- if (!bo_va->base.bo)
+- continue;
+- amdgpu_bo_get_memory(bo_va->base.bo, stats);
+- }
+- list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status) {
+- if (!bo_va->base.bo)
+- continue;
+- amdgpu_bo_get_memory(bo_va->base.bo, stats);
+- }
+- list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) {
+- if (!bo_va->base.bo)
+- continue;
+- amdgpu_bo_get_memory(bo_va->base.bo, stats);
+- }
+- list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status) {
+- if (!bo_va->base.bo)
+- continue;
+- amdgpu_bo_get_memory(bo_va->base.bo, stats);
+- }
+- list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status) {
+- if (!bo_va->base.bo)
+- continue;
+- amdgpu_bo_get_memory(bo_va->base.bo, stats);
+- }
++ list_for_each_entry_safe(bo_va, tmp, &vm->idle, base.vm_status)
++ amdgpu_vm_bo_get_memory(bo_va, stats);
++
++ list_for_each_entry_safe(bo_va, tmp, &vm->evicted, base.vm_status)
++ amdgpu_vm_bo_get_memory(bo_va, stats);
++
++ list_for_each_entry_safe(bo_va, tmp, &vm->relocated, base.vm_status)
++ amdgpu_vm_bo_get_memory(bo_va, stats);
++
++ list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status)
++ amdgpu_vm_bo_get_memory(bo_va, stats);
++
++ list_for_each_entry_safe(bo_va, tmp, &vm->invalidated, base.vm_status)
++ amdgpu_vm_bo_get_memory(bo_va, stats);
++
++ list_for_each_entry_safe(bo_va, tmp, &vm->done, base.vm_status)
++ amdgpu_vm_bo_get_memory(bo_va, stats);
+ spin_unlock(&vm->status_lock);
+ }
+
+@@ -1433,14 +1442,14 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
+ uint64_t eaddr;
+
+ /* validate the parameters */
+- if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK ||
+- size == 0 || size & ~PAGE_MASK)
++ if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK || size & ~PAGE_MASK)
++ return -EINVAL;
++ if (saddr + size <= saddr || offset + size <= offset)
+ return -EINVAL;
+
+ /* make sure object fit at this offset */
+ eaddr = saddr + size - 1;
+- if (saddr >= eaddr ||
+- (bo && offset + size > amdgpu_bo_size(bo)) ||
++ if ((bo && offset + size > amdgpu_bo_size(bo)) ||
+ (eaddr >= adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT))
+ return -EINVAL;
+
+@@ -1499,14 +1508,14 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
+ int r;
+
+ /* validate the parameters */
+- if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK ||
+- size == 0 || size & ~PAGE_MASK)
++ if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK || size & ~PAGE_MASK)
++ return -EINVAL;
++ if (saddr + size <= saddr || offset + size <= offset)
+ return -EINVAL;
+
+ /* make sure object fit at this offset */
+ eaddr = saddr + size - 1;
+- if (saddr >= eaddr ||
+- (bo && offset + size > amdgpu_bo_size(bo)) ||
++ if ((bo && offset + size > amdgpu_bo_size(bo)) ||
+ (eaddr >= adev->vm_manager.max_pfn << AMDGPU_GPU_PAGE_SHIFT))
+ return -EINVAL;
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+index aa761ff3a5fae..7ba47fc1917b2 100644
+--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
++++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+@@ -346,7 +346,7 @@ static void nbio_v2_3_init_registers(struct amdgpu_device *adev)
+
+ #define NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT 0x00000000 // off by default, no gains over L1
+ #define NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT 0x00000009 // 1=1us, 9=1ms
+-#define NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT 0x0000000E // 4ms
++#define NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT 0x0000000E // 400ms
+
+ static void nbio_v2_3_enable_aspm(struct amdgpu_device *adev,
+ bool enable)
+@@ -479,9 +479,12 @@ static void nbio_v2_3_program_aspm(struct amdgpu_device *adev)
+ WREG32_SOC15(NBIO, 0, mmRCC_BIF_STRAP5, data);
+
+ def = data = RREG32_PCIE(smnPCIE_LC_CNTL);
+- data &= ~PCIE_LC_CNTL__LC_L0S_INACTIVITY_MASK;
+- data |= 0x9 << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
+- data |= 0x1 << PCIE_LC_CNTL__LC_PMI_TO_L1_DIS__SHIFT;
++ data |= NAVI10_PCIE__LC_L0S_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L0S_INACTIVITY__SHIFT;
++ if (pci_is_thunderbolt_attached(adev->pdev))
++ data |= NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
++ else
++ data |= NAVI10_PCIE__LC_L1_INACTIVITY_DEFAULT << PCIE_LC_CNTL__LC_L1_INACTIVITY__SHIFT;
++ data &= ~PCIE_LC_CNTL__LC_PMI_TO_L1_DIS_MASK;
+ if (def != data)
+ WREG32_PCIE(smnPCIE_LC_CNTL, data);
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+index 9295ac7edd565..d35c8a33d06d3 100644
+--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+@@ -2306,7 +2306,7 @@ const struct amd_ip_funcs sdma_v4_0_ip_funcs = {
+
+ static const struct amdgpu_ring_funcs sdma_v4_0_ring_funcs = {
+ .type = AMDGPU_RING_TYPE_SDMA,
+- .align_mask = 0xf,
++ .align_mask = 0xff,
+ .nop = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP),
+ .support_64bit_ptrs = true,
+ .secure_submission_supported = true,
+@@ -2338,7 +2338,7 @@ static const struct amdgpu_ring_funcs sdma_v4_0_ring_funcs = {
+
+ static const struct amdgpu_ring_funcs sdma_v4_0_page_ring_funcs = {
+ .type = AMDGPU_RING_TYPE_SDMA,
+- .align_mask = 0xf,
++ .align_mask = 0xff,
+ .nop = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP),
+ .support_64bit_ptrs = true,
+ .secure_submission_supported = true,
+diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+index 64dcaa2670dd1..ac7aa8631f6a7 100644
+--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
++++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
+@@ -1740,7 +1740,7 @@ const struct amd_ip_funcs sdma_v4_4_2_ip_funcs = {
+
+ static const struct amdgpu_ring_funcs sdma_v4_4_2_ring_funcs = {
+ .type = AMDGPU_RING_TYPE_SDMA,
+- .align_mask = 0xf,
++ .align_mask = 0xff,
+ .nop = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP),
+ .support_64bit_ptrs = true,
+ .get_rptr = sdma_v4_4_2_ring_get_rptr,
+@@ -1771,7 +1771,7 @@ static const struct amdgpu_ring_funcs sdma_v4_4_2_ring_funcs = {
+
+ static const struct amdgpu_ring_funcs sdma_v4_4_2_page_ring_funcs = {
+ .type = AMDGPU_RING_TYPE_SDMA,
+- .align_mask = 0xf,
++ .align_mask = 0xff,
+ .nop = SDMA_PKT_NOP_HEADER_OP(SDMA_OP_NOP),
+ .support_64bit_ptrs = true,
+ .get_rptr = sdma_v4_4_2_ring_get_rptr,
+diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+index fdbfd725841ff..51b53110341bb 100644
+--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
++++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+@@ -115,18 +115,19 @@ static struct kfd_mem_obj *allocate_mqd(struct kfd_dev *kfd,
+ &(mqd_mem_obj->gtt_mem),
+ &(mqd_mem_obj->gpu_addr),
+ (void *)&(mqd_mem_obj->cpu_ptr), true);
++
++ if (retval) {
++ kfree(mqd_mem_obj);
++ return NULL;
++ }
+ } else {
+ retval = kfd_gtt_sa_allocate(kfd, sizeof(struct v9_mqd),
+ &mqd_mem_obj);
+- }
+-
+- if (retval) {
+- kfree(mqd_mem_obj);
+- return NULL;
++ if (retval)
++ return NULL;
+ }
+
+ return mqd_mem_obj;
+-
+ }
+
+ static void init_mqd(struct mqd_manager *mm, void **mqd,
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 7acd73e5004fb..51269b0ab9b58 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -7196,13 +7196,7 @@ static int amdgpu_dm_connector_get_modes(struct drm_connector *connector)
+ drm_add_modes_noedid(connector, 1920, 1080);
+ } else {
+ amdgpu_dm_connector_ddc_get_modes(connector, edid);
+- /* most eDP supports only timings from its edid,
+- * usually only detailed timings are available
+- * from eDP edid. timings which are not from edid
+- * may damage eDP
+- */
+- if (connector->connector_type != DRM_MODE_CONNECTOR_eDP)
+- amdgpu_dm_connector_add_common_modes(encoder, connector);
++ amdgpu_dm_connector_add_common_modes(encoder, connector);
+ amdgpu_dm_connector_add_freesync_modes(connector, edid);
+ }
+ amdgpu_dm_fbc_init(connector);
+@@ -9265,6 +9259,8 @@ static int dm_update_crtc_state(struct amdgpu_display_manager *dm,
+
+ /* Now check if we should set freesync video mode */
+ if (amdgpu_freesync_vid_mode && dm_new_crtc_state->stream &&
++ dc_is_stream_unchanged(new_stream, dm_old_crtc_state->stream) &&
++ dc_is_stream_scaling_unchanged(new_stream, dm_old_crtc_state->stream) &&
+ is_timing_unchanged_for_freesync(new_crtc_state,
+ old_crtc_state)) {
+ new_crtc_state->mode_changed = false;
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+index 810ab682f424f..46d0a8f57e552 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+@@ -45,8 +45,7 @@
+ #endif
+
+ #include "dc/dcn20/dcn20_resource.h"
+-bool is_timing_changed(struct dc_stream_state *cur_stream,
+- struct dc_stream_state *new_stream);
++
+ #define PEAK_FACTOR_X1000 1006
+
+ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
+@@ -1422,7 +1421,7 @@ int pre_validate_dsc(struct drm_atomic_state *state,
+ struct dc_stream_state *stream = dm_state->context->streams[i];
+
+ if (local_dc_state->streams[i] &&
+- is_timing_changed(stream, local_dc_state->streams[i])) {
++ dc_is_timing_changed(stream, local_dc_state->streams[i])) {
+ DRM_INFO_ONCE("crtc[%d] needs mode_changed\n", i);
+ } else {
+ int ind = find_crtc_index_in_state_by_stream(state, stream);
+diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr_smu_msg.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr_smu_msg.c
+index 1fbf1c105dc12..bdbf183066981 100644
+--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr_smu_msg.c
++++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr_smu_msg.c
+@@ -312,6 +312,9 @@ void dcn30_smu_set_display_refresh_from_mall(struct clk_mgr_internal *clk_mgr, b
+ /* bits 8:7 for cache timer scale, bits 6:1 for cache timer delay, bit 0 = 1 for enable, = 0 for disable */
+ uint32_t param = (cache_timer_scale << 7) | (cache_timer_delay << 1) | (enable ? 1 : 0);
+
++ smu_print("SMU Set display refresh from mall: enable = %d, cache_timer_delay = %d, cache_timer_scale = %d\n",
++ enable, cache_timer_delay, cache_timer_scale);
++
+ dcn30_smu_send_msg_with_param(clk_mgr,
+ DALSMC_MSG_SetDisplayRefreshFromMall, param, NULL);
+ }
+diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
+index 7cde67b7f0c33..dcf8631181690 100644
+--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
++++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
+@@ -2504,9 +2504,6 @@ static enum surface_update_type det_surface_update(const struct dc *dc,
+ enum surface_update_type overall_type = UPDATE_TYPE_FAST;
+ union surface_update_flags *update_flags = &u->surface->update_flags;
+
+- if (u->flip_addr)
+- update_flags->bits.addr_update = 1;
+-
+ if (!is_surface_in_context(context, u->surface) || u->surface->force_full_update) {
+ update_flags->raw = 0xFFFFFFFF;
+ return UPDATE_TYPE_FULL;
+diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+index fe1551393b264..ba3eb36e75bc3 100644
+--- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c
+@@ -1878,7 +1878,7 @@ bool dc_add_all_planes_for_stream(
+ return add_all_planes_for_stream(dc, stream, &set, 1, context);
+ }
+
+-bool is_timing_changed(struct dc_stream_state *cur_stream,
++bool dc_is_timing_changed(struct dc_stream_state *cur_stream,
+ struct dc_stream_state *new_stream)
+ {
+ if (cur_stream == NULL)
+@@ -1903,7 +1903,7 @@ static bool are_stream_backends_same(
+ if (stream_a == NULL || stream_b == NULL)
+ return false;
+
+- if (is_timing_changed(stream_a, stream_b))
++ if (dc_is_timing_changed(stream_a, stream_b))
+ return false;
+
+ if (stream_a->signal != stream_b->signal)
+@@ -3528,7 +3528,7 @@ bool pipe_need_reprogram(
+ if (pipe_ctx_old->stream_res.stream_enc != pipe_ctx->stream_res.stream_enc)
+ return true;
+
+- if (is_timing_changed(pipe_ctx_old->stream, pipe_ctx->stream))
++ if (dc_is_timing_changed(pipe_ctx_old->stream, pipe_ctx->stream))
+ return true;
+
+ if (pipe_ctx_old->stream->dpms_off != pipe_ctx->stream->dpms_off)
+diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h
+index 30f0ba05a6e6c..4d93ca9c627b0 100644
+--- a/drivers/gpu/drm/amd/display/dc/dc.h
++++ b/drivers/gpu/drm/amd/display/dc/dc.h
+@@ -2226,4 +2226,7 @@ void dc_process_dmub_dpia_hpd_int_enable(const struct dc *dc,
+ /* Disable acc mode Interfaces */
+ void dc_disable_accelerated_mode(struct dc *dc);
+
++bool dc_is_timing_changed(struct dc_stream_state *cur_stream,
++ struct dc_stream_state *new_stream);
++
+ #endif /* DC_INTERFACE_H_ */
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.c
+index cc3fe9cac5b53..c309933112e5e 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.c
+@@ -400,29 +400,6 @@ void dcn314_dpp_root_clock_control(struct dce_hwseq *hws, unsigned int dpp_inst,
+ hws->ctx->dc->res_pool->dccg, dpp_inst, clock_on);
+ }
+
+-void dcn314_hubp_pg_control(struct dce_hwseq *hws, unsigned int hubp_inst, bool power_on)
+-{
+- struct dc_context *ctx = hws->ctx;
+- union dmub_rb_cmd cmd;
+-
+- if (hws->ctx->dc->debug.disable_hubp_power_gate)
+- return;
+-
+- PERF_TRACE();
+-
+- memset(&cmd, 0, sizeof(cmd));
+- cmd.domain_control.header.type = DMUB_CMD__VBIOS;
+- cmd.domain_control.header.sub_type = DMUB_CMD__VBIOS_DOMAIN_CONTROL;
+- cmd.domain_control.header.payload_bytes = sizeof(cmd.domain_control.data);
+- cmd.domain_control.data.inst = hubp_inst;
+- cmd.domain_control.data.power_gate = !power_on;
+-
+- dc_dmub_srv_cmd_queue(ctx->dmub_srv, &cmd);
+- dc_dmub_srv_cmd_execute(ctx->dmub_srv);
+- dc_dmub_srv_wait_idle(ctx->dmub_srv);
+-
+- PERF_TRACE();
+-}
+ static void apply_symclk_on_tx_off_wa(struct dc_link *link)
+ {
+ /* There are use cases where SYMCLK is referenced by OTG. For instance
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.h b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.h
+index 6d0b62503caa6..54b1379914ce5 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.h
++++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_hwseq.h
+@@ -41,8 +41,6 @@ unsigned int dcn314_calculate_dccg_k1_k2_values(struct pipe_ctx *pipe_ctx, unsig
+
+ void dcn314_set_pixels_per_cycle(struct pipe_ctx *pipe_ctx);
+
+-void dcn314_hubp_pg_control(struct dce_hwseq *hws, unsigned int hubp_inst, bool power_on);
+-
+ void dcn314_dpp_root_clock_control(struct dce_hwseq *hws, unsigned int dpp_inst, bool clock_on);
+
+ void dcn314_disable_link_output(struct dc_link *link, const struct link_resource *link_res, enum signal_type signal);
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_init.c b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_init.c
+index a588f46b166f4..d9d2576f3e842 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_init.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_init.c
+@@ -138,7 +138,7 @@ static const struct hwseq_private_funcs dcn314_private_funcs = {
+ .plane_atomic_power_down = dcn10_plane_atomic_power_down,
+ .enable_power_gating_plane = dcn314_enable_power_gating_plane,
+ .dpp_root_clock_control = dcn314_dpp_root_clock_control,
+- .hubp_pg_control = dcn314_hubp_pg_control,
++ .hubp_pg_control = dcn31_hubp_pg_control,
+ .program_all_writeback_pipes_in_tree = dcn30_program_all_writeback_pipes_in_tree,
+ .update_odm = dcn314_update_odm,
+ .dsc_pg_control = dcn314_dsc_pg_control,
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
+index b7c2844d0cbee..f294f2f8c75bc 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_mode_vba_21.c
+@@ -810,7 +810,7 @@ static bool CalculatePrefetchSchedule(
+ *swath_width_chroma_ub = dml_ceil(SwathWidthY / 2 - 1, myPipe->BlockWidth256BytesC) + myPipe->BlockWidth256BytesC;
+ } else {
+ *swath_width_luma_ub = dml_ceil(SwathWidthY - 1, myPipe->BlockHeight256BytesY) + myPipe->BlockHeight256BytesY;
+- if (myPipe->BlockWidth256BytesC > 0)
++ if (myPipe->BlockHeight256BytesC > 0)
+ *swath_width_chroma_ub = dml_ceil(SwathWidthY / 2 - 1, myPipe->BlockHeight256BytesC) + myPipe->BlockHeight256BytesC;
+ }
+
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_rq_dlg_calc_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_rq_dlg_calc_32.c
+index 395ae8761980f..9ba6cb67655f4 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_rq_dlg_calc_32.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_rq_dlg_calc_32.c
+@@ -116,7 +116,7 @@ void dml32_rq_dlg_get_rq_reg(display_rq_regs_st *rq_regs,
+ else
+ rq_regs->rq_regs_l.min_meta_chunk_size = dml_log2(min_meta_chunk_bytes) - 6 + 1;
+
+- if (min_meta_chunk_bytes == 0)
++ if (p1_min_meta_chunk_bytes == 0)
+ rq_regs->rq_regs_c.min_meta_chunk_size = 0;
+ else
+ rq_regs->rq_regs_c.min_meta_chunk_size = dml_log2(p1_min_meta_chunk_bytes) - 6 + 1;
+diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c
+index ba98013fecd00..6d2d10da2b77c 100644
+--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c
++++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_capability.c
+@@ -1043,9 +1043,7 @@ static enum dc_status wake_up_aux_channel(struct dc_link *link)
+ DP_SET_POWER,
+ &dpcd_power_state,
+ sizeof(dpcd_power_state));
+- if (status < 0)
+- DC_LOG_DC("%s: Failed to power up sink: %s\n", __func__,
+- dpcd_power_state == DP_SET_POWER_D0 ? "D0" : "D3");
++ DC_LOG_DC("%s: Failed to power up sink\n", __func__);
+ return DC_ERROR_UNEXPECTED;
+ }
+
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+index 85d53597eb07a..f7ed3e655e397 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+@@ -431,7 +431,13 @@ static int sienna_cichlid_append_powerplay_table(struct smu_context *smu)
+ {
+ struct atom_smc_dpm_info_v4_9 *smc_dpm_table;
+ int index, ret;
+- I2cControllerConfig_t *table_member;
++ PPTable_beige_goby_t *ppt_beige_goby;
++ PPTable_t *ppt;
++
++ if (smu->adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 13))
++ ppt_beige_goby = smu->smu_table.driver_pptable;
++ else
++ ppt = smu->smu_table.driver_pptable;
+
+ index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1,
+ smc_dpm_info);
+@@ -440,9 +446,13 @@ static int sienna_cichlid_append_powerplay_table(struct smu_context *smu)
+ (uint8_t **)&smc_dpm_table);
+ if (ret)
+ return ret;
+- GET_PPTABLE_MEMBER(I2cControllers, &table_member);
+- memcpy(table_member, smc_dpm_table->I2cControllers,
+- sizeof(*smc_dpm_table) - sizeof(smc_dpm_table->table_header));
++
++ if (smu->adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 13))
++ smu_memcpy_trailing(ppt_beige_goby, I2cControllers, BoardReserved,
++ smc_dpm_table, I2cControllers);
++ else
++ smu_memcpy_trailing(ppt, I2cControllers, BoardReserved,
++ smc_dpm_table, I2cControllers);
+
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+index 08577d1b84eca..c42c0c1446f4f 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+@@ -1300,6 +1300,7 @@ static int smu_v13_0_0_get_thermal_temperature_range(struct smu_context *smu,
+ range->mem_emergency_max = (pptable->SkuTable.TemperatureLimit[TEMP_MEM] + CTF_OFFSET_MEM)*
+ SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+ range->software_shutdown_temp = powerplay_table->software_shutdown_temp;
++ range->software_shutdown_temp_offset = pptable->SkuTable.FanAbnormalTempLimitOffset;
+
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c b/drivers/gpu/drm/bridge/analogix/anx7625.c
+index 6846199a2ee14..9e387c3e9b696 100644
+--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
++++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
+@@ -1687,6 +1687,14 @@ static int anx7625_parse_dt(struct device *dev,
+ if (of_property_read_bool(np, "analogix,audio-enable"))
+ pdata->audio_en = 1;
+
++ return 0;
++}
++
++static int anx7625_parse_dt_panel(struct device *dev,
++ struct anx7625_platform_data *pdata)
++{
++ struct device_node *np = dev->of_node;
++
+ pdata->panel_bridge = devm_drm_of_get_bridge(dev, np, 1, 0);
+ if (IS_ERR(pdata->panel_bridge)) {
+ if (PTR_ERR(pdata->panel_bridge) == -ENODEV) {
+@@ -2032,7 +2040,7 @@ static int anx7625_register_audio(struct device *dev, struct anx7625_data *ctx)
+ return 0;
+ }
+
+-static int anx7625_attach_dsi(struct anx7625_data *ctx)
++static int anx7625_setup_dsi_device(struct anx7625_data *ctx)
+ {
+ struct mipi_dsi_device *dsi;
+ struct device *dev = &ctx->client->dev;
+@@ -2042,9 +2050,6 @@ static int anx7625_attach_dsi(struct anx7625_data *ctx)
+ .channel = 0,
+ .node = NULL,
+ };
+- int ret;
+-
+- DRM_DEV_DEBUG_DRIVER(dev, "attach dsi\n");
+
+ host = of_find_mipi_dsi_host_by_node(ctx->pdata.mipi_host_node);
+ if (!host) {
+@@ -2065,14 +2070,24 @@ static int anx7625_attach_dsi(struct anx7625_data *ctx)
+ MIPI_DSI_MODE_VIDEO_HSE |
+ MIPI_DSI_HS_PKT_END_ALIGNED;
+
+- ret = devm_mipi_dsi_attach(dev, dsi);
++ ctx->dsi = dsi;
++
++ return 0;
++}
++
++static int anx7625_attach_dsi(struct anx7625_data *ctx)
++{
++ struct device *dev = &ctx->client->dev;
++ int ret;
++
++ DRM_DEV_DEBUG_DRIVER(dev, "attach dsi\n");
++
++ ret = devm_mipi_dsi_attach(dev, ctx->dsi);
+ if (ret) {
+ DRM_DEV_ERROR(dev, "fail to attach dsi to host.\n");
+ return ret;
+ }
+
+- ctx->dsi = dsi;
+-
+ DRM_DEV_DEBUG_DRIVER(dev, "attach dsi succeeded.\n");
+
+ return 0;
+@@ -2560,6 +2575,40 @@ static void anx7625_runtime_disable(void *data)
+ pm_runtime_disable(data);
+ }
+
++static int anx7625_link_bridge(struct drm_dp_aux *aux)
++{
++ struct anx7625_data *platform = container_of(aux, struct anx7625_data, aux);
++ struct device *dev = aux->dev;
++ int ret;
++
++ ret = anx7625_parse_dt_panel(dev, &platform->pdata);
++ if (ret) {
++ DRM_DEV_ERROR(dev, "fail to parse DT for panel : %d\n", ret);
++ return ret;
++ }
++
++ platform->bridge.funcs = &anx7625_bridge_funcs;
++ platform->bridge.of_node = dev->of_node;
++ if (!anx7625_of_panel_on_aux_bus(dev))
++ platform->bridge.ops |= DRM_BRIDGE_OP_EDID;
++ if (!platform->pdata.panel_bridge)
++ platform->bridge.ops |= DRM_BRIDGE_OP_HPD |
++ DRM_BRIDGE_OP_DETECT;
++ platform->bridge.type = platform->pdata.panel_bridge ?
++ DRM_MODE_CONNECTOR_eDP :
++ DRM_MODE_CONNECTOR_DisplayPort;
++
++ drm_bridge_add(&platform->bridge);
++
++ if (!platform->pdata.is_dpi) {
++ ret = anx7625_attach_dsi(platform);
++ if (ret)
++ drm_bridge_remove(&platform->bridge);
++ }
++
++ return ret;
++}
++
+ static int anx7625_i2c_probe(struct i2c_client *client)
+ {
+ struct anx7625_data *platform;
+@@ -2634,6 +2683,24 @@ static int anx7625_i2c_probe(struct i2c_client *client)
+ platform->aux.wait_hpd_asserted = anx7625_wait_hpd_asserted;
+ drm_dp_aux_init(&platform->aux);
+
++ ret = anx7625_parse_dt(dev, pdata);
++ if (ret) {
++ if (ret != -EPROBE_DEFER)
++ DRM_DEV_ERROR(dev, "fail to parse DT : %d\n", ret);
++ goto free_wq;
++ }
++
++ if (!platform->pdata.is_dpi) {
++ ret = anx7625_setup_dsi_device(platform);
++ if (ret < 0)
++ goto free_wq;
++ }
++
++ /*
++ * Registering the i2c devices will retrigger deferred probe, so it
++ * needs to be done after calls that might return EPROBE_DEFER,
++ * otherwise we can get an infinite loop.
++ */
+ if (anx7625_register_i2c_dummy_clients(platform, client) != 0) {
+ ret = -ENOMEM;
+ DRM_DEV_ERROR(dev, "fail to reserve I2C bus.\n");
+@@ -2648,13 +2715,21 @@ static int anx7625_i2c_probe(struct i2c_client *client)
+ if (ret)
+ goto free_wq;
+
+- devm_of_dp_aux_populate_ep_devices(&platform->aux);
+-
+- ret = anx7625_parse_dt(dev, pdata);
++ /*
++ * Populating the aux bus will retrigger deferred probe, so it needs to
++ * be done after calls that might return EPROBE_DEFER, otherwise we can
++ * get an infinite loop.
++ */
++ ret = devm_of_dp_aux_populate_bus(&platform->aux, anx7625_link_bridge);
+ if (ret) {
+- if (ret != -EPROBE_DEFER)
+- DRM_DEV_ERROR(dev, "fail to parse DT : %d\n", ret);
+- goto free_wq;
++ if (ret != -ENODEV) {
++ DRM_DEV_ERROR(dev, "failed to populate aux bus : %d\n", ret);
++ goto free_wq;
++ }
++
++ ret = anx7625_link_bridge(&platform->aux);
++ if (ret)
++ goto free_wq;
+ }
+
+ if (!platform->pdata.low_power_mode) {
+@@ -2667,27 +2742,6 @@ static int anx7625_i2c_probe(struct i2c_client *client)
+ if (platform->pdata.intp_irq)
+ queue_work(platform->workqueue, &platform->work);
+
+- platform->bridge.funcs = &anx7625_bridge_funcs;
+- platform->bridge.of_node = client->dev.of_node;
+- if (!anx7625_of_panel_on_aux_bus(&client->dev))
+- platform->bridge.ops |= DRM_BRIDGE_OP_EDID;
+- if (!platform->pdata.panel_bridge)
+- platform->bridge.ops |= DRM_BRIDGE_OP_HPD |
+- DRM_BRIDGE_OP_DETECT;
+- platform->bridge.type = platform->pdata.panel_bridge ?
+- DRM_MODE_CONNECTOR_eDP :
+- DRM_MODE_CONNECTOR_DisplayPort;
+-
+- drm_bridge_add(&platform->bridge);
+-
+- if (!platform->pdata.is_dpi) {
+- ret = anx7625_attach_dsi(platform);
+- if (ret) {
+- DRM_DEV_ERROR(dev, "Fail to attach to dsi : %d\n", ret);
+- goto unregister_bridge;
+- }
+- }
+-
+ if (platform->pdata.audio_en)
+ anx7625_register_audio(dev, platform);
+
+@@ -2695,12 +2749,6 @@ static int anx7625_i2c_probe(struct i2c_client *client)
+
+ return 0;
+
+-unregister_bridge:
+- drm_bridge_remove(&platform->bridge);
+-
+- if (!platform->pdata.low_power_mode)
+- pm_runtime_put_sync_suspend(&client->dev);
+-
+ free_wq:
+ if (platform->workqueue)
+ destroy_workqueue(platform->workqueue);
+diff --git a/drivers/gpu/drm/bridge/ite-it6505.c b/drivers/gpu/drm/bridge/ite-it6505.c
+index abaf6e23775eb..45f579c365e7f 100644
+--- a/drivers/gpu/drm/bridge/ite-it6505.c
++++ b/drivers/gpu/drm/bridge/ite-it6505.c
+@@ -3207,7 +3207,7 @@ static ssize_t receive_timing_debugfs_show(struct file *file, char __user *buf,
+ size_t len, loff_t *ppos)
+ {
+ struct it6505 *it6505 = file->private_data;
+- struct drm_display_mode *vid = &it6505->video_info;
++ struct drm_display_mode *vid;
+ u8 read_buf[READ_BUFFER_SIZE];
+ u8 *str = read_buf, *end = read_buf + READ_BUFFER_SIZE;
+ ssize_t ret, count;
+@@ -3216,6 +3216,7 @@ static ssize_t receive_timing_debugfs_show(struct file *file, char __user *buf,
+ return -ENODEV;
+
+ it6505_calc_video_info(it6505);
++ vid = &it6505->video_info;
+ str += scnprintf(str, end - str, "---video timing---\n");
+ str += scnprintf(str, end - str, "PCLK:%d.%03dMHz\n",
+ vid->clock / 1000, vid->clock % 1000);
+diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c b/drivers/gpu/drm/bridge/samsung-dsim.c
+index e0a402a85787c..3194cabb26b32 100644
+--- a/drivers/gpu/drm/bridge/samsung-dsim.c
++++ b/drivers/gpu/drm/bridge/samsung-dsim.c
+@@ -405,6 +405,9 @@ static const struct samsung_dsim_driver_data exynos3_dsi_driver_data = {
+ .num_bits_resol = 11,
+ .pll_p_offset = 13,
+ .reg_values = reg_values,
++ .m_min = 41,
++ .m_max = 125,
++ .min_freq = 500,
+ };
+
+ static const struct samsung_dsim_driver_data exynos4_dsi_driver_data = {
+@@ -418,6 +421,9 @@ static const struct samsung_dsim_driver_data exynos4_dsi_driver_data = {
+ .num_bits_resol = 11,
+ .pll_p_offset = 13,
+ .reg_values = reg_values,
++ .m_min = 41,
++ .m_max = 125,
++ .min_freq = 500,
+ };
+
+ static const struct samsung_dsim_driver_data exynos5_dsi_driver_data = {
+@@ -429,6 +435,9 @@ static const struct samsung_dsim_driver_data exynos5_dsi_driver_data = {
+ .num_bits_resol = 11,
+ .pll_p_offset = 13,
+ .reg_values = reg_values,
++ .m_min = 41,
++ .m_max = 125,
++ .min_freq = 500,
+ };
+
+ static const struct samsung_dsim_driver_data exynos5433_dsi_driver_data = {
+@@ -441,6 +450,9 @@ static const struct samsung_dsim_driver_data exynos5433_dsi_driver_data = {
+ .num_bits_resol = 12,
+ .pll_p_offset = 13,
+ .reg_values = exynos5433_reg_values,
++ .m_min = 41,
++ .m_max = 125,
++ .min_freq = 500,
+ };
+
+ static const struct samsung_dsim_driver_data exynos5422_dsi_driver_data = {
+@@ -453,6 +465,9 @@ static const struct samsung_dsim_driver_data exynos5422_dsi_driver_data = {
+ .num_bits_resol = 12,
+ .pll_p_offset = 13,
+ .reg_values = exynos5422_reg_values,
++ .m_min = 41,
++ .m_max = 125,
++ .min_freq = 500,
+ };
+
+ static const struct samsung_dsim_driver_data imx8mm_dsi_driver_data = {
+@@ -469,6 +484,9 @@ static const struct samsung_dsim_driver_data imx8mm_dsi_driver_data = {
+ */
+ .pll_p_offset = 14,
+ .reg_values = imx8mm_dsim_reg_values,
++ .m_min = 64,
++ .m_max = 1023,
++ .min_freq = 1050,
+ };
+
+ static const struct samsung_dsim_driver_data *
+@@ -547,12 +565,12 @@ static unsigned long samsung_dsim_pll_find_pms(struct samsung_dsim *dsi,
+ tmp = (u64)fout * (_p << _s);
+ do_div(tmp, fin);
+ _m = tmp;
+- if (_m < 41 || _m > 125)
++ if (_m < driver_data->m_min || _m > driver_data->m_max)
+ continue;
+
+ tmp = (u64)_m * fin;
+ do_div(tmp, _p);
+- if (tmp < 500 * MHZ ||
++ if (tmp < driver_data->min_freq * MHZ ||
+ tmp > driver_data->max_freq * MHZ)
+ continue;
+
+diff --git a/drivers/gpu/drm/bridge/tc358767.c b/drivers/gpu/drm/bridge/tc358767.c
+index 91f7cb56a654d..d6349af4f1b62 100644
+--- a/drivers/gpu/drm/bridge/tc358767.c
++++ b/drivers/gpu/drm/bridge/tc358767.c
+@@ -1890,7 +1890,7 @@ static int tc_mipi_dsi_host_attach(struct tc_data *tc)
+ if (dsi_lanes < 0)
+ return dsi_lanes;
+
+- dsi = mipi_dsi_device_register_full(host, &info);
++ dsi = devm_mipi_dsi_device_register_full(dev, host, &info);
+ if (IS_ERR(dsi))
+ return dev_err_probe(dev, PTR_ERR(dsi),
+ "failed to create dsi device\n");
+@@ -1901,7 +1901,7 @@ static int tc_mipi_dsi_host_attach(struct tc_data *tc)
+ dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST |
+ MIPI_DSI_MODE_LPM | MIPI_DSI_CLOCK_NON_CONTINUOUS;
+
+- ret = mipi_dsi_attach(dsi);
++ ret = devm_mipi_dsi_attach(dev, dsi);
+ if (ret < 0) {
+ dev_err(dev, "failed to attach dsi to host: %d\n", ret);
+ return ret;
+diff --git a/drivers/gpu/drm/bridge/tc358768.c b/drivers/gpu/drm/bridge/tc358768.c
+index 7c0cbe84611b9..966a25cb0b108 100644
+--- a/drivers/gpu/drm/bridge/tc358768.c
++++ b/drivers/gpu/drm/bridge/tc358768.c
+@@ -9,6 +9,8 @@
+ #include <linux/gpio/consumer.h>
+ #include <linux/i2c.h>
+ #include <linux/kernel.h>
++#include <linux/media-bus-format.h>
++#include <linux/minmax.h>
+ #include <linux/module.h>
+ #include <linux/regmap.h>
+ #include <linux/regulator/consumer.h>
+@@ -146,6 +148,7 @@ struct tc358768_priv {
+
+ u32 pd_lines; /* number of Parallel Port Input Data Lines */
+ u32 dsi_lanes; /* number of DSI Lanes */
++ u32 dsi_bpp; /* number of Bits Per Pixel over DSI */
+
+ /* Parameters for PLL programming */
+ u32 fbd; /* PLL feedback divider */
+@@ -284,12 +287,12 @@ static void tc358768_hw_disable(struct tc358768_priv *priv)
+
+ static u32 tc358768_pll_to_pclk(struct tc358768_priv *priv, u32 pll_clk)
+ {
+- return (u32)div_u64((u64)pll_clk * priv->dsi_lanes, priv->pd_lines);
++ return (u32)div_u64((u64)pll_clk * priv->dsi_lanes, priv->dsi_bpp);
+ }
+
+ static u32 tc358768_pclk_to_pll(struct tc358768_priv *priv, u32 pclk)
+ {
+- return (u32)div_u64((u64)pclk * priv->pd_lines, priv->dsi_lanes);
++ return (u32)div_u64((u64)pclk * priv->dsi_bpp, priv->dsi_lanes);
+ }
+
+ static int tc358768_calc_pll(struct tc358768_priv *priv,
+@@ -334,13 +337,17 @@ static int tc358768_calc_pll(struct tc358768_priv *priv,
+ u32 fbd;
+
+ for (fbd = 0; fbd < 512; ++fbd) {
+- u32 pll, diff;
++ u32 pll, diff, pll_in;
+
+ pll = (u32)div_u64((u64)refclk * (fbd + 1), divisor);
+
+ if (pll >= max_pll || pll < min_pll)
+ continue;
+
++ pll_in = (u32)div_u64((u64)refclk, prd + 1);
++ if (pll_in < 4000000)
++ continue;
++
+ diff = max(pll, target_pll) - min(pll, target_pll);
+
+ if (diff < best_diff) {
+@@ -422,6 +429,7 @@ static int tc358768_dsi_host_attach(struct mipi_dsi_host *host,
+ priv->output.panel = panel;
+
+ priv->dsi_lanes = dev->lanes;
++ priv->dsi_bpp = mipi_dsi_pixel_format_to_bpp(dev->format);
+
+ /* get input ep (port0/endpoint0) */
+ ret = -EINVAL;
+@@ -433,7 +441,7 @@ static int tc358768_dsi_host_attach(struct mipi_dsi_host *host,
+ }
+
+ if (ret)
+- priv->pd_lines = mipi_dsi_pixel_format_to_bpp(dev->format);
++ priv->pd_lines = priv->dsi_bpp;
+
+ drm_bridge_add(&priv->bridge);
+
+@@ -632,6 +640,7 @@ static void tc358768_bridge_pre_enable(struct drm_bridge *bridge)
+ struct mipi_dsi_device *dsi_dev = priv->output.dev;
+ unsigned long mode_flags = dsi_dev->mode_flags;
+ u32 val, val2, lptxcnt, hact, data_type;
++ s32 raw_val;
+ const struct drm_display_mode *mode;
+ u32 dsibclk_nsk, dsiclk_nsk, ui_nsk, phy_delay_nsk;
+ u32 dsiclk, dsibclk, video_start;
+@@ -736,25 +745,26 @@ static void tc358768_bridge_pre_enable(struct drm_bridge *bridge)
+
+ /* 38ns < TCLK_PREPARE < 95ns */
+ val = tc358768_ns_to_cnt(65, dsibclk_nsk) - 1;
+- /* TCLK_PREPARE > 300ns */
+- val2 = tc358768_ns_to_cnt(300 + tc358768_to_ns(3 * ui_nsk),
+- dsibclk_nsk);
+- val |= (val2 - tc358768_to_ns(phy_delay_nsk - dsibclk_nsk)) << 8;
++ /* TCLK_PREPARE + TCLK_ZERO > 300ns */
++ val2 = tc358768_ns_to_cnt(300 - tc358768_to_ns(2 * ui_nsk),
++ dsibclk_nsk) - 2;
++ val |= val2 << 8;
+ dev_dbg(priv->dev, "TCLK_HEADERCNT: 0x%x\n", val);
+ tc358768_write(priv, TC358768_TCLK_HEADERCNT, val);
+
+- /* TCLK_TRAIL > 60ns + 3*UI */
+- val = 60 + tc358768_to_ns(3 * ui_nsk);
+- val = tc358768_ns_to_cnt(val, dsibclk_nsk) - 5;
++ /* TCLK_TRAIL > 60ns AND TEOT <= 105 ns + 12*UI */
++ raw_val = tc358768_ns_to_cnt(60 + tc358768_to_ns(2 * ui_nsk), dsibclk_nsk) - 5;
++ val = clamp(raw_val, 0, 127);
+ dev_dbg(priv->dev, "TCLK_TRAILCNT: 0x%x\n", val);
+ tc358768_write(priv, TC358768_TCLK_TRAILCNT, val);
+
+ /* 40ns + 4*UI < THS_PREPARE < 85ns + 6*UI */
+ val = 50 + tc358768_to_ns(4 * ui_nsk);
+ val = tc358768_ns_to_cnt(val, dsibclk_nsk) - 1;
+- /* THS_ZERO > 145ns + 10*UI */
+- val2 = tc358768_ns_to_cnt(145 - tc358768_to_ns(ui_nsk), dsibclk_nsk);
+- val |= (val2 - tc358768_to_ns(phy_delay_nsk)) << 8;
++ /* THS_PREPARE + THS_ZERO > 145ns + 10*UI */
++ raw_val = tc358768_ns_to_cnt(145 - tc358768_to_ns(3 * ui_nsk), dsibclk_nsk) - 10;
++ val2 = clamp(raw_val, 0, 127);
++ val |= val2 << 8;
+ dev_dbg(priv->dev, "THS_HEADERCNT: 0x%x\n", val);
+ tc358768_write(priv, TC358768_THS_HEADERCNT, val);
+
+@@ -770,9 +780,10 @@ static void tc358768_bridge_pre_enable(struct drm_bridge *bridge)
+ dev_dbg(priv->dev, "TCLK_POSTCNT: 0x%x\n", val);
+ tc358768_write(priv, TC358768_TCLK_POSTCNT, val);
+
+- /* 60ns + 4*UI < THS_PREPARE < 105ns + 12*UI */
+- val = tc358768_ns_to_cnt(60 + tc358768_to_ns(15 * ui_nsk),
+- dsibclk_nsk) - 5;
++ /* max(60ns + 4*UI, 8*UI) < THS_TRAILCNT < 105ns + 12*UI */
++ raw_val = tc358768_ns_to_cnt(60 + tc358768_to_ns(18 * ui_nsk),
++ dsibclk_nsk) - 4;
++ val = clamp(raw_val, 0, 15);
+ dev_dbg(priv->dev, "THS_TRAILCNT: 0x%x\n", val);
+ tc358768_write(priv, TC358768_THS_TRAILCNT, val);
+
+@@ -786,7 +797,7 @@ static void tc358768_bridge_pre_enable(struct drm_bridge *bridge)
+
+ /* TXTAGOCNT[26:16] RXTASURECNT[10:0] */
+ val = tc358768_to_ns((lptxcnt + 1) * dsibclk_nsk * 4);
+- val = tc358768_ns_to_cnt(val, dsibclk_nsk) - 1;
++ val = tc358768_ns_to_cnt(val, dsibclk_nsk) / 4 - 1;
+ val2 = tc358768_ns_to_cnt(tc358768_to_ns((lptxcnt + 1) * dsibclk_nsk),
+ dsibclk_nsk) - 2;
+ val = val << 16 | val2;
+@@ -866,8 +877,7 @@ static void tc358768_bridge_pre_enable(struct drm_bridge *bridge)
+ val = TC358768_DSI_CONFW_MODE_SET | TC358768_DSI_CONFW_ADDR_DSI_CONTROL;
+ val |= (dsi_dev->lanes - 1) << 1;
+
+- if (!(dsi_dev->mode_flags & MIPI_DSI_MODE_LPM))
+- val |= TC358768_DSI_CONTROL_TXMD;
++ val |= TC358768_DSI_CONTROL_TXMD;
+
+ if (!(mode_flags & MIPI_DSI_CLOCK_NON_CONTINUOUS))
+ val |= TC358768_DSI_CONTROL_HSCKMD;
+@@ -913,6 +923,44 @@ static void tc358768_bridge_enable(struct drm_bridge *bridge)
+ }
+ }
+
++#define MAX_INPUT_SEL_FORMATS 1
++
++static u32 *
++tc358768_atomic_get_input_bus_fmts(struct drm_bridge *bridge,
++ struct drm_bridge_state *bridge_state,
++ struct drm_crtc_state *crtc_state,
++ struct drm_connector_state *conn_state,
++ u32 output_fmt,
++ unsigned int *num_input_fmts)
++{
++ struct tc358768_priv *priv = bridge_to_tc358768(bridge);
++ u32 *input_fmts;
++
++ *num_input_fmts = 0;
++
++ input_fmts = kcalloc(MAX_INPUT_SEL_FORMATS, sizeof(*input_fmts),
++ GFP_KERNEL);
++ if (!input_fmts)
++ return NULL;
++
++ switch (priv->pd_lines) {
++ case 16:
++ input_fmts[0] = MEDIA_BUS_FMT_RGB565_1X16;
++ break;
++ case 18:
++ input_fmts[0] = MEDIA_BUS_FMT_RGB666_1X18;
++ break;
++ default:
++ case 24:
++ input_fmts[0] = MEDIA_BUS_FMT_RGB888_1X24;
++ break;
++ };
++
++ *num_input_fmts = MAX_INPUT_SEL_FORMATS;
++
++ return input_fmts;
++}
++
+ static const struct drm_bridge_funcs tc358768_bridge_funcs = {
+ .attach = tc358768_bridge_attach,
+ .mode_valid = tc358768_bridge_mode_valid,
+@@ -920,6 +968,11 @@ static const struct drm_bridge_funcs tc358768_bridge_funcs = {
+ .enable = tc358768_bridge_enable,
+ .disable = tc358768_bridge_disable,
+ .post_disable = tc358768_bridge_post_disable,
++
++ .atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state,
++ .atomic_destroy_state = drm_atomic_helper_bridge_destroy_state,
++ .atomic_reset = drm_atomic_helper_bridge_reset,
++ .atomic_get_input_bus_fmts = tc358768_atomic_get_input_bus_fmts,
+ };
+
+ static const struct drm_bridge_timings default_tc358768_timings = {
+diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+index 75286c9afbb96..6e125ba4f0d75 100644
+--- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
++++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
+@@ -321,8 +321,8 @@ static u8 sn65dsi83_get_dsi_div(struct sn65dsi83 *ctx)
+ return dsi_div - 1;
+ }
+
+-static void sn65dsi83_atomic_enable(struct drm_bridge *bridge,
+- struct drm_bridge_state *old_bridge_state)
++static void sn65dsi83_atomic_pre_enable(struct drm_bridge *bridge,
++ struct drm_bridge_state *old_bridge_state)
+ {
+ struct sn65dsi83 *ctx = bridge_to_sn65dsi83(bridge);
+ struct drm_atomic_state *state = old_bridge_state->base.state;
+@@ -478,17 +478,29 @@ static void sn65dsi83_atomic_enable(struct drm_bridge *bridge,
+ dev_err(ctx->dev, "failed to lock PLL, ret=%i\n", ret);
+ /* On failure, disable PLL again and exit. */
+ regmap_write(ctx->regmap, REG_RC_PLL_EN, 0x00);
++ regulator_disable(ctx->vcc);
+ return;
+ }
+
+ /* Trigger reset after CSR register update. */
+ regmap_write(ctx->regmap, REG_RC_RESET, REG_RC_RESET_SOFT_RESET);
+
++ /* Wait for 10ms after soft reset as specified in datasheet */
++ usleep_range(10000, 12000);
++}
++
++static void sn65dsi83_atomic_enable(struct drm_bridge *bridge,
++ struct drm_bridge_state *old_bridge_state)
++{
++ struct sn65dsi83 *ctx = bridge_to_sn65dsi83(bridge);
++ unsigned int pval;
++
+ /* Clear all errors that got asserted during initialization. */
+ regmap_read(ctx->regmap, REG_IRQ_STAT, &pval);
+ regmap_write(ctx->regmap, REG_IRQ_STAT, pval);
+
+- usleep_range(10000, 12000);
++ /* Wait for 1ms and check for errors in status register */
++ usleep_range(1000, 1100);
+ regmap_read(ctx->regmap, REG_IRQ_STAT, &pval);
+ if (pval)
+ dev_err(ctx->dev, "Unexpected link status 0x%02x\n", pval);
+@@ -555,6 +567,7 @@ static const struct drm_bridge_funcs sn65dsi83_funcs = {
+ .attach = sn65dsi83_attach,
+ .detach = sn65dsi83_detach,
+ .atomic_enable = sn65dsi83_atomic_enable,
++ .atomic_pre_enable = sn65dsi83_atomic_pre_enable,
+ .atomic_disable = sn65dsi83_atomic_disable,
+ .mode_valid = sn65dsi83_mode_valid,
+
+@@ -697,6 +710,7 @@ static int sn65dsi83_probe(struct i2c_client *client)
+
+ ctx->bridge.funcs = &sn65dsi83_funcs;
+ ctx->bridge.of_node = dev->of_node;
++ ctx->bridge.pre_enable_prev_first = true;
+ drm_bridge_add(&ctx->bridge);
+
+ ret = sn65dsi83_host_attach(ctx);
+diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
+index 0bea3df2a16dc..b67eafa557159 100644
+--- a/drivers/gpu/drm/drm_gem_vram_helper.c
++++ b/drivers/gpu/drm/drm_gem_vram_helper.c
+@@ -45,7 +45,7 @@ static const struct drm_gem_object_funcs drm_gem_vram_object_funcs;
+ * the frame's scanout buffer or the cursor image. If there's no more space
+ * left in VRAM, inactive GEM objects can be moved to system memory.
+ *
+- * To initialize the VRAM helper library call drmm_vram_helper_alloc_mm().
++ * To initialize the VRAM helper library call drmm_vram_helper_init().
+ * The function allocates and initializes an instance of &struct drm_vram_mm
+ * in &struct drm_device.vram_mm . Use &DRM_GEM_VRAM_DRIVER to initialize
+ * &struct drm_driver and &DRM_VRAM_MM_FILE_OPERATIONS to initialize
+@@ -73,7 +73,7 @@ static const struct drm_gem_object_funcs drm_gem_vram_object_funcs;
+ * // setup device, vram base and size
+ * // ...
+ *
+- * ret = drmm_vram_helper_alloc_mm(dev, vram_base, vram_size);
++ * ret = drmm_vram_helper_init(dev, vram_base, vram_size);
+ * if (ret)
+ * return ret;
+ * return 0;
+@@ -86,7 +86,7 @@ static const struct drm_gem_object_funcs drm_gem_vram_object_funcs;
+ * to userspace.
+ *
+ * You don't have to clean up the instance of VRAM MM.
+- * drmm_vram_helper_alloc_mm() is a managed interface that installs a
++ * drmm_vram_helper_init() is a managed interface that installs a
+ * clean-up handler to run during the DRM device's release.
+ *
+ * For drawing or scanout operations, rsp. buffer objects have to be pinned
+diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
+index 97b0d4ae221ac..b158f10b4269a 100644
+--- a/drivers/gpu/drm/i915/Makefile
++++ b/drivers/gpu/drm/i915/Makefile
+@@ -25,6 +25,7 @@ subdir-ccflags-$(CONFIG_DRM_I915_WERROR) += -Werror
+
+ # Fine grained warnings disable
+ CFLAGS_i915_pci.o = $(call cc-disable-warning, override-init)
++CFLAGS_display/intel_display_device.o = $(call cc-disable-warning, override-init)
+ CFLAGS_display/intel_fbdev.o = $(call cc-disable-warning, override-init)
+
+ subdir-ccflags-y += -I$(srctree)/$(src)
+@@ -300,6 +301,7 @@ i915-y += \
+ display/intel_crt.o \
+ display/intel_ddi.o \
+ display/intel_ddi_buf_trans.o \
++ display/intel_display_device.o \
+ display/intel_display_trace.o \
+ display/intel_dkl_phy.o \
+ display/intel_dp.o \
+diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c
+index 36aac88143ac1..a5fb08f6cf136 100644
+--- a/drivers/gpu/drm/i915/display/intel_color.c
++++ b/drivers/gpu/drm/i915/display/intel_color.c
+@@ -116,10 +116,9 @@ struct intel_color_funcs {
+ #define ILK_CSC_COEFF_FP(coeff, fbits) \
+ (clamp_val(((coeff) >> (32 - (fbits) - 3)) + 4, 0, 0xfff) & 0xff8)
+
+-#define ILK_CSC_COEFF_LIMITED_RANGE 0x0dc0
+ #define ILK_CSC_COEFF_1_0 0x7800
+-
+-#define ILK_CSC_POSTOFF_LIMITED_RANGE (16 * (1 << 12) / 255)
++#define ILK_CSC_COEFF_LIMITED_RANGE ((235 - 16) << (12 - 8)) /* exponent 0 */
++#define ILK_CSC_POSTOFF_LIMITED_RANGE (16 << (12 - 8))
+
+ /* Nop pre/post offsets */
+ static const u16 ilk_csc_off_zero[3] = {};
+@@ -1606,14 +1605,14 @@ static u32 intel_gamma_lut_tests(const struct intel_crtc_state *crtc_state)
+ if (lut_is_legacy(gamma_lut))
+ return 0;
+
+- return INTEL_INFO(i915)->display.color.gamma_lut_tests;
++ return DISPLAY_INFO(i915)->color.gamma_lut_tests;
+ }
+
+ static u32 intel_degamma_lut_tests(const struct intel_crtc_state *crtc_state)
+ {
+ struct drm_i915_private *i915 = to_i915(crtc_state->uapi.crtc->dev);
+
+- return INTEL_INFO(i915)->display.color.degamma_lut_tests;
++ return DISPLAY_INFO(i915)->color.degamma_lut_tests;
+ }
+
+ static int intel_gamma_lut_size(const struct intel_crtc_state *crtc_state)
+@@ -1624,14 +1623,14 @@ static int intel_gamma_lut_size(const struct intel_crtc_state *crtc_state)
+ if (lut_is_legacy(gamma_lut))
+ return LEGACY_LUT_LENGTH;
+
+- return INTEL_INFO(i915)->display.color.gamma_lut_size;
++ return DISPLAY_INFO(i915)->color.gamma_lut_size;
+ }
+
+ static u32 intel_degamma_lut_size(const struct intel_crtc_state *crtc_state)
+ {
+ struct drm_i915_private *i915 = to_i915(crtc_state->uapi.crtc->dev);
+
+- return INTEL_INFO(i915)->display.color.degamma_lut_size;
++ return DISPLAY_INFO(i915)->color.degamma_lut_size;
+ }
+
+ static int check_lut_size(const struct drm_property_blob *lut, int expected)
+@@ -2097,7 +2096,7 @@ static int glk_assign_luts(struct intel_crtc_state *crtc_state)
+ struct drm_property_blob *gamma_lut;
+
+ gamma_lut = create_resized_lut(i915, crtc_state->hw.gamma_lut,
+- INTEL_INFO(i915)->display.color.degamma_lut_size,
++ DISPLAY_INFO(i915)->color.degamma_lut_size,
+ false);
+ if (IS_ERR(gamma_lut))
+ return PTR_ERR(gamma_lut);
+@@ -2627,7 +2626,7 @@ static struct drm_property_blob *i9xx_read_lut_8(struct intel_crtc *crtc)
+ static struct drm_property_blob *i9xx_read_lut_10(struct intel_crtc *crtc)
+ {
+ struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+- u32 lut_size = INTEL_INFO(dev_priv)->display.color.gamma_lut_size;
++ u32 lut_size = DISPLAY_INFO(dev_priv)->color.gamma_lut_size;
+ enum pipe pipe = crtc->pipe;
+ struct drm_property_blob *blob;
+ struct drm_color_lut *lut;
+@@ -2676,7 +2675,7 @@ static void i9xx_read_luts(struct intel_crtc_state *crtc_state)
+ static struct drm_property_blob *i965_read_lut_10p6(struct intel_crtc *crtc)
+ {
+ struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+- int i, lut_size = INTEL_INFO(dev_priv)->display.color.gamma_lut_size;
++ int i, lut_size = DISPLAY_INFO(dev_priv)->color.gamma_lut_size;
+ enum pipe pipe = crtc->pipe;
+ struct drm_property_blob *blob;
+ struct drm_color_lut *lut;
+@@ -2726,7 +2725,7 @@ static void i965_read_luts(struct intel_crtc_state *crtc_state)
+ static struct drm_property_blob *chv_read_cgm_degamma(struct intel_crtc *crtc)
+ {
+ struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+- int i, lut_size = INTEL_INFO(dev_priv)->display.color.degamma_lut_size;
++ int i, lut_size = DISPLAY_INFO(dev_priv)->color.degamma_lut_size;
+ enum pipe pipe = crtc->pipe;
+ struct drm_property_blob *blob;
+ struct drm_color_lut *lut;
+@@ -2752,7 +2751,7 @@ static struct drm_property_blob *chv_read_cgm_degamma(struct intel_crtc *crtc)
+ static struct drm_property_blob *chv_read_cgm_gamma(struct intel_crtc *crtc)
+ {
+ struct drm_i915_private *i915 = to_i915(crtc->base.dev);
+- int i, lut_size = INTEL_INFO(i915)->display.color.gamma_lut_size;
++ int i, lut_size = DISPLAY_INFO(i915)->color.gamma_lut_size;
+ enum pipe pipe = crtc->pipe;
+ struct drm_property_blob *blob;
+ struct drm_color_lut *lut;
+@@ -2816,7 +2815,7 @@ static struct drm_property_blob *ilk_read_lut_8(struct intel_crtc *crtc)
+ static struct drm_property_blob *ilk_read_lut_10(struct intel_crtc *crtc)
+ {
+ struct drm_i915_private *i915 = to_i915(crtc->base.dev);
+- int i, lut_size = INTEL_INFO(i915)->display.color.gamma_lut_size;
++ int i, lut_size = DISPLAY_INFO(i915)->color.gamma_lut_size;
+ enum pipe pipe = crtc->pipe;
+ struct drm_property_blob *blob;
+ struct drm_color_lut *lut;
+@@ -3000,7 +2999,7 @@ static void bdw_read_luts(struct intel_crtc_state *crtc_state)
+ static struct drm_property_blob *glk_read_degamma_lut(struct intel_crtc *crtc)
+ {
+ struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+- int i, lut_size = INTEL_INFO(dev_priv)->display.color.degamma_lut_size;
++ int i, lut_size = DISPLAY_INFO(dev_priv)->color.degamma_lut_size;
+ enum pipe pipe = crtc->pipe;
+ struct drm_property_blob *blob;
+ struct drm_color_lut *lut;
+@@ -3065,7 +3064,7 @@ static struct drm_property_blob *
+ icl_read_lut_multi_segment(struct intel_crtc *crtc)
+ {
+ struct drm_i915_private *i915 = to_i915(crtc->base.dev);
+- int i, lut_size = INTEL_INFO(i915)->display.color.gamma_lut_size;
++ int i, lut_size = DISPLAY_INFO(i915)->color.gamma_lut_size;
+ enum pipe pipe = crtc->pipe;
+ struct drm_property_blob *blob;
+ struct drm_color_lut *lut;
+@@ -3234,8 +3233,8 @@ void intel_color_crtc_init(struct intel_crtc *crtc)
+
+ drm_mode_crtc_set_gamma_size(&crtc->base, 256);
+
+- gamma_lut_size = INTEL_INFO(i915)->display.color.gamma_lut_size;
+- degamma_lut_size = INTEL_INFO(i915)->display.color.degamma_lut_size;
++ gamma_lut_size = DISPLAY_INFO(i915)->color.gamma_lut_size;
++ degamma_lut_size = DISPLAY_INFO(i915)->color.degamma_lut_size;
+ has_ctm = degamma_lut_size != 0;
+
+ /*
+@@ -3260,7 +3259,8 @@ int intel_color_init(struct drm_i915_private *i915)
+ if (DISPLAY_VER(i915) != 10)
+ return 0;
+
+- blob = create_linear_lut(i915, INTEL_INFO(i915)->display.color.degamma_lut_size);
++ blob = create_linear_lut(i915,
++ DISPLAY_INFO(i915)->color.degamma_lut_size);
+ if (IS_ERR(blob))
+ return PTR_ERR(blob);
+
+diff --git a/drivers/gpu/drm/i915/display/intel_crtc.c b/drivers/gpu/drm/i915/display/intel_crtc.c
+index ed45a69348548..349bc7f5f9a0d 100644
+--- a/drivers/gpu/drm/i915/display/intel_crtc.c
++++ b/drivers/gpu/drm/i915/display/intel_crtc.c
+@@ -302,7 +302,7 @@ int intel_crtc_init(struct drm_i915_private *dev_priv, enum pipe pipe)
+ return PTR_ERR(crtc);
+
+ crtc->pipe = pipe;
+- crtc->num_scalers = RUNTIME_INFO(dev_priv)->num_scalers[pipe];
++ crtc->num_scalers = DISPLAY_RUNTIME_INFO(dev_priv)->num_scalers[pipe];
+
+ if (DISPLAY_VER(dev_priv) >= 9)
+ primary = skl_universal_plane_create(dev_priv, pipe,
+diff --git a/drivers/gpu/drm/i915/display/intel_cursor.c b/drivers/gpu/drm/i915/display/intel_cursor.c
+index 31bef04273779..b342fad180ca5 100644
+--- a/drivers/gpu/drm/i915/display/intel_cursor.c
++++ b/drivers/gpu/drm/i915/display/intel_cursor.c
+@@ -36,7 +36,7 @@ static u32 intel_cursor_base(const struct intel_plane_state *plane_state)
+ const struct drm_i915_gem_object *obj = intel_fb_obj(fb);
+ u32 base;
+
+- if (INTEL_INFO(dev_priv)->display.cursor_needs_physical)
++ if (DISPLAY_INFO(dev_priv)->cursor_needs_physical)
+ base = sg_dma_address(obj->mm.pages->sgl);
+ else
+ base = intel_plane_ggtt_offset(plane_state);
+@@ -814,7 +814,7 @@ intel_cursor_plane_create(struct drm_i915_private *dev_priv,
+ DRM_MODE_ROTATE_0 |
+ DRM_MODE_ROTATE_180);
+
+- zpos = RUNTIME_INFO(dev_priv)->num_sprites[pipe] + 1;
++ zpos = DISPLAY_RUNTIME_INFO(dev_priv)->num_sprites[pipe] + 1;
+ drm_plane_create_zpos_immutable_property(&cursor->base, zpos);
+
+ if (DISPLAY_VER(dev_priv) >= 12)
+diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
+index 0aae9a1eb3d58..7749f95d5d02a 100644
+--- a/drivers/gpu/drm/i915/display/intel_display.c
++++ b/drivers/gpu/drm/i915/display/intel_display.c
+@@ -3568,7 +3568,7 @@ static u8 bigjoiner_pipes(struct drm_i915_private *i915)
+ else
+ pipes = 0;
+
+- return pipes & RUNTIME_INFO(i915)->pipe_mask;
++ return pipes & DISPLAY_RUNTIME_INFO(i915)->pipe_mask;
+ }
+
+ static bool transcoder_ddi_func_is_enabled(struct drm_i915_private *dev_priv,
+diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
+index 287159bdeb0d1..4065b6598cf31 100644
+--- a/drivers/gpu/drm/i915/display/intel_display.h
++++ b/drivers/gpu/drm/i915/display/intel_display.h
+@@ -105,7 +105,7 @@ enum i9xx_plane_id {
+ };
+
+ #define plane_name(p) ((p) + 'A')
+-#define sprite_name(p, s) ((p) * RUNTIME_INFO(dev_priv)->num_sprites[(p)] + (s) + 'A')
++#define sprite_name(p, s) ((p) * DISPLAY_RUNTIME_INFO(dev_priv)->num_sprites[(p)] + (s) + 'A')
+
+ #define for_each_plane_id_on_crtc(__crtc, __p) \
+ for ((__p) = PLANE_PRIMARY; (__p) < I915_MAX_PLANES; (__p)++) \
+@@ -113,7 +113,7 @@ enum i9xx_plane_id {
+
+ #define for_each_dbuf_slice(__dev_priv, __slice) \
+ for ((__slice) = DBUF_S1; (__slice) < I915_MAX_DBUF_SLICES; (__slice)++) \
+- for_each_if(INTEL_INFO(__dev_priv)->display.dbuf.slice_mask & BIT(__slice))
++ for_each_if(INTEL_INFO(__dev_priv)->display->dbuf.slice_mask & BIT(__slice))
+
+ #define for_each_dbuf_slice_in_mask(__dev_priv, __slice, __mask) \
+ for_each_dbuf_slice((__dev_priv), (__slice)) \
+@@ -221,7 +221,7 @@ enum phy_fia {
+
+ #define for_each_pipe(__dev_priv, __p) \
+ for ((__p) = 0; (__p) < I915_MAX_PIPES; (__p)++) \
+- for_each_if(RUNTIME_INFO(__dev_priv)->pipe_mask & BIT(__p))
++ for_each_if(DISPLAY_RUNTIME_INFO(__dev_priv)->pipe_mask & BIT(__p))
+
+ #define for_each_pipe_masked(__dev_priv, __p, __mask) \
+ for_each_pipe(__dev_priv, __p) \
+@@ -229,7 +229,7 @@ enum phy_fia {
+
+ #define for_each_cpu_transcoder(__dev_priv, __t) \
+ for ((__t) = 0; (__t) < I915_MAX_TRANSCODERS; (__t)++) \
+- for_each_if (RUNTIME_INFO(__dev_priv)->cpu_transcoder_mask & BIT(__t))
++ for_each_if (DISPLAY_RUNTIME_INFO(__dev_priv)->cpu_transcoder_mask & BIT(__t))
+
+ #define for_each_cpu_transcoder_masked(__dev_priv, __t, __mask) \
+ for_each_cpu_transcoder(__dev_priv, __t) \
+@@ -237,7 +237,7 @@ enum phy_fia {
+
+ #define for_each_sprite(__dev_priv, __p, __s) \
+ for ((__s) = 0; \
+- (__s) < RUNTIME_INFO(__dev_priv)->num_sprites[(__p)]; \
++ (__s) < DISPLAY_RUNTIME_INFO(__dev_priv)->num_sprites[(__p)]; \
+ (__s)++)
+
+ #define for_each_port(__port) \
+diff --git a/drivers/gpu/drm/i915/display/intel_display_device.c b/drivers/gpu/drm/i915/display/intel_display_device.c
+new file mode 100644
+index 0000000000000..8c57d48e8270f
+--- /dev/null
++++ b/drivers/gpu/drm/i915/display/intel_display_device.c
+@@ -0,0 +1,728 @@
++// SPDX-License-Identifier: MIT
++/*
++ * Copyright © 2023 Intel Corporation
++ */
++
++#include <drm/i915_pciids.h>
++#include <drm/drm_color_mgmt.h>
++
++#include "intel_display_device.h"
++#include "intel_display_power.h"
++#include "intel_display_reg_defs.h"
++#include "intel_fbc.h"
++
++static const struct intel_display_device_info no_display = {};
++
++#define PIPE_A_OFFSET 0x70000
++#define PIPE_B_OFFSET 0x71000
++#define PIPE_C_OFFSET 0x72000
++#define PIPE_D_OFFSET 0x73000
++#define CHV_PIPE_C_OFFSET 0x74000
++/*
++ * There's actually no pipe EDP. Some pipe registers have
++ * simply shifted from the pipe to the transcoder, while
++ * keeping their original offset. Thus we need PIPE_EDP_OFFSET
++ * to access such registers in transcoder EDP.
++ */
++#define PIPE_EDP_OFFSET 0x7f000
++
++/* ICL DSI 0 and 1 */
++#define PIPE_DSI0_OFFSET 0x7b000
++#define PIPE_DSI1_OFFSET 0x7b800
++
++#define TRANSCODER_A_OFFSET 0x60000
++#define TRANSCODER_B_OFFSET 0x61000
++#define TRANSCODER_C_OFFSET 0x62000
++#define CHV_TRANSCODER_C_OFFSET 0x63000
++#define TRANSCODER_D_OFFSET 0x63000
++#define TRANSCODER_EDP_OFFSET 0x6f000
++#define TRANSCODER_DSI0_OFFSET 0x6b000
++#define TRANSCODER_DSI1_OFFSET 0x6b800
++
++#define CURSOR_A_OFFSET 0x70080
++#define CURSOR_B_OFFSET 0x700c0
++#define CHV_CURSOR_C_OFFSET 0x700e0
++#define IVB_CURSOR_B_OFFSET 0x71080
++#define IVB_CURSOR_C_OFFSET 0x72080
++#define TGL_CURSOR_D_OFFSET 0x73080
++
++#define I845_PIPE_OFFSETS \
++ .pipe_offsets = { \
++ [TRANSCODER_A] = PIPE_A_OFFSET, \
++ }, \
++ .trans_offsets = { \
++ [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
++ }
++
++#define I9XX_PIPE_OFFSETS \
++ .pipe_offsets = { \
++ [TRANSCODER_A] = PIPE_A_OFFSET, \
++ [TRANSCODER_B] = PIPE_B_OFFSET, \
++ }, \
++ .trans_offsets = { \
++ [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
++ [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
++ }
++
++#define IVB_PIPE_OFFSETS \
++ .pipe_offsets = { \
++ [TRANSCODER_A] = PIPE_A_OFFSET, \
++ [TRANSCODER_B] = PIPE_B_OFFSET, \
++ [TRANSCODER_C] = PIPE_C_OFFSET, \
++ }, \
++ .trans_offsets = { \
++ [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
++ [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
++ [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
++ }
++
++#define HSW_PIPE_OFFSETS \
++ .pipe_offsets = { \
++ [TRANSCODER_A] = PIPE_A_OFFSET, \
++ [TRANSCODER_B] = PIPE_B_OFFSET, \
++ [TRANSCODER_C] = PIPE_C_OFFSET, \
++ [TRANSCODER_EDP] = PIPE_EDP_OFFSET, \
++ }, \
++ .trans_offsets = { \
++ [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
++ [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
++ [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
++ [TRANSCODER_EDP] = TRANSCODER_EDP_OFFSET, \
++ }
++
++#define CHV_PIPE_OFFSETS \
++ .pipe_offsets = { \
++ [TRANSCODER_A] = PIPE_A_OFFSET, \
++ [TRANSCODER_B] = PIPE_B_OFFSET, \
++ [TRANSCODER_C] = CHV_PIPE_C_OFFSET, \
++ }, \
++ .trans_offsets = { \
++ [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
++ [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
++ [TRANSCODER_C] = CHV_TRANSCODER_C_OFFSET, \
++ }
++
++#define I845_CURSOR_OFFSETS \
++ .cursor_offsets = { \
++ [PIPE_A] = CURSOR_A_OFFSET, \
++ }
++
++#define I9XX_CURSOR_OFFSETS \
++ .cursor_offsets = { \
++ [PIPE_A] = CURSOR_A_OFFSET, \
++ [PIPE_B] = CURSOR_B_OFFSET, \
++ }
++
++#define CHV_CURSOR_OFFSETS \
++ .cursor_offsets = { \
++ [PIPE_A] = CURSOR_A_OFFSET, \
++ [PIPE_B] = CURSOR_B_OFFSET, \
++ [PIPE_C] = CHV_CURSOR_C_OFFSET, \
++ }
++
++#define IVB_CURSOR_OFFSETS \
++ .cursor_offsets = { \
++ [PIPE_A] = CURSOR_A_OFFSET, \
++ [PIPE_B] = IVB_CURSOR_B_OFFSET, \
++ [PIPE_C] = IVB_CURSOR_C_OFFSET, \
++ }
++
++#define TGL_CURSOR_OFFSETS \
++ .cursor_offsets = { \
++ [PIPE_A] = CURSOR_A_OFFSET, \
++ [PIPE_B] = IVB_CURSOR_B_OFFSET, \
++ [PIPE_C] = IVB_CURSOR_C_OFFSET, \
++ [PIPE_D] = TGL_CURSOR_D_OFFSET, \
++ }
++
++#define I845_COLORS \
++ .color = { .gamma_lut_size = 256 }
++#define I9XX_COLORS \
++ .color = { .gamma_lut_size = 129, \
++ .gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
++ }
++#define ILK_COLORS \
++ .color = { .gamma_lut_size = 1024 }
++#define IVB_COLORS \
++ .color = { .degamma_lut_size = 1024, .gamma_lut_size = 1024 }
++#define CHV_COLORS \
++ .color = { \
++ .degamma_lut_size = 65, .gamma_lut_size = 257, \
++ .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
++ .gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
++ }
++#define GLK_COLORS \
++ .color = { \
++ .degamma_lut_size = 33, .gamma_lut_size = 1024, \
++ .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING | \
++ DRM_COLOR_LUT_EQUAL_CHANNELS, \
++ }
++#define ICL_COLORS \
++ .color = { \
++ .degamma_lut_size = 33, .gamma_lut_size = 262145, \
++ .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING | \
++ DRM_COLOR_LUT_EQUAL_CHANNELS, \
++ .gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
++ }
++
++#define I830_DISPLAY \
++ .has_overlay = 1, \
++ .cursor_needs_physical = 1, \
++ .overlay_needs_physical = 1, \
++ .has_gmch = 1, \
++ I9XX_PIPE_OFFSETS, \
++ I9XX_CURSOR_OFFSETS, \
++ I9XX_COLORS, \
++ \
++ .__runtime_defaults.ip.ver = 2, \
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
++ .__runtime_defaults.cpu_transcoder_mask = \
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B)
++
++static const struct intel_display_device_info i830_display = {
++ I830_DISPLAY,
++};
++
++#define I845_DISPLAY \
++ .has_overlay = 1, \
++ .overlay_needs_physical = 1, \
++ .has_gmch = 1, \
++ I845_PIPE_OFFSETS, \
++ I845_CURSOR_OFFSETS, \
++ I845_COLORS, \
++ \
++ .__runtime_defaults.ip.ver = 2, \
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A), \
++ .__runtime_defaults.cpu_transcoder_mask = BIT(TRANSCODER_A)
++
++static const struct intel_display_device_info i845_display = {
++ I845_DISPLAY,
++};
++
++static const struct intel_display_device_info i85x_display = {
++ I830_DISPLAY,
++
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info i865g_display = {
++ I845_DISPLAY,
++
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++#define GEN3_DISPLAY \
++ .has_gmch = 1, \
++ .has_overlay = 1, \
++ I9XX_PIPE_OFFSETS, \
++ I9XX_CURSOR_OFFSETS, \
++ \
++ .__runtime_defaults.ip.ver = 3, \
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
++ .__runtime_defaults.cpu_transcoder_mask = \
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B)
++
++static const struct intel_display_device_info i915g_display = {
++ GEN3_DISPLAY,
++ I845_COLORS,
++ .cursor_needs_physical = 1,
++ .overlay_needs_physical = 1,
++};
++
++static const struct intel_display_device_info i915gm_display = {
++ GEN3_DISPLAY,
++ I9XX_COLORS,
++ .cursor_needs_physical = 1,
++ .overlay_needs_physical = 1,
++ .supports_tv = 1,
++
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info i945g_display = {
++ GEN3_DISPLAY,
++ I845_COLORS,
++ .has_hotplug = 1,
++ .cursor_needs_physical = 1,
++ .overlay_needs_physical = 1,
++};
++
++static const struct intel_display_device_info i945gm_display = {
++ GEN3_DISPLAY,
++ I9XX_COLORS,
++ .has_hotplug = 1,
++ .cursor_needs_physical = 1,
++ .overlay_needs_physical = 1,
++ .supports_tv = 1,
++
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info g33_display = {
++ GEN3_DISPLAY,
++ I845_COLORS,
++ .has_hotplug = 1,
++};
++
++static const struct intel_display_device_info pnv_display = {
++ GEN3_DISPLAY,
++ I9XX_COLORS,
++ .has_hotplug = 1,
++};
++
++#define GEN4_DISPLAY \
++ .has_hotplug = 1, \
++ .has_gmch = 1, \
++ I9XX_PIPE_OFFSETS, \
++ I9XX_CURSOR_OFFSETS, \
++ I9XX_COLORS, \
++ \
++ .__runtime_defaults.ip.ver = 4, \
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
++ .__runtime_defaults.cpu_transcoder_mask = \
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B)
++
++static const struct intel_display_device_info i965g_display = {
++ GEN4_DISPLAY,
++ .has_overlay = 1,
++};
++
++static const struct intel_display_device_info i965gm_display = {
++ GEN4_DISPLAY,
++ .has_overlay = 1,
++ .supports_tv = 1,
++
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info g45_display = {
++ GEN4_DISPLAY,
++};
++
++static const struct intel_display_device_info gm45_display = {
++ GEN4_DISPLAY,
++ .supports_tv = 1,
++
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++#define ILK_DISPLAY \
++ .has_hotplug = 1, \
++ I9XX_PIPE_OFFSETS, \
++ I9XX_CURSOR_OFFSETS, \
++ ILK_COLORS, \
++ \
++ .__runtime_defaults.ip.ver = 5, \
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
++ .__runtime_defaults.cpu_transcoder_mask = \
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B)
++
++static const struct intel_display_device_info ilk_d_display = {
++ ILK_DISPLAY,
++};
++
++static const struct intel_display_device_info ilk_m_display = {
++ ILK_DISPLAY,
++
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info snb_display = {
++ .has_hotplug = 1,
++ I9XX_PIPE_OFFSETS,
++ I9XX_CURSOR_OFFSETS,
++ ILK_COLORS,
++
++ .__runtime_defaults.ip.ver = 6,
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B),
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info ivb_display = {
++ .has_hotplug = 1,
++ IVB_PIPE_OFFSETS,
++ IVB_CURSOR_OFFSETS,
++ IVB_COLORS,
++
++ .__runtime_defaults.ip.ver = 7,
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | BIT(TRANSCODER_C),
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info vlv_display = {
++ .has_gmch = 1,
++ .has_hotplug = 1,
++ .mmio_offset = VLV_DISPLAY_BASE,
++ I9XX_PIPE_OFFSETS,
++ I9XX_CURSOR_OFFSETS,
++ I9XX_COLORS,
++
++ .__runtime_defaults.ip.ver = 7,
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B),
++};
++
++static const struct intel_display_device_info hsw_display = {
++ .has_ddi = 1,
++ .has_dp_mst = 1,
++ .has_fpga_dbg = 1,
++ .has_hotplug = 1,
++ HSW_PIPE_OFFSETS,
++ IVB_CURSOR_OFFSETS,
++ IVB_COLORS,
++
++ .__runtime_defaults.ip.ver = 7,
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP),
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info bdw_display = {
++ .has_ddi = 1,
++ .has_dp_mst = 1,
++ .has_fpga_dbg = 1,
++ .has_hotplug = 1,
++ HSW_PIPE_OFFSETS,
++ IVB_CURSOR_OFFSETS,
++ IVB_COLORS,
++
++ .__runtime_defaults.ip.ver = 8,
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP),
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++static const struct intel_display_device_info chv_display = {
++ .has_hotplug = 1,
++ .has_gmch = 1,
++ .mmio_offset = VLV_DISPLAY_BASE,
++ CHV_PIPE_OFFSETS,
++ CHV_CURSOR_OFFSETS,
++ CHV_COLORS,
++
++ .__runtime_defaults.ip.ver = 8,
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | BIT(TRANSCODER_C),
++};
++
++static const struct intel_display_device_info skl_display = {
++ .dbuf.size = 896 - 4, /* 4 blocks for bypass path allocation */
++ .dbuf.slice_mask = BIT(DBUF_S1),
++ .has_ddi = 1,
++ .has_dp_mst = 1,
++ .has_fpga_dbg = 1,
++ .has_hotplug = 1,
++ .has_ipc = 1,
++ .has_psr = 1,
++ .has_psr_hw_tracking = 1,
++ HSW_PIPE_OFFSETS,
++ IVB_CURSOR_OFFSETS,
++ IVB_COLORS,
++
++ .__runtime_defaults.ip.ver = 9,
++ .__runtime_defaults.has_dmc = 1,
++ .__runtime_defaults.has_hdcp = 1,
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP),
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++#define GEN9_LP_DISPLAY \
++ .dbuf.slice_mask = BIT(DBUF_S1), \
++ .has_dp_mst = 1, \
++ .has_ddi = 1, \
++ .has_fpga_dbg = 1, \
++ .has_hotplug = 1, \
++ .has_ipc = 1, \
++ .has_psr = 1, \
++ .has_psr_hw_tracking = 1, \
++ HSW_PIPE_OFFSETS, \
++ IVB_CURSOR_OFFSETS, \
++ IVB_COLORS, \
++ \
++ .__runtime_defaults.has_dmc = 1, \
++ .__runtime_defaults.has_hdcp = 1, \
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A), \
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
++ .__runtime_defaults.cpu_transcoder_mask = \
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP) | \
++ BIT(TRANSCODER_DSI_A) | BIT(TRANSCODER_DSI_C)
++
++static const struct intel_display_device_info bxt_display = {
++ GEN9_LP_DISPLAY,
++ .dbuf.size = 512 - 4, /* 4 blocks for bypass path allocation */
++
++ .__runtime_defaults.ip.ver = 9,
++};
++
++static const struct intel_display_device_info glk_display = {
++ GEN9_LP_DISPLAY,
++ .dbuf.size = 1024 - 4, /* 4 blocks for bypass path allocation */
++ GLK_COLORS,
++
++ .__runtime_defaults.ip.ver = 10,
++};
++
++static const struct intel_display_device_info gen11_display = {
++ .abox_mask = BIT(0),
++ .dbuf.size = 2048,
++ .dbuf.slice_mask = BIT(DBUF_S1) | BIT(DBUF_S2),
++ .has_ddi = 1,
++ .has_dp_mst = 1,
++ .has_fpga_dbg = 1,
++ .has_hotplug = 1,
++ .has_ipc = 1,
++ .has_psr = 1,
++ .has_psr_hw_tracking = 1,
++ .pipe_offsets = {
++ [TRANSCODER_A] = PIPE_A_OFFSET,
++ [TRANSCODER_B] = PIPE_B_OFFSET,
++ [TRANSCODER_C] = PIPE_C_OFFSET,
++ [TRANSCODER_EDP] = PIPE_EDP_OFFSET,
++ [TRANSCODER_DSI_0] = PIPE_DSI0_OFFSET,
++ [TRANSCODER_DSI_1] = PIPE_DSI1_OFFSET,
++ },
++ .trans_offsets = {
++ [TRANSCODER_A] = TRANSCODER_A_OFFSET,
++ [TRANSCODER_B] = TRANSCODER_B_OFFSET,
++ [TRANSCODER_C] = TRANSCODER_C_OFFSET,
++ [TRANSCODER_EDP] = TRANSCODER_EDP_OFFSET,
++ [TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET,
++ [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET,
++ },
++ IVB_CURSOR_OFFSETS,
++ ICL_COLORS,
++
++ .__runtime_defaults.ip.ver = 11,
++ .__runtime_defaults.has_dmc = 1,
++ .__runtime_defaults.has_dsc = 1,
++ .__runtime_defaults.has_hdcp = 1,
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP) |
++ BIT(TRANSCODER_DSI_0) | BIT(TRANSCODER_DSI_1),
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A),
++};
++
++#define XE_D_DISPLAY \
++ .abox_mask = GENMASK(2, 1), \
++ .dbuf.size = 2048, \
++ .dbuf.slice_mask = BIT(DBUF_S1) | BIT(DBUF_S2), \
++ .has_ddi = 1, \
++ .has_dp_mst = 1, \
++ .has_dsb = 1, \
++ .has_fpga_dbg = 1, \
++ .has_hotplug = 1, \
++ .has_ipc = 1, \
++ .has_psr = 1, \
++ .has_psr_hw_tracking = 1, \
++ .pipe_offsets = { \
++ [TRANSCODER_A] = PIPE_A_OFFSET, \
++ [TRANSCODER_B] = PIPE_B_OFFSET, \
++ [TRANSCODER_C] = PIPE_C_OFFSET, \
++ [TRANSCODER_D] = PIPE_D_OFFSET, \
++ [TRANSCODER_DSI_0] = PIPE_DSI0_OFFSET, \
++ [TRANSCODER_DSI_1] = PIPE_DSI1_OFFSET, \
++ }, \
++ .trans_offsets = { \
++ [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
++ [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
++ [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
++ [TRANSCODER_D] = TRANSCODER_D_OFFSET, \
++ [TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \
++ [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
++ }, \
++ TGL_CURSOR_OFFSETS, \
++ ICL_COLORS, \
++ \
++ .__runtime_defaults.ip.ver = 12, \
++ .__runtime_defaults.has_dmc = 1, \
++ .__runtime_defaults.has_dsc = 1, \
++ .__runtime_defaults.has_hdcp = 1, \
++ .__runtime_defaults.pipe_mask = \
++ BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D), \
++ .__runtime_defaults.cpu_transcoder_mask = \
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_D) | \
++ BIT(TRANSCODER_DSI_0) | BIT(TRANSCODER_DSI_1), \
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A)
++
++static const struct intel_display_device_info tgl_display = {
++ XE_D_DISPLAY,
++};
++
++static const struct intel_display_device_info rkl_display = {
++ XE_D_DISPLAY,
++ .abox_mask = BIT(0),
++ .has_hti = 1,
++ .has_psr_hw_tracking = 0,
++
++ .__runtime_defaults.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | BIT(TRANSCODER_C),
++};
++
++static const struct intel_display_device_info adl_s_display = {
++ XE_D_DISPLAY,
++ .has_hti = 1,
++ .has_psr_hw_tracking = 0,
++};
++
++#define XE_LPD_FEATURES \
++ .abox_mask = GENMASK(1, 0), \
++ .color = { \
++ .degamma_lut_size = 129, .gamma_lut_size = 1024, \
++ .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING | \
++ DRM_COLOR_LUT_EQUAL_CHANNELS, \
++ }, \
++ .dbuf.size = 4096, \
++ .dbuf.slice_mask = BIT(DBUF_S1) | BIT(DBUF_S2) | BIT(DBUF_S3) | \
++ BIT(DBUF_S4), \
++ .has_ddi = 1, \
++ .has_dp_mst = 1, \
++ .has_dsb = 1, \
++ .has_fpga_dbg = 1, \
++ .has_hotplug = 1, \
++ .has_ipc = 1, \
++ .has_psr = 1, \
++ .pipe_offsets = { \
++ [TRANSCODER_A] = PIPE_A_OFFSET, \
++ [TRANSCODER_B] = PIPE_B_OFFSET, \
++ [TRANSCODER_C] = PIPE_C_OFFSET, \
++ [TRANSCODER_D] = PIPE_D_OFFSET, \
++ [TRANSCODER_DSI_0] = PIPE_DSI0_OFFSET, \
++ [TRANSCODER_DSI_1] = PIPE_DSI1_OFFSET, \
++ }, \
++ .trans_offsets = { \
++ [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
++ [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
++ [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
++ [TRANSCODER_D] = TRANSCODER_D_OFFSET, \
++ [TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \
++ [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
++ }, \
++ TGL_CURSOR_OFFSETS, \
++ \
++ .__runtime_defaults.ip.ver = 13, \
++ .__runtime_defaults.has_dmc = 1, \
++ .__runtime_defaults.has_dsc = 1, \
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A), \
++ .__runtime_defaults.has_hdcp = 1, \
++ .__runtime_defaults.pipe_mask = \
++ BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D)
++
++static const struct intel_display_device_info xe_lpd_display = {
++ XE_LPD_FEATURES,
++ .has_cdclk_crawl = 1,
++ .has_psr_hw_tracking = 0,
++
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_D) |
++ BIT(TRANSCODER_DSI_0) | BIT(TRANSCODER_DSI_1),
++};
++
++static const struct intel_display_device_info xe_hpd_display = {
++ XE_LPD_FEATURES,
++ .has_cdclk_squash = 1,
++
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
++};
++
++static const struct intel_display_device_info xe_lpdp_display = {
++ XE_LPD_FEATURES,
++ .has_cdclk_crawl = 1,
++ .has_cdclk_squash = 1,
++
++ .__runtime_defaults.ip.ver = 14,
++ .__runtime_defaults.fbc_mask = BIT(INTEL_FBC_A) | BIT(INTEL_FBC_B),
++ .__runtime_defaults.cpu_transcoder_mask =
++ BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
++ BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
++};
++
++#undef INTEL_VGA_DEVICE
++#undef INTEL_QUANTA_VGA_DEVICE
++#define INTEL_VGA_DEVICE(id, info) { id, info }
++#define INTEL_QUANTA_VGA_DEVICE(info) { 0x16a, info }
++
++static const struct {
++ u32 devid;
++ const struct intel_display_device_info *info;
++} intel_display_ids[] = {
++ INTEL_I830_IDS(&i830_display),
++ INTEL_I845G_IDS(&i845_display),
++ INTEL_I85X_IDS(&i85x_display),
++ INTEL_I865G_IDS(&i865g_display),
++ INTEL_I915G_IDS(&i915g_display),
++ INTEL_I915GM_IDS(&i915gm_display),
++ INTEL_I945G_IDS(&i945g_display),
++ INTEL_I945GM_IDS(&i945gm_display),
++ INTEL_I965G_IDS(&i965g_display),
++ INTEL_G33_IDS(&g33_display),
++ INTEL_I965GM_IDS(&i965gm_display),
++ INTEL_GM45_IDS(&gm45_display),
++ INTEL_G45_IDS(&g45_display),
++ INTEL_PINEVIEW_G_IDS(&pnv_display),
++ INTEL_PINEVIEW_M_IDS(&pnv_display),
++ INTEL_IRONLAKE_D_IDS(&ilk_d_display),
++ INTEL_IRONLAKE_M_IDS(&ilk_m_display),
++ INTEL_SNB_D_IDS(&snb_display),
++ INTEL_SNB_M_IDS(&snb_display),
++ INTEL_IVB_Q_IDS(NULL), /* must be first IVB in list */
++ INTEL_IVB_M_IDS(&ivb_display),
++ INTEL_IVB_D_IDS(&ivb_display),
++ INTEL_HSW_IDS(&hsw_display),
++ INTEL_VLV_IDS(&vlv_display),
++ INTEL_BDW_IDS(&bdw_display),
++ INTEL_CHV_IDS(&chv_display),
++ INTEL_SKL_IDS(&skl_display),
++ INTEL_BXT_IDS(&bxt_display),
++ INTEL_GLK_IDS(&glk_display),
++ INTEL_KBL_IDS(&skl_display),
++ INTEL_CFL_IDS(&skl_display),
++ INTEL_ICL_11_IDS(&gen11_display),
++ INTEL_EHL_IDS(&gen11_display),
++ INTEL_JSL_IDS(&gen11_display),
++ INTEL_TGL_12_IDS(&tgl_display),
++ INTEL_DG1_IDS(&tgl_display),
++ INTEL_RKL_IDS(&rkl_display),
++ INTEL_ADLS_IDS(&adl_s_display),
++ INTEL_RPLS_IDS(&adl_s_display),
++ INTEL_ADLP_IDS(&xe_lpd_display),
++ INTEL_ADLN_IDS(&xe_lpd_display),
++ INTEL_RPLP_IDS(&xe_lpd_display),
++ INTEL_DG2_IDS(&xe_hpd_display),
++
++ /* FIXME: Replace this with a GMD_ID lookup */
++ INTEL_MTL_IDS(&xe_lpdp_display),
++};
++
++const struct intel_display_device_info *
++intel_display_device_probe(u16 pci_devid)
++{
++ int i;
++
++ for (i = 0; i < ARRAY_SIZE(intel_display_ids); i++) {
++ if (intel_display_ids[i].devid == pci_devid)
++ return intel_display_ids[i].info;
++ }
++
++ return &no_display;
++}
+diff --git a/drivers/gpu/drm/i915/display/intel_display_device.h b/drivers/gpu/drm/i915/display/intel_display_device.h
+new file mode 100644
+index 0000000000000..1f7d08b3ad6b1
+--- /dev/null
++++ b/drivers/gpu/drm/i915/display/intel_display_device.h
+@@ -0,0 +1,86 @@
++/* SPDX-License-Identifier: MIT */
++/*
++ * Copyright © 2023 Intel Corporation
++ */
++
++#ifndef __INTEL_DISPLAY_DEVICE_H__
++#define __INTEL_DISPLAY_DEVICE_H__
++
++#include <linux/types.h>
++
++#include "display/intel_display_limits.h"
++
++#define DEV_INFO_DISPLAY_FOR_EACH_FLAG(func) \
++ /* Keep in alphabetical order */ \
++ func(cursor_needs_physical); \
++ func(has_cdclk_crawl); \
++ func(has_cdclk_squash); \
++ func(has_ddi); \
++ func(has_dp_mst); \
++ func(has_dsb); \
++ func(has_fpga_dbg); \
++ func(has_gmch); \
++ func(has_hotplug); \
++ func(has_hti); \
++ func(has_ipc); \
++ func(has_overlay); \
++ func(has_psr); \
++ func(has_psr_hw_tracking); \
++ func(overlay_needs_physical); \
++ func(supports_tv);
++
++struct intel_display_runtime_info {
++ struct {
++ u16 ver;
++ u16 rel;
++ u16 step;
++ } ip;
++
++ u8 pipe_mask;
++ u8 cpu_transcoder_mask;
++
++ u8 num_sprites[I915_MAX_PIPES];
++ u8 num_scalers[I915_MAX_PIPES];
++
++ u8 fbc_mask;
++
++ bool has_hdcp;
++ bool has_dmc;
++ bool has_dsc;
++};
++
++struct intel_display_device_info {
++ /* Initial runtime info. */
++ const struct intel_display_runtime_info __runtime_defaults;
++
++ u8 abox_mask;
++
++ struct {
++ u16 size; /* in blocks */
++ u8 slice_mask;
++ } dbuf;
++
++#define DEFINE_FLAG(name) u8 name:1
++ DEV_INFO_DISPLAY_FOR_EACH_FLAG(DEFINE_FLAG);
++#undef DEFINE_FLAG
++
++ /* Global register offset for the display engine */
++ u32 mmio_offset;
++
++ /* Register offsets for the various display pipes and transcoders */
++ u32 pipe_offsets[I915_MAX_TRANSCODERS];
++ u32 trans_offsets[I915_MAX_TRANSCODERS];
++ u32 cursor_offsets[I915_MAX_PIPES];
++
++ struct {
++ u32 degamma_lut_size;
++ u32 gamma_lut_size;
++ u32 degamma_lut_tests;
++ u32 gamma_lut_tests;
++ } color;
++};
++
++const struct intel_display_device_info *
++intel_display_device_probe(u16 pci_devid);
++
++#endif
+diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
+index 7c9f4288329ed..5f7deaa23b218 100644
+--- a/drivers/gpu/drm/i915/display/intel_display_power.c
++++ b/drivers/gpu/drm/i915/display/intel_display_power.c
+@@ -1052,7 +1052,7 @@ void gen9_dbuf_slices_update(struct drm_i915_private *dev_priv,
+ u8 req_slices)
+ {
+ struct i915_power_domains *power_domains = &dev_priv->display.power.domains;
+- u8 slice_mask = INTEL_INFO(dev_priv)->display.dbuf.slice_mask;
++ u8 slice_mask = DISPLAY_INFO(dev_priv)->dbuf.slice_mask;
+ enum dbuf_slice slice;
+
+ drm_WARN(&dev_priv->drm, req_slices & ~slice_mask,
+@@ -1112,7 +1112,7 @@ static void gen12_dbuf_slices_config(struct drm_i915_private *dev_priv)
+
+ static void icl_mbus_init(struct drm_i915_private *dev_priv)
+ {
+- unsigned long abox_regs = INTEL_INFO(dev_priv)->display.abox_mask;
++ unsigned long abox_regs = DISPLAY_INFO(dev_priv)->abox_mask;
+ u32 mask, val, i;
+
+ if (IS_ALDERLAKE_P(dev_priv) || DISPLAY_VER(dev_priv) >= 14)
+@@ -1558,7 +1558,7 @@ static void tgl_bw_buddy_init(struct drm_i915_private *dev_priv)
+ enum intel_dram_type type = dev_priv->dram_info.type;
+ u8 num_channels = dev_priv->dram_info.num_channels;
+ const struct buddy_page_mask *table;
+- unsigned long abox_mask = INTEL_INFO(dev_priv)->display.abox_mask;
++ unsigned long abox_mask = DISPLAY_INFO(dev_priv)->abox_mask;
+ int config, i;
+
+ /* BW_BUDDY registers are not used on dgpu's beyond DG1 */
+diff --git a/drivers/gpu/drm/i915/display/intel_display_reg_defs.h b/drivers/gpu/drm/i915/display/intel_display_reg_defs.h
+index 755c1ea8225c5..2f07b7afa3bfe 100644
+--- a/drivers/gpu/drm/i915/display/intel_display_reg_defs.h
++++ b/drivers/gpu/drm/i915/display/intel_display_reg_defs.h
+@@ -8,7 +8,7 @@
+
+ #include "i915_reg_defs.h"
+
+-#define DISPLAY_MMIO_BASE(dev_priv) (INTEL_INFO(dev_priv)->display.mmio_offset)
++#define DISPLAY_MMIO_BASE(dev_priv) (DISPLAY_INFO(dev_priv)->mmio_offset)
+
+ #define VLV_DISPLAY_BASE 0x180000
+
+@@ -36,14 +36,14 @@
+ * Device info offset array based helpers for groups of registers with unevenly
+ * spaced base offsets.
+ */
+-#define _MMIO_PIPE2(pipe, reg) _MMIO(INTEL_INFO(dev_priv)->display.pipe_offsets[(pipe)] - \
+- INTEL_INFO(dev_priv)->display.pipe_offsets[PIPE_A] + \
++#define _MMIO_PIPE2(pipe, reg) _MMIO(DISPLAY_INFO(dev_priv)->pipe_offsets[(pipe)] - \
++ DISPLAY_INFO(dev_priv)->pipe_offsets[PIPE_A] + \
+ DISPLAY_MMIO_BASE(dev_priv) + (reg))
+-#define _MMIO_TRANS2(tran, reg) _MMIO(INTEL_INFO(dev_priv)->display.trans_offsets[(tran)] - \
+- INTEL_INFO(dev_priv)->display.trans_offsets[TRANSCODER_A] + \
++#define _MMIO_TRANS2(tran, reg) _MMIO(DISPLAY_INFO(dev_priv)->trans_offsets[(tran)] - \
++ DISPLAY_INFO(dev_priv)->trans_offsets[TRANSCODER_A] + \
+ DISPLAY_MMIO_BASE(dev_priv) + (reg))
+-#define _MMIO_CURSOR2(pipe, reg) _MMIO(INTEL_INFO(dev_priv)->display.cursor_offsets[(pipe)] - \
+- INTEL_INFO(dev_priv)->display.cursor_offsets[PIPE_A] + \
++#define _MMIO_CURSOR2(pipe, reg) _MMIO(DISPLAY_INFO(dev_priv)->cursor_offsets[(pipe)] - \
++ DISPLAY_INFO(dev_priv)->cursor_offsets[PIPE_A] + \
+ DISPLAY_MMIO_BASE(dev_priv) + (reg))
+
+ #endif /* __INTEL_DISPLAY_REG_DEFS_H__ */
+diff --git a/drivers/gpu/drm/i915/display/intel_fb_pin.c b/drivers/gpu/drm/i915/display/intel_fb_pin.c
+index 1aca7552a85d0..fffd568070d41 100644
+--- a/drivers/gpu/drm/i915/display/intel_fb_pin.c
++++ b/drivers/gpu/drm/i915/display/intel_fb_pin.c
+@@ -243,7 +243,7 @@ int intel_plane_pin_fb(struct intel_plane_state *plane_state)
+ struct i915_vma *vma;
+ bool phys_cursor =
+ plane->id == PLANE_CURSOR &&
+- INTEL_INFO(dev_priv)->display.cursor_needs_physical;
++ DISPLAY_INFO(dev_priv)->cursor_needs_physical;
+
+ if (!intel_fb_uses_dpt(fb)) {
+ vma = intel_pin_and_fence_fb_obj(fb, phys_cursor,
+diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
+index b507ff944864e..61914a1fe58df 100644
+--- a/drivers/gpu/drm/i915/display/intel_fbc.c
++++ b/drivers/gpu/drm/i915/display/intel_fbc.c
+@@ -55,7 +55,7 @@
+
+ #define for_each_fbc_id(__dev_priv, __fbc_id) \
+ for ((__fbc_id) = INTEL_FBC_A; (__fbc_id) < I915_MAX_FBCS; (__fbc_id)++) \
+- for_each_if(RUNTIME_INFO(__dev_priv)->fbc_mask & BIT(__fbc_id))
++ for_each_if(DISPLAY_RUNTIME_INFO(__dev_priv)->fbc_mask & BIT(__fbc_id))
+
+ #define for_each_intel_fbc(__dev_priv, __fbc, __fbc_id) \
+ for_each_fbc_id((__dev_priv), (__fbc_id)) \
+@@ -1707,10 +1707,10 @@ void intel_fbc_init(struct drm_i915_private *i915)
+ enum intel_fbc_id fbc_id;
+
+ if (!drm_mm_initialized(&i915->mm.stolen))
+- RUNTIME_INFO(i915)->fbc_mask = 0;
++ DISPLAY_RUNTIME_INFO(i915)->fbc_mask = 0;
+
+ if (need_fbc_vtd_wa(i915))
+- RUNTIME_INFO(i915)->fbc_mask = 0;
++ DISPLAY_RUNTIME_INFO(i915)->fbc_mask = 0;
+
+ i915->params.enable_fbc = intel_sanitize_fbc_option(i915);
+ drm_dbg_kms(&i915->drm, "Sanitized enable_fbc value: %d\n",
+diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c b/drivers/gpu/drm/i915/display/intel_hdcp.c
+index b183efab04a1d..ac46350074df2 100644
+--- a/drivers/gpu/drm/i915/display/intel_hdcp.c
++++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
+@@ -1141,7 +1141,7 @@ static void intel_hdcp_prop_work(struct work_struct *work)
+
+ bool is_hdcp_supported(struct drm_i915_private *dev_priv, enum port port)
+ {
+- return RUNTIME_INFO(dev_priv)->has_hdcp &&
++ return DISPLAY_RUNTIME_INFO(dev_priv)->has_hdcp &&
+ (DISPLAY_VER(dev_priv) >= 12 || port < PORT_E);
+ }
+
+diff --git a/drivers/gpu/drm/i915/display/intel_hti.c b/drivers/gpu/drm/i915/display/intel_hti.c
+index c518efebdf779..a92d008d4e6e5 100644
+--- a/drivers/gpu/drm/i915/display/intel_hti.c
++++ b/drivers/gpu/drm/i915/display/intel_hti.c
+@@ -15,7 +15,7 @@ void intel_hti_init(struct drm_i915_private *i915)
+ * If the platform has HTI, we need to find out whether it has reserved
+ * any display resources before we create our display outputs.
+ */
+- if (INTEL_INFO(i915)->display.has_hti)
++ if (DISPLAY_INFO(i915)->has_hti)
+ i915->display.hti.state = intel_de_read(i915, HDPORT_STATE);
+ }
+
+diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c
+index 6badfff2b4a28..b7cbc780e672f 100644
+--- a/drivers/gpu/drm/i915/display/intel_psr.c
++++ b/drivers/gpu/drm/i915/display/intel_psr.c
+@@ -851,9 +851,9 @@ static bool _compute_psr2_wake_times(struct intel_dp *intel_dp,
+ }
+
+ io_wake_lines = intel_usecs_to_scanlines(
+- &crtc_state->uapi.adjusted_mode, io_wake_time);
++ &crtc_state->hw.adjusted_mode, io_wake_time);
+ fast_wake_lines = intel_usecs_to_scanlines(
+- &crtc_state->uapi.adjusted_mode, fast_wake_time);
++ &crtc_state->hw.adjusted_mode, fast_wake_time);
+
+ if (io_wake_lines > max_wake_lines ||
+ fast_wake_lines > max_wake_lines)
+diff --git a/drivers/gpu/drm/i915/display/intel_psr_regs.h b/drivers/gpu/drm/i915/display/intel_psr_regs.h
+index 958d8cabc44b5..5e3fe23ef8eb2 100644
+--- a/drivers/gpu/drm/i915/display/intel_psr_regs.h
++++ b/drivers/gpu/drm/i915/display/intel_psr_regs.h
+@@ -75,7 +75,7 @@
+
+ #define _SRD_AUX_DATA_A 0x60814
+ #define _SRD_AUX_DATA_EDP 0x6f814
+-#define EDP_PSR_AUX_DATA(tran, i) _MMIO_TRANS2(tran, _SRD_AUX_DATA_A + (i) + 4) /* 5 registers */
++#define EDP_PSR_AUX_DATA(tran, i) _MMIO_TRANS2(tran, _SRD_AUX_DATA_A + (i) * 4) /* 5 registers */
+
+ #define _SRD_STATUS_A 0x60840
+ #define _SRD_STATUS_EDP 0x6f840
+diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_plane.c
+index 8ea0598a5a07e..5df7b02483629 100644
+--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
++++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
+@@ -1936,7 +1936,7 @@ static enum intel_fbc_id skl_fbc_id_for_pipe(enum pipe pipe)
+ static bool skl_plane_has_fbc(struct drm_i915_private *dev_priv,
+ enum intel_fbc_id fbc_id, enum plane_id plane_id)
+ {
+- if ((RUNTIME_INFO(dev_priv)->fbc_mask & BIT(fbc_id)) == 0)
++ if ((DISPLAY_RUNTIME_INFO(dev_priv)->fbc_mask & BIT(fbc_id)) == 0)
+ return false;
+
+ return plane_id == PLANE_PRIMARY;
+diff --git a/drivers/gpu/drm/i915/display/skl_watermark.c b/drivers/gpu/drm/i915/display/skl_watermark.c
+index 1c7e6468f3e34..d1245c847f1cb 100644
+--- a/drivers/gpu/drm/i915/display/skl_watermark.c
++++ b/drivers/gpu/drm/i915/display/skl_watermark.c
+@@ -507,8 +507,8 @@ static u16 skl_ddb_entry_init(struct skl_ddb_entry *entry,
+
+ static int intel_dbuf_slice_size(struct drm_i915_private *i915)
+ {
+- return INTEL_INFO(i915)->display.dbuf.size /
+- hweight8(INTEL_INFO(i915)->display.dbuf.slice_mask);
++ return DISPLAY_INFO(i915)->dbuf.size /
++ hweight8(DISPLAY_INFO(i915)->dbuf.slice_mask);
+ }
+
+ static void
+@@ -527,7 +527,7 @@ skl_ddb_entry_for_slices(struct drm_i915_private *i915, u8 slice_mask,
+ ddb->end = fls(slice_mask) * slice_size;
+
+ WARN_ON(ddb->start >= ddb->end);
+- WARN_ON(ddb->end > INTEL_INFO(i915)->display.dbuf.size);
++ WARN_ON(ddb->end > DISPLAY_INFO(i915)->dbuf.size);
+ }
+
+ static unsigned int mbus_ddb_offset(struct drm_i915_private *i915, u8 slice_mask)
+@@ -2625,7 +2625,7 @@ skl_compute_ddb(struct intel_atomic_state *state)
+ "Enabled dbuf slices 0x%x -> 0x%x (total dbuf slices 0x%x), mbus joined? %s->%s\n",
+ old_dbuf_state->enabled_slices,
+ new_dbuf_state->enabled_slices,
+- INTEL_INFO(i915)->display.dbuf.slice_mask,
++ DISPLAY_INFO(i915)->dbuf.slice_mask,
+ str_yes_no(old_dbuf_state->joined_mbus),
+ str_yes_no(new_dbuf_state->joined_mbus));
+ }
+diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c
+index 28f27091cd3b7..ee2b44f896a27 100644
+--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c
++++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c
+@@ -451,6 +451,33 @@ static ssize_t punit_req_freq_mhz_show(struct kobject *kobj,
+ return sysfs_emit(buff, "%u\n", preq);
+ }
+
++static ssize_t slpc_ignore_eff_freq_show(struct kobject *kobj,
++ struct kobj_attribute *attr,
++ char *buff)
++{
++ struct intel_gt *gt = intel_gt_sysfs_get_drvdata(kobj, attr->attr.name);
++ struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
++
++ return sysfs_emit(buff, "%u\n", slpc->ignore_eff_freq);
++}
++
++static ssize_t slpc_ignore_eff_freq_store(struct kobject *kobj,
++ struct kobj_attribute *attr,
++ const char *buff, size_t count)
++{
++ struct intel_gt *gt = intel_gt_sysfs_get_drvdata(kobj, attr->attr.name);
++ struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
++ int err;
++ u32 val;
++
++ err = kstrtou32(buff, 0, &val);
++ if (err)
++ return err;
++
++ err = intel_guc_slpc_set_ignore_eff_freq(slpc, val);
++ return err ?: count;
++}
++
+ struct intel_gt_bool_throttle_attr {
+ struct attribute attr;
+ ssize_t (*show)(struct kobject *kobj, struct kobj_attribute *attr,
+@@ -663,6 +690,8 @@ static struct kobj_attribute attr_media_freq_factor_scale =
+ INTEL_GT_ATTR_RO(media_RP0_freq_mhz);
+ INTEL_GT_ATTR_RO(media_RPn_freq_mhz);
+
++INTEL_GT_ATTR_RW(slpc_ignore_eff_freq);
++
+ static const struct attribute *media_perf_power_attrs[] = {
+ &attr_media_freq_factor.attr,
+ &attr_media_freq_factor_scale.attr,
+@@ -744,6 +773,12 @@ void intel_gt_sysfs_pm_init(struct intel_gt *gt, struct kobject *kobj)
+ if (ret)
+ gt_warn(gt, "failed to create punit_req_freq_mhz sysfs (%pe)", ERR_PTR(ret));
+
++ if (intel_uc_uses_guc_slpc(&gt->uc)) {
++ ret = sysfs_create_file(kobj, &attr_slpc_ignore_eff_freq.attr);
++ if (ret)
++ gt_warn(gt, "failed to create ignore_eff_freq sysfs (%pe)", ERR_PTR(ret));
++ }
++
+ if (i915_mmio_reg_valid(intel_gt_perf_limit_reasons_reg(gt))) {
+ ret = sysfs_create_files(kobj, throttle_reason_attrs);
+ if (ret)
+diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+index 026d73855f36c..cc18e8f664864 100644
+--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
++++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+@@ -277,6 +277,7 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
+
+ slpc->max_freq_softlimit = 0;
+ slpc->min_freq_softlimit = 0;
++ slpc->ignore_eff_freq = false;
+ slpc->min_is_rpmax = false;
+
+ slpc->boost_freq = 0;
+@@ -457,6 +458,29 @@ int intel_guc_slpc_get_max_freq(struct intel_guc_slpc *slpc, u32 *val)
+ return ret;
+ }
+
++int intel_guc_slpc_set_ignore_eff_freq(struct intel_guc_slpc *slpc, bool val)
++{
++ struct drm_i915_private *i915 = slpc_to_i915(slpc);
++ intel_wakeref_t wakeref;
++ int ret;
++
++ mutex_lock(&slpc->lock);
++ wakeref = intel_runtime_pm_get(&i915->runtime_pm);
++
++ ret = slpc_set_param(slpc,
++ SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
++ val);
++ if (ret)
++ guc_probe_error(slpc_to_guc(slpc), "Failed to set efficient freq(%d): %pe\n",
++ val, ERR_PTR(ret));
++ else
++ slpc->ignore_eff_freq = val;
++
++ intel_runtime_pm_put(&i915->runtime_pm, wakeref);
++ mutex_unlock(&slpc->lock);
++ return ret;
++}
++
+ /**
+ * intel_guc_slpc_set_min_freq() - Set min frequency limit for SLPC.
+ * @slpc: pointer to intel_guc_slpc.
+@@ -482,16 +506,6 @@ int intel_guc_slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 val)
+ mutex_lock(&slpc->lock);
+ wakeref = intel_runtime_pm_get(&i915->runtime_pm);
+
+- /* Ignore efficient freq if lower min freq is requested */
+- ret = slpc_set_param(slpc,
+- SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
+- val < slpc->rp1_freq);
+- if (ret) {
+- guc_probe_error(slpc_to_guc(slpc), "Failed to toggle efficient freq: %pe\n",
+- ERR_PTR(ret));
+- goto out;
+- }
+-
+ ret = slpc_set_param(slpc,
+ SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+ val);
+@@ -499,7 +513,6 @@ int intel_guc_slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 val)
+ if (!ret)
+ slpc->min_freq_softlimit = val;
+
+-out:
+ intel_runtime_pm_put(&i915->runtime_pm, wakeref);
+ mutex_unlock(&slpc->lock);
+
+@@ -593,7 +606,7 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
+ if (unlikely(ret))
+ return ret;
+ slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
+- } else if (slpc->min_freq_softlimit != slpc->min_freq) {
++ } else {
+ return intel_guc_slpc_set_min_freq(slpc,
+ slpc->min_freq_softlimit);
+ }
+@@ -752,6 +765,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
+ /* Set cached media freq ratio mode */
+ intel_guc_slpc_set_media_ratio_mode(slpc, slpc->media_ratio_mode);
+
++ /* Set cached value of ignore efficient freq */
++ intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
++
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
+index 17ed515f6a852..597eb5413ddf2 100644
+--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
++++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
+@@ -46,5 +46,6 @@ void intel_guc_slpc_boost(struct intel_guc_slpc *slpc);
+ void intel_guc_slpc_dec_waiters(struct intel_guc_slpc *slpc);
+ int intel_guc_slpc_unset_gucrc_mode(struct intel_guc_slpc *slpc);
+ int intel_guc_slpc_override_gucrc_mode(struct intel_guc_slpc *slpc, u32 mode);
++int intel_guc_slpc_set_ignore_eff_freq(struct intel_guc_slpc *slpc, bool val);
+
+ #endif
+diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
+index a6ef53b04e047..a886513314977 100644
+--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
++++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
+@@ -31,6 +31,7 @@ struct intel_guc_slpc {
+ /* frequency softlimits */
+ u32 min_freq_softlimit;
+ u32 max_freq_softlimit;
++ bool ignore_eff_freq;
+
+ /* cached media ratio mode */
+ u32 media_ratio_mode;
+diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
+index 93fdc40d724fa..2980ccdef6cd6 100644
+--- a/drivers/gpu/drm/i915/i915_driver.c
++++ b/drivers/gpu/drm/i915/i915_driver.c
+@@ -720,8 +720,6 @@ i915_driver_create(struct pci_dev *pdev, const struct pci_device_id *ent)
+ {
+ const struct intel_device_info *match_info =
+ (struct intel_device_info *)ent->driver_data;
+- struct intel_device_info *device_info;
+- struct intel_runtime_info *runtime;
+ struct drm_i915_private *i915;
+
+ i915 = devm_drm_dev_alloc(&pdev->dev, &i915_drm_driver,
+@@ -734,14 +732,8 @@ i915_driver_create(struct pci_dev *pdev, const struct pci_device_id *ent)
+ /* Device parameters start as a copy of module parameters. */
+ i915_params_copy(&i915->params, &i915_modparams);
+
+- /* Setup the write-once "constant" device info */
+- device_info = mkwrite_device_info(i915);
+- memcpy(device_info, match_info, sizeof(*device_info));
+-
+- /* Initialize initial runtime info from static const data and pdev. */
+- runtime = RUNTIME_INFO(i915);
+- memcpy(runtime, &INTEL_INFO(i915)->__runtime, sizeof(*runtime));
+- runtime->device_id = pdev->device;
++ /* Set up device info and initial runtime info. */
++ intel_device_info_driver_create(i915, pdev->device, match_info);
+
+ return i915;
+ }
+diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
+index e771fdc3099c2..dbdecf1ee24fe 100644
+--- a/drivers/gpu/drm/i915/i915_drv.h
++++ b/drivers/gpu/drm/i915/i915_drv.h
+@@ -205,6 +205,7 @@ struct drm_i915_private {
+
+ const struct intel_device_info __info; /* Use INTEL_INFO() to access. */
+ struct intel_runtime_info __runtime; /* Use RUNTIME_INFO() to access. */
++ struct intel_display_runtime_info __display_runtime; /* Access with DISPLAY_RUNTIME_INFO() */
+ struct intel_driver_caps caps;
+
+ struct i915_dsm dsm;
+@@ -408,7 +409,9 @@ static inline struct intel_gt *to_gt(struct drm_i915_private *i915)
+ (engine__) = rb_to_uabi_engine(rb_next(&(engine__)->uabi_node)))
+
+ #define INTEL_INFO(dev_priv) (&(dev_priv)->__info)
++#define DISPLAY_INFO(i915) (INTEL_INFO(i915)->display)
+ #define RUNTIME_INFO(dev_priv) (&(dev_priv)->__runtime)
++#define DISPLAY_RUNTIME_INFO(i915) (&(i915)->__display_runtime)
+ #define DRIVER_CAPS(dev_priv) (&(dev_priv)->caps)
+
+ #define INTEL_DEVID(dev_priv) (RUNTIME_INFO(dev_priv)->device_id)
+@@ -427,7 +430,7 @@ static inline struct intel_gt *to_gt(struct drm_i915_private *i915)
+ #define IS_MEDIA_VER(i915, from, until) \
+ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))
+
+-#define DISPLAY_VER(i915) (RUNTIME_INFO(i915)->display.ip.ver)
++#define DISPLAY_VER(i915) (DISPLAY_RUNTIME_INFO(i915)->ip.ver)
+ #define IS_DISPLAY_VER(i915, from, until) \
+ (DISPLAY_VER(i915) >= (from) && DISPLAY_VER(i915) <= (until))
+
+@@ -782,9 +785,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
+ ((sizes) & ~RUNTIME_INFO(dev_priv)->page_sizes) == 0; \
+ })
+
+-#define HAS_OVERLAY(dev_priv) (INTEL_INFO(dev_priv)->display.has_overlay)
++#define HAS_OVERLAY(dev_priv) (DISPLAY_INFO(dev_priv)->has_overlay)
+ #define OVERLAY_NEEDS_PHYSICAL(dev_priv) \
+- (INTEL_INFO(dev_priv)->display.overlay_needs_physical)
++ (DISPLAY_INFO(dev_priv)->overlay_needs_physical)
+
+ /* Early gen2 have a totally busted CS tlb and require pinned batches. */
+ #define HAS_BROKEN_CS_TLB(dev_priv) (IS_I830(dev_priv) || IS_I845G(dev_priv))
+@@ -806,31 +809,31 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
+ */
+ #define HAS_128_BYTE_Y_TILING(dev_priv) (GRAPHICS_VER(dev_priv) != 2 && \
+ !(IS_I915G(dev_priv) || IS_I915GM(dev_priv)))
+-#define SUPPORTS_TV(dev_priv) (INTEL_INFO(dev_priv)->display.supports_tv)
+-#define I915_HAS_HOTPLUG(dev_priv) (INTEL_INFO(dev_priv)->display.has_hotplug)
++#define SUPPORTS_TV(dev_priv) (DISPLAY_INFO(dev_priv)->supports_tv)
++#define I915_HAS_HOTPLUG(dev_priv) (DISPLAY_INFO(dev_priv)->has_hotplug)
+
+ #define HAS_FW_BLC(dev_priv) (DISPLAY_VER(dev_priv) > 2)
+-#define HAS_FBC(dev_priv) (RUNTIME_INFO(dev_priv)->fbc_mask != 0)
++#define HAS_FBC(dev_priv) (DISPLAY_RUNTIME_INFO(dev_priv)->fbc_mask != 0)
+ #define HAS_CUR_FBC(dev_priv) (!HAS_GMCH(dev_priv) && DISPLAY_VER(dev_priv) >= 7)
+
+ #define HAS_DPT(dev_priv) (DISPLAY_VER(dev_priv) >= 13)
+
+ #define HAS_IPS(dev_priv) (IS_HSW_ULT(dev_priv) || IS_BROADWELL(dev_priv))
+
+-#define HAS_DP_MST(dev_priv) (INTEL_INFO(dev_priv)->display.has_dp_mst)
++#define HAS_DP_MST(dev_priv) (DISPLAY_INFO(dev_priv)->has_dp_mst)
+ #define HAS_DP20(dev_priv) (IS_DG2(dev_priv) || DISPLAY_VER(dev_priv) >= 14)
+
+ #define HAS_DOUBLE_BUFFERED_M_N(dev_priv) (DISPLAY_VER(dev_priv) >= 9 || IS_BROADWELL(dev_priv))
+
+-#define HAS_CDCLK_CRAWL(dev_priv) (INTEL_INFO(dev_priv)->display.has_cdclk_crawl)
+-#define HAS_CDCLK_SQUASH(dev_priv) (INTEL_INFO(dev_priv)->display.has_cdclk_squash)
+-#define HAS_DDI(dev_priv) (INTEL_INFO(dev_priv)->display.has_ddi)
+-#define HAS_FPGA_DBG_UNCLAIMED(dev_priv) (INTEL_INFO(dev_priv)->display.has_fpga_dbg)
+-#define HAS_PSR(dev_priv) (INTEL_INFO(dev_priv)->display.has_psr)
++#define HAS_CDCLK_CRAWL(dev_priv) (DISPLAY_INFO(dev_priv)->has_cdclk_crawl)
++#define HAS_CDCLK_SQUASH(dev_priv) (DISPLAY_INFO(dev_priv)->has_cdclk_squash)
++#define HAS_DDI(dev_priv) (DISPLAY_INFO(dev_priv)->has_ddi)
++#define HAS_FPGA_DBG_UNCLAIMED(dev_priv) (DISPLAY_INFO(dev_priv)->has_fpga_dbg)
++#define HAS_PSR(dev_priv) (DISPLAY_INFO(dev_priv)->has_psr)
+ #define HAS_PSR_HW_TRACKING(dev_priv) \
+- (INTEL_INFO(dev_priv)->display.has_psr_hw_tracking)
++ (DISPLAY_INFO(dev_priv)->has_psr_hw_tracking)
+ #define HAS_PSR2_SEL_FETCH(dev_priv) (DISPLAY_VER(dev_priv) >= 12)
+-#define HAS_TRANSCODER(dev_priv, trans) ((RUNTIME_INFO(dev_priv)->cpu_transcoder_mask & BIT(trans)) != 0)
++#define HAS_TRANSCODER(dev_priv, trans) ((DISPLAY_RUNTIME_INFO(dev_priv)->cpu_transcoder_mask & BIT(trans)) != 0)
+
+ #define HAS_RC6(dev_priv) (INTEL_INFO(dev_priv)->has_rc6)
+ #define HAS_RC6p(dev_priv) (INTEL_INFO(dev_priv)->has_rc6p)
+@@ -838,9 +841,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
+
+ #define HAS_RPS(dev_priv) (INTEL_INFO(dev_priv)->has_rps)
+
+-#define HAS_DMC(dev_priv) (RUNTIME_INFO(dev_priv)->has_dmc)
+-#define HAS_DSB(dev_priv) (INTEL_INFO(dev_priv)->display.has_dsb)
+-#define HAS_DSC(__i915) (RUNTIME_INFO(__i915)->has_dsc)
++#define HAS_DMC(dev_priv) (DISPLAY_RUNTIME_INFO(dev_priv)->has_dmc)
++#define HAS_DSB(dev_priv) (DISPLAY_INFO(dev_priv)->has_dsb)
++#define HAS_DSC(__i915) (DISPLAY_RUNTIME_INFO(__i915)->has_dsc)
+ #define HAS_HW_SAGV_WM(i915) (DISPLAY_VER(i915) >= 13 && !IS_DGFX(i915))
+
+ #define HAS_HECI_PXP(dev_priv) \
+@@ -869,7 +872,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
+ */
+ #define HAS_64K_PAGES(dev_priv) (INTEL_INFO(dev_priv)->has_64k_pages)
+
+-#define HAS_IPC(dev_priv) (INTEL_INFO(dev_priv)->display.has_ipc)
++#define HAS_IPC(dev_priv) (DISPLAY_INFO(dev_priv)->has_ipc)
+ #define HAS_SAGV(dev_priv) (DISPLAY_VER(dev_priv) >= 9 && !IS_LP(dev_priv))
+
+ #define HAS_REGION(i915, i) (RUNTIME_INFO(i915)->memory_regions & (i))
+@@ -889,7 +892,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
+
+ #define HAS_GLOBAL_MOCS_REGISTERS(dev_priv) (INTEL_INFO(dev_priv)->has_global_mocs)
+
+-#define HAS_GMCH(dev_priv) (INTEL_INFO(dev_priv)->display.has_gmch)
++#define HAS_GMCH(dev_priv) (DISPLAY_INFO(dev_priv)->has_gmch)
+
+ #define HAS_GMD_ID(i915) (INTEL_INFO(i915)->has_gmd_id)
+
+@@ -902,9 +905,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
+ #define NUM_L3_SLICES(dev_priv) (IS_HSW_GT3(dev_priv) ? \
+ 2 : HAS_L3_DPF(dev_priv))
+
+-#define INTEL_NUM_PIPES(dev_priv) (hweight8(RUNTIME_INFO(dev_priv)->pipe_mask))
++#define INTEL_NUM_PIPES(dev_priv) (hweight8(DISPLAY_RUNTIME_INFO(dev_priv)->pipe_mask))
+
+-#define HAS_DISPLAY(dev_priv) (RUNTIME_INFO(dev_priv)->pipe_mask != 0)
++#define HAS_DISPLAY(dev_priv) (DISPLAY_RUNTIME_INFO(dev_priv)->pipe_mask != 0)
+
+ #define HAS_VRR(i915) (DISPLAY_VER(i915) >= 11)
+
+@@ -931,11 +934,4 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
+ #define HAS_LMEMBAR_SMEM_STOLEN(i915) (!HAS_LMEM(i915) && \
+ GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+
+-/* intel_device_info.c */
+-static inline struct intel_device_info *
+-mkwrite_device_info(struct drm_i915_private *dev_priv)
+-{
+- return (struct intel_device_info *)INTEL_INFO(dev_priv);
+-}
+-
+ #endif
+diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
+index edcfb5fe20b24..6b69d4c7bdb79 100644
+--- a/drivers/gpu/drm/i915/i915_pci.c
++++ b/drivers/gpu/drm/i915/i915_pci.c
+@@ -39,129 +39,7 @@
+ #define PLATFORM(x) .platform = (x)
+ #define GEN(x) \
+ .__runtime.graphics.ip.ver = (x), \
+- .__runtime.media.ip.ver = (x), \
+- .__runtime.display.ip.ver = (x)
+-
+-#define NO_DISPLAY .__runtime.pipe_mask = 0
+-
+-#define I845_PIPE_OFFSETS \
+- .display.pipe_offsets = { \
+- [TRANSCODER_A] = PIPE_A_OFFSET, \
+- }, \
+- .display.trans_offsets = { \
+- [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
+- }
+-
+-#define I9XX_PIPE_OFFSETS \
+- .display.pipe_offsets = { \
+- [TRANSCODER_A] = PIPE_A_OFFSET, \
+- [TRANSCODER_B] = PIPE_B_OFFSET, \
+- }, \
+- .display.trans_offsets = { \
+- [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
+- [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
+- }
+-
+-#define IVB_PIPE_OFFSETS \
+- .display.pipe_offsets = { \
+- [TRANSCODER_A] = PIPE_A_OFFSET, \
+- [TRANSCODER_B] = PIPE_B_OFFSET, \
+- [TRANSCODER_C] = PIPE_C_OFFSET, \
+- }, \
+- .display.trans_offsets = { \
+- [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
+- [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
+- [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
+- }
+-
+-#define HSW_PIPE_OFFSETS \
+- .display.pipe_offsets = { \
+- [TRANSCODER_A] = PIPE_A_OFFSET, \
+- [TRANSCODER_B] = PIPE_B_OFFSET, \
+- [TRANSCODER_C] = PIPE_C_OFFSET, \
+- [TRANSCODER_EDP] = PIPE_EDP_OFFSET, \
+- }, \
+- .display.trans_offsets = { \
+- [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
+- [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
+- [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
+- [TRANSCODER_EDP] = TRANSCODER_EDP_OFFSET, \
+- }
+-
+-#define CHV_PIPE_OFFSETS \
+- .display.pipe_offsets = { \
+- [TRANSCODER_A] = PIPE_A_OFFSET, \
+- [TRANSCODER_B] = PIPE_B_OFFSET, \
+- [TRANSCODER_C] = CHV_PIPE_C_OFFSET, \
+- }, \
+- .display.trans_offsets = { \
+- [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
+- [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
+- [TRANSCODER_C] = CHV_TRANSCODER_C_OFFSET, \
+- }
+-
+-#define I845_CURSOR_OFFSETS \
+- .display.cursor_offsets = { \
+- [PIPE_A] = CURSOR_A_OFFSET, \
+- }
+-
+-#define I9XX_CURSOR_OFFSETS \
+- .display.cursor_offsets = { \
+- [PIPE_A] = CURSOR_A_OFFSET, \
+- [PIPE_B] = CURSOR_B_OFFSET, \
+- }
+-
+-#define CHV_CURSOR_OFFSETS \
+- .display.cursor_offsets = { \
+- [PIPE_A] = CURSOR_A_OFFSET, \
+- [PIPE_B] = CURSOR_B_OFFSET, \
+- [PIPE_C] = CHV_CURSOR_C_OFFSET, \
+- }
+-
+-#define IVB_CURSOR_OFFSETS \
+- .display.cursor_offsets = { \
+- [PIPE_A] = CURSOR_A_OFFSET, \
+- [PIPE_B] = IVB_CURSOR_B_OFFSET, \
+- [PIPE_C] = IVB_CURSOR_C_OFFSET, \
+- }
+-
+-#define TGL_CURSOR_OFFSETS \
+- .display.cursor_offsets = { \
+- [PIPE_A] = CURSOR_A_OFFSET, \
+- [PIPE_B] = IVB_CURSOR_B_OFFSET, \
+- [PIPE_C] = IVB_CURSOR_C_OFFSET, \
+- [PIPE_D] = TGL_CURSOR_D_OFFSET, \
+- }
+-
+-#define I845_COLORS \
+- .display.color = { .gamma_lut_size = 256 }
+-#define I9XX_COLORS \
+- .display.color = { .gamma_lut_size = 129, \
+- .gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
+- }
+-#define ILK_COLORS \
+- .display.color = { .gamma_lut_size = 1024 }
+-#define IVB_COLORS \
+- .display.color = { .degamma_lut_size = 1024, .gamma_lut_size = 1024 }
+-#define CHV_COLORS \
+- .display.color = { \
+- .degamma_lut_size = 65, .gamma_lut_size = 257, \
+- .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
+- .gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
+- }
+-#define GLK_COLORS \
+- .display.color = { \
+- .degamma_lut_size = 33, .gamma_lut_size = 1024, \
+- .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING | \
+- DRM_COLOR_LUT_EQUAL_CHANNELS, \
+- }
+-#define ICL_COLORS \
+- .display.color = { \
+- .degamma_lut_size = 33, .gamma_lut_size = 262145, \
+- .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING | \
+- DRM_COLOR_LUT_EQUAL_CHANNELS, \
+- .gamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING, \
+- }
++ .__runtime.media.ip.ver = (x)
+
+ /* Keep in gen based order, and chronological order within a gen */
+
+@@ -174,12 +52,6 @@
+ #define I830_FEATURES \
+ GEN(2), \
+ .is_mobile = 1, \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B), \
+- .display.has_overlay = 1, \
+- .display.cursor_needs_physical = 1, \
+- .display.overlay_needs_physical = 1, \
+- .display.has_gmch = 1, \
+ .gpu_reset_clobbers_display = true, \
+ .has_3d_pipeline = 1, \
+ .hws_needs_physical = 1, \
+@@ -188,19 +60,11 @@
+ .has_snoop = true, \
+ .has_coherent_ggtt = false, \
+ .dma_mask_size = 32, \
+- I9XX_PIPE_OFFSETS, \
+- I9XX_CURSOR_OFFSETS, \
+- I9XX_COLORS, \
+ GEN_DEFAULT_PAGE_SIZES, \
+ GEN_DEFAULT_REGIONS
+
+ #define I845_FEATURES \
+ GEN(2), \
+- .__runtime.pipe_mask = BIT(PIPE_A), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A), \
+- .display.has_overlay = 1, \
+- .display.overlay_needs_physical = 1, \
+- .display.has_gmch = 1, \
+ .has_3d_pipeline = 1, \
+ .gpu_reset_clobbers_display = true, \
+ .hws_needs_physical = 1, \
+@@ -209,9 +73,6 @@
+ .has_snoop = true, \
+ .has_coherent_ggtt = false, \
+ .dma_mask_size = 32, \
+- I845_PIPE_OFFSETS, \
+- I845_CURSOR_OFFSETS, \
+- I845_COLORS, \
+ GEN_DEFAULT_PAGE_SIZES, \
+ GEN_DEFAULT_REGIONS
+
+@@ -228,29 +89,21 @@ static const struct intel_device_info i845g_info = {
+ static const struct intel_device_info i85x_info = {
+ I830_FEATURES,
+ PLATFORM(INTEL_I85X),
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A),
+ };
+
+ static const struct intel_device_info i865g_info = {
+ I845_FEATURES,
+ PLATFORM(INTEL_I865G),
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A),
+ };
+
+ #define GEN3_FEATURES \
+ GEN(3), \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B), \
+- .display.has_gmch = 1, \
+ .gpu_reset_clobbers_display = true, \
+ .__runtime.platform_engine_mask = BIT(RCS0), \
+ .has_3d_pipeline = 1, \
+ .has_snoop = true, \
+ .has_coherent_ggtt = true, \
+ .dma_mask_size = 32, \
+- I9XX_PIPE_OFFSETS, \
+- I9XX_CURSOR_OFFSETS, \
+- I9XX_COLORS, \
+ GEN_DEFAULT_PAGE_SIZES, \
+ GEN_DEFAULT_REGIONS
+
+@@ -258,9 +111,6 @@ static const struct intel_device_info i915g_info = {
+ GEN3_FEATURES,
+ PLATFORM(INTEL_I915G),
+ .has_coherent_ggtt = false,
+- .display.cursor_needs_physical = 1,
+- .display.has_overlay = 1,
+- .display.overlay_needs_physical = 1,
+ .hws_needs_physical = 1,
+ .unfenced_needs_alignment = 1,
+ };
+@@ -269,11 +119,6 @@ static const struct intel_device_info i915gm_info = {
+ GEN3_FEATURES,
+ PLATFORM(INTEL_I915GM),
+ .is_mobile = 1,
+- .display.cursor_needs_physical = 1,
+- .display.has_overlay = 1,
+- .display.overlay_needs_physical = 1,
+- .display.supports_tv = 1,
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A),
+ .hws_needs_physical = 1,
+ .unfenced_needs_alignment = 1,
+ };
+@@ -281,10 +126,6 @@ static const struct intel_device_info i915gm_info = {
+ static const struct intel_device_info i945g_info = {
+ GEN3_FEATURES,
+ PLATFORM(INTEL_I945G),
+- .display.has_hotplug = 1,
+- .display.cursor_needs_physical = 1,
+- .display.has_overlay = 1,
+- .display.overlay_needs_physical = 1,
+ .hws_needs_physical = 1,
+ .unfenced_needs_alignment = 1,
+ };
+@@ -293,12 +134,6 @@ static const struct intel_device_info i945gm_info = {
+ GEN3_FEATURES,
+ PLATFORM(INTEL_I945GM),
+ .is_mobile = 1,
+- .display.has_hotplug = 1,
+- .display.cursor_needs_physical = 1,
+- .display.has_overlay = 1,
+- .display.overlay_needs_physical = 1,
+- .display.supports_tv = 1,
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A),
+ .hws_needs_physical = 1,
+ .unfenced_needs_alignment = 1,
+ };
+@@ -306,16 +141,12 @@ static const struct intel_device_info i945gm_info = {
+ static const struct intel_device_info g33_info = {
+ GEN3_FEATURES,
+ PLATFORM(INTEL_G33),
+- .display.has_hotplug = 1,
+- .display.has_overlay = 1,
+ .dma_mask_size = 36,
+ };
+
+ static const struct intel_device_info pnv_g_info = {
+ GEN3_FEATURES,
+ PLATFORM(INTEL_PINEVIEW),
+- .display.has_hotplug = 1,
+- .display.has_overlay = 1,
+ .dma_mask_size = 36,
+ };
+
+@@ -323,33 +154,23 @@ static const struct intel_device_info pnv_m_info = {
+ GEN3_FEATURES,
+ PLATFORM(INTEL_PINEVIEW),
+ .is_mobile = 1,
+- .display.has_hotplug = 1,
+- .display.has_overlay = 1,
+ .dma_mask_size = 36,
+ };
+
+ #define GEN4_FEATURES \
+ GEN(4), \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B), \
+- .display.has_hotplug = 1, \
+- .display.has_gmch = 1, \
+ .gpu_reset_clobbers_display = true, \
+ .__runtime.platform_engine_mask = BIT(RCS0), \
+ .has_3d_pipeline = 1, \
+ .has_snoop = true, \
+ .has_coherent_ggtt = true, \
+ .dma_mask_size = 36, \
+- I9XX_PIPE_OFFSETS, \
+- I9XX_CURSOR_OFFSETS, \
+- I9XX_COLORS, \
+ GEN_DEFAULT_PAGE_SIZES, \
+ GEN_DEFAULT_REGIONS
+
+ static const struct intel_device_info i965g_info = {
+ GEN4_FEATURES,
+ PLATFORM(INTEL_I965G),
+- .display.has_overlay = 1,
+ .hws_needs_physical = 1,
+ .has_snoop = false,
+ };
+@@ -358,9 +179,6 @@ static const struct intel_device_info i965gm_info = {
+ GEN4_FEATURES,
+ PLATFORM(INTEL_I965GM),
+ .is_mobile = 1,
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A),
+- .display.has_overlay = 1,
+- .display.supports_tv = 1,
+ .hws_needs_physical = 1,
+ .has_snoop = false,
+ };
+@@ -376,17 +194,12 @@ static const struct intel_device_info gm45_info = {
+ GEN4_FEATURES,
+ PLATFORM(INTEL_GM45),
+ .is_mobile = 1,
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A),
+- .display.supports_tv = 1,
+ .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0),
+ .gpu_reset_clobbers_display = false,
+ };
+
+ #define GEN5_FEATURES \
+ GEN(5), \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B), \
+- .display.has_hotplug = 1, \
+ .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0), \
+ .has_3d_pipeline = 1, \
+ .has_snoop = true, \
+@@ -394,9 +207,6 @@ static const struct intel_device_info gm45_info = {
+ /* ilk does support rc6, but we do not implement [power] contexts */ \
+ .has_rc6 = 0, \
+ .dma_mask_size = 36, \
+- I9XX_PIPE_OFFSETS, \
+- I9XX_CURSOR_OFFSETS, \
+- ILK_COLORS, \
+ GEN_DEFAULT_PAGE_SIZES, \
+ GEN_DEFAULT_REGIONS
+
+@@ -410,15 +220,10 @@ static const struct intel_device_info ilk_m_info = {
+ PLATFORM(INTEL_IRONLAKE),
+ .is_mobile = 1,
+ .has_rps = true,
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A),
+ };
+
+ #define GEN6_FEATURES \
+ GEN(6), \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B), \
+- .display.has_hotplug = 1, \
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A), \
+ .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0), \
+ .has_3d_pipeline = 1, \
+ .has_coherent_ggtt = true, \
+@@ -430,9 +235,6 @@ static const struct intel_device_info ilk_m_info = {
+ .dma_mask_size = 40, \
+ .__runtime.ppgtt_type = INTEL_PPGTT_ALIASING, \
+ .__runtime.ppgtt_size = 31, \
+- I9XX_PIPE_OFFSETS, \
+- I9XX_CURSOR_OFFSETS, \
+- ILK_COLORS, \
+ GEN_DEFAULT_PAGE_SIZES, \
+ GEN_DEFAULT_REGIONS
+
+@@ -468,10 +270,6 @@ static const struct intel_device_info snb_m_gt2_info = {
+
+ #define GEN7_FEATURES \
+ GEN(7), \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | BIT(TRANSCODER_C), \
+- .display.has_hotplug = 1, \
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A), \
+ .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0), \
+ .has_3d_pipeline = 1, \
+ .has_coherent_ggtt = true, \
+@@ -483,9 +281,6 @@ static const struct intel_device_info snb_m_gt2_info = {
+ .dma_mask_size = 40, \
+ .__runtime.ppgtt_type = INTEL_PPGTT_ALIASING, \
+ .__runtime.ppgtt_size = 31, \
+- IVB_PIPE_OFFSETS, \
+- IVB_CURSOR_OFFSETS, \
+- IVB_COLORS, \
+ GEN_DEFAULT_PAGE_SIZES, \
+ GEN_DEFAULT_REGIONS
+
+@@ -523,7 +318,6 @@ static const struct intel_device_info ivb_m_gt2_info = {
+ static const struct intel_device_info ivb_q_info = {
+ GEN7_FEATURES,
+ PLATFORM(INTEL_IVYBRIDGE),
+- NO_DISPLAY,
+ .gt = 2,
+ .has_l3_dpf = 1,
+ };
+@@ -532,24 +326,16 @@ static const struct intel_device_info vlv_info = {
+ PLATFORM(INTEL_VALLEYVIEW),
+ GEN(7),
+ .is_lp = 1,
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B),
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B),
+ .has_runtime_pm = 1,
+ .has_rc6 = 1,
+ .has_reset_engine = true,
+ .has_rps = true,
+- .display.has_gmch = 1,
+- .display.has_hotplug = 1,
+ .dma_mask_size = 40,
+ .__runtime.ppgtt_type = INTEL_PPGTT_ALIASING,
+ .__runtime.ppgtt_size = 31,
+ .has_snoop = true,
+ .has_coherent_ggtt = false,
+ .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0),
+- .display.mmio_offset = VLV_DISPLAY_BASE,
+- I9XX_PIPE_OFFSETS,
+- I9XX_CURSOR_OFFSETS,
+- I9XX_COLORS,
+ GEN_DEFAULT_PAGE_SIZES,
+ GEN_DEFAULT_REGIONS,
+ };
+@@ -557,13 +343,7 @@ static const struct intel_device_info vlv_info = {
+ #define G75_FEATURES \
+ GEN7_FEATURES, \
+ .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
+- BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP), \
+- .display.has_ddi = 1, \
+- .display.has_fpga_dbg = 1, \
+- .display.has_dp_mst = 1, \
+ .has_rc6p = 0 /* RC6p removed-by HSW */, \
+- HSW_PIPE_OFFSETS, \
+ .has_runtime_pm = 1
+
+ #define HSW_PLATFORM \
+@@ -627,9 +407,6 @@ static const struct intel_device_info bdw_gt3_info = {
+ static const struct intel_device_info chv_info = {
+ PLATFORM(INTEL_CHERRYVIEW),
+ GEN(8),
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | BIT(TRANSCODER_C),
+- .display.has_hotplug = 1,
+ .is_lp = 1,
+ .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0),
+ .has_64bit_reloc = 1,
+@@ -637,17 +414,12 @@ static const struct intel_device_info chv_info = {
+ .has_rc6 = 1,
+ .has_rps = true,
+ .has_logical_ring_contexts = 1,
+- .display.has_gmch = 1,
+ .dma_mask_size = 39,
+ .__runtime.ppgtt_type = INTEL_PPGTT_FULL,
+ .__runtime.ppgtt_size = 32,
+ .has_reset_engine = 1,
+ .has_snoop = true,
+ .has_coherent_ggtt = false,
+- .display.mmio_offset = VLV_DISPLAY_BASE,
+- CHV_PIPE_OFFSETS,
+- CHV_CURSOR_OFFSETS,
+- CHV_COLORS,
+ GEN_DEFAULT_PAGE_SIZES,
+ GEN_DEFAULT_REGIONS,
+ };
+@@ -660,14 +432,7 @@ static const struct intel_device_info chv_info = {
+ GEN8_FEATURES, \
+ GEN(9), \
+ GEN9_DEFAULT_PAGE_SIZES, \
+- .__runtime.has_dmc = 1, \
+- .has_gt_uc = 1, \
+- .__runtime.has_hdcp = 1, \
+- .display.has_ipc = 1, \
+- .display.has_psr = 1, \
+- .display.has_psr_hw_tracking = 1, \
+- .display.dbuf.size = 896 - 4, /* 4 blocks for bypass path allocation */ \
+- .display.dbuf.slice_mask = BIT(DBUF_S1)
++ .has_gt_uc = 1
+
+ #define SKL_PLATFORM \
+ GEN9_FEATURES, \
+@@ -702,26 +467,12 @@ static const struct intel_device_info skl_gt4_info = {
+ #define GEN9_LP_FEATURES \
+ GEN(9), \
+ .is_lp = 1, \
+- .display.dbuf.slice_mask = BIT(DBUF_S1), \
+- .display.has_hotplug = 1, \
+ .__runtime.platform_engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0), \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
+- BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP) | \
+- BIT(TRANSCODER_DSI_A) | BIT(TRANSCODER_DSI_C), \
+ .has_3d_pipeline = 1, \
+ .has_64bit_reloc = 1, \
+- .display.has_ddi = 1, \
+- .display.has_fpga_dbg = 1, \
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A), \
+- .__runtime.has_hdcp = 1, \
+- .display.has_psr = 1, \
+- .display.has_psr_hw_tracking = 1, \
+ .has_runtime_pm = 1, \
+- .__runtime.has_dmc = 1, \
+ .has_rc6 = 1, \
+ .has_rps = true, \
+- .display.has_dp_mst = 1, \
+ .has_logical_ring_contexts = 1, \
+ .has_gt_uc = 1, \
+ .dma_mask_size = 39, \
+@@ -730,25 +481,17 @@ static const struct intel_device_info skl_gt4_info = {
+ .has_reset_engine = 1, \
+ .has_snoop = true, \
+ .has_coherent_ggtt = false, \
+- .display.has_ipc = 1, \
+- HSW_PIPE_OFFSETS, \
+- IVB_CURSOR_OFFSETS, \
+- IVB_COLORS, \
+ GEN9_DEFAULT_PAGE_SIZES, \
+ GEN_DEFAULT_REGIONS
+
+ static const struct intel_device_info bxt_info = {
+ GEN9_LP_FEATURES,
+ PLATFORM(INTEL_BROXTON),
+- .display.dbuf.size = 512 - 4, /* 4 blocks for bypass path allocation */
+ };
+
+ static const struct intel_device_info glk_info = {
+ GEN9_LP_FEATURES,
+ PLATFORM(INTEL_GEMINILAKE),
+- .__runtime.display.ip.ver = 10,
+- .display.dbuf.size = 1024 - 4, /* 4 blocks for bypass path allocation */
+- GLK_COLORS,
+ };
+
+ #define KBL_PLATFORM \
+@@ -815,31 +558,7 @@ static const struct intel_device_info cml_gt2_info = {
+ #define GEN11_FEATURES \
+ GEN9_FEATURES, \
+ GEN11_DEFAULT_PAGE_SIZES, \
+- .display.abox_mask = BIT(0), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
+- BIT(TRANSCODER_C) | BIT(TRANSCODER_EDP) | \
+- BIT(TRANSCODER_DSI_0) | BIT(TRANSCODER_DSI_1), \
+- .display.pipe_offsets = { \
+- [TRANSCODER_A] = PIPE_A_OFFSET, \
+- [TRANSCODER_B] = PIPE_B_OFFSET, \
+- [TRANSCODER_C] = PIPE_C_OFFSET, \
+- [TRANSCODER_EDP] = PIPE_EDP_OFFSET, \
+- [TRANSCODER_DSI_0] = PIPE_DSI0_OFFSET, \
+- [TRANSCODER_DSI_1] = PIPE_DSI1_OFFSET, \
+- }, \
+- .display.trans_offsets = { \
+- [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
+- [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
+- [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
+- [TRANSCODER_EDP] = TRANSCODER_EDP_OFFSET, \
+- [TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \
+- [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
+- }, \
+ GEN(11), \
+- ICL_COLORS, \
+- .display.dbuf.size = 2048, \
+- .display.dbuf.slice_mask = BIT(DBUF_S1) | BIT(DBUF_S2), \
+- .__runtime.has_dsc = 1, \
+ .has_coherent_ggtt = false, \
+ .has_logical_ring_elsq = 1
+
+@@ -867,31 +586,8 @@ static const struct intel_device_info jsl_info = {
+ #define GEN12_FEATURES \
+ GEN11_FEATURES, \
+ GEN(12), \
+- .display.abox_mask = GENMASK(2, 1), \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D), \
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
+- BIT(TRANSCODER_C) | BIT(TRANSCODER_D) | \
+- BIT(TRANSCODER_DSI_0) | BIT(TRANSCODER_DSI_1), \
+- .display.pipe_offsets = { \
+- [TRANSCODER_A] = PIPE_A_OFFSET, \
+- [TRANSCODER_B] = PIPE_B_OFFSET, \
+- [TRANSCODER_C] = PIPE_C_OFFSET, \
+- [TRANSCODER_D] = PIPE_D_OFFSET, \
+- [TRANSCODER_DSI_0] = PIPE_DSI0_OFFSET, \
+- [TRANSCODER_DSI_1] = PIPE_DSI1_OFFSET, \
+- }, \
+- .display.trans_offsets = { \
+- [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
+- [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
+- [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
+- [TRANSCODER_D] = TRANSCODER_D_OFFSET, \
+- [TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \
+- [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
+- }, \
+- TGL_CURSOR_OFFSETS, \
+ .has_global_mocs = 1, \
+- .has_pxp = 1, \
+- .display.has_dsb = 1
++ .has_pxp = 1
+
+ static const struct intel_device_info tgl_info = {
+ GEN12_FEATURES,
+@@ -903,12 +599,6 @@ static const struct intel_device_info tgl_info = {
+ static const struct intel_device_info rkl_info = {
+ GEN12_FEATURES,
+ PLATFORM(INTEL_ROCKETLAKE),
+- .display.abox_mask = BIT(0),
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
+- BIT(TRANSCODER_C),
+- .display.has_hti = 1,
+- .display.has_psr_hw_tracking = 0,
+ .__runtime.platform_engine_mask =
+ BIT(RCS0) | BIT(BCS0) | BIT(VECS0) | BIT(VCS0),
+ };
+@@ -926,7 +616,6 @@ static const struct intel_device_info dg1_info = {
+ DGFX_FEATURES,
+ .__runtime.graphics.ip.rel = 10,
+ PLATFORM(INTEL_DG1),
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D),
+ .require_force_probe = 1,
+ .__runtime.platform_engine_mask =
+ BIT(RCS0) | BIT(BCS0) | BIT(VECS0) |
+@@ -938,64 +627,14 @@ static const struct intel_device_info dg1_info = {
+ static const struct intel_device_info adl_s_info = {
+ GEN12_FEATURES,
+ PLATFORM(INTEL_ALDERLAKE_S),
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D),
+- .display.has_hti = 1,
+- .display.has_psr_hw_tracking = 0,
+ .__runtime.platform_engine_mask =
+ BIT(RCS0) | BIT(BCS0) | BIT(VECS0) | BIT(VCS0) | BIT(VCS2),
+ .dma_mask_size = 39,
+ };
+
+-#define XE_LPD_FEATURES \
+- .display.abox_mask = GENMASK(1, 0), \
+- .display.color = { \
+- .degamma_lut_size = 129, .gamma_lut_size = 1024, \
+- .degamma_lut_tests = DRM_COLOR_LUT_NON_DECREASING | \
+- DRM_COLOR_LUT_EQUAL_CHANNELS, \
+- }, \
+- .display.dbuf.size = 4096, \
+- .display.dbuf.slice_mask = BIT(DBUF_S1) | BIT(DBUF_S2) | BIT(DBUF_S3) | \
+- BIT(DBUF_S4), \
+- .display.has_ddi = 1, \
+- .__runtime.has_dmc = 1, \
+- .display.has_dp_mst = 1, \
+- .display.has_dsb = 1, \
+- .__runtime.has_dsc = 1, \
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A), \
+- .display.has_fpga_dbg = 1, \
+- .__runtime.has_hdcp = 1, \
+- .display.has_hotplug = 1, \
+- .display.has_ipc = 1, \
+- .display.has_psr = 1, \
+- .__runtime.display.ip.ver = 13, \
+- .__runtime.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D), \
+- .display.pipe_offsets = { \
+- [TRANSCODER_A] = PIPE_A_OFFSET, \
+- [TRANSCODER_B] = PIPE_B_OFFSET, \
+- [TRANSCODER_C] = PIPE_C_OFFSET, \
+- [TRANSCODER_D] = PIPE_D_OFFSET, \
+- [TRANSCODER_DSI_0] = PIPE_DSI0_OFFSET, \
+- [TRANSCODER_DSI_1] = PIPE_DSI1_OFFSET, \
+- }, \
+- .display.trans_offsets = { \
+- [TRANSCODER_A] = TRANSCODER_A_OFFSET, \
+- [TRANSCODER_B] = TRANSCODER_B_OFFSET, \
+- [TRANSCODER_C] = TRANSCODER_C_OFFSET, \
+- [TRANSCODER_D] = TRANSCODER_D_OFFSET, \
+- [TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \
+- [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
+- }, \
+- TGL_CURSOR_OFFSETS
+-
+ static const struct intel_device_info adl_p_info = {
+ GEN12_FEATURES,
+- XE_LPD_FEATURES,
+ PLATFORM(INTEL_ALDERLAKE_P),
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
+- BIT(TRANSCODER_C) | BIT(TRANSCODER_D) |
+- BIT(TRANSCODER_DSI_0) | BIT(TRANSCODER_DSI_1),
+- .display.has_cdclk_crawl = 1,
+- .display.has_psr_hw_tracking = 0,
+ .__runtime.platform_engine_mask =
+ BIT(RCS0) | BIT(BCS0) | BIT(VECS0) | BIT(VCS0) | BIT(VCS2),
+ .__runtime.ppgtt_size = 48,
+@@ -1044,7 +683,6 @@ static const struct intel_device_info xehpsdv_info = {
+ XE_HPM_FEATURES,
+ DGFX_FEATURES,
+ PLATFORM(INTEL_XEHPSDV),
+- NO_DISPLAY,
+ .has_64k_pages = 1,
+ .has_media_ratio_mode = 1,
+ .__runtime.platform_engine_mask =
+@@ -1067,7 +705,6 @@ static const struct intel_device_info xehpsdv_info = {
+ .has_guc_deprivilege = 1, \
+ .has_heci_pxp = 1, \
+ .has_media_ratio_mode = 1, \
+- .display.has_cdclk_squash = 1, \
+ .__runtime.platform_engine_mask = \
+ BIT(RCS0) | BIT(BCS0) | \
+ BIT(VECS0) | BIT(VECS1) | \
+@@ -1076,14 +713,10 @@ static const struct intel_device_info xehpsdv_info = {
+
+ static const struct intel_device_info dg2_info = {
+ DG2_FEATURES,
+- XE_LPD_FEATURES,
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
+- BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
+ };
+
+ static const struct intel_device_info ats_m_info = {
+ DG2_FEATURES,
+- NO_DISPLAY,
+ .require_force_probe = 1,
+ .tuning_thread_rr_after_dep = 1,
+ };
+@@ -1105,7 +738,6 @@ static const struct intel_device_info pvc_info = {
+ .__runtime.graphics.ip.rel = 60,
+ .__runtime.media.ip.rel = 60,
+ PLATFORM(INTEL_PONTEVECCHIO),
+- NO_DISPLAY,
+ .has_flat_ccs = 0,
+ .__runtime.platform_engine_mask =
+ BIT(BCS0) |
+@@ -1114,13 +746,6 @@ static const struct intel_device_info pvc_info = {
+ .require_force_probe = 1,
+ };
+
+-#define XE_LPDP_FEATURES \
+- XE_LPD_FEATURES, \
+- .__runtime.display.ip.ver = 14, \
+- .display.has_cdclk_crawl = 1, \
+- .display.has_cdclk_squash = 1, \
+- .__runtime.fbc_mask = BIT(INTEL_FBC_A) | BIT(INTEL_FBC_B)
+-
+ static const struct intel_gt_definition xelpmp_extra_gt[] = {
+ {
+ .type = GT_MEDIA,
+@@ -1133,9 +758,6 @@ static const struct intel_gt_definition xelpmp_extra_gt[] = {
+
+ static const struct intel_device_info mtl_info = {
+ XE_HP_FEATURES,
+- XE_LPDP_FEATURES,
+- .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
+- BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
+ /*
+ * Real graphics IP version will be obtained from hardware GMD_ID
+ * register. Value provided here is just for sanity checking.
+diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
+index c4197e31962e1..d35c89f9da778 100644
+--- a/drivers/gpu/drm/i915/i915_reg.h
++++ b/drivers/gpu/drm/i915/i915_reg.h
+@@ -1961,15 +1961,6 @@
+ #define _TRANS_VSYNC_DSI1 0x6b814
+ #define _TRANS_VSYNCSHIFT_DSI1 0x6b828
+
+-#define TRANSCODER_A_OFFSET 0x60000
+-#define TRANSCODER_B_OFFSET 0x61000
+-#define TRANSCODER_C_OFFSET 0x62000
+-#define CHV_TRANSCODER_C_OFFSET 0x63000
+-#define TRANSCODER_D_OFFSET 0x63000
+-#define TRANSCODER_EDP_OFFSET 0x6f000
+-#define TRANSCODER_DSI0_OFFSET 0x6b000
+-#define TRANSCODER_DSI1_OFFSET 0x6b800
+-
+ #define TRANS_HTOTAL(trans) _MMIO_TRANS2((trans), _TRANS_HTOTAL_A)
+ #define TRANS_HBLANK(trans) _MMIO_TRANS2((trans), _TRANS_HBLANK_A)
+ #define TRANS_HSYNC(trans) _MMIO_TRANS2((trans), _TRANS_HSYNC_A)
+@@ -2619,23 +2610,6 @@
+ #define PIPESTAT_INT_ENABLE_MASK 0x7fff0000
+ #define PIPESTAT_INT_STATUS_MASK 0x0000ffff
+
+-#define PIPE_A_OFFSET 0x70000
+-#define PIPE_B_OFFSET 0x71000
+-#define PIPE_C_OFFSET 0x72000
+-#define PIPE_D_OFFSET 0x73000
+-#define CHV_PIPE_C_OFFSET 0x74000
+-/*
+- * There's actually no pipe EDP. Some pipe registers have
+- * simply shifted from the pipe to the transcoder, while
+- * keeping their original offset. Thus we need PIPE_EDP_OFFSET
+- * to access such registers in transcoder EDP.
+- */
+-#define PIPE_EDP_OFFSET 0x7f000
+-
+-/* ICL DSI 0 and 1 */
+-#define PIPE_DSI0_OFFSET 0x7b000
+-#define PIPE_DSI1_OFFSET 0x7b800
+-
+ #define TRANSCONF(trans) _MMIO_PIPE2((trans), _TRANSACONF)
+ #define PIPEDSL(pipe) _MMIO_PIPE2(pipe, _PIPEADSL)
+ #define PIPEFRAME(pipe) _MMIO_PIPE2(pipe, _PIPEAFRAMEHIGH)
+@@ -3091,13 +3065,6 @@
+ #define CUR_CHICKEN(pipe) _MMIO_CURSOR2(pipe, _CUR_CHICKEN_A)
+ #define CURSURFLIVE(pipe) _MMIO_CURSOR2(pipe, _CURASURFLIVE)
+
+-#define CURSOR_A_OFFSET 0x70080
+-#define CURSOR_B_OFFSET 0x700c0
+-#define CHV_CURSOR_C_OFFSET 0x700e0
+-#define IVB_CURSOR_B_OFFSET 0x71080
+-#define IVB_CURSOR_C_OFFSET 0x72080
+-#define TGL_CURSOR_D_OFFSET 0x73080
+-
+ /* Display A control */
+ #define _DSPAADDR_VLV 0x7017C /* vlv/chv */
+ #define _DSPACNTR 0x70180
+diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
+index fc5cd14adfccb..79523e55ca9c4 100644
+--- a/drivers/gpu/drm/i915/intel_device_info.c
++++ b/drivers/gpu/drm/i915/intel_device_info.c
+@@ -95,6 +95,9 @@ void intel_device_info_print(const struct intel_device_info *info,
+ const struct intel_runtime_info *runtime,
+ struct drm_printer *p)
+ {
++ const struct intel_display_runtime_info *display_runtime =
++ &info->display->__runtime_defaults;
++
+ if (runtime->graphics.ip.rel)
+ drm_printf(p, "graphics version: %u.%02u\n",
+ runtime->graphics.ip.ver,
+@@ -111,13 +114,13 @@ void intel_device_info_print(const struct intel_device_info *info,
+ drm_printf(p, "media version: %u\n",
+ runtime->media.ip.ver);
+
+- if (runtime->display.ip.rel)
++ if (display_runtime->ip.rel)
+ drm_printf(p, "display version: %u.%02u\n",
+- runtime->display.ip.ver,
+- runtime->display.ip.rel);
++ display_runtime->ip.ver,
++ display_runtime->ip.rel);
+ else
+ drm_printf(p, "display version: %u\n",
+- runtime->display.ip.ver);
++ display_runtime->ip.ver);
+
+ drm_printf(p, "graphics stepping: %s\n", intel_step_name(runtime->step.graphics_step));
+ drm_printf(p, "media stepping: %s\n", intel_step_name(runtime->step.media_step));
+@@ -138,13 +141,13 @@ void intel_device_info_print(const struct intel_device_info *info,
+
+ drm_printf(p, "has_pooled_eu: %s\n", str_yes_no(runtime->has_pooled_eu));
+
+-#define PRINT_FLAG(name) drm_printf(p, "%s: %s\n", #name, str_yes_no(info->display.name))
++#define PRINT_FLAG(name) drm_printf(p, "%s: %s\n", #name, str_yes_no(info->display->name))
+ DEV_INFO_DISPLAY_FOR_EACH_FLAG(PRINT_FLAG);
+ #undef PRINT_FLAG
+
+- drm_printf(p, "has_hdcp: %s\n", str_yes_no(runtime->has_hdcp));
+- drm_printf(p, "has_dmc: %s\n", str_yes_no(runtime->has_dmc));
+- drm_printf(p, "has_dsc: %s\n", str_yes_no(runtime->has_dsc));
++ drm_printf(p, "has_hdcp: %s\n", str_yes_no(display_runtime->has_hdcp));
++ drm_printf(p, "has_dmc: %s\n", str_yes_no(display_runtime->has_dmc));
++ drm_printf(p, "has_dsc: %s\n", str_yes_no(display_runtime->has_dsc));
+
+ drm_printf(p, "rawclk rate: %u kHz\n", runtime->rawclk_freq);
+ }
+@@ -342,6 +345,7 @@ static void ip_ver_read(struct drm_i915_private *i915, u32 offset, struct intel_
+ static void intel_ipver_early_init(struct drm_i915_private *i915)
+ {
+ struct intel_runtime_info *runtime = RUNTIME_INFO(i915);
++ struct intel_display_runtime_info *display_runtime = DISPLAY_RUNTIME_INFO(i915);
+
+ if (!HAS_GMD_ID(i915)) {
+ drm_WARN_ON(&i915->drm, RUNTIME_INFO(i915)->graphics.ip.ver > 12);
+@@ -363,7 +367,7 @@ static void intel_ipver_early_init(struct drm_i915_private *i915)
+ RUNTIME_INFO(i915)->graphics.ip.rel = 70;
+ }
+ ip_ver_read(i915, i915_mmio_reg_offset(GMD_ID_DISPLAY),
+- &runtime->display.ip);
++ (struct intel_ip_version *)&display_runtime->ip);
+ ip_ver_read(i915, i915_mmio_reg_offset(GMD_ID_MEDIA),
+ &runtime->media.ip);
+ }
+@@ -381,6 +385,15 @@ void intel_device_info_runtime_init_early(struct drm_i915_private *i915)
+ intel_device_info_subplatform_init(i915);
+ }
+
++/* FIXME: Remove this, and make device info a const pointer to rodata. */
++static struct intel_device_info *
++mkwrite_device_info(struct drm_i915_private *i915)
++{
++ return (struct intel_device_info *)INTEL_INFO(i915);
++}
++
++static const struct intel_display_device_info no_display = {};
++
+ /**
+ * intel_device_info_runtime_init - initialize runtime info
+ * @dev_priv: the i915 device
+@@ -401,32 +414,34 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
+ {
+ struct intel_device_info *info = mkwrite_device_info(dev_priv);
+ struct intel_runtime_info *runtime = RUNTIME_INFO(dev_priv);
++ struct intel_display_runtime_info *display_runtime =
++ DISPLAY_RUNTIME_INFO(dev_priv);
+ enum pipe pipe;
+
+ /* Wa_14011765242: adl-s A0,A1 */
+ if (IS_ADLS_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A2))
+ for_each_pipe(dev_priv, pipe)
+- runtime->num_scalers[pipe] = 0;
++ display_runtime->num_scalers[pipe] = 0;
+ else if (DISPLAY_VER(dev_priv) >= 11) {
+ for_each_pipe(dev_priv, pipe)
+- runtime->num_scalers[pipe] = 2;
++ display_runtime->num_scalers[pipe] = 2;
+ } else if (DISPLAY_VER(dev_priv) >= 9) {
+- runtime->num_scalers[PIPE_A] = 2;
+- runtime->num_scalers[PIPE_B] = 2;
+- runtime->num_scalers[PIPE_C] = 1;
++ display_runtime->num_scalers[PIPE_A] = 2;
++ display_runtime->num_scalers[PIPE_B] = 2;
++ display_runtime->num_scalers[PIPE_C] = 1;
+ }
+
+ BUILD_BUG_ON(BITS_PER_TYPE(intel_engine_mask_t) < I915_NUM_ENGINES);
+
+ if (DISPLAY_VER(dev_priv) >= 13 || HAS_D12_PLANE_MINIMIZATION(dev_priv))
+ for_each_pipe(dev_priv, pipe)
+- runtime->num_sprites[pipe] = 4;
++ display_runtime->num_sprites[pipe] = 4;
+ else if (DISPLAY_VER(dev_priv) >= 11)
+ for_each_pipe(dev_priv, pipe)
+- runtime->num_sprites[pipe] = 6;
++ display_runtime->num_sprites[pipe] = 6;
+ else if (DISPLAY_VER(dev_priv) == 10)
+ for_each_pipe(dev_priv, pipe)
+- runtime->num_sprites[pipe] = 3;
++ display_runtime->num_sprites[pipe] = 3;
+ else if (IS_BROXTON(dev_priv)) {
+ /*
+ * Skylake and Broxton currently don't expose the topmost plane as its
+@@ -437,15 +452,15 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
+ * down the line.
+ */
+
+- runtime->num_sprites[PIPE_A] = 2;
+- runtime->num_sprites[PIPE_B] = 2;
+- runtime->num_sprites[PIPE_C] = 1;
++ display_runtime->num_sprites[PIPE_A] = 2;
++ display_runtime->num_sprites[PIPE_B] = 2;
++ display_runtime->num_sprites[PIPE_C] = 1;
+ } else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
+ for_each_pipe(dev_priv, pipe)
+- runtime->num_sprites[pipe] = 2;
++ display_runtime->num_sprites[pipe] = 2;
+ } else if (DISPLAY_VER(dev_priv) >= 5 || IS_G4X(dev_priv)) {
+ for_each_pipe(dev_priv, pipe)
+- runtime->num_sprites[pipe] = 1;
++ display_runtime->num_sprites[pipe] = 1;
+ }
+
+ if (HAS_DISPLAY(dev_priv) &&
+@@ -453,7 +468,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
+ !(intel_de_read(dev_priv, GU_CNTL_PROTECTED) & DEPRESENT)) {
+ drm_info(&dev_priv->drm, "Display not present, disabling\n");
+
+- runtime->pipe_mask = 0;
++ display_runtime->pipe_mask = 0;
+ }
+
+ if (HAS_DISPLAY(dev_priv) && IS_GRAPHICS_VER(dev_priv, 7, 8) &&
+@@ -476,47 +491,47 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
+ !(sfuse_strap & SFUSE_STRAP_FUSE_LOCK))) {
+ drm_info(&dev_priv->drm,
+ "Display fused off, disabling\n");
+- runtime->pipe_mask = 0;
++ display_runtime->pipe_mask = 0;
+ } else if (fuse_strap & IVB_PIPE_C_DISABLE) {
+ drm_info(&dev_priv->drm, "PipeC fused off\n");
+- runtime->pipe_mask &= ~BIT(PIPE_C);
+- runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_C);
++ display_runtime->pipe_mask &= ~BIT(PIPE_C);
++ display_runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_C);
+ }
+ } else if (HAS_DISPLAY(dev_priv) && DISPLAY_VER(dev_priv) >= 9) {
+ u32 dfsm = intel_de_read(dev_priv, SKL_DFSM);
+
+ if (dfsm & SKL_DFSM_PIPE_A_DISABLE) {
+- runtime->pipe_mask &= ~BIT(PIPE_A);
+- runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_A);
+- runtime->fbc_mask &= ~BIT(INTEL_FBC_A);
++ display_runtime->pipe_mask &= ~BIT(PIPE_A);
++ display_runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_A);
++ display_runtime->fbc_mask &= ~BIT(INTEL_FBC_A);
+ }
+ if (dfsm & SKL_DFSM_PIPE_B_DISABLE) {
+- runtime->pipe_mask &= ~BIT(PIPE_B);
+- runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_B);
++ display_runtime->pipe_mask &= ~BIT(PIPE_B);
++ display_runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_B);
+ }
+ if (dfsm & SKL_DFSM_PIPE_C_DISABLE) {
+- runtime->pipe_mask &= ~BIT(PIPE_C);
+- runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_C);
++ display_runtime->pipe_mask &= ~BIT(PIPE_C);
++ display_runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_C);
+ }
+
+ if (DISPLAY_VER(dev_priv) >= 12 &&
+ (dfsm & TGL_DFSM_PIPE_D_DISABLE)) {
+- runtime->pipe_mask &= ~BIT(PIPE_D);
+- runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_D);
++ display_runtime->pipe_mask &= ~BIT(PIPE_D);
++ display_runtime->cpu_transcoder_mask &= ~BIT(TRANSCODER_D);
+ }
+
+ if (dfsm & SKL_DFSM_DISPLAY_HDCP_DISABLE)
+- runtime->has_hdcp = 0;
++ display_runtime->has_hdcp = 0;
+
+ if (dfsm & SKL_DFSM_DISPLAY_PM_DISABLE)
+- runtime->fbc_mask = 0;
++ display_runtime->fbc_mask = 0;
+
+ if (DISPLAY_VER(dev_priv) >= 11 && (dfsm & ICL_DFSM_DMC_DISABLE))
+- runtime->has_dmc = 0;
++ display_runtime->has_dmc = 0;
+
+ if (IS_DISPLAY_VER(dev_priv, 10, 12) &&
+ (dfsm & GLK_DFSM_DISPLAY_DSC_DISABLE))
+- runtime->has_dsc = 0;
++ display_runtime->has_dsc = 0;
+ }
+
+ if (GRAPHICS_VER(dev_priv) == 6 && i915_vtd_active(dev_priv)) {
+@@ -531,15 +546,15 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
+ if (!HAS_DISPLAY(dev_priv)) {
+ dev_priv->drm.driver_features &= ~(DRIVER_MODESET |
+ DRIVER_ATOMIC);
+- memset(&info->display, 0, sizeof(info->display));
+-
+- runtime->cpu_transcoder_mask = 0;
+- memset(runtime->num_sprites, 0, sizeof(runtime->num_sprites));
+- memset(runtime->num_scalers, 0, sizeof(runtime->num_scalers));
+- runtime->fbc_mask = 0;
+- runtime->has_hdcp = false;
+- runtime->has_dmc = false;
+- runtime->has_dsc = false;
++ info->display = &no_display;
++
++ display_runtime->cpu_transcoder_mask = 0;
++ memset(display_runtime->num_sprites, 0, sizeof(display_runtime->num_sprites));
++ memset(display_runtime->num_scalers, 0, sizeof(display_runtime->num_scalers));
++ display_runtime->fbc_mask = 0;
++ display_runtime->has_hdcp = false;
++ display_runtime->has_dmc = false;
++ display_runtime->has_dsc = false;
+ }
+
+ /* Disable nuclear pageflip by default on pre-g4x */
+@@ -548,6 +563,35 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
+ dev_priv->drm.driver_features &= ~DRIVER_ATOMIC;
+ }
+
++/*
++ * Set up device info and initial runtime info at driver create.
++ *
++ * Note: i915 is only an allocated blob of memory at this point.
++ */
++void intel_device_info_driver_create(struct drm_i915_private *i915,
++ u16 device_id,
++ const struct intel_device_info *match_info)
++{
++ struct intel_device_info *info;
++ struct intel_runtime_info *runtime;
++
++ /* Setup the write-once "constant" device info */
++ info = mkwrite_device_info(i915);
++ memcpy(info, match_info, sizeof(*info));
++
++ /* Initialize initial runtime info from static const data and pdev. */
++ runtime = RUNTIME_INFO(i915);
++ memcpy(runtime, &INTEL_INFO(i915)->__runtime, sizeof(*runtime));
++
++ /* Probe display support */
++ info->display = intel_display_device_probe(device_id);
++ memcpy(DISPLAY_RUNTIME_INFO(i915),
++ &DISPLAY_INFO(i915)->__runtime_defaults,
++ sizeof(*DISPLAY_RUNTIME_INFO(i915)));
++
++ runtime->device_id = device_id;
++}
++
+ void intel_driver_caps_print(const struct intel_driver_caps *caps,
+ struct drm_printer *p)
+ {
+diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
+index 080a4557899b6..faf6cccdb343d 100644
+--- a/drivers/gpu/drm/i915/intel_device_info.h
++++ b/drivers/gpu/drm/i915/intel_device_info.h
+@@ -29,7 +29,7 @@
+
+ #include "intel_step.h"
+
+-#include "display/intel_display_limits.h"
++#include "display/intel_display_device.h"
+
+ #include "gt/intel_engine_types.h"
+ #include "gt/intel_context_types.h"
+@@ -180,25 +180,6 @@ enum intel_ppgtt_type {
+ func(unfenced_needs_alignment); \
+ func(hws_needs_physical);
+
+-#define DEV_INFO_DISPLAY_FOR_EACH_FLAG(func) \
+- /* Keep in alphabetical order */ \
+- func(cursor_needs_physical); \
+- func(has_cdclk_crawl); \
+- func(has_cdclk_squash); \
+- func(has_ddi); \
+- func(has_dp_mst); \
+- func(has_dsb); \
+- func(has_fpga_dbg); \
+- func(has_gmch); \
+- func(has_hotplug); \
+- func(has_hti); \
+- func(has_ipc); \
+- func(has_overlay); \
+- func(has_psr); \
+- func(has_psr_hw_tracking); \
+- func(overlay_needs_physical); \
+- func(supports_tv);
+-
+ struct intel_ip_version {
+ u8 ver;
+ u8 rel;
+@@ -216,9 +197,6 @@ struct intel_runtime_info {
+ struct {
+ struct intel_ip_version ip;
+ } media;
+- struct {
+- struct intel_ip_version ip;
+- } display;
+
+ /*
+ * Platform mask is used for optimizing or-ed IS_PLATFORM calls into
+@@ -246,21 +224,6 @@ struct intel_runtime_info {
+ u32 memory_regions; /* regions supported by the HW */
+
+ bool has_pooled_eu;
+-
+- /* display */
+- struct {
+- u8 pipe_mask;
+- u8 cpu_transcoder_mask;
+-
+- u8 num_sprites[I915_MAX_PIPES];
+- u8 num_scalers[I915_MAX_PIPES];
+-
+- u8 fbc_mask;
+-
+- bool has_hdcp;
+- bool has_dmc;
+- bool has_dsc;
+- };
+ };
+
+ struct intel_device_info {
+@@ -276,33 +239,7 @@ struct intel_device_info {
+ DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG);
+ #undef DEFINE_FLAG
+
+- struct {
+- u8 abox_mask;
+-
+- struct {
+- u16 size; /* in blocks */
+- u8 slice_mask;
+- } dbuf;
+-
+-#define DEFINE_FLAG(name) u8 name:1
+- DEV_INFO_DISPLAY_FOR_EACH_FLAG(DEFINE_FLAG);
+-#undef DEFINE_FLAG
+-
+- /* Global register offset for the display engine */
+- u32 mmio_offset;
+-
+- /* Register offsets for the various display pipes and transcoders */
+- u32 pipe_offsets[I915_MAX_TRANSCODERS];
+- u32 trans_offsets[I915_MAX_TRANSCODERS];
+- u32 cursor_offsets[I915_MAX_PIPES];
+-
+- struct {
+- u32 degamma_lut_size;
+- u32 gamma_lut_size;
+- u32 degamma_lut_tests;
+- u32 gamma_lut_tests;
+- } color;
+- } display;
++ const struct intel_display_device_info *display;
+
+ /*
+ * Initial runtime info. Do not access outside of i915_driver_create().
+@@ -317,6 +254,8 @@ struct intel_driver_caps {
+
+ const char *intel_platform_name(enum intel_platform platform);
+
++void intel_device_info_driver_create(struct drm_i915_private *i915, u16 device_id,
++ const struct intel_device_info *match_info);
+ void intel_device_info_runtime_init_early(struct drm_i915_private *dev_priv);
+ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv);
+
+diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
+index 84a6fe736a3b5..8a9ff6227e536 100644
+--- a/drivers/gpu/drm/i915/intel_step.c
++++ b/drivers/gpu/drm/i915/intel_step.c
+@@ -166,8 +166,12 @@ void intel_step_init(struct drm_i915_private *i915)
+ &RUNTIME_INFO(i915)->graphics.ip);
+ step.media_step = gmd_to_intel_step(i915,
+ &RUNTIME_INFO(i915)->media.ip);
+- step.display_step = gmd_to_intel_step(i915,
+- &RUNTIME_INFO(i915)->display.ip);
++ step.display_step = STEP_A0 + DISPLAY_RUNTIME_INFO(i915)->ip.step;
++ if (step.display_step >= STEP_FUTURE) {
++ drm_dbg(&i915->drm, "Using future display steppings\n");
++ step.display_step = STEP_FUTURE;
++ }
++
+ RUNTIME_INFO(i915)->step = step;
+
+ return;
+diff --git a/drivers/gpu/drm/imx/lcdc/imx-lcdc.c b/drivers/gpu/drm/imx/lcdc/imx-lcdc.c
+index 8e6d457917daf..277ead6a459a4 100644
+--- a/drivers/gpu/drm/imx/lcdc/imx-lcdc.c
++++ b/drivers/gpu/drm/imx/lcdc/imx-lcdc.c
+@@ -400,8 +400,8 @@ static int imx_lcdc_probe(struct platform_device *pdev)
+
+ lcdc = devm_drm_dev_alloc(dev, &imx_lcdc_drm_driver,
+ struct imx_lcdc, drm);
+- if (!lcdc)
+- return -ENOMEM;
++ if (IS_ERR(lcdc))
++ return PTR_ERR(lcdc);
+
+ drm = &lcdc->drm;
+
+diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+index 1e8d2982d603c..a99310b687932 100644
+--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+@@ -1743,6 +1743,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
+ {
+ struct msm_drm_private *priv = dev->dev_private;
+ struct platform_device *pdev = priv->gpu_pdev;
++ struct adreno_platform_config *config = pdev->dev.platform_data;
+ struct a5xx_gpu *a5xx_gpu = NULL;
+ struct adreno_gpu *adreno_gpu;
+ struct msm_gpu *gpu;
+@@ -1769,7 +1770,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
+
+ nr_rings = 4;
+
+- if (adreno_is_a510(adreno_gpu))
++ if (adreno_cmp_rev(ADRENO_REV(5, 1, 0, ANY_ID), config->rev))
+ nr_rings = 1;
+
+ ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, nr_rings);
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+index 52da3795b175d..411b7a5fa2f32 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+@@ -1744,7 +1744,8 @@ a6xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
+ * This allows GPU to set the bus attributes required to use system
+ * cache on behalf of the iommu page table walker.
+ */
+- if (!IS_ERR_OR_NULL(a6xx_gpu->htw_llc_slice))
++ if (!IS_ERR_OR_NULL(a6xx_gpu->htw_llc_slice) &&
++ !device_iommu_capable(&pdev->dev, IOMMU_CAP_CACHE_COHERENCY))
+ quirks |= IO_PGTABLE_QUIRK_ARM_OUTER_WBWA;
+
+ return adreno_iommu_create_address_space(gpu, pdev, quirks);
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+index bdcd554fc8a80..ff9ccf72a4bf9 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+@@ -39,8 +39,8 @@ static const struct dpu_mdp_cfg msm8998_mdp[] = {
+ .clk_ctrls[DPU_CLK_CTRL_DMA1] = { .reg_off = 0x2b4, .bit_off = 8 },
+ .clk_ctrls[DPU_CLK_CTRL_DMA2] = { .reg_off = 0x2c4, .bit_off = 8 },
+ .clk_ctrls[DPU_CLK_CTRL_DMA3] = { .reg_off = 0x2c4, .bit_off = 12 },
+- .clk_ctrls[DPU_CLK_CTRL_CURSOR0] = { .reg_off = 0x3a8, .bit_off = 15 },
+- .clk_ctrls[DPU_CLK_CTRL_CURSOR1] = { .reg_off = 0x3b0, .bit_off = 15 },
++ .clk_ctrls[DPU_CLK_CTRL_CURSOR0] = { .reg_off = 0x3a8, .bit_off = 16 },
++ .clk_ctrls[DPU_CLK_CTRL_CURSOR1] = { .reg_off = 0x3b0, .bit_off = 16 },
+ },
+ };
+
+@@ -112,16 +112,16 @@ static const struct dpu_lm_cfg msm8998_lm[] = {
+ };
+
+ static const struct dpu_pingpong_cfg msm8998_pp[] = {
+- PP_BLK_TE("pingpong_0", PINGPONG_0, 0x70000, 0, sdm845_pp_sblk_te,
++ PP_BLK("pingpong_0", PINGPONG_0, 0x70000, PINGPONG_SDM845_TE2_MASK, 0, sdm845_pp_sblk_te,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12)),
+- PP_BLK_TE("pingpong_1", PINGPONG_1, 0x70800, 0, sdm845_pp_sblk_te,
++ PP_BLK("pingpong_1", PINGPONG_1, 0x70800, PINGPONG_SDM845_TE2_MASK, 0, sdm845_pp_sblk_te,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 13)),
+- PP_BLK("pingpong_2", PINGPONG_2, 0x71000, 0, sdm845_pp_sblk,
++ PP_BLK("pingpong_2", PINGPONG_2, 0x71000, PINGPONG_SDM845_MASK, 0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 10),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 14)),
+- PP_BLK("pingpong_3", PINGPONG_3, 0x71800, 0, sdm845_pp_sblk,
++ PP_BLK("pingpong_3", PINGPONG_3, 0x71800, PINGPONG_SDM845_MASK, 0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 11),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 15)),
+ };
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
+index ceca741e93c9b..5b9b3b99f1b5f 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
+@@ -110,16 +110,16 @@ static const struct dpu_lm_cfg sdm845_lm[] = {
+ };
+
+ static const struct dpu_pingpong_cfg sdm845_pp[] = {
+- PP_BLK_TE("pingpong_0", PINGPONG_0, 0x70000, 0, sdm845_pp_sblk_te,
++ PP_BLK("pingpong_0", PINGPONG_0, 0x70000, PINGPONG_SDM845_TE2_MASK, 0, sdm845_pp_sblk_te,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12)),
+- PP_BLK_TE("pingpong_1", PINGPONG_1, 0x70800, 0, sdm845_pp_sblk_te,
++ PP_BLK("pingpong_1", PINGPONG_1, 0x70800, PINGPONG_SDM845_TE2_MASK, 0, sdm845_pp_sblk_te,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 13)),
+- PP_BLK("pingpong_2", PINGPONG_2, 0x71000, 0, sdm845_pp_sblk,
++ PP_BLK("pingpong_2", PINGPONG_2, 0x71000, PINGPONG_SDM845_MASK, 0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 10),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 14)),
+- PP_BLK("pingpong_3", PINGPONG_3, 0x71800, 0, sdm845_pp_sblk,
++ PP_BLK("pingpong_3", PINGPONG_3, 0x71800, PINGPONG_SDM845_MASK, 0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 11),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 15)),
+ };
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+index 42b0e58624d00..074ba54d420f4 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+@@ -128,22 +128,22 @@ static const struct dpu_dspp_cfg sm8150_dspp[] = {
+ };
+
+ static const struct dpu_pingpong_cfg sm8150_pp[] = {
+- PP_BLK("pingpong_0", PINGPONG_0, 0x70000, MERGE_3D_0, sdm845_pp_sblk,
++ PP_BLK("pingpong_0", PINGPONG_0, 0x70000, PINGPONG_SM8150_MASK, MERGE_3D_0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12)),
+- PP_BLK("pingpong_1", PINGPONG_1, 0x70800, MERGE_3D_0, sdm845_pp_sblk,
++ PP_BLK("pingpong_1", PINGPONG_1, 0x70800, PINGPONG_SM8150_MASK, MERGE_3D_0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 13)),
+- PP_BLK("pingpong_2", PINGPONG_2, 0x71000, MERGE_3D_1, sdm845_pp_sblk,
++ PP_BLK("pingpong_2", PINGPONG_2, 0x71000, PINGPONG_SM8150_MASK, MERGE_3D_1, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 10),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 14)),
+- PP_BLK("pingpong_3", PINGPONG_3, 0x71800, MERGE_3D_1, sdm845_pp_sblk,
++ PP_BLK("pingpong_3", PINGPONG_3, 0x71800, PINGPONG_SM8150_MASK, MERGE_3D_1, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 11),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 15)),
+- PP_BLK("pingpong_4", PINGPONG_4, 0x72000, MERGE_3D_2, sdm845_pp_sblk,
++ PP_BLK("pingpong_4", PINGPONG_4, 0x72000, PINGPONG_SM8150_MASK, MERGE_3D_2, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 30),
+ -1),
+- PP_BLK("pingpong_5", PINGPONG_5, 0x72800, MERGE_3D_2, sdm845_pp_sblk,
++ PP_BLK("pingpong_5", PINGPONG_5, 0x72800, PINGPONG_SM8150_MASK, MERGE_3D_2, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 31),
+ -1),
+ };
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
+index e3bdfe7b30f1f..0540d21810857 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
+@@ -116,22 +116,22 @@ static const struct dpu_lm_cfg sc8180x_lm[] = {
+ };
+
+ static const struct dpu_pingpong_cfg sc8180x_pp[] = {
+- PP_BLK("pingpong_0", PINGPONG_0, 0x70000, MERGE_3D_0, sdm845_pp_sblk,
++ PP_BLK("pingpong_0", PINGPONG_0, 0x70000, PINGPONG_SM8150_MASK, MERGE_3D_0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12)),
+- PP_BLK("pingpong_1", PINGPONG_1, 0x70800, MERGE_3D_0, sdm845_pp_sblk,
++ PP_BLK("pingpong_1", PINGPONG_1, 0x70800, PINGPONG_SM8150_MASK, MERGE_3D_0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 13)),
+- PP_BLK("pingpong_2", PINGPONG_2, 0x71000, MERGE_3D_1, sdm845_pp_sblk,
++ PP_BLK("pingpong_2", PINGPONG_2, 0x71000, PINGPONG_SM8150_MASK, MERGE_3D_1, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 10),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 14)),
+- PP_BLK("pingpong_3", PINGPONG_3, 0x71800, MERGE_3D_1, sdm845_pp_sblk,
++ PP_BLK("pingpong_3", PINGPONG_3, 0x71800, PINGPONG_SM8150_MASK, MERGE_3D_1, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 11),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 15)),
+- PP_BLK("pingpong_4", PINGPONG_4, 0x72000, MERGE_3D_2, sdm845_pp_sblk,
++ PP_BLK("pingpong_4", PINGPONG_4, 0x72000, PINGPONG_SM8150_MASK, MERGE_3D_2, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 30),
+ -1),
+- PP_BLK("pingpong_5", PINGPONG_5, 0x72800, MERGE_3D_2, sdm845_pp_sblk,
++ PP_BLK("pingpong_5", PINGPONG_5, 0x72800, PINGPONG_SM8150_MASK, MERGE_3D_2, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 31),
+ -1),
+ };
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+index ed130582873c7..b3284de35b8fa 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+@@ -129,22 +129,22 @@ static const struct dpu_dspp_cfg sm8250_dspp[] = {
+ };
+
+ static const struct dpu_pingpong_cfg sm8250_pp[] = {
+- PP_BLK("pingpong_0", PINGPONG_0, 0x70000, MERGE_3D_0, sdm845_pp_sblk,
++ PP_BLK("pingpong_0", PINGPONG_0, 0x70000, PINGPONG_SM8150_MASK, MERGE_3D_0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12)),
+- PP_BLK("pingpong_1", PINGPONG_1, 0x70800, MERGE_3D_0, sdm845_pp_sblk,
++ PP_BLK("pingpong_1", PINGPONG_1, 0x70800, PINGPONG_SM8150_MASK, MERGE_3D_0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 13)),
+- PP_BLK("pingpong_2", PINGPONG_2, 0x71000, MERGE_3D_1, sdm845_pp_sblk,
++ PP_BLK("pingpong_2", PINGPONG_2, 0x71000, PINGPONG_SM8150_MASK, MERGE_3D_1, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 10),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 14)),
+- PP_BLK("pingpong_3", PINGPONG_3, 0x71800, MERGE_3D_1, sdm845_pp_sblk,
++ PP_BLK("pingpong_3", PINGPONG_3, 0x71800, PINGPONG_SM8150_MASK, MERGE_3D_1, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 11),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 15)),
+- PP_BLK("pingpong_4", PINGPONG_4, 0x72000, MERGE_3D_2, sdm845_pp_sblk,
++ PP_BLK("pingpong_4", PINGPONG_4, 0x72000, PINGPONG_SM8150_MASK, MERGE_3D_2, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 30),
+ -1),
+- PP_BLK("pingpong_5", PINGPONG_5, 0x72800, MERGE_3D_2, sdm845_pp_sblk,
++ PP_BLK("pingpong_5", PINGPONG_5, 0x72800, PINGPONG_SM8150_MASK, MERGE_3D_2, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 31),
+ -1),
+ };
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
+index a46b11730a4d4..88c211876516a 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
+@@ -76,12 +76,16 @@ static const struct dpu_lm_cfg sc7180_lm[] = {
+
+ static const struct dpu_dspp_cfg sc7180_dspp[] = {
+ DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
+- &sc7180_dspp_sblk),
++ &sm8150_dspp_sblk),
+ };
+
+ static const struct dpu_pingpong_cfg sc7180_pp[] = {
+- PP_BLK("pingpong_0", PINGPONG_0, 0x70000, 0, sdm845_pp_sblk, -1, -1),
+- PP_BLK("pingpong_1", PINGPONG_1, 0x70800, 0, sdm845_pp_sblk, -1, -1),
++ PP_BLK("pingpong_0", PINGPONG_0, 0x70000, PINGPONG_SM8150_MASK, 0, sdm845_pp_sblk,
++ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
++ -1),
++ PP_BLK("pingpong_1", PINGPONG_1, 0x70800, PINGPONG_SM8150_MASK, 0, sdm845_pp_sblk,
++ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
++ -1),
+ };
+
+ static const struct dpu_intf_cfg sc7180_intf[] = {
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
+index 988d820f7ef2e..e15dc96f1286a 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
+@@ -60,7 +60,7 @@ static const struct dpu_dspp_cfg sm6115_dspp[] = {
+ };
+
+ static const struct dpu_pingpong_cfg sm6115_pp[] = {
+- PP_BLK("pingpong_0", PINGPONG_0, 0x70000, 0, sdm845_pp_sblk,
++ PP_BLK("pingpong_0", PINGPONG_0, 0x70000, PINGPONG_SM8150_MASK, 0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12)),
+ };
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
+index c9003dcc1a59b..2ff98ef6999fe 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
+@@ -57,7 +57,7 @@ static const struct dpu_dspp_cfg qcm2290_dspp[] = {
+ };
+
+ static const struct dpu_pingpong_cfg qcm2290_pp[] = {
+- PP_BLK("pingpong_0", PINGPONG_0, 0x70000, 0, sdm845_pp_sblk,
++ PP_BLK("pingpong_0", PINGPONG_0, 0x70000, PINGPONG_SM8150_MASK, 0, sdm845_pp_sblk,
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
+ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12)),
+ };
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
+index 6b2c7eae71d99..7de87185d5c0c 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
+@@ -83,14 +83,22 @@ static const struct dpu_lm_cfg sc7280_lm[] = {
+
+ static const struct dpu_dspp_cfg sc7280_dspp[] = {
+ DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
+- &sc7180_dspp_sblk),
++ &sm8150_dspp_sblk),
+ };
+
+ static const struct dpu_pingpong_cfg sc7280_pp[] = {
+- PP_BLK_DITHER("pingpong_0", PINGPONG_0, 0x69000, 0, sc7280_pp_sblk, -1, -1),
+- PP_BLK_DITHER("pingpong_1", PINGPONG_1, 0x6a000, 0, sc7280_pp_sblk, -1, -1),
+- PP_BLK_DITHER("pingpong_2", PINGPONG_2, 0x6b000, 0, sc7280_pp_sblk, -1, -1),
+- PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, -1),
++ PP_BLK_DITHER("pingpong_0", PINGPONG_0, 0x69000, 0, sc7280_pp_sblk,
++ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
++ -1),
++ PP_BLK_DITHER("pingpong_1", PINGPONG_1, 0x6a000, 0, sc7280_pp_sblk,
++ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
++ -1),
++ PP_BLK_DITHER("pingpong_2", PINGPONG_2, 0x6b000, 0, sc7280_pp_sblk,
++ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 10),
++ -1),
++ PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk,
++ DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 11),
++ -1),
+ };
+
+ static const struct dpu_intf_cfg sc7280_intf[] = {
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
+index 4ecb3df5cbc02..8bd4bb97e639c 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
+@@ -107,9 +107,9 @@ static const struct dpu_lm_cfg sm8450_lm[] = {
+ LM_BLK("lm_1", LM_1, 0x45000, MIXER_SDM845_MASK,
+ &sdm845_lm_sblk, PINGPONG_1, LM_0, DSPP_1),
+ LM_BLK("lm_2", LM_2, 0x46000, MIXER_SDM845_MASK,
+- &sdm845_lm_sblk, PINGPONG_2, LM_3, 0),
++ &sdm845_lm_sblk, PINGPONG_2, LM_3, DSPP_2),
+ LM_BLK("lm_3", LM_3, 0x47000, MIXER_SDM845_MASK,
+- &sdm845_lm_sblk, PINGPONG_3, LM_2, 0),
++ &sdm845_lm_sblk, PINGPONG_3, LM_2, DSPP_3),
+ LM_BLK("lm_4", LM_4, 0x48000, MIXER_SDM845_MASK,
+ &sdm845_lm_sblk, PINGPONG_4, LM_5, 0),
+ LM_BLK("lm_5", LM_5, 0x49000, MIXER_SDM845_MASK,
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+index cc66ddffe6723..eee48371126d8 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+@@ -1463,6 +1463,8 @@ static const struct drm_crtc_helper_funcs dpu_crtc_helper_funcs = {
+ struct drm_crtc *dpu_crtc_init(struct drm_device *dev, struct drm_plane *plane,
+ struct drm_plane *cursor)
+ {
++ struct msm_drm_private *priv = dev->dev_private;
++ struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
+ struct drm_crtc *crtc = NULL;
+ struct dpu_crtc *dpu_crtc = NULL;
+ int i, ret;
+@@ -1494,7 +1496,8 @@ struct drm_crtc *dpu_crtc_init(struct drm_device *dev, struct drm_plane *plane,
+
+ drm_crtc_helper_add(crtc, &dpu_crtc_helper_funcs);
+
+- drm_crtc_enable_color_mgmt(crtc, 0, true, 0);
++ if (dpu_kms->catalog->dspp_count)
++ drm_crtc_enable_color_mgmt(crtc, 0, true, 0);
+
+ /* save user friendly CRTC name for later */
+ snprintf(dpu_crtc->name, DPU_CRTC_NAME_SIZE, "crtc%u", crtc->base.id);
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
+index 74470d068622e..a60fb8d3736b5 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
+@@ -36,10 +36,6 @@
+ #define DEFAULT_TEARCHECK_SYNC_THRESH_START 4
+ #define DEFAULT_TEARCHECK_SYNC_THRESH_CONTINUE 4
+
+-#define DPU_ENC_WR_PTR_START_TIMEOUT_US 20000
+-
+-#define DPU_ENC_MAX_POLL_TIMEOUT_US 2000
+-
+ static void dpu_encoder_phys_cmd_enable_te(struct dpu_encoder_phys *phys_enc);
+
+ static bool dpu_encoder_phys_cmd_is_master(struct dpu_encoder_phys *phys_enc)
+@@ -574,28 +570,8 @@ static void dpu_encoder_phys_cmd_prepare_for_kickoff(
+ atomic_read(&phys_enc->pending_kickoff_cnt));
+ }
+
+-static bool dpu_encoder_phys_cmd_is_ongoing_pptx(
+- struct dpu_encoder_phys *phys_enc)
+-{
+- struct dpu_hw_pp_vsync_info info;
+-
+- if (!phys_enc)
+- return false;
+-
+- phys_enc->hw_pp->ops.get_vsync_info(phys_enc->hw_pp, &info);
+- if (info.wr_ptr_line_count > 0 &&
+- info.wr_ptr_line_count < phys_enc->cached_mode.vdisplay)
+- return true;
+-
+- return false;
+-}
+-
+ static void dpu_encoder_phys_cmd_enable_te(struct dpu_encoder_phys *phys_enc)
+ {
+- struct dpu_encoder_phys_cmd *cmd_enc =
+- to_dpu_encoder_phys_cmd(phys_enc);
+- int trial = 0;
+-
+ if (!phys_enc)
+ return;
+ if (!phys_enc->hw_pp)
+@@ -603,37 +579,11 @@ static void dpu_encoder_phys_cmd_enable_te(struct dpu_encoder_phys *phys_enc)
+ if (!dpu_encoder_phys_cmd_is_master(phys_enc))
+ return;
+
+- /* If autorefresh is already disabled, we have nothing to do */
+- if (!phys_enc->hw_pp->ops.get_autorefresh(phys_enc->hw_pp, NULL))
+- return;
+-
+- /*
+- * If autorefresh is enabled, disable it and make sure it is safe to
+- * proceed with current frame commit/push. Sequence fallowed is,
+- * 1. Disable TE
+- * 2. Disable autorefresh config
+- * 4. Poll for frame transfer ongoing to be false
+- * 5. Enable TE back
+- */
+- _dpu_encoder_phys_cmd_connect_te(phys_enc, false);
+- phys_enc->hw_pp->ops.setup_autorefresh(phys_enc->hw_pp, 0, false);
+-
+- do {
+- udelay(DPU_ENC_MAX_POLL_TIMEOUT_US);
+- if ((trial * DPU_ENC_MAX_POLL_TIMEOUT_US)
+- > (KICKOFF_TIMEOUT_MS * USEC_PER_MSEC)) {
+- DPU_ERROR_CMDENC(cmd_enc,
+- "disable autorefresh failed\n");
+- break;
+- }
+-
+- trial++;
+- } while (dpu_encoder_phys_cmd_is_ongoing_pptx(phys_enc));
+-
+- _dpu_encoder_phys_cmd_connect_te(phys_enc, true);
+-
+- DPU_DEBUG_CMDENC(to_dpu_encoder_phys_cmd(phys_enc),
+- "disabled autorefresh\n");
++ if (phys_enc->hw_pp->ops.disable_autorefresh) {
++ phys_enc->hw_pp->ops.disable_autorefresh(phys_enc->hw_pp,
++ DRMID(phys_enc->parent),
++ phys_enc->cached_mode.vdisplay);
++ }
+ }
+
+ static int _dpu_encoder_phys_cmd_wait_for_ctl_start(
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+index 5d994bce696f9..0b604f31197bb 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+@@ -75,11 +75,15 @@
+ #define MIXER_QCM2290_MASK \
+ (BIT(DPU_DIM_LAYER) | BIT(DPU_MIXER_COMBINED_ALPHA))
+
+-#define PINGPONG_SDM845_MASK BIT(DPU_PINGPONG_DITHER)
++#define PINGPONG_SDM845_MASK \
++ (BIT(DPU_PINGPONG_DITHER) | BIT(DPU_PINGPONG_TE))
+
+-#define PINGPONG_SDM845_SPLIT_MASK \
++#define PINGPONG_SDM845_TE2_MASK \
+ (PINGPONG_SDM845_MASK | BIT(DPU_PINGPONG_TE2))
+
++#define PINGPONG_SM8150_MASK \
++ (BIT(DPU_PINGPONG_DITHER))
++
+ #define CTL_SC7280_MASK \
+ (BIT(DPU_CTL_ACTIVE_CFG) | \
+ BIT(DPU_CTL_FETCH_ACTIVE) | \
+@@ -98,9 +102,12 @@
+ #define INTF_SDM845_MASK (0)
+
+ #define INTF_SC7180_MASK \
+- (BIT(DPU_INTF_INPUT_CTRL) | BIT(DPU_INTF_TE) | BIT(DPU_INTF_STATUS_SUPPORTED))
++ (BIT(DPU_INTF_INPUT_CTRL) | \
++ BIT(DPU_INTF_TE) | \
++ BIT(DPU_INTF_STATUS_SUPPORTED) | \
++ BIT(DPU_DATA_HCTL_EN))
+
+-#define INTF_SC7280_MASK INTF_SC7180_MASK | BIT(DPU_DATA_HCTL_EN)
++#define INTF_SC7280_MASK (INTF_SC7180_MASK)
+
+ #define WB_SM8250_MASK (BIT(DPU_WB_LINE_MODE) | \
+ BIT(DPU_WB_UBWC) | \
+@@ -453,11 +460,6 @@ static const struct dpu_dspp_sub_blks msm8998_dspp_sblk = {
+ .len = 0x90, .version = 0x10007},
+ };
+
+-static const struct dpu_dspp_sub_blks sc7180_dspp_sblk = {
+- .pcc = {.id = DPU_DSPP_PCC, .base = 0x1700,
+- .len = 0x90, .version = 0x10000},
+-};
+-
+ static const struct dpu_dspp_sub_blks sm8150_dspp_sblk = {
+ .pcc = {.id = DPU_DSPP_PCC, .base = 0x1700,
+ .len = 0x90, .version = 0x40000},
+@@ -501,21 +503,11 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
+ .intr_done = _done, \
+ .intr_rdptr = _rdptr, \
+ }
+-#define PP_BLK_TE(_name, _id, _base, _merge_3d, _sblk, _done, _rdptr) \
++#define PP_BLK(_name, _id, _base, _features, _merge_3d, _sblk, _done, _rdptr) \
+ {\
+ .name = _name, .id = _id, \
+ .base = _base, .len = 0xd4, \
+- .features = PINGPONG_SDM845_SPLIT_MASK, \
+- .merge_3d = _merge_3d, \
+- .sblk = &_sblk, \
+- .intr_done = _done, \
+- .intr_rdptr = _rdptr, \
+- }
+-#define PP_BLK(_name, _id, _base, _merge_3d, _sblk, _done, _rdptr) \
+- {\
+- .name = _name, .id = _id, \
+- .base = _base, .len = 0xd4, \
+- .features = PINGPONG_SDM845_MASK, \
++ .features = _features, \
+ .merge_3d = _merge_3d, \
+ .sblk = &_sblk, \
+ .intr_done = _done, \
+@@ -528,7 +520,7 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk = {
+ #define MERGE_3D_BLK(_name, _id, _base) \
+ {\
+ .name = _name, .id = _id, \
+- .base = _base, .len = 0x100, \
++ .base = _base, .len = 0x8, \
+ .features = MERGE_3D_SM8150_MASK, \
+ .sblk = NULL \
+ }
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+index bbdc95ce374a7..f6270b7a0b140 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+@@ -117,6 +117,9 @@ static inline void dpu_hw_ctl_clear_pending_flush(struct dpu_hw_ctl *ctx)
+ trace_dpu_hw_ctl_clear_pending_flush(ctx->pending_flush_mask,
+ dpu_hw_ctl_get_flush_register(ctx));
+ ctx->pending_flush_mask = 0x0;
++ ctx->pending_intf_flush_mask = 0;
++ ctx->pending_wb_flush_mask = 0;
++ ctx->pending_merge_3d_flush_mask = 0;
+
+ memset(ctx->pending_dspp_flush_mask, 0,
+ sizeof(ctx->pending_dspp_flush_mask));
+@@ -542,7 +545,7 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
+ DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
+ BIT(cfg->merge_3d - MERGE_3D_0));
+ if (cfg->dsc) {
+- DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, DSC_IDX);
++ DPU_REG_WRITE(&ctx->hw, CTL_FLUSH, BIT(DSC_IDX));
+ DPU_REG_WRITE(c, CTL_DSC_ACTIVE, cfg->dsc);
+ }
+ }
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
+index 4e1396575e6aa..c3c70ba61c1c4 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_dsc.c
+@@ -54,9 +54,10 @@ static void dpu_hw_dsc_config(struct dpu_hw_dsc *hw_dsc,
+ if (is_cmd_mode)
+ initial_lines += 1;
+
+- slice_last_group_size = 3 - (dsc->slice_width % 3);
++ slice_last_group_size = (dsc->slice_width + 2) % 3;
++
+ data = (initial_lines << 20);
+- data |= ((slice_last_group_size - 1) << 18);
++ data |= (slice_last_group_size << 18);
+ /* bpp is 6.4 format, 4 LSBs bits are for fractional part */
+ data |= (dsc->bits_per_pixel << 8);
+ data |= (dsc->block_pred_enable << 7);
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
+index 0fcad9760b6fc..4a20a5841f223 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c
+@@ -144,23 +144,6 @@ static bool dpu_hw_pp_get_autorefresh_config(struct dpu_hw_pingpong *pp,
+ return !!((val & BIT(31)) >> 31);
+ }
+
+-static int dpu_hw_pp_poll_timeout_wr_ptr(struct dpu_hw_pingpong *pp,
+- u32 timeout_us)
+-{
+- struct dpu_hw_blk_reg_map *c;
+- u32 val;
+- int rc;
+-
+- if (!pp)
+- return -EINVAL;
+-
+- c = &pp->hw;
+- rc = readl_poll_timeout(c->blk_addr + PP_LINE_COUNT,
+- val, (val & 0xffff) >= 1, 10, timeout_us);
+-
+- return rc;
+-}
+-
+ static int dpu_hw_pp_enable_te(struct dpu_hw_pingpong *pp, bool enable)
+ {
+ struct dpu_hw_blk_reg_map *c;
+@@ -245,6 +228,49 @@ static u32 dpu_hw_pp_get_line_count(struct dpu_hw_pingpong *pp)
+ return line;
+ }
+
++static void dpu_hw_pp_disable_autorefresh(struct dpu_hw_pingpong *pp,
++ uint32_t encoder_id, u16 vdisplay)
++{
++ struct dpu_hw_pp_vsync_info info;
++ int trial = 0;
++
++ /* If autorefresh is already disabled, we have nothing to do */
++ if (!dpu_hw_pp_get_autorefresh_config(pp, NULL))
++ return;
++
++ /*
++ * If autorefresh is enabled, disable it and make sure it is safe to
++ * proceed with current frame commit/push. Sequence followed is,
++ * 1. Disable TE
++ * 2. Disable autorefresh config
++ * 4. Poll for frame transfer ongoing to be false
++ * 5. Enable TE back
++ */
++
++ dpu_hw_pp_connect_external_te(pp, false);
++ dpu_hw_pp_setup_autorefresh_config(pp, 0, false);
++
++ do {
++ udelay(DPU_ENC_MAX_POLL_TIMEOUT_US);
++ if ((trial * DPU_ENC_MAX_POLL_TIMEOUT_US)
++ > (KICKOFF_TIMEOUT_MS * USEC_PER_MSEC)) {
++ DPU_ERROR("enc%d pp%d disable autorefresh failed\n",
++ encoder_id, pp->idx - PINGPONG_0);
++ break;
++ }
++
++ trial++;
++
++ dpu_hw_pp_get_vsync_info(pp, &info);
++ } while (info.wr_ptr_line_count > 0 &&
++ info.wr_ptr_line_count < vdisplay);
++
++ dpu_hw_pp_connect_external_te(pp, true);
++
++ DPU_DEBUG("enc%d pp%d disabled autorefresh\n",
++ encoder_id, pp->idx - PINGPONG_0);
++}
++
+ static int dpu_hw_pp_dsc_enable(struct dpu_hw_pingpong *pp)
+ {
+ struct dpu_hw_blk_reg_map *c = &pp->hw;
+@@ -274,14 +300,13 @@ static int dpu_hw_pp_setup_dsc(struct dpu_hw_pingpong *pp)
+ static void _setup_pingpong_ops(struct dpu_hw_pingpong *c,
+ unsigned long features)
+ {
+- c->ops.setup_tearcheck = dpu_hw_pp_setup_te_config;
+- c->ops.enable_tearcheck = dpu_hw_pp_enable_te;
+- c->ops.connect_external_te = dpu_hw_pp_connect_external_te;
+- c->ops.get_vsync_info = dpu_hw_pp_get_vsync_info;
+- c->ops.setup_autorefresh = dpu_hw_pp_setup_autorefresh_config;
+- c->ops.get_autorefresh = dpu_hw_pp_get_autorefresh_config;
+- c->ops.poll_timeout_wr_ptr = dpu_hw_pp_poll_timeout_wr_ptr;
+- c->ops.get_line_count = dpu_hw_pp_get_line_count;
++ if (test_bit(DPU_PINGPONG_TE, &features)) {
++ c->ops.setup_tearcheck = dpu_hw_pp_setup_te_config;
++ c->ops.enable_tearcheck = dpu_hw_pp_enable_te;
++ c->ops.connect_external_te = dpu_hw_pp_connect_external_te;
++ c->ops.get_line_count = dpu_hw_pp_get_line_count;
++ c->ops.disable_autorefresh = dpu_hw_pp_disable_autorefresh;
++ }
+ c->ops.setup_dsc = dpu_hw_pp_setup_dsc;
+ c->ops.enable_dsc = dpu_hw_pp_dsc_enable;
+ c->ops.disable_dsc = dpu_hw_pp_dsc_disable;
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h
+index c00223441d990..851b013c4c4b6 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h
+@@ -61,9 +61,6 @@ struct dpu_hw_dither_cfg {
+ * Assumption is these functions will be called after clocks are enabled
+ * @setup_tearcheck : program tear check values
+ * @enable_tearcheck : enables tear check
+- * @get_vsync_info : retries timing info of the panel
+- * @setup_autorefresh : configure and enable the autorefresh config
+- * @get_autorefresh : retrieve autorefresh config from hardware
+ * @setup_dither : function to program the dither hw block
+ * @get_line_count: obtain current vertical line counter
+ */
+@@ -89,34 +86,14 @@ struct dpu_hw_pingpong_ops {
+ bool enable_external_te);
+
+ /**
+- * provides the programmed and current
+- * line_count
+- */
+- int (*get_vsync_info)(struct dpu_hw_pingpong *pp,
+- struct dpu_hw_pp_vsync_info *info);
+-
+- /**
+- * configure and enable the autorefresh config
+- */
+- void (*setup_autorefresh)(struct dpu_hw_pingpong *pp,
+- u32 frame_count, bool enable);
+-
+- /**
+- * retrieve autorefresh config from hardware
+- */
+- bool (*get_autorefresh)(struct dpu_hw_pingpong *pp,
+- u32 *frame_count);
+-
+- /**
+- * poll until write pointer transmission starts
+- * @Return: 0 on success, -ETIMEDOUT on timeout
++ * Obtain current vertical line counter
+ */
+- int (*poll_timeout_wr_ptr)(struct dpu_hw_pingpong *pp, u32 timeout_us);
++ u32 (*get_line_count)(struct dpu_hw_pingpong *pp);
+
+ /**
+- * Obtain current vertical line counter
++ * Disable autorefresh if enabled
+ */
+- u32 (*get_line_count)(struct dpu_hw_pingpong *pp);
++ void (*disable_autorefresh)(struct dpu_hw_pingpong *pp, uint32_t encoder_id, u16 vdisplay);
+
+ /**
+ * Setup dither matix for pingpong block
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
+index aca39a4689f48..e7fc67381c2bd 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
+@@ -118,6 +118,10 @@ struct vsync_info {
+ u32 line_count;
+ };
+
++#define DPU_ENC_WR_PTR_START_TIMEOUT_US 20000
++
++#define DPU_ENC_MAX_POLL_TIMEOUT_US 2000
++
+ #define to_dpu_kms(x) container_of(x, struct dpu_kms, base)
+
+ #define to_dpu_global_state(x) container_of(x, struct dpu_global_state, base)
+diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
+index 03b0eda6df54a..cffb3f41f6023 100644
+--- a/drivers/gpu/drm/msm/dp/dp_display.c
++++ b/drivers/gpu/drm/msm/dp/dp_display.c
+@@ -329,6 +329,8 @@ static void dp_display_unbind(struct device *dev, struct device *master,
+
+ kthread_stop(dp->ev_tsk);
+
++ of_dp_aux_depopulate_bus(dp->aux);
++
+ dp_power_client_deinit(dp->power);
+ dp_unregister_audio_driver(dev, dp->audio);
+ dp_aux_unregister(dp->aux);
+@@ -1328,9 +1330,9 @@ static int dp_display_remove(struct platform_device *pdev)
+ {
+ struct dp_display_private *dp = dev_get_dp_display_private(&pdev->dev);
+
++ component_del(&pdev->dev, &dp_display_comp_ops);
+ dp_display_deinit_sub_modules(dp);
+
+- component_del(&pdev->dev, &dp_display_comp_ops);
+ platform_set_drvdata(pdev, NULL);
+
+ return 0;
+@@ -1509,11 +1511,6 @@ void msm_dp_debugfs_init(struct msm_dp *dp_display, struct drm_minor *minor)
+ }
+ }
+
+-static void of_dp_aux_depopulate_bus_void(void *data)
+-{
+- of_dp_aux_depopulate_bus(data);
+-}
+-
+ static int dp_display_get_next_bridge(struct msm_dp *dp)
+ {
+ int rc;
+@@ -1541,12 +1538,6 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
+ of_node_put(aux_bus);
+ if (rc)
+ goto error;
+-
+- rc = devm_add_action_or_reset(dp->drm_dev->dev,
+- of_dp_aux_depopulate_bus_void,
+- dp_priv->aux);
+- if (rc)
+- goto error;
+ } else if (dp->is_edp) {
+ DRM_ERROR("eDP aux_bus not found\n");
+ return -ENODEV;
+@@ -1570,6 +1561,7 @@ static int dp_display_get_next_bridge(struct msm_dp *dp)
+
+ error:
+ if (dp->is_edp) {
++ of_dp_aux_depopulate_bus(dp_priv->aux);
+ dp_display_host_phy_exit(dp_priv);
+ dp_display_host_deinit(dp_priv);
+ }
+diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
+index 961689a255c47..735a7f6386df8 100644
+--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
++++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
+@@ -850,18 +850,17 @@ static void dsi_update_dsc_timing(struct msm_dsi_host *msm_host, bool is_cmd_mod
+ */
+ slice_per_intf = DIV_ROUND_UP(hdisplay, dsc->slice_width);
+
+- /*
+- * If slice_count is greater than slice_per_intf
+- * then default to 1. This can happen during partial
+- * update.
+- */
+- if (dsc->slice_count > slice_per_intf)
+- dsc->slice_count = 1;
+-
+ total_bytes_per_intf = dsc->slice_chunk_size * slice_per_intf;
+
+ eol_byte_num = total_bytes_per_intf % 3;
+- pkt_per_line = slice_per_intf / dsc->slice_count;
++
++ /*
++ * Typically, pkt_per_line = slice_per_intf * slice_per_pkt.
++ *
++ * Since the current driver only supports slice_per_pkt = 1,
++ * pkt_per_line will be equal to slice per intf for now.
++ */
++ pkt_per_line = slice_per_intf;
+
+ if (is_cmd_mode) /* packet data type */
+ reg = DSI_COMMAND_COMPRESSION_MODE_CTRL_STREAM0_DATATYPE(MIPI_DSI_DCS_LONG_WRITE);
+@@ -985,7 +984,14 @@ static void dsi_timing_setup(struct msm_dsi_host *msm_host, bool is_bonded_dsi)
+ if (!msm_host->dsc)
+ wc = hdisplay * dsi_get_bpp(msm_host->format) / 8 + 1;
+ else
+- wc = msm_host->dsc->slice_chunk_size * msm_host->dsc->slice_count + 1;
++ /*
++ * When DSC is enabled, WC = slice_chunk_size * slice_per_pkt + 1.
++ * Currently, the driver only supports default value of slice_per_pkt = 1
++ *
++ * TODO: Expand mipi_dsi_device struct to hold slice_per_pkt info
++ * and adjust DSC math to account for slice_per_pkt.
++ */
++ wc = msm_host->dsc->slice_chunk_size + 1;
+
+ dsi_write(msm_host, REG_DSI_CMD_MDP_STREAM0_CTRL,
+ DSI_CMD_MDP_STREAM0_CTRL_WORD_COUNT(wc) |
+diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
+index 9f488adea7f54..3ce45b023e637 100644
+--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
++++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
+@@ -539,6 +539,9 @@ static int dsi_pll_14nm_vco_prepare(struct clk_hw *hw)
+ if (unlikely(pll_14nm->phy->pll_on))
+ return 0;
+
++ if (dsi_pll_14nm_vco_recalc_rate(hw, VCO_REF_CLK_RATE) == 0)
++ dsi_pll_14nm_vco_set_rate(hw, pll_14nm->phy->cfg->min_pll_rate, VCO_REF_CLK_RATE);
++
+ dsi_phy_write(base + REG_DSI_14nm_PHY_PLL_VREF_CFG1, 0x10);
+ dsi_phy_write(cmn_base + REG_DSI_14nm_PHY_CMN_PLL_CNTRL, 1);
+
+diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
+index 5bb777ff13130..9b6824f6b9e4b 100644
+--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
++++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
+@@ -64,6 +64,7 @@
+ #include "nouveau_connector.h"
+ #include "nouveau_encoder.h"
+ #include "nouveau_fence.h"
++#include "nv50_display.h"
+
+ #include <subdev/bios/dp.h>
+
+diff --git a/drivers/gpu/drm/nouveau/nv50_display.h b/drivers/gpu/drm/nouveau/nv50_display.h
+index fbd3b15583bc8..60f77766766e9 100644
+--- a/drivers/gpu/drm/nouveau/nv50_display.h
++++ b/drivers/gpu/drm/nouveau/nv50_display.h
+@@ -31,7 +31,5 @@
+ #include "nouveau_reg.h"
+
+ int nv50_display_create(struct drm_device *);
+-void nv50_display_destroy(struct drm_device *);
+-int nv50_display_init(struct drm_device *);
+-void nv50_display_fini(struct drm_device *);
++
+ #endif /* __NV50_DISPLAY_H__ */
+diff --git a/drivers/gpu/drm/panel/panel-sharp-ls043t1le01.c b/drivers/gpu/drm/panel/panel-sharp-ls043t1le01.c
+index d1ec80a3e3c72..ef148504cf24a 100644
+--- a/drivers/gpu/drm/panel/panel-sharp-ls043t1le01.c
++++ b/drivers/gpu/drm/panel/panel-sharp-ls043t1le01.c
+@@ -192,15 +192,15 @@ static int sharp_nt_panel_enable(struct drm_panel *panel)
+ }
+
+ static const struct drm_display_mode default_mode = {
+- .clock = 41118,
++ .clock = (540 + 48 + 32 + 80) * (960 + 3 + 10 + 15) * 60 / 1000,
+ .hdisplay = 540,
+ .hsync_start = 540 + 48,
+- .hsync_end = 540 + 48 + 80,
+- .htotal = 540 + 48 + 80 + 32,
++ .hsync_end = 540 + 48 + 32,
++ .htotal = 540 + 48 + 32 + 80,
+ .vdisplay = 960,
+ .vsync_start = 960 + 3,
+- .vsync_end = 960 + 3 + 15,
+- .vtotal = 960 + 3 + 15 + 1,
++ .vsync_end = 960 + 3 + 10,
++ .vtotal = 960 + 3 + 10 + 15,
+ };
+
+ static int sharp_nt_panel_get_modes(struct drm_panel *panel,
+@@ -280,6 +280,7 @@ static int sharp_nt_panel_probe(struct mipi_dsi_device *dsi)
+ dsi->lanes = 2;
+ dsi->format = MIPI_DSI_FMT_RGB888;
+ dsi->mode_flags = MIPI_DSI_MODE_VIDEO |
++ MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
+ MIPI_DSI_MODE_VIDEO_HSE |
+ MIPI_DSI_CLOCK_NON_CONTINUOUS |
+ MIPI_DSI_MODE_NO_EOT_PACKET;
+diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
+index 065f378bba9d2..d8efbcee9bc12 100644
+--- a/drivers/gpu/drm/panel/panel-simple.c
++++ b/drivers/gpu/drm/panel/panel-simple.c
+@@ -759,8 +759,8 @@ static const struct panel_desc ampire_am_480272h3tmqw_t01h = {
+ .num_modes = 1,
+ .bpc = 8,
+ .size = {
+- .width = 105,
+- .height = 67,
++ .width = 99,
++ .height = 58,
+ },
+ .bus_format = MEDIA_BUS_FMT_RGB888_1X24,
+ };
+diff --git a/drivers/gpu/drm/radeon/ci_dpm.c b/drivers/gpu/drm/radeon/ci_dpm.c
+index 8ef25ab305ae7..b8f4dac68d850 100644
+--- a/drivers/gpu/drm/radeon/ci_dpm.c
++++ b/drivers/gpu/drm/radeon/ci_dpm.c
+@@ -5517,6 +5517,7 @@ static int ci_parse_power_table(struct radeon_device *rdev)
+ u8 frev, crev;
+ u8 *power_state_offset;
+ struct ci_ps *ps;
++ int ret;
+
+ if (!atom_parse_data_header(mode_info->atom_context, index, NULL,
+ &frev, &crev, &data_offset))
+@@ -5546,11 +5547,15 @@ static int ci_parse_power_table(struct radeon_device *rdev)
+ non_clock_array_index = power_state->v2.nonClockInfoIndex;
+ non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
+ &non_clock_info_array->nonClockInfo[non_clock_array_index];
+- if (!rdev->pm.power_state[i].clock_info)
+- return -EINVAL;
++ if (!rdev->pm.power_state[i].clock_info) {
++ ret = -EINVAL;
++ goto err_free_ps;
++ }
+ ps = kzalloc(sizeof(struct ci_ps), GFP_KERNEL);
+- if (ps == NULL)
+- return -ENOMEM;
++ if (ps == NULL) {
++ ret = -ENOMEM;
++ goto err_free_ps;
++ }
+ rdev->pm.dpm.ps[i].ps_priv = ps;
+ ci_parse_pplib_non_clock_info(rdev, &rdev->pm.dpm.ps[i],
+ non_clock_info,
+@@ -5590,6 +5595,12 @@ static int ci_parse_power_table(struct radeon_device *rdev)
+ }
+
+ return 0;
++
++err_free_ps:
++ for (i = 0; i < rdev->pm.dpm.num_ps; i++)
++ kfree(rdev->pm.dpm.ps[i].ps_priv);
++ kfree(rdev->pm.dpm.ps);
++ return ret;
+ }
+
+ static int ci_get_vbios_boot_values(struct radeon_device *rdev,
+@@ -5678,25 +5689,26 @@ int ci_dpm_init(struct radeon_device *rdev)
+
+ ret = ci_get_vbios_boot_values(rdev, &pi->vbios_boot_state);
+ if (ret) {
+- ci_dpm_fini(rdev);
++ kfree(rdev->pm.dpm.priv);
+ return ret;
+ }
+
+ ret = r600_get_platform_caps(rdev);
+ if (ret) {
+- ci_dpm_fini(rdev);
++ kfree(rdev->pm.dpm.priv);
+ return ret;
+ }
+
+ ret = r600_parse_extended_power_table(rdev);
+ if (ret) {
+- ci_dpm_fini(rdev);
++ kfree(rdev->pm.dpm.priv);
+ return ret;
+ }
+
+ ret = ci_parse_power_table(rdev);
+ if (ret) {
+- ci_dpm_fini(rdev);
++ kfree(rdev->pm.dpm.priv);
++ r600_free_extended_power_table(rdev);
+ return ret;
+ }
+
+diff --git a/drivers/gpu/drm/radeon/cypress_dpm.c b/drivers/gpu/drm/radeon/cypress_dpm.c
+index fdddbbaecbb74..72a0768df00f7 100644
+--- a/drivers/gpu/drm/radeon/cypress_dpm.c
++++ b/drivers/gpu/drm/radeon/cypress_dpm.c
+@@ -557,8 +557,12 @@ static int cypress_populate_mclk_value(struct radeon_device *rdev,
+ ASIC_INTERNAL_MEMORY_SS, vco_freq)) {
+ u32 reference_clock = rdev->clock.mpll.reference_freq;
+ u32 decoded_ref = rv740_get_decoded_reference_divider(dividers.ref_div);
+- u32 clk_s = reference_clock * 5 / (decoded_ref * ss.rate);
+- u32 clk_v = ss.percentage *
++ u32 clk_s, clk_v;
++
++ if (!decoded_ref)
++ return -EINVAL;
++ clk_s = reference_clock * 5 / (decoded_ref * ss.rate);
++ clk_v = ss.percentage *
+ (0x4000 * dividers.whole_fb_div + 0x800 * dividers.frac_fb_div) / (clk_s * 625);
+
+ mpll_ss1 &= ~CLKV_MASK;
+diff --git a/drivers/gpu/drm/radeon/ni_dpm.c b/drivers/gpu/drm/radeon/ni_dpm.c
+index 672d2239293e0..3e1c1a392fb7b 100644
+--- a/drivers/gpu/drm/radeon/ni_dpm.c
++++ b/drivers/gpu/drm/radeon/ni_dpm.c
+@@ -2241,8 +2241,12 @@ static int ni_populate_mclk_value(struct radeon_device *rdev,
+ ASIC_INTERNAL_MEMORY_SS, vco_freq)) {
+ u32 reference_clock = rdev->clock.mpll.reference_freq;
+ u32 decoded_ref = rv740_get_decoded_reference_divider(dividers.ref_div);
+- u32 clk_s = reference_clock * 5 / (decoded_ref * ss.rate);
+- u32 clk_v = ss.percentage *
++ u32 clk_s, clk_v;
++
++ if (!decoded_ref)
++ return -EINVAL;
++ clk_s = reference_clock * 5 / (decoded_ref * ss.rate);
++ clk_v = ss.percentage *
+ (0x4000 * dividers.whole_fb_div + 0x800 * dividers.frac_fb_div) / (clk_s * 625);
+
+ mpll_ss1 &= ~CLKV_MASK;
+diff --git a/drivers/gpu/drm/radeon/rv740_dpm.c b/drivers/gpu/drm/radeon/rv740_dpm.c
+index d57a3e1df8d63..4464fd21a3029 100644
+--- a/drivers/gpu/drm/radeon/rv740_dpm.c
++++ b/drivers/gpu/drm/radeon/rv740_dpm.c
+@@ -249,8 +249,12 @@ int rv740_populate_mclk_value(struct radeon_device *rdev,
+ ASIC_INTERNAL_MEMORY_SS, vco_freq)) {
+ u32 reference_clock = rdev->clock.mpll.reference_freq;
+ u32 decoded_ref = rv740_get_decoded_reference_divider(dividers.ref_div);
+- u32 clk_s = reference_clock * 5 / (decoded_ref * ss.rate);
+- u32 clk_v = 0x40000 * ss.percentage *
++ u32 clk_s, clk_v;
++
++ if (!decoded_ref)
++ return -EINVAL;
++ clk_s = reference_clock * 5 / (decoded_ref * ss.rate);
++ clk_v = 0x40000 * ss.percentage *
+ (dividers.whole_fb_div + (dividers.frac_fb_div / 8)) / (clk_s * 10000);
+
+ mpll_ss1 &= ~CLKV_MASK;
+diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.c b/drivers/gpu/drm/sun4i/sun4i_tcon.c
+index 523a6d7879210..936796851ffd3 100644
+--- a/drivers/gpu/drm/sun4i/sun4i_tcon.c
++++ b/drivers/gpu/drm/sun4i/sun4i_tcon.c
+@@ -778,21 +778,19 @@ static irqreturn_t sun4i_tcon_handler(int irq, void *private)
+ static int sun4i_tcon_init_clocks(struct device *dev,
+ struct sun4i_tcon *tcon)
+ {
+- tcon->clk = devm_clk_get(dev, "ahb");
++ tcon->clk = devm_clk_get_enabled(dev, "ahb");
+ if (IS_ERR(tcon->clk)) {
+ dev_err(dev, "Couldn't get the TCON bus clock\n");
+ return PTR_ERR(tcon->clk);
+ }
+- clk_prepare_enable(tcon->clk);
+
+ if (tcon->quirks->has_channel_0) {
+- tcon->sclk0 = devm_clk_get(dev, "tcon-ch0");
++ tcon->sclk0 = devm_clk_get_enabled(dev, "tcon-ch0");
+ if (IS_ERR(tcon->sclk0)) {
+ dev_err(dev, "Couldn't get the TCON channel 0 clock\n");
+ return PTR_ERR(tcon->sclk0);
+ }
+ }
+- clk_prepare_enable(tcon->sclk0);
+
+ if (tcon->quirks->has_channel_1) {
+ tcon->sclk1 = devm_clk_get(dev, "tcon-ch1");
+@@ -805,12 +803,6 @@ static int sun4i_tcon_init_clocks(struct device *dev,
+ return 0;
+ }
+
+-static void sun4i_tcon_free_clocks(struct sun4i_tcon *tcon)
+-{
+- clk_disable_unprepare(tcon->sclk0);
+- clk_disable_unprepare(tcon->clk);
+-}
+-
+ static int sun4i_tcon_init_irq(struct device *dev,
+ struct sun4i_tcon *tcon)
+ {
+@@ -1223,14 +1215,14 @@ static int sun4i_tcon_bind(struct device *dev, struct device *master,
+ ret = sun4i_tcon_init_regmap(dev, tcon);
+ if (ret) {
+ dev_err(dev, "Couldn't init our TCON regmap\n");
+- goto err_free_clocks;
++ goto err_assert_reset;
+ }
+
+ if (tcon->quirks->has_channel_0) {
+ ret = sun4i_dclk_create(dev, tcon);
+ if (ret) {
+ dev_err(dev, "Couldn't create our TCON dot clock\n");
+- goto err_free_clocks;
++ goto err_assert_reset;
+ }
+ }
+
+@@ -1293,8 +1285,6 @@ static int sun4i_tcon_bind(struct device *dev, struct device *master,
+ err_free_dotclock:
+ if (tcon->quirks->has_channel_0)
+ sun4i_dclk_free(tcon);
+-err_free_clocks:
+- sun4i_tcon_free_clocks(tcon);
+ err_assert_reset:
+ reset_control_assert(tcon->lcd_rst);
+ return ret;
+@@ -1308,7 +1298,6 @@ static void sun4i_tcon_unbind(struct device *dev, struct device *master,
+ list_del(&tcon->list);
+ if (tcon->quirks->has_channel_0)
+ sun4i_dclk_free(tcon);
+- sun4i_tcon_free_clocks(tcon);
+ }
+
+ static const struct component_ops sun4i_tcon_ops = {
+diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
+index 8e53fa80742b2..80164e79af006 100644
+--- a/drivers/gpu/drm/vkms/vkms_composer.c
++++ b/drivers/gpu/drm/vkms/vkms_composer.c
+@@ -99,7 +99,7 @@ static void blend(struct vkms_writeback_job *wb,
+ if (!check_y_limit(plane[i]->frame_info, y))
+ continue;
+
+- plane[i]->plane_read(stage_buffer, plane[i]->frame_info, y);
++ vkms_compose_row(stage_buffer, plane[i], y);
+ pre_mul_alpha_blend(plane[i]->frame_info, stage_buffer,
+ output_buffer);
+ }
+@@ -118,7 +118,7 @@ static int check_format_funcs(struct vkms_crtc_state *crtc_state,
+ u32 n_active_planes = crtc_state->num_active_planes;
+
+ for (size_t i = 0; i < n_active_planes; i++)
+- if (!planes[i]->plane_read)
++ if (!planes[i]->pixel_read)
+ return -1;
+
+ if (active_wb && !active_wb->wb_write)
+diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
+index 4a248567efb26..f152d54baf769 100644
+--- a/drivers/gpu/drm/vkms/vkms_drv.h
++++ b/drivers/gpu/drm/vkms/vkms_drv.h
+@@ -56,8 +56,7 @@ struct vkms_writeback_job {
+ struct vkms_plane_state {
+ struct drm_shadow_plane_state base;
+ struct vkms_frame_info *frame_info;
+- void (*plane_read)(struct line_buffer *buffer,
+- const struct vkms_frame_info *frame_info, int y);
++ void (*pixel_read)(u8 *src_buffer, struct pixel_argb_u16 *out_pixel);
+ };
+
+ struct vkms_plane {
+@@ -155,6 +154,7 @@ int vkms_verify_crc_source(struct drm_crtc *crtc, const char *source_name,
+ /* Composer Support */
+ void vkms_composer_worker(struct work_struct *work);
+ void vkms_set_composer(struct vkms_output *out, bool enabled);
++void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y);
+
+ /* Writeback */
+ int vkms_enable_writeback_connector(struct vkms_device *vkmsdev);
+diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
+index d4950688b3f17..b11342026485f 100644
+--- a/drivers/gpu/drm/vkms/vkms_formats.c
++++ b/drivers/gpu/drm/vkms/vkms_formats.c
+@@ -42,100 +42,75 @@ static void *get_packed_src_addr(const struct vkms_frame_info *frame_info, int y
+ return packed_pixels_addr(frame_info, x_src, y_src);
+ }
+
+-static void ARGB8888_to_argb_u16(struct line_buffer *stage_buffer,
+- const struct vkms_frame_info *frame_info, int y)
++static void ARGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+ {
+- struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
+- u8 *src_pixels = get_packed_src_addr(frame_info, y);
+- int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
+- stage_buffer->n_pixels);
+-
+- for (size_t x = 0; x < x_limit; x++, src_pixels += 4) {
+- /*
+- * The 257 is the "conversion ratio". This number is obtained by the
+- * (2^16 - 1) / (2^8 - 1) division. Which, in this case, tries to get
+- * the best color value in a pixel format with more possibilities.
+- * A similar idea applies to others RGB color conversions.
+- */
+- out_pixels[x].a = (u16)src_pixels[3] * 257;
+- out_pixels[x].r = (u16)src_pixels[2] * 257;
+- out_pixels[x].g = (u16)src_pixels[1] * 257;
+- out_pixels[x].b = (u16)src_pixels[0] * 257;
+- }
++ /*
++ * The 257 is the "conversion ratio". This number is obtained by the
++ * (2^16 - 1) / (2^8 - 1) division. Which, in this case, tries to get
++ * the best color value in a pixel format with more possibilities.
++ * A similar idea applies to others RGB color conversions.
++ */
++ out_pixel->a = (u16)src_pixels[3] * 257;
++ out_pixel->r = (u16)src_pixels[2] * 257;
++ out_pixel->g = (u16)src_pixels[1] * 257;
++ out_pixel->b = (u16)src_pixels[0] * 257;
+ }
+
+-static void XRGB8888_to_argb_u16(struct line_buffer *stage_buffer,
+- const struct vkms_frame_info *frame_info, int y)
++static void XRGB8888_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+ {
+- struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
+- u8 *src_pixels = get_packed_src_addr(frame_info, y);
+- int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
+- stage_buffer->n_pixels);
+-
+- for (size_t x = 0; x < x_limit; x++, src_pixels += 4) {
+- out_pixels[x].a = (u16)0xffff;
+- out_pixels[x].r = (u16)src_pixels[2] * 257;
+- out_pixels[x].g = (u16)src_pixels[1] * 257;
+- out_pixels[x].b = (u16)src_pixels[0] * 257;
+- }
++ out_pixel->a = (u16)0xffff;
++ out_pixel->r = (u16)src_pixels[2] * 257;
++ out_pixel->g = (u16)src_pixels[1] * 257;
++ out_pixel->b = (u16)src_pixels[0] * 257;
+ }
+
+-static void ARGB16161616_to_argb_u16(struct line_buffer *stage_buffer,
+- const struct vkms_frame_info *frame_info,
+- int y)
++static void ARGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+ {
+- struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
+- u16 *src_pixels = get_packed_src_addr(frame_info, y);
+- int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
+- stage_buffer->n_pixels);
++ u16 *pixels = (u16 *)src_pixels;
+
+- for (size_t x = 0; x < x_limit; x++, src_pixels += 4) {
+- out_pixels[x].a = le16_to_cpu(src_pixels[3]);
+- out_pixels[x].r = le16_to_cpu(src_pixels[2]);
+- out_pixels[x].g = le16_to_cpu(src_pixels[1]);
+- out_pixels[x].b = le16_to_cpu(src_pixels[0]);
+- }
++ out_pixel->a = le16_to_cpu(pixels[3]);
++ out_pixel->r = le16_to_cpu(pixels[2]);
++ out_pixel->g = le16_to_cpu(pixels[1]);
++ out_pixel->b = le16_to_cpu(pixels[0]);
+ }
+
+-static void XRGB16161616_to_argb_u16(struct line_buffer *stage_buffer,
+- const struct vkms_frame_info *frame_info,
+- int y)
++static void XRGB16161616_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+ {
+- struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
+- u16 *src_pixels = get_packed_src_addr(frame_info, y);
+- int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
+- stage_buffer->n_pixels);
++ u16 *pixels = (u16 *)src_pixels;
+
+- for (size_t x = 0; x < x_limit; x++, src_pixels += 4) {
+- out_pixels[x].a = (u16)0xffff;
+- out_pixels[x].r = le16_to_cpu(src_pixels[2]);
+- out_pixels[x].g = le16_to_cpu(src_pixels[1]);
+- out_pixels[x].b = le16_to_cpu(src_pixels[0]);
+- }
++ out_pixel->a = (u16)0xffff;
++ out_pixel->r = le16_to_cpu(pixels[2]);
++ out_pixel->g = le16_to_cpu(pixels[1]);
++ out_pixel->b = le16_to_cpu(pixels[0]);
+ }
+
+-static void RGB565_to_argb_u16(struct line_buffer *stage_buffer,
+- const struct vkms_frame_info *frame_info, int y)
++static void RGB565_to_argb_u16(u8 *src_pixels, struct pixel_argb_u16 *out_pixel)
+ {
+- struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
+- u16 *src_pixels = get_packed_src_addr(frame_info, y);
+- int x_limit = min_t(size_t, drm_rect_width(&frame_info->dst),
+- stage_buffer->n_pixels);
++ u16 *pixels = (u16 *)src_pixels;
+
+ s64 fp_rb_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(31));
+ s64 fp_g_ratio = drm_fixp_div(drm_int2fixp(65535), drm_int2fixp(63));
+
+- for (size_t x = 0; x < x_limit; x++, src_pixels++) {
+- u16 rgb_565 = le16_to_cpu(*src_pixels);
+- s64 fp_r = drm_int2fixp((rgb_565 >> 11) & 0x1f);
+- s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
+- s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
++ u16 rgb_565 = le16_to_cpu(*pixels);
++ s64 fp_r = drm_int2fixp((rgb_565 >> 11) & 0x1f);
++ s64 fp_g = drm_int2fixp((rgb_565 >> 5) & 0x3f);
++ s64 fp_b = drm_int2fixp(rgb_565 & 0x1f);
+
+- out_pixels[x].a = (u16)0xffff;
+- out_pixels[x].r = drm_fixp2int(drm_fixp_mul(fp_r, fp_rb_ratio));
+- out_pixels[x].g = drm_fixp2int(drm_fixp_mul(fp_g, fp_g_ratio));
+- out_pixels[x].b = drm_fixp2int(drm_fixp_mul(fp_b, fp_rb_ratio));
+- }
++ out_pixel->a = (u16)0xffff;
++ out_pixel->r = drm_fixp2int_round(drm_fixp_mul(fp_r, fp_rb_ratio));
++ out_pixel->g = drm_fixp2int_round(drm_fixp_mul(fp_g, fp_g_ratio));
++ out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
++}
++
++void vkms_compose_row(struct line_buffer *stage_buffer, struct vkms_plane_state *plane, int y)
++{
++ struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
++ struct vkms_frame_info *frame_info = plane->frame_info;
++ u8 *src_pixels = get_packed_src_addr(frame_info, y);
++ int limit = min_t(size_t, drm_rect_width(&frame_info->dst), stage_buffer->n_pixels);
++
++ for (size_t x = 0; x < limit; x++, src_pixels += frame_info->cpp)
++ plane->pixel_read(src_pixels, &out_pixels[x]);
+ }
+
+ /*
+@@ -241,15 +216,15 @@ static void argb_u16_to_RGB565(struct vkms_frame_info *frame_info,
+ s64 fp_g = drm_int2fixp(in_pixels[x].g);
+ s64 fp_b = drm_int2fixp(in_pixels[x].b);
+
+- u16 r = drm_fixp2int(drm_fixp_div(fp_r, fp_rb_ratio));
+- u16 g = drm_fixp2int(drm_fixp_div(fp_g, fp_g_ratio));
+- u16 b = drm_fixp2int(drm_fixp_div(fp_b, fp_rb_ratio));
++ u16 r = drm_fixp2int_round(drm_fixp_div(fp_r, fp_rb_ratio));
++ u16 g = drm_fixp2int_round(drm_fixp_div(fp_g, fp_g_ratio));
++ u16 b = drm_fixp2int_round(drm_fixp_div(fp_b, fp_rb_ratio));
+
+ *dst_pixels = cpu_to_le16(r << 11 | g << 5 | b);
+ }
+ }
+
+-void *get_frame_to_line_function(u32 format)
++void *get_pixel_conversion_function(u32 format)
+ {
+ switch (format) {
+ case DRM_FORMAT_ARGB8888:
+diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
+index 43b7c19790181..c5b113495d0c0 100644
+--- a/drivers/gpu/drm/vkms/vkms_formats.h
++++ b/drivers/gpu/drm/vkms/vkms_formats.h
+@@ -5,7 +5,7 @@
+
+ #include "vkms_drv.h"
+
+-void *get_frame_to_line_function(u32 format);
++void *get_pixel_conversion_function(u32 format);
+
+ void *get_line_to_frame_function(u32 format);
+
+diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
+index c41cec7dcb703..0a23875900ec5 100644
+--- a/drivers/gpu/drm/vkms/vkms_plane.c
++++ b/drivers/gpu/drm/vkms/vkms_plane.c
+@@ -123,7 +123,7 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
+ frame_info->offset = fb->offsets[0];
+ frame_info->pitch = fb->pitches[0];
+ frame_info->cpp = fb->format->cpp[0];
+- vkms_plane_state->plane_read = get_frame_to_line_function(fmt);
++ vkms_plane_state->pixel_read = get_pixel_conversion_function(fmt);
+ }
+
+ static int vkms_plane_atomic_check(struct drm_plane *plane,
+diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig
+index 4ce012f83253e..b977450cac752 100644
+--- a/drivers/hid/Kconfig
++++ b/drivers/hid/Kconfig
+@@ -1285,7 +1285,7 @@ config HID_MCP2221
+
+ config HID_KUNIT_TEST
+ tristate "KUnit tests for HID" if !KUNIT_ALL_TESTS
+- depends on KUNIT=y
++ depends on KUNIT
+ depends on HID_BATTERY_STRENGTH
+ depends on HID_UCLOGIC
+ default KUNIT_ALL_TESTS
+diff --git a/drivers/hwmon/f71882fg.c b/drivers/hwmon/f71882fg.c
+index 70121482a6173..27207ec6f7feb 100644
+--- a/drivers/hwmon/f71882fg.c
++++ b/drivers/hwmon/f71882fg.c
+@@ -1096,8 +1096,11 @@ static ssize_t show_pwm(struct device *dev,
+ val = data->pwm[nr];
+ else {
+ /* RPM mode */
+- val = 255 * fan_from_reg(data->fan_target[nr])
+- / fan_from_reg(data->fan_full_speed[nr]);
++ if (fan_from_reg(data->fan_full_speed[nr]))
++ val = 255 * fan_from_reg(data->fan_target[nr])
++ / fan_from_reg(data->fan_full_speed[nr]);
++ else
++ val = 0;
+ }
+ mutex_unlock(&data->update_lock);
+ return sprintf(buf, "%d\n", val);
+diff --git a/drivers/hwmon/gsc-hwmon.c b/drivers/hwmon/gsc-hwmon.c
+index 73e5d92b200b0..1501ceb551e79 100644
+--- a/drivers/hwmon/gsc-hwmon.c
++++ b/drivers/hwmon/gsc-hwmon.c
+@@ -82,8 +82,8 @@ static ssize_t pwm_auto_point_temp_store(struct device *dev,
+ if (kstrtol(buf, 10, &temp))
+ return -EINVAL;
+
+- temp = clamp_val(temp, 0, 10000);
+- temp = DIV_ROUND_CLOSEST(temp, 10);
++ temp = clamp_val(temp, 0, 100000);
++ temp = DIV_ROUND_CLOSEST(temp, 100);
+
+ regs[0] = temp & 0xff;
+ regs[1] = (temp >> 8) & 0xff;
+@@ -100,7 +100,7 @@ static ssize_t pwm_auto_point_pwm_show(struct device *dev,
+ {
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
+
+- return sprintf(buf, "%d\n", 255 * (50 + (attr->index * 10)) / 100);
++ return sprintf(buf, "%d\n", 255 * (50 + (attr->index * 10)));
+ }
+
+ static SENSOR_DEVICE_ATTR_RO(pwm1_auto_point1_pwm, pwm_auto_point_pwm, 0);
+diff --git a/drivers/hwmon/pmbus/adm1275.c b/drivers/hwmon/pmbus/adm1275.c
+index 3b07bfb43e937..b8543c06d022a 100644
+--- a/drivers/hwmon/pmbus/adm1275.c
++++ b/drivers/hwmon/pmbus/adm1275.c
+@@ -37,10 +37,13 @@ enum chips { adm1075, adm1272, adm1275, adm1276, adm1278, adm1293, adm1294 };
+
+ #define ADM1272_IRANGE BIT(0)
+
++#define ADM1278_TSFILT BIT(15)
+ #define ADM1278_TEMP1_EN BIT(3)
+ #define ADM1278_VIN_EN BIT(2)
+ #define ADM1278_VOUT_EN BIT(1)
+
++#define ADM1278_PMON_DEFCONFIG (ADM1278_VOUT_EN | ADM1278_TEMP1_EN | ADM1278_TSFILT)
++
+ #define ADM1293_IRANGE_25 0
+ #define ADM1293_IRANGE_50 BIT(6)
+ #define ADM1293_IRANGE_100 BIT(7)
+@@ -462,6 +465,22 @@ static const struct i2c_device_id adm1275_id[] = {
+ };
+ MODULE_DEVICE_TABLE(i2c, adm1275_id);
+
++/* Enable VOUT & TEMP1 if not enabled (disabled by default) */
++static int adm1275_enable_vout_temp(struct i2c_client *client, int config)
++{
++ int ret;
++
++ if ((config & ADM1278_PMON_DEFCONFIG) != ADM1278_PMON_DEFCONFIG) {
++ config |= ADM1278_PMON_DEFCONFIG;
++ ret = i2c_smbus_write_word_data(client, ADM1275_PMON_CONFIG, config);
++ if (ret < 0) {
++ dev_err(&client->dev, "Failed to enable VOUT/TEMP1 monitoring\n");
++ return ret;
++ }
++ }
++ return 0;
++}
++
+ static int adm1275_probe(struct i2c_client *client)
+ {
+ s32 (*config_read_fn)(const struct i2c_client *client, u8 reg);
+@@ -615,19 +634,10 @@ static int adm1275_probe(struct i2c_client *client)
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT |
+ PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP;
+
+- /* Enable VOUT & TEMP1 if not enabled (disabled by default) */
+- if ((config & (ADM1278_VOUT_EN | ADM1278_TEMP1_EN)) !=
+- (ADM1278_VOUT_EN | ADM1278_TEMP1_EN)) {
+- config |= ADM1278_VOUT_EN | ADM1278_TEMP1_EN;
+- ret = i2c_smbus_write_byte_data(client,
+- ADM1275_PMON_CONFIG,
+- config);
+- if (ret < 0) {
+- dev_err(&client->dev,
+- "Failed to enable VOUT monitoring\n");
+- return -ENODEV;
+- }
+- }
++ ret = adm1275_enable_vout_temp(client, config);
++ if (ret)
++ return ret;
++
+ if (config & ADM1278_VIN_EN)
+ info->func[0] |= PMBUS_HAVE_VIN;
+ break;
+@@ -684,19 +694,9 @@ static int adm1275_probe(struct i2c_client *client)
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT |
+ PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP;
+
+- /* Enable VOUT & TEMP1 if not enabled (disabled by default) */
+- if ((config & (ADM1278_VOUT_EN | ADM1278_TEMP1_EN)) !=
+- (ADM1278_VOUT_EN | ADM1278_TEMP1_EN)) {
+- config |= ADM1278_VOUT_EN | ADM1278_TEMP1_EN;
+- ret = i2c_smbus_write_word_data(client,
+- ADM1275_PMON_CONFIG,
+- config);
+- if (ret < 0) {
+- dev_err(&client->dev,
+- "Failed to enable VOUT monitoring\n");
+- return -ENODEV;
+- }
+- }
++ ret = adm1275_enable_vout_temp(client, config);
++ if (ret)
++ return ret;
+
+ if (config & ADM1278_VIN_EN)
+ info->func[0] |= PMBUS_HAVE_VIN;
+diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
+index d3bf82c0de1d8..5733294ce5cd2 100644
+--- a/drivers/hwtracing/coresight/coresight-core.c
++++ b/drivers/hwtracing/coresight/coresight-core.c
+@@ -1419,13 +1419,8 @@ static int coresight_remove_match(struct device *dev, void *data)
+ if (csdev->dev.fwnode == conn->child_fwnode) {
+ iterator->orphan = true;
+ coresight_remove_links(iterator, conn);
+- /*
+- * Drop the reference to the handle for the remote
+- * device acquired in parsing the connections from
+- * platform data.
+- */
+- fwnode_handle_put(conn->child_fwnode);
+- conn->child_fwnode = NULL;
++
++ conn->child_dev = NULL;
+ /* No need to continue */
+ break;
+ }
+diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
+index 5e62aa40ecd0f..a9f19629f3f84 100644
+--- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
++++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
+@@ -2411,7 +2411,6 @@ static ssize_t trctraceid_show(struct device *dev,
+
+ return sysfs_emit(buf, "0x%x\n", trace_id);
+ }
+-static DEVICE_ATTR_RO(trctraceid);
+
+ struct etmv4_reg {
+ struct coresight_device *csdev;
+@@ -2528,13 +2527,23 @@ coresight_etm4x_attr_reg_implemented(struct kobject *kobj,
+ return 0;
+ }
+
+-#define coresight_etm4x_reg(name, offset) \
+- &((struct dev_ext_attribute[]) { \
+- { \
+- __ATTR(name, 0444, coresight_etm4x_reg_show, NULL), \
+- (void *)(unsigned long)offset \
+- } \
+- })[0].attr.attr
++/*
++ * Macro to set an RO ext attribute with offset and show function.
++ * Offset is used in mgmt group to ensure only correct registers for
++ * the ETM / ETE variant are visible.
++ */
++#define coresight_etm4x_reg_showfn(name, offset, showfn) ( \
++ &((struct dev_ext_attribute[]) { \
++ { \
++ __ATTR(name, 0444, showfn, NULL), \
++ (void *)(unsigned long)offset \
++ } \
++ })[0].attr.attr \
++ )
++
++/* macro using the default coresight_etm4x_reg_show function */
++#define coresight_etm4x_reg(name, offset) \
++ coresight_etm4x_reg_showfn(name, offset, coresight_etm4x_reg_show)
+
+ static struct attribute *coresight_etmv4_mgmt_attrs[] = {
+ coresight_etm4x_reg(trcpdcr, TRCPDCR),
+@@ -2549,7 +2558,7 @@ static struct attribute *coresight_etmv4_mgmt_attrs[] = {
+ coresight_etm4x_reg(trcpidr3, TRCPIDR3),
+ coresight_etm4x_reg(trcoslsr, TRCOSLSR),
+ coresight_etm4x_reg(trcconfig, TRCCONFIGR),
+- &dev_attr_trctraceid.attr,
++ coresight_etm4x_reg_showfn(trctraceid, TRCTRACEIDR, trctraceid_show),
+ coresight_etm4x_reg(trcdevarch, TRCDEVARCH),
+ NULL,
+ };
+diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
+index 30f1525639b57..4140efd664097 100644
+--- a/drivers/hwtracing/ptt/hisi_ptt.c
++++ b/drivers/hwtracing/ptt/hisi_ptt.c
+@@ -341,13 +341,13 @@ static int hisi_ptt_register_irq(struct hisi_ptt *hisi_ptt)
+ if (ret < 0)
+ return ret;
+
+- ret = devm_request_threaded_irq(&pdev->dev,
+- pci_irq_vector(pdev, HISI_PTT_TRACE_DMA_IRQ),
++ hisi_ptt->trace_irq = pci_irq_vector(pdev, HISI_PTT_TRACE_DMA_IRQ);
++ ret = devm_request_threaded_irq(&pdev->dev, hisi_ptt->trace_irq,
+ NULL, hisi_ptt_isr, 0,
+ DRV_NAME, hisi_ptt);
+ if (ret) {
+ pci_err(pdev, "failed to request irq %d, ret = %d\n",
+- pci_irq_vector(pdev, HISI_PTT_TRACE_DMA_IRQ), ret);
++ hisi_ptt->trace_irq, ret);
+ return ret;
+ }
+
+@@ -757,8 +757,7 @@ static void hisi_ptt_pmu_start(struct perf_event *event, int flags)
+ * core in event_function_local(). If CPU passed is offline we'll fail
+ * here, just log it since we can do nothing here.
+ */
+- ret = irq_set_affinity(pci_irq_vector(hisi_ptt->pdev, HISI_PTT_TRACE_DMA_IRQ),
+- cpumask_of(cpu));
++ ret = irq_set_affinity(hisi_ptt->trace_irq, cpumask_of(cpu));
+ if (ret)
+ dev_warn(dev, "failed to set the affinity of trace interrupt\n");
+
+@@ -1018,8 +1017,7 @@ static int hisi_ptt_cpu_teardown(unsigned int cpu, struct hlist_node *node)
+ * Also make sure the interrupt bind to the migrated CPU as well. Warn
+ * the user on failure here.
+ */
+- if (irq_set_affinity(pci_irq_vector(hisi_ptt->pdev, HISI_PTT_TRACE_DMA_IRQ),
+- cpumask_of(target)))
++ if (irq_set_affinity(hisi_ptt->trace_irq, cpumask_of(target)))
+ dev_warn(dev, "failed to set the affinity of trace interrupt\n");
+
+ hisi_ptt->trace_ctrl.on_cpu = target;
+diff --git a/drivers/hwtracing/ptt/hisi_ptt.h b/drivers/hwtracing/ptt/hisi_ptt.h
+index 5beb1648c93ab..948a4c4231527 100644
+--- a/drivers/hwtracing/ptt/hisi_ptt.h
++++ b/drivers/hwtracing/ptt/hisi_ptt.h
+@@ -166,6 +166,7 @@ struct hisi_ptt_pmu_buf {
+ * @pdev: pci_dev of this PTT device
+ * @tune_lock: lock to serialize the tune process
+ * @pmu_lock: lock to serialize the perf process
++ * @trace_irq: interrupt number used by trace
+ * @upper_bdf: the upper BDF range of the PCI devices managed by this PTT device
+ * @lower_bdf: the lower BDF range of the PCI devices managed by this PTT device
+ * @port_filters: the filter list of root ports
+@@ -180,6 +181,7 @@ struct hisi_ptt {
+ struct pci_dev *pdev;
+ struct mutex tune_lock;
+ spinlock_t pmu_lock;
++ int trace_irq;
+ u32 upper_bdf;
+ u32 lower_bdf;
+
+diff --git a/drivers/i2c/busses/i2c-designware-pcidrv.c b/drivers/i2c/busses/i2c-designware-pcidrv.c
+index 782fe1ef3ca10..61d7a27aa0701 100644
+--- a/drivers/i2c/busses/i2c-designware-pcidrv.c
++++ b/drivers/i2c/busses/i2c-designware-pcidrv.c
+@@ -20,6 +20,7 @@
+ #include <linux/module.h>
+ #include <linux/pci.h>
+ #include <linux/pm_runtime.h>
++#include <linux/power_supply.h>
+ #include <linux/sched.h>
+ #include <linux/slab.h>
+
+@@ -234,6 +235,16 @@ static const struct dev_pm_ops i2c_dw_pm_ops = {
+ SET_RUNTIME_PM_OPS(i2c_dw_pci_runtime_suspend, i2c_dw_pci_runtime_resume, NULL)
+ };
+
++static const struct property_entry dgpu_properties[] = {
++ /* USB-C doesn't power the system */
++ PROPERTY_ENTRY_U8("scope", POWER_SUPPLY_SCOPE_DEVICE),
++ {}
++};
++
++static const struct software_node dgpu_node = {
++ .properties = dgpu_properties,
++};
++
+ static int i2c_dw_pci_probe(struct pci_dev *pdev,
+ const struct pci_device_id *id)
+ {
+@@ -325,7 +336,7 @@ static int i2c_dw_pci_probe(struct pci_dev *pdev,
+ }
+
+ if ((dev->flags & MODEL_MASK) == MODEL_AMD_NAVI_GPU) {
+- dev->slave = i2c_new_ccgx_ucsi(&dev->adapter, dev->irq, NULL);
++ dev->slave = i2c_new_ccgx_ucsi(&dev->adapter, dev->irq, &dgpu_node);
+ if (IS_ERR(dev->slave))
+ return dev_err_probe(dev->dev, PTR_ERR(dev->slave),
+ "register UCSI failed\n");
+diff --git a/drivers/i2c/busses/i2c-nvidia-gpu.c b/drivers/i2c/busses/i2c-nvidia-gpu.c
+index a8b99e7f6262a..26622d24bb1b2 100644
+--- a/drivers/i2c/busses/i2c-nvidia-gpu.c
++++ b/drivers/i2c/busses/i2c-nvidia-gpu.c
+@@ -14,6 +14,7 @@
+ #include <linux/platform_device.h>
+ #include <linux/pm.h>
+ #include <linux/pm_runtime.h>
++#include <linux/power_supply.h>
+
+ #include <asm/unaligned.h>
+
+@@ -261,6 +262,8 @@ MODULE_DEVICE_TABLE(pci, gpu_i2c_ids);
+ static const struct property_entry ccgx_props[] = {
+ /* Use FW built for NVIDIA GPU only */
+ PROPERTY_ENTRY_STRING("firmware-name", "nvidia,gpu"),
++ /* USB-C doesn't power the system */
++ PROPERTY_ENTRY_U8("scope", POWER_SUPPLY_SCOPE_DEVICE),
+ { }
+ };
+
+diff --git a/drivers/i2c/busses/i2c-xiic.c b/drivers/i2c/busses/i2c-xiic.c
+index 8a3d9817cb41c..ee6edc963deac 100644
+--- a/drivers/i2c/busses/i2c-xiic.c
++++ b/drivers/i2c/busses/i2c-xiic.c
+@@ -721,6 +721,8 @@ static irqreturn_t xiic_process(int irq, void *dev_id)
+ wakeup_req = 1;
+ wakeup_code = STATE_ERROR;
+ }
++ /* don't try to handle other events */
++ goto out;
+ }
+ if (pend & XIIC_INTR_RX_FULL_MASK) {
+ /* Receive register/FIFO is full */
+diff --git a/drivers/i3c/master/svc-i3c-master.c b/drivers/i3c/master/svc-i3c-master.c
+index e3f454123805e..79b08942a925d 100644
+--- a/drivers/i3c/master/svc-i3c-master.c
++++ b/drivers/i3c/master/svc-i3c-master.c
+@@ -1090,12 +1090,6 @@ static void svc_i3c_master_start_xfer_locked(struct svc_i3c_master *master)
+ if (!xfer)
+ return;
+
+- ret = pm_runtime_resume_and_get(master->dev);
+- if (ret < 0) {
+- dev_err(master->dev, "<%s> Cannot get runtime PM.\n", __func__);
+- return;
+- }
+-
+ svc_i3c_master_clear_merrwarn(master);
+ svc_i3c_master_flush_fifo(master);
+
+@@ -1110,9 +1104,6 @@ static void svc_i3c_master_start_xfer_locked(struct svc_i3c_master *master)
+ break;
+ }
+
+- pm_runtime_mark_last_busy(master->dev);
+- pm_runtime_put_autosuspend(master->dev);
+-
+ xfer->ret = ret;
+ complete(&xfer->comp);
+
+@@ -1133,6 +1124,13 @@ static void svc_i3c_master_enqueue_xfer(struct svc_i3c_master *master,
+ struct svc_i3c_xfer *xfer)
+ {
+ unsigned long flags;
++ int ret;
++
++ ret = pm_runtime_resume_and_get(master->dev);
++ if (ret < 0) {
++ dev_err(master->dev, "<%s> Cannot get runtime PM.\n", __func__);
++ return;
++ }
+
+ init_completion(&xfer->comp);
+ spin_lock_irqsave(&master->xferqueue.lock, flags);
+@@ -1143,6 +1141,9 @@ static void svc_i3c_master_enqueue_xfer(struct svc_i3c_master *master,
+ svc_i3c_master_start_xfer_locked(master);
+ }
+ spin_unlock_irqrestore(&master->xferqueue.lock, flags);
++
++ pm_runtime_mark_last_busy(master->dev);
++ pm_runtime_put_autosuspend(master->dev);
+ }
+
+ static bool
+diff --git a/drivers/iio/accel/fxls8962af-core.c b/drivers/iio/accel/fxls8962af-core.c
+index 0d672b1469e8d..be8a15cb945fd 100644
+--- a/drivers/iio/accel/fxls8962af-core.c
++++ b/drivers/iio/accel/fxls8962af-core.c
+@@ -724,8 +724,7 @@ static const struct iio_event_spec fxls8962af_event[] = {
+ .sign = 's', \
+ .realbits = 12, \
+ .storagebits = 16, \
+- .shift = 4, \
+- .endianness = IIO_BE, \
++ .endianness = IIO_LE, \
+ }, \
+ .event_spec = fxls8962af_event, \
+ .num_event_specs = ARRAY_SIZE(fxls8962af_event), \
+@@ -904,9 +903,10 @@ static int fxls8962af_fifo_transfer(struct fxls8962af_data *data,
+ int total_length = samples * sample_length;
+ int ret;
+
+- if (i2c_verify_client(dev))
++ if (i2c_verify_client(dev) &&
++ data->chip_info->chip_id == FXLS8962AF_DEVICE_ID)
+ /*
+- * Due to errata bug:
++ * Due to errata bug (only applicable on fxls8962af):
+ * E3: FIFO burst read operation error using I2C interface
+ * We have to avoid burst reads on I2C..
+ */
+diff --git a/drivers/iio/adc/ad7192.c b/drivers/iio/adc/ad7192.c
+index 99bb604b78c8c..8685e0b58a838 100644
+--- a/drivers/iio/adc/ad7192.c
++++ b/drivers/iio/adc/ad7192.c
+@@ -367,7 +367,7 @@ static int ad7192_of_clock_select(struct ad7192_state *st)
+ clock_sel = AD7192_CLK_INT;
+
+ /* use internal clock */
+- if (st->mclk) {
++ if (!st->mclk) {
+ if (of_property_read_bool(np, "adi,int-clock-output-enable"))
+ clock_sel = AD7192_CLK_INT_CO;
+ } else {
+@@ -380,9 +380,9 @@ static int ad7192_of_clock_select(struct ad7192_state *st)
+ return clock_sel;
+ }
+
+-static int ad7192_setup(struct ad7192_state *st, struct device_node *np)
++static int ad7192_setup(struct iio_dev *indio_dev, struct device_node *np)
+ {
+- struct iio_dev *indio_dev = spi_get_drvdata(st->sd.spi);
++ struct ad7192_state *st = iio_priv(indio_dev);
+ bool rej60_en, refin2_en;
+ bool buf_en, bipolar, burnout_curr_en;
+ unsigned long long scale_uv;
+@@ -1069,7 +1069,7 @@ static int ad7192_probe(struct spi_device *spi)
+ }
+ }
+
+- ret = ad7192_setup(st, spi->dev.of_node);
++ ret = ad7192_setup(indio_dev, spi->dev.of_node);
+ if (ret)
+ return ret;
+
+diff --git a/drivers/iio/addac/ad74413r.c b/drivers/iio/addac/ad74413r.c
+index e3366cf5eb319..6b0e8218f1507 100644
+--- a/drivers/iio/addac/ad74413r.c
++++ b/drivers/iio/addac/ad74413r.c
+@@ -1317,13 +1317,14 @@ static int ad74413r_setup_gpios(struct ad74413r_state *st)
+ }
+
+ if (config->func == CH_FUNC_DIGITAL_INPUT_LOGIC ||
+- config->func == CH_FUNC_DIGITAL_INPUT_LOOP_POWER)
++ config->func == CH_FUNC_DIGITAL_INPUT_LOOP_POWER) {
+ st->comp_gpio_offsets[comp_gpio_i++] = i;
+
+- strength = config->drive_strength;
+- ret = ad74413r_set_comp_drive_strength(st, i, strength);
+- if (ret)
+- return ret;
++ strength = config->drive_strength;
++ ret = ad74413r_set_comp_drive_strength(st, i, strength);
++ if (ret)
++ return ret;
++ }
+
+ ret = ad74413r_set_gpo_config(st, i, gpo_config);
+ if (ret)
+diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
+index 3073398a21834..1936f4b4002a7 100644
+--- a/drivers/infiniband/hw/bnxt_re/main.c
++++ b/drivers/infiniband/hw/bnxt_re/main.c
+@@ -283,15 +283,21 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
+ for (indx = 0; indx < rdev->num_msix; indx++)
+ rdev->en_dev->msix_entries[indx].vector = ent[indx].vector;
+
+- bnxt_qplib_rcfw_start_irq(rcfw, msix_ent[BNXT_RE_AEQ_IDX].vector,
+- false);
++ rc = bnxt_qplib_rcfw_start_irq(rcfw, msix_ent[BNXT_RE_AEQ_IDX].vector,
++ false);
++ if (rc) {
++ ibdev_warn(&rdev->ibdev, "Failed to reinit CREQ\n");
++ return;
++ }
+ for (indx = BNXT_RE_NQ_IDX ; indx < rdev->num_msix; indx++) {
+ nq = &rdev->nq[indx - 1];
+ rc = bnxt_qplib_nq_start_irq(nq, indx - 1,
+ msix_ent[indx].vector, false);
+- if (rc)
++ if (rc) {
+ ibdev_warn(&rdev->ibdev, "Failed to reinit NQ index %d\n",
+ indx - 1);
++ return;
++ }
+ }
+ }
+
+@@ -963,12 +969,6 @@ static int bnxt_re_update_gid(struct bnxt_re_dev *rdev)
+ if (!ib_device_try_get(&rdev->ibdev))
+ return 0;
+
+- if (!sgid_tbl) {
+- ibdev_err(&rdev->ibdev, "QPLIB: SGID table not allocated");
+- rc = -EINVAL;
+- goto out;
+- }
+-
+ for (index = 0; index < sgid_tbl->active; index++) {
+ gid_idx = sgid_tbl->hw_id[index];
+
+@@ -986,7 +986,7 @@ static int bnxt_re_update_gid(struct bnxt_re_dev *rdev)
+ rc = bnxt_qplib_update_sgid(sgid_tbl, &gid, gid_idx,
+ rdev->qplib_res.netdev->dev_addr);
+ }
+-out:
++
+ ib_device_put(&rdev->ibdev);
+ return rc;
+ }
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+index 8974f6235cfaa..55f092c2c8a88 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
++++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+@@ -399,6 +399,9 @@ static irqreturn_t bnxt_qplib_nq_irq(int irq, void *dev_instance)
+
+ void bnxt_qplib_nq_stop_irq(struct bnxt_qplib_nq *nq, bool kill)
+ {
++ if (!nq->requested)
++ return;
++
+ tasklet_disable(&nq->nq_tasklet);
+ /* Mask h/w interrupt */
+ bnxt_qplib_ring_nq_db(&nq->nq_db.dbinfo, nq->res->cctx, false);
+@@ -406,11 +409,12 @@ void bnxt_qplib_nq_stop_irq(struct bnxt_qplib_nq *nq, bool kill)
+ synchronize_irq(nq->msix_vec);
+ if (kill)
+ tasklet_kill(&nq->nq_tasklet);
+- if (nq->requested) {
+- irq_set_affinity_hint(nq->msix_vec, NULL);
+- free_irq(nq->msix_vec, nq);
+- nq->requested = false;
+- }
++
++ irq_set_affinity_hint(nq->msix_vec, NULL);
++ free_irq(nq->msix_vec, nq);
++ kfree(nq->name);
++ nq->name = NULL;
++ nq->requested = false;
+ }
+
+ void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq)
+@@ -436,6 +440,7 @@ void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq)
+ int bnxt_qplib_nq_start_irq(struct bnxt_qplib_nq *nq, int nq_indx,
+ int msix_vector, bool need_init)
+ {
++ struct bnxt_qplib_res *res = nq->res;
+ int rc;
+
+ if (nq->requested)
+@@ -447,10 +452,17 @@ int bnxt_qplib_nq_start_irq(struct bnxt_qplib_nq *nq, int nq_indx,
+ else
+ tasklet_enable(&nq->nq_tasklet);
+
+- snprintf(nq->name, sizeof(nq->name), "bnxt_qplib_nq-%d", nq_indx);
++ nq->name = kasprintf(GFP_KERNEL, "bnxt_re-nq-%d@pci:%s",
++ nq_indx, pci_name(res->pdev));
++ if (!nq->name)
++ return -ENOMEM;
+ rc = request_irq(nq->msix_vec, bnxt_qplib_nq_irq, 0, nq->name, nq);
+- if (rc)
++ if (rc) {
++ kfree(nq->name);
++ nq->name = NULL;
++ tasklet_disable(&nq->nq_tasklet);
+ return rc;
++ }
+
+ cpumask_clear(&nq->mask);
+ cpumask_set_cpu(nq_indx, &nq->mask);
+@@ -461,7 +473,7 @@ int bnxt_qplib_nq_start_irq(struct bnxt_qplib_nq *nq, int nq_indx,
+ nq->msix_vec, nq_indx);
+ }
+ nq->requested = true;
+- bnxt_qplib_ring_nq_db(&nq->nq_db.dbinfo, nq->res->cctx, true);
++ bnxt_qplib_ring_nq_db(&nq->nq_db.dbinfo, res->cctx, true);
+
+ return rc;
+ }
+@@ -1614,7 +1626,7 @@ static int bnxt_qplib_put_inline(struct bnxt_qplib_qp *qp,
+ il_src = (void *)wqe->sg_list[indx].addr;
+ t_len += len;
+ if (t_len > qp->max_inline_data)
+- goto bad;
++ return -ENOMEM;
+ while (len) {
+ if (pull_dst) {
+ pull_dst = false;
+@@ -1638,8 +1650,6 @@ static int bnxt_qplib_put_inline(struct bnxt_qplib_qp *qp,
+ }
+
+ return t_len;
+-bad:
+- return -ENOMEM;
+ }
+
+ static u32 bnxt_qplib_put_sges(struct bnxt_qplib_hwq *hwq,
+@@ -2069,7 +2079,7 @@ int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
+ hwq_attr.sginfo = &cq->sg_info;
+ rc = bnxt_qplib_alloc_init_hwq(&cq->hwq, &hwq_attr);
+ if (rc)
+- goto exit;
++ return rc;
+
+ bnxt_qplib_rcfw_cmd_prep((struct cmdq_base *)&req,
+ CMDQ_BASE_OPCODE_CREATE_CQ,
+@@ -2112,7 +2122,6 @@ int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
+
+ fail:
+ bnxt_qplib_free_hwq(res, &cq->hwq);
+-exit:
+ return rc;
+ }
+
+@@ -2790,11 +2799,8 @@ static int bnxt_qplib_cq_process_terminal(struct bnxt_qplib_cq *cq,
+
+ qp = (struct bnxt_qplib_qp *)((unsigned long)
+ le64_to_cpu(hwcqe->qp_handle));
+- if (!qp) {
+- dev_err(&cq->hwq.pdev->dev,
+- "FP: CQ Process terminal qp is NULL\n");
++ if (!qp)
+ return -EINVAL;
+- }
+
+ /* Must block new posting of SQ and RQ */
+ qp->state = CMDQ_MODIFY_QP_NEW_STATE_ERR;
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+index d74d5ead2e32a..a42820821c473 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h
++++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+@@ -472,7 +472,7 @@ typedef int (*srqn_handler_t)(struct bnxt_qplib_nq *nq,
+ struct bnxt_qplib_nq {
+ struct pci_dev *pdev;
+ struct bnxt_qplib_res *res;
+- char name[32];
++ char *name;
+ struct bnxt_qplib_hwq hwq;
+ struct bnxt_qplib_nq_db nq_db;
+ u16 ring_id;
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+index de90691031773..c11b8e708844c 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
++++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+@@ -180,7 +180,7 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw,
+ } while (bsize > 0);
+ cmdq->seq_num++;
+
+- cmdq_prod = hwq->prod;
++ cmdq_prod = hwq->prod & 0xFFFF;
+ if (test_bit(FIRMWARE_FIRST_FLAG, &cmdq->flags)) {
+ /* The very first doorbell write
+ * is required to set this flag
+@@ -295,7 +295,8 @@ static int bnxt_qplib_process_func_event(struct bnxt_qplib_rcfw *rcfw,
+ }
+
+ static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
+- struct creq_qp_event *qp_event)
++ struct creq_qp_event *qp_event,
++ u32 *num_wait)
+ {
+ struct creq_qp_error_notification *err_event;
+ struct bnxt_qplib_hwq *hwq = &rcfw->cmdq.hwq;
+@@ -304,6 +305,7 @@ static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
+ u16 cbit, blocked = 0;
+ struct pci_dev *pdev;
+ unsigned long flags;
++ u32 wait_cmds = 0;
+ __le16 mcookie;
+ u16 cookie;
+ int rc = 0;
+@@ -363,9 +365,10 @@ static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
+ crsqe->req_size = 0;
+
+ if (!blocked)
+- wake_up(&rcfw->cmdq.waitq);
++ wait_cmds++;
+ spin_unlock_irqrestore(&hwq->lock, flags);
+ }
++ *num_wait += wait_cmds;
+ return rc;
+ }
+
+@@ -379,6 +382,7 @@ static void bnxt_qplib_service_creq(struct tasklet_struct *t)
+ struct creq_base *creqe;
+ u32 sw_cons, raw_cons;
+ unsigned long flags;
++ u32 num_wakeup = 0;
+
+ /* Service the CREQ until budget is over */
+ spin_lock_irqsave(&hwq->lock, flags);
+@@ -397,7 +401,8 @@ static void bnxt_qplib_service_creq(struct tasklet_struct *t)
+ switch (type) {
+ case CREQ_BASE_TYPE_QP_EVENT:
+ bnxt_qplib_process_qp_event
+- (rcfw, (struct creq_qp_event *)creqe);
++ (rcfw, (struct creq_qp_event *)creqe,
++ &num_wakeup);
+ creq->stats.creq_qp_event_processed++;
+ break;
+ case CREQ_BASE_TYPE_FUNC_EVENT:
+@@ -425,6 +430,8 @@ static void bnxt_qplib_service_creq(struct tasklet_struct *t)
+ rcfw->res->cctx, true);
+ }
+ spin_unlock_irqrestore(&hwq->lock, flags);
++ if (num_wakeup)
++ wake_up_nr(&rcfw->cmdq.waitq, num_wakeup);
+ }
+
+ static irqreturn_t bnxt_qplib_creq_irq(int irq, void *dev_instance)
+@@ -599,7 +606,7 @@ int bnxt_qplib_alloc_rcfw_channel(struct bnxt_qplib_res *res,
+ rcfw->cmdq_depth = BNXT_QPLIB_CMDQE_MAX_CNT_8192;
+
+ sginfo.pgsize = bnxt_qplib_cmdqe_page_size(rcfw->cmdq_depth);
+- hwq_attr.depth = rcfw->cmdq_depth;
++ hwq_attr.depth = rcfw->cmdq_depth & 0x7FFFFFFF;
+ hwq_attr.stride = BNXT_QPLIB_CMDQE_UNITS;
+ hwq_attr.type = HWQ_TYPE_CTX;
+ if (bnxt_qplib_alloc_init_hwq(&cmdq->hwq, &hwq_attr)) {
+@@ -636,6 +643,10 @@ void bnxt_qplib_rcfw_stop_irq(struct bnxt_qplib_rcfw *rcfw, bool kill)
+ struct bnxt_qplib_creq_ctx *creq;
+
+ creq = &rcfw->creq;
++
++ if (!creq->requested)
++ return;
++
+ tasklet_disable(&creq->creq_tasklet);
+ /* Mask h/w interrupts */
+ bnxt_qplib_ring_nq_db(&creq->creq_db.dbinfo, rcfw->res->cctx, false);
+@@ -644,10 +655,10 @@ void bnxt_qplib_rcfw_stop_irq(struct bnxt_qplib_rcfw *rcfw, bool kill)
+ if (kill)
+ tasklet_kill(&creq->creq_tasklet);
+
+- if (creq->requested) {
+- free_irq(creq->msix_vec, rcfw);
+- creq->requested = false;
+- }
++ free_irq(creq->msix_vec, rcfw);
++ kfree(creq->irq_name);
++ creq->irq_name = NULL;
++ creq->requested = false;
+ }
+
+ void bnxt_qplib_disable_rcfw_channel(struct bnxt_qplib_rcfw *rcfw)
+@@ -679,9 +690,11 @@ int bnxt_qplib_rcfw_start_irq(struct bnxt_qplib_rcfw *rcfw, int msix_vector,
+ bool need_init)
+ {
+ struct bnxt_qplib_creq_ctx *creq;
++ struct bnxt_qplib_res *res;
+ int rc;
+
+ creq = &rcfw->creq;
++ res = rcfw->res;
+
+ if (creq->requested)
+ return -EFAULT;
+@@ -691,13 +704,22 @@ int bnxt_qplib_rcfw_start_irq(struct bnxt_qplib_rcfw *rcfw, int msix_vector,
+ tasklet_setup(&creq->creq_tasklet, bnxt_qplib_service_creq);
+ else
+ tasklet_enable(&creq->creq_tasklet);
++
++ creq->irq_name = kasprintf(GFP_KERNEL, "bnxt_re-creq@pci:%s",
++ pci_name(res->pdev));
++ if (!creq->irq_name)
++ return -ENOMEM;
+ rc = request_irq(creq->msix_vec, bnxt_qplib_creq_irq, 0,
+- "bnxt_qplib_creq", rcfw);
+- if (rc)
++ creq->irq_name, rcfw);
++ if (rc) {
++ kfree(creq->irq_name);
++ creq->irq_name = NULL;
++ tasklet_disable(&creq->creq_tasklet);
+ return rc;
++ }
+ creq->requested = true;
+
+- bnxt_qplib_ring_nq_db(&creq->creq_db.dbinfo, rcfw->res->cctx, true);
++ bnxt_qplib_ring_nq_db(&creq->creq_db.dbinfo, res->cctx, true);
+
+ return 0;
+ }
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
+index dd5651478bbb7..92f7a25533d3b 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
++++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
+@@ -186,6 +186,7 @@ struct bnxt_qplib_creq_ctx {
+ u16 ring_id;
+ int msix_vec;
+ bool requested; /*irq handler installed */
++ char *irq_name;
+ };
+
+ /* RCFW Communication Channels */
+diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
+index 8973a081d641e..e7d831330278d 100644
+--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
++++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
+@@ -215,11 +215,11 @@ static int hfi1_ipoib_build_ulp_payload(struct ipoib_txreq *tx,
+ const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+ ret = sdma_txadd_page(dd,
+- NULL,
+ txreq,
+ skb_frag_page(frag),
+ frag->bv_offset,
+- skb_frag_size(frag));
++ skb_frag_size(frag),
++ NULL, NULL, NULL);
+ if (unlikely(ret))
+ break;
+ }
+diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
+index 1cea8b0c78e0f..a864423c256dd 100644
+--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
++++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
+@@ -19,8 +19,7 @@ static int mmu_notifier_range_start(struct mmu_notifier *,
+ const struct mmu_notifier_range *);
+ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *,
+ unsigned long, unsigned long);
+-static void do_remove(struct mmu_rb_handler *handler,
+- struct list_head *del_list);
++static void release_immediate(struct kref *refcount);
+ static void handle_remove(struct work_struct *work);
+
+ static const struct mmu_notifier_ops mn_opts = {
+@@ -106,7 +105,11 @@ void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler)
+ }
+ spin_unlock_irqrestore(&handler->lock, flags);
+
+- do_remove(handler, &del_list);
++ while (!list_empty(&del_list)) {
++ rbnode = list_first_entry(&del_list, struct mmu_rb_node, list);
++ list_del(&rbnode->list);
++ kref_put(&rbnode->refcount, release_immediate);
++ }
+
+ /* Now the mm may be freed. */
+ mmdrop(handler->mn.mm);
+@@ -134,12 +137,6 @@ int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
+ }
+ __mmu_int_rb_insert(mnode, &handler->root);
+ list_add_tail(&mnode->list, &handler->lru_list);
+-
+- ret = handler->ops->insert(handler->ops_arg, mnode);
+- if (ret) {
+- __mmu_int_rb_remove(mnode, &handler->root);
+- list_del(&mnode->list); /* remove from LRU list */
+- }
+ mnode->handler = handler;
+ unlock:
+ spin_unlock_irqrestore(&handler->lock, flags);
+@@ -183,6 +180,48 @@ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
+ return node;
+ }
+
++/*
++ * Must NOT call while holding mnode->handler->lock.
++ * mnode->handler->ops->remove() may sleep and mnode->handler->lock is a
++ * spinlock.
++ */
++static void release_immediate(struct kref *refcount)
++{
++ struct mmu_rb_node *mnode =
++ container_of(refcount, struct mmu_rb_node, refcount);
++ mnode->handler->ops->remove(mnode->handler->ops_arg, mnode);
++}
++
++/* Caller must hold mnode->handler->lock */
++static void release_nolock(struct kref *refcount)
++{
++ struct mmu_rb_node *mnode =
++ container_of(refcount, struct mmu_rb_node, refcount);
++ list_move(&mnode->list, &mnode->handler->del_list);
++ queue_work(mnode->handler->wq, &mnode->handler->del_work);
++}
++
++/*
++ * struct mmu_rb_node->refcount kref_put() callback.
++ * Adds mmu_rb_node to mmu_rb_node->handler->del_list and queues
++ * handler->del_work on handler->wq.
++ * Does not remove mmu_rb_node from handler->lru_list or handler->rb_root.
++ * Acquires mmu_rb_node->handler->lock; do not call while already holding
++ * handler->lock.
++ */
++void hfi1_mmu_rb_release(struct kref *refcount)
++{
++ struct mmu_rb_node *mnode =
++ container_of(refcount, struct mmu_rb_node, refcount);
++ struct mmu_rb_handler *handler = mnode->handler;
++ unsigned long flags;
++
++ spin_lock_irqsave(&handler->lock, flags);
++ list_move(&mnode->list, &mnode->handler->del_list);
++ spin_unlock_irqrestore(&handler->lock, flags);
++ queue_work(handler->wq, &handler->del_work);
++}
++
+ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
+ {
+ struct mmu_rb_node *rbnode, *ptr;
+@@ -197,6 +236,10 @@ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
+
+ spin_lock_irqsave(&handler->lock, flags);
+ list_for_each_entry_safe(rbnode, ptr, &handler->lru_list, list) {
++ /* refcount == 1 implies mmu_rb_handler has only rbnode ref */
++ if (kref_read(&rbnode->refcount) > 1)
++ continue;
++
+ if (handler->ops->evict(handler->ops_arg, rbnode, evict_arg,
+ &stop)) {
+ __mmu_int_rb_remove(rbnode, &handler->root);
+@@ -209,7 +252,7 @@ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
+ spin_unlock_irqrestore(&handler->lock, flags);
+
+ list_for_each_entry_safe(rbnode, ptr, &del_list, list) {
+- handler->ops->remove(handler->ops_arg, rbnode);
++ kref_put(&rbnode->refcount, release_immediate);
+ }
+ }
+
+@@ -221,7 +264,6 @@ static int mmu_notifier_range_start(struct mmu_notifier *mn,
+ struct rb_root_cached *root = &handler->root;
+ struct mmu_rb_node *node, *ptr = NULL;
+ unsigned long flags;
+- bool added = false;
+
+ spin_lock_irqsave(&handler->lock, flags);
+ for (node = __mmu_int_rb_iter_first(root, range->start, range->end-1);
+@@ -230,38 +272,16 @@ static int mmu_notifier_range_start(struct mmu_notifier *mn,
+ ptr = __mmu_int_rb_iter_next(node, range->start,
+ range->end - 1);
+ trace_hfi1_mmu_mem_invalidate(node->addr, node->len);
+- if (handler->ops->invalidate(handler->ops_arg, node)) {
+- __mmu_int_rb_remove(node, root);
+- /* move from LRU list to delete list */
+- list_move(&node->list, &handler->del_list);
+- added = true;
+- }
++ /* Remove from rb tree and lru_list. */
++ __mmu_int_rb_remove(node, root);
++ list_del_init(&node->list);
++ kref_put(&node->refcount, release_nolock);
+ }
+ spin_unlock_irqrestore(&handler->lock, flags);
+
+- if (added)
+- queue_work(handler->wq, &handler->del_work);
+-
+ return 0;
+ }
+
+-/*
+- * Call the remove function for the given handler and the list. This
+- * is expected to be called with a delete list extracted from handler.
+- * The caller should not be holding the handler lock.
+- */
+-static void do_remove(struct mmu_rb_handler *handler,
+- struct list_head *del_list)
+-{
+- struct mmu_rb_node *node;
+-
+- while (!list_empty(del_list)) {
+- node = list_first_entry(del_list, struct mmu_rb_node, list);
+- list_del(&node->list);
+- handler->ops->remove(handler->ops_arg, node);
+- }
+-}
+-
+ /*
+ * Work queue function to remove all nodes that have been queued up to
+ * be removed. The key feature is that mm->mmap_lock is not being held
+@@ -274,11 +294,16 @@ static void handle_remove(struct work_struct *work)
+ del_work);
+ struct list_head del_list;
+ unsigned long flags;
++ struct mmu_rb_node *node;
+
+ /* remove anything that is queued to get removed */
+ spin_lock_irqsave(&handler->lock, flags);
+ list_replace_init(&handler->del_list, &del_list);
+ spin_unlock_irqrestore(&handler->lock, flags);
+
+- do_remove(handler, &del_list);
++ while (!list_empty(&del_list)) {
++ node = list_first_entry(&del_list, struct mmu_rb_node, list);
++ list_del(&node->list);
++ handler->ops->remove(handler->ops_arg, node);
++ }
+ }
+diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
+index c4da064188c9d..82c505a04fc6d 100644
+--- a/drivers/infiniband/hw/hfi1/mmu_rb.h
++++ b/drivers/infiniband/hw/hfi1/mmu_rb.h
+@@ -16,6 +16,7 @@ struct mmu_rb_node {
+ struct rb_node node;
+ struct mmu_rb_handler *handler;
+ struct list_head list;
++ struct kref refcount;
+ };
+
+ /*
+@@ -61,6 +62,8 @@ int hfi1_mmu_rb_register(void *ops_arg,
+ void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler);
+ int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
+ struct mmu_rb_node *mnode);
++void hfi1_mmu_rb_release(struct kref *refcount);
++
+ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg);
+ struct mmu_rb_node *hfi1_mmu_rb_get_first(struct mmu_rb_handler *handler,
+ unsigned long addr,
+diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
+index bb2552dd29c1e..26c62162759ba 100644
+--- a/drivers/infiniband/hw/hfi1/sdma.c
++++ b/drivers/infiniband/hw/hfi1/sdma.c
+@@ -1593,7 +1593,20 @@ static inline void sdma_unmap_desc(
+ struct hfi1_devdata *dd,
+ struct sdma_desc *descp)
+ {
+- system_descriptor_complete(dd, descp);
++ switch (sdma_mapping_type(descp)) {
++ case SDMA_MAP_SINGLE:
++ dma_unmap_single(&dd->pcidev->dev, sdma_mapping_addr(descp),
++ sdma_mapping_len(descp), DMA_TO_DEVICE);
++ break;
++ case SDMA_MAP_PAGE:
++ dma_unmap_page(&dd->pcidev->dev, sdma_mapping_addr(descp),
++ sdma_mapping_len(descp), DMA_TO_DEVICE);
++ break;
++ }
++
++ if (descp->pinning_ctx && descp->ctx_put)
++ descp->ctx_put(descp->pinning_ctx);
++ descp->pinning_ctx = NULL;
+ }
+
+ /*
+@@ -3113,8 +3126,8 @@ int ext_coal_sdma_tx_descs(struct hfi1_devdata *dd, struct sdma_txreq *tx,
+
+ /* Add descriptor for coalesce buffer */
+ tx->desc_limit = MAX_DESC;
+- return _sdma_txadd_daddr(dd, SDMA_MAP_SINGLE, NULL, tx,
+- addr, tx->tlen);
++ return _sdma_txadd_daddr(dd, SDMA_MAP_SINGLE, tx,
++ addr, tx->tlen, NULL, NULL, NULL);
+ }
+
+ return 1;
+@@ -3157,9 +3170,9 @@ int _pad_sdma_tx_descs(struct hfi1_devdata *dd, struct sdma_txreq *tx)
+ make_tx_sdma_desc(
+ tx,
+ SDMA_MAP_NONE,
+- NULL,
+ dd->sdma_pad_phys,
+- sizeof(u32) - (tx->packet_len & (sizeof(u32) - 1)));
++ sizeof(u32) - (tx->packet_len & (sizeof(u32) - 1)),
++ NULL, NULL, NULL);
+ tx->num_desc++;
+ _sdma_close_tx(dd, tx);
+ return rval;
+diff --git a/drivers/infiniband/hw/hfi1/sdma.h b/drivers/infiniband/hw/hfi1/sdma.h
+index 95aaec14c6c28..7fdebab202c4f 100644
+--- a/drivers/infiniband/hw/hfi1/sdma.h
++++ b/drivers/infiniband/hw/hfi1/sdma.h
+@@ -594,9 +594,11 @@ static inline dma_addr_t sdma_mapping_addr(struct sdma_desc *d)
+ static inline void make_tx_sdma_desc(
+ struct sdma_txreq *tx,
+ int type,
+- void *pinning_ctx,
+ dma_addr_t addr,
+- size_t len)
++ size_t len,
++ void *pinning_ctx,
++ void (*ctx_get)(void *),
++ void (*ctx_put)(void *))
+ {
+ struct sdma_desc *desc = &tx->descp[tx->num_desc];
+
+@@ -613,7 +615,11 @@ static inline void make_tx_sdma_desc(
+ << SDMA_DESC0_PHY_ADDR_SHIFT) |
+ (((u64)len & SDMA_DESC0_BYTE_COUNT_MASK)
+ << SDMA_DESC0_BYTE_COUNT_SHIFT);
++
+ desc->pinning_ctx = pinning_ctx;
++ desc->ctx_put = ctx_put;
++ if (pinning_ctx && ctx_get)
++ ctx_get(pinning_ctx);
+ }
+
+ /* helper to extend txreq */
+@@ -645,18 +651,20 @@ static inline void _sdma_close_tx(struct hfi1_devdata *dd,
+ static inline int _sdma_txadd_daddr(
+ struct hfi1_devdata *dd,
+ int type,
+- void *pinning_ctx,
+ struct sdma_txreq *tx,
+ dma_addr_t addr,
+- u16 len)
++ u16 len,
++ void *pinning_ctx,
++ void (*ctx_get)(void *),
++ void (*ctx_put)(void *))
+ {
+ int rval = 0;
+
+ make_tx_sdma_desc(
+ tx,
+ type,
+- pinning_ctx,
+- addr, len);
++ addr, len,
++ pinning_ctx, ctx_get, ctx_put);
+ WARN_ON(len > tx->tlen);
+ tx->num_desc++;
+ tx->tlen -= len;
+@@ -676,11 +684,18 @@ static inline int _sdma_txadd_daddr(
+ /**
+ * sdma_txadd_page() - add a page to the sdma_txreq
+ * @dd: the device to use for mapping
+- * @pinning_ctx: context to be released at descriptor retirement
+ * @tx: tx request to which the page is added
+ * @page: page to map
+ * @offset: offset within the page
+ * @len: length in bytes
++ * @pinning_ctx: context to be stored on struct sdma_desc .pinning_ctx. Not
++ * added if coalesce buffer is used. E.g. pointer to pinned-page
++ * cache entry for the sdma_desc.
++ * @ctx_get: optional function to take reference to @pinning_ctx. Not called if
++ * @pinning_ctx is NULL.
++ * @ctx_put: optional function to release reference to @pinning_ctx after
++ * sdma_desc completes. May be called in interrupt context so must
++ * not sleep. Not called if @pinning_ctx is NULL.
+ *
+ * This is used to add a page/offset/length descriptor.
+ *
+@@ -692,11 +707,13 @@ static inline int _sdma_txadd_daddr(
+ */
+ static inline int sdma_txadd_page(
+ struct hfi1_devdata *dd,
+- void *pinning_ctx,
+ struct sdma_txreq *tx,
+ struct page *page,
+ unsigned long offset,
+- u16 len)
++ u16 len,
++ void *pinning_ctx,
++ void (*ctx_get)(void *),
++ void (*ctx_put)(void *))
+ {
+ dma_addr_t addr;
+ int rval;
+@@ -720,7 +737,8 @@ static inline int sdma_txadd_page(
+ return -ENOSPC;
+ }
+
+- return _sdma_txadd_daddr(dd, SDMA_MAP_PAGE, pinning_ctx, tx, addr, len);
++ return _sdma_txadd_daddr(dd, SDMA_MAP_PAGE, tx, addr, len,
++ pinning_ctx, ctx_get, ctx_put);
+ }
+
+ /**
+@@ -754,8 +772,8 @@ static inline int sdma_txadd_daddr(
+ return rval;
+ }
+
+- return _sdma_txadd_daddr(dd, SDMA_MAP_NONE, NULL, tx,
+- addr, len);
++ return _sdma_txadd_daddr(dd, SDMA_MAP_NONE, tx, addr, len,
++ NULL, NULL, NULL);
+ }
+
+ /**
+@@ -801,7 +819,8 @@ static inline int sdma_txadd_kvaddr(
+ return -ENOSPC;
+ }
+
+- return _sdma_txadd_daddr(dd, SDMA_MAP_SINGLE, NULL, tx, addr, len);
++ return _sdma_txadd_daddr(dd, SDMA_MAP_SINGLE, tx, addr, len,
++ NULL, NULL, NULL);
+ }
+
+ struct iowait_work;
+@@ -1034,6 +1053,4 @@ u16 sdma_get_descq_cnt(void);
+ extern uint mod_num_sdma;
+
+ void sdma_update_lmc(struct hfi1_devdata *dd, u64 mask, u32 lid);
+-
+-void system_descriptor_complete(struct hfi1_devdata *dd, struct sdma_desc *descp);
+ #endif
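
The hunks above fold descriptor teardown into sdma_unmap_desc() and thread an opaque pinning context plus optional get/put callbacks through make_tx_sdma_desc() and _sdma_txadd_daddr(), so the context is referenced while a descriptor is queued and released when it retires. A minimal standalone sketch of that callback contract follows; the demo_* names are illustrative, not hfi1 symbols.

/*
 * Illustrative sketch of the per-descriptor context pattern added above:
 * an opaque owner-provided pointer plus optional get/put callbacks.
 */
struct demo_desc {
        void *pinning_ctx;              /* opaque context supplied by the owner */
        void (*ctx_put)(void *ctx);     /* dropped when the descriptor retires */
};

static void demo_desc_fill(struct demo_desc *d, void *ctx,
                           void (*ctx_get)(void *), void (*ctx_put)(void *))
{
        d->pinning_ctx = ctx;
        d->ctx_put = ctx_put;
        if (ctx && ctx_get)
                ctx_get(ctx);           /* hold a reference while queued */
}

static void demo_desc_complete(struct demo_desc *d)
{
        if (d->pinning_ctx && d->ctx_put)
                d->ctx_put(d->pinning_ctx);     /* release at retirement */
        d->pinning_ctx = NULL;
}
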
+diff --git a/drivers/infiniband/hw/hfi1/sdma_txreq.h b/drivers/infiniband/hw/hfi1/sdma_txreq.h
+index fad946cb5e0d8..85ae7293c2741 100644
+--- a/drivers/infiniband/hw/hfi1/sdma_txreq.h
++++ b/drivers/infiniband/hw/hfi1/sdma_txreq.h
+@@ -20,6 +20,8 @@ struct sdma_desc {
+ /* private: don't use directly */
+ u64 qw[2];
+ void *pinning_ctx;
++ /* Release reference to @pinning_ctx. May be called in interrupt context. Must not sleep. */
++ void (*ctx_put)(void *ctx);
+ };
+
+ /**
+diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
+index ae58b48afe074..02bd62b857b75 100644
+--- a/drivers/infiniband/hw/hfi1/user_sdma.c
++++ b/drivers/infiniband/hw/hfi1/user_sdma.c
+@@ -62,18 +62,14 @@ static int defer_packet_queue(
+ static void activate_packet_queue(struct iowait *wait, int reason);
+ static bool sdma_rb_filter(struct mmu_rb_node *node, unsigned long addr,
+ unsigned long len);
+-static int sdma_rb_insert(void *arg, struct mmu_rb_node *mnode);
+ static int sdma_rb_evict(void *arg, struct mmu_rb_node *mnode,
+ void *arg2, bool *stop);
+ static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode);
+-static int sdma_rb_invalidate(void *arg, struct mmu_rb_node *mnode);
+
+ static struct mmu_rb_ops sdma_rb_ops = {
+ .filter = sdma_rb_filter,
+- .insert = sdma_rb_insert,
+ .evict = sdma_rb_evict,
+ .remove = sdma_rb_remove,
+- .invalidate = sdma_rb_invalidate
+ };
+
+ static int add_system_pages_to_sdma_packet(struct user_sdma_request *req,
+@@ -247,14 +243,14 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd,
+ spin_unlock(&fd->pq_rcu_lock);
+ synchronize_srcu(&fd->pq_srcu);
+ /* at this point there can be no more new requests */
+- if (pq->handler)
+- hfi1_mmu_rb_unregister(pq->handler);
+ iowait_sdma_drain(&pq->busy);
+ /* Wait until all requests have been freed. */
+ wait_event_interruptible(
+ pq->wait,
+ !atomic_read(&pq->n_reqs));
+ kfree(pq->reqs);
++ if (pq->handler)
++ hfi1_mmu_rb_unregister(pq->handler);
+ bitmap_free(pq->req_in_use);
+ kmem_cache_destroy(pq->txreq_cache);
+ flush_pq_iowait(pq);
+@@ -1275,25 +1271,17 @@ static void free_system_node(struct sdma_mmu_node *node)
+ kfree(node);
+ }
+
+-static inline void acquire_node(struct sdma_mmu_node *node)
+-{
+- atomic_inc(&node->refcount);
+- WARN_ON(atomic_read(&node->refcount) < 0);
+-}
+-
+-static inline void release_node(struct mmu_rb_handler *handler,
+- struct sdma_mmu_node *node)
+-{
+- atomic_dec(&node->refcount);
+- WARN_ON(atomic_read(&node->refcount) < 0);
+-}
+-
++/*
++ * kref_get()'s an additional kref on the returned rb_node to prevent rb_node
++ * from being released until after rb_node is assigned to an SDMA descriptor
++ * (struct sdma_desc) under add_system_iovec_to_sdma_packet(), even if the
++ * virtual address range for rb_node is invalidated between now and then.
++ */
+ static struct sdma_mmu_node *find_system_node(struct mmu_rb_handler *handler,
+ unsigned long start,
+ unsigned long end)
+ {
+ struct mmu_rb_node *rb_node;
+- struct sdma_mmu_node *node;
+ unsigned long flags;
+
+ spin_lock_irqsave(&handler->lock, flags);
+@@ -1302,11 +1290,12 @@ static struct sdma_mmu_node *find_system_node(struct mmu_rb_handler *handler,
+ spin_unlock_irqrestore(&handler->lock, flags);
+ return NULL;
+ }
+- node = container_of(rb_node, struct sdma_mmu_node, rb);
+- acquire_node(node);
++
++ /* "safety" kref to prevent release before add_system_iovec_to_sdma_packet() */
++ kref_get(&rb_node->refcount);
+ spin_unlock_irqrestore(&handler->lock, flags);
+
+- return node;
++ return container_of(rb_node, struct sdma_mmu_node, rb);
+ }
+
+ static int pin_system_pages(struct user_sdma_request *req,
+@@ -1355,6 +1344,13 @@ retry:
+ return 0;
+ }
+
++/*
++ * kref refcount on *node_p will be 2 on successful addition: one kref from
++ * kref_init() for mmu_rb_handler and one kref to prevent *node_p from being
++ * released until after *node_p is assigned to an SDMA descriptor (struct
++ * sdma_desc) under add_system_iovec_to_sdma_packet(), even if the virtual
++ * address range for *node_p is invalidated between now and then.
++ */
+ static int add_system_pinning(struct user_sdma_request *req,
+ struct sdma_mmu_node **node_p,
+ unsigned long start, unsigned long len)
+@@ -1368,6 +1364,12 @@ static int add_system_pinning(struct user_sdma_request *req,
+ if (!node)
+ return -ENOMEM;
+
++ /* First kref "moves" to mmu_rb_handler */
++ kref_init(&node->rb.refcount);
++
++ /* "safety" kref to prevent release before add_system_iovec_to_sdma_packet() */
++ kref_get(&node->rb.refcount);
++
+ node->pq = pq;
+ ret = pin_system_pages(req, start, len, node, PFN_DOWN(len));
+ if (ret == 0) {
+@@ -1431,15 +1433,15 @@ static int get_system_cache_entry(struct user_sdma_request *req,
+ return 0;
+ }
+
+- SDMA_DBG(req, "prepend: node->rb.addr %lx, node->refcount %d",
+- node->rb.addr, atomic_read(&node->refcount));
++ SDMA_DBG(req, "prepend: node->rb.addr %lx, node->rb.refcount %d",
++ node->rb.addr, kref_read(&node->rb.refcount));
+ prepend_len = node->rb.addr - start;
+
+ /*
+ * This node will not be returned, instead a new node
+ * will be. So release the reference.
+ */
+- release_node(handler, node);
++ kref_put(&node->rb.refcount, hfi1_mmu_rb_release);
+
+ /* Prepend a node to cover the beginning of the allocation */
+ ret = add_system_pinning(req, node_p, start, prepend_len);
+@@ -1451,6 +1453,20 @@ static int get_system_cache_entry(struct user_sdma_request *req,
+ }
+ }
+
++static void sdma_mmu_rb_node_get(void *ctx)
++{
++ struct mmu_rb_node *node = ctx;
++
++ kref_get(&node->refcount);
++}
++
++static void sdma_mmu_rb_node_put(void *ctx)
++{
++ struct sdma_mmu_node *node = ctx;
++
++ kref_put(&node->rb.refcount, hfi1_mmu_rb_release);
++}
++
+ static int add_mapping_to_sdma_packet(struct user_sdma_request *req,
+ struct user_sdma_txreq *tx,
+ struct sdma_mmu_node *cache_entry,
+@@ -1494,9 +1510,12 @@ static int add_mapping_to_sdma_packet(struct user_sdma_request *req,
+ ctx = cache_entry;
+ }
+
+- ret = sdma_txadd_page(pq->dd, ctx, &tx->txreq,
++ ret = sdma_txadd_page(pq->dd, &tx->txreq,
+ cache_entry->pages[page_index],
+- page_offset, from_this_page);
++ page_offset, from_this_page,
++ ctx,
++ sdma_mmu_rb_node_get,
++ sdma_mmu_rb_node_put);
+ if (ret) {
+ /*
+ * When there's a failure, the entire request is freed by
+@@ -1518,8 +1537,6 @@ static int add_system_iovec_to_sdma_packet(struct user_sdma_request *req,
+ struct user_sdma_iovec *iovec,
+ size_t from_this_iovec)
+ {
+- struct mmu_rb_handler *handler = req->pq->handler;
+-
+ while (from_this_iovec > 0) {
+ struct sdma_mmu_node *cache_entry;
+ size_t from_this_cache_entry;
+@@ -1540,15 +1557,15 @@ static int add_system_iovec_to_sdma_packet(struct user_sdma_request *req,
+
+ ret = add_mapping_to_sdma_packet(req, tx, cache_entry, start,
+ from_this_cache_entry);
++
++ /*
++ * Done adding cache_entry to zero or more sdma_desc. Can
++ * kref_put() the "safety" kref taken under
++ * get_system_cache_entry().
++ */
++ kref_put(&cache_entry->rb.refcount, hfi1_mmu_rb_release);
++
+ if (ret) {
+- /*
+- * We're guaranteed that there will be no descriptor
+- * completion callback that releases this node
+- * because only the last descriptor referencing it
+- * has a context attached, and a failure means the
+- * last descriptor was never added.
+- */
+- release_node(handler, cache_entry);
+ SDMA_DBG(req, "add system segment failed %d", ret);
+ return ret;
+ }
+@@ -1599,42 +1616,12 @@ static int add_system_pages_to_sdma_packet(struct user_sdma_request *req,
+ return 0;
+ }
+
+-void system_descriptor_complete(struct hfi1_devdata *dd,
+- struct sdma_desc *descp)
+-{
+- switch (sdma_mapping_type(descp)) {
+- case SDMA_MAP_SINGLE:
+- dma_unmap_single(&dd->pcidev->dev, sdma_mapping_addr(descp),
+- sdma_mapping_len(descp), DMA_TO_DEVICE);
+- break;
+- case SDMA_MAP_PAGE:
+- dma_unmap_page(&dd->pcidev->dev, sdma_mapping_addr(descp),
+- sdma_mapping_len(descp), DMA_TO_DEVICE);
+- break;
+- }
+-
+- if (descp->pinning_ctx) {
+- struct sdma_mmu_node *node = descp->pinning_ctx;
+-
+- release_node(node->rb.handler, node);
+- }
+-}
+-
+ static bool sdma_rb_filter(struct mmu_rb_node *node, unsigned long addr,
+ unsigned long len)
+ {
+ return (bool)(node->addr == addr);
+ }
+
+-static int sdma_rb_insert(void *arg, struct mmu_rb_node *mnode)
+-{
+- struct sdma_mmu_node *node =
+- container_of(mnode, struct sdma_mmu_node, rb);
+-
+- atomic_inc(&node->refcount);
+- return 0;
+-}
+-
+ /*
+ * Return 1 to remove the node from the rb tree and call the remove op.
+ *
+@@ -1647,10 +1634,6 @@ static int sdma_rb_evict(void *arg, struct mmu_rb_node *mnode,
+ container_of(mnode, struct sdma_mmu_node, rb);
+ struct evict_data *evict_data = evict_arg;
+
+- /* is this node still being used? */
+- if (atomic_read(&node->refcount))
+- return 0; /* keep this node */
+-
+ /* this node will be evicted, add its pages to our count */
+ evict_data->cleared += node->npages;
+
+@@ -1668,13 +1651,3 @@ static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode)
+
+ free_system_node(node);
+ }
+-
+-static int sdma_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
+-{
+- struct sdma_mmu_node *node =
+- container_of(mnode, struct sdma_mmu_node, rb);
+-
+- if (!atomic_read(&node->refcount))
+- return 1;
+- return 0;
+-}
+diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
+index a241836371dc1..548347d4c5bc2 100644
+--- a/drivers/infiniband/hw/hfi1/user_sdma.h
++++ b/drivers/infiniband/hw/hfi1/user_sdma.h
+@@ -104,7 +104,6 @@ struct hfi1_user_sdma_comp_q {
+ struct sdma_mmu_node {
+ struct mmu_rb_node rb;
+ struct hfi1_user_sdma_pkt_q *pq;
+- atomic_t refcount;
+ struct page **pages;
+ unsigned int npages;
+ };
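
With the atomic_t refcount removed from struct sdma_mmu_node, node lifetime is carried entirely by the kref embedded in struct mmu_rb_node: kref_init() gives the rb-tree cache its reference, a temporary "safety" kref_get() keeps the node alive while it is being attached to descriptors, and every kref_put() names hfi1_mmu_rb_release() as the release callback. The general kref idiom, reduced to a standalone sketch with placeholder demo_* names, looks like this:

#include <linux/container_of.h>
#include <linux/kref.h>
#include <linux/slab.h>

struct demo_node {
        struct kref refcount;
        /* ... payload ... */
};

/* Runs exactly once, when the last reference is dropped. */
static void demo_node_release(struct kref *kref)
{
        struct demo_node *node = container_of(kref, struct demo_node, refcount);

        kfree(node);
}

static struct demo_node *demo_node_create(void)
{
        struct demo_node *node = kzalloc(sizeof(*node), GFP_KERNEL);

        if (!node)
                return NULL;
        kref_init(&node->refcount);     /* ref 1: owned by the cache */
        kref_get(&node->refcount);      /* ref 2: temporary "safety" reference */
        return node;
}

static void demo_node_done(struct demo_node *node)
{
        kref_put(&node->refcount, demo_node_release);   /* drop safety ref */
}
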
+diff --git a/drivers/infiniband/hw/hfi1/vnic_sdma.c b/drivers/infiniband/hw/hfi1/vnic_sdma.c
+index 727eedfba332a..cc6324d2d1ddc 100644
+--- a/drivers/infiniband/hw/hfi1/vnic_sdma.c
++++ b/drivers/infiniband/hw/hfi1/vnic_sdma.c
+@@ -64,11 +64,11 @@ static noinline int build_vnic_ulp_payload(struct sdma_engine *sde,
+
+ /* combine physically continuous fragments later? */
+ ret = sdma_txadd_page(sde->dd,
+- NULL,
+ &tx->txreq,
+ skb_frag_page(frag),
+ skb_frag_off(frag),
+- skb_frag_size(frag));
++ skb_frag_size(frag),
++ NULL, NULL, NULL);
+ if (unlikely(ret))
+ goto bail_txadd;
+ }
+diff --git a/drivers/infiniband/hw/hns/hns_roce_hem.c b/drivers/infiniband/hw/hns/hns_roce_hem.c
+index aa8a08d1c0145..f30274986c0da 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_hem.c
++++ b/drivers/infiniband/hw/hns/hns_roce_hem.c
+@@ -595,11 +595,12 @@ int hns_roce_table_get(struct hns_roce_dev *hr_dev,
+ }
+
+ /* Set HEM base address(128K/page, pa) to Hardware */
+- if (hr_dev->hw->set_hem(hr_dev, table, obj, HEM_HOP_STEP_DIRECT)) {
++ ret = hr_dev->hw->set_hem(hr_dev, table, obj, HEM_HOP_STEP_DIRECT);
++ if (ret) {
+ hns_roce_free_hem(hr_dev, table->hem[i]);
+ table->hem[i] = NULL;
+- ret = -ENODEV;
+- dev_err(dev, "set HEM base address to HW failed.\n");
++ dev_err(dev, "set HEM base address to HW failed, ret = %d.\n",
++ ret);
+ goto out;
+ }
+
+diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c
+index 16183e894da77..dd428d915c175 100644
+--- a/drivers/infiniband/hw/irdma/uk.c
++++ b/drivers/infiniband/hw/irdma/uk.c
+@@ -93,16 +93,18 @@ static int irdma_nop_1(struct irdma_qp_uk *qp)
+ */
+ void irdma_clr_wqes(struct irdma_qp_uk *qp, u32 qp_wqe_idx)
+ {
+- __le64 *wqe;
++ struct irdma_qp_quanta *sq;
+ u32 wqe_idx;
+
+ if (!(qp_wqe_idx & 0x7F)) {
+ wqe_idx = (qp_wqe_idx + 128) % qp->sq_ring.size;
+- wqe = qp->sq_base[wqe_idx].elem;
++ sq = qp->sq_base + wqe_idx;
+ if (wqe_idx)
+- memset(wqe, qp->swqe_polarity ? 0 : 0xFF, 0x1000);
++ memset(sq, qp->swqe_polarity ? 0 : 0xFF,
++ 128 * sizeof(*sq));
+ else
+- memset(wqe, qp->swqe_polarity ? 0xFF : 0, 0x1000);
++ memset(sq, qp->swqe_polarity ? 0xFF : 0,
++ 128 * sizeof(*sq));
+ }
+ }
+
+diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
+index afa5ce1a71166..a7ec57ab8fadd 100644
+--- a/drivers/infiniband/sw/rxe/rxe_mw.c
++++ b/drivers/infiniband/sw/rxe/rxe_mw.c
+@@ -48,7 +48,7 @@ int rxe_dealloc_mw(struct ib_mw *ibmw)
+ }
+
+ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+- struct rxe_mw *mw, struct rxe_mr *mr)
++ struct rxe_mw *mw, struct rxe_mr *mr, int access)
+ {
+ if (mw->ibmw.type == IB_MW_TYPE_1) {
+ if (unlikely(mw->state != RXE_MW_STATE_VALID)) {
+@@ -58,7 +58,7 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ }
+
+ /* o10-36.2.2 */
+- if (unlikely((mw->access & IB_ZERO_BASED))) {
++ if (unlikely((access & IB_ZERO_BASED))) {
+ rxe_dbg_mw(mw, "attempt to bind a zero based type 1 MW\n");
+ return -EINVAL;
+ }
+@@ -104,7 +104,7 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ }
+
+ /* C10-74 */
+- if (unlikely((mw->access &
++ if (unlikely((access &
+ (IB_ACCESS_REMOTE_WRITE | IB_ACCESS_REMOTE_ATOMIC)) &&
+ !(mr->access & IB_ACCESS_LOCAL_WRITE))) {
+ rxe_dbg_mw(mw,
+@@ -113,7 +113,7 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ }
+
+ /* C10-75 */
+- if (mw->access & IB_ZERO_BASED) {
++ if (access & IB_ZERO_BASED) {
+ if (unlikely(wqe->wr.wr.mw.length > mr->ibmr.length)) {
+ rxe_dbg_mw(mw,
+ "attempt to bind a ZB MW outside of the MR\n");
+@@ -133,12 +133,12 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+ }
+
+ static void rxe_do_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
+- struct rxe_mw *mw, struct rxe_mr *mr)
++ struct rxe_mw *mw, struct rxe_mr *mr, int access)
+ {
+ u32 key = wqe->wr.wr.mw.rkey & 0xff;
+
+ mw->rkey = (mw->rkey & ~0xff) | key;
+- mw->access = wqe->wr.wr.mw.access;
++ mw->access = access;
+ mw->state = RXE_MW_STATE_VALID;
+ mw->addr = wqe->wr.wr.mw.addr;
+ mw->length = wqe->wr.wr.mw.length;
+@@ -169,6 +169,7 @@ int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
+ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+ u32 mw_rkey = wqe->wr.wr.mw.mw_rkey;
+ u32 mr_lkey = wqe->wr.wr.mw.mr_lkey;
++ int access = wqe->wr.wr.mw.access;
+
+ mw = rxe_pool_get_index(&rxe->mw_pool, mw_rkey >> 8);
+ if (unlikely(!mw)) {
+@@ -198,11 +199,11 @@ int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
+
+ spin_lock_bh(&mw->lock);
+
+- ret = rxe_check_bind_mw(qp, wqe, mw, mr);
++ ret = rxe_check_bind_mw(qp, wqe, mw, mr, access);
+ if (ret)
+ goto err_unlock;
+
+- rxe_do_bind_mw(qp, wqe, mw, mr);
++ rxe_do_bind_mw(qp, wqe, mw, mr, access);
+ err_unlock:
+ spin_unlock_bh(&mw->lock);
+ err_drop_mr:
+diff --git a/drivers/input/Kconfig b/drivers/input/Kconfig
+index 735f90b74ee5a..3bdbd34314b34 100644
+--- a/drivers/input/Kconfig
++++ b/drivers/input/Kconfig
+@@ -168,7 +168,7 @@ config INPUT_EVBUG
+
+ config INPUT_KUNIT_TEST
+ tristate "KUnit tests for Input" if !KUNIT_ALL_TESTS
+- depends on INPUT && KUNIT=y
++ depends on INPUT && KUNIT
+ default KUNIT_ALL_TESTS
+ help
+ Say Y here if you want to build the KUnit tests for the input
+diff --git a/drivers/input/misc/adxl34x.c b/drivers/input/misc/adxl34x.c
+index eecca671b5884..a3f45e0ee0c75 100644
+--- a/drivers/input/misc/adxl34x.c
++++ b/drivers/input/misc/adxl34x.c
+@@ -817,8 +817,7 @@ struct adxl34x *adxl34x_probe(struct device *dev, int irq,
+ AC_WRITE(ac, POWER_CTL, 0);
+
+ err = request_threaded_irq(ac->irq, NULL, adxl34x_irq,
+- IRQF_TRIGGER_HIGH | IRQF_ONESHOT,
+- dev_name(dev), ac);
++ IRQF_ONESHOT, dev_name(dev), ac);
+ if (err) {
+ dev_err(dev, "irq %d busy?\n", ac->irq);
+ goto err_free_mem;
+diff --git a/drivers/input/misc/drv260x.c b/drivers/input/misc/drv260x.c
+index 8a9ebfc04a2d9..85371fa1a03ed 100644
+--- a/drivers/input/misc/drv260x.c
++++ b/drivers/input/misc/drv260x.c
+@@ -435,6 +435,7 @@ static int drv260x_init(struct drv260x_data *haptics)
+ }
+
+ do {
++ usleep_range(15000, 15500);
+ error = regmap_read(haptics->regmap, DRV260X_GO, &cal_buf);
+ if (error) {
+ dev_err(&haptics->client->dev,
+diff --git a/drivers/input/misc/pm8941-pwrkey.c b/drivers/input/misc/pm8941-pwrkey.c
+index b6a27ebae977b..74d77d8aaeff2 100644
+--- a/drivers/input/misc/pm8941-pwrkey.c
++++ b/drivers/input/misc/pm8941-pwrkey.c
+@@ -50,7 +50,10 @@
+ #define PON_RESIN_PULL_UP BIT(0)
+
+ #define PON_DBC_CTL 0x71
+-#define PON_DBC_DELAY_MASK 0x7
++#define PON_DBC_DELAY_MASK_GEN1 0x7
++#define PON_DBC_DELAY_MASK_GEN2 0xf
++#define PON_DBC_SHIFT_GEN1 6
++#define PON_DBC_SHIFT_GEN2 14
+
+ struct pm8941_data {
+ unsigned int pull_up_bit;
+@@ -247,7 +250,7 @@ static int pm8941_pwrkey_probe(struct platform_device *pdev)
+ struct device *parent;
+ struct device_node *regmap_node;
+ const __be32 *addr;
+- u32 req_delay;
++ u32 req_delay, mask, delay_shift;
+ int error;
+
+ if (of_property_read_u32(pdev->dev.of_node, "debounce", &req_delay))
+@@ -336,12 +339,20 @@ static int pm8941_pwrkey_probe(struct platform_device *pdev)
+ pwrkey->input->phys = pwrkey->data->phys;
+
+ if (pwrkey->data->supports_debounce_config) {
+- req_delay = (req_delay << 6) / USEC_PER_SEC;
++ if (pwrkey->subtype >= PON_SUBTYPE_GEN2_PRIMARY) {
++ mask = PON_DBC_DELAY_MASK_GEN2;
++ delay_shift = PON_DBC_SHIFT_GEN2;
++ } else {
++ mask = PON_DBC_DELAY_MASK_GEN1;
++ delay_shift = PON_DBC_SHIFT_GEN1;
++ }
++
++ req_delay = (req_delay << delay_shift) / USEC_PER_SEC;
+ req_delay = ilog2(req_delay);
+
+ error = regmap_update_bits(pwrkey->regmap,
+ pwrkey->baseaddr + PON_DBC_CTL,
+- PON_DBC_DELAY_MASK,
++ mask,
+ req_delay);
+ if (error) {
+ dev_err(&pdev->dev, "failed to set debounce: %d\n",
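
Both PON generations derive the debounce field the same way, value = ilog2((debounce_us << shift) / USEC_PER_SEC), and differ only in the shift and field mask. As a purely arithmetic illustration: a 15625 us debounce on a GEN2 controller (shift 14) gives (15625 << 14) / 1000000 = 256 and ilog2(256) = 8, while the same delay on GEN1 (shift 6) gives (15625 << 6) / 1000000 = 1, i.e. a field value of 0. A small helper capturing the calculation (the function name is made up for the example):

#include <linux/log2.h>
#include <linux/time64.h>
#include <linux/types.h>

/*
 * Illustrative only: compute the PON_DBC_CTL debounce field for a delay in
 * microseconds, given the generation-specific shift (6 for GEN1, 14 for
 * GEN2). Assumes the scaled value is at least 1.
 */
static u32 demo_pon_debounce_field(u32 debounce_us, unsigned int shift)
{
        u32 scaled = (debounce_us << shift) / USEC_PER_SEC;

        return ilog2(scaled);
}
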
+diff --git a/drivers/input/tests/input_test.c b/drivers/input/tests/input_test.c
+index e5a6c1ad2167c..0540225f02886 100644
+--- a/drivers/input/tests/input_test.c
++++ b/drivers/input/tests/input_test.c
+@@ -43,8 +43,8 @@ static void input_test_exit(struct kunit *test)
+ {
+ struct input_dev *input_dev = test->priv;
+
+- input_unregister_device(input_dev);
+- input_free_device(input_dev);
++ if (input_dev)
++ input_unregister_device(input_dev);
+ }
+
+ static void input_test_poll(struct input_dev *input) { }
+@@ -87,7 +87,7 @@ static void input_test_timestamp(struct kunit *test)
+ static void input_test_match_device_id(struct kunit *test)
+ {
+ struct input_dev *input_dev = test->priv;
+- struct input_device_id id;
++ struct input_device_id id = { 0 };
+
+ /*
+ * Must match when the input device bus, vendor, product, version
+diff --git a/drivers/input/touchscreen/ads7846.c b/drivers/input/touchscreen/ads7846.c
+index bb1058b1e7fd4..faea40dd66d01 100644
+--- a/drivers/input/touchscreen/ads7846.c
++++ b/drivers/input/touchscreen/ads7846.c
+@@ -24,11 +24,8 @@
+ #include <linux/interrupt.h>
+ #include <linux/slab.h>
+ #include <linux/pm.h>
+-#include <linux/of.h>
+-#include <linux/of_gpio.h>
+-#include <linux/of_device.h>
++#include <linux/property.h>
+ #include <linux/gpio/consumer.h>
+-#include <linux/gpio.h>
+ #include <linux/spi/spi.h>
+ #include <linux/spi/ads7846.h>
+ #include <linux/regulator/consumer.h>
+@@ -140,7 +137,7 @@ struct ads7846 {
+ int (*filter)(void *data, int data_idx, int *val);
+ void *filter_data;
+ int (*get_pendown_state)(void);
+- int gpio_pendown;
++ struct gpio_desc *gpio_pendown;
+
+ void (*wait_for_sync)(void);
+ };
+@@ -223,7 +220,7 @@ static int get_pendown_state(struct ads7846 *ts)
+ if (ts->get_pendown_state)
+ return ts->get_pendown_state();
+
+- return !gpio_get_value(ts->gpio_pendown);
++ return gpiod_get_value(ts->gpio_pendown);
+ }
+
+ static void ads7846_report_pen_up(struct ads7846 *ts)
+@@ -989,8 +986,6 @@ static int ads7846_setup_pendown(struct spi_device *spi,
+ struct ads7846 *ts,
+ const struct ads7846_platform_data *pdata)
+ {
+- int err;
+-
+ /*
+ * REVISIT when the irq can be triggered active-low, or if for some
+ * reason the touchscreen isn't hooked up, we don't need to access
+@@ -999,25 +994,15 @@ static int ads7846_setup_pendown(struct spi_device *spi,
+
+ if (pdata->get_pendown_state) {
+ ts->get_pendown_state = pdata->get_pendown_state;
+- } else if (gpio_is_valid(pdata->gpio_pendown)) {
+-
+- err = devm_gpio_request_one(&spi->dev, pdata->gpio_pendown,
+- GPIOF_IN, "ads7846_pendown");
+- if (err) {
+- dev_err(&spi->dev,
+- "failed to request/setup pendown GPIO%d: %d\n",
+- pdata->gpio_pendown, err);
+- return err;
++ } else {
++ ts->gpio_pendown = gpiod_get(&spi->dev, "pendown", GPIOD_IN);
++ if (IS_ERR(ts->gpio_pendown)) {
++ dev_err(&spi->dev, "failed to request pendown GPIO\n");
++ return PTR_ERR(ts->gpio_pendown);
+ }
+-
+- ts->gpio_pendown = pdata->gpio_pendown;
+-
+ if (pdata->gpio_pendown_debounce)
+- gpiod_set_debounce(gpio_to_desc(ts->gpio_pendown),
++ gpiod_set_debounce(ts->gpio_pendown,
+ pdata->gpio_pendown_debounce);
+- } else {
+- dev_err(&spi->dev, "no get_pendown_state nor gpio_pendown?\n");
+- return -EINVAL;
+ }
+
+ return 0;
+@@ -1119,7 +1104,6 @@ static int ads7846_setup_spi_msg(struct ads7846 *ts,
+ return 0;
+ }
+
+-#ifdef CONFIG_OF
+ static const struct of_device_id ads7846_dt_ids[] = {
+ { .compatible = "ti,tsc2046", .data = (void *) 7846 },
+ { .compatible = "ti,ads7843", .data = (void *) 7843 },
+@@ -1130,82 +1114,60 @@ static const struct of_device_id ads7846_dt_ids[] = {
+ };
+ MODULE_DEVICE_TABLE(of, ads7846_dt_ids);
+
+-static const struct ads7846_platform_data *ads7846_probe_dt(struct device *dev)
++static const struct ads7846_platform_data *ads7846_get_props(struct device *dev)
+ {
+ struct ads7846_platform_data *pdata;
+- struct device_node *node = dev->of_node;
+- const struct of_device_id *match;
+ u32 value;
+
+- if (!node) {
+- dev_err(dev, "Device does not have associated DT data\n");
+- return ERR_PTR(-EINVAL);
+- }
+-
+- match = of_match_device(ads7846_dt_ids, dev);
+- if (!match) {
+- dev_err(dev, "Unknown device model\n");
+- return ERR_PTR(-EINVAL);
+- }
+-
+ pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
+ if (!pdata)
+ return ERR_PTR(-ENOMEM);
+
+- pdata->model = (unsigned long)match->data;
++ pdata->model = (uintptr_t)device_get_match_data(dev);
+
+- of_property_read_u16(node, "ti,vref-delay-usecs",
+- &pdata->vref_delay_usecs);
+- of_property_read_u16(node, "ti,vref-mv", &pdata->vref_mv);
+- pdata->keep_vref_on = of_property_read_bool(node, "ti,keep-vref-on");
++ device_property_read_u16(dev, "ti,vref-delay-usecs",
++ &pdata->vref_delay_usecs);
++ device_property_read_u16(dev, "ti,vref-mv", &pdata->vref_mv);
++ pdata->keep_vref_on = device_property_read_bool(dev, "ti,keep-vref-on");
+
+- pdata->swap_xy = of_property_read_bool(node, "ti,swap-xy");
++ pdata->swap_xy = device_property_read_bool(dev, "ti,swap-xy");
+
+- of_property_read_u16(node, "ti,settle-delay-usec",
+- &pdata->settle_delay_usecs);
+- of_property_read_u16(node, "ti,penirq-recheck-delay-usecs",
+- &pdata->penirq_recheck_delay_usecs);
++ device_property_read_u16(dev, "ti,settle-delay-usec",
++ &pdata->settle_delay_usecs);
++ device_property_read_u16(dev, "ti,penirq-recheck-delay-usecs",
++ &pdata->penirq_recheck_delay_usecs);
+
+- of_property_read_u16(node, "ti,x-plate-ohms", &pdata->x_plate_ohms);
+- of_property_read_u16(node, "ti,y-plate-ohms", &pdata->y_plate_ohms);
++ device_property_read_u16(dev, "ti,x-plate-ohms", &pdata->x_plate_ohms);
++ device_property_read_u16(dev, "ti,y-plate-ohms", &pdata->y_plate_ohms);
+
+- of_property_read_u16(node, "ti,x-min", &pdata->x_min);
+- of_property_read_u16(node, "ti,y-min", &pdata->y_min);
+- of_property_read_u16(node, "ti,x-max", &pdata->x_max);
+- of_property_read_u16(node, "ti,y-max", &pdata->y_max);
++ device_property_read_u16(dev, "ti,x-min", &pdata->x_min);
++ device_property_read_u16(dev, "ti,y-min", &pdata->y_min);
++ device_property_read_u16(dev, "ti,x-max", &pdata->x_max);
++ device_property_read_u16(dev, "ti,y-max", &pdata->y_max);
+
+ /*
+ * touchscreen-max-pressure gets parsed during
+ * touchscreen_parse_properties()
+ */
+- of_property_read_u16(node, "ti,pressure-min", &pdata->pressure_min);
+- if (!of_property_read_u32(node, "touchscreen-min-pressure", &value))
++ device_property_read_u16(dev, "ti,pressure-min", &pdata->pressure_min);
++ if (!device_property_read_u32(dev, "touchscreen-min-pressure", &value))
+ pdata->pressure_min = (u16) value;
+- of_property_read_u16(node, "ti,pressure-max", &pdata->pressure_max);
++ device_property_read_u16(dev, "ti,pressure-max", &pdata->pressure_max);
+
+- of_property_read_u16(node, "ti,debounce-max", &pdata->debounce_max);
+- if (!of_property_read_u32(node, "touchscreen-average-samples", &value))
++ device_property_read_u16(dev, "ti,debounce-max", &pdata->debounce_max);
++ if (!device_property_read_u32(dev, "touchscreen-average-samples", &value))
+ pdata->debounce_max = (u16) value;
+- of_property_read_u16(node, "ti,debounce-tol", &pdata->debounce_tol);
+- of_property_read_u16(node, "ti,debounce-rep", &pdata->debounce_rep);
++ device_property_read_u16(dev, "ti,debounce-tol", &pdata->debounce_tol);
++ device_property_read_u16(dev, "ti,debounce-rep", &pdata->debounce_rep);
+
+- of_property_read_u32(node, "ti,pendown-gpio-debounce",
++ device_property_read_u32(dev, "ti,pendown-gpio-debounce",
+ &pdata->gpio_pendown_debounce);
+
+- pdata->wakeup = of_property_read_bool(node, "wakeup-source") ||
+- of_property_read_bool(node, "linux,wakeup");
+-
+- pdata->gpio_pendown = of_get_named_gpio(dev->of_node, "pendown-gpio", 0);
++ pdata->wakeup = device_property_read_bool(dev, "wakeup-source") ||
++ device_property_read_bool(dev, "linux,wakeup");
+
+ return pdata;
+ }
+-#else
+-static const struct ads7846_platform_data *ads7846_probe_dt(struct device *dev)
+-{
+- dev_err(dev, "no platform data defined\n");
+- return ERR_PTR(-EINVAL);
+-}
+-#endif
+
+ static void ads7846_regulator_disable(void *regulator)
+ {
+@@ -1269,7 +1231,7 @@ static int ads7846_probe(struct spi_device *spi)
+
+ pdata = dev_get_platdata(dev);
+ if (!pdata) {
+- pdata = ads7846_probe_dt(dev);
++ pdata = ads7846_get_props(dev);
+ if (IS_ERR(pdata))
+ return PTR_ERR(pdata);
+ }
+@@ -1426,7 +1388,7 @@ static struct spi_driver ads7846_driver = {
+ .driver = {
+ .name = "ads7846",
+ .pm = pm_sleep_ptr(&ads7846_pm),
+- .of_match_table = of_match_ptr(ads7846_dt_ids),
++ .of_match_table = ads7846_dt_ids,
+ },
+ .probe = ads7846_probe,
+ .remove = ads7846_remove,
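
The ads7846 changes above swap the legacy integer-GPIO calls and OF-only property parsing for the descriptor-based GPIO API and the firmware-agnostic device_property_*() helpers, which is also what lets the driver drop the CONFIG_OF guard and of_match_ptr(). A minimal probe-side sketch of that combination follows; it is illustrative rather than the driver's exact code (it uses the devm_ variant, for instance):

#include <linux/device.h>
#include <linux/err.h>
#include <linux/gpio/consumer.h>
#include <linux/property.h>

/*
 * Illustrative: fetch a named GPIO and a few properties in a way that works
 * for both DT- and ACPI-described devices.
 */
static int demo_get_resources(struct device *dev)
{
        struct gpio_desc *pendown;
        u16 x_plate_ohms = 0;
        u32 debounce = 0;

        pendown = devm_gpiod_get(dev, "pendown", GPIOD_IN);
        if (IS_ERR(pendown))
                return PTR_ERR(pendown);

        device_property_read_u16(dev, "ti,x-plate-ohms", &x_plate_ohms);
        device_property_read_u32(dev, "ti,pendown-gpio-debounce", &debounce);
        if (debounce)
                gpiod_set_debounce(pendown, debounce);

        return 0;
}
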
+diff --git a/drivers/input/touchscreen/cyttsp4_core.c b/drivers/input/touchscreen/cyttsp4_core.c
+index 0cd6f626adec5..7cb26929dc732 100644
+--- a/drivers/input/touchscreen/cyttsp4_core.c
++++ b/drivers/input/touchscreen/cyttsp4_core.c
+@@ -1263,9 +1263,8 @@ static void cyttsp4_stop_wd_timer(struct cyttsp4 *cd)
+ * Ensure we wait until the watchdog timer
+ * running on a different CPU finishes
+ */
+- del_timer_sync(&cd->watchdog_timer);
++ timer_shutdown_sync(&cd->watchdog_timer);
+ cancel_work_sync(&cd->watchdog_work);
+- del_timer_sync(&cd->watchdog_timer);
+ }
+
+ static void cyttsp4_watchdog_timer(struct timer_list *t)
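
timer_shutdown_sync() behaves like del_timer_sync() in that it waits for a callback already running on another CPU, but it additionally marks the timer as shut down so that any later attempt to re-arm it (for example from the watchdog work item being flushed next) is ignored; that is why the second del_timer_sync() call above can be dropped. In sketch form, with placeholder names:

#include <linux/timer.h>
#include <linux/workqueue.h>

struct demo_dev {
        struct timer_list watchdog_timer;
        struct work_struct watchdog_work;
};

/*
 * Illustrative teardown: once the timer is shut down it can no longer be
 * re-armed, so one sync call followed by cancel_work_sync() is sufficient.
 */
static void demo_stop_watchdog(struct demo_dev *dd)
{
        timer_shutdown_sync(&dd->watchdog_timer);
        cancel_work_sync(&dd->watchdog_work);
}
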
+diff --git a/drivers/interconnect/qcom/icc-rpm.c b/drivers/interconnect/qcom/icc-rpm.c
+index 5341fa169dbf1..8d3138e8c1ee3 100644
+--- a/drivers/interconnect/qcom/icc-rpm.c
++++ b/drivers/interconnect/qcom/icc-rpm.c
+@@ -379,7 +379,7 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
+ return ret;
+ }
+
+- for (i = 0; i < qp->num_clks; i++) {
++ for (i = 0; i < qp->num_bus_clks; i++) {
+ /*
+ * Use WAKE bucket for active clock, otherwise, use SLEEP bucket
+ * for other clocks. If a platform doesn't set interconnect
+@@ -464,7 +464,7 @@ int qnoc_probe(struct platform_device *pdev)
+
+ for (i = 0; i < cd_num; i++)
+ qp->bus_clks[i].id = cds[i];
+- qp->num_clks = cd_num;
++ qp->num_bus_clks = cd_num;
+
+ qp->type = desc->type;
+ qp->qos_offset = desc->qos_offset;
+@@ -494,11 +494,11 @@ int qnoc_probe(struct platform_device *pdev)
+ }
+
+ regmap_done:
+- ret = devm_clk_bulk_get_optional(dev, qp->num_clks, qp->bus_clks);
++ ret = devm_clk_bulk_get(dev, qp->num_bus_clks, qp->bus_clks);
+ if (ret)
+ return ret;
+
+- ret = clk_bulk_prepare_enable(qp->num_clks, qp->bus_clks);
++ ret = clk_bulk_prepare_enable(qp->num_bus_clks, qp->bus_clks);
+ if (ret)
+ return ret;
+
+@@ -551,7 +551,7 @@ err_deregister_provider:
+ icc_provider_deregister(provider);
+ err_remove_nodes:
+ icc_nodes_remove(provider);
+- clk_bulk_disable_unprepare(qp->num_clks, qp->bus_clks);
++ clk_bulk_disable_unprepare(qp->num_bus_clks, qp->bus_clks);
+
+ return ret;
+ }
+@@ -563,7 +563,7 @@ int qnoc_remove(struct platform_device *pdev)
+
+ icc_provider_deregister(&qp->provider);
+ icc_nodes_remove(&qp->provider);
+- clk_bulk_disable_unprepare(qp->num_clks, qp->bus_clks);
++ clk_bulk_disable_unprepare(qp->num_bus_clks, qp->bus_clks);
+
+ return 0;
+ }
+diff --git a/drivers/interconnect/qcom/icc-rpm.h b/drivers/interconnect/qcom/icc-rpm.h
+index 22bdb1e4bb123..838f3fa82278e 100644
+--- a/drivers/interconnect/qcom/icc-rpm.h
++++ b/drivers/interconnect/qcom/icc-rpm.h
+@@ -23,7 +23,7 @@ enum qcom_icc_type {
+ /**
+ * struct qcom_icc_provider - Qualcomm specific interconnect provider
+ * @provider: generic interconnect provider
+- * @num_clks: the total number of clk_bulk_data entries
++ * @num_bus_clks: the total number of bus_clks clk_bulk_data entries
+ * @type: the ICC provider type
+ * @regmap: regmap for QoS registers read/write access
+ * @qos_offset: offset to QoS registers
+@@ -32,7 +32,7 @@ enum qcom_icc_type {
+ */
+ struct qcom_icc_provider {
+ struct icc_provider provider;
+- int num_clks;
++ int num_bus_clks;
+ enum qcom_icc_type type;
+ struct regmap *regmap;
+ unsigned int qos_offset;
+diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
+index 4f9b2142274ce..29d05663d4d17 100644
+--- a/drivers/iommu/iommufd/device.c
++++ b/drivers/iommu/iommufd/device.c
+@@ -553,8 +553,8 @@ void iommufd_access_unpin_pages(struct iommufd_access *access,
+ iopt_area_iova_to_index(
+ area,
+ min(last_iova, iopt_area_last_iova(area))));
+- up_read(&iopt->iova_rwsem);
+ WARN_ON(!iopt_area_contig_done(&iter));
++ up_read(&iopt->iova_rwsem);
+ }
+ EXPORT_SYMBOL_NS_GPL(iommufd_access_unpin_pages, IOMMUFD);
+
+diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
+index e0ae72b9e67f8..724c4c5742417 100644
+--- a/drivers/iommu/iommufd/io_pagetable.c
++++ b/drivers/iommu/iommufd/io_pagetable.c
+@@ -458,6 +458,7 @@ static int iopt_unmap_iova_range(struct io_pagetable *iopt, unsigned long start,
+ {
+ struct iopt_area *area;
+ unsigned long unmapped_bytes = 0;
++ unsigned int tries = 0;
+ int rc = -ENOENT;
+
+ /*
+@@ -484,19 +485,26 @@ again:
+ goto out_unlock_iova;
+ }
+
++ if (area_first != start)
++ tries = 0;
++
+ /*
+ * num_accesses writers must hold the iova_rwsem too, so we can
+ * safely read it under the write side of the iovam_rwsem
+ * without the pages->mutex.
+ */
+ if (area->num_accesses) {
++ size_t length = iopt_area_length(area);
++
+ start = area_first;
+ area->prevent_access = true;
+ up_write(&iopt->iova_rwsem);
+ up_read(&iopt->domains_rwsem);
+- iommufd_access_notify_unmap(iopt, area_first,
+- iopt_area_length(area));
+- if (WARN_ON(READ_ONCE(area->num_accesses)))
++
++ iommufd_access_notify_unmap(iopt, area_first, length);
++ /* Something is not responding to unmap requests. */
++ tries++;
++ if (WARN_ON(tries > 100))
+ return -EDEADLOCK;
+ goto again;
+ }
+diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
+index 5b8fe9bfa9a5b..3551ed057774e 100644
+--- a/drivers/iommu/virtio-iommu.c
++++ b/drivers/iommu/virtio-iommu.c
+@@ -788,6 +788,29 @@ static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+ return 0;
+ }
+
++static void viommu_detach_dev(struct viommu_endpoint *vdev)
++{
++ int i;
++ struct virtio_iommu_req_detach req;
++ struct viommu_domain *vdomain = vdev->vdomain;
++ struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(vdev->dev);
++
++ if (!vdomain)
++ return;
++
++ req = (struct virtio_iommu_req_detach) {
++ .head.type = VIRTIO_IOMMU_T_DETACH,
++ .domain = cpu_to_le32(vdomain->id),
++ };
++
++ for (i = 0; i < fwspec->num_ids; i++) {
++ req.endpoint = cpu_to_le32(fwspec->ids[i]);
++ WARN_ON(viommu_send_req_sync(vdev->viommu, &req, sizeof(req)));
++ }
++ vdomain->nr_endpoints--;
++ vdev->vdomain = NULL;
++}
++
+ static int viommu_map_pages(struct iommu_domain *domain, unsigned long iova,
+ phys_addr_t paddr, size_t pgsize, size_t pgcount,
+ int prot, gfp_t gfp, size_t *mapped)
+@@ -810,25 +833,26 @@ static int viommu_map_pages(struct iommu_domain *domain, unsigned long iova,
+ if (ret)
+ return ret;
+
+- map = (struct virtio_iommu_req_map) {
+- .head.type = VIRTIO_IOMMU_T_MAP,
+- .domain = cpu_to_le32(vdomain->id),
+- .virt_start = cpu_to_le64(iova),
+- .phys_start = cpu_to_le64(paddr),
+- .virt_end = cpu_to_le64(end),
+- .flags = cpu_to_le32(flags),
+- };
+-
+- if (!vdomain->nr_endpoints)
+- return 0;
++ if (vdomain->nr_endpoints) {
++ map = (struct virtio_iommu_req_map) {
++ .head.type = VIRTIO_IOMMU_T_MAP,
++ .domain = cpu_to_le32(vdomain->id),
++ .virt_start = cpu_to_le64(iova),
++ .phys_start = cpu_to_le64(paddr),
++ .virt_end = cpu_to_le64(end),
++ .flags = cpu_to_le32(flags),
++ };
+
+- ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
+- if (ret)
+- viommu_del_mappings(vdomain, iova, end);
+- else if (mapped)
++ ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
++ if (ret) {
++ viommu_del_mappings(vdomain, iova, end);
++ return ret;
++ }
++ }
++ if (mapped)
+ *mapped = size;
+
+- return ret;
++ return 0;
+ }
+
+ static size_t viommu_unmap_pages(struct iommu_domain *domain, unsigned long iova,
+@@ -990,6 +1014,7 @@ static void viommu_release_device(struct device *dev)
+ {
+ struct viommu_endpoint *vdev = dev_iommu_priv_get(dev);
+
++ viommu_detach_dev(vdev);
+ iommu_put_resv_regions(dev, &vdev->resv_regions);
+ kfree(vdev);
+ }
+diff --git a/drivers/irqchip/irq-jcore-aic.c b/drivers/irqchip/irq-jcore-aic.c
+index 5f47d8ee4ae39..b9dcc8e78c750 100644
+--- a/drivers/irqchip/irq-jcore-aic.c
++++ b/drivers/irqchip/irq-jcore-aic.c
+@@ -68,6 +68,7 @@ static int __init aic_irq_of_init(struct device_node *node,
+ unsigned min_irq = JCORE_AIC2_MIN_HWIRQ;
+ unsigned dom_sz = JCORE_AIC_MAX_HWIRQ+1;
+ struct irq_domain *domain;
++ int ret;
+
+ pr_info("Initializing J-Core AIC\n");
+
+@@ -100,6 +101,12 @@ static int __init aic_irq_of_init(struct device_node *node,
+ jcore_aic.irq_unmask = noop;
+ jcore_aic.name = "AIC";
+
++ ret = irq_alloc_descs(-1, min_irq, dom_sz - min_irq,
++ of_node_to_nid(node));
++
++ if (ret < 0)
++ return ret;
++
+ domain = irq_domain_add_legacy(node, dom_sz - min_irq, min_irq, min_irq,
+ &jcore_aic_irqdomain_ops,
+ &jcore_aic);
+diff --git a/drivers/irqchip/irq-loongson-eiointc.c b/drivers/irqchip/irq-loongson-eiointc.c
+index 71ef19f77a5a0..a7fcde3e3ecc7 100644
+--- a/drivers/irqchip/irq-loongson-eiointc.c
++++ b/drivers/irqchip/irq-loongson-eiointc.c
+@@ -314,7 +314,7 @@ static void eiointc_resume(void)
+ desc = irq_resolve_mapping(eiointc_priv[i]->eiointc_domain, j);
+ if (desc && desc->handle_irq && desc->handle_irq != handle_bad_irq) {
+ raw_spin_lock(&desc->lock);
+- irq_data = &desc->irq_data;
++ irq_data = irq_domain_get_irq_data(eiointc_priv[i]->eiointc_domain, irq_desc_get_irq(desc));
+ eiointc_set_irq_affinity(irq_data, irq_data->common->affinity, 0);
+ raw_spin_unlock(&desc->lock);
+ }
+diff --git a/drivers/irqchip/irq-loongson-liointc.c b/drivers/irqchip/irq-loongson-liointc.c
+index 8d00a9ad5b005..5dd9db8f8fa8e 100644
+--- a/drivers/irqchip/irq-loongson-liointc.c
++++ b/drivers/irqchip/irq-loongson-liointc.c
+@@ -32,6 +32,10 @@
+ #define LIOINTC_REG_INTC_EN_STATUS (LIOINTC_INTC_CHIP_START + 0x04)
+ #define LIOINTC_REG_INTC_ENABLE (LIOINTC_INTC_CHIP_START + 0x08)
+ #define LIOINTC_REG_INTC_DISABLE (LIOINTC_INTC_CHIP_START + 0x0c)
++/*
++ * LIOINTC_REG_INTC_POL register is only valid for Loongson-2K series, and
++ * Loongson-3 series behave as noops.
++ */
+ #define LIOINTC_REG_INTC_POL (LIOINTC_INTC_CHIP_START + 0x10)
+ #define LIOINTC_REG_INTC_EDGE (LIOINTC_INTC_CHIP_START + 0x14)
+
+@@ -116,19 +120,19 @@ static int liointc_set_type(struct irq_data *data, unsigned int type)
+ switch (type) {
+ case IRQ_TYPE_LEVEL_HIGH:
+ liointc_set_bit(gc, LIOINTC_REG_INTC_EDGE, mask, false);
+- liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, true);
++ liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, false);
+ break;
+ case IRQ_TYPE_LEVEL_LOW:
+ liointc_set_bit(gc, LIOINTC_REG_INTC_EDGE, mask, false);
+- liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, false);
++ liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, true);
+ break;
+ case IRQ_TYPE_EDGE_RISING:
+ liointc_set_bit(gc, LIOINTC_REG_INTC_EDGE, mask, true);
+- liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, true);
++ liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, false);
+ break;
+ case IRQ_TYPE_EDGE_FALLING:
+ liointc_set_bit(gc, LIOINTC_REG_INTC_EDGE, mask, true);
+- liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, false);
++ liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, true);
+ break;
+ default:
+ irq_gc_unlock_irqrestore(gc, flags);
+diff --git a/drivers/irqchip/irq-loongson-pch-pic.c b/drivers/irqchip/irq-loongson-pch-pic.c
+index e5fe4d50be056..93a71f66efebf 100644
+--- a/drivers/irqchip/irq-loongson-pch-pic.c
++++ b/drivers/irqchip/irq-loongson-pch-pic.c
+@@ -164,7 +164,7 @@ static int pch_pic_domain_translate(struct irq_domain *d,
+ if (fwspec->param_count < 2)
+ return -EINVAL;
+
+- *hwirq = fwspec->param[0] + priv->ht_vec_base;
++ *hwirq = fwspec->param[0];
+ *type = fwspec->param[1] & IRQ_TYPE_SENSE_MASK;
+ } else {
+ if (fwspec->param_count < 1)
+@@ -196,7 +196,7 @@ static int pch_pic_alloc(struct irq_domain *domain, unsigned int virq,
+
+ parent_fwspec.fwnode = domain->parent->fwnode;
+ parent_fwspec.param_count = 1;
+- parent_fwspec.param[0] = hwirq;
++ parent_fwspec.param[0] = hwirq + priv->ht_vec_base;
+
+ err = irq_domain_alloc_irqs_parent(domain, virq, 1, &parent_fwspec);
+ if (err)
+@@ -401,14 +401,12 @@ static int __init acpi_cascade_irqdomain_init(void)
+ int __init pch_pic_acpi_init(struct irq_domain *parent,
+ struct acpi_madt_bio_pic *acpi_pchpic)
+ {
+- int ret, vec_base;
++ int ret;
+ struct fwnode_handle *domain_handle;
+
+ if (find_pch_pic(acpi_pchpic->gsi_base) >= 0)
+ return 0;
+
+- vec_base = acpi_pchpic->gsi_base - GSI_MIN_PCH_IRQ;
+-
+ domain_handle = irq_domain_alloc_fwnode(&acpi_pchpic->address);
+ if (!domain_handle) {
+ pr_err("Unable to allocate domain handle\n");
+@@ -416,7 +414,7 @@ int __init pch_pic_acpi_init(struct irq_domain *parent,
+ }
+
+ ret = pch_pic_init(acpi_pchpic->address, acpi_pchpic->size,
+- vec_base, parent, domain_handle, acpi_pchpic->gsi_base);
++ 0, parent, domain_handle, acpi_pchpic->gsi_base);
+
+ if (ret < 0) {
+ irq_domain_free_fwnode(domain_handle);
+diff --git a/drivers/irqchip/irq-stm32-exti.c b/drivers/irqchip/irq-stm32-exti.c
+index 6a3f7498ea8ea..8bbb2b114636c 100644
+--- a/drivers/irqchip/irq-stm32-exti.c
++++ b/drivers/irqchip/irq-stm32-exti.c
+@@ -173,6 +173,16 @@ static struct irq_chip stm32_exti_h_chip_direct;
+ #define EXTI_INVALID_IRQ U8_MAX
+ #define STM32MP1_DESC_IRQ_SIZE (ARRAY_SIZE(stm32mp1_exti_banks) * IRQS_PER_BANK)
+
++/*
++ * Use some intentionally tricky logic here to initialize the whole array to
++ * EXTI_INVALID_IRQ, but then override certain fields, requiring us to indicate
++ * that we "know" that there are overrides in this structure, and we'll need to
++ * disable that warning from W=1 builds.
++ */
++__diag_push();
++__diag_ignore_all("-Woverride-init",
++ "logic to initialize all and then override some is OK");
++
+ static const u8 stm32mp1_desc_irq[] = {
+ /* default value */
+ [0 ... (STM32MP1_DESC_IRQ_SIZE - 1)] = EXTI_INVALID_IRQ,
+@@ -266,6 +276,8 @@ static const u8 stm32mp13_desc_irq[] = {
+ [70] = 98,
+ };
+
++__diag_pop();
++
+ static const struct stm32_exti_drv_data stm32mp1_drv_data = {
+ .exti_banks = stm32mp1_exti_banks,
+ .bank_nr = ARRAY_SIZE(stm32mp1_exti_banks),
+diff --git a/drivers/leds/trigger/ledtrig-netdev.c b/drivers/leds/trigger/ledtrig-netdev.c
+index d5e774d830215..f4d670ec30bcb 100644
+--- a/drivers/leds/trigger/ledtrig-netdev.c
++++ b/drivers/leds/trigger/ledtrig-netdev.c
+@@ -318,6 +318,9 @@ static int netdev_trig_notify(struct notifier_block *nb,
+ clear_bit(NETDEV_LED_MODE_LINKUP, &trigger_data->mode);
+ switch (evt) {
+ case NETDEV_CHANGENAME:
++ if (netif_carrier_ok(dev))
++ set_bit(NETDEV_LED_MODE_LINKUP, &trigger_data->mode);
++ fallthrough;
+ case NETDEV_REGISTER:
+ if (trigger_data->net_dev)
+ dev_put(trigger_data->net_dev);
+diff --git a/drivers/mailbox/ti-msgmgr.c b/drivers/mailbox/ti-msgmgr.c
+index ddac423ac1a91..03048cbda525e 100644
+--- a/drivers/mailbox/ti-msgmgr.c
++++ b/drivers/mailbox/ti-msgmgr.c
+@@ -430,14 +430,20 @@ static int ti_msgmgr_send_data(struct mbox_chan *chan, void *data)
+ /* Ensure all unused data is 0 */
+ data_trail &= 0xFFFFFFFF >> (8 * (sizeof(u32) - trail_bytes));
+ writel(data_trail, data_reg);
+- data_reg++;
++ data_reg += sizeof(u32);
+ }
++
+ /*
+ * 'data_reg' indicates next register to write. If we did not already
+ * write on tx complete reg(last reg), we must do so for transmit
++ * In addition, we also need to make sure all intermediate data
++ * registers(if any required), are reset to 0 for TISCI backward
++ * compatibility to be maintained.
+ */
+- if (data_reg <= qinst->queue_buff_end)
+- writel(0, qinst->queue_buff_end);
++ while (data_reg <= qinst->queue_buff_end) {
++ writel(0, data_reg);
++ data_reg += sizeof(u32);
++ }
+
+ /* If we are in polled mode, wait for a response before proceeding */
+ if (ti_msgmgr_chan_has_polled_queue_rx(message->chan_rx))
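
The ti-msgmgr fix corrects two related issues: data_reg is a byte-granular void __iomem * pointer, so the old data_reg++ advanced by one byte instead of one 32-bit register, and only the final register was being cleared rather than every unused register up to the transmit-complete register. The resulting fill loop, taken out of context (names illustrative):

#include <linux/io.h>
#include <linux/types.h>

/*
 * Illustrative sketch: zero every remaining 32-bit message register from
 * @reg up to and including @last, stepping by register width, not by byte.
 */
static void demo_zero_remaining_regs(void __iomem *reg, void __iomem *last)
{
        while (reg <= last) {
                writel(0, reg);
                reg += sizeof(u32);
        }
}
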
+diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
+index 147c493a989a5..68b9d7ca864e2 100644
+--- a/drivers/md/bcache/btree.c
++++ b/drivers/md/bcache/btree.c
+@@ -885,7 +885,7 @@ static struct btree *mca_cannibalize(struct cache_set *c, struct btree_op *op,
+ * cannibalize_bucket() will take. This means every time we unlock the root of
+ * the btree, we need to release this lock if we have it held.
+ */
+-static void bch_cannibalize_unlock(struct cache_set *c)
++void bch_cannibalize_unlock(struct cache_set *c)
+ {
+ spin_lock(&c->btree_cannibalize_lock);
+ if (c->btree_cache_alloc_lock == current) {
+@@ -1090,10 +1090,12 @@ struct btree *__bch_btree_node_alloc(struct cache_set *c, struct btree_op *op,
+ struct btree *parent)
+ {
+ BKEY_PADDED(key) k;
+- struct btree *b = ERR_PTR(-EAGAIN);
++ struct btree *b;
+
+ mutex_lock(&c->bucket_lock);
+ retry:
++ /* return ERR_PTR(-EAGAIN) when it fails */
++ b = ERR_PTR(-EAGAIN);
+ if (__bch_bucket_alloc_set(c, RESERVE_BTREE, &k.key, wait))
+ goto err;
+
+@@ -1138,7 +1140,7 @@ static struct btree *btree_node_alloc_replacement(struct btree *b,
+ {
+ struct btree *n = bch_btree_node_alloc(b->c, op, b->level, b->parent);
+
+- if (!IS_ERR_OR_NULL(n)) {
++ if (!IS_ERR(n)) {
+ mutex_lock(&n->write_lock);
+ bch_btree_sort_into(&b->keys, &n->keys, &b->c->sort);
+ bkey_copy_key(&n->key, &b->key);
+@@ -1340,7 +1342,7 @@ static int btree_gc_coalesce(struct btree *b, struct btree_op *op,
+ memset(new_nodes, 0, sizeof(new_nodes));
+ closure_init_stack(&cl);
+
+- while (nodes < GC_MERGE_NODES && !IS_ERR_OR_NULL(r[nodes].b))
++ while (nodes < GC_MERGE_NODES && !IS_ERR(r[nodes].b))
+ keys += r[nodes++].keys;
+
+ blocks = btree_default_blocks(b->c) * 2 / 3;
+@@ -1352,7 +1354,7 @@ static int btree_gc_coalesce(struct btree *b, struct btree_op *op,
+
+ for (i = 0; i < nodes; i++) {
+ new_nodes[i] = btree_node_alloc_replacement(r[i].b, NULL);
+- if (IS_ERR_OR_NULL(new_nodes[i]))
++ if (IS_ERR(new_nodes[i]))
+ goto out_nocoalesce;
+ }
+
+@@ -1487,7 +1489,7 @@ out_nocoalesce:
+ bch_keylist_free(&keylist);
+
+ for (i = 0; i < nodes; i++)
+- if (!IS_ERR_OR_NULL(new_nodes[i])) {
++ if (!IS_ERR(new_nodes[i])) {
+ btree_node_free(new_nodes[i]);
+ rw_unlock(true, new_nodes[i]);
+ }
+@@ -1669,7 +1671,7 @@ static int bch_btree_gc_root(struct btree *b, struct btree_op *op,
+ if (should_rewrite) {
+ n = btree_node_alloc_replacement(b, NULL);
+
+- if (!IS_ERR_OR_NULL(n)) {
++ if (!IS_ERR(n)) {
+ bch_btree_node_write_sync(n);
+
+ bch_btree_set_root(n);
+@@ -1968,6 +1970,15 @@ static int bch_btree_check_thread(void *arg)
+ c->gc_stats.nodes++;
+ bch_btree_op_init(&op, 0);
+ ret = bcache_btree(check_recurse, p, c->root, &op);
++ /*
++ * The op may be added to cache_set's btree_cache_wait
++ * in mca_cannibalize(), must ensure it is removed from
++ * the list and release btree_cache_alloc_lock before
++ * free op memory.
++ * Otherwise, the btree_cache_wait will be damaged.
++ */
++ bch_cannibalize_unlock(c);
++ finish_wait(&c->btree_cache_wait, &(&op)->wait);
+ if (ret)
+ goto out;
+ }
+diff --git a/drivers/md/bcache/btree.h b/drivers/md/bcache/btree.h
+index 1b5fdbc0d83eb..a2920bbfcad56 100644
+--- a/drivers/md/bcache/btree.h
++++ b/drivers/md/bcache/btree.h
+@@ -282,6 +282,7 @@ void bch_initial_gc_finish(struct cache_set *c);
+ void bch_moving_gc(struct cache_set *c);
+ int bch_btree_check(struct cache_set *c);
+ void bch_initial_mark_key(struct cache_set *c, int level, struct bkey *k);
++void bch_cannibalize_unlock(struct cache_set *c);
+
+ static inline void wake_up_gc(struct cache_set *c)
+ {
+diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
+index 7e9d19fd21ddd..077149c4050b9 100644
+--- a/drivers/md/bcache/super.c
++++ b/drivers/md/bcache/super.c
+@@ -1723,7 +1723,7 @@ static void cache_set_flush(struct closure *cl)
+ if (!IS_ERR_OR_NULL(c->gc_thread))
+ kthread_stop(c->gc_thread);
+
+- if (!IS_ERR_OR_NULL(c->root))
++ if (!IS_ERR(c->root))
+ list_add(&c->root->list, &c->btree_cache);
+
+ /*
+@@ -2087,7 +2087,7 @@ static int run_cache_set(struct cache_set *c)
+
+ err = "cannot allocate new btree root";
+ c->root = __bch_btree_node_alloc(c, NULL, 0, true, NULL);
+- if (IS_ERR_OR_NULL(c->root))
++ if (IS_ERR(c->root))
+ goto err;
+
+ mutex_lock(&c->root->write_lock);
+diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
+index d4a5fc0650bb2..24c049067f61a 100644
+--- a/drivers/md/bcache/writeback.c
++++ b/drivers/md/bcache/writeback.c
+@@ -890,6 +890,16 @@ static int bch_root_node_dirty_init(struct cache_set *c,
+ if (ret < 0)
+ pr_warn("sectors dirty init failed, ret=%d!\n", ret);
+
++ /*
++ * The op may be added to cache_set's btree_cache_wait
++ * in mca_cannibalize(), must ensure it is removed from
++ * the list and release btree_cache_alloc_lock before
++ * free op memory.
++ * Otherwise, the btree_cache_wait will be damaged.
++ */
++ bch_cannibalize_unlock(c);
++ finish_wait(&c->btree_cache_wait, &(&op.op)->wait);
++
+ return ret;
+ }
+
+diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
+index bc8d7565171d4..ea226a37b110a 100644
+--- a/drivers/md/md-bitmap.c
++++ b/drivers/md/md-bitmap.c
+@@ -54,14 +54,7 @@ __acquires(bitmap->lock)
+ {
+ unsigned char *mappage;
+
+- if (page >= bitmap->pages) {
+- /* This can happen if bitmap_start_sync goes beyond
+- * End-of-device while looking for a whole page.
+- * It is harmless.
+- */
+- return -EINVAL;
+- }
+-
++ WARN_ON_ONCE(page >= bitmap->pages);
+ if (bitmap->bp[page].hijacked) /* it's hijacked, don't try to alloc */
+ return 0;
+
+@@ -1023,7 +1016,6 @@ static int md_bitmap_file_test_bit(struct bitmap *bitmap, sector_t block)
+ return set;
+ }
+
+-
+ /* this gets called when the md device is ready to unplug its underlying
+ * (slave) device queues -- before we let any writes go down, we need to
+ * sync the dirty pages of the bitmap file to disk */
+@@ -1033,8 +1025,7 @@ void md_bitmap_unplug(struct bitmap *bitmap)
+ int dirty, need_write;
+ int writing = 0;
+
+- if (!bitmap || !bitmap->storage.filemap ||
+- test_bit(BITMAP_STALE, &bitmap->flags))
++ if (!md_bitmap_enabled(bitmap))
+ return;
+
+ /* look at each page to see if there are any set bits that need to be
+@@ -1387,6 +1378,14 @@ __acquires(bitmap->lock)
+ sector_t csize;
+ int err;
+
++ if (page >= bitmap->pages) {
++ /*
++ * This can happen if bitmap_start_sync goes beyond
++ * End-of-device while looking for a whole page or
++ * user set a huge number to sysfs bitmap_set_bits.
++ */
++ return NULL;
++ }
+ err = md_bitmap_checkpage(bitmap, page, create, 0);
+
+ if (bitmap->bp[page].hijacked ||
+diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
+index cfd7395de8fd3..3a4750952b3a7 100644
+--- a/drivers/md/md-bitmap.h
++++ b/drivers/md/md-bitmap.h
+@@ -273,6 +273,13 @@ int md_bitmap_copy_from_slot(struct mddev *mddev, int slot,
+ sector_t *lo, sector_t *hi, bool clear_bits);
+ void md_bitmap_free(struct bitmap *bitmap);
+ void md_bitmap_wait_behind_writes(struct mddev *mddev);
++
++static inline bool md_bitmap_enabled(struct bitmap *bitmap)
++{
++ return bitmap && bitmap->storage.filemap &&
++ !test_bit(BITMAP_STALE, &bitmap->flags);
++}
++
+ #endif
+
+ #endif
+diff --git a/drivers/md/md.c b/drivers/md/md.c
+index 8e344b4b34446..350094f1cb09f 100644
+--- a/drivers/md/md.c
++++ b/drivers/md/md.c
+@@ -3794,8 +3794,9 @@ int strict_strtoul_scaled(const char *cp, unsigned long *res, int scale)
+ static ssize_t
+ safe_delay_show(struct mddev *mddev, char *page)
+ {
+- int msec = (mddev->safemode_delay*1000)/HZ;
+- return sprintf(page, "%d.%03d\n", msec/1000, msec%1000);
++ unsigned int msec = ((unsigned long)mddev->safemode_delay*1000)/HZ;
++
++ return sprintf(page, "%u.%03u\n", msec/1000, msec%1000);
+ }
+ static ssize_t
+ safe_delay_store(struct mddev *mddev, const char *cbuf, size_t len)
+@@ -3807,7 +3808,7 @@ safe_delay_store(struct mddev *mddev, const char *cbuf, size_t len)
+ return -EINVAL;
+ }
+
+- if (strict_strtoul_scaled(cbuf, &msec, 3) < 0)
++ if (strict_strtoul_scaled(cbuf, &msec, 3) < 0 || msec > UINT_MAX / HZ)
+ return -EINVAL;
+ if (msec == 0)
+ mddev->safemode_delay = 0;
+@@ -4477,6 +4478,8 @@ max_corrected_read_errors_store(struct mddev *mddev, const char *buf, size_t len
+ rv = kstrtouint(buf, 10, &n);
+ if (rv < 0)
+ return rv;
++ if (n > INT_MAX)
++ return -EINVAL;
+ atomic_set(&mddev->max_corr_read_errors, n);
+ return len;
+ }
+diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
+index e61f6cad4e08e..e0c8ac8146331 100644
+--- a/drivers/md/raid1-10.c
++++ b/drivers/md/raid1-10.c
+@@ -109,3 +109,45 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp,
+ size -= len;
+ } while (idx++ < RESYNC_PAGES && size > 0);
+ }
++
++
++static inline void raid1_submit_write(struct bio *bio)
++{
++ struct md_rdev *rdev = (void *)bio->bi_bdev;
++
++ bio->bi_next = NULL;
++ bio_set_dev(bio, rdev->bdev);
++ if (test_bit(Faulty, &rdev->flags))
++ bio_io_error(bio);
++ else if (unlikely(bio_op(bio) == REQ_OP_DISCARD &&
++ !bdev_max_discard_sectors(bio->bi_bdev)))
++ /* Just ignore it */
++ bio_endio(bio);
++ else
++ submit_bio_noacct(bio);
++}
++
++static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *bio,
++ blk_plug_cb_fn unplug)
++{
++ struct raid1_plug_cb *plug = NULL;
++ struct blk_plug_cb *cb;
++
++ /*
++ * If bitmap is not enabled, it's safe to submit the io directly, and
++ * this can get optimal performance.
++ */
++ if (!md_bitmap_enabled(mddev->bitmap)) {
++ raid1_submit_write(bio);
++ return true;
++ }
++
++ cb = blk_check_plugged(unplug, mddev, sizeof(*plug));
++ if (!cb)
++ return false;
++
++ plug = container_of(cb, struct raid1_plug_cb, cb);
++ bio_list_add(&plug->pending, bio);
++
++ return true;
++}
+diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
+index 68a9e2d9985b2..e51b77a3a8397 100644
+--- a/drivers/md/raid1.c
++++ b/drivers/md/raid1.c
+@@ -799,17 +799,8 @@ static void flush_bio_list(struct r1conf *conf, struct bio *bio)
+
+ while (bio) { /* submit pending writes */
+ struct bio *next = bio->bi_next;
+- struct md_rdev *rdev = (void *)bio->bi_bdev;
+- bio->bi_next = NULL;
+- bio_set_dev(bio, rdev->bdev);
+- if (test_bit(Faulty, &rdev->flags)) {
+- bio_io_error(bio);
+- } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
+- !bdev_max_discard_sectors(bio->bi_bdev)))
+- /* Just ignore it */
+- bio_endio(bio);
+- else
+- submit_bio_noacct(bio);
++
++ raid1_submit_write(bio);
+ bio = next;
+ cond_resched();
+ }
+@@ -1343,8 +1334,6 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
+ struct bitmap *bitmap = mddev->bitmap;
+ unsigned long flags;
+ struct md_rdev *blocked_rdev;
+- struct blk_plug_cb *cb;
+- struct raid1_plug_cb *plug = NULL;
+ int first_clone;
+ int max_sectors;
+ bool write_behind = false;
+@@ -1573,15 +1562,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
+ r1_bio->sector);
+ /* flush_pending_writes() needs access to the rdev so...*/
+ mbio->bi_bdev = (void *)rdev;
+-
+- cb = blk_check_plugged(raid1_unplug, mddev, sizeof(*plug));
+- if (cb)
+- plug = container_of(cb, struct raid1_plug_cb, cb);
+- else
+- plug = NULL;
+- if (plug) {
+- bio_list_add(&plug->pending, mbio);
+- } else {
++ if (!raid1_add_bio_to_plug(mddev, mbio, raid1_unplug)) {
+ spin_lock_irqsave(&conf->device_lock, flags);
+ bio_list_add(&conf->pending_bio_list, mbio);
+ spin_unlock_irqrestore(&conf->device_lock, flags);
+diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
+index 4fcfcb350d2b4..9d23963496194 100644
+--- a/drivers/md/raid10.c
++++ b/drivers/md/raid10.c
+@@ -325,7 +325,7 @@ static void raid_end_bio_io(struct r10bio *r10_bio)
+ if (!test_bit(R10BIO_Uptodate, &r10_bio->state))
+ bio->bi_status = BLK_STS_IOERR;
+
+- if (blk_queue_io_stat(bio->bi_bdev->bd_disk->queue))
++ if (r10_bio->start_time)
+ bio_end_io_acct(bio, r10_bio->start_time);
+ bio_endio(bio);
+ /*
+@@ -779,8 +779,16 @@ static struct md_rdev *read_balance(struct r10conf *conf,
+ disk = r10_bio->devs[slot].devnum;
+ rdev = rcu_dereference(conf->mirrors[disk].replacement);
+ if (rdev == NULL || test_bit(Faulty, &rdev->flags) ||
+- r10_bio->devs[slot].addr + sectors > rdev->recovery_offset)
++ r10_bio->devs[slot].addr + sectors >
++ rdev->recovery_offset) {
++ /*
++ * Read replacement first to prevent reading both rdev
++ * and replacement as NULL during replacement replace
++ * rdev.
++ */
++ smp_mb();
+ rdev = rcu_dereference(conf->mirrors[disk].rdev);
++ }
+ if (rdev == NULL ||
+ test_bit(Faulty, &rdev->flags))
+ continue;
+@@ -909,17 +917,8 @@ static void flush_pending_writes(struct r10conf *conf)
+
+ while (bio) { /* submit pending writes */
+ struct bio *next = bio->bi_next;
+- struct md_rdev *rdev = (void*)bio->bi_bdev;
+- bio->bi_next = NULL;
+- bio_set_dev(bio, rdev->bdev);
+- if (test_bit(Faulty, &rdev->flags)) {
+- bio_io_error(bio);
+- } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
+- !bdev_max_discard_sectors(bio->bi_bdev)))
+- /* Just ignore it */
+- bio_endio(bio);
+- else
+- submit_bio_noacct(bio);
++
++ raid1_submit_write(bio);
+ bio = next;
+ }
+ blk_finish_plug(&plug);
+@@ -1130,17 +1129,8 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
+
+ while (bio) { /* submit pending writes */
+ struct bio *next = bio->bi_next;
+- struct md_rdev *rdev = (void*)bio->bi_bdev;
+- bio->bi_next = NULL;
+- bio_set_dev(bio, rdev->bdev);
+- if (test_bit(Faulty, &rdev->flags)) {
+- bio_io_error(bio);
+- } else if (unlikely((bio_op(bio) == REQ_OP_DISCARD) &&
+- !bdev_max_discard_sectors(bio->bi_bdev)))
+- /* Just ignore it */
+- bio_endio(bio);
+- else
+- submit_bio_noacct(bio);
++
++ raid1_submit_write(bio);
+ bio = next;
+ }
+ kfree(plug);
+@@ -1282,8 +1272,6 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
+ const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC;
+ const blk_opf_t do_fua = bio->bi_opf & REQ_FUA;
+ unsigned long flags;
+- struct blk_plug_cb *cb;
+- struct raid1_plug_cb *plug = NULL;
+ struct r10conf *conf = mddev->private;
+ struct md_rdev *rdev;
+ int devnum = r10_bio->devs[n_copy].devnum;
+@@ -1323,14 +1311,7 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
+
+ atomic_inc(&r10_bio->remaining);
+
+- cb = blk_check_plugged(raid10_unplug, mddev, sizeof(*plug));
+- if (cb)
+- plug = container_of(cb, struct raid1_plug_cb, cb);
+- else
+- plug = NULL;
+- if (plug) {
+- bio_list_add(&plug->pending, mbio);
+- } else {
++ if (!raid1_add_bio_to_plug(mddev, mbio, raid10_unplug)) {
+ spin_lock_irqsave(&conf->device_lock, flags);
+ bio_list_add(&conf->pending_bio_list, mbio);
+ spin_unlock_irqrestore(&conf->device_lock, flags);
+@@ -1479,9 +1460,15 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
+
+ for (i = 0; i < conf->copies; i++) {
+ int d = r10_bio->devs[i].devnum;
+- struct md_rdev *rdev = rcu_dereference(conf->mirrors[d].rdev);
+- struct md_rdev *rrdev = rcu_dereference(
+- conf->mirrors[d].replacement);
++ struct md_rdev *rdev, *rrdev;
++
++ rrdev = rcu_dereference(conf->mirrors[d].replacement);
++ /*
++ * Read replacement first to prevent reading both rdev and
++			 * replacement as NULL while the replacement replaces rdev.
++ */
++ smp_mb();
++ rdev = rcu_dereference(conf->mirrors[d].rdev);
+ if (rdev == rrdev)
+ rrdev = NULL;
+ if (rdev && (test_bit(Faulty, &rdev->flags)))
+@@ -3438,7 +3425,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
+ int must_sync;
+ int any_working;
+ int need_recover = 0;
+- int need_replace = 0;
+ struct raid10_info *mirror = &conf->mirrors[i];
+ struct md_rdev *mrdev, *mreplace;
+
+@@ -3450,11 +3436,10 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
+ !test_bit(Faulty, &mrdev->flags) &&
+ !test_bit(In_sync, &mrdev->flags))
+ need_recover = 1;
+- if (mreplace != NULL &&
+- !test_bit(Faulty, &mreplace->flags))
+- need_replace = 1;
++ if (mreplace && test_bit(Faulty, &mreplace->flags))
++ mreplace = NULL;
+
+- if (!need_recover && !need_replace) {
++ if (!need_recover && !mreplace) {
+ rcu_read_unlock();
+ continue;
+ }
+@@ -3470,8 +3455,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
+ rcu_read_unlock();
+ continue;
+ }
+- if (mreplace && test_bit(Faulty, &mreplace->flags))
+- mreplace = NULL;
+ /* Unless we are doing a full sync, or a replacement
+ * we only need to recover the block if it is set in
+ * the bitmap
+@@ -3594,11 +3577,11 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
+ bio = r10_bio->devs[1].repl_bio;
+ if (bio)
+ bio->bi_end_io = NULL;
+- /* Note: if need_replace, then bio
++ /* Note: if replace is not NULL, then bio
+ * cannot be NULL as r10buf_pool_alloc will
+ * have allocated it.
+ */
+- if (!need_replace)
++ if (!mreplace)
+ break;
+ bio->bi_next = biolist;
+ biolist = bio;
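
The smp_mb() in the hunks above orders the reader's loads of ->replacement and ->rdev against the writer that promotes a finished replacement, so the reader can never observe both pointers as NULL at once. A minimal user-space model of the read side using C11 fences (it assumes a writer that publishes the new rdev before clearing ->replacement, with its own barrier in between; RCU and the rest of the md locking are omitted):

#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

struct mirror_info {
	_Atomic(void *) rdev;
	_Atomic(void *) replacement;
};

/* Reader side of the pairing: load replacement first, fence, then rdev.
 * If the writer installs the replacement as the new rdev before it
 * clears ->replacement, at least one of the two loads here is
 * guaranteed to return a non-NULL device. */
static void *pick_rdev(struct mirror_info *m)
{
	void *repl = atomic_load_explicit(&m->replacement, memory_order_relaxed);

	atomic_thread_fence(memory_order_seq_cst);	/* stands in for smp_mb() */

	void *rdev = atomic_load_explicit(&m->rdev, memory_order_relaxed);

	return rdev ? rdev : repl;
}

int main(void)
{
	static int devA, devB;
	struct mirror_info m;

	atomic_store(&m.rdev, (void *)&devA);
	atomic_store(&m.replacement, (void *)&devB);
	printf("picked %s\n", pick_rdev(&m) == &devA ? "rdev" : "replacement");
	return 0;
}
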
+diff --git a/drivers/media/cec/i2c/Kconfig b/drivers/media/cec/i2c/Kconfig
+index 70432a1d69186..d912d143fb312 100644
+--- a/drivers/media/cec/i2c/Kconfig
++++ b/drivers/media/cec/i2c/Kconfig
+@@ -5,6 +5,7 @@
+ config CEC_CH7322
+ tristate "Chrontel CH7322 CEC controller"
+ depends on I2C
++ select REGMAP
+ select REGMAP_I2C
+ select CEC_CORE
+ help
+diff --git a/drivers/media/common/saa7146/saa7146_core.c b/drivers/media/common/saa7146/saa7146_core.c
+index bcb957883044c..27c53eed8fe39 100644
+--- a/drivers/media/common/saa7146/saa7146_core.c
++++ b/drivers/media/common/saa7146/saa7146_core.c
+@@ -133,8 +133,8 @@ int saa7146_wait_for_debi_done(struct saa7146_dev *dev, int nobusyloop)
+ ****************************************************************************/
+
+ /* this is videobuf_vmalloc_to_sg() from videobuf-dma-sg.c
+- make sure virt has been allocated with vmalloc_32(), otherwise the BUG()
+- may be triggered on highmem machines */
++ make sure virt has been allocated with vmalloc_32(), otherwise return NULL
++ on highmem machines */
+ static struct scatterlist* vmalloc_to_sg(unsigned char *virt, int nr_pages)
+ {
+ struct scatterlist *sglist;
+@@ -150,7 +150,7 @@ static struct scatterlist* vmalloc_to_sg(unsigned char *virt, int nr_pages)
+ if (NULL == pg)
+ goto err;
+ if (WARN_ON(PageHighMem(pg)))
+- return NULL;
++ goto err;
+ sg_set_page(&sglist[i], pg, PAGE_SIZE, 0);
+ }
+ return sglist;
+diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
+index 256d55bb2b1da..76d1ee3cc1bab 100644
+--- a/drivers/media/i2c/Kconfig
++++ b/drivers/media/i2c/Kconfig
+@@ -1292,6 +1292,7 @@ config VIDEO_TC358746
+ select VIDEO_V4L2_SUBDEV_API
+ select MEDIA_CONTROLLER
+ select V4L2_FWNODE
++ select GENERIC_PHY
+ select GENERIC_PHY_MIPI_DPHY
+ select REGMAP_I2C
+ help
+diff --git a/drivers/media/i2c/hi846.c b/drivers/media/i2c/hi846.c
+index 306dc35e925fd..f8709cdf28b39 100644
+--- a/drivers/media/i2c/hi846.c
++++ b/drivers/media/i2c/hi846.c
+@@ -1353,7 +1353,8 @@ static int hi846_set_ctrl(struct v4l2_ctrl *ctrl)
+ exposure_max);
+ }
+
+- if (!pm_runtime_get_if_in_use(&client->dev))
++ ret = pm_runtime_get_if_in_use(&client->dev);
++ if (!ret || ret == -EAGAIN)
+ return 0;
+
+ switch (ctrl->id) {
+diff --git a/drivers/media/i2c/imx296.c b/drivers/media/i2c/imx296.c
+index 4f22c0515ef8d..c3d6d52fc7727 100644
+--- a/drivers/media/i2c/imx296.c
++++ b/drivers/media/i2c/imx296.c
+@@ -922,10 +922,12 @@ static int imx296_read_temperature(struct imx296 *sensor, int *temp)
+ if (ret < 0)
+ return ret;
+
+- tmdout = imx296_read(sensor, IMX296_TMDOUT) & IMX296_TMDOUT_MASK;
++ tmdout = imx296_read(sensor, IMX296_TMDOUT);
+ if (tmdout < 0)
+ return tmdout;
+
++ tmdout &= IMX296_TMDOUT_MASK;
++
+ /* T(°C) = 246.312 - 0.304 * TMDOUT */;
+ *temp = 246312 - 304 * tmdout;
+
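
The imx296 change is a plain ordering fix: the register read returns either a negative errno or a raw value, and masking it before the sign check hid the error. A tiny sketch of the corrected order, with a hypothetical register helper and mask standing in for imx296_read()/IMX296_TMDOUT_MASK:

#include <stdio.h>

/* Hypothetical stand-in for imx296_read(): negative errno or raw value. */
static int read_tmdout(void)
{
	return 0x1a7;
}

static int read_temperature(int *temp_milli_c)
{
	int tmdout = read_tmdout();

	if (tmdout < 0)		/* check for an error before masking ... */
		return tmdout;

	tmdout &= 0x3ff;	/* ... since the mask would hide the sign bit
				 * (0x3ff is an illustrative mask width) */

	/* Same formula as the driver: T = 246.312 - 0.304 * TMDOUT, in m°C */
	*temp_milli_c = 246312 - 304 * tmdout;
	return 0;
}

int main(void)
{
	int t;

	if (!read_temperature(&t))
		printf("%d millidegrees Celsius\n", t);
	return 0;
}
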
+diff --git a/drivers/media/i2c/st-mipid02.c b/drivers/media/i2c/st-mipid02.c
+index 31b89aff0e86a..f20f87562bf11 100644
+--- a/drivers/media/i2c/st-mipid02.c
++++ b/drivers/media/i2c/st-mipid02.c
+@@ -736,8 +736,13 @@ static void mipid02_set_fmt_source(struct v4l2_subdev *sd,
+ {
+ struct mipid02_dev *bridge = to_mipid02_dev(sd);
+
+- /* source pad mirror active sink pad */
+- format->format = bridge->fmt;
++ /* source pad mirror sink pad */
++ if (format->which == V4L2_SUBDEV_FORMAT_ACTIVE)
++ format->format = bridge->fmt;
++ else
++ format->format = *v4l2_subdev_get_try_format(sd, sd_state,
++ MIPID02_SINK_0);
++
+ /* but code may need to be converted */
+ format->format.code = serial_to_parallel_code(format->format.code);
+
+diff --git a/drivers/media/platform/amphion/vdec.c b/drivers/media/platform/amphion/vdec.c
+index 3fa1a74a2e204..6515f3cdb7a74 100644
+--- a/drivers/media/platform/amphion/vdec.c
++++ b/drivers/media/platform/amphion/vdec.c
+@@ -279,6 +279,7 @@ static void vdec_handle_resolution_change(struct vpu_inst *inst)
+
+ vdec->source_change--;
+ vpu_notify_source_change(inst);
++ vpu_set_last_buffer_dequeued(inst, false);
+ }
+
+ static int vdec_update_state(struct vpu_inst *inst, enum vpu_codec_state state, u32 force)
+@@ -314,7 +315,7 @@ static void vdec_set_last_buffer_dequeued(struct vpu_inst *inst)
+ return;
+
+ if (vdec->eos_received) {
+- if (!vpu_set_last_buffer_dequeued(inst)) {
++ if (!vpu_set_last_buffer_dequeued(inst, true)) {
+ vdec->eos_received--;
+ vdec_update_state(inst, VPU_CODEC_STATE_DRAIN, 0);
+ }
+@@ -569,7 +570,7 @@ static int vdec_drain(struct vpu_inst *inst)
+ return 0;
+
+ if (!vdec->params.frame_count) {
+- vpu_set_last_buffer_dequeued(inst);
++ vpu_set_last_buffer_dequeued(inst, true);
+ return 0;
+ }
+
+@@ -608,7 +609,7 @@ static int vdec_cmd_stop(struct vpu_inst *inst)
+ vpu_trace(inst->dev, "[%d]\n", inst->id);
+
+ if (inst->state == VPU_CODEC_STATE_DEINIT) {
+- vpu_set_last_buffer_dequeued(inst);
++ vpu_set_last_buffer_dequeued(inst, true);
+ } else {
+ vdec->drain = 1;
+ vdec_drain(inst);
+diff --git a/drivers/media/platform/amphion/venc.c b/drivers/media/platform/amphion/venc.c
+index e6e8fe45fc7c3..58480e2755ec4 100644
+--- a/drivers/media/platform/amphion/venc.c
++++ b/drivers/media/platform/amphion/venc.c
+@@ -458,7 +458,7 @@ static int venc_encoder_cmd(struct file *file, void *fh, struct v4l2_encoder_cmd
+ vpu_inst_lock(inst);
+ if (cmd->cmd == V4L2_ENC_CMD_STOP) {
+ if (inst->state == VPU_CODEC_STATE_DEINIT)
+- vpu_set_last_buffer_dequeued(inst);
++ vpu_set_last_buffer_dequeued(inst, true);
+ else
+ venc_request_eos(inst);
+ }
+@@ -878,7 +878,7 @@ static void venc_set_last_buffer_dequeued(struct vpu_inst *inst)
+ struct venc_t *venc = inst->priv;
+
+ if (venc->stopped && list_empty(&venc->frames))
+- vpu_set_last_buffer_dequeued(inst);
++ vpu_set_last_buffer_dequeued(inst, true);
+ }
+
+ static void venc_stop_done(struct vpu_inst *inst)
+diff --git a/drivers/media/platform/amphion/vpu_malone.c b/drivers/media/platform/amphion/vpu_malone.c
+index ef44bff9fbaf6..c1d6606ad7e57 100644
+--- a/drivers/media/platform/amphion/vpu_malone.c
++++ b/drivers/media/platform/amphion/vpu_malone.c
+@@ -1313,6 +1313,15 @@ static int vpu_malone_insert_scode_pic(struct malone_scode_t *scode, u32 codec_i
+ return sizeof(hdr);
+ }
+
++static int vpu_malone_insert_scode_vc1_g_seq(struct malone_scode_t *scode)
++{
++ if (!scode->inst->total_input_count)
++ return 0;
++ if (vpu_vb_is_codecconfig(to_vb2_v4l2_buffer(scode->vb)))
++ scode->need_data = 0;
++ return 0;
++}
++
+ static int vpu_malone_insert_scode_vc1_g_pic(struct malone_scode_t *scode)
+ {
+ struct vb2_v4l2_buffer *vbuf;
+@@ -1344,6 +1353,8 @@ static int vpu_malone_insert_scode_vc1_l_seq(struct malone_scode_t *scode)
+ int size = 0;
+ u8 rcv_seqhdr[MALONE_VC1_RCV_SEQ_HEADER_LEN];
+
++ if (vpu_vb_is_codecconfig(to_vb2_v4l2_buffer(scode->vb)))
++ scode->need_data = 0;
+ if (scode->inst->total_input_count)
+ return 0;
+ scode->need_data = 0;
+@@ -1458,6 +1469,7 @@ static const struct malone_scode_handler scode_handlers[] = {
+ },
+ {
+ .pixelformat = V4L2_PIX_FMT_VC1_ANNEX_G,
++ .insert_scode_seq = vpu_malone_insert_scode_vc1_g_seq,
+ .insert_scode_pic = vpu_malone_insert_scode_vc1_g_pic,
+ },
+ {
+diff --git a/drivers/media/platform/amphion/vpu_v4l2.c b/drivers/media/platform/amphion/vpu_v4l2.c
+index 6773b885597ce..810e93d2c954a 100644
+--- a/drivers/media/platform/amphion/vpu_v4l2.c
++++ b/drivers/media/platform/amphion/vpu_v4l2.c
+@@ -100,7 +100,7 @@ int vpu_notify_source_change(struct vpu_inst *inst)
+ return 0;
+ }
+
+-int vpu_set_last_buffer_dequeued(struct vpu_inst *inst)
++int vpu_set_last_buffer_dequeued(struct vpu_inst *inst, bool eos)
+ {
+ struct vb2_queue *q;
+
+@@ -116,7 +116,8 @@ int vpu_set_last_buffer_dequeued(struct vpu_inst *inst)
+ vpu_trace(inst->dev, "last buffer dequeued\n");
+ q->last_buffer_dequeued = true;
+ wake_up(&q->done_wq);
+- vpu_notify_eos(inst);
++ if (eos)
++ vpu_notify_eos(inst);
+ return 0;
+ }
+
+diff --git a/drivers/media/platform/amphion/vpu_v4l2.h b/drivers/media/platform/amphion/vpu_v4l2.h
+index ef5de6b66e474..60f43056a7a28 100644
+--- a/drivers/media/platform/amphion/vpu_v4l2.h
++++ b/drivers/media/platform/amphion/vpu_v4l2.h
+@@ -27,7 +27,7 @@ struct vb2_v4l2_buffer *vpu_find_buf_by_idx(struct vpu_inst *inst, u32 type, u32
+ void vpu_v4l2_set_error(struct vpu_inst *inst);
+ int vpu_notify_eos(struct vpu_inst *inst);
+ int vpu_notify_source_change(struct vpu_inst *inst);
+-int vpu_set_last_buffer_dequeued(struct vpu_inst *inst);
++int vpu_set_last_buffer_dequeued(struct vpu_inst *inst, bool eos);
+ void vpu_vb2_buffers_return(struct vpu_inst *inst, unsigned int type, enum vb2_buffer_state state);
+ int vpu_get_num_buffers(struct vpu_inst *inst, u32 type);
+ bool vpu_is_source_empty(struct vpu_inst *inst);
+diff --git a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
+index f3073d1e7f420..03f8d7cd8eddc 100644
+--- a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
++++ b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
+@@ -71,7 +71,6 @@ static void vdec_msg_queue_dec(struct vdec_msg_queue *msg_queue, int hardware_in
+ int vdec_msg_queue_qbuf(struct vdec_msg_queue_ctx *msg_ctx, struct vdec_lat_buf *buf)
+ {
+ struct list_head *head;
+- int status;
+
+ head = vdec_get_buf_list(msg_ctx->hardware_index, buf);
+ if (!head) {
+@@ -87,12 +86,9 @@ int vdec_msg_queue_qbuf(struct vdec_msg_queue_ctx *msg_ctx, struct vdec_lat_buf
+ if (msg_ctx->hardware_index != MTK_VDEC_CORE) {
+ wake_up_all(&msg_ctx->ready_to_use);
+ } else {
+- if (buf->ctx->msg_queue.core_work_cnt <
+- atomic_read(&buf->ctx->msg_queue.core_list_cnt)) {
+- status = queue_work(buf->ctx->dev->core_workqueue,
+- &buf->ctx->msg_queue.core_work);
+- if (status)
+- buf->ctx->msg_queue.core_work_cnt++;
++ if (!(buf->ctx->msg_queue.status & CONTEXT_LIST_QUEUED)) {
++ queue_work(buf->ctx->dev->core_workqueue, &buf->ctx->msg_queue.core_work);
++ buf->ctx->msg_queue.status |= CONTEXT_LIST_QUEUED;
+ }
+ }
+
+@@ -261,7 +257,10 @@ static void vdec_msg_queue_core_work(struct work_struct *work)
+ container_of(msg_queue, struct mtk_vcodec_ctx, msg_queue);
+ struct mtk_vcodec_dev *dev = ctx->dev;
+ struct vdec_lat_buf *lat_buf;
+- int status;
++
++ spin_lock(&ctx->dev->msg_queue_core_ctx.ready_lock);
++ ctx->msg_queue.status &= ~CONTEXT_LIST_QUEUED;
++ spin_unlock(&ctx->dev->msg_queue_core_ctx.ready_lock);
+
+ lat_buf = vdec_msg_queue_dqbuf(&dev->msg_queue_core_ctx);
+ if (!lat_buf)
+@@ -278,17 +277,13 @@ static void vdec_msg_queue_core_work(struct work_struct *work)
+ vdec_msg_queue_qbuf(&ctx->msg_queue.lat_ctx, lat_buf);
+
+ wake_up_all(&ctx->msg_queue.core_dec_done);
+- spin_lock(&dev->msg_queue_core_ctx.ready_lock);
+- lat_buf->ctx->msg_queue.core_work_cnt--;
+-
+- if (lat_buf->ctx->msg_queue.core_work_cnt <
+- atomic_read(&lat_buf->ctx->msg_queue.core_list_cnt)) {
+- status = queue_work(lat_buf->ctx->dev->core_workqueue,
+- &lat_buf->ctx->msg_queue.core_work);
+- if (status)
+- lat_buf->ctx->msg_queue.core_work_cnt++;
++ if (!(ctx->msg_queue.status & CONTEXT_LIST_QUEUED) &&
++ atomic_read(&msg_queue->core_list_cnt)) {
++ spin_lock(&ctx->dev->msg_queue_core_ctx.ready_lock);
++ ctx->msg_queue.status |= CONTEXT_LIST_QUEUED;
++ spin_unlock(&ctx->dev->msg_queue_core_ctx.ready_lock);
++ queue_work(ctx->dev->core_workqueue, &msg_queue->core_work);
+ }
+- spin_unlock(&dev->msg_queue_core_ctx.ready_lock);
+ }
+
+ int vdec_msg_queue_init(struct vdec_msg_queue *msg_queue,
+@@ -303,13 +298,13 @@ int vdec_msg_queue_init(struct vdec_msg_queue *msg_queue,
+ return 0;
+
+ msg_queue->ctx = ctx;
+- msg_queue->core_work_cnt = 0;
+ vdec_msg_queue_init_ctx(&msg_queue->lat_ctx, MTK_VDEC_LAT0);
+ INIT_WORK(&msg_queue->core_work, vdec_msg_queue_core_work);
+
+ atomic_set(&msg_queue->lat_list_cnt, 0);
+ atomic_set(&msg_queue->core_list_cnt, 0);
+ init_waitqueue_head(&msg_queue->core_dec_done);
++ msg_queue->status = CONTEXT_LIST_EMPTY;
+
+ msg_queue->wdma_addr.size =
+ vde_msg_queue_get_trans_size(ctx->picinfo.buf_w,
+diff --git a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.h b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.h
+index a5d44bc97c16b..8f82d14847726 100644
+--- a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.h
++++ b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.h
+@@ -21,6 +21,18 @@ struct mtk_vcodec_ctx;
+ struct mtk_vcodec_dev;
+ typedef int (*core_decode_cb_t)(struct vdec_lat_buf *lat_buf);
+
++/**
++ * enum core_ctx_status - Context decode status for core hardware.
++ * @CONTEXT_LIST_EMPTY: No buffer queued on core hardware (must always be 0)
++ * @CONTEXT_LIST_QUEUED: Buffer queued to core work list
++ * @CONTEXT_LIST_DEC_DONE: Context decode done
++ */
++enum core_ctx_status {
++ CONTEXT_LIST_EMPTY = 0,
++ CONTEXT_LIST_QUEUED,
++ CONTEXT_LIST_DEC_DONE,
++};
++
+ /**
+ * struct vdec_msg_queue_ctx - represents a queue for buffers ready to be processed
+ * @ready_to_use: ready used queue used to signalize when get a job queue
+@@ -77,7 +89,7 @@ struct vdec_lat_buf {
+ * @lat_list_cnt: used to record each instance lat list count
+ * @core_list_cnt: used to record each instance core list count
+ * @core_dec_done: core work queue decode done event
+- * @core_work_cnt: the number of core work in work queue
++ * @status: current context decode status for core hardware
+ */
+ struct vdec_msg_queue {
+ struct vdec_lat_buf lat_buf[NUM_BUFFER_COUNT];
+@@ -93,7 +105,7 @@ struct vdec_msg_queue {
+ atomic_t lat_list_cnt;
+ atomic_t core_list_cnt;
+ wait_queue_head_t core_dec_done;
+- int core_work_cnt;
++ int status;
+ };
+
+ /**
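
The MediaTek change replaces the fragile work counter with a single "already queued" status bit taken under the core ready_lock, so a context is scheduled on the core workqueue at most once and re-queues itself only if new buffers arrived while it ran. A small pthread sketch of that queue-once pattern (simplified; the real driver also coordinates with the LAT stage and its list counters):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static bool queued;		/* models the CONTEXT_LIST_QUEUED bit */
static int core_list_cnt;	/* buffers waiting for the core stage */

/* Called for every buffer: schedule the work only if it is not already
 * pending, so the context can never sit on the workqueue twice. */
static void queue_core_work(void)
{
	pthread_mutex_lock(&lock);
	core_list_cnt++;
	if (!queued) {
		queued = true;
		puts("queue_work(core_work)");
	}
	pthread_mutex_unlock(&lock);
}

/* The work function clears the bit first, so buffers that arrive while
 * it is running cause exactly one re-queue afterwards. */
static void core_work(void)
{
	pthread_mutex_lock(&lock);
	queued = false;
	pthread_mutex_unlock(&lock);

	/* ... decode one buffer from the core list ... */

	pthread_mutex_lock(&lock);
	core_list_cnt--;
	if (!queued && core_list_cnt > 0) {
		queued = true;
		puts("re-queue_work(core_work)");
	}
	pthread_mutex_unlock(&lock);
}

int main(void)
{
	queue_core_work();
	queue_core_work();	/* second buffer: no double queue */
	core_work();		/* consumes one, re-queues exactly once */
	core_work();
	return 0;
}
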
+diff --git a/drivers/media/platform/qcom/venus/helpers.c b/drivers/media/platform/qcom/venus/helpers.c
+index a2ceab7f9ddbf..a68389b0aae0a 100644
+--- a/drivers/media/platform/qcom/venus/helpers.c
++++ b/drivers/media/platform/qcom/venus/helpers.c
+@@ -1036,8 +1036,8 @@ static u32 get_framesize_raw_yuv420_tp10_ubwc(u32 width, u32 height)
+ u32 extradata = SZ_16K;
+ u32 size;
+
+- y_stride = ALIGN(ALIGN(width, 192) * 4 / 3, 256);
+- uv_stride = ALIGN(ALIGN(width, 192) * 4 / 3, 256);
++ y_stride = ALIGN(width * 4 / 3, 256);
++ uv_stride = ALIGN(width * 4 / 3, 256);
+ y_sclines = ALIGN(height, 16);
+ uv_sclines = ALIGN((height + 1) >> 1, 16);
+
+diff --git a/drivers/media/platform/renesas/rcar_fdp1.c b/drivers/media/platform/renesas/rcar_fdp1.c
+index f43e458590b8c..ab39cd2201c85 100644
+--- a/drivers/media/platform/renesas/rcar_fdp1.c
++++ b/drivers/media/platform/renesas/rcar_fdp1.c
+@@ -254,6 +254,8 @@ MODULE_PARM_DESC(debug, "activate debug info");
+
+ /* Internal Data (HW Version) */
+ #define FD1_IP_INTDATA 0x0800
++/* R-Car Gen2 HW manual says zero, but actual value matches R-Car H3 ES1.x */
++#define FD1_IP_GEN2 0x02010101
+ #define FD1_IP_M3W 0x02010202
+ #define FD1_IP_H3 0x02010203
+ #define FD1_IP_M3N 0x02010204
+@@ -2360,6 +2362,9 @@ static int fdp1_probe(struct platform_device *pdev)
+
+ hw_version = fdp1_read(fdp1, FD1_IP_INTDATA);
+ switch (hw_version) {
++ case FD1_IP_GEN2:
++ dprintk(fdp1, "FDP1 Version R-Car Gen2\n");
++ break;
+ case FD1_IP_M3W:
+ dprintk(fdp1, "FDP1 Version R-Car M3-W\n");
+ break;
+diff --git a/drivers/media/usb/dvb-usb-v2/az6007.c b/drivers/media/usb/dvb-usb-v2/az6007.c
+index 62ee09f28a0bc..7524c90f5da61 100644
+--- a/drivers/media/usb/dvb-usb-v2/az6007.c
++++ b/drivers/media/usb/dvb-usb-v2/az6007.c
+@@ -202,7 +202,8 @@ static int az6007_rc_query(struct dvb_usb_device *d)
+ unsigned code;
+ enum rc_proto proto;
+
+- az6007_read(d, AZ6007_READ_IR, 0, 0, st->data, 10);
++ if (az6007_read(d, AZ6007_READ_IR, 0, 0, st->data, 10) < 0)
++ return -EIO;
+
+ if (st->data[1] == 0x44)
+ return 0;
+diff --git a/drivers/media/usb/siano/smsusb.c b/drivers/media/usb/siano/smsusb.c
+index 6f443c542c6da..640737d3b8aeb 100644
+--- a/drivers/media/usb/siano/smsusb.c
++++ b/drivers/media/usb/siano/smsusb.c
+@@ -179,7 +179,8 @@ static void smsusb_stop_streaming(struct smsusb_device_t *dev)
+
+ for (i = 0; i < MAX_URBS; i++) {
+ usb_kill_urb(&dev->surbs[i].urb);
+- cancel_work_sync(&dev->surbs[i].wq);
++ if (dev->surbs[i].wq.func)
++ cancel_work_sync(&dev->surbs[i].wq);
+
+ if (dev->surbs[i].cb) {
+ smscore_putbuffer(dev->coredev, dev->surbs[i].cb);
+diff --git a/drivers/memory/brcmstb_dpfe.c b/drivers/memory/brcmstb_dpfe.c
+index 76c82e9c8fceb..9339f80b21c50 100644
+--- a/drivers/memory/brcmstb_dpfe.c
++++ b/drivers/memory/brcmstb_dpfe.c
+@@ -434,15 +434,17 @@ static void __finalize_command(struct brcmstb_dpfe_priv *priv)
+ static int __send_command(struct brcmstb_dpfe_priv *priv, unsigned int cmd,
+ u32 result[])
+ {
+- const u32 *msg = priv->dpfe_api->command[cmd];
+ void __iomem *regs = priv->regs;
+ unsigned int i, chksum, chksum_idx;
++ const u32 *msg;
+ int ret = 0;
+ u32 resp;
+
+ if (cmd >= DPFE_CMD_MAX)
+ return -1;
+
++ msg = priv->dpfe_api->command[cmd];
++
+ mutex_lock(&priv->lock);
+
+ /* Wait for DCPU to become ready */
+diff --git a/drivers/memstick/host/r592.c b/drivers/memstick/host/r592.c
+index 42bfc46842b82..461f5ffd02bc1 100644
+--- a/drivers/memstick/host/r592.c
++++ b/drivers/memstick/host/r592.c
+@@ -44,12 +44,10 @@ static const char *tpc_names[] = {
+ * memstick_debug_get_tpc_name - debug helper that returns string for
+ * a TPC number
+ */
+-const char *memstick_debug_get_tpc_name(int tpc)
++static __maybe_unused const char *memstick_debug_get_tpc_name(int tpc)
+ {
+ return tpc_names[tpc-1];
+ }
+-EXPORT_SYMBOL(memstick_debug_get_tpc_name);
+-
+
+ /* Read a register*/
+ static inline u32 r592_read_reg(struct r592_device *dev, int address)
+diff --git a/drivers/mfd/intel-lpss-acpi.c b/drivers/mfd/intel-lpss-acpi.c
+index a143c8dca2d93..212818aef93e2 100644
+--- a/drivers/mfd/intel-lpss-acpi.c
++++ b/drivers/mfd/intel-lpss-acpi.c
+@@ -183,6 +183,9 @@ static int intel_lpss_acpi_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ info->mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
++ if (!info->mem)
++ return -ENODEV;
++
+ info->irq = platform_get_irq(pdev, 0);
+
+ ret = intel_lpss_probe(&pdev->dev, info);
+diff --git a/drivers/mfd/rt5033.c b/drivers/mfd/rt5033.c
+index a5e520fe50a14..8029d444b7942 100644
+--- a/drivers/mfd/rt5033.c
++++ b/drivers/mfd/rt5033.c
+@@ -40,9 +40,6 @@ static const struct mfd_cell rt5033_devs[] = {
+ {
+ .name = "rt5033-charger",
+ .of_compatible = "richtek,rt5033-charger",
+- }, {
+- .name = "rt5033-battery",
+- .of_compatible = "richtek,rt5033-battery",
+ }, {
+ .name = "rt5033-led",
+ .of_compatible = "richtek,rt5033-led",
+diff --git a/drivers/mfd/stmfx.c b/drivers/mfd/stmfx.c
+index e281971ba54ed..76188212c66eb 100644
+--- a/drivers/mfd/stmfx.c
++++ b/drivers/mfd/stmfx.c
+@@ -330,9 +330,8 @@ static int stmfx_chip_init(struct i2c_client *client)
+ stmfx->vdd = devm_regulator_get_optional(&client->dev, "vdd");
+ ret = PTR_ERR_OR_ZERO(stmfx->vdd);
+ if (ret) {
+- if (ret == -ENODEV)
+- stmfx->vdd = NULL;
+- else
++ stmfx->vdd = NULL;
++ if (ret != -ENODEV)
+ return dev_err_probe(&client->dev, ret, "Failed to get VDD regulator\n");
+ }
+
+@@ -387,7 +386,7 @@ static int stmfx_chip_init(struct i2c_client *client)
+
+ err:
+ if (stmfx->vdd)
+- return regulator_disable(stmfx->vdd);
++ regulator_disable(stmfx->vdd);
+
+ return ret;
+ }
+diff --git a/drivers/mfd/stmpe.c b/drivers/mfd/stmpe.c
+index a92301dfc7126..9c3cf58457a7d 100644
+--- a/drivers/mfd/stmpe.c
++++ b/drivers/mfd/stmpe.c
+@@ -1485,9 +1485,9 @@ int stmpe_probe(struct stmpe_client_info *ci, enum stmpe_partnum partnum)
+
+ void stmpe_remove(struct stmpe *stmpe)
+ {
+- if (!IS_ERR(stmpe->vio))
++ if (!IS_ERR(stmpe->vio) && regulator_is_enabled(stmpe->vio))
+ regulator_disable(stmpe->vio);
+- if (!IS_ERR(stmpe->vcc))
++ if (!IS_ERR(stmpe->vcc) && regulator_is_enabled(stmpe->vcc))
+ regulator_disable(stmpe->vcc);
+
+ __stmpe_disable(stmpe, STMPE_BLOCK_ADC);
+diff --git a/drivers/mfd/tps65010.c b/drivers/mfd/tps65010.c
+index fb733288cca3b..faea4ff44c6fe 100644
+--- a/drivers/mfd/tps65010.c
++++ b/drivers/mfd/tps65010.c
+@@ -506,12 +506,8 @@ static void tps65010_remove(struct i2c_client *client)
+ struct tps65010 *tps = i2c_get_clientdata(client);
+ struct tps65010_board *board = dev_get_platdata(&client->dev);
+
+- if (board && board->teardown) {
+- int status = board->teardown(client, board->context);
+- if (status < 0)
+- dev_dbg(&client->dev, "board %s %s err %d\n",
+- "teardown", client->name, status);
+- }
++ if (board && board->teardown)
++ board->teardown(client, &tps->chip);
+ if (client->irq > 0)
+ free_irq(client->irq, tps);
+ cancel_delayed_work_sync(&tps->work);
+@@ -619,7 +615,7 @@ static int tps65010_probe(struct i2c_client *client)
+ tps, DEBUG_FOPS);
+
+ /* optionally register GPIOs */
+- if (board && board->base != 0) {
++ if (board) {
+ tps->outmask = board->outmask;
+
+ tps->chip.label = client->name;
+@@ -632,7 +628,7 @@ static int tps65010_probe(struct i2c_client *client)
+ /* NOTE: only partial support for inputs; nyet IRQs */
+ tps->chip.get = tps65010_gpio_get;
+
+- tps->chip.base = board->base;
++ tps->chip.base = -1;
+ tps->chip.ngpio = 7;
+ tps->chip.can_sleep = 1;
+
+@@ -641,7 +637,7 @@ static int tps65010_probe(struct i2c_client *client)
+ dev_err(&client->dev, "can't add gpiochip, err %d\n",
+ status);
+ else if (board->setup) {
+- status = board->setup(client, board->context);
++ status = board->setup(client, &tps->chip);
+ if (status < 0) {
+ dev_dbg(&client->dev,
+ "board %s %s err %d\n",
+diff --git a/drivers/mfd/wcd934x.c b/drivers/mfd/wcd934x.c
+index 07e884087f2c7..281470d6b0b99 100644
+--- a/drivers/mfd/wcd934x.c
++++ b/drivers/mfd/wcd934x.c
+@@ -258,8 +258,9 @@ static int wcd934x_slim_probe(struct slim_device *sdev)
+ usleep_range(600, 650);
+ reset_gpio = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
+ if (IS_ERR(reset_gpio)) {
+- return dev_err_probe(dev, PTR_ERR(reset_gpio),
+- "Failed to get reset gpio: err = %ld\n", PTR_ERR(reset_gpio));
++ ret = dev_err_probe(dev, PTR_ERR(reset_gpio),
++ "Failed to get reset gpio\n");
++ goto err_disable_regulators;
+ }
+ msleep(20);
+ gpiod_set_value(reset_gpio, 1);
+@@ -269,6 +270,10 @@ static int wcd934x_slim_probe(struct slim_device *sdev)
+ dev_set_drvdata(dev, ddata);
+
+ return 0;
++
++err_disable_regulators:
++ regulator_bulk_disable(WCD934X_MAX_SUPPLY, ddata->supplies);
++ return ret;
+ }
+
+ static void wcd934x_slim_remove(struct slim_device *sdev)
+diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
+index 30d4d0476248f..9051551d99373 100644
+--- a/drivers/misc/fastrpc.c
++++ b/drivers/misc/fastrpc.c
+@@ -2225,6 +2225,9 @@ static int fastrpc_device_register(struct device *dev, struct fastrpc_channel_ct
+ fdev->miscdev.fops = &fastrpc_fops;
+ fdev->miscdev.name = devm_kasprintf(dev, GFP_KERNEL, "fastrpc-%s%s",
+ domain, is_secured ? "-secure" : "");
++ if (!fdev->miscdev.name)
++ return -ENOMEM;
++
+ err = misc_register(&fdev->miscdev);
+ if (!err) {
+ if (is_secured)
+diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
+index b4712ff196b4e..0772e4a4757e9 100644
+--- a/drivers/misc/lkdtm/core.c
++++ b/drivers/misc/lkdtm/core.c
+@@ -79,7 +79,7 @@ static struct crashpoint crashpoints[] = {
+ CRASHPOINT("INT_HARDWARE_ENTRY", "do_IRQ"),
+ CRASHPOINT("INT_HW_IRQ_EN", "handle_irq_event"),
+ CRASHPOINT("INT_TASKLET_ENTRY", "tasklet_action"),
+- CRASHPOINT("FS_DEVRW", "ll_rw_block"),
++ CRASHPOINT("FS_SUBMIT_BH", "submit_bh"),
+ CRASHPOINT("MEM_SWAPOUT", "shrink_inactive_list"),
+ CRASHPOINT("TIMERADD", "hrtimer_start"),
+ CRASHPOINT("SCSI_QUEUE_RQ", "scsi_queue_rq"),
+diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
+index d920c41783893..e46330815484d 100644
+--- a/drivers/mmc/core/block.c
++++ b/drivers/mmc/core/block.c
+@@ -178,6 +178,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
+ int recovery_mode,
+ struct mmc_queue *mq);
+ static void mmc_blk_hsq_req_done(struct mmc_request *mrq);
++static int mmc_spi_err_check(struct mmc_card *card);
+
+ static struct mmc_blk_data *mmc_blk_get(struct gendisk *disk)
+ {
+@@ -608,6 +609,11 @@ static int __mmc_blk_ioctl_cmd(struct mmc_card *card, struct mmc_blk_data *md,
+ if ((card->host->caps & MMC_CAP_WAIT_WHILE_BUSY) && use_r1b_resp)
+ return 0;
+
++ if (mmc_host_is_spi(card->host)) {
++ if (idata->ic.write_flag || r1b_resp || cmd.flags & MMC_RSP_SPI_BUSY)
++ return mmc_spi_err_check(card);
++ return err;
++ }
+ /* Ensure RPMB/R1B command has completed by polling with CMD13. */
+ if (idata->rpmb || r1b_resp)
+ err = mmc_poll_for_busy(card, busy_timeout_ms, false,
+diff --git a/drivers/mmc/core/card.h b/drivers/mmc/core/card.h
+index cfdd1ff40b865..4edf9057fa79d 100644
+--- a/drivers/mmc/core/card.h
++++ b/drivers/mmc/core/card.h
+@@ -53,6 +53,10 @@ struct mmc_fixup {
+ unsigned int manfid;
+ unsigned short oemid;
+
++ /* Manufacturing date */
++ unsigned short year;
++ unsigned char month;
++
+ /* SDIO-specific fields. You can use SDIO_ANY_ID here of course */
+ u16 cis_vendor, cis_device;
+
+@@ -68,6 +72,8 @@ struct mmc_fixup {
+
+ #define CID_MANFID_ANY (-1u)
+ #define CID_OEMID_ANY ((unsigned short) -1)
++#define CID_YEAR_ANY ((unsigned short) -1)
++#define CID_MONTH_ANY ((unsigned char) -1)
+ #define CID_NAME_ANY (NULL)
+
+ #define EXT_CSD_REV_ANY (-1u)
+@@ -81,17 +87,21 @@ struct mmc_fixup {
+ #define CID_MANFID_APACER 0x27
+ #define CID_MANFID_KINGSTON 0x70
+ #define CID_MANFID_HYNIX 0x90
++#define CID_MANFID_KINGSTON_SD 0x9F
+ #define CID_MANFID_NUMONYX 0xFE
+
+ #define END_FIXUP { NULL }
+
+-#define _FIXUP_EXT(_name, _manfid, _oemid, _rev_start, _rev_end, \
+- _cis_vendor, _cis_device, \
+- _fixup, _data, _ext_csd_rev) \
++#define _FIXUP_EXT(_name, _manfid, _oemid, _year, _month, \
++ _rev_start, _rev_end, \
++ _cis_vendor, _cis_device, \
++ _fixup, _data, _ext_csd_rev) \
+ { \
+ .name = (_name), \
+ .manfid = (_manfid), \
+ .oemid = (_oemid), \
++ .year = (_year), \
++ .month = (_month), \
+ .rev_start = (_rev_start), \
+ .rev_end = (_rev_end), \
+ .cis_vendor = (_cis_vendor), \
+@@ -103,8 +113,8 @@ struct mmc_fixup {
+
+ #define MMC_FIXUP_REV(_name, _manfid, _oemid, _rev_start, _rev_end, \
+ _fixup, _data, _ext_csd_rev) \
+- _FIXUP_EXT(_name, _manfid, \
+- _oemid, _rev_start, _rev_end, \
++ _FIXUP_EXT(_name, _manfid, _oemid, CID_YEAR_ANY, CID_MONTH_ANY, \
++ _rev_start, _rev_end, \
+ SDIO_ANY_ID, SDIO_ANY_ID, \
+ _fixup, _data, _ext_csd_rev) \
+
+@@ -118,8 +128,9 @@ struct mmc_fixup {
+ _ext_csd_rev)
+
+ #define SDIO_FIXUP(_vendor, _device, _fixup, _data) \
+- _FIXUP_EXT(CID_NAME_ANY, CID_MANFID_ANY, \
+- CID_OEMID_ANY, 0, -1ull, \
++ _FIXUP_EXT(CID_NAME_ANY, CID_MANFID_ANY, CID_OEMID_ANY, \
++ CID_YEAR_ANY, CID_MONTH_ANY, \
++ 0, -1ull, \
+ _vendor, _device, \
+ _fixup, _data, EXT_CSD_REV_ANY) \
+
+@@ -264,4 +275,9 @@ static inline int mmc_card_broken_sd_discard(const struct mmc_card *c)
+ return c->quirks & MMC_QUIRK_BROKEN_SD_DISCARD;
+ }
+
++static inline int mmc_card_broken_sd_cache(const struct mmc_card *c)
++{
++ return c->quirks & MMC_QUIRK_BROKEN_SD_CACHE;
++}
++
+ #endif
+diff --git a/drivers/mmc/core/quirks.h b/drivers/mmc/core/quirks.h
+index 29b9497936df9..857315f185fcf 100644
+--- a/drivers/mmc/core/quirks.h
++++ b/drivers/mmc/core/quirks.h
+@@ -53,6 +53,15 @@ static const struct mmc_fixup __maybe_unused mmc_blk_fixups[] = {
+ MMC_FIXUP("MMC32G", CID_MANFID_TOSHIBA, CID_OEMID_ANY, add_quirk_mmc,
+ MMC_QUIRK_BLK_NO_CMD23),
+
++ /*
++ * Kingston Canvas Go! Plus microSD cards never finish SD cache flush.
++ * This has so far only been observed on cards from 11/2019, while new
++ * cards from 2023/05 do not exhibit this behavior.
++ */
++ _FIXUP_EXT("SD64G", CID_MANFID_KINGSTON_SD, 0x5449, 2019, 11,
++ 0, -1ull, SDIO_ANY_ID, SDIO_ANY_ID, add_quirk_sd,
++ MMC_QUIRK_BROKEN_SD_CACHE, EXT_CSD_REV_ANY),
++
+ /*
+ * Some SD cards lockup while using CMD23 multiblock transfers.
+ */
+@@ -100,6 +109,20 @@ static const struct mmc_fixup __maybe_unused mmc_blk_fixups[] = {
+ MMC_FIXUP("V10016", CID_MANFID_KINGSTON, CID_OEMID_ANY, add_quirk_mmc,
+ MMC_QUIRK_TRIM_BROKEN),
+
++ /*
++ * Kingston EMMC04G-M627 advertises TRIM but it does not seem to
++ * support being used to offload WRITE_ZEROES.
++ */
++ MMC_FIXUP("M62704", CID_MANFID_KINGSTON, 0x0100, add_quirk_mmc,
++ MMC_QUIRK_TRIM_BROKEN),
++
++ /*
++ * Micron MTFC4GACAJCN-1M advertises TRIM but it does not seem to
++ * support being used to offload WRITE_ZEROES.
++ */
++ MMC_FIXUP("Q2J54A", CID_MANFID_MICRON, 0x014e, add_quirk_mmc,
++ MMC_QUIRK_TRIM_BROKEN),
++
+ /*
+ * Some SD cards reports discard support while they don't
+ */
+@@ -209,6 +232,10 @@ static inline void mmc_fixup_device(struct mmc_card *card,
+ if (f->of_compatible &&
+ !mmc_fixup_of_compatible_match(card, f->of_compatible))
+ continue;
++ if (f->year != CID_YEAR_ANY && f->year != card->cid.year)
++ continue;
++ if (f->month != CID_MONTH_ANY && f->month != card->cid.month)
++ continue;
+
+ dev_dbg(&card->dev, "calling %ps\n", f->vendor_fixup);
+ f->vendor_fixup(card, f->data);
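
The new year/month fields slot into the existing wildcard scheme: a fixup entry applies only when every populated field matches the card's CID, and the *_ANY constants opt a field out of the comparison, which is how the Kingston entry above can target only cards manufactured in 11/2019. A compact sketch of that table walk (field names and values simplified, not the real mmc structures):

#include <stdio.h>

#define MANFID_ANY  (-1u)
#define YEAR_ANY    ((unsigned short)-1)
#define MONTH_ANY   ((unsigned char)-1)

struct card { unsigned manfid; unsigned short year; unsigned char month; };

struct fixup {
	unsigned manfid;
	unsigned short year;	/* manufacturing date, or the ANY wildcards */
	unsigned char month;
	const char *quirk;
};

/* Every field must either match the card or be the wildcard value,
 * mirroring how mmc_fixup_device() now also filters on the CID date. */
static const char *match_quirk(const struct card *c,
			       const struct fixup *f, int n)
{
	for (int i = 0; i < n; i++) {
		if (f[i].manfid != MANFID_ANY && f[i].manfid != c->manfid)
			continue;
		if (f[i].year != YEAR_ANY && f[i].year != c->year)
			continue;
		if (f[i].month != MONTH_ANY && f[i].month != c->month)
			continue;
		return f[i].quirk;
	}
	return NULL;
}

int main(void)
{
	static const struct fixup table[] = {
		{ 0x9F, 2019, 11, "BROKEN_SD_CACHE" },		/* date-specific */
		{ 0x70, YEAR_ANY, MONTH_ANY, "TRIM_BROKEN" },	/* any date */
	};
	struct card c = { 0x9F, 2019, 11 };
	const char *q = match_quirk(&c, table, 2);

	printf("quirk: %s\n", q ? q : "none");
	return 0;
}
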
+diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
+index 72b664ed90cf6..246ce027ae0aa 100644
+--- a/drivers/mmc/core/sd.c
++++ b/drivers/mmc/core/sd.c
+@@ -1170,7 +1170,7 @@ static int sd_parse_ext_reg_perf(struct mmc_card *card, u8 fno, u8 page,
+ card->ext_perf.feature_support |= SD_EXT_PERF_HOST_MAINT;
+
+ /* Cache support at bit 0. */
+- if (reg_buf[4] & BIT(0))
++ if ((reg_buf[4] & BIT(0)) && !mmc_card_broken_sd_cache(card))
+ card->ext_perf.feature_support |= SD_EXT_PERF_CACHE;
+
+ /* Command queue support indicated via queue depth bits (0 to 4). */
+diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
+index 696cbef3ff7de..f724bd0d2a612 100644
+--- a/drivers/mmc/host/mmci.c
++++ b/drivers/mmc/host/mmci.c
+@@ -2456,6 +2456,7 @@ static struct amba_driver mmci_driver = {
+ .drv = {
+ .name = DRIVER_NAME,
+ .pm = &mmci_dev_pm_ops,
++ .probe_type = PROBE_PREFER_ASYNCHRONOUS,
+ },
+ .probe = mmci_probe,
+ .remove = mmci_remove,
+diff --git a/drivers/mmc/host/mtk-sd.c b/drivers/mmc/host/mtk-sd.c
+index 9785ec91654f7..97c42aacaf346 100644
+--- a/drivers/mmc/host/mtk-sd.c
++++ b/drivers/mmc/host/mtk-sd.c
+@@ -2707,7 +2707,7 @@ static int msdc_drv_probe(struct platform_device *pdev)
+
+ /* Support for SDIO eint irq ? */
+ if ((mmc->pm_caps & MMC_PM_WAKE_SDIO_IRQ) && (mmc->pm_caps & MMC_PM_KEEP_POWER)) {
+- host->eint_irq = platform_get_irq_byname(pdev, "sdio_wakeup");
++ host->eint_irq = platform_get_irq_byname_optional(pdev, "sdio_wakeup");
+ if (host->eint_irq > 0) {
+ host->pins_eint = pinctrl_lookup_state(host->pinctrl, "state_eint");
+ if (IS_ERR(host->pins_eint)) {
+diff --git a/drivers/mmc/host/omap.c b/drivers/mmc/host/omap.c
+index 86454f1182bb1..6a259563690d6 100644
+--- a/drivers/mmc/host/omap.c
++++ b/drivers/mmc/host/omap.c
+@@ -26,6 +26,7 @@
+ #include <linux/clk.h>
+ #include <linux/scatterlist.h>
+ #include <linux/slab.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/platform_data/mmc-omap.h>
+
+
+@@ -111,6 +112,9 @@ struct mmc_omap_slot {
+ struct mmc_request *mrq;
+ struct mmc_omap_host *host;
+ struct mmc_host *mmc;
++ struct gpio_desc *vsd;
++ struct gpio_desc *vio;
++ struct gpio_desc *cover;
+ struct omap_mmc_slot_data *pdata;
+ };
+
+@@ -133,6 +137,7 @@ struct mmc_omap_host {
+ int irq;
+ unsigned char bus_mode;
+ unsigned int reg_shift;
++ struct gpio_desc *slot_switch;
+
+ struct work_struct cmd_abort_work;
+ unsigned abort:1;
+@@ -216,8 +221,13 @@ no_claim:
+
+ if (host->current_slot != slot) {
+ OMAP_MMC_WRITE(host, CON, slot->saved_con & 0xFC00);
+- if (host->pdata->switch_slot != NULL)
+- host->pdata->switch_slot(mmc_dev(slot->mmc), slot->id);
++ if (host->slot_switch)
++ /*
++ * With two slots and a simple GPIO switch, setting
++ * the GPIO to 0 selects slot ID 0, setting it to 1
++ * selects slot ID 1.
++ */
++ gpiod_set_value(host->slot_switch, slot->id);
+ host->current_slot = slot;
+ }
+
+@@ -297,6 +307,9 @@ static void mmc_omap_release_slot(struct mmc_omap_slot *slot, int clk_enabled)
+ static inline
+ int mmc_omap_cover_is_open(struct mmc_omap_slot *slot)
+ {
++ /* If we have a GPIO then use that */
++ if (slot->cover)
++ return gpiod_get_value(slot->cover);
+ if (slot->pdata->get_cover_state)
+ return slot->pdata->get_cover_state(mmc_dev(slot->mmc),
+ slot->id);
+@@ -1106,6 +1119,11 @@ static void mmc_omap_set_power(struct mmc_omap_slot *slot, int power_on,
+
+ host = slot->host;
+
++ if (slot->vsd)
++ gpiod_set_value(slot->vsd, power_on);
++ if (slot->vio)
++ gpiod_set_value(slot->vio, power_on);
++
+ if (slot->pdata->set_power != NULL)
+ slot->pdata->set_power(mmc_dev(slot->mmc), slot->id, power_on,
+ vdd);
+@@ -1240,6 +1258,23 @@ static int mmc_omap_new_slot(struct mmc_omap_host *host, int id)
+ slot->power_mode = MMC_POWER_UNDEFINED;
+ slot->pdata = &host->pdata->slots[id];
+
++ /* Check for some optional GPIO controls */
++ slot->vsd = gpiod_get_index_optional(host->dev, "vsd",
++ id, GPIOD_OUT_LOW);
++ if (IS_ERR(slot->vsd))
++ return dev_err_probe(host->dev, PTR_ERR(slot->vsd),
++ "error looking up VSD GPIO\n");
++ slot->vio = gpiod_get_index_optional(host->dev, "vio",
++ id, GPIOD_OUT_LOW);
++ if (IS_ERR(slot->vio))
++ return dev_err_probe(host->dev, PTR_ERR(slot->vio),
++ "error looking up VIO GPIO\n");
++ slot->cover = gpiod_get_index_optional(host->dev, "cover",
++ id, GPIOD_IN);
++ if (IS_ERR(slot->cover))
++ return dev_err_probe(host->dev, PTR_ERR(slot->cover),
++ "error looking up cover switch GPIO\n");
++
+ host->slots[id] = slot;
+
+ mmc->caps = 0;
+@@ -1349,6 +1384,13 @@ static int mmc_omap_probe(struct platform_device *pdev)
+ if (IS_ERR(host->virt_base))
+ return PTR_ERR(host->virt_base);
+
++ host->slot_switch = gpiod_get_optional(host->dev, "switch",
++ GPIOD_OUT_LOW);
++ if (IS_ERR(host->slot_switch))
++ return dev_err_probe(host->dev, PTR_ERR(host->slot_switch),
++ "error looking up slot switch GPIO\n");
++
++
+ INIT_WORK(&host->slot_release_work, mmc_omap_slot_release_work);
+ INIT_WORK(&host->send_stop_work, mmc_omap_send_stop_work);
+
+diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
+index 3241916141d7d..ff41aa56564ea 100644
+--- a/drivers/mmc/host/sdhci.c
++++ b/drivers/mmc/host/sdhci.c
+@@ -1167,6 +1167,8 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
+ }
+ }
+
++ sdhci_config_dma(host);
++
+ if (host->flags & SDHCI_REQ_USE_DMA) {
+ int sg_cnt = sdhci_pre_dma_transfer(host, data, COOKIE_MAPPED);
+
+@@ -1186,8 +1188,6 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
+ }
+ }
+
+- sdhci_config_dma(host);
+-
+ if (!(host->flags & SDHCI_REQ_USE_DMA)) {
+ int flags;
+
+diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
+index edbaa1444f8ec..091e035c76a6f 100644
+--- a/drivers/net/bonding/bond_main.c
++++ b/drivers/net/bonding/bond_main.c
+@@ -4197,7 +4197,7 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
+ return skb->hash;
+
+ return __bond_xmit_hash(bond, skb, skb->data, skb->protocol,
+- skb_mac_offset(skb), skb_network_offset(skb),
++ 0, skb_network_offset(skb),
+ skb_headlen(skb));
+ }
+
+diff --git a/drivers/net/can/kvaser_pciefd.c b/drivers/net/can/kvaser_pciefd.c
+index be189edb256ce..37f4befca0345 100644
+--- a/drivers/net/can/kvaser_pciefd.c
++++ b/drivers/net/can/kvaser_pciefd.c
+@@ -538,6 +538,13 @@ static int kvaser_pciefd_set_tx_irq(struct kvaser_pciefd_can *can)
+ return 0;
+ }
+
++static inline void kvaser_pciefd_set_skb_timestamp(const struct kvaser_pciefd *pcie,
++ struct sk_buff *skb, u64 timestamp)
++{
++ skb_hwtstamps(skb)->hwtstamp =
++ ns_to_ktime(div_u64(timestamp * 1000, pcie->freq_to_ticks_div));
++}
++
+ static void kvaser_pciefd_setup_controller(struct kvaser_pciefd_can *can)
+ {
+ u32 mode;
+@@ -1171,7 +1178,6 @@ static int kvaser_pciefd_handle_data_packet(struct kvaser_pciefd *pcie,
+ struct canfd_frame *cf;
+ struct can_priv *priv;
+ struct net_device_stats *stats;
+- struct skb_shared_hwtstamps *shhwtstamps;
+ u8 ch_id = (p->header[1] >> KVASER_PCIEFD_PACKET_CHID_SHIFT) & 0x7;
+
+ if (ch_id >= pcie->nr_channels)
+@@ -1214,12 +1220,7 @@ static int kvaser_pciefd_handle_data_packet(struct kvaser_pciefd *pcie,
+ stats->rx_bytes += cf->len;
+ }
+ stats->rx_packets++;
+-
+- shhwtstamps = skb_hwtstamps(skb);
+-
+- shhwtstamps->hwtstamp =
+- ns_to_ktime(div_u64(p->timestamp * 1000,
+- pcie->freq_to_ticks_div));
++ kvaser_pciefd_set_skb_timestamp(pcie, skb, p->timestamp);
+
+ return netif_rx(skb);
+ }
+@@ -1282,7 +1283,6 @@ static int kvaser_pciefd_rx_error_frame(struct kvaser_pciefd_can *can,
+ struct net_device *ndev = can->can.dev;
+ struct sk_buff *skb;
+ struct can_frame *cf = NULL;
+- struct skb_shared_hwtstamps *shhwtstamps;
+ struct net_device_stats *stats = &ndev->stats;
+
+ old_state = can->can.state;
+@@ -1323,10 +1323,7 @@ static int kvaser_pciefd_rx_error_frame(struct kvaser_pciefd_can *can,
+ return -ENOMEM;
+ }
+
+- shhwtstamps = skb_hwtstamps(skb);
+- shhwtstamps->hwtstamp =
+- ns_to_ktime(div_u64(p->timestamp * 1000,
+- can->kv_pcie->freq_to_ticks_div));
++ kvaser_pciefd_set_skb_timestamp(can->kv_pcie, skb, p->timestamp);
+ cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_CNT;
+
+ cf->data[6] = bec.txerr;
+@@ -1374,7 +1371,6 @@ static int kvaser_pciefd_handle_status_resp(struct kvaser_pciefd_can *can,
+ struct net_device *ndev = can->can.dev;
+ struct sk_buff *skb;
+ struct can_frame *cf;
+- struct skb_shared_hwtstamps *shhwtstamps;
+
+ skb = alloc_can_err_skb(ndev, &cf);
+ if (!skb) {
+@@ -1394,10 +1390,7 @@ static int kvaser_pciefd_handle_status_resp(struct kvaser_pciefd_can *can,
+ cf->can_id |= CAN_ERR_RESTARTED;
+ }
+
+- shhwtstamps = skb_hwtstamps(skb);
+- shhwtstamps->hwtstamp =
+- ns_to_ktime(div_u64(p->timestamp * 1000,
+- can->kv_pcie->freq_to_ticks_div));
++ kvaser_pciefd_set_skb_timestamp(can->kv_pcie, skb, p->timestamp);
+
+ cf->data[6] = bec.txerr;
+ cf->data[7] = bec.rxerr;
+@@ -1526,6 +1519,7 @@ static void kvaser_pciefd_handle_nack_packet(struct kvaser_pciefd_can *can,
+
+ if (skb) {
+ cf->can_id |= CAN_ERR_BUSERROR;
++ kvaser_pciefd_set_skb_timestamp(can->kv_pcie, skb, p->timestamp);
+ netif_rx(skb);
+ } else {
+ stats->rx_dropped++;
+@@ -1557,8 +1551,15 @@ static int kvaser_pciefd_handle_ack_packet(struct kvaser_pciefd *pcie,
+ netdev_dbg(can->can.dev, "Packet was flushed\n");
+ } else {
+ int echo_idx = p->header[0] & KVASER_PCIEFD_PACKET_SEQ_MSK;
+- int dlc = can_get_echo_skb(can->can.dev, echo_idx, NULL);
+- u8 count = ioread32(can->reg_base +
++ int dlc;
++ u8 count;
++ struct sk_buff *skb;
++
++ skb = can->can.echo_skb[echo_idx];
++ if (skb)
++ kvaser_pciefd_set_skb_timestamp(pcie, skb, p->timestamp);
++ dlc = can_get_echo_skb(can->can.dev, echo_idx, NULL);
++ count = ioread32(can->reg_base +
+ KVASER_PCIEFD_KCAN_TX_NPACKETS_REG) & 0xff;
+
+ if (count < KVASER_PCIEFD_CAN_TX_MAX_COUNT &&
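
kvaser_pciefd_set_skb_timestamp() simply centralises the tick-to-nanosecond conversion that the RX, error and (now) echo paths all need: scale the controller timestamp by 1000 and divide by the device's frequency divider. A trivial sketch of the arithmetic (the divider value below is made up; the real one comes from the device clock setup):

#include <stdint.h>
#include <stdio.h>

/* Models the helper: controller ticks -> nanoseconds, which the driver
 * then stores in the skb's hardware timestamp as a ktime_t. */
static uint64_t ticks_to_ns(uint64_t timestamp, uint32_t freq_to_ticks_div)
{
	return timestamp * 1000ULL / freq_to_ticks_div;
}

int main(void)
{
	printf("%llu ns\n", (unsigned long long)ticks_to_ns(123456, 80));
	return 0;
}
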
+diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
+index 80861ac090ae3..70c0e2b1936b3 100644
+--- a/drivers/net/dsa/ocelot/felix.c
++++ b/drivers/net/dsa/ocelot/felix.c
+@@ -1725,6 +1725,18 @@ static bool felix_rxtstamp(struct dsa_switch *ds, int port,
+ u32 tstamp_hi;
+ u64 tstamp;
+
++ switch (type & PTP_CLASS_PMASK) {
++ case PTP_CLASS_L2:
++ if (!(ocelot->ports[port]->trap_proto & OCELOT_PROTO_PTP_L2))
++ return false;
++ break;
++ case PTP_CLASS_IPV4:
++ case PTP_CLASS_IPV6:
++ if (!(ocelot->ports[port]->trap_proto & OCELOT_PROTO_PTP_L4))
++ return false;
++ break;
++ }
++
+ /* If the "no XTR IRQ" workaround is in use, tell DSA to defer this skb
+ * for RX timestamping. Then free it, and poll for its copy through
+ * MMIO in the CPU port module, and inject that into the stack from
+diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
+index fb1549a5fe321..dee35ba924ad2 100644
+--- a/drivers/net/dsa/sja1105/sja1105.h
++++ b/drivers/net/dsa/sja1105/sja1105.h
+@@ -252,6 +252,7 @@ struct sja1105_private {
+ unsigned long ucast_egress_floods;
+ unsigned long bcast_egress_floods;
+ unsigned long hwts_tx_en;
++ unsigned long hwts_rx_en;
+ const struct sja1105_info *info;
+ size_t max_xfer_len;
+ struct spi_device *spidev;
+@@ -289,7 +290,6 @@ struct sja1105_spi_message {
+ /* From sja1105_main.c */
+ enum sja1105_reset_reason {
+ SJA1105_VLAN_FILTERING = 0,
+- SJA1105_RX_HWTSTAMPING,
+ SJA1105_AGEING_TIME,
+ SJA1105_SCHEDULING,
+ SJA1105_BEST_EFFORT_POLICING,
+diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
+index b70dcf32a26dc..947e8f7c09880 100644
+--- a/drivers/net/dsa/sja1105/sja1105_main.c
++++ b/drivers/net/dsa/sja1105/sja1105_main.c
+@@ -866,12 +866,12 @@ static int sja1105_init_general_params(struct sja1105_private *priv)
+ .hostprio = 7,
+ .mac_fltres1 = SJA1105_LINKLOCAL_FILTER_A,
+ .mac_flt1 = SJA1105_LINKLOCAL_FILTER_A_MASK,
+- .incl_srcpt1 = false,
+- .send_meta1 = false,
++ .incl_srcpt1 = true,
++ .send_meta1 = true,
+ .mac_fltres0 = SJA1105_LINKLOCAL_FILTER_B,
+ .mac_flt0 = SJA1105_LINKLOCAL_FILTER_B_MASK,
+- .incl_srcpt0 = false,
+- .send_meta0 = false,
++ .incl_srcpt0 = true,
++ .send_meta0 = true,
+ /* Default to an invalid value */
+ .mirr_port = priv->ds->num_ports,
+ /* No TTEthernet */
+@@ -2215,7 +2215,6 @@ static int sja1105_reload_cbs(struct sja1105_private *priv)
+
+ static const char * const sja1105_reset_reasons[] = {
+ [SJA1105_VLAN_FILTERING] = "VLAN filtering",
+- [SJA1105_RX_HWTSTAMPING] = "RX timestamping",
+ [SJA1105_AGEING_TIME] = "Ageing time",
+ [SJA1105_SCHEDULING] = "Time-aware scheduling",
+ [SJA1105_BEST_EFFORT_POLICING] = "Best-effort policing",
+@@ -2407,11 +2406,6 @@ int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled,
+ general_params->tpid = tpid;
+ /* EtherType used to identify outer tagged (S-tag) VLAN traffic */
+ general_params->tpid2 = tpid2;
+- /* When VLAN filtering is on, we need to at least be able to
+- * decode management traffic through the "backup plan".
+- */
+- general_params->incl_srcpt1 = enabled;
+- general_params->incl_srcpt0 = enabled;
+
+ for (port = 0; port < ds->num_ports; port++) {
+ if (dsa_is_unused_port(ds, port))
+diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
+index 30fb2cc40164b..a7d41e7813982 100644
+--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
++++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
+@@ -58,35 +58,10 @@ enum sja1105_ptp_clk_mode {
+ #define ptp_data_to_sja1105(d) \
+ container_of((d), struct sja1105_private, ptp_data)
+
+-/* Must be called only while the RX timestamping state of the tagger
+- * is turned off
+- */
+-static int sja1105_change_rxtstamping(struct sja1105_private *priv,
+- bool on)
+-{
+- struct sja1105_ptp_data *ptp_data = &priv->ptp_data;
+- struct sja1105_general_params_entry *general_params;
+- struct sja1105_table *table;
+-
+- table = &priv->static_config.tables[BLK_IDX_GENERAL_PARAMS];
+- general_params = table->entries;
+- general_params->send_meta1 = on;
+- general_params->send_meta0 = on;
+-
+- ptp_cancel_worker_sync(ptp_data->clock);
+- skb_queue_purge(&ptp_data->skb_txtstamp_queue);
+- skb_queue_purge(&ptp_data->skb_rxtstamp_queue);
+-
+- return sja1105_static_config_reload(priv, SJA1105_RX_HWTSTAMPING);
+-}
+-
+ int sja1105_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr)
+ {
+- struct sja1105_tagger_data *tagger_data = sja1105_tagger_data(ds);
+ struct sja1105_private *priv = ds->priv;
+ struct hwtstamp_config config;
+- bool rx_on;
+- int rc;
+
+ if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
+ return -EFAULT;
+@@ -104,26 +79,13 @@ int sja1105_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr)
+
+ switch (config.rx_filter) {
+ case HWTSTAMP_FILTER_NONE:
+- rx_on = false;
++ priv->hwts_rx_en &= ~BIT(port);
+ break;
+ default:
+- rx_on = true;
++ priv->hwts_rx_en |= BIT(port);
+ break;
+ }
+
+- if (rx_on != tagger_data->rxtstamp_get_state(ds)) {
+- tagger_data->rxtstamp_set_state(ds, false);
+-
+- rc = sja1105_change_rxtstamping(priv, rx_on);
+- if (rc < 0) {
+- dev_err(ds->dev,
+- "Failed to change RX timestamping: %d\n", rc);
+- return rc;
+- }
+- if (rx_on)
+- tagger_data->rxtstamp_set_state(ds, true);
+- }
+-
+ if (copy_to_user(ifr->ifr_data, &config, sizeof(config)))
+ return -EFAULT;
+ return 0;
+@@ -131,7 +93,6 @@ int sja1105_hwtstamp_set(struct dsa_switch *ds, int port, struct ifreq *ifr)
+
+ int sja1105_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr)
+ {
+- struct sja1105_tagger_data *tagger_data = sja1105_tagger_data(ds);
+ struct sja1105_private *priv = ds->priv;
+ struct hwtstamp_config config;
+
+@@ -140,7 +101,7 @@ int sja1105_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr)
+ config.tx_type = HWTSTAMP_TX_ON;
+ else
+ config.tx_type = HWTSTAMP_TX_OFF;
+- if (tagger_data->rxtstamp_get_state(ds))
++ if (priv->hwts_rx_en & BIT(port))
+ config.rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT;
+ else
+ config.rx_filter = HWTSTAMP_FILTER_NONE;
+@@ -413,11 +374,10 @@ static long sja1105_rxtstamp_work(struct ptp_clock_info *ptp)
+
+ bool sja1105_rxtstamp(struct dsa_switch *ds, int port, struct sk_buff *skb)
+ {
+- struct sja1105_tagger_data *tagger_data = sja1105_tagger_data(ds);
+ struct sja1105_private *priv = ds->priv;
+ struct sja1105_ptp_data *ptp_data = &priv->ptp_data;
+
+- if (!tagger_data->rxtstamp_get_state(ds))
++ if (!(priv->hwts_rx_en & BIT(port)))
+ return false;
+
+ /* We need to read the full PTP clock to reconstruct the Rx
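
With the static-config reload gone, RX timestamping on sja1105 is now purely software state: sja1105_hwtstamp_set() flips the port's bit in hwts_rx_en and sja1105_rxtstamp() checks it before bothering to reconstruct a timestamp. The pattern is just a per-port enable mask, roughly:

#include <stdbool.h>
#include <stdio.h>

static unsigned long hwts_rx_en;	/* one bit per front-panel port */

static void set_rx_tstamp(int port, bool on)
{
	if (on)
		hwts_rx_en |= 1UL << port;
	else
		hwts_rx_en &= ~(1UL << port);
}

static bool want_rx_tstamp(int port)
{
	return hwts_rx_en & (1UL << port);
}

int main(void)
{
	set_rx_tstamp(2, true);
	printf("port 2: %d, port 3: %d\n", want_rx_tstamp(2), want_rx_tstamp(3));
	return 0;
}
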
+diff --git a/drivers/net/dsa/vitesse-vsc73xx-core.c b/drivers/net/dsa/vitesse-vsc73xx-core.c
+index ae55167ce0a6f..ef1a4a7c47b23 100644
+--- a/drivers/net/dsa/vitesse-vsc73xx-core.c
++++ b/drivers/net/dsa/vitesse-vsc73xx-core.c
+@@ -1025,17 +1025,17 @@ static int vsc73xx_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
+ struct vsc73xx *vsc = ds->priv;
+
+ return vsc73xx_write(vsc, VSC73XX_BLOCK_MAC, port,
+- VSC73XX_MAXLEN, new_mtu);
++ VSC73XX_MAXLEN, new_mtu + ETH_HLEN + ETH_FCS_LEN);
+ }
+
+ /* According to application not "VSC7398 Jumbo Frames" setting
+- * up the MTU to 9.6 KB does not affect the performance on standard
++ * up the frame size to 9.6 KB does not affect the performance on standard
+ * frames. It is clear from the application note that
+ * "9.6 kilobytes" == 9600 bytes.
+ */
+ static int vsc73xx_get_max_mtu(struct dsa_switch *ds, int port)
+ {
+- return 9600;
++ return 9600 - ETH_HLEN - ETH_FCS_LEN;
+ }
+
+ static const struct dsa_switch_ops vsc73xx_ds_ops = {
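
The vsc73xx fix is the usual MTU-versus-frame-length confusion: the MAXLEN register counts the whole Ethernet frame, while DSA's MTU excludes the L2 header and FCS, so the driver now adds ETH_HLEN + ETH_FCS_LEN when programming the register and subtracts them when reporting the 9600-byte limit. The arithmetic, spelled out:

#include <stdio.h>

#define ETH_HLEN    14	/* Ethernet header */
#define ETH_FCS_LEN  4	/* frame check sequence */

/* The switch register holds a maximum frame length; DSA works in MTU
 * (payload) terms, so translate in both directions. */
static int mtu_to_maxlen(int mtu)        { return mtu + ETH_HLEN + ETH_FCS_LEN; }
static int maxlen_to_max_mtu(int maxlen) { return maxlen - ETH_HLEN - ETH_FCS_LEN; }

int main(void)
{
	printf("MAXLEN for MTU 1500: %d\n", mtu_to_maxlen(1500));		/* 1518 */
	printf("max MTU for 9600-byte frames: %d\n", maxlen_to_max_mtu(9600));	/* 9582 */
	return 0;
}
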
+diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
+index 58747292521d8..a52cf9aae4988 100644
+--- a/drivers/net/ethernet/broadcom/tg3.c
++++ b/drivers/net/ethernet/broadcom/tg3.c
+@@ -224,6 +224,7 @@ MODULE_AUTHOR("David S. Miller (davem@redhat.com) and Jeff Garzik (jgarzik@pobox
+ MODULE_DESCRIPTION("Broadcom Tigon3 ethernet driver");
+ MODULE_LICENSE("GPL");
+ MODULE_FIRMWARE(FIRMWARE_TG3);
++MODULE_FIRMWARE(FIRMWARE_TG357766);
+ MODULE_FIRMWARE(FIRMWARE_TG3TSO);
+ MODULE_FIRMWARE(FIRMWARE_TG3TSO5);
+
+diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
+index c63d3ec9d3284..763d613adbcc0 100644
+--- a/drivers/net/ethernet/ibm/ibmvnic.c
++++ b/drivers/net/ethernet/ibm/ibmvnic.c
+@@ -1816,7 +1816,14 @@ static int __ibmvnic_open(struct net_device *netdev)
+ if (prev_state == VNIC_CLOSED)
+ enable_irq(adapter->tx_scrq[i]->irq);
+ enable_scrq_irq(adapter, adapter->tx_scrq[i]);
+- netdev_tx_reset_queue(netdev_get_tx_queue(netdev, i));
++ /* netdev_tx_reset_queue will reset dql stats. During NON_FATAL
++ * resets, don't reset the stats because there could be batched
++ * skb's waiting to be sent. If we reset dql stats, we risk
++ * num_completed being greater than num_queued. This will cause
++ * a BUG_ON in dql_completed().
++ */
++ if (adapter->reset_reason != VNIC_RESET_NON_FATAL)
++ netdev_tx_reset_queue(netdev_get_tx_queue(netdev, i));
+ }
+
+ rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_UP);
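
The ibmvnic comment spells the invariant out: dynamic queue limits assume completions never outnumber queued packets, so zeroing the counters while batched skbs are still in flight makes the eventual completion trip the BUG_ON. A toy model of that invariant (a userspace assert standing in for dql_completed()'s BUG_ON):

#include <assert.h>
#include <stdio.h>

struct dql { unsigned int num_queued, num_completed; };

static void queued(struct dql *d, unsigned int n)
{
	d->num_queued += n;
}

static void completed(struct dql *d, unsigned int n)
{
	d->num_completed += n;
	assert(d->num_completed <= d->num_queued);	/* models the BUG_ON */
}

int main(void)
{
	struct dql d = { 0, 0 };

	queued(&d, 3);		/* batched skbs still waiting to be sent */
	/* Resetting here, i.e. d = (struct dql){ 0, 0 }, would make the
	 * completion below trip the assertion; that is why the driver skips
	 * the reset on NON_FATAL resets. */
	completed(&d, 3);
	puts("ok");
	return 0;
}
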
+diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
+index aa32111afd6ed..50ccde7942f2d 100644
+--- a/drivers/net/ethernet/intel/ice/ice.h
++++ b/drivers/net/ethernet/intel/ice/ice.h
+@@ -514,6 +514,12 @@ enum ice_pf_flags {
+ ICE_PF_FLAGS_NBITS /* must be last */
+ };
+
++enum ice_misc_thread_tasks {
++ ICE_MISC_THREAD_EXTTS_EVENT,
++ ICE_MISC_THREAD_TX_TSTAMP,
++ ICE_MISC_THREAD_NBITS /* must be last */
++};
++
+ struct ice_switchdev_info {
+ struct ice_vsi *control_vsi;
+ struct ice_vsi *uplink_vsi;
+@@ -556,6 +562,7 @@ struct ice_pf {
+ DECLARE_BITMAP(features, ICE_F_MAX);
+ DECLARE_BITMAP(state, ICE_STATE_NBITS);
+ DECLARE_BITMAP(flags, ICE_PF_FLAGS_NBITS);
++ DECLARE_BITMAP(misc_thread, ICE_MISC_THREAD_NBITS);
+ unsigned long *avail_txqs; /* bitmap to track PF Tx queue usage */
+ unsigned long *avail_rxqs; /* bitmap to track PF Rx queue usage */
+ unsigned long serv_tmr_period;
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index 42c318ceff618..fcc027c938fda 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -3141,20 +3141,28 @@ static irqreturn_t ice_misc_intr(int __always_unused irq, void *data)
+
+ if (oicr & PFINT_OICR_TSYN_TX_M) {
+ ena_mask &= ~PFINT_OICR_TSYN_TX_M;
+- if (!hw->reset_ongoing)
++ if (!hw->reset_ongoing) {
++ set_bit(ICE_MISC_THREAD_TX_TSTAMP, pf->misc_thread);
+ ret = IRQ_WAKE_THREAD;
++ }
+ }
+
+ if (oicr & PFINT_OICR_TSYN_EVNT_M) {
+ u8 tmr_idx = hw->func_caps.ts_func_info.tmr_index_owned;
+ u32 gltsyn_stat = rd32(hw, GLTSYN_STAT(tmr_idx));
+
+- /* Save EVENTs from GTSYN register */
+- pf->ptp.ext_ts_irq |= gltsyn_stat & (GLTSYN_STAT_EVENT0_M |
+- GLTSYN_STAT_EVENT1_M |
+- GLTSYN_STAT_EVENT2_M);
+ ena_mask &= ~PFINT_OICR_TSYN_EVNT_M;
+- kthread_queue_work(pf->ptp.kworker, &pf->ptp.extts_work);
++
++ if (hw->func_caps.ts_func_info.src_tmr_owned) {
++ /* Save EVENTs from GLTSYN register */
++ pf->ptp.ext_ts_irq |= gltsyn_stat &
++ (GLTSYN_STAT_EVENT0_M |
++ GLTSYN_STAT_EVENT1_M |
++ GLTSYN_STAT_EVENT2_M);
++
++ set_bit(ICE_MISC_THREAD_EXTTS_EVENT, pf->misc_thread);
++ ret = IRQ_WAKE_THREAD;
++ }
+ }
+
+ #define ICE_AUX_CRIT_ERR (PFINT_OICR_PE_CRITERR_M | PFINT_OICR_HMC_ERR_M | PFINT_OICR_PE_PUSH_M)
+@@ -3198,8 +3206,13 @@ static irqreturn_t ice_misc_intr_thread_fn(int __always_unused irq, void *data)
+ if (ice_is_reset_in_progress(pf->state))
+ return IRQ_HANDLED;
+
+- while (!ice_ptp_process_ts(pf))
+- usleep_range(50, 100);
++ if (test_and_clear_bit(ICE_MISC_THREAD_EXTTS_EVENT, pf->misc_thread))
++ ice_ptp_extts_event(pf);
++
++ if (test_and_clear_bit(ICE_MISC_THREAD_TX_TSTAMP, pf->misc_thread)) {
++ while (!ice_ptp_process_ts(pf))
++ usleep_range(50, 100);
++ }
+
+ return IRQ_HANDLED;
+ }
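
The new misc_thread bitmap lets the hard IRQ record exactly which tasks need the threaded handler, and the thread consumes each bit atomically so work flagged while it runs is not lost. A compact userspace sketch of that hand-off (C11 atomics standing in for set_bit()/test_and_clear_bit(); the IRQ/thread split is only simulated here):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

enum { TASK_EXTTS = 1u << 0, TASK_TX_TSTAMP = 1u << 1 };

static atomic_uint misc_thread;		/* models pf->misc_thread */

/* Hard-IRQ half: note which task needs servicing, then wake the thread. */
static void irq_mark(unsigned int task)
{
	atomic_fetch_or(&misc_thread, task);	/* like set_bit() */
}

static bool test_and_clear(atomic_uint *map, unsigned int bit)
{
	return atomic_fetch_and(map, ~bit) & bit;	/* like test_and_clear_bit() */
}

/* Threaded half: consume each pending task bit atomically before acting. */
static void thread_fn(void)
{
	if (test_and_clear(&misc_thread, TASK_EXTTS))
		puts("handle external timestamp event");
	if (test_and_clear(&misc_thread, TASK_TX_TSTAMP))
		puts("process Tx timestamps");
}

int main(void)
{
	irq_mark(TASK_TX_TSTAMP);
	thread_fn();
	return 0;
}
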
+diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
+index ac6f06f9a2ed0..e8507d09b8488 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
++++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
+@@ -1458,15 +1458,11 @@ static int ice_ptp_adjfine(struct ptp_clock_info *info, long scaled_ppm)
+ }
+
+ /**
+- * ice_ptp_extts_work - Workqueue task function
+- * @work: external timestamp work structure
+- *
+- * Service for PTP external clock event
++ * ice_ptp_extts_event - Process PTP external clock event
++ * @pf: Board private structure
+ */
+-static void ice_ptp_extts_work(struct kthread_work *work)
++void ice_ptp_extts_event(struct ice_pf *pf)
+ {
+- struct ice_ptp *ptp = container_of(work, struct ice_ptp, extts_work);
+- struct ice_pf *pf = container_of(ptp, struct ice_pf, ptp);
+ struct ptp_clock_event event;
+ struct ice_hw *hw = &pf->hw;
+ u8 chan, tmr_idx;
+@@ -2558,7 +2554,6 @@ void ice_ptp_prepare_for_reset(struct ice_pf *pf)
+ ice_ptp_cfg_timestamp(pf, false);
+
+ kthread_cancel_delayed_work_sync(&ptp->work);
+- kthread_cancel_work_sync(&ptp->extts_work);
+
+ if (test_bit(ICE_PFR_REQ, pf->state))
+ return;
+@@ -2656,7 +2651,6 @@ static int ice_ptp_init_work(struct ice_pf *pf, struct ice_ptp *ptp)
+
+ /* Initialize work functions */
+ kthread_init_delayed_work(&ptp->work, ice_ptp_periodic_work);
+- kthread_init_work(&ptp->extts_work, ice_ptp_extts_work);
+
+ /* Allocate a kworker for handling work required for the ports
+ * connected to the PTP hardware clock.
+diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h
+index 9cda2f43e0e56..9f8902c1e743d 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ptp.h
++++ b/drivers/net/ethernet/intel/ice/ice_ptp.h
+@@ -169,7 +169,6 @@ struct ice_ptp_port {
+ * struct ice_ptp - data used for integrating with CONFIG_PTP_1588_CLOCK
+ * @port: data for the PHY port initialization procedure
+ * @work: delayed work function for periodic tasks
+- * @extts_work: work function for handling external Tx timestamps
+ * @cached_phc_time: a cached copy of the PHC time for timestamp extension
+ * @cached_phc_jiffies: jiffies when cached_phc_time was last updated
+ * @ext_ts_chan: the external timestamp channel in use
+@@ -190,7 +189,6 @@ struct ice_ptp_port {
+ struct ice_ptp {
+ struct ice_ptp_port port;
+ struct kthread_delayed_work work;
+- struct kthread_work extts_work;
+ u64 cached_phc_time;
+ unsigned long cached_phc_jiffies;
+ u8 ext_ts_chan;
+@@ -256,6 +254,7 @@ int ice_ptp_get_ts_config(struct ice_pf *pf, struct ifreq *ifr);
+ void ice_ptp_cfg_timestamp(struct ice_pf *pf, bool ena);
+ int ice_get_ptp_clock_index(struct ice_pf *pf);
+
++void ice_ptp_extts_event(struct ice_pf *pf);
+ s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb);
+ bool ice_ptp_process_ts(struct ice_pf *pf);
+
+@@ -284,6 +283,7 @@ static inline int ice_get_ptp_clock_index(struct ice_pf *pf)
+ return -1;
+ }
+
++static inline void ice_ptp_extts_event(struct ice_pf *pf) { }
+ static inline s8
+ ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb)
+ {
+diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
+index 34aebf00a5123..9dc9b982a7ea6 100644
+--- a/drivers/net/ethernet/intel/igc/igc.h
++++ b/drivers/net/ethernet/intel/igc/igc.h
+@@ -13,6 +13,7 @@
+ #include <linux/ptp_clock_kernel.h>
+ #include <linux/timecounter.h>
+ #include <linux/net_tstamp.h>
++#include <linux/bitfield.h>
+
+ #include "igc_hw.h"
+
+@@ -228,7 +229,10 @@ struct igc_adapter {
+
+ struct ptp_clock *ptp_clock;
+ struct ptp_clock_info ptp_caps;
+- struct work_struct ptp_tx_work;
++	/* Access to ptp_tx_skb and ptp_tx_start is protected by the
++	 * ptp_tx_lock.
++	 */
++ spinlock_t ptp_tx_lock;
+ struct sk_buff *ptp_tx_skb;
+ struct hwtstamp_config tstamp_config;
+ unsigned long ptp_tx_start;
+@@ -311,6 +315,33 @@ extern char igc_driver_name[];
+ #define IGC_MRQC_RSS_FIELD_IPV4_UDP 0x00400000
+ #define IGC_MRQC_RSS_FIELD_IPV6_UDP 0x00800000
+
++/* RX-desc Write-Back format RSS Types */
++enum igc_rss_type_num {
++ IGC_RSS_TYPE_NO_HASH = 0,
++ IGC_RSS_TYPE_HASH_TCP_IPV4 = 1,
++ IGC_RSS_TYPE_HASH_IPV4 = 2,
++ IGC_RSS_TYPE_HASH_TCP_IPV6 = 3,
++ IGC_RSS_TYPE_HASH_IPV6_EX = 4,
++ IGC_RSS_TYPE_HASH_IPV6 = 5,
++ IGC_RSS_TYPE_HASH_TCP_IPV6_EX = 6,
++ IGC_RSS_TYPE_HASH_UDP_IPV4 = 7,
++ IGC_RSS_TYPE_HASH_UDP_IPV6 = 8,
++ IGC_RSS_TYPE_HASH_UDP_IPV6_EX = 9,
++ IGC_RSS_TYPE_MAX = 10,
++};
++#define IGC_RSS_TYPE_MAX_TABLE 16
++#define IGC_RSS_TYPE_MASK GENMASK(3,0) /* 4-bits (3:0) = mask 0x0F */
++
++/* igc_rss_type - Rx descriptor RSS type field */
++static inline u32 igc_rss_type(const union igc_adv_rx_desc *rx_desc)
++{
++ /* RSS Type 4-bits (3:0) number: 0-9 (above 9 is reserved)
++ * Accessing the same bits via u16 (wb.lower.lo_dword.hs_rss.pkt_info)
++ * is slightly slower than via u32 (wb.lower.lo_dword.data)
++ */
++ return le32_get_bits(rx_desc->wb.lower.lo_dword.data, IGC_RSS_TYPE_MASK);
++}
++
+ /* Interrupt defines */
+ #define IGC_START_ITR 648 /* ~6000 ints/sec */
+ #define IGC_4K_ITR 980
+@@ -401,7 +432,6 @@ enum igc_state_t {
+ __IGC_TESTING,
+ __IGC_RESETTING,
+ __IGC_DOWN,
+- __IGC_PTP_TX_IN_PROGRESS,
+ };
+
+ enum igc_tx_flags {
+@@ -578,6 +608,7 @@ enum igc_ring_flags_t {
+ IGC_RING_FLAG_TX_CTX_IDX,
+ IGC_RING_FLAG_TX_DETECT_HANG,
+ IGC_RING_FLAG_AF_XDP_ZC,
++ IGC_RING_FLAG_TX_HWTSTAMP,
+ };
+
+ #define ring_uses_large_buffer(ring) \
+@@ -634,6 +665,7 @@ int igc_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr);
+ int igc_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr);
+ void igc_ptp_tx_hang(struct igc_adapter *adapter);
+ void igc_ptp_read(struct igc_adapter *adapter, struct timespec64 *ts);
++void igc_ptp_tx_tstamp_event(struct igc_adapter *adapter);
+
+ #define igc_rx_pg_size(_ring) (PAGE_SIZE << igc_rx_pg_order(_ring))
+
+diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
+index fa764190f2705..5f2e8bcd75973 100644
+--- a/drivers/net/ethernet/intel/igc/igc_main.c
++++ b/drivers/net/ethernet/intel/igc/igc_main.c
+@@ -1585,14 +1585,16 @@ done:
+ }
+ }
+
+- if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
++ if (unlikely(test_bit(IGC_RING_FLAG_TX_HWTSTAMP, &tx_ring->flags) &&
++ skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
+ /* FIXME: add support for retrieving timestamps from
+ * the other timer registers before skipping the
+ * timestamping request.
+ */
+- if (adapter->tstamp_config.tx_type == HWTSTAMP_TX_ON &&
+- !test_and_set_bit_lock(__IGC_PTP_TX_IN_PROGRESS,
+- &adapter->state)) {
++ unsigned long flags;
++
++ spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
++ if (!adapter->ptp_tx_skb) {
+ skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
+ tx_flags |= IGC_TX_FLAGS_TSTAMP;
+
+@@ -1601,6 +1603,8 @@ done:
+ } else {
+ adapter->tx_hwtstamp_skipped++;
+ }
++
++ spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
+ }
+
+ if (skb_vlan_tag_present(skb)) {
+@@ -1697,14 +1701,36 @@ static void igc_rx_checksum(struct igc_ring *ring,
+ le32_to_cpu(rx_desc->wb.upper.status_error));
+ }
+
++/* Mapping HW RSS Type to enum pkt_hash_types */
++static const enum pkt_hash_types igc_rss_type_table[IGC_RSS_TYPE_MAX_TABLE] = {
++ [IGC_RSS_TYPE_NO_HASH] = PKT_HASH_TYPE_L2,
++ [IGC_RSS_TYPE_HASH_TCP_IPV4] = PKT_HASH_TYPE_L4,
++ [IGC_RSS_TYPE_HASH_IPV4] = PKT_HASH_TYPE_L3,
++ [IGC_RSS_TYPE_HASH_TCP_IPV6] = PKT_HASH_TYPE_L4,
++ [IGC_RSS_TYPE_HASH_IPV6_EX] = PKT_HASH_TYPE_L3,
++ [IGC_RSS_TYPE_HASH_IPV6] = PKT_HASH_TYPE_L3,
++ [IGC_RSS_TYPE_HASH_TCP_IPV6_EX] = PKT_HASH_TYPE_L4,
++ [IGC_RSS_TYPE_HASH_UDP_IPV4] = PKT_HASH_TYPE_L4,
++ [IGC_RSS_TYPE_HASH_UDP_IPV6] = PKT_HASH_TYPE_L4,
++ [IGC_RSS_TYPE_HASH_UDP_IPV6_EX] = PKT_HASH_TYPE_L4,
++ [10] = PKT_HASH_TYPE_NONE, /* RSS Type above 9 "Reserved" by HW */
++ [11] = PKT_HASH_TYPE_NONE, /* keep array sized for SW bit-mask */
++	[12] = PKT_HASH_TYPE_NONE, /* to handle future HW revisions */
++ [13] = PKT_HASH_TYPE_NONE,
++ [14] = PKT_HASH_TYPE_NONE,
++ [15] = PKT_HASH_TYPE_NONE,
++};
++
+ static inline void igc_rx_hash(struct igc_ring *ring,
+ union igc_adv_rx_desc *rx_desc,
+ struct sk_buff *skb)
+ {
+- if (ring->netdev->features & NETIF_F_RXHASH)
+- skb_set_hash(skb,
+- le32_to_cpu(rx_desc->wb.lower.hi_dword.rss),
+- PKT_HASH_TYPE_L3);
++ if (ring->netdev->features & NETIF_F_RXHASH) {
++ u32 rss_hash = le32_to_cpu(rx_desc->wb.lower.hi_dword.rss);
++ u32 rss_type = igc_rss_type(rx_desc);
++
++ skb_set_hash(skb, rss_hash, igc_rss_type_table[rss_type]);
++ }
+ }
+
+ static void igc_rx_vlan(struct igc_ring *rx_ring,
+@@ -5219,7 +5245,7 @@ static void igc_tsync_interrupt(struct igc_adapter *adapter)
+
+ if (tsicr & IGC_TSICR_TXTS) {
+ /* retrieve hardware timestamp */
+- schedule_work(&adapter->ptp_tx_work);
++ igc_ptp_tx_tstamp_event(adapter);
+ ack |= IGC_TSICR_TXTS;
+ }
+
+@@ -6561,6 +6587,7 @@ static int igc_probe(struct pci_dev *pdev,
+ netdev->features |= NETIF_F_TSO;
+ netdev->features |= NETIF_F_TSO6;
+ netdev->features |= NETIF_F_TSO_ECN;
++ netdev->features |= NETIF_F_RXHASH;
+ netdev->features |= NETIF_F_RXCSUM;
+ netdev->features |= NETIF_F_HW_CSUM;
+ netdev->features |= NETIF_F_SCTP_CRC;
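The igc hunks above read the 4-bit RSS type out of the Rx descriptor and map it through a 16-entry table to a packet hash type, instead of always reporting PKT_HASH_TYPE_L3. A stand-alone sketch of that extract-and-look-up step, with a made-up descriptor word and a simplified hash-type enum standing in for enum pkt_hash_types:

#include <stdint.h>
#include <stdio.h>

#define RSS_TYPE_MASK 0xFu			/* descriptor bits 3:0 */

enum hash_type { HASH_NONE, HASH_L2, HASH_L3, HASH_L4 };

/* indexes 0-9 follow the hardware RSS type numbering; 10-15 are reserved */
static const enum hash_type rss_type_table[16] = {
	[0] = HASH_L2,				/* no hash */
	[1] = HASH_L4, [2] = HASH_L3,		/* TCP/IPv4, IPv4 */
	[3] = HASH_L4, [4] = HASH_L3,		/* TCP/IPv6, IPv6-EX */
	[5] = HASH_L3, [6] = HASH_L4,		/* IPv6, TCP/IPv6-EX */
	[7] = HASH_L4, [8] = HASH_L4,		/* UDP/IPv4, UDP/IPv6 */
	[9] = HASH_L4,				/* UDP/IPv6-EX */
	/* 10..15 stay HASH_NONE via zero-initialization */
};

int main(void)
{
	uint32_t lo_dword = 0x00000047;		/* pretend descriptor word */
	unsigned int rss_type = lo_dword & RSS_TYPE_MASK;	/* 7 = UDP/IPv4 */

	printf("rss_type=%u -> hash_type=%d\n",
	       rss_type, rss_type_table[rss_type]);
	return 0;
}

Entries 10 through 15 only exist so that a masked 4-bit value can never index past the table; they all fall back to "no hash type".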
+diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
+index 4e10ced736dbb..32ef112f8291a 100644
+--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
++++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
+@@ -536,9 +536,34 @@ static void igc_ptp_enable_rx_timestamp(struct igc_adapter *adapter)
+ wr32(IGC_TSYNCRXCTL, val);
+ }
+
++static void igc_ptp_clear_tx_tstamp(struct igc_adapter *adapter)
++{
++ unsigned long flags;
++
++ spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
++
++ dev_kfree_skb_any(adapter->ptp_tx_skb);
++ adapter->ptp_tx_skb = NULL;
++
++ spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
++}
++
+ static void igc_ptp_disable_tx_timestamp(struct igc_adapter *adapter)
+ {
+ struct igc_hw *hw = &adapter->hw;
++ int i;
++
++	/* Clear the flags first to avoid new packets being enqueued
++	 * for TX timestamping.
++	 */
++ for (i = 0; i < adapter->num_tx_queues; i++) {
++ struct igc_ring *tx_ring = adapter->tx_ring[i];
++
++ clear_bit(IGC_RING_FLAG_TX_HWTSTAMP, &tx_ring->flags);
++ }
++
++ /* Now we can clean the pending TX timestamp requests. */
++ igc_ptp_clear_tx_tstamp(adapter);
+
+ wr32(IGC_TSYNCTXCTL, 0);
+ }
+@@ -546,12 +571,23 @@ static void igc_ptp_disable_tx_timestamp(struct igc_adapter *adapter)
+ static void igc_ptp_enable_tx_timestamp(struct igc_adapter *adapter)
+ {
+ struct igc_hw *hw = &adapter->hw;
++ int i;
+
+ wr32(IGC_TSYNCTXCTL, IGC_TSYNCTXCTL_ENABLED | IGC_TSYNCTXCTL_TXSYNSIG);
+
+ /* Read TXSTMP registers to discard any timestamp previously stored. */
+ rd32(IGC_TXSTMPL);
+ rd32(IGC_TXSTMPH);
++
++	/* The hardware is ready to accept TX timestamp requests;
++	 * notify the transmit path.
++	 */
++ for (i = 0; i < adapter->num_tx_queues; i++) {
++ struct igc_ring *tx_ring = adapter->tx_ring[i];
++
++ set_bit(IGC_RING_FLAG_TX_HWTSTAMP, &tx_ring->flags);
++ }
++
+ }
+
+ /**
+@@ -603,6 +639,7 @@ static int igc_ptp_set_timestamp_mode(struct igc_adapter *adapter,
+ return 0;
+ }
+
++/* Requires adapter->ptp_tx_lock held by caller. */
+ static void igc_ptp_tx_timeout(struct igc_adapter *adapter)
+ {
+ struct igc_hw *hw = &adapter->hw;
+@@ -610,7 +647,6 @@ static void igc_ptp_tx_timeout(struct igc_adapter *adapter)
+ dev_kfree_skb_any(adapter->ptp_tx_skb);
+ adapter->ptp_tx_skb = NULL;
+ adapter->tx_hwtstamp_timeouts++;
+- clear_bit_unlock(__IGC_PTP_TX_IN_PROGRESS, &adapter->state);
+ /* Clear the tx valid bit in TSYNCTXCTL register to enable interrupt. */
+ rd32(IGC_TXSTMPH);
+ netdev_warn(adapter->netdev, "Tx timestamp timeout\n");
+@@ -618,20 +654,20 @@ static void igc_ptp_tx_timeout(struct igc_adapter *adapter)
+
+ void igc_ptp_tx_hang(struct igc_adapter *adapter)
+ {
+- bool timeout = time_is_before_jiffies(adapter->ptp_tx_start +
+- IGC_PTP_TX_TIMEOUT);
++ unsigned long flags;
+
+- if (!test_bit(__IGC_PTP_TX_IN_PROGRESS, &adapter->state))
+- return;
++ spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
+
+- /* If we haven't received a timestamp within the timeout, it is
+- * reasonable to assume that it will never occur, so we can unlock the
+- * timestamp bit when this occurs.
+- */
+- if (timeout) {
+- cancel_work_sync(&adapter->ptp_tx_work);
+- igc_ptp_tx_timeout(adapter);
+- }
++ if (!adapter->ptp_tx_skb)
++ goto unlock;
++
++ if (time_is_after_jiffies(adapter->ptp_tx_start + IGC_PTP_TX_TIMEOUT))
++ goto unlock;
++
++ igc_ptp_tx_timeout(adapter);
++
++unlock:
++ spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
+ }
+
+ /**
+@@ -641,20 +677,57 @@ void igc_ptp_tx_hang(struct igc_adapter *adapter)
+ * If we were asked to do hardware stamping and such a time stamp is
+ * available, then it must have been for this skb here because we only
+ * allow only one such packet into the queue.
++ *
++ * Context: Expects adapter->ptp_tx_lock to be held by caller.
+ */
+ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
+ {
+ struct sk_buff *skb = adapter->ptp_tx_skb;
+ struct skb_shared_hwtstamps shhwtstamps;
+ struct igc_hw *hw = &adapter->hw;
++ u32 tsynctxctl;
+ int adjust = 0;
+ u64 regval;
+
+ if (WARN_ON_ONCE(!skb))
+ return;
+
+- regval = rd32(IGC_TXSTMPL);
+- regval |= (u64)rd32(IGC_TXSTMPH) << 32;
++ tsynctxctl = rd32(IGC_TSYNCTXCTL);
++ tsynctxctl &= IGC_TSYNCTXCTL_TXTT_0;
++ if (tsynctxctl) {
++ regval = rd32(IGC_TXSTMPL);
++ regval |= (u64)rd32(IGC_TXSTMPH) << 32;
++ } else {
++ /* There's a bug in the hardware that could cause
++ * missing interrupts for TX timestamping. The issue
++ * is that for new interrupts to be triggered, the
++ * IGC_TXSTMPH_0 register must be read.
++ *
++ * To avoid discarding a valid timestamp that just
++ * happened at the "wrong" time, we need to confirm
++		 * that there was no timestamp captured; we do that by
++ * assuming that no two timestamps in sequence have
++ * the same nanosecond value.
++ *
++ * So, we read the "low" register, read the "high"
++ * register (to latch a new timestamp) and read the
++		 * "low" register again. If the "old" and "new" values
++		 * of the "low" register differ, a valid timestamp
++		 * was captured and we can read the "high" register
++		 * again.
++ */
++ u32 txstmpl_old, txstmpl_new;
++
++ txstmpl_old = rd32(IGC_TXSTMPL);
++ rd32(IGC_TXSTMPH);
++ txstmpl_new = rd32(IGC_TXSTMPL);
++
++ if (txstmpl_old == txstmpl_new)
++ return;
++
++ regval = txstmpl_new;
++ regval |= (u64)rd32(IGC_TXSTMPH) << 32;
++ }
+ if (igc_ptp_systim_to_hwtstamp(adapter, &shhwtstamps, regval))
+ return;
+
+@@ -676,13 +749,7 @@ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
+ shhwtstamps.hwtstamp =
+ ktime_add_ns(shhwtstamps.hwtstamp, adjust);
+
+- /* Clear the lock early before calling skb_tstamp_tx so that
+- * applications are not woken up before the lock bit is clear. We use
+- * a copy of the skb pointer to ensure other threads can't change it
+- * while we're notifying the stack.
+- */
+ adapter->ptp_tx_skb = NULL;
+- clear_bit_unlock(__IGC_PTP_TX_IN_PROGRESS, &adapter->state);
+
+ /* Notify the stack and free the skb after we've unlocked */
+ skb_tstamp_tx(skb, &shhwtstamps);
+@@ -690,27 +757,25 @@ static void igc_ptp_tx_hwtstamp(struct igc_adapter *adapter)
+ }
+
+ /**
+- * igc_ptp_tx_work
+- * @work: pointer to work struct
++ * igc_ptp_tx_tstamp_event
++ * @adapter: board private structure
+ *
+- * This work function polls the TSYNCTXCTL valid bit to determine when a
+- * timestamp has been taken for the current stored skb.
++ * Called when a TX timestamp interrupt happens to retrieve the
++ * timestamp and send it up to the socket.
+ */
+-static void igc_ptp_tx_work(struct work_struct *work)
++void igc_ptp_tx_tstamp_event(struct igc_adapter *adapter)
+ {
+- struct igc_adapter *adapter = container_of(work, struct igc_adapter,
+- ptp_tx_work);
+- struct igc_hw *hw = &adapter->hw;
+- u32 tsynctxctl;
++ unsigned long flags;
+
+- if (!test_bit(__IGC_PTP_TX_IN_PROGRESS, &adapter->state))
+- return;
++ spin_lock_irqsave(&adapter->ptp_tx_lock, flags);
+
+- tsynctxctl = rd32(IGC_TSYNCTXCTL);
+- if (WARN_ON_ONCE(!(tsynctxctl & IGC_TSYNCTXCTL_TXTT_0)))
+- return;
++ if (!adapter->ptp_tx_skb)
++ goto unlock;
+
+ igc_ptp_tx_hwtstamp(adapter);
++
++unlock:
++ spin_unlock_irqrestore(&adapter->ptp_tx_lock, flags);
+ }
+
+ /**
+@@ -959,8 +1024,8 @@ void igc_ptp_init(struct igc_adapter *adapter)
+ return;
+ }
+
++ spin_lock_init(&adapter->ptp_tx_lock);
+ spin_lock_init(&adapter->tmreg_lock);
+- INIT_WORK(&adapter->ptp_tx_work, igc_ptp_tx_work);
+
+ adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
+ adapter->tstamp_config.tx_type = HWTSTAMP_TX_OFF;
+@@ -1020,10 +1085,7 @@ void igc_ptp_suspend(struct igc_adapter *adapter)
+ if (!(adapter->ptp_flags & IGC_PTP_ENABLED))
+ return;
+
+- cancel_work_sync(&adapter->ptp_tx_work);
+- dev_kfree_skb_any(adapter->ptp_tx_skb);
+- adapter->ptp_tx_skb = NULL;
+- clear_bit_unlock(__IGC_PTP_TX_IN_PROGRESS, &adapter->state);
++ igc_ptp_clear_tx_tstamp(adapter);
+
+ if (pci_device_is_present(adapter->pdev)) {
+ igc_ptp_time_save(adapter);
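The igc_ptp_tx_hwtstamp() hunk above works around missed Tx timestamp interrupts: when the valid bit is clear it reads the low register, reads the high register to latch any new capture, reads the low register again, and treats the timestamp as real only if the two low reads differ. A stand-alone sketch of that latch-and-compare sequence; the register reads below are stubs invented for the example (in the driver they are rd32(IGC_TXSTMPL) and rd32(IGC_TXSTMPH)):

#include <stdint.h>
#include <stdio.h>

/* Stubbed register reads: pretend the hardware captured a new timestamp
 * between the first and second "low" reads.
 */
static uint32_t txstmpl_values[] = { 0x1000, 0x2000 };
static unsigned int txstmpl_idx;

static uint32_t read_txstmpl(void)
{
	uint32_t val = txstmpl_values[txstmpl_idx];

	if (txstmpl_idx < 1)
		txstmpl_idx++;
	return val;
}

static uint32_t read_txstmph(void)
{
	return 0x1;				/* upper 32 bits, stubbed */
}

/* Returns 1 and fills *ts if a timestamp was latched, 0 otherwise. */
static int read_tx_timestamp(uint64_t *ts)
{
	uint32_t lo_old, lo_new;

	lo_old = read_txstmpl();
	(void)read_txstmph();			/* reading "high" latches a new value */
	lo_new = read_txstmpl();

	if (lo_old == lo_new)			/* nothing new was captured */
		return 0;

	*ts = (uint64_t)read_txstmph() << 32 | lo_new;
	return 1;
}

int main(void)
{
	uint64_t ts;

	if (read_tx_timestamp(&ts))
		printf("timestamp: 0x%llx\n", (unsigned long long)ts);
	return 0;
}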
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cgx.c b/drivers/net/ethernet/marvell/octeontx2/af/cgx.c
+index bd77152bb8d7c..592037f4e55b6 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/cgx.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/cgx.c
+@@ -169,6 +169,9 @@ void cgx_lmac_write(int cgx_id, int lmac_id, u64 offset, u64 val)
+ {
+ struct cgx *cgx_dev = cgx_get_pdata(cgx_id);
+
++ /* Software must not access disabled LMAC registers */
++ if (!is_lmac_valid(cgx_dev, lmac_id))
++ return;
+ cgx_write(cgx_dev, lmac_id, offset, val);
+ }
+
+@@ -176,6 +179,10 @@ u64 cgx_lmac_read(int cgx_id, int lmac_id, u64 offset)
+ {
+ struct cgx *cgx_dev = cgx_get_pdata(cgx_id);
+
++ /* Software must not access disabled LMAC registers */
++ if (!is_lmac_valid(cgx_dev, lmac_id))
++ return 0;
++
+ return cgx_read(cgx_dev, lmac_id, offset);
+ }
+
+@@ -530,14 +537,15 @@ static u32 cgx_get_lmac_fifo_len(void *cgxd, int lmac_id)
+ int cgx_lmac_internal_loopback(void *cgxd, int lmac_id, bool enable)
+ {
+ struct cgx *cgx = cgxd;
+- u8 lmac_type;
++ struct lmac *lmac;
+ u64 cfg;
+
+ if (!is_lmac_valid(cgx, lmac_id))
+ return -ENODEV;
+
+- lmac_type = cgx->mac_ops->get_lmac_type(cgx, lmac_id);
+- if (lmac_type == LMAC_MODE_SGMII || lmac_type == LMAC_MODE_QSGMII) {
++ lmac = lmac_pdata(lmac_id, cgx);
++ if (lmac->lmac_type == LMAC_MODE_SGMII ||
++ lmac->lmac_type == LMAC_MODE_QSGMII) {
+ cfg = cgx_read(cgx, lmac_id, CGXX_GMP_PCS_MRX_CTL);
+ if (enable)
+ cfg |= CGXX_GMP_PCS_MRX_CTL_LBK;
+@@ -1556,6 +1564,23 @@ int cgx_lmac_linkup_start(void *cgxd)
+ return 0;
+ }
+
++int cgx_lmac_reset(void *cgxd, int lmac_id, u8 pf_req_flr)
++{
++ struct cgx *cgx = cgxd;
++ u64 cfg;
++
++ if (!is_lmac_valid(cgx, lmac_id))
++ return -ENODEV;
++
++ /* Resetting PFC related CSRs */
++ cfg = 0xff;
++ cgx_write(cgxd, lmac_id, CGXX_CMRX_RX_LOGL_XON, cfg);
++
++ if (pf_req_flr)
++ cgx_lmac_internal_loopback(cgxd, lmac_id, false);
++ return 0;
++}
++
+ static int cgx_configure_interrupt(struct cgx *cgx, struct lmac *lmac,
+ int cnt, bool req_free)
+ {
+@@ -1675,6 +1700,7 @@ static int cgx_lmac_init(struct cgx *cgx)
+ cgx->lmac_idmap[lmac->lmac_id] = lmac;
+ set_bit(lmac->lmac_id, &cgx->lmac_bmap);
+ cgx->mac_ops->mac_pause_frm_config(cgx, lmac->lmac_id, true);
++ lmac->lmac_type = cgx->mac_ops->get_lmac_type(cgx, lmac->lmac_id);
+ }
+
+ return cgx_lmac_verify_fwi_version(cgx);
+@@ -1771,6 +1797,7 @@ static struct mac_ops cgx_mac_ops = {
+ .mac_tx_enable = cgx_lmac_tx_enable,
+ .pfc_config = cgx_lmac_pfc_config,
+ .mac_get_pfc_frm_cfg = cgx_lmac_get_pfc_frm_cfg,
++ .mac_reset = cgx_lmac_reset,
+ };
+
+ static int cgx_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cgx.h b/drivers/net/ethernet/marvell/octeontx2/af/cgx.h
+index 5a20d93004c71..5741141796880 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/cgx.h
++++ b/drivers/net/ethernet/marvell/octeontx2/af/cgx.h
+@@ -35,6 +35,7 @@
+ #define CGXX_CMRX_INT_ENA_W1S 0x058
+ #define CGXX_CMRX_RX_ID_MAP 0x060
+ #define CGXX_CMRX_RX_STAT0 0x070
++#define CGXX_CMRX_RX_LOGL_XON 0x100
+ #define CGXX_CMRX_RX_LMACS 0x128
+ #define CGXX_CMRX_RX_DMAC_CTL0 (0x1F8 + mac_ops->csr_offset)
+ #define CGX_DMAC_CTL0_CAM_ENABLE BIT_ULL(3)
+@@ -181,4 +182,5 @@ int cgx_lmac_get_pfc_frm_cfg(void *cgxd, int lmac_id, u8 *tx_pause,
+ u8 *rx_pause);
+ int verify_lmac_fc_cfg(void *cgxd, int lmac_id, u8 tx_pause, u8 rx_pause,
+ int pfvf_idx);
++int cgx_lmac_reset(void *cgxd, int lmac_id, u8 pf_req_flr);
+ #endif /* CGX_H */
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/lmac_common.h b/drivers/net/ethernet/marvell/octeontx2/af/lmac_common.h
+index 39aaf0e4467dc..0b4cba03f2e83 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/lmac_common.h
++++ b/drivers/net/ethernet/marvell/octeontx2/af/lmac_common.h
+@@ -24,6 +24,7 @@
+ * @cgx: parent cgx port
+ * @mcast_filters_count: Number of multicast filters installed
+ * @lmac_id: lmac port id
++ * @lmac_type: lmac type like SGMII/XAUI
+ * @cmd_pend: flag set before new command is started
+ * flag cleared after command response is received
+ * @name: lmac port name
+@@ -43,6 +44,7 @@ struct lmac {
+ struct cgx *cgx;
+ u8 mcast_filters_count;
+ u8 lmac_id;
++ u8 lmac_type;
+ bool cmd_pend;
+ char *name;
+ };
+@@ -125,6 +127,7 @@ struct mac_ops {
+
+ int (*mac_get_pfc_frm_cfg)(void *cgxd, int lmac_id,
+ u8 *tx_pause, u8 *rx_pause);
++ int (*mac_reset)(void *cgxd, int lmac_id, u8 pf_req_flr);
+
+ /* FEC stats */
+ int (*get_fec_stats)(void *cgxd, int lmac_id,
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rpm.c b/drivers/net/ethernet/marvell/octeontx2/af/rpm.c
+index de0d88dd10d65..b4fcb20c3f4fd 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rpm.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rpm.c
+@@ -37,6 +37,7 @@ static struct mac_ops rpm_mac_ops = {
+ .mac_tx_enable = rpm_lmac_tx_enable,
+ .pfc_config = rpm_lmac_pfc_config,
+ .mac_get_pfc_frm_cfg = rpm_lmac_get_pfc_frm_cfg,
++ .mac_reset = rpm_lmac_reset,
+ };
+
+ static struct mac_ops rpm2_mac_ops = {
+@@ -47,7 +48,7 @@ static struct mac_ops rpm2_mac_ops = {
+ .int_set_reg = RPM2_CMRX_SW_INT_ENA_W1S,
+ .irq_offset = 1,
+ .int_ena_bit = BIT_ULL(0),
+- .lmac_fwi = RPM_LMAC_FWI,
++ .lmac_fwi = RPM2_LMAC_FWI,
+ .non_contiguous_serdes_lane = true,
+ .rx_stats_cnt = 43,
+ .tx_stats_cnt = 34,
+@@ -68,6 +69,7 @@ static struct mac_ops rpm2_mac_ops = {
+ .mac_tx_enable = rpm_lmac_tx_enable,
+ .pfc_config = rpm_lmac_pfc_config,
+ .mac_get_pfc_frm_cfg = rpm_lmac_get_pfc_frm_cfg,
++ .mac_reset = rpm_lmac_reset,
+ };
+
+ bool is_dev_rpm2(void *rpmd)
+@@ -537,14 +539,15 @@ u32 rpm2_get_lmac_fifo_len(void *rpmd, int lmac_id)
+ int rpm_lmac_internal_loopback(void *rpmd, int lmac_id, bool enable)
+ {
+ rpm_t *rpm = rpmd;
+- u8 lmac_type;
++ struct lmac *lmac;
+ u64 cfg;
+
+ if (!is_lmac_valid(rpm, lmac_id))
+ return -ENODEV;
+- lmac_type = rpm->mac_ops->get_lmac_type(rpm, lmac_id);
+
+- if (lmac_type == LMAC_MODE_QSGMII || lmac_type == LMAC_MODE_SGMII) {
++ lmac = lmac_pdata(lmac_id, rpm);
++ if (lmac->lmac_type == LMAC_MODE_QSGMII ||
++ lmac->lmac_type == LMAC_MODE_SGMII) {
+ dev_err(&rpm->pdev->dev, "loopback not supported for LPC mode\n");
+ return 0;
+ }
+@@ -713,3 +716,24 @@ int rpm_get_fec_stats(void *rpmd, int lmac_id, struct cgx_fec_stats_rsp *rsp)
+
+ return 0;
+ }
++
++int rpm_lmac_reset(void *rpmd, int lmac_id, u8 pf_req_flr)
++{
++ u64 rx_logl_xon, cfg;
++ rpm_t *rpm = rpmd;
++
++ if (!is_lmac_valid(rpm, lmac_id))
++ return -ENODEV;
++
++ /* Resetting PFC related CSRs */
++ rx_logl_xon = is_dev_rpm2(rpm) ? RPM2_CMRX_RX_LOGL_XON :
++ RPMX_CMRX_RX_LOGL_XON;
++ cfg = 0xff;
++
++ rpm_write(rpm, lmac_id, rx_logl_xon, cfg);
++
++ if (pf_req_flr)
++ rpm_lmac_internal_loopback(rpm, lmac_id, false);
++
++ return 0;
++}
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rpm.h b/drivers/net/ethernet/marvell/octeontx2/af/rpm.h
+index 22147b4c21370..b79cfbc6f8770 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rpm.h
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rpm.h
+@@ -74,6 +74,7 @@
+ #define RPMX_MTI_MAC100X_CL01_PAUSE_QUANTA 0x80A8
+ #define RPMX_MTI_MAC100X_CL89_PAUSE_QUANTA 0x8108
+ #define RPM_DEFAULT_PAUSE_TIME 0x7FF
++#define RPMX_CMRX_RX_LOGL_XON 0x4100
+
+ #define RPMX_MTI_MAC100X_XIF_MODE 0x8100
+ #define RPMX_ONESTEP_ENABLE BIT_ULL(5)
+@@ -94,7 +95,8 @@
+
+ /* CN10KB CSR Declaration */
+ #define RPM2_CMRX_SW_INT 0x1b0
+-#define RPM2_CMRX_SW_INT_ENA_W1S 0x1b8
++#define RPM2_CMRX_SW_INT_ENA_W1S 0x1c8
++#define RPM2_LMAC_FWI 0x12
+ #define RPM2_CMR_CHAN_MSK_OR 0x3120
+ #define RPM2_CMR_RX_OVR_BP_EN BIT_ULL(2)
+ #define RPM2_CMR_RX_OVR_BP_BP BIT_ULL(1)
+@@ -131,4 +133,5 @@ int rpm_lmac_get_pfc_frm_cfg(void *rpmd, int lmac_id, u8 *tx_pause,
+ int rpm2_get_nr_lmacs(void *rpmd);
+ bool is_dev_rpm2(void *rpmd);
+ int rpm_get_fec_stats(void *cgxd, int lmac_id, struct cgx_fec_stats_rsp *rsp);
++int rpm_lmac_reset(void *rpmd, int lmac_id, u8 pf_req_flr);
+ #endif /* RPM_H */
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+index 9f673bda9dbdd..b26b013216933 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+@@ -2629,6 +2629,7 @@ static void __rvu_flr_handler(struct rvu *rvu, u16 pcifunc)
+ * Since LF is detached use LF number as -1.
+ */
+ rvu_npc_free_mcam_entries(rvu, pcifunc, -1);
++ rvu_mac_reset(rvu, pcifunc);
+
+ mutex_unlock(&rvu->flr_lock);
+ }
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+index d655bf04a483d..be279cd1fd729 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.h
+@@ -23,6 +23,7 @@
+ #define PCI_DEVID_OCTEONTX2_LBK 0xA061
+
+ /* Subsystem Device ID */
++#define PCI_SUBSYS_DEVID_98XX 0xB100
+ #define PCI_SUBSYS_DEVID_96XX 0xB200
+ #define PCI_SUBSYS_DEVID_CN10K_A 0xB900
+ #define PCI_SUBSYS_DEVID_CNF10K_B 0xBC00
+@@ -669,6 +670,16 @@ static inline u16 rvu_nix_chan_cpt(struct rvu *rvu, u8 chan)
+ return rvu->hw->cpt_chan_base + chan;
+ }
+
++static inline bool is_rvu_supports_nix1(struct rvu *rvu)
++{
++ struct pci_dev *pdev = rvu->pdev;
++
++ if (pdev->subsystem_device == PCI_SUBSYS_DEVID_98XX)
++ return true;
++
++ return false;
++}
++
+ /* Function Prototypes
+ * RVU
+ */
+@@ -864,6 +875,7 @@ int rvu_cgx_config_tx(void *cgxd, int lmac_id, bool enable);
+ int rvu_cgx_prio_flow_ctrl_cfg(struct rvu *rvu, u16 pcifunc, u8 tx_pause, u8 rx_pause,
+ u16 pfc_en);
+ int rvu_cgx_cfg_pause_frm(struct rvu *rvu, u16 pcifunc, u8 tx_pause, u8 rx_pause);
++void rvu_mac_reset(struct rvu *rvu, u16 pcifunc);
+ u32 rvu_cgx_get_lmac_fifolen(struct rvu *rvu, int cgx, int lmac);
+ int npc_get_nixlf_mcam_index(struct npc_mcam *mcam, u16 pcifunc, int nixlf,
+ int type);
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
+index 83b342fa8d753..095b2cc4a6999 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
+@@ -114,7 +114,7 @@ static void rvu_map_cgx_nix_block(struct rvu *rvu, int pf,
+ p2x = cgx_lmac_get_p2x(cgx_id, lmac_id);
+ /* Firmware sets P2X_SELECT as either NIX0 or NIX1 */
+ pfvf->nix_blkaddr = BLKADDR_NIX0;
+- if (p2x == CMR_P2X_SEL_NIX1)
++ if (is_rvu_supports_nix1(rvu) && p2x == CMR_P2X_SEL_NIX1)
+ pfvf->nix_blkaddr = BLKADDR_NIX1;
+ }
+
+@@ -763,7 +763,7 @@ static int rvu_cgx_ptp_rx_cfg(struct rvu *rvu, u16 pcifunc, bool enable)
+ cgxd = rvu_cgx_pdata(cgx_id, rvu);
+
+ mac_ops = get_mac_ops(cgxd);
+- mac_ops->mac_enadis_ptp_config(cgxd, lmac_id, true);
++ mac_ops->mac_enadis_ptp_config(cgxd, lmac_id, enable);
+ /* If PTP is enabled then inform NPC that packets to be
+ * parsed by this PF will have their data shifted by 8 bytes
+ * and if PTP is disabled then no shift is required
+@@ -1250,3 +1250,21 @@ int rvu_mbox_handler_cgx_prio_flow_ctrl_cfg(struct rvu *rvu,
+ mac_ops->mac_get_pfc_frm_cfg(cgxd, lmac_id, &rsp->tx_pause, &rsp->rx_pause);
+ return err;
+ }
++
++void rvu_mac_reset(struct rvu *rvu, u16 pcifunc)
++{
++ int pf = rvu_get_pf(pcifunc);
++ struct mac_ops *mac_ops;
++ struct cgx *cgxd;
++ u8 cgx, lmac;
++
++ if (!is_pf_cgxmapped(rvu, pf))
++ return;
++
++ rvu_get_cgx_lmac_id(rvu->pf2cgxlmac_map[pf], &cgx, &lmac);
++ cgxd = rvu_cgx_pdata(cgx, rvu);
++ mac_ops = get_mac_ops(cgxd);
++
++ if (mac_ops->mac_reset(cgxd, lmac, !is_vf(pcifunc)))
++ dev_err(rvu->dev, "Failed to reset MAC\n");
++}
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/minimal.c b/drivers/net/ethernet/mellanox/mlxsw/minimal.c
+index 6b56eadd736e5..6b98c3287b497 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/minimal.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/minimal.c
+@@ -417,6 +417,7 @@ static int mlxsw_m_linecards_init(struct mlxsw_m *mlxsw_m)
+ err_kmalloc_array:
+ for (i--; i >= 0; i--)
+ kfree(mlxsw_m->line_cards[i]);
++ kfree(mlxsw_m->line_cards);
+ err_kcalloc:
+ kfree(mlxsw_m->ports);
+ return err;
+diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
+index 1f5f00b304418..2fa833d041baa 100644
+--- a/drivers/net/ethernet/mscc/ocelot.c
++++ b/drivers/net/ethernet/mscc/ocelot.c
+@@ -2925,7 +2925,6 @@ int ocelot_init(struct ocelot *ocelot)
+ }
+ }
+
+- mutex_init(&ocelot->ptp_lock);
+ mutex_init(&ocelot->mact_lock);
+ mutex_init(&ocelot->fwd_domain_lock);
+ mutex_init(&ocelot->tas_lock);
+diff --git a/drivers/net/ethernet/mscc/ocelot_ptp.c b/drivers/net/ethernet/mscc/ocelot_ptp.c
+index 2180ae94c7447..cb32234a5bf1b 100644
+--- a/drivers/net/ethernet/mscc/ocelot_ptp.c
++++ b/drivers/net/ethernet/mscc/ocelot_ptp.c
+@@ -439,8 +439,12 @@ static int ocelot_ipv6_ptp_trap_del(struct ocelot *ocelot, int port)
+ static int ocelot_setup_ptp_traps(struct ocelot *ocelot, int port,
+ bool l2, bool l4)
+ {
++ struct ocelot_port *ocelot_port = ocelot->ports[port];
+ int err;
+
++ ocelot_port->trap_proto &= ~(OCELOT_PROTO_PTP_L2 |
++ OCELOT_PROTO_PTP_L4);
++
+ if (l2)
+ err = ocelot_l2_ptp_trap_add(ocelot, port);
+ else
+@@ -464,6 +468,11 @@ static int ocelot_setup_ptp_traps(struct ocelot *ocelot, int port,
+ if (err)
+ return err;
+
++ if (l2)
++ ocelot_port->trap_proto |= OCELOT_PROTO_PTP_L2;
++ if (l4)
++ ocelot_port->trap_proto |= OCELOT_PROTO_PTP_L4;
++
+ return 0;
+
+ err_ipv6:
+@@ -474,10 +483,38 @@ err_ipv4:
+ return err;
+ }
+
++static int ocelot_traps_to_ptp_rx_filter(unsigned int proto)
++{
++ if ((proto & OCELOT_PROTO_PTP_L2) && (proto & OCELOT_PROTO_PTP_L4))
++ return HWTSTAMP_FILTER_PTP_V2_EVENT;
++ else if (proto & OCELOT_PROTO_PTP_L2)
++ return HWTSTAMP_FILTER_PTP_V2_L2_EVENT;
++ else if (proto & OCELOT_PROTO_PTP_L4)
++ return HWTSTAMP_FILTER_PTP_V2_L4_EVENT;
++
++ return HWTSTAMP_FILTER_NONE;
++}
++
+ int ocelot_hwstamp_get(struct ocelot *ocelot, int port, struct ifreq *ifr)
+ {
+- return copy_to_user(ifr->ifr_data, &ocelot->hwtstamp_config,
+- sizeof(ocelot->hwtstamp_config)) ? -EFAULT : 0;
++ struct ocelot_port *ocelot_port = ocelot->ports[port];
++ struct hwtstamp_config cfg = {};
++
++ switch (ocelot_port->ptp_cmd) {
++ case IFH_REW_OP_TWO_STEP_PTP:
++ cfg.tx_type = HWTSTAMP_TX_ON;
++ break;
++ case IFH_REW_OP_ORIGIN_PTP:
++ cfg.tx_type = HWTSTAMP_TX_ONESTEP_SYNC;
++ break;
++ default:
++ cfg.tx_type = HWTSTAMP_TX_OFF;
++ break;
++ }
++
++ cfg.rx_filter = ocelot_traps_to_ptp_rx_filter(ocelot_port->trap_proto);
++
++ return copy_to_user(ifr->ifr_data, &cfg, sizeof(cfg)) ? -EFAULT : 0;
+ }
+ EXPORT_SYMBOL(ocelot_hwstamp_get);
+
+@@ -509,8 +546,6 @@ int ocelot_hwstamp_set(struct ocelot *ocelot, int port, struct ifreq *ifr)
+ return -ERANGE;
+ }
+
+- mutex_lock(&ocelot->ptp_lock);
+-
+ switch (cfg.rx_filter) {
+ case HWTSTAMP_FILTER_NONE:
+ break;
+@@ -531,28 +566,14 @@ int ocelot_hwstamp_set(struct ocelot *ocelot, int port, struct ifreq *ifr)
+ l4 = true;
+ break;
+ default:
+- mutex_unlock(&ocelot->ptp_lock);
+ return -ERANGE;
+ }
+
+ err = ocelot_setup_ptp_traps(ocelot, port, l2, l4);
+- if (err) {
+- mutex_unlock(&ocelot->ptp_lock);
++ if (err)
+ return err;
+- }
+
+- if (l2 && l4)
+- cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
+- else if (l2)
+- cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT;
+- else if (l4)
+- cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_L4_EVENT;
+- else
+- cfg.rx_filter = HWTSTAMP_FILTER_NONE;
+-
+- /* Commit back the result & save it */
+- memcpy(&ocelot->hwtstamp_config, &cfg, sizeof(cfg));
+- mutex_unlock(&ocelot->ptp_lock);
++ cfg.rx_filter = ocelot_traps_to_ptp_rx_filter(ocelot_port->trap_proto);
+
+ return copy_to_user(ifr->ifr_data, &cfg, sizeof(cfg)) ? -EFAULT : 0;
+ }
+@@ -824,11 +845,6 @@ int ocelot_init_timestamp(struct ocelot *ocelot,
+
+ ocelot_write(ocelot, PTP_CFG_MISC_PTP_EN, PTP_CFG_MISC);
+
+- /* There is no device reconfiguration, PTP Rx stamping is always
+- * enabled.
+- */
+- ocelot->hwtstamp_config.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
+-
+ return 0;
+ }
+ EXPORT_SYMBOL(ocelot_init_timestamp);
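The ocelot hunks above stop caching a global hwtstamp_config and instead rebuild the reported rx_filter from whichever PTP traps are installed on the port. A stand-alone sketch of that bits-to-filter mapping, with simplified names standing in for OCELOT_PROTO_PTP_L2/L4 and the HWTSTAMP_FILTER_* values:

#include <stdio.h>

#define TRAP_PTP_L2 (1u << 0)
#define TRAP_PTP_L4 (1u << 1)

enum rx_filter {
	FILTER_NONE,
	FILTER_PTP_V2_EVENT,			/* both L2 and L4 trapped */
	FILTER_PTP_V2_L2_EVENT,
	FILTER_PTP_V2_L4_EVENT,
};

static enum rx_filter traps_to_rx_filter(unsigned int proto)
{
	if ((proto & TRAP_PTP_L2) && (proto & TRAP_PTP_L4))
		return FILTER_PTP_V2_EVENT;
	else if (proto & TRAP_PTP_L2)
		return FILTER_PTP_V2_L2_EVENT;
	else if (proto & TRAP_PTP_L4)
		return FILTER_PTP_V2_L4_EVENT;
	return FILTER_NONE;
}

int main(void)
{
	printf("%d %d %d %d\n",
	       traps_to_rx_filter(0),
	       traps_to_rx_filter(TRAP_PTP_L2),
	       traps_to_rx_filter(TRAP_PTP_L4),
	       traps_to_rx_filter(TRAP_PTP_L2 | TRAP_PTP_L4));
	return 0;
}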
+diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
+index b63e47af63655..8c019f382a7f3 100644
+--- a/drivers/net/ethernet/sfc/ef10.c
++++ b/drivers/net/ethernet/sfc/ef10.c
+@@ -1297,8 +1297,10 @@ static void efx_ef10_fini_nic(struct efx_nic *efx)
+ {
+ struct efx_ef10_nic_data *nic_data = efx->nic_data;
+
++ spin_lock_bh(&efx->stats_lock);
+ kfree(nic_data->mc_stats);
+ nic_data->mc_stats = NULL;
++ spin_unlock_bh(&efx->stats_lock);
+ }
+
+ static int efx_ef10_init_nic(struct efx_nic *efx)
+@@ -1852,9 +1854,14 @@ static size_t efx_ef10_update_stats_pf(struct efx_nic *efx, u64 *full_stats,
+
+ efx_ef10_get_stat_mask(efx, mask);
+
+- efx_nic_copy_stats(efx, nic_data->mc_stats);
+- efx_nic_update_stats(efx_ef10_stat_desc, EF10_STAT_COUNT,
+- mask, stats, nic_data->mc_stats, false);
++ /* If NIC was fini'd (probably resetting), then we can't read
++ * updated stats right now.
++ */
++ if (nic_data->mc_stats) {
++ efx_nic_copy_stats(efx, nic_data->mc_stats);
++ efx_nic_update_stats(efx_ef10_stat_desc, EF10_STAT_COUNT,
++ mask, stats, nic_data->mc_stats, false);
++ }
+
+ /* Update derived statistics */
+ efx_nic_fix_nodesc_drop_stat(efx,
+diff --git a/drivers/net/ethernet/sfc/efx_devlink.c b/drivers/net/ethernet/sfc/efx_devlink.c
+index ef9971cbb695d..0384b134e1242 100644
+--- a/drivers/net/ethernet/sfc/efx_devlink.c
++++ b/drivers/net/ethernet/sfc/efx_devlink.c
+@@ -622,6 +622,9 @@ static struct devlink_port *ef100_set_devlink_port(struct efx_nic *efx, u32 idx)
+ u32 id;
+ int rc;
+
++ if (!efx->mae)
++ return NULL;
++
+ if (efx_mae_lookup_mport(efx, idx, &id)) {
+ /* This should not happen. */
+ if (idx == MAE_MPORT_DESC_VF_IDX_NULL)
+diff --git a/drivers/net/ethernet/sfc/tc.c b/drivers/net/ethernet/sfc/tc.c
+index c004443c1d58c..d7827ab3761f9 100644
+--- a/drivers/net/ethernet/sfc/tc.c
++++ b/drivers/net/ethernet/sfc/tc.c
+@@ -132,23 +132,6 @@ static void efx_tc_free_action_set_list(struct efx_nic *efx,
+ /* Don't kfree, as acts is embedded inside a struct efx_tc_flow_rule */
+ }
+
+-static void efx_tc_flow_free(void *ptr, void *arg)
+-{
+- struct efx_tc_flow_rule *rule = ptr;
+- struct efx_nic *efx = arg;
+-
+- netif_err(efx, drv, efx->net_dev,
+- "tc rule %lx still present at teardown, removing\n",
+- rule->cookie);
+-
+- efx_mae_delete_rule(efx, rule->fw_id);
+-
+- /* Release entries in subsidiary tables */
+- efx_tc_free_action_set_list(efx, &rule->acts, true);
+-
+- kfree(rule);
+-}
+-
+ /* Boilerplate for the simple 'copy a field' cases */
+ #define _MAP_KEY_AND_MASK(_name, _type, _tcget, _tcfield, _field) \
+ if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_##_name)) { \
+@@ -1451,6 +1434,21 @@ static void efx_tc_encap_match_free(void *ptr, void *__unused)
+ kfree(encap);
+ }
+
++static void efx_tc_flow_free(void *ptr, void *arg)
++{
++ struct efx_tc_flow_rule *rule = ptr;
++ struct efx_nic *efx = arg;
++
++ netif_err(efx, drv, efx->net_dev,
++ "tc rule %lx still present at teardown, removing\n",
++ rule->cookie);
++
++ /* Also releases entries in subsidiary tables */
++ efx_tc_delete_rule(efx, rule);
++
++ kfree(rule);
++}
++
+ int efx_init_struct_tc(struct efx_nic *efx)
+ {
+ int rc;
+diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+index 87510951f4e80..b74946bbee3c3 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
++++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+@@ -7457,12 +7457,6 @@ void stmmac_dvr_remove(struct device *dev)
+ netif_carrier_off(ndev);
+ unregister_netdev(ndev);
+
+- /* Serdes power down needs to happen after VLAN filter
+- * is deleted that is triggered by unregister_netdev().
+- */
+- if (priv->plat->serdes_powerdown)
+- priv->plat->serdes_powerdown(ndev, priv->plat->bsp_priv);
+-
+ #ifdef CONFIG_DEBUG_FS
+ stmmac_exit_fs(ndev);
+ #endif
+diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+index 3e310b55bce2b..734822321e0ab 100644
+--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
++++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+@@ -2042,6 +2042,11 @@ static int axienet_probe(struct platform_device *pdev)
+ goto cleanup_clk;
+ }
+
++ /* Reset core now that clocks are enabled, prior to accessing MDIO */
++ ret = __axienet_device_reset(lp);
++ if (ret)
++ goto cleanup_clk;
++
+ /* Autodetect the need for 64-bit DMA pointers.
+ * When the IP is configured for a bus width bigger than 32 bits,
+ * writing the MSB registers is mandatory, even if they are all 0.
+@@ -2096,11 +2101,6 @@ static int axienet_probe(struct platform_device *pdev)
+ lp->coalesce_count_tx = XAXIDMA_DFT_TX_THRESHOLD;
+ lp->coalesce_usec_tx = XAXIDMA_DFT_TX_USEC;
+
+- /* Reset core now that clocks are enabled, prior to accessing MDIO */
+- ret = __axienet_device_reset(lp);
+- if (ret)
+- goto cleanup_clk;
+-
+ ret = axienet_mdio_setup(lp);
+ if (ret)
+ dev_warn(&pdev->dev,
+diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
+index 15c7dc82107f4..acb20ad4e37eb 100644
+--- a/drivers/net/gtp.c
++++ b/drivers/net/gtp.c
+@@ -631,7 +631,9 @@ static void __gtp_encap_destroy(struct sock *sk)
+ gtp->sk1u = NULL;
+ udp_sk(sk)->encap_type = 0;
+ rcu_assign_sk_user_data(sk, NULL);
++ release_sock(sk);
+ sock_put(sk);
++ return;
+ }
+ release_sock(sk);
+ }
+diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
+index ab5133eb1d517..e45817caaee8d 100644
+--- a/drivers/net/ipvlan/ipvlan_core.c
++++ b/drivers/net/ipvlan/ipvlan_core.c
+@@ -585,7 +585,8 @@ static int ipvlan_xmit_mode_l3(struct sk_buff *skb, struct net_device *dev)
+ consume_skb(skb);
+ return NET_XMIT_DROP;
+ }
+- return ipvlan_rcv_frame(addr, &skb, true);
++ ipvlan_rcv_frame(addr, &skb, true);
++ return NET_XMIT_SUCCESS;
+ }
+ }
+ out:
+@@ -611,7 +612,8 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
+ consume_skb(skb);
+ return NET_XMIT_DROP;
+ }
+- return ipvlan_rcv_frame(addr, &skb, true);
++ ipvlan_rcv_frame(addr, &skb, true);
++ return NET_XMIT_SUCCESS;
+ }
+ }
+ skb = skb_share_check(skb, GFP_ATOMIC);
+@@ -623,7 +625,8 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
+ * the skb for the main-dev. At the RX side we just return
+ * RX_PASS for it to be processed further on the stack.
+ */
+- return dev_forward_skb(ipvlan->phy_dev, skb);
++ dev_forward_skb(ipvlan->phy_dev, skb);
++ return NET_XMIT_SUCCESS;
+
+ } else if (is_multicast_ether_addr(eth->h_dest)) {
+ skb_reset_mac_header(skb);
+diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
+index 0fe78826c8fa4..32183f24e63ff 100644
+--- a/drivers/net/ppp/pptp.c
++++ b/drivers/net/ppp/pptp.c
+@@ -24,6 +24,7 @@
+ #include <linux/in.h>
+ #include <linux/ip.h>
+ #include <linux/rcupdate.h>
++#include <linux/security.h>
+ #include <linux/spinlock.h>
+
+ #include <net/sock.h>
+@@ -128,6 +129,23 @@ static void del_chan(struct pppox_sock *sock)
+ spin_unlock(&chan_lock);
+ }
+
++static struct rtable *pptp_route_output(struct pppox_sock *po,
++ struct flowi4 *fl4)
++{
++ struct sock *sk = &po->sk;
++ struct net *net;
++
++ net = sock_net(sk);
++ flowi4_init_output(fl4, sk->sk_bound_dev_if, sk->sk_mark, 0,
++ RT_SCOPE_UNIVERSE, IPPROTO_GRE, 0,
++ po->proto.pptp.dst_addr.sin_addr.s_addr,
++ po->proto.pptp.src_addr.sin_addr.s_addr,
++ 0, 0, sock_net_uid(net, sk));
++ security_sk_classify_flow(sk, flowi4_to_flowi_common(fl4));
++
++ return ip_route_output_flow(net, fl4, sk);
++}
++
+ static int pptp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
+ {
+ struct sock *sk = (struct sock *) chan->private;
+@@ -151,11 +169,7 @@ static int pptp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
+ if (sk_pppox(po)->sk_state & PPPOX_DEAD)
+ goto tx_error;
+
+- rt = ip_route_output_ports(net, &fl4, NULL,
+- opt->dst_addr.sin_addr.s_addr,
+- opt->src_addr.sin_addr.s_addr,
+- 0, 0, IPPROTO_GRE,
+- RT_TOS(0), sk->sk_bound_dev_if);
++ rt = pptp_route_output(po, &fl4);
+ if (IS_ERR(rt))
+ goto tx_error;
+
+@@ -438,12 +452,7 @@ static int pptp_connect(struct socket *sock, struct sockaddr *uservaddr,
+ po->chan.private = sk;
+ po->chan.ops = &pptp_chan_ops;
+
+- rt = ip_route_output_ports(sock_net(sk), &fl4, sk,
+- opt->dst_addr.sin_addr.s_addr,
+- opt->src_addr.sin_addr.s_addr,
+- 0, 0,
+- IPPROTO_GRE, RT_CONN_FLAGS(sk),
+- sk->sk_bound_dev_if);
++ rt = pptp_route_output(po, &fl4);
+ if (IS_ERR(rt)) {
+ error = -EHOSTUNREACH;
+ goto end;
+diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c
+index 43c8c84e7ea82..6d1bd9f52d02a 100644
+--- a/drivers/net/wireguard/netlink.c
++++ b/drivers/net/wireguard/netlink.c
+@@ -546,6 +546,7 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info)
+ u8 *private_key = nla_data(info->attrs[WGDEVICE_A_PRIVATE_KEY]);
+ u8 public_key[NOISE_PUBLIC_KEY_LEN];
+ struct wg_peer *peer, *temp;
++ bool send_staged_packets;
+
+ if (!crypto_memneq(wg->static_identity.static_private,
+ private_key, NOISE_PUBLIC_KEY_LEN))
+@@ -564,14 +565,17 @@ static int wg_set_device(struct sk_buff *skb, struct genl_info *info)
+ }
+
+ down_write(&wg->static_identity.lock);
+- wg_noise_set_static_identity_private_key(&wg->static_identity,
+- private_key);
+- list_for_each_entry_safe(peer, temp, &wg->peer_list,
+- peer_list) {
++ send_staged_packets = !wg->static_identity.has_identity && netif_running(wg->dev);
++ wg_noise_set_static_identity_private_key(&wg->static_identity, private_key);
++ send_staged_packets = send_staged_packets && wg->static_identity.has_identity;
++
++ wg_cookie_checker_precompute_device_keys(&wg->cookie_checker);
++ list_for_each_entry_safe(peer, temp, &wg->peer_list, peer_list) {
+ wg_noise_precompute_static_static(peer);
+ wg_noise_expire_current_peer_keypairs(peer);
++ if (send_staged_packets)
++ wg_packet_send_staged_packets(peer);
+ }
+- wg_cookie_checker_precompute_device_keys(&wg->cookie_checker);
+ up_write(&wg->static_identity.lock);
+ }
+ skip_set_private_key:
+diff --git a/drivers/net/wireguard/queueing.c b/drivers/net/wireguard/queueing.c
+index 8084e7408c0ae..26d235d152352 100644
+--- a/drivers/net/wireguard/queueing.c
++++ b/drivers/net/wireguard/queueing.c
+@@ -28,6 +28,7 @@ int wg_packet_queue_init(struct crypt_queue *queue, work_func_t function,
+ int ret;
+
+ memset(queue, 0, sizeof(*queue));
++ queue->last_cpu = -1;
+ ret = ptr_ring_init(&queue->ring, len, GFP_KERNEL);
+ if (ret)
+ return ret;
+diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
+index 125284b346a77..1ea4f874e367e 100644
+--- a/drivers/net/wireguard/queueing.h
++++ b/drivers/net/wireguard/queueing.h
+@@ -117,20 +117,17 @@ static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
+ return cpu;
+ }
+
+-/* This function is racy, in the sense that next is unlocked, so it could return
+- * the same CPU twice. A race-free version of this would be to instead store an
+- * atomic sequence number, do an increment-and-return, and then iterate through
+- * every possible CPU until we get to that index -- choose_cpu. However that's
+- * a bit slower, and it doesn't seem like this potential race actually
+- * introduces any performance loss, so we live with it.
++/* This function is racy, in the sense that it's called while last_cpu is
++ * unlocked, so it could return the same CPU twice. Adding locking or using
++ * atomic sequence numbers is slower though, and the consequences of racing are
++ * harmless, so live with it.
+ */
+-static inline int wg_cpumask_next_online(int *next)
++static inline int wg_cpumask_next_online(int *last_cpu)
+ {
+- int cpu = *next;
+-
+- while (unlikely(!cpumask_test_cpu(cpu, cpu_online_mask)))
+- cpu = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
+- *next = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
++ int cpu = cpumask_next(*last_cpu, cpu_online_mask);
++ if (cpu >= nr_cpu_ids)
++ cpu = cpumask_first(cpu_online_mask);
++ *last_cpu = cpu;
+ return cpu;
+ }
+
+@@ -159,7 +156,7 @@ static inline void wg_prev_queue_drop_peeked(struct prev_queue *queue)
+
+ static inline int wg_queue_enqueue_per_device_and_peer(
+ struct crypt_queue *device_queue, struct prev_queue *peer_queue,
+- struct sk_buff *skb, struct workqueue_struct *wq, int *next_cpu)
++ struct sk_buff *skb, struct workqueue_struct *wq)
+ {
+ int cpu;
+
+@@ -173,7 +170,7 @@ static inline int wg_queue_enqueue_per_device_and_peer(
+ /* Then we queue it up in the device queue, which consumes the
+ * packet as soon as it can.
+ */
+- cpu = wg_cpumask_next_online(next_cpu);
++ cpu = wg_cpumask_next_online(&device_queue->last_cpu);
+ if (unlikely(ptr_ring_produce_bh(&device_queue->ring, skb)))
+ return -EPIPE;
+ queue_work_on(cpu, wq, &per_cpu_ptr(device_queue->worker, cpu)->work);
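wg_cpumask_next_online() above now simply advances last_cpu to the next online CPU and wraps to the first online one when it runs off the end. A user-space sketch of the same wrap-around walk over a small, made-up online map:

#include <stdio.h>

#define NR_CPUS 8

/* pretend CPUs 1, 2 and 5 are online */
static const int cpu_online[NR_CPUS] = { 0, 1, 1, 0, 0, 1, 0, 0 };

static int next_online_cpu(int *last_cpu)
{
	int cpu = *last_cpu;
	int i;

	/* scan at most NR_CPUS entries, starting after last_cpu, with wrap */
	for (i = 0; i < NR_CPUS; i++) {
		cpu = (cpu + 1) % NR_CPUS;
		if (cpu_online[cpu])
			break;
	}
	*last_cpu = cpu;
	return cpu;
}

int main(void)
{
	int last_cpu = -1;
	int i;

	for (i = 0; i < 6; i++)
		printf("%d ", next_online_cpu(&last_cpu));	/* 1 2 5 1 2 5 */
	printf("\n");
	return 0;
}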
+diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
+index 7135d51d2d872..0b3f0c8435509 100644
+--- a/drivers/net/wireguard/receive.c
++++ b/drivers/net/wireguard/receive.c
+@@ -524,7 +524,7 @@ static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)
+ goto err;
+
+ ret = wg_queue_enqueue_per_device_and_peer(&wg->decrypt_queue, &peer->rx_queue, skb,
+- wg->packet_crypt_wq, &wg->decrypt_queue.last_cpu);
++ wg->packet_crypt_wq);
+ if (unlikely(ret == -EPIPE))
+ wg_queue_enqueue_per_peer_rx(skb, PACKET_STATE_DEAD);
+ if (likely(!ret || ret == -EPIPE)) {
+diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c
+index 5368f7c35b4bf..95c853b59e1da 100644
+--- a/drivers/net/wireguard/send.c
++++ b/drivers/net/wireguard/send.c
+@@ -318,7 +318,7 @@ static void wg_packet_create_data(struct wg_peer *peer, struct sk_buff *first)
+ goto err;
+
+ ret = wg_queue_enqueue_per_device_and_peer(&wg->encrypt_queue, &peer->tx_queue, first,
+- wg->packet_crypt_wq, &wg->encrypt_queue.last_cpu);
++ wg->packet_crypt_wq);
+ if (unlikely(ret == -EPIPE))
+ wg_queue_enqueue_per_peer_tx(first, PACKET_STATE_DEAD);
+ err:
+diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
+index 5eb131ab916fd..6cdb225b7eacc 100644
+--- a/drivers/net/wireless/ath/ath10k/core.c
++++ b/drivers/net/wireless/ath/ath10k/core.c
+@@ -2504,7 +2504,6 @@ EXPORT_SYMBOL(ath10k_core_napi_sync_disable);
+ static void ath10k_core_restart(struct work_struct *work)
+ {
+ struct ath10k *ar = container_of(work, struct ath10k, restart_work);
+- struct ath10k_vif *arvif;
+ int ret;
+
+ set_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags);
+@@ -2543,14 +2542,6 @@ static void ath10k_core_restart(struct work_struct *work)
+ ar->state = ATH10K_STATE_RESTARTING;
+ ath10k_halt(ar);
+ ath10k_scan_finish(ar);
+- if (ar->hw_params.hw_restart_disconnect) {
+- list_for_each_entry(arvif, &ar->arvifs, list) {
+- if (arvif->is_up &&
+- arvif->vdev_type == WMI_VDEV_TYPE_STA)
+- ieee80211_hw_restart_disconnect(arvif->vif);
+- }
+- }
+-
+ ieee80211_restart_hw(ar->hw);
+ break;
+ case ATH10K_STATE_OFF:
+@@ -3643,6 +3634,9 @@ struct ath10k *ath10k_core_create(size_t priv_size, struct device *dev,
+ mutex_init(&ar->dump_mutex);
+ spin_lock_init(&ar->data_lock);
+
++ for (int ac = 0; ac < IEEE80211_NUM_ACS; ac++)
++ spin_lock_init(&ar->queue_lock[ac]);
++
+ INIT_LIST_HEAD(&ar->peers);
+ init_waitqueue_head(&ar->peer_mapping_wq);
+ init_waitqueue_head(&ar->htt.empty_tx_wq);
+diff --git a/drivers/net/wireless/ath/ath10k/core.h b/drivers/net/wireless/ath/ath10k/core.h
+index f5de8ce8fb456..4b5239de40184 100644
+--- a/drivers/net/wireless/ath/ath10k/core.h
++++ b/drivers/net/wireless/ath/ath10k/core.h
+@@ -1170,6 +1170,9 @@ struct ath10k {
+ /* protects shared structure data */
+ spinlock_t data_lock;
+
++ /* serialize wake_tx_queue calls per ac */
++ spinlock_t queue_lock[IEEE80211_NUM_ACS];
++
+ struct list_head arvifs;
+ struct list_head peers;
+ struct ath10k_peer *peer_map[ATH10K_MAX_NUM_PEER_IDS];
+diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
+index 7675858f069bd..03e7bc5b6c0bd 100644
+--- a/drivers/net/wireless/ath/ath10k/mac.c
++++ b/drivers/net/wireless/ath/ath10k/mac.c
+@@ -4732,13 +4732,14 @@ static void ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,
+ {
+ struct ath10k *ar = hw->priv;
+ int ret;
+- u8 ac;
++ u8 ac = txq->ac;
+
+ ath10k_htt_tx_txq_update(hw, txq);
+ if (ar->htt.tx_q_state.mode != HTT_TX_MODE_SWITCH_PUSH)
+ return;
+
+- ac = txq->ac;
++ spin_lock_bh(&ar->queue_lock[ac]);
++
+ ieee80211_txq_schedule_start(hw, ac);
+ txq = ieee80211_next_txq(hw, ac);
+ if (!txq)
+@@ -4753,6 +4754,7 @@ static void ath10k_mac_op_wake_tx_queue(struct ieee80211_hw *hw,
+ ath10k_htt_tx_txq_update(hw, txq);
+ out:
+ ieee80211_txq_schedule_end(hw, ac);
++ spin_unlock_bh(&ar->queue_lock[ac]);
+ }
+
+ /* Must not be called with conf_mutex held as workers can use that also. */
+@@ -8107,6 +8109,7 @@ static void ath10k_reconfig_complete(struct ieee80211_hw *hw,
+ enum ieee80211_reconfig_type reconfig_type)
+ {
+ struct ath10k *ar = hw->priv;
++ struct ath10k_vif *arvif;
+
+ if (reconfig_type != IEEE80211_RECONFIG_TYPE_RESTART)
+ return;
+@@ -8121,6 +8124,12 @@ static void ath10k_reconfig_complete(struct ieee80211_hw *hw,
+ ar->state = ATH10K_STATE_ON;
+ ieee80211_wake_queues(ar->hw);
+ clear_bit(ATH10K_FLAG_RESTARTING, &ar->dev_flags);
++ if (ar->hw_params.hw_restart_disconnect) {
++ list_for_each_entry(arvif, &ar->arvifs, list) {
++ if (arvif->is_up && arvif->vdev_type == WMI_VDEV_TYPE_STA)
++ ieee80211_hw_restart_disconnect(arvif->vif);
++ }
++ }
+ }
+
+ mutex_unlock(&ar->conf_mutex);
+diff --git a/drivers/net/wireless/ath/ath11k/ahb.c b/drivers/net/wireless/ath/ath11k/ahb.c
+index 5cbba9a8b6ba9..396548e57022f 100644
+--- a/drivers/net/wireless/ath/ath11k/ahb.c
++++ b/drivers/net/wireless/ath/ath11k/ahb.c
+@@ -1127,6 +1127,7 @@ static int ath11k_ahb_probe(struct platform_device *pdev)
+ switch (hw_rev) {
+ case ATH11K_HW_IPQ8074:
+ case ATH11K_HW_IPQ6018_HW10:
++ case ATH11K_HW_IPQ5018_HW10:
+ hif_ops = &ath11k_ahb_hif_ops_ipq8074;
+ pci_ops = NULL;
+ break;
+diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c
+index b1b90bd34d67e..9de23c11e18bb 100644
+--- a/drivers/net/wireless/ath/ath11k/core.c
++++ b/drivers/net/wireless/ath/ath11k/core.c
+@@ -664,6 +664,7 @@ static const struct ath11k_hw_params ath11k_hw_params[] = {
+ .hal_params = &ath11k_hw_hal_params_ipq8074,
+ .single_pdev_only = false,
+ .cold_boot_calib = true,
++ .cbcal_restart_fw = true,
+ .fix_l1ss = true,
+ .supports_dynamic_smps_6ghz = false,
+ .alloc_cacheable_memory = true,
+diff --git a/drivers/net/wireless/ath/ath11k/hw.c b/drivers/net/wireless/ath/ath11k/hw.c
+index eb995f9cf0fa1..72797289b33e2 100644
+--- a/drivers/net/wireless/ath/ath11k/hw.c
++++ b/drivers/net/wireless/ath/ath11k/hw.c
+@@ -1175,7 +1175,7 @@ const struct ath11k_hw_ops ipq5018_ops = {
+ .mpdu_info_get_peerid = ath11k_hw_ipq8074_mpdu_info_get_peerid,
+ .rx_desc_mac_addr2_valid = ath11k_hw_ipq9074_rx_desc_mac_addr2_valid,
+ .rx_desc_mpdu_start_addr2 = ath11k_hw_ipq9074_rx_desc_mpdu_start_addr2,
+-
++ .get_ring_selector = ath11k_hw_ipq8074_get_tcl_ring_selector,
+ };
+
+ #define ATH11K_TX_RING_MASK_0 BIT(0)
+diff --git a/drivers/net/wireless/ath/ath11k/qmi.c b/drivers/net/wireless/ath/ath11k/qmi.c
+index ab923e24b0a9c..2328b9447cf1b 100644
+--- a/drivers/net/wireless/ath/ath11k/qmi.c
++++ b/drivers/net/wireless/ath/ath11k/qmi.c
+@@ -2058,6 +2058,9 @@ static int ath11k_qmi_assign_target_mem_chunk(struct ath11k_base *ab)
+ ab->qmi.target_mem[idx].iaddr =
+ ioremap(ab->qmi.target_mem[idx].paddr,
+ ab->qmi.target_mem[i].size);
++ if (!ab->qmi.target_mem[idx].iaddr)
++ return -EIO;
++
+ ab->qmi.target_mem[idx].size = ab->qmi.target_mem[i].size;
+ host_ddr_sz = ab->qmi.target_mem[i].size;
+ ab->qmi.target_mem[idx].type = ab->qmi.target_mem[i].type;
+@@ -2083,6 +2086,8 @@ static int ath11k_qmi_assign_target_mem_chunk(struct ath11k_base *ab)
+ ab->qmi.target_mem[idx].iaddr =
+ ioremap(ab->qmi.target_mem[idx].paddr,
+ ab->qmi.target_mem[i].size);
++ if (!ab->qmi.target_mem[idx].iaddr)
++ return -EIO;
+ } else {
+ ab->qmi.target_mem[idx].paddr =
+ ATH11K_QMI_CALDB_ADDRESS;
+diff --git a/drivers/net/wireless/ath/ath9k/ar9003_hw.c b/drivers/net/wireless/ath/ath9k/ar9003_hw.c
+index 4f27a9fb1482b..e9bd13eeee92f 100644
+--- a/drivers/net/wireless/ath/ath9k/ar9003_hw.c
++++ b/drivers/net/wireless/ath/ath9k/ar9003_hw.c
+@@ -1099,17 +1099,22 @@ static bool ath9k_hw_verify_hang(struct ath_hw *ah, unsigned int queue)
+ {
+ u32 dma_dbg_chain, dma_dbg_complete;
+ u8 dcu_chain_state, dcu_complete_state;
++ unsigned int dbg_reg, reg_offset;
+ int i;
+
+- for (i = 0; i < NUM_STATUS_READS; i++) {
+- if (queue < 6)
+- dma_dbg_chain = REG_READ(ah, AR_DMADBG_4);
+- else
+- dma_dbg_chain = REG_READ(ah, AR_DMADBG_5);
++ if (queue < 6) {
++ dbg_reg = AR_DMADBG_4;
++ reg_offset = queue * 5;
++ } else {
++ dbg_reg = AR_DMADBG_5;
++ reg_offset = (queue - 6) * 5;
++ }
+
++ for (i = 0; i < NUM_STATUS_READS; i++) {
++ dma_dbg_chain = REG_READ(ah, dbg_reg);
+ dma_dbg_complete = REG_READ(ah, AR_DMADBG_6);
+
+- dcu_chain_state = (dma_dbg_chain >> (5 * queue)) & 0x1f;
++ dcu_chain_state = (dma_dbg_chain >> reg_offset) & 0x1f;
+ dcu_complete_state = dma_dbg_complete & 0x3;
+
+ if ((dcu_chain_state != 0x6) || (dcu_complete_state != 0x1))
+@@ -1128,6 +1133,7 @@ static bool ar9003_hw_detect_mac_hang(struct ath_hw *ah)
+ u8 dcu_chain_state, dcu_complete_state;
+ bool dcu_wait_frdone = false;
+ unsigned long chk_dcu = 0;
++ unsigned int reg_offset;
+ unsigned int i = 0;
+
+ dma_dbg_4 = REG_READ(ah, AR_DMADBG_4);
+@@ -1139,12 +1145,15 @@ static bool ar9003_hw_detect_mac_hang(struct ath_hw *ah)
+ goto exit;
+
+ for (i = 0; i < ATH9K_NUM_TX_QUEUES; i++) {
+- if (i < 6)
++ if (i < 6) {
+ chk_dbg = dma_dbg_4;
+- else
++ reg_offset = i * 5;
++ } else {
+ chk_dbg = dma_dbg_5;
++ reg_offset = (i - 6) * 5;
++ }
+
+- dcu_chain_state = (chk_dbg >> (5 * i)) & 0x1f;
++ dcu_chain_state = (chk_dbg >> reg_offset) & 0x1f;
+ if (dcu_chain_state == 0x6) {
+ dcu_wait_frdone = true;
+ chk_dcu |= BIT(i);
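The ar9003 hang-check hunks above correct the shift used to extract a queue's 5-bit DCU chain state: queues 0-5 come from AR_DMADBG_4 at offset queue * 5, queues 6-9 from AR_DMADBG_5 at offset (queue - 6) * 5. A stand-alone sketch of that register selection and field extraction, with invented register values:

#include <stdint.h>
#include <stdio.h>

#define NUM_TX_QUEUES 10

static unsigned int dcu_chain_state(uint32_t dbg4, uint32_t dbg5,
				    unsigned int queue)
{
	uint32_t reg;
	unsigned int offset;

	if (queue < 6) {			/* queues 0-5 live in DMADBG_4 */
		reg = dbg4;
		offset = queue * 5;
	} else {				/* queues 6-9 live in DMADBG_5 */
		reg = dbg5;
		offset = (queue - 6) * 5;
	}
	return (reg >> offset) & 0x1f;		/* 5 bits per queue */
}

int main(void)
{
	/* pretend queue 1 and queue 7 both report state 0x6 ("wait frdone") */
	uint32_t dbg4 = 0x6u << 5;
	uint32_t dbg5 = 0x6u << 5;
	unsigned int q;

	for (q = 0; q < NUM_TX_QUEUES; q++)
		printf("queue %u: state 0x%x\n", q, dcu_chain_state(dbg4, dbg5, q));
	return 0;
}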
+diff --git a/drivers/net/wireless/ath/ath9k/htc_hst.c b/drivers/net/wireless/ath/ath9k/htc_hst.c
+index fe62ff668f757..99667aba289df 100644
+--- a/drivers/net/wireless/ath/ath9k/htc_hst.c
++++ b/drivers/net/wireless/ath/ath9k/htc_hst.c
+@@ -114,7 +114,13 @@ static void htc_process_conn_rsp(struct htc_target *target,
+
+ if (svc_rspmsg->status == HTC_SERVICE_SUCCESS) {
+ epid = svc_rspmsg->endpoint_id;
+- if (epid < 0 || epid >= ENDPOINT_MAX)
++
++	/* Check that the epid received for the endpoint that a new
++	 * service is being attached to is valid. ENDPOINT0 can't be
++	 * used here as it is already reserved for the
++	 * HTC_CTRL_RSVD_SVC service and thus should not be modified.
++	 */
++ if (epid <= ENDPOINT0 || epid >= ENDPOINT_MAX)
+ return;
+
+ service_id = be16_to_cpu(svc_rspmsg->service_id);
+diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
+index a4197c14f0a92..6360d3356e256 100644
+--- a/drivers/net/wireless/ath/ath9k/main.c
++++ b/drivers/net/wireless/ath/ath9k/main.c
+@@ -203,7 +203,7 @@ void ath_cancel_work(struct ath_softc *sc)
+ void ath_restart_work(struct ath_softc *sc)
+ {
+ ieee80211_queue_delayed_work(sc->hw, &sc->hw_check_work,
+- ATH_HW_CHECK_POLL_INT);
++ msecs_to_jiffies(ATH_HW_CHECK_POLL_INT));
+
+ if (AR_SREV_9340(sc->sc_ah) || AR_SREV_9330(sc->sc_ah))
+ ieee80211_queue_delayed_work(sc->hw, &sc->hw_pll_work,
+@@ -850,7 +850,7 @@ static bool ath9k_txq_list_has_key(struct list_head *txq_list, u32 keyix)
+ static bool ath9k_txq_has_key(struct ath_softc *sc, u32 keyix)
+ {
+ struct ath_hw *ah = sc->sc_ah;
+- int i;
++ int i, j;
+ struct ath_txq *txq;
+ bool key_in_use = false;
+
+@@ -868,8 +868,9 @@ static bool ath9k_txq_has_key(struct ath_softc *sc, u32 keyix)
+ if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) {
+ int idx = txq->txq_tailidx;
+
+- while (!key_in_use &&
+- !list_empty(&txq->txq_fifo[idx])) {
++ for (j = 0; !key_in_use &&
++ !list_empty(&txq->txq_fifo[idx]) &&
++ j < ATH_TXFIFO_DEPTH; j++) {
+ key_in_use = ath9k_txq_list_has_key(
+ &txq->txq_fifo[idx], keyix);
+ INCR(idx, ATH_TXFIFO_DEPTH);
+@@ -2239,7 +2240,7 @@ void __ath9k_flush(struct ieee80211_hw *hw, u32 queues, bool drop,
+ }
+
+ ieee80211_queue_delayed_work(hw, &sc->hw_check_work,
+- ATH_HW_CHECK_POLL_INT);
++ msecs_to_jiffies(ATH_HW_CHECK_POLL_INT));
+ }
+
+ static bool ath9k_tx_frames_pending(struct ieee80211_hw *hw)
+diff --git a/drivers/net/wireless/ath/ath9k/wmi.c b/drivers/net/wireless/ath/ath9k/wmi.c
+index 19345b8f7bfd5..d652c647d56b5 100644
+--- a/drivers/net/wireless/ath/ath9k/wmi.c
++++ b/drivers/net/wireless/ath/ath9k/wmi.c
+@@ -221,6 +221,10 @@ static void ath9k_wmi_ctrl_rx(void *priv, struct sk_buff *skb,
+ if (unlikely(wmi->stopped))
+ goto free_skb;
+
++ /* Validate the obtained SKB. */
++ if (unlikely(skb->len < sizeof(struct wmi_cmd_hdr)))
++ goto free_skb;
++
+ hdr = (struct wmi_cmd_hdr *) skb->data;
+ cmd_id = be16_to_cpu(hdr->command_id);
+
+diff --git a/drivers/net/wireless/atmel/atmel_cs.c b/drivers/net/wireless/atmel/atmel_cs.c
+index 453bb84cb3386..58bba9875d366 100644
+--- a/drivers/net/wireless/atmel/atmel_cs.c
++++ b/drivers/net/wireless/atmel/atmel_cs.c
+@@ -72,6 +72,7 @@ struct local_info {
+ static int atmel_probe(struct pcmcia_device *p_dev)
+ {
+ struct local_info *local;
++ int ret;
+
+ dev_dbg(&p_dev->dev, "atmel_attach()\n");
+
+@@ -82,8 +83,16 @@ static int atmel_probe(struct pcmcia_device *p_dev)
+
+ p_dev->priv = local;
+
+- return atmel_config(p_dev);
+-} /* atmel_attach */
++ ret = atmel_config(p_dev);
++ if (ret)
++ goto err_free_priv;
++
++ return 0;
++
++err_free_priv:
++ kfree(p_dev->priv);
++ return ret;
++}
+
+ static void atmel_detach(struct pcmcia_device *link)
+ {
+diff --git a/drivers/net/wireless/intel/iwlwifi/fw/api/rs.h b/drivers/net/wireless/intel/iwlwifi/fw/api/rs.h
+index c9a48fc5fac88..a1a272433b09b 100644
+--- a/drivers/net/wireless/intel/iwlwifi/fw/api/rs.h
++++ b/drivers/net/wireless/intel/iwlwifi/fw/api/rs.h
+@@ -21,6 +21,7 @@
+ * @IWL_TLC_MNG_CFG_FLAGS_HE_DCM_NSS_2_MSK: enable HE Dual Carrier Modulation
+ * for BPSK (MCS 0) with 2 spatial
+ * streams
++ * @IWL_TLC_MNG_CFG_FLAGS_EHT_EXTRA_LTF_MSK: enable support for EHT extra LTF
+ */
+ enum iwl_tlc_mng_cfg_flags {
+ IWL_TLC_MNG_CFG_FLAGS_STBC_MSK = BIT(0),
+@@ -28,6 +29,7 @@ enum iwl_tlc_mng_cfg_flags {
+ IWL_TLC_MNG_CFG_FLAGS_HE_STBC_160MHZ_MSK = BIT(2),
+ IWL_TLC_MNG_CFG_FLAGS_HE_DCM_NSS_1_MSK = BIT(3),
+ IWL_TLC_MNG_CFG_FLAGS_HE_DCM_NSS_2_MSK = BIT(4),
++ IWL_TLC_MNG_CFG_FLAGS_EHT_EXTRA_LTF_MSK = BIT(6),
+ };
+
+ /**
+diff --git a/drivers/net/wireless/intel/iwlwifi/fw/dump.c b/drivers/net/wireless/intel/iwlwifi/fw/dump.c
+index f86f7b4baa181..f61f1ce7fe795 100644
+--- a/drivers/net/wireless/intel/iwlwifi/fw/dump.c
++++ b/drivers/net/wireless/intel/iwlwifi/fw/dump.c
+@@ -507,11 +507,16 @@ void iwl_fwrt_dump_error_logs(struct iwl_fw_runtime *fwrt)
+ iwl_fwrt_dump_fseq_regs(fwrt);
+ if (fwrt->trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_22000) {
+ pc_data = fwrt->trans->dbg.pc_data;
++
++ if (!iwl_trans_grab_nic_access(fwrt->trans))
++ return;
+ for (count = 0; count < fwrt->trans->dbg.num_pc;
+ count++, pc_data++)
+ IWL_ERR(fwrt, "%s: 0x%x\n",
+ pc_data->pc_name,
+- pc_data->pc_address);
++ iwl_read_prph_no_grab(fwrt->trans,
++ pc_data->pc_address));
++ iwl_trans_release_nic_access(fwrt->trans);
+ }
+
+ if (fwrt->trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_BZ) {
+diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c
+index 7dcb1c3ab7282..be0eb69f2248a 100644
+--- a/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c
++++ b/drivers/net/wireless/intel/iwlwifi/iwl-nvm-parse.c
+@@ -975,6 +975,8 @@ iwl_nvm_fixup_sband_iftd(struct iwl_trans *trans,
+ iftype_data->eht_cap.eht_cap_elem.phy_cap_info[6] &=
+ ~(IEEE80211_EHT_PHY_CAP6_MCS15_SUPP_MASK |
+ IEEE80211_EHT_PHY_CAP6_EHT_DUP_6GHZ_SUPP);
++ iftype_data->eht_cap.eht_cap_elem.phy_cap_info[5] |=
++ IEEE80211_EHT_PHY_CAP5_SUPP_EXTRA_EHT_LTF;
+ }
+
+ if (fw_has_capa(&fw->ucode_capa, IWL_UCODE_TLV_CAPA_BROADCAST_TWT))
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
+index 205c09bc98634..a6367909d7fe4 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
+@@ -1699,9 +1699,11 @@ int iwl_mvm_up(struct iwl_mvm *mvm)
+
+ if (test_bit(IWL_MVM_STATUS_IN_HW_RESTART, &mvm->status)) {
+ iwl_mvm_send_recovery_cmd(mvm, ERROR_RECOVERY_UPDATE_DB);
+- iwl_mvm_time_sync_config(mvm, mvm->time_sync.peer_addr,
+- IWL_TIME_SYNC_PROTOCOL_TM |
+- IWL_TIME_SYNC_PROTOCOL_FTM);
++
++ if (mvm->time_sync.active)
++ iwl_mvm_time_sync_config(mvm, mvm->time_sync.peer_addr,
++ IWL_TIME_SYNC_PROTOCOL_TM |
++ IWL_TIME_SYNC_PROTOCOL_FTM);
+ }
+
+ if (!mvm->ptp_data.ptp_clock)
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
+index 17f788a5ff6ba..f23cd100cf252 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
+@@ -2285,8 +2285,7 @@ bool iwl_mvm_is_nic_ack_enabled(struct iwl_mvm *mvm, struct ieee80211_vif *vif)
+ * so take it from one of them.
+ */
+ sband = mvm->hw->wiphy->bands[NL80211_BAND_2GHZ];
+- own_he_cap = ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(vif));
++ own_he_cap = ieee80211_get_he_iftype_cap_vif(sband, vif);
+
+ return (own_he_cap && (own_he_cap->he_cap_elem.mac_cap_info[2] &
+ IEEE80211_HE_MAC_CAP2_ACK_EN));
+@@ -3468,8 +3467,7 @@ static void iwl_mvm_reset_cca_40mhz_workaround(struct iwl_mvm *mvm,
+
+ sband->ht_cap.cap |= IEEE80211_HT_CAP_SUP_WIDTH_20_40;
+
+- he_cap = ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(vif));
++ he_cap = ieee80211_get_he_iftype_cap_vif(sband, vif);
+
+ if (he_cap) {
+ /* we know that ours is writable */
+@@ -3848,6 +3846,7 @@ int iwl_mvm_mac_sta_state_common(struct ieee80211_hw *hw,
+ struct iwl_mvm *mvm = IWL_MAC80211_GET_MVM(hw);
+ struct iwl_mvm_vif *mvmvif = iwl_mvm_vif_from_mac80211(vif);
+ struct iwl_mvm_sta *mvm_sta = iwl_mvm_sta_from_mac80211(sta);
++ struct ieee80211_link_sta *link_sta;
+ unsigned int link_id;
+ int ret;
+
+@@ -3889,7 +3888,7 @@ int iwl_mvm_mac_sta_state_common(struct ieee80211_hw *hw,
+ mutex_lock(&mvm->mutex);
+
+ /* this would be a mac80211 bug ... but don't crash */
+- for_each_mvm_vif_valid_link(mvmvif, link_id) {
++ for_each_sta_active_link(vif, sta, link_sta, link_id) {
+ if (WARN_ON_ONCE(!mvmvif->link[link_id]->phy_ctxt)) {
+ mutex_unlock(&mvm->mutex);
+ return test_bit(IWL_MVM_STATUS_HW_RESTART_REQUESTED,
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/ops.c b/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
+index 32625bfacaaef..8a4415ef540d1 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/ops.c
+@@ -1,6 +1,6 @@
+ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+ /*
+- * Copyright (C) 2012-2014, 2018-2020 Intel Corporation
++ * Copyright (C) 2012-2014, 2018-2023 Intel Corporation
+ * Copyright (C) 2013-2015 Intel Mobile Communications GmbH
+ * Copyright (C) 2016-2017 Intel Deutschland GmbH
+ */
+@@ -192,8 +192,7 @@ static void iwl_mvm_rx_monitor_notif(struct iwl_mvm *mvm,
+ WARN_ON(!(sband->ht_cap.cap & IEEE80211_HT_CAP_SUP_WIDTH_20_40));
+ sband->ht_cap.cap &= ~IEEE80211_HT_CAP_SUP_WIDTH_20_40;
+
+- he_cap = ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(vif));
++ he_cap = ieee80211_get_he_iftype_cap_vif(sband, vif);
+
+ if (he_cap) {
+ /* we know that ours is writable */
+@@ -1743,8 +1742,11 @@ static void iwl_mvm_queue_state_change(struct iwl_op_mode *op_mode,
+ else
+ set_bit(IWL_MVM_TXQ_STATE_STOP_FULL, &mvmtxq->state);
+
+- if (start && mvmsta->sta_state != IEEE80211_STA_NOTEXIST)
++ if (start && mvmsta->sta_state != IEEE80211_STA_NOTEXIST) {
++ local_bh_disable();
+ iwl_mvm_mac_itxq_xmit(mvm->hw, txq);
++ local_bh_enable();
++ }
+ }
+
+ out:
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rs-fw.c b/drivers/net/wireless/intel/iwlwifi/mvm/rs-fw.c
+index c3a00bfbeef2c..680180b894794 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/rs-fw.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/rs-fw.c
+@@ -1,7 +1,7 @@
+ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+ /*
+ * Copyright (C) 2017 Intel Deutschland GmbH
+- * Copyright (C) 2018-2022 Intel Corporation
++ * Copyright (C) 2018-2023 Intel Corporation
+ */
+ #include "rs.h"
+ #include "fw-api.h"
+@@ -63,12 +63,11 @@ static u8 rs_fw_sgi_cw_support(struct ieee80211_link_sta *link_sta)
+ static u16 rs_fw_get_config_flags(struct iwl_mvm *mvm,
+ struct ieee80211_vif *vif,
+ struct ieee80211_link_sta *link_sta,
+- struct ieee80211_supported_band *sband)
++ const struct ieee80211_sta_he_cap *sband_he_cap)
+ {
+ struct ieee80211_sta_ht_cap *ht_cap = &link_sta->ht_cap;
+ struct ieee80211_sta_vht_cap *vht_cap = &link_sta->vht_cap;
+ struct ieee80211_sta_he_cap *he_cap = &link_sta->he_cap;
+- const struct ieee80211_sta_he_cap *sband_he_cap;
+ bool vht_ena = vht_cap->vht_supported;
+ u16 flags = 0;
+
+@@ -94,8 +93,6 @@ static u16 rs_fw_get_config_flags(struct iwl_mvm *mvm,
+ IEEE80211_HE_PHY_CAP1_LDPC_CODING_IN_PAYLOAD))
+ flags |= IWL_TLC_MNG_CFG_FLAGS_LDPC_MSK;
+
+- sband_he_cap = ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(vif));
+ if (sband_he_cap &&
+ !(sband_he_cap->he_cap_elem.phy_cap_info[1] &
+ IEEE80211_HE_PHY_CAP1_LDPC_CODING_IN_PAYLOAD))
+@@ -197,16 +194,14 @@ static u16 rs_fw_he_ieee80211_mcs_to_rs_mcs(u16 mcs)
+
+ static void
+ rs_fw_he_set_enabled_rates(const struct ieee80211_link_sta *link_sta,
+- struct ieee80211_supported_band *sband,
++ const struct ieee80211_sta_he_cap *sband_he_cap,
+ struct iwl_tlc_config_cmd_v4 *cmd)
+ {
+ const struct ieee80211_sta_he_cap *he_cap = &link_sta->he_cap;
+ u16 mcs_160 = le16_to_cpu(he_cap->he_mcs_nss_supp.rx_mcs_160);
+ u16 mcs_80 = le16_to_cpu(he_cap->he_mcs_nss_supp.rx_mcs_80);
+- u16 tx_mcs_80 =
+- le16_to_cpu(sband->iftype_data->he_cap.he_mcs_nss_supp.tx_mcs_80);
+- u16 tx_mcs_160 =
+- le16_to_cpu(sband->iftype_data->he_cap.he_mcs_nss_supp.tx_mcs_160);
++ u16 tx_mcs_80 = le16_to_cpu(sband_he_cap->he_mcs_nss_supp.tx_mcs_80);
++ u16 tx_mcs_160 = le16_to_cpu(sband_he_cap->he_mcs_nss_supp.tx_mcs_160);
+ int i;
+ u8 nss = link_sta->rx_nss;
+
+@@ -289,7 +284,8 @@ rs_fw_rs_mcs2eht_mcs(enum IWL_TLC_MCS_PER_BW bw,
+ static void
+ rs_fw_eht_set_enabled_rates(struct ieee80211_vif *vif,
+ const struct ieee80211_link_sta *link_sta,
+- struct ieee80211_supported_band *sband,
++ const struct ieee80211_sta_he_cap *sband_he_cap,
++ const struct ieee80211_sta_eht_cap *sband_eht_cap,
+ struct iwl_tlc_config_cmd_v4 *cmd)
+ {
+ /* peer RX mcs capa */
+@@ -297,7 +293,7 @@ rs_fw_eht_set_enabled_rates(struct ieee80211_vif *vif,
+ &link_sta->eht_cap.eht_mcs_nss_supp;
+ /* our TX mcs capa */
+ const struct ieee80211_eht_mcs_nss_supp *eht_tx_mcs =
+- &sband->iftype_data->eht_cap.eht_mcs_nss_supp;
++ &sband_eht_cap->eht_mcs_nss_supp;
+
+ enum IWL_TLC_MCS_PER_BW bw;
+ struct ieee80211_eht_mcs_nss_supp_20mhz_only mcs_rx_20;
+@@ -316,7 +312,7 @@ rs_fw_eht_set_enabled_rates(struct ieee80211_vif *vif,
+ }
+
+ /* nic is 20Mhz only */
+- if (!(sband->iftype_data->he_cap.he_cap_elem.phy_cap_info[0] &
++ if (!(sband_he_cap->he_cap_elem.phy_cap_info[0] &
+ IEEE80211_HE_PHY_CAP0_CHANNEL_WIDTH_SET_MASK_ALL)) {
+ mcs_tx_20 = eht_tx_mcs->only_20mhz;
+ } else {
+@@ -370,6 +366,8 @@ rs_fw_eht_set_enabled_rates(struct ieee80211_vif *vif,
+ static void rs_fw_set_supp_rates(struct ieee80211_vif *vif,
+ struct ieee80211_link_sta *link_sta,
+ struct ieee80211_supported_band *sband,
++ const struct ieee80211_sta_he_cap *sband_he_cap,
++ const struct ieee80211_sta_eht_cap *sband_eht_cap,
+ struct iwl_tlc_config_cmd_v4 *cmd)
+ {
+ int i;
+@@ -388,12 +386,13 @@ static void rs_fw_set_supp_rates(struct ieee80211_vif *vif,
+ cmd->mode = IWL_TLC_MNG_MODE_NON_HT;
+
+ /* HT/VHT rates */
+- if (link_sta->eht_cap.has_eht) {
++ if (link_sta->eht_cap.has_eht && sband_he_cap && sband_eht_cap) {
+ cmd->mode = IWL_TLC_MNG_MODE_EHT;
+- rs_fw_eht_set_enabled_rates(vif, link_sta, sband, cmd);
+- } else if (he_cap->has_he) {
++ rs_fw_eht_set_enabled_rates(vif, link_sta, sband_he_cap,
++ sband_eht_cap, cmd);
++ } else if (he_cap->has_he && sband_he_cap) {
+ cmd->mode = IWL_TLC_MNG_MODE_HE;
+- rs_fw_he_set_enabled_rates(link_sta, sband, cmd);
++ rs_fw_he_set_enabled_rates(link_sta, sband_he_cap, cmd);
+ } else if (vht_cap->vht_supported) {
+ cmd->mode = IWL_TLC_MNG_MODE_VHT;
+ rs_fw_vht_set_enabled_rates(link_sta, vht_cap, cmd);
+@@ -576,13 +575,17 @@ void iwl_mvm_rs_fw_rate_init(struct iwl_mvm *mvm,
+ u32 cmd_id = WIDE_ID(DATA_PATH_GROUP, TLC_MNG_CONFIG_CMD);
+ struct ieee80211_supported_band *sband = hw->wiphy->bands[band];
+ u16 max_amsdu_len = rs_fw_get_max_amsdu_len(sta, link_conf, link_sta);
++ const struct ieee80211_sta_he_cap *sband_he_cap =
++ ieee80211_get_he_iftype_cap_vif(sband, vif);
++ const struct ieee80211_sta_eht_cap *sband_eht_cap =
++ ieee80211_get_eht_iftype_cap_vif(sband, vif);
+ struct iwl_mvm_link_sta *mvm_link_sta;
+ struct iwl_lq_sta_rs_fw *lq_sta;
+ struct iwl_tlc_config_cmd_v4 cfg_cmd = {
+ .max_ch_width = mvmsta->authorized ?
+ rs_fw_bw_from_sta_bw(link_sta) : IWL_TLC_MNG_CH_WIDTH_20MHZ,
+ .flags = cpu_to_le16(rs_fw_get_config_flags(mvm, vif, link_sta,
+- sband)),
++ sband_he_cap)),
+ .chains = rs_fw_set_active_chains(iwl_mvm_get_valid_tx_ant(mvm)),
+ .sgi_ch_width_supp = rs_fw_sgi_cw_support(link_sta),
+ .max_mpdu_len = iwl_mvm_is_csum_supported(mvm) ?
+@@ -592,6 +595,21 @@ void iwl_mvm_rs_fw_rate_init(struct iwl_mvm *mvm,
+ int cmd_ver;
+ int ret;
+
++ /* Enable external EHT LTF only for GL device and if there's
++ * mutual support by AP and client
++ */
++ if (CSR_HW_REV_TYPE(mvm->trans->hw_rev) == IWL_CFG_MAC_TYPE_GL &&
++ sband_eht_cap &&
++ sband_eht_cap->eht_cap_elem.phy_cap_info[5] &
++ IEEE80211_EHT_PHY_CAP5_SUPP_EXTRA_EHT_LTF &&
++ link_sta->eht_cap.has_eht &&
++ link_sta->eht_cap.eht_cap_elem.phy_cap_info[5] &
++ IEEE80211_EHT_PHY_CAP5_SUPP_EXTRA_EHT_LTF) {
++ IWL_DEBUG_RATE(mvm, "Set support for Extra EHT LTF\n");
++ cfg_cmd.flags |=
++ cpu_to_le16(IWL_TLC_MNG_CFG_FLAGS_EHT_EXTRA_LTF_MSK);
++ }
++
+ rcu_read_lock();
+ mvm_link_sta = rcu_dereference(mvmsta->link[link_id]);
+ if (WARN_ON_ONCE(!mvm_link_sta)) {
+@@ -609,7 +627,9 @@ void iwl_mvm_rs_fw_rate_init(struct iwl_mvm *mvm,
+ #ifdef CONFIG_IWLWIFI_DEBUGFS
+ iwl_mvm_reset_frame_stats(mvm);
+ #endif
+- rs_fw_set_supp_rates(vif, link_sta, sband, &cfg_cmd);
++ rs_fw_set_supp_rates(vif, link_sta, sband,
++ sband_he_cap, sband_eht_cap,
++ &cfg_cmd);
+
+ /*
+ * since TLC offload works with one mode we can assume
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
+index 6226e4e54a51d..38f8d19f718ee 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
+@@ -279,7 +279,8 @@ static void iwl_mvm_get_signal_strength(struct iwl_mvm *mvm,
+ static int iwl_mvm_rx_mgmt_prot(struct ieee80211_sta *sta,
+ struct ieee80211_hdr *hdr,
+ struct iwl_rx_mpdu_desc *desc,
+- u32 status)
++ u32 status,
++ struct ieee80211_rx_status *stats)
+ {
+ struct iwl_mvm_sta *mvmsta;
+ struct iwl_mvm_vif *mvmvif;
+@@ -308,8 +309,10 @@ static int iwl_mvm_rx_mgmt_prot(struct ieee80211_sta *sta,
+
+ /* good cases */
+ if (likely(status & IWL_RX_MPDU_STATUS_MIC_OK &&
+- !(status & IWL_RX_MPDU_STATUS_REPLAY_ERROR)))
++ !(status & IWL_RX_MPDU_STATUS_REPLAY_ERROR))) {
++ stats->flag |= RX_FLAG_DECRYPTED;
+ return 0;
++ }
+
+ if (!sta)
+ return -1;
+@@ -378,7 +381,7 @@ static int iwl_mvm_rx_crypto(struct iwl_mvm *mvm, struct ieee80211_sta *sta,
+
+ if (unlikely(ieee80211_is_mgmt(hdr->frame_control) &&
+ !ieee80211_has_protected(hdr->frame_control)))
+- return iwl_mvm_rx_mgmt_prot(sta, hdr, desc, status);
++ return iwl_mvm_rx_mgmt_prot(sta, hdr, desc, status, stats);
+
+ if (!ieee80211_has_protected(hdr->frame_control) ||
+ (status & IWL_RX_MPDU_STATUS_SEC_MASK) ==
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+index 05a54a69c1357..b85e363544f8b 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+@@ -1859,6 +1859,8 @@ int iwl_mvm_add_sta(struct iwl_mvm *mvm,
+
+ ret = iwl_mvm_sta_init(mvm, vif, sta, sta_id,
+ sta->tdls ? IWL_STA_TDLS_LINK : IWL_STA_LINK);
++ if (ret)
++ goto err;
+
+ update_fw:
+ ret = iwl_mvm_sta_send_to_fw(mvm, sta, sta_update, sta_flags);
+diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
+index 0d7890f99a5fb..90a46faaaffdf 100644
+--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
++++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
+@@ -1636,14 +1636,14 @@ irqreturn_t iwl_pcie_irq_rx_msix_handler(int irq, void *dev_id)
+ struct msix_entry *entry = dev_id;
+ struct iwl_trans_pcie *trans_pcie = iwl_pcie_get_trans_pcie(entry);
+ struct iwl_trans *trans = trans_pcie->trans;
+- struct iwl_rxq *rxq = &trans_pcie->rxq[entry->entry];
++ struct iwl_rxq *rxq;
+
+ trace_iwlwifi_dev_irq_msix(trans->dev, entry, false, 0, 0);
+
+ if (WARN_ON(entry->entry >= trans->num_rx_queues))
+ return IRQ_NONE;
+
+- if (!rxq) {
++ if (!trans_pcie->rxq) {
+ if (net_ratelimit())
+ IWL_ERR(trans,
+ "[%d] Got MSI-X interrupt before we have Rx queues\n",
+@@ -1651,6 +1651,7 @@ irqreturn_t iwl_pcie_irq_rx_msix_handler(int irq, void *dev_id)
+ return IRQ_NONE;
+ }
+
++ rxq = &trans_pcie->rxq[entry->entry];
+ lock_map_acquire(&trans->sync_cmd_lockdep_map);
+ IWL_DEBUG_ISR(trans, "[%d] Got interrupt\n", entry->entry);
+
+diff --git a/drivers/net/wireless/intersil/orinoco/orinoco_cs.c b/drivers/net/wireless/intersil/orinoco/orinoco_cs.c
+index a956f965a1e5e..03bfd2482656c 100644
+--- a/drivers/net/wireless/intersil/orinoco/orinoco_cs.c
++++ b/drivers/net/wireless/intersil/orinoco/orinoco_cs.c
+@@ -96,6 +96,7 @@ orinoco_cs_probe(struct pcmcia_device *link)
+ {
+ struct orinoco_private *priv;
+ struct orinoco_pccard *card;
++ int ret;
+
+ priv = alloc_orinocodev(sizeof(*card), &link->dev,
+ orinoco_cs_hard_reset, NULL);
+@@ -107,8 +108,16 @@ orinoco_cs_probe(struct pcmcia_device *link)
+ card->p_dev = link;
+ link->priv = priv;
+
+- return orinoco_cs_config(link);
+-} /* orinoco_cs_attach */
++ ret = orinoco_cs_config(link);
++ if (ret)
++ goto err_free_orinocodev;
++
++ return 0;
++
++err_free_orinocodev:
++ free_orinocodev(priv);
++ return ret;
++}
+
+ static void orinoco_cs_detach(struct pcmcia_device *link)
+ {
+diff --git a/drivers/net/wireless/intersil/orinoco/spectrum_cs.c b/drivers/net/wireless/intersil/orinoco/spectrum_cs.c
+index 291ef97ed45ec..841d623c621ac 100644
+--- a/drivers/net/wireless/intersil/orinoco/spectrum_cs.c
++++ b/drivers/net/wireless/intersil/orinoco/spectrum_cs.c
+@@ -157,6 +157,7 @@ spectrum_cs_probe(struct pcmcia_device *link)
+ {
+ struct orinoco_private *priv;
+ struct orinoco_pccard *card;
++ int ret;
+
+ priv = alloc_orinocodev(sizeof(*card), &link->dev,
+ spectrum_cs_hard_reset,
+@@ -169,8 +170,16 @@ spectrum_cs_probe(struct pcmcia_device *link)
+ card->p_dev = link;
+ link->priv = priv;
+
+- return spectrum_cs_config(link);
+-} /* spectrum_cs_attach */
++ ret = spectrum_cs_config(link);
++ if (ret)
++ goto err_free_orinocodev;
++
++ return 0;
++
++err_free_orinocodev:
++ free_orinocodev(priv);
++ return ret;
++}
+
+ static void spectrum_cs_detach(struct pcmcia_device *link)
+ {
+diff --git a/drivers/net/wireless/legacy/ray_cs.c b/drivers/net/wireless/legacy/ray_cs.c
+index 1f57a0055bbd8..38782d4c4694a 100644
+--- a/drivers/net/wireless/legacy/ray_cs.c
++++ b/drivers/net/wireless/legacy/ray_cs.c
+@@ -270,13 +270,14 @@ static int ray_probe(struct pcmcia_device *p_dev)
+ {
+ ray_dev_t *local;
+ struct net_device *dev;
++ int ret;
+
+ dev_dbg(&p_dev->dev, "ray_attach()\n");
+
+ /* Allocate space for private device-specific data */
+ dev = alloc_etherdev(sizeof(ray_dev_t));
+ if (!dev)
+- goto fail_alloc_dev;
++ return -ENOMEM;
+
+ local = netdev_priv(dev);
+ local->finder = p_dev;
+@@ -313,11 +314,16 @@ static int ray_probe(struct pcmcia_device *p_dev)
+ timer_setup(&local->timer, NULL, 0);
+
+ this_device = p_dev;
+- return ray_config(p_dev);
++ ret = ray_config(p_dev);
++ if (ret)
++ goto err_free_dev;
++
++ return 0;
+
+-fail_alloc_dev:
+- return -ENOMEM;
+-} /* ray_attach */
++err_free_dev:
++ free_netdev(dev);
++ return ret;
++}
+
+ static void ray_detach(struct pcmcia_device *link)
+ {
+diff --git a/drivers/net/wireless/legacy/wl3501_cs.c b/drivers/net/wireless/legacy/wl3501_cs.c
+index 7fb2f95134760..c45c4b7cbbaf1 100644
+--- a/drivers/net/wireless/legacy/wl3501_cs.c
++++ b/drivers/net/wireless/legacy/wl3501_cs.c
+@@ -1862,6 +1862,7 @@ static int wl3501_probe(struct pcmcia_device *p_dev)
+ {
+ struct net_device *dev;
+ struct wl3501_card *this;
++ int ret;
+
+ /* The io structure describes IO port mapping */
+ p_dev->resource[0]->end = 16;
+@@ -1873,8 +1874,7 @@ static int wl3501_probe(struct pcmcia_device *p_dev)
+
+ dev = alloc_etherdev(sizeof(struct wl3501_card));
+ if (!dev)
+- goto out_link;
+-
++ return -ENOMEM;
+
+ dev->netdev_ops = &wl3501_netdev_ops;
+ dev->watchdog_timeo = 5 * HZ;
+@@ -1887,9 +1887,15 @@ static int wl3501_probe(struct pcmcia_device *p_dev)
+ netif_stop_queue(dev);
+ p_dev->priv = dev;
+
+- return wl3501_config(p_dev);
+-out_link:
+- return -ENOMEM;
++ ret = wl3501_config(p_dev);
++ if (ret)
++ goto out_free_etherdev;
++
++ return 0;
++
++out_free_etherdev:
++ free_netdev(dev);
++ return ret;
+ }
+
+ static int wl3501_config(struct pcmcia_device *link)
+diff --git a/drivers/net/wireless/marvell/mwifiex/scan.c b/drivers/net/wireless/marvell/mwifiex/scan.c
+index ac8001c842935..644b1e134b01c 100644
+--- a/drivers/net/wireless/marvell/mwifiex/scan.c
++++ b/drivers/net/wireless/marvell/mwifiex/scan.c
+@@ -2187,9 +2187,9 @@ int mwifiex_ret_802_11_scan(struct mwifiex_private *priv,
+
+ if (nd_config) {
+ adapter->nd_info =
+- kzalloc(sizeof(struct cfg80211_wowlan_nd_match) +
+- sizeof(struct cfg80211_wowlan_nd_match *) *
+- scan_rsp->number_of_sets, GFP_ATOMIC);
++ kzalloc(struct_size(adapter->nd_info, matches,
++ scan_rsp->number_of_sets),
++ GFP_ATOMIC);
+
+ if (adapter->nd_info)
+ adapter->nd_info->n_matches = scan_rsp->number_of_sets;
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/dma.c b/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
+index f0a80c2b476ab..4153cd6c2a01d 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/dma.c
+@@ -231,10 +231,6 @@ int mt7921_dma_init(struct mt7921_dev *dev)
+ if (ret)
+ return ret;
+
+- ret = mt7921_wfsys_reset(dev);
+- if (ret)
+- return ret;
+-
+ /* init tx queue */
+ ret = mt76_connac_init_tx_queues(dev->phy.mt76, MT7921_TXQ_BAND0,
+ MT7921_TX_RING_SIZE,
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
+index c69ce6df49561..f55caa00ac69b 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c
+@@ -476,12 +476,6 @@ static int mt7921_load_firmware(struct mt7921_dev *dev)
+ {
+ int ret;
+
+- ret = mt76_get_field(dev, MT_CONN_ON_MISC, MT_TOP_MISC2_FW_N9_RDY);
+- if (ret && mt76_is_mmio(&dev->mt76)) {
+- dev_dbg(dev->mt76.dev, "Firmware is already download\n");
+- goto fw_loaded;
+- }
+-
+ ret = mt76_connac2_load_patch(&dev->mt76, mt7921_patch_name(dev));
+ if (ret)
+ return ret;
+@@ -504,8 +498,6 @@ static int mt7921_load_firmware(struct mt7921_dev *dev)
+ return -EIO;
+ }
+
+-fw_loaded:
+-
+ #ifdef CONFIG_PM
+ dev->mt76.hw->wiphy->wowlan = &mt76_connac_wowlan_support;
+ #endif /* CONFIG_PM */
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+index ddb1fa4ee01d7..95610a117d2f0 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+@@ -325,6 +325,10 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
+ bus_ops->rmw = mt7921_rmw;
+ dev->mt76.bus = bus_ops;
+
++ ret = mt7921e_mcu_fw_pmctrl(dev);
++ if (ret)
++ goto err_free_dev;
++
+ ret = __mt7921e_mcu_drv_pmctrl(dev);
+ if (ret)
+ goto err_free_dev;
+@@ -333,6 +337,10 @@ static int mt7921_pci_probe(struct pci_dev *pdev,
+ (mt7921_l1_rr(dev, MT_HW_REV) & 0xff);
+ dev_info(mdev->dev, "ASIC revision: %04x\n", mdev->rev);
+
++ ret = mt7921_wfsys_reset(dev);
++ if (ret)
++ goto err_free_dev;
++
+ mt76_wr(dev, MT_WFDMA0_HOST_INT_ENA, 0);
+
+ mt76_wr(dev, MT_PCIE_MAC_INT_ENABLE, 0xff);
+diff --git a/drivers/net/wireless/microchip/wilc1000/hif.c b/drivers/net/wireless/microchip/wilc1000/hif.c
+index 5adc69d5bcae3..a28da59384813 100644
+--- a/drivers/net/wireless/microchip/wilc1000/hif.c
++++ b/drivers/net/wireless/microchip/wilc1000/hif.c
+@@ -485,6 +485,9 @@ void *wilc_parse_join_bss_param(struct cfg80211_bss *bss,
+ int rsn_ie_len = sizeof(struct element) + rsn_ie[1];
+ int offset = 8;
+
++ param->mode_802_11i = 2;
++ param->rsn_found = true;
++
+ /* extract RSN capabilities */
+ if (offset < rsn_ie_len) {
+ /* skip over pairwise suites */
+@@ -494,11 +497,8 @@ void *wilc_parse_join_bss_param(struct cfg80211_bss *bss,
+ /* skip over authentication suites */
+ offset += (rsn_ie[offset] * 4) + 2;
+
+- if (offset + 1 < rsn_ie_len) {
+- param->mode_802_11i = 2;
+- param->rsn_found = true;
++ if (offset + 1 < rsn_ie_len)
+ memcpy(param->rsn_cap, &rsn_ie[offset], 2);
+- }
+ }
+ }
+ }
+diff --git a/drivers/net/wireless/realtek/rtw88/mac80211.c b/drivers/net/wireless/realtek/rtw88/mac80211.c
+index 144618bb94c86..09bcc2345bb05 100644
+--- a/drivers/net/wireless/realtek/rtw88/mac80211.c
++++ b/drivers/net/wireless/realtek/rtw88/mac80211.c
+@@ -164,8 +164,10 @@ static int rtw_ops_add_interface(struct ieee80211_hw *hw,
+ mutex_lock(&rtwdev->mutex);
+
+ port = find_first_zero_bit(rtwdev->hw_port, RTW_PORT_NUM);
+- if (port >= RTW_PORT_NUM)
++ if (port >= RTW_PORT_NUM) {
++ mutex_unlock(&rtwdev->mutex);
+ return -EINVAL;
++ }
+ set_bit(port, rtwdev->hw_port);
+
+ rtwvif->port = port;
+diff --git a/drivers/net/wireless/realtek/rtw88/usb.c b/drivers/net/wireless/realtek/rtw88/usb.c
+index 44a5fafb99055..976eafa739a2d 100644
+--- a/drivers/net/wireless/realtek/rtw88/usb.c
++++ b/drivers/net/wireless/realtek/rtw88/usb.c
+@@ -535,7 +535,7 @@ static void rtw_usb_rx_handler(struct work_struct *work)
+ }
+
+ if (skb_queue_len(&rtwusb->rx_queue) >= RTW_USB_MAX_RXQ_LEN) {
+- rtw_err(rtwdev, "failed to get rx_queue, overflow\n");
++ dev_dbg_ratelimited(rtwdev->dev, "failed to get rx_queue, overflow\n");
+ dev_kfree_skb_any(skb);
+ continue;
+ }
+diff --git a/drivers/net/wireless/realtek/rtw89/core.c b/drivers/net/wireless/realtek/rtw89/core.c
+index bad864d56bd5c..5423f8ae187f1 100644
+--- a/drivers/net/wireless/realtek/rtw89/core.c
++++ b/drivers/net/wireless/realtek/rtw89/core.c
+@@ -3584,7 +3584,7 @@ static void rtw89_read_chip_ver(struct rtw89_dev *rtwdev)
+
+ if (chip->chip_id == RTL8852B || chip->chip_id == RTL8851B) {
+ ret = rtw89_mac_read_xtal_si(rtwdev, XTAL_SI_CV, &val);
+- if (!ret)
++ if (ret)
+ return;
+
+ rtwdev->hal.acv = u8_get_bits(val, XTAL_SI_ACV_MASK);
+diff --git a/drivers/net/wireless/rsi/rsi_91x_sdio.c b/drivers/net/wireless/rsi/rsi_91x_sdio.c
+index d09998796ac08..1911fef3bbad6 100644
+--- a/drivers/net/wireless/rsi/rsi_91x_sdio.c
++++ b/drivers/net/wireless/rsi/rsi_91x_sdio.c
+@@ -1463,10 +1463,8 @@ static void rsi_shutdown(struct device *dev)
+
+ rsi_dbg(ERR_ZONE, "SDIO Bus shutdown =====>\n");
+
+- if (hw) {
+- struct cfg80211_wowlan *wowlan = hw->wiphy->wowlan_config;
+-
+- if (rsi_config_wowlan(adapter, wowlan))
++ if (hw && hw->wiphy && hw->wiphy->wowlan_config) {
++ if (rsi_config_wowlan(adapter, hw->wiphy->wowlan_config))
+ rsi_dbg(ERR_ZONE, "Failed to configure WoWLAN\n");
+ }
+
+@@ -1481,9 +1479,6 @@ static void rsi_shutdown(struct device *dev)
+ if (sdev->write_fail)
+ rsi_dbg(INFO_ZONE, "###### Device is not ready #######\n");
+
+- if (rsi_set_sdio_pm_caps(adapter))
+- rsi_dbg(INFO_ZONE, "Setting power management caps failed\n");
+-
+ rsi_dbg(INFO_ZONE, "***** RSI module shut down *****\n");
+ }
+
+diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
+index 3ec38e2b91732..3395e27438393 100644
+--- a/drivers/nvme/host/core.c
++++ b/drivers/nvme/host/core.c
+@@ -3872,8 +3872,10 @@ static ssize_t nvme_ctrl_dhchap_secret_store(struct device *dev,
+ int ret;
+
+ ret = nvme_auth_generate_key(dhchap_secret, &key);
+- if (ret)
++ if (ret) {
++ kfree(dhchap_secret);
+ return ret;
++ }
+ kfree(opts->dhchap_secret);
+ opts->dhchap_secret = dhchap_secret;
+ host_key = ctrl->host_key;
+@@ -3881,7 +3883,8 @@ static ssize_t nvme_ctrl_dhchap_secret_store(struct device *dev,
+ ctrl->host_key = key;
+ mutex_unlock(&ctrl->dhchap_auth_mutex);
+ nvme_auth_free_key(host_key);
+- }
++ } else
++ kfree(dhchap_secret);
+ /* Start re-authentication */
+ dev_info(ctrl->device, "re-authenticating controller\n");
+ queue_work(nvme_wq, &ctrl->dhchap_auth_work);
+@@ -3926,8 +3929,10 @@ static ssize_t nvme_ctrl_dhchap_ctrl_secret_store(struct device *dev,
+ int ret;
+
+ ret = nvme_auth_generate_key(dhchap_secret, &key);
+- if (ret)
++ if (ret) {
++ kfree(dhchap_secret);
+ return ret;
++ }
+ kfree(opts->dhchap_ctrl_secret);
+ opts->dhchap_ctrl_secret = dhchap_secret;
+ ctrl_key = ctrl->ctrl_key;
+@@ -3935,7 +3940,8 @@ static ssize_t nvme_ctrl_dhchap_ctrl_secret_store(struct device *dev,
+ ctrl->ctrl_key = key;
+ mutex_unlock(&ctrl->dhchap_auth_mutex);
+ nvme_auth_free_key(ctrl_key);
+- }
++ } else
++ kfree(dhchap_secret);
+ /* Start re-authentication */
+ dev_info(ctrl->device, "re-authenticating controller\n");
+ queue_work(nvme_wq, &ctrl->dhchap_auth_work);
+@@ -5243,6 +5249,8 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
+
+ return 0;
+ out_free_cdev:
++ nvme_fault_inject_fini(&ctrl->fault_inject);
++ dev_pm_qos_hide_latency_tolerance(ctrl->device);
+ cdev_device_del(&ctrl->cdev, ctrl->device);
+ out_free_name:
+ nvme_put_ctrl(ctrl);
+diff --git a/drivers/nvmem/imx-ocotp.c b/drivers/nvmem/imx-ocotp.c
+index ac0edb6398f1e..c1af271052276 100644
+--- a/drivers/nvmem/imx-ocotp.c
++++ b/drivers/nvmem/imx-ocotp.c
+@@ -97,7 +97,6 @@ struct ocotp_params {
+ unsigned int bank_address_words;
+ void (*set_timing)(struct ocotp_priv *priv);
+ struct ocotp_ctrl_reg ctrl;
+- bool reverse_mac_address;
+ };
+
+ static int imx_ocotp_wait_for_busy(struct ocotp_priv *priv, u32 flags)
+@@ -545,7 +544,6 @@ static const struct ocotp_params imx8mq_params = {
+ .bank_address_words = 0,
+ .set_timing = imx_ocotp_set_imx6_timing,
+ .ctrl = IMX_OCOTP_BM_CTRL_DEFAULT,
+- .reverse_mac_address = true,
+ };
+
+ static const struct ocotp_params imx8mm_params = {
+@@ -553,7 +551,6 @@ static const struct ocotp_params imx8mm_params = {
+ .bank_address_words = 0,
+ .set_timing = imx_ocotp_set_imx6_timing,
+ .ctrl = IMX_OCOTP_BM_CTRL_DEFAULT,
+- .reverse_mac_address = true,
+ };
+
+ static const struct ocotp_params imx8mn_params = {
+@@ -561,7 +558,6 @@ static const struct ocotp_params imx8mn_params = {
+ .bank_address_words = 0,
+ .set_timing = imx_ocotp_set_imx6_timing,
+ .ctrl = IMX_OCOTP_BM_CTRL_DEFAULT,
+- .reverse_mac_address = true,
+ };
+
+ static const struct ocotp_params imx8mp_params = {
+@@ -569,7 +565,6 @@ static const struct ocotp_params imx8mp_params = {
+ .bank_address_words = 0,
+ .set_timing = imx_ocotp_set_imx6_timing,
+ .ctrl = IMX_OCOTP_BM_CTRL_8MP,
+- .reverse_mac_address = true,
+ };
+
+ static const struct of_device_id imx_ocotp_dt_ids[] = {
+@@ -624,8 +619,7 @@ static int imx_ocotp_probe(struct platform_device *pdev)
+ imx_ocotp_nvmem_config.size = 4 * priv->params->nregs;
+ imx_ocotp_nvmem_config.dev = dev;
+ imx_ocotp_nvmem_config.priv = priv;
+- if (priv->params->reverse_mac_address)
+- imx_ocotp_nvmem_config.layout = &imx_ocotp_layout;
++ imx_ocotp_nvmem_config.layout = &imx_ocotp_layout;
+
+ priv->config = &imx_ocotp_nvmem_config;
+
+diff --git a/drivers/nvmem/rmem.c b/drivers/nvmem/rmem.c
+index 80cb187f14817..752d0bf4445ee 100644
+--- a/drivers/nvmem/rmem.c
++++ b/drivers/nvmem/rmem.c
+@@ -71,6 +71,7 @@ static int rmem_probe(struct platform_device *pdev)
+ config.dev = dev;
+ config.priv = priv;
+ config.name = "rmem";
++ config.id = NVMEM_DEVID_AUTO;
+ config.size = mem->size;
+ config.reg_read = rmem_read;
+
+diff --git a/drivers/nvmem/sunplus-ocotp.c b/drivers/nvmem/sunplus-ocotp.c
+index 52b928a7a6d58..f85350b17d672 100644
+--- a/drivers/nvmem/sunplus-ocotp.c
++++ b/drivers/nvmem/sunplus-ocotp.c
+@@ -192,9 +192,11 @@ static int sp_ocotp_probe(struct platform_device *pdev)
+ sp_ocotp_nvmem_config.dev = dev;
+
+ nvmem = devm_nvmem_register(dev, &sp_ocotp_nvmem_config);
+- if (IS_ERR(nvmem))
+- return dev_err_probe(&pdev->dev, PTR_ERR(nvmem),
++ if (IS_ERR(nvmem)) {
++ ret = dev_err_probe(&pdev->dev, PTR_ERR(nvmem),
+ "register nvmem device fail\n");
++ goto err;
++ }
+
+ platform_set_drvdata(pdev, nvmem);
+
+@@ -203,6 +205,9 @@ static int sp_ocotp_probe(struct platform_device *pdev)
+ (int)OTP_WORD_SIZE, (int)QAC628_OTP_SIZE);
+
+ return 0;
++err:
++ clk_unprepare(otp->clk);
++ return ret;
+ }
+
+ static const struct of_device_id sp_ocotp_dt_ids[] = {
+diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c b/drivers/pci/controller/cadence/pcie-cadence-host.c
+index 940c7dd701d68..5b14f7ee3c798 100644
+--- a/drivers/pci/controller/cadence/pcie-cadence-host.c
++++ b/drivers/pci/controller/cadence/pcie-cadence-host.c
+@@ -12,6 +12,8 @@
+
+ #include "pcie-cadence.h"
+
++#define LINK_RETRAIN_TIMEOUT HZ
++
+ static u64 bar_max_size[] = {
+ [RP_BAR0] = _ULL(128 * SZ_2G),
+ [RP_BAR1] = SZ_2G,
+@@ -77,6 +79,27 @@ static struct pci_ops cdns_pcie_host_ops = {
+ .write = pci_generic_config_write,
+ };
+
++static int cdns_pcie_host_training_complete(struct cdns_pcie *pcie)
++{
++ u32 pcie_cap_off = CDNS_PCIE_RP_CAP_OFFSET;
++ unsigned long end_jiffies;
++ u16 lnk_stat;
++
++ /* Wait for link training to complete. Exit after timeout. */
++ end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT;
++ do {
++ lnk_stat = cdns_pcie_rp_readw(pcie, pcie_cap_off + PCI_EXP_LNKSTA);
++ if (!(lnk_stat & PCI_EXP_LNKSTA_LT))
++ break;
++ usleep_range(0, 1000);
++ } while (time_before(jiffies, end_jiffies));
++
++ if (!(lnk_stat & PCI_EXP_LNKSTA_LT))
++ return 0;
++
++ return -ETIMEDOUT;
++}
++
+ static int cdns_pcie_host_wait_for_link(struct cdns_pcie *pcie)
+ {
+ struct device *dev = pcie->dev;
+@@ -118,6 +141,10 @@ static int cdns_pcie_retrain(struct cdns_pcie *pcie)
+ cdns_pcie_rp_writew(pcie, pcie_cap_off + PCI_EXP_LNKCTL,
+ lnk_ctl);
+
++ ret = cdns_pcie_host_training_complete(pcie);
++ if (ret)
++ return ret;
++
+ ret = cdns_pcie_host_wait_for_link(pcie);
+ }
+ return ret;
+diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
+index 4ab30892f6efb..2783e9c3ef1ba 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom.c
++++ b/drivers/pci/controller/dwc/pcie-qcom.c
+@@ -61,7 +61,6 @@
+ /* DBI registers */
+ #define AXI_MSTR_RESP_COMP_CTRL0 0x818
+ #define AXI_MSTR_RESP_COMP_CTRL1 0x81c
+-#define MISC_CONTROL_1_REG 0x8bc
+
+ /* MHI registers */
+ #define PARF_DEBUG_CNT_PM_LINKST_IN_L2 0xc04
+@@ -132,9 +131,6 @@
+ /* AXI_MSTR_RESP_COMP_CTRL1 register fields */
+ #define CFG_BRIDGE_SB_INIT BIT(0)
+
+-/* MISC_CONTROL_1_REG register fields */
+-#define DBI_RO_WR_EN 1
+-
+ /* PCI_EXP_SLTCAP register fields */
+ #define PCIE_CAP_SLOT_POWER_LIMIT_VAL FIELD_PREP(PCI_EXP_SLTCAP_SPLV, 250)
+ #define PCIE_CAP_SLOT_POWER_LIMIT_SCALE FIELD_PREP(PCI_EXP_SLTCAP_SPLS, 1)
+@@ -826,7 +822,9 @@ static int qcom_pcie_post_init_2_3_3(struct qcom_pcie *pcie)
+ writel(0, pcie->parf + PARF_Q2A_FLUSH);
+
+ writel(PCI_COMMAND_MASTER, pci->dbi_base + PCI_COMMAND);
+- writel(DBI_RO_WR_EN, pci->dbi_base + MISC_CONTROL_1_REG);
++
++ dw_pcie_dbi_ro_wr_en(pci);
++
+ writel(PCIE_CAP_SLOT_VAL, pci->dbi_base + offset + PCI_EXP_SLTCAP);
+
+ val = readl(pci->dbi_base + offset + PCI_EXP_LNKCAP);
+@@ -1136,6 +1134,7 @@ static int qcom_pcie_post_init_2_9_0(struct qcom_pcie *pcie)
+ writel(0, pcie->parf + PARF_Q2A_FLUSH);
+
+ dw_pcie_dbi_ro_wr_en(pci);
++
+ writel(PCIE_CAP_SLOT_VAL, pci->dbi_base + offset + PCI_EXP_SLTCAP);
+
+ val = readl(pci->dbi_base + offset + PCI_EXP_LNKCAP);
+@@ -1145,6 +1144,8 @@ static int qcom_pcie_post_init_2_9_0(struct qcom_pcie *pcie)
+ writel(PCI_EXP_DEVCTL2_COMP_TMOUT_DIS, pci->dbi_base + offset +
+ PCI_EXP_DEVCTL2);
+
++ dw_pcie_dbi_ro_wr_dis(pci);
++
+ for (i = 0; i < 256; i++)
+ writel(0, pcie->parf + PARF_BDF_TO_SID_TABLE_N + (4 * i));
+
+diff --git a/drivers/pci/controller/pci-ftpci100.c b/drivers/pci/controller/pci-ftpci100.c
+index ecd3009df586d..6e7981d2ed5e1 100644
+--- a/drivers/pci/controller/pci-ftpci100.c
++++ b/drivers/pci/controller/pci-ftpci100.c
+@@ -429,22 +429,12 @@ static int faraday_pci_probe(struct platform_device *pdev)
+ p->dev = dev;
+
+ /* Retrieve and enable optional clocks */
+- clk = devm_clk_get(dev, "PCLK");
++ clk = devm_clk_get_enabled(dev, "PCLK");
+ if (IS_ERR(clk))
+ return PTR_ERR(clk);
+- ret = clk_prepare_enable(clk);
+- if (ret) {
+- dev_err(dev, "could not prepare PCLK\n");
+- return ret;
+- }
+- p->bus_clk = devm_clk_get(dev, "PCICLK");
++ p->bus_clk = devm_clk_get_enabled(dev, "PCICLK");
+ if (IS_ERR(p->bus_clk))
+ return PTR_ERR(p->bus_clk);
+- ret = clk_prepare_enable(p->bus_clk);
+- if (ret) {
+- dev_err(dev, "could not prepare PCICLK\n");
+- return ret;
+- }
+
+ p->base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(p->base))
+diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
+index 990630ec57c6a..e718a816d4814 100644
+--- a/drivers/pci/controller/vmd.c
++++ b/drivers/pci/controller/vmd.c
+@@ -927,7 +927,8 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
+ if (!list_empty(&child->devices)) {
+ dev = list_first_entry(&child->devices,
+ struct pci_dev, bus_list);
+- if (pci_reset_bus(dev))
++ ret = pci_reset_bus(dev);
++ if (ret)
+ pci_warn(dev, "can't reset device: %d\n", ret);
+
+ break;
+@@ -1036,6 +1037,13 @@ static void vmd_remove(struct pci_dev *dev)
+ ida_simple_remove(&vmd_instance_ida, vmd->instance);
+ }
+
++static void vmd_shutdown(struct pci_dev *dev)
++{
++ struct vmd_dev *vmd = pci_get_drvdata(dev);
++
++ vmd_remove_irq_domain(vmd);
++}
++
+ #ifdef CONFIG_PM_SLEEP
+ static int vmd_suspend(struct device *dev)
+ {
+@@ -1101,6 +1109,7 @@ static struct pci_driver vmd_drv = {
+ .id_table = vmd_ids,
+ .probe = vmd_probe,
+ .remove = vmd_remove,
++ .shutdown = vmd_shutdown,
+ .driver = {
+ .pm = &vmd_dev_pm_ops,
+ },
+diff --git a/drivers/pci/endpoint/functions/Kconfig b/drivers/pci/endpoint/functions/Kconfig
+index 9fd5608868718..8efb6a869e7ce 100644
+--- a/drivers/pci/endpoint/functions/Kconfig
++++ b/drivers/pci/endpoint/functions/Kconfig
+@@ -27,7 +27,7 @@ config PCI_EPF_NTB
+ If in doubt, say "N" to disable Endpoint NTB driver.
+
+ config PCI_EPF_VNTB
+- tristate "PCI Endpoint NTB driver"
++ tristate "PCI Endpoint Virtual NTB driver"
+ depends on PCI_ENDPOINT
+ depends on NTB
+ select CONFIGFS_FS
+diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
+index 0f9d2ec822ac6..172e5ac0bd96c 100644
+--- a/drivers/pci/endpoint/functions/pci-epf-test.c
++++ b/drivers/pci/endpoint/functions/pci-epf-test.c
+@@ -112,7 +112,7 @@ static int pci_epf_test_data_transfer(struct pci_epf_test *epf_test,
+ size_t len, dma_addr_t dma_remote,
+ enum dma_transfer_direction dir)
+ {
+- struct dma_chan *chan = (dir == DMA_DEV_TO_MEM) ?
++ struct dma_chan *chan = (dir == DMA_MEM_TO_DEV) ?
+ epf_test->dma_chan_tx : epf_test->dma_chan_rx;
+ dma_addr_t dma_local = (dir == DMA_MEM_TO_DEV) ? dma_src : dma_dst;
+ enum dma_ctrl_flags flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;
+diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
+index 529c348084401..32baba1b7f131 100644
+--- a/drivers/pci/hotplug/pciehp_ctrl.c
++++ b/drivers/pci/hotplug/pciehp_ctrl.c
+@@ -256,6 +256,14 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
+ present = pciehp_card_present(ctrl);
+ link_active = pciehp_check_link_active(ctrl);
+ if (present <= 0 && link_active <= 0) {
++ if (ctrl->state == BLINKINGON_STATE) {
++ ctrl->state = OFF_STATE;
++ cancel_delayed_work(&ctrl->button_work);
++ pciehp_set_indicators(ctrl, PCI_EXP_SLTCTL_PWR_IND_OFF,
++ INDICATOR_NOOP);
++ ctrl_info(ctrl, "Slot(%s): Card not present\n",
++ slot_name(ctrl));
++ }
+ mutex_unlock(&ctrl->state_lock);
+ return;
+ }
+diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
+index 66d7514ca111b..db32335039d61 100644
+--- a/drivers/pci/pcie/aspm.c
++++ b/drivers/pci/pcie/aspm.c
+@@ -1010,21 +1010,24 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev)
+
+ down_read(&pci_bus_sem);
+ mutex_lock(&aspm_lock);
+- /*
+- * All PCIe functions are in one slot, remove one function will remove
+- * the whole slot, so just wait until we are the last function left.
+- */
+- if (!list_empty(&parent->subordinate->devices))
+- goto out;
+
+ link = parent->link_state;
+ root = link->root;
+ parent_link = link->parent;
+
+- /* All functions are removed, so just disable ASPM for the link */
++ /*
++ * link->downstream is a pointer to the pci_dev of function 0. If
++ * we remove that function, the pci_dev is about to be deallocated,
++ * so we can't use link->downstream again. Free the link state to
++ * avoid this.
++ *
++ * If we're removing a non-0 function, it's possible we could
++ * retain the link state, but PCIe r6.0, sec 7.5.3.7, recommends
++ * programming the same ASPM Control value for all functions of
++ * multi-function devices, so disable ASPM for all of them.
++ */
+ pcie_config_aspm_link(link, 0);
+ list_del(&link->sibling);
+- /* Clock PM is for endpoint device */
+ free_link_state(link);
+
+ /* Recheck latencies and configure upstream links */
+@@ -1032,7 +1035,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev)
+ pcie_update_aspm_capable(root);
+ pcie_config_aspm_path(parent_link);
+ }
+-out:
++
+ mutex_unlock(&aspm_lock);
+ up_read(&pci_bus_sem);
+ }
+diff --git a/drivers/perf/apple_m1_cpu_pmu.c b/drivers/perf/apple_m1_cpu_pmu.c
+index 8574c6e58c83a..cd2de44b61b91 100644
+--- a/drivers/perf/apple_m1_cpu_pmu.c
++++ b/drivers/perf/apple_m1_cpu_pmu.c
+@@ -493,6 +493,17 @@ static int m1_pmu_map_event(struct perf_event *event)
+ return armpmu_map_event(event, &m1_pmu_perf_map, NULL, M1_PMU_CFG_EVENT);
+ }
+
++static int m2_pmu_map_event(struct perf_event *event)
++{
++ /*
++ * Same deal as the above, except that M2 has 64bit counters.
++ * Which, as far as we're concerned, actually means 63 bits.
++ * Yes, this is getting awkward.
++ */
++ event->hw.flags |= ARMPMU_EVT_63BIT;
++ return armpmu_map_event(event, &m1_pmu_perf_map, NULL, M1_PMU_CFG_EVENT);
++}
++
+ static void m1_pmu_reset(void *info)
+ {
+ int i;
+@@ -525,7 +536,7 @@ static int m1_pmu_set_event_filter(struct hw_perf_event *event,
+ return 0;
+ }
+
+-static int m1_pmu_init(struct arm_pmu *cpu_pmu)
++static int m1_pmu_init(struct arm_pmu *cpu_pmu, u32 flags)
+ {
+ cpu_pmu->handle_irq = m1_pmu_handle_irq;
+ cpu_pmu->enable = m1_pmu_enable_event;
+@@ -536,7 +547,14 @@ static int m1_pmu_init(struct arm_pmu *cpu_pmu)
+ cpu_pmu->clear_event_idx = m1_pmu_clear_event_idx;
+ cpu_pmu->start = m1_pmu_start;
+ cpu_pmu->stop = m1_pmu_stop;
+- cpu_pmu->map_event = m1_pmu_map_event;
++
++ if (flags & ARMPMU_EVT_47BIT)
++ cpu_pmu->map_event = m1_pmu_map_event;
++ else if (flags & ARMPMU_EVT_63BIT)
++ cpu_pmu->map_event = m2_pmu_map_event;
++ else
++ return WARN_ON(-EINVAL);
++
+ cpu_pmu->reset = m1_pmu_reset;
+ cpu_pmu->set_event_filter = m1_pmu_set_event_filter;
+
+@@ -550,25 +568,25 @@ static int m1_pmu_init(struct arm_pmu *cpu_pmu)
+ static int m1_pmu_ice_init(struct arm_pmu *cpu_pmu)
+ {
+ cpu_pmu->name = "apple_icestorm_pmu";
+- return m1_pmu_init(cpu_pmu);
++ return m1_pmu_init(cpu_pmu, ARMPMU_EVT_47BIT);
+ }
+
+ static int m1_pmu_fire_init(struct arm_pmu *cpu_pmu)
+ {
+ cpu_pmu->name = "apple_firestorm_pmu";
+- return m1_pmu_init(cpu_pmu);
++ return m1_pmu_init(cpu_pmu, ARMPMU_EVT_47BIT);
+ }
+
+ static int m2_pmu_avalanche_init(struct arm_pmu *cpu_pmu)
+ {
+ cpu_pmu->name = "apple_avalanche_pmu";
+- return m1_pmu_init(cpu_pmu);
++ return m1_pmu_init(cpu_pmu, ARMPMU_EVT_63BIT);
+ }
+
+ static int m2_pmu_blizzard_init(struct arm_pmu *cpu_pmu)
+ {
+ cpu_pmu->name = "apple_blizzard_pmu";
+- return m1_pmu_init(cpu_pmu);
++ return m1_pmu_init(cpu_pmu, ARMPMU_EVT_63BIT);
+ }
+
+ static const struct of_device_id m1_pmu_of_device_ids[] = {
+diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
+index 47d359f729579..89a685a09d848 100644
+--- a/drivers/perf/arm-cmn.c
++++ b/drivers/perf/arm-cmn.c
+@@ -1899,9 +1899,10 @@ static int arm_cmn_init_dtc(struct arm_cmn *cmn, struct arm_cmn_node *dn, int id
+ if (dtc->irq < 0)
+ return dtc->irq;
+
+- writel_relaxed(0, dtc->base + CMN_DT_PMCR);
++ writel_relaxed(CMN_DT_DTC_CTL_DT_EN, dtc->base + CMN_DT_DTC_CTL);
++ writel_relaxed(CMN_DT_PMCR_PMU_EN | CMN_DT_PMCR_OVFL_INTR_EN, dtc->base + CMN_DT_PMCR);
++ writeq_relaxed(0, dtc->base + CMN_DT_PMCCNTR);
+ writel_relaxed(0x1ff, dtc->base + CMN_DT_PMOVSR_CLR);
+- writel_relaxed(CMN_DT_PMCR_OVFL_INTR_EN, dtc->base + CMN_DT_PMCR);
+
+ return 0;
+ }
+@@ -1961,7 +1962,7 @@ static int arm_cmn_init_dtcs(struct arm_cmn *cmn)
+ dn->type = CMN_TYPE_CCLA;
+ }
+
+- writel_relaxed(CMN_DT_DTC_CTL_DT_EN, cmn->dtc[0].base + CMN_DT_DTC_CTL);
++ arm_cmn_set_state(cmn, CMN_STATE_DISABLED);
+
+ return 0;
+ }
+diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
+index a3f1c410b4173..e8bc8fc1fb9c0 100644
+--- a/drivers/perf/arm_cspmu/arm_cspmu.c
++++ b/drivers/perf/arm_cspmu/arm_cspmu.c
+@@ -189,10 +189,10 @@ static inline bool use_64b_counter_reg(const struct arm_cspmu *cspmu)
+ ssize_t arm_cspmu_sysfs_event_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+ {
+- struct dev_ext_attribute *eattr =
+- container_of(attr, struct dev_ext_attribute, attr);
+- return sysfs_emit(buf, "event=0x%llx\n",
+- (unsigned long long)eattr->var);
++ struct perf_pmu_events_attr *pmu_attr;
++
++ pmu_attr = container_of(attr, typeof(*pmu_attr), attr);
++ return sysfs_emit(buf, "event=0x%llx\n", pmu_attr->id);
+ }
+ EXPORT_SYMBOL_GPL(arm_cspmu_sysfs_event_show);
+
+@@ -1232,7 +1232,8 @@ static struct platform_driver arm_cspmu_driver = {
+ static void arm_cspmu_set_active_cpu(int cpu, struct arm_cspmu *cspmu)
+ {
+ cpumask_set_cpu(cpu, &cspmu->active_cpu);
+- WARN_ON(irq_set_affinity(cspmu->irq, &cspmu->active_cpu));
++ if (cspmu->irq)
++ WARN_ON(irq_set_affinity(cspmu->irq, &cspmu->active_cpu));
+ }
+
+ static int arm_cspmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
+index 15bd1e34a88ea..277e29fbd504f 100644
+--- a/drivers/perf/arm_pmu.c
++++ b/drivers/perf/arm_pmu.c
+@@ -109,6 +109,8 @@ static inline u64 arm_pmu_event_max_period(struct perf_event *event)
+ {
+ if (event->hw.flags & ARMPMU_EVT_64BIT)
+ return GENMASK_ULL(63, 0);
++ else if (event->hw.flags & ARMPMU_EVT_63BIT)
++ return GENMASK_ULL(62, 0);
+ else if (event->hw.flags & ARMPMU_EVT_47BIT)
+ return GENMASK_ULL(46, 0);
+ else
+diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
+index 6fee0b6e163bb..e10fc7cb9493a 100644
+--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
++++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
+@@ -683,7 +683,7 @@ static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+
+ pcie_pmu->on_cpu = -1;
+ /* Choose a new CPU from all online cpus. */
+- target = cpumask_first(cpu_online_mask);
++ target = cpumask_any_but(cpu_online_mask, cpu);
+ if (target >= nr_cpu_ids) {
+ pci_err(pcie_pmu->pdev, "There is no CPU to set\n");
+ return 0;
+diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
+index f46e3148d286d..8dba9596408f2 100644
+--- a/drivers/phy/Kconfig
++++ b/drivers/phy/Kconfig
+@@ -18,6 +18,7 @@ config GENERIC_PHY
+
+ config GENERIC_PHY_MIPI_DPHY
+ bool
++ depends on GENERIC_PHY
+ help
+ Generic MIPI D-PHY support.
+
+diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
+index 87b17e5877ab8..1fdcc81661ed8 100644
+--- a/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
++++ b/drivers/phy/qualcomm/phy-qcom-qmp-combo.c
+@@ -2142,6 +2142,7 @@ static void qmp_v4_configure_dp_tx(struct qmp_combo *qmp)
+ static int qmp_v456_configure_dp_phy(struct qmp_combo *qmp,
+ unsigned int com_resetm_ctrl_reg,
+ unsigned int com_c_ready_status_reg,
++ unsigned int com_cmn_status_reg,
+ unsigned int dp_phy_status_reg)
+ {
+ const struct phy_configure_opts_dp *dp_opts = &qmp->dp_opts;
+@@ -2198,14 +2199,14 @@ static int qmp_v456_configure_dp_phy(struct qmp_combo *qmp,
+ 10000))
+ return -ETIMEDOUT;
+
+- if (readl_poll_timeout(qmp->dp_serdes + QSERDES_V4_COM_CMN_STATUS,
++ if (readl_poll_timeout(qmp->dp_serdes + com_cmn_status_reg,
+ status,
+ ((status & BIT(0)) > 0),
+ 500,
+ 10000))
+ return -ETIMEDOUT;
+
+- if (readl_poll_timeout(qmp->dp_serdes + QSERDES_V4_COM_CMN_STATUS,
++ if (readl_poll_timeout(qmp->dp_serdes + com_cmn_status_reg,
+ status,
+ ((status & BIT(1)) > 0),
+ 500,
+@@ -2241,6 +2242,7 @@ static int qmp_v4_configure_dp_phy(struct qmp_combo *qmp)
+
+ ret = qmp_v456_configure_dp_phy(qmp, QSERDES_V4_COM_RESETSM_CNTRL,
+ QSERDES_V4_COM_C_READY_STATUS,
++ QSERDES_V4_COM_CMN_STATUS,
+ QSERDES_V4_DP_PHY_STATUS);
+ if (ret < 0)
+ return ret;
+@@ -2305,6 +2307,7 @@ static int qmp_v5_configure_dp_phy(struct qmp_combo *qmp)
+
+ ret = qmp_v456_configure_dp_phy(qmp, QSERDES_V4_COM_RESETSM_CNTRL,
+ QSERDES_V4_COM_C_READY_STATUS,
++ QSERDES_V4_COM_CMN_STATUS,
+ QSERDES_V4_DP_PHY_STATUS);
+ if (ret < 0)
+ return ret;
+@@ -2364,6 +2367,7 @@ static int qmp_v6_configure_dp_phy(struct qmp_combo *qmp)
+
+ ret = qmp_v456_configure_dp_phy(qmp, QSERDES_V6_COM_RESETSM_CNTRL,
+ QSERDES_V6_COM_C_READY_STATUS,
++ QSERDES_V6_COM_CMN_STATUS,
+ QSERDES_V6_DP_PHY_STATUS);
+ if (ret < 0)
+ return ret;
+diff --git a/drivers/phy/tegra/xusb.c b/drivers/phy/tegra/xusb.c
+index b55d4e9f42b5c..a296b87dced18 100644
+--- a/drivers/phy/tegra/xusb.c
++++ b/drivers/phy/tegra/xusb.c
+@@ -568,6 +568,7 @@ static void tegra_xusb_port_unregister(struct tegra_xusb_port *port)
+ usb_role_switch_unregister(port->usb_role_sw);
+ cancel_work_sync(&port->usb_phy_work);
+ usb_remove_phy(&port->usb_phy);
++ port->usb_phy.dev->driver = NULL;
+ }
+
+ if (port->ops->remove)
+@@ -675,6 +676,9 @@ static int tegra_xusb_setup_usb_role_switch(struct tegra_xusb_port *port)
+ port->dev.driver = devm_kzalloc(&port->dev,
+ sizeof(struct device_driver),
+ GFP_KERNEL);
++ if (!port->dev.driver)
++ return -ENOMEM;
++
+ port->dev.driver->owner = THIS_MODULE;
+
+ port->usb_role_sw = usb_role_switch_register(&port->dev,
+diff --git a/drivers/pinctrl/bcm/pinctrl-bcm2835.c b/drivers/pinctrl/bcm/pinctrl-bcm2835.c
+index 7435173e10f43..1489191a213fe 100644
+--- a/drivers/pinctrl/bcm/pinctrl-bcm2835.c
++++ b/drivers/pinctrl/bcm/pinctrl-bcm2835.c
+@@ -376,10 +376,8 @@ static int bcm2835_add_pin_ranges_fallback(struct gpio_chip *gc)
+ if (!pctldev)
+ return 0;
+
+- gpiochip_add_pin_range(gc, pinctrl_dev_get_devname(pctldev), 0, 0,
+- gc->ngpio);
+-
+- return 0;
++ return gpiochip_add_pin_range(gc, pinctrl_dev_get_devname(pctldev), 0, 0,
++ gc->ngpio);
+ }
+
+ static const struct gpio_chip bcm2835_gpio_chip = {
+diff --git a/drivers/pinctrl/freescale/pinctrl-scu.c b/drivers/pinctrl/freescale/pinctrl-scu.c
+index ea261b6e74581..3b252d684d723 100644
+--- a/drivers/pinctrl/freescale/pinctrl-scu.c
++++ b/drivers/pinctrl/freescale/pinctrl-scu.c
+@@ -90,7 +90,7 @@ int imx_pinconf_set_scu(struct pinctrl_dev *pctldev, unsigned pin_id,
+ struct imx_sc_msg_req_pad_set msg;
+ struct imx_sc_rpc_msg *hdr = &msg.hdr;
+ unsigned int mux = configs[0];
+- unsigned int conf = configs[1];
++ unsigned int conf;
+ unsigned int val;
+ int ret;
+
+@@ -115,6 +115,7 @@ int imx_pinconf_set_scu(struct pinctrl_dev *pctldev, unsigned pin_id,
+ * Set mux and conf together in one IPC call
+ */
+ WARN_ON(num_configs != 2);
++ conf = configs[1];
+
+ val = conf | BM_PAD_CTL_IFMUX_ENABLE | BM_PAD_CTL_GP_ENABLE;
+ val |= mux << BP_PAD_CTL_IFMUX;
+diff --git a/drivers/pinctrl/intel/pinctrl-cherryview.c b/drivers/pinctrl/intel/pinctrl-cherryview.c
+index 722990e278361..87cf1e7403979 100644
+--- a/drivers/pinctrl/intel/pinctrl-cherryview.c
++++ b/drivers/pinctrl/intel/pinctrl-cherryview.c
+@@ -949,11 +949,6 @@ static int chv_config_get(struct pinctrl_dev *pctldev, unsigned int pin,
+
+ break;
+
+- case PIN_CONFIG_DRIVE_OPEN_DRAIN:
+- if (!(ctrl1 & CHV_PADCTRL1_ODEN))
+- return -EINVAL;
+- break;
+-
+ case PIN_CONFIG_BIAS_HIGH_IMPEDANCE: {
+ u32 cfg;
+
+@@ -963,6 +958,16 @@ static int chv_config_get(struct pinctrl_dev *pctldev, unsigned int pin,
+ return -EINVAL;
+
+ break;
++
++ case PIN_CONFIG_DRIVE_PUSH_PULL:
++ if (ctrl1 & CHV_PADCTRL1_ODEN)
++ return -EINVAL;
++ break;
++
++ case PIN_CONFIG_DRIVE_OPEN_DRAIN:
++ if (!(ctrl1 & CHV_PADCTRL1_ODEN))
++ return -EINVAL;
++ break;
+ }
+
+ default:
+diff --git a/drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c b/drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c
+index 21e61c2a37988..843ffcd968774 100644
+--- a/drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c
++++ b/drivers/pinctrl/nuvoton/pinctrl-npcm7xx.c
+@@ -1884,6 +1884,8 @@ static int npcm7xx_gpio_of(struct npcm7xx_pinctrl *pctrl)
+ }
+
+ pctrl->gpio_bank[id].base = ioremap(res.start, resource_size(&res));
++ if (!pctrl->gpio_bank[id].base)
++ return -EINVAL;
+
+ ret = bgpio_init(&pctrl->gpio_bank[id].gc, dev, 4,
+ pctrl->gpio_bank[id].base + NPCM7XX_GP_N_DIN,
+diff --git a/drivers/pinctrl/pinctrl-at91-pio4.c b/drivers/pinctrl/pinctrl-at91-pio4.c
+index 2fe40acb6a3e5..98c2122817264 100644
+--- a/drivers/pinctrl/pinctrl-at91-pio4.c
++++ b/drivers/pinctrl/pinctrl-at91-pio4.c
+@@ -1146,6 +1146,8 @@ static int atmel_pinctrl_probe(struct platform_device *pdev)
+ /* Pin naming convention: P(bank_name)(bank_pin_number). */
+ pin_desc[i].name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "P%c%u",
+ bank + 'A', line);
++ if (!pin_desc[i].name)
++ return -ENOMEM;
+
+ group->name = group_names[i] = pin_desc[i].name;
+ group->pin = pin_desc[i].number;
+diff --git a/drivers/pinctrl/pinctrl-at91.c b/drivers/pinctrl/pinctrl-at91.c
+index 871209c241532..39956d821ad75 100644
+--- a/drivers/pinctrl/pinctrl-at91.c
++++ b/drivers/pinctrl/pinctrl-at91.c
+@@ -1389,8 +1389,8 @@ static int at91_pinctrl_probe(struct platform_device *pdev)
+ char **names;
+
+ names = devm_kasprintf_strarray(dev, "pio", MAX_NB_GPIO_PER_BANK);
+- if (!names)
+- return -ENOMEM;
++ if (IS_ERR(names))
++ return PTR_ERR(names);
+
+ for (j = 0; j < MAX_NB_GPIO_PER_BANK; j++, k++) {
+ char *name = names[j];
+@@ -1870,8 +1870,8 @@ static int at91_gpio_probe(struct platform_device *pdev)
+ }
+
+ names = devm_kasprintf_strarray(dev, "pio", chip->ngpio);
+- if (!names)
+- return -ENOMEM;
++ if (IS_ERR(names))
++ return PTR_ERR(names);
+
+ for (i = 0; i < chip->ngpio; i++)
+ strreplace(names[i], '-', alias_idx + 'A');
+diff --git a/drivers/pinctrl/pinctrl-microchip-sgpio.c b/drivers/pinctrl/pinctrl-microchip-sgpio.c
+index 4794602316e7d..666d8b7cdbad3 100644
+--- a/drivers/pinctrl/pinctrl-microchip-sgpio.c
++++ b/drivers/pinctrl/pinctrl-microchip-sgpio.c
+@@ -818,6 +818,9 @@ static int microchip_sgpio_register_bank(struct device *dev,
+ pctl_desc->name = devm_kasprintf(dev, GFP_KERNEL, "%s-%sput",
+ dev_name(dev),
+ bank->is_input ? "in" : "out");
++ if (!pctl_desc->name)
++ return -ENOMEM;
++
+ pctl_desc->pctlops = &sgpio_pctl_ops;
+ pctl_desc->pmxops = &sgpio_pmx_ops;
+ pctl_desc->confops = &sgpio_confops;
+diff --git a/drivers/pinctrl/sunplus/sppctl.c b/drivers/pinctrl/sunplus/sppctl.c
+index 6bbbab3a6fdf3..150996949ede7 100644
+--- a/drivers/pinctrl/sunplus/sppctl.c
++++ b/drivers/pinctrl/sunplus/sppctl.c
+@@ -834,11 +834,6 @@ static int sppctl_dt_node_to_map(struct pinctrl_dev *pctldev, struct device_node
+ int i, size = 0;
+
+ list = of_get_property(np_config, "sunplus,pins", &size);
+-
+- if (nmG <= 0)
+- nmG = 0;
+-
+- parent = of_get_parent(np_config);
+ *num_maps = size / sizeof(*list);
+
+ /*
+@@ -866,10 +861,14 @@ static int sppctl_dt_node_to_map(struct pinctrl_dev *pctldev, struct device_node
+ }
+ }
+
++ if (nmG <= 0)
++ nmG = 0;
++
+ *map = kcalloc(*num_maps + nmG, sizeof(**map), GFP_KERNEL);
+- if (*map == NULL)
++ if (!(*map))
+ return -ENOMEM;
+
++ parent = of_get_parent(np_config);
+ for (i = 0; i < (*num_maps); i++) {
+ dt_pin = be32_to_cpu(list[i]);
+ pin_num = FIELD_GET(GENMASK(31, 24), dt_pin);
+@@ -883,6 +882,8 @@ static int sppctl_dt_node_to_map(struct pinctrl_dev *pctldev, struct device_node
+ (*map)[i].data.configs.num_configs = 1;
+ (*map)[i].data.configs.group_or_pin = pin_get_name(pctldev, pin_num);
+ configs = kmalloc(sizeof(*configs), GFP_KERNEL);
++ if (!configs)
++ goto sppctl_map_err;
+ *configs = FIELD_GET(GENMASK(7, 0), dt_pin);
+ (*map)[i].data.configs.configs = configs;
+
+@@ -896,6 +897,8 @@ static int sppctl_dt_node_to_map(struct pinctrl_dev *pctldev, struct device_node
+ (*map)[i].data.configs.num_configs = 1;
+ (*map)[i].data.configs.group_or_pin = pin_get_name(pctldev, pin_num);
+ configs = kmalloc(sizeof(*configs), GFP_KERNEL);
++ if (!configs)
++ goto sppctl_map_err;
+ *configs = SPPCTL_IOP_CONFIGS;
+ (*map)[i].data.configs.configs = configs;
+
+@@ -965,6 +968,14 @@ static int sppctl_dt_node_to_map(struct pinctrl_dev *pctldev, struct device_node
+ of_node_put(parent);
+ dev_dbg(pctldev->dev, "%d pins mapped\n", *num_maps);
+ return 0;
++
++sppctl_map_err:
++ for (i = 0; i < (*num_maps); i++)
++ if ((*map)[i].type == PIN_MAP_TYPE_CONFIGS_PIN)
++ kfree((*map)[i].data.configs.configs);
++ kfree(*map);
++ of_node_put(parent);
++ return -ENOMEM;
+ }
+
+ static const struct pinctrl_ops sppctl_pctl_ops = {
+diff --git a/drivers/pinctrl/tegra/pinctrl-tegra.c b/drivers/pinctrl/tegra/pinctrl-tegra.c
+index 1729b7ddfa946..21e08fbd1df0e 100644
+--- a/drivers/pinctrl/tegra/pinctrl-tegra.c
++++ b/drivers/pinctrl/tegra/pinctrl-tegra.c
+@@ -232,7 +232,7 @@ static const char *tegra_pinctrl_get_func_name(struct pinctrl_dev *pctldev,
+ {
+ struct tegra_pmx *pmx = pinctrl_dev_get_drvdata(pctldev);
+
+- return pmx->soc->functions[function].name;
++ return pmx->functions[function].name;
+ }
+
+ static int tegra_pinctrl_get_func_groups(struct pinctrl_dev *pctldev,
+@@ -242,8 +242,8 @@ static int tegra_pinctrl_get_func_groups(struct pinctrl_dev *pctldev,
+ {
+ struct tegra_pmx *pmx = pinctrl_dev_get_drvdata(pctldev);
+
+- *groups = pmx->soc->functions[function].groups;
+- *num_groups = pmx->soc->functions[function].ngroups;
++ *groups = pmx->functions[function].groups;
++ *num_groups = pmx->functions[function].ngroups;
+
+ return 0;
+ }
+@@ -795,10 +795,17 @@ int tegra_pinctrl_probe(struct platform_device *pdev,
+ if (!pmx->group_pins)
+ return -ENOMEM;
+
++ pmx->functions = devm_kcalloc(&pdev->dev, pmx->soc->nfunctions,
++ sizeof(*pmx->functions), GFP_KERNEL);
++ if (!pmx->functions)
++ return -ENOMEM;
++
+ group_pins = pmx->group_pins;
++
+ for (fn = 0; fn < soc_data->nfunctions; fn++) {
+- struct tegra_function *func = &soc_data->functions[fn];
++ struct tegra_function *func = &pmx->functions[fn];
+
++ func->name = pmx->soc->functions[fn];
+ func->groups = group_pins;
+
+ for (gn = 0; gn < soc_data->ngroups; gn++) {
+diff --git a/drivers/pinctrl/tegra/pinctrl-tegra.h b/drivers/pinctrl/tegra/pinctrl-tegra.h
+index 6130cba7cce54..b3289bdf727d8 100644
+--- a/drivers/pinctrl/tegra/pinctrl-tegra.h
++++ b/drivers/pinctrl/tegra/pinctrl-tegra.h
+@@ -13,6 +13,7 @@ struct tegra_pmx {
+ struct pinctrl_dev *pctl;
+
+ const struct tegra_pinctrl_soc_data *soc;
++ struct tegra_function *functions;
+ const char **group_pins;
+
+ struct pinctrl_gpio_range gpio_range;
+@@ -191,7 +192,7 @@ struct tegra_pinctrl_soc_data {
+ const char *gpio_compatible;
+ const struct pinctrl_pin_desc *pins;
+ unsigned npins;
+- struct tegra_function *functions;
++ const char * const *functions;
+ unsigned nfunctions;
+ const struct tegra_pingroup *groups;
+ unsigned ngroups;
+diff --git a/drivers/pinctrl/tegra/pinctrl-tegra114.c b/drivers/pinctrl/tegra/pinctrl-tegra114.c
+index e72ab1eb23983..3d425b2018e78 100644
+--- a/drivers/pinctrl/tegra/pinctrl-tegra114.c
++++ b/drivers/pinctrl/tegra/pinctrl-tegra114.c
+@@ -1452,12 +1452,9 @@ enum tegra_mux {
+ TEGRA_MUX_VI_ALT3,
+ };
+
+-#define FUNCTION(fname) \
+- { \
+- .name = #fname, \
+- }
++#define FUNCTION(fname) #fname
+
+-static struct tegra_function tegra114_functions[] = {
++static const char * const tegra114_functions[] = {
+ FUNCTION(blink),
+ FUNCTION(cec),
+ FUNCTION(cldvfs),
+diff --git a/drivers/pinctrl/tegra/pinctrl-tegra124.c b/drivers/pinctrl/tegra/pinctrl-tegra124.c
+index 26096c6b967e2..2a50c5c7516c3 100644
+--- a/drivers/pinctrl/tegra/pinctrl-tegra124.c
++++ b/drivers/pinctrl/tegra/pinctrl-tegra124.c
+@@ -1611,12 +1611,9 @@ enum tegra_mux {
+ TEGRA_MUX_VIMCLK2_ALT,
+ };
+
+-#define FUNCTION(fname) \
+- { \
+- .name = #fname, \
+- }
++#define FUNCTION(fname) #fname
+
+-static struct tegra_function tegra124_functions[] = {
++static const char * const tegra124_functions[] = {
+ FUNCTION(blink),
+ FUNCTION(ccla),
+ FUNCTION(cec),
+diff --git a/drivers/pinctrl/tegra/pinctrl-tegra194.c b/drivers/pinctrl/tegra/pinctrl-tegra194.c
+index 277973c884344..69f58df628977 100644
+--- a/drivers/pinctrl/tegra/pinctrl-tegra194.c
++++ b/drivers/pinctrl/tegra/pinctrl-tegra194.c
+@@ -1189,12 +1189,9 @@ enum tegra_mux_dt {
+ };
+
+ /* Make list of each function name */
+-#define TEGRA_PIN_FUNCTION(lid) \
+- { \
+- .name = #lid, \
+- }
++#define TEGRA_PIN_FUNCTION(lid) #lid
+
+-static struct tegra_function tegra194_functions[] = {
++static const char * const tegra194_functions[] = {
+ TEGRA_PIN_FUNCTION(rsvd0),
+ TEGRA_PIN_FUNCTION(rsvd1),
+ TEGRA_PIN_FUNCTION(rsvd2),
+diff --git a/drivers/pinctrl/tegra/pinctrl-tegra20.c b/drivers/pinctrl/tegra/pinctrl-tegra20.c
+index 0dc2cf0d05b1e..737fc2000f66b 100644
+--- a/drivers/pinctrl/tegra/pinctrl-tegra20.c
++++ b/drivers/pinctrl/tegra/pinctrl-tegra20.c
+@@ -1889,12 +1889,9 @@ enum tegra_mux {
+ TEGRA_MUX_XIO,
+ };
+
+-#define FUNCTION(fname) \
+- { \
+- .name = #fname, \
+- }
++#define FUNCTION(fname) #fname
+
+-static struct tegra_function tegra20_functions[] = {
++static const char * const tegra20_functions[] = {
+ FUNCTION(ahb_clk),
+ FUNCTION(apb_clk),
+ FUNCTION(audio_sync),
+diff --git a/drivers/pinctrl/tegra/pinctrl-tegra210.c b/drivers/pinctrl/tegra/pinctrl-tegra210.c
+index b480f607fa16f..9bb29146dfff7 100644
+--- a/drivers/pinctrl/tegra/pinctrl-tegra210.c
++++ b/drivers/pinctrl/tegra/pinctrl-tegra210.c
+@@ -1185,12 +1185,9 @@ enum tegra_mux {
+ TEGRA_MUX_VIMCLK2,
+ };
+
+-#define FUNCTION(fname) \
+- { \
+- .name = #fname, \
+- }
++#define FUNCTION(fname) #fname
+
+-static struct tegra_function tegra210_functions[] = {
++static const char * const tegra210_functions[] = {
+ FUNCTION(aud),
+ FUNCTION(bcl),
+ FUNCTION(blink),
+diff --git a/drivers/pinctrl/tegra/pinctrl-tegra30.c b/drivers/pinctrl/tegra/pinctrl-tegra30.c
+index 7299a371827f1..de5aa2d4d28d3 100644
+--- a/drivers/pinctrl/tegra/pinctrl-tegra30.c
++++ b/drivers/pinctrl/tegra/pinctrl-tegra30.c
+@@ -2010,12 +2010,9 @@ enum tegra_mux {
+ TEGRA_MUX_VI_ALT3,
+ };
+
+-#define FUNCTION(fname) \
+- { \
+- .name = #fname, \
+- }
++#define FUNCTION(fname) #fname
+
+-static struct tegra_function tegra30_functions[] = {
++static const char * const tegra30_functions[] = {
+ FUNCTION(blink),
+ FUNCTION(cec),
+ FUNCTION(clk_12m_out),
+diff --git a/drivers/platform/x86/dell/dell-rbtn.c b/drivers/platform/x86/dell/dell-rbtn.c
+index aa0e6c9074942..c8fcb537fd65d 100644
+--- a/drivers/platform/x86/dell/dell-rbtn.c
++++ b/drivers/platform/x86/dell/dell-rbtn.c
+@@ -395,16 +395,16 @@ static int rbtn_add(struct acpi_device *device)
+ return -EINVAL;
+ }
+
++ rbtn_data = devm_kzalloc(&device->dev, sizeof(*rbtn_data), GFP_KERNEL);
++ if (!rbtn_data)
++ return -ENOMEM;
++
+ ret = rbtn_acquire(device, true);
+ if (ret < 0) {
+ dev_err(&device->dev, "Cannot enable device\n");
+ return ret;
+ }
+
+- rbtn_data = devm_kzalloc(&device->dev, sizeof(*rbtn_data), GFP_KERNEL);
+- if (!rbtn_data)
+- return -ENOMEM;
+-
+ rbtn_data->type = type;
+ device->driver_data = rbtn_data;
+
+@@ -420,10 +420,12 @@ static int rbtn_add(struct acpi_device *device)
+ break;
+ default:
+ ret = -EINVAL;
++ break;
+ }
++ if (ret)
++ rbtn_acquire(device, false);
+
+ return ret;
+-
+ }
+
+ static void rbtn_remove(struct acpi_device *device)
+@@ -442,7 +444,6 @@ static void rbtn_remove(struct acpi_device *device)
+ }
+
+ rbtn_acquire(device, false);
+- device->driver_data = NULL;
+ }
+
+ static void rbtn_notify(struct acpi_device *device, u32 event)
+diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c
+index da6e7206d38b5..ed91ef9d1cf6c 100644
+--- a/drivers/platform/x86/intel/pmc/core.c
++++ b/drivers/platform/x86/intel/pmc/core.c
+@@ -1039,7 +1039,6 @@ static const struct x86_cpu_id intel_pmc_core_ids[] = {
+ X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P, tgl_core_init),
+ X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE, adl_core_init),
+ X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_S, adl_core_init),
+- X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE, mtl_core_init),
+ X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE_L, mtl_core_init),
+ {}
+ };
+@@ -1223,11 +1222,11 @@ static inline bool pmc_core_is_s0ix_failed(struct pmc_dev *pmcdev)
+ return false;
+ }
+
+-static __maybe_unused int pmc_core_resume(struct device *dev)
++int pmc_core_resume_common(struct pmc_dev *pmcdev)
+ {
+- struct pmc_dev *pmcdev = dev_get_drvdata(dev);
+ const struct pmc_bit_map **maps = pmcdev->map->lpm_sts;
+ int offset = pmcdev->map->lpm_status_offset;
++ struct device *dev = &pmcdev->pdev->dev;
+
+ /* Check if the syspend used S0ix */
+ if (pm_suspend_via_firmware())
+@@ -1257,6 +1256,16 @@ static __maybe_unused int pmc_core_resume(struct device *dev)
+ return 0;
+ }
+
++static __maybe_unused int pmc_core_resume(struct device *dev)
++{
++ struct pmc_dev *pmcdev = dev_get_drvdata(dev);
++
++ if (pmcdev->resume)
++ return pmcdev->resume(pmcdev);
++
++ return pmc_core_resume_common(pmcdev);
++}
++
+ static const struct dev_pm_ops pmc_core_pm_ops = {
+ SET_LATE_SYSTEM_SLEEP_PM_OPS(pmc_core_suspend, pmc_core_resume)
+ };
+diff --git a/drivers/platform/x86/intel/pmc/core.h b/drivers/platform/x86/intel/pmc/core.h
+index 9ca9b97467193..86d38270000a7 100644
+--- a/drivers/platform/x86/intel/pmc/core.h
++++ b/drivers/platform/x86/intel/pmc/core.h
+@@ -249,6 +249,14 @@ enum ppfear_regs {
+ #define MTL_LPM_STATUS_LATCH_EN_OFFSET 0x16F8
+ #define MTL_LPM_STATUS_OFFSET 0x1700
+ #define MTL_LPM_LIVE_STATUS_OFFSET 0x175C
++#define MTL_PMC_LTR_IOE_PMC 0x1C0C
++#define MTL_PMC_LTR_ESE 0x1BAC
++#define MTL_SOCM_NUM_IP_IGN_ALLOWED 25
++#define MTL_SOC_PMC_MMIO_REG_LEN 0x2708
++#define MTL_PMC_LTR_SPG 0x1B74
++
++/* Meteor Lake PGD PFET Enable Ack Status */
++#define MTL_SOCM_PPFEAR_NUM_ENTRIES 8
+
+ extern const char *pmc_lpm_modes[];
+
+@@ -327,6 +335,7 @@ struct pmc_reg_map {
+ * @lpm_en_modes: Array of enabled modes from lowest to highest priority
+ * @lpm_req_regs: List of substate requirements
+ * @core_configure: Function pointer to configure the platform
++ * @resume: Function to perform platform specific resume
+ *
+ * pmc_dev contains info about power management controller device.
+ */
+@@ -345,6 +354,7 @@ struct pmc_dev {
+ int lpm_en_modes[LPM_MAX_NUM_MODES];
+ u32 *lpm_req_regs;
+ void (*core_configure)(struct pmc_dev *pmcdev);
++ int (*resume)(struct pmc_dev *pmcdev);
+ };
+
+ extern const struct pmc_bit_map msr_map[];
+@@ -393,11 +403,30 @@ extern const struct pmc_bit_map adl_vnn_req_status_3_map[];
+ extern const struct pmc_bit_map adl_vnn_misc_status_map[];
+ extern const struct pmc_bit_map *adl_lpm_maps[];
+ extern const struct pmc_reg_map adl_reg_map;
+-extern const struct pmc_reg_map mtl_reg_map;
++extern const struct pmc_bit_map mtl_socm_pfear_map[];
++extern const struct pmc_bit_map *ext_mtl_socm_pfear_map[];
++extern const struct pmc_bit_map mtl_socm_ltr_show_map[];
++extern const struct pmc_bit_map mtl_socm_clocksource_status_map[];
++extern const struct pmc_bit_map mtl_socm_power_gating_status_0_map[];
++extern const struct pmc_bit_map mtl_socm_power_gating_status_1_map[];
++extern const struct pmc_bit_map mtl_socm_power_gating_status_2_map[];
++extern const struct pmc_bit_map mtl_socm_d3_status_0_map[];
++extern const struct pmc_bit_map mtl_socm_d3_status_1_map[];
++extern const struct pmc_bit_map mtl_socm_d3_status_2_map[];
++extern const struct pmc_bit_map mtl_socm_d3_status_3_map[];
++extern const struct pmc_bit_map mtl_socm_vnn_req_status_0_map[];
++extern const struct pmc_bit_map mtl_socm_vnn_req_status_1_map[];
++extern const struct pmc_bit_map mtl_socm_vnn_req_status_2_map[];
++extern const struct pmc_bit_map mtl_socm_vnn_req_status_3_map[];
++extern const struct pmc_bit_map mtl_socm_vnn_misc_status_map[];
++extern const struct pmc_bit_map mtl_socm_signal_status_map[];
++extern const struct pmc_bit_map *mtl_socm_lpm_maps[];
++extern const struct pmc_reg_map mtl_socm_reg_map;
+
+ extern void pmc_core_get_tgl_lpm_reqs(struct platform_device *pdev);
+ extern int pmc_core_send_ltr_ignore(struct pmc_dev *pmcdev, u32 value);
+
++int pmc_core_resume_common(struct pmc_dev *pmcdev);
+ void spt_core_init(struct pmc_dev *pmcdev);
+ void cnp_core_init(struct pmc_dev *pmcdev);
+ void icl_core_init(struct pmc_dev *pmcdev);
+diff --git a/drivers/platform/x86/intel/pmc/mtl.c b/drivers/platform/x86/intel/pmc/mtl.c
+index e8cc156412ce5..cdcf743b5e2c7 100644
+--- a/drivers/platform/x86/intel/pmc/mtl.c
++++ b/drivers/platform/x86/intel/pmc/mtl.c
+@@ -11,28 +11,458 @@
+ #include <linux/pci.h>
+ #include "core.h"
+
+-const struct pmc_reg_map mtl_reg_map = {
+- .pfear_sts = ext_tgl_pfear_map,
++/*
++ * Die Mapping to Product.
++ * Product SOCDie IOEDie PCHDie
++ * MTL-M SOC-M IOE-M None
++ * MTL-P SOC-M IOE-P None
++ * MTL-S SOC-S IOE-P PCH-S
++ */
++
++const struct pmc_bit_map mtl_socm_pfear_map[] = {
++ {"PMC", BIT(0)},
++ {"OPI", BIT(1)},
++ {"SPI", BIT(2)},
++ {"XHCI", BIT(3)},
++ {"SPA", BIT(4)},
++ {"SPB", BIT(5)},
++ {"SPC", BIT(6)},
++ {"GBE", BIT(7)},
++
++ {"SATA", BIT(0)},
++ {"DSP0", BIT(1)},
++ {"DSP1", BIT(2)},
++ {"DSP2", BIT(3)},
++ {"DSP3", BIT(4)},
++ {"SPD", BIT(5)},
++ {"LPSS", BIT(6)},
++ {"LPC", BIT(7)},
++
++ {"SMB", BIT(0)},
++ {"ISH", BIT(1)},
++ {"P2SB", BIT(2)},
++ {"NPK_VNN", BIT(3)},
++ {"SDX", BIT(4)},
++ {"SPE", BIT(5)},
++ {"FUSE", BIT(6)},
++ {"SBR8", BIT(7)},
++
++ {"RSVD24", BIT(0)},
++ {"OTG", BIT(1)},
++ {"EXI", BIT(2)},
++ {"CSE", BIT(3)},
++ {"CSME_KVM", BIT(4)},
++ {"CSME_PMT", BIT(5)},
++ {"CSME_CLINK", BIT(6)},
++ {"CSME_PTIO", BIT(7)},
++
++ {"CSME_USBR", BIT(0)},
++ {"CSME_SUSRAM", BIT(1)},
++ {"CSME_SMT1", BIT(2)},
++ {"RSVD35", BIT(3)},
++ {"CSME_SMS2", BIT(4)},
++ {"CSME_SMS", BIT(5)},
++ {"CSME_RTC", BIT(6)},
++ {"CSME_PSF", BIT(7)},
++
++ {"SBR0", BIT(0)},
++ {"SBR1", BIT(1)},
++ {"SBR2", BIT(2)},
++ {"SBR3", BIT(3)},
++ {"SBR4", BIT(4)},
++ {"SBR5", BIT(5)},
++ {"RSVD46", BIT(6)},
++ {"PSF1", BIT(7)},
++
++ {"PSF2", BIT(0)},
++ {"PSF3", BIT(1)},
++ {"PSF4", BIT(2)},
++ {"CNVI", BIT(3)},
++ {"UFSX2", BIT(4)},
++ {"EMMC", BIT(5)},
++ {"SPF", BIT(6)},
++ {"SBR6", BIT(7)},
++
++ {"SBR7", BIT(0)},
++ {"NPK_AON", BIT(1)},
++ {"HDA4", BIT(2)},
++ {"HDA5", BIT(3)},
++ {"HDA6", BIT(4)},
++ {"PSF6", BIT(5)},
++ {"RSVD62", BIT(6)},
++ {"RSVD63", BIT(7)},
++ {}
++};
++
++const struct pmc_bit_map *ext_mtl_socm_pfear_map[] = {
++ mtl_socm_pfear_map,
++ NULL
++};
++
++const struct pmc_bit_map mtl_socm_ltr_show_map[] = {
++ {"SOUTHPORT_A", CNP_PMC_LTR_SPA},
++ {"SOUTHPORT_B", CNP_PMC_LTR_SPB},
++ {"SATA", CNP_PMC_LTR_SATA},
++ {"GIGABIT_ETHERNET", CNP_PMC_LTR_GBE},
++ {"XHCI", CNP_PMC_LTR_XHCI},
++ {"SOUTHPORT_F", ADL_PMC_LTR_SPF},
++ {"ME", CNP_PMC_LTR_ME},
++ {"SATA1", CNP_PMC_LTR_EVA},
++ {"SOUTHPORT_C", CNP_PMC_LTR_SPC},
++ {"HD_AUDIO", CNP_PMC_LTR_AZ},
++ {"CNV", CNP_PMC_LTR_CNV},
++ {"LPSS", CNP_PMC_LTR_LPSS},
++ {"SOUTHPORT_D", CNP_PMC_LTR_SPD},
++ {"SOUTHPORT_E", CNP_PMC_LTR_SPE},
++ {"SATA2", CNP_PMC_LTR_CAM},
++ {"ESPI", CNP_PMC_LTR_ESPI},
++ {"SCC", CNP_PMC_LTR_SCC},
++ {"ISH", CNP_PMC_LTR_ISH},
++ {"UFSX2", CNP_PMC_LTR_UFSX2},
++ {"EMMC", CNP_PMC_LTR_EMMC},
++ {"WIGIG", ICL_PMC_LTR_WIGIG},
++ {"THC0", TGL_PMC_LTR_THC0},
++ {"THC1", TGL_PMC_LTR_THC1},
++ {"SOUTHPORT_G", MTL_PMC_LTR_SPG},
++ {"ESE", MTL_PMC_LTR_ESE},
++ {"IOE_PMC", MTL_PMC_LTR_IOE_PMC},
++
++ /* Below two cannot be used for LTR_IGNORE */
++ {"CURRENT_PLATFORM", CNP_PMC_LTR_CUR_PLT},
++ {"AGGREGATED_SYSTEM", CNP_PMC_LTR_CUR_ASLT},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_clocksource_status_map[] = {
++ {"AON2_OFF_STS", BIT(0)},
++ {"AON3_OFF_STS", BIT(1)},
++ {"AON4_OFF_STS", BIT(2)},
++ {"AON5_OFF_STS", BIT(3)},
++ {"AON1_OFF_STS", BIT(4)},
++ {"XTAL_LVM_OFF_STS", BIT(5)},
++ {"MPFPW1_0_PLL_OFF_STS", BIT(6)},
++ {"MPFPW1_1_PLL_OFF_STS", BIT(7)},
++ {"USB3_PLL_OFF_STS", BIT(8)},
++ {"AON3_SPL_OFF_STS", BIT(9)},
++ {"MPFPW2_0_PLL_OFF_STS", BIT(12)},
++ {"MPFPW3_0_PLL_OFF_STS", BIT(13)},
++ {"XTAL_AGGR_OFF_STS", BIT(17)},
++ {"USB2_PLL_OFF_STS", BIT(18)},
++ {"FILTER_PLL_OFF_STS", BIT(22)},
++ {"ACE_PLL_OFF_STS", BIT(24)},
++ {"FABRIC_PLL_OFF_STS", BIT(25)},
++ {"SOC_PLL_OFF_STS", BIT(26)},
++ {"PCIFAB_PLL_OFF_STS", BIT(27)},
++ {"REF_PLL_OFF_STS", BIT(28)},
++ {"IMG_PLL_OFF_STS", BIT(29)},
++ {"RTC_PLL_OFF_STS", BIT(31)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_power_gating_status_0_map[] = {
++ {"PMC_PGD0_PG_STS", BIT(0)},
++ {"DMI_PGD0_PG_STS", BIT(1)},
++ {"ESPISPI_PGD0_PG_STS", BIT(2)},
++ {"XHCI_PGD0_PG_STS", BIT(3)},
++ {"SPA_PGD0_PG_STS", BIT(4)},
++ {"SPB_PGD0_PG_STS", BIT(5)},
++ {"SPC_PGD0_PG_STS", BIT(6)},
++ {"GBE_PGD0_PG_STS", BIT(7)},
++ {"SATA_PGD0_PG_STS", BIT(8)},
++ {"PSF13_PGD0_PG_STS", BIT(9)},
++ {"SOC_D2D_PGD3_PG_STS", BIT(10)},
++ {"MPFPW3_PGD0_PG_STS", BIT(11)},
++ {"ESE_PGD0_PG_STS", BIT(12)},
++ {"SPD_PGD0_PG_STS", BIT(13)},
++ {"LPSS_PGD0_PG_STS", BIT(14)},
++ {"LPC_PGD0_PG_STS", BIT(15)},
++ {"SMB_PGD0_PG_STS", BIT(16)},
++ {"ISH_PGD0_PG_STS", BIT(17)},
++ {"P2S_PGD0_PG_STS", BIT(18)},
++ {"NPK_PGD0_PG_STS", BIT(19)},
++ {"DBG_SBR_PGD0_PG_STS", BIT(20)},
++ {"SBRG_PGD0_PG_STS", BIT(21)},
++ {"FUSE_PGD0_PG_STS", BIT(22)},
++ {"SBR8_PGD0_PG_STS", BIT(23)},
++ {"SOC_D2D_PGD2_PG_STS", BIT(24)},
++ {"XDCI_PGD0_PG_STS", BIT(25)},
++ {"EXI_PGD0_PG_STS", BIT(26)},
++ {"CSE_PGD0_PG_STS", BIT(27)},
++ {"KVMCC_PGD0_PG_STS", BIT(28)},
++ {"PMT_PGD0_PG_STS", BIT(29)},
++ {"CLINK_PGD0_PG_STS", BIT(30)},
++ {"PTIO_PGD0_PG_STS", BIT(31)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_power_gating_status_1_map[] = {
++ {"USBR0_PGD0_PG_STS", BIT(0)},
++ {"SUSRAM_PGD0_PG_STS", BIT(1)},
++ {"SMT1_PGD0_PG_STS", BIT(2)},
++ {"FIACPCB_U_PGD0_PG_STS", BIT(3)},
++ {"SMS2_PGD0_PG_STS", BIT(4)},
++ {"SMS1_PGD0_PG_STS", BIT(5)},
++ {"CSMERTC_PGD0_PG_STS", BIT(6)},
++ {"CSMEPSF_PGD0_PG_STS", BIT(7)},
++ {"SBR0_PGD0_PG_STS", BIT(8)},
++ {"SBR1_PGD0_PG_STS", BIT(9)},
++ {"SBR2_PGD0_PG_STS", BIT(10)},
++ {"SBR3_PGD0_PG_STS", BIT(11)},
++ {"U3FPW1_PGD0_PG_STS", BIT(12)},
++ {"SBR5_PGD0_PG_STS", BIT(13)},
++ {"MPFPW1_PGD0_PG_STS", BIT(14)},
++ {"UFSPW1_PGD0_PG_STS", BIT(15)},
++ {"FIA_X_PGD0_PG_STS", BIT(16)},
++ {"SOC_D2D_PGD0_PG_STS", BIT(17)},
++ {"MPFPW2_PGD0_PG_STS", BIT(18)},
++ {"CNVI_PGD0_PG_STS", BIT(19)},
++ {"UFSX2_PGD0_PG_STS", BIT(20)},
++ {"ENDBG_PGD0_PG_STS", BIT(21)},
++ {"DBG_PSF_PGD0_PG_STS", BIT(22)},
++ {"SBR6_PGD0_PG_STS", BIT(23)},
++ {"SBR7_PGD0_PG_STS", BIT(24)},
++ {"NPK_PGD1_PG_STS", BIT(25)},
++ {"FIACPCB_X_PGD0_PG_STS", BIT(26)},
++ {"DBC_PGD0_PG_STS", BIT(27)},
++ {"FUSEGPSB_PGD0_PG_STS", BIT(28)},
++ {"PSF6_PGD0_PG_STS", BIT(29)},
++ {"PSF7_PGD0_PG_STS", BIT(30)},
++ {"GBETSN1_PGD0_PG_STS", BIT(31)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_power_gating_status_2_map[] = {
++ {"PSF8_PGD0_PG_STS", BIT(0)},
++ {"FIA_PGD0_PG_STS", BIT(1)},
++ {"SOC_D2D_PGD1_PG_STS", BIT(2)},
++ {"FIA_U_PGD0_PG_STS", BIT(3)},
++ {"TAM_PGD0_PG_STS", BIT(4)},
++ {"GBETSN_PGD0_PG_STS", BIT(5)},
++ {"TBTLSX_PGD0_PG_STS", BIT(6)},
++ {"THC0_PGD0_PG_STS", BIT(7)},
++ {"THC1_PGD0_PG_STS", BIT(8)},
++ {"PMC_PGD1_PG_STS", BIT(9)},
++ {"GNA_PGD0_PG_STS", BIT(10)},
++ {"ACE_PGD0_PG_STS", BIT(11)},
++ {"ACE_PGD1_PG_STS", BIT(12)},
++ {"ACE_PGD2_PG_STS", BIT(13)},
++ {"ACE_PGD3_PG_STS", BIT(14)},
++ {"ACE_PGD4_PG_STS", BIT(15)},
++ {"ACE_PGD5_PG_STS", BIT(16)},
++ {"ACE_PGD6_PG_STS", BIT(17)},
++ {"ACE_PGD7_PG_STS", BIT(18)},
++ {"ACE_PGD8_PG_STS", BIT(19)},
++ {"FIA_PGS_PGD0_PG_STS", BIT(20)},
++ {"FIACPCB_PGS_PGD0_PG_STS", BIT(21)},
++ {"FUSEPMSB_PGD0_PG_STS", BIT(22)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_d3_status_0_map[] = {
++ {"LPSS_D3_STS", BIT(3)},
++ {"XDCI_D3_STS", BIT(4)},
++ {"XHCI_D3_STS", BIT(5)},
++ {"SPA_D3_STS", BIT(12)},
++ {"SPB_D3_STS", BIT(13)},
++ {"SPC_D3_STS", BIT(14)},
++ {"SPD_D3_STS", BIT(15)},
++ {"ESPISPI_D3_STS", BIT(18)},
++ {"SATA_D3_STS", BIT(20)},
++ {"PSTH_D3_STS", BIT(21)},
++ {"DMI_D3_STS", BIT(22)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_d3_status_1_map[] = {
++ {"GBETSN1_D3_STS", BIT(14)},
++ {"GBE_D3_STS", BIT(19)},
++ {"ITSS_D3_STS", BIT(23)},
++ {"P2S_D3_STS", BIT(24)},
++ {"CNVI_D3_STS", BIT(27)},
++ {"UFSX2_D3_STS", BIT(28)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_d3_status_2_map[] = {
++ {"GNA_D3_STS", BIT(0)},
++ {"CSMERTC_D3_STS", BIT(1)},
++ {"SUSRAM_D3_STS", BIT(2)},
++ {"CSE_D3_STS", BIT(4)},
++ {"KVMCC_D3_STS", BIT(5)},
++ {"USBR0_D3_STS", BIT(6)},
++ {"ISH_D3_STS", BIT(7)},
++ {"SMT1_D3_STS", BIT(8)},
++ {"SMT2_D3_STS", BIT(9)},
++ {"SMT3_D3_STS", BIT(10)},
++ {"CLINK_D3_STS", BIT(14)},
++ {"PTIO_D3_STS", BIT(16)},
++ {"PMT_D3_STS", BIT(17)},
++ {"SMS1_D3_STS", BIT(18)},
++ {"SMS2_D3_STS", BIT(19)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_d3_status_3_map[] = {
++ {"ESE_D3_STS", BIT(2)},
++ {"GBETSN_D3_STS", BIT(13)},
++ {"THC0_D3_STS", BIT(14)},
++ {"THC1_D3_STS", BIT(15)},
++ {"ACE_D3_STS", BIT(23)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_vnn_req_status_0_map[] = {
++ {"LPSS_VNN_REQ_STS", BIT(3)},
++ {"FIA_VNN_REQ_STS", BIT(17)},
++ {"ESPISPI_VNN_REQ_STS", BIT(18)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_vnn_req_status_1_map[] = {
++ {"NPK_VNN_REQ_STS", BIT(4)},
++ {"DFXAGG_VNN_REQ_STS", BIT(8)},
++ {"EXI_VNN_REQ_STS", BIT(9)},
++ {"P2D_VNN_REQ_STS", BIT(18)},
++ {"GBE_VNN_REQ_STS", BIT(19)},
++ {"SMB_VNN_REQ_STS", BIT(25)},
++ {"LPC_VNN_REQ_STS", BIT(26)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_vnn_req_status_2_map[] = {
++ {"CSMERTC_VNN_REQ_STS", BIT(1)},
++ {"CSE_VNN_REQ_STS", BIT(4)},
++ {"ISH_VNN_REQ_STS", BIT(7)},
++ {"SMT1_VNN_REQ_STS", BIT(8)},
++ {"CLINK_VNN_REQ_STS", BIT(14)},
++ {"SMS1_VNN_REQ_STS", BIT(18)},
++ {"SMS2_VNN_REQ_STS", BIT(19)},
++ {"GPIOCOM4_VNN_REQ_STS", BIT(20)},
++ {"GPIOCOM3_VNN_REQ_STS", BIT(21)},
++ {"GPIOCOM2_VNN_REQ_STS", BIT(22)},
++ {"GPIOCOM1_VNN_REQ_STS", BIT(23)},
++ {"GPIOCOM0_VNN_REQ_STS", BIT(24)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_vnn_req_status_3_map[] = {
++ {"ESE_VNN_REQ_STS", BIT(2)},
++ {"DTS0_VNN_REQ_STS", BIT(7)},
++ {"GPIOCOM5_VNN_REQ_STS", BIT(11)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_vnn_misc_status_map[] = {
++ {"CPU_C10_REQ_STS", BIT(0)},
++ {"TS_OFF_REQ_STS", BIT(1)},
++ {"PNDE_MET_REQ_STS", BIT(2)},
++ {"PCIE_DEEP_PM_REQ_STS", BIT(3)},
++ {"PMC_CLK_THROTTLE_EN_REQ_STS", BIT(4)},
++ {"NPK_VNNAON_REQ_STS", BIT(5)},
++ {"VNN_SOC_REQ_STS", BIT(6)},
++ {"ISH_VNNAON_REQ_STS", BIT(7)},
++ {"IOE_COND_MET_S02I2_0_REQ_STS", BIT(8)},
++ {"IOE_COND_MET_S02I2_1_REQ_STS", BIT(9)},
++ {"IOE_COND_MET_S02I2_2_REQ_STS", BIT(10)},
++ {"PLT_GREATER_REQ_STS", BIT(11)},
++ {"PCIE_CLKREQ_REQ_STS", BIT(12)},
++ {"PMC_IDLE_FB_OCP_REQ_STS", BIT(13)},
++ {"PM_SYNC_STATES_REQ_STS", BIT(14)},
++ {"EA_REQ_STS", BIT(15)},
++ {"MPHY_CORE_OFF_REQ_STS", BIT(16)},
++ {"BRK_EV_EN_REQ_STS", BIT(17)},
++ {"AUTO_DEMO_EN_REQ_STS", BIT(18)},
++ {"ITSS_CLK_SRC_REQ_STS", BIT(19)},
++ {"LPC_CLK_SRC_REQ_STS", BIT(20)},
++ {"ARC_IDLE_REQ_STS", BIT(21)},
++ {"MPHY_SUS_REQ_STS", BIT(22)},
++ {"FIA_DEEP_PM_REQ_STS", BIT(23)},
++ {"UXD_CONNECTED_REQ_STS", BIT(24)},
++ {"ARC_INTERRUPT_WAKE_REQ_STS", BIT(25)},
++ {"USB2_VNNAON_ACT_REQ_STS", BIT(26)},
++ {"PRE_WAKE0_REQ_STS", BIT(27)},
++ {"PRE_WAKE1_REQ_STS", BIT(28)},
++ {"PRE_WAKE2_EN_REQ_STS", BIT(29)},
++ {"WOV_REQ_STS", BIT(30)},
++ {"CNVI_V1P05_REQ_STS", BIT(31)},
++ {}
++};
++
++const struct pmc_bit_map mtl_socm_signal_status_map[] = {
++ {"LSX_Wake0_En_STS", BIT(0)},
++ {"LSX_Wake0_Pol_STS", BIT(1)},
++ {"LSX_Wake1_En_STS", BIT(2)},
++ {"LSX_Wake1_Pol_STS", BIT(3)},
++ {"LSX_Wake2_En_STS", BIT(4)},
++ {"LSX_Wake2_Pol_STS", BIT(5)},
++ {"LSX_Wake3_En_STS", BIT(6)},
++ {"LSX_Wake3_Pol_STS", BIT(7)},
++ {"LSX_Wake4_En_STS", BIT(8)},
++ {"LSX_Wake4_Pol_STS", BIT(9)},
++ {"LSX_Wake5_En_STS", BIT(10)},
++ {"LSX_Wake5_Pol_STS", BIT(11)},
++ {"LSX_Wake6_En_STS", BIT(12)},
++ {"LSX_Wake6_Pol_STS", BIT(13)},
++ {"LSX_Wake7_En_STS", BIT(14)},
++ {"LSX_Wake7_Pol_STS", BIT(15)},
++ {"LPSS_Wake0_En_STS", BIT(16)},
++ {"LPSS_Wake0_Pol_STS", BIT(17)},
++ {"LPSS_Wake1_En_STS", BIT(18)},
++ {"LPSS_Wake1_Pol_STS", BIT(19)},
++ {"Int_Timer_SS_Wake0_En_STS", BIT(20)},
++ {"Int_Timer_SS_Wake0_Pol_STS", BIT(21)},
++ {"Int_Timer_SS_Wake1_En_STS", BIT(22)},
++ {"Int_Timer_SS_Wake1_Pol_STS", BIT(23)},
++ {"Int_Timer_SS_Wake2_En_STS", BIT(24)},
++ {"Int_Timer_SS_Wake2_Pol_STS", BIT(25)},
++ {"Int_Timer_SS_Wake3_En_STS", BIT(26)},
++ {"Int_Timer_SS_Wake3_Pol_STS", BIT(27)},
++ {"Int_Timer_SS_Wake4_En_STS", BIT(28)},
++ {"Int_Timer_SS_Wake4_Pol_STS", BIT(29)},
++ {"Int_Timer_SS_Wake5_En_STS", BIT(30)},
++ {"Int_Timer_SS_Wake5_Pol_STS", BIT(31)},
++ {}
++};
++
++const struct pmc_bit_map *mtl_socm_lpm_maps[] = {
++ mtl_socm_clocksource_status_map,
++ mtl_socm_power_gating_status_0_map,
++ mtl_socm_power_gating_status_1_map,
++ mtl_socm_power_gating_status_2_map,
++ mtl_socm_d3_status_0_map,
++ mtl_socm_d3_status_1_map,
++ mtl_socm_d3_status_2_map,
++ mtl_socm_d3_status_3_map,
++ mtl_socm_vnn_req_status_0_map,
++ mtl_socm_vnn_req_status_1_map,
++ mtl_socm_vnn_req_status_2_map,
++ mtl_socm_vnn_req_status_3_map,
++ mtl_socm_vnn_misc_status_map,
++ mtl_socm_signal_status_map,
++ NULL
++};
++
++const struct pmc_reg_map mtl_socm_reg_map = {
++ .pfear_sts = ext_mtl_socm_pfear_map,
+ .slp_s0_offset = CNP_PMC_SLP_S0_RES_COUNTER_OFFSET,
+ .slp_s0_res_counter_step = TGL_PMC_SLP_S0_RES_COUNTER_STEP,
+- .ltr_show_sts = adl_ltr_show_map,
++ .ltr_show_sts = mtl_socm_ltr_show_map,
+ .msr_sts = msr_map,
+ .ltr_ignore_offset = CNP_PMC_LTR_IGNORE_OFFSET,
+- .regmap_length = CNP_PMC_MMIO_REG_LEN,
++ .regmap_length = MTL_SOC_PMC_MMIO_REG_LEN,
+ .ppfear0_offset = CNP_PMC_HOST_PPFEAR0A,
+- .ppfear_buckets = ICL_PPFEAR_NUM_ENTRIES,
++ .ppfear_buckets = MTL_SOCM_PPFEAR_NUM_ENTRIES,
+ .pm_cfg_offset = CNP_PMC_PM_CFG_OFFSET,
+ .pm_read_disable_bit = CNP_PMC_READ_DISABLE_BIT,
+- .ltr_ignore_max = ADL_NUM_IP_IGN_ALLOWED,
+- .lpm_num_modes = ADL_LPM_NUM_MODES,
+ .lpm_num_maps = ADL_LPM_NUM_MAPS,
++ .ltr_ignore_max = MTL_SOCM_NUM_IP_IGN_ALLOWED,
+ .lpm_res_counter_step_x2 = TGL_PMC_LPM_RES_COUNTER_STEP_X2,
+ .etr3_offset = ETR3_OFFSET,
+ .lpm_sts_latch_en_offset = MTL_LPM_STATUS_LATCH_EN_OFFSET,
+ .lpm_priority_offset = MTL_LPM_PRI_OFFSET,
+ .lpm_en_offset = MTL_LPM_EN_OFFSET,
+ .lpm_residency_offset = MTL_LPM_RESIDENCY_OFFSET,
+- .lpm_sts = adl_lpm_maps,
++ .lpm_sts = mtl_socm_lpm_maps,
+ .lpm_status_offset = MTL_LPM_STATUS_OFFSET,
+ .lpm_live_status_offset = MTL_LPM_LIVE_STATUS_OFFSET,
+ };
+@@ -68,16 +498,29 @@ static void mtl_set_device_d3(unsigned int device)
+ }
+ }
+
+-void mtl_core_init(struct pmc_dev *pmcdev)
++/*
++ * Set power state of select devices that do not have drivers to D3
++ * so that they do not block Package C entry.
++ */
++static void mtl_d3_fixup(void)
+ {
+- pmcdev->map = &mtl_reg_map;
+- pmcdev->core_configure = mtl_core_configure;
+-
+- /*
+- * Set power state of select devices that do not have drivers to D3
+- * so that they do not block Package C entry.
+- */
+ mtl_set_device_d3(MTL_GNA_PCI_DEV);
+ mtl_set_device_d3(MTL_IPU_PCI_DEV);
+ mtl_set_device_d3(MTL_VPU_PCI_DEV);
+ }
++
++static int mtl_resume(struct pmc_dev *pmcdev)
++{
++ mtl_d3_fixup();
++ return pmc_core_resume_common(pmcdev);
++}
++
++void mtl_core_init(struct pmc_dev *pmcdev)
++{
++ pmcdev->map = &mtl_socm_reg_map;
++ pmcdev->core_configure = mtl_core_configure;
++
++ mtl_d3_fixup();
++
++ pmcdev->resume = mtl_resume;
++}
+diff --git a/drivers/platform/x86/lenovo-yogabook-wmi.c b/drivers/platform/x86/lenovo-yogabook-wmi.c
+index 5f4bd1eec38a9..d57fcc8388519 100644
+--- a/drivers/platform/x86/lenovo-yogabook-wmi.c
++++ b/drivers/platform/x86/lenovo-yogabook-wmi.c
+@@ -2,7 +2,6 @@
+ /* WMI driver for Lenovo Yoga Book YB1-X90* / -X91* tablets */
+
+ #include <linux/acpi.h>
+-#include <linux/devm-helpers.h>
+ #include <linux/gpio/consumer.h>
+ #include <linux/gpio/machine.h>
+ #include <linux/interrupt.h>
+@@ -248,10 +247,7 @@ static int yogabook_wmi_probe(struct wmi_device *wdev, const void *context)
+ data->brightness = YB_KBD_BL_DEFAULT;
+ set_bit(YB_KBD_IS_ON, &data->flags);
+ set_bit(YB_DIGITIZER_IS_ON, &data->flags);
+-
+- r = devm_work_autocancel(&wdev->dev, &data->work, yogabook_wmi_work);
+- if (r)
+- return r;
++ INIT_WORK(&data->work, yogabook_wmi_work);
+
+ data->kbd_adev = acpi_dev_get_first_match_dev("GDIX1001", NULL, -1);
+ if (!data->kbd_adev) {
+@@ -299,10 +295,12 @@ static int yogabook_wmi_probe(struct wmi_device *wdev, const void *context)
+ }
+ data->backside_hall_irq = r;
+
+- r = devm_request_irq(&wdev->dev, data->backside_hall_irq,
+- yogabook_backside_hall_irq,
+- IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING,
+- "backside_hall_sw", data);
++ /* Set default brightness before enabling the IRQ */
++ yogabook_wmi_set_kbd_backlight(data->wdev, YB_KBD_BL_DEFAULT);
++
++ r = request_irq(data->backside_hall_irq, yogabook_backside_hall_irq,
++ IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING,
++ "backside_hall_sw", data);
+ if (r) {
+ dev_err_probe(&wdev->dev, r, "Requesting backside_hall_sw IRQ\n");
+ goto error_put_devs;
+@@ -318,11 +316,14 @@ static int yogabook_wmi_probe(struct wmi_device *wdev, const void *context)
+ r = devm_led_classdev_register(&wdev->dev, &data->kbd_bl_led);
+ if (r < 0) {
+ dev_err_probe(&wdev->dev, r, "Registering backlight LED device\n");
+- goto error_put_devs;
++ goto error_free_irq;
+ }
+
+ return 0;
+
++error_free_irq:
++ free_irq(data->backside_hall_irq, data);
++ cancel_work_sync(&data->work);
+ error_put_devs:
+ put_device(data->dig_dev);
+ put_device(data->kbd_dev);
+@@ -334,6 +335,19 @@ error_put_devs:
+ static void yogabook_wmi_remove(struct wmi_device *wdev)
+ {
+ struct yogabook_wmi *data = dev_get_drvdata(&wdev->dev);
++ int r = 0;
++
++ free_irq(data->backside_hall_irq, data);
++ cancel_work_sync(&data->work);
++
++ if (!test_bit(YB_KBD_IS_ON, &data->flags))
++ r |= device_reprobe(data->kbd_dev);
++
++ if (!test_bit(YB_DIGITIZER_IS_ON, &data->flags))
++ r |= device_reprobe(data->dig_dev);
++
++ if (r)
++ dev_warn(&wdev->dev, "Reprobe of devices failed\n");
+
+ put_device(data->dig_dev);
+ put_device(data->kbd_dev);
+diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c
+index 1138f770149d9..e4047ee0a7546 100644
+--- a/drivers/platform/x86/think-lmi.c
++++ b/drivers/platform/x86/think-lmi.c
+@@ -14,6 +14,7 @@
+ #include <linux/acpi.h>
+ #include <linux/errno.h>
+ #include <linux/fs.h>
++#include <linux/mutex.h>
+ #include <linux/string.h>
+ #include <linux/types.h>
+ #include <linux/dmi.h>
+@@ -171,7 +172,7 @@ MODULE_PARM_DESC(debug_support, "Enable debug command support");
+ #define TLMI_POP_PWD (1 << 0)
+ #define TLMI_PAP_PWD (1 << 1)
+ #define TLMI_HDD_PWD (1 << 2)
+-#define TLMI_SYS_PWD (1 << 3)
++#define TLMI_SMP_PWD (1 << 6) /* System Management */
+ #define TLMI_CERT (1 << 7)
+
+ #define to_tlmi_pwd_setting(kobj) container_of(kobj, struct tlmi_pwd_setting, kobj)
+@@ -195,6 +196,7 @@ static const char * const level_options[] = {
+ };
+ static struct think_lmi tlmi_priv;
+ static struct class *fw_attr_class;
++static DEFINE_MUTEX(tlmi_mutex);
+
+ /* ------ Utility functions ------------*/
+ /* Strip out CR if one is present */
+@@ -437,6 +439,9 @@ static ssize_t new_password_store(struct kobject *kobj,
+ /* Strip out CR if one is present, setting password won't work if it is present */
+ strip_cr(new_pwd);
+
++ /* Use lock in case multiple WMI operations needed */
++ mutex_lock(&tlmi_mutex);
++
+ pwdlen = strlen(new_pwd);
+ /* pwdlen == 0 is allowed to clear the password */
+ if (pwdlen && ((pwdlen < setting->minlen) || (pwdlen > setting->maxlen))) {
+@@ -456,9 +461,9 @@ static ssize_t new_password_store(struct kobject *kobj,
+ sprintf(pwd_type, "mhdp%d", setting->index);
+ } else if (setting == tlmi_priv.pwd_nvme) {
+ if (setting->level == TLMI_LEVEL_USER)
+- sprintf(pwd_type, "unvp%d", setting->index);
++ sprintf(pwd_type, "udrp%d", setting->index);
+ else
+- sprintf(pwd_type, "mnvp%d", setting->index);
++ sprintf(pwd_type, "adrp%d", setting->index);
+ } else {
+ sprintf(pwd_type, "%s", setting->pwd_type);
+ }
+@@ -493,6 +498,7 @@ static ssize_t new_password_store(struct kobject *kobj,
+ kfree(auth_str);
+ }
+ out:
++ mutex_unlock(&tlmi_mutex);
+ kfree(new_pwd);
+ return ret ?: count;
+ }
+@@ -981,6 +987,9 @@ static ssize_t current_value_store(struct kobject *kobj,
+ /* Strip out CR if one is present */
+ strip_cr(new_setting);
+
++ /* Use lock in case multiple WMI operations needed */
++ mutex_lock(&tlmi_mutex);
++
+ /* Check if certificate authentication is enabled and active */
+ if (tlmi_priv.certificate_support && tlmi_priv.pwd_admin->cert_installed) {
+ if (!tlmi_priv.pwd_admin->signature || !tlmi_priv.pwd_admin->save_signature) {
+@@ -1039,6 +1048,7 @@ static ssize_t current_value_store(struct kobject *kobj,
+ kobject_uevent(&tlmi_priv.class_dev->kobj, KOBJ_CHANGE);
+ }
+ out:
++ mutex_unlock(&tlmi_mutex);
+ kfree(auth_str);
+ kfree(set_str);
+ kfree(new_setting);
+@@ -1483,11 +1493,11 @@ static int tlmi_analyze(void)
+ tlmi_priv.pwd_power->valid = true;
+
+ if (tlmi_priv.opcode_support) {
+- tlmi_priv.pwd_system = tlmi_create_auth("sys", "system");
++ tlmi_priv.pwd_system = tlmi_create_auth("smp", "system");
+ if (!tlmi_priv.pwd_system)
+ goto fail_clear_attr;
+
+- if (tlmi_priv.pwdcfg.core.password_state & TLMI_SYS_PWD)
++ if (tlmi_priv.pwdcfg.core.password_state & TLMI_SMP_PWD)
+ tlmi_priv.pwd_system->valid = true;
+
+ tlmi_priv.pwd_hdd = tlmi_create_auth("hdd", "hdd");
+diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
+index b3808ad77278d..187018ffb0686 100644
+--- a/drivers/platform/x86/thinkpad_acpi.c
++++ b/drivers/platform/x86/thinkpad_acpi.c
+@@ -10524,8 +10524,8 @@ unlock:
+ static void dytc_profile_refresh(void)
+ {
+ enum platform_profile_option profile;
+- int output, err = 0;
+- int perfmode, funcmode;
++ int output = 0, err = 0;
++ int perfmode, funcmode = 0;
+
+ mutex_lock(&dytc_mutex);
+ if (dytc_capabilities & BIT(DYTC_FC_MMC)) {
+@@ -10538,6 +10538,8 @@ static void dytc_profile_refresh(void)
+ err = dytc_command(DYTC_CMD_GET, &output);
+ /* Check if we are PSC mode, or have AMT enabled */
+ funcmode = (output >> DYTC_GET_FUNCTION_BIT) & 0xF;
++ } else { /* Unknown profile mode */
++ err = -ENODEV;
+ }
+ mutex_unlock(&dytc_mutex);
+ if (err)
+diff --git a/drivers/power/supply/rt9467-charger.c b/drivers/power/supply/rt9467-charger.c
+index ea33693b69779..b0b9ff8667459 100644
+--- a/drivers/power/supply/rt9467-charger.c
++++ b/drivers/power/supply/rt9467-charger.c
+@@ -1192,7 +1192,7 @@ static int rt9467_charger_probe(struct i2c_client *i2c)
+ i2c_set_clientdata(i2c, data);
+
+ /* Default pull charge enable gpio to make 'CHG_EN' by SW control only */
+- ceb_gpio = devm_gpiod_get_optional(dev, "charge-enable", GPIOD_OUT_LOW);
++ ceb_gpio = devm_gpiod_get_optional(dev, "charge-enable", GPIOD_OUT_HIGH);
+ if (IS_ERR(ceb_gpio))
+ return dev_err_probe(dev, PTR_ERR(ceb_gpio),
+ "Failed to config charge enable gpio\n");
+diff --git a/drivers/powercap/Kconfig b/drivers/powercap/Kconfig
+index 90d33cd1b670a..b063f75117738 100644
+--- a/drivers/powercap/Kconfig
++++ b/drivers/powercap/Kconfig
+@@ -18,10 +18,12 @@ if POWERCAP
+ # Client driver configurations go here.
+ config INTEL_RAPL_CORE
+ tristate
++ depends on PCI
++ select IOSF_MBI
+
+ config INTEL_RAPL
+ tristate "Intel RAPL Support via MSR Interface"
+- depends on X86 && IOSF_MBI
++ depends on X86 && PCI
+ select INTEL_RAPL_CORE
+ help
+ This enables support for the Intel Running Average Power Limit (RAPL)
+diff --git a/drivers/powercap/intel_rapl_msr.c b/drivers/powercap/intel_rapl_msr.c
+index a27673706c3d6..9ea4797d70b44 100644
+--- a/drivers/powercap/intel_rapl_msr.c
++++ b/drivers/powercap/intel_rapl_msr.c
+@@ -22,7 +22,6 @@
+ #include <linux/processor.h>
+ #include <linux/platform_device.h>
+
+-#include <asm/iosf_mbi.h>
+ #include <asm/cpu_device_id.h>
+ #include <asm/intel-family.h>
+
+@@ -137,14 +136,14 @@ static int rapl_msr_write_raw(int cpu, struct reg_action *ra)
+
+ /* List of verified CPUs. */
+ static const struct x86_cpu_id pl4_support_ids[] = {
+- { X86_VENDOR_INTEL, 6, INTEL_FAM6_TIGERLAKE_L, X86_FEATURE_ANY },
+- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE, X86_FEATURE_ANY },
+- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE_L, X86_FEATURE_ANY },
+- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE_N, X86_FEATURE_ANY },
+- { X86_VENDOR_INTEL, 6, INTEL_FAM6_RAPTORLAKE, X86_FEATURE_ANY },
+- { X86_VENDOR_INTEL, 6, INTEL_FAM6_RAPTORLAKE_P, X86_FEATURE_ANY },
+- { X86_VENDOR_INTEL, 6, INTEL_FAM6_METEORLAKE, X86_FEATURE_ANY },
+- { X86_VENDOR_INTEL, 6, INTEL_FAM6_METEORLAKE_L, X86_FEATURE_ANY },
++ X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L, NULL),
++ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, NULL),
++ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, NULL),
++ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_N, NULL),
++ X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE, NULL),
++ X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P, NULL),
++ X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE, NULL),
++ X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE_L, NULL),
+ {}
+ };
+
+diff --git a/drivers/pwm/pwm-ab8500.c b/drivers/pwm/pwm-ab8500.c
+index 507ff0d5f7bd8..583a7d69c7415 100644
+--- a/drivers/pwm/pwm-ab8500.c
++++ b/drivers/pwm/pwm-ab8500.c
+@@ -190,7 +190,7 @@ static int ab8500_pwm_probe(struct platform_device *pdev)
+ int err;
+
+ if (pdev->id < 1 || pdev->id > 31)
+- return dev_err_probe(&pdev->dev, EINVAL, "Invalid device id %d\n", pdev->id);
++ return dev_err_probe(&pdev->dev, -EINVAL, "Invalid device id %d\n", pdev->id);
+
+ /*
+ * Nothing to be done in probe, this is required to get the
+diff --git a/drivers/pwm/pwm-imx-tpm.c b/drivers/pwm/pwm-imx-tpm.c
+index 5e2b452ee5f2e..98ab65c896850 100644
+--- a/drivers/pwm/pwm-imx-tpm.c
++++ b/drivers/pwm/pwm-imx-tpm.c
+@@ -397,6 +397,13 @@ static int __maybe_unused pwm_imx_tpm_suspend(struct device *dev)
+ if (tpm->enable_count > 0)
+ return -EBUSY;
+
++ /*
++ * Force 'real_period' to be zero to force period update code
++ * can be executed after system resume back, since suspend causes
++ * the period related registers to become their reset values.
++ */
++ tpm->real_period = 0;
++
+ clk_disable_unprepare(tpm->clk);
+
+ return 0;
+diff --git a/drivers/pwm/pwm-mtk-disp.c b/drivers/pwm/pwm-mtk-disp.c
+index 79e321e96f56a..2401b67332417 100644
+--- a/drivers/pwm/pwm-mtk-disp.c
++++ b/drivers/pwm/pwm-mtk-disp.c
+@@ -79,14 +79,11 @@ static int mtk_disp_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
+ if (state->polarity != PWM_POLARITY_NORMAL)
+ return -EINVAL;
+
+- if (!state->enabled) {
+- mtk_disp_pwm_update_bits(mdp, DISP_PWM_EN, mdp->data->enable_mask,
+- 0x0);
+-
+- if (mdp->enabled) {
+- clk_disable_unprepare(mdp->clk_mm);
+- clk_disable_unprepare(mdp->clk_main);
+- }
++ if (!state->enabled && mdp->enabled) {
++ mtk_disp_pwm_update_bits(mdp, DISP_PWM_EN,
++ mdp->data->enable_mask, 0x0);
++ clk_disable_unprepare(mdp->clk_mm);
++ clk_disable_unprepare(mdp->clk_main);
+
+ mdp->enabled = false;
+ return 0;
+diff --git a/drivers/pwm/sysfs.c b/drivers/pwm/sysfs.c
+index 1a106ec329392..8d1254761e4dd 100644
+--- a/drivers/pwm/sysfs.c
++++ b/drivers/pwm/sysfs.c
+@@ -424,6 +424,13 @@ static int pwm_class_resume_npwm(struct device *parent, unsigned int npwm)
+ if (!export)
+ continue;
+
++ /* If pwmchip was not enabled before suspend, do nothing. */
++ if (!export->suspend.enabled) {
++ /* release lock taken in pwm_class_get_state */
++ mutex_unlock(&export->lock);
++ continue;
++ }
++
+ state.enabled = export->suspend.enabled;
+ ret = pwm_class_apply_state(export, pwm, &state);
+ if (ret < 0)
+@@ -448,7 +455,17 @@ static int pwm_class_suspend(struct device *parent)
+ if (!export)
+ continue;
+
++ /*
++ * If pwmchip was not enabled before suspend, save
++ * state for resume time and do nothing else.
++ */
+ export->suspend = state;
++ if (!state.enabled) {
++ /* release lock taken in pwm_class_get_state */
++ mutex_unlock(&export->lock);
++ continue;
++ }
++
+ state.enabled = false;
+ ret = pwm_class_apply_state(export, pwm, &state);
+ if (ret < 0) {
+diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
+index 698ab7f5004bf..d8e1caaf207e1 100644
+--- a/drivers/regulator/core.c
++++ b/drivers/regulator/core.c
+@@ -1911,19 +1911,17 @@ static struct regulator *create_regulator(struct regulator_dev *rdev,
+
+ if (err != -EEXIST)
+ regulator->debugfs = debugfs_create_dir(supply_name, rdev->debugfs);
+- if (!regulator->debugfs) {
++ if (IS_ERR(regulator->debugfs))
+ rdev_dbg(rdev, "Failed to create debugfs directory\n");
+- } else {
+- debugfs_create_u32("uA_load", 0444, regulator->debugfs,
+- &regulator->uA_load);
+- debugfs_create_u32("min_uV", 0444, regulator->debugfs,
+- &regulator->voltage[PM_SUSPEND_ON].min_uV);
+- debugfs_create_u32("max_uV", 0444, regulator->debugfs,
+- &regulator->voltage[PM_SUSPEND_ON].max_uV);
+- debugfs_create_file("constraint_flags", 0444,
+- regulator->debugfs, regulator,
+- &constraint_flags_fops);
+- }
++
++ debugfs_create_u32("uA_load", 0444, regulator->debugfs,
++ &regulator->uA_load);
++ debugfs_create_u32("min_uV", 0444, regulator->debugfs,
++ &regulator->voltage[PM_SUSPEND_ON].min_uV);
++ debugfs_create_u32("max_uV", 0444, regulator->debugfs,
++ &regulator->voltage[PM_SUSPEND_ON].max_uV);
++ debugfs_create_file("constraint_flags", 0444, regulator->debugfs,
++ regulator, &constraint_flags_fops);
+
+ /*
+ * Check now if the regulator is an always on regulator - if
+@@ -5256,10 +5254,8 @@ static void rdev_init_debugfs(struct regulator_dev *rdev)
+ }
+
+ rdev->debugfs = debugfs_create_dir(rname, debugfs_root);
+- if (IS_ERR(rdev->debugfs)) {
+- rdev_warn(rdev, "Failed to create debugfs directory\n");
+- return;
+- }
++ if (IS_ERR(rdev->debugfs))
++ rdev_dbg(rdev, "Failed to create debugfs directory\n");
+
+ debugfs_create_u32("use_count", 0444, rdev->debugfs,
+ &rdev->use_count);
+@@ -6179,7 +6175,7 @@ static int __init regulator_init(void)
+
+ debugfs_root = debugfs_create_dir("regulator", NULL);
+ if (IS_ERR(debugfs_root))
+- pr_warn("regulator: Failed to create debugfs directory\n");
++ pr_debug("regulator: Failed to create debugfs directory\n");
+
+ #ifdef CONFIG_DEBUG_FS
+ debugfs_create_file("supply_map", 0444, debugfs_root, NULL,
+diff --git a/drivers/regulator/rk808-regulator.c b/drivers/regulator/rk808-regulator.c
+index 3637e81654a8e..80ba782d89239 100644
+--- a/drivers/regulator/rk808-regulator.c
++++ b/drivers/regulator/rk808-regulator.c
+@@ -1336,6 +1336,7 @@ static int rk808_regulator_probe(struct platform_device *pdev)
+
+ config.dev = &pdev->dev;
+ config.dev->of_node = pdev->dev.parent->of_node;
++ config.dev->of_node_reused = true;
+ config.driver_data = pdata;
+ config.regmap = regmap;
+
+diff --git a/drivers/regulator/tps65219-regulator.c b/drivers/regulator/tps65219-regulator.c
+index b1719ee990ab4..8971b507a79ac 100644
+--- a/drivers/regulator/tps65219-regulator.c
++++ b/drivers/regulator/tps65219-regulator.c
+@@ -289,13 +289,13 @@ static irqreturn_t tps65219_regulator_irq_handler(int irq, void *data)
+
+ static int tps65219_get_rdev_by_name(const char *regulator_name,
+ struct regulator_dev *rdevtbl[7],
+- struct regulator_dev *dev)
++ struct regulator_dev **dev)
+ {
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(regulators); i++) {
+ if (strcmp(regulator_name, regulators[i].name) == 0) {
+- dev = rdevtbl[i];
++ *dev = rdevtbl[i];
+ return 0;
+ }
+ }
+@@ -348,7 +348,7 @@ static int tps65219_regulator_probe(struct platform_device *pdev)
+ irq_data[i].dev = tps->dev;
+ irq_data[i].type = irq_type;
+
+- tps65219_get_rdev_by_name(irq_type->regulator_name, rdevtbl, rdev);
++ tps65219_get_rdev_by_name(irq_type->regulator_name, rdevtbl, &rdev);
+ if (IS_ERR(rdev)) {
+ dev_err(tps->dev, "Failed to get rdev for %s\n",
+ irq_type->regulator_name);
+diff --git a/drivers/rtc/rtc-st-lpc.c b/drivers/rtc/rtc-st-lpc.c
+index 0f8e4231098ef..d04d46f9cc65a 100644
+--- a/drivers/rtc/rtc-st-lpc.c
++++ b/drivers/rtc/rtc-st-lpc.c
+@@ -228,7 +228,7 @@ static int st_rtc_probe(struct platform_device *pdev)
+ enable_irq_wake(rtc->irq);
+ disable_irq(rtc->irq);
+
+- rtc->clk = clk_get(&pdev->dev, NULL);
++ rtc->clk = devm_clk_get(&pdev->dev, NULL);
+ if (IS_ERR(rtc->clk)) {
+ dev_err(&pdev->dev, "Unable to request clock\n");
+ return PTR_ERR(rtc->clk);
+diff --git a/drivers/s390/net/qeth_l3_sys.c b/drivers/s390/net/qeth_l3_sys.c
+index 9f90a860ca2c9..a6b64228ead25 100644
+--- a/drivers/s390/net/qeth_l3_sys.c
++++ b/drivers/s390/net/qeth_l3_sys.c
+@@ -625,7 +625,7 @@ static QETH_DEVICE_ATTR(vipa_add4, add4, 0644,
+ static ssize_t qeth_l3_dev_vipa_del4_store(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+ {
+- return qeth_l3_vipa_store(dev, buf, true, count, QETH_PROT_IPV4);
++ return qeth_l3_vipa_store(dev, buf, false, count, QETH_PROT_IPV4);
+ }
+
+ static QETH_DEVICE_ATTR(vipa_del4, del4, 0200, NULL,
+diff --git a/drivers/scsi/3w-xxxx.c b/drivers/scsi/3w-xxxx.c
+index 36c34ced0cc18..f39c9ec2e7810 100644
+--- a/drivers/scsi/3w-xxxx.c
++++ b/drivers/scsi/3w-xxxx.c
+@@ -2305,8 +2305,10 @@ static int tw_probe(struct pci_dev *pdev, const struct pci_device_id *dev_id)
+ TW_DISABLE_INTERRUPTS(tw_dev);
+
+ /* Initialize the card */
+- if (tw_reset_sequence(tw_dev))
++ if (tw_reset_sequence(tw_dev)) {
++ retval = -EINVAL;
+ goto out_release_mem_region;
++ }
+
+ /* Set host specific parameters */
+ host->max_id = TW_MAX_UNITS;
+diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
+index 6a15f879e5173..a5e5b7bff59b4 100644
+--- a/drivers/scsi/lpfc/lpfc_els.c
++++ b/drivers/scsi/lpfc/lpfc_els.c
+@@ -5468,9 +5468,19 @@ out:
+ ndlp->nlp_flag &= ~NLP_RELEASE_RPI;
+ spin_unlock_irq(&ndlp->lock);
+ }
++ lpfc_drop_node(vport, ndlp);
++ } else if (ndlp->nlp_state != NLP_STE_PLOGI_ISSUE &&
++ ndlp->nlp_state != NLP_STE_REG_LOGIN_ISSUE &&
++ ndlp->nlp_state != NLP_STE_PRLI_ISSUE) {
++ /* Drop ndlp if there is no planned or outstanding
++ * issued PRLI.
++ *
++ * In cases when the ndlp is acting as both an initiator
++ * and target function, let our issued PRLI determine
++ * the final ndlp kref drop.
++ */
++ lpfc_drop_node(vport, ndlp);
+ }
+-
+- lpfc_drop_node(vport, ndlp);
+ }
+
+ /* Release the originating I/O reference. */
+diff --git a/drivers/scsi/qedf/qedf_main.c b/drivers/scsi/qedf/qedf_main.c
+index 3b64de81ea0d3..2a31ddc99dde5 100644
+--- a/drivers/scsi/qedf/qedf_main.c
++++ b/drivers/scsi/qedf/qedf_main.c
+@@ -3041,9 +3041,8 @@ static int qedf_alloc_global_queues(struct qedf_ctx *qedf)
+ * addresses of our queues
+ */
+ if (!qedf->p_cpuq) {
+- status = -EINVAL;
+ QEDF_ERR(&qedf->dbg_ctx, "p_cpuq is NULL.\n");
+- goto mem_alloc_failure;
++ return -EINVAL;
+ }
+
+ qedf->global_queues = kzalloc((sizeof(struct global_queue *)
+diff --git a/drivers/soc/amlogic/meson-secure-pwrc.c b/drivers/soc/amlogic/meson-secure-pwrc.c
+index e935187635267..25b4b71df9b89 100644
+--- a/drivers/soc/amlogic/meson-secure-pwrc.c
++++ b/drivers/soc/amlogic/meson-secure-pwrc.c
+@@ -105,7 +105,7 @@ static struct meson_secure_pwrc_domain_desc a1_pwrc_domains[] = {
+ SEC_PD(ACODEC, 0),
+ SEC_PD(AUDIO, 0),
+ SEC_PD(OTP, 0),
+- SEC_PD(DMA, 0),
++ SEC_PD(DMA, GENPD_FLAG_ALWAYS_ON | GENPD_FLAG_IRQ_SAFE),
+ SEC_PD(SD_EMMC, 0),
+ SEC_PD(RAMA, 0),
+ /* SRAMB is used as ATF runtime memory, and should be always on */
+diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig
+index e0d096607fefb..fa9ffbed0e929 100644
+--- a/drivers/soc/fsl/qe/Kconfig
++++ b/drivers/soc/fsl/qe/Kconfig
+@@ -62,6 +62,7 @@ config QE_TDM
+
+ config QE_USB
+ bool
++ depends on QUICC_ENGINE
+ default y if USB_FSL_QE
+ help
+ QE USB Controller support
+diff --git a/drivers/soc/mediatek/mtk-svs.c b/drivers/soc/mediatek/mtk-svs.c
+index 81585733c8a99..3a2f97cd52720 100644
+--- a/drivers/soc/mediatek/mtk-svs.c
++++ b/drivers/soc/mediatek/mtk-svs.c
+@@ -2061,9 +2061,9 @@ static int svs_mt8192_platform_probe(struct svs_platform *svsp)
+ svsb = &svsp->banks[idx];
+
+ if (svsb->type == SVSB_HIGH)
+- svsb->opp_dev = svs_add_device_link(svsp, "mali");
++ svsb->opp_dev = svs_add_device_link(svsp, "gpu");
+ else if (svsb->type == SVSB_LOW)
+- svsb->opp_dev = svs_get_subsys_device(svsp, "mali");
++ svsb->opp_dev = svs_get_subsys_device(svsp, "gpu");
+
+ if (IS_ERR(svsb->opp_dev))
+ return dev_err_probe(svsp->dev, PTR_ERR(svsb->opp_dev),
+diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
+index 795a2e1d59b3a..dd50a255fa6cb 100644
+--- a/drivers/soc/qcom/qcom-geni-se.c
++++ b/drivers/soc/qcom/qcom-geni-se.c
+@@ -682,6 +682,30 @@ EXPORT_SYMBOL(geni_se_clk_freq_match);
+ #define GENI_SE_DMA_EOT_EN BIT(1)
+ #define GENI_SE_DMA_AHB_ERR_EN BIT(2)
+ #define GENI_SE_DMA_EOT_BUF BIT(0)
++
++/**
++ * geni_se_tx_init_dma() - Initiate TX DMA transfer on the serial engine
++ * @se: Pointer to the concerned serial engine.
++ * @iova: Mapped DMA address.
++ * @len: Length of the TX buffer.
++ *
++ * This function is used to initiate DMA TX transfer.
++ */
++void geni_se_tx_init_dma(struct geni_se *se, dma_addr_t iova, size_t len)
++{
++ u32 val;
++
++ val = GENI_SE_DMA_DONE_EN;
++ val |= GENI_SE_DMA_EOT_EN;
++ val |= GENI_SE_DMA_AHB_ERR_EN;
++ writel_relaxed(val, se->base + SE_DMA_TX_IRQ_EN_SET);
++ writel_relaxed(lower_32_bits(iova), se->base + SE_DMA_TX_PTR_L);
++ writel_relaxed(upper_32_bits(iova), se->base + SE_DMA_TX_PTR_H);
++ writel_relaxed(GENI_SE_DMA_EOT_BUF, se->base + SE_DMA_TX_ATTR);
++ writel(len, se->base + SE_DMA_TX_LEN);
++}
++EXPORT_SYMBOL(geni_se_tx_init_dma);
++
+ /**
+ * geni_se_tx_dma_prep() - Prepare the serial engine for TX DMA transfer
+ * @se: Pointer to the concerned serial engine.
+@@ -697,7 +721,6 @@ int geni_se_tx_dma_prep(struct geni_se *se, void *buf, size_t len,
+ dma_addr_t *iova)
+ {
+ struct geni_wrapper *wrapper = se->wrapper;
+- u32 val;
+
+ if (!wrapper)
+ return -EINVAL;
+@@ -706,17 +729,34 @@ int geni_se_tx_dma_prep(struct geni_se *se, void *buf, size_t len,
+ if (dma_mapping_error(wrapper->dev, *iova))
+ return -EIO;
+
++ geni_se_tx_init_dma(se, *iova, len);
++ return 0;
++}
++EXPORT_SYMBOL(geni_se_tx_dma_prep);
++
++/**
++ * geni_se_rx_init_dma() - Initiate RX DMA transfer on the serial engine
++ * @se: Pointer to the concerned serial engine.
++ * @iova: Mapped DMA address.
++ * @len: Length of the RX buffer.
++ *
++ * This function is used to initiate DMA RX transfer.
++ */
++void geni_se_rx_init_dma(struct geni_se *se, dma_addr_t iova, size_t len)
++{
++ u32 val;
++
+ val = GENI_SE_DMA_DONE_EN;
+ val |= GENI_SE_DMA_EOT_EN;
+ val |= GENI_SE_DMA_AHB_ERR_EN;
+- writel_relaxed(val, se->base + SE_DMA_TX_IRQ_EN_SET);
+- writel_relaxed(lower_32_bits(*iova), se->base + SE_DMA_TX_PTR_L);
+- writel_relaxed(upper_32_bits(*iova), se->base + SE_DMA_TX_PTR_H);
+- writel_relaxed(GENI_SE_DMA_EOT_BUF, se->base + SE_DMA_TX_ATTR);
+- writel(len, se->base + SE_DMA_TX_LEN);
+- return 0;
++ writel_relaxed(val, se->base + SE_DMA_RX_IRQ_EN_SET);
++ writel_relaxed(lower_32_bits(iova), se->base + SE_DMA_RX_PTR_L);
++ writel_relaxed(upper_32_bits(iova), se->base + SE_DMA_RX_PTR_H);
++ /* RX does not have EOT buffer type bit. So just reset RX_ATTR */
++ writel_relaxed(0, se->base + SE_DMA_RX_ATTR);
++ writel(len, se->base + SE_DMA_RX_LEN);
+ }
+-EXPORT_SYMBOL(geni_se_tx_dma_prep);
++EXPORT_SYMBOL(geni_se_rx_init_dma);
+
+ /**
+ * geni_se_rx_dma_prep() - Prepare the serial engine for RX DMA transfer
+@@ -733,7 +773,6 @@ int geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len,
+ dma_addr_t *iova)
+ {
+ struct geni_wrapper *wrapper = se->wrapper;
+- u32 val;
+
+ if (!wrapper)
+ return -EINVAL;
+@@ -742,15 +781,7 @@ int geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len,
+ if (dma_mapping_error(wrapper->dev, *iova))
+ return -EIO;
+
+- val = GENI_SE_DMA_DONE_EN;
+- val |= GENI_SE_DMA_EOT_EN;
+- val |= GENI_SE_DMA_AHB_ERR_EN;
+- writel_relaxed(val, se->base + SE_DMA_RX_IRQ_EN_SET);
+- writel_relaxed(lower_32_bits(*iova), se->base + SE_DMA_RX_PTR_L);
+- writel_relaxed(upper_32_bits(*iova), se->base + SE_DMA_RX_PTR_H);
+- /* RX does not have EOT buffer type bit. So just reset RX_ATTR */
+- writel_relaxed(0, se->base + SE_DMA_RX_ATTR);
+- writel(len, se->base + SE_DMA_RX_LEN);
++ geni_se_rx_init_dma(se, *iova, len);
+ return 0;
+ }
+ EXPORT_SYMBOL(geni_se_rx_dma_prep);
+diff --git a/drivers/soc/xilinx/xlnx_event_manager.c b/drivers/soc/xilinx/xlnx_event_manager.c
+index c76381899ef49..f9d9b82b562da 100644
+--- a/drivers/soc/xilinx/xlnx_event_manager.c
++++ b/drivers/soc/xilinx/xlnx_event_manager.c
+@@ -192,11 +192,12 @@ static int xlnx_remove_cb_for_suspend(event_cb_func_t cb_fun)
+ struct registered_event_data *eve_data;
+ struct agent_cb *cb_pos;
+ struct agent_cb *cb_next;
++ struct hlist_node *tmp;
+
+ is_need_to_unregister = false;
+
+ /* Check for existing entry in hash table for given cb_type */
+- hash_for_each_possible(reg_driver_map, eve_data, hentry, PM_INIT_SUSPEND_CB) {
++ hash_for_each_possible_safe(reg_driver_map, eve_data, tmp, hentry, PM_INIT_SUSPEND_CB) {
+ if (eve_data->cb_type == PM_INIT_SUSPEND_CB) {
+ /* Delete the list of callback */
+ list_for_each_entry_safe(cb_pos, cb_next, &eve_data->cb_list_head, list) {
+@@ -228,11 +229,12 @@ static int xlnx_remove_cb_for_notify_event(const u32 node_id, const u32 event,
+ u64 key = ((u64)node_id << 32U) | (u64)event;
+ struct agent_cb *cb_pos;
+ struct agent_cb *cb_next;
++ struct hlist_node *tmp;
+
+ is_need_to_unregister = false;
+
+ /* Check for existing entry in hash table for given key id */
+- hash_for_each_possible(reg_driver_map, eve_data, hentry, key) {
++ hash_for_each_possible_safe(reg_driver_map, eve_data, tmp, hentry, key) {
+ if (eve_data->key == key) {
+ /* Delete the list of callback */
+ list_for_each_entry_safe(cb_pos, cb_next, &eve_data->cb_list_head, list) {
+diff --git a/drivers/soundwire/debugfs.c b/drivers/soundwire/debugfs.c
+index dea782e0edc4b..c3a1a359ee5c3 100644
+--- a/drivers/soundwire/debugfs.c
++++ b/drivers/soundwire/debugfs.c
+@@ -56,8 +56,9 @@ static int sdw_slave_reg_show(struct seq_file *s_file, void *data)
+ if (!buf)
+ return -ENOMEM;
+
+- ret = pm_runtime_resume_and_get(&slave->dev);
++ ret = pm_runtime_get_sync(&slave->dev);
+ if (ret < 0 && ret != -EACCES) {
++ pm_runtime_put_noidle(&slave->dev);
+ kfree(buf);
+ return ret;
+ }
+diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
+index 280455f047a36..bd39e78788590 100644
+--- a/drivers/soundwire/qcom.c
++++ b/drivers/soundwire/qcom.c
+@@ -278,14 +278,14 @@ static u32 swrm_get_packed_reg_val(u8 *cmd_id, u8 cmd_data,
+ return val;
+ }
+
+-static int swrm_wait_for_rd_fifo_avail(struct qcom_swrm_ctrl *swrm)
++static int swrm_wait_for_rd_fifo_avail(struct qcom_swrm_ctrl *ctrl)
+ {
+ u32 fifo_outstanding_data, value;
+ int fifo_retry_count = SWR_OVERFLOW_RETRY_COUNT;
+
+ do {
+ /* Check for fifo underflow during read */
+- swrm->reg_read(swrm, SWRM_CMD_FIFO_STATUS, &value);
++ ctrl->reg_read(ctrl, SWRM_CMD_FIFO_STATUS, &value);
+ fifo_outstanding_data = FIELD_GET(SWRM_RD_CMD_FIFO_CNT_MASK, value);
+
+ /* Check if read data is available in read fifo */
+@@ -296,39 +296,39 @@ static int swrm_wait_for_rd_fifo_avail(struct qcom_swrm_ctrl *swrm)
+ } while (fifo_retry_count--);
+
+ if (fifo_outstanding_data == 0) {
+- dev_err_ratelimited(swrm->dev, "%s err read underflow\n", __func__);
++ dev_err_ratelimited(ctrl->dev, "%s err read underflow\n", __func__);
+ return -EIO;
+ }
+
+ return 0;
+ }
+
+-static int swrm_wait_for_wr_fifo_avail(struct qcom_swrm_ctrl *swrm)
++static int swrm_wait_for_wr_fifo_avail(struct qcom_swrm_ctrl *ctrl)
+ {
+ u32 fifo_outstanding_cmds, value;
+ int fifo_retry_count = SWR_OVERFLOW_RETRY_COUNT;
+
+ do {
+ /* Check for fifo overflow during write */
+- swrm->reg_read(swrm, SWRM_CMD_FIFO_STATUS, &value);
++ ctrl->reg_read(ctrl, SWRM_CMD_FIFO_STATUS, &value);
+ fifo_outstanding_cmds = FIELD_GET(SWRM_WR_CMD_FIFO_CNT_MASK, value);
+
+ /* Check for space in write fifo before writing */
+- if (fifo_outstanding_cmds < swrm->wr_fifo_depth)
++ if (fifo_outstanding_cmds < ctrl->wr_fifo_depth)
+ return 0;
+
+ usleep_range(500, 510);
+ } while (fifo_retry_count--);
+
+- if (fifo_outstanding_cmds == swrm->wr_fifo_depth) {
+- dev_err_ratelimited(swrm->dev, "%s err write overflow\n", __func__);
++ if (fifo_outstanding_cmds == ctrl->wr_fifo_depth) {
++ dev_err_ratelimited(ctrl->dev, "%s err write overflow\n", __func__);
+ return -EIO;
+ }
+
+ return 0;
+ }
+
+-static int qcom_swrm_cmd_fifo_wr_cmd(struct qcom_swrm_ctrl *swrm, u8 cmd_data,
++static int qcom_swrm_cmd_fifo_wr_cmd(struct qcom_swrm_ctrl *ctrl, u8 cmd_data,
+ u8 dev_addr, u16 reg_addr)
+ {
+
+@@ -341,20 +341,20 @@ static int qcom_swrm_cmd_fifo_wr_cmd(struct qcom_swrm_ctrl *swrm, u8 cmd_data,
+ val = swrm_get_packed_reg_val(&cmd_id, cmd_data,
+ dev_addr, reg_addr);
+ } else {
+- val = swrm_get_packed_reg_val(&swrm->wcmd_id, cmd_data,
++ val = swrm_get_packed_reg_val(&ctrl->wcmd_id, cmd_data,
+ dev_addr, reg_addr);
+ }
+
+- if (swrm_wait_for_wr_fifo_avail(swrm))
++ if (swrm_wait_for_wr_fifo_avail(ctrl))
+ return SDW_CMD_FAIL_OTHER;
+
+ if (cmd_id == SWR_BROADCAST_CMD_ID)
+- reinit_completion(&swrm->broadcast);
++ reinit_completion(&ctrl->broadcast);
+
+ /* Its assumed that write is okay as we do not get any status back */
+- swrm->reg_write(swrm, SWRM_CMD_FIFO_WR_CMD, val);
++ ctrl->reg_write(ctrl, SWRM_CMD_FIFO_WR_CMD, val);
+
+- if (swrm->version <= SWRM_VERSION_1_3_0)
++ if (ctrl->version <= SWRM_VERSION_1_3_0)
+ usleep_range(150, 155);
+
+ if (cmd_id == SWR_BROADCAST_CMD_ID) {
+@@ -362,7 +362,7 @@ static int qcom_swrm_cmd_fifo_wr_cmd(struct qcom_swrm_ctrl *swrm, u8 cmd_data,
+ * sleep for 10ms for MSM soundwire variant to allow broadcast
+ * command to complete.
+ */
+- ret = wait_for_completion_timeout(&swrm->broadcast,
++ ret = wait_for_completion_timeout(&ctrl->broadcast,
+ msecs_to_jiffies(TIMEOUT_MS));
+ if (!ret)
+ ret = SDW_CMD_IGNORED;
+@@ -375,41 +375,41 @@ static int qcom_swrm_cmd_fifo_wr_cmd(struct qcom_swrm_ctrl *swrm, u8 cmd_data,
+ return ret;
+ }
+
+-static int qcom_swrm_cmd_fifo_rd_cmd(struct qcom_swrm_ctrl *swrm,
++static int qcom_swrm_cmd_fifo_rd_cmd(struct qcom_swrm_ctrl *ctrl,
+ u8 dev_addr, u16 reg_addr,
+ u32 len, u8 *rval)
+ {
+ u32 cmd_data, cmd_id, val, retry_attempt = 0;
+
+- val = swrm_get_packed_reg_val(&swrm->rcmd_id, len, dev_addr, reg_addr);
++ val = swrm_get_packed_reg_val(&ctrl->rcmd_id, len, dev_addr, reg_addr);
+
+ /*
+ * Check for outstanding cmd wrt. write fifo depth to avoid
+ * overflow as read will also increase write fifo cnt.
+ */
+- swrm_wait_for_wr_fifo_avail(swrm);
++ swrm_wait_for_wr_fifo_avail(ctrl);
+
+ /* wait for FIFO RD to complete to avoid overflow */
+ usleep_range(100, 105);
+- swrm->reg_write(swrm, SWRM_CMD_FIFO_RD_CMD, val);
++ ctrl->reg_write(ctrl, SWRM_CMD_FIFO_RD_CMD, val);
+ /* wait for FIFO RD CMD complete to avoid overflow */
+ usleep_range(250, 255);
+
+- if (swrm_wait_for_rd_fifo_avail(swrm))
++ if (swrm_wait_for_rd_fifo_avail(ctrl))
+ return SDW_CMD_FAIL_OTHER;
+
+ do {
+- swrm->reg_read(swrm, SWRM_CMD_FIFO_RD_FIFO_ADDR, &cmd_data);
++ ctrl->reg_read(ctrl, SWRM_CMD_FIFO_RD_FIFO_ADDR, &cmd_data);
+ rval[0] = cmd_data & 0xFF;
+ cmd_id = FIELD_GET(SWRM_RD_FIFO_CMD_ID_MASK, cmd_data);
+
+- if (cmd_id != swrm->rcmd_id) {
++ if (cmd_id != ctrl->rcmd_id) {
+ if (retry_attempt < (MAX_FIFO_RD_RETRY - 1)) {
+ /* wait 500 us before retry on fifo read failure */
+ usleep_range(500, 505);
+- swrm->reg_write(swrm, SWRM_CMD_FIFO_CMD,
++ ctrl->reg_write(ctrl, SWRM_CMD_FIFO_CMD,
+ SWRM_CMD_FIFO_FLUSH);
+- swrm->reg_write(swrm, SWRM_CMD_FIFO_RD_CMD, val);
++ ctrl->reg_write(ctrl, SWRM_CMD_FIFO_RD_CMD, val);
+ }
+ retry_attempt++;
+ } else {
+@@ -418,9 +418,9 @@ static int qcom_swrm_cmd_fifo_rd_cmd(struct qcom_swrm_ctrl *swrm,
+
+ } while (retry_attempt < MAX_FIFO_RD_RETRY);
+
+- dev_err(swrm->dev, "failed to read fifo: reg: 0x%x, rcmd_id: 0x%x,\
++ dev_err(ctrl->dev, "failed to read fifo: reg: 0x%x, rcmd_id: 0x%x,\
+ dev_num: 0x%x, cmd_data: 0x%x\n",
+- reg_addr, swrm->rcmd_id, dev_addr, cmd_data);
++ reg_addr, ctrl->rcmd_id, dev_addr, cmd_data);
+
+ return SDW_CMD_IGNORED;
+ }
+@@ -532,39 +532,40 @@ static int qcom_swrm_enumerate(struct sdw_bus *bus)
+
+ static irqreturn_t qcom_swrm_wake_irq_handler(int irq, void *dev_id)
+ {
+- struct qcom_swrm_ctrl *swrm = dev_id;
++ struct qcom_swrm_ctrl *ctrl = dev_id;
+ int ret;
+
+- ret = pm_runtime_resume_and_get(swrm->dev);
++ ret = pm_runtime_get_sync(ctrl->dev);
+ if (ret < 0 && ret != -EACCES) {
+- dev_err_ratelimited(swrm->dev,
+- "pm_runtime_resume_and_get failed in %s, ret %d\n",
++ dev_err_ratelimited(ctrl->dev,
++ "pm_runtime_get_sync failed in %s, ret %d\n",
+ __func__, ret);
++ pm_runtime_put_noidle(ctrl->dev);
+ return ret;
+ }
+
+- if (swrm->wake_irq > 0) {
+- if (!irqd_irq_disabled(irq_get_irq_data(swrm->wake_irq)))
+- disable_irq_nosync(swrm->wake_irq);
++ if (ctrl->wake_irq > 0) {
++ if (!irqd_irq_disabled(irq_get_irq_data(ctrl->wake_irq)))
++ disable_irq_nosync(ctrl->wake_irq);
+ }
+
+- pm_runtime_mark_last_busy(swrm->dev);
+- pm_runtime_put_autosuspend(swrm->dev);
++ pm_runtime_mark_last_busy(ctrl->dev);
++ pm_runtime_put_autosuspend(ctrl->dev);
+
+ return IRQ_HANDLED;
+ }
+
+ static irqreturn_t qcom_swrm_irq_handler(int irq, void *dev_id)
+ {
+- struct qcom_swrm_ctrl *swrm = dev_id;
++ struct qcom_swrm_ctrl *ctrl = dev_id;
+ u32 value, intr_sts, intr_sts_masked, slave_status;
+ u32 i;
+ int devnum;
+ int ret = IRQ_HANDLED;
+- clk_prepare_enable(swrm->hclk);
++ clk_prepare_enable(ctrl->hclk);
+
+- swrm->reg_read(swrm, SWRM_INTERRUPT_STATUS, &intr_sts);
+- intr_sts_masked = intr_sts & swrm->intr_mask;
++ ctrl->reg_read(ctrl, SWRM_INTERRUPT_STATUS, &intr_sts);
++ intr_sts_masked = intr_sts & ctrl->intr_mask;
+
+ do {
+ for (i = 0; i < SWRM_INTERRUPT_MAX; i++) {
+@@ -574,80 +575,80 @@ static irqreturn_t qcom_swrm_irq_handler(int irq, void *dev_id)
+
+ switch (value) {
+ case SWRM_INTERRUPT_STATUS_SLAVE_PEND_IRQ:
+- devnum = qcom_swrm_get_alert_slave_dev_num(swrm);
++ devnum = qcom_swrm_get_alert_slave_dev_num(ctrl);
+ if (devnum < 0) {
+- dev_err_ratelimited(swrm->dev,
++ dev_err_ratelimited(ctrl->dev,
+ "no slave alert found.spurious interrupt\n");
+ } else {
+- sdw_handle_slave_status(&swrm->bus, swrm->status);
++ sdw_handle_slave_status(&ctrl->bus, ctrl->status);
+ }
+
+ break;
+ case SWRM_INTERRUPT_STATUS_NEW_SLAVE_ATTACHED:
+ case SWRM_INTERRUPT_STATUS_CHANGE_ENUM_SLAVE_STATUS:
+- dev_dbg_ratelimited(swrm->dev, "SWR new slave attached\n");
+- swrm->reg_read(swrm, SWRM_MCP_SLV_STATUS, &slave_status);
+- if (swrm->slave_status == slave_status) {
+- dev_dbg(swrm->dev, "Slave status not changed %x\n",
++ dev_dbg_ratelimited(ctrl->dev, "SWR new slave attached\n");
++ ctrl->reg_read(ctrl, SWRM_MCP_SLV_STATUS, &slave_status);
++ if (ctrl->slave_status == slave_status) {
++ dev_dbg(ctrl->dev, "Slave status not changed %x\n",
+ slave_status);
+ } else {
+- qcom_swrm_get_device_status(swrm);
+- qcom_swrm_enumerate(&swrm->bus);
+- sdw_handle_slave_status(&swrm->bus, swrm->status);
++ qcom_swrm_get_device_status(ctrl);
++ qcom_swrm_enumerate(&ctrl->bus);
++ sdw_handle_slave_status(&ctrl->bus, ctrl->status);
+ }
+ break;
+ case SWRM_INTERRUPT_STATUS_MASTER_CLASH_DET:
+- dev_err_ratelimited(swrm->dev,
++ dev_err_ratelimited(ctrl->dev,
+ "%s: SWR bus clsh detected\n",
+ __func__);
+- swrm->intr_mask &= ~SWRM_INTERRUPT_STATUS_MASTER_CLASH_DET;
+- swrm->reg_write(swrm, SWRM_INTERRUPT_CPU_EN, swrm->intr_mask);
++ ctrl->intr_mask &= ~SWRM_INTERRUPT_STATUS_MASTER_CLASH_DET;
++ ctrl->reg_write(ctrl, SWRM_INTERRUPT_CPU_EN, ctrl->intr_mask);
+ break;
+ case SWRM_INTERRUPT_STATUS_RD_FIFO_OVERFLOW:
+- swrm->reg_read(swrm, SWRM_CMD_FIFO_STATUS, &value);
+- dev_err_ratelimited(swrm->dev,
++ ctrl->reg_read(ctrl, SWRM_CMD_FIFO_STATUS, &value);
++ dev_err_ratelimited(ctrl->dev,
+ "%s: SWR read FIFO overflow fifo status 0x%x\n",
+ __func__, value);
+ break;
+ case SWRM_INTERRUPT_STATUS_RD_FIFO_UNDERFLOW:
+- swrm->reg_read(swrm, SWRM_CMD_FIFO_STATUS, &value);
+- dev_err_ratelimited(swrm->dev,
++ ctrl->reg_read(ctrl, SWRM_CMD_FIFO_STATUS, &value);
++ dev_err_ratelimited(ctrl->dev,
+ "%s: SWR read FIFO underflow fifo status 0x%x\n",
+ __func__, value);
+ break;
+ case SWRM_INTERRUPT_STATUS_WR_CMD_FIFO_OVERFLOW:
+- swrm->reg_read(swrm, SWRM_CMD_FIFO_STATUS, &value);
+- dev_err(swrm->dev,
++ ctrl->reg_read(ctrl, SWRM_CMD_FIFO_STATUS, &value);
++ dev_err(ctrl->dev,
+ "%s: SWR write FIFO overflow fifo status %x\n",
+ __func__, value);
+- swrm->reg_write(swrm, SWRM_CMD_FIFO_CMD, 0x1);
++ ctrl->reg_write(ctrl, SWRM_CMD_FIFO_CMD, 0x1);
+ break;
+ case SWRM_INTERRUPT_STATUS_CMD_ERROR:
+- swrm->reg_read(swrm, SWRM_CMD_FIFO_STATUS, &value);
+- dev_err_ratelimited(swrm->dev,
++ ctrl->reg_read(ctrl, SWRM_CMD_FIFO_STATUS, &value);
++ dev_err_ratelimited(ctrl->dev,
+ "%s: SWR CMD error, fifo status 0x%x, flushing fifo\n",
+ __func__, value);
+- swrm->reg_write(swrm, SWRM_CMD_FIFO_CMD, 0x1);
++ ctrl->reg_write(ctrl, SWRM_CMD_FIFO_CMD, 0x1);
+ break;
+ case SWRM_INTERRUPT_STATUS_DOUT_PORT_COLLISION:
+- dev_err_ratelimited(swrm->dev,
++ dev_err_ratelimited(ctrl->dev,
+ "%s: SWR Port collision detected\n",
+ __func__);
+- swrm->intr_mask &= ~SWRM_INTERRUPT_STATUS_DOUT_PORT_COLLISION;
+- swrm->reg_write(swrm,
+- SWRM_INTERRUPT_CPU_EN, swrm->intr_mask);
++ ctrl->intr_mask &= ~SWRM_INTERRUPT_STATUS_DOUT_PORT_COLLISION;
++ ctrl->reg_write(ctrl,
++ SWRM_INTERRUPT_CPU_EN, ctrl->intr_mask);
+ break;
+ case SWRM_INTERRUPT_STATUS_READ_EN_RD_VALID_MISMATCH:
+- dev_err_ratelimited(swrm->dev,
++ dev_err_ratelimited(ctrl->dev,
+ "%s: SWR read enable valid mismatch\n",
+ __func__);
+- swrm->intr_mask &=
++ ctrl->intr_mask &=
+ ~SWRM_INTERRUPT_STATUS_READ_EN_RD_VALID_MISMATCH;
+- swrm->reg_write(swrm,
+- SWRM_INTERRUPT_CPU_EN, swrm->intr_mask);
++ ctrl->reg_write(ctrl,
++ SWRM_INTERRUPT_CPU_EN, ctrl->intr_mask);
+ break;
+ case SWRM_INTERRUPT_STATUS_SPECIAL_CMD_ID_FINISHED:
+- complete(&swrm->broadcast);
++ complete(&ctrl->broadcast);
+ break;
+ case SWRM_INTERRUPT_STATUS_BUS_RESET_FINISHED_V2:
+ break;
+@@ -656,19 +657,19 @@ static irqreturn_t qcom_swrm_irq_handler(int irq, void *dev_id)
+ case SWRM_INTERRUPT_STATUS_EXT_CLK_STOP_WAKEUP:
+ break;
+ default:
+- dev_err_ratelimited(swrm->dev,
++ dev_err_ratelimited(ctrl->dev,
+ "%s: SWR unknown interrupt value: %d\n",
+ __func__, value);
+ ret = IRQ_NONE;
+ break;
+ }
+ }
+- swrm->reg_write(swrm, SWRM_INTERRUPT_CLEAR, intr_sts);
+- swrm->reg_read(swrm, SWRM_INTERRUPT_STATUS, &intr_sts);
+- intr_sts_masked = intr_sts & swrm->intr_mask;
++ ctrl->reg_write(ctrl, SWRM_INTERRUPT_CLEAR, intr_sts);
++ ctrl->reg_read(ctrl, SWRM_INTERRUPT_STATUS, &intr_sts);
++ intr_sts_masked = intr_sts & ctrl->intr_mask;
+ } while (intr_sts_masked);
+
+- clk_disable_unprepare(swrm->hclk);
++ clk_disable_unprepare(ctrl->hclk);
+ return ret;
+ }
+
+@@ -1090,11 +1091,12 @@ static int qcom_swrm_startup(struct snd_pcm_substream *substream,
+ struct snd_soc_dai *codec_dai;
+ int ret, i;
+
+- ret = pm_runtime_resume_and_get(ctrl->dev);
++ ret = pm_runtime_get_sync(ctrl->dev);
+ if (ret < 0 && ret != -EACCES) {
+ dev_err_ratelimited(ctrl->dev,
+- "pm_runtime_resume_and_get failed in %s, ret %d\n",
++ "pm_runtime_get_sync failed in %s, ret %d\n",
+ __func__, ret);
++ pm_runtime_put_noidle(ctrl->dev);
+ return ret;
+ }
+
+@@ -1292,23 +1294,24 @@ static int qcom_swrm_get_port_config(struct qcom_swrm_ctrl *ctrl)
+ #ifdef CONFIG_DEBUG_FS
+ static int swrm_reg_show(struct seq_file *s_file, void *data)
+ {
+- struct qcom_swrm_ctrl *swrm = s_file->private;
++ struct qcom_swrm_ctrl *ctrl = s_file->private;
+ int reg, reg_val, ret;
+
+- ret = pm_runtime_resume_and_get(swrm->dev);
++ ret = pm_runtime_get_sync(ctrl->dev);
+ if (ret < 0 && ret != -EACCES) {
+- dev_err_ratelimited(swrm->dev,
+- "pm_runtime_resume_and_get failed in %s, ret %d\n",
++ dev_err_ratelimited(ctrl->dev,
++ "pm_runtime_get_sync failed in %s, ret %d\n",
+ __func__, ret);
++ pm_runtime_put_noidle(ctrl->dev);
+ return ret;
+ }
+
+ for (reg = 0; reg <= SWR_MSTR_MAX_REG_ADDR; reg += 4) {
+-		swrm->reg_read(swrm, reg, &reg_val);
++		ctrl->reg_read(ctrl, reg, &reg_val);
+ seq_printf(s_file, "0x%.3x: 0x%.2x\n", reg, reg_val);
+ }
+- pm_runtime_mark_last_busy(swrm->dev);
+- pm_runtime_put_autosuspend(swrm->dev);
++ pm_runtime_mark_last_busy(ctrl->dev);
++ pm_runtime_put_autosuspend(ctrl->dev);
+
+
+ return 0;
+@@ -1489,13 +1492,13 @@ static int qcom_swrm_remove(struct platform_device *pdev)
+ return 0;
+ }
+
+-static bool swrm_wait_for_frame_gen_enabled(struct qcom_swrm_ctrl *swrm)
++static bool swrm_wait_for_frame_gen_enabled(struct qcom_swrm_ctrl *ctrl)
+ {
+ int retry = SWRM_LINK_STATUS_RETRY_CNT;
+ int comp_sts;
+
+ do {
+- swrm->reg_read(swrm, SWRM_COMP_STATUS, &comp_sts);
++ ctrl->reg_read(ctrl, SWRM_COMP_STATUS, &comp_sts);
+
+ if (comp_sts & SWRM_FRM_GEN_ENABLED)
+ return true;
+@@ -1503,7 +1506,7 @@ static bool swrm_wait_for_frame_gen_enabled(struct qcom_swrm_ctrl *swrm)
+ usleep_range(500, 510);
+ } while (retry--);
+
+- dev_err(swrm->dev, "%s: link status not %s\n", __func__,
++ dev_err(ctrl->dev, "%s: link status not %s\n", __func__,
+ comp_sts & SWRM_FRM_GEN_ENABLED ? "connected" : "disconnected");
+
+ return false;
+diff --git a/drivers/spi/spi-bcm-qspi.c b/drivers/spi/spi-bcm-qspi.c
+index 6b46a3b67c416..d91dfbe47aa50 100644
+--- a/drivers/spi/spi-bcm-qspi.c
++++ b/drivers/spi/spi-bcm-qspi.c
+@@ -1543,13 +1543,9 @@ int bcm_qspi_probe(struct platform_device *pdev,
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+ "mspi");
+
+- if (res) {
+- qspi->base[MSPI] = devm_ioremap_resource(dev, res);
+- if (IS_ERR(qspi->base[MSPI]))
+- return PTR_ERR(qspi->base[MSPI]);
+- } else {
+- return 0;
+- }
++ qspi->base[MSPI] = devm_ioremap_resource(dev, res);
++ if (IS_ERR(qspi->base[MSPI]))
++ return PTR_ERR(qspi->base[MSPI]);
+
+ res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "bspi");
+ if (res) {
+diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
+index ae3108c70f508..7778b19bcb6c6 100644
+--- a/drivers/spi/spi-dw-core.c
++++ b/drivers/spi/spi-dw-core.c
+@@ -426,7 +426,10 @@ static int dw_spi_transfer_one(struct spi_controller *master,
+ int ret;
+
+ dws->dma_mapped = 0;
+- dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
++ dws->n_bytes =
++ roundup_pow_of_two(DIV_ROUND_UP(transfer->bits_per_word,
++ BITS_PER_BYTE));
++
+ dws->tx = (void *)transfer->tx_buf;
+ dws->tx_len = transfer->len / dws->n_bytes;
+ dws->rx = transfer->rx_buf;
+diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
+index b293428760bc6..1df9d4844a68d 100644
+--- a/drivers/spi/spi-geni-qcom.c
++++ b/drivers/spi/spi-geni-qcom.c
+@@ -35,7 +35,7 @@
+ #define CS_DEMUX_OUTPUT_SEL GENMASK(3, 0)
+
+ #define SE_SPI_TRANS_CFG 0x25c
+-#define CS_TOGGLE BIT(0)
++#define CS_TOGGLE BIT(1)
+
+ #define SE_SPI_WORD_LEN 0x268
+ #define WORD_LEN_MSK GENMASK(9, 0)
+@@ -97,8 +97,6 @@ struct spi_geni_master {
+ struct dma_chan *tx;
+ struct dma_chan *rx;
+ int cur_xfer_mode;
+- dma_addr_t tx_se_dma;
+- dma_addr_t rx_se_dma;
+ };
+
+ static int get_spi_clk_cfg(unsigned int speed_hz,
+@@ -174,7 +172,7 @@ static void handle_se_timeout(struct spi_master *spi,
+ unmap_if_dma:
+ if (mas->cur_xfer_mode == GENI_SE_DMA) {
+ if (xfer) {
+- if (xfer->tx_buf && mas->tx_se_dma) {
++ if (xfer->tx_buf) {
+ spin_lock_irq(&mas->lock);
+ reinit_completion(&mas->tx_reset_done);
+ writel(1, se->base + SE_DMA_TX_FSM_RST);
+@@ -182,9 +180,8 @@ unmap_if_dma:
+ time_left = wait_for_completion_timeout(&mas->tx_reset_done, HZ);
+ if (!time_left)
+ dev_err(mas->dev, "DMA TX RESET failed\n");
+- geni_se_tx_dma_unprep(se, mas->tx_se_dma, xfer->len);
+ }
+- if (xfer->rx_buf && mas->rx_se_dma) {
++ if (xfer->rx_buf) {
+ spin_lock_irq(&mas->lock);
+ reinit_completion(&mas->rx_reset_done);
+ writel(1, se->base + SE_DMA_RX_FSM_RST);
+@@ -192,7 +189,6 @@ unmap_if_dma:
+ time_left = wait_for_completion_timeout(&mas->rx_reset_done, HZ);
+ if (!time_left)
+ dev_err(mas->dev, "DMA RX RESET failed\n");
+- geni_se_rx_dma_unprep(se, mas->rx_se_dma, xfer->len);
+ }
+ } else {
+ /*
+@@ -523,17 +519,36 @@ static int setup_gsi_xfer(struct spi_transfer *xfer, struct spi_geni_master *mas
+ return 1;
+ }
+
++static u32 get_xfer_len_in_words(struct spi_transfer *xfer,
++ struct spi_geni_master *mas)
++{
++ u32 len;
++
++ if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
++ len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
++ else
++ len = xfer->len / (mas->cur_bits_per_word / BITS_PER_BYTE + 1);
++ len &= TRANS_LEN_MSK;
++
++ return len;
++}
++
+ static bool geni_can_dma(struct spi_controller *ctlr,
+ struct spi_device *slv, struct spi_transfer *xfer)
+ {
+ struct spi_geni_master *mas = spi_master_get_devdata(slv->master);
++ u32 len, fifo_size;
+
+- /*
+- * Return true if transfer needs to be mapped prior to
+- * calling transfer_one which is the case only for GPI_DMA.
+- * For SE_DMA mode, map/unmap is done in geni_se_*x_dma_prep.
+- */
+- return mas->cur_xfer_mode == GENI_GPI_DMA;
++ if (mas->cur_xfer_mode == GENI_GPI_DMA)
++ return true;
++
++ len = get_xfer_len_in_words(xfer, mas);
++ fifo_size = mas->tx_fifo_depth * mas->fifo_width_bits / mas->cur_bits_per_word;
++
++ if (len > fifo_size)
++ return true;
++ else
++ return false;
+ }
+
+ static int spi_geni_prepare_message(struct spi_master *spi,
+@@ -774,7 +789,7 @@ static int setup_se_xfer(struct spi_transfer *xfer,
+ u16 mode, struct spi_master *spi)
+ {
+ u32 m_cmd = 0;
+- u32 len, fifo_size;
++ u32 len;
+ struct geni_se *se = &mas->se;
+ int ret;
+
+@@ -806,11 +821,7 @@ static int setup_se_xfer(struct spi_transfer *xfer,
+ mas->tx_rem_bytes = 0;
+ mas->rx_rem_bytes = 0;
+
+- if (!(mas->cur_bits_per_word % MIN_WORD_LEN))
+- len = xfer->len * BITS_PER_BYTE / mas->cur_bits_per_word;
+- else
+- len = xfer->len / (mas->cur_bits_per_word / BITS_PER_BYTE + 1);
+- len &= TRANS_LEN_MSK;
++ len = get_xfer_len_in_words(xfer, mas);
+
+ mas->cur_xfer = xfer;
+ if (xfer->tx_buf) {
+@@ -825,9 +836,20 @@ static int setup_se_xfer(struct spi_transfer *xfer,
+ mas->rx_rem_bytes = xfer->len;
+ }
+
+- /* Select transfer mode based on transfer length */
+- fifo_size = mas->tx_fifo_depth * mas->fifo_width_bits / mas->cur_bits_per_word;
+- mas->cur_xfer_mode = (len <= fifo_size) ? GENI_SE_FIFO : GENI_SE_DMA;
++ /*
++	 * Select DMA mode if an sgt is present, and only with a single entry.
++	 * This is not a serious limitation because the xfer buffers are
++	 * expected to fit into one entry almost always, and if any
++	 * doesn't for any reason we fall back to FIFO mode anyway.
++ */
++ if (!xfer->tx_sg.nents && !xfer->rx_sg.nents)
++ mas->cur_xfer_mode = GENI_SE_FIFO;
++ else if (xfer->tx_sg.nents > 1 || xfer->rx_sg.nents > 1) {
++ dev_warn_once(mas->dev, "Doing FIFO, cannot handle tx_nents-%d, rx_nents-%d\n",
++ xfer->tx_sg.nents, xfer->rx_sg.nents);
++ mas->cur_xfer_mode = GENI_SE_FIFO;
++ } else
++ mas->cur_xfer_mode = GENI_SE_DMA;
+ geni_se_select_mode(se, mas->cur_xfer_mode);
+
+ /*
+@@ -838,35 +860,17 @@ static int setup_se_xfer(struct spi_transfer *xfer,
+ geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
+
+ if (mas->cur_xfer_mode == GENI_SE_DMA) {
+- if (m_cmd & SPI_RX_ONLY) {
+- ret = geni_se_rx_dma_prep(se, xfer->rx_buf,
+- xfer->len, &mas->rx_se_dma);
+- if (ret) {
+- dev_err(mas->dev, "Failed to setup Rx dma %d\n", ret);
+- mas->rx_se_dma = 0;
+- goto unlock_and_return;
+- }
+- }
+- if (m_cmd & SPI_TX_ONLY) {
+- ret = geni_se_tx_dma_prep(se, (void *)xfer->tx_buf,
+- xfer->len, &mas->tx_se_dma);
+- if (ret) {
+- dev_err(mas->dev, "Failed to setup Tx dma %d\n", ret);
+- mas->tx_se_dma = 0;
+- if (m_cmd & SPI_RX_ONLY) {
+- /* Unmap rx buffer if duplex transfer */
+- geni_se_rx_dma_unprep(se, mas->rx_se_dma, xfer->len);
+- mas->rx_se_dma = 0;
+- }
+- goto unlock_and_return;
+- }
+- }
++ if (m_cmd & SPI_RX_ONLY)
++ geni_se_rx_init_dma(se, sg_dma_address(xfer->rx_sg.sgl),
++ sg_dma_len(xfer->rx_sg.sgl));
++ if (m_cmd & SPI_TX_ONLY)
++ geni_se_tx_init_dma(se, sg_dma_address(xfer->tx_sg.sgl),
++ sg_dma_len(xfer->tx_sg.sgl));
+ } else if (m_cmd & SPI_TX_ONLY) {
+ if (geni_spi_handle_tx(mas))
+ writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
+ }
+
+-unlock_and_return:
+ spin_unlock_irq(&mas->lock);
+ return ret;
+ }
+@@ -967,14 +971,6 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
+ if (dma_rx_status & RX_RESET_DONE)
+ complete(&mas->rx_reset_done);
+ if (!mas->tx_rem_bytes && !mas->rx_rem_bytes && xfer) {
+- if (xfer->tx_buf && mas->tx_se_dma) {
+- geni_se_tx_dma_unprep(se, mas->tx_se_dma, xfer->len);
+- mas->tx_se_dma = 0;
+- }
+- if (xfer->rx_buf && mas->rx_se_dma) {
+- geni_se_rx_dma_unprep(se, mas->rx_se_dma, xfer->len);
+- mas->rx_se_dma = 0;
+- }
+ spi_finalize_current_transfer(spi);
+ mas->cur_xfer = NULL;
+ }
+@@ -1059,6 +1055,7 @@ static int spi_geni_probe(struct platform_device *pdev)
+ spi->bits_per_word_mask = SPI_BPW_RANGE_MASK(4, 32);
+ spi->num_chipselect = 4;
+ spi->max_speed_hz = 50000000;
++ spi->max_dma_len = 0xffff0; /* 24 bits for tx/rx dma length */
+ spi->prepare_message = spi_geni_prepare_message;
+ spi->transfer_one = spi_geni_transfer_one;
+ spi->can_dma = geni_can_dma;
+@@ -1100,6 +1097,12 @@ static int spi_geni_probe(struct platform_device *pdev)
+ if (mas->cur_xfer_mode == GENI_SE_FIFO)
+ spi->set_cs = spi_geni_set_cs;
+
++ /*
++ * TX is required per GSI spec, see setup_gsi_xfer().
++ */
++ if (mas->cur_xfer_mode == GENI_GPI_DMA)
++ spi->flags = SPI_CONTROLLER_MUST_TX;
++
+ ret = request_irq(mas->irq, geni_spi_isr, 0, dev_name(dev), spi);
+ if (ret)
+ goto spi_geni_release_dma;
+diff --git a/drivers/staging/media/atomisp/i2c/atomisp-gc0310.c b/drivers/staging/media/atomisp/i2c/atomisp-gc0310.c
+index 273155308fe36..eb6db1571dc0d 100644
+--- a/drivers/staging/media/atomisp/i2c/atomisp-gc0310.c
++++ b/drivers/staging/media/atomisp/i2c/atomisp-gc0310.c
+@@ -377,8 +377,8 @@ static void gc0310_remove(struct i2c_client *client)
+ v4l2_device_unregister_subdev(sd);
+ media_entity_cleanup(&dev->sd.entity);
+ v4l2_ctrl_handler_free(&dev->ctrls.handler);
++ mutex_destroy(&dev->input_lock);
+ pm_runtime_disable(&client->dev);
+- kfree(dev);
+ }
+
+ static int gc0310_probe(struct i2c_client *client)
+diff --git a/drivers/staging/media/atomisp/i2c/atomisp-ov2680.c b/drivers/staging/media/atomisp/i2c/atomisp-ov2680.c
+index c079368019e87..3a6bc3e56b10e 100644
+--- a/drivers/staging/media/atomisp/i2c/atomisp-ov2680.c
++++ b/drivers/staging/media/atomisp/i2c/atomisp-ov2680.c
+@@ -239,27 +239,21 @@ static void ov2680_calc_mode(struct ov2680_device *sensor, int width, int height
+ static int ov2680_set_mode(struct ov2680_device *sensor)
+ {
+ struct i2c_client *client = sensor->client;
+- u8 pll_div, unknown, inc, fmt1, fmt2;
++ u8 unknown, inc, fmt1, fmt2;
+ int ret;
+
+ if (sensor->mode.binning) {
+- pll_div = 1;
+ unknown = 0x23;
+ inc = 0x31;
+ fmt1 = 0xc2;
+ fmt2 = 0x01;
+ } else {
+- pll_div = 0;
+ unknown = 0x21;
+ inc = 0x11;
+ fmt1 = 0xc0;
+ fmt2 = 0x00;
+ }
+
+- ret = ov_write_reg8(client, 0x3086, pll_div);
+- if (ret)
+- return ret;
+-
+ ret = ov_write_reg8(client, 0x370a, unknown);
+ if (ret)
+ return ret;
+diff --git a/drivers/staging/media/atomisp/i2c/ov2680.h b/drivers/staging/media/atomisp/i2c/ov2680.h
+index baf49eb0659e3..eed18d6943370 100644
+--- a/drivers/staging/media/atomisp/i2c/ov2680.h
++++ b/drivers/staging/media/atomisp/i2c/ov2680.h
+@@ -172,6 +172,7 @@ static struct ov2680_reg const ov2680_global_setting[] = {
+ {0x3082, 0x45},
+ {0x3084, 0x09},
+ {0x3085, 0x04},
++ {0x3086, 0x00},
+ {0x3503, 0x03},
+ {0x350b, 0x36},
+ {0x3600, 0xb4},
+diff --git a/drivers/staging/media/atomisp/pci/atomisp_gmin_platform.c b/drivers/staging/media/atomisp/pci/atomisp_gmin_platform.c
+index c718a74ea70a3..88d4499233b98 100644
+--- a/drivers/staging/media/atomisp/pci/atomisp_gmin_platform.c
++++ b/drivers/staging/media/atomisp/pci/atomisp_gmin_platform.c
+@@ -1357,7 +1357,7 @@ static int gmin_get_config_dsm_var(struct device *dev,
+ dev_info(dev, "found _DSM entry for '%s': %s\n", var,
+ cur->string.pointer);
+ strscpy(out, cur->string.pointer, *out_len);
+- *out_len = strlen(cur->string.pointer);
++ *out_len = strlen(out);
+
+ ACPI_FREE(obj);
+ return 0;
+diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
+index 90a3958d1f297..aa2313f3bcab8 100644
+--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
++++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c
+@@ -415,7 +415,7 @@ free_pagelist(struct vchiq_instance *instance, struct vchiq_pagelist_info *pagel
+ pagelistinfo->scatterlist_mapped = 0;
+
+ /* Deal with any partial cache lines (fragments) */
+- if (pagelist->type >= PAGELIST_READ_WITH_FRAGMENTS) {
++ if (pagelist->type >= PAGELIST_READ_WITH_FRAGMENTS && g_fragments_base) {
+ char *fragments = g_fragments_base +
+ (pagelist->type - PAGELIST_READ_WITH_FRAGMENTS) *
+ g_fragments_size;
+@@ -462,7 +462,7 @@ free_pagelist(struct vchiq_instance *instance, struct vchiq_pagelist_info *pagel
+ cleanup_pagelistinfo(instance, pagelistinfo);
+ }
+
+-int vchiq_platform_init(struct platform_device *pdev, struct vchiq_state *state)
++static int vchiq_platform_init(struct platform_device *pdev, struct vchiq_state *state)
+ {
+ struct device *dev = &pdev->dev;
+ struct vchiq_drvdata *drvdata = platform_get_drvdata(pdev);
+diff --git a/drivers/thermal/qcom/tsens-v0_1.c b/drivers/thermal/qcom/tsens-v0_1.c
+index e89c6f39a3aea..e9ce7b62b3818 100644
+--- a/drivers/thermal/qcom/tsens-v0_1.c
++++ b/drivers/thermal/qcom/tsens-v0_1.c
+@@ -243,6 +243,18 @@ static int calibrate_8974(struct tsens_priv *priv)
+ return 0;
+ }
+
++static int __init init_8226(struct tsens_priv *priv)
++{
++ priv->sensor[0].slope = 2901;
++ priv->sensor[1].slope = 2846;
++ priv->sensor[2].slope = 3038;
++ priv->sensor[3].slope = 2955;
++ priv->sensor[4].slope = 2901;
++ priv->sensor[5].slope = 2846;
++
++ return init_common(priv);
++}
++
+ static int __init init_8939(struct tsens_priv *priv) {
+ priv->sensor[0].slope = 2911;
+ priv->sensor[1].slope = 2789;
+@@ -258,7 +270,28 @@ static int __init init_8939(struct tsens_priv *priv) {
+ return init_common(priv);
+ }
+
+-/* v0.1: 8916, 8939, 8974, 9607 */
++static int __init init_9607(struct tsens_priv *priv)
++{
++ int i;
++
++ for (i = 0; i < priv->num_sensors; ++i)
++ priv->sensor[i].slope = 3000;
++
++ priv->sensor[0].p1_calib_offset = 1;
++ priv->sensor[0].p2_calib_offset = 1;
++ priv->sensor[1].p1_calib_offset = -4;
++ priv->sensor[1].p2_calib_offset = -2;
++ priv->sensor[2].p1_calib_offset = 4;
++ priv->sensor[2].p2_calib_offset = 8;
++ priv->sensor[3].p1_calib_offset = -3;
++ priv->sensor[3].p2_calib_offset = -5;
++ priv->sensor[4].p1_calib_offset = -4;
++ priv->sensor[4].p2_calib_offset = -4;
++
++ return init_common(priv);
++}
++
++/* v0.1: 8226, 8916, 8939, 8974, 9607 */
+
+ static struct tsens_features tsens_v0_1_feat = {
+ .ver_major = VER_0_1,
+@@ -313,6 +346,19 @@ static const struct tsens_ops ops_v0_1 = {
+ .get_temp = get_temp_common,
+ };
+
++static const struct tsens_ops ops_8226 = {
++ .init = init_8226,
++ .calibrate = tsens_calibrate_common,
++ .get_temp = get_temp_common,
++};
++
++struct tsens_plat_data data_8226 = {
++ .num_sensors = 6,
++ .ops = &ops_8226,
++ .feat = &tsens_v0_1_feat,
++ .fields = tsens_v0_1_regfields,
++};
++
+ static const struct tsens_ops ops_8916 = {
+ .init = init_common,
+ .calibrate = calibrate_8916,
+@@ -356,9 +402,15 @@ struct tsens_plat_data data_8974 = {
+ .fields = tsens_v0_1_regfields,
+ };
+
++static const struct tsens_ops ops_9607 = {
++ .init = init_9607,
++ .calibrate = tsens_calibrate_common,
++ .get_temp = get_temp_common,
++};
++
+ struct tsens_plat_data data_9607 = {
+ .num_sensors = 5,
+- .ops = &ops_v0_1,
++ .ops = &ops_9607,
+ .feat = &tsens_v0_1_feat,
+ .fields = tsens_v0_1_regfields,
+ };
+diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
+index d3218127e617d..9dd5e4b709117 100644
+--- a/drivers/thermal/qcom/tsens.c
++++ b/drivers/thermal/qcom/tsens.c
+@@ -134,10 +134,12 @@ int tsens_read_calibration(struct tsens_priv *priv, int shift, u32 *p1, u32 *p2,
+ p1[i] = p1[i] + (base1 << shift);
+ break;
+ case TWO_PT_CALIB:
++ case TWO_PT_CALIB_NO_OFFSET:
+ for (i = 0; i < priv->num_sensors; i++)
+ p2[i] = (p2[i] + base2) << shift;
+ fallthrough;
+ case ONE_PT_CALIB2:
++ case ONE_PT_CALIB2_NO_OFFSET:
+ for (i = 0; i < priv->num_sensors; i++)
+ p1[i] = (p1[i] + base1) << shift;
+ break;
+@@ -149,6 +151,18 @@ int tsens_read_calibration(struct tsens_priv *priv, int shift, u32 *p1, u32 *p2,
+ }
+ }
+
++ /* Apply calibration offset workaround except for _NO_OFFSET modes */
++ switch (mode) {
++ case TWO_PT_CALIB:
++ for (i = 0; i < priv->num_sensors; i++)
++ p2[i] += priv->sensor[i].p2_calib_offset;
++ fallthrough;
++ case ONE_PT_CALIB2:
++ for (i = 0; i < priv->num_sensors; i++)
++ p1[i] += priv->sensor[i].p1_calib_offset;
++ break;
++ }
++
+ return mode;
+ }
+
+@@ -254,7 +268,7 @@ void compute_intercept_slope(struct tsens_priv *priv, u32 *p1,
+
+ if (!priv->sensor[i].slope)
+ priv->sensor[i].slope = SLOPE_DEFAULT;
+- if (mode == TWO_PT_CALIB) {
++ if (mode == TWO_PT_CALIB || mode == TWO_PT_CALIB_NO_OFFSET) {
+ /*
+ * slope (m) = adc_code2 - adc_code1 (y2 - y1)/
+ * temp_120_degc - temp_30_degc (x2 - x1)
+@@ -1095,6 +1109,9 @@ static const struct of_device_id tsens_table[] = {
+ }, {
+ .compatible = "qcom,mdm9607-tsens",
+ .data = &data_9607,
++ }, {
++ .compatible = "qcom,msm8226-tsens",
++ .data = &data_8226,
+ }, {
+ .compatible = "qcom,msm8916-tsens",
+ .data = &data_8916,
+diff --git a/drivers/thermal/qcom/tsens.h b/drivers/thermal/qcom/tsens.h
+index dba9cd38f637c..1cd8f4fe0971f 100644
+--- a/drivers/thermal/qcom/tsens.h
++++ b/drivers/thermal/qcom/tsens.h
+@@ -10,6 +10,8 @@
+ #define ONE_PT_CALIB 0x1
+ #define ONE_PT_CALIB2 0x2
+ #define TWO_PT_CALIB 0x3
++#define ONE_PT_CALIB2_NO_OFFSET 0x6
++#define TWO_PT_CALIB_NO_OFFSET 0x7
+ #define CAL_DEGC_PT1 30
+ #define CAL_DEGC_PT2 120
+ #define SLOPE_FACTOR 1000
+@@ -57,6 +59,8 @@ struct tsens_sensor {
+ unsigned int hw_id;
+ int slope;
+ u32 status;
++ int p1_calib_offset;
++ int p2_calib_offset;
+ };
+
+ /**
+@@ -635,7 +639,7 @@ int get_temp_common(const struct tsens_sensor *s, int *temp);
+ extern struct tsens_plat_data data_8960;
+
+ /* TSENS v0.1 targets */
+-extern struct tsens_plat_data data_8916, data_8939, data_8974, data_9607;
++extern struct tsens_plat_data data_8226, data_8916, data_8939, data_8974, data_9607;
+
+ /* TSENS v1 targets */
+ extern struct tsens_plat_data data_tsens_v1, data_8976, data_8956;
+diff --git a/drivers/thermal/qoriq_thermal.c b/drivers/thermal/qoriq_thermal.c
+index e58756323457e..3eca7085d9efe 100644
+--- a/drivers/thermal/qoriq_thermal.c
++++ b/drivers/thermal/qoriq_thermal.c
+@@ -31,7 +31,6 @@
+ #define TMR_DISABLE 0x0
+ #define TMR_ME 0x80000000
+ #define TMR_ALPF 0x0c000000
+-#define TMR_MSITE_ALL GENMASK(15, 0)
+
+ #define REGS_TMTMIR 0x008 /* Temperature measurement interval Register */
+ #define TMTMIR_DEFAULT 0x0000000f
+@@ -105,6 +104,11 @@ static int tmu_get_temp(struct thermal_zone_device *tz, int *temp)
+ * within sensor range. TEMP is an 9 bit value representing
+ * temperature in KelVin.
+ */
++
++ regmap_read(qdata->regmap, REGS_TMR, &val);
++ if (!(val & TMR_ME))
++ return -EAGAIN;
++
+ if (regmap_read_poll_timeout(qdata->regmap,
+ REGS_TRITSR(qsensor->id),
+ val,
+@@ -128,15 +132,7 @@ static const struct thermal_zone_device_ops tmu_tz_ops = {
+ static int qoriq_tmu_register_tmu_zone(struct device *dev,
+ struct qoriq_tmu_data *qdata)
+ {
+- int id;
+-
+- if (qdata->ver == TMU_VER1) {
+- regmap_write(qdata->regmap, REGS_TMR,
+- TMR_MSITE_ALL | TMR_ME | TMR_ALPF);
+- } else {
+- regmap_write(qdata->regmap, REGS_V2_TMSR, TMR_MSITE_ALL);
+- regmap_write(qdata->regmap, REGS_TMR, TMR_ME | TMR_ALPF_V2);
+- }
++ int id, sites = 0;
+
+ for (id = 0; id < SITES_MAX; id++) {
+ struct thermal_zone_device *tzd;
+@@ -153,14 +149,26 @@ static int qoriq_tmu_register_tmu_zone(struct device *dev,
+ if (ret == -ENODEV)
+ continue;
+
+- regmap_write(qdata->regmap, REGS_TMR, TMR_DISABLE);
+ return ret;
+ }
+
++ if (qdata->ver == TMU_VER1)
++ sites |= 0x1 << (15 - id);
++ else
++ sites |= 0x1 << id;
++
+ if (devm_thermal_add_hwmon_sysfs(dev, tzd))
+ dev_warn(dev,
+ "Failed to add hwmon sysfs attributes\n");
++ }
+
++ if (sites) {
++ if (qdata->ver == TMU_VER1) {
++ regmap_write(qdata->regmap, REGS_TMR, TMR_ME | TMR_ALPF | sites);
++ } else {
++ regmap_write(qdata->regmap, REGS_V2_TMSR, sites);
++ regmap_write(qdata->regmap, REGS_TMR, TMR_ME | TMR_ALPF_V2);
++ }
+ }
+
+ return 0;
+diff --git a/drivers/thermal/sun8i_thermal.c b/drivers/thermal/sun8i_thermal.c
+index 793ddce72132f..d4d241686c810 100644
+--- a/drivers/thermal/sun8i_thermal.c
++++ b/drivers/thermal/sun8i_thermal.c
+@@ -319,6 +319,11 @@ out:
+ return ret;
+ }
+
++static void sun8i_ths_reset_control_assert(void *data)
++{
++ reset_control_assert(data);
++}
++
+ static int sun8i_ths_resource_init(struct ths_device *tmdev)
+ {
+ struct device *dev = tmdev->dev;
+@@ -339,47 +344,35 @@ static int sun8i_ths_resource_init(struct ths_device *tmdev)
+ if (IS_ERR(tmdev->reset))
+ return PTR_ERR(tmdev->reset);
+
+- tmdev->bus_clk = devm_clk_get(&pdev->dev, "bus");
++ ret = reset_control_deassert(tmdev->reset);
++ if (ret)
++ return ret;
++
++ ret = devm_add_action_or_reset(dev, sun8i_ths_reset_control_assert,
++ tmdev->reset);
++ if (ret)
++ return ret;
++
++ tmdev->bus_clk = devm_clk_get_enabled(&pdev->dev, "bus");
+ if (IS_ERR(tmdev->bus_clk))
+ return PTR_ERR(tmdev->bus_clk);
+ }
+
+ if (tmdev->chip->has_mod_clk) {
+- tmdev->mod_clk = devm_clk_get(&pdev->dev, "mod");
++ tmdev->mod_clk = devm_clk_get_enabled(&pdev->dev, "mod");
+ if (IS_ERR(tmdev->mod_clk))
+ return PTR_ERR(tmdev->mod_clk);
+ }
+
+- ret = reset_control_deassert(tmdev->reset);
+- if (ret)
+- return ret;
+-
+- ret = clk_prepare_enable(tmdev->bus_clk);
+- if (ret)
+- goto assert_reset;
+-
+ ret = clk_set_rate(tmdev->mod_clk, 24000000);
+ if (ret)
+- goto bus_disable;
+-
+- ret = clk_prepare_enable(tmdev->mod_clk);
+- if (ret)
+- goto bus_disable;
++ return ret;
+
+ ret = sun8i_ths_calibrate(tmdev);
+ if (ret)
+- goto mod_disable;
++ return ret;
+
+ return 0;
+-
+-mod_disable:
+- clk_disable_unprepare(tmdev->mod_clk);
+-bus_disable:
+- clk_disable_unprepare(tmdev->bus_clk);
+-assert_reset:
+- reset_control_assert(tmdev->reset);
+-
+- return ret;
+ }
+
+ static int sun8i_h3_thermal_init(struct ths_device *tmdev)
+@@ -530,17 +523,6 @@ static int sun8i_ths_probe(struct platform_device *pdev)
+ return 0;
+ }
+
+-static int sun8i_ths_remove(struct platform_device *pdev)
+-{
+- struct ths_device *tmdev = platform_get_drvdata(pdev);
+-
+- clk_disable_unprepare(tmdev->mod_clk);
+- clk_disable_unprepare(tmdev->bus_clk);
+- reset_control_assert(tmdev->reset);
+-
+- return 0;
+-}
+-
+ static const struct ths_thermal_chip sun8i_a83t_ths = {
+ .sensor_num = 3,
+ .scale = 705,
+@@ -642,7 +624,6 @@ MODULE_DEVICE_TABLE(of, of_ths_match);
+
+ static struct platform_driver ths_driver = {
+ .probe = sun8i_ths_probe,
+- .remove = sun8i_ths_remove,
+ .driver = {
+ .name = "sun8i-thermal",
+ .of_match_table = of_ths_match,
+diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
+index 734f092ef839a..b758e9b613c74 100644
+--- a/drivers/tty/serial/8250/8250_omap.c
++++ b/drivers/tty/serial/8250/8250_omap.c
+@@ -649,6 +649,8 @@ static irqreturn_t omap8250_irq(int irq, void *dev_id)
+ if ((lsr & UART_LSR_OE) && up->overrun_backoff_time_ms > 0) {
+ unsigned long delay;
+
++ /* Synchronize UART_IER access against the console. */
++ spin_lock(&port->lock);
+ up->ier = port->serial_in(port, UART_IER);
+ if (up->ier & (UART_IER_RLSI | UART_IER_RDI)) {
+ port->ops->stop_rx(port);
+@@ -658,6 +660,7 @@ static irqreturn_t omap8250_irq(int irq, void *dev_id)
+ */
+ cancel_delayed_work(&up->overrun_backoff);
+ }
++ spin_unlock(&port->lock);
+
+ delay = msecs_to_jiffies(up->overrun_backoff_time_ms);
+ schedule_delayed_work(&up->overrun_backoff, delay);
+@@ -1532,7 +1535,9 @@ static int omap8250_probe(struct platform_device *pdev)
+ err:
+ pm_runtime_dont_use_autosuspend(&pdev->dev);
+ pm_runtime_put_sync(&pdev->dev);
++ flush_work(&priv->qos_work);
+ pm_runtime_disable(&pdev->dev);
++ cpu_latency_qos_remove_request(&priv->pm_qos_request);
+ return ret;
+ }
+
+@@ -1579,25 +1584,35 @@ static int omap8250_suspend(struct device *dev)
+ {
+ struct omap8250_priv *priv = dev_get_drvdata(dev);
+ struct uart_8250_port *up = serial8250_get_port(priv->line);
++ int err;
+
+ serial8250_suspend_port(priv->line);
+
+- pm_runtime_get_sync(dev);
++ err = pm_runtime_resume_and_get(dev);
++ if (err)
++ return err;
+ if (!device_may_wakeup(dev))
+ priv->wer = 0;
+ serial_out(up, UART_OMAP_WER, priv->wer);
+- pm_runtime_mark_last_busy(dev);
+- pm_runtime_put_autosuspend(dev);
+-
++ err = pm_runtime_force_suspend(dev);
+ flush_work(&priv->qos_work);
+- return 0;
++
++ return err;
+ }
+
+ static int omap8250_resume(struct device *dev)
+ {
+ struct omap8250_priv *priv = dev_get_drvdata(dev);
++ int err;
+
++ err = pm_runtime_force_resume(dev);
++ if (err)
++ return err;
+ serial8250_resume_port(priv->line);
++ /* Paired with pm_runtime_resume_and_get() in omap8250_suspend() */
++ pm_runtime_mark_last_busy(dev);
++ pm_runtime_put_autosuspend(dev);
++
+ return 0;
+ }
+ #else
+diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
+index 7fd30fcc10c62..f38606b750967 100644
+--- a/drivers/tty/serial/fsl_lpuart.c
++++ b/drivers/tty/serial/fsl_lpuart.c
+@@ -2676,6 +2676,7 @@ OF_EARLYCON_DECLARE(lpuart, "fsl,vf610-lpuart", lpuart_early_console_setup);
+ OF_EARLYCON_DECLARE(lpuart32, "fsl,ls1021a-lpuart", lpuart32_early_console_setup);
+ OF_EARLYCON_DECLARE(lpuart32, "fsl,ls1028a-lpuart", ls1028a_early_console_setup);
+ OF_EARLYCON_DECLARE(lpuart32, "fsl,imx7ulp-lpuart", lpuart32_imx_early_console_setup);
++OF_EARLYCON_DECLARE(lpuart32, "fsl,imx8ulp-lpuart", lpuart32_imx_early_console_setup);
+ OF_EARLYCON_DECLARE(lpuart32, "fsl,imx8qxp-lpuart", lpuart32_imx_early_console_setup);
+ OF_EARLYCON_DECLARE(lpuart32, "fsl,imxrt1050-lpuart", lpuart32_imx_early_console_setup);
+ EARLYCON_DECLARE(lpuart, lpuart_early_console_setup);
+diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
+index 54e82f476a2cc..ea4a70055ad9f 100644
+--- a/drivers/tty/serial/serial_core.c
++++ b/drivers/tty/serial/serial_core.c
+@@ -2333,8 +2333,11 @@ int uart_suspend_port(struct uart_driver *drv, struct uart_port *uport)
+ * able to Re-start_rx later.
+ */
+ if (!console_suspend_enabled && uart_console(uport)) {
+- if (uport->ops->start_rx)
++ if (uport->ops->start_rx) {
++ spin_lock_irq(&uport->lock);
+ uport->ops->stop_rx(uport);
++ spin_unlock_irq(&uport->lock);
++ }
+ goto unlock;
+ }
+
+@@ -2427,8 +2430,11 @@ int uart_resume_port(struct uart_driver *drv, struct uart_port *uport)
+ if (console_suspend_enabled)
+ uart_change_pm(state, UART_PM_STATE_ON);
+ uport->ops->set_termios(uport, &termios, NULL);
+- if (!console_suspend_enabled && uport->ops->start_rx)
++ if (!console_suspend_enabled && uport->ops->start_rx) {
++ spin_lock_irq(&uport->lock);
+ uport->ops->start_rx(uport);
++ spin_unlock_irq(&uport->lock);
++ }
+ if (console_suspend_enabled)
+ console_start(uport->cons);
+ }
+diff --git a/drivers/ufs/core/ufshcd-priv.h b/drivers/ufs/core/ufshcd-priv.h
+index d53b93c21a0c6..8f58c21693985 100644
+--- a/drivers/ufs/core/ufshcd-priv.h
++++ b/drivers/ufs/core/ufshcd-priv.h
+@@ -84,9 +84,6 @@ unsigned long ufshcd_mcq_poll_cqe_lock(struct ufs_hba *hba,
+ int ufshcd_read_string_desc(struct ufs_hba *hba, u8 desc_index,
+ u8 **buf, bool ascii);
+
+-int ufshcd_hold(struct ufs_hba *hba, bool async);
+-void ufshcd_release(struct ufs_hba *hba);
+-
+ int ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd);
+
+ int ufshcd_exec_raw_upiu_cmd(struct ufs_hba *hba,
+diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
+index e7e79f515e141..6d8ef80d9cbc4 100644
+--- a/drivers/ufs/core/ufshcd.c
++++ b/drivers/ufs/core/ufshcd.c
+@@ -2945,7 +2945,6 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
+ (hba->clk_gating.state != CLKS_ON));
+
+ lrbp = &hba->lrb[tag];
+- WARN_ON(lrbp->cmd);
+ lrbp->cmd = cmd;
+ lrbp->task_tag = tag;
+ lrbp->lun = ufshcd_scsi_to_upiu_lun(cmd->device->lun);
+@@ -2961,7 +2960,6 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
+
+ err = ufshcd_map_sg(hba, lrbp);
+ if (err) {
+- lrbp->cmd = NULL;
+ ufshcd_release(hba);
+ goto out;
+ }
+@@ -3099,7 +3097,7 @@ retry:
+ * not trigger any race conditions.
+ */
+ hba->dev_cmd.complete = NULL;
+- err = ufshcd_get_tr_ocs(lrbp, hba->dev_cmd.cqe);
++ err = ufshcd_get_tr_ocs(lrbp, NULL);
+ if (!err)
+ err = ufshcd_dev_cmd_completion(hba, lrbp);
+ } else {
+@@ -3180,13 +3178,12 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba *hba,
+ down_read(&hba->clk_scaling_lock);
+
+ lrbp = &hba->lrb[tag];
+- WARN_ON(lrbp->cmd);
++ lrbp->cmd = NULL;
+ err = ufshcd_compose_dev_cmd(hba, lrbp, cmd_type, tag);
+ if (unlikely(err))
+ goto out;
+
+ hba->dev_cmd.complete = &wait;
+- hba->dev_cmd.cqe = NULL;
+
+ ufshcd_add_query_upiu_trace(hba, UFS_QUERY_SEND, lrbp->ucd_req_ptr);
+
+@@ -5422,7 +5419,6 @@ static void ufshcd_release_scsi_cmd(struct ufs_hba *hba,
+ struct scsi_cmnd *cmd = lrbp->cmd;
+
+ scsi_dma_unmap(cmd);
+- lrbp->cmd = NULL; /* Mark the command as completed. */
+ ufshcd_release(hba);
+ ufshcd_clk_scaling_update_busy(hba);
+ }
+@@ -5438,6 +5434,7 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, int task_tag,
+ {
+ struct ufshcd_lrb *lrbp;
+ struct scsi_cmnd *cmd;
++ enum utp_ocs ocs;
+
+ lrbp = &hba->lrb[task_tag];
+ lrbp->compl_time_stamp = ktime_get();
+@@ -5453,8 +5450,11 @@ void ufshcd_compl_one_cqe(struct ufs_hba *hba, int task_tag,
+ } else if (lrbp->command_type == UTP_CMD_TYPE_DEV_MANAGE ||
+ lrbp->command_type == UTP_CMD_TYPE_UFS_STORAGE) {
+ if (hba->dev_cmd.complete) {
+- hba->dev_cmd.cqe = cqe;
+- ufshcd_add_command_trace(hba, task_tag, UFS_DEV_COMP);
++ if (cqe) {
++ ocs = le32_to_cpu(cqe->status) & MASK_OCS;
++ lrbp->utr_descriptor_ptr->header.dword_2 =
++ cpu_to_le32(ocs);
++ }
+ complete(hba->dev_cmd.complete);
+ ufshcd_clk_scaling_update_busy(hba);
+ }
+@@ -7037,7 +7037,6 @@ static int ufshcd_issue_devman_upiu_cmd(struct ufs_hba *hba,
+ down_read(&hba->clk_scaling_lock);
+
+ lrbp = &hba->lrb[tag];
+- WARN_ON(lrbp->cmd);
+ lrbp->cmd = NULL;
+ lrbp->task_tag = tag;
+ lrbp->lun = 0;
+@@ -7209,7 +7208,6 @@ int ufshcd_advanced_rpmb_req_handler(struct ufs_hba *hba, struct utp_upiu_req *r
+ down_read(&hba->clk_scaling_lock);
+
+ lrbp = &hba->lrb[tag];
+- WARN_ON(lrbp->cmd);
+ lrbp->cmd = NULL;
+ lrbp->task_tag = tag;
+ lrbp->lun = UFS_UPIU_RPMB_WLUN;
+@@ -9184,7 +9182,8 @@ static int ufshcd_execute_start_stop(struct scsi_device *sdev,
+ };
+
+ return scsi_execute_cmd(sdev, cdb, REQ_OP_DRV_IN, /*buffer=*/NULL,
+- /*bufflen=*/0, /*timeout=*/HZ, /*retries=*/0, &args);
++ /*bufflen=*/0, /*timeout=*/10 * HZ, /*retries=*/0,
++ &args);
+ }
+
+ /**
+diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c
+index fcf68818e9992..cbad2af5fd882 100644
+--- a/drivers/usb/core/devio.c
++++ b/drivers/usb/core/devio.c
+@@ -746,6 +746,7 @@ static int driver_resume(struct usb_interface *intf)
+ return 0;
+ }
+
++#ifdef CONFIG_PM
+ /* The following routines apply to the entire device, not interfaces */
+ void usbfs_notify_suspend(struct usb_device *udev)
+ {
+@@ -764,6 +765,7 @@ void usbfs_notify_resume(struct usb_device *udev)
+ }
+ mutex_unlock(&usbfs_mutex);
+ }
++#endif
+
+ struct usb_driver usbfs_driver = {
+ .name = "usbfs",
+diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
+index ab2f3737764e4..990280688b254 100644
+--- a/drivers/usb/core/hcd-pci.c
++++ b/drivers/usb/core/hcd-pci.c
+@@ -415,12 +415,15 @@ static int check_root_hub_suspended(struct device *dev)
+ return 0;
+ }
+
+-static int suspend_common(struct device *dev, bool do_wakeup)
++static int suspend_common(struct device *dev, pm_message_t msg)
+ {
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+ struct usb_hcd *hcd = pci_get_drvdata(pci_dev);
++ bool do_wakeup;
+ int retval;
+
++ do_wakeup = PMSG_IS_AUTO(msg) ? true : device_may_wakeup(dev);
++
+ /* Root hub suspend should have stopped all downstream traffic,
+ * and all bus master traffic. And done so for both the interface
+ * and the stub usb_device (which we check here). But maybe it
+@@ -447,7 +450,7 @@ static int suspend_common(struct device *dev, bool do_wakeup)
+ (retval == 0 && do_wakeup && hcd->shared_hcd &&
+ HCD_WAKEUP_PENDING(hcd->shared_hcd))) {
+ if (hcd->driver->pci_resume)
+- hcd->driver->pci_resume(hcd, false);
++ hcd->driver->pci_resume(hcd, msg);
+ retval = -EBUSY;
+ }
+ if (retval)
+@@ -470,7 +473,7 @@ static int suspend_common(struct device *dev, bool do_wakeup)
+ return retval;
+ }
+
+-static int resume_common(struct device *dev, int event)
++static int resume_common(struct device *dev, pm_message_t msg)
+ {
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+ struct usb_hcd *hcd = pci_get_drvdata(pci_dev);
+@@ -498,12 +501,11 @@ static int resume_common(struct device *dev, int event)
+ * No locking is needed because PCI controller drivers do not
+ * get unbound during system resume.
+ */
+- if (pci_dev->class == CL_EHCI && event != PM_EVENT_AUTO_RESUME)
++ if (pci_dev->class == CL_EHCI && msg.event != PM_EVENT_AUTO_RESUME)
+ for_each_companion(pci_dev, hcd,
+ ehci_wait_for_companions);
+
+- retval = hcd->driver->pci_resume(hcd,
+- event == PM_EVENT_RESTORE);
++ retval = hcd->driver->pci_resume(hcd, msg);
+ if (retval) {
+ dev_err(dev, "PCI post-resume error %d!\n", retval);
+ usb_hc_died(hcd);
+@@ -516,7 +518,7 @@ static int resume_common(struct device *dev, int event)
+
+ static int hcd_pci_suspend(struct device *dev)
+ {
+- return suspend_common(dev, device_may_wakeup(dev));
++ return suspend_common(dev, PMSG_SUSPEND);
+ }
+
+ static int hcd_pci_suspend_noirq(struct device *dev)
+@@ -577,12 +579,12 @@ static int hcd_pci_resume_noirq(struct device *dev)
+
+ static int hcd_pci_resume(struct device *dev)
+ {
+- return resume_common(dev, PM_EVENT_RESUME);
++ return resume_common(dev, PMSG_RESUME);
+ }
+
+ static int hcd_pci_restore(struct device *dev)
+ {
+- return resume_common(dev, PM_EVENT_RESTORE);
++ return resume_common(dev, PMSG_RESTORE);
+ }
+
+ #else
+@@ -600,7 +602,7 @@ static int hcd_pci_runtime_suspend(struct device *dev)
+ {
+ int retval;
+
+- retval = suspend_common(dev, true);
++ retval = suspend_common(dev, PMSG_AUTO_SUSPEND);
+ if (retval == 0)
+ powermac_set_asic(to_pci_dev(dev), 0);
+ dev_dbg(dev, "hcd_pci_runtime_suspend: %d\n", retval);
+@@ -612,7 +614,7 @@ static int hcd_pci_runtime_resume(struct device *dev)
+ int retval;
+
+ powermac_set_asic(to_pci_dev(dev), 1);
+- retval = resume_common(dev, PM_EVENT_AUTO_RESUME);
++ retval = resume_common(dev, PMSG_AUTO_RESUME);
+ dev_dbg(dev, "hcd_pci_runtime_resume: %d\n", retval);
+ return retval;
+ }
+diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
+index 5aee284018c00..5cf025511cce6 100644
+--- a/drivers/usb/dwc2/platform.c
++++ b/drivers/usb/dwc2/platform.c
+@@ -203,6 +203,11 @@ int dwc2_lowlevel_hw_disable(struct dwc2_hsotg *hsotg)
+ return ret;
+ }
+
++static void dwc2_reset_control_assert(void *data)
++{
++ reset_control_assert(data);
++}
++
+ static int dwc2_lowlevel_hw_init(struct dwc2_hsotg *hsotg)
+ {
+ int i, ret;
+@@ -213,6 +218,10 @@ static int dwc2_lowlevel_hw_init(struct dwc2_hsotg *hsotg)
+ "error getting reset control\n");
+
+ reset_control_deassert(hsotg->reset);
++ ret = devm_add_action_or_reset(hsotg->dev, dwc2_reset_control_assert,
++ hsotg->reset);
++ if (ret)
++ return ret;
+
+ hsotg->reset_ecc = devm_reset_control_get_optional(hsotg->dev, "dwc2-ecc");
+ if (IS_ERR(hsotg->reset_ecc))
+@@ -220,6 +229,10 @@ static int dwc2_lowlevel_hw_init(struct dwc2_hsotg *hsotg)
+ "error getting reset control for ecc\n");
+
+ reset_control_deassert(hsotg->reset_ecc);
++ ret = devm_add_action_or_reset(hsotg->dev, dwc2_reset_control_assert,
++ hsotg->reset_ecc);
++ if (ret)
++ return ret;
+
+ /*
+ * Attempt to find a generic PHY, then look for an old style
+@@ -339,9 +352,6 @@ static int dwc2_driver_remove(struct platform_device *dev)
+ if (hsotg->ll_hw_enabled)
+ dwc2_lowlevel_hw_disable(hsotg);
+
+- reset_control_assert(hsotg->reset);
+- reset_control_assert(hsotg->reset_ecc);
+-
+ return 0;
+ }
+
+diff --git a/drivers/usb/dwc3/dwc3-meson-g12a.c b/drivers/usb/dwc3/dwc3-meson-g12a.c
+index b282ad0e69c6d..eaea944ebd2ce 100644
+--- a/drivers/usb/dwc3/dwc3-meson-g12a.c
++++ b/drivers/usb/dwc3/dwc3-meson-g12a.c
+@@ -805,7 +805,7 @@ static int dwc3_meson_g12a_probe(struct platform_device *pdev)
+
+ ret = dwc3_meson_g12a_otg_init(pdev, priv);
+ if (ret)
+- goto err_phys_power;
++ goto err_plat_depopulate;
+
+ pm_runtime_set_active(dev);
+ pm_runtime_enable(dev);
+@@ -813,6 +813,9 @@ static int dwc3_meson_g12a_probe(struct platform_device *pdev)
+
+ return 0;
+
++err_plat_depopulate:
++ of_platform_depopulate(dev);
++
+ err_phys_power:
+ for (i = 0 ; i < PHY_COUNT ; ++i)
+ phy_power_off(priv->phys[i]);
+diff --git a/drivers/usb/dwc3/dwc3-qcom.c b/drivers/usb/dwc3/dwc3-qcom.c
+index 79b22abf97276..72c22851d7eef 100644
+--- a/drivers/usb/dwc3/dwc3-qcom.c
++++ b/drivers/usb/dwc3/dwc3-qcom.c
+@@ -800,6 +800,7 @@ static int dwc3_qcom_probe(struct platform_device *pdev)
+ struct device *dev = &pdev->dev;
+ struct dwc3_qcom *qcom;
+ struct resource *res, *parent_res = NULL;
++ struct resource local_res;
+ int ret, i;
+ bool ignore_pipe_clk;
+ bool wakeup_source;
+@@ -851,9 +852,8 @@ static int dwc3_qcom_probe(struct platform_device *pdev)
+ if (np) {
+ parent_res = res;
+ } else {
+- parent_res = kmemdup(res, sizeof(struct resource), GFP_KERNEL);
+- if (!parent_res)
+- return -ENOMEM;
++ memcpy(&local_res, res, sizeof(struct resource));
++ parent_res = &local_res;
+
+ parent_res->start = res->start +
+ qcom->acpi_pdata->qscratch_base_offset;
+@@ -865,9 +865,10 @@ static int dwc3_qcom_probe(struct platform_device *pdev)
+ if (IS_ERR_OR_NULL(qcom->urs_usb)) {
+ dev_err(dev, "failed to create URS USB platdev\n");
+ if (!qcom->urs_usb)
+- return -ENODEV;
++ ret = -ENODEV;
+ else
+- return PTR_ERR(qcom->urs_usb);
++ ret = PTR_ERR(qcom->urs_usb);
++ goto clk_disable;
+ }
+ }
+ }
+@@ -950,11 +951,15 @@ reset_assert:
+ static int dwc3_qcom_remove(struct platform_device *pdev)
+ {
+ struct dwc3_qcom *qcom = platform_get_drvdata(pdev);
++ struct device_node *np = pdev->dev.of_node;
+ struct device *dev = &pdev->dev;
+ int i;
+
+ device_remove_software_node(&qcom->dwc3->dev);
+- of_platform_depopulate(dev);
++ if (np)
++ of_platform_depopulate(&pdev->dev);
++ else
++ platform_device_put(pdev);
+
+ for (i = qcom->num_clocks - 1; i >= 0; i--) {
+ clk_disable_unprepare(qcom->clks[i]);
+diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
+index b78599dd705c2..550dc8f4d16ad 100644
+--- a/drivers/usb/dwc3/gadget.c
++++ b/drivers/usb/dwc3/gadget.c
+@@ -2744,7 +2744,9 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
+ ret = pm_runtime_get_sync(dwc->dev);
+ if (!ret || ret < 0) {
+ pm_runtime_put(dwc->dev);
+- return 0;
++ if (ret < 0)
++ pm_runtime_set_suspended(dwc->dev);
++ return ret;
+ }
+
+ if (dwc->pullups_connected == is_on) {
+diff --git a/drivers/usb/gadget/function/u_serial.c b/drivers/usb/gadget/function/u_serial.c
+index a0ca47fbff0fc..e5d522d54f6a3 100644
+--- a/drivers/usb/gadget/function/u_serial.c
++++ b/drivers/usb/gadget/function/u_serial.c
+@@ -1420,10 +1420,19 @@ EXPORT_SYMBOL_GPL(gserial_disconnect);
+
+ void gserial_suspend(struct gserial *gser)
+ {
+- struct gs_port *port = gser->ioport;
++ struct gs_port *port;
+ unsigned long flags;
+
+- spin_lock_irqsave(&port->port_lock, flags);
++ spin_lock_irqsave(&serial_port_lock, flags);
++ port = gser->ioport;
++
++ if (!port) {
++ spin_unlock_irqrestore(&serial_port_lock, flags);
++ return;
++ }
++
++ spin_lock(&port->port_lock);
++ spin_unlock(&serial_port_lock);
+ port->suspended = true;
+ spin_unlock_irqrestore(&port->port_lock, flags);
+ }
+diff --git a/drivers/usb/host/ehci-pci.c b/drivers/usb/host/ehci-pci.c
+index 4b148fe5e43b2..889dc44262711 100644
+--- a/drivers/usb/host/ehci-pci.c
++++ b/drivers/usb/host/ehci-pci.c
+@@ -354,10 +354,11 @@ done:
+ * Also they depend on separate root hub suspend/resume.
+ */
+
+-static int ehci_pci_resume(struct usb_hcd *hcd, bool hibernated)
++static int ehci_pci_resume(struct usb_hcd *hcd, pm_message_t msg)
+ {
+ struct ehci_hcd *ehci = hcd_to_ehci(hcd);
+ struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
++ bool hibernated = (msg.event == PM_EVENT_RESTORE);
+
+ if (ehci_resume(hcd, hibernated) != 0)
+ (void) ehci_pci_reinit(ehci, pdev);
+diff --git a/drivers/usb/host/ohci-pci.c b/drivers/usb/host/ohci-pci.c
+index d7b4f40f9ff4e..900ea0d368e03 100644
+--- a/drivers/usb/host/ohci-pci.c
++++ b/drivers/usb/host/ohci-pci.c
+@@ -301,6 +301,12 @@ static struct pci_driver ohci_pci_driver = {
+ #endif
+ };
+
++#ifdef CONFIG_PM
++static int ohci_pci_resume(struct usb_hcd *hcd, pm_message_t msg)
++{
++ return ohci_resume(hcd, msg.event == PM_EVENT_RESTORE);
++}
++#endif
+ static int __init ohci_pci_init(void)
+ {
+ if (usb_disabled())
+@@ -311,7 +317,7 @@ static int __init ohci_pci_init(void)
+ #ifdef CONFIG_PM
+ /* Entries for the PCI suspend/resume callbacks are special */
+ ohci_pci_hc_driver.pci_suspend = ohci_suspend;
+- ohci_pci_hc_driver.pci_resume = ohci_resume;
++ ohci_pci_hc_driver.pci_resume = ohci_pci_resume;
+ #endif
+
+ return pci_register_driver(&ohci_pci_driver);
+diff --git a/drivers/usb/host/uhci-pci.c b/drivers/usb/host/uhci-pci.c
+index 7bd2fddde770a..5edf6a08cf82c 100644
+--- a/drivers/usb/host/uhci-pci.c
++++ b/drivers/usb/host/uhci-pci.c
+@@ -169,7 +169,7 @@ static void uhci_shutdown(struct pci_dev *pdev)
+
+ #ifdef CONFIG_PM
+
+-static int uhci_pci_resume(struct usb_hcd *hcd, bool hibernated);
++static int uhci_pci_resume(struct usb_hcd *hcd, pm_message_t state);
+
+ static int uhci_pci_suspend(struct usb_hcd *hcd, bool do_wakeup)
+ {
+@@ -204,14 +204,15 @@ done_okay:
+
+ /* Check for race with a wakeup request */
+ if (do_wakeup && HCD_WAKEUP_PENDING(hcd)) {
+- uhci_pci_resume(hcd, false);
++ uhci_pci_resume(hcd, PMSG_SUSPEND);
+ rc = -EBUSY;
+ }
+ return rc;
+ }
+
+-static int uhci_pci_resume(struct usb_hcd *hcd, bool hibernated)
++static int uhci_pci_resume(struct usb_hcd *hcd, pm_message_t msg)
+ {
++ bool hibernated = (msg.event == PM_EVENT_RESTORE);
+ struct uhci_hcd *uhci = hcd_to_uhci(hcd);
+
+ dev_dbg(uhci_dev(uhci), "%s\n", __func__);
+diff --git a/drivers/usb/host/xhci-histb.c b/drivers/usb/host/xhci-histb.c
+index 08369857686e7..91ce97821de51 100644
+--- a/drivers/usb/host/xhci-histb.c
++++ b/drivers/usb/host/xhci-histb.c
+@@ -367,7 +367,7 @@ static int __maybe_unused xhci_histb_resume(struct device *dev)
+ if (!device_may_wakeup(dev))
+ xhci_histb_host_enable(histb);
+
+- return xhci_resume(xhci, 0);
++ return xhci_resume(xhci, PMSG_RESUME);
+ }
+
+ static const struct dev_pm_ops xhci_histb_pm_ops = {
+diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
+index 79b3691f373f3..69a5cb7eba381 100644
+--- a/drivers/usb/host/xhci-pci.c
++++ b/drivers/usb/host/xhci-pci.c
+@@ -832,7 +832,7 @@ static int xhci_pci_suspend(struct usb_hcd *hcd, bool do_wakeup)
+ return ret;
+ }
+
+-static int xhci_pci_resume(struct usb_hcd *hcd, bool hibernated)
++static int xhci_pci_resume(struct usb_hcd *hcd, pm_message_t msg)
+ {
+ struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+ struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
+@@ -867,7 +867,7 @@ static int xhci_pci_resume(struct usb_hcd *hcd, bool hibernated)
+ if (xhci->quirks & XHCI_PME_STUCK_QUIRK)
+ xhci_pme_quirk(hcd);
+
+- retval = xhci_resume(xhci, hibernated);
++ retval = xhci_resume(xhci, msg);
+ return retval;
+ }
+
+diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
+index b0c8e8efc43b6..f36633fa83624 100644
+--- a/drivers/usb/host/xhci-plat.c
++++ b/drivers/usb/host/xhci-plat.c
+@@ -478,7 +478,7 @@ static int __maybe_unused xhci_plat_resume(struct device *dev)
+ if (ret)
+ return ret;
+
+- ret = xhci_resume(xhci, 0);
++ ret = xhci_resume(xhci, PMSG_RESUME);
+ if (ret)
+ return ret;
+
+@@ -507,7 +507,7 @@ static int __maybe_unused xhci_plat_runtime_resume(struct device *dev)
+ struct usb_hcd *hcd = dev_get_drvdata(dev);
+ struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+
+- return xhci_resume(xhci, 0);
++ return xhci_resume(xhci, PMSG_AUTO_RESUME);
+ }
+
+ const struct dev_pm_ops xhci_plat_pm_ops = {
+diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
+index c75d932441436..8a9c7deb7686e 100644
+--- a/drivers/usb/host/xhci-tegra.c
++++ b/drivers/usb/host/xhci-tegra.c
+@@ -2272,7 +2272,7 @@ static int tegra_xusb_exit_elpg(struct tegra_xusb *tegra, bool runtime)
+ if (wakeup)
+ tegra_xhci_disable_phy_sleepwalk(tegra);
+
+- err = xhci_resume(xhci, 0);
++ err = xhci_resume(xhci, runtime ? PMSG_AUTO_RESUME : PMSG_RESUME);
+ if (err < 0) {
+ dev_err(tegra->dev, "failed to resume XHCI: %d\n", err);
+ goto disable_phy;
+diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
+index 78790dc13c5f1..b81313ffeb768 100644
+--- a/drivers/usb/host/xhci.c
++++ b/drivers/usb/host/xhci.c
+@@ -960,8 +960,9 @@ EXPORT_SYMBOL_GPL(xhci_suspend);
+ * This is called when the machine transition from S3/S4 mode.
+ *
+ */
+-int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
++int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
+ {
++ bool hibernated = (msg.event == PM_EVENT_RESTORE);
+ u32 command, temp = 0;
+ struct usb_hcd *hcd = xhci_to_hcd(xhci);
+ int retval = 0;
+@@ -1116,7 +1117,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
+ * the first wake signalling failed, give it that chance.
+ */
+ pending_portevent = xhci_pending_portevent(xhci);
+- if (!pending_portevent) {
++ if (!pending_portevent && msg.event == PM_EVENT_AUTO_RESUME) {
+ msleep(120);
+ pending_portevent = xhci_pending_portevent(xhci);
+ }
+diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
+index 6b690ec91ff3a..f845c15073ba4 100644
+--- a/drivers/usb/host/xhci.h
++++ b/drivers/usb/host/xhci.h
+@@ -2140,7 +2140,7 @@ int xhci_disable_slot(struct xhci_hcd *xhci, u32 slot_id);
+ int xhci_ext_cap_init(struct xhci_hcd *xhci);
+
+ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup);
+-int xhci_resume(struct xhci_hcd *xhci, bool hibernated);
++int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg);
+
+ irqreturn_t xhci_irq(struct usb_hcd *hcd);
+ irqreturn_t xhci_msi_irq(int irq, void *hcd);
+diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c
+index d162afbbe19f7..ecbd3784bec36 100644
+--- a/drivers/usb/musb/musb_core.c
++++ b/drivers/usb/musb/musb_core.c
+@@ -2330,7 +2330,6 @@ musb_init_controller(struct device *dev, int nIrq, void __iomem *ctrl)
+
+ spin_lock_init(&musb->lock);
+ spin_lock_init(&musb->list_lock);
+- musb->board_set_power = plat->set_power;
+ musb->min_power = plat->min_power;
+ musb->ops = plat->platform_ops;
+ musb->port_mode = plat->mode;
+diff --git a/drivers/usb/musb/musb_core.h b/drivers/usb/musb/musb_core.h
+index b7588d11cfc59..91b5b6b66f963 100644
+--- a/drivers/usb/musb/musb_core.h
++++ b/drivers/usb/musb/musb_core.h
+@@ -352,8 +352,6 @@ struct musb {
+ u16 epmask;
+ u8 nr_endpoints;
+
+- int (*board_set_power)(int state);
+-
+ u8 min_power; /* vbus for periph, in mA/2 */
+
+ enum musb_mode port_mode;
+diff --git a/drivers/usb/musb/tusb6010.c b/drivers/usb/musb/tusb6010.c
+index a1f29dbc62e6e..cbc707fe570fa 100644
+--- a/drivers/usb/musb/tusb6010.c
++++ b/drivers/usb/musb/tusb6010.c
+@@ -11,6 +11,8 @@
+ * interface.
+ */
+
++#include <linux/gpio/consumer.h>
++#include <linux/delay.h>
+ #include <linux/module.h>
+ #include <linux/kernel.h>
+ #include <linux/errno.h>
+@@ -30,6 +32,8 @@ struct tusb6010_glue {
+ struct device *dev;
+ struct platform_device *musb;
+ struct platform_device *phy;
++ struct gpio_desc *enable;
++ struct gpio_desc *intpin;
+ };
+
+ static void tusb_musb_set_vbus(struct musb *musb, int is_on);
+@@ -1021,16 +1025,29 @@ static void tusb_setup_cpu_interface(struct musb *musb)
+
+ static int tusb_musb_start(struct musb *musb)
+ {
++ struct tusb6010_glue *glue = dev_get_drvdata(musb->controller->parent);
+ void __iomem *tbase = musb->ctrl_base;
+- int ret = 0;
+ unsigned long flags;
+ u32 reg;
++ int i;
+
+- if (musb->board_set_power)
+- ret = musb->board_set_power(1);
+- if (ret != 0) {
+- printk(KERN_ERR "tusb: Cannot enable TUSB6010\n");
+- return ret;
++ /*
++ * Enable or disable power to TUSB6010. When enabling, turn on 3.3 V and
++ * 1.5 V voltage regulators of PM companion chip. Companion chip will then
++ * provide the PGOOD signal to TUSB6010 which will release it from reset.
++ */
++ gpiod_set_value(glue->enable, 1);
++ msleep(1);
++
++ /* Wait for 100ms until TUSB6010 pulls INT pin down */
++ i = 100;
++ while (i && gpiod_get_value(glue->intpin)) {
++ msleep(1);
++ i--;
++ }
++ if (!i) {
++		pr_err("tusb: Powerup response failed\n");
++ return -ENODEV;
+ }
+
+ spin_lock_irqsave(&musb->lock, flags);
+@@ -1083,8 +1100,8 @@ static int tusb_musb_start(struct musb *musb)
+ err:
+ spin_unlock_irqrestore(&musb->lock, flags);
+
+- if (musb->board_set_power)
+- musb->board_set_power(0);
++ gpiod_set_value(glue->enable, 0);
++ msleep(10);
+
+ return -ENODEV;
+ }
+@@ -1158,11 +1175,13 @@ done:
+
+ static int tusb_musb_exit(struct musb *musb)
+ {
++ struct tusb6010_glue *glue = dev_get_drvdata(musb->controller->parent);
++
+ del_timer_sync(&musb->dev_timer);
+ the_musb = NULL;
+
+- if (musb->board_set_power)
+- musb->board_set_power(0);
++ gpiod_set_value(glue->enable, 0);
++ msleep(10);
+
+ iounmap(musb->sync_va);
+
+@@ -1218,6 +1237,15 @@ static int tusb_probe(struct platform_device *pdev)
+
+ glue->dev = &pdev->dev;
+
++ glue->enable = devm_gpiod_get(glue->dev, "enable", GPIOD_OUT_LOW);
++ if (IS_ERR(glue->enable))
++ return dev_err_probe(glue->dev, PTR_ERR(glue->enable),
++ "could not obtain power on/off GPIO\n");
++ glue->intpin = devm_gpiod_get(glue->dev, "int", GPIOD_IN);
++ if (IS_ERR(glue->intpin))
++ return dev_err_probe(glue->dev, PTR_ERR(glue->intpin),
++ "could not obtain INT GPIO\n");
++
+ pdata->platform_ops = &tusb_ops;
+
+ usb_phy_generic_register();
+@@ -1236,10 +1264,7 @@ static int tusb_probe(struct platform_device *pdev)
+ musb_resources[1].end = pdev->resource[1].end;
+ musb_resources[1].flags = pdev->resource[1].flags;
+
+- musb_resources[2].name = pdev->resource[2].name;
+- musb_resources[2].start = pdev->resource[2].start;
+- musb_resources[2].end = pdev->resource[2].end;
+- musb_resources[2].flags = pdev->resource[2].flags;
++ musb_resources[2] = DEFINE_RES_IRQ_NAMED(gpiod_to_irq(glue->intpin), "mc");
+
+ pinfo = tusb_dev_info;
+ pinfo.parent = &pdev->dev;
+diff --git a/drivers/usb/phy/phy-tahvo.c b/drivers/usb/phy/phy-tahvo.c
+index 47562d49dfc1b..5cac31c6029b3 100644
+--- a/drivers/usb/phy/phy-tahvo.c
++++ b/drivers/usb/phy/phy-tahvo.c
+@@ -391,7 +391,7 @@ static int tahvo_usb_probe(struct platform_device *pdev)
+
+ tu->irq = ret = platform_get_irq(pdev, 0);
+ if (ret < 0)
+- return ret;
++ goto err_remove_phy;
+ ret = request_threaded_irq(tu->irq, NULL, tahvo_usb_vbus_interrupt,
+ IRQF_ONESHOT,
+ "tahvo-vbus", tu);
+diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
+index fd42e3a0bd187..288a96a742661 100644
+--- a/drivers/usb/serial/option.c
++++ b/drivers/usb/serial/option.c
+@@ -1151,6 +1151,10 @@ static const struct usb_device_id option_ids[] = {
+ { USB_DEVICE(QUALCOMM_VENDOR_ID, 0x90fa),
+ .driver_info = RSVD(3) },
+ /* u-blox products */
++ { USB_DEVICE(UBLOX_VENDOR_ID, 0x1311) }, /* u-blox LARA-R6 01B */
++ { USB_DEVICE(UBLOX_VENDOR_ID, 0x1312), /* u-blox LARA-R6 01B (RMNET) */
++ .driver_info = RSVD(4) },
++ { USB_DEVICE_INTERFACE_CLASS(UBLOX_VENDOR_ID, 0x1313, 0xff) }, /* u-blox LARA-R6 01B (ECM) */
+ { USB_DEVICE(UBLOX_VENDOR_ID, 0x1341) }, /* u-blox LARA-L6 */
+ { USB_DEVICE(UBLOX_VENDOR_ID, 0x1342), /* u-blox LARA-L6 (RMNET) */
+ .driver_info = RSVD(4) },
+diff --git a/drivers/usb/typec/ucsi/psy.c b/drivers/usb/typec/ucsi/psy.c
+index 56bf56517f75a..384b42267f1fc 100644
+--- a/drivers/usb/typec/ucsi/psy.c
++++ b/drivers/usb/typec/ucsi/psy.c
+@@ -27,8 +27,20 @@ static enum power_supply_property ucsi_psy_props[] = {
+ POWER_SUPPLY_PROP_VOLTAGE_NOW,
+ POWER_SUPPLY_PROP_CURRENT_MAX,
+ POWER_SUPPLY_PROP_CURRENT_NOW,
++ POWER_SUPPLY_PROP_SCOPE,
+ };
+
++static int ucsi_psy_get_scope(struct ucsi_connector *con,
++ union power_supply_propval *val)
++{
++ u8 scope = POWER_SUPPLY_SCOPE_UNKNOWN;
++ struct device *dev = con->ucsi->dev;
++
++ device_property_read_u8(dev, "scope", &scope);
++ val->intval = scope;
++ return 0;
++}
++
+ static int ucsi_psy_get_online(struct ucsi_connector *con,
+ union power_supply_propval *val)
+ {
+@@ -194,6 +206,8 @@ static int ucsi_psy_get_prop(struct power_supply *psy,
+ return ucsi_psy_get_current_max(con, val);
+ case POWER_SUPPLY_PROP_CURRENT_NOW:
+ return ucsi_psy_get_current_now(con, val);
++ case POWER_SUPPLY_PROP_SCOPE:
++ return ucsi_psy_get_scope(con, val);
+ default:
+ return -EINVAL;
+ }
+diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
+index 5f5c21674fdce..0d84e6a9c3cca 100644
+--- a/drivers/vdpa/vdpa_user/vduse_dev.c
++++ b/drivers/vdpa/vdpa_user/vduse_dev.c
+@@ -726,7 +726,11 @@ static int vduse_vdpa_set_vq_affinity(struct vdpa_device *vdpa, u16 idx,
+ {
+ struct vduse_dev *dev = vdpa_to_vduse(vdpa);
+
+- cpumask_copy(&dev->vqs[idx]->irq_affinity, cpu_mask);
++ if (cpu_mask)
++ cpumask_copy(&dev->vqs[idx]->irq_affinity, cpu_mask);
++ else
++ cpumask_setall(&dev->vqs[idx]->irq_affinity);
++
+ return 0;
+ }
+
+diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
+index 58f91b3bd670c..ed4737de45289 100644
+--- a/drivers/vfio/mdev/mdev_core.c
++++ b/drivers/vfio/mdev/mdev_core.c
+@@ -72,12 +72,6 @@ int mdev_register_parent(struct mdev_parent *parent, struct device *dev,
+ parent->nr_types = nr_types;
+ atomic_set(&parent->available_instances, mdev_driver->max_instances);
+
+- if (!mdev_bus_compat_class) {
+- mdev_bus_compat_class = class_compat_register("mdev_bus");
+- if (!mdev_bus_compat_class)
+- return -ENOMEM;
+- }
+-
+ ret = parent_create_sysfs_files(parent);
+ if (ret)
+ return ret;
+@@ -251,13 +245,24 @@ int mdev_device_remove(struct mdev_device *mdev)
+
+ static int __init mdev_init(void)
+ {
+- return bus_register(&mdev_bus_type);
++ int ret;
++
++ ret = bus_register(&mdev_bus_type);
++ if (ret)
++ return ret;
++
++ mdev_bus_compat_class = class_compat_register("mdev_bus");
++ if (!mdev_bus_compat_class) {
++ bus_unregister(&mdev_bus_type);
++ return -ENOMEM;
++ }
++
++ return 0;
+ }
+
+ static void __exit mdev_exit(void)
+ {
+- if (mdev_bus_compat_class)
+- class_compat_unregister(mdev_bus_compat_class);
++ class_compat_unregister(mdev_bus_compat_class);
+ bus_unregister(&mdev_bus_type);
+ }
+
+diff --git a/drivers/video/fbdev/omap/lcd_mipid.c b/drivers/video/fbdev/omap/lcd_mipid.c
+index 03cff39d392db..a0fc4570403b8 100644
+--- a/drivers/video/fbdev/omap/lcd_mipid.c
++++ b/drivers/video/fbdev/omap/lcd_mipid.c
+@@ -7,6 +7,7 @@
+ */
+ #include <linux/device.h>
+ #include <linux/delay.h>
++#include <linux/gpio/consumer.h>
+ #include <linux/slab.h>
+ #include <linux/workqueue.h>
+ #include <linux/spi/spi.h>
+@@ -41,6 +42,7 @@ struct mipid_device {
+ when we can issue the
+ next sleep in/out command */
+ unsigned long hw_guard_wait; /* max guard time in jiffies */
++ struct gpio_desc *reset;
+
+ struct omapfb_device *fbdev;
+ struct spi_device *spi;
+@@ -556,6 +558,12 @@ static int mipid_spi_probe(struct spi_device *spi)
+ return -ENOMEM;
+ }
+
++ /* This will de-assert RESET if active */
++ md->reset = gpiod_get(&spi->dev, "reset", GPIOD_OUT_LOW);
++ if (IS_ERR(md->reset))
++ return dev_err_probe(&spi->dev, PTR_ERR(md->reset),
++ "no reset GPIO line\n");
++
+ spi->mode = SPI_MODE_0;
+ md->spi = spi;
+ dev_set_drvdata(&spi->dev, md);
+@@ -563,17 +571,23 @@ static int mipid_spi_probe(struct spi_device *spi)
+
+ r = mipid_detect(md);
+ if (r < 0)
+- return r;
++ goto free_md;
+
+ omapfb_register_panel(&md->panel);
+
+ return 0;
++
++free_md:
++ kfree(md);
++ return r;
+ }
+
+ static void mipid_spi_remove(struct spi_device *spi)
+ {
+ struct mipid_device *md = dev_get_drvdata(&spi->dev);
+
++ /* Asserts RESET */
++ gpiod_set_value(md->reset, 1);
+ mipid_disable(&md->panel);
+ kfree(md);
+ }
+diff --git a/drivers/virt/coco/sev-guest/Kconfig b/drivers/virt/coco/sev-guest/Kconfig
+index f9db0799ae67c..da2d7ca531f0f 100644
+--- a/drivers/virt/coco/sev-guest/Kconfig
++++ b/drivers/virt/coco/sev-guest/Kconfig
+@@ -2,6 +2,7 @@ config SEV_GUEST
+ tristate "AMD SEV Guest driver"
+ default m
+ depends on AMD_MEM_ENCRYPT
++ select CRYPTO
+ select CRYPTO_AEAD2
+ select CRYPTO_GCM
+ help
+diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
+index eb6aee8c06b2c..989e2d7184ce4 100644
+--- a/drivers/virtio/virtio_vdpa.c
++++ b/drivers/virtio/virtio_vdpa.c
+@@ -385,7 +385,9 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
+ err = PTR_ERR(vqs[i]);
+ goto err_setup_vq;
+ }
+- ops->set_vq_affinity(vdpa, i, &masks[i]);
++
++ if (ops->set_vq_affinity)
++ ops->set_vq_affinity(vdpa, i, &masks[i]);
+ }
+
+ cb.callback = virtio_vdpa_config_cb;
+diff --git a/drivers/w1/slaves/w1_therm.c b/drivers/w1/slaves/w1_therm.c
+index 067692626cf07..99c58bd9d2df0 100644
+--- a/drivers/w1/slaves/w1_therm.c
++++ b/drivers/w1/slaves/w1_therm.c
+@@ -1159,29 +1159,26 @@ static int convert_t(struct w1_slave *sl, struct therm_info *info)
+
+ w1_write_8(dev_master, W1_CONVERT_TEMP);
+
+- if (strong_pullup) { /*some device need pullup */
++ if (SLAVE_FEATURES(sl) & W1_THERM_POLL_COMPLETION) {
++ ret = w1_poll_completion(dev_master, W1_POLL_CONVERT_TEMP);
++ if (ret) {
++ dev_dbg(&sl->dev, "%s: Timeout\n", __func__);
++ goto mt_unlock;
++ }
++ mutex_unlock(&dev_master->bus_mutex);
++ } else if (!strong_pullup) { /*no device need pullup */
+ sleep_rem = msleep_interruptible(t_conv);
+ if (sleep_rem != 0) {
+ ret = -EINTR;
+ goto mt_unlock;
+ }
+ mutex_unlock(&dev_master->bus_mutex);
+- } else { /*no device need pullup */
+- if (SLAVE_FEATURES(sl) & W1_THERM_POLL_COMPLETION) {
+- ret = w1_poll_completion(dev_master, W1_POLL_CONVERT_TEMP);
+- if (ret) {
+- dev_dbg(&sl->dev, "%s: Timeout\n", __func__);
+- goto mt_unlock;
+- }
+- mutex_unlock(&dev_master->bus_mutex);
+- } else {
+- /* Fixed delay */
+- mutex_unlock(&dev_master->bus_mutex);
+- sleep_rem = msleep_interruptible(t_conv);
+- if (sleep_rem != 0) {
+- ret = -EINTR;
+- goto dec_refcnt;
+- }
++ } else { /*some device need pullup */
++ mutex_unlock(&dev_master->bus_mutex);
++ sleep_rem = msleep_interruptible(t_conv);
++ if (sleep_rem != 0) {
++ ret = -EINTR;
++ goto dec_refcnt;
+ }
+ }
+ ret = read_scratchpad(sl, info);
+diff --git a/drivers/w1/w1.c b/drivers/w1/w1.c
+index 9d199fed96287..2c766bdc68cc5 100644
+--- a/drivers/w1/w1.c
++++ b/drivers/w1/w1.c
+@@ -1263,10 +1263,10 @@ err_out_exit_init:
+
+ static void __exit w1_fini(void)
+ {
+- struct w1_master *dev;
++ struct w1_master *dev, *n;
+
+ /* Set netlink removal messages and some cleanup */
+- list_for_each_entry(dev, &w1_masters, w1_master_entry)
++ list_for_each_entry_safe(dev, n, &w1_masters, w1_master_entry)
+ __w1_remove_master_device(dev);
+
+ w1_fini_netlink();
+diff --git a/fs/afs/write.c b/fs/afs/write.c
+index 8750b99c3f566..c1f4391ccd7c6 100644
+--- a/fs/afs/write.c
++++ b/fs/afs/write.c
+@@ -413,17 +413,19 @@ static int afs_store_data(struct afs_vnode *vnode, struct iov_iter *iter, loff_t
+ afs_op_set_vnode(op, 0, vnode);
+ op->file[0].dv_delta = 1;
+ op->file[0].modification = true;
+- op->store.write_iter = iter;
+ op->store.pos = pos;
+ op->store.size = size;
+- op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
+ op->store.laundering = laundering;
+- op->mtime = vnode->netfs.inode.i_mtime;
+ op->flags |= AFS_OPERATION_UNINTR;
+ op->ops = &afs_store_data_operation;
+
+ try_next_key:
+ afs_begin_vnode_operation(op);
++
++ op->store.write_iter = iter;
++ op->store.i_size = max(pos + size, vnode->netfs.remote_i_size);
++ op->mtime = vnode->netfs.inode.i_mtime;
++
+ afs_wait_for_operation(op);
+
+ switch (op->error) {
+diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
+index b3ad0f51e6162..b86faf8126e77 100644
+--- a/fs/btrfs/bio.c
++++ b/fs/btrfs/bio.c
+@@ -95,8 +95,7 @@ static struct btrfs_bio *btrfs_split_bio(struct btrfs_fs_info *fs_info,
+ btrfs_bio_init(bbio, fs_info, NULL, orig_bbio);
+ bbio->inode = orig_bbio->inode;
+ bbio->file_offset = orig_bbio->file_offset;
+- if (!(orig_bbio->bio.bi_opf & REQ_BTRFS_ONE_ORDERED))
+- orig_bbio->file_offset += map_length;
++ orig_bbio->file_offset += map_length;
+
+ atomic_inc(&orig_bbio->pending_ios);
+ return bbio;
+diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
+index e97af2e510c37..e2b3448476490 100644
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -95,14 +95,21 @@ static u64 btrfs_reduce_alloc_profile(struct btrfs_fs_info *fs_info, u64 flags)
+ }
+ allowed &= flags;
+
+- if (allowed & BTRFS_BLOCK_GROUP_RAID6)
++ /* Select the highest-redundancy RAID level. */
++ if (allowed & BTRFS_BLOCK_GROUP_RAID1C4)
++ allowed = BTRFS_BLOCK_GROUP_RAID1C4;
++ else if (allowed & BTRFS_BLOCK_GROUP_RAID6)
+ allowed = BTRFS_BLOCK_GROUP_RAID6;
++ else if (allowed & BTRFS_BLOCK_GROUP_RAID1C3)
++ allowed = BTRFS_BLOCK_GROUP_RAID1C3;
+ else if (allowed & BTRFS_BLOCK_GROUP_RAID5)
+ allowed = BTRFS_BLOCK_GROUP_RAID5;
+ else if (allowed & BTRFS_BLOCK_GROUP_RAID10)
+ allowed = BTRFS_BLOCK_GROUP_RAID10;
+ else if (allowed & BTRFS_BLOCK_GROUP_RAID1)
+ allowed = BTRFS_BLOCK_GROUP_RAID1;
++ else if (allowed & BTRFS_BLOCK_GROUP_DUP)
++ allowed = BTRFS_BLOCK_GROUP_DUP;
+ else if (allowed & BTRFS_BLOCK_GROUP_RAID0)
+ allowed = BTRFS_BLOCK_GROUP_RAID0;
+
+@@ -1791,8 +1798,15 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
+ }
+ spin_unlock(&bg->lock);
+
+- /* Get out fast, in case we're unmounting the filesystem */
+- if (btrfs_fs_closing(fs_info)) {
++ /*
++ * Get out fast, in case we're read-only or unmounting the
++ * filesystem. It is OK to drop block groups from the list even
++ * for the read-only case. As we did sb_start_write(),
++ * "mount -o remount,ro" won't happen and read-only filesystem
++ * means it is forced read-only due to a fatal error. So, it
++ * never gets back to read-write to let us reclaim again.
++ */
++ if (btrfs_need_cleaner_sleep(fs_info)) {
+ up_write(&space_info->groups_sem);
+ goto next;
+ }
+@@ -1823,11 +1837,27 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
+ }
+
+ next:
++ if (ret)
++ btrfs_mark_bg_to_reclaim(bg);
+ btrfs_put_block_group(bg);
++
++ mutex_unlock(&fs_info->reclaim_bgs_lock);
++ /*
++ * Reclaiming all the block groups in the list can take really
++ * long. Prioritize cleaning up unused block groups.
++ */
++ btrfs_delete_unused_bgs(fs_info);
++ /*
++ * If we are interrupted by a balance, we can just bail out. The
++ * cleaner thread will restart again if necessary.
++ */
++ if (!mutex_trylock(&fs_info->reclaim_bgs_lock))
++ goto end;
+ spin_lock(&fs_info->unused_bgs_lock);
+ }
+ spin_unlock(&fs_info->unused_bgs_lock);
+ mutex_unlock(&fs_info->reclaim_bgs_lock);
++end:
+ btrfs_exclop_finish(fs_info);
+ sb_end_write(fs_info->sb);
+ }
+diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
+index 2ff2961b11830..4912d624ca3d3 100644
+--- a/fs/btrfs/ctree.c
++++ b/fs/btrfs/ctree.c
+@@ -583,9 +583,14 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
+ btrfs_header_backref_rev(buf) < BTRFS_MIXED_BACKREF_REV)
+ parent_start = buf->start;
+
+- atomic_inc(&cow->refs);
+ ret = btrfs_tree_mod_log_insert_root(root->node, cow, true);
+- BUG_ON(ret < 0);
++ if (ret < 0) {
++ btrfs_tree_unlock(cow);
++ free_extent_buffer(cow);
++ btrfs_abort_transaction(trans, ret);
++ return ret;
++ }
++ atomic_inc(&cow->refs);
+ rcu_assign_pointer(root->node, cow);
+
+ btrfs_free_tree_block(trans, btrfs_root_id(root), buf,
+@@ -594,8 +599,14 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
+ add_root_to_dirty_list(root);
+ } else {
+ WARN_ON(trans->transid != btrfs_header_generation(parent));
+- btrfs_tree_mod_log_insert_key(parent, parent_slot,
+- BTRFS_MOD_LOG_KEY_REPLACE);
++ ret = btrfs_tree_mod_log_insert_key(parent, parent_slot,
++ BTRFS_MOD_LOG_KEY_REPLACE);
++ if (ret) {
++ btrfs_tree_unlock(cow);
++ free_extent_buffer(cow);
++ btrfs_abort_transaction(trans, ret);
++ return ret;
++ }
+ btrfs_set_node_blockptr(parent, parent_slot,
+ cow->start);
+ btrfs_set_node_ptr_generation(parent, parent_slot,
+@@ -1042,7 +1053,12 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
+ }
+
+ ret = btrfs_tree_mod_log_insert_root(root->node, child, true);
+- BUG_ON(ret < 0);
++ if (ret < 0) {
++ btrfs_tree_unlock(child);
++ free_extent_buffer(child);
++ btrfs_abort_transaction(trans, ret);
++ goto enospc;
++ }
+ rcu_assign_pointer(root->node, child);
+
+ add_root_to_dirty_list(root);
+@@ -1130,7 +1146,10 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
+ btrfs_node_key(right, &right_key, 0);
+ ret = btrfs_tree_mod_log_insert_key(parent, pslot + 1,
+ BTRFS_MOD_LOG_KEY_REPLACE);
+- BUG_ON(ret < 0);
++ if (ret < 0) {
++ btrfs_abort_transaction(trans, ret);
++ goto enospc;
++ }
+ btrfs_set_node_key(parent, &right_key, pslot + 1);
+ btrfs_mark_buffer_dirty(parent);
+ }
+@@ -1176,7 +1195,10 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
+ btrfs_node_key(mid, &mid_key, 0);
+ ret = btrfs_tree_mod_log_insert_key(parent, pslot,
+ BTRFS_MOD_LOG_KEY_REPLACE);
+- BUG_ON(ret < 0);
++ if (ret < 0) {
++ btrfs_abort_transaction(trans, ret);
++ goto enospc;
++ }
+ btrfs_set_node_key(parent, &mid_key, pslot);
+ btrfs_mark_buffer_dirty(parent);
+ }
+@@ -2703,8 +2725,8 @@ static int push_node_left(struct btrfs_trans_handle *trans,
+
+ if (push_items < src_nritems) {
+ /*
+- * Don't call btrfs_tree_mod_log_insert_move() here, key removal
+- * was already fully logged by btrfs_tree_mod_log_eb_copy() above.
++ * btrfs_tree_mod_log_eb_copy handles logging the move, so we
++ * don't need to do an explicit tree mod log operation for it.
+ */
+ memmove_extent_buffer(src, btrfs_node_key_ptr_offset(src, 0),
+ btrfs_node_key_ptr_offset(src, push_items),
+@@ -2765,8 +2787,11 @@ static int balance_node_right(struct btrfs_trans_handle *trans,
+ btrfs_abort_transaction(trans, ret);
+ return ret;
+ }
+- ret = btrfs_tree_mod_log_insert_move(dst, push_items, 0, dst_nritems);
+- BUG_ON(ret < 0);
++
++ /*
++ * btrfs_tree_mod_log_eb_copy handles logging the move, so we don't
++ * need to do an explicit tree mod log operation for it.
++ */
+ memmove_extent_buffer(dst, btrfs_node_key_ptr_offset(dst, push_items),
+ btrfs_node_key_ptr_offset(dst, 0),
+ (dst_nritems) *
+@@ -2962,6 +2987,8 @@ static noinline int split_node(struct btrfs_trans_handle *trans,
+
+ ret = btrfs_tree_mod_log_eb_copy(split, c, 0, mid, c_nritems - mid);
+ if (ret) {
++ btrfs_tree_unlock(split);
++ free_extent_buffer(split);
+ btrfs_abort_transaction(trans, ret);
+ return ret;
+ }
+diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
+index dabc79c1af1bd..fc59eb4024438 100644
+--- a/fs/btrfs/disk-io.c
++++ b/fs/btrfs/disk-io.c
+@@ -4683,7 +4683,6 @@ void btrfs_mark_buffer_dirty(struct extent_buffer *buf)
+ {
+ struct btrfs_fs_info *fs_info = buf->fs_info;
+ u64 transid = btrfs_header_generation(buf);
+- int was_dirty;
+
+ #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
+ /*
+@@ -4698,11 +4697,7 @@ void btrfs_mark_buffer_dirty(struct extent_buffer *buf)
+ if (transid != fs_info->generation)
+ WARN(1, KERN_CRIT "btrfs transid mismatch buffer %llu, found %llu running %llu\n",
+ buf->start, transid, fs_info->generation);
+- was_dirty = set_extent_buffer_dirty(buf);
+- if (!was_dirty)
+- percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
+- buf->len,
+- fs_info->dirty_metadata_batch);
++ set_extent_buffer_dirty(buf);
+ #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
+ /*
+ * Since btrfs_mark_buffer_dirty() can be called with item pointer set
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index a1adadd5d25dd..e3ae55d8bae14 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -98,33 +98,16 @@ void btrfs_extent_buffer_leak_debug_check(struct btrfs_fs_info *fs_info)
+ */
+ struct btrfs_bio_ctrl {
+ struct btrfs_bio *bbio;
+- int mirror_num;
+ enum btrfs_compression_type compress_type;
+ u32 len_to_oe_boundary;
+ blk_opf_t opf;
+ btrfs_bio_end_io_t end_io_func;
+ struct writeback_control *wbc;
+-
+- /*
+- * This is for metadata read, to provide the extra needed verification
+- * info. This has to be provided for submit_one_bio(), as
+- * submit_one_bio() can submit a bio if it ends at stripe boundary. If
+- * no such parent_check is provided, the metadata can hit false alert at
+- * endio time.
+- */
+- struct btrfs_tree_parent_check *parent_check;
+-
+- /*
+- * Tell writepage not to lock the state bits for this range, it still
+- * does the unlocking.
+- */
+- bool extent_locked;
+ };
+
+ static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl)
+ {
+ struct btrfs_bio *bbio = bio_ctrl->bbio;
+- int mirror_num = bio_ctrl->mirror_num;
+
+ if (!bbio)
+ return;
+@@ -132,25 +115,14 @@ static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl)
+ /* Caller should ensure the bio has at least some range added */
+ ASSERT(bbio->bio.bi_iter.bi_size);
+
+- if (!is_data_inode(&bbio->inode->vfs_inode)) {
+- if (btrfs_op(&bbio->bio) != BTRFS_MAP_WRITE) {
+- /*
+- * For metadata read, we should have the parent_check,
+- * and copy it to bbio for metadata verification.
+- */
+- ASSERT(bio_ctrl->parent_check);
+- memcpy(&bbio->parent_check,
+- bio_ctrl->parent_check,
+- sizeof(struct btrfs_tree_parent_check));
+- }
++ if (!is_data_inode(&bbio->inode->vfs_inode))
+ bbio->bio.bi_opf |= REQ_META;
+- }
+
+ if (btrfs_op(&bbio->bio) == BTRFS_MAP_READ &&
+ bio_ctrl->compress_type != BTRFS_COMPRESS_NONE)
+- btrfs_submit_compressed_read(bbio, mirror_num);
++ btrfs_submit_compressed_read(bbio, 0);
+ else
+- btrfs_submit_bio(bbio, mirror_num);
++ btrfs_submit_bio(bbio, 0);
+
+ /* The bbio is owned by the end_io handler now */
+ bio_ctrl->bbio = NULL;
+@@ -1572,7 +1544,6 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl
+ {
+ struct folio *folio = page_folio(page);
+ struct inode *inode = page->mapping->host;
+- struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+ const u64 page_start = page_offset(page);
+ const u64 page_end = page_start + PAGE_SIZE - 1;
+ int ret;
+@@ -1605,13 +1576,11 @@ static int __extent_writepage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl
+ goto done;
+ }
+
+- if (!bio_ctrl->extent_locked) {
+- ret = writepage_delalloc(BTRFS_I(inode), page, bio_ctrl->wbc);
+- if (ret == 1)
+- return 0;
+- if (ret)
+- goto done;
+- }
++ ret = writepage_delalloc(BTRFS_I(inode), page, bio_ctrl->wbc);
++ if (ret == 1)
++ return 0;
++ if (ret)
++ goto done;
+
+ ret = __extent_writepage_io(BTRFS_I(inode), page, bio_ctrl, i_size, &nr);
+ if (ret == 1)
+@@ -1656,21 +1625,7 @@ done:
+ */
+ if (PageError(page))
+ end_extent_writepage(page, ret, page_start, page_end);
+- if (bio_ctrl->extent_locked) {
+- struct writeback_control *wbc = bio_ctrl->wbc;
+-
+- /*
+- * If bio_ctrl->extent_locked, it's from extent_write_locked_range(),
+- * the page can either be locked by lock_page() or
+- * process_one_page().
+- * Let btrfs_page_unlock_writer() handle both cases.
+- */
+- ASSERT(wbc);
+- btrfs_page_unlock_writer(fs_info, page, wbc->range_start,
+- wbc->range_end + 1 - wbc->range_start);
+- } else {
+- unlock_page(page);
+- }
++ unlock_page(page);
+ ASSERT(ret <= 0);
+ return ret;
+ }
+@@ -1691,42 +1646,24 @@ static void end_extent_buffer_writeback(struct extent_buffer *eb)
+ /*
+ * Lock extent buffer status and pages for writeback.
+ *
+- * May try to flush write bio if we can't get the lock.
+- *
+- * Return 0 if the extent buffer doesn't need to be submitted.
+- * (E.g. the extent buffer is not dirty)
+- * Return >0 is the extent buffer is submitted to bio.
+- * Return <0 if something went wrong, no page is locked.
++ * Return %false if the extent buffer doesn't need to be submitted (e.g. the
++ * extent buffer is not dirty)
++ * Return %true if the extent buffer is submitted to bio.
+ */
+-static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb,
+- struct btrfs_bio_ctrl *bio_ctrl)
++static noinline_for_stack bool lock_extent_buffer_for_io(struct extent_buffer *eb,
++ struct writeback_control *wbc)
+ {
+ struct btrfs_fs_info *fs_info = eb->fs_info;
+- int i, num_pages;
+- int flush = 0;
+- int ret = 0;
+-
+- if (!btrfs_try_tree_write_lock(eb)) {
+- submit_write_bio(bio_ctrl, 0);
+- flush = 1;
+- btrfs_tree_lock(eb);
+- }
++ bool ret = false;
++ int i;
+
+- if (test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags)) {
++ btrfs_tree_lock(eb);
++ while (test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags)) {
+ btrfs_tree_unlock(eb);
+- if (bio_ctrl->wbc->sync_mode != WB_SYNC_ALL)
+- return 0;
+- if (!flush) {
+- submit_write_bio(bio_ctrl, 0);
+- flush = 1;
+- }
+- while (1) {
+- wait_on_extent_buffer_writeback(eb);
+- btrfs_tree_lock(eb);
+- if (!test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags))
+- break;
+- btrfs_tree_unlock(eb);
+- }
++ if (wbc->sync_mode != WB_SYNC_ALL)
++ return false;
++ wait_on_extent_buffer_writeback(eb);
++ btrfs_tree_lock(eb);
+ }
+
+ /*
+@@ -1742,7 +1679,7 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb
+ percpu_counter_add_batch(&fs_info->dirty_metadata_bytes,
+ -eb->len,
+ fs_info->dirty_metadata_batch);
+- ret = 1;
++ ret = true;
+ } else {
+ spin_unlock(&eb->refs_lock);
+ }
+@@ -1758,19 +1695,8 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb
+ if (!ret || fs_info->nodesize < PAGE_SIZE)
+ return ret;
+
+- num_pages = num_extent_pages(eb);
+- for (i = 0; i < num_pages; i++) {
+- struct page *p = eb->pages[i];
+-
+- if (!trylock_page(p)) {
+- if (!flush) {
+- submit_write_bio(bio_ctrl, 0);
+- flush = 1;
+- }
+- lock_page(p);
+- }
+- }
+-
++ for (i = 0; i < num_extent_pages(eb); i++)
++ lock_page(eb->pages[i]);
+ return ret;
+ }
+
+@@ -2000,11 +1926,16 @@ static void prepare_eb_write(struct extent_buffer *eb)
+ * Page locking is only utilized at minimum to keep the VMM code happy.
+ */
+ static void write_one_subpage_eb(struct extent_buffer *eb,
+- struct btrfs_bio_ctrl *bio_ctrl)
++ struct writeback_control *wbc)
+ {
+ struct btrfs_fs_info *fs_info = eb->fs_info;
+ struct page *page = eb->pages[0];
+ bool no_dirty_ebs = false;
++ struct btrfs_bio_ctrl bio_ctrl = {
++ .wbc = wbc,
++ .opf = REQ_OP_WRITE | wbc_to_write_flags(wbc),
++ .end_io_func = end_bio_subpage_eb_writepage,
++ };
+
+ prepare_eb_write(eb);
+
+@@ -2018,40 +1949,43 @@ static void write_one_subpage_eb(struct extent_buffer *eb,
+ if (no_dirty_ebs)
+ clear_page_dirty_for_io(page);
+
+- bio_ctrl->end_io_func = end_bio_subpage_eb_writepage;
+-
+- submit_extent_page(bio_ctrl, eb->start, page, eb->len,
++ submit_extent_page(&bio_ctrl, eb->start, page, eb->len,
+ eb->start - page_offset(page));
+ unlock_page(page);
++ submit_one_bio(&bio_ctrl);
+ /*
+ * Submission finished without problem, if no range of the page is
+ * dirty anymore, we have submitted a page. Update nr_written in wbc.
+ */
+ if (no_dirty_ebs)
+- bio_ctrl->wbc->nr_to_write--;
++ wbc->nr_to_write--;
+ }
+
+ static noinline_for_stack void write_one_eb(struct extent_buffer *eb,
+- struct btrfs_bio_ctrl *bio_ctrl)
++ struct writeback_control *wbc)
+ {
+ u64 disk_bytenr = eb->start;
+ int i, num_pages;
++ struct btrfs_bio_ctrl bio_ctrl = {
++ .wbc = wbc,
++ .opf = REQ_OP_WRITE | wbc_to_write_flags(wbc),
++ .end_io_func = end_bio_extent_buffer_writepage,
++ };
+
+ prepare_eb_write(eb);
+
+- bio_ctrl->end_io_func = end_bio_extent_buffer_writepage;
+-
+ num_pages = num_extent_pages(eb);
+ for (i = 0; i < num_pages; i++) {
+ struct page *p = eb->pages[i];
+
+ clear_page_dirty_for_io(p);
+ set_page_writeback(p);
+- submit_extent_page(bio_ctrl, disk_bytenr, p, PAGE_SIZE, 0);
++ submit_extent_page(&bio_ctrl, disk_bytenr, p, PAGE_SIZE, 0);
+ disk_bytenr += PAGE_SIZE;
+- bio_ctrl->wbc->nr_to_write--;
++ wbc->nr_to_write--;
+ unlock_page(p);
+ }
++ submit_one_bio(&bio_ctrl);
+ }
+
+ /*
+@@ -2068,14 +2002,13 @@ static noinline_for_stack void write_one_eb(struct extent_buffer *eb,
+ * Return >=0 for the number of submitted extent buffers.
+ * Return <0 for fatal error.
+ */
+-static int submit_eb_subpage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl)
++static int submit_eb_subpage(struct page *page, struct writeback_control *wbc)
+ {
+ struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
+ int submitted = 0;
+ u64 page_start = page_offset(page);
+ int bit_start = 0;
+ int sectors_per_node = fs_info->nodesize >> fs_info->sectorsize_bits;
+- int ret;
+
+ /* Lock and write each dirty extent buffers in the range */
+ while (bit_start < fs_info->subpage_info->bitmap_nr_bits) {
+@@ -2121,25 +2054,13 @@ static int submit_eb_subpage(struct page *page, struct btrfs_bio_ctrl *bio_ctrl)
+ if (!eb)
+ continue;
+
+- ret = lock_extent_buffer_for_io(eb, bio_ctrl);
+- if (ret == 0) {
+- free_extent_buffer(eb);
+- continue;
++ if (lock_extent_buffer_for_io(eb, wbc)) {
++ write_one_subpage_eb(eb, wbc);
++ submitted++;
+ }
+- if (ret < 0) {
+- free_extent_buffer(eb);
+- goto cleanup;
+- }
+- write_one_subpage_eb(eb, bio_ctrl);
+ free_extent_buffer(eb);
+- submitted++;
+ }
+ return submitted;
+-
+-cleanup:
+- /* We hit error, end bio for the submitted extent buffers */
+- submit_write_bio(bio_ctrl, ret);
+- return ret;
+ }
+
+ /*
+@@ -2162,7 +2083,7 @@ cleanup:
+ * previous call.
+ * Return <0 for fatal error.
+ */
+-static int submit_eb_page(struct page *page, struct btrfs_bio_ctrl *bio_ctrl,
++static int submit_eb_page(struct page *page, struct writeback_control *wbc,
+ struct extent_buffer **eb_context)
+ {
+ struct address_space *mapping = page->mapping;
+@@ -2174,7 +2095,7 @@ static int submit_eb_page(struct page *page, struct btrfs_bio_ctrl *bio_ctrl,
+ return 0;
+
+ if (btrfs_sb(page->mapping->host->i_sb)->nodesize < PAGE_SIZE)
+- return submit_eb_subpage(page, bio_ctrl);
++ return submit_eb_subpage(page, wbc);
+
+ spin_lock(&mapping->private_lock);
+ if (!PagePrivate(page)) {
+@@ -2207,8 +2128,7 @@ static int submit_eb_page(struct page *page, struct btrfs_bio_ctrl *bio_ctrl,
+ * If for_sync, this hole will be filled with
+ * trasnsaction commit.
+ */
+- if (bio_ctrl->wbc->sync_mode == WB_SYNC_ALL &&
+- !bio_ctrl->wbc->for_sync)
++ if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync)
+ ret = -EAGAIN;
+ else
+ ret = 0;
+@@ -2218,13 +2138,12 @@ static int submit_eb_page(struct page *page, struct btrfs_bio_ctrl *bio_ctrl,
+
+ *eb_context = eb;
+
+- ret = lock_extent_buffer_for_io(eb, bio_ctrl);
+- if (ret <= 0) {
++ if (!lock_extent_buffer_for_io(eb, wbc)) {
+ btrfs_revert_meta_write_pointer(cache, eb);
+ if (cache)
+ btrfs_put_block_group(cache);
+ free_extent_buffer(eb);
+- return ret;
++ return 0;
+ }
+ if (cache) {
+ /*
+@@ -2233,7 +2152,7 @@ static int submit_eb_page(struct page *page, struct btrfs_bio_ctrl *bio_ctrl,
+ btrfs_schedule_zone_finish_bg(cache, eb);
+ btrfs_put_block_group(cache);
+ }
+- write_one_eb(eb, bio_ctrl);
++ write_one_eb(eb, wbc);
+ free_extent_buffer(eb);
+ return 1;
+ }
+@@ -2242,11 +2161,6 @@ int btree_write_cache_pages(struct address_space *mapping,
+ struct writeback_control *wbc)
+ {
+ struct extent_buffer *eb_context = NULL;
+- struct btrfs_bio_ctrl bio_ctrl = {
+- .wbc = wbc,
+- .opf = REQ_OP_WRITE | wbc_to_write_flags(wbc),
+- .extent_locked = 0,
+- };
+ struct btrfs_fs_info *fs_info = BTRFS_I(mapping->host)->root->fs_info;
+ int ret = 0;
+ int done = 0;
+@@ -2288,7 +2202,7 @@ retry:
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];
+
+- ret = submit_eb_page(&folio->page, &bio_ctrl, &eb_context);
++ ret = submit_eb_page(&folio->page, wbc, &eb_context);
+ if (ret == 0)
+ continue;
+ if (ret < 0) {
+@@ -2349,8 +2263,6 @@ retry:
+ ret = 0;
+ if (!ret && BTRFS_FS_ERROR(fs_info))
+ ret = -EROFS;
+- submit_write_bio(&bio_ctrl, ret);
+-
+ btrfs_zoned_meta_io_unlock(fs_info);
+ return ret;
+ }
+@@ -2520,38 +2432,31 @@ retry:
+ * already been ran (aka, ordered extent inserted) and all pages are still
+ * locked.
+ */
+-int extent_write_locked_range(struct inode *inode, u64 start, u64 end)
++int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
++ struct writeback_control *wbc)
+ {
+ bool found_error = false;
+ int first_error = 0;
+ int ret = 0;
+ struct address_space *mapping = inode->i_mapping;
+- struct page *page;
++ struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
++ const u32 sectorsize = fs_info->sectorsize;
++ loff_t i_size = i_size_read(inode);
+ u64 cur = start;
+- unsigned long nr_pages;
+- const u32 sectorsize = btrfs_sb(inode->i_sb)->sectorsize;
+- struct writeback_control wbc_writepages = {
+- .sync_mode = WB_SYNC_ALL,
+- .range_start = start,
+- .range_end = end + 1,
+- .no_cgroup_owner = 1,
+- };
+ struct btrfs_bio_ctrl bio_ctrl = {
+- .wbc = &wbc_writepages,
+- /* We're called from an async helper function */
+- .opf = REQ_OP_WRITE | REQ_BTRFS_CGROUP_PUNT |
+- wbc_to_write_flags(&wbc_writepages),
+- .extent_locked = 1,
++ .wbc = wbc,
++ .opf = REQ_OP_WRITE | wbc_to_write_flags(wbc),
+ };
+
++ if (wbc->no_cgroup_owner)
++ bio_ctrl.opf |= REQ_BTRFS_CGROUP_PUNT;
++
+ ASSERT(IS_ALIGNED(start, sectorsize) && IS_ALIGNED(end + 1, sectorsize));
+- nr_pages = (round_up(end, PAGE_SIZE) - round_down(start, PAGE_SIZE)) >>
+- PAGE_SHIFT;
+- wbc_writepages.nr_to_write = nr_pages * 2;
+
+- wbc_attach_fdatawrite_inode(&wbc_writepages, inode);
+ while (cur <= end) {
+ u64 cur_end = min(round_down(cur, PAGE_SIZE) + PAGE_SIZE - 1, end);
++ struct page *page;
++ int nr = 0;
+
+ page = find_get_page(mapping, cur >> PAGE_SHIFT);
+ /*
+@@ -2562,19 +2467,31 @@ int extent_write_locked_range(struct inode *inode, u64 start, u64 end)
+ ASSERT(PageLocked(page));
+ ASSERT(PageDirty(page));
+ clear_page_dirty_for_io(page);
+- ret = __extent_writepage(page, &bio_ctrl);
+- ASSERT(ret <= 0);
++
++ ret = __extent_writepage_io(BTRFS_I(inode), page, &bio_ctrl,
++ i_size, &nr);
++ if (ret == 1)
++ goto next_page;
++
++ /* Make sure the mapping tag for page dirty gets cleared. */
++ if (nr == 0) {
++ set_page_writeback(page);
++ end_page_writeback(page);
++ }
++ if (ret)
++ end_extent_writepage(page, ret, cur, cur_end);
++ btrfs_page_unlock_writer(fs_info, page, cur, cur_end + 1 - cur);
+ if (ret < 0) {
+ found_error = true;
+ first_error = ret;
+ }
++next_page:
+ put_page(page);
+ cur = cur_end + 1;
+ }
+
+ submit_write_bio(&bio_ctrl, found_error ? ret : 0);
+
+- wbc_detach_inode(&wbc_writepages);
+ if (found_error)
+ return first_error;
+ return ret;
+@@ -2588,7 +2505,6 @@ int extent_writepages(struct address_space *mapping,
+ struct btrfs_bio_ctrl bio_ctrl = {
+ .wbc = wbc,
+ .opf = REQ_OP_WRITE | wbc_to_write_flags(wbc),
+- .extent_locked = 0,
+ };
+
+ /*
+@@ -4148,7 +4064,7 @@ void btrfs_clear_buffer_dirty(struct btrfs_trans_handle *trans,
+ WARN_ON(atomic_read(&eb->refs) == 0);
+ }
+
+-bool set_extent_buffer_dirty(struct extent_buffer *eb)
++void set_extent_buffer_dirty(struct extent_buffer *eb)
+ {
+ int i;
+ int num_pages;
+@@ -4183,13 +4099,14 @@ bool set_extent_buffer_dirty(struct extent_buffer *eb)
+ eb->start, eb->len);
+ if (subpage)
+ unlock_page(eb->pages[0]);
++ percpu_counter_add_batch(&eb->fs_info->dirty_metadata_bytes,
++ eb->len,
++ eb->fs_info->dirty_metadata_batch);
+ }
+ #ifdef CONFIG_BTRFS_DEBUG
+ for (i = 0; i < num_pages; i++)
+ ASSERT(PageDirty(eb->pages[i]));
+ #endif
+-
+- return was_dirty;
+ }
+
+ void clear_extent_buffer_uptodate(struct extent_buffer *eb)
+@@ -4242,6 +4159,36 @@ void set_extent_buffer_uptodate(struct extent_buffer *eb)
+ }
+ }
+
++static void __read_extent_buffer_pages(struct extent_buffer *eb, int mirror_num,
++ struct btrfs_tree_parent_check *check)
++{
++ int num_pages = num_extent_pages(eb), i;
++ struct btrfs_bio *bbio;
++
++ clear_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags);
++ eb->read_mirror = 0;
++ atomic_set(&eb->io_pages, num_pages);
++ check_buffer_tree_ref(eb);
++
++ bbio = btrfs_bio_alloc(INLINE_EXTENT_BUFFER_PAGES,
++ REQ_OP_READ | REQ_META, eb->fs_info,
++ end_bio_extent_readpage, NULL);
++ bbio->bio.bi_iter.bi_sector = eb->start >> SECTOR_SHIFT;
++ bbio->inode = BTRFS_I(eb->fs_info->btree_inode);
++ bbio->file_offset = eb->start;
++ memcpy(&bbio->parent_check, check, sizeof(*check));
++ if (eb->fs_info->nodesize < PAGE_SIZE) {
++ __bio_add_page(&bbio->bio, eb->pages[0], eb->len,
++ eb->start - page_offset(eb->pages[0]));
++ } else {
++ for (i = 0; i < num_pages; i++) {
++ ClearPageError(eb->pages[i]);
++ __bio_add_page(&bbio->bio, eb->pages[i], PAGE_SIZE, 0);
++ }
++ }
++ btrfs_submit_bio(bbio, mirror_num);
++}
++
+ static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait,
+ int mirror_num,
+ struct btrfs_tree_parent_check *check)
+@@ -4250,11 +4197,6 @@ static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait,
+ struct extent_io_tree *io_tree;
+ struct page *page = eb->pages[0];
+ struct extent_state *cached_state = NULL;
+- struct btrfs_bio_ctrl bio_ctrl = {
+- .opf = REQ_OP_READ,
+- .mirror_num = mirror_num,
+- .parent_check = check,
+- };
+ int ret;
+
+ ASSERT(!test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags));
+@@ -4282,18 +4224,10 @@ static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait,
+ return 0;
+ }
+
+- clear_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags);
+- eb->read_mirror = 0;
+- atomic_set(&eb->io_pages, 1);
+- check_buffer_tree_ref(eb);
+- bio_ctrl.end_io_func = end_bio_extent_readpage;
+-
+ btrfs_subpage_clear_error(fs_info, page, eb->start, eb->len);
+-
+ btrfs_subpage_start_reader(fs_info, page, eb->start, eb->len);
+- submit_extent_page(&bio_ctrl, eb->start, page, eb->len,
+- eb->start - page_offset(page));
+- submit_one_bio(&bio_ctrl);
++
++ __read_extent_buffer_pages(eb, mirror_num, check);
+ if (wait != WAIT_COMPLETE) {
+ free_extent_state(cached_state);
+ return 0;
+@@ -4314,12 +4248,6 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num,
+ int locked_pages = 0;
+ int all_uptodate = 1;
+ int num_pages;
+- unsigned long num_reads = 0;
+- struct btrfs_bio_ctrl bio_ctrl = {
+- .opf = REQ_OP_READ,
+- .mirror_num = mirror_num,
+- .parent_check = check,
+- };
+
+ if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))
+ return 0;
+@@ -4360,10 +4288,8 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num,
+ */
+ for (i = 0; i < num_pages; i++) {
+ page = eb->pages[i];
+- if (!PageUptodate(page)) {
+- num_reads++;
++ if (!PageUptodate(page))
+ all_uptodate = 0;
+- }
+ }
+
+ if (all_uptodate) {
+@@ -4371,28 +4297,7 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num,
+ goto unlock_exit;
+ }
+
+- clear_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags);
+- eb->read_mirror = 0;
+- atomic_set(&eb->io_pages, num_reads);
+- /*
+- * It is possible for release_folio to clear the TREE_REF bit before we
+- * set io_pages. See check_buffer_tree_ref for a more detailed comment.
+- */
+- check_buffer_tree_ref(eb);
+- bio_ctrl.end_io_func = end_bio_extent_readpage;
+- for (i = 0; i < num_pages; i++) {
+- page = eb->pages[i];
+-
+- if (!PageUptodate(page)) {
+- ClearPageError(page);
+- submit_extent_page(&bio_ctrl, page_offset(page), page,
+- PAGE_SIZE, 0);
+- } else {
+- unlock_page(page);
+- }
+- }
+-
+- submit_one_bio(&bio_ctrl);
++ __read_extent_buffer_pages(eb, mirror_num, check);
+
+ if (wait != WAIT_COMPLETE)
+ return 0;
+diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
+index 4341ad978fb8e..c368e9f02c74e 100644
+--- a/fs/btrfs/extent_io.h
++++ b/fs/btrfs/extent_io.h
+@@ -179,7 +179,8 @@ int try_release_extent_mapping(struct page *page, gfp_t mask);
+ int try_release_extent_buffer(struct page *page);
+
+ int btrfs_read_folio(struct file *file, struct folio *folio);
+-int extent_write_locked_range(struct inode *inode, u64 start, u64 end);
++int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
++ struct writeback_control *wbc);
+ int extent_writepages(struct address_space *mapping,
+ struct writeback_control *wbc);
+ int btree_write_cache_pages(struct address_space *mapping,
+@@ -262,7 +263,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star
+ void extent_buffer_bitmap_clear(const struct extent_buffer *eb,
+ unsigned long start, unsigned long pos,
+ unsigned long len);
+-bool set_extent_buffer_dirty(struct extent_buffer *eb);
++void set_extent_buffer_dirty(struct extent_buffer *eb);
+ void set_extent_buffer_uptodate(struct extent_buffer *eb);
+ void clear_extent_buffer_uptodate(struct extent_buffer *eb);
+ int extent_buffer_under_io(const struct extent_buffer *eb);
+diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
+index b21da1446f2aa..045ddce32eca4 100644
+--- a/fs/btrfs/free-space-tree.c
++++ b/fs/btrfs/free-space-tree.c
+@@ -1280,7 +1280,10 @@ int btrfs_delete_free_space_tree(struct btrfs_fs_info *fs_info)
+ goto abort;
+
+ btrfs_global_root_delete(free_space_root);
++
++ spin_lock(&fs_info->trans_lock);
+ list_del(&free_space_root->dirty_list);
++ spin_unlock(&fs_info->trans_lock);
+
+ btrfs_tree_lock(free_space_root->node);
+ btrfs_clear_buffer_dirty(trans, free_space_root->node);
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index 7fcafcc5292c8..2e6eed4b1b3cc 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -934,6 +934,12 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
+ unsigned long nr_written = 0;
+ int page_started = 0;
+ int ret;
++ struct writeback_control wbc = {
++ .sync_mode = WB_SYNC_ALL,
++ .range_start = start,
++ .range_end = end,
++ .no_cgroup_owner = 1,
++ };
+
+ /*
+ * Call cow_file_range() to run the delalloc range directly, since we
+@@ -965,7 +971,10 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
+ }
+
+ /* All pages will be unlocked, including @locked_page */
+- return extent_write_locked_range(&inode->vfs_inode, start, end);
++ wbc_attach_fdatawrite_inode(&wbc, &inode->vfs_inode);
++ ret = extent_write_locked_range(&inode->vfs_inode, start, end, &wbc);
++ wbc_detach_inode(&wbc);
++ return ret;
+ }
+
+ static int submit_one_async_extent(struct btrfs_inode *inode,
+@@ -1521,58 +1530,36 @@ static noinline void async_cow_free(struct btrfs_work *work)
+ kvfree(async_cow);
+ }
+
+-static int cow_file_range_async(struct btrfs_inode *inode,
+- struct writeback_control *wbc,
+- struct page *locked_page,
+- u64 start, u64 end, int *page_started,
+- unsigned long *nr_written)
++static bool cow_file_range_async(struct btrfs_inode *inode,
++ struct writeback_control *wbc,
++ struct page *locked_page,
++ u64 start, u64 end, int *page_started,
++ unsigned long *nr_written)
+ {
+ struct btrfs_fs_info *fs_info = inode->root->fs_info;
+ struct cgroup_subsys_state *blkcg_css = wbc_blkcg_css(wbc);
+ struct async_cow *ctx;
+ struct async_chunk *async_chunk;
+ unsigned long nr_pages;
+- u64 cur_end;
+ u64 num_chunks = DIV_ROUND_UP(end - start, SZ_512K);
+ int i;
+- bool should_compress;
+ unsigned nofs_flag;
+ const blk_opf_t write_flags = wbc_to_write_flags(wbc);
+
+- unlock_extent(&inode->io_tree, start, end, NULL);
+-
+- if (inode->flags & BTRFS_INODE_NOCOMPRESS &&
+- !btrfs_test_opt(fs_info, FORCE_COMPRESS)) {
+- num_chunks = 1;
+- should_compress = false;
+- } else {
+- should_compress = true;
+- }
+-
+ nofs_flag = memalloc_nofs_save();
+ ctx = kvmalloc(struct_size(ctx, chunks, num_chunks), GFP_KERNEL);
+ memalloc_nofs_restore(nofs_flag);
++ if (!ctx)
++ return false;
+
+- if (!ctx) {
+- unsigned clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC |
+- EXTENT_DELALLOC_NEW | EXTENT_DEFRAG |
+- EXTENT_DO_ACCOUNTING;
+- unsigned long page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK |
+- PAGE_END_WRITEBACK | PAGE_SET_ERROR;
+-
+- extent_clear_unlock_delalloc(inode, start, end, locked_page,
+- clear_bits, page_ops);
+- return -ENOMEM;
+- }
++ unlock_extent(&inode->io_tree, start, end, NULL);
++ set_bit(BTRFS_INODE_HAS_ASYNC_EXTENT, &inode->runtime_flags);
+
+ async_chunk = ctx->chunks;
+ atomic_set(&ctx->num_chunks, num_chunks);
+
+ for (i = 0; i < num_chunks; i++) {
+- if (should_compress)
+- cur_end = min(end, start + SZ_512K - 1);
+- else
+- cur_end = end;
++ u64 cur_end = min(end, start + SZ_512K - 1);
+
+ /*
+ * igrab is called higher up in the call chain, take only the
+@@ -1633,13 +1620,14 @@ static int cow_file_range_async(struct btrfs_inode *inode,
+ start = cur_end + 1;
+ }
+ *page_started = 1;
+- return 0;
++ return true;
+ }
+
+ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
+ struct page *locked_page, u64 start,
+ u64 end, int *page_started,
+- unsigned long *nr_written)
++ unsigned long *nr_written,
++ struct writeback_control *wbc)
+ {
+ u64 done_offset = end;
+ int ret;
+@@ -1671,8 +1659,8 @@ static noinline int run_delalloc_zoned(struct btrfs_inode *inode,
+ account_page_redirty(locked_page);
+ }
+ locked_page_done = true;
+- extent_write_locked_range(&inode->vfs_inode, start, done_offset);
+-
++ extent_write_locked_range(&inode->vfs_inode, start, done_offset,
++ wbc);
+ start = done_offset + 1;
+ }
+
+@@ -2214,7 +2202,7 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
+ u64 start, u64 end, int *page_started, unsigned long *nr_written,
+ struct writeback_control *wbc)
+ {
+- int ret;
++ int ret = 0;
+ const bool zoned = btrfs_is_zoned(inode->root->fs_info);
+
+ /*
+@@ -2235,19 +2223,23 @@ int btrfs_run_delalloc_range(struct btrfs_inode *inode, struct page *locked_page
+ ASSERT(!zoned || btrfs_is_data_reloc_root(inode->root));
+ ret = run_delalloc_nocow(inode, locked_page, start, end,
+ page_started, nr_written);
+- } else if (!btrfs_inode_can_compress(inode) ||
+- !inode_need_compress(inode, start, end)) {
+- if (zoned)
+- ret = run_delalloc_zoned(inode, locked_page, start, end,
+- page_started, nr_written);
+- else
+- ret = cow_file_range(inode, locked_page, start, end,
+- page_started, nr_written, 1, NULL);
+- } else {
+- set_bit(BTRFS_INODE_HAS_ASYNC_EXTENT, &inode->runtime_flags);
+- ret = cow_file_range_async(inode, wbc, locked_page, start, end,
+- page_started, nr_written);
++ goto out;
+ }
++
++ if (btrfs_inode_can_compress(inode) &&
++ inode_need_compress(inode, start, end) &&
++ cow_file_range_async(inode, wbc, locked_page, start,
++ end, page_started, nr_written))
++ goto out;
++
++ if (zoned)
++ ret = run_delalloc_zoned(inode, locked_page, start, end,
++ page_started, nr_written, wbc);
++ else
++ ret = cow_file_range(inode, locked_page, start, end,
++ page_started, nr_written, 1, NULL);
++
++out:
+ ASSERT(ret <= 0);
+ if (ret)
+ btrfs_cleanup_ordered_extents(inode, locked_page, start,
+diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
+index 3a496b0d3d2b0..7979449a58d6b 100644
+--- a/fs/btrfs/locking.c
++++ b/fs/btrfs/locking.c
+@@ -57,8 +57,8 @@
+
+ static struct btrfs_lockdep_keyset {
+ u64 id; /* root objectid */
+- /* Longest entry: btrfs-free-space-00 */
+- char names[BTRFS_MAX_LEVEL][20];
++ /* Longest entry: btrfs-block-group-00 */
++ char names[BTRFS_MAX_LEVEL][24];
+ struct lock_class_key keys[BTRFS_MAX_LEVEL];
+ } btrfs_lockdep_keysets[] = {
+ { .id = BTRFS_ROOT_TREE_OBJECTID, DEFINE_NAME("root") },
+@@ -72,6 +72,7 @@ static struct btrfs_lockdep_keyset {
+ { .id = BTRFS_DATA_RELOC_TREE_OBJECTID, DEFINE_NAME("dreloc") },
+ { .id = BTRFS_UUID_TREE_OBJECTID, DEFINE_NAME("uuid") },
+ { .id = BTRFS_FREE_SPACE_TREE_OBJECTID, DEFINE_NAME("free-space") },
++ { .id = BTRFS_BLOCK_GROUP_TREE_OBJECTID, DEFINE_NAME("block-group") },
+ { .id = 0, DEFINE_NAME("tree") },
+ };
+
+diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
+index f41da7ac360d8..f8735b31da16f 100644
+--- a/fs/btrfs/qgroup.c
++++ b/fs/btrfs/qgroup.c
+@@ -1301,7 +1301,9 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
+ goto out;
+ }
+
++ spin_lock(&fs_info->trans_lock);
+ list_del("a_root->dirty_list);
++ spin_unlock(&fs_info->trans_lock);
+
+ btrfs_tree_lock(quota_root->node);
+ btrfs_clear_buffer_dirty(trans, quota_root->node);
+diff --git a/fs/btrfs/tree-mod-log.c b/fs/btrfs/tree-mod-log.c
+index a555baa0143ac..07c086f9e35e9 100644
+--- a/fs/btrfs/tree-mod-log.c
++++ b/fs/btrfs/tree-mod-log.c
+@@ -248,6 +248,26 @@ int btrfs_tree_mod_log_insert_key(struct extent_buffer *eb, int slot,
+ return ret;
+ }
+
++static struct tree_mod_elem *tree_mod_log_alloc_move(struct extent_buffer *eb,
++ int dst_slot, int src_slot,
++ int nr_items)
++{
++ struct tree_mod_elem *tm;
++
++ tm = kzalloc(sizeof(*tm), GFP_NOFS);
++ if (!tm)
++ return ERR_PTR(-ENOMEM);
++
++ tm->logical = eb->start;
++ tm->slot = src_slot;
++ tm->move.dst_slot = dst_slot;
++ tm->move.nr_items = nr_items;
++ tm->op = BTRFS_MOD_LOG_MOVE_KEYS;
++ RB_CLEAR_NODE(&tm->node);
++
++ return tm;
++}
++
+ int btrfs_tree_mod_log_insert_move(struct extent_buffer *eb,
+ int dst_slot, int src_slot,
+ int nr_items)
+@@ -265,18 +285,13 @@ int btrfs_tree_mod_log_insert_move(struct extent_buffer *eb,
+ if (!tm_list)
+ return -ENOMEM;
+
+- tm = kzalloc(sizeof(*tm), GFP_NOFS);
+- if (!tm) {
+- ret = -ENOMEM;
++ tm = tree_mod_log_alloc_move(eb, dst_slot, src_slot, nr_items);
++ if (IS_ERR(tm)) {
++ ret = PTR_ERR(tm);
++ tm = NULL;
+ goto free_tms;
+ }
+
+- tm->logical = eb->start;
+- tm->slot = src_slot;
+- tm->move.dst_slot = dst_slot;
+- tm->move.nr_items = nr_items;
+- tm->op = BTRFS_MOD_LOG_MOVE_KEYS;
+-
+ for (i = 0; i + dst_slot < src_slot && i < nr_items; i++) {
+ tm_list[i] = alloc_tree_mod_elem(eb, i + dst_slot,
+ BTRFS_MOD_LOG_KEY_REMOVE_WHILE_MOVING);
+@@ -489,6 +504,10 @@ int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst,
+ struct tree_mod_elem **tm_list_add, **tm_list_rem;
+ int i;
+ bool locked = false;
++ struct tree_mod_elem *dst_move_tm = NULL;
++ struct tree_mod_elem *src_move_tm = NULL;
++ u32 dst_move_nr_items = btrfs_header_nritems(dst) - dst_offset;
++ u32 src_move_nr_items = btrfs_header_nritems(src) - (src_offset + nr_items);
+
+ if (!tree_mod_need_log(fs_info, NULL))
+ return 0;
+@@ -501,6 +520,26 @@ int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst,
+ if (!tm_list)
+ return -ENOMEM;
+
++ if (dst_move_nr_items) {
++ dst_move_tm = tree_mod_log_alloc_move(dst, dst_offset + nr_items,
++ dst_offset, dst_move_nr_items);
++ if (IS_ERR(dst_move_tm)) {
++ ret = PTR_ERR(dst_move_tm);
++ dst_move_tm = NULL;
++ goto free_tms;
++ }
++ }
++ if (src_move_nr_items) {
++ src_move_tm = tree_mod_log_alloc_move(src, src_offset,
++ src_offset + nr_items,
++ src_move_nr_items);
++ if (IS_ERR(src_move_tm)) {
++ ret = PTR_ERR(src_move_tm);
++ src_move_tm = NULL;
++ goto free_tms;
++ }
++ }
++
+ tm_list_add = tm_list;
+ tm_list_rem = tm_list + nr_items;
+ for (i = 0; i < nr_items; i++) {
+@@ -523,6 +562,11 @@ int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst,
+ goto free_tms;
+ locked = true;
+
++ if (dst_move_tm) {
++ ret = tree_mod_log_insert(fs_info, dst_move_tm);
++ if (ret)
++ goto free_tms;
++ }
+ for (i = 0; i < nr_items; i++) {
+ ret = tree_mod_log_insert(fs_info, tm_list_rem[i]);
+ if (ret)
+@@ -531,6 +575,11 @@ int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst,
+ if (ret)
+ goto free_tms;
+ }
++ if (src_move_tm) {
++ ret = tree_mod_log_insert(fs_info, src_move_tm);
++ if (ret)
++ goto free_tms;
++ }
+
+ write_unlock(&fs_info->tree_mod_log_lock);
+ kfree(tm_list);
+@@ -538,6 +587,12 @@ int btrfs_tree_mod_log_eb_copy(struct extent_buffer *dst,
+ return 0;
+
+ free_tms:
++ if (dst_move_tm && !RB_EMPTY_NODE(&dst_move_tm->node))
++ rb_erase(&dst_move_tm->node, &fs_info->tree_mod_log);
++ kfree(dst_move_tm);
++ if (src_move_tm && !RB_EMPTY_NODE(&src_move_tm->node))
++ rb_erase(&src_move_tm->node, &fs_info->tree_mod_log);
++ kfree(src_move_tm);
+ for (i = 0; i < nr_items * 2; i++) {
+ if (tm_list[i] && !RB_EMPTY_NODE(&tm_list[i]->node))
+ rb_erase(&tm_list[i]->node, &fs_info->tree_mod_log);
+@@ -664,10 +719,27 @@ static void tree_mod_log_rewind(struct btrfs_fs_info *fs_info,
+ unsigned long o_dst;
+ unsigned long o_src;
+ unsigned long p_size = sizeof(struct btrfs_key_ptr);
++ /*
++ * max_slot tracks the maximum valid slot of the rewind eb at every
++ * step of the rewind. This is in contrast with 'n' which eventually
++ * matches the number of items, but can be wrong during moves or if
++ * removes overlap on already valid slots (which is probably separately
++ * a bug). We do this to validate the offsets of memmoves for rewinding
++ * moves and detect invalid memmoves.
++ *
++ * Since a rewind eb can start empty, max_slot is a signed integer with
++ * a special meaning for -1, which is that no slot is valid to move out
++ * of. Any other negative value is invalid.
++ */
++ int max_slot;
++ int move_src_end_slot;
++ int move_dst_end_slot;
+
+ n = btrfs_header_nritems(eb);
++ max_slot = n - 1;
+ read_lock(&fs_info->tree_mod_log_lock);
+ while (tm && tm->seq >= time_seq) {
++ ASSERT(max_slot >= -1);
+ /*
+ * All the operations are recorded with the operator used for
+ * the modification. As we're going backwards, we do the
+@@ -684,6 +756,8 @@ static void tree_mod_log_rewind(struct btrfs_fs_info *fs_info,
+ btrfs_set_node_ptr_generation(eb, tm->slot,
+ tm->generation);
+ n++;
++ if (tm->slot > max_slot)
++ max_slot = tm->slot;
+ break;
+ case BTRFS_MOD_LOG_KEY_REPLACE:
+ BUG_ON(tm->slot >= n);
+@@ -693,14 +767,37 @@ static void tree_mod_log_rewind(struct btrfs_fs_info *fs_info,
+ tm->generation);
+ break;
+ case BTRFS_MOD_LOG_KEY_ADD:
++ /*
++ * It is possible we could have already removed keys
++ * behind the known max slot, so this will be an
++ * overestimate. In practice, the copy operation
++ * inserts them in increasing order, and overestimating
++ * just means we miss some warnings, so it's OK. It
++ * isn't worth carefully tracking the full array of
++ * valid slots to check against when moving.
++ */
++ if (tm->slot == max_slot)
++ max_slot--;
+ /* if a move operation is needed it's in the log */
+ n--;
+ break;
+ case BTRFS_MOD_LOG_MOVE_KEYS:
++ ASSERT(tm->move.nr_items > 0);
++ move_src_end_slot = tm->move.dst_slot + tm->move.nr_items - 1;
++ move_dst_end_slot = tm->slot + tm->move.nr_items - 1;
+ o_dst = btrfs_node_key_ptr_offset(eb, tm->slot);
+ o_src = btrfs_node_key_ptr_offset(eb, tm->move.dst_slot);
++ if (WARN_ON(move_src_end_slot > max_slot ||
++ tm->move.nr_items <= 0)) {
++ btrfs_warn(fs_info,
++"move from invalid tree mod log slot eb %llu slot %d dst_slot %d nr_items %d seq %llu n %u max_slot %d",
++ eb->start, tm->slot,
++ tm->move.dst_slot, tm->move.nr_items,
++ tm->seq, n, max_slot);
++ }
+ memmove_extent_buffer(eb, o_dst, o_src,
+ tm->move.nr_items * p_size);
++ max_slot = move_dst_end_slot;
+ break;
+ case BTRFS_MOD_LOG_ROOT_REPLACE:
+ /*
+diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
+index 160b3da43aecd..502893e3da010 100644
+--- a/fs/erofs/zdata.c
++++ b/fs/erofs/zdata.c
+@@ -94,11 +94,8 @@ struct z_erofs_pcluster {
+
+ /* let's avoid the valid 32-bit kernel addresses */
+
+-/* the chained workgroup has't submitted io (still open) */
++/* the end of a chain of pclusters */
+ #define Z_EROFS_PCLUSTER_TAIL ((void *)0x5F0ECAFE)
+-/* the chained workgroup has already submitted io */
+-#define Z_EROFS_PCLUSTER_TAIL_CLOSED ((void *)0x5F0EDEAD)
+-
+ #define Z_EROFS_PCLUSTER_NIL (NULL)
+
+ struct z_erofs_decompressqueue {
+@@ -499,20 +496,6 @@ out_error_pcluster_pool:
+
+ enum z_erofs_pclustermode {
+ Z_EROFS_PCLUSTER_INFLIGHT,
+- /*
+- * The current pclusters was the tail of an exist chain, in addition
+- * that the previous processed chained pclusters are all decided to
+- * be hooked up to it.
+- * A new chain will be created for the remaining pclusters which are
+- * not processed yet, so different from Z_EROFS_PCLUSTER_FOLLOWED,
+- * the next pcluster cannot reuse the whole page safely for inplace I/O
+- * in the following scenario:
+- * ________________________________________________________________
+- * | tail (partial) page | head (partial) page |
+- * | (belongs to the next pcl) | (belongs to the current pcl) |
+- * |_______PCLUSTER_FOLLOWED______|________PCLUSTER_HOOKED__________|
+- */
+- Z_EROFS_PCLUSTER_HOOKED,
+ /*
+ * a weak form of Z_EROFS_PCLUSTER_FOLLOWED, the difference is that it
+ * could be dispatched into bypass queue later due to uptodated managed
+@@ -530,8 +513,8 @@ enum z_erofs_pclustermode {
+ * ________________________________________________________________
+ * | tail (partial) page | head (partial) page |
+ * | (of the current cl) | (of the previous collection) |
+- * | PCLUSTER_FOLLOWED or | |
+- * |_____PCLUSTER_HOOKED__|___________PCLUSTER_FOLLOWED____________|
++ * | | |
++ * |__PCLUSTER_FOLLOWED___|___________PCLUSTER_FOLLOWED____________|
+ *
+ * [ (*) the above page can be used as inplace I/O. ]
+ */
+@@ -544,7 +527,7 @@ struct z_erofs_decompress_frontend {
+ struct z_erofs_bvec_iter biter;
+
+ struct page *candidate_bvpage;
+- struct z_erofs_pcluster *pcl, *tailpcl;
++ struct z_erofs_pcluster *pcl;
+ z_erofs_next_pcluster_t owned_head;
+ enum z_erofs_pclustermode mode;
+
+@@ -750,19 +733,7 @@ static void z_erofs_try_to_claim_pcluster(struct z_erofs_decompress_frontend *f)
+ return;
+ }
+
+- /*
+- * type 2, link to the end of an existing open chain, be careful
+- * that its submission is controlled by the original attached chain.
+- */
+- if (*owned_head != &pcl->next && pcl != f->tailpcl &&
+- cmpxchg(&pcl->next, Z_EROFS_PCLUSTER_TAIL,
+- *owned_head) == Z_EROFS_PCLUSTER_TAIL) {
+- *owned_head = Z_EROFS_PCLUSTER_TAIL;
+- f->mode = Z_EROFS_PCLUSTER_HOOKED;
+- f->tailpcl = NULL;
+- return;
+- }
+- /* type 3, it belongs to a chain, but it isn't the end of the chain */
++ /* type 2, it belongs to an ongoing chain */
+ f->mode = Z_EROFS_PCLUSTER_INFLIGHT;
+ }
+
+@@ -823,9 +794,6 @@ static int z_erofs_register_pcluster(struct z_erofs_decompress_frontend *fe)
+ goto err_out;
+ }
+ }
+- /* used to check tail merging loop due to corrupted images */
+- if (fe->owned_head == Z_EROFS_PCLUSTER_TAIL)
+- fe->tailpcl = pcl;
+ fe->owned_head = &pcl->next;
+ fe->pcl = pcl;
+ return 0;
+@@ -846,7 +814,6 @@ static int z_erofs_collector_begin(struct z_erofs_decompress_frontend *fe)
+
+ /* must be Z_EROFS_PCLUSTER_TAIL or pointed to previous pcluster */
+ DBG_BUGON(fe->owned_head == Z_EROFS_PCLUSTER_NIL);
+- DBG_BUGON(fe->owned_head == Z_EROFS_PCLUSTER_TAIL_CLOSED);
+
+ if (!(map->m_flags & EROFS_MAP_META)) {
+ grp = erofs_find_workgroup(fe->inode->i_sb,
+@@ -865,10 +832,6 @@ static int z_erofs_collector_begin(struct z_erofs_decompress_frontend *fe)
+
+ if (ret == -EEXIST) {
+ mutex_lock(&fe->pcl->lock);
+- /* used to check tail merging loop due to corrupted images */
+- if (fe->owned_head == Z_EROFS_PCLUSTER_TAIL)
+- fe->tailpcl = fe->pcl;
+-
+ z_erofs_try_to_claim_pcluster(fe);
+ } else if (ret) {
+ return ret;
+@@ -1025,8 +988,7 @@ hitted:
+ * those chains are handled asynchronously thus the page cannot be used
+ * for inplace I/O or bvpage (should be processed in a strict order.)
+ */
+- tight &= (fe->mode >= Z_EROFS_PCLUSTER_HOOKED &&
+- fe->mode != Z_EROFS_PCLUSTER_FOLLOWED_NOINPLACE);
++ tight &= (fe->mode > Z_EROFS_PCLUSTER_FOLLOWED_NOINPLACE);
+
+ cur = end - min_t(unsigned int, offset + end - map->m_la, end);
+ if (!(map->m_flags & EROFS_MAP_MAPPED)) {
+@@ -1404,10 +1366,7 @@ static void z_erofs_decompress_queue(const struct z_erofs_decompressqueue *io,
+ };
+ z_erofs_next_pcluster_t owned = io->head;
+
+- while (owned != Z_EROFS_PCLUSTER_TAIL_CLOSED) {
+- /* impossible that 'owned' equals Z_EROFS_WORK_TPTR_TAIL */
+- DBG_BUGON(owned == Z_EROFS_PCLUSTER_TAIL);
+- /* impossible that 'owned' equals Z_EROFS_PCLUSTER_NIL */
++ while (owned != Z_EROFS_PCLUSTER_TAIL) {
+ DBG_BUGON(owned == Z_EROFS_PCLUSTER_NIL);
+
+ be.pcl = container_of(owned, struct z_erofs_pcluster, next);
+@@ -1424,7 +1383,7 @@ static void z_erofs_decompressqueue_work(struct work_struct *work)
+ container_of(work, struct z_erofs_decompressqueue, u.work);
+ struct page *pagepool = NULL;
+
+- DBG_BUGON(bgq->head == Z_EROFS_PCLUSTER_TAIL_CLOSED);
++ DBG_BUGON(bgq->head == Z_EROFS_PCLUSTER_TAIL);
+ z_erofs_decompress_queue(bgq, &pagepool);
+ erofs_release_pages(&pagepool);
+ kvfree(bgq);
+@@ -1612,7 +1571,7 @@ fg_out:
+ q->sync = true;
+ }
+ q->sb = sb;
+- q->head = Z_EROFS_PCLUSTER_TAIL_CLOSED;
++ q->head = Z_EROFS_PCLUSTER_TAIL;
+ return q;
+ }
+
+@@ -1630,11 +1589,7 @@ static void move_to_bypass_jobqueue(struct z_erofs_pcluster *pcl,
+ z_erofs_next_pcluster_t *const submit_qtail = qtail[JQ_SUBMIT];
+ z_erofs_next_pcluster_t *const bypass_qtail = qtail[JQ_BYPASS];
+
+- DBG_BUGON(owned_head == Z_EROFS_PCLUSTER_TAIL_CLOSED);
+- if (owned_head == Z_EROFS_PCLUSTER_TAIL)
+- owned_head = Z_EROFS_PCLUSTER_TAIL_CLOSED;
+-
+- WRITE_ONCE(pcl->next, Z_EROFS_PCLUSTER_TAIL_CLOSED);
++ WRITE_ONCE(pcl->next, Z_EROFS_PCLUSTER_TAIL);
+
+ WRITE_ONCE(*submit_qtail, owned_head);
+ WRITE_ONCE(*bypass_qtail, &pcl->next);
+@@ -1705,15 +1660,10 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f,
+ unsigned int i = 0;
+ bool bypass = true;
+
+- /* no possible 'owned_head' equals the following */
+- DBG_BUGON(owned_head == Z_EROFS_PCLUSTER_TAIL_CLOSED);
+ DBG_BUGON(owned_head == Z_EROFS_PCLUSTER_NIL);
+-
+ pcl = container_of(owned_head, struct z_erofs_pcluster, next);
++ owned_head = READ_ONCE(pcl->next);
+
+- /* close the main owned chain at first */
+- owned_head = cmpxchg(&pcl->next, Z_EROFS_PCLUSTER_TAIL,
+- Z_EROFS_PCLUSTER_TAIL_CLOSED);
+ if (z_erofs_is_inline_pcluster(pcl)) {
+ move_to_bypass_jobqueue(pcl, qtail, owned_head);
+ continue;
+diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
+index d37c5c89c7287..920fb4dbc731c 100644
+--- a/fs/erofs/zmap.c
++++ b/fs/erofs/zmap.c
+@@ -129,7 +129,7 @@ static int unpack_compacted_index(struct z_erofs_maprecorder *m,
+ u8 *in, type;
+ bool big_pcluster;
+
+- if (1 << amortizedshift == 4)
++ if (1 << amortizedshift == 4 && lclusterbits <= 14)
+ vcnt = 2;
+ else if (1 << amortizedshift == 2 && lclusterbits == 12)
+ vcnt = 16;
+@@ -231,7 +231,6 @@ static int compacted_load_cluster_from_disk(struct z_erofs_maprecorder *m,
+ {
+ struct inode *const inode = m->inode;
+ struct erofs_inode *const vi = EROFS_I(inode);
+- const unsigned int lclusterbits = vi->z_logical_clusterbits;
+ const erofs_off_t ebase = sizeof(struct z_erofs_map_header) +
+ ALIGN(erofs_iloc(inode) + vi->inode_isize + vi->xattr_isize, 8);
+ unsigned int totalidx = erofs_iblks(inode);
+@@ -239,9 +238,6 @@ static int compacted_load_cluster_from_disk(struct z_erofs_maprecorder *m,
+ unsigned int amortizedshift;
+ erofs_off_t pos;
+
+- if (lclusterbits != 12)
+- return -EOPNOTSUPP;
+-
+ if (lcn >= totalidx)
+ return -EINVAL;
+
+diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
+index 45b579805c954..0caf6c730ce34 100644
+--- a/fs/ext4/namei.c
++++ b/fs/ext4/namei.c
+@@ -3834,19 +3834,10 @@ static int ext4_rename(struct mnt_idmap *idmap, struct inode *old_dir,
+ return retval;
+ }
+
+- /*
+- * We need to protect against old.inode directory getting converted
+- * from inline directory format into a normal one.
+- */
+- if (S_ISDIR(old.inode->i_mode))
+- inode_lock_nested(old.inode, I_MUTEX_NONDIR2);
+-
+ old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de,
+ &old.inlined);
+- if (IS_ERR(old.bh)) {
+- retval = PTR_ERR(old.bh);
+- goto unlock_moved_dir;
+- }
++ if (IS_ERR(old.bh))
++ return PTR_ERR(old.bh);
+
+ /*
+ * Check for inode number is _not_ due to possible IO errors.
+@@ -4043,10 +4034,6 @@ release_bh:
+ brelse(old.bh);
+ brelse(new.bh);
+
+-unlock_moved_dir:
+- if (S_ISDIR(old.inode->i_mode))
+- inode_unlock(old.inode);
+-
+ return retval;
+ }
+
+diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
+index 64b3860f50ee5..8fd3b7f9fb88e 100644
+--- a/fs/f2fs/checkpoint.c
++++ b/fs/f2fs/checkpoint.c
+@@ -30,12 +30,9 @@ void f2fs_stop_checkpoint(struct f2fs_sb_info *sbi, bool end_io,
+ unsigned char reason)
+ {
+ f2fs_build_fault_attr(sbi, 0, 0);
+- set_ckpt_flags(sbi, CP_ERROR_FLAG);
+- if (!end_io) {
++ if (!end_io)
+ f2fs_flush_merged_writes(sbi);
+-
+- f2fs_handle_stop(sbi, reason);
+- }
++ f2fs_handle_critical_error(sbi, reason, end_io);
+ }
+
+ /*
+diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
+index 11653fa792897..1132d3cd8f337 100644
+--- a/fs/f2fs/compress.c
++++ b/fs/f2fs/compress.c
+@@ -743,8 +743,8 @@ void f2fs_decompress_cluster(struct decompress_io_ctx *dic, bool in_task)
+ ret = -EFSCORRUPTED;
+
+ /* Avoid f2fs_commit_super in irq context */
+- if (in_task)
+- f2fs_save_errors(sbi, ERROR_FAIL_DECOMPRESSION);
++ if (!in_task)
++ f2fs_handle_error_async(sbi, ERROR_FAIL_DECOMPRESSION);
+ else
+ f2fs_handle_error(sbi, ERROR_FAIL_DECOMPRESSION);
+ goto out_release;
+@@ -1215,6 +1215,7 @@ static int f2fs_write_compressed_pages(struct compress_ctx *cc,
+ unsigned int last_index = cc->cluster_size - 1;
+ loff_t psize;
+ int i, err;
++ bool quota_inode = IS_NOQUOTA(inode);
+
+ /* we should bypass data pages to proceed the kworker jobs */
+ if (unlikely(f2fs_cp_error(sbi))) {
+@@ -1222,7 +1223,7 @@ static int f2fs_write_compressed_pages(struct compress_ctx *cc,
+ goto out_free;
+ }
+
+- if (IS_NOQUOTA(inode)) {
++ if (quota_inode) {
+ /*
+ * We need to wait for node_write to avoid block allocation during
+ * checkpoint. This can only happen to quota writes which can cause
+@@ -1344,7 +1345,7 @@ unlock_continue:
+ set_inode_flag(inode, FI_FIRST_BLOCK_WRITTEN);
+
+ f2fs_put_dnode(&dn);
+- if (IS_NOQUOTA(inode))
++ if (quota_inode)
+ f2fs_up_read(&sbi->node_write);
+ else
+ f2fs_unlock_op(sbi);
+@@ -1370,7 +1371,7 @@ out_put_cic:
+ out_put_dnode:
+ f2fs_put_dnode(&dn);
+ out_unlock_op:
+- if (IS_NOQUOTA(inode))
++ if (quota_inode)
+ f2fs_up_read(&sbi->node_write);
+ else
+ f2fs_unlock_op(sbi);
+diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
+index 7165b1202f539..15b6dc2e06410 100644
+--- a/fs/f2fs/data.c
++++ b/fs/f2fs/data.c
+@@ -2775,6 +2775,7 @@ int f2fs_write_single_data_page(struct page *page, int *submitted,
+ loff_t psize = (loff_t)(page->index + 1) << PAGE_SHIFT;
+ unsigned offset = 0;
+ bool need_balance_fs = false;
++ bool quota_inode = IS_NOQUOTA(inode);
+ int err = 0;
+ struct f2fs_io_info fio = {
+ .sbi = sbi,
+@@ -2807,6 +2808,10 @@ int f2fs_write_single_data_page(struct page *page, int *submitted,
+ if (S_ISDIR(inode->i_mode) &&
+ !is_sbi_flag_set(sbi, SBI_IS_CLOSE))
+ goto redirty_out;
++
++ /* keep data pages in remount-ro mode */
++ if (F2FS_OPTION(sbi).errors == MOUNT_ERRORS_READONLY)
++ goto redirty_out;
+ goto out;
+ }
+
+@@ -2832,19 +2837,19 @@ write:
+ goto out;
+
+ /* Dentry/quota blocks are controlled by checkpoint */
+- if (S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode)) {
++ if (S_ISDIR(inode->i_mode) || quota_inode) {
+ /*
+ * We need to wait for node_write to avoid block allocation during
+ * checkpoint. This can only happen to quota writes which can cause
+ * the below discard race condition.
+ */
+- if (IS_NOQUOTA(inode))
++ if (quota_inode)
+ f2fs_down_read(&sbi->node_write);
+
+ fio.need_lock = LOCK_DONE;
+ err = f2fs_do_write_data_page(&fio);
+
+- if (IS_NOQUOTA(inode))
++ if (quota_inode)
+ f2fs_up_read(&sbi->node_write);
+
+ goto done;
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
+index d211ee89c1586..d867056a01f65 100644
+--- a/fs/f2fs/f2fs.h
++++ b/fs/f2fs/f2fs.h
+@@ -162,6 +162,7 @@ struct f2fs_mount_info {
+ int fs_mode; /* fs mode: LFS or ADAPTIVE */
+ int bggc_mode; /* bggc mode: off, on or sync */
+ int memory_mode; /* memory mode */
++ int errors; /* errors parameter */
+ int discard_unit; /*
+ * discard command's offset/size should
+ * be aligned to this unit: block,
+@@ -1370,6 +1371,12 @@ enum {
+ MEMORY_MODE_LOW, /* memory mode for low memry devices */
+ };
+
++enum errors_option {
++ MOUNT_ERRORS_READONLY, /* remount fs ro on errors */
++ MOUNT_ERRORS_CONTINUE, /* continue on errors */
++ MOUNT_ERRORS_PANIC, /* panic on errors */
++};
++
+ static inline int f2fs_test_bit(unsigned int nr, char *addr);
+ static inline void f2fs_set_bit(unsigned int nr, char *addr);
+ static inline void f2fs_clear_bit(unsigned int nr, char *addr);
+@@ -1721,8 +1728,14 @@ struct f2fs_sb_info {
+
+ struct workqueue_struct *post_read_wq; /* post read workqueue */
+
+- unsigned char errors[MAX_F2FS_ERRORS]; /* error flags */
+- spinlock_t error_lock; /* protect errors array */
++ /*
++	 * If we are in irq context, let's update the error information into
++	 * the on-disk superblock from the work item.
++ */
++ struct work_struct s_error_work;
++ unsigned char errors[MAX_F2FS_ERRORS]; /* error flags */
++ unsigned char stop_reason[MAX_STOP_REASON]; /* stop reason */
++ spinlock_t error_lock; /* protect errors/stop_reason array */
+ bool error_dirty; /* errors of sb is dirty */
+
+ struct kmem_cache *inline_xattr_slab; /* inline xattr entry */
+@@ -3541,9 +3554,11 @@ int f2fs_enable_quota_files(struct f2fs_sb_info *sbi, bool rdonly);
+ int f2fs_quota_sync(struct super_block *sb, int type);
+ loff_t max_file_blocks(struct inode *inode);
+ void f2fs_quota_off_umount(struct super_block *sb);
+-void f2fs_handle_stop(struct f2fs_sb_info *sbi, unsigned char reason);
+ void f2fs_save_errors(struct f2fs_sb_info *sbi, unsigned char flag);
++void f2fs_handle_critical_error(struct f2fs_sb_info *sbi, unsigned char reason,
++ bool irq_context);
+ void f2fs_handle_error(struct f2fs_sb_info *sbi, unsigned char error);
++void f2fs_handle_error_async(struct f2fs_sb_info *sbi, unsigned char error);
+ int f2fs_commit_super(struct f2fs_sb_info *sbi, bool recover);
+ int f2fs_sync_fs(struct super_block *sb, int sync);
+ int f2fs_sanity_check_ckpt(struct f2fs_sb_info *sbi);
+@@ -3815,7 +3830,7 @@ void f2fs_stop_gc_thread(struct f2fs_sb_info *sbi);
+ block_t f2fs_start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
+ int f2fs_gc(struct f2fs_sb_info *sbi, struct f2fs_gc_control *gc_control);
+ void f2fs_build_gc_manager(struct f2fs_sb_info *sbi);
+-int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count);
++int f2fs_resize_fs(struct file *filp, __u64 block_count);
+ int __init f2fs_create_garbage_collection_cache(void);
+ void f2fs_destroy_garbage_collection_cache(void);
+ /* victim selection function for cleaning and SSR */
+diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
+index 5ac53d2627d20..015ed274dc312 100644
+--- a/fs/f2fs/file.c
++++ b/fs/f2fs/file.c
+@@ -2225,7 +2225,6 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
+ ret = 0;
+ f2fs_stop_checkpoint(sbi, false,
+ STOP_CP_REASON_SHUTDOWN);
+- set_sbi_flag(sbi, SBI_IS_SHUTDOWN);
+ trace_f2fs_shutdown(sbi, in, ret);
+ }
+ return ret;
+@@ -2238,7 +2237,6 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
+ if (ret)
+ goto out;
+ f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_SHUTDOWN);
+- set_sbi_flag(sbi, SBI_IS_SHUTDOWN);
+ thaw_bdev(sb->s_bdev);
+ break;
+ case F2FS_GOING_DOWN_METASYNC:
+@@ -2247,16 +2245,13 @@ static int f2fs_ioc_shutdown(struct file *filp, unsigned long arg)
+ if (ret)
+ goto out;
+ f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_SHUTDOWN);
+- set_sbi_flag(sbi, SBI_IS_SHUTDOWN);
+ break;
+ case F2FS_GOING_DOWN_NOSYNC:
+ f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_SHUTDOWN);
+- set_sbi_flag(sbi, SBI_IS_SHUTDOWN);
+ break;
+ case F2FS_GOING_DOWN_METAFLUSH:
+ f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_META_IO);
+ f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_SHUTDOWN);
+- set_sbi_flag(sbi, SBI_IS_SHUTDOWN);
+ break;
+ case F2FS_GOING_DOWN_NEED_FSCK:
+ set_sbi_flag(sbi, SBI_NEED_FSCK);
+@@ -2593,6 +2588,11 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
+
+ inode_lock(inode);
+
++ if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) {
++ err = -EINVAL;
++ goto unlock_out;
++ }
++
+ /* if in-place-update policy is enabled, don't waste time here */
+ set_inode_flag(inode, FI_OPU_WRITE);
+ if (f2fs_should_update_inplace(inode, NULL)) {
+@@ -2717,6 +2717,7 @@ clear_out:
+ clear_inode_flag(inode, FI_SKIP_WRITES);
+ out:
+ clear_inode_flag(inode, FI_OPU_WRITE);
++unlock_out:
+ inode_unlock(inode);
+ if (!err)
+ range->len = (u64)total << PAGE_SHIFT;
+@@ -3278,7 +3279,7 @@ static int f2fs_ioc_resize_fs(struct file *filp, unsigned long arg)
+ sizeof(block_count)))
+ return -EFAULT;
+
+- return f2fs_resize_fs(sbi, block_count);
++ return f2fs_resize_fs(filp, block_count);
+ }
+
+ static int f2fs_ioc_enable_verity(struct file *filp, unsigned long arg)
+diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
+index 61c5f9d26018e..719b1ba32a78b 100644
+--- a/fs/f2fs/gc.c
++++ b/fs/f2fs/gc.c
+@@ -59,7 +59,7 @@ static int gc_thread_func(void *data)
+ if (gc_th->gc_wake)
+ gc_th->gc_wake = false;
+
+- if (try_to_freeze()) {
++ if (try_to_freeze() || f2fs_readonly(sbi->sb)) {
+ stat_other_skip_bggc_count(sbi);
+ continue;
+ }
+@@ -2099,8 +2099,9 @@ static void update_fs_metadata(struct f2fs_sb_info *sbi, int secs)
+ }
+ }
+
+-int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
++int f2fs_resize_fs(struct file *filp, __u64 block_count)
+ {
++ struct f2fs_sb_info *sbi = F2FS_I_SB(file_inode(filp));
+ __u64 old_block_count, shrunk_blocks;
+ struct cp_control cpc = { CP_RESIZE, 0, 0, 0 };
+ unsigned int secs;
+@@ -2138,12 +2139,18 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
+ return -EINVAL;
+ }
+
++ err = mnt_want_write_file(filp);
++ if (err)
++ return err;
++
+ shrunk_blocks = old_block_count - block_count;
+ secs = div_u64(shrunk_blocks, BLKS_PER_SEC(sbi));
+
+ /* stop other GC */
+- if (!f2fs_down_write_trylock(&sbi->gc_lock))
+- return -EAGAIN;
++ if (!f2fs_down_write_trylock(&sbi->gc_lock)) {
++ err = -EAGAIN;
++ goto out_drop_write;
++ }
+
+ /* stop CP to protect MAIN_SEC in free_segment_range */
+ f2fs_lock_op(sbi);
+@@ -2163,10 +2170,20 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
+ out_unlock:
+ f2fs_unlock_op(sbi);
+ f2fs_up_write(&sbi->gc_lock);
++out_drop_write:
++ mnt_drop_write_file(filp);
+ if (err)
+ return err;
+
+- freeze_super(sbi->sb);
++ err = freeze_super(sbi->sb);
++ if (err)
++ return err;
++
++ if (f2fs_readonly(sbi->sb)) {
++ thaw_super(sbi->sb);
++ return -EROFS;
++ }
++
+ f2fs_down_write(&sbi->gc_lock);
+ f2fs_down_write(&sbi->cp_global_sem);
+
+diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
+index 77a71276ecb15..ad597b417fea5 100644
+--- a/fs/f2fs/namei.c
++++ b/fs/f2fs/namei.c
+@@ -995,20 +995,12 @@ static int f2fs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
+ goto out;
+ }
+
+- /*
+- * Copied from ext4_rename: we need to protect against old.inode
+- * directory getting converted from inline directory format into
+- * a normal one.
+- */
+- if (S_ISDIR(old_inode->i_mode))
+- inode_lock_nested(old_inode, I_MUTEX_NONDIR2);
+-
+ err = -ENOENT;
+ old_entry = f2fs_find_entry(old_dir, &old_dentry->d_name, &old_page);
+ if (!old_entry) {
+ if (IS_ERR(old_page))
+ err = PTR_ERR(old_page);
+- goto out_unlock_old;
++ goto out;
+ }
+
+ if (S_ISDIR(old_inode->i_mode)) {
+@@ -1116,9 +1108,6 @@ static int f2fs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
+
+ f2fs_unlock_op(sbi);
+
+- if (S_ISDIR(old_inode->i_mode))
+- inode_unlock(old_inode);
+-
+ if (IS_DIRSYNC(old_dir) || IS_DIRSYNC(new_dir))
+ f2fs_sync_fs(sbi->sb, 1);
+
+@@ -1133,9 +1122,6 @@ out_dir:
+ f2fs_put_page(old_dir_page, 0);
+ out_old:
+ f2fs_put_page(old_page, 0);
+-out_unlock_old:
+- if (S_ISDIR(old_inode->i_mode))
+- inode_unlock(old_inode);
+ out:
+ iput(whiteout);
+ return err;
+diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
+index bd1dad5237967..6bdb1bed29ec9 100644
+--- a/fs/f2fs/node.c
++++ b/fs/f2fs/node.c
+@@ -943,8 +943,10 @@ static int truncate_dnode(struct dnode_of_data *dn)
+ dn->ofs_in_node = 0;
+ f2fs_truncate_data_blocks(dn);
+ err = truncate_node(dn);
+- if (err)
++ if (err) {
++ f2fs_put_page(page, 1);
+ return err;
++ }
+
+ return 1;
+ }
+@@ -1596,6 +1598,9 @@ static int __write_node_page(struct page *page, bool atomic, bool *submitted,
+ trace_f2fs_writepage(page, NODE);
+
+ if (unlikely(f2fs_cp_error(sbi))) {
++ /* keep node pages in remount-ro mode */
++ if (F2FS_OPTION(sbi).errors == MOUNT_ERRORS_READONLY)
++ goto redirty_out;
+ ClearPageUptodate(page);
+ dec_page_count(sbi, F2FS_DIRTY_NODES);
+ unlock_page(page);
+diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
+index 9f15b03037dba..17082dc3c1a34 100644
+--- a/fs/f2fs/super.c
++++ b/fs/f2fs/super.c
+@@ -164,6 +164,7 @@ enum {
+ Opt_discard_unit,
+ Opt_memory_mode,
+ Opt_age_extent_cache,
++ Opt_errors,
+ Opt_err,
+ };
+
+@@ -243,6 +244,7 @@ static match_table_t f2fs_tokens = {
+ {Opt_discard_unit, "discard_unit=%s"},
+ {Opt_memory_mode, "memory=%s"},
+ {Opt_age_extent_cache, "age_extent_cache"},
++ {Opt_errors, "errors=%s"},
+ {Opt_err, NULL},
+ };
+
+@@ -1268,6 +1270,25 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
+ case Opt_age_extent_cache:
+ set_opt(sbi, AGE_EXTENT_CACHE);
+ break;
++ case Opt_errors:
++ name = match_strdup(&args[0]);
++ if (!name)
++ return -ENOMEM;
++ if (!strcmp(name, "remount-ro")) {
++ F2FS_OPTION(sbi).errors =
++ MOUNT_ERRORS_READONLY;
++ } else if (!strcmp(name, "continue")) {
++ F2FS_OPTION(sbi).errors =
++ MOUNT_ERRORS_CONTINUE;
++ } else if (!strcmp(name, "panic")) {
++ F2FS_OPTION(sbi).errors =
++ MOUNT_ERRORS_PANIC;
++ } else {
++ kfree(name);
++ return -EINVAL;
++ }
++ kfree(name);
++ break;
+ default:
+ f2fs_err(sbi, "Unrecognized mount option \"%s\" or missing value",
+ p);
+@@ -1622,6 +1643,9 @@ static void f2fs_put_super(struct super_block *sb)
+ f2fs_destroy_node_manager(sbi);
+ f2fs_destroy_segment_manager(sbi);
+
++ /* flush s_error_work before sbi destroy */
++ flush_work(&sbi->s_error_work);
++
+ f2fs_destroy_post_read_wq(sbi);
+
+ kvfree(sbi->ckpt);
+@@ -2052,6 +2076,13 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
+ else if (F2FS_OPTION(sbi).memory_mode == MEMORY_MODE_LOW)
+ seq_printf(seq, ",memory=%s", "low");
+
++ if (F2FS_OPTION(sbi).errors == MOUNT_ERRORS_READONLY)
++ seq_printf(seq, ",errors=%s", "remount-ro");
++ else if (F2FS_OPTION(sbi).errors == MOUNT_ERRORS_CONTINUE)
++ seq_printf(seq, ",errors=%s", "continue");
++ else if (F2FS_OPTION(sbi).errors == MOUNT_ERRORS_PANIC)
++ seq_printf(seq, ",errors=%s", "panic");
++
+ return 0;
+ }
+
+@@ -2080,6 +2111,7 @@ static void default_options(struct f2fs_sb_info *sbi)
+ }
+ F2FS_OPTION(sbi).bggc_mode = BGGC_MODE_ON;
+ F2FS_OPTION(sbi).memory_mode = MEMORY_MODE_NORMAL;
++ F2FS_OPTION(sbi).errors = MOUNT_ERRORS_CONTINUE;
+
+ sbi->sb->s_flags &= ~SB_INLINECRYPT;
+
+@@ -2281,6 +2313,9 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
+ if (err)
+ goto restore_opts;
+
++ /* flush outstanding errors before changing fs state */
++ flush_work(&sbi->s_error_work);
++
+ /*
+ * Previous and new state of filesystem is RO,
+ * so skip checking GC and FLUSH_MERGE conditions.
+@@ -3926,55 +3961,73 @@ int f2fs_commit_super(struct f2fs_sb_info *sbi, bool recover)
+ return err;
+ }
+
+-void f2fs_handle_stop(struct f2fs_sb_info *sbi, unsigned char reason)
++static void save_stop_reason(struct f2fs_sb_info *sbi, unsigned char reason)
++{
++ unsigned long flags;
++
++ spin_lock_irqsave(&sbi->error_lock, flags);
++ if (sbi->stop_reason[reason] < GENMASK(BITS_PER_BYTE - 1, 0))
++ sbi->stop_reason[reason]++;
++ spin_unlock_irqrestore(&sbi->error_lock, flags);
++}
++
++static void f2fs_record_stop_reason(struct f2fs_sb_info *sbi)
+ {
+ struct f2fs_super_block *raw_super = F2FS_RAW_SUPER(sbi);
++ unsigned long flags;
+ int err;
+
+ f2fs_down_write(&sbi->sb_lock);
+
+- if (raw_super->s_stop_reason[reason] < GENMASK(BITS_PER_BYTE - 1, 0))
+- raw_super->s_stop_reason[reason]++;
++ spin_lock_irqsave(&sbi->error_lock, flags);
++ if (sbi->error_dirty) {
++ memcpy(F2FS_RAW_SUPER(sbi)->s_errors, sbi->errors,
++ MAX_F2FS_ERRORS);
++ sbi->error_dirty = false;
++ }
++ memcpy(raw_super->s_stop_reason, sbi->stop_reason, MAX_STOP_REASON);
++ spin_unlock_irqrestore(&sbi->error_lock, flags);
+
+ err = f2fs_commit_super(sbi, false);
+- if (err)
+- f2fs_err(sbi, "f2fs_commit_super fails to record reason:%u err:%d",
+- reason, err);
++
+ f2fs_up_write(&sbi->sb_lock);
++ if (err)
++ f2fs_err(sbi, "f2fs_commit_super fails to record err:%d", err);
+ }
+
+ void f2fs_save_errors(struct f2fs_sb_info *sbi, unsigned char flag)
+ {
+- spin_lock(&sbi->error_lock);
++ unsigned long flags;
++
++ spin_lock_irqsave(&sbi->error_lock, flags);
+ if (!test_bit(flag, (unsigned long *)sbi->errors)) {
+ set_bit(flag, (unsigned long *)sbi->errors);
+ sbi->error_dirty = true;
+ }
+- spin_unlock(&sbi->error_lock);
++ spin_unlock_irqrestore(&sbi->error_lock, flags);
+ }
+
+ static bool f2fs_update_errors(struct f2fs_sb_info *sbi)
+ {
++ unsigned long flags;
+ bool need_update = false;
+
+- spin_lock(&sbi->error_lock);
++ spin_lock_irqsave(&sbi->error_lock, flags);
+ if (sbi->error_dirty) {
+ memcpy(F2FS_RAW_SUPER(sbi)->s_errors, sbi->errors,
+ MAX_F2FS_ERRORS);
+ sbi->error_dirty = false;
+ need_update = true;
+ }
+- spin_unlock(&sbi->error_lock);
++ spin_unlock_irqrestore(&sbi->error_lock, flags);
+
+ return need_update;
+ }
+
+-void f2fs_handle_error(struct f2fs_sb_info *sbi, unsigned char error)
++static void f2fs_record_errors(struct f2fs_sb_info *sbi, unsigned char error)
+ {
+ int err;
+
+- f2fs_save_errors(sbi, error);
+-
+ f2fs_down_write(&sbi->sb_lock);
+
+ if (!f2fs_update_errors(sbi))
+@@ -3988,6 +4041,83 @@ out_unlock:
+ f2fs_up_write(&sbi->sb_lock);
+ }
+
++void f2fs_handle_error(struct f2fs_sb_info *sbi, unsigned char error)
++{
++ f2fs_save_errors(sbi, error);
++ f2fs_record_errors(sbi, error);
++}
++
++void f2fs_handle_error_async(struct f2fs_sb_info *sbi, unsigned char error)
++{
++ f2fs_save_errors(sbi, error);
++
++ if (!sbi->error_dirty)
++ return;
++ if (!test_bit(error, (unsigned long *)sbi->errors))
++ return;
++ schedule_work(&sbi->s_error_work);
++}
++
++static bool system_going_down(void)
++{
++ return system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF
++ || system_state == SYSTEM_RESTART;
++}
++
++void f2fs_handle_critical_error(struct f2fs_sb_info *sbi, unsigned char reason,
++ bool irq_context)
++{
++ struct super_block *sb = sbi->sb;
++ bool shutdown = reason == STOP_CP_REASON_SHUTDOWN;
++ bool continue_fs = !shutdown &&
++ F2FS_OPTION(sbi).errors == MOUNT_ERRORS_CONTINUE;
++
++ set_ckpt_flags(sbi, CP_ERROR_FLAG);
++
++ if (!f2fs_hw_is_readonly(sbi)) {
++ save_stop_reason(sbi, reason);
++
++ if (irq_context && !shutdown)
++ schedule_work(&sbi->s_error_work);
++ else
++ f2fs_record_stop_reason(sbi);
++ }
++
++ /*
++	 * We force ERRORS_RO behavior when the system is rebooting. Otherwise
++	 * we could panic during 'reboot -f' because the underlying device may
++	 * already have been disabled.
++ */
++ if (F2FS_OPTION(sbi).errors == MOUNT_ERRORS_PANIC &&
++ !shutdown && !system_going_down() &&
++ !is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN))
++ panic("F2FS-fs (device %s): panic forced after error\n",
++ sb->s_id);
++
++ if (shutdown)
++ set_sbi_flag(sbi, SBI_IS_SHUTDOWN);
++
++	/* continue filesystem operations if errors=continue */
++ if (continue_fs || f2fs_readonly(sb))
++ return;
++
++ f2fs_warn(sbi, "Remounting filesystem read-only");
++ /*
++ * Make sure updated value of ->s_mount_flags will be visible before
++ * ->s_flags update
++ */
++ smp_wmb();
++ sb->s_flags |= SB_RDONLY;
++}
++
++static void f2fs_record_error_work(struct work_struct *work)
++{
++ struct f2fs_sb_info *sbi = container_of(work,
++ struct f2fs_sb_info, s_error_work);
++
++ f2fs_record_stop_reason(sbi);
++}
++
+ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
+ {
+ struct f2fs_super_block *raw_super = F2FS_RAW_SUPER(sbi);
+@@ -4218,7 +4348,9 @@ try_onemore:
+ sb->s_fs_info = sbi;
+ sbi->raw_super = raw_super;
+
++ INIT_WORK(&sbi->s_error_work, f2fs_record_error_work);
+ memcpy(sbi->errors, raw_super->s_errors, MAX_F2FS_ERRORS);
++ memcpy(sbi->stop_reason, raw_super->s_stop_reason, MAX_STOP_REASON);
+
+ /* precompute checksum seed for metadata */
+ if (f2fs_sb_has_inode_chksum(sbi))
+@@ -4615,6 +4747,8 @@ free_sm:
+ f2fs_destroy_segment_manager(sbi);
+ stop_ckpt_thread:
+ f2fs_stop_ckpt_thread(sbi);
++ /* flush s_error_work before sbi destroy */
++ flush_work(&sbi->s_error_work);
+ f2fs_destroy_post_read_wq(sbi);
+ free_devices:
+ destroy_device_list(sbi);
+diff --git a/fs/fs_context.c b/fs/fs_context.c
+index 24ce12f0db32e..851214d1d013d 100644
+--- a/fs/fs_context.c
++++ b/fs/fs_context.c
+@@ -561,7 +561,8 @@ static int legacy_parse_param(struct fs_context *fc, struct fs_parameter *param)
+ return -ENOMEM;
+ }
+
+- ctx->legacy_data[size++] = ',';
++ if (size)
++ ctx->legacy_data[size++] = ',';
+ len = strlen(param->key);
+ memcpy(ctx->legacy_data + size, param->key, len);
+ size += len;
+diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
+index cb62c8f07d1e7..21335d1b67bf2 100644
+--- a/fs/gfs2/file.c
++++ b/fs/gfs2/file.c
+@@ -1030,8 +1030,8 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb,
+ }
+
+ gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, gh);
+-retry:
+ if (should_fault_in_pages(from, iocb, &prev_count, &window_size)) {
++retry:
+ window_size -= fault_in_iov_iter_readable(from, window_size);
+ if (!window_size) {
+ ret = -EFAULT;
+diff --git a/fs/inode.c b/fs/inode.c
+index 577799b7855f6..b9d4980322700 100644
+--- a/fs/inode.c
++++ b/fs/inode.c
+@@ -1103,6 +1103,48 @@ void discard_new_inode(struct inode *inode)
+ }
+ EXPORT_SYMBOL(discard_new_inode);
+
++/**
++ * lock_two_inodes - lock two inodes (may be regular files but also dirs)
++ *
++ * Lock any non-NULL argument. The caller must make sure that if he is passing
++ * in two directories, one is not an ancestor of the other. Zero, one or two
++ * objects may be locked by this function.
++ *
++ * @inode1: first inode to lock
++ * @inode2: second inode to lock
++ * @subclass1: inode lock subclass for the first lock obtained
++ * @subclass2: inode lock subclass for the second lock obtained
++ */
++void lock_two_inodes(struct inode *inode1, struct inode *inode2,
++ unsigned subclass1, unsigned subclass2)
++{
++ if (!inode1 || !inode2) {
++ /*
++ * Make sure @subclass1 will be used for the acquired lock.
++ * This is not strictly necessary (no current caller cares) but
++ * let's keep things consistent.
++ */
++ if (!inode1)
++ swap(inode1, inode2);
++ goto lock;
++ }
++
++ /*
++ * If one object is directory and the other is not, we must make sure
++ * to lock directory first as the other object may be its child.
++ */
++ if (S_ISDIR(inode2->i_mode) == S_ISDIR(inode1->i_mode)) {
++ if (inode1 > inode2)
++ swap(inode1, inode2);
++ } else if (!S_ISDIR(inode1->i_mode))
++ swap(inode1, inode2);
++lock:
++ if (inode1)
++ inode_lock_nested(inode1, subclass1);
++ if (inode2 && inode2 != inode1)
++ inode_lock_nested(inode2, subclass2);
++}
++
+ /**
+ * lock_two_nondirectories - take two i_mutexes on non-directory objects
+ *
+diff --git a/fs/internal.h b/fs/internal.h
+index bd3b2810a36b6..377030a50aca6 100644
+--- a/fs/internal.h
++++ b/fs/internal.h
+@@ -152,6 +152,8 @@ extern long prune_icache_sb(struct super_block *sb, struct shrink_control *sc);
+ int dentry_needs_remove_privs(struct mnt_idmap *, struct dentry *dentry);
+ bool in_group_or_capable(struct mnt_idmap *idmap,
+ const struct inode *inode, vfsgid_t vfsgid);
++void lock_two_inodes(struct inode *inode1, struct inode *inode2,
++ unsigned subclass1, unsigned subclass2);
+
+ /*
+ * fs-writeback.c
+diff --git a/fs/jffs2/build.c b/fs/jffs2/build.c
+index 837cd55fd4c5e..6ae9d6fefb861 100644
+--- a/fs/jffs2/build.c
++++ b/fs/jffs2/build.c
+@@ -211,7 +211,10 @@ static int jffs2_build_filesystem(struct jffs2_sb_info *c)
+ ic->scan_dents = NULL;
+ cond_resched();
+ }
+- jffs2_build_xattr_subsystem(c);
++ ret = jffs2_build_xattr_subsystem(c);
++ if (ret)
++ goto exit;
++
+ c->flags &= ~JFFS2_SB_FLAG_BUILDING;
+
+ dbg_fsbuild("FS build complete\n");
+diff --git a/fs/jffs2/xattr.c b/fs/jffs2/xattr.c
+index aa4048a27f31f..3b6bdc9a49e1b 100644
+--- a/fs/jffs2/xattr.c
++++ b/fs/jffs2/xattr.c
+@@ -772,10 +772,10 @@ void jffs2_clear_xattr_subsystem(struct jffs2_sb_info *c)
+ }
+
+ #define XREF_TMPHASH_SIZE (128)
+-void jffs2_build_xattr_subsystem(struct jffs2_sb_info *c)
++int jffs2_build_xattr_subsystem(struct jffs2_sb_info *c)
+ {
+ struct jffs2_xattr_ref *ref, *_ref;
+- struct jffs2_xattr_ref *xref_tmphash[XREF_TMPHASH_SIZE];
++ struct jffs2_xattr_ref **xref_tmphash;
+ struct jffs2_xattr_datum *xd, *_xd;
+ struct jffs2_inode_cache *ic;
+ struct jffs2_raw_node_ref *raw;
+@@ -784,9 +784,12 @@ void jffs2_build_xattr_subsystem(struct jffs2_sb_info *c)
+
+ BUG_ON(!(c->flags & JFFS2_SB_FLAG_BUILDING));
+
++ xref_tmphash = kcalloc(XREF_TMPHASH_SIZE,
++ sizeof(struct jffs2_xattr_ref *), GFP_KERNEL);
++ if (!xref_tmphash)
++ return -ENOMEM;
++
+ /* Phase.1 : Merge same xref */
+- for (i=0; i < XREF_TMPHASH_SIZE; i++)
+- xref_tmphash[i] = NULL;
+ for (ref=c->xref_temp; ref; ref=_ref) {
+ struct jffs2_xattr_ref *tmp;
+
+@@ -884,6 +887,8 @@ void jffs2_build_xattr_subsystem(struct jffs2_sb_info *c)
+ "%u of xref (%u dead, %u orphan) found.\n",
+ xdatum_count, xdatum_unchecked_count, xdatum_orphan_count,
+ xref_count, xref_dead_count, xref_orphan_count);
++ kfree(xref_tmphash);
++ return 0;
+ }
+
+ struct jffs2_xattr_datum *jffs2_setup_xattr_datum(struct jffs2_sb_info *c,
+diff --git a/fs/jffs2/xattr.h b/fs/jffs2/xattr.h
+index 720007b2fd65d..1b5030a3349db 100644
+--- a/fs/jffs2/xattr.h
++++ b/fs/jffs2/xattr.h
+@@ -71,7 +71,7 @@ static inline int is_xattr_ref_dead(struct jffs2_xattr_ref *ref)
+ #ifdef CONFIG_JFFS2_FS_XATTR
+
+ extern void jffs2_init_xattr_subsystem(struct jffs2_sb_info *c);
+-extern void jffs2_build_xattr_subsystem(struct jffs2_sb_info *c);
++extern int jffs2_build_xattr_subsystem(struct jffs2_sb_info *c);
+ extern void jffs2_clear_xattr_subsystem(struct jffs2_sb_info *c);
+
+ extern struct jffs2_xattr_datum *jffs2_setup_xattr_datum(struct jffs2_sb_info *c,
+@@ -103,7 +103,7 @@ extern ssize_t jffs2_listxattr(struct dentry *, char *, size_t);
+ #else
+
+ #define jffs2_init_xattr_subsystem(c)
+-#define jffs2_build_xattr_subsystem(c)
++#define jffs2_build_xattr_subsystem(c) (0)
+ #define jffs2_clear_xattr_subsystem(c)
+
+ #define jffs2_xattr_do_crccheck_inode(c, ic)
+diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
+index 45b6919903e6b..5a1a4af9d3d29 100644
+--- a/fs/kernfs/dir.c
++++ b/fs/kernfs/dir.c
+@@ -655,7 +655,9 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
+ return kn;
+
+ err_out3:
++ spin_lock(&kernfs_idr_lock);
+ idr_remove(&root->ino_idr, (u32)kernfs_ino(kn));
++ spin_unlock(&kernfs_idr_lock);
+ err_out2:
+ kmem_cache_free(kernfs_node_cache, kn);
+ err_out1:
+diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
+index 04ba95b83d168..22d3ff3818f5f 100644
+--- a/fs/lockd/svc.c
++++ b/fs/lockd/svc.c
+@@ -355,7 +355,6 @@ static int lockd_get(void)
+ int error;
+
+ if (nlmsvc_serv) {
+- svc_get(nlmsvc_serv);
+ nlmsvc_users++;
+ return 0;
+ }
+diff --git a/fs/namei.c b/fs/namei.c
+index e4fe0879ae553..7e5cb92feab3f 100644
+--- a/fs/namei.c
++++ b/fs/namei.c
+@@ -3028,8 +3028,8 @@ static struct dentry *lock_two_directories(struct dentry *p1, struct dentry *p2)
+ return p;
+ }
+
+- inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
+- inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2);
++ lock_two_inodes(p1->d_inode, p2->d_inode,
++ I_MUTEX_PARENT, I_MUTEX_PARENT2);
+ return NULL;
+ }
+
+@@ -4731,7 +4731,7 @@ SYSCALL_DEFINE2(link, const char __user *, oldname, const char __user *, newname
+ * sb->s_vfs_rename_mutex. We might be more accurate, but that's another
+ * story.
+ * c) we have to lock _four_ objects - parents and victim (if it exists),
+- * and source (if it is not a directory).
++ * and source.
+ * And that - after we got ->i_mutex on parents (until then we don't know
+ * whether the target exists). Solution: try to be smart with locking
+ * order for inodes. We rely on the fact that tree topology may change
+@@ -4815,10 +4815,16 @@ int vfs_rename(struct renamedata *rd)
+
+ take_dentry_name_snapshot(&old_name, old_dentry);
+ dget(new_dentry);
+- if (!is_dir || (flags & RENAME_EXCHANGE))
+- lock_two_nondirectories(source, target);
+- else if (target)
+- inode_lock(target);
++ /*
++ * Lock all moved children. Moved directories may need to change parent
++ * pointer so they need the lock to prevent against concurrent
++ * directory changes moving parent pointer. For regular files we've
++ * historically always done this. The lockdep locking subclasses are
++ * somewhat arbitrary but RENAME_EXCHANGE in particular can swap
++ * regular files and directories so it's difficult to tell which
++ * subclasses to use.
++ */
++ lock_two_inodes(source, target, I_MUTEX_NORMAL, I_MUTEX_NONDIR2);
+
+ error = -EPERM;
+ if (IS_SWAPFILE(source) || (target && IS_SWAPFILE(target)))
+@@ -4866,9 +4872,8 @@ int vfs_rename(struct renamedata *rd)
+ d_exchange(old_dentry, new_dentry);
+ }
+ out:
+- if (!is_dir || (flags & RENAME_EXCHANGE))
+- unlock_two_nondirectories(source, target);
+- else if (target)
++ inode_unlock(source);
++ if (target)
+ inode_unlock(target);
+ dput(new_dentry);
+ if (!error) {
+diff --git a/fs/nfs/nfs42xattr.c b/fs/nfs/nfs42xattr.c
+index 76ae118342066..911f634ba3da7 100644
+--- a/fs/nfs/nfs42xattr.c
++++ b/fs/nfs/nfs42xattr.c
+@@ -991,6 +991,29 @@ static void nfs4_xattr_cache_init_once(void *p)
+ INIT_LIST_HEAD(&cache->dispose);
+ }
+
++static int nfs4_xattr_shrinker_init(struct shrinker *shrinker,
++ struct list_lru *lru, const char *name)
++{
++ int ret = 0;
++
++ ret = register_shrinker(shrinker, name);
++ if (ret)
++ return ret;
++
++ ret = list_lru_init_memcg(lru, shrinker);
++ if (ret)
++ unregister_shrinker(shrinker);
++
++ return ret;
++}
++
++static void nfs4_xattr_shrinker_destroy(struct shrinker *shrinker,
++ struct list_lru *lru)
++{
++ unregister_shrinker(shrinker);
++ list_lru_destroy(lru);
++}
++
+ int __init nfs4_xattr_cache_init(void)
+ {
+ int ret = 0;
+@@ -1002,44 +1025,30 @@ int __init nfs4_xattr_cache_init(void)
+ if (nfs4_xattr_cache_cachep == NULL)
+ return -ENOMEM;
+
+- ret = list_lru_init_memcg(&nfs4_xattr_large_entry_lru,
+- &nfs4_xattr_large_entry_shrinker);
+- if (ret)
+- goto out4;
+-
+- ret = list_lru_init_memcg(&nfs4_xattr_entry_lru,
+- &nfs4_xattr_entry_shrinker);
+- if (ret)
+- goto out3;
+-
+- ret = list_lru_init_memcg(&nfs4_xattr_cache_lru,
+- &nfs4_xattr_cache_shrinker);
+- if (ret)
+- goto out2;
+-
+- ret = register_shrinker(&nfs4_xattr_cache_shrinker, "nfs-xattr_cache");
++ ret = nfs4_xattr_shrinker_init(&nfs4_xattr_cache_shrinker,
++ &nfs4_xattr_cache_lru,
++ "nfs-xattr_cache");
+ if (ret)
+ goto out1;
+
+- ret = register_shrinker(&nfs4_xattr_entry_shrinker, "nfs-xattr_entry");
++ ret = nfs4_xattr_shrinker_init(&nfs4_xattr_entry_shrinker,
++ &nfs4_xattr_entry_lru,
++ "nfs-xattr_entry");
+ if (ret)
+- goto out;
++ goto out2;
+
+- ret = register_shrinker(&nfs4_xattr_large_entry_shrinker,
+- "nfs-xattr_large_entry");
++ ret = nfs4_xattr_shrinker_init(&nfs4_xattr_large_entry_shrinker,
++ &nfs4_xattr_large_entry_lru,
++ "nfs-xattr_large_entry");
+ if (!ret)
+ return 0;
+
+- unregister_shrinker(&nfs4_xattr_entry_shrinker);
+-out:
+- unregister_shrinker(&nfs4_xattr_cache_shrinker);
+-out1:
+- list_lru_destroy(&nfs4_xattr_cache_lru);
++ nfs4_xattr_shrinker_destroy(&nfs4_xattr_entry_shrinker,
++ &nfs4_xattr_entry_lru);
+ out2:
+- list_lru_destroy(&nfs4_xattr_entry_lru);
+-out3:
+- list_lru_destroy(&nfs4_xattr_large_entry_lru);
+-out4:
++ nfs4_xattr_shrinker_destroy(&nfs4_xattr_cache_shrinker,
++ &nfs4_xattr_cache_lru);
++out1:
+ kmem_cache_destroy(nfs4_xattr_cache_cachep);
+
+ return ret;
+@@ -1047,11 +1056,11 @@ out4:
+
+ void nfs4_xattr_cache_exit(void)
+ {
+- unregister_shrinker(&nfs4_xattr_large_entry_shrinker);
+- unregister_shrinker(&nfs4_xattr_entry_shrinker);
+- unregister_shrinker(&nfs4_xattr_cache_shrinker);
+- list_lru_destroy(&nfs4_xattr_large_entry_lru);
+- list_lru_destroy(&nfs4_xattr_entry_lru);
+- list_lru_destroy(&nfs4_xattr_cache_lru);
++ nfs4_xattr_shrinker_destroy(&nfs4_xattr_large_entry_shrinker,
++ &nfs4_xattr_large_entry_lru);
++ nfs4_xattr_shrinker_destroy(&nfs4_xattr_entry_shrinker,
++ &nfs4_xattr_entry_lru);
++ nfs4_xattr_shrinker_destroy(&nfs4_xattr_cache_shrinker,
++ &nfs4_xattr_cache_lru);
+ kmem_cache_destroy(nfs4_xattr_cache_cachep);
+ }
+diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
+index d3665390c4cb8..9faba2dac11dd 100644
+--- a/fs/nfs/nfs4proc.c
++++ b/fs/nfs/nfs4proc.c
+@@ -921,6 +921,7 @@ out:
+ out_noaction:
+ return ret;
+ session_recover:
++ set_bit(NFS4_SLOT_TBL_DRAINING, &session->fc_slot_table.slot_tbl_state);
+ nfs4_schedule_session_recovery(session, status);
+ dprintk("%s ERROR: %d Reset session\n", __func__, status);
+ nfs41_sequence_free_slot(res);
+diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
+index f21259ead64bb..4c9b87850ab12 100644
+--- a/fs/nfsd/cache.h
++++ b/fs/nfsd/cache.h
+@@ -80,6 +80,8 @@ enum {
+
+ int nfsd_drc_slab_create(void);
+ void nfsd_drc_slab_free(void);
++int nfsd_net_reply_cache_init(struct nfsd_net *nn);
++void nfsd_net_reply_cache_destroy(struct nfsd_net *nn);
+ int nfsd_reply_cache_init(struct nfsd_net *);
+ void nfsd_reply_cache_shutdown(struct nfsd_net *);
+ int nfsd_cache_lookup(struct svc_rqst *);
+diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
+index 76db2fe296244..ee1a24debd60c 100644
+--- a/fs/nfsd/nfs4xdr.c
++++ b/fs/nfsd/nfs4xdr.c
+@@ -3956,7 +3956,7 @@ nfsd4_encode_open(struct nfsd4_compoundres *resp, __be32 nfserr,
+ p = xdr_reserve_space(xdr, 32);
+ if (!p)
+ return nfserr_resource;
+- *p++ = cpu_to_be32(0);
++ *p++ = cpu_to_be32(open->op_recall);
+
+ /*
+ * TODO: space_limit's in delegations
+diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
+index 041faa13b852e..a8eda1c85829e 100644
+--- a/fs/nfsd/nfscache.c
++++ b/fs/nfsd/nfscache.c
+@@ -148,12 +148,23 @@ void nfsd_drc_slab_free(void)
+ kmem_cache_destroy(drc_slab);
+ }
+
+-static int nfsd_reply_cache_stats_init(struct nfsd_net *nn)
++/**
++ * nfsd_net_reply_cache_init - per net namespace reply cache set-up
++ * @nn: nfsd_net being initialized
++ *
++ * Returns zero on success; otherwise a negative errno is returned.
++ */
++int nfsd_net_reply_cache_init(struct nfsd_net *nn)
+ {
+ return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM);
+ }
+
+-static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)
++/**
++ * nfsd_net_reply_cache_destroy - per net namespace reply cache tear-down
++ * @nn: nfsd_net being freed
++ *
++ */
++void nfsd_net_reply_cache_destroy(struct nfsd_net *nn)
+ {
+ nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM);
+ }
+@@ -169,17 +180,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
+ hashsize = nfsd_hashsize(nn->max_drc_entries);
+ nn->maskbits = ilog2(hashsize);
+
+- status = nfsd_reply_cache_stats_init(nn);
+- if (status)
+- goto out_nomem;
+-
+ nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
+ nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
+ nn->nfsd_reply_cache_shrinker.seeks = 1;
+ status = register_shrinker(&nn->nfsd_reply_cache_shrinker,
+ "nfsd-reply:%s", nn->nfsd_name);
+ if (status)
+- goto out_stats_destroy;
++ return status;
+
+ nn->drc_hashtbl = kvzalloc(array_size(hashsize,
+ sizeof(*nn->drc_hashtbl)), GFP_KERNEL);
+@@ -195,9 +202,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
+ return 0;
+ out_shrinker:
+ unregister_shrinker(&nn->nfsd_reply_cache_shrinker);
+-out_stats_destroy:
+- nfsd_reply_cache_stats_destroy(nn);
+-out_nomem:
+ printk(KERN_ERR "nfsd: failed to allocate reply cache\n");
+ return -ENOMEM;
+ }
+@@ -217,7 +221,6 @@ void nfsd_reply_cache_shutdown(struct nfsd_net *nn)
+ rp, nn);
+ }
+ }
+- nfsd_reply_cache_stats_destroy(nn);
+
+ kvfree(nn->drc_hashtbl);
+ nn->drc_hashtbl = NULL;
+diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
+index b4fd7a7062d5e..7effd7db0b858 100644
+--- a/fs/nfsd/nfsctl.c
++++ b/fs/nfsd/nfsctl.c
+@@ -1488,6 +1488,9 @@ static __net_init int nfsd_init_net(struct net *net)
+ retval = nfsd_idmap_init(net);
+ if (retval)
+ goto out_idmap_error;
++ retval = nfsd_net_reply_cache_init(nn);
++ if (retval)
++ goto out_repcache_error;
+ nn->nfsd_versions = NULL;
+ nn->nfsd4_minorversions = NULL;
+ nfsd4_init_leases_net(nn);
+@@ -1496,6 +1499,8 @@ static __net_init int nfsd_init_net(struct net *net)
+
+ return 0;
+
++out_repcache_error:
++ nfsd_idmap_shutdown(net);
+ out_idmap_error:
+ nfsd_export_shutdown(net);
+ out_export_error:
+@@ -1504,9 +1509,12 @@ out_export_error:
+
+ static __net_exit void nfsd_exit_net(struct net *net)
+ {
++ struct nfsd_net *nn = net_generic(net, nfsd_net_id);
++
++ nfsd_net_reply_cache_destroy(nn);
+ nfsd_idmap_shutdown(net);
+ nfsd_export_shutdown(net);
+- nfsd_netns_free_versions(net_generic(net, nfsd_net_id));
++ nfsd_netns_free_versions(nn);
+ }
+
+ static struct pernet_operations nfsd_net_ops = {
+diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
+index db67f8e19344a..0016bcc04a59d 100644
+--- a/fs/nfsd/vfs.c
++++ b/fs/nfsd/vfs.c
+@@ -388,7 +388,9 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap)
+ iap->ia_mode &= ~S_ISGID;
+ } else {
+ /* set ATTR_KILL_* bits and let VFS handle it */
+- iap->ia_valid |= (ATTR_KILL_SUID | ATTR_KILL_SGID);
++ iap->ia_valid |= ATTR_KILL_SUID;
++ iap->ia_valid |=
++ setattr_should_drop_sgid(&nop_mnt_idmap, inode);
+ }
+ }
+ }
+diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
+index 22fb1cf7e1fc5..f7e11ac763907 100644
+--- a/fs/notify/fanotify/fanotify_user.c
++++ b/fs/notify/fanotify/fanotify_user.c
+@@ -1623,6 +1623,20 @@ static int fanotify_events_supported(struct fsnotify_group *group,
+ path->mnt->mnt_sb->s_type->fs_flags & FS_DISALLOW_NOTIFY_PERM)
+ return -EINVAL;
+
++ /*
++ * mount and sb marks are not allowed on kernel internal pseudo fs,
++ * like pipe_mnt, because that would subscribe to events on all the
++	 * anonymous pipes in the system.
++ *
++ * SB_NOUSER covers all of the internal pseudo fs whose objects are not
++ * exposed to user's mount namespace, but there are other SB_KERNMOUNT
++ * fs, like nsfs, debugfs, for which the value of allowing sb and mount
++ * mark is questionable. For now we leave them alone.
++ */
++ if (mark_type != FAN_MARK_INODE &&
++ path->mnt->mnt_sb->s_flags & SB_NOUSER)
++ return -EINVAL;
++
+ /*
+ * We shouldn't have allowed setting dirent events and the directory
+ * flags FAN_ONDIR and FAN_EVENT_ON_CHILD in mask of non-dir inode,
+diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
+index c3de60a4543fa..fd02fcf4d4091 100644
+--- a/fs/ntfs3/xattr.c
++++ b/fs/ntfs3/xattr.c
+@@ -214,6 +214,9 @@ static ssize_t ntfs_list_ea(struct ntfs_inode *ni, char *buffer,
+ ea = Add2Ptr(ea_all, off);
+ ea_size = unpacked_ea_size(ea);
+
++ if (!ea->name_len)
++ break;
++
+ if (buffer) {
+ if (ret + ea->name_len + 1 > bytes_per_buffer) {
+ err = -ERANGE;
+diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c
+index aecbd712a00cf..929a1133bc180 100644
+--- a/fs/ocfs2/cluster/tcp.c
++++ b/fs/ocfs2/cluster/tcp.c
+@@ -2087,18 +2087,24 @@ void o2net_stop_listening(struct o2nm_node *node)
+
+ int o2net_init(void)
+ {
++ struct folio *folio;
++ void *p;
+ unsigned long i;
+
+ o2quo_init();
+-
+ o2net_debugfs_init();
+
+- o2net_hand = kzalloc(sizeof(struct o2net_handshake), GFP_KERNEL);
+- o2net_keep_req = kzalloc(sizeof(struct o2net_msg), GFP_KERNEL);
+- o2net_keep_resp = kzalloc(sizeof(struct o2net_msg), GFP_KERNEL);
+- if (!o2net_hand || !o2net_keep_req || !o2net_keep_resp)
++ folio = folio_alloc(GFP_KERNEL | __GFP_ZERO, 0);
++ if (!folio)
+ goto out;
+
++ p = folio_address(folio);
++ o2net_hand = p;
++ p += sizeof(struct o2net_handshake);
++ o2net_keep_req = p;
++ p += sizeof(struct o2net_msg);
++ o2net_keep_resp = p;
++
+ o2net_hand->protocol_version = cpu_to_be64(O2NET_PROTOCOL_VERSION);
+ o2net_hand->connector_id = cpu_to_be64(1);
+
+@@ -2124,9 +2130,6 @@ int o2net_init(void)
+ return 0;
+
+ out:
+- kfree(o2net_hand);
+- kfree(o2net_keep_req);
+- kfree(o2net_keep_resp);
+ o2net_debugfs_exit();
+ o2quo_exit();
+ return -ENOMEM;
+@@ -2135,8 +2138,6 @@ out:
+ void o2net_exit(void)
+ {
+ o2quo_exit();
+- kfree(o2net_hand);
+- kfree(o2net_keep_req);
+- kfree(o2net_keep_resp);
+ o2net_debugfs_exit();
++ folio_put(virt_to_folio(o2net_hand));
+ }
+diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
+index f658cc8ea4920..95dce240ba17a 100644
+--- a/fs/overlayfs/copy_up.c
++++ b/fs/overlayfs/copy_up.c
+@@ -575,6 +575,7 @@ static int ovl_link_up(struct ovl_copy_up_ctx *c)
+ /* Restore timestamps on parent (best effort) */
+ ovl_set_timestamps(ofs, upperdir, &c->pstat);
+ ovl_dentry_set_upper_alias(c->dentry);
++ ovl_dentry_update_reval(c->dentry, upper);
+ }
+ }
+ inode_unlock(udir);
+@@ -894,6 +895,7 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
+ inode_unlock(udir);
+
+ ovl_dentry_set_upper_alias(c->dentry);
++ ovl_dentry_update_reval(c->dentry, ovl_dentry_upper(c->dentry));
+ }
+
+ out:
+diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
+index fc25fb95d5fc0..9be52d8013c83 100644
+--- a/fs/overlayfs/dir.c
++++ b/fs/overlayfs/dir.c
+@@ -269,8 +269,7 @@ static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
+
+ ovl_dir_modified(dentry->d_parent, false);
+ ovl_dentry_set_upper_alias(dentry);
+- ovl_dentry_update_reval(dentry, newdentry,
+- DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
++ ovl_dentry_init_reval(dentry, newdentry);
+
+ if (!hardlink) {
+ /*
+diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
+index defd4e231ad2c..5c36fb3a7bab1 100644
+--- a/fs/overlayfs/export.c
++++ b/fs/overlayfs/export.c
+@@ -326,8 +326,7 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
+ if (upper_alias)
+ ovl_dentry_set_upper_alias(dentry);
+
+- ovl_dentry_update_reval(dentry, upper,
+- DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
++ ovl_dentry_init_reval(dentry, upper);
+
+ return d_instantiate_anon(dentry, inode);
+
+diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
+index 541cf3717fc2b..e7e888dea6341 100644
+--- a/fs/overlayfs/inode.c
++++ b/fs/overlayfs/inode.c
+@@ -288,8 +288,8 @@ int ovl_permission(struct mnt_idmap *idmap,
+ int err;
+
+ /* Careful in RCU walk mode */
+- ovl_i_path_real(inode, &realpath);
+- if (!realpath.dentry) {
++ realinode = ovl_i_path_real(inode, &realpath);
++ if (!realinode) {
+ WARN_ON(!(mask & MAY_NOT_BLOCK));
+ return -ECHILD;
+ }
+@@ -302,7 +302,6 @@ int ovl_permission(struct mnt_idmap *idmap,
+ if (err)
+ return err;
+
+- realinode = d_inode(realpath.dentry);
+ old_cred = ovl_override_creds(inode->i_sb);
+ if (!upperinode &&
+ !special_file(realinode->i_mode) && mask & MAY_WRITE) {
+@@ -559,20 +558,20 @@ struct posix_acl *do_ovl_get_acl(struct mnt_idmap *idmap,
+ struct inode *inode, int type,
+ bool rcu, bool noperm)
+ {
+- struct inode *realinode = ovl_inode_real(inode);
++ struct inode *realinode;
+ struct posix_acl *acl;
+ struct path realpath;
+
+- if (!IS_POSIXACL(realinode))
+- return NULL;
+-
+ /* Careful in RCU walk mode */
+- ovl_i_path_real(inode, &realpath);
+- if (!realpath.dentry) {
++ realinode = ovl_i_path_real(inode, &realpath);
++ if (!realinode) {
+ WARN_ON(!rcu);
+ return ERR_PTR(-ECHILD);
+ }
+
++ if (!IS_POSIXACL(realinode))
++ return NULL;
++
+ if (rcu) {
+ /*
+ * If the layer is idmapped drop out of RCU path walk
+diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
+index cfb3420b7df0e..100a492d2b2a6 100644
+--- a/fs/overlayfs/namei.c
++++ b/fs/overlayfs/namei.c
+@@ -1122,8 +1122,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
+ ovl_set_flag(OVL_UPPERDATA, inode);
+ }
+
+- ovl_dentry_update_reval(dentry, upperdentry,
+- DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
++ ovl_dentry_init_reval(dentry, upperdentry);
+
+ revert_creds(old_cred);
+ if (origin_path) {
+diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
+index 4d0b278f5630e..2b79a9398c132 100644
+--- a/fs/overlayfs/overlayfs.h
++++ b/fs/overlayfs/overlayfs.h
+@@ -375,14 +375,16 @@ bool ovl_index_all(struct super_block *sb);
+ bool ovl_verify_lower(struct super_block *sb);
+ struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
+ bool ovl_dentry_remote(struct dentry *dentry);
+-void ovl_dentry_update_reval(struct dentry *dentry, struct dentry *upperdentry,
+- unsigned int mask);
++void ovl_dentry_update_reval(struct dentry *dentry, struct dentry *realdentry);
++void ovl_dentry_init_reval(struct dentry *dentry, struct dentry *upperdentry);
++void ovl_dentry_init_flags(struct dentry *dentry, struct dentry *upperdentry,
++ unsigned int mask);
+ bool ovl_dentry_weird(struct dentry *dentry);
+ enum ovl_path_type ovl_path_type(struct dentry *dentry);
+ void ovl_path_upper(struct dentry *dentry, struct path *path);
+ void ovl_path_lower(struct dentry *dentry, struct path *path);
+ void ovl_path_lowerdata(struct dentry *dentry, struct path *path);
+-void ovl_i_path_real(struct inode *inode, struct path *path);
++struct inode *ovl_i_path_real(struct inode *inode, struct path *path);
+ enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
+ enum ovl_path_type ovl_path_realdata(struct dentry *dentry, struct path *path);
+ struct dentry *ovl_dentry_upper(struct dentry *dentry);
+diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
+index f97ad8b40dbbd..ae1058fbfb5b2 100644
+--- a/fs/overlayfs/super.c
++++ b/fs/overlayfs/super.c
+@@ -1877,7 +1877,7 @@ static struct dentry *ovl_get_root(struct super_block *sb,
+ ovl_dentry_set_flag(OVL_E_CONNECTED, root);
+ ovl_set_upperdata(d_inode(root));
+ ovl_inode_init(d_inode(root), &oip, ino, fsid);
+- ovl_dentry_update_reval(root, upperdentry, DCACHE_OP_WEAK_REVALIDATE);
++ ovl_dentry_init_flags(root, upperdentry, DCACHE_OP_WEAK_REVALIDATE);
+
+ return root;
+ }
+diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
+index 923d66d131c16..fb12e7fa85486 100644
+--- a/fs/overlayfs/util.c
++++ b/fs/overlayfs/util.c
+@@ -94,14 +94,30 @@ struct ovl_entry *ovl_alloc_entry(unsigned int numlower)
+ return oe;
+ }
+
++#define OVL_D_REVALIDATE (DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE)
++
+ bool ovl_dentry_remote(struct dentry *dentry)
+ {
+- return dentry->d_flags &
+- (DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
++ return dentry->d_flags & OVL_D_REVALIDATE;
++}
++
++void ovl_dentry_update_reval(struct dentry *dentry, struct dentry *realdentry)
++{
++ if (!ovl_dentry_remote(realdentry))
++ return;
++
++ spin_lock(&dentry->d_lock);
++ dentry->d_flags |= realdentry->d_flags & OVL_D_REVALIDATE;
++ spin_unlock(&dentry->d_lock);
++}
++
++void ovl_dentry_init_reval(struct dentry *dentry, struct dentry *upperdentry)
++{
++ return ovl_dentry_init_flags(dentry, upperdentry, OVL_D_REVALIDATE);
+ }
+
+-void ovl_dentry_update_reval(struct dentry *dentry, struct dentry *upperdentry,
+- unsigned int mask)
++void ovl_dentry_init_flags(struct dentry *dentry, struct dentry *upperdentry,
++ unsigned int mask)
+ {
+ struct ovl_entry *oe = OVL_E(dentry);
+ unsigned int i, flags = 0;
+@@ -250,7 +266,7 @@ struct dentry *ovl_i_dentry_upper(struct inode *inode)
+ return ovl_upperdentry_dereference(OVL_I(inode));
+ }
+
+-void ovl_i_path_real(struct inode *inode, struct path *path)
++struct inode *ovl_i_path_real(struct inode *inode, struct path *path)
+ {
+ path->dentry = ovl_i_dentry_upper(inode);
+ if (!path->dentry) {
+@@ -259,6 +275,8 @@ void ovl_i_path_real(struct inode *inode, struct path *path)
+ } else {
+ path->mnt = ovl_upper_mnt(OVL_FS(inode->i_sb));
+ }
++
++ return path->dentry ? d_inode_rcu(path->dentry) : NULL;
+ }
+
+ struct inode *ovl_inode_upper(struct inode *inode)
+@@ -1105,8 +1123,7 @@ void ovl_copyattr(struct inode *inode)
+ vfsuid_t vfsuid;
+ vfsgid_t vfsgid;
+
+- ovl_i_path_real(inode, &realpath);
+- realinode = d_inode(realpath.dentry);
++ realinode = ovl_i_path_real(inode, &realpath);
+ real_idmap = mnt_idmap(realpath.mnt);
+
+ vfsuid = i_uid_into_vfsuid(real_idmap, realinode);
+diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
+index 966191d3a5ba2..85aaf0fc6d7d1 100644
+--- a/fs/pstore/ram_core.c
++++ b/fs/pstore/ram_core.c
+@@ -599,6 +599,8 @@ struct persistent_ram_zone *persistent_ram_new(phys_addr_t start, size_t size,
+ raw_spin_lock_init(&prz->buffer_lock);
+ prz->flags = flags;
+ prz->label = kstrdup(label, GFP_KERNEL);
++ if (!prz->label)
++ goto err;
+
+ ret = persistent_ram_buffer_map(start, size, prz, memtype);
+ if (ret)
+diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
+index 5ba580c78835f..fef477c781073 100644
+--- a/fs/ramfs/inode.c
++++ b/fs/ramfs/inode.c
+@@ -278,7 +278,7 @@ int ramfs_init_fs_context(struct fs_context *fc)
+ return 0;
+ }
+
+-static void ramfs_kill_sb(struct super_block *sb)
++void ramfs_kill_sb(struct super_block *sb)
+ {
+ kfree(sb->s_fs_info);
+ kill_litter_super(sb);
+diff --git a/fs/reiserfs/xattr_security.c b/fs/reiserfs/xattr_security.c
+index 6e0a099dd7886..078dd8cc312fc 100644
+--- a/fs/reiserfs/xattr_security.c
++++ b/fs/reiserfs/xattr_security.c
+@@ -67,6 +67,7 @@ int reiserfs_security_init(struct inode *dir, struct inode *inode,
+
+ sec->name = NULL;
+ sec->value = NULL;
++ sec->length = 0;
+
+ /* Don't add selinux attributes on xattrs - they'll never get used */
+ if (IS_PRIVATE(dir))
+diff --git a/fs/smb/client/cifs_debug.c b/fs/smb/client/cifs_debug.c
+index b279f745466e4..ed0f71137584f 100644
+--- a/fs/smb/client/cifs_debug.c
++++ b/fs/smb/client/cifs_debug.c
+@@ -122,6 +122,12 @@ static void cifs_debug_tcon(struct seq_file *m, struct cifs_tcon *tcon)
+ seq_puts(m, " nosparse");
+ if (tcon->need_reconnect)
+ seq_puts(m, "\tDISCONNECTED ");
++ spin_lock(&tcon->tc_lock);
++ if (tcon->origin_fullpath) {
++ seq_printf(m, "\n\tDFS origin fullpath: %s",
++ tcon->origin_fullpath);
++ }
++ spin_unlock(&tcon->tc_lock);
+ seq_putc(m, '\n');
+ }
+
+@@ -427,13 +433,9 @@ skip_rdma:
+ seq_printf(m, "\nIn Send: %d In MaxReq Wait: %d",
+ atomic_read(&server->in_send),
+ atomic_read(&server->num_waiters));
+- if (IS_ENABLED(CONFIG_CIFS_DFS_UPCALL)) {
+- if (server->origin_fullpath)
+- seq_printf(m, "\nDFS origin full path: %s",
+- server->origin_fullpath);
+- if (server->leaf_fullpath)
+- seq_printf(m, "\nDFS leaf full path: %s",
+- server->leaf_fullpath);
++ if (server->leaf_fullpath) {
++ seq_printf(m, "\nDFS leaf full path: %s",
++ server->leaf_fullpath);
+ }
+
+ seq_printf(m, "\n\n\tSessions: ");
+diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
+index b212a4e16b39b..ca2da713c5fe9 100644
+--- a/fs/smb/client/cifsglob.h
++++ b/fs/smb/client/cifsglob.h
+@@ -736,23 +736,20 @@ struct TCP_Server_Info {
+ #endif
+ struct mutex refpath_lock; /* protects leaf_fullpath */
+ /*
+- * origin_fullpath: Canonical copy of smb3_fs_context::source.
+- * It is used for matching existing DFS tcons.
+- *
+ * leaf_fullpath: Canonical DFS referral path related to this
+ * connection.
+ * It is used in DFS cache refresher, reconnect and may
+ * change due to nested DFS links.
+ *
+- * Both protected by @refpath_lock and @srv_lock. The @refpath_lock is
+- * mosly used for not requiring a copy of @leaf_fullpath when getting
++ * Protected by @refpath_lock and @srv_lock. The @refpath_lock is
++ * mostly used for not requiring a copy of @leaf_fullpath when getting
+ * cached or new DFS referrals (which might also sleep during I/O).
+ * While @srv_lock is held for making string and NULL comparisons against
+ * both fields as in mount(2) and cache refresh.
+ *
+ * format: \\HOST\SHARE[\OPTIONAL PATH]
+ */
+- char *origin_fullpath, *leaf_fullpath;
++ char *leaf_fullpath;
+ };
+
+ static inline bool is_smb1(struct TCP_Server_Info *server)
+@@ -1205,6 +1202,7 @@ struct cifs_tcon {
+ struct delayed_work dfs_cache_work;
+ #endif
+ struct delayed_work query_interfaces; /* query interfaces workqueue job */
++ char *origin_fullpath; /* canonical copy of smb3_fs_context::source */
+ };
+
+ /*
+diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
+index d127aded2f287..94ab6402965c5 100644
+--- a/fs/smb/client/cifsproto.h
++++ b/fs/smb/client/cifsproto.h
+@@ -650,7 +650,7 @@ int smb2_parse_query_directory(struct cifs_tcon *tcon, struct kvec *rsp_iov,
+ int resp_buftype,
+ struct cifs_search_info *srch_inf);
+
+-struct super_block *cifs_get_tcp_super(struct TCP_Server_Info *server);
++struct super_block *cifs_get_dfs_tcon_super(struct cifs_tcon *tcon);
+ void cifs_put_tcp_super(struct super_block *sb);
+ int cifs_update_super_prepath(struct cifs_sb_info *cifs_sb, char *prefix);
+ char *extract_hostname(const char *unc);
+diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
+index 9d16626e7a669..d9f0b3b94f007 100644
+--- a/fs/smb/client/connect.c
++++ b/fs/smb/client/connect.c
+@@ -996,7 +996,6 @@ static void clean_demultiplex_info(struct TCP_Server_Info *server)
+ */
+ }
+
+- kfree(server->origin_fullpath);
+ kfree(server->leaf_fullpath);
+ kfree(server);
+
+@@ -1436,7 +1435,9 @@ match_security(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
+ }
+
+ /* this function must be called with srv_lock held */
+-static int match_server(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
++static int match_server(struct TCP_Server_Info *server,
++ struct smb3_fs_context *ctx,
++ bool match_super)
+ {
+ struct sockaddr *addr = (struct sockaddr *)&ctx->dstaddr;
+
+@@ -1467,36 +1468,38 @@ static int match_server(struct TCP_Server_Info *server, struct smb3_fs_context *
+ (struct sockaddr *)&server->srcaddr))
+ return 0;
+ /*
+- * - Match for an DFS tcon (@server->origin_fullpath).
+- * - Match for an DFS root server connection (@server->leaf_fullpath).
+- * - If none of the above and @ctx->leaf_fullpath is set, then
+- * it is a new DFS connection.
+- * - If 'nodfs' mount option was passed, then match only connections
+- * that have no DFS referrals set
+- * (e.g. can't failover to other targets).
++ * When matching cifs.ko superblocks (@match_super == true), we can't
++ * really match either @server->leaf_fullpath or @server->dstaddr
++ * directly since this @server might belong to a completely different
++ * server -- in case of domain-based DFS referrals or DFS links -- as
++ * provided earlier by mount(2) through 'source' and 'ip' options.
++ *
++ * Otherwise, match the DFS referral in @server->leaf_fullpath or the
++ * destination address in @server->dstaddr.
++ *
++ * When using 'nodfs' mount option, we avoid sharing it with DFS
++ * connections as they might failover.
+ */
+- if (!ctx->nodfs) {
+- if (ctx->source && server->origin_fullpath) {
+- if (!dfs_src_pathname_equal(ctx->source,
+- server->origin_fullpath))
++ if (!match_super) {
++ if (!ctx->nodfs) {
++ if (server->leaf_fullpath) {
++ if (!ctx->leaf_fullpath ||
++ strcasecmp(server->leaf_fullpath,
++ ctx->leaf_fullpath))
++ return 0;
++ } else if (ctx->leaf_fullpath) {
+ return 0;
++ }
+ } else if (server->leaf_fullpath) {
+- if (!ctx->leaf_fullpath ||
+- strcasecmp(server->leaf_fullpath,
+- ctx->leaf_fullpath))
+- return 0;
+- } else if (ctx->leaf_fullpath) {
+ return 0;
+ }
+- } else if (server->origin_fullpath || server->leaf_fullpath) {
+- return 0;
+ }
+
+ /*
+ * Match for a regular connection (address/hostname/port) which has no
+ * DFS referrals set.
+ */
+- if (!server->origin_fullpath && !server->leaf_fullpath &&
++ if (!server->leaf_fullpath &&
+ (strcasecmp(server->hostname, ctx->server_hostname) ||
+ !match_server_address(server, addr) ||
+ !match_port(server, addr)))
+@@ -1532,7 +1535,8 @@ cifs_find_tcp_session(struct smb3_fs_context *ctx)
+ * Skip ses channels since they're only handled in lower layers
+ * (e.g. cifs_send_recv).
+ */
+- if (CIFS_SERVER_IS_CHAN(server) || !match_server(server, ctx)) {
++ if (CIFS_SERVER_IS_CHAN(server) ||
++ !match_server(server, ctx, false)) {
+ spin_unlock(&server->srv_lock);
+ continue;
+ }
+@@ -2320,10 +2324,16 @@ static int match_tcon(struct cifs_tcon *tcon, struct smb3_fs_context *ctx)
+
+ if (tcon->status == TID_EXITING)
+ return 0;
+- /* Skip UNC validation when matching DFS connections or superblocks */
+- if (!server->origin_fullpath && !server->leaf_fullpath &&
+- strncmp(tcon->tree_name, ctx->UNC, MAX_TREE_SIZE))
++
++ if (tcon->origin_fullpath) {
++ if (!ctx->source ||
++ !dfs_src_pathname_equal(ctx->source,
++ tcon->origin_fullpath))
++ return 0;
++ } else if (!server->leaf_fullpath &&
++ strncmp(tcon->tree_name, ctx->UNC, MAX_TREE_SIZE)) {
+ return 0;
++ }
+ if (tcon->seal != ctx->seal)
+ return 0;
+ if (tcon->snapshot_time != ctx->snapshot_time)
+@@ -2722,7 +2732,7 @@ compare_mount_options(struct super_block *sb, struct cifs_mnt_data *mnt_data)
+ }
+
+ static int match_prepath(struct super_block *sb,
+- struct TCP_Server_Info *server,
++ struct cifs_tcon *tcon,
+ struct cifs_mnt_data *mnt_data)
+ {
+ struct smb3_fs_context *ctx = mnt_data->ctx;
+@@ -2733,8 +2743,8 @@ static int match_prepath(struct super_block *sb,
+ bool new_set = (new->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH) &&
+ new->prepath;
+
+- if (server->origin_fullpath &&
+- dfs_src_pathname_equal(server->origin_fullpath, ctx->source))
++ if (tcon->origin_fullpath &&
++ dfs_src_pathname_equal(tcon->origin_fullpath, ctx->source))
+ return 1;
+
+ if (old_set && new_set && !strcmp(new->prepath, old->prepath))
+@@ -2782,10 +2792,10 @@ cifs_match_super(struct super_block *sb, void *data)
+ spin_lock(&ses->ses_lock);
+ spin_lock(&ses->chan_lock);
+ spin_lock(&tcon->tc_lock);
+- if (!match_server(tcp_srv, ctx) ||
++ if (!match_server(tcp_srv, ctx, true) ||
+ !match_session(ses, ctx) ||
+ !match_tcon(tcon, ctx) ||
+- !match_prepath(sb, tcp_srv, mnt_data)) {
++ !match_prepath(sb, tcon, mnt_data)) {
+ rc = 0;
+ goto out;
+ }
+diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c
+index 2390b2fedd6a3..267536a7531df 100644
+--- a/fs/smb/client/dfs.c
++++ b/fs/smb/client/dfs.c
+@@ -249,14 +249,12 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx)
+ server = mnt_ctx->server;
+ tcon = mnt_ctx->tcon;
+
+- mutex_lock(&server->refpath_lock);
+- spin_lock(&server->srv_lock);
+- if (!server->origin_fullpath) {
+- server->origin_fullpath = origin_fullpath;
++ spin_lock(&tcon->tc_lock);
++ if (!tcon->origin_fullpath) {
++ tcon->origin_fullpath = origin_fullpath;
+ origin_fullpath = NULL;
+ }
+- spin_unlock(&server->srv_lock);
+- mutex_unlock(&server->refpath_lock);
++ spin_unlock(&tcon->tc_lock);
+
+ if (list_empty(&tcon->dfs_ses_list)) {
+ list_replace_init(&mnt_ctx->dfs_ses_list,
+@@ -279,18 +277,13 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
+ {
+ struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
+ struct cifs_ses *ses;
+- char *source = ctx->source;
+ bool nodfs = ctx->nodfs;
+ int rc;
+
+ *isdfs = false;
+- /* Temporarily set @ctx->source to NULL as we're not matching DFS
+- * superblocks yet. See cifs_match_super() and match_server().
+- */
+- ctx->source = NULL;
+ rc = get_session(mnt_ctx, NULL);
+ if (rc)
+- goto out;
++ return rc;
+
+ ctx->dfs_root_ses = mnt_ctx->ses;
+ /*
+@@ -304,7 +297,7 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
+ rc = dfs_get_referral(mnt_ctx, ctx->UNC + 1, NULL, NULL);
+ if (rc) {
+ if (rc != -ENOENT && rc != -EOPNOTSUPP && rc != -EIO)
+- goto out;
++ return rc;
+ nodfs = true;
+ }
+ }
+@@ -312,7 +305,7 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
+ rc = cifs_mount_get_tcon(mnt_ctx);
+ if (!rc)
+ rc = cifs_is_path_remote(mnt_ctx);
+- goto out;
++ return rc;
+ }
+
+ *isdfs = true;
+@@ -328,12 +321,7 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
+ rc = __dfs_mount_share(mnt_ctx);
+ if (ses == ctx->dfs_root_ses)
+ cifs_put_smb_ses(ses);
+-out:
+- /*
+- * Restore previous value of @ctx->source so DFS superblock can be
+- * matched in cifs_match_super().
+- */
+- ctx->source = source;
++
+ return rc;
+ }
+
+@@ -567,11 +555,11 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
+ int rc;
+ struct TCP_Server_Info *server = tcon->ses->server;
+ const struct smb_version_operations *ops = server->ops;
+- struct super_block *sb = NULL;
+- struct cifs_sb_info *cifs_sb;
+ struct dfs_cache_tgt_list tl = DFS_CACHE_TGT_LIST_INIT(tl);
+- char *tree;
++ struct cifs_sb_info *cifs_sb = NULL;
++ struct super_block *sb = NULL;
+ struct dfs_info3_param ref = {0};
++ char *tree;
+
+ /* only send once per connect */
+ spin_lock(&tcon->tc_lock);
+@@ -603,19 +591,18 @@ int cifs_tree_connect(const unsigned int xid, struct cifs_tcon *tcon, const stru
+ goto out;
+ }
+
+- sb = cifs_get_tcp_super(server);
+- if (IS_ERR(sb)) {
+- rc = PTR_ERR(sb);
+- cifs_dbg(VFS, "%s: could not find superblock: %d\n", __func__, rc);
+- goto out;
+- }
+-
+- cifs_sb = CIFS_SB(sb);
++ sb = cifs_get_dfs_tcon_super(tcon);
++ if (!IS_ERR(sb))
++ cifs_sb = CIFS_SB(sb);
+
+- /* If it is not dfs or there was no cached dfs referral, then reconnect to same share */
+- if (!server->leaf_fullpath ||
++ /*
++ * Tree connect to the last share in @tcon->tree_name when no dfs super
++ * or cached dfs referral was found.
++ */
++ if (!cifs_sb || !server->leaf_fullpath ||
+ dfs_cache_noreq_find(server->leaf_fullpath + 1, &ref, &tl)) {
+- rc = ops->tree_connect(xid, tcon->ses, tcon->tree_name, tcon, cifs_sb->local_nls);
++ rc = ops->tree_connect(xid, tcon->ses, tcon->tree_name, tcon,
++ cifs_sb ? cifs_sb->local_nls : nlsc);
+ goto out;
+ }
+
+diff --git a/fs/smb/client/dfs.h b/fs/smb/client/dfs.h
+index 1c90df5ecfbda..98e9d2aca6a7a 100644
+--- a/fs/smb/client/dfs.h
++++ b/fs/smb/client/dfs.h
+@@ -39,16 +39,15 @@ static inline char *dfs_get_automount_devname(struct dentry *dentry, void *page)
+ {
+ struct cifs_sb_info *cifs_sb = CIFS_SB(dentry->d_sb);
+ struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
+- struct TCP_Server_Info *server = tcon->ses->server;
+ size_t len;
+ char *s;
+
+- spin_lock(&server->srv_lock);
+- if (unlikely(!server->origin_fullpath)) {
+- spin_unlock(&server->srv_lock);
++ spin_lock(&tcon->tc_lock);
++ if (unlikely(!tcon->origin_fullpath)) {
++ spin_unlock(&tcon->tc_lock);
+ return ERR_PTR(-EREMOTE);
+ }
+- spin_unlock(&server->srv_lock);
++ spin_unlock(&tcon->tc_lock);
+
+ s = dentry_path_raw(dentry, page, PATH_MAX);
+ if (IS_ERR(s))
+@@ -57,16 +56,16 @@ static inline char *dfs_get_automount_devname(struct dentry *dentry, void *page)
+ if (!s[1])
+ s++;
+
+- spin_lock(&server->srv_lock);
+- len = strlen(server->origin_fullpath);
++ spin_lock(&tcon->tc_lock);
++ len = strlen(tcon->origin_fullpath);
+ if (s < (char *)page + len) {
+- spin_unlock(&server->srv_lock);
++ spin_unlock(&tcon->tc_lock);
+ return ERR_PTR(-ENAMETOOLONG);
+ }
+
+ s -= len;
+- memcpy(s, server->origin_fullpath, len);
+- spin_unlock(&server->srv_lock);
++ memcpy(s, tcon->origin_fullpath, len);
++ spin_unlock(&tcon->tc_lock);
+ convert_delimiter(s, '/');
+
+ return s;
+diff --git a/fs/smb/client/dfs_cache.c b/fs/smb/client/dfs_cache.c
+index 1513b2709889b..33adf43a01f1d 100644
+--- a/fs/smb/client/dfs_cache.c
++++ b/fs/smb/client/dfs_cache.c
+@@ -1248,18 +1248,20 @@ static int refresh_tcon(struct cifs_tcon *tcon, bool force_refresh)
+ int dfs_cache_remount_fs(struct cifs_sb_info *cifs_sb)
+ {
+ struct cifs_tcon *tcon;
+- struct TCP_Server_Info *server;
+
+ if (!cifs_sb || !cifs_sb->master_tlink)
+ return -EINVAL;
+
+ tcon = cifs_sb_master_tcon(cifs_sb);
+- server = tcon->ses->server;
+
+- if (!server->origin_fullpath) {
++ spin_lock(&tcon->tc_lock);
++ if (!tcon->origin_fullpath) {
++ spin_unlock(&tcon->tc_lock);
+ cifs_dbg(FYI, "%s: not a dfs mount\n", __func__);
+ return 0;
+ }
++ spin_unlock(&tcon->tc_lock);
++
+ /*
+ * After reconnecting to a different server, unique ids won't match anymore, so we disable
+ * serverino. This prevents dentry revalidation to think the dentry are stale (ESTALE).
+diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
+index 051283386e229..1a854dc204823 100644
+--- a/fs/smb/client/file.c
++++ b/fs/smb/client/file.c
+@@ -4936,20 +4936,19 @@ oplock_break_ack:
+
+ _cifsFileInfo_put(cfile, false /* do not wait for ourself */, false);
+ /*
+- * releasing stale oplock after recent reconnect of smb session using
+- * a now incorrect file handle is not a data integrity issue but do
+- * not bother sending an oplock release if session to server still is
+- * disconnected since oplock already released by the server
++ * MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) do not require
++ * an acknowledgment to be sent when the file has already been closed.
++ * Check for a NULL server, since this can race with kill_sb() calling tree disconnect.
+ */
+- if (!oplock_break_cancelled) {
+- /* check for server null since can race with kill_sb calling tree disconnect */
+- if (tcon->ses && tcon->ses->server) {
+- rc = tcon->ses->server->ops->oplock_response(tcon, persistent_fid,
+- volatile_fid, net_fid, cinode);
+- cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
+- } else
+- pr_warn_once("lease break not sent for unmounted share\n");
+- }
++ spin_lock(&cinode->open_file_lock);
++ if (tcon->ses && tcon->ses->server && !oplock_break_cancelled &&
++ !list_empty(&cinode->openFileList)) {
++ spin_unlock(&cinode->open_file_lock);
++ rc = tcon->ses->server->ops->oplock_response(tcon, persistent_fid,
++ volatile_fid, net_fid, cinode);
++ cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
++ } else
++ spin_unlock(&cinode->open_file_lock);
+
+ cifs_done_oplock_break(cinode);
+ }
+diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
+index cd914be905b24..b0dedc26643b6 100644
+--- a/fs/smb/client/misc.c
++++ b/fs/smb/client/misc.c
+@@ -156,6 +156,7 @@ tconInfoFree(struct cifs_tcon *tcon)
+ #ifdef CONFIG_CIFS_DFS_UPCALL
+ dfs_put_root_smb_sessions(&tcon->dfs_ses_list);
+ #endif
++ kfree(tcon->origin_fullpath);
+ kfree(tcon);
+ }
+
+@@ -1106,20 +1107,25 @@ struct super_cb_data {
+ struct super_block *sb;
+ };
+
+-static void tcp_super_cb(struct super_block *sb, void *arg)
++static void tcon_super_cb(struct super_block *sb, void *arg)
+ {
+ struct super_cb_data *sd = arg;
+- struct TCP_Server_Info *server = sd->data;
+ struct cifs_sb_info *cifs_sb;
+- struct cifs_tcon *tcon;
++ struct cifs_tcon *t1 = sd->data, *t2;
+
+ if (sd->sb)
+ return;
+
+ cifs_sb = CIFS_SB(sb);
+- tcon = cifs_sb_master_tcon(cifs_sb);
+- if (tcon->ses->server == server)
++ t2 = cifs_sb_master_tcon(cifs_sb);
++
++ spin_lock(&t2->tc_lock);
++ if (t1->ses == t2->ses &&
++ t1->ses->server == t2->ses->server &&
++ t2->origin_fullpath &&
++ dfs_src_pathname_equal(t2->origin_fullpath, t1->origin_fullpath))
+ sd->sb = sb;
++ spin_unlock(&t2->tc_lock);
+ }
+
+ static struct super_block *__cifs_get_super(void (*f)(struct super_block *, void *),
+@@ -1145,6 +1151,7 @@ static struct super_block *__cifs_get_super(void (*f)(struct super_block *, void
+ return sd.sb;
+ }
+ }
++ pr_warn_once("%s: could not find dfs superblock\n", __func__);
+ return ERR_PTR(-EINVAL);
+ }
+
+@@ -1154,9 +1161,15 @@ static void __cifs_put_super(struct super_block *sb)
+ cifs_sb_deactive(sb);
+ }
+
+-struct super_block *cifs_get_tcp_super(struct TCP_Server_Info *server)
++struct super_block *cifs_get_dfs_tcon_super(struct cifs_tcon *tcon)
+ {
+- return __cifs_get_super(tcp_super_cb, server);
++ spin_lock(&tcon->tc_lock);
++ if (!tcon->origin_fullpath) {
++ spin_unlock(&tcon->tc_lock);
++ return ERR_PTR(-ENOENT);
++ }
++ spin_unlock(&tcon->tc_lock);
++ return __cifs_get_super(tcon_super_cb, tcon);
+ }
+
+ void cifs_put_tcp_super(struct super_block *sb)
+@@ -1238,9 +1251,16 @@ int cifs_inval_name_dfs_link_error(const unsigned int xid,
+ */
+ if (strlen(full_path) < 2 || !cifs_sb ||
+ (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS) ||
+- !is_tcon_dfs(tcon) || !ses->server->origin_fullpath)
++ !is_tcon_dfs(tcon))
+ return 0;
+
++ spin_lock(&tcon->tc_lock);
++ if (!tcon->origin_fullpath) {
++ spin_unlock(&tcon->tc_lock);
++ return 0;
++ }
++ spin_unlock(&tcon->tc_lock);
++
+ /*
+ * Slow path - tcon is DFS and @full_path has prefix path, so attempt
+ * to get a referral to figure out whether it is a DFS link.
+@@ -1264,7 +1284,7 @@ int cifs_inval_name_dfs_link_error(const unsigned int xid,
+
+ /*
+ * XXX: we are not using dfs_cache_find() here because we might
+- * end filling all the DFS cache and thus potentially
++ * end up filling all the DFS cache and thus potentially
+ * removing cached DFS targets that the client would eventually
+ * need during failover.
+ */
+diff --git a/fs/smb/client/smb2inode.c b/fs/smb/client/smb2inode.c
+index 163a03298430d..8e696fbd72fa8 100644
+--- a/fs/smb/client/smb2inode.c
++++ b/fs/smb/client/smb2inode.c
+@@ -398,9 +398,6 @@ static int smb2_compound_op(const unsigned int xid, struct cifs_tcon *tcon,
+ rsp_iov);
+
+ finished:
+- if (cfile)
+- cifsFileInfo_put(cfile);
+-
+ SMB2_open_free(&rqst[0]);
+ if (rc == -EREMCHG) {
+ pr_warn_once("server share %s deleted\n", tcon->tree_name);
+@@ -529,6 +526,9 @@ static int smb2_compound_op(const unsigned int xid, struct cifs_tcon *tcon,
+ break;
+ }
+
++ if (cfile)
++ cifsFileInfo_put(cfile);
++
+ if (rc && err_iov && err_buftype) {
+ memcpy(err_iov, rsp_iov, 3 * sizeof(*err_iov));
+ memcpy(err_buftype, resp_buftype, 3 * sizeof(*err_buftype));
+@@ -609,9 +609,6 @@ int smb2_query_path_info(const unsigned int xid, struct cifs_tcon *tcon,
+ if (islink)
+ rc = -EREMOTE;
+ }
+- if (rc == -EREMOTE && IS_ENABLED(CONFIG_CIFS_DFS_UPCALL) && cifs_sb &&
+- (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_DFS))
+- rc = -EOPNOTSUPP;
+ }
+
+ out:
+diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
+index a8bb9d00d33ad..3bac586e8a8eb 100644
+--- a/fs/smb/client/smb2ops.c
++++ b/fs/smb/client/smb2ops.c
+@@ -211,6 +211,16 @@ smb2_wait_mtu_credits(struct TCP_Server_Info *server, unsigned int size,
+
+ spin_lock(&server->req_lock);
+ while (1) {
++ spin_unlock(&server->req_lock);
++
++ spin_lock(&server->srv_lock);
++ if (server->tcpStatus == CifsExiting) {
++ spin_unlock(&server->srv_lock);
++ return -ENOENT;
++ }
++ spin_unlock(&server->srv_lock);
++
++ spin_lock(&server->req_lock);
+ if (server->credits <= 0) {
+ spin_unlock(&server->req_lock);
+ cifs_num_waiters_inc(server);
+@@ -221,15 +231,6 @@ smb2_wait_mtu_credits(struct TCP_Server_Info *server, unsigned int size,
+ return rc;
+ spin_lock(&server->req_lock);
+ } else {
+- spin_unlock(&server->req_lock);
+- spin_lock(&server->srv_lock);
+- if (server->tcpStatus == CifsExiting) {
+- spin_unlock(&server->srv_lock);
+- return -ENOENT;
+- }
+- spin_unlock(&server->srv_lock);
+-
+- spin_lock(&server->req_lock);
+ scredits = server->credits;
+ /* can deadlock with reopen */
+ if (scredits <= 8) {
+diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c
+index 0474d0bba0a2e..f280502a2aee8 100644
+--- a/fs/smb/client/transport.c
++++ b/fs/smb/client/transport.c
+@@ -522,6 +522,16 @@ wait_for_free_credits(struct TCP_Server_Info *server, const int num_credits,
+ }
+
+ while (1) {
++ spin_unlock(&server->req_lock);
++
++ spin_lock(&server->srv_lock);
++ if (server->tcpStatus == CifsExiting) {
++ spin_unlock(&server->srv_lock);
++ return -ENOENT;
++ }
++ spin_unlock(&server->srv_lock);
++
++ spin_lock(&server->req_lock);
+ if (*credits < num_credits) {
+ scredits = *credits;
+ spin_unlock(&server->req_lock);
+@@ -547,15 +557,6 @@ wait_for_free_credits(struct TCP_Server_Info *server, const int num_credits,
+ return -ERESTARTSYS;
+ spin_lock(&server->req_lock);
+ } else {
+- spin_unlock(&server->req_lock);
+-
+- spin_lock(&server->srv_lock);
+- if (server->tcpStatus == CifsExiting) {
+- spin_unlock(&server->srv_lock);
+- return -ENOENT;
+- }
+- spin_unlock(&server->srv_lock);
+-
+ /*
+ * For normal commands, reserve the last MAX_COMPOUND
+ * credits to compound requests.
+@@ -569,7 +570,6 @@ wait_for_free_credits(struct TCP_Server_Info *server, const int num_credits,
+ * for servers that are slow to hand out credits on
+ * new sessions.
+ */
+- spin_lock(&server->req_lock);
+ if (!optype && num_credits == 1 &&
+ server->in_flight > 2 * MAX_COMPOUND &&
+ *credits <= MAX_COMPOUND) {
+diff --git a/fs/smb/server/smb_common.c b/fs/smb/server/smb_common.c
+index 569e5eecdf3db..3e391a7d5a3ab 100644
+--- a/fs/smb/server/smb_common.c
++++ b/fs/smb/server/smb_common.c
+@@ -536,7 +536,7 @@ int ksmbd_extract_shortname(struct ksmbd_conn *conn, const char *longname,
+ out[baselen + 3] = PERIOD;
+
+ if (dot_present)
+- memcpy(&out[baselen + 4], extension, 4);
++ memcpy(out + baselen + 4, extension, 4);
+ else
+ out[baselen + 4] = '\0';
+ smbConvertToUTF16((__le16 *)shortname, out, PATH_MAX,
+diff --git a/fs/splice.c b/fs/splice.c
+index 3e06611d19ae5..030e162985b5d 100644
+--- a/fs/splice.c
++++ b/fs/splice.c
+@@ -355,7 +355,6 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
+ reclaim -= ret;
+ remain = ret;
+ *ppos = kiocb.ki_pos;
+- file_accessed(in);
+ } else if (ret < 0) {
+ /*
+ * callers of ->splice_read() expect -EAGAIN on
+diff --git a/fs/udf/namei.c b/fs/udf/namei.c
+index fd20423d3ed24..fd29a66e7241f 100644
+--- a/fs/udf/namei.c
++++ b/fs/udf/namei.c
+@@ -793,11 +793,6 @@ static int udf_rename(struct mnt_idmap *idmap, struct inode *old_dir,
+ if (!empty_dir(new_inode))
+ goto out_oiter;
+ }
+- /*
+- * We need to protect against old_inode getting converted from
+- * ICB to normal directory.
+- */
+- inode_lock_nested(old_inode, I_MUTEX_NONDIR2);
+ retval = udf_fiiter_find_entry(old_inode, &dotdot_name,
+ &diriter);
+ if (retval == -ENOENT) {
+@@ -806,10 +801,8 @@ static int udf_rename(struct mnt_idmap *idmap, struct inode *old_dir,
+ old_inode->i_ino);
+ retval = -EFSCORRUPTED;
+ }
+- if (retval) {
+- inode_unlock(old_inode);
++ if (retval)
+ goto out_oiter;
+- }
+ has_diriter = true;
+ tloc = lelb_to_cpu(diriter.fi.icb.extLocation);
+ if (udf_get_lb_pblock(old_inode->i_sb, &tloc, 0) !=
+@@ -889,7 +882,6 @@ static int udf_rename(struct mnt_idmap *idmap, struct inode *old_dir,
+ udf_dir_entry_len(&diriter.fi));
+ udf_fiiter_write_fi(&diriter, NULL);
+ udf_fiiter_release(&diriter);
+- inode_unlock(old_inode);
+
+ inode_dec_link_count(old_dir);
+ if (new_inode)
+@@ -901,10 +893,8 @@ static int udf_rename(struct mnt_idmap *idmap, struct inode *old_dir,
+ }
+ return 0;
+ out_oiter:
+- if (has_diriter) {
++ if (has_diriter)
+ udf_fiiter_release(&diriter);
+- inode_unlock(old_inode);
+- }
+ udf_fiiter_release(&oiter);
+
+ return retval;
+diff --git a/fs/verity/enable.c b/fs/verity/enable.c
+index fc4c50e5219dc..bd86b25ac084b 100644
+--- a/fs/verity/enable.c
++++ b/fs/verity/enable.c
+@@ -7,6 +7,7 @@
+
+ #include "fsverity_private.h"
+
++#include <crypto/hash.h>
+ #include <linux/mount.h>
+ #include <linux/sched/signal.h>
+ #include <linux/uaccess.h>
+@@ -20,7 +21,7 @@ struct block_buffer {
+ /* Hash a block, writing the result to the next level's pending block buffer. */
+ static int hash_one_block(struct inode *inode,
+ const struct merkle_tree_params *params,
+- struct ahash_request *req, struct block_buffer *cur)
++ struct block_buffer *cur)
+ {
+ struct block_buffer *next = cur + 1;
+ int err;
+@@ -36,8 +37,7 @@ static int hash_one_block(struct inode *inode,
+ /* Zero-pad the block if it's shorter than the block size. */
+ memset(&cur->data[cur->filled], 0, params->block_size - cur->filled);
+
+- err = fsverity_hash_block(params, inode, req, virt_to_page(cur->data),
+- offset_in_page(cur->data),
++ err = fsverity_hash_block(params, inode, cur->data,
+ &next->data[next->filled]);
+ if (err)
+ return err;
+@@ -76,7 +76,6 @@ static int build_merkle_tree(struct file *filp,
+ struct inode *inode = file_inode(filp);
+ const u64 data_size = inode->i_size;
+ const int num_levels = params->num_levels;
+- struct ahash_request *req;
+ struct block_buffer _buffers[1 + FS_VERITY_MAX_LEVELS + 1] = {};
+ struct block_buffer *buffers = &_buffers[1];
+ unsigned long level_offset[FS_VERITY_MAX_LEVELS];
+@@ -90,9 +89,6 @@ static int build_merkle_tree(struct file *filp,
+ return 0;
+ }
+
+- /* This allocation never fails, since it's mempool-backed. */
+- req = fsverity_alloc_hash_request(params->hash_alg, GFP_KERNEL);
+-
+ /*
+ * Allocate the block buffers. Buffer "-1" is for data blocks.
+ * Buffers 0 <= level < num_levels are for the actual tree levels.
+@@ -130,7 +126,7 @@ static int build_merkle_tree(struct file *filp,
+ fsverity_err(inode, "Short read of file data");
+ goto out;
+ }
+- err = hash_one_block(inode, params, req, &buffers[-1]);
++ err = hash_one_block(inode, params, &buffers[-1]);
+ if (err)
+ goto out;
+ for (level = 0; level < num_levels; level++) {
+@@ -141,8 +137,7 @@ static int build_merkle_tree(struct file *filp,
+ }
+ /* Next block at @level is full */
+
+- err = hash_one_block(inode, params, req,
+- &buffers[level]);
++ err = hash_one_block(inode, params, &buffers[level]);
+ if (err)
+ goto out;
+ err = write_merkle_tree_block(inode,
+@@ -162,8 +157,7 @@ static int build_merkle_tree(struct file *filp,
+ /* Finish all nonempty pending tree blocks. */
+ for (level = 0; level < num_levels; level++) {
+ if (buffers[level].filled != 0) {
+- err = hash_one_block(inode, params, req,
+- &buffers[level]);
++ err = hash_one_block(inode, params, &buffers[level]);
+ if (err)
+ goto out;
+ err = write_merkle_tree_block(inode,
+@@ -183,7 +177,6 @@ static int build_merkle_tree(struct file *filp,
+ out:
+ for (level = -1; level < num_levels; level++)
+ kfree(buffers[level].data);
+- fsverity_free_hash_request(params->hash_alg, req);
+ return err;
+ }
+
+diff --git a/fs/verity/fsverity_private.h b/fs/verity/fsverity_private.h
+index d34dcc033d723..8527beca2a454 100644
+--- a/fs/verity/fsverity_private.h
++++ b/fs/verity/fsverity_private.h
+@@ -11,9 +11,6 @@
+ #define pr_fmt(fmt) "fs-verity: " fmt
+
+ #include <linux/fsverity.h>
+-#include <linux/mempool.h>
+-
+-struct ahash_request;
+
+ /*
+ * Implementation limit: maximum depth of the Merkle tree. For now 8 is plenty;
+@@ -23,11 +20,10 @@ struct ahash_request;
+
+ /* A hash algorithm supported by fs-verity */
+ struct fsverity_hash_alg {
+- struct crypto_ahash *tfm; /* hash tfm, allocated on demand */
++ struct crypto_shash *tfm; /* hash tfm, allocated on demand */
+ const char *name; /* crypto API name, e.g. sha256 */
+ unsigned int digest_size; /* digest size in bytes, e.g. 32 for SHA-256 */
+ unsigned int block_size; /* block size in bytes, e.g. 64 for SHA-256 */
+- mempool_t req_pool; /* mempool with a preallocated hash request */
+ /*
+ * The HASH_ALGO_* constant for this algorithm. This is different from
+ * FS_VERITY_HASH_ALG_*, which uses a different numbering scheme.
+@@ -85,15 +81,10 @@ extern struct fsverity_hash_alg fsverity_hash_algs[];
+
+ struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
+ unsigned int num);
+-struct ahash_request *fsverity_alloc_hash_request(struct fsverity_hash_alg *alg,
+- gfp_t gfp_flags);
+-void fsverity_free_hash_request(struct fsverity_hash_alg *alg,
+- struct ahash_request *req);
+ const u8 *fsverity_prepare_hash_state(struct fsverity_hash_alg *alg,
+ const u8 *salt, size_t salt_size);
+ int fsverity_hash_block(const struct merkle_tree_params *params,
+- const struct inode *inode, struct ahash_request *req,
+- struct page *page, unsigned int offset, u8 *out);
++ const struct inode *inode, const void *data, u8 *out);
+ int fsverity_hash_buffer(struct fsverity_hash_alg *alg,
+ const void *data, size_t size, u8 *out);
+ void __init fsverity_check_hash_algs(void);
+diff --git a/fs/verity/hash_algs.c b/fs/verity/hash_algs.c
+index ea00dbedf756b..e7e982412e23a 100644
+--- a/fs/verity/hash_algs.c
++++ b/fs/verity/hash_algs.c
+@@ -8,7 +8,6 @@
+ #include "fsverity_private.h"
+
+ #include <crypto/hash.h>
+-#include <linux/scatterlist.h>
+
+ /* The hash algorithms supported by fs-verity */
+ struct fsverity_hash_alg fsverity_hash_algs[] = {
+@@ -44,7 +43,7 @@ struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
+ unsigned int num)
+ {
+ struct fsverity_hash_alg *alg;
+- struct crypto_ahash *tfm;
++ struct crypto_shash *tfm;
+ int err;
+
+ if (num >= ARRAY_SIZE(fsverity_hash_algs) ||
+@@ -63,11 +62,7 @@ struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
+ if (alg->tfm != NULL)
+ goto out_unlock;
+
+- /*
+- * Using the shash API would make things a bit simpler, but the ahash
+- * API is preferable as it allows the use of crypto accelerators.
+- */
+- tfm = crypto_alloc_ahash(alg->name, 0, 0);
++ tfm = crypto_alloc_shash(alg->name, 0, 0);
+ if (IS_ERR(tfm)) {
+ if (PTR_ERR(tfm) == -ENOENT) {
+ fsverity_warn(inode,
+@@ -84,68 +79,26 @@ struct fsverity_hash_alg *fsverity_get_hash_alg(const struct inode *inode,
+ }
+
+ err = -EINVAL;
+- if (WARN_ON_ONCE(alg->digest_size != crypto_ahash_digestsize(tfm)))
++ if (WARN_ON_ONCE(alg->digest_size != crypto_shash_digestsize(tfm)))
+ goto err_free_tfm;
+- if (WARN_ON_ONCE(alg->block_size != crypto_ahash_blocksize(tfm)))
+- goto err_free_tfm;
+-
+- err = mempool_init_kmalloc_pool(&alg->req_pool, 1,
+- sizeof(struct ahash_request) +
+- crypto_ahash_reqsize(tfm));
+- if (err)
++ if (WARN_ON_ONCE(alg->block_size != crypto_shash_blocksize(tfm)))
+ goto err_free_tfm;
+
+ pr_info("%s using implementation \"%s\"\n",
+- alg->name, crypto_ahash_driver_name(tfm));
++ alg->name, crypto_shash_driver_name(tfm));
+
+ /* pairs with smp_load_acquire() above */
+ smp_store_release(&alg->tfm, tfm);
+ goto out_unlock;
+
+ err_free_tfm:
+- crypto_free_ahash(tfm);
++ crypto_free_shash(tfm);
+ alg = ERR_PTR(err);
+ out_unlock:
+ mutex_unlock(&fsverity_hash_alg_init_mutex);
+ return alg;
+ }
+
+-/**
+- * fsverity_alloc_hash_request() - allocate a hash request object
+- * @alg: the hash algorithm for which to allocate the request
+- * @gfp_flags: memory allocation flags
+- *
+- * This is mempool-backed, so this never fails if __GFP_DIRECT_RECLAIM is set in
+- * @gfp_flags. However, in that case this might need to wait for all
+- * previously-allocated requests to be freed. So to avoid deadlocks, callers
+- * must never need multiple requests at a time to make forward progress.
+- *
+- * Return: the request object on success; NULL on failure (but see above)
+- */
+-struct ahash_request *fsverity_alloc_hash_request(struct fsverity_hash_alg *alg,
+- gfp_t gfp_flags)
+-{
+- struct ahash_request *req = mempool_alloc(&alg->req_pool, gfp_flags);
+-
+- if (req)
+- ahash_request_set_tfm(req, alg->tfm);
+- return req;
+-}
+-
+-/**
+- * fsverity_free_hash_request() - free a hash request object
+- * @alg: the hash algorithm
+- * @req: the hash request object to free
+- */
+-void fsverity_free_hash_request(struct fsverity_hash_alg *alg,
+- struct ahash_request *req)
+-{
+- if (req) {
+- ahash_request_zero(req);
+- mempool_free(req, &alg->req_pool);
+- }
+-}
+-
+ /**
+ * fsverity_prepare_hash_state() - precompute the initial hash state
+ * @alg: hash algorithm
+@@ -159,23 +112,20 @@ const u8 *fsverity_prepare_hash_state(struct fsverity_hash_alg *alg,
+ const u8 *salt, size_t salt_size)
+ {
+ u8 *hashstate = NULL;
+- struct ahash_request *req = NULL;
++ SHASH_DESC_ON_STACK(desc, alg->tfm);
+ u8 *padded_salt = NULL;
+ size_t padded_salt_size;
+- struct scatterlist sg;
+- DECLARE_CRYPTO_WAIT(wait);
+ int err;
+
++ desc->tfm = alg->tfm;
++
+ if (salt_size == 0)
+ return NULL;
+
+- hashstate = kmalloc(crypto_ahash_statesize(alg->tfm), GFP_KERNEL);
++ hashstate = kmalloc(crypto_shash_statesize(alg->tfm), GFP_KERNEL);
+ if (!hashstate)
+ return ERR_PTR(-ENOMEM);
+
+- /* This allocation never fails, since it's mempool-backed. */
+- req = fsverity_alloc_hash_request(alg, GFP_KERNEL);
+-
+ /*
+ * Zero-pad the salt to the next multiple of the input size of the hash
+ * algorithm's compression function, e.g. 64 bytes for SHA-256 or 128
+@@ -190,26 +140,18 @@ const u8 *fsverity_prepare_hash_state(struct fsverity_hash_alg *alg,
+ goto err_free;
+ }
+ memcpy(padded_salt, salt, salt_size);
+-
+- sg_init_one(&sg, padded_salt, padded_salt_size);
+- ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
+- CRYPTO_TFM_REQ_MAY_BACKLOG,
+- crypto_req_done, &wait);
+- ahash_request_set_crypt(req, &sg, NULL, padded_salt_size);
+-
+- err = crypto_wait_req(crypto_ahash_init(req), &wait);
++ err = crypto_shash_init(desc);
+ if (err)
+ goto err_free;
+
+- err = crypto_wait_req(crypto_ahash_update(req), &wait);
++ err = crypto_shash_update(desc, padded_salt, padded_salt_size);
+ if (err)
+ goto err_free;
+
+- err = crypto_ahash_export(req, hashstate);
++ err = crypto_shash_export(desc, hashstate);
+ if (err)
+ goto err_free;
+ out:
+- fsverity_free_hash_request(alg, req);
+ kfree(padded_salt);
+ return hashstate;
+
+@@ -223,9 +165,7 @@ err_free:
+ * fsverity_hash_block() - hash a single data or hash block
+ * @params: the Merkle tree's parameters
+ * @inode: inode for which the hashing is being done
+- * @req: preallocated hash request
+- * @page: the page containing the block to hash
+- * @offset: the offset of the block within @page
++ * @data: virtual address of a buffer containing the block to hash
+ * @out: output digest, size 'params->digest_size' bytes
+ *
+ * Hash a single data or hash block. The hash is salted if a salt is specified
+@@ -234,33 +174,24 @@ err_free:
+ * Return: 0 on success, -errno on failure
+ */
+ int fsverity_hash_block(const struct merkle_tree_params *params,
+- const struct inode *inode, struct ahash_request *req,
+- struct page *page, unsigned int offset, u8 *out)
++ const struct inode *inode, const void *data, u8 *out)
+ {
+- struct scatterlist sg;
+- DECLARE_CRYPTO_WAIT(wait);
++ SHASH_DESC_ON_STACK(desc, params->hash_alg->tfm);
+ int err;
+
+- sg_init_table(&sg, 1);
+- sg_set_page(&sg, page, params->block_size, offset);
+- ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
+- CRYPTO_TFM_REQ_MAY_BACKLOG,
+- crypto_req_done, &wait);
+- ahash_request_set_crypt(req, &sg, out, params->block_size);
++ desc->tfm = params->hash_alg->tfm;
+
+ if (params->hashstate) {
+- err = crypto_ahash_import(req, params->hashstate);
++ err = crypto_shash_import(desc, params->hashstate);
+ if (err) {
+ fsverity_err(inode,
+ "Error %d importing hash state", err);
+ return err;
+ }
+- err = crypto_ahash_finup(req);
++ err = crypto_shash_finup(desc, data, params->block_size, out);
+ } else {
+- err = crypto_ahash_digest(req);
++ err = crypto_shash_digest(desc, data, params->block_size, out);
+ }
+-
+- err = crypto_wait_req(err, &wait);
+ if (err)
+ fsverity_err(inode, "Error %d computing block hash", err);
+ return err;
+@@ -273,32 +204,12 @@ int fsverity_hash_block(const struct merkle_tree_params *params,
+ * @size: size of data to hash, in bytes
+ * @out: output digest, size 'alg->digest_size' bytes
+ *
+- * Hash some data which is located in physically contiguous memory (i.e. memory
+- * allocated by kmalloc(), not by vmalloc()). No salt is used.
+- *
+ * Return: 0 on success, -errno on failure
+ */
+ int fsverity_hash_buffer(struct fsverity_hash_alg *alg,
+ const void *data, size_t size, u8 *out)
+ {
+- struct ahash_request *req;
+- struct scatterlist sg;
+- DECLARE_CRYPTO_WAIT(wait);
+- int err;
+-
+- /* This allocation never fails, since it's mempool-backed. */
+- req = fsverity_alloc_hash_request(alg, GFP_KERNEL);
+-
+- sg_init_one(&sg, data, size);
+- ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
+- CRYPTO_TFM_REQ_MAY_BACKLOG,
+- crypto_req_done, &wait);
+- ahash_request_set_crypt(req, &sg, out, size);
+-
+- err = crypto_wait_req(crypto_ahash_digest(req), &wait);
+-
+- fsverity_free_hash_request(alg, req);
+- return err;
++ return crypto_shash_tfm_digest(alg->tfm, data, size, out);
+ }
+
+ void __init fsverity_check_hash_algs(void)
+diff --git a/fs/verity/verify.c b/fs/verity/verify.c
+index e2508222750b3..cf40e2fe6ace7 100644
+--- a/fs/verity/verify.c
++++ b/fs/verity/verify.c
+@@ -29,21 +29,6 @@ static inline int cmp_hashes(const struct fsverity_info *vi,
+ return -EBADMSG;
+ }
+
+-static bool data_is_zeroed(struct inode *inode, struct page *page,
+- unsigned int len, unsigned int offset)
+-{
+- void *virt = kmap_local_page(page);
+-
+- if (memchr_inv(virt + offset, 0, len)) {
+- kunmap_local(virt);
+- fsverity_err(inode,
+- "FILE CORRUPTED! Data past EOF is not zeroed");
+- return false;
+- }
+- kunmap_local(virt);
+- return true;
+-}
+-
+ /*
+ * Returns true if the hash block with index @hblock_idx in the tree, located in
+ * @hpage, has already been verified.
+@@ -122,9 +107,7 @@ static bool is_hash_block_verified(struct fsverity_info *vi, struct page *hpage,
+ */
+ static bool
+ verify_data_block(struct inode *inode, struct fsverity_info *vi,
+- struct ahash_request *req, struct page *data_page,
+- u64 data_pos, unsigned int dblock_offset_in_page,
+- unsigned long max_ra_pages)
++ const void *data, u64 data_pos, unsigned long max_ra_pages)
+ {
+ const struct merkle_tree_params *params = &vi->tree_params;
+ const unsigned int hsize = params->digest_size;
+@@ -136,11 +119,11 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
+ struct {
+ /* Page containing the hash block */
+ struct page *page;
++ /* Mapped address of the hash block (will be within @page) */
++ const void *addr;
+ /* Index of the hash block in the tree overall */
+ unsigned long index;
+- /* Byte offset of the hash block within @page */
+- unsigned int offset_in_page;
+- /* Byte offset of the wanted hash within @page */
++ /* Byte offset of the wanted hash relative to @addr */
+ unsigned int hoffset;
+ } hblocks[FS_VERITY_MAX_LEVELS];
+ /*
+@@ -150,6 +133,9 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
+ u64 hidx = data_pos >> params->log_blocksize;
+ int err;
+
++ /* Up to 1 + FS_VERITY_MAX_LEVELS pages may be mapped at once */
++ BUILD_BUG_ON(1 + FS_VERITY_MAX_LEVELS > KM_MAX_IDX);
++
+ if (unlikely(data_pos >= inode->i_size)) {
+ /*
+ * This can happen in the data page spanning EOF when the Merkle
+@@ -159,8 +145,12 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
+ * any part past EOF should be all zeroes. Therefore, we need
+ * to verify that any data blocks fully past EOF are all zeroes.
+ */
+- return data_is_zeroed(inode, data_page, params->block_size,
+- dblock_offset_in_page);
++ if (memchr_inv(data, 0, params->block_size)) {
++ fsverity_err(inode,
++ "FILE CORRUPTED! Data past EOF is not zeroed");
++ return false;
++ }
++ return true;
+ }
+
+ /*
+@@ -175,6 +165,7 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
+ unsigned int hblock_offset_in_page;
+ unsigned int hoffset;
+ struct page *hpage;
++ const void *haddr;
+
+ /*
+ * The index of the block in the current level; also the index
+@@ -192,10 +183,9 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
+ hblock_offset_in_page =
+ (hblock_idx << params->log_blocksize) & ~PAGE_MASK;
+
+- /* Byte offset of the hash within the page */
+- hoffset = hblock_offset_in_page +
+- ((hidx << params->log_digestsize) &
+- (params->block_size - 1));
++ /* Byte offset of the hash within the block */
++ hoffset = (hidx << params->log_digestsize) &
++ (params->block_size - 1);
+
+ hpage = inode->i_sb->s_vop->read_merkle_tree_page(inode,
+ hpage_idx, level == 0 ? min(max_ra_pages,
+@@ -207,15 +197,17 @@ verify_data_block(struct inode *inode, struct fsverity_info *vi,
+ err, hpage_idx);
+ goto out;
+ }
++ haddr = kmap_local_page(hpage) + hblock_offset_in_page;
+ if (is_hash_block_verified(vi, hpage, hblock_idx)) {
+- memcpy_from_page(_want_hash, hpage, hoffset, hsize);
++ memcpy(_want_hash, haddr + hoffset, hsize);
+ want_hash = _want_hash;
++ kunmap_local(haddr);
+ put_page(hpage);
+ goto descend;
+ }
+ hblocks[level].page = hpage;
++ hblocks[level].addr = haddr;
+ hblocks[level].index = hblock_idx;
+- hblocks[level].offset_in_page = hblock_offset_in_page;
+ hblocks[level].hoffset = hoffset;
+ hidx = next_hidx;
+ }
+@@ -225,13 +217,11 @@ descend:
+ /* Descend the tree verifying hash blocks. */
+ for (; level > 0; level--) {
+ struct page *hpage = hblocks[level - 1].page;
++ const void *haddr = hblocks[level - 1].addr;
+ unsigned long hblock_idx = hblocks[level - 1].index;
+- unsigned int hblock_offset_in_page =
+- hblocks[level - 1].offset_in_page;
+ unsigned int hoffset = hblocks[level - 1].hoffset;
+
+- err = fsverity_hash_block(params, inode, req, hpage,
+- hblock_offset_in_page, real_hash);
++ err = fsverity_hash_block(params, inode, haddr, real_hash);
+ if (err)
+ goto out;
+ err = cmp_hashes(vi, want_hash, real_hash, data_pos, level - 1);
+@@ -246,29 +236,31 @@ descend:
+ set_bit(hblock_idx, vi->hash_block_verified);
+ else
+ SetPageChecked(hpage);
+- memcpy_from_page(_want_hash, hpage, hoffset, hsize);
++ memcpy(_want_hash, haddr + hoffset, hsize);
+ want_hash = _want_hash;
++ kunmap_local(haddr);
+ put_page(hpage);
+ }
+
+ /* Finally, verify the data block. */
+- err = fsverity_hash_block(params, inode, req, data_page,
+- dblock_offset_in_page, real_hash);
++ err = fsverity_hash_block(params, inode, data, real_hash);
+ if (err)
+ goto out;
+ err = cmp_hashes(vi, want_hash, real_hash, data_pos, -1);
+ out:
+- for (; level > 0; level--)
++ for (; level > 0; level--) {
++ kunmap_local(hblocks[level - 1].addr);
+ put_page(hblocks[level - 1].page);
+-
++ }
+ return err == 0;
+ }
+
+ static bool
+-verify_data_blocks(struct inode *inode, struct fsverity_info *vi,
+- struct ahash_request *req, struct folio *data_folio,
+- size_t len, size_t offset, unsigned long max_ra_pages)
++verify_data_blocks(struct folio *data_folio, size_t len, size_t offset,
++ unsigned long max_ra_pages)
+ {
++ struct inode *inode = data_folio->mapping->host;
++ struct fsverity_info *vi = inode->i_verity_info;
+ const unsigned int block_size = vi->tree_params.block_size;
+ u64 pos = (u64)data_folio->index << PAGE_SHIFT;
+
+@@ -278,11 +270,14 @@ verify_data_blocks(struct inode *inode, struct fsverity_info *vi,
+ folio_test_uptodate(data_folio)))
+ return false;
+ do {
+- struct page *data_page =
+- folio_page(data_folio, offset >> PAGE_SHIFT);
+-
+- if (!verify_data_block(inode, vi, req, data_page, pos + offset,
+- offset & ~PAGE_MASK, max_ra_pages))
++ void *data;
++ bool valid;
++
++ data = kmap_local_folio(data_folio, offset);
++ valid = verify_data_block(inode, vi, data, pos + offset,
++ max_ra_pages);
++ kunmap_local(data);
++ if (!valid)
+ return false;
+ offset += block_size;
+ len -= block_size;
+@@ -304,19 +299,7 @@ verify_data_blocks(struct inode *inode, struct fsverity_info *vi,
+ */
+ bool fsverity_verify_blocks(struct folio *folio, size_t len, size_t offset)
+ {
+- struct inode *inode = folio->mapping->host;
+- struct fsverity_info *vi = inode->i_verity_info;
+- struct ahash_request *req;
+- bool valid;
+-
+- /* This allocation never fails, since it's mempool-backed. */
+- req = fsverity_alloc_hash_request(vi->tree_params.hash_alg, GFP_NOFS);
+-
+- valid = verify_data_blocks(inode, vi, req, folio, len, offset, 0);
+-
+- fsverity_free_hash_request(vi->tree_params.hash_alg, req);
+-
+- return valid;
++ return verify_data_blocks(folio, len, offset, 0);
+ }
+ EXPORT_SYMBOL_GPL(fsverity_verify_blocks);
+
+@@ -337,15 +320,9 @@ EXPORT_SYMBOL_GPL(fsverity_verify_blocks);
+ */
+ void fsverity_verify_bio(struct bio *bio)
+ {
+- struct inode *inode = bio_first_page_all(bio)->mapping->host;
+- struct fsverity_info *vi = inode->i_verity_info;
+- struct ahash_request *req;
+ struct folio_iter fi;
+ unsigned long max_ra_pages = 0;
+
+- /* This allocation never fails, since it's mempool-backed. */
+- req = fsverity_alloc_hash_request(vi->tree_params.hash_alg, GFP_NOFS);
+-
+ if (bio->bi_opf & REQ_RAHEAD) {
+ /*
+ * If this bio is for data readahead, then we also do readahead
+@@ -360,14 +337,12 @@ void fsverity_verify_bio(struct bio *bio)
+ }
+
+ bio_for_each_folio_all(fi, bio) {
+- if (!verify_data_blocks(inode, vi, req, fi.folio, fi.length,
+- fi.offset, max_ra_pages)) {
++ if (!verify_data_blocks(fi.folio, fi.length, fi.offset,
++ max_ra_pages)) {
+ bio->bi_status = BLK_STS_IOERR;
+ break;
+ }
+ }
+-
+- fsverity_free_hash_request(vi->tree_params.hash_alg, req);
+ }
+ EXPORT_SYMBOL_GPL(fsverity_verify_bio);
+ #endif /* CONFIG_BLOCK */
+diff --git a/include/drm/bridge/samsung-dsim.h b/include/drm/bridge/samsung-dsim.h
+index ba5484de2b30e..a1a5b2b89a7ab 100644
+--- a/include/drm/bridge/samsung-dsim.h
++++ b/include/drm/bridge/samsung-dsim.h
+@@ -54,11 +54,14 @@ struct samsung_dsim_driver_data {
+ unsigned int has_freqband:1;
+ unsigned int has_clklane_stop:1;
+ unsigned int num_clks;
++ unsigned int min_freq;
+ unsigned int max_freq;
+ unsigned int wait_for_reset;
+ unsigned int num_bits_resol;
+ unsigned int pll_p_offset;
+ const unsigned int *reg_values;
++ u16 m_min;
++ u16 m_max;
+ };
+
+ struct samsung_dsim_host_ops {
+diff --git a/include/drm/drm_fixed.h b/include/drm/drm_fixed.h
+index 255645c1f9a89..6ea339d5de088 100644
+--- a/include/drm/drm_fixed.h
++++ b/include/drm/drm_fixed.h
+@@ -71,6 +71,7 @@ static inline u32 dfixed_div(fixed20_12 A, fixed20_12 B)
+ }
+
+ #define DRM_FIXED_POINT 32
++#define DRM_FIXED_POINT_HALF 16
+ #define DRM_FIXED_ONE (1ULL << DRM_FIXED_POINT)
+ #define DRM_FIXED_DECIMAL_MASK (DRM_FIXED_ONE - 1)
+ #define DRM_FIXED_DIGITS_MASK (~DRM_FIXED_DECIMAL_MASK)
+@@ -87,6 +88,11 @@ static inline int drm_fixp2int(s64 a)
+ return ((s64)a) >> DRM_FIXED_POINT;
+ }
+
++static inline int drm_fixp2int_round(s64 a)
++{
++ return drm_fixp2int(a + (1 << (DRM_FIXED_POINT_HALF - 1)));
++}
++
+ static inline int drm_fixp2int_ceil(s64 a)
+ {
+ if (a > 0)
+diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
+index 7d6d73b781472..03644237e1efb 100644
+--- a/include/linux/bitmap.h
++++ b/include/linux/bitmap.h
+@@ -302,12 +302,10 @@ void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap,
+ #endif
+
+ /*
+- * On 64-bit systems bitmaps are represented as u64 arrays internally. On LE32
+- * machines the order of hi and lo parts of numbers match the bitmap structure.
+- * In both cases conversion is not needed when copying data from/to arrays of
+- * u64.
++ * On 64-bit systems bitmaps are represented as u64 arrays internally. So,
++ * the conversion is not needed when copying data from/to arrays of u64.
+ */
+-#if (BITS_PER_LONG == 32) && defined(__BIG_ENDIAN)
++#if BITS_PER_LONG == 32
+ void bitmap_from_arr64(unsigned long *bitmap, const u64 *buf, unsigned int nbits);
+ void bitmap_to_arr64(u64 *buf, const unsigned long *bitmap, unsigned int nbits);
+ #else
+diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
+index 06caacd77ed66..710d122472641 100644
+--- a/include/linux/blk-mq.h
++++ b/include/linux/blk-mq.h
+@@ -746,8 +746,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
+ struct blk_mq_tags {
+ unsigned int nr_tags;
+ unsigned int nr_reserved_tags;
+-
+- atomic_t active_queues;
++ unsigned int active_queues;
+
+ struct sbitmap_queue bitmap_tags;
+ struct sbitmap_queue breserved_tags;
+diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
+index c0ffe203a6022..67e942d776bd8 100644
+--- a/include/linux/blkdev.h
++++ b/include/linux/blkdev.h
+@@ -392,6 +392,7 @@ struct request_queue {
+
+ struct blk_queue_stats *stats;
+ struct rq_qos *rq_qos;
++ struct mutex rq_qos_mutex;
+
+ const struct blk_mq_ops *mq_ops;
+
+@@ -1282,7 +1283,7 @@ static inline unsigned int bdev_zone_no(struct block_device *bdev, sector_t sec)
+ }
+
+ static inline bool bdev_op_is_zoned_write(struct block_device *bdev,
+- blk_opf_t op)
++ enum req_op op)
+ {
+ if (!bdev_is_zoned(bdev))
+ return false;
+diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h
+index cfbda114348c9..122c62e561fc7 100644
+--- a/include/linux/blktrace_api.h
++++ b/include/linux/blktrace_api.h
+@@ -85,10 +85,14 @@ extern int blk_trace_remove(struct request_queue *q);
+ # define blk_add_driver_data(rq, data, len) do {} while (0)
+ # define blk_trace_setup(q, name, dev, bdev, arg) (-ENOTTY)
+ # define blk_trace_startstop(q, start) (-ENOTTY)
+-# define blk_trace_remove(q) (-ENOTTY)
+ # define blk_add_trace_msg(q, fmt, ...) do { } while (0)
+ # define blk_add_cgroup_trace_msg(q, cg, fmt, ...) do { } while (0)
+ # define blk_trace_note_message_enabled(q) (false)
++
++static inline int blk_trace_remove(struct request_queue *q)
++{
++ return -ENOTTY;
++}
+ #endif /* CONFIG_BLK_DEV_IO_TRACE */
+
+ #ifdef CONFIG_COMPAT
+diff --git a/include/linux/bpf.h b/include/linux/bpf.h
+index e53ceee1df370..1ad211acf1d25 100644
+--- a/include/linux/bpf.h
++++ b/include/linux/bpf.h
+@@ -1125,7 +1125,6 @@ struct bpf_trampoline {
+ int progs_cnt[BPF_TRAMP_MAX];
+ /* Executable image of trampoline */
+ struct bpf_tramp_image *cur_image;
+- u64 selector;
+ struct module *mod;
+ };
+
+diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
+index 3dd29a53b7112..f70f9ac884d24 100644
+--- a/include/linux/bpf_verifier.h
++++ b/include/linux/bpf_verifier.h
+@@ -18,8 +18,11 @@
+ * that converting umax_value to int cannot overflow.
+ */
+ #define BPF_MAX_VAR_SIZ (1 << 29)
+-/* size of type_str_buf in bpf_verifier. */
+-#define TYPE_STR_BUF_LEN 128
++/* size of tmp_str_buf in bpf_verifier.
++ * we need at least 306 bytes to fit full stack mask representation
++ * (in the "-8,-16,...,-512" form)
++ */
++#define TMP_STR_BUF_LEN 320
+
+ /* Liveness marks, used for registers and spilled-regs (in stack slots).
+ * Read marks propagate upwards until they find a write mark; they record that
+@@ -238,6 +241,10 @@ enum bpf_stack_slot_type {
+
+ #define BPF_REG_SIZE 8 /* size of eBPF register in bytes */
+
++#define BPF_REGMASK_ARGS ((1 << BPF_REG_1) | (1 << BPF_REG_2) | \
++ (1 << BPF_REG_3) | (1 << BPF_REG_4) | \
++ (1 << BPF_REG_5))
++
+ #define BPF_DYNPTR_SIZE sizeof(struct bpf_dynptr_kern)
+ #define BPF_DYNPTR_NR_SLOTS (BPF_DYNPTR_SIZE / BPF_REG_SIZE)
+
+@@ -306,11 +313,6 @@ struct bpf_idx_pair {
+ u32 idx;
+ };
+
+-struct bpf_id_pair {
+- u32 old;
+- u32 cur;
+-};
+-
+ #define MAX_CALL_FRAMES 8
+ /* Maximum number of register states that can exist at once */
+ #define BPF_ID_MAP_SIZE ((MAX_BPF_REG + MAX_BPF_STACK / BPF_REG_SIZE) * MAX_CALL_FRAMES)
+@@ -541,6 +543,30 @@ struct bpf_subprog_info {
+ bool is_async_cb;
+ };
+
++struct bpf_verifier_env;
++
++struct backtrack_state {
++ struct bpf_verifier_env *env;
++ u32 frame;
++ u32 reg_masks[MAX_CALL_FRAMES];
++ u64 stack_masks[MAX_CALL_FRAMES];
++};
++
++struct bpf_id_pair {
++ u32 old;
++ u32 cur;
++};
++
++struct bpf_idmap {
++ u32 tmp_id_gen;
++ struct bpf_id_pair map[BPF_ID_MAP_SIZE];
++};
++
++struct bpf_idset {
++ u32 count;
++ u32 ids[BPF_ID_MAP_SIZE];
++};
++
+ /* single container for all structs
+ * one verifier_env per bpf_check() call
+ */
+@@ -572,12 +598,16 @@ struct bpf_verifier_env {
+ const struct bpf_line_info *prev_linfo;
+ struct bpf_verifier_log log;
+ struct bpf_subprog_info subprog_info[BPF_MAX_SUBPROGS + 1];
+- struct bpf_id_pair idmap_scratch[BPF_ID_MAP_SIZE];
++ union {
++ struct bpf_idmap idmap_scratch;
++ struct bpf_idset idset_scratch;
++ };
+ struct {
+ int *insn_state;
+ int *insn_stack;
+ int cur_stack;
+ } cfg;
++ struct backtrack_state bt;
+ u32 pass_cnt; /* number of times do_check() was called */
+ u32 subprog_cnt;
+ /* number of instructions analyzed by the verifier */
+@@ -606,8 +636,10 @@ struct bpf_verifier_env {
+ /* Same as scratched_regs but for stack slots */
+ u64 scratched_stack_slots;
+ u64 prev_log_pos, prev_insn_print_pos;
+- /* buffer used in reg_type_str() to generate reg_type string */
+- char type_str_buf[TYPE_STR_BUF_LEN];
++ /* buffer used to generate temporary string representations,
++ * e.g., in reg_type_str() to generate reg_type string
++ */
++ char tmp_str_buf[TMP_STR_BUF_LEN];
+ };
+
+ __printf(2, 0) void bpf_verifier_vlog(struct bpf_verifier_log *log,
+diff --git a/include/linux/can/length.h b/include/linux/can/length.h
+index 6995092b774ec..ef1fd32cef16b 100644
+--- a/include/linux/can/length.h
++++ b/include/linux/can/length.h
+@@ -69,17 +69,18 @@
+ * Error Status Indicator (ESI) 1
+ * Data length code (DLC) 4
+ * Data field 0...512
+- * Stuff Bit Count (SBC) 0...16: 4 20...64:5
++ * Stuff Bit Count (SBC) 4
+ * CRC 0...16: 17 20...64:21
+ * CRC delimiter (CD) 1
++ * Fixed Stuff bits (FSB) 0...16: 6 20...64:7
+ * ACK slot (AS) 1
+ * ACK delimiter (AD) 1
+ * End-of-frame (EOF) 7
+ * Inter frame spacing 3
+ *
+- * assuming CRC21, rounded up and ignoring bitstuffing
++ * assuming CRC21, rounded up and ignoring dynamic bitstuffing
+ */
+-#define CANFD_FRAME_OVERHEAD_SFF DIV_ROUND_UP(61, 8)
++#define CANFD_FRAME_OVERHEAD_SFF DIV_ROUND_UP(67, 8)
+
+ /*
+ * Size of a CAN-FD Extended Frame
+@@ -98,17 +99,18 @@
+ * Error Status Indicator (ESI) 1
+ * Data length code (DLC) 4
+ * Data field 0...512
+- * Stuff Bit Count (SBC) 0...16: 4 20...64:5
++ * Stuff Bit Count (SBC) 4
+ * CRC 0...16: 17 20...64:21
+ * CRC delimiter (CD) 1
++ * Fixed Stuff bits (FSB) 0...16: 6 20...64:7
+ * ACK slot (AS) 1
+ * ACK delimiter (AD) 1
+ * End-of-frame (EOF) 7
+ * Inter frame spacing 3
+ *
+- * assuming CRC21, rounded up and ignoring bitstuffing
++ * assuming CRC21, rounded up and ignoring dynamic bitstuffing
+ */
+-#define CANFD_FRAME_OVERHEAD_EFF DIV_ROUND_UP(80, 8)
++#define CANFD_FRAME_OVERHEAD_EFF DIV_ROUND_UP(86, 8)
+
+ /*
+ * Maximum size of a Classical CAN frame
+diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
+index e659cb6fded39..84864767a56ae 100644
+--- a/include/linux/compiler_attributes.h
++++ b/include/linux/compiler_attributes.h
+@@ -255,6 +255,18 @@
+ */
+ #define __noreturn __attribute__((__noreturn__))
+
++/*
++ * Optional: only supported since GCC >= 11.1, clang >= 7.0.
++ *
++ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-no_005fstack_005fprotector-function-attribute
++ * clang: https://clang.llvm.org/docs/AttributeReference.html#no-stack-protector-safebuffers
++ */
++#if __has_attribute(__no_stack_protector__)
++# define __no_stack_protector __attribute__((__no_stack_protector__))
++#else
++# define __no_stack_protector
++#endif
++
+ /*
+ * Optional: not supported by gcc.
+ *
+diff --git a/include/linux/dsa/sja1105.h b/include/linux/dsa/sja1105.h
+index 159e43171cccf..c177322f793d6 100644
+--- a/include/linux/dsa/sja1105.h
++++ b/include/linux/dsa/sja1105.h
+@@ -48,13 +48,9 @@ struct sja1105_deferred_xmit_work {
+
+ /* Global tagger data */
+ struct sja1105_tagger_data {
+- /* Tagger to switch */
+ void (*xmit_work_fn)(struct kthread_work *work);
+ void (*meta_tstamp_handler)(struct dsa_switch *ds, int port, u8 ts_id,
+ enum sja1110_meta_tstamp dir, u64 tstamp);
+- /* Switch to tagger */
+- bool (*rxtstamp_get_state)(struct dsa_switch *ds);
+- void (*rxtstamp_set_state)(struct dsa_switch *ds, bool on);
+ };
+
+ struct sja1105_skb_cb {
+diff --git a/include/linux/ieee80211.h b/include/linux/ieee80211.h
+index c4cf296e7eafe..4cda32ac3116a 100644
+--- a/include/linux/ieee80211.h
++++ b/include/linux/ieee80211.h
+@@ -2856,6 +2856,7 @@ ieee80211_he_spr_size(const u8 *he_spr_ie)
+
+ /* Maximum number of supported EHT LTF is split */
+ #define IEEE80211_EHT_PHY_CAP5_MAX_NUM_SUPP_EHT_LTF_MASK 0xc0
++#define IEEE80211_EHT_PHY_CAP5_SUPP_EXTRA_EHT_LTF 0x40
+ #define IEEE80211_EHT_PHY_CAP6_MAX_NUM_SUPP_EHT_LTF_MASK 0x07
+
+ #define IEEE80211_EHT_PHY_CAP6_MCS15_SUPP_MASK 0x78
+@@ -4611,15 +4612,12 @@ static inline u8 ieee80211_mle_common_size(const u8 *data)
+ case IEEE80211_ML_CONTROL_TYPE_BASIC:
+ case IEEE80211_ML_CONTROL_TYPE_PREQ:
+ case IEEE80211_ML_CONTROL_TYPE_TDLS:
++ case IEEE80211_ML_CONTROL_TYPE_RECONF:
+ /*
+ * The length is the first octet pointed by mle->variable so no
+ * need to add anything
+ */
+ break;
+- case IEEE80211_ML_CONTROL_TYPE_RECONF:
+- if (control & IEEE80211_MLC_RECONF_PRES_MLD_MAC_ADDR)
+- common += ETH_ALEN;
+- return common;
+ case IEEE80211_ML_CONTROL_TYPE_PRIO_ACCESS:
+ if (control & IEEE80211_MLC_PRIO_ACCESS_PRES_AP_MLD_MAC_ADDR)
+ common += ETH_ALEN;
+diff --git a/include/linux/mfd/tps65010.h b/include/linux/mfd/tps65010.h
+index a1fb9bc5311de..5edf1aef11185 100644
+--- a/include/linux/mfd/tps65010.h
++++ b/include/linux/mfd/tps65010.h
+@@ -28,6 +28,8 @@
+ #ifndef __LINUX_I2C_TPS65010_H
+ #define __LINUX_I2C_TPS65010_H
+
++struct gpio_chip;
++
+ /*
+ * ----------------------------------------------------------------------------
+ * Registers, all 8 bits
+@@ -176,12 +178,10 @@ struct i2c_client;
+
+ /**
+ * struct tps65010_board - packages GPIO and LED lines
+- * @base: the GPIO number to assign to GPIO-1
+ * @outmask: bit (N-1) is set to allow GPIO-N to be used as an
+ * (open drain) output
+ * @setup: optional callback issued once the GPIOs are valid
+ * @teardown: optional callback issued before the GPIOs are invalidated
+- * @context: optional parameter passed to setup() and teardown()
+ *
+ * Board data may be used to package the GPIO (and LED) lines for use
+ * in by the generic GPIO and LED frameworks. The first four GPIOs
+@@ -193,12 +193,9 @@ struct i2c_client;
+ * devices in their initial states using these GPIOs.
+ */
+ struct tps65010_board {
+- int base;
+ unsigned outmask;
+-
+- int (*setup)(struct i2c_client *client, void *context);
+- int (*teardown)(struct i2c_client *client, void *context);
+- void *context;
++ int (*setup)(struct i2c_client *client, struct gpio_chip *gc);
++ void (*teardown)(struct i2c_client *client, struct gpio_chip *gc);
+ };
+
+ #endif /* __LINUX_I2C_TPS65010_H */
+diff --git a/include/linux/mfd/twl.h b/include/linux/mfd/twl.h
+index 6e3d99b7a0ee6..c062d91a67d92 100644
+--- a/include/linux/mfd/twl.h
++++ b/include/linux/mfd/twl.h
+@@ -593,9 +593,6 @@ struct twl4030_gpio_platform_data {
+ */
+ u32 pullups;
+ u32 pulldowns;
+-
+- int (*setup)(struct device *dev,
+- unsigned gpio, unsigned ngpio);
+ };
+
+ struct twl4030_madc_platform_data {
+diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
+index 306a3d1a0fa65..de10fc797c8e9 100644
+--- a/include/linux/mm_types.h
++++ b/include/linux/mm_types.h
+@@ -583,6 +583,21 @@ struct mm_cid {
+ struct kioctx_table;
+ struct mm_struct {
+ struct {
++ /*
++ * Fields which are often written to are placed in a separate
++ * cache line.
++ */
++ struct {
++ /**
++ * @mm_count: The number of references to &struct
++ * mm_struct (@mm_users count as 1).
++ *
++ * Use mmgrab()/mmdrop() to modify. When this drops to
++ * 0, the &struct mm_struct is freed.
++ */
++ atomic_t mm_count;
++ } ____cacheline_aligned_in_smp;
++
+ struct maple_tree mm_mt;
+ #ifdef CONFIG_MMU
+ unsigned long (*get_unmapped_area) (struct file *filp,
+@@ -620,14 +635,6 @@ struct mm_struct {
+ */
+ atomic_t mm_users;
+
+- /**
+- * @mm_count: The number of references to &struct mm_struct
+- * (@mm_users count as 1).
+- *
+- * Use mmgrab()/mmdrop() to modify. When this drops to 0, the
+- * &struct mm_struct is freed.
+- */
+- atomic_t mm_count;
+ #ifdef CONFIG_SCHED_MM_CID
+ /**
+ * @pcpu_cid: Per-cpu current cid.
+diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
+index c726ea7812552..daa2f40d9ce65 100644
+--- a/include/linux/mmc/card.h
++++ b/include/linux/mmc/card.h
+@@ -294,6 +294,7 @@ struct mmc_card {
+ #define MMC_QUIRK_TRIM_BROKEN (1<<12) /* Skip trim */
+ #define MMC_QUIRK_BROKEN_HPI (1<<13) /* Disable broken HPI support */
+ #define MMC_QUIRK_BROKEN_SD_DISCARD (1<<14) /* Disable broken SD discard support */
++#define MMC_QUIRK_BROKEN_SD_CACHE (1<<15) /* Disable broken SD cache support */
+
+ bool reenable_cmdq; /* Re-enable Command Queue */
+
+diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
+index c2f0c6002a84b..68adc8af29efb 100644
+--- a/include/linux/netdevice.h
++++ b/include/linux/netdevice.h
+@@ -5093,6 +5093,15 @@ static inline bool netif_is_l3_slave(const struct net_device *dev)
+ return dev->priv_flags & IFF_L3MDEV_SLAVE;
+ }
+
++static inline int dev_sdif(const struct net_device *dev)
++{
++#ifdef CONFIG_NET_L3_MASTER_DEV
++ if (netif_is_l3_slave(dev))
++ return dev->ifindex;
++#endif
++ return 0;
++}
++
+ static inline bool netif_is_bridge_master(const struct net_device *dev)
+ {
+ return dev->priv_flags & IFF_EBRIDGE;
+diff --git a/include/linux/nmi.h b/include/linux/nmi.h
+index 048c0b9aa623d..d54b9ba9c8247 100644
+--- a/include/linux/nmi.h
++++ b/include/linux/nmi.h
+@@ -13,13 +13,11 @@
+
+ #ifdef CONFIG_LOCKUP_DETECTOR
+ void lockup_detector_init(void);
++void lockup_detector_retry_init(void);
+ void lockup_detector_soft_poweroff(void);
+ void lockup_detector_cleanup(void);
+-bool is_hardlockup(void);
+
+ extern int watchdog_user_enabled;
+-extern int nmi_watchdog_user_enabled;
+-extern int soft_watchdog_user_enabled;
+ extern int watchdog_thresh;
+ extern unsigned long watchdog_enabled;
+
+@@ -35,6 +33,7 @@ extern int sysctl_hardlockup_all_cpu_backtrace;
+
+ #else /* CONFIG_LOCKUP_DETECTOR */
+ static inline void lockup_detector_init(void) { }
++static inline void lockup_detector_retry_init(void) { }
+ static inline void lockup_detector_soft_poweroff(void) { }
+ static inline void lockup_detector_cleanup(void) { }
+ #endif /* !CONFIG_LOCKUP_DETECTOR */
+@@ -69,17 +68,17 @@ static inline void reset_hung_task_detector(void) { }
+ * 'watchdog_enabled' variable. Each lockup detector has its dedicated bit -
+ * bit 0 for the hard lockup detector and bit 1 for the soft lockup detector.
+ *
+- * 'watchdog_user_enabled', 'nmi_watchdog_user_enabled' and
+- * 'soft_watchdog_user_enabled' are variables that are only used as an
++ * 'watchdog_user_enabled', 'watchdog_hardlockup_user_enabled' and
++ * 'watchdog_softlockup_user_enabled' are variables that are only used as an
+ * 'interface' between the parameters in /proc/sys/kernel and the internal
+ * state bits in 'watchdog_enabled'. The 'watchdog_thresh' variable is
+ * handled differently because its value is not boolean, and the lockup
+ * detectors are 'suspended' while 'watchdog_thresh' is equal zero.
+ */
+-#define NMI_WATCHDOG_ENABLED_BIT 0
+-#define SOFT_WATCHDOG_ENABLED_BIT 1
+-#define NMI_WATCHDOG_ENABLED (1 << NMI_WATCHDOG_ENABLED_BIT)
+-#define SOFT_WATCHDOG_ENABLED (1 << SOFT_WATCHDOG_ENABLED_BIT)
++#define WATCHDOG_HARDLOCKUP_ENABLED_BIT 0
++#define WATCHDOG_SOFTOCKUP_ENABLED_BIT 1
++#define WATCHDOG_HARDLOCKUP_ENABLED (1 << WATCHDOG_HARDLOCKUP_ENABLED_BIT)
++#define WATCHDOG_SOFTOCKUP_ENABLED (1 << WATCHDOG_SOFTOCKUP_ENABLED_BIT)
+
+ #if defined(CONFIG_HARDLOCKUP_DETECTOR)
+ extern void hardlockup_detector_disable(void);
+@@ -88,10 +87,8 @@ extern unsigned int hardlockup_panic;
+ static inline void hardlockup_detector_disable(void) {}
+ #endif
+
+-#if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
+-# define NMI_WATCHDOG_SYSCTL_PERM 0644
+-#else
+-# define NMI_WATCHDOG_SYSCTL_PERM 0444
++#if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF)
++void watchdog_hardlockup_check(struct pt_regs *regs);
+ #endif
+
+ #if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF)
+@@ -116,11 +113,11 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
+ # endif
+ #endif
+
+-void watchdog_nmi_stop(void);
+-void watchdog_nmi_start(void);
+-int watchdog_nmi_probe(void);
+-int watchdog_nmi_enable(unsigned int cpu);
+-void watchdog_nmi_disable(unsigned int cpu);
++void watchdog_hardlockup_stop(void);
++void watchdog_hardlockup_start(void);
++int watchdog_hardlockup_probe(void);
++void watchdog_hardlockup_enable(unsigned int cpu);
++void watchdog_hardlockup_disable(unsigned int cpu);
+
+ void lockup_detector_reconfigure(void);
+
+@@ -197,7 +194,7 @@ u64 hw_nmi_get_sample_period(int watchdog_thresh);
+ #endif
+
+ #if defined(CONFIG_HARDLOCKUP_CHECK_TIMESTAMP) && \
+- defined(CONFIG_HARDLOCKUP_DETECTOR)
++ defined(CONFIG_HARDLOCKUP_DETECTOR_PERF)
+ void watchdog_update_hrtimer_threshold(u64 period);
+ #else
+ static inline void watchdog_update_hrtimer_threshold(u64 period) { }
+diff --git a/include/linux/pci.h b/include/linux/pci.h
+index 60b8772b5bd45..c69a2cc1f4123 100644
+--- a/include/linux/pci.h
++++ b/include/linux/pci.h
+@@ -1903,6 +1903,7 @@ static inline int pci_dev_present(const struct pci_device_id *ids)
+ #define pci_dev_put(dev) do { } while (0)
+
+ static inline void pci_set_master(struct pci_dev *dev) { }
++static inline void pci_clear_master(struct pci_dev *dev) { }
+ static inline int pci_enable_device(struct pci_dev *dev) { return -EIO; }
+ static inline void pci_disable_device(struct pci_dev *dev) { }
+ static inline int pcim_enable_device(struct pci_dev *pdev) { return -EIO; }
+diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
+index 525b5d64e3948..c0e4baf940dce 100644
+--- a/include/linux/perf/arm_pmu.h
++++ b/include/linux/perf/arm_pmu.h
+@@ -26,9 +26,11 @@
+ */
+ #define ARMPMU_EVT_64BIT 0x00001 /* Event uses a 64bit counter */
+ #define ARMPMU_EVT_47BIT 0x00002 /* Event uses a 47bit counter */
++#define ARMPMU_EVT_63BIT 0x00004 /* Event uses a 63bit counter */
+
+ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_64BIT) == ARMPMU_EVT_64BIT);
+ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_47BIT) == ARMPMU_EVT_47BIT);
++static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_63BIT) == ARMPMU_EVT_63BIT);
+
+ #define HW_OP_UNSUPPORTED 0xFFFF
+ #define C(_x) PERF_COUNT_HW_CACHE_##_x
+diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
+index d2c3f16cf6b18..02e0086b10f6f 100644
+--- a/include/linux/pipe_fs_i.h
++++ b/include/linux/pipe_fs_i.h
+@@ -261,18 +261,14 @@ void generic_pipe_buf_release(struct pipe_inode_info *, struct pipe_buffer *);
+
+ extern const struct pipe_buf_operations nosteal_pipe_buf_ops;
+
+-#ifdef CONFIG_WATCH_QUEUE
+ unsigned long account_pipe_buffers(struct user_struct *user,
+ unsigned long old, unsigned long new);
+ bool too_many_pipe_buffers_soft(unsigned long user_bufs);
+ bool too_many_pipe_buffers_hard(unsigned long user_bufs);
+ bool pipe_is_unprivileged_user(void);
+-#endif
+
+ /* for F_SETPIPE_SZ and F_GETPIPE_SZ */
+-#ifdef CONFIG_WATCH_QUEUE
+ int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots);
+-#endif
+ long pipe_fcntl(struct file *, unsigned int, unsigned long arg);
+ struct pipe_inode_info *get_pipe_info(struct file *file, bool for_splice);
+
+diff --git a/include/linux/platform_data/lcd-mipid.h b/include/linux/platform_data/lcd-mipid.h
+index 63f05eb238274..4927cfc5158c6 100644
+--- a/include/linux/platform_data/lcd-mipid.h
++++ b/include/linux/platform_data/lcd-mipid.h
+@@ -15,10 +15,8 @@ enum mipid_test_result {
+ #ifdef __KERNEL__
+
+ struct mipid_platform_data {
+- int nreset_gpio;
+ int data_lines;
+
+- void (*shutdown)(struct mipid_platform_data *pdata);
+ void (*set_bklight_level)(struct mipid_platform_data *pdata,
+ int level);
+ int (*get_bklight_level)(struct mipid_platform_data *pdata);
+diff --git a/include/linux/platform_data/mmc-omap.h b/include/linux/platform_data/mmc-omap.h
+index 91051e9907f34..054d0c3c5ec58 100644
+--- a/include/linux/platform_data/mmc-omap.h
++++ b/include/linux/platform_data/mmc-omap.h
+@@ -20,8 +20,6 @@ struct omap_mmc_platform_data {
+ * maximum frequency on the MMC bus */
+ unsigned int max_freq;
+
+- /* switch the bus to a new slot */
+- int (*switch_slot)(struct device *dev, int slot);
+ /* initialize board-specific MMC functionality, can be NULL if
+ * not supported */
+ int (*init)(struct device *dev);
+diff --git a/include/linux/ramfs.h b/include/linux/ramfs.h
+index 917528d102c4e..d506dc63dd47c 100644
+--- a/include/linux/ramfs.h
++++ b/include/linux/ramfs.h
+@@ -7,6 +7,7 @@
+ struct inode *ramfs_get_inode(struct super_block *sb, const struct inode *dir,
+ umode_t mode, dev_t dev);
+ extern int ramfs_init_fs_context(struct fs_context *fc);
++extern void ramfs_kill_sb(struct super_block *sb);
+
+ #ifdef CONFIG_MMU
+ static inline int
+diff --git a/include/linux/sh_intc.h b/include/linux/sh_intc.h
+index 37ad81058d6ae..27ae79191bdc3 100644
+--- a/include/linux/sh_intc.h
++++ b/include/linux/sh_intc.h
+@@ -13,9 +13,9 @@
+ /*
+ * Convert back and forth between INTEVT and IRQ values.
+ */
+-#ifdef CONFIG_CPU_HAS_INTEVT
+-#define evt2irq(evt) (((evt) >> 5) - 16)
+-#define irq2evt(irq) (((irq) + 16) << 5)
++#ifdef CONFIG_CPU_HAS_INTEVT /* Avoid IRQ0 (invalid for platform devices) */
++#define evt2irq(evt) ((evt) >> 5)
++#define irq2evt(irq) ((irq) << 5)
+ #else
+ #define evt2irq(evt) (evt)
+ #define irq2evt(irq) (irq)
+diff --git a/include/linux/soc/qcom/geni-se.h b/include/linux/soc/qcom/geni-se.h
+index c55a0bc8cb0e9..821a19135bb66 100644
+--- a/include/linux/soc/qcom/geni-se.h
++++ b/include/linux/soc/qcom/geni-se.h
+@@ -490,9 +490,13 @@ int geni_se_clk_freq_match(struct geni_se *se, unsigned long req_freq,
+ unsigned int *index, unsigned long *res_freq,
+ bool exact);
+
++void geni_se_tx_init_dma(struct geni_se *se, dma_addr_t iova, size_t len);
++
+ int geni_se_tx_dma_prep(struct geni_se *se, void *buf, size_t len,
+ dma_addr_t *iova);
+
++void geni_se_rx_init_dma(struct geni_se *se, dma_addr_t iova, size_t len);
++
+ int geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len,
+ dma_addr_t *iova);
+
+diff --git a/include/linux/spi/ads7846.h b/include/linux/spi/ads7846.h
+index d424c1aadf382..a04c1c34c3443 100644
+--- a/include/linux/spi/ads7846.h
++++ b/include/linux/spi/ads7846.h
+@@ -35,8 +35,6 @@ struct ads7846_platform_data {
+ u16 debounce_tol; /* tolerance used for filtering */
+ u16 debounce_rep; /* additional consecutive good readings
+ * required after the first two */
+- int gpio_pendown; /* the GPIO used to decide the pendown
+- * state if get_pendown_state == NULL */
+ int gpio_pendown_debounce; /* platform specific debounce time for
+ * the gpio_pendown */
+ int (*get_pendown_state)(void);
+diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
+index 0c7eff91adf4e..4e9623e8492b3 100644
+--- a/include/linux/usb/hcd.h
++++ b/include/linux/usb/hcd.h
+@@ -267,7 +267,7 @@ struct hc_driver {
+ int (*pci_suspend)(struct usb_hcd *hcd, bool do_wakeup);
+
+ /* called after entering D0 (etc), before resuming the hub */
+- int (*pci_resume)(struct usb_hcd *hcd, bool hibernated);
++ int (*pci_resume)(struct usb_hcd *hcd, pm_message_t state);
+
+ /* called just before hibernate final D3 state, allows host to poweroff parts */
+ int (*pci_poweroff_late)(struct usb_hcd *hcd, bool do_wakeup);
+diff --git a/include/linux/usb/musb.h b/include/linux/usb/musb.h
+index e4a3ad3c800f5..3963e55e88a31 100644
+--- a/include/linux/usb/musb.h
++++ b/include/linux/usb/musb.h
+@@ -99,9 +99,6 @@ struct musb_hdrc_platform_data {
+ /* (HOST or OTG) program PHY for external Vbus */
+ unsigned extvbus:1;
+
+- /* Power the device on or off */
+- int (*set_power)(int state);
+-
+ /* MUSB configuration-specific details */
+ const struct musb_hdrc_config *config;
+
+@@ -135,14 +132,4 @@ static inline int musb_mailbox(enum musb_vbus_id_status status)
+ #define TUSB6010_REFCLK_24 41667 /* psec/clk @ 24.0 MHz XI */
+ #define TUSB6010_REFCLK_19 52083 /* psec/clk @ 19.2 MHz CLKIN */
+
+-#ifdef CONFIG_ARCH_OMAP2
+-
+-extern int __init tusb6010_setup_interface(
+- struct musb_hdrc_platform_data *data,
+- unsigned ps_refclk, unsigned waitpin,
+- unsigned async_cs, unsigned sync_cs,
+- unsigned irq, unsigned dmachan);
+-
+-#endif /* OMAP2 */
+-
+ #endif /* __LINUX_USB_MUSB_H */
+diff --git a/include/linux/watch_queue.h b/include/linux/watch_queue.h
+index fc6bba20273bd..45cd42f55d492 100644
+--- a/include/linux/watch_queue.h
++++ b/include/linux/watch_queue.h
+@@ -38,7 +38,7 @@ struct watch_filter {
+ struct watch_queue {
+ struct rcu_head rcu;
+ struct watch_filter __rcu *filter;
+- struct pipe_inode_info *pipe; /* The pipe we're using as a buffer */
++ struct pipe_inode_info *pipe; /* Pipe we use as a buffer, NULL if queue closed */
+ struct hlist_head watches; /* Contributory watches */
+ struct page **notes; /* Preallocated notifications */
+ unsigned long *notes_bitmap; /* Allocation bitmap for notes */
+@@ -46,7 +46,6 @@ struct watch_queue {
+ spinlock_t lock;
+ unsigned int nr_notes; /* Number of notes */
+ unsigned int nr_pages; /* Number of pages in notes[] */
+- bool defunct; /* T when queues closed */
+ };
+
+ /*
+diff --git a/include/net/bluetooth/mgmt.h b/include/net/bluetooth/mgmt.h
+index a5801649f6196..5e68b3dd44222 100644
+--- a/include/net/bluetooth/mgmt.h
++++ b/include/net/bluetooth/mgmt.h
+@@ -979,6 +979,7 @@ struct mgmt_ev_auth_failed {
+ #define MGMT_DEV_FOUND_NOT_CONNECTABLE BIT(2)
+ #define MGMT_DEV_FOUND_INITIATED_CONN BIT(3)
+ #define MGMT_DEV_FOUND_NAME_REQUEST_FAILED BIT(4)
++#define MGMT_DEV_FOUND_SCAN_RSP BIT(5)
+
+ #define MGMT_EV_DEVICE_FOUND 0x0012
+ struct mgmt_ev_device_found {
+diff --git a/include/net/dsa.h b/include/net/dsa.h
+index ab0f0a5b08602..197c5a6ca8f7f 100644
+--- a/include/net/dsa.h
++++ b/include/net/dsa.h
+@@ -314,9 +314,17 @@ struct dsa_port {
+ struct list_head fdbs;
+ struct list_head mdbs;
+
+- /* List of VLANs that CPU and DSA ports are members of. */
+ struct mutex vlans_lock;
+- struct list_head vlans;
++ union {
++ /* List of VLANs that CPU and DSA ports are members of.
++ * Access to this is serialized by the sleepable @vlans_lock.
++ */
++ struct list_head vlans;
++ /* List of VLANs that user ports are members of.
++ * Access to this is serialized by netif_addr_lock_bh().
++ */
++ struct list_head user_vlans;
++ };
+ };
+
+ /* TODO: ideally DSA ports would have a single dp->link_dp member,
+diff --git a/include/net/mac80211.h b/include/net/mac80211.h
+index ac0370e768749..65510cfda37af 100644
+--- a/include/net/mac80211.h
++++ b/include/net/mac80211.h
+@@ -7,7 +7,7 @@
+ * Copyright 2007-2010 Johannes Berg <johannes@sipsolutions.net>
+ * Copyright 2013-2014 Intel Mobile Communications GmbH
+ * Copyright (C) 2015 - 2017 Intel Deutschland GmbH
+- * Copyright (C) 2018 - 2022 Intel Corporation
++ * Copyright (C) 2018 - 2023 Intel Corporation
+ */
+
+ #ifndef MAC80211_H
+@@ -6861,6 +6861,48 @@ ieee80211_vif_type_p2p(struct ieee80211_vif *vif)
+ return ieee80211_iftype_p2p(vif->type, vif->p2p);
+ }
+
++/**
++ * ieee80211_get_he_iftype_cap_vif - return HE capabilities for sband/vif
++ * @sband: the sband to search for the iftype on
++ * @vif: the vif to get the iftype from
++ *
++ * Return: pointer to the struct ieee80211_sta_he_cap, or %NULL is none found
++ */
++static inline const struct ieee80211_sta_he_cap *
++ieee80211_get_he_iftype_cap_vif(const struct ieee80211_supported_band *sband,
++ struct ieee80211_vif *vif)
++{
++ return ieee80211_get_he_iftype_cap(sband, ieee80211_vif_type_p2p(vif));
++}
++
++/**
++ * ieee80211_get_he_6ghz_capa_vif - return HE 6 GHz capabilities
++ * @sband: the sband to search for the STA on
++ * @vif: the vif to get the iftype from
++ *
++ * Return: the 6GHz capabilities
++ */
++static inline __le16
++ieee80211_get_he_6ghz_capa_vif(const struct ieee80211_supported_band *sband,
++ struct ieee80211_vif *vif)
++{
++ return ieee80211_get_he_6ghz_capa(sband, ieee80211_vif_type_p2p(vif));
++}
++
++/**
++ * ieee80211_get_eht_iftype_cap_vif - return ETH capabilities for sband/vif
++ * @sband: the sband to search for the iftype on
++ * @vif: the vif to get the iftype from
++ *
++ * Return: pointer to the struct ieee80211_sta_eht_cap, or %NULL is none found
++ */
++static inline const struct ieee80211_sta_eht_cap *
++ieee80211_get_eht_iftype_cap_vif(const struct ieee80211_supported_band *sband,
++ struct ieee80211_vif *vif)
++{
++ return ieee80211_get_eht_iftype_cap(sband, ieee80211_vif_type_p2p(vif));
++}
++
+ /**
+ * ieee80211_update_mu_groups - set the VHT MU-MIMO groud data
+ *
+diff --git a/include/net/regulatory.h b/include/net/regulatory.h
+index 896191f420d50..b2cb4a9eb04dc 100644
+--- a/include/net/regulatory.h
++++ b/include/net/regulatory.h
+@@ -140,17 +140,6 @@ struct regulatory_request {
+ * otherwise initiating radiation is not allowed. This will enable the
+ * relaxations enabled under the CFG80211_REG_RELAX_NO_IR configuration
+ * option
+- * @REGULATORY_IGNORE_STALE_KICKOFF: the regulatory core will _not_ make sure
+- * all interfaces on this wiphy reside on allowed channels. If this flag
+- * is not set, upon a regdomain change, the interfaces are given a grace
+- * period (currently 60 seconds) to disconnect or move to an allowed
+- * channel. Interfaces on forbidden channels are forcibly disconnected.
+- * Currently these types of interfaces are supported for enforcement:
+- * NL80211_IFTYPE_ADHOC, NL80211_IFTYPE_STATION, NL80211_IFTYPE_AP,
+- * NL80211_IFTYPE_AP_VLAN, NL80211_IFTYPE_MONITOR,
+- * NL80211_IFTYPE_P2P_CLIENT, NL80211_IFTYPE_P2P_GO,
+- * NL80211_IFTYPE_P2P_DEVICE. The flag will be set by default if a device
+- * includes any modes unsupported for enforcement checking.
+ * @REGULATORY_WIPHY_SELF_MANAGED: for devices that employ wiphy-specific
+ * regdom management. These devices will ignore all regdom changes not
+ * originating from their own wiphy.
+@@ -177,7 +166,7 @@ enum ieee80211_regulatory_flags {
+ REGULATORY_COUNTRY_IE_FOLLOW_POWER = BIT(3),
+ REGULATORY_COUNTRY_IE_IGNORE = BIT(4),
+ REGULATORY_ENABLE_RELAX_NO_IR = BIT(5),
+- REGULATORY_IGNORE_STALE_KICKOFF = BIT(6),
++ /* reuse bit 6 next time */
+ REGULATORY_WIPHY_SELF_MANAGED = BIT(7),
+ };
+
+diff --git a/include/net/sock.h b/include/net/sock.h
+index 6f428a7f35675..ad468fe71413a 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -2100,6 +2100,7 @@ static inline void sock_graft(struct sock *sk, struct socket *parent)
+ }
+
+ kuid_t sock_i_uid(struct sock *sk);
++unsigned long __sock_i_ino(struct sock *sk);
+ unsigned long sock_i_ino(struct sock *sk);
+
+ static inline kuid_t sock_net_uid(const struct net *net, const struct sock *sk)
+diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
+index cb8fbb2418795..22aae505c813b 100644
+--- a/include/soc/mscc/ocelot.h
++++ b/include/soc/mscc/ocelot.h
+@@ -730,6 +730,11 @@ enum macaccess_entry_type {
+ ENTRYTYPE_MACv6,
+ };
+
++enum ocelot_proto {
++ OCELOT_PROTO_PTP_L2 = BIT(0),
++ OCELOT_PROTO_PTP_L4 = BIT(1),
++};
++
+ #define OCELOT_QUIRK_PCS_PERFORMS_RATE_ADAPTATION BIT(0)
+ #define OCELOT_QUIRK_QSGMII_PORTS_MUST_BE_UP BIT(1)
+
+@@ -775,6 +780,8 @@ struct ocelot_port {
+ unsigned int ptp_skbs_in_flight;
+ struct sk_buff_head tx_skbs;
+
++ unsigned int trap_proto;
++
+ u16 mrp_ring_id;
+
+ u8 ptp_cmd;
+@@ -868,12 +875,9 @@ struct ocelot {
+ u8 mm_supported:1;
+ struct ptp_clock *ptp_clock;
+ struct ptp_clock_info ptp_info;
+- struct hwtstamp_config hwtstamp_config;
+ unsigned int ptp_skbs_in_flight;
+ /* Protects the 2-step TX timestamp ID logic */
+ spinlock_t ts_id_lock;
+- /* Protects the PTP interface state */
+- struct mutex ptp_lock;
+ /* Protects the PTP clock */
+ spinlock_t ptp_clock_lock;
+ struct ptp_pin_desc ptp_pins[OCELOT_PTP_PINS_NUM];
+diff --git a/include/trace/events/net.h b/include/trace/events/net.h
+index da611a7aaf970..f667c76a3b022 100644
+--- a/include/trace/events/net.h
++++ b/include/trace/events/net.h
+@@ -51,7 +51,8 @@ TRACE_EVENT(net_dev_start_xmit,
+ __entry->network_offset = skb_network_offset(skb);
+ __entry->transport_offset_valid =
+ skb_transport_header_was_set(skb);
+- __entry->transport_offset = skb_transport_offset(skb);
++ __entry->transport_offset = skb_transport_header_was_set(skb) ?
++ skb_transport_offset(skb) : 0;
+ __entry->tx_flags = skb_shinfo(skb)->tx_flags;
+ __entry->gso_size = skb_shinfo(skb)->gso_size;
+ __entry->gso_segs = skb_shinfo(skb)->gso_segs;
+diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
+index 3e8619c72f774..b4bc2828fa09f 100644
+--- a/include/trace/events/timer.h
++++ b/include/trace/events/timer.h
+@@ -158,7 +158,11 @@ DEFINE_EVENT(timer_class, timer_cancel,
+ { HRTIMER_MODE_ABS_SOFT, "ABS|SOFT" }, \
+ { HRTIMER_MODE_REL_SOFT, "REL|SOFT" }, \
+ { HRTIMER_MODE_ABS_PINNED_SOFT, "ABS|PINNED|SOFT" }, \
+- { HRTIMER_MODE_REL_PINNED_SOFT, "REL|PINNED|SOFT" })
++ { HRTIMER_MODE_REL_PINNED_SOFT, "REL|PINNED|SOFT" }, \
++ { HRTIMER_MODE_ABS_HARD, "ABS|HARD" }, \
++ { HRTIMER_MODE_REL_HARD, "REL|HARD" }, \
++ { HRTIMER_MODE_ABS_PINNED_HARD, "ABS|PINNED|HARD" }, \
++ { HRTIMER_MODE_REL_PINNED_HARD, "REL|PINNED|HARD" })
+
+ /**
+ * hrtimer_init - called when the hrtimer is initialized
+diff --git a/include/uapi/linux/affs_hardblocks.h b/include/uapi/linux/affs_hardblocks.h
+index 5e2fb8481252a..a5aff2eb5f708 100644
+--- a/include/uapi/linux/affs_hardblocks.h
++++ b/include/uapi/linux/affs_hardblocks.h
+@@ -7,42 +7,42 @@
+ /* Just the needed definitions for the RDB of an Amiga HD. */
+
+ struct RigidDiskBlock {
+- __u32 rdb_ID;
++ __be32 rdb_ID;
+ __be32 rdb_SummedLongs;
+- __s32 rdb_ChkSum;
+- __u32 rdb_HostID;
++ __be32 rdb_ChkSum;
++ __be32 rdb_HostID;
+ __be32 rdb_BlockBytes;
+- __u32 rdb_Flags;
+- __u32 rdb_BadBlockList;
++ __be32 rdb_Flags;
++ __be32 rdb_BadBlockList;
+ __be32 rdb_PartitionList;
+- __u32 rdb_FileSysHeaderList;
+- __u32 rdb_DriveInit;
+- __u32 rdb_Reserved1[6];
+- __u32 rdb_Cylinders;
+- __u32 rdb_Sectors;
+- __u32 rdb_Heads;
+- __u32 rdb_Interleave;
+- __u32 rdb_Park;
+- __u32 rdb_Reserved2[3];
+- __u32 rdb_WritePreComp;
+- __u32 rdb_ReducedWrite;
+- __u32 rdb_StepRate;
+- __u32 rdb_Reserved3[5];
+- __u32 rdb_RDBBlocksLo;
+- __u32 rdb_RDBBlocksHi;
+- __u32 rdb_LoCylinder;
+- __u32 rdb_HiCylinder;
+- __u32 rdb_CylBlocks;
+- __u32 rdb_AutoParkSeconds;
+- __u32 rdb_HighRDSKBlock;
+- __u32 rdb_Reserved4;
++ __be32 rdb_FileSysHeaderList;
++ __be32 rdb_DriveInit;
++ __be32 rdb_Reserved1[6];
++ __be32 rdb_Cylinders;
++ __be32 rdb_Sectors;
++ __be32 rdb_Heads;
++ __be32 rdb_Interleave;
++ __be32 rdb_Park;
++ __be32 rdb_Reserved2[3];
++ __be32 rdb_WritePreComp;
++ __be32 rdb_ReducedWrite;
++ __be32 rdb_StepRate;
++ __be32 rdb_Reserved3[5];
++ __be32 rdb_RDBBlocksLo;
++ __be32 rdb_RDBBlocksHi;
++ __be32 rdb_LoCylinder;
++ __be32 rdb_HiCylinder;
++ __be32 rdb_CylBlocks;
++ __be32 rdb_AutoParkSeconds;
++ __be32 rdb_HighRDSKBlock;
++ __be32 rdb_Reserved4;
+ char rdb_DiskVendor[8];
+ char rdb_DiskProduct[16];
+ char rdb_DiskRevision[4];
+ char rdb_ControllerVendor[8];
+ char rdb_ControllerProduct[16];
+ char rdb_ControllerRevision[4];
+- __u32 rdb_Reserved5[10];
++ __be32 rdb_Reserved5[10];
+ };
+
+ #define IDNAME_RIGIDDISK 0x5244534B /* "RDSK" */
+@@ -50,16 +50,16 @@ struct RigidDiskBlock {
+ struct PartitionBlock {
+ __be32 pb_ID;
+ __be32 pb_SummedLongs;
+- __s32 pb_ChkSum;
+- __u32 pb_HostID;
++ __be32 pb_ChkSum;
++ __be32 pb_HostID;
+ __be32 pb_Next;
+- __u32 pb_Flags;
+- __u32 pb_Reserved1[2];
+- __u32 pb_DevFlags;
++ __be32 pb_Flags;
++ __be32 pb_Reserved1[2];
++ __be32 pb_DevFlags;
+ __u8 pb_DriveName[32];
+- __u32 pb_Reserved2[15];
++ __be32 pb_Reserved2[15];
+ __be32 pb_Environment[17];
+- __u32 pb_EReserved[15];
++ __be32 pb_EReserved[15];
+ };
+
+ #define IDNAME_PARTITION 0x50415254 /* "PART" */
+diff --git a/include/uapi/linux/auto_dev-ioctl.h b/include/uapi/linux/auto_dev-ioctl.h
+index 62e625356dc81..08be539605fca 100644
+--- a/include/uapi/linux/auto_dev-ioctl.h
++++ b/include/uapi/linux/auto_dev-ioctl.h
+@@ -109,7 +109,7 @@ struct autofs_dev_ioctl {
+ struct args_ismountpoint ismountpoint;
+ };
+
+- char path[0];
++ char path[];
+ };
+
+ static inline void init_autofs_dev_ioctl(struct autofs_dev_ioctl *in)
+diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
+index aee75eb9e6863..5d8bd754c69f1 100644
+--- a/include/uapi/linux/videodev2.h
++++ b/include/uapi/linux/videodev2.h
+@@ -1720,7 +1720,7 @@ struct v4l2_input {
+ __u8 name[32]; /* Label */
+ __u32 type; /* Type of input */
+ __u32 audioset; /* Associated audios (bitfield) */
+- __u32 tuner; /* enum v4l2_tuner_type */
++ __u32 tuner; /* Tuner index */
+ v4l2_std_id std;
+ __u32 status;
+ __u32 capabilities;
+@@ -1807,8 +1807,8 @@ struct v4l2_ext_control {
+ __u8 __user *p_u8;
+ __u16 __user *p_u16;
+ __u32 __user *p_u32;
+- __u32 __user *p_s32;
+- __u32 __user *p_s64;
++ __s32 __user *p_s32;
++ __s64 __user *p_s64;
+ struct v4l2_area __user *p_area;
+ struct v4l2_ctrl_h264_sps __user *p_h264_sps;
+ struct v4l2_ctrl_h264_pps *p_h264_pps;
+diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
+index df1d04f7a5424..8eecbb3766158 100644
+--- a/include/ufs/ufshcd.h
++++ b/include/ufs/ufshcd.h
+@@ -225,7 +225,6 @@ struct ufs_dev_cmd {
+ struct mutex lock;
+ struct completion *complete;
+ struct ufs_query query;
+- struct cq_entry *cqe;
+ };
+
+ /**
+diff --git a/init/Makefile b/init/Makefile
+index 26de459006c4e..ec557ada3c12e 100644
+--- a/init/Makefile
++++ b/init/Makefile
+@@ -60,3 +60,4 @@ include/generated/utsversion.h: FORCE
+ $(obj)/version-timestamp.o: include/generated/utsversion.h
+ CFLAGS_version-timestamp.o := -include include/generated/utsversion.h
+ KASAN_SANITIZE_version-timestamp.o := n
++GCOV_PROFILE_version-timestamp.o := n
+diff --git a/init/main.c b/init/main.c
+index af50044deed56..c445c1fb19b95 100644
+--- a/init/main.c
++++ b/init/main.c
+@@ -877,7 +877,8 @@ static void __init print_unknown_bootoptions(void)
+ memblock_free(unknown_options, len);
+ }
+
+-asmlinkage __visible void __init __no_sanitize_address __noreturn start_kernel(void)
++asmlinkage __visible __init __no_sanitize_address __noreturn __no_stack_protector
++void start_kernel(void)
+ {
+ char *command_line;
+ char *after_dashes;
+diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
+index 3bca7a79efda4..f1b79959d1c1d 100644
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -2575,6 +2575,8 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
+ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+ struct io_wait_queue *iowq)
+ {
++ int token, ret;
++
+ if (unlikely(READ_ONCE(ctx->check_cq)))
+ return 1;
+ if (unlikely(!llist_empty(&ctx->work_llist)))
+@@ -2585,11 +2587,20 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+ return -EINTR;
+ if (unlikely(io_should_wake(iowq)))
+ return 0;
++
++ /*
++ * Use io_schedule_prepare/finish, so cpufreq can take into account
++ * that the task is waiting for IO - turns out to be important for low
++ * QD IO.
++ */
++ token = io_schedule_prepare();
++ ret = 0;
+ if (iowq->timeout == KTIME_MAX)
+ schedule();
+ else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
+- return -ETIME;
+- return 0;
++ ret = -ETIME;
++ io_schedule_finish(token);
++ return ret;
+ }
+
+ /*
+@@ -3050,7 +3061,18 @@ static __cold void io_ring_exit_work(struct work_struct *work)
+ /* there is little hope left, don't run it too often */
+ interval = HZ * 60;
+ }
+- } while (!wait_for_completion_timeout(&ctx->ref_comp, interval));
++ /*
++ * This is really an uninterruptible wait, as it has to be
++ * complete. But it's also run from a kworker, which doesn't
++ * take signals, so it's fine to make it interruptible. This
++ * avoids scenarios where we knowingly can wait much longer
++ * on completions, for example if someone does a SIGSTOP on
++ * a task that needs to finish task_work to make this loop
++ * complete. That's a synthetic situation that should not
++ * cause a stuck task backtrace, and hence a potential panic
++ * on stuck tasks if that is enabled.
++ */
++ } while (!wait_for_completion_interruptible_timeout(&ctx->ref_comp, interval));
+
+ init_completion(&exit.completion);
+ init_task_work(&exit.task_work, io_tctx_exit_cb);
+@@ -3074,7 +3096,12 @@ static __cold void io_ring_exit_work(struct work_struct *work)
+ continue;
+
+ mutex_unlock(&ctx->uring_lock);
+- wait_for_completion(&exit.completion);
++ /*
++ * See comment above for
++ * wait_for_completion_interruptible_timeout() on why this
++ * wait is marked as interruptible.
++ */
++ wait_for_completion_interruptible(&exit.completion);
+ mutex_lock(&ctx->uring_lock);
+ }
+ mutex_unlock(&ctx->uring_lock);
+diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
+index 72b32b7cd9cd9..25ca17a8e1964 100644
+--- a/kernel/bpf/btf.c
++++ b/kernel/bpf/btf.c
+@@ -7848,10 +7848,8 @@ static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook,
+ pr_err("missing vmlinux BTF, cannot register kfuncs\n");
+ return -ENOENT;
+ }
+- if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES)) {
+- pr_err("missing module BTF, cannot register kfuncs\n");
+- return -ENOENT;
+- }
++ if (kset->owner && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES))
++ pr_warn("missing module BTF, cannot register kfuncs\n");
+ return 0;
+ }
+ if (IS_ERR(btf))
+diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
+index 517b6a5928cc0..5b2741aa0d9bb 100644
+--- a/kernel/bpf/cgroup.c
++++ b/kernel/bpf/cgroup.c
+@@ -1826,6 +1826,12 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
+ ret = 1;
+ } else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
+ /* optlen is out of bounds */
++ if (*optlen > PAGE_SIZE && ctx.optlen >= 0) {
++ pr_info_once("bpf setsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
++ ctx.optlen, max_optlen);
++ ret = 0;
++ goto out;
++ }
+ ret = -EFAULT;
+ } else {
+ /* optlen within bounds, run kernel handler */
+@@ -1881,8 +1887,10 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
+ .optname = optname,
+ .current_task = current,
+ };
++ int orig_optlen;
+ int ret;
+
++ orig_optlen = max_optlen;
+ ctx.optlen = max_optlen;
+ max_optlen = sockopt_alloc_buf(&ctx, max_optlen, &buf);
+ if (max_optlen < 0)
+@@ -1905,6 +1913,7 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
+ ret = -EFAULT;
+ goto out;
+ }
++ orig_optlen = ctx.optlen;
+
+ if (copy_from_user(ctx.optval, optval,
+ min(ctx.optlen, max_optlen)) != 0) {
+@@ -1922,6 +1931,12 @@ int __cgroup_bpf_run_filter_getsockopt(struct sock *sk, int level,
+ goto out;
+
+ if (optval && (ctx.optlen > max_optlen || ctx.optlen < 0)) {
++ if (orig_optlen > PAGE_SIZE && ctx.optlen >= 0) {
++ pr_info_once("bpf getsockopt: ignoring program buffer with optlen=%d (max_optlen=%d)\n",
++ ctx.optlen, max_optlen);
++ ret = retval;
++ goto out;
++ }
+ ret = -EFAULT;
+ goto out;
+ }
+diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
+index 8d368fa353f99..f12565ba136b0 100644
+--- a/kernel/bpf/helpers.c
++++ b/kernel/bpf/helpers.c
+@@ -1926,8 +1926,12 @@ __bpf_kfunc void *bpf_refcount_acquire_impl(void *p__refcounted_kptr, void *meta
+ * bpf_refcount type so that it is emitted in vmlinux BTF
+ */
+ ref = (struct bpf_refcount *)(p__refcounted_kptr + meta->record->refcount_off);
++ if (!refcount_inc_not_zero((refcount_t *)ref))
++ return NULL;
+
+- refcount_inc((refcount_t *)ref);
++ /* Verifier strips KF_RET_NULL if input is owned ref, see is_kfunc_ret_null
++ * in verifier.c
++ */
+ return (void *)p__refcounted_kptr;
+ }
+
+@@ -1943,7 +1947,7 @@ static int __bpf_list_add(struct bpf_list_node *node, struct bpf_list_head *head
+ INIT_LIST_HEAD(h);
+ if (!list_empty(n)) {
+ /* Only called from BPF prog, no need to migrate_disable */
+- __bpf_obj_drop_impl(n - off, rec);
++ __bpf_obj_drop_impl((void *)n - off, rec);
+ return -EINVAL;
+ }
+
+@@ -2025,7 +2029,7 @@ static int __bpf_rbtree_add(struct bpf_rb_root *root, struct bpf_rb_node *node,
+
+ if (!RB_EMPTY_NODE(n)) {
+ /* Only called from BPF prog, no need to migrate_disable */
+- __bpf_obj_drop_impl(n - off, rec);
++ __bpf_obj_drop_impl((void *)n - off, rec);
+ return -EINVAL;
+ }
+
+@@ -2325,7 +2329,7 @@ BTF_ID_FLAGS(func, crash_kexec, KF_DESTRUCTIVE)
+ #endif
+ BTF_ID_FLAGS(func, bpf_obj_new_impl, KF_ACQUIRE | KF_RET_NULL)
+ BTF_ID_FLAGS(func, bpf_obj_drop_impl, KF_RELEASE)
+-BTF_ID_FLAGS(func, bpf_refcount_acquire_impl, KF_ACQUIRE)
++BTF_ID_FLAGS(func, bpf_refcount_acquire_impl, KF_ACQUIRE | KF_RET_NULL)
+ BTF_ID_FLAGS(func, bpf_list_push_front_impl)
+ BTF_ID_FLAGS(func, bpf_list_push_back_impl)
+ BTF_ID_FLAGS(func, bpf_list_pop_front, KF_ACQUIRE | KF_RET_NULL)
+diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
+index ac021bc43a66b..78acf28d48732 100644
+--- a/kernel/bpf/trampoline.c
++++ b/kernel/bpf/trampoline.c
+@@ -251,11 +251,8 @@ bpf_trampoline_get_progs(const struct bpf_trampoline *tr, int *total, bool *ip_a
+ return tlinks;
+ }
+
+-static void __bpf_tramp_image_put_deferred(struct work_struct *work)
++static void bpf_tramp_image_free(struct bpf_tramp_image *im)
+ {
+- struct bpf_tramp_image *im;
+-
+- im = container_of(work, struct bpf_tramp_image, work);
+ bpf_image_ksym_del(&im->ksym);
+ bpf_jit_free_exec(im->image);
+ bpf_jit_uncharge_modmem(PAGE_SIZE);
+@@ -263,6 +260,14 @@ static void __bpf_tramp_image_put_deferred(struct work_struct *work)
+ kfree_rcu(im, rcu);
+ }
+
++static void __bpf_tramp_image_put_deferred(struct work_struct *work)
++{
++ struct bpf_tramp_image *im;
++
++ im = container_of(work, struct bpf_tramp_image, work);
++ bpf_tramp_image_free(im);
++}
++
+ /* callback, fexit step 3 or fentry step 2 */
+ static void __bpf_tramp_image_put_rcu(struct rcu_head *rcu)
+ {
+@@ -344,7 +349,7 @@ static void bpf_tramp_image_put(struct bpf_tramp_image *im)
+ call_rcu_tasks_trace(&im->rcu, __bpf_tramp_image_put_rcu_tasks);
+ }
+
+-static struct bpf_tramp_image *bpf_tramp_image_alloc(u64 key, u32 idx)
++static struct bpf_tramp_image *bpf_tramp_image_alloc(u64 key)
+ {
+ struct bpf_tramp_image *im;
+ struct bpf_ksym *ksym;
+@@ -371,7 +376,7 @@ static struct bpf_tramp_image *bpf_tramp_image_alloc(u64 key, u32 idx)
+
+ ksym = &im->ksym;
+ INIT_LIST_HEAD_RCU(&ksym->lnode);
+- snprintf(ksym->name, KSYM_NAME_LEN, "bpf_trampoline_%llu_%u", key, idx);
++ snprintf(ksym->name, KSYM_NAME_LEN, "bpf_trampoline_%llu", key);
+ bpf_image_ksym_add(image, ksym);
+ return im;
+
+@@ -401,11 +406,10 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
+ err = unregister_fentry(tr, tr->cur_image->image);
+ bpf_tramp_image_put(tr->cur_image);
+ tr->cur_image = NULL;
+- tr->selector = 0;
+ goto out;
+ }
+
+- im = bpf_tramp_image_alloc(tr->key, tr->selector);
++ im = bpf_tramp_image_alloc(tr->key);
+ if (IS_ERR(im)) {
+ err = PTR_ERR(im);
+ goto out;
+@@ -438,12 +442,11 @@ again:
+ &tr->func.model, tr->flags, tlinks,
+ tr->func.addr);
+ if (err < 0)
+- goto out;
++ goto out_free;
+
+ set_memory_rox((long)im->image, 1);
+
+- WARN_ON(tr->cur_image && tr->selector == 0);
+- WARN_ON(!tr->cur_image && tr->selector);
++ WARN_ON(tr->cur_image && total == 0);
+ if (tr->cur_image)
+ /* progs already running at this address */
+ err = modify_fentry(tr, tr->cur_image->image, im->image, lock_direct_mutex);
+@@ -468,18 +471,21 @@ again:
+ }
+ #endif
+ if (err)
+- goto out;
++ goto out_free;
+
+ if (tr->cur_image)
+ bpf_tramp_image_put(tr->cur_image);
+ tr->cur_image = im;
+- tr->selector++;
+ out:
+ /* If any error happens, restore previous flags */
+ if (err)
+ tr->flags = orig_flags;
+ kfree(tlinks);
+ return err;
++
++out_free:
++ bpf_tramp_image_free(im);
++ goto out;
+ }
+
+ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(struct bpf_prog *prog)
+diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
+index cf5f230360f53..30fabae47a07b 100644
+--- a/kernel/bpf/verifier.c
++++ b/kernel/bpf/verifier.c
+@@ -273,11 +273,6 @@ struct bpf_call_arg_meta {
+ struct btf_field *kptr_field;
+ };
+
+-struct btf_and_id {
+- struct btf *btf;
+- u32 btf_id;
+-};
+-
+ struct bpf_kfunc_call_arg_meta {
+ /* In parameters */
+ struct btf *btf;
+@@ -296,10 +291,21 @@ struct bpf_kfunc_call_arg_meta {
+ u64 value;
+ bool found;
+ } arg_constant;
+- union {
+- struct btf_and_id arg_obj_drop;
+- struct btf_and_id arg_refcount_acquire;
+- };
++
++ /* arg_{btf,btf_id,owning_ref} are used by kfunc-specific handling,
++ * generally to pass info about user-defined local kptr types to later
++ * verification logic
++ * bpf_obj_drop
++ * Record the local kptr type to be drop'd
++ * bpf_refcount_acquire (via KF_ARG_PTR_TO_REFCOUNTED_KPTR arg type)
++ * Record the local kptr type to be refcount_incr'd and use
++ * arg_owning_ref to determine whether refcount_acquire should be
++ * fallible
++ */
++ struct btf *arg_btf;
++ u32 arg_btf_id;
++ bool arg_owning_ref;
++
+ struct {
+ struct btf_field *field;
+ } arg_list_head;
+@@ -604,9 +610,9 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
+ type & PTR_TRUSTED ? "trusted_" : ""
+ );
+
+- snprintf(env->type_str_buf, TYPE_STR_BUF_LEN, "%s%s%s",
++ snprintf(env->tmp_str_buf, TMP_STR_BUF_LEN, "%s%s%s",
+ prefix, str[base_type(type)], postfix);
+- return env->type_str_buf;
++ return env->tmp_str_buf;
+ }
+
+ static char slot_type_char[] = {
+@@ -1254,6 +1260,12 @@ static bool is_spilled_reg(const struct bpf_stack_state *stack)
+ return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL;
+ }
+
++static bool is_spilled_scalar_reg(const struct bpf_stack_state *stack)
++{
++ return stack->slot_type[BPF_REG_SIZE - 1] == STACK_SPILL &&
++ stack->spilled_ptr.type == SCALAR_VALUE;
++}
++
+ static void scrub_spilled_slot(u8 *stype)
+ {
+ if (*stype != STACK_INVALID)
+@@ -3144,12 +3156,167 @@ static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn)
+ return btf_name_by_offset(desc_btf, func->name_off);
+ }
+
++static inline void bt_init(struct backtrack_state *bt, u32 frame)
++{
++ bt->frame = frame;
++}
++
++static inline void bt_reset(struct backtrack_state *bt)
++{
++ struct bpf_verifier_env *env = bt->env;
++
++ memset(bt, 0, sizeof(*bt));
++ bt->env = env;
++}
++
++static inline u32 bt_empty(struct backtrack_state *bt)
++{
++ u64 mask = 0;
++ int i;
++
++ for (i = 0; i <= bt->frame; i++)
++ mask |= bt->reg_masks[i] | bt->stack_masks[i];
++
++ return mask == 0;
++}
++
++static inline int bt_subprog_enter(struct backtrack_state *bt)
++{
++ if (bt->frame == MAX_CALL_FRAMES - 1) {
++ verbose(bt->env, "BUG subprog enter from frame %d\n", bt->frame);
++ WARN_ONCE(1, "verifier backtracking bug");
++ return -EFAULT;
++ }
++ bt->frame++;
++ return 0;
++}
++
++static inline int bt_subprog_exit(struct backtrack_state *bt)
++{
++ if (bt->frame == 0) {
++ verbose(bt->env, "BUG subprog exit from frame 0\n");
++ WARN_ONCE(1, "verifier backtracking bug");
++ return -EFAULT;
++ }
++ bt->frame--;
++ return 0;
++}
++
++static inline void bt_set_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
++{
++ bt->reg_masks[frame] |= 1 << reg;
++}
++
++static inline void bt_clear_frame_reg(struct backtrack_state *bt, u32 frame, u32 reg)
++{
++ bt->reg_masks[frame] &= ~(1 << reg);
++}
++
++static inline void bt_set_reg(struct backtrack_state *bt, u32 reg)
++{
++ bt_set_frame_reg(bt, bt->frame, reg);
++}
++
++static inline void bt_clear_reg(struct backtrack_state *bt, u32 reg)
++{
++ bt_clear_frame_reg(bt, bt->frame, reg);
++}
++
++static inline void bt_set_frame_slot(struct backtrack_state *bt, u32 frame, u32 slot)
++{
++ bt->stack_masks[frame] |= 1ull << slot;
++}
++
++static inline void bt_clear_frame_slot(struct backtrack_state *bt, u32 frame, u32 slot)
++{
++ bt->stack_masks[frame] &= ~(1ull << slot);
++}
++
++static inline void bt_set_slot(struct backtrack_state *bt, u32 slot)
++{
++ bt_set_frame_slot(bt, bt->frame, slot);
++}
++
++static inline void bt_clear_slot(struct backtrack_state *bt, u32 slot)
++{
++ bt_clear_frame_slot(bt, bt->frame, slot);
++}
++
++static inline u32 bt_frame_reg_mask(struct backtrack_state *bt, u32 frame)
++{
++ return bt->reg_masks[frame];
++}
++
++static inline u32 bt_reg_mask(struct backtrack_state *bt)
++{
++ return bt->reg_masks[bt->frame];
++}
++
++static inline u64 bt_frame_stack_mask(struct backtrack_state *bt, u32 frame)
++{
++ return bt->stack_masks[frame];
++}
++
++static inline u64 bt_stack_mask(struct backtrack_state *bt)
++{
++ return bt->stack_masks[bt->frame];
++}
++
++static inline bool bt_is_reg_set(struct backtrack_state *bt, u32 reg)
++{
++ return bt->reg_masks[bt->frame] & (1 << reg);
++}
++
++static inline bool bt_is_slot_set(struct backtrack_state *bt, u32 slot)
++{
++ return bt->stack_masks[bt->frame] & (1ull << slot);
++}
++
++/* format registers bitmask, e.g., "r0,r2,r4" for 0x15 mask */
++static void fmt_reg_mask(char *buf, ssize_t buf_sz, u32 reg_mask)
++{
++ DECLARE_BITMAP(mask, 64);
++ bool first = true;
++ int i, n;
++
++ buf[0] = '\0';
++
++ bitmap_from_u64(mask, reg_mask);
++ for_each_set_bit(i, mask, 32) {
++ n = snprintf(buf, buf_sz, "%sr%d", first ? "" : ",", i);
++ first = false;
++ buf += n;
++ buf_sz -= n;
++ if (buf_sz < 0)
++ break;
++ }
++}
++/* format stack slots bitmask, e.g., "-8,-24,-40" for 0x15 mask */
++static void fmt_stack_mask(char *buf, ssize_t buf_sz, u64 stack_mask)
++{
++ DECLARE_BITMAP(mask, 64);
++ bool first = true;
++ int i, n;
++
++ buf[0] = '\0';
++
++ bitmap_from_u64(mask, stack_mask);
++ for_each_set_bit(i, mask, 64) {
++ n = snprintf(buf, buf_sz, "%s%d", first ? "" : ",", -(i + 1) * 8);
++ first = false;
++ buf += n;
++ buf_sz -= n;
++ if (buf_sz < 0)
++ break;
++ }
++}
++
+ /* For given verifier state backtrack_insn() is called from the last insn to
+ * the first insn. Its purpose is to compute a bitmask of registers and
+ * stack slots that needs precision in the parent verifier state.
+ */
+ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+- u32 *reg_mask, u64 *stack_mask)
++ struct backtrack_state *bt)
+ {
+ const struct bpf_insn_cbs cbs = {
+ .cb_call = disasm_kfunc_name,
+@@ -3160,20 +3327,24 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ u8 class = BPF_CLASS(insn->code);
+ u8 opcode = BPF_OP(insn->code);
+ u8 mode = BPF_MODE(insn->code);
+- u32 dreg = 1u << insn->dst_reg;
+- u32 sreg = 1u << insn->src_reg;
++ u32 dreg = insn->dst_reg;
++ u32 sreg = insn->src_reg;
+ u32 spi;
+
+ if (insn->code == 0)
+ return 0;
+ if (env->log.level & BPF_LOG_LEVEL2) {
+- verbose(env, "regs=%x stack=%llx before ", *reg_mask, *stack_mask);
++ fmt_reg_mask(env->tmp_str_buf, TMP_STR_BUF_LEN, bt_reg_mask(bt));
++ verbose(env, "mark_precise: frame%d: regs=%s ",
++ bt->frame, env->tmp_str_buf);
++ fmt_stack_mask(env->tmp_str_buf, TMP_STR_BUF_LEN, bt_stack_mask(bt));
++ verbose(env, "stack=%s before ", env->tmp_str_buf);
+ verbose(env, "%d: ", idx);
+ print_bpf_insn(&cbs, insn, env->allow_ptr_leaks);
+ }
+
+ if (class == BPF_ALU || class == BPF_ALU64) {
+- if (!(*reg_mask & dreg))
++ if (!bt_is_reg_set(bt, dreg))
+ return 0;
+ if (opcode == BPF_MOV) {
+ if (BPF_SRC(insn->code) == BPF_X) {
+@@ -3181,8 +3352,8 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ * dreg needs precision after this insn
+ * sreg needs precision before this insn
+ */
+- *reg_mask &= ~dreg;
+- *reg_mask |= sreg;
++ bt_clear_reg(bt, dreg);
++ bt_set_reg(bt, sreg);
+ } else {
+ /* dreg = K
+ * dreg needs precision after this insn.
+@@ -3190,7 +3361,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ * as precise=true in this verifier state.
+ * No further markings in parent are necessary
+ */
+- *reg_mask &= ~dreg;
++ bt_clear_reg(bt, dreg);
+ }
+ } else {
+ if (BPF_SRC(insn->code) == BPF_X) {
+@@ -3198,15 +3369,15 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ * both dreg and sreg need precision
+ * before this insn
+ */
+- *reg_mask |= sreg;
++ bt_set_reg(bt, sreg);
+ } /* else dreg += K
+ * dreg still needs precision before this insn
+ */
+ }
+ } else if (class == BPF_LDX) {
+- if (!(*reg_mask & dreg))
++ if (!bt_is_reg_set(bt, dreg))
+ return 0;
+- *reg_mask &= ~dreg;
++ bt_clear_reg(bt, dreg);
+
+ /* scalars can only be spilled into stack w/o losing precision.
+ * Load from any other memory can be zero extended.
+@@ -3227,9 +3398,9 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ WARN_ONCE(1, "verifier backtracking bug");
+ return -EFAULT;
+ }
+- *stack_mask |= 1ull << spi;
++ bt_set_slot(bt, spi);
+ } else if (class == BPF_STX || class == BPF_ST) {
+- if (*reg_mask & dreg)
++ if (bt_is_reg_set(bt, dreg))
+ /* stx & st shouldn't be using _scalar_ dst_reg
+ * to access memory. It means backtracking
+ * encountered a case of pointer subtraction.
+@@ -3244,11 +3415,11 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ WARN_ONCE(1, "verifier backtracking bug");
+ return -EFAULT;
+ }
+- if (!(*stack_mask & (1ull << spi)))
++ if (!bt_is_slot_set(bt, spi))
+ return 0;
+- *stack_mask &= ~(1ull << spi);
++ bt_clear_slot(bt, spi);
+ if (class == BPF_STX)
+- *reg_mask |= sreg;
++ bt_set_reg(bt, sreg);
+ } else if (class == BPF_JMP || class == BPF_JMP32) {
+ if (opcode == BPF_CALL) {
+ if (insn->src_reg == BPF_PSEUDO_CALL)
+@@ -3265,19 +3436,19 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && insn->imm == 0)
+ return -ENOTSUPP;
+ /* regular helper call sets R0 */
+- *reg_mask &= ~1;
+- if (*reg_mask & 0x3f) {
++ bt_clear_reg(bt, BPF_REG_0);
++ if (bt_reg_mask(bt) & BPF_REGMASK_ARGS) {
+ /* if backtracing was looking for registers R1-R5
+ * they should have been found already.
+ */
+- verbose(env, "BUG regs %x\n", *reg_mask);
++ verbose(env, "BUG regs %x\n", bt_reg_mask(bt));
+ WARN_ONCE(1, "verifier backtracking bug");
+ return -EFAULT;
+ }
+ } else if (opcode == BPF_EXIT) {
+ return -ENOTSUPP;
+ } else if (BPF_SRC(insn->code) == BPF_X) {
+- if (!(*reg_mask & (dreg | sreg)))
++ if (!bt_is_reg_set(bt, dreg) && !bt_is_reg_set(bt, sreg))
+ return 0;
+ /* dreg <cond> sreg
+ * Both dreg and sreg need precision before
+@@ -3285,7 +3456,8 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ * before it would be equally necessary to
+ * propagate it to dreg.
+ */
+- *reg_mask |= (sreg | dreg);
++ bt_set_reg(bt, dreg);
++ bt_set_reg(bt, sreg);
+ /* else dreg <cond> K
+ * Only dreg still needs precision before
+ * this insn, so for the K-based conditional
+@@ -3293,9 +3465,9 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
+ */
+ }
+ } else if (class == BPF_LD) {
+- if (!(*reg_mask & dreg))
++ if (!bt_is_reg_set(bt, dreg))
+ return 0;
+- *reg_mask &= ~dreg;
++ bt_clear_reg(bt, dreg);
+ /* It's ld_imm64 or ld_abs or ld_ind.
+ * For ld_imm64 no further tracking of precision
+ * into parent is necessary
+@@ -3366,6 +3538,11 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env,
+ struct bpf_reg_state *reg;
+ int i, j;
+
++ if (env->log.level & BPF_LOG_LEVEL2) {
++ verbose(env, "mark_precise: frame%d: falling back to forcing all scalars precise\n",
++ st->curframe);
++ }
++
+ /* big hammer: mark all scalars precise in this path.
+ * pop_stack may still get !precise scalars.
+ * We also skip current state and go straight to first parent state,
+@@ -3377,17 +3554,25 @@ static void mark_all_scalars_precise(struct bpf_verifier_env *env,
+ func = st->frame[i];
+ for (j = 0; j < BPF_REG_FP; j++) {
+ reg = &func->regs[j];
+- if (reg->type != SCALAR_VALUE)
++ if (reg->type != SCALAR_VALUE || reg->precise)
+ continue;
+ reg->precise = true;
++ if (env->log.level & BPF_LOG_LEVEL2) {
++ verbose(env, "force_precise: frame%d: forcing r%d to be precise\n",
++ i, j);
++ }
+ }
+ for (j = 0; j < func->allocated_stack / BPF_REG_SIZE; j++) {
+ if (!is_spilled_reg(&func->stack[j]))
+ continue;
+ reg = &func->stack[j].spilled_ptr;
+- if (reg->type != SCALAR_VALUE)
++ if (reg->type != SCALAR_VALUE || reg->precise)
+ continue;
+ reg->precise = true;
++ if (env->log.level & BPF_LOG_LEVEL2) {
++ verbose(env, "force_precise: frame%d: forcing fp%d to be precise\n",
++ i, -(j + 1) * 8);
++ }
+ }
+ }
+ }
+@@ -3418,6 +3603,96 @@ static void mark_all_scalars_imprecise(struct bpf_verifier_env *env, struct bpf_
+ }
+ }
+
++static bool idset_contains(struct bpf_idset *s, u32 id)
++{
++ u32 i;
++
++ for (i = 0; i < s->count; ++i)
++ if (s->ids[i] == id)
++ return true;
++
++ return false;
++}
++
++static int idset_push(struct bpf_idset *s, u32 id)
++{
++ if (WARN_ON_ONCE(s->count >= ARRAY_SIZE(s->ids)))
++ return -EFAULT;
++ s->ids[s->count++] = id;
++ return 0;
++}
++
++static void idset_reset(struct bpf_idset *s)
++{
++ s->count = 0;
++}
++
++/* Collect a set of IDs for all registers currently marked as precise in env->bt.
++ * Mark all registers with these IDs as precise.
++ */
++static int mark_precise_scalar_ids(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
++{
++ struct bpf_idset *precise_ids = &env->idset_scratch;
++ struct backtrack_state *bt = &env->bt;
++ struct bpf_func_state *func;
++ struct bpf_reg_state *reg;
++ DECLARE_BITMAP(mask, 64);
++ int i, fr;
++
++ idset_reset(precise_ids);
++
++ for (fr = bt->frame; fr >= 0; fr--) {
++ func = st->frame[fr];
++
++ bitmap_from_u64(mask, bt_frame_reg_mask(bt, fr));
++ for_each_set_bit(i, mask, 32) {
++ reg = &func->regs[i];
++ if (!reg->id || reg->type != SCALAR_VALUE)
++ continue;
++ if (idset_push(precise_ids, reg->id))
++ return -EFAULT;
++ }
++
++ bitmap_from_u64(mask, bt_frame_stack_mask(bt, fr));
++ for_each_set_bit(i, mask, 64) {
++ if (i >= func->allocated_stack / BPF_REG_SIZE)
++ break;
++ if (!is_spilled_scalar_reg(&func->stack[i]))
++ continue;
++ reg = &func->stack[i].spilled_ptr;
++ if (!reg->id)
++ continue;
++ if (idset_push(precise_ids, reg->id))
++ return -EFAULT;
++ }
++ }
++
++ for (fr = 0; fr <= st->curframe; ++fr) {
++ func = st->frame[fr];
++
++ for (i = BPF_REG_0; i < BPF_REG_10; ++i) {
++ reg = &func->regs[i];
++ if (!reg->id)
++ continue;
++ if (!idset_contains(precise_ids, reg->id))
++ continue;
++ bt_set_frame_reg(bt, fr, i);
++ }
++ for (i = 0; i < func->allocated_stack / BPF_REG_SIZE; ++i) {
++ if (!is_spilled_scalar_reg(&func->stack[i]))
++ continue;
++ reg = &func->stack[i].spilled_ptr;
++ if (!reg->id)
++ continue;
++ if (!idset_contains(precise_ids, reg->id))
++ continue;
++ bt_set_frame_slot(bt, fr, i);
++ }
++ }
++
++ return 0;
++}
++
+ /*
+ * __mark_chain_precision() backtracks BPF program instruction sequence and
+ * chain of verifier states making sure that register *regno* (if regno >= 0)
+@@ -3505,62 +3780,73 @@ static void mark_all_scalars_imprecise(struct bpf_verifier_env *env, struct bpf_
+ * mark_all_scalars_imprecise() to hopefully get more permissive and generic
+ * finalized states which help in short circuiting more future states.
+ */
+-static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int regno,
+- int spi)
++static int __mark_chain_precision(struct bpf_verifier_env *env, int regno)
+ {
++ struct backtrack_state *bt = &env->bt;
+ struct bpf_verifier_state *st = env->cur_state;
+ int first_idx = st->first_insn_idx;
+ int last_idx = env->insn_idx;
+ struct bpf_func_state *func;
+ struct bpf_reg_state *reg;
+- u32 reg_mask = regno >= 0 ? 1u << regno : 0;
+- u64 stack_mask = spi >= 0 ? 1ull << spi : 0;
+ bool skip_first = true;
+- bool new_marks = false;
+- int i, err;
++ int i, fr, err;
+
+ if (!env->bpf_capable)
+ return 0;
+
++ /* set frame number from which we are starting to backtrack */
++ bt_init(bt, env->cur_state->curframe);
++
+ /* Do sanity checks against current state of register and/or stack
+ * slot, but don't set precise flag in current state, as precision
+ * tracking in the current state is unnecessary.
+ */
+- func = st->frame[frame];
++ func = st->frame[bt->frame];
+ if (regno >= 0) {
+ reg = &func->regs[regno];
+ if (reg->type != SCALAR_VALUE) {
+ WARN_ONCE(1, "backtracing misuse");
+ return -EFAULT;
+ }
+- new_marks = true;
++ bt_set_reg(bt, regno);
+ }
+
+- while (spi >= 0) {
+- if (!is_spilled_reg(&func->stack[spi])) {
+- stack_mask = 0;
+- break;
+- }
+- reg = &func->stack[spi].spilled_ptr;
+- if (reg->type != SCALAR_VALUE) {
+- stack_mask = 0;
+- break;
+- }
+- new_marks = true;
+- break;
+- }
+-
+- if (!new_marks)
+- return 0;
+- if (!reg_mask && !stack_mask)
++ if (bt_empty(bt))
+ return 0;
+
+ for (;;) {
+ DECLARE_BITMAP(mask, 64);
+ u32 history = st->jmp_history_cnt;
+
+- if (env->log.level & BPF_LOG_LEVEL2)
+- verbose(env, "last_idx %d first_idx %d\n", last_idx, first_idx);
++ if (env->log.level & BPF_LOG_LEVEL2) {
++ verbose(env, "mark_precise: frame%d: last_idx %d first_idx %d\n",
++ bt->frame, last_idx, first_idx);
++ }
++
++ /* If some register with scalar ID is marked as precise,
++ * make sure that all registers sharing this ID are also precise.
++ * This is needed to estimate effect of find_equal_scalars().
++ * Do this at the last instruction of each state,
++ * bpf_reg_state::id fields are valid for these instructions.
++ *
++ * Allows to track precision in situation like below:
++ *
++ * r2 = unknown value
++ * ...
++ * --- state #0 ---
++ * ...
++ * r1 = r2 // r1 and r2 now share the same ID
++ * ...
++ * --- state #1 {r1.id = A, r2.id = A} ---
++ * ...
++ * if (r2 > 10) goto exit; // find_equal_scalars() assigns range to r1
++ * ...
++ * --- state #2 {r1.id = A, r2.id = A} ---
++ * r3 = r10
++ * r3 += r1 // need to mark both r1 and r2
++ */
++ if (mark_precise_scalar_ids(env, st))
++ return -EFAULT;
+
+ if (last_idx < 0) {
+ /* we are at the entry into subprog, which
+@@ -3571,12 +3857,13 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
+ if (st->curframe == 0 &&
+ st->frame[0]->subprogno > 0 &&
+ st->frame[0]->callsite == BPF_MAIN_FUNC &&
+- stack_mask == 0 && (reg_mask & ~0x3e) == 0) {
+- bitmap_from_u64(mask, reg_mask);
++ bt_stack_mask(bt) == 0 &&
++ (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) == 0) {
++ bitmap_from_u64(mask, bt_reg_mask(bt));
+ for_each_set_bit(i, mask, 32) {
+ reg = &st->frame[0]->regs[i];
+ if (reg->type != SCALAR_VALUE) {
+- reg_mask &= ~(1u << i);
++ bt_clear_reg(bt, i);
+ continue;
+ }
+ reg->precise = true;
+@@ -3584,8 +3871,8 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
+ return 0;
+ }
+
+- verbose(env, "BUG backtracing func entry subprog %d reg_mask %x stack_mask %llx\n",
+- st->frame[0]->subprogno, reg_mask, stack_mask);
++ verbose(env, "BUG backtracking func entry subprog %d reg_mask %x stack_mask %llx\n",
++ st->frame[0]->subprogno, bt_reg_mask(bt), bt_stack_mask(bt));
+ WARN_ONCE(1, "verifier backtracking bug");
+ return -EFAULT;
+ }
+@@ -3595,15 +3882,16 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
+ err = 0;
+ skip_first = false;
+ } else {
+- err = backtrack_insn(env, i, ®_mask, &stack_mask);
++ err = backtrack_insn(env, i, bt);
+ }
+ if (err == -ENOTSUPP) {
+ mark_all_scalars_precise(env, st);
++ bt_reset(bt);
+ return 0;
+ } else if (err) {
+ return err;
+ }
+- if (!reg_mask && !stack_mask)
++ if (bt_empty(bt))
+ /* Found assignment(s) into tracked register in this state.
+ * Since this state is already marked, just return.
+ * Nothing to be tracked further in the parent state.
+@@ -3628,63 +3916,65 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
+ if (!st)
+ break;
+
+- new_marks = false;
+- func = st->frame[frame];
+- bitmap_from_u64(mask, reg_mask);
+- for_each_set_bit(i, mask, 32) {
+- reg = &func->regs[i];
+- if (reg->type != SCALAR_VALUE) {
+- reg_mask &= ~(1u << i);
+- continue;
++ for (fr = bt->frame; fr >= 0; fr--) {
++ func = st->frame[fr];
++ bitmap_from_u64(mask, bt_frame_reg_mask(bt, fr));
++ for_each_set_bit(i, mask, 32) {
++ reg = &func->regs[i];
++ if (reg->type != SCALAR_VALUE) {
++ bt_clear_frame_reg(bt, fr, i);
++ continue;
++ }
++ if (reg->precise)
++ bt_clear_frame_reg(bt, fr, i);
++ else
++ reg->precise = true;
+ }
+- if (!reg->precise)
+- new_marks = true;
+- reg->precise = true;
+- }
+
+- bitmap_from_u64(mask, stack_mask);
+- for_each_set_bit(i, mask, 64) {
+- if (i >= func->allocated_stack / BPF_REG_SIZE) {
+- /* the sequence of instructions:
+- * 2: (bf) r3 = r10
+- * 3: (7b) *(u64 *)(r3 -8) = r0
+- * 4: (79) r4 = *(u64 *)(r10 -8)
+- * doesn't contain jmps. It's backtracked
+- * as a single block.
+- * During backtracking insn 3 is not recognized as
+- * stack access, so at the end of backtracking
+- * stack slot fp-8 is still marked in stack_mask.
+- * However the parent state may not have accessed
+- * fp-8 and it's "unallocated" stack space.
+- * In such case fallback to conservative.
+- */
+- mark_all_scalars_precise(env, st);
+- return 0;
+- }
++ bitmap_from_u64(mask, bt_frame_stack_mask(bt, fr));
++ for_each_set_bit(i, mask, 64) {
++ if (i >= func->allocated_stack / BPF_REG_SIZE) {
++ /* the sequence of instructions:
++ * 2: (bf) r3 = r10
++ * 3: (7b) *(u64 *)(r3 -8) = r0
++ * 4: (79) r4 = *(u64 *)(r10 -8)
++ * doesn't contain jmps. It's backtracked
++ * as a single block.
++ * During backtracking insn 3 is not recognized as
++ * stack access, so at the end of backtracking
++ * stack slot fp-8 is still marked in stack_mask.
++ * However the parent state may not have accessed
++ * fp-8 and it's "unallocated" stack space.
++ * In such case fallback to conservative.
++ */
++ mark_all_scalars_precise(env, st);
++ bt_reset(bt);
++ return 0;
++ }
+
+- if (!is_spilled_reg(&func->stack[i])) {
+- stack_mask &= ~(1ull << i);
+- continue;
++ if (!is_spilled_scalar_reg(&func->stack[i])) {
++ bt_clear_frame_slot(bt, fr, i);
++ continue;
++ }
++ reg = &func->stack[i].spilled_ptr;
++ if (reg->precise)
++ bt_clear_frame_slot(bt, fr, i);
++ else
++ reg->precise = true;
+ }
+- reg = &func->stack[i].spilled_ptr;
+- if (reg->type != SCALAR_VALUE) {
+- stack_mask &= ~(1ull << i);
+- continue;
++ if (env->log.level & BPF_LOG_LEVEL2) {
++ fmt_reg_mask(env->tmp_str_buf, TMP_STR_BUF_LEN,
++ bt_frame_reg_mask(bt, fr));
++ verbose(env, "mark_precise: frame%d: parent state regs=%s ",
++ fr, env->tmp_str_buf);
++ fmt_stack_mask(env->tmp_str_buf, TMP_STR_BUF_LEN,
++ bt_frame_stack_mask(bt, fr));
++ verbose(env, "stack=%s: ", env->tmp_str_buf);
++ print_verifier_state(env, func, true);
+ }
+- if (!reg->precise)
+- new_marks = true;
+- reg->precise = true;
+- }
+- if (env->log.level & BPF_LOG_LEVEL2) {
+- verbose(env, "parent %s regs=%x stack=%llx marks:",
+- new_marks ? "didn't have" : "already had",
+- reg_mask, stack_mask);
+- print_verifier_state(env, func, true);
+ }
+
+- if (!reg_mask && !stack_mask)
+- break;
+- if (!new_marks)
++ if (bt_empty(bt))
+ break;
+
+ last_idx = st->last_insn_idx;
+@@ -3695,17 +3985,15 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
+
+ int mark_chain_precision(struct bpf_verifier_env *env, int regno)
+ {
+- return __mark_chain_precision(env, env->cur_state->curframe, regno, -1);
+-}
+-
+-static int mark_chain_precision_frame(struct bpf_verifier_env *env, int frame, int regno)
+-{
+- return __mark_chain_precision(env, frame, regno, -1);
++ return __mark_chain_precision(env, regno);
+ }
+
+-static int mark_chain_precision_stack_frame(struct bpf_verifier_env *env, int frame, int spi)
++/* mark_chain_precision_batch() assumes that env->bt is set in the caller to
++ * desired reg and stack masks across all relevant frames
++ */
++static int mark_chain_precision_batch(struct bpf_verifier_env *env)
+ {
+- return __mark_chain_precision(env, frame, -1, spi);
++ return __mark_chain_precision(env, -1);
+ }
+
+ static bool is_spillable_regtype(enum bpf_reg_type type)
+@@ -9327,11 +9615,6 @@ static bool is_kfunc_acquire(struct bpf_kfunc_call_arg_meta *meta)
+ return meta->kfunc_flags & KF_ACQUIRE;
+ }
+
+-static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
+-{
+- return meta->kfunc_flags & KF_RET_NULL;
+-}
+-
+ static bool is_kfunc_release(struct bpf_kfunc_call_arg_meta *meta)
+ {
+ return meta->kfunc_flags & KF_RELEASE;
+@@ -9639,6 +9922,16 @@ BTF_ID(func, bpf_dynptr_from_xdp)
+ BTF_ID(func, bpf_dynptr_slice)
+ BTF_ID(func, bpf_dynptr_slice_rdwr)
+
++static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
++{
++ if (meta->func_id == special_kfunc_list[KF_bpf_refcount_acquire_impl] &&
++ meta->arg_owning_ref) {
++ return false;
++ }
++
++ return meta->kfunc_flags & KF_RET_NULL;
++}
++
+ static bool is_kfunc_bpf_rcu_read_lock(struct bpf_kfunc_call_arg_meta *meta)
+ {
+ return meta->func_id == special_kfunc_list[KF_bpf_rcu_read_lock];
+@@ -10116,6 +10409,8 @@ __process_kf_arg_ptr_to_graph_node(struct bpf_verifier_env *env,
+ node_off, btf_name_by_offset(reg->btf, t->name_off));
+ return -EINVAL;
+ }
++ meta->arg_btf = reg->btf;
++ meta->arg_btf_id = reg->btf_id;
+
+ if (node_off != field->graph_root.node_offset) {
+ verbose(env, "arg#1 offset=%d, but expected %s at offset=%d in struct %s\n",
+@@ -10326,8 +10621,8 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
+ }
+ if (meta->btf == btf_vmlinux &&
+ meta->func_id == special_kfunc_list[KF_bpf_obj_drop_impl]) {
+- meta->arg_obj_drop.btf = reg->btf;
+- meta->arg_obj_drop.btf_id = reg->btf_id;
++ meta->arg_btf = reg->btf;
++ meta->arg_btf_id = reg->btf_id;
+ }
+ break;
+ case KF_ARG_PTR_TO_DYNPTR:
+@@ -10497,10 +10792,12 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
+ meta->subprogno = reg->subprogno;
+ break;
+ case KF_ARG_PTR_TO_REFCOUNTED_KPTR:
+- if (!type_is_ptr_alloc_obj(reg->type) && !type_is_non_owning_ref(reg->type)) {
++ if (!type_is_ptr_alloc_obj(reg->type)) {
+ verbose(env, "arg#%d is neither owning or non-owning ref\n", i);
+ return -EINVAL;
+ }
++ if (!type_is_non_owning_ref(reg->type))
++ meta->arg_owning_ref = true;
+
+ rec = reg_btf_record(reg);
+ if (!rec) {
+@@ -10516,8 +10813,8 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
+ verbose(env, "bpf_refcount_acquire calls are disabled for now\n");
+ return -EINVAL;
+ }
+- meta->arg_refcount_acquire.btf = reg->btf;
+- meta->arg_refcount_acquire.btf_id = reg->btf_id;
++ meta->arg_btf = reg->btf;
++ meta->arg_btf_id = reg->btf_id;
+ break;
+ }
+ }
+@@ -10663,6 +10960,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
+ meta.func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
+ release_ref_obj_id = regs[BPF_REG_2].ref_obj_id;
+ insn_aux->insert_off = regs[BPF_REG_2].off;
++ insn_aux->kptr_struct_meta = btf_find_struct_meta(meta.arg_btf, meta.arg_btf_id);
+ err = ref_convert_owning_non_owning(env, release_ref_obj_id);
+ if (err) {
+ verbose(env, "kfunc %s#%d conversion of owning ref to non-owning failed\n",
+@@ -10749,12 +11047,12 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
+ } else if (meta.func_id == special_kfunc_list[KF_bpf_refcount_acquire_impl]) {
+ mark_reg_known_zero(env, regs, BPF_REG_0);
+ regs[BPF_REG_0].type = PTR_TO_BTF_ID | MEM_ALLOC;
+- regs[BPF_REG_0].btf = meta.arg_refcount_acquire.btf;
+- regs[BPF_REG_0].btf_id = meta.arg_refcount_acquire.btf_id;
++ regs[BPF_REG_0].btf = meta.arg_btf;
++ regs[BPF_REG_0].btf_id = meta.arg_btf_id;
+
+ insn_aux->kptr_struct_meta =
+- btf_find_struct_meta(meta.arg_refcount_acquire.btf,
+- meta.arg_refcount_acquire.btf_id);
++ btf_find_struct_meta(meta.arg_btf,
++ meta.arg_btf_id);
+ } else if (meta.func_id == special_kfunc_list[KF_bpf_list_pop_front] ||
+ meta.func_id == special_kfunc_list[KF_bpf_list_pop_back]) {
+ struct btf_field *field = meta.arg_list_head.field;
+@@ -10884,8 +11182,8 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
+ if (meta.btf == btf_vmlinux && btf_id_set_contains(&special_kfunc_set, meta.func_id)) {
+ if (meta.func_id == special_kfunc_list[KF_bpf_obj_drop_impl]) {
+ insn_aux->kptr_struct_meta =
+- btf_find_struct_meta(meta.arg_obj_drop.btf,
+- meta.arg_obj_drop.btf_id);
++ btf_find_struct_meta(meta.arg_btf,
++ meta.arg_btf_id);
+ }
+ }
+ }
+@@ -12420,12 +12718,14 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
+ if (BPF_SRC(insn->code) == BPF_X) {
+ struct bpf_reg_state *src_reg = regs + insn->src_reg;
+ struct bpf_reg_state *dst_reg = regs + insn->dst_reg;
++ bool need_id = src_reg->type == SCALAR_VALUE && !src_reg->id &&
++ !tnum_is_const(src_reg->var_off);
+
+ if (BPF_CLASS(insn->code) == BPF_ALU64) {
+ /* case: R1 = R2
+ * copy register state to dest reg
+ */
+- if (src_reg->type == SCALAR_VALUE && !src_reg->id)
++ if (need_id)
+ /* Assign src and dst registers the same ID
+ * that will be used by find_equal_scalars()
+ * to propagate min/max range.
+@@ -12444,7 +12744,7 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
+ } else if (src_reg->type == SCALAR_VALUE) {
+ bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;
+
+- if (is_src_reg_u32 && !src_reg->id)
++ if (is_src_reg_u32 && need_id)
+ src_reg->id = ++env->id_gen;
+ copy_register_state(dst_reg, src_reg);
+ /* Make sure ID is cleared if src_reg is not in u32 range otherwise
+@@ -14600,8 +14900,9 @@ static bool range_within(struct bpf_reg_state *old,
+ * So we look through our idmap to see if this old id has been seen before. If
+ * so, we require the new id to match; otherwise, we add the id pair to the map.
+ */
+-static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
++static bool check_ids(u32 old_id, u32 cur_id, struct bpf_idmap *idmap)
+ {
++ struct bpf_id_pair *map = idmap->map;
+ unsigned int i;
+
+ /* either both IDs should be set or both should be zero */
+@@ -14612,20 +14913,34 @@ static bool check_ids(u32 old_id, u32 cur_id, struct bpf_id_pair *idmap)
+ return true;
+
+ for (i = 0; i < BPF_ID_MAP_SIZE; i++) {
+- if (!idmap[i].old) {
++ if (!map[i].old) {
+ /* Reached an empty slot; haven't seen this id before */
+- idmap[i].old = old_id;
+- idmap[i].cur = cur_id;
++ map[i].old = old_id;
++ map[i].cur = cur_id;
+ return true;
+ }
+- if (idmap[i].old == old_id)
+- return idmap[i].cur == cur_id;
++ if (map[i].old == old_id)
++ return map[i].cur == cur_id;
++ if (map[i].cur == cur_id)
++ return false;
+ }
+ /* We ran out of idmap slots, which should be impossible */
+ WARN_ON_ONCE(1);
+ return false;
+ }
+
++/* Similar to check_ids(), but allocate a unique temporary ID
++ * for 'old_id' or 'cur_id' of zero.
++ * This makes pairs like '0 vs unique ID', 'unique ID vs 0' valid.
++ */
++static bool check_scalar_ids(u32 old_id, u32 cur_id, struct bpf_idmap *idmap)
++{
++ old_id = old_id ? old_id : ++idmap->tmp_id_gen;
++ cur_id = cur_id ? cur_id : ++idmap->tmp_id_gen;
++
++ return check_ids(old_id, cur_id, idmap);
++}
++
+ static void clean_func_state(struct bpf_verifier_env *env,
+ struct bpf_func_state *st)
+ {
+@@ -14724,7 +15039,7 @@ next:
+
+ static bool regs_exact(const struct bpf_reg_state *rold,
+ const struct bpf_reg_state *rcur,
+- struct bpf_id_pair *idmap)
++ struct bpf_idmap *idmap)
+ {
+ return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
+ check_ids(rold->id, rcur->id, idmap) &&
+@@ -14733,7 +15048,7 @@ static bool regs_exact(const struct bpf_reg_state *rold,
+
+ /* Returns true if (rold safe implies rcur safe) */
+ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
+- struct bpf_reg_state *rcur, struct bpf_id_pair *idmap)
++ struct bpf_reg_state *rcur, struct bpf_idmap *idmap)
+ {
+ if (!(rold->live & REG_LIVE_READ))
+ /* explored state didn't use this */
+@@ -14770,15 +15085,42 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
+
+ switch (base_type(rold->type)) {
+ case SCALAR_VALUE:
+- if (regs_exact(rold, rcur, idmap))
+- return true;
+- if (env->explore_alu_limits)
+- return false;
++ if (env->explore_alu_limits) {
++ /* explore_alu_limits disables tnum_in() and range_within()
++ * logic and requires everything to be strict
++ */
++ return memcmp(rold, rcur, offsetof(struct bpf_reg_state, id)) == 0 &&
++ check_scalar_ids(rold->id, rcur->id, idmap);
++ }
+ if (!rold->precise)
+ return true;
+- /* new val must satisfy old val knowledge */
++ /* Why check_ids() for scalar registers?
++ *
++ * Consider the following BPF code:
++ * 1: r6 = ... unbound scalar, ID=a ...
++ * 2: r7 = ... unbound scalar, ID=b ...
++ * 3: if (r6 > r7) goto +1
++ * 4: r6 = r7
++ * 5: if (r6 > X) goto ...
++ * 6: ... memory operation using r7 ...
++ *
++ * First verification path is [1-6]:
++ * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7;
++ * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark
++ * r7 <= X, because r6 and r7 share same id.
++ * Next verification path is [1-4, 6].
++ *
++ * Instruction (6) would be reached in two states:
++ * I. r6{.id=b}, r7{.id=b} via path 1-6;
++ * II. r6{.id=a}, r7{.id=b} via path 1-4, 6.
++ *
++ * Use check_ids() to distinguish these states.
++ * ---
++ * Also verify that new value satisfies old value range knowledge.
++ */
+ return range_within(rold, rcur) &&
+- tnum_in(rold->var_off, rcur->var_off);
++ tnum_in(rold->var_off, rcur->var_off) &&
++ check_scalar_ids(rold->id, rcur->id, idmap);
+ case PTR_TO_MAP_KEY:
+ case PTR_TO_MAP_VALUE:
+ case PTR_TO_MEM:
+@@ -14824,7 +15166,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
+ }
+
+ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
+- struct bpf_func_state *cur, struct bpf_id_pair *idmap)
++ struct bpf_func_state *cur, struct bpf_idmap *idmap)
+ {
+ int i, spi;
+
+@@ -14927,7 +15269,7 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
+ }
+
+ static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
+- struct bpf_id_pair *idmap)
++ struct bpf_idmap *idmap)
+ {
+ int i;
+
+@@ -14975,13 +15317,13 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
+
+ for (i = 0; i < MAX_BPF_REG; i++)
+ if (!regsafe(env, &old->regs[i], &cur->regs[i],
+- env->idmap_scratch))
++ &env->idmap_scratch))
+ return false;
+
+- if (!stacksafe(env, old, cur, env->idmap_scratch))
++ if (!stacksafe(env, old, cur, &env->idmap_scratch))
+ return false;
+
+- if (!refsafe(old, cur, env->idmap_scratch))
++ if (!refsafe(old, cur, &env->idmap_scratch))
+ return false;
+
+ return true;
+@@ -14996,7 +15338,8 @@ static bool states_equal(struct bpf_verifier_env *env,
+ if (old->curframe != cur->curframe)
+ return false;
+
+- memset(env->idmap_scratch, 0, sizeof(env->idmap_scratch));
++ env->idmap_scratch.tmp_id_gen = env->id_gen;
++ memset(&env->idmap_scratch.map, 0, sizeof(env->idmap_scratch.map));
+
+ /* Verification state from speculative execution simulation
+ * must never prune a non-speculative execution one.
+@@ -15014,7 +15357,7 @@ static bool states_equal(struct bpf_verifier_env *env,
+ return false;
+
+ if (old->active_lock.id &&
+- !check_ids(old->active_lock.id, cur->active_lock.id, env->idmap_scratch))
++ !check_ids(old->active_lock.id, cur->active_lock.id, &env->idmap_scratch))
+ return false;
+
+ if (old->active_rcu_lock != cur->active_rcu_lock)
+@@ -15121,20 +15464,25 @@ static int propagate_precision(struct bpf_verifier_env *env,
+ struct bpf_reg_state *state_reg;
+ struct bpf_func_state *state;
+ int i, err = 0, fr;
++ bool first;
+
+ for (fr = old->curframe; fr >= 0; fr--) {
+ state = old->frame[fr];
+ state_reg = state->regs;
++ first = true;
+ for (i = 0; i < BPF_REG_FP; i++, state_reg++) {
+ if (state_reg->type != SCALAR_VALUE ||
+ !state_reg->precise ||
+ !(state_reg->live & REG_LIVE_READ))
+ continue;
+- if (env->log.level & BPF_LOG_LEVEL2)
+- verbose(env, "frame %d: propagating r%d\n", fr, i);
+- err = mark_chain_precision_frame(env, fr, i);
+- if (err < 0)
+- return err;
++ if (env->log.level & BPF_LOG_LEVEL2) {
++ if (first)
++ verbose(env, "frame %d: propagating r%d", fr, i);
++ else
++ verbose(env, ",r%d", i);
++ }
++ bt_set_frame_reg(&env->bt, fr, i);
++ first = false;
+ }
+
+ for (i = 0; i < state->allocated_stack / BPF_REG_SIZE; i++) {
+@@ -15145,14 +15493,24 @@ static int propagate_precision(struct bpf_verifier_env *env,
+ !state_reg->precise ||
+ !(state_reg->live & REG_LIVE_READ))
+ continue;
+- if (env->log.level & BPF_LOG_LEVEL2)
+- verbose(env, "frame %d: propagating fp%d\n",
+- fr, (-i - 1) * BPF_REG_SIZE);
+- err = mark_chain_precision_stack_frame(env, fr, i);
+- if (err < 0)
+- return err;
++ if (env->log.level & BPF_LOG_LEVEL2) {
++ if (first)
++ verbose(env, "frame %d: propagating fp%d",
++ fr, (-i - 1) * BPF_REG_SIZE);
++ else
++ verbose(env, ",fp%d", (-i - 1) * BPF_REG_SIZE);
++ }
++ bt_set_frame_slot(&env->bt, fr, i);
++ first = false;
+ }
++ if (!first)
++ verbose(env, "\n");
+ }
++
++ err = mark_chain_precision_batch(env);
++ if (err < 0)
++ return err;
++
+ return 0;
+ }
+
+@@ -18812,6 +19170,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
+ if (!env)
+ return -ENOMEM;
+
++ env->bt.env = env;
++
+ len = (*prog)->len;
+ env->insn_aux_data =
+ vzalloc(array_size(sizeof(struct bpf_insn_aux_data), len));
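
As an aside for readers following the verifier hunks above: the scalar ID
matching performed by the new check_ids()/check_scalar_ids() pair can be
illustrated with a small standalone C program. It is only a sketch (struct
idmap, ID_MAP_SIZE and the main() scenario are simplified stand-ins, not the
verifier's own types), but it applies the same rules as the hunks above: a
consistent old to cur mapping, no many-to-one pairings, and temporary IDs for
scalars whose id is zero.

#include <stdbool.h>
#include <stdio.h>

#define ID_MAP_SIZE 64

struct id_pair { unsigned int old, cur; };

struct idmap {
        struct id_pair map[ID_MAP_SIZE];
        unsigned int tmp_id_gen;        /* fresh IDs for id == 0 scalars */
};

/* Require a consistent old->cur mapping and reject many-to-one pairings. */
static bool check_ids(unsigned int old_id, unsigned int cur_id,
                      struct idmap *idmap)
{
        unsigned int i;

        if (!!old_id != !!cur_id)       /* both set or both zero */
                return false;
        if (old_id == 0)
                return true;

        for (i = 0; i < ID_MAP_SIZE; i++) {
                if (!idmap->map[i].old) {
                        idmap->map[i].old = old_id;
                        idmap->map[i].cur = cur_id;
                        return true;
                }
                if (idmap->map[i].old == old_id)
                        return idmap->map[i].cur == cur_id;
                if (idmap->map[i].cur == cur_id)
                        return false;
        }
        return false;                   /* out of slots */
}

/* Zero IDs get a unique temporary ID, so '0 vs unique ID' pairs are valid. */
static bool check_scalar_ids(unsigned int old_id, unsigned int cur_id,
                             struct idmap *idmap)
{
        old_id = old_id ? old_id : ++idmap->tmp_id_gen;
        cur_id = cur_id ? cur_id : ++idmap->tmp_id_gen;
        return check_ids(old_id, cur_id, idmap);
}

int main(void)
{
        struct idmap m = { .tmp_id_gen = 100 };

        printf("%d\n", check_scalar_ids(1, 2, &m)); /* 1: maps old 1 -> cur 2 */
        printf("%d\n", check_scalar_ids(1, 3, &m)); /* 0: old 1 already mapped */
        return 0;
}
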
+diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c
+index 5a60cc52adc0c..8a7baf4e332e3 100644
+--- a/kernel/kcsan/core.c
++++ b/kernel/kcsan/core.c
+@@ -1270,7 +1270,9 @@ static __always_inline void kcsan_atomic_builtin_memorder(int memorder)
+ DEFINE_TSAN_ATOMIC_OPS(8);
+ DEFINE_TSAN_ATOMIC_OPS(16);
+ DEFINE_TSAN_ATOMIC_OPS(32);
++#ifdef CONFIG_64BIT
+ DEFINE_TSAN_ATOMIC_OPS(64);
++#endif
+
+ void __tsan_atomic_thread_fence(int memorder);
+ void __tsan_atomic_thread_fence(int memorder)
+diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
+index 3d578c6fefee3..22acee18195a5 100644
+--- a/kernel/kexec_core.c
++++ b/kernel/kexec_core.c
+@@ -1122,6 +1122,7 @@ int crash_shrink_memory(unsigned long new_size)
+ start = crashk_res.start;
+ end = crashk_res.end;
+ old_size = (end == 0) ? 0 : end - start + 1;
++ new_size = roundup(new_size, KEXEC_CRASH_MEM_ALIGN);
+ if (new_size >= old_size) {
+ ret = (new_size == old_size) ? 0 : -EINVAL;
+ goto unlock;
+@@ -1133,9 +1134,7 @@ int crash_shrink_memory(unsigned long new_size)
+ goto unlock;
+ }
+
+- start = roundup(start, KEXEC_CRASH_MEM_ALIGN);
+- end = roundup(start + new_size, KEXEC_CRASH_MEM_ALIGN);
+-
++ end = start + new_size;
+ crash_free_reserved_phys_range(end, crashk_res.end);
+
+ if ((start == end) && (crashk_res.parent != NULL))
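
The crash_shrink_memory() change above rounds the requested size up once and
derives the new region end from the unchanged start. A tiny standalone
illustration follows; 4096 is only a stand-in for KEXEC_CRASH_MEM_ALIGN, and
round_up_to() is the generic roundup formula.

#include <stdio.h>

/* Round x up to the next multiple of align (the kernel's roundup() formula). */
static unsigned long round_up_to(unsigned long x, unsigned long align)
{
        return ((x + align - 1) / align) * align;
}

int main(void)
{
        unsigned long start = 0x1000000UL;  /* crashk_res.start stays fixed */
        unsigned long new_size = 5000;      /* requested size, not yet aligned */
        unsigned long align = 4096;         /* stand-in for KEXEC_CRASH_MEM_ALIGN */

        new_size = round_up_to(new_size, align);
        printf("aligned size %lu, region [%#lx, %#lx)\n",
               new_size, start, start + new_size);
        return 0;
}
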
+diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
+index 4a1b9622598be..98c1544cf572b 100644
+--- a/kernel/rcu/rcu.h
++++ b/kernel/rcu/rcu.h
+@@ -642,4 +642,10 @@ void show_rcu_tasks_trace_gp_kthread(void);
+ static inline void show_rcu_tasks_trace_gp_kthread(void) {}
+ #endif
+
++#ifdef CONFIG_TINY_RCU
++static inline bool rcu_cpu_beenfullyonline(int cpu) { return true; }
++#else
++bool rcu_cpu_beenfullyonline(int cpu);
++#endif
++
+ #endif /* __LINUX_RCU_H */
+diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
+index e82ec9f9a5d80..d1221731c7cfd 100644
+--- a/kernel/rcu/rcuscale.c
++++ b/kernel/rcu/rcuscale.c
+@@ -522,89 +522,6 @@ rcu_scale_print_module_parms(struct rcu_scale_ops *cur_ops, const char *tag)
+ scale_type, tag, nrealreaders, nrealwriters, verbose, shutdown);
+ }
+
+-static void
+-rcu_scale_cleanup(void)
+-{
+- int i;
+- int j;
+- int ngps = 0;
+- u64 *wdp;
+- u64 *wdpp;
+-
+- /*
+- * Would like warning at start, but everything is expedited
+- * during the mid-boot phase, so have to wait till the end.
+- */
+- if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
+- SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
+- if (rcu_gp_is_normal() && gp_exp)
+- SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
+- if (gp_exp && gp_async)
+- SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
+-
+- if (torture_cleanup_begin())
+- return;
+- if (!cur_ops) {
+- torture_cleanup_end();
+- return;
+- }
+-
+- if (reader_tasks) {
+- for (i = 0; i < nrealreaders; i++)
+- torture_stop_kthread(rcu_scale_reader,
+- reader_tasks[i]);
+- kfree(reader_tasks);
+- }
+-
+- if (writer_tasks) {
+- for (i = 0; i < nrealwriters; i++) {
+- torture_stop_kthread(rcu_scale_writer,
+- writer_tasks[i]);
+- if (!writer_n_durations)
+- continue;
+- j = writer_n_durations[i];
+- pr_alert("%s%s writer %d gps: %d\n",
+- scale_type, SCALE_FLAG, i, j);
+- ngps += j;
+- }
+- pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
+- scale_type, SCALE_FLAG,
+- t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
+- t_rcu_scale_writer_finished -
+- t_rcu_scale_writer_started,
+- ngps,
+- rcuscale_seq_diff(b_rcu_gp_test_finished,
+- b_rcu_gp_test_started));
+- for (i = 0; i < nrealwriters; i++) {
+- if (!writer_durations)
+- break;
+- if (!writer_n_durations)
+- continue;
+- wdpp = writer_durations[i];
+- if (!wdpp)
+- continue;
+- for (j = 0; j < writer_n_durations[i]; j++) {
+- wdp = &wdpp[j];
+- pr_alert("%s%s %4d writer-duration: %5d %llu\n",
+- scale_type, SCALE_FLAG,
+- i, j, *wdp);
+- if (j % 100 == 0)
+- schedule_timeout_uninterruptible(1);
+- }
+- kfree(writer_durations[i]);
+- }
+- kfree(writer_tasks);
+- kfree(writer_durations);
+- kfree(writer_n_durations);
+- }
+-
+- /* Do torture-type-specific cleanup operations. */
+- if (cur_ops->cleanup != NULL)
+- cur_ops->cleanup();
+-
+- torture_cleanup_end();
+-}
+-
+ /*
+ * Return the number if non-negative. If -1, the number of CPUs.
+ * If less than -1, that much less than the number of CPUs, but
+@@ -624,20 +541,6 @@ static int compute_real(int n)
+ return nr;
+ }
+
+-/*
+- * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
+- * down system.
+- */
+-static int
+-rcu_scale_shutdown(void *arg)
+-{
+- wait_event_idle(shutdown_wq, atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
+- smp_mb(); /* Wake before output. */
+- rcu_scale_cleanup();
+- kernel_power_off();
+- return -EINVAL;
+-}
+-
+ /*
+ * kfree_rcu() scalability tests: Start a kfree_rcu() loop on all CPUs for number
+ * of iterations and measure total time and number of GP for all iterations to complete.
+@@ -874,6 +777,108 @@ unwind:
+ return firsterr;
+ }
+
++static void
++rcu_scale_cleanup(void)
++{
++ int i;
++ int j;
++ int ngps = 0;
++ u64 *wdp;
++ u64 *wdpp;
++
++ /*
++ * Would like warning at start, but everything is expedited
++ * during the mid-boot phase, so have to wait till the end.
++ */
++ if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp)
++ SCALEOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
++ if (rcu_gp_is_normal() && gp_exp)
++ SCALEOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
++ if (gp_exp && gp_async)
++ SCALEOUT_ERRSTRING("No expedited async GPs, so went with async!");
++
++ if (kfree_rcu_test) {
++ kfree_scale_cleanup();
++ return;
++ }
++
++ if (torture_cleanup_begin())
++ return;
++ if (!cur_ops) {
++ torture_cleanup_end();
++ return;
++ }
++
++ if (reader_tasks) {
++ for (i = 0; i < nrealreaders; i++)
++ torture_stop_kthread(rcu_scale_reader,
++ reader_tasks[i]);
++ kfree(reader_tasks);
++ }
++
++ if (writer_tasks) {
++ for (i = 0; i < nrealwriters; i++) {
++ torture_stop_kthread(rcu_scale_writer,
++ writer_tasks[i]);
++ if (!writer_n_durations)
++ continue;
++ j = writer_n_durations[i];
++ pr_alert("%s%s writer %d gps: %d\n",
++ scale_type, SCALE_FLAG, i, j);
++ ngps += j;
++ }
++ pr_alert("%s%s start: %llu end: %llu duration: %llu gps: %d batches: %ld\n",
++ scale_type, SCALE_FLAG,
++ t_rcu_scale_writer_started, t_rcu_scale_writer_finished,
++ t_rcu_scale_writer_finished -
++ t_rcu_scale_writer_started,
++ ngps,
++ rcuscale_seq_diff(b_rcu_gp_test_finished,
++ b_rcu_gp_test_started));
++ for (i = 0; i < nrealwriters; i++) {
++ if (!writer_durations)
++ break;
++ if (!writer_n_durations)
++ continue;
++ wdpp = writer_durations[i];
++ if (!wdpp)
++ continue;
++ for (j = 0; j < writer_n_durations[i]; j++) {
++ wdp = &wdpp[j];
++ pr_alert("%s%s %4d writer-duration: %5d %llu\n",
++ scale_type, SCALE_FLAG,
++ i, j, *wdp);
++ if (j % 100 == 0)
++ schedule_timeout_uninterruptible(1);
++ }
++ kfree(writer_durations[i]);
++ }
++ kfree(writer_tasks);
++ kfree(writer_durations);
++ kfree(writer_n_durations);
++ }
++
++ /* Do torture-type-specific cleanup operations. */
++ if (cur_ops->cleanup != NULL)
++ cur_ops->cleanup();
++
++ torture_cleanup_end();
++}
++
++/*
++ * RCU scalability shutdown kthread. Just waits to be awakened, then shuts
++ * down system.
++ */
++static int
++rcu_scale_shutdown(void *arg)
++{
++ wait_event_idle(shutdown_wq, atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters);
++ smp_mb(); /* Wake before output. */
++ rcu_scale_cleanup();
++ kernel_power_off();
++ return -EINVAL;
++}
++
+ static int __init
+ rcu_scale_init(void)
+ {
+diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
+index 5f4fc8184dd0b..8f08c087142b0 100644
+--- a/kernel/rcu/tasks.h
++++ b/kernel/rcu/tasks.h
+@@ -463,6 +463,7 @@ static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu
+ {
+ int cpu;
+ int cpunext;
++ int cpuwq;
+ unsigned long flags;
+ int len;
+ struct rcu_head *rhp;
+@@ -473,11 +474,13 @@ static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu
+ cpunext = cpu * 2 + 1;
+ if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
+ rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext);
+- queue_work_on(cpunext, system_wq, &rtpcp_next->rtp_work);
++ cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND;
++ queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
+ cpunext++;
+ if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
+ rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext);
+- queue_work_on(cpunext, system_wq, &rtpcp_next->rtp_work);
++ cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND;
++ queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
+ }
+ }
+
+diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
+index f52ff72410416..ce51f85f0d5e4 100644
+--- a/kernel/rcu/tree.c
++++ b/kernel/rcu/tree.c
+@@ -4283,7 +4283,6 @@ int rcutree_prepare_cpu(unsigned int cpu)
+ */
+ rnp = rdp->mynode;
+ raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
+- rdp->beenonline = true; /* We have now been online. */
+ rdp->gp_seq = READ_ONCE(rnp->gp_seq);
+ rdp->gp_seq_needed = rdp->gp_seq;
+ rdp->cpu_no_qs.b.norm = true;
+@@ -4310,6 +4309,16 @@ static void rcutree_affinity_setting(unsigned int cpu, int outgoing)
+ rcu_boost_kthread_setaffinity(rdp->mynode, outgoing);
+ }
+
++/*
++ * Has the specified (known valid) CPU ever been fully online?
++ */
++bool rcu_cpu_beenfullyonline(int cpu)
++{
++ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
++
++ return smp_load_acquire(&rdp->beenonline);
++}
++
+ /*
+ * Near the end of the CPU-online process. Pretty much all services
+ * enabled, and the CPU is now very much alive.
+@@ -4368,15 +4377,16 @@ int rcutree_offline_cpu(unsigned int cpu)
+ * Note that this function is special in that it is invoked directly
+ * from the incoming CPU rather than from the cpuhp_step mechanism.
+ * This is because this function must be invoked at a precise location.
++ * This incoming CPU must not have enabled interrupts yet.
+ */
+ void rcu_cpu_starting(unsigned int cpu)
+ {
+- unsigned long flags;
+ unsigned long mask;
+ struct rcu_data *rdp;
+ struct rcu_node *rnp;
+ bool newcpu;
+
++ lockdep_assert_irqs_disabled();
+ rdp = per_cpu_ptr(&rcu_data, cpu);
+ if (rdp->cpu_started)
+ return;
+@@ -4384,7 +4394,6 @@ void rcu_cpu_starting(unsigned int cpu)
+
+ rnp = rdp->mynode;
+ mask = rdp->grpmask;
+- local_irq_save(flags);
+ arch_spin_lock(&rcu_state.ofl_lock);
+ rcu_dynticks_eqs_online();
+ raw_spin_lock(&rcu_state.barrier_lock);
+@@ -4403,17 +4412,17 @@ void rcu_cpu_starting(unsigned int cpu)
+ /* An incoming CPU should never be blocking a grace period. */
+ if (WARN_ON_ONCE(rnp->qsmask & mask)) { /* RCU waiting on incoming CPU? */
+ /* rcu_report_qs_rnp() *really* wants some flags to restore */
+- unsigned long flags2;
++ unsigned long flags;
+
+- local_irq_save(flags2);
++ local_irq_save(flags);
+ rcu_disable_urgency_upon_qs(rdp);
+ /* Report QS -after- changing ->qsmaskinitnext! */
+- rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags2);
++ rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
+ } else {
+ raw_spin_unlock_rcu_node(rnp);
+ }
+ arch_spin_unlock(&rcu_state.ofl_lock);
+- local_irq_restore(flags);
++ smp_store_release(&rdp->beenonline, true);
+ smp_mb(); /* Ensure RCU read-side usage follows above initialization. */
+ }
+
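
rcu_cpu_beenfullyonline() above relies on an smp_store_release() and
smp_load_acquire() pairing: a reader that observes ->beenonline set is
guaranteed to also observe everything initialized before the store. Below is
a minimal userspace analogue using C11 atomics and pthreads (build with
-pthread); the names cpu_state and beenonline are invented for the example
and this is not the kernel code.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int cpu_state;              /* plain data written before publication */
static atomic_bool beenonline;     /* publication flag */

static void *bringup(void *arg)
{
        cpu_state = 42;                                   /* initialize ... */
        atomic_store_explicit(&beenonline, true,
                              memory_order_release);      /* ... then publish */
        return NULL;
}

static void *reader(void *arg)
{
        /* acquire load: once the flag is seen, cpu_state == 42 is also seen */
        while (!atomic_load_explicit(&beenonline, memory_order_acquire))
                ;
        printf("cpu_state = %d\n", cpu_state);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, bringup, NULL);
        pthread_create(&b, NULL, reader, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
}
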
+diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
+index 373ff5f558844..4da5f35417626 100644
+--- a/kernel/sched/fair.c
++++ b/kernel/sched/fair.c
+@@ -5576,6 +5576,14 @@ static void __cfsb_csd_unthrottle(void *arg)
+
+ rq_lock(rq, &rf);
+
++ /*
++ * Iterating over the list can trigger several calls to
++ * update_rq_clock() in unthrottle_cfs_rq().
++ * Do it once and skip the potential next ones.
++ */
++ update_rq_clock(rq);
++ rq_clock_start_loop_update(rq);
++
+ /*
+ * Since we hold rq lock we're safe from concurrent manipulation of
+ * the CSD list. However, this RCU critical section annotates the
+@@ -5595,6 +5603,7 @@ static void __cfsb_csd_unthrottle(void *arg)
+
+ rcu_read_unlock();
+
++ rq_clock_stop_loop_update(rq);
+ rq_unlock(rq, &rf);
+ }
+
+@@ -6115,6 +6124,13 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
+
+ lockdep_assert_rq_held(rq);
+
++ /*
++ * The rq clock has already been updated in the
++ * set_rq_offline(), so we should skip updating
++ * the rq clock again in unthrottle_cfs_rq().
++ */
++ rq_clock_start_loop_update(rq);
++
+ rcu_read_lock();
+ list_for_each_entry_rcu(tg, &task_groups, list) {
+ struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
+@@ -6137,6 +6153,8 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
+ unthrottle_cfs_rq(cfs_rq);
+ }
+ rcu_read_unlock();
++
++ rq_clock_stop_loop_update(rq);
+ }
+
+ #else /* CONFIG_CFS_BANDWIDTH */
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index ec7b3e0a2b207..81ac605b9cd5c 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -1546,6 +1546,28 @@ static inline void rq_clock_cancel_skipupdate(struct rq *rq)
+ rq->clock_update_flags &= ~RQCF_REQ_SKIP;
+ }
+
++/*
++ * During cpu offlining and rq wide unthrottling, we can trigger
++ * an update_rq_clock() for several cfs and rt runqueues (typically
++ * when using list_for_each_entry_*).
++ * rq_clock_start_loop_update() can be called after updating the clock
++ * once and before iterating over the list to prevent multiple updates.
++ * After the iterative traversal, we need to call rq_clock_stop_loop_update()
++ * to clear RQCF_ACT_SKIP of rq->clock_update_flags.
++ */
++static inline void rq_clock_start_loop_update(struct rq *rq)
++{
++ lockdep_assert_rq_held(rq);
++ SCHED_WARN_ON(rq->clock_update_flags & RQCF_ACT_SKIP);
++ rq->clock_update_flags |= RQCF_ACT_SKIP;
++}
++
++static inline void rq_clock_stop_loop_update(struct rq *rq)
++{
++ lockdep_assert_rq_held(rq);
++ rq->clock_update_flags &= ~RQCF_ACT_SKIP;
++}
++
+ struct rq_flags {
+ unsigned long flags;
+ struct pin_cookie cookie;
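
The rq_clock_start_loop_update()/rq_clock_stop_loop_update() helpers above
boil down to an "update once, then skip inside the loop" flag. A standalone
sketch of that pattern follows; rq_sim, update_clock and friends are made-up
names, and the scheduler's lockdep and warning machinery is left out.

#include <stdbool.h>
#include <stdio.h>

struct rq_sim {
        unsigned long clock;
        bool skip_update;    /* stands in for RQCF_ACT_SKIP */
        int updates;         /* counts how often the expensive path ran */
};

static void update_clock(struct rq_sim *rq)
{
        if (rq->skip_update)        /* already updated for this loop */
                return;
        rq->clock++;                /* pretend this is expensive */
        rq->updates++;
}

static void start_loop_update(struct rq_sim *rq)
{
        rq->skip_update = true;     /* suppress per-item updates */
}

static void stop_loop_update(struct rq_sim *rq)
{
        rq->skip_update = false;
}

int main(void)
{
        struct rq_sim rq = { 0 };
        int i;

        update_clock(&rq);          /* update once before the loop ... */
        start_loop_update(&rq);     /* ... then skip inside it */
        for (i = 0; i < 100; i++)
                update_clock(&rq);  /* no-ops while the skip flag is set */
        stop_loop_update(&rq);

        printf("expensive updates: %d\n", rq.updates);  /* prints 1 */
        return 0;
}
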
+diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
+index 808a247205a9a..ed3c4a9543982 100644
+--- a/kernel/time/posix-timers.c
++++ b/kernel/time/posix-timers.c
+@@ -1037,27 +1037,52 @@ retry_delete:
+ }
+
+ /*
+- * return timer owned by the process, used by exit_itimers
++ * Delete a timer if it is armed, remove it from the hash and schedule it
++ * for RCU freeing.
+ */
+ static void itimer_delete(struct k_itimer *timer)
+ {
+-retry_delete:
+- spin_lock_irq(&timer->it_lock);
++ unsigned long flags;
++
++ /*
++ * irqsave is required to make timer_wait_running() work.
++ */
++ spin_lock_irqsave(&timer->it_lock, flags);
+
++retry_delete:
++ /*
++ * Even if the timer is no longer accessible from other tasks
++ * it still might be armed and queued in the underlying timer
++ * mechanism. Worse, that timer mechanism might run the expiry
++ * function concurrently.
++ */
+ if (timer_delete_hook(timer) == TIMER_RETRY) {
+- spin_unlock_irq(&timer->it_lock);
++ /*
++ * The timer may be expiring concurrently; prevent livelocks
++ * and pointless spinning on RT.
++ *
++ * timer_wait_running() drops timer::it_lock, which opens
++ * the possibility for another task to delete the timer.
++ *
++ * That's not possible here because this is invoked from
++ * do_exit() only for the last thread of the thread group.
++ * So no other task can access and delete that timer.
++ */
++ if (WARN_ON_ONCE(timer_wait_running(timer, &flags) != timer))
++ return;
++
+ goto retry_delete;
+ }
+ list_del(&timer->list);
+
+- spin_unlock_irq(&timer->it_lock);
++ spin_unlock_irqrestore(&timer->it_lock, flags);
+ release_posix_timer(timer, IT_ID_SET);
+ }
+
+ /*
+- * This is called by do_exit or de_thread, only when nobody else can
+- * modify the signal->posix_timers list. Yet we need sighand->siglock
+- * to prevent the race with /proc/pid/timers.
++ * Invoked from do_exit() when the last thread of a thread group exits.
++ * At that point no other task can access the timers of the dying
++ * task anymore.
+ */
+ void exit_itimers(struct task_struct *tsk)
+ {
+@@ -1067,10 +1092,12 @@ void exit_itimers(struct task_struct *tsk)
+ if (list_empty(&tsk->signal->posix_timers))
+ return;
+
++ /* Protect against concurrent read via /proc/$PID/timers */
+ spin_lock_irq(&tsk->sighand->siglock);
+ list_replace_init(&tsk->signal->posix_timers, &timers);
+ spin_unlock_irq(&tsk->sighand->siglock);
+
++ /* The timers are no longer accessible via tsk::signal */
+ while (!list_empty(&timers)) {
+ tmr = list_first_entry(&timers, struct k_itimer, list);
+ itimer_delete(tmr);
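
itimer_delete() above retries the delete and, when the expiry callback is
running concurrently, waits for it instead of spinning. The shape of that
retry-and-wait loop can be sketched with plain pthreads (build with
-pthread); sim_timer, try_cancel and expiry_callback are invented for the
example and only approximate the kernel's timer_delete_hook() and
timer_wait_running().

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

struct sim_timer {
        pthread_mutex_t lock;
        pthread_cond_t idle;    /* signalled when the callback finishes */
        bool running;           /* expiry callback currently executing */
        bool armed;
};

/* Mirrors the TIMER_RETRY case: fails while the callback is running. */
static bool try_cancel(struct sim_timer *t)
{
        if (t->running)
                return false;
        t->armed = false;
        return true;
}

static void *expiry_callback(void *arg)
{
        struct sim_timer *t = arg;

        pthread_mutex_lock(&t->lock);
        t->running = true;
        pthread_mutex_unlock(&t->lock);

        usleep(100 * 1000);              /* pretend the expiry does work */

        pthread_mutex_lock(&t->lock);
        t->running = false;
        pthread_cond_broadcast(&t->idle);
        pthread_mutex_unlock(&t->lock);
        return NULL;
}

static void delete_timer(struct sim_timer *t)
{
        pthread_mutex_lock(&t->lock);
        while (!try_cancel(t)) {
                /* wait instead of spinning; pthread_cond_wait() drops
                 * the lock while sleeping, like timer_wait_running() */
                while (t->running)
                        pthread_cond_wait(&t->idle, &t->lock);
        }
        pthread_mutex_unlock(&t->lock);
}

int main(void)
{
        struct sim_timer t = {
                .lock = PTHREAD_MUTEX_INITIALIZER,
                .idle = PTHREAD_COND_INITIALIZER,
                .armed = true,
        };
        pthread_t cb;

        pthread_create(&cb, NULL, expiry_callback, &t);
        usleep(10 * 1000);               /* let the callback start first */
        delete_timer(&t);
        printf("timer deleted, armed=%d\n", t.armed);
        pthread_join(cb, NULL);
        return 0;
}
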
+diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
+index 42c0be3080bde..4df14db4da490 100644
+--- a/kernel/time/tick-sched.c
++++ b/kernel/time/tick-sched.c
+@@ -1041,7 +1041,7 @@ static bool report_idle_softirq(void)
+ return false;
+ }
+
+- if (ratelimit < 10)
++ if (ratelimit >= 10)
+ return false;
+
+ /* On RT, softirqs handling may be waiting on some lock */
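
The tick-sched fix above inverts the rate-limit test so the warning is
emitted at most ten times and then suppressed. The corrected pattern, reduced
to a standalone sketch with the NOHZ and softirq context omitted:

#include <stdbool.h>
#include <stdio.h>

static bool report_idle_softirq(void)
{
        static int ratelimit;

        if (ratelimit >= 10)
                return false;        /* limit reached: stop reporting */

        ratelimit++;
        printf("NOHZ tick-stop error (report %d of 10)\n", ratelimit);
        return true;
}

int main(void)
{
        int i, reported = 0;

        for (i = 0; i < 50; i++)
                reported += report_idle_softirq();
        printf("reported %d times\n", reported);   /* prints 10 */
        return 0;
}
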
+diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
+index e91cb4c2833f1..d0b6b390ee423 100644
+--- a/kernel/watch_queue.c
++++ b/kernel/watch_queue.c
+@@ -42,7 +42,7 @@ MODULE_AUTHOR("Red Hat, Inc.");
+ static inline bool lock_wqueue(struct watch_queue *wqueue)
+ {
+ spin_lock_bh(&wqueue->lock);
+- if (unlikely(wqueue->defunct)) {
++ if (unlikely(!wqueue->pipe)) {
+ spin_unlock_bh(&wqueue->lock);
+ return false;
+ }
+@@ -104,9 +104,6 @@ static bool post_one_notification(struct watch_queue *wqueue,
+ unsigned int head, tail, mask, note, offset, len;
+ bool done = false;
+
+- if (!pipe)
+- return false;
+-
+ spin_lock_irq(&pipe->rd_wait.lock);
+
+ mask = pipe->ring_size - 1;
+@@ -603,8 +600,11 @@ void watch_queue_clear(struct watch_queue *wqueue)
+ rcu_read_lock();
+ spin_lock_bh(&wqueue->lock);
+
+- /* Prevent new notifications from being stored. */
+- wqueue->defunct = true;
++ /*
++ * This pipe can be freed by callers like free_pipe_info().
++ * Removing this reference also prevents new notifications.
++ */
++ wqueue->pipe = NULL;
+
+ while (!hlist_empty(&wqueue->watches)) {
+ watch = hlist_entry(wqueue->watches.first, struct watch, queue_node);
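
The watch_queue change above drops the separate "defunct" flag: clearing
wqueue->pipe under the lock both removes the reference and tells would-be
posters that the queue is dead. A compact userspace sketch of that idiom
follows; sim_pipe and sim_wqueue are hypothetical types, not the kernel
structures.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct sim_pipe { int dummy; };

struct sim_wqueue {
        pthread_mutex_t lock;
        struct sim_pipe *pipe;   /* NULL once the queue is being torn down */
};

static bool lock_wqueue(struct sim_wqueue *wq)
{
        pthread_mutex_lock(&wq->lock);
        if (!wq->pipe) {
                pthread_mutex_unlock(&wq->lock);
                return false;    /* dead queue: caller must not post */
        }
        return true;             /* locked, pipe valid until unlock */
}

static void wqueue_clear(struct sim_wqueue *wq)
{
        pthread_mutex_lock(&wq->lock);
        wq->pipe = NULL;         /* prevents new notifications from now on */
        pthread_mutex_unlock(&wq->lock);
}

int main(void)
{
        struct sim_pipe p;
        struct sim_wqueue wq = {
                .lock = PTHREAD_MUTEX_INITIALIZER,
                .pipe = &p,
        };

        if (lock_wqueue(&wq)) {
                printf("post allowed\n");
                pthread_mutex_unlock(&wq.lock);
        }
        wqueue_clear(&wq);
        printf("post allowed after clear? %d\n", lock_wqueue(&wq)); /* 0 */
        return 0;
}
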
+diff --git a/kernel/watchdog.c b/kernel/watchdog.c
+index 8e61f21e7e33e..6b1754e8b6e96 100644
+--- a/kernel/watchdog.c
++++ b/kernel/watchdog.c
+@@ -30,19 +30,17 @@
+ static DEFINE_MUTEX(watchdog_mutex);
+
+ #if defined(CONFIG_HARDLOCKUP_DETECTOR) || defined(CONFIG_HAVE_NMI_WATCHDOG)
+-# define WATCHDOG_DEFAULT (SOFT_WATCHDOG_ENABLED | NMI_WATCHDOG_ENABLED)
+-# define NMI_WATCHDOG_DEFAULT 1
++# define WATCHDOG_HARDLOCKUP_DEFAULT 1
+ #else
+-# define WATCHDOG_DEFAULT (SOFT_WATCHDOG_ENABLED)
+-# define NMI_WATCHDOG_DEFAULT 0
++# define WATCHDOG_HARDLOCKUP_DEFAULT 0
+ #endif
+
+ unsigned long __read_mostly watchdog_enabled;
+ int __read_mostly watchdog_user_enabled = 1;
+-int __read_mostly nmi_watchdog_user_enabled = NMI_WATCHDOG_DEFAULT;
+-int __read_mostly soft_watchdog_user_enabled = 1;
++static int __read_mostly watchdog_hardlockup_user_enabled = WATCHDOG_HARDLOCKUP_DEFAULT;
++static int __read_mostly watchdog_softlockup_user_enabled = 1;
+ int __read_mostly watchdog_thresh = 10;
+-static int __read_mostly nmi_watchdog_available;
++static int __read_mostly watchdog_hardlockup_available;
+
+ struct cpumask watchdog_cpumask __read_mostly;
+ unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
+@@ -68,7 +66,7 @@ unsigned int __read_mostly hardlockup_panic =
+ */
+ void __init hardlockup_detector_disable(void)
+ {
+- nmi_watchdog_user_enabled = 0;
++ watchdog_hardlockup_user_enabled = 0;
+ }
+
+ static int __init hardlockup_panic_setup(char *str)
+@@ -78,54 +76,131 @@ static int __init hardlockup_panic_setup(char *str)
+ else if (!strncmp(str, "nopanic", 7))
+ hardlockup_panic = 0;
+ else if (!strncmp(str, "0", 1))
+- nmi_watchdog_user_enabled = 0;
++ watchdog_hardlockup_user_enabled = 0;
+ else if (!strncmp(str, "1", 1))
+- nmi_watchdog_user_enabled = 1;
++ watchdog_hardlockup_user_enabled = 1;
+ return 1;
+ }
+ __setup("nmi_watchdog=", hardlockup_panic_setup);
+
+ #endif /* CONFIG_HARDLOCKUP_DETECTOR */
+
++#if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF)
++
++static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
++static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved);
++static DEFINE_PER_CPU(bool, hard_watchdog_warn);
++static unsigned long hardlockup_allcpu_dumped;
++
++static bool is_hardlockup(void)
++{
++ unsigned long hrint = __this_cpu_read(hrtimer_interrupts);
++
++ if (__this_cpu_read(hrtimer_interrupts_saved) == hrint)
++ return true;
++
++ __this_cpu_write(hrtimer_interrupts_saved, hrint);
++ return false;
++}
++
++static void watchdog_hardlockup_kick(void)
++{
++ __this_cpu_inc(hrtimer_interrupts);
++}
++
++void watchdog_hardlockup_check(struct pt_regs *regs)
++{
++ /* check for a hardlockup
++ * This is done by making sure our timer interrupt
++ * is incrementing. The timer interrupt should have
++ * fired multiple times before we overflow'd. If it hasn't
++ * then this is a good indication the cpu is stuck
++ */
++ if (is_hardlockup()) {
++ int this_cpu = smp_processor_id();
++
++ /* only print hardlockups once */
++ if (__this_cpu_read(hard_watchdog_warn) == true)
++ return;
++
++ pr_emerg("Watchdog detected hard LOCKUP on cpu %d\n",
++ this_cpu);
++ print_modules();
++ print_irqtrace_events(current);
++ if (regs)
++ show_regs(regs);
++ else
++ dump_stack();
++
++ /*
++ * Perform all-CPU dump only once to avoid multiple hardlockups
++ * generating interleaving traces
++ */
++ if (sysctl_hardlockup_all_cpu_backtrace &&
++ !test_and_set_bit(0, &hardlockup_allcpu_dumped))
++ trigger_allbutself_cpu_backtrace();
++
++ if (hardlockup_panic)
++ nmi_panic(regs, "Hard LOCKUP");
++
++ __this_cpu_write(hard_watchdog_warn, true);
++ return;
++ }
++
++ __this_cpu_write(hard_watchdog_warn, false);
++ return;
++}
++
++#else /* CONFIG_HARDLOCKUP_DETECTOR_PERF */
++
++static inline void watchdog_hardlockup_kick(void) { }
++
++#endif /* !CONFIG_HARDLOCKUP_DETECTOR_PERF */
++
+ /*
+ * These functions can be overridden if an architecture implements its
+ * own hardlockup detector.
+ *
+- * watchdog_nmi_enable/disable can be implemented to start and stop when
++ * watchdog_hardlockup_enable/disable can be implemented to start and stop when
+ * softlockup watchdog start and stop. The arch must select the
+ * SOFTLOCKUP_DETECTOR Kconfig.
+ */
+-int __weak watchdog_nmi_enable(unsigned int cpu)
++void __weak watchdog_hardlockup_enable(unsigned int cpu)
+ {
+ hardlockup_detector_perf_enable();
+- return 0;
+ }
+
+-void __weak watchdog_nmi_disable(unsigned int cpu)
++void __weak watchdog_hardlockup_disable(unsigned int cpu)
+ {
+ hardlockup_detector_perf_disable();
+ }
+
+-/* Return 0, if a NMI watchdog is available. Error code otherwise */
+-int __weak __init watchdog_nmi_probe(void)
++/*
++ * Watchdog-detector specific API.
++ *
++ * Return 0 when hardlockup watchdog is available, negative value otherwise.
++ * Note that the negative value means that a delayed probe might
++ * succeed later.
++ */
++int __weak __init watchdog_hardlockup_probe(void)
+ {
+ return hardlockup_detector_perf_init();
+ }
+
+ /**
+- * watchdog_nmi_stop - Stop the watchdog for reconfiguration
++ * watchdog_hardlockup_stop - Stop the watchdog for reconfiguration
+ *
+ * The reconfiguration steps are:
+- * watchdog_nmi_stop();
++ * watchdog_hardlockup_stop();
+ * update_variables();
+- * watchdog_nmi_start();
++ * watchdog_hardlockup_start();
+ */
+-void __weak watchdog_nmi_stop(void) { }
++void __weak watchdog_hardlockup_stop(void) { }
+
+ /**
+- * watchdog_nmi_start - Start the watchdog after reconfiguration
++ * watchdog_hardlockup_start - Start the watchdog after reconfiguration
+ *
+- * Counterpart to watchdog_nmi_stop().
++ * Counterpart to watchdog_hardlockup_stop().
+ *
+ * The following variables have been updated in update_variables() and
+ * contain the currently valid configuration:
+@@ -133,23 +208,23 @@ void __weak watchdog_nmi_stop(void) { }
+ * - watchdog_thresh
+ * - watchdog_cpumask
+ */
+-void __weak watchdog_nmi_start(void) { }
++void __weak watchdog_hardlockup_start(void) { }
+
+ /**
+ * lockup_detector_update_enable - Update the sysctl enable bit
+ *
+- * Caller needs to make sure that the NMI/perf watchdogs are off, so this
+- * can't race with watchdog_nmi_disable().
++ * Caller needs to make sure that the hard watchdogs are off, so this
++ * can't race with watchdog_hardlockup_disable().
+ */
+ static void lockup_detector_update_enable(void)
+ {
+ watchdog_enabled = 0;
+ if (!watchdog_user_enabled)
+ return;
+- if (nmi_watchdog_available && nmi_watchdog_user_enabled)
+- watchdog_enabled |= NMI_WATCHDOG_ENABLED;
+- if (soft_watchdog_user_enabled)
+- watchdog_enabled |= SOFT_WATCHDOG_ENABLED;
++ if (watchdog_hardlockup_available && watchdog_hardlockup_user_enabled)
++ watchdog_enabled |= WATCHDOG_HARDLOCKUP_ENABLED;
++ if (watchdog_softlockup_user_enabled)
++ watchdog_enabled |= WATCHDOG_SOFTOCKUP_ENABLED;
+ }
+
+ #ifdef CONFIG_SOFTLOCKUP_DETECTOR
+@@ -179,8 +254,6 @@ static DEFINE_PER_CPU(unsigned long, watchdog_touch_ts);
+ static DEFINE_PER_CPU(unsigned long, watchdog_report_ts);
+ static DEFINE_PER_CPU(struct hrtimer, watchdog_hrtimer);
+ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
+-static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
+-static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved);
+ static unsigned long soft_lockup_nmi_warn;
+
+ static int __init nowatchdog_setup(char *str)
+@@ -192,7 +265,7 @@ __setup("nowatchdog", nowatchdog_setup);
+
+ static int __init nosoftlockup_setup(char *str)
+ {
+- soft_watchdog_user_enabled = 0;
++ watchdog_softlockup_user_enabled = 0;
+ return 1;
+ }
+ __setup("nosoftlockup", nosoftlockup_setup);
+@@ -306,7 +379,7 @@ static int is_softlockup(unsigned long touch_ts,
+ unsigned long period_ts,
+ unsigned long now)
+ {
+- if ((watchdog_enabled & SOFT_WATCHDOG_ENABLED) && watchdog_thresh){
++ if ((watchdog_enabled & WATCHDOG_SOFTOCKUP_ENABLED) && watchdog_thresh) {
+ /* Warn about unreasonable delays. */
+ if (time_after(now, period_ts + get_softlockup_thresh()))
+ return now - touch_ts;
+@@ -315,22 +388,6 @@ static int is_softlockup(unsigned long touch_ts,
+ }
+
+ /* watchdog detector functions */
+-bool is_hardlockup(void)
+-{
+- unsigned long hrint = __this_cpu_read(hrtimer_interrupts);
+-
+- if (__this_cpu_read(hrtimer_interrupts_saved) == hrint)
+- return true;
+-
+- __this_cpu_write(hrtimer_interrupts_saved, hrint);
+- return false;
+-}
+-
+-static void watchdog_interrupt_count(void)
+-{
+- __this_cpu_inc(hrtimer_interrupts);
+-}
+-
+ static DEFINE_PER_CPU(struct completion, softlockup_completion);
+ static DEFINE_PER_CPU(struct cpu_stop_work, softlockup_stop_work);
+
+@@ -361,8 +418,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
+ if (!watchdog_enabled)
+ return HRTIMER_NORESTART;
+
+- /* kick the hardlockup detector */
+- watchdog_interrupt_count();
++ watchdog_hardlockup_kick();
+
+ /* kick the softlockup detector */
+ if (completion_done(this_cpu_ptr(&softlockup_completion))) {
+@@ -458,7 +514,7 @@ static void watchdog_enable(unsigned int cpu)
+ complete(done);
+
+ /*
+- * Start the timer first to prevent the NMI watchdog triggering
++ * Start the timer first to prevent the hardlockup watchdog triggering
+ * before the timer has a chance to fire.
+ */
+ hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
+@@ -468,9 +524,9 @@ static void watchdog_enable(unsigned int cpu)
+
+ /* Initialize timestamp */
+ update_touch_ts();
+- /* Enable the perf event */
+- if (watchdog_enabled & NMI_WATCHDOG_ENABLED)
+- watchdog_nmi_enable(cpu);
++ /* Enable the hardlockup detector */
++ if (watchdog_enabled & WATCHDOG_HARDLOCKUP_ENABLED)
++ watchdog_hardlockup_enable(cpu);
+ }
+
+ static void watchdog_disable(unsigned int cpu)
+@@ -480,11 +536,11 @@ static void watchdog_disable(unsigned int cpu)
+ WARN_ON_ONCE(cpu != smp_processor_id());
+
+ /*
+- * Disable the perf event first. That prevents that a large delay
+- * between disabling the timer and disabling the perf event causes
+- * the perf NMI to detect a false positive.
++ * Disable the hardlockup detector first. That prevents that a large
++ * delay between disabling the timer and disabling the hardlockup
++ * detector causes a false positive.
+ */
+- watchdog_nmi_disable(cpu);
++ watchdog_hardlockup_disable(cpu);
+ hrtimer_cancel(hrtimer);
+ wait_for_completion(this_cpu_ptr(&softlockup_completion));
+ }
+@@ -540,7 +596,7 @@ int lockup_detector_offline_cpu(unsigned int cpu)
+ static void __lockup_detector_reconfigure(void)
+ {
+ cpus_read_lock();
+- watchdog_nmi_stop();
++ watchdog_hardlockup_stop();
+
+ softlockup_stop_all();
+ set_sample_period();
+@@ -548,7 +604,7 @@ static void __lockup_detector_reconfigure(void)
+ if (watchdog_enabled && watchdog_thresh)
+ softlockup_start_all();
+
+- watchdog_nmi_start();
++ watchdog_hardlockup_start();
+ cpus_read_unlock();
+ /*
+ * Must be called outside the cpus locked section to prevent
+@@ -589,9 +645,9 @@ static __init void lockup_detector_setup(void)
+ static void __lockup_detector_reconfigure(void)
+ {
+ cpus_read_lock();
+- watchdog_nmi_stop();
++ watchdog_hardlockup_stop();
+ lockup_detector_update_enable();
+- watchdog_nmi_start();
++ watchdog_hardlockup_start();
+ cpus_read_unlock();
+ }
+ void lockup_detector_reconfigure(void)
+@@ -646,14 +702,14 @@ static void proc_watchdog_update(void)
+ /*
+ * common function for watchdog, nmi_watchdog and soft_watchdog parameter
+ *
+- * caller | table->data points to | 'which'
+- * -------------------|----------------------------|--------------------------
+- * proc_watchdog | watchdog_user_enabled | NMI_WATCHDOG_ENABLED |
+- * | | SOFT_WATCHDOG_ENABLED
+- * -------------------|----------------------------|--------------------------
+- * proc_nmi_watchdog | nmi_watchdog_user_enabled | NMI_WATCHDOG_ENABLED
+- * -------------------|----------------------------|--------------------------
+- * proc_soft_watchdog | soft_watchdog_user_enabled | SOFT_WATCHDOG_ENABLED
++ * caller | table->data points to | 'which'
++ * -------------------|----------------------------------|-------------------------------
++ * proc_watchdog | watchdog_user_enabled | WATCHDOG_HARDLOCKUP_ENABLED |
++ * | | WATCHDOG_SOFTOCKUP_ENABLED
++ * -------------------|----------------------------------|-------------------------------
++ * proc_nmi_watchdog | watchdog_hardlockup_user_enabled | WATCHDOG_HARDLOCKUP_ENABLED
++ * -------------------|----------------------------------|-------------------------------
++ * proc_soft_watchdog | watchdog_softlockup_user_enabled | WATCHDOG_SOFTOCKUP_ENABLED
+ */
+ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+@@ -685,7 +741,8 @@ static int proc_watchdog_common(int which, struct ctl_table *table, int write,
+ int proc_watchdog(struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+ {
+- return proc_watchdog_common(NMI_WATCHDOG_ENABLED|SOFT_WATCHDOG_ENABLED,
++ return proc_watchdog_common(WATCHDOG_HARDLOCKUP_ENABLED |
++ WATCHDOG_SOFTOCKUP_ENABLED,
+ table, write, buffer, lenp, ppos);
+ }
+
+@@ -695,9 +752,9 @@ int proc_watchdog(struct ctl_table *table, int write,
+ int proc_nmi_watchdog(struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+ {
+- if (!nmi_watchdog_available && write)
++ if (!watchdog_hardlockup_available && write)
+ return -ENOTSUPP;
+- return proc_watchdog_common(NMI_WATCHDOG_ENABLED,
++ return proc_watchdog_common(WATCHDOG_HARDLOCKUP_ENABLED,
+ table, write, buffer, lenp, ppos);
+ }
+
+@@ -707,7 +764,7 @@ int proc_nmi_watchdog(struct ctl_table *table, int write,
+ int proc_soft_watchdog(struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+ {
+- return proc_watchdog_common(SOFT_WATCHDOG_ENABLED,
++ return proc_watchdog_common(WATCHDOG_SOFTOCKUP_ENABLED,
+ table, write, buffer, lenp, ppos);
+ }
+
+@@ -773,15 +830,6 @@ static struct ctl_table watchdog_sysctls[] = {
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = (void *)&sixty,
+ },
+- {
+- .procname = "nmi_watchdog",
+- .data = &nmi_watchdog_user_enabled,
+- .maxlen = sizeof(int),
+- .mode = NMI_WATCHDOG_SYSCTL_PERM,
+- .proc_handler = proc_nmi_watchdog,
+- .extra1 = SYSCTL_ZERO,
+- .extra2 = SYSCTL_ONE,
+- },
+ {
+ .procname = "watchdog_cpumask",
+ .data = &watchdog_cpumask_bits,
+@@ -792,7 +840,7 @@ static struct ctl_table watchdog_sysctls[] = {
+ #ifdef CONFIG_SOFTLOCKUP_DETECTOR
+ {
+ .procname = "soft_watchdog",
+- .data = &soft_watchdog_user_enabled,
++ .data = &watchdog_softlockup_user_enabled,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_soft_watchdog,
+@@ -845,14 +893,90 @@ static struct ctl_table watchdog_sysctls[] = {
+ {}
+ };
+
++static struct ctl_table watchdog_hardlockup_sysctl[] = {
++ {
++ .procname = "nmi_watchdog",
++ .data = &watchdog_hardlockup_user_enabled,
++ .maxlen = sizeof(int),
++ .mode = 0444,
++ .proc_handler = proc_nmi_watchdog,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE,
++ },
++ {}
++};
++
+ static void __init watchdog_sysctl_init(void)
+ {
+ register_sysctl_init("kernel", watchdog_sysctls);
++
++ if (watchdog_hardlockup_available)
++ watchdog_hardlockup_sysctl[0].mode = 0644;
++ register_sysctl_init("kernel", watchdog_hardlockup_sysctl);
+ }
++
+ #else
+ #define watchdog_sysctl_init() do { } while (0)
+ #endif /* CONFIG_SYSCTL */
+
++static void __init lockup_detector_delay_init(struct work_struct *work);
++static bool allow_lockup_detector_init_retry __initdata;
++
++static struct work_struct detector_work __initdata =
++ __WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
++
++static void __init lockup_detector_delay_init(struct work_struct *work)
++{
++ int ret;
++
++ ret = watchdog_hardlockup_probe();
++ if (ret) {
++ pr_info("Delayed init of the lockup detector failed: %d\n", ret);
++ pr_info("Hard watchdog permanently disabled\n");
++ return;
++ }
++
++ allow_lockup_detector_init_retry = false;
++
++ watchdog_hardlockup_available = true;
++ lockup_detector_setup();
++}
++
++/*
++ * lockup_detector_retry_init - retry to initialize the lockup detector if possible.
++ *
++ * Retry the hardlockup detector init. This is useful when the detector
++ * depends on functionality that is only initialized later on a
++ * particular platform.
++ */
++void __init lockup_detector_retry_init(void)
++{
++ /* Must be called before late init calls */
++ if (!allow_lockup_detector_init_retry)
++ return;
++
++ schedule_work(&detector_work);
++}
++
++/*
++ * Ensure that the optional delayed hardlockup init has completed before
++ * the init code and memory are freed.
++ */
++static int __init lockup_detector_check(void)
++{
++ /* Prevent any later retry. */
++ allow_lockup_detector_init_retry = false;
++
++ /* Make sure no work is pending. */
++ flush_work(&detector_work);
++
++ watchdog_sysctl_init();
++
++ return 0;
++
++late_initcall_sync(lockup_detector_check);
++
+ void __init lockup_detector_init(void)
+ {
+ if (tick_nohz_full_enabled())
+@@ -861,8 +985,10 @@ void __init lockup_detector_init(void)
+ cpumask_copy(&watchdog_cpumask,
+ housekeeping_cpumask(HK_TYPE_TIMER));
+
+- if (!watchdog_nmi_probe())
+- nmi_watchdog_available = true;
++ if (!watchdog_hardlockup_probe())
++ watchdog_hardlockup_available = true;
++ else
++ allow_lockup_detector_init_retry = true;
++
+ lockup_detector_setup();
+- watchdog_sysctl_init();
+ }
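
The hardlockup logic that this patch moves into watchdog.c boils down to
comparing a timer-driven counter against its last snapshot: if the hrtimer
interrupt never bumped the counter between two NMI-side checks, the CPU is
presumed stuck. A single-threaded sketch of just that comparison, with the
per-CPU storage and the NMI/hrtimer plumbing left out:

#include <stdbool.h>
#include <stdio.h>

static unsigned long hrtimer_interrupts;
static unsigned long hrtimer_interrupts_saved;

static void watchdog_hardlockup_kick(void)   /* called from the timer */
{
        hrtimer_interrupts++;
}

static bool is_hardlockup(void)              /* called from the NMI check */
{
        unsigned long hrint = hrtimer_interrupts;

        if (hrtimer_interrupts_saved == hrint)
                return true;                 /* timer never fired: stuck CPU */

        hrtimer_interrupts_saved = hrint;
        return false;
}

int main(void)
{
        watchdog_hardlockup_kick();
        printf("check 1: %d\n", is_hardlockup());  /* 0: progress was made */
        printf("check 2: %d\n", is_hardlockup());  /* 1: no tick in between */
        return 0;
}
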
+diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
+index 247bf0b1582ca..6e66e0938bbc1 100644
+--- a/kernel/watchdog_hld.c
++++ b/kernel/watchdog_hld.c
+@@ -20,13 +20,11 @@
+ #include <asm/irq_regs.h>
+ #include <linux/perf_event.h>
+
+-static DEFINE_PER_CPU(bool, hard_watchdog_warn);
+ static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
+ static DEFINE_PER_CPU(struct perf_event *, watchdog_ev);
+ static DEFINE_PER_CPU(struct perf_event *, dead_event);
+ static struct cpumask dead_events_mask;
+
+-static unsigned long hardlockup_allcpu_dumped;
+ static atomic_t watchdog_cpus = ATOMIC_INIT(0);
+
+ notrace void arch_touch_nmi_watchdog(void)
+@@ -114,53 +112,15 @@ static void watchdog_overflow_callback(struct perf_event *event,
+ /* Ensure the watchdog never gets throttled */
+ event->hw.interrupts = 0;
+
+- if (__this_cpu_read(watchdog_nmi_touch) == true) {
+- __this_cpu_write(watchdog_nmi_touch, false);
+- return;
+- }
+-
+ if (!watchdog_check_timestamp())
+ return;
+
+- /* check for a hardlockup
+- * This is done by making sure our timer interrupt
+- * is incrementing. The timer interrupt should have
+- * fired multiple times before we overflow'd. If it hasn't
+- * then this is a good indication the cpu is stuck
+- */
+- if (is_hardlockup()) {
+- int this_cpu = smp_processor_id();
+-
+- /* only print hardlockups once */
+- if (__this_cpu_read(hard_watchdog_warn) == true)
+- return;
+-
+- pr_emerg("Watchdog detected hard LOCKUP on cpu %d\n",
+- this_cpu);
+- print_modules();
+- print_irqtrace_events(current);
+- if (regs)
+- show_regs(regs);
+- else
+- dump_stack();
+-
+- /*
+- * Perform all-CPU dump only once to avoid multiple hardlockups
+- * generating interleaving traces
+- */
+- if (sysctl_hardlockup_all_cpu_backtrace &&
+- !test_and_set_bit(0, &hardlockup_allcpu_dumped))
+- trigger_allbutself_cpu_backtrace();
+-
+- if (hardlockup_panic)
+- nmi_panic(regs, "Hard LOCKUP");
+-
+- __this_cpu_write(hard_watchdog_warn, true);
++ if (__this_cpu_read(watchdog_nmi_touch) == true) {
++ __this_cpu_write(watchdog_nmi_touch, false);
+ return;
+ }
+
+- __this_cpu_write(hard_watchdog_warn, false);
+- return;
++ watchdog_hardlockup_check(regs);
+ }
+
+ static int hardlockup_detector_event_create(void)
+@@ -268,7 +228,7 @@ void __init hardlockup_detector_perf_restart(void)
+
+ lockdep_assert_cpus_held();
+
+- if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
++ if (!(watchdog_enabled & WATCHDOG_HARDLOCKUP_ENABLED))
+ return;
+
+ for_each_online_cpu(cpu) {
+diff --git a/lib/bitmap.c b/lib/bitmap.c
+index 1c81413c51f86..ddb31015e38ae 100644
+--- a/lib/bitmap.c
++++ b/lib/bitmap.c
+@@ -1495,7 +1495,7 @@ void bitmap_to_arr32(u32 *buf, const unsigned long *bitmap, unsigned int nbits)
+ EXPORT_SYMBOL(bitmap_to_arr32);
+ #endif
+
+-#if (BITS_PER_LONG == 32) && defined(__BIG_ENDIAN)
++#if BITS_PER_LONG == 32
+ /**
+ * bitmap_from_arr64 - copy the contents of u64 array of bits to bitmap
+ * @bitmap: array of unsigned longs, the destination bitmap
+diff --git a/lib/dhry_1.c b/lib/dhry_1.c
+index 83247106824cc..08edbbb19f573 100644
+--- a/lib/dhry_1.c
++++ b/lib/dhry_1.c
+@@ -139,8 +139,15 @@ int dhry(int n)
+
+ /* Initializations */
+
+- Next_Ptr_Glob = (Rec_Pointer)kzalloc(sizeof(Rec_Type), GFP_KERNEL);
+- Ptr_Glob = (Rec_Pointer)kzalloc(sizeof(Rec_Type), GFP_KERNEL);
++ Next_Ptr_Glob = (Rec_Pointer)kzalloc(sizeof(Rec_Type), GFP_ATOMIC);
++ if (!Next_Ptr_Glob)
++ return -ENOMEM;
++
++ Ptr_Glob = (Rec_Pointer)kzalloc(sizeof(Rec_Type), GFP_ATOMIC);
++ if (!Ptr_Glob) {
++ kfree(Next_Ptr_Glob);
++ return -ENOMEM;
++ }
+
+ Ptr_Glob->Ptr_Comp = Next_Ptr_Glob;
+ Ptr_Glob->Discr = Ident_1;
+diff --git a/lib/test_firmware.c b/lib/test_firmware.c
+index 1d7d480b8eeb3..add4699fc6cd4 100644
+--- a/lib/test_firmware.c
++++ b/lib/test_firmware.c
+@@ -214,7 +214,7 @@ static int __kstrncpy(char **dst, const char *name, size_t count, gfp_t gfp)
+ {
+ *dst = kstrndup(name, count, gfp);
+ if (!*dst)
+- return -ENOSPC;
++ return -ENOMEM;
+ return count;
+ }
+
+@@ -671,7 +671,7 @@ static ssize_t trigger_request_store(struct device *dev,
+
+ name = kstrndup(buf, count, GFP_KERNEL);
+ if (!name)
+- return -ENOSPC;
++ return -ENOMEM;
+
+ pr_info("loading '%s'\n", name);
+
+@@ -719,7 +719,7 @@ static ssize_t trigger_request_platform_store(struct device *dev,
+
+ name = kstrndup(buf, count, GFP_KERNEL);
+ if (!name)
+- return -ENOSPC;
++ return -ENOMEM;
+
+ pr_info("inserting test platform fw '%s'\n", name);
+ efi_embedded_fw.name = name;
+@@ -772,7 +772,7 @@ static ssize_t trigger_async_request_store(struct device *dev,
+
+ name = kstrndup(buf, count, GFP_KERNEL);
+ if (!name)
+- return -ENOSPC;
++ return -ENOMEM;
+
+ pr_info("loading '%s'\n", name);
+
+@@ -817,7 +817,7 @@ static ssize_t trigger_custom_fallback_store(struct device *dev,
+
+ name = kstrndup(buf, count, GFP_KERNEL);
+ if (!name)
+- return -ENOSPC;
++ return -ENOMEM;
+
+ pr_info("loading '%s' using custom fallback mechanism\n", name);
+
+@@ -868,7 +868,7 @@ static int test_fw_run_batch_request(void *data)
+
+ test_buf = kzalloc(TEST_FIRMWARE_BUF_SIZE, GFP_KERNEL);
+ if (!test_buf)
+- return -ENOSPC;
++ return -ENOMEM;
+
+ if (test_fw_config->partial)
+ req->rc = request_partial_firmware_into_buf
+diff --git a/lib/ts_bm.c b/lib/ts_bm.c
+index 1f2234221dd11..c8ecbf74ef295 100644
+--- a/lib/ts_bm.c
++++ b/lib/ts_bm.c
+@@ -60,10 +60,12 @@ static unsigned int bm_find(struct ts_config *conf, struct ts_state *state)
+ struct ts_bm *bm = ts_config_priv(conf);
+ unsigned int i, text_len, consumed = state->offset;
+ const u8 *text;
+- int shift = bm->patlen - 1, bs;
++ int bs;
+ const u8 icase = conf->flags & TS_IGNORECASE;
+
+ for (;;) {
++ int shift = bm->patlen - 1;
++
+ text_len = conf->get_next_block(consumed, &text, conf, state);
+
+ if (unlikely(text_len == 0))
+diff --git a/mm/damon/ops-common.c b/mm/damon/ops-common.c
+index cc63cf9536369..acc264b979034 100644
+--- a/mm/damon/ops-common.c
++++ b/mm/damon/ops-common.c
+@@ -37,7 +37,7 @@ struct folio *damon_get_folio(unsigned long pfn)
+ return folio;
+ }
+
+-void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
++void damon_ptep_mkold(pte_t *pte, struct vm_area_struct *vma, unsigned long addr)
+ {
+ bool referenced = false;
+ struct folio *folio = damon_get_folio(pte_pfn(*pte));
+@@ -45,13 +45,11 @@ void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
+ if (!folio)
+ return;
+
+- if (pte_young(*pte)) {
++ if (ptep_test_and_clear_young(vma, addr, pte))
+ referenced = true;
+- *pte = pte_mkold(*pte);
+- }
+
+ #ifdef CONFIG_MMU_NOTIFIER
+- if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE))
++ if (mmu_notifier_clear_young(vma->vm_mm, addr, addr + PAGE_SIZE))
+ referenced = true;
+ #endif /* CONFIG_MMU_NOTIFIER */
+
+@@ -62,7 +60,7 @@ void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
+ folio_put(folio);
+ }
+
+-void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr)
++void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr)
+ {
+ #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ bool referenced = false;
+@@ -71,13 +69,11 @@ void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr)
+ if (!folio)
+ return;
+
+- if (pmd_young(*pmd)) {
++ if (pmdp_test_and_clear_young(vma, addr, pmd))
+ referenced = true;
+- *pmd = pmd_mkold(*pmd);
+- }
+
+ #ifdef CONFIG_MMU_NOTIFIER
+- if (mmu_notifier_clear_young(mm, addr, addr + HPAGE_PMD_SIZE))
++ if (mmu_notifier_clear_young(vma->vm_mm, addr, addr + HPAGE_PMD_SIZE))
+ referenced = true;
+ #endif /* CONFIG_MMU_NOTIFIER */
+
+diff --git a/mm/damon/ops-common.h b/mm/damon/ops-common.h
+index 14f4bc69f29be..18d837d11bcee 100644
+--- a/mm/damon/ops-common.h
++++ b/mm/damon/ops-common.h
+@@ -9,8 +9,8 @@
+
+ struct folio *damon_get_folio(unsigned long pfn);
+
+-void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr);
+-void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr);
++void damon_ptep_mkold(pte_t *pte, struct vm_area_struct *vma, unsigned long addr);
++void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr);
+
+ int damon_cold_score(struct damon_ctx *c, struct damon_region *r,
+ struct damos *s);
+diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
+index 467b99166b437..5b3a3463d0782 100644
+--- a/mm/damon/paddr.c
++++ b/mm/damon/paddr.c
+@@ -24,9 +24,9 @@ static bool __damon_pa_mkold(struct folio *folio, struct vm_area_struct *vma,
+ while (page_vma_mapped_walk(&pvmw)) {
+ addr = pvmw.address;
+ if (pvmw.pte)
+- damon_ptep_mkold(pvmw.pte, vma->vm_mm, addr);
++ damon_ptep_mkold(pvmw.pte, vma, addr);
+ else
+- damon_pmdp_mkold(pvmw.pmd, vma->vm_mm, addr);
++ damon_pmdp_mkold(pvmw.pmd, vma, addr);
+ }
+ return true;
+ }
+diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
+index 1fec16d7263e5..37994fb6120cb 100644
+--- a/mm/damon/vaddr.c
++++ b/mm/damon/vaddr.c
+@@ -311,7 +311,7 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
+ }
+
+ if (pmd_trans_huge(*pmd)) {
+- damon_pmdp_mkold(pmd, walk->mm, addr);
++ damon_pmdp_mkold(pmd, walk->vma, addr);
+ spin_unlock(ptl);
+ return 0;
+ }
+@@ -323,7 +323,7 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
+ pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ if (!pte_present(*pte))
+ goto out;
+- damon_ptep_mkold(pte, walk->mm, addr);
++ damon_ptep_mkold(pte, walk->vma, addr);
+ out:
+ pte_unmap_unlock(pte, ptl);
+ return 0;
+diff --git a/mm/filemap.c b/mm/filemap.c
+index 83dda76d1fc36..8abce63b259c9 100644
+--- a/mm/filemap.c
++++ b/mm/filemap.c
+@@ -2906,7 +2906,7 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
+ do {
+ cond_resched();
+
+- if (*ppos >= i_size_read(file_inode(in)))
++ if (*ppos >= i_size_read(in->f_mapping->host))
+ break;
+
+ iocb.ki_pos = *ppos;
+@@ -2922,7 +2922,7 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
+ * part of the page is not copied back to userspace (unless
+ * another truncate extends the file - this is desired though).
+ */
+- isize = i_size_read(file_inode(in));
++ isize = i_size_read(in->f_mapping->host);
+ if (unlikely(*ppos >= isize))
+ break;
+ end_offset = min_t(loff_t, isize, *ppos + len);
+diff --git a/mm/page-writeback.c b/mm/page-writeback.c
+index db79439990073..6faa09f1783b3 100644
+--- a/mm/page-writeback.c
++++ b/mm/page-writeback.c
+@@ -2434,6 +2434,7 @@ int write_cache_pages(struct address_space *mapping,
+
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];
++ unsigned long nr;
+
+ done_index = folio->index;
+
+@@ -2471,6 +2472,7 @@ continue_unlock:
+
+ trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
+ error = writepage(folio, wbc, data);
++ nr = folio_nr_pages(folio);
+ if (unlikely(error)) {
+ /*
+ * Handle errors according to the type of
+@@ -2489,8 +2491,7 @@ continue_unlock:
+ error = 0;
+ } else if (wbc->sync_mode != WB_SYNC_ALL) {
+ ret = error;
+- done_index = folio->index +
+- folio_nr_pages(folio);
++ done_index = folio->index + nr;
+ done = 1;
+ break;
+ }
+@@ -2504,7 +2505,8 @@ continue_unlock:
+ * keep going until we have written all the pages
+ * we tagged for writeback prior to entering this loop.
+ */
+- if (--wbc->nr_to_write <= 0 &&
++ wbc->nr_to_write -= nr;
++ if (wbc->nr_to_write <= 0 &&
+ wbc->sync_mode == WB_SYNC_NONE) {
+ done = 1;
+ break;
+diff --git a/mm/shmem.c b/mm/shmem.c
+index e40a08c5c6d78..74abb97ea557b 100644
+--- a/mm/shmem.c
++++ b/mm/shmem.c
+@@ -4196,7 +4196,7 @@ static struct file_system_type shmem_fs_type = {
+ .name = "tmpfs",
+ .init_fs_context = ramfs_init_fs_context,
+ .parameters = ramfs_fs_parameters,
+- .kill_sb = kill_litter_super,
++ .kill_sb = ramfs_kill_sb,
+ .fs_flags = FS_USERNS_MOUNT,
+ };
+
+diff --git a/mm/vmscan.c b/mm/vmscan.c
+index 5bf98d0a22c9a..6114a1fc6c688 100644
+--- a/mm/vmscan.c
++++ b/mm/vmscan.c
+@@ -4728,10 +4728,11 @@ static void lru_gen_rotate_memcg(struct lruvec *lruvec, int op)
+ {
+ int seg;
+ int old, new;
++ unsigned long flags;
+ int bin = get_random_u32_below(MEMCG_NR_BINS);
+ struct pglist_data *pgdat = lruvec_pgdat(lruvec);
+
+- spin_lock(&pgdat->memcg_lru.lock);
++ spin_lock_irqsave(&pgdat->memcg_lru.lock, flags);
+
+ VM_WARN_ON_ONCE(hlist_nulls_unhashed(&lruvec->lrugen.list));
+
+@@ -4766,7 +4767,7 @@ static void lru_gen_rotate_memcg(struct lruvec *lruvec, int op)
+ if (!pgdat->memcg_lru.nr_memcgs[old] && old == get_memcg_gen(pgdat->memcg_lru.seq))
+ WRITE_ONCE(pgdat->memcg_lru.seq, pgdat->memcg_lru.seq + 1);
+
+- spin_unlock(&pgdat->memcg_lru.lock);
++ spin_unlock_irqrestore(&pgdat->memcg_lru.lock, flags);
+ }
+
+ void lru_gen_online_memcg(struct mem_cgroup *memcg)
+@@ -4779,7 +4780,7 @@ void lru_gen_online_memcg(struct mem_cgroup *memcg)
+ struct pglist_data *pgdat = NODE_DATA(nid);
+ struct lruvec *lruvec = get_lruvec(memcg, nid);
+
+- spin_lock(&pgdat->memcg_lru.lock);
++ spin_lock_irq(&pgdat->memcg_lru.lock);
+
+ VM_WARN_ON_ONCE(!hlist_nulls_unhashed(&lruvec->lrugen.list));
+
+@@ -4790,7 +4791,7 @@ void lru_gen_online_memcg(struct mem_cgroup *memcg)
+
+ lruvec->lrugen.gen = gen;
+
+- spin_unlock(&pgdat->memcg_lru.lock);
++ spin_unlock_irq(&pgdat->memcg_lru.lock);
+ }
+ }
+
+@@ -4814,7 +4815,7 @@ void lru_gen_release_memcg(struct mem_cgroup *memcg)
+ struct pglist_data *pgdat = NODE_DATA(nid);
+ struct lruvec *lruvec = get_lruvec(memcg, nid);
+
+- spin_lock(&pgdat->memcg_lru.lock);
++ spin_lock_irq(&pgdat->memcg_lru.lock);
+
+ VM_WARN_ON_ONCE(hlist_nulls_unhashed(&lruvec->lrugen.list));
+
+@@ -4826,7 +4827,7 @@ void lru_gen_release_memcg(struct mem_cgroup *memcg)
+ if (!pgdat->memcg_lru.nr_memcgs[gen] && gen == get_memcg_gen(pgdat->memcg_lru.seq))
+ WRITE_ONCE(pgdat->memcg_lru.seq, pgdat->memcg_lru.seq + 1);
+
+- spin_unlock(&pgdat->memcg_lru.lock);
++ spin_unlock_irq(&pgdat->memcg_lru.lock);
+ }
+ }
+
+diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
+index 1ef952bda97d8..2275e0d9f8419 100644
+--- a/net/bluetooth/hci_conn.c
++++ b/net/bluetooth/hci_conn.c
+@@ -775,6 +775,11 @@ static void le_conn_timeout(struct work_struct *work)
+ hci_abort_conn(conn, HCI_ERROR_REMOTE_USER_TERM);
+ }
+
++struct iso_cig_params {
++ struct hci_cp_le_set_cig_params cp;
++ struct hci_cis_params cis[0x1f];
++};
++
+ struct iso_list_data {
+ union {
+ u8 cig;
+@@ -786,10 +791,7 @@ struct iso_list_data {
+ u16 sync_handle;
+ };
+ int count;
+- struct {
+- struct hci_cp_le_set_cig_params cp;
+- struct hci_cis_params cis[0x11];
+- } pdu;
++ struct iso_cig_params pdu;
+ };
+
+ static void bis_list(struct hci_conn *conn, void *data)
+@@ -1764,10 +1766,33 @@ static int hci_le_create_big(struct hci_conn *conn, struct bt_iso_qos *qos)
+ return hci_send_cmd(hdev, HCI_OP_LE_CREATE_BIG, sizeof(cp), &cp);
+ }
+
++static void set_cig_params_complete(struct hci_dev *hdev, void *data, int err)
++{
++ struct iso_cig_params *pdu = data;
++
++ bt_dev_dbg(hdev, "");
++
++ if (err)
++ bt_dev_err(hdev, "Unable to set CIG parameters: %d", err);
++
++ kfree(pdu);
++}
++
++static int set_cig_params_sync(struct hci_dev *hdev, void *data)
++{
++ struct iso_cig_params *pdu = data;
++ u32 plen;
++
++ plen = sizeof(pdu->cp) + pdu->cp.num_cis * sizeof(pdu->cis[0]);
++ return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_CIG_PARAMS, plen, pdu,
++ HCI_CMD_TIMEOUT);
++}
++
+ static bool hci_le_set_cig_params(struct hci_conn *conn, struct bt_iso_qos *qos)
+ {
+ struct hci_dev *hdev = conn->hdev;
+ struct iso_list_data data;
++ struct iso_cig_params *pdu;
+
+ memset(&data, 0, sizeof(data));
+
+@@ -1837,12 +1862,18 @@ static bool hci_le_set_cig_params(struct hci_conn *conn, struct bt_iso_qos *qos)
+ if (qos->ucast.cis == BT_ISO_QOS_CIS_UNSET || !data.pdu.cp.num_cis)
+ return false;
+
+- if (hci_send_cmd(hdev, HCI_OP_LE_SET_CIG_PARAMS,
+- sizeof(data.pdu.cp) +
+- (data.pdu.cp.num_cis * sizeof(*data.pdu.cis)),
+- &data.pdu) < 0)
++ pdu = kzalloc(sizeof(*pdu), GFP_KERNEL);
++ if (!pdu)
+ return false;
+
++ memcpy(pdu, &data.pdu, sizeof(*pdu));
++
++ if (hci_cmd_sync_queue(hdev, set_cig_params_sync, pdu,
++ set_cig_params_complete) < 0) {
++ kfree(pdu);
++ return false;
++ }
++
+ return true;
+ }
+
+diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
+index 09ba6d8987ee1..21e26d3b286cc 100644
+--- a/net/bluetooth/hci_event.c
++++ b/net/bluetooth/hci_event.c
+@@ -6316,23 +6316,18 @@ static void process_adv_report(struct hci_dev *hdev, u8 type, bdaddr_t *bdaddr,
+ return;
+ }
+
+- /* When receiving non-connectable or scannable undirected
+- * advertising reports, this means that the remote device is
+- * not connectable and then clearly indicate this in the
+- * device found event.
+- *
+- * When receiving a scan response, then there is no way to
++ /* When receiving a scan response, then there is no way to
+ * know if the remote device is connectable or not. However
+ * since scan responses are merged with a previously seen
+ * advertising report, the flags field from that report
+ * will be used.
+ *
+- * In the really unlikely case that a controller get confused
+- * and just sends a scan response event, then it is marked as
+- * not connectable as well.
++ * In the unlikely case that a controller just sends a scan
++ * response event that doesn't match the pending report, then
++ * it is marked as a standalone SCAN_RSP.
+ */
+ if (type == LE_ADV_SCAN_RSP)
+- flags = MGMT_DEV_FOUND_NOT_CONNECTABLE;
++ flags = MGMT_DEV_FOUND_SCAN_RSP;
+
+ /* If there's nothing pending either store the data from this
+ * event or send an immediate device found event if the data
+diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
+index 804cde43b4e02..b5b1b610df335 100644
+--- a/net/bluetooth/hci_sync.c
++++ b/net/bluetooth/hci_sync.c
+@@ -4626,23 +4626,17 @@ static int hci_dev_setup_sync(struct hci_dev *hdev)
+ invalid_bdaddr = test_bit(HCI_QUIRK_INVALID_BDADDR, &hdev->quirks);
+
+ if (!ret) {
+- if (test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) {
+- if (!bacmp(&hdev->public_addr, BDADDR_ANY))
+- hci_dev_get_bd_addr_from_property(hdev);
+-
+- if (bacmp(&hdev->public_addr, BDADDR_ANY) &&
+- hdev->set_bdaddr) {
+- ret = hdev->set_bdaddr(hdev,
+- &hdev->public_addr);
+-
+- /* If setting of the BD_ADDR from the device
+- * property succeeds, then treat the address
+- * as valid even if the invalid BD_ADDR
+- * quirk indicates otherwise.
+- */
+- if (!ret)
+- invalid_bdaddr = false;
+- }
++ if (test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks) &&
++ !bacmp(&hdev->public_addr, BDADDR_ANY))
++ hci_dev_get_bd_addr_from_property(hdev);
++
++ if ((invalid_bdaddr ||
++ test_bit(HCI_QUIRK_USE_BDADDR_PROPERTY, &hdev->quirks)) &&
++ bacmp(&hdev->public_addr, BDADDR_ANY) &&
++ hdev->set_bdaddr) {
++ ret = hdev->set_bdaddr(hdev, &hdev->public_addr);
++ if (!ret)
++ invalid_bdaddr = false;
+ }
+ }
+
+diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
+index 3f04b40f60568..2450690f98cfa 100644
+--- a/net/bridge/br_if.c
++++ b/net/bridge/br_if.c
+@@ -166,8 +166,9 @@ void br_manage_promisc(struct net_bridge *br)
+ * This lets us disable promiscuous mode and write
+ * this config to hw.
+ */
+- if (br->auto_cnt == 0 ||
+- (br->auto_cnt == 1 && br_auto_port(p)))
++ if ((p->dev->priv_flags & IFF_UNICAST_FLT) &&
++ (br->auto_cnt == 0 ||
++ (br->auto_cnt == 1 && br_auto_port(p))))
+ br_port_clear_promisc(p);
+ else
+ br_port_set_promisc(p);
+diff --git a/net/core/filter.c b/net/core/filter.c
+index d9ce04ca22ce8..1c959794a8862 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -6555,12 +6555,11 @@ static struct sock *sk_lookup(struct net *net, struct bpf_sock_tuple *tuple,
+ static struct sock *
+ __bpf_skc_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
+ struct net *caller_net, u32 ifindex, u8 proto, u64 netns_id,
+- u64 flags)
++ u64 flags, int sdif)
+ {
+ struct sock *sk = NULL;
+ struct net *net;
+ u8 family;
+- int sdif;
+
+ if (len == sizeof(tuple->ipv4))
+ family = AF_INET;
+@@ -6572,10 +6571,12 @@ __bpf_skc_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
+ if (unlikely(flags || !((s32)netns_id < 0 || netns_id <= S32_MAX)))
+ goto out;
+
+- if (family == AF_INET)
+- sdif = inet_sdif(skb);
+- else
+- sdif = inet6_sdif(skb);
++ if (sdif < 0) {
++ if (family == AF_INET)
++ sdif = inet_sdif(skb);
++ else
++ sdif = inet6_sdif(skb);
++ }
+
+ if ((s32)netns_id < 0) {
+ net = caller_net;
+@@ -6595,10 +6596,11 @@ out:
+ static struct sock *
+ __bpf_sk_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
+ struct net *caller_net, u32 ifindex, u8 proto, u64 netns_id,
+- u64 flags)
++ u64 flags, int sdif)
+ {
+ struct sock *sk = __bpf_skc_lookup(skb, tuple, len, caller_net,
+- ifindex, proto, netns_id, flags);
++ ifindex, proto, netns_id, flags,
++ sdif);
+
+ if (sk) {
+ struct sock *sk2 = sk_to_full_sk(sk);
+@@ -6638,7 +6640,7 @@ bpf_skc_lookup(struct sk_buff *skb, struct bpf_sock_tuple *tuple, u32 len,
+ }
+
+ return __bpf_skc_lookup(skb, tuple, len, caller_net, ifindex, proto,
+- netns_id, flags);
++ netns_id, flags, -1);
+ }
+
+ static struct sock *
+@@ -6727,6 +6729,78 @@ static const struct bpf_func_proto bpf_sk_lookup_udp_proto = {
+ .arg5_type = ARG_ANYTHING,
+ };
+
++BPF_CALL_5(bpf_tc_skc_lookup_tcp, struct sk_buff *, skb,
++ struct bpf_sock_tuple *, tuple, u32, len, u64, netns_id, u64, flags)
++{
++ struct net_device *dev = skb->dev;
++ int ifindex = dev->ifindex, sdif = dev_sdif(dev);
++ struct net *caller_net = dev_net(dev);
++
++ return (unsigned long)__bpf_skc_lookup(skb, tuple, len, caller_net,
++ ifindex, IPPROTO_TCP, netns_id,
++ flags, sdif);
++}
++
++static const struct bpf_func_proto bpf_tc_skc_lookup_tcp_proto = {
++ .func = bpf_tc_skc_lookup_tcp,
++ .gpl_only = false,
++ .pkt_access = true,
++ .ret_type = RET_PTR_TO_SOCK_COMMON_OR_NULL,
++ .arg1_type = ARG_PTR_TO_CTX,
++ .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY,
++ .arg3_type = ARG_CONST_SIZE,
++ .arg4_type = ARG_ANYTHING,
++ .arg5_type = ARG_ANYTHING,
++};
++
++BPF_CALL_5(bpf_tc_sk_lookup_tcp, struct sk_buff *, skb,
++ struct bpf_sock_tuple *, tuple, u32, len, u64, netns_id, u64, flags)
++{
++ struct net_device *dev = skb->dev;
++ int ifindex = dev->ifindex, sdif = dev_sdif(dev);
++ struct net *caller_net = dev_net(dev);
++
++ return (unsigned long)__bpf_sk_lookup(skb, tuple, len, caller_net,
++ ifindex, IPPROTO_TCP, netns_id,
++ flags, sdif);
++}
++
++static const struct bpf_func_proto bpf_tc_sk_lookup_tcp_proto = {
++ .func = bpf_tc_sk_lookup_tcp,
++ .gpl_only = false,
++ .pkt_access = true,
++ .ret_type = RET_PTR_TO_SOCKET_OR_NULL,
++ .arg1_type = ARG_PTR_TO_CTX,
++ .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY,
++ .arg3_type = ARG_CONST_SIZE,
++ .arg4_type = ARG_ANYTHING,
++ .arg5_type = ARG_ANYTHING,
++};
++
++BPF_CALL_5(bpf_tc_sk_lookup_udp, struct sk_buff *, skb,
++ struct bpf_sock_tuple *, tuple, u32, len, u64, netns_id, u64, flags)
++{
++ struct net_device *dev = skb->dev;
++ int ifindex = dev->ifindex, sdif = dev_sdif(dev);
++ struct net *caller_net = dev_net(dev);
++
++ return (unsigned long)__bpf_sk_lookup(skb, tuple, len, caller_net,
++ ifindex, IPPROTO_UDP, netns_id,
++ flags, sdif);
++}
++
++static const struct bpf_func_proto bpf_tc_sk_lookup_udp_proto = {
++ .func = bpf_tc_sk_lookup_udp,
++ .gpl_only = false,
++ .pkt_access = true,
++ .ret_type = RET_PTR_TO_SOCKET_OR_NULL,
++ .arg1_type = ARG_PTR_TO_CTX,
++ .arg2_type = ARG_PTR_TO_MEM | MEM_RDONLY,
++ .arg3_type = ARG_CONST_SIZE,
++ .arg4_type = ARG_ANYTHING,
++ .arg5_type = ARG_ANYTHING,
++};
++
+ BPF_CALL_1(bpf_sk_release, struct sock *, sk)
+ {
+ if (sk && sk_is_refcounted(sk))
+@@ -6744,12 +6818,13 @@ static const struct bpf_func_proto bpf_sk_release_proto = {
+ BPF_CALL_5(bpf_xdp_sk_lookup_udp, struct xdp_buff *, ctx,
+ struct bpf_sock_tuple *, tuple, u32, len, u32, netns_id, u64, flags)
+ {
+- struct net *caller_net = dev_net(ctx->rxq->dev);
+- int ifindex = ctx->rxq->dev->ifindex;
++ struct net_device *dev = ctx->rxq->dev;
++ int ifindex = dev->ifindex, sdif = dev_sdif(dev);
++ struct net *caller_net = dev_net(dev);
+
+ return (unsigned long)__bpf_sk_lookup(NULL, tuple, len, caller_net,
+ ifindex, IPPROTO_UDP, netns_id,
+- flags);
++ flags, sdif);
+ }
+
+ static const struct bpf_func_proto bpf_xdp_sk_lookup_udp_proto = {
+@@ -6767,12 +6842,13 @@ static const struct bpf_func_proto bpf_xdp_sk_lookup_udp_proto = {
+ BPF_CALL_5(bpf_xdp_skc_lookup_tcp, struct xdp_buff *, ctx,
+ struct bpf_sock_tuple *, tuple, u32, len, u32, netns_id, u64, flags)
+ {
+- struct net *caller_net = dev_net(ctx->rxq->dev);
+- int ifindex = ctx->rxq->dev->ifindex;
++ struct net_device *dev = ctx->rxq->dev;
++ int ifindex = dev->ifindex, sdif = dev_sdif(dev);
++ struct net *caller_net = dev_net(dev);
+
+ return (unsigned long)__bpf_skc_lookup(NULL, tuple, len, caller_net,
+ ifindex, IPPROTO_TCP, netns_id,
+- flags);
++ flags, sdif);
+ }
+
+ static const struct bpf_func_proto bpf_xdp_skc_lookup_tcp_proto = {
+@@ -6790,12 +6866,13 @@ static const struct bpf_func_proto bpf_xdp_skc_lookup_tcp_proto = {
+ BPF_CALL_5(bpf_xdp_sk_lookup_tcp, struct xdp_buff *, ctx,
+ struct bpf_sock_tuple *, tuple, u32, len, u32, netns_id, u64, flags)
+ {
+- struct net *caller_net = dev_net(ctx->rxq->dev);
+- int ifindex = ctx->rxq->dev->ifindex;
++ struct net_device *dev = ctx->rxq->dev;
++ int ifindex = dev->ifindex, sdif = dev_sdif(dev);
++ struct net *caller_net = dev_net(dev);
+
+ return (unsigned long)__bpf_sk_lookup(NULL, tuple, len, caller_net,
+ ifindex, IPPROTO_TCP, netns_id,
+- flags);
++ flags, sdif);
+ }
+
+ static const struct bpf_func_proto bpf_xdp_sk_lookup_tcp_proto = {
+@@ -6815,7 +6892,8 @@ BPF_CALL_5(bpf_sock_addr_skc_lookup_tcp, struct bpf_sock_addr_kern *, ctx,
+ {
+ return (unsigned long)__bpf_skc_lookup(NULL, tuple, len,
+ sock_net(ctx->sk), 0,
+- IPPROTO_TCP, netns_id, flags);
++ IPPROTO_TCP, netns_id, flags,
++ -1);
+ }
+
+ static const struct bpf_func_proto bpf_sock_addr_skc_lookup_tcp_proto = {
+@@ -6834,7 +6912,7 @@ BPF_CALL_5(bpf_sock_addr_sk_lookup_tcp, struct bpf_sock_addr_kern *, ctx,
+ {
+ return (unsigned long)__bpf_sk_lookup(NULL, tuple, len,
+ sock_net(ctx->sk), 0, IPPROTO_TCP,
+- netns_id, flags);
++ netns_id, flags, -1);
+ }
+
+ static const struct bpf_func_proto bpf_sock_addr_sk_lookup_tcp_proto = {
+@@ -6853,7 +6931,7 @@ BPF_CALL_5(bpf_sock_addr_sk_lookup_udp, struct bpf_sock_addr_kern *, ctx,
+ {
+ return (unsigned long)__bpf_sk_lookup(NULL, tuple, len,
+ sock_net(ctx->sk), 0, IPPROTO_UDP,
+- netns_id, flags);
++ netns_id, flags, -1);
+ }
+
+ static const struct bpf_func_proto bpf_sock_addr_sk_lookup_udp_proto = {
+@@ -7980,9 +8058,9 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+ #endif
+ #ifdef CONFIG_INET
+ case BPF_FUNC_sk_lookup_tcp:
+- return &bpf_sk_lookup_tcp_proto;
++ return &bpf_tc_sk_lookup_tcp_proto;
+ case BPF_FUNC_sk_lookup_udp:
+- return &bpf_sk_lookup_udp_proto;
++ return &bpf_tc_sk_lookup_udp_proto;
+ case BPF_FUNC_sk_release:
+ return &bpf_sk_release_proto;
+ case BPF_FUNC_tcp_sock:
+@@ -7990,7 +8068,7 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+ case BPF_FUNC_get_listener_sock:
+ return &bpf_get_listener_sock_proto;
+ case BPF_FUNC_skc_lookup_tcp:
+- return &bpf_skc_lookup_tcp_proto;
++ return &bpf_tc_skc_lookup_tcp_proto;
+ case BPF_FUNC_tcp_check_syncookie:
+ return &bpf_tcp_check_syncookie_proto;
+ case BPF_FUNC_skb_ecn_set_ce:
+diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
+index 41de3a2f29e15..2fe6a3379aaed 100644
+--- a/net/core/rtnetlink.c
++++ b/net/core/rtnetlink.c
+@@ -961,24 +961,27 @@ static inline int rtnl_vfinfo_size(const struct net_device *dev,
+ nla_total_size(sizeof(struct ifla_vf_rate)) +
+ nla_total_size(sizeof(struct ifla_vf_link_state)) +
+ nla_total_size(sizeof(struct ifla_vf_rss_query_en)) +
+- nla_total_size(0) + /* nest IFLA_VF_STATS */
+- /* IFLA_VF_STATS_RX_PACKETS */
+- nla_total_size_64bit(sizeof(__u64)) +
+- /* IFLA_VF_STATS_TX_PACKETS */
+- nla_total_size_64bit(sizeof(__u64)) +
+- /* IFLA_VF_STATS_RX_BYTES */
+- nla_total_size_64bit(sizeof(__u64)) +
+- /* IFLA_VF_STATS_TX_BYTES */
+- nla_total_size_64bit(sizeof(__u64)) +
+- /* IFLA_VF_STATS_BROADCAST */
+- nla_total_size_64bit(sizeof(__u64)) +
+- /* IFLA_VF_STATS_MULTICAST */
+- nla_total_size_64bit(sizeof(__u64)) +
+- /* IFLA_VF_STATS_RX_DROPPED */
+- nla_total_size_64bit(sizeof(__u64)) +
+- /* IFLA_VF_STATS_TX_DROPPED */
+- nla_total_size_64bit(sizeof(__u64)) +
+ nla_total_size(sizeof(struct ifla_vf_trust)));
++ if (~ext_filter_mask & RTEXT_FILTER_SKIP_STATS) {
++ size += num_vfs *
++ (nla_total_size(0) + /* nest IFLA_VF_STATS */
++ /* IFLA_VF_STATS_RX_PACKETS */
++ nla_total_size_64bit(sizeof(__u64)) +
++ /* IFLA_VF_STATS_TX_PACKETS */
++ nla_total_size_64bit(sizeof(__u64)) +
++ /* IFLA_VF_STATS_RX_BYTES */
++ nla_total_size_64bit(sizeof(__u64)) +
++ /* IFLA_VF_STATS_TX_BYTES */
++ nla_total_size_64bit(sizeof(__u64)) +
++ /* IFLA_VF_STATS_BROADCAST */
++ nla_total_size_64bit(sizeof(__u64)) +
++ /* IFLA_VF_STATS_MULTICAST */
++ nla_total_size_64bit(sizeof(__u64)) +
++ /* IFLA_VF_STATS_RX_DROPPED */
++ nla_total_size_64bit(sizeof(__u64)) +
++ /* IFLA_VF_STATS_TX_DROPPED */
++ nla_total_size_64bit(sizeof(__u64)));
++ }
+ return size;
+ } else
+ return 0;
+@@ -1270,7 +1273,8 @@ static noinline_for_stack int rtnl_fill_stats(struct sk_buff *skb,
+ static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb,
+ struct net_device *dev,
+ int vfs_num,
+- struct nlattr *vfinfo)
++ struct nlattr *vfinfo,
++ u32 ext_filter_mask)
+ {
+ struct ifla_vf_rss_query_en vf_rss_query_en;
+ struct nlattr *vf, *vfstats, *vfvlanlist;
+@@ -1376,33 +1380,35 @@ static noinline_for_stack int rtnl_fill_vfinfo(struct sk_buff *skb,
+ goto nla_put_vf_failure;
+ }
+ nla_nest_end(skb, vfvlanlist);
+- memset(&vf_stats, 0, sizeof(vf_stats));
+- if (dev->netdev_ops->ndo_get_vf_stats)
+- dev->netdev_ops->ndo_get_vf_stats(dev, vfs_num,
+- &vf_stats);
+- vfstats = nla_nest_start_noflag(skb, IFLA_VF_STATS);
+- if (!vfstats)
+- goto nla_put_vf_failure;
+- if (nla_put_u64_64bit(skb, IFLA_VF_STATS_RX_PACKETS,
+- vf_stats.rx_packets, IFLA_VF_STATS_PAD) ||
+- nla_put_u64_64bit(skb, IFLA_VF_STATS_TX_PACKETS,
+- vf_stats.tx_packets, IFLA_VF_STATS_PAD) ||
+- nla_put_u64_64bit(skb, IFLA_VF_STATS_RX_BYTES,
+- vf_stats.rx_bytes, IFLA_VF_STATS_PAD) ||
+- nla_put_u64_64bit(skb, IFLA_VF_STATS_TX_BYTES,
+- vf_stats.tx_bytes, IFLA_VF_STATS_PAD) ||
+- nla_put_u64_64bit(skb, IFLA_VF_STATS_BROADCAST,
+- vf_stats.broadcast, IFLA_VF_STATS_PAD) ||
+- nla_put_u64_64bit(skb, IFLA_VF_STATS_MULTICAST,
+- vf_stats.multicast, IFLA_VF_STATS_PAD) ||
+- nla_put_u64_64bit(skb, IFLA_VF_STATS_RX_DROPPED,
+- vf_stats.rx_dropped, IFLA_VF_STATS_PAD) ||
+- nla_put_u64_64bit(skb, IFLA_VF_STATS_TX_DROPPED,
+- vf_stats.tx_dropped, IFLA_VF_STATS_PAD)) {
+- nla_nest_cancel(skb, vfstats);
+- goto nla_put_vf_failure;
++ if (~ext_filter_mask & RTEXT_FILTER_SKIP_STATS) {
++ memset(&vf_stats, 0, sizeof(vf_stats));
++ if (dev->netdev_ops->ndo_get_vf_stats)
++ dev->netdev_ops->ndo_get_vf_stats(dev, vfs_num,
++ &vf_stats);
++ vfstats = nla_nest_start_noflag(skb, IFLA_VF_STATS);
++ if (!vfstats)
++ goto nla_put_vf_failure;
++ if (nla_put_u64_64bit(skb, IFLA_VF_STATS_RX_PACKETS,
++ vf_stats.rx_packets, IFLA_VF_STATS_PAD) ||
++ nla_put_u64_64bit(skb, IFLA_VF_STATS_TX_PACKETS,
++ vf_stats.tx_packets, IFLA_VF_STATS_PAD) ||
++ nla_put_u64_64bit(skb, IFLA_VF_STATS_RX_BYTES,
++ vf_stats.rx_bytes, IFLA_VF_STATS_PAD) ||
++ nla_put_u64_64bit(skb, IFLA_VF_STATS_TX_BYTES,
++ vf_stats.tx_bytes, IFLA_VF_STATS_PAD) ||
++ nla_put_u64_64bit(skb, IFLA_VF_STATS_BROADCAST,
++ vf_stats.broadcast, IFLA_VF_STATS_PAD) ||
++ nla_put_u64_64bit(skb, IFLA_VF_STATS_MULTICAST,
++ vf_stats.multicast, IFLA_VF_STATS_PAD) ||
++ nla_put_u64_64bit(skb, IFLA_VF_STATS_RX_DROPPED,
++ vf_stats.rx_dropped, IFLA_VF_STATS_PAD) ||
++ nla_put_u64_64bit(skb, IFLA_VF_STATS_TX_DROPPED,
++ vf_stats.tx_dropped, IFLA_VF_STATS_PAD)) {
++ nla_nest_cancel(skb, vfstats);
++ goto nla_put_vf_failure;
++ }
++ nla_nest_end(skb, vfstats);
+ }
+- nla_nest_end(skb, vfstats);
+ nla_nest_end(skb, vf);
+ return 0;
+
+@@ -1435,7 +1441,7 @@ static noinline_for_stack int rtnl_fill_vf(struct sk_buff *skb,
+ return -EMSGSIZE;
+
+ for (i = 0; i < num_vfs; i++) {
+- if (rtnl_fill_vfinfo(skb, dev, i, vfinfo))
++ if (rtnl_fill_vfinfo(skb, dev, i, vfinfo, ext_filter_mask))
+ return -EMSGSIZE;
+ }
+
+@@ -4090,7 +4096,7 @@ static int nlmsg_populate_fdb_fill(struct sk_buff *skb,
+ ndm->ndm_ifindex = dev->ifindex;
+ ndm->ndm_state = ndm_state;
+
+- if (nla_put(skb, NDA_LLADDR, ETH_ALEN, addr))
++ if (nla_put(skb, NDA_LLADDR, dev->addr_len, addr))
+ goto nla_put_failure;
+ if (vid)
+ if (nla_put(skb, NDA_VLAN, sizeof(u16), &vid))
+@@ -4104,10 +4110,10 @@ nla_put_failure:
+ return -EMSGSIZE;
+ }
+
+-static inline size_t rtnl_fdb_nlmsg_size(void)
++static inline size_t rtnl_fdb_nlmsg_size(const struct net_device *dev)
+ {
+ return NLMSG_ALIGN(sizeof(struct ndmsg)) +
+- nla_total_size(ETH_ALEN) + /* NDA_LLADDR */
++ nla_total_size(dev->addr_len) + /* NDA_LLADDR */
+ nla_total_size(sizeof(u16)) + /* NDA_VLAN */
+ 0;
+ }
+@@ -4119,7 +4125,7 @@ static void rtnl_fdb_notify(struct net_device *dev, u8 *addr, u16 vid, int type,
+ struct sk_buff *skb;
+ int err = -ENOBUFS;
+
+- skb = nlmsg_new(rtnl_fdb_nlmsg_size(), GFP_ATOMIC);
++ skb = nlmsg_new(rtnl_fdb_nlmsg_size(dev), GFP_ATOMIC);
+ if (!skb)
+ goto errout;
+
+diff --git a/net/core/sock.c b/net/core/sock.c
+index 6e5662ca00fe5..4a0edccf86066 100644
+--- a/net/core/sock.c
++++ b/net/core/sock.c
+@@ -2550,13 +2550,24 @@ kuid_t sock_i_uid(struct sock *sk)
+ }
+ EXPORT_SYMBOL(sock_i_uid);
+
+-unsigned long sock_i_ino(struct sock *sk)
++unsigned long __sock_i_ino(struct sock *sk)
+ {
+ unsigned long ino;
+
+- read_lock_bh(&sk->sk_callback_lock);
++ read_lock(&sk->sk_callback_lock);
+ ino = sk->sk_socket ? SOCK_INODE(sk->sk_socket)->i_ino : 0;
+- read_unlock_bh(&sk->sk_callback_lock);
++ read_unlock(&sk->sk_callback_lock);
++ return ino;
++}
++EXPORT_SYMBOL(__sock_i_ino);
++
++unsigned long sock_i_ino(struct sock *sk)
++{
++ unsigned long ino;
++
++ local_bh_disable();
++ ino = __sock_i_ino(sk);
++ local_bh_enable();
+ return ino;
+ }
+ EXPORT_SYMBOL(sock_i_ino);
+diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
+index 1afed89e03c00..ccbdb98109f80 100644
+--- a/net/dsa/dsa.c
++++ b/net/dsa/dsa.c
+@@ -1106,7 +1106,7 @@ static struct dsa_port *dsa_port_touch(struct dsa_switch *ds, int index)
+ mutex_init(&dp->vlans_lock);
+ INIT_LIST_HEAD(&dp->fdbs);
+ INIT_LIST_HEAD(&dp->mdbs);
+- INIT_LIST_HEAD(&dp->vlans);
++ INIT_LIST_HEAD(&dp->vlans); /* also initializes &dp->user_vlans */
+ INIT_LIST_HEAD(&dp->list);
+ list_add_tail(&dp->list, &dst->ports);
+
+diff --git a/net/dsa/slave.c b/net/dsa/slave.c
+index 165bb2cb84316..527b1d576460f 100644
+--- a/net/dsa/slave.c
++++ b/net/dsa/slave.c
+@@ -27,6 +27,7 @@
+ #include "master.h"
+ #include "netlink.h"
+ #include "slave.h"
++#include "switch.h"
+ #include "tag.h"
+
+ struct dsa_switchdev_event_work {
+@@ -161,8 +162,7 @@ static int dsa_slave_schedule_standalone_work(struct net_device *dev,
+ return 0;
+ }
+
+-static int dsa_slave_host_vlan_rx_filtering(struct net_device *vdev, int vid,
+- void *arg)
++static int dsa_slave_host_vlan_rx_filtering(void *arg, int vid)
+ {
+ struct dsa_host_vlan_rx_filtering_ctx *ctx = arg;
+
+@@ -170,6 +170,28 @@ static int dsa_slave_host_vlan_rx_filtering(struct net_device *vdev, int vid,
+ ctx->addr, vid);
+ }
+
++static int dsa_slave_vlan_for_each(struct net_device *dev,
++ int (*cb)(void *arg, int vid), void *arg)
++{
++ struct dsa_port *dp = dsa_slave_to_port(dev);
++ struct dsa_vlan *v;
++ int err;
++
++ lockdep_assert_held(&dev->addr_list_lock);
++
++ err = cb(arg, 0);
++ if (err)
++ return err;
++
++ list_for_each_entry(v, &dp->user_vlans, list) {
++ err = cb(arg, v->vid);
++ if (err)
++ return err;
++ }
++
++ return 0;
++}
++
+ static int dsa_slave_sync_uc(struct net_device *dev,
+ const unsigned char *addr)
+ {
+@@ -180,18 +202,14 @@ static int dsa_slave_sync_uc(struct net_device *dev,
+ .addr = addr,
+ .event = DSA_UC_ADD,
+ };
+- int err;
+
+ dev_uc_add(master, addr);
+
+ if (!dsa_switch_supports_uc_filtering(dp->ds))
+ return 0;
+
+- err = dsa_slave_schedule_standalone_work(dev, DSA_UC_ADD, addr, 0);
+- if (err)
+- return err;
+-
+- return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx);
++ return dsa_slave_vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering,
++ &ctx);
+ }
+
+ static int dsa_slave_unsync_uc(struct net_device *dev,
+@@ -204,18 +222,14 @@ static int dsa_slave_unsync_uc(struct net_device *dev,
+ .addr = addr,
+ .event = DSA_UC_DEL,
+ };
+- int err;
+
+ dev_uc_del(master, addr);
+
+ if (!dsa_switch_supports_uc_filtering(dp->ds))
+ return 0;
+
+- err = dsa_slave_schedule_standalone_work(dev, DSA_UC_DEL, addr, 0);
+- if (err)
+- return err;
+-
+- return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx);
++ return dsa_slave_vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering,
++ &ctx);
+ }
+
+ static int dsa_slave_sync_mc(struct net_device *dev,
+@@ -228,18 +242,14 @@ static int dsa_slave_sync_mc(struct net_device *dev,
+ .addr = addr,
+ .event = DSA_MC_ADD,
+ };
+- int err;
+
+ dev_mc_add(master, addr);
+
+ if (!dsa_switch_supports_mc_filtering(dp->ds))
+ return 0;
+
+- err = dsa_slave_schedule_standalone_work(dev, DSA_MC_ADD, addr, 0);
+- if (err)
+- return err;
+-
+- return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx);
++ return dsa_slave_vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering,
++ &ctx);
+ }
+
+ static int dsa_slave_unsync_mc(struct net_device *dev,
+@@ -252,18 +262,14 @@ static int dsa_slave_unsync_mc(struct net_device *dev,
+ .addr = addr,
+ .event = DSA_MC_DEL,
+ };
+- int err;
+
+ dev_mc_del(master, addr);
+
+ if (!dsa_switch_supports_mc_filtering(dp->ds))
+ return 0;
+
+- err = dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL, addr, 0);
+- if (err)
+- return err;
+-
+- return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx);
++ return dsa_slave_vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering,
++ &ctx);
+ }
+
+ void dsa_slave_sync_ha(struct net_device *dev)
+@@ -1759,6 +1765,7 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
+ struct netlink_ext_ack extack = {0};
+ struct dsa_switch *ds = dp->ds;
+ struct netdev_hw_addr *ha;
++ struct dsa_vlan *v;
+ int ret;
+
+ /* User port... */
+@@ -1782,8 +1789,17 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
+ !dsa_switch_supports_mc_filtering(ds))
+ return 0;
+
++ v = kzalloc(sizeof(*v), GFP_KERNEL);
++ if (!v) {
++ ret = -ENOMEM;
++ goto rollback;
++ }
++
+ netif_addr_lock_bh(dev);
+
++ v->vid = vid;
++ list_add_tail(&v->list, &dp->user_vlans);
++
+ if (dsa_switch_supports_mc_filtering(ds)) {
+ netdev_for_each_synced_mc_addr(ha, dev) {
+ dsa_slave_schedule_standalone_work(dev, DSA_MC_ADD,
+@@ -1803,6 +1819,12 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
+ dsa_flush_workqueue();
+
+ return 0;
++
++rollback:
++ dsa_port_host_vlan_del(dp, &vlan);
++ dsa_port_vlan_del(dp, &vlan);
++
++ return ret;
+ }
+
+ static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
+@@ -1816,6 +1838,7 @@ static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
+ };
+ struct dsa_switch *ds = dp->ds;
+ struct netdev_hw_addr *ha;
++ struct dsa_vlan *v;
+ int err;
+
+ err = dsa_port_vlan_del(dp, &vlan);
+@@ -1832,6 +1855,15 @@ static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
+
+ netif_addr_lock_bh(dev);
+
++ v = dsa_vlan_find(&dp->user_vlans, &vlan);
++ if (!v) {
++ netif_addr_unlock_bh(dev);
++ return -ENOENT;
++ }
++
++ list_del(&v->list);
++ kfree(v);
++
+ if (dsa_switch_supports_mc_filtering(ds)) {
+ netdev_for_each_synced_mc_addr(ha, dev) {
+ dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL,
+diff --git a/net/dsa/switch.c b/net/dsa/switch.c
+index 8c9a9f94b756a..1a42f93173345 100644
+--- a/net/dsa/switch.c
++++ b/net/dsa/switch.c
+@@ -673,8 +673,8 @@ static bool dsa_port_host_vlan_match(struct dsa_port *dp,
+ return false;
+ }
+
+-static struct dsa_vlan *dsa_vlan_find(struct list_head *vlan_list,
+- const struct switchdev_obj_port_vlan *vlan)
++struct dsa_vlan *dsa_vlan_find(struct list_head *vlan_list,
++ const struct switchdev_obj_port_vlan *vlan)
+ {
+ struct dsa_vlan *v;
+
+diff --git a/net/dsa/switch.h b/net/dsa/switch.h
+index 15e67b95eb6e1..ea034677da153 100644
+--- a/net/dsa/switch.h
++++ b/net/dsa/switch.h
+@@ -111,6 +111,9 @@ struct dsa_notifier_master_state_info {
+ bool operational;
+ };
+
++struct dsa_vlan *dsa_vlan_find(struct list_head *vlan_list,
++ const struct switchdev_obj_port_vlan *vlan);
++
+ int dsa_tree_notify(struct dsa_switch_tree *dst, unsigned long e, void *v);
+ int dsa_broadcast(unsigned long e, void *v);
+
+diff --git a/net/dsa/tag_sja1105.c b/net/dsa/tag_sja1105.c
+index a5f3b73da417f..ade3eeb2f3e6d 100644
+--- a/net/dsa/tag_sja1105.c
++++ b/net/dsa/tag_sja1105.c
+@@ -58,11 +58,8 @@
+ #define SJA1110_TX_TRAILER_LEN 4
+ #define SJA1110_MAX_PADDING_LEN 15
+
+-#define SJA1105_HWTS_RX_EN 0
+-
+ struct sja1105_tagger_private {
+ struct sja1105_tagger_data data; /* Must be first */
+- unsigned long state;
+ /* Protects concurrent access to the meta state machine
+ * from taggers running on multiple ports on SMP systems
+ */
+@@ -118,8 +115,8 @@ static void sja1105_meta_unpack(const struct sk_buff *skb,
+ * a unified unpacking command for both device series.
+ */
+ packing(buf, &meta->tstamp, 31, 0, 4, UNPACK, 0);
+- packing(buf + 4, &meta->dmac_byte_4, 7, 0, 1, UNPACK, 0);
+- packing(buf + 5, &meta->dmac_byte_3, 7, 0, 1, UNPACK, 0);
++ packing(buf + 4, &meta->dmac_byte_3, 7, 0, 1, UNPACK, 0);
++ packing(buf + 5, &meta->dmac_byte_4, 7, 0, 1, UNPACK, 0);
+ packing(buf + 6, &meta->source_port, 7, 0, 1, UNPACK, 0);
+ packing(buf + 7, &meta->switch_id, 7, 0, 1, UNPACK, 0);
+ }
+@@ -392,10 +389,6 @@ static struct sk_buff
+
+ priv = sja1105_tagger_private(ds);
+
+- if (!test_bit(SJA1105_HWTS_RX_EN, &priv->state))
+- /* Do normal processing. */
+- return skb;
+-
+ spin_lock(&priv->meta_lock);
+ /* Was this a link-local frame instead of the meta
+ * that we were expecting?
+@@ -431,12 +424,6 @@ static struct sk_buff
+
+ priv = sja1105_tagger_private(ds);
+
+- /* Drop the meta frame if we're not in the right state
+- * to process it.
+- */
+- if (!test_bit(SJA1105_HWTS_RX_EN, &priv->state))
+- return NULL;
+-
+ spin_lock(&priv->meta_lock);
+
+ stampable_skb = priv->stampable_skb;
+@@ -472,30 +459,6 @@ static struct sk_buff
+ return skb;
+ }
+
+-static bool sja1105_rxtstamp_get_state(struct dsa_switch *ds)
+-{
+- struct sja1105_tagger_private *priv = sja1105_tagger_private(ds);
+-
+- return test_bit(SJA1105_HWTS_RX_EN, &priv->state);
+-}
+-
+-static void sja1105_rxtstamp_set_state(struct dsa_switch *ds, bool on)
+-{
+- struct sja1105_tagger_private *priv = sja1105_tagger_private(ds);
+-
+- if (on)
+- set_bit(SJA1105_HWTS_RX_EN, &priv->state);
+- else
+- clear_bit(SJA1105_HWTS_RX_EN, &priv->state);
+-
+- /* Initialize the meta state machine to a known state */
+- if (!priv->stampable_skb)
+- return;
+-
+- kfree_skb(priv->stampable_skb);
+- priv->stampable_skb = NULL;
+-}
+-
+ static bool sja1105_skb_has_tag_8021q(const struct sk_buff *skb)
+ {
+ u16 tpid = ntohs(eth_hdr(skb)->h_proto);
+@@ -545,33 +508,53 @@ static struct sk_buff *sja1105_rcv(struct sk_buff *skb,
+ is_link_local = sja1105_is_link_local(skb);
+ is_meta = sja1105_is_meta_frame(skb);
+
+- if (sja1105_skb_has_tag_8021q(skb)) {
+- /* Normal traffic path. */
+- sja1105_vlan_rcv(skb, &source_port, &switch_id, &vbid, &vid);
+- } else if (is_link_local) {
++ if (is_link_local) {
+ /* Management traffic path. Switch embeds the switch ID and
+ * port ID into bytes of the destination MAC, courtesy of
+ * the incl_srcpt options.
+ */
+ source_port = hdr->h_dest[3];
+ switch_id = hdr->h_dest[4];
+- /* Clear the DMAC bytes that were mangled by the switch */
+- hdr->h_dest[3] = 0;
+- hdr->h_dest[4] = 0;
+ } else if (is_meta) {
+ sja1105_meta_unpack(skb, &meta);
+ source_port = meta.source_port;
+ switch_id = meta.switch_id;
+- } else {
++ }
++
++ /* Normal data plane traffic and link-local frames are tagged with
++ * a tag_8021q VLAN which we have to strip
++ */
++ if (sja1105_skb_has_tag_8021q(skb)) {
++ int tmp_source_port = -1, tmp_switch_id = -1;
++
++ sja1105_vlan_rcv(skb, &tmp_source_port, &tmp_switch_id, &vbid,
++ &vid);
++ /* Preserve the source information from the INCL_SRCPT option,
++ * if available. This allows us to not overwrite a valid source
++ * port and switch ID with zeroes when receiving link-local
++ * frames from a VLAN-unaware bridged port (non-zero vbid) or a
++ * VLAN-aware bridged port (non-zero vid). Furthermore, the
++ * tag_8021q source port information is only of trust when the
++ * vbid is 0 (precise port). Otherwise, tmp_source_port and
++ * tmp_switch_id will be zeroes.
++ */
++ if (vbid == 0 && source_port == -1)
++ source_port = tmp_source_port;
++ if (vbid == 0 && switch_id == -1)
++ switch_id = tmp_switch_id;
++ } else if (source_port == -1 && switch_id == -1) {
++ /* Packets with no source information have no chance of
++ * getting accepted, drop them straight away.
++ */
+ return NULL;
+ }
+
+- if (vbid >= 1)
++ if (source_port != -1 && switch_id != -1)
++ skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
++ else if (vbid >= 1)
+ skb->dev = dsa_tag_8021q_find_port_by_vbid(netdev, vbid);
+- else if (source_port == -1 || switch_id == -1)
+- skb->dev = dsa_find_designated_bridge_port_by_vid(netdev, vid);
+ else
+- skb->dev = dsa_master_find_slave(netdev, switch_id, source_port);
++ skb->dev = dsa_find_designated_bridge_port_by_vid(netdev, vid);
+ if (!skb->dev) {
+ netdev_warn(netdev, "Couldn't decode source port\n");
+ return NULL;
+@@ -762,7 +745,6 @@ static void sja1105_disconnect(struct dsa_switch *ds)
+
+ static int sja1105_connect(struct dsa_switch *ds)
+ {
+- struct sja1105_tagger_data *tagger_data;
+ struct sja1105_tagger_private *priv;
+ struct kthread_worker *xmit_worker;
+ int err;
+@@ -782,10 +764,6 @@ static int sja1105_connect(struct dsa_switch *ds)
+ }
+
+ priv->xmit_worker = xmit_worker;
+- /* Export functions for switch driver use */
+- tagger_data = &priv->data;
+- tagger_data->rxtstamp_get_state = sja1105_rxtstamp_get_state;
+- tagger_data->rxtstamp_set_state = sja1105_rxtstamp_set_state;
+ ds->tagger_data = priv;
+
+ return 0;
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index bf8b22218dd46..57f1e4883b761 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -3590,8 +3590,11 @@ static int tcp_ack_update_window(struct sock *sk, const struct sk_buff *skb, u32
+ static bool __tcp_oow_rate_limited(struct net *net, int mib_idx,
+ u32 *last_oow_ack_time)
+ {
+- if (*last_oow_ack_time) {
+- s32 elapsed = (s32)(tcp_jiffies32 - *last_oow_ack_time);
++ /* Paired with the WRITE_ONCE() in this function. */
++ u32 val = READ_ONCE(*last_oow_ack_time);
++
++ if (val) {
++ s32 elapsed = (s32)(tcp_jiffies32 - val);
+
+ if (0 <= elapsed &&
+ elapsed < READ_ONCE(net->ipv4.sysctl_tcp_invalid_ratelimit)) {
+@@ -3600,7 +3603,10 @@ static bool __tcp_oow_rate_limited(struct net *net, int mib_idx,
+ }
+ }
+
+- *last_oow_ack_time = tcp_jiffies32;
++ /* Paired with the prior READ_ONCE() and with itself,
++ * as we might be lockless.
++ */
++ WRITE_ONCE(*last_oow_ack_time, tcp_jiffies32);
+
+ return false; /* not rate-limited: go ahead, send dupack now! */
+ }
+diff --git a/net/mac80211/debugfs_netdev.c b/net/mac80211/debugfs_netdev.c
+index b0cef37eb3948..03374eb8b7cb9 100644
+--- a/net/mac80211/debugfs_netdev.c
++++ b/net/mac80211/debugfs_netdev.c
+@@ -717,7 +717,7 @@ static void add_sta_files(struct ieee80211_sub_if_data *sdata)
+ DEBUGFS_ADD_MODE(uapsd_queues, 0600);
+ DEBUGFS_ADD_MODE(uapsd_max_sp_len, 0600);
+ DEBUGFS_ADD_MODE(tdls_wider_bw, 0600);
+- DEBUGFS_ADD_MODE(valid_links, 0200);
++ DEBUGFS_ADD_MODE(valid_links, 0400);
+ DEBUGFS_ADD_MODE(active_links, 0600);
+ }
+
+diff --git a/net/mac80211/eht.c b/net/mac80211/eht.c
+index 18bc6b78b2679..ddc7acc68335a 100644
+--- a/net/mac80211/eht.c
++++ b/net/mac80211/eht.c
+@@ -2,7 +2,7 @@
+ /*
+ * EHT handling
+ *
+- * Copyright(c) 2021-2022 Intel Corporation
++ * Copyright(c) 2021-2023 Intel Corporation
+ */
+
+ #include "ieee80211_i.h"
+@@ -25,8 +25,7 @@ ieee80211_eht_cap_ie_to_sta_eht_cap(struct ieee80211_sub_if_data *sdata,
+ memset(eht_cap, 0, sizeof(*eht_cap));
+
+ if (!eht_cap_ie_elem ||
+- !ieee80211_get_eht_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif)))
++ !ieee80211_get_eht_iftype_cap_vif(sband, &sdata->vif))
+ return;
+
+ mcs_nss_size = ieee80211_eht_mcs_nss_size(he_cap_ie_elem,
+diff --git a/net/mac80211/he.c b/net/mac80211/he.c
+index 0322abae08250..9f5ffdc9db284 100644
+--- a/net/mac80211/he.c
++++ b/net/mac80211/he.c
+@@ -128,8 +128,7 @@ ieee80211_he_cap_ie_to_sta_he_cap(struct ieee80211_sub_if_data *sdata,
+ return;
+
+ own_he_cap_ptr =
+- ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif);
+ if (!own_he_cap_ptr)
+ return;
+
+diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
+index 5a4303130ef22..93da8373583be 100644
+--- a/net/mac80211/mlme.c
++++ b/net/mac80211/mlme.c
+@@ -511,16 +511,14 @@ static int ieee80211_config_bw(struct ieee80211_link_data *link,
+
+ /* don't check HE if we associated as non-HE station */
+ if (link->u.mgd.conn_flags & IEEE80211_CONN_DISABLE_HE ||
+- !ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif))) {
++ !ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif)) {
+ he_oper = NULL;
+ eht_oper = NULL;
+ }
+
+ /* don't check EHT if we associated as non-EHT station */
+ if (link->u.mgd.conn_flags & IEEE80211_CONN_DISABLE_EHT ||
+- !ieee80211_get_eht_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif)))
++ !ieee80211_get_eht_iftype_cap_vif(sband, &sdata->vif))
+ eht_oper = NULL;
+
+ /*
+@@ -776,8 +774,7 @@ static void ieee80211_add_he_ie(struct ieee80211_sub_if_data *sdata,
+ const struct ieee80211_sta_he_cap *he_cap;
+ u8 he_cap_size;
+
+- he_cap = ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ he_cap = ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif);
+ if (WARN_ON(!he_cap))
+ return;
+
+@@ -806,10 +803,8 @@ static void ieee80211_add_eht_ie(struct ieee80211_sub_if_data *sdata,
+ const struct ieee80211_sta_eht_cap *eht_cap;
+ u8 eht_cap_size;
+
+- he_cap = ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
+- eht_cap = ieee80211_get_eht_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ he_cap = ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif);
++ eht_cap = ieee80211_get_eht_iftype_cap_vif(sband, &sdata->vif);
+
+ /*
+ * EHT capabilities element is only added if the HE capabilities element
+@@ -3949,8 +3944,7 @@ static bool ieee80211_twt_req_supported(struct ieee80211_sub_if_data *sdata,
+ const struct ieee802_11_elems *elems)
+ {
+ const struct ieee80211_sta_he_cap *own_he_cap =
+- ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif);
+
+ if (elems->ext_capab_len < 10)
+ return false;
+@@ -3986,8 +3980,7 @@ static bool ieee80211_twt_bcast_support(struct ieee80211_sub_if_data *sdata,
+ struct link_sta_info *link_sta)
+ {
+ const struct ieee80211_sta_he_cap *own_he_cap =
+- ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif);
+
+ return bss_conf->he_support &&
+ (link_sta->pub->he_cap.he_cap_elem.mac_cap_info[2] &
+@@ -4624,8 +4617,7 @@ ieee80211_verify_sta_he_mcs_support(struct ieee80211_sub_if_data *sdata,
+ const struct ieee80211_he_operation *he_op)
+ {
+ const struct ieee80211_sta_he_cap *sta_he_cap =
+- ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif);
+ u16 ap_min_req_set;
+ int i;
+
+@@ -4759,15 +4751,13 @@ static int ieee80211_prep_channel(struct ieee80211_sub_if_data *sdata,
+ *conn_flags |= IEEE80211_CONN_DISABLE_EHT;
+ }
+
+- if (!ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif))) {
++ if (!ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif)) {
+ mlme_dbg(sdata, "HE not supported, disabling HE and EHT\n");
+ *conn_flags |= IEEE80211_CONN_DISABLE_HE;
+ *conn_flags |= IEEE80211_CONN_DISABLE_EHT;
+ }
+
+- if (!ieee80211_get_eht_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif))) {
++ if (!ieee80211_get_eht_iftype_cap_vif(sband, &sdata->vif)) {
+ mlme_dbg(sdata, "EHT not supported, disabling EHT\n");
+ *conn_flags |= IEEE80211_CONN_DISABLE_EHT;
+ }
+diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
+index 1400512e0dde5..a1cd5c234f47e 100644
+--- a/net/mac80211/sta_info.c
++++ b/net/mac80211/sta_info.c
+@@ -2913,6 +2913,8 @@ int ieee80211_sta_activate_link(struct sta_info *sta, unsigned int link_id)
+ if (!test_sta_flag(sta, WLAN_STA_INSERTED))
+ goto hash;
+
++ ieee80211_recalc_min_chandef(sdata, link_id);
++
+ /* Ensure the values are updated for the driver,
+ * redone by sta_remove_link on failure.
+ */
+diff --git a/net/mac80211/util.c b/net/mac80211/util.c
+index 3bd07a0a782f7..4cfd6b9b705cb 100644
+--- a/net/mac80211/util.c
++++ b/net/mac80211/util.c
+@@ -6,7 +6,7 @@
+ * Copyright 2007 Johannes Berg <johannes@sipsolutions.net>
+ * Copyright 2013-2014 Intel Mobile Communications GmbH
+ * Copyright (C) 2015-2017 Intel Deutschland GmbH
+- * Copyright (C) 2018-2022 Intel Corporation
++ * Copyright (C) 2018-2023 Intel Corporation
+ *
+ * utilities for mac80211
+ */
+@@ -2121,8 +2121,7 @@ static int ieee80211_build_preq_ies_band(struct ieee80211_sub_if_data *sdata,
+ *offset = noffset;
+ }
+
+- he_cap = ieee80211_get_he_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ he_cap = ieee80211_get_he_iftype_cap_vif(sband, &sdata->vif);
+ if (he_cap &&
+ cfg80211_any_usable_channels(local->hw.wiphy, BIT(sband->band),
+ IEEE80211_CHAN_NO_HE)) {
+@@ -2131,8 +2130,7 @@ static int ieee80211_build_preq_ies_band(struct ieee80211_sub_if_data *sdata,
+ goto out_err;
+ }
+
+- eht_cap = ieee80211_get_eht_iftype_cap(sband,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ eht_cap = ieee80211_get_eht_iftype_cap_vif(sband, &sdata->vif);
+
+ if (eht_cap &&
+ cfg80211_any_usable_channels(local->hw.wiphy, BIT(sband->band),
+@@ -2150,8 +2148,7 @@ static int ieee80211_build_preq_ies_band(struct ieee80211_sub_if_data *sdata,
+ struct ieee80211_supported_band *sband6;
+
+ sband6 = local->hw.wiphy->bands[NL80211_BAND_6GHZ];
+- he_cap = ieee80211_get_he_iftype_cap(sband6,
+- ieee80211_vif_type_p2p(&sdata->vif));
++ he_cap = ieee80211_get_he_iftype_cap_vif(sband6, &sdata->vif);
+
+ if (he_cap) {
+ enum nl80211_iftype iftype =
+@@ -3801,10 +3798,8 @@ bool ieee80211_chandef_he_6ghz_oper(struct ieee80211_sub_if_data *sdata,
+ }
+
+ eht_cap = ieee80211_get_eht_iftype_cap(sband, iftype);
+- if (!eht_cap) {
+- sdata_info(sdata, "Missing iftype sband data/EHT cap");
++ if (!eht_cap)
+ eht_oper = NULL;
+- }
+
+ he_6ghz_oper = ieee80211_he_6ghz_oper(he_oper);
+
+diff --git a/net/netfilter/ipvs/Kconfig b/net/netfilter/ipvs/Kconfig
+index 271da8447b293..2a3017b9c001b 100644
+--- a/net/netfilter/ipvs/Kconfig
++++ b/net/netfilter/ipvs/Kconfig
+@@ -44,7 +44,8 @@ config IP_VS_DEBUG
+
+ config IP_VS_TAB_BITS
+ int "IPVS connection table size (the Nth power of 2)"
+- range 8 20
++ range 8 20 if !64BIT
++ range 8 27 if 64BIT
+ default 12
+ help
+ The IPVS connection hash table uses the chaining scheme to handle
+@@ -54,24 +55,24 @@ config IP_VS_TAB_BITS
+
+ Note the table size must be power of 2. The table size will be the
+ value of 2 to the your input number power. The number to choose is
+- from 8 to 20, the default number is 12, which means the table size
+- is 4096. Don't input the number too small, otherwise you will lose
+- performance on it. You can adapt the table size yourself, according
+- to your virtual server application. It is good to set the table size
+- not far less than the number of connections per second multiplying
+- average lasting time of connection in the table. For example, your
+- virtual server gets 200 connections per second, the connection lasts
+- for 200 seconds in average in the connection table, the table size
+- should be not far less than 200x200, it is good to set the table
+- size 32768 (2**15).
++ from 8 to 27 for 64BIT(20 otherwise), the default number is 12,
++ which means the table size is 4096. Don't input the number too
++ small, otherwise you will lose performance on it. You can adapt the
++ table size yourself, according to your virtual server application.
++ It is good to set the table size not far less than the number of
++ connections per second multiplying average lasting time of
++ connection in the table. For example, your virtual server gets 200
++ connections per second, the connection lasts for 200 seconds in
++ average in the connection table, the table size should be not far
++ less than 200x200, it is good to set the table size 32768 (2**15).
+
+ Another note that each connection occupies 128 bytes effectively and
+ each hash entry uses 8 bytes, so you can estimate how much memory is
+ needed for your box.
+
+ You can overwrite this number setting conn_tab_bits module parameter
+- or by appending ip_vs.conn_tab_bits=? to the kernel command line
+- if IP VS was compiled built-in.
++ or by appending ip_vs.conn_tab_bits=? to the kernel command line if
++ IP VS was compiled built-in.
+
+ comment "IPVS transport protocol load balancing support"
+
+diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
+index 928e64653837f..f4c55e65abd12 100644
+--- a/net/netfilter/ipvs/ip_vs_conn.c
++++ b/net/netfilter/ipvs/ip_vs_conn.c
+@@ -1485,8 +1485,8 @@ int __init ip_vs_conn_init(void)
+ int idx;
+
+ /* Compute size and mask */
+- if (ip_vs_conn_tab_bits < 8 || ip_vs_conn_tab_bits > 20) {
+- pr_info("conn_tab_bits not in [8, 20]. Using default value\n");
++ if (ip_vs_conn_tab_bits < 8 || ip_vs_conn_tab_bits > 27) {
++ pr_info("conn_tab_bits not in [8, 27]. Using default value\n");
+ ip_vs_conn_tab_bits = CONFIG_IP_VS_TAB_BITS;
+ }
+ ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits;
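[Editor's sketch, not part of the patch: the widened clamp above in stand-alone form, assuming the Kconfig default of 12 bits. Values outside [8, 27] fall back to the default exactly as ip_vs_conn_init() now does.]

	#include <stdio.h>

	#define CONFIG_IP_VS_TAB_BITS 12	/* assumed Kconfig default */

	int main(void)
	{
		int conn_tab_bits = 30;	/* hypothetical module parameter */

		if (conn_tab_bits < 8 || conn_tab_bits > 27) {
			printf("conn_tab_bits not in [8, 27]. Using default value\n");
			conn_tab_bits = CONFIG_IP_VS_TAB_BITS;
		}
		printf("connection table size: %d entries\n", 1 << conn_tab_bits);
		return 0;
	}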
+diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
+index 0c4db2f2ac43e..f22691f838536 100644
+--- a/net/netfilter/nf_conntrack_helper.c
++++ b/net/netfilter/nf_conntrack_helper.c
+@@ -360,6 +360,9 @@ int nf_conntrack_helper_register(struct nf_conntrack_helper *me)
+ BUG_ON(me->expect_class_max >= NF_CT_MAX_EXPECT_CLASSES);
+ BUG_ON(strlen(me->name) > NF_CT_HELPER_NAME_LEN - 1);
+
++ if (!nf_ct_helper_hash)
++ return -ENOENT;
++
+ if (me->expect_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
+ return -EINVAL;
+
+@@ -515,4 +518,5 @@ int nf_conntrack_helper_init(void)
+ void nf_conntrack_helper_fini(void)
+ {
+ kvfree(nf_ct_helper_hash);
++ nf_ct_helper_hash = NULL;
+ }
+diff --git a/net/netfilter/nf_conntrack_proto_dccp.c b/net/netfilter/nf_conntrack_proto_dccp.c
+index c1557d47ccd1e..d4fd626d2b8c3 100644
+--- a/net/netfilter/nf_conntrack_proto_dccp.c
++++ b/net/netfilter/nf_conntrack_proto_dccp.c
+@@ -432,9 +432,19 @@ static bool dccp_error(const struct dccp_hdr *dh,
+ struct sk_buff *skb, unsigned int dataoff,
+ const struct nf_hook_state *state)
+ {
++ static const unsigned long require_seq48 = 1 << DCCP_PKT_REQUEST |
++ 1 << DCCP_PKT_RESPONSE |
++ 1 << DCCP_PKT_CLOSEREQ |
++ 1 << DCCP_PKT_CLOSE |
++ 1 << DCCP_PKT_RESET |
++ 1 << DCCP_PKT_SYNC |
++ 1 << DCCP_PKT_SYNCACK;
+ unsigned int dccp_len = skb->len - dataoff;
+ unsigned int cscov;
+ const char *msg;
++ u8 type;
++
++ BUILD_BUG_ON(DCCP_PKT_INVALID >= BITS_PER_LONG);
+
+ if (dh->dccph_doff * 4 < sizeof(struct dccp_hdr) ||
+ dh->dccph_doff * 4 > dccp_len) {
+@@ -459,34 +469,70 @@ static bool dccp_error(const struct dccp_hdr *dh,
+ goto out_invalid;
+ }
+
+- if (dh->dccph_type >= DCCP_PKT_INVALID) {
++ type = dh->dccph_type;
++ if (type >= DCCP_PKT_INVALID) {
+ msg = "nf_ct_dccp: reserved packet type ";
+ goto out_invalid;
+ }
++
++ if (test_bit(type, &require_seq48) && !dh->dccph_x) {
++ msg = "nf_ct_dccp: type lacks 48bit sequence numbers";
++ goto out_invalid;
++ }
++
+ return false;
+ out_invalid:
+ nf_l4proto_log_invalid(skb, state, IPPROTO_DCCP, "%s", msg);
+ return true;
+ }
+
++struct nf_conntrack_dccp_buf {
++ struct dccp_hdr dh; /* generic header part */
++ struct dccp_hdr_ext ext; /* optional depending dh->dccph_x */
++ union { /* depends on header type */
++ struct dccp_hdr_ack_bits ack;
++ struct dccp_hdr_request req;
++ struct dccp_hdr_response response;
++ struct dccp_hdr_reset rst;
++ } u;
++};
++
++static struct dccp_hdr *
++dccp_header_pointer(const struct sk_buff *skb, int offset, const struct dccp_hdr *dh,
++ struct nf_conntrack_dccp_buf *buf)
++{
++ unsigned int hdrlen = __dccp_hdr_len(dh);
++
++ if (hdrlen > sizeof(*buf))
++ return NULL;
++
++ return skb_header_pointer(skb, offset, hdrlen, buf);
++}
++
+ int nf_conntrack_dccp_packet(struct nf_conn *ct, struct sk_buff *skb,
+ unsigned int dataoff,
+ enum ip_conntrack_info ctinfo,
+ const struct nf_hook_state *state)
+ {
+ enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
+- struct dccp_hdr _dh, *dh;
++ struct nf_conntrack_dccp_buf _dh;
+ u_int8_t type, old_state, new_state;
+ enum ct_dccp_roles role;
+ unsigned int *timeouts;
++ struct dccp_hdr *dh;
+
+- dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &_dh);
++ dh = skb_header_pointer(skb, dataoff, sizeof(*dh), &_dh.dh);
+ if (!dh)
+ return NF_DROP;
+
+ if (dccp_error(dh, skb, dataoff, state))
+ return -NF_ACCEPT;
+
++ /* pull again, including possible 48 bit sequences and subtype header */
++ dh = dccp_header_pointer(skb, dataoff, dh, &_dh);
++ if (!dh)
++ return NF_DROP;
++
+ type = dh->dccph_type;
+ if (!nf_ct_is_confirmed(ct) && !dccp_new(ct, skb, dh, state))
+ return -NF_ACCEPT;
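[Editor's sketch, not part of the patch: the require_seq48 check above reduces to a packet-type bitmask lookup. The constants below are stand-ins for the DCCP_PKT_* enum, and a plain mask test replaces the kernel's test_bit().]

	#include <stdio.h>

	enum { PKT_REQUEST, PKT_RESPONSE, PKT_DATA, PKT_RESET, PKT_INVALID };

	int main(void)
	{
		const unsigned long require_seq48 = 1UL << PKT_REQUEST |
						    1UL << PKT_RESPONSE |
						    1UL << PKT_RESET;
		unsigned int type = PKT_RESET;	/* dh->dccph_type */
		int dccph_x = 0;		/* 0: short (24 bit) sequence numbers */

		if (type >= PKT_INVALID)
			printf("reserved packet type\n");
		else if ((require_seq48 & (1UL << type)) && !dccph_x)
			printf("type lacks 48bit sequence numbers\n");
		else
			printf("ok\n");
		return 0;
	}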
+diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
+index 77f5e82d8e3fe..d0eac27f6ba03 100644
+--- a/net/netfilter/nf_conntrack_sip.c
++++ b/net/netfilter/nf_conntrack_sip.c
+@@ -611,7 +611,7 @@ int ct_sip_parse_numerical_param(const struct nf_conn *ct, const char *dptr,
+ start += strlen(name);
+ *val = simple_strtoul(start, &end, 0);
+ if (start == end)
+- return 0;
++ return -1;
+ if (matchoff && matchlen) {
+ *matchoff = start - dptr;
+ *matchlen = end - start;
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index 4c7937fd803f9..79719e8cda799 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -2693,7 +2693,7 @@ err_hooks:
+
+ static struct nft_chain *nft_chain_lookup_byid(const struct net *net,
+ const struct nft_table *table,
+- const struct nlattr *nla)
++ const struct nlattr *nla, u8 genmask)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+ u32 id = ntohl(nla_get_be32(nla));
+@@ -2704,7 +2704,8 @@ static struct nft_chain *nft_chain_lookup_byid(const struct net *net,
+
+ if (trans->msg_type == NFT_MSG_NEWCHAIN &&
+ chain->table == table &&
+- id == nft_trans_chain_id(trans))
++ id == nft_trans_chain_id(trans) &&
++ nft_active_genmask(chain, genmask))
+ return chain;
+ }
+ return ERR_PTR(-ENOENT);
+@@ -3808,7 +3809,8 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ return -EOPNOTSUPP;
+
+ } else if (nla[NFTA_RULE_CHAIN_ID]) {
+- chain = nft_chain_lookup_byid(net, table, nla[NFTA_RULE_CHAIN_ID]);
++ chain = nft_chain_lookup_byid(net, table, nla[NFTA_RULE_CHAIN_ID],
++ genmask);
+ if (IS_ERR(chain)) {
+ NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_CHAIN_ID]);
+ return PTR_ERR(chain);
+@@ -5343,6 +5345,8 @@ void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set,
+ nft_set_trans_unbind(ctx, set);
+ if (nft_set_is_anonymous(set))
+ nft_deactivate_next(ctx->net, set);
++ else
++ list_del_rcu(&binding->list);
+
+ set->use--;
+ break;
+@@ -6769,7 +6773,9 @@ err_set_full:
+ err_element_clash:
+ kfree(trans);
+ err_elem_free:
+- nft_set_elem_destroy(set, elem.priv, true);
++ nf_tables_set_elem_destroy(ctx, set, elem.priv);
++ if (obj)
++ obj->use--;
+ err_parse_data:
+ if (nla[NFTA_SET_ELEM_DATA] != NULL)
+ nft_data_release(&elem.data.val, desc.type);
+@@ -10463,7 +10469,8 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data,
+ genmask);
+ } else if (tb[NFTA_VERDICT_CHAIN_ID]) {
+ chain = nft_chain_lookup_byid(ctx->net, ctx->table,
+- tb[NFTA_VERDICT_CHAIN_ID]);
++ tb[NFTA_VERDICT_CHAIN_ID],
++ genmask);
+ if (IS_ERR(chain))
+ return PTR_ERR(chain);
+ } else {
+diff --git a/net/netfilter/nft_byteorder.c b/net/netfilter/nft_byteorder.c
+index b66647a5a1717..8db3481ed5549 100644
+--- a/net/netfilter/nft_byteorder.c
++++ b/net/netfilter/nft_byteorder.c
+@@ -30,11 +30,11 @@ void nft_byteorder_eval(const struct nft_expr *expr,
+ const struct nft_byteorder *priv = nft_expr_priv(expr);
+ 	u32 *src = &regs->data[priv->sreg];
+ 	u32 *dst = &regs->data[priv->dreg];
+- union { u32 u32; u16 u16; } *s, *d;
++ u16 *s16, *d16;
+ unsigned int i;
+
+- s = (void *)src;
+- d = (void *)dst;
++ s16 = (void *)src;
++ d16 = (void *)dst;
+
+ switch (priv->size) {
+ case 8: {
+@@ -62,11 +62,11 @@ void nft_byteorder_eval(const struct nft_expr *expr,
+ switch (priv->op) {
+ case NFT_BYTEORDER_NTOH:
+ for (i = 0; i < priv->len / 4; i++)
+- d[i].u32 = ntohl((__force __be32)s[i].u32);
++ dst[i] = ntohl((__force __be32)src[i]);
+ break;
+ case NFT_BYTEORDER_HTON:
+ for (i = 0; i < priv->len / 4; i++)
+- d[i].u32 = (__force __u32)htonl(s[i].u32);
++ dst[i] = (__force __u32)htonl(src[i]);
+ break;
+ }
+ break;
+@@ -74,11 +74,11 @@ void nft_byteorder_eval(const struct nft_expr *expr,
+ switch (priv->op) {
+ case NFT_BYTEORDER_NTOH:
+ for (i = 0; i < priv->len / 2; i++)
+- d[i].u16 = ntohs((__force __be16)s[i].u16);
++ d16[i] = ntohs((__force __be16)s16[i]);
+ break;
+ case NFT_BYTEORDER_HTON:
+ for (i = 0; i < priv->len / 2; i++)
+- d[i].u16 = (__force __u16)htons(s[i].u16);
++ d16[i] = (__force __u16)htons(s16[i]);
+ break;
+ }
+ break;
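[Editor's sketch, not part of the patch: the 16-bit path after the fix above treats the register area as a flat u16 array instead of going through the removed union. The pointer cast mirrors the kernel code; register values here are arbitrary stand-ins.]

	#include <arpa/inet.h>
	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		uint32_t regs[2] = { 0x12345678, 0x9abcdef0 };	/* stand-in registers */
		uint16_t *s16 = (uint16_t *)regs;
		uint16_t *d16 = (uint16_t *)regs;	/* in place, as if sreg == dreg */
		unsigned int len = 8, i;		/* priv->len */

		for (i = 0; i < len / 2; i++)		/* NFT_BYTEORDER_HTON, size 2 */
			d16[i] = htons(s16[i]);

		printf("%08x %08x\n", regs[0], regs[1]);
		return 0;
	}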
+diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
+index 3a1e0fd5bf149..5968b6450d828 100644
+--- a/net/netlink/af_netlink.c
++++ b/net/netlink/af_netlink.c
+@@ -1600,6 +1600,7 @@ out:
+ int netlink_set_err(struct sock *ssk, u32 portid, u32 group, int code)
+ {
+ struct netlink_set_err_data info;
++ unsigned long flags;
+ struct sock *sk;
+ int ret = 0;
+
+@@ -1609,12 +1610,12 @@ int netlink_set_err(struct sock *ssk, u32 portid, u32 group, int code)
+ /* sk->sk_err wants a positive error value */
+ info.code = -code;
+
+- read_lock(&nl_table_lock);
++ read_lock_irqsave(&nl_table_lock, flags);
+
+ sk_for_each_bound(sk, &nl_table[ssk->sk_protocol].mc_list)
+ ret += do_one_set_err(sk, &info);
+
+- read_unlock(&nl_table_lock);
++ read_unlock_irqrestore(&nl_table_lock, flags);
+ return ret;
+ }
+ EXPORT_SYMBOL(netlink_set_err);
+diff --git a/net/netlink/diag.c b/net/netlink/diag.c
+index c6255eac305c7..e4f21b1067bcc 100644
+--- a/net/netlink/diag.c
++++ b/net/netlink/diag.c
+@@ -94,6 +94,7 @@ static int __netlink_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
+ struct net *net = sock_net(skb->sk);
+ struct netlink_diag_req *req;
+ struct netlink_sock *nlsk;
++ unsigned long flags;
+ struct sock *sk;
+ int num = 2;
+ int ret = 0;
+@@ -152,7 +153,7 @@ static int __netlink_diag_dump(struct sk_buff *skb, struct netlink_callback *cb,
+ num++;
+
+ mc_list:
+- read_lock(&nl_table_lock);
++ read_lock_irqsave(&nl_table_lock, flags);
+ sk_for_each_bound(sk, &tbl->mc_list) {
+ if (sk_hashed(sk))
+ continue;
+@@ -167,13 +168,13 @@ mc_list:
+ NETLINK_CB(cb->skb).portid,
+ cb->nlh->nlmsg_seq,
+ NLM_F_MULTI,
+- sock_i_ino(sk)) < 0) {
++ __sock_i_ino(sk)) < 0) {
+ ret = 1;
+ break;
+ }
+ num++;
+ }
+- read_unlock(&nl_table_lock);
++ read_unlock_irqrestore(&nl_table_lock, flags);
+
+ done:
+ cb->args[0] = num;
+diff --git a/net/nfc/llcp.h b/net/nfc/llcp.h
+index c1d9be636933c..d8345ed57c954 100644
+--- a/net/nfc/llcp.h
++++ b/net/nfc/llcp.h
+@@ -201,7 +201,6 @@ void nfc_llcp_sock_link(struct llcp_sock_list *l, struct sock *s);
+ void nfc_llcp_sock_unlink(struct llcp_sock_list *l, struct sock *s);
+ void nfc_llcp_socket_remote_param_init(struct nfc_llcp_sock *sock);
+ struct nfc_llcp_local *nfc_llcp_find_local(struct nfc_dev *dev);
+-struct nfc_llcp_local *nfc_llcp_local_get(struct nfc_llcp_local *local);
+ int nfc_llcp_local_put(struct nfc_llcp_local *local);
+ u8 nfc_llcp_get_sdp_ssap(struct nfc_llcp_local *local,
+ struct nfc_llcp_sock *sock);
+diff --git a/net/nfc/llcp_commands.c b/net/nfc/llcp_commands.c
+index 41e3a20c89355..e2680a3bef799 100644
+--- a/net/nfc/llcp_commands.c
++++ b/net/nfc/llcp_commands.c
+@@ -359,6 +359,7 @@ int nfc_llcp_send_symm(struct nfc_dev *dev)
+ struct sk_buff *skb;
+ struct nfc_llcp_local *local;
+ u16 size = 0;
++ int err;
+
+ local = nfc_llcp_find_local(dev);
+ if (local == NULL)
+@@ -368,8 +369,10 @@ int nfc_llcp_send_symm(struct nfc_dev *dev)
+ size += dev->tx_headroom + dev->tx_tailroom + NFC_HEADER_SIZE;
+
+ skb = alloc_skb(size, GFP_KERNEL);
+- if (skb == NULL)
+- return -ENOMEM;
++ if (skb == NULL) {
++ err = -ENOMEM;
++ goto out;
++ }
+
+ skb_reserve(skb, dev->tx_headroom + NFC_HEADER_SIZE);
+
+@@ -379,8 +382,11 @@ int nfc_llcp_send_symm(struct nfc_dev *dev)
+
+ nfc_llcp_send_to_raw_sock(local, skb, NFC_DIRECTION_TX);
+
+- return nfc_data_exchange(dev, local->target_idx, skb,
++ err = nfc_data_exchange(dev, local->target_idx, skb,
+ nfc_llcp_recv, local);
++out:
++ nfc_llcp_local_put(local);
++ return err;
+ }
+
+ int nfc_llcp_send_connect(struct nfc_llcp_sock *sock)
+@@ -390,7 +396,8 @@ int nfc_llcp_send_connect(struct nfc_llcp_sock *sock)
+ const u8 *service_name_tlv = NULL;
+ const u8 *miux_tlv = NULL;
+ const u8 *rw_tlv = NULL;
+- u8 service_name_tlv_length, miux_tlv_length, rw_tlv_length, rw;
++ u8 service_name_tlv_length = 0;
++ u8 miux_tlv_length, rw_tlv_length, rw;
+ int err;
+ u16 size = 0;
+ __be16 miux;
+diff --git a/net/nfc/llcp_core.c b/net/nfc/llcp_core.c
+index a27e1842b2a09..f60e424e06076 100644
+--- a/net/nfc/llcp_core.c
++++ b/net/nfc/llcp_core.c
+@@ -17,6 +17,8 @@
+ static u8 llcp_magic[3] = {0x46, 0x66, 0x6d};
+
+ static LIST_HEAD(llcp_devices);
++/* Protects llcp_devices list */
++static DEFINE_SPINLOCK(llcp_devices_lock);
+
+ static void nfc_llcp_rx_skb(struct nfc_llcp_local *local, struct sk_buff *skb);
+
+@@ -141,7 +143,7 @@ static void nfc_llcp_socket_release(struct nfc_llcp_local *local, bool device,
+ write_unlock(&local->raw_sockets.lock);
+ }
+
+-struct nfc_llcp_local *nfc_llcp_local_get(struct nfc_llcp_local *local)
++static struct nfc_llcp_local *nfc_llcp_local_get(struct nfc_llcp_local *local)
+ {
+ kref_get(&local->ref);
+
+@@ -169,7 +171,6 @@ static void local_release(struct kref *ref)
+
+ local = container_of(ref, struct nfc_llcp_local, ref);
+
+- list_del(&local->list);
+ local_cleanup(local);
+ kfree(local);
+ }
+@@ -282,12 +283,33 @@ static void nfc_llcp_sdreq_timer(struct timer_list *t)
+ struct nfc_llcp_local *nfc_llcp_find_local(struct nfc_dev *dev)
+ {
+ struct nfc_llcp_local *local;
++ struct nfc_llcp_local *res = NULL;
+
++ spin_lock(&llcp_devices_lock);
+ list_for_each_entry(local, &llcp_devices, list)
+- if (local->dev == dev)
++ if (local->dev == dev) {
++ res = nfc_llcp_local_get(local);
++ break;
++ }
++ spin_unlock(&llcp_devices_lock);
++
++ return res;
++}
++
++static struct nfc_llcp_local *nfc_llcp_remove_local(struct nfc_dev *dev)
++{
++ struct nfc_llcp_local *local, *tmp;
++
++ spin_lock(&llcp_devices_lock);
++ list_for_each_entry_safe(local, tmp, &llcp_devices, list)
++ if (local->dev == dev) {
++ list_del(&local->list);
++ spin_unlock(&llcp_devices_lock);
+ return local;
++ }
++ spin_unlock(&llcp_devices_lock);
+
+- pr_debug("No device found\n");
++ pr_warn("Shutting down device not found\n");
+
+ return NULL;
+ }
+@@ -608,12 +630,15 @@ u8 *nfc_llcp_general_bytes(struct nfc_dev *dev, size_t *general_bytes_len)
+
+ *general_bytes_len = local->gb_len;
+
++ nfc_llcp_local_put(local);
++
+ return local->gb;
+ }
+
+ int nfc_llcp_set_remote_gb(struct nfc_dev *dev, const u8 *gb, u8 gb_len)
+ {
+ struct nfc_llcp_local *local;
++ int err;
+
+ if (gb_len < 3 || gb_len > NFC_MAX_GT_LEN)
+ return -EINVAL;
+@@ -630,12 +655,16 @@ int nfc_llcp_set_remote_gb(struct nfc_dev *dev, const u8 *gb, u8 gb_len)
+
+ if (memcmp(local->remote_gb, llcp_magic, 3)) {
+ pr_err("MAC does not support LLCP\n");
+- return -EINVAL;
++ err = -EINVAL;
++ goto out;
+ }
+
+- return nfc_llcp_parse_gb_tlv(local,
++ err = nfc_llcp_parse_gb_tlv(local,
+ &local->remote_gb[3],
+ local->remote_gb_len - 3);
++out:
++ nfc_llcp_local_put(local);
++ return err;
+ }
+
+ static u8 nfc_llcp_dsap(const struct sk_buff *pdu)
+@@ -1517,6 +1546,8 @@ int nfc_llcp_data_received(struct nfc_dev *dev, struct sk_buff *skb)
+
+ __nfc_llcp_recv(local, skb);
+
++ nfc_llcp_local_put(local);
++
+ return 0;
+ }
+
+@@ -1533,6 +1564,8 @@ void nfc_llcp_mac_is_down(struct nfc_dev *dev)
+
+ /* Close and purge all existing sockets */
+ nfc_llcp_socket_release(local, true, 0);
++
++ nfc_llcp_local_put(local);
+ }
+
+ void nfc_llcp_mac_is_up(struct nfc_dev *dev, u32 target_idx,
+@@ -1558,6 +1591,8 @@ void nfc_llcp_mac_is_up(struct nfc_dev *dev, u32 target_idx,
+ mod_timer(&local->link_timer,
+ jiffies + msecs_to_jiffies(local->remote_lto));
+ }
++
++ nfc_llcp_local_put(local);
+ }
+
+ int nfc_llcp_register_device(struct nfc_dev *ndev)
+@@ -1608,7 +1643,7 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)
+
+ void nfc_llcp_unregister_device(struct nfc_dev *dev)
+ {
+- struct nfc_llcp_local *local = nfc_llcp_find_local(dev);
++ struct nfc_llcp_local *local = nfc_llcp_remove_local(dev);
+
+ if (local == NULL) {
+ pr_debug("No such device\n");
+diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
+index 77642d18a3b43..645677f84dba2 100644
+--- a/net/nfc/llcp_sock.c
++++ b/net/nfc/llcp_sock.c
+@@ -99,7 +99,7 @@ static int llcp_sock_bind(struct socket *sock, struct sockaddr *addr, int alen)
+ }
+
+ llcp_sock->dev = dev;
+- llcp_sock->local = nfc_llcp_local_get(local);
++ llcp_sock->local = local;
+ llcp_sock->nfc_protocol = llcp_addr.nfc_protocol;
+ llcp_sock->service_name_len = min_t(unsigned int,
+ llcp_addr.service_name_len,
+@@ -186,7 +186,7 @@ static int llcp_raw_sock_bind(struct socket *sock, struct sockaddr *addr,
+ }
+
+ llcp_sock->dev = dev;
+- llcp_sock->local = nfc_llcp_local_get(local);
++ llcp_sock->local = local;
+ llcp_sock->nfc_protocol = llcp_addr.nfc_protocol;
+
+ nfc_llcp_sock_link(&local->raw_sockets, sk);
+@@ -696,22 +696,22 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
+ if (dev->dep_link_up == false) {
+ ret = -ENOLINK;
+ device_unlock(&dev->dev);
+- goto put_dev;
++ goto sock_llcp_put_local;
+ }
+ device_unlock(&dev->dev);
+
+ if (local->rf_mode == NFC_RF_INITIATOR &&
+ addr->target_idx != local->target_idx) {
+ ret = -ENOLINK;
+- goto put_dev;
++ goto sock_llcp_put_local;
+ }
+
+ llcp_sock->dev = dev;
+- llcp_sock->local = nfc_llcp_local_get(local);
++ llcp_sock->local = local;
+ llcp_sock->ssap = nfc_llcp_get_local_ssap(local);
+ if (llcp_sock->ssap == LLCP_SAP_MAX) {
+ ret = -ENOMEM;
+- goto sock_llcp_put_local;
++ goto sock_llcp_nullify;
+ }
+
+ llcp_sock->reserved_ssap = llcp_sock->ssap;
+@@ -757,11 +757,13 @@ sock_unlink:
+ sock_llcp_release:
+ nfc_llcp_put_ssap(local, llcp_sock->ssap);
+
+-sock_llcp_put_local:
+- nfc_llcp_local_put(llcp_sock->local);
++sock_llcp_nullify:
+ llcp_sock->local = NULL;
+ llcp_sock->dev = NULL;
+
++sock_llcp_put_local:
++ nfc_llcp_local_put(local);
++
+ put_dev:
+ nfc_put_device(dev);
+
+diff --git a/net/nfc/netlink.c b/net/nfc/netlink.c
+index b9264e730fd93..e9ac6a6f934e7 100644
+--- a/net/nfc/netlink.c
++++ b/net/nfc/netlink.c
+@@ -1039,11 +1039,14 @@ static int nfc_genl_llc_get_params(struct sk_buff *skb, struct genl_info *info)
+ msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+ if (!msg) {
+ rc = -ENOMEM;
+- goto exit;
++ goto put_local;
+ }
+
+ rc = nfc_genl_send_params(msg, local, info->snd_portid, info->snd_seq);
+
++put_local:
++ nfc_llcp_local_put(local);
++
+ exit:
+ device_unlock(&dev->dev);
+
+@@ -1105,7 +1108,7 @@ static int nfc_genl_llc_set_params(struct sk_buff *skb, struct genl_info *info)
+ if (info->attrs[NFC_ATTR_LLC_PARAM_LTO]) {
+ if (dev->dep_link_up) {
+ rc = -EINPROGRESS;
+- goto exit;
++ goto put_local;
+ }
+
+ local->lto = nla_get_u8(info->attrs[NFC_ATTR_LLC_PARAM_LTO]);
+@@ -1117,6 +1120,9 @@ static int nfc_genl_llc_set_params(struct sk_buff *skb, struct genl_info *info)
+ if (info->attrs[NFC_ATTR_LLC_PARAM_MIUX])
+ local->miux = cpu_to_be16(miux);
+
++put_local:
++ nfc_llcp_local_put(local);
++
+ exit:
+ device_unlock(&dev->dev);
+
+@@ -1172,7 +1178,7 @@ static int nfc_genl_llc_sdreq(struct sk_buff *skb, struct genl_info *info)
+
+ if (rc != 0) {
+ rc = -EINVAL;
+- goto exit;
++ goto put_local;
+ }
+
+ if (!sdp_attrs[NFC_SDP_ATTR_URI])
+@@ -1191,7 +1197,7 @@ static int nfc_genl_llc_sdreq(struct sk_buff *skb, struct genl_info *info)
+ sdreq = nfc_llcp_build_sdreq_tlv(tid, uri, uri_len);
+ if (sdreq == NULL) {
+ rc = -ENOMEM;
+- goto exit;
++ goto put_local;
+ }
+
+ tlvs_len += sdreq->tlv_len;
+@@ -1201,10 +1207,14 @@ static int nfc_genl_llc_sdreq(struct sk_buff *skb, struct genl_info *info)
+
+ if (hlist_empty(&sdreq_list)) {
+ rc = -EINVAL;
+- goto exit;
++ goto put_local;
+ }
+
+ rc = nfc_llcp_send_snl_sdreq(local, &sdreq_list, tlvs_len);
++
++put_local:
++ nfc_llcp_local_put(local);
++
+ exit:
+ device_unlock(&dev->dev);
+
+diff --git a/net/nfc/nfc.h b/net/nfc/nfc.h
+index de2ec66d7e83a..0b1e6466f4fbf 100644
+--- a/net/nfc/nfc.h
++++ b/net/nfc/nfc.h
+@@ -52,6 +52,7 @@ int nfc_llcp_set_remote_gb(struct nfc_dev *dev, const u8 *gb, u8 gb_len);
+ u8 *nfc_llcp_general_bytes(struct nfc_dev *dev, size_t *general_bytes_len);
+ int nfc_llcp_data_received(struct nfc_dev *dev, struct sk_buff *skb);
+ struct nfc_llcp_local *nfc_llcp_find_local(struct nfc_dev *dev);
++int nfc_llcp_local_put(struct nfc_llcp_local *local);
+ int __init nfc_llcp_init(void);
+ void nfc_llcp_exit(void);
+ void nfc_llcp_free_sdp_tlv(struct nfc_llcp_sdp_tlv *sdp);
+diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
+index 5d96ffebd40f0..598d6e299152a 100644
+--- a/net/sched/act_ipt.c
++++ b/net/sched/act_ipt.c
+@@ -21,6 +21,7 @@
+ #include <linux/tc_act/tc_ipt.h>
+ #include <net/tc_act/tc_ipt.h>
+ #include <net/tc_wrapper.h>
++#include <net/ip.h>
+
+ #include <linux/netfilter_ipv4/ip_tables.h>
+
+@@ -48,7 +49,7 @@ static int ipt_init_target(struct net *net, struct xt_entry_target *t,
+ par.entryinfo = &e;
+ par.target = target;
+ par.targinfo = t->data;
+- par.hook_mask = hook;
++ par.hook_mask = 1 << hook;
+ par.family = NFPROTO_IPV4;
+
+ ret = xt_check_target(&par, t->u.target_size - sizeof(*t), 0, false);
+@@ -85,7 +86,8 @@ static void tcf_ipt_release(struct tc_action *a)
+
+ static const struct nla_policy ipt_policy[TCA_IPT_MAX + 1] = {
+ [TCA_IPT_TABLE] = { .type = NLA_STRING, .len = IFNAMSIZ },
+- [TCA_IPT_HOOK] = { .type = NLA_U32 },
++ [TCA_IPT_HOOK] = NLA_POLICY_RANGE(NLA_U32, NF_INET_PRE_ROUTING,
++ NF_INET_NUMHOOKS),
+ [TCA_IPT_INDEX] = { .type = NLA_U32 },
+ [TCA_IPT_TARG] = { .len = sizeof(struct xt_entry_target) },
+ };
+@@ -158,15 +160,27 @@ static int __tcf_ipt_init(struct net *net, unsigned int id, struct nlattr *nla,
+ return -EEXIST;
+ }
+ }
++
++ err = -EINVAL;
+ hook = nla_get_u32(tb[TCA_IPT_HOOK]);
++ switch (hook) {
++ case NF_INET_PRE_ROUTING:
++ break;
++ case NF_INET_POST_ROUTING:
++ break;
++ default:
++ goto err1;
++ }
++
++ if (tb[TCA_IPT_TABLE]) {
++ /* mangle only for now */
++ if (nla_strcmp(tb[TCA_IPT_TABLE], "mangle"))
++ goto err1;
++ }
+
+- err = -ENOMEM;
+- tname = kmalloc(IFNAMSIZ, GFP_KERNEL);
++ tname = kstrdup("mangle", GFP_KERNEL);
+ if (unlikely(!tname))
+ goto err1;
+- if (tb[TCA_IPT_TABLE] == NULL ||
+- nla_strscpy(tname, tb[TCA_IPT_TABLE], IFNAMSIZ) >= IFNAMSIZ)
+- strcpy(tname, "mangle");
+
+ t = kmemdup(td, td->u.target_size, GFP_KERNEL);
+ if (unlikely(!t))
+@@ -217,10 +231,31 @@ static int tcf_xt_init(struct net *net, struct nlattr *nla,
+ a, &act_xt_ops, tp, flags);
+ }
+
++static bool tcf_ipt_act_check(struct sk_buff *skb)
++{
++ const struct iphdr *iph;
++ unsigned int nhoff, len;
++
++ if (!pskb_may_pull(skb, sizeof(struct iphdr)))
++ return false;
++
++ nhoff = skb_network_offset(skb);
++ iph = ip_hdr(skb);
++ if (iph->ihl < 5 || iph->version != 4)
++ return false;
++
++ len = skb_ip_totlen(skb);
++ if (skb->len < nhoff + len || len < (iph->ihl * 4u))
++ return false;
++
++ return pskb_may_pull(skb, iph->ihl * 4u);
++}
++
+ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb,
+ const struct tc_action *a,
+ struct tcf_result *res)
+ {
++ char saved_cb[sizeof_field(struct sk_buff, cb)];
+ int ret = 0, result = 0;
+ struct tcf_ipt *ipt = to_ipt(a);
+ struct xt_action_param par;
+@@ -231,9 +266,24 @@ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb,
+ .pf = NFPROTO_IPV4,
+ };
+
++ if (skb_protocol(skb, false) != htons(ETH_P_IP))
++ return TC_ACT_UNSPEC;
++
+ if (skb_unclone(skb, GFP_ATOMIC))
+ return TC_ACT_UNSPEC;
+
++ if (!tcf_ipt_act_check(skb))
++ return TC_ACT_UNSPEC;
++
++ if (state.hook == NF_INET_POST_ROUTING) {
++ if (!skb_dst(skb))
++ return TC_ACT_UNSPEC;
++
++ state.out = skb->dev;
++ }
++
++ memcpy(saved_cb, skb->cb, sizeof(saved_cb));
++
+ spin_lock(&ipt->tcf_lock);
+
+ tcf_lastuse_update(&ipt->tcf_tm);
+@@ -246,6 +296,9 @@ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb,
+ par.state = &state;
+ par.target = ipt->tcfi_t->u.kernel.target;
+ par.targinfo = ipt->tcfi_t->data;
++
++ memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
++
+ ret = par.target->target(skb, &par);
+
+ switch (ret) {
+@@ -266,6 +319,9 @@ TC_INDIRECT_SCOPE int tcf_ipt_act(struct sk_buff *skb,
+ break;
+ }
+ spin_unlock(&ipt->tcf_lock);
++
++ memcpy(skb->cb, saved_cb, sizeof(skb->cb));
++
+ return result;
+
+ }
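[Editor's sketch, not part of the patch: the new tcf_ipt_act_check() boils down to basic IPv4 header sanity tests. A rough user-space approximation over a raw buffer follows; struct iphdr comes from glibc's netinet/ip.h, and the packet bytes are a made-up example.]

	#include <netinet/ip.h>
	#include <arpa/inet.h>
	#include <stdio.h>
	#include <string.h>

	static int ipv4_header_ok(const unsigned char *pkt, unsigned int len)
	{
		struct iphdr iph;

		if (len < sizeof(iph))		/* pskb_may_pull(sizeof(struct iphdr)) */
			return 0;
		memcpy(&iph, pkt, sizeof(iph));
		if (iph.ihl < 5 || iph.version != 4)
			return 0;
		if (len < ntohs(iph.tot_len) || ntohs(iph.tot_len) < iph.ihl * 4u)
			return 0;
		return len >= iph.ihl * 4u;	/* pskb_may_pull(ihl * 4) */
	}

	int main(void)
	{
		unsigned char pkt[20] = { 0x45, 0x00, 0x00, 0x14 };	/* ver 4, ihl 5, tot_len 20 */

		printf("%s\n", ipv4_header_ok(pkt, sizeof(pkt)) ? "ok" : "malformed");
		return 0;
	}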
+diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c
+index c819b812a899c..399d4643a940e 100644
+--- a/net/sched/act_pedit.c
++++ b/net/sched/act_pedit.c
+@@ -29,6 +29,7 @@ static struct tc_action_ops act_pedit_ops;
+
+ static const struct nla_policy pedit_policy[TCA_PEDIT_MAX + 1] = {
+ [TCA_PEDIT_PARMS] = { .len = sizeof(struct tc_pedit) },
++ [TCA_PEDIT_PARMS_EX] = { .len = sizeof(struct tc_pedit) },
+ [TCA_PEDIT_KEYS_EX] = { .type = NLA_NESTED },
+ };
+
+diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
+index e79be1b3e74da..b93ec2a3454eb 100644
+--- a/net/sched/sch_netem.c
++++ b/net/sched/sch_netem.c
+@@ -773,12 +773,10 @@ static void dist_free(struct disttable *d)
+ * signed 16 bit values.
+ */
+
+-static int get_dist_table(struct Qdisc *sch, struct disttable **tbl,
+- const struct nlattr *attr)
++static int get_dist_table(struct disttable **tbl, const struct nlattr *attr)
+ {
+ size_t n = nla_len(attr)/sizeof(__s16);
+ const __s16 *data = nla_data(attr);
+- spinlock_t *root_lock;
+ struct disttable *d;
+ int i;
+
+@@ -793,13 +791,7 @@ static int get_dist_table(struct Qdisc *sch, struct disttable **tbl,
+ for (i = 0; i < n; i++)
+ d->table[i] = data[i];
+
+- root_lock = qdisc_root_sleeping_lock(sch);
+-
+- spin_lock_bh(root_lock);
+- swap(*tbl, d);
+- spin_unlock_bh(root_lock);
+-
+- dist_free(d);
++ *tbl = d;
+ return 0;
+ }
+
+@@ -956,6 +948,8 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
+ {
+ struct netem_sched_data *q = qdisc_priv(sch);
+ struct nlattr *tb[TCA_NETEM_MAX + 1];
++ struct disttable *delay_dist = NULL;
++ struct disttable *slot_dist = NULL;
+ struct tc_netem_qopt *qopt;
+ struct clgstate old_clg;
+ int old_loss_model = CLG_RANDOM;
+@@ -966,6 +960,18 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
+ if (ret < 0)
+ return ret;
+
++ if (tb[TCA_NETEM_DELAY_DIST]) {
++ ret = get_dist_table(&delay_dist, tb[TCA_NETEM_DELAY_DIST]);
++ if (ret)
++ goto table_free;
++ }
++
++ if (tb[TCA_NETEM_SLOT_DIST]) {
++ ret = get_dist_table(&slot_dist, tb[TCA_NETEM_SLOT_DIST]);
++ if (ret)
++ goto table_free;
++ }
++
+ sch_tree_lock(sch);
+ /* backup q->clg and q->loss_model */
+ old_clg = q->clg;
+@@ -975,26 +981,17 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
+ ret = get_loss_clg(q, tb[TCA_NETEM_LOSS]);
+ if (ret) {
+ q->loss_model = old_loss_model;
++ q->clg = old_clg;
+ goto unlock;
+ }
+ } else {
+ q->loss_model = CLG_RANDOM;
+ }
+
+- if (tb[TCA_NETEM_DELAY_DIST]) {
+- ret = get_dist_table(sch, &q->delay_dist,
+- tb[TCA_NETEM_DELAY_DIST]);
+- if (ret)
+- goto get_table_failure;
+- }
+-
+- if (tb[TCA_NETEM_SLOT_DIST]) {
+- ret = get_dist_table(sch, &q->slot_dist,
+- tb[TCA_NETEM_SLOT_DIST]);
+- if (ret)
+- goto get_table_failure;
+- }
+-
++ if (delay_dist)
++ swap(q->delay_dist, delay_dist);
++ if (slot_dist)
++ swap(q->slot_dist, slot_dist);
+ sch->limit = qopt->limit;
+
+ q->latency = PSCHED_TICKS2NS(qopt->latency);
+@@ -1044,17 +1041,11 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
+
+ unlock:
+ sch_tree_unlock(sch);
+- return ret;
+
+-get_table_failure:
+- /* recover clg and loss_model, in case of
+- * q->clg and q->loss_model were modified
+- * in get_loss_clg()
+- */
+- q->clg = old_clg;
+- q->loss_model = old_loss_model;
+-
+- goto unlock;
++table_free:
++ dist_free(delay_dist);
++ dist_free(slot_dist);
++ return ret;
+ }
+
+ static int netem_init(struct Qdisc *sch, struct nlattr *opt,
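[Editor's sketch, not part of the patch: the netem change above adopts a common shape, namely allocate the new distribution table without the qdisc lock held, swap pointers under the lock, then free whichever table is left over afterwards. A pthread-based analogue with stand-in names:]

	#include <pthread.h>
	#include <stdio.h>
	#include <stdlib.h>

	static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;

	struct disttable { int size; };

	int main(void)
	{
		struct disttable *current_tbl = NULL;
		struct disttable *new_tbl = malloc(sizeof(*new_tbl));	/* built off-lock */

		if (!new_tbl)
			return 1;
		new_tbl->size = 4096;

		pthread_mutex_lock(&tree_lock);
		struct disttable *tmp = current_tbl;	/* swap(q->delay_dist, delay_dist) */
		current_tbl = new_tbl;
		new_tbl = tmp;
		pthread_mutex_unlock(&tree_lock);

		free(new_tbl);		/* dist_free() of the displaced table, off-lock */
		printf("active table size: %d\n", current_tbl->size);
		free(current_tbl);
		return 0;
	}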
+diff --git a/net/sctp/socket.c b/net/sctp/socket.c
+index cda8c2874691d..ee15eff6364ee 100644
+--- a/net/sctp/socket.c
++++ b/net/sctp/socket.c
+@@ -364,9 +364,9 @@ static void sctp_auto_asconf_init(struct sctp_sock *sp)
+ struct net *net = sock_net(&sp->inet.sk);
+
+ if (net->sctp.default_auto_asconf) {
+- spin_lock(&net->sctp.addr_wq_lock);
++ spin_lock_bh(&net->sctp.addr_wq_lock);
+ list_add_tail(&sp->auto_asconf_list, &net->sctp.auto_asconf_splist);
+- spin_unlock(&net->sctp.addr_wq_lock);
++ spin_unlock_bh(&net->sctp.addr_wq_lock);
+ sp->do_auto_asconf = 1;
+ }
+ }
+@@ -8281,6 +8281,22 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname,
+ return retval;
+ }
+
++static bool sctp_bpf_bypass_getsockopt(int level, int optname)
++{
++ if (level == SOL_SCTP) {
++ switch (optname) {
++ case SCTP_SOCKOPT_PEELOFF:
++ case SCTP_SOCKOPT_PEELOFF_FLAGS:
++ case SCTP_SOCKOPT_CONNECTX3:
++ return true;
++ default:
++ return false;
++ }
++ }
++
++ return false;
++}
++
+ static int sctp_hash(struct sock *sk)
+ {
+ /* STUB */
+@@ -9650,6 +9666,7 @@ struct proto sctp_prot = {
+ .shutdown = sctp_shutdown,
+ .setsockopt = sctp_setsockopt,
+ .getsockopt = sctp_getsockopt,
++ .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt,
+ .sendmsg = sctp_sendmsg,
+ .recvmsg = sctp_recvmsg,
+ .bind = sctp_bind,
+@@ -9705,6 +9722,7 @@ struct proto sctpv6_prot = {
+ .shutdown = sctp_shutdown,
+ .setsockopt = sctp_setsockopt,
+ .getsockopt = sctp_getsockopt,
++ .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt,
+ .sendmsg = sctp_sendmsg,
+ .recvmsg = sctp_recvmsg,
+ .bind = sctp_bind,
+diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
+index f77cebe2c0713..15f4d0d40bdd6 100644
+--- a/net/sunrpc/svcsock.c
++++ b/net/sunrpc/svcsock.c
+@@ -826,12 +826,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
+
+ trace_sk_data_ready(sk);
+
+- if (svsk) {
+- /* Refer to svc_setup_socket() for details. */
+- rmb();
+- svsk->sk_odata(sk);
+- }
+-
+ /*
+ * This callback may called twice when a new connection
+ * is established as a child socket inherits everything
+@@ -840,13 +834,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
+ * when one of child sockets become ESTABLISHED.
+ * 2) data_ready method of the child socket may be called
+ * when it receives data before the socket is accepted.
+- * In case of 2, we should ignore it silently.
++ * In case of 2, we should ignore it silently and DO NOT
++ * dereference svsk.
+ */
+- if (sk->sk_state == TCP_LISTEN) {
+- if (svsk) {
+- set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
+- svc_xprt_enqueue(&svsk->sk_xprt);
+- }
++ if (sk->sk_state != TCP_LISTEN)
++ return;
++
++ if (svsk) {
++ /* Refer to svc_setup_socket() for details. */
++ rmb();
++ svsk->sk_odata(sk);
++ set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
++ svc_xprt_enqueue(&svsk->sk_xprt);
+ }
+ }
+
+diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+index a22fe7587fa6f..70207d8a318a4 100644
+--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
++++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+@@ -796,6 +796,12 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
+ struct svc_rdma_recv_ctxt *ctxt;
+ int ret;
+
++ /* Prevent svc_xprt_release() from releasing pages in rq_pages
++ * when returning 0 or an error.
++ */
++ rqstp->rq_respages = rqstp->rq_pages;
++ rqstp->rq_next_page = rqstp->rq_respages;
++
+ rqstp->rq_xprt_ctxt = NULL;
+
+ ctxt = NULL;
+@@ -819,12 +825,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
+ DMA_FROM_DEVICE);
+ svc_rdma_build_arg_xdr(rqstp, ctxt);
+
+- /* Prevent svc_xprt_release from releasing pages in rq_pages
+- * if we return 0 or an error.
+- */
+- rqstp->rq_respages = rqstp->rq_pages;
+- rqstp->rq_next_page = rqstp->rq_respages;
+-
+ ret = svc_rdma_xdr_decode_req(&rqstp->rq_arg, ctxt);
+ if (ret < 0)
+ goto out_err;
+diff --git a/net/wireless/core.c b/net/wireless/core.c
+index b3ec9eaec36b3..609b79fe4a748 100644
+--- a/net/wireless/core.c
++++ b/net/wireless/core.c
+@@ -721,22 +721,6 @@ int wiphy_register(struct wiphy *wiphy)
+ return -EINVAL;
+ }
+
+- /*
+- * if a wiphy has unsupported modes for regulatory channel enforcement,
+- * opt-out of enforcement checking
+- */
+- if (wiphy->interface_modes & ~(BIT(NL80211_IFTYPE_STATION) |
+- BIT(NL80211_IFTYPE_P2P_CLIENT) |
+- BIT(NL80211_IFTYPE_AP) |
+- BIT(NL80211_IFTYPE_MESH_POINT) |
+- BIT(NL80211_IFTYPE_P2P_GO) |
+- BIT(NL80211_IFTYPE_ADHOC) |
+- BIT(NL80211_IFTYPE_P2P_DEVICE) |
+- BIT(NL80211_IFTYPE_NAN) |
+- BIT(NL80211_IFTYPE_AP_VLAN) |
+- BIT(NL80211_IFTYPE_MONITOR)))
+- wiphy->regulatory_flags |= REGULATORY_IGNORE_STALE_KICKOFF;
+-
+ if (WARN_ON((wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) &&
+ (wiphy->regulatory_flags &
+ (REGULATORY_CUSTOM_REG |
+diff --git a/net/wireless/reg.c b/net/wireless/reg.c
+index 26f11e4746c05..f9e03850d71bf 100644
+--- a/net/wireless/reg.c
++++ b/net/wireless/reg.c
+@@ -2352,7 +2352,7 @@ static bool reg_wdev_chan_valid(struct wiphy *wiphy, struct wireless_dev *wdev)
+
+ if (!wdev->valid_links && link > 0)
+ break;
+- if (!(wdev->valid_links & BIT(link)))
++ if (wdev->valid_links && !(wdev->valid_links & BIT(link)))
+ continue;
+ switch (iftype) {
+ case NL80211_IFTYPE_AP:
+@@ -2391,9 +2391,17 @@ static bool reg_wdev_chan_valid(struct wiphy *wiphy, struct wireless_dev *wdev)
+ case NL80211_IFTYPE_P2P_DEVICE:
+ /* no enforcement required */
+ break;
++ case NL80211_IFTYPE_OCB:
++ if (!wdev->u.ocb.chandef.chan)
++ continue;
++ chandef = wdev->u.ocb.chandef;
++ break;
++ case NL80211_IFTYPE_NAN:
++ /* we have no info, but NAN is also pretty universal */
++ continue;
+ default:
+ /* others not implemented for now */
+- WARN_ON(1);
++ WARN_ON_ONCE(1);
+ break;
+ }
+
+@@ -2452,9 +2460,7 @@ static void reg_check_chans_work(struct work_struct *work)
+ rtnl_lock();
+
+ list_for_each_entry(rdev, &cfg80211_rdev_list, list)
+- if (!(rdev->wiphy.regulatory_flags &
+- REGULATORY_IGNORE_STALE_KICKOFF))
+- reg_leave_invalid_chans(&rdev->wiphy);
++ reg_leave_invalid_chans(&rdev->wiphy);
+
+ rtnl_unlock();
+ }
+diff --git a/net/wireless/scan.c b/net/wireless/scan.c
+index c501db7bbdb3d..396c63431e1f3 100644
+--- a/net/wireless/scan.c
++++ b/net/wireless/scan.c
+@@ -259,117 +259,152 @@ bool cfg80211_is_element_inherited(const struct element *elem,
+ }
+ EXPORT_SYMBOL(cfg80211_is_element_inherited);
+
+-static size_t cfg80211_gen_new_ie(const u8 *ie, size_t ielen,
+- const u8 *subelement, size_t subie_len,
+- u8 *new_ie, gfp_t gfp)
++static size_t cfg80211_copy_elem_with_frags(const struct element *elem,
++ const u8 *ie, size_t ie_len,
++ u8 **pos, u8 *buf, size_t buf_len)
+ {
+- u8 *pos, *tmp;
+- const u8 *tmp_old, *tmp_new;
+- const struct element *non_inherit_elem;
+- u8 *sub_copy;
++ if (WARN_ON((u8 *)elem < ie || elem->data > ie + ie_len ||
++ elem->data + elem->datalen > ie + ie_len))
++ return 0;
+
+- /* copy subelement as we need to change its content to
+- * mark an ie after it is processed.
+- */
+- sub_copy = kmemdup(subelement, subie_len, gfp);
+- if (!sub_copy)
++ if (elem->datalen + 2 > buf + buf_len - *pos)
+ return 0;
+
+- pos = &new_ie[0];
++ memcpy(*pos, elem, elem->datalen + 2);
++ *pos += elem->datalen + 2;
++
++ /* Finish if it is not fragmented */
++ if (elem->datalen != 255)
++ return *pos - buf;
++
++ ie_len = ie + ie_len - elem->data - elem->datalen;
++ ie = (const u8 *)elem->data + elem->datalen;
++
++ for_each_element(elem, ie, ie_len) {
++ if (elem->id != WLAN_EID_FRAGMENT)
++ break;
++
++ if (elem->datalen + 2 > buf + buf_len - *pos)
++ return 0;
++
++ memcpy(*pos, elem, elem->datalen + 2);
++ *pos += elem->datalen + 2;
+
+- /* set new ssid */
+- tmp_new = cfg80211_find_ie(WLAN_EID_SSID, sub_copy, subie_len);
+- if (tmp_new) {
+- memcpy(pos, tmp_new, tmp_new[1] + 2);
+- pos += (tmp_new[1] + 2);
++ if (elem->datalen != 255)
++ break;
+ }
+
+- /* get non inheritance list if exists */
+- non_inherit_elem =
+- cfg80211_find_ext_elem(WLAN_EID_EXT_NON_INHERITANCE,
+- sub_copy, subie_len);
++ return *pos - buf;
++}
+
+- /* go through IEs in ie (skip SSID) and subelement,
+- * merge them into new_ie
++static size_t cfg80211_gen_new_ie(const u8 *ie, size_t ielen,
++ const u8 *subie, size_t subie_len,
++ u8 *new_ie, size_t new_ie_len)
++{
++ const struct element *non_inherit_elem, *parent, *sub;
++ u8 *pos = new_ie;
++ u8 id, ext_id;
++ unsigned int match_len;
++
++ non_inherit_elem = cfg80211_find_ext_elem(WLAN_EID_EXT_NON_INHERITANCE,
++ subie, subie_len);
++
++ /* We copy the elements one by one from the parent to the generated
++ * elements.
++ * If they are not inherited (included in subie or in the non
++ * inheritance element), then we copy all occurrences the first time
++ * we see this element type.
+ */
+- tmp_old = cfg80211_find_ie(WLAN_EID_SSID, ie, ielen);
+- tmp_old = (tmp_old) ? tmp_old + tmp_old[1] + 2 : ie;
+-
+- while (tmp_old + 2 - ie <= ielen &&
+- tmp_old + tmp_old[1] + 2 - ie <= ielen) {
+- if (tmp_old[0] == 0) {
+- tmp_old++;
++ for_each_element(parent, ie, ielen) {
++ if (parent->id == WLAN_EID_FRAGMENT)
+ continue;
++
++ if (parent->id == WLAN_EID_EXTENSION) {
++ if (parent->datalen < 1)
++ continue;
++
++ id = WLAN_EID_EXTENSION;
++ ext_id = parent->data[0];
++ match_len = 1;
++ } else {
++ id = parent->id;
++ match_len = 0;
+ }
+
+- if (tmp_old[0] == WLAN_EID_EXTENSION)
+- tmp = (u8 *)cfg80211_find_ext_ie(tmp_old[2], sub_copy,
+- subie_len);
+- else
+- tmp = (u8 *)cfg80211_find_ie(tmp_old[0], sub_copy,
+- subie_len);
++ /* Find first occurrence in subie */
++ sub = cfg80211_find_elem_match(id, subie, subie_len,
++ &ext_id, match_len, 0);
+
+- if (!tmp) {
+- const struct element *old_elem = (void *)tmp_old;
++ /* Copy from parent if not in subie and inherited */
++ if (!sub &&
++ cfg80211_is_element_inherited(parent, non_inherit_elem)) {
++ if (!cfg80211_copy_elem_with_frags(parent,
++ ie, ielen,
++ &pos, new_ie,
++ new_ie_len))
++ return 0;
+
+- /* ie in old ie but not in subelement */
+- if (cfg80211_is_element_inherited(old_elem,
+- non_inherit_elem)) {
+- memcpy(pos, tmp_old, tmp_old[1] + 2);
+- pos += tmp_old[1] + 2;
+- }
+- } else {
+- /* ie in transmitting ie also in subelement,
+- * copy from subelement and flag the ie in subelement
+- * as copied (by setting eid field to WLAN_EID_SSID,
+- * which is skipped anyway).
+- * For vendor ie, compare OUI + type + subType to
+- * determine if they are the same ie.
+- */
+- if (tmp_old[0] == WLAN_EID_VENDOR_SPECIFIC) {
+- if (tmp_old[1] >= 5 && tmp[1] >= 5 &&
+- !memcmp(tmp_old + 2, tmp + 2, 5)) {
+- /* same vendor ie, copy from
+- * subelement
+- */
+- memcpy(pos, tmp, tmp[1] + 2);
+- pos += tmp[1] + 2;
+- tmp[0] = WLAN_EID_SSID;
+- } else {
+- memcpy(pos, tmp_old, tmp_old[1] + 2);
+- pos += tmp_old[1] + 2;
+- }
+- } else {
+- /* copy ie from subelement into new ie */
+- memcpy(pos, tmp, tmp[1] + 2);
+- pos += tmp[1] + 2;
+- tmp[0] = WLAN_EID_SSID;
+- }
++ continue;
+ }
+
+- if (tmp_old + tmp_old[1] + 2 - ie == ielen)
+- break;
++ /* Already copied if an earlier element had the same type */
++ if (cfg80211_find_elem_match(id, ie, (u8 *)parent - ie,
++ &ext_id, match_len, 0))
++ continue;
+
+- tmp_old += tmp_old[1] + 2;
++ /* Not inheriting, copy all similar elements from subie */
++ while (sub) {
++ if (!cfg80211_copy_elem_with_frags(sub,
++ subie, subie_len,
++ &pos, new_ie,
++ new_ie_len))
++ return 0;
++
++ sub = cfg80211_find_elem_match(id,
++ sub->data + sub->datalen,
++ subie_len + subie -
++ (sub->data +
++ sub->datalen),
++ &ext_id, match_len, 0);
++ }
+ }
+
+- /* go through subelement again to check if there is any ie not
+- * copied to new ie, skip ssid, capability, bssid-index ie
++ /* The above misses elements that are included in subie but not in the
++ * parent, so do a pass over subie and append those.
++ * Skip the non-tx BSSID caps and non-inheritance element.
+ */
+- tmp_new = sub_copy;
+- while (tmp_new + 2 - sub_copy <= subie_len &&
+- tmp_new + tmp_new[1] + 2 - sub_copy <= subie_len) {
+- if (!(tmp_new[0] == WLAN_EID_NON_TX_BSSID_CAP ||
+- tmp_new[0] == WLAN_EID_SSID)) {
+- memcpy(pos, tmp_new, tmp_new[1] + 2);
+- pos += tmp_new[1] + 2;
++ for_each_element(sub, subie, subie_len) {
++ if (sub->id == WLAN_EID_NON_TX_BSSID_CAP)
++ continue;
++
++ if (sub->id == WLAN_EID_FRAGMENT)
++ continue;
++
++ if (sub->id == WLAN_EID_EXTENSION) {
++ if (sub->datalen < 1)
++ continue;
++
++ id = WLAN_EID_EXTENSION;
++ ext_id = sub->data[0];
++ match_len = 1;
++
++ if (ext_id == WLAN_EID_EXT_NON_INHERITANCE)
++ continue;
++ } else {
++ id = sub->id;
++ match_len = 0;
+ }
+- if (tmp_new + tmp_new[1] + 2 - sub_copy == subie_len)
+- break;
+- tmp_new += tmp_new[1] + 2;
++
++ /* Processed if one was included in the parent */
++ if (cfg80211_find_elem_match(id, ie, ielen,
++ &ext_id, match_len, 0))
++ continue;
++
++ if (!cfg80211_copy_elem_with_frags(sub, subie, subie_len,
++ &pos, new_ie, new_ie_len))
++ return 0;
+ }
+
+- kfree(sub_copy);
+ return pos - new_ie;
+ }
+
+@@ -2212,7 +2247,7 @@ static void cfg80211_parse_mbssid_data(struct wiphy *wiphy,
+ new_ie_len = cfg80211_gen_new_ie(ie, ielen,
+ profile,
+ profile_len, new_ie,
+- gfp);
++ IEEE80211_MAX_DATA_LEN);
+ if (!new_ie_len)
+ continue;
+
+@@ -2261,118 +2296,6 @@ cfg80211_inform_bss_data(struct wiphy *wiphy,
+ }
+ EXPORT_SYMBOL(cfg80211_inform_bss_data);
+
+-static void
+-cfg80211_parse_mbssid_frame_data(struct wiphy *wiphy,
+- struct cfg80211_inform_bss *data,
+- struct ieee80211_mgmt *mgmt, size_t len,
+- struct cfg80211_non_tx_bss *non_tx_data,
+- gfp_t gfp)
+-{
+- enum cfg80211_bss_frame_type ftype;
+- const u8 *ie = mgmt->u.probe_resp.variable;
+- size_t ielen = len - offsetof(struct ieee80211_mgmt,
+- u.probe_resp.variable);
+-
+- ftype = ieee80211_is_beacon(mgmt->frame_control) ?
+- CFG80211_BSS_FTYPE_BEACON : CFG80211_BSS_FTYPE_PRESP;
+-
+- cfg80211_parse_mbssid_data(wiphy, data, ftype, mgmt->bssid,
+- le64_to_cpu(mgmt->u.probe_resp.timestamp),
+- le16_to_cpu(mgmt->u.probe_resp.beacon_int),
+- ie, ielen, non_tx_data, gfp);
+-}
+-
+-static void
+-cfg80211_update_notlisted_nontrans(struct wiphy *wiphy,
+- struct cfg80211_bss *nontrans_bss,
+- struct ieee80211_mgmt *mgmt, size_t len)
+-{
+- u8 *ie, *new_ie, *pos;
+- const struct element *nontrans_ssid;
+- const u8 *trans_ssid, *mbssid;
+- size_t ielen = len - offsetof(struct ieee80211_mgmt,
+- u.probe_resp.variable);
+- size_t new_ie_len;
+- struct cfg80211_bss_ies *new_ies;
+- const struct cfg80211_bss_ies *old;
+- size_t cpy_len;
+-
+- lockdep_assert_held(&wiphy_to_rdev(wiphy)->bss_lock);
+-
+- ie = mgmt->u.probe_resp.variable;
+-
+- new_ie_len = ielen;
+- trans_ssid = cfg80211_find_ie(WLAN_EID_SSID, ie, ielen);
+- if (!trans_ssid)
+- return;
+- new_ie_len -= trans_ssid[1];
+- mbssid = cfg80211_find_ie(WLAN_EID_MULTIPLE_BSSID, ie, ielen);
+- /*
+- * It's not valid to have the MBSSID element before SSID
+- * ignore if that happens - the code below assumes it is
+- * after (while copying things inbetween).
+- */
+- if (!mbssid || mbssid < trans_ssid)
+- return;
+- new_ie_len -= mbssid[1];
+-
+- nontrans_ssid = ieee80211_bss_get_elem(nontrans_bss, WLAN_EID_SSID);
+- if (!nontrans_ssid)
+- return;
+-
+- new_ie_len += nontrans_ssid->datalen;
+-
+- /* generate new ie for nontrans BSS
+- * 1. replace SSID with nontrans BSS' SSID
+- * 2. skip MBSSID IE
+- */
+- new_ie = kzalloc(new_ie_len, GFP_ATOMIC);
+- if (!new_ie)
+- return;
+-
+- new_ies = kzalloc(sizeof(*new_ies) + new_ie_len, GFP_ATOMIC);
+- if (!new_ies)
+- goto out_free;
+-
+- pos = new_ie;
+-
+- /* copy the nontransmitted SSID */
+- cpy_len = nontrans_ssid->datalen + 2;
+- memcpy(pos, nontrans_ssid, cpy_len);
+- pos += cpy_len;
+- /* copy the IEs between SSID and MBSSID */
+- cpy_len = trans_ssid[1] + 2;
+- memcpy(pos, (trans_ssid + cpy_len), (mbssid - (trans_ssid + cpy_len)));
+- pos += (mbssid - (trans_ssid + cpy_len));
+- /* copy the IEs after MBSSID */
+- cpy_len = mbssid[1] + 2;
+- memcpy(pos, mbssid + cpy_len, ((ie + ielen) - (mbssid + cpy_len)));
+-
+- /* update ie */
+- new_ies->len = new_ie_len;
+- new_ies->tsf = le64_to_cpu(mgmt->u.probe_resp.timestamp);
+- new_ies->from_beacon = ieee80211_is_beacon(mgmt->frame_control);
+- memcpy(new_ies->data, new_ie, new_ie_len);
+- if (ieee80211_is_probe_resp(mgmt->frame_control)) {
+- old = rcu_access_pointer(nontrans_bss->proberesp_ies);
+- rcu_assign_pointer(nontrans_bss->proberesp_ies, new_ies);
+- rcu_assign_pointer(nontrans_bss->ies, new_ies);
+- if (old)
+- kfree_rcu((struct cfg80211_bss_ies *)old, rcu_head);
+- } else {
+- old = rcu_access_pointer(nontrans_bss->beacon_ies);
+- rcu_assign_pointer(nontrans_bss->beacon_ies, new_ies);
+- cfg80211_update_hidden_bsses(bss_from_pub(nontrans_bss),
+- new_ies, old);
+- rcu_assign_pointer(nontrans_bss->ies, new_ies);
+- if (old)
+- kfree_rcu((struct cfg80211_bss_ies *)old, rcu_head);
+- }
+-
+-out_free:
+- kfree(new_ie);
+-}
+-
+ /* cfg80211_inform_bss_width_frame helper */
+ static struct cfg80211_bss *
+ cfg80211_inform_single_bss_frame_data(struct wiphy *wiphy,
+@@ -2505,51 +2428,31 @@ cfg80211_inform_bss_frame_data(struct wiphy *wiphy,
+ struct ieee80211_mgmt *mgmt, size_t len,
+ gfp_t gfp)
+ {
+- struct cfg80211_bss *res, *tmp_bss;
++ struct cfg80211_bss *res;
+ const u8 *ie = mgmt->u.probe_resp.variable;
+- const struct cfg80211_bss_ies *ies1, *ies2;
+ size_t ielen = len - offsetof(struct ieee80211_mgmt,
+ u.probe_resp.variable);
++ enum cfg80211_bss_frame_type ftype;
+ struct cfg80211_non_tx_bss non_tx_data = {};
+
+ res = cfg80211_inform_single_bss_frame_data(wiphy, data, mgmt,
+ len, gfp);
++ if (!res)
++ return NULL;
+
+ /* don't do any further MBSSID handling for S1G */
+ if (ieee80211_is_s1g_beacon(mgmt->frame_control))
+ return res;
+
+- if (!res || !wiphy->support_mbssid ||
+- !cfg80211_find_elem(WLAN_EID_MULTIPLE_BSSID, ie, ielen))
+- return res;
+- if (wiphy->support_only_he_mbssid &&
+- !cfg80211_find_ext_elem(WLAN_EID_EXT_HE_CAPABILITY, ie, ielen))
+- return res;
+-
++ ftype = ieee80211_is_beacon(mgmt->frame_control) ?
++ CFG80211_BSS_FTYPE_BEACON : CFG80211_BSS_FTYPE_PRESP;
+ non_tx_data.tx_bss = res;
+- /* process each non-transmitting bss */
+- cfg80211_parse_mbssid_frame_data(wiphy, data, mgmt, len,
+- &non_tx_data, gfp);
+-
+- spin_lock_bh(&wiphy_to_rdev(wiphy)->bss_lock);
+
+- /* check if the res has other nontransmitting bss which is not
+- * in MBSSID IE
+- */
+- ies1 = rcu_access_pointer(res->ies);
+-
+- /* go through nontrans_list, if the timestamp of the BSS is
+- * earlier than the timestamp of the transmitting BSS then
+- * update it
+- */
+- list_for_each_entry(tmp_bss, &res->nontrans_list,
+- nontrans_list) {
+- ies2 = rcu_access_pointer(tmp_bss->ies);
+- if (ies2->tsf < ies1->tsf)
+- cfg80211_update_notlisted_nontrans(wiphy, tmp_bss,
+- mgmt, len);
+- }
+- spin_unlock_bh(&wiphy_to_rdev(wiphy)->bss_lock);
++ /* process each non-transmitting bss */
++ cfg80211_parse_mbssid_data(wiphy, data, ftype, mgmt->bssid,
++ le64_to_cpu(mgmt->u.probe_resp.timestamp),
++ le16_to_cpu(mgmt->u.probe_resp.beacon_int),
++ ie, ielen, &non_tx_data, gfp);
+
+ return res;
+ }
+diff --git a/net/wireless/util.c b/net/wireless/util.c
+index 9755ef281040f..60be95eea6caf 100644
+--- a/net/wireless/util.c
++++ b/net/wireless/util.c
+@@ -580,6 +580,8 @@ int ieee80211_strip_8023_mesh_hdr(struct sk_buff *skb)
+ hdrlen += ETH_ALEN + 2;
+ else if (!pskb_may_pull(skb, hdrlen))
+ return -EINVAL;
++ else
++ payload.eth.h_proto = htons(skb->len - hdrlen);
+
+ mesh_addr = skb->data + sizeof(payload.eth) + ETH_ALEN;
+ switch (payload.flags & MESH_FLAGS_AE) {
+diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
+index cc1e7f15fa731..32dd55b9ce8a8 100644
+--- a/net/xdp/xsk.c
++++ b/net/xdp/xsk.c
+@@ -886,6 +886,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
+ struct sock *sk = sock->sk;
+ struct xdp_sock *xs = xdp_sk(sk);
+ struct net_device *dev;
++ int bound_dev_if;
+ u32 flags, qid;
+ int err = 0;
+
+@@ -899,6 +900,10 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
+ XDP_USE_NEED_WAKEUP))
+ return -EINVAL;
+
++ bound_dev_if = READ_ONCE(sk->sk_bound_dev_if);
++ if (bound_dev_if && bound_dev_if != sxdp->sxdp_ifindex)
++ return -EINVAL;
++
+ rtnl_lock();
+ mutex_lock(&xs->mutex);
+ if (xs->state != XSK_READY) {
+diff --git a/samples/bpf/tcp_basertt_kern.c b/samples/bpf/tcp_basertt_kern.c
+index 8dfe09a92feca..822b0742b8154 100644
+--- a/samples/bpf/tcp_basertt_kern.c
++++ b/samples/bpf/tcp_basertt_kern.c
+@@ -47,7 +47,7 @@ int bpf_basertt(struct bpf_sock_ops *skops)
+ case BPF_SOCK_OPS_BASE_RTT:
+ n = bpf_getsockopt(skops, SOL_TCP, TCP_CONGESTION,
+ cong, sizeof(cong));
+- if (!n && !__builtin_memcmp(cong, nv, sizeof(nv)+1)) {
++ if (!n && !__builtin_memcmp(cong, nv, sizeof(nv))) {
+ /* Set base_rtt to 80us */
+ rv = 80;
+ } else if (n) {
+diff --git a/samples/bpf/xdp1_kern.c b/samples/bpf/xdp1_kern.c
+index 0a5c704badd00..d91f27cbcfa99 100644
+--- a/samples/bpf/xdp1_kern.c
++++ b/samples/bpf/xdp1_kern.c
+@@ -39,7 +39,7 @@ static int parse_ipv6(void *data, u64 nh_off, void *data_end)
+ return ip6h->nexthdr;
+ }
+
+-#define XDPBUFSIZE 64
++#define XDPBUFSIZE 60
+ SEC("xdp.frags")
+ int xdp_prog1(struct xdp_md *ctx)
+ {
+diff --git a/samples/bpf/xdp2_kern.c b/samples/bpf/xdp2_kern.c
+index 67804ecf7ce37..8bca674451ed1 100644
+--- a/samples/bpf/xdp2_kern.c
++++ b/samples/bpf/xdp2_kern.c
+@@ -55,7 +55,7 @@ static int parse_ipv6(void *data, u64 nh_off, void *data_end)
+ return ip6h->nexthdr;
+ }
+
+-#define XDPBUFSIZE 64
++#define XDPBUFSIZE 60
+ SEC("xdp.frags")
+ int xdp_prog1(struct xdp_md *ctx)
+ {
+diff --git a/scripts/Makefile.clang b/scripts/Makefile.clang
+index 9076cc939e874..058a4c0f864ec 100644
+--- a/scripts/Makefile.clang
++++ b/scripts/Makefile.clang
+@@ -34,6 +34,5 @@ CLANG_FLAGS += -Werror=unknown-warning-option
+ CLANG_FLAGS += -Werror=ignored-optimization-argument
+ CLANG_FLAGS += -Werror=option-ignored
+ CLANG_FLAGS += -Werror=unused-command-line-argument
+-KBUILD_CFLAGS += $(CLANG_FLAGS)
+-KBUILD_AFLAGS += $(CLANG_FLAGS)
++KBUILD_CPPFLAGS += $(CLANG_FLAGS)
+ export CLANG_FLAGS
+diff --git a/scripts/Makefile.compiler b/scripts/Makefile.compiler
+index 7aa1fbc4aafef..e31f18625fcf5 100644
+--- a/scripts/Makefile.compiler
++++ b/scripts/Makefile.compiler
+@@ -32,13 +32,13 @@ try-run = $(shell set -e; \
+ # Usage: aflags-y += $(call as-option,-Wa$(comma)-isa=foo,)
+
+ as-option = $(call try-run,\
+- $(CC) -Werror $(KBUILD_AFLAGS) $(1) -c -x assembler-with-cpp /dev/null -o "$$TMP",$(1),$(2))
++ $(CC) -Werror $(KBUILD_CPPFLAGS) $(KBUILD_AFLAGS) $(1) -c -x assembler-with-cpp /dev/null -o "$$TMP",$(1),$(2))
+
+ # as-instr
+ # Usage: aflags-y += $(call as-instr,instr,option1,option2)
+
+ as-instr = $(call try-run,\
+- printf "%b\n" "$(1)" | $(CC) -Werror $(KBUILD_AFLAGS) -c -x assembler-with-cpp -o "$$TMP" -,$(2),$(3))
++ printf "%b\n" "$(1)" | $(CC) -Werror $(CLANG_FLAGS) $(KBUILD_AFLAGS) -c -x assembler-with-cpp -o "$$TMP" -,$(2),$(3))
+
+ # __cc-option
+ # Usage: MY_CFLAGS += $(call __cc-option,$(CC),$(MY_CFLAGS),-march=winchip-c6,-march=i586)
+diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
+index 4703f652c0098..fc19f67039bda 100644
+--- a/scripts/Makefile.modfinal
++++ b/scripts/Makefile.modfinal
+@@ -23,7 +23,7 @@ modname = $(notdir $(@:.mod.o=))
+ part-of-module = y
+
+ quiet_cmd_cc_o_c = CC [M] $@
+- cmd_cc_o_c = $(CC) $(filter-out $(CC_FLAGS_CFI), $(c_flags)) -c -o $@ $<
++ cmd_cc_o_c = $(CC) $(filter-out $(CC_FLAGS_CFI) $(CFLAGS_GCOV), $(c_flags)) -c -o $@ $<
+
+ %.mod.o: %.mod.c FORCE
+ $(call if_changed_dep,cc_o_c)
+diff --git a/scripts/Makefile.vmlinux b/scripts/Makefile.vmlinux
+index 10176dec97eac..3cd6ca15f390d 100644
+--- a/scripts/Makefile.vmlinux
++++ b/scripts/Makefile.vmlinux
+@@ -19,6 +19,7 @@ quiet_cmd_cc_o_c = CC $@
+
+ ifdef CONFIG_MODULES
+ KASAN_SANITIZE_.vmlinux.export.o := n
++GCOV_PROFILE_.vmlinux.export.o := n
+ targets += .vmlinux.export.o
+ vmlinux: .vmlinux.export.o
+ endif
+diff --git a/scripts/mksysmap b/scripts/mksysmap
+index cb3b1fff3eee8..ec33385261022 100755
+--- a/scripts/mksysmap
++++ b/scripts/mksysmap
+@@ -32,7 +32,7 @@ ${NM} -n ${1} | sed >${2} -e "
+ # (do not forget a space before each pattern)
+
+ # local symbols for ARM, MIPS, etc.
+-/ \$/d
++/ \\$/d
+
+ # local labels, .LBB, .Ltmpxxx, .L__unnamed_xx, .LASANPC, etc.
+ / \.L/d
+@@ -41,7 +41,7 @@ ${NM} -n ${1} | sed >${2} -e "
+ / __efistub_/d
+
+ # arm64 local symbols in non-VHE KVM namespace
+-/ __kvm_nvhe_\$/d
++/ __kvm_nvhe_\\$/d
+ / __kvm_nvhe_\.L/d
+
+ # arm64 lld
+diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
+index c12150f96b884..d8baa9b9ae6d8 100644
+--- a/scripts/mod/modpost.c
++++ b/scripts/mod/modpost.c
+@@ -1150,6 +1150,10 @@ static Elf_Sym *find_elf_symbol(struct elf_info *elf, Elf64_Sword addr,
+ if (relsym->st_name != 0)
+ return relsym;
+
++ /*
++ * Strive to find a better symbol name, but the resulting name may not
++ * match the symbol referenced in the original code.
++ */
+ relsym_secindex = get_secindex(elf, relsym);
+ for (sym = elf->symtab_start; sym < elf->symtab_stop; sym++) {
+ if (get_secindex(elf, sym) != relsym_secindex)
+@@ -1286,49 +1290,12 @@ static void default_mismatch_handler(const char *modname, struct elf_info *elf,
+
+ static int is_executable_section(struct elf_info* elf, unsigned int section_index)
+ {
+- if (section_index > elf->num_sections)
++ if (section_index >= elf->num_sections)
+ fatal("section_index is outside elf->num_sections!\n");
+
+ return ((elf->sechdrs[section_index].sh_flags & SHF_EXECINSTR) == SHF_EXECINSTR);
+ }
+
+-/*
+- * We rely on a gross hack in section_rel[a]() calling find_extable_entry_size()
+- * to know the sizeof(struct exception_table_entry) for the target architecture.
+- */
+-static unsigned int extable_entry_size = 0;
+-static void find_extable_entry_size(const char* const sec, const Elf_Rela* r)
+-{
+- /*
+- * If we're currently checking the second relocation within __ex_table,
+- * that relocation offset tells us the offsetof(struct
+- * exception_table_entry, fixup) which is equal to sizeof(struct
+- * exception_table_entry) divided by two. We use that to our advantage
+- * since there's no portable way to get that size as every architecture
+- * seems to go with different sized types. Not pretty but better than
+- * hard-coding the size for every architecture..
+- */
+- if (!extable_entry_size)
+- extable_entry_size = r->r_offset * 2;
+-}
+-
+-static inline bool is_extable_fault_address(Elf_Rela *r)
+-{
+- /*
+- * extable_entry_size is only discovered after we've handled the
+- * _second_ relocation in __ex_table, so only abort when we're not
+- * handling the first reloc and extable_entry_size is zero.
+- */
+- if (r->r_offset && extable_entry_size == 0)
+- fatal("extable_entry size hasn't been discovered!\n");
+-
+- return ((r->r_offset == 0) ||
+- (r->r_offset % extable_entry_size == 0));
+-}
+-
+-#define is_second_extable_reloc(Start, Cur, Sec) \
+- (((Cur) == (Start) + 1) && (strcmp("__ex_table", (Sec)) == 0))
+-
+ static void report_extable_warnings(const char* modname, struct elf_info* elf,
+ const struct sectioncheck* const mismatch,
+ Elf_Rela* r, Elf_Sym* sym,
+@@ -1384,22 +1351,9 @@ static void extable_mismatch_handler(const char* modname, struct elf_info *elf,
+ "You might get more information about where this is\n"
+ "coming from by using scripts/check_extable.sh %s\n",
+ fromsec, (long)r->r_offset, tosec, modname);
+- else if (!is_executable_section(elf, get_secindex(elf, sym))) {
+- if (is_extable_fault_address(r))
+- fatal("The relocation at %s+0x%lx references\n"
+- "section \"%s\" which is not executable, IOW\n"
+- "it is not possible for the kernel to fault\n"
+- "at that address. Something is seriously wrong\n"
+- "and should be fixed.\n",
+- fromsec, (long)r->r_offset, tosec);
+- else
+- fatal("The relocation at %s+0x%lx references\n"
+- "section \"%s\" which is not executable, IOW\n"
+- "the kernel will fault if it ever tries to\n"
+- "jump to it. Something is seriously wrong\n"
+- "and should be fixed.\n",
+- fromsec, (long)r->r_offset, tosec);
+- }
++ else if (!is_executable_section(elf, get_secindex(elf, sym)))
++ error("%s+0x%lx references non-executable section '%s'\n",
++ fromsec, (long)r->r_offset, tosec);
+ }
+
+ static void check_section_mismatch(const char *modname, struct elf_info *elf,
+@@ -1457,19 +1411,33 @@ static int addend_386_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)
+ #define R_ARM_THM_JUMP19 51
+ #endif
+
++static int32_t sign_extend32(int32_t value, int index)
++{
++ uint8_t shift = 31 - index;
++
++ return (int32_t)(value << shift) >> shift;
++}
++
+ static int addend_arm_rel(struct elf_info *elf, Elf_Shdr *sechdr, Elf_Rela *r)
+ {
+ unsigned int r_typ = ELF_R_TYPE(r->r_info);
++ Elf_Sym *sym = elf->symtab_start + ELF_R_SYM(r->r_info);
++ void *loc = reloc_location(elf, sechdr, r);
++ uint32_t inst;
++ int32_t offset;
+
+ switch (r_typ) {
+ case R_ARM_ABS32:
+- /* From ARM ABI: (S + A) | T */
+- r->r_addend = (int)(long)
+- (elf->symtab_start + ELF_R_SYM(r->r_info));
++ inst = TO_NATIVE(*(uint32_t *)loc);
++ r->r_addend = inst + sym->st_value;
+ break;
+ case R_ARM_PC24:
+ case R_ARM_CALL:
+ case R_ARM_JUMP24:
++ inst = TO_NATIVE(*(uint32_t *)loc);
++ offset = sign_extend32((inst & 0x00ffffff) << 2, 25);
++ r->r_addend = offset + sym->st_value + 8;
++ break;
+ case R_ARM_THM_CALL:
+ case R_ARM_THM_JUMP24:
+ case R_ARM_THM_JUMP19:
+@@ -1574,8 +1542,6 @@ static void section_rela(const char *modname, struct elf_info *elf,
+ /* Skip special sections */
+ if (is_shndx_special(sym->st_shndx))
+ continue;
+- if (is_second_extable_reloc(start, rela, fromsec))
+- find_extable_entry_size(fromsec, &r);
+ check_section_mismatch(modname, elf, &r, sym, fromsec);
+ }
+ }
+@@ -1633,8 +1599,6 @@ static void section_rel(const char *modname, struct elf_info *elf,
+ /* Skip special sections */
+ if (is_shndx_special(sym->st_shndx))
+ continue;
+- if (is_second_extable_reloc(start, rel, fromsec))
+- find_extable_entry_size(fromsec, &r);
+ check_section_mismatch(modname, elf, &r, sym, fromsec);
+ }
+ }
+diff --git a/scripts/package/builddeb b/scripts/package/builddeb
+index 252faaa5561cc..032774eb061e1 100755
+--- a/scripts/package/builddeb
++++ b/scripts/package/builddeb
+@@ -62,18 +62,14 @@ install_linux_image () {
+ ${MAKE} -f ${srctree}/Makefile INSTALL_DTBS_PATH="${pdir}/usr/lib/linux-image-${KERNELRELEASE}" dtbs_install
+ fi
+
+- if is_enabled CONFIG_MODULES; then
+- ${MAKE} -f ${srctree}/Makefile INSTALL_MOD_PATH="${pdir}" modules_install
+- rm -f "${pdir}/lib/modules/${KERNELRELEASE}/build"
+- rm -f "${pdir}/lib/modules/${KERNELRELEASE}/source"
+- if [ "${SRCARCH}" = um ] ; then
+- mkdir -p "${pdir}/usr/lib/uml/modules"
+- mv "${pdir}/lib/modules/${KERNELRELEASE}" "${pdir}/usr/lib/uml/modules/${KERNELRELEASE}"
+- fi
+- fi
++ ${MAKE} -f ${srctree}/Makefile INSTALL_MOD_PATH="${pdir}" modules_install
++ rm -f "${pdir}/lib/modules/${KERNELRELEASE}/build"
++ rm -f "${pdir}/lib/modules/${KERNELRELEASE}/source"
+
+ # Install the kernel
+ if [ "${ARCH}" = um ] ; then
++ mkdir -p "${pdir}/usr/lib/uml/modules"
++ mv "${pdir}/lib/modules/${KERNELRELEASE}" "${pdir}/usr/lib/uml/modules/${KERNELRELEASE}"
+ mkdir -p "${pdir}/usr/bin" "${pdir}/usr/share/doc/${pname}"
+ cp System.map "${pdir}/usr/lib/uml/modules/${KERNELRELEASE}/System.map"
+ cp ${KCONFIG_CONFIG} "${pdir}/usr/share/doc/${pname}/config"
+diff --git a/security/apparmor/policy.c b/security/apparmor/policy.c
+index 51e8184e0fec1..69711ae682e5e 100644
+--- a/security/apparmor/policy.c
++++ b/security/apparmor/policy.c
+@@ -591,7 +591,15 @@ struct aa_profile *aa_alloc_null(struct aa_profile *parent, const char *name,
+ profile->label.flags |= FLAG_NULL;
+ rules = list_first_entry(&profile->rules, typeof(*rules), list);
+ rules->file.dfa = aa_get_dfa(nulldfa);
++ rules->file.perms = kcalloc(2, sizeof(struct aa_perms), GFP_KERNEL);
++ if (!rules->file.perms)
++ goto fail;
++ rules->file.size = 2;
+ rules->policy.dfa = aa_get_dfa(nulldfa);
++ rules->policy.perms = kcalloc(2, sizeof(struct aa_perms), GFP_KERNEL);
++ if (!rules->policy.perms)
++ goto fail;
++ rules->policy.size = 2;
+
+ if (parent) {
+ profile->path_flags = parent->path_flags;
+@@ -602,6 +610,11 @@ struct aa_profile *aa_alloc_null(struct aa_profile *parent, const char *name,
+ }
+
+ return profile;
++
++fail:
++ aa_free_profile(profile);
++
++ return NULL;
+ }
+
+ /**
+diff --git a/security/apparmor/policy_compat.c b/security/apparmor/policy_compat.c
+index cc89d1e88fb74..0cb02da8a3193 100644
+--- a/security/apparmor/policy_compat.c
++++ b/security/apparmor/policy_compat.c
+@@ -146,7 +146,8 @@ static struct aa_perms compute_fperms_other(struct aa_dfa *dfa,
+ *
+ * Returns: remapped perm table
+ */
+-static struct aa_perms *compute_fperms(struct aa_dfa *dfa)
++static struct aa_perms *compute_fperms(struct aa_dfa *dfa,
++ u32 *size)
+ {
+ aa_state_t state;
+ unsigned int state_count;
+@@ -159,6 +160,7 @@ static struct aa_perms *compute_fperms(struct aa_dfa *dfa)
+ table = kvcalloc(state_count * 2, sizeof(struct aa_perms), GFP_KERNEL);
+ if (!table)
+ return NULL;
++ *size = state_count * 2;
+
+ for (state = 0; state < state_count; state++) {
+ table[state * 2] = compute_fperms_user(dfa, state);
+@@ -168,7 +170,8 @@ static struct aa_perms *compute_fperms(struct aa_dfa *dfa)
+ return table;
+ }
+
+-static struct aa_perms *compute_xmatch_perms(struct aa_dfa *xmatch)
++static struct aa_perms *compute_xmatch_perms(struct aa_dfa *xmatch,
++ u32 *size)
+ {
+ struct aa_perms *perms;
+ int state;
+@@ -179,6 +182,9 @@ static struct aa_perms *compute_xmatch_perms(struct aa_dfa *xmatch)
+ state_count = xmatch->tables[YYTD_ID_BASE]->td_lolen;
+ /* DFAs are restricted from having a state_count of less than 2 */
+ perms = kvcalloc(state_count, sizeof(struct aa_perms), GFP_KERNEL);
++ if (!perms)
++ return NULL;
++ *size = state_count;
+
+ /* zero init so skip the trap state (state == 0) */
+ for (state = 1; state < state_count; state++)
+@@ -239,7 +245,8 @@ static struct aa_perms compute_perms_entry(struct aa_dfa *dfa,
+ return perms;
+ }
+
+-static struct aa_perms *compute_perms(struct aa_dfa *dfa, u32 version)
++static struct aa_perms *compute_perms(struct aa_dfa *dfa, u32 version,
++ u32 *size)
+ {
+ unsigned int state;
+ unsigned int state_count;
+@@ -252,6 +259,7 @@ static struct aa_perms *compute_perms(struct aa_dfa *dfa, u32 version)
+ table = kvcalloc(state_count, sizeof(struct aa_perms), GFP_KERNEL);
+ if (!table)
+ return NULL;
++ *size = state_count;
+
+ /* zero init so skip the trap state (state == 0) */
+ for (state = 1; state < state_count; state++)
+@@ -286,7 +294,7 @@ static void remap_dfa_accept(struct aa_dfa *dfa, unsigned int factor)
+ /* TODO: merge different dfa mappings into single map_policy fn */
+ int aa_compat_map_xmatch(struct aa_policydb *policy)
+ {
+- policy->perms = compute_xmatch_perms(policy->dfa);
++ policy->perms = compute_xmatch_perms(policy->dfa, &policy->size);
+ if (!policy->perms)
+ return -ENOMEM;
+
+@@ -297,7 +305,7 @@ int aa_compat_map_xmatch(struct aa_policydb *policy)
+
+ int aa_compat_map_policy(struct aa_policydb *policy, u32 version)
+ {
+- policy->perms = compute_perms(policy->dfa, version);
++ policy->perms = compute_perms(policy->dfa, version, &policy->size);
+ if (!policy->perms)
+ return -ENOMEM;
+
+@@ -308,7 +316,7 @@ int aa_compat_map_policy(struct aa_policydb *policy, u32 version)
+
+ int aa_compat_map_file(struct aa_policydb *policy)
+ {
+- policy->perms = compute_fperms(policy->dfa);
++ policy->perms = compute_fperms(policy->dfa, &policy->size);
+ if (!policy->perms)
+ return -ENOMEM;
+
+diff --git a/security/apparmor/policy_unpack.c b/security/apparmor/policy_unpack.c
+index cf2ceec40b28a..bc9f436d49cca 100644
+--- a/security/apparmor/policy_unpack.c
++++ b/security/apparmor/policy_unpack.c
+@@ -860,10 +860,12 @@ static struct aa_profile *unpack_profile(struct aa_ext *e, char **ns_name)
+ }
+ profile->attach.xmatch_len = tmp;
+ profile->attach.xmatch.start[AA_CLASS_XMATCH] = DFA_START;
+- error = aa_compat_map_xmatch(&profile->attach.xmatch);
+- if (error) {
+- info = "failed to convert xmatch permission table";
+- goto fail;
++ if (!profile->attach.xmatch.perms) {
++ error = aa_compat_map_xmatch(&profile->attach.xmatch);
++ if (error) {
++ info = "failed to convert xmatch permission table";
++ goto fail;
++ }
+ }
+ }
+
+@@ -983,31 +985,54 @@ static struct aa_profile *unpack_profile(struct aa_ext *e, char **ns_name)
+ AA_CLASS_FILE);
+ if (!aa_unpack_nameX(e, AA_STRUCTEND, NULL))
+ goto fail;
+- error = aa_compat_map_policy(&rules->policy, e->version);
+- if (error) {
+- info = "failed to remap policydb permission table";
+- goto fail;
++ if (!rules->policy.perms) {
++ error = aa_compat_map_policy(&rules->policy,
++ e->version);
++ if (error) {
++ info = "failed to remap policydb permission table";
++ goto fail;
++ }
+ }
+- } else
++ } else {
+ rules->policy.dfa = aa_get_dfa(nulldfa);
+-
++ rules->policy.perms = kcalloc(2, sizeof(struct aa_perms),
++ GFP_KERNEL);
++ if (!rules->policy.perms)
++ goto fail;
++ rules->policy.size = 2;
++ }
+ /* get file rules */
+ error = unpack_pdb(e, &rules->file, false, true, &info);
+ if (error) {
+ goto fail;
+ } else if (rules->file.dfa) {
+- error = aa_compat_map_file(&rules->file);
+- if (error) {
+- info = "failed to remap file permission table";
+- goto fail;
++ if (!rules->file.perms) {
++ error = aa_compat_map_file(&rules->file);
++ if (error) {
++ info = "failed to remap file permission table";
++ goto fail;
++ }
+ }
+ } else if (rules->policy.dfa &&
+ rules->policy.start[AA_CLASS_FILE]) {
+ rules->file.dfa = aa_get_dfa(rules->policy.dfa);
+ rules->file.start[AA_CLASS_FILE] = rules->policy.start[AA_CLASS_FILE];
+- } else
++ rules->file.perms = kcalloc(rules->policy.size,
++ sizeof(struct aa_perms),
++ GFP_KERNEL);
++ if (!rules->file.perms)
++ goto fail;
++ memcpy(rules->file.perms, rules->policy.perms,
++ rules->policy.size * sizeof(struct aa_perms));
++ rules->file.size = rules->policy.size;
++ } else {
+ rules->file.dfa = aa_get_dfa(nulldfa);
+-
++ rules->file.perms = kcalloc(2, sizeof(struct aa_perms),
++ GFP_KERNEL);
++ if (!rules->file.perms)
++ goto fail;
++ rules->file.size = 2;
++ }
+ error = -EPROTO;
+ if (aa_unpack_nameX(e, AA_STRUCT, "data")) {
+ info = "out of memory";
+@@ -1046,8 +1071,13 @@ static struct aa_profile *unpack_profile(struct aa_ext *e, char **ns_name)
+ goto fail;
+ }
+
+- rhashtable_insert_fast(profile->data, &data->head,
+- profile->data->p);
++ if (rhashtable_insert_fast(profile->data, &data->head,
++ profile->data->p)) {
++ kfree_sensitive(data->key);
++ kfree_sensitive(data);
++ info = "failed to insert data to table";
++ goto fail;
++ }
+ }
+
+ if (!aa_unpack_nameX(e, AA_STRUCTEND, NULL)) {
+@@ -1134,22 +1164,16 @@ static int verify_header(struct aa_ext *e, int required, const char **ns)
+ return 0;
+ }
+
+-static bool verify_xindex(int xindex, int table_size)
+-{
+- int index, xtype;
+- xtype = xindex & AA_X_TYPE_MASK;
+- index = xindex & AA_X_INDEX_MASK;
+- if (xtype == AA_X_TABLE && index >= table_size)
+- return false;
+- return true;
+-}
+-
+-/* verify dfa xindexes are in range of transition tables */
+-static bool verify_dfa_xindex(struct aa_dfa *dfa, int table_size)
++/**
++ * verify_dfa_accept_xindex - verify accept indexes are in range of perms table
++ * @dfa: the dfa to check accept indexes are in range
++ * table_size: the permission table size the indexes should be within
++ */
++static bool verify_dfa_accept_index(struct aa_dfa *dfa, int table_size)
+ {
+ int i;
+ for (i = 0; i < dfa->tables[YYTD_ID_ACCEPT]->td_lolen; i++) {
+- if (!verify_xindex(ACCEPT_TABLE(dfa)[i], table_size))
++ if (ACCEPT_TABLE(dfa)[i] >= table_size)
+ return false;
+ }
+ return true;
+@@ -1186,11 +1210,13 @@ static bool verify_perms(struct aa_policydb *pdb)
+ if (!verify_perm(&pdb->perms[i]))
+ return false;
+ /* verify indexes into str table */
+- if (pdb->perms[i].xindex >= pdb->trans.size)
++ if ((pdb->perms[i].xindex & AA_X_TYPE_MASK) == AA_X_TABLE &&
++ (pdb->perms[i].xindex & AA_X_INDEX_MASK) >= pdb->trans.size)
+ return false;
+- if (pdb->perms[i].tag >= pdb->trans.size)
++ if (pdb->perms[i].tag && pdb->perms[i].tag >= pdb->trans.size)
+ return false;
+- if (pdb->perms[i].label >= pdb->trans.size)
++ if (pdb->perms[i].label &&
++ pdb->perms[i].label >= pdb->trans.size)
+ return false;
+ }
+
+@@ -1212,10 +1238,10 @@ static int verify_profile(struct aa_profile *profile)
+ if (!rules)
+ return 0;
+
+- if ((rules->file.dfa && !verify_dfa_xindex(rules->file.dfa,
+- rules->file.trans.size)) ||
++ if ((rules->file.dfa && !verify_dfa_accept_index(rules->file.dfa,
++ rules->file.size)) ||
+ (rules->policy.dfa &&
+- !verify_dfa_xindex(rules->policy.dfa, rules->policy.trans.size))) {
++ !verify_dfa_accept_index(rules->policy.dfa, rules->policy.size))) {
+ audit_iface(profile, NULL, NULL,
+ "Unpack: Invalid named transition", NULL, -EPROTO);
+ return -EPROTO;
+diff --git a/security/integrity/evm/evm_crypto.c b/security/integrity/evm/evm_crypto.c
+index 033804f5a5f20..0dae649f3740c 100644
+--- a/security/integrity/evm/evm_crypto.c
++++ b/security/integrity/evm/evm_crypto.c
+@@ -40,7 +40,7 @@ static const char evm_hmac[] = "hmac(sha1)";
+ /**
+ * evm_set_key() - set EVM HMAC key from the kernel
+ * @key: pointer to a buffer with the key data
+- * @size: length of the key data
++ * @keylen: length of the key data
+ *
+ * This function allows setting the EVM HMAC key from the kernel
+ * without using the "encrypted" key subsystem keys. It can be used
+diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
+index cf24c5255583c..c9b6e2a43478a 100644
+--- a/security/integrity/evm/evm_main.c
++++ b/security/integrity/evm/evm_main.c
+@@ -318,7 +318,6 @@ int evm_protected_xattr_if_enabled(const char *req_xattr_name)
+ /**
+ * evm_read_protected_xattrs - read EVM protected xattr names, lengths, values
+ * @dentry: dentry of the read xattrs
+- * @inode: inode of the read xattrs
+ * @buffer: buffer xattr names, lengths or values are copied to
+ * @buffer_size: size of buffer
+ * @type: n: names, l: lengths, v: values
+@@ -390,6 +389,7 @@ int evm_read_protected_xattrs(struct dentry *dentry, u8 *buffer,
+ * @xattr_name: requested xattr
+ * @xattr_value: requested xattr value
+ * @xattr_value_len: requested xattr value length
++ * @iint: inode integrity metadata
+ *
+ * Calculate the HMAC for the given dentry and verify it against the stored
+ * security.evm xattr. For performance, use the xattr value and length
+@@ -795,7 +795,9 @@ static int evm_attr_change(struct mnt_idmap *idmap,
+
+ /**
+ * evm_inode_setattr - prevent updating an invalid EVM extended attribute
++ * @idmap: idmap of the mount
+ * @dentry: pointer to the affected dentry
++ * @attr: iattr structure containing the new file attributes
+ *
+ * Permit update of file attributes when files have a valid EVM signature,
+ * except in the case of them having an immutable portable signature.
+diff --git a/security/integrity/iint.c b/security/integrity/iint.c
+index c73858e8c6d51..a462df827de2d 100644
+--- a/security/integrity/iint.c
++++ b/security/integrity/iint.c
+@@ -43,12 +43,10 @@ static struct integrity_iint_cache *__integrity_iint_find(struct inode *inode)
+ else if (inode > iint->inode)
+ n = n->rb_right;
+ else
+- break;
++ return iint;
+ }
+- if (!n)
+- return NULL;
+
+- return iint;
++ return NULL;
+ }
+
+ /*
+@@ -113,10 +111,15 @@ struct integrity_iint_cache *integrity_inode_get(struct inode *inode)
+ parent = *p;
+ test_iint = rb_entry(parent, struct integrity_iint_cache,
+ rb_node);
+- if (inode < test_iint->inode)
++ if (inode < test_iint->inode) {
+ p = &(*p)->rb_left;
+- else
++ } else if (inode > test_iint->inode) {
+ p = &(*p)->rb_right;
++ } else {
++ write_unlock(&integrity_iint_lock);
++ kmem_cache_free(iint_cache, iint);
++ return test_iint;
++ }
+ }
+
+ iint->inode = inode;
+diff --git a/security/integrity/ima/ima_modsig.c b/security/integrity/ima/ima_modsig.c
+index fb25723c65bc4..3e7bee30080f2 100644
+--- a/security/integrity/ima/ima_modsig.c
++++ b/security/integrity/ima/ima_modsig.c
+@@ -89,6 +89,9 @@ int ima_read_modsig(enum ima_hooks func, const void *buf, loff_t buf_len,
+
+ /**
+ * ima_collect_modsig - Calculate the file hash without the appended signature.
++ * @modsig: parsed module signature
++ * @buf: data to verify the signature on
++ * @size: data size
+ *
+ * Since the modsig is part of the file contents, the hash used in its signature
+ * isn't the same one ordinarily calculated by IMA. Therefore PKCS7 code
+diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
+index 3ca8b7348c2e4..c9b3bd8f1bb9c 100644
+--- a/security/integrity/ima/ima_policy.c
++++ b/security/integrity/ima/ima_policy.c
+@@ -721,6 +721,7 @@ static int get_subaction(struct ima_rule_entry *rule, enum ima_hooks func)
+ * @secid: LSM secid of the task to be validated
+ * @func: IMA hook identifier
+ * @mask: requested action (MAY_READ | MAY_WRITE | MAY_APPEND | MAY_EXEC)
++ * @flags: IMA actions to consider (e.g. IMA_MEASURE | IMA_APPRAISE)
+ * @pcr: set the pcr to extend
+ * @template_desc: the template that should be used for this rule
+ * @func_data: func specific data, may be NULL
+@@ -1915,7 +1916,7 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
+
+ /**
+ * ima_parse_add_rule - add a rule to ima_policy_rules
+- * @rule - ima measurement policy rule
++ * @rule: ima measurement policy rule
+ *
+ * Avoid locking by allowing just one writer at a time in ima_write_policy()
+ * Returns the length of the rule parsed, an error code on failure
+diff --git a/sound/core/jack.c b/sound/core/jack.c
+index 88493cc31914b..03d155ed362b4 100644
+--- a/sound/core/jack.c
++++ b/sound/core/jack.c
+@@ -654,6 +654,7 @@ void snd_jack_report(struct snd_jack *jack, int status)
+ struct snd_jack_kctl *jack_kctl;
+ unsigned int mask_bits = 0;
+ #ifdef CONFIG_SND_JACK_INPUT_DEV
++ struct input_dev *idev;
+ int i;
+ #endif
+
+@@ -670,17 +671,15 @@ void snd_jack_report(struct snd_jack *jack, int status)
+ status & jack_kctl->mask_bits);
+
+ #ifdef CONFIG_SND_JACK_INPUT_DEV
+- mutex_lock(&jack->input_dev_lock);
+- if (!jack->input_dev) {
+- mutex_unlock(&jack->input_dev_lock);
++ idev = input_get_device(jack->input_dev);
++ if (!idev)
+ return;
+- }
+
+ for (i = 0; i < ARRAY_SIZE(jack->key); i++) {
+ int testbit = ((SND_JACK_BTN_0 >> i) & ~mask_bits);
+
+ if (jack->type & testbit)
+- input_report_key(jack->input_dev, jack->key[i],
++ input_report_key(idev, jack->key[i],
+ status & testbit);
+ }
+
+@@ -688,13 +687,13 @@ void snd_jack_report(struct snd_jack *jack, int status)
+ int testbit = ((1 << i) & ~mask_bits);
+
+ if (jack->type & testbit)
+- input_report_switch(jack->input_dev,
++ input_report_switch(idev,
+ jack_switch_types[i],
+ status & testbit);
+ }
+
+- input_sync(jack->input_dev);
+- mutex_unlock(&jack->input_dev_lock);
++ input_sync(idev);
++ input_put_device(idev);
+ #endif /* CONFIG_SND_JACK_INPUT_DEV */
+ }
+ EXPORT_SYMBOL(snd_jack_report);
+diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c
+index 7bde7fb64011e..a0b9514716995 100644
+--- a/sound/core/pcm_memory.c
++++ b/sound/core/pcm_memory.c
+@@ -31,15 +31,41 @@ static unsigned long max_alloc_per_card = 32UL * 1024UL * 1024UL;
+ module_param(max_alloc_per_card, ulong, 0644);
+ MODULE_PARM_DESC(max_alloc_per_card, "Max total allocation bytes per card.");
+
++static void __update_allocated_size(struct snd_card *card, ssize_t bytes)
++{
++ card->total_pcm_alloc_bytes += bytes;
++}
++
++static void update_allocated_size(struct snd_card *card, ssize_t bytes)
++{
++ mutex_lock(&card->memory_mutex);
++ __update_allocated_size(card, bytes);
++ mutex_unlock(&card->memory_mutex);
++}
++
++static void decrease_allocated_size(struct snd_card *card, size_t bytes)
++{
++ mutex_lock(&card->memory_mutex);
++ WARN_ON(card->total_pcm_alloc_bytes < bytes);
++ __update_allocated_size(card, -(ssize_t)bytes);
++ mutex_unlock(&card->memory_mutex);
++}
++
+ static int do_alloc_pages(struct snd_card *card, int type, struct device *dev,
+ int str, size_t size, struct snd_dma_buffer *dmab)
+ {
+ enum dma_data_direction dir;
+ int err;
+
++ /* check and reserve the requested size */
++ mutex_lock(&card->memory_mutex);
+ if (max_alloc_per_card &&
+- card->total_pcm_alloc_bytes + size > max_alloc_per_card)
++ card->total_pcm_alloc_bytes + size > max_alloc_per_card) {
++ mutex_unlock(&card->memory_mutex);
+ return -ENOMEM;
++ }
++ __update_allocated_size(card, size);
++ mutex_unlock(&card->memory_mutex);
+
+ if (str == SNDRV_PCM_STREAM_PLAYBACK)
+ dir = DMA_TO_DEVICE;
+@@ -47,9 +73,14 @@ static int do_alloc_pages(struct snd_card *card, int type, struct device *dev,
+ dir = DMA_FROM_DEVICE;
+ err = snd_dma_alloc_dir_pages(type, dev, dir, size, dmab);
+ if (!err) {
+- mutex_lock(&card->memory_mutex);
+- card->total_pcm_alloc_bytes += dmab->bytes;
+- mutex_unlock(&card->memory_mutex);
++ /* the actual allocation size might be bigger than requested,
++ * and we need to correct the account
++ */
++ if (dmab->bytes != size)
++ update_allocated_size(card, dmab->bytes - size);
++ } else {
++ /* take back on allocation failure */
++ decrease_allocated_size(card, size);
+ }
+ return err;
+ }
+@@ -58,10 +89,7 @@ static void do_free_pages(struct snd_card *card, struct snd_dma_buffer *dmab)
+ {
+ if (!dmab->area)
+ return;
+- mutex_lock(&card->memory_mutex);
+- WARN_ON(card->total_pcm_alloc_bytes < dmab->bytes);
+- card->total_pcm_alloc_bytes -= dmab->bytes;
+- mutex_unlock(&card->memory_mutex);
++ decrease_allocated_size(card, dmab->bytes);
+ snd_dma_free_pages(dmab);
+ dmab->area = NULL;
+ }
+diff --git a/sound/pci/ac97/ac97_codec.c b/sound/pci/ac97/ac97_codec.c
+index 9afc5906d662e..80a65b8ad7b9b 100644
+--- a/sound/pci/ac97/ac97_codec.c
++++ b/sound/pci/ac97/ac97_codec.c
+@@ -2069,8 +2069,8 @@ int snd_ac97_mixer(struct snd_ac97_bus *bus, struct snd_ac97_template *template,
+ .dev_disconnect = snd_ac97_dev_disconnect,
+ };
+
+- if (rac97)
+- *rac97 = NULL;
++ if (!rac97)
++ return -EINVAL;
+ if (snd_BUG_ON(!bus || !template))
+ return -EINVAL;
+ if (snd_BUG_ON(template->num >= 4))
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index dabfdecece264..f1b934a502169 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -9490,9 +9490,9 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x8b63, "HP Elite Dragonfly 13.5 inch G4", ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8b65, "HP ProBook 455 15.6 inch G10 Notebook PC", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
+ SND_PCI_QUIRK(0x103c, 0x8b66, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
+- SND_PCI_QUIRK(0x103c, 0x8b70, "HP EliteBook 835 G10", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x103c, 0x8b72, "HP EliteBook 845 G10", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x103c, 0x8b74, "HP EliteBook 845W G10", ALC287_FIXUP_CS35L41_I2C_2),
++ SND_PCI_QUIRK(0x103c, 0x8b70, "HP EliteBook 835 G10", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8b72, "HP EliteBook 845 G10", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8b74, "HP EliteBook 845W G10", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8b77, "HP ElieBook 865 G10", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x103c, 0x8b7a, "HP", ALC236_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8b7d, "HP", ALC236_FIXUP_HP_GPIO_LED),
+@@ -9682,6 +9682,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1558, 0x971d, "Clevo N970T[CDF]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0xa500, "Clevo NL5[03]RU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0xa600, "Clevo NL50NU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
++ SND_PCI_QUIRK(0x1558, 0xa650, "Clevo NP[567]0SN[CD]", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0xa671, "Clevo NP70SN[CDE]", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0xb018, "Clevo NP50D[BE]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0xb019, "Clevo NH77D[BE]Q", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+diff --git a/sound/soc/amd/acp/acp-pdm.c b/sound/soc/amd/acp/acp-pdm.c
+index 66ec6b6a59723..f8030b79ac17c 100644
+--- a/sound/soc/amd/acp/acp-pdm.c
++++ b/sound/soc/amd/acp/acp-pdm.c
+@@ -176,7 +176,7 @@ static void acp_dmic_dai_shutdown(struct snd_pcm_substream *substream,
+
+ /* Disable DMIC interrupts */
+ ext_int_ctrl = readl(ACP_EXTERNAL_INTR_CNTL(adata, 0));
+- ext_int_ctrl |= ~PDM_DMA_INTR_MASK;
++ ext_int_ctrl &= ~PDM_DMA_INTR_MASK;
+ writel(ext_int_ctrl, ACP_EXTERNAL_INTR_CNTL(adata, 0));
+ }
+
+diff --git a/sound/soc/codecs/es8316.c b/sound/soc/codecs/es8316.c
+index a27d809564593..ccecfdf700649 100644
+--- a/sound/soc/codecs/es8316.c
++++ b/sound/soc/codecs/es8316.c
+@@ -52,7 +52,12 @@ static const SNDRV_CTL_TLVD_DECLARE_DB_SCALE(dac_vol_tlv, -9600, 50, 1);
+ static const SNDRV_CTL_TLVD_DECLARE_DB_SCALE(adc_vol_tlv, -9600, 50, 1);
+ static const SNDRV_CTL_TLVD_DECLARE_DB_SCALE(alc_max_gain_tlv, -650, 150, 0);
+ static const SNDRV_CTL_TLVD_DECLARE_DB_SCALE(alc_min_gain_tlv, -1200, 150, 0);
+-static const SNDRV_CTL_TLVD_DECLARE_DB_SCALE(alc_target_tlv, -1650, 150, 0);
++
++static const SNDRV_CTL_TLVD_DECLARE_DB_RANGE(alc_target_tlv,
++ 0, 10, TLV_DB_SCALE_ITEM(-1650, 150, 0),
++ 11, 11, TLV_DB_SCALE_ITEM(-150, 0, 0),
++);
++
+ static const SNDRV_CTL_TLVD_DECLARE_DB_RANGE(hpmixer_gain_tlv,
+ 0, 4, TLV_DB_SCALE_ITEM(-1200, 150, 0),
+ 8, 11, TLV_DB_SCALE_ITEM(-450, 150, 0),
+@@ -115,7 +120,7 @@ static const struct snd_kcontrol_new es8316_snd_controls[] = {
+ alc_max_gain_tlv),
+ SOC_SINGLE_TLV("ALC Capture Min Volume", ES8316_ADC_ALC2, 0, 28, 0,
+ alc_min_gain_tlv),
+- SOC_SINGLE_TLV("ALC Capture Target Volume", ES8316_ADC_ALC3, 4, 10, 0,
++ SOC_SINGLE_TLV("ALC Capture Target Volume", ES8316_ADC_ALC3, 4, 11, 0,
+ alc_target_tlv),
+ SOC_SINGLE("ALC Capture Hold Time", ES8316_ADC_ALC3, 0, 10, 0),
+ SOC_SINGLE("ALC Capture Decay Time", ES8316_ADC_ALC4, 4, 10, 0),
+@@ -364,13 +369,11 @@ static int es8316_set_dai_sysclk(struct snd_soc_dai *codec_dai,
+ int count = 0;
+
+ es8316->sysclk = freq;
++ es8316->sysclk_constraints.list = NULL;
++ es8316->sysclk_constraints.count = 0;
+
+- if (freq == 0) {
+- es8316->sysclk_constraints.list = NULL;
+- es8316->sysclk_constraints.count = 0;
+-
++ if (freq == 0)
+ return 0;
+- }
+
+ ret = clk_set_rate(es8316->mclk, freq);
+ if (ret)
+@@ -386,8 +389,10 @@ static int es8316_set_dai_sysclk(struct snd_soc_dai *codec_dai,
+ es8316->allowed_rates[count++] = freq / ratio;
+ }
+
+- es8316->sysclk_constraints.list = es8316->allowed_rates;
+- es8316->sysclk_constraints.count = count;
++ if (count) {
++ es8316->sysclk_constraints.list = es8316->allowed_rates;
++ es8316->sysclk_constraints.count = count;
++ }
+
+ return 0;
+ }
+diff --git a/sound/soc/fsl/imx-audmix.c b/sound/soc/fsl/imx-audmix.c
+index b2c5aca92c6bf..f9ed8fcc03c48 100644
+--- a/sound/soc/fsl/imx-audmix.c
++++ b/sound/soc/fsl/imx-audmix.c
+@@ -228,6 +228,8 @@ static int imx_audmix_probe(struct platform_device *pdev)
+
+ dai_name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s%s",
+ fe_name_pref, args.np->full_name + 1);
++ if (!dai_name)
++ return -ENOMEM;
+
+ dev_info(pdev->dev.parent, "DAI FE name:%s\n", dai_name);
+
+@@ -236,6 +238,8 @@ static int imx_audmix_probe(struct platform_device *pdev)
+ capture_dai_name =
+ devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s %s",
+ dai_name, "CPU-Capture");
++ if (!capture_dai_name)
++ return -ENOMEM;
+ }
+
+ /*
+@@ -269,6 +273,8 @@ static int imx_audmix_probe(struct platform_device *pdev)
+ "AUDMIX-Playback-%d", i);
+ be_cp = devm_kasprintf(&pdev->dev, GFP_KERNEL,
+ "AUDMIX-Capture-%d", i);
++ if (!be_name || !be_pb || !be_cp)
++ return -ENOMEM;
+
+ priv->dai[num_dai + i].cpus = &dlc[2];
+ priv->dai[num_dai + i].codecs = &dlc[3];
+@@ -293,6 +299,9 @@ static int imx_audmix_probe(struct platform_device *pdev)
+ priv->dapm_routes[i].source =
+ devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s %s",
+ dai_name, "CPU-Playback");
++ if (!priv->dapm_routes[i].source)
++ return -ENOMEM;
++
+ priv->dapm_routes[i].sink = be_pb;
+ priv->dapm_routes[num_dai + i].source = be_pb;
+ priv->dapm_routes[num_dai + i].sink = be_cp;
+diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c
+index 144f082c63fda..5fa204897a52b 100644
+--- a/sound/soc/intel/boards/sof_sdw.c
++++ b/sound/soc/intel/boards/sof_sdw.c
+@@ -413,7 +413,7 @@ static const struct dmi_system_id sof_sdw_quirk_table[] = {
+ .matches = {
+ DMI_MATCH(DMI_PRODUCT_FAMILY, "Intel_mtlrvp"),
+ },
+- .driver_data = (void *)(RT711_JD1 | SOF_SDW_TGL_HDMI),
++ .driver_data = (void *)(RT711_JD1),
+ },
+ {}
+ };
+@@ -902,17 +902,20 @@ static int create_codec_dai_name(struct device *dev,
+ static int set_codec_init_func(struct snd_soc_card *card,
+ const struct snd_soc_acpi_link_adr *link,
+ struct snd_soc_dai_link *dai_links,
+- bool playback, int group_id)
++ bool playback, int group_id, int adr_index)
+ {
+- int i;
++ int i = adr_index;
+
+ do {
+ /*
+ * Initialize the codec. If codec is part of an aggregated
+ * group (group_id>0), initialize all codecs belonging to
+ * same group.
++ * The first link should start with link->adr_d[adr_index]
++ * because that is the device that we want to initialize and
++ * we should end immediately if it is not aggregated (group_id=0)
+ */
+- for (i = 0; i < link->num_adr; i++) {
++ for ( ; i < link->num_adr; i++) {
+ int codec_index;
+
+ codec_index = find_codec_info_part(link->adr_d[i].adr);
+@@ -928,9 +931,12 @@ static int set_codec_init_func(struct snd_soc_card *card,
+ dai_links,
+ &codec_info_list[codec_index],
+ playback);
++ if (!group_id)
++ return 0;
+ }
++ i = 0;
+ link++;
+- } while (link->mask && group_id);
++ } while (link->mask);
+
+ return 0;
+ }
+@@ -1180,7 +1186,7 @@ static int create_sdw_dailink(struct snd_soc_card *card,
+ dai_links[*link_index].nonatomic = true;
+
+ ret = set_codec_init_func(card, link, dai_links + (*link_index)++,
+- playback, group_id);
++ playback, group_id, adr_index);
+ if (ret < 0) {
+ dev_err(dev, "failed to init codec %d", codec_index);
+ return ret;
+diff --git a/sound/soc/mediatek/mt8173/mt8173-afe-pcm.c b/sound/soc/mediatek/mt8173/mt8173-afe-pcm.c
+index f93c2ec8beb7b..06269f7e37566 100644
+--- a/sound/soc/mediatek/mt8173/mt8173-afe-pcm.c
++++ b/sound/soc/mediatek/mt8173/mt8173-afe-pcm.c
+@@ -1070,6 +1070,10 @@ static int mt8173_afe_pcm_dev_probe(struct platform_device *pdev)
+
+ afe->dev = &pdev->dev;
+
++ irq_id = platform_get_irq(pdev, 0);
++ if (irq_id <= 0)
++ return irq_id < 0 ? irq_id : -ENXIO;
++
+ afe->base_addr = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(afe->base_addr))
+ return PTR_ERR(afe->base_addr);
+@@ -1156,14 +1160,14 @@ static int mt8173_afe_pcm_dev_probe(struct platform_device *pdev)
+ comp_hdmi = devm_kzalloc(&pdev->dev, sizeof(*comp_hdmi), GFP_KERNEL);
+ if (!comp_hdmi) {
+ ret = -ENOMEM;
+- goto err_pm_disable;
++ goto err_cleanup_components;
+ }
+
+ ret = snd_soc_component_initialize(comp_hdmi,
+ &mt8173_afe_hdmi_dai_component,
+ &pdev->dev);
+ if (ret)
+- goto err_pm_disable;
++ goto err_cleanup_components;
+
+ #ifdef CONFIG_DEBUG_FS
+ comp_hdmi->debugfs_prefix = "hdmi";
+@@ -1175,14 +1179,11 @@ static int mt8173_afe_pcm_dev_probe(struct platform_device *pdev)
+ if (ret)
+ goto err_cleanup_components;
+
+- irq_id = platform_get_irq(pdev, 0);
+- if (irq_id <= 0)
+- return irq_id < 0 ? irq_id : -ENXIO;
+ ret = devm_request_irq(afe->dev, irq_id, mt8173_afe_irq_handler,
+ 0, "Afe_ISR_Handle", (void *)afe);
+ if (ret) {
+ dev_err(afe->dev, "could not request_irq\n");
+- goto err_pm_disable;
++ goto err_cleanup_components;
+ }
+
+ dev_info(&pdev->dev, "MT8173 AFE driver initialized.\n");
+diff --git a/tools/bpf/bpftool/feature.c b/tools/bpf/bpftool/feature.c
+index da16e6a27cccd..0675d6a464138 100644
+--- a/tools/bpf/bpftool/feature.c
++++ b/tools/bpf/bpftool/feature.c
+@@ -167,12 +167,12 @@ static int get_vendor_id(int ifindex)
+ return strtol(buf, NULL, 0);
+ }
+
+-static int read_procfs(const char *path)
++static long read_procfs(const char *path)
+ {
+ char *endptr, *line = NULL;
+ size_t len = 0;
+ FILE *fd;
+- int res;
++ long res;
+
+ fd = fopen(path, "r");
+ if (!fd)
+@@ -194,7 +194,7 @@ static int read_procfs(const char *path)
+
+ static void probe_unprivileged_disabled(void)
+ {
+- int res;
++ long res;
+
+ /* No support for C-style ouptut */
+
+@@ -216,14 +216,14 @@ static void probe_unprivileged_disabled(void)
+ printf("Unable to retrieve required privileges for bpf() syscall\n");
+ break;
+ default:
+- printf("bpf() syscall restriction has unknown value %d\n", res);
++ printf("bpf() syscall restriction has unknown value %ld\n", res);
+ }
+ }
+ }
+
+ static void probe_jit_enable(void)
+ {
+- int res;
++ long res;
+
+ /* No support for C-style ouptut */
+
+@@ -245,7 +245,7 @@ static void probe_jit_enable(void)
+ printf("Unable to retrieve JIT-compiler status\n");
+ break;
+ default:
+- printf("JIT-compiler status has unknown value %d\n",
++ printf("JIT-compiler status has unknown value %ld\n",
+ res);
+ }
+ }
+@@ -253,7 +253,7 @@ static void probe_jit_enable(void)
+
+ static void probe_jit_harden(void)
+ {
+- int res;
++ long res;
+
+ /* No support for C-style ouptut */
+
+@@ -275,7 +275,7 @@ static void probe_jit_harden(void)
+ printf("Unable to retrieve JIT hardening status\n");
+ break;
+ default:
+- printf("JIT hardening status has unknown value %d\n",
++ printf("JIT hardening status has unknown value %ld\n",
+ res);
+ }
+ }
+@@ -283,7 +283,7 @@ static void probe_jit_harden(void)
+
+ static void probe_jit_kallsyms(void)
+ {
+- int res;
++ long res;
+
+ /* No support for C-style ouptut */
+
+@@ -302,14 +302,14 @@ static void probe_jit_kallsyms(void)
+ printf("Unable to retrieve JIT kallsyms export status\n");
+ break;
+ default:
+- printf("JIT kallsyms exports status has unknown value %d\n", res);
++ printf("JIT kallsyms exports status has unknown value %ld\n", res);
+ }
+ }
+ }
+
+ static void probe_jit_limit(void)
+ {
+- int res;
++ long res;
+
+ /* No support for C-style ouptut */
+
+@@ -322,7 +322,7 @@ static void probe_jit_limit(void)
+ printf("Unable to retrieve global memory limit for JIT compiler for unprivileged users\n");
+ break;
+ default:
+- printf("Global memory limit for JIT compiler for unprivileged users is %d bytes\n", res);
++ printf("Global memory limit for JIT compiler for unprivileged users is %ld bytes\n", res);
+ }
+ }
+ }
+diff --git a/tools/bpf/resolve_btfids/Makefile b/tools/bpf/resolve_btfids/Makefile
+index ac548a7baa73e..4b8079f294f65 100644
+--- a/tools/bpf/resolve_btfids/Makefile
++++ b/tools/bpf/resolve_btfids/Makefile
+@@ -67,7 +67,7 @@ $(BPFOBJ): $(wildcard $(LIBBPF_SRC)/*.[ch] $(LIBBPF_SRC)/Makefile) | $(LIBBPF_OU
+ LIBELF_FLAGS := $(shell $(HOSTPKG_CONFIG) libelf --cflags 2>/dev/null)
+ LIBELF_LIBS := $(shell $(HOSTPKG_CONFIG) libelf --libs 2>/dev/null || echo -lelf)
+
+-HOSTCFLAGS += -g \
++HOSTCFLAGS_resolve_btfids += -g \
+ -I$(srctree)/tools/include \
+ -I$(srctree)/tools/include/uapi \
+ -I$(LIBBPF_INCLUDE) \
+@@ -76,7 +76,7 @@ HOSTCFLAGS += -g \
+
+ LIBS = $(LIBELF_LIBS) -lz
+
+-export srctree OUTPUT HOSTCFLAGS Q HOSTCC HOSTLD HOSTAR
++export srctree OUTPUT HOSTCFLAGS_resolve_btfids Q HOSTCC HOSTLD HOSTAR
+ include $(srctree)/tools/build/Makefile.include
+
+ $(BINARY_IN): fixdep FORCE prepare | $(OUTPUT)
+diff --git a/tools/include/nolibc/stdint.h b/tools/include/nolibc/stdint.h
+index c1ce4f5e06034..661d942862c0b 100644
+--- a/tools/include/nolibc/stdint.h
++++ b/tools/include/nolibc/stdint.h
+@@ -36,8 +36,8 @@ typedef ssize_t int_fast16_t;
+ typedef size_t uint_fast16_t;
+ typedef ssize_t int_fast32_t;
+ typedef size_t uint_fast32_t;
+-typedef ssize_t int_fast64_t;
+-typedef size_t uint_fast64_t;
++typedef int64_t int_fast64_t;
++typedef uint64_t uint_fast64_t;
+
+ typedef int64_t intmax_t;
+ typedef uint64_t uintmax_t;
+@@ -84,16 +84,16 @@ typedef uint64_t uintmax_t;
+ #define INT_FAST8_MIN INT8_MIN
+ #define INT_FAST16_MIN INTPTR_MIN
+ #define INT_FAST32_MIN INTPTR_MIN
+-#define INT_FAST64_MIN INTPTR_MIN
++#define INT_FAST64_MIN INT64_MIN
+
+ #define INT_FAST8_MAX INT8_MAX
+ #define INT_FAST16_MAX INTPTR_MAX
+ #define INT_FAST32_MAX INTPTR_MAX
+-#define INT_FAST64_MAX INTPTR_MAX
++#define INT_FAST64_MAX INT64_MAX
+
+ #define UINT_FAST8_MAX UINT8_MAX
+ #define UINT_FAST16_MAX SIZE_MAX
+ #define UINT_FAST32_MAX SIZE_MAX
+-#define UINT_FAST64_MAX SIZE_MAX
++#define UINT_FAST64_MAX UINT64_MAX
+
+ #endif /* _NOLIBC_STDINT_H */
+diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
+index 929a3baca8ef3..bbab9ad9dc5a7 100644
+--- a/tools/lib/bpf/bpf_helpers.h
++++ b/tools/lib/bpf/bpf_helpers.h
+@@ -77,16 +77,21 @@
+ /*
+ * Helper macros to manipulate data structures
+ */
+-#ifndef offsetof
+-#define offsetof(TYPE, MEMBER) ((unsigned long)&((TYPE *)0)->MEMBER)
+-#endif
+-#ifndef container_of
++
++/* offsetof() definition that uses __builtin_offset() might not preserve field
++ * offset CO-RE relocation properly, so force-redefine offsetof() using
++ * old-school approach which works with CO-RE correctly
++ */
++#undef offsetof
++#define offsetof(type, member) ((unsigned long)&((type *)0)->member)
++
++/* redefined container_of() to ensure we use the above offsetof() macro */
++#undef container_of
+ #define container_of(ptr, type, member) \
+ ({ \
+ void *__mptr = (void *)(ptr); \
+ ((type *)(__mptr - offsetof(type, member))); \
+ })
+-#endif
+
+ /*
+ * Compiler (optimization) barrier.
+diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c
+index 580985ee55458..4d9f30bf7f014 100644
+--- a/tools/lib/bpf/btf_dump.c
++++ b/tools/lib/bpf/btf_dump.c
+@@ -2250,9 +2250,25 @@ static int btf_dump_type_data_check_overflow(struct btf_dump *d,
+ const struct btf_type *t,
+ __u32 id,
+ const void *data,
+- __u8 bits_offset)
++ __u8 bits_offset,
++ __u8 bit_sz)
+ {
+- __s64 size = btf__resolve_size(d->btf, id);
++ __s64 size;
++
++ if (bit_sz) {
++ /* bits_offset is at most 7. bit_sz is at most 128. */
++ __u8 nr_bytes = (bits_offset + bit_sz + 7) / 8;
++
++ /* When bit_sz is non zero, it is called from
++ * btf_dump_struct_data() where it only cares about
++ * negative error value.
++ * Return nr_bytes in success case to make it
++ * consistent as the regular integer case below.
++ */
++ return data + nr_bytes > d->typed_dump->data_end ? -E2BIG : nr_bytes;
++ }
++
++ size = btf__resolve_size(d->btf, id);
+
+ if (size < 0 || size >= INT_MAX) {
+ pr_warn("unexpected size [%zu] for id [%u]\n",
+@@ -2407,7 +2423,7 @@ static int btf_dump_dump_type_data(struct btf_dump *d,
+ {
+ int size, err = 0;
+
+- size = btf_dump_type_data_check_overflow(d, t, id, data, bits_offset);
++ size = btf_dump_type_data_check_overflow(d, t, id, data, bits_offset, bit_sz);
+ if (size < 0)
+ return size;
+ err = btf_dump_type_data_check_zero(d, t, id, data, bits_offset, bit_sz);
+diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
+index 195ccfdef7aa1..005907cb97d8c 100644
+--- a/tools/perf/arch/x86/util/Build
++++ b/tools/perf/arch/x86/util/Build
+@@ -10,6 +10,7 @@ perf-y += evlist.o
+ perf-y += mem-events.o
+ perf-y += evsel.o
+ perf-y += iostat.o
++perf-y += env.o
+
+ perf-$(CONFIG_DWARF) += dwarf-regs.o
+ perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
+diff --git a/tools/perf/arch/x86/util/env.c b/tools/perf/arch/x86/util/env.c
+new file mode 100644
+index 0000000000000..3e537ffb1353a
+--- /dev/null
++++ b/tools/perf/arch/x86/util/env.c
+@@ -0,0 +1,19 @@
++// SPDX-License-Identifier: GPL-2.0
++#include "linux/string.h"
++#include "util/env.h"
++#include "env.h"
++
++bool x86__is_amd_cpu(void)
++{
++ struct perf_env env = { .total_mem = 0, };
++ static int is_amd; /* 0: Uninitialized, 1: Yes, -1: No */
++
++ if (is_amd)
++ goto ret;
++
++ perf_env__cpuid(&env);
++ is_amd = env.cpuid && strstarts(env.cpuid, "AuthenticAMD") ? 1 : -1;
++ perf_env__exit(&env);
++ret:
++ return is_amd >= 1 ? true : false;
++}
+diff --git a/tools/perf/arch/x86/util/env.h b/tools/perf/arch/x86/util/env.h
+new file mode 100644
+index 0000000000000..d78f080b6b3f8
+--- /dev/null
++++ b/tools/perf/arch/x86/util/env.h
+@@ -0,0 +1,7 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++#ifndef _X86_ENV_H
++#define _X86_ENV_H
++
++bool x86__is_amd_cpu(void);
++
++#endif /* _X86_ENV_H */
+diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
+index ea3972d785d10..d72390cdf391d 100644
+--- a/tools/perf/arch/x86/util/evsel.c
++++ b/tools/perf/arch/x86/util/evsel.c
+@@ -7,6 +7,7 @@
+ #include "linux/string.h"
+ #include "evsel.h"
+ #include "util/debug.h"
++#include "env.h"
+
+ #define IBS_FETCH_L3MISSONLY (1ULL << 59)
+ #define IBS_OP_L3MISSONLY (1ULL << 16)
+@@ -97,23 +98,10 @@ void arch__post_evsel_config(struct evsel *evsel, struct perf_event_attr *attr)
+ {
+ struct perf_pmu *evsel_pmu, *ibs_fetch_pmu, *ibs_op_pmu;
+ static int warned_once;
+- /* 0: Uninitialized, 1: Yes, -1: No */
+- static int is_amd;
+
+- if (warned_once || is_amd == -1)
++ if (warned_once || !x86__is_amd_cpu())
+ return;
+
+- if (!is_amd) {
+- struct perf_env *env = evsel__env(evsel);
+-
+- if (!perf_env__cpuid(env) || !env->cpuid ||
+- !strstarts(env->cpuid, "AuthenticAMD")) {
+- is_amd = -1;
+- return;
+- }
+- is_amd = 1;
+- }
+-
+ evsel_pmu = evsel__find_pmu(evsel);
+ if (!evsel_pmu)
+ return;
+diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/util/mem-events.c
+index f683ac702247c..efc0fae9ed0a7 100644
+--- a/tools/perf/arch/x86/util/mem-events.c
++++ b/tools/perf/arch/x86/util/mem-events.c
+@@ -4,6 +4,7 @@
+ #include "map_symbol.h"
+ #include "mem-events.h"
+ #include "linux/string.h"
++#include "env.h"
+
+ static char mem_loads_name[100];
+ static bool mem_loads_name__init;
+@@ -26,28 +27,12 @@ static struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX] = {
+ E("mem-ldst", "ibs_op//", "ibs_op"),
+ };
+
+-static int perf_mem_is_amd_cpu(void)
+-{
+- struct perf_env env = { .total_mem = 0, };
+-
+- perf_env__cpuid(&env);
+- if (env.cpuid && strstarts(env.cpuid, "AuthenticAMD"))
+- return 1;
+- return -1;
+-}
+-
+ struct perf_mem_event *perf_mem_events__ptr(int i)
+ {
+- /* 0: Uninitialized, 1: Yes, -1: No */
+- static int is_amd;
+-
+ if (i >= PERF_MEM_EVENTS__MAX)
+ return NULL;
+
+- if (!is_amd)
+- is_amd = perf_mem_is_amd_cpu();
+-
+- if (is_amd == 1)
++ if (x86__is_amd_cpu())
+ return &perf_mem_events_amd[i];
+
+ return &perf_mem_events_intel[i];
+diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
+index 58f1cfe1eb34b..db435b791a09b 100644
+--- a/tools/perf/builtin-bench.c
++++ b/tools/perf/builtin-bench.c
+@@ -21,6 +21,7 @@
+ #include "builtin.h"
+ #include "bench/bench.h"
+
++#include <locale.h>
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <string.h>
+@@ -260,6 +261,7 @@ int cmd_bench(int argc, const char **argv)
+
+ /* Unbuffered output */
+ setvbuf(stdout, NULL, _IONBF, 0);
++ setlocale(LC_ALL, "");
+
+ if (argc < 2) {
+ /* No collection specified. */
+diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
+index c57be48d65bb0..2ecfca0fccda0 100644
+--- a/tools/perf/builtin-script.c
++++ b/tools/perf/builtin-script.c
+@@ -2422,6 +2422,9 @@ out_put:
+ return ret;
+ }
+
++// Used when scr->per_event_dump is not set
++static struct evsel_script es_stdout;
++
+ static int process_attr(struct perf_tool *tool, union perf_event *event,
+ struct evlist **pevlist)
+ {
+@@ -2430,7 +2433,6 @@ static int process_attr(struct perf_tool *tool, union perf_event *event,
+ struct evsel *evsel, *pos;
+ u64 sample_type;
+ int err;
+- static struct evsel_script *es;
+
+ err = perf_event__process_attr(tool, event, pevlist);
+ if (err)
+@@ -2440,14 +2442,13 @@ static int process_attr(struct perf_tool *tool, union perf_event *event,
+ evsel = evlist__last(*pevlist);
+
+ if (!evsel->priv) {
+- if (scr->per_event_dump) {
++ if (scr->per_event_dump) {
+ evsel->priv = evsel_script__new(evsel, scr->session->data);
+- } else {
+- es = zalloc(sizeof(*es));
+- if (!es)
++ if (!evsel->priv)
+ return -ENOMEM;
+- es->fp = stdout;
+- evsel->priv = es;
++ } else { // Replicate what is done in perf_script__setup_per_event_dump()
++ es_stdout.fp = stdout;
++ evsel->priv = &es_stdout;
+ }
+ }
+
+@@ -2753,7 +2754,6 @@ out_err_fclose:
+ static int perf_script__setup_per_event_dump(struct perf_script *script)
+ {
+ struct evsel *evsel;
+- static struct evsel_script es_stdout;
+
+ if (script->per_event_dump)
+ return perf_script__fopen_per_event_dump(script);
+diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
+index b9ad32f21e575..463643cda0d5f 100644
+--- a/tools/perf/builtin-stat.c
++++ b/tools/perf/builtin-stat.c
+@@ -723,6 +723,8 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
+ all_counters_use_bpf = false;
+ }
+
++ evlist__reset_aggr_stats(evsel_list);
++
+ evlist__for_each_cpu(evlist_cpu_itr, evsel_list, affinity) {
+ counter = evlist_cpu_itr.evsel;
+
+diff --git a/tools/perf/tests/shell/test_task_analyzer.sh b/tools/perf/tests/shell/test_task_analyzer.sh
+index a98e4ab66040e..365b61aea519a 100755
+--- a/tools/perf/tests/shell/test_task_analyzer.sh
++++ b/tools/perf/tests/shell/test_task_analyzer.sh
+@@ -5,6 +5,12 @@
+ tmpdir=$(mktemp -d /tmp/perf-script-task-analyzer-XXXXX)
+ err=0
+
++# set PERF_EXEC_PATH to find scripts in the source directory
++perfdir=$(dirname "$0")/../..
++if [ -e "$perfdir/scripts/python/Perf-Trace-Util" ]; then
++ export PERF_EXEC_PATH=$perfdir
++fi
++
+ cleanup() {
+ rm -f perf.data
+ rm -f perf.data.old
+@@ -31,7 +37,7 @@ report() {
+
+ check_exec_0() {
+ if [ $? != 0 ]; then
+- report 1 "invokation of ${$1} command failed"
++ report 1 "invocation of $1 command failed"
+ fi
+ }
+
+@@ -44,9 +50,20 @@ find_str_or_fail() {
+ fi
+ }
+
++# check if perf is compiled with libtraceevent support
++skip_no_probe_record_support() {
++ perf record -e "sched:sched_switch" -a -- sleep 1 2>&1 | grep "libtraceevent is necessary for tracepoint support" && return 2
++ return 0
++}
++
+ prepare_perf_data() {
+ # 1s should be sufficient to catch at least some switches
+ perf record -e sched:sched_switch -a -- sleep 1 > /dev/null 2>&1
++ # check if perf data file got created in above step.
++ if [ ! -e "perf.data" ]; then
++ printf "FAIL: perf record failed to create \"perf.data\" \n"
++ return 1
++ fi
+ }
+
+ # check standard inkvokation with no arguments
+@@ -134,6 +151,13 @@ test_csvsummary_extended() {
+ find_str_or_fail "Out-Out;" csvsummary ${FUNCNAME[0]}
+ }
+
++skip_no_probe_record_support
++err=$?
++if [ $err -ne 0 ]; then
++ echo "WARN: Skipping tests. No libtraceevent support"
++ cleanup
++ exit $err
++fi
+ prepare_perf_data
+ test_basic
+ test_ns_rename
+diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
+index 1d48226ae75d4..8d3cfbb3cc65b 100644
+--- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
++++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
+@@ -416,8 +416,6 @@ int contention_end(u64 *ctx)
+ return 0;
+ }
+
+-struct rq {};
+-
+ extern struct rq runqueues __ksym;
+
+ struct rq___old {
+diff --git a/tools/perf/util/bpf_skel/vmlinux.h b/tools/perf/util/bpf_skel/vmlinux.h
+index c7ed51b0c1ef9..ab84a6e1da5ee 100644
+--- a/tools/perf/util/bpf_skel/vmlinux.h
++++ b/tools/perf/util/bpf_skel/vmlinux.h
+@@ -171,4 +171,14 @@ struct bpf_perf_event_data_kern {
+ struct perf_sample_data *data;
+ struct perf_event *event;
+ } __attribute__((preserve_access_index));
++
++/*
++ * If 'struct rq' isn't defined for lock_contention.bpf.c, for the sake of
++ * rq___old and rq___new, then the type for the 'runqueue' variable ends up
++ * being a forward declaration (BTF_KIND_FWD) while the kernel has it defined
++ * (BTF_KIND_STRUCT). The definition appears in vmlinux.h rather than
++ * lock_contention.bpf.c for consistency with a generated vmlinux.h.
++ */
++struct rq {};
++
+ #endif // __VMLINUX_H
+diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
+index b074144097710..3bff678745635 100644
+--- a/tools/perf/util/dwarf-aux.c
++++ b/tools/perf/util/dwarf-aux.c
+@@ -1103,7 +1103,7 @@ int die_get_varname(Dwarf_Die *vr_die, struct strbuf *buf)
+ ret = die_get_typename(vr_die, buf);
+ if (ret < 0) {
+ pr_debug("Failed to get type, make it unknown.\n");
+- ret = strbuf_add(buf, " (unknown_type)", 14);
++ ret = strbuf_add(buf, "(unknown_type)", 14);
+ }
+
+ return ret < 0 ? ret : strbuf_addf(buf, "\t%s", dwarf_diename(vr_die));
+diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
+index 0f54f28a69c25..5a488803d368f 100644
+--- a/tools/perf/util/evsel.h
++++ b/tools/perf/util/evsel.h
+@@ -460,16 +460,24 @@ static inline int evsel__group_idx(struct evsel *evsel)
+ }
+
+ /* Iterates group WITHOUT the leader. */
+-#define for_each_group_member(_evsel, _leader) \
+-for ((_evsel) = list_entry((_leader)->core.node.next, struct evsel, core.node); \
+- (_evsel) && (_evsel)->core.leader == (&_leader->core); \
+- (_evsel) = list_entry((_evsel)->core.node.next, struct evsel, core.node))
++#define for_each_group_member_head(_evsel, _leader, _head) \
++for ((_evsel) = list_entry((_leader)->core.node.next, struct evsel, core.node); \
++ (_evsel) && &(_evsel)->core.node != (_head) && \
++ (_evsel)->core.leader == &(_leader)->core; \
++ (_evsel) = list_entry((_evsel)->core.node.next, struct evsel, core.node))
++
++#define for_each_group_member(_evsel, _leader) \
++ for_each_group_member_head(_evsel, _leader, &(_leader)->evlist->core.entries)
+
+ /* Iterates group WITH the leader. */
+-#define for_each_group_evsel(_evsel, _leader) \
+-for ((_evsel) = _leader; \
+- (_evsel) && (_evsel)->core.leader == (&_leader->core); \
+- (_evsel) = list_entry((_evsel)->core.node.next, struct evsel, core.node))
++#define for_each_group_evsel_head(_evsel, _leader, _head) \
++for ((_evsel) = _leader; \
++ (_evsel) && &(_evsel)->core.node != (_head) && \
++ (_evsel)->core.leader == &(_leader)->core; \
++ (_evsel) = list_entry((_evsel)->core.node.next, struct evsel, core.node))
++
++#define for_each_group_evsel(_evsel, _leader) \
++ for_each_group_evsel_head(_evsel, _leader, &(_leader)->evlist->core.entries)
+
+ static inline bool evsel__has_branch_callstack(const struct evsel *evsel)
+ {
+diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
+index cc80ec554c0a9..036a2171dc1c5 100644
+--- a/tools/perf/util/evsel_fprintf.c
++++ b/tools/perf/util/evsel_fprintf.c
+@@ -2,6 +2,7 @@
+ #include <inttypes.h>
+ #include <stdio.h>
+ #include <stdbool.h>
++#include "util/evlist.h"
+ #include "evsel.h"
+ #include "util/evsel_fprintf.h"
+ #include "util/event.h"
+diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
+index 5e9c657dd3f7a..b659b149e5b41 100644
+--- a/tools/perf/util/metricgroup.c
++++ b/tools/perf/util/metricgroup.c
+@@ -1146,7 +1146,7 @@ static int metricgroup__add_metric_callback(const struct pmu_metric *pm,
+
+ if (pm->metric_expr && match_pm_metric(pm, data->metric_name)) {
+ bool metric_no_group = data->metric_no_group ||
+- match_metric(data->metric_name, pm->metricgroup_no_group);
++ match_metric(pm->metricgroup_no_group, data->metric_name);
+
+ data->has_match = true;
+ ret = add_metric(data->list, pm, data->modifier, metric_no_group,
+diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
+index 34b48027b3def..403cd36087726 100644
+--- a/tools/testing/cxl/test/mem.c
++++ b/tools/testing/cxl/test/mem.c
+@@ -52,11 +52,11 @@ static struct cxl_cel_entry mock_cel[] = {
+ },
+ {
+ .opcode = cpu_to_le16(CXL_MBOX_OP_INJECT_POISON),
+- .effect = cpu_to_le16(0),
++ .effect = cpu_to_le16(EFFECT(2)),
+ },
+ {
+ .opcode = cpu_to_le16(CXL_MBOX_OP_CLEAR_POISON),
+- .effect = cpu_to_le16(0),
++ .effect = cpu_to_le16(EFFECT(2)),
+ },
+ };
+
+diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
+index f01f941061296..7f648802caf6a 100644
+--- a/tools/testing/kunit/kunit_kernel.py
++++ b/tools/testing/kunit/kunit_kernel.py
+@@ -92,7 +92,7 @@ class LinuxSourceTreeOperations:
+ if stderr: # likely only due to build warnings
+ print(stderr.decode())
+
+- def start(self, params: List[str], build_dir: str) -> subprocess.Popen[str]:
++ def start(self, params: List[str], build_dir: str) -> subprocess.Popen:
+ raise RuntimeError('not implemented!')
+
+
+@@ -113,7 +113,7 @@ class LinuxSourceTreeOperationsQemu(LinuxSourceTreeOperations):
+ kconfig.merge_in_entries(base_kunitconfig)
+ return kconfig
+
+- def start(self, params: List[str], build_dir: str) -> subprocess.Popen[str]:
++ def start(self, params: List[str], build_dir: str) -> subprocess.Popen:
+ kernel_path = os.path.join(build_dir, self._kernel_path)
+ qemu_command = ['qemu-system-' + self._qemu_arch,
+ '-nodefaults',
+@@ -142,7 +142,7 @@ class LinuxSourceTreeOperationsUml(LinuxSourceTreeOperations):
+ kconfig.merge_in_entries(base_kunitconfig)
+ return kconfig
+
+- def start(self, params: List[str], build_dir: str) -> subprocess.Popen[str]:
++ def start(self, params: List[str], build_dir: str) -> subprocess.Popen:
+ """Runs the Linux UML binary. Must be named 'linux'."""
+ linux_bin = os.path.join(build_dir, 'linux')
+ params.extend(['mem=1G', 'console=tty', 'kunit_shutdown=halt'])
+diff --git a/tools/testing/kunit/mypy.ini b/tools/testing/kunit/mypy.ini
+new file mode 100644
+index 0000000000000..ddd288309efaa
+--- /dev/null
++++ b/tools/testing/kunit/mypy.ini
+@@ -0,0 +1,6 @@
++[mypy]
++strict = True
++
++# E.g. we can't write subprocess.Popen[str] until Python 3.9+.
++# But kunit.py tries to support Python 3.7+, so let's disable it.
++disable_error_code = type-arg
+diff --git a/tools/testing/kunit/run_checks.py b/tools/testing/kunit/run_checks.py
+index 8208c3b3135ef..c6d494ea33739 100755
+--- a/tools/testing/kunit/run_checks.py
++++ b/tools/testing/kunit/run_checks.py
+@@ -23,7 +23,7 @@ commands: Dict[str, Sequence[str]] = {
+ 'kunit_tool_test.py': ['./kunit_tool_test.py'],
+ 'kunit smoke test': ['./kunit.py', 'run', '--kunitconfig=lib/kunit', '--build_dir=kunit_run_checks'],
+ 'pytype': ['/bin/sh', '-c', 'pytype *.py'],
+- 'mypy': ['mypy', '--strict', '--exclude', '_test.py$', '--exclude', 'qemu_configs/', '.'],
++ 'mypy': ['mypy', '--config-file', 'mypy.ini', '--exclude', '_test.py$', '--exclude', 'qemu_configs/', '.'],
+ }
+
+ # The user might not have mypy or pytype installed, skip them if so.
+diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
+index 28d2c77262bed..538df8fb8c42b 100644
+--- a/tools/testing/selftests/bpf/Makefile
++++ b/tools/testing/selftests/bpf/Makefile
+@@ -88,8 +88,7 @@ TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \
+ xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata \
+ xdp_features
+
+-TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read $(OUTPUT)/sign-file
+-TEST_GEN_FILES += liburandom_read.so
++TEST_GEN_FILES += liburandom_read.so urandom_read sign-file
+
+ # Emit succinct information message describing current building step
+ # $1 - generic step name (e.g., CC, LINK, etc);
+diff --git a/tools/testing/selftests/bpf/prog_tests/check_mtu.c b/tools/testing/selftests/bpf/prog_tests/check_mtu.c
+index 5338d2ea04603..2a9a30650350e 100644
+--- a/tools/testing/selftests/bpf/prog_tests/check_mtu.c
++++ b/tools/testing/selftests/bpf/prog_tests/check_mtu.c
+@@ -183,7 +183,7 @@ cleanup:
+
+ void serial_test_check_mtu(void)
+ {
+- __u32 mtu_lo;
++ int mtu_lo;
+
+ if (test__start_subtest("bpf_check_mtu XDP-attach"))
+ test_check_mtu_xdp_attach();
+diff --git a/tools/testing/selftests/bpf/progs/refcounted_kptr.c b/tools/testing/selftests/bpf/progs/refcounted_kptr.c
+index 1d348a225140d..a3da610b1e6b0 100644
+--- a/tools/testing/selftests/bpf/progs/refcounted_kptr.c
++++ b/tools/testing/selftests/bpf/progs/refcounted_kptr.c
+@@ -375,6 +375,8 @@ long rbtree_refcounted_node_ref_escapes(void *ctx)
+ bpf_rbtree_add(&aroot, &n->node, less_a);
+ m = bpf_refcount_acquire(n);
+ bpf_spin_unlock(&alock);
++ if (!m)
++ return 2;
+
+ m->key = 2;
+ bpf_obj_drop(m);
+diff --git a/tools/testing/selftests/bpf/progs/refcounted_kptr_fail.c b/tools/testing/selftests/bpf/progs/refcounted_kptr_fail.c
+index efcb308f80adf..0b09e5c915b15 100644
+--- a/tools/testing/selftests/bpf/progs/refcounted_kptr_fail.c
++++ b/tools/testing/selftests/bpf/progs/refcounted_kptr_fail.c
+@@ -29,7 +29,7 @@ static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
+ }
+
+ SEC("?tc")
+-__failure __msg("Unreleased reference id=3 alloc_insn=21")
++__failure __msg("Unreleased reference id=4 alloc_insn=21")
+ long rbtree_refcounted_node_ref_escapes(void *ctx)
+ {
+ struct node_acquire *n, *m;
+@@ -43,6 +43,8 @@ long rbtree_refcounted_node_ref_escapes(void *ctx)
+ /* m becomes an owning ref but is never drop'd or added to a tree */
+ m = bpf_refcount_acquire(n);
+ bpf_spin_unlock(&glock);
++ if (!m)
++ return 2;
+
+ m->key = 2;
+ return 0;
+diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
+index e4657c5bc3f12..4683ff84044d6 100644
+--- a/tools/testing/selftests/bpf/test_verifier.c
++++ b/tools/testing/selftests/bpf/test_verifier.c
+@@ -1227,45 +1227,46 @@ static bool cmp_str_seq(const char *log, const char *exp)
+ return true;
+ }
+
+-static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt)
++static struct bpf_insn *get_xlated_program(int fd_prog, int *cnt)
+ {
++ __u32 buf_element_size = sizeof(struct bpf_insn);
+ struct bpf_prog_info info = {};
+ __u32 info_len = sizeof(info);
+ __u32 xlated_prog_len;
+- __u32 buf_element_size = sizeof(struct bpf_insn);
++ struct bpf_insn *buf;
+
+ if (bpf_prog_get_info_by_fd(fd_prog, &info, &info_len)) {
+ perror("bpf_prog_get_info_by_fd failed");
+- return -1;
++ return NULL;
+ }
+
+ xlated_prog_len = info.xlated_prog_len;
+ if (xlated_prog_len % buf_element_size) {
+ printf("Program length %d is not multiple of %d\n",
+ xlated_prog_len, buf_element_size);
+- return -1;
++ return NULL;
+ }
+
+ *cnt = xlated_prog_len / buf_element_size;
+- *buf = calloc(*cnt, buf_element_size);
++ buf = calloc(*cnt, buf_element_size);
+ if (!buf) {
+ perror("can't allocate xlated program buffer");
+- return -ENOMEM;
++ return NULL;
+ }
+
+ bzero(&info, sizeof(info));
+ info.xlated_prog_len = xlated_prog_len;
+- info.xlated_prog_insns = (__u64)(unsigned long)*buf;
++ info.xlated_prog_insns = (__u64)(unsigned long)buf;
+ if (bpf_prog_get_info_by_fd(fd_prog, &info, &info_len)) {
+ perror("second bpf_prog_get_info_by_fd failed");
+ goto out_free_buf;
+ }
+
+- return 0;
++ return buf;
+
+ out_free_buf:
+- free(*buf);
+- return -1;
++ free(buf);
++ return NULL;
+ }
+
+ static bool is_null_insn(struct bpf_insn *insn)
+@@ -1398,7 +1399,8 @@ static bool check_xlated_program(struct bpf_test *test, int fd_prog)
+ if (!check_expected && !check_unexpected)
+ goto out;
+
+- if (get_xlated_program(fd_prog, &buf, &cnt)) {
++ buf = get_xlated_program(fd_prog, &cnt);
++ if (!buf) {
+ printf("FAIL: can't get xlated program\n");
+ result = false;
+ goto out;
+diff --git a/tools/testing/selftests/bpf/verifier/precise.c b/tools/testing/selftests/bpf/verifier/precise.c
+index 6c03a7d805f9d..ac5eeea094683 100644
+--- a/tools/testing/selftests/bpf/verifier/precise.c
++++ b/tools/testing/selftests/bpf/verifier/precise.c
+@@ -38,25 +38,24 @@
+ .fixup_map_array_48b = { 1 },
+ .result = VERBOSE_ACCEPT,
+ .errstr =
+- "26: (85) call bpf_probe_read_kernel#113\
+- last_idx 26 first_idx 20\
+- regs=4 stack=0 before 25\
+- regs=4 stack=0 before 24\
+- regs=4 stack=0 before 23\
+- regs=4 stack=0 before 22\
+- regs=4 stack=0 before 20\
+- parent didn't have regs=4 stack=0 marks\
+- last_idx 19 first_idx 10\
+- regs=4 stack=0 before 19\
+- regs=200 stack=0 before 18\
+- regs=300 stack=0 before 17\
+- regs=201 stack=0 before 15\
+- regs=201 stack=0 before 14\
+- regs=200 stack=0 before 13\
+- regs=200 stack=0 before 12\
+- regs=200 stack=0 before 11\
+- regs=200 stack=0 before 10\
+- parent already had regs=0 stack=0 marks",
++ "mark_precise: frame0: last_idx 26 first_idx 20\
++ mark_precise: frame0: regs=r2 stack= before 25\
++ mark_precise: frame0: regs=r2 stack= before 24\
++ mark_precise: frame0: regs=r2 stack= before 23\
++ mark_precise: frame0: regs=r2 stack= before 22\
++ mark_precise: frame0: regs=r2 stack= before 20\
++ mark_precise: frame0: parent state regs=r2 stack=:\
++ mark_precise: frame0: last_idx 19 first_idx 10\
++ mark_precise: frame0: regs=r2,r9 stack= before 19\
++ mark_precise: frame0: regs=r9 stack= before 18\
++ mark_precise: frame0: regs=r8,r9 stack= before 17\
++ mark_precise: frame0: regs=r0,r9 stack= before 15\
++ mark_precise: frame0: regs=r0,r9 stack= before 14\
++ mark_precise: frame0: regs=r9 stack= before 13\
++ mark_precise: frame0: regs=r9 stack= before 12\
++ mark_precise: frame0: regs=r9 stack= before 11\
++ mark_precise: frame0: regs=r9 stack= before 10\
++ mark_precise: frame0: parent state regs= stack=:",
+ },
+ {
+ "precise: test 2",
+@@ -100,20 +99,20 @@
+ .flags = BPF_F_TEST_STATE_FREQ,
+ .errstr =
+ "26: (85) call bpf_probe_read_kernel#113\
+- last_idx 26 first_idx 22\
+- regs=4 stack=0 before 25\
+- regs=4 stack=0 before 24\
+- regs=4 stack=0 before 23\
+- regs=4 stack=0 before 22\
+- parent didn't have regs=4 stack=0 marks\
+- last_idx 20 first_idx 20\
+- regs=4 stack=0 before 20\
+- parent didn't have regs=4 stack=0 marks\
+- last_idx 19 first_idx 17\
+- regs=4 stack=0 before 19\
+- regs=200 stack=0 before 18\
+- regs=300 stack=0 before 17\
+- parent already had regs=0 stack=0 marks",
++ mark_precise: frame0: last_idx 26 first_idx 22\
++ mark_precise: frame0: regs=r2 stack= before 25\
++ mark_precise: frame0: regs=r2 stack= before 24\
++ mark_precise: frame0: regs=r2 stack= before 23\
++ mark_precise: frame0: regs=r2 stack= before 22\
++ mark_precise: frame0: parent state regs=r2 stack=:\
++ mark_precise: frame0: last_idx 20 first_idx 20\
++ mark_precise: frame0: regs=r2,r9 stack= before 20\
++ mark_precise: frame0: parent state regs=r2,r9 stack=:\
++ mark_precise: frame0: last_idx 19 first_idx 17\
++ mark_precise: frame0: regs=r2,r9 stack= before 19\
++ mark_precise: frame0: regs=r9 stack= before 18\
++ mark_precise: frame0: regs=r8,r9 stack= before 17\
++ mark_precise: frame0: parent state regs= stack=:",
+ },
+ {
+ "precise: cross frame pruning",
+@@ -153,15 +152,15 @@
+ },
+ .prog_type = BPF_PROG_TYPE_XDP,
+ .flags = BPF_F_TEST_STATE_FREQ,
+- .errstr = "5: (2d) if r4 > r0 goto pc+0\
+- last_idx 5 first_idx 5\
+- parent didn't have regs=10 stack=0 marks\
+- last_idx 4 first_idx 2\
+- regs=10 stack=0 before 4\
+- regs=10 stack=0 before 3\
+- regs=0 stack=1 before 2\
+- last_idx 5 first_idx 5\
+- parent didn't have regs=1 stack=0 marks",
++ .errstr = "mark_precise: frame0: last_idx 5 first_idx 5\
++ mark_precise: frame0: parent state regs=r4 stack=:\
++ mark_precise: frame0: last_idx 4 first_idx 2\
++ mark_precise: frame0: regs=r4 stack= before 4\
++ mark_precise: frame0: regs=r4 stack= before 3\
++ mark_precise: frame0: regs= stack=-8 before 2\
++ mark_precise: frame0: falling back to forcing all scalars precise\
++ mark_precise: frame0: last_idx 5 first_idx 5\
++ mark_precise: frame0: parent state regs=r0 stack=:",
+ .result = VERBOSE_ACCEPT,
+ .retval = -1,
+ },
+@@ -179,16 +178,19 @@
+ },
+ .prog_type = BPF_PROG_TYPE_XDP,
+ .flags = BPF_F_TEST_STATE_FREQ,
+- .errstr = "last_idx 6 first_idx 6\
+- parent didn't have regs=10 stack=0 marks\
+- last_idx 5 first_idx 3\
+- regs=10 stack=0 before 5\
+- regs=10 stack=0 before 4\
+- regs=0 stack=1 before 3\
+- last_idx 6 first_idx 6\
+- parent didn't have regs=1 stack=0 marks\
+- last_idx 5 first_idx 3\
+- regs=1 stack=0 before 5",
++ .errstr = "mark_precise: frame0: last_idx 6 first_idx 6\
++ mark_precise: frame0: parent state regs=r4 stack=:\
++ mark_precise: frame0: last_idx 5 first_idx 3\
++ mark_precise: frame0: regs=r4 stack= before 5\
++ mark_precise: frame0: regs=r4 stack= before 4\
++ mark_precise: frame0: regs= stack=-8 before 3\
++ mark_precise: frame0: falling back to forcing all scalars precise\
++ force_precise: frame0: forcing r0 to be precise\
++ force_precise: frame0: forcing r0 to be precise\
++ mark_precise: frame0: last_idx 6 first_idx 6\
++ mark_precise: frame0: parent state regs=r0 stack=:\
++ mark_precise: frame0: last_idx 5 first_idx 3\
++ mark_precise: frame0: regs=r0 stack= before 5",
+ .result = VERBOSE_ACCEPT,
+ .retval = -1,
+ },
+diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
+index f4f7c0aef702b..a2a90f4bfe9fe 100644
+--- a/tools/testing/selftests/cgroup/test_memcontrol.c
++++ b/tools/testing/selftests/cgroup/test_memcontrol.c
+@@ -292,6 +292,7 @@ static int test_memcg_protection(const char *root, bool min)
+ char *children[4] = {NULL};
+ const char *attribute = min ? "memory.min" : "memory.low";
+ long c[4];
++ long current;
+ int i, attempts;
+ int fd;
+
+@@ -400,7 +401,8 @@ static int test_memcg_protection(const char *root, bool min)
+ goto cleanup;
+ }
+
+- if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50), 3))
++ current = min ? MB(50) : MB(30);
++ if (!values_close(cg_read_long(parent[1], "memory.current"), current, 3))
+ goto cleanup;
+
+ if (!reclaim_until(children[0], MB(10)))
+diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
+index 2506621e75dfb..cb5f18c06593d 100755
+--- a/tools/testing/selftests/ftrace/ftracetest
++++ b/tools/testing/selftests/ftrace/ftracetest
+@@ -301,7 +301,7 @@ ktaptest() { # result comment
+ comment="# $comment"
+ fi
+
+- echo $CASENO $result $INSTANCE$CASENAME $comment
++ echo $result $CASENO $INSTANCE$CASENAME $comment
+ }
+
+ eval_result() { # sigval
+diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
+index 383ac6fc037d0..ba286d680fd9a 100755
+--- a/tools/testing/selftests/net/rtnetlink.sh
++++ b/tools/testing/selftests/net/rtnetlink.sh
+@@ -860,6 +860,7 @@ EOF
+ fi
+
+ # clean up any leftovers
++ echo 0 > /sys/bus/netdevsim/del_device
+ $probed && rmmod netdevsim
+
+ if [ $ret -ne 0 ]; then
+diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
+index 21bacc928bf7b..d37d036876ea9 100644
+--- a/tools/testing/selftests/nolibc/nolibc-test.c
++++ b/tools/testing/selftests/nolibc/nolibc-test.c
+@@ -639,9 +639,9 @@ int run_stdlib(int min, int max)
+ CASE_TEST(limit_int_fast32_min); EXPECT_EQ(1, INT_FAST32_MIN, (int_fast32_t) INTPTR_MIN); break;
+ CASE_TEST(limit_int_fast32_max); EXPECT_EQ(1, INT_FAST32_MAX, (int_fast32_t) INTPTR_MAX); break;
+ CASE_TEST(limit_uint_fast32_max); EXPECT_EQ(1, UINT_FAST32_MAX, (uint_fast32_t) UINTPTR_MAX); break;
+- CASE_TEST(limit_int_fast64_min); EXPECT_EQ(1, INT_FAST64_MIN, (int_fast64_t) INTPTR_MIN); break;
+- CASE_TEST(limit_int_fast64_max); EXPECT_EQ(1, INT_FAST64_MAX, (int_fast64_t) INTPTR_MAX); break;
+- CASE_TEST(limit_uint_fast64_max); EXPECT_EQ(1, UINT_FAST64_MAX, (uint_fast64_t) UINTPTR_MAX); break;
++ CASE_TEST(limit_int_fast64_min); EXPECT_EQ(1, INT_FAST64_MIN, (int_fast64_t) INT64_MIN); break;
++ CASE_TEST(limit_int_fast64_max); EXPECT_EQ(1, INT_FAST64_MAX, (int_fast64_t) INT64_MAX); break;
++ CASE_TEST(limit_uint_fast64_max); EXPECT_EQ(1, UINT_FAST64_MAX, (uint_fast64_t) UINT64_MAX); break;
+ #if __SIZEOF_LONG__ == 8
+ CASE_TEST(limit_intptr_min); EXPECT_EQ(1, INTPTR_MIN, (intptr_t) 0x8000000000000000LL); break;
+ CASE_TEST(limit_intptr_max); EXPECT_EQ(1, INTPTR_MAX, (intptr_t) 0x7fffffffffffffffLL); break;
+diff --git a/tools/testing/selftests/rcutorture/configs/rcu/BUSTED-BOOST.boot b/tools/testing/selftests/rcutorture/configs/rcu/BUSTED-BOOST.boot
+index f57720c52c0f9..84f6bb98ce993 100644
+--- a/tools/testing/selftests/rcutorture/configs/rcu/BUSTED-BOOST.boot
++++ b/tools/testing/selftests/rcutorture/configs/rcu/BUSTED-BOOST.boot
+@@ -5,4 +5,4 @@ rcutree.gp_init_delay=3
+ rcutree.gp_cleanup_delay=3
+ rcutree.kthread_prio=2
+ threadirqs
+-tree.use_softirq=0
++rcutree.use_softirq=0
+diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
+index 64f864f1f361f..8e50bfd4b710d 100644
+--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
++++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE03.boot
+@@ -4,4 +4,4 @@ rcutree.gp_init_delay=3
+ rcutree.gp_cleanup_delay=3
+ rcutree.kthread_prio=2
+ threadirqs
+-tree.use_softirq=0
++rcutree.use_softirq=0
+diff --git a/tools/testing/selftests/vDSO/vdso_test_clock_getres.c b/tools/testing/selftests/vDSO/vdso_test_clock_getres.c
+index 15dcee16ff726..38d46a8bf7cba 100644
+--- a/tools/testing/selftests/vDSO/vdso_test_clock_getres.c
++++ b/tools/testing/selftests/vDSO/vdso_test_clock_getres.c
+@@ -84,12 +84,12 @@ static inline int vdso_test_clock(unsigned int clock_id)
+
+ int main(int argc, char **argv)
+ {
+- int ret;
++ int ret = 0;
+
+ #if _POSIX_TIMERS > 0
+
+ #ifdef CLOCK_REALTIME
+- ret = vdso_test_clock(CLOCK_REALTIME);
++ ret += vdso_test_clock(CLOCK_REALTIME);
+ #endif
+
+ #ifdef CLOCK_BOOTTIME
+diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh
+index 69c7796c7ca92..405ff262ca93d 100755
+--- a/tools/testing/selftests/wireguard/netns.sh
++++ b/tools/testing/selftests/wireguard/netns.sh
+@@ -514,10 +514,32 @@ n2 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/all/rp_filter'
+ n1 ping -W 1 -c 1 192.168.241.2
+ [[ $(n2 wg show wg0 endpoints) == "$pub1 10.0.0.3:1" ]]
+
+-ip1 link del veth1
+-ip1 link del veth3
+-ip1 link del wg0
+-ip2 link del wg0
++ip1 link del dev veth3
++ip1 link del dev wg0
++ip2 link del dev wg0
++
++# Make sure persistent keep alives are sent when an adapter comes up
++ip1 link add dev wg0 type wireguard
++n1 wg set wg0 private-key <(echo "$key1") peer "$pub2" endpoint 10.0.0.1:1 persistent-keepalive 1
++read _ _ tx_bytes < <(n1 wg show wg0 transfer)
++[[ $tx_bytes -eq 0 ]]
++ip1 link set dev wg0 up
++read _ _ tx_bytes < <(n1 wg show wg0 transfer)
++[[ $tx_bytes -gt 0 ]]
++ip1 link del dev wg0
++# This should also happen even if the private key is set later
++ip1 link add dev wg0 type wireguard
++n1 wg set wg0 peer "$pub2" endpoint 10.0.0.1:1 persistent-keepalive 1
++read _ _ tx_bytes < <(n1 wg show wg0 transfer)
++[[ $tx_bytes -eq 0 ]]
++ip1 link set dev wg0 up
++read _ _ tx_bytes < <(n1 wg show wg0 transfer)
++[[ $tx_bytes -eq 0 ]]
++n1 wg set wg0 private-key <(echo "$key1")
++read _ _ tx_bytes < <(n1 wg show wg0 transfer)
++[[ $tx_bytes -gt 0 ]]
++ip1 link del dev veth1
++ip1 link del dev wg0
+
+ # We test that Netlink/IPC is working properly by doing things that usually cause split responses
+ ip0 link add dev wg0 type wireguard
+diff --git a/tools/tracing/rtla/src/osnoise_top.c b/tools/tracing/rtla/src/osnoise_top.c
+index 562f2e4b18c57..3ece8c09ecd95 100644
+--- a/tools/tracing/rtla/src/osnoise_top.c
++++ b/tools/tracing/rtla/src/osnoise_top.c
+@@ -340,8 +340,14 @@ struct osnoise_top_params *osnoise_top_parse_args(int argc, char **argv)
+ if (!params)
+ exit(1);
+
+- if (strcmp(argv[0], "hwnoise") == 0)
++ if (strcmp(argv[0], "hwnoise") == 0) {
+ params->mode = MODE_HWNOISE;
++ /*
++ * Reduce CPU usage for 75% to avoid killing the system.
++ */
++ params->runtime = 750000;
++ params->period = 1000000;
++ }
+
+ while (1) {
+ static struct option long_options[] = {
+diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile
+index 7b7139d97d742..d128925980e05 100644
+--- a/tools/virtio/Makefile
++++ b/tools/virtio/Makefile
+@@ -4,7 +4,18 @@ test: virtio_test vringh_test
+ virtio_test: virtio_ring.o virtio_test.o
+ vringh_test: vringh_test.o vringh.o virtio_ring.o
+
+-CFLAGS += -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h -mfunction-return=thunk -fcf-protection=none -mindirect-branch-register
++try-run = $(shell set -e; \
++ if ($(1)) >/dev/null 2>&1; \
++ then echo "$(2)"; \
++ else echo "$(3)"; \
++ fi)
++
++__cc-option = $(call try-run,\
++ $(1) -Werror $(2) -c -x c /dev/null -o /dev/null,$(2),)
++cc-option = $(call __cc-option, $(CC),$(1))
++
++CFLAGS += -g -O2 -Werror -Wno-maybe-uninitialized -Wall -I. -I../include/ -I ../../usr/include/ -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -include ../../include/linux/kconfig.h $(call cc-option,-mfunction-return=thunk) $(call cc-option,-fcf-protection=none) $(call cc-option,-mindirect-branch-register)
++
+ CFLAGS += -pthread
+ LDFLAGS += -pthread
+ vpath %.c ../../drivers/virtio ../../drivers/vhost
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-19 17:20 Mike Pagano
From: Mike Pagano @ 2023-07-19 17:20 UTC
To: gentoo-commits
commit: c8d07676d1b037a176fadabbb45ae5dd6e8c831a
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Jul 19 17:19:54 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Jul 19 17:19:54 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=c8d07676
Remove redundant patch
Removed:
2400_wireguard-queueing-cpu-sel-wrapping-fix.patch
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 -
2400_wireguard-queueing-cpu-sel-wrapping-fix.patch | 116 ---------------------
2 files changed, 120 deletions(-)
diff --git a/0000_README b/0000_README
index 2532d9e5..8da93495 100644
--- a/0000_README
+++ b/0000_README
@@ -79,10 +79,6 @@ Patch: 2000_BT-Check-key-sizes-only-if-Secure-Simple-Pairing-enabled.patch
From: https://lore.kernel.org/linux-bluetooth/20190522070540.48895-1-marcel@holtmann.org/raw
Desc: Bluetooth: Check key sizes only when Secure Simple Pairing is enabled. See bug #686758
-Patch: 2400_wireguard-queueing-cpu-sel-wrapping-fix.patch
-From: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=7387943fa35516f6f8017a3b0e9ce48a3bef9faa
-Desc: wireguard: queueing: use saner cpu selection wrapping
-
Patch: 2900_tmp513-Fix-build-issue-by-selecting-CONFIG_REG.patch
From: https://bugs.gentoo.org/710790
Desc: tmp513 requies REGMAP_I2C to build. Select it by default in Kconfig. See bug #710790. Thanks to Phil Stracchino
diff --git a/2400_wireguard-queueing-cpu-sel-wrapping-fix.patch b/2400_wireguard-queueing-cpu-sel-wrapping-fix.patch
deleted file mode 100644
index fa199039..00000000
--- a/2400_wireguard-queueing-cpu-sel-wrapping-fix.patch
+++ /dev/null
@@ -1,116 +0,0 @@
-From 7387943fa35516f6f8017a3b0e9ce48a3bef9faa Mon Sep 17 00:00:00 2001
-From: "Jason A. Donenfeld" <Jason@zx2c4.com>
-Date: Mon, 3 Jul 2023 03:27:04 +0200
-Subject: wireguard: queueing: use saner cpu selection wrapping
-
-Using `% nr_cpumask_bits` is slow and complicated, and not totally
-robust toward dynamic changes to CPU topologies. Rather than storing the
-next CPU in the round-robin, just store the last one, and also return
-that value. This simplifies the loop drastically into a much more common
-pattern.
-
-Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
-Cc: stable@vger.kernel.org
-Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
-Tested-by: Manuel Leiner <manuel.leiner@gmx.de>
-Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
-Signed-off-by: David S. Miller <davem@davemloft.net>
----
- drivers/net/wireguard/queueing.c | 1 +
- drivers/net/wireguard/queueing.h | 25 +++++++++++--------------
- drivers/net/wireguard/receive.c | 2 +-
- drivers/net/wireguard/send.c | 2 +-
- 4 files changed, 14 insertions(+), 16 deletions(-)
-
-diff --git a/drivers/net/wireguard/queueing.c b/drivers/net/wireguard/queueing.c
-index 8084e7408c0ae..26d235d152352 100644
---- a/drivers/net/wireguard/queueing.c
-+++ b/drivers/net/wireguard/queueing.c
-@@ -28,6 +28,7 @@ int wg_packet_queue_init(struct crypt_queue *queue, work_func_t function,
- int ret;
-
- memset(queue, 0, sizeof(*queue));
-+ queue->last_cpu = -1;
- ret = ptr_ring_init(&queue->ring, len, GFP_KERNEL);
- if (ret)
- return ret;
-diff --git a/drivers/net/wireguard/queueing.h b/drivers/net/wireguard/queueing.h
-index 125284b346a77..1ea4f874e367e 100644
---- a/drivers/net/wireguard/queueing.h
-+++ b/drivers/net/wireguard/queueing.h
-@@ -117,20 +117,17 @@ static inline int wg_cpumask_choose_online(int *stored_cpu, unsigned int id)
- return cpu;
- }
-
--/* This function is racy, in the sense that next is unlocked, so it could return
-- * the same CPU twice. A race-free version of this would be to instead store an
-- * atomic sequence number, do an increment-and-return, and then iterate through
-- * every possible CPU until we get to that index -- choose_cpu. However that's
-- * a bit slower, and it doesn't seem like this potential race actually
-- * introduces any performance loss, so we live with it.
-+/* This function is racy, in the sense that it's called while last_cpu is
-+ * unlocked, so it could return the same CPU twice. Adding locking or using
-+ * atomic sequence numbers is slower though, and the consequences of racing are
-+ * harmless, so live with it.
- */
--static inline int wg_cpumask_next_online(int *next)
-+static inline int wg_cpumask_next_online(int *last_cpu)
- {
-- int cpu = *next;
--
-- while (unlikely(!cpumask_test_cpu(cpu, cpu_online_mask)))
-- cpu = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
-- *next = cpumask_next(cpu, cpu_online_mask) % nr_cpumask_bits;
-+ int cpu = cpumask_next(*last_cpu, cpu_online_mask);
-+ if (cpu >= nr_cpu_ids)
-+ cpu = cpumask_first(cpu_online_mask);
-+ *last_cpu = cpu;
- return cpu;
- }
-
-@@ -159,7 +156,7 @@ static inline void wg_prev_queue_drop_peeked(struct prev_queue *queue)
-
- static inline int wg_queue_enqueue_per_device_and_peer(
- struct crypt_queue *device_queue, struct prev_queue *peer_queue,
-- struct sk_buff *skb, struct workqueue_struct *wq, int *next_cpu)
-+ struct sk_buff *skb, struct workqueue_struct *wq)
- {
- int cpu;
-
-@@ -173,7 +170,7 @@ static inline int wg_queue_enqueue_per_device_and_peer(
- /* Then we queue it up in the device queue, which consumes the
- * packet as soon as it can.
- */
-- cpu = wg_cpumask_next_online(next_cpu);
-+ cpu = wg_cpumask_next_online(&device_queue->last_cpu);
- if (unlikely(ptr_ring_produce_bh(&device_queue->ring, skb)))
- return -EPIPE;
- queue_work_on(cpu, wq, &per_cpu_ptr(device_queue->worker, cpu)->work);
-diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
-index 7135d51d2d872..0b3f0c8435509 100644
---- a/drivers/net/wireguard/receive.c
-+++ b/drivers/net/wireguard/receive.c
-@@ -524,7 +524,7 @@ static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)
- goto err;
-
- ret = wg_queue_enqueue_per_device_and_peer(&wg->decrypt_queue, &peer->rx_queue, skb,
-- wg->packet_crypt_wq, &wg->decrypt_queue.last_cpu);
-+ wg->packet_crypt_wq);
- if (unlikely(ret == -EPIPE))
- wg_queue_enqueue_per_peer_rx(skb, PACKET_STATE_DEAD);
- if (likely(!ret || ret == -EPIPE)) {
-diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c
-index 5368f7c35b4bf..95c853b59e1da 100644
---- a/drivers/net/wireguard/send.c
-+++ b/drivers/net/wireguard/send.c
-@@ -318,7 +318,7 @@ static void wg_packet_create_data(struct wg_peer *peer, struct sk_buff *first)
- goto err;
-
- ret = wg_queue_enqueue_per_device_and_peer(&wg->encrypt_queue, &peer->tx_queue, first,
-- wg->packet_crypt_wq, &wg->encrypt_queue.last_cpu);
-+ wg->packet_crypt_wq);
- if (unlikely(ret == -EPIPE))
- wg_queue_enqueue_per_peer_tx(first, PACKET_STATE_DEAD);
- err:
---
-cgit
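
For readers following the queueing change quoted above, here is a minimal, self-contained C sketch of the pattern the removed patch describes: remember the last CPU handed out and advance from it, wrapping back to the first online CPU, instead of computing `% nr_cpumask_bits`. This is illustrative only, not kernel code; the names (pick_next_online, next_online_after, first_online) and the fixed online[] array stand in for the kernel's cpumask API.

/*
 * Illustrative sketch only -- not kernel code. Models the "store the
 * last CPU and wrap" pattern from the wireguard queueing fix above,
 * using a fixed online[] array in place of cpu_online_mask.
 */
#include <stdio.h>
#include <stdbool.h>

#define NR_CPUS 8

static const bool online[NR_CPUS] = {
	true, false, true, true, false, true, false, false
};

/* Analogue of cpumask_next(): first online CPU strictly after @cpu,
 * or NR_CPUS if there is none. */
static int next_online_after(int cpu)
{
	for (int i = cpu + 1; i < NR_CPUS; i++)
		if (online[i])
			return i;
	return NR_CPUS;
}

/* Analogue of cpumask_first(). */
static int first_online(void)
{
	return next_online_after(-1);
}

/* Analogue of the patched wg_cpumask_next_online(): advance from the
 * last CPU handed out, wrap to the first online CPU at the end of the
 * range, and record the choice for the next caller. */
static int pick_next_online(int *last_cpu)
{
	int cpu = next_online_after(*last_cpu);

	if (cpu >= NR_CPUS)
		cpu = first_online();
	*last_cpu = cpu;
	return cpu;
}

int main(void)
{
	int last = -1;

	for (int i = 0; i < 6; i++)
		printf("packet %d -> CPU %d\n", i, pick_next_online(&last));
	return 0;
}

As in the kernel version, this is racy if several producers share last_cpu without locking; the worst case is two producers choosing the same CPU, which the original commit message treats as harmless.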
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-23 15:15 Mike Pagano
From: Mike Pagano @ 2023-07-23 15:15 UTC
To: gentoo-commits
commit: 54240facd41208ca7892ff518c1f6dd5ec66a375
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sun Jul 23 15:15:18 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sun Jul 23 15:15:18 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=54240fac
Linux patch 6.4.5
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1004_linux-6.4.5.patch | 13476 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 13480 insertions(+)
diff --git a/0000_README b/0000_README
index 8da93495..9d316aeb 100644
--- a/0000_README
+++ b/0000_README
@@ -59,6 +59,10 @@ Patch: 1003_linux-6.4.4.patch
From: https://www.kernel.org
Desc: Linux 6.4.4
+Patch: 1004_linux-6.4.5.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.5
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1004_linux-6.4.5.patch b/1004_linux-6.4.5.patch
new file mode 100644
index 00000000..06e41875
--- /dev/null
+++ b/1004_linux-6.4.5.patch
@@ -0,0 +1,13476 @@
+diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
+index 9e311bc43e05e..cd46e2b20a814 100644
+--- a/Documentation/arm64/silicon-errata.rst
++++ b/Documentation/arm64/silicon-errata.rst
+@@ -52,6 +52,9 @@ stable kernels.
+ | Allwinner | A64/R18 | UNKNOWN1 | SUN50I_ERRATUM_UNKNOWN1 |
+ +----------------+-----------------+-----------------+-----------------------------+
+ +----------------+-----------------+-----------------+-----------------------------+
++| Ampere | AmpereOne | AC03_CPU_38 | AMPERE_ERRATUM_AC03_CPU_38 |
+++----------------+-----------------+-----------------+-----------------------------+
+++----------------+-----------------+-----------------+-----------------------------+
+ | ARM | Cortex-A510 | #2457168 | ARM64_ERRATUM_2457168 |
+ +----------------+-----------------+-----------------+-----------------------------+
+ | ARM | Cortex-A510 | #2064142 | ARM64_ERRATUM_2064142 |
+diff --git a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
+index 68ca343c3b44a..2d6e3bbdd0404 100644
+--- a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
++++ b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
+@@ -122,7 +122,7 @@ for all the route entries and call ``VIDIOC_SUBDEV_G_ROUTING`` again.
+ :widths: 3 1 4
+
+ * - V4L2_SUBDEV_ROUTE_FL_ACTIVE
+- - 0
++ - 0x0001
+ - The route is enabled. Set by applications.
+
+ Return Value
+diff --git a/Makefile b/Makefile
+index d5041f7daf689..c324529158cc6 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 4
++SUBLEVEL = 5
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+@@ -1561,6 +1561,8 @@ modules_sign_only := y
+ endif
+ endif
+
++endif # CONFIG_MODULES
++
+ modinst_pre :=
+ ifneq ($(filter modules_install,$(MAKECMDGOALS)),)
+ modinst_pre := __modinst_pre
+@@ -1571,18 +1573,18 @@ PHONY += __modinst_pre
+ __modinst_pre:
+ @rm -rf $(MODLIB)/kernel
+ @rm -f $(MODLIB)/source
+- @mkdir -p $(MODLIB)/kernel
++ @mkdir -p $(MODLIB)
++ifdef CONFIG_MODULES
+ @ln -s $(abspath $(srctree)) $(MODLIB)/source
+ @if [ ! $(objtree) -ef $(MODLIB)/build ]; then \
+ rm -f $(MODLIB)/build ; \
+ ln -s $(CURDIR) $(MODLIB)/build ; \
+ fi
+ @sed 's:^\(.*\)\.o$$:kernel/\1.ko:' modules.order > $(MODLIB)/modules.order
++endif
+ @cp -f modules.builtin $(MODLIB)/
+ @cp -f $(objtree)/modules.builtin.modinfo $(MODLIB)/
+
+-endif # CONFIG_MODULES
+-
+ ###
+ # Cleaning is done on three levels.
+ # make clean Delete most generated files
+@@ -1924,6 +1926,13 @@ help:
+ @echo ' clean - remove generated files in module directory only'
+ @echo ''
+
++__external_modules_error:
++ @echo >&2 '***'
++ @echo >&2 '*** The present kernel disabled CONFIG_MODULES.'
++ @echo >&2 '*** You cannot build or install external modules.'
++ @echo >&2 '***'
++ @false
++
+ endif # KBUILD_EXTMOD
+
+ # ---------------------------------------------------------------------------
+@@ -1960,13 +1969,10 @@ else # CONFIG_MODULES
+ # Modules not configured
+ # ---------------------------------------------------------------------------
+
+-modules modules_install:
+- @echo >&2 '***'
+- @echo >&2 '*** The present kernel configuration has modules disabled.'
+- @echo >&2 '*** To use the module feature, please run "make menuconfig" etc.'
+- @echo >&2 '*** to enable CONFIG_MODULES.'
+- @echo >&2 '***'
+- @exit 1
++PHONY += __external_modules_error
++
++modules modules_install: __external_modules_error
++ @:
+
+ KBUILD_MODULES :=
+
+diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
+index 92f3fff2522b0..b2326674b4d3e 100644
+--- a/arch/arm64/Kconfig
++++ b/arch/arm64/Kconfig
+@@ -407,6 +407,25 @@ menu "Kernel Features"
+
+ menu "ARM errata workarounds via the alternatives framework"
+
++config AMPERE_ERRATUM_AC03_CPU_38
++ bool "AmpereOne: AC03_CPU_38: Certain bits in the Virtualization Translation Control Register and Translation Control Registers do not follow RES0 semantics"
++ default y
++ help
++ This option adds an alternative code sequence to work around Ampere
++ erratum AC03_CPU_38 on AmpereOne.
++
++ The affected design reports FEAT_HAFDBS as not implemented in
++ ID_AA64MMFR1_EL1.HAFDBS, but (V)TCR_ELx.{HA,HD} are not RES0
++ as required by the architecture. The unadvertised HAFDBS
++ implementation suffers from an additional erratum where hardware
++ A/D updates can occur after a PTE has been marked invalid.
++
++ The workaround forces KVM to explicitly set VTCR_EL2.HA to 0,
++ which avoids enabling unadvertised hardware Access Flag management
++ at stage-2.
++
++ If unsure, say Y.
++
+ config ARM64_WORKAROUND_CLEAN_CACHE
+ bool
+
+diff --git a/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso b/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso
+index 84aa229e80f37..e48881be4ed60 100644
+--- a/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso
++++ b/arch/arm64/boot/dts/mediatek/mt7986a-bananapi-bpi-r3-nor.dtso
+@@ -27,15 +27,10 @@
+
+ partition@0 {
+ label = "bl2";
+- reg = <0x0 0x20000>;
++ reg = <0x0 0x40000>;
+ read-only;
+ };
+
+- partition@20000 {
+- label = "reserved";
+- reg = <0x20000 0x20000>;
+- };
+-
+ partition@40000 {
+ label = "u-boot-env";
+ reg = <0x40000 0x40000>;
+diff --git a/arch/arm64/boot/dts/ti/k3-am68-sk-base-board.dts b/arch/arm64/boot/dts/ti/k3-am68-sk-base-board.dts
+index 27a43a8ecffd4..67bcf906ded19 100644
+--- a/arch/arm64/boot/dts/ti/k3-am68-sk-base-board.dts
++++ b/arch/arm64/boot/dts/ti/k3-am68-sk-base-board.dts
+@@ -175,49 +175,49 @@
+ };
+ };
+
+-&wkup_pmx0 {
++&wkup_pmx2 {
+ mcu_cpsw_pins_default: mcu-cpsw-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x094, PIN_INPUT, 0) /* (B22) MCU_RGMII1_RD0 */
+- J721S2_WKUP_IOPAD(0x090, PIN_INPUT, 0) /* (B21) MCU_RGMII1_RD1 */
+- J721S2_WKUP_IOPAD(0x08c, PIN_INPUT, 0) /* (C22) MCU_RGMII1_RD2 */
+- J721S2_WKUP_IOPAD(0x088, PIN_INPUT, 0) /* (D23) MCU_RGMII1_RD3 */
+- J721S2_WKUP_IOPAD(0x084, PIN_INPUT, 0) /* (D22) MCU_RGMII1_RXC */
+- J721S2_WKUP_IOPAD(0x06c, PIN_INPUT, 0) /* (E23) MCU_RGMII1_RX_CTL */
+- J721S2_WKUP_IOPAD(0x07c, PIN_OUTPUT, 0) /* (F23) MCU_RGMII1_TD0 */
+- J721S2_WKUP_IOPAD(0x078, PIN_OUTPUT, 0) /* (G22) MCU_RGMII1_TD1 */
+- J721S2_WKUP_IOPAD(0x074, PIN_OUTPUT, 0) /* (E21) MCU_RGMII1_TD2 */
+- J721S2_WKUP_IOPAD(0x070, PIN_OUTPUT, 0) /* (E22) MCU_RGMII1_TD3 */
+- J721S2_WKUP_IOPAD(0x080, PIN_OUTPUT, 0) /* (F21) MCU_RGMII1_TXC */
+- J721S2_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /* (F22) MCU_RGMII1_TX_CTL */
++ J721S2_WKUP_IOPAD(0x02C, PIN_INPUT, 0) /* (B22) MCU_RGMII1_RD0 */
++ J721S2_WKUP_IOPAD(0x028, PIN_INPUT, 0) /* (B21) MCU_RGMII1_RD1 */
++ J721S2_WKUP_IOPAD(0x024, PIN_INPUT, 0) /* (C22) MCU_RGMII1_RD2 */
++ J721S2_WKUP_IOPAD(0x020, PIN_INPUT, 0) /* (D23) MCU_RGMII1_RD3 */
++ J721S2_WKUP_IOPAD(0x01C, PIN_INPUT, 0) /* (D22) MCU_RGMII1_RXC */
++ J721S2_WKUP_IOPAD(0x004, PIN_INPUT, 0) /* (E23) MCU_RGMII1_RX_CTL */
++ J721S2_WKUP_IOPAD(0x014, PIN_OUTPUT, 0) /* (F23) MCU_RGMII1_TD0 */
++ J721S2_WKUP_IOPAD(0x010, PIN_OUTPUT, 0) /* (G22) MCU_RGMII1_TD1 */
++ J721S2_WKUP_IOPAD(0x00C, PIN_OUTPUT, 0) /* (E21) MCU_RGMII1_TD2 */
++ J721S2_WKUP_IOPAD(0x008, PIN_OUTPUT, 0) /* (E22) MCU_RGMII1_TD3 */
++ J721S2_WKUP_IOPAD(0x018, PIN_OUTPUT, 0) /* (F21) MCU_RGMII1_TXC */
++ J721S2_WKUP_IOPAD(0x000, PIN_OUTPUT, 0) /* (F22) MCU_RGMII1_TX_CTL */
+ >;
+ };
+
+ mcu_mdio_pins_default: mcu-mdio-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x09c, PIN_OUTPUT, 0) /* (A21) MCU_MDIO0_MDC */
+- J721S2_WKUP_IOPAD(0x098, PIN_INPUT, 0) /* (A22) MCU_MDIO0_MDIO */
++ J721S2_WKUP_IOPAD(0x034, PIN_OUTPUT, 0) /* (A21) MCU_MDIO0_MDC */
++ J721S2_WKUP_IOPAD(0x030, PIN_INPUT, 0) /* (A22) MCU_MDIO0_MDIO */
+ >;
+ };
+
+ mcu_mcan0_pins_default: mcu-mcan0-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x0bc, PIN_INPUT, 0) /* (E28) MCU_MCAN0_RX */
+- J721S2_WKUP_IOPAD(0x0b8, PIN_OUTPUT, 0) /* (E27) MCU_MCAN0_TX */
++ J721S2_WKUP_IOPAD(0x054, PIN_INPUT, 0) /* (E28) MCU_MCAN0_RX */
++ J721S2_WKUP_IOPAD(0x050, PIN_OUTPUT, 0) /* (E27) MCU_MCAN0_TX */
+ >;
+ };
+
+ mcu_mcan1_pins_default: mcu-mcan1-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x0d4, PIN_INPUT, 0) /* (F26) WKUP_GPIO0_5.MCU_MCAN1_RX */
+- J721S2_WKUP_IOPAD(0x0d0, PIN_OUTPUT, 0) /* (C23) WKUP_GPIO0_4.MCU_MCAN1_TX*/
++ J721S2_WKUP_IOPAD(0x06C, PIN_INPUT, 0) /* (F26) WKUP_GPIO0_5.MCU_MCAN1_RX */
++ J721S2_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /* (C23) WKUP_GPIO0_4.MCU_MCAN1_TX*/
+ >;
+ };
+
+ mcu_i2c1_pins_default: mcu-i2c1-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x0e0, PIN_INPUT, 0) /* (F24) WKUP_GPIO0_8.MCU_I2C1_SCL */
+- J721S2_WKUP_IOPAD(0x0e4, PIN_INPUT, 0) /* (H26) WKUP_GPIO0_9.MCU_I2C1_SDA */
++ J721S2_WKUP_IOPAD(0x078, PIN_INPUT, 0) /* (F24) WKUP_GPIO0_8.MCU_I2C1_SCL */
++ J721S2_WKUP_IOPAD(0x07c, PIN_INPUT, 0) /* (H26) WKUP_GPIO0_9.MCU_I2C1_SDA */
+ >;
+ };
+ };
+diff --git a/arch/arm64/boot/dts/ti/k3-j721s2-common-proc-board.dts b/arch/arm64/boot/dts/ti/k3-j721s2-common-proc-board.dts
+index b4b9edfe2d12c..a7caf1c3404d5 100644
+--- a/arch/arm64/boot/dts/ti/k3-j721s2-common-proc-board.dts
++++ b/arch/arm64/boot/dts/ti/k3-j721s2-common-proc-board.dts
+@@ -146,81 +146,81 @@
+ };
+ };
+
+-&wkup_pmx0 {
++&wkup_pmx2 {
+ mcu_cpsw_pins_default: mcu-cpsw-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x094, PIN_INPUT, 0) /* (B22) MCU_RGMII1_RD0 */
+- J721S2_WKUP_IOPAD(0x090, PIN_INPUT, 0) /* (B21) MCU_RGMII1_RD1 */
+- J721S2_WKUP_IOPAD(0x08c, PIN_INPUT, 0) /* (C22) MCU_RGMII1_RD2 */
+- J721S2_WKUP_IOPAD(0x088, PIN_INPUT, 0) /* (D23) MCU_RGMII1_RD3 */
+- J721S2_WKUP_IOPAD(0x084, PIN_INPUT, 0) /* (D22) MCU_RGMII1_RXC */
+- J721S2_WKUP_IOPAD(0x06c, PIN_INPUT, 0) /* (E23) MCU_RGMII1_RX_CTL */
+- J721S2_WKUP_IOPAD(0x07c, PIN_OUTPUT, 0) /* (F23) MCU_RGMII1_TD0 */
+- J721S2_WKUP_IOPAD(0x078, PIN_OUTPUT, 0) /* (G22) MCU_RGMII1_TD1 */
+- J721S2_WKUP_IOPAD(0x074, PIN_OUTPUT, 0) /* (E21) MCU_RGMII1_TD2 */
+- J721S2_WKUP_IOPAD(0x070, PIN_OUTPUT, 0) /* (E22) MCU_RGMII1_TD3 */
+- J721S2_WKUP_IOPAD(0x080, PIN_OUTPUT, 0) /* (F21) MCU_RGMII1_TXC */
+- J721S2_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /* (F22) MCU_RGMII1_TX_CTL */
++ J721S2_WKUP_IOPAD(0x02c, PIN_INPUT, 0) /* (B22) MCU_RGMII1_RD0 */
++ J721S2_WKUP_IOPAD(0x028, PIN_INPUT, 0) /* (B21) MCU_RGMII1_RD1 */
++ J721S2_WKUP_IOPAD(0x024, PIN_INPUT, 0) /* (C22) MCU_RGMII1_RD2 */
++ J721S2_WKUP_IOPAD(0x020, PIN_INPUT, 0) /* (D23) MCU_RGMII1_RD3 */
++ J721S2_WKUP_IOPAD(0x01c, PIN_INPUT, 0) /* (D22) MCU_RGMII1_RXC */
++ J721S2_WKUP_IOPAD(0x004, PIN_INPUT, 0) /* (E23) MCU_RGMII1_RX_CTL */
++ J721S2_WKUP_IOPAD(0x014, PIN_OUTPUT, 0) /* (F23) MCU_RGMII1_TD0 */
++ J721S2_WKUP_IOPAD(0x010, PIN_OUTPUT, 0) /* (G22) MCU_RGMII1_TD1 */
++ J721S2_WKUP_IOPAD(0x00c, PIN_OUTPUT, 0) /* (E21) MCU_RGMII1_TD2 */
++ J721S2_WKUP_IOPAD(0x008, PIN_OUTPUT, 0) /* (E22) MCU_RGMII1_TD3 */
++ J721S2_WKUP_IOPAD(0x018, PIN_OUTPUT, 0) /* (F21) MCU_RGMII1_TXC */
++ J721S2_WKUP_IOPAD(0x000, PIN_OUTPUT, 0) /* (F22) MCU_RGMII1_TX_CTL */
+ >;
+ };
+
+ mcu_mdio_pins_default: mcu-mdio-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x09c, PIN_OUTPUT, 0) /* (A21) MCU_MDIO0_MDC */
+- J721S2_WKUP_IOPAD(0x098, PIN_INPUT, 0) /* (A22) MCU_MDIO0_MDIO */
++ J721S2_WKUP_IOPAD(0x034, PIN_OUTPUT, 0) /* (A21) MCU_MDIO0_MDC */
++ J721S2_WKUP_IOPAD(0x030, PIN_INPUT, 0) /* (A22) MCU_MDIO0_MDIO */
+ >;
+ };
+
+ mcu_mcan0_pins_default: mcu-mcan0-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x0bc, PIN_INPUT, 0) /* (E28) MCU_MCAN0_RX */
+- J721S2_WKUP_IOPAD(0x0b8, PIN_OUTPUT, 0) /* (E27) MCU_MCAN0_TX */
++ J721S2_WKUP_IOPAD(0x054, PIN_INPUT, 0) /* (E28) MCU_MCAN0_RX */
++ J721S2_WKUP_IOPAD(0x050, PIN_OUTPUT, 0) /* (E27) MCU_MCAN0_TX */
+ >;
+ };
+
+ mcu_mcan1_pins_default: mcu-mcan1-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x0d4, PIN_INPUT, 0) /* (F26) WKUP_GPIO0_5.MCU_MCAN1_RX */
+- J721S2_WKUP_IOPAD(0x0d0, PIN_OUTPUT, 0) /* (C23) WKUP_GPIO0_4.MCU_MCAN1_TX */
++ J721S2_WKUP_IOPAD(0x06c, PIN_INPUT, 0) /* (F26) WKUP_GPIO0_5.MCU_MCAN1_RX */
++ J721S2_WKUP_IOPAD(0x068, PIN_OUTPUT, 0) /*(C23) WKUP_GPIO0_4.MCU_MCAN1_TX */
+ >;
+ };
+
+ mcu_mcan0_gpio_pins_default: mcu-mcan0-gpio-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x0c0, PIN_INPUT, 7) /* (D26) WKUP_GPIO0_0 */
+- J721S2_WKUP_IOPAD(0x0a8, PIN_INPUT, 7) /* (B25) MCU_SPI0_D1.WKUP_GPIO0_69 */
++ J721S2_WKUP_IOPAD(0x058, PIN_INPUT, 7) /* (D26) WKUP_GPIO0_0 */
++ J721S2_WKUP_IOPAD(0x040, PIN_INPUT, 7) /* (B25) MCU_SPI0_D1.WKUP_GPIO0_69 */
+ >;
+ };
+
+ mcu_mcan1_gpio_pins_default: mcu-mcan1-gpio-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x0c8, PIN_INPUT, 7) /* (C28) WKUP_GPIO0_2 */
++ J721S2_WKUP_IOPAD(0x060, PIN_INPUT, 7) /* (C28) WKUP_GPIO0_2 */
+ >;
+ };
+
+ mcu_adc0_pins_default: mcu-adc0-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x134, PIN_INPUT, 0) /* (L25) MCU_ADC0_AIN0 */
+- J721S2_WKUP_IOPAD(0x138, PIN_INPUT, 0) /* (K25) MCU_ADC0_AIN1 */
+- J721S2_WKUP_IOPAD(0x13c, PIN_INPUT, 0) /* (M24) MCU_ADC0_AIN2 */
+- J721S2_WKUP_IOPAD(0x140, PIN_INPUT, 0) /* (L24) MCU_ADC0_AIN3 */
+- J721S2_WKUP_IOPAD(0x144, PIN_INPUT, 0) /* (L27) MCU_ADC0_AIN4 */
+- J721S2_WKUP_IOPAD(0x148, PIN_INPUT, 0) /* (K24) MCU_ADC0_AIN5 */
+- J721S2_WKUP_IOPAD(0x14c, PIN_INPUT, 0) /* (M27) MCU_ADC0_AIN6 */
+- J721S2_WKUP_IOPAD(0x150, PIN_INPUT, 0) /* (M26) MCU_ADC0_AIN7 */
++ J721S2_WKUP_IOPAD(0x0cc, PIN_INPUT, 0) /* (L25) MCU_ADC0_AIN0 */
++ J721S2_WKUP_IOPAD(0x0d0, PIN_INPUT, 0) /* (K25) MCU_ADC0_AIN1 */
++ J721S2_WKUP_IOPAD(0x0d4, PIN_INPUT, 0) /* (M24) MCU_ADC0_AIN2 */
++ J721S2_WKUP_IOPAD(0x0d8, PIN_INPUT, 0) /* (L24) MCU_ADC0_AIN3 */
++ J721S2_WKUP_IOPAD(0x0dc, PIN_INPUT, 0) /* (L27) MCU_ADC0_AIN4 */
++ J721S2_WKUP_IOPAD(0x0e0, PIN_INPUT, 0) /* (K24) MCU_ADC0_AIN5 */
++ J721S2_WKUP_IOPAD(0x0e4, PIN_INPUT, 0) /* (M27) MCU_ADC0_AIN6 */
++ J721S2_WKUP_IOPAD(0x0e8, PIN_INPUT, 0) /* (M26) MCU_ADC0_AIN7 */
+ >;
+ };
+
+ mcu_adc1_pins_default: mcu-adc1-pins-default {
+ pinctrl-single,pins = <
+- J721S2_WKUP_IOPAD(0x154, PIN_INPUT, 0) /* (P25) MCU_ADC1_AIN0 */
+- J721S2_WKUP_IOPAD(0x158, PIN_INPUT, 0) /* (R25) MCU_ADC1_AIN1 */
+- J721S2_WKUP_IOPAD(0x15c, PIN_INPUT, 0) /* (P28) MCU_ADC1_AIN2 */
+- J721S2_WKUP_IOPAD(0x160, PIN_INPUT, 0) /* (P27) MCU_ADC1_AIN3 */
+- J721S2_WKUP_IOPAD(0x164, PIN_INPUT, 0) /* (N25) MCU_ADC1_AIN4 */
+- J721S2_WKUP_IOPAD(0x168, PIN_INPUT, 0) /* (P26) MCU_ADC1_AIN5 */
+- J721S2_WKUP_IOPAD(0x16c, PIN_INPUT, 0) /* (N26) MCU_ADC1_AIN6 */
+- J721S2_WKUP_IOPAD(0x170, PIN_INPUT, 0) /* (N27) MCU_ADC1_AIN7 */
++ J721S2_WKUP_IOPAD(0x0ec, PIN_INPUT, 0) /* (P25) MCU_ADC1_AIN0 */
++ J721S2_WKUP_IOPAD(0x0f0, PIN_INPUT, 0) /* (R25) MCU_ADC1_AIN1 */
++ J721S2_WKUP_IOPAD(0x0f4, PIN_INPUT, 0) /* (P28) MCU_ADC1_AIN2 */
++ J721S2_WKUP_IOPAD(0x0f8, PIN_INPUT, 0) /* (P27) MCU_ADC1_AIN3 */
++ J721S2_WKUP_IOPAD(0x0fc, PIN_INPUT, 0) /* (N25) MCU_ADC1_AIN4 */
++ J721S2_WKUP_IOPAD(0x100, PIN_INPUT, 0) /* (P26) MCU_ADC1_AIN5 */
++ J721S2_WKUP_IOPAD(0x104, PIN_INPUT, 0) /* (N26) MCU_ADC1_AIN6 */
++ J721S2_WKUP_IOPAD(0x108, PIN_INPUT, 0) /* (N27) MCU_ADC1_AIN7 */
+ >;
+ };
+ };
+diff --git a/arch/arm64/boot/dts/ti/k3-j721s2-mcu-wakeup.dtsi b/arch/arm64/boot/dts/ti/k3-j721s2-mcu-wakeup.dtsi
+index a353705a7463e..fa31831e33e85 100644
+--- a/arch/arm64/boot/dts/ti/k3-j721s2-mcu-wakeup.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j721s2-mcu-wakeup.dtsi
+@@ -50,7 +50,34 @@
+ wkup_pmx0: pinctrl@4301c000 {
+ compatible = "pinctrl-single";
+ /* Proxy 0 addressing */
+- reg = <0x00 0x4301c000 0x00 0x178>;
++ reg = <0x00 0x4301c000 0x00 0x034>;
++ #pinctrl-cells = <1>;
++ pinctrl-single,register-width = <32>;
++ pinctrl-single,function-mask = <0xffffffff>;
++ };
++
++ wkup_pmx1: pinctrl@4301c038 {
++ compatible = "pinctrl-single";
++ /* Proxy 0 addressing */
++ reg = <0x00 0x4301c038 0x00 0x02C>;
++ #pinctrl-cells = <1>;
++ pinctrl-single,register-width = <32>;
++ pinctrl-single,function-mask = <0xffffffff>;
++ };
++
++ wkup_pmx2: pinctrl@4301c068 {
++ compatible = "pinctrl-single";
++ /* Proxy 0 addressing */
++ reg = <0x00 0x4301c068 0x00 0x120>;
++ #pinctrl-cells = <1>;
++ pinctrl-single,register-width = <32>;
++ pinctrl-single,function-mask = <0xffffffff>;
++ };
++
++ wkup_pmx3: pinctrl@4301c190 {
++ compatible = "pinctrl-single";
++ /* Proxy 0 addressing */
++ reg = <0x00 0x4301c190 0x00 0x004>;
+ #pinctrl-cells = <1>;
+ pinctrl-single,register-width = <32>;
+ pinctrl-single,function-mask = <0xffffffff>;
+diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
+index 307faa2b4395e..be66e94a21bda 100644
+--- a/arch/arm64/kernel/cpu_errata.c
++++ b/arch/arm64/kernel/cpu_errata.c
+@@ -729,6 +729,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
+ MIDR_FIXED(MIDR_CPU_VAR_REV(1,1), BIT(25)),
+ .cpu_enable = cpu_clear_bf16_from_user_emulation,
+ },
++#endif
++#ifdef CONFIG_AMPERE_ERRATUM_AC03_CPU_38
++ {
++ .desc = "AmpereOne erratum AC03_CPU_38",
++ .capability = ARM64_WORKAROUND_AMPERE_AC03_CPU_38,
++ ERRATA_MIDR_ALL_VERSIONS(MIDR_AMPERE1),
++ },
+ #endif
+ {
+ }
+diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
+index 4bb1b8f472982..7b889445e5c64 100644
+--- a/arch/arm64/kernel/traps.c
++++ b/arch/arm64/kernel/traps.c
+@@ -1044,7 +1044,7 @@ static int kasan_handler(struct pt_regs *regs, unsigned long esr)
+ bool recover = esr & KASAN_ESR_RECOVER;
+ bool write = esr & KASAN_ESR_WRITE;
+ size_t size = KASAN_ESR_SIZE(esr);
+- u64 addr = regs->regs[0];
++ void *addr = (void *)regs->regs[0];
+ u64 pc = regs->pc;
+
+ kasan_report(addr, size, write, pc);
+diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
+index 95dae02ccc2e6..37bd64e912ca7 100644
+--- a/arch/arm64/kvm/hyp/pgtable.c
++++ b/arch/arm64/kvm/hyp/pgtable.c
+@@ -623,10 +623,18 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
+ #ifdef CONFIG_ARM64_HW_AFDBM
+ /*
+ * Enable the Hardware Access Flag management, unconditionally
+- * on all CPUs. The features is RES0 on CPUs without the support
+- * and must be ignored by the CPUs.
++ * on all CPUs. In systems that have asymmetric support for the feature
++ * this allows KVM to leverage hardware support on the subset of cores
++ * that implement the feature.
++ *
++ * The architecture requires VTCR_EL2.HA to be RES0 (thus ignored by
++ * hardware) on implementations that do not advertise support for the
++ * feature. As such, setting HA unconditionally is safe, unless you
++ * happen to be running on a design that has unadvertised support for
++ * HAFDBS. Here be dragons.
+ */
+- vtcr |= VTCR_EL2_HA;
++ if (!cpus_have_final_cap(ARM64_WORKAROUND_AMPERE_AC03_CPU_38))
++ vtcr |= VTCR_EL2_HA;
+ #endif /* CONFIG_ARM64_HW_AFDBM */
+
+ /* Set the vmid bits */
+diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
+index df1386a60d521..22911bed0eb05 100644
+--- a/arch/arm64/mm/fault.c
++++ b/arch/arm64/mm/fault.c
+@@ -317,7 +317,7 @@ static void report_tag_fault(unsigned long addr, unsigned long esr,
+ * find out access size.
+ */
+ bool is_write = !!(esr & ESR_ELx_WNR);
+- kasan_report(addr, 0, is_write, regs->pc);
++ kasan_report((void *)addr, 0, is_write, regs->pc);
+ }
+ #else
+ /* Tag faults aren't enabled without CONFIG_KASAN_HW_TAGS. */
+diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
+index 40ba95472594d..9f9a2d6652ebd 100644
+--- a/arch/arm64/tools/cpucaps
++++ b/arch/arm64/tools/cpucaps
+@@ -77,6 +77,7 @@ WORKAROUND_2077057
+ WORKAROUND_2457168
+ WORKAROUND_2645198
+ WORKAROUND_2658417
++WORKAROUND_AMPERE_AC03_CPU_38
+ WORKAROUND_TRBE_OVERWRITE_FILL_MODE
+ WORKAROUND_TSB_FLUSH_FAILURE
+ WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
+diff --git a/arch/mips/Makefile b/arch/mips/Makefile
+index ef7b05ae92ceb..a47593d72f6f5 100644
+--- a/arch/mips/Makefile
++++ b/arch/mips/Makefile
+@@ -181,16 +181,12 @@ endif
+ cflags-$(CONFIG_CAVIUM_CN63XXP1) += -Wa,-mfix-cn63xxp1
+ cflags-$(CONFIG_CPU_BMIPS) += -march=mips32 -Wa,-mips32 -Wa,--trap
+
+-cflags-$(CONFIG_CPU_LOONGSON2E) += -march=loongson2e -Wa,--trap
+-cflags-$(CONFIG_CPU_LOONGSON2F) += -march=loongson2f -Wa,--trap
++cflags-$(CONFIG_CPU_LOONGSON2E) += $(call cc-option,-march=loongson2e) -Wa,--trap
++cflags-$(CONFIG_CPU_LOONGSON2F) += $(call cc-option,-march=loongson2f) -Wa,--trap
++cflags-$(CONFIG_CPU_LOONGSON64) += $(call cc-option,-march=loongson3a,-march=mips64r2) -Wa,--trap
+ # Some -march= flags enable MMI instructions, and GCC complains about that
+ # support being enabled alongside -msoft-float. Thus explicitly disable MMI.
+ cflags-$(CONFIG_CPU_LOONGSON2EF) += $(call cc-option,-mno-loongson-mmi)
+-ifdef CONFIG_CPU_LOONGSON64
+-cflags-$(CONFIG_CPU_LOONGSON64) += -Wa,--trap
+-cflags-$(CONFIG_CC_IS_GCC) += -march=loongson3a
+-cflags-$(CONFIG_CC_IS_CLANG) += -march=mips64r2
+-endif
+ cflags-$(CONFIG_CPU_LOONGSON64) += $(call cc-option,-mno-loongson-mmi)
+
+ cflags-$(CONFIG_CPU_R4000_WORKAROUNDS) += $(call cc-option,-mfix-r4000,)
+diff --git a/arch/mips/include/asm/cpu-features.h b/arch/mips/include/asm/cpu-features.h
+index 51a1737b03d0c..404390bb87eaf 100644
+--- a/arch/mips/include/asm/cpu-features.h
++++ b/arch/mips/include/asm/cpu-features.h
+@@ -125,7 +125,7 @@
+ ({ \
+ int __res; \
+ \
+- switch (current_cpu_type()) { \
++ switch (boot_cpu_type()) { \
+ case CPU_CAVIUM_OCTEON: \
+ case CPU_CAVIUM_OCTEON_PLUS: \
+ case CPU_CAVIUM_OCTEON2: \
+@@ -368,7 +368,7 @@
+ ({ \
+ int __res; \
+ \
+- switch (current_cpu_type()) { \
++ switch (boot_cpu_type()) { \
+ case CPU_M14KC: \
+ case CPU_74K: \
+ case CPU_1074K: \
+diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
+index 957121a495f0b..04cedf9f88115 100644
+--- a/arch/mips/include/asm/kvm_host.h
++++ b/arch/mips/include/asm/kvm_host.h
+@@ -317,7 +317,7 @@ struct kvm_vcpu_arch {
+ unsigned int aux_inuse;
+
+ /* COP0 State */
+- struct mips_coproc *cop0;
++ struct mips_coproc cop0;
+
+ /* Resume PC after MMIO completion */
+ unsigned long io_pc;
+@@ -698,7 +698,7 @@ static inline bool kvm_mips_guest_can_have_fpu(struct kvm_vcpu_arch *vcpu)
+ static inline bool kvm_mips_guest_has_fpu(struct kvm_vcpu_arch *vcpu)
+ {
+ return kvm_mips_guest_can_have_fpu(vcpu) &&
+- kvm_read_c0_guest_config1(vcpu->cop0) & MIPS_CONF1_FP;
++ kvm_read_c0_guest_config1(&vcpu->cop0) & MIPS_CONF1_FP;
+ }
+
+ static inline bool kvm_mips_guest_can_have_msa(struct kvm_vcpu_arch *vcpu)
+@@ -710,7 +710,7 @@ static inline bool kvm_mips_guest_can_have_msa(struct kvm_vcpu_arch *vcpu)
+ static inline bool kvm_mips_guest_has_msa(struct kvm_vcpu_arch *vcpu)
+ {
+ return kvm_mips_guest_can_have_msa(vcpu) &&
+- kvm_read_c0_guest_config3(vcpu->cop0) & MIPS_CONF3_MSA;
++ kvm_read_c0_guest_config3(&vcpu->cop0) & MIPS_CONF3_MSA;
+ }
+
+ struct kvm_mips_callbacks {
+diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
+index e79adcb128e67..b406d8bfb15a3 100644
+--- a/arch/mips/kernel/cpu-probe.c
++++ b/arch/mips/kernel/cpu-probe.c
+@@ -1677,7 +1677,10 @@ static inline void decode_cpucfg(struct cpuinfo_mips *c)
+
+ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
+ {
++ c->cputype = CPU_LOONGSON64;
++
+ /* All Loongson processors covered here define ExcCode 16 as GSExc. */
++ decode_configs(c);
+ c->options |= MIPS_CPU_GSEXCEX;
+
+ switch (c->processor_id & PRID_IMP_MASK) {
+@@ -1687,7 +1690,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
+ case PRID_REV_LOONGSON2K_R1_1:
+ case PRID_REV_LOONGSON2K_R1_2:
+ case PRID_REV_LOONGSON2K_R1_3:
+- c->cputype = CPU_LOONGSON64;
+ __cpu_name[cpu] = "Loongson-2K";
+ set_elf_platform(cpu, "gs264e");
+ set_isa(c, MIPS_CPU_ISA_M64R2);
+@@ -1700,14 +1702,12 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
+ switch (c->processor_id & PRID_REV_MASK) {
+ case PRID_REV_LOONGSON3A_R2_0:
+ case PRID_REV_LOONGSON3A_R2_1:
+- c->cputype = CPU_LOONGSON64;
+ __cpu_name[cpu] = "ICT Loongson-3";
+ set_elf_platform(cpu, "loongson3a");
+ set_isa(c, MIPS_CPU_ISA_M64R2);
+ break;
+ case PRID_REV_LOONGSON3A_R3_0:
+ case PRID_REV_LOONGSON3A_R3_1:
+- c->cputype = CPU_LOONGSON64;
+ __cpu_name[cpu] = "ICT Loongson-3";
+ set_elf_platform(cpu, "loongson3a");
+ set_isa(c, MIPS_CPU_ISA_M64R2);
+@@ -1727,7 +1727,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
+ c->ases &= ~MIPS_ASE_VZ; /* VZ of Loongson-3A2000/3000 is incomplete */
+ break;
+ case PRID_IMP_LOONGSON_64G:
+- c->cputype = CPU_LOONGSON64;
+ __cpu_name[cpu] = "ICT Loongson-3";
+ set_elf_platform(cpu, "loongson3a");
+ set_isa(c, MIPS_CPU_ISA_M64R2);
+@@ -1737,8 +1736,6 @@ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu)
+ panic("Unknown Loongson Processor ID!");
+ break;
+ }
+-
+- decode_configs(c);
+ }
+ #else
+ static inline void cpu_probe_loongson(struct cpuinfo_mips *c, unsigned int cpu) { }
+diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
+index edaec93a1a1fe..e64372b8f66af 100644
+--- a/arch/mips/kvm/emulate.c
++++ b/arch/mips/kvm/emulate.c
+@@ -312,7 +312,7 @@ int kvm_get_badinstrp(u32 *opc, struct kvm_vcpu *vcpu, u32 *out)
+ */
+ int kvm_mips_count_disabled(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+
+ return (vcpu->arch.count_ctl & KVM_REG_MIPS_COUNT_CTL_DC) ||
+ (kvm_read_c0_guest_cause(cop0) & CAUSEF_DC);
+@@ -384,7 +384,7 @@ static inline ktime_t kvm_mips_count_time(struct kvm_vcpu *vcpu)
+ */
+ static u32 kvm_mips_read_count_running(struct kvm_vcpu *vcpu, ktime_t now)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ ktime_t expires, threshold;
+ u32 count, compare;
+ int running;
+@@ -444,7 +444,7 @@ static u32 kvm_mips_read_count_running(struct kvm_vcpu *vcpu, ktime_t now)
+ */
+ u32 kvm_mips_read_count(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+
+ /* If count disabled just read static copy of count */
+ if (kvm_mips_count_disabled(vcpu))
+@@ -502,7 +502,7 @@ ktime_t kvm_mips_freeze_hrtimer(struct kvm_vcpu *vcpu, u32 *count)
+ static void kvm_mips_resume_hrtimer(struct kvm_vcpu *vcpu,
+ ktime_t now, u32 count)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ u32 compare;
+ u64 delta;
+ ktime_t expire;
+@@ -603,7 +603,7 @@ resume:
+ */
+ void kvm_mips_write_count(struct kvm_vcpu *vcpu, u32 count)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ ktime_t now;
+
+ /* Calculate bias */
+@@ -649,7 +649,7 @@ void kvm_mips_init_count(struct kvm_vcpu *vcpu, unsigned long count_hz)
+ */
+ int kvm_mips_set_count_hz(struct kvm_vcpu *vcpu, s64 count_hz)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ int dc;
+ ktime_t now;
+ u32 count;
+@@ -696,7 +696,7 @@ int kvm_mips_set_count_hz(struct kvm_vcpu *vcpu, s64 count_hz)
+ */
+ void kvm_mips_write_compare(struct kvm_vcpu *vcpu, u32 compare, bool ack)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ int dc;
+ u32 old_compare = kvm_read_c0_guest_compare(cop0);
+ s32 delta = compare - old_compare;
+@@ -779,7 +779,7 @@ void kvm_mips_write_compare(struct kvm_vcpu *vcpu, u32 compare, bool ack)
+ */
+ static ktime_t kvm_mips_count_disable(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ u32 count;
+ ktime_t now;
+
+@@ -806,7 +806,7 @@ static ktime_t kvm_mips_count_disable(struct kvm_vcpu *vcpu)
+ */
+ void kvm_mips_count_disable_cause(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+
+ kvm_set_c0_guest_cause(cop0, CAUSEF_DC);
+ if (!(vcpu->arch.count_ctl & KVM_REG_MIPS_COUNT_CTL_DC))
+@@ -826,7 +826,7 @@ void kvm_mips_count_disable_cause(struct kvm_vcpu *vcpu)
+ */
+ void kvm_mips_count_enable_cause(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ u32 count;
+
+ kvm_clear_c0_guest_cause(cop0, CAUSEF_DC);
+@@ -852,7 +852,7 @@ void kvm_mips_count_enable_cause(struct kvm_vcpu *vcpu)
+ */
+ int kvm_mips_set_count_ctl(struct kvm_vcpu *vcpu, s64 count_ctl)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ s64 changed = count_ctl ^ vcpu->arch.count_ctl;
+ s64 delta;
+ ktime_t expire, now;
+diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
+index 884be4ef99dc1..aa5583a7b05be 100644
+--- a/arch/mips/kvm/mips.c
++++ b/arch/mips/kvm/mips.c
+@@ -649,7 +649,7 @@ static int kvm_mips_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices)
+ static int kvm_mips_get_reg(struct kvm_vcpu *vcpu,
+ const struct kvm_one_reg *reg)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ struct mips_fpu_struct *fpu = &vcpu->arch.fpu;
+ int ret;
+ s64 v;
+@@ -761,7 +761,7 @@ static int kvm_mips_get_reg(struct kvm_vcpu *vcpu,
+ static int kvm_mips_set_reg(struct kvm_vcpu *vcpu,
+ const struct kvm_one_reg *reg)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ struct mips_fpu_struct *fpu = &vcpu->arch.fpu;
+ s64 v;
+ s64 vs[2];
+@@ -1086,7 +1086,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
+ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
+ {
+ return kvm_mips_pending_timer(vcpu) ||
+- kvm_read_c0_guest_cause(vcpu->arch.cop0) & C_TI;
++ kvm_read_c0_guest_cause(&vcpu->arch.cop0) & C_TI;
+ }
+
+ int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu)
+@@ -1110,7 +1110,7 @@ int kvm_arch_vcpu_dump_regs(struct kvm_vcpu *vcpu)
+ kvm_debug("\thi: 0x%08lx\n", vcpu->arch.hi);
+ kvm_debug("\tlo: 0x%08lx\n", vcpu->arch.lo);
+
+- cop0 = vcpu->arch.cop0;
++ cop0 = &vcpu->arch.cop0;
+ kvm_debug("\tStatus: 0x%08x, Cause: 0x%08x\n",
+ kvm_read_c0_guest_status(cop0),
+ kvm_read_c0_guest_cause(cop0));
+@@ -1232,7 +1232,7 @@ static int __kvm_mips_handle_exit(struct kvm_vcpu *vcpu)
+
+ case EXCCODE_TLBS:
+ kvm_debug("TLB ST fault: cause %#x, status %#x, PC: %p, BadVaddr: %#lx\n",
+- cause, kvm_read_c0_guest_status(vcpu->arch.cop0), opc,
++ cause, kvm_read_c0_guest_status(&vcpu->arch.cop0), opc,
+ badvaddr);
+
+ ++vcpu->stat.tlbmiss_st_exits;
+@@ -1304,7 +1304,7 @@ static int __kvm_mips_handle_exit(struct kvm_vcpu *vcpu)
+ kvm_get_badinstr(opc, vcpu, &inst);
+ kvm_err("Exception Code: %d, not yet handled, @ PC: %p, inst: 0x%08x BadVaddr: %#lx Status: %#x\n",
+ exccode, opc, inst, badvaddr,
+- kvm_read_c0_guest_status(vcpu->arch.cop0));
++ kvm_read_c0_guest_status(&vcpu->arch.cop0));
+ kvm_arch_vcpu_dump_regs(vcpu);
+ run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+ ret = RESUME_HOST;
+@@ -1377,7 +1377,7 @@ int noinstr kvm_mips_handle_exit(struct kvm_vcpu *vcpu)
+ /* Enable FPU for guest and restore context */
+ void kvm_own_fpu(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ unsigned int sr, cfg5;
+
+ preempt_disable();
+@@ -1421,7 +1421,7 @@ void kvm_own_fpu(struct kvm_vcpu *vcpu)
+ /* Enable MSA for guest and restore context */
+ void kvm_own_msa(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ unsigned int sr, cfg5;
+
+ preempt_disable();
+diff --git a/arch/mips/kvm/stats.c b/arch/mips/kvm/stats.c
+index 53f851a615542..3e6682018fbe6 100644
+--- a/arch/mips/kvm/stats.c
++++ b/arch/mips/kvm/stats.c
+@@ -54,9 +54,9 @@ void kvm_mips_dump_stats(struct kvm_vcpu *vcpu)
+ kvm_info("\nKVM VCPU[%d] COP0 Access Profile:\n", vcpu->vcpu_id);
+ for (i = 0; i < N_MIPS_COPROC_REGS; i++) {
+ for (j = 0; j < N_MIPS_COPROC_SEL; j++) {
+- if (vcpu->arch.cop0->stat[i][j])
++ if (vcpu->arch.cop0.stat[i][j])
+ kvm_info("%s[%d]: %lu\n", kvm_cop0_str[i], j,
+- vcpu->arch.cop0->stat[i][j]);
++ vcpu->arch.cop0.stat[i][j]);
+ }
+ }
+ #endif
+diff --git a/arch/mips/kvm/trace.h b/arch/mips/kvm/trace.h
+index a8c7fd7bf6d26..136c3535a1cbb 100644
+--- a/arch/mips/kvm/trace.h
++++ b/arch/mips/kvm/trace.h
+@@ -322,11 +322,11 @@ TRACE_EVENT_FN(kvm_guest_mode_change,
+ ),
+
+ TP_fast_assign(
+- __entry->epc = kvm_read_c0_guest_epc(vcpu->arch.cop0);
++ __entry->epc = kvm_read_c0_guest_epc(&vcpu->arch.cop0);
+ __entry->pc = vcpu->arch.pc;
+- __entry->badvaddr = kvm_read_c0_guest_badvaddr(vcpu->arch.cop0);
+- __entry->status = kvm_read_c0_guest_status(vcpu->arch.cop0);
+- __entry->cause = kvm_read_c0_guest_cause(vcpu->arch.cop0);
++ __entry->badvaddr = kvm_read_c0_guest_badvaddr(&vcpu->arch.cop0);
++ __entry->status = kvm_read_c0_guest_status(&vcpu->arch.cop0);
++ __entry->cause = kvm_read_c0_guest_cause(&vcpu->arch.cop0);
+ ),
+
+ TP_printk("EPC: 0x%08lx PC: 0x%08lx Status: 0x%08x Cause: 0x%08x BadVAddr: 0x%08lx",
+diff --git a/arch/mips/kvm/vz.c b/arch/mips/kvm/vz.c
+index 3d21cbfa74435..99d5a71e43000 100644
+--- a/arch/mips/kvm/vz.c
++++ b/arch/mips/kvm/vz.c
+@@ -422,7 +422,7 @@ static void _kvm_vz_restore_htimer(struct kvm_vcpu *vcpu,
+ */
+ static void kvm_vz_restore_timer(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ u32 cause, compare;
+
+ compare = kvm_read_sw_gc0_compare(cop0);
+@@ -517,7 +517,7 @@ static void _kvm_vz_save_htimer(struct kvm_vcpu *vcpu,
+ */
+ static void kvm_vz_save_timer(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ u32 gctl0, compare, cause;
+
+ gctl0 = read_c0_guestctl0();
+@@ -863,7 +863,7 @@ static unsigned long mips_process_maar(unsigned int op, unsigned long val)
+
+ static void kvm_write_maari(struct kvm_vcpu *vcpu, unsigned long val)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+
+ val &= MIPS_MAARI_INDEX;
+ if (val == MIPS_MAARI_INDEX)
+@@ -876,7 +876,7 @@ static enum emulation_result kvm_vz_gpsi_cop0(union mips_instruction inst,
+ u32 *opc, u32 cause,
+ struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ enum emulation_result er = EMULATE_DONE;
+ u32 rt, rd, sel;
+ unsigned long curr_pc;
+@@ -1911,7 +1911,7 @@ static int kvm_vz_get_one_reg(struct kvm_vcpu *vcpu,
+ const struct kvm_one_reg *reg,
+ s64 *v)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ unsigned int idx;
+
+ switch (reg->id) {
+@@ -2081,7 +2081,7 @@ static int kvm_vz_get_one_reg(struct kvm_vcpu *vcpu,
+ case KVM_REG_MIPS_CP0_MAARI:
+ if (!cpu_guest_has_maar || cpu_guest_has_dyn_maar)
+ return -EINVAL;
+- *v = kvm_read_sw_gc0_maari(vcpu->arch.cop0);
++ *v = kvm_read_sw_gc0_maari(&vcpu->arch.cop0);
+ break;
+ #ifdef CONFIG_64BIT
+ case KVM_REG_MIPS_CP0_XCONTEXT:
+@@ -2135,7 +2135,7 @@ static int kvm_vz_set_one_reg(struct kvm_vcpu *vcpu,
+ const struct kvm_one_reg *reg,
+ s64 v)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ unsigned int idx;
+ int ret = 0;
+ unsigned int cur, change;
+@@ -2562,7 +2562,7 @@ static void kvm_vz_vcpu_load_tlb(struct kvm_vcpu *vcpu, int cpu)
+
+ static int kvm_vz_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ bool migrated, all;
+
+ /*
+@@ -2704,7 +2704,7 @@ static int kvm_vz_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+
+ static int kvm_vz_vcpu_put(struct kvm_vcpu *vcpu, int cpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+
+ if (current->flags & PF_VCPU)
+ kvm_vz_vcpu_save_wired(vcpu);
+@@ -3076,7 +3076,7 @@ static void kvm_vz_vcpu_uninit(struct kvm_vcpu *vcpu)
+
+ static int kvm_vz_vcpu_setup(struct kvm_vcpu *vcpu)
+ {
+- struct mips_coproc *cop0 = vcpu->arch.cop0;
++ struct mips_coproc *cop0 = &vcpu->arch.cop0;
+ unsigned long count_hz = 100*1000*1000; /* default to 100 MHz */
+
+ /*
+diff --git a/arch/openrisc/include/uapi/asm/sigcontext.h b/arch/openrisc/include/uapi/asm/sigcontext.h
+index ca585e4af6b8e..e7ffb58ff58fb 100644
+--- a/arch/openrisc/include/uapi/asm/sigcontext.h
++++ b/arch/openrisc/include/uapi/asm/sigcontext.h
+@@ -28,8 +28,10 @@
+
+ struct sigcontext {
+ struct user_regs_struct regs; /* needs to be first */
+- struct __or1k_fpu_state fpu;
+- unsigned long oldmask;
++ union {
++ unsigned long fpcsr;
++ unsigned long oldmask; /* unused */
++ };
+ };
+
+ #endif /* __ASM_OPENRISC_SIGCONTEXT_H */
+diff --git a/arch/openrisc/kernel/signal.c b/arch/openrisc/kernel/signal.c
+index 4664a18f0787d..2e7257a433ff4 100644
+--- a/arch/openrisc/kernel/signal.c
++++ b/arch/openrisc/kernel/signal.c
+@@ -50,7 +50,7 @@ static int restore_sigcontext(struct pt_regs *regs,
+ err |= __copy_from_user(regs, sc->regs.gpr, 32 * sizeof(unsigned long));
+ err |= __copy_from_user(&regs->pc, &sc->regs.pc, sizeof(unsigned long));
+ err |= __copy_from_user(&regs->sr, &sc->regs.sr, sizeof(unsigned long));
+- err |= __copy_from_user(&regs->fpcsr, &sc->fpu.fpcsr, sizeof(unsigned long));
++ err |= __copy_from_user(&regs->fpcsr, &sc->fpcsr, sizeof(unsigned long));
+
+ /* make sure the SM-bit is cleared so user-mode cannot fool us */
+ regs->sr &= ~SPR_SR_SM;
+@@ -113,7 +113,7 @@ static int setup_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
+ err |= __copy_to_user(sc->regs.gpr, regs, 32 * sizeof(unsigned long));
+ err |= __copy_to_user(&sc->regs.pc, &regs->pc, sizeof(unsigned long));
+ err |= __copy_to_user(&sc->regs.sr, &regs->sr, sizeof(unsigned long));
+- err |= __copy_to_user(&sc->fpu.fpcsr, &regs->fpcsr, sizeof(unsigned long));
++ err |= __copy_to_user(&sc->fpcsr, &regs->fpcsr, sizeof(unsigned long));
+
+ return err;
+ }
+diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
+index dca73f673d704..fc76eb9830fdf 100644
+--- a/arch/powerpc/Makefile
++++ b/arch/powerpc/Makefile
+@@ -409,3 +409,11 @@ checkbin:
+ echo -n '*** Please use a different binutils version.' ; \
+ false ; \
+ fi
++ @if test "x${CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT}" = "xy" -a \
++ "x${CONFIG_LD_IS_BFD}" = "xy" -a \
++ "${CONFIG_LD_VERSION}" = "23700" ; then \
++ echo -n '*** binutils 2.37 drops unused section symbols, which recordmcount ' ; \
++ echo 'is unable to handle.' ; \
++ echo '*** Please use a different binutils version.' ; \
++ false ; \
++ fi
+diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
+index 206475e3e0b48..4856e1a5161cc 100644
+--- a/arch/powerpc/kernel/security.c
++++ b/arch/powerpc/kernel/security.c
+@@ -364,26 +364,27 @@ ssize_t cpu_show_spec_store_bypass(struct device *dev, struct device_attribute *
+
+ static int ssb_prctl_get(struct task_struct *task)
+ {
++ /*
++ * The STF_BARRIER feature is on by default, so if it's off that means
++ * firmware has explicitly said the CPU is not vulnerable via either
++ * the hypercall or device tree.
++ */
++ if (!security_ftr_enabled(SEC_FTR_STF_BARRIER))
++ return PR_SPEC_NOT_AFFECTED;
++
++ /*
++ * If the system's CPU has no known barrier (see setup_stf_barrier())
++ * then assume that the CPU is not vulnerable.
++ */
+ if (stf_enabled_flush_types == STF_BARRIER_NONE)
+- /*
+- * We don't have an explicit signal from firmware that we're
+- * vulnerable or not, we only have certain CPU revisions that
+- * are known to be vulnerable.
+- *
+- * We assume that if we're on another CPU, where the barrier is
+- * NONE, then we are not vulnerable.
+- */
+ return PR_SPEC_NOT_AFFECTED;
+- else
+- /*
+- * If we do have a barrier type then we are vulnerable. The
+- * barrier is not a global or per-process mitigation, so the
+- * only value we can report here is PR_SPEC_ENABLE, which
+- * appears as "vulnerable" in /proc.
+- */
+- return PR_SPEC_ENABLE;
+-
+- return -EINVAL;
++
++ /*
++ * Otherwise the CPU is vulnerable. The barrier is not a global or
++ * per-process mitigation, so the only value that can be reported here
++ * is PR_SPEC_ENABLE, which appears as "vulnerable" in /proc.
++ */
++ return PR_SPEC_ENABLE;
+ }
+
+ int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
+diff --git a/arch/powerpc/mm/book3s64/hash_native.c b/arch/powerpc/mm/book3s64/hash_native.c
+index 9342e79870dfd..430d1d935a7cb 100644
+--- a/arch/powerpc/mm/book3s64/hash_native.c
++++ b/arch/powerpc/mm/book3s64/hash_native.c
+@@ -328,10 +328,12 @@ static long native_hpte_insert(unsigned long hpte_group, unsigned long vpn,
+
+ static long native_hpte_remove(unsigned long hpte_group)
+ {
++ unsigned long hpte_v, flags;
+ struct hash_pte *hptep;
+ int i;
+ int slot_offset;
+- unsigned long hpte_v;
++
++ local_irq_save(flags);
+
+ DBG_LOW(" remove(group=%lx)\n", hpte_group);
+
+@@ -356,13 +358,16 @@ static long native_hpte_remove(unsigned long hpte_group)
+ slot_offset &= 0x7;
+ }
+
+- if (i == HPTES_PER_GROUP)
+- return -1;
++ if (i == HPTES_PER_GROUP) {
++ i = -1;
++ goto out;
++ }
+
+ /* Invalidate the hpte. NOTE: this also unlocks it */
+ release_hpte_lock();
+ hptep->v = 0;
+-
++out:
++ local_irq_restore(flags);
+ return i;
+ }
+
+diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
+index 1306149aad57a..93e7bb9f67fd4 100644
+--- a/arch/riscv/mm/init.c
++++ b/arch/riscv/mm/init.c
+@@ -1346,7 +1346,7 @@ static void __init reserve_crashkernel(void)
+ */
+ crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
+ search_start,
+- min(search_end, (unsigned long) SZ_4G));
++ min(search_end, (unsigned long)(SZ_4G - 1)));
+ if (crash_base == 0) {
+ /* Try again without restricting region to 32bit addressible memory */
+ crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
+diff --git a/arch/riscv/net/bpf_jit.h b/arch/riscv/net/bpf_jit.h
+index bf9802a63061d..2717f54904287 100644
+--- a/arch/riscv/net/bpf_jit.h
++++ b/arch/riscv/net/bpf_jit.h
+@@ -69,7 +69,7 @@ struct rv_jit_context {
+ struct bpf_prog *prog;
+ u16 *insns; /* RV insns */
+ int ninsns;
+- int body_len;
++ int prologue_len;
+ int epilogue_offset;
+ int *offset; /* BPF to RV */
+ int nexentries;
+@@ -216,8 +216,8 @@ static inline int rv_offset(int insn, int off, struct rv_jit_context *ctx)
+ int from, to;
+
+ off++; /* BPF branch is from PC+1, RV is from PC */
+- from = (insn > 0) ? ctx->offset[insn - 1] : 0;
+- to = (insn + off > 0) ? ctx->offset[insn + off - 1] : 0;
++ from = (insn > 0) ? ctx->offset[insn - 1] : ctx->prologue_len;
++ to = (insn + off > 0) ? ctx->offset[insn + off - 1] : ctx->prologue_len;
+ return ninsns_rvoff(to - from);
+ }
+
+diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
+index 737baf8715da7..7a26a3e1c73cf 100644
+--- a/arch/riscv/net/bpf_jit_core.c
++++ b/arch/riscv/net/bpf_jit_core.c
+@@ -44,7 +44,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+ unsigned int prog_size = 0, extable_size = 0;
+ bool tmp_blinded = false, extra_pass = false;
+ struct bpf_prog *tmp, *orig_prog = prog;
+- int pass = 0, prev_ninsns = 0, prologue_len, i;
++ int pass = 0, prev_ninsns = 0, i;
+ struct rv_jit_data *jit_data;
+ struct rv_jit_context *ctx;
+
+@@ -83,6 +83,12 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+ prog = orig_prog;
+ goto out_offset;
+ }
++
++ if (build_body(ctx, extra_pass, NULL)) {
++ prog = orig_prog;
++ goto out_offset;
++ }
++
+ for (i = 0; i < prog->len; i++) {
+ prev_ninsns += 32;
+ ctx->offset[i] = prev_ninsns;
+@@ -91,12 +97,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+ for (i = 0; i < NR_JIT_ITERATIONS; i++) {
+ pass++;
+ ctx->ninsns = 0;
++
++ bpf_jit_build_prologue(ctx);
++ ctx->prologue_len = ctx->ninsns;
++
+ if (build_body(ctx, extra_pass, ctx->offset)) {
+ prog = orig_prog;
+ goto out_offset;
+ }
+- ctx->body_len = ctx->ninsns;
+- bpf_jit_build_prologue(ctx);
++
+ ctx->epilogue_offset = ctx->ninsns;
+ bpf_jit_build_epilogue(ctx);
+
+@@ -162,10 +171,8 @@ skip_init_ctx:
+
+ if (!prog->is_func || extra_pass) {
+ bpf_jit_binary_lock_ro(jit_data->header);
+- prologue_len = ctx->epilogue_offset - ctx->body_len;
+ for (i = 0; i < prog->len; i++)
+- ctx->offset[i] = ninsns_rvoff(prologue_len +
+- ctx->offset[i]);
++ ctx->offset[i] = ninsns_rvoff(ctx->offset[i]);
+ bpf_prog_fill_jited_linfo(prog, ctx->offset);
+ out_offset:
+ kfree(ctx->offset);
+diff --git a/arch/s390/Makefile b/arch/s390/Makefile
+index ed646c583e4fe..5ed242897b0d2 100644
+--- a/arch/s390/Makefile
++++ b/arch/s390/Makefile
+@@ -27,6 +27,7 @@ KBUILD_CFLAGS_DECOMPRESSOR += -fno-delete-null-pointer-checks -msoft-float -mbac
+ KBUILD_CFLAGS_DECOMPRESSOR += -fno-asynchronous-unwind-tables
+ KBUILD_CFLAGS_DECOMPRESSOR += -ffreestanding
+ KBUILD_CFLAGS_DECOMPRESSOR += -fno-stack-protector
++KBUILD_CFLAGS_DECOMPRESSOR += -fPIE
+ KBUILD_CFLAGS_DECOMPRESSOR += $(call cc-disable-warning, address-of-packed-member)
+ KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO),-g)
+ KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO_DWARF4), $(call cc-option, -gdwarf-4,))
+diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
+index 27f3a7b34bd52..5e52ed3e5562b 100644
+--- a/arch/x86/events/intel/core.c
++++ b/arch/x86/events/intel/core.c
+@@ -3993,6 +3993,13 @@ static int intel_pmu_hw_config(struct perf_event *event)
+ struct perf_event *leader = event->group_leader;
+ struct perf_event *sibling = NULL;
+
++ /*
++ * When this memload event is also the first event (no group
++ * exists yet), then there is no aux event before it.
++ */
++ if (leader == event)
++ return -ENODATA;
++
+ if (!is_mem_loads_aux_event(leader)) {
+ for_each_sibling_event(sibling, leader) {
+ if (is_mem_loads_aux_event(sibling))
+diff --git a/arch/xtensa/platforms/iss/network.c b/arch/xtensa/platforms/iss/network.c
+index 9ac46ab3a296c..119345eeb04c9 100644
+--- a/arch/xtensa/platforms/iss/network.c
++++ b/arch/xtensa/platforms/iss/network.c
+@@ -237,7 +237,7 @@ static int tuntap_probe(struct iss_net_private *lp, int index, char *init)
+
+ init += sizeof(TRANSPORT_TUNTAP_NAME) - 1;
+ if (*init == ',') {
+- rem = split_if_spec(init + 1, &mac_str, &dev_name);
++ rem = split_if_spec(init + 1, &mac_str, &dev_name, NULL);
+ if (rem != NULL) {
+ pr_err("%s: extra garbage on specification : '%s'\n",
+ dev->name, rem);
+diff --git a/block/blk-crypto-profile.c b/block/blk-crypto-profile.c
+index 2a67d3fb63e5c..7fabc883e39f1 100644
+--- a/block/blk-crypto-profile.c
++++ b/block/blk-crypto-profile.c
+@@ -79,7 +79,14 @@ int blk_crypto_profile_init(struct blk_crypto_profile *profile,
+ unsigned int slot_hashtable_size;
+
+ memset(profile, 0, sizeof(*profile));
+- init_rwsem(&profile->lock);
++
++ /*
++ * profile->lock of an underlying device can nest inside profile->lock
++ * of a device-mapper device, so use a dynamic lock class to avoid
++ * false-positive lockdep reports.
++ */
++ lockdep_register_key(&profile->lockdep_key);
++ __init_rwsem(&profile->lock, "&profile->lock", &profile->lockdep_key);
+
+ if (num_slots == 0)
+ return 0;
+@@ -89,7 +96,7 @@ int blk_crypto_profile_init(struct blk_crypto_profile *profile,
+ profile->slots = kvcalloc(num_slots, sizeof(profile->slots[0]),
+ GFP_KERNEL);
+ if (!profile->slots)
+- return -ENOMEM;
++ goto err_destroy;
+
+ profile->num_slots = num_slots;
+
+@@ -435,6 +442,7 @@ void blk_crypto_profile_destroy(struct blk_crypto_profile *profile)
+ {
+ if (!profile)
+ return;
++ lockdep_unregister_key(&profile->lockdep_key);
+ kvfree(profile->slot_hashtable);
+ kvfree_sensitive(profile->slots,
+ sizeof(profile->slots[0]) * profile->num_slots);
+diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
+index d3013fbd13b32..399dc5dcefd7c 100644
+--- a/drivers/accel/ivpu/ivpu_drv.h
++++ b/drivers/accel/ivpu/ivpu_drv.h
+@@ -75,6 +75,7 @@ struct ivpu_wa_table {
+ bool punit_disabled;
+ bool clear_runtime_mem;
+ bool d3hot_after_power_off;
++ bool interrupt_clear_with_0;
+ };
+
+ struct ivpu_hw_info;
+diff --git a/drivers/accel/ivpu/ivpu_hw_mtl.c b/drivers/accel/ivpu/ivpu_hw_mtl.c
+index fef35422c6f0d..2a5dd3a5dc461 100644
+--- a/drivers/accel/ivpu/ivpu_hw_mtl.c
++++ b/drivers/accel/ivpu/ivpu_hw_mtl.c
+@@ -101,6 +101,9 @@ static void ivpu_hw_wa_init(struct ivpu_device *vdev)
+ vdev->wa.punit_disabled = ivpu_is_fpga(vdev);
+ vdev->wa.clear_runtime_mem = false;
+ vdev->wa.d3hot_after_power_off = true;
++
++ if (ivpu_device_id(vdev) == PCI_DEVICE_ID_MTL && ivpu_revision(vdev) < 4)
++ vdev->wa.interrupt_clear_with_0 = true;
+ }
+
+ static void ivpu_hw_timeouts_init(struct ivpu_device *vdev)
+@@ -885,7 +888,7 @@ static void ivpu_hw_mtl_irq_disable(struct ivpu_device *vdev)
+ REGB_WR32(MTL_BUTTRESS_GLOBAL_INT_MASK, 0x1);
+ REGB_WR32(MTL_BUTTRESS_LOCAL_INT_MASK, BUTTRESS_IRQ_DISABLE_MASK);
+ REGV_WR64(MTL_VPU_HOST_SS_ICB_ENABLE_0, 0x0ull);
+- REGB_WR32(MTL_VPU_HOST_SS_FW_SOC_IRQ_EN, 0x0);
++ REGV_WR32(MTL_VPU_HOST_SS_FW_SOC_IRQ_EN, 0x0);
+ }
+
+ static void ivpu_hw_mtl_irq_wdt_nce_handler(struct ivpu_device *vdev)
+@@ -973,12 +976,15 @@ static u32 ivpu_hw_mtl_irqb_handler(struct ivpu_device *vdev, int irq)
+ schedule_recovery = true;
+ }
+
+- /*
+- * Clear local interrupt status by writing 0 to all bits.
+- * This must be done after interrupts are cleared at the source.
+- * Writing 1 triggers an interrupt, so we can't perform read update write.
+- */
+- REGB_WR32(MTL_BUTTRESS_INTERRUPT_STAT, 0x0);
++ /* This must be done after interrupts are cleared at the source. */
++ if (IVPU_WA(interrupt_clear_with_0))
++ /*
++ * Writing 1 triggers an interrupt, so we can't perform read update write.
++ * Clear local interrupt status by writing 0 to all bits.
++ */
++ REGB_WR32(MTL_BUTTRESS_INTERRUPT_STAT, 0x0);
++ else
++ REGB_WR32(MTL_BUTTRESS_INTERRUPT_STAT, status);
+
+ /* Re-enable global interrupt */
+ REGB_WR32(MTL_BUTTRESS_GLOBAL_INT_MASK, 0x0);
+diff --git a/drivers/base/regmap/regmap-irq.c b/drivers/base/regmap/regmap-irq.c
+index b99bb2369fffb..d80054470909f 100644
+--- a/drivers/base/regmap/regmap-irq.c
++++ b/drivers/base/regmap/regmap-irq.c
+@@ -852,7 +852,7 @@ int regmap_add_irq_chip_fwnode(struct fwnode_handle *fwnode,
+ if (!d->config_buf)
+ goto err_alloc;
+
+- for (i = 0; i < chip->num_config_regs; i++) {
++ for (i = 0; i < chip->num_config_bases; i++) {
+ d->config_buf[i] = kcalloc(chip->num_config_regs,
+ sizeof(**d->config_buf),
+ GFP_KERNEL);
+diff --git a/drivers/bus/intel-ixp4xx-eb.c b/drivers/bus/intel-ixp4xx-eb.c
+index f5ba6bee6fd8b..320cf307db054 100644
+--- a/drivers/bus/intel-ixp4xx-eb.c
++++ b/drivers/bus/intel-ixp4xx-eb.c
+@@ -33,7 +33,7 @@
+ #define IXP4XX_EXP_TIMING_STRIDE 0x04
+ #define IXP4XX_EXP_CS_EN BIT(31)
+ #define IXP456_EXP_PAR_EN BIT(30) /* Only on IXP45x and IXP46x */
+-#define IXP4XX_EXP_T1_MASK GENMASK(28, 27)
++#define IXP4XX_EXP_T1_MASK GENMASK(29, 28)
+ #define IXP4XX_EXP_T1_SHIFT 28
+ #define IXP4XX_EXP_T2_MASK GENMASK(27, 26)
+ #define IXP4XX_EXP_T2_SHIFT 26
+diff --git a/drivers/char/hw_random/imx-rngc.c b/drivers/char/hw_random/imx-rngc.c
+index a1c24148ed314..75fb46298a87b 100644
+--- a/drivers/char/hw_random/imx-rngc.c
++++ b/drivers/char/hw_random/imx-rngc.c
+@@ -110,7 +110,7 @@ static int imx_rngc_self_test(struct imx_rngc *rngc)
+ cmd = readl(rngc->base + RNGC_COMMAND);
+ writel(cmd | RNGC_CMD_SELF_TEST, rngc->base + RNGC_COMMAND);
+
+- ret = wait_for_completion_timeout(&rngc->rng_op_done, RNGC_TIMEOUT);
++ ret = wait_for_completion_timeout(&rngc->rng_op_done, msecs_to_jiffies(RNGC_TIMEOUT));
+ imx_rngc_irq_mask_clear(rngc);
+ if (!ret)
+ return -ETIMEDOUT;
+@@ -187,9 +187,7 @@ static int imx_rngc_init(struct hwrng *rng)
+ cmd = readl(rngc->base + RNGC_COMMAND);
+ writel(cmd | RNGC_CMD_SEED, rngc->base + RNGC_COMMAND);
+
+- ret = wait_for_completion_timeout(&rngc->rng_op_done,
+- RNGC_TIMEOUT);
+-
++ ret = wait_for_completion_timeout(&rngc->rng_op_done, msecs_to_jiffies(RNGC_TIMEOUT));
+ if (!ret) {
+ ret = -ETIMEDOUT;
+ goto err;
+diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
+index cd48033b804a3..cf5499e51999b 100644
+--- a/drivers/char/tpm/tpm-chip.c
++++ b/drivers/char/tpm/tpm-chip.c
+@@ -518,6 +518,7 @@ static int tpm_add_legacy_sysfs(struct tpm_chip *chip)
+ * 6.x.y.z series: 6.0.18.6 +
+ * 3.x.y.z series: 3.57.y.5 +
+ */
++#ifdef CONFIG_X86
+ static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
+ {
+ u32 val1, val2;
+@@ -566,6 +567,12 @@ release:
+
+ return true;
+ }
++#else
++static inline bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
++{
++ return false;
++}
++#endif /* CONFIG_X86 */
+
+ static int tpm_hwrng_read(struct hwrng *rng, void *data, size_t max, bool wait)
+ {
+diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
+index d43a0d7b97a89..1a5d09b185134 100644
+--- a/drivers/char/tpm/tpm_crb.c
++++ b/drivers/char/tpm/tpm_crb.c
+@@ -563,15 +563,18 @@ static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
+ u32 rsp_size;
+ int ret;
+
+- INIT_LIST_HEAD(&acpi_resource_list);
+- ret = acpi_dev_get_resources(device, &acpi_resource_list,
+- crb_check_resource, iores_array);
+- if (ret < 0)
+- return ret;
+- acpi_dev_free_resource_list(&acpi_resource_list);
+-
+- /* Pluton doesn't appear to define ACPI memory regions */
++ /*
++ * Pluton sometimes does not define ACPI memory regions.
++ * Mapping is then done in crb_map_pluton
++ */
+ if (priv->sm != ACPI_TPM2_COMMAND_BUFFER_WITH_PLUTON) {
++ INIT_LIST_HEAD(&acpi_resource_list);
++ ret = acpi_dev_get_resources(device, &acpi_resource_list,
++ crb_check_resource, iores_array);
++ if (ret < 0)
++ return ret;
++ acpi_dev_free_resource_list(&acpi_resource_list);
++
+ if (resource_type(iores_array) != IORESOURCE_MEM) {
+ dev_err(dev, FW_BUG "TPM2 ACPI table does not define a memory resource\n");
+ return -EINVAL;
+diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
+index 7db3593941eaa..cc42cf3de960f 100644
+--- a/drivers/char/tpm/tpm_tis.c
++++ b/drivers/char/tpm/tpm_tis.c
+@@ -114,6 +114,22 @@ static int tpm_tis_disable_irq(const struct dmi_system_id *d)
+ }
+
+ static const struct dmi_system_id tpm_tis_dmi_table[] = {
++ {
++ .callback = tpm_tis_disable_irq,
++ .ident = "Framework Laptop (12th Gen Intel Core)",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Framework"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Laptop (12th Gen Intel Core)"),
++ },
++ },
++ {
++ .callback = tpm_tis_disable_irq,
++ .ident = "Framework Laptop (13th Gen Intel Core)",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Framework"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Laptop (13th Gen Intel Core)"),
++ },
++ },
+ {
+ .callback = tpm_tis_disable_irq,
+ .ident = "ThinkPad T490s",
+@@ -138,11 +154,20 @@ static const struct dmi_system_id tpm_tis_dmi_table[] = {
+ DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad L490"),
+ },
+ },
++ {
++ .callback = tpm_tis_disable_irq,
++ .ident = "ThinkPad L590",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
++ DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad L590"),
++ },
++ },
+ {
+ .callback = tpm_tis_disable_irq,
+ .ident = "UPX-TGL",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "AAEON"),
++ DMI_MATCH(DMI_PRODUCT_VERSION, "UPX-TGL"),
+ },
+ },
+ {}
+diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
+index 558144fa707ae..88a5384c09c02 100644
+--- a/drivers/char/tpm/tpm_tis_core.c
++++ b/drivers/char/tpm/tpm_tis_core.c
+@@ -24,9 +24,12 @@
+ #include <linux/wait.h>
+ #include <linux/acpi.h>
+ #include <linux/freezer.h>
++#include <linux/dmi.h>
+ #include "tpm.h"
+ #include "tpm_tis_core.h"
+
++#define TPM_TIS_MAX_UNHANDLED_IRQS 1000
++
+ static void tpm_tis_clkrun_enable(struct tpm_chip *chip, bool value);
+
+ static bool wait_for_tpm_stat_cond(struct tpm_chip *chip, u8 mask,
+@@ -468,25 +471,29 @@ out_err:
+ return rc;
+ }
+
+-static void disable_interrupts(struct tpm_chip *chip)
++static void __tpm_tis_disable_interrupts(struct tpm_chip *chip)
++{
++ struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
++ u32 int_mask = 0;
++
++ tpm_tis_read32(priv, TPM_INT_ENABLE(priv->locality), &int_mask);
++ int_mask &= ~TPM_GLOBAL_INT_ENABLE;
++ tpm_tis_write32(priv, TPM_INT_ENABLE(priv->locality), int_mask);
++
++ chip->flags &= ~TPM_CHIP_FLAG_IRQ;
++}
++
++static void tpm_tis_disable_interrupts(struct tpm_chip *chip)
+ {
+ struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
+- u32 intmask;
+- int rc;
+
+ if (priv->irq == 0)
+ return;
+
+- rc = tpm_tis_read32(priv, TPM_INT_ENABLE(priv->locality), &intmask);
+- if (rc < 0)
+- intmask = 0;
+-
+- intmask &= ~TPM_GLOBAL_INT_ENABLE;
+- rc = tpm_tis_write32(priv, TPM_INT_ENABLE(priv->locality), intmask);
++ __tpm_tis_disable_interrupts(chip);
+
+ devm_free_irq(chip->dev.parent, priv->irq, chip);
+ priv->irq = 0;
+- chip->flags &= ~TPM_CHIP_FLAG_IRQ;
+ }
+
+ /*
+@@ -552,7 +559,7 @@ static int tpm_tis_send(struct tpm_chip *chip, u8 *buf, size_t len)
+ if (!test_bit(TPM_TIS_IRQ_TESTED, &priv->flags))
+ tpm_msleep(1);
+ if (!test_bit(TPM_TIS_IRQ_TESTED, &priv->flags))
+- disable_interrupts(chip);
++ tpm_tis_disable_interrupts(chip);
+ set_bit(TPM_TIS_IRQ_TESTED, &priv->flags);
+ return rc;
+ }
+@@ -752,6 +759,57 @@ static bool tpm_tis_req_canceled(struct tpm_chip *chip, u8 status)
+ return status == TPM_STS_COMMAND_READY;
+ }
+
++static irqreturn_t tpm_tis_revert_interrupts(struct tpm_chip *chip)
++{
++ struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
++ const char *product;
++ const char *vendor;
++
++ dev_warn(&chip->dev, FW_BUG
++ "TPM interrupt storm detected, polling instead\n");
++
++ vendor = dmi_get_system_info(DMI_SYS_VENDOR);
++ product = dmi_get_system_info(DMI_PRODUCT_VERSION);
++
++ if (vendor && product) {
++ dev_info(&chip->dev,
++ "Consider adding the following entry to tpm_tis_dmi_table:\n");
++ dev_info(&chip->dev, "\tDMI_SYS_VENDOR: %s\n", vendor);
++ dev_info(&chip->dev, "\tDMI_PRODUCT_VERSION: %s\n", product);
++ }
++
++ if (tpm_tis_request_locality(chip, 0) != 0)
++ return IRQ_NONE;
++
++ __tpm_tis_disable_interrupts(chip);
++ tpm_tis_relinquish_locality(chip, 0);
++
++ schedule_work(&priv->free_irq_work);
++
++ return IRQ_HANDLED;
++}
++
++static irqreturn_t tpm_tis_update_unhandled_irqs(struct tpm_chip *chip)
++{
++ struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
++ irqreturn_t irqret = IRQ_HANDLED;
++
++ if (!(chip->flags & TPM_CHIP_FLAG_IRQ))
++ return IRQ_HANDLED;
++
++ if (time_after(jiffies, priv->last_unhandled_irq + HZ/10))
++ priv->unhandled_irqs = 1;
++ else
++ priv->unhandled_irqs++;
++
++ priv->last_unhandled_irq = jiffies;
++
++ if (priv->unhandled_irqs > TPM_TIS_MAX_UNHANDLED_IRQS)
++ irqret = tpm_tis_revert_interrupts(chip);
++
++ return irqret;
++}
++
+ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
+ {
+ struct tpm_chip *chip = dev_id;
+@@ -761,10 +819,10 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
+
+ rc = tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
+ if (rc < 0)
+- return IRQ_NONE;
++ goto err;
+
+ if (interrupt == 0)
+- return IRQ_NONE;
++ goto err;
+
+ set_bit(TPM_TIS_IRQ_TESTED, &priv->flags);
+ if (interrupt & TPM_INTF_DATA_AVAIL_INT)
+@@ -780,10 +838,13 @@ static irqreturn_t tis_int_handler(int dummy, void *dev_id)
+ rc = tpm_tis_write32(priv, TPM_INT_STATUS(priv->locality), interrupt);
+ tpm_tis_relinquish_locality(chip, 0);
+ if (rc < 0)
+- return IRQ_NONE;
++ goto err;
+
+ tpm_tis_read32(priv, TPM_INT_STATUS(priv->locality), &interrupt);
+ return IRQ_HANDLED;
++
++err:
++ return tpm_tis_update_unhandled_irqs(chip);
+ }
+
+ static void tpm_tis_gen_interrupt(struct tpm_chip *chip)
+@@ -804,6 +865,15 @@ static void tpm_tis_gen_interrupt(struct tpm_chip *chip)
+ chip->flags &= ~TPM_CHIP_FLAG_IRQ;
+ }
+
++static void tpm_tis_free_irq_func(struct work_struct *work)
++{
++ struct tpm_tis_data *priv = container_of(work, typeof(*priv), free_irq_work);
++ struct tpm_chip *chip = priv->chip;
++
++ devm_free_irq(chip->dev.parent, priv->irq, chip);
++ priv->irq = 0;
++}
++
+ /* Register the IRQ and issue a command that will cause an interrupt. If an
+ * irq is seen then leave the chip setup for IRQ operation, otherwise reverse
+ * everything and leave in polling mode. Returns 0 on success.
+@@ -816,6 +886,7 @@ static int tpm_tis_probe_irq_single(struct tpm_chip *chip, u32 intmask,
+ int rc;
+ u32 int_status;
+
++ INIT_WORK(&priv->free_irq_work, tpm_tis_free_irq_func);
+
+ rc = devm_request_threaded_irq(chip->dev.parent, irq, NULL,
+ tis_int_handler, IRQF_ONESHOT | flags,
+@@ -918,6 +989,7 @@ void tpm_tis_remove(struct tpm_chip *chip)
+ interrupt = 0;
+
+ tpm_tis_write32(priv, reg, ~TPM_GLOBAL_INT_ENABLE & interrupt);
++ flush_work(&priv->free_irq_work);
+
+ tpm_tis_clkrun_enable(chip, false);
+
+@@ -1021,6 +1093,7 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
+ chip->timeout_b = msecs_to_jiffies(TIS_TIMEOUT_B_MAX);
+ chip->timeout_c = msecs_to_jiffies(TIS_TIMEOUT_C_MAX);
+ chip->timeout_d = msecs_to_jiffies(TIS_TIMEOUT_D_MAX);
++ priv->chip = chip;
+ priv->timeout_min = TPM_TIMEOUT_USECS_MIN;
+ priv->timeout_max = TPM_TIMEOUT_USECS_MAX;
+ priv->phy_ops = phy_ops;
+@@ -1179,7 +1252,7 @@ int tpm_tis_core_init(struct device *dev, struct tpm_tis_data *priv, int irq,
+ rc = tpm_tis_request_locality(chip, 0);
+ if (rc < 0)
+ goto out_err;
+- disable_interrupts(chip);
++ tpm_tis_disable_interrupts(chip);
+ tpm_tis_relinquish_locality(chip, 0);
+ }
+ }
+diff --git a/drivers/char/tpm/tpm_tis_core.h b/drivers/char/tpm/tpm_tis_core.h
+index 610bfadb6acf1..b1a169d7d1ca9 100644
+--- a/drivers/char/tpm/tpm_tis_core.h
++++ b/drivers/char/tpm/tpm_tis_core.h
+@@ -91,11 +91,15 @@ enum tpm_tis_flags {
+ };
+
+ struct tpm_tis_data {
++ struct tpm_chip *chip;
+ u16 manufacturer_id;
+ struct mutex locality_count_mutex;
+ unsigned int locality_count;
+ int locality;
+ int irq;
++ struct work_struct free_irq_work;
++ unsigned long last_unhandled_irq;
++ unsigned int unhandled_irqs;
+ unsigned int int_mask;
+ unsigned long flags;
+ void __iomem *ilb_base_addr;
+diff --git a/drivers/char/tpm/tpm_tis_i2c.c b/drivers/char/tpm/tpm_tis_i2c.c
+index c8c34adc14c0e..82fda488e98bb 100644
+--- a/drivers/char/tpm/tpm_tis_i2c.c
++++ b/drivers/char/tpm/tpm_tis_i2c.c
+@@ -189,21 +189,28 @@ static int tpm_tis_i2c_read_bytes(struct tpm_tis_data *data, u32 addr, u16 len,
+ int ret;
+
+ for (i = 0; i < TPM_RETRY; i++) {
+- /* write register */
+- msg.len = sizeof(reg);
+- msg.buf = &reg;
+- msg.flags = 0;
+- ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg);
+- if (ret < 0)
+- return ret;
+-
+- /* read data */
+- msg.buf = result;
+- msg.len = len;
+- msg.flags = I2C_M_RD;
+- ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg);
+- if (ret < 0)
+- return ret;
++ u16 read = 0;
++
++ while (read < len) {
++ /* write register */
++ msg.len = sizeof(reg);
++ msg.buf = &reg;
++ msg.flags = 0;
++ ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg);
++ if (ret < 0)
++ return ret;
++
++ /* read data */
++ msg.buf = result + read;
++ msg.len = len - read;
++ msg.flags = I2C_M_RD;
++ if (msg.len > I2C_SMBUS_BLOCK_MAX)
++ msg.len = I2C_SMBUS_BLOCK_MAX;
++ ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg);
++ if (ret < 0)
++ return ret;
++ read += msg.len;
++ }
+
+ ret = tpm_tis_i2c_sanity_check_read(reg, len, result);
+ if (ret == 0)
+@@ -223,19 +230,27 @@ static int tpm_tis_i2c_write_bytes(struct tpm_tis_data *data, u32 addr, u16 len,
+ struct i2c_msg msg = { .addr = phy->i2c_client->addr };
+ u8 reg = tpm_tis_i2c_address_to_register(addr);
+ int ret;
++ u16 wrote = 0;
+
+ if (len > TPM_BUFSIZE - 1)
+ return -EIO;
+
+- /* write register and data in one go */
+ phy->io_buf[0] = reg;
+- memcpy(phy->io_buf + sizeof(reg), value, len);
+-
+- msg.len = sizeof(reg) + len;
+ msg.buf = phy->io_buf;
+- ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg);
+- if (ret < 0)
+- return ret;
++ while (wrote < len) {
++ /* write register and data in one go */
++ msg.len = sizeof(reg) + len - wrote;
++ if (msg.len > I2C_SMBUS_BLOCK_MAX)
++ msg.len = I2C_SMBUS_BLOCK_MAX;
++
++ memcpy(phy->io_buf + sizeof(reg), value + wrote,
++ msg.len - sizeof(reg));
++
++ ret = tpm_tis_i2c_retry_transfer_until_ack(data, &msg);
++ if (ret < 0)
++ return ret;
++ wrote += msg.len - sizeof(reg);
++ }
+
+ return 0;
+ }
+diff --git a/drivers/char/tpm/tpm_vtpm_proxy.c b/drivers/char/tpm/tpm_vtpm_proxy.c
+index 5c865987ba5c1..30e953988cabe 100644
+--- a/drivers/char/tpm/tpm_vtpm_proxy.c
++++ b/drivers/char/tpm/tpm_vtpm_proxy.c
+@@ -683,37 +683,21 @@ static struct miscdevice vtpmx_miscdev = {
+ .fops = &vtpmx_fops,
+ };
+
+-static int vtpmx_init(void)
+-{
+- return misc_register(&vtpmx_miscdev);
+-}
+-
+-static void vtpmx_cleanup(void)
+-{
+- misc_deregister(&vtpmx_miscdev);
+-}
+-
+ static int __init vtpm_module_init(void)
+ {
+ int rc;
+
+- rc = vtpmx_init();
+- if (rc) {
+- pr_err("couldn't create vtpmx device\n");
+- return rc;
+- }
+-
+ workqueue = create_workqueue("tpm-vtpm");
+ if (!workqueue) {
+ pr_err("couldn't create workqueue\n");
+- rc = -ENOMEM;
+- goto err_vtpmx_cleanup;
++ return -ENOMEM;
+ }
+
+- return 0;
+-
+-err_vtpmx_cleanup:
+- vtpmx_cleanup();
++ rc = misc_register(&vtpmx_miscdev);
++ if (rc) {
++ pr_err("couldn't create vtpmx device\n");
++ destroy_workqueue(workqueue);
++ }
+
+ return rc;
+ }
+@@ -721,7 +705,7 @@ err_vtpmx_cleanup:
+ static void __exit vtpm_module_exit(void)
+ {
+ destroy_workqueue(workqueue);
+- vtpmx_cleanup();
++ misc_deregister(&vtpmx_miscdev);
+ }
+
+ module_init(vtpm_module_init);
+diff --git a/drivers/firmware/stratix10-svc.c b/drivers/firmware/stratix10-svc.c
+index 80f4e2d14e046..2d674126160fe 100644
+--- a/drivers/firmware/stratix10-svc.c
++++ b/drivers/firmware/stratix10-svc.c
+@@ -755,7 +755,7 @@ svc_create_memory_pool(struct platform_device *pdev,
+ end = rounddown(sh_memory->addr + sh_memory->size, PAGE_SIZE);
+ paddr = begin;
+ size = end - begin;
+- va = memremap(paddr, size, MEMREMAP_WC);
++ va = devm_memremap(dev, paddr, size, MEMREMAP_WC);
+ if (!va) {
+ dev_err(dev, "fail to remap shared memory\n");
+ return ERR_PTR(-EINVAL);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+index 83a83ced2439f..05b9989e3ac1b 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+@@ -2792,6 +2792,9 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)
+ if (!attachment->is_mapped)
+ continue;
+
++ if (attachment->bo_va->base.bo->tbo.pin_count)
++ continue;
++
+ kfd_mem_dmaunmap_attachment(mem, attachment);
+ ret = update_gpuvm_pte(mem, attachment, &sync_obj);
+ if (ret) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+index ac44b6774352b..23f52150ebef4 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+@@ -1683,18 +1683,30 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
+
+ /* Insert partial mapping before the range */
+ if (!list_empty(&before->list)) {
++ struct amdgpu_bo *bo = before->bo_va->base.bo;
++
+ amdgpu_vm_it_insert(before, &vm->va);
+ if (before->flags & AMDGPU_PTE_PRT)
+ amdgpu_vm_prt_get(adev);
++
++ if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
++ !before->bo_va->base.moved)
++ amdgpu_vm_bo_moved(&before->bo_va->base);
+ } else {
+ kfree(before);
+ }
+
+ /* Insert partial mapping after the range */
+ if (!list_empty(&after->list)) {
++ struct amdgpu_bo *bo = after->bo_va->base.bo;
++
+ amdgpu_vm_it_insert(after, &vm->va);
+ if (after->flags & AMDGPU_PTE_PRT)
+ amdgpu_vm_prt_get(adev);
++
++ if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
++ !after->bo_va->base.moved)
++ amdgpu_vm_bo_moved(&after->bo_va->base);
+ } else {
+ kfree(after);
+ }
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 51269b0ab9b58..44f4c74419740 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -1776,12 +1776,6 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+
+ dc_init_callbacks(adev->dm.dc, &init_params);
+ }
+-#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
+- adev->dm.secure_display_ctxs = amdgpu_dm_crtc_secure_display_create_contexts(adev);
+- if (!adev->dm.secure_display_ctxs) {
+- DRM_ERROR("amdgpu: failed to initialize secure_display_ctxs.\n");
+- }
+-#endif
+ if (dc_is_dmub_outbox_supported(adev->dm.dc)) {
+ init_completion(&adev->dm.dmub_aux_transfer_done);
+ adev->dm.dmub_notify = kzalloc(sizeof(struct dmub_notification), GFP_KERNEL);
+@@ -1840,6 +1834,11 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ goto error;
+ }
+
++#if defined(CONFIG_DRM_AMD_SECURE_DISPLAY)
++ adev->dm.secure_display_ctxs = amdgpu_dm_crtc_secure_display_create_contexts(adev);
++ if (!adev->dm.secure_display_ctxs)
++ DRM_ERROR("amdgpu: failed to initialize secure display contexts.\n");
++#endif
+
+ DRM_DEBUG_DRIVER("KMS initialized.\n");
+
+@@ -3263,6 +3262,7 @@ static void dm_handle_mst_sideband_msg(struct amdgpu_dm_connector *aconnector)
+
+ while (dret == dpcd_bytes_to_read &&
+ process_count < max_process_count) {
++ u8 ack[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = {};
+ u8 retry;
+ dret = 0;
+
+@@ -3271,28 +3271,29 @@ static void dm_handle_mst_sideband_msg(struct amdgpu_dm_connector *aconnector)
+ DRM_DEBUG_DRIVER("ESI %02x %02x %02x\n", esi[0], esi[1], esi[2]);
+ /* handle HPD short pulse irq */
+ if (aconnector->mst_mgr.mst_state)
+- drm_dp_mst_hpd_irq(
+- &aconnector->mst_mgr,
+- esi,
+- &new_irq_handled);
++ drm_dp_mst_hpd_irq_handle_event(&aconnector->mst_mgr,
++ esi,
++ ack,
++ &new_irq_handled);
+
+ if (new_irq_handled) {
+ /* ACK at DPCD to notify down stream */
+- const int ack_dpcd_bytes_to_write =
+- dpcd_bytes_to_read - 1;
+-
+ for (retry = 0; retry < 3; retry++) {
+- u8 wret;
+-
+- wret = drm_dp_dpcd_write(
+- &aconnector->dm_dp_aux.aux,
+- dpcd_addr + 1,
+- &esi[1],
+- ack_dpcd_bytes_to_write);
+- if (wret == ack_dpcd_bytes_to_write)
++ ssize_t wret;
++
++ wret = drm_dp_dpcd_writeb(&aconnector->dm_dp_aux.aux,
++ dpcd_addr + 1,
++ ack[1]);
++ if (wret == 1)
+ break;
+ }
+
++ if (retry == 3) {
++ DRM_ERROR("Failed to ack MST event.\n");
++ return;
++ }
++
++ drm_dp_mst_hpd_irq_send_new_request(&aconnector->mst_mgr);
+ /* check if there is new irq to be handled */
+ dret = drm_dp_dpcd_read(
+ &aconnector->dm_dp_aux.aux,
+@@ -5057,11 +5058,7 @@ static inline void fill_dc_dirty_rect(struct drm_plane *plane,
+ s32 y, s32 width, s32 height,
+ int *i, bool ffu)
+ {
+- if (*i > DC_MAX_DIRTY_RECTS)
+- return;
+-
+- if (*i == DC_MAX_DIRTY_RECTS)
+- goto out;
++ WARN_ON(*i >= DC_MAX_DIRTY_RECTS);
+
+ dirty_rect->x = x;
+ dirty_rect->y = y;
+@@ -5077,7 +5074,6 @@ static inline void fill_dc_dirty_rect(struct drm_plane *plane,
+ "[PLANE:%d] PSR SU dirty rect at (%d, %d) size (%d, %d)",
+ plane->base.id, x, y, width, height);
+
+-out:
+ (*i)++;
+ }
+
+@@ -5164,6 +5160,9 @@ static void fill_dc_dirty_rects(struct drm_plane *plane,
+
+ *dirty_regions_changed = bb_changed;
+
++ if ((num_clips + (bb_changed ? 2 : 0)) > DC_MAX_DIRTY_RECTS)
++ goto ffu;
++
+ if (bb_changed) {
+ fill_dc_dirty_rect(new_plane_state->plane, &dirty_rects[i],
+ new_plane_state->crtc_x,
+@@ -5193,9 +5192,6 @@ static void fill_dc_dirty_rects(struct drm_plane *plane,
+ new_plane_state->crtc_h, &i, false);
+ }
+
+- if (i > DC_MAX_DIRTY_RECTS)
+- goto ffu;
+-
+ flip_addrs->dirty_rect_count = i;
+ return;
+
+@@ -7196,7 +7192,13 @@ static int amdgpu_dm_connector_get_modes(struct drm_connector *connector)
+ drm_add_modes_noedid(connector, 1920, 1080);
+ } else {
+ amdgpu_dm_connector_ddc_get_modes(connector, edid);
+- amdgpu_dm_connector_add_common_modes(encoder, connector);
++ /* most eDP supports only timings from its edid,
++ * usually only detailed timings are available
++ * from eDP edid. timings which are not from edid
++ * may damage eDP
++ */
++ if (connector->connector_type != DRM_MODE_CONNECTOR_eDP)
++ amdgpu_dm_connector_add_common_modes(encoder, connector);
+ amdgpu_dm_connector_add_freesync_modes(connector, edid);
+ }
+ amdgpu_dm_fbc_init(connector);
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h
+index 935adca6f0486..748e80ef40d0a 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.h
+@@ -100,7 +100,7 @@ struct secure_display_context *amdgpu_dm_crtc_secure_display_create_contexts(
+ #else
+ #define amdgpu_dm_crc_window_is_activated(x)
+ #define amdgpu_dm_crtc_handle_crc_window_irq(x)
+-#define amdgpu_dm_crtc_secure_display_create_contexts()
++#define amdgpu_dm_crtc_secure_display_create_contexts(x)
+ #endif
+
+ #endif /* AMD_DAL_DEV_AMDGPU_DM_AMDGPU_DM_CRC_H_ */
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+index c6ce2b7123b79..616a450deddde 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+@@ -47,6 +47,30 @@
+ /* MST Dock */
+ static const uint8_t SYNAPTICS_DEVICE_ID[] = "SYNA";
+
++static u32 edid_extract_panel_id(struct edid *edid)
++{
++ return (u32)edid->mfg_id[0] << 24 |
++ (u32)edid->mfg_id[1] << 16 |
++ (u32)EDID_PRODUCT_ID(edid);
++}
++
++static void apply_edid_quirks(struct edid *edid, struct dc_edid_caps *edid_caps)
++{
++ uint32_t panel_id = edid_extract_panel_id(edid);
++
++ switch (panel_id) {
++ /* Workaround for some monitors which does not work well with FAMS */
++ case drm_edid_encode_panel_id('S', 'A', 'M', 0x0E5E):
++ case drm_edid_encode_panel_id('S', 'A', 'M', 0x7053):
++ case drm_edid_encode_panel_id('S', 'A', 'M', 0x71AC):
++ DRM_DEBUG_DRIVER("Disabling FAMS on monitor with panel id %X\n", panel_id);
++ edid_caps->panel_patch.disable_fams = true;
++ break;
++ default:
++ return;
++ }
++}
++
+ /* dm_helpers_parse_edid_caps
+ *
+ * Parse edid caps
+@@ -118,6 +142,8 @@ enum dc_edid_status dm_helpers_parse_edid_caps(
+ else
+ edid_caps->speaker_flags = DEFAULT_SPEAKER_LOCATION;
+
++ apply_edid_quirks(edid_buf, edid_caps);
++
+ kfree(sads);
+ kfree(sadb);
+
+diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
+index dcf8631181690..6eace83c9c6f5 100644
+--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
++++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
+@@ -1602,6 +1602,9 @@ bool dc_validate_boot_timing(const struct dc *dc,
+ return false;
+ }
+
++ if (dc->debug.force_odm_combine)
++ return false;
++
+ /* Check for enabled DIG to identify enabled display */
+ if (!link->link_enc->funcs->is_dig_enabled(link->link_enc))
+ return false;
+diff --git a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
+index e179e80667d1c..19d7cfa53211b 100644
+--- a/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/dce112/dce112_resource.c
+@@ -970,10 +970,12 @@ enum dc_status resource_map_phy_clock_resources(
+ || dc_is_virtual_signal(pipe_ctx->stream->signal))
+ pipe_ctx->clock_source =
+ dc->res_pool->dp_clock_source;
+- else
+- pipe_ctx->clock_source = find_matching_pll(
+- &context->res_ctx, dc->res_pool,
+- stream);
++ else {
++ if (stream && stream->link && stream->link->link_enc)
++ pipe_ctx->clock_source = find_matching_pll(
++ &context->res_ctx, dc->res_pool,
++ stream);
++ }
+
+ if (pipe_ctx->clock_source == NULL)
+ return DC_NO_CLOCK_SOURCE_RESOURCE;
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+index 5403e9399a465..c38be3c6c234e 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+@@ -1732,6 +1732,17 @@ static void dcn20_program_pipe(
+
+ if (hws->funcs.setup_vupdate_interrupt)
+ hws->funcs.setup_vupdate_interrupt(dc, pipe_ctx);
++
++ if (hws->funcs.calculate_dccg_k1_k2_values && dc->res_pool->dccg->funcs->set_pixel_rate_div) {
++ unsigned int k1_div, k2_div;
++
++ hws->funcs.calculate_dccg_k1_k2_values(pipe_ctx, &k1_div, &k2_div);
++
++ dc->res_pool->dccg->funcs->set_pixel_rate_div(
++ dc->res_pool->dccg,
++ pipe_ctx->stream_res.tg->inst,
++ k1_div, k2_div);
++ }
+ }
+
+ if (pipe_ctx->update_flags.bits.odm)
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+index 1f5ee5cde6e1c..27f09ccb1f713 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c
+@@ -1125,10 +1125,6 @@ unsigned int dcn32_calculate_dccg_k1_k2_values(struct pipe_ctx *pipe_ctx, unsign
+ unsigned int odm_combine_factor = 0;
+ bool two_pix_per_container = false;
+
+- // For phantom pipes, use the same programming as the main pipes
+- if (pipe_ctx->stream->mall_stream_config.type == SUBVP_PHANTOM) {
+- stream = pipe_ctx->stream->mall_stream_config.paired_stream;
+- }
+ two_pix_per_container = optc2_is_two_pixels_per_containter(&stream->timing);
+ odm_combine_factor = get_odm_config(pipe_ctx, NULL);
+
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
+index 2ee798965bc2b..a974f86e718a8 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
+@@ -98,7 +98,7 @@ static void optc32_set_odm_combine(struct timing_generator *optc, int *opp_id, i
+ optc1->opp_count = opp_cnt;
+ }
+
+-static void optc32_set_h_timing_div_manual_mode(struct timing_generator *optc, bool manual_mode)
++void optc32_set_h_timing_div_manual_mode(struct timing_generator *optc, bool manual_mode)
+ {
+ struct optc *optc1 = DCN10TG_FROM_TG(optc);
+
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.h b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.h
+index b92ba8c756940..abf0121a10060 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.h
++++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.h
+@@ -179,5 +179,6 @@
+ SF(OTG0_OTG_DRR_CONTROL, OTG_V_TOTAL_LAST_USED_BY_DRR, mask_sh)
+
+ void dcn32_timing_generator_init(struct optc *optc1);
++void optc32_set_h_timing_div_manual_mode(struct timing_generator *optc, bool manual_mode);
+
+ #endif /* __DC_OPTC_DCN32_H__ */
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
+index 22dd1ebea618b..df6132d92d70b 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
+@@ -1888,6 +1888,8 @@ bool dcn32_validate_bandwidth(struct dc *dc,
+
+ dc->res_pool->funcs->calculate_wm_and_dlg(dc, context, pipes, pipe_cnt, vlevel);
+
++ dcn32_override_min_req_memclk(dc, context);
++
+ BW_VAL_TRACE_END_WATERMARKS();
+
+ goto validate_out;
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+index 0c4c3208def17..d8b4119820bfc 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+@@ -2882,3 +2882,18 @@ void dcn32_set_clock_limits(const struct _vcs_dpi_soc_bounding_box_st *soc_bb)
+ dc_assert_fp_enabled();
+ dcn3_2_soc.clock_limits[0].dcfclk_mhz = 1200.0;
+ }
++
++void dcn32_override_min_req_memclk(struct dc *dc, struct dc_state *context)
++{
++ // WA: restrict FPO and SubVP to use first non-strobe mode (DCN32 BW issue)
++ if ((context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching || dcn32_subvp_in_use(dc, context)) &&
++ dc->dml.soc.num_chans <= 8) {
++ int num_mclk_levels = dc->clk_mgr->bw_params->clk_table.num_entries_per_clk.num_memclk_levels;
++
++ if (context->bw_ctx.dml.vba.DRAMSpeed <= dc->clk_mgr->bw_params->clk_table.entries[0].memclk_mhz * 16 &&
++ num_mclk_levels > 1) {
++ context->bw_ctx.dml.vba.DRAMSpeed = dc->clk_mgr->bw_params->clk_table.entries[1].memclk_mhz * 16;
++ context->bw_ctx.bw.dcn.clk.dramclk_khz = context->bw_ctx.dml.vba.DRAMSpeed * 1000 / 16;
++ }
++ }
++}
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h
+index dcf512cd30721..a4206b71d650a 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.h
+@@ -80,6 +80,8 @@ void dcn32_assign_fpo_vactive_candidate(struct dc *dc, const struct dc_state *co
+
+ bool dcn32_find_vactive_pipe(struct dc *dc, const struct dc_state *context, uint32_t vactive_margin_req);
+
++void dcn32_override_min_req_memclk(struct dc *dc, struct dc_state *context);
++
+ void dcn32_set_clock_limits(const struct _vcs_dpi_soc_bounding_box_st *soc_bb);
+
+ #endif
+diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c
+index ba95facc4ee86..b1b11eb0f9bb4 100644
+--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c
++++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_irq_handler.c
+@@ -82,8 +82,15 @@ bool dp_parse_link_loss_status(
+ }
+
+ /* Check interlane align.*/
+- if (sink_status_changed ||
+- !hpd_irq_dpcd_data->bytes.lane_status_updated.bits.INTERLANE_ALIGN_DONE) {
++ if (link_dp_get_encoding_format(&link->cur_link_settings) == DP_128b_132b_ENCODING &&
++ (!hpd_irq_dpcd_data->bytes.lane_status_updated.bits.EQ_INTERLANE_ALIGN_DONE_128b_132b ||
++ !hpd_irq_dpcd_data->bytes.lane_status_updated.bits.CDS_INTERLANE_ALIGN_DONE_128b_132b)) {
++ sink_status_changed = true;
++ } else if (!hpd_irq_dpcd_data->bytes.lane_status_updated.bits.INTERLANE_ALIGN_DONE) {
++ sink_status_changed = true;
++ }
++
++ if (sink_status_changed) {
+
+ DC_LOG_HW_HPD_IRQ("%s: Link Status changed.\n", __func__);
+
+diff --git a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
+index ba1715e2d25a9..554ab48d4e647 100644
+--- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
++++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
+@@ -490,7 +490,7 @@ struct dmub_notification {
+ * of a firmware to know if feature or functionality is supported or present.
+ */
+ #define DMUB_FW_VERSION(major, minor, revision) \
+- ((((major) & 0xFF) << 24) | (((minor) & 0xFF) << 16) | ((revision) & 0xFFFF))
++ ((((major) & 0xFF) << 24) | (((minor) & 0xFF) << 16) | (((revision) & 0xFF) << 8))
+
+ /**
+ * dmub_srv_create() - creates the DMUB service.
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
+index df3baaab00378..9bb6cc99f0d0b 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
++++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
+@@ -303,5 +303,9 @@ int smu_v13_0_get_pptable_from_firmware(struct smu_context *smu,
+ uint32_t *size,
+ uint32_t pptable_id);
+
++int smu_v13_0_update_pcie_parameters(struct smu_context *smu,
++ uint32_t pcie_gen_cap,
++ uint32_t pcie_width_cap);
++
+ #endif
+ #endif
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+index 9cd005131f566..3bb18396d2f9d 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c
+@@ -2113,7 +2113,6 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ mutex_lock(&adev->pm.mutex);
+ r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
+- mutex_unlock(&adev->pm.mutex);
+ if (r)
+ goto fail;
+
+@@ -2130,6 +2129,7 @@ static int arcturus_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ r = num_msgs;
+ fail:
++ mutex_unlock(&adev->pm.mutex);
+ kfree(req);
+ return r;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+index 275f708db6362..3f2494c6e3725 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+@@ -3021,7 +3021,6 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ mutex_lock(&adev->pm.mutex);
+ r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
+- mutex_unlock(&adev->pm.mutex);
+ if (r)
+ goto fail;
+
+@@ -3038,6 +3037,7 @@ static int navi10_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ r = num_msgs;
+ fail:
++ mutex_unlock(&adev->pm.mutex);
+ kfree(req);
+ return r;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+index f7ed3e655e397..8fe2e1716da44 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+@@ -3842,7 +3842,6 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ mutex_lock(&adev->pm.mutex);
+ r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
+- mutex_unlock(&adev->pm.mutex);
+ if (r)
+ goto fail;
+
+@@ -3859,6 +3858,7 @@ static int sienna_cichlid_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ r = num_msgs;
+ fail:
++ mutex_unlock(&adev->pm.mutex);
+ kfree(req);
+ return r;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+index d30ec3005ea19..10eb72d892d48 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+@@ -1525,7 +1525,6 @@ static int aldebaran_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ mutex_lock(&adev->pm.mutex);
+ r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
+- mutex_unlock(&adev->pm.mutex);
+ if (r)
+ goto fail;
+
+@@ -1542,6 +1541,7 @@ static int aldebaran_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ r = num_msgs;
+ fail:
++ mutex_unlock(&adev->pm.mutex);
+ kfree(req);
+ return r;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+index ca379181081c1..7acf731a69ccf 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+@@ -2453,3 +2453,70 @@ int smu_v13_0_mode1_reset(struct smu_context *smu)
+
+ return ret;
+ }
++
++/*
++ * Intel hosts such as Raptor Lake and Sapphire Rapids don't support dynamic
++ * speed switching. Until we have confirmation from Intel that a specific host
++ * supports it, it's safer that we keep it disabled for all.
++ *
++ * https://edc.intel.com/content/www/us/en/design/products/platforms/details/raptor-lake-s/13th-generation-core-processors-datasheet-volume-1-of-2/005/pci-express-support/
++ * https://gitlab.freedesktop.org/drm/amd/-/issues/2663
++ */
++static bool smu_v13_0_is_pcie_dynamic_switching_supported(void)
++{
++#if IS_ENABLED(CONFIG_X86)
++ struct cpuinfo_x86 *c = &cpu_data(0);
++
++ if (c->x86_vendor == X86_VENDOR_INTEL)
++ return false;
++#endif
++ return true;
++}
++
++int smu_v13_0_update_pcie_parameters(struct smu_context *smu,
++ uint32_t pcie_gen_cap,
++ uint32_t pcie_width_cap)
++{
++ struct smu_13_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context;
++ struct smu_13_0_pcie_table *pcie_table =
++ &dpm_context->dpm_tables.pcie_table;
++ int num_of_levels = pcie_table->num_of_link_levels;
++ uint32_t smu_pcie_arg;
++ int ret, i;
++
++ if (!smu_v13_0_is_pcie_dynamic_switching_supported()) {
++ if (pcie_table->pcie_gen[num_of_levels - 1] < pcie_gen_cap)
++ pcie_gen_cap = pcie_table->pcie_gen[num_of_levels - 1];
++
++ if (pcie_table->pcie_lane[num_of_levels - 1] < pcie_width_cap)
++ pcie_width_cap = pcie_table->pcie_lane[num_of_levels - 1];
++
++ /* Force all levels to use the same settings */
++ for (i = 0; i < num_of_levels; i++) {
++ pcie_table->pcie_gen[i] = pcie_gen_cap;
++ pcie_table->pcie_lane[i] = pcie_width_cap;
++ }
++ } else {
++ for (i = 0; i < num_of_levels; i++) {
++ if (pcie_table->pcie_gen[i] > pcie_gen_cap)
++ pcie_table->pcie_gen[i] = pcie_gen_cap;
++ if (pcie_table->pcie_lane[i] > pcie_width_cap)
++ pcie_table->pcie_lane[i] = pcie_width_cap;
++ }
++ }
++
++ for (i = 0; i < num_of_levels; i++) {
++ smu_pcie_arg = i << 16;
++ smu_pcie_arg |= pcie_table->pcie_gen[i] << 8;
++ smu_pcie_arg |= pcie_table->pcie_lane[i];
++
++ ret = smu_cmn_send_smc_msg_with_param(smu,
++ SMU_MSG_OverridePcieParameters,
++ smu_pcie_arg,
++ NULL);
++ if (ret)
++ return ret;
++ }
++
++ return 0;
++}
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+index c42c0c1446f4f..907cc43d16a90 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+@@ -1235,37 +1235,6 @@ static int smu_v13_0_0_force_clk_levels(struct smu_context *smu,
+ return ret;
+ }
+
+-static int smu_v13_0_0_update_pcie_parameters(struct smu_context *smu,
+- uint32_t pcie_gen_cap,
+- uint32_t pcie_width_cap)
+-{
+- struct smu_13_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context;
+- struct smu_13_0_pcie_table *pcie_table =
+- &dpm_context->dpm_tables.pcie_table;
+- uint32_t smu_pcie_arg;
+- int ret, i;
+-
+- for (i = 0; i < pcie_table->num_of_link_levels; i++) {
+- if (pcie_table->pcie_gen[i] > pcie_gen_cap)
+- pcie_table->pcie_gen[i] = pcie_gen_cap;
+- if (pcie_table->pcie_lane[i] > pcie_width_cap)
+- pcie_table->pcie_lane[i] = pcie_width_cap;
+-
+- smu_pcie_arg = i << 16;
+- smu_pcie_arg |= pcie_table->pcie_gen[i] << 8;
+- smu_pcie_arg |= pcie_table->pcie_lane[i];
+-
+- ret = smu_cmn_send_smc_msg_with_param(smu,
+- SMU_MSG_OverridePcieParameters,
+- smu_pcie_arg,
+- NULL);
+- if (ret)
+- return ret;
+- }
+-
+- return 0;
+-}
+-
+ static const struct smu_temperature_range smu13_thermal_policy[] = {
+ {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, 99000},
+ { 120000, 120000, 120000, 120000, 120000, 120000, 120000, 120000, 120000},
+@@ -1838,7 +1807,6 @@ static int smu_v13_0_0_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ mutex_lock(&adev->pm.mutex);
+ r = smu_cmn_update_table(smu, SMU_TABLE_I2C_COMMANDS, 0, req, true);
+- mutex_unlock(&adev->pm.mutex);
+ if (r)
+ goto fail;
+
+@@ -1855,6 +1823,7 @@ static int smu_v13_0_0_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ r = num_msgs;
+ fail:
++ mutex_unlock(&adev->pm.mutex);
+ kfree(req);
+ return r;
+ }
+@@ -2172,7 +2141,7 @@ static const struct pptable_funcs smu_v13_0_0_ppt_funcs = {
+ .feature_is_enabled = smu_cmn_feature_is_enabled,
+ .print_clk_levels = smu_v13_0_0_print_clk_levels,
+ .force_clk_levels = smu_v13_0_0_force_clk_levels,
+- .update_pcie_parameters = smu_v13_0_0_update_pcie_parameters,
++ .update_pcie_parameters = smu_v13_0_update_pcie_parameters,
+ .get_thermal_temperature_range = smu_v13_0_0_get_thermal_temperature_range,
+ .register_irq_handler = smu_v13_0_register_irq_handler,
+ .enable_thermal_alert = smu_v13_0_enable_thermal_alert,
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
+index ea8f3d6fb98b3..c9093517b1bda 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
+@@ -1639,7 +1639,6 @@ static int smu_v13_0_6_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ mutex_lock(&adev->pm.mutex);
+ r = smu_v13_0_6_request_i2c_xfer(smu, req);
+- mutex_unlock(&adev->pm.mutex);
+ if (r)
+ goto fail;
+
+@@ -1656,6 +1655,7 @@ static int smu_v13_0_6_i2c_xfer(struct i2c_adapter *i2c_adap,
+ }
+ r = num_msgs;
+ fail:
++ mutex_unlock(&adev->pm.mutex);
+ kfree(req);
+ return r;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
+index bba621615abf0..aac72925db34a 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
+@@ -1225,37 +1225,6 @@ static int smu_v13_0_7_force_clk_levels(struct smu_context *smu,
+ return ret;
+ }
+
+-static int smu_v13_0_7_update_pcie_parameters(struct smu_context *smu,
+- uint32_t pcie_gen_cap,
+- uint32_t pcie_width_cap)
+-{
+- struct smu_13_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context;
+- struct smu_13_0_pcie_table *pcie_table =
+- &dpm_context->dpm_tables.pcie_table;
+- uint32_t smu_pcie_arg;
+- int ret, i;
+-
+- for (i = 0; i < pcie_table->num_of_link_levels; i++) {
+- if (pcie_table->pcie_gen[i] > pcie_gen_cap)
+- pcie_table->pcie_gen[i] = pcie_gen_cap;
+- if (pcie_table->pcie_lane[i] > pcie_width_cap)
+- pcie_table->pcie_lane[i] = pcie_width_cap;
+-
+- smu_pcie_arg = i << 16;
+- smu_pcie_arg |= pcie_table->pcie_gen[i] << 8;
+- smu_pcie_arg |= pcie_table->pcie_lane[i];
+-
+- ret = smu_cmn_send_smc_msg_with_param(smu,
+- SMU_MSG_OverridePcieParameters,
+- smu_pcie_arg,
+- NULL);
+- if (ret)
+- return ret;
+- }
+-
+- return 0;
+-}
+-
+ static const struct smu_temperature_range smu13_thermal_policy[] =
+ {
+ {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, 99000},
+@@ -1752,7 +1721,7 @@ static const struct pptable_funcs smu_v13_0_7_ppt_funcs = {
+ .feature_is_enabled = smu_cmn_feature_is_enabled,
+ .print_clk_levels = smu_v13_0_7_print_clk_levels,
+ .force_clk_levels = smu_v13_0_7_force_clk_levels,
+- .update_pcie_parameters = smu_v13_0_7_update_pcie_parameters,
++ .update_pcie_parameters = smu_v13_0_update_pcie_parameters,
+ .get_thermal_temperature_range = smu_v13_0_7_get_thermal_temperature_range,
+ .register_irq_handler = smu_v13_0_register_irq_handler,
+ .enable_thermal_alert = smu_v13_0_enable_thermal_alert,
+diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+index 603bb3c51027b..3b40e0fdca5cb 100644
+--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
++++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+@@ -1426,9 +1426,9 @@ void dw_hdmi_set_high_tmds_clock_ratio(struct dw_hdmi *hdmi,
+ /* Control for TMDS Bit Period/TMDS Clock-Period Ratio */
+ if (dw_hdmi_support_scdc(hdmi, display)) {
+ if (mtmdsclock > HDMI14_MAX_TMDSCLK)
+- drm_scdc_set_high_tmds_clock_ratio(&hdmi->connector, 1);
++ drm_scdc_set_high_tmds_clock_ratio(hdmi->curr_conn, 1);
+ else
+- drm_scdc_set_high_tmds_clock_ratio(&hdmi->connector, 0);
++ drm_scdc_set_high_tmds_clock_ratio(hdmi->curr_conn, 0);
+ }
+ }
+ EXPORT_SYMBOL_GPL(dw_hdmi_set_high_tmds_clock_ratio);
+@@ -2116,7 +2116,7 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi,
+ min_t(u8, bytes, SCDC_MIN_SOURCE_VERSION));
+
+ /* Enabled Scrambling in the Sink */
+- drm_scdc_set_scrambling(&hdmi->connector, 1);
++ drm_scdc_set_scrambling(hdmi->curr_conn, 1);
+
+ /*
+ * To activate the scrambler feature, you must ensure
+@@ -2132,7 +2132,7 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi,
+ hdmi_writeb(hdmi, 0, HDMI_FC_SCRAMBLER_CTRL);
+ hdmi_writeb(hdmi, (u8)~HDMI_MC_SWRSTZ_TMDSSWRST_REQ,
+ HDMI_MC_SWRSTZ);
+- drm_scdc_set_scrambling(&hdmi->connector, 0);
++ drm_scdc_set_scrambling(hdmi->curr_conn, 0);
+ }
+ }
+
+@@ -3553,6 +3553,7 @@ struct dw_hdmi *dw_hdmi_probe(struct platform_device *pdev,
+ hdmi->bridge.ops = DRM_BRIDGE_OP_DETECT | DRM_BRIDGE_OP_EDID
+ | DRM_BRIDGE_OP_HPD;
+ hdmi->bridge.interlace_allowed = true;
++ hdmi->bridge.ddc = hdmi->ddc;
+ #ifdef CONFIG_OF
+ hdmi->bridge.of_node = pdev->dev.of_node;
+ #endif
+diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+index 4676cf2900dfd..3c8fd6ea6d6a4 100644
+--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
++++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+@@ -170,10 +170,10 @@
+ * @pwm_refclk_freq: Cache for the reference clock input to the PWM.
+ */
+ struct ti_sn65dsi86 {
+- struct auxiliary_device bridge_aux;
+- struct auxiliary_device gpio_aux;
+- struct auxiliary_device aux_aux;
+- struct auxiliary_device pwm_aux;
++ struct auxiliary_device *bridge_aux;
++ struct auxiliary_device *gpio_aux;
++ struct auxiliary_device *aux_aux;
++ struct auxiliary_device *pwm_aux;
+
+ struct device *dev;
+ struct regmap *regmap;
+@@ -468,27 +468,34 @@ static void ti_sn65dsi86_delete_aux(void *data)
+ auxiliary_device_delete(data);
+ }
+
+-/*
+- * AUX bus docs say that a non-NULL release is mandatory, but it makes no
+- * sense for the model used here where all of the aux devices are allocated
+- * in the single shared structure. We'll use this noop as a workaround.
+- */
+-static void ti_sn65dsi86_noop(struct device *dev) {}
++static void ti_sn65dsi86_aux_device_release(struct device *dev)
++{
++ struct auxiliary_device *aux = container_of(dev, struct auxiliary_device, dev);
++
++ kfree(aux);
++}
+
+ static int ti_sn65dsi86_add_aux_device(struct ti_sn65dsi86 *pdata,
+- struct auxiliary_device *aux,
++ struct auxiliary_device **aux_out,
+ const char *name)
+ {
+ struct device *dev = pdata->dev;
++ struct auxiliary_device *aux;
+ int ret;
+
++ aux = kzalloc(sizeof(*aux), GFP_KERNEL);
++ if (!aux)
++ return -ENOMEM;
++
+ aux->name = name;
+ aux->dev.parent = dev;
+- aux->dev.release = ti_sn65dsi86_noop;
++ aux->dev.release = ti_sn65dsi86_aux_device_release;
+ device_set_of_node_from_dev(&aux->dev, dev);
+ ret = auxiliary_device_init(aux);
+- if (ret)
++ if (ret) {
++ kfree(aux);
+ return ret;
++ }
+ ret = devm_add_action_or_reset(dev, ti_sn65dsi86_uninit_aux, aux);
+ if (ret)
+ return ret;
+@@ -497,6 +504,8 @@ static int ti_sn65dsi86_add_aux_device(struct ti_sn65dsi86 *pdata,
+ if (ret)
+ return ret;
+ ret = devm_add_action_or_reset(dev, ti_sn65dsi86_delete_aux, aux);
++ if (!ret)
++ *aux_out = aux;
+
+ return ret;
+ }
+diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
+index e2e21ce79510e..f854cb5eafbe7 100644
+--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
++++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
+@@ -4053,17 +4053,28 @@ out:
+ }
+
+ /**
+- * drm_dp_mst_hpd_irq() - MST hotplug IRQ notify
++ * drm_dp_mst_hpd_irq_handle_event() - MST hotplug IRQ handle MST event
+ * @mgr: manager to notify irq for.
+ * @esi: 4 bytes from SINK_COUNT_ESI
++ * @ack: 4 bytes used to ack events starting from SINK_COUNT_ESI
+ * @handled: whether the hpd interrupt was consumed or not
+ *
+- * This should be called from the driver when it detects a short IRQ,
++ * This should be called from the driver when it detects a HPD IRQ,
+ * along with the value of the DEVICE_SERVICE_IRQ_VECTOR_ESI0. The
+- * topology manager will process the sideband messages received as a result
+- * of this.
++ * topology manager will process the sideband messages received
++ * as indicated in the DEVICE_SERVICE_IRQ_VECTOR_ESI0 and set the
++ * corresponding flags that Driver has to ack the DP receiver later.
++ *
++ * Note that driver shall also call
++ * drm_dp_mst_hpd_irq_send_new_request() if the 'handled' is set
++ * after calling this function, to try to kick off a new request in
++ * the queue if the previous message transaction is completed.
++ *
++ * See also:
++ * drm_dp_mst_hpd_irq_send_new_request()
+ */
+-int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handled)
++int drm_dp_mst_hpd_irq_handle_event(struct drm_dp_mst_topology_mgr *mgr, const u8 *esi,
++ u8 *ack, bool *handled)
+ {
+ int ret = 0;
+ int sc;
+@@ -4078,18 +4089,47 @@ int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handl
+ if (esi[1] & DP_DOWN_REP_MSG_RDY) {
+ ret = drm_dp_mst_handle_down_rep(mgr);
+ *handled = true;
++ ack[1] |= DP_DOWN_REP_MSG_RDY;
+ }
+
+ if (esi[1] & DP_UP_REQ_MSG_RDY) {
+ ret |= drm_dp_mst_handle_up_req(mgr);
+ *handled = true;
++ ack[1] |= DP_UP_REQ_MSG_RDY;
+ }
+
+- drm_dp_mst_kick_tx(mgr);
+ return ret;
+ }
+-EXPORT_SYMBOL(drm_dp_mst_hpd_irq);
++EXPORT_SYMBOL(drm_dp_mst_hpd_irq_handle_event);
+
++/**
++ * drm_dp_mst_hpd_irq_send_new_request() - MST hotplug IRQ kick off new request
++ * @mgr: manager to notify irq for.
++ *
++ * This should be called from the driver when mst irq event is handled
++ * and acked. Note that new down request should only be sent when
++ * previous message transaction is completed. Source is not supposed to generate
++ * interleaved message transactions.
++ */
++void drm_dp_mst_hpd_irq_send_new_request(struct drm_dp_mst_topology_mgr *mgr)
++{
++ struct drm_dp_sideband_msg_tx *txmsg;
++ bool kick = true;
++
++ mutex_lock(&mgr->qlock);
++ txmsg = list_first_entry_or_null(&mgr->tx_msg_downq,
++ struct drm_dp_sideband_msg_tx, next);
++ /* If last transaction is not completed yet*/
++ if (!txmsg ||
++ txmsg->state == DRM_DP_SIDEBAND_TX_START_SEND ||
++ txmsg->state == DRM_DP_SIDEBAND_TX_SENT)
++ kick = false;
++ mutex_unlock(&mgr->qlock);
++
++ if (kick)
++ drm_dp_mst_kick_tx(mgr);
++}
++EXPORT_SYMBOL(drm_dp_mst_hpd_irq_send_new_request);
+ /**
+ * drm_dp_mst_detect_port() - get connection status for an MST port
+ * @connector: DRM connector for this port
+diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
+index b4c6ffc438da2..88fcc6bbc8b7b 100644
+--- a/drivers/gpu/drm/drm_atomic.c
++++ b/drivers/gpu/drm/drm_atomic.c
+@@ -140,6 +140,12 @@ drm_atomic_state_init(struct drm_device *dev, struct drm_atomic_state *state)
+ if (!state->planes)
+ goto fail;
+
++ /*
++ * Because drm_atomic_state can be committed asynchronously we need our
++ * own reference and cannot rely on the on implied by drm_file in the
++ * ioctl call.
++ */
++ drm_dev_get(dev);
+ state->dev = dev;
+
+ drm_dbg_atomic(dev, "Allocated atomic state %p\n", state);
+@@ -299,7 +305,8 @@ EXPORT_SYMBOL(drm_atomic_state_clear);
+ void __drm_atomic_state_free(struct kref *ref)
+ {
+ struct drm_atomic_state *state = container_of(ref, typeof(*state), ref);
+- struct drm_mode_config *config = &state->dev->mode_config;
++ struct drm_device *dev = state->dev;
++ struct drm_mode_config *config = &dev->mode_config;
+
+ drm_atomic_state_clear(state);
+
+@@ -311,6 +318,8 @@ void __drm_atomic_state_free(struct kref *ref)
+ drm_atomic_state_default_release(state);
+ kfree(state);
+ }
++
++ drm_dev_put(dev);
+ }
+ EXPORT_SYMBOL(__drm_atomic_state_free);
+
+diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
+index 2c2c9caf0be5e..e0ab555aad2cf 100644
+--- a/drivers/gpu/drm/drm_atomic_helper.c
++++ b/drivers/gpu/drm/drm_atomic_helper.c
+@@ -1209,7 +1209,16 @@ disable_outputs(struct drm_device *dev, struct drm_atomic_state *old_state)
+ continue;
+
+ ret = drm_crtc_vblank_get(crtc);
+- WARN_ONCE(ret != -EINVAL, "driver forgot to call drm_crtc_vblank_off()\n");
++ /*
++ * Self-refresh is not a true "disable"; ensure vblank remains
++ * enabled.
++ */
++ if (new_crtc_state->self_refresh_active)
++ WARN_ONCE(ret != 0,
++ "driver disabled vblank in self-refresh\n");
++ else
++ WARN_ONCE(ret != -EINVAL,
++ "driver forgot to call drm_crtc_vblank_off()\n");
+ if (ret == 0)
+ drm_crtc_vblank_put(crtc);
+ }
+diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
+index f6292ba0e6fc3..037e36f2049c1 100644
+--- a/drivers/gpu/drm/drm_client.c
++++ b/drivers/gpu/drm/drm_client.c
+@@ -122,13 +122,34 @@ EXPORT_SYMBOL(drm_client_init);
+ * drm_client_register() it is no longer permissible to call drm_client_release()
+ * directly (outside the unregister callback), instead cleanup will happen
+ * automatically on driver unload.
++ *
++ * Registering a client generates a hotplug event that allows the client
++ * to set up its display from pre-existing outputs. The client must have
++ * initialized its state to able to handle the hotplug event successfully.
+ */
+ void drm_client_register(struct drm_client_dev *client)
+ {
+ struct drm_device *dev = client->dev;
++ int ret;
+
+ mutex_lock(&dev->clientlist_mutex);
+ list_add(&client->list, &dev->clientlist);
++
++ if (client->funcs && client->funcs->hotplug) {
++ /*
++ * Perform an initial hotplug event to pick up the
++ * display configuration for the client. This step
++ * has to be performed *after* registering the client
++ * in the list of clients, or a concurrent hotplug
++ * event might be lost; leaving the display off.
++ *
++ * Hold the clientlist_mutex as for a regular hotplug
++ * event.
++ */
++ ret = client->funcs->hotplug(client);
++ if (ret)
++ drm_dbg_kms(dev, "client hotplug ret=%d\n", ret);
++ }
+ mutex_unlock(&dev->clientlist_mutex);
+ }
+ EXPORT_SYMBOL(drm_client_register);
+diff --git a/drivers/gpu/drm/drm_fbdev_dma.c b/drivers/gpu/drm/drm_fbdev_dma.c
+index 728deffcc0d92..9f1ebebe14690 100644
+--- a/drivers/gpu/drm/drm_fbdev_dma.c
++++ b/drivers/gpu/drm/drm_fbdev_dma.c
+@@ -218,7 +218,7 @@ static const struct drm_client_funcs drm_fbdev_dma_client_funcs = {
+ * drm_fbdev_dma_setup() - Setup fbdev emulation for GEM DMA helpers
+ * @dev: DRM device
+ * @preferred_bpp: Preferred bits per pixel for the device.
+- * @dev->mode_config.preferred_depth is used if this is zero.
++ * 32 is used if this is zero.
+ *
+ * This function sets up fbdev emulation for GEM DMA drivers that support
+ * dumb buffers with a virtual address and that can be mmap'ed.
+@@ -253,10 +253,6 @@ void drm_fbdev_dma_setup(struct drm_device *dev, unsigned int preferred_bpp)
+ goto err_drm_client_init;
+ }
+
+- ret = drm_fbdev_dma_client_hotplug(&fb_helper->client);
+- if (ret)
+- drm_dbg_kms(dev, "client hotplug ret=%d\n", ret);
+-
+ drm_client_register(&fb_helper->client);
+
+ return;
+diff --git a/drivers/gpu/drm/drm_fbdev_generic.c b/drivers/gpu/drm/drm_fbdev_generic.c
+index 8e5148bf40bbc..7e65be35477e6 100644
+--- a/drivers/gpu/drm/drm_fbdev_generic.c
++++ b/drivers/gpu/drm/drm_fbdev_generic.c
+@@ -340,10 +340,6 @@ void drm_fbdev_generic_setup(struct drm_device *dev, unsigned int preferred_bpp)
+ goto err_drm_client_init;
+ }
+
+- ret = drm_fbdev_generic_client_hotplug(&fb_helper->client);
+- if (ret)
+- drm_dbg_kms(dev, "client hotplug ret=%d\n", ret);
+-
+ drm_client_register(&fb_helper->client);
+
+ return;
+diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
+index ea4b3d248aaca..cc98168a7a5e8 100644
+--- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
++++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
+@@ -216,10 +216,6 @@ void exynos_drm_fbdev_setup(struct drm_device *dev)
+ if (ret)
+ goto err_drm_client_init;
+
+- ret = exynos_drm_fbdev_client_hotplug(&fb_helper->client);
+- if (ret)
+- drm_dbg_kms(dev, "client hotplug ret=%d\n", ret);
+-
+ drm_client_register(&fb_helper->client);
+
+ return;
+diff --git a/drivers/gpu/drm/gma500/fbdev.c b/drivers/gpu/drm/gma500/fbdev.c
+index 62287407e7173..ba193c5bb35ef 100644
+--- a/drivers/gpu/drm/gma500/fbdev.c
++++ b/drivers/gpu/drm/gma500/fbdev.c
+@@ -330,10 +330,6 @@ void psb_fbdev_setup(struct drm_psb_private *dev_priv)
+ goto err_drm_fb_helper_unprepare;
+ }
+
+- ret = psb_fbdev_client_hotplug(&fb_helper->client);
+- if (ret)
+- drm_dbg_kms(dev, "client hotplug ret=%d\n", ret);
+-
+ drm_client_register(&fb_helper->client);
+
+ return;
+diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
+index 7749f95d5d02a..a805b57f3d912 100644
+--- a/drivers/gpu/drm/i915/display/intel_display.c
++++ b/drivers/gpu/drm/i915/display/intel_display.c
+@@ -4968,7 +4968,6 @@ copy_bigjoiner_crtc_state_modeset(struct intel_atomic_state *state,
+ saved_state->uapi = slave_crtc_state->uapi;
+ saved_state->scaler_state = slave_crtc_state->scaler_state;
+ saved_state->shared_dpll = slave_crtc_state->shared_dpll;
+- saved_state->dpll_hw_state = slave_crtc_state->dpll_hw_state;
+ saved_state->crc_enabled = slave_crtc_state->crc_enabled;
+
+ intel_crtc_free_hw_state(slave_crtc_state);
+diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
+index 529ee22be872e..6ecfffbe90019 100644
+--- a/drivers/gpu/drm/i915/display/intel_dp.c
++++ b/drivers/gpu/drm/i915/display/intel_dp.c
+@@ -3940,9 +3940,7 @@ intel_dp_mst_hpd_irq(struct intel_dp *intel_dp, u8 *esi, u8 *ack)
+ {
+ bool handled = false;
+
+- drm_dp_mst_hpd_irq(&intel_dp->mst_mgr, esi, &handled);
+- if (handled)
+- ack[1] |= esi[1] & (DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);
++ drm_dp_mst_hpd_irq_handle_event(&intel_dp->mst_mgr, esi, ack, &handled);
+
+ if (esi[1] & DP_CP_IRQ) {
+ intel_hdcp_handle_cp_irq(intel_dp->attached_connector);
+@@ -4017,6 +4015,9 @@ intel_dp_check_mst_status(struct intel_dp *intel_dp)
+
+ if (!intel_dp_ack_sink_irq_esi(intel_dp, ack))
+ drm_dbg_kms(&i915->drm, "Failed to ack ESI\n");
++
++ if (ack[1] & (DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY))
++ drm_dp_mst_hpd_irq_send_new_request(&intel_dp->mst_mgr);
+ }
+
+ return link_ok;
+diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
+index 4f436ba7a3c83..123b82f29a1bf 100644
+--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
++++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
+@@ -625,7 +625,7 @@ __vm_create_scratch_for_read(struct i915_address_space *vm, unsigned long size)
+ if (IS_ERR(obj))
+ return ERR_CAST(obj);
+
+- i915_gem_object_set_cache_coherency(obj, I915_CACHING_CACHED);
++ i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
+
+ vma = i915_vma_instance(obj, vm, NULL);
+ if (IS_ERR(vma)) {
+diff --git a/drivers/gpu/drm/msm/msm_fbdev.c b/drivers/gpu/drm/msm/msm_fbdev.c
+index 2ebc86381e1c9..c082646c3c0ba 100644
+--- a/drivers/gpu/drm/msm/msm_fbdev.c
++++ b/drivers/gpu/drm/msm/msm_fbdev.c
+@@ -227,10 +227,6 @@ void msm_fbdev_setup(struct drm_device *dev)
+ goto err_drm_fb_helper_unprepare;
+ }
+
+- ret = msm_fbdev_client_hotplug(&helper->client);
+- if (ret)
+- drm_dbg_kms(dev, "client hotplug ret=%d\n", ret);
+-
+ drm_client_register(&helper->client);
+
+ return;
+diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
+index 9b6824f6b9e4b..42e1665ba11a3 100644
+--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
++++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
+@@ -1359,22 +1359,26 @@ nv50_mstm_service(struct nouveau_drm *drm,
+ u8 esi[8] = {};
+
+ while (handled) {
++ u8 ack[8] = {};
++
+ rc = drm_dp_dpcd_read(aux, DP_SINK_COUNT_ESI, esi, 8);
+ if (rc != 8) {
+ ret = false;
+ break;
+ }
+
+- drm_dp_mst_hpd_irq(&mstm->mgr, esi, &handled);
++ drm_dp_mst_hpd_irq_handle_event(&mstm->mgr, esi, ack, &handled);
+ if (!handled)
+ break;
+
+- rc = drm_dp_dpcd_write(aux, DP_SINK_COUNT_ESI + 1, &esi[1],
+- 3);
+- if (rc != 3) {
++ rc = drm_dp_dpcd_writeb(aux, DP_SINK_COUNT_ESI + 1, ack[1]);
++
++ if (rc != 1) {
+ ret = false;
+ break;
+ }
++
++ drm_dp_mst_hpd_irq_send_new_request(&mstm->mgr);
+ }
+
+ if (!ret)
+diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.c b/drivers/gpu/drm/nouveau/nouveau_chan.c
+index e648ecd0c1a03..3dfbc374478e6 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_chan.c
++++ b/drivers/gpu/drm/nouveau/nouveau_chan.c
+@@ -90,6 +90,7 @@ nouveau_channel_del(struct nouveau_channel **pchan)
+ if (cli)
+ nouveau_svmm_part(chan->vmm->svmm, chan->inst);
+
++ nvif_object_dtor(&chan->blit);
+ nvif_object_dtor(&chan->nvsw);
+ nvif_object_dtor(&chan->gart);
+ nvif_object_dtor(&chan->vram);
+diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.h b/drivers/gpu/drm/nouveau/nouveau_chan.h
+index e06a8ffed31a8..bad7466bd0d59 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_chan.h
++++ b/drivers/gpu/drm/nouveau/nouveau_chan.h
+@@ -53,6 +53,7 @@ struct nouveau_channel {
+ u32 user_put;
+
+ struct nvif_object user;
++ struct nvif_object blit;
+
+ struct nvif_event kill;
+ atomic_t killed;
+diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
+index 7aac9384600ed..40fb9a8349180 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
++++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
+@@ -375,15 +375,29 @@ nouveau_accel_gr_init(struct nouveau_drm *drm)
+ ret = nvif_object_ctor(&drm->channel->user, "drmNvsw",
+ NVDRM_NVSW, nouveau_abi16_swclass(drm),
+ NULL, 0, &drm->channel->nvsw);
++
++ if (ret == 0 && device->info.chipset >= 0x11) {
++ ret = nvif_object_ctor(&drm->channel->user, "drmBlit",
++ 0x005f, 0x009f,
++ NULL, 0, &drm->channel->blit);
++ }
++
+ if (ret == 0) {
+ struct nvif_push *push = drm->channel->chan.push;
+- ret = PUSH_WAIT(push, 2);
+- if (ret == 0)
++ ret = PUSH_WAIT(push, 8);
++ if (ret == 0) {
++ if (device->info.chipset >= 0x11) {
++ PUSH_NVSQ(push, NV05F, 0x0000, drm->channel->blit.handle);
++ PUSH_NVSQ(push, NV09F, 0x0120, 0,
++ 0x0124, 1,
++ 0x0128, 2);
++ }
+ PUSH_NVSQ(push, NV_SW, 0x0000, drm->channel->nvsw.handle);
++ }
+ }
+
+ if (ret) {
+- NV_ERROR(drm, "failed to allocate sw class, %d\n", ret);
++ NV_ERROR(drm, "failed to allocate sw or blit class, %d\n", ret);
+ nouveau_accel_gr_fini(drm);
+ return;
+ }
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c
+index a4853c4e5ee3a..67ef889a0c5f4 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/g94.c
+@@ -295,6 +295,7 @@ g94_sor = {
+ .clock = nv50_sor_clock,
+ .war_2 = g94_sor_war_2,
+ .war_3 = g94_sor_war_3,
++ .hdmi = &g84_sor_hdmi,
+ .dp = &g94_sor_dp,
+ };
+
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c
+index a2c7c6f83dcdb..506ffbe7b8421 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/gt215.c
+@@ -125,7 +125,7 @@ gt215_sor_hdmi_infoframe_avi(struct nvkm_ior *ior, int head, void *data, u32 siz
+ pack_hdmi_infoframe(&avi, data, size);
+
+ nvkm_mask(device, 0x61c520 + soff, 0x00000001, 0x00000000);
+- if (size)
++ if (!size)
+ return;
+
+ nvkm_wr32(device, 0x61c528 + soff, avi.header);
+diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c
+index 795f3a649b122..9b8ca4e898f90 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c
++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/base.c
+@@ -224,7 +224,7 @@ nvkm_acr_oneinit(struct nvkm_subdev *subdev)
+ u64 falcons;
+ int ret, i;
+
+- if (list_empty(&acr->hsfw)) {
++ if (list_empty(&acr->hsfw) || !acr->func || !acr->func->wpr_layout) {
+ nvkm_debug(subdev, "No HSFW(s)\n");
+ nvkm_acr_cleanup(acr);
+ return 0;
+diff --git a/drivers/gpu/drm/omapdrm/omap_fbdev.c b/drivers/gpu/drm/omapdrm/omap_fbdev.c
+index b950e93b3846a..02cec22b9749e 100644
+--- a/drivers/gpu/drm/omapdrm/omap_fbdev.c
++++ b/drivers/gpu/drm/omapdrm/omap_fbdev.c
+@@ -323,10 +323,6 @@ void omap_fbdev_setup(struct drm_device *dev)
+
+ INIT_WORK(&fbdev->work, pan_worker);
+
+- ret = omap_fbdev_client_hotplug(&helper->client);
+- if (ret)
+- drm_dbg_kms(dev, "client hotplug ret=%d\n", ret);
+-
+ drm_client_register(&helper->client);
+
+ return;
+diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
+index d8efbcee9bc12..e02249b212c2a 100644
+--- a/drivers/gpu/drm/panel/panel-simple.c
++++ b/drivers/gpu/drm/panel/panel-simple.c
+@@ -2117,6 +2117,7 @@ static const struct panel_desc innolux_at043tn24 = {
+ .height = 54,
+ },
+ .bus_format = MEDIA_BUS_FMT_RGB888_1X24,
++ .connector_type = DRM_MODE_CONNECTOR_DPI,
+ .bus_flags = DRM_BUS_FLAG_DE_HIGH | DRM_BUS_FLAG_PIXDATA_DRIVE_POSEDGE,
+ };
+
+@@ -3109,6 +3110,7 @@ static const struct drm_display_mode powertip_ph800480t013_idf02_mode = {
+ .vsync_start = 480 + 49,
+ .vsync_end = 480 + 49 + 2,
+ .vtotal = 480 + 49 + 2 + 22,
++ .flags = DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC,
+ };
+
+ static const struct panel_desc powertip_ph800480t013_idf02 = {
+diff --git a/drivers/gpu/drm/radeon/radeon_fbdev.c b/drivers/gpu/drm/radeon/radeon_fbdev.c
+index 8f6c3aef09628..8b93e6e5d2ffb 100644
+--- a/drivers/gpu/drm/radeon/radeon_fbdev.c
++++ b/drivers/gpu/drm/radeon/radeon_fbdev.c
+@@ -386,10 +386,6 @@ void radeon_fbdev_setup(struct radeon_device *rdev)
+ goto err_drm_client_init;
+ }
+
+- ret = radeon_fbdev_client_hotplug(&fb_helper->client);
+- if (ret)
+- drm_dbg_kms(rdev->ddev, "client hotplug ret=%d\n", ret);
+-
+ drm_client_register(&fb_helper->client);
+
+ return;
+diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+index d8f5e064a1baa..a530ecc4d207c 100644
+--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
++++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+@@ -714,13 +714,13 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc,
+ if (crtc->state->self_refresh_active)
+ rockchip_drm_set_win_enabled(crtc, false);
+
++ if (crtc->state->self_refresh_active)
++ goto out;
++
+ mutex_lock(&vop->vop_lock);
+
+ drm_crtc_vblank_off(crtc);
+
+- if (crtc->state->self_refresh_active)
+- goto out;
+-
+ /*
+ * Vop standby will take effect at end of current frame,
+ * if dsp hold valid irq happen, it means standby complete.
+@@ -754,9 +754,9 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc,
+ vop_core_clks_disable(vop);
+ pm_runtime_put(vop->dev);
+
+-out:
+ mutex_unlock(&vop->vop_lock);
+
++out:
+ if (crtc->state->event && !crtc->state->active) {
+ spin_lock_irq(&crtc->dev->event_lock);
+ drm_crtc_send_vblank_event(crtc, crtc->state->event);
+diff --git a/drivers/gpu/drm/tegra/fbdev.c b/drivers/gpu/drm/tegra/fbdev.c
+index dca9eccae466b..d527b0b9de1df 100644
+--- a/drivers/gpu/drm/tegra/fbdev.c
++++ b/drivers/gpu/drm/tegra/fbdev.c
+@@ -227,10 +227,6 @@ void tegra_fbdev_setup(struct drm_device *dev)
+ if (ret)
+ goto err_drm_client_init;
+
+- ret = tegra_fbdev_client_hotplug(&helper->client);
+- if (ret)
+- drm_dbg_kms(dev, "client hotplug ret=%d\n", ret);
+-
+ drm_client_register(&helper->client);
+
+ return;
+diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
+index bd5dae4d16249..1a1cfd675cc46 100644
+--- a/drivers/gpu/drm/ttm/ttm_bo.c
++++ b/drivers/gpu/drm/ttm/ttm_bo.c
+@@ -458,18 +458,18 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
+ goto out;
+ }
+
+-bounce:
+- ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop);
+- if (ret == -EMULTIHOP) {
++ do {
++ ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop);
++ if (ret != -EMULTIHOP)
++ break;
++
+ ret = ttm_bo_bounce_temp_buffer(bo, &evict_mem, ctx, &hop);
+- if (ret) {
+- if (ret != -ERESTARTSYS && ret != -EINTR)
+- pr_err("Buffer eviction failed\n");
+- ttm_resource_free(bo, &evict_mem);
+- goto out;
+- }
+- /* try and move to final place now. */
+- goto bounce;
++ } while (!ret);
++
++ if (ret) {
++ ttm_resource_free(bo, &evict_mem);
++ if (ret != -ERESTARTSYS && ret != -EINTR)
++ pr_err("Buffer eviction failed\n");
+ }
+ out:
+ return ret;
+@@ -1167,6 +1167,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
+ ret = ttm_bo_handle_move_mem(bo, evict_mem, true, &ctx, &hop);
+ if (unlikely(ret != 0)) {
+ WARN(ret == -EMULTIHOP, "Unexpected multihop in swaput - likely driver bug.\n");
++ ttm_resource_free(bo, &evict_mem);
+ goto out;
+ }
+ }
+diff --git a/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c b/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c
+index 6f0d332ccf51c..06bdcf072d10c 100644
+--- a/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c
++++ b/drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c
+@@ -132,29 +132,45 @@ static void get_common_inputs(struct common_input_property *common, int report_i
+ common->event_type = HID_USAGE_SENSOR_EVENT_DATA_UPDATED_ENUM;
+ }
+
+-static int float_to_int(u32 float32)
++static int float_to_int(u32 flt32_val)
+ {
+ int fraction, shift, mantissa, sign, exp, zeropre;
+
+- mantissa = float32 & GENMASK(22, 0);
+- sign = (float32 & BIT(31)) ? -1 : 1;
+- exp = (float32 & ~BIT(31)) >> 23;
++ mantissa = flt32_val & GENMASK(22, 0);
++ sign = (flt32_val & BIT(31)) ? -1 : 1;
++ exp = (flt32_val & ~BIT(31)) >> 23;
+
+ if (!exp && !mantissa)
+ return 0;
+
++ /*
++ * Calculate the exponent and fraction part of floating
++ * point representation.
++ */
+ exp -= 127;
+ if (exp < 0) {
+ exp = -exp;
++ if (exp >= BITS_PER_TYPE(u32))
++ return 0;
+ zeropre = (((BIT(23) + mantissa) * 100) >> 23) >> exp;
+ return zeropre >= 50 ? sign : 0;
+ }
+
+ shift = 23 - exp;
+- float32 = BIT(exp) + (mantissa >> shift);
+- fraction = mantissa & GENMASK(shift - 1, 0);
++ if (abs(shift) >= BITS_PER_TYPE(u32))
++ return 0;
++
++ if (shift < 0) {
++ shift = -shift;
++ flt32_val = BIT(exp) + (mantissa << shift);
++ shift = 0;
++ } else {
++ flt32_val = BIT(exp) + (mantissa >> shift);
++ }
++
++ fraction = (shift == 0) ? 0 : mantissa & GENMASK(shift - 1, 0);
+
+- return (((fraction * 100) >> shift) >= 50) ? sign * (float32 + 1) : sign * float32;
++ return (((fraction * 100) >> shift) >= 50) ? sign * (flt32_val + 1) : sign * flt32_val;
+ }
+
+ static u8 get_input_rep(u8 current_index, int sensor_idx, int report_id,
+diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c
+index 49d4a26895e76..f33485d83d24f 100644
+--- a/drivers/hid/hid-hyperv.c
++++ b/drivers/hid/hid-hyperv.c
+@@ -258,19 +258,17 @@ static void mousevsc_on_receive(struct hv_device *device,
+
+ switch (hid_msg_hdr->type) {
+ case SYNTH_HID_PROTOCOL_RESPONSE:
++ len = struct_size(pipe_msg, data, pipe_msg->size);
++
+ /*
+ * While it will be impossible for us to protect against
+ * malicious/buggy hypervisor/host, add a check here to
+ * ensure we don't corrupt memory.
+ */
+- if (struct_size(pipe_msg, data, pipe_msg->size)
+- > sizeof(struct mousevsc_prt_msg)) {
+- WARN_ON(1);
++ if (WARN_ON(len > sizeof(struct mousevsc_prt_msg)))
+ break;
+- }
+
+- memcpy(&input_dev->protocol_resp, pipe_msg,
+- struct_size(pipe_msg, data, pipe_msg->size));
++ memcpy(&input_dev->protocol_resp, pipe_msg, len);
+ complete(&input_dev->wait_event);
+ break;
+
+diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
+index a1d2690a1a0de..851ee86eff32a 100644
+--- a/drivers/hid/hid-input.c
++++ b/drivers/hid/hid-input.c
+@@ -1093,6 +1093,10 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel
+ case 0x074: map_key_clear(KEY_BRIGHTNESS_MAX); break;
+ case 0x075: map_key_clear(KEY_BRIGHTNESS_AUTO); break;
+
++ case 0x076: map_key_clear(KEY_CAMERA_ACCESS_ENABLE); break;
++ case 0x077: map_key_clear(KEY_CAMERA_ACCESS_DISABLE); break;
++ case 0x078: map_key_clear(KEY_CAMERA_ACCESS_TOGGLE); break;
++
+ case 0x079: map_key_clear(KEY_KBDILLUMUP); break;
+ case 0x07a: map_key_clear(KEY_KBDILLUMDOWN); break;
+ case 0x07c: map_key_clear(KEY_KBDILLUMTOGGLE); break;
+@@ -1139,9 +1143,6 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel
+ case 0x0cd: map_key_clear(KEY_PLAYPAUSE); break;
+ case 0x0cf: map_key_clear(KEY_VOICECOMMAND); break;
+
+- case 0x0d5: map_key_clear(KEY_CAMERA_ACCESS_ENABLE); break;
+- case 0x0d6: map_key_clear(KEY_CAMERA_ACCESS_DISABLE); break;
+- case 0x0d7: map_key_clear(KEY_CAMERA_ACCESS_TOGGLE); break;
+ case 0x0d8: map_key_clear(KEY_DICTATE); break;
+ case 0x0d9: map_key_clear(KEY_EMOJI_PICKER); break;
+
+diff --git a/drivers/iio/adc/meson_saradc.c b/drivers/iio/adc/meson_saradc.c
+index 18937a262af6f..af6bfcc190752 100644
+--- a/drivers/iio/adc/meson_saradc.c
++++ b/drivers/iio/adc/meson_saradc.c
+@@ -72,7 +72,7 @@
+ #define MESON_SAR_ADC_REG3_PANEL_DETECT_COUNT_MASK GENMASK(20, 18)
+ #define MESON_SAR_ADC_REG3_PANEL_DETECT_FILTER_TB_MASK GENMASK(17, 16)
+ #define MESON_SAR_ADC_REG3_ADC_CLK_DIV_SHIFT 10
+- #define MESON_SAR_ADC_REG3_ADC_CLK_DIV_WIDTH 5
++ #define MESON_SAR_ADC_REG3_ADC_CLK_DIV_WIDTH 6
+ #define MESON_SAR_ADC_REG3_BLOCK_DLY_SEL_MASK GENMASK(9, 8)
+ #define MESON_SAR_ADC_REG3_BLOCK_DLY_MASK GENMASK(7, 0)
+
+diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
+index 31838b13ea543..a72bac9c5d10d 100644
+--- a/drivers/md/dm-integrity.c
++++ b/drivers/md/dm-integrity.c
+@@ -34,11 +34,11 @@
+ #define DEFAULT_BUFFER_SECTORS 128
+ #define DEFAULT_JOURNAL_WATERMARK 50
+ #define DEFAULT_SYNC_MSEC 10000
+-#define DEFAULT_MAX_JOURNAL_SECTORS 131072
++#define DEFAULT_MAX_JOURNAL_SECTORS (IS_ENABLED(CONFIG_64BIT) ? 131072 : 8192)
+ #define MIN_LOG2_INTERLEAVE_SECTORS 3
+ #define MAX_LOG2_INTERLEAVE_SECTORS 31
+ #define METADATA_WORKQUEUE_MAX_ACTIVE 16
+-#define RECALC_SECTORS 32768
++#define RECALC_SECTORS (IS_ENABLED(CONFIG_64BIT) ? 32768 : 2048)
+ #define RECALC_WRITE_SUPER 16
+ #define BITMAP_BLOCK_SIZE 4096 /* don't change it */
+ #define BITMAP_FLUSH_INTERVAL (10 * HZ)
+diff --git a/drivers/md/dm-verity-loadpin.c b/drivers/md/dm-verity-loadpin.c
+index 4f78cc55c2514..0666699b68581 100644
+--- a/drivers/md/dm-verity-loadpin.c
++++ b/drivers/md/dm-verity-loadpin.c
+@@ -58,6 +58,9 @@ bool dm_verity_loadpin_is_bdev_trusted(struct block_device *bdev)
+ int srcu_idx;
+ bool trusted = false;
+
++ if (bdev == NULL)
++ return false;
++
+ if (list_empty(&dm_verity_loadpin_trusted_root_digests))
+ return false;
+
+diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
+index f8ee9a95e25d5..d1ac73fcd8529 100644
+--- a/drivers/md/raid0.c
++++ b/drivers/md/raid0.c
+@@ -270,6 +270,18 @@ static int create_strip_zones(struct mddev *mddev, struct r0conf **private_conf)
+ goto abort;
+ }
+
++ if (conf->layout == RAID0_ORIG_LAYOUT) {
++ for (i = 1; i < conf->nr_strip_zones; i++) {
++ sector_t first_sector = conf->strip_zone[i-1].zone_end;
++
++ sector_div(first_sector, mddev->chunk_sectors);
++ zone = conf->strip_zone + i;
++ /* disk_shift is first disk index used in the zone */
++ zone->disk_shift = sector_div(first_sector,
++ zone->nb_dev);
++ }
++ }
++
+ pr_debug("md/raid0:%s: done.\n", mdname(mddev));
+ *private_conf = conf;
+
+@@ -431,6 +443,20 @@ exit_acct_set:
+ return ret;
+ }
+
++/*
++ * Convert disk_index to the disk order in which it is read/written.
++ * For example, if we have 4 disks, they are numbered 0,1,2,3. If we
++ * write the disks starting at disk 3, then the read/write order would
++ * be disk 3, then 0, then 1, and then disk 2 and we want map_disk_shift()
++ * to map the disks as follows 0,1,2,3 => 1,2,3,0. So disk 0 would map
++ * to 1, 1 to 2, 2 to 3, and 3 to 0. That way we can compare disks in
++ * that 'output' space to understand the read/write disk ordering.
++ */
++static int map_disk_shift(int disk_index, int num_disks, int disk_shift)
++{
++ return ((disk_index + num_disks - disk_shift) % num_disks);
++}
++
+ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
+ {
+ struct r0conf *conf = mddev->private;
+@@ -444,7 +470,9 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
+ sector_t end_disk_offset;
+ unsigned int end_disk_index;
+ unsigned int disk;
++ sector_t orig_start, orig_end;
+
++ orig_start = start;
+ zone = find_zone(conf, &start);
+
+ if (bio_end_sector(bio) > zone->zone_end) {
+@@ -458,6 +486,7 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
+ } else
+ end = bio_end_sector(bio);
+
++ orig_end = end;
+ if (zone != conf->strip_zone)
+ end = end - zone[-1].zone_end;
+
+@@ -469,13 +498,26 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
+ last_stripe_index = end;
+ sector_div(last_stripe_index, stripe_size);
+
+- start_disk_index = (int)(start - first_stripe_index * stripe_size) /
+- mddev->chunk_sectors;
++ /* In the first zone the original and alternate layouts are the same */
++ if ((conf->layout == RAID0_ORIG_LAYOUT) && (zone != conf->strip_zone)) {
++ sector_div(orig_start, mddev->chunk_sectors);
++ start_disk_index = sector_div(orig_start, zone->nb_dev);
++ start_disk_index = map_disk_shift(start_disk_index,
++ zone->nb_dev,
++ zone->disk_shift);
++ sector_div(orig_end, mddev->chunk_sectors);
++ end_disk_index = sector_div(orig_end, zone->nb_dev);
++ end_disk_index = map_disk_shift(end_disk_index,
++ zone->nb_dev, zone->disk_shift);
++ } else {
++ start_disk_index = (int)(start - first_stripe_index * stripe_size) /
++ mddev->chunk_sectors;
++ end_disk_index = (int)(end - last_stripe_index * stripe_size) /
++ mddev->chunk_sectors;
++ }
+ start_disk_offset = ((int)(start - first_stripe_index * stripe_size) %
+ mddev->chunk_sectors) +
+ first_stripe_index * mddev->chunk_sectors;
+- end_disk_index = (int)(end - last_stripe_index * stripe_size) /
+- mddev->chunk_sectors;
+ end_disk_offset = ((int)(end - last_stripe_index * stripe_size) %
+ mddev->chunk_sectors) +
+ last_stripe_index * mddev->chunk_sectors;
+@@ -483,18 +525,22 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
+ for (disk = 0; disk < zone->nb_dev; disk++) {
+ sector_t dev_start, dev_end;
+ struct md_rdev *rdev;
++ int compare_disk;
++
++ compare_disk = map_disk_shift(disk, zone->nb_dev,
++ zone->disk_shift);
+
+- if (disk < start_disk_index)
++ if (compare_disk < start_disk_index)
+ dev_start = (first_stripe_index + 1) *
+ mddev->chunk_sectors;
+- else if (disk > start_disk_index)
++ else if (compare_disk > start_disk_index)
+ dev_start = first_stripe_index * mddev->chunk_sectors;
+ else
+ dev_start = start_disk_offset;
+
+- if (disk < end_disk_index)
++ if (compare_disk < end_disk_index)
+ dev_end = (last_stripe_index + 1) * mddev->chunk_sectors;
+- else if (disk > end_disk_index)
++ else if (compare_disk > end_disk_index)
+ dev_end = last_stripe_index * mddev->chunk_sectors;
+ else
+ dev_end = end_disk_offset;
+diff --git a/drivers/md/raid0.h b/drivers/md/raid0.h
+index 3816e5477db1e..8cc761ca74230 100644
+--- a/drivers/md/raid0.h
++++ b/drivers/md/raid0.h
+@@ -6,6 +6,7 @@ struct strip_zone {
+ sector_t zone_end; /* Start of the next zone (in sectors) */
+ sector_t dev_start; /* Zone offset in real dev (in sectors) */
+ int nb_dev; /* # of devices attached to the zone */
++ int disk_shift; /* start disk for the original layout */
+ };
+
+ /* Linux 3.14 (20d0189b101) made an unintended change to
+diff --git a/drivers/mfd/qcom-pm8008.c b/drivers/mfd/qcom-pm8008.c
+index 837246aab4ace..29ec3901564bf 100644
+--- a/drivers/mfd/qcom-pm8008.c
++++ b/drivers/mfd/qcom-pm8008.c
+@@ -199,6 +199,7 @@ static const struct of_device_id pm8008_match[] = {
+ { .compatible = "qcom,pm8008", },
+ { },
+ };
++MODULE_DEVICE_TABLE(of, pm8008_match);
+
+ static struct i2c_driver pm8008_mfd_driver = {
+ .driver = {
+diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
+index 9051551d99373..9666d28037e18 100644
+--- a/drivers/misc/fastrpc.c
++++ b/drivers/misc/fastrpc.c
+@@ -1437,7 +1437,7 @@ static int fastrpc_init_create_process(struct fastrpc_user *fl,
+
+ sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE, 4, 0);
+ if (init.attrs)
+- sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE_ATTR, 6, 0);
++ sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE_ATTR, 4, 0);
+
+ err = fastrpc_internal_invoke(fl, true, FASTRPC_INIT_HANDLE,
+ sc, args);
+diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
+index a7244de081ec9..24efe3b88a1f0 100644
+--- a/drivers/misc/pci_endpoint_test.c
++++ b/drivers/misc/pci_endpoint_test.c
+@@ -729,6 +729,10 @@ static long pci_endpoint_test_ioctl(struct file *file, unsigned int cmd,
+ struct pci_dev *pdev = test->pdev;
+
+ mutex_lock(&test->mutex);
++
++ reinit_completion(&test->irq_raised);
++ test->last_irq = -ENODATA;
++
+ switch (cmd) {
+ case PCITEST_BAR:
+ bar = arg;
+@@ -938,6 +942,9 @@ static void pci_endpoint_test_remove(struct pci_dev *pdev)
+ if (id < 0)
+ return;
+
++ pci_endpoint_test_release_irq(test);
++ pci_endpoint_test_free_irq_vectors(test);
++
+ misc_deregister(&test->miscdev);
+ kfree(misc_device->name);
+ kfree(test->name);
+@@ -947,9 +954,6 @@ static void pci_endpoint_test_remove(struct pci_dev *pdev)
+ pci_iounmap(pdev, test->bar[bar]);
+ }
+
+- pci_endpoint_test_release_irq(test);
+- pci_endpoint_test_free_irq_vectors(test);
+-
+ pci_release_regions(pdev);
+ pci_disable_device(pdev);
+ }
+diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
+index 1feea7d822520..4efb96e4e1c7a 100644
+--- a/drivers/mtd/nand/raw/meson_nand.c
++++ b/drivers/mtd/nand/raw/meson_nand.c
+@@ -76,6 +76,7 @@
+ #define GENCMDIADDRH(aih, addr) ((aih) | (((addr) >> 16) & 0xffff))
+
+ #define DMA_DIR(dir) ((dir) ? NFC_CMD_N2M : NFC_CMD_M2N)
++#define DMA_ADDR_ALIGN 8
+
+ #define ECC_CHECK_RETURN_FF (-1)
+
+@@ -842,6 +843,9 @@ static int meson_nfc_read_oob(struct nand_chip *nand, int page)
+
+ static bool meson_nfc_is_buffer_dma_safe(const void *buffer)
+ {
++ if ((uintptr_t)buffer % DMA_ADDR_ALIGN)
++ return false;
++
+ if (virt_addr_valid(buffer) && (!object_is_on_stack(buffer)))
+ return true;
+ return false;
+diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c
+index 595a548bb0a80..af50001ccdd4e 100644
+--- a/drivers/net/dsa/hirschmann/hellcreek.c
++++ b/drivers/net/dsa/hirschmann/hellcreek.c
+@@ -1885,13 +1885,17 @@ static int hellcreek_port_setup_tc(struct dsa_switch *ds, int port,
+ case TC_SETUP_QDISC_TAPRIO: {
+ struct tc_taprio_qopt_offload *taprio = type_data;
+
+- if (!hellcreek_validate_schedule(hellcreek, taprio))
+- return -EOPNOTSUPP;
++ switch (taprio->cmd) {
++ case TAPRIO_CMD_REPLACE:
++ if (!hellcreek_validate_schedule(hellcreek, taprio))
++ return -EOPNOTSUPP;
+
+- if (taprio->enable)
+ return hellcreek_port_set_schedule(ds, port, taprio);
+-
+- return hellcreek_port_del_schedule(ds, port);
++ case TAPRIO_CMD_DESTROY:
++ return hellcreek_port_del_schedule(ds, port);
++ default:
++ return -EOPNOTSUPP;
++ }
+ }
+ default:
+ return -EOPNOTSUPP;
+diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
+index 70c0e2b1936b3..d78b4bd4787e8 100644
+--- a/drivers/net/dsa/ocelot/felix.c
++++ b/drivers/net/dsa/ocelot/felix.c
+@@ -1286,7 +1286,6 @@ static int felix_parse_ports_node(struct felix *felix,
+ if (err < 0) {
+ dev_info(dev, "Unsupported PHY mode %s on port %d\n",
+ phy_modes(phy_mode), port);
+- of_node_put(child);
+
+ /* Leave port_phy_modes[port] = 0, which is also
+ * PHY_INTERFACE_MODE_NA. This will perform a
+@@ -1786,14 +1785,13 @@ static int felix_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
+ {
+ struct ocelot *ocelot = ds->priv;
+ struct ocelot_port *ocelot_port = ocelot->ports[port];
+- struct felix *felix = ocelot_to_felix(ocelot);
+
+ ocelot_port_set_maxlen(ocelot, port, new_mtu);
+
+ mutex_lock(&ocelot->tas_lock);
+
+- if (ocelot_port->taprio && felix->info->tas_guard_bands_update)
+- felix->info->tas_guard_bands_update(ocelot, port);
++ if (ocelot_port->taprio && ocelot->ops->tas_guard_bands_update)
++ ocelot->ops->tas_guard_bands_update(ocelot, port);
+
+ mutex_unlock(&ocelot->tas_lock);
+
+diff --git a/drivers/net/dsa/ocelot/felix.h b/drivers/net/dsa/ocelot/felix.h
+index 96008c046da53..1d4befe7cfe8e 100644
+--- a/drivers/net/dsa/ocelot/felix.h
++++ b/drivers/net/dsa/ocelot/felix.h
+@@ -57,7 +57,6 @@ struct felix_info {
+ void (*mdio_bus_free)(struct ocelot *ocelot);
+ int (*port_setup_tc)(struct dsa_switch *ds, int port,
+ enum tc_setup_type type, void *type_data);
+- void (*tas_guard_bands_update)(struct ocelot *ocelot, int port);
+ void (*port_sched_speed_set)(struct ocelot *ocelot, int port,
+ u32 speed);
+ void (*phylink_mac_config)(struct ocelot *ocelot, int port,
+diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
+index d172a3e9736c4..ca69973ae91b9 100644
+--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
++++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
+@@ -1221,11 +1221,13 @@ static u32 vsc9959_tas_tc_max_sdu(struct tc_taprio_qopt_offload *taprio, int tc)
+ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
+ {
+ struct ocelot_port *ocelot_port = ocelot->ports[port];
++ struct ocelot_mm_state *mm = &ocelot->mm[port];
+ struct tc_taprio_qopt_offload *taprio;
+ u64 min_gate_len[OCELOT_NUM_TC];
++ u32 val, maxlen, add_frag_size;
++ u64 needed_min_frag_time_ps;
+ int speed, picos_per_byte;
+ u64 needed_bit_time_ps;
+- u32 val, maxlen;
+ u8 tas_speed;
+ int tc;
+
+@@ -1265,9 +1267,18 @@ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
+ */
+ needed_bit_time_ps = (u64)(maxlen + 24) * picos_per_byte;
+
++ /* Preemptible TCs don't need to pass a full MTU, the port will
++ * automatically emit a HOLD request when a preemptible TC gate closes
++ */
++ val = ocelot_read_rix(ocelot, QSYS_PREEMPTION_CFG, port);
++ add_frag_size = QSYS_PREEMPTION_CFG_MM_ADD_FRAG_SIZE_X(val);
++ needed_min_frag_time_ps = picos_per_byte *
++ (u64)(24 + 2 * ethtool_mm_frag_size_add_to_min(add_frag_size));
++
+ dev_dbg(ocelot->dev,
+- "port %d: max frame size %d needs %llu ps at speed %d\n",
+- port, maxlen, needed_bit_time_ps, speed);
++ "port %d: max frame size %d needs %llu ps, %llu ps for mPackets at speed %d\n",
++ port, maxlen, needed_bit_time_ps, needed_min_frag_time_ps,
++ speed);
+
+ vsc9959_tas_min_gate_lengths(taprio, min_gate_len);
+
+@@ -1281,7 +1292,9 @@ static void vsc9959_tas_guard_bands_update(struct ocelot *ocelot, int port)
+ remaining_gate_len_ps =
+ vsc9959_tas_remaining_gate_len_ps(min_gate_len[tc]);
+
+- if (remaining_gate_len_ps > needed_bit_time_ps) {
++ if ((mm->active_preemptible_tcs & BIT(tc)) ?
++ remaining_gate_len_ps > needed_min_frag_time_ps :
++ remaining_gate_len_ps > needed_bit_time_ps) {
+ /* Setting QMAXSDU_CFG to 0 disables oversized frame
+ * dropping.
+ */
+@@ -1423,7 +1436,7 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
+
+ mutex_lock(&ocelot->tas_lock);
+
+- if (!taprio->enable) {
++ if (taprio->cmd == TAPRIO_CMD_DESTROY) {
+ ocelot_port_mqprio(ocelot, port, &taprio->mqprio);
+ ocelot_rmw_rix(ocelot, 0, QSYS_TAG_CONFIG_ENABLE,
+ QSYS_TAG_CONFIG, port);
+@@ -1435,6 +1448,9 @@ static int vsc9959_qos_port_tas_set(struct ocelot *ocelot, int port,
+
+ mutex_unlock(&ocelot->tas_lock);
+ return 0;
++ } else if (taprio->cmd != TAPRIO_CMD_REPLACE) {
++ ret = -EOPNOTSUPP;
++ goto err_unlock;
+ }
+
+ ret = ocelot_port_mqprio(ocelot, port, &taprio->mqprio);
+@@ -2600,6 +2616,7 @@ static const struct ocelot_ops vsc9959_ops = {
+ .cut_through_fwd = vsc9959_cut_through_fwd,
+ .tas_clock_adjust = vsc9959_tas_clock_adjust,
+ .update_stats = vsc9959_update_stats,
++ .tas_guard_bands_update = vsc9959_tas_guard_bands_update,
+ };
+
+ static const struct felix_info felix_info_vsc9959 = {
+@@ -2625,7 +2642,6 @@ static const struct felix_info felix_info_vsc9959 = {
+ .port_modes = vsc9959_port_modes,
+ .port_setup_tc = vsc9959_port_setup_tc,
+ .port_sched_speed_set = vsc9959_sched_speed_set,
+- .tas_guard_bands_update = vsc9959_tas_guard_bands_update,
+ };
+
+ /* The INTB interrupt is shared between for PTP TX timestamp availability
+diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c
+index 6d5ac7588a691..d775a14784f7e 100644
+--- a/drivers/net/dsa/qca/qca8k-8xxx.c
++++ b/drivers/net/dsa/qca/qca8k-8xxx.c
+@@ -588,6 +588,9 @@ qca8k_phy_eth_busy_wait(struct qca8k_mgmt_eth_data *mgmt_eth_data,
+ bool ack;
+ int ret;
+
++ if (!skb)
++ return -ENOMEM;
++
+ reinit_completion(&mgmt_eth_data->rw_done);
+
+ /* Increment seq_num and set it in the copy pkt */
+diff --git a/drivers/net/dsa/sja1105/sja1105_tas.c b/drivers/net/dsa/sja1105/sja1105_tas.c
+index e6153848a9509..d7818710bc028 100644
+--- a/drivers/net/dsa/sja1105/sja1105_tas.c
++++ b/drivers/net/dsa/sja1105/sja1105_tas.c
+@@ -516,10 +516,11 @@ int sja1105_setup_tc_taprio(struct dsa_switch *ds, int port,
+ /* Can't change an already configured port (must delete qdisc first).
+ * Can't delete the qdisc from an unconfigured port.
+ */
+- if (!!tas_data->offload[port] == admin->enable)
++ if ((!!tas_data->offload[port] && admin->cmd == TAPRIO_CMD_REPLACE) ||
++ (!tas_data->offload[port] && admin->cmd == TAPRIO_CMD_DESTROY))
+ return -EINVAL;
+
+- if (!admin->enable) {
++ if (admin->cmd == TAPRIO_CMD_DESTROY) {
+ taprio_offload_free(tas_data->offload[port]);
+ tas_data->offload[port] = NULL;
+
+@@ -528,6 +529,8 @@ int sja1105_setup_tc_taprio(struct dsa_switch *ds, int port,
+ return rc;
+
+ return sja1105_static_config_reload(priv, SJA1105_SCHEDULING);
++ } else if (admin->cmd != TAPRIO_CMD_REPLACE) {
++ return -EOPNOTSUPP;
+ }
+
+ /* The cycle time extension is the amount of time the last cycle from
+diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
+index 451c3a1b62553..633b321d7fdd9 100644
+--- a/drivers/net/ethernet/amazon/ena/ena_com.c
++++ b/drivers/net/ethernet/amazon/ena/ena_com.c
+@@ -35,6 +35,8 @@
+
+ #define ENA_REGS_ADMIN_INTR_MASK 1
+
++#define ENA_MAX_BACKOFF_DELAY_EXP 16U
++
+ #define ENA_MIN_ADMIN_POLL_US 100
+
+ #define ENA_MAX_ADMIN_POLL_US 5000
+@@ -536,6 +538,7 @@ static int ena_com_comp_status_to_errno(struct ena_com_admin_queue *admin_queue,
+
+ static void ena_delay_exponential_backoff_us(u32 exp, u32 delay_us)
+ {
++ exp = min_t(u32, exp, ENA_MAX_BACKOFF_DELAY_EXP);
+ delay_us = max_t(u32, ENA_MIN_ADMIN_POLL_US, delay_us);
+ delay_us = min_t(u32, delay_us * (1U << exp), ENA_MAX_ADMIN_POLL_US);
+ usleep_range(delay_us, 2 * delay_us);
+diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
+index 1761df8fb7f96..10c7c232cc4ec 100644
+--- a/drivers/net/ethernet/broadcom/bgmac.c
++++ b/drivers/net/ethernet/broadcom/bgmac.c
+@@ -1492,8 +1492,6 @@ int bgmac_enet_probe(struct bgmac *bgmac)
+
+ bgmac->in_init = true;
+
+- bgmac_chip_intrs_off(bgmac);
+-
+ net_dev->irq = bgmac->irq;
+ SET_NETDEV_DEV(net_dev, bgmac->dev);
+ dev_set_drvdata(bgmac->dev, bgmac);
+@@ -1511,6 +1509,8 @@ int bgmac_enet_probe(struct bgmac *bgmac)
+ */
+ bgmac_clk_enable(bgmac, 0);
+
++ bgmac_chip_intrs_off(bgmac);
++
+ /* This seems to be fixing IRQ by assigning OOB #6 to the core */
+ if (!(bgmac->feature_flags & BGMAC_FEAT_IDM_MASK)) {
+ if (bgmac->feature_flags & BGMAC_FEAT_IRQ_ID_OOB_6)
+diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c
+index c15ed0acdb777..0092e46c46f83 100644
+--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
++++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
+@@ -673,5 +673,7 @@ void bcmgenet_mii_exit(struct net_device *dev)
+ if (of_phy_is_fixed_link(dn))
+ of_phy_deregister_fixed_link(dn);
+ of_node_put(priv->phy_dn);
++ clk_prepare_enable(priv->clk);
+ platform_device_unregister(priv->mii_pdev);
++ clk_disable_unprepare(priv->clk);
+ }
+diff --git a/drivers/net/ethernet/engleder/tsnep_selftests.c b/drivers/net/ethernet/engleder/tsnep_selftests.c
+index 1581d6b222320..8a9145f93147c 100644
+--- a/drivers/net/ethernet/engleder/tsnep_selftests.c
++++ b/drivers/net/ethernet/engleder/tsnep_selftests.c
+@@ -329,7 +329,7 @@ static bool disable_taprio(struct tsnep_adapter *adapter)
+ int retval;
+
+ memset(&qopt, 0, sizeof(qopt));
+- qopt.enable = 0;
++ qopt.cmd = TAPRIO_CMD_DESTROY;
+ retval = tsnep_tc_setup(adapter->netdev, TC_SETUP_QDISC_TAPRIO, &qopt);
+ if (retval)
+ return false;
+@@ -360,7 +360,7 @@ static bool tsnep_test_taprio(struct tsnep_adapter *adapter)
+ for (i = 0; i < 255; i++)
+ qopt->entries[i].command = TC_TAPRIO_CMD_SET_GATES;
+
+- qopt->enable = 1;
++ qopt->cmd = TAPRIO_CMD_REPLACE;
+ qopt->base_time = ktime_set(0, 0);
+ qopt->cycle_time = 1500000;
+ qopt->cycle_time_extension = 0;
+@@ -382,7 +382,7 @@ static bool tsnep_test_taprio(struct tsnep_adapter *adapter)
+ if (!run_taprio(adapter, qopt, 100))
+ goto failed;
+
+- qopt->enable = 1;
++ qopt->cmd = TAPRIO_CMD_REPLACE;
+ qopt->base_time = ktime_set(0, 0);
+ qopt->cycle_time = 411854;
+ qopt->cycle_time_extension = 0;
+@@ -406,7 +406,7 @@ static bool tsnep_test_taprio(struct tsnep_adapter *adapter)
+ if (!run_taprio(adapter, qopt, 100))
+ goto failed;
+
+- qopt->enable = 1;
++ qopt->cmd = TAPRIO_CMD_REPLACE;
+ qopt->base_time = ktime_set(0, 0);
+ delay_base_time(adapter, qopt, 12);
+ qopt->cycle_time = 125000;
+@@ -457,7 +457,7 @@ static bool tsnep_test_taprio_change(struct tsnep_adapter *adapter)
+ for (i = 0; i < 255; i++)
+ qopt->entries[i].command = TC_TAPRIO_CMD_SET_GATES;
+
+- qopt->enable = 1;
++ qopt->cmd = TAPRIO_CMD_REPLACE;
+ qopt->base_time = ktime_set(0, 0);
+ qopt->cycle_time = 100000;
+ qopt->cycle_time_extension = 0;
+@@ -610,7 +610,7 @@ static bool tsnep_test_taprio_extension(struct tsnep_adapter *adapter)
+ for (i = 0; i < 255; i++)
+ qopt->entries[i].command = TC_TAPRIO_CMD_SET_GATES;
+
+- qopt->enable = 1;
++ qopt->cmd = TAPRIO_CMD_REPLACE;
+ qopt->base_time = ktime_set(0, 0);
+ qopt->cycle_time = 100000;
+ qopt->cycle_time_extension = 50000;
+diff --git a/drivers/net/ethernet/engleder/tsnep_tc.c b/drivers/net/ethernet/engleder/tsnep_tc.c
+index d083e6684f120..745b191a55402 100644
+--- a/drivers/net/ethernet/engleder/tsnep_tc.c
++++ b/drivers/net/ethernet/engleder/tsnep_tc.c
+@@ -325,7 +325,7 @@ static int tsnep_taprio(struct tsnep_adapter *adapter,
+ if (!adapter->gate_control)
+ return -EOPNOTSUPP;
+
+- if (!qopt->enable) {
++ if (qopt->cmd == TAPRIO_CMD_DESTROY) {
+ /* disable gate control if active */
+ mutex_lock(&adapter->gate_control_lock);
+
+@@ -337,6 +337,8 @@ static int tsnep_taprio(struct tsnep_adapter *adapter,
+ mutex_unlock(&adapter->gate_control_lock);
+
+ return 0;
++ } else if (qopt->cmd != TAPRIO_CMD_REPLACE) {
++ return -EOPNOTSUPP;
+ }
+
+ retval = tsnep_validate_gcl(qopt);
+diff --git a/drivers/net/ethernet/freescale/enetc/enetc_qos.c b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
+index 126007ab70f61..dfec50106106f 100644
+--- a/drivers/net/ethernet/freescale/enetc/enetc_qos.c
++++ b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
+@@ -65,7 +65,7 @@ static int enetc_setup_taprio(struct net_device *ndev,
+ gcl_len = admin_conf->num_entries;
+
+ tge = enetc_rd(hw, ENETC_PTGCR);
+- if (!admin_conf->enable) {
++ if (admin_conf->cmd == TAPRIO_CMD_DESTROY) {
+ enetc_wr(hw, ENETC_PTGCR, tge & ~ENETC_PTGCR_TGE);
+ enetc_reset_ptcmsdur(hw);
+
+@@ -138,6 +138,10 @@ int enetc_setup_tc_taprio(struct net_device *ndev, void *type_data)
+ struct enetc_ndev_priv *priv = netdev_priv(ndev);
+ int err, i;
+
++ if (taprio->cmd != TAPRIO_CMD_REPLACE &&
++ taprio->cmd != TAPRIO_CMD_DESTROY)
++ return -EOPNOTSUPP;
++
+ /* TSD and Qbv are mutually exclusive in hardware */
+ for (i = 0; i < priv->num_tx_rings; i++)
+ if (priv->tx_ring[i]->tsd_enable)
+diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
+index 9939ccafb5566..63a053dea819d 100644
+--- a/drivers/net/ethernet/freescale/fec.h
++++ b/drivers/net/ethernet/freescale/fec.h
+@@ -355,7 +355,7 @@ struct bufdesc_ex {
+ #define RX_RING_SIZE (FEC_ENET_RX_FRPPG * FEC_ENET_RX_PAGES)
+ #define FEC_ENET_TX_FRSIZE 2048
+ #define FEC_ENET_TX_FRPPG (PAGE_SIZE / FEC_ENET_TX_FRSIZE)
+-#define TX_RING_SIZE 512 /* Must be power of two */
++#define TX_RING_SIZE 1024 /* Must be power of two */
+ #define TX_RING_MOD_MASK 511 /* for this to work */
+
+ #define BD_ENET_RX_INT 0x00800000
+@@ -544,10 +544,23 @@ enum {
+ XDP_STATS_TOTAL,
+ };
+
++enum fec_txbuf_type {
++ FEC_TXBUF_T_SKB,
++ FEC_TXBUF_T_XDP_NDO,
++};
++
++struct fec_tx_buffer {
++ union {
++ struct sk_buff *skb;
++ struct xdp_frame *xdp;
++ };
++ enum fec_txbuf_type type;
++};
++
+ struct fec_enet_priv_tx_q {
+ struct bufdesc_prop bd;
+ unsigned char *tx_bounce[TX_RING_SIZE];
+- struct sk_buff *tx_skbuff[TX_RING_SIZE];
++ struct fec_tx_buffer tx_buf[TX_RING_SIZE];
+
+ unsigned short tx_stop_threshold;
+ unsigned short tx_wake_threshold;
+diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
+index 38e5b5abe067c..7659888a96917 100644
+--- a/drivers/net/ethernet/freescale/fec_main.c
++++ b/drivers/net/ethernet/freescale/fec_main.c
+@@ -397,7 +397,7 @@ static void fec_dump(struct net_device *ndev)
+ fec16_to_cpu(bdp->cbd_sc),
+ fec32_to_cpu(bdp->cbd_bufaddr),
+ fec16_to_cpu(bdp->cbd_datlen),
+- txq->tx_skbuff[index]);
++ txq->tx_buf[index].skb);
+ bdp = fec_enet_get_nextdesc(bdp, &txq->bd);
+ index++;
+ } while (bdp != txq->bd.base);
+@@ -654,7 +654,7 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
+
+ index = fec_enet_get_bd_index(last_bdp, &txq->bd);
+ /* Save skb pointer */
+- txq->tx_skbuff[index] = skb;
++ txq->tx_buf[index].skb = skb;
+
+ /* Make sure the updates to rest of the descriptor are performed before
+ * transferring ownership.
+@@ -672,9 +672,7 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq,
+
+ skb_tx_timestamp(skb);
+
+- /* Make sure the update to bdp and tx_skbuff are performed before
+- * txq->bd.cur.
+- */
++ /* Make sure the update to bdp is performed before txq->bd.cur. */
+ wmb();
+ txq->bd.cur = bdp;
+
+@@ -862,7 +860,7 @@ static int fec_enet_txq_submit_tso(struct fec_enet_priv_tx_q *txq,
+ }
+
+ /* Save skb pointer */
+- txq->tx_skbuff[index] = skb;
++ txq->tx_buf[index].skb = skb;
+
+ skb_tx_timestamp(skb);
+ txq->bd.cur = bdp;
+@@ -952,16 +950,33 @@ static void fec_enet_bd_init(struct net_device *dev)
+ for (i = 0; i < txq->bd.ring_size; i++) {
+ /* Initialize the BD for every fragment in the page. */
+ bdp->cbd_sc = cpu_to_fec16(0);
+- if (bdp->cbd_bufaddr &&
+- !IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
+- dma_unmap_single(&fep->pdev->dev,
+- fec32_to_cpu(bdp->cbd_bufaddr),
+- fec16_to_cpu(bdp->cbd_datlen),
+- DMA_TO_DEVICE);
+- if (txq->tx_skbuff[i]) {
+- dev_kfree_skb_any(txq->tx_skbuff[i]);
+- txq->tx_skbuff[i] = NULL;
++ if (txq->tx_buf[i].type == FEC_TXBUF_T_SKB) {
++ if (bdp->cbd_bufaddr &&
++ !IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
++ dma_unmap_single(&fep->pdev->dev,
++ fec32_to_cpu(bdp->cbd_bufaddr),
++ fec16_to_cpu(bdp->cbd_datlen),
++ DMA_TO_DEVICE);
++ if (txq->tx_buf[i].skb) {
++ dev_kfree_skb_any(txq->tx_buf[i].skb);
++ txq->tx_buf[i].skb = NULL;
++ }
++ } else {
++ if (bdp->cbd_bufaddr)
++ dma_unmap_single(&fep->pdev->dev,
++ fec32_to_cpu(bdp->cbd_bufaddr),
++ fec16_to_cpu(bdp->cbd_datlen),
++ DMA_TO_DEVICE);
++
++ if (txq->tx_buf[i].xdp) {
++ xdp_return_frame(txq->tx_buf[i].xdp);
++ txq->tx_buf[i].xdp = NULL;
++ }
++
++ /* restore default tx buffer type: FEC_TXBUF_T_SKB */
++ txq->tx_buf[i].type = FEC_TXBUF_T_SKB;
+ }
++
+ bdp->cbd_bufaddr = cpu_to_fec32(0);
+ bdp = fec_enet_get_nextdesc(bdp, &txq->bd);
+ }
+@@ -1011,24 +1026,6 @@ static void fec_enet_enable_ring(struct net_device *ndev)
+ }
+ }
+
+-static void fec_enet_reset_skb(struct net_device *ndev)
+-{
+- struct fec_enet_private *fep = netdev_priv(ndev);
+- struct fec_enet_priv_tx_q *txq;
+- int i, j;
+-
+- for (i = 0; i < fep->num_tx_queues; i++) {
+- txq = fep->tx_queue[i];
+-
+- for (j = 0; j < txq->bd.ring_size; j++) {
+- if (txq->tx_skbuff[j]) {
+- dev_kfree_skb_any(txq->tx_skbuff[j]);
+- txq->tx_skbuff[j] = NULL;
+- }
+- }
+- }
+-}
+-
+ /*
+ * This function is called to start or restart the FEC during a link
+ * change, transmit timeout, or to reconfigure the FEC. The network
+@@ -1071,9 +1068,6 @@ fec_restart(struct net_device *ndev)
+
+ fec_enet_enable_ring(ndev);
+
+- /* Reset tx SKB buffers. */
+- fec_enet_reset_skb(ndev);
+-
+ /* Enable MII mode */
+ if (fep->full_duplex == DUPLEX_FULL) {
+ /* FD enable */
+@@ -1381,6 +1375,7 @@ static void
+ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id)
+ {
+ struct fec_enet_private *fep;
++ struct xdp_frame *xdpf;
+ struct bufdesc *bdp;
+ unsigned short status;
+ struct sk_buff *skb;
+@@ -1408,16 +1403,31 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id)
+
+ index = fec_enet_get_bd_index(bdp, &txq->bd);
+
+- skb = txq->tx_skbuff[index];
+- txq->tx_skbuff[index] = NULL;
+- if (!IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
+- dma_unmap_single(&fep->pdev->dev,
+- fec32_to_cpu(bdp->cbd_bufaddr),
+- fec16_to_cpu(bdp->cbd_datlen),
+- DMA_TO_DEVICE);
+- bdp->cbd_bufaddr = cpu_to_fec32(0);
+- if (!skb)
+- goto skb_done;
++ if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB) {
++ skb = txq->tx_buf[index].skb;
++ txq->tx_buf[index].skb = NULL;
++ if (bdp->cbd_bufaddr &&
++ !IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
++ dma_unmap_single(&fep->pdev->dev,
++ fec32_to_cpu(bdp->cbd_bufaddr),
++ fec16_to_cpu(bdp->cbd_datlen),
++ DMA_TO_DEVICE);
++ bdp->cbd_bufaddr = cpu_to_fec32(0);
++ if (!skb)
++ goto tx_buf_done;
++ } else {
++ xdpf = txq->tx_buf[index].xdp;
++ if (bdp->cbd_bufaddr)
++ dma_unmap_single(&fep->pdev->dev,
++ fec32_to_cpu(bdp->cbd_bufaddr),
++ fec16_to_cpu(bdp->cbd_datlen),
++ DMA_TO_DEVICE);
++ bdp->cbd_bufaddr = cpu_to_fec32(0);
++ if (!xdpf) {
++ txq->tx_buf[index].type = FEC_TXBUF_T_SKB;
++ goto tx_buf_done;
++ }
++ }
+
+ /* Check for errors. */
+ if (status & (BD_ENET_TX_HB | BD_ENET_TX_LC |
+@@ -1436,21 +1446,11 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id)
+ ndev->stats.tx_carrier_errors++;
+ } else {
+ ndev->stats.tx_packets++;
+- ndev->stats.tx_bytes += skb->len;
+- }
+
+- /* NOTE: SKBTX_IN_PROGRESS being set does not imply it's we who
+- * are to time stamp the packet, so we still need to check time
+- * stamping enabled flag.
+- */
+- if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS &&
+- fep->hwts_tx_en) &&
+- fep->bufdesc_ex) {
+- struct skb_shared_hwtstamps shhwtstamps;
+- struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
+-
+- fec_enet_hwtstamp(fep, fec32_to_cpu(ebdp->ts), &shhwtstamps);
+- skb_tstamp_tx(skb, &shhwtstamps);
++ if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB)
++ ndev->stats.tx_bytes += skb->len;
++ else
++ ndev->stats.tx_bytes += xdpf->len;
+ }
+
+ /* Deferred means some collisions occurred during transmit,
+@@ -1459,10 +1459,32 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id)
+ if (status & BD_ENET_TX_DEF)
+ ndev->stats.collisions++;
+
+- /* Free the sk buffer associated with this last transmit */
+- dev_kfree_skb_any(skb);
+-skb_done:
+- /* Make sure the update to bdp and tx_skbuff are performed
++ if (txq->tx_buf[index].type == FEC_TXBUF_T_SKB) {
++ /* NOTE: SKBTX_IN_PROGRESS being set does not imply it's we who
++ * are to time stamp the packet, so we still need to check time
++ * stamping enabled flag.
++ */
++ if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS &&
++ fep->hwts_tx_en) && fep->bufdesc_ex) {
++ struct skb_shared_hwtstamps shhwtstamps;
++ struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp;
++
++ fec_enet_hwtstamp(fep, fec32_to_cpu(ebdp->ts), &shhwtstamps);
++ skb_tstamp_tx(skb, &shhwtstamps);
++ }
++
++ /* Free the sk buffer associated with this last transmit */
++ dev_kfree_skb_any(skb);
++ } else {
++ xdp_return_frame(xdpf);
++
++ txq->tx_buf[index].xdp = NULL;
++ /* restore default tx buffer type: FEC_TXBUF_T_SKB */
++ txq->tx_buf[index].type = FEC_TXBUF_T_SKB;
++ }
++
++tx_buf_done:
++ /* Make sure the update to bdp and tx_buf are performed
+ * before dirty_tx
+ */
+ wmb();
+@@ -3268,9 +3290,19 @@ static void fec_enet_free_buffers(struct net_device *ndev)
+ for (i = 0; i < txq->bd.ring_size; i++) {
+ kfree(txq->tx_bounce[i]);
+ txq->tx_bounce[i] = NULL;
+- skb = txq->tx_skbuff[i];
+- txq->tx_skbuff[i] = NULL;
+- dev_kfree_skb(skb);
++
++ if (txq->tx_buf[i].type == FEC_TXBUF_T_SKB) {
++ skb = txq->tx_buf[i].skb;
++ txq->tx_buf[i].skb = NULL;
++ dev_kfree_skb(skb);
++ } else {
++ if (txq->tx_buf[i].xdp) {
++ xdp_return_frame(txq->tx_buf[i].xdp);
++ txq->tx_buf[i].xdp = NULL;
++ }
++
++ txq->tx_buf[i].type = FEC_TXBUF_T_SKB;
++ }
+ }
+ }
+ }
+@@ -3315,8 +3347,7 @@ static int fec_enet_alloc_queue(struct net_device *ndev)
+ fep->total_tx_ring_size += fep->tx_queue[i]->bd.ring_size;
+
+ txq->tx_stop_threshold = FEC_MAX_SKB_DESCS;
+- txq->tx_wake_threshold =
+- (txq->bd.ring_size - txq->tx_stop_threshold) / 2;
++ txq->tx_wake_threshold = FEC_MAX_SKB_DESCS + 2 * MAX_SKB_FRAGS;
+
+ txq->tso_hdrs = dma_alloc_coherent(&fep->pdev->dev,
+ txq->bd.ring_size * TSO_HEADER_SIZE,
+@@ -3791,7 +3822,7 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
+ struct xdp_frame *frame)
+ {
+ unsigned int index, status, estatus;
+- struct bufdesc *bdp, *last_bdp;
++ struct bufdesc *bdp;
+ dma_addr_t dma_addr;
+ int entries_free;
+
+@@ -3803,7 +3834,6 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
+
+ /* Fill in a Tx ring entry */
+ bdp = txq->bd.cur;
+- last_bdp = bdp;
+ status = fec16_to_cpu(bdp->cbd_sc);
+ status &= ~BD_ENET_TX_STATS;
+
+@@ -3831,8 +3861,8 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
+ ebdp->cbd_esc = cpu_to_fec32(estatus);
+ }
+
+- index = fec_enet_get_bd_index(last_bdp, &txq->bd);
+- txq->tx_skbuff[index] = NULL;
++ txq->tx_buf[index].type = FEC_TXBUF_T_XDP_NDO;
++ txq->tx_buf[index].xdp = frame;
+
+ /* Make sure the updates to rest of the descriptor are performed before
+ * transferring ownership.
+@@ -3846,7 +3876,7 @@ static int fec_enet_txq_xmit_frame(struct fec_enet_private *fep,
+ bdp->cbd_sc = cpu_to_fec16(status);
+
+ /* If this was the last BD in the ring, start at the beginning again. */
+- bdp = fec_enet_get_nextdesc(last_bdp, &txq->bd);
++ bdp = fec_enet_get_nextdesc(bdp, &txq->bd);
+
+ /* Make sure the update to bdp are performed before txq->bd.cur. */
+ dma_wmb();
+diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c
+index cfd4b8d284d12..50162ec9424df 100644
+--- a/drivers/net/ethernet/google/gve/gve_ethtool.c
++++ b/drivers/net/ethernet/google/gve/gve_ethtool.c
+@@ -590,6 +590,9 @@ static int gve_get_link_ksettings(struct net_device *netdev,
+ err = gve_adminq_report_link_speed(priv);
+
+ cmd->base.speed = priv->link_speed;
++
++ cmd->base.duplex = DUPLEX_FULL;
++
+ return err;
+ }
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index fcc027c938fda..1277e0a044ee4 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -5969,6 +5969,13 @@ ice_set_tx_maxrate(struct net_device *netdev, int queue_index, u32 maxrate)
+ q_handle = vsi->tx_rings[queue_index]->q_handle;
+ tc = ice_dcb_get_tc(vsi, queue_index);
+
++ vsi = ice_locate_vsi_using_queue(vsi, queue_index);
++ if (!vsi) {
++ netdev_err(netdev, "Invalid VSI for given queue %d\n",
++ queue_index);
++ return -EINVAL;
++ }
++
+ /* Set BW back to default, when user set maxrate to 0 */
+ if (!maxrate)
+ status = ice_cfg_q_bw_dflt_lmt(vsi->port_info, vsi->idx, tc,
+@@ -8114,10 +8121,10 @@ static int
+ ice_validate_mqprio_qopt(struct ice_vsi *vsi,
+ struct tc_mqprio_qopt_offload *mqprio_qopt)
+ {
+- u64 sum_max_rate = 0, sum_min_rate = 0;
+ int non_power_of_2_qcount = 0;
+ struct ice_pf *pf = vsi->back;
+ int max_rss_q_cnt = 0;
++ u64 sum_min_rate = 0;
+ struct device *dev;
+ int i, speed;
+ u8 num_tc;
+@@ -8133,6 +8140,7 @@ ice_validate_mqprio_qopt(struct ice_vsi *vsi,
+ dev = ice_pf_to_dev(pf);
+ vsi->ch_rss_size = 0;
+ num_tc = mqprio_qopt->qopt.num_tc;
++ speed = ice_get_link_speed_kbps(vsi);
+
+ for (i = 0; num_tc; i++) {
+ int qcount = mqprio_qopt->qopt.count[i];
+@@ -8173,7 +8181,6 @@ ice_validate_mqprio_qopt(struct ice_vsi *vsi,
+ */
+ max_rate = mqprio_qopt->max_rate[i];
+ max_rate = div_u64(max_rate, ICE_BW_KBPS_DIVISOR);
+- sum_max_rate += max_rate;
+
+ /* min_rate is minimum guaranteed rate and it can't be zero */
+ min_rate = mqprio_qopt->min_rate[i];
+@@ -8186,6 +8193,12 @@ ice_validate_mqprio_qopt(struct ice_vsi *vsi,
+ return -EINVAL;
+ }
+
++ if (max_rate && max_rate > speed) {
++ dev_err(dev, "TC%d: max_rate(%llu Kbps) > link speed of %u Kbps\n",
++ i, max_rate, speed);
++ return -EINVAL;
++ }
++
+ iter_div_u64_rem(min_rate, ICE_MIN_BW_LIMIT, &rem);
+ if (rem) {
+ dev_err(dev, "TC%d: Min Rate not multiple of %u Kbps",
+@@ -8223,12 +8236,6 @@ ice_validate_mqprio_qopt(struct ice_vsi *vsi,
+ (mqprio_qopt->qopt.offset[i] + mqprio_qopt->qopt.count[i]))
+ return -EINVAL;
+
+- speed = ice_get_link_speed_kbps(vsi);
+- if (sum_max_rate && sum_max_rate > (u64)speed) {
+- dev_err(dev, "Invalid max Tx rate(%llu) Kbps > speed(%u) Kbps specified\n",
+- sum_max_rate, speed);
+- return -EINVAL;
+- }
+ if (sum_min_rate && sum_min_rate > (u64)speed) {
+ dev_err(dev, "Invalid min Tx rate(%llu) Kbps > speed (%u) Kbps specified\n",
+ sum_min_rate, speed);
+diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.c b/drivers/net/ethernet/intel/ice/ice_tc_lib.c
+index d1a31f236d26a..8578dc1cb967d 100644
+--- a/drivers/net/ethernet/intel/ice/ice_tc_lib.c
++++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.c
+@@ -735,17 +735,16 @@ exit:
+ /**
+ * ice_locate_vsi_using_queue - locate VSI using queue (forward to queue action)
+ * @vsi: Pointer to VSI
+- * @tc_fltr: Pointer to tc_flower_filter
++ * @queue: Queue index
+ *
+- * Locate the VSI using specified queue. When ADQ is not enabled, always
+- * return input VSI, otherwise locate corresponding VSI based on per channel
+- * offset and qcount
++ * Locate the VSI using specified "queue". When ADQ is not enabled,
++ * always return input VSI, otherwise locate corresponding
++ * VSI based on per channel "offset" and "qcount"
+ */
+-static struct ice_vsi *
+-ice_locate_vsi_using_queue(struct ice_vsi *vsi,
+- struct ice_tc_flower_fltr *tc_fltr)
++struct ice_vsi *
++ice_locate_vsi_using_queue(struct ice_vsi *vsi, int queue)
+ {
+- int num_tc, tc, queue;
++ int num_tc, tc;
+
+ /* if ADQ is not active, passed VSI is the candidate VSI */
+ if (!ice_is_adq_active(vsi->back))
+@@ -755,7 +754,6 @@ ice_locate_vsi_using_queue(struct ice_vsi *vsi,
+ * upon queue number)
+ */
+ num_tc = vsi->mqprio_qopt.qopt.num_tc;
+- queue = tc_fltr->action.fwd.q.queue;
+
+ for (tc = 0; tc < num_tc; tc++) {
+ int qcount = vsi->mqprio_qopt.qopt.count[tc];
+@@ -797,6 +795,7 @@ ice_tc_forward_action(struct ice_vsi *vsi, struct ice_tc_flower_fltr *tc_fltr)
+ struct ice_pf *pf = vsi->back;
+ struct device *dev;
+ u32 tc_class;
++ int q;
+
+ dev = ice_pf_to_dev(pf);
+
+@@ -825,7 +824,8 @@ ice_tc_forward_action(struct ice_vsi *vsi, struct ice_tc_flower_fltr *tc_fltr)
+ /* Determine destination VSI even though the action is
+ * FWD_TO_QUEUE, because QUEUE is associated with VSI
+ */
+- dest_vsi = tc_fltr->dest_vsi;
++ q = tc_fltr->action.fwd.q.queue;
++ dest_vsi = ice_locate_vsi_using_queue(vsi, q);
+ break;
+ default:
+ dev_err(dev,
+@@ -1702,7 +1702,7 @@ ice_tc_forward_to_queue(struct ice_vsi *vsi, struct ice_tc_flower_fltr *fltr,
+ /* If ADQ is configured, and the queue belongs to ADQ VSI, then prepare
+ * ADQ switch filter
+ */
+- ch_vsi = ice_locate_vsi_using_queue(vsi, fltr);
++ ch_vsi = ice_locate_vsi_using_queue(vsi, fltr->action.fwd.q.queue);
+ if (!ch_vsi)
+ return -EINVAL;
+ fltr->dest_vsi = ch_vsi;
+diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.h b/drivers/net/ethernet/intel/ice/ice_tc_lib.h
+index 8d5e22ac7023c..189c73d885356 100644
+--- a/drivers/net/ethernet/intel/ice/ice_tc_lib.h
++++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.h
+@@ -203,6 +203,7 @@ static inline int ice_chnl_dmac_fltr_cnt(struct ice_pf *pf)
+ return pf->num_dmac_chnl_fltrs;
+ }
+
++struct ice_vsi *ice_locate_vsi_using_queue(struct ice_vsi *vsi, int queue);
+ int
+ ice_add_cls_flower(struct net_device *netdev, struct ice_vsi *vsi,
+ struct flow_cls_offload *cls_flower);
+diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
+index 9dc9b982a7ea6..345d3a4e8ed44 100644
+--- a/drivers/net/ethernet/intel/igc/igc.h
++++ b/drivers/net/ethernet/intel/igc/igc.h
+@@ -14,6 +14,7 @@
+ #include <linux/timecounter.h>
+ #include <linux/net_tstamp.h>
+ #include <linux/bitfield.h>
++#include <linux/hrtimer.h>
+
+ #include "igc_hw.h"
+
+@@ -101,6 +102,8 @@ struct igc_ring {
+ u32 start_time;
+ u32 end_time;
+ u32 max_sdu;
++ bool oper_gate_closed; /* Operating gate. True if the TX Queue is closed */
++ bool admin_gate_closed; /* Future gate. True if the TX Queue will be closed */
+
+ /* CBS parameters */
+ bool cbs_enable; /* indicates if CBS is enabled */
+@@ -160,6 +163,7 @@ struct igc_adapter {
+ struct timer_list watchdog_timer;
+ struct timer_list dma_err_timer;
+ struct timer_list phy_info_timer;
++ struct hrtimer hrtimer;
+
+ u32 wol;
+ u32 en_mng_pt;
+@@ -184,10 +188,13 @@ struct igc_adapter {
+ u32 max_frame_size;
+ u32 min_frame_size;
+
++ int tc_setup_type;
+ ktime_t base_time;
+ ktime_t cycle_time;
+- bool qbv_enable;
++ bool taprio_offload_enable;
+ u32 qbv_config_change_errors;
++ bool qbv_transition;
++ unsigned int qbv_count;
+
+ /* OS defined structs */
+ struct pci_dev *pdev;
+@@ -501,6 +508,12 @@ struct igc_rx_buffer {
+ };
+ };
+
++/* context wrapper around xdp_buff to provide access to descriptor metadata */
++struct igc_xdp_buff {
++ struct xdp_buff xdp;
++ union igc_adv_rx_desc *rx_desc;
++};
++
+ struct igc_q_vector {
+ struct igc_adapter *adapter; /* backlink */
+ void __iomem *itr_register;
+diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
+index 0e2cb00622d1a..93bce729be76a 100644
+--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
++++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
+@@ -1708,6 +1708,8 @@ static int igc_ethtool_get_link_ksettings(struct net_device *netdev,
+ /* twisted pair */
+ cmd->base.port = PORT_TP;
+ cmd->base.phy_address = hw->phy.addr;
++ ethtool_link_ksettings_add_link_mode(cmd, supported, TP);
++ ethtool_link_ksettings_add_link_mode(cmd, advertising, TP);
+
+ /* advertising link modes */
+ if (hw->phy.autoneg_advertised & ADVERTISE_10_HALF)
+diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
+index 5f2e8bcd75973..44aa4342cbbb5 100644
+--- a/drivers/net/ethernet/intel/igc/igc_main.c
++++ b/drivers/net/ethernet/intel/igc/igc_main.c
+@@ -711,7 +711,6 @@ static void igc_configure_tx_ring(struct igc_adapter *adapter,
+ /* disable the queue */
+ wr32(IGC_TXDCTL(reg_idx), 0);
+ wrfl();
+- mdelay(10);
+
+ wr32(IGC_TDLEN(reg_idx),
+ ring->count * sizeof(union igc_adv_tx_desc));
+@@ -1017,7 +1016,7 @@ static __le32 igc_tx_launchtime(struct igc_ring *ring, ktime_t txtime,
+ ktime_t base_time = adapter->base_time;
+ ktime_t now = ktime_get_clocktai();
+ ktime_t baset_est, end_of_cycle;
+- u32 launchtime;
++ s32 launchtime;
+ s64 n;
+
+ n = div64_s64(ktime_sub_ns(now, base_time), cycle_time);
+@@ -1030,7 +1029,7 @@ static __le32 igc_tx_launchtime(struct igc_ring *ring, ktime_t txtime,
+ *first_flag = true;
+ ring->last_ff_cycle = baset_est;
+
+- if (ktime_compare(txtime, ring->last_tx_cycle) > 0)
++ if (ktime_compare(end_of_cycle, ring->last_tx_cycle) > 0)
+ *insert_empty = true;
+ }
+ }
+@@ -1573,16 +1572,12 @@ done:
+ first->bytecount = skb->len;
+ first->gso_segs = 1;
+
+- if (tx_ring->max_sdu > 0) {
+- u32 max_sdu = 0;
+-
+- max_sdu = tx_ring->max_sdu +
+- (skb_vlan_tagged(first->skb) ? VLAN_HLEN : 0);
++ if (adapter->qbv_transition || tx_ring->oper_gate_closed)
++ goto out_drop;
+
+- if (first->bytecount > max_sdu) {
+- adapter->stats.txdrop++;
+- goto out_drop;
+- }
++ if (tx_ring->max_sdu > 0 && first->bytecount > tx_ring->max_sdu) {
++ adapter->stats.txdrop++;
++ goto out_drop;
+ }
+
+ if (unlikely(test_bit(IGC_RING_FLAG_TX_HWTSTAMP, &tx_ring->flags) &&
+@@ -2247,6 +2242,8 @@ static bool igc_alloc_rx_buffers_zc(struct igc_ring *ring, u16 count)
+ if (!count)
+ return ok;
+
++ XSK_CHECK_PRIV_TYPE(struct igc_xdp_buff);
++
+ desc = IGC_RX_DESC(ring, i);
+ bi = &ring->rx_buffer_info[i];
+ i -= ring->count;
+@@ -2531,8 +2528,8 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
+ union igc_adv_rx_desc *rx_desc;
+ struct igc_rx_buffer *rx_buffer;
+ unsigned int size, truesize;
++ struct igc_xdp_buff ctx;
+ ktime_t timestamp = 0;
+- struct xdp_buff xdp;
+ int pkt_offset = 0;
+ void *pktbuf;
+
+@@ -2566,13 +2563,14 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
+ }
+
+ if (!skb) {
+- xdp_init_buff(&xdp, truesize, &rx_ring->xdp_rxq);
+- xdp_prepare_buff(&xdp, pktbuf - igc_rx_offset(rx_ring),
++ xdp_init_buff(&ctx.xdp, truesize, &rx_ring->xdp_rxq);
++ xdp_prepare_buff(&ctx.xdp, pktbuf - igc_rx_offset(rx_ring),
+ igc_rx_offset(rx_ring) + pkt_offset,
+ size, true);
+- xdp_buff_clear_frags_flag(&xdp);
++ xdp_buff_clear_frags_flag(&ctx.xdp);
++ ctx.rx_desc = rx_desc;
+
+- skb = igc_xdp_run_prog(adapter, &xdp);
++ skb = igc_xdp_run_prog(adapter, &ctx.xdp);
+ }
+
+ if (IS_ERR(skb)) {
+@@ -2594,9 +2592,9 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
+ } else if (skb)
+ igc_add_rx_frag(rx_ring, rx_buffer, skb, size);
+ else if (ring_uses_build_skb(rx_ring))
+- skb = igc_build_skb(rx_ring, rx_buffer, &xdp);
++ skb = igc_build_skb(rx_ring, rx_buffer, &ctx.xdp);
+ else
+- skb = igc_construct_skb(rx_ring, rx_buffer, &xdp,
++ skb = igc_construct_skb(rx_ring, rx_buffer, &ctx.xdp,
+ timestamp);
+
+ /* exit if we failed to retrieve a buffer */
+@@ -2697,6 +2695,15 @@ static void igc_dispatch_skb_zc(struct igc_q_vector *q_vector,
+ napi_gro_receive(&q_vector->napi, skb);
+ }
+
++static struct igc_xdp_buff *xsk_buff_to_igc_ctx(struct xdp_buff *xdp)
++{
++	/* xdp_buff pointer used by ZC code path is allocated as xdp_buff_xsk. The
++ * igc_xdp_buff shares its layout with xdp_buff_xsk and private
++ * igc_xdp_buff fields fall into xdp_buff_xsk->cb
++ */
++ return (struct igc_xdp_buff *)xdp;
++}
++
+ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
+ {
+ struct igc_adapter *adapter = q_vector->adapter;
+@@ -2715,6 +2722,7 @@ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
+ while (likely(total_packets < budget)) {
+ union igc_adv_rx_desc *desc;
+ struct igc_rx_buffer *bi;
++ struct igc_xdp_buff *ctx;
+ ktime_t timestamp = 0;
+ unsigned int size;
+ int res;
+@@ -2732,6 +2740,9 @@ static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
+
+ bi = &ring->rx_buffer_info[ntc];
+
++ ctx = xsk_buff_to_igc_ctx(bi->xdp);
++ ctx->rx_desc = desc;
++
+ if (igc_test_staterr(desc, IGC_RXDADV_STAT_TSIP)) {
+ timestamp = igc_ptp_rx_pktstamp(q_vector->adapter,
+ bi->xdp->data);
+@@ -2989,8 +3000,8 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
+ time_after(jiffies, tx_buffer->time_stamp +
+ (adapter->tx_timeout_factor * HZ)) &&
+ !(rd32(IGC_STATUS) & IGC_STATUS_TXOFF) &&
+- (rd32(IGC_TDH(tx_ring->reg_idx)) !=
+- readl(tx_ring->tail))) {
++ (rd32(IGC_TDH(tx_ring->reg_idx)) != readl(tx_ring->tail)) &&
++ !tx_ring->oper_gate_closed) {
+ /* detected Tx unit hang */
+ netdev_err(tx_ring->netdev,
+ "Detected Tx Unit Hang\n"
+@@ -6079,7 +6090,10 @@ static int igc_tsn_clear_schedule(struct igc_adapter *adapter)
+
+ adapter->base_time = 0;
+ adapter->cycle_time = NSEC_PER_SEC;
++ adapter->taprio_offload_enable = false;
+ adapter->qbv_config_change_errors = 0;
++ adapter->qbv_transition = false;
++ adapter->qbv_count = 0;
+
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ struct igc_ring *ring = adapter->tx_ring[i];
+@@ -6087,6 +6101,8 @@ static int igc_tsn_clear_schedule(struct igc_adapter *adapter)
+ ring->start_time = 0;
+ ring->end_time = NSEC_PER_SEC;
+ ring->max_sdu = 0;
++ ring->oper_gate_closed = false;
++ ring->admin_gate_closed = false;
+ }
+
+ return 0;
+@@ -6098,18 +6114,20 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
+ bool queue_configured[IGC_MAX_TX_QUEUES] = { };
+ struct igc_hw *hw = &adapter->hw;
+ u32 start_time = 0, end_time = 0;
++ struct timespec64 now;
+ size_t n;
+ int i;
+
+- adapter->qbv_enable = qopt->enable;
+-
+- if (!qopt->enable)
++ if (qopt->cmd == TAPRIO_CMD_DESTROY)
+ return igc_tsn_clear_schedule(adapter);
+
++ if (qopt->cmd != TAPRIO_CMD_REPLACE)
++ return -EOPNOTSUPP;
++
+ if (qopt->base_time < 0)
+ return -ERANGE;
+
+- if (igc_is_device_id_i225(hw) && adapter->base_time)
++ if (igc_is_device_id_i225(hw) && adapter->taprio_offload_enable)
+ return -EALREADY;
+
+ if (!validate_schedule(adapter, qopt))
+@@ -6117,6 +6135,9 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
+
+ adapter->cycle_time = qopt->cycle_time;
+ adapter->base_time = qopt->base_time;
++ adapter->taprio_offload_enable = true;
++
++ igc_ptp_read(adapter, &now);
+
+ for (n = 0; n < qopt->num_entries; n++) {
+ struct tc_taprio_sched_entry *e = &qopt->entries[n];
+@@ -6152,7 +6173,10 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
+ ring->start_time = start_time;
+ ring->end_time = end_time;
+
+- queue_configured[i] = true;
++ if (ring->start_time >= adapter->cycle_time)
++ queue_configured[i] = false;
++ else
++ queue_configured[i] = true;
+ }
+
+ start_time += e->interval;
+@@ -6162,8 +6186,20 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
+ * If not, set the start and end time to be end time.
+ */
+ for (i = 0; i < adapter->num_tx_queues; i++) {
++ struct igc_ring *ring = adapter->tx_ring[i];
++
++ if (!is_base_time_past(qopt->base_time, &now)) {
++ ring->admin_gate_closed = false;
++ } else {
++ ring->oper_gate_closed = false;
++ ring->admin_gate_closed = false;
++ }
++
+ if (!queue_configured[i]) {
+- struct igc_ring *ring = adapter->tx_ring[i];
++ if (!is_base_time_past(qopt->base_time, &now))
++ ring->admin_gate_closed = true;
++ else
++ ring->oper_gate_closed = true;
+
+ ring->start_time = end_time;
+ ring->end_time = end_time;
+@@ -6175,7 +6211,7 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
+ struct net_device *dev = adapter->netdev;
+
+ if (qopt->max_sdu[i])
+- ring->max_sdu = qopt->max_sdu[i] + dev->hard_header_len;
++ ring->max_sdu = qopt->max_sdu[i] + dev->hard_header_len - ETH_TLEN;
+ else
+ ring->max_sdu = 0;
+ }
+@@ -6295,6 +6331,8 @@ static int igc_setup_tc(struct net_device *dev, enum tc_setup_type type,
+ {
+ struct igc_adapter *adapter = netdev_priv(dev);
+
++ adapter->tc_setup_type = type;
++
+ switch (type) {
+ case TC_QUERY_CAPS:
+ return igc_tc_query_caps(adapter, type_data);
+@@ -6487,6 +6525,65 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg)
+ return value;
+ }
+
++/* Mapping HW RSS Type to enum xdp_rss_hash_type */
++static enum xdp_rss_hash_type igc_xdp_rss_type[IGC_RSS_TYPE_MAX_TABLE] = {
++ [IGC_RSS_TYPE_NO_HASH] = XDP_RSS_TYPE_L2,
++ [IGC_RSS_TYPE_HASH_TCP_IPV4] = XDP_RSS_TYPE_L4_IPV4_TCP,
++ [IGC_RSS_TYPE_HASH_IPV4] = XDP_RSS_TYPE_L3_IPV4,
++ [IGC_RSS_TYPE_HASH_TCP_IPV6] = XDP_RSS_TYPE_L4_IPV6_TCP,
++ [IGC_RSS_TYPE_HASH_IPV6_EX] = XDP_RSS_TYPE_L3_IPV6_EX,
++ [IGC_RSS_TYPE_HASH_IPV6] = XDP_RSS_TYPE_L3_IPV6,
++ [IGC_RSS_TYPE_HASH_TCP_IPV6_EX] = XDP_RSS_TYPE_L4_IPV6_TCP_EX,
++ [IGC_RSS_TYPE_HASH_UDP_IPV4] = XDP_RSS_TYPE_L4_IPV4_UDP,
++ [IGC_RSS_TYPE_HASH_UDP_IPV6] = XDP_RSS_TYPE_L4_IPV6_UDP,
++ [IGC_RSS_TYPE_HASH_UDP_IPV6_EX] = XDP_RSS_TYPE_L4_IPV6_UDP_EX,
++ [10] = XDP_RSS_TYPE_NONE, /* RSS Type above 9 "Reserved" by HW */
++ [11] = XDP_RSS_TYPE_NONE, /* keep array sized for SW bit-mask */
++	[12] = XDP_RSS_TYPE_NONE, /* to handle future HW revisions      */
++ [13] = XDP_RSS_TYPE_NONE,
++ [14] = XDP_RSS_TYPE_NONE,
++ [15] = XDP_RSS_TYPE_NONE,
++};
++
++static int igc_xdp_rx_hash(const struct xdp_md *_ctx, u32 *hash,
++ enum xdp_rss_hash_type *rss_type)
++{
++ const struct igc_xdp_buff *ctx = (void *)_ctx;
++
++ if (!(ctx->xdp.rxq->dev->features & NETIF_F_RXHASH))
++ return -ENODATA;
++
++ *hash = le32_to_cpu(ctx->rx_desc->wb.lower.hi_dword.rss);
++ *rss_type = igc_xdp_rss_type[igc_rss_type(ctx->rx_desc)];
++
++ return 0;
++}
++
++static const struct xdp_metadata_ops igc_xdp_metadata_ops = {
++ .xmo_rx_hash = igc_xdp_rx_hash,
++};
++
++static enum hrtimer_restart igc_qbv_scheduling_timer(struct hrtimer *timer)
++{
++ struct igc_adapter *adapter = container_of(timer, struct igc_adapter,
++ hrtimer);
++ unsigned int i;
++
++ adapter->qbv_transition = true;
++ for (i = 0; i < adapter->num_tx_queues; i++) {
++ struct igc_ring *tx_ring = adapter->tx_ring[i];
++
++ if (tx_ring->admin_gate_closed) {
++ tx_ring->admin_gate_closed = false;
++ tx_ring->oper_gate_closed = true;
++ } else {
++ tx_ring->oper_gate_closed = false;
++ }
++ }
++ adapter->qbv_transition = false;
++ return HRTIMER_NORESTART;
++}
++
+ /**
+ * igc_probe - Device Initialization Routine
+ * @pdev: PCI device information struct
+@@ -6560,6 +6657,7 @@ static int igc_probe(struct pci_dev *pdev,
+ hw->hw_addr = adapter->io_addr;
+
+ netdev->netdev_ops = &igc_netdev_ops;
++ netdev->xdp_metadata_ops = &igc_xdp_metadata_ops;
+ igc_ethtool_set_ops(netdev);
+ netdev->watchdog_timeo = 5 * HZ;
+
+@@ -6664,6 +6762,9 @@ static int igc_probe(struct pci_dev *pdev,
+ INIT_WORK(&adapter->reset_task, igc_reset_task);
+ INIT_WORK(&adapter->watchdog_task, igc_watchdog_task);
+
++ hrtimer_init(&adapter->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
++ adapter->hrtimer.function = &igc_qbv_scheduling_timer;
++
+ /* Initialize link properties that are user-changeable */
+ adapter->fc_autoneg = true;
+ hw->mac.autoneg = true;
+@@ -6767,6 +6868,7 @@ static void igc_remove(struct pci_dev *pdev)
+
+ cancel_work_sync(&adapter->reset_task);
+ cancel_work_sync(&adapter->watchdog_task);
++ hrtimer_cancel(&adapter->hrtimer);
+
+ /* Release control of h/w to f/w. If f/w is AMT enabled, this
+ * would have already happened in close and is redundant.
+diff --git a/drivers/net/ethernet/intel/igc/igc_ptp.c b/drivers/net/ethernet/intel/igc/igc_ptp.c
+index 32ef112f8291a..f0b979a706552 100644
+--- a/drivers/net/ethernet/intel/igc/igc_ptp.c
++++ b/drivers/net/ethernet/intel/igc/igc_ptp.c
+@@ -356,16 +356,35 @@ static int igc_ptp_feature_enable_i225(struct ptp_clock_info *ptp,
+ tsim &= ~IGC_TSICR_TT0;
+ }
+ if (on) {
++ struct timespec64 safe_start;
+ int i = rq->perout.index;
+
+ igc_pin_perout(igc, i, pin, use_freq);
+- igc->perout[i].start.tv_sec = rq->perout.start.sec;
++ igc_ptp_read(igc, &safe_start);
++
++		/* PPS output start time is triggered by the Target time (TT)
++ * register. Programming any past time value into TT
++ * register will cause PPS to never start. Need to make
++ * sure we program the TT register a time ahead in
++ * future. There isn't a stringent need to fire PPS out
++ * right away. Adding +2 seconds should take care of
++		 * corner cases. For example, if the SYSTIML is close to
++		 * wrapping and the timer keeps ticking as we program the
++		 * register, adding +2 seconds is a safe bet.
++ */
++ safe_start.tv_sec += 2;
++
++ if (rq->perout.start.sec < safe_start.tv_sec)
++ igc->perout[i].start.tv_sec = safe_start.tv_sec;
++ else
++ igc->perout[i].start.tv_sec = rq->perout.start.sec;
+ igc->perout[i].start.tv_nsec = rq->perout.start.nsec;
+ igc->perout[i].period.tv_sec = ts.tv_sec;
+ igc->perout[i].period.tv_nsec = ts.tv_nsec;
+- wr32(trgttimh, rq->perout.start.sec);
++ wr32(trgttimh, (u32)igc->perout[i].start.tv_sec);
+ /* For now, always select timer 0 as source. */
+- wr32(trgttiml, rq->perout.start.nsec | IGC_TT_IO_TIMER_SEL_SYSTIM0);
++ wr32(trgttiml, (u32)(igc->perout[i].start.tv_nsec |
++ IGC_TT_IO_TIMER_SEL_SYSTIM0));
+ if (use_freq)
+ wr32(freqout, ns);
+ tsauxc |= tsauxc_mask;
+diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
+index 94a2b0dfb54d4..a9c08321aca90 100644
+--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
++++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
+@@ -37,7 +37,7 @@ static unsigned int igc_tsn_new_flags(struct igc_adapter *adapter)
+ {
+ unsigned int new_flags = adapter->flags & ~IGC_FLAG_TSN_ANY_ENABLED;
+
+- if (adapter->qbv_enable)
++ if (adapter->taprio_offload_enable)
+ new_flags |= IGC_FLAG_TSN_QBV_ENABLED;
+
+ if (is_any_launchtime(adapter))
+@@ -114,7 +114,6 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
+ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
+ {
+ struct igc_hw *hw = &adapter->hw;
+- bool tsn_mode_reconfig = false;
+ u32 tqavctrl, baset_l, baset_h;
+ u32 sec, nsec, cycle;
+ ktime_t base_time, systim;
+@@ -133,8 +132,28 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
+ wr32(IGC_STQT(i), ring->start_time);
+ wr32(IGC_ENDQT(i), ring->end_time);
+
+- txqctl |= IGC_TXQCTL_STRICT_CYCLE |
+- IGC_TXQCTL_STRICT_END;
++ if (adapter->taprio_offload_enable) {
++ /* If taprio_offload_enable is set we are in "taprio"
++ * mode and we need to be strict about the
++ * cycles: only transmit a packet if it can be
++ * completed during that cycle.
++ *
++ * If taprio_offload_enable is NOT true when
++ * enabling TSN offload, the cycle should have
++ * no external effects, but is only used internally
++ * to adapt the base time register after a second
++ * has passed.
++ *
++ * Enabling strict mode in this case would
++ * unnecessarily prevent the transmission of
++ * certain packets (i.e. at the boundary of a
++ * second) and thus interfere with the launchtime
++ * feature that promises transmission at a
++ * certain point in time.
++ */
++ txqctl |= IGC_TXQCTL_STRICT_CYCLE |
++ IGC_TXQCTL_STRICT_END;
++ }
+
+ if (ring->launchtime_enable)
+ txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
+@@ -228,11 +247,10 @@ skip_cbs:
+
+ tqavctrl = rd32(IGC_TQAVCTRL) & ~IGC_TQAVCTRL_FUTSCDDIS;
+
+- if (tqavctrl & IGC_TQAVCTRL_TRANSMIT_MODE_TSN)
+- tsn_mode_reconfig = true;
+-
+ tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
+
++ adapter->qbv_count++;
++
+ cycle = adapter->cycle_time;
+ base_time = adapter->base_time;
+
+@@ -249,17 +267,29 @@ skip_cbs:
+ * Gate Control List (GCL) is running.
+ */
+ if ((rd32(IGC_BASET_H) || rd32(IGC_BASET_L)) &&
+- tsn_mode_reconfig)
++ (adapter->tc_setup_type == TC_SETUP_QDISC_TAPRIO) &&
++ (adapter->qbv_count > 1))
+ adapter->qbv_config_change_errors++;
+ } else {
+- /* According to datasheet section 7.5.2.9.3.3, FutScdDis bit
+- * has to be configured before the cycle time and base time.
+- * Tx won't hang if there is a GCL is already running,
+- * so in this case we don't need to set FutScdDis.
+- */
+- if (igc_is_device_id_i226(hw) &&
+- !(rd32(IGC_BASET_H) || rd32(IGC_BASET_L)))
+- tqavctrl |= IGC_TQAVCTRL_FUTSCDDIS;
++ if (igc_is_device_id_i226(hw)) {
++ ktime_t adjust_time, expires_time;
++
++ /* According to datasheet section 7.5.2.9.3.3, FutScdDis bit
++ * has to be configured before the cycle time and base time.
++ * Tx won't hang if a GCL is already running,
++ * so in this case we don't need to set FutScdDis.
++ */
++ if (!(rd32(IGC_BASET_H) || rd32(IGC_BASET_L)))
++ tqavctrl |= IGC_TQAVCTRL_FUTSCDDIS;
++
++ nsec = rd32(IGC_SYSTIML);
++ sec = rd32(IGC_SYSTIMH);
++ systim = ktime_set(sec, nsec);
++
++ adjust_time = adapter->base_time;
++ expires_time = ktime_sub_ns(adjust_time, systim);
++ hrtimer_start(&adapter->hrtimer, expires_time, HRTIMER_MODE_REL);
++ }
+ }
+
+ wr32(IGC_TQAVCTRL, tqavctrl);
+@@ -305,7 +335,11 @@ int igc_tsn_offload_apply(struct igc_adapter *adapter)
+ {
+ struct igc_hw *hw = &adapter->hw;
+
+- if (netif_running(adapter->netdev) && igc_is_device_id_i225(hw)) {
++ /* Per I225/6 HW Design Section 7.5.2.1, transmit mode
++	 * cannot be changed dynamically. This requires resetting the adapter.
++ */
++ if (netif_running(adapter->netdev) &&
++ (igc_is_device_id_i225(hw) || !adapter->qbv_count)) {
+ schedule_work(&adapter->reset_task);
+ return 0;
+ }
+diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
+index 2cad76d0a50ef..4401fad31fb98 100644
+--- a/drivers/net/ethernet/marvell/mvneta.c
++++ b/drivers/net/ethernet/marvell/mvneta.c
+@@ -1505,7 +1505,7 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
+ */
+ if (txq_number == 1)
+ txq_map = (cpu == pp->rxq_def) ?
+- MVNETA_CPU_TXQ_ACCESS(1) : 0;
++ MVNETA_CPU_TXQ_ACCESS(0) : 0;
+
+ } else {
+ txq_map = MVNETA_CPU_TXQ_ACCESS_ALL_MASK;
+@@ -4295,7 +4295,7 @@ static void mvneta_percpu_elect(struct mvneta_port *pp)
+ */
+ if (txq_number == 1)
+ txq_map = (cpu == elected_cpu) ?
+- MVNETA_CPU_TXQ_ACCESS(1) : 0;
++ MVNETA_CPU_TXQ_ACCESS(0) : 0;
+ else
+ txq_map = mvreg_read(pp, MVNETA_CPU_MAP(cpu)) &
+ MVNETA_CPU_TXQ_ACCESS_ALL_MASK;
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/ptp.c b/drivers/net/ethernet/marvell/octeontx2/af/ptp.c
+index 3411e2e47d46b..0ee420a489fc4 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/ptp.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/ptp.c
+@@ -208,7 +208,7 @@ struct ptp *ptp_get(void)
+ /* Check driver is bound to PTP block */
+ if (!ptp)
+ ptp = ERR_PTR(-EPROBE_DEFER);
+- else
++ else if (!IS_ERR(ptp))
+ pci_dev_get(ptp->pdev);
+
+ return ptp;
+@@ -388,11 +388,10 @@ static int ptp_extts_on(struct ptp *ptp, int on)
+ static int ptp_probe(struct pci_dev *pdev,
+ const struct pci_device_id *ent)
+ {
+- struct device *dev = &pdev->dev;
+ struct ptp *ptp;
+ int err;
+
+- ptp = devm_kzalloc(dev, sizeof(*ptp), GFP_KERNEL);
++ ptp = kzalloc(sizeof(*ptp), GFP_KERNEL);
+ if (!ptp) {
+ err = -ENOMEM;
+ goto error;
+@@ -428,20 +427,19 @@ static int ptp_probe(struct pci_dev *pdev,
+ return 0;
+
+ error_free:
+- devm_kfree(dev, ptp);
++ kfree(ptp);
+
+ error:
+ /* For `ptp_get()` we need to differentiate between the case
+ * when the core has not tried to probe this device and the case when
+- * the probe failed. In the later case we pretend that the
+- * initialization was successful and keep the error in
++	 * the probe failed. In the latter case we keep the error in
+ * `dev->driver_data`.
+ */
+ pci_set_drvdata(pdev, ERR_PTR(err));
+ if (!first_ptp_block)
+ first_ptp_block = ERR_PTR(err);
+
+- return 0;
++ return err;
+ }
+
+ static void ptp_remove(struct pci_dev *pdev)
+@@ -449,16 +447,17 @@ static void ptp_remove(struct pci_dev *pdev)
+ struct ptp *ptp = pci_get_drvdata(pdev);
+ u64 clock_cfg;
+
+- if (cn10k_ptp_errata(ptp) && hrtimer_active(&ptp->hrtimer))
+- hrtimer_cancel(&ptp->hrtimer);
+-
+ if (IS_ERR_OR_NULL(ptp))
+ return;
+
++ if (cn10k_ptp_errata(ptp) && hrtimer_active(&ptp->hrtimer))
++ hrtimer_cancel(&ptp->hrtimer);
++
+ /* Disable PTP clock */
+ clock_cfg = readq(ptp->reg_base + PTP_CLOCK_CFG);
+ clock_cfg &= ~PTP_CLOCK_CFG_PTP_EN;
+ writeq(clock_cfg, ptp->reg_base + PTP_CLOCK_CFG);
++ kfree(ptp);
+ }
+
+ static const struct pci_device_id ptp_id_table[] = {
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+index b26b013216933..73932e2755bca 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
+@@ -3253,7 +3253,7 @@ static int rvu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ rvu->ptp = ptp_get();
+ if (IS_ERR(rvu->ptp)) {
+ err = PTR_ERR(rvu->ptp);
+- if (err == -EPROBE_DEFER)
++ if (err)
+ goto err_release_regions;
+ rvu->ptp = NULL;
+ }
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+index f01d057ad025a..8cdf91a5bf44f 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+@@ -3815,21 +3815,14 @@ int rvu_mbox_handler_nix_set_rx_mode(struct rvu *rvu, struct nix_rx_mode *req,
+ }
+
+ /* install/uninstall promisc entry */
+- if (promisc) {
++ if (promisc)
+ rvu_npc_install_promisc_entry(rvu, pcifunc, nixlf,
+ pfvf->rx_chan_base,
+ pfvf->rx_chan_cnt);
+-
+- if (rvu_npc_exact_has_match_table(rvu))
+- rvu_npc_exact_promisc_enable(rvu, pcifunc);
+- } else {
++ else
+ if (!nix_rx_multicast)
+ rvu_npc_enable_promisc_entry(rvu, pcifunc, nixlf, false);
+
+- if (rvu_npc_exact_has_match_table(rvu))
+- rvu_npc_exact_promisc_disable(rvu, pcifunc);
+- }
+-
+ return 0;
+ }
+
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c
+index 9f11c1e407373..6fe67f3a7f6f1 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c
+@@ -1164,8 +1164,10 @@ static u16 __rvu_npc_exact_cmd_rules_cnt_update(struct rvu *rvu, int drop_mcam_i
+ {
+ struct npc_exact_table *table;
+ u16 *cnt, old_cnt;
++ bool promisc;
+
+ table = rvu->hw->table;
++ promisc = table->promisc_mode[drop_mcam_idx];
+
+ cnt = &table->cnt_cmd_rules[drop_mcam_idx];
+ old_cnt = *cnt;
+@@ -1177,13 +1179,18 @@ static u16 __rvu_npc_exact_cmd_rules_cnt_update(struct rvu *rvu, int drop_mcam_i
+
+ *enable_or_disable_cam = false;
+
+- /* If all rules are deleted, disable cam */
++ if (promisc)
++ goto done;
++
++	/* If all rules are deleted and not already in promisc mode,
++ * disable cam
++ */
+ if (!*cnt && val < 0) {
+ *enable_or_disable_cam = true;
+ goto done;
+ }
+
+- /* If rule got added, enable cam */
++	/* If a rule got added and not already in promisc mode, enable cam */
+ if (!old_cnt && val > 0) {
+ *enable_or_disable_cam = true;
+ goto done;
+@@ -1462,6 +1469,12 @@ int rvu_npc_exact_promisc_disable(struct rvu *rvu, u16 pcifunc)
+ *promisc = false;
+ mutex_unlock(&table->lock);
+
++ /* Enable drop rule */
++ rvu_npc_enable_mcam_by_entry_index(rvu, drop_mcam_idx, NIX_INTF_RX,
++ true);
++
++ dev_dbg(rvu->dev, "%s: disabled promisc mode (cgx=%d lmac=%d)\n",
++ __func__, cgx_id, lmac_id);
+ return 0;
+ }
+
+@@ -1503,6 +1516,12 @@ int rvu_npc_exact_promisc_enable(struct rvu *rvu, u16 pcifunc)
+ *promisc = true;
+ mutex_unlock(&table->lock);
+
++ /* disable drop rule */
++ rvu_npc_enable_mcam_by_entry_index(rvu, drop_mcam_idx, NIX_INTF_RX,
++ false);
++
++ dev_dbg(rvu->dev, "%s: Enabled promisc mode (cgx=%d lmac=%d)\n",
++ __func__, cgx_id, lmac_id);
+ return 0;
+ }
+
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
+index 10e11262d48a0..2d7713a1a1539 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_flows.c
+@@ -872,6 +872,14 @@ static int otx2_prepare_flow_request(struct ethtool_rx_flow_spec *fsp,
+ return -EINVAL;
+
+ vlan_etype = be16_to_cpu(fsp->h_ext.vlan_etype);
++
++ /* Drop rule with vlan_etype == 802.1Q
++ * and vlan_id == 0 is not supported
++ */
++ if (vlan_etype == ETH_P_8021Q && !fsp->m_ext.vlan_tci &&
++ fsp->ring_cookie == RX_CLS_FLOW_DISC)
++ return -EINVAL;
++
+ /* Only ETH_P_8021Q and ETH_P_802AD types supported */
+ if (vlan_etype != ETH_P_8021Q &&
+ vlan_etype != ETH_P_8021AD)
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c
+index 8392f63e433fc..293bd3f29b077 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c
+@@ -604,6 +604,21 @@ static int otx2_tc_prepare_flow(struct otx2_nic *nic, struct otx2_tc_flow *node,
+ return -EOPNOTSUPP;
+ }
+
++ if (!match.mask->vlan_id) {
++ struct flow_action_entry *act;
++ int i;
++
++ flow_action_for_each(i, act, &rule->action) {
++ if (act->id == FLOW_ACTION_DROP) {
++ netdev_err(nic->netdev,
++ "vlan tpid 0x%x with vlan_id %d is not supported for DROP rule.\n",
++ ntohs(match.key->vlan_tpid),
++ match.key->vlan_id);
++ return -EOPNOTSUPP;
++ }
++ }
++ }
++
+ if (match.mask->vlan_id ||
+ match.mask->vlan_dei ||
+ match.mask->vlan_priority) {
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c b/drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c
+index 03cb79adf912f..be83ad9db82a4 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/fs_tt_redirect.c
+@@ -594,7 +594,7 @@ int mlx5e_fs_tt_redirect_any_create(struct mlx5e_flow_steering *fs)
+
+ err = fs_any_create_table(fs);
+ if (err)
+- return err;
++ goto err_free_any;
+
+ err = fs_any_enable(fs);
+ if (err)
+@@ -606,8 +606,8 @@ int mlx5e_fs_tt_redirect_any_create(struct mlx5e_flow_steering *fs)
+
+ err_destroy_table:
+ fs_any_destroy_table(fs_any);
+-
+- kfree(fs_any);
++err_free_any:
+ mlx5e_fs_set_any(fs, NULL);
++ kfree(fs_any);
+ return err;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+index 3cbebfba582bd..b0b429a0321ed 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c
+@@ -729,8 +729,10 @@ int mlx5e_ptp_open(struct mlx5e_priv *priv, struct mlx5e_params *params,
+
+ c = kvzalloc_node(sizeof(*c), GFP_KERNEL, dev_to_node(mlx5_core_dma_dev(mdev)));
+ cparams = kvzalloc(sizeof(*cparams), GFP_KERNEL);
+- if (!c || !cparams)
+- return -ENOMEM;
++ if (!c || !cparams) {
++ err = -ENOMEM;
++ goto err_free;
++ }
+
+ c->priv = priv;
+ c->mdev = priv->mdev;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+index a254e728ac954..fadfa8b50bebe 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+@@ -1545,7 +1545,8 @@ mlx5_tc_ct_parse_action(struct mlx5_tc_ct_priv *priv,
+
+ attr->ct_attr.ct_action |= act->ct.action; /* So we can have clear + ct */
+ attr->ct_attr.zone = act->ct.zone;
+- attr->ct_attr.nf_ft = act->ct.flow_table;
++ if (!(act->ct.action & TCA_CT_ACT_CLEAR))
++ attr->ct_attr.nf_ft = act->ct.flow_table;
+ attr->ct_attr.act_miss_cookie = act->miss_cookie;
+
+ return 0;
+@@ -1990,6 +1991,9 @@ mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv, struct mlx5_flow_attr *att
+ if (!priv)
+ return -EOPNOTSUPP;
+
++ if (attr->ct_attr.offloaded)
++ return 0;
++
+ if (attr->ct_attr.ct_action & TCA_CT_ACT_CLEAR) {
+ err = mlx5_tc_ct_entry_set_registers(priv, &attr->parse_attr->mod_hdr_acts,
+ 0, 0, 0, 0);
+@@ -1999,11 +2003,15 @@ mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv, struct mlx5_flow_attr *att
+ attr->action |= MLX5_FLOW_CONTEXT_ACTION_MOD_HDR;
+ }
+
+- if (!attr->ct_attr.nf_ft) /* means only ct clear action, and not ct_clear,ct() */
++ if (!attr->ct_attr.nf_ft) { /* means only ct clear action, and not ct_clear,ct() */
++ attr->ct_attr.offloaded = true;
+ return 0;
++ }
+
+ mutex_lock(&priv->control_lock);
+ err = __mlx5_tc_ct_flow_offload(priv, attr);
++ if (!err)
++ attr->ct_attr.offloaded = true;
+ mutex_unlock(&priv->control_lock);
+
+ return err;
+@@ -2021,7 +2029,7 @@ void
+ mlx5_tc_ct_delete_flow(struct mlx5_tc_ct_priv *priv,
+ struct mlx5_flow_attr *attr)
+ {
+- if (!attr->ct_attr.ft) /* no ct action, return */
++ if (!attr->ct_attr.offloaded) /* no ct action, return */
+ return;
+ if (!attr->ct_attr.nf_ft) /* means only ct clear action, and not ct_clear,ct() */
+ return;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h
+index 8e9316fa46d4b..b66c5f98067f7 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h
+@@ -29,6 +29,7 @@ struct mlx5_ct_attr {
+ u32 ct_labels_id;
+ u32 act_miss_mapping;
+ u64 act_miss_cookie;
++ bool offloaded;
+ struct mlx5_ct_ft *ft;
+ };
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+index f0e6095809faf..40589cebb7730 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+@@ -662,8 +662,7 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq,
+ /* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE)
+ * as we know this is a page_pool page.
+ */
+- page_pool_put_defragged_page(page->pp,
+- page, -1, true);
++ page_pool_recycle_direct(page->pp, page);
+ } while (++n < num);
+
+ break;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c
+index 88a5aed9d6781..c7d191f66ad1b 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c
+@@ -190,6 +190,7 @@ static int accel_fs_tcp_create_groups(struct mlx5e_flow_table *ft,
+ in = kvzalloc(inlen, GFP_KERNEL);
+ if (!in || !ft->g) {
+ kfree(ft->g);
++ ft->g = NULL;
+ kvfree(in);
+ return -ENOMEM;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+index 69634829558e2..08e08489f4220 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+@@ -390,10 +390,18 @@ static void mlx5e_dealloc_rx_wqe(struct mlx5e_rq *rq, u16 ix)
+ {
+ struct mlx5e_wqe_frag_info *wi = get_frag(rq, ix);
+
+- if (rq->xsk_pool)
++ if (rq->xsk_pool) {
+ mlx5e_xsk_free_rx_wqe(wi);
+- else
++ } else {
+ mlx5e_free_rx_wqe(rq, wi);
++
++ /* Avoid a second release of the wqe pages: dealloc is called
++ * for the same missing wqes on regular RQ flush and on regular
++ * RQ close. This happens when XSK RQs come into play.
++ */
++ for (int i = 0; i < rq->wqe.info.num_frags; i++, wi++)
++ wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
++ }
+ }
+
+ static void mlx5e_xsk_free_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk)
+@@ -1745,11 +1753,11 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
+
+ prog = rcu_dereference(rq->xdp_prog);
+ if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) {
+- if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
++ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
+ struct mlx5e_wqe_frag_info *pwi;
+
+ for (pwi = head_wi; pwi < wi; pwi++)
+- pwi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
++ pwi->frag_page->frags++;
+ }
+ return NULL; /* page/packet was consumed by XDP */
+ }
+@@ -1819,12 +1827,8 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
+ rq, wi, cqe, cqe_bcnt);
+ if (!skb) {
+ /* probably for XDP */
+- if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
+- /* do not return page to cache,
+- * it will be returned on XDP_TX completion.
+- */
+- wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
+- }
++ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
++ wi->frag_page->frags++;
+ goto wq_cyc_pop;
+ }
+
+@@ -1870,12 +1874,8 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe)
+ rq, wi, cqe, cqe_bcnt);
+ if (!skb) {
+ /* probably for XDP */
+- if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
+- /* do not return page to cache,
+- * it will be returned on XDP_TX completion.
+- */
+- wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
+- }
++ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
++ wi->frag_page->frags++;
+ goto wq_cyc_pop;
+ }
+
+@@ -2054,12 +2054,12 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
+ if (prog) {
+ if (mlx5e_xdp_handle(rq, prog, &mxbuf)) {
+ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
+- int i;
++ struct mlx5e_frag_page *pfp;
++
++ for (pfp = head_page; pfp < frag_page; pfp++)
++ pfp->frags++;
+
+- for (i = 0; i < sinfo->nr_frags; i++)
+- /* non-atomic */
+- __set_bit(page_idx + i, wi->skip_release_bitmap);
+- return NULL;
++ wi->linear_page.frags++;
+ }
+ mlx5e_page_release_fragmented(rq, &wi->linear_page);
+ return NULL; /* page/packet was consumed by XDP */
+@@ -2157,7 +2157,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
+ cqe_bcnt, &mxbuf);
+ if (mlx5e_xdp_handle(rq, prog, &mxbuf)) {
+ if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))
+- __set_bit(page_idx, wi->skip_release_bitmap); /* non-atomic */
++ frag_page->frags++;
+ return NULL; /* page/packet was consumed by XDP */
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+index b9b1da751a3b8..ed05ac8ae1de5 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+@@ -1639,7 +1639,8 @@ static void remove_unready_flow(struct mlx5e_tc_flow *flow)
+ uplink_priv = &rpriv->uplink_priv;
+
+ mutex_lock(&uplink_priv->unready_flows_lock);
+- unready_flow_del(flow);
++ if (flow_flag_test(flow, NOT_READY))
++ unready_flow_del(flow);
+ mutex_unlock(&uplink_priv->unready_flows_lock);
+ }
+
+@@ -1932,8 +1933,7 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv,
+ esw_attr = attr->esw_attr;
+ mlx5e_put_flow_tunnel_id(flow);
+
+- if (flow_flag_test(flow, NOT_READY))
+- remove_unready_flow(flow);
++ remove_unready_flow(flow);
+
+ if (mlx5e_is_offloaded_flow(flow)) {
+ if (flow_flag_test(flow, SLOW))
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+index 901c53751b0aa..f81c6d8d5e0f4 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+@@ -800,6 +800,9 @@ static int mlx5_esw_vport_caps_get(struct mlx5_eswitch *esw, struct mlx5_vport *
+ hca_caps = MLX5_ADDR_OF(query_hca_cap_out, query_ctx, capability);
+ vport->info.roce_enabled = MLX5_GET(cmd_hca_cap, hca_caps, roce);
+
++ if (!MLX5_CAP_GEN_MAX(esw->dev, hca_cap_2))
++ goto out_free;
++
+ memset(query_ctx, 0, query_out_sz);
+ err = mlx5_vport_get_other_func_cap(esw->dev, vport->vport, query_ctx,
+ MLX5_CAP_GENERAL_2);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/thermal.c b/drivers/net/ethernet/mellanox/mlx5/core/thermal.c
+index e47fa6fb836f1..89a22ff04cb60 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/thermal.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/thermal.c
+@@ -68,14 +68,19 @@ static struct thermal_zone_device_ops mlx5_thermal_ops = {
+
+ int mlx5_thermal_init(struct mlx5_core_dev *mdev)
+ {
++ char data[THERMAL_NAME_LENGTH];
+ struct mlx5_thermal *thermal;
+- struct thermal_zone_device *tzd;
+- const char *data = "mlx5";
++ int err;
+
+- tzd = thermal_zone_get_zone_by_name(data);
+- if (!IS_ERR(tzd))
++ if (!mlx5_core_is_pf(mdev) && !mlx5_core_is_ecpf(mdev))
+ return 0;
+
++ err = snprintf(data, sizeof(data), "mlx5_%s", dev_name(mdev->device));
++ if (err < 0 || err >= sizeof(data)) {
++ mlx5_core_err(mdev, "Failed to setup thermal zone name, %d\n", err);
++ return -EINVAL;
++ }
++
+ thermal = kzalloc(sizeof(*thermal), GFP_KERNEL);
+ if (!thermal)
+ return -ENOMEM;
+@@ -88,10 +93,10 @@ int mlx5_thermal_init(struct mlx5_core_dev *mdev)
+ &mlx5_thermal_ops,
+ NULL, 0, MLX5_THERMAL_POLL_INT_MSEC);
+ if (IS_ERR(thermal->tzdev)) {
+- dev_err(mdev->device, "Failed to register thermal zone device (%s) %ld\n",
+- data, PTR_ERR(thermal->tzdev));
++ err = PTR_ERR(thermal->tzdev);
++ mlx5_core_err(mdev, "Failed to register thermal zone device (%s) %d\n", data, err);
+ kfree(thermal);
+- return -EINVAL;
++ return err;
+ }
+
+ mdev->thermal = thermal;
+diff --git a/drivers/net/ethernet/microchip/Kconfig b/drivers/net/ethernet/microchip/Kconfig
+index 24c994baad135..329e374b9539c 100644
+--- a/drivers/net/ethernet/microchip/Kconfig
++++ b/drivers/net/ethernet/microchip/Kconfig
+@@ -46,7 +46,7 @@ config LAN743X
+ tristate "LAN743x support"
+ depends on PCI
+ depends on PTP_1588_CLOCK_OPTIONAL
+- select PHYLIB
++ select FIXED_PHY
+ select CRC16
+ select CRC32
+ help
+diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
+index 957d96a91a8af..8a991caf51612 100644
+--- a/drivers/net/ethernet/microchip/lan743x_main.c
++++ b/drivers/net/ethernet/microchip/lan743x_main.c
+@@ -144,6 +144,18 @@ static int lan743x_csr_light_reset(struct lan743x_adapter *adapter)
+ !(data & HW_CFG_LRST_), 100000, 10000000);
+ }
+
++static int lan743x_csr_wait_for_bit_atomic(struct lan743x_adapter *adapter,
++ int offset, u32 bit_mask,
++ int target_value, int udelay_min,
++ int udelay_max, int count)
++{
++ u32 data;
++
++ return readx_poll_timeout_atomic(LAN743X_CSR_READ_OP, offset, data,
++ target_value == !!(data & bit_mask),
++ udelay_max, udelay_min * count);
++}
++
+ static int lan743x_csr_wait_for_bit(struct lan743x_adapter *adapter,
+ int offset, u32 bit_mask,
+ int target_value, int usleep_min,
+@@ -746,8 +758,8 @@ static int lan743x_dp_write(struct lan743x_adapter *adapter,
+ u32 dp_sel;
+ int i;
+
+- if (lan743x_csr_wait_for_bit(adapter, DP_SEL, DP_SEL_DPRDY_,
+- 1, 40, 100, 100))
++ if (lan743x_csr_wait_for_bit_atomic(adapter, DP_SEL, DP_SEL_DPRDY_,
++ 1, 40, 100, 100))
+ return -EIO;
+ dp_sel = lan743x_csr_read(adapter, DP_SEL);
+ dp_sel &= ~DP_SEL_MASK_;
+@@ -758,8 +770,9 @@ static int lan743x_dp_write(struct lan743x_adapter *adapter,
+ lan743x_csr_write(adapter, DP_ADDR, addr + i);
+ lan743x_csr_write(adapter, DP_DATA_0, buf[i]);
+ lan743x_csr_write(adapter, DP_CMD, DP_CMD_WRITE_);
+- if (lan743x_csr_wait_for_bit(adapter, DP_SEL, DP_SEL_DPRDY_,
+- 1, 40, 100, 100))
++ if (lan743x_csr_wait_for_bit_atomic(adapter, DP_SEL,
++ DP_SEL_DPRDY_,
++ 1, 40, 100, 100))
+ return -EIO;
+ }
+
+diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c b/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c
+index cf0cc7562d042..ee652f2d23595 100644
+--- a/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c
++++ b/drivers/net/ethernet/microchip/lan966x/lan966x_tc.c
+@@ -21,8 +21,14 @@ static int lan966x_tc_setup_qdisc_mqprio(struct lan966x_port *port,
+ static int lan966x_tc_setup_qdisc_taprio(struct lan966x_port *port,
+ struct tc_taprio_qopt_offload *taprio)
+ {
+- return taprio->enable ? lan966x_taprio_add(port, taprio) :
+- lan966x_taprio_del(port);
++ switch (taprio->cmd) {
++ case TAPRIO_CMD_REPLACE:
++ return lan966x_taprio_add(port, taprio);
++ case TAPRIO_CMD_DESTROY:
++ return lan966x_taprio_del(port);
++ default:
++ return -EOPNOTSUPP;
++ }
+ }
+
+ static int lan966x_tc_setup_qdisc_tbf(struct lan966x_port *port,
+diff --git a/drivers/net/ethernet/mscc/ocelot_mm.c b/drivers/net/ethernet/mscc/ocelot_mm.c
+index fb3145118d686..99b29d1e62449 100644
+--- a/drivers/net/ethernet/mscc/ocelot_mm.c
++++ b/drivers/net/ethernet/mscc/ocelot_mm.c
+@@ -67,10 +67,13 @@ void ocelot_port_update_active_preemptible_tcs(struct ocelot *ocelot, int port)
+ val = mm->preemptible_tcs;
+
+ /* Cut through switching doesn't work for preemptible priorities,
+- * so first make sure it is disabled.
++ * so first make sure it is disabled. Also, changing the preemptible
++ * TCs affects the oversized frame dropping logic, so that needs to be
++ * re-triggered. And since tas_guard_bands_update() also implicitly
++ * calls cut_through_fwd(), we don't need to explicitly call it.
+ */
+ mm->active_preemptible_tcs = val;
+- ocelot->ops->cut_through_fwd(ocelot);
++ ocelot->ops->tas_guard_bands_update(ocelot, port);
+
+ dev_dbg(ocelot->dev,
+ "port %d %s/%s, MM TX %s, preemptible TCs 0x%x, active 0x%x\n",
+diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+index 62f0bf91d1e1e..52bf8928571d1 100644
+--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
++++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+@@ -53,6 +53,8 @@
+ #include "crypto/crypto.h"
+ #include "crypto/fw.h"
+
++static int nfp_net_mc_unsync(struct net_device *netdev, const unsigned char *addr);
++
+ /**
+ * nfp_net_get_fw_version() - Read and parse the FW version
+ * @fw_ver: Output fw_version structure to read to
+@@ -1084,6 +1086,9 @@ static int nfp_net_netdev_close(struct net_device *netdev)
+
+ /* Step 2: Tell NFP
+ */
++ if (nn->cap_w1 & NFP_NET_CFG_CTRL_MCAST_FILTER)
++ __dev_mc_unsync(netdev, nfp_net_mc_unsync);
++
+ nfp_net_clear_config_and_disable(nn);
+ nfp_port_configure(netdev, false);
+
+diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+index 957027e546b30..e03a94f2469ab 100644
+--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
++++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+@@ -474,11 +474,6 @@ static void ionic_qcqs_free(struct ionic_lif *lif)
+ static void ionic_link_qcq_interrupts(struct ionic_qcq *src_qcq,
+ struct ionic_qcq *n_qcq)
+ {
+- if (WARN_ON(n_qcq->flags & IONIC_QCQ_F_INTR)) {
+- ionic_intr_free(n_qcq->cq.lif->ionic, n_qcq->intr.index);
+- n_qcq->flags &= ~IONIC_QCQ_F_INTR;
+- }
+-
+ n_qcq->intr.vector = src_qcq->intr.vector;
+ n_qcq->intr.index = src_qcq->intr.index;
+ n_qcq->napi_qcq = src_qcq->napi_qcq;
+diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
+index 9d55226479b4a..ac41ef4cbd2f0 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
++++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
+@@ -966,8 +966,11 @@ static int tc_setup_taprio(struct stmmac_priv *priv,
+ return -EOPNOTSUPP;
+ }
+
+- if (!qopt->enable)
++ if (qopt->cmd == TAPRIO_CMD_DESTROY)
+ goto disable;
++ else if (qopt->cmd != TAPRIO_CMD_REPLACE)
++ return -EOPNOTSUPP;
++
+ if (qopt->num_entries >= dep)
+ return -EINVAL;
+ if (!qopt->cycle_time)
+@@ -988,7 +991,7 @@ static int tc_setup_taprio(struct stmmac_priv *priv,
+
+ mutex_lock(&priv->plat->est->lock);
+ priv->plat->est->gcl_size = size;
+- priv->plat->est->enable = qopt->enable;
++ priv->plat->est->enable = qopt->cmd == TAPRIO_CMD_REPLACE;
+ mutex_unlock(&priv->plat->est->lock);
+
+ for (i = 0; i < size; i++) {
+diff --git a/drivers/net/ethernet/ti/am65-cpsw-qos.c b/drivers/net/ethernet/ti/am65-cpsw-qos.c
+index 3a908db6e5b22..eced87fa261c9 100644
+--- a/drivers/net/ethernet/ti/am65-cpsw-qos.c
++++ b/drivers/net/ethernet/ti/am65-cpsw-qos.c
+@@ -450,7 +450,7 @@ static int am65_cpsw_configure_taprio(struct net_device *ndev,
+
+ am65_cpsw_est_update_state(ndev);
+
+- if (!est_new->taprio.enable) {
++ if (est_new->taprio.cmd == TAPRIO_CMD_DESTROY) {
+ am65_cpsw_stop_est(ndev);
+ return ret;
+ }
+@@ -476,7 +476,7 @@ static int am65_cpsw_configure_taprio(struct net_device *ndev,
+ am65_cpsw_est_set_sched_list(ndev, est_new);
+ am65_cpsw_port_est_assign_buf_num(ndev, est_new->buf);
+
+- am65_cpsw_est_set(ndev, est_new->taprio.enable);
++ am65_cpsw_est_set(ndev, est_new->taprio.cmd == TAPRIO_CMD_REPLACE);
+
+ if (tact == TACT_PROG) {
+ ret = am65_cpsw_timer_set(ndev, est_new);
+@@ -520,7 +520,7 @@ static int am65_cpsw_set_taprio(struct net_device *ndev, void *type_data)
+ am65_cpsw_cp_taprio(taprio, &est_new->taprio);
+ ret = am65_cpsw_configure_taprio(ndev, est_new);
+ if (!ret) {
+- if (taprio->enable) {
++ if (taprio->cmd == TAPRIO_CMD_REPLACE) {
+ devm_kfree(&ndev->dev, port->qos.est_admin);
+
+ port->qos.est_admin = est_new;
+@@ -564,8 +564,13 @@ purge_est:
+ static int am65_cpsw_setup_taprio(struct net_device *ndev, void *type_data)
+ {
+ struct am65_cpsw_port *port = am65_ndev_to_port(ndev);
++ struct tc_taprio_qopt_offload *taprio = type_data;
+ struct am65_cpsw_common *common = port->common;
+
++ if (taprio->cmd != TAPRIO_CMD_REPLACE &&
++ taprio->cmd != TAPRIO_CMD_DESTROY)
++ return -EOPNOTSUPP;
++
+ if (!IS_ENABLED(CONFIG_TI_AM65_CPSW_TAS))
+ return -ENODEV;
+
+diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c
+index ebc46f3be0569..fc37af2e71ffc 100644
+--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c
++++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_hw.c
+@@ -196,9 +196,6 @@ static int txgbe_calc_eeprom_checksum(struct wx *wx, u16 *checksum)
+ if (eeprom_ptrs)
+ kvfree(eeprom_ptrs);
+
+- if (*checksum > TXGBE_EEPROM_SUM)
+- return -EINVAL;
+-
+ *checksum = TXGBE_EEPROM_SUM - *checksum;
+
+ return 0;
+diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
+index 6045bece2654d..b4d3b9cde8bd6 100644
+--- a/drivers/net/netdevsim/dev.c
++++ b/drivers/net/netdevsim/dev.c
+@@ -184,13 +184,10 @@ static ssize_t nsim_dev_trap_fa_cookie_write(struct file *file,
+ cookie_len = (count - 1) / 2;
+ if ((count - 1) % 2)
+ return -EINVAL;
+- buf = kmalloc(count, GFP_KERNEL | __GFP_NOWARN);
+- if (!buf)
+- return -ENOMEM;
+
+- ret = simple_write_to_buffer(buf, count, ppos, data, count);
+- if (ret < 0)
+- goto free_buf;
++ buf = memdup_user(data, count);
++ if (IS_ERR(buf))
++ return PTR_ERR(buf);
+
+ fa_cookie = kmalloc(sizeof(*fa_cookie) + cookie_len,
+ GFP_KERNEL | __GFP_NOWARN);
+diff --git a/drivers/net/phy/dp83td510.c b/drivers/net/phy/dp83td510.c
+index 3cd9a77f95324..d7616b13c5946 100644
+--- a/drivers/net/phy/dp83td510.c
++++ b/drivers/net/phy/dp83td510.c
+@@ -12,6 +12,11 @@
+
+ /* MDIO_MMD_VEND2 registers */
+ #define DP83TD510E_PHY_STS 0x10
++/* Bit 7 - mii_interrupt, active high. Clears on read.
++ * Note: Clearing does not necessarily deactivate IRQ pin if interrupts pending.
++ * This differs from the DP83TD510E datasheet (2020) which states this bit
++ * clears on write 0.
++ */
+ #define DP83TD510E_STS_MII_INT BIT(7)
+ #define DP83TD510E_LINK_STATUS BIT(0)
+
+@@ -53,12 +58,6 @@ static int dp83td510_config_intr(struct phy_device *phydev)
+ int ret;
+
+ if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
+- /* Clear any pending interrupts */
+- ret = phy_write_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_PHY_STS,
+- 0x0);
+- if (ret)
+- return ret;
+-
+ ret = phy_write_mmd(phydev, MDIO_MMD_VEND2,
+ DP83TD510E_INTERRUPT_REG_1,
+ DP83TD510E_INT1_LINK_EN);
+@@ -81,10 +80,6 @@ static int dp83td510_config_intr(struct phy_device *phydev)
+ DP83TD510E_GENCFG_INT_EN);
+ if (ret)
+ return ret;
+-
+- /* Clear any pending interrupts */
+- ret = phy_write_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_PHY_STS,
+- 0x0);
+ }
+
+ return ret;
+@@ -94,14 +89,6 @@ static irqreturn_t dp83td510_handle_interrupt(struct phy_device *phydev)
+ {
+ int ret;
+
+- ret = phy_read_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_PHY_STS);
+- if (ret < 0) {
+- phy_error(phydev);
+- return IRQ_NONE;
+- } else if (!(ret & DP83TD510E_STS_MII_INT)) {
+- return IRQ_NONE;
+- }
+-
+ /* Read the current enabled interrupts */
+ ret = phy_read_mmd(phydev, MDIO_MMD_VEND2, DP83TD510E_INTERRUPT_REG_1);
+ if (ret < 0) {
+diff --git a/drivers/net/wireless/cisco/airo.c b/drivers/net/wireless/cisco/airo.c
+index 7c4cc5f5e1eb4..dbd13f7aa3e6e 100644
+--- a/drivers/net/wireless/cisco/airo.c
++++ b/drivers/net/wireless/cisco/airo.c
+@@ -6157,8 +6157,11 @@ static int airo_get_rate(struct net_device *dev,
+ struct iw_param *vwrq = &wrqu->bitrate;
+ struct airo_info *local = dev->ml_priv;
+ StatusRid status_rid; /* Card status info */
++ int ret;
+
+- readStatusRid(local, &status_rid, 1);
++ ret = readStatusRid(local, &status_rid, 1);
++ if (ret)
++ return -EBUSY;
+
+ vwrq->value = le16_to_cpu(status_rid.currentXmitRate) * 500000;
+ /* If more than one rate, set auto */
+diff --git a/drivers/net/wireless/realtek/rtw89/debug.c b/drivers/net/wireless/realtek/rtw89/debug.c
+index 1e5b7a9987163..858494ddfb12e 100644
+--- a/drivers/net/wireless/realtek/rtw89/debug.c
++++ b/drivers/net/wireless/realtek/rtw89/debug.c
+@@ -2998,17 +2998,18 @@ static ssize_t rtw89_debug_priv_send_h2c_set(struct file *filp,
+ struct rtw89_debugfs_priv *debugfs_priv = filp->private_data;
+ struct rtw89_dev *rtwdev = debugfs_priv->rtwdev;
+ u8 *h2c;
++ int ret;
+ u16 h2c_len = count / 2;
+
+ h2c = rtw89_hex2bin_user(rtwdev, user_buf, count);
+ if (IS_ERR(h2c))
+ return -EFAULT;
+
+- rtw89_fw_h2c_raw(rtwdev, h2c, h2c_len);
++ ret = rtw89_fw_h2c_raw(rtwdev, h2c, h2c_len);
+
+ kfree(h2c);
+
+- return count;
++ return ret ? ret : count;
+ }
+
+ static int
+diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
+index 04550b1f984c6..730f2103b91d1 100644
+--- a/drivers/ntb/hw/amd/ntb_hw_amd.c
++++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
+@@ -1338,12 +1338,17 @@ static struct pci_driver amd_ntb_pci_driver = {
+
+ static int __init amd_ntb_pci_driver_init(void)
+ {
++ int ret;
+ pr_info("%s %s\n", NTB_DESC, NTB_VER);
+
+ if (debugfs_initialized())
+ debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);
+
+- return pci_register_driver(&amd_ntb_pci_driver);
++ ret = pci_register_driver(&amd_ntb_pci_driver);
++ if (ret)
++ debugfs_remove_recursive(debugfs_dir);
++
++ return ret;
+ }
+ module_init(amd_ntb_pci_driver_init);
+
+diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
+index 0ed6f809ff2ee..51799fccf8404 100644
+--- a/drivers/ntb/hw/idt/ntb_hw_idt.c
++++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
+@@ -2891,6 +2891,7 @@ static struct pci_driver idt_pci_driver = {
+
+ static int __init idt_pci_driver_init(void)
+ {
++ int ret;
+ pr_info("%s %s\n", NTB_DESC, NTB_VER);
+
+ /* Create the top DebugFS directory if the FS is initialized */
+@@ -2898,7 +2899,11 @@ static int __init idt_pci_driver_init(void)
+ dbgfs_topdir = debugfs_create_dir(KBUILD_MODNAME, NULL);
+
+ /* Register the NTB hardware driver to handle the PCI device */
+- return pci_register_driver(&idt_pci_driver);
++ ret = pci_register_driver(&idt_pci_driver);
++ if (ret)
++ debugfs_remove_recursive(dbgfs_topdir);
++
++ return ret;
+ }
+ module_init(idt_pci_driver_init);
+
+diff --git a/drivers/ntb/hw/intel/ntb_hw_gen1.c b/drivers/ntb/hw/intel/ntb_hw_gen1.c
+index 84772013812bf..60a4ebc7bf35a 100644
+--- a/drivers/ntb/hw/intel/ntb_hw_gen1.c
++++ b/drivers/ntb/hw/intel/ntb_hw_gen1.c
+@@ -2064,12 +2064,17 @@ static struct pci_driver intel_ntb_pci_driver = {
+
+ static int __init intel_ntb_pci_driver_init(void)
+ {
++ int ret;
+ pr_info("%s %s\n", NTB_DESC, NTB_VER);
+
+ if (debugfs_initialized())
+ debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);
+
+- return pci_register_driver(&intel_ntb_pci_driver);
++ ret = pci_register_driver(&intel_ntb_pci_driver);
++ if (ret)
++ debugfs_remove_recursive(debugfs_dir);
++
++ return ret;
+ }
+ module_init(intel_ntb_pci_driver_init);
+
+diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
+index a9b97ebc71ac5..2abd2235bbcab 100644
+--- a/drivers/ntb/ntb_transport.c
++++ b/drivers/ntb/ntb_transport.c
+@@ -410,7 +410,7 @@ int ntb_transport_register_client_dev(char *device_name)
+
+ rc = device_register(dev);
+ if (rc) {
+- kfree(client_dev);
++ put_device(dev);
+ goto err;
+ }
+
+diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
+index 5ee0afa621a95..eeeb4b1c97d2c 100644
+--- a/drivers/ntb/test/ntb_tool.c
++++ b/drivers/ntb/test/ntb_tool.c
+@@ -998,6 +998,8 @@ static int tool_init_mws(struct tool_ctx *tc)
+ tc->peers[pidx].outmws =
+ devm_kcalloc(&tc->ntb->dev, tc->peers[pidx].outmw_cnt,
+ sizeof(*tc->peers[pidx].outmws), GFP_KERNEL);
++ if (tc->peers[pidx].outmws == NULL)
++ return -ENOMEM;
+
+ for (widx = 0; widx < tc->peers[pidx].outmw_cnt; widx++) {
+ tc->peers[pidx].outmws[widx].pidx = pidx;
+diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
+index 3395e27438393..45f1dac07685d 100644
+--- a/drivers/nvme/host/core.c
++++ b/drivers/nvme/host/core.c
+@@ -4226,10 +4226,40 @@ static int nvme_init_ns_head(struct nvme_ns *ns, struct nvme_ns_info *info)
+
+ ret = nvme_global_check_duplicate_ids(ctrl->subsys, &info->ids);
+ if (ret) {
+- dev_err(ctrl->device,
+- "globally duplicate IDs for nsid %d\n", info->nsid);
++ /*
++ * We've found two different namespaces on two different
++ * subsystems that report the same ID. This is pretty nasty
++ * for anything that actually requires unique device
++ * identification. In the kernel we need this for multipathing,
++ * and in user space the /dev/disk/by-id/ links rely on it.
++ *
++ * If the device also claims to be multi-path capable back off
++ * here now and refuse the probe the second device as this is a
++ * recipe for data corruption. If not this is probably a
++ * cheap consumer device if on the PCIe bus, so let the user
++ * proceed and use the shiny toy, but warn that with changing
++ * probing order (which due to our async probing could just be
++ * device taking longer to startup) the other device could show
++ * up at any time.
++ */
+ nvme_print_device_info(ctrl);
+- return ret;
++ if ((ns->ctrl->ops->flags & NVME_F_FABRICS) || /* !PCIe */
++ ((ns->ctrl->subsys->cmic & NVME_CTRL_CMIC_MULTI_CTRL) &&
++ info->is_shared)) {
++ dev_err(ctrl->device,
++ "ignoring nsid %d because of duplicate IDs\n",
++ info->nsid);
++ return ret;
++ }
++
++ dev_err(ctrl->device,
++ "clearing duplicate IDs for nsid %d\n", info->nsid);
++ dev_err(ctrl->device,
++ "use of /dev/disk/by-id/ may cause data corruption\n");
++ memset(&info->ids.nguid, 0, sizeof(info->ids.nguid));
++ memset(&info->ids.uuid, 0, sizeof(info->ids.uuid));
++ memset(&info->ids.eui64, 0, sizeof(info->ids.eui64));
++ ctrl->quirks |= NVME_QUIRK_BOGUS_NID;
+ }
+
+ mutex_lock(&ctrl->subsys->lock);
+diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
+index 492f319ebdf37..5b5303f0e2c20 100644
+--- a/drivers/nvme/host/pci.c
++++ b/drivers/nvme/host/pci.c
+@@ -968,7 +968,7 @@ static __always_inline void nvme_pci_unmap_rq(struct request *req)
+ struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+
+ dma_unmap_page(dev->dev, iod->meta_dma,
+- rq_integrity_vec(req)->bv_len, rq_data_dir(req));
++ rq_integrity_vec(req)->bv_len, rq_dma_dir(req));
+ }
+
+ if (blk_rq_nr_phys_segments(req))
+diff --git a/drivers/opp/core.c b/drivers/opp/core.c
+index 954c94865cf56..b5973fefdfd83 100644
+--- a/drivers/opp/core.c
++++ b/drivers/opp/core.c
+@@ -1358,7 +1358,10 @@ static struct opp_table *_allocate_opp_table(struct device *dev, int index)
+ return opp_table;
+
+ remove_opp_dev:
++ _of_clear_opp_table(opp_table);
+ _remove_opp_dev(opp_dev, opp_table);
++ mutex_destroy(&opp_table->genpd_virt_dev_lock);
++ mutex_destroy(&opp_table->lock);
+ err:
+ kfree(opp_table);
+ return ERR_PTR(ret);
+diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
+index 2783e9c3ef1ba..391a45d1e70a6 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom.c
++++ b/drivers/pci/controller/dwc/pcie-qcom.c
+@@ -834,6 +834,8 @@ static int qcom_pcie_post_init_2_3_3(struct qcom_pcie *pcie)
+ writel(PCI_EXP_DEVCTL2_COMP_TMOUT_DIS, pci->dbi_base + offset +
+ PCI_EXP_DEVCTL2);
+
++ dw_pcie_dbi_ro_wr_dis(pci);
++
+ return 0;
+ }
+
+diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
+index d1a200b93b2bf..827d91e73efab 100644
+--- a/drivers/pci/controller/pcie-rockchip-ep.c
++++ b/drivers/pci/controller/pcie-rockchip-ep.c
+@@ -125,6 +125,7 @@ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
+ static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,
+ struct pci_epf_header *hdr)
+ {
++ u32 reg;
+ struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
+ struct rockchip_pcie *rockchip = &ep->rockchip;
+
+@@ -137,8 +138,9 @@ static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,
+ PCIE_CORE_CONFIG_VENDOR);
+ }
+
+- rockchip_pcie_write(rockchip, hdr->deviceid << 16,
+- ROCKCHIP_PCIE_EP_FUNC_BASE(fn) + PCI_VENDOR_ID);
++ reg = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_DID_VID);
++ reg = (reg & 0xFFFF) | (hdr->deviceid << 16);
++ rockchip_pcie_write(rockchip, reg, PCIE_EP_CONFIG_DID_VID);
+
+ rockchip_pcie_write(rockchip,
+ hdr->revid |
+@@ -312,15 +314,15 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
+ {
+ struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
+ struct rockchip_pcie *rockchip = &ep->rockchip;
+- u16 flags;
++ u32 flags;
+
+ flags = rockchip_pcie_read(rockchip,
+ ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
+ ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
+ flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK;
+ flags |=
+- ((multi_msg_cap << 1) << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
+- PCI_MSI_FLAGS_64BIT;
++ (multi_msg_cap << ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET) |
++ (PCI_MSI_FLAGS_64BIT << ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET);
+ flags &= ~ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP;
+ rockchip_pcie_write(rockchip, flags,
+ ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
+@@ -332,7 +334,7 @@ static int rockchip_pcie_ep_get_msi(struct pci_epc *epc, u8 fn, u8 vfn)
+ {
+ struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
+ struct rockchip_pcie *rockchip = &ep->rockchip;
+- u16 flags;
++ u32 flags;
+
+ flags = rockchip_pcie_read(rockchip,
+ ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
+@@ -345,48 +347,25 @@ static int rockchip_pcie_ep_get_msi(struct pci_epc *epc, u8 fn, u8 vfn)
+ }
+
+ static void rockchip_pcie_ep_assert_intx(struct rockchip_pcie_ep *ep, u8 fn,
+- u8 intx, bool is_asserted)
++ u8 intx, bool do_assert)
+ {
+ struct rockchip_pcie *rockchip = &ep->rockchip;
+- u32 r = ep->max_regions - 1;
+- u32 offset;
+- u32 status;
+- u8 msg_code;
+-
+- if (unlikely(ep->irq_pci_addr != ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR ||
+- ep->irq_pci_fn != fn)) {
+- rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r,
+- AXI_WRAPPER_NOR_MSG,
+- ep->irq_phys_addr, 0, 0);
+- ep->irq_pci_addr = ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR;
+- ep->irq_pci_fn = fn;
+- }
+
+ intx &= 3;
+- if (is_asserted) {
++
++ if (do_assert) {
+ ep->irq_pending |= BIT(intx);
+- msg_code = ROCKCHIP_PCIE_MSG_CODE_ASSERT_INTA + intx;
++ rockchip_pcie_write(rockchip,
++ PCIE_CLIENT_INT_IN_ASSERT |
++ PCIE_CLIENT_INT_PEND_ST_PEND,
++ PCIE_CLIENT_LEGACY_INT_CTRL);
+ } else {
+ ep->irq_pending &= ~BIT(intx);
+- msg_code = ROCKCHIP_PCIE_MSG_CODE_DEASSERT_INTA + intx;
++ rockchip_pcie_write(rockchip,
++ PCIE_CLIENT_INT_IN_DEASSERT |
++ PCIE_CLIENT_INT_PEND_ST_NORMAL,
++ PCIE_CLIENT_LEGACY_INT_CTRL);
+ }
+-
+- status = rockchip_pcie_read(rockchip,
+- ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
+- ROCKCHIP_PCIE_EP_CMD_STATUS);
+- status &= ROCKCHIP_PCIE_EP_CMD_STATUS_IS;
+-
+- if ((status != 0) ^ (ep->irq_pending != 0)) {
+- status ^= ROCKCHIP_PCIE_EP_CMD_STATUS_IS;
+- rockchip_pcie_write(rockchip, status,
+- ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
+- ROCKCHIP_PCIE_EP_CMD_STATUS);
+- }
+-
+- offset =
+- ROCKCHIP_PCIE_MSG_ROUTING(ROCKCHIP_PCIE_MSG_ROUTING_LOCAL_INTX) |
+- ROCKCHIP_PCIE_MSG_CODE(msg_code) | ROCKCHIP_PCIE_MSG_NO_DATA;
+- writel(0, ep->irq_cpu_addr + offset);
+ }
+
+ static int rockchip_pcie_ep_send_legacy_irq(struct rockchip_pcie_ep *ep, u8 fn,
+@@ -416,7 +395,7 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
+ u8 interrupt_num)
+ {
+ struct rockchip_pcie *rockchip = &ep->rockchip;
+- u16 flags, mme, data, data_mask;
++ u32 flags, mme, data, data_mask;
+ u8 msi_count;
+ u64 pci_addr, pci_addr_mask = 0xff;
+
+@@ -506,6 +485,7 @@ static const struct pci_epc_features rockchip_pcie_epc_features = {
+ .linkup_notifier = false,
+ .msi_capable = true,
+ .msix_capable = false,
++ .align = 256,
+ };
+
+ static const struct pci_epc_features*
+@@ -631,6 +611,9 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
+
+ ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;
+
++ rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE,
++ PCIE_CLIENT_CONFIG);
++
+ return 0;
+ err_epc_mem_exit:
+ pci_epc_mem_exit(epc);
+diff --git a/drivers/pci/controller/pcie-rockchip.c b/drivers/pci/controller/pcie-rockchip.c
+index 990a00e08bc5b..1aa84035a8bc7 100644
+--- a/drivers/pci/controller/pcie-rockchip.c
++++ b/drivers/pci/controller/pcie-rockchip.c
+@@ -14,6 +14,7 @@
+ #include <linux/clk.h>
+ #include <linux/delay.h>
+ #include <linux/gpio/consumer.h>
++#include <linux/iopoll.h>
+ #include <linux/of_pci.h>
+ #include <linux/phy/phy.h>
+ #include <linux/platform_device.h>
+@@ -153,6 +154,12 @@ int rockchip_pcie_parse_dt(struct rockchip_pcie *rockchip)
+ }
+ EXPORT_SYMBOL_GPL(rockchip_pcie_parse_dt);
+
++#define rockchip_pcie_read_addr(addr) rockchip_pcie_read(rockchip, addr)
++/* 100 ms max wait time for PHY PLLs to lock */
++#define RK_PHY_PLL_LOCK_TIMEOUT_US 100000
++/* Sleep should be less than 20ms */
++#define RK_PHY_PLL_LOCK_SLEEP_US 1000
++
+ int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
+ {
+ struct device *dev = rockchip->dev;
+@@ -254,6 +261,16 @@ int rockchip_pcie_init_port(struct rockchip_pcie *rockchip)
+ }
+ }
+
++ err = readx_poll_timeout(rockchip_pcie_read_addr,
++ PCIE_CLIENT_SIDE_BAND_STATUS,
++ regs, !(regs & PCIE_CLIENT_PHY_ST),
++ RK_PHY_PLL_LOCK_SLEEP_US,
++ RK_PHY_PLL_LOCK_TIMEOUT_US);
++ if (err) {
++ dev_err(dev, "PHY PLLs could not lock, %d\n", err);
++ goto err_power_off_phy;
++ }
++
+ /*
+ * Please don't reorder the deassert sequence of the following
+ * four reset pins.
+diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
+index 32c3a859c26b2..8e92dc3339ecc 100644
+--- a/drivers/pci/controller/pcie-rockchip.h
++++ b/drivers/pci/controller/pcie-rockchip.h
+@@ -38,6 +38,13 @@
+ #define PCIE_CLIENT_MODE_EP HIWORD_UPDATE(0x0040, 0)
+ #define PCIE_CLIENT_GEN_SEL_1 HIWORD_UPDATE(0x0080, 0)
+ #define PCIE_CLIENT_GEN_SEL_2 HIWORD_UPDATE_BIT(0x0080)
++#define PCIE_CLIENT_LEGACY_INT_CTRL (PCIE_CLIENT_BASE + 0x0c)
++#define PCIE_CLIENT_INT_IN_ASSERT HIWORD_UPDATE_BIT(0x0002)
++#define PCIE_CLIENT_INT_IN_DEASSERT HIWORD_UPDATE(0x0002, 0)
++#define PCIE_CLIENT_INT_PEND_ST_PEND HIWORD_UPDATE_BIT(0x0001)
++#define PCIE_CLIENT_INT_PEND_ST_NORMAL HIWORD_UPDATE(0x0001, 0)
++#define PCIE_CLIENT_SIDE_BAND_STATUS (PCIE_CLIENT_BASE + 0x20)
++#define PCIE_CLIENT_PHY_ST BIT(12)
+ #define PCIE_CLIENT_DEBUG_OUT_0 (PCIE_CLIENT_BASE + 0x3c)
+ #define PCIE_CLIENT_DEBUG_LTSSM_MASK GENMASK(5, 0)
+ #define PCIE_CLIENT_DEBUG_LTSSM_L1 0x18
+@@ -133,6 +140,8 @@
+ #define PCIE_RC_RP_ATS_BASE 0x400000
+ #define PCIE_RC_CONFIG_NORMAL_BASE 0x800000
+ #define PCIE_RC_CONFIG_BASE 0xa00000
++#define PCIE_EP_CONFIG_BASE 0xa00000
++#define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00)
+ #define PCIE_RC_CONFIG_RID_CCR (PCIE_RC_CONFIG_BASE + 0x08)
+ #define PCIE_RC_CONFIG_DCR (PCIE_RC_CONFIG_BASE + 0xc4)
+ #define PCIE_RC_CONFIG_DCR_CSPL_SHIFT 18
+@@ -216,6 +225,7 @@
+ #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4
+ #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19)
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90
++#define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17)
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET 20
+@@ -223,7 +233,6 @@
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16)
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24)
+ #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1
+-#define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3
+ #define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) (((fn) << 12) & GENMASK(19, 12))
+ #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR0(fn, bar) \
+ (PCIE_RC_RP_ATS_BASE + 0x0840 + (fn) * 0x0040 + (bar) * 0x0008)
+diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
+index 172e5ac0bd96c..a383906212cf4 100644
+--- a/drivers/pci/endpoint/functions/pci-epf-test.c
++++ b/drivers/pci/endpoint/functions/pci-epf-test.c
+@@ -54,6 +54,9 @@ struct pci_epf_test {
+ struct delayed_work cmd_handler;
+ struct dma_chan *dma_chan_tx;
+ struct dma_chan *dma_chan_rx;
++ struct dma_chan *transfer_chan;
++ dma_cookie_t transfer_cookie;
++ enum dma_status transfer_status;
+ struct completion transfer_complete;
+ bool dma_supported;
+ bool dma_private;
+@@ -85,8 +88,14 @@ static size_t bar_size[] = { 512, 512, 1024, 16384, 131072, 1048576 };
+ static void pci_epf_test_dma_callback(void *param)
+ {
+ struct pci_epf_test *epf_test = param;
+-
+- complete(&epf_test->transfer_complete);
++ struct dma_tx_state state;
++
++ epf_test->transfer_status =
++ dmaengine_tx_status(epf_test->transfer_chan,
++ epf_test->transfer_cookie, &state);
++ if (epf_test->transfer_status == DMA_COMPLETE ||
++ epf_test->transfer_status == DMA_ERROR)
++ complete(&epf_test->transfer_complete);
+ }
+
+ /**
+@@ -120,7 +129,6 @@ static int pci_epf_test_data_transfer(struct pci_epf_test *epf_test,
+ struct dma_async_tx_descriptor *tx;
+ struct dma_slave_config sconf = {};
+ struct device *dev = &epf->dev;
+- dma_cookie_t cookie;
+ int ret;
+
+ if (IS_ERR_OR_NULL(chan)) {
+@@ -151,26 +159,34 @@ static int pci_epf_test_data_transfer(struct pci_epf_test *epf_test,
+ return -EIO;
+ }
+
++ reinit_completion(&epf_test->transfer_complete);
++ epf_test->transfer_chan = chan;
+ tx->callback = pci_epf_test_dma_callback;
+ tx->callback_param = epf_test;
+- cookie = tx->tx_submit(tx);
+- reinit_completion(&epf_test->transfer_complete);
++ epf_test->transfer_cookie = tx->tx_submit(tx);
+
+- ret = dma_submit_error(cookie);
++ ret = dma_submit_error(epf_test->transfer_cookie);
+ if (ret) {
+- dev_err(dev, "Failed to do DMA tx_submit %d\n", cookie);
+- return -EIO;
++ dev_err(dev, "Failed to do DMA tx_submit %d\n", ret);
++ goto terminate;
+ }
+
+ dma_async_issue_pending(chan);
+ ret = wait_for_completion_interruptible(&epf_test->transfer_complete);
+ if (ret < 0) {
+- dmaengine_terminate_sync(chan);
+- dev_err(dev, "DMA wait_for_completion_timeout\n");
+- return -ETIMEDOUT;
++ dev_err(dev, "DMA wait_for_completion interrupted\n");
++ goto terminate;
+ }
+
+- return 0;
++ if (epf_test->transfer_status == DMA_ERROR) {
++ dev_err(dev, "DMA transfer failed\n");
++ ret = -EIO;
++ }
++
++terminate:
++ dmaengine_terminate_sync(chan);
++
++ return ret;
+ }
+
+ struct epf_dma_filter {
+diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
+index 5ede93222bc12..c779eb4d7fb84 100644
+--- a/drivers/pci/pci.c
++++ b/drivers/pci/pci.c
+@@ -2949,13 +2949,13 @@ static const struct dmi_system_id bridge_d3_blacklist[] = {
+ {
+ /*
+ * Downstream device is not accessible after putting a root port
+- * into D3cold and back into D0 on Elo i2.
++ * into D3cold and back into D0 on Elo Continental Z2 board
+ */
+- .ident = "Elo i2",
++ .ident = "Elo Continental Z2",
+ .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "Elo Touch Solutions"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "Elo i2"),
+- DMI_MATCH(DMI_PRODUCT_VERSION, "RevB"),
++ DMI_MATCH(DMI_BOARD_VENDOR, "Elo Touch Solutions"),
++ DMI_MATCH(DMI_BOARD_NAME, "Geminilake"),
++ DMI_MATCH(DMI_BOARD_VERSION, "Continental Z2"),
+ },
+ },
+ #endif
+diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
+index 0b2826c4a832d..00ed20ac0dd61 100644
+--- a/drivers/pci/probe.c
++++ b/drivers/pci/probe.c
+@@ -997,8 +997,10 @@ static int pci_register_host_bridge(struct pci_host_bridge *bridge)
+ resource_list_for_each_entry_safe(window, n, &resources) {
+ offset = window->offset;
+ res = window->res;
+- if (!res->flags && !res->start && !res->end)
++ if (!res->flags && !res->start && !res->end) {
++ release_resource(res);
+ continue;
++ }
+
+ list_move_tail(&window->node, &bridge->windows);
+
+diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
+index c525867760bf8..b7c65193e786a 100644
+--- a/drivers/pci/quirks.c
++++ b/drivers/pci/quirks.c
+@@ -4174,6 +4174,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9220,
+ /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
+ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
+ quirk_dma_func1_alias);
++DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
++ quirk_dma_func1_alias);
+ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
+ quirk_dma_func1_alias);
+ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645,
+diff --git a/drivers/perf/riscv_pmu.c b/drivers/perf/riscv_pmu.c
+index ebca5eab9c9be..56897d4d4fd3e 100644
+--- a/drivers/perf/riscv_pmu.c
++++ b/drivers/perf/riscv_pmu.c
+@@ -181,9 +181,6 @@ void riscv_pmu_start(struct perf_event *event, int flags)
+ uint64_t max_period = riscv_pmu_ctr_get_width_mask(event);
+ u64 init_val;
+
+- if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
+- return;
+-
+ if (flags & PERF_EF_RELOAD)
+ WARN_ON_ONCE(!(event->hw.state & PERF_HES_UPTODATE));
+
+diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
+index f279b360c20d3..b129d7c76b3e9 100644
+--- a/drivers/pinctrl/pinctrl-amd.c
++++ b/drivers/pinctrl/pinctrl-amd.c
+@@ -115,16 +115,20 @@ static void amd_gpio_set_value(struct gpio_chip *gc, unsigned offset, int value)
+ raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+ }
+
+-static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
+- unsigned debounce)
++static int amd_gpio_set_debounce(struct amd_gpio *gpio_dev, unsigned int offset,
++ unsigned int debounce)
+ {
+ u32 time;
+ u32 pin_reg;
+ int ret = 0;
+- unsigned long flags;
+- struct amd_gpio *gpio_dev = gpiochip_get_data(gc);
+
+- raw_spin_lock_irqsave(&gpio_dev->lock, flags);
++ /* Use special handling for Pin0 debounce */
++ if (offset == 0) {
++ pin_reg = readl(gpio_dev->base + WAKE_INT_MASTER_REG);
++ if (pin_reg & INTERNAL_GPIO0_DEBOUNCE)
++ debounce = 0;
++ }
++
+ pin_reg = readl(gpio_dev->base + offset * 4);
+
+ if (debounce) {
+@@ -175,23 +179,10 @@ static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
+ pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
+ }
+ writel(pin_reg, gpio_dev->base + offset * 4);
+- raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+
+ return ret;
+ }
+
+-static int amd_gpio_set_config(struct gpio_chip *gc, unsigned offset,
+- unsigned long config)
+-{
+- u32 debounce;
+-
+- if (pinconf_to_config_param(config) != PIN_CONFIG_INPUT_DEBOUNCE)
+- return -ENOTSUPP;
+-
+- debounce = pinconf_to_config_argument(config);
+- return amd_gpio_set_debounce(gc, offset, debounce);
+-}
+-
+ #ifdef CONFIG_DEBUG_FS
+ static void amd_gpio_dbg_show(struct seq_file *s, struct gpio_chip *gc)
+ {
+@@ -213,12 +204,12 @@ static void amd_gpio_dbg_show(struct seq_file *s, struct gpio_chip *gc)
+ char *pin_sts;
+ char *interrupt_sts;
+ char *wake_sts;
+- char *pull_up_sel;
+ char *orientation;
+ char debounce_value[40];
+ char *debounce_enable;
+ char *wake_cntrlz;
+
++ seq_printf(s, "WAKE_INT_MASTER_REG: 0x%08x\n", readl(gpio_dev->base + WAKE_INT_MASTER_REG));
+ for (bank = 0; bank < gpio_dev->hwbank_num; bank++) {
+ unsigned int time = 0;
+ unsigned int unit = 0;
+@@ -320,14 +311,9 @@ static void amd_gpio_dbg_show(struct seq_file *s, struct gpio_chip *gc)
+ seq_printf(s, " %s|", wake_sts);
+
+ if (pin_reg & BIT(PULL_UP_ENABLE_OFF)) {
+- if (pin_reg & BIT(PULL_UP_SEL_OFF))
+- pull_up_sel = "8k";
+- else
+- pull_up_sel = "4k";
+- seq_printf(s, "%s ↑|",
+- pull_up_sel);
++ seq_puts(s, " ↑ |");
+ } else if (pin_reg & BIT(PULL_DOWN_ENABLE_OFF)) {
+- seq_puts(s, " ↓|");
++ seq_puts(s, " ↓ |");
+ } else {
+ seq_puts(s, " |");
+ }
+@@ -653,21 +639,21 @@ static bool do_amd_gpio_irq_handler(int irq, void *dev_id)
+ * We must read the pin register again, in case the
+ * value was changed while executing
+ * generic_handle_domain_irq() above.
+- * If we didn't find a mapping for the interrupt,
+- * disable it in order to avoid a system hang caused
+- * by an interrupt storm.
++ * If the line is not an irq, disable it in order to
++ * avoid a system hang caused by an interrupt storm.
+ */
+ raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+ regval = readl(regs + i);
+- if (irq == 0) {
+- regval &= ~BIT(INTERRUPT_ENABLE_OFF);
++ if (!gpiochip_line_is_irq(gc, irqnr + i)) {
++ regval &= ~BIT(INTERRUPT_MASK_OFF);
+ dev_dbg(&gpio_dev->pdev->dev,
+ "Disabling spurious GPIO IRQ %d\n",
+ irqnr + i);
++ } else {
++ ret = true;
+ }
+ writel(regval, regs + i);
+ raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+- ret = true;
+ }
+ }
+ /* did not cause wake on resume context for shared IRQ */
+@@ -754,7 +740,7 @@ static int amd_pinconf_get(struct pinctrl_dev *pctldev,
+ break;
+
+ case PIN_CONFIG_BIAS_PULL_UP:
+- arg = (pin_reg >> PULL_UP_SEL_OFF) & (BIT(0) | BIT(1));
++ arg = (pin_reg >> PULL_UP_ENABLE_OFF) & BIT(0);
+ break;
+
+ case PIN_CONFIG_DRIVE_STRENGTH:
+@@ -773,7 +759,7 @@ static int amd_pinconf_get(struct pinctrl_dev *pctldev,
+ }
+
+ static int amd_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+- unsigned long *configs, unsigned num_configs)
++ unsigned long *configs, unsigned int num_configs)
+ {
+ int i;
+ u32 arg;
+@@ -791,9 +777,8 @@ static int amd_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+
+ switch (param) {
+ case PIN_CONFIG_INPUT_DEBOUNCE:
+- pin_reg &= ~DB_TMR_OUT_MASK;
+- pin_reg |= arg & DB_TMR_OUT_MASK;
+- break;
++ ret = amd_gpio_set_debounce(gpio_dev, pin, arg);
++ goto out_unlock;
+
+ case PIN_CONFIG_BIAS_PULL_DOWN:
+ pin_reg &= ~BIT(PULL_DOWN_ENABLE_OFF);
+@@ -801,10 +786,8 @@ static int amd_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+ break;
+
+ case PIN_CONFIG_BIAS_PULL_UP:
+- pin_reg &= ~BIT(PULL_UP_SEL_OFF);
+- pin_reg |= (arg & BIT(0)) << PULL_UP_SEL_OFF;
+ pin_reg &= ~BIT(PULL_UP_ENABLE_OFF);
+- pin_reg |= ((arg>>1) & BIT(0)) << PULL_UP_ENABLE_OFF;
++ pin_reg |= (arg & BIT(0)) << PULL_UP_ENABLE_OFF;
+ break;
+
+ case PIN_CONFIG_DRIVE_STRENGTH:
+@@ -822,6 +805,7 @@ static int amd_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+
+ writel(pin_reg, gpio_dev->base + pin*4);
+ }
++out_unlock:
+ raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+
+ return ret;
+@@ -863,6 +847,14 @@ static int amd_pinconf_group_set(struct pinctrl_dev *pctldev,
+ return 0;
+ }
+
++static int amd_gpio_set_config(struct gpio_chip *gc, unsigned int pin,
++ unsigned long config)
++{
++ struct amd_gpio *gpio_dev = gpiochip_get_data(gc);
++
++ return amd_pinconf_set(gpio_dev->pctrl, pin, &config, 1);
++}
++
+ static const struct pinconf_ops amd_pinconf_ops = {
+ .pin_config_get = amd_pinconf_get,
+ .pin_config_set = amd_pinconf_set,
+@@ -870,34 +862,6 @@ static const struct pinconf_ops amd_pinconf_ops = {
+ .pin_config_group_set = amd_pinconf_group_set,
+ };
+
+-static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+-{
+- struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+- unsigned long flags;
+- u32 pin_reg, mask;
+- int i;
+-
+- mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
+- BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) |
+- BIT(WAKE_CNTRL_OFF_S4);
+-
+- for (i = 0; i < desc->npins; i++) {
+- int pin = desc->pins[i].number;
+- const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin);
+-
+- if (!pd)
+- continue;
+-
+- raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+-
+- pin_reg = readl(gpio_dev->base + i * 4);
+- pin_reg &= ~mask;
+- writel(pin_reg, gpio_dev->base + i * 4);
+-
+- raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+- }
+-}
+-
+ #ifdef CONFIG_PM_SLEEP
+ static bool amd_gpio_should_save(struct amd_gpio *gpio_dev, unsigned int pin)
+ {
+@@ -1135,9 +1099,6 @@ static int amd_gpio_probe(struct platform_device *pdev)
+ return PTR_ERR(gpio_dev->pctrl);
+ }
+
+- /* Disable and mask interrupts */
+- amd_gpio_irq_init(gpio_dev);
+-
+ girq = &gpio_dev->gc.irq;
+ gpio_irq_chip_set_chip(girq, &amd_gpio_irqchip);
+ /* This will let us handle the parent IRQ in the driver */
+diff --git a/drivers/pinctrl/pinctrl-amd.h b/drivers/pinctrl/pinctrl-amd.h
+index 81ae8319a1f0a..34c5c3e71fb26 100644
+--- a/drivers/pinctrl/pinctrl-amd.h
++++ b/drivers/pinctrl/pinctrl-amd.h
+@@ -17,6 +17,7 @@
+ #define AMD_GPIO_PINS_BANK3 32
+
+ #define WAKE_INT_MASTER_REG 0xfc
++#define INTERNAL_GPIO0_DEBOUNCE (1 << 15)
+ #define EOI_MASK (1 << 29)
+
+ #define WAKE_INT_STATUS_REG0 0x2f8
+@@ -35,7 +36,6 @@
+ #define WAKE_CNTRL_OFF_S4 15
+ #define PIN_STS_OFF 16
+ #define DRV_STRENGTH_SEL_OFF 17
+-#define PULL_UP_SEL_OFF 19
+ #define PULL_UP_ENABLE_OFF 20
+ #define PULL_DOWN_ENABLE_OFF 21
+ #define OUTPUT_VALUE_OFF 22
+diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c
+index d81319a502efc..e1a3bfeeed529 100644
+--- a/drivers/platform/x86/wmi.c
++++ b/drivers/platform/x86/wmi.c
+@@ -136,6 +136,16 @@ static acpi_status find_guid(const char *guid_string, struct wmi_block **out)
+ return AE_NOT_FOUND;
+ }
+
++static bool guid_parse_and_compare(const char *string, const guid_t *guid)
++{
++ guid_t guid_input;
++
++ if (guid_parse(string, &guid_input))
++ return false;
++
++ return guid_equal(&guid_input, guid);
++}
++
+ static const void *find_guid_context(struct wmi_block *wblock,
+ struct wmi_driver *wdriver)
+ {
+@@ -146,11 +156,7 @@ static const void *find_guid_context(struct wmi_block *wblock,
+ return NULL;
+
+ while (*id->guid_string) {
+- guid_t guid_input;
+-
+- if (guid_parse(id->guid_string, &guid_input))
+- continue;
+- if (guid_equal(&wblock->gblock.guid, &guid_input))
++ if (guid_parse_and_compare(id->guid_string, &wblock->gblock.guid))
+ return id->context;
+ id++;
+ }
+@@ -827,11 +833,7 @@ static int wmi_dev_match(struct device *dev, struct device_driver *driver)
+ return 0;
+
+ while (*id->guid_string) {
+- guid_t driver_guid;
+-
+- if (WARN_ON(guid_parse(id->guid_string, &driver_guid)))
+- continue;
+- if (guid_equal(&driver_guid, &wblock->gblock.guid))
++ if (guid_parse_and_compare(id->guid_string, &wblock->gblock.guid))
+ return 1;
+
+ id++;
+diff --git a/drivers/pwm/pwm-meson.c b/drivers/pwm/pwm-meson.c
+index 5732300eb0046..33107204a951d 100644
+--- a/drivers/pwm/pwm-meson.c
++++ b/drivers/pwm/pwm-meson.c
+@@ -156,8 +156,9 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
+ const struct pwm_state *state)
+ {
+ struct meson_pwm_channel *channel = &meson->channels[pwm->hwpwm];
+- unsigned int duty, period, pre_div, cnt, duty_cnt;
++ unsigned int pre_div, cnt, duty_cnt;
+ unsigned long fin_freq;
++ u64 duty, period;
+
+ duty = state->duty_cycle;
+ period = state->period;
+@@ -179,19 +180,19 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
+
+ dev_dbg(meson->chip.dev, "fin_freq: %lu Hz\n", fin_freq);
+
+- pre_div = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * 0xffffLL);
++ pre_div = div64_u64(fin_freq * period, NSEC_PER_SEC * 0xffffLL);
+ if (pre_div > MISC_CLK_DIV_MASK) {
+ dev_err(meson->chip.dev, "unable to get period pre_div\n");
+ return -EINVAL;
+ }
+
+- cnt = div64_u64(fin_freq * (u64)period, NSEC_PER_SEC * (pre_div + 1));
++ cnt = div64_u64(fin_freq * period, NSEC_PER_SEC * (pre_div + 1));
+ if (cnt > 0xffff) {
+ dev_err(meson->chip.dev, "unable to get period cnt\n");
+ return -EINVAL;
+ }
+
+- dev_dbg(meson->chip.dev, "period=%u pre_div=%u cnt=%u\n", period,
++ dev_dbg(meson->chip.dev, "period=%llu pre_div=%u cnt=%u\n", period,
+ pre_div, cnt);
+
+ if (duty == period) {
+@@ -204,14 +205,13 @@ static int meson_pwm_calc(struct meson_pwm *meson, struct pwm_device *pwm,
+ channel->lo = cnt;
+ } else {
+ /* Then check is we can have the duty with the same pre_div */
+- duty_cnt = div64_u64(fin_freq * (u64)duty,
+- NSEC_PER_SEC * (pre_div + 1));
++ duty_cnt = div64_u64(fin_freq * duty, NSEC_PER_SEC * (pre_div + 1));
+ if (duty_cnt > 0xffff) {
+ dev_err(meson->chip.dev, "unable to get duty cycle\n");
+ return -EINVAL;
+ }
+
+- dev_dbg(meson->chip.dev, "duty=%u pre_div=%u duty_cnt=%u\n",
++ dev_dbg(meson->chip.dev, "duty=%llu pre_div=%u duty_cnt=%u\n",
+ duty, pre_div, duty_cnt);
+
+ channel->pre_div = pre_div;
+@@ -351,18 +351,8 @@ static int meson_pwm_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
+ channel->lo = FIELD_GET(PWM_LOW_MASK, value);
+ channel->hi = FIELD_GET(PWM_HIGH_MASK, value);
+
+- if (channel->lo == 0) {
+- state->period = meson_pwm_cnt_to_ns(chip, pwm, channel->hi);
+- state->duty_cycle = state->period;
+- } else if (channel->lo >= channel->hi) {
+- state->period = meson_pwm_cnt_to_ns(chip, pwm,
+- channel->lo + channel->hi);
+- state->duty_cycle = meson_pwm_cnt_to_ns(chip, pwm,
+- channel->hi);
+- } else {
+- state->period = 0;
+- state->duty_cycle = 0;
+- }
++ state->period = meson_pwm_cnt_to_ns(chip, pwm, channel->lo + channel->hi);
++ state->duty_cycle = meson_pwm_cnt_to_ns(chip, pwm, channel->hi);
+
+ state->polarity = PWM_POLARITY_NORMAL;
+
+diff --git a/drivers/s390/crypto/zcrypt_msgtype6.c b/drivers/s390/crypto/zcrypt_msgtype6.c
+index 2f9bf23fbb44e..9eb5153737007 100644
+--- a/drivers/s390/crypto/zcrypt_msgtype6.c
++++ b/drivers/s390/crypto/zcrypt_msgtype6.c
+@@ -1143,6 +1143,9 @@ static long zcrypt_msgtype6_send_cprb(bool userspace, struct zcrypt_queue *zq,
+ ap_cancel_message(zq->queue, ap_msg);
+ }
+
++ if (rc == -EAGAIN && ap_msg->flags & AP_MSG_FLAG_ADMIN)
++ rc = -EIO; /* do not retry administrative requests */
++
+ out:
+ if (rc)
+ ZCRYPT_DBF_DBG("%s send cprb at dev=%02x.%04x rc=%d\n",
+@@ -1263,6 +1266,9 @@ static long zcrypt_msgtype6_send_ep11_cprb(bool userspace, struct zcrypt_queue *
+ ap_cancel_message(zq->queue, ap_msg);
+ }
+
++ if (rc == -EAGAIN && ap_msg->flags & AP_MSG_FLAG_ADMIN)
++ rc = -EIO; /* do not retry administrative requests */
++
+ out:
+ if (rc)
+ ZCRYPT_DBF_DBG("%s send cprb at dev=%02x.%04x rc=%d\n",
+diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c
+index c2096e4bba319..6db5cf7e901f9 100644
+--- a/drivers/s390/net/ism_drv.c
++++ b/drivers/s390/net/ism_drv.c
+@@ -36,7 +36,7 @@ static const struct smcd_ops ism_ops;
+ static struct ism_client *clients[MAX_CLIENTS]; /* use an array rather than */
+ /* a list for fast mapping */
+ static u8 max_client;
+-static DEFINE_SPINLOCK(clients_lock);
++static DEFINE_MUTEX(clients_lock);
+ struct ism_dev_list {
+ struct list_head list;
+ struct mutex mutex; /* protects ism device list */
+@@ -47,14 +47,22 @@ static struct ism_dev_list ism_dev_list = {
+ .mutex = __MUTEX_INITIALIZER(ism_dev_list.mutex),
+ };
+
++static void ism_setup_forwarding(struct ism_client *client, struct ism_dev *ism)
++{
++ unsigned long flags;
++
++ spin_lock_irqsave(&ism->lock, flags);
++ ism->subs[client->id] = client;
++ spin_unlock_irqrestore(&ism->lock, flags);
++}
++
+ int ism_register_client(struct ism_client *client)
+ {
+ struct ism_dev *ism;
+- unsigned long flags;
+ int i, rc = -ENOSPC;
+
+ mutex_lock(&ism_dev_list.mutex);
+- spin_lock_irqsave(&clients_lock, flags);
++ mutex_lock(&clients_lock);
+ for (i = 0; i < MAX_CLIENTS; ++i) {
+ if (!clients[i]) {
+ clients[i] = client;
+@@ -65,12 +73,14 @@ int ism_register_client(struct ism_client *client)
+ break;
+ }
+ }
+- spin_unlock_irqrestore(&clients_lock, flags);
++ mutex_unlock(&clients_lock);
++
+ if (i < MAX_CLIENTS) {
+ /* initialize with all devices that we got so far */
+ list_for_each_entry(ism, &ism_dev_list.list, list) {
+ ism->priv[i] = NULL;
+ client->add(ism);
++ ism_setup_forwarding(client, ism);
+ }
+ }
+ mutex_unlock(&ism_dev_list.mutex);
+@@ -86,25 +96,32 @@ int ism_unregister_client(struct ism_client *client)
+ int rc = 0;
+
+ mutex_lock(&ism_dev_list.mutex);
+- spin_lock_irqsave(&clients_lock, flags);
+- clients[client->id] = NULL;
+- if (client->id + 1 == max_client)
+- max_client--;
+- spin_unlock_irqrestore(&clients_lock, flags);
+ list_for_each_entry(ism, &ism_dev_list.list, list) {
++ spin_lock_irqsave(&ism->lock, flags);
++ /* Stop forwarding IRQs and events */
++ ism->subs[client->id] = NULL;
+ for (int i = 0; i < ISM_NR_DMBS; ++i) {
+ if (ism->sba_client_arr[i] == client->id) {
+- pr_err("%s: attempt to unregister client '%s'"
+- "with registered dmb(s)\n", __func__,
+- client->name);
++ WARN(1, "%s: attempt to unregister '%s' with registered dmb(s)\n",
++ __func__, client->name);
+ rc = -EBUSY;
+- goto out;
++ goto err_reg_dmb;
+ }
+ }
++ spin_unlock_irqrestore(&ism->lock, flags);
+ }
+-out:
+ mutex_unlock(&ism_dev_list.mutex);
+
++ mutex_lock(&clients_lock);
++ clients[client->id] = NULL;
++ if (client->id + 1 == max_client)
++ max_client--;
++ mutex_unlock(&clients_lock);
++ return rc;
++
++err_reg_dmb:
++ spin_unlock_irqrestore(&ism->lock, flags);
++ mutex_unlock(&ism_dev_list.mutex);
+ return rc;
+ }
+ EXPORT_SYMBOL_GPL(ism_unregister_client);
+@@ -328,6 +345,7 @@ int ism_register_dmb(struct ism_dev *ism, struct ism_dmb *dmb,
+ struct ism_client *client)
+ {
+ union ism_reg_dmb cmd;
++ unsigned long flags;
+ int ret;
+
+ ret = ism_alloc_dmb(ism, dmb);
+@@ -351,7 +369,9 @@ int ism_register_dmb(struct ism_dev *ism, struct ism_dmb *dmb,
+ goto out;
+ }
+ dmb->dmb_tok = cmd.response.dmb_tok;
++ spin_lock_irqsave(&ism->lock, flags);
+ ism->sba_client_arr[dmb->sba_idx - ISM_DMB_BIT_OFFSET] = client->id;
++ spin_unlock_irqrestore(&ism->lock, flags);
+ out:
+ return ret;
+ }
+@@ -360,6 +380,7 @@ EXPORT_SYMBOL_GPL(ism_register_dmb);
+ int ism_unregister_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
+ {
+ union ism_unreg_dmb cmd;
++ unsigned long flags;
+ int ret;
+
+ memset(&cmd, 0, sizeof(cmd));
+@@ -368,7 +389,9 @@ int ism_unregister_dmb(struct ism_dev *ism, struct ism_dmb *dmb)
+
+ cmd.request.dmb_tok = dmb->dmb_tok;
+
++ spin_lock_irqsave(&ism->lock, flags);
+ ism->sba_client_arr[dmb->sba_idx - ISM_DMB_BIT_OFFSET] = NO_CLIENT;
++ spin_unlock_irqrestore(&ism->lock, flags);
+
+ ret = ism_cmd(ism, &cmd);
+ if (ret && ret != ISM_ERROR)
+@@ -491,6 +514,7 @@ static u16 ism_get_chid(struct ism_dev *ism)
+ static void ism_handle_event(struct ism_dev *ism)
+ {
+ struct ism_event *entry;
++ struct ism_client *clt;
+ int i;
+
+ while ((ism->ieq_idx + 1) != READ_ONCE(ism->ieq->header.idx)) {
+@@ -499,21 +523,21 @@ static void ism_handle_event(struct ism_dev *ism)
+
+ entry = &ism->ieq->entry[ism->ieq_idx];
+ debug_event(ism_debug_info, 2, entry, sizeof(*entry));
+- spin_lock(&clients_lock);
+- for (i = 0; i < max_client; ++i)
+- if (clients[i])
+- clients[i]->handle_event(ism, entry);
+- spin_unlock(&clients_lock);
++ for (i = 0; i < max_client; ++i) {
++ clt = ism->subs[i];
++ if (clt)
++ clt->handle_event(ism, entry);
++ }
+ }
+ }
+
+ static irqreturn_t ism_handle_irq(int irq, void *data)
+ {
+ struct ism_dev *ism = data;
+- struct ism_client *clt;
+ unsigned long bit, end;
+ unsigned long *bv;
+ u16 dmbemask;
++ u8 client_id;
+
+ bv = (void *) &ism->sba->dmb_bits[ISM_DMB_WORD_OFFSET];
+ end = sizeof(ism->sba->dmb_bits) * BITS_PER_BYTE - ISM_DMB_BIT_OFFSET;
+@@ -530,8 +554,10 @@ static irqreturn_t ism_handle_irq(int irq, void *data)
+ dmbemask = ism->sba->dmbe_mask[bit + ISM_DMB_BIT_OFFSET];
+ ism->sba->dmbe_mask[bit + ISM_DMB_BIT_OFFSET] = 0;
+ barrier();
+- clt = clients[ism->sba_client_arr[bit]];
+- clt->handle_irq(ism, bit + ISM_DMB_BIT_OFFSET, dmbemask);
++ client_id = ism->sba_client_arr[bit];
++ if (unlikely(client_id == NO_CLIENT || !ism->subs[client_id]))
++ continue;
++ ism->subs[client_id]->handle_irq(ism, bit + ISM_DMB_BIT_OFFSET, dmbemask);
+ }
+
+ if (ism->sba->e) {
+@@ -548,20 +574,9 @@ static u64 ism_get_local_gid(struct ism_dev *ism)
+ return ism->local_gid;
+ }
+
+-static void ism_dev_add_work_func(struct work_struct *work)
+-{
+- struct ism_client *client = container_of(work, struct ism_client,
+- add_work);
+-
+- client->add(client->tgt_ism);
+- atomic_dec(&client->tgt_ism->add_dev_cnt);
+- wake_up(&client->tgt_ism->waitq);
+-}
+-
+ static int ism_dev_init(struct ism_dev *ism)
+ {
+ struct pci_dev *pdev = ism->pdev;
+- unsigned long flags;
+ int i, ret;
+
+ ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
+@@ -594,25 +609,16 @@ static int ism_dev_init(struct ism_dev *ism)
+ /* hardware is V2 capable */
+ ism_create_system_eid();
+
+- init_waitqueue_head(&ism->waitq);
+- atomic_set(&ism->free_clients_cnt, 0);
+- atomic_set(&ism->add_dev_cnt, 0);
+-
+- wait_event(ism->waitq, !atomic_read(&ism->add_dev_cnt));
+- spin_lock_irqsave(&clients_lock, flags);
+- for (i = 0; i < max_client; ++i)
++ mutex_lock(&ism_dev_list.mutex);
++ mutex_lock(&clients_lock);
++ for (i = 0; i < max_client; ++i) {
+ if (clients[i]) {
+- INIT_WORK(&clients[i]->add_work,
+- ism_dev_add_work_func);
+- clients[i]->tgt_ism = ism;
+- atomic_inc(&ism->add_dev_cnt);
+- schedule_work(&clients[i]->add_work);
++ clients[i]->add(ism);
++ ism_setup_forwarding(clients[i], ism);
+ }
+- spin_unlock_irqrestore(&clients_lock, flags);
+-
+- wait_event(ism->waitq, !atomic_read(&ism->add_dev_cnt));
++ }
++ mutex_unlock(&clients_lock);
+
+- mutex_lock(&ism_dev_list.mutex);
+ list_add(&ism->list, &ism_dev_list.list);
+ mutex_unlock(&ism_dev_list.mutex);
+
+@@ -687,36 +693,24 @@ err_dev:
+ return ret;
+ }
+
+-static void ism_dev_remove_work_func(struct work_struct *work)
+-{
+- struct ism_client *client = container_of(work, struct ism_client,
+- remove_work);
+-
+- client->remove(client->tgt_ism);
+- atomic_dec(&client->tgt_ism->free_clients_cnt);
+- wake_up(&client->tgt_ism->waitq);
+-}
+-
+-/* Callers must hold ism_dev_list.mutex */
+ static void ism_dev_exit(struct ism_dev *ism)
+ {
+ struct pci_dev *pdev = ism->pdev;
+ unsigned long flags;
+ int i;
+
+- wait_event(ism->waitq, !atomic_read(&ism->free_clients_cnt));
+- spin_lock_irqsave(&clients_lock, flags);
++ spin_lock_irqsave(&ism->lock, flags);
+ for (i = 0; i < max_client; ++i)
+- if (clients[i]) {
+- INIT_WORK(&clients[i]->remove_work,
+- ism_dev_remove_work_func);
+- clients[i]->tgt_ism = ism;
+- atomic_inc(&ism->free_clients_cnt);
+- schedule_work(&clients[i]->remove_work);
+- }
+- spin_unlock_irqrestore(&clients_lock, flags);
++ ism->subs[i] = NULL;
++ spin_unlock_irqrestore(&ism->lock, flags);
+
+- wait_event(ism->waitq, !atomic_read(&ism->free_clients_cnt));
++ mutex_lock(&ism_dev_list.mutex);
++ mutex_lock(&clients_lock);
++ for (i = 0; i < max_client; ++i) {
++ if (clients[i])
++ clients[i]->remove(ism);
++ }
++ mutex_unlock(&clients_lock);
+
+ if (SYSTEM_EID.serial_number[0] != '0' ||
+ SYSTEM_EID.type[0] != '0')
+@@ -727,15 +721,14 @@ static void ism_dev_exit(struct ism_dev *ism)
+ kfree(ism->sba_client_arr);
+ pci_free_irq_vectors(pdev);
+ list_del_init(&ism->list);
++ mutex_unlock(&ism_dev_list.mutex);
+ }
+
+ static void ism_remove(struct pci_dev *pdev)
+ {
+ struct ism_dev *ism = dev_get_drvdata(&pdev->dev);
+
+- mutex_lock(&ism_dev_list.mutex);
+ ism_dev_exit(ism);
+- mutex_unlock(&ism_dev_list.mutex);
+
+ pci_release_mem_regions(pdev);
+ pci_disable_device(pdev);
+diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
+index b833b983e69d8..0b9edde26abd8 100644
+--- a/drivers/scsi/lpfc/lpfc_crtn.h
++++ b/drivers/scsi/lpfc/lpfc_crtn.h
+@@ -134,7 +134,6 @@ void lpfc_check_nlp_post_devloss(struct lpfc_vport *vport,
+ struct lpfc_nodelist *ndlp);
+ void lpfc_ignore_els_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ struct lpfc_iocbq *rspiocb);
+-int lpfc_nlp_not_used(struct lpfc_nodelist *ndlp);
+ struct lpfc_nodelist *lpfc_setup_disc_node(struct lpfc_vport *, uint32_t);
+ void lpfc_disc_list_loopmap(struct lpfc_vport *);
+ void lpfc_disc_start(struct lpfc_vport *);
+diff --git a/drivers/scsi/lpfc/lpfc_els.c b/drivers/scsi/lpfc/lpfc_els.c
+index a5e5b7bff59b4..2bad9954c355f 100644
+--- a/drivers/scsi/lpfc/lpfc_els.c
++++ b/drivers/scsi/lpfc/lpfc_els.c
+@@ -5205,14 +5205,9 @@ lpfc_els_free_iocb(struct lpfc_hba *phba, struct lpfc_iocbq *elsiocb)
+ *
+ * This routine is the completion callback function to the Logout (LOGO)
+ * Accept (ACC) Response ELS command. This routine is invoked to indicate
+- * the completion of the LOGO process. It invokes the lpfc_nlp_not_used() to
+- * release the ndlp if it has the last reference remaining (reference count
+- * is 1). If succeeded (meaning ndlp released), it sets the iocb ndlp
+- * field to NULL to inform the following lpfc_els_free_iocb() routine no
+- * ndlp reference count needs to be decremented. Otherwise, the ndlp
+- * reference use-count shall be decremented by the lpfc_els_free_iocb()
+- * routine. Finally, the lpfc_els_free_iocb() is invoked to release the
+- * IOCB data structure.
++ * the completion of the LOGO process. If the node has transitioned to NPR,
++ * this routine unregisters the RPI if it is still registered. The
++ * lpfc_els_free_iocb() is invoked to release the IOCB data structure.
+ **/
+ static void
+ lpfc_cmpl_els_logo_acc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+@@ -5253,19 +5248,9 @@ lpfc_cmpl_els_logo_acc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ (ndlp->nlp_last_elscmd == ELS_CMD_PLOGI))
+ goto out;
+
+- /* NPort Recovery mode or node is just allocated */
+- if (!lpfc_nlp_not_used(ndlp)) {
+- /* A LOGO is completing and the node is in NPR state.
+- * Just unregister the RPI because the node is still
+- * required.
+- */
++ if (ndlp->nlp_flag & NLP_RPI_REGISTERED)
+ lpfc_unreg_rpi(vport, ndlp);
+- } else {
+- /* Indicate the node has already released, should
+- * not reference to it from within lpfc_els_free_iocb.
+- */
+- cmdiocb->ndlp = NULL;
+- }
++
+ }
+ out:
+ /*
+@@ -5285,9 +5270,8 @@ lpfc_cmpl_els_logo_acc(struct lpfc_hba *phba, struct lpfc_iocbq *cmdiocb,
+ * RPI (Remote Port Index) mailbox command to the @phba. It simply releases
+ * the associated lpfc Direct Memory Access (DMA) buffer back to the pool and
+ * decrements the ndlp reference count held for this completion callback
+- * function. After that, it invokes the lpfc_nlp_not_used() to check
+- * whether there is only one reference left on the ndlp. If so, it will
+- * perform one more decrement and trigger the release of the ndlp.
++ * function. After that, it invokes the lpfc_drop_node to check
++ * whether it is appropriate to release the node.
+ **/
+ void
+ lpfc_mbx_cmpl_dflt_rpi(struct lpfc_hba *phba, LPFC_MBOXQ_t *pmb)
+diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
+index 5ba3a9ad95016..67bfdddb897c4 100644
+--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
++++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
+@@ -4333,13 +4333,14 @@ out:
+
+ /* If the node is not registered with the scsi or nvme
+ * transport, remove the fabric node. The failed reg_login
+- * is terminal.
++ * is terminal and forces the removal of the last node
++ * reference.
+ */
+ if (!(ndlp->fc4_xpt_flags & (SCSI_XPT_REGD | NVME_XPT_REGD))) {
+ spin_lock_irq(&ndlp->lock);
+ ndlp->nlp_flag &= ~NLP_NPR_2B_DISC;
+ spin_unlock_irq(&ndlp->lock);
+- lpfc_nlp_not_used(ndlp);
++ lpfc_nlp_put(ndlp);
+ }
+
+ if (phba->fc_topology == LPFC_TOPOLOGY_LOOP) {
+@@ -6704,25 +6705,6 @@ lpfc_nlp_put(struct lpfc_nodelist *ndlp)
+ return ndlp ? kref_put(&ndlp->kref, lpfc_nlp_release) : 0;
+ }
+
+-/* This routine free's the specified nodelist if it is not in use
+- * by any other discovery thread. This routine returns 1 if the
+- * ndlp has been freed. A return value of 0 indicates the ndlp is
+- * not yet been released.
+- */
+-int
+-lpfc_nlp_not_used(struct lpfc_nodelist *ndlp)
+-{
+- lpfc_debugfs_disc_trc(ndlp->vport, LPFC_DISC_TRC_NODE,
+- "node not used: did:x%x flg:x%x refcnt:x%x",
+- ndlp->nlp_DID, ndlp->nlp_flag,
+- kref_read(&ndlp->kref));
+-
+- if (kref_read(&ndlp->kref) == 1)
+- if (lpfc_nlp_put(ndlp))
+- return 1;
+- return 0;
+-}
+-
+ /**
+ * lpfc_fcf_inuse - Check if FCF can be unregistered.
+ * @phba: Pointer to hba context object.
+diff --git a/drivers/scsi/mpi3mr/mpi3mr_fw.c b/drivers/scsi/mpi3mr/mpi3mr_fw.c
+index 075fa67e95eeb..1218b18c7dcf5 100644
+--- a/drivers/scsi/mpi3mr/mpi3mr_fw.c
++++ b/drivers/scsi/mpi3mr/mpi3mr_fw.c
+@@ -402,6 +402,11 @@ static void mpi3mr_process_admin_reply_desc(struct mpi3mr_ioc *mrioc,
+ memcpy((u8 *)cmdptr->reply, (u8 *)def_reply,
+ mrioc->reply_sz);
+ }
++ if (sense_buf && cmdptr->sensebuf) {
++ cmdptr->is_sense = 1;
++ memcpy(cmdptr->sensebuf, sense_buf,
++ MPI3MR_SENSE_BUF_SZ);
++ }
+ if (cmdptr->is_waiting) {
+ complete(&cmdptr->done);
+ cmdptr->is_waiting = 0;
+diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
+index 70cfc94c3d436..b00222459607a 100644
+--- a/drivers/scsi/qla2xxx/qla_attr.c
++++ b/drivers/scsi/qla2xxx/qla_attr.c
+@@ -2750,6 +2750,7 @@ static void
+ qla2x00_terminate_rport_io(struct fc_rport *rport)
+ {
+ fc_port_t *fcport = *(fc_port_t **)rport->dd_data;
++ scsi_qla_host_t *vha;
+
+ if (!fcport)
+ return;
+@@ -2759,9 +2760,12 @@ qla2x00_terminate_rport_io(struct fc_rport *rport)
+
+ if (test_bit(ABORT_ISP_ACTIVE, &fcport->vha->dpc_flags))
+ return;
++ vha = fcport->vha;
+
+ if (unlikely(pci_channel_offline(fcport->vha->hw->pdev))) {
+ qla2x00_abort_all_cmds(fcport->vha, DID_NO_CONNECT << 16);
++ qla2x00_eh_wait_for_pending_commands(fcport->vha, fcport->d_id.b24,
++ 0, WAIT_TARGET);
+ return;
+ }
+ /*
+@@ -2786,6 +2790,15 @@ qla2x00_terminate_rport_io(struct fc_rport *rport)
+ qla2x00_port_logout(fcport->vha, fcport);
+ }
+ }
++
++ /* check for any straggling io left behind */
++ if (qla2x00_eh_wait_for_pending_commands(fcport->vha, fcport->d_id.b24, 0, WAIT_TARGET)) {
++ ql_log(ql_log_warn, vha, 0x300b,
++ "IO not return. Resetting. \n");
++ set_bit(ISP_ABORT_NEEDED, &vha->dpc_flags);
++ qla2xxx_wake_dpc(vha);
++ qla2x00_wait_for_chip_reset(vha);
++ }
+ }
+
+ static int
+diff --git a/drivers/scsi/qla2xxx/qla_bsg.c b/drivers/scsi/qla2xxx/qla_bsg.c
+index dba7bba788d76..19bb64bdd88b1 100644
+--- a/drivers/scsi/qla2xxx/qla_bsg.c
++++ b/drivers/scsi/qla2xxx/qla_bsg.c
+@@ -283,6 +283,10 @@ qla2x00_process_els(struct bsg_job *bsg_job)
+
+ if (bsg_request->msgcode == FC_BSG_RPT_ELS) {
+ rport = fc_bsg_to_rport(bsg_job);
++ if (!rport) {
++ rval = -ENOMEM;
++ goto done;
++ }
+ fcport = *(fc_port_t **) rport->dd_data;
+ host = rport_to_shost(rport);
+ vha = shost_priv(host);
+@@ -2992,6 +2996,8 @@ qla24xx_bsg_request(struct bsg_job *bsg_job)
+
+ if (bsg_request->msgcode == FC_BSG_RPT_ELS) {
+ rport = fc_bsg_to_rport(bsg_job);
++ if (!rport)
++ return ret;
+ host = rport_to_shost(rport);
+ vha = shost_priv(host);
+ } else {
+diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
+index 84aa3571be6d4..47d8f7913c3dd 100644
+--- a/drivers/scsi/qla2xxx/qla_def.h
++++ b/drivers/scsi/qla2xxx/qla_def.h
+@@ -465,6 +465,15 @@ static inline be_id_t port_id_to_be_id(port_id_t port_id)
+ return res;
+ }
+
++struct tmf_arg {
++ struct qla_qpair *qpair;
++ struct fc_port *fcport;
++ struct scsi_qla_host *vha;
++ u64 lun;
++ u32 flags;
++ uint8_t modifier;
++};
++
+ struct els_logo_payload {
+ uint8_t opcode;
+ uint8_t rsvd[3];
+@@ -544,6 +553,10 @@ struct srb_iocb {
+ uint32_t data;
+ struct completion comp;
+ __le16 comp_status;
++
++ uint8_t modifier;
++ uint8_t vp_index;
++ uint16_t loop_id;
+ } tmf;
+ struct {
+ #define SRB_FXDISC_REQ_DMA_VALID BIT_0
+@@ -647,6 +660,7 @@ struct srb_iocb {
+ #define SRB_SA_UPDATE 25
+ #define SRB_ELS_CMD_HST_NOLOGIN 26
+ #define SRB_SA_REPLACE 27
++#define SRB_MARKER 28
+
+ struct qla_els_pt_arg {
+ u8 els_opcode;
+@@ -689,7 +703,6 @@ typedef struct srb {
+ struct iocb_resource iores;
+ struct kref cmd_kref; /* need to migrate ref_count over to this */
+ void *priv;
+- wait_queue_head_t nvme_ls_waitq;
+ struct fc_port *fcport;
+ struct scsi_qla_host *vha;
+ unsigned int start_timer:1;
+@@ -2528,6 +2541,7 @@ enum rscn_addr_format {
+ typedef struct fc_port {
+ struct list_head list;
+ struct scsi_qla_host *vha;
++ struct list_head tmf_pending;
+
+ unsigned int conf_compl_supported:1;
+ unsigned int deleted:2;
+@@ -2548,6 +2562,8 @@ typedef struct fc_port {
+ unsigned int do_prli_nvme:1;
+
+ uint8_t nvme_flag;
++ uint8_t active_tmf;
++#define MAX_ACTIVE_TMF 8
+
+ uint8_t node_name[WWN_SIZE];
+ uint8_t port_name[WWN_SIZE];
+@@ -5499,4 +5515,8 @@ struct ql_vnd_tgt_stats_resp {
+ _fp->disc_state, _fp->scan_state, _fp->loop_id, _fp->deleted, \
+ _fp->flags
+
++#define TMF_NOT_READY(_fcport) \
++ (!_fcport || IS_SESSION_DELETED(_fcport) || atomic_read(&_fcport->state) != FCS_ONLINE || \
++ !_fcport->vha->hw->flags.fw_started)
++
+ #endif
+diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
+index ec0e20255bd3b..26e6b3e3af431 100644
+--- a/drivers/scsi/qla2xxx/qla_edif.c
++++ b/drivers/scsi/qla2xxx/qla_edif.c
+@@ -2361,8 +2361,8 @@ qla24xx_issue_sa_replace_iocb(scsi_qla_host_t *vha, struct qla_work_evt *e)
+ if (!sa_ctl) {
+ ql_dbg(ql_dbg_edif, vha, 0x70e6,
+ "sa_ctl allocation failed\n");
+- rval = -ENOMEM;
+- goto done;
++ rval = -ENOMEM;
++ return rval;
+ }
+
+ fcport = sa_ctl->fcport;
+diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
+index 391c8b3623a69..ba7831f24734f 100644
+--- a/drivers/scsi/qla2xxx/qla_gbl.h
++++ b/drivers/scsi/qla2xxx/qla_gbl.h
+@@ -69,7 +69,7 @@ extern int qla2x00_async_logout(struct scsi_qla_host *, fc_port_t *);
+ extern int qla2x00_async_prlo(struct scsi_qla_host *, fc_port_t *);
+ extern int qla2x00_async_adisc(struct scsi_qla_host *, fc_port_t *,
+ uint16_t *);
+-extern int qla2x00_async_tm_cmd(fc_port_t *, uint32_t, uint32_t, uint32_t);
++extern int qla2x00_async_tm_cmd(fc_port_t *, uint32_t, uint64_t, uint32_t);
+ struct qla_work_evt *qla2x00_alloc_work(struct scsi_qla_host *,
+ enum qla_work_type);
+ extern int qla24xx_async_gnl(struct scsi_qla_host *, fc_port_t *);
+diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
+index 1a955c3ff3d6c..e6b857d7b3dd4 100644
+--- a/drivers/scsi/qla2xxx/qla_init.c
++++ b/drivers/scsi/qla2xxx/qla_init.c
+@@ -1996,6 +1996,11 @@ qla2x00_tmf_iocb_timeout(void *data)
+ int rc, h;
+ unsigned long flags;
+
++ if (sp->type == SRB_MARKER) {
++ complete(&tmf->u.tmf.comp);
++ return;
++ }
++
+ rc = qla24xx_async_abort_cmd(sp, false);
+ if (rc) {
+ spin_lock_irqsave(sp->qpair->qp_lock_ptr, flags);
+@@ -2013,24 +2018,131 @@ qla2x00_tmf_iocb_timeout(void *data)
+ }
+ }
+
++static void qla_marker_sp_done(srb_t *sp, int res)
++{
++ struct srb_iocb *tmf = &sp->u.iocb_cmd;
++
++ if (res != QLA_SUCCESS)
++ ql_dbg(ql_dbg_taskm, sp->vha, 0x8004,
++ "Async-marker fail hdl=%x portid=%06x ctrl=%x lun=%lld qp=%d.\n",
++ sp->handle, sp->fcport->d_id.b24, sp->u.iocb_cmd.u.tmf.flags,
++ sp->u.iocb_cmd.u.tmf.lun, sp->qpair->id);
++
++ sp->u.iocb_cmd.u.tmf.data = res;
++ complete(&tmf->u.tmf.comp);
++}
++
++#define START_SP_W_RETRIES(_sp, _rval) \
++{\
++ int cnt = 5; \
++ do { \
++ _rval = qla2x00_start_sp(_sp); \
++ if (_rval == EAGAIN) \
++ msleep(1); \
++ else \
++ break; \
++ cnt--; \
++ } while (cnt); \
++}
++
++/**
++ * qla26xx_marker: send a marker IOCB and wait for its completion.
++ * @arg: pointer to argument list.
++ * It is assumed the caller will provide an fcport pointer and modifier.
++ */
++static int
++qla26xx_marker(struct tmf_arg *arg)
++{
++ struct scsi_qla_host *vha = arg->vha;
++ struct srb_iocb *tm_iocb;
++ srb_t *sp;
++ int rval = QLA_FUNCTION_FAILED;
++ fc_port_t *fcport = arg->fcport;
++
++ if (TMF_NOT_READY(arg->fcport)) {
++ ql_dbg(ql_dbg_taskm, vha, 0x8039,
++ "FC port not ready for marker loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d.\n",
++ fcport->loop_id, fcport->d_id.b24,
++ arg->modifier, arg->lun, arg->qpair->id);
++ return QLA_SUSPENDED;
++ }
++
++ /* ref: INIT */
++ sp = qla2xxx_get_qpair_sp(vha, arg->qpair, fcport, GFP_KERNEL);
++ if (!sp)
++ goto done;
++
++ sp->type = SRB_MARKER;
++ sp->name = "marker";
++ qla2x00_init_async_sp(sp, qla2x00_get_async_timeout(vha), qla_marker_sp_done);
++ sp->u.iocb_cmd.timeout = qla2x00_tmf_iocb_timeout;
++
++ tm_iocb = &sp->u.iocb_cmd;
++ init_completion(&tm_iocb->u.tmf.comp);
++ tm_iocb->u.tmf.modifier = arg->modifier;
++ tm_iocb->u.tmf.lun = arg->lun;
++ tm_iocb->u.tmf.loop_id = fcport->loop_id;
++ tm_iocb->u.tmf.vp_index = vha->vp_idx;
++
++ START_SP_W_RETRIES(sp, rval);
++
++ ql_dbg(ql_dbg_taskm, vha, 0x8006,
++ "Async-marker hdl=%x loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d rval %d.\n",
++ sp->handle, fcport->loop_id, fcport->d_id.b24,
++ arg->modifier, arg->lun, sp->qpair->id, rval);
++
++ if (rval != QLA_SUCCESS) {
++ ql_log(ql_log_warn, vha, 0x8031,
++ "Marker IOCB send failure (%x).\n", rval);
++ goto done_free_sp;
++ }
++
++ wait_for_completion(&tm_iocb->u.tmf.comp);
++ rval = tm_iocb->u.tmf.data;
++
++ if (rval != QLA_SUCCESS) {
++ ql_log(ql_log_warn, vha, 0x8019,
++ "Marker failed hdl=%x loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d rval %d.\n",
++ sp->handle, fcport->loop_id, fcport->d_id.b24,
++ arg->modifier, arg->lun, sp->qpair->id, rval);
++ }
++
++done_free_sp:
++ /* ref: INIT */
++ kref_put(&sp->cmd_kref, qla2x00_sp_release);
++done:
++ return rval;
++}
++
+ static void qla2x00_tmf_sp_done(srb_t *sp, int res)
+ {
+ struct srb_iocb *tmf = &sp->u.iocb_cmd;
+
++ if (res)
++ tmf->u.tmf.data = res;
+ complete(&tmf->u.tmf.comp);
+ }
+
+-int
+-qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint32_t lun,
+- uint32_t tag)
++static int
++__qla2x00_async_tm_cmd(struct tmf_arg *arg)
+ {
+- struct scsi_qla_host *vha = fcport->vha;
++ struct scsi_qla_host *vha = arg->vha;
+ struct srb_iocb *tm_iocb;
+ srb_t *sp;
+ int rval = QLA_FUNCTION_FAILED;
+
++ fc_port_t *fcport = arg->fcport;
++
++ if (TMF_NOT_READY(arg->fcport)) {
++ ql_dbg(ql_dbg_taskm, vha, 0x8032,
++ "FC port not ready for TM command loop-id=%x portid=%06x modifier=%x lun=%lld qp=%d.\n",
++ fcport->loop_id, fcport->d_id.b24,
++ arg->modifier, arg->lun, arg->qpair->id);
++ return QLA_SUSPENDED;
++ }
++
+ /* ref: INIT */
+- sp = qla2x00_get_sp(vha, fcport, GFP_KERNEL);
++ sp = qla2xxx_get_qpair_sp(vha, arg->qpair, fcport, GFP_KERNEL);
+ if (!sp)
+ goto done;
+
+@@ -2043,15 +2155,16 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint32_t lun,
+
+ tm_iocb = &sp->u.iocb_cmd;
+ init_completion(&tm_iocb->u.tmf.comp);
+- tm_iocb->u.tmf.flags = flags;
+- tm_iocb->u.tmf.lun = lun;
++ tm_iocb->u.tmf.flags = arg->flags;
++ tm_iocb->u.tmf.lun = arg->lun;
++
++ START_SP_W_RETRIES(sp, rval);
+
+ ql_dbg(ql_dbg_taskm, vha, 0x802f,
+- "Async-tmf hdl=%x loop-id=%x portid=%02x%02x%02x.\n",
+- sp->handle, fcport->loop_id, fcport->d_id.b.domain,
+- fcport->d_id.b.area, fcport->d_id.b.al_pa);
++ "Async-tmf hdl=%x loop-id=%x portid=%06x ctrl=%x lun=%lld qp=%d rval=%x.\n",
++ sp->handle, fcport->loop_id, fcport->d_id.b24,
++ arg->flags, arg->lun, sp->qpair->id, rval);
+
+- rval = qla2x00_start_sp(sp);
+ if (rval != QLA_SUCCESS)
+ goto done_free_sp;
+ wait_for_completion(&tm_iocb->u.tmf.comp);
+@@ -2063,15 +2176,8 @@ qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint32_t lun,
+ "TM IOCB failed (%x).\n", rval);
+ }
+
+- if (!test_bit(UNLOADING, &vha->dpc_flags) && !IS_QLAFX00(vha->hw)) {
+- flags = tm_iocb->u.tmf.flags;
+- lun = (uint16_t)tm_iocb->u.tmf.lun;
+-
+- /* Issue Marker IOCB */
+- qla2x00_marker(vha, vha->hw->base_qpair,
+- fcport->loop_id, lun,
+- flags == TCF_LUN_RESET ? MK_SYNC_ID_LUN : MK_SYNC_ID);
+- }
++ if (!test_bit(UNLOADING, &vha->dpc_flags) && !IS_QLAFX00(vha->hw))
++ rval = qla26xx_marker(arg);
+
+ done_free_sp:
+ /* ref: INIT */
+@@ -2080,6 +2186,115 @@ done:
+ return rval;
+ }
+
++static void qla_put_tmf(fc_port_t *fcport)
++{
++ struct scsi_qla_host *vha = fcport->vha;
++ struct qla_hw_data *ha = vha->hw;
++ unsigned long flags;
++
++ spin_lock_irqsave(&ha->tgt.sess_lock, flags);
++ fcport->active_tmf--;
++ spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
++}
++
++static
++int qla_get_tmf(fc_port_t *fcport)
++{
++ struct scsi_qla_host *vha = fcport->vha;
++ struct qla_hw_data *ha = vha->hw;
++ unsigned long flags;
++ int rc = 0;
++ LIST_HEAD(tmf_elem);
++
++ spin_lock_irqsave(&ha->tgt.sess_lock, flags);
++ list_add_tail(&tmf_elem, &fcport->tmf_pending);
++
++ while (fcport->active_tmf >= MAX_ACTIVE_TMF) {
++ spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
++
++ msleep(1);
++
++ spin_lock_irqsave(&ha->tgt.sess_lock, flags);
++ if (TMF_NOT_READY(fcport)) {
++ ql_log(ql_log_warn, vha, 0x802c,
++ "Unable to acquire TM resource due to disruption.\n");
++ rc = EIO;
++ break;
++ }
++ if (fcport->active_tmf < MAX_ACTIVE_TMF &&
++ list_is_first(&tmf_elem, &fcport->tmf_pending))
++ break;
++ }
++
++ list_del(&tmf_elem);
++
++ if (!rc)
++ fcport->active_tmf++;
++
++ spin_unlock_irqrestore(&ha->tgt.sess_lock, flags);
++
++ return rc;
++}
++
++int
++qla2x00_async_tm_cmd(fc_port_t *fcport, uint32_t flags, uint64_t lun,
++ uint32_t tag)
++{
++ struct scsi_qla_host *vha = fcport->vha;
++ struct qla_qpair *qpair;
++ struct tmf_arg a;
++ int i, rval = QLA_SUCCESS;
++
++ if (TMF_NOT_READY(fcport))
++ return QLA_SUSPENDED;
++
++ a.vha = fcport->vha;
++ a.fcport = fcport;
++ a.lun = lun;
++ if (flags & (TCF_LUN_RESET|TCF_ABORT_TASK_SET|TCF_CLEAR_TASK_SET|TCF_CLEAR_ACA)) {
++ a.modifier = MK_SYNC_ID_LUN;
++
++ if (qla_get_tmf(fcport))
++ return QLA_FUNCTION_FAILED;
++ } else {
++ a.modifier = MK_SYNC_ID;
++ }
++
++ if (vha->hw->mqenable) {
++ for (i = 0; i < vha->hw->num_qpairs; i++) {
++ qpair = vha->hw->queue_pair_map[i];
++ if (!qpair)
++ continue;
++
++ if (TMF_NOT_READY(fcport)) {
++ ql_log(ql_log_warn, vha, 0x8026,
++ "Unable to send TM due to disruption.\n");
++ rval = QLA_SUSPENDED;
++ break;
++ }
++
++ a.qpair = qpair;
++ a.flags = flags|TCF_NOTMCMD_TO_TARGET;
++ rval = __qla2x00_async_tm_cmd(&a);
++ if (rval)
++ break;
++ }
++ }
++
++ if (rval)
++ goto bailout;
++
++ a.qpair = vha->hw->base_qpair;
++ a.flags = flags;
++ rval = __qla2x00_async_tm_cmd(&a);
++
++bailout:
++ if (a.modifier == MK_SYNC_ID_LUN)
++ qla_put_tmf(fcport);
++
++ return rval;
++}
++
+ int
+ qla24xx_async_abort_command(srb_t *sp)
+ {
+@@ -5291,6 +5506,7 @@ qla2x00_alloc_fcport(scsi_qla_host_t *vha, gfp_t flags)
+ INIT_WORK(&fcport->reg_work, qla_register_fcport_fn);
+ INIT_LIST_HEAD(&fcport->gnl_entry);
+ INIT_LIST_HEAD(&fcport->list);
++ INIT_LIST_HEAD(&fcport->tmf_pending);
+
+ INIT_LIST_HEAD(&fcport->sess_cmd_list);
+ spin_lock_init(&fcport->sess_cmd_lock);
+@@ -5333,7 +5549,7 @@ static void qla_get_login_template(scsi_qla_host_t *vha)
+ __be32 *q;
+
+ memset(ha->init_cb, 0, ha->init_cb_size);
+- sz = min_t(int, sizeof(struct fc_els_flogi), ha->init_cb_size);
++ sz = min_t(int, sizeof(struct fc_els_csp), ha->init_cb_size);
+ rval = qla24xx_get_port_login_templ(vha, ha->init_cb_dma,
+ ha->init_cb, sz);
+ if (rval != QLA_SUCCESS) {
+diff --git a/drivers/scsi/qla2xxx/qla_inline.h b/drivers/scsi/qla2xxx/qla_inline.h
+index 7b42558a8839a..0167e85ba0587 100644
+--- a/drivers/scsi/qla2xxx/qla_inline.h
++++ b/drivers/scsi/qla2xxx/qla_inline.h
+@@ -109,11 +109,13 @@ qla2x00_set_fcport_disc_state(fc_port_t *fcport, int state)
+ {
+ int old_val;
+ uint8_t shiftbits, mask;
++ uint8_t port_dstate_str_sz;
+
+ /* This will have to change when the max no. of states > 16 */
+ shiftbits = 4;
+ mask = (1 << shiftbits) - 1;
+
++ port_dstate_str_sz = sizeof(port_dstate_str) / sizeof(char *);
+ fcport->disc_state = state;
+ while (1) {
+ old_val = atomic_read(&fcport->shadow_disc_state);
+@@ -121,7 +123,8 @@ qla2x00_set_fcport_disc_state(fc_port_t *fcport, int state)
+ old_val, (old_val << shiftbits) | state)) {
+ ql_dbg(ql_dbg_disc, fcport->vha, 0x2134,
+ "FCPort %8phC disc_state transition: %s to %s - portid=%06x.\n",
+- fcport->port_name, port_dstate_str[old_val & mask],
++ fcport->port_name, (old_val & mask) < port_dstate_str_sz ?
++ port_dstate_str[old_val & mask] : "Unknown",
+ port_dstate_str[state], fcport->d_id.b24);
+ return;
+ }
+diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c
+index b9b3e6f80ea9b..364b65a0f6334 100644
+--- a/drivers/scsi/qla2xxx/qla_iocb.c
++++ b/drivers/scsi/qla2xxx/qla_iocb.c
+@@ -522,21 +522,25 @@ __qla2x00_marker(struct scsi_qla_host *vha, struct qla_qpair *qpair,
+ return (QLA_FUNCTION_FAILED);
+ }
+
++ mrk24 = (struct mrk_entry_24xx *)mrk;
++
+ mrk->entry_type = MARKER_TYPE;
+ mrk->modifier = type;
+ if (type != MK_SYNC_ALL) {
+ if (IS_FWI2_CAPABLE(ha)) {
+- mrk24 = (struct mrk_entry_24xx *) mrk;
+ mrk24->nport_handle = cpu_to_le16(loop_id);
+ int_to_scsilun(lun, (struct scsi_lun *)&mrk24->lun);
+ host_to_fcp_swap(mrk24->lun, sizeof(mrk24->lun));
+ mrk24->vp_index = vha->vp_idx;
+- mrk24->handle = make_handle(req->id, mrk24->handle);
+ } else {
+ SET_TARGET_ID(ha, mrk->target, loop_id);
+ mrk->lun = cpu_to_le16((uint16_t)lun);
+ }
+ }
++
++ if (IS_FWI2_CAPABLE(ha))
++ mrk24->handle = QLA_SKIP_HANDLE;
++
+ wmb();
+
+ qla2x00_start_iocbs(vha, req);
+@@ -603,7 +607,8 @@ qla24xx_build_scsi_type_6_iocbs(srb_t *sp, struct cmd_type_6 *cmd_pkt,
+ put_unaligned_le32(COMMAND_TYPE_6, &cmd_pkt->entry_type);
+
+ /* No data transfer */
+- if (!scsi_bufflen(cmd) || cmd->sc_data_direction == DMA_NONE) {
++ if (!scsi_bufflen(cmd) || cmd->sc_data_direction == DMA_NONE ||
++ tot_dsds == 0) {
+ cmd_pkt->byte_count = cpu_to_le32(0);
+ return 0;
+ }
+@@ -2541,7 +2546,7 @@ qla24xx_tm_iocb(srb_t *sp, struct tsk_mgmt_entry *tsk)
+ scsi_qla_host_t *vha = fcport->vha;
+ struct qla_hw_data *ha = vha->hw;
+ struct srb_iocb *iocb = &sp->u.iocb_cmd;
+- struct req_que *req = vha->req;
++ struct req_que *req = sp->qpair->req;
+
+ flags = iocb->u.tmf.flags;
+ lun = iocb->u.tmf.lun;
+@@ -2557,7 +2562,8 @@ qla24xx_tm_iocb(srb_t *sp, struct tsk_mgmt_entry *tsk)
+ tsk->port_id[2] = fcport->d_id.b.domain;
+ tsk->vp_index = fcport->vha->vp_idx;
+
+- if (flags == TCF_LUN_RESET) {
++ if (flags & (TCF_LUN_RESET | TCF_ABORT_TASK_SET|
++ TCF_CLEAR_TASK_SET|TCF_CLEAR_ACA)) {
+ int_to_scsilun(lun, &tsk->lun);
+ host_to_fcp_swap((uint8_t *)&tsk->lun,
+ sizeof(tsk->lun));
+@@ -3852,9 +3858,9 @@ static int qla_get_iocbs_resource(struct srb *sp)
+ case SRB_NACK_LOGO:
+ case SRB_LOGOUT_CMD:
+ case SRB_CTRL_VP:
+- push_it_through = true;
+- fallthrough;
++ case SRB_MARKER:
+ default:
++ push_it_through = true;
+ get_exch = false;
+ }
+
+@@ -3870,6 +3876,19 @@ static int qla_get_iocbs_resource(struct srb *sp)
+ return qla_get_fw_resources(sp->qpair, &sp->iores);
+ }
+
++static void
++qla_marker_iocb(srb_t *sp, struct mrk_entry_24xx *mrk)
++{
++ mrk->entry_type = MARKER_TYPE;
++ mrk->modifier = sp->u.iocb_cmd.u.tmf.modifier;
++ if (sp->u.iocb_cmd.u.tmf.modifier != MK_SYNC_ALL) {
++ mrk->nport_handle = cpu_to_le16(sp->u.iocb_cmd.u.tmf.loop_id);
++ int_to_scsilun(sp->u.iocb_cmd.u.tmf.lun, (struct scsi_lun *)&mrk->lun);
++ host_to_fcp_swap(mrk->lun, sizeof(mrk->lun));
++ mrk->vp_index = sp->u.iocb_cmd.u.tmf.vp_index;
++ }
++}
++
+ int
+ qla2x00_start_sp(srb_t *sp)
+ {
+@@ -3892,7 +3911,7 @@ qla2x00_start_sp(srb_t *sp)
+
+ pkt = __qla2x00_alloc_iocbs(sp->qpair, sp);
+ if (!pkt) {
+- rval = EAGAIN;
++ rval = -EAGAIN;
+ ql_log(ql_log_warn, vha, 0x700c,
+ "qla2x00_alloc_iocbs failed.\n");
+ goto done;
+@@ -3973,6 +3992,9 @@ qla2x00_start_sp(srb_t *sp)
+ case SRB_SA_REPLACE:
+ qla24xx_sa_replace_iocb(sp, pkt);
+ break;
++ case SRB_MARKER:
++ qla_marker_iocb(sp, pkt);
++ break;
+ default:
+ break;
+ }
+diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
+index 245e3a5d81fd3..656700f793259 100644
+--- a/drivers/scsi/qla2xxx/qla_isr.c
++++ b/drivers/scsi/qla2xxx/qla_isr.c
+@@ -1862,9 +1862,9 @@ qla2x00_process_completed_request(struct scsi_qla_host *vha,
+ }
+ }
+
+-srb_t *
+-qla2x00_get_sp_from_handle(scsi_qla_host_t *vha, const char *func,
+- struct req_que *req, void *iocb)
++static srb_t *
++qla_get_sp_from_handle(scsi_qla_host_t *vha, const char *func,
++ struct req_que *req, void *iocb, u16 *ret_index)
+ {
+ struct qla_hw_data *ha = vha->hw;
+ sts_entry_t *pkt = iocb;
+@@ -1899,12 +1899,25 @@ qla2x00_get_sp_from_handle(scsi_qla_host_t *vha, const char *func,
+ return NULL;
+ }
+
+- req->outstanding_cmds[index] = NULL;
+-
++ *ret_index = index;
+ qla_put_fw_resources(sp->qpair, &sp->iores);
+ return sp;
+ }
+
++srb_t *
++qla2x00_get_sp_from_handle(scsi_qla_host_t *vha, const char *func,
++ struct req_que *req, void *iocb)
++{
++ uint16_t index;
++ srb_t *sp;
++
++ sp = qla_get_sp_from_handle(vha, func, req, iocb, &index);
++ if (sp)
++ req->outstanding_cmds[index] = NULL;
++
++ return sp;
++}
++
+ static void
+ qla2x00_mbx_iocb_entry(scsi_qla_host_t *vha, struct req_que *req,
+ struct mbx_entry *mbx)
+@@ -3237,13 +3250,13 @@ qla2x00_status_entry(scsi_qla_host_t *vha, struct rsp_que *rsp, void *pkt)
+ return;
+ }
+
+- req->outstanding_cmds[handle] = NULL;
+ cp = GET_CMD_SP(sp);
+ if (cp == NULL) {
+ ql_dbg(ql_dbg_io, vha, 0x3018,
+ "Command already returned (0x%x/%p).\n",
+ sts->handle, sp);
+
++ req->outstanding_cmds[handle] = NULL;
+ return;
+ }
+
+@@ -3514,6 +3527,9 @@ out:
+
+ if (rsp->status_srb == NULL)
+ sp->done(sp, res);
++
++ /* For I/Os, clearing outstanding_cmds[handle] means scsi_done was called. */
++ req->outstanding_cmds[handle] = NULL;
+ }
+
+ /**
+@@ -3590,6 +3606,7 @@ qla2x00_error_entry(scsi_qla_host_t *vha, struct rsp_que *rsp, sts_entry_t *pkt)
+ uint16_t que = MSW(pkt->handle);
+ struct req_que *req = NULL;
+ int res = DID_ERROR << 16;
++ u16 index;
+
+ ql_dbg(ql_dbg_async, vha, 0x502a,
+ "iocb type %xh with error status %xh, handle %xh, rspq id %d\n",
+@@ -3608,7 +3625,6 @@ qla2x00_error_entry(scsi_qla_host_t *vha, struct rsp_que *rsp, sts_entry_t *pkt)
+
+ switch (pkt->entry_type) {
+ case NOTIFY_ACK_TYPE:
+- case STATUS_TYPE:
+ case STATUS_CONT_TYPE:
+ case LOGINOUT_PORT_IOCB_TYPE:
+ case CT_IOCB_TYPE:
+@@ -3628,6 +3644,14 @@ qla2x00_error_entry(scsi_qla_host_t *vha, struct rsp_que *rsp, sts_entry_t *pkt)
+ case CTIO_TYPE7:
+ case CTIO_CRC2:
+ return 1;
++ case STATUS_TYPE:
++ sp = qla_get_sp_from_handle(vha, func, req, pkt, &index);
++ if (sp) {
++ sp->done(sp, res);
++ req->outstanding_cmds[index] = NULL;
++ return 0;
++ }
++ break;
+ }
+ fatal:
+ ql_log(ql_log_warn, vha, 0x5030,
+@@ -3750,6 +3774,28 @@ static int qla_chk_cont_iocb_avail(struct scsi_qla_host *vha,
+ return rc;
+ }
+
++static void qla_marker_iocb_entry(scsi_qla_host_t *vha, struct req_que *req,
++ struct mrk_entry_24xx *pkt)
++{
++ const char func[] = "MRK-IOCB";
++ srb_t *sp;
++ int res = QLA_SUCCESS;
++
++ if (!IS_FWI2_CAPABLE(vha->hw))
++ return;
++
++ sp = qla2x00_get_sp_from_handle(vha, func, req, pkt);
++ if (!sp)
++ return;
++
++ if (pkt->entry_status) {
++ ql_dbg(ql_dbg_taskm, vha, 0x8025, "marker failure.\n");
++ res = QLA_COMMAND_ERROR;
++ }
++ sp->u.iocb_cmd.u.tmf.data = res;
++ sp->done(sp, res);
++}
++
+ /**
+ * qla24xx_process_response_queue() - Process response queue entries.
+ * @vha: SCSI driver HA context
+@@ -3866,9 +3912,7 @@ process_err:
+ (struct nack_to_isp *)pkt);
+ break;
+ case MARKER_TYPE:
+- /* Do nothing in this case, this check is to prevent it
+- * from falling into default case
+- */
++ qla_marker_iocb_entry(vha, rsp->req, (struct mrk_entry_24xx *)pkt);
+ break;
+ case ABORT_IOCB_TYPE:
+ qla24xx_abort_iocb_entry(vha, rsp->req,
+diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
+index 648e8f7986065..86e85f2f4782f 100644
+--- a/drivers/scsi/qla2xxx/qla_nvme.c
++++ b/drivers/scsi/qla2xxx/qla_nvme.c
+@@ -360,7 +360,6 @@ static int qla_nvme_ls_req(struct nvme_fc_local_port *lport,
+ if (rval != QLA_SUCCESS) {
+ ql_log(ql_log_warn, vha, 0x700e,
+ "qla2x00_start_sp failed = %d\n", rval);
+- wake_up(&sp->nvme_ls_waitq);
+ sp->priv = NULL;
+ priv->sp = NULL;
+ qla2x00_rel_sp(sp);
+@@ -652,7 +651,6 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport,
+ if (!sp)
+ return -EBUSY;
+
+- init_waitqueue_head(&sp->nvme_ls_waitq);
+ kref_init(&sp->cmd_kref);
+ spin_lock_init(&priv->cmd_lock);
+ sp->priv = priv;
+@@ -671,7 +669,6 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport,
+ if (rval != QLA_SUCCESS) {
+ ql_log(ql_log_warn, vha, 0x212d,
+ "qla2x00_start_nvme_mq failed = %d\n", rval);
+- wake_up(&sp->nvme_ls_waitq);
+ sp->priv = NULL;
+ priv->sp = NULL;
+ qla2xxx_rel_qpair_sp(sp->qpair, sp);
+diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
+index 2fa695bf38b77..877e4f446709d 100644
+--- a/drivers/scsi/qla2xxx/qla_os.c
++++ b/drivers/scsi/qla2xxx/qla_os.c
+@@ -1078,43 +1078,6 @@ qc24_fail_command:
+ return 0;
+ }
+
+-/*
+- * qla2x00_eh_wait_on_command
+- * Waits for the command to be returned by the Firmware for some
+- * max time.
+- *
+- * Input:
+- * cmd = Scsi Command to wait on.
+- *
+- * Return:
+- * Completed in time : QLA_SUCCESS
+- * Did not complete in time : QLA_FUNCTION_FAILED
+- */
+-static int
+-qla2x00_eh_wait_on_command(struct scsi_cmnd *cmd)
+-{
+-#define ABORT_POLLING_PERIOD 1000
+-#define ABORT_WAIT_ITER ((2 * 1000) / (ABORT_POLLING_PERIOD))
+- unsigned long wait_iter = ABORT_WAIT_ITER;
+- scsi_qla_host_t *vha = shost_priv(cmd->device->host);
+- struct qla_hw_data *ha = vha->hw;
+- srb_t *sp = scsi_cmd_priv(cmd);
+- int ret = QLA_SUCCESS;
+-
+- if (unlikely(pci_channel_offline(ha->pdev)) || ha->flags.eeh_busy) {
+- ql_dbg(ql_dbg_taskm, vha, 0x8005,
+- "Return:eh_wait.\n");
+- return ret;
+- }
+-
+- while (sp->type && wait_iter--)
+- msleep(ABORT_POLLING_PERIOD);
+- if (sp->type)
+- ret = QLA_FUNCTION_FAILED;
+-
+- return ret;
+-}
+-
+ /*
+ * qla2x00_wait_for_hba_online
+ * Wait till the HBA is online after going through
+@@ -1365,6 +1328,9 @@ qla2xxx_eh_abort(struct scsi_cmnd *cmd)
+ return ret;
+ }
+
++#define ABORT_POLLING_PERIOD 1000
++#define ABORT_WAIT_ITER ((2 * 1000) / (ABORT_POLLING_PERIOD))
++
+ /*
+ * Returns: QLA_SUCCESS or QLA_FUNCTION_FAILED.
+ */
+@@ -1378,41 +1344,73 @@ __qla2x00_eh_wait_for_pending_commands(struct qla_qpair *qpair, unsigned int t,
+ struct req_que *req = qpair->req;
+ srb_t *sp;
+ struct scsi_cmnd *cmd;
++ unsigned long wait_iter = ABORT_WAIT_ITER;
++ bool found;
++ struct qla_hw_data *ha = vha->hw;
+
+ status = QLA_SUCCESS;
+
+- spin_lock_irqsave(qpair->qp_lock_ptr, flags);
+- for (cnt = 1; status == QLA_SUCCESS &&
+- cnt < req->num_outstanding_cmds; cnt++) {
+- sp = req->outstanding_cmds[cnt];
+- if (!sp)
+- continue;
+- if (sp->type != SRB_SCSI_CMD)
+- continue;
+- if (vha->vp_idx != sp->vha->vp_idx)
+- continue;
+- match = 0;
+- cmd = GET_CMD_SP(sp);
+- switch (type) {
+- case WAIT_HOST:
+- match = 1;
+- break;
+- case WAIT_TARGET:
+- match = cmd->device->id == t;
+- break;
+- case WAIT_LUN:
+- match = (cmd->device->id == t &&
+- cmd->device->lun == l);
+- break;
+- }
+- if (!match)
+- continue;
++ while (wait_iter--) {
++ found = false;
+
+- spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
+- status = qla2x00_eh_wait_on_command(cmd);
+ spin_lock_irqsave(qpair->qp_lock_ptr, flags);
++ for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
++ sp = req->outstanding_cmds[cnt];
++ if (!sp)
++ continue;
++ if (sp->type != SRB_SCSI_CMD)
++ continue;
++ if (vha->vp_idx != sp->vha->vp_idx)
++ continue;
++ match = 0;
++ cmd = GET_CMD_SP(sp);
++ switch (type) {
++ case WAIT_HOST:
++ match = 1;
++ break;
++ case WAIT_TARGET:
++ if (sp->fcport)
++ match = sp->fcport->d_id.b24 == t;
++ else
++ match = 0;
++ break;
++ case WAIT_LUN:
++ if (sp->fcport)
++ match = (sp->fcport->d_id.b24 == t &&
++ cmd->device->lun == l);
++ else
++ match = 0;
++ break;
++ }
++ if (!match)
++ continue;
++
++ spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
++
++ if (unlikely(pci_channel_offline(ha->pdev)) ||
++ ha->flags.eeh_busy) {
++ ql_dbg(ql_dbg_taskm, vha, 0x8005,
++ "Return:eh_wait.\n");
++ return status;
++ }
++
++ /*
++ * SRB_SCSI_CMD is still in the outstanding_cmds array,
++ * which means scsi_done has not been called yet. Wait
++ * for it to be cleared from outstanding_cmds.
++ */
++ msleep(ABORT_POLLING_PERIOD);
++ spin_lock_irqsave(qpair->qp_lock_ptr, flags);
++ found = true;
++ }
++ spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
++
++ if (!found)
++ break;
+ }
+- spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
++
++ if (wait_iter == -1)
++ status = QLA_FUNCTION_FAILED;
+
+ return status;
+ }
+@@ -5090,7 +5088,8 @@ struct scsi_qla_host *qla2x00_create_host(const struct scsi_host_template *sht,
+ }
+ INIT_DELAYED_WORK(&vha->scan.scan_work, qla_scan_work_fn);
+
+- sprintf(vha->host_str, "%s_%lu", QLA2XXX_DRIVER_NAME, vha->host_no);
++ snprintf(vha->host_str, sizeof(vha->host_str), "%s_%lu",
++ QLA2XXX_DRIVER_NAME, vha->host_no);
+ ql_dbg(ql_dbg_init, vha, 0x0041,
+ "Allocated the host=%p hw=%p vha=%p dev_name=%s",
+ vha->host, vha->hw, vha,
+diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c
+index 33dd8c315eb73..46820bcdae983 100644
+--- a/drivers/soc/qcom/mdt_loader.c
++++ b/drivers/soc/qcom/mdt_loader.c
+@@ -210,6 +210,7 @@ int qcom_mdt_pas_init(struct device *dev, const struct firmware *fw,
+ const struct elf32_hdr *ehdr;
+ phys_addr_t min_addr = PHYS_ADDR_MAX;
+ phys_addr_t max_addr = 0;
++ bool relocate = false;
+ size_t metadata_len;
+ void *metadata;
+ int ret;
+@@ -224,6 +225,9 @@ int qcom_mdt_pas_init(struct device *dev, const struct firmware *fw,
+ if (!mdt_phdr_valid(phdr))
+ continue;
+
++ if (phdr->p_flags & QCOM_MDT_RELOCATABLE)
++ relocate = true;
++
+ if (phdr->p_paddr < min_addr)
+ min_addr = phdr->p_paddr;
+
+@@ -246,11 +250,13 @@ int qcom_mdt_pas_init(struct device *dev, const struct firmware *fw,
+ goto out;
+ }
+
+- ret = qcom_scm_pas_mem_setup(pas_id, mem_phys, max_addr - min_addr);
+- if (ret) {
+- /* Unable to set up relocation */
+- dev_err(dev, "error %d setting up firmware %s\n", ret, fw_name);
+- goto out;
++ if (relocate) {
++ ret = qcom_scm_pas_mem_setup(pas_id, mem_phys, max_addr - min_addr);
++ if (ret) {
++ /* Unable to set up relocation */
++ dev_err(dev, "error %d setting up firmware %s\n", ret, fw_name);
++ goto out;
++ }
+ }
+
+ out:
+diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
+index bd39e78788590..e3ef5ebae6b7c 100644
+--- a/drivers/soundwire/qcom.c
++++ b/drivers/soundwire/qcom.c
+@@ -171,7 +171,8 @@ struct qcom_swrm_ctrl {
+ u32 intr_mask;
+ u8 rcmd_id;
+ u8 wcmd_id;
+- struct qcom_swrm_port_config pconfig[QCOM_SDW_MAX_PORTS];
++ /* Port numbers are 1 - 14 */
++ struct qcom_swrm_port_config pconfig[QCOM_SDW_MAX_PORTS + 1];
+ struct sdw_stream_runtime *sruntime[SWRM_MAX_DAIS];
+ enum sdw_slave_status status[SDW_MAX_DEVICES + 1];
+ int (*reg_read)(struct qcom_swrm_ctrl *ctrl, int reg, u32 *val);
+diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
+index 1c9e5d2ea7de6..552e8a7415625 100644
+--- a/drivers/tty/n_tty.c
++++ b/drivers/tty/n_tty.c
+@@ -203,8 +203,8 @@ static void n_tty_kick_worker(struct tty_struct *tty)
+ struct n_tty_data *ldata = tty->disc_data;
+
+ /* Did the input worker stop? Restart it */
+- if (unlikely(ldata->no_room)) {
+- ldata->no_room = 0;
++ if (unlikely(READ_ONCE(ldata->no_room))) {
++ WRITE_ONCE(ldata->no_room, 0);
+
+ WARN_RATELIMIT(tty->port->itty == NULL,
+ "scheduling with invalid itty\n");
+@@ -1697,7 +1697,7 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
+ if (overflow && room < 0)
+ ldata->read_head--;
+ room = overflow;
+- ldata->no_room = flow && !room;
++ WRITE_ONCE(ldata->no_room, flow && !room);
+ } else
+ overflow = 0;
+
+@@ -1728,6 +1728,17 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
+ } else
+ n_tty_check_throttle(tty);
+
++ if (unlikely(ldata->no_room)) {
++ /*
++ * The barrier here ensures that the latest read_tail is read in
++ * chars_in_buffer() and that read_tail is not loaded
++ * before ldata->no_room is set.
++ */
++ smp_mb();
++ if (!chars_in_buffer(tty))
++ n_tty_kick_worker(tty);
++ }
++
+ up_read(&tty->termios_rwsem);
+
+ return rcvd;
+@@ -2281,8 +2292,14 @@ more_to_be_read:
+ if (time)
+ timeout = time;
+ }
+- if (old_tail != ldata->read_tail)
++ if (old_tail != ldata->read_tail) {
++ /*
++ * Make sure no_room is not read in n_tty_kick_worker()
++ * before setting ldata->read_tail in copy_from_read_buf().
++ */
++ smp_mb();
+ n_tty_kick_worker(tty);
++ }
+ up_read(&tty->termios_rwsem);
+
+ remove_wait_queue(&tty->read_wait, &wait);
+diff --git a/drivers/tty/serial/8250/8250.h b/drivers/tty/serial/8250/8250.h
+index 1e8fe44a7099f..eeb7b43ebe539 100644
+--- a/drivers/tty/serial/8250/8250.h
++++ b/drivers/tty/serial/8250/8250.h
+@@ -91,7 +91,6 @@ struct serial8250_config {
+ #define UART_BUG_TXEN BIT(1) /* UART has buggy TX IIR status */
+ #define UART_BUG_NOMSR BIT(2) /* UART has buggy MSR status bits (Au1x00) */
+ #define UART_BUG_THRE BIT(3) /* UART has buggy THRE reassertion */
+-#define UART_BUG_PARITY BIT(4) /* UART mishandles parity if FIFO enabled */
+ #define UART_BUG_TXRACE BIT(5) /* UART Tx fails to set remote DR */
+
+
+diff --git a/drivers/tty/serial/8250/8250_pci.c b/drivers/tty/serial/8250/8250_pci.c
+index e80c4f6551a1c..d2d547b5da95a 100644
+--- a/drivers/tty/serial/8250/8250_pci.c
++++ b/drivers/tty/serial/8250/8250_pci.c
+@@ -1232,14 +1232,6 @@ static int pci_oxsemi_tornado_setup(struct serial_private *priv,
+ return pci_default_setup(priv, board, up, idx);
+ }
+
+-static int pci_asix_setup(struct serial_private *priv,
+- const struct pciserial_board *board,
+- struct uart_8250_port *port, int idx)
+-{
+- port->bugs |= UART_BUG_PARITY;
+- return pci_default_setup(priv, board, port, idx);
+-}
+-
+ #define QPCR_TEST_FOR1 0x3F
+ #define QPCR_TEST_GET1 0x00
+ #define QPCR_TEST_FOR2 0x40
+@@ -1955,7 +1947,6 @@ pci_moxa_setup(struct serial_private *priv,
+ #define PCI_DEVICE_ID_WCH_CH355_4S 0x7173
+ #define PCI_VENDOR_ID_AGESTAR 0x5372
+ #define PCI_DEVICE_ID_AGESTAR_9375 0x6872
+-#define PCI_VENDOR_ID_ASIX 0x9710
+ #define PCI_DEVICE_ID_BROADCOM_TRUMANAGE 0x160a
+ #define PCI_DEVICE_ID_AMCC_ADDIDATA_APCI7800 0x818e
+
+@@ -2600,16 +2591,6 @@ static struct pci_serial_quirk pci_serial_quirks[] = {
+ .exit = pci_wch_ch38x_exit,
+ .setup = pci_wch_ch38x_setup,
+ },
+- /*
+- * ASIX devices with FIFO bug
+- */
+- {
+- .vendor = PCI_VENDOR_ID_ASIX,
+- .device = PCI_ANY_ID,
+- .subvendor = PCI_ANY_ID,
+- .subdevice = PCI_ANY_ID,
+- .setup = pci_asix_setup,
+- },
+ /*
+ * Broadcom TruManage (NetXtreme)
+ */
+diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
+index c153ba3a018a2..053d44412e42f 100644
+--- a/drivers/tty/serial/8250/8250_port.c
++++ b/drivers/tty/serial/8250/8250_port.c
+@@ -2636,11 +2636,8 @@ static unsigned char serial8250_compute_lcr(struct uart_8250_port *up,
+
+ if (c_cflag & CSTOPB)
+ cval |= UART_LCR_STOP;
+- if (c_cflag & PARENB) {
++ if (c_cflag & PARENB)
+ cval |= UART_LCR_PARITY;
+- if (up->bugs & UART_BUG_PARITY)
+- up->fifo_bug = true;
+- }
+ if (!(c_cflag & PARODD))
+ cval |= UART_LCR_EPAR;
+ if (c_cflag & CMSPAR)
+@@ -2801,8 +2798,7 @@ serial8250_do_set_termios(struct uart_port *port, struct ktermios *termios,
+ up->lcr = cval; /* Save computed LCR */
+
+ if (up->capabilities & UART_CAP_FIFO && port->fifosize > 1) {
+- /* NOTE: If fifo_bug is not set, a user can set RX_trigger. */
+- if ((baud < 2400 && !up->dma) || up->fifo_bug) {
++ if (baud < 2400 && !up->dma) {
+ up->fcr &= ~UART_FCR_TRIGGER_MASK;
+ up->fcr |= UART_FCR_TRIGGER_1;
+ }
+@@ -3138,8 +3134,7 @@ static int do_set_rxtrig(struct tty_port *port, unsigned char bytes)
+ struct uart_8250_port *up = up_to_u8250p(uport);
+ int rxtrig;
+
+- if (!(up->capabilities & UART_CAP_FIFO) || uport->fifosize <= 1 ||
+- up->fifo_bug)
++ if (!(up->capabilities & UART_CAP_FIFO) || uport->fifosize <= 1)
+ return -EINVAL;
+
+ rxtrig = bytes_to_fcr_rxtrig(up, bytes);
+diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c
+index 9cd7479b03c08..595b591bb49b7 100644
+--- a/drivers/tty/serial/atmel_serial.c
++++ b/drivers/tty/serial/atmel_serial.c
+@@ -868,11 +868,11 @@ static void atmel_complete_tx_dma(void *arg)
+ dmaengine_terminate_all(chan);
+ uart_xmit_advance(port, atmel_port->tx_len);
+
+- spin_lock_irq(&atmel_port->lock_tx);
++ spin_lock(&atmel_port->lock_tx);
+ async_tx_ack(atmel_port->desc_tx);
+ atmel_port->cookie_tx = -EINVAL;
+ atmel_port->desc_tx = NULL;
+- spin_unlock_irq(&atmel_port->lock_tx);
++ spin_unlock(&atmel_port->lock_tx);
+
+ if (uart_circ_chars_pending(xmit) < WAKEUP_CHARS)
+ uart_write_wakeup(port);
+diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
+index c5e17569c3adc..3fe8ff241522c 100644
+--- a/drivers/tty/serial/imx.c
++++ b/drivers/tty/serial/imx.c
+@@ -369,6 +369,16 @@ static void imx_uart_soft_reset(struct imx_port *sport)
+ sport->idle_counter = 0;
+ }
+
++static void imx_uart_disable_loopback_rs485(struct imx_port *sport)
++{
++ unsigned int uts;
++
++ /* See SER_RS485_ENABLED/UTS_LOOP comment in imx_uart_probe() */
++ uts = imx_uart_readl(sport, imx_uart_uts_reg(sport));
++ uts &= ~UTS_LOOP;
++ imx_uart_writel(sport, uts, imx_uart_uts_reg(sport));
++}
++
+ /* called with port.lock taken and irqs off */
+ static void imx_uart_start_rx(struct uart_port *port)
+ {
+@@ -390,6 +400,7 @@ static void imx_uart_start_rx(struct uart_port *port)
+ /* Write UCR2 first as it includes RXEN */
+ imx_uart_writel(sport, ucr2, UCR2);
+ imx_uart_writel(sport, ucr1, UCR1);
++ imx_uart_disable_loopback_rs485(sport);
+ }
+
+ /* called with port.lock taken and irqs off */
+@@ -1422,7 +1433,7 @@ static int imx_uart_startup(struct uart_port *port)
+ int retval;
+ unsigned long flags;
+ int dma_is_inited = 0;
+- u32 ucr1, ucr2, ucr3, ucr4, uts;
++ u32 ucr1, ucr2, ucr3, ucr4;
+
+ retval = clk_prepare_enable(sport->clk_per);
+ if (retval)
+@@ -1521,10 +1532,7 @@ static int imx_uart_startup(struct uart_port *port)
+ imx_uart_writel(sport, ucr2, UCR2);
+ }
+
+- /* See SER_RS485_ENABLED/UTS_LOOP comment in imx_uart_probe() */
+- uts = imx_uart_readl(sport, imx_uart_uts_reg(sport));
+- uts &= ~UTS_LOOP;
+- imx_uart_writel(sport, uts, imx_uart_uts_reg(sport));
++ imx_uart_disable_loopback_rs485(sport);
+
+ spin_unlock_irqrestore(&sport->port.lock, flags);
+
+diff --git a/drivers/tty/serial/samsung_tty.c b/drivers/tty/serial/samsung_tty.c
+index 2a7520ad3abd9..0b37019820b41 100644
+--- a/drivers/tty/serial/samsung_tty.c
++++ b/drivers/tty/serial/samsung_tty.c
+@@ -1459,8 +1459,12 @@ static unsigned int s3c24xx_serial_getclk(struct s3c24xx_uart_port *ourport,
+ continue;
+
+ rate = clk_get_rate(clk);
+- if (!rate)
++ if (!rate) {
++ dev_err(ourport->port.dev,
++ "Failed to get clock rate for %s.\n", clkname);
++ clk_put(clk);
+ continue;
++ }
+
+ if (ourport->info->has_divslot) {
+ unsigned long div = rate / req_baud;
+@@ -1486,10 +1490,18 @@ static unsigned int s3c24xx_serial_getclk(struct s3c24xx_uart_port *ourport,
+ calc_deviation = -calc_deviation;
+
+ if (calc_deviation < deviation) {
++ /*
++ * If we find a better clk, release the previous one, if
++ * any.
++ */
++ if (!IS_ERR(*best_clk))
++ clk_put(*best_clk);
+ *best_clk = clk;
+ best_quot = quot;
+ *clk_num = cnt;
+ deviation = calc_deviation;
++ } else {
++ clk_put(clk);
+ }
+ }
+
+diff --git a/drivers/ufs/host/Kconfig b/drivers/ufs/host/Kconfig
+index 8793e34335806..f11e98c9e6652 100644
+--- a/drivers/ufs/host/Kconfig
++++ b/drivers/ufs/host/Kconfig
+@@ -72,6 +72,7 @@ config SCSI_UFS_QCOM
+ config SCSI_UFS_MEDIATEK
+ tristate "Mediatek specific hooks to UFS controller platform driver"
+ depends on SCSI_UFSHCD_PLATFORM && ARCH_MEDIATEK
++ depends on RESET_CONTROLLER
+ select PHY_MTK_UFS
+ select RESET_TI_SYSCON
+ help
+diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
+index 7e106bd804ca1..1979702b0fa9e 100644
+--- a/drivers/usb/host/xhci-mem.c
++++ b/drivers/usb/host/xhci-mem.c
+@@ -1968,7 +1968,7 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
+ {
+ u32 temp, port_offset, port_count;
+ int i;
+- u8 major_revision, minor_revision;
++ u8 major_revision, minor_revision, tmp_minor_revision;
+ struct xhci_hub *rhub;
+ struct device *dev = xhci_to_hcd(xhci)->self.sysdev;
+ struct xhci_port_cap *port_cap;
+@@ -1988,6 +1988,15 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
+ */
+ if (minor_revision > 0x00 && minor_revision < 0x10)
+ minor_revision <<= 4;
++ /*
++ * Some Zhaoxin xHCI controllers follow the USB 3.1 spec
++ * but only support Gen1 speeds.
++ */
++ if (xhci->quirks & XHCI_ZHAOXIN_HOST) {
++ tmp_minor_revision = minor_revision;
++ minor_revision = 0;
++ }
++
+ } else if (major_revision <= 0x02) {
+ rhub = &xhci->usb2_rhub;
+ } else {
+@@ -1996,10 +2005,6 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
+ /* Ignoring port protocol we can't understand. FIXME */
+ return;
+ }
+- rhub->maj_rev = XHCI_EXT_PORT_MAJOR(temp);
+-
+- if (rhub->min_rev < minor_revision)
+- rhub->min_rev = minor_revision;
+
+ /* Port offset and count in the third dword, see section 7.2 */
+ temp = readl(addr + 2);
+@@ -2017,8 +2022,6 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
+ if (xhci->num_port_caps > max_caps)
+ return;
+
+- port_cap->maj_rev = major_revision;
+- port_cap->min_rev = minor_revision;
+ port_cap->psi_count = XHCI_EXT_PORT_PSIC(temp);
+
+ if (port_cap->psi_count) {
+@@ -2039,6 +2042,11 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
+ XHCI_EXT_PORT_PSIV(port_cap->psi[i - 1])))
+ port_cap->psi_uid_count++;
+
++ if (xhci->quirks & XHCI_ZHAOXIN_HOST &&
++ major_revision == 0x03 &&
++ XHCI_EXT_PORT_PSIV(port_cap->psi[i]) >= 5)
++ minor_revision = tmp_minor_revision;
++
+ xhci_dbg(xhci, "PSIV:%d PSIE:%d PLT:%d PFD:%d LP:%d PSIM:%d\n",
+ XHCI_EXT_PORT_PSIV(port_cap->psi[i]),
+ XHCI_EXT_PORT_PSIE(port_cap->psi[i]),
+@@ -2048,6 +2056,15 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
+ XHCI_EXT_PORT_PSIM(port_cap->psi[i]));
+ }
+ }
++
++ rhub->maj_rev = major_revision;
++
++ if (rhub->min_rev < minor_revision)
++ rhub->min_rev = minor_revision;
++
++ port_cap->maj_rev = major_revision;
++ port_cap->min_rev = minor_revision;
++
+ /* cache usb2 port capabilities */
+ if (major_revision < 0x03 && xhci->num_ext_caps < max_caps)
+ xhci->ext_caps[xhci->num_ext_caps++] = temp;
+@@ -2352,8 +2369,12 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
+ * and our use of dma addresses in the trb_address_map radix tree needs
+ * TRB_SEGMENT_SIZE alignment, so we pick the greater alignment need.
+ */
+- xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
+- TRB_SEGMENT_SIZE, TRB_SEGMENT_SIZE, xhci->page_size);
++ if (xhci->quirks & XHCI_ZHAOXIN_TRB_FETCH)
++ xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
++ TRB_SEGMENT_SIZE * 2, TRB_SEGMENT_SIZE * 2, xhci->page_size * 2);
++ else
++ xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
++ TRB_SEGMENT_SIZE, TRB_SEGMENT_SIZE, xhci->page_size);
+
+ /* See Table 46 and Note on Figure 55 */
+ xhci->device_pool = dma_pool_create("xHCI input/output contexts", dev,
+diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
+index 69a5cb7eba381..a410162e15df1 100644
+--- a/drivers/usb/host/xhci-pci.c
++++ b/drivers/usb/host/xhci-pci.c
+@@ -528,6 +528,18 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
+ pdev->device == PCI_DEVICE_ID_AMD_PROMONTORYA_4))
+ xhci->quirks |= XHCI_NO_SOFT_RETRY;
+
++ if (pdev->vendor == PCI_VENDOR_ID_ZHAOXIN) {
++ xhci->quirks |= XHCI_ZHAOXIN_HOST;
++
++ if (pdev->device == 0x9202) {
++ xhci->quirks |= XHCI_RESET_ON_RESUME;
++ xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
++ }
++
++ if (pdev->device == 0x9203)
++ xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
++ }
++
+ /* xHC spec requires PCI devices to support D3hot and D3cold */
+ if (xhci->hci_version >= 0x120)
+ xhci->quirks |= XHCI_DEFAULT_PM_RUNTIME_ALLOW;
+diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
+index f845c15073ba4..4474d540f6b49 100644
+--- a/drivers/usb/host/xhci.h
++++ b/drivers/usb/host/xhci.h
+@@ -1905,6 +1905,8 @@ struct xhci_hcd {
+ #define XHCI_EP_CTX_BROKEN_DCS BIT_ULL(42)
+ #define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43)
+ #define XHCI_RESET_TO_DEFAULT BIT_ULL(44)
++#define XHCI_ZHAOXIN_TRB_FETCH BIT_ULL(45)
++#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
+
+ unsigned int num_active_eps;
+ unsigned int limit_active_eps;
+diff --git a/drivers/xen/grant-dma-ops.c b/drivers/xen/grant-dma-ops.c
+index 9784a77fa3c99..76f6f26265a3b 100644
+--- a/drivers/xen/grant-dma-ops.c
++++ b/drivers/xen/grant-dma-ops.c
+@@ -303,6 +303,8 @@ static struct device_node *xen_dt_get_node(struct device *dev)
+ while (!pci_is_root_bus(bus))
+ bus = bus->parent;
+
++ if (!bus->bridge->parent)
++ return NULL;
+ return of_node_get(bus->bridge->parent->of_node);
+ }
+
+diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
+index 6bb251a4d613e..59cbfb80edbda 100644
+--- a/fs/ceph/addr.c
++++ b/fs/ceph/addr.c
+@@ -187,16 +187,42 @@ static void ceph_netfs_expand_readahead(struct netfs_io_request *rreq)
+ struct inode *inode = rreq->inode;
+ struct ceph_inode_info *ci = ceph_inode(inode);
+ struct ceph_file_layout *lo = &ci->i_layout;
++ unsigned long max_pages = inode->i_sb->s_bdi->ra_pages;
++ loff_t end = rreq->start + rreq->len, new_end;
++ struct ceph_netfs_request_data *priv = rreq->netfs_priv;
++ unsigned long max_len;
+ u32 blockoff;
+- u64 blockno;
+
+- /* Expand the start downward */
+- blockno = div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
+- rreq->start = blockno * lo->stripe_unit;
+- rreq->len += blockoff;
++ if (priv) {
++ /* Readahead is disabled by posix_fadvise POSIX_FADV_RANDOM */
++ if (priv->file_ra_disabled)
++ max_pages = 0;
++ else
++ max_pages = priv->file_ra_pages;
++
++ }
+
+- /* Now, round up the length to the next block */
+- rreq->len = roundup(rreq->len, lo->stripe_unit);
++ /* Readahead is disabled */
++ if (!max_pages)
++ return;
++
++ max_len = max_pages << PAGE_SHIFT;
++
++ /*
++ * Try to expand the length forward by rounding it up to the next
++ * block, but do not exceed the file size, unless the original
++ * request already exceeds it.
++ */
++ new_end = min(round_up(end, lo->stripe_unit), rreq->i_size);
++ if (new_end > end && new_end <= rreq->start + max_len)
++ rreq->len = new_end - rreq->start;
++
++ /* Try to expand the start downward */
++ div_u64_rem(rreq->start, lo->stripe_unit, &blockoff);
++ if (rreq->len + blockoff <= max_len) {
++ rreq->start -= blockoff;
++ rreq->len += blockoff;
++ }
+ }
+
+ static bool ceph_netfs_clamp_length(struct netfs_io_subrequest *subreq)
+@@ -362,18 +388,28 @@ static int ceph_init_request(struct netfs_io_request *rreq, struct file *file)
+ {
+ struct inode *inode = rreq->inode;
+ int got = 0, want = CEPH_CAP_FILE_CACHE;
++ struct ceph_netfs_request_data *priv;
+ int ret = 0;
+
+ if (rreq->origin != NETFS_READAHEAD)
+ return 0;
+
++ priv = kzalloc(sizeof(*priv), GFP_NOFS);
++ if (!priv)
++ return -ENOMEM;
++
+ if (file) {
+ struct ceph_rw_context *rw_ctx;
+ struct ceph_file_info *fi = file->private_data;
+
++ priv->file_ra_pages = file->f_ra.ra_pages;
++ priv->file_ra_disabled = file->f_mode & FMODE_RANDOM;
++
+ rw_ctx = ceph_find_rw_context(fi);
+- if (rw_ctx)
++ if (rw_ctx) {
++ rreq->netfs_priv = priv;
+ return 0;
++ }
+ }
+
+ /*
+@@ -383,27 +419,40 @@ static int ceph_init_request(struct netfs_io_request *rreq, struct file *file)
+ ret = ceph_try_get_caps(inode, CEPH_CAP_FILE_RD, want, true, &got);
+ if (ret < 0) {
+ dout("start_read %p, error getting cap\n", inode);
+- return ret;
++ goto out;
+ }
+
+ if (!(got & want)) {
+ dout("start_read %p, no cache cap\n", inode);
+- return -EACCES;
++ ret = -EACCES;
++ goto out;
++ }
++ if (ret == 0) {
++ ret = -EACCES;
++ goto out;
+ }
+- if (ret == 0)
+- return -EACCES;
+
+- rreq->netfs_priv = (void *)(uintptr_t)got;
+- return 0;
++ priv->caps = got;
++ rreq->netfs_priv = priv;
++
++out:
++ if (ret < 0)
++ kfree(priv);
++
++ return ret;
+ }
+
+ static void ceph_netfs_free_request(struct netfs_io_request *rreq)
+ {
+- struct ceph_inode_info *ci = ceph_inode(rreq->inode);
+- int got = (uintptr_t)rreq->netfs_priv;
++ struct ceph_netfs_request_data *priv = rreq->netfs_priv;
++
++ if (!priv)
++ return;
+
+- if (got)
+- ceph_put_cap_refs(ci, got);
++ if (priv->caps)
++ ceph_put_cap_refs(ceph_inode(rreq->inode), priv->caps);
++ kfree(priv);
++ rreq->netfs_priv = NULL;
+ }
+
+ const struct netfs_request_ops ceph_netfs_ops = {
+diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
+index 2321e5ddb664d..ee6611e141590 100644
+--- a/fs/ceph/caps.c
++++ b/fs/ceph/caps.c
+@@ -3560,6 +3560,15 @@ static void handle_cap_grant(struct inode *inode,
+ }
+ BUG_ON(cap->issued & ~cap->implemented);
+
++ /* don't let check_caps skip sending a response to MDS for revoke msgs */
++ if (le32_to_cpu(grant->op) == CEPH_CAP_OP_REVOKE) {
++ cap->mds_wanted = 0;
++ if (cap == ci->i_auth_cap)
++ check_caps = 1; /* check auth cap only */
++ else
++ check_caps = 2; /* check all caps */
++ }
++
+ if (extra_info->inline_version > 0 &&
+ extra_info->inline_version >= ci->i_inline_version) {
+ ci->i_inline_version = extra_info->inline_version;
+diff --git a/fs/ceph/super.h b/fs/ceph/super.h
+index d24bf0db52346..3bfddf34d488b 100644
+--- a/fs/ceph/super.h
++++ b/fs/ceph/super.h
+@@ -451,6 +451,19 @@ struct ceph_inode_info {
+ unsigned long i_work_mask;
+ };
+
++struct ceph_netfs_request_data {
++ int caps;
++
++ /*
++ * Maximum size of a file readahead request.
++ * The fadvise could update the bdi's default ra_pages.
++ */
++ unsigned int file_ra_pages;
++
++ /* Set it if fadvise disables file readahead entirely */
++ bool file_ra_disabled;
++};
++
+ static inline struct ceph_inode_info *
+ ceph_inode(const struct inode *inode)
+ {
+diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
+index 700ff2e0515a1..ff0ef4653535b 100644
+--- a/fs/dlm/ast.c
++++ b/fs/dlm/ast.c
+@@ -181,10 +181,12 @@ void dlm_callback_work(struct work_struct *work)
+
+ spin_lock(&lkb->lkb_cb_lock);
+ rv = dlm_dequeue_lkb_callback(lkb, &cb);
+- spin_unlock(&lkb->lkb_cb_lock);
+-
+- if (WARN_ON_ONCE(rv == DLM_DEQUEUE_CALLBACK_EMPTY))
++ if (WARN_ON_ONCE(rv == DLM_DEQUEUE_CALLBACK_EMPTY)) {
++ clear_bit(DLM_IFL_CB_PENDING_BIT, &lkb->lkb_iflags);
++ spin_unlock(&lkb->lkb_cb_lock);
+ goto out;
++ }
++ spin_unlock(&lkb->lkb_cb_lock);
+
+ for (;;) {
+ castfn = lkb->lkb_astfn;
+diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c
+index 67261b7b1f0e1..0455dddb0797c 100644
+--- a/fs/dlm/lockspace.c
++++ b/fs/dlm/lockspace.c
+@@ -935,15 +935,3 @@ void dlm_stop_lockspaces(void)
+ log_print("dlm user daemon left %d lockspaces", count);
+ }
+
+-void dlm_stop_lockspaces_check(void)
+-{
+- struct dlm_ls *ls;
+-
+- spin_lock(&lslist_lock);
+- list_for_each_entry(ls, &lslist, ls_list) {
+- if (WARN_ON(!rwsem_is_locked(&ls->ls_in_recovery) ||
+- !dlm_locking_stopped(ls)))
+- break;
+- }
+- spin_unlock(&lslist_lock);
+-}
+diff --git a/fs/dlm/lockspace.h b/fs/dlm/lockspace.h
+index 03f4a4a3a871c..47ebd44119264 100644
+--- a/fs/dlm/lockspace.h
++++ b/fs/dlm/lockspace.h
+@@ -27,7 +27,6 @@ struct dlm_ls *dlm_find_lockspace_local(void *id);
+ struct dlm_ls *dlm_find_lockspace_device(int minor);
+ void dlm_put_lockspace(struct dlm_ls *ls);
+ void dlm_stop_lockspaces(void);
+-void dlm_stop_lockspaces_check(void);
+ int dlm_new_user_lockspace(const char *name, const char *cluster,
+ uint32_t flags, int lvblen,
+ const struct dlm_lockspace_ops *ops,
+diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
+index 3d3802c47b8bb..5aad4d4842eba 100644
+--- a/fs/dlm/lowcomms.c
++++ b/fs/dlm/lowcomms.c
+@@ -898,6 +898,7 @@ static void process_dlm_messages(struct work_struct *work)
+ pentry = list_first_entry_or_null(&processqueue,
+ struct processqueue_entry, list);
+ if (WARN_ON_ONCE(!pentry)) {
++ process_dlm_messages_pending = false;
+ spin_unlock(&processqueue_lock);
+ return;
+ }
+diff --git a/fs/dlm/midcomms.c b/fs/dlm/midcomms.c
+index c02c43e4980a7..3df916a568baf 100644
+--- a/fs/dlm/midcomms.c
++++ b/fs/dlm/midcomms.c
+@@ -136,7 +136,6 @@
+ #include <net/tcp.h>
+
+ #include "dlm_internal.h"
+-#include "lockspace.h"
+ #include "lowcomms.h"
+ #include "config.h"
+ #include "memory.h"
+@@ -1491,8 +1490,6 @@ int dlm_midcomms_close(int nodeid)
+ if (nodeid == dlm_our_nodeid())
+ return 0;
+
+- dlm_stop_lockspaces_check();
+-
+ idx = srcu_read_lock(&nodes_srcu);
+ /* Abort pending close/remove operation */
+ node = nodeid2node(nodeid, 0);
+diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
+index ed4357e62f35f..70a4752ed913a 100644
+--- a/fs/dlm/plock.c
++++ b/fs/dlm/plock.c
+@@ -30,8 +30,6 @@ struct plock_async_data {
+ struct plock_op {
+ struct list_head list;
+ int done;
+- /* if lock op got interrupted while waiting dlm_controld reply */
+- bool sigint;
+ struct dlm_plock_info info;
+ /* if set indicates async handling */
+ struct plock_async_data *data;
+@@ -157,23 +155,29 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
+
+ send_op(op);
+
+- rv = wait_event_interruptible(recv_wq, (op->done != 0));
+- if (rv == -ERESTARTSYS) {
+- spin_lock(&ops_lock);
+- /* recheck under ops_lock if we got a done != 0,
+- * if so this interrupt case should be ignored
+- */
+- if (op->done != 0) {
++ if (op->info.wait) {
++ rv = wait_event_killable(recv_wq, (op->done != 0));
++ if (rv == -ERESTARTSYS) {
++ spin_lock(&ops_lock);
++ /* recheck under ops_lock if we got a done != 0,
++ * if so this interrupt case should be ignored
++ */
++ if (op->done != 0) {
++ spin_unlock(&ops_lock);
++ goto do_lock_wait;
++ }
++ list_del(&op->list);
+ spin_unlock(&ops_lock);
+- goto do_lock_wait;
+- }
+
+- op->sigint = true;
+- spin_unlock(&ops_lock);
+- log_debug(ls, "%s: wait interrupted %x %llx pid %d",
+- __func__, ls->ls_global_id,
+- (unsigned long long)number, op->info.pid);
+- goto out;
++ log_debug(ls, "%s: wait interrupted %x %llx pid %d",
++ __func__, ls->ls_global_id,
++ (unsigned long long)number, op->info.pid);
++ do_unlock_close(&op->info);
++ dlm_release_plock_op(op);
++ goto out;
++ }
++ } else {
++ wait_event(recv_wq, (op->done != 0));
+ }
+
+ do_lock_wait:
+@@ -360,7 +364,9 @@ int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file,
+ locks_init_lock(fl);
+ fl->fl_type = (op->info.ex) ? F_WRLCK : F_RDLCK;
+ fl->fl_flags = FL_POSIX;
+- fl->fl_pid = -op->info.pid;
++ fl->fl_pid = op->info.pid;
++ if (op->info.nodeid != dlm_our_nodeid())
++ fl->fl_pid = -fl->fl_pid;
+ fl->fl_start = op->info.start;
+ fl->fl_end = op->info.end;
+ rv = 0;
+@@ -389,7 +395,7 @@ static ssize_t dev_read(struct file *file, char __user *u, size_t count,
+ if (op->info.flags & DLM_PLOCK_FL_CLOSE)
+ list_del(&op->list);
+ else
+- list_move(&op->list, &recv_list);
++ list_move_tail(&op->list, &recv_list);
+ memcpy(&info, &op->info, sizeof(info));
+ }
+ spin_unlock(&ops_lock);
+@@ -427,34 +433,53 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
+ if (check_version(&info))
+ return -EINVAL;
+
++ /*
++ * The results for waiting ops (SETLKW) can be returned in any
++ * order, so match all fields to find the op. The results for
++ * non-waiting ops are returned in the order that they were sent
++ * to userspace, so match the result with the first non-waiting op.
++ */
+ spin_lock(&ops_lock);
+- list_for_each_entry(iter, &recv_list, list) {
+- if (iter->info.fsid == info.fsid &&
+- iter->info.number == info.number &&
+- iter->info.owner == info.owner) {
+- if (iter->sigint) {
+- list_del(&iter->list);
+- spin_unlock(&ops_lock);
+-
+- pr_debug("%s: sigint cleanup %x %llx pid %d",
+- __func__, iter->info.fsid,
+- (unsigned long long)iter->info.number,
+- iter->info.pid);
+- do_unlock_close(&iter->info);
+- memcpy(&iter->info, &info, sizeof(info));
+- dlm_release_plock_op(iter);
+- return count;
++ if (info.wait) {
++ list_for_each_entry(iter, &recv_list, list) {
++ if (iter->info.fsid == info.fsid &&
++ iter->info.number == info.number &&
++ iter->info.owner == info.owner &&
++ iter->info.pid == info.pid &&
++ iter->info.start == info.start &&
++ iter->info.end == info.end &&
++ iter->info.ex == info.ex &&
++ iter->info.wait) {
++ op = iter;
++ break;
++ }
++ }
++ } else {
++ list_for_each_entry(iter, &recv_list, list) {
++ if (!iter->info.wait) {
++ op = iter;
++ break;
+ }
+- list_del_init(&iter->list);
+- memcpy(&iter->info, &info, sizeof(info));
+- if (iter->data)
+- do_callback = 1;
+- else
+- iter->done = 1;
+- op = iter;
+- break;
+ }
+ }
++
++ if (op) {
++ /* Sanity check that op and info match. */
++ if (info.wait)
++ WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK);
++ else
++ WARN_ON(op->info.fsid != info.fsid ||
++ op->info.number != info.number ||
++ op->info.owner != info.owner ||
++ op->info.optype != info.optype);
++
++ list_del_init(&op->list);
++ memcpy(&op->info, &info, sizeof(info));
++ if (op->data)
++ do_callback = 1;
++ else
++ op->done = 1;
++ }
+ spin_unlock(&ops_lock);
+
+ if (op) {
+@@ -463,8 +488,8 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
+ else
+ wake_up(&recv_wq);
+ } else
+- log_print("%s: no op %x %llx", __func__,
+- info.fsid, (unsigned long long)info.number);
++ pr_debug("%s: no op %x %llx", __func__,
++ info.fsid, (unsigned long long)info.number);
+ return count;
+ }
+
+diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
+index d70b12b81507f..e12592727a546 100644
+--- a/fs/erofs/inode.c
++++ b/fs/erofs/inode.c
+@@ -183,7 +183,8 @@ static void *erofs_read_inode(struct erofs_buf *buf,
+
+ inode->i_flags &= ~S_DAX;
+ if (test_opt(&sbi->opt, DAX_ALWAYS) && S_ISREG(inode->i_mode) &&
+- vi->datalayout == EROFS_INODE_FLAT_PLAIN)
++ (vi->datalayout == EROFS_INODE_FLAT_PLAIN ||
++ vi->datalayout == EROFS_INODE_CHUNK_BASED))
+ inode->i_flags |= S_DAX;
+
+ if (!nblks)
+diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
+index 502893e3da010..997ca4b32e87f 100644
+--- a/fs/erofs/zdata.c
++++ b/fs/erofs/zdata.c
+@@ -990,7 +990,7 @@ hitted:
+ */
+ tight &= (fe->mode > Z_EROFS_PCLUSTER_FOLLOWED_NOINPLACE);
+
+- cur = end - min_t(unsigned int, offset + end - map->m_la, end);
++ cur = end - min_t(erofs_off_t, offset + end - map->m_la, end);
+ if (!(map->m_flags & EROFS_MAP_MAPPED)) {
+ zero_user_segment(page, cur, end);
+ goto next_part;
+@@ -1807,7 +1807,7 @@ static void z_erofs_pcluster_readmore(struct z_erofs_decompress_frontend *f,
+ }
+
+ cur = map->m_la + map->m_llen - 1;
+- while (cur >= end) {
++ while ((cur >= end) && (cur < i_size_read(inode))) {
+ pgoff_t index = cur >> PAGE_SHIFT;
+ struct page *page;
+
+diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
+index 26f135e7ffce7..dc76147e7b07b 100644
+--- a/fs/ext2/inode.c
++++ b/fs/ext2/inode.c
+@@ -1259,9 +1259,8 @@ static int ext2_setsize(struct inode *inode, loff_t newsize)
+ inode_dio_wait(inode);
+
+ if (IS_DAX(inode))
+- error = dax_zero_range(inode, newsize,
+- PAGE_ALIGN(newsize) - newsize, NULL,
+- &ext2_iomap_ops);
++ error = dax_truncate_page(inode, newsize, NULL,
++ &ext2_iomap_ops);
+ else
+ error = block_truncate_page(inode->i_mapping,
+ newsize, ext2_get_block);
+diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
+index c68bebe7ff4b6..a9f3716119d37 100644
+--- a/fs/ext4/indirect.c
++++ b/fs/ext4/indirect.c
+@@ -651,6 +651,14 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
+
+ ext4_update_inode_fsync_trans(handle, inode, 1);
+ count = ar.len;
++
++ /*
++ * Update reserved blocks/metadata blocks after successful block
++ * allocation which had been deferred till now.
++ */
++ if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
++ ext4_da_update_reserve_space(inode, count, 1);
++
+ got_it:
+ map->m_flags |= EXT4_MAP_MAPPED;
+ map->m_pblk = le32_to_cpu(chain[depth-1].key);
+diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
+index 02de439bf1f04..ac35120cd35a7 100644
+--- a/fs/ext4/inode.c
++++ b/fs/ext4/inode.c
+@@ -632,16 +632,6 @@ found:
+ */
+ ext4_clear_inode_state(inode, EXT4_STATE_EXT_MIGRATE);
+ }
+-
+- /*
+- * Update reserved blocks/metadata blocks after successful
+- * block allocation which had been deferred till now. We don't
+- * support fallocate for non extent files. So we can update
+- * reserve space here.
+- */
+- if ((retval > 0) &&
+- (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE))
+- ext4_da_update_reserve_space(inode, retval, 1);
+ }
+
+ if (retval > 0) {
+diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
+index f9a4301520632..55be1b8a63608 100644
+--- a/fs/ext4/ioctl.c
++++ b/fs/ext4/ioctl.c
+@@ -797,6 +797,7 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg)
+ {
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ __u32 flags;
++ int ret;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+@@ -815,7 +816,9 @@ static int ext4_shutdown(struct super_block *sb, unsigned long arg)
+
+ switch (flags) {
+ case EXT4_GOING_FLAGS_DEFAULT:
+- freeze_bdev(sb->s_bdev);
++ ret = freeze_bdev(sb->s_bdev);
++ if (ret)
++ return ret;
+ set_bit(EXT4_FLAGS_SHUTDOWN, &sbi->s_ext4_flags);
+ thaw_bdev(sb->s_bdev);
+ break;
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index 20f67a260df50..fd4d12c58c3b4 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -6243,8 +6243,8 @@ do_more:
+ * them with group lock_held
+ */
+ if (test_opt(sb, DISCARD)) {
+- err = ext4_issue_discard(sb, block_group, bit, count,
+- NULL);
++ err = ext4_issue_discard(sb, block_group, bit,
++ count_clusters, NULL);
+ if (err && err != -EOPNOTSUPP)
+ ext4_msg(sb, KERN_WARNING, "discard request in"
+ " group:%u block:%d count:%lu failed"
+@@ -6328,12 +6328,6 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
+
+ sbi = EXT4_SB(sb);
+
+- if (sbi->s_mount_state & EXT4_FC_REPLAY) {
+- ext4_free_blocks_simple(inode, block, count);
+- return;
+- }
+-
+- might_sleep();
+ if (bh) {
+ if (block)
+ BUG_ON(block != bh->b_blocknr);
+@@ -6341,6 +6335,13 @@ void ext4_free_blocks(handle_t *handle, struct inode *inode,
+ block = bh->b_blocknr;
+ }
+
++ if (sbi->s_mount_state & EXT4_FC_REPLAY) {
++ ext4_free_blocks_simple(inode, block, EXT4_NUM_B2C(sbi, count));
++ return;
++ }
++
++ might_sleep();
++
+ if (!(flags & EXT4_FREE_BLOCKS_VALIDATED) &&
+ !ext4_inode_block_valid(inode, block, count)) {
+ ext4_error(sb, "Freeing blocks not in datazone - "
+diff --git a/fs/ext4/super.c b/fs/ext4/super.c
+index 05fcecc36244d..77b4d9d2b2150 100644
+--- a/fs/ext4/super.c
++++ b/fs/ext4/super.c
+@@ -1128,6 +1128,12 @@ static void ext4_blkdev_remove(struct ext4_sb_info *sbi)
+ struct block_device *bdev;
+ bdev = sbi->s_journal_bdev;
+ if (bdev) {
++ /*
++ * Invalidate the journal device's buffers. We don't want them
++ * floating about in memory - the physical journal device may
++ * hotswapped, and it breaks the `ro-after' testing code.
++ */
++ invalidate_bdev(bdev);
+ ext4_blkdev_put(bdev);
+ sbi->s_journal_bdev = NULL;
+ }
+@@ -1328,13 +1334,7 @@ static void ext4_put_super(struct super_block *sb)
+ sync_blockdev(sb->s_bdev);
+ invalidate_bdev(sb->s_bdev);
+ if (sbi->s_journal_bdev && sbi->s_journal_bdev != sb->s_bdev) {
+- /*
+- * Invalidate the journal device's buffers. We don't want them
+- * floating about in memory - the physical journal device may
+- * hotswapped, and it breaks the `ro-after' testing code.
+- */
+ sync_blockdev(sbi->s_journal_bdev);
+- invalidate_bdev(sbi->s_journal_bdev);
+ ext4_blkdev_remove(sbi);
+ }
+
+@@ -5567,7 +5567,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
+ ext4_msg(sb, KERN_INFO, "recovery complete");
+ err = ext4_mark_recovery_complete(sb, es);
+ if (err)
+- goto failed_mount9;
++ goto failed_mount10;
+ }
+
+ if (test_opt(sb, DISCARD) && !bdev_max_discard_sectors(sb->s_bdev))
+@@ -5586,7 +5586,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
+
+ return 0;
+
+-failed_mount9:
++failed_mount10:
++ ext4_quota_off_umount(sb);
++failed_mount9: __maybe_unused
+ ext4_release_orphan_info(sb);
+ failed_mount8:
+ ext4_unregister_sysfs(sb);
+@@ -5645,6 +5647,7 @@ failed_mount:
+ brelse(sbi->s_sbh);
+ ext4_blkdev_remove(sbi);
+ out_fail:
++ invalidate_bdev(sb->s_bdev);
+ sb->s_fs_info = NULL;
+ return err;
+ }
+@@ -5979,19 +5982,27 @@ static int ext4_load_journal(struct super_block *sb,
+ err = jbd2_journal_wipe(journal, !really_read_only);
+ if (!err) {
+ char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL);
++ __le16 orig_state;
++ bool changed = false;
+
+ if (save)
+ memcpy(save, ((char *) es) +
+ EXT4_S_ERR_START, EXT4_S_ERR_LEN);
+ err = jbd2_journal_load(journal);
+- if (save)
++ if (save && memcmp(((char *) es) + EXT4_S_ERR_START,
++ save, EXT4_S_ERR_LEN)) {
+ memcpy(((char *) es) + EXT4_S_ERR_START,
+ save, EXT4_S_ERR_LEN);
++ changed = true;
++ }
+ kfree(save);
++ orig_state = es->s_state;
+ es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state &
+ EXT4_ERROR_FS);
++ if (orig_state != es->s_state)
++ changed = true;
+ /* Write out restored error information to the superblock */
+- if (!bdev_read_only(sb->s_bdev)) {
++ if (changed && !really_read_only) {
+ int err2;
+ err2 = ext4_commit_super(sb);
+ err = err ? : err2;
+diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
+index 887e559884501..d635c58cf5a39 100644
+--- a/fs/f2fs/dir.c
++++ b/fs/f2fs/dir.c
+@@ -775,8 +775,15 @@ int f2fs_add_dentry(struct inode *dir, const struct f2fs_filename *fname,
+ {
+ int err = -EAGAIN;
+
+- if (f2fs_has_inline_dentry(dir))
++ if (f2fs_has_inline_dentry(dir)) {
++ /*
++ * Should get i_xattr_sem to keep the lock order:
++ * i_xattr_sem -> inode_page lock used by f2fs_setxattr.
++ */
++ f2fs_down_read(&F2FS_I(dir)->i_xattr_sem);
+ err = f2fs_add_inline_entry(dir, fname, inode, ino, mode);
++ f2fs_up_read(&F2FS_I(dir)->i_xattr_sem);
++ }
+ if (err == -EAGAIN)
+ err = f2fs_add_regular_entry(dir, fname, inode, ino, mode);
+
+diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
+index 17082dc3c1a34..3d91b5313947f 100644
+--- a/fs/f2fs/super.c
++++ b/fs/f2fs/super.c
+@@ -2086,9 +2086,22 @@ static int f2fs_show_options(struct seq_file *seq, struct dentry *root)
+ return 0;
+ }
+
+-static void default_options(struct f2fs_sb_info *sbi)
++static void default_options(struct f2fs_sb_info *sbi, bool remount)
+ {
+ /* init some FS parameters */
++ if (!remount) {
++ set_opt(sbi, READ_EXTENT_CACHE);
++ clear_opt(sbi, DISABLE_CHECKPOINT);
++
++ if (f2fs_hw_support_discard(sbi) || f2fs_hw_should_discard(sbi))
++ set_opt(sbi, DISCARD);
++
++ if (f2fs_sb_has_blkzoned(sbi))
++ F2FS_OPTION(sbi).discard_unit = DISCARD_UNIT_SECTION;
++ else
++ F2FS_OPTION(sbi).discard_unit = DISCARD_UNIT_BLOCK;
++ }
++
+ if (f2fs_sb_has_readonly(sbi))
+ F2FS_OPTION(sbi).active_logs = NR_CURSEG_RO_TYPE;
+ else
+@@ -2118,23 +2131,16 @@ static void default_options(struct f2fs_sb_info *sbi)
+ set_opt(sbi, INLINE_XATTR);
+ set_opt(sbi, INLINE_DATA);
+ set_opt(sbi, INLINE_DENTRY);
+- set_opt(sbi, READ_EXTENT_CACHE);
+ set_opt(sbi, NOHEAP);
+- clear_opt(sbi, DISABLE_CHECKPOINT);
+ set_opt(sbi, MERGE_CHECKPOINT);
+ F2FS_OPTION(sbi).unusable_cap = 0;
+ sbi->sb->s_flags |= SB_LAZYTIME;
+ if (!f2fs_is_readonly(sbi))
+ set_opt(sbi, FLUSH_MERGE);
+- if (f2fs_hw_support_discard(sbi) || f2fs_hw_should_discard(sbi))
+- set_opt(sbi, DISCARD);
+- if (f2fs_sb_has_blkzoned(sbi)) {
++ if (f2fs_sb_has_blkzoned(sbi))
+ F2FS_OPTION(sbi).fs_mode = FS_MODE_LFS;
+- F2FS_OPTION(sbi).discard_unit = DISCARD_UNIT_SECTION;
+- } else {
++ else
+ F2FS_OPTION(sbi).fs_mode = FS_MODE_ADAPTIVE;
+- F2FS_OPTION(sbi).discard_unit = DISCARD_UNIT_BLOCK;
+- }
+
+ #ifdef CONFIG_F2FS_FS_XATTR
+ set_opt(sbi, XATTR_USER);
+@@ -2306,7 +2312,7 @@ static int f2fs_remount(struct super_block *sb, int *flags, char *data)
+ clear_sbi_flag(sbi, SBI_NEED_SB_WRITE);
+ }
+
+- default_options(sbi);
++ default_options(sbi, true);
+
+ /* parse mount options */
+ err = parse_options(sb, data, true);
+@@ -4357,7 +4363,7 @@ try_onemore:
+ sbi->s_chksum_seed = f2fs_chksum(sbi, ~0, raw_super->uuid,
+ sizeof(raw_super->uuid));
+
+- default_options(sbi);
++ default_options(sbi, false);
+ /* parse mount options */
+ options = kstrdup((const char *)data, GFP_KERNEL);
+ if (data && !options) {
+diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c
+index 213805d3592cc..476b186b90a6c 100644
+--- a/fs/f2fs/xattr.c
++++ b/fs/f2fs/xattr.c
+@@ -528,10 +528,12 @@ int f2fs_getxattr(struct inode *inode, int index, const char *name,
+ if (len > F2FS_NAME_LEN)
+ return -ERANGE;
+
+- f2fs_down_read(&F2FS_I(inode)->i_xattr_sem);
++ if (!ipage)
++ f2fs_down_read(&F2FS_I(inode)->i_xattr_sem);
+ error = lookup_all_xattrs(inode, ipage, index, len, name,
+ &entry, &base_addr, &base_size, &is_inline);
+- f2fs_up_read(&F2FS_I(inode)->i_xattr_sem);
++ if (!ipage)
++ f2fs_up_read(&F2FS_I(inode)->i_xattr_sem);
+ if (error)
+ return error;
+
+diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c
+index a3eb1e8269477..da6a2bc6bf022 100644
+--- a/fs/jfs/jfs_dmap.c
++++ b/fs/jfs/jfs_dmap.c
+@@ -178,7 +178,13 @@ int dbMount(struct inode *ipbmap)
+ dbmp_le = (struct dbmap_disk *) mp->data;
+ bmp->db_mapsize = le64_to_cpu(dbmp_le->dn_mapsize);
+ bmp->db_nfree = le64_to_cpu(dbmp_le->dn_nfree);
++
+ bmp->db_l2nbperpage = le32_to_cpu(dbmp_le->dn_l2nbperpage);
++ if (bmp->db_l2nbperpage > L2PSIZE - L2MINBLOCKSIZE) {
++ err = -EINVAL;
++ goto err_release_metapage;
++ }
++
+ bmp->db_numag = le32_to_cpu(dbmp_le->dn_numag);
+ if (!bmp->db_numag) {
+ err = -EINVAL;
+diff --git a/fs/jfs/jfs_filsys.h b/fs/jfs/jfs_filsys.h
+index b5d702df7111a..33ef13a0b1108 100644
+--- a/fs/jfs/jfs_filsys.h
++++ b/fs/jfs/jfs_filsys.h
+@@ -122,7 +122,9 @@
+ #define NUM_INODE_PER_IAG INOSPERIAG
+
+ #define MINBLOCKSIZE 512
++#define L2MINBLOCKSIZE 9
+ #define MAXBLOCKSIZE 4096
++#define L2MAXBLOCKSIZE 12
+ #define MAXFILESIZE ((s64)1 << 52)
+
+ #define JFS_LINK_MAX 0xffffffff
+diff --git a/fs/smb/client/cifs_dfs_ref.c b/fs/smb/client/cifs_dfs_ref.c
+index 0329a907bdfe8..b1c2499b1c3b8 100644
+--- a/fs/smb/client/cifs_dfs_ref.c
++++ b/fs/smb/client/cifs_dfs_ref.c
+@@ -118,12 +118,12 @@ cifs_build_devname(char *nodename, const char *prepath)
+ return dev;
+ }
+
+-static int set_dest_addr(struct smb3_fs_context *ctx, const char *full_path)
++static int set_dest_addr(struct smb3_fs_context *ctx)
+ {
+ struct sockaddr *addr = (struct sockaddr *)&ctx->dstaddr;
+ int rc;
+
+- rc = dns_resolve_server_name_to_ip(full_path, addr, NULL);
++ rc = dns_resolve_server_name_to_ip(ctx->source, addr, NULL);
+ if (!rc)
+ cifs_set_port(addr, ctx->port);
+ return rc;
+@@ -171,10 +171,9 @@ static struct vfsmount *cifs_dfs_do_automount(struct path *path)
+ mnt = ERR_CAST(full_path);
+ goto out;
+ }
+- cifs_dbg(FYI, "%s: full_path: %s\n", __func__, full_path);
+
+ tmp = *cur_ctx;
+- tmp.source = full_path;
++ tmp.source = NULL;
+ tmp.leaf_fullpath = NULL;
+ tmp.UNC = tmp.prepath = NULL;
+ tmp.dfs_root_ses = NULL;
+@@ -185,13 +184,22 @@ static struct vfsmount *cifs_dfs_do_automount(struct path *path)
+ goto out;
+ }
+
+- rc = set_dest_addr(ctx, full_path);
++ rc = smb3_parse_devname(full_path, ctx);
+ if (rc) {
+ mnt = ERR_PTR(rc);
+ goto out;
+ }
+
+- rc = smb3_parse_devname(full_path, ctx);
++ ctx->source = smb3_fs_context_fullpath(ctx, '/');
++ if (IS_ERR(ctx->source)) {
++ mnt = ERR_CAST(ctx->source);
++ ctx->source = NULL;
++ goto out;
++ }
++ cifs_dbg(FYI, "%s: ctx: source=%s UNC=%s prepath=%s dstaddr=%pISpc\n",
++ __func__, ctx->source, ctx->UNC, ctx->prepath, &ctx->dstaddr);
++
++ rc = set_dest_addr(ctx);
+ if (!rc)
+ mnt = fc_mount(fc);
+ else
+diff --git a/fs/smb/client/cifsproto.h b/fs/smb/client/cifsproto.h
+index 94ab6402965c5..1d71d658e1679 100644
+--- a/fs/smb/client/cifsproto.h
++++ b/fs/smb/client/cifsproto.h
+@@ -85,6 +85,8 @@ extern void release_mid(struct mid_q_entry *mid);
+ extern void cifs_wake_up_task(struct mid_q_entry *mid);
+ extern int cifs_handle_standard(struct TCP_Server_Info *server,
+ struct mid_q_entry *mid);
++extern char *smb3_fs_context_fullpath(const struct smb3_fs_context *ctx,
++ char dirsep);
+ extern int smb3_parse_devname(const char *devname, struct smb3_fs_context *ctx);
+ extern int smb3_parse_opt(const char *options, const char *key, char **val);
+ extern int cifs_ipaddr_cmp(struct sockaddr *srcaddr, struct sockaddr *rhs);
+diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c
+index 9d963caec35c8..a0c4e9874b010 100644
+--- a/fs/smb/client/cifssmb.c
++++ b/fs/smb/client/cifssmb.c
+@@ -3184,7 +3184,7 @@ setAclRetry:
+ param_offset = offsetof(struct smb_com_transaction2_spi_req,
+ InformationLevel) - 4;
+ offset = param_offset + params;
+- parm_data = ((char *) &pSMB->hdr.Protocol) + offset;
++ parm_data = ((char *)pSMB) + sizeof(pSMB->hdr.smb_buf_length) + offset;
+ pSMB->ParameterOffset = cpu_to_le16(param_offset);
+
+ /* convert to on the wire format for POSIX ACL */
+diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c
+index 267536a7531df..26d14dd0482ef 100644
+--- a/fs/smb/client/dfs.c
++++ b/fs/smb/client/dfs.c
+@@ -54,39 +54,6 @@ out:
+ return rc;
+ }
+
+-/*
+- * cifs_build_path_to_root returns full path to root when we do not have an
+- * existing connection (tcon)
+- */
+-static char *build_unc_path_to_root(const struct smb3_fs_context *ctx,
+- const struct cifs_sb_info *cifs_sb, bool useppath)
+-{
+- char *full_path, *pos;
+- unsigned int pplen = useppath && ctx->prepath ? strlen(ctx->prepath) + 1 : 0;
+- unsigned int unc_len = strnlen(ctx->UNC, MAX_TREE_SIZE + 1);
+-
+- if (unc_len > MAX_TREE_SIZE)
+- return ERR_PTR(-EINVAL);
+-
+- full_path = kmalloc(unc_len + pplen + 1, GFP_KERNEL);
+- if (full_path == NULL)
+- return ERR_PTR(-ENOMEM);
+-
+- memcpy(full_path, ctx->UNC, unc_len);
+- pos = full_path + unc_len;
+-
+- if (pplen) {
+- *pos = CIFS_DIR_SEP(cifs_sb);
+- memcpy(pos + 1, ctx->prepath, pplen);
+- pos += pplen;
+- }
+-
+- *pos = '\0'; /* add trailing null */
+- convert_delimiter(full_path, CIFS_DIR_SEP(cifs_sb));
+- cifs_dbg(FYI, "%s: full_path=%s\n", __func__, full_path);
+- return full_path;
+-}
+-
+ static int get_session(struct cifs_mount_ctx *mnt_ctx, const char *full_path)
+ {
+ struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
+@@ -179,6 +146,7 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx)
+ struct TCP_Server_Info *server;
+ struct cifs_tcon *tcon;
+ char *origin_fullpath = NULL;
++ char sep = CIFS_DIR_SEP(cifs_sb);
+ int num_links = 0;
+ int rc;
+
+@@ -186,7 +154,7 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx)
+ if (IS_ERR(ref_path))
+ return PTR_ERR(ref_path);
+
+- full_path = build_unc_path_to_root(ctx, cifs_sb, true);
++ full_path = smb3_fs_context_fullpath(ctx, sep);
+ if (IS_ERR(full_path)) {
+ rc = PTR_ERR(full_path);
+ full_path = NULL;
+@@ -228,7 +196,7 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx)
+ kfree(full_path);
+ ref_path = full_path = NULL;
+
+- full_path = build_unc_path_to_root(ctx, cifs_sb, true);
++ full_path = smb3_fs_context_fullpath(ctx, sep);
+ if (IS_ERR(full_path)) {
+ rc = PTR_ERR(full_path);
+ full_path = NULL;
+@@ -296,8 +264,9 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
+ if (!nodfs) {
+ rc = dfs_get_referral(mnt_ctx, ctx->UNC + 1, NULL, NULL);
+ if (rc) {
+- if (rc != -ENOENT && rc != -EOPNOTSUPP && rc != -EIO)
+- return rc;
++ cifs_dbg(FYI, "%s: no dfs referral for %s: %d\n",
++ __func__, ctx->UNC + 1, rc);
++ cifs_dbg(FYI, "%s: assuming non-dfs mount...\n", __func__);
+ nodfs = true;
+ }
+ }
+diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
+index 1a854dc204823..d554bca7e07eb 100644
+--- a/fs/smb/client/file.c
++++ b/fs/smb/client/file.c
+@@ -1080,8 +1080,8 @@ int cifs_close(struct inode *inode, struct file *file)
+ cfile = file->private_data;
+ file->private_data = NULL;
+ dclose = kmalloc(sizeof(struct cifs_deferred_close), GFP_KERNEL);
+- if ((cinode->oplock == CIFS_CACHE_RHW_FLG) &&
+- cinode->lease_granted &&
++ if ((cifs_sb->ctx->closetimeo && cinode->oplock == CIFS_CACHE_RHW_FLG)
++ && cinode->lease_granted &&
+ !test_bit(CIFS_INO_CLOSE_ON_LOCK, &cinode->flags) &&
+ dclose) {
+ if (test_and_clear_bit(CIFS_INO_MODIFIED_ATTR, &cinode->flags)) {
+diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
+index 1bda75609b642..4946a0c596009 100644
+--- a/fs/smb/client/fs_context.c
++++ b/fs/smb/client/fs_context.c
+@@ -441,14 +441,17 @@ out:
+ * but there are some bugs that prevent rename from working if there are
+ * multiple delimiters.
+ *
+- * Returns a sanitized duplicate of @path. @gfp indicates the GFP_* flags
+- * for kstrdup.
++ * Return a sanitized duplicate of @path or NULL for empty prefix paths.
++ * Otherwise, return ERR_PTR.
++ *
++ * @gfp indicates the GFP_* flags for kstrdup.
+ * The caller is responsible for freeing the original.
+ */
+ #define IS_DELIM(c) ((c) == '/' || (c) == '\\')
+ char *cifs_sanitize_prepath(char *prepath, gfp_t gfp)
+ {
+ char *cursor1 = prepath, *cursor2 = prepath;
++ char *s;
+
+ /* skip all prepended delimiters */
+ while (IS_DELIM(*cursor1))
+@@ -469,8 +472,39 @@ char *cifs_sanitize_prepath(char *prepath, gfp_t gfp)
+ if (IS_DELIM(*(cursor2 - 1)))
+ cursor2--;
+
+- *(cursor2) = '\0';
+- return kstrdup(prepath, gfp);
++ *cursor2 = '\0';
++ if (!*prepath)
++ return NULL;
++ s = kstrdup(prepath, gfp);
++ if (!s)
++ return ERR_PTR(-ENOMEM);
++ return s;
++}
++
++/*
++ * Return full path based on the values of @ctx->{UNC,prepath}.
++ *
++ * It is assumed that both values were already parsed by smb3_parse_devname().
++ */
++char *smb3_fs_context_fullpath(const struct smb3_fs_context *ctx, char dirsep)
++{
++ size_t ulen, plen;
++ char *s;
++
++ ulen = strlen(ctx->UNC);
++ plen = ctx->prepath ? strlen(ctx->prepath) + 1 : 0;
++
++ s = kmalloc(ulen + plen + 1, GFP_KERNEL);
++ if (!s)
++ return ERR_PTR(-ENOMEM);
++ memcpy(s, ctx->UNC, ulen);
++ if (plen) {
++ s[ulen] = dirsep;
++ memcpy(s + ulen + 1, ctx->prepath, plen);
++ }
++ s[ulen + plen] = '\0';
++ convert_delimiter(s, dirsep);
++ return s;
+ }
+
+ /*
+@@ -484,6 +518,7 @@ smb3_parse_devname(const char *devname, struct smb3_fs_context *ctx)
+ char *pos;
+ const char *delims = "/\\";
+ size_t len;
++ int rc;
+
+ if (unlikely(!devname || !*devname)) {
+ cifs_dbg(VFS, "Device name not specified\n");
+@@ -511,6 +546,8 @@ smb3_parse_devname(const char *devname, struct smb3_fs_context *ctx)
+
+ /* now go until next delimiter or end of string */
+ len = strcspn(pos, delims);
++ if (!len)
++ return -EINVAL;
+
+ /* move "pos" up to delimiter or NULL */
+ pos += len;
+@@ -533,8 +570,11 @@ smb3_parse_devname(const char *devname, struct smb3_fs_context *ctx)
+ return 0;
+
+ ctx->prepath = cifs_sanitize_prepath(pos, GFP_KERNEL);
+- if (!ctx->prepath)
+- return -ENOMEM;
++ if (IS_ERR(ctx->prepath)) {
++ rc = PTR_ERR(ctx->prepath);
++ ctx->prepath = NULL;
++ return rc;
++ }
+
+ return 0;
+ }
+@@ -1146,12 +1186,13 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
+ cifs_errorf(fc, "Unknown error parsing devname\n");
+ goto cifs_parse_mount_err;
+ }
+- ctx->source = kstrdup(param->string, GFP_KERNEL);
+- if (ctx->source == NULL) {
++ ctx->source = smb3_fs_context_fullpath(ctx, '/');
++ if (IS_ERR(ctx->source)) {
++ ctx->source = NULL;
+ cifs_errorf(fc, "OOM when copying UNC string\n");
+ goto cifs_parse_mount_err;
+ }
+- fc->source = kstrdup(param->string, GFP_KERNEL);
++ fc->source = kstrdup(ctx->source, GFP_KERNEL);
+ if (fc->source == NULL) {
+ cifs_errorf(fc, "OOM when copying UNC string\n");
+ goto cifs_parse_mount_err;
+diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
+index b0dedc26643b6..70dbfe6584f9e 100644
+--- a/fs/smb/client/misc.c
++++ b/fs/smb/client/misc.c
+@@ -1211,16 +1211,21 @@ int match_target_ip(struct TCP_Server_Info *server,
+
+ int cifs_update_super_prepath(struct cifs_sb_info *cifs_sb, char *prefix)
+ {
++ int rc;
++
+ kfree(cifs_sb->prepath);
++ cifs_sb->prepath = NULL;
+
+ if (prefix && *prefix) {
+ cifs_sb->prepath = cifs_sanitize_prepath(prefix, GFP_ATOMIC);
+- if (!cifs_sb->prepath)
+- return -ENOMEM;
+-
+- convert_delimiter(cifs_sb->prepath, CIFS_DIR_SEP(cifs_sb));
+- } else
+- cifs_sb->prepath = NULL;
++ if (IS_ERR(cifs_sb->prepath)) {
++ rc = PTR_ERR(cifs_sb->prepath);
++ cifs_sb->prepath = NULL;
++ return rc;
++ }
++ if (cifs_sb->prepath)
++ convert_delimiter(cifs_sb->prepath, CIFS_DIR_SEP(cifs_sb));
++ }
+
+ cifs_sb->mnt_cifs_flags |= CIFS_MOUNT_USE_PREFIX_PATH;
+ return 0;
+diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c
+index 790acf65a0926..22954a9c7a6c7 100644
+--- a/fs/smb/client/smb2transport.c
++++ b/fs/smb/client/smb2transport.c
+@@ -153,7 +153,14 @@ smb2_find_smb_ses_unlocked(struct TCP_Server_Info *server, __u64 ses_id)
+ list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
+ if (ses->Suid != ses_id)
+ continue;
++
++ spin_lock(&ses->ses_lock);
++ if (ses->ses_status == SES_EXITING) {
++ spin_unlock(&ses->ses_lock);
++ continue;
++ }
+ ++ses->ses_count;
++ spin_unlock(&ses->ses_lock);
+ return ses;
+ }
+
+diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
+index da1787c68ba03..1cc336f512851 100644
+--- a/fs/smb/server/smb2pdu.c
++++ b/fs/smb/server/smb2pdu.c
+@@ -1322,9 +1322,8 @@ static int decode_negotiation_token(struct ksmbd_conn *conn,
+
+ static int ntlm_negotiate(struct ksmbd_work *work,
+ struct negotiate_message *negblob,
+- size_t negblob_len)
++ size_t negblob_len, struct smb2_sess_setup_rsp *rsp)
+ {
+- struct smb2_sess_setup_rsp *rsp = smb2_get_msg(work->response_buf);
+ struct challenge_message *chgblob;
+ unsigned char *spnego_blob = NULL;
+ u16 spnego_blob_len;
+@@ -1429,10 +1428,10 @@ static struct ksmbd_user *session_user(struct ksmbd_conn *conn,
+ return user;
+ }
+
+-static int ntlm_authenticate(struct ksmbd_work *work)
++static int ntlm_authenticate(struct ksmbd_work *work,
++ struct smb2_sess_setup_req *req,
++ struct smb2_sess_setup_rsp *rsp)
+ {
+- struct smb2_sess_setup_req *req = smb2_get_msg(work->request_buf);
+- struct smb2_sess_setup_rsp *rsp = smb2_get_msg(work->response_buf);
+ struct ksmbd_conn *conn = work->conn;
+ struct ksmbd_session *sess = work->sess;
+ struct channel *chann = NULL;
+@@ -1566,10 +1565,10 @@ binding_session:
+ }
+
+ #ifdef CONFIG_SMB_SERVER_KERBEROS5
+-static int krb5_authenticate(struct ksmbd_work *work)
++static int krb5_authenticate(struct ksmbd_work *work,
++ struct smb2_sess_setup_req *req,
++ struct smb2_sess_setup_rsp *rsp)
+ {
+- struct smb2_sess_setup_req *req = smb2_get_msg(work->request_buf);
+- struct smb2_sess_setup_rsp *rsp = smb2_get_msg(work->response_buf);
+ struct ksmbd_conn *conn = work->conn;
+ struct ksmbd_session *sess = work->sess;
+ char *in_blob, *out_blob;
+@@ -1647,7 +1646,9 @@ static int krb5_authenticate(struct ksmbd_work *work)
+ return 0;
+ }
+ #else
+-static int krb5_authenticate(struct ksmbd_work *work)
++static int krb5_authenticate(struct ksmbd_work *work,
++ struct smb2_sess_setup_req *req,
++ struct smb2_sess_setup_rsp *rsp)
+ {
+ return -EOPNOTSUPP;
+ }
+@@ -1656,8 +1657,8 @@ static int krb5_authenticate(struct ksmbd_work *work)
+ int smb2_sess_setup(struct ksmbd_work *work)
+ {
+ struct ksmbd_conn *conn = work->conn;
+- struct smb2_sess_setup_req *req = smb2_get_msg(work->request_buf);
+- struct smb2_sess_setup_rsp *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_sess_setup_req *req;
++ struct smb2_sess_setup_rsp *rsp;
+ struct ksmbd_session *sess;
+ struct negotiate_message *negblob;
+ unsigned int negblob_len, negblob_off;
+@@ -1665,6 +1666,8 @@ int smb2_sess_setup(struct ksmbd_work *work)
+
+ ksmbd_debug(SMB, "Received request for session setup\n");
+
++ WORK_BUFFERS(work, req, rsp);
++
+ rsp->StructureSize = cpu_to_le16(9);
+ rsp->SessionFlags = 0;
+ rsp->SecurityBufferOffset = cpu_to_le16(72);
+@@ -1786,7 +1789,7 @@ int smb2_sess_setup(struct ksmbd_work *work)
+
+ if (conn->preferred_auth_mech &
+ (KSMBD_AUTH_KRB5 | KSMBD_AUTH_MSKRB5)) {
+- rc = krb5_authenticate(work);
++ rc = krb5_authenticate(work, req, rsp);
+ if (rc) {
+ rc = -EINVAL;
+ goto out_err;
+@@ -1800,7 +1803,7 @@ int smb2_sess_setup(struct ksmbd_work *work)
+ sess->Preauth_HashValue = NULL;
+ } else if (conn->preferred_auth_mech == KSMBD_AUTH_NTLMSSP) {
+ if (negblob->MessageType == NtLmNegotiate) {
+- rc = ntlm_negotiate(work, negblob, negblob_len);
++ rc = ntlm_negotiate(work, negblob, negblob_len, rsp);
+ if (rc)
+ goto out_err;
+ rsp->hdr.Status =
+@@ -1813,7 +1816,7 @@ int smb2_sess_setup(struct ksmbd_work *work)
+ le16_to_cpu(rsp->SecurityBufferLength) - 1);
+
+ } else if (negblob->MessageType == NtLmAuthenticate) {
+- rc = ntlm_authenticate(work);
++ rc = ntlm_authenticate(work, req, rsp);
+ if (rc)
+ goto out_err;
+
+@@ -1911,14 +1914,16 @@ out_err:
+ int smb2_tree_connect(struct ksmbd_work *work)
+ {
+ struct ksmbd_conn *conn = work->conn;
+- struct smb2_tree_connect_req *req = smb2_get_msg(work->request_buf);
+- struct smb2_tree_connect_rsp *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_tree_connect_req *req;
++ struct smb2_tree_connect_rsp *rsp;
+ struct ksmbd_session *sess = work->sess;
+ char *treename = NULL, *name = NULL;
+ struct ksmbd_tree_conn_status status;
+ struct ksmbd_share_config *share;
+ int rc = -EINVAL;
+
++ WORK_BUFFERS(work, req, rsp);
++
+ treename = smb_strndup_from_utf16(req->Buffer,
+ le16_to_cpu(req->PathLength), true,
+ conn->local_nls);
+@@ -2087,19 +2092,19 @@ static int smb2_create_open_flags(bool file_present, __le32 access,
+ */
+ int smb2_tree_disconnect(struct ksmbd_work *work)
+ {
+- struct smb2_tree_disconnect_rsp *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_tree_disconnect_rsp *rsp;
++ struct smb2_tree_disconnect_req *req;
+ struct ksmbd_session *sess = work->sess;
+ struct ksmbd_tree_connect *tcon = work->tcon;
+
++ WORK_BUFFERS(work, req, rsp);
++
+ rsp->StructureSize = cpu_to_le16(4);
+ inc_rfc1001_len(work->response_buf, 4);
+
+ ksmbd_debug(SMB, "request\n");
+
+ if (!tcon || test_and_set_bit(TREE_CONN_EXPIRE, &tcon->status)) {
+- struct smb2_tree_disconnect_req *req =
+- smb2_get_msg(work->request_buf);
+-
+ ksmbd_debug(SMB, "Invalid tid %d\n", req->hdr.Id.SyncId.TreeId);
+
+ rsp->hdr.Status = STATUS_NETWORK_NAME_DELETED;
+@@ -2122,10 +2127,14 @@ int smb2_tree_disconnect(struct ksmbd_work *work)
+ int smb2_session_logoff(struct ksmbd_work *work)
+ {
+ struct ksmbd_conn *conn = work->conn;
+- struct smb2_logoff_rsp *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_logoff_req *req;
++ struct smb2_logoff_rsp *rsp;
+ struct ksmbd_session *sess;
+- struct smb2_logoff_req *req = smb2_get_msg(work->request_buf);
+- u64 sess_id = le64_to_cpu(req->hdr.SessionId);
++ u64 sess_id;
++
++ WORK_BUFFERS(work, req, rsp);
++
++ sess_id = le64_to_cpu(req->hdr.SessionId);
+
+ rsp->StructureSize = cpu_to_le16(4);
+ inc_rfc1001_len(work->response_buf, 4);
+@@ -2165,12 +2174,14 @@ int smb2_session_logoff(struct ksmbd_work *work)
+ */
+ static noinline int create_smb2_pipe(struct ksmbd_work *work)
+ {
+- struct smb2_create_rsp *rsp = smb2_get_msg(work->response_buf);
+- struct smb2_create_req *req = smb2_get_msg(work->request_buf);
++ struct smb2_create_rsp *rsp;
++ struct smb2_create_req *req;
+ int id;
+ int err;
+ char *name;
+
++ WORK_BUFFERS(work, req, rsp);
++
+ name = smb_strndup_from_utf16(req->Buffer, le16_to_cpu(req->NameLength),
+ 1, work->conn->local_nls);
+ if (IS_ERR(name)) {
+@@ -5305,8 +5316,10 @@ int smb2_query_info(struct ksmbd_work *work)
+ static noinline int smb2_close_pipe(struct ksmbd_work *work)
+ {
+ u64 id;
+- struct smb2_close_req *req = smb2_get_msg(work->request_buf);
+- struct smb2_close_rsp *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_close_req *req;
++ struct smb2_close_rsp *rsp;
++
++ WORK_BUFFERS(work, req, rsp);
+
+ id = req->VolatileFileId;
+ ksmbd_session_rpc_close(work->sess, id);
+@@ -5448,6 +5461,9 @@ int smb2_echo(struct ksmbd_work *work)
+ {
+ struct smb2_echo_rsp *rsp = smb2_get_msg(work->response_buf);
+
++ if (work->next_smb2_rcv_hdr_off)
++ rsp = ksmbd_resp_buf_next(work);
++
+ rsp->StructureSize = cpu_to_le16(4);
+ rsp->Reserved = 0;
+ inc_rfc1001_len(work->response_buf, 4);
+@@ -6082,8 +6098,10 @@ static noinline int smb2_read_pipe(struct ksmbd_work *work)
+ int nbytes = 0, err;
+ u64 id;
+ struct ksmbd_rpc_command *rpc_resp;
+- struct smb2_read_req *req = smb2_get_msg(work->request_buf);
+- struct smb2_read_rsp *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_read_req *req;
++ struct smb2_read_rsp *rsp;
++
++ WORK_BUFFERS(work, req, rsp);
+
+ id = req->VolatileFileId;
+
+@@ -6331,14 +6349,16 @@ out:
+ */
+ static noinline int smb2_write_pipe(struct ksmbd_work *work)
+ {
+- struct smb2_write_req *req = smb2_get_msg(work->request_buf);
+- struct smb2_write_rsp *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_write_req *req;
++ struct smb2_write_rsp *rsp;
+ struct ksmbd_rpc_command *rpc_resp;
+ u64 id = 0;
+ int err = 0, ret = 0;
+ char *data_buf;
+ size_t length;
+
++ WORK_BUFFERS(work, req, rsp);
++
+ length = le32_to_cpu(req->Length);
+ id = req->VolatileFileId;
+
+@@ -6607,6 +6627,9 @@ int smb2_cancel(struct ksmbd_work *work)
+ struct ksmbd_work *iter;
+ struct list_head *command_list;
+
++ if (work->next_smb2_rcv_hdr_off)
++ hdr = ksmbd_resp_buf_next(work);
++
+ ksmbd_debug(SMB, "smb2 cancel called on mid %llu, async flags 0x%x\n",
+ hdr->MessageId, hdr->Flags);
+
+@@ -6766,8 +6789,8 @@ static inline bool lock_defer_pending(struct file_lock *fl)
+ */
+ int smb2_lock(struct ksmbd_work *work)
+ {
+- struct smb2_lock_req *req = smb2_get_msg(work->request_buf);
+- struct smb2_lock_rsp *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_lock_req *req;
++ struct smb2_lock_rsp *rsp;
+ struct smb2_lock_element *lock_ele;
+ struct ksmbd_file *fp = NULL;
+ struct file_lock *flock = NULL;
+@@ -6784,6 +6807,8 @@ int smb2_lock(struct ksmbd_work *work)
+ LIST_HEAD(rollback_list);
+ int prior_lock = 0;
+
++ WORK_BUFFERS(work, req, rsp);
++
+ ksmbd_debug(SMB, "Received lock request\n");
+ fp = ksmbd_lookup_fd_slow(work, req->VolatileFileId, req->PersistentFileId);
+ if (!fp) {
+@@ -7897,8 +7922,8 @@ out:
+ */
+ static void smb20_oplock_break_ack(struct ksmbd_work *work)
+ {
+- struct smb2_oplock_break *req = smb2_get_msg(work->request_buf);
+- struct smb2_oplock_break *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_oplock_break *req;
++ struct smb2_oplock_break *rsp;
+ struct ksmbd_file *fp;
+ struct oplock_info *opinfo = NULL;
+ __le32 err = 0;
+@@ -7907,6 +7932,8 @@ static void smb20_oplock_break_ack(struct ksmbd_work *work)
+ char req_oplevel = 0, rsp_oplevel = 0;
+ unsigned int oplock_change_type;
+
++ WORK_BUFFERS(work, req, rsp);
++
+ volatile_id = req->VolatileFid;
+ persistent_id = req->PersistentFid;
+ req_oplevel = req->OplockLevel;
+@@ -8041,8 +8068,8 @@ static int check_lease_state(struct lease *lease, __le32 req_state)
+ static void smb21_lease_break_ack(struct ksmbd_work *work)
+ {
+ struct ksmbd_conn *conn = work->conn;
+- struct smb2_lease_ack *req = smb2_get_msg(work->request_buf);
+- struct smb2_lease_ack *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_lease_ack *req;
++ struct smb2_lease_ack *rsp;
+ struct oplock_info *opinfo;
+ __le32 err = 0;
+ int ret = 0;
+@@ -8050,6 +8077,8 @@ static void smb21_lease_break_ack(struct ksmbd_work *work)
+ __le32 lease_state;
+ struct lease *lease;
+
++ WORK_BUFFERS(work, req, rsp);
++
+ ksmbd_debug(OPLOCK, "smb21 lease break, lease state(0x%x)\n",
+ le32_to_cpu(req->LeaseState));
+ opinfo = lookup_lease_in_table(conn, req->LeaseKey);
+@@ -8175,8 +8204,10 @@ err_out:
+ */
+ int smb2_oplock_break(struct ksmbd_work *work)
+ {
+- struct smb2_oplock_break *req = smb2_get_msg(work->request_buf);
+- struct smb2_oplock_break *rsp = smb2_get_msg(work->response_buf);
++ struct smb2_oplock_break *req;
++ struct smb2_oplock_break *rsp;
++
++ WORK_BUFFERS(work, req, rsp);
+
+ switch (le16_to_cpu(req->StructureSize)) {
+ case OP_BREAK_STRUCT_SIZE_20:
+diff --git a/include/drm/display/drm_dp_mst_helper.h b/include/drm/display/drm_dp_mst_helper.h
+index 32c764fb9cb56..40e855c8407cf 100644
+--- a/include/drm/display/drm_dp_mst_helper.h
++++ b/include/drm/display/drm_dp_mst_helper.h
+@@ -815,8 +815,11 @@ void drm_dp_mst_topology_mgr_destroy(struct drm_dp_mst_topology_mgr *mgr);
+ bool drm_dp_read_mst_cap(struct drm_dp_aux *aux, const u8 dpcd[DP_RECEIVER_CAP_SIZE]);
+ int drm_dp_mst_topology_mgr_set_mst(struct drm_dp_mst_topology_mgr *mgr, bool mst_state);
+
+-int drm_dp_mst_hpd_irq(struct drm_dp_mst_topology_mgr *mgr, u8 *esi, bool *handled);
+-
++int drm_dp_mst_hpd_irq_handle_event(struct drm_dp_mst_topology_mgr *mgr,
++ const u8 *esi,
++ u8 *ack,
++ bool *handled);
++void drm_dp_mst_hpd_irq_send_new_request(struct drm_dp_mst_topology_mgr *mgr);
+
+ int
+ drm_dp_mst_detect_port(struct drm_connector *connector,
+diff --git a/include/linux/blk-crypto-profile.h b/include/linux/blk-crypto-profile.h
+index e6802b69cdd64..90ab33cb5d0ef 100644
+--- a/include/linux/blk-crypto-profile.h
++++ b/include/linux/blk-crypto-profile.h
+@@ -111,6 +111,7 @@ struct blk_crypto_profile {
+ * keyslots while ensuring that they can't be changed concurrently.
+ */
+ struct rw_semaphore lock;
++ struct lock_class_key lockdep_key;
+
+ /* List of idle slots, with least recently used slot at front */
+ wait_queue_head_t idle_slots_wait_queue;
+diff --git a/include/linux/ism.h b/include/linux/ism.h
+index ea2bcdae74012..9a4c204df3da1 100644
+--- a/include/linux/ism.h
++++ b/include/linux/ism.h
+@@ -44,9 +44,7 @@ struct ism_dev {
+ u64 local_gid;
+ int ieq_idx;
+
+- atomic_t free_clients_cnt;
+- atomic_t add_dev_cnt;
+- wait_queue_head_t waitq;
++ struct ism_client *subs[MAX_CLIENTS];
+ };
+
+ struct ism_event {
+@@ -68,9 +66,6 @@ struct ism_client {
+ */
+ void (*handle_irq)(struct ism_dev *dev, unsigned int bit, u16 dmbemask);
+ /* Private area - don't touch! */
+- struct work_struct remove_work;
+- struct work_struct add_work;
+- struct ism_dev *tgt_ism;
+ u8 id;
+ };
+
+diff --git a/include/linux/kasan.h b/include/linux/kasan.h
+index f7ef70661ce24..819b6bc8ac088 100644
+--- a/include/linux/kasan.h
++++ b/include/linux/kasan.h
+@@ -343,7 +343,7 @@ static inline void *kasan_reset_tag(const void *addr)
+ * @is_write: whether the bad access is a write or a read
+ * @ip: instruction pointer for the accessibility check or the bad access itself
+ */
+-bool kasan_report(unsigned long addr, size_t size,
++bool kasan_report(const void *addr, size_t size,
+ bool is_write, unsigned long ip);
+
+ #else /* CONFIG_KASAN_SW_TAGS || CONFIG_KASAN_HW_TAGS */
+diff --git a/include/linux/nvme.h b/include/linux/nvme.h
+index 779507ac750b8..2819d6c3a6b5d 100644
+--- a/include/linux/nvme.h
++++ b/include/linux/nvme.h
+@@ -473,7 +473,7 @@ struct nvme_id_ns_nvm {
+ };
+
+ enum {
+- NVME_ID_NS_NVM_STS_MASK = 0x3f,
++ NVME_ID_NS_NVM_STS_MASK = 0x7f,
+ NVME_ID_NS_NVM_GUARD_SHIFT = 7,
+ NVME_ID_NS_NVM_GUARD_MASK = 0x3,
+ };
+diff --git a/include/linux/rethook.h b/include/linux/rethook.h
+index c8ac1e5afcd1d..bdbe6717f45a2 100644
+--- a/include/linux/rethook.h
++++ b/include/linux/rethook.h
+@@ -59,6 +59,7 @@ struct rethook_node {
+ };
+
+ struct rethook *rethook_alloc(void *data, rethook_handler_t handler);
++void rethook_stop(struct rethook *rh);
+ void rethook_free(struct rethook *rh);
+ void rethook_add_node(struct rethook *rh, struct rethook_node *node);
+ struct rethook_node *rethook_try_get(struct rethook *rh);
+diff --git a/include/linux/serial_8250.h b/include/linux/serial_8250.h
+index 6f78f302d2722..09029cb33e743 100644
+--- a/include/linux/serial_8250.h
++++ b/include/linux/serial_8250.h
+@@ -98,7 +98,6 @@ struct uart_8250_port {
+ struct list_head list; /* ports on this IRQ */
+ u32 capabilities; /* port capabilities */
+ unsigned short bugs; /* port bugs */
+- bool fifo_bug; /* min RX trigger if enabled */
+ unsigned int tx_loadsz; /* transmit fifo load size */
+ unsigned char acr;
+ unsigned char fcr;
+diff --git a/include/net/netfilter/nf_conntrack_tuple.h b/include/net/netfilter/nf_conntrack_tuple.h
+index 9334371c94e2b..f7dd950ff2509 100644
+--- a/include/net/netfilter/nf_conntrack_tuple.h
++++ b/include/net/netfilter/nf_conntrack_tuple.h
+@@ -67,6 +67,9 @@ struct nf_conntrack_tuple {
+ /* The protocol. */
+ u_int8_t protonum;
+
++ /* The direction must be ignored for the tuplehash */
++ struct { } __nfct_hash_offsetend;
++
+ /* The direction (for tuplehash) */
+ u_int8_t dir;
+ } dst;
+diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
+index ee47d7143d99f..1b0beb8f08aee 100644
+--- a/include/net/netfilter/nf_tables.h
++++ b/include/net/netfilter/nf_tables.h
+@@ -1211,6 +1211,29 @@ int __nft_release_basechain(struct nft_ctx *ctx);
+
+ unsigned int nft_do_chain(struct nft_pktinfo *pkt, void *priv);
+
++static inline bool nft_use_inc(u32 *use)
++{
++ if (*use == UINT_MAX)
++ return false;
++
++ (*use)++;
++
++ return true;
++}
++
++static inline void nft_use_dec(u32 *use)
++{
++ WARN_ON_ONCE((*use)-- == 0);
++}
++
++/* For error and abort path: restore use counter to previous state. */
++static inline void nft_use_inc_restore(u32 *use)
++{
++ WARN_ON_ONCE(!nft_use_inc(use));
++}
++
++#define nft_use_dec_restore nft_use_dec
++
+ /**
+ * struct nft_table - nf_tables table
+ *
+@@ -1296,8 +1319,8 @@ struct nft_object {
+ struct list_head list;
+ struct rhlist_head rhlhead;
+ struct nft_object_hash_key key;
+- u32 genmask:2,
+- use:30;
++ u32 genmask:2;
++ u32 use;
+ u64 handle;
+ u16 udlen;
+ u8 *udata;
+@@ -1399,8 +1422,8 @@ struct nft_flowtable {
+ char *name;
+ int hooknum;
+ int ops_len;
+- u32 genmask:2,
+- use:30;
++ u32 genmask:2;
++ u32 use;
+ u64 handle;
+ /* runtime data below here */
+ struct list_head hook_list ____cacheline_aligned;
+diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
+index 5722931d83d43..2465d1e79d10e 100644
+--- a/include/net/pkt_sched.h
++++ b/include/net/pkt_sched.h
+@@ -134,7 +134,7 @@ extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];
+ */
+ static inline unsigned int psched_mtu(const struct net_device *dev)
+ {
+- return dev->mtu + dev->hard_header_len;
++ return READ_ONCE(dev->mtu) + dev->hard_header_len;
+ }
+
+ static inline struct net *qdisc_net(struct Qdisc *q)
+@@ -187,6 +187,11 @@ struct tc_taprio_caps {
+ bool broken_mqprio:1;
+ };
+
++enum tc_taprio_qopt_cmd {
++ TAPRIO_CMD_REPLACE,
++ TAPRIO_CMD_DESTROY,
++};
++
+ struct tc_taprio_sched_entry {
+ u8 command; /* TC_TAPRIO_CMD_* */
+
+@@ -198,7 +203,7 @@ struct tc_taprio_sched_entry {
+ struct tc_taprio_qopt_offload {
+ struct tc_mqprio_qopt_offload mqprio;
+ struct netlink_ext_ack *extack;
+- u8 enable;
++ enum tc_taprio_qopt_cmd cmd;
+ ktime_t base_time;
+ u64 cycle_time;
+ u64 cycle_time_extension;
+diff --git a/include/soc/mscc/ocelot.h b/include/soc/mscc/ocelot.h
+index 22aae505c813b..85a726fb006ca 100644
+--- a/include/soc/mscc/ocelot.h
++++ b/include/soc/mscc/ocelot.h
+@@ -663,6 +663,7 @@ struct ocelot_ops {
+ struct flow_stats *stats);
+ void (*cut_through_fwd)(struct ocelot *ocelot);
+ void (*tas_clock_adjust)(struct ocelot *ocelot);
++ void (*tas_guard_bands_update)(struct ocelot *ocelot, int port);
+ void (*update_stats)(struct ocelot *ocelot);
+ };
+
+diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
+index 8ec18faa74ac3..3da63be602d1c 100644
+--- a/kernel/bpf/cpumap.c
++++ b/kernel/bpf/cpumap.c
+@@ -126,22 +126,6 @@ static void get_cpu_map_entry(struct bpf_cpu_map_entry *rcpu)
+ atomic_inc(&rcpu->refcnt);
+ }
+
+-/* called from workqueue, to workaround syscall using preempt_disable */
+-static void cpu_map_kthread_stop(struct work_struct *work)
+-{
+- struct bpf_cpu_map_entry *rcpu;
+-
+- rcpu = container_of(work, struct bpf_cpu_map_entry, kthread_stop_wq);
+-
+- /* Wait for flush in __cpu_map_entry_free(), via full RCU barrier,
+- * as it waits until all in-flight call_rcu() callbacks complete.
+- */
+- rcu_barrier();
+-
+- /* kthread_stop will wake_up_process and wait for it to complete */
+- kthread_stop(rcpu->kthread);
+-}
+-
+ static void __cpu_map_ring_cleanup(struct ptr_ring *ring)
+ {
+ /* The tear-down procedure should have made sure that queue is
+@@ -169,6 +153,30 @@ static void put_cpu_map_entry(struct bpf_cpu_map_entry *rcpu)
+ }
+ }
+
++/* called from workqueue, to workaround syscall using preempt_disable */
++static void cpu_map_kthread_stop(struct work_struct *work)
++{
++ struct bpf_cpu_map_entry *rcpu;
++ int err;
++
++ rcpu = container_of(work, struct bpf_cpu_map_entry, kthread_stop_wq);
++
++ /* Wait for flush in __cpu_map_entry_free(), via full RCU barrier,
++ * as it waits until all in-flight call_rcu() callbacks complete.
++ */
++ rcu_barrier();
++
++ /* kthread_stop will wake_up_process and wait for it to complete */
++ err = kthread_stop(rcpu->kthread);
++ if (err) {
++ /* kthread_stop may be called before cpu_map_kthread_run
++ * is executed, so we need to release the memory related
++ * to rcpu.
++ */
++ put_cpu_map_entry(rcpu);
++ }
++}
++
+ static void cpu_map_bpf_prog_run_skb(struct bpf_cpu_map_entry *rcpu,
+ struct list_head *listp,
+ struct xdp_cpumap_stats *stats)
+diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
+index 30fabae47a07b..aac31e33323bb 100644
+--- a/kernel/bpf/verifier.c
++++ b/kernel/bpf/verifier.c
+@@ -5450,8 +5450,9 @@ continue_func:
+ verbose(env, "verifier bug. subprog has tail_call and async cb\n");
+ return -EFAULT;
+ }
+- /* async callbacks don't increase bpf prog stack size */
+- continue;
++ /* async callbacks don't increase bpf prog stack size unless called directly */
++ if (!bpf_pseudo_call(insn + i))
++ continue;
+ }
+ i = next_insn;
+
+diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
+index af2e304c672c4..b1bbd6270ba79 100644
+--- a/kernel/dma/swiotlb.c
++++ b/kernel/dma/swiotlb.c
+@@ -115,9 +115,16 @@ static bool round_up_default_nslabs(void)
+ return true;
+ }
+
++/**
++ * swiotlb_adjust_nareas() - adjust the number of areas and slots
++ * @nareas: Desired number of areas. Zero is treated as 1.
++ *
++ * Adjust the default number of areas in a memory pool.
++ * The default size of the memory pool may also change to meet minimum area
++ * size requirements.
++ */
+ static void swiotlb_adjust_nareas(unsigned int nareas)
+ {
+- /* use a single area when non is specified */
+ if (!nareas)
+ nareas = 1;
+ else if (!is_power_of_2(nareas))
+@@ -131,6 +138,23 @@ static void swiotlb_adjust_nareas(unsigned int nareas)
+ (default_nslabs << IO_TLB_SHIFT) >> 20);
+ }
+
++/**
++ * limit_nareas() - get the maximum number of areas for a given memory pool size
++ * @nareas: Desired number of areas.
++ * @nslots: Total number of slots in the memory pool.
++ *
++ * Limit the number of areas to the maximum possible number of areas in
++ * a memory pool of the given size.
++ *
++ * Return: Maximum possible number of areas.
++ */
++static unsigned int limit_nareas(unsigned int nareas, unsigned long nslots)
++{
++ if (nslots < nareas * IO_TLB_SEGSIZE)
++ return nslots / IO_TLB_SEGSIZE;
++ return nareas;
++}
++
+ static int __init
+ setup_io_tlb_npages(char *str)
+ {
+@@ -290,6 +314,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
+ {
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
+ unsigned long nslabs;
++ unsigned int nareas;
+ size_t alloc_size;
+ void *tlb;
+
+@@ -298,18 +323,16 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
+ if (swiotlb_force_disable)
+ return;
+
+- /*
+- * default_nslabs maybe changed when adjust area number.
+- * So allocate bounce buffer after adjusting area number.
+- */
+ if (!default_nareas)
+ swiotlb_adjust_nareas(num_possible_cpus());
+
+ nslabs = default_nslabs;
++ nareas = limit_nareas(default_nareas, nslabs);
+ while ((tlb = swiotlb_memblock_alloc(nslabs, flags, remap)) == NULL) {
+ if (nslabs <= IO_TLB_MIN_SLABS)
+ return;
+ nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
++ nareas = limit_nareas(nareas, nslabs);
+ }
+
+ if (default_nslabs != nslabs) {
+@@ -355,6 +378,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
+ {
+ struct io_tlb_mem *mem = &io_tlb_default_mem;
+ unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
++ unsigned int nareas;
+ unsigned char *vstart = NULL;
+ unsigned int order, area_order;
+ bool retried = false;
+@@ -363,6 +387,9 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
+ if (swiotlb_force_disable)
+ return 0;
+
++ if (!default_nareas)
++ swiotlb_adjust_nareas(num_possible_cpus());
++
+ retry:
+ order = get_order(nslabs << IO_TLB_SHIFT);
+ nslabs = SLABS_PER_PAGE << order;
+@@ -397,11 +424,8 @@ retry:
+ (PAGE_SIZE << order) >> 20);
+ }
+
+- if (!default_nareas)
+- swiotlb_adjust_nareas(num_possible_cpus());
+-
+- area_order = get_order(array_size(sizeof(*mem->areas),
+- default_nareas));
++ nareas = limit_nareas(default_nareas, nslabs);
++ area_order = get_order(array_size(sizeof(*mem->areas), nareas));
+ mem->areas = (struct io_tlb_area *)
+ __get_free_pages(GFP_KERNEL | __GFP_ZERO, area_order);
+ if (!mem->areas)
+@@ -415,7 +439,7 @@ retry:
+ set_memory_decrypted((unsigned long)vstart,
+ (nslabs << IO_TLB_SHIFT) >> PAGE_SHIFT);
+ swiotlb_init_io_tlb_mem(mem, virt_to_phys(vstart), nslabs, 0, true,
+- default_nareas);
++ nareas);
+
+ swiotlb_print_info();
+ return 0;
+diff --git a/kernel/power/qos.c b/kernel/power/qos.c
+index af51ed6d45ef1..782d3b41c1f35 100644
+--- a/kernel/power/qos.c
++++ b/kernel/power/qos.c
+@@ -426,6 +426,11 @@ late_initcall(cpu_latency_qos_init);
+
+ /* Definitions related to the frequency QoS below. */
+
++static inline bool freq_qos_value_invalid(s32 value)
++{
++ return value < 0 && value != PM_QOS_DEFAULT_VALUE;
++}
++
+ /**
+ * freq_constraints_init - Initialize frequency QoS constraints.
+ * @qos: Frequency QoS constraints to initialize.
+@@ -531,7 +536,7 @@ int freq_qos_add_request(struct freq_constraints *qos,
+ {
+ int ret;
+
+- if (IS_ERR_OR_NULL(qos) || !req || value < 0)
++ if (IS_ERR_OR_NULL(qos) || !req || freq_qos_value_invalid(value))
+ return -EINVAL;
+
+ if (WARN(freq_qos_request_active(req),
+@@ -563,7 +568,7 @@ EXPORT_SYMBOL_GPL(freq_qos_add_request);
+ */
+ int freq_qos_update_request(struct freq_qos_request *req, s32 new_value)
+ {
+- if (!req || new_value < 0)
++ if (!req || freq_qos_value_invalid(new_value))
+ return -EINVAL;
+
+ if (WARN(!freq_qos_request_active(req),
+diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
+index 18d36842faf57..2571f7f3d5f28 100644
+--- a/kernel/trace/fprobe.c
++++ b/kernel/trace/fprobe.c
+@@ -102,12 +102,14 @@ static void fprobe_kprobe_handler(unsigned long ip, unsigned long parent_ip,
+
+ if (unlikely(kprobe_running())) {
+ fp->nmissed++;
+- return;
++ goto recursion_unlock;
+ }
+
+ kprobe_busy_begin();
+ __fprobe_handler(ip, parent_ip, ops, fregs);
+ kprobe_busy_end();
++
++recursion_unlock:
+ ftrace_test_recursion_unlock(bit);
+ }
+
+@@ -364,19 +366,16 @@ int unregister_fprobe(struct fprobe *fp)
+ fp->ops.saved_func != fprobe_kprobe_handler))
+ return -EINVAL;
+
+- /*
+- * rethook_free() starts disabling the rethook, but the rethook handlers
+- * may be running on other processors at this point. To make sure that all
+- * current running handlers are finished, call unregister_ftrace_function()
+- * after this.
+- */
+ if (fp->rethook)
+- rethook_free(fp->rethook);
++ rethook_stop(fp->rethook);
+
+ ret = unregister_ftrace_function(&fp->ops);
+ if (ret < 0)
+ return ret;
+
++ if (fp->rethook)
++ rethook_free(fp->rethook);
++
+ ftrace_free_filter(&fp->ops);
+
+ return ret;
+diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
+index 7646684671558..416f5840b45bb 100644
+--- a/kernel/trace/ftrace.c
++++ b/kernel/trace/ftrace.c
+@@ -3305,6 +3305,22 @@ static int ftrace_allocate_records(struct ftrace_page *pg, int count)
+ return cnt;
+ }
+
++static void ftrace_free_pages(struct ftrace_page *pages)
++{
++ struct ftrace_page *pg = pages;
++
++ while (pg) {
++ if (pg->records) {
++ free_pages((unsigned long)pg->records, pg->order);
++ ftrace_number_of_pages -= 1 << pg->order;
++ }
++ pages = pg->next;
++ kfree(pg);
++ pg = pages;
++ ftrace_number_of_groups--;
++ }
++}
++
+ static struct ftrace_page *
+ ftrace_allocate_pages(unsigned long num_to_init)
+ {
+@@ -3343,17 +3359,7 @@ ftrace_allocate_pages(unsigned long num_to_init)
+ return start_pg;
+
+ free_pages:
+- pg = start_pg;
+- while (pg) {
+- if (pg->records) {
+- free_pages((unsigned long)pg->records, pg->order);
+- ftrace_number_of_pages -= 1 << pg->order;
+- }
+- start_pg = pg->next;
+- kfree(pg);
+- pg = start_pg;
+- ftrace_number_of_groups--;
+- }
++ ftrace_free_pages(start_pg);
+ pr_info("ftrace: FAILED to allocate memory for functions\n");
+ return NULL;
+ }
+@@ -6434,9 +6440,11 @@ static int ftrace_process_locs(struct module *mod,
+ unsigned long *start,
+ unsigned long *end)
+ {
++ struct ftrace_page *pg_unuse = NULL;
+ struct ftrace_page *start_pg;
+ struct ftrace_page *pg;
+ struct dyn_ftrace *rec;
++ unsigned long skipped = 0;
+ unsigned long count;
+ unsigned long *p;
+ unsigned long addr;
+@@ -6499,8 +6507,10 @@ static int ftrace_process_locs(struct module *mod,
+ * object files to satisfy alignments.
+ * Skip any NULL pointers.
+ */
+- if (!addr)
++ if (!addr) {
++ skipped++;
+ continue;
++ }
+
+ end_offset = (pg->index+1) * sizeof(pg->records[0]);
+ if (end_offset > PAGE_SIZE << pg->order) {
+@@ -6514,8 +6524,10 @@ static int ftrace_process_locs(struct module *mod,
+ rec->ip = addr;
+ }
+
+- /* We should have used all pages */
+- WARN_ON(pg->next);
++ if (pg->next) {
++ pg_unuse = pg->next;
++ pg->next = NULL;
++ }
+
+ /* Assign the last page to ftrace_pages */
+ ftrace_pages = pg;
+@@ -6537,6 +6549,11 @@ static int ftrace_process_locs(struct module *mod,
+ out:
+ mutex_unlock(&ftrace_lock);
+
++ /* We should have used all pages unless we skipped some */
++ if (pg_unuse) {
++ WARN_ON(!skipped);
++ ftrace_free_pages(pg_unuse);
++ }
+ return ret;
+ }
+
+diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
+index 60f6cb2b486bf..468006cce7cae 100644
+--- a/kernel/trace/rethook.c
++++ b/kernel/trace/rethook.c
+@@ -53,6 +53,19 @@ static void rethook_free_rcu(struct rcu_head *head)
+ kfree(rh);
+ }
+
++/**
++ * rethook_stop() - Stop using a rethook.
++ * @rh: the struct rethook to stop.
++ *
++ * Stop using a rethook to prepare for freeing it. If you want to wait for
++ * all running rethook handler before calling rethook_free(), you need to
++ * call this first and wait RCU, and call rethook_free().
++ */
++void rethook_stop(struct rethook *rh)
++{
++ WRITE_ONCE(rh->handler, NULL);
++}
++
+ /**
+ * rethook_free() - Free struct rethook.
+ * @rh: the struct rethook to be freed.
+diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
+index 834b361a4a66c..14d8001140c82 100644
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -5242,28 +5242,34 @@ unsigned long ring_buffer_size(struct trace_buffer *buffer, int cpu)
+ }
+ EXPORT_SYMBOL_GPL(ring_buffer_size);
+
++static void rb_clear_buffer_page(struct buffer_page *page)
++{
++ local_set(&page->write, 0);
++ local_set(&page->entries, 0);
++ rb_init_page(page->page);
++ page->read = 0;
++}
++
+ static void
+ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer)
+ {
++ struct buffer_page *page;
++
+ rb_head_page_deactivate(cpu_buffer);
+
+ cpu_buffer->head_page
+ = list_entry(cpu_buffer->pages, struct buffer_page, list);
+- local_set(&cpu_buffer->head_page->write, 0);
+- local_set(&cpu_buffer->head_page->entries, 0);
+- local_set(&cpu_buffer->head_page->page->commit, 0);
+-
+- cpu_buffer->head_page->read = 0;
++ rb_clear_buffer_page(cpu_buffer->head_page);
++ list_for_each_entry(page, cpu_buffer->pages, list) {
++ rb_clear_buffer_page(page);
++ }
+
+ cpu_buffer->tail_page = cpu_buffer->head_page;
+ cpu_buffer->commit_page = cpu_buffer->head_page;
+
+ INIT_LIST_HEAD(&cpu_buffer->reader_page->list);
+ INIT_LIST_HEAD(&cpu_buffer->new_pages);
+- local_set(&cpu_buffer->reader_page->write, 0);
+- local_set(&cpu_buffer->reader_page->entries, 0);
+- local_set(&cpu_buffer->reader_page->page->commit, 0);
+- cpu_buffer->reader_page->read = 0;
++ rb_clear_buffer_page(cpu_buffer->reader_page);
+
+ local_set(&cpu_buffer->entries_bytes, 0);
+ local_set(&cpu_buffer->overrun, 0);
+diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
+index 64a4dde073ef6..c80ff6f5b2cc1 100644
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -6753,6 +6753,7 @@ static int tracing_release_pipe(struct inode *inode, struct file *file)
+
+ free_cpumask_var(iter->started);
+ kfree(iter->fmt);
++ kfree(iter->temp);
+ mutex_destroy(&iter->mutex);
+ kfree(iter);
+
+@@ -8135,7 +8136,7 @@ static const struct file_operations tracing_err_log_fops = {
+ .open = tracing_err_log_open,
+ .write = tracing_err_log_write,
+ .read = seq_read,
+- .llseek = seq_lseek,
++ .llseek = tracing_lseek,
+ .release = tracing_err_log_release,
+ };
+
+diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
+index 79bdefe9261bf..eee1f3ca47494 100644
+--- a/kernel/trace/trace.h
++++ b/kernel/trace/trace.h
+@@ -113,6 +113,8 @@ enum trace_type {
+ #define MEM_FAIL(condition, fmt, ...) \
+ DO_ONCE_LITE_IF(condition, pr_err, "ERROR: " fmt, ##__VA_ARGS__)
+
++#define FAULT_STRING "(fault)"
++
+ #define HIST_STACKTRACE_DEPTH 16
+ #define HIST_STACKTRACE_SIZE (HIST_STACKTRACE_DEPTH * sizeof(unsigned long))
+ #define HIST_STACKTRACE_SKIP 5
+diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
+index 67e854979d53e..3f04f0ffe0d70 100644
+--- a/kernel/trace/trace_eprobe.c
++++ b/kernel/trace/trace_eprobe.c
+@@ -675,6 +675,7 @@ static int enable_trace_eprobe(struct trace_event_call *call,
+ struct trace_eprobe *ep;
+ bool enabled;
+ int ret = 0;
++ int cnt = 0;
+
+ tp = trace_probe_primary_from_call(call);
+ if (WARN_ON_ONCE(!tp))
+@@ -698,12 +699,25 @@ static int enable_trace_eprobe(struct trace_event_call *call,
+ if (ret)
+ break;
+ enabled = true;
++ cnt++;
+ }
+
+ if (ret) {
+ /* Failed to enable one of them. Roll back all */
+- if (enabled)
+- disable_eprobe(ep, file->tr);
++ if (enabled) {
++ /*
++ * It's a bug if one failed for something other than memory
++ * not being available but another eprobe succeeded.
++ */
++ WARN_ON_ONCE(ret != -ENOMEM);
++
++ list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
++ ep = container_of(pos, struct trace_eprobe, tp);
++ disable_eprobe(ep, file->tr);
++ if (!--cnt)
++ break;
++ }
++ }
+ if (file)
+ trace_probe_remove_file(tp, file);
+ else
+diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
+index b97d3ad832f1a..c8c61381eba48 100644
+--- a/kernel/trace/trace_events_hist.c
++++ b/kernel/trace/trace_events_hist.c
+@@ -6663,13 +6663,15 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
+ if (get_named_trigger_data(trigger_data))
+ goto enable;
+
+- if (has_hist_vars(hist_data))
+- save_hist_vars(hist_data);
+-
+ ret = create_actions(hist_data);
+ if (ret)
+ goto out_unreg;
+
++ if (has_hist_vars(hist_data) || hist_data->n_var_refs) {
++ if (save_hist_vars(hist_data))
++ goto out_unreg;
++ }
++
+ ret = tracing_map_init(hist_data->map);
+ if (ret)
+ goto out_unreg;
+diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c
+index 8df0550415e71..0f04f7904e76a 100644
+--- a/kernel/trace/trace_events_user.c
++++ b/kernel/trace/trace_events_user.c
+@@ -1317,6 +1317,9 @@ static int user_field_set_string(struct ftrace_event_field *field,
+ pos += snprintf(buf + pos, LEN_OR_ZERO, " ");
+ pos += snprintf(buf + pos, LEN_OR_ZERO, "%s", field->name);
+
++ if (str_has_prefix(field->type, "struct "))
++ pos += snprintf(buf + pos, LEN_OR_ZERO, " %d", field->size);
++
+ if (colon)
+ pos += snprintf(buf + pos, LEN_OR_ZERO, ";");
+
+@@ -2096,7 +2099,8 @@ static ssize_t user_events_write_core(struct file *file, struct iov_iter *i)
+
+ if (unlikely(faulted))
+ return -EFAULT;
+- }
++ } else
++ return -EBADF;
+
+ return ret;
+ }
+diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
+index 2d26166782950..591399ddcee5c 100644
+--- a/kernel/trace/trace_probe.c
++++ b/kernel/trace/trace_probe.c
+@@ -65,7 +65,7 @@ int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, void *data, void *ent)
+ int len = *(u32 *)data >> 16;
+
+ if (!len)
+- trace_seq_puts(s, "(fault)");
++ trace_seq_puts(s, FAULT_STRING);
+ else
+ trace_seq_printf(s, "\"%s\"",
+ (const char *)get_loc_data(data, ent));
+diff --git a/kernel/trace/trace_probe_kernel.h b/kernel/trace/trace_probe_kernel.h
+index c4e1d4c03a85f..bb723eefd7b71 100644
+--- a/kernel/trace/trace_probe_kernel.h
++++ b/kernel/trace/trace_probe_kernel.h
+@@ -2,8 +2,6 @@
+ #ifndef __TRACE_PROBE_KERNEL_H_
+ #define __TRACE_PROBE_KERNEL_H_
+
+-#define FAULT_STRING "(fault)"
+-
+ /*
+ * This depends on trace_probe.h, but can not include it due to
+ * the way trace_probe_tmpl.h is used by trace_kprobe.c and trace_eprobe.c.
+@@ -15,16 +13,8 @@ static nokprobe_inline int
+ fetch_store_strlen_user(unsigned long addr)
+ {
+ const void __user *uaddr = (__force const void __user *)addr;
+- int ret;
+
+- ret = strnlen_user_nofault(uaddr, MAX_STRING_SIZE);
+- /*
+- * strnlen_user_nofault returns zero on fault, insert the
+- * FAULT_STRING when that occurs.
+- */
+- if (ret <= 0)
+- return strlen(FAULT_STRING) + 1;
+- return ret;
++ return strnlen_user_nofault(uaddr, MAX_STRING_SIZE);
+ }
+
+ /* Return the length of string -- including null terminal byte */
+@@ -44,18 +34,14 @@ fetch_store_strlen(unsigned long addr)
+ len++;
+ } while (c && ret == 0 && len < MAX_STRING_SIZE);
+
+- /* For faults, return enough to hold the FAULT_STRING */
+- return (ret < 0) ? strlen(FAULT_STRING) + 1 : len;
++ return (ret < 0) ? ret : len;
+ }
+
+-static nokprobe_inline void set_data_loc(int ret, void *dest, void *__dest, void *base, int len)
++static nokprobe_inline void set_data_loc(int ret, void *dest, void *__dest, void *base)
+ {
+- if (ret >= 0) {
+- *(u32 *)dest = make_data_loc(ret, __dest - base);
+- } else {
+- strscpy(__dest, FAULT_STRING, len);
+- ret = strlen(__dest) + 1;
+- }
++ if (ret < 0)
++ ret = 0;
++ *(u32 *)dest = make_data_loc(ret, __dest - base);
+ }
+
+ /*
+@@ -76,7 +62,7 @@ fetch_store_string_user(unsigned long addr, void *dest, void *base)
+ __dest = get_loc_data(dest, base);
+
+ ret = strncpy_from_user_nofault(__dest, uaddr, maxlen);
+- set_data_loc(ret, dest, __dest, base, maxlen);
++ set_data_loc(ret, dest, __dest, base);
+
+ return ret;
+ }
+@@ -107,7 +93,7 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
+ * probing.
+ */
+ ret = strncpy_from_kernel_nofault(__dest, (void *)addr, maxlen);
+- set_data_loc(ret, dest, __dest, base, maxlen);
++ set_data_loc(ret, dest, __dest, base);
+
+ return ret;
+ }
+diff --git a/kernel/trace/trace_probe_tmpl.h b/kernel/trace/trace_probe_tmpl.h
+index 00707630788d6..3935b347f874b 100644
+--- a/kernel/trace/trace_probe_tmpl.h
++++ b/kernel/trace/trace_probe_tmpl.h
+@@ -156,11 +156,11 @@ stage3:
+ code++;
+ goto array;
+ case FETCH_OP_ST_USTRING:
+- ret += fetch_store_strlen_user(val + code->offset);
++ ret = fetch_store_strlen_user(val + code->offset);
+ code++;
+ goto array;
+ case FETCH_OP_ST_SYMSTR:
+- ret += fetch_store_symstrlen(val + code->offset);
++ ret = fetch_store_symstrlen(val + code->offset);
+ code++;
+ goto array;
+ default:
+@@ -204,6 +204,8 @@ stage3:
+ array:
+ /* the last stage: Loop on array */
+ if (code->op == FETCH_OP_LP_ARRAY) {
++ if (ret < 0)
++ ret = 0;
+ total += ret;
+ if (++i < code->param) {
+ code = s3;
+@@ -265,9 +267,7 @@ store_trace_args(void *data, struct trace_probe *tp, void *rec,
+ if (unlikely(arg->dynamic))
+ *dl = make_data_loc(maxlen, dyndata - base);
+ ret = process_fetch_insn(arg->code, rec, dl, base);
+- if (unlikely(ret < 0 && arg->dynamic)) {
+- *dl = make_data_loc(0, dyndata - base);
+- } else {
++ if (arg->dynamic && likely(ret > 0)) {
+ dyndata += ret;
+ maxlen -= ret;
+ }
+diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
+index 8b92e34ff0c83..7b47e9a2c0102 100644
+--- a/kernel/trace/trace_uprobe.c
++++ b/kernel/trace/trace_uprobe.c
+@@ -170,7 +170,8 @@ fetch_store_string(unsigned long addr, void *dest, void *base)
+ */
+ ret++;
+ *(u32 *)dest = make_data_loc(ret, (void *)dst - base);
+- }
++ } else
++ *(u32 *)dest = make_data_loc(0, (void *)dst - base);
+
+ return ret;
+ }
+diff --git a/mm/kasan/common.c b/mm/kasan/common.c
+index b376a5d055e55..256930da578a0 100644
+--- a/mm/kasan/common.c
++++ b/mm/kasan/common.c
+@@ -445,7 +445,7 @@ void * __must_check __kasan_krealloc(const void *object, size_t size, gfp_t flag
+ bool __kasan_check_byte(const void *address, unsigned long ip)
+ {
+ if (!kasan_byte_accessible(address)) {
+- kasan_report((unsigned long)address, 1, false, ip);
++ kasan_report(address, 1, false, ip);
+ return false;
+ }
+ return true;
+diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
+index e5eef670735ef..f9cb5af9894c6 100644
+--- a/mm/kasan/generic.c
++++ b/mm/kasan/generic.c
+@@ -40,39 +40,39 @@
+ * depending on memory access size X.
+ */
+
+-static __always_inline bool memory_is_poisoned_1(unsigned long addr)
++static __always_inline bool memory_is_poisoned_1(const void *addr)
+ {
+- s8 shadow_value = *(s8 *)kasan_mem_to_shadow((void *)addr);
++ s8 shadow_value = *(s8 *)kasan_mem_to_shadow(addr);
+
+ if (unlikely(shadow_value)) {
+- s8 last_accessible_byte = addr & KASAN_GRANULE_MASK;
++ s8 last_accessible_byte = (unsigned long)addr & KASAN_GRANULE_MASK;
+ return unlikely(last_accessible_byte >= shadow_value);
+ }
+
+ return false;
+ }
+
+-static __always_inline bool memory_is_poisoned_2_4_8(unsigned long addr,
++static __always_inline bool memory_is_poisoned_2_4_8(const void *addr,
+ unsigned long size)
+ {
+- u8 *shadow_addr = (u8 *)kasan_mem_to_shadow((void *)addr);
++ u8 *shadow_addr = (u8 *)kasan_mem_to_shadow(addr);
+
+ /*
+ * Access crosses 8(shadow size)-byte boundary. Such access maps
+ * into 2 shadow bytes, so we need to check them both.
+ */
+- if (unlikely(((addr + size - 1) & KASAN_GRANULE_MASK) < size - 1))
++ if (unlikely((((unsigned long)addr + size - 1) & KASAN_GRANULE_MASK) < size - 1))
+ return *shadow_addr || memory_is_poisoned_1(addr + size - 1);
+
+ return memory_is_poisoned_1(addr + size - 1);
+ }
+
+-static __always_inline bool memory_is_poisoned_16(unsigned long addr)
++static __always_inline bool memory_is_poisoned_16(const void *addr)
+ {
+- u16 *shadow_addr = (u16 *)kasan_mem_to_shadow((void *)addr);
++ u16 *shadow_addr = (u16 *)kasan_mem_to_shadow(addr);
+
+ /* Unaligned 16-bytes access maps into 3 shadow bytes. */
+- if (unlikely(!IS_ALIGNED(addr, KASAN_GRANULE_SIZE)))
++ if (unlikely(!IS_ALIGNED((unsigned long)addr, KASAN_GRANULE_SIZE)))
+ return *shadow_addr || memory_is_poisoned_1(addr + 15);
+
+ return *shadow_addr;
+@@ -120,26 +120,26 @@ static __always_inline unsigned long memory_is_nonzero(const void *start,
+ return bytes_is_nonzero(start, (end - start) % 8);
+ }
+
+-static __always_inline bool memory_is_poisoned_n(unsigned long addr,
+- size_t size)
++static __always_inline bool memory_is_poisoned_n(const void *addr, size_t size)
+ {
+ unsigned long ret;
+
+- ret = memory_is_nonzero(kasan_mem_to_shadow((void *)addr),
+- kasan_mem_to_shadow((void *)addr + size - 1) + 1);
++ ret = memory_is_nonzero(kasan_mem_to_shadow(addr),
++ kasan_mem_to_shadow(addr + size - 1) + 1);
+
+ if (unlikely(ret)) {
+- unsigned long last_byte = addr + size - 1;
+- s8 *last_shadow = (s8 *)kasan_mem_to_shadow((void *)last_byte);
++ const void *last_byte = addr + size - 1;
++ s8 *last_shadow = (s8 *)kasan_mem_to_shadow(last_byte);
++ s8 last_accessible_byte = (unsigned long)last_byte & KASAN_GRANULE_MASK;
+
+ if (unlikely(ret != (unsigned long)last_shadow ||
+- ((long)(last_byte & KASAN_GRANULE_MASK) >= *last_shadow)))
++ last_accessible_byte >= *last_shadow))
+ return true;
+ }
+ return false;
+ }
+
+-static __always_inline bool memory_is_poisoned(unsigned long addr, size_t size)
++static __always_inline bool memory_is_poisoned(const void *addr, size_t size)
+ {
+ if (__builtin_constant_p(size)) {
+ switch (size) {
+@@ -159,7 +159,7 @@ static __always_inline bool memory_is_poisoned(unsigned long addr, size_t size)
+ return memory_is_poisoned_n(addr, size);
+ }
+
+-static __always_inline bool check_region_inline(unsigned long addr,
++static __always_inline bool check_region_inline(const void *addr,
+ size_t size, bool write,
+ unsigned long ret_ip)
+ {
+@@ -172,7 +172,7 @@ static __always_inline bool check_region_inline(unsigned long addr,
+ if (unlikely(addr + size < addr))
+ return !kasan_report(addr, size, write, ret_ip);
+
+- if (unlikely(!addr_has_metadata((void *)addr)))
++ if (unlikely(!addr_has_metadata(addr)))
+ return !kasan_report(addr, size, write, ret_ip);
+
+ if (likely(!memory_is_poisoned(addr, size)))
+@@ -181,7 +181,7 @@ static __always_inline bool check_region_inline(unsigned long addr,
+ return !kasan_report(addr, size, write, ret_ip);
+ }
+
+-bool kasan_check_range(unsigned long addr, size_t size, bool write,
++bool kasan_check_range(const void *addr, size_t size, bool write,
+ unsigned long ret_ip)
+ {
+ return check_region_inline(addr, size, write, ret_ip);
+@@ -221,36 +221,37 @@ static void register_global(struct kasan_global *global)
+ KASAN_GLOBAL_REDZONE, false);
+ }
+
+-void __asan_register_globals(struct kasan_global *globals, size_t size)
++void __asan_register_globals(void *ptr, ssize_t size)
+ {
+ int i;
++ struct kasan_global *globals = ptr;
+
+ for (i = 0; i < size; i++)
+ register_global(&globals[i]);
+ }
+ EXPORT_SYMBOL(__asan_register_globals);
+
+-void __asan_unregister_globals(struct kasan_global *globals, size_t size)
++void __asan_unregister_globals(void *ptr, ssize_t size)
+ {
+ }
+ EXPORT_SYMBOL(__asan_unregister_globals);
+
+ #define DEFINE_ASAN_LOAD_STORE(size) \
+- void __asan_load##size(unsigned long addr) \
++ void __asan_load##size(void *addr) \
+ { \
+ check_region_inline(addr, size, false, _RET_IP_); \
+ } \
+ EXPORT_SYMBOL(__asan_load##size); \
+ __alias(__asan_load##size) \
+- void __asan_load##size##_noabort(unsigned long); \
++ void __asan_load##size##_noabort(void *); \
+ EXPORT_SYMBOL(__asan_load##size##_noabort); \
+- void __asan_store##size(unsigned long addr) \
++ void __asan_store##size(void *addr) \
+ { \
+ check_region_inline(addr, size, true, _RET_IP_); \
+ } \
+ EXPORT_SYMBOL(__asan_store##size); \
+ __alias(__asan_store##size) \
+- void __asan_store##size##_noabort(unsigned long); \
++ void __asan_store##size##_noabort(void *); \
+ EXPORT_SYMBOL(__asan_store##size##_noabort)
+
+ DEFINE_ASAN_LOAD_STORE(1);
+@@ -259,24 +260,24 @@ DEFINE_ASAN_LOAD_STORE(4);
+ DEFINE_ASAN_LOAD_STORE(8);
+ DEFINE_ASAN_LOAD_STORE(16);
+
+-void __asan_loadN(unsigned long addr, size_t size)
++void __asan_loadN(void *addr, ssize_t size)
+ {
+ kasan_check_range(addr, size, false, _RET_IP_);
+ }
+ EXPORT_SYMBOL(__asan_loadN);
+
+ __alias(__asan_loadN)
+-void __asan_loadN_noabort(unsigned long, size_t);
++void __asan_loadN_noabort(void *, ssize_t);
+ EXPORT_SYMBOL(__asan_loadN_noabort);
+
+-void __asan_storeN(unsigned long addr, size_t size)
++void __asan_storeN(void *addr, ssize_t size)
+ {
+ kasan_check_range(addr, size, true, _RET_IP_);
+ }
+ EXPORT_SYMBOL(__asan_storeN);
+
+ __alias(__asan_storeN)
+-void __asan_storeN_noabort(unsigned long, size_t);
++void __asan_storeN_noabort(void *, ssize_t);
+ EXPORT_SYMBOL(__asan_storeN_noabort);
+
+ /* to shut up compiler complaints */
+@@ -284,7 +285,7 @@ void __asan_handle_no_return(void) {}
+ EXPORT_SYMBOL(__asan_handle_no_return);
+
+ /* Emitted by compiler to poison alloca()ed objects. */
+-void __asan_alloca_poison(unsigned long addr, size_t size)
++void __asan_alloca_poison(void *addr, ssize_t size)
+ {
+ size_t rounded_up_size = round_up(size, KASAN_GRANULE_SIZE);
+ size_t padding_size = round_up(size, KASAN_ALLOCA_REDZONE_SIZE) -
+@@ -295,7 +296,7 @@ void __asan_alloca_poison(unsigned long addr, size_t size)
+ KASAN_ALLOCA_REDZONE_SIZE);
+ const void *right_redzone = (const void *)(addr + rounded_up_size);
+
+- WARN_ON(!IS_ALIGNED(addr, KASAN_ALLOCA_REDZONE_SIZE));
++ WARN_ON(!IS_ALIGNED((unsigned long)addr, KASAN_ALLOCA_REDZONE_SIZE));
+
+ kasan_unpoison((const void *)(addr + rounded_down_size),
+ size - rounded_down_size, false);
+@@ -307,18 +308,18 @@ void __asan_alloca_poison(unsigned long addr, size_t size)
+ EXPORT_SYMBOL(__asan_alloca_poison);
+
+ /* Emitted by compiler to unpoison alloca()ed areas when the stack unwinds. */
+-void __asan_allocas_unpoison(const void *stack_top, const void *stack_bottom)
++void __asan_allocas_unpoison(void *stack_top, ssize_t stack_bottom)
+ {
+- if (unlikely(!stack_top || stack_top > stack_bottom))
++ if (unlikely(!stack_top || stack_top > (void *)stack_bottom))
+ return;
+
+- kasan_unpoison(stack_top, stack_bottom - stack_top, false);
++ kasan_unpoison(stack_top, (void *)stack_bottom - stack_top, false);
+ }
+ EXPORT_SYMBOL(__asan_allocas_unpoison);
+
+ /* Emitted by the compiler to [un]poison local variables. */
+ #define DEFINE_ASAN_SET_SHADOW(byte) \
+- void __asan_set_shadow_##byte(const void *addr, size_t size) \
++ void __asan_set_shadow_##byte(const void *addr, ssize_t size) \
+ { \
+ __memset((void *)addr, 0x##byte, size); \
+ } \
+diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
+index f5e4f5f2ba203..2e973b36fe072 100644
+--- a/mm/kasan/kasan.h
++++ b/mm/kasan/kasan.h
+@@ -198,13 +198,13 @@ enum kasan_report_type {
+ struct kasan_report_info {
+ /* Filled in by kasan_report_*(). */
+ enum kasan_report_type type;
+- void *access_addr;
++ const void *access_addr;
+ size_t access_size;
+ bool is_write;
+ unsigned long ip;
+
+ /* Filled in by the common reporting code. */
+- void *first_bad_addr;
++ const void *first_bad_addr;
+ struct kmem_cache *cache;
+ void *object;
+ size_t alloc_size;
+@@ -311,7 +311,7 @@ static __always_inline bool addr_has_metadata(const void *addr)
+ * @ret_ip: return address
+ * @return: true if access was valid, false if invalid
+ */
+-bool kasan_check_range(unsigned long addr, size_t size, bool write,
++bool kasan_check_range(const void *addr, size_t size, bool write,
+ unsigned long ret_ip);
+
+ #else /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
+@@ -323,7 +323,7 @@ static __always_inline bool addr_has_metadata(const void *addr)
+
+ #endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
+
+-void *kasan_find_first_bad_addr(void *addr, size_t size);
++const void *kasan_find_first_bad_addr(const void *addr, size_t size);
+ size_t kasan_get_alloc_size(void *object, struct kmem_cache *cache);
+ void kasan_complete_mode_report_info(struct kasan_report_info *info);
+ void kasan_metadata_fetch_row(char *buffer, void *row);
+@@ -346,7 +346,7 @@ void kasan_print_aux_stacks(struct kmem_cache *cache, const void *object);
+ static inline void kasan_print_aux_stacks(struct kmem_cache *cache, const void *object) { }
+ #endif
+
+-bool kasan_report(unsigned long addr, size_t size,
++bool kasan_report(const void *addr, size_t size,
+ bool is_write, unsigned long ip);
+ void kasan_report_invalid_free(void *object, unsigned long ip, enum kasan_report_type type);
+
+@@ -466,18 +466,6 @@ static inline void kasan_unpoison(const void *addr, size_t size, bool init)
+
+ if (WARN_ON((unsigned long)addr & KASAN_GRANULE_MASK))
+ return;
+- /*
+- * Explicitly initialize the memory with the precise object size to
+- * avoid overwriting the slab redzone. This disables initialization in
+- * the arch code and may thus lead to performance penalty. This penalty
+- * does not affect production builds, as slab redzones are not enabled
+- * there.
+- */
+- if (__slub_debug_enabled() &&
+- init && ((unsigned long)size & KASAN_GRANULE_MASK)) {
+- init = false;
+- memzero_explicit((void *)addr, size);
+- }
+ size = round_up(size, KASAN_GRANULE_SIZE);
+
+ hw_set_mem_tag_range((void *)addr, size, tag, init);
+@@ -571,79 +559,82 @@ void kasan_restore_multi_shot(bool enabled);
+ */
+
+ asmlinkage void kasan_unpoison_task_stack_below(const void *watermark);
+-void __asan_register_globals(struct kasan_global *globals, size_t size);
+-void __asan_unregister_globals(struct kasan_global *globals, size_t size);
++void __asan_register_globals(void *globals, ssize_t size);
++void __asan_unregister_globals(void *globals, ssize_t size);
+ void __asan_handle_no_return(void);
+-void __asan_alloca_poison(unsigned long addr, size_t size);
+-void __asan_allocas_unpoison(const void *stack_top, const void *stack_bottom);
+-
+-void __asan_load1(unsigned long addr);
+-void __asan_store1(unsigned long addr);
+-void __asan_load2(unsigned long addr);
+-void __asan_store2(unsigned long addr);
+-void __asan_load4(unsigned long addr);
+-void __asan_store4(unsigned long addr);
+-void __asan_load8(unsigned long addr);
+-void __asan_store8(unsigned long addr);
+-void __asan_load16(unsigned long addr);
+-void __asan_store16(unsigned long addr);
+-void __asan_loadN(unsigned long addr, size_t size);
+-void __asan_storeN(unsigned long addr, size_t size);
+-
+-void __asan_load1_noabort(unsigned long addr);
+-void __asan_store1_noabort(unsigned long addr);
+-void __asan_load2_noabort(unsigned long addr);
+-void __asan_store2_noabort(unsigned long addr);
+-void __asan_load4_noabort(unsigned long addr);
+-void __asan_store4_noabort(unsigned long addr);
+-void __asan_load8_noabort(unsigned long addr);
+-void __asan_store8_noabort(unsigned long addr);
+-void __asan_load16_noabort(unsigned long addr);
+-void __asan_store16_noabort(unsigned long addr);
+-void __asan_loadN_noabort(unsigned long addr, size_t size);
+-void __asan_storeN_noabort(unsigned long addr, size_t size);
+-
+-void __asan_report_load1_noabort(unsigned long addr);
+-void __asan_report_store1_noabort(unsigned long addr);
+-void __asan_report_load2_noabort(unsigned long addr);
+-void __asan_report_store2_noabort(unsigned long addr);
+-void __asan_report_load4_noabort(unsigned long addr);
+-void __asan_report_store4_noabort(unsigned long addr);
+-void __asan_report_load8_noabort(unsigned long addr);
+-void __asan_report_store8_noabort(unsigned long addr);
+-void __asan_report_load16_noabort(unsigned long addr);
+-void __asan_report_store16_noabort(unsigned long addr);
+-void __asan_report_load_n_noabort(unsigned long addr, size_t size);
+-void __asan_report_store_n_noabort(unsigned long addr, size_t size);
+-
+-void __asan_set_shadow_00(const void *addr, size_t size);
+-void __asan_set_shadow_f1(const void *addr, size_t size);
+-void __asan_set_shadow_f2(const void *addr, size_t size);
+-void __asan_set_shadow_f3(const void *addr, size_t size);
+-void __asan_set_shadow_f5(const void *addr, size_t size);
+-void __asan_set_shadow_f8(const void *addr, size_t size);
+-
+-void *__asan_memset(void *addr, int c, size_t len);
+-void *__asan_memmove(void *dest, const void *src, size_t len);
+-void *__asan_memcpy(void *dest, const void *src, size_t len);
+-
+-void __hwasan_load1_noabort(unsigned long addr);
+-void __hwasan_store1_noabort(unsigned long addr);
+-void __hwasan_load2_noabort(unsigned long addr);
+-void __hwasan_store2_noabort(unsigned long addr);
+-void __hwasan_load4_noabort(unsigned long addr);
+-void __hwasan_store4_noabort(unsigned long addr);
+-void __hwasan_load8_noabort(unsigned long addr);
+-void __hwasan_store8_noabort(unsigned long addr);
+-void __hwasan_load16_noabort(unsigned long addr);
+-void __hwasan_store16_noabort(unsigned long addr);
+-void __hwasan_loadN_noabort(unsigned long addr, size_t size);
+-void __hwasan_storeN_noabort(unsigned long addr, size_t size);
+-
+-void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size);
+-
+-void *__hwasan_memset(void *addr, int c, size_t len);
+-void *__hwasan_memmove(void *dest, const void *src, size_t len);
+-void *__hwasan_memcpy(void *dest, const void *src, size_t len);
++void __asan_alloca_poison(void *, ssize_t size);
++void __asan_allocas_unpoison(void *stack_top, ssize_t stack_bottom);
++
++void __asan_load1(void *);
++void __asan_store1(void *);
++void __asan_load2(void *);
++void __asan_store2(void *);
++void __asan_load4(void *);
++void __asan_store4(void *);
++void __asan_load8(void *);
++void __asan_store8(void *);
++void __asan_load16(void *);
++void __asan_store16(void *);
++void __asan_loadN(void *, ssize_t size);
++void __asan_storeN(void *, ssize_t size);
++
++void __asan_load1_noabort(void *);
++void __asan_store1_noabort(void *);
++void __asan_load2_noabort(void *);
++void __asan_store2_noabort(void *);
++void __asan_load4_noabort(void *);
++void __asan_store4_noabort(void *);
++void __asan_load8_noabort(void *);
++void __asan_store8_noabort(void *);
++void __asan_load16_noabort(void *);
++void __asan_store16_noabort(void *);
++void __asan_loadN_noabort(void *, ssize_t size);
++void __asan_storeN_noabort(void *, ssize_t size);
++
++void __asan_report_load1_noabort(void *);
++void __asan_report_store1_noabort(void *);
++void __asan_report_load2_noabort(void *);
++void __asan_report_store2_noabort(void *);
++void __asan_report_load4_noabort(void *);
++void __asan_report_store4_noabort(void *);
++void __asan_report_load8_noabort(void *);
++void __asan_report_store8_noabort(void *);
++void __asan_report_load16_noabort(void *);
++void __asan_report_store16_noabort(void *);
++void __asan_report_load_n_noabort(void *, ssize_t size);
++void __asan_report_store_n_noabort(void *, ssize_t size);
++
++void __asan_set_shadow_00(const void *addr, ssize_t size);
++void __asan_set_shadow_f1(const void *addr, ssize_t size);
++void __asan_set_shadow_f2(const void *addr, ssize_t size);
++void __asan_set_shadow_f3(const void *addr, ssize_t size);
++void __asan_set_shadow_f5(const void *addr, ssize_t size);
++void __asan_set_shadow_f8(const void *addr, ssize_t size);
++
++void *__asan_memset(void *addr, int c, ssize_t len);
++void *__asan_memmove(void *dest, const void *src, ssize_t len);
++void *__asan_memcpy(void *dest, const void *src, ssize_t len);
++
++void __hwasan_load1_noabort(void *);
++void __hwasan_store1_noabort(void *);
++void __hwasan_load2_noabort(void *);
++void __hwasan_store2_noabort(void *);
++void __hwasan_load4_noabort(void *);
++void __hwasan_store4_noabort(void *);
++void __hwasan_load8_noabort(void *);
++void __hwasan_store8_noabort(void *);
++void __hwasan_load16_noabort(void *);
++void __hwasan_store16_noabort(void *);
++void __hwasan_loadN_noabort(void *, ssize_t size);
++void __hwasan_storeN_noabort(void *, ssize_t size);
++
++void __hwasan_tag_memory(void *, u8 tag, ssize_t size);
++
++void *__hwasan_memset(void *addr, int c, ssize_t len);
++void *__hwasan_memmove(void *dest, const void *src, ssize_t len);
++void *__hwasan_memcpy(void *dest, const void *src, ssize_t len);
++
++void kasan_tag_mismatch(void *addr, unsigned long access_info,
++ unsigned long ret_ip);
+
+ #endif /* __MM_KASAN_KASAN_H */
+diff --git a/mm/kasan/report.c b/mm/kasan/report.c
+index 892a9dc9d4d31..84d9f3b370149 100644
+--- a/mm/kasan/report.c
++++ b/mm/kasan/report.c
+@@ -211,7 +211,7 @@ static void start_report(unsigned long *flags, bool sync)
+ pr_err("==================================================================\n");
+ }
+
+-static void end_report(unsigned long *flags, void *addr)
++static void end_report(unsigned long *flags, const void *addr)
+ {
+ if (addr)
+ trace_error_report_end(ERROR_DETECTOR_KASAN,
+@@ -450,8 +450,8 @@ static void print_memory_metadata(const void *addr)
+
+ static void print_report(struct kasan_report_info *info)
+ {
+- void *addr = kasan_reset_tag(info->access_addr);
+- u8 tag = get_tag(info->access_addr);
++ void *addr = kasan_reset_tag((void *)info->access_addr);
++ u8 tag = get_tag((void *)info->access_addr);
+
+ print_error_description(info);
+ if (addr_has_metadata(addr))
+@@ -468,12 +468,12 @@ static void print_report(struct kasan_report_info *info)
+
+ static void complete_report_info(struct kasan_report_info *info)
+ {
+- void *addr = kasan_reset_tag(info->access_addr);
++ void *addr = kasan_reset_tag((void *)info->access_addr);
+ struct slab *slab;
+
+ if (info->type == KASAN_REPORT_ACCESS)
+ info->first_bad_addr = kasan_find_first_bad_addr(
+- info->access_addr, info->access_size);
++ (void *)info->access_addr, info->access_size);
+ else
+ info->first_bad_addr = addr;
+
+@@ -544,11 +544,10 @@ void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty
+ * user_access_save/restore(): kasan_report_invalid_free() cannot be called
+ * from a UACCESS region, and kasan_report_async() is not used on x86.
+ */
+-bool kasan_report(unsigned long addr, size_t size, bool is_write,
++bool kasan_report(const void *addr, size_t size, bool is_write,
+ unsigned long ip)
+ {
+ bool ret = true;
+- void *ptr = (void *)addr;
+ unsigned long ua_flags = user_access_save();
+ unsigned long irq_flags;
+ struct kasan_report_info info;
+@@ -562,7 +561,7 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write,
+
+ memset(&info, 0, sizeof(info));
+ info.type = KASAN_REPORT_ACCESS;
+- info.access_addr = ptr;
++ info.access_addr = addr;
+ info.access_size = size;
+ info.is_write = is_write;
+ info.ip = ip;
+@@ -571,7 +570,7 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write,
+
+ print_report(&info);
+
+- end_report(&irq_flags, ptr);
++ end_report(&irq_flags, (void *)addr);
+
+ out:
+ user_access_restore(ua_flags);
+diff --git a/mm/kasan/report_generic.c b/mm/kasan/report_generic.c
+index 87d39bc0a6735..51a1e8a8877f7 100644
+--- a/mm/kasan/report_generic.c
++++ b/mm/kasan/report_generic.c
+@@ -30,9 +30,9 @@
+ #include "kasan.h"
+ #include "../slab.h"
+
+-void *kasan_find_first_bad_addr(void *addr, size_t size)
++const void *kasan_find_first_bad_addr(const void *addr, size_t size)
+ {
+- void *p = addr;
++ const void *p = addr;
+
+ if (!addr_has_metadata(p))
+ return p;
+@@ -362,14 +362,14 @@ void kasan_print_address_stack_frame(const void *addr)
+ #endif /* CONFIG_KASAN_STACK */
+
+ #define DEFINE_ASAN_REPORT_LOAD(size) \
+-void __asan_report_load##size##_noabort(unsigned long addr) \
++void __asan_report_load##size##_noabort(void *addr) \
+ { \
+ kasan_report(addr, size, false, _RET_IP_); \
+ } \
+ EXPORT_SYMBOL(__asan_report_load##size##_noabort)
+
+ #define DEFINE_ASAN_REPORT_STORE(size) \
+-void __asan_report_store##size##_noabort(unsigned long addr) \
++void __asan_report_store##size##_noabort(void *addr) \
+ { \
+ kasan_report(addr, size, true, _RET_IP_); \
+ } \
+@@ -386,13 +386,13 @@ DEFINE_ASAN_REPORT_STORE(4);
+ DEFINE_ASAN_REPORT_STORE(8);
+ DEFINE_ASAN_REPORT_STORE(16);
+
+-void __asan_report_load_n_noabort(unsigned long addr, size_t size)
++void __asan_report_load_n_noabort(void *addr, ssize_t size)
+ {
+ kasan_report(addr, size, false, _RET_IP_);
+ }
+ EXPORT_SYMBOL(__asan_report_load_n_noabort);
+
+-void __asan_report_store_n_noabort(unsigned long addr, size_t size)
++void __asan_report_store_n_noabort(void *addr, ssize_t size)
+ {
+ kasan_report(addr, size, true, _RET_IP_);
+ }
+diff --git a/mm/kasan/report_hw_tags.c b/mm/kasan/report_hw_tags.c
+index 32e80f78de7d0..065e1b2fc484c 100644
+--- a/mm/kasan/report_hw_tags.c
++++ b/mm/kasan/report_hw_tags.c
+@@ -15,7 +15,7 @@
+
+ #include "kasan.h"
+
+-void *kasan_find_first_bad_addr(void *addr, size_t size)
++const void *kasan_find_first_bad_addr(const void *addr, size_t size)
+ {
+ /*
+ * Hardware Tag-Based KASAN only calls this function for normal memory
+diff --git a/mm/kasan/report_sw_tags.c b/mm/kasan/report_sw_tags.c
+index 8b1f5a73ee6d3..689e94f9fe3cf 100644
+--- a/mm/kasan/report_sw_tags.c
++++ b/mm/kasan/report_sw_tags.c
+@@ -30,7 +30,7 @@
+ #include "kasan.h"
+ #include "../slab.h"
+
+-void *kasan_find_first_bad_addr(void *addr, size_t size)
++const void *kasan_find_first_bad_addr(const void *addr, size_t size)
+ {
+ u8 tag = get_tag(addr);
+ void *p = kasan_reset_tag(addr);
+diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
+index c8b86f3273b50..3e62728ae25d3 100644
+--- a/mm/kasan/shadow.c
++++ b/mm/kasan/shadow.c
+@@ -28,13 +28,13 @@
+
+ bool __kasan_check_read(const volatile void *p, unsigned int size)
+ {
+- return kasan_check_range((unsigned long)p, size, false, _RET_IP_);
++ return kasan_check_range((void *)p, size, false, _RET_IP_);
+ }
+ EXPORT_SYMBOL(__kasan_check_read);
+
+ bool __kasan_check_write(const volatile void *p, unsigned int size)
+ {
+- return kasan_check_range((unsigned long)p, size, true, _RET_IP_);
++ return kasan_check_range((void *)p, size, true, _RET_IP_);
+ }
+ EXPORT_SYMBOL(__kasan_check_write);
+
+@@ -50,7 +50,7 @@ EXPORT_SYMBOL(__kasan_check_write);
+ #undef memset
+ void *memset(void *addr, int c, size_t len)
+ {
+- if (!kasan_check_range((unsigned long)addr, len, true, _RET_IP_))
++ if (!kasan_check_range(addr, len, true, _RET_IP_))
+ return NULL;
+
+ return __memset(addr, c, len);
+@@ -60,8 +60,8 @@ void *memset(void *addr, int c, size_t len)
+ #undef memmove
+ void *memmove(void *dest, const void *src, size_t len)
+ {
+- if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) ||
+- !kasan_check_range((unsigned long)dest, len, true, _RET_IP_))
++ if (!kasan_check_range(src, len, false, _RET_IP_) ||
++ !kasan_check_range(dest, len, true, _RET_IP_))
+ return NULL;
+
+ return __memmove(dest, src, len);
+@@ -71,17 +71,17 @@ void *memmove(void *dest, const void *src, size_t len)
+ #undef memcpy
+ void *memcpy(void *dest, const void *src, size_t len)
+ {
+- if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) ||
+- !kasan_check_range((unsigned long)dest, len, true, _RET_IP_))
++ if (!kasan_check_range(src, len, false, _RET_IP_) ||
++ !kasan_check_range(dest, len, true, _RET_IP_))
+ return NULL;
+
+ return __memcpy(dest, src, len);
+ }
+ #endif
+
+-void *__asan_memset(void *addr, int c, size_t len)
++void *__asan_memset(void *addr, int c, ssize_t len)
+ {
+- if (!kasan_check_range((unsigned long)addr, len, true, _RET_IP_))
++ if (!kasan_check_range(addr, len, true, _RET_IP_))
+ return NULL;
+
+ return __memset(addr, c, len);
+@@ -89,10 +89,10 @@ void *__asan_memset(void *addr, int c, size_t len)
+ EXPORT_SYMBOL(__asan_memset);
+
+ #ifdef __HAVE_ARCH_MEMMOVE
+-void *__asan_memmove(void *dest, const void *src, size_t len)
++void *__asan_memmove(void *dest, const void *src, ssize_t len)
+ {
+- if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) ||
+- !kasan_check_range((unsigned long)dest, len, true, _RET_IP_))
++ if (!kasan_check_range(src, len, false, _RET_IP_) ||
++ !kasan_check_range(dest, len, true, _RET_IP_))
+ return NULL;
+
+ return __memmove(dest, src, len);
+@@ -100,10 +100,10 @@ void *__asan_memmove(void *dest, const void *src, size_t len)
+ EXPORT_SYMBOL(__asan_memmove);
+ #endif
+
+-void *__asan_memcpy(void *dest, const void *src, size_t len)
++void *__asan_memcpy(void *dest, const void *src, ssize_t len)
+ {
+- if (!kasan_check_range((unsigned long)src, len, false, _RET_IP_) ||
+- !kasan_check_range((unsigned long)dest, len, true, _RET_IP_))
++ if (!kasan_check_range(src, len, false, _RET_IP_) ||
++ !kasan_check_range(dest, len, true, _RET_IP_))
+ return NULL;
+
+ return __memcpy(dest, src, len);
+@@ -111,13 +111,13 @@ void *__asan_memcpy(void *dest, const void *src, size_t len)
+ EXPORT_SYMBOL(__asan_memcpy);
+
+ #ifdef CONFIG_KASAN_SW_TAGS
+-void *__hwasan_memset(void *addr, int c, size_t len) __alias(__asan_memset);
++void *__hwasan_memset(void *addr, int c, ssize_t len) __alias(__asan_memset);
+ EXPORT_SYMBOL(__hwasan_memset);
+ #ifdef __HAVE_ARCH_MEMMOVE
+-void *__hwasan_memmove(void *dest, const void *src, size_t len) __alias(__asan_memmove);
++void *__hwasan_memmove(void *dest, const void *src, ssize_t len) __alias(__asan_memmove);
+ EXPORT_SYMBOL(__hwasan_memmove);
+ #endif
+-void *__hwasan_memcpy(void *dest, const void *src, size_t len) __alias(__asan_memcpy);
++void *__hwasan_memcpy(void *dest, const void *src, ssize_t len) __alias(__asan_memcpy);
+ EXPORT_SYMBOL(__hwasan_memcpy);
+ #endif
+
+diff --git a/mm/kasan/sw_tags.c b/mm/kasan/sw_tags.c
+index 30da65fa02a1e..220b5d4c6876f 100644
+--- a/mm/kasan/sw_tags.c
++++ b/mm/kasan/sw_tags.c
+@@ -70,8 +70,8 @@ u8 kasan_random_tag(void)
+ return (u8)(state % (KASAN_TAG_MAX + 1));
+ }
+
+-bool kasan_check_range(unsigned long addr, size_t size, bool write,
+- unsigned long ret_ip)
++bool kasan_check_range(const void *addr, size_t size, bool write,
++ unsigned long ret_ip)
+ {
+ u8 tag;
+ u8 *shadow_first, *shadow_last, *shadow;
+@@ -133,12 +133,12 @@ bool kasan_byte_accessible(const void *addr)
+ }
+
+ #define DEFINE_HWASAN_LOAD_STORE(size) \
+- void __hwasan_load##size##_noabort(unsigned long addr) \
++ void __hwasan_load##size##_noabort(void *addr) \
+ { \
+- kasan_check_range(addr, size, false, _RET_IP_); \
++ kasan_check_range(addr, size, false, _RET_IP_); \
+ } \
+ EXPORT_SYMBOL(__hwasan_load##size##_noabort); \
+- void __hwasan_store##size##_noabort(unsigned long addr) \
++ void __hwasan_store##size##_noabort(void *addr) \
+ { \
+ kasan_check_range(addr, size, true, _RET_IP_); \
+ } \
+@@ -150,25 +150,25 @@ DEFINE_HWASAN_LOAD_STORE(4);
+ DEFINE_HWASAN_LOAD_STORE(8);
+ DEFINE_HWASAN_LOAD_STORE(16);
+
+-void __hwasan_loadN_noabort(unsigned long addr, unsigned long size)
++void __hwasan_loadN_noabort(void *addr, ssize_t size)
+ {
+ kasan_check_range(addr, size, false, _RET_IP_);
+ }
+ EXPORT_SYMBOL(__hwasan_loadN_noabort);
+
+-void __hwasan_storeN_noabort(unsigned long addr, unsigned long size)
++void __hwasan_storeN_noabort(void *addr, ssize_t size)
+ {
+ kasan_check_range(addr, size, true, _RET_IP_);
+ }
+ EXPORT_SYMBOL(__hwasan_storeN_noabort);
+
+-void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size)
++void __hwasan_tag_memory(void *addr, u8 tag, ssize_t size)
+ {
+- kasan_poison((void *)addr, size, tag, false);
++ kasan_poison(addr, size, tag, false);
+ }
+ EXPORT_SYMBOL(__hwasan_tag_memory);
+
+-void kasan_tag_mismatch(unsigned long addr, unsigned long access_info,
++void kasan_tag_mismatch(void *addr, unsigned long access_info,
+ unsigned long ret_ip)
+ {
+ kasan_report(addr, 1 << (access_info & 0xf), access_info & 0x10,
+diff --git a/mm/mmap.c b/mm/mmap.c
+index 30bf7772d4ac1..5c5a917b261e7 100644
+--- a/mm/mmap.c
++++ b/mm/mmap.c
+@@ -2480,7 +2480,8 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
+ }
+ vma_start_write(next);
+ mas_set_range(&mas_detach, next->vm_start, next->vm_end - 1);
+- if (mas_store_gfp(&mas_detach, next, GFP_KERNEL))
++ error = mas_store_gfp(&mas_detach, next, GFP_KERNEL);
++ if (error)
+ goto munmap_gather_failed;
+ vma_mark_detached(next, true);
+ if (next->vm_flags & VM_LOCKED)
+@@ -2529,12 +2530,12 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
+ BUG_ON(count != test_count);
+ }
+ #endif
+- /* Point of no return */
+- error = -ENOMEM;
+ vma_iter_set(vmi, start);
+- if (vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL))
++ error = vma_iter_clear_gfp(vmi, start, end, GFP_KERNEL);
++ if (error)
+ goto clear_tree_failed;
+
++ /* Point of no return */
+ mm->locked_vm -= locked_vm;
+ mm->map_count -= count;
+ /*
+diff --git a/mm/slab.h b/mm/slab.h
+index f01ac256a8f55..24ecd4f50b574 100644
+--- a/mm/slab.h
++++ b/mm/slab.h
+@@ -684,6 +684,7 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s,
+ unsigned int orig_size)
+ {
+ unsigned int zero_size = s->object_size;
++ bool kasan_init = init;
+ size_t i;
+
+ flags &= gfp_allowed_mask;
+@@ -700,6 +701,17 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s,
+ (s->flags & SLAB_KMALLOC))
+ zero_size = orig_size;
+
++ /*
++ * When slub_debug is enabled, avoid memory initialization integrated
++ * into KASAN and instead zero out the memory via the memset below with
++ * the proper size. Otherwise, KASAN might overwrite SLUB redzones and
++ * cause false-positive reports. This does not lead to a performance
++ * penalty on production builds, as slub_debug is not intended to be
++ * enabled there.
++ */
++ if (__slub_debug_enabled())
++ kasan_init = false;
++
+ /*
+ * As memory initialization might be integrated into KASAN,
+ * kasan_slab_alloc and initialization memset must be
+@@ -708,8 +720,8 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s,
+ * As p[i] might get tagged, memset and kmemleak hook come after KASAN.
+ */
+ for (i = 0; i < size; i++) {
+- p[i] = kasan_slab_alloc(s, p[i], flags, init);
+- if (p[i] && init && !kasan_has_integrated_init())
++ p[i] = kasan_slab_alloc(s, p[i], flags, kasan_init);
++ if (p[i] && init && (!kasan_init || !kasan_has_integrated_init()))
+ memset(p[i], 0, zero_size);
+ kmemleak_alloc_recursive(p[i], s->object_size, 1,
+ s->flags, flags);
+diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
+index 301a991dc6a68..66b79a7b5f78e 100644
+--- a/net/ceph/messenger_v2.c
++++ b/net/ceph/messenger_v2.c
+@@ -391,6 +391,8 @@ static int head_onwire_len(int ctrl_len, bool secure)
+ int head_len;
+ int rem_len;
+
++ BUG_ON(ctrl_len < 0 || ctrl_len > CEPH_MSG_MAX_CONTROL_LEN);
++
+ if (secure) {
+ head_len = CEPH_PREAMBLE_SECURE_LEN;
+ if (ctrl_len > CEPH_PREAMBLE_INLINE_LEN) {
+@@ -409,6 +411,10 @@ static int head_onwire_len(int ctrl_len, bool secure)
+ static int __tail_onwire_len(int front_len, int middle_len, int data_len,
+ bool secure)
+ {
++ BUG_ON(front_len < 0 || front_len > CEPH_MSG_MAX_FRONT_LEN ||
++ middle_len < 0 || middle_len > CEPH_MSG_MAX_MIDDLE_LEN ||
++ data_len < 0 || data_len > CEPH_MSG_MAX_DATA_LEN);
++
+ if (!front_len && !middle_len && !data_len)
+ return 0;
+
+@@ -521,29 +527,34 @@ static int decode_preamble(void *p, struct ceph_frame_desc *desc)
+ desc->fd_aligns[i] = ceph_decode_16(&p);
+ }
+
+- /*
+- * This would fire for FRAME_TAG_WAIT (it has one empty
+- * segment), but we should never get it as client.
+- */
+- if (!desc->fd_lens[desc->fd_seg_cnt - 1]) {
+- pr_err("last segment empty\n");
++ if (desc->fd_lens[0] < 0 ||
++ desc->fd_lens[0] > CEPH_MSG_MAX_CONTROL_LEN) {
++ pr_err("bad control segment length %d\n", desc->fd_lens[0]);
+ return -EINVAL;
+ }
+-
+- if (desc->fd_lens[0] > CEPH_MSG_MAX_CONTROL_LEN) {
+- pr_err("control segment too big %d\n", desc->fd_lens[0]);
++ if (desc->fd_lens[1] < 0 ||
++ desc->fd_lens[1] > CEPH_MSG_MAX_FRONT_LEN) {
++ pr_err("bad front segment length %d\n", desc->fd_lens[1]);
+ return -EINVAL;
+ }
+- if (desc->fd_lens[1] > CEPH_MSG_MAX_FRONT_LEN) {
+- pr_err("front segment too big %d\n", desc->fd_lens[1]);
++ if (desc->fd_lens[2] < 0 ||
++ desc->fd_lens[2] > CEPH_MSG_MAX_MIDDLE_LEN) {
++ pr_err("bad middle segment length %d\n", desc->fd_lens[2]);
+ return -EINVAL;
+ }
+- if (desc->fd_lens[2] > CEPH_MSG_MAX_MIDDLE_LEN) {
+- pr_err("middle segment too big %d\n", desc->fd_lens[2]);
++ if (desc->fd_lens[3] < 0 ||
++ desc->fd_lens[3] > CEPH_MSG_MAX_DATA_LEN) {
++ pr_err("bad data segment length %d\n", desc->fd_lens[3]);
+ return -EINVAL;
+ }
+- if (desc->fd_lens[3] > CEPH_MSG_MAX_DATA_LEN) {
+- pr_err("data segment too big %d\n", desc->fd_lens[3]);
++
++ /*
++ * This would fire for FRAME_TAG_WAIT (it has one empty
++ * segment), but we should never get it as client.
++ */
++ if (!desc->fd_lens[desc->fd_seg_cnt - 1]) {
++ pr_err("last segment empty, segment count %d\n",
++ desc->fd_seg_cnt);
+ return -EINVAL;
+ }
+
+diff --git a/net/core/net-traces.c b/net/core/net-traces.c
+index 805b7385dd8da..6aef976bc1da2 100644
+--- a/net/core/net-traces.c
++++ b/net/core/net-traces.c
+@@ -63,4 +63,6 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(napi_poll);
+ EXPORT_TRACEPOINT_SYMBOL_GPL(tcp_send_reset);
+ EXPORT_TRACEPOINT_SYMBOL_GPL(tcp_bad_csum);
+
++EXPORT_TRACEPOINT_SYMBOL_GPL(udp_fail_queue_rcv_skb);
++
+ EXPORT_TRACEPOINT_SYMBOL_GPL(sk_data_ready);
+diff --git a/net/core/skbuff.c b/net/core/skbuff.c
+index cea28d30abb55..1b6a1d99869dc 100644
+--- a/net/core/skbuff.c
++++ b/net/core/skbuff.c
+@@ -4270,6 +4270,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
+
+ skb_push(skb, -skb_network_offset(skb) + offset);
+
++ /* Ensure the head is writeable before touching the shared info */
++ err = skb_unclone(skb, GFP_ATOMIC);
++ if (err)
++ goto err_linearize;
++
+ skb_shinfo(skb)->frag_list = NULL;
+
+ while (list_skb) {
+diff --git a/net/core/xdp.c b/net/core/xdp.c
+index 41e5ca8643ec9..8362130bf085d 100644
+--- a/net/core/xdp.c
++++ b/net/core/xdp.c
+@@ -741,7 +741,7 @@ __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash,
+ __diag_pop();
+
+ BTF_SET8_START(xdp_metadata_kfunc_ids)
+-#define XDP_METADATA_KFUNC(_, name) BTF_ID_FLAGS(func, name, 0)
++#define XDP_METADATA_KFUNC(_, name) BTF_ID_FLAGS(func, name, KF_TRUSTED_ARGS)
+ XDP_METADATA_KFUNC_xxx
+ #undef XDP_METADATA_KFUNC
+ BTF_SET8_END(xdp_metadata_kfunc_ids)
+diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
+index 3797917237d03..5affca8e2f53a 100644
+--- a/net/ipv6/addrconf.c
++++ b/net/ipv6/addrconf.c
+@@ -318,9 +318,8 @@ static void addrconf_del_dad_work(struct inet6_ifaddr *ifp)
+ static void addrconf_mod_rs_timer(struct inet6_dev *idev,
+ unsigned long when)
+ {
+- if (!timer_pending(&idev->rs_timer))
++ if (!mod_timer(&idev->rs_timer, jiffies + when))
+ in6_dev_hold(idev);
+- mod_timer(&idev->rs_timer, jiffies + when);
+ }
+
+ static void addrconf_mod_dad_work(struct inet6_ifaddr *ifp,
+diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
+index 9edf1f45b1ed6..65fa5014bc85e 100644
+--- a/net/ipv6/icmp.c
++++ b/net/ipv6/icmp.c
+@@ -424,7 +424,10 @@ static struct net_device *icmp6_dev(const struct sk_buff *skb)
+ if (unlikely(dev->ifindex == LOOPBACK_IFINDEX || netif_is_l3_master(skb->dev))) {
+ const struct rt6_info *rt6 = skb_rt6_info(skb);
+
+- if (rt6)
++ /* The destination could be an external IP in Ext Hdr (SRv6, RPL, etc.),
++ * and ip6_null_entry could be set to skb if no route is found.
++ */
++ if (rt6 && rt6->rt6i_idev)
+ dev = rt6->rt6i_idev->dev;
+ }
+
+diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
+index e5a337e6b9705..d594a0425749b 100644
+--- a/net/ipv6/udp.c
++++ b/net/ipv6/udp.c
+@@ -45,6 +45,7 @@
+ #include <net/tcp_states.h>
+ #include <net/ip6_checksum.h>
+ #include <net/ip6_tunnel.h>
++#include <trace/events/udp.h>
+ #include <net/xfrm.h>
+ #include <net/inet_hashtables.h>
+ #include <net/inet6_hashtables.h>
+@@ -90,7 +91,7 @@ static u32 udp6_ehashfn(const struct net *net,
+ fhash = __ipv6_addr_jhash(faddr, udp_ipv6_hash_secret);
+
+ return __inet6_ehashfn(lhash, lport, fhash, fport,
+- udp_ipv6_hash_secret + net_hash_mix(net));
++ udp6_ehash_secret + net_hash_mix(net));
+ }
+
+ int udp_v6_get_port(struct sock *sk, unsigned short snum)
+@@ -680,6 +681,7 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
+ }
+ UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
+ kfree_skb_reason(skb, drop_reason);
++ trace_udp_fail_queue_rcv_skb(rc, sk);
+ return -1;
+ }
+
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index a6c7f2d249093..b069826869d05 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -2908,10 +2908,10 @@ static void mptcp_check_listen_stop(struct sock *sk)
+ return;
+
+ lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
++ tcp_set_state(ssk, TCP_CLOSE);
+ mptcp_subflow_queue_clean(sk, ssk);
+ inet_csk_listen_stop(ssk);
+ mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CLOSED);
+- tcp_set_state(ssk, TCP_CLOSE);
+ release_sock(ssk);
+ }
+
+@@ -3697,6 +3697,11 @@ static int mptcp_listen(struct socket *sock, int backlog)
+ pr_debug("msk=%p", msk);
+
+ lock_sock(sk);
++
++ err = -EINVAL;
++ if (sock->state != SS_UNCONNECTED || sock->type != SOCK_STREAM)
++ goto unlock;
++
+ ssock = __mptcp_nmpc_socket(msk);
+ if (IS_ERR(ssock)) {
+ err = PTR_ERR(ssock);
+diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
+index 6447a09932f55..069c2659074bc 100644
+--- a/net/ncsi/ncsi-rsp.c
++++ b/net/ncsi/ncsi-rsp.c
+@@ -611,14 +611,14 @@ static int ncsi_rsp_handler_snfc(struct ncsi_request *nr)
+ return 0;
+ }
+
+-/* Response handler for Mellanox command Get Mac Address */
+-static int ncsi_rsp_handler_oem_mlx_gma(struct ncsi_request *nr)
++/* Response handler for Get Mac Address command */
++static int ncsi_rsp_handler_oem_gma(struct ncsi_request *nr, int mfr_id)
+ {
+ struct ncsi_dev_priv *ndp = nr->ndp;
+ struct net_device *ndev = ndp->ndev.dev;
+- const struct net_device_ops *ops = ndev->netdev_ops;
+ struct ncsi_rsp_oem_pkt *rsp;
+ struct sockaddr saddr;
++ u32 mac_addr_off = 0;
+ int ret = 0;
+
+ /* Get the response header */
+@@ -626,11 +626,25 @@ static int ncsi_rsp_handler_oem_mlx_gma(struct ncsi_request *nr)
+
+ saddr.sa_family = ndev->type;
+ ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+- memcpy(saddr.sa_data, &rsp->data[MLX_MAC_ADDR_OFFSET], ETH_ALEN);
++ if (mfr_id == NCSI_OEM_MFR_BCM_ID)
++ mac_addr_off = BCM_MAC_ADDR_OFFSET;
++ else if (mfr_id == NCSI_OEM_MFR_MLX_ID)
++ mac_addr_off = MLX_MAC_ADDR_OFFSET;
++ else if (mfr_id == NCSI_OEM_MFR_INTEL_ID)
++ mac_addr_off = INTEL_MAC_ADDR_OFFSET;
++
++ memcpy(saddr.sa_data, &rsp->data[mac_addr_off], ETH_ALEN);
++ if (mfr_id == NCSI_OEM_MFR_BCM_ID || mfr_id == NCSI_OEM_MFR_INTEL_ID)
++ eth_addr_inc((u8 *)saddr.sa_data);
++ if (!is_valid_ether_addr((const u8 *)saddr.sa_data))
++ return -ENXIO;
++
+ /* Set the flag for GMA command which should only be called once */
+ ndp->gma_flag = 1;
+
+- ret = ops->ndo_set_mac_address(ndev, &saddr);
++ rtnl_lock();
++ ret = dev_set_mac_address(ndev, &saddr, NULL);
++ rtnl_unlock();
+ if (ret < 0)
+ netdev_warn(ndev, "NCSI: 'Writing mac address to device failed\n");
+
+@@ -649,41 +663,10 @@ static int ncsi_rsp_handler_oem_mlx(struct ncsi_request *nr)
+
+ if (mlx->cmd == NCSI_OEM_MLX_CMD_GMA &&
+ mlx->param == NCSI_OEM_MLX_CMD_GMA_PARAM)
+- return ncsi_rsp_handler_oem_mlx_gma(nr);
++ return ncsi_rsp_handler_oem_gma(nr, NCSI_OEM_MFR_MLX_ID);
+ return 0;
+ }
+
+-/* Response handler for Broadcom command Get Mac Address */
+-static int ncsi_rsp_handler_oem_bcm_gma(struct ncsi_request *nr)
+-{
+- struct ncsi_dev_priv *ndp = nr->ndp;
+- struct net_device *ndev = ndp->ndev.dev;
+- const struct net_device_ops *ops = ndev->netdev_ops;
+- struct ncsi_rsp_oem_pkt *rsp;
+- struct sockaddr saddr;
+- int ret = 0;
+-
+- /* Get the response header */
+- rsp = (struct ncsi_rsp_oem_pkt *)skb_network_header(nr->rsp);
+-
+- saddr.sa_family = ndev->type;
+- ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+- memcpy(saddr.sa_data, &rsp->data[BCM_MAC_ADDR_OFFSET], ETH_ALEN);
+- /* Increase mac address by 1 for BMC's address */
+- eth_addr_inc((u8 *)saddr.sa_data);
+- if (!is_valid_ether_addr((const u8 *)saddr.sa_data))
+- return -ENXIO;
+-
+- /* Set the flag for GMA command which should only be called once */
+- ndp->gma_flag = 1;
+-
+- ret = ops->ndo_set_mac_address(ndev, &saddr);
+- if (ret < 0)
+- netdev_warn(ndev, "NCSI: 'Writing mac address to device failed\n");
+-
+- return ret;
+-}
+-
+ /* Response handler for Broadcom card */
+ static int ncsi_rsp_handler_oem_bcm(struct ncsi_request *nr)
+ {
+@@ -695,42 +678,10 @@ static int ncsi_rsp_handler_oem_bcm(struct ncsi_request *nr)
+ bcm = (struct ncsi_rsp_oem_bcm_pkt *)(rsp->data);
+
+ if (bcm->type == NCSI_OEM_BCM_CMD_GMA)
+- return ncsi_rsp_handler_oem_bcm_gma(nr);
++ return ncsi_rsp_handler_oem_gma(nr, NCSI_OEM_MFR_BCM_ID);
+ return 0;
+ }
+
+-/* Response handler for Intel command Get Mac Address */
+-static int ncsi_rsp_handler_oem_intel_gma(struct ncsi_request *nr)
+-{
+- struct ncsi_dev_priv *ndp = nr->ndp;
+- struct net_device *ndev = ndp->ndev.dev;
+- const struct net_device_ops *ops = ndev->netdev_ops;
+- struct ncsi_rsp_oem_pkt *rsp;
+- struct sockaddr saddr;
+- int ret = 0;
+-
+- /* Get the response header */
+- rsp = (struct ncsi_rsp_oem_pkt *)skb_network_header(nr->rsp);
+-
+- saddr.sa_family = ndev->type;
+- ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+- memcpy(saddr.sa_data, &rsp->data[INTEL_MAC_ADDR_OFFSET], ETH_ALEN);
+- /* Increase mac address by 1 for BMC's address */
+- eth_addr_inc((u8 *)saddr.sa_data);
+- if (!is_valid_ether_addr((const u8 *)saddr.sa_data))
+- return -ENXIO;
+-
+- /* Set the flag for GMA command which should only be called once */
+- ndp->gma_flag = 1;
+-
+- ret = ops->ndo_set_mac_address(ndev, &saddr);
+- if (ret < 0)
+- netdev_warn(ndev,
+- "NCSI: 'Writing mac address to device failed\n");
+-
+- return ret;
+-}
+-
+ /* Response handler for Intel card */
+ static int ncsi_rsp_handler_oem_intel(struct ncsi_request *nr)
+ {
+@@ -742,7 +693,7 @@ static int ncsi_rsp_handler_oem_intel(struct ncsi_request *nr)
+ intel = (struct ncsi_rsp_oem_intel_pkt *)(rsp->data);
+
+ if (intel->cmd == NCSI_OEM_INTEL_CMD_GMA)
+- return ncsi_rsp_handler_oem_intel_gma(nr);
++ return ncsi_rsp_handler_oem_gma(nr, NCSI_OEM_MFR_INTEL_ID);
+
+ return 0;
+ }
+diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
+index d119f1d4c2fc8..992393102d5f5 100644
+--- a/net/netfilter/nf_conntrack_core.c
++++ b/net/netfilter/nf_conntrack_core.c
+@@ -211,24 +211,18 @@ static u32 hash_conntrack_raw(const struct nf_conntrack_tuple *tuple,
+ unsigned int zoneid,
+ const struct net *net)
+ {
+- u64 a, b, c, d;
++ siphash_key_t key;
+
+ get_random_once(&nf_conntrack_hash_rnd, sizeof(nf_conntrack_hash_rnd));
+
+- /* The direction must be ignored, handle usable tuplehash members manually */
+- a = (u64)tuple->src.u3.all[0] << 32 | tuple->src.u3.all[3];
+- b = (u64)tuple->dst.u3.all[0] << 32 | tuple->dst.u3.all[3];
++ key = nf_conntrack_hash_rnd;
+
+- c = (__force u64)tuple->src.u.all << 32 | (__force u64)tuple->dst.u.all << 16;
+- c |= tuple->dst.protonum;
++ key.key[0] ^= zoneid;
++ key.key[1] ^= net_hash_mix(net);
+
+- d = (u64)zoneid << 32 | net_hash_mix(net);
+-
+- /* IPv4: u3.all[1,2,3] == 0 */
+- c ^= (u64)tuple->src.u3.all[1] << 32 | tuple->src.u3.all[2];
+- d += (u64)tuple->dst.u3.all[1] << 32 | tuple->dst.u3.all[2];
+-
+- return (u32)siphash_4u64(a, b, c, d, &nf_conntrack_hash_rnd);
++ return siphash((void *)tuple,
++ offsetofend(struct nf_conntrack_tuple, dst.__nfct_hash_offsetend),
++ &key);
+ }
+
+ static u32 scale_hash(u32 hash)
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index 79719e8cda799..18546f9b2a63a 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -253,8 +253,10 @@ int nf_tables_bind_chain(const struct nft_ctx *ctx, struct nft_chain *chain)
+ if (chain->bound)
+ return -EBUSY;
+
++ if (!nft_use_inc(&chain->use))
++ return -EMFILE;
++
+ chain->bound = true;
+- chain->use++;
+ nft_chain_trans_bind(ctx, chain);
+
+ return 0;
+@@ -437,7 +439,7 @@ static int nft_delchain(struct nft_ctx *ctx)
+ if (IS_ERR(trans))
+ return PTR_ERR(trans);
+
+- ctx->table->use--;
++ nft_use_dec(&ctx->table->use);
+ nft_deactivate_next(ctx->net, ctx->chain);
+
+ return 0;
+@@ -476,7 +478,7 @@ nf_tables_delrule_deactivate(struct nft_ctx *ctx, struct nft_rule *rule)
+ /* You cannot delete the same rule twice */
+ if (nft_is_active_next(ctx->net, rule)) {
+ nft_deactivate_next(ctx->net, rule);
+- ctx->chain->use--;
++ nft_use_dec(&ctx->chain->use);
+ return 0;
+ }
+ return -ENOENT;
+@@ -643,7 +645,7 @@ static int nft_delset(const struct nft_ctx *ctx, struct nft_set *set)
+ nft_map_deactivate(ctx, set);
+
+ nft_deactivate_next(ctx->net, set);
+- ctx->table->use--;
++ nft_use_dec(&ctx->table->use);
+
+ return err;
+ }
+@@ -675,7 +677,7 @@ static int nft_delobj(struct nft_ctx *ctx, struct nft_object *obj)
+ return err;
+
+ nft_deactivate_next(ctx->net, obj);
+- ctx->table->use--;
++ nft_use_dec(&ctx->table->use);
+
+ return err;
+ }
+@@ -710,7 +712,7 @@ static int nft_delflowtable(struct nft_ctx *ctx,
+ return err;
+
+ nft_deactivate_next(ctx->net, flowtable);
+- ctx->table->use--;
++ nft_use_dec(&ctx->table->use);
+
+ return err;
+ }
+@@ -2395,9 +2397,6 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
+ struct nft_chain *chain;
+ int err;
+
+- if (table->use == UINT_MAX)
+- return -EOVERFLOW;
+-
+ if (nla[NFTA_CHAIN_HOOK]) {
+ struct nft_stats __percpu *stats = NULL;
+ struct nft_chain_hook hook = {};
+@@ -2493,6 +2492,11 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
+ if (err < 0)
+ goto err_destroy_chain;
+
++ if (!nft_use_inc(&table->use)) {
++ err = -EMFILE;
++ goto err_use;
++ }
++
+ trans = nft_trans_chain_add(ctx, NFT_MSG_NEWCHAIN);
+ if (IS_ERR(trans)) {
+ err = PTR_ERR(trans);
+@@ -2509,10 +2513,11 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
+ goto err_unregister_hook;
+ }
+
+- table->use++;
+-
+ return 0;
++
+ err_unregister_hook:
++ nft_use_dec_restore(&table->use);
++err_use:
+ nf_tables_unregister_hook(net, table, chain);
+ err_destroy_chain:
+ nf_tables_chain_destroy(ctx);
+@@ -3841,9 +3846,6 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ return -EINVAL;
+ handle = nf_tables_alloc_handle(table);
+
+- if (chain->use == UINT_MAX)
+- return -EOVERFLOW;
+-
+ if (nla[NFTA_RULE_POSITION]) {
+ pos_handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_POSITION]));
+ old_rule = __nft_rule_lookup(chain, pos_handle);
+@@ -3937,6 +3939,11 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ }
+ }
+
++ if (!nft_use_inc(&chain->use)) {
++ err = -EMFILE;
++ goto err_release_rule;
++ }
++
+ if (info->nlh->nlmsg_flags & NLM_F_REPLACE) {
+ err = nft_delrule(&ctx, old_rule);
+ if (err < 0)
+@@ -3968,7 +3975,6 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ }
+ }
+ kvfree(expr_info);
+- chain->use++;
+
+ if (flow)
+ nft_trans_flow_rule(trans) = flow;
+@@ -3979,6 +3985,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ return 0;
+
+ err_destroy_flow_rule:
++ nft_use_dec_restore(&chain->use);
+ if (flow)
+ nft_flow_rule_destroy(flow);
+ err_release_rule:
+@@ -5015,9 +5022,15 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
+ alloc_size = sizeof(*set) + size + udlen;
+ if (alloc_size < size || alloc_size > INT_MAX)
+ return -ENOMEM;
++
++ if (!nft_use_inc(&table->use))
++ return -EMFILE;
++
+ set = kvzalloc(alloc_size, GFP_KERNEL_ACCOUNT);
+- if (!set)
+- return -ENOMEM;
++ if (!set) {
++ err = -ENOMEM;
++ goto err_alloc;
++ }
+
+ name = nla_strdup(nla[NFTA_SET_NAME], GFP_KERNEL_ACCOUNT);
+ if (!name) {
+@@ -5075,7 +5088,7 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
+ goto err_set_expr_alloc;
+
+ list_add_tail_rcu(&set->list, &table->sets);
+- table->use++;
++
+ return 0;
+
+ err_set_expr_alloc:
+@@ -5087,6 +5100,9 @@ err_set_init:
+ kfree(set->name);
+ err_set_name:
+ kvfree(set);
++err_alloc:
++ nft_use_dec_restore(&table->use);
++
+ return err;
+ }
+
+@@ -5225,9 +5241,6 @@ int nf_tables_bind_set(const struct nft_ctx *ctx, struct nft_set *set,
+ struct nft_set_binding *i;
+ struct nft_set_iter iter;
+
+- if (set->use == UINT_MAX)
+- return -EOVERFLOW;
+-
+ if (!list_empty(&set->bindings) && nft_set_is_anonymous(set))
+ return -EBUSY;
+
+@@ -5255,10 +5268,12 @@ int nf_tables_bind_set(const struct nft_ctx *ctx, struct nft_set *set,
+ return iter.err;
+ }
+ bind:
++ if (!nft_use_inc(&set->use))
++ return -EMFILE;
++
+ binding->chain = ctx->chain;
+ list_add_tail_rcu(&binding->list, &set->bindings);
+ nft_set_trans_bind(ctx, set);
+- set->use++;
+
+ return 0;
+ }
+@@ -5332,7 +5347,7 @@ void nf_tables_activate_set(const struct nft_ctx *ctx, struct nft_set *set)
+ nft_clear(ctx->net, set);
+ }
+
+- set->use++;
++ nft_use_inc_restore(&set->use);
+ }
+ EXPORT_SYMBOL_GPL(nf_tables_activate_set);
+
+@@ -5348,7 +5363,7 @@ void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set,
+ else
+ list_del_rcu(&binding->list);
+
+- set->use--;
++ nft_use_dec(&set->use);
+ break;
+ case NFT_TRANS_PREPARE:
+ if (nft_set_is_anonymous(set)) {
+@@ -5357,7 +5372,7 @@ void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set,
+
+ nft_deactivate_next(ctx->net, set);
+ }
+- set->use--;
++ nft_use_dec(&set->use);
+ return;
+ case NFT_TRANS_ABORT:
+ case NFT_TRANS_RELEASE:
+@@ -5365,7 +5380,7 @@ void nf_tables_deactivate_set(const struct nft_ctx *ctx, struct nft_set *set,
+ set->flags & (NFT_SET_MAP | NFT_SET_OBJECT))
+ nft_map_deactivate(ctx, set);
+
+- set->use--;
++ nft_use_dec(&set->use);
+ fallthrough;
+ default:
+ nf_tables_unbind_set(ctx, set, binding,
+@@ -6134,7 +6149,7 @@ void nft_set_elem_destroy(const struct nft_set *set, void *elem,
+ nft_set_elem_expr_destroy(&ctx, nft_set_ext_expr(ext));
+
+ if (nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF))
+- (*nft_set_ext_obj(ext))->use--;
++ nft_use_dec(&(*nft_set_ext_obj(ext))->use);
+ kfree(elem);
+ }
+ EXPORT_SYMBOL_GPL(nft_set_elem_destroy);
+@@ -6636,8 +6651,16 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
+ set->objtype, genmask);
+ if (IS_ERR(obj)) {
+ err = PTR_ERR(obj);
++ obj = NULL;
+ goto err_parse_key_end;
+ }
++
++ if (!nft_use_inc(&obj->use)) {
++ err = -EMFILE;
++ obj = NULL;
++ goto err_parse_key_end;
++ }
++
+ err = nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF);
+ if (err < 0)
+ goto err_parse_key_end;
+@@ -6706,10 +6729,9 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
+ if (flags)
+ *nft_set_ext_flags(ext) = flags;
+
+- if (obj) {
++ if (obj)
+ *nft_set_ext_obj(ext) = obj;
+- obj->use++;
+- }
++
+ if (ulen > 0) {
+ if (nft_set_ext_check(&tmpl, NFT_SET_EXT_USERDATA, ulen) < 0) {
+ err = -EINVAL;
+@@ -6774,12 +6796,13 @@ err_element_clash:
+ kfree(trans);
+ err_elem_free:
+ nf_tables_set_elem_destroy(ctx, set, elem.priv);
+- if (obj)
+- obj->use--;
+ err_parse_data:
+ if (nla[NFTA_SET_ELEM_DATA] != NULL)
+ nft_data_release(&elem.data.val, desc.type);
+ err_parse_key_end:
++ if (obj)
++ nft_use_dec_restore(&obj->use);
++
+ nft_data_release(&elem.key_end.val, NFT_DATA_VALUE);
+ err_parse_key:
+ nft_data_release(&elem.key.val, NFT_DATA_VALUE);
+@@ -6859,7 +6882,7 @@ void nft_data_hold(const struct nft_data *data, enum nft_data_types type)
+ case NFT_JUMP:
+ case NFT_GOTO:
+ chain = data->verdict.chain;
+- chain->use++;
++ nft_use_inc_restore(&chain->use);
+ break;
+ }
+ }
+@@ -6874,7 +6897,7 @@ static void nft_setelem_data_activate(const struct net *net,
+ if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA))
+ nft_data_hold(nft_set_ext_data(ext), set->dtype);
+ if (nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF))
+- (*nft_set_ext_obj(ext))->use++;
++ nft_use_inc_restore(&(*nft_set_ext_obj(ext))->use);
+ }
+
+ static void nft_setelem_data_deactivate(const struct net *net,
+@@ -6886,7 +6909,7 @@ static void nft_setelem_data_deactivate(const struct net *net,
+ if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA))
+ nft_data_release(nft_set_ext_data(ext), set->dtype);
+ if (nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF))
+- (*nft_set_ext_obj(ext))->use--;
++ nft_use_dec(&(*nft_set_ext_obj(ext))->use);
+ }
+
+ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set,
+@@ -7429,9 +7452,14 @@ static int nf_tables_newobj(struct sk_buff *skb, const struct nfnl_info *info,
+
+ nft_ctx_init(&ctx, net, skb, info->nlh, family, table, NULL, nla);
+
++ if (!nft_use_inc(&table->use))
++ return -EMFILE;
++
+ type = nft_obj_type_get(net, objtype);
+- if (IS_ERR(type))
+- return PTR_ERR(type);
++ if (IS_ERR(type)) {
++ err = PTR_ERR(type);
++ goto err_type;
++ }
+
+ obj = nft_obj_init(&ctx, type, nla[NFTA_OBJ_DATA]);
+ if (IS_ERR(obj)) {
+@@ -7465,7 +7493,7 @@ static int nf_tables_newobj(struct sk_buff *skb, const struct nfnl_info *info,
+ goto err_obj_ht;
+
+ list_add_tail_rcu(&obj->list, &table->objects);
+- table->use++;
++
+ return 0;
+ err_obj_ht:
+ /* queued in transaction log */
+@@ -7481,6 +7509,9 @@ err_strdup:
+ kfree(obj);
+ err_init:
+ module_put(type->owner);
++err_type:
++ nft_use_dec_restore(&table->use);
++
+ return err;
+ }
+
+@@ -7882,7 +7913,7 @@ void nf_tables_deactivate_flowtable(const struct nft_ctx *ctx,
+ case NFT_TRANS_PREPARE:
+ case NFT_TRANS_ABORT:
+ case NFT_TRANS_RELEASE:
+- flowtable->use--;
++ nft_use_dec(&flowtable->use);
+ fallthrough;
+ default:
+ return;
+@@ -8236,9 +8267,14 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
+
+ nft_ctx_init(&ctx, net, skb, info->nlh, family, table, NULL, nla);
+
++ if (!nft_use_inc(&table->use))
++ return -EMFILE;
++
+ flowtable = kzalloc(sizeof(*flowtable), GFP_KERNEL_ACCOUNT);
+- if (!flowtable)
+- return -ENOMEM;
++ if (!flowtable) {
++ err = -ENOMEM;
++ goto flowtable_alloc;
++ }
+
+ flowtable->table = table;
+ flowtable->handle = nf_tables_alloc_handle(table);
+@@ -8293,7 +8329,6 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
+ goto err5;
+
+ list_add_tail_rcu(&flowtable->list, &table->flowtables);
+- table->use++;
+
+ return 0;
+ err5:
+@@ -8310,6 +8345,9 @@ err2:
+ kfree(flowtable->name);
+ err1:
+ kfree(flowtable);
++flowtable_alloc:
++ nft_use_dec_restore(&table->use);
++
+ return err;
+ }
+
+@@ -9680,7 +9718,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ */
+ if (nft_set_is_anonymous(nft_trans_set(trans)) &&
+ !list_empty(&nft_trans_set(trans)->bindings))
+- trans->ctx.table->use--;
++ nft_use_dec(&trans->ctx.table->use);
+ }
+ nf_tables_set_notify(&trans->ctx, nft_trans_set(trans),
+ NFT_MSG_NEWSET, GFP_KERNEL);
+@@ -9910,7 +9948,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ nft_trans_destroy(trans);
+ break;
+ }
+- trans->ctx.table->use--;
++ nft_use_dec_restore(&trans->ctx.table->use);
+ nft_chain_del(trans->ctx.chain);
+ nf_tables_unregister_hook(trans->ctx.net,
+ trans->ctx.table,
+@@ -9923,7 +9961,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ list_splice(&nft_trans_chain_hooks(trans),
+ &nft_trans_basechain(trans)->hook_list);
+ } else {
+- trans->ctx.table->use++;
++ nft_use_inc_restore(&trans->ctx.table->use);
+ nft_clear(trans->ctx.net, trans->ctx.chain);
+ }
+ nft_trans_destroy(trans);
+@@ -9933,7 +9971,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ nft_trans_destroy(trans);
+ break;
+ }
+- trans->ctx.chain->use--;
++ nft_use_dec_restore(&trans->ctx.chain->use);
+ list_del_rcu(&nft_trans_rule(trans)->list);
+ nft_rule_expr_deactivate(&trans->ctx,
+ nft_trans_rule(trans),
+@@ -9943,7 +9981,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ break;
+ case NFT_MSG_DELRULE:
+ case NFT_MSG_DESTROYRULE:
+- trans->ctx.chain->use++;
++ nft_use_inc_restore(&trans->ctx.chain->use);
+ nft_clear(trans->ctx.net, nft_trans_rule(trans));
+ nft_rule_expr_activate(&trans->ctx, nft_trans_rule(trans));
+ if (trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD)
+@@ -9956,7 +9994,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ nft_trans_destroy(trans);
+ break;
+ }
+- trans->ctx.table->use--;
++ nft_use_dec_restore(&trans->ctx.table->use);
+ if (nft_trans_set_bound(trans)) {
+ nft_trans_destroy(trans);
+ break;
+@@ -9965,7 +10003,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ break;
+ case NFT_MSG_DELSET:
+ case NFT_MSG_DESTROYSET:
+- trans->ctx.table->use++;
++ nft_use_inc_restore(&trans->ctx.table->use);
+ nft_clear(trans->ctx.net, nft_trans_set(trans));
+ if (nft_trans_set(trans)->flags & (NFT_SET_MAP | NFT_SET_OBJECT))
+ nft_map_activate(&trans->ctx, nft_trans_set(trans));
+@@ -10009,13 +10047,13 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ nft_obj_destroy(&trans->ctx, nft_trans_obj_newobj(trans));
+ nft_trans_destroy(trans);
+ } else {
+- trans->ctx.table->use--;
++ nft_use_dec_restore(&trans->ctx.table->use);
+ nft_obj_del(nft_trans_obj(trans));
+ }
+ break;
+ case NFT_MSG_DELOBJ:
+ case NFT_MSG_DESTROYOBJ:
+- trans->ctx.table->use++;
++ nft_use_inc_restore(&trans->ctx.table->use);
+ nft_clear(trans->ctx.net, nft_trans_obj(trans));
+ nft_trans_destroy(trans);
+ break;
+@@ -10024,7 +10062,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ nft_unregister_flowtable_net_hooks(net,
+ &nft_trans_flowtable_hooks(trans));
+ } else {
+- trans->ctx.table->use--;
++ nft_use_dec_restore(&trans->ctx.table->use);
+ list_del_rcu(&nft_trans_flowtable(trans)->list);
+ nft_unregister_flowtable_net_hooks(net,
+ &nft_trans_flowtable(trans)->hook_list);
+@@ -10036,7 +10074,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
+ list_splice(&nft_trans_flowtable_hooks(trans),
+ &nft_trans_flowtable(trans)->hook_list);
+ } else {
+- trans->ctx.table->use++;
++ nft_use_inc_restore(&trans->ctx.table->use);
+ nft_clear(trans->ctx.net, nft_trans_flowtable(trans));
+ }
+ nft_trans_destroy(trans);
+@@ -10486,8 +10524,9 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data,
+ if (desc->flags & NFT_DATA_DESC_SETELEM &&
+ chain->flags & NFT_CHAIN_BINDING)
+ return -EINVAL;
++ if (!nft_use_inc(&chain->use))
++ return -EMFILE;
+
+- chain->use++;
+ data->verdict.chain = chain;
+ break;
+ }
+@@ -10505,7 +10544,7 @@ static void nft_verdict_uninit(const struct nft_data *data)
+ case NFT_JUMP:
+ case NFT_GOTO:
+ chain = data->verdict.chain;
+- chain->use--;
++ nft_use_dec(&chain->use);
+ break;
+ }
+ }
+@@ -10674,11 +10713,11 @@ int __nft_release_basechain(struct nft_ctx *ctx)
+ nf_tables_unregister_hook(ctx->net, ctx->chain->table, ctx->chain);
+ list_for_each_entry_safe(rule, nr, &ctx->chain->rules, list) {
+ list_del(&rule->list);
+- ctx->chain->use--;
++ nft_use_dec(&ctx->chain->use);
+ nf_tables_rule_release(ctx, rule);
+ }
+ nft_chain_del(ctx->chain);
+- ctx->table->use--;
++ nft_use_dec(&ctx->table->use);
+ nf_tables_chain_destroy(ctx);
+
+ return 0;
+@@ -10728,18 +10767,18 @@ static void __nft_release_table(struct net *net, struct nft_table *table)
+ ctx.chain = chain;
+ list_for_each_entry_safe(rule, nr, &chain->rules, list) {
+ list_del(&rule->list);
+- chain->use--;
++ nft_use_dec(&chain->use);
+ nf_tables_rule_release(&ctx, rule);
+ }
+ }
+ list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) {
+ list_del(&flowtable->list);
+- table->use--;
++ nft_use_dec(&table->use);
+ nf_tables_flowtable_destroy(flowtable);
+ }
+ list_for_each_entry_safe(set, ns, &table->sets, list) {
+ list_del(&set->list);
+- table->use--;
++ nft_use_dec(&table->use);
+ if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT))
+ nft_map_deactivate(&ctx, set);
+
+@@ -10747,13 +10786,13 @@ static void __nft_release_table(struct net *net, struct nft_table *table)
+ }
+ list_for_each_entry_safe(obj, ne, &table->objects, list) {
+ nft_obj_del(obj);
+- table->use--;
++ nft_use_dec(&table->use);
+ nft_obj_destroy(&ctx, obj);
+ }
+ list_for_each_entry_safe(chain, nc, &table->chains, list) {
+ ctx.chain = chain;
+ nft_chain_del(chain);
+- table->use--;
++ nft_use_dec(&table->use);
+ nf_tables_chain_destroy(&ctx);
+ }
+ nf_tables_table_destroy(&ctx);
+diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
+index e860d8fe0e5e2..03159c6c6c4b6 100644
+--- a/net/netfilter/nft_flow_offload.c
++++ b/net/netfilter/nft_flow_offload.c
+@@ -404,8 +404,10 @@ static int nft_flow_offload_init(const struct nft_ctx *ctx,
+ if (IS_ERR(flowtable))
+ return PTR_ERR(flowtable);
+
++ if (!nft_use_inc(&flowtable->use))
++ return -EMFILE;
++
+ priv->flowtable = flowtable;
+- flowtable->use++;
+
+ return nf_ct_netns_get(ctx->net, ctx->family);
+ }
+@@ -424,7 +426,7 @@ static void nft_flow_offload_activate(const struct nft_ctx *ctx,
+ {
+ struct nft_flow_offload *priv = nft_expr_priv(expr);
+
+- priv->flowtable->use++;
++ nft_use_inc_restore(&priv->flowtable->use);
+ }
+
+ static void nft_flow_offload_destroy(const struct nft_ctx *ctx,
+diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c
+index 3d76ebfe8939b..407d7197f75bb 100644
+--- a/net/netfilter/nft_immediate.c
++++ b/net/netfilter/nft_immediate.c
+@@ -159,7 +159,7 @@ static void nft_immediate_deactivate(const struct nft_ctx *ctx,
+ default:
+ nft_chain_del(chain);
+ chain->bound = false;
+- chain->table->use--;
++ nft_use_dec(&chain->table->use);
+ break;
+ }
+ break;
+@@ -198,7 +198,7 @@ static void nft_immediate_destroy(const struct nft_ctx *ctx,
+ * let the transaction records release this chain and its rules.
+ */
+ if (chain->bound) {
+- chain->use--;
++ nft_use_dec(&chain->use);
+ break;
+ }
+
+@@ -206,9 +206,9 @@ static void nft_immediate_destroy(const struct nft_ctx *ctx,
+ chain_ctx = *ctx;
+ chain_ctx.chain = chain;
+
+- chain->use--;
++ nft_use_dec(&chain->use);
+ list_for_each_entry_safe(rule, n, &chain->rules, list) {
+- chain->use--;
++ nft_use_dec(&chain->use);
+ list_del(&rule->list);
+ nf_tables_rule_destroy(&chain_ctx, rule);
+ }
+diff --git a/net/netfilter/nft_objref.c b/net/netfilter/nft_objref.c
+index a48dd5b5d45b1..509011b1ef597 100644
+--- a/net/netfilter/nft_objref.c
++++ b/net/netfilter/nft_objref.c
+@@ -41,8 +41,10 @@ static int nft_objref_init(const struct nft_ctx *ctx,
+ if (IS_ERR(obj))
+ return -ENOENT;
+
++ if (!nft_use_inc(&obj->use))
++ return -EMFILE;
++
+ nft_objref_priv(expr) = obj;
+- obj->use++;
+
+ return 0;
+ }
+@@ -72,7 +74,7 @@ static void nft_objref_deactivate(const struct nft_ctx *ctx,
+ if (phase == NFT_TRANS_COMMIT)
+ return;
+
+- obj->use--;
++ nft_use_dec(&obj->use);
+ }
+
+ static void nft_objref_activate(const struct nft_ctx *ctx,
+@@ -80,7 +82,7 @@ static void nft_objref_activate(const struct nft_ctx *ctx,
+ {
+ struct nft_object *obj = nft_objref_priv(expr);
+
+- obj->use++;
++ nft_use_inc_restore(&obj->use);
+ }
+
+ static const struct nft_expr_ops nft_objref_ops = {
+diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
+index 815c3e416bc54..652158f612fc2 100644
+--- a/net/sched/cls_flower.c
++++ b/net/sched/cls_flower.c
+@@ -799,6 +799,16 @@ static int fl_set_key_port_range(struct nlattr **tb, struct fl_flow_key *key,
+ TCA_FLOWER_KEY_PORT_SRC_MAX, &mask->tp_range.tp_max.src,
+ TCA_FLOWER_UNSPEC, sizeof(key->tp_range.tp_max.src));
+
++ if (mask->tp_range.tp_min.dst != mask->tp_range.tp_max.dst) {
++ NL_SET_ERR_MSG(extack,
++ "Both min and max destination ports must be specified");
++ return -EINVAL;
++ }
++ if (mask->tp_range.tp_min.src != mask->tp_range.tp_max.src) {
++ NL_SET_ERR_MSG(extack,
++ "Both min and max source ports must be specified");
++ return -EINVAL;
++ }
+ if (mask->tp_range.tp_min.dst && mask->tp_range.tp_max.dst &&
+ ntohs(key->tp_range.tp_max.dst) <=
+ ntohs(key->tp_range.tp_min.dst)) {
+diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c
+index ae9439a6c56c9..8641f80593179 100644
+--- a/net/sched/cls_fw.c
++++ b/net/sched/cls_fw.c
+@@ -212,11 +212,6 @@ static int fw_set_parms(struct net *net, struct tcf_proto *tp,
+ if (err < 0)
+ return err;
+
+- if (tb[TCA_FW_CLASSID]) {
+- f->res.classid = nla_get_u32(tb[TCA_FW_CLASSID]);
+- tcf_bind_filter(tp, &f->res, base);
+- }
+-
+ if (tb[TCA_FW_INDEV]) {
+ int ret;
+ ret = tcf_change_indev(net, tb[TCA_FW_INDEV], extack);
+@@ -233,6 +228,11 @@ static int fw_set_parms(struct net *net, struct tcf_proto *tp,
+ } else if (head->mask != 0xFFFFFFFF)
+ return err;
+
++ if (tb[TCA_FW_CLASSID]) {
++ f->res.classid = nla_get_u32(tb[TCA_FW_CLASSID]);
++ tcf_bind_filter(tp, &f->res, base);
++ }
++
+ return 0;
+ }
+
+diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
+index dfd9a99e62570..befaf74b33caa 100644
+--- a/net/sched/sch_qfq.c
++++ b/net/sched/sch_qfq.c
+@@ -381,8 +381,13 @@ static int qfq_change_agg(struct Qdisc *sch, struct qfq_class *cl, u32 weight,
+ u32 lmax)
+ {
+ struct qfq_sched *q = qdisc_priv(sch);
+- struct qfq_aggregate *new_agg = qfq_find_agg(q, lmax, weight);
++ struct qfq_aggregate *new_agg;
+
++ /* 'lmax' can range from [QFQ_MIN_LMAX, pktlen + stab overhead] */
++ if (lmax > QFQ_MAX_LMAX)
++ return -EINVAL;
++
++ new_agg = qfq_find_agg(q, lmax, weight);
+ if (new_agg == NULL) { /* create new aggregate */
+ new_agg = kzalloc(sizeof(*new_agg), GFP_ATOMIC);
+ if (new_agg == NULL)
+@@ -423,10 +428,17 @@ static int qfq_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
+ else
+ weight = 1;
+
+- if (tb[TCA_QFQ_LMAX])
++ if (tb[TCA_QFQ_LMAX]) {
+ lmax = nla_get_u32(tb[TCA_QFQ_LMAX]);
+- else
++ } else {
++ /* MTU size is user controlled */
+ lmax = psched_mtu(qdisc_dev(sch));
++ if (lmax < QFQ_MIN_LMAX || lmax > QFQ_MAX_LMAX) {
++ NL_SET_ERR_MSG_MOD(extack,
++ "MTU size out of bounds for qfq");
++ return -EINVAL;
++ }
++ }
+
+ inv_w = ONE_FP / weight;
+ weight = ONE_FP / inv_w;
+diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
+index cf0e61ed92253..4caf80ddc6721 100644
+--- a/net/sched/sch_taprio.c
++++ b/net/sched/sch_taprio.c
+@@ -1527,7 +1527,7 @@ static int taprio_enable_offload(struct net_device *dev,
+ "Not enough memory for enabling offload mode");
+ return -ENOMEM;
+ }
+- offload->enable = 1;
++ offload->cmd = TAPRIO_CMD_REPLACE;
+ offload->extack = extack;
+ mqprio_qopt_reconstruct(dev, &offload->mqprio.qopt);
+ offload->mqprio.extack = extack;
+@@ -1575,7 +1575,7 @@ static int taprio_disable_offload(struct net_device *dev,
+ "Not enough memory to disable offload mode");
+ return -ENOMEM;
+ }
+- offload->enable = 0;
++ offload->cmd = TAPRIO_CMD_DESTROY;
+
+ err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
+ if (err < 0) {
+diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
+index a05bc2cc22614..7986033887f63 100644
+--- a/samples/ftrace/ftrace-direct-too.c
++++ b/samples/ftrace/ftrace-direct-too.c
+@@ -5,14 +5,14 @@
+ #include <linux/ftrace.h>
+ #include <asm/asm-offsets.h>
+
+-extern void my_direct_func(struct vm_area_struct *vma,
+- unsigned long address, unsigned int flags);
++extern void my_direct_func(struct vm_area_struct *vma, unsigned long address,
++ unsigned int flags, struct pt_regs *regs);
+
+-void my_direct_func(struct vm_area_struct *vma,
+- unsigned long address, unsigned int flags)
++void my_direct_func(struct vm_area_struct *vma, unsigned long address,
++ unsigned int flags, struct pt_regs *regs)
+ {
+- trace_printk("handle mm fault vma=%p address=%lx flags=%x\n",
+- vma, address, flags);
++ trace_printk("handle mm fault vma=%p address=%lx flags=%x regs=%p\n",
++ vma, address, flags, regs);
+ }
+
+ extern void my_tramp(void *);
+@@ -34,7 +34,9 @@ asm (
+ " pushq %rdi\n"
+ " pushq %rsi\n"
+ " pushq %rdx\n"
++" pushq %rcx\n"
+ " call my_direct_func\n"
++" popq %rcx\n"
+ " popq %rdx\n"
+ " popq %rsi\n"
+ " popq %rdi\n"
+diff --git a/security/integrity/platform_certs/load_powerpc.c b/security/integrity/platform_certs/load_powerpc.c
+index b9de70b908262..170789dc63d21 100644
+--- a/security/integrity/platform_certs/load_powerpc.c
++++ b/security/integrity/platform_certs/load_powerpc.c
+@@ -15,6 +15,9 @@
+ #include "keyring_handler.h"
+ #include "../integrity.h"
+
++#define extract_esl(db, data, size, offset) \
++ do { db = data + offset; size = size - offset; } while (0)
++
+ /*
+ * Get a certificate list blob from the named secure variable.
+ *
+@@ -55,8 +58,9 @@ static __init void *get_cert_list(u8 *key, unsigned long keylen, u64 *size)
+ */
+ static int __init load_powerpc_certs(void)
+ {
+- void *db = NULL, *dbx = NULL;
+- u64 dbsize = 0, dbxsize = 0;
++ void *db = NULL, *dbx = NULL, *data = NULL;
++ u64 dsize = 0;
++ u64 offset = 0;
+ int rc = 0;
+ ssize_t len;
+ char buf[32];
+@@ -74,38 +78,46 @@ static int __init load_powerpc_certs(void)
+ return -ENODEV;
+ }
+
++ if (strcmp("ibm,plpks-sb-v1", buf) == 0)
++ /* PLPKS authenticated variables ESL data is prefixed with 8 bytes of timestamp */
++ offset = 8;
++
+ /*
+ * Get db, and dbx. They might not exist, so it isn't an error if we
+ * can't get them.
+ */
+- db = get_cert_list("db", 3, &dbsize);
+- if (!db) {
++ data = get_cert_list("db", 3, &dsize);
++ if (!data) {
+ pr_info("Couldn't get db list from firmware\n");
+- } else if (IS_ERR(db)) {
+- rc = PTR_ERR(db);
++ } else if (IS_ERR(data)) {
++ rc = PTR_ERR(data);
+ pr_err("Error reading db from firmware: %d\n", rc);
+ return rc;
+ } else {
+- rc = parse_efi_signature_list("powerpc:db", db, dbsize,
++ extract_esl(db, data, dsize, offset);
++
++ rc = parse_efi_signature_list("powerpc:db", db, dsize,
+ get_handler_for_db);
+ if (rc)
+ pr_err("Couldn't parse db signatures: %d\n", rc);
+- kfree(db);
++ kfree(data);
+ }
+
+- dbx = get_cert_list("dbx", 4, &dbxsize);
+- if (!dbx) {
++ data = get_cert_list("dbx", 4, &dsize);
++ if (!data) {
+ pr_info("Couldn't get dbx list from firmware\n");
+- } else if (IS_ERR(dbx)) {
+- rc = PTR_ERR(dbx);
++ } else if (IS_ERR(data)) {
++ rc = PTR_ERR(data);
+ pr_err("Error reading dbx from firmware: %d\n", rc);
+ return rc;
+ } else {
+- rc = parse_efi_signature_list("powerpc:dbx", dbx, dbxsize,
++ extract_esl(dbx, data, dsize, offset);
++
++ rc = parse_efi_signature_list("powerpc:dbx", dbx, dsize,
+ get_handler_for_dbx);
+ if (rc)
+ pr_err("Couldn't parse dbx signatures: %d\n", rc);
+- kfree(dbx);
++ kfree(data);
+ }
+
+ return rc;
+diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config
+index 6032f9b23c4c2..e317c2e44dae8 100644
+--- a/tools/testing/selftests/net/mptcp/config
++++ b/tools/testing/selftests/net/mptcp/config
+@@ -6,6 +6,7 @@ CONFIG_INET_DIAG=m
+ CONFIG_INET_MPTCP_DIAG=m
+ CONFIG_VETH=y
+ CONFIG_NET_SCH_NETEM=m
++CONFIG_SYN_COOKIES=y
+ CONFIG_NETFILTER=y
+ CONFIG_NETFILTER_ADVANCED=y
+ CONFIG_NETFILTER_NETLINK=m
+diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+index 773dd770a5670..780b8fef22617 100755
+--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
++++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+@@ -718,6 +718,7 @@ table inet mangle {
+ EOF
+ if [ $? -ne 0 ]; then
+ echo "SKIP: $msg, could not load nft ruleset"
++ mptcp_lib_fail_if_expected_feature "nft rules"
+ return
+ fi
+
+@@ -733,6 +734,7 @@ EOF
+ if [ $? -ne 0 ]; then
+ ip netns exec "$listener_ns" nft flush ruleset
+ echo "SKIP: $msg, ip $r6flag rule failed"
++ mptcp_lib_fail_if_expected_feature "ip rule"
+ return
+ fi
+
+@@ -741,6 +743,7 @@ EOF
+ ip netns exec "$listener_ns" nft flush ruleset
+ ip -net "$listener_ns" $r6flag rule del fwmark 1 lookup 100
+ echo "SKIP: $msg, ip route add local $local_addr failed"
++ mptcp_lib_fail_if_expected_feature "ip route"
+ return
+ fi
+
+diff --git a/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh b/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh
+index f295a371ff148..dc8d473fc82c8 100755
+--- a/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh
++++ b/tools/testing/selftests/net/mptcp/mptcp_sockopt.sh
+@@ -12,6 +12,8 @@ ksft_skip=4
+ timeout_poll=30
+ timeout_test=$((timeout_poll * 2 + 1))
+ mptcp_connect=""
++iptables="iptables"
++ip6tables="ip6tables"
+
+ sec=$(date +%s)
+ rndh=$(printf %x $sec)-$(mktemp -u XXXXXX)
+@@ -25,7 +27,7 @@ add_mark_rules()
+ local m=$2
+
+ local t
+- for t in iptables ip6tables; do
++ for t in ${iptables} ${ip6tables}; do
+ # just to debug: check we have multiple subflows connection requests
+ ip netns exec $ns $t -A OUTPUT -p tcp --syn -m mark --mark $m -j ACCEPT
+
+@@ -95,14 +97,14 @@ if [ $? -ne 0 ];then
+ exit $ksft_skip
+ fi
+
+-iptables -V > /dev/null 2>&1
+-if [ $? -ne 0 ];then
++# Use the legacy version if available to support old kernel versions
++if iptables-legacy -V &> /dev/null; then
++ iptables="iptables-legacy"
++ ip6tables="ip6tables-legacy"
++elif ! iptables -V &> /dev/null; then
+ echo "SKIP: Could not run all tests without iptables tool"
+ exit $ksft_skip
+-fi
+-
+-ip6tables -V > /dev/null 2>&1
+-if [ $? -ne 0 ];then
++elif ! ip6tables -V &> /dev/null; then
+ echo "SKIP: Could not run all tests without ip6tables tool"
+ exit $ksft_skip
+ fi
+@@ -112,10 +114,10 @@ check_mark()
+ local ns=$1
+ local af=$2
+
+- local tables=iptables
++ local tables=${iptables}
+
+ if [ $af -eq 6 ];then
+- tables=ip6tables
++ tables=${ip6tables}
+ fi
+
+ local counters values
+@@ -126,6 +128,7 @@ check_mark()
+ for v in $values; do
+ if [ $v -ne 0 ]; then
+ echo "FAIL: got $tables $values in ns $ns , not 0 - not all expected packets marked" 1>&2
++ ret=1
+ return 1
+ fi
+ done
+@@ -225,11 +228,11 @@ do_transfer()
+ fi
+
+ if [ $local_addr = "::" ];then
+- check_mark $listener_ns 6
+- check_mark $connector_ns 6
++ check_mark $listener_ns 6 || retc=1
++ check_mark $connector_ns 6 || retc=1
+ else
+- check_mark $listener_ns 4
+- check_mark $connector_ns 4
++ check_mark $listener_ns 4 || retc=1
++ check_mark $connector_ns 4 || retc=1
+ fi
+
+ check_transfer $cin $sout "file received by server"
+diff --git a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
+index abddf4c63e797..1887bd61bd9a5 100644
+--- a/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
++++ b/tools/testing/selftests/net/mptcp/pm_nl_ctl.c
+@@ -425,7 +425,7 @@ int dsf(int fd, int pm_family, int argc, char *argv[])
+ }
+
+ /* token */
+- token = atoi(params[4]);
++ token = strtoul(params[4], NULL, 10);
+ rta = (void *)(data + off);
+ rta->rta_type = MPTCP_PM_ATTR_TOKEN;
+ rta->rta_len = RTA_LENGTH(4);
+@@ -551,7 +551,7 @@ int csf(int fd, int pm_family, int argc, char *argv[])
+ }
+
+ /* token */
+- token = atoi(params[4]);
++ token = strtoul(params[4], NULL, 10);
+ rta = (void *)(data + off);
+ rta->rta_type = MPTCP_PM_ATTR_TOKEN;
+ rta->rta_len = RTA_LENGTH(4);
+@@ -598,7 +598,7 @@ int remove_addr(int fd, int pm_family, int argc, char *argv[])
+ if (++arg >= argc)
+ error(1, 0, " missing token value");
+
+- token = atoi(argv[arg]);
++ token = strtoul(argv[arg], NULL, 10);
+ rta = (void *)(data + off);
+ rta->rta_type = MPTCP_PM_ATTR_TOKEN;
+ rta->rta_len = RTA_LENGTH(4);
+@@ -710,7 +710,7 @@ int announce_addr(int fd, int pm_family, int argc, char *argv[])
+ if (++arg >= argc)
+ error(1, 0, " missing token value");
+
+- token = atoi(argv[arg]);
++ token = strtoul(argv[arg], NULL, 10);
+ } else
+ error(1, 0, "unknown keyword %s", argv[arg]);
+ }
+@@ -1347,7 +1347,7 @@ int set_flags(int fd, int pm_family, int argc, char *argv[])
+ error(1, 0, " missing token value");
+
+ /* token */
+- token = atoi(argv[arg]);
++ token = strtoul(argv[arg], NULL, 10);
+ } else if (!strcmp(argv[arg], "flags")) {
+ char *tok, *str;
+
+diff --git a/tools/testing/selftests/net/mptcp/userspace_pm.sh b/tools/testing/selftests/net/mptcp/userspace_pm.sh
+index 98d9e4d2d3fc2..b180133a30af7 100755
+--- a/tools/testing/selftests/net/mptcp/userspace_pm.sh
++++ b/tools/testing/selftests/net/mptcp/userspace_pm.sh
+@@ -423,6 +423,7 @@ test_remove()
+ stdbuf -o0 -e0 printf "[OK]\n"
+ else
+ stdbuf -o0 -e0 printf "[FAIL]\n"
++ exit 1
+ fi
+
+ # RM_ADDR using an invalid addr id should result in no action
+@@ -437,6 +438,7 @@ test_remove()
+ stdbuf -o0 -e0 printf "[OK]\n"
+ else
+ stdbuf -o0 -e0 printf "[FAIL]\n"
++ exit 1
+ fi
+
+ # RM_ADDR from the client to server machine
+@@ -848,7 +850,7 @@ test_prio()
+ local count
+
+ # Send MP_PRIO signal from client to server machine
+- ip netns exec "$ns2" ./pm_nl_ctl set 10.0.1.2 port "$client4_port" flags backup token "$client4_token" rip 10.0.1.1 rport "$server4_port"
++ ip netns exec "$ns2" ./pm_nl_ctl set 10.0.1.2 port "$client4_port" flags backup token "$client4_token" rip 10.0.1.1 rport "$app4_port"
+ sleep 0.5
+
+ # Check TX
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-24 20:26 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-24 20:26 UTC (permalink / raw
To: gentoo-commits
commit: 39721c9903cde07abf7177f0a6d3e677564845e6
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Mon Jul 24 20:25:54 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Mon Jul 24 20:25:54 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=39721c99
Linux patch 6.4.6
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1005_linux-6.4.6.patch | 310 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 314 insertions(+)
diff --git a/0000_README b/0000_README
index 9d316aeb..44858dee 100644
--- a/0000_README
+++ b/0000_README
@@ -63,6 +63,10 @@ Patch: 1004_linux-6.4.5.patch
From: https://www.kernel.org
Desc: Linux 6.4.5
+Patch: 1005_linux-6.4.6.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.6
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1005_linux-6.4.6.patch b/1005_linux-6.4.6.patch
new file mode 100644
index 00000000..46797473
--- /dev/null
+++ b/1005_linux-6.4.6.patch
@@ -0,0 +1,310 @@
+diff --git a/Makefile b/Makefile
+index c324529158cc6..23ddaa3f30343 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 5
++SUBLEVEL = 6
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/x86/include/asm/microcode.h b/arch/x86/include/asm/microcode.h
+index 320566a0443db..66dbba181bd9a 100644
+--- a/arch/x86/include/asm/microcode.h
++++ b/arch/x86/include/asm/microcode.h
+@@ -5,6 +5,7 @@
+ #include <asm/cpu.h>
+ #include <linux/earlycpio.h>
+ #include <linux/initrd.h>
++#include <asm/microcode_amd.h>
+
+ struct ucode_patch {
+ struct list_head plist;
+diff --git a/arch/x86/include/asm/microcode_amd.h b/arch/x86/include/asm/microcode_amd.h
+index e6662adf3af4d..9675c621c1ca4 100644
+--- a/arch/x86/include/asm/microcode_amd.h
++++ b/arch/x86/include/asm/microcode_amd.h
+@@ -48,11 +48,13 @@ extern void __init load_ucode_amd_bsp(unsigned int family);
+ extern void load_ucode_amd_ap(unsigned int family);
+ extern int __init save_microcode_in_initrd_amd(unsigned int family);
+ void reload_ucode_amd(unsigned int cpu);
++extern void amd_check_microcode(void);
+ #else
+ static inline void __init load_ucode_amd_bsp(unsigned int family) {}
+ static inline void load_ucode_amd_ap(unsigned int family) {}
+ static inline int __init
+ save_microcode_in_initrd_amd(unsigned int family) { return -EINVAL; }
+ static inline void reload_ucode_amd(unsigned int cpu) {}
++static inline void amd_check_microcode(void) {}
+ #endif
+ #endif /* _ASM_X86_MICROCODE_AMD_H */
+diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
+index 3aedae61af4fc..a00a53e15ab73 100644
+--- a/arch/x86/include/asm/msr-index.h
++++ b/arch/x86/include/asm/msr-index.h
+@@ -545,6 +545,7 @@
+ #define MSR_AMD64_DE_CFG 0xc0011029
+ #define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT 1
+ #define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE BIT_ULL(MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT)
++#define MSR_AMD64_DE_CFG_ZEN2_FP_BACKUP_FIX_BIT 9
+
+ #define MSR_AMD64_BU_CFG2 0xc001102a
+ #define MSR_AMD64_IBSFETCHCTL 0xc0011030
+diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
+index 571abf808ea31..26ad7ca423e7c 100644
+--- a/arch/x86/kernel/cpu/amd.c
++++ b/arch/x86/kernel/cpu/amd.c
+@@ -27,11 +27,6 @@
+
+ #include "cpu.h"
+
+-static const int amd_erratum_383[];
+-static const int amd_erratum_400[];
+-static const int amd_erratum_1054[];
+-static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum);
+-
+ /*
+ * nodes_per_socket: Stores the number of nodes per socket.
+ * Refer to Fam15h Models 00-0fh BKDG - CPUID Fn8000_001E_ECX
+@@ -39,6 +34,78 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum);
+ */
+ static u32 nodes_per_socket = 1;
+
++/*
++ * AMD errata checking
++ *
++ * Errata are defined as arrays of ints using the AMD_LEGACY_ERRATUM() or
++ * AMD_OSVW_ERRATUM() macros. The latter is intended for newer errata that
++ * have an OSVW id assigned, which it takes as first argument. Both take a
++ * variable number of family-specific model-stepping ranges created by
++ * AMD_MODEL_RANGE().
++ *
++ * Example:
++ *
++ * const int amd_erratum_319[] =
++ * AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x10, 0x2, 0x1, 0x4, 0x2),
++ * AMD_MODEL_RANGE(0x10, 0x8, 0x0, 0x8, 0x0),
++ * AMD_MODEL_RANGE(0x10, 0x9, 0x0, 0x9, 0x0));
++ */
++
++#define AMD_LEGACY_ERRATUM(...) { -1, __VA_ARGS__, 0 }
++#define AMD_OSVW_ERRATUM(osvw_id, ...) { osvw_id, __VA_ARGS__, 0 }
++#define AMD_MODEL_RANGE(f, m_start, s_start, m_end, s_end) \
++ ((f << 24) | (m_start << 16) | (s_start << 12) | (m_end << 4) | (s_end))
++#define AMD_MODEL_RANGE_FAMILY(range) (((range) >> 24) & 0xff)
++#define AMD_MODEL_RANGE_START(range) (((range) >> 12) & 0xfff)
++#define AMD_MODEL_RANGE_END(range) ((range) & 0xfff)
++
++static const int amd_erratum_400[] =
++ AMD_OSVW_ERRATUM(1, AMD_MODEL_RANGE(0xf, 0x41, 0x2, 0xff, 0xf),
++ AMD_MODEL_RANGE(0x10, 0x2, 0x1, 0xff, 0xf));
++
++static const int amd_erratum_383[] =
++ AMD_OSVW_ERRATUM(3, AMD_MODEL_RANGE(0x10, 0, 0, 0xff, 0xf));
++
++/* #1054: Instructions Retired Performance Counter May Be Inaccurate */
++static const int amd_erratum_1054[] =
++ AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0, 0, 0x2f, 0xf));
++
++static const int amd_zenbleed[] =
++ AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0x30, 0x0, 0x4f, 0xf),
++ AMD_MODEL_RANGE(0x17, 0x60, 0x0, 0x7f, 0xf),
++ AMD_MODEL_RANGE(0x17, 0xa0, 0x0, 0xaf, 0xf));
++
++static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
++{
++ int osvw_id = *erratum++;
++ u32 range;
++ u32 ms;
++
++ if (osvw_id >= 0 && osvw_id < 65536 &&
++ cpu_has(cpu, X86_FEATURE_OSVW)) {
++ u64 osvw_len;
++
++ rdmsrl(MSR_AMD64_OSVW_ID_LENGTH, osvw_len);
++ if (osvw_id < osvw_len) {
++ u64 osvw_bits;
++
++ rdmsrl(MSR_AMD64_OSVW_STATUS + (osvw_id >> 6),
++ osvw_bits);
++ return osvw_bits & (1ULL << (osvw_id & 0x3f));
++ }
++ }
++
++ /* OSVW unavailable or ID unknown, match family-model-stepping range */
++ ms = (cpu->x86_model << 4) | cpu->x86_stepping;
++ while ((range = *erratum++))
++ if ((cpu->x86 == AMD_MODEL_RANGE_FAMILY(range)) &&
++ (ms >= AMD_MODEL_RANGE_START(range)) &&
++ (ms <= AMD_MODEL_RANGE_END(range)))
++ return true;
++
++ return false;
++}
++
+ static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p)
+ {
+ u32 gprs[8] = { 0 };
+@@ -916,6 +983,47 @@ static void init_amd_zn(struct cpuinfo_x86 *c)
+ }
+ }
+
++static bool cpu_has_zenbleed_microcode(void)
++{
++ u32 good_rev = 0;
++
++ switch (boot_cpu_data.x86_model) {
++ case 0x30 ... 0x3f: good_rev = 0x0830107a; break;
++ case 0x60 ... 0x67: good_rev = 0x0860010b; break;
++ case 0x68 ... 0x6f: good_rev = 0x08608105; break;
++ case 0x70 ... 0x7f: good_rev = 0x08701032; break;
++ case 0xa0 ... 0xaf: good_rev = 0x08a00008; break;
++
++ default:
++ return false;
++ break;
++ }
++
++ if (boot_cpu_data.microcode < good_rev)
++ return false;
++
++ return true;
++}
++
++static void zenbleed_check(struct cpuinfo_x86 *c)
++{
++ if (!cpu_has_amd_erratum(c, amd_zenbleed))
++ return;
++
++ if (cpu_has(c, X86_FEATURE_HYPERVISOR))
++ return;
++
++ if (!cpu_has(c, X86_FEATURE_AVX))
++ return;
++
++ if (!cpu_has_zenbleed_microcode()) {
++ pr_notice_once("Zenbleed: please update your microcode for the most optimal fix\n");
++ msr_set_bit(MSR_AMD64_DE_CFG, MSR_AMD64_DE_CFG_ZEN2_FP_BACKUP_FIX_BIT);
++ } else {
++ msr_clear_bit(MSR_AMD64_DE_CFG, MSR_AMD64_DE_CFG_ZEN2_FP_BACKUP_FIX_BIT);
++ }
++}
++
+ static void init_amd(struct cpuinfo_x86 *c)
+ {
+ early_init_amd(c);
+@@ -1020,6 +1128,8 @@ static void init_amd(struct cpuinfo_x86 *c)
+ if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) &&
+ cpu_has(c, X86_FEATURE_AUTOIBRS))
+ WARN_ON_ONCE(msr_set_bit(MSR_EFER, _EFER_AUTOIBRS));
++
++ zenbleed_check(c);
+ }
+
+ #ifdef CONFIG_X86_32
+@@ -1115,73 +1225,6 @@ static const struct cpu_dev amd_cpu_dev = {
+
+ cpu_dev_register(amd_cpu_dev);
+
+-/*
+- * AMD errata checking
+- *
+- * Errata are defined as arrays of ints using the AMD_LEGACY_ERRATUM() or
+- * AMD_OSVW_ERRATUM() macros. The latter is intended for newer errata that
+- * have an OSVW id assigned, which it takes as first argument. Both take a
+- * variable number of family-specific model-stepping ranges created by
+- * AMD_MODEL_RANGE().
+- *
+- * Example:
+- *
+- * const int amd_erratum_319[] =
+- * AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x10, 0x2, 0x1, 0x4, 0x2),
+- * AMD_MODEL_RANGE(0x10, 0x8, 0x0, 0x8, 0x0),
+- * AMD_MODEL_RANGE(0x10, 0x9, 0x0, 0x9, 0x0));
+- */
+-
+-#define AMD_LEGACY_ERRATUM(...) { -1, __VA_ARGS__, 0 }
+-#define AMD_OSVW_ERRATUM(osvw_id, ...) { osvw_id, __VA_ARGS__, 0 }
+-#define AMD_MODEL_RANGE(f, m_start, s_start, m_end, s_end) \
+- ((f << 24) | (m_start << 16) | (s_start << 12) | (m_end << 4) | (s_end))
+-#define AMD_MODEL_RANGE_FAMILY(range) (((range) >> 24) & 0xff)
+-#define AMD_MODEL_RANGE_START(range) (((range) >> 12) & 0xfff)
+-#define AMD_MODEL_RANGE_END(range) ((range) & 0xfff)
+-
+-static const int amd_erratum_400[] =
+- AMD_OSVW_ERRATUM(1, AMD_MODEL_RANGE(0xf, 0x41, 0x2, 0xff, 0xf),
+- AMD_MODEL_RANGE(0x10, 0x2, 0x1, 0xff, 0xf));
+-
+-static const int amd_erratum_383[] =
+- AMD_OSVW_ERRATUM(3, AMD_MODEL_RANGE(0x10, 0, 0, 0xff, 0xf));
+-
+-/* #1054: Instructions Retired Performance Counter May Be Inaccurate */
+-static const int amd_erratum_1054[] =
+- AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0, 0, 0x2f, 0xf));
+-
+-static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
+-{
+- int osvw_id = *erratum++;
+- u32 range;
+- u32 ms;
+-
+- if (osvw_id >= 0 && osvw_id < 65536 &&
+- cpu_has(cpu, X86_FEATURE_OSVW)) {
+- u64 osvw_len;
+-
+- rdmsrl(MSR_AMD64_OSVW_ID_LENGTH, osvw_len);
+- if (osvw_id < osvw_len) {
+- u64 osvw_bits;
+-
+- rdmsrl(MSR_AMD64_OSVW_STATUS + (osvw_id >> 6),
+- osvw_bits);
+- return osvw_bits & (1ULL << (osvw_id & 0x3f));
+- }
+- }
+-
+- /* OSVW unavailable or ID unknown, match family-model-stepping range */
+- ms = (cpu->x86_model << 4) | cpu->x86_stepping;
+- while ((range = *erratum++))
+- if ((cpu->x86 == AMD_MODEL_RANGE_FAMILY(range)) &&
+- (ms >= AMD_MODEL_RANGE_START(range)) &&
+- (ms <= AMD_MODEL_RANGE_END(range)))
+- return true;
+-
+- return false;
+-}
+-
+ static DEFINE_PER_CPU_READ_MOSTLY(unsigned long[4], amd_dr_addr_mask);
+
+ static unsigned int amd_msr_dr_addr_masks[] = {
+@@ -1235,3 +1278,15 @@ u32 amd_get_highest_perf(void)
+ return 255;
+ }
+ EXPORT_SYMBOL_GPL(amd_get_highest_perf);
++
++static void zenbleed_check_cpu(void *unused)
++{
++ struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
++
++ zenbleed_check(c);
++}
++
++void amd_check_microcode(void)
++{
++ on_each_cpu(zenbleed_check_cpu, NULL, 1);
++}
+diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
+index 80710a68ef7da..04eebbacb5503 100644
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -2341,6 +2341,8 @@ void microcode_check(struct cpuinfo_x86 *prev_info)
+
+ perf_check_microcode();
+
++ amd_check_microcode();
++
+ store_cpu_caps(&curr_info);
+
+ if (!memcmp(&prev_info->x86_capability, &curr_info.x86_capability,
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-27 11:41 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-27 11:41 UTC (permalink / raw
To: gentoo-commits
commit: ee51f4eb54bb7924e582f9391b383ca51a5efb09
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Jul 27 11:41:31 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Jul 27 11:41:31 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=ee51f4eb
Linux patch 6.4.7
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
1006_linux-6.4.7.patch | 9670 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 9670 insertions(+)
diff --git a/1006_linux-6.4.7.patch b/1006_linux-6.4.7.patch
new file mode 100644
index 00000000..3b9df10a
--- /dev/null
+++ b/1006_linux-6.4.7.patch
@@ -0,0 +1,9670 @@
+diff --git a/Makefile b/Makefile
+index 23ddaa3f30343..b3dc3b5f14cae 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 6
++SUBLEVEL = 7
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
+index e73af709cb7ad..88d8dfeed0db6 100644
+--- a/arch/arm64/include/asm/exception.h
++++ b/arch/arm64/include/asm/exception.h
+@@ -8,16 +8,11 @@
+ #define __ASM_EXCEPTION_H
+
+ #include <asm/esr.h>
+-#include <asm/kprobes.h>
+ #include <asm/ptrace.h>
+
+ #include <linux/interrupt.h>
+
+-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ #define __exception_irq_entry __irq_entry
+-#else
+-#define __exception_irq_entry __kprobes
+-#endif
+
+ static inline unsigned long disr_to_esr(u64 disr)
+ {
+diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
+index 9787503ff43fd..36d72d0300db9 100644
+--- a/arch/arm64/include/asm/kvm_host.h
++++ b/arch/arm64/include/asm/kvm_host.h
+@@ -701,6 +701,8 @@ struct kvm_vcpu_arch {
+ #define DBG_SS_ACTIVE_PENDING __vcpu_single_flag(sflags, BIT(5))
+ /* PMUSERENR for the guest EL0 is on physical CPU */
+ #define PMUSERENR_ON_CPU __vcpu_single_flag(sflags, BIT(6))
++/* WFI instruction trapped */
++#define IN_WFI __vcpu_single_flag(sflags, BIT(7))
+
+
+ /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
+diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
+index 93bd0975b15f5..f5e5730c2c1cd 100644
+--- a/arch/arm64/include/asm/kvm_pgtable.h
++++ b/arch/arm64/include/asm/kvm_pgtable.h
+@@ -556,22 +556,26 @@ int kvm_pgtable_stage2_wrprotect(struct kvm_pgtable *pgt, u64 addr, u64 size);
+ kvm_pte_t kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr);
+
+ /**
+- * kvm_pgtable_stage2_mkold() - Clear the access flag in a page-table entry.
++ * kvm_pgtable_stage2_test_clear_young() - Test and optionally clear the access
++ * flag in a page-table entry.
+ * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init*().
+ * @addr: Intermediate physical address to identify the page-table entry.
++ * @size: Size of the address range to visit.
++ * @mkold: True if the access flag should be cleared.
+ *
+ * The offset of @addr within a page is ignored.
+ *
+- * If there is a valid, leaf page-table entry used to translate @addr, then
+- * clear the access flag in that entry.
++ * Tests and conditionally clears the access flag for every valid, leaf
++ * page-table entry used to translate the range [@addr, @addr + @size).
+ *
+ * Note that it is the caller's responsibility to invalidate the TLB after
+ * calling this function to ensure that the updated permissions are visible
+ * to the CPUs.
+ *
+- * Return: The old page-table entry prior to clearing the flag, 0 on failure.
++ * Return: True if any of the visited PTEs had the access flag set.
+ */
+-kvm_pte_t kvm_pgtable_stage2_mkold(struct kvm_pgtable *pgt, u64 addr);
++bool kvm_pgtable_stage2_test_clear_young(struct kvm_pgtable *pgt, u64 addr,
++ u64 size, bool mkold);
+
+ /**
+ * kvm_pgtable_stage2_relax_perms() - Relax the permissions enforced by a
+@@ -593,18 +597,6 @@ kvm_pte_t kvm_pgtable_stage2_mkold(struct kvm_pgtable *pgt, u64 addr);
+ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr,
+ enum kvm_pgtable_prot prot);
+
+-/**
+- * kvm_pgtable_stage2_is_young() - Test whether a page-table entry has the
+- * access flag set.
+- * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init*().
+- * @addr: Intermediate physical address to identify the page-table entry.
+- *
+- * The offset of @addr within a page is ignored.
+- *
+- * Return: True if the page-table entry has the access flag set, false otherwise.
+- */
+-bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr);
+-
+ /**
+ * kvm_pgtable_stage2_flush_range() - Clean and invalidate data cache to Point
+ * of Coherency for guest stage-2 address
+diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
+index 2fbafa5cc7ac1..9d7d10d60bfdc 100644
+--- a/arch/arm64/kernel/fpsimd.c
++++ b/arch/arm64/kernel/fpsimd.c
+@@ -847,6 +847,8 @@ void sve_sync_from_fpsimd_zeropad(struct task_struct *task)
+ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
+ unsigned long vl, unsigned long flags)
+ {
++ bool free_sme = false;
++
+ if (flags & ~(unsigned long)(PR_SVE_VL_INHERIT |
+ PR_SVE_SET_VL_ONEXEC))
+ return -EINVAL;
+@@ -897,21 +899,36 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
+ task->thread.fp_type = FP_STATE_FPSIMD;
+ }
+
+- if (system_supports_sme() && type == ARM64_VEC_SME) {
+- task->thread.svcr &= ~(SVCR_SM_MASK |
+- SVCR_ZA_MASK);
+- clear_thread_flag(TIF_SME);
++ if (system_supports_sme()) {
++ if (type == ARM64_VEC_SME ||
++ !(task->thread.svcr & (SVCR_SM_MASK | SVCR_ZA_MASK))) {
++ /*
++ * We are changing the SME VL or weren't using
++ * SME anyway, discard the state and force a
++ * reallocation.
++ */
++ task->thread.svcr &= ~(SVCR_SM_MASK |
++ SVCR_ZA_MASK);
++ clear_thread_flag(TIF_SME);
++ free_sme = true;
++ }
+ }
+
+ if (task == current)
+ put_cpu_fpsimd_context();
+
+ /*
+- * Force reallocation of task SVE and SME state to the correct
+- * size on next use:
++ * Free the changed states if they are not in use, SME will be
++ * reallocated to the correct size on next use and we just
++ * allocate SVE now in case it is needed for use in streaming
++ * mode.
+ */
+- sve_free(task);
+- if (system_supports_sme() && type == ARM64_VEC_SME)
++ if (system_supports_sve()) {
++ sve_free(task);
++ sve_alloc(task, true);
++ }
++
++ if (free_sme)
+ sme_free(task);
+
+ task_set_vl(task, type, vl);
+diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
+index 05b022be885b6..644a687100aa3 100644
+--- a/arch/arm64/kvm/arch_timer.c
++++ b/arch/arm64/kvm/arch_timer.c
+@@ -827,8 +827,8 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
+ assign_clear_set_bit(tpt, CNTHCTL_EL1PCEN << 10, set, clr);
+ assign_clear_set_bit(tpc, CNTHCTL_EL1PCTEN << 10, set, clr);
+
+- /* This only happens on VHE, so use the CNTKCTL_EL1 accessor */
+- sysreg_clear_set(cntkctl_el1, clr, set);
++ /* This only happens on VHE, so use the CNTHCTL_EL2 accessor. */
++ sysreg_clear_set(cnthctl_el2, clr, set);
+ }
+
+ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
+@@ -1559,7 +1559,7 @@ no_vgic:
+ void kvm_timer_init_vhe(void)
+ {
+ if (cpus_have_final_cap(ARM64_HAS_ECV_CNTPOFF))
+- sysreg_clear_set(cntkctl_el1, 0, CNTHCTL_ECV);
++ sysreg_clear_set(cnthctl_el2, 0, CNTHCTL_ECV);
+ }
+
+ int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
+diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
+index 14391826241c8..7d8c3dd8b7ca9 100644
+--- a/arch/arm64/kvm/arm.c
++++ b/arch/arm64/kvm/arm.c
+@@ -704,13 +704,15 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
+ */
+ preempt_disable();
+ kvm_vgic_vmcr_sync(vcpu);
+- vgic_v4_put(vcpu, true);
++ vcpu_set_flag(vcpu, IN_WFI);
++ vgic_v4_put(vcpu);
+ preempt_enable();
+
+ kvm_vcpu_halt(vcpu);
+ vcpu_clear_flag(vcpu, IN_WFIT);
+
+ preempt_disable();
++ vcpu_clear_flag(vcpu, IN_WFI);
+ vgic_v4_load(vcpu);
+ preempt_enable();
+ }
+@@ -778,7 +780,7 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
+ if (kvm_check_request(KVM_REQ_RELOAD_GICv4, vcpu)) {
+ /* The distributor enable bits were changed */
+ preempt_disable();
+- vgic_v4_put(vcpu, false);
++ vgic_v4_put(vcpu);
+ vgic_v4_load(vcpu);
+ preempt_enable();
+ }
+@@ -1793,8 +1795,17 @@ static void _kvm_arch_hardware_enable(void *discard)
+
+ int kvm_arch_hardware_enable(void)
+ {
+- int was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
++ int was_enabled;
+
++ /*
++ * Most calls to this function are made with migration
++ * disabled, but not with preemption disabled. The former is
++ * enough to ensure correctness, but most of the helpers
++ * expect the later and will throw a tantrum otherwise.
++ */
++ preempt_disable();
++
++ was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
+ _kvm_arch_hardware_enable(NULL);
+
+ if (!was_enabled) {
+@@ -1802,6 +1813,8 @@ int kvm_arch_hardware_enable(void)
+ kvm_timer_cpu_up();
+ }
+
++ preempt_enable();
++
+ return 0;
+ }
+
+diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
+index 37bd64e912ca7..14fd3c581a3be 100644
+--- a/arch/arm64/kvm/hyp/pgtable.c
++++ b/arch/arm64/kvm/hyp/pgtable.c
+@@ -1173,25 +1173,54 @@ kvm_pte_t kvm_pgtable_stage2_mkyoung(struct kvm_pgtable *pgt, u64 addr)
+ return pte;
+ }
+
+-kvm_pte_t kvm_pgtable_stage2_mkold(struct kvm_pgtable *pgt, u64 addr)
++struct stage2_age_data {
++ bool mkold;
++ bool young;
++};
++
++static int stage2_age_walker(const struct kvm_pgtable_visit_ctx *ctx,
++ enum kvm_pgtable_walk_flags visit)
+ {
+- kvm_pte_t pte = 0;
+- stage2_update_leaf_attrs(pgt, addr, 1, 0, KVM_PTE_LEAF_ATTR_LO_S2_AF,
+- &pte, NULL, 0);
++ kvm_pte_t new = ctx->old & ~KVM_PTE_LEAF_ATTR_LO_S2_AF;
++ struct stage2_age_data *data = ctx->arg;
++
++ if (!kvm_pte_valid(ctx->old) || new == ctx->old)
++ return 0;
++
++ data->young = true;
++
++ /*
++ * stage2_age_walker() is always called while holding the MMU lock for
++ * write, so this will always succeed. Nonetheless, this deliberately
++ * follows the race detection pattern of the other stage-2 walkers in
++ * case the locking mechanics of the MMU notifiers is ever changed.
++ */
++ if (data->mkold && !stage2_try_set_pte(ctx, new))
++ return -EAGAIN;
++
+ /*
+ * "But where's the TLBI?!", you scream.
+ * "Over in the core code", I sigh.
+ *
+ * See the '->clear_flush_young()' callback on the KVM mmu notifier.
+ */
+- return pte;
++ return 0;
+ }
+
+-bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr)
++bool kvm_pgtable_stage2_test_clear_young(struct kvm_pgtable *pgt, u64 addr,
++ u64 size, bool mkold)
+ {
+- kvm_pte_t pte = 0;
+- stage2_update_leaf_attrs(pgt, addr, 1, 0, 0, &pte, NULL, 0);
+- return pte & KVM_PTE_LEAF_ATTR_LO_S2_AF;
++ struct stage2_age_data data = {
++ .mkold = mkold,
++ };
++ struct kvm_pgtable_walker walker = {
++ .cb = stage2_age_walker,
++ .arg = &data,
++ .flags = KVM_PGTABLE_WALK_LEAF,
++ };
++
++ WARN_ON(kvm_pgtable_walk(pgt, addr, size, &walker));
++ return data.young;
+ }
+
+ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr,
+diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
+index 3b9d4d24c361a..8a7e9381710ed 100644
+--- a/arch/arm64/kvm/mmu.c
++++ b/arch/arm64/kvm/mmu.c
+@@ -1639,27 +1639,25 @@ bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
+ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
+ {
+ u64 size = (range->end - range->start) << PAGE_SHIFT;
+- kvm_pte_t kpte;
+- pte_t pte;
+
+ if (!kvm->arch.mmu.pgt)
+ return false;
+
+- WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+-
+- kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt,
+- range->start << PAGE_SHIFT);
+- pte = __pte(kpte);
+- return pte_valid(pte) && pte_young(pte);
++ return kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt,
++ range->start << PAGE_SHIFT,
++ size, true);
+ }
+
+ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
+ {
++ u64 size = (range->end - range->start) << PAGE_SHIFT;
++
+ if (!kvm->arch.mmu.pgt)
+ return false;
+
+- return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt,
+- range->start << PAGE_SHIFT);
++ return kvm_pgtable_stage2_test_clear_young(kvm->arch.mmu.pgt,
++ range->start << PAGE_SHIFT,
++ size, false);
+ }
+
+ phys_addr_t kvm_mmu_get_httbr(void)
+diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
+index c3b8e132d5992..3dfc8b84e03e6 100644
+--- a/arch/arm64/kvm/vgic/vgic-v3.c
++++ b/arch/arm64/kvm/vgic/vgic-v3.c
+@@ -749,7 +749,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu)
+ {
+ struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
+
+- WARN_ON(vgic_v4_put(vcpu, false));
++ WARN_ON(vgic_v4_put(vcpu));
+
+ vgic_v3_vmcr_sync(vcpu);
+
+diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
+index c1c28fe680ba3..339a55194b2c6 100644
+--- a/arch/arm64/kvm/vgic/vgic-v4.c
++++ b/arch/arm64/kvm/vgic/vgic-v4.c
+@@ -336,14 +336,14 @@ void vgic_v4_teardown(struct kvm *kvm)
+ its_vm->vpes = NULL;
+ }
+
+-int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db)
++int vgic_v4_put(struct kvm_vcpu *vcpu)
+ {
+ struct its_vpe *vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
+
+ if (!vgic_supports_direct_msis(vcpu->kvm) || !vpe->resident)
+ return 0;
+
+- return its_make_vpe_non_resident(vpe, need_db);
++ return its_make_vpe_non_resident(vpe, !!vcpu_get_flag(vcpu, IN_WFI));
+ }
+
+ int vgic_v4_load(struct kvm_vcpu *vcpu)
+@@ -354,6 +354,9 @@ int vgic_v4_load(struct kvm_vcpu *vcpu)
+ if (!vgic_supports_direct_msis(vcpu->kvm) || vpe->resident)
+ return 0;
+
++ if (vcpu_get_flag(vcpu, IN_WFI))
++ return 0;
++
+ /*
+ * Before making the VPE resident, make sure the redistributor
+ * corresponding to our current CPU expects us here. See the
+diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
+index af6bc8403ee46..72b3c21820b96 100644
+--- a/arch/arm64/mm/mmu.c
++++ b/arch/arm64/mm/mmu.c
+@@ -451,7 +451,7 @@ static phys_addr_t pgd_pgtable_alloc(int shift)
+ void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
+ phys_addr_t size, pgprot_t prot)
+ {
+- if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
++ if (virt < PAGE_OFFSET) {
+ pr_warn("BUG: not creating mapping for %pa at 0x%016lx - outside kernel range\n",
+ &phys, virt);
+ return;
+@@ -478,7 +478,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
+ static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
+ phys_addr_t size, pgprot_t prot)
+ {
+- if ((virt >= PAGE_END) && (virt < VMALLOC_START)) {
++ if (virt < PAGE_OFFSET) {
+ pr_warn("BUG: not updating mapping for %pa at 0x%016lx - outside kernel range\n",
+ &phys, virt);
+ return;
+diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
+index b26da8efa616e..0ce5f13eabb1b 100644
+--- a/arch/arm64/net/bpf_jit_comp.c
++++ b/arch/arm64/net/bpf_jit_comp.c
+@@ -322,7 +322,13 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
+ *
+ */
+
+- emit_bti(A64_BTI_C, ctx);
++ /* bpf function may be invoked by 3 instruction types:
++ * 1. bl, attached via freplace to bpf prog via short jump
++ * 2. br, attached via freplace to bpf prog via long jump
++ * 3. blr, working as a function pointer, used by emit_call.
++ * So BTI_JC should used here to support both br and blr.
++ */
++ emit_bti(A64_BTI_JC, ctx);
+
+ emit(A64_MOV(1, A64_R(9), A64_LR), ctx);
+ emit(A64_NOP, ctx);
+diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
+index c9a0d1fa32090..930c8cc0812fc 100644
+--- a/arch/arm64/tools/sysreg
++++ b/arch/arm64/tools/sysreg
+@@ -1890,7 +1890,7 @@ Field 0 SM
+ EndSysreg
+
+ SysregFields HFGxTR_EL2
+-Field 63 nAMIAIR2_EL1
++Field 63 nAMAIR2_EL1
+ Field 62 nMAIR2_EL1
+ Field 61 nS2POR_EL1
+ Field 60 nPOR_EL1
+@@ -1905,9 +1905,9 @@ Field 52 nGCS_EL0
+ Res0 51
+ Field 50 nACCDATA_EL1
+ Field 49 ERXADDR_EL1
+-Field 48 EXRPFGCDN_EL1
+-Field 47 EXPFGCTL_EL1
+-Field 46 EXPFGF_EL1
++Field 48 ERXPFGCDN_EL1
++Field 47 ERXPFGCTL_EL1
++Field 46 ERXPFGF_EL1
+ Field 45 ERXMISCn_EL1
+ Field 44 ERXSTATUS_EL1
+ Field 43 ERXCTLR_EL1
+@@ -1922,8 +1922,8 @@ Field 35 TPIDR_EL0
+ Field 34 TPIDRRO_EL0
+ Field 33 TPIDR_EL1
+ Field 32 TCR_EL1
+-Field 31 SCTXNUM_EL0
+-Field 30 SCTXNUM_EL1
++Field 31 SCXTNUM_EL0
++Field 30 SCXTNUM_EL1
+ Field 29 SCTLR_EL1
+ Field 28 REVIDR_EL1
+ Field 27 PAR_EL1
+diff --git a/arch/ia64/kernel/sys_ia64.c b/arch/ia64/kernel/sys_ia64.c
+index 6e948d015332a..eb561cc93632f 100644
+--- a/arch/ia64/kernel/sys_ia64.c
++++ b/arch/ia64/kernel/sys_ia64.c
+@@ -63,7 +63,7 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
+ info.low_limit = addr;
+ info.high_limit = TASK_SIZE;
+ info.align_mask = align_mask;
+- info.align_offset = 0;
++ info.align_offset = pgoff << PAGE_SHIFT;
+ return vm_unmapped_area(&info);
+ }
+
+diff --git a/arch/mips/include/asm/dec/prom.h b/arch/mips/include/asm/dec/prom.h
+index 1e1247add1cf8..908e96e3a3117 100644
+--- a/arch/mips/include/asm/dec/prom.h
++++ b/arch/mips/include/asm/dec/prom.h
+@@ -70,7 +70,7 @@ static inline bool prom_is_rex(u32 magic)
+ */
+ typedef struct {
+ int pagesize;
+- unsigned char bitmap[0];
++ unsigned char bitmap[];
+ } memmap;
+
+
+diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
+index 39acccabf2ede..465b7cb9d44f4 100644
+--- a/arch/parisc/kernel/sys_parisc.c
++++ b/arch/parisc/kernel/sys_parisc.c
+@@ -26,12 +26,17 @@
+ #include <linux/compat.h>
+
+ /*
+- * Construct an artificial page offset for the mapping based on the physical
++ * Construct an artificial page offset for the mapping based on the virtual
+ * address of the kernel file mapping variable.
++ * If filp is zero the calculated pgoff value aliases the memory of the given
++ * address. This is useful for io_uring where the mapping shall alias a kernel
++ * address and a userspace address where both the kernel and the userspace
++ * access the same memory region.
+ */
+-#define GET_FILP_PGOFF(filp) \
+- (filp ? (((unsigned long) filp->f_mapping) >> 8) \
+- & ((SHM_COLOUR-1) >> PAGE_SHIFT) : 0UL)
++#define GET_FILP_PGOFF(filp, addr) \
++ ((filp ? (((unsigned long) filp->f_mapping) >> 8) \
++ & ((SHM_COLOUR-1) >> PAGE_SHIFT) : 0UL) \
++ + (addr >> PAGE_SHIFT))
+
+ static unsigned long shared_align_offset(unsigned long filp_pgoff,
+ unsigned long pgoff)
+@@ -111,7 +116,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
+ do_color_align = 0;
+ if (filp || (flags & MAP_SHARED))
+ do_color_align = 1;
+- filp_pgoff = GET_FILP_PGOFF(filp);
++ filp_pgoff = GET_FILP_PGOFF(filp, addr);
+
+ if (flags & MAP_FIXED) {
+ /* Even MAP_FIXED mappings must reside within TASK_SIZE */
+diff --git a/block/blk-mq.c b/block/blk-mq.c
+index b9f4546139894..73ed8ccb09ce8 100644
+--- a/block/blk-mq.c
++++ b/block/blk-mq.c
+@@ -4617,9 +4617,6 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
+ {
+ struct blk_mq_qe_pair *qe;
+
+- if (!q->elevator)
+- return true;
+-
+ qe = kmalloc(sizeof(*qe), GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
+ if (!qe)
+ return false;
+@@ -4627,6 +4624,12 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
+ /* q->elevator needs protection from ->sysfs_lock */
+ mutex_lock(&q->sysfs_lock);
+
++ /* the check has to be done with holding sysfs_lock */
++ if (!q->elevator) {
++ kfree(qe);
++ goto unlock;
++ }
++
+ INIT_LIST_HEAD(&qe->node);
+ qe->q = q;
+ qe->type = q->elevator->type;
+@@ -4634,6 +4637,7 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
+ __elevator_get(qe->type);
+ list_add(&qe->node, head);
+ elevator_disable(q);
++unlock:
+ mutex_unlock(&q->sysfs_lock);
+
+ return true;
+diff --git a/drivers/accel/qaic/qaic_control.c b/drivers/accel/qaic/qaic_control.c
+index 5c57f7b4494e4..cfbc92da426fa 100644
+--- a/drivers/accel/qaic/qaic_control.c
++++ b/drivers/accel/qaic/qaic_control.c
+@@ -14,6 +14,7 @@
+ #include <linux/mm.h>
+ #include <linux/moduleparam.h>
+ #include <linux/mutex.h>
++#include <linux/overflow.h>
+ #include <linux/pci.h>
+ #include <linux/scatterlist.h>
+ #include <linux/types.h>
+@@ -366,7 +367,7 @@ static int encode_passthrough(struct qaic_device *qdev, void *trans, struct wrap
+ if (in_trans->hdr.len % 8 != 0)
+ return -EINVAL;
+
+- if (msg_hdr_len + in_trans->hdr.len > QAIC_MANAGE_EXT_MSG_LENGTH)
++ if (size_add(msg_hdr_len, in_trans->hdr.len) > QAIC_MANAGE_EXT_MSG_LENGTH)
+ return -ENOSPC;
+
+ trans_wrapper = add_wrapper(wrappers,
+@@ -418,9 +419,12 @@ static int find_and_map_user_pages(struct qaic_device *qdev,
+ }
+
+ ret = get_user_pages_fast(xfer_start_addr, nr_pages, 0, page_list);
+- if (ret < 0 || ret != nr_pages) {
+- ret = -EFAULT;
++ if (ret < 0)
+ goto free_page_list;
++ if (ret != nr_pages) {
++ nr_pages = ret;
++ ret = -EFAULT;
++ goto put_pages;
+ }
+
+ sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
+@@ -557,11 +561,8 @@ static int encode_dma(struct qaic_device *qdev, void *trans, struct wrapper_list
+ msg = &wrapper->msg;
+ msg_hdr_len = le32_to_cpu(msg->hdr.len);
+
+- if (msg_hdr_len > (UINT_MAX - QAIC_MANAGE_EXT_MSG_LENGTH))
+- return -EINVAL;
+-
+ /* There should be enough space to hold at least one ASP entry. */
+- if (msg_hdr_len + sizeof(*out_trans) + sizeof(struct wire_addr_size_pair) >
++ if (size_add(msg_hdr_len, sizeof(*out_trans) + sizeof(struct wire_addr_size_pair)) >
+ QAIC_MANAGE_EXT_MSG_LENGTH)
+ return -ENOMEM;
+
+@@ -634,7 +635,7 @@ static int encode_activate(struct qaic_device *qdev, void *trans, struct wrapper
+ msg = &wrapper->msg;
+ msg_hdr_len = le32_to_cpu(msg->hdr.len);
+
+- if (msg_hdr_len + sizeof(*out_trans) > QAIC_MANAGE_MAX_MSG_LENGTH)
++ if (size_add(msg_hdr_len, sizeof(*out_trans)) > QAIC_MANAGE_MAX_MSG_LENGTH)
+ return -ENOSPC;
+
+ if (!in_trans->queue_size)
+@@ -718,7 +719,7 @@ static int encode_status(struct qaic_device *qdev, void *trans, struct wrapper_l
+ msg = &wrapper->msg;
+ msg_hdr_len = le32_to_cpu(msg->hdr.len);
+
+- if (msg_hdr_len + in_trans->hdr.len > QAIC_MANAGE_MAX_MSG_LENGTH)
++ if (size_add(msg_hdr_len, in_trans->hdr.len) > QAIC_MANAGE_MAX_MSG_LENGTH)
+ return -ENOSPC;
+
+ trans_wrapper = add_wrapper(wrappers, sizeof(*trans_wrapper));
+@@ -748,7 +749,8 @@ static int encode_message(struct qaic_device *qdev, struct manage_msg *user_msg,
+ int ret;
+ int i;
+
+- if (!user_msg->count) {
++ if (!user_msg->count ||
++ user_msg->len < sizeof(*trans_hdr)) {
+ ret = -EINVAL;
+ goto out;
+ }
+@@ -765,12 +767,13 @@ static int encode_message(struct qaic_device *qdev, struct manage_msg *user_msg,
+ }
+
+ for (i = 0; i < user_msg->count; ++i) {
+- if (user_len >= user_msg->len) {
++ if (user_len > user_msg->len - sizeof(*trans_hdr)) {
+ ret = -EINVAL;
+ break;
+ }
+ trans_hdr = (struct qaic_manage_trans_hdr *)(user_msg->data + user_len);
+- if (user_len + trans_hdr->len > user_msg->len) {
++ if (trans_hdr->len < sizeof(trans_hdr) ||
++ size_add(user_len, trans_hdr->len) > user_msg->len) {
+ ret = -EINVAL;
+ break;
+ }
+@@ -953,15 +956,23 @@ static int decode_message(struct qaic_device *qdev, struct manage_msg *user_msg,
+ int ret;
+ int i;
+
+- if (msg_hdr_len > QAIC_MANAGE_MAX_MSG_LENGTH)
++ if (msg_hdr_len < sizeof(*trans_hdr) ||
++ msg_hdr_len > QAIC_MANAGE_MAX_MSG_LENGTH)
+ return -EINVAL;
+
+ user_msg->len = 0;
+ user_msg->count = le32_to_cpu(msg->hdr.count);
+
+ for (i = 0; i < user_msg->count; ++i) {
++ u32 hdr_len;
++
++ if (msg_len > msg_hdr_len - sizeof(*trans_hdr))
++ return -EINVAL;
++
+ trans_hdr = (struct wire_trans_hdr *)(msg->data + msg_len);
+- if (msg_len + le32_to_cpu(trans_hdr->len) > msg_hdr_len)
++ hdr_len = le32_to_cpu(trans_hdr->len);
++ if (hdr_len < sizeof(*trans_hdr) ||
++ size_add(msg_len, hdr_len) > msg_hdr_len)
+ return -EINVAL;
+
+ switch (le32_to_cpu(trans_hdr->type)) {
+diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
+index 475e1eddfa3b4..ef77c14c72a92 100644
+--- a/drivers/acpi/button.c
++++ b/drivers/acpi/button.c
+@@ -77,6 +77,15 @@ static const struct dmi_system_id dmi_lid_quirks[] = {
+ },
+ .driver_data = (void *)(long)ACPI_BUTTON_LID_INIT_DISABLED,
+ },
++ {
++ /* Nextbook Ares 8A tablet, _LID device always reports lid closed */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "CherryTrail"),
++ DMI_MATCH(DMI_BIOS_VERSION, "M882"),
++ },
++ .driver_data = (void *)(long)ACPI_BUTTON_LID_INIT_DISABLED,
++ },
+ {
+ /*
+ * Lenovo Yoga 9 14ITL5, initial notification of the LID device
+diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
+index 0800a9d775580..1dd8d5aebf678 100644
+--- a/drivers/acpi/resource.c
++++ b/drivers/acpi/resource.c
+@@ -470,52 +470,6 @@ static const struct dmi_system_id asus_laptop[] = {
+ { }
+ };
+
+-static const struct dmi_system_id lenovo_laptop[] = {
+- {
+- .ident = "LENOVO IdeaPad Flex 5 14ALC7",
+- .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "82R9"),
+- },
+- },
+- {
+- .ident = "LENOVO IdeaPad Flex 5 16ALC7",
+- .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "82RA"),
+- },
+- },
+- { }
+-};
+-
+-static const struct dmi_system_id tongfang_gm_rg[] = {
+- {
+- .ident = "TongFang GMxRGxx/XMG CORE 15 (M22)/TUXEDO Stellaris 15 Gen4 AMD",
+- .matches = {
+- DMI_MATCH(DMI_BOARD_NAME, "GMxRGxx"),
+- },
+- },
+- { }
+-};
+-
+-static const struct dmi_system_id maingear_laptop[] = {
+- {
+- .ident = "MAINGEAR Vector Pro 2 15",
+- .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "Micro Electronics Inc"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "MG-VCP2-15A3070T"),
+- }
+- },
+- {
+- .ident = "MAINGEAR Vector Pro 2 17",
+- .matches = {
+- DMI_MATCH(DMI_SYS_VENDOR, "Micro Electronics Inc"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "MG-VCP2-17A3070T"),
+- },
+- },
+- { }
+-};
+-
+ static const struct dmi_system_id lg_laptop[] = {
+ {
+ .ident = "LG Electronics 17U70P",
+@@ -539,10 +493,6 @@ struct irq_override_cmp {
+ static const struct irq_override_cmp override_table[] = {
+ { medion_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false },
+ { asus_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false },
+- { lenovo_laptop, 6, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, true },
+- { lenovo_laptop, 10, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, true },
+- { tongfang_gm_rg, 1, ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_LOW, 1, true },
+- { maingear_laptop, 1, ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_LOW, 1, true },
+ { lg_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false },
+ };
+
+@@ -562,16 +512,6 @@ static bool acpi_dev_irq_override(u32 gsi, u8 triggering, u8 polarity,
+ return entry->override;
+ }
+
+-#ifdef CONFIG_X86
+- /*
+- * IRQ override isn't needed on modern AMD Zen systems and
+- * this override breaks active low IRQs on AMD Ryzen 6000 and
+- * newer systems. Skip it.
+- */
+- if (boot_cpu_has(X86_FEATURE_ZEN))
+- return false;
+-#endif
+-
+ return true;
+ }
+
+diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
+index bcc25d457581d..e7d04ab864a16 100644
+--- a/drivers/acpi/video_detect.c
++++ b/drivers/acpi/video_detect.c
+@@ -470,6 +470,22 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "82BK"),
+ },
+ },
++ {
++ .callback = video_detect_force_native,
++ /* Lenovo ThinkPad X131e (3371 AMD version) */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "3371"),
++ },
++ },
++ {
++ .callback = video_detect_force_native,
++ /* Apple iMac11,3 */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Apple Inc."),
++ DMI_MATCH(DMI_PRODUCT_NAME, "iMac11,3"),
++ },
++ },
+ {
+ /* https://bugzilla.redhat.com/show_bug.cgi?id=1217249 */
+ .callback = video_detect_force_native,
+@@ -512,6 +528,14 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "Precision 7510"),
+ },
+ },
++ {
++ .callback = video_detect_force_native,
++ /* Dell Studio 1569 */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Studio 1569"),
++ },
++ },
+ {
+ .callback = video_detect_force_native,
+ /* Acer Aspire 3830TG */
+diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c
+index 9c2d6f35f88a0..c2b925f8cd4e4 100644
+--- a/drivers/acpi/x86/utils.c
++++ b/drivers/acpi/x86/utils.c
+@@ -259,10 +259,11 @@ bool force_storage_d3(void)
+ * drivers/platform/x86/x86-android-tablets.c kernel module.
+ */
+ #define ACPI_QUIRK_SKIP_I2C_CLIENTS BIT(0)
+-#define ACPI_QUIRK_UART1_TTY_UART2_SKIP BIT(1)
+-#define ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY BIT(2)
+-#define ACPI_QUIRK_USE_ACPI_AC_AND_BATTERY BIT(3)
+-#define ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS BIT(4)
++#define ACPI_QUIRK_UART1_SKIP BIT(1)
++#define ACPI_QUIRK_UART1_TTY_UART2_SKIP BIT(2)
++#define ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY BIT(3)
++#define ACPI_QUIRK_USE_ACPI_AC_AND_BATTERY BIT(4)
++#define ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS BIT(5)
+
+ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
+ /*
+@@ -319,6 +320,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
+ DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "YETI-11"),
+ },
+ .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS |
++ ACPI_QUIRK_UART1_SKIP |
+ ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY |
+ ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS),
+ },
+@@ -365,7 +367,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
+ ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY),
+ },
+ {
+- /* Nextbook Ares 8 */
++ /* Nextbook Ares 8 (BYT version)*/
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "M890BAP"),
+@@ -374,6 +376,16 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
+ ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY |
+ ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS),
+ },
++ {
++ /* Nextbook Ares 8A (CHT version)*/
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "CherryTrail"),
++ DMI_MATCH(DMI_BIOS_VERSION, "M882"),
++ },
++ .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS |
++ ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY),
++ },
+ {
+ /* Whitelabel (sold as various brands) TM800A550L */
+ .matches = {
+@@ -392,6 +404,7 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = {
+ #if IS_ENABLED(CONFIG_X86_ANDROID_TABLETS)
+ static const struct acpi_device_id i2c_acpi_known_good_ids[] = {
+ { "10EC5640", 0 }, /* RealTek ALC5640 audio codec */
++ { "10EC5651", 0 }, /* RealTek ALC5651 audio codec */
+ { "INT33F4", 0 }, /* X-Powers AXP288 PMIC */
+ { "INT33FD", 0 }, /* Intel Crystal Cove PMIC */
+ { "INT34D3", 0 }, /* Intel Whiskey Cove PMIC */
+@@ -438,6 +451,9 @@ int acpi_quirk_skip_serdev_enumeration(struct device *controller_parent, bool *s
+ if (dmi_id)
+ quirks = (unsigned long)dmi_id->driver_data;
+
++ if ((quirks & ACPI_QUIRK_UART1_SKIP) && uid == 1)
++ *skip = true;
++
+ if (quirks & ACPI_QUIRK_UART1_TTY_UART2_SKIP) {
+ if (uid == 1)
+ return -ENODEV; /* Create tty cdev instead of serdev */
+diff --git a/drivers/base/regmap/regmap-i2c.c b/drivers/base/regmap/regmap-i2c.c
+index 980e5ce6a3a35..3ec611dc0c09f 100644
+--- a/drivers/base/regmap/regmap-i2c.c
++++ b/drivers/base/regmap/regmap-i2c.c
+@@ -242,8 +242,8 @@ static int regmap_i2c_smbus_i2c_read(void *context, const void *reg,
+ static const struct regmap_bus regmap_i2c_smbus_i2c_block = {
+ .write = regmap_i2c_smbus_i2c_write,
+ .read = regmap_i2c_smbus_i2c_read,
+- .max_raw_read = I2C_SMBUS_BLOCK_MAX,
+- .max_raw_write = I2C_SMBUS_BLOCK_MAX,
++ .max_raw_read = I2C_SMBUS_BLOCK_MAX - 1,
++ .max_raw_write = I2C_SMBUS_BLOCK_MAX - 1,
+ };
+
+ static int regmap_i2c_smbus_i2c_write_reg16(void *context, const void *data,
+@@ -299,8 +299,8 @@ static int regmap_i2c_smbus_i2c_read_reg16(void *context, const void *reg,
+ static const struct regmap_bus regmap_i2c_smbus_i2c_block_reg16 = {
+ .write = regmap_i2c_smbus_i2c_write_reg16,
+ .read = regmap_i2c_smbus_i2c_read_reg16,
+- .max_raw_read = I2C_SMBUS_BLOCK_MAX,
+- .max_raw_write = I2C_SMBUS_BLOCK_MAX,
++ .max_raw_read = I2C_SMBUS_BLOCK_MAX - 2,
++ .max_raw_write = I2C_SMBUS_BLOCK_MAX - 2,
+ };
+
+ static const struct regmap_bus *regmap_get_i2c_bus(struct i2c_client *i2c,
+diff --git a/drivers/base/regmap/regmap-spi-avmm.c b/drivers/base/regmap/regmap-spi-avmm.c
+index 6af692844c196..4c2b94b3e30be 100644
+--- a/drivers/base/regmap/regmap-spi-avmm.c
++++ b/drivers/base/regmap/regmap-spi-avmm.c
+@@ -660,7 +660,7 @@ static const struct regmap_bus regmap_spi_avmm_bus = {
+ .reg_format_endian_default = REGMAP_ENDIAN_NATIVE,
+ .val_format_endian_default = REGMAP_ENDIAN_NATIVE,
+ .max_raw_read = SPI_AVMM_VAL_SIZE * MAX_READ_CNT,
+- .max_raw_write = SPI_AVMM_REG_SIZE + SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT,
++ .max_raw_write = SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT,
+ .free_context = spi_avmm_bridge_ctx_free,
+ };
+
+diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
+index fa2d3fba6ac9d..db7851f0e3b8c 100644
+--- a/drivers/base/regmap/regmap.c
++++ b/drivers/base/regmap/regmap.c
+@@ -2082,8 +2082,6 @@ int _regmap_raw_write(struct regmap *map, unsigned int reg,
+ size_t val_count = val_len / val_bytes;
+ size_t chunk_count, chunk_bytes;
+ size_t chunk_regs = val_count;
+- size_t max_data = map->max_raw_write - map->format.reg_bytes -
+- map->format.pad_bytes;
+ int ret, i;
+
+ if (!val_count)
+@@ -2091,8 +2089,8 @@ int _regmap_raw_write(struct regmap *map, unsigned int reg,
+
+ if (map->use_single_write)
+ chunk_regs = 1;
+- else if (map->max_raw_write && val_len > max_data)
+- chunk_regs = max_data / val_bytes;
++ else if (map->max_raw_write && val_len > map->max_raw_write)
++ chunk_regs = map->max_raw_write / val_bytes;
+
+ chunk_count = val_count / chunk_regs;
+ chunk_bytes = chunk_regs * val_bytes;
+diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
+index 2a8e2bb038f58..50e23762ec5e9 100644
+--- a/drivers/bluetooth/btusb.c
++++ b/drivers/bluetooth/btusb.c
+@@ -4099,6 +4099,7 @@ static int btusb_probe(struct usb_interface *intf,
+ BT_DBG("intf %p id %p", intf, id);
+
+ if ((id->driver_info & BTUSB_IFNUM_2) &&
++ (intf->cur_altsetting->desc.bInterfaceNumber != 0) &&
+ (intf->cur_altsetting->desc.bInterfaceNumber != 2))
+ return -ENODEV;
+
+diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
+index 2a594b754af14..b630f2acc105e 100644
+--- a/drivers/dma-buf/dma-resv.c
++++ b/drivers/dma-buf/dma-resv.c
+@@ -571,6 +571,7 @@ int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
+ dma_resv_for_each_fence_unlocked(&cursor, fence) {
+
+ if (dma_resv_iter_is_restarted(&cursor)) {
++ struct dma_fence **new_fences;
+ unsigned int count;
+
+ while (*num_fences)
+@@ -579,13 +580,17 @@ int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
+ count = cursor.num_fences + 1;
+
+ /* Eventually re-allocate the array */
+- *fences = krealloc_array(*fences, count,
+- sizeof(void *),
+- GFP_KERNEL);
+- if (count && !*fences) {
++ new_fences = krealloc_array(*fences, count,
++ sizeof(void *),
++ GFP_KERNEL);
++ if (count && !new_fences) {
++ kfree(*fences);
++ *fences = NULL;
++ *num_fences = 0;
+ dma_resv_iter_end(&cursor);
+ return -ENOMEM;
+ }
++ *fences = new_fences;
+ }
+
+ (*fences)[(*num_fences)++] = dma_fence_get(fence);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+index 53ff91fc6cf6b..d0748bcfad16b 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+@@ -55,8 +55,9 @@ static enum hrtimer_restart amdgpu_vkms_vblank_simulate(struct hrtimer *timer)
+ DRM_WARN("%s: vblank timer overrun\n", __func__);
+
+ ret = drm_crtc_handle_vblank(crtc);
++ /* Don't queue timer again when vblank is disabled. */
+ if (!ret)
+- DRM_ERROR("amdgpu_vkms failure on handling vblank");
++ return HRTIMER_NORESTART;
+
+ return HRTIMER_RESTART;
+ }
+@@ -81,7 +82,7 @@ static void amdgpu_vkms_disable_vblank(struct drm_crtc *crtc)
+ {
+ struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
+
+- hrtimer_cancel(&amdgpu_crtc->vblank_timer);
++ hrtimer_try_to_cancel(&amdgpu_crtc->vblank_timer);
+ }
+
+ static bool amdgpu_vkms_get_vblank_timestamp(struct drm_crtc *crtc,
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 44f4c74419740..812d7dd4c04b4 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -416,12 +416,12 @@ static void dm_pflip_high_irq(void *interrupt_params)
+
+ spin_lock_irqsave(&adev_to_drm(adev)->event_lock, flags);
+
+- if (amdgpu_crtc->pflip_status != AMDGPU_FLIP_SUBMITTED){
+- DC_LOG_PFLIP("amdgpu_crtc->pflip_status = %d !=AMDGPU_FLIP_SUBMITTED(%d) on crtc:%d[%p] \n",
+- amdgpu_crtc->pflip_status,
+- AMDGPU_FLIP_SUBMITTED,
+- amdgpu_crtc->crtc_id,
+- amdgpu_crtc);
++ if (amdgpu_crtc->pflip_status != AMDGPU_FLIP_SUBMITTED) {
++ DC_LOG_PFLIP("amdgpu_crtc->pflip_status = %d !=AMDGPU_FLIP_SUBMITTED(%d) on crtc:%d[%p]\n",
++ amdgpu_crtc->pflip_status,
++ AMDGPU_FLIP_SUBMITTED,
++ amdgpu_crtc->crtc_id,
++ amdgpu_crtc);
+ spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags);
+ return;
+ }
+@@ -875,7 +875,7 @@ static int dm_set_powergating_state(void *handle,
+ }
+
+ /* Prototypes of private functions */
+-static int dm_early_init(void* handle);
++static int dm_early_init(void *handle);
+
+ /* Allocate memory for FBC compressed data */
+ static void amdgpu_dm_fbc_init(struct drm_connector *connector)
+@@ -1274,7 +1274,7 @@ static void mmhub_read_system_context(struct amdgpu_device *adev, struct dc_phy_
+ pa_config->system_aperture.start_addr = (uint64_t)logical_addr_low << 18;
+ pa_config->system_aperture.end_addr = (uint64_t)logical_addr_high << 18;
+
+- pa_config->system_aperture.agp_base = (uint64_t)agp_base << 24 ;
++ pa_config->system_aperture.agp_base = (uint64_t)agp_base << 24;
+ pa_config->system_aperture.agp_bot = (uint64_t)agp_bot << 24;
+ pa_config->system_aperture.agp_top = (uint64_t)agp_top << 24;
+
+@@ -1339,6 +1339,15 @@ static void dm_handle_hpd_rx_offload_work(struct work_struct *work)
+ if (amdgpu_in_reset(adev))
+ goto skip;
+
++ if (offload_work->data.bytes.device_service_irq.bits.UP_REQ_MSG_RDY ||
++ offload_work->data.bytes.device_service_irq.bits.DOWN_REP_MSG_RDY) {
++ dm_handle_mst_sideband_msg_ready_event(&aconnector->mst_mgr, DOWN_OR_UP_MSG_RDY_EVENT);
++ spin_lock_irqsave(&offload_work->offload_wq->offload_lock, flags);
++ offload_work->offload_wq->is_handling_mst_msg_rdy_event = false;
++ spin_unlock_irqrestore(&offload_work->offload_wq->offload_lock, flags);
++ goto skip;
++ }
++
+ mutex_lock(&adev->dm.dc_lock);
+ if (offload_work->data.bytes.device_service_irq.bits.AUTOMATED_TEST) {
+ dc_link_dp_handle_automated_test(dc_link);
+@@ -1357,8 +1366,7 @@ static void dm_handle_hpd_rx_offload_work(struct work_struct *work)
+ DP_TEST_RESPONSE,
+ &test_response.raw,
+ sizeof(test_response));
+- }
+- else if ((dc_link->connector_signal != SIGNAL_TYPE_EDP) &&
++ } else if ((dc_link->connector_signal != SIGNAL_TYPE_EDP) &&
+ dc_link_check_link_loss_status(dc_link, &offload_work->data) &&
+ dc_link_dp_allow_hpd_rx_irq(dc_link)) {
+ /* offload_work->data is from handle_hpd_rx_irq->
+@@ -1546,7 +1554,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ mutex_init(&adev->dm.dc_lock);
+ mutex_init(&adev->dm.audio_lock);
+
+- if(amdgpu_dm_irq_init(adev)) {
++ if (amdgpu_dm_irq_init(adev)) {
+ DRM_ERROR("amdgpu: failed to initialize DM IRQ support.\n");
+ goto error;
+ }
+@@ -1691,9 +1699,8 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ if (amdgpu_dc_debug_mask & DC_DISABLE_STUTTER)
+ adev->dm.dc->debug.disable_stutter = true;
+
+- if (amdgpu_dc_debug_mask & DC_DISABLE_DSC) {
++ if (amdgpu_dc_debug_mask & DC_DISABLE_DSC)
+ adev->dm.dc->debug.disable_dsc = true;
+- }
+
+ if (amdgpu_dc_debug_mask & DC_DISABLE_CLOCK_GATING)
+ adev->dm.dc->debug.disable_clock_gate = true;
+@@ -1937,8 +1944,6 @@ static void amdgpu_dm_fini(struct amdgpu_device *adev)
+ mutex_destroy(&adev->dm.audio_lock);
+ mutex_destroy(&adev->dm.dc_lock);
+ mutex_destroy(&adev->dm.dpia_aux_lock);
+-
+- return;
+ }
+
+ static int load_dmcu_fw(struct amdgpu_device *adev)
+@@ -1947,7 +1952,7 @@ static int load_dmcu_fw(struct amdgpu_device *adev)
+ int r;
+ const struct dmcu_firmware_header_v1_0 *hdr;
+
+- switch(adev->asic_type) {
++ switch (adev->asic_type) {
+ #if defined(CONFIG_DRM_AMD_DC_SI)
+ case CHIP_TAHITI:
+ case CHIP_PITCAIRN:
+@@ -2704,7 +2709,7 @@ static void dm_gpureset_commit_state(struct dc_state *dc_state,
+ struct dc_scaling_info scaling_infos[MAX_SURFACES];
+ struct dc_flip_addrs flip_addrs[MAX_SURFACES];
+ struct dc_stream_update stream_update;
+- } * bundle;
++ } *bundle;
+ int k, m;
+
+ bundle = kzalloc(sizeof(*bundle), GFP_KERNEL);
+@@ -2734,8 +2739,6 @@ static void dm_gpureset_commit_state(struct dc_state *dc_state,
+
+ cleanup:
+ kfree(bundle);
+-
+- return;
+ }
+
+ static int dm_resume(void *handle)
+@@ -2949,8 +2952,7 @@ static const struct amd_ip_funcs amdgpu_dm_funcs = {
+ .set_powergating_state = dm_set_powergating_state,
+ };
+
+-const struct amdgpu_ip_block_version dm_ip_block =
+-{
++const struct amdgpu_ip_block_version dm_ip_block = {
+ .type = AMD_IP_BLOCK_TYPE_DCE,
+ .major = 1,
+ .minor = 0,
+@@ -2995,9 +2997,12 @@ static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector)
+ caps->ext_caps = &aconnector->dc_link->dpcd_sink_ext_caps;
+ caps->aux_support = false;
+
+- if (caps->ext_caps->bits.oled == 1 /*||
+- caps->ext_caps->bits.sdr_aux_backlight_control == 1 ||
+- caps->ext_caps->bits.hdr_aux_backlight_control == 1*/)
++ if (caps->ext_caps->bits.oled == 1
++ /*
++ * ||
++ * caps->ext_caps->bits.sdr_aux_backlight_control == 1 ||
++ * caps->ext_caps->bits.hdr_aux_backlight_control == 1
++ */)
+ caps->aux_support = true;
+
+ if (amdgpu_backlight == 0)
+@@ -3231,86 +3236,6 @@ static void handle_hpd_irq(void *param)
+
+ }
+
+-static void dm_handle_mst_sideband_msg(struct amdgpu_dm_connector *aconnector)
+-{
+- u8 esi[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = { 0 };
+- u8 dret;
+- bool new_irq_handled = false;
+- int dpcd_addr;
+- int dpcd_bytes_to_read;
+-
+- const int max_process_count = 30;
+- int process_count = 0;
+-
+- const struct dc_link_status *link_status = dc_link_get_status(aconnector->dc_link);
+-
+- if (link_status->dpcd_caps->dpcd_rev.raw < 0x12) {
+- dpcd_bytes_to_read = DP_LANE0_1_STATUS - DP_SINK_COUNT;
+- /* DPCD 0x200 - 0x201 for downstream IRQ */
+- dpcd_addr = DP_SINK_COUNT;
+- } else {
+- dpcd_bytes_to_read = DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI;
+- /* DPCD 0x2002 - 0x2005 for downstream IRQ */
+- dpcd_addr = DP_SINK_COUNT_ESI;
+- }
+-
+- dret = drm_dp_dpcd_read(
+- &aconnector->dm_dp_aux.aux,
+- dpcd_addr,
+- esi,
+- dpcd_bytes_to_read);
+-
+- while (dret == dpcd_bytes_to_read &&
+- process_count < max_process_count) {
+- u8 ack[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = {};
+- u8 retry;
+- dret = 0;
+-
+- process_count++;
+-
+- DRM_DEBUG_DRIVER("ESI %02x %02x %02x\n", esi[0], esi[1], esi[2]);
+- /* handle HPD short pulse irq */
+- if (aconnector->mst_mgr.mst_state)
+- drm_dp_mst_hpd_irq_handle_event(&aconnector->mst_mgr,
+- esi,
+- ack,
+- &new_irq_handled);
+-
+- if (new_irq_handled) {
+- /* ACK at DPCD to notify down stream */
+- for (retry = 0; retry < 3; retry++) {
+- ssize_t wret;
+-
+- wret = drm_dp_dpcd_writeb(&aconnector->dm_dp_aux.aux,
+- dpcd_addr + 1,
+- ack[1]);
+- if (wret == 1)
+- break;
+- }
+-
+- if (retry == 3) {
+- DRM_ERROR("Failed to ack MST event.\n");
+- return;
+- }
+-
+- drm_dp_mst_hpd_irq_send_new_request(&aconnector->mst_mgr);
+- /* check if there is new irq to be handled */
+- dret = drm_dp_dpcd_read(
+- &aconnector->dm_dp_aux.aux,
+- dpcd_addr,
+- esi,
+- dpcd_bytes_to_read);
+-
+- new_irq_handled = false;
+- } else {
+- break;
+- }
+- }
+-
+- if (process_count == max_process_count)
+- DRM_DEBUG_DRIVER("Loop exceeded max iterations\n");
+-}
+-
+ static void schedule_hpd_rx_offload_work(struct hpd_rx_irq_offload_work_queue *offload_wq,
+ union hpd_irq_data hpd_irq_data)
+ {
+@@ -3372,7 +3297,23 @@ static void handle_hpd_rx_irq(void *param)
+ if (dc_link_dp_allow_hpd_rx_irq(dc_link)) {
+ if (hpd_irq_data.bytes.device_service_irq.bits.UP_REQ_MSG_RDY ||
+ hpd_irq_data.bytes.device_service_irq.bits.DOWN_REP_MSG_RDY) {
+- dm_handle_mst_sideband_msg(aconnector);
++ bool skip = false;
++
++ /*
++ * DOWN_REP_MSG_RDY is also handled by polling method
++ * mgr->cbs->poll_hpd_irq()
++ */
++ spin_lock(&offload_wq->offload_lock);
++ skip = offload_wq->is_handling_mst_msg_rdy_event;
++
++ if (!skip)
++ offload_wq->is_handling_mst_msg_rdy_event = true;
++
++ spin_unlock(&offload_wq->offload_lock);
++
++ if (!skip)
++ schedule_hpd_rx_offload_work(offload_wq, hpd_irq_data);
++
+ goto out;
+ }
+
+@@ -3463,7 +3404,7 @@ static void register_hpd_handlers(struct amdgpu_device *adev)
+ aconnector = to_amdgpu_dm_connector(connector);
+ dc_link = aconnector->dc_link;
+
+- if (DC_IRQ_SOURCE_INVALID != dc_link->irq_source_hpd) {
++ if (dc_link->irq_source_hpd != DC_IRQ_SOURCE_INVALID) {
+ int_params.int_context = INTERRUPT_LOW_IRQ_CONTEXT;
+ int_params.irq_source = dc_link->irq_source_hpd;
+
+@@ -3472,7 +3413,7 @@ static void register_hpd_handlers(struct amdgpu_device *adev)
+ (void *) aconnector);
+ }
+
+- if (DC_IRQ_SOURCE_INVALID != dc_link->irq_source_hpd_rx) {
++ if (dc_link->irq_source_hpd_rx != DC_IRQ_SOURCE_INVALID) {
+
+ /* Also register for DP short pulse (hpd_rx). */
+ int_params.int_context = INTERRUPT_LOW_IRQ_CONTEXT;
+@@ -3481,11 +3422,11 @@ static void register_hpd_handlers(struct amdgpu_device *adev)
+ amdgpu_dm_irq_register_interrupt(adev, &int_params,
+ handle_hpd_rx_irq,
+ (void *) aconnector);
+-
+- if (adev->dm.hpd_rx_offload_wq)
+- adev->dm.hpd_rx_offload_wq[dc_link->link_index].aconnector =
+- aconnector;
+ }
++
++ if (adev->dm.hpd_rx_offload_wq)
++ adev->dm.hpd_rx_offload_wq[connector->index].aconnector =
++ aconnector;
+ }
+ }
+
+@@ -3498,7 +3439,7 @@ static int dce60_register_irq_handlers(struct amdgpu_device *adev)
+ struct dc_interrupt_params int_params = {0};
+ int r;
+ int i;
+- unsigned client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
++ unsigned int client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
+
+ int_params.requested_polarity = INTERRUPT_POLARITY_DEFAULT;
+ int_params.current_polarity = INTERRUPT_POLARITY_DEFAULT;
+@@ -3512,11 +3453,12 @@ static int dce60_register_irq_handlers(struct amdgpu_device *adev)
+ * Base driver will call amdgpu_dm_irq_handler() for ALL interrupts
+ * coming from DC hardware.
+ * amdgpu_dm_irq_handler() will re-direct the interrupt to DC
+- * for acknowledging and handling. */
++ * for acknowledging and handling.
++ */
+
+ /* Use VBLANK interrupt */
+ for (i = 0; i < adev->mode_info.num_crtc; i++) {
+- r = amdgpu_irq_add_id(adev, client_id, i+1 , &adev->crtc_irq);
++ r = amdgpu_irq_add_id(adev, client_id, i + 1, &adev->crtc_irq);
+ if (r) {
+ DRM_ERROR("Failed to add crtc irq id!\n");
+ return r;
+@@ -3524,7 +3466,7 @@ static int dce60_register_irq_handlers(struct amdgpu_device *adev)
+
+ int_params.int_context = INTERRUPT_HIGH_IRQ_CONTEXT;
+ int_params.irq_source =
+- dc_interrupt_to_irq_source(dc, i+1 , 0);
++ dc_interrupt_to_irq_source(dc, i + 1, 0);
+
+ c_irq_params = &adev->dm.vblank_params[int_params.irq_source - DC_IRQ_SOURCE_VBLANK1];
+
+@@ -3580,7 +3522,7 @@ static int dce110_register_irq_handlers(struct amdgpu_device *adev)
+ struct dc_interrupt_params int_params = {0};
+ int r;
+ int i;
+- unsigned client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
++ unsigned int client_id = AMDGPU_IRQ_CLIENTID_LEGACY;
+
+ if (adev->family >= AMDGPU_FAMILY_AI)
+ client_id = SOC15_IH_CLIENTID_DCE;
+@@ -3597,7 +3539,8 @@ static int dce110_register_irq_handlers(struct amdgpu_device *adev)
+ * Base driver will call amdgpu_dm_irq_handler() for ALL interrupts
+ * coming from DC hardware.
+ * amdgpu_dm_irq_handler() will re-direct the interrupt to DC
+- * for acknowledging and handling. */
++ * for acknowledging and handling.
++ */
+
+ /* Use VBLANK interrupt */
+ for (i = VISLANDS30_IV_SRCID_D1_VERTICAL_INTERRUPT0; i <= VISLANDS30_IV_SRCID_D6_VERTICAL_INTERRUPT0; i++) {
+@@ -4044,7 +3987,7 @@ static void amdgpu_dm_update_backlight_caps(struct amdgpu_display_manager *dm,
+ }
+
+ static int get_brightness_range(const struct amdgpu_dm_backlight_caps *caps,
+- unsigned *min, unsigned *max)
++ unsigned int *min, unsigned int *max)
+ {
+ if (!caps)
+ return 0;
+@@ -4064,7 +4007,7 @@ static int get_brightness_range(const struct amdgpu_dm_backlight_caps *caps,
+ static u32 convert_brightness_from_user(const struct amdgpu_dm_backlight_caps *caps,
+ uint32_t brightness)
+ {
+- unsigned min, max;
++ unsigned int min, max;
+
+ if (!get_brightness_range(caps, &min, &max))
+ return brightness;
+@@ -4077,7 +4020,7 @@ static u32 convert_brightness_from_user(const struct amdgpu_dm_backlight_caps *c
+ static u32 convert_brightness_to_user(const struct amdgpu_dm_backlight_caps *caps,
+ uint32_t brightness)
+ {
+- unsigned min, max;
++ unsigned int min, max;
+
+ if (!get_brightness_range(caps, &min, &max))
+ return brightness;
+@@ -4557,7 +4500,6 @@ fail:
+ static void amdgpu_dm_destroy_drm_device(struct amdgpu_display_manager *dm)
+ {
+ drm_atomic_private_obj_fini(&dm->atomic_obj);
+- return;
+ }
+
+ /******************************************************************************
+@@ -5375,6 +5317,7 @@ static bool adjust_colour_depth_from_display_info(
+ {
+ enum dc_color_depth depth = timing_out->display_color_depth;
+ int normalized_clk;
++
+ do {
+ normalized_clk = timing_out->pix_clk_100hz / 10;
+ /* YCbCr 4:2:0 requires additional adjustment of 1/2 */
+@@ -5590,6 +5533,7 @@ create_fake_sink(struct amdgpu_dm_connector *aconnector)
+ {
+ struct dc_sink_init_data sink_init_data = { 0 };
+ struct dc_sink *sink = NULL;
++
+ sink_init_data.link = aconnector->dc_link;
+ sink_init_data.sink_signal = aconnector->dc_link->connector_signal;
+
+@@ -5713,7 +5657,7 @@ get_highest_refresh_rate_mode(struct amdgpu_dm_connector *aconnector,
+ return &aconnector->freesync_vid_base;
+
+ /* Find the preferred mode */
+- list_for_each_entry (m, list_head, head) {
++ list_for_each_entry(m, list_head, head) {
+ if (m->type & DRM_MODE_TYPE_PREFERRED) {
+ m_pref = m;
+ break;
+@@ -5737,7 +5681,7 @@ get_highest_refresh_rate_mode(struct amdgpu_dm_connector *aconnector,
+ * For some monitors, preferred mode is not the mode with highest
+ * supported refresh rate.
+ */
+- list_for_each_entry (m, list_head, head) {
++ list_for_each_entry(m, list_head, head) {
+ current_refresh = drm_mode_vrefresh(m);
+
+ if (m->hdisplay == m_pref->hdisplay &&
+@@ -6010,7 +5954,7 @@ create_stream_for_sink(struct amdgpu_dm_connector *aconnector,
+ * This may not be an error, the use case is when we have no
+ * usermode calls to reset and set mode upon hotplug. In this
+ * case, we call set mode ourselves to restore the previous mode
+- * and the modelist may not be filled in in time.
++ * and the modelist may not be filled in time.
+ */
+ DRM_DEBUG_DRIVER("No preferred mode found\n");
+ } else {
+@@ -6034,9 +5978,9 @@ create_stream_for_sink(struct amdgpu_dm_connector *aconnector,
+ drm_mode_set_crtcinfo(&mode, 0);
+
+ /*
+- * If scaling is enabled and refresh rate didn't change
+- * we copy the vic and polarities of the old timings
+- */
++ * If scaling is enabled and refresh rate didn't change
++ * we copy the vic and polarities of the old timings
++ */
+ if (!scale || mode_refresh != preferred_refresh)
+ fill_stream_properties_from_drm_display_mode(
+ stream, &mode, &aconnector->base, con_state, NULL,
+@@ -6756,6 +6700,7 @@ static int dm_encoder_helper_atomic_check(struct drm_encoder *encoder,
+
+ if (!state->duplicated) {
+ int max_bpc = conn_state->max_requested_bpc;
++
+ is_y420 = drm_mode_is_420_also(&connector->display_info, adjusted_mode) &&
+ aconnector->force_yuv420_output;
+ color_depth = convert_color_depth_from_display_info(connector,
+@@ -7074,7 +7019,7 @@ static bool is_duplicate_mode(struct amdgpu_dm_connector *aconnector,
+ {
+ struct drm_display_mode *m;
+
+- list_for_each_entry (m, &aconnector->base.probed_modes, head) {
++ list_for_each_entry(m, &aconnector->base.probed_modes, head) {
+ if (drm_mode_equal(m, mode))
+ return true;
+ }
+@@ -7192,13 +7137,7 @@ static int amdgpu_dm_connector_get_modes(struct drm_connector *connector)
+ drm_add_modes_noedid(connector, 1920, 1080);
+ } else {
+ amdgpu_dm_connector_ddc_get_modes(connector, edid);
+- /* most eDP supports only timings from its edid,
+- * usually only detailed timings are available
+- * from eDP edid. timings which are not from edid
+- * may damage eDP
+- */
+- if (connector->connector_type != DRM_MODE_CONNECTOR_eDP)
+- amdgpu_dm_connector_add_common_modes(encoder, connector);
++ amdgpu_dm_connector_add_common_modes(encoder, connector);
+ amdgpu_dm_connector_add_freesync_modes(connector, edid);
+ }
+ amdgpu_dm_fbc_init(connector);
+@@ -7234,6 +7173,7 @@ void amdgpu_dm_connector_init_helper(struct amdgpu_display_manager *dm,
+ aconnector->as_type = ADAPTIVE_SYNC_TYPE_NONE;
+ memset(&aconnector->vsdb_info, 0, sizeof(aconnector->vsdb_info));
+ mutex_init(&aconnector->hpd_lock);
++ mutex_init(&aconnector->handle_mst_msg_ready);
+
+ /*
+ * configure support HPD hot plug connector_>polled default value is 0
+@@ -7384,7 +7324,6 @@ static int amdgpu_dm_connector_init(struct amdgpu_display_manager *dm,
+
+ link->priv = aconnector;
+
+- DRM_DEBUG_DRIVER("%s()\n", __func__);
+
+ i2c = create_i2c(link->ddc, link->link_index, &res);
+ if (!i2c) {
+@@ -8055,7 +7994,15 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
+ * Only allow immediate flips for fast updates that don't
+ * change memory domain, FB pitch, DCC state, rotation or
+ * mirroring.
++ *
++ * dm_crtc_helper_atomic_check() only accepts async flips with
++ * fast updates.
+ */
++ if (crtc->state->async_flip &&
++ acrtc_state->update_type != UPDATE_TYPE_FAST)
++ drm_warn_once(state->dev,
++ "[PLANE:%d:%s] async flip with non-fast update\n",
++ plane->base.id, plane->name);
+ bundle->flip_addrs[planes_count].flip_immediate =
+ crtc->state->async_flip &&
+ acrtc_state->update_type == UPDATE_TYPE_FAST &&
+@@ -8098,8 +8045,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
+ * DRI3/Present extension with defined target_msc.
+ */
+ last_flip_vblank = amdgpu_get_vblank_counter_kms(pcrtc);
+- }
+- else {
++ } else {
+ /* For variable refresh rate mode only:
+ * Get vblank of last completed flip to avoid > 1 vrr
+ * flips per video frame by use of throttling, but allow
+@@ -8432,8 +8378,8 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
+ dc_resource_state_copy_construct_current(dm->dc, dc_state);
+ }
+
+- for_each_oldnew_crtc_in_state (state, crtc, old_crtc_state,
+- new_crtc_state, i) {
++ for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
++ new_crtc_state, i) {
+ struct amdgpu_crtc *acrtc = to_amdgpu_crtc(crtc);
+
+ dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+@@ -8456,9 +8402,7 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
+ dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
+
+ drm_dbg_state(state->dev,
+- "amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, "
+- "planes_changed:%d, mode_changed:%d,active_changed:%d,"
+- "connectors_changed:%d\n",
++ "amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, planes_changed:%d, mode_changed:%d,active_changed:%d,connectors_changed:%d\n",
+ acrtc->crtc_id,
+ new_crtc_state->enable,
+ new_crtc_state->active,
+@@ -9027,8 +8971,8 @@ static int do_aquire_global_lock(struct drm_device *dev,
+ &commit->flip_done, 10*HZ);
+
+ if (ret == 0)
+- DRM_ERROR("[CRTC:%d:%s] hw_done or flip_done "
+- "timed out\n", crtc->base.id, crtc->name);
++ DRM_ERROR("[CRTC:%d:%s] hw_done or flip_done timed out\n",
++ crtc->base.id, crtc->name);
+
+ drm_crtc_commit_put(commit);
+ }
+@@ -9113,7 +9057,8 @@ is_timing_unchanged_for_freesync(struct drm_crtc_state *old_crtc_state,
+ return false;
+ }
+
+-static void set_freesync_fixed_config(struct dm_crtc_state *dm_new_crtc_state) {
++static void set_freesync_fixed_config(struct dm_crtc_state *dm_new_crtc_state)
++{
+ u64 num, den, res;
+ struct drm_crtc_state *new_crtc_state = &dm_new_crtc_state->base;
+
+@@ -9236,9 +9181,7 @@ static int dm_update_crtc_state(struct amdgpu_display_manager *dm,
+ goto skip_modeset;
+
+ drm_dbg_state(state->dev,
+- "amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, "
+- "planes_changed:%d, mode_changed:%d,active_changed:%d,"
+- "connectors_changed:%d\n",
++ "amdgpu_crtc id:%d crtc_state_flags: enable:%d, active:%d, planes_changed:%d, mode_changed:%d,active_changed:%d,connectors_changed:%d\n",
+ acrtc->crtc_id,
+ new_crtc_state->enable,
+ new_crtc_state->active,
+@@ -9267,8 +9210,7 @@ static int dm_update_crtc_state(struct amdgpu_display_manager *dm,
+ old_crtc_state)) {
+ new_crtc_state->mode_changed = false;
+ DRM_DEBUG_DRIVER(
+- "Mode change not required for front porch change, "
+- "setting mode_changed to %d",
++ "Mode change not required for front porch change, setting mode_changed to %d",
+ new_crtc_state->mode_changed);
+
+ set_freesync_fixed_config(dm_new_crtc_state);
+@@ -9280,9 +9222,8 @@ static int dm_update_crtc_state(struct amdgpu_display_manager *dm,
+ struct drm_display_mode *high_mode;
+
+ high_mode = get_highest_refresh_rate_mode(aconnector, false);
+- if (!drm_mode_equal(&new_crtc_state->mode, high_mode)) {
++ if (!drm_mode_equal(&new_crtc_state->mode, high_mode))
+ set_freesync_fixed_config(dm_new_crtc_state);
+- }
+ }
+
+ ret = dm_atomic_get_state(state, &dm_state);
+@@ -9450,6 +9391,7 @@ static bool should_reset_plane(struct drm_atomic_state *state,
+ */
+ for_each_oldnew_plane_in_state(state, other, old_other_state, new_other_state, i) {
+ struct amdgpu_framebuffer *old_afb, *new_afb;
++
+ if (other->type == DRM_PLANE_TYPE_CURSOR)
+ continue;
+
+@@ -9548,11 +9490,12 @@ static int dm_check_cursor_fb(struct amdgpu_crtc *new_acrtc,
+ }
+
+ /* Core DRM takes care of checking FB modifiers, so we only need to
+- * check tiling flags when the FB doesn't have a modifier. */
++ * check tiling flags when the FB doesn't have a modifier.
++ */
+ if (!(fb->flags & DRM_MODE_FB_MODIFIERS)) {
+ if (adev->family < AMDGPU_FAMILY_AI) {
+ linear = AMDGPU_TILING_GET(afb->tiling_flags, ARRAY_MODE) != DC_ARRAY_2D_TILED_THIN1 &&
+- AMDGPU_TILING_GET(afb->tiling_flags, ARRAY_MODE) != DC_ARRAY_1D_TILED_THIN1 &&
++ AMDGPU_TILING_GET(afb->tiling_flags, ARRAY_MODE) != DC_ARRAY_1D_TILED_THIN1 &&
+ AMDGPU_TILING_GET(afb->tiling_flags, MICRO_TILE_MODE) == 0;
+ } else {
+ linear = AMDGPU_TILING_GET(afb->tiling_flags, SWIZZLE_MODE) == 0;
+@@ -9774,12 +9717,12 @@ static int dm_check_crtc_cursor(struct drm_atomic_state *state,
+ /* On DCE and DCN there is no dedicated hardware cursor plane. We get a
+ * cursor per pipe but it's going to inherit the scaling and
+ * positioning from the underlying pipe. Check the cursor plane's
+- * blending properties match the underlying planes'. */
++ * blending properties match the underlying planes'.
++ */
+
+ new_cursor_state = drm_atomic_get_new_plane_state(state, cursor);
+- if (!new_cursor_state || !new_cursor_state->fb) {
++ if (!new_cursor_state || !new_cursor_state->fb)
+ return 0;
+- }
+
+ dm_get_oriented_plane_size(new_cursor_state, &cursor_src_w, &cursor_src_h);
+ cursor_scale_w = new_cursor_state->crtc_w * 1000 / cursor_src_w;
+@@ -9824,6 +9767,7 @@ static int add_affected_mst_dsc_crtcs(struct drm_atomic_state *state, struct drm
+ struct drm_connector_state *conn_state, *old_conn_state;
+ struct amdgpu_dm_connector *aconnector = NULL;
+ int i;
++
+ for_each_oldnew_connector_in_state(state, connector, old_conn_state, conn_state, i) {
+ if (!conn_state->crtc)
+ conn_state = old_conn_state;
+@@ -10258,7 +10202,7 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
+ }
+
+ /* Store the overall update type for use later in atomic check. */
+- for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) {
++ for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
+ struct dm_crtc_state *dm_new_crtc_state =
+ to_dm_crtc_state(new_crtc_state);
+
+@@ -10280,7 +10224,7 @@ fail:
+ else if (ret == -EINTR || ret == -EAGAIN || ret == -ERESTARTSYS)
+ DRM_DEBUG_DRIVER("Atomic check stopped due to signal.\n");
+ else
+- DRM_DEBUG_DRIVER("Atomic check failed with err: %d \n", ret);
++ DRM_DEBUG_DRIVER("Atomic check failed with err: %d\n", ret);
+
+ trace_amdgpu_dm_atomic_check_finish(state, ret);
+
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+index 2e2413fd73a4f..b91b902ab3c81 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+@@ -194,6 +194,11 @@ struct hpd_rx_irq_offload_work_queue {
+ * we're handling link loss
+ */
+ bool is_handling_link_loss;
++ /**
++ * @is_handling_mst_msg_rdy_event: Used to prevent inserting mst message
++ * ready event when we're already handling mst message ready event
++ */
++ bool is_handling_mst_msg_rdy_event;
+ /**
+ * @aconnector: The aconnector that this work queue is attached to
+ */
+@@ -638,6 +643,8 @@ struct amdgpu_dm_connector {
+ struct drm_dp_mst_port *mst_output_port;
+ struct amdgpu_dm_connector *mst_root;
+ struct drm_dp_aux *dsc_aux;
++ struct mutex handle_mst_msg_ready;
++
+ /* TODO see if we can merge with ddc_bus or make a dm_connector */
+ struct amdgpu_i2c_adapter *i2c;
+
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
+index 440fc0869a34b..30d4c6fd95f53 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
+@@ -398,6 +398,18 @@ static int dm_crtc_helper_atomic_check(struct drm_crtc *crtc,
+ return -EINVAL;
+ }
+
++ /*
++ * Only allow async flips for fast updates that don't change the FB
++ * pitch, the DCC state, rotation, etc.
++ */
++ if (crtc_state->async_flip &&
++ dm_crtc_state->update_type != UPDATE_TYPE_FAST) {
++ drm_dbg_atomic(crtc->dev,
++ "[CRTC:%d:%s] async flips are only supported for fast updates\n",
++ crtc->base.id, crtc->name);
++ return -EINVAL;
++ }
++
+ /* In some use cases, like reset, no stream is attached */
+ if (!dm_crtc_state->stream)
+ return 0;
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+index 46d0a8f57e552..888e80f498e97 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+@@ -619,8 +619,118 @@ dm_dp_add_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
+ return connector;
+ }
+
++void dm_handle_mst_sideband_msg_ready_event(
++ struct drm_dp_mst_topology_mgr *mgr,
++ enum mst_msg_ready_type msg_rdy_type)
++{
++ uint8_t esi[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = { 0 };
++ uint8_t dret;
++ bool new_irq_handled = false;
++ int dpcd_addr;
++ uint8_t dpcd_bytes_to_read;
++ const uint8_t max_process_count = 30;
++ uint8_t process_count = 0;
++ u8 retry;
++ struct amdgpu_dm_connector *aconnector =
++ container_of(mgr, struct amdgpu_dm_connector, mst_mgr);
++
++
++ const struct dc_link_status *link_status = dc_link_get_status(aconnector->dc_link);
++
++ if (link_status->dpcd_caps->dpcd_rev.raw < 0x12) {
++ dpcd_bytes_to_read = DP_LANE0_1_STATUS - DP_SINK_COUNT;
++ /* DPCD 0x200 - 0x201 for downstream IRQ */
++ dpcd_addr = DP_SINK_COUNT;
++ } else {
++ dpcd_bytes_to_read = DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI;
++ /* DPCD 0x2002 - 0x2005 for downstream IRQ */
++ dpcd_addr = DP_SINK_COUNT_ESI;
++ }
++
++ mutex_lock(&aconnector->handle_mst_msg_ready);
++
++ while (process_count < max_process_count) {
++ u8 ack[DP_PSR_ERROR_STATUS - DP_SINK_COUNT_ESI] = {};
++
++ process_count++;
++
++ dret = drm_dp_dpcd_read(
++ &aconnector->dm_dp_aux.aux,
++ dpcd_addr,
++ esi,
++ dpcd_bytes_to_read);
++
++ if (dret != dpcd_bytes_to_read) {
++ DRM_DEBUG_KMS("DPCD read and acked number is not as expected!");
++ break;
++ }
++
++ DRM_DEBUG_DRIVER("ESI %02x %02x %02x\n", esi[0], esi[1], esi[2]);
++
++ switch (msg_rdy_type) {
++ case DOWN_REP_MSG_RDY_EVENT:
++ /* Only handle DOWN_REP_MSG_RDY case*/
++ esi[1] &= DP_DOWN_REP_MSG_RDY;
++ break;
++ case UP_REQ_MSG_RDY_EVENT:
++ /* Only handle UP_REQ_MSG_RDY case*/
++ esi[1] &= DP_UP_REQ_MSG_RDY;
++ break;
++ default:
++ /* Handle both cases*/
++ esi[1] &= (DP_DOWN_REP_MSG_RDY | DP_UP_REQ_MSG_RDY);
++ break;
++ }
++
++ if (!esi[1])
++ break;
++
++ /* handle MST irq */
++ if (aconnector->mst_mgr.mst_state)
++ drm_dp_mst_hpd_irq_handle_event(&aconnector->mst_mgr,
++ esi,
++ ack,
++ &new_irq_handled);
++
++ if (new_irq_handled) {
++ /* ACK at DPCD to notify down stream */
++ for (retry = 0; retry < 3; retry++) {
++ ssize_t wret;
++
++ wret = drm_dp_dpcd_writeb(&aconnector->dm_dp_aux.aux,
++ dpcd_addr + 1,
++ ack[1]);
++ if (wret == 1)
++ break;
++ }
++
++ if (retry == 3) {
++ DRM_ERROR("Failed to ack MST event.\n");
++ return;
++ }
++
++ drm_dp_mst_hpd_irq_send_new_request(&aconnector->mst_mgr);
++
++ new_irq_handled = false;
++ } else {
++ break;
++ }
++ }
++
++ mutex_unlock(&aconnector->handle_mst_msg_ready);
++
++ if (process_count == max_process_count)
++ DRM_DEBUG_DRIVER("Loop exceeded max iterations\n");
++}
++
++static void dm_handle_mst_down_rep_msg_ready(struct drm_dp_mst_topology_mgr *mgr)
++{
++ dm_handle_mst_sideband_msg_ready_event(mgr, DOWN_REP_MSG_RDY_EVENT);
++}
++
+ static const struct drm_dp_mst_topology_cbs dm_mst_cbs = {
+ .add_connector = dm_dp_add_mst_connector,
++ .poll_hpd_irq = dm_handle_mst_down_rep_msg_ready,
+ };
+
+ void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
+index 1e4ede1e57abd..37c820ab0fdbc 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
+@@ -49,6 +49,13 @@
+ #define PBN_FEC_OVERHEAD_MULTIPLIER_8B_10B 1031
+ #define PBN_FEC_OVERHEAD_MULTIPLIER_128B_132B 1000
+
++enum mst_msg_ready_type {
++ NONE_MSG_RDY_EVENT = 0,
++ DOWN_REP_MSG_RDY_EVENT = 1,
++ UP_REQ_MSG_RDY_EVENT = 2,
++ DOWN_OR_UP_MSG_RDY_EVENT = 3
++};
++
+ struct amdgpu_display_manager;
+ struct amdgpu_dm_connector;
+
+@@ -61,6 +68,10 @@ void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
+ void
+ dm_dp_create_fake_mst_encoders(struct amdgpu_device *adev);
+
++void dm_handle_mst_sideband_msg_ready_event(
++ struct drm_dp_mst_topology_mgr *mgr,
++ enum mst_msg_ready_type msg_rdy_type);
++
+ struct dsc_mst_fairness_vars {
+ int pbn;
+ bool dsc_enabled;
+diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
+index f9e2e0c3095e7..b686efa43c347 100644
+--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
++++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c
+@@ -87,6 +87,11 @@ static int dcn31_get_active_display_cnt_wa(
+ stream->signal == SIGNAL_TYPE_DVI_SINGLE_LINK ||
+ stream->signal == SIGNAL_TYPE_DVI_DUAL_LINK)
+ tmds_present = true;
++
++ /* Checking stream / link detection ensuring that PHY is active*/
++ if (dc_is_dp_signal(stream->signal) && !stream->dpms_off)
++ display_count++;
++
+ }
+
+ for (i = 0; i < dc->link_count; i++) {
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+index 1c3b6f25a7825..6f56a35c08571 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+@@ -3309,7 +3309,8 @@ void dcn10_wait_for_mpcc_disconnect(
+ if (pipe_ctx->stream_res.opp->mpcc_disconnect_pending[mpcc_inst]) {
+ struct hubp *hubp = get_hubp_by_inst(res_pool, mpcc_inst);
+
+- if (pipe_ctx->stream_res.tg->funcs->is_tg_enabled(pipe_ctx->stream_res.tg))
++ if (pipe_ctx->stream_res.tg &&
++ pipe_ctx->stream_res.tg->funcs->is_tg_enabled(pipe_ctx->stream_res.tg))
+ res_pool->mpc->funcs->wait_for_idle(res_pool->mpc, mpcc_inst);
+ pipe_ctx->stream_res.opp->mpcc_disconnect_pending[mpcc_inst] = false;
+ hubp->funcs->set_blank(hubp, true);
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c
+index 7f72ef882ca41..21eea8d7bf7f4 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c
+@@ -65,7 +65,7 @@ static const struct dc_debug_options debug_defaults_drv = {
+ .timing_trace = false,
+ .clock_trace = true,
+ .disable_pplib_clock_request = true,
+- .pipe_split_policy = MPC_SPLIT_DYNAMIC,
++ .pipe_split_policy = MPC_SPLIT_AVOID,
+ .force_single_disp_pipe_split = false,
+ .disable_dcc = DCC_ENABLE,
+ .vsr_support = true,
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+index 8fe2e1716da44..e22fc563b462f 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+@@ -1927,12 +1927,16 @@ static int sienna_cichlid_read_sensor(struct smu_context *smu,
+ *size = 4;
+ break;
+ case AMDGPU_PP_SENSOR_GFX_MCLK:
+- ret = sienna_cichlid_get_current_clk_freq_by_table(smu, SMU_UCLK, (uint32_t *)data);
++ ret = sienna_cichlid_get_smu_metrics_data(smu,
++ METRICS_CURR_UCLK,
++ (uint32_t *)data);
+ *(uint32_t *)data *= 100;
+ *size = 4;
+ break;
+ case AMDGPU_PP_SENSOR_GFX_SCLK:
+- ret = sienna_cichlid_get_current_clk_freq_by_table(smu, SMU_GFXCLK, (uint32_t *)data);
++ ret = sienna_cichlid_get_smu_metrics_data(smu,
++ METRICS_AVERAGE_GFXCLK,
++ (uint32_t *)data);
+ *(uint32_t *)data *= 100;
+ *size = 4;
+ break;
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
+index aac72925db34a..f53a09b02c38a 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
+@@ -940,7 +940,7 @@ static int smu_v13_0_7_read_sensor(struct smu_context *smu,
+ break;
+ case AMDGPU_PP_SENSOR_GFX_MCLK:
+ ret = smu_v13_0_7_get_smu_metrics_data(smu,
+- METRICS_AVERAGE_UCLK,
++ METRICS_CURR_UCLK,
+ (uint32_t *)data);
+ *(uint32_t *)data *= 100;
+ *size = 4;
+diff --git a/drivers/gpu/drm/drm_client_modeset.c b/drivers/gpu/drm/drm_client_modeset.c
+index 1b12a3c201a3c..871e4e2129d6d 100644
+--- a/drivers/gpu/drm/drm_client_modeset.c
++++ b/drivers/gpu/drm/drm_client_modeset.c
+@@ -311,6 +311,9 @@ static bool drm_client_target_cloned(struct drm_device *dev,
+ can_clone = true;
+ dmt_mode = drm_mode_find_dmt(dev, 1024, 768, 60, false);
+
++ if (!dmt_mode)
++ goto fail;
++
+ for (i = 0; i < connector_count; i++) {
+ if (!enabled[i])
+ continue;
+@@ -326,11 +329,13 @@ static bool drm_client_target_cloned(struct drm_device *dev,
+ if (!modes[i])
+ can_clone = false;
+ }
++ kfree(dmt_mode);
+
+ if (can_clone) {
+ DRM_DEBUG_KMS("can clone using 1024x768\n");
+ return true;
+ }
++fail:
+ DRM_INFO("kms: can't enable cloning when we probably wanted to.\n");
+ return false;
+ }
+@@ -862,6 +867,7 @@ int drm_client_modeset_probe(struct drm_client_dev *client, unsigned int width,
+ break;
+ }
+
++ kfree(modeset->mode);
+ modeset->mode = drm_mode_duplicate(dev, mode);
+ drm_connector_get(connector);
+ modeset->connectors[modeset->num_connectors++] = connector;
+diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
+index 3035cba2c6a29..d7caae281fb92 100644
+--- a/drivers/gpu/drm/i915/i915_perf.c
++++ b/drivers/gpu/drm/i915/i915_perf.c
+@@ -4442,6 +4442,7 @@ static const struct i915_range mtl_oam_b_counters[] = {
+ static const struct i915_range xehp_oa_b_counters[] = {
+ { .start = 0xdc48, .end = 0xdc48 }, /* OAA_ENABLE_REG */
+ { .start = 0xdd00, .end = 0xdd48 }, /* OAG_LCE0_0 - OAA_LENABLE_REG */
++ {}
+ };
+
+ static const struct i915_range gen7_oa_mux_regs[] = {
+diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
+index 42e1665ba11a3..1ecd3d63b1081 100644
+--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
++++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
+@@ -1873,6 +1873,8 @@ nv50_pior_destroy(struct drm_encoder *encoder)
+ nvif_outp_dtor(&nv_encoder->outp);
+
+ drm_encoder_cleanup(encoder);
++
++ mutex_destroy(&nv_encoder->dp.hpd_irq_lock);
+ kfree(encoder);
+ }
+
+@@ -1917,6 +1919,8 @@ nv50_pior_create(struct drm_connector *connector, struct dcb_output *dcbe)
+ nv_encoder->i2c = ddc;
+ nv_encoder->aux = aux;
+
++ mutex_init(&nv_encoder->dp.hpd_irq_lock);
++
+ encoder = to_drm_encoder(nv_encoder);
+ encoder->possible_crtcs = dcbe->heads;
+ encoder->possible_clones = 0;
+diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h
+index 40a1065ae626e..ef441dfdea09f 100644
+--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h
++++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h
+@@ -16,7 +16,7 @@ struct nvkm_i2c_bus {
+ const struct nvkm_i2c_bus_func *func;
+ struct nvkm_i2c_pad *pad;
+ #define NVKM_I2C_BUS_CCB(n) /* 'n' is ccb index */ (n)
+-#define NVKM_I2C_BUS_EXT(n) /* 'n' is dcb external encoder type */ ((n) + 0x100)
++#define NVKM_I2C_BUS_EXT(n) /* 'n' is dcb external encoder type */ ((n) + 0x10)
+ #define NVKM_I2C_BUS_PRI /* ccb primary comm. port */ -1
+ #define NVKM_I2C_BUS_SEC /* ccb secondary comm. port */ -2
+ int id;
+@@ -38,7 +38,7 @@ struct nvkm_i2c_aux {
+ const struct nvkm_i2c_aux_func *func;
+ struct nvkm_i2c_pad *pad;
+ #define NVKM_I2C_AUX_CCB(n) /* 'n' is ccb index */ (n)
+-#define NVKM_I2C_AUX_EXT(n) /* 'n' is dcb external encoder type */ ((n) + 0x100)
++#define NVKM_I2C_AUX_EXT(n) /* 'n' is dcb external encoder type */ ((n) + 0x10)
+ int id;
+
+ struct mutex mutex;
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c
+index dad942be6679c..46b057fe1412e 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c
+@@ -81,20 +81,29 @@ nvkm_uconn_uevent(struct nvkm_object *object, void *argv, u32 argc, struct nvkm_
+ return -ENOSYS;
+
+ list_for_each_entry(outp, &conn->disp->outps, head) {
+- if (outp->info.connector == conn->index && outp->dp.aux) {
+- if (args->v0.types & NVIF_CONN_EVENT_V0_PLUG ) bits |= NVKM_I2C_PLUG;
+- if (args->v0.types & NVIF_CONN_EVENT_V0_UNPLUG) bits |= NVKM_I2C_UNPLUG;
+- if (args->v0.types & NVIF_CONN_EVENT_V0_IRQ ) bits |= NVKM_I2C_IRQ;
++ if (outp->info.connector == conn->index)
++ break;
++ }
+
+- return nvkm_uevent_add(uevent, &device->i2c->event, outp->dp.aux->id, bits,
+- nvkm_uconn_uevent_aux);
+- }
++ if (&outp->head == &conn->disp->outps)
++ return -EINVAL;
++
++ if (outp->dp.aux && !outp->info.location) {
++ if (args->v0.types & NVIF_CONN_EVENT_V0_PLUG ) bits |= NVKM_I2C_PLUG;
++ if (args->v0.types & NVIF_CONN_EVENT_V0_UNPLUG) bits |= NVKM_I2C_UNPLUG;
++ if (args->v0.types & NVIF_CONN_EVENT_V0_IRQ ) bits |= NVKM_I2C_IRQ;
++
++ return nvkm_uevent_add(uevent, &device->i2c->event, outp->dp.aux->id, bits,
++ nvkm_uconn_uevent_aux);
+ }
+
+ if (args->v0.types & NVIF_CONN_EVENT_V0_PLUG ) bits |= NVKM_GPIO_HI;
+ if (args->v0.types & NVIF_CONN_EVENT_V0_UNPLUG) bits |= NVKM_GPIO_LO;
+- if (args->v0.types & NVIF_CONN_EVENT_V0_IRQ)
+- return -EINVAL;
++ if (args->v0.types & NVIF_CONN_EVENT_V0_IRQ) {
++ /* TODO: support DP IRQ on ANX9805 and remove this hack. */
++ if (!outp->info.location)
++ return -EINVAL;
++ }
+
+ return nvkm_uevent_add(uevent, &device->gpio->event, conn->info.hpd, bits,
+ nvkm_uconn_uevent_gpio);
+diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c
+index 976539de4220c..731b2f68d3dbf 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c
++++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c
+@@ -260,10 +260,11 @@ nvkm_i2c_new_(const struct nvkm_i2c_func *func, struct nvkm_device *device,
+ {
+ struct nvkm_bios *bios = device->bios;
+ struct nvkm_i2c *i2c;
++ struct nvkm_i2c_aux *aux;
+ struct dcb_i2c_entry ccbE;
+ struct dcb_output dcbE;
+ u8 ver, hdr;
+- int ret, i;
++ int ret, i, ids;
+
+ if (!(i2c = *pi2c = kzalloc(sizeof(*i2c), GFP_KERNEL)))
+ return -ENOMEM;
+@@ -406,5 +407,11 @@ nvkm_i2c_new_(const struct nvkm_i2c_func *func, struct nvkm_device *device,
+ }
+ }
+
+- return nvkm_event_init(&nvkm_i2c_intr_func, &i2c->subdev, 4, i, &i2c->event);
++ ids = 0;
++ list_for_each_entry(aux, &i2c->aux, head)
++ ids = max(ids, aux->id + 1);
++ if (!ids)
++ return 0;
++
++ return nvkm_event_init(&nvkm_i2c_intr_func, &i2c->subdev, 4, ids, &i2c->event);
+ }
+diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
+index 46a27ebf4588a..a6700d7278bf3 100644
+--- a/drivers/gpu/drm/radeon/radeon_cs.c
++++ b/drivers/gpu/drm/radeon/radeon_cs.c
+@@ -270,7 +270,8 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data)
+ {
+ struct drm_radeon_cs *cs = data;
+ uint64_t *chunk_array_ptr;
+- unsigned size, i;
++ u64 size;
++ unsigned i;
+ u32 ring = RADEON_CS_RING_GFX;
+ s32 priority = 0;
+
+diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
+index 7333f7a87a2fb..46ff9c75bb124 100644
+--- a/drivers/gpu/drm/ttm/ttm_resource.c
++++ b/drivers/gpu/drm/ttm/ttm_resource.c
+@@ -86,6 +86,8 @@ static void ttm_lru_bulk_move_pos_tail(struct ttm_lru_bulk_move_pos *pos,
+ struct ttm_resource *res)
+ {
+ if (pos->last != res) {
++ if (pos->first == res)
++ pos->first = list_next_entry(res, lru);
+ list_move(&res->lru, &pos->last->lru);
+ pos->last = res;
+ }
+@@ -111,7 +113,8 @@ static void ttm_lru_bulk_move_del(struct ttm_lru_bulk_move *bulk,
+ {
+ struct ttm_lru_bulk_move_pos *pos = ttm_lru_bulk_move_pos(bulk, res);
+
+- if (unlikely(pos->first == res && pos->last == res)) {
++ if (unlikely(WARN_ON(!pos->first || !pos->last) ||
++ (pos->first == res && pos->last == res))) {
+ pos->first = NULL;
+ pos->last = NULL;
+ } else if (pos->first == res) {
+diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
+index 5d29abac2300e..55a436a6dde98 100644
+--- a/drivers/hid/hid-ids.h
++++ b/drivers/hid/hid-ids.h
+@@ -620,6 +620,7 @@
+ #define USB_DEVICE_ID_UGCI_FIGHTING 0x0030
+
+ #define USB_VENDOR_ID_HP 0x03f0
++#define USB_PRODUCT_ID_HP_ELITE_PRESENTER_MOUSE_464A 0x464a
+ #define USB_PRODUCT_ID_HP_LOGITECH_OEM_USB_OPTICAL_MOUSE_0A4A 0x0a4a
+ #define USB_PRODUCT_ID_HP_LOGITECH_OEM_USB_OPTICAL_MOUSE_0B4A 0x0b4a
+ #define USB_PRODUCT_ID_HP_PIXART_OEM_USB_OPTICAL_MOUSE 0x134a
+diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c
+index 804fc03600cc9..3983b4f282f8f 100644
+--- a/drivers/hid/hid-quirks.c
++++ b/drivers/hid/hid-quirks.c
+@@ -96,6 +96,7 @@ static const struct hid_device_id hid_quirks[] = {
+ { HID_USB_DEVICE(USB_VENDOR_ID_HOLTEK_ALT, USB_DEVICE_ID_HOLTEK_ALT_KEYBOARD_A096), HID_QUIRK_NO_INIT_REPORTS },
+ { HID_USB_DEVICE(USB_VENDOR_ID_HOLTEK_ALT, USB_DEVICE_ID_HOLTEK_ALT_KEYBOARD_A293), HID_QUIRK_ALWAYS_POLL },
+ { HID_USB_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_LOGITECH_OEM_USB_OPTICAL_MOUSE_0A4A), HID_QUIRK_ALWAYS_POLL },
++ { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_ELITE_PRESENTER_MOUSE_464A), HID_QUIRK_MULTI_INPUT },
+ { HID_USB_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_LOGITECH_OEM_USB_OPTICAL_MOUSE_0B4A), HID_QUIRK_ALWAYS_POLL },
+ { HID_USB_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_PIXART_OEM_USB_OPTICAL_MOUSE), HID_QUIRK_ALWAYS_POLL },
+ { HID_USB_DEVICE(USB_VENDOR_ID_HP, USB_PRODUCT_ID_HP_PIXART_OEM_USB_OPTICAL_MOUSE_094A), HID_QUIRK_ALWAYS_POLL },
+diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
+index 3ebd4b6586b3e..05c0fb2acbc44 100644
+--- a/drivers/iommu/iommu-sva.c
++++ b/drivers/iommu/iommu-sva.c
+@@ -34,8 +34,9 @@ static int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t ma
+ }
+
+ ret = ida_alloc_range(&iommu_global_pasid_ida, min, max, GFP_KERNEL);
+- if (ret < min)
++ if (ret < 0)
+ goto out;
++
+ mm->pasid = ret;
+ ret = 0;
+ out:
+diff --git a/drivers/md/md.c b/drivers/md/md.c
+index 350094f1cb09f..18384251399ab 100644
+--- a/drivers/md/md.c
++++ b/drivers/md/md.c
+@@ -4807,11 +4807,21 @@ action_store(struct mddev *mddev, const char *page, size_t len)
+ return -EINVAL;
+ err = mddev_lock(mddev);
+ if (!err) {
+- if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
++ if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+ err = -EBUSY;
+- else {
++ } else if (mddev->reshape_position == MaxSector ||
++ mddev->pers->check_reshape == NULL ||
++ mddev->pers->check_reshape(mddev)) {
+ clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ err = mddev->pers->start_reshape(mddev);
++ } else {
++ /*
++ * If reshape is still in progress, and
++ * md_check_recovery() can continue to reshape,
++ * don't restart reshape because data can be
++ * corrupted for raid456.
++ */
++ clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ }
+ mddev_unlock(mddev);
+ }
+diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
+index 9d23963496194..ee75b058438f3 100644
+--- a/drivers/md/raid10.c
++++ b/drivers/md/raid10.c
+@@ -920,6 +920,7 @@ static void flush_pending_writes(struct r10conf *conf)
+
+ raid1_submit_write(bio);
+ bio = next;
++ cond_resched();
+ }
+ blk_finish_plug(&plug);
+ } else
+@@ -1132,6 +1133,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
+
+ raid1_submit_write(bio);
+ bio = next;
++ cond_resched();
+ }
+ kfree(plug);
+ }
+diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+index 68df6d4641b5c..eebf967f4711a 100644
+--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
++++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+@@ -227,6 +227,8 @@ static int
+ __mcp251xfd_chip_set_mode(const struct mcp251xfd_priv *priv,
+ const u8 mode_req, bool nowait)
+ {
++ const struct can_bittiming *bt = &priv->can.bittiming;
++ unsigned long timeout_us = MCP251XFD_POLL_TIMEOUT_US;
+ u32 con = 0, con_reqop, osc = 0;
+ u8 mode;
+ int err;
+@@ -246,12 +248,16 @@ __mcp251xfd_chip_set_mode(const struct mcp251xfd_priv *priv,
+ if (mode_req == MCP251XFD_REG_CON_MODE_SLEEP || nowait)
+ return 0;
+
++ if (bt->bitrate)
++ timeout_us = max_t(unsigned long, timeout_us,
++ MCP251XFD_FRAME_LEN_MAX_BITS * USEC_PER_SEC /
++ bt->bitrate);
++
+ err = regmap_read_poll_timeout(priv->map_reg, MCP251XFD_REG_CON, con,
+ !mcp251xfd_reg_invalid(con) &&
+ FIELD_GET(MCP251XFD_REG_CON_OPMOD_MASK,
+ con) == mode_req,
+- MCP251XFD_POLL_SLEEP_US,
+- MCP251XFD_POLL_TIMEOUT_US);
++ MCP251XFD_POLL_SLEEP_US, timeout_us);
+ if (err != -ETIMEDOUT && err != -EBADMSG)
+ return err;
+
+diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
+index 7024ff0cc2c0c..24510b3b80203 100644
+--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
++++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h
+@@ -387,6 +387,7 @@ static_assert(MCP251XFD_TIMESTAMP_WORK_DELAY_SEC <
+ #define MCP251XFD_OSC_STAB_TIMEOUT_US (10 * MCP251XFD_OSC_STAB_SLEEP_US)
+ #define MCP251XFD_POLL_SLEEP_US (10)
+ #define MCP251XFD_POLL_TIMEOUT_US (USEC_PER_MSEC)
++#define MCP251XFD_FRAME_LEN_MAX_BITS (736)
+
+ /* Misc */
+ #define MCP251XFD_NAPI_WEIGHT 32
+diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
+index d476c28840084..f418066569fcc 100644
+--- a/drivers/net/can/usb/gs_usb.c
++++ b/drivers/net/can/usb/gs_usb.c
+@@ -303,12 +303,6 @@ struct gs_can {
+ struct can_bittiming_const bt_const, data_bt_const;
+ unsigned int channel; /* channel number */
+
+- /* time counter for hardware timestamps */
+- struct cyclecounter cc;
+- struct timecounter tc;
+- spinlock_t tc_lock; /* spinlock to guard access tc->cycle_last */
+- struct delayed_work timestamp;
+-
+ u32 feature;
+ unsigned int hf_size_tx;
+
+@@ -325,6 +319,13 @@ struct gs_usb {
+ struct gs_can *canch[GS_MAX_INTF];
+ struct usb_anchor rx_submitted;
+ struct usb_device *udev;
++
++ /* time counter for hardware timestamps */
++ struct cyclecounter cc;
++ struct timecounter tc;
++ spinlock_t tc_lock; /* spinlock to guard access tc->cycle_last */
++ struct delayed_work timestamp;
++
+ unsigned int hf_size_rx;
+ u8 active_channels;
+ };
+@@ -388,15 +389,15 @@ static int gs_cmd_reset(struct gs_can *dev)
+ GFP_KERNEL);
+ }
+
+-static inline int gs_usb_get_timestamp(const struct gs_can *dev,
++static inline int gs_usb_get_timestamp(const struct gs_usb *parent,
+ u32 *timestamp_p)
+ {
+ __le32 timestamp;
+ int rc;
+
+- rc = usb_control_msg_recv(dev->udev, 0, GS_USB_BREQ_TIMESTAMP,
++ rc = usb_control_msg_recv(parent->udev, 0, GS_USB_BREQ_TIMESTAMP,
+ USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_INTERFACE,
+- dev->channel, 0,
++ 0, 0,
+				   &timestamp, sizeof(timestamp),
+ USB_CTRL_GET_TIMEOUT,
+ GFP_KERNEL);
+@@ -410,20 +411,20 @@ static inline int gs_usb_get_timestamp(const struct gs_can *dev,
+
+ static u64 gs_usb_timestamp_read(const struct cyclecounter *cc) __must_hold(&dev->tc_lock)
+ {
+- struct gs_can *dev = container_of(cc, struct gs_can, cc);
++ struct gs_usb *parent = container_of(cc, struct gs_usb, cc);
+ u32 timestamp = 0;
+ int err;
+
+- lockdep_assert_held(&dev->tc_lock);
++ lockdep_assert_held(&parent->tc_lock);
+
+ /* drop lock for synchronous USB transfer */
+- spin_unlock_bh(&dev->tc_lock);
+-	err = gs_usb_get_timestamp(dev, &timestamp);
+- spin_lock_bh(&dev->tc_lock);
++ spin_unlock_bh(&parent->tc_lock);
++	err = gs_usb_get_timestamp(parent, &timestamp);
++ spin_lock_bh(&parent->tc_lock);
+ if (err)
+- netdev_err(dev->netdev,
+- "Error %d while reading timestamp. HW timestamps may be inaccurate.",
+- err);
++ dev_err(&parent->udev->dev,
++ "Error %d while reading timestamp. HW timestamps may be inaccurate.",
++ err);
+
+ return timestamp;
+ }
+@@ -431,14 +432,14 @@ static u64 gs_usb_timestamp_read(const struct cyclecounter *cc) __must_hold(&dev
+ static void gs_usb_timestamp_work(struct work_struct *work)
+ {
+ struct delayed_work *delayed_work = to_delayed_work(work);
+- struct gs_can *dev;
++ struct gs_usb *parent;
+
+- dev = container_of(delayed_work, struct gs_can, timestamp);
+- spin_lock_bh(&dev->tc_lock);
+- timecounter_read(&dev->tc);
+- spin_unlock_bh(&dev->tc_lock);
++ parent = container_of(delayed_work, struct gs_usb, timestamp);
++ spin_lock_bh(&parent->tc_lock);
++ timecounter_read(&parent->tc);
++ spin_unlock_bh(&parent->tc_lock);
+
+- schedule_delayed_work(&dev->timestamp,
++ schedule_delayed_work(&parent->timestamp,
+ GS_USB_TIMESTAMP_WORK_DELAY_SEC * HZ);
+ }
+
+@@ -446,37 +447,38 @@ static void gs_usb_skb_set_timestamp(struct gs_can *dev,
+ struct sk_buff *skb, u32 timestamp)
+ {
+ struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb);
++ struct gs_usb *parent = dev->parent;
+ u64 ns;
+
+- spin_lock_bh(&dev->tc_lock);
+- ns = timecounter_cyc2time(&dev->tc, timestamp);
+- spin_unlock_bh(&dev->tc_lock);
++ spin_lock_bh(&parent->tc_lock);
++ ns = timecounter_cyc2time(&parent->tc, timestamp);
++ spin_unlock_bh(&parent->tc_lock);
+
+ hwtstamps->hwtstamp = ns_to_ktime(ns);
+ }
+
+-static void gs_usb_timestamp_init(struct gs_can *dev)
++static void gs_usb_timestamp_init(struct gs_usb *parent)
+ {
+- struct cyclecounter *cc = &dev->cc;
++ struct cyclecounter *cc = &parent->cc;
+
+ cc->read = gs_usb_timestamp_read;
+ cc->mask = CYCLECOUNTER_MASK(32);
+ cc->shift = 32 - bits_per(NSEC_PER_SEC / GS_USB_TIMESTAMP_TIMER_HZ);
+ cc->mult = clocksource_hz2mult(GS_USB_TIMESTAMP_TIMER_HZ, cc->shift);
+
+- spin_lock_init(&dev->tc_lock);
+- spin_lock_bh(&dev->tc_lock);
+- timecounter_init(&dev->tc, &dev->cc, ktime_get_real_ns());
+- spin_unlock_bh(&dev->tc_lock);
++ spin_lock_init(&parent->tc_lock);
++ spin_lock_bh(&parent->tc_lock);
++ timecounter_init(&parent->tc, &parent->cc, ktime_get_real_ns());
++ spin_unlock_bh(&parent->tc_lock);
+
+- INIT_DELAYED_WORK(&dev->timestamp, gs_usb_timestamp_work);
+- schedule_delayed_work(&dev->timestamp,
++ INIT_DELAYED_WORK(&parent->timestamp, gs_usb_timestamp_work);
++ schedule_delayed_work(&parent->timestamp,
+ GS_USB_TIMESTAMP_WORK_DELAY_SEC * HZ);
+ }
+
+-static void gs_usb_timestamp_stop(struct gs_can *dev)
++static void gs_usb_timestamp_stop(struct gs_usb *parent)
+ {
+- cancel_delayed_work_sync(&dev->timestamp);
++ cancel_delayed_work_sync(&parent->timestamp);
+ }
+
+ static void gs_update_state(struct gs_can *dev, struct can_frame *cf)
+@@ -560,6 +562,9 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
+ if (!netif_device_present(netdev))
+ return;
+
++ if (!netif_running(netdev))
++ goto resubmit_urb;
++
+ if (hf->echo_id == -1) { /* normal rx */
+ if (hf->flags & GS_CAN_FLAG_FD) {
+ skb = alloc_canfd_skb(dev->netdev, &cfd);
+@@ -833,6 +838,7 @@ static int gs_can_open(struct net_device *netdev)
+ .mode = cpu_to_le32(GS_CAN_MODE_START),
+ };
+ struct gs_host_frame *hf;
++ struct urb *urb = NULL;
+ u32 ctrlmode;
+ u32 flags = 0;
+ int rc, i;
+@@ -855,14 +861,18 @@ static int gs_can_open(struct net_device *netdev)
+ }
+
+ if (!parent->active_channels) {
++ if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP)
++ gs_usb_timestamp_init(parent);
++
+ for (i = 0; i < GS_MAX_RX_URBS; i++) {
+- struct urb *urb;
+ u8 *buf;
+
+ /* alloc rx urb */
+ urb = usb_alloc_urb(0, GFP_KERNEL);
+- if (!urb)
+- return -ENOMEM;
++ if (!urb) {
++ rc = -ENOMEM;
++ goto out_usb_kill_anchored_urbs;
++ }
+
+ /* alloc rx buffer */
+ buf = kmalloc(dev->parent->hf_size_rx,
+@@ -870,8 +880,8 @@ static int gs_can_open(struct net_device *netdev)
+ if (!buf) {
+ netdev_err(netdev,
+ "No memory left for USB buffer\n");
+- usb_free_urb(urb);
+- return -ENOMEM;
++ rc = -ENOMEM;
++ goto out_usb_free_urb;
+ }
+
+ /* fill, anchor, and submit rx urb */
+@@ -894,9 +904,7 @@ static int gs_can_open(struct net_device *netdev)
+ netdev_err(netdev,
+ "usb_submit failed (err=%d)\n", rc);
+
+- usb_unanchor_urb(urb);
+- usb_free_urb(urb);
+- break;
++ goto out_usb_unanchor_urb;
+ }
+
+ /* Drop reference,
+@@ -926,13 +934,9 @@ static int gs_can_open(struct net_device *netdev)
+ flags |= GS_CAN_MODE_FD;
+
+ /* if hardware supports timestamps, enable it */
+- if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP) {
++ if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP)
+ flags |= GS_CAN_MODE_HW_TIMESTAMP;
+
+- /* start polling timestamp */
+- gs_usb_timestamp_init(dev);
+- }
+-
+ /* finally start device */
+ dev->can.state = CAN_STATE_ERROR_ACTIVE;
+ dm.flags = cpu_to_le32(flags);
+@@ -942,10 +946,9 @@ static int gs_can_open(struct net_device *netdev)
+ GFP_KERNEL);
+ if (rc) {
+ netdev_err(netdev, "Couldn't start device (err=%d)\n", rc);
+- if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP)
+- gs_usb_timestamp_stop(dev);
+ dev->can.state = CAN_STATE_STOPPED;
+- return rc;
++
++ goto out_usb_kill_anchored_urbs;
+ }
+
+ parent->active_channels++;
+@@ -953,6 +956,22 @@ static int gs_can_open(struct net_device *netdev)
+ netif_start_queue(netdev);
+
+ return 0;
++
++out_usb_unanchor_urb:
++ usb_unanchor_urb(urb);
++out_usb_free_urb:
++ usb_free_urb(urb);
++out_usb_kill_anchored_urbs:
++ if (!parent->active_channels) {
++ usb_kill_anchored_urbs(&dev->tx_submitted);
++
++ if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP)
++ gs_usb_timestamp_stop(parent);
++ }
++
++ close_candev(netdev);
++
++ return rc;
+ }
+
+ static int gs_usb_get_state(const struct net_device *netdev,
+@@ -998,14 +1017,13 @@ static int gs_can_close(struct net_device *netdev)
+
+ netif_stop_queue(netdev);
+
+- /* stop polling timestamp */
+- if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP)
+- gs_usb_timestamp_stop(dev);
+-
+ /* Stop polling */
+ parent->active_channels--;
+ if (!parent->active_channels) {
+ usb_kill_anchored_urbs(&parent->rx_submitted);
++
++ if (dev->feature & GS_CAN_FEATURE_HW_TIMESTAMP)
++ gs_usb_timestamp_stop(parent);
+ }
+
+ /* Stop sending URBs */
+diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
+index f56fca1b1a222..cc5b19a3d0df2 100644
+--- a/drivers/net/dsa/microchip/ksz8795.c
++++ b/drivers/net/dsa/microchip/ksz8795.c
+@@ -506,7 +506,13 @@ static int ksz8_r_sta_mac_table(struct ksz_device *dev, u16 addr,
+ (data_hi & masks[STATIC_MAC_TABLE_FWD_PORTS]) >>
+ shifts[STATIC_MAC_FWD_PORTS];
+ alu->is_override = (data_hi & masks[STATIC_MAC_TABLE_OVERRIDE]) ? 1 : 0;
+- data_hi >>= 1;
++
++ /* KSZ8795 family switches have STATIC_MAC_TABLE_USE_FID and
++ * STATIC_MAC_TABLE_FID definitions off by 1 when doing read on the
++ * static MAC table compared to doing write.
++ */
++ if (ksz_is_ksz87xx(dev))
++ data_hi >>= 1;
+ alu->is_static = true;
+ alu->is_use_fid = (data_hi & masks[STATIC_MAC_TABLE_USE_FID]) ? 1 : 0;
+ alu->fid = (data_hi & masks[STATIC_MAC_TABLE_FID]) >>
+diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
+index a4428be5f483c..a0ba2605bb620 100644
+--- a/drivers/net/dsa/microchip/ksz_common.c
++++ b/drivers/net/dsa/microchip/ksz_common.c
+@@ -331,13 +331,13 @@ static const u32 ksz8795_masks[] = {
+ [STATIC_MAC_TABLE_VALID] = BIT(21),
+ [STATIC_MAC_TABLE_USE_FID] = BIT(23),
+ [STATIC_MAC_TABLE_FID] = GENMASK(30, 24),
+- [STATIC_MAC_TABLE_OVERRIDE] = BIT(26),
+- [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(24, 20),
++ [STATIC_MAC_TABLE_OVERRIDE] = BIT(22),
++ [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(20, 16),
+ [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(6, 0),
+- [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(8),
++ [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(7),
+ [DYNAMIC_MAC_TABLE_NOT_READY] = BIT(7),
+ [DYNAMIC_MAC_TABLE_ENTRIES] = GENMASK(31, 29),
+- [DYNAMIC_MAC_TABLE_FID] = GENMASK(26, 20),
++ [DYNAMIC_MAC_TABLE_FID] = GENMASK(22, 16),
+ [DYNAMIC_MAC_TABLE_SRC_PORT] = GENMASK(26, 24),
+ [DYNAMIC_MAC_TABLE_TIMESTAMP] = GENMASK(28, 27),
+ [P_MII_TX_FLOW_CTRL] = BIT(5),
+diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
+index 8abecaf6089ef..33d9a2f6af27a 100644
+--- a/drivers/net/dsa/microchip/ksz_common.h
++++ b/drivers/net/dsa/microchip/ksz_common.h
+@@ -569,6 +569,13 @@ static inline void ksz_regmap_unlock(void *__mtx)
+ mutex_unlock(mtx);
+ }
+
++static inline bool ksz_is_ksz87xx(struct ksz_device *dev)
++{
++ return dev->chip_id == KSZ8795_CHIP_ID ||
++ dev->chip_id == KSZ8794_CHIP_ID ||
++ dev->chip_id == KSZ8765_CHIP_ID;
++}
++
+ static inline bool ksz_is_ksz88x3(struct ksz_device *dev)
+ {
+ return dev->chip_id == KSZ8830_CHIP_ID;
+diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
+index 08a46ffd53af9..642e93e8623eb 100644
+--- a/drivers/net/dsa/mv88e6xxx/chip.c
++++ b/drivers/net/dsa/mv88e6xxx/chip.c
+@@ -109,6 +109,13 @@ int mv88e6xxx_wait_mask(struct mv88e6xxx_chip *chip, int addr, int reg,
+ usleep_range(1000, 2000);
+ }
+
++ err = mv88e6xxx_read(chip, addr, reg, &data);
++ if (err)
++ return err;
++
++ if ((data & mask) == val)
++ return 0;
++
+ dev_err(chip->dev, "Timeout while waiting for switch\n");
+ return -ETIMEDOUT;
+ }
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+index d385ffc218766..32bb14303473b 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+@@ -438,19 +438,36 @@ static void hns3_dbg_fill_content(char *content, u16 len,
+ const struct hns3_dbg_item *items,
+ const char **result, u16 size)
+ {
++#define HNS3_DBG_LINE_END_LEN 2
+ char *pos = content;
++ u16 item_len;
+ u16 i;
+
++ if (!len) {
++ return;
++ } else if (len <= HNS3_DBG_LINE_END_LEN) {
++ *pos++ = '\0';
++ return;
++ }
++
+ memset(content, ' ', len);
+- for (i = 0; i < size; i++) {
+- if (result)
+- strncpy(pos, result[i], strlen(result[i]));
+- else
+- strncpy(pos, items[i].name, strlen(items[i].name));
++ len -= HNS3_DBG_LINE_END_LEN;
+
+- pos += strlen(items[i].name) + items[i].interval;
++ for (i = 0; i < size; i++) {
++ item_len = strlen(items[i].name) + items[i].interval;
++ if (len < item_len)
++ break;
++
++ if (result) {
++ if (item_len < strlen(result[i]))
++ break;
++ strscpy(pos, result[i], strlen(result[i]));
++ } else {
++ strscpy(pos, items[i].name, strlen(items[i].name));
++ }
++ pos += item_len;
++ len -= item_len;
+ }
+-
+ *pos++ = '\n';
+ *pos++ = '\0';
+ }
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+index a0b46e7d863eb..233c132dc513e 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+@@ -88,16 +88,35 @@ static void hclge_dbg_fill_content(char *content, u16 len,
+ const struct hclge_dbg_item *items,
+ const char **result, u16 size)
+ {
++#define HCLGE_DBG_LINE_END_LEN 2
+ char *pos = content;
++ u16 item_len;
+ u16 i;
+
++ if (!len) {
++ return;
++ } else if (len <= HCLGE_DBG_LINE_END_LEN) {
++ *pos++ = '\0';
++ return;
++ }
++
+ memset(content, ' ', len);
++ len -= HCLGE_DBG_LINE_END_LEN;
++
+ for (i = 0; i < size; i++) {
+- if (result)
+- strncpy(pos, result[i], strlen(result[i]));
+- else
+- strncpy(pos, items[i].name, strlen(items[i].name));
+- pos += strlen(items[i].name) + items[i].interval;
++ item_len = strlen(items[i].name) + items[i].interval;
++ if (len < item_len)
++ break;
++
++ if (result) {
++ if (item_len < strlen(result[i]))
++ break;
++ strscpy(pos, result[i], strlen(result[i]));
++ } else {
++ strscpy(pos, items[i].name, strlen(items[i].name));
++ }
++ pos += item_len;
++ len -= item_len;
+ }
+ *pos++ = '\n';
+ *pos++ = '\0';
+diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h
+index 39d0fe76a38ff..8cbdebc5b6989 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf.h
++++ b/drivers/net/ethernet/intel/iavf/iavf.h
+@@ -255,8 +255,10 @@ struct iavf_adapter {
+ struct workqueue_struct *wq;
+ struct work_struct reset_task;
+ struct work_struct adminq_task;
++ struct work_struct finish_config;
+ struct delayed_work client_task;
+ wait_queue_head_t down_waitqueue;
++ wait_queue_head_t reset_waitqueue;
+ wait_queue_head_t vc_waitqueue;
+ struct iavf_q_vector *q_vectors;
+ struct list_head vlan_filter_list;
+@@ -518,14 +520,12 @@ int iavf_up(struct iavf_adapter *adapter);
+ void iavf_down(struct iavf_adapter *adapter);
+ int iavf_process_config(struct iavf_adapter *adapter);
+ int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter);
+-void iavf_schedule_reset(struct iavf_adapter *adapter);
++void iavf_schedule_reset(struct iavf_adapter *adapter, u64 flags);
+ void iavf_schedule_request_stats(struct iavf_adapter *adapter);
++void iavf_schedule_finish_config(struct iavf_adapter *adapter);
+ void iavf_reset(struct iavf_adapter *adapter);
+ void iavf_set_ethtool_ops(struct net_device *netdev);
+ void iavf_update_stats(struct iavf_adapter *adapter);
+-void iavf_reset_interrupt_capability(struct iavf_adapter *adapter);
+-int iavf_init_interrupt_scheme(struct iavf_adapter *adapter);
+-void iavf_irq_enable_queues(struct iavf_adapter *adapter);
+ void iavf_free_all_tx_resources(struct iavf_adapter *adapter);
+ void iavf_free_all_rx_resources(struct iavf_adapter *adapter);
+
+@@ -579,17 +579,11 @@ void iavf_enable_vlan_stripping_v2(struct iavf_adapter *adapter, u16 tpid);
+ void iavf_disable_vlan_stripping_v2(struct iavf_adapter *adapter, u16 tpid);
+ void iavf_enable_vlan_insertion_v2(struct iavf_adapter *adapter, u16 tpid);
+ void iavf_disable_vlan_insertion_v2(struct iavf_adapter *adapter, u16 tpid);
+-int iavf_replace_primary_mac(struct iavf_adapter *adapter,
+- const u8 *new_mac);
+-void
+-iavf_set_vlan_offload_features(struct iavf_adapter *adapter,
+- netdev_features_t prev_features,
+- netdev_features_t features);
+ void iavf_add_fdir_filter(struct iavf_adapter *adapter);
+ void iavf_del_fdir_filter(struct iavf_adapter *adapter);
+ void iavf_add_adv_rss_cfg(struct iavf_adapter *adapter);
+ void iavf_del_adv_rss_cfg(struct iavf_adapter *adapter);
+ struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter,
+ const u8 *macaddr);
+-int iavf_lock_timeout(struct mutex *lock, unsigned int msecs);
++int iavf_wait_for_reset(struct iavf_adapter *adapter);
+ #endif /* _IAVF_H_ */
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+index 6f171d1d85b75..2f47cfa7f06e2 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+@@ -484,6 +484,7 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags)
+ {
+ struct iavf_adapter *adapter = netdev_priv(netdev);
+ u32 orig_flags, new_flags, changed_flags;
++ int ret = 0;
+ u32 i;
+
+ orig_flags = READ_ONCE(adapter->flags);
+@@ -531,12 +532,14 @@ static int iavf_set_priv_flags(struct net_device *netdev, u32 flags)
+ /* issue a reset to force legacy-rx change to take effect */
+ if (changed_flags & IAVF_FLAG_LEGACY_RX) {
+ if (netif_running(netdev)) {
+- adapter->flags |= IAVF_FLAG_RESET_NEEDED;
+- queue_work(adapter->wq, &adapter->reset_task);
++ iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
++ ret = iavf_wait_for_reset(adapter);
++ if (ret)
++ netdev_warn(netdev, "Changing private flags timeout or interrupted waiting for reset");
+ }
+ }
+
+- return 0;
++ return ret;
+ }
+
+ /**
+@@ -627,6 +630,7 @@ static int iavf_set_ringparam(struct net_device *netdev,
+ {
+ struct iavf_adapter *adapter = netdev_priv(netdev);
+ u32 new_rx_count, new_tx_count;
++ int ret = 0;
+
+ if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending))
+ return -EINVAL;
+@@ -671,11 +675,13 @@ static int iavf_set_ringparam(struct net_device *netdev,
+ }
+
+ if (netif_running(netdev)) {
+- adapter->flags |= IAVF_FLAG_RESET_NEEDED;
+- queue_work(adapter->wq, &adapter->reset_task);
++ iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
++ ret = iavf_wait_for_reset(adapter);
++ if (ret)
++ netdev_warn(netdev, "Changing ring parameters timeout or interrupted waiting for reset");
+ }
+
+- return 0;
++ return ret;
+ }
+
+ /**
+@@ -1830,7 +1836,7 @@ static int iavf_set_channels(struct net_device *netdev,
+ {
+ struct iavf_adapter *adapter = netdev_priv(netdev);
+ u32 num_req = ch->combined_count;
+- int i;
++ int ret = 0;
+
+ if ((adapter->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_ADQ) &&
+ adapter->num_tc) {
+@@ -1852,22 +1858,13 @@ static int iavf_set_channels(struct net_device *netdev,
+
+ adapter->num_req_queues = num_req;
+ adapter->flags |= IAVF_FLAG_REINIT_ITR_NEEDED;
+- iavf_schedule_reset(adapter);
++ iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
+
+- /* wait for the reset is done */
+- for (i = 0; i < IAVF_RESET_WAIT_COMPLETE_COUNT; i++) {
+- msleep(IAVF_RESET_WAIT_MS);
+- if (adapter->flags & IAVF_FLAG_RESET_PENDING)
+- continue;
+- break;
+- }
+- if (i == IAVF_RESET_WAIT_COMPLETE_COUNT) {
+- adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED;
+- adapter->num_active_queues = num_req;
+- return -EOPNOTSUPP;
+- }
++ ret = iavf_wait_for_reset(adapter);
++ if (ret)
++ netdev_warn(netdev, "Changing channel count timeout or interrupted waiting for reset");
+
+- return 0;
++ return ret;
+ }
+
+ /**
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
+index 4a66873882d12..ba96312feb505 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
+@@ -166,6 +166,45 @@ static struct iavf_adapter *iavf_pdev_to_adapter(struct pci_dev *pdev)
+ return netdev_priv(pci_get_drvdata(pdev));
+ }
+
++/**
++ * iavf_is_reset_in_progress - Check if a reset is in progress
++ * @adapter: board private structure
++ */
++static bool iavf_is_reset_in_progress(struct iavf_adapter *adapter)
++{
++ if (adapter->state == __IAVF_RESETTING ||
++ adapter->flags & (IAVF_FLAG_RESET_PENDING |
++ IAVF_FLAG_RESET_NEEDED))
++ return true;
++
++ return false;
++}
++
++/**
++ * iavf_wait_for_reset - Wait for reset to finish.
++ * @adapter: board private structure
++ *
++ * Returns 0 if reset finished successfully, negative on timeout or interrupt.
++ */
++int iavf_wait_for_reset(struct iavf_adapter *adapter)
++{
++ int ret = wait_event_interruptible_timeout(adapter->reset_waitqueue,
++ !iavf_is_reset_in_progress(adapter),
++ msecs_to_jiffies(5000));
++
++ /* If ret < 0 then it means wait was interrupted.
++ * If ret == 0 then it means we got a timeout while waiting
++ * for reset to finish.
++ * If ret > 0 it means reset has finished.
++ */
++ if (ret > 0)
++ return 0;
++ else if (ret < 0)
++ return -EINTR;
++ else
++ return -EBUSY;
++}
++
+ /**
+ * iavf_allocate_dma_mem_d - OS specific memory alloc for shared code
+ * @hw: pointer to the HW structure
+@@ -253,7 +292,7 @@ enum iavf_status iavf_free_virt_mem_d(struct iavf_hw *hw,
+ *
+ * Returns 0 on success, negative on failure
+ **/
+-int iavf_lock_timeout(struct mutex *lock, unsigned int msecs)
++static int iavf_lock_timeout(struct mutex *lock, unsigned int msecs)
+ {
+ unsigned int wait, delay = 10;
+
+@@ -270,12 +309,14 @@ int iavf_lock_timeout(struct mutex *lock, unsigned int msecs)
+ /**
+ * iavf_schedule_reset - Set the flags and schedule a reset event
+ * @adapter: board private structure
++ * @flags: IAVF_FLAG_RESET_PENDING or IAVF_FLAG_RESET_NEEDED
+ **/
+-void iavf_schedule_reset(struct iavf_adapter *adapter)
++void iavf_schedule_reset(struct iavf_adapter *adapter, u64 flags)
+ {
+- if (!(adapter->flags &
+- (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED))) {
+- adapter->flags |= IAVF_FLAG_RESET_NEEDED;
++ if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section) &&
++ !(adapter->flags &
++ (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED))) {
++ adapter->flags |= flags;
+ queue_work(adapter->wq, &adapter->reset_task);
+ }
+ }
+@@ -303,7 +344,7 @@ static void iavf_tx_timeout(struct net_device *netdev, unsigned int txqueue)
+ struct iavf_adapter *adapter = netdev_priv(netdev);
+
+ adapter->tx_timeout_count++;
+- iavf_schedule_reset(adapter);
++ iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
+ }
+
+ /**
+@@ -362,7 +403,7 @@ static void iavf_irq_disable(struct iavf_adapter *adapter)
+ * iavf_irq_enable_queues - Enable interrupt for all queues
+ * @adapter: board private structure
+ **/
+-void iavf_irq_enable_queues(struct iavf_adapter *adapter)
++static void iavf_irq_enable_queues(struct iavf_adapter *adapter)
+ {
+ struct iavf_hw *hw = &adapter->hw;
+ int i;
+@@ -1003,8 +1044,8 @@ struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter,
+ *
+ * Do not call this with mac_vlan_list_lock!
+ **/
+-int iavf_replace_primary_mac(struct iavf_adapter *adapter,
+- const u8 *new_mac)
++static int iavf_replace_primary_mac(struct iavf_adapter *adapter,
++ const u8 *new_mac)
+ {
+ struct iavf_hw *hw = &adapter->hw;
+ struct iavf_mac_filter *f;
+@@ -1663,10 +1704,10 @@ static int iavf_set_interrupt_capability(struct iavf_adapter *adapter)
+ adapter->msix_entries[vector].entry = vector;
+
+ err = iavf_acquire_msix_vectors(adapter, v_budget);
++ if (!err)
++ iavf_schedule_finish_config(adapter);
+
+ out:
+- netif_set_real_num_rx_queues(adapter->netdev, pairs);
+- netif_set_real_num_tx_queues(adapter->netdev, pairs);
+ return err;
+ }
+
+@@ -1840,19 +1881,16 @@ static int iavf_alloc_q_vectors(struct iavf_adapter *adapter)
+ static void iavf_free_q_vectors(struct iavf_adapter *adapter)
+ {
+ int q_idx, num_q_vectors;
+- int napi_vectors;
+
+ if (!adapter->q_vectors)
+ return;
+
+ num_q_vectors = adapter->num_msix_vectors - NONQ_VECS;
+- napi_vectors = adapter->num_active_queues;
+
+ for (q_idx = 0; q_idx < num_q_vectors; q_idx++) {
+ struct iavf_q_vector *q_vector = &adapter->q_vectors[q_idx];
+
+- if (q_idx < napi_vectors)
+- netif_napi_del(&q_vector->napi);
++ netif_napi_del(&q_vector->napi);
+ }
+ kfree(adapter->q_vectors);
+ adapter->q_vectors = NULL;
+@@ -1863,7 +1901,7 @@ static void iavf_free_q_vectors(struct iavf_adapter *adapter)
+ * @adapter: board private structure
+ *
+ **/
+-void iavf_reset_interrupt_capability(struct iavf_adapter *adapter)
++static void iavf_reset_interrupt_capability(struct iavf_adapter *adapter)
+ {
+ if (!adapter->msix_entries)
+ return;
+@@ -1878,7 +1916,7 @@ void iavf_reset_interrupt_capability(struct iavf_adapter *adapter)
+ * @adapter: board private structure to initialize
+ *
+ **/
+-int iavf_init_interrupt_scheme(struct iavf_adapter *adapter)
++static int iavf_init_interrupt_scheme(struct iavf_adapter *adapter)
+ {
+ int err;
+
+@@ -1889,9 +1927,7 @@ int iavf_init_interrupt_scheme(struct iavf_adapter *adapter)
+ goto err_alloc_queues;
+ }
+
+- rtnl_lock();
+ err = iavf_set_interrupt_capability(adapter);
+- rtnl_unlock();
+ if (err) {
+ dev_err(&adapter->pdev->dev,
+ "Unable to setup interrupt capabilities\n");
+@@ -1944,15 +1980,16 @@ static void iavf_free_rss(struct iavf_adapter *adapter)
+ /**
+ * iavf_reinit_interrupt_scheme - Reallocate queues and vectors
+ * @adapter: board private structure
++ * @running: true if adapter->state == __IAVF_RUNNING
+ *
+ * Returns 0 on success, negative on failure
+ **/
+-static int iavf_reinit_interrupt_scheme(struct iavf_adapter *adapter)
++static int iavf_reinit_interrupt_scheme(struct iavf_adapter *adapter, bool running)
+ {
+ struct net_device *netdev = adapter->netdev;
+ int err;
+
+- if (netif_running(netdev))
++ if (running)
+ iavf_free_traffic_irqs(adapter);
+ iavf_free_misc_irq(adapter);
+ iavf_reset_interrupt_capability(adapter);
+@@ -1976,6 +2013,78 @@ err:
+ return err;
+ }
+
++/**
++ * iavf_finish_config - do all netdev work that needs RTNL
++ * @work: our work_struct
++ *
++ * Do work that needs both RTNL and crit_lock.
++ **/
++static void iavf_finish_config(struct work_struct *work)
++{
++ struct iavf_adapter *adapter;
++ int pairs, err;
++
++ adapter = container_of(work, struct iavf_adapter, finish_config);
++
++ /* Always take RTNL first to prevent circular lock dependency */
++ rtnl_lock();
++ mutex_lock(&adapter->crit_lock);
++
++ if ((adapter->flags & IAVF_FLAG_SETUP_NETDEV_FEATURES) &&
++ adapter->netdev_registered &&
++ !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section)) {
++ netdev_update_features(adapter->netdev);
++ adapter->flags &= ~IAVF_FLAG_SETUP_NETDEV_FEATURES;
++ }
++
++ switch (adapter->state) {
++ case __IAVF_DOWN:
++ if (!adapter->netdev_registered) {
++ err = register_netdevice(adapter->netdev);
++ if (err) {
++ dev_err(&adapter->pdev->dev, "Unable to register netdev (%d)\n",
++ err);
++
++ /* go back and try again.*/
++ iavf_free_rss(adapter);
++ iavf_free_misc_irq(adapter);
++ iavf_reset_interrupt_capability(adapter);
++ iavf_change_state(adapter,
++ __IAVF_INIT_CONFIG_ADAPTER);
++ goto out;
++ }
++ adapter->netdev_registered = true;
++ }
++
++ /* Set the real number of queues when reset occurs while
++ * state == __IAVF_DOWN
++ */
++ fallthrough;
++ case __IAVF_RUNNING:
++ pairs = adapter->num_active_queues;
++ netif_set_real_num_rx_queues(adapter->netdev, pairs);
++ netif_set_real_num_tx_queues(adapter->netdev, pairs);
++ break;
++
++ default:
++ break;
++ }
++
++out:
++ mutex_unlock(&adapter->crit_lock);
++ rtnl_unlock();
++}
++
++/**
++ * iavf_schedule_finish_config - Set the flags and schedule a reset event
++ * @adapter: board private structure
++ **/
++void iavf_schedule_finish_config(struct iavf_adapter *adapter)
++{
++ if (!test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
++ queue_work(adapter->wq, &adapter->finish_config);
++}
++
+ /**
+ * iavf_process_aq_command - process aq_required flags
+ * and sends aq command
+@@ -2176,7 +2285,7 @@ static int iavf_process_aq_command(struct iavf_adapter *adapter)
+ * the watchdog if any changes are requested to expedite the request via
+ * virtchnl.
+ **/
+-void
++static void
+ iavf_set_vlan_offload_features(struct iavf_adapter *adapter,
+ netdev_features_t prev_features,
+ netdev_features_t features)
+@@ -2383,7 +2492,7 @@ int iavf_parse_vf_resource_msg(struct iavf_adapter *adapter)
+ adapter->vsi_res->num_queue_pairs);
+ adapter->flags |= IAVF_FLAG_REINIT_MSIX_NEEDED;
+ adapter->num_req_queues = adapter->vsi_res->num_queue_pairs;
+- iavf_schedule_reset(adapter);
++ iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
+
+ return -EAGAIN;
+ }
+@@ -2613,22 +2722,8 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter)
+
+ netif_carrier_off(netdev);
+ adapter->link_up = false;
+-
+- /* set the semaphore to prevent any callbacks after device registration
+- * up to time when state of driver will be set to __IAVF_DOWN
+- */
+- rtnl_lock();
+- if (!adapter->netdev_registered) {
+- err = register_netdevice(netdev);
+- if (err) {
+- rtnl_unlock();
+- goto err_register;
+- }
+- }
+-
+- adapter->netdev_registered = true;
+-
+ netif_tx_stop_all_queues(netdev);
++
+ if (CLIENT_ALLOWED(adapter)) {
+ err = iavf_lan_add_device(adapter);
+ if (err)
+@@ -2641,7 +2736,6 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter)
+
+ iavf_change_state(adapter, __IAVF_DOWN);
+ set_bit(__IAVF_VSI_DOWN, adapter->vsi.state);
+- rtnl_unlock();
+
+ iavf_misc_irq_enable(adapter);
+ wake_up(&adapter->down_waitqueue);
+@@ -2661,10 +2755,11 @@ static void iavf_init_config_adapter(struct iavf_adapter *adapter)
+ /* request initial VLAN offload settings */
+ iavf_set_vlan_offload_features(adapter, 0, netdev->features);
+
++ iavf_schedule_finish_config(adapter);
+ return;
++
+ err_mem:
+ iavf_free_rss(adapter);
+-err_register:
+ iavf_free_misc_irq(adapter);
+ err_sw_init:
+ iavf_reset_interrupt_capability(adapter);
+@@ -2691,26 +2786,9 @@ static void iavf_watchdog_task(struct work_struct *work)
+ goto restart_watchdog;
+ }
+
+- if ((adapter->flags & IAVF_FLAG_SETUP_NETDEV_FEATURES) &&
+- adapter->netdev_registered &&
+- !test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section) &&
+- rtnl_trylock()) {
+- netdev_update_features(adapter->netdev);
+- rtnl_unlock();
+- adapter->flags &= ~IAVF_FLAG_SETUP_NETDEV_FEATURES;
+- }
+-
+ if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED)
+ iavf_change_state(adapter, __IAVF_COMM_FAILED);
+
+- if (adapter->flags & IAVF_FLAG_RESET_NEEDED) {
+- adapter->aq_required = 0;
+- adapter->current_op = VIRTCHNL_OP_UNKNOWN;
+- mutex_unlock(&adapter->crit_lock);
+- queue_work(adapter->wq, &adapter->reset_task);
+- return;
+- }
+-
+ switch (adapter->state) {
+ case __IAVF_STARTUP:
+ iavf_startup(adapter);
+@@ -2838,11 +2916,10 @@ static void iavf_watchdog_task(struct work_struct *work)
+ /* check for hw reset */
+ reg_val = rd32(hw, IAVF_VF_ARQLEN1) & IAVF_VF_ARQLEN1_ARQENABLE_MASK;
+ if (!reg_val) {
+- adapter->flags |= IAVF_FLAG_RESET_PENDING;
+ adapter->aq_required = 0;
+ adapter->current_op = VIRTCHNL_OP_UNKNOWN;
+ dev_err(&adapter->pdev->dev, "Hardware reset detected\n");
+- queue_work(adapter->wq, &adapter->reset_task);
++ iavf_schedule_reset(adapter, IAVF_FLAG_RESET_PENDING);
+ mutex_unlock(&adapter->crit_lock);
+ queue_delayed_work(adapter->wq,
+ &adapter->watchdog_task, HZ * 2);
+@@ -3068,7 +3145,7 @@ continue_reset:
+
+ if ((adapter->flags & IAVF_FLAG_REINIT_MSIX_NEEDED) ||
+ (adapter->flags & IAVF_FLAG_REINIT_ITR_NEEDED)) {
+- err = iavf_reinit_interrupt_scheme(adapter);
++ err = iavf_reinit_interrupt_scheme(adapter, running);
+ if (err)
+ goto reset_err;
+ }
+@@ -3163,6 +3240,7 @@ continue_reset:
+
+ adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED;
+
++ wake_up(&adapter->reset_waitqueue);
+ mutex_unlock(&adapter->client_lock);
+ mutex_unlock(&adapter->crit_lock);
+
+@@ -3239,9 +3317,7 @@ static void iavf_adminq_task(struct work_struct *work)
+ } while (pending);
+ mutex_unlock(&adapter->crit_lock);
+
+- if ((adapter->flags &
+- (IAVF_FLAG_RESET_PENDING | IAVF_FLAG_RESET_NEEDED)) ||
+- adapter->state == __IAVF_RESETTING)
++ if (iavf_is_reset_in_progress(adapter))
+ goto freedom;
+
+ /* check for error indications */
+@@ -4327,6 +4403,7 @@ static int iavf_close(struct net_device *netdev)
+ static int iavf_change_mtu(struct net_device *netdev, int new_mtu)
+ {
+ struct iavf_adapter *adapter = netdev_priv(netdev);
++ int ret = 0;
+
+ netdev_dbg(netdev, "changing MTU from %d to %d\n",
+ netdev->mtu, new_mtu);
+@@ -4337,11 +4414,15 @@ static int iavf_change_mtu(struct net_device *netdev, int new_mtu)
+ }
+
+ if (netif_running(netdev)) {
+- adapter->flags |= IAVF_FLAG_RESET_NEEDED;
+- queue_work(adapter->wq, &adapter->reset_task);
++ iavf_schedule_reset(adapter, IAVF_FLAG_RESET_NEEDED);
++ ret = iavf_wait_for_reset(adapter);
++ if (ret < 0)
++ netdev_warn(netdev, "MTU change interrupted waiting for reset");
++ else if (ret)
++ netdev_warn(netdev, "MTU change timed out waiting for reset");
+ }
+
+- return 0;
++ return ret;
+ }
+
+ #define NETIF_VLAN_OFFLOAD_FEATURES (NETIF_F_HW_VLAN_CTAG_RX | \
+@@ -4934,6 +5015,7 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+
+ INIT_WORK(&adapter->reset_task, iavf_reset_task);
+ INIT_WORK(&adapter->adminq_task, iavf_adminq_task);
++ INIT_WORK(&adapter->finish_config, iavf_finish_config);
+ INIT_DELAYED_WORK(&adapter->watchdog_task, iavf_watchdog_task);
+ INIT_DELAYED_WORK(&adapter->client_task, iavf_client_task);
+ queue_delayed_work(adapter->wq, &adapter->watchdog_task,
+@@ -4942,6 +5024,9 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+ /* Setup the wait queue for indicating transition to down status */
+ init_waitqueue_head(&adapter->down_waitqueue);
+
++ /* Setup the wait queue for indicating transition to running state */
++ init_waitqueue_head(&adapter->reset_waitqueue);
++
+ /* Setup the wait queue for indicating virtchannel events */
+ init_waitqueue_head(&adapter->vc_waitqueue);
+
+@@ -5073,13 +5158,15 @@ static void iavf_remove(struct pci_dev *pdev)
+ usleep_range(500, 1000);
+ }
+ cancel_delayed_work_sync(&adapter->watchdog_task);
++ cancel_work_sync(&adapter->finish_config);
+
++ rtnl_lock();
+ if (adapter->netdev_registered) {
+- rtnl_lock();
+ unregister_netdevice(netdev);
+ adapter->netdev_registered = false;
+- rtnl_unlock();
+ }
++ rtnl_unlock();
++
+ if (CLIENT_ALLOWED(adapter)) {
+ err = iavf_lan_del_device(adapter);
+ if (err)
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.c b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+index e989feda133c1..8c5f6096b0022 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.c
+@@ -54,7 +54,7 @@ static void iavf_unmap_and_free_tx_resource(struct iavf_ring *ring,
+ * iavf_clean_tx_ring - Free any empty Tx buffers
+ * @tx_ring: ring to be cleaned
+ **/
+-void iavf_clean_tx_ring(struct iavf_ring *tx_ring)
++static void iavf_clean_tx_ring(struct iavf_ring *tx_ring)
+ {
+ unsigned long bi_size;
+ u16 i;
+@@ -110,7 +110,7 @@ void iavf_free_tx_resources(struct iavf_ring *tx_ring)
+ * Since there is no access to the ring head register
+ * in XL710, we need to use our local copies
+ **/
+-u32 iavf_get_tx_pending(struct iavf_ring *ring, bool in_sw)
++static u32 iavf_get_tx_pending(struct iavf_ring *ring, bool in_sw)
+ {
+ u32 head, tail;
+
+@@ -127,6 +127,24 @@ u32 iavf_get_tx_pending(struct iavf_ring *ring, bool in_sw)
+ return 0;
+ }
+
++/**
++ * iavf_force_wb - Issue SW Interrupt so HW does a wb
++ * @vsi: the VSI we care about
++ * @q_vector: the vector on which to force writeback
++ **/
++static void iavf_force_wb(struct iavf_vsi *vsi, struct iavf_q_vector *q_vector)
++{
++ u32 val = IAVF_VFINT_DYN_CTLN1_INTENA_MASK |
++ IAVF_VFINT_DYN_CTLN1_ITR_INDX_MASK | /* set noitr */
++ IAVF_VFINT_DYN_CTLN1_SWINT_TRIG_MASK |
++ IAVF_VFINT_DYN_CTLN1_SW_ITR_INDX_ENA_MASK
++ /* allow 00 to be written to the index */;
++
++ wr32(&vsi->back->hw,
++ IAVF_VFINT_DYN_CTLN1(q_vector->reg_idx),
++ val);
++}
++
+ /**
+ * iavf_detect_recover_hung - Function to detect and recover hung_queues
+ * @vsi: pointer to vsi struct with tx queues
+@@ -352,25 +370,6 @@ static void iavf_enable_wb_on_itr(struct iavf_vsi *vsi,
+ q_vector->arm_wb_state = true;
+ }
+
+-/**
+- * iavf_force_wb - Issue SW Interrupt so HW does a wb
+- * @vsi: the VSI we care about
+- * @q_vector: the vector on which to force writeback
+- *
+- **/
+-void iavf_force_wb(struct iavf_vsi *vsi, struct iavf_q_vector *q_vector)
+-{
+- u32 val = IAVF_VFINT_DYN_CTLN1_INTENA_MASK |
+- IAVF_VFINT_DYN_CTLN1_ITR_INDX_MASK | /* set noitr */
+- IAVF_VFINT_DYN_CTLN1_SWINT_TRIG_MASK |
+- IAVF_VFINT_DYN_CTLN1_SW_ITR_INDX_ENA_MASK
+- /* allow 00 to be written to the index */;
+-
+- wr32(&vsi->back->hw,
+- IAVF_VFINT_DYN_CTLN1(q_vector->reg_idx),
+- val);
+-}
+-
+ static inline bool iavf_container_is_rx(struct iavf_q_vector *q_vector,
+ struct iavf_ring_container *rc)
+ {
+@@ -687,7 +686,7 @@ err:
+ * iavf_clean_rx_ring - Free Rx buffers
+ * @rx_ring: ring to be cleaned
+ **/
+-void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
++static void iavf_clean_rx_ring(struct iavf_ring *rx_ring)
+ {
+ unsigned long bi_size;
+ u16 i;
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_txrx.h b/drivers/net/ethernet/intel/iavf/iavf_txrx.h
+index 2624bf6d009e3..7e6ee32d19b69 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_txrx.h
++++ b/drivers/net/ethernet/intel/iavf/iavf_txrx.h
+@@ -442,15 +442,11 @@ static inline unsigned int iavf_rx_pg_order(struct iavf_ring *ring)
+
+ bool iavf_alloc_rx_buffers(struct iavf_ring *rxr, u16 cleaned_count);
+ netdev_tx_t iavf_xmit_frame(struct sk_buff *skb, struct net_device *netdev);
+-void iavf_clean_tx_ring(struct iavf_ring *tx_ring);
+-void iavf_clean_rx_ring(struct iavf_ring *rx_ring);
+ int iavf_setup_tx_descriptors(struct iavf_ring *tx_ring);
+ int iavf_setup_rx_descriptors(struct iavf_ring *rx_ring);
+ void iavf_free_tx_resources(struct iavf_ring *tx_ring);
+ void iavf_free_rx_resources(struct iavf_ring *rx_ring);
+ int iavf_napi_poll(struct napi_struct *napi, int budget);
+-void iavf_force_wb(struct iavf_vsi *vsi, struct iavf_q_vector *q_vector);
+-u32 iavf_get_tx_pending(struct iavf_ring *ring, bool in_sw);
+ void iavf_detect_recover_hung(struct iavf_vsi *vsi);
+ int __iavf_maybe_stop_tx(struct iavf_ring *tx_ring, int size);
+ bool __iavf_chk_linearize(struct sk_buff *skb);
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
+index 7c0578b5457b9..be3c007ce90a9 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_virtchnl.c
+@@ -1961,9 +1961,8 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
+ case VIRTCHNL_EVENT_RESET_IMPENDING:
+ dev_info(&adapter->pdev->dev, "Reset indication received from the PF\n");
+ if (!(adapter->flags & IAVF_FLAG_RESET_PENDING)) {
+- adapter->flags |= IAVF_FLAG_RESET_PENDING;
+ dev_info(&adapter->pdev->dev, "Scheduling reset task\n");
+- queue_work(adapter->wq, &adapter->reset_task);
++ iavf_schedule_reset(adapter, IAVF_FLAG_RESET_PENDING);
+ }
+ break;
+ default:
+@@ -2237,6 +2236,7 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
+
+ iavf_process_config(adapter);
+ adapter->flags |= IAVF_FLAG_SETUP_NETDEV_FEATURES;
++ iavf_schedule_finish_config(adapter);
+
+ iavf_set_queue_vlan_tag_loc(adapter);
+
+@@ -2285,6 +2285,7 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
+ case VIRTCHNL_OP_ENABLE_QUEUES:
+ /* enable transmits */
+ iavf_irq_enable(adapter, true);
++ wake_up(&adapter->reset_waitqueue);
+ adapter->flags &= ~IAVF_FLAG_QUEUES_DISABLED;
+ break;
+ case VIRTCHNL_OP_DISABLE_QUEUES:
+diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
+index 1911d644dfa8d..619cb07a40691 100644
+--- a/drivers/net/ethernet/intel/ice/ice_base.c
++++ b/drivers/net/ethernet/intel/ice/ice_base.c
+@@ -758,6 +758,8 @@ void ice_vsi_free_q_vectors(struct ice_vsi *vsi)
+
+ ice_for_each_q_vector(vsi, v_idx)
+ ice_free_q_vector(vsi, v_idx);
++
++ vsi->num_q_vectors = 0;
+ }
+
+ /**
+diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
+index f86e814354a31..ec4138e684bd2 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
++++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
+@@ -2920,8 +2920,13 @@ ice_get_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
+
+ ring->rx_max_pending = ICE_MAX_NUM_DESC;
+ ring->tx_max_pending = ICE_MAX_NUM_DESC;
+- ring->rx_pending = vsi->rx_rings[0]->count;
+- ring->tx_pending = vsi->tx_rings[0]->count;
++ if (vsi->tx_rings && vsi->rx_rings) {
++ ring->rx_pending = vsi->rx_rings[0]->count;
++ ring->tx_pending = vsi->tx_rings[0]->count;
++ } else {
++ ring->rx_pending = 0;
++ ring->tx_pending = 0;
++ }
+
+ /* Rx mini and jumbo rings are not supported */
+ ring->rx_mini_max_pending = 0;
+@@ -2955,6 +2960,10 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring,
+ return -EINVAL;
+ }
+
++ /* Return if there is no rings (device is reloading) */
++ if (!vsi->tx_rings || !vsi->rx_rings)
++ return -EBUSY;
++
+ new_tx_cnt = ALIGN(ring->tx_pending, ICE_REQ_DESC_MULTIPLE);
+ if (new_tx_cnt != ring->tx_pending)
+ netdev_info(netdev, "Requested Tx descriptor count rounded up to %d\n",
+diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
+index 11ae0e41f518a..284a1f0bfdb54 100644
+--- a/drivers/net/ethernet/intel/ice/ice_lib.c
++++ b/drivers/net/ethernet/intel/ice/ice_lib.c
+@@ -3272,39 +3272,12 @@ int ice_vsi_release(struct ice_vsi *vsi)
+ return -ENODEV;
+ pf = vsi->back;
+
+- /* do not unregister while driver is in the reset recovery pending
+- * state. Since reset/rebuild happens through PF service task workqueue,
+- * it's not a good idea to unregister netdev that is associated to the
+- * PF that is running the work queue items currently. This is done to
+- * avoid check_flush_dependency() warning on this wq
+- */
+- if (vsi->netdev && !ice_is_reset_in_progress(pf->state) &&
+- (test_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state))) {
+- unregister_netdev(vsi->netdev);
+- clear_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state);
+- }
+-
+- if (vsi->type == ICE_VSI_PF)
+- ice_devlink_destroy_pf_port(pf);
+-
+ if (test_bit(ICE_FLAG_RSS_ENA, pf->flags))
+ ice_rss_clean(vsi);
+
+ ice_vsi_close(vsi);
+ ice_vsi_decfg(vsi);
+
+- if (vsi->netdev) {
+- if (test_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state)) {
+- unregister_netdev(vsi->netdev);
+- clear_bit(ICE_VSI_NETDEV_REGISTERED, vsi->state);
+- }
+- if (test_bit(ICE_VSI_NETDEV_ALLOCD, vsi->state)) {
+- free_netdev(vsi->netdev);
+- vsi->netdev = NULL;
+- clear_bit(ICE_VSI_NETDEV_ALLOCD, vsi->state);
+- }
+- }
+-
+ /* retain SW VSI data structure since it is needed to unregister and
+ * free VSI netdev when PF is not in reset recovery pending state,\
+ * for ex: during rmmod.
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index 1277e0a044ee4..fbe70458fda27 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -4655,9 +4655,9 @@ static int ice_start_eth(struct ice_vsi *vsi)
+ if (err)
+ return err;
+
+- rtnl_lock();
+ err = ice_vsi_open(vsi);
+- rtnl_unlock();
++ if (err)
++ ice_fltr_remove_all(vsi);
+
+ return err;
+ }
+@@ -5120,6 +5120,7 @@ int ice_load(struct ice_pf *pf)
+ params = ice_vsi_to_params(vsi);
+ params.flags = ICE_VSI_FLAG_INIT;
+
++ rtnl_lock();
+	err = ice_vsi_cfg(vsi, &params);
+ if (err)
+ goto err_vsi_cfg;
+@@ -5127,6 +5128,7 @@ int ice_load(struct ice_pf *pf)
+ err = ice_start_eth(ice_get_main_vsi(pf));
+ if (err)
+ goto err_start_eth;
++ rtnl_unlock();
+
+ err = ice_init_rdma(pf);
+ if (err)
+@@ -5141,9 +5143,11 @@ int ice_load(struct ice_pf *pf)
+
+ err_init_rdma:
+ ice_vsi_close(ice_get_main_vsi(pf));
++ rtnl_lock();
+ err_start_eth:
+ ice_vsi_decfg(ice_get_main_vsi(pf));
+ err_vsi_cfg:
++ rtnl_unlock();
+ ice_deinit_dev(pf);
+ return err;
+ }
+@@ -5156,8 +5160,10 @@ void ice_unload(struct ice_pf *pf)
+ {
+ ice_deinit_features(pf);
+ ice_deinit_rdma(pf);
++ rtnl_lock();
+ ice_stop_eth(ice_get_main_vsi(pf));
+ ice_vsi_decfg(ice_get_main_vsi(pf));
++ rtnl_unlock();
+ ice_deinit_dev(pf);
+ }
+
+diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
+index bb3db387d49cf..ba5e1d1320f67 100644
+--- a/drivers/net/ethernet/intel/igb/igb_main.c
++++ b/drivers/net/ethernet/intel/igb/igb_main.c
+@@ -9585,6 +9585,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct igb_adapter *adapter = netdev_priv(netdev);
+
++ if (state == pci_channel_io_normal) {
++ dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
++ return PCI_ERS_RESULT_CAN_RECOVER;
++ }
++
+ netif_device_detach(netdev);
+
+ if (state == pci_channel_io_perm_failure)
+diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
+index 44aa4342cbbb5..496a4eb687b00 100644
+--- a/drivers/net/ethernet/intel/igc/igc_main.c
++++ b/drivers/net/ethernet/intel/igc/igc_main.c
+@@ -2417,6 +2417,8 @@ static int igc_xdp_xmit_back(struct igc_adapter *adapter, struct xdp_buff *xdp)
+ nq = txring_txq(ring);
+
+ __netif_tx_lock(nq, cpu);
++ /* Avoid transmit queue timeout since we share it with the slow path */
++ txq_trans_cond_update(nq);
+ res = igc_xdp_init_tx_descriptor(ring, xdpf);
+ __netif_tx_unlock(nq);
+ return res;
+@@ -2824,15 +2826,18 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring)
+ struct netdev_queue *nq = txring_txq(ring);
+ union igc_adv_tx_desc *tx_desc = NULL;
+ int cpu = smp_processor_id();
+- u16 ntu = ring->next_to_use;
+ struct xdp_desc xdp_desc;
+- u16 budget;
++ u16 budget, ntu;
+
+ if (!netif_carrier_ok(ring->netdev))
+ return;
+
+ __netif_tx_lock(nq, cpu);
+
++ /* Avoid transmit queue timeout since we share it with the slow path */
++ txq_trans_cond_update(nq);
++
++ ntu = ring->next_to_use;
+ budget = igc_desc_unused(ring);
+
+ while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) {
+@@ -6385,6 +6390,9 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
+
+ __netif_tx_lock(nq, cpu);
+
++ /* Avoid transmit queue timeout since we share it with the slow path */
++ txq_trans_cond_update(nq);
++
+ drops = 0;
+ for (i = 0; i < num_frames; i++) {
+ int err;
+diff --git a/drivers/net/ethernet/litex/litex_liteeth.c b/drivers/net/ethernet/litex/litex_liteeth.c
+index 35f24e0f09349..ffa96059079c6 100644
+--- a/drivers/net/ethernet/litex/litex_liteeth.c
++++ b/drivers/net/ethernet/litex/litex_liteeth.c
+@@ -78,8 +78,7 @@ static int liteeth_rx(struct net_device *netdev)
+ memcpy_fromio(data, priv->rx_base + rx_slot * priv->slot_size, len);
+ skb->protocol = eth_type_trans(skb, netdev);
+
+- netdev->stats.rx_packets++;
+- netdev->stats.rx_bytes += len;
++ dev_sw_netstats_rx_add(netdev, len);
+
+ return netif_rx(skb);
+
+@@ -185,8 +184,7 @@ static netdev_tx_t liteeth_start_xmit(struct sk_buff *skb,
+ litex_write16(priv->base + LITEETH_READER_LENGTH, skb->len);
+ litex_write8(priv->base + LITEETH_READER_START, 1);
+
+- netdev->stats.tx_bytes += skb->len;
+- netdev->stats.tx_packets++;
++ dev_sw_netstats_tx_add(netdev, 1, skb->len);
+
+ priv->tx_slot = (priv->tx_slot + 1) % priv->num_tx_slots;
+ dev_kfree_skb_any(skb);
+@@ -194,9 +192,17 @@ static netdev_tx_t liteeth_start_xmit(struct sk_buff *skb,
+ return NETDEV_TX_OK;
+ }
+
++static void
++liteeth_get_stats64(struct net_device *netdev, struct rtnl_link_stats64 *stats)
++{
++ netdev_stats_to_stats64(stats, &netdev->stats);
++ dev_fetch_sw_netstats(stats, netdev->tstats);
++}
++
+ static const struct net_device_ops liteeth_netdev_ops = {
+ .ndo_open = liteeth_open,
+ .ndo_stop = liteeth_stop,
++ .ndo_get_stats64 = liteeth_get_stats64,
+ .ndo_start_xmit = liteeth_start_xmit,
+ };
+
+@@ -242,6 +248,11 @@ static int liteeth_probe(struct platform_device *pdev)
+ priv->netdev = netdev;
+ priv->dev = &pdev->dev;
+
++ netdev->tstats = devm_netdev_alloc_pcpu_stats(&pdev->dev,
++ struct pcpu_sw_netstats);
++ if (!netdev->tstats)
++ return -ENOMEM;
++
+ irq = platform_get_irq(pdev, 0);
+ if (irq < 0)
+ return irq;
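
The liteeth change above moves the byte/packet counters from the single netdev->stats struct to per-CPU software stats (dev_sw_netstats_rx_add()/dev_sw_netstats_tx_add()) that are summed on demand in .ndo_get_stats64. A simplified standalone model of that "update locally, aggregate on read" pattern, using a plain per-worker array instead of real per-CPU data (hypothetical names, not kernel code):

#include <stdint.h>
#include <stdio.h>

#define NWORKERS 4    /* stand-in for the number of CPUs */

struct sw_stats {
    uint64_t rx_packets;
    uint64_t rx_bytes;
};

static struct sw_stats percpu[NWORKERS];

/* hot path: touch only this worker's counters, no shared cacheline */
static void stats_rx_add(int worker, unsigned int len)
{
    percpu[worker].rx_packets++;
    percpu[worker].rx_bytes += len;
}

/* slow path (the .ndo_get_stats64 equivalent): fold everything together */
static struct sw_stats stats_fetch(void)
{
    struct sw_stats total = { 0, 0 };

    for (int i = 0; i < NWORKERS; i++) {
        total.rx_packets += percpu[i].rx_packets;
        total.rx_bytes += percpu[i].rx_bytes;
    }
    return total;
}

int main(void)
{
    stats_rx_add(0, 64);
    stats_rx_add(1, 1500);
    stats_rx_add(3, 128);

    struct sw_stats t = stats_fetch();
    printf("%llu packets, %llu bytes\n",
           (unsigned long long)t.rx_packets,
           (unsigned long long)t.rx_bytes);
    return 0;
}
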
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+index 18284ad751572..384d26bee9b23 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+@@ -1452,8 +1452,9 @@ static int otx2_init_hw_resources(struct otx2_nic *pf)
+ if (err)
+ goto err_free_npa_lf;
+
+- /* Enable backpressure */
+- otx2_nix_config_bp(pf, true);
++ /* Enable backpressure for CGX mapped PF/VFs */
++ if (!is_otx2_lbkvf(pf->pdev))
++ otx2_nix_config_bp(pf, true);
+
+ /* Init Auras and pools used by NIX RQ, for free buffer ptrs */
+ err = otx2_rq_aura_pool_init(pf);
+diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+index 834c644b67db5..2d15342c260ae 100644
+--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
++++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+@@ -3846,23 +3846,6 @@ static int mtk_hw_deinit(struct mtk_eth *eth)
+ return 0;
+ }
+
+-static int __init mtk_init(struct net_device *dev)
+-{
+- struct mtk_mac *mac = netdev_priv(dev);
+- struct mtk_eth *eth = mac->hw;
+- int ret;
+-
+- ret = of_get_ethdev_address(mac->of_node, dev);
+- if (ret) {
+- /* If the mac address is invalid, use random mac address */
+- eth_hw_addr_random(dev);
+- dev_err(eth->dev, "generated random MAC address %pM\n",
+- dev->dev_addr);
+- }
+-
+- return 0;
+-}
+-
+ static void mtk_uninit(struct net_device *dev)
+ {
+ struct mtk_mac *mac = netdev_priv(dev);
+@@ -4278,7 +4261,6 @@ static const struct ethtool_ops mtk_ethtool_ops = {
+ };
+
+ static const struct net_device_ops mtk_netdev_ops = {
+- .ndo_init = mtk_init,
+ .ndo_uninit = mtk_uninit,
+ .ndo_open = mtk_open,
+ .ndo_stop = mtk_stop,
+@@ -4340,6 +4322,17 @@ static int mtk_add_mac(struct mtk_eth *eth, struct device_node *np)
+ mac->hw = eth;
+ mac->of_node = np;
+
++ err = of_get_ethdev_address(mac->of_node, eth->netdev[id]);
++ if (err == -EPROBE_DEFER)
++ return err;
++
++ if (err) {
++ /* If the mac address is invalid, use random mac address */
++ eth_hw_addr_random(eth->netdev[id]);
++ dev_err(eth->dev, "generated random MAC address %pM\n",
++ eth->netdev[id]->dev_addr);
++ }
++
+ memset(mac->hwlro_ip, 0, sizeof(mac->hwlro_ip));
+ mac->hwlro_ip_cnt = 0;
+
+diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c b/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c
+index 316fe2e70fead..1a97feca77f23 100644
+--- a/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c
++++ b/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c
+@@ -98,7 +98,7 @@ mtk_ppe_debugfs_foe_show(struct seq_file *m, void *private, bool bind)
+
+ acct = mtk_foe_entry_get_mib(ppe, i, NULL);
+
+- type = FIELD_GET(MTK_FOE_IB1_PACKET_TYPE, entry->ib1);
++ type = mtk_get_ib1_pkt_type(ppe->eth, entry->ib1);
+ seq_printf(m, "%05x %s %7s", i,
+ mtk_foe_entry_state_str(state),
+ mtk_foe_pkt_type_str(type));
+diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
+index 4b19803a7dd01..b69122686407d 100644
+--- a/drivers/net/ethernet/realtek/r8169_main.c
++++ b/drivers/net/ethernet/realtek/r8169_main.c
+@@ -2747,6 +2747,13 @@ static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
+ return;
+
+ if (enable) {
++ /* On these chip versions ASPM can even harm
++ * bus communication of other PCI devices.
++ */
++ if (tp->mac_version == RTL_GIGA_MAC_VER_42 ||
++ tp->mac_version == RTL_GIGA_MAC_VER_43)
++ return;
++
+ rtl_mod_config5(tp, 0, ASPM_en);
+ rtl_mod_config2(tp, 0, ClkReqEn);
+
+@@ -4514,10 +4521,6 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
+ }
+
+ if (napi_schedule_prep(&tp->napi)) {
+- rtl_unlock_config_regs(tp);
+- rtl_hw_aspm_clkreq_enable(tp, false);
+- rtl_lock_config_regs(tp);
+-
+ rtl_irq_disable(tp);
+ __napi_schedule(&tp->napi);
+ }
+@@ -4577,14 +4580,9 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
+
+ work_done = rtl_rx(dev, tp, budget);
+
+- if (work_done < budget && napi_complete_done(napi, work_done)) {
++ if (work_done < budget && napi_complete_done(napi, work_done))
+ rtl_irq_enable(tp);
+
+- rtl_unlock_config_regs(tp);
+- rtl_hw_aspm_clkreq_enable(tp, true);
+- rtl_lock_config_regs(tp);
+- }
+-
+ return work_done;
+ }
+
+diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
+index 0c5e783e574c4..64bf22cd860c9 100644
+--- a/drivers/net/ethernet/ti/cpsw_ale.c
++++ b/drivers/net/ethernet/ti/cpsw_ale.c
+@@ -106,23 +106,37 @@ struct cpsw_ale_dev_id {
+
+ static inline int cpsw_ale_get_field(u32 *ale_entry, u32 start, u32 bits)
+ {
+- int idx;
++ int idx, idx2;
++ u32 hi_val = 0;
+
+ idx = start / 32;
++ idx2 = (start + bits - 1) / 32;
++ /* Check if bits to be fetched exceed a word */
++ if (idx != idx2) {
++ idx2 = 2 - idx2; /* flip */
++ hi_val = ale_entry[idx2] << ((idx2 * 32) - start);
++ }
+ start -= idx * 32;
+ idx = 2 - idx; /* flip */
+- return (ale_entry[idx] >> start) & BITMASK(bits);
++ return (hi_val + (ale_entry[idx] >> start)) & BITMASK(bits);
+ }
+
+ static inline void cpsw_ale_set_field(u32 *ale_entry, u32 start, u32 bits,
+ u32 value)
+ {
+- int idx;
++ int idx, idx2;
+
+ value &= BITMASK(bits);
+- idx = start / 32;
++ idx = start / 32;
++ idx2 = (start + bits - 1) / 32;
++ /* Check if bits to be set exceed a word */
++ if (idx != idx2) {
++ idx2 = 2 - idx2; /* flip */
++ ale_entry[idx2] &= ~(BITMASK(bits + start - (idx2 * 32)));
++ ale_entry[idx2] |= (value >> ((idx2 * 32) - start));
++ }
+ start -= idx * 32;
+- idx = 2 - idx; /* flip */
++ idx = 2 - idx; /* flip */
+ ale_entry[idx] &= ~(BITMASK(bits) << start);
+ ale_entry[idx] |= (value << start);
+ }
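
The cpsw_ale helpers above are fixed to handle fields that straddle a 32-bit word boundary: the low part comes from one word, the spill-over from the adjacent one, and the "2 - idx" flip accounts for the ALE entry being stored most-significant word first. A standalone sketch that mirrors the patched logic and round-trips a value across the boundary (assumes bits < 32 and, as in the driver's callers, a field that straddles the boundary between the two lowest words; this is an illustration, not the driver file itself):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define BITMASK(bits)   ((1u << (bits)) - 1)   /* assumes bits < 32 */

static int ale_get_field(const uint32_t *ale_entry, uint32_t start, uint32_t bits)
{
    int idx = start / 32;
    int idx2 = (start + bits - 1) / 32;
    uint32_t hi_val = 0;

    if (idx != idx2) {                 /* field crosses a word */
        idx2 = 2 - idx2;               /* words are stored MSW first */
        hi_val = ale_entry[idx2] << ((idx2 * 32) - start);
    }
    start -= idx * 32;
    idx = 2 - idx;
    return (hi_val + (ale_entry[idx] >> start)) & BITMASK(bits);
}

static void ale_set_field(uint32_t *ale_entry, uint32_t start, uint32_t bits,
                          uint32_t value)
{
    int idx = start / 32;
    int idx2 = (start + bits - 1) / 32;

    value &= BITMASK(bits);
    if (idx != idx2) {                 /* write the spill-over bits */
        idx2 = 2 - idx2;
        ale_entry[idx2] &= ~BITMASK(bits + start - (idx2 * 32));
        ale_entry[idx2] |= value >> ((idx2 * 32) - start);
    }
    start -= idx * 32;
    idx = 2 - idx;
    ale_entry[idx] &= ~(BITMASK(bits) << start);
    ale_entry[idx] |= value << start;
}

int main(void)
{
    uint32_t entry[3] = { 0, 0, 0 };

    ale_set_field(entry, 30, 4, 0xb);  /* straddles words 2 and 1 */
    assert(ale_get_field(entry, 30, 4) == 0xb);
    printf("round-trip ok: entry = %08x %08x %08x\n",
           entry[0], entry[1], entry[2]);
    return 0;
}
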
+diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
+index 53598210be6cb..2c4e6de8f4d9f 100644
+--- a/drivers/net/phy/phy_device.c
++++ b/drivers/net/phy/phy_device.c
+@@ -3452,23 +3452,30 @@ static int __init phy_init(void)
+ {
+ int rc;
+
++ ethtool_set_ethtool_phy_ops(&phy_ethtool_phy_ops);
++
+ rc = mdio_bus_init();
+ if (rc)
+- return rc;
++ goto err_ethtool_phy_ops;
+
+- ethtool_set_ethtool_phy_ops(&phy_ethtool_phy_ops);
+ features_init();
+
+ rc = phy_driver_register(&genphy_c45_driver, THIS_MODULE);
+ if (rc)
+- goto err_c45;
++ goto err_mdio_bus;
+
+ rc = phy_driver_register(&genphy_driver, THIS_MODULE);
+- if (rc) {
+- phy_driver_unregister(&genphy_c45_driver);
++ if (rc)
++ goto err_c45;
++
++ return 0;
++
+ err_c45:
+- mdio_bus_exit();
+- }
++ phy_driver_unregister(&genphy_c45_driver);
++err_mdio_bus:
++ mdio_bus_exit();
++err_ethtool_phy_ops:
++ ethtool_set_ethtool_phy_ops(NULL);
+
+ return rc;
+ }
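
The phy_init() rework above replaces the partial, nested cleanup with the usual ladder of error labels, so every successfully completed step is undone in reverse order (now including ethtool_set_ethtool_phy_ops(NULL)). A generic standalone sketch of the same unwind pattern (hypothetical setup steps, not the phylib code):

#include <stdio.h>

static int step_a_ok, step_b_ok;

static int setup_a(void) { step_a_ok = 1; return 0; }
static int setup_b(void) { step_b_ok = 1; return 0; }
static int setup_c(void) { return -1; }   /* pretend the last step fails */

static void teardown_a(void) { step_a_ok = 0; }
static void teardown_b(void) { step_b_ok = 0; }

static int subsystem_init(void)
{
    int rc;

    rc = setup_a();
    if (rc)
        return rc;

    rc = setup_b();
    if (rc)
        goto err_a;

    rc = setup_c();
    if (rc)
        goto err_b;

    return 0;

err_b:                  /* undo in reverse order of setup */
    teardown_b();
err_a:
    teardown_a();
    return rc;
}

int main(void)
{
    int rc = subsystem_init();

    printf("init: %d, a=%d b=%d\n", rc, step_a_ok, step_b_ok);
    return 0;
}
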
+diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
+index bdb3a76a352e4..6043e63b42f97 100644
+--- a/drivers/net/vrf.c
++++ b/drivers/net/vrf.c
+@@ -664,7 +664,7 @@ static int vrf_finish_output6(struct net *net, struct sock *sk,
+ skb->protocol = htons(ETH_P_IPV6);
+ skb->dev = dev;
+
+- rcu_read_lock_bh();
++ rcu_read_lock();
+ nexthop = rt6_nexthop((struct rt6_info *)dst, &ipv6_hdr(skb)->daddr);
+ neigh = __ipv6_neigh_lookup_noref(dst->dev, nexthop);
+ if (unlikely(!neigh))
+@@ -672,10 +672,10 @@ static int vrf_finish_output6(struct net *net, struct sock *sk,
+ if (!IS_ERR(neigh)) {
+ sock_confirm_neigh(skb, neigh);
+ ret = neigh_output(neigh, skb, false);
+- rcu_read_unlock_bh();
++ rcu_read_unlock();
+ return ret;
+ }
+- rcu_read_unlock_bh();
++ rcu_read_unlock();
+
+ IP6_INC_STATS(dev_net(dst->dev),
+ ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
+@@ -889,7 +889,7 @@ static int vrf_finish_output(struct net *net, struct sock *sk, struct sk_buff *s
+ }
+ }
+
+- rcu_read_lock_bh();
++ rcu_read_lock();
+
+ neigh = ip_neigh_for_gw(rt, skb, &is_v6gw);
+ if (!IS_ERR(neigh)) {
+@@ -898,11 +898,11 @@ static int vrf_finish_output(struct net *net, struct sock *sk, struct sk_buff *s
+ sock_confirm_neigh(skb, neigh);
+ /* if crossing protocols, can not use the cached header */
+ ret = neigh_output(neigh, skb, is_v6gw);
+- rcu_read_unlock_bh();
++ rcu_read_unlock();
+ return ret;
+ }
+
+- rcu_read_unlock_bh();
++ rcu_read_unlock();
+ vrf_tx_error(skb->dev, skb);
+ return -EINVAL;
+ }
+diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c
+index 9de23c11e18bb..8ab1a62351b98 100644
+--- a/drivers/net/wireless/ath/ath11k/core.c
++++ b/drivers/net/wireless/ath/ath11k/core.c
+@@ -962,7 +962,8 @@ int ath11k_core_check_dt(struct ath11k_base *ab)
+ }
+
+ static int __ath11k_core_create_board_name(struct ath11k_base *ab, char *name,
+- size_t name_len, bool with_variant)
++ size_t name_len, bool with_variant,
++ bool bus_type_mode)
+ {
+ /* strlen(',variant=') + strlen(ab->qmi.target.bdf_ext) */
+ char variant[9 + ATH11K_QMI_BDF_EXT_STR_LENGTH] = { 0 };
+@@ -973,15 +974,20 @@ static int __ath11k_core_create_board_name(struct ath11k_base *ab, char *name,
+
+ switch (ab->id.bdf_search) {
+ case ATH11K_BDF_SEARCH_BUS_AND_BOARD:
+- scnprintf(name, name_len,
+- "bus=%s,vendor=%04x,device=%04x,subsystem-vendor=%04x,subsystem-device=%04x,qmi-chip-id=%d,qmi-board-id=%d%s",
+- ath11k_bus_str(ab->hif.bus),
+- ab->id.vendor, ab->id.device,
+- ab->id.subsystem_vendor,
+- ab->id.subsystem_device,
+- ab->qmi.target.chip_id,
+- ab->qmi.target.board_id,
+- variant);
++ if (bus_type_mode)
++ scnprintf(name, name_len,
++ "bus=%s",
++ ath11k_bus_str(ab->hif.bus));
++ else
++ scnprintf(name, name_len,
++ "bus=%s,vendor=%04x,device=%04x,subsystem-vendor=%04x,subsystem-device=%04x,qmi-chip-id=%d,qmi-board-id=%d%s",
++ ath11k_bus_str(ab->hif.bus),
++ ab->id.vendor, ab->id.device,
++ ab->id.subsystem_vendor,
++ ab->id.subsystem_device,
++ ab->qmi.target.chip_id,
++ ab->qmi.target.board_id,
++ variant);
+ break;
+ default:
+ scnprintf(name, name_len,
+@@ -1000,13 +1006,19 @@ static int __ath11k_core_create_board_name(struct ath11k_base *ab, char *name,
+ static int ath11k_core_create_board_name(struct ath11k_base *ab, char *name,
+ size_t name_len)
+ {
+- return __ath11k_core_create_board_name(ab, name, name_len, true);
++ return __ath11k_core_create_board_name(ab, name, name_len, true, false);
+ }
+
+ static int ath11k_core_create_fallback_board_name(struct ath11k_base *ab, char *name,
+ size_t name_len)
+ {
+- return __ath11k_core_create_board_name(ab, name, name_len, false);
++ return __ath11k_core_create_board_name(ab, name, name_len, false, false);
++}
++
++static int ath11k_core_create_bus_type_board_name(struct ath11k_base *ab, char *name,
++ size_t name_len)
++{
++ return __ath11k_core_create_board_name(ab, name, name_len, false, true);
+ }
+
+ const struct firmware *ath11k_core_firmware_request(struct ath11k_base *ab,
+@@ -1310,7 +1322,7 @@ success:
+
+ int ath11k_core_fetch_regdb(struct ath11k_base *ab, struct ath11k_board_data *bd)
+ {
+- char boardname[BOARD_NAME_SIZE];
++ char boardname[BOARD_NAME_SIZE], default_boardname[BOARD_NAME_SIZE];
+ int ret;
+
+ ret = ath11k_core_create_board_name(ab, boardname, BOARD_NAME_SIZE);
+@@ -1327,6 +1339,21 @@ int ath11k_core_fetch_regdb(struct ath11k_base *ab, struct ath11k_board_data *bd
+ if (!ret)
+ goto exit;
+
++ ret = ath11k_core_create_bus_type_board_name(ab, default_boardname,
++ BOARD_NAME_SIZE);
++ if (ret) {
++ ath11k_dbg(ab, ATH11K_DBG_BOOT,
++ "failed to create default board name for regdb: %d", ret);
++ goto exit;
++ }
++
++ ret = ath11k_core_fetch_board_data_api_n(ab, bd, default_boardname,
++ ATH11K_BD_IE_REGDB,
++ ATH11K_BD_IE_REGDB_NAME,
++ ATH11K_BD_IE_REGDB_DATA);
++ if (!ret)
++ goto exit;
++
+ ret = ath11k_core_fetch_board_data_api_1(ab, bd, ATH11K_REGDB_FILE_NAME);
+ if (ret)
+ ath11k_dbg(ab, ATH11K_DBG_BOOT, "failed to fetch %s from %s\n",
+diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
+index 1c93f1afccc57..01ff197b017f7 100644
+--- a/drivers/net/wireless/ath/ath11k/mac.c
++++ b/drivers/net/wireless/ath/ath11k/mac.c
+@@ -8892,7 +8892,7 @@ static int ath11k_mac_setup_channels_rates(struct ath11k *ar,
+ }
+
+ if (supported_bands & WMI_HOST_WLAN_5G_CAP) {
+- if (reg_cap->high_5ghz_chan >= ATH11K_MAX_6G_FREQ) {
++ if (reg_cap->high_5ghz_chan >= ATH11K_MIN_6G_FREQ) {
+ channels = kmemdup(ath11k_6ghz_channels,
+ sizeof(ath11k_6ghz_channels), GFP_KERNEL);
+ if (!channels) {
+@@ -9468,6 +9468,7 @@ void ath11k_mac_destroy(struct ath11k_base *ab)
+ if (!ar)
+ continue;
+
++ ath11k_fw_stats_free(&ar->fw_stats);
+ ieee80211_free_hw(ar->hw);
+ pdev->ar = NULL;
+ }
+diff --git a/drivers/net/wireless/ath/ath11k/wmi.c b/drivers/net/wireless/ath/ath11k/wmi.c
+index d0b59bc2905a9..42d9b29623a47 100644
+--- a/drivers/net/wireless/ath/ath11k/wmi.c
++++ b/drivers/net/wireless/ath/ath11k/wmi.c
+@@ -8103,6 +8103,11 @@ complete:
+ rcu_read_unlock();
+ spin_unlock_bh(&ar->data_lock);
+
++ /* Since the stats's pdev, vdev and beacon list are spliced and reinitialised
++ * at this point, no need to free the individual list.
++ */
++ return;
++
+ free:
+ ath11k_fw_stats_free(&stats);
+ }
+diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c
+index ee792822b4113..58acfe8fdf8c0 100644
+--- a/drivers/net/wireless/ath/ath12k/mac.c
++++ b/drivers/net/wireless/ath/ath12k/mac.c
+@@ -4425,6 +4425,7 @@ static int ath12k_mac_mgmt_tx_wmi(struct ath12k *ar, struct ath12k_vif *arvif,
+ int buf_id;
+ int ret;
+
++ ATH12K_SKB_CB(skb)->ar = ar;
+ spin_lock_bh(&ar->txmgmt_idr_lock);
+ buf_id = idr_alloc(&ar->txmgmt_idr, skb, 0,
+ ATH12K_TX_MGMT_NUM_PENDING_MAX, GFP_ATOMIC);
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c b/drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c
+index 8853821b37168..1e659bd07392a 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/mld-key.c
+@@ -1,6 +1,6 @@
+ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+ /*
+- * Copyright (C) 2022 Intel Corporation
++ * Copyright (C) 2022 - 2023 Intel Corporation
+ */
+ #include <linux/kernel.h>
+ #include <net/mac80211.h>
+@@ -179,9 +179,14 @@ int iwl_mvm_sec_key_add(struct iwl_mvm *mvm,
+ .u.add.key_flags = cpu_to_le32(key_flags),
+ .u.add.tx_seq = cpu_to_le64(atomic64_read(&keyconf->tx_pn)),
+ };
++ int max_key_len = sizeof(cmd.u.add.key);
+ int ret;
+
+- if (WARN_ON(keyconf->keylen > sizeof(cmd.u.add.key)))
++ if (keyconf->cipher == WLAN_CIPHER_SUITE_WEP40 ||
++ keyconf->cipher == WLAN_CIPHER_SUITE_WEP104)
++ max_key_len -= IWL_SEC_WEP_KEY_OFFSET;
++
++ if (WARN_ON(keyconf->keylen > max_key_len))
+ return -EINVAL;
+
+ if (WARN_ON(!sta_mask))
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/power.c b/drivers/net/wireless/intel/iwlwifi/mvm/power.c
+index ac1dae52556f8..19839cc44eb3d 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/power.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/power.c
+@@ -647,30 +647,32 @@ static void iwl_mvm_power_set_pm(struct iwl_mvm *mvm,
+ return;
+
+ /* enable PM on bss if bss stand alone */
+- if (vifs->bss_active && !vifs->p2p_active && !vifs->ap_active) {
++ if (bss_mvmvif && vifs->bss_active && !vifs->p2p_active &&
++ !vifs->ap_active) {
+ bss_mvmvif->pm_enabled = true;
+ return;
+ }
+
+ /* enable PM on p2p if p2p stand alone */
+- if (vifs->p2p_active && !vifs->bss_active && !vifs->ap_active) {
++ if (p2p_mvmvif && vifs->p2p_active && !vifs->bss_active &&
++ !vifs->ap_active) {
+ p2p_mvmvif->pm_enabled = true;
+ return;
+ }
+
+- if (vifs->bss_active && vifs->p2p_active)
++ if (p2p_mvmvif && bss_mvmvif && vifs->bss_active && vifs->p2p_active)
+ client_same_channel =
+ iwl_mvm_have_links_same_channel(bss_mvmvif, p2p_mvmvif);
+
+- if (vifs->bss_active && vifs->ap_active)
++ if (bss_mvmvif && ap_mvmvif && vifs->bss_active && vifs->ap_active)
+ ap_same_channel =
+ iwl_mvm_have_links_same_channel(bss_mvmvif, ap_mvmvif);
+
+ /* clients are not stand alone: enable PM if DCM */
+ if (!(client_same_channel || ap_same_channel)) {
+- if (vifs->bss_active)
++ if (bss_mvmvif && vifs->bss_active)
+ bss_mvmvif->pm_enabled = true;
+- if (vifs->p2p_active)
++ if (p2p_mvmvif && vifs->p2p_active)
+ p2p_mvmvif->pm_enabled = true;
+ return;
+ }
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+index b85e363544f8b..7f9a809dd081c 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/sta.c
+@@ -2884,7 +2884,7 @@ int iwl_mvm_sta_rx_agg(struct iwl_mvm *mvm, struct ieee80211_sta *sta,
+ }
+
+ if (iwl_mvm_has_new_rx_api(mvm) && start) {
+- u16 reorder_buf_size = buf_size * sizeof(baid_data->entries[0]);
++ u32 reorder_buf_size = buf_size * sizeof(baid_data->entries[0]);
+
+ /* sparse doesn't like the __align() so don't check */
+ #ifndef __CHECKER__
+diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c
+index 79115eb1c2852..e086664a4eaca 100644
+--- a/drivers/net/wireless/intel/iwlwifi/pcie/drv.c
++++ b/drivers/net/wireless/intel/iwlwifi/pcie/drv.c
+@@ -495,6 +495,7 @@ static const struct pci_device_id iwl_hw_card_ids[] = {
+ {IWL_PCI_DEVICE(0x7AF0, PCI_ANY_ID, iwl_so_trans_cfg)},
+ {IWL_PCI_DEVICE(0x51F0, PCI_ANY_ID, iwl_so_long_latency_trans_cfg)},
+ {IWL_PCI_DEVICE(0x51F1, PCI_ANY_ID, iwl_so_long_latency_imr_trans_cfg)},
++ {IWL_PCI_DEVICE(0x51F1, PCI_ANY_ID, iwl_so_long_latency_trans_cfg)},
+ {IWL_PCI_DEVICE(0x54F0, PCI_ANY_ID, iwl_so_long_latency_trans_cfg)},
+ {IWL_PCI_DEVICE(0x7F70, PCI_ANY_ID, iwl_so_trans_cfg)},
+
+@@ -544,6 +545,7 @@ static const struct iwl_dev_info iwl_dev_info_table[] = {
+ IWL_DEV_INFO(0x51F0, 0x1551, iwl9560_2ac_cfg_soc, iwl9560_killer_1550i_160_name),
+ IWL_DEV_INFO(0x51F0, 0x1691, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690s_name),
+ IWL_DEV_INFO(0x51F0, 0x1692, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690i_name),
++ IWL_DEV_INFO(0x51F1, 0x1692, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690i_name),
+ IWL_DEV_INFO(0x54F0, 0x1691, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690s_name),
+ IWL_DEV_INFO(0x54F0, 0x1692, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690i_name),
+ IWL_DEV_INFO(0x7A70, 0x1691, iwlax411_2ax_cfg_so_gf4_a0, iwl_ax411_killer_1690s_name),
+@@ -682,6 +684,8 @@ static const struct iwl_dev_info iwl_dev_info_table[] = {
+ IWL_DEV_INFO(0x2726, 0x1672, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675i_name),
+ IWL_DEV_INFO(0x51F0, 0x1671, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675s_name),
+ IWL_DEV_INFO(0x51F0, 0x1672, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675i_name),
++ IWL_DEV_INFO(0x51F1, 0x1671, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675s_name),
++ IWL_DEV_INFO(0x51F1, 0x1672, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675i_name),
+ IWL_DEV_INFO(0x54F0, 0x1671, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675s_name),
+ IWL_DEV_INFO(0x54F0, 0x1672, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675i_name),
+ IWL_DEV_INFO(0x7A70, 0x1671, iwlax211_2ax_cfg_so_gf_a0, iwl_ax211_killer_1675s_name),
+diff --git a/drivers/net/wireless/realtek/rtw88/sdio.c b/drivers/net/wireless/realtek/rtw88/sdio.c
+index 06fce7c3addaa..2c1fb2dabd40a 100644
+--- a/drivers/net/wireless/realtek/rtw88/sdio.c
++++ b/drivers/net/wireless/realtek/rtw88/sdio.c
+@@ -998,9 +998,9 @@ static void rtw_sdio_rxfifo_recv(struct rtw_dev *rtwdev, u32 rx_len)
+
+ static void rtw_sdio_rx_isr(struct rtw_dev *rtwdev)
+ {
+- u32 rx_len, total_rx_bytes = 0;
++ u32 rx_len, hisr, total_rx_bytes = 0;
+
+- while (total_rx_bytes < SZ_64K) {
++ do {
+ if (rtw_chip_wcpu_11n(rtwdev))
+ rx_len = rtw_read16(rtwdev, REG_SDIO_RX0_REQ_LEN);
+ else
+@@ -1012,7 +1012,25 @@ static void rtw_sdio_rx_isr(struct rtw_dev *rtwdev)
+ rtw_sdio_rxfifo_recv(rtwdev, rx_len);
+
+ total_rx_bytes += rx_len;
+- }
++
++ if (rtw_chip_wcpu_11n(rtwdev)) {
++ /* Stop if no more RX requests are pending, even if
++ * rx_len could be greater than zero in the next
++ * iteration. This is needed because the RX buffer may
++ * already contain data while either HW or FW are not
++ * done filling that buffer yet. Still reading the
++ * buffer can result in packets where
++ * rtw_rx_pkt_stat.pkt_len is zero or points beyond the
++ * end of the buffer.
++ */
++ hisr = rtw_read32(rtwdev, REG_SDIO_HISR);
++ } else {
++ /* RTW_WCPU_11AC chips have improved hardware or
++ * firmware and can use rx_len unconditionally.
++ */
++ hisr = REG_SDIO_HISR_RX_REQUEST;
++ }
++ } while (total_rx_bytes < SZ_64K && hisr & REG_SDIO_HISR_RX_REQUEST);
+ }
+
+ static void rtw_sdio_handle_interrupt(struct sdio_func *sdio_func)
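
The rtw_sdio_rx_isr() change above turns the fixed 64 KiB drain loop into a do/while that also stops as soon as the interrupt status no longer reports pending RX, because on the 11n parts a non-zero length register can be observed before HW/FW have finished filling the buffer. A standalone sketch of a drain loop bounded by both a byte budget and a "more pending" flag (stubbed data source, hypothetical names):

#include <stdbool.h>
#include <stdio.h>

#define BUDGET  (64 * 1024)

/* stub "hardware": a handful of pending buffers, then nothing */
static const unsigned int pending[] = { 1500, 512, 9000, 64 };
static unsigned int cursor;

static unsigned int read_rx_len(void)
{
    return cursor < sizeof(pending) / sizeof(pending[0]) ? pending[cursor] : 0;
}

static void receive(unsigned int len)
{
    printf("received %u bytes\n", len);
    cursor++;
}

static bool rx_still_pending(void)
{
    /* models re-reading the interrupt status RX bit */
    return cursor < sizeof(pending) / sizeof(pending[0]);
}

static void rx_drain(void)
{
    unsigned int total = 0, len;
    bool more;

    do {
        len = read_rx_len();
        if (!len)
            break;

        receive(len);
        total += len;

        /* stop early once nothing is pending, even if the next
         * length read might look non-zero */
        more = rx_still_pending();
    } while (total < BUDGET && more);
}

int main(void)
{
    rx_drain();
    return 0;
}
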
+diff --git a/drivers/net/wireless/virtual/mac80211_hwsim.c b/drivers/net/wireless/virtual/mac80211_hwsim.c
+index 89c7a1420381d..ed5af63025979 100644
+--- a/drivers/net/wireless/virtual/mac80211_hwsim.c
++++ b/drivers/net/wireless/virtual/mac80211_hwsim.c
+@@ -4,7 +4,7 @@
+ * Copyright (c) 2008, Jouni Malinen <j@w1.fi>
+ * Copyright (c) 2011, Javier Lopez <jlopex@gmail.com>
+ * Copyright (c) 2016 - 2017 Intel Deutschland GmbH
+- * Copyright (C) 2018 - 2022 Intel Corporation
++ * Copyright (C) 2018 - 2023 Intel Corporation
+ */
+
+ /*
+@@ -1864,7 +1864,7 @@ mac80211_hwsim_select_tx_link(struct mac80211_hwsim_data *data,
+
+ WARN_ON(is_multicast_ether_addr(hdr->addr1));
+
+- if (WARN_ON_ONCE(!sta->valid_links))
++ if (WARN_ON_ONCE(!sta || !sta->valid_links))
+ return &vif->bss_conf;
+
+ for (i = 0; i < ARRAY_SIZE(vif->link_conf); i++) {
+diff --git a/drivers/of/platform.c b/drivers/of/platform.c
+index 78ae841874490..e46482cef9c7d 100644
+--- a/drivers/of/platform.c
++++ b/drivers/of/platform.c
+@@ -553,7 +553,7 @@ static int __init of_platform_default_populate_init(void)
+ if (!of_get_property(node, "linux,opened", NULL) ||
+ !of_get_property(node, "linux,boot-display", NULL))
+ continue;
+- dev = of_platform_device_create(node, "of-display.0", NULL);
++ dev = of_platform_device_create(node, "of-display", NULL);
+ of_node_put(node);
+ if (WARN_ON(!dev))
+ return -ENOMEM;
+diff --git a/drivers/pinctrl/renesas/pinctrl-rzg2l.c b/drivers/pinctrl/renesas/pinctrl-rzg2l.c
+index 9511d920565e9..b53d26167da52 100644
+--- a/drivers/pinctrl/renesas/pinctrl-rzg2l.c
++++ b/drivers/pinctrl/renesas/pinctrl-rzg2l.c
+@@ -249,6 +249,7 @@ static int rzg2l_map_add_config(struct pinctrl_map *map,
+
+ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ struct device_node *np,
++ struct device_node *parent,
+ struct pinctrl_map **map,
+ unsigned int *num_maps,
+ unsigned int *index)
+@@ -266,6 +267,7 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ struct property *prop;
+ int ret, gsel, fsel;
+ const char **pin_fn;
++ const char *name;
+ const char *pin;
+
+ pinmux = of_find_property(np, "pinmux", NULL);
+@@ -349,8 +351,19 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ psel_val[i] = MUX_FUNC(value);
+ }
+
++ if (parent) {
++ name = devm_kasprintf(pctrl->dev, GFP_KERNEL, "%pOFn.%pOFn",
++ parent, np);
++ if (!name) {
++ ret = -ENOMEM;
++ goto done;
++ }
++ } else {
++ name = np->name;
++ }
++
+ /* Register a single pin group listing all the pins we read from DT */
+- gsel = pinctrl_generic_add_group(pctldev, np->name, pins, num_pinmux, NULL);
++ gsel = pinctrl_generic_add_group(pctldev, name, pins, num_pinmux, NULL);
+ if (gsel < 0) {
+ ret = gsel;
+ goto done;
+@@ -360,17 +373,16 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ * Register a single group function where the 'data' is an array PSEL
+ * register values read from DT.
+ */
+- pin_fn[0] = np->name;
+- fsel = pinmux_generic_add_function(pctldev, np->name, pin_fn, 1,
+- psel_val);
++ pin_fn[0] = name;
++ fsel = pinmux_generic_add_function(pctldev, name, pin_fn, 1, psel_val);
+ if (fsel < 0) {
+ ret = fsel;
+ goto remove_group;
+ }
+
+ maps[idx].type = PIN_MAP_TYPE_MUX_GROUP;
+- maps[idx].data.mux.group = np->name;
+- maps[idx].data.mux.function = np->name;
++ maps[idx].data.mux.group = name;
++ maps[idx].data.mux.function = name;
+ idx++;
+
+ dev_dbg(pctrl->dev, "Parsed %pOF with %d pins\n", np, num_pinmux);
+@@ -417,7 +429,7 @@ static int rzg2l_dt_node_to_map(struct pinctrl_dev *pctldev,
+ index = 0;
+
+ for_each_child_of_node(np, child) {
+- ret = rzg2l_dt_subnode_to_map(pctldev, child, map,
++ ret = rzg2l_dt_subnode_to_map(pctldev, child, np, map,
+ num_maps, &index);
+ if (ret < 0) {
+ of_node_put(child);
+@@ -426,7 +438,7 @@ static int rzg2l_dt_node_to_map(struct pinctrl_dev *pctldev,
+ }
+
+ if (*num_maps == 0) {
+- ret = rzg2l_dt_subnode_to_map(pctldev, np, map,
++ ret = rzg2l_dt_subnode_to_map(pctldev, np, NULL, map,
+ num_maps, &index);
+ if (ret < 0)
+ goto done;
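
The rzg2l pinctrl change (and the matching rzv2m change below) registers groups and functions under "parent.child" names when a subnode is parsed, so two identically named subnodes under different parents no longer collide. A small standalone sketch of composing such names (plain snprintf in place of devm_kasprintf; hypothetical node names):

#include <stdio.h>

/* Build "parent.child" when there is a parent node, otherwise fall back
 * to the subnode name alone - mirroring the "%pOFn.%pOFn" composition. */
static const char *make_group_name(char *buf, size_t len,
                                   const char *parent, const char *child)
{
    if (parent) {
        snprintf(buf, len, "%s.%s", parent, child);
        return buf;
    }
    return child;
}

int main(void)
{
    char a[64], b[64];

    /* the same subnode name under two different parents stays unique */
    printf("%s\n", make_group_name(a, sizeof(a), "i2c0-pins", "group"));
    printf("%s\n", make_group_name(b, sizeof(b), "i2c1-pins", "group"));
    printf("%s\n", make_group_name(a, sizeof(a), NULL, "scif0-pins"));
    return 0;
}
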
+diff --git a/drivers/pinctrl/renesas/pinctrl-rzv2m.c b/drivers/pinctrl/renesas/pinctrl-rzv2m.c
+index e5472293bc7fb..35b23c1a5684d 100644
+--- a/drivers/pinctrl/renesas/pinctrl-rzv2m.c
++++ b/drivers/pinctrl/renesas/pinctrl-rzv2m.c
+@@ -209,6 +209,7 @@ static int rzv2m_map_add_config(struct pinctrl_map *map,
+
+ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ struct device_node *np,
++ struct device_node *parent,
+ struct pinctrl_map **map,
+ unsigned int *num_maps,
+ unsigned int *index)
+@@ -226,6 +227,7 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ struct property *prop;
+ int ret, gsel, fsel;
+ const char **pin_fn;
++ const char *name;
+ const char *pin;
+
+ pinmux = of_find_property(np, "pinmux", NULL);
+@@ -309,8 +311,19 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ psel_val[i] = MUX_FUNC(value);
+ }
+
++ if (parent) {
++ name = devm_kasprintf(pctrl->dev, GFP_KERNEL, "%pOFn.%pOFn",
++ parent, np);
++ if (!name) {
++ ret = -ENOMEM;
++ goto done;
++ }
++ } else {
++ name = np->name;
++ }
++
+ /* Register a single pin group listing all the pins we read from DT */
+- gsel = pinctrl_generic_add_group(pctldev, np->name, pins, num_pinmux, NULL);
++ gsel = pinctrl_generic_add_group(pctldev, name, pins, num_pinmux, NULL);
+ if (gsel < 0) {
+ ret = gsel;
+ goto done;
+@@ -320,17 +333,16 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ * Register a single group function where the 'data' is an array PSEL
+ * register values read from DT.
+ */
+- pin_fn[0] = np->name;
+- fsel = pinmux_generic_add_function(pctldev, np->name, pin_fn, 1,
+- psel_val);
++ pin_fn[0] = name;
++ fsel = pinmux_generic_add_function(pctldev, name, pin_fn, 1, psel_val);
+ if (fsel < 0) {
+ ret = fsel;
+ goto remove_group;
+ }
+
+ maps[idx].type = PIN_MAP_TYPE_MUX_GROUP;
+- maps[idx].data.mux.group = np->name;
+- maps[idx].data.mux.function = np->name;
++ maps[idx].data.mux.group = name;
++ maps[idx].data.mux.function = name;
+ idx++;
+
+ dev_dbg(pctrl->dev, "Parsed %pOF with %d pins\n", np, num_pinmux);
+@@ -377,7 +389,7 @@ static int rzv2m_dt_node_to_map(struct pinctrl_dev *pctldev,
+ index = 0;
+
+ for_each_child_of_node(np, child) {
+- ret = rzv2m_dt_subnode_to_map(pctldev, child, map,
++ ret = rzv2m_dt_subnode_to_map(pctldev, child, np, map,
+ num_maps, &index);
+ if (ret < 0) {
+ of_node_put(child);
+@@ -386,7 +398,7 @@ static int rzv2m_dt_node_to_map(struct pinctrl_dev *pctldev,
+ }
+
+ if (*num_maps == 0) {
+- ret = rzv2m_dt_subnode_to_map(pctldev, np, map,
++ ret = rzv2m_dt_subnode_to_map(pctldev, np, NULL, map,
+ num_maps, &index);
+ if (ret < 0)
+ goto done;
+diff --git a/drivers/regulator/da9063-regulator.c b/drivers/regulator/da9063-regulator.c
+index c5dd77be558b6..dfd5ec9f75c90 100644
+--- a/drivers/regulator/da9063-regulator.c
++++ b/drivers/regulator/da9063-regulator.c
+@@ -778,6 +778,9 @@ static int da9063_check_xvp_constraints(struct regulator_config *config)
+ const struct notification_limit *uv_l = &constr->under_voltage_limits;
+ const struct notification_limit *ov_l = &constr->over_voltage_limits;
+
++ if (!config->init_data) /* No config in DT, pointers will be invalid */
++ return 0;
++
+ /* make sure that only one severity is used to clarify if unchanged, enabled or disabled */
+ if ((!!uv_l->prot + !!uv_l->err + !!uv_l->warn) > 1) {
+ dev_err(config->dev, "%s: at most one voltage monitoring severity allowed!\n",
+diff --git a/drivers/s390/crypto/zcrypt_msgtype6.c b/drivers/s390/crypto/zcrypt_msgtype6.c
+index 9eb5153737007..f7fb6836eaaa1 100644
+--- a/drivers/s390/crypto/zcrypt_msgtype6.c
++++ b/drivers/s390/crypto/zcrypt_msgtype6.c
+@@ -1111,23 +1111,36 @@ static long zcrypt_msgtype6_send_cprb(bool userspace, struct zcrypt_queue *zq,
+ struct ica_xcRB *xcrb,
+ struct ap_message *ap_msg)
+ {
+- int rc;
+ struct response_type *rtype = ap_msg->private;
+ struct {
+ struct type6_hdr hdr;
+ struct CPRBX cprbx;
+ /* ... more data blocks ... */
+ } __packed * msg = ap_msg->msg;
+-
+- /*
+- * Set the queue's reply buffer length minus 128 byte padding
+- * as reply limit for the card firmware.
+- */
+- msg->hdr.fromcardlen1 = min_t(unsigned int, msg->hdr.fromcardlen1,
+- zq->reply.bufsize - 128);
+- if (msg->hdr.fromcardlen2)
+- msg->hdr.fromcardlen2 =
+- zq->reply.bufsize - msg->hdr.fromcardlen1 - 128;
++ unsigned int max_payload_size;
++ int rc, delta;
++
++ /* calculate maximum payload for this card and msg type */
++ max_payload_size = zq->reply.bufsize - sizeof(struct type86_fmt2_msg);
++
++ /* limit each of the two from fields to the maximum payload size */
++ msg->hdr.fromcardlen1 = min(msg->hdr.fromcardlen1, max_payload_size);
++ msg->hdr.fromcardlen2 = min(msg->hdr.fromcardlen2, max_payload_size);
++
++ /* calculate delta if the sum of both exceeds max payload size */
++ delta = msg->hdr.fromcardlen1 + msg->hdr.fromcardlen2
++ - max_payload_size;
++ if (delta > 0) {
++ /*
++ * Sum exceeds maximum payload size, prune fromcardlen1
++ * (always trust fromcardlen2)
++ */
++ if (delta > msg->hdr.fromcardlen1) {
++ rc = -EINVAL;
++ goto out;
++ }
++ msg->hdr.fromcardlen1 -= delta;
++ }
+
+ init_completion(&rtype->work);
+ rc = ap_queue_message(zq->queue, ap_msg);
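
The zcrypt change above clamps both reply-length fields to the payload that fits in the queue's reply buffer (buffer size minus the type86 header) and, if their sum still exceeds it, trims fromcardlen1 by the difference, failing when even that is not enough. The same arithmetic in a standalone form (hypothetical names, plain C):

#include <stdio.h>

/* Clamp two requested lengths to a shared maximum payload. len2 is
 * trusted, so any remaining excess is taken out of len1. Returns -1 if
 * the request cannot be satisfied. */
static int clamp_reply_lens(unsigned int *len1, unsigned int *len2,
                            unsigned int max_payload)
{
    int delta;

    if (*len1 > max_payload)
        *len1 = max_payload;
    if (*len2 > max_payload)
        *len2 = max_payload;

    delta = (int)(*len1 + *len2) - (int)max_payload;
    if (delta > 0) {
        if ((unsigned int)delta > *len1)
            return -1;   /* even dropping len1 entirely is not enough */
        *len1 -= delta;
    }
    return 0;
}

int main(void)
{
    unsigned int l1 = 6000, l2 = 3000;

    if (!clamp_reply_lens(&l1, &l2, 8000))
        printf("clamped to %u + %u (<= 8000)\n", l1, l2);
    return 0;
}
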
+diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
+index 037f8c98a6d36..3c766a9ade381 100644
+--- a/drivers/scsi/sg.c
++++ b/drivers/scsi/sg.c
+@@ -1496,6 +1496,11 @@ sg_add_device(struct device *cl_dev)
+ int error;
+ unsigned long iflags;
+
++ if (!blk_get_queue(scsidp->request_queue)) {
++ pr_warn("%s: get scsi_device queue failed\n", __func__);
++ return -ENODEV;
++ }
++
+ error = -ENOMEM;
+ cdev = cdev_alloc();
+ if (!cdev) {
+@@ -1553,6 +1558,7 @@ cdev_add_err:
+ out:
+ if (cdev)
+ cdev_del(cdev);
++ blk_put_queue(scsidp->request_queue);
+ return error;
+ }
+
+@@ -1560,6 +1566,7 @@ static void
+ sg_device_destroy(struct kref *kref)
+ {
+ struct sg_device *sdp = container_of(kref, struct sg_device, d_ref);
++ struct request_queue *q = sdp->device->request_queue;
+ unsigned long flags;
+
+ /* CAUTION! Note that the device can still be found via idr_find()
+@@ -1567,6 +1574,9 @@ sg_device_destroy(struct kref *kref)
+ * any other cleanup.
+ */
+
++ blk_trace_remove(q);
++ blk_put_queue(q);
++
+ write_lock_irqsave(&sg_index_lock, flags);
+ idr_remove(&sg_index_idr, sdp->index);
+ write_unlock_irqrestore(&sg_index_lock, flags);
+diff --git a/drivers/spi/spi-bcm63xx.c b/drivers/spi/spi-bcm63xx.c
+index 9aecb77c3d892..07b5b71b23520 100644
+--- a/drivers/spi/spi-bcm63xx.c
++++ b/drivers/spi/spi-bcm63xx.c
+@@ -126,7 +126,7 @@ enum bcm63xx_regs_spi {
+ SPI_MSG_DATA_SIZE,
+ };
+
+-#define BCM63XX_SPI_MAX_PREPEND 15
++#define BCM63XX_SPI_MAX_PREPEND 7
+
+ #define BCM63XX_SPI_MAX_CS 8
+ #define BCM63XX_SPI_BUS_NUM 0
+diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
+index 32449bef4415a..abf10f92415dc 100644
+--- a/drivers/spi/spi-cadence-quadspi.c
++++ b/drivers/spi/spi-cadence-quadspi.c
+@@ -40,6 +40,7 @@
+ #define CQSPI_SUPPORT_EXTERNAL_DMA BIT(2)
+ #define CQSPI_NO_SUPPORT_WR_COMPLETION BIT(3)
+ #define CQSPI_SLOW_SRAM BIT(4)
++#define CQSPI_NEEDS_APB_AHB_HAZARD_WAR BIT(5)
+
+ /* Capabilities */
+ #define CQSPI_SUPPORTS_OCTAL BIT(0)
+@@ -90,6 +91,7 @@ struct cqspi_st {
+ u32 pd_dev_id;
+ bool wr_completion;
+ bool slow_sram;
++ bool apb_ahb_hazard;
+ };
+
+ struct cqspi_driver_platdata {
+@@ -1027,6 +1029,13 @@ static int cqspi_indirect_write_execute(struct cqspi_flash_pdata *f_pdata,
+ if (cqspi->wr_delay)
+ ndelay(cqspi->wr_delay);
+
++ /*
++ * If a hazard exists between the APB and AHB interfaces, perform a
++ * dummy readback from the controller to ensure synchronization.
++ */
++ if (cqspi->apb_ahb_hazard)
++ readl(reg_base + CQSPI_REG_INDIRECTWR);
++
+ while (remaining > 0) {
+ size_t write_words, mod_bytes;
+
+@@ -1754,6 +1763,8 @@ static int cqspi_probe(struct platform_device *pdev)
+ cqspi->wr_completion = false;
+ if (ddata->quirks & CQSPI_SLOW_SRAM)
+ cqspi->slow_sram = true;
++ if (ddata->quirks & CQSPI_NEEDS_APB_AHB_HAZARD_WAR)
++ cqspi->apb_ahb_hazard = true;
+
+ if (of_device_is_compatible(pdev->dev.of_node,
+ "xlnx,versal-ospi-1.0")) {
+@@ -1888,6 +1899,10 @@ static const struct cqspi_driver_platdata jh7110_qspi = {
+ .quirks = CQSPI_DISABLE_DAC_MODE,
+ };
+
++static const struct cqspi_driver_platdata pensando_cdns_qspi = {
++ .quirks = CQSPI_NEEDS_APB_AHB_HAZARD_WAR | CQSPI_DISABLE_DAC_MODE,
++};
++
+ static const struct of_device_id cqspi_dt_ids[] = {
+ {
+ .compatible = "cdns,qspi-nor",
+@@ -1917,6 +1932,10 @@ static const struct of_device_id cqspi_dt_ids[] = {
+ .compatible = "starfive,jh7110-qspi",
+ .data = &jh7110_qspi,
+ },
++ {
++ .compatible = "amd,pensando-elba-qspi",
++ .data = &pensando_cdns_qspi,
++ },
+ { /* end of table */ }
+ };
+
+diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c
+index 15f5e9cb54ad4..a963bc96c223f 100644
+--- a/drivers/spi/spi-dw-mmio.c
++++ b/drivers/spi/spi-dw-mmio.c
+@@ -236,6 +236,24 @@ static int dw_spi_intel_init(struct platform_device *pdev,
+ return 0;
+ }
+
++/*
++ * DMA-based mem ops are not configured for this device and are not tested.
++ */
++static int dw_spi_mountevans_imc_init(struct platform_device *pdev,
++ struct dw_spi_mmio *dwsmmio)
++{
++ /*
++ * The Intel Mount Evans SoC's Integrated Management Complex DW
++ * apb_ssi_v4.02a controller has an errata where a full TX FIFO can
++ * result in data corruption. The suggested workaround is to never
++ * completely fill the FIFO. The TX FIFO has a size of 32 so the
++ * fifo_len is set to 31.
++ */
++ dwsmmio->dws.fifo_len = 31;
++
++ return 0;
++}
++
+ static int dw_spi_canaan_k210_init(struct platform_device *pdev,
+ struct dw_spi_mmio *dwsmmio)
+ {
+@@ -405,6 +423,10 @@ static const struct of_device_id dw_spi_mmio_of_match[] = {
+ { .compatible = "snps,dwc-ssi-1.01a", .data = dw_spi_hssi_init},
+ { .compatible = "intel,keembay-ssi", .data = dw_spi_intel_init},
+ { .compatible = "intel,thunderbay-ssi", .data = dw_spi_intel_init},
++ {
++ .compatible = "intel,mountevans-imc-ssi",
++ .data = dw_spi_mountevans_imc_init,
++ },
+ { .compatible = "microchip,sparx5-spi", dw_spi_mscc_sparx5_init},
+ { .compatible = "canaan,k210-spi", dw_spi_canaan_k210_init},
+ { .compatible = "amd,pensando-elba-spi", .data = dw_spi_elba_init},
+diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c
+index 7ac17f0d18a95..1a8b31e20baf2 100644
+--- a/drivers/spi/spi-s3c64xx.c
++++ b/drivers/spi/spi-s3c64xx.c
+@@ -668,6 +668,8 @@ static int s3c64xx_spi_config(struct s3c64xx_spi_driver_data *sdd)
+
+ if ((sdd->cur_mode & SPI_LOOP) && sdd->port_conf->has_loopback)
+ val |= S3C64XX_SPI_MODE_SELF_LOOPBACK;
++ else
++ val &= ~S3C64XX_SPI_MODE_SELF_LOOPBACK;
+
+ writel(val, regs + S3C64XX_SPI_MODE_CFG);
+
+diff --git a/drivers/video/fbdev/au1200fb.c b/drivers/video/fbdev/au1200fb.c
+index aed88ce45bf09..d8f085d4ede30 100644
+--- a/drivers/video/fbdev/au1200fb.c
++++ b/drivers/video/fbdev/au1200fb.c
+@@ -1732,6 +1732,9 @@ static int au1200fb_drv_probe(struct platform_device *dev)
+
+ /* Now hook interrupt too */
+ irq = platform_get_irq(dev, 0);
++ if (irq < 0)
++ return irq;
++
+ ret = request_irq(irq, au1200fb_handle_irq,
+ IRQF_SHARED, "lcd", (void *)dev);
+ if (ret) {
+diff --git a/drivers/video/fbdev/imxfb.c b/drivers/video/fbdev/imxfb.c
+index adf36690c342b..c8b1c73412d36 100644
+--- a/drivers/video/fbdev/imxfb.c
++++ b/drivers/video/fbdev/imxfb.c
+@@ -613,10 +613,10 @@ static int imxfb_activate_var(struct fb_var_screeninfo *var, struct fb_info *inf
+ if (var->hsync_len < 1 || var->hsync_len > 64)
+ printk(KERN_ERR "%s: invalid hsync_len %d\n",
+ info->fix.id, var->hsync_len);
+- if (var->left_margin > 255)
++ if (var->left_margin < 3 || var->left_margin > 255)
+ printk(KERN_ERR "%s: invalid left_margin %d\n",
+ info->fix.id, var->left_margin);
+- if (var->right_margin > 255)
++ if (var->right_margin < 1 || var->right_margin > 255)
+ printk(KERN_ERR "%s: invalid right_margin %d\n",
+ info->fix.id, var->right_margin);
+ if (var->yres < 1 || var->yres > ymax_mask)
+@@ -1043,7 +1043,6 @@ failed_cmap:
+ failed_map:
+ failed_ioremap:
+ failed_getclock:
+- release_mem_region(res->start, resource_size(res));
+ failed_of_parse:
+ kfree(info->pseudo_palette);
+ failed_init:
+diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
+index e2b3448476490..152b3ec911599 100644
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -2084,6 +2084,7 @@ static int exclude_super_stripes(struct btrfs_block_group *cache)
+
+ /* Shouldn't have super stripes in sequential zones */
+ if (zoned && nr) {
++ kfree(logical);
+ btrfs_err(fs_info,
+ "zoned: block group %llu must not contain super block",
+ cache->start);
+diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
+index 4912d624ca3d3..886e661a218fc 100644
+--- a/fs/btrfs/ctree.c
++++ b/fs/btrfs/ctree.c
+@@ -417,9 +417,13 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
+ &refs, &flags);
+ if (ret)
+ return ret;
+- if (refs == 0) {
+- ret = -EROFS;
+- btrfs_handle_fs_error(fs_info, ret, NULL);
++ if (unlikely(refs == 0)) {
++ btrfs_crit(fs_info,
++ "found 0 references for tree block at bytenr %llu level %d root %llu",
++ buf->start, btrfs_header_level(buf),
++ btrfs_root_id(root));
++ ret = -EUCLEAN;
++ btrfs_abort_transaction(trans, ret);
+ return ret;
+ }
+ } else {
+diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
+index fc59eb4024438..795b30913c542 100644
+--- a/fs/btrfs/disk-io.c
++++ b/fs/btrfs/disk-io.c
+@@ -2265,6 +2265,9 @@ static int btrfs_init_csum_hash(struct btrfs_fs_info *fs_info, u16 csum_type)
+ if (!strstr(crypto_shash_driver_name(csum_shash), "generic"))
+ set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags);
+ break;
++ case BTRFS_CSUM_TYPE_XXHASH:
++ set_bit(BTRFS_FS_CSUM_IMPL_FAST, &fs_info->flags);
++ break;
+ default:
+ break;
+ }
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index e3ae55d8bae14..a37a6587efaf0 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -1592,38 +1592,7 @@ done:
+ set_page_writeback(page);
+ end_page_writeback(page);
+ }
+- /*
+- * Here we used to have a check for PageError() and then set @ret and
+- * call end_extent_writepage().
+- *
+- * But in fact setting @ret here will cause different error paths
+- * between subpage and regular sectorsize.
+- *
+- * For regular page size, we never submit current page, but only add
+- * current page to current bio.
+- * The bio submission can only happen in next page.
+- * Thus if we hit the PageError() branch, @ret is already set to
+- * non-zero value and will not get updated for regular sectorsize.
+- *
+- * But for subpage case, it's possible we submit part of current page,
+- * thus can get PageError() set by submitted bio of the same page,
+- * while our @ret is still 0.
+- *
+- * So here we unify the behavior and don't set @ret.
+- * Error can still be properly passed to higher layer as page will
+- * be set error, here we just don't handle the IO failure.
+- *
+- * NOTE: This is just a hotfix for subpage.
+- * The root fix will be properly ending ordered extent when we hit
+- * an error during writeback.
+- *
+- * But that needs a bigger refactoring, as we not only need to grab the
+- * submitted OE, but also need to know exactly at which bytenr we hit
+- * the error.
+- * Currently the full page based __extent_writepage_io() is not
+- * capable of that.
+- */
+- if (PageError(page))
++ if (ret)
+ end_extent_writepage(page, ret, page_start, page_end);
+ unlock_page(page);
+ ASSERT(ret <= 0);
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index 2e6eed4b1b3cc..c89071186388b 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -3546,11 +3546,14 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
+ found_key.type = BTRFS_INODE_ITEM_KEY;
+ found_key.offset = 0;
+ inode = btrfs_iget(fs_info->sb, last_objectid, root);
+- ret = PTR_ERR_OR_ZERO(inode);
+- if (ret && ret != -ENOENT)
+- goto out;
++ if (IS_ERR(inode)) {
++ ret = PTR_ERR(inode);
++ inode = NULL;
++ if (ret != -ENOENT)
++ goto out;
++ }
+
+- if (ret == -ENOENT && root == fs_info->tree_root) {
++ if (!inode && root == fs_info->tree_root) {
+ struct btrfs_root *dead_root;
+ int is_dead_root = 0;
+
+@@ -3611,17 +3614,17 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
+ * deleted but wasn't. The inode number may have been reused,
+ * but either way, we can delete the orphan item.
+ */
+- if (ret == -ENOENT || inode->i_nlink) {
+- if (!ret) {
++ if (!inode || inode->i_nlink) {
++ if (inode) {
+ ret = btrfs_drop_verity_items(BTRFS_I(inode));
+ iput(inode);
++ inode = NULL;
+ if (ret)
+ goto out;
+ }
+ trans = btrfs_start_transaction(root, 1);
+ if (IS_ERR(trans)) {
+ ret = PTR_ERR(trans);
+- iput(inode);
+ goto out;
+ }
+ btrfs_debug(fs_info, "auto deleting %Lu",
+@@ -3629,10 +3632,8 @@ int btrfs_orphan_cleanup(struct btrfs_root *root)
+ ret = btrfs_del_orphan_item(trans, root,
+ found_key.objectid);
+ btrfs_end_transaction(trans);
+- if (ret) {
+- iput(inode);
++ if (ret)
+ goto out;
+- }
+ continue;
+ }
+
+@@ -4734,9 +4735,6 @@ again:
+ ret = -ENOMEM;
+ goto out;
+ }
+- ret = set_page_extent_mapped(page);
+- if (ret < 0)
+- goto out_unlock;
+
+ if (!PageUptodate(page)) {
+ ret = btrfs_read_folio(NULL, page_folio(page));
+@@ -4751,6 +4749,17 @@ again:
+ goto out_unlock;
+ }
+ }
++
++ /*
++ * We unlock the page after the io is completed and then re-lock it
++ * above. release_folio() could have come in between that and cleared
++ * PagePrivate(), but left the page in the mapping. Set the page mapped
++ * here to make sure it's properly set for the subpage stuff.
++ */
++ ret = set_page_extent_mapped(page);
++ if (ret < 0)
++ goto out_unlock;
++
+ wait_on_page_writeback(page);
+
+ lock_extent(io_tree, block_start, block_end, &cached_state);
+diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
+index f8735b31da16f..360bf2522a871 100644
+--- a/fs/btrfs/qgroup.c
++++ b/fs/btrfs/qgroup.c
+@@ -4433,4 +4433,5 @@ void btrfs_qgroup_destroy_extent_records(struct btrfs_transaction *trans)
+ ulist_free(entry->old_roots);
+ kfree(entry);
+ }
++ *root = RB_ROOT;
+ }
+diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
+index 2fab37f062def..fd2cd30aeb0d8 100644
+--- a/fs/btrfs/raid56.c
++++ b/fs/btrfs/raid56.c
+@@ -71,7 +71,7 @@ static void rmw_rbio_work_locked(struct work_struct *work);
+ static void index_rbio_pages(struct btrfs_raid_bio *rbio);
+ static int alloc_rbio_pages(struct btrfs_raid_bio *rbio);
+
+-static int finish_parity_scrub(struct btrfs_raid_bio *rbio, int need_check);
++static int finish_parity_scrub(struct btrfs_raid_bio *rbio);
+ static void scrub_rbio_work_locked(struct work_struct *work);
+
+ static void free_raid_bio_pointers(struct btrfs_raid_bio *rbio)
+@@ -2404,7 +2404,7 @@ static int alloc_rbio_essential_pages(struct btrfs_raid_bio *rbio)
+ return 0;
+ }
+
+-static int finish_parity_scrub(struct btrfs_raid_bio *rbio, int need_check)
++static int finish_parity_scrub(struct btrfs_raid_bio *rbio)
+ {
+ struct btrfs_io_context *bioc = rbio->bioc;
+ const u32 sectorsize = bioc->fs_info->sectorsize;
+@@ -2445,9 +2445,6 @@ static int finish_parity_scrub(struct btrfs_raid_bio *rbio, int need_check)
+ */
+ clear_bit(RBIO_CACHE_READY_BIT, &rbio->flags);
+
+- if (!need_check)
+- goto writeback;
+-
+ p_sector.page = alloc_page(GFP_NOFS);
+ if (!p_sector.page)
+ return -ENOMEM;
+@@ -2516,7 +2513,6 @@ static int finish_parity_scrub(struct btrfs_raid_bio *rbio, int need_check)
+ q_sector.page = NULL;
+ }
+
+-writeback:
+ /*
+ * time to start writing. Make bios for everything from the
+ * higher layers (the bio_list in our rbio) and our p/q. Ignore
+@@ -2699,7 +2695,6 @@ static int scrub_assemble_read_bios(struct btrfs_raid_bio *rbio)
+
+ static void scrub_rbio(struct btrfs_raid_bio *rbio)
+ {
+- bool need_check = false;
+ int sector_nr;
+ int ret;
+
+@@ -2722,7 +2717,7 @@ static void scrub_rbio(struct btrfs_raid_bio *rbio)
+ * We have every sector properly prepared. Can finish the scrub
+ * and writeback the good content.
+ */
+- ret = finish_parity_scrub(rbio, need_check);
++ ret = finish_parity_scrub(rbio);
+ wait_event(rbio->io_wait, atomic_read(&rbio->stripes_pending) == 0);
+ for (sector_nr = 0; sector_nr < rbio->stripe_nsectors; sector_nr++) {
+ int found_errors;
+diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
+index 72a838c975345..436e15e3759da 100644
+--- a/fs/btrfs/volumes.c
++++ b/fs/btrfs/volumes.c
+@@ -4071,14 +4071,6 @@ static int alloc_profile_is_valid(u64 flags, int extended)
+ return has_single_bit_set(flags);
+ }
+
+-static inline int balance_need_close(struct btrfs_fs_info *fs_info)
+-{
+- /* cancel requested || normal exit path */
+- return atomic_read(&fs_info->balance_cancel_req) ||
+- (atomic_read(&fs_info->balance_pause_req) == 0 &&
+- atomic_read(&fs_info->balance_cancel_req) == 0);
+-}
+-
+ /*
+ * Validate target profile against allowed profiles and return true if it's OK.
+ * Otherwise print the error message and return false.
+@@ -4268,6 +4260,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
+ u64 num_devices;
+ unsigned seq;
+ bool reducing_redundancy;
++ bool paused = false;
+ int i;
+
+ if (btrfs_fs_closing(fs_info) ||
+@@ -4398,6 +4391,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
+ if (ret == -ECANCELED && atomic_read(&fs_info->balance_pause_req)) {
+ btrfs_info(fs_info, "balance: paused");
+ btrfs_exclop_balance(fs_info, BTRFS_EXCLOP_BALANCE_PAUSED);
++ paused = true;
+ }
+ /*
+ * Balance can be canceled by:
+@@ -4426,8 +4420,8 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
+ btrfs_update_ioctl_balance_args(fs_info, bargs);
+ }
+
+- if ((ret && ret != -ECANCELED && ret != -ENOSPC) ||
+- balance_need_close(fs_info)) {
++ /* We didn't pause, we can clean everything up. */
++ if (!paused) {
+ reset_balance_state(fs_info);
+ btrfs_exclop_finish(fs_info);
+ }
+@@ -6405,7 +6399,8 @@ int __btrfs_map_block(struct btrfs_fs_info *fs_info, enum btrfs_map_op op,
+ (!need_full_stripe(op) || !dev_replace_is_ongoing ||
+ !dev_replace->tgtdev)) {
+ set_io_stripe(smap, map, stripe_index, stripe_offset, stripe_nr);
+- *mirror_num_ret = mirror_num;
++ if (mirror_num_ret)
++ *mirror_num_ret = mirror_num;
+ *bioc_ret = NULL;
+ ret = 0;
+ goto out;
+diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
+index 997ca4b32e87f..4a1c238600c52 100644
+--- a/fs/erofs/zdata.c
++++ b/fs/erofs/zdata.c
+@@ -1411,7 +1411,7 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io,
+ if (atomic_add_return(bios, &io->pending_bios))
+ return;
+ /* Use (kthread_)work and sync decompression for atomic contexts only */
+- if (in_atomic() || irqs_disabled()) {
++ if (!in_task() || irqs_disabled() || rcu_read_lock_any_held()) {
+ #ifdef CONFIG_EROFS_FS_PCPU_KTHREAD
+ struct kthread_worker *worker;
+
+diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
+index 321e3a888c20b..05151d61b00b3 100644
+--- a/fs/ext4/xattr.c
++++ b/fs/ext4/xattr.c
+@@ -1782,6 +1782,20 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
+ memmove(here, (void *)here + size,
+ (void *)last - (void *)here + sizeof(__u32));
+ memset(last, 0, size);
++
++ /*
++ * Update i_inline_off - moved ibody region might contain
++ * system.data attribute. Handling a failure here won't
++ * cause other complications for setting an xattr.
++ */
++ if (!is_block && ext4_has_inline_data(inode)) {
++ ret = ext4_find_inline_data_nolock(inode);
++ if (ret) {
++ ext4_warning_inode(inode,
++ "unable to update i_inline_off");
++ goto out;
++ }
++ }
+ } else if (s->not_found) {
+ /* Insert new name. */
+ size_t size = EXT4_XATTR_LEN(name_len);
+diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
+index 35bc174f9ba21..49c3e96207260 100644
+--- a/fs/fuse/dir.c
++++ b/fs/fuse/dir.c
+@@ -258,7 +258,7 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
+ spin_unlock(&fi->lock);
+ }
+ kfree(forget);
+- if (ret == -ENOMEM)
++ if (ret == -ENOMEM || ret == -EINTR)
+ goto out;
+ if (ret || fuse_invalid_attr(&outarg.attr) ||
+ fuse_stale_inode(inode, outarg.generation, &outarg.attr))
+diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
+index d66070af145d0..f19d748890f08 100644
+--- a/fs/fuse/inode.c
++++ b/fs/fuse/inode.c
+@@ -1134,7 +1134,10 @@ static void process_init_reply(struct fuse_mount *fm, struct fuse_args *args,
+ process_init_limits(fc, arg);
+
+ if (arg->minor >= 6) {
+- u64 flags = arg->flags | (u64) arg->flags2 << 32;
++ u64 flags = arg->flags;
++
++ if (flags & FUSE_INIT_EXT)
++ flags |= (u64) arg->flags2 << 32;
+
+ ra_pages = arg->max_readahead / PAGE_SIZE;
+ if (flags & FUSE_ASYNC_READ)
+@@ -1254,7 +1257,8 @@ void fuse_send_init(struct fuse_mount *fm)
+ FUSE_ABORT_ERROR | FUSE_MAX_PAGES | FUSE_CACHE_SYMLINKS |
+ FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA |
+ FUSE_HANDLE_KILLPRIV_V2 | FUSE_SETXATTR_EXT | FUSE_INIT_EXT |
+- FUSE_SECURITY_CTX | FUSE_CREATE_SUPP_GROUP;
++ FUSE_SECURITY_CTX | FUSE_CREATE_SUPP_GROUP |
++ FUSE_HAS_EXPIRE_ONLY;
+ #ifdef CONFIG_FUSE_DAX
+ if (fm->fc->dax)
+ flags |= FUSE_MAP_ALIGNMENT;
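
The fuse change above only folds arg->flags2 into the upper 32 bits when the server actually negotiated FUSE_INIT_EXT, since an older server that never set that capability may leave flags2 undefined. A tiny standalone illustration of gating the high word on a capability bit (the bit value and names here are hypothetical):

#include <stdint.h>
#include <stdio.h>

#define INIT_EXT    (1u << 30)   /* hypothetical capability bit in the low word */

static uint64_t combine_flags(uint32_t lo, uint32_t hi)
{
    uint64_t flags = lo;

    /* only trust the second word if the peer said it supports it */
    if (flags & INIT_EXT)
        flags |= (uint64_t)hi << 32;
    return flags;
}

int main(void)
{
    printf("old server: %016llx\n",
           (unsigned long long)combine_flags(0x00000001, 0xdeadbeef));
    printf("new server: %016llx\n",
           (unsigned long long)combine_flags(0x00000001 | INIT_EXT, 0x00000002));
    return 0;
}
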
+diff --git a/fs/fuse/ioctl.c b/fs/fuse/ioctl.c
+index 8e01bfdfc4303..726640fa439e0 100644
+--- a/fs/fuse/ioctl.c
++++ b/fs/fuse/ioctl.c
+@@ -9,14 +9,23 @@
+ #include <linux/compat.h>
+ #include <linux/fileattr.h>
+
+-static ssize_t fuse_send_ioctl(struct fuse_mount *fm, struct fuse_args *args)
++static ssize_t fuse_send_ioctl(struct fuse_mount *fm, struct fuse_args *args,
++ struct fuse_ioctl_out *outarg)
+ {
+- ssize_t ret = fuse_simple_request(fm, args);
++ ssize_t ret;
++
++ args->out_args[0].size = sizeof(*outarg);
++ args->out_args[0].value = outarg;
++
++ ret = fuse_simple_request(fm, args);
+
+ /* Translate ENOSYS, which shouldn't be returned from fs */
+ if (ret == -ENOSYS)
+ ret = -ENOTTY;
+
++ if (ret >= 0 && outarg->result == -ENOSYS)
++ outarg->result = -ENOTTY;
++
+ return ret;
+ }
+
+@@ -264,13 +273,11 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
+ }
+
+ ap.args.out_numargs = 2;
+- ap.args.out_args[0].size = sizeof(outarg);
+- ap.args.out_args[0].value = &outarg;
+ ap.args.out_args[1].size = out_size;
+ ap.args.out_pages = true;
+ ap.args.out_argvar = true;
+
+- transferred = fuse_send_ioctl(fm, &ap.args);
++ transferred = fuse_send_ioctl(fm, &ap.args, &outarg);
+ err = transferred;
+ if (transferred < 0)
+ goto out;
+@@ -399,12 +406,10 @@ static int fuse_priv_ioctl(struct inode *inode, struct fuse_file *ff,
+ args.in_args[1].size = inarg.in_size;
+ args.in_args[1].value = ptr;
+ args.out_numargs = 2;
+- args.out_args[0].size = sizeof(outarg);
+- args.out_args[0].value = &outarg;
+ args.out_args[1].size = inarg.out_size;
+ args.out_args[1].value = ptr;
+
+- err = fuse_send_ioctl(fm, &args);
++ err = fuse_send_ioctl(fm, &args, &outarg);
+ if (!err) {
+ if (outarg.result < 0)
+ err = outarg.result;
+diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
+index 51bd38da21cdd..25e3c20eb19f6 100644
+--- a/fs/jbd2/checkpoint.c
++++ b/fs/jbd2/checkpoint.c
+@@ -57,28 +57,6 @@ static inline void __buffer_unlink(struct journal_head *jh)
+ }
+ }
+
+-/*
+- * Move a buffer from the checkpoint list to the checkpoint io list
+- *
+- * Called with j_list_lock held
+- */
+-static inline void __buffer_relink_io(struct journal_head *jh)
+-{
+- transaction_t *transaction = jh->b_cp_transaction;
+-
+- __buffer_unlink_first(jh);
+-
+- if (!transaction->t_checkpoint_io_list) {
+- jh->b_cpnext = jh->b_cpprev = jh;
+- } else {
+- jh->b_cpnext = transaction->t_checkpoint_io_list;
+- jh->b_cpprev = transaction->t_checkpoint_io_list->b_cpprev;
+- jh->b_cpprev->b_cpnext = jh;
+- jh->b_cpnext->b_cpprev = jh;
+- }
+- transaction->t_checkpoint_io_list = jh;
+-}
+-
+ /*
+ * Check a checkpoint buffer could be release or not.
+ *
+@@ -183,6 +161,7 @@ __flush_batch(journal_t *journal, int *batch_count)
+ struct buffer_head *bh = journal->j_chkpt_bhs[i];
+ BUFFER_TRACE(bh, "brelse");
+ __brelse(bh);
++ journal->j_chkpt_bhs[i] = NULL;
+ }
+ *batch_count = 0;
+ }
+@@ -242,6 +221,11 @@ restart:
+ jh = transaction->t_checkpoint_list;
+ bh = jh2bh(jh);
+
++ /*
++ * The buffer may be writing back, or flushing out in the
++ * last couple of cycles, or re-adding into a new transaction,
++ * need to check it again until it's unlocked.
++ */
+ if (buffer_locked(bh)) {
+ get_bh(bh);
+ spin_unlock(&journal->j_list_lock);
+@@ -287,28 +271,32 @@ restart:
+ }
+ if (!buffer_dirty(bh)) {
+ BUFFER_TRACE(bh, "remove from checkpoint");
+- if (__jbd2_journal_remove_checkpoint(jh))
+- /* The transaction was released; we're done */
++ /*
++ * If the transaction was released or the checkpoint
++ * list was empty, we're done.
++ */
++ if (__jbd2_journal_remove_checkpoint(jh) ||
++ !transaction->t_checkpoint_list)
+ goto out;
+- continue;
++ } else {
++ /*
++ * We are about to write the buffer, it could be
++ * raced by some other transaction shrink or buffer
++ * re-log logic once we release the j_list_lock,
++ * leave it on the checkpoint list and check status
++ * again to make sure it's clean.
++ */
++ BUFFER_TRACE(bh, "queue");
++ get_bh(bh);
++ J_ASSERT_BH(bh, !buffer_jwrite(bh));
++ journal->j_chkpt_bhs[batch_count++] = bh;
++ transaction->t_chp_stats.cs_written++;
++ transaction->t_checkpoint_list = jh->b_cpnext;
+ }
+- /*
+- * Important: we are about to write the buffer, and
+- * possibly block, while still holding the journal
+- * lock. We cannot afford to let the transaction
+- * logic start messing around with this buffer before
+- * we write it to disk, as that would break
+- * recoverability.
+- */
+- BUFFER_TRACE(bh, "queue");
+- get_bh(bh);
+- J_ASSERT_BH(bh, !buffer_jwrite(bh));
+- journal->j_chkpt_bhs[batch_count++] = bh;
+- __buffer_relink_io(jh);
+- transaction->t_chp_stats.cs_written++;
++
+ if ((batch_count == JBD2_NR_BATCH) ||
+- need_resched() ||
+- spin_needbreak(&journal->j_list_lock))
++ need_resched() || spin_needbreak(&journal->j_list_lock) ||
++ jh2bh(transaction->t_checkpoint_list) == journal->j_chkpt_bhs[0])
+ goto unlock_and_flush;
+ }
+
+@@ -322,38 +310,6 @@ restart:
+ goto restart;
+ }
+
+- /*
+- * Now we issued all of the transaction's buffers, let's deal
+- * with the buffers that are out for I/O.
+- */
+-restart2:
+- /* Did somebody clean up the transaction in the meanwhile? */
+- if (journal->j_checkpoint_transactions != transaction ||
+- transaction->t_tid != this_tid)
+- goto out;
+-
+- while (transaction->t_checkpoint_io_list) {
+- jh = transaction->t_checkpoint_io_list;
+- bh = jh2bh(jh);
+- if (buffer_locked(bh)) {
+- get_bh(bh);
+- spin_unlock(&journal->j_list_lock);
+- wait_on_buffer(bh);
+- /* the journal_head may have gone by now */
+- BUFFER_TRACE(bh, "brelse");
+- __brelse(bh);
+- spin_lock(&journal->j_list_lock);
+- goto restart2;
+- }
+-
+- /*
+- * Now in whatever state the buffer currently is, we
+- * know that it has been written out and so we can
+- * drop it from the list
+- */
+- if (__jbd2_journal_remove_checkpoint(jh))
+- break;
+- }
+ out:
+ spin_unlock(&journal->j_list_lock);
+ result = jbd2_cleanup_journal_tail(journal);
+diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c
+index da6a2bc6bf022..bd4ef43b02033 100644
+--- a/fs/jfs/jfs_dmap.c
++++ b/fs/jfs/jfs_dmap.c
+@@ -1959,6 +1959,9 @@ dbAllocDmapLev(struct bmap * bmp,
+ if (dbFindLeaf((dmtree_t *) & dp->tree, l2nb, &leafidx))
+ return -ENOSPC;
+
++ if (leafidx < 0)
++ return -EIO;
++
+ /* determine the block number within the file system corresponding
+ * to the leaf at which free space was found.
+ */
+diff --git a/fs/jfs/jfs_txnmgr.c b/fs/jfs/jfs_txnmgr.c
+index ffd4feece0785..ce4b4760fcb1d 100644
+--- a/fs/jfs/jfs_txnmgr.c
++++ b/fs/jfs/jfs_txnmgr.c
+@@ -354,6 +354,11 @@ tid_t txBegin(struct super_block *sb, int flag)
+ jfs_info("txBegin: flag = 0x%x", flag);
+ log = JFS_SBI(sb)->log;
+
++ if (!log) {
++ jfs_error(sb, "read-only filesystem\n");
++ return 0;
++ }
++
+ TXN_LOCK();
+
+ INCREMENT(TxStat.txBegin);
+diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c
+index b29d68b5eec53..f370c76051205 100644
+--- a/fs/jfs/namei.c
++++ b/fs/jfs/namei.c
+@@ -799,6 +799,11 @@ static int jfs_link(struct dentry *old_dentry,
+ if (rc)
+ goto out;
+
++ if (isReadOnly(ip)) {
++ jfs_error(ip->i_sb, "read-only filesystem\n");
++ return -EROFS;
++ }
++
+ tid = txBegin(ip->i_sb, 0);
+
+ mutex_lock_nested(&JFS_IP(dir)->commit_mutex, COMMIT_MUTEX_PARENT);
+diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
+index fd11fe6d6d45f..6b9f7917fc1bb 100644
+--- a/fs/overlayfs/ovl_entry.h
++++ b/fs/overlayfs/ovl_entry.h
+@@ -32,6 +32,7 @@ struct ovl_sb {
+ };
+
+ struct ovl_layer {
++ /* ovl_free_fs() relies on @mnt being the first member! */
+ struct vfsmount *mnt;
+ /* Trap in ovl inode cache */
+ struct inode *trap;
+@@ -42,6 +43,14 @@ struct ovl_layer {
+ int fsid;
+ };
+
++/*
++ * ovl_free_fs() relies on @mnt being the first member when unmounting
++ * the private mounts created for each layer. Let's check both the
++ * offset and type.
++ */
++static_assert(offsetof(struct ovl_layer, mnt) == 0);
++static_assert(__same_type(typeof_member(struct ovl_layer, mnt), struct vfsmount *));
++
+ struct ovl_path {
+ const struct ovl_layer *layer;
+ struct dentry *dentry;
+diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
+index ffd40dc3e4e99..e3e4f40476579 100644
+--- a/fs/quota/dquot.c
++++ b/fs/quota/dquot.c
+@@ -555,7 +555,7 @@ restart:
+ continue;
+ /* Wait for dquot users */
+ if (atomic_read(&dquot->dq_count)) {
+- dqgrab(dquot);
++ atomic_inc(&dquot->dq_count);
+ spin_unlock(&dq_list_lock);
+ /*
+ * Once dqput() wakes us up, we know it's time to free
+@@ -2420,7 +2420,8 @@ int dquot_load_quota_sb(struct super_block *sb, int type, int format_id,
+
+ error = add_dquot_ref(sb, type);
+ if (error)
+- dquot_disable(sb, type, flags);
++ dquot_disable(sb, type,
++ DQUOT_USAGE_ENABLED | DQUOT_LIMITS_ENABLED);
+
+ return error;
+ out_fmt:
+diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
+index d9f0b3b94f007..853209268f507 100644
+--- a/fs/smb/client/connect.c
++++ b/fs/smb/client/connect.c
+@@ -60,7 +60,7 @@ extern bool disable_legacy_dialects;
+ #define TLINK_IDLE_EXPIRE (600 * HZ)
+
+ /* Drop the connection to not overload the server */
+-#define NUM_STATUS_IO_TIMEOUT 5
++#define MAX_STATUS_IO_TIMEOUT 5
+
+ static int ip_connect(struct TCP_Server_Info *server);
+ static int generic_ip_connect(struct TCP_Server_Info *server);
+@@ -1117,6 +1117,7 @@ cifs_demultiplex_thread(void *p)
+ struct mid_q_entry *mids[MAX_COMPOUND];
+ char *bufs[MAX_COMPOUND];
+ unsigned int noreclaim_flag, num_io_timeout = 0;
++ bool pending_reconnect = false;
+
+ noreclaim_flag = memalloc_noreclaim_save();
+ cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
+@@ -1156,6 +1157,8 @@ cifs_demultiplex_thread(void *p)
+ cifs_dbg(FYI, "RFC1002 header 0x%x\n", pdu_length);
+ if (!is_smb_response(server, buf[0]))
+ continue;
++
++ pending_reconnect = false;
+ next_pdu:
+ server->pdu_size = pdu_length;
+
+@@ -1213,10 +1216,13 @@ next_pdu:
+ if (server->ops->is_status_io_timeout &&
+ server->ops->is_status_io_timeout(buf)) {
+ num_io_timeout++;
+- if (num_io_timeout > NUM_STATUS_IO_TIMEOUT) {
+- cifs_reconnect(server, false);
++ if (num_io_timeout > MAX_STATUS_IO_TIMEOUT) {
++ cifs_server_dbg(VFS,
++ "Number of request timeouts exceeded %d. Reconnecting",
++ MAX_STATUS_IO_TIMEOUT);
++
++ pending_reconnect = true;
+ num_io_timeout = 0;
+- continue;
+ }
+ }
+
+@@ -1263,6 +1269,11 @@ next_pdu:
+ buf = server->smallbuf;
+ goto next_pdu;
+ }
++
++ /* do this reconnect at the very end after processing all MIDs */
++ if (pending_reconnect)
++ cifs_reconnect(server, true);
++
+ } /* end while !EXITING */
+
+ /* buffer usually freed in free_mid - need to free it here on exit */
+diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c
+index 26d14dd0482ef..cf83617236d8b 100644
+--- a/fs/smb/client/dfs.c
++++ b/fs/smb/client/dfs.c
+@@ -66,6 +66,12 @@ static int get_session(struct cifs_mount_ctx *mnt_ctx, const char *full_path)
+ return rc;
+ }
+
++/*
++ * Track individual DFS referral servers used by new DFS mount.
++ *
++ * On success, their lifetime will be shared by final tcon (dfs_ses_list).
++ * Otherwise, they will be put by dfs_put_root_smb_sessions() in cifs_mount().
++ */
+ static int add_root_smb_session(struct cifs_mount_ctx *mnt_ctx)
+ {
+ struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
+@@ -80,11 +86,12 @@ static int add_root_smb_session(struct cifs_mount_ctx *mnt_ctx)
+ INIT_LIST_HEAD(&root_ses->list);
+
+ spin_lock(&cifs_tcp_ses_lock);
+- ses->ses_count++;
++ cifs_smb_ses_inc_refcount(ses);
+ spin_unlock(&cifs_tcp_ses_lock);
+ root_ses->ses = ses;
+ list_add_tail(&root_ses->list, &mnt_ctx->dfs_ses_list);
+ }
++ /* Select new DFS referral server so that new referrals go through it */
+ ctx->dfs_root_ses = ses;
+ return 0;
+ }
+@@ -244,7 +251,6 @@ out:
+ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
+ {
+ struct smb3_fs_context *ctx = mnt_ctx->fs_ctx;
+- struct cifs_ses *ses;
+ bool nodfs = ctx->nodfs;
+ int rc;
+
+@@ -278,20 +284,8 @@ int dfs_mount_share(struct cifs_mount_ctx *mnt_ctx, bool *isdfs)
+ }
+
+ *isdfs = true;
+- /*
+- * Prevent DFS root session of being put in the first call to
+- * cifs_mount_put_conns(). If another DFS root server was not found
+- * while chasing the referrals (@ctx->dfs_root_ses == @ses), then we
+- * can safely put extra refcount of @ses.
+- */
+- ses = mnt_ctx->ses;
+- mnt_ctx->ses = NULL;
+- mnt_ctx->server = NULL;
+- rc = __dfs_mount_share(mnt_ctx);
+- if (ses == ctx->dfs_root_ses)
+- cifs_put_smb_ses(ses);
+-
+- return rc;
++ add_root_smb_session(mnt_ctx);
++ return __dfs_mount_share(mnt_ctx);
+ }
+
+ /* Update dfs referral path of superblock */
+diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c
+index 22954a9c7a6c7..355e8700530fc 100644
+--- a/fs/smb/client/smb2transport.c
++++ b/fs/smb/client/smb2transport.c
+@@ -159,7 +159,7 @@ smb2_find_smb_ses_unlocked(struct TCP_Server_Info *server, __u64 ses_id)
+ spin_unlock(&ses->ses_lock);
+ continue;
+ }
+- ++ses->ses_count;
++ cifs_smb_ses_inc_refcount(ses);
+ spin_unlock(&ses->ses_lock);
+ return ses;
+ }
+diff --git a/fs/udf/unicode.c b/fs/udf/unicode.c
+index 622569007b530..2142cbd1dde24 100644
+--- a/fs/udf/unicode.c
++++ b/fs/udf/unicode.c
+@@ -247,7 +247,7 @@ static int udf_name_from_CS0(struct super_block *sb,
+ }
+
+ if (translate) {
+- if (str_o_len <= 2 && str_o[0] == '.' &&
++ if (str_o_len > 0 && str_o_len <= 2 && str_o[0] == '.' &&
+ (str_o_len == 1 || str_o[1] == '.'))
+ needsCRC = 1;
+ if (needsCRC) {
+diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
+index 402b545959af7..5b27f94d4fad6 100644
+--- a/include/kvm/arm_vgic.h
++++ b/include/kvm/arm_vgic.h
+@@ -431,7 +431,7 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq,
+
+ int vgic_v4_load(struct kvm_vcpu *vcpu);
+ void vgic_v4_commit(struct kvm_vcpu *vcpu);
+-int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db);
++int vgic_v4_put(struct kvm_vcpu *vcpu);
+
+ /* CPU HP callbacks */
+ void kvm_vgic_cpu_up(void);
+diff --git a/include/linux/psi.h b/include/linux/psi.h
+index ab26200c28033..e0745873e3f26 100644
+--- a/include/linux/psi.h
++++ b/include/linux/psi.h
+@@ -23,8 +23,9 @@ void psi_memstall_enter(unsigned long *flags);
+ void psi_memstall_leave(unsigned long *flags);
+
+ int psi_show(struct seq_file *s, struct psi_group *group, enum psi_res res);
+-struct psi_trigger *psi_trigger_create(struct psi_group *group,
+- char *buf, enum psi_res res, struct file *file);
++struct psi_trigger *psi_trigger_create(struct psi_group *group, char *buf,
++ enum psi_res res, struct file *file,
++ struct kernfs_open_file *of);
+ void psi_trigger_destroy(struct psi_trigger *t);
+
+ __poll_t psi_trigger_poll(void **trigger_ptr, struct file *file,
+diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
+index 040c089581c6c..f1fd3a8044e0e 100644
+--- a/include/linux/psi_types.h
++++ b/include/linux/psi_types.h
+@@ -137,6 +137,9 @@ struct psi_trigger {
+ /* Wait queue for polling */
+ wait_queue_head_t event_wait;
+
++ /* Kernfs file for cgroup triggers */
++ struct kernfs_open_file *of;
++
+ /* Pending event flag */
+ int event;
+
+diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
+index 20099268fa257..669e8cff40c74 100644
+--- a/include/linux/sched/signal.h
++++ b/include/linux/sched/signal.h
+@@ -135,7 +135,7 @@ struct signal_struct {
+ #ifdef CONFIG_POSIX_TIMERS
+
+ /* POSIX.1b Interval Timers */
+- int posix_timer_id;
++ unsigned int next_posix_timer_id;
+ struct list_head posix_timers;
+
+ /* ITIMER_REAL timer for the process */
+diff --git a/include/linux/tcp.h b/include/linux/tcp.h
+index b4c08ac869835..91a37c99ba665 100644
+--- a/include/linux/tcp.h
++++ b/include/linux/tcp.h
+@@ -513,7 +513,7 @@ static inline void fastopen_queue_tune(struct sock *sk, int backlog)
+ struct request_sock_queue *queue = &inet_csk(sk)->icsk_accept_queue;
+ int somaxconn = READ_ONCE(sock_net(sk)->core.sysctl_somaxconn);
+
+- queue->fastopenq.max_qlen = min_t(unsigned int, backlog, somaxconn);
++ WRITE_ONCE(queue->fastopenq.max_qlen, min_t(unsigned int, backlog, somaxconn));
+ }
+
+ static inline void tcp_move_syn(struct tcp_sock *tp,
+diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
+index 9654567cfae37..870b6d3c5146b 100644
+--- a/include/net/bluetooth/hci_core.h
++++ b/include/net/bluetooth/hci_core.h
+@@ -822,6 +822,7 @@ struct hci_conn_params {
+
+ struct hci_conn *conn;
+ bool explicit_connect;
++ /* Accessed without hdev->lock: */
+ hci_conn_flags_t flags;
+ u8 privacy_mode;
+ };
+@@ -1573,7 +1574,11 @@ struct hci_conn_params *hci_conn_params_add(struct hci_dev *hdev,
+ bdaddr_t *addr, u8 addr_type);
+ void hci_conn_params_del(struct hci_dev *hdev, bdaddr_t *addr, u8 addr_type);
+ void hci_conn_params_clear_disabled(struct hci_dev *hdev);
++void hci_conn_params_free(struct hci_conn_params *param);
+
++void hci_pend_le_list_del_init(struct hci_conn_params *param);
++void hci_pend_le_list_add(struct hci_conn_params *param,
++ struct list_head *list);
+ struct hci_conn_params *hci_pend_le_action_lookup(struct list_head *list,
+ bdaddr_t *addr,
+ u8 addr_type);
+diff --git a/include/net/ip.h b/include/net/ip.h
+index acec504c469a0..83a1a9bc3ceb1 100644
+--- a/include/net/ip.h
++++ b/include/net/ip.h
+@@ -282,7 +282,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
+ const struct ip_options *sopt,
+ __be32 daddr, __be32 saddr,
+ const struct ip_reply_arg *arg,
+- unsigned int len, u64 transmit_time);
++ unsigned int len, u64 transmit_time, u32 txhash);
+
+ #define IP_INC_STATS(net, field) SNMP_INC_STATS64((net)->mib.ip_statistics, field)
+ #define __IP_INC_STATS(net, field) __SNMP_INC_STATS64((net)->mib.ip_statistics, field)
+diff --git a/include/net/tcp.h b/include/net/tcp.h
+index 5066e4586cf09..182337a8cf94a 100644
+--- a/include/net/tcp.h
++++ b/include/net/tcp.h
+@@ -1514,25 +1514,38 @@ void tcp_leave_memory_pressure(struct sock *sk);
+ static inline int keepalive_intvl_when(const struct tcp_sock *tp)
+ {
+ struct net *net = sock_net((struct sock *)tp);
++ int val;
+
+- return tp->keepalive_intvl ? :
+- READ_ONCE(net->ipv4.sysctl_tcp_keepalive_intvl);
++ /* Paired with WRITE_ONCE() in tcp_sock_set_keepintvl()
++ * and do_tcp_setsockopt().
++ */
++ val = READ_ONCE(tp->keepalive_intvl);
++
++ return val ? : READ_ONCE(net->ipv4.sysctl_tcp_keepalive_intvl);
+ }
+
+ static inline int keepalive_time_when(const struct tcp_sock *tp)
+ {
+ struct net *net = sock_net((struct sock *)tp);
++ int val;
+
+- return tp->keepalive_time ? :
+- READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time);
++ /* Paired with WRITE_ONCE() in tcp_sock_set_keepidle_locked() */
++ val = READ_ONCE(tp->keepalive_time);
++
++ return val ? : READ_ONCE(net->ipv4.sysctl_tcp_keepalive_time);
+ }
+
+ static inline int keepalive_probes(const struct tcp_sock *tp)
+ {
+ struct net *net = sock_net((struct sock *)tp);
++ int val;
+
+- return tp->keepalive_probes ? :
+- READ_ONCE(net->ipv4.sysctl_tcp_keepalive_probes);
++ /* Paired with WRITE_ONCE() in tcp_sock_set_keepcnt()
++ * and do_tcp_setsockopt().
++ */
++ val = READ_ONCE(tp->keepalive_probes);
++
++ return val ? : READ_ONCE(net->ipv4.sysctl_tcp_keepalive_probes);
+ }
+
+ static inline u32 keepalive_time_elapsed(const struct tcp_sock *tp)
+@@ -2053,7 +2066,11 @@ void __tcp_v4_send_check(struct sk_buff *skb, __be32 saddr, __be32 daddr);
+ static inline u32 tcp_notsent_lowat(const struct tcp_sock *tp)
+ {
+ struct net *net = sock_net((struct sock *)tp);
+- return tp->notsent_lowat ?: READ_ONCE(net->ipv4.sysctl_tcp_notsent_lowat);
++ u32 val;
++
++ val = READ_ONCE(tp->notsent_lowat);
++
++ return val ?: READ_ONCE(net->ipv4.sysctl_tcp_notsent_lowat);
+ }
+
+ bool tcp_stream_memory_free(const struct sock *sk, int wake);
+diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
+index 1b9d0dfae72df..b3fcab13fcd3d 100644
+--- a/include/uapi/linux/fuse.h
++++ b/include/uapi/linux/fuse.h
+@@ -206,6 +206,7 @@
+ * - add extension header
+ * - add FUSE_EXT_GROUPS
+ * - add FUSE_CREATE_SUPP_GROUP
++ * - add FUSE_HAS_EXPIRE_ONLY
+ */
+
+ #ifndef _LINUX_FUSE_H
+@@ -369,6 +370,7 @@ struct fuse_file_lock {
+ * FUSE_HAS_INODE_DAX: use per inode DAX
+ * FUSE_CREATE_SUPP_GROUP: add supplementary group info to create, mkdir,
+ * symlink and mknod (single group that matches parent)
++ * FUSE_HAS_EXPIRE_ONLY: kernel supports expiry-only entry invalidation
+ */
+ #define FUSE_ASYNC_READ (1 << 0)
+ #define FUSE_POSIX_LOCKS (1 << 1)
+@@ -406,6 +408,7 @@ struct fuse_file_lock {
+ #define FUSE_SECURITY_CTX (1ULL << 32)
+ #define FUSE_HAS_INODE_DAX (1ULL << 33)
+ #define FUSE_CREATE_SUPP_GROUP (1ULL << 34)
++#define FUSE_HAS_EXPIRE_ONLY (1ULL << 35)
+
+ /**
+ * CUSE INIT request/reply flags
+diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
+index f1b79959d1c1d..d6667b435dd39 100644
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -2032,6 +2032,14 @@ fail:
+ ret = io_issue_sqe(req, issue_flags);
+ if (ret != -EAGAIN)
+ break;
++
++ /*
++ * If REQ_F_NOWAIT is set, then don't wait or retry with
++ * poll. -EAGAIN is final for that case.
++ */
++ if (req->flags & REQ_F_NOWAIT)
++ break;
++
+ /*
+ * We can get EAGAIN for iopolled IO even though we're
+ * forcing a sync submission from here, since we can't
+@@ -3425,8 +3433,6 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+ unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags)
+ {
+- const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+- struct vm_unmapped_area_info info;
+ void *ptr;
+
+ /*
+@@ -3441,32 +3447,26 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+ if (IS_ERR(ptr))
+ return -ENOMEM;
+
+- info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+- info.length = len;
+- info.low_limit = max(PAGE_SIZE, mmap_min_addr);
+- info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
++ /*
++ * Some architectures have strong cache aliasing requirements.
++ * For such architectures we need a coherent mapping which aliases
++ * kernel memory *and* userspace memory. To achieve that:
++ * - use a NULL file pointer to reference physical memory, and
++ * - use the kernel virtual address of the shared io_uring context
++ * (instead of the userspace-provided address, which has to be 0UL
++ * anyway).
++ * For architectures without such aliasing requirements, the
++ * architecture will return any suitable mapping because addr is 0.
++ */
++ filp = NULL;
++ flags |= MAP_SHARED;
++ pgoff = 0; /* has been translated to ptr above */
+ #ifdef SHM_COLOUR
+- info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL);
++ addr = (uintptr_t) ptr;
+ #else
+- info.align_mask = PAGE_MASK & (SHMLBA - 1UL);
++ addr = 0UL;
+ #endif
+- info.align_offset = (unsigned long) ptr;
+-
+- /*
+- * A failed mmap() very likely causes application failure,
+- * so fall back to the bottom-up function here. This scenario
+- * can happen with large stack limits and large mmap()
+- * allocations.
+- */
+- addr = vm_unmapped_area(&info);
+- if (offset_in_page(addr)) {
+- info.flags = 0;
+- info.low_limit = TASK_UNMAPPED_BASE;
+- info.high_limit = mmap_end;
+- addr = vm_unmapped_area(&info);
+- }
+-
+- return addr;
++ return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+ }
+
+ #else /* !CONFIG_MMU */
+diff --git a/kernel/bpf/bpf_lru_list.c b/kernel/bpf/bpf_lru_list.c
+index d99e89f113c43..3dabdd137d102 100644
+--- a/kernel/bpf/bpf_lru_list.c
++++ b/kernel/bpf/bpf_lru_list.c
+@@ -41,7 +41,12 @@ static struct list_head *local_pending_list(struct bpf_lru_locallist *loc_l)
+ /* bpf_lru_node helpers */
+ static bool bpf_lru_node_is_ref(const struct bpf_lru_node *node)
+ {
+- return node->ref;
++ return READ_ONCE(node->ref);
++}
++
++static void bpf_lru_node_clear_ref(struct bpf_lru_node *node)
++{
++ WRITE_ONCE(node->ref, 0);
+ }
+
+ static void bpf_lru_list_count_inc(struct bpf_lru_list *l,
+@@ -89,7 +94,7 @@ static void __bpf_lru_node_move_in(struct bpf_lru_list *l,
+
+ bpf_lru_list_count_inc(l, tgt_type);
+ node->type = tgt_type;
+- node->ref = 0;
++ bpf_lru_node_clear_ref(node);
+ list_move(&node->list, &l->lists[tgt_type]);
+ }
+
+@@ -110,7 +115,7 @@ static void __bpf_lru_node_move(struct bpf_lru_list *l,
+ bpf_lru_list_count_inc(l, tgt_type);
+ node->type = tgt_type;
+ }
+- node->ref = 0;
++ bpf_lru_node_clear_ref(node);
+
+ /* If the moving node is the next_inactive_rotation candidate,
+ * move the next_inactive_rotation pointer also.
+@@ -353,7 +358,7 @@ static void __local_list_add_pending(struct bpf_lru *lru,
+ *(u32 *)((void *)node + lru->hash_offset) = hash;
+ node->cpu = cpu;
+ node->type = BPF_LRU_LOCAL_LIST_T_PENDING;
+- node->ref = 0;
++ bpf_lru_node_clear_ref(node);
+ list_add(&node->list, local_pending_list(loc_l));
+ }
+
+@@ -419,7 +424,7 @@ static struct bpf_lru_node *bpf_percpu_lru_pop_free(struct bpf_lru *lru,
+ if (!list_empty(free_list)) {
+ node = list_first_entry(free_list, struct bpf_lru_node, list);
+ *(u32 *)((void *)node + lru->hash_offset) = hash;
+- node->ref = 0;
++ bpf_lru_node_clear_ref(node);
+ __bpf_lru_node_move(l, node, BPF_LRU_LIST_T_INACTIVE);
+ }
+
+@@ -522,7 +527,7 @@ static void bpf_common_lru_push_free(struct bpf_lru *lru,
+ }
+
+ node->type = BPF_LRU_LOCAL_LIST_T_FREE;
+- node->ref = 0;
++ bpf_lru_node_clear_ref(node);
+ list_move(&node->list, local_free_list(loc_l));
+
+ raw_spin_unlock_irqrestore(&loc_l->lock, flags);
+@@ -568,7 +573,7 @@ static void bpf_common_lru_populate(struct bpf_lru *lru, void *buf,
+
+ node = (struct bpf_lru_node *)(buf + node_offset);
+ node->type = BPF_LRU_LIST_T_FREE;
+- node->ref = 0;
++ bpf_lru_node_clear_ref(node);
+ list_add(&node->list, &l->lists[BPF_LRU_LIST_T_FREE]);
+ buf += elem_size;
+ }
+@@ -594,7 +599,7 @@ again:
+ node = (struct bpf_lru_node *)(buf + node_offset);
+ node->cpu = cpu;
+ node->type = BPF_LRU_LIST_T_FREE;
+- node->ref = 0;
++ bpf_lru_node_clear_ref(node);
+ list_add(&node->list, &l->lists[BPF_LRU_LIST_T_FREE]);
+ i++;
+ buf += elem_size;
+diff --git a/kernel/bpf/bpf_lru_list.h b/kernel/bpf/bpf_lru_list.h
+index 4ea227c9c1ade..8f3c8b2b4490e 100644
+--- a/kernel/bpf/bpf_lru_list.h
++++ b/kernel/bpf/bpf_lru_list.h
+@@ -64,11 +64,8 @@ struct bpf_lru {
+
+ static inline void bpf_lru_node_set_ref(struct bpf_lru_node *node)
+ {
+- /* ref is an approximation on access frequency. It does not
+- * have to be very accurate. Hence, no protection is used.
+- */
+- if (!node->ref)
+- node->ref = 1;
++ if (!READ_ONCE(node->ref))
++ WRITE_ONCE(node->ref, 1);
+ }
+
+ int bpf_lru_init(struct bpf_lru *lru, bool percpu, u32 hash_offset,
+diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
+index 25ca17a8e1964..8b4e92439d1d6 100644
+--- a/kernel/bpf/btf.c
++++ b/kernel/bpf/btf.c
+@@ -485,25 +485,26 @@ static bool btf_type_is_fwd(const struct btf_type *t)
+ return BTF_INFO_KIND(t->info) == BTF_KIND_FWD;
+ }
+
+-static bool btf_type_nosize(const struct btf_type *t)
++static bool btf_type_is_datasec(const struct btf_type *t)
+ {
+- return btf_type_is_void(t) || btf_type_is_fwd(t) ||
+- btf_type_is_func(t) || btf_type_is_func_proto(t);
++ return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
+ }
+
+-static bool btf_type_nosize_or_null(const struct btf_type *t)
++static bool btf_type_is_decl_tag(const struct btf_type *t)
+ {
+- return !t || btf_type_nosize(t);
++ return BTF_INFO_KIND(t->info) == BTF_KIND_DECL_TAG;
+ }
+
+-static bool btf_type_is_datasec(const struct btf_type *t)
++static bool btf_type_nosize(const struct btf_type *t)
+ {
+- return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
++ return btf_type_is_void(t) || btf_type_is_fwd(t) ||
++ btf_type_is_func(t) || btf_type_is_func_proto(t) ||
++ btf_type_is_decl_tag(t);
+ }
+
+-static bool btf_type_is_decl_tag(const struct btf_type *t)
++static bool btf_type_nosize_or_null(const struct btf_type *t)
+ {
+- return BTF_INFO_KIND(t->info) == BTF_KIND_DECL_TAG;
++ return !t || btf_type_nosize(t);
+ }
+
+ static bool btf_type_is_decl_tag_target(const struct btf_type *t)
+diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
+index 046ddff37a76d..850494423530e 100644
+--- a/kernel/bpf/log.c
++++ b/kernel/bpf/log.c
+@@ -62,9 +62,6 @@ void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt,
+
+ n = vscnprintf(log->kbuf, BPF_VERIFIER_TMP_LOG_SIZE, fmt, args);
+
+- WARN_ONCE(n >= BPF_VERIFIER_TMP_LOG_SIZE - 1,
+- "verifier log line truncated - local buffer too short\n");
+-
+ if (log->level == BPF_LOG_KERNEL) {
+ bool newline = n > 0 && log->kbuf[n - 1] == '\n';
+
+diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
+index f1c8733f76b83..5524fcf6fb2a4 100644
+--- a/kernel/bpf/syscall.c
++++ b/kernel/bpf/syscall.c
+@@ -5394,7 +5394,8 @@ static int bpf_unpriv_handler(struct ctl_table *table, int write,
+ *(int *)table->data = unpriv_enable;
+ }
+
+- unpriv_ebpf_notify(unpriv_enable);
++ if (write)
++ unpriv_ebpf_notify(unpriv_enable);
+
+ return ret;
+ }
+diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
+index aac31e33323bb..4fbfe1d086467 100644
+--- a/kernel/bpf/verifier.c
++++ b/kernel/bpf/verifier.c
+@@ -5381,16 +5381,17 @@ static int update_stack_depth(struct bpf_verifier_env *env,
+ * Since recursion is prevented by check_cfg() this algorithm
+ * only needs a local stack of MAX_CALL_FRAMES to remember callsites
+ */
+-static int check_max_stack_depth(struct bpf_verifier_env *env)
++static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx)
+ {
+- int depth = 0, frame = 0, idx = 0, i = 0, subprog_end;
+ struct bpf_subprog_info *subprog = env->subprog_info;
+ struct bpf_insn *insn = env->prog->insnsi;
++ int depth = 0, frame = 0, i, subprog_end;
+ bool tail_call_reachable = false;
+ int ret_insn[MAX_CALL_FRAMES];
+ int ret_prog[MAX_CALL_FRAMES];
+ int j;
+
++ i = subprog[idx].start;
+ process_func:
+ /* protect against potential stack overflow that might happen when
+ * bpf2bpf calls get combined with tailcalls. Limit the caller's stack
+@@ -5429,7 +5430,7 @@ process_func:
+ continue_func:
+ subprog_end = subprog[idx + 1].start;
+ for (; i < subprog_end; i++) {
+- int next_insn;
++ int next_insn, sidx;
+
+ if (!bpf_pseudo_call(insn + i) && !bpf_pseudo_func(insn + i))
+ continue;
+@@ -5439,14 +5440,14 @@ continue_func:
+
+ /* find the callee */
+ next_insn = i + insn[i].imm + 1;
+- idx = find_subprog(env, next_insn);
+- if (idx < 0) {
++ sidx = find_subprog(env, next_insn);
++ if (sidx < 0) {
+ WARN_ONCE(1, "verifier bug. No program starts at insn %d\n",
+ next_insn);
+ return -EFAULT;
+ }
+- if (subprog[idx].is_async_cb) {
+- if (subprog[idx].has_tail_call) {
++ if (subprog[sidx].is_async_cb) {
++ if (subprog[sidx].has_tail_call) {
+ verbose(env, "verifier bug. subprog has tail_call and async cb\n");
+ return -EFAULT;
+ }
+@@ -5455,6 +5456,7 @@ continue_func:
+ continue;
+ }
+ i = next_insn;
++ idx = sidx;
+
+ if (subprog[idx].has_tail_call)
+ tail_call_reachable = true;
+@@ -5490,6 +5492,22 @@ continue_func:
+ goto continue_func;
+ }
+
++static int check_max_stack_depth(struct bpf_verifier_env *env)
++{
++ struct bpf_subprog_info *si = env->subprog_info;
++ int ret;
++
++ for (int i = 0; i < env->subprog_cnt; i++) {
++ if (!i || si[i].is_async_cb) {
++ ret = check_max_stack_depth_subprog(env, i);
++ if (ret < 0)
++ return ret;
++ }
++ continue;
++ }
++ return 0;
++}
++
+ #ifndef CONFIG_BPF_JIT_ALWAYS_ON
+ static int get_callee_stack_depth(struct bpf_verifier_env *env,
+ const struct bpf_insn *insn, int idx)
+diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
+index 4d42f0cbc11ea..3299ec69ce0d1 100644
+--- a/kernel/cgroup/cgroup.c
++++ b/kernel/cgroup/cgroup.c
+@@ -3785,7 +3785,7 @@ static ssize_t pressure_write(struct kernfs_open_file *of, char *buf,
+ }
+
+ psi = cgroup_psi(cgrp);
+- new = psi_trigger_create(psi, buf, res, of->file);
++ new = psi_trigger_create(psi, buf, res, of->file, of);
+ if (IS_ERR(new)) {
+ cgroup_put(cgrp);
+ return PTR_ERR(new);
+diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
+index 77747391f49b6..4874508bb950e 100644
+--- a/kernel/kallsyms.c
++++ b/kernel/kallsyms.c
+@@ -174,11 +174,10 @@ static bool cleanup_symbol_name(char *s)
+ * LLVM appends various suffixes for local functions and variables that
+ * must be promoted to global scope as part of LTO. This can break
+ * hooking of static functions with kprobes. '.' is not a valid
+- * character in an identifier in C. Suffixes observed:
++ * character in an identifier in C. Suffixes only in LLVM LTO observed:
+ * - foo.llvm.[0-9a-f]+
+- * - foo.[0-9a-f]+
+ */
+- res = strchr(s, '.');
++ res = strstr(s, ".llvm.");
+ if (res) {
+ *res = '\0';
+ return true;
+diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
+index 8f08c087142b0..9b9ce09f8f358 100644
+--- a/kernel/rcu/tasks.h
++++ b/kernel/rcu/tasks.h
+@@ -241,7 +241,6 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
+ if (rcu_task_enqueue_lim < 0) {
+ rcu_task_enqueue_lim = 1;
+ rcu_task_cb_adjust = true;
+- pr_info("%s: Setting adjustable number of callback queues.\n", __func__);
+ } else if (rcu_task_enqueue_lim == 0) {
+ rcu_task_enqueue_lim = 1;
+ }
+@@ -272,6 +271,10 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
+ raw_spin_unlock_rcu_node(rtpcp); // irqs remain disabled.
+ }
+ raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
++
++ if (rcu_task_cb_adjust)
++ pr_info("%s: Setting adjustable number of callback queues.\n", __func__);
++
+ pr_info("%s: Setting shift to %d and lim to %d.\n", __func__, data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim));
+ }
+
+diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
+index 3b7abb58157df..8239b39d945bd 100644
+--- a/kernel/rcu/tree_exp.h
++++ b/kernel/rcu/tree_exp.h
+@@ -643,7 +643,7 @@ static void synchronize_rcu_expedited_wait(void)
+ "O."[!!cpu_online(cpu)],
+ "o."[!!(rdp->grpmask & rnp->expmaskinit)],
+ "N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
+- "D."[!!(rdp->cpu_no_qs.b.exp)]);
++ "D."[!!data_race(rdp->cpu_no_qs.b.exp)]);
+ }
+ }
+ pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
+diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
+index 7b0fe741a0886..41021080ad258 100644
+--- a/kernel/rcu/tree_plugin.h
++++ b/kernel/rcu/tree_plugin.h
+@@ -257,6 +257,8 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
+ * GP should not be able to end until we report, so there should be
+ * no need to check for a subsequent expedited GP. (Though we are
+ * still in a quiescent state in any case.)
++ *
++ * Interrupts are disabled, so ->cpu_no_qs.b.exp cannot change.
+ */
+ if (blkd_state & RCU_EXP_BLKD && rdp->cpu_no_qs.b.exp)
+ rcu_report_exp_rdp(rdp);
+@@ -941,7 +943,7 @@ notrace void rcu_preempt_deferred_qs(struct task_struct *t)
+ {
+ struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
+
+- if (rdp->cpu_no_qs.b.exp)
++ if (READ_ONCE(rdp->cpu_no_qs.b.exp))
+ rcu_report_exp_rdp(rdp);
+ }
+
+diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
+index 4da5f35417626..dacb56d7e9147 100644
+--- a/kernel/sched/fair.c
++++ b/kernel/sched/fair.c
+@@ -7174,7 +7174,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
+ recent_used_cpu != target &&
+ cpus_share_cache(recent_used_cpu, target) &&
+ (available_idle_cpu(recent_used_cpu) || sched_idle_cpu(recent_used_cpu)) &&
+- cpumask_test_cpu(p->recent_used_cpu, p->cpus_ptr) &&
++ cpumask_test_cpu(recent_used_cpu, p->cpus_ptr) &&
+ asym_fits_cpu(task_util, util_min, util_max, recent_used_cpu)) {
+ return recent_used_cpu;
+ }
+@@ -10762,7 +10762,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
+ .sd = sd,
+ .dst_cpu = this_cpu,
+ .dst_rq = this_rq,
+- .dst_grpmask = sched_group_span(sd->groups),
++ .dst_grpmask = group_balance_mask(sd->groups),
+ .idle = idle,
+ .loop_break = SCHED_NR_MIGRATE_BREAK,
+ .cpus = cpus,
+diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
+index e072f6b31bf30..80d8c10e93638 100644
+--- a/kernel/sched/psi.c
++++ b/kernel/sched/psi.c
+@@ -494,8 +494,12 @@ static u64 update_triggers(struct psi_group *group, u64 now, bool *update_total,
+ continue;
+
+ /* Generate an event */
+- if (cmpxchg(&t->event, 0, 1) == 0)
+- wake_up_interruptible(&t->event_wait);
++ if (cmpxchg(&t->event, 0, 1) == 0) {
++ if (t->of)
++ kernfs_notify(t->of->kn);
++ else
++ wake_up_interruptible(&t->event_wait);
++ }
+ t->last_event_time = now;
+ /* Reset threshold breach flag once event got generated */
+ t->pending_event = false;
+@@ -1272,8 +1276,9 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
+ return 0;
+ }
+
+-struct psi_trigger *psi_trigger_create(struct psi_group *group,
+- char *buf, enum psi_res res, struct file *file)
++struct psi_trigger *psi_trigger_create(struct psi_group *group, char *buf,
++ enum psi_res res, struct file *file,
++ struct kernfs_open_file *of)
+ {
+ struct psi_trigger *t;
+ enum psi_states state;
+@@ -1333,7 +1338,9 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
+
+ t->event = 0;
+ t->last_event_time = 0;
+- init_waitqueue_head(&t->event_wait);
++ t->of = of;
++ if (!of)
++ init_waitqueue_head(&t->event_wait);
+ t->pending_event = false;
+ t->aggregator = privileged ? PSI_POLL : PSI_AVGS;
+
+@@ -1390,7 +1397,10 @@ void psi_trigger_destroy(struct psi_trigger *t)
+ * being accessed later. Can happen if cgroup is deleted from under a
+ * polling process.
+ */
+- wake_up_pollfree(&t->event_wait);
++ if (t->of)
++ kernfs_notify(t->of->kn);
++ else
++ wake_up_interruptible(&t->event_wait);
+
+ if (t->aggregator == PSI_AVGS) {
+ mutex_lock(&group->avgs_lock);
+@@ -1462,7 +1472,10 @@ __poll_t psi_trigger_poll(void **trigger_ptr,
+ if (!t)
+ return DEFAULT_POLLMASK | EPOLLERR | EPOLLPRI;
+
+- poll_wait(file, &t->event_wait, wait);
++ if (t->of)
++ kernfs_generic_poll(t->of, wait);
++ else
++ poll_wait(file, &t->event_wait, wait);
+
+ if (cmpxchg(&t->event, 1, 0) == 1)
+ ret |= EPOLLPRI;
+@@ -1532,7 +1545,7 @@ static ssize_t psi_write(struct file *file, const char __user *user_buf,
+ return -EBUSY;
+ }
+
+- new = psi_trigger_create(&psi_system, buf, res, file);
++ new = psi_trigger_create(&psi_system, buf, res, file, NULL);
+ if (IS_ERR(new)) {
+ mutex_unlock(&seq->lock);
+ return PTR_ERR(new);
+diff --git a/kernel/sys.c b/kernel/sys.c
+index 339fee3eff6a2..a36a27ebac33e 100644
+--- a/kernel/sys.c
++++ b/kernel/sys.c
+@@ -2529,11 +2529,6 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
+ else
+ return -EINVAL;
+ break;
+- case PR_GET_AUXV:
+- if (arg4 || arg5)
+- return -EINVAL;
+- error = prctl_get_auxv((void __user *)arg2, arg3);
+- break;
+ default:
+ return -EINVAL;
+ }
+@@ -2688,6 +2683,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
+ case PR_SET_VMA:
+ error = prctl_set_vma(arg2, arg3, arg4, arg5);
+ break;
++ case PR_GET_AUXV:
++ if (arg4 || arg5)
++ return -EINVAL;
++ error = prctl_get_auxv((void __user *)arg2, arg3);
++ break;
+ #ifdef CONFIG_KSM
+ case PR_SET_MEMORY_MERGE:
+ if (arg3 || arg4 || arg5)
+diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
+index ed3c4a9543982..2d6cf93ca370a 100644
+--- a/kernel/time/posix-timers.c
++++ b/kernel/time/posix-timers.c
+@@ -140,25 +140,30 @@ static struct k_itimer *posix_timer_by_id(timer_t id)
+ static int posix_timer_add(struct k_itimer *timer)
+ {
+ struct signal_struct *sig = current->signal;
+- int first_free_id = sig->posix_timer_id;
+ struct hlist_head *head;
+- int ret = -ENOENT;
++ unsigned int cnt, id;
+
+- do {
++ /*
++ * FIXME: Replace this by a per signal struct xarray once there is
++ * a plan to handle the resulting CRIU regression gracefully.
++ */
++ for (cnt = 0; cnt <= INT_MAX; cnt++) {
+ spin_lock(&hash_lock);
+- head = &posix_timers_hashtable[hash(sig, sig->posix_timer_id)];
+- if (!__posix_timers_find(head, sig, sig->posix_timer_id)) {
++ id = sig->next_posix_timer_id;
++
++ /* Write the next ID back. Clamp it to the positive space */
++ sig->next_posix_timer_id = (id + 1) & INT_MAX;
++
++ head = &posix_timers_hashtable[hash(sig, id)];
++ if (!__posix_timers_find(head, sig, id)) {
+ hlist_add_head_rcu(&timer->t_hash, head);
+- ret = sig->posix_timer_id;
++ spin_unlock(&hash_lock);
++ return id;
+ }
+- if (++sig->posix_timer_id < 0)
+- sig->posix_timer_id = 0;
+- if ((sig->posix_timer_id == first_free_id) && (ret == -ENOENT))
+- /* Loop over all possible ids completed */
+- ret = -EAGAIN;
+ spin_unlock(&hash_lock);
+- } while (ret == -ENOENT);
+- return ret;
++ }
++ /* POSIX return code when no timer ID could be allocated */
++ return -EAGAIN;
+ }
+
+ static inline void unlock_timer(struct k_itimer *timr, unsigned long flags)
+diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
+index c8c61381eba48..d06938ae07174 100644
+--- a/kernel/trace/trace_events_hist.c
++++ b/kernel/trace/trace_events_hist.c
+@@ -6668,7 +6668,8 @@ static int event_hist_trigger_parse(struct event_command *cmd_ops,
+ goto out_unreg;
+
+ if (has_hist_vars(hist_data) || hist_data->n_var_refs) {
+- if (save_hist_vars(hist_data))
++ ret = save_hist_vars(hist_data);
++ if (ret)
+ goto out_unreg;
+ }
+
+diff --git a/lib/iov_iter.c b/lib/iov_iter.c
+index 960223ed91991..061cc3ed58f5b 100644
+--- a/lib/iov_iter.c
++++ b/lib/iov_iter.c
+@@ -1795,7 +1795,7 @@ uaccess_end:
+ return ret;
+ }
+
+-static int copy_iovec_from_user(struct iovec *iov,
++static __noclone int copy_iovec_from_user(struct iovec *iov,
+ const struct iovec __user *uiov, unsigned long nr_segs)
+ {
+ int ret = -EFAULT;
+diff --git a/lib/maple_tree.c b/lib/maple_tree.c
+index 35264f1936a37..bb28a49d173c0 100644
+--- a/lib/maple_tree.c
++++ b/lib/maple_tree.c
+@@ -3693,7 +3693,8 @@ static inline int mas_root_expand(struct ma_state *mas, void *entry)
+ mas->offset = slot;
+ pivots[slot] = mas->last;
+ if (mas->last != ULONG_MAX)
+- slot++;
++ pivots[++slot] = ULONG_MAX;
++
+ mas->depth = 1;
+ mas_set_height(mas);
+ ma_set_meta(node, maple_leaf_64, 0, slot);
+diff --git a/mm/mlock.c b/mm/mlock.c
+index 40b43f8740dfb..39e03a37f0a98 100644
+--- a/mm/mlock.c
++++ b/mm/mlock.c
+@@ -471,7 +471,6 @@ static int apply_vma_lock_flags(unsigned long start, size_t len,
+ {
+ unsigned long nstart, end, tmp;
+ struct vm_area_struct *vma, *prev;
+- int error;
+ VMA_ITERATOR(vmi, current->mm, start);
+
+ VM_BUG_ON(offset_in_page(start));
+@@ -492,6 +491,7 @@ static int apply_vma_lock_flags(unsigned long start, size_t len,
+ nstart = start;
+ tmp = vma->vm_start;
+ for_each_vma_range(vmi, vma, end) {
++ int error;
+ vm_flags_t newflags;
+
+ if (vma->vm_start != tmp)
+@@ -505,14 +505,15 @@ static int apply_vma_lock_flags(unsigned long start, size_t len,
+ tmp = end;
+ error = mlock_fixup(&vmi, vma, &prev, nstart, tmp, newflags);
+ if (error)
+- break;
++ return error;
++ tmp = vma_iter_end(&vmi);
+ nstart = tmp;
+ }
+
+- if (vma_iter_end(&vmi) < end)
++ if (tmp < end)
+ return -ENOMEM;
+
+- return error;
++ return 0;
+ }
+
+ /*
+diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
+index 2275e0d9f8419..31c115b225e7e 100644
+--- a/net/bluetooth/hci_conn.c
++++ b/net/bluetooth/hci_conn.c
+@@ -118,7 +118,7 @@ static void hci_connect_le_scan_cleanup(struct hci_conn *conn, u8 status)
+ */
+ params->explicit_connect = false;
+
+-	list_del_init(&params->action);
++ hci_pend_le_list_del_init(params);
+
+ switch (params->auto_connect) {
+ case HCI_AUTO_CONN_EXPLICIT:
+@@ -127,10 +127,10 @@ static void hci_connect_le_scan_cleanup(struct hci_conn *conn, u8 status)
+ return;
+ case HCI_AUTO_CONN_DIRECT:
+ case HCI_AUTO_CONN_ALWAYS:
+-		list_add(&params->action, &hdev->pend_le_conns);
++ hci_pend_le_list_add(params, &hdev->pend_le_conns);
+ break;
+ case HCI_AUTO_CONN_REPORT:
+-		list_add(&params->action, &hdev->pend_le_reports);
++ hci_pend_le_list_add(params, &hdev->pend_le_reports);
+ break;
+ default:
+ break;
+@@ -1426,8 +1426,8 @@ static int hci_explicit_conn_params_set(struct hci_dev *hdev,
+ if (params->auto_connect == HCI_AUTO_CONN_DISABLED ||
+ params->auto_connect == HCI_AUTO_CONN_REPORT ||
+ params->auto_connect == HCI_AUTO_CONN_EXPLICIT) {
+-		list_del_init(&params->action);
+-		list_add(&params->action, &hdev->pend_le_conns);
++ hci_pend_le_list_del_init(params);
++ hci_pend_le_list_add(params, &hdev->pend_le_conns);
+ }
+
+ params->explicit_connect = true;
+@@ -1684,7 +1684,7 @@ struct hci_conn *hci_connect_sco(struct hci_dev *hdev, int type, bdaddr_t *dst,
+ if (!link) {
+ hci_conn_drop(acl);
+ hci_conn_drop(sco);
+- return NULL;
++ return ERR_PTR(-ENOLINK);
+ }
+
+ sco->setting = setting;
+@@ -2256,7 +2256,7 @@ struct hci_conn *hci_connect_cis(struct hci_dev *hdev, bdaddr_t *dst,
+ if (!link) {
+ hci_conn_drop(le);
+ hci_conn_drop(cis);
+- return NULL;
++ return ERR_PTR(-ENOLINK);
+ }
+
+ /* If LE is already connected and CIS handle is already set proceed to
+diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
+index 48917c68358de..1ec83985f1ab0 100644
+--- a/net/bluetooth/hci_core.c
++++ b/net/bluetooth/hci_core.c
+@@ -1972,6 +1972,7 @@ static int hci_remove_adv_monitor(struct hci_dev *hdev,
+ struct adv_monitor *monitor)
+ {
+ int status = 0;
++ int handle;
+
+ switch (hci_get_adv_monitor_offload_ext(hdev)) {
+ case HCI_ADV_MONITOR_EXT_NONE: /* also goes here when powered off */
+@@ -1980,9 +1981,10 @@ static int hci_remove_adv_monitor(struct hci_dev *hdev,
+ goto free_monitor;
+
+ case HCI_ADV_MONITOR_EXT_MSFT:
++ handle = monitor->handle;
+ status = msft_remove_monitor(hdev, monitor);
+ bt_dev_dbg(hdev, "%s remove monitor %d msft status %d",
+- hdev->name, monitor->handle, status);
++ hdev->name, handle, status);
+ break;
+ }
+
+@@ -2249,21 +2251,45 @@ struct hci_conn_params *hci_conn_params_lookup(struct hci_dev *hdev,
+ return NULL;
+ }
+
+-/* This function requires the caller holds hdev->lock */
++/* This function requires the caller holds hdev->lock or rcu_read_lock */
+ struct hci_conn_params *hci_pend_le_action_lookup(struct list_head *list,
+ bdaddr_t *addr, u8 addr_type)
+ {
+ struct hci_conn_params *param;
+
+- list_for_each_entry(param, list, action) {
++ rcu_read_lock();
++
++ list_for_each_entry_rcu(param, list, action) {
+ 		if (bacmp(&param->addr, addr) == 0 &&
+- param->addr_type == addr_type)
++ param->addr_type == addr_type) {
++ rcu_read_unlock();
+ return param;
++ }
+ }
+
++ rcu_read_unlock();
++
+ return NULL;
+ }
+
++/* This function requires the caller holds hdev->lock */
++void hci_pend_le_list_del_init(struct hci_conn_params *param)
++{
++	if (list_empty(&param->action))
++ return;
++
++	list_del_rcu(&param->action);
++ synchronize_rcu();
++	INIT_LIST_HEAD(&param->action);
++}
++
++/* This function requires the caller holds hdev->lock */
++void hci_pend_le_list_add(struct hci_conn_params *param,
++ struct list_head *list)
++{
++	list_add_rcu(&param->action, list);
++}
++
+ /* This function requires the caller holds hdev->lock */
+ struct hci_conn_params *hci_conn_params_add(struct hci_dev *hdev,
+ bdaddr_t *addr, u8 addr_type)
+@@ -2297,14 +2323,15 @@ struct hci_conn_params *hci_conn_params_add(struct hci_dev *hdev,
+ return params;
+ }
+
+-static void hci_conn_params_free(struct hci_conn_params *params)
++void hci_conn_params_free(struct hci_conn_params *params)
+ {
++ hci_pend_le_list_del_init(params);
++
+ if (params->conn) {
+ hci_conn_drop(params->conn);
+ hci_conn_put(params->conn);
+ }
+
+-	list_del(&params->action);
+ 	list_del(&params->list);
+ kfree(params);
+ }
+@@ -2342,8 +2369,7 @@ void hci_conn_params_clear_disabled(struct hci_dev *hdev)
+ continue;
+ }
+
+-		list_del(&params->list);
+- kfree(params);
++ hci_conn_params_free(params);
+ }
+
+ BT_DBG("All LE disabled connection parameters were removed");
+diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
+index 21e26d3b286cc..cb0b5fe7a6f8c 100644
+--- a/net/bluetooth/hci_event.c
++++ b/net/bluetooth/hci_event.c
+@@ -1564,7 +1564,7 @@ static u8 hci_cc_le_set_privacy_mode(struct hci_dev *hdev, void *data,
+
+ params = hci_conn_params_lookup(hdev, &cp->bdaddr, cp->bdaddr_type);
+ if (params)
+- params->privacy_mode = cp->mode;
++ WRITE_ONCE(params->privacy_mode, cp->mode);
+
+ hci_dev_unlock(hdev);
+
+@@ -2784,6 +2784,9 @@ static void hci_cs_disconnect(struct hci_dev *hdev, u8 status)
+ hci_enable_advertising(hdev);
+ }
+
++ /* Inform sockets conn is gone before we delete it */
++ hci_disconn_cfm(conn, HCI_ERROR_UNSPECIFIED);
++
+ goto done;
+ }
+
+@@ -2804,8 +2807,8 @@ static void hci_cs_disconnect(struct hci_dev *hdev, u8 status)
+
+ case HCI_AUTO_CONN_DIRECT:
+ case HCI_AUTO_CONN_ALWAYS:
+-		list_del_init(&params->action);
+-		list_add(&params->action, &hdev->pend_le_conns);
++ hci_pend_le_list_del_init(params);
++ hci_pend_le_list_add(params, &hdev->pend_le_conns);
+ break;
+
+ default:
+@@ -3423,8 +3426,8 @@ static void hci_disconn_complete_evt(struct hci_dev *hdev, void *data,
+
+ case HCI_AUTO_CONN_DIRECT:
+ case HCI_AUTO_CONN_ALWAYS:
+-		list_del_init(&params->action);
+-		list_add(&params->action, &hdev->pend_le_conns);
++ hci_pend_le_list_del_init(params);
++ hci_pend_le_list_add(params, &hdev->pend_le_conns);
+ hci_update_passive_scan(hdev);
+ break;
+
+@@ -5961,7 +5964,7 @@ static void le_conn_complete_evt(struct hci_dev *hdev, u8 status,
+ params = hci_pend_le_action_lookup(&hdev->pend_le_conns, &conn->dst,
+ conn->dst_type);
+ if (params) {
+-		list_del_init(&params->action);
++ hci_pend_le_list_del_init(params);
+ if (params->conn) {
+ hci_conn_drop(params->conn);
+ hci_conn_put(params->conn);
+diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
+index b5b1b610df335..1bcb54272dc67 100644
+--- a/net/bluetooth/hci_sync.c
++++ b/net/bluetooth/hci_sync.c
+@@ -2160,15 +2160,23 @@ static int hci_le_del_accept_list_sync(struct hci_dev *hdev,
+ return 0;
+ }
+
++struct conn_params {
++ bdaddr_t addr;
++ u8 addr_type;
++ hci_conn_flags_t flags;
++ u8 privacy_mode;
++};
++
+ /* Adds connection to resolve list if needed.
+ * Setting params to NULL programs local hdev->irk
+ */
+ static int hci_le_add_resolve_list_sync(struct hci_dev *hdev,
+- struct hci_conn_params *params)
++ struct conn_params *params)
+ {
+ struct hci_cp_le_add_to_resolv_list cp;
+ struct smp_irk *irk;
+ struct bdaddr_list_with_irk *entry;
++ struct hci_conn_params *p;
+
+ if (!use_ll_privacy(hdev))
+ return 0;
+@@ -2203,6 +2211,16 @@ static int hci_le_add_resolve_list_sync(struct hci_dev *hdev,
+ /* Default privacy mode is always Network */
+ params->privacy_mode = HCI_NETWORK_PRIVACY;
+
++ rcu_read_lock();
++ p = hci_pend_le_action_lookup(&hdev->pend_le_conns,
++				      &params->addr, params->addr_type);
++ if (!p)
++ p = hci_pend_le_action_lookup(&hdev->pend_le_reports,
++					      &params->addr, params->addr_type);
++ if (p)
++ WRITE_ONCE(p->privacy_mode, HCI_NETWORK_PRIVACY);
++ rcu_read_unlock();
++
+ done:
+ if (hci_dev_test_flag(hdev, HCI_PRIVACY))
+ memcpy(cp.local_irk, hdev->irk, 16);
+@@ -2215,7 +2233,7 @@ done:
+
+ /* Set Device Privacy Mode. */
+ static int hci_le_set_privacy_mode_sync(struct hci_dev *hdev,
+- struct hci_conn_params *params)
++ struct conn_params *params)
+ {
+ struct hci_cp_le_set_privacy_mode cp;
+ struct smp_irk *irk;
+@@ -2240,6 +2258,8 @@ static int hci_le_set_privacy_mode_sync(struct hci_dev *hdev,
+ bacpy(&cp.bdaddr, &irk->bdaddr);
+ cp.mode = HCI_DEVICE_PRIVACY;
+
++ /* Note: params->privacy_mode is not updated since it is a copy */
++
+ return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_PRIVACY_MODE,
+ sizeof(cp), &cp, HCI_CMD_TIMEOUT);
+ }
+@@ -2249,7 +2269,7 @@ static int hci_le_set_privacy_mode_sync(struct hci_dev *hdev,
+ * properly set the privacy mode.
+ */
+ static int hci_le_add_accept_list_sync(struct hci_dev *hdev,
+- struct hci_conn_params *params,
++ struct conn_params *params,
+ u8 *num_entries)
+ {
+ struct hci_cp_le_add_to_accept_list cp;
+@@ -2447,6 +2467,52 @@ struct sk_buff *hci_read_local_oob_data_sync(struct hci_dev *hdev,
+ return __hci_cmd_sync_sk(hdev, opcode, 0, NULL, 0, HCI_CMD_TIMEOUT, sk);
+ }
+
++static struct conn_params *conn_params_copy(struct list_head *list, size_t *n)
++{
++ struct hci_conn_params *params;
++ struct conn_params *p;
++ size_t i;
++
++ rcu_read_lock();
++
++ i = 0;
++ list_for_each_entry_rcu(params, list, action)
++ ++i;
++ *n = i;
++
++ rcu_read_unlock();
++
++ p = kvcalloc(*n, sizeof(struct conn_params), GFP_KERNEL);
++ if (!p)
++ return NULL;
++
++ rcu_read_lock();
++
++ i = 0;
++ list_for_each_entry_rcu(params, list, action) {
++ /* Racing adds are handled in next scan update */
++ if (i >= *n)
++ break;
++
++ /* No hdev->lock, but: addr, addr_type are immutable.
++ * privacy_mode is only written by us or in
++ * hci_cc_le_set_privacy_mode that we wait for.
++ * We should be idempotent so MGMT updating flags
++ * while we are processing is OK.
++ */
++		bacpy(&p[i].addr, &params->addr);
++ p[i].addr_type = params->addr_type;
++ p[i].flags = READ_ONCE(params->flags);
++ p[i].privacy_mode = READ_ONCE(params->privacy_mode);
++ ++i;
++ }
++
++ rcu_read_unlock();
++
++ *n = i;
++ return p;
++}
++
+ /* Device must not be scanning when updating the accept list.
+ *
+ * Update is done using the following sequence:
+@@ -2466,11 +2532,12 @@ struct sk_buff *hci_read_local_oob_data_sync(struct hci_dev *hdev,
+ */
+ static u8 hci_update_accept_list_sync(struct hci_dev *hdev)
+ {
+- struct hci_conn_params *params;
++ struct conn_params *params;
+ struct bdaddr_list *b, *t;
+ u8 num_entries = 0;
+ bool pend_conn, pend_report;
+ u8 filter_policy;
++ size_t i, n;
+ int err;
+
+ /* Pause advertising if resolving list can be used as controllers
+@@ -2504,6 +2571,7 @@ static u8 hci_update_accept_list_sync(struct hci_dev *hdev)
+ if (hci_conn_hash_lookup_le(hdev, &b->bdaddr, b->bdaddr_type))
+ continue;
+
++ /* Pointers not dereferenced, no locks needed */
+ pend_conn = hci_pend_le_action_lookup(&hdev->pend_le_conns,
+ &b->bdaddr,
+ b->bdaddr_type);
+@@ -2532,23 +2600,50 @@ static u8 hci_update_accept_list_sync(struct hci_dev *hdev)
+ * available accept list entries in the controller, then
+ * just abort and return filer policy value to not use the
+ * accept list.
++ *
++ * The list and params may be mutated while we wait for events,
++ * so make a copy and iterate it.
+ */
+- list_for_each_entry(params, &hdev->pend_le_conns, action) {
+- err = hci_le_add_accept_list_sync(hdev, params, &num_entries);
+- if (err)
++
++ params = conn_params_copy(&hdev->pend_le_conns, &n);
++ if (!params) {
++ err = -ENOMEM;
++ goto done;
++ }
++
++ for (i = 0; i < n; ++i) {
++		err = hci_le_add_accept_list_sync(hdev, &params[i],
++ &num_entries);
++ if (err) {
++ kvfree(params);
+ goto done;
++ }
+ }
+
++ kvfree(params);
++
+ /* After adding all new pending connections, walk through
+ * the list of pending reports and also add these to the
+ * accept list if there is still space. Abort if space runs out.
+ */
+- list_for_each_entry(params, &hdev->pend_le_reports, action) {
+- err = hci_le_add_accept_list_sync(hdev, params, &num_entries);
+- if (err)
++
++ params = conn_params_copy(&hdev->pend_le_reports, &n);
++ if (!params) {
++ err = -ENOMEM;
++ goto done;
++ }
++
++ for (i = 0; i < n; ++i) {
++		err = hci_le_add_accept_list_sync(hdev, &params[i],
++ &num_entries);
++ if (err) {
++ kvfree(params);
+ goto done;
++ }
+ }
+
++ kvfree(params);
++
+ /* Use the allowlist unless the following conditions are all true:
+ * - We are not currently suspending
+ * - There are 1 or more ADV monitors registered and it's not offloaded
+@@ -4839,12 +4934,12 @@ static void hci_pend_le_actions_clear(struct hci_dev *hdev)
+ struct hci_conn_params *p;
+
+ list_for_each_entry(p, &hdev->le_conn_params, list) {
++ hci_pend_le_list_del_init(p);
+ if (p->conn) {
+ hci_conn_drop(p->conn);
+ hci_conn_put(p->conn);
+ p->conn = NULL;
+ }
+- list_del_init(&p->action);
+ }
+
+ BT_DBG("All LE pending actions cleared");
+diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c
+index 34d55a85d8f6f..94d5bc104fede 100644
+--- a/net/bluetooth/iso.c
++++ b/net/bluetooth/iso.c
+@@ -123,8 +123,11 @@ static struct iso_conn *iso_conn_add(struct hci_conn *hcon)
+ {
+ struct iso_conn *conn = hcon->iso_data;
+
+- if (conn)
++ if (conn) {
++ if (!conn->hcon)
++ conn->hcon = hcon;
+ return conn;
++ }
+
+ conn = kzalloc(sizeof(*conn), GFP_KERNEL);
+ if (!conn)
+@@ -300,14 +303,13 @@ static int iso_connect_bis(struct sock *sk)
+ goto unlock;
+ }
+
+- hci_dev_unlock(hdev);
+- hci_dev_put(hdev);
++ lock_sock(sk);
+
+ err = iso_chan_add(conn, sk, NULL);
+- if (err)
+- return err;
+-
+- lock_sock(sk);
++ if (err) {
++ release_sock(sk);
++ goto unlock;
++ }
+
+ /* Update source addr of the socket */
+ bacpy(&iso_pi(sk)->src, &hcon->src);
+@@ -321,7 +323,6 @@ static int iso_connect_bis(struct sock *sk)
+ }
+
+ release_sock(sk);
+- return err;
+
+ unlock:
+ hci_dev_unlock(hdev);
+@@ -389,14 +390,13 @@ static int iso_connect_cis(struct sock *sk)
+ goto unlock;
+ }
+
+- hci_dev_unlock(hdev);
+- hci_dev_put(hdev);
++ lock_sock(sk);
+
+ err = iso_chan_add(conn, sk, NULL);
+- if (err)
+- return err;
+-
+- lock_sock(sk);
++ if (err) {
++ release_sock(sk);
++ goto unlock;
++ }
+
+ /* Update source addr of the socket */
+ bacpy(&iso_pi(sk)->src, &hcon->src);
+@@ -413,7 +413,6 @@ static int iso_connect_cis(struct sock *sk)
+ }
+
+ release_sock(sk);
+- return err;
+
+ unlock:
+ hci_dev_unlock(hdev);
+@@ -1072,8 +1071,8 @@ static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg,
+ size_t len)
+ {
+ struct sock *sk = sock->sk;
+- struct iso_conn *conn = iso_pi(sk)->conn;
+ struct sk_buff *skb, **frag;
++ size_t mtu;
+ int err;
+
+ BT_DBG("sock %p, sk %p", sock, sk);
+@@ -1085,11 +1084,18 @@ static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg,
+ if (msg->msg_flags & MSG_OOB)
+ return -EOPNOTSUPP;
+
+- if (sk->sk_state != BT_CONNECTED)
++ lock_sock(sk);
++
++ if (sk->sk_state != BT_CONNECTED) {
++ release_sock(sk);
+ return -ENOTCONN;
++ }
++
++ mtu = iso_pi(sk)->conn->hcon->hdev->iso_mtu;
++
++ release_sock(sk);
+
+- skb = bt_skb_sendmsg(sk, msg, len, conn->hcon->hdev->iso_mtu,
+- HCI_ISO_DATA_HDR_SIZE, 0);
++ skb = bt_skb_sendmsg(sk, msg, len, mtu, HCI_ISO_DATA_HDR_SIZE, 0);
+ if (IS_ERR(skb))
+ return PTR_ERR(skb);
+
+@@ -1102,8 +1108,7 @@ static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg,
+ while (len) {
+ struct sk_buff *tmp;
+
+- tmp = bt_skb_sendmsg(sk, msg, len, conn->hcon->hdev->iso_mtu,
+- 0, 0);
++ tmp = bt_skb_sendmsg(sk, msg, len, mtu, 0, 0);
+ if (IS_ERR(tmp)) {
+ kfree_skb(skb);
+ return PTR_ERR(tmp);
+@@ -1158,15 +1163,19 @@ static int iso_sock_recvmsg(struct socket *sock, struct msghdr *msg,
+ BT_DBG("sk %p", sk);
+
+ if (test_and_clear_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) {
++ lock_sock(sk);
+ switch (sk->sk_state) {
+ case BT_CONNECT2:
+- lock_sock(sk);
+ iso_conn_defer_accept(pi->conn->hcon);
+ sk->sk_state = BT_CONFIG;
+ release_sock(sk);
+ return 0;
+ case BT_CONNECT:
++ release_sock(sk);
+ return iso_connect_cis(sk);
++ default:
++ release_sock(sk);
++ break;
+ }
+ }
+
+diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
+index f7b2d0971f240..1e07d0f289723 100644
+--- a/net/bluetooth/mgmt.c
++++ b/net/bluetooth/mgmt.c
+@@ -1297,15 +1297,15 @@ static void restart_le_actions(struct hci_dev *hdev)
+ /* Needed for AUTO_OFF case where might not "really"
+ * have been powered off.
+ */
+- list_del_init(&p->action);
++ hci_pend_le_list_del_init(p);
+
+ switch (p->auto_connect) {
+ case HCI_AUTO_CONN_DIRECT:
+ case HCI_AUTO_CONN_ALWAYS:
+- list_add(&p->action, &hdev->pend_le_conns);
++ hci_pend_le_list_add(p, &hdev->pend_le_conns);
+ break;
+ case HCI_AUTO_CONN_REPORT:
+- list_add(&p->action, &hdev->pend_le_reports);
++ hci_pend_le_list_add(p, &hdev->pend_le_reports);
+ break;
+ default:
+ break;
+@@ -5169,7 +5169,7 @@ static int set_device_flags(struct sock *sk, struct hci_dev *hdev, void *data,
+ goto unlock;
+ }
+
+- params->flags = current_flags;
++ WRITE_ONCE(params->flags, current_flags);
+ status = MGMT_STATUS_SUCCESS;
+
+ /* Update passive scan if HCI_CONN_FLAG_DEVICE_PRIVACY
+@@ -7580,7 +7580,7 @@ static int hci_conn_params_set(struct hci_dev *hdev, bdaddr_t *addr,
+ if (params->auto_connect == auto_connect)
+ return 0;
+
+-	list_del_init(&params->action);
++ hci_pend_le_list_del_init(params);
+
+ switch (auto_connect) {
+ case HCI_AUTO_CONN_DISABLED:
+@@ -7589,18 +7589,18 @@ static int hci_conn_params_set(struct hci_dev *hdev, bdaddr_t *addr,
+ * connect to device, keep connecting.
+ */
+ if (params->explicit_connect)
+-			list_add(&params->action, &hdev->pend_le_conns);
++ hci_pend_le_list_add(params, &hdev->pend_le_conns);
+ break;
+ case HCI_AUTO_CONN_REPORT:
+ if (params->explicit_connect)
+-			list_add(&params->action, &hdev->pend_le_conns);
++ hci_pend_le_list_add(params, &hdev->pend_le_conns);
+ else
+-			list_add(&params->action, &hdev->pend_le_reports);
++ hci_pend_le_list_add(params, &hdev->pend_le_reports);
+ break;
+ case HCI_AUTO_CONN_DIRECT:
+ case HCI_AUTO_CONN_ALWAYS:
+ if (!is_connected(hdev, addr, addr_type))
+-			list_add(&params->action, &hdev->pend_le_conns);
++ hci_pend_le_list_add(params, &hdev->pend_le_conns);
+ break;
+ }
+
+@@ -7823,9 +7823,7 @@ static int remove_device(struct sock *sk, struct hci_dev *hdev,
+ goto unlock;
+ }
+
+-		list_del(&params->action);
+-		list_del(&params->list);
+- kfree(params);
++ hci_conn_params_free(params);
+
+ device_removed(sk, hdev, &cp->addr.bdaddr, cp->addr.type);
+ } else {
+@@ -7856,9 +7854,7 @@ static int remove_device(struct sock *sk, struct hci_dev *hdev,
+ p->auto_connect = HCI_AUTO_CONN_EXPLICIT;
+ continue;
+ }
+- list_del(&p->action);
+- list_del(&p->list);
+- kfree(p);
++ hci_conn_params_free(p);
+ }
+
+ bt_dev_dbg(hdev, "All LE connection parameters were removed");
+diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
+index cd1a27ac555d0..7762604ddfc05 100644
+--- a/net/bluetooth/sco.c
++++ b/net/bluetooth/sco.c
+@@ -126,8 +126,11 @@ static struct sco_conn *sco_conn_add(struct hci_conn *hcon)
+ struct hci_dev *hdev = hcon->hdev;
+ struct sco_conn *conn = hcon->sco_data;
+
+- if (conn)
++ if (conn) {
++ if (!conn->hcon)
++ conn->hcon = hcon;
+ return conn;
++ }
+
+ conn = kzalloc(sizeof(struct sco_conn), GFP_KERNEL);
+ if (!conn)
+@@ -268,21 +271,21 @@ static int sco_connect(struct sock *sk)
+ goto unlock;
+ }
+
+- hci_dev_unlock(hdev);
+- hci_dev_put(hdev);
+-
+ conn = sco_conn_add(hcon);
+ if (!conn) {
+ hci_conn_drop(hcon);
+- return -ENOMEM;
++ err = -ENOMEM;
++ goto unlock;
+ }
+
+- err = sco_chan_add(conn, sk, NULL);
+- if (err)
+- return err;
+-
+ lock_sock(sk);
+
++ err = sco_chan_add(conn, sk, NULL);
++ if (err) {
++ release_sock(sk);
++ goto unlock;
++ }
++
+ /* Update source addr of the socket */
+ bacpy(&sco_pi(sk)->src, &hcon->src);
+
+@@ -296,8 +299,6 @@ static int sco_connect(struct sock *sk)
+
+ release_sock(sk);
+
+- return err;
+-
+ unlock:
+ hci_dev_unlock(hdev);
+ hci_dev_put(hdev);
+diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
+index 75204d36d7f90..b65962682771f 100644
+--- a/net/bridge/br_stp_if.c
++++ b/net/bridge/br_stp_if.c
+@@ -201,6 +201,9 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
+ {
+ ASSERT_RTNL();
+
++ if (!net_eq(dev_net(br->dev), &init_net))
++ NL_SET_ERR_MSG_MOD(extack, "STP does not work in non-root netns");
++
+ if (br_mrp_enabled(br)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "STP can't be enabled if MRP is already enabled");
+diff --git a/net/can/bcm.c b/net/can/bcm.c
+index a962ec2b8ba5b..925d48cc50f81 100644
+--- a/net/can/bcm.c
++++ b/net/can/bcm.c
+@@ -1526,6 +1526,12 @@ static int bcm_release(struct socket *sock)
+
+ lock_sock(sk);
+
++#if IS_ENABLED(CONFIG_PROC_FS)
++ /* remove procfs entry */
++ if (net->can.bcmproc_dir && bo->bcm_proc_read)
++ remove_proc_entry(bo->procname, net->can.bcmproc_dir);
++#endif /* CONFIG_PROC_FS */
++
+ list_for_each_entry_safe(op, next, &bo->tx_ops, list)
+ bcm_remove_op(op);
+
+@@ -1561,12 +1567,6 @@ static int bcm_release(struct socket *sock)
+ list_for_each_entry_safe(op, next, &bo->rx_ops, list)
+ bcm_remove_op(op);
+
+-#if IS_ENABLED(CONFIG_PROC_FS)
+- /* remove procfs entry */
+- if (net->can.bcmproc_dir && bo->bcm_proc_read)
+- remove_proc_entry(bo->procname, net->can.bcmproc_dir);
+-#endif /* CONFIG_PROC_FS */
+-
+ /* remove device reference */
+ if (bo->bound) {
+ bo->bound = 0;
+diff --git a/net/devlink/health.c b/net/devlink/health.c
+index 0839706d5741a..194340a8bb863 100644
+--- a/net/devlink/health.c
++++ b/net/devlink/health.c
+@@ -480,7 +480,7 @@ static void devlink_recover_notify(struct devlink_health_reporter *reporter,
+ int err;
+
+ WARN_ON(cmd != DEVLINK_CMD_HEALTH_REPORTER_RECOVER);
+- WARN_ON(!xa_get_mark(&devlinks, devlink->index, DEVLINK_REGISTERED));
++ ASSERT_DEVLINK_REGISTERED(devlink);
+
+ msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+ if (!msg)
+diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
+index cd02549680767..790e61b2a9404 100644
+--- a/net/devlink/leftover.c
++++ b/net/devlink/leftover.c
+@@ -6772,7 +6772,10 @@ void devlink_notify_unregister(struct devlink *devlink)
+
+ static void devlink_port_type_warn(struct work_struct *work)
+ {
+- WARN(true, "Type was not set for devlink port.");
++ struct devlink_port *port = container_of(to_delayed_work(work),
++ struct devlink_port,
++ type_warn_dw);
++ dev_warn(port->devlink->dev, "Type was not set for devlink port.");
+ }
+
+ static bool devlink_port_type_should_warn(struct devlink_port *devlink_port)
+diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
+index ba06ed42e4284..2be2d49225573 100644
+--- a/net/ipv4/esp4.c
++++ b/net/ipv4/esp4.c
+@@ -1132,7 +1132,7 @@ static int esp_init_authenc(struct xfrm_state *x,
+ err = crypto_aead_setkey(aead, key, keylen);
+
+ free_key:
+- kfree(key);
++ kfree_sensitive(key);
+
+ error:
+ return err;
+diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
+index 1386787eaf1a5..3105a676eba76 100644
+--- a/net/ipv4/inet_connection_sock.c
++++ b/net/ipv4/inet_connection_sock.c
+@@ -1016,7 +1016,7 @@ static void reqsk_timer_handler(struct timer_list *t)
+
+ icsk = inet_csk(sk_listener);
+ net = sock_net(sk_listener);
+- max_syn_ack_retries = icsk->icsk_syn_retries ? :
++ max_syn_ack_retries = READ_ONCE(icsk->icsk_syn_retries) ? :
+ READ_ONCE(net->ipv4.sysctl_tcp_synack_retries);
+ /* Normally all the openreqs are young and become mature
+ * (i.e. converted to established socket) for first timeout.
+diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
+index e7391bf310a75..0819d6001b9ab 100644
+--- a/net/ipv4/inet_hashtables.c
++++ b/net/ipv4/inet_hashtables.c
+@@ -650,20 +650,8 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk)
+ spin_lock(lock);
+ if (osk) {
+ WARN_ON_ONCE(sk->sk_hash != osk->sk_hash);
+- ret = sk_hashed(osk);
+- if (ret) {
+- /* Before deleting the node, we insert a new one to make
+- * sure that the look-up-sk process would not miss either
+- * of them and that at least one node would exist in ehash
+- * table all the time. Otherwise there's a tiny chance
+- * that lookup process could find nothing in ehash table.
+- */
+- __sk_nulls_add_node_tail_rcu(sk, list);
+- sk_nulls_del_node_init_rcu(osk);
+- }
+- goto unlock;
+- }
+- if (found_dup_sk) {
++ ret = sk_nulls_del_node_init_rcu(osk);
++ } else if (found_dup_sk) {
+ *found_dup_sk = inet_ehash_lookup_by_sk(sk, list);
+ if (*found_dup_sk)
+ ret = false;
+@@ -672,7 +660,6 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk)
+ if (ret)
+ __sk_nulls_add_node_rcu(sk, list);
+
+-unlock:
+ spin_unlock(lock);
+
+ return ret;
+diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
+index 40052414c7c71..2c1b245dba8e8 100644
+--- a/net/ipv4/inet_timewait_sock.c
++++ b/net/ipv4/inet_timewait_sock.c
+@@ -88,10 +88,10 @@ void inet_twsk_put(struct inet_timewait_sock *tw)
+ }
+ EXPORT_SYMBOL_GPL(inet_twsk_put);
+
+-static void inet_twsk_add_node_tail_rcu(struct inet_timewait_sock *tw,
+- struct hlist_nulls_head *list)
++static void inet_twsk_add_node_rcu(struct inet_timewait_sock *tw,
++ struct hlist_nulls_head *list)
+ {
+- hlist_nulls_add_tail_rcu(&tw->tw_node, list);
++ hlist_nulls_add_head_rcu(&tw->tw_node, list);
+ }
+
+ static void inet_twsk_add_bind_node(struct inet_timewait_sock *tw,
+@@ -144,7 +144,7 @@ void inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
+
+ spin_lock(lock);
+
+- inet_twsk_add_node_tail_rcu(tw, &ehead->chain);
++ inet_twsk_add_node_rcu(tw, &ehead->chain);
+
+ /* Step 3: Remove SK from hash chain */
+ if (__sk_nulls_del_node_init_rcu(sk))
+diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
+index 61892268e8a6c..a1bead441026e 100644
+--- a/net/ipv4/ip_output.c
++++ b/net/ipv4/ip_output.c
+@@ -1692,7 +1692,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
+ const struct ip_options *sopt,
+ __be32 daddr, __be32 saddr,
+ const struct ip_reply_arg *arg,
+- unsigned int len, u64 transmit_time)
++ unsigned int len, u64 transmit_time, u32 txhash)
+ {
+ struct ip_options_data replyopts;
+ struct ipcm_cookie ipc;
+@@ -1755,6 +1755,8 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
+ arg->csum));
+ nskb->ip_summed = CHECKSUM_NONE;
+ nskb->mono_delivery_time = !!transmit_time;
++ if (txhash)
++ skb_set_hash(nskb, txhash, PKT_HASH_TYPE_L4);
+ ip_push_pending_frames(sk, &fl4);
+ }
+ out:
+diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
+index 8d20d9221238c..79f29e138fc9f 100644
+--- a/net/ipv4/tcp.c
++++ b/net/ipv4/tcp.c
+@@ -3400,7 +3400,7 @@ int tcp_sock_set_syncnt(struct sock *sk, int val)
+ return -EINVAL;
+
+ lock_sock(sk);
+- inet_csk(sk)->icsk_syn_retries = val;
++ WRITE_ONCE(inet_csk(sk)->icsk_syn_retries, val);
+ release_sock(sk);
+ return 0;
+ }
+@@ -3409,7 +3409,7 @@ EXPORT_SYMBOL(tcp_sock_set_syncnt);
+ void tcp_sock_set_user_timeout(struct sock *sk, u32 val)
+ {
+ lock_sock(sk);
+- inet_csk(sk)->icsk_user_timeout = val;
++ WRITE_ONCE(inet_csk(sk)->icsk_user_timeout, val);
+ release_sock(sk);
+ }
+ EXPORT_SYMBOL(tcp_sock_set_user_timeout);
+@@ -3421,7 +3421,8 @@ int tcp_sock_set_keepidle_locked(struct sock *sk, int val)
+ if (val < 1 || val > MAX_TCP_KEEPIDLE)
+ return -EINVAL;
+
+- tp->keepalive_time = val * HZ;
++ /* Paired with WRITE_ONCE() in keepalive_time_when() */
++ WRITE_ONCE(tp->keepalive_time, val * HZ);
+ if (sock_flag(sk, SOCK_KEEPOPEN) &&
+ !((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN))) {
+ u32 elapsed = keepalive_time_elapsed(tp);
+@@ -3453,7 +3454,7 @@ int tcp_sock_set_keepintvl(struct sock *sk, int val)
+ return -EINVAL;
+
+ lock_sock(sk);
+- tcp_sk(sk)->keepalive_intvl = val * HZ;
++ WRITE_ONCE(tcp_sk(sk)->keepalive_intvl, val * HZ);
+ release_sock(sk);
+ return 0;
+ }
+@@ -3465,7 +3466,8 @@ int tcp_sock_set_keepcnt(struct sock *sk, int val)
+ return -EINVAL;
+
+ lock_sock(sk);
+- tcp_sk(sk)->keepalive_probes = val;
++ /* Paired with READ_ONCE() in keepalive_probes() */
++ WRITE_ONCE(tcp_sk(sk)->keepalive_probes, val);
+ release_sock(sk);
+ return 0;
+ }
+@@ -3667,19 +3669,19 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+ if (val < 1 || val > MAX_TCP_KEEPINTVL)
+ err = -EINVAL;
+ else
+- tp->keepalive_intvl = val * HZ;
++ WRITE_ONCE(tp->keepalive_intvl, val * HZ);
+ break;
+ case TCP_KEEPCNT:
+ if (val < 1 || val > MAX_TCP_KEEPCNT)
+ err = -EINVAL;
+ else
+- tp->keepalive_probes = val;
++ WRITE_ONCE(tp->keepalive_probes, val);
+ break;
+ case TCP_SYNCNT:
+ if (val < 1 || val > MAX_TCP_SYNCNT)
+ err = -EINVAL;
+ else
+- icsk->icsk_syn_retries = val;
++ WRITE_ONCE(icsk->icsk_syn_retries, val);
+ break;
+
+ case TCP_SAVE_SYN:
+@@ -3692,18 +3694,18 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+
+ case TCP_LINGER2:
+ if (val < 0)
+- tp->linger2 = -1;
++ WRITE_ONCE(tp->linger2, -1);
+ else if (val > TCP_FIN_TIMEOUT_MAX / HZ)
+- tp->linger2 = TCP_FIN_TIMEOUT_MAX;
++ WRITE_ONCE(tp->linger2, TCP_FIN_TIMEOUT_MAX);
+ else
+- tp->linger2 = val * HZ;
++ WRITE_ONCE(tp->linger2, val * HZ);
+ break;
+
+ case TCP_DEFER_ACCEPT:
+ /* Translate value in seconds to number of retransmits */
+- icsk->icsk_accept_queue.rskq_defer_accept =
+- secs_to_retrans(val, TCP_TIMEOUT_INIT / HZ,
+- TCP_RTO_MAX / HZ);
++ WRITE_ONCE(icsk->icsk_accept_queue.rskq_defer_accept,
++ secs_to_retrans(val, TCP_TIMEOUT_INIT / HZ,
++ TCP_RTO_MAX / HZ));
+ break;
+
+ case TCP_WINDOW_CLAMP:
+@@ -3727,7 +3729,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+ if (val < 0)
+ err = -EINVAL;
+ else
+- icsk->icsk_user_timeout = val;
++ WRITE_ONCE(icsk->icsk_user_timeout, val);
+ break;
+
+ case TCP_FASTOPEN:
+@@ -3765,13 +3767,13 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+ if (!tp->repair)
+ err = -EPERM;
+ else
+- tp->tsoffset = val - tcp_time_stamp_raw();
++ WRITE_ONCE(tp->tsoffset, val - tcp_time_stamp_raw());
+ break;
+ case TCP_REPAIR_WINDOW:
+ err = tcp_repair_set_window(tp, optval, optlen);
+ break;
+ case TCP_NOTSENT_LOWAT:
+- tp->notsent_lowat = val;
++ WRITE_ONCE(tp->notsent_lowat, val);
+ sk->sk_write_space(sk);
+ break;
+ case TCP_INQ:
+@@ -3783,7 +3785,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
+ case TCP_TX_DELAY:
+ if (val)
+ tcp_enable_tx_delay();
+- tp->tcp_tx_delay = val;
++ WRITE_ONCE(tp->tcp_tx_delay, val);
+ break;
+ default:
+ err = -ENOPROTOOPT;
+@@ -4100,17 +4102,18 @@ int do_tcp_getsockopt(struct sock *sk, int level,
+ val = keepalive_probes(tp);
+ break;
+ case TCP_SYNCNT:
+- val = icsk->icsk_syn_retries ? :
++ val = READ_ONCE(icsk->icsk_syn_retries) ? :
+ READ_ONCE(net->ipv4.sysctl_tcp_syn_retries);
+ break;
+ case TCP_LINGER2:
+- val = tp->linger2;
++ val = READ_ONCE(tp->linger2);
+ if (val >= 0)
+ val = (val ? : READ_ONCE(net->ipv4.sysctl_tcp_fin_timeout)) / HZ;
+ break;
+ case TCP_DEFER_ACCEPT:
+- val = retrans_to_secs(icsk->icsk_accept_queue.rskq_defer_accept,
+- TCP_TIMEOUT_INIT / HZ, TCP_RTO_MAX / HZ);
++ val = READ_ONCE(icsk->icsk_accept_queue.rskq_defer_accept);
++ val = retrans_to_secs(val, TCP_TIMEOUT_INIT / HZ,
++ TCP_RTO_MAX / HZ);
+ break;
+ case TCP_WINDOW_CLAMP:
+ val = tp->window_clamp;
+@@ -4247,11 +4250,11 @@ int do_tcp_getsockopt(struct sock *sk, int level,
+ break;
+
+ case TCP_USER_TIMEOUT:
+- val = icsk->icsk_user_timeout;
++ val = READ_ONCE(icsk->icsk_user_timeout);
+ break;
+
+ case TCP_FASTOPEN:
+- val = icsk->icsk_accept_queue.fastopenq.max_qlen;
++ val = READ_ONCE(icsk->icsk_accept_queue.fastopenq.max_qlen);
+ break;
+
+ case TCP_FASTOPEN_CONNECT:
+@@ -4263,14 +4266,14 @@ int do_tcp_getsockopt(struct sock *sk, int level,
+ break;
+
+ case TCP_TX_DELAY:
+- val = tp->tcp_tx_delay;
++ val = READ_ONCE(tp->tcp_tx_delay);
+ break;
+
+ case TCP_TIMESTAMP:
+- val = tcp_time_stamp_raw() + tp->tsoffset;
++ val = tcp_time_stamp_raw() + READ_ONCE(tp->tsoffset);
+ break;
+ case TCP_NOTSENT_LOWAT:
+- val = tp->notsent_lowat;
++ val = READ_ONCE(tp->notsent_lowat);
+ break;
+ case TCP_INQ:
+ val = tp->recvmsg_inq;
+diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
+index 45cc7f1ca2961..85e4953f11821 100644
+--- a/net/ipv4/tcp_fastopen.c
++++ b/net/ipv4/tcp_fastopen.c
+@@ -296,6 +296,7 @@ static struct sock *tcp_fastopen_create_child(struct sock *sk,
+ static bool tcp_fastopen_queue_check(struct sock *sk)
+ {
+ struct fastopen_queue *fastopenq;
++ int max_qlen;
+
+ /* Make sure the listener has enabled fastopen, and we don't
+ * exceed the max # of pending TFO requests allowed before trying
+@@ -308,10 +309,11 @@ static bool tcp_fastopen_queue_check(struct sock *sk)
+ * temporarily vs a server not supporting Fast Open at all.
+ */
+ fastopenq = &inet_csk(sk)->icsk_accept_queue.fastopenq;
+- if (fastopenq->max_qlen == 0)
++ max_qlen = READ_ONCE(fastopenq->max_qlen);
++ if (max_qlen == 0)
+ return false;
+
+- if (fastopenq->qlen >= fastopenq->max_qlen) {
++ if (fastopenq->qlen >= max_qlen) {
+ struct request_sock *req1;
+ spin_lock(&fastopenq->lock);
+ req1 = fastopenq->rskq_rst_head;
+diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
+index 06d2573685ca9..f37d13ee7b4cc 100644
+--- a/net/ipv4/tcp_ipv4.c
++++ b/net/ipv4/tcp_ipv4.c
+@@ -307,8 +307,9 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
+ inet->inet_daddr,
+ inet->inet_sport,
+ usin->sin_port));
+- tp->tsoffset = secure_tcp_ts_off(net, inet->inet_saddr,
+- inet->inet_daddr);
++ WRITE_ONCE(tp->tsoffset,
++ secure_tcp_ts_off(net, inet->inet_saddr,
++ inet->inet_daddr));
+ }
+
+ inet->inet_id = get_random_u16();
+@@ -692,6 +693,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
+ u64 transmit_time = 0;
+ struct sock *ctl_sk;
+ struct net *net;
++ u32 txhash = 0;
+
+ /* Never send a reset in response to a reset. */
+ if (th->rst)
+@@ -829,6 +831,8 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
+ inet_twsk(sk)->tw_priority : sk->sk_priority;
+ transmit_time = tcp_transmit_time(sk);
+ xfrm_sk_clone_policy(ctl_sk, sk);
++ txhash = (sk->sk_state == TCP_TIME_WAIT) ?
++ inet_twsk(sk)->tw_txhash : sk->sk_txhash;
+ } else {
+ ctl_sk->sk_mark = 0;
+ ctl_sk->sk_priority = 0;
+@@ -837,7 +841,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
+ skb, &TCP_SKB_CB(skb)->header.h4.opt,
+ ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
+ &arg, arg.iov[0].iov_len,
+- transmit_time);
++ transmit_time, txhash);
+
+ xfrm_sk_free_policy(ctl_sk);
+ sock_net_set(ctl_sk, &init_net);
+@@ -859,7 +863,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
+ struct sk_buff *skb, u32 seq, u32 ack,
+ u32 win, u32 tsval, u32 tsecr, int oif,
+ struct tcp_md5sig_key *key,
+- int reply_flags, u8 tos)
++ int reply_flags, u8 tos, u32 txhash)
+ {
+ const struct tcphdr *th = tcp_hdr(skb);
+ struct {
+@@ -935,7 +939,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
+ skb, &TCP_SKB_CB(skb)->header.h4.opt,
+ ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
+ &arg, arg.iov[0].iov_len,
+- transmit_time);
++ transmit_time, txhash);
+
+ sock_net_set(ctl_sk, &init_net);
+ __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
+@@ -955,7 +959,8 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
+ tw->tw_bound_dev_if,
+ tcp_twsk_md5_key(tcptw),
+ tw->tw_transparent ? IP_REPLY_ARG_NOSRCCHECK : 0,
+- tw->tw_tos
++ tw->tw_tos,
++ tw->tw_txhash
+ );
+
+ inet_twsk_put(tw);
+@@ -984,11 +989,12 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
+ tcp_rsk(req)->rcv_nxt,
+ req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
+ tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
+- req->ts_recent,
++ READ_ONCE(req->ts_recent),
+ 0,
+ tcp_md5_do_lookup(sk, l3index, addr, AF_INET),
+ inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0,
+- ip_hdr(skb)->tos);
++ ip_hdr(skb)->tos,
++ READ_ONCE(tcp_rsk(req)->txhash));
+ }
+
+ /*
+@@ -2963,7 +2969,6 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
+ struct bpf_iter_meta meta;
+ struct bpf_prog *prog;
+ struct sock *sk = v;
+- bool slow;
+ uid_t uid;
+ int ret;
+
+@@ -2971,7 +2976,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
+ return 0;
+
+ if (sk_fullsock(sk))
+- slow = lock_sock_fast(sk);
++ lock_sock(sk);
+
+ if (unlikely(sk_unhashed(sk))) {
+ ret = SEQ_SKIP;
+@@ -2995,7 +3000,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
+
+ unlock:
+ if (sk_fullsock(sk))
+- unlock_sock_fast(sk, slow);
++ release_sock(sk);
+ return ret;
+
+ }
+diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
+index dac0d62120e62..62641d42b06b5 100644
+--- a/net/ipv4/tcp_minisocks.c
++++ b/net/ipv4/tcp_minisocks.c
+@@ -528,7 +528,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
+ newicsk->icsk_ack.lrcvtime = tcp_jiffies32;
+
+ newtp->lsndtime = tcp_jiffies32;
+- newsk->sk_txhash = treq->txhash;
++ newsk->sk_txhash = READ_ONCE(treq->txhash);
+ newtp->total_retrans = req->num_retrans;
+
+ tcp_init_xmit_timers(newsk);
+@@ -555,7 +555,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
+ newtp->max_window = newtp->snd_wnd;
+
+ if (newtp->rx_opt.tstamp_ok) {
+- newtp->rx_opt.ts_recent = req->ts_recent;
++ newtp->rx_opt.ts_recent = READ_ONCE(req->ts_recent);
+ newtp->rx_opt.ts_recent_stamp = ktime_get_seconds();
+ newtp->tcp_header_len = sizeof(struct tcphdr) + TCPOLEN_TSTAMP_ALIGNED;
+ } else {
+@@ -619,7 +619,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
+ tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL);
+
+ if (tmp_opt.saw_tstamp) {
+- tmp_opt.ts_recent = req->ts_recent;
++ tmp_opt.ts_recent = READ_ONCE(req->ts_recent);
+ if (tmp_opt.rcv_tsecr)
+ tmp_opt.rcv_tsecr -= tcp_rsk(req)->ts_off;
+ /* We do not store true stamp, but it is not required,
+@@ -758,8 +758,11 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
+
+ /* In sequence, PAWS is OK. */
+
++ /* TODO: We probably should defer ts_recent change once
++ * we take ownership of @req.
++ */
+ if (tmp_opt.saw_tstamp && !after(TCP_SKB_CB(skb)->seq, tcp_rsk(req)->rcv_nxt))
+- req->ts_recent = tmp_opt.rcv_tsval;
++ WRITE_ONCE(req->ts_recent, tmp_opt.rcv_tsval);
+
+ if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn) {
+ /* Truncate SYN, it is out of window starting
+diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
+index cfe128b81a010..518cb4abc8b4f 100644
+--- a/net/ipv4/tcp_output.c
++++ b/net/ipv4/tcp_output.c
+@@ -876,7 +876,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
+ if (likely(ireq->tstamp_ok)) {
+ opts->options |= OPTION_TS;
+ opts->tsval = tcp_skb_timestamp(skb) + tcp_rsk(req)->ts_off;
+- opts->tsecr = req->ts_recent;
++ opts->tsecr = READ_ONCE(req->ts_recent);
+ remaining -= TCPOLEN_TSTAMP_ALIGNED;
+ }
+ if (likely(ireq->sack_ok)) {
+@@ -3578,7 +3578,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
+ rcu_read_lock();
+ md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
+ #endif
+- skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4);
++ skb_set_hash(skb, READ_ONCE(tcp_rsk(req)->txhash), PKT_HASH_TYPE_L4);
+ /* bpf program will be interested in the tcp_flags */
+ TCP_SKB_CB(skb)->tcp_flags = TCPHDR_SYN | TCPHDR_ACK;
+ tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5,
+@@ -4121,7 +4121,7 @@ int tcp_rtx_synack(const struct sock *sk, struct request_sock *req)
+
+ /* Paired with WRITE_ONCE() in sock_setsockopt() */
+ if (READ_ONCE(sk->sk_txrehash) == SOCK_TXREHASH_ENABLED)
+- tcp_rsk(req)->txhash = net_tx_rndhash();
++ WRITE_ONCE(tcp_rsk(req)->txhash, net_tx_rndhash());
+ res = af_ops->send_synack(sk, NULL, &fl, req, NULL, TCP_SYNACK_NORMAL,
+ NULL);
+ if (!res) {
+diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
+index 1f01e15ca24fd..4a61832e7f69b 100644
+--- a/net/ipv4/udp_offload.c
++++ b/net/ipv4/udp_offload.c
+@@ -273,13 +273,20 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
+ __sum16 check;
+ __be16 newlen;
+
+- if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
+- return __udp_gso_segment_list(gso_skb, features, is_ipv6);
+-
+ mss = skb_shinfo(gso_skb)->gso_size;
+ if (gso_skb->len <= sizeof(*uh) + mss)
+ return ERR_PTR(-EINVAL);
+
++ if (skb_gso_ok(gso_skb, features | NETIF_F_GSO_ROBUST)) {
++ /* Packet is from an untrusted source, reset gso_segs. */
++ skb_shinfo(gso_skb)->gso_segs = DIV_ROUND_UP(gso_skb->len - sizeof(*uh),
++ mss);
++ return NULL;
++ }
++
++ if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
++ return __udp_gso_segment_list(gso_skb, features, is_ipv6);
++
+ skb_pull(gso_skb, sizeof(*uh));
+
+ /* clear destructor to avoid skb_segment assigning it to tail */
+@@ -387,8 +394,7 @@ static struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
+ if (!pskb_may_pull(skb, sizeof(struct udphdr)))
+ goto out;
+
+- if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 &&
+- !skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST))
++ if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4)
+ return __udp_gso_segment(skb, features, false);
+
+ mss = skb_shinfo(skb)->gso_size;
+diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
+index da80974ad23ae..070d87abf7c02 100644
+--- a/net/ipv6/ip6_gre.c
++++ b/net/ipv6/ip6_gre.c
+@@ -955,7 +955,8 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
+ goto tx_err;
+
+ if (skb->len > dev->mtu + dev->hard_header_len) {
+- pskb_trim(skb, dev->mtu + dev->hard_header_len);
++ if (pskb_trim(skb, dev->mtu + dev->hard_header_len))
++ goto tx_err;
+ truncate = true;
+ }
+
+diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
+index 7132eb213a7a2..f7c248a7f8d1d 100644
+--- a/net/ipv6/tcp_ipv6.c
++++ b/net/ipv6/tcp_ipv6.c
+@@ -1130,10 +1130,10 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
+ tcp_rsk(req)->rcv_nxt,
+ req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
+ tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
+- req->ts_recent, sk->sk_bound_dev_if,
++ READ_ONCE(req->ts_recent), sk->sk_bound_dev_if,
+ tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, l3index),
+ ipv6_get_dsfield(ipv6_hdr(skb)), 0, sk->sk_priority,
+- tcp_rsk(req)->txhash);
++ READ_ONCE(tcp_rsk(req)->txhash));
+ }
+
+
+diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
+index c39c1e32f9804..e0e10f6bcdc18 100644
+--- a/net/ipv6/udp_offload.c
++++ b/net/ipv6/udp_offload.c
+@@ -42,8 +42,7 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb,
+ if (!pskb_may_pull(skb, sizeof(struct udphdr)))
+ goto out;
+
+- if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 &&
+- !skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST))
++ if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4)
+ return __udp_gso_segment(skb, features, true);
+
+ mss = skb_shinfo(skb)->gso_size;
+diff --git a/net/llc/llc_input.c b/net/llc/llc_input.c
+index c309b72a58779..7cac441862e21 100644
+--- a/net/llc/llc_input.c
++++ b/net/llc/llc_input.c
+@@ -163,9 +163,6 @@ int llc_rcv(struct sk_buff *skb, struct net_device *dev,
+ void (*sta_handler)(struct sk_buff *skb);
+ void (*sap_handler)(struct llc_sap *sap, struct sk_buff *skb);
+
+- if (!net_eq(dev_net(dev), &init_net))
+- goto drop;
+-
+ /*
+ * When the interface is in promisc. mode, drop all the crap that it
+ * receives, do not try to analyse it.
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index 18546f9b2a63a..ccf0b3d80fd97 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -3684,8 +3684,6 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
+ if (err < 0)
+ return err;
+ }
+-
+- cond_resched();
+ }
+
+ return 0;
+@@ -3709,6 +3707,8 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
+ err = nft_chain_validate(&ctx, chain);
+ if (err < 0)
+ return err;
++
++ cond_resched();
+ }
+
+ return 0;
+@@ -4086,6 +4086,8 @@ static int nf_tables_delrule(struct sk_buff *skb, const struct nfnl_info *info,
+ list_for_each_entry(chain, &table->chains, list) {
+ if (!nft_is_active_next(net, chain))
+ continue;
++ if (nft_chain_is_bound(chain))
++ continue;
+
+ ctx.chain = chain;
+ err = nft_delrule_by_chain(&ctx);
+@@ -10482,6 +10484,9 @@ static int nft_verdict_init(const struct nft_ctx *ctx, struct nft_data *data,
+
+ if (!tb[NFTA_VERDICT_CODE])
+ return -EINVAL;
++
++ /* zero padding hole for memcmp */
++ memset(data, 0, sizeof(*data));
+ data->verdict.code = ntohl(nla_get_be32(tb[NFTA_VERDICT_CODE]));
+
+ switch (data->verdict.code) {
+@@ -10764,6 +10769,9 @@ static void __nft_release_table(struct net *net, struct nft_table *table)
+ ctx.family = table->family;
+ ctx.table = table;
+ list_for_each_entry(chain, &table->chains, list) {
++ if (nft_chain_is_bound(chain))
++ continue;
++
+ ctx.chain = chain;
+ list_for_each_entry_safe(rule, nr, &chain->rules, list) {
+ list_del(&rule->list);
+diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
+index 0452ee586c1cc..a81829c10feab 100644
+--- a/net/netfilter/nft_set_pipapo.c
++++ b/net/netfilter/nft_set_pipapo.c
+@@ -1930,7 +1930,11 @@ static void nft_pipapo_remove(const struct net *net, const struct nft_set *set,
+ int i, start, rules_fx;
+
+ match_start = data;
+- match_end = (const u8 *)nft_set_ext_key_end(&e->ext)->data;
++
++ if (nft_set_ext_exists(&e->ext, NFT_SET_EXT_KEY_END))
++ match_end = (const u8 *)nft_set_ext_key_end(&e->ext)->data;
++ else
++ match_end = data;
+
+ start = first_rule;
+ rules_fx = rules_f0;
+diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
+index 466c26df853a0..382c7a71f81f2 100644
+--- a/net/sched/cls_bpf.c
++++ b/net/sched/cls_bpf.c
+@@ -406,56 +406,6 @@ static int cls_bpf_prog_from_efd(struct nlattr **tb, struct cls_bpf_prog *prog,
+ return 0;
+ }
+
+-static int cls_bpf_set_parms(struct net *net, struct tcf_proto *tp,
+- struct cls_bpf_prog *prog, unsigned long base,
+- struct nlattr **tb, struct nlattr *est, u32 flags,
+- struct netlink_ext_ack *extack)
+-{
+- bool is_bpf, is_ebpf, have_exts = false;
+- u32 gen_flags = 0;
+- int ret;
+-
+- is_bpf = tb[TCA_BPF_OPS_LEN] && tb[TCA_BPF_OPS];
+- is_ebpf = tb[TCA_BPF_FD];
+- if ((!is_bpf && !is_ebpf) || (is_bpf && is_ebpf))
+- return -EINVAL;
+-
+- ret = tcf_exts_validate(net, tp, tb, est, &prog->exts, flags,
+- extack);
+- if (ret < 0)
+- return ret;
+-
+- if (tb[TCA_BPF_FLAGS]) {
+- u32 bpf_flags = nla_get_u32(tb[TCA_BPF_FLAGS]);
+-
+- if (bpf_flags & ~TCA_BPF_FLAG_ACT_DIRECT)
+- return -EINVAL;
+-
+- have_exts = bpf_flags & TCA_BPF_FLAG_ACT_DIRECT;
+- }
+- if (tb[TCA_BPF_FLAGS_GEN]) {
+- gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]);
+- if (gen_flags & ~CLS_BPF_SUPPORTED_GEN_FLAGS ||
+- !tc_flags_valid(gen_flags))
+- return -EINVAL;
+- }
+-
+- prog->exts_integrated = have_exts;
+- prog->gen_flags = gen_flags;
+-
+- ret = is_bpf ? cls_bpf_prog_from_ops(tb, prog) :
+- cls_bpf_prog_from_efd(tb, prog, gen_flags, tp);
+- if (ret < 0)
+- return ret;
+-
+- if (tb[TCA_BPF_CLASSID]) {
+- prog->res.classid = nla_get_u32(tb[TCA_BPF_CLASSID]);
+- tcf_bind_filter(tp, &prog->res, base);
+- }
+-
+- return 0;
+-}
+-
+ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb,
+ struct tcf_proto *tp, unsigned long base,
+ u32 handle, struct nlattr **tca,
+@@ -463,9 +413,12 @@ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb,
+ struct netlink_ext_ack *extack)
+ {
+ struct cls_bpf_head *head = rtnl_dereference(tp->root);
++ bool is_bpf, is_ebpf, have_exts = false;
+ struct cls_bpf_prog *oldprog = *arg;
+ struct nlattr *tb[TCA_BPF_MAX + 1];
++ bool bound_to_filter = false;
+ struct cls_bpf_prog *prog;
++ u32 gen_flags = 0;
+ int ret;
+
+ if (tca[TCA_OPTIONS] == NULL)
+@@ -504,11 +457,51 @@ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb,
+ goto errout;
+ prog->handle = handle;
+
+- ret = cls_bpf_set_parms(net, tp, prog, base, tb, tca[TCA_RATE], flags,
+- extack);
++ is_bpf = tb[TCA_BPF_OPS_LEN] && tb[TCA_BPF_OPS];
++ is_ebpf = tb[TCA_BPF_FD];
++ if ((!is_bpf && !is_ebpf) || (is_bpf && is_ebpf)) {
++ ret = -EINVAL;
++ goto errout_idr;
++ }
++
++ ret = tcf_exts_validate(net, tp, tb, tca[TCA_RATE], &prog->exts,
++ flags, extack);
++ if (ret < 0)
++ goto errout_idr;
++
++ if (tb[TCA_BPF_FLAGS]) {
++ u32 bpf_flags = nla_get_u32(tb[TCA_BPF_FLAGS]);
++
++ if (bpf_flags & ~TCA_BPF_FLAG_ACT_DIRECT) {
++ ret = -EINVAL;
++ goto errout_idr;
++ }
++
++ have_exts = bpf_flags & TCA_BPF_FLAG_ACT_DIRECT;
++ }
++ if (tb[TCA_BPF_FLAGS_GEN]) {
++ gen_flags = nla_get_u32(tb[TCA_BPF_FLAGS_GEN]);
++ if (gen_flags & ~CLS_BPF_SUPPORTED_GEN_FLAGS ||
++ !tc_flags_valid(gen_flags)) {
++ ret = -EINVAL;
++ goto errout_idr;
++ }
++ }
++
++ prog->exts_integrated = have_exts;
++ prog->gen_flags = gen_flags;
++
++ ret = is_bpf ? cls_bpf_prog_from_ops(tb, prog) :
++ cls_bpf_prog_from_efd(tb, prog, gen_flags, tp);
+ if (ret < 0)
+ goto errout_idr;
+
++ if (tb[TCA_BPF_CLASSID]) {
++ prog->res.classid = nla_get_u32(tb[TCA_BPF_CLASSID]);
++ tcf_bind_filter(tp, &prog->res, base);
++ bound_to_filter = true;
++ }
++
+ ret = cls_bpf_offload(tp, prog, oldprog, extack);
+ if (ret)
+ goto errout_parms;
+@@ -530,6 +523,8 @@ static int cls_bpf_change(struct net *net, struct sk_buff *in_skb,
+ return 0;
+
+ errout_parms:
++ if (bound_to_filter)
++ tcf_unbind_filter(tp, &prog->res);
+ cls_bpf_free_parms(prog);
+ errout_idr:
+ if (!oldprog)
+diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
+index fa3bbd187eb97..c4ed11df62548 100644
+--- a/net/sched/cls_matchall.c
++++ b/net/sched/cls_matchall.c
+@@ -159,26 +159,6 @@ static const struct nla_policy mall_policy[TCA_MATCHALL_MAX + 1] = {
+ [TCA_MATCHALL_FLAGS] = { .type = NLA_U32 },
+ };
+
+-static int mall_set_parms(struct net *net, struct tcf_proto *tp,
+- struct cls_mall_head *head,
+- unsigned long base, struct nlattr **tb,
+- struct nlattr *est, u32 flags, u32 fl_flags,
+- struct netlink_ext_ack *extack)
+-{
+- int err;
+-
+- err = tcf_exts_validate_ex(net, tp, tb, est, &head->exts, flags,
+- fl_flags, extack);
+- if (err < 0)
+- return err;
+-
+- if (tb[TCA_MATCHALL_CLASSID]) {
+- head->res.classid = nla_get_u32(tb[TCA_MATCHALL_CLASSID]);
+- tcf_bind_filter(tp, &head->res, base);
+- }
+- return 0;
+-}
+-
+ static int mall_change(struct net *net, struct sk_buff *in_skb,
+ struct tcf_proto *tp, unsigned long base,
+ u32 handle, struct nlattr **tca,
+@@ -187,6 +167,7 @@ static int mall_change(struct net *net, struct sk_buff *in_skb,
+ {
+ struct cls_mall_head *head = rtnl_dereference(tp->root);
+ struct nlattr *tb[TCA_MATCHALL_MAX + 1];
++ bool bound_to_filter = false;
+ struct cls_mall_head *new;
+ u32 userflags = 0;
+ int err;
+@@ -226,11 +207,17 @@ static int mall_change(struct net *net, struct sk_buff *in_skb,
+ goto err_alloc_percpu;
+ }
+
+- err = mall_set_parms(net, tp, new, base, tb, tca[TCA_RATE],
+- flags, new->flags, extack);
+- if (err)
++ err = tcf_exts_validate_ex(net, tp, tb, tca[TCA_RATE],
++ &new->exts, flags, new->flags, extack);
++ if (err < 0)
+ goto err_set_parms;
+
++ if (tb[TCA_MATCHALL_CLASSID]) {
++ new->res.classid = nla_get_u32(tb[TCA_MATCHALL_CLASSID]);
++ tcf_bind_filter(tp, &new->res, base);
++ bound_to_filter = true;
++ }
++
+ if (!tc_skip_hw(new->flags)) {
+ err = mall_replace_hw_filter(tp, new, (unsigned long)new,
+ extack);
+@@ -246,6 +233,8 @@ static int mall_change(struct net *net, struct sk_buff *in_skb,
+ return 0;
+
+ err_replace_hw_filter:
++ if (bound_to_filter)
++ tcf_unbind_filter(tp, &new->res);
+ err_set_parms:
+ free_percpu(new->pf);
+ err_alloc_percpu:
+diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
+index d15d50de79802..5abf31e432caf 100644
+--- a/net/sched/cls_u32.c
++++ b/net/sched/cls_u32.c
+@@ -712,8 +712,23 @@ static const struct nla_policy u32_policy[TCA_U32_MAX + 1] = {
+ [TCA_U32_FLAGS] = { .type = NLA_U32 },
+ };
+
++static void u32_unbind_filter(struct tcf_proto *tp, struct tc_u_knode *n,
++ struct nlattr **tb)
++{
++ if (tb[TCA_U32_CLASSID])
++ tcf_unbind_filter(tp, &n->res);
++}
++
++static void u32_bind_filter(struct tcf_proto *tp, struct tc_u_knode *n,
++ unsigned long base, struct nlattr **tb)
++{
++ if (tb[TCA_U32_CLASSID]) {
++ n->res.classid = nla_get_u32(tb[TCA_U32_CLASSID]);
++ tcf_bind_filter(tp, &n->res, base);
++ }
++}
++
+ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
+- unsigned long base,
+ struct tc_u_knode *n, struct nlattr **tb,
+ struct nlattr *est, u32 flags, u32 fl_flags,
+ struct netlink_ext_ack *extack)
+@@ -760,10 +775,6 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
+ if (ht_old)
+ ht_old->refcnt--;
+ }
+- if (tb[TCA_U32_CLASSID]) {
+- n->res.classid = nla_get_u32(tb[TCA_U32_CLASSID]);
+- tcf_bind_filter(tp, &n->res, base);
+- }
+
+ if (ifindex >= 0)
+ n->ifindex = ifindex;
+@@ -903,17 +914,27 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
+ if (!new)
+ return -ENOMEM;
+
+- err = u32_set_parms(net, tp, base, new, tb,
+- tca[TCA_RATE], flags, new->flags,
+- extack);
++ err = u32_set_parms(net, tp, new, tb, tca[TCA_RATE],
++ flags, new->flags, extack);
+
+ if (err) {
+ __u32_destroy_key(new);
+ return err;
+ }
+
++ u32_bind_filter(tp, new, base, tb);
++
+ err = u32_replace_hw_knode(tp, new, flags, extack);
+ if (err) {
++ u32_unbind_filter(tp, new, tb);
++
++ if (tb[TCA_U32_LINK]) {
++ struct tc_u_hnode *ht_old;
++
++ ht_old = rtnl_dereference(n->ht_down);
++ if (ht_old)
++ ht_old->refcnt++;
++ }
+ __u32_destroy_key(new);
+ return err;
+ }
+@@ -1074,15 +1095,18 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
+ }
+ #endif
+
+- err = u32_set_parms(net, tp, base, n, tb, tca[TCA_RATE],
++ err = u32_set_parms(net, tp, n, tb, tca[TCA_RATE],
+ flags, n->flags, extack);
++
++ u32_bind_filter(tp, n, base, tb);
++
+ if (err == 0) {
+ struct tc_u_knode __rcu **ins;
+ struct tc_u_knode *pins;
+
+ err = u32_replace_hw_knode(tp, n, flags, extack);
+ if (err)
+- goto errhw;
++ goto errunbind;
+
+ if (!tc_in_hw(n->flags))
+ n->flags |= TCA_CLS_FLAGS_NOT_IN_HW;
+@@ -1100,7 +1124,9 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
+ return 0;
+ }
+
+-errhw:
++errunbind:
++ u32_unbind_filter(tp, n, tb);
++
+ #ifdef CONFIG_CLS_U32_MARK
+ free_percpu(n->pcpu_success);
+ #endif
+diff --git a/net/wireless/wext-core.c b/net/wireless/wext-core.c
+index a125fd1fa1342..a161c64d1765e 100644
+--- a/net/wireless/wext-core.c
++++ b/net/wireless/wext-core.c
+@@ -815,6 +815,12 @@ static int ioctl_standard_iw_point(struct iw_point *iwp, unsigned int cmd,
+ }
+ }
+
++ /* Sanity-check to ensure we never end up _allocating_ zero
++ * bytes of data for extra.
++ */
++ if (extra_size <= 0)
++ return -EFAULT;
++
+ /* kzalloc() ensures NULL-termination for essid_compat. */
+ extra = kzalloc(extra_size, GFP_KERNEL);
+ if (!extra)
+diff --git a/scripts/Makefile.build b/scripts/Makefile.build
+index 9f94fc83f0865..5f4a0228543ae 100644
+--- a/scripts/Makefile.build
++++ b/scripts/Makefile.build
+@@ -279,6 +279,9 @@ $(obj)/%.lst: $(src)/%.c FORCE
+
+ rust_allowed_features := core_ffi_c,explicit_generic_args_with_impl_trait,new_uninit,pin_macro
+
++# `--out-dir` is required to avoid temporaries being created by `rustc` in the
++# current working directory, which may be not accessible in the out-of-tree
++# modules case.
+ rust_common_cmd = \
+ RUST_MODFILE=$(modfile) $(RUSTC_OR_CLIPPY) $(rust_flags) \
+ -Zallow-features=$(rust_allowed_features) \
+@@ -287,7 +290,7 @@ rust_common_cmd = \
+ --extern alloc --extern kernel \
+ --crate-type rlib -L $(objtree)/rust/ \
+ --crate-name $(basename $(notdir $@)) \
+- --emit=dep-info=$(depfile)
++ --out-dir $(dir $@) --emit=dep-info=$(depfile)
+
+ # `--emit=obj`, `--emit=asm` and `--emit=llvm-ir` imply a single codegen unit
+ # will be used. We explicitly request `-Ccodegen-units=1` in any case, and
+diff --git a/scripts/Makefile.host b/scripts/Makefile.host
+index 7aea9005e4970..8f7f842b54f9e 100644
+--- a/scripts/Makefile.host
++++ b/scripts/Makefile.host
+@@ -86,7 +86,11 @@ hostc_flags = -Wp,-MMD,$(depfile) \
+ hostcxx_flags = -Wp,-MMD,$(depfile) \
+ $(KBUILD_HOSTCXXFLAGS) $(HOST_EXTRACXXFLAGS) \
+ $(HOSTCXXFLAGS_$(target-stem).o)
+-hostrust_flags = --emit=dep-info=$(depfile) \
++
++# `--out-dir` is required to avoid temporaries being created by `rustc` in the
++# current working directory, which may be not accessible in the out-of-tree
++# modules case.
++hostrust_flags = --out-dir $(dir $@) --emit=dep-info=$(depfile) \
+ $(KBUILD_HOSTRUSTFLAGS) $(HOST_EXTRARUSTFLAGS) \
+ $(HOSTRUSTFLAGS_$(target-stem))
+
+diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
+index 0d2db41177b23..13af6d0ff845d 100644
+--- a/scripts/kallsyms.c
++++ b/scripts/kallsyms.c
+@@ -346,10 +346,10 @@ static void cleanup_symbol_name(char *s)
+ * ASCII[_] = 5f
+ * ASCII[a-z] = 61,7a
+ *
+- * As above, replacing '.' with '\0' does not affect the main sorting,
+- * but it helps us with subsorting.
++ * As above, replacing the first '.' in ".llvm." with '\0' does not
++ * affect the main sorting, but it helps us with subsorting.
+ */
+- p = strchr(s, '.');
++ p = strstr(s, ".llvm.");
+ if (p)
+ *p = '\0';
+ }
+diff --git a/security/keys/request_key.c b/security/keys/request_key.c
+index 07a0ef2baacd8..a7673ad86d18d 100644
+--- a/security/keys/request_key.c
++++ b/security/keys/request_key.c
+@@ -401,17 +401,21 @@ static int construct_alloc_key(struct keyring_search_context *ctx,
+ set_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags);
+
+ if (dest_keyring) {
+- ret = __key_link_lock(dest_keyring, &ctx->index_key);
++ ret = __key_link_lock(dest_keyring, &key->index_key);
+ if (ret < 0)
+ goto link_lock_failed;
+- ret = __key_link_begin(dest_keyring, &ctx->index_key, &edit);
+- if (ret < 0)
+- goto link_prealloc_failed;
+ }
+
+- /* attach the key to the destination keyring under lock, but we do need
++ /*
++ * Attach the key to the destination keyring under lock, but we do need
+ * to do another check just in case someone beat us to it whilst we
+- * waited for locks */
++ * waited for locks.
++ *
++ * The caller might specify a comparison function which looks for keys
++ * that do not exactly match but are still equivalent from the caller's
++ * perspective. The __key_link_begin() operation must be done only after
++ * an actual key is determined.
++ */
+ mutex_lock(&key_construction_mutex);
+
+ rcu_read_lock();
+@@ -420,12 +424,16 @@ static int construct_alloc_key(struct keyring_search_context *ctx,
+ if (!IS_ERR(key_ref))
+ goto key_already_present;
+
+- if (dest_keyring)
++ if (dest_keyring) {
++ ret = __key_link_begin(dest_keyring, &key->index_key, &edit);
++ if (ret < 0)
++ goto link_alloc_failed;
+ __key_link(dest_keyring, key, &edit);
++ }
+
+ mutex_unlock(&key_construction_mutex);
+ if (dest_keyring)
+- __key_link_end(dest_keyring, &ctx->index_key, edit);
++ __key_link_end(dest_keyring, &key->index_key, edit);
+ mutex_unlock(&user->cons_lock);
+ *_key = key;
+ kleave(" = 0 [%d]", key_serial(key));
+@@ -438,10 +446,13 @@ key_already_present:
+ mutex_unlock(&key_construction_mutex);
+ key = key_ref_to_ptr(key_ref);
+ if (dest_keyring) {
++ ret = __key_link_begin(dest_keyring, &key->index_key, &edit);
++ if (ret < 0)
++ goto link_alloc_failed_unlocked;
+ ret = __key_link_check_live_key(dest_keyring, key);
+ if (ret == 0)
+ __key_link(dest_keyring, key, &edit);
+- __key_link_end(dest_keyring, &ctx->index_key, edit);
++ __key_link_end(dest_keyring, &key->index_key, edit);
+ if (ret < 0)
+ goto link_check_failed;
+ }
+@@ -456,8 +467,10 @@ link_check_failed:
+ kleave(" = %d [linkcheck]", ret);
+ return ret;
+
+-link_prealloc_failed:
+- __key_link_end(dest_keyring, &ctx->index_key, edit);
++link_alloc_failed:
++ mutex_unlock(&key_construction_mutex);
++link_alloc_failed_unlocked:
++ __key_link_end(dest_keyring, &key->index_key, edit);
+ link_lock_failed:
+ mutex_unlock(&user->cons_lock);
+ key_put(key);
+diff --git a/security/keys/trusted-keys/trusted_tpm2.c b/security/keys/trusted-keys/trusted_tpm2.c
+index 2b2c8eb258d5b..bc700f85f80be 100644
+--- a/security/keys/trusted-keys/trusted_tpm2.c
++++ b/security/keys/trusted-keys/trusted_tpm2.c
+@@ -186,7 +186,7 @@ int tpm2_key_priv(void *context, size_t hdrlen,
+ }
+
+ /**
+- * tpm_buf_append_auth() - append TPMS_AUTH_COMMAND to the buffer.
++ * tpm2_buf_append_auth() - append TPMS_AUTH_COMMAND to the buffer.
+ *
+ * @buf: an allocated tpm_buf instance
+ * @session_handle: session handle
+diff --git a/sound/pci/emu10k1/emufx.c b/sound/pci/emu10k1/emufx.c
+index 3f64ccab0e632..fba19a854f27a 100644
+--- a/sound/pci/emu10k1/emufx.c
++++ b/sound/pci/emu10k1/emufx.c
+@@ -1559,14 +1559,8 @@ A_OP(icode, &ptr, iMAC0, A_GPR(var), A_GPR(var), A_GPR(vol), A_EXTIN(input))
+ gpr += 2;
+
+ /* Master volume (will be renamed later) */
+- A_OP(icode, &ptr, iMAC0, A_GPR(playback+0+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+0+SND_EMU10K1_PLAYBACK_CHANNELS));
+- A_OP(icode, &ptr, iMAC0, A_GPR(playback+1+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+1+SND_EMU10K1_PLAYBACK_CHANNELS));
+- A_OP(icode, &ptr, iMAC0, A_GPR(playback+2+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+2+SND_EMU10K1_PLAYBACK_CHANNELS));
+- A_OP(icode, &ptr, iMAC0, A_GPR(playback+3+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+3+SND_EMU10K1_PLAYBACK_CHANNELS));
+- A_OP(icode, &ptr, iMAC0, A_GPR(playback+4+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+4+SND_EMU10K1_PLAYBACK_CHANNELS));
+- A_OP(icode, &ptr, iMAC0, A_GPR(playback+5+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+5+SND_EMU10K1_PLAYBACK_CHANNELS));
+- A_OP(icode, &ptr, iMAC0, A_GPR(playback+6+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+6+SND_EMU10K1_PLAYBACK_CHANNELS));
+- A_OP(icode, &ptr, iMAC0, A_GPR(playback+7+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+7+SND_EMU10K1_PLAYBACK_CHANNELS));
++ for (z = 0; z < 8; z++)
++ A_OP(icode, &ptr, iMAC0, A_GPR(playback+z+SND_EMU10K1_PLAYBACK_CHANNELS), A_C_00000000, A_GPR(gpr), A_GPR(playback+z+SND_EMU10K1_PLAYBACK_CHANNELS));
+ snd_emu10k1_init_mono_control(&controls[nctl++], "Wave Master Playback Volume", gpr, 0);
+ gpr += 2;
+
+@@ -1653,102 +1647,14 @@ A_OP(icode, &ptr, iMAC0, A_GPR(var), A_GPR(var), A_GPR(vol), A_EXTIN(input))
+ dev_dbg(emu->card->dev, "emufx.c: gpr=0x%x, tmp=0x%x\n",
+ gpr, tmp);
+ */
+- /* For the EMU1010: How to get 32bit values from the DSP. High 16bits into L, low 16bits into R. */
+- /* A_P16VIN(0) is delayed by one sample,
+- * so all other A_P16VIN channels will need to also be delayed
+- */
+- /* Left ADC in. 1 of 2 */
+ snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_P16VIN(0x0), A_FXBUS2(0) );
+- /* Right ADC in 1 of 2 */
+- gpr_map[gpr++] = 0x00000000;
+- /* Delaying by one sample: instead of copying the input
+- * value A_P16VIN to output A_FXBUS2 as in the first channel,
+- * we use an auxiliary register, delaying the value by one
+- * sample
+- */
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(2) );
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x1), A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(4) );
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x2), A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(6) );
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x3), A_C_00000000, A_C_00000000);
+- /* For 96kHz mode */
+- /* Left ADC in. 2 of 2 */
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(0x8) );
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x4), A_C_00000000, A_C_00000000);
+- /* Right ADC in 2 of 2 */
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(0xa) );
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x5), A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(0xc) );
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x6), A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr - 1), A_FXBUS2(0xe) );
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x7), A_C_00000000, A_C_00000000);
+- /* Pavel Hofman - we still have voices, A_FXBUS2s, and
+- * A_P16VINs available -
+- * let's add 8 more capture channels - total of 16
+- */
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp,
+- bit_shifter16,
+- A_GPR(gpr - 1),
+- A_FXBUS2(0x10));
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x8),
+- A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp,
+- bit_shifter16,
+- A_GPR(gpr - 1),
+- A_FXBUS2(0x12));
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0x9),
+- A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp,
+- bit_shifter16,
+- A_GPR(gpr - 1),
+- A_FXBUS2(0x14));
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xa),
+- A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp,
+- bit_shifter16,
+- A_GPR(gpr - 1),
+- A_FXBUS2(0x16));
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xb),
+- A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp,
+- bit_shifter16,
+- A_GPR(gpr - 1),
+- A_FXBUS2(0x18));
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xc),
+- A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp,
+- bit_shifter16,
+- A_GPR(gpr - 1),
+- A_FXBUS2(0x1a));
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xd),
+- A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp,
+- bit_shifter16,
+- A_GPR(gpr - 1),
+- A_FXBUS2(0x1c));
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xe),
+- A_C_00000000, A_C_00000000);
+- gpr_map[gpr++] = 0x00000000;
+- snd_emu10k1_audigy_dsp_convert_32_to_2x16(icode, &ptr, tmp,
+- bit_shifter16,
+- A_GPR(gpr - 1),
+- A_FXBUS2(0x1e));
+- A_OP(icode, &ptr, iACC3, A_GPR(gpr - 1), A_P16VIN(0xf),
+- A_C_00000000, A_C_00000000);
++ /* A_P16VIN(0) is delayed by one sample, so all other A_P16VIN channels
++ * will need to also be delayed; we use an auxiliary register for that. */
++ for (z = 1; z < 0x10; z++) {
++ snd_emu10k1_audigy_dsp_convert_32_to_2x16( icode, &ptr, tmp, bit_shifter16, A_GPR(gpr), A_FXBUS2(z * 2) );
++ A_OP(icode, &ptr, iACC3, A_GPR(gpr), A_P16VIN(z), A_C_00000000, A_C_00000000);
++ gpr_map[gpr++] = 0x00000000;
++ }
+ }
+
+ #if 0
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index f1b934a502169..169572c8ed40f 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -122,6 +122,7 @@ struct alc_spec {
+ unsigned int ultra_low_power:1;
+ unsigned int has_hs_key:1;
+ unsigned int no_internal_mic_pin:1;
++ unsigned int en_3kpull_low:1;
+
+ /* for PLL fix */
+ hda_nid_t pll_nid;
+@@ -3622,6 +3623,7 @@ static void alc256_shutup(struct hda_codec *codec)
+ if (!hp_pin)
+ hp_pin = 0x21;
+
++ alc_update_coefex_idx(codec, 0x57, 0x04, 0x0007, 0x1); /* Low power */
+ hp_pin_sense = snd_hda_jack_detect(codec, hp_pin);
+
+ if (hp_pin_sense)
+@@ -3638,8 +3640,7 @@ static void alc256_shutup(struct hda_codec *codec)
+ /* If disable 3k pulldown control for alc257, the Mic detection will not work correctly
+ * when booting with headset plugged. So skip setting it for the codec alc257
+ */
+- if (codec->core.vendor_id != 0x10ec0236 &&
+- codec->core.vendor_id != 0x10ec0257)
++ if (spec->en_3kpull_low)
+ alc_update_coef_idx(codec, 0x46, 0, 3 << 12);
+
+ if (!spec->no_shutup_pins)
+@@ -4623,6 +4624,21 @@ static void alc236_fixup_hp_mute_led_coefbit(struct hda_codec *codec,
+ }
+ }
+
++static void alc236_fixup_hp_mute_led_coefbit2(struct hda_codec *codec,
++ const struct hda_fixup *fix, int action)
++{
++ struct alc_spec *spec = codec->spec;
++
++ if (action == HDA_FIXUP_ACT_PRE_PROBE) {
++ spec->mute_led_polarity = 0;
++ spec->mute_led_coef.idx = 0x07;
++ spec->mute_led_coef.mask = 1;
++ spec->mute_led_coef.on = 1;
++ spec->mute_led_coef.off = 0;
++ snd_hda_gen_add_mute_led_cdev(codec, coef_mute_led_set);
++ }
++}
++
+ /* turn on/off mic-mute LED per capture hook by coef bit */
+ static int coef_micmute_led_set(struct led_classdev *led_cdev,
+ enum led_brightness brightness)
+@@ -7120,6 +7136,10 @@ enum {
+ ALC294_FIXUP_ASUS_DUAL_SPK,
+ ALC285_FIXUP_THINKPAD_X1_GEN7,
+ ALC285_FIXUP_THINKPAD_HEADSET_JACK,
++ ALC294_FIXUP_ASUS_ALLY,
++ ALC294_FIXUP_ASUS_ALLY_PINS,
++ ALC294_FIXUP_ASUS_ALLY_VERBS,
++ ALC294_FIXUP_ASUS_ALLY_SPEAKER,
+ ALC294_FIXUP_ASUS_HPE,
+ ALC294_FIXUP_ASUS_COEF_1B,
+ ALC294_FIXUP_ASUS_GX502_HP,
+@@ -7133,6 +7153,7 @@ enum {
+ ALC285_FIXUP_HP_GPIO_LED,
+ ALC285_FIXUP_HP_MUTE_LED,
+ ALC285_FIXUP_HP_SPECTRE_X360_MUTE_LED,
++ ALC236_FIXUP_HP_MUTE_LED_COEFBIT2,
+ ALC236_FIXUP_HP_GPIO_LED,
+ ALC236_FIXUP_HP_MUTE_LED,
+ ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF,
+@@ -7203,6 +7224,7 @@ enum {
+ ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN,
+ ALC295_FIXUP_DELL_INSPIRON_TOP_SPEAKERS,
+ ALC236_FIXUP_DELL_DUAL_CODECS,
++ ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI,
+ };
+
+ /* A special fixup for Lenovo C940 and Yoga Duet 7;
+@@ -8432,6 +8454,47 @@ static const struct hda_fixup alc269_fixups[] = {
+ .chained = true,
+ .chain_id = ALC294_FIXUP_SPK2_TO_DAC1
+ },
++ [ALC294_FIXUP_ASUS_ALLY] = {
++ .type = HDA_FIXUP_FUNC,
++ .v.func = cs35l41_fixup_i2c_two,
++ .chained = true,
++ .chain_id = ALC294_FIXUP_ASUS_ALLY_PINS
++ },
++ [ALC294_FIXUP_ASUS_ALLY_PINS] = {
++ .type = HDA_FIXUP_PINS,
++ .v.pins = (const struct hda_pintbl[]) {
++ { 0x19, 0x03a11050 },
++ { 0x1a, 0x03a11c30 },
++ { 0x21, 0x03211420 },
++ { }
++ },
++ .chained = true,
++ .chain_id = ALC294_FIXUP_ASUS_ALLY_VERBS
++ },
++ [ALC294_FIXUP_ASUS_ALLY_VERBS] = {
++ .type = HDA_FIXUP_VERBS,
++ .v.verbs = (const struct hda_verb[]) {
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x45 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x5089 },
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x46 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0004 },
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x47 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0xa47a },
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x49 },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x0049},
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x4a },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x201b },
++ { 0x20, AC_VERB_SET_COEF_INDEX, 0x6b },
++ { 0x20, AC_VERB_SET_PROC_COEF, 0x4278},
++ { }
++ },
++ .chained = true,
++ .chain_id = ALC294_FIXUP_ASUS_ALLY_SPEAKER
++ },
++ [ALC294_FIXUP_ASUS_ALLY_SPEAKER] = {
++ .type = HDA_FIXUP_FUNC,
++ .v.func = alc285_fixup_speaker2_to_dac1,
++ },
+ [ALC285_FIXUP_THINKPAD_X1_GEN7] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc285_fixup_thinkpad_x1_gen7,
+@@ -8556,6 +8619,10 @@ static const struct hda_fixup alc269_fixups[] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc285_fixup_hp_spectre_x360_mute_led,
+ },
++ [ALC236_FIXUP_HP_MUTE_LED_COEFBIT2] = {
++ .type = HDA_FIXUP_FUNC,
++ .v.func = alc236_fixup_hp_mute_led_coefbit2,
++ },
+ [ALC236_FIXUP_HP_GPIO_LED] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc236_fixup_hp_gpio_led,
+@@ -9069,8 +9136,6 @@ static const struct hda_fixup alc269_fixups[] = {
+ [ALC287_FIXUP_CS35L41_I2C_2] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = cs35l41_fixup_i2c_two,
+- .chained = true,
+- .chain_id = ALC269_FIXUP_THINKPAD_ACPI,
+ },
+ [ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED] = {
+ .type = HDA_FIXUP_FUNC,
+@@ -9207,6 +9272,12 @@ static const struct hda_fixup alc269_fixups[] = {
+ .chained = true,
+ .chain_id = ALC255_FIXUP_DELL1_MIC_NO_PRESENCE,
+ },
++ [ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI] = {
++ .type = HDA_FIXUP_FUNC,
++ .v.func = cs35l41_fixup_i2c_two,
++ .chained = true,
++ .chain_id = ALC269_FIXUP_THINKPAD_ACPI,
++ },
+ };
+
+ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+@@ -9440,6 +9511,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x886d, "HP ZBook Fury 17.3 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8870, "HP ZBook Fury 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8873, "HP ZBook Studio 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
++ SND_PCI_QUIRK(0x103c, 0x887a, "HP Laptop 15s-eq2xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
+ SND_PCI_QUIRK(0x103c, 0x888d, "HP ZBook Power 15.6 inch G8 Mobile Workstation PC", ALC236_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8895, "HP EliteBook 855 G8 Notebook PC", ALC285_FIXUP_HP_SPEAKERS_MICMUTE_LED),
+ SND_PCI_QUIRK(0x103c, 0x8896, "HP EliteBook 855 G8 Notebook PC", ALC285_FIXUP_HP_MUTE_LED),
+@@ -9535,6 +9607,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1043, 0x16e3, "ASUS UX50", ALC269_FIXUP_STEREO_DMIC),
+ SND_PCI_QUIRK(0x1043, 0x1740, "ASUS UX430UA", ALC295_FIXUP_ASUS_DACS),
+ SND_PCI_QUIRK(0x1043, 0x17d1, "ASUS UX431FL", ALC294_FIXUP_ASUS_DUAL_SPK),
++ SND_PCI_QUIRK(0x1043, 0x17f3, "ROG Ally RC71L_RC71L", ALC294_FIXUP_ASUS_ALLY),
+ SND_PCI_QUIRK(0x1043, 0x1881, "ASUS Zephyrus S/M", ALC294_FIXUP_ASUS_GX502_PINS),
+ SND_PCI_QUIRK(0x1043, 0x18b1, "Asus MJ401TA", ALC256_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x18f1, "Asus FX505DT", ALC256_FIXUP_ASUS_HEADSET_MIC),
+@@ -9646,6 +9719,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1558, 0x5157, "Clevo W517GU1", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x51a1, "Clevo NS50MU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x51b1, "Clevo NS50AU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
++ SND_PCI_QUIRK(0x1558, 0x51b3, "Clevo NS70AU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x5630, "Clevo NP50RNJS", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x70a1, "Clevo NB70T[HJK]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x70b3, "Clevo NK70SB", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+@@ -9729,14 +9803,14 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x17aa, 0x22be, "Thinkpad X1 Carbon 8th", ALC285_FIXUP_THINKPAD_HEADSET_JACK),
+ SND_PCI_QUIRK(0x17aa, 0x22c1, "Thinkpad P1 Gen 3", ALC285_FIXUP_THINKPAD_NO_BASS_SPK_HEADSET_JACK),
+ SND_PCI_QUIRK(0x17aa, 0x22c2, "Thinkpad X1 Extreme Gen 3", ALC285_FIXUP_THINKPAD_NO_BASS_SPK_HEADSET_JACK),
+- SND_PCI_QUIRK(0x17aa, 0x22f1, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x17aa, 0x22f2, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x17aa, 0x22f3, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x17aa, 0x2316, "Thinkpad P1 Gen 6", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x17aa, 0x2317, "Thinkpad P1 Gen 6", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x17aa, 0x2318, "Thinkpad Z13 Gen2", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x17aa, 0x2319, "Thinkpad Z16 Gen2", ALC287_FIXUP_CS35L41_I2C_2),
+- SND_PCI_QUIRK(0x17aa, 0x231a, "Thinkpad Z16 Gen2", ALC287_FIXUP_CS35L41_I2C_2),
++ SND_PCI_QUIRK(0x17aa, 0x22f1, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI),
++ SND_PCI_QUIRK(0x17aa, 0x22f2, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI),
++ SND_PCI_QUIRK(0x17aa, 0x22f3, "Thinkpad", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI),
++ SND_PCI_QUIRK(0x17aa, 0x2316, "Thinkpad P1 Gen 6", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI),
++ SND_PCI_QUIRK(0x17aa, 0x2317, "Thinkpad P1 Gen 6", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI),
++ SND_PCI_QUIRK(0x17aa, 0x2318, "Thinkpad Z13 Gen2", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI),
++ SND_PCI_QUIRK(0x17aa, 0x2319, "Thinkpad Z16 Gen2", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI),
++ SND_PCI_QUIRK(0x17aa, 0x231a, "Thinkpad Z16 Gen2", ALC287_FIXUP_CS35L41_I2C_2_THINKPAD_ACPI),
+ SND_PCI_QUIRK(0x17aa, 0x30bb, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY),
+ SND_PCI_QUIRK(0x17aa, 0x30e2, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY),
+ SND_PCI_QUIRK(0x17aa, 0x310c, "ThinkCentre Station", ALC294_FIXUP_LENOVO_MIC_LOCATION),
+@@ -10601,6 +10675,8 @@ static int patch_alc269(struct hda_codec *codec)
+ spec->shutup = alc256_shutup;
+ spec->init_hook = alc256_init;
+ spec->gen.mixer_nid = 0; /* ALC256 does not have any loopback mixer path */
++ if (codec->bus->pci->vendor == PCI_VENDOR_ID_AMD)
++ spec->en_3kpull_low = true;
+ break;
+ case 0x10ec0257:
+ spec->codec_variant = ALC269_TYPE_ALC257;
+diff --git a/sound/soc/amd/acp/amd.h b/sound/soc/amd/acp/amd.h
+index 5f2119f422715..12a176a50fd6e 100644
+--- a/sound/soc/amd/acp/amd.h
++++ b/sound/soc/amd/acp/amd.h
+@@ -173,7 +173,7 @@ int snd_amd_acp_find_config(struct pci_dev *pci);
+
+ static inline u64 acp_get_byte_count(struct acp_dev_data *adata, int dai_id, int direction)
+ {
+- u64 byte_count, low = 0, high = 0;
++ u64 byte_count = 0, low = 0, high = 0;
+
+ if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
+ switch (dai_id) {
+@@ -191,7 +191,7 @@ static inline u64 acp_get_byte_count(struct acp_dev_data *adata, int dai_id, int
+ break;
+ default:
+ dev_err(adata->dev, "Invalid dai id %x\n", dai_id);
+- return -EINVAL;
++ goto POINTER_RETURN_BYTES;
+ }
+ } else {
+ switch (dai_id) {
+@@ -213,12 +213,13 @@ static inline u64 acp_get_byte_count(struct acp_dev_data *adata, int dai_id, int
+ break;
+ default:
+ dev_err(adata->dev, "Invalid dai id %x\n", dai_id);
+- return -EINVAL;
++ goto POINTER_RETURN_BYTES;
+ }
+ }
+ /* Get 64 bit value from two 32 bit registers */
+ byte_count = (high << 32) | low;
+
++POINTER_RETURN_BYTES:
+ return byte_count;
+ }
+
+diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig
+index 8020097d4e4c8..1b50b2d66beb2 100644
+--- a/sound/soc/codecs/Kconfig
++++ b/sound/soc/codecs/Kconfig
+@@ -701,6 +701,7 @@ config SND_SOC_CS35L41_I2C
+
+ config SND_SOC_CS35L45
+ tristate
++ select REGMAP_IRQ
+
+ config SND_SOC_CS35L45_SPI
+ tristate "Cirrus Logic CS35L45 CODEC (SPI)"
+diff --git a/sound/soc/codecs/cs42l51-i2c.c b/sound/soc/codecs/cs42l51-i2c.c
+index 85238339fbcab..b2085ff4b3226 100644
+--- a/sound/soc/codecs/cs42l51-i2c.c
++++ b/sound/soc/codecs/cs42l51-i2c.c
+@@ -19,6 +19,12 @@ static struct i2c_device_id cs42l51_i2c_id[] = {
+ };
+ MODULE_DEVICE_TABLE(i2c, cs42l51_i2c_id);
+
++const struct of_device_id cs42l51_of_match[] = {
++ { .compatible = "cirrus,cs42l51", },
++ { }
++};
++MODULE_DEVICE_TABLE(of, cs42l51_of_match);
++
+ static int cs42l51_i2c_probe(struct i2c_client *i2c)
+ {
+ struct regmap_config config;
+diff --git a/sound/soc/codecs/cs42l51.c b/sound/soc/codecs/cs42l51.c
+index e88d9ff95cdfc..4b832d52f643f 100644
+--- a/sound/soc/codecs/cs42l51.c
++++ b/sound/soc/codecs/cs42l51.c
+@@ -826,13 +826,6 @@ int __maybe_unused cs42l51_resume(struct device *dev)
+ }
+ EXPORT_SYMBOL_GPL(cs42l51_resume);
+
+-const struct of_device_id cs42l51_of_match[] = {
+- { .compatible = "cirrus,cs42l51", },
+- { }
+-};
+-MODULE_DEVICE_TABLE(of, cs42l51_of_match);
+-EXPORT_SYMBOL_GPL(cs42l51_of_match);
+-
+ MODULE_AUTHOR("Arnaud Patard <arnaud.patard@rtp-net.org>");
+ MODULE_DESCRIPTION("Cirrus Logic CS42L51 ALSA SoC Codec Driver");
+ MODULE_LICENSE("GPL");
+diff --git a/sound/soc/codecs/cs42l51.h b/sound/soc/codecs/cs42l51.h
+index a79343e8a54ea..125703ede1133 100644
+--- a/sound/soc/codecs/cs42l51.h
++++ b/sound/soc/codecs/cs42l51.h
+@@ -16,7 +16,6 @@ int cs42l51_probe(struct device *dev, struct regmap *regmap);
+ void cs42l51_remove(struct device *dev);
+ int __maybe_unused cs42l51_suspend(struct device *dev);
+ int __maybe_unused cs42l51_resume(struct device *dev);
+-extern const struct of_device_id cs42l51_of_match[];
+
+ #define CS42L51_CHIP_ID 0x1B
+ #define CS42L51_CHIP_REV_A 0x00
+diff --git a/sound/soc/codecs/rt5640.c b/sound/soc/codecs/rt5640.c
+index 1392570555070..31578ea712a99 100644
+--- a/sound/soc/codecs/rt5640.c
++++ b/sound/soc/codecs/rt5640.c
+@@ -2567,9 +2567,10 @@ static void rt5640_enable_jack_detect(struct snd_soc_component *component,
+ if (jack_data && jack_data->use_platform_clock)
+ rt5640->use_platform_clock = jack_data->use_platform_clock;
+
+- ret = request_irq(rt5640->irq, rt5640_irq,
+- IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
+- "rt5640", rt5640);
++ ret = devm_request_threaded_irq(component->dev, rt5640->irq,
++ NULL, rt5640_irq,
++ IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
++ "rt5640", rt5640);
+ if (ret) {
+ dev_warn(component->dev, "Failed to reguest IRQ %d: %d\n", rt5640->irq, ret);
+ rt5640_disable_jack_detect(component);
+@@ -2622,8 +2623,9 @@ static void rt5640_enable_hda_jack_detect(
+
+ rt5640->jack = jack;
+
+- ret = request_irq(rt5640->irq, rt5640_irq,
+- IRQF_TRIGGER_RISING | IRQF_ONESHOT, "rt5640", rt5640);
++ ret = devm_request_threaded_irq(component->dev, rt5640->irq,
++ NULL, rt5640_irq, IRQF_TRIGGER_RISING | IRQF_ONESHOT,
++ "rt5640", rt5640);
+ if (ret) {
+ dev_warn(component->dev, "Failed to reguest IRQ %d: %d\n", rt5640->irq, ret);
+ rt5640->irq = -ENXIO;
+diff --git a/sound/soc/codecs/wcd-mbhc-v2.c b/sound/soc/codecs/wcd-mbhc-v2.c
+index 1911750f7445c..5da1934527f34 100644
+--- a/sound/soc/codecs/wcd-mbhc-v2.c
++++ b/sound/soc/codecs/wcd-mbhc-v2.c
+@@ -1454,7 +1454,7 @@ struct wcd_mbhc *wcd_mbhc_init(struct snd_soc_component *component,
+ return ERR_PTR(-EINVAL);
+ }
+
+- mbhc = devm_kzalloc(dev, sizeof(*mbhc), GFP_KERNEL);
++ mbhc = kzalloc(sizeof(*mbhc), GFP_KERNEL);
+ if (!mbhc)
+ return ERR_PTR(-ENOMEM);
+
+@@ -1474,61 +1474,76 @@ struct wcd_mbhc *wcd_mbhc_init(struct snd_soc_component *component,
+
+ INIT_WORK(&mbhc->correct_plug_swch, wcd_correct_swch_plug);
+
+- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_sw_intr, NULL,
++ ret = request_threaded_irq(mbhc->intr_ids->mbhc_sw_intr, NULL,
+ wcd_mbhc_mech_plug_detect_irq,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "mbhc sw intr", mbhc);
+ if (ret)
+- goto err;
++ goto err_free_mbhc;
+
+- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_btn_press_intr, NULL,
++ ret = request_threaded_irq(mbhc->intr_ids->mbhc_btn_press_intr, NULL,
+ wcd_mbhc_btn_press_handler,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "Button Press detect", mbhc);
+ if (ret)
+- goto err;
++ goto err_free_sw_intr;
+
+- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_btn_release_intr, NULL,
++ ret = request_threaded_irq(mbhc->intr_ids->mbhc_btn_release_intr, NULL,
+ wcd_mbhc_btn_release_handler,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "Button Release detect", mbhc);
+ if (ret)
+- goto err;
++ goto err_free_btn_press_intr;
+
+- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_hs_ins_intr, NULL,
++ ret = request_threaded_irq(mbhc->intr_ids->mbhc_hs_ins_intr, NULL,
+ wcd_mbhc_adc_hs_ins_irq,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "Elect Insert", mbhc);
+ if (ret)
+- goto err;
++ goto err_free_btn_release_intr;
+
+ disable_irq_nosync(mbhc->intr_ids->mbhc_hs_ins_intr);
+
+- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->mbhc_hs_rem_intr, NULL,
++ ret = request_threaded_irq(mbhc->intr_ids->mbhc_hs_rem_intr, NULL,
+ wcd_mbhc_adc_hs_rem_irq,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "Elect Remove", mbhc);
+ if (ret)
+- goto err;
++ goto err_free_hs_ins_intr;
+
+ disable_irq_nosync(mbhc->intr_ids->mbhc_hs_rem_intr);
+
+- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->hph_left_ocp, NULL,
++ ret = request_threaded_irq(mbhc->intr_ids->hph_left_ocp, NULL,
+ wcd_mbhc_hphl_ocp_irq,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "HPH_L OCP detect", mbhc);
+ if (ret)
+- goto err;
++ goto err_free_hs_rem_intr;
+
+- ret = devm_request_threaded_irq(dev, mbhc->intr_ids->hph_right_ocp, NULL,
++ ret = request_threaded_irq(mbhc->intr_ids->hph_right_ocp, NULL,
+ wcd_mbhc_hphr_ocp_irq,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "HPH_R OCP detect", mbhc);
+ if (ret)
+- goto err;
++ goto err_free_hph_left_ocp;
+
+ return mbhc;
+-err:
++
++err_free_hph_left_ocp:
++ free_irq(mbhc->intr_ids->hph_left_ocp, mbhc);
++err_free_hs_rem_intr:
++ free_irq(mbhc->intr_ids->mbhc_hs_rem_intr, mbhc);
++err_free_hs_ins_intr:
++ free_irq(mbhc->intr_ids->mbhc_hs_ins_intr, mbhc);
++err_free_btn_release_intr:
++ free_irq(mbhc->intr_ids->mbhc_btn_release_intr, mbhc);
++err_free_btn_press_intr:
++ free_irq(mbhc->intr_ids->mbhc_btn_press_intr, mbhc);
++err_free_sw_intr:
++ free_irq(mbhc->intr_ids->mbhc_sw_intr, mbhc);
++err_free_mbhc:
++ kfree(mbhc);
++
+ dev_err(dev, "Failed to request mbhc interrupts %d\n", ret);
+
+ return ERR_PTR(ret);
+@@ -1537,9 +1552,19 @@ EXPORT_SYMBOL(wcd_mbhc_init);
+
+ void wcd_mbhc_deinit(struct wcd_mbhc *mbhc)
+ {
++ free_irq(mbhc->intr_ids->hph_right_ocp, mbhc);
++ free_irq(mbhc->intr_ids->hph_left_ocp, mbhc);
++ free_irq(mbhc->intr_ids->mbhc_hs_rem_intr, mbhc);
++ free_irq(mbhc->intr_ids->mbhc_hs_ins_intr, mbhc);
++ free_irq(mbhc->intr_ids->mbhc_btn_release_intr, mbhc);
++ free_irq(mbhc->intr_ids->mbhc_btn_press_intr, mbhc);
++ free_irq(mbhc->intr_ids->mbhc_sw_intr, mbhc);
++
+ mutex_lock(&mbhc->lock);
+ wcd_cancel_hs_detect_plug(mbhc, &mbhc->correct_plug_swch);
+ mutex_unlock(&mbhc->lock);
++
++ kfree(mbhc);
+ }
+ EXPORT_SYMBOL(wcd_mbhc_deinit);
+
+diff --git a/sound/soc/codecs/wcd934x.c b/sound/soc/codecs/wcd934x.c
+index c0d1fa36d8411..e467cbe12d8a9 100644
+--- a/sound/soc/codecs/wcd934x.c
++++ b/sound/soc/codecs/wcd934x.c
+@@ -3044,6 +3044,17 @@ static int wcd934x_mbhc_init(struct snd_soc_component *component)
+
+ return 0;
+ }
++
++static void wcd934x_mbhc_deinit(struct snd_soc_component *component)
++{
++ struct wcd934x_codec *wcd = snd_soc_component_get_drvdata(component);
++
++ if (!wcd->mbhc)
++ return;
++
++ wcd_mbhc_deinit(wcd->mbhc);
++}
++
+ static int wcd934x_comp_probe(struct snd_soc_component *component)
+ {
+ struct wcd934x_codec *wcd = dev_get_drvdata(component->dev);
+@@ -3077,6 +3088,7 @@ static void wcd934x_comp_remove(struct snd_soc_component *comp)
+ {
+ struct wcd934x_codec *wcd = dev_get_drvdata(comp->dev);
+
++ wcd934x_mbhc_deinit(comp);
+ wcd_clsh_ctrl_free(wcd->clsh_ctrl);
+ }
+
+diff --git a/sound/soc/codecs/wcd938x.c b/sound/soc/codecs/wcd938x.c
+index e7d6a02cdec0d..4a0b990f56e12 100644
+--- a/sound/soc/codecs/wcd938x.c
++++ b/sound/soc/codecs/wcd938x.c
+@@ -210,7 +210,7 @@ struct wcd938x_priv {
+ };
+
+ static const SNDRV_CTL_TLVD_DECLARE_DB_MINMAX(ear_pa_gain, 600, -1800);
+-static const SNDRV_CTL_TLVD_DECLARE_DB_MINMAX(line_gain, 600, -3000);
++static const DECLARE_TLV_DB_SCALE(line_gain, -3000, 150, -3000);
+ static const SNDRV_CTL_TLVD_DECLARE_DB_MINMAX(analog_gain, 0, 3000);
+
+ struct wcd938x_mbhc_zdet_param {
+@@ -2165,8 +2165,8 @@ static inline void wcd938x_mbhc_get_result_params(struct wcd938x_priv *wcd938x,
+ else if (x1 < minCode_param[noff])
+ *zdet = WCD938X_ZDET_FLOATING_IMPEDANCE;
+
+- pr_err("%s: d1=%d, c1=%d, x1=0x%x, z_val=%d(milliOhm)\n",
+- __func__, d1, c1, x1, *zdet);
++ pr_debug("%s: d1=%d, c1=%d, x1=0x%x, z_val=%d (milliohm)\n",
++ __func__, d1, c1, x1, *zdet);
+ ramp_down:
+ i = 0;
+ while (x1) {
+@@ -2625,6 +2625,8 @@ static int wcd938x_mbhc_init(struct snd_soc_component *component)
+ WCD938X_IRQ_HPHR_OCP_INT);
+
+ wcd938x->wcd_mbhc = wcd_mbhc_init(component, &mbhc_cb, intr_ids, wcd_mbhc_fields, true);
++ if (IS_ERR(wcd938x->wcd_mbhc))
++ return PTR_ERR(wcd938x->wcd_mbhc);
+
+ snd_soc_add_component_controls(component, impedance_detect_controls,
+ ARRAY_SIZE(impedance_detect_controls));
+@@ -2633,6 +2635,14 @@ static int wcd938x_mbhc_init(struct snd_soc_component *component)
+
+ return 0;
+ }
++
++static void wcd938x_mbhc_deinit(struct snd_soc_component *component)
++{
++ struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component);
++
++ wcd_mbhc_deinit(wcd938x->wcd_mbhc);
++}
++
+ /* END MBHC */
+
+ static const struct snd_kcontrol_new wcd938x_snd_controls[] = {
+@@ -2652,8 +2662,8 @@ static const struct snd_kcontrol_new wcd938x_snd_controls[] = {
+ wcd938x_get_swr_port, wcd938x_set_swr_port),
+ SOC_SINGLE_EXT("DSD_R Switch", WCD938X_DSD_R, 0, 1, 0,
+ wcd938x_get_swr_port, wcd938x_set_swr_port),
+- SOC_SINGLE_TLV("HPHL Volume", WCD938X_HPH_L_EN, 0, 0x18, 0, line_gain),
+- SOC_SINGLE_TLV("HPHR Volume", WCD938X_HPH_R_EN, 0, 0x18, 0, line_gain),
++ SOC_SINGLE_TLV("HPHL Volume", WCD938X_HPH_L_EN, 0, 0x18, 1, line_gain),
++ SOC_SINGLE_TLV("HPHR Volume", WCD938X_HPH_R_EN, 0, 0x18, 1, line_gain),
+ WCD938X_EAR_PA_GAIN_TLV("EAR_PA Volume", WCD938X_ANA_EAR_COMPANDER_CTL,
+ 2, 0x10, 0, ear_pa_gain),
+ SOC_SINGLE_EXT("ADC1 Switch", WCD938X_ADC1, 1, 1, 0,
+@@ -3080,16 +3090,33 @@ static int wcd938x_irq_init(struct wcd938x_priv *wcd, struct device *dev)
+ static int wcd938x_soc_codec_probe(struct snd_soc_component *component)
+ {
+ struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component);
++ struct sdw_slave *tx_sdw_dev = wcd938x->tx_sdw_dev;
+ struct device *dev = component->dev;
++ unsigned long time_left;
+ int ret, i;
+
++ time_left = wait_for_completion_timeout(&tx_sdw_dev->initialization_complete,
++ msecs_to_jiffies(2000));
++ if (!time_left) {
++ dev_err(dev, "soundwire device init timeout\n");
++ return -ETIMEDOUT;
++ }
++
+ snd_soc_component_init_regmap(component, wcd938x->regmap);
+
++ ret = pm_runtime_resume_and_get(dev);
++ if (ret < 0)
++ return ret;
++
+ wcd938x->variant = snd_soc_component_read_field(component,
+ WCD938X_DIGITAL_EFUSE_REG_0,
+ WCD938X_ID_MASK);
+
+ wcd938x->clsh_info = wcd_clsh_ctrl_alloc(component, WCD938X);
++ if (IS_ERR(wcd938x->clsh_info)) {
++ pm_runtime_put(dev);
++ return PTR_ERR(wcd938x->clsh_info);
++ }
+
+ wcd938x_io_init(wcd938x);
+ /* Set all interrupts as edge triggered */
+@@ -3098,6 +3125,8 @@ static int wcd938x_soc_codec_probe(struct snd_soc_component *component)
+ (WCD938X_DIGITAL_INTR_LEVEL_0 + i), 0);
+ }
+
++ pm_runtime_put(dev);
++
+ wcd938x->hphr_pdm_wd_int = regmap_irq_get_virq(wcd938x->irq_chip,
+ WCD938X_IRQ_HPHR_PDM_WD_INT);
+ wcd938x->hphl_pdm_wd_int = regmap_irq_get_virq(wcd938x->irq_chip,
+@@ -3109,20 +3138,26 @@ static int wcd938x_soc_codec_probe(struct snd_soc_component *component)
+ ret = request_threaded_irq(wcd938x->hphr_pdm_wd_int, NULL, wcd938x_wd_handle_irq,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "HPHR PDM WD INT", wcd938x);
+- if (ret)
++ if (ret) {
+ dev_err(dev, "Failed to request HPHR WD interrupt (%d)\n", ret);
++ goto err_free_clsh_ctrl;
++ }
+
+ ret = request_threaded_irq(wcd938x->hphl_pdm_wd_int, NULL, wcd938x_wd_handle_irq,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "HPHL PDM WD INT", wcd938x);
+- if (ret)
++ if (ret) {
+ dev_err(dev, "Failed to request HPHL WD interrupt (%d)\n", ret);
++ goto err_free_hphr_pdm_wd_int;
++ }
+
+ ret = request_threaded_irq(wcd938x->aux_pdm_wd_int, NULL, wcd938x_wd_handle_irq,
+ IRQF_ONESHOT | IRQF_TRIGGER_RISING,
+ "AUX PDM WD INT", wcd938x);
+- if (ret)
++ if (ret) {
+ dev_err(dev, "Failed to request Aux WD interrupt (%d)\n", ret);
++ goto err_free_hphl_pdm_wd_int;
++ }
+
+ /* Disable watchdog interrupt for HPH and AUX */
+ disable_irq_nosync(wcd938x->hphr_pdm_wd_int);
+@@ -3137,7 +3172,7 @@ static int wcd938x_soc_codec_probe(struct snd_soc_component *component)
+ dev_err(component->dev,
+ "%s: Failed to add snd ctrls for variant: %d\n",
+ __func__, wcd938x->variant);
+- goto err;
++ goto err_free_aux_pdm_wd_int;
+ }
+ break;
+ case WCD9385:
+@@ -3147,7 +3182,7 @@ static int wcd938x_soc_codec_probe(struct snd_soc_component *component)
+ dev_err(component->dev,
+ "%s: Failed to add snd ctrls for variant: %d\n",
+ __func__, wcd938x->variant);
+- goto err;
++ goto err_free_aux_pdm_wd_int;
+ }
+ break;
+ default:
+@@ -3155,12 +3190,38 @@ static int wcd938x_soc_codec_probe(struct snd_soc_component *component)
+ }
+
+ ret = wcd938x_mbhc_init(component);
+- if (ret)
++ if (ret) {
+ dev_err(component->dev, "mbhc initialization failed\n");
+-err:
++ goto err_free_aux_pdm_wd_int;
++ }
++
++ return 0;
++
++err_free_aux_pdm_wd_int:
++ free_irq(wcd938x->aux_pdm_wd_int, wcd938x);
++err_free_hphl_pdm_wd_int:
++ free_irq(wcd938x->hphl_pdm_wd_int, wcd938x);
++err_free_hphr_pdm_wd_int:
++ free_irq(wcd938x->hphr_pdm_wd_int, wcd938x);
++err_free_clsh_ctrl:
++ wcd_clsh_ctrl_free(wcd938x->clsh_info);
++
+ return ret;
+ }
+
++static void wcd938x_soc_codec_remove(struct snd_soc_component *component)
++{
++ struct wcd938x_priv *wcd938x = snd_soc_component_get_drvdata(component);
++
++ wcd938x_mbhc_deinit(component);
++
++ free_irq(wcd938x->aux_pdm_wd_int, wcd938x);
++ free_irq(wcd938x->hphl_pdm_wd_int, wcd938x);
++ free_irq(wcd938x->hphr_pdm_wd_int, wcd938x);
++
++ wcd_clsh_ctrl_free(wcd938x->clsh_info);
++}
++
+ static int wcd938x_codec_set_jack(struct snd_soc_component *comp,
+ struct snd_soc_jack *jack, void *data)
+ {
+@@ -3177,6 +3238,7 @@ static int wcd938x_codec_set_jack(struct snd_soc_component *comp,
+ static const struct snd_soc_component_driver soc_codec_dev_wcd938x = {
+ .name = "wcd938x_codec",
+ .probe = wcd938x_soc_codec_probe,
++ .remove = wcd938x_soc_codec_remove,
+ .controls = wcd938x_snd_controls,
+ .num_controls = ARRAY_SIZE(wcd938x_snd_controls),
+ .dapm_widgets = wcd938x_dapm_widgets,
+diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
+index e3105d48fb651..e9f1398ca9330 100644
+--- a/sound/soc/fsl/fsl_sai.c
++++ b/sound/soc/fsl/fsl_sai.c
+@@ -507,12 +507,6 @@ static int fsl_sai_set_bclk(struct snd_soc_dai *dai, bool tx, u32 freq)
+ savediv / 2 - 1);
+ }
+
+- if (sai->soc_data->max_register >= FSL_SAI_MCTL) {
+- /* SAI is in master mode at this point, so enable MCLK */
+- regmap_update_bits(sai->regmap, FSL_SAI_MCTL,
+- FSL_SAI_MCTL_MCLK_EN, FSL_SAI_MCTL_MCLK_EN);
+- }
+-
+ return 0;
+ }
+
+@@ -719,7 +713,7 @@ static void fsl_sai_config_disable(struct fsl_sai *sai, int dir)
+ u32 xcsr, count = 100;
+
+ regmap_update_bits(sai->regmap, FSL_SAI_xCSR(tx, ofs),
+- FSL_SAI_CSR_TERE, 0);
++ FSL_SAI_CSR_TERE | FSL_SAI_CSR_BCE, 0);
+
+ /* TERE will remain set till the end of current frame */
+ do {
+diff --git a/sound/soc/fsl/fsl_sai.h b/sound/soc/fsl/fsl_sai.h
+index a53c4f0e25faf..db8aabb156c7d 100644
+--- a/sound/soc/fsl/fsl_sai.h
++++ b/sound/soc/fsl/fsl_sai.h
+@@ -91,6 +91,7 @@
+ /* SAI Transmit/Receive Control Register */
+ #define FSL_SAI_CSR_TERE BIT(31)
+ #define FSL_SAI_CSR_SE BIT(30)
++#define FSL_SAI_CSR_BCE BIT(28)
+ #define FSL_SAI_CSR_FR BIT(25)
+ #define FSL_SAI_CSR_SR BIT(24)
+ #define FSL_SAI_CSR_xF_SHIFT 16
+diff --git a/sound/soc/qcom/qdsp6/q6apm.c b/sound/soc/qcom/qdsp6/q6apm.c
+index a7a3f973eb6d5..cdebf209c8a55 100644
+--- a/sound/soc/qcom/qdsp6/q6apm.c
++++ b/sound/soc/qcom/qdsp6/q6apm.c
+@@ -446,6 +446,8 @@ static int graph_callback(struct gpr_resp_pkt *data, void *priv, int op)
+
+ switch (hdr->opcode) {
+ case DATA_CMD_RSP_WR_SH_MEM_EP_DATA_BUFFER_DONE_V2:
++ if (!graph->ar_graph)
++ break;
+ client_event = APM_CLIENT_EVENT_DATA_WRITE_DONE;
+ mutex_lock(&graph->lock);
+ token = hdr->token & APM_WRITE_TOKEN_MASK;
+@@ -479,6 +481,8 @@ static int graph_callback(struct gpr_resp_pkt *data, void *priv, int op)
+ wake_up(&graph->cmd_wait);
+ break;
+ case DATA_CMD_RSP_RD_SH_MEM_EP_DATA_BUFFER_V2:
++ if (!graph->ar_graph)
++ break;
+ client_event = APM_CLIENT_EVENT_DATA_READ_DONE;
+ mutex_lock(&graph->lock);
+ rd_done = data->payload;
+@@ -581,8 +585,9 @@ int q6apm_graph_close(struct q6apm_graph *graph)
+ {
+ struct audioreach_graph *ar_graph = graph->ar_graph;
+
+- gpr_free_port(graph->port);
++ graph->ar_graph = NULL;
+ kref_put(&ar_graph->refcount, q6apm_put_audioreach_graph);
++ gpr_free_port(graph->port);
+ kfree(graph);
+
+ return 0;
+diff --git a/sound/soc/qcom/qdsp6/topology.c b/sound/soc/qcom/qdsp6/topology.c
+index cccc59b570b9a..130b22a34fb3b 100644
+--- a/sound/soc/qcom/qdsp6/topology.c
++++ b/sound/soc/qcom/qdsp6/topology.c
+@@ -1277,8 +1277,8 @@ int audioreach_tplg_init(struct snd_soc_component *component)
+
+ ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw);
+ if (ret < 0) {
+- dev_err(dev, "tplg component load failed%d\n", ret);
+- ret = -EINVAL;
++ if (ret != -EPROBE_DEFER)
++ dev_err(dev, "tplg component load failed: %d\n", ret);
+ }
+
+ release_firmware(fw);
+diff --git a/sound/soc/sof/ipc3-dtrace.c b/sound/soc/sof/ipc3-dtrace.c
+index 1d3bca2d28dd6..35da85a45a9ae 100644
+--- a/sound/soc/sof/ipc3-dtrace.c
++++ b/sound/soc/sof/ipc3-dtrace.c
+@@ -186,7 +186,6 @@ static ssize_t dfsentry_trace_filter_write(struct file *file, const char __user
+ struct snd_sof_dfsentry *dfse = file->private_data;
+ struct sof_ipc_trace_filter_elem *elems = NULL;
+ struct snd_sof_dev *sdev = dfse->sdev;
+- loff_t pos = 0;
+ int num_elems;
+ char *string;
+ int ret;
+@@ -201,11 +200,11 @@ static ssize_t dfsentry_trace_filter_write(struct file *file, const char __user
+ if (!string)
+ return -ENOMEM;
+
+- /* assert null termination */
+- string[count] = 0;
+- ret = simple_write_to_buffer(string, count, &pos, from, count);
+- if (ret < 0)
++ if (copy_from_user(string, from, count)) {
++ ret = -EFAULT;
+ goto error;
++ }
++ string[count] = '\0';
+
+ ret = trace_filter_parse(sdev, string, &num_elems, &elems);
+ if (ret < 0)
+diff --git a/sound/soc/tegra/tegra210_adx.c b/sound/soc/tegra/tegra210_adx.c
+index 41117c1d61fb3..c9eba3566aeed 100644
+--- a/sound/soc/tegra/tegra210_adx.c
++++ b/sound/soc/tegra/tegra210_adx.c
+@@ -2,7 +2,7 @@
+ //
+ // tegra210_adx.c - Tegra210 ADX driver
+ //
+-// Copyright (c) 2021 NVIDIA CORPORATION. All rights reserved.
++// Copyright (c) 2021-2023 NVIDIA CORPORATION. All rights reserved.
+
+ #include <linux/clk.h>
+ #include <linux/device.h>
+@@ -175,10 +175,20 @@ static int tegra210_adx_get_byte_map(struct snd_kcontrol *kcontrol,
+ mc = (struct soc_mixer_control *)kcontrol->private_value;
+ enabled = adx->byte_mask[mc->reg / 32] & (1 << (mc->reg % 32));
+
++ /*
++ * TODO: Simplify this logic to just return from bytes_map[]
++ *
++ * Presently below is required since bytes_map[] is
++ * tightly packed and cannot store the control value of 256.
++ * Byte mask state is used to know if 256 needs to be returned.
++ * Note that for control value of 256, the put() call stores 0
++ * in the bytes_map[] and disables the corresponding bit in
++ * byte_mask[].
++ */
+ if (enabled)
+ ucontrol->value.integer.value[0] = bytes_map[mc->reg];
+ else
+- ucontrol->value.integer.value[0] = 0;
++ ucontrol->value.integer.value[0] = 256;
+
+ return 0;
+ }
+@@ -192,19 +202,19 @@ static int tegra210_adx_put_byte_map(struct snd_kcontrol *kcontrol,
+ int value = ucontrol->value.integer.value[0];
+ struct soc_mixer_control *mc =
+ (struct soc_mixer_control *)kcontrol->private_value;
++ unsigned int mask_val = adx->byte_mask[mc->reg / 32];
+
+- if (value == bytes_map[mc->reg])
++ if (value >= 0 && value <= 255)
++ mask_val |= (1 << (mc->reg % 32));
++ else
++ mask_val &= ~(1 << (mc->reg % 32));
++
++ if (mask_val == adx->byte_mask[mc->reg / 32])
+ return 0;
+
+- if (value >= 0 && value <= 255) {
+- /* update byte map and enable slot */
+- bytes_map[mc->reg] = value;
+- adx->byte_mask[mc->reg / 32] |= (1 << (mc->reg % 32));
+- } else {
+- /* reset byte map and disable slot */
+- bytes_map[mc->reg] = 0;
+- adx->byte_mask[mc->reg / 32] &= ~(1 << (mc->reg % 32));
+- }
++ /* Update byte map and slot */
++ bytes_map[mc->reg] = value % 256;
++ adx->byte_mask[mc->reg / 32] = mask_val;
+
+ return 1;
+ }
+diff --git a/sound/soc/tegra/tegra210_amx.c b/sound/soc/tegra/tegra210_amx.c
+index 782a141b65c0c..179876949b308 100644
+--- a/sound/soc/tegra/tegra210_amx.c
++++ b/sound/soc/tegra/tegra210_amx.c
+@@ -2,7 +2,7 @@
+ //
+ // tegra210_amx.c - Tegra210 AMX driver
+ //
+-// Copyright (c) 2021 NVIDIA CORPORATION. All rights reserved.
++// Copyright (c) 2021-2023 NVIDIA CORPORATION. All rights reserved.
+
+ #include <linux/clk.h>
+ #include <linux/device.h>
+@@ -203,10 +203,20 @@ static int tegra210_amx_get_byte_map(struct snd_kcontrol *kcontrol,
+ else
+ enabled = amx->byte_mask[0] & (1 << reg);
+
++ /*
++ * TODO: Simplify this logic to just return from bytes_map[]
++ *
++ * Presently below is required since bytes_map[] is
++ * tightly packed and cannot store the control value of 256.
++ * Byte mask state is used to know if 256 needs to be returned.
++ * Note that for control value of 256, the put() call stores 0
++ * in the bytes_map[] and disables the corresponding bit in
++ * byte_mask[].
++ */
+ if (enabled)
+ ucontrol->value.integer.value[0] = bytes_map[reg];
+ else
+- ucontrol->value.integer.value[0] = 0;
++ ucontrol->value.integer.value[0] = 256;
+
+ return 0;
+ }
+@@ -221,25 +231,19 @@ static int tegra210_amx_put_byte_map(struct snd_kcontrol *kcontrol,
+ unsigned char *bytes_map = (unsigned char *)&amx->map;
+ int reg = mc->reg;
+ int value = ucontrol->value.integer.value[0];
++ unsigned int mask_val = amx->byte_mask[reg / 32];
+
+- if (value == bytes_map[reg])
++ if (value >= 0 && value <= 255)
++ mask_val |= (1 << (reg % 32));
++ else
++ mask_val &= ~(1 << (reg % 32));
++
++ if (mask_val == amx->byte_mask[reg / 32])
+ return 0;
+
+- if (value >= 0 && value <= 255) {
+- /* Update byte map and enable slot */
+- bytes_map[reg] = value;
+- if (reg > 31)
+- amx->byte_mask[1] |= (1 << (reg - 32));
+- else
+- amx->byte_mask[0] |= (1 << reg);
+- } else {
+- /* Reset byte map and disable slot */
+- bytes_map[reg] = 0;
+- if (reg > 31)
+- amx->byte_mask[1] &= ~(1 << (reg - 32));
+- else
+- amx->byte_mask[0] &= ~(1 << reg);
+- }
++ /* Update byte map and slot */
++ bytes_map[reg] = value % 256;
++ amx->byte_mask[reg / 32] = mask_val;
+
+ return 1;
+ }
+diff --git a/tools/include/nolibc/stackprotector.h b/tools/include/nolibc/stackprotector.h
+index d119cbbbc256f..9890e86c26172 100644
+--- a/tools/include/nolibc/stackprotector.h
++++ b/tools/include/nolibc/stackprotector.h
+@@ -45,8 +45,9 @@ __attribute__((weak,no_stack_protector,section(".text.nolibc_stack_chk")))
+ void __stack_chk_init(void)
+ {
+ my_syscall3(__NR_getrandom, &__stack_chk_guard, sizeof(__stack_chk_guard), 0);
+- /* a bit more randomness in case getrandom() fails */
+- __stack_chk_guard ^= (uintptr_t) &__stack_chk_guard;
++ /* a bit more randomness in case getrandom() fails, ensure the guard is never 0 */
++ if (__stack_chk_guard != (uintptr_t) &__stack_chk_guard)
++ __stack_chk_guard ^= (uintptr_t) &__stack_chk_guard;
+ }
+ #endif // defined(NOLIBC_STACKPROTECTOR)
+
+diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
+index a794d9eca93d8..72f068682c9a2 100644
+--- a/tools/perf/Makefile.config
++++ b/tools/perf/Makefile.config
+@@ -155,9 +155,9 @@ FEATURE_CHECK_LDFLAGS-libcrypto = -lcrypto
+ ifdef CSINCLUDES
+ LIBOPENCSD_CFLAGS := -I$(CSINCLUDES)
+ endif
+-OPENCSDLIBS := -lopencsd_c_api
++OPENCSDLIBS := -lopencsd_c_api -lopencsd
+ ifeq ($(findstring -static,${LDFLAGS}),-static)
+- OPENCSDLIBS += -lopencsd -lstdc++
++ OPENCSDLIBS += -lstdc++
+ endif
+ ifdef CSLIBS
+ LIBOPENCSD_LDFLAGS := -L$(CSLIBS)
+diff --git a/tools/perf/tests/shell/test_uprobe_from_different_cu.sh b/tools/perf/tests/shell/test_uprobe_from_different_cu.sh
+new file mode 100644
+index 0000000000000..00d2e0e2e0c28
+--- /dev/null
++++ b/tools/perf/tests/shell/test_uprobe_from_different_cu.sh
+@@ -0,0 +1,77 @@
++#!/bin/bash
++# test perf probe of function from different CU
++# SPDX-License-Identifier: GPL-2.0
++
++set -e
++
++temp_dir=$(mktemp -d /tmp/perf-uprobe-different-cu-sh.XXXXXXXXXX)
++
++cleanup()
++{
++ trap - EXIT TERM INT
++ if [[ "${temp_dir}" =~ ^/tmp/perf-uprobe-different-cu-sh.*$ ]]; then
++ echo "--- Cleaning up ---"
++ perf probe -x ${temp_dir}/testfile -d foo
++ rm -f "${temp_dir}/"*
++ rmdir "${temp_dir}"
++ fi
++}
++
++trap_cleanup()
++{
++ cleanup
++ exit 1
++}
++
++trap trap_cleanup EXIT TERM INT
++
++cat > ${temp_dir}/testfile-foo.h << EOF
++struct t
++{
++ int *p;
++ int c;
++};
++
++extern int foo (int i, struct t *t);
++EOF
++
++cat > ${temp_dir}/testfile-foo.c << EOF
++#include "testfile-foo.h"
++
++int
++foo (int i, struct t *t)
++{
++ int j, res = 0;
++ for (j = 0; j < i && j < t->c; j++)
++ res += t->p[j];
++
++ return res;
++}
++EOF
++
++cat > ${temp_dir}/testfile-main.c << EOF
++#include "testfile-foo.h"
++
++static struct t g;
++
++int
++main (int argc, char **argv)
++{
++ int i;
++ int j[argc];
++ g.c = argc;
++ g.p = j;
++ for (i = 0; i < argc; i++)
++ j[i] = (int) argv[i][0];
++ return foo (3, &g);
++}
++EOF
++
++gcc -g -Og -flto -c ${temp_dir}/testfile-foo.c -o ${temp_dir}/testfile-foo.o
++gcc -g -Og -c ${temp_dir}/testfile-main.c -o ${temp_dir}/testfile-main.o
++gcc -g -Og -o ${temp_dir}/testfile ${temp_dir}/testfile-foo.o ${temp_dir}/testfile-main.o
++
++perf probe -x ${temp_dir}/testfile --funcs foo
++perf probe -x ${temp_dir}/testfile foo
++
++cleanup
+diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
+index 3bff678745635..3597ca288c9c6 100644
+--- a/tools/perf/util/dwarf-aux.c
++++ b/tools/perf/util/dwarf-aux.c
+@@ -478,8 +478,10 @@ static const char *die_get_file_name(Dwarf_Die *dw_die, int idx)
+ {
+ Dwarf_Die cu_die;
+ Dwarf_Files *files;
++ Dwarf_Attribute attr_mem;
+
+- if (idx < 0 || !dwarf_diecu(dw_die, &cu_die, NULL, NULL) ||
++ if (idx < 0 || !dwarf_attr_integrate(dw_die, DW_AT_decl_file, &attr_mem) ||
++ !dwarf_cu_die(attr_mem.cu, &cu_die, NULL, NULL, NULL, NULL, NULL, NULL) ||
+ dwarf_getsrcfiles(&cu_die, &files, NULL) != 0)
+ return NULL;
+
+diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c
+index 9286d3baa12d6..adc5392df4009 100644
+--- a/tools/testing/radix-tree/maple.c
++++ b/tools/testing/radix-tree/maple.c
+@@ -206,9 +206,9 @@ static noinline void check_new_node(struct maple_tree *mt)
+ e = i - 1;
+ } else {
+ if (i >= 4)
+- e = i - 4;
+- else if (i == 3)
+- e = i - 2;
++ e = i - 3;
++ else if (i >= 1)
++ e = i - 1;
+ else
+ e = 0;
+ }
+diff --git a/tools/testing/selftests/mm/mkdirty.c b/tools/testing/selftests/mm/mkdirty.c
+index 6d71d972997b2..301abb99e027e 100644
+--- a/tools/testing/selftests/mm/mkdirty.c
++++ b/tools/testing/selftests/mm/mkdirty.c
+@@ -321,8 +321,8 @@ close_uffd:
+ munmap:
+ munmap(dst, pagesize);
+ free(src);
+-#endif /* __NR_userfaultfd */
+ }
++#endif /* __NR_userfaultfd */
+
+ int main(void)
+ {
+diff --git a/tools/testing/selftests/tc-testing/config b/tools/testing/selftests/tc-testing/config
+index 6e73b09c20c81..71706197ba0f8 100644
+--- a/tools/testing/selftests/tc-testing/config
++++ b/tools/testing/selftests/tc-testing/config
+@@ -5,6 +5,8 @@ CONFIG_NF_CONNTRACK=m
+ CONFIG_NF_CONNTRACK_MARK=y
+ CONFIG_NF_CONNTRACK_ZONES=y
+ CONFIG_NF_CONNTRACK_LABELS=y
++CONFIG_NF_CONNTRACK_PROCFS=y
++CONFIG_NF_FLOW_TABLE=m
+ CONFIG_NF_NAT=m
+ CONFIG_NETFILTER_XT_TARGET_LOG=m
+
+diff --git a/tools/testing/selftests/tc-testing/settings b/tools/testing/selftests/tc-testing/settings
+new file mode 100644
+index 0000000000000..e2206265f67c7
+--- /dev/null
++++ b/tools/testing/selftests/tc-testing/settings
+@@ -0,0 +1 @@
++timeout=900
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-27 11:46 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-27 11:46 UTC (permalink / raw
To: gentoo-commits
commit: 2981ddb8e387a7f95ab528d4aab9e12d88d1125c
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Jul 27 11:46:45 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Jul 27 11:46:45 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=2981ddb8
Update README
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/0000_README b/0000_README
index 44858dee..58f42d41 100644
--- a/0000_README
+++ b/0000_README
@@ -67,6 +67,10 @@ Patch: 1005_linux-6.4.6.patch
From: https://www.kernel.org
Desc: Linux 6.4.6
+Patch: 1006_linux-6.4.7.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.7
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-07-29 18:37 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-07-29 18:37 UTC (permalink / raw
To: gentoo-commits
commit: 6a24630d39aff977fda6c7c759e7bbbacad854d6
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sat Jul 29 18:37:25 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sat Jul 29 18:37:25 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=6a24630d
Add BMQ scheduler, USE=experimental
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 8 +
5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch | 11163 ++++++++++++++++++++++++++
5021_BMQ-and-PDS-gentoo-defaults.patch | 13 +
3 files changed, 11184 insertions(+)
diff --git a/0000_README b/0000_README
index 58f42d41..e76d9abd 100644
--- a/0000_README
+++ b/0000_README
@@ -114,3 +114,11 @@ Desc: Add Gentoo Linux support config settings and defaults.
Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: Kernel >= 5.15 patch enables gcc = v11.1+ optimizations for additional CPUs.
+
+Patch: 5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch
+From: https://github.com/Frogging-Family/linux-tkg https://gitlab.com/alfredchen/projectc
+Desc: BMQ (BitMap Queue) Scheduler. A new CPU scheduler developed from PDS (also included in this patch). Inspired by the scheduler in Zircon.
+
+Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
+From: https://gitweb.gentoo.org/proj/linux-patches.git/
+Desc: Set defaults for BMQ. Add archs as people test, default to N
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch b/5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch
new file mode 100644
index 00000000..3061b321
--- /dev/null
+++ b/5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch
@@ -0,0 +1,11163 @@
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index 9e5bab29685f..b942b7dd8c42 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -5496,6 +5496,12 @@
+ sa1100ir [NET]
+ See drivers/net/irda/sa1100_ir.c.
+
++ sched_timeslice=
++ [KNL] Time slice in ms for Project C BMQ/PDS scheduler.
++ Format: integer 2, 4
++ Default: 4
++ See Documentation/scheduler/sched-BMQ.txt
++
+ sched_verbose [KNL] Enables verbose scheduler debug messages.
+
+ schedstats= [KNL,X86] Enable or disable scheduled statistics.
+diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
+index d85d90f5d000..f730195a3adb 100644
+--- a/Documentation/admin-guide/sysctl/kernel.rst
++++ b/Documentation/admin-guide/sysctl/kernel.rst
+@@ -1616,3 +1616,13 @@ is 10 seconds.
+
+ The softlockup threshold is (``2 * watchdog_thresh``). Setting this
+ tunable to zero will disable lockup detection altogether.
++
++yield_type:
++===========
++
++BMQ/PDS CPU scheduler only. This determines what type of yield calls
++to sched_yield will perform.
++
++ 0 - No yield.
++ 1 - Deboost and requeue task. (default)
++ 2 - Set run queue skip task.
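As a quick usage sketch, assuming the tunable ends up exposed as /proc/sys/kernel/yield_type once this patch is applied (that path is an assumption, not something stated in the documentation above), a minimal C program selecting the default deboost-and-requeue behaviour could look like this:

#include <stdio.h>

/* Illustration only: write "1" (deboost and requeue) to the assumed
 * /proc/sys/kernel/yield_type path; needs root and the BMQ/PDS patch. */
int main(void)
{
    FILE *f = fopen("/proc/sys/kernel/yield_type", "w");

    if (!f) {
        perror("yield_type");
        return 1;
    }
    fprintf(f, "1\n");
    return fclose(f) == 0 ? 0 : 1;
}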
+diff --git a/Documentation/scheduler/sched-BMQ.txt b/Documentation/scheduler/sched-BMQ.txt
+new file mode 100644
+index 000000000000..05c84eec0f31
+--- /dev/null
++++ b/Documentation/scheduler/sched-BMQ.txt
+@@ -0,0 +1,110 @@
++ BitMap queue CPU Scheduler
++ --------------------------
++
++CONTENT
++========
++
++ Background
++ Design
++ Overview
++ Task policy
++ Priority management
++ BitMap Queue
++ CPU Assignment and Migration
++
++
++Background
++==========
++
++The BitMap Queue CPU scheduler, referred to as BMQ from here on, is an
++evolution of the previous Priority and Deadline based Skiplist multiple queue
++scheduler (PDS), and is inspired by the Zircon scheduler. Its goal is to keep
++the scheduler code simple, while staying efficient and scalable for interactive
++tasks such as desktop use, movie playback and gaming.
++
++Design
++======
++
++Overview
++--------
++
++BMQ uses a per-CPU run queue design: each (logical) CPU has its own run queue
++and is responsible for scheduling the tasks that are put into its own run
++queue.
++
++The run queue is a set of priority queues. In terms of data structures these
++are FIFO queues for non-rt tasks and priority queues for rt tasks; see BitMap
++Queue below for details. BMQ is optimized for non-rt tasks, since most
++applications are non-rt tasks. Whether a queue is FIFO or priority based, each
++queue is an ordered list of runnable tasks awaiting execution, and the data
++structures are the same. When it is time for a new task to run, the scheduler
++simply looks for the lowest-numbered queue that contains a task and runs the
++first task from the head of that queue. The per-CPU idle task is also kept in
++the run queue, so the scheduler can always find a task to run on from its own
++run queue.
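To make this queue selection concrete, the following stand-alone C sketch mimics the idea with made-up names (demo_rq, demo_task and NUM_LEVELS are illustrative only; the real per-CPU run queue and bitmap live in kernel/sched/alt_core.c further down in this patch):

#include <stdio.h>
#include <string.h>

#define NUM_LEVELS 64   /* illustrative only; BMQ sizes this from its priority range */

struct demo_task {
    const char *name;
    struct demo_task *next;   /* FIFO link, unused in this tiny example */
};

struct demo_rq {
    unsigned long long bitmap;            /* bit n set => level n has queued tasks */
    struct demo_task *heads[NUM_LEVELS];  /* head of the FIFO list per level */
};

/* "Find the lowest-numbered non-empty queue and run its head task." */
static struct demo_task *pick_next(const struct demo_rq *rq)
{
    if (!rq->bitmap)
        return NULL;   /* cannot happen in BMQ: the idle task is always queued */
    return rq->heads[__builtin_ctzll(rq->bitmap)];
}

int main(void)
{
    struct demo_task editor = { "editor", NULL };
    struct demo_task batch  = { "batch-job", NULL };
    struct demo_rq rq;

    memset(&rq, 0, sizeof(rq));
    rq.heads[3]  = &editor;                    /* more interactive, lower level */
    rq.heads[20] = &batch;                     /* background work, higher level */
    rq.bitmap    = (1ULL << 3) | (1ULL << 20);

    printf("next task: %s\n", pick_next(&rq)->name);   /* prints "editor" */
    return 0;
}

Because the per-CPU idle task always occupies one of the levels, the real lookup never comes up empty.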
++
++Each task is assigned the same timeslice (4 ms by default) when it is picked to
++start running. A task is reinserted at the end of the appropriate priority
++queue when it uses up its whole timeslice. When the scheduler selects a new
++task from the priority queue, it sets the CPU's preemption timer for the
++remainder of the previous timeslice. When that timer fires, the scheduler stops
++execution of that task, selects another task and starts over again.
++
++If a task blocks waiting for a shared resource then it's taken out of its
++priority queue and is placed in a wait queue for the shared resource. When it
++is unblocked it will be reinserted in the appropriate priority queue of an
++eligible CPU.
++
++Task policy
++-----------
++
++BMQ supports the DEADLINE, FIFO, RR, NORMAL, BATCH and IDLE task policies, like
++the mainline CFS scheduler, but it is heavily optimized for non-rt tasks, that
++is, NORMAL/BATCH/IDLE policy tasks. Below are the implementation details for
++each policy.
++
++DEADLINE
++ It is squashed as priority 0 FIFO task.
++
++FIFO/RR
++ All RT tasks share a single priority queue in the BMQ run queue design. The
++complexity of the insert operation is O(n). BMQ is not designed for systems
++that mainly run rt policy tasks.
++
++NORMAL/BATCH/IDLE
++ BATCH and IDLE tasks are treated as the same policy. They compete for CPU with
++NORMAL policy tasks, but they just don't get boosted. To control the priority
++of NORMAL/BATCH/IDLE tasks, simply use the nice level.
++
++ISO
++ ISO policy is not supported in BMQ. Please use nice level -20 NORMAL policy
++task instead.
++
++Priority management
++-------------------
++
++RT tasks have priorities from 0 to 99. For non-rt tasks, there are three
++different factors used to determine the effective priority of a task. The
++effective priority is what is used to determine which queue the task will be in.
++
++The first factor is simply the task's static priority, which is assigned from
++the task's nice level: within [-20, 19] from userland's point of view and
++[0, 39] internally.
++
++The second factor is the priority boost. This is a value bounded between
++[-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] that is used to offset the base priority;
++it is modified in the following cases:
++
++*When a thread has used up its entire timeslice, its boost is always deboosted
++by increasing it by one.
++*When a thread gives up CPU control (voluntarily or not) in order to reschedule,
++and its switch-in time (the time since it was last switched in and run) is below
++the threshold based on its priority boost, its boost is boosted by decreasing it
++by one, capped at 0 (it won't go negative).
++
++The intent in this system is to ensure that interactive threads are serviced
++quickly. These are usually the threads that interact directly with the user
++and cause user-perceivable latency. These threads usually do little work and
++spend most of their time blocked awaiting another user event. So they get the
++priority boost from unblocking while background threads that do most of the
++processing receive the priority penalty for using their entire timeslice.
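As a rough worked example of how the two factors described above combine (MAX_PRIORITY_ADJ = 7 is taken from the BMQ headers later in this patch; the exact mapping in bmq.h differs in detail, so this is only an illustration of the idea, not the in-kernel formula):

#include <stdio.h>

#define MAX_PRIORITY_ADJ 7   /* BMQ's bound on the boost, per the headers in this patch */

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Illustration only: nice [-20, 19] maps to a static priority [0, 39];
 * the boost offsets it (negative = boosted toward the front, positive =
 * deboosted). Lower numbers are picked first, as in the bitmap sketch above.
 */
static int effective_prio(int nice, int boost)
{
    int static_prio = nice + 20;   /* 0..39 */

    boost = clamp_int(boost, -MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ);
    return clamp_int(static_prio + boost, 0, 39);
}

int main(void)
{
    printf("nice 0, boosted by 3   -> level %d\n", effective_prio(0, -3));   /* 17 */
    printf("nice 0, deboosted by 3 -> level %d\n", effective_prio(0, 3));    /* 23 */
    printf("nice 10, no boost      -> level %d\n", effective_prio(10, 0));   /* 30 */
    return 0;
}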
+diff --git a/fs/proc/base.c b/fs/proc/base.c
+index 05452c3b9872..fa1ceb85ad24 100644
+--- a/fs/proc/base.c
++++ b/fs/proc/base.c
+@@ -480,7 +480,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
+ seq_puts(m, "0 0 0\n");
+ else
+ seq_printf(m, "%llu %llu %lu\n",
+- (unsigned long long)task->se.sum_exec_runtime,
++ (unsigned long long)tsk_seruntime(task),
+ (unsigned long long)task->sched_info.run_delay,
+ task->sched_info.pcount);
+
+diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
+index 8874f681b056..59eb72bf7d5f 100644
+--- a/include/asm-generic/resource.h
++++ b/include/asm-generic/resource.h
+@@ -23,7 +23,7 @@
+ [RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ [RLIMIT_SIGPENDING] = { 0, 0 }, \
+ [RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
+- [RLIMIT_NICE] = { 0, 0 }, \
++ [RLIMIT_NICE] = { 30, 30 }, \
+ [RLIMIT_RTPRIO] = { 0, 0 }, \
+ [RLIMIT_RTTIME] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ }
+diff --git a/include/linux/sched.h b/include/linux/sched.h
+index eed5d65b8d1f..cdfd9263ddd6 100644
+--- a/include/linux/sched.h
++++ b/include/linux/sched.h
+@@ -764,8 +764,14 @@ struct task_struct {
+ unsigned int ptrace;
+
+ #ifdef CONFIG_SMP
+- int on_cpu;
+ struct __call_single_node wake_entry;
++#endif
++#if defined(CONFIG_SMP) || defined(CONFIG_SCHED_ALT)
++ int on_cpu;
++#endif
++
++#ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ unsigned int wakee_flips;
+ unsigned long wakee_flip_decay_ts;
+ struct task_struct *last_wakee;
+@@ -779,6 +785,7 @@ struct task_struct {
+ */
+ int recent_used_cpu;
+ int wake_cpu;
++#endif /* !CONFIG_SCHED_ALT */
+ #endif
+ int on_rq;
+
+@@ -787,6 +794,20 @@ struct task_struct {
+ int normal_prio;
+ unsigned int rt_priority;
+
++#ifdef CONFIG_SCHED_ALT
++ u64 last_ran;
++ s64 time_slice;
++ int sq_idx;
++ struct list_head sq_node;
++#ifdef CONFIG_SCHED_BMQ
++ int boost_prio;
++#endif /* CONFIG_SCHED_BMQ */
++#ifdef CONFIG_SCHED_PDS
++ u64 deadline;
++#endif /* CONFIG_SCHED_PDS */
++ /* sched_clock time spent running */
++ u64 sched_time;
++#else /* !CONFIG_SCHED_ALT */
+ struct sched_entity se;
+ struct sched_rt_entity rt;
+ struct sched_dl_entity dl;
+@@ -797,6 +818,7 @@ struct task_struct {
+ unsigned long core_cookie;
+ unsigned int core_occupation;
+ #endif
++#endif /* !CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_CGROUP_SCHED
+ struct task_group *sched_task_group;
+@@ -1551,6 +1573,15 @@ struct task_struct {
+ */
+ };
+
++#ifdef CONFIG_SCHED_ALT
++#define tsk_seruntime(t) ((t)->sched_time)
++/* replace the uncertain rt_timeout with 0UL */
++#define tsk_rttimeout(t) (0UL)
++#else /* CFS */
++#define tsk_seruntime(t) ((t)->se.sum_exec_runtime)
++#define tsk_rttimeout(t) ((t)->rt.timeout)
++#endif /* !CONFIG_SCHED_ALT */
++
+ static inline struct pid *task_pid(struct task_struct *task)
+ {
+ return task->thread_pid;
+diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
+index 7c83d4d5a971..fa30f98cb2be 100644
+--- a/include/linux/sched/deadline.h
++++ b/include/linux/sched/deadline.h
+@@ -1,5 +1,24 @@
+ /* SPDX-License-Identifier: GPL-2.0 */
+
++#ifdef CONFIG_SCHED_ALT
++
++static inline int dl_task(struct task_struct *p)
++{
++ return 0;
++}
++
++#ifdef CONFIG_SCHED_BMQ
++#define __tsk_deadline(p) (0UL)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define __tsk_deadline(p) ((((u64) ((p)->prio))<<56) | (p)->deadline)
++#endif
++
++#else
++
++#define __tsk_deadline(p) ((p)->dl.deadline)
++
+ /*
+ * SCHED_DEADLINE tasks has negative priorities, reflecting
+ * the fact that any of them has higher prio than RT and
+@@ -21,6 +40,7 @@ static inline int dl_task(struct task_struct *p)
+ {
+ return dl_prio(p->prio);
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ static inline bool dl_time_before(u64 a, u64 b)
+ {
+diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
+index ab83d85e1183..6af9ae681116 100644
+--- a/include/linux/sched/prio.h
++++ b/include/linux/sched/prio.h
+@@ -18,6 +18,32 @@
+ #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
+ #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
+
++#ifdef CONFIG_SCHED_ALT
++
++/* Undefine MAX_PRIO and DEFAULT_PRIO */
++#undef MAX_PRIO
++#undef DEFAULT_PRIO
++
++/* +/- priority levels from the base priority */
++#ifdef CONFIG_SCHED_BMQ
++#define MAX_PRIORITY_ADJ (7)
++
++#define MIN_NORMAL_PRIO (MAX_RT_PRIO)
++#define MAX_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH)
++#define DEFAULT_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH / 2)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define MAX_PRIORITY_ADJ (0)
++
++#define MIN_NORMAL_PRIO (128)
++#define NORMAL_PRIO_NUM (64)
++#define MAX_PRIO (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
++#define DEFAULT_PRIO (MAX_PRIO - NICE_WIDTH / 2)
++#endif
++
++#endif /* CONFIG_SCHED_ALT */
++
+ /*
+ * Convert user-nice values [ -20 ... 0 ... 19 ]
+ * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
+diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
+index 994c25640e15..8c050a59ece1 100644
+--- a/include/linux/sched/rt.h
++++ b/include/linux/sched/rt.h
+@@ -24,8 +24,10 @@ static inline bool task_is_realtime(struct task_struct *tsk)
+
+ if (policy == SCHED_FIFO || policy == SCHED_RR)
+ return true;
++#ifndef CONFIG_SCHED_ALT
+ if (policy == SCHED_DEADLINE)
+ return true;
++#endif
+ return false;
+ }
+
+diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
+index 816df6cc444e..c8da08e18c91 100644
+--- a/include/linux/sched/topology.h
++++ b/include/linux/sched/topology.h
+@@ -234,7 +234,8 @@ static inline bool cpus_share_cache(int this_cpu, int that_cpu)
+
+ #endif /* !CONFIG_SMP */
+
+-#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
++#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) && \
++ !defined(CONFIG_SCHED_ALT)
+ extern void rebuild_sched_domains_energy(void);
+ #else
+ static inline void rebuild_sched_domains_energy(void)
+diff --git a/init/Kconfig b/init/Kconfig
+index 32c24950c4ce..cf951b739454 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -629,6 +629,7 @@ config TASK_IO_ACCOUNTING
+
+ config PSI
+ bool "Pressure stall information tracking"
++ depends on !SCHED_ALT
+ help
+ Collect metrics that indicate how overcommitted the CPU, memory,
+ and IO capacity are in the system.
+@@ -793,6 +794,7 @@ menu "Scheduler features"
+ config UCLAMP_TASK
+ bool "Enable utilization clamping for RT/FAIR tasks"
+ depends on CPU_FREQ_GOV_SCHEDUTIL
++ depends on !SCHED_ALT
+ help
+ This feature enables the scheduler to track the clamped utilization
+ of each CPU based on RUNNABLE tasks scheduled on that CPU.
+@@ -839,6 +841,35 @@ config UCLAMP_BUCKETS_COUNT
+
+ If in doubt, use the default value.
+
++menuconfig SCHED_ALT
++ bool "Alternative CPU Schedulers"
++ default y
++ help
++	  This feature enables the alternative CPU schedulers.
++
++if SCHED_ALT
++
++choice
++ prompt "Alternative CPU Scheduler"
++ default SCHED_BMQ
++
++config SCHED_BMQ
++ bool "BMQ CPU scheduler"
++ help
++ The BitMap Queue CPU scheduler for excellent interactivity and
++ responsiveness on the desktop and solid scalability on normal
++ hardware and commodity servers.
++
++config SCHED_PDS
++ bool "PDS CPU scheduler"
++ help
++ The Priority and Deadline based Skip list multiple queue CPU
++ Scheduler.
++
++endchoice
++
++endif
++
+ endmenu
+
+ #
+@@ -892,6 +923,7 @@ config NUMA_BALANCING
+ depends on ARCH_SUPPORTS_NUMA_BALANCING
+ depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
+ depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
++ depends on !SCHED_ALT
+ help
+ This option adds support for automatic NUMA aware memory/task placement.
+ The mechanism is quite primitive and is based on migrating memory when
+@@ -989,6 +1021,7 @@ config FAIR_GROUP_SCHED
+ depends on CGROUP_SCHED
+ default CGROUP_SCHED
+
++if !SCHED_ALT
+ config CFS_BANDWIDTH
+ bool "CPU bandwidth provisioning for FAIR_GROUP_SCHED"
+ depends on FAIR_GROUP_SCHED
+@@ -1011,6 +1044,7 @@ config RT_GROUP_SCHED
+ realtime bandwidth for them.
+ See Documentation/scheduler/sched-rt-group.rst for more information.
+
++endif #!SCHED_ALT
+ endif #CGROUP_SCHED
+
+ config SCHED_MM_CID
+@@ -1259,6 +1293,7 @@ config CHECKPOINT_RESTORE
+
+ config SCHED_AUTOGROUP
+ bool "Automatic process group scheduling"
++ depends on !SCHED_ALT
+ select CGROUPS
+ select CGROUP_SCHED
+ select FAIR_GROUP_SCHED
+diff --git a/init/init_task.c b/init/init_task.c
+index ff6c4b9bfe6b..19e9c662d1a1 100644
+--- a/init/init_task.c
++++ b/init/init_task.c
+@@ -75,9 +75,15 @@ struct task_struct init_task
+ .stack = init_stack,
+ .usage = REFCOUNT_INIT(2),
+ .flags = PF_KTHREAD,
++#ifdef CONFIG_SCHED_ALT
++ .prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
++ .static_prio = DEFAULT_PRIO,
++ .normal_prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
++#else
+ .prio = MAX_PRIO - 20,
+ .static_prio = MAX_PRIO - 20,
+ .normal_prio = MAX_PRIO - 20,
++#endif
+ .policy = SCHED_NORMAL,
+ .cpus_ptr = &init_task.cpus_mask,
+ .user_cpus_ptr = NULL,
+@@ -88,6 +94,17 @@ struct task_struct init_task
+ .restart_block = {
+ .fn = do_no_restart_syscall,
+ },
++#ifdef CONFIG_SCHED_ALT
++ .sq_node = LIST_HEAD_INIT(init_task.sq_node),
++#ifdef CONFIG_SCHED_BMQ
++ .boost_prio = 0,
++ .sq_idx = 15,
++#endif
++#ifdef CONFIG_SCHED_PDS
++ .deadline = 0,
++#endif
++ .time_slice = HZ,
++#else
+ .se = {
+ .group_node = LIST_HEAD_INIT(init_task.se.group_node),
+ },
+@@ -95,6 +112,7 @@ struct task_struct init_task
+ .run_list = LIST_HEAD_INIT(init_task.rt.run_list),
+ .time_slice = RR_TIMESLICE,
+ },
++#endif
+ .tasks = LIST_HEAD_INIT(init_task.tasks),
+ #ifdef CONFIG_SMP
+ .pushable_tasks = PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
+diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
+index c2f1fd95a821..41654679b1b2 100644
+--- a/kernel/Kconfig.preempt
++++ b/kernel/Kconfig.preempt
+@@ -117,7 +117,7 @@ config PREEMPT_DYNAMIC
+
+ config SCHED_CORE
+ bool "Core Scheduling for SMT"
+- depends on SCHED_SMT
++ depends on SCHED_SMT && !SCHED_ALT
+ help
+ This option permits Core Scheduling, a means of coordinated task
+ selection across SMT siblings. When enabled -- see
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index e4ca2dd2b764..82786dbb220c 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -791,7 +791,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
+ return ret;
+ }
+
+-#ifdef CONFIG_SMP
++#if defined(CONFIG_SMP) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * Helper routine for generate_sched_domains().
+ * Do cpusets a, b have overlapping effective cpus_allowed masks?
+@@ -1187,7 +1187,7 @@ static void rebuild_sched_domains_locked(void)
+ /* Have scheduler rebuild the domains */
+ partition_and_rebuild_sched_domains(ndoms, doms, attr);
+ }
+-#else /* !CONFIG_SMP */
++#else /* !CONFIG_SMP || CONFIG_SCHED_ALT */
+ static void rebuild_sched_domains_locked(void)
+ {
+ }
+diff --git a/kernel/delayacct.c b/kernel/delayacct.c
+index 6f0c358e73d8..8111481ce8b1 100644
+--- a/kernel/delayacct.c
++++ b/kernel/delayacct.c
+@@ -150,7 +150,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
+ */
+ t1 = tsk->sched_info.pcount;
+ t2 = tsk->sched_info.run_delay;
+- t3 = tsk->se.sum_exec_runtime;
++ t3 = tsk_seruntime(tsk);
+
+ d->cpu_count += t1;
+
+diff --git a/kernel/exit.c b/kernel/exit.c
+index edb50b4c9972..09e72bba7cc2 100644
+--- a/kernel/exit.c
++++ b/kernel/exit.c
+@@ -173,7 +173,7 @@ static void __exit_signal(struct task_struct *tsk)
+ sig->curr_target = next_thread(tsk);
+ }
+
+- add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
++ add_device_randomness((const void*) &tsk_seruntime(tsk),
+ sizeof(unsigned long long));
+
+ /*
+@@ -194,7 +194,7 @@ static void __exit_signal(struct task_struct *tsk)
+ sig->inblock += task_io_get_inblock(tsk);
+ sig->oublock += task_io_get_oublock(tsk);
+ task_io_accounting_add(&sig->ioac, &tsk->ioac);
+- sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
++ sig->sum_sched_runtime += tsk_seruntime(tsk);
+ sig->nr_threads--;
+ __unhash_process(tsk, group_dead);
+ write_sequnlock(&sig->stats_lock);
+diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
+index 728f434de2bb..0e1082a4e878 100644
+--- a/kernel/locking/rtmutex.c
++++ b/kernel/locking/rtmutex.c
+@@ -337,21 +337,25 @@ static __always_inline void
+ waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
+ {
+ waiter->prio = __waiter_prio(task);
+- waiter->deadline = task->dl.deadline;
++ waiter->deadline = __tsk_deadline(task);
+ }
+
+ /*
+ * Only use with rt_mutex_waiter_{less,equal}()
+ */
+ #define task_to_waiter(p) \
+- &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
++ &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = __tsk_deadline(p) }
+
+ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
+ struct rt_mutex_waiter *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline < right->deadline);
++#else
+ if (left->prio < right->prio)
+ return 1;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -360,16 +364,22 @@ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
+ */
+ if (dl_prio(left->prio))
+ return dl_time_before(left->deadline, right->deadline);
++#endif
+
+ return 0;
++#endif
+ }
+
+ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
+ struct rt_mutex_waiter *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline == right->deadline);
++#else
+ if (left->prio != right->prio)
+ return 0;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -378,8 +388,10 @@ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
+ */
+ if (dl_prio(left->prio))
+ return left->deadline == right->deadline;
++#endif
+
+ return 1;
++#endif
+ }
+
+ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
+diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
+index 976092b7bd45..31d587c16ec1 100644
+--- a/kernel/sched/Makefile
++++ b/kernel/sched/Makefile
+@@ -28,7 +28,12 @@ endif
+ # These compilation units have roughly the same size and complexity - so their
+ # build parallelizes well and finishes roughly at once:
+ #
++ifdef CONFIG_SCHED_ALT
++obj-y += alt_core.o
++obj-$(CONFIG_SCHED_DEBUG) += alt_debug.o
++else
+ obj-y += core.o
+ obj-y += fair.o
++endif
+ obj-y += build_policy.o
+ obj-y += build_utility.o
+diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
+new file mode 100644
+index 000000000000..3e8ddbd8001c
+--- /dev/null
++++ b/kernel/sched/alt_core.c
+@@ -0,0 +1,8729 @@
++/*
++ * kernel/sched/alt_core.c
++ *
++ * Core alternative kernel scheduler code and related syscalls
++ *
++ * Copyright (C) 1991-2002 Linus Torvalds
++ *
++ * 2009-08-13 Brainfuck deadline scheduling policy by Con Kolivas deletes
++ * a whole lot of those previous things.
++ * 2017-09-06 Priority and Deadline based Skip list multiple queue kernel
++ * scheduler by Alfred Chen.
++ * 2019-02-20 BMQ(BitMap Queue) kernel scheduler by Alfred Chen.
++ */
++#include <linux/sched/clock.h>
++#include <linux/sched/cputime.h>
++#include <linux/sched/debug.h>
++#include <linux/sched/isolation.h>
++#include <linux/sched/loadavg.h>
++#include <linux/sched/mm.h>
++#include <linux/sched/nohz.h>
++#include <linux/sched/stat.h>
++#include <linux/sched/wake_q.h>
++
++#include <linux/blkdev.h>
++#include <linux/context_tracking.h>
++#include <linux/cpuset.h>
++#include <linux/delayacct.h>
++#include <linux/init_task.h>
++#include <linux/kcov.h>
++#include <linux/kprobes.h>
++#include <linux/nmi.h>
++#include <linux/scs.h>
++
++#include <uapi/linux/sched/types.h>
++
++#include <asm/irq_regs.h>
++#include <asm/switch_to.h>
++
++#define CREATE_TRACE_POINTS
++#include <trace/events/sched.h>
++#include <trace/events/ipi.h>
++#undef CREATE_TRACE_POINTS
++
++#include "sched.h"
++
++#include "pelt.h"
++
++#include "../../io_uring/io-wq.h"
++#include "../smpboot.h"
++
++EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpu);
++EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpumask);
++
++/*
++ * Export tracepoints that act as a bare tracehook (ie: have no trace event
++ * associated with them) to allow external modules to probe them.
++ */
++EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
++
++#ifdef CONFIG_SCHED_DEBUG
++#define sched_feat(x) (1)
++/*
++ * Print a warning if need_resched is set for the given duration (if
++ * LATENCY_WARN is enabled).
++ *
++ * If sysctl_resched_latency_warn_once is set, only one warning will be shown
++ * per boot.
++ */
++__read_mostly int sysctl_resched_latency_warn_ms = 100;
++__read_mostly int sysctl_resched_latency_warn_once = 1;
++#else
++#define sched_feat(x) (0)
++#endif /* CONFIG_SCHED_DEBUG */
++
++#define ALT_SCHED_VERSION "v6.4-r0"
++
++/*
++ * Compile time debug macro
++ * #define ALT_SCHED_DEBUG
++ */
++
++/* rt_prio(prio) defined in include/linux/sched/rt.h */
++#define rt_task(p) rt_prio((p)->prio)
++#define rt_policy(policy) ((policy) == SCHED_FIFO || (policy) == SCHED_RR)
++#define task_has_rt_policy(p) (rt_policy((p)->policy))
++
++#define STOP_PRIO (MAX_RT_PRIO - 1)
++
++/* Default time slice is 4 ms; it can be set via the kernel parameter "sched_timeslice" */
++u64 sched_timeslice_ns __read_mostly = (4 << 20);
++
++static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx);
++
++#ifdef CONFIG_SCHED_BMQ
++#include "bmq.h"
++#endif
++#ifdef CONFIG_SCHED_PDS
++#include "pds.h"
++#endif
++
++struct affinity_context {
++ const struct cpumask *new_mask;
++ struct cpumask *user_mask;
++ unsigned int flags;
++};
++
++static int __init sched_timeslice(char *str)
++{
++ int timeslice_ms;
++
++	get_option(&str, &timeslice_ms);
++ if (2 != timeslice_ms)
++ timeslice_ms = 4;
++ sched_timeslice_ns = timeslice_ms << 20;
++ sched_timeslice_imp(timeslice_ms);
++
++ return 0;
++}
++early_param("sched_timeslice", sched_timeslice);
++
++/* Reschedule if less than this many μs left */
++#define RESCHED_NS (100 << 10)
++
++/**
++ * sched_yield_type - Choose what sort of yield sched_yield will perform.
++ * 0: No yield.
++ * 1: Deboost and requeue task. (default)
++ * 2: Set rq skip task.
++ */
++int sched_yield_type __read_mostly = 1;
++
++#ifdef CONFIG_SMP
++static cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
++
++DEFINE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_llc_mask);
++DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_topo_end_mask);
++
++#ifdef CONFIG_SCHED_SMT
++DEFINE_STATIC_KEY_FALSE(sched_smt_present);
++EXPORT_SYMBOL_GPL(sched_smt_present);
++#endif
++
++/*
++ * Keep a unique ID per domain (we use the first CPU's number in the cpumask of
++ * the domain); this allows us to quickly tell if two CPUs are in the same cache
++ * domain, see cpus_share_cache().
++ */
++DEFINE_PER_CPU(int, sd_llc_id);
++#endif /* CONFIG_SMP */
++
++static DEFINE_MUTEX(sched_hotcpu_mutex);
++
++DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++#ifdef CONFIG_SCHED_SMT
++static cpumask_t sched_sg_idle_mask ____cacheline_aligned_in_smp;
++#endif
++static cpumask_t sched_preempt_mask[SCHED_QUEUE_BITS] ____cacheline_aligned_in_smp;
++static cpumask_t *const sched_idle_mask = &sched_preempt_mask[0];
++
++/* task function */
++static inline const struct cpumask *task_user_cpus(struct task_struct *p)
++{
++ if (!p->user_cpus_ptr)
++ return cpu_possible_mask; /* &init_task.cpus_mask */
++ return p->user_cpus_ptr;
++}
++
++/* sched_queue related functions */
++static inline void sched_queue_init(struct sched_queue *q)
++{
++ int i;
++
++ bitmap_zero(q->bitmap, SCHED_QUEUE_BITS);
++ for(i = 0; i < SCHED_LEVELS; i++)
++ INIT_LIST_HEAD(&q->heads[i]);
++}
++
++/*
++ * Init idle task and put into queue structure of rq
++ * IMPORTANT: may be called multiple times for a single cpu
++ */
++static inline void sched_queue_init_idle(struct sched_queue *q,
++ struct task_struct *idle)
++{
++ idle->sq_idx = IDLE_TASK_SCHED_PRIO;
++ INIT_LIST_HEAD(&q->heads[idle->sq_idx]);
++ list_add(&idle->sq_node, &q->heads[idle->sq_idx]);
++}
++
++static inline void
++clear_recorded_preempt_mask(int pr, int low, int high, int cpu)
++{
++ if (low < pr && pr <= high)
++ cpumask_clear_cpu(cpu, sched_preempt_mask + SCHED_QUEUE_BITS - pr);
++}
++
++static inline void
++set_recorded_preempt_mask(int pr, int low, int high, int cpu)
++{
++ if (low < pr && pr <= high)
++ cpumask_set_cpu(cpu, sched_preempt_mask + SCHED_QUEUE_BITS - pr);
++}
++
++static atomic_t sched_prio_record = ATOMIC_INIT(0);
++
++/* water mark related functions */
++static inline void update_sched_preempt_mask(struct rq *rq)
++{
++ unsigned long prio = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
++ unsigned long last_prio = rq->prio;
++ int cpu, pr;
++
++ if (prio == last_prio)
++ return;
++
++ rq->prio = prio;
++ cpu = cpu_of(rq);
++ pr = atomic_read(&sched_prio_record);
++
++ if (prio < last_prio) {
++ if (IDLE_TASK_SCHED_PRIO == last_prio) {
++#ifdef CONFIG_SCHED_SMT
++ if (static_branch_likely(&sched_smt_present))
++ cpumask_andnot(&sched_sg_idle_mask,
++ &sched_sg_idle_mask, cpu_smt_mask(cpu));
++#endif
++ cpumask_clear_cpu(cpu, sched_idle_mask);
++ last_prio -= 2;
++ }
++ clear_recorded_preempt_mask(pr, prio, last_prio, cpu);
++
++ return;
++ }
++ /* last_prio < prio */
++ if (IDLE_TASK_SCHED_PRIO == prio) {
++#ifdef CONFIG_SCHED_SMT
++ if (static_branch_likely(&sched_smt_present) &&
++ cpumask_intersects(cpu_smt_mask(cpu), sched_idle_mask))
++ cpumask_or(&sched_sg_idle_mask,
++ &sched_sg_idle_mask, cpu_smt_mask(cpu));
++#endif
++ cpumask_set_cpu(cpu, sched_idle_mask);
++ prio -= 2;
++ }
++ set_recorded_preempt_mask(pr, last_prio, prio, cpu);
++}
++
++/*
++ * This routine assumes that the idle task is always in the queue
++ */
++static inline struct task_struct *sched_rq_first_task(struct rq *rq)
++{
++ const struct list_head *head = &rq->queue.heads[sched_prio2idx(rq->prio, rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++}
++
++static inline struct task_struct *
++sched_rq_next_task(struct task_struct *p, struct rq *rq)
++{
++ unsigned long idx = p->sq_idx;
++ struct list_head *head = &rq->queue.heads[idx];
++
++ if (list_is_last(&p->sq_node, head)) {
++ idx = find_next_bit(rq->queue.bitmap, SCHED_QUEUE_BITS,
++ sched_idx2prio(idx, rq) + 1);
++ head = &rq->queue.heads[sched_prio2idx(idx, rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++ }
++
++ return list_next_entry(p, sq_node);
++}
++
++static inline struct task_struct *rq_runnable_task(struct rq *rq)
++{
++ struct task_struct *next = sched_rq_first_task(rq);
++
++ if (unlikely(next == rq->skip))
++ next = sched_rq_next_task(next, rq);
++
++ return next;
++}
++
++/*
++ * Serialization rules:
++ *
++ * Lock order:
++ *
++ * p->pi_lock
++ * rq->lock
++ * hrtimer_cpu_base->lock (hrtimer_start() for bandwidth controls)
++ *
++ * rq1->lock
++ * rq2->lock where: rq1 < rq2
++ *
++ * Regular state:
++ *
++ * Normal scheduling state is serialized by rq->lock. __schedule() takes the
++ * local CPU's rq->lock, it optionally removes the task from the runqueue and
++ * always looks at the local rq data structures to find the most eligible task
++ * to run next.
++ *
++ * Task enqueue is also under rq->lock, possibly taken from another CPU.
++ * Wakeups from another LLC domain might use an IPI to transfer the enqueue to
++ * the local CPU to avoid bouncing the runqueue state around [ see
++ * ttwu_queue_wakelist() ]
++ *
++ * Task wakeup, specifically wakeups that involve migration, are horribly
++ * complicated to avoid having to take two rq->locks.
++ *
++ * Special state:
++ *
++ * System-calls and anything external will use task_rq_lock() which acquires
++ * both p->pi_lock and rq->lock. As a consequence the state they change is
++ * stable while holding either lock:
++ *
++ * - sched_setaffinity()/
++ * set_cpus_allowed_ptr(): p->cpus_ptr, p->nr_cpus_allowed
++ * - set_user_nice(): p->se.load, p->*prio
++ * - __sched_setscheduler(): p->sched_class, p->policy, p->*prio,
++ * p->se.load, p->rt_priority,
++ * p->dl.dl_{runtime, deadline, period, flags, bw, density}
++ * - sched_setnuma(): p->numa_preferred_nid
++ * - sched_move_task(): p->sched_task_group
++ * - uclamp_update_active() p->uclamp*
++ *
++ * p->state <- TASK_*:
++ *
++ * is changed locklessly using set_current_state(), __set_current_state() or
++ * set_special_state(), see their respective comments, or by
++ * try_to_wake_up(). This latter uses p->pi_lock to serialize against
++ * concurrent self.
++ *
++ * p->on_rq <- { 0, 1 = TASK_ON_RQ_QUEUED, 2 = TASK_ON_RQ_MIGRATING }:
++ *
++ * is set by activate_task() and cleared by deactivate_task(), under
++ * rq->lock. Non-zero indicates the task is runnable, the special
++ * ON_RQ_MIGRATING state is used for migration without holding both
++ * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock().
++ *
++ * p->on_cpu <- { 0, 1 }:
++ *
++ * is set by prepare_task() and cleared by finish_task() such that it will be
++ * set before p is scheduled-in and cleared after p is scheduled-out, both
++ * under rq->lock. Non-zero indicates the task is running on its CPU.
++ *
++ * [ The astute reader will observe that it is possible for two tasks on one
++ * CPU to have ->on_cpu = 1 at the same time. ]
++ *
++ * task_cpu(p): is changed by set_task_cpu(), the rules are:
++ *
++ * - Don't call set_task_cpu() on a blocked task:
++ *
++ * We don't care what CPU we're not running on, this simplifies hotplug,
++ * the CPU assignment of blocked tasks isn't required to be valid.
++ *
++ * - for try_to_wake_up(), called under p->pi_lock:
++ *
++ * This allows try_to_wake_up() to only take one rq->lock, see its comment.
++ *
++ * - for migration called under rq->lock:
++ * [ see task_on_rq_migrating() in task_rq_lock() ]
++ *
++ * o move_queued_task()
++ * o detach_task()
++ *
++ * - for migration called under double_rq_lock():
++ *
++ * o __migrate_swap_task()
++ * o push_rt_task() / pull_rt_task()
++ * o push_dl_task() / pull_dl_task()
++ * o dl_task_offline_migration()
++ *
++ */
++
++/*
++ * Context: p->pi_lock
++ */
++static inline struct rq
++*__task_access_lock(struct task_struct *p, raw_spinlock_t **plock)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock(&rq->lock);
++ if (likely((p->on_cpu || task_on_rq_queued(p))
++ && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ *plock = NULL;
++ return rq;
++ }
++ }
++}
++
++static inline void
++__task_access_unlock(struct task_struct *p, raw_spinlock_t *lock)
++{
++ if (NULL != lock)
++ raw_spin_unlock(lock);
++}
++
++static inline struct rq
++*task_access_lock_irqsave(struct task_struct *p, raw_spinlock_t **plock,
++ unsigned long *flags)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock_irqsave(&rq->lock, *flags);
++ if (likely((p->on_cpu || task_on_rq_queued(p))
++ && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, *flags);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ raw_spin_lock_irqsave(&p->pi_lock, *flags);
++ if (likely(!p->on_cpu && !p->on_rq &&
++ rq == task_rq(p))) {
++ *plock = &p->pi_lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
++ }
++ }
++}
++
++static inline void
++task_access_unlock_irqrestore(struct task_struct *p, raw_spinlock_t *lock,
++ unsigned long *flags)
++{
++ raw_spin_unlock_irqrestore(lock, *flags);
++}
++
++/*
++ * __task_rq_lock - lock the rq @p resides on.
++ */
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ lockdep_assert_held(&p->pi_lock);
++
++ for (;;) {
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
++ return rq;
++ raw_spin_unlock(&rq->lock);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++/*
++ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
++ */
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ for (;;) {
++ raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ /*
++ * move_queued_task() task_rq_lock()
++ *
++ * ACQUIRE (rq->lock)
++ * [S] ->on_rq = MIGRATING [L] rq = task_rq()
++ * WMB (__set_task_cpu()) ACQUIRE (rq->lock);
++ * [S] ->cpu = new_cpu [L] task_rq()
++ * [L] ->on_rq
++ * RELEASE (rq->lock)
++ *
++ * If we observe the old CPU in task_rq_lock(), the acquire of
++ * the old rq->lock will fully serialize against the stores.
++ *
++ * If we observe the new CPU in task_rq_lock(), the address
++ * dependency headed by '[L] rq = task_rq()' and the acquire
++ * will pair with the WMB to ensure we then also see migrating.
++ */
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++static inline void
++rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irqsave(&rq->lock, rf->flags);
++}
++
++static inline void
++rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
++}
++
++void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
++{
++ raw_spinlock_t *lock;
++
++ /* Matches synchronize_rcu() in __sched_core_enable() */
++ preempt_disable();
++
++ for (;;) {
++ lock = __rq_lockp(rq);
++ raw_spin_lock_nested(lock, subclass);
++ if (likely(lock == __rq_lockp(rq))) {
++ /* preempt_count *MUST* be > 1 */
++ preempt_enable_no_resched();
++ return;
++ }
++ raw_spin_unlock(lock);
++ }
++}
++
++void raw_spin_rq_unlock(struct rq *rq)
++{
++ raw_spin_unlock(rq_lockp(rq));
++}
++
++/*
++ * RQ-clock updating methods:
++ */
++
++static void update_rq_clock_task(struct rq *rq, s64 delta)
++{
++/*
++ * In theory, the compiler should just see 0 here, and optimize out the call
++ * to sched_rt_avg_update. But I don't trust it...
++ */
++ s64 __maybe_unused steal = 0, irq_delta = 0;
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
++
++ /*
++ * Since irq_time is only updated on {soft,}irq_exit, we might run into
++ * this case when a previous update_rq_clock() happened inside a
++ * {soft,}irq region.
++ *
++ * When this happens, we stop ->clock_task and only update the
++ * prev_irq_time stamp to account for the part that fit, so that a next
++ * update will consume the rest. This ensures ->clock_task is
++ * monotonic.
++ *
++ * It does however cause some slight miss-attribution of {soft,}irq
++ * time, a more accurate solution would be to update the irq_time using
++ * the current rq->clock timestamp, except that would require using
++ * atomic ops.
++ */
++ if (irq_delta > delta)
++ irq_delta = delta;
++
++ rq->prev_irq_time += irq_delta;
++ delta -= irq_delta;
++ delayacct_irq(rq->curr, irq_delta);
++#endif
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++	if (static_key_false((&paravirt_steal_rq_enabled))) {
++ steal = paravirt_steal_clock(cpu_of(rq));
++ steal -= rq->prev_steal_time_rq;
++
++ if (unlikely(steal > delta))
++ steal = delta;
++
++ rq->prev_steal_time_rq += steal;
++ delta -= steal;
++ }
++#endif
++
++ rq->clock_task += delta;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ if ((irq_delta + steal))
++ update_irq_load_avg(rq, irq_delta + steal);
++#endif
++}
++
++static inline void update_rq_clock(struct rq *rq)
++{
++ s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
++
++ if (unlikely(delta <= 0))
++ return;
++ rq->clock += delta;
++ update_rq_time_edge(rq);
++ update_rq_clock_task(rq, delta);
++}
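A rough worked example of the clamping above, with made-up numbers: if delta is 1,000,000 ns of wall clock since the last update, of which irq_delta accounts for 200,000 ns and steal for 50,000 ns, then clock_task only advances by 1,000,000 - 200,000 - 50,000 = 750,000 ns; an irq_delta or steal value larger than the remaining delta is clamped to it, which is what keeps clock_task monotonic.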
++
++/*
++ * RQ Load update routine
++ */
++#define RQ_LOAD_HISTORY_BITS (sizeof(s32) * 8ULL)
++#define RQ_UTIL_SHIFT (8)
++#define RQ_LOAD_HISTORY_TO_UTIL(l) (((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)
++
++#define LOAD_BLOCK(t) ((t) >> 17)
++#define LOAD_HALF_BLOCK(t) ((t) >> 16)
++#define BLOCK_MASK(t) ((t) & ((0x01 << 18) - 1))
++#define LOAD_BLOCK_BIT(b) (1UL << (RQ_LOAD_HISTORY_BITS - 1 - (b)))
++#define CURRENT_LOAD_BIT LOAD_BLOCK_BIT(0)
++
++static inline void rq_load_update(struct rq *rq)
++{
++ u64 time = rq->clock;
++ u64 delta = min(LOAD_BLOCK(time) - LOAD_BLOCK(rq->load_stamp),
++ RQ_LOAD_HISTORY_BITS - 1);
++ u64 prev = !!(rq->load_history & CURRENT_LOAD_BIT);
++ u64 curr = !!rq->nr_running;
++
++ if (delta) {
++ rq->load_history = rq->load_history >> delta;
++
++ if (delta < RQ_UTIL_SHIFT) {
++ rq->load_block += (~BLOCK_MASK(rq->load_stamp)) * prev;
++ if (!!LOAD_HALF_BLOCK(rq->load_block) ^ curr)
++ rq->load_history ^= LOAD_BLOCK_BIT(delta);
++ }
++
++ rq->load_block = BLOCK_MASK(time) * prev;
++ } else {
++ rq->load_block += (time - rq->load_stamp) * prev;
++ }
++ if (prev ^ curr)
++ rq->load_history ^= CURRENT_LOAD_BIT;
++ rq->load_stamp = time;
++}
++
++unsigned long rq_load_util(struct rq *rq, unsigned long max)
++{
++ return RQ_LOAD_HISTORY_TO_UTIL(rq->load_history) * (max >> RQ_UTIL_SHIFT);
++}
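To make the bit arithmetic above concrete, here is a standalone userspace sketch (not kernel code): the runqueue keeps a 32-bit busy/idle history sampled in blocks of 2^17 ns (about 131 us, per LOAD_BLOCK above), and rq_load_util() turns eight of those history bits into a utilization value scaled against "max"; the 1024 below is only a stand-in for arch_scale_cpu_capacity().

/* Standalone sketch of the load-history to utilization conversion above. */
#include <stdio.h>
#include <stdint.h>

#define RQ_LOAD_HISTORY_BITS	(sizeof(int32_t) * 8ULL)
#define RQ_UTIL_SHIFT		(8)
#define RQ_LOAD_HISTORY_TO_UTIL(l) \
	(((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)

static unsigned long demo_rq_load_util(uint32_t load_history, unsigned long max)
{
	return RQ_LOAD_HISTORY_TO_UTIL(load_history) * (max >> RQ_UTIL_SHIFT);
}

int main(void)
{
	printf("all busy : %lu\n", demo_rq_load_util(0xffffffffu, 1024));	/* 1020, ~full */
	printf("half busy: %lu\n", demo_rq_load_util(0x40000000u, 1024));	/* 512 */
	printf("all idle : %lu\n", demo_rq_load_util(0x00000000u, 1024));	/* 0 */
	return 0;
}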
++
++#ifdef CONFIG_SMP
++unsigned long sched_cpu_util(int cpu)
++{
++ return rq_load_util(cpu_rq(cpu), arch_scale_cpu_capacity(cpu));
++}
++#endif /* CONFIG_SMP */
++
++#ifdef CONFIG_CPU_FREQ
++/**
++ * cpufreq_update_util - Take a note about CPU utilization changes.
++ * @rq: Runqueue to carry out the update for.
++ * @flags: Update reason flags.
++ *
++ * This function is called by the scheduler on the CPU whose utilization is
++ * being updated.
++ *
++ * It can only be called from RCU-sched read-side critical sections.
++ *
++ * The way cpufreq is currently arranged requires it to evaluate the CPU
++ * performance state (frequency/voltage) on a regular basis to prevent it from
++ * being stuck in a completely inadequate performance level for too long.
++ * That is not guaranteed to happen if the updates are only triggered from CFS
++ * and DL, though, because they may not be coming in if only RT tasks are
++ * active all the time (or there are RT tasks only).
++ *
++ * As a workaround for that issue, this function is called periodically by the
++ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
++ * but that really is a band-aid. Going forward it should be replaced with
++ * solutions targeted more specifically at RT tasks.
++ */
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++ struct update_util_data *data;
++
++#ifdef CONFIG_SMP
++ rq_load_update(rq);
++#endif
++ data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
++ cpu_of(rq)));
++ if (data)
++ data->func(data, rq_clock(rq), flags);
++}
++#else
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++#ifdef CONFIG_SMP
++ rq_load_update(rq);
++#endif
++}
++#endif /* CONFIG_CPU_FREQ */
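For context, data->func() above is whatever callback a cpufreq governor registered for that CPU. A hedged sketch of the registration side follows; the hook API names come from include/linux/sched/cpufreq.h, while my_util_handler() and the my_governor_* helpers are made-up placeholders.

/* Hedged sketch of a governor hooking into the utilization updates above. */
#include <linux/sched/cpufreq.h>
#include <linux/percpu.h>

static DEFINE_PER_CPU(struct update_util_data, my_update_util);

/* Invoked from the scheduler via data->func() in cpufreq_update_util(). */
static void my_util_handler(struct update_util_data *data, u64 time,
			    unsigned int flags)
{
	/* react to the utilization change, e.g. schedule a frequency re-evaluation */
}

static void my_governor_start(int cpu)
{
	cpufreq_add_update_util_hook(cpu, &per_cpu(my_update_util, cpu),
				     my_util_handler);
}

static void my_governor_stop(int cpu)
{
	cpufreq_remove_update_util_hook(cpu);
}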
++
++#ifdef CONFIG_NO_HZ_FULL
++/*
++ * Tick may be needed by tasks in the runqueue depending on their policy and
++ * requirements. If tick is needed, let's send the target an IPI to kick it out
++ * of nohz mode if necessary.
++ */
++static inline void sched_update_tick_dependency(struct rq *rq)
++{
++ int cpu = cpu_of(rq);
++
++ if (!tick_nohz_full_cpu(cpu))
++ return;
++
++ if (rq->nr_running < 2)
++ tick_nohz_dep_clear_cpu(cpu, TICK_DEP_BIT_SCHED);
++ else
++ tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
++}
++#else /* !CONFIG_NO_HZ_FULL */
++static inline void sched_update_tick_dependency(struct rq *rq) { }
++#endif
++
++bool sched_task_on_rq(struct task_struct *p)
++{
++ return task_on_rq_queued(p);
++}
++
++unsigned long get_wchan(struct task_struct *p)
++{
++ unsigned long ip = 0;
++ unsigned int state;
++
++ if (!p || p == current)
++ return 0;
++
++ /* Only get wchan if task is blocked and we can keep it that way. */
++ raw_spin_lock_irq(&p->pi_lock);
++ state = READ_ONCE(p->__state);
++ smp_rmb(); /* see try_to_wake_up() */
++ if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
++ ip = __get_wchan(p);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ return ip;
++}
++
++/*
++ * Add/Remove/Requeue task to/from the runqueue routines
++ * Context: rq->lock
++ */
++#define __SCHED_DEQUEUE_TASK(p, rq, flags, func) \
++ sched_info_dequeue(rq, p); \
++ \
++ list_del(&p->sq_node); \
++ if (list_empty(&rq->queue.heads[p->sq_idx])) { \
++ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap); \
++ func; \
++ }
++
++#define __SCHED_ENQUEUE_TASK(p, rq, flags) \
++ sched_info_enqueue(rq, p); \
++ \
++ p->sq_idx = task_sched_prio_idx(p, rq); \
++ list_add_tail(&p->sq_node, &rq->queue.heads[p->sq_idx]); \
++ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++
++static inline void dequeue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: dequeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: dequeue task reside on cpu%d from cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++#endif
++
++ __SCHED_DEQUEUE_TASK(p, rq, flags, update_sched_preempt_mask(rq));
++ --rq->nr_running;
++#ifdef CONFIG_SMP
++ if (1 == rq->nr_running)
++ cpumask_clear_cpu(cpu_of(rq), &sched_rq_pending_mask);
++#endif
++
++ sched_update_tick_dependency(rq);
++}
++
++static inline void enqueue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: enqueue(%d) %px %d\n", cpu_of(rq), p, p->prio);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: enqueue task reside on cpu%d to cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++#endif
++
++ __SCHED_ENQUEUE_TASK(p, rq, flags);
++ update_sched_preempt_mask(rq);
++ ++rq->nr_running;
++#ifdef CONFIG_SMP
++ if (2 == rq->nr_running)
++ cpumask_set_cpu(cpu_of(rq), &sched_rq_pending_mask);
++#endif
++
++ sched_update_tick_dependency(rq);
++}
++
++static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++ /*printk(KERN_INFO "sched: requeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: cpu[%d] requeue task reside on cpu%d\n",
++ cpu_of(rq), task_cpu(p));
++#endif
++
++ list_del(&p->sq_node);
++ list_add_tail(&p->sq_node, &rq->queue.heads[idx]);
++ if (idx != p->sq_idx) {
++ if (list_empty(&rq->queue.heads[p->sq_idx]))
++ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++ p->sq_idx = idx;
++ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++ update_sched_preempt_mask(rq);
++ }
++}
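The enqueue, dequeue and requeue helpers above all maintain the same invariant: one list head per priority level plus a bitmap whose set bits mark the non-empty levels, so picking the next task is a find-first-bit over the bitmap. Below is a standalone userspace sketch of that structure (simplified: 64 fixed levels, LIFO insertion, none of the idx/prio remapping done by sched_prio2idx()/sched_idx2prio()).

/* Standalone sketch of the bitmap + per-priority list queue used above. */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define LEVELS 64

struct demo_task {
	int id;
	struct demo_task *next;
};

struct demo_queue {
	uint64_t bitmap;			/* bit n set => heads[n] is non-empty */
	struct demo_task *heads[LEVELS];
};

static void enqueue(struct demo_queue *q, struct demo_task *t, int prio)
{
	t->next = q->heads[prio];		/* the real code appends for FIFO order */
	q->heads[prio] = t;
	q->bitmap |= 1ULL << prio;
}

static struct demo_task *pick_next(struct demo_queue *q)
{
	struct demo_task *t;
	int prio;

	if (!q->bitmap)
		return NULL;
	prio = __builtin_ctzll(q->bitmap);	/* lowest set bit = highest priority */
	t = q->heads[prio];
	q->heads[prio] = t->next;
	if (!q->heads[prio])
		q->bitmap &= ~(1ULL << prio);	/* level drained, clear its bit */
	return t;
}

int main(void)
{
	struct demo_queue q = { 0 };
	struct demo_task a = { .id = 1 }, b = { .id = 2 };

	enqueue(&q, &a, 10);
	enqueue(&q, &b, 3);
	printf("next: %d\n", pick_next(&q)->id);	/* 2, priority 3 beats 10 */
	printf("next: %d\n", pick_next(&q)->id);	/* 1 */
	return 0;
}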
++
++/*
++ * cmpxchg based fetch_or, macro so it works for different integer types
++ */
++#define fetch_or(ptr, mask) \
++ ({ \
++ typeof(ptr) _ptr = (ptr); \
++ typeof(mask) _mask = (mask); \
++ typeof(*_ptr) _val = *_ptr; \
++ \
++ do { \
++ } while (!try_cmpxchg(_ptr, &_val, _val | _mask)); \
++ _val; \
++})
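The point of the macro is that it returns the *previous* flag word, which is how set_nr_and_not_polling() below can tell whether TIF_POLLING_NRFLAG was set at the instant TIF_NEED_RESCHED went in. A standalone C11 sketch of the same semantics follows; the flag values are made up, and C11 also offers atomic_fetch_or() directly, the kernel macro exists so it works across the different thread_info flag types.

/* Standalone sketch of the cmpxchg-based fetch_or() semantics above. */
#include <stdio.h>
#include <stdatomic.h>

#define DEMO_NEED_RESCHED	(1u << 0)
#define DEMO_POLLING_NRFLAG	(1u << 1)

static unsigned int demo_fetch_or(atomic_uint *ptr, unsigned int mask)
{
	unsigned int val = atomic_load(ptr);

	/* retry until the CAS lands, mirroring the try_cmpxchg() loop above */
	while (!atomic_compare_exchange_weak(ptr, &val, val | mask))
		;
	return val;					/* the value *before* the OR */
}

int main(void)
{
	atomic_uint flags = DEMO_POLLING_NRFLAG;	/* the idle task is polling */
	unsigned int old = demo_fetch_or(&flags, DEMO_NEED_RESCHED);

	/* the polling bit was already set, so no reschedule IPI would be needed */
	printf("IPI needed: %s\n", (old & DEMO_POLLING_NRFLAG) ? "no" : "yes");
	return 0;
}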
++
++#if defined(CONFIG_SMP) && defined(TIF_POLLING_NRFLAG)
++/*
++ * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
++ * this avoids any races wrt polling state changes and thereby avoids
++ * spurious IPIs.
++ */
++static inline bool set_nr_and_not_polling(struct task_struct *p)
++{
++ struct thread_info *ti = task_thread_info(p);
++ return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
++}
++
++/*
++ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
++ *
++ * If this returns true, then the idle task promises to call
++ * sched_ttwu_pending() and reschedule soon.
++ */
++static bool set_nr_if_polling(struct task_struct *p)
++{
++ struct thread_info *ti = task_thread_info(p);
++ typeof(ti->flags) val = READ_ONCE(ti->flags);
++
++ for (;;) {
++ if (!(val & _TIF_POLLING_NRFLAG))
++ return false;
++ if (val & _TIF_NEED_RESCHED)
++ return true;
++ if (try_cmpxchg(&ti->flags, &val, val | _TIF_NEED_RESCHED))
++ break;
++ }
++ return true;
++}
++
++#else
++static inline bool set_nr_and_not_polling(struct task_struct *p)
++{
++ set_tsk_need_resched(p);
++ return true;
++}
++
++#ifdef CONFIG_SMP
++static inline bool set_nr_if_polling(struct task_struct *p)
++{
++ return false;
++}
++#endif
++#endif
++
++static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ struct wake_q_node *node = &task->wake_q;
++
++ /*
++ * Atomically grab the task, if ->wake_q is !nil already it means
++ * it's already queued (either by us or someone else) and will get the
++ * wakeup due to that.
++ *
++ * In order to ensure that a pending wakeup will observe our pending
++ * state, even in the failed case, an explicit smp_mb() must be used.
++ */
++ smp_mb__before_atomic();
++ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
++ return false;
++
++ /*
++ * The head is context local, there can be no concurrency.
++ */
++ *head->lastp = node;
++ head->lastp = &node->next;
++ return true;
++}
++
++/**
++ * wake_q_add() - queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ */
++void wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ if (__wake_q_add(head, task))
++ get_task_struct(task);
++}
++
++/**
++ * wake_q_add_safe() - safely queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ *
++ * This function is essentially a task-safe equivalent to wake_q_add(). Callers
++ * that already hold reference to @task can call the 'safe' version and trust
++ * wake_q to do the right thing depending whether or not the @task is already
++ * queued for wakeup.
++ */
++void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
++{
++ if (!__wake_q_add(head, task))
++ put_task_struct(task);
++}
++
++void wake_up_q(struct wake_q_head *head)
++{
++ struct wake_q_node *node = head->first;
++
++ while (node != WAKE_Q_TAIL) {
++ struct task_struct *task;
++
++ task = container_of(node, struct task_struct, wake_q);
++ /* task can safely be re-inserted now: */
++ node = node->next;
++ task->wake_q.next = NULL;
++
++ /*
++ * wake_up_process() executes a full barrier, which pairs with
++ * the queueing in wake_q_add() so as not to miss wakeups.
++ */
++ wake_up_process(task);
++ put_task_struct(task);
++ }
++}
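The usual pattern served by these helpers is to collect wakeups while a lock is held and fire them after it is dropped, so the woken tasks never contend on that lock. A hedged kernel-style sketch follows; struct demo_waiter and demo_wake_all_waiters() are made-up names, only DEFINE_WAKE_Q(), wake_q_add() and wake_up_q() come from the code above and <linux/sched/wake_q.h>.

/* Hedged sketch of the deferred-wakeup pattern built on the helpers above. */
#include <linux/sched/wake_q.h>
#include <linux/spinlock.h>
#include <linux/list.h>

struct demo_waiter {				/* hypothetical waiter record */
	struct task_struct	*task;
	struct list_head	node;
};

static void demo_wake_all_waiters(spinlock_t *lock, struct list_head *waiters)
{
	DEFINE_WAKE_Q(wake_q);
	struct demo_waiter *w;

	spin_lock(lock);
	list_for_each_entry(w, waiters, node)
		wake_q_add(&wake_q, w->task);	/* takes a reference on each task */
	spin_unlock(lock);

	wake_up_q(&wake_q);			/* wakes them and drops the references */
}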
++
++/*
++ * resched_curr - mark rq's current task 'to be rescheduled now'.
++ *
++ * On UP this means the setting of the need_resched flag, on SMP it
++ * might also involve a cross-CPU call to trigger the scheduler on
++ * the target CPU.
++ */
++void resched_curr(struct rq *rq)
++{
++ struct task_struct *curr = rq->curr;
++ int cpu;
++
++ lockdep_assert_held(&rq->lock);
++
++ if (test_tsk_need_resched(curr))
++ return;
++
++ cpu = cpu_of(rq);
++ if (cpu == smp_processor_id()) {
++ set_tsk_need_resched(curr);
++ set_preempt_need_resched();
++ return;
++ }
++
++ if (set_nr_and_not_polling(curr))
++ smp_send_reschedule(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++void resched_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (cpu_online(cpu) || cpu == smp_processor_id())
++ resched_curr(cpu_rq(cpu));
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++#ifdef CONFIG_SMP
++#ifdef CONFIG_NO_HZ_COMMON
++void nohz_balance_enter_idle(int cpu) {}
++
++void select_nohz_load_balancer(int stop_tick) {}
++
++void set_cpu_sd_state_idle(void) {}
++
++/*
++ * In the semi idle case, use the nearest busy CPU for migrating timers
++ * from an idle CPU. This is good for power-savings.
++ *
++ * We don't do similar optimization for completely idle system, as
++ * selecting an idle CPU will add more delays to the timers than intended
++ * (as that CPU's timer base may not be uptodate wrt jiffies etc).
++ */
++int get_nohz_timer_target(void)
++{
++ int i, cpu = smp_processor_id(), default_cpu = -1;
++ struct cpumask *mask;
++ const struct cpumask *hk_mask;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TIMER)) {
++ if (!idle_cpu(cpu))
++ return cpu;
++ default_cpu = cpu;
++ }
++
++ hk_mask = housekeeping_cpumask(HK_TYPE_TIMER);
++
++ for (mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++ mask < per_cpu(sched_cpu_topo_end_mask, cpu); mask++)
++ for_each_cpu_and(i, mask, hk_mask)
++ if (!idle_cpu(i))
++ return i;
++
++ if (default_cpu == -1)
++ default_cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
++ cpu = default_cpu;
++
++ return cpu;
++}
++
++/*
++ * When add_timer_on() enqueues a timer into the timer wheel of an
++ * idle CPU then this timer might expire before the next timer event
++ * which is scheduled to wake up that CPU. In case of a completely
++ * idle system the next event might even be infinite time into the
++ * future. wake_up_idle_cpu() ensures that the CPU is woken up and
++ * leaves the inner idle loop so the newly added timer is taken into
++ * account when the CPU goes back to idle and evaluates the timer
++ * wheel for the next timer event.
++ */
++static inline void wake_up_idle_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (cpu == smp_processor_id())
++ return;
++
++ if (set_nr_and_not_polling(rq->idle))
++ smp_send_reschedule(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++static inline bool wake_up_full_nohz_cpu(int cpu)
++{
++ /*
++ * We just need the target to call irq_exit() and re-evaluate
++ * the next tick. The nohz full kick at least implies that.
++ * If needed we can still optimize that later with an
++ * empty IRQ.
++ */
++ if (cpu_is_offline(cpu))
++ return true; /* Don't try to wake offline CPUs. */
++ if (tick_nohz_full_cpu(cpu)) {
++ if (cpu != smp_processor_id() ||
++ tick_nohz_tick_stopped())
++ tick_nohz_full_kick_cpu(cpu);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_nohz_cpu(int cpu)
++{
++ if (!wake_up_full_nohz_cpu(cpu))
++ wake_up_idle_cpu(cpu);
++}
++
++static void nohz_csd_func(void *info)
++{
++ struct rq *rq = info;
++ int cpu = cpu_of(rq);
++ unsigned int flags;
++
++ /*
++ * Release the rq::nohz_csd.
++ */
++ flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
++ WARN_ON(!(flags & NOHZ_KICK_MASK));
++
++ rq->idle_balance = idle_cpu(cpu);
++ if (rq->idle_balance && !need_resched()) {
++ rq->nohz_idle_balance = flags;
++ raise_softirq_irqoff(SCHED_SOFTIRQ);
++ }
++}
++
++#endif /* CONFIG_NO_HZ_COMMON */
++#endif /* CONFIG_SMP */
++
++static inline void check_preempt_curr(struct rq *rq)
++{
++ if (sched_rq_first_task(rq) != rq->curr)
++ resched_curr(rq);
++}
++
++#ifdef CONFIG_SCHED_HRTICK
++/*
++ * Use HR-timers to deliver accurate preemption points.
++ */
++
++static void hrtick_clear(struct rq *rq)
++{
++ if (hrtimer_active(&rq->hrtick_timer))
++ hrtimer_cancel(&rq->hrtick_timer);
++}
++
++/*
++ * High-resolution timer tick.
++ * Runs from hardirq context with interrupts disabled.
++ */
++static enum hrtimer_restart hrtick(struct hrtimer *timer)
++{
++ struct rq *rq = container_of(timer, struct rq, hrtick_timer);
++
++ WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
++
++ raw_spin_lock(&rq->lock);
++ resched_curr(rq);
++ raw_spin_unlock(&rq->lock);
++
++ return HRTIMER_NORESTART;
++}
++
++/*
++ * Use hrtick when:
++ * - enabled by features
++ * - hrtimer is actually high res
++ */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ /**
++ * Alt schedule FW doesn't support sched_feat yet
++ if (!sched_feat(HRTICK))
++ return 0;
++ */
++ if (!cpu_active(cpu_of(rq)))
++ return 0;
++ return hrtimer_is_hres_active(&rq->hrtick_timer);
++}
++
++#ifdef CONFIG_SMP
++
++static void __hrtick_restart(struct rq *rq)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ ktime_t time = rq->hrtick_time;
++
++ hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
++}
++
++/*
++ * called from hardirq (IPI) context
++ */
++static void __hrtick_start(void *arg)
++{
++ struct rq *rq = arg;
++
++ raw_spin_lock(&rq->lock);
++ __hrtick_restart(rq);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Called to set the hrtick timer state.
++ *
++ * called with rq->lock held and irqs disabled
++ */
++void hrtick_start(struct rq *rq, u64 delay)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ s64 delta;
++
++ /*
++ * Don't schedule slices shorter than 10000ns, that just
++ * doesn't make sense and can cause timer DoS.
++ */
++ delta = max_t(s64, delay, 10000LL);
++
++ rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
++
++ if (rq == this_rq())
++ __hrtick_restart(rq);
++ else
++ smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
++}
++
++#else
++/*
++ * Called to set the hrtick timer state.
++ *
++ * called with rq->lock held and irqs disabled
++ */
++void hrtick_start(struct rq *rq, u64 delay)
++{
++ /*
++ * Don't schedule slices shorter than 10000ns, that just
++ * doesn't make sense. Rely on vruntime for fairness.
++ */
++ delay = max_t(u64, delay, 10000LL);
++ hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
++ HRTIMER_MODE_REL_PINNED_HARD);
++}
++#endif /* CONFIG_SMP */
++
++static void hrtick_rq_init(struct rq *rq)
++{
++#ifdef CONFIG_SMP
++ INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
++#endif
++
++ hrtimer_init(&rq->hrtick_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
++ rq->hrtick_timer.function = hrtick;
++}
++#else /* CONFIG_SCHED_HRTICK */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ return 0;
++}
++
++static inline void hrtick_clear(struct rq *rq)
++{
++}
++
++static inline void hrtick_rq_init(struct rq *rq)
++{
++}
++#endif /* CONFIG_SCHED_HRTICK */
++
++static inline int __normal_prio(int policy, int rt_prio, int static_prio)
++{
++ return rt_policy(policy) ? (MAX_RT_PRIO - 1 - rt_prio) :
++ static_prio + MAX_PRIORITY_ADJ;
++}
++
++/*
++ * Calculate the expected normal priority: i.e. priority
++ * without taking RT-inheritance into account. Might be
++ * boosted by interactivity modifiers. Changes upon fork,
++ * setprio syscalls, and whenever the interactivity
++ * estimator recalculates.
++ */
++static inline int normal_prio(struct task_struct *p)
++{
++ return __normal_prio(p->policy, p->rt_priority, p->static_prio);
++}
++
++/*
++ * Calculate the current priority, i.e. the priority
++ * taken into account by the scheduler. This value might
++ * be boosted by RT tasks as it will be RT if the task got
++ * RT-boosted. If not then it returns p->normal_prio.
++ */
++static int effective_prio(struct task_struct *p)
++{
++ p->normal_prio = normal_prio(p);
++ /*
++ * If we are RT tasks or we were boosted to RT priority,
++ * keep the priority unchanged. Otherwise, update priority
++ * to the normal priority:
++ */
++ if (!rt_prio(p->prio))
++ return p->normal_prio;
++ return p->prio;
++}
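For orientation, with the usual MAX_RT_PRIO of 100: a SCHED_FIFO task with rt_priority 99 gets normal_prio 100 - 1 - 99 = 0 (the highest priority), rt_priority 1 maps to 98, and a SCHED_NORMAL task at nice 0 (static_prio 120) maps to 120 + MAX_PRIORITY_ADJ, where MAX_PRIORITY_ADJ is the interactivity headroom supplied by the selected bmq.h or pds.h header; smaller numbers always mean higher priority.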
++
++/*
++ * activate_task - move a task to the runqueue.
++ *
++ * Context: rq->lock
++ */
++static void activate_task(struct task_struct *p, struct rq *rq)
++{
++ enqueue_task(p, rq, ENQUEUE_WAKEUP);
++ p->on_rq = TASK_ON_RQ_QUEUED;
++
++ /*
++ * If in_iowait is set, the code below may not trigger any cpufreq
++ * utilization updates, so do it here explicitly with the IOWAIT flag
++ * passed.
++ */
++ cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT * p->in_iowait);
++}
++
++/*
++ * deactivate_task - remove a task from the runqueue.
++ *
++ * Context: rq->lock
++ */
++static inline void deactivate_task(struct task_struct *p, struct rq *rq)
++{
++ dequeue_task(p, rq, DEQUEUE_SLEEP);
++ p->on_rq = 0;
++ cpufreq_update_util(rq, 0);
++}
++
++static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
++{
++#ifdef CONFIG_SMP
++ /*
++ * After ->cpu is set up to a new value, task_access_lock(p, ...) can be
++ * successfully executed on another CPU. We must ensure that updates of
++ * per-task data have been completed by this moment.
++ */
++ smp_wmb();
++
++ WRITE_ONCE(task_thread_info(p)->cpu, cpu);
++#endif
++}
++
++static inline bool is_migration_disabled(struct task_struct *p)
++{
++#ifdef CONFIG_SMP
++ return p->migration_disabled;
++#else
++ return false;
++#endif
++}
++
++#define SCA_CHECK 0x01
++#define SCA_USER 0x08
++
++#ifdef CONFIG_SMP
++
++void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
++{
++#ifdef CONFIG_SCHED_DEBUG
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * We should never call set_task_cpu() on a blocked task,
++ * ttwu() will sort out the placement.
++ */
++ WARN_ON_ONCE(state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq);
++
++#ifdef CONFIG_LOCKDEP
++ /*
++ * The caller should hold either p->pi_lock or rq->lock, when changing
++ * a task's CPU. ->pi_lock for waking tasks, rq->lock for runnable tasks.
++ *
++ * sched_move_task() holds both and thus holding either pins the cgroup,
++ * see task_group().
++ */
++ WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
++ lockdep_is_held(&task_rq(p)->lock)));
++#endif
++ /*
++ * Clearly, migrating tasks to offline CPUs is a fairly daft thing.
++ */
++ WARN_ON_ONCE(!cpu_online(new_cpu));
++
++ WARN_ON_ONCE(is_migration_disabled(p));
++#endif
++ trace_sched_migrate_task(p, new_cpu);
++
++	if (task_cpu(p) != new_cpu) {
++ rseq_migrate(p);
++ perf_event_task_migrate(p);
++ }
++
++ __set_task_cpu(p, new_cpu);
++}
++
++#define MDF_FORCE_ENABLED 0x80
++
++static void
++__do_set_cpus_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ /*
++ * This here violates the locking rules for affinity, since we're only
++ * supposed to change these variables while holding both rq->lock and
++ * p->pi_lock.
++ *
++ * HOWEVER, it magically works, because ttwu() is the only code that
++ * accesses these variables under p->pi_lock and only does so after
++ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule()
++ * before finish_task().
++ *
++ * XXX do further audits, this smells like something putrid.
++ */
++ SCHED_WARN_ON(!p->on_cpu);
++ p->cpus_ptr = new_mask;
++}
++
++void migrate_disable(void)
++{
++ struct task_struct *p = current;
++ int cpu;
++
++ if (p->migration_disabled) {
++ p->migration_disabled++;
++ return;
++ }
++
++ preempt_disable();
++ cpu = smp_processor_id();
++ if (cpumask_test_cpu(cpu, &p->cpus_mask)) {
++ cpu_rq(cpu)->nr_pinned++;
++ p->migration_disabled = 1;
++ p->migration_flags &= ~MDF_FORCE_ENABLED;
++
++ /*
++ * Violates locking rules! see comment in __do_set_cpus_ptr().
++ */
++ if (p->cpus_ptr == &p->cpus_mask)
++ __do_set_cpus_ptr(p, cpumask_of(cpu));
++ }
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(migrate_disable);
++
++void migrate_enable(void)
++{
++ struct task_struct *p = current;
++
++ if (0 == p->migration_disabled)
++ return;
++
++ if (p->migration_disabled > 1) {
++ p->migration_disabled--;
++ return;
++ }
++
++ if (WARN_ON_ONCE(!p->migration_disabled))
++ return;
++
++ /*
++ * Ensure stop_task runs either before or after this, and that
++ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
++ */
++ preempt_disable();
++ /*
++ * Assumption: current should be running on allowed cpu
++ */
++ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &p->cpus_mask));
++ if (p->cpus_ptr != &p->cpus_mask)
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ /*
++ * Mustn't clear migration_disabled() until cpus_ptr points back at the
++ * regular cpus_mask, otherwise things that race (eg.
++ * select_fallback_rq) get confused.
++ */
++ barrier();
++ p->migration_disabled = 0;
++ this_rq()->nr_pinned--;
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(migrate_enable);
++
++static inline bool rq_has_pinned_tasks(struct rq *rq)
++{
++ return rq->nr_pinned;
++}
++
++/*
++ * Per-CPU kthreads are allowed to run on !active && online CPUs, see
++ * __set_cpus_allowed_ptr() and select_fallback_rq().
++ */
++static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
++{
++ /* When not in the task's cpumask, no point in looking further. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /* migrate_disabled() must be allowed to finish. */
++ if (is_migration_disabled(p))
++ return cpu_online(cpu);
++
++ /* Non kernel threads are not allowed during either online or offline. */
++ if (!(p->flags & PF_KTHREAD))
++ return cpu_active(cpu) && task_cpu_possible(cpu, p);
++
++ /* KTHREAD_IS_PER_CPU is always allowed. */
++ if (kthread_is_per_cpu(p))
++ return cpu_online(cpu);
++
++ /* Regular kernel threads don't get to stay during offline. */
++ if (cpu_dying(cpu))
++ return false;
++
++ /* But are allowed during online. */
++ return cpu_online(cpu);
++}
++
++/*
++ * This is how migration works:
++ *
++ * 1) we invoke migration_cpu_stop() on the target CPU using
++ * stop_one_cpu().
++ * 2) stopper starts to run (implicitly forcing the migrated thread
++ * off the CPU)
++ * 3) it checks whether the migrated task is still in the wrong runqueue.
++ * 4) if it's in the wrong runqueue then the migration thread removes
++ * it and puts it into the right queue.
++ * 5) stopper completes and stop_one_cpu() returns and the migration
++ * is done.
++ */
++
++/*
++ * move_queued_task - move a queued task to new rq.
++ *
++ * Returns (locked) new rq. Old rq's lock is released.
++ */
++static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int
++ new_cpu)
++{
++ int src_cpu;
++
++ lockdep_assert_held(&rq->lock);
++
++ src_cpu = cpu_of(rq);
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
++ dequeue_task(p, rq, 0);
++ set_task_cpu(p, new_cpu);
++ raw_spin_unlock(&rq->lock);
++
++ rq = cpu_rq(new_cpu);
++
++ raw_spin_lock(&rq->lock);
++ WARN_ON_ONCE(task_cpu(p) != new_cpu);
++
++ sched_mm_cid_migrate_to(rq, p, src_cpu);
++
++ sched_task_sanity_check(p, rq);
++ enqueue_task(p, rq, 0);
++ p->on_rq = TASK_ON_RQ_QUEUED;
++ check_preempt_curr(rq);
++
++ return rq;
++}
++
++struct migration_arg {
++ struct task_struct *task;
++ int dest_cpu;
++};
++
++/*
++ * Move (not current) task off this CPU, onto the destination CPU. We're doing
++ * this because either it can't run here any more (set_cpus_allowed()
++ * away from this CPU, or CPU going down), or because we're
++ * attempting to rebalance this task on exec (sched_exec).
++ *
++ * So we race with normal scheduler movements, but that's OK, as long
++ * as the task is no longer on this CPU.
++ */
++static struct rq *__migrate_task(struct rq *rq, struct task_struct *p, int
++ dest_cpu)
++{
++ /* Affinity changed (again). */
++ if (!is_cpu_allowed(p, dest_cpu))
++ return rq;
++
++ update_rq_clock(rq);
++ return move_queued_task(rq, p, dest_cpu);
++}
++
++/*
++ * migration_cpu_stop - this will be executed by a highprio stopper thread
++ * and performs thread migration by bumping thread off CPU then
++ * 'pushing' onto another runqueue.
++ */
++static int migration_cpu_stop(void *data)
++{
++ struct migration_arg *arg = data;
++ struct task_struct *p = arg->task;
++ struct rq *rq = this_rq();
++ unsigned long flags;
++
++ /*
++ * The original target CPU might have gone down and we might
++ * be on another CPU but it doesn't matter.
++ */
++ local_irq_save(flags);
++ /*
++ * We need to explicitly wake pending tasks before running
++ * __migrate_task() such that we will not miss enforcing cpus_ptr
++ * during wakeups, see set_cpus_allowed_ptr()'s TASK_WAKING test.
++ */
++ flush_smp_call_function_queue();
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++ /*
++ * If task_rq(p) != rq, it cannot be migrated here, because we're
++ * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
++ * we're holding p->pi_lock.
++ */
++ if (task_rq(p) == rq && task_on_rq_queued(p))
++ rq = __migrate_task(rq, p, arg->dest_cpu);
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++static inline void
++set_cpus_allowed_common(struct task_struct *p, struct affinity_context *ctx)
++{
++ cpumask_copy(&p->cpus_mask, ctx->new_mask);
++ p->nr_cpus_allowed = cpumask_weight(ctx->new_mask);
++
++ /*
++ * Swap in a new user_cpus_ptr if SCA_USER flag set
++ */
++ if (ctx->flags & SCA_USER)
++ swap(p->user_cpus_ptr, ctx->user_mask);
++}
++
++static void
++__do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
++{
++ lockdep_assert_held(&p->pi_lock);
++ set_cpus_allowed_common(p, ctx);
++}
++
++/*
++ * Used for kthread_bind() and select_fallback_rq(), in both cases the user
++ * affinity (if any) should be destroyed too.
++ */
++void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .user_mask = NULL,
++ .flags = SCA_USER, /* clear the user requested mask */
++ };
++ union cpumask_rcuhead {
++ cpumask_t cpumask;
++ struct rcu_head rcu;
++ };
++
++ __do_set_cpus_allowed(p, &ac);
++
++ /*
++ * Because this is called with p->pi_lock held, it is not possible
++ * to use kfree() here (when PREEMPT_RT=y), therefore punt to using
++ * kfree_rcu().
++ */
++ kfree_rcu((union cpumask_rcuhead *)ac.user_mask, rcu);
++}
++
++static cpumask_t *alloc_user_cpus_ptr(int node)
++{
++ /*
++ * See do_set_cpus_allowed() above for the rcu_head usage.
++ */
++ int size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
++
++ return kmalloc_node(size, GFP_KERNEL, node);
++}
++
++int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
++ int node)
++{
++ cpumask_t *user_mask;
++ unsigned long flags;
++
++ /*
++ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
++ * may differ by now due to racing.
++ */
++ dst->user_cpus_ptr = NULL;
++
++ /*
++ * This check is racy and losing the race is a valid situation.
++ * It is not worth the extra overhead of taking the pi_lock on
++ * every fork/clone.
++ */
++ if (data_race(!src->user_cpus_ptr))
++ return 0;
++
++ user_mask = alloc_user_cpus_ptr(node);
++ if (!user_mask)
++ return -ENOMEM;
++
++ /*
++ * Use pi_lock to protect content of user_cpus_ptr
++ *
++ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
++ * do_set_cpus_allowed().
++ */
++ raw_spin_lock_irqsave(&src->pi_lock, flags);
++ if (src->user_cpus_ptr) {
++ swap(dst->user_cpus_ptr, user_mask);
++ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
++ }
++ raw_spin_unlock_irqrestore(&src->pi_lock, flags);
++
++ if (unlikely(user_mask))
++ kfree(user_mask);
++
++ return 0;
++}
++
++static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
++{
++ struct cpumask *user_mask = NULL;
++
++ swap(p->user_cpus_ptr, user_mask);
++
++ return user_mask;
++}
++
++void release_user_cpus_ptr(struct task_struct *p)
++{
++ kfree(clear_user_cpus_ptr(p));
++}
++
++#endif
++
++/**
++ * task_curr - is this task currently executing on a CPU?
++ * @p: the task in question.
++ *
++ * Return: 1 if the task is currently executing. 0 otherwise.
++ */
++inline int task_curr(const struct task_struct *p)
++{
++ return cpu_curr(task_cpu(p)) == p;
++}
++
++#ifdef CONFIG_SMP
++/*
++ * wait_task_inactive - wait for a thread to unschedule.
++ *
++ * Wait for the thread to block in any of the states set in @match_state.
++ * If it changes, i.e. @p might have woken up, then return zero. When we
++ * succeed in waiting for @p to be off its CPU, we return a positive number
++ * (its total switch count). If a second call a short while later returns the
++ * same number, the caller can be sure that @p has remained unscheduled the
++ * whole time.
++ *
++ * The caller must ensure that the task *will* unschedule sometime soon,
++ * else this function might spin for a *long* time. This function can't
++ * be called with interrupts off, or it may introduce deadlock with
++ * smp_call_function() if an IPI is sent by the same process we are
++ * waiting to become inactive.
++ */
++unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
++{
++ unsigned long flags;
++ bool running, on_rq;
++ unsigned long ncsw;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ for (;;) {
++ rq = task_rq(p);
++
++ /*
++ * If the task is actively running on another CPU
++ * still, just relax and busy-wait without holding
++ * any locks.
++ *
++ * NOTE! Since we don't hold any locks, it's not
++ * even sure that "rq" stays as the right runqueue!
++ * But we don't care, since this will return false
++ * if the runqueue has changed and p is actually now
++ * running somewhere else!
++ */
++ while (task_on_cpu(p) && p == rq->curr) {
++ if (!(READ_ONCE(p->__state) & match_state))
++ return 0;
++ cpu_relax();
++ }
++
++ /*
++ * Ok, time to look more closely! We need the rq
++ * lock now, to be *sure*. If we're wrong, we'll
++ * just go back and repeat.
++ */
++ task_access_lock_irqsave(p, &lock, &flags);
++ trace_sched_wait_task(p);
++ running = task_on_cpu(p);
++ on_rq = p->on_rq;
++ ncsw = 0;
++ if (READ_ONCE(p->__state) & match_state)
++ ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ /*
++ * If it changed from the expected state, bail out now.
++ */
++ if (unlikely(!ncsw))
++ break;
++
++ /*
++ * Was it really running after all now that we
++ * checked with the proper locks actually held?
++ *
++ * Oops. Go back and try again..
++ */
++ if (unlikely(running)) {
++ cpu_relax();
++ continue;
++ }
++
++ /*
++ * It's not enough that it's not actively running,
++ * it must be off the runqueue _entirely_, and not
++ * preempted!
++ *
++ * So if it was still runnable (but just not actively
++ * running right now), it's preempted, and we should
++ * yield - it could be a while.
++ */
++ if (unlikely(on_rq)) {
++ ktime_t to = NSEC_PER_SEC / HZ;
++
++ set_current_state(TASK_UNINTERRUPTIBLE);
++ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
++ continue;
++ }
++
++ /*
++ * Ahh, all good. It wasn't running, and it wasn't
++ * runnable, which means that it will never become
++ * running in the future either. We're all done!
++ */
++ break;
++ }
++
++ return ncsw;
++}
++
++/***
++ * kick_process - kick a running thread to enter/exit the kernel
++ * @p: the to-be-kicked thread
++ *
++ * Cause a process which is running on another CPU to enter
++ * kernel-mode, without any delay. (to get signals handled.)
++ *
++ * NOTE: this function doesn't have to take the runqueue lock,
++ * because all it wants to ensure is that the remote task enters
++ * the kernel. If the IPI races and the task has been migrated
++ * to another CPU then no harm is done and the purpose has been
++ * achieved as well.
++ */
++void kick_process(struct task_struct *p)
++{
++ int cpu;
++
++ preempt_disable();
++ cpu = task_cpu(p);
++ if ((cpu != smp_processor_id()) && task_curr(p))
++ smp_send_reschedule(cpu);
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(kick_process);
++
++/*
++ * ->cpus_ptr is protected by both rq->lock and p->pi_lock
++ *
++ * A few notes on cpu_active vs cpu_online:
++ *
++ * - cpu_active must be a subset of cpu_online
++ *
++ * - on CPU-up we allow per-CPU kthreads on the online && !active CPU,
++ * see __set_cpus_allowed_ptr(). At this point the newly online
++ * CPU isn't yet part of the sched domains, and balancing will not
++ * see it.
++ *
++ * - on cpu-down we clear cpu_active() to mask the sched domains and
++ * avoid the load balancer to place new tasks on the to be removed
++ * CPU. Existing tasks will remain running there and will be taken
++ * off.
++ *
++ * This means that fallback selection must not select !active CPUs.
++ * And can assume that any active CPU must be online. Conversely
++ * select_task_rq() below may allow selection of !active CPUs in order
++ * to satisfy the above rules.
++ */
++static int select_fallback_rq(int cpu, struct task_struct *p)
++{
++ int nid = cpu_to_node(cpu);
++ const struct cpumask *nodemask = NULL;
++ enum { cpuset, possible, fail } state = cpuset;
++ int dest_cpu;
++
++ /*
++ * If the node that the CPU is on has been offlined, cpu_to_node()
++ * will return -1. There is no CPU on the node, and we should
++ * select the CPU on the other node.
++ */
++ if (nid != -1) {
++ nodemask = cpumask_of_node(nid);
++
++ /* Look for allowed, online CPU in same node. */
++ for_each_cpu(dest_cpu, nodemask) {
++ if (is_cpu_allowed(p, dest_cpu))
++ return dest_cpu;
++ }
++ }
++
++ for (;;) {
++ /* Any allowed, online CPU? */
++ for_each_cpu(dest_cpu, p->cpus_ptr) {
++ if (!is_cpu_allowed(p, dest_cpu))
++ continue;
++ goto out;
++ }
++
++ /* No more Mr. Nice Guy. */
++ switch (state) {
++ case cpuset:
++ if (cpuset_cpus_allowed_fallback(p)) {
++ state = possible;
++ break;
++ }
++ fallthrough;
++ case possible:
++ /*
++ * XXX When called from select_task_rq() we only
++ * hold p->pi_lock and again violate locking order.
++ *
++ * More yuck to audit.
++ */
++ do_set_cpus_allowed(p, task_cpu_possible_mask(p));
++ state = fail;
++ break;
++
++ case fail:
++ BUG();
++ break;
++ }
++ }
++
++out:
++ if (state != cpuset) {
++ /*
++ * Don't tell them about moving exiting tasks or
++ * kernel threads (both mm NULL), since they never
++ * leave kernel.
++ */
++ if (p->mm && printk_ratelimit()) {
++ printk_deferred("process %d (%s) no longer affine to cpu%d\n",
++ task_pid_nr(p), p->comm, cpu);
++ }
++ }
++
++ return dest_cpu;
++}
++
++static inline void
++sched_preempt_mask_flush(cpumask_t *mask, int prio)
++{
++ int cpu;
++
++ cpumask_copy(mask, sched_idle_mask);
++
++ for_each_clear_bit(cpu, cpumask_bits(mask), nr_cpumask_bits) {
++ if (prio < cpu_rq(cpu)->prio)
++ cpumask_set_cpu(cpu, mask);
++ }
++}
++
++static inline int
++preempt_mask_check(struct task_struct *p, cpumask_t *allow_mask, cpumask_t *preempt_mask)
++{
++ int task_prio = task_sched_prio(p);
++ cpumask_t *mask = sched_preempt_mask + SCHED_QUEUE_BITS - 1 - task_prio;
++ int pr = atomic_read(&sched_prio_record);
++
++ if (pr != task_prio) {
++ sched_preempt_mask_flush(mask, task_prio);
++ atomic_set(&sched_prio_record, task_prio);
++ }
++
++ return cpumask_and(preempt_mask, allow_mask, mask);
++}
++
++static inline int select_task_rq(struct task_struct *p)
++{
++ cpumask_t allow_mask, mask;
++
++ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
++ return select_fallback_rq(task_cpu(p), p);
++
++ if (
++#ifdef CONFIG_SCHED_SMT
++ cpumask_and(&mask, &allow_mask, &sched_sg_idle_mask) ||
++#endif
++ cpumask_and(&mask, &allow_mask, sched_idle_mask) ||
++ preempt_mask_check(p, &allow_mask, &mask))
++ return best_mask_cpu(task_cpu(p), &mask);
++
++ return best_mask_cpu(task_cpu(p), &allow_mask);
++}
++
++void sched_set_stop_task(int cpu, struct task_struct *stop)
++{
++ static struct lock_class_key stop_pi_lock;
++ struct sched_param stop_param = { .sched_priority = STOP_PRIO };
++ struct sched_param start_param = { .sched_priority = 0 };
++ struct task_struct *old_stop = cpu_rq(cpu)->stop;
++
++ if (stop) {
++ /*
++ * Make it appear like a SCHED_FIFO task, its something
++ * userspace knows about and won't get confused about.
++ *
++ * Also, it will make PI more or less work without too
++ * much confusion -- but then, stop work should not
++ * rely on PI working anyway.
++ */
++ sched_setscheduler_nocheck(stop, SCHED_FIFO, &stop_param);
++
++ /*
++ * The PI code calls rt_mutex_setprio() with ->pi_lock held to
++ * adjust the effective priority of a task. As a result,
++ * rt_mutex_setprio() can trigger (RT) balancing operations,
++ * which can then trigger wakeups of the stop thread to push
++ * around the current task.
++ *
++ * The stop task itself will never be part of the PI-chain, it
++ * never blocks, therefore that ->pi_lock recursion is safe.
++ * Tell lockdep about this by placing the stop->pi_lock in its
++ * own class.
++ */
++ lockdep_set_class(&stop->pi_lock, &stop_pi_lock);
++ }
++
++ cpu_rq(cpu)->stop = stop;
++
++ if (old_stop) {
++ /*
++ * Reset it back to a normal scheduling policy so that
++ * it can die in pieces.
++ */
++ sched_setscheduler_nocheck(old_stop, SCHED_NORMAL, &start_param);
++ }
++}
++
++static int affine_move_task(struct rq *rq, struct task_struct *p, int dest_cpu,
++ raw_spinlock_t *lock, unsigned long irq_flags)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ /* Can the task run on the task's current CPU? If so, we're done */
++ if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
++ if (p->migration_disabled) {
++ if (likely(p->cpus_ptr != &p->cpus_mask))
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ p->migration_disabled = 0;
++ p->migration_flags |= MDF_FORCE_ENABLED;
++ /* When p is migrate_disabled, rq->lock should be held */
++ rq->nr_pinned--;
++ }
++
++ if (task_on_cpu(p) || READ_ONCE(p->__state) == TASK_WAKING) {
++ struct migration_arg arg = { p, dest_cpu };
++
++ /* Need help from migration thread: drop lock and wait. */
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
++ return 0;
++ }
++ if (task_on_rq_queued(p)) {
++ /*
++ * OK, since we're going to drop the lock immediately
++ * afterwards anyway.
++ */
++ update_rq_clock(rq);
++ rq = move_queued_task(rq, p, dest_cpu);
++ lock = &rq->lock;
++ }
++ }
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return 0;
++}
++
++static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
++ struct affinity_context *ctx,
++ struct rq *rq,
++ raw_spinlock_t *lock,
++ unsigned long irq_flags)
++{
++ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
++ const struct cpumask *cpu_valid_mask = cpu_active_mask;
++ bool kthread = p->flags & PF_KTHREAD;
++ int dest_cpu;
++ int ret = 0;
++
++ if (kthread || is_migration_disabled(p)) {
++ /*
++ * Kernel threads are allowed on online && !active CPUs,
++ * however, during cpu-hot-unplug, even these might get pushed
++ * away if not KTHREAD_IS_PER_CPU.
++ *
++ * Specifically, migration_disabled() tasks must not fail the
++ * cpumask_any_and_distribute() pick below, esp. so on
++ * SCA_MIGRATE_ENABLE, otherwise we'll not call
++ * set_cpus_allowed_common() and actually reset p->cpus_ptr.
++ */
++ cpu_valid_mask = cpu_online_mask;
++ }
++
++ if (!kthread && !cpumask_subset(ctx->new_mask, cpu_allowed_mask)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ /*
++ * Must re-check here, to close a race against __kthread_bind(),
++ * sched_setaffinity() is not guaranteed to observe the flag.
++ */
++ if ((ctx->flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ if (cpumask_equal(&p->cpus_mask, ctx->new_mask))
++ goto out;
++
++ dest_cpu = cpumask_any_and(cpu_valid_mask, ctx->new_mask);
++ if (dest_cpu >= nr_cpu_ids) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ __do_set_cpus_allowed(p, ctx);
++
++ return affine_move_task(rq, p, dest_cpu, lock, irq_flags);
++
++out:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++
++ return ret;
++}
++
++/*
++ * Change a given task's CPU affinity. Migrate the thread to a
++ * proper CPU and schedule it away if the CPU it's executing on
++ * is removed from the allowed bitmask.
++ *
++ * NOTE: the caller must have a valid reference to the task, the
++ * task must not exit() & deallocate itself prematurely. The
++ * call is not atomic; no spinlocks may be held.
++ */
++static int __set_cpus_allowed_ptr(struct task_struct *p,
++ struct affinity_context *ctx)
++{
++ unsigned long irq_flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
++ * flags are set.
++ */
++ if (p->user_cpus_ptr &&
++ !(ctx->flags & SCA_USER) &&
++ cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr))
++ ctx->new_mask = rq->scratch_mask;
++
++ return __set_cpus_allowed_ptr_locked(p, ctx, rq, lock, irq_flags);
++}
++
++int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .flags = 0,
++ };
++
++ return __set_cpus_allowed_ptr(p, &ac);
++}
++EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
++
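++/*
++ * Illustrative sketch, not part of the code above: how a caller might use
++ * set_cpus_allowed_ptr() to pin a task to a single CPU.  The cpumask
++ * helpers are standard kernel APIs; the chosen CPU and the error handling
++ * are hypothetical.
++ *
++ *	cpumask_t one_cpu;
++ *
++ *	cpumask_clear(&one_cpu);
++ *	cpumask_set_cpu(1, &one_cpu);
++ *	if (set_cpus_allowed_ptr(p, &one_cpu))
++ *		pr_warn("could not restrict %s to CPU 1\n", p->comm);
++ */
++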
++/*
++ * Change a given task's CPU affinity to the intersection of its current
++ * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
++ * If user_cpus_ptr is defined, use it as the basis for restricting CPU
++ * affinity or use cpu_online_mask instead.
++ *
++ * If the resulting mask is empty, leave the affinity unchanged and return
++ * -EINVAL.
++ */
++static int restrict_cpus_allowed_ptr(struct task_struct *p,
++ struct cpumask *new_mask,
++ const struct cpumask *subset_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .flags = 0,
++ };
++ unsigned long irq_flags;
++ raw_spinlock_t *lock;
++ struct rq *rq;
++ int err;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++
++ if (!cpumask_and(new_mask, task_user_cpus(p), subset_mask)) {
++ err = -EINVAL;
++ goto err_unlock;
++ }
++
++ return __set_cpus_allowed_ptr_locked(p, &ac, rq, lock, irq_flags);
++
++err_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return err;
++}
++
++/*
++ * Restrict the CPU affinity of task @p so that it is a subset of
++ * task_cpu_possible_mask() and point @p->user_cpus_ptr to a copy of the
++ * old affinity mask. If the resulting mask is empty, we warn and walk
++ * up the cpuset hierarchy until we find a suitable mask.
++ */
++void force_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ cpumask_var_t new_mask;
++ const struct cpumask *override_mask = task_cpu_possible_mask(p);
++
++ alloc_cpumask_var(&new_mask, GFP_KERNEL);
++
++ /*
++ * __migrate_task() can fail silently in the face of concurrent
++ * offlining of the chosen destination CPU, so take the hotplug
++ * lock to ensure that the migration succeeds.
++ */
++ cpus_read_lock();
++ if (!cpumask_available(new_mask))
++ goto out_set_mask;
++
++ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
++ goto out_free_mask;
++
++ /*
++ * We failed to find a valid subset of the affinity mask for the
++ * task, so override it based on its cpuset hierarchy.
++ */
++ cpuset_cpus_allowed(p, new_mask);
++ override_mask = new_mask;
++
++out_set_mask:
++ if (printk_ratelimit()) {
++ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
++ task_pid_nr(p), p->comm,
++ cpumask_pr_args(override_mask));
++ }
++
++ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
++out_free_mask:
++ cpus_read_unlock();
++ free_cpumask_var(new_mask);
++}
++
++static int
++__sched_setaffinity(struct task_struct *p, struct affinity_context *ctx);
++
++/*
++ * Restore the affinity of a task @p which was previously restricted by a
++ * call to force_compatible_cpus_allowed_ptr().
++ *
++ * It is the caller's responsibility to serialise this with any calls to
++ * force_compatible_cpus_allowed_ptr(@p).
++ */
++void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ struct affinity_context ac = {
++ .new_mask = task_user_cpus(p),
++ .flags = 0,
++ };
++ int ret;
++
++ /*
++ * Try to restore the old affinity mask with __sched_setaffinity().
++ * Cpuset masking will be done there too.
++ */
++ ret = __sched_setaffinity(p, &ac);
++ WARN_ON_ONCE(ret);
++}
++
++#else /* CONFIG_SMP */
++
++static inline int select_task_rq(struct task_struct *p)
++{
++ return 0;
++}
++
++static inline int
++__set_cpus_allowed_ptr(struct task_struct *p,
++ struct affinity_context *ctx)
++{
++ return set_cpus_allowed_ptr(p, ctx->new_mask);
++}
++
++static inline bool rq_has_pinned_tasks(struct rq *rq)
++{
++ return false;
++}
++
++static inline cpumask_t *alloc_user_cpus_ptr(int node)
++{
++ return NULL;
++}
++
++#endif /* !CONFIG_SMP */
++
++static void
++ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq;
++
++ if (!schedstat_enabled())
++ return;
++
++ rq = this_rq();
++
++#ifdef CONFIG_SMP
++ if (cpu == rq->cpu) {
++ __schedstat_inc(rq->ttwu_local);
++ __schedstat_inc(p->stats.nr_wakeups_local);
++ } else {
++		/* Alt schedule FW ToDo:
++ * How to do ttwu_wake_remote
++ */
++ }
++#endif /* CONFIG_SMP */
++
++ __schedstat_inc(rq->ttwu_count);
++ __schedstat_inc(p->stats.nr_wakeups);
++}
++
++/*
++ * Mark the task runnable.
++ */
++static inline void ttwu_do_wakeup(struct task_struct *p)
++{
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ trace_sched_wakeup(p);
++}
++
++static inline void
++ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags)
++{
++ if (p->sched_contributes_to_load)
++ rq->nr_uninterruptible--;
++
++ if (
++#ifdef CONFIG_SMP
++ !(wake_flags & WF_MIGRATED) &&
++#endif
++ p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ activate_task(p, rq);
++ check_preempt_curr(rq);
++
++ ttwu_do_wakeup(p);
++}
++
++/*
++ * Consider @p being inside a wait loop:
++ *
++ * for (;;) {
++ * set_current_state(TASK_UNINTERRUPTIBLE);
++ *
++ * if (CONDITION)
++ * break;
++ *
++ * schedule();
++ * }
++ * __set_current_state(TASK_RUNNING);
++ *
++ * If the wakeup happens between set_current_state() and schedule(), @p is
++ * still runnable, so all that needs doing is to change p->state back to
++ * TASK_RUNNING in an atomic manner.
++ *
++ * By taking task_rq(p)->lock we serialize against schedule(), if @p->on_rq
++ * then schedule() must still happen and p->state can be changed to
++ * TASK_RUNNING. Otherwise we lost the race, schedule() has happened, and we
++ * need to do a full wakeup with enqueue.
++ *
++ * Returns: %true when the wakeup is done,
++ * %false otherwise.
++ */
++static int ttwu_runnable(struct task_struct *p, int wake_flags)
++{
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ int ret = 0;
++
++ rq = __task_access_lock(p, &lock);
++ if (task_on_rq_queued(p)) {
++ if (!task_on_cpu(p)) {
++ /*
++ * When on_rq && !on_cpu the task is preempted, see if
++ * it should preempt the task that is current now.
++ */
++ update_rq_clock(rq);
++ check_preempt_curr(rq);
++ }
++ ttwu_do_wakeup(p);
++ ret = 1;
++ }
++ __task_access_unlock(p, lock);
++
++ return ret;
++}
++
++#ifdef CONFIG_SMP
++void sched_ttwu_pending(void *arg)
++{
++ struct llist_node *llist = arg;
++ struct rq *rq = this_rq();
++ struct task_struct *p, *t;
++ struct rq_flags rf;
++
++ if (!llist)
++ return;
++
++ rq_lock_irqsave(rq, &rf);
++ update_rq_clock(rq);
++
++ llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
++ if (WARN_ON_ONCE(p->on_cpu))
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ if (WARN_ON_ONCE(task_cpu(p) != cpu_of(rq)))
++ set_task_cpu(p, cpu_of(rq));
++
++ ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0);
++ }
++
++ /*
++	 * Must be after enqueueing at least one task such that
++ * idle_cpu() does not observe a false-negative -- if it does,
++ * it is possible for select_idle_siblings() to stack a number
++ * of tasks on this CPU during that window.
++ *
++	 * It is ok to clear ttwu_pending when another task is pending.
++ * We will receive IPI after local irq enabled and then enqueue it.
++ * Since now nr_running > 0, idle_cpu() will always get correct result.
++ */
++ WRITE_ONCE(rq->ttwu_pending, 0);
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Prepare the scene for sending an IPI for a remote smp_call
++ *
++ * Returns true if the caller can proceed with sending the IPI.
++ * Returns false otherwise.
++ */
++bool call_function_single_prep_ipi(int cpu)
++{
++ if (set_nr_if_polling(cpu_rq(cpu)->idle)) {
++ trace_sched_wake_idle_without_ipi(cpu);
++ return false;
++ }
++
++ return true;
++}
++
++/*
++ * Queue a task on the target CPU's wake_list and wake the CPU via IPI if
++ * necessary. The wakee CPU on receipt of the IPI will queue the task
++ * via sched_ttwu_wakeup() for activation so the wakee incurs the cost
++ * of the wakeup instead of the waker.
++ */
++static void __ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
++
++ WRITE_ONCE(rq->ttwu_pending, 1);
++ __smp_call_single_queue(cpu, &p->wake_entry.llist);
++}
++
++static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
++{
++ /*
++ * Do not complicate things with the async wake_list while the CPU is
++ * in hotplug state.
++ */
++ if (!cpu_active(cpu))
++ return false;
++
++ /* Ensure the task will still be allowed to run on the CPU. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /*
++ * If the CPU does not share cache, then queue the task on the
++	 * remote rq's wakelist to avoid accessing remote data.
++ */
++ if (!cpus_share_cache(smp_processor_id(), cpu))
++ return true;
++
++ if (cpu == smp_processor_id())
++ return false;
++
++ /*
++ * If the wakee cpu is idle, or the task is descheduling and the
++ * only running task on the CPU, then use the wakelist to offload
++ * the task activation to the idle (or soon-to-be-idle) CPU as
++ * the current CPU is likely busy. nr_running is checked to
++ * avoid unnecessary task stacking.
++ *
++ * Note that we can only get here with (wakee) p->on_rq=0,
++ * p->on_cpu can be whatever, we've done the dequeue, so
++ * the wakee has been accounted out of ->nr_running.
++ */
++ if (!cpu_rq(cpu)->nr_running)
++ return true;
++
++ return false;
++}
++
++static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ if (__is_defined(ALT_SCHED_TTWU_QUEUE) && ttwu_queue_cond(p, cpu)) {
++ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
++ __ttwu_queue_wakelist(p, cpu, wake_flags);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_if_idle(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ rcu_read_lock();
++
++ if (!is_idle_task(rcu_dereference(rq->curr)))
++ goto out;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (is_idle_task(rq->curr))
++ resched_curr(rq);
++ /* Else CPU is not idle, do nothing here */
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++out:
++ rcu_read_unlock();
++}
++
++bool cpus_share_cache(int this_cpu, int that_cpu)
++{
++ if (this_cpu == that_cpu)
++ return true;
++
++ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
++}
++#else /* !CONFIG_SMP */
++
++static inline bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ return false;
++}
++
++#endif /* CONFIG_SMP */
++
++static inline void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (ttwu_queue_wakelist(p, cpu, wake_flags))
++ return;
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++ ttwu_do_activate(rq, p, wake_flags);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Invoked from try_to_wake_up() to check whether the task can be woken up.
++ *
++ * The caller holds p::pi_lock if p != current or has preemption
++ * disabled when p == current.
++ *
++ * The rules of PREEMPT_RT saved_state:
++ *
++ * The related locking code always holds p::pi_lock when updating
++ * p::saved_state, which means the code is fully serialized in both cases.
++ *
++ * The lock wait and lock wakeups happen via TASK_RTLOCK_WAIT. No other
++ * bits set. This allows us to distinguish all wakeup scenarios.
++ */
++static __always_inline
++bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
++{
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
++ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
++ state != TASK_RTLOCK_WAIT);
++ }
++
++ if (READ_ONCE(p->__state) & state) {
++ *success = 1;
++ return true;
++ }
++
++#ifdef CONFIG_PREEMPT_RT
++ /*
++ * Saved state preserves the task state across blocking on
++ * an RT lock. If the state matches, set p::saved_state to
++ * TASK_RUNNING, but do not wake the task because it waits
++ * for a lock wakeup. Also indicate success because from
++ * the regular waker's point of view this has succeeded.
++ *
++ * After acquiring the lock the task will restore p::__state
++ * from p::saved_state which ensures that the regular
++ * wakeup is not lost. The restore will also set
++ * p::saved_state to TASK_RUNNING so any further tests will
++ * not result in false positives vs. @success
++ */
++ if (p->saved_state & state) {
++ p->saved_state = TASK_RUNNING;
++ *success = 1;
++ }
++#endif
++ return false;
++}
++
++/*
++ * Notes on Program-Order guarantees on SMP systems.
++ *
++ * MIGRATION
++ *
++ * The basic program-order guarantee on SMP systems is that when a task [t]
++ * migrates, all its activity on its old CPU [c0] happens-before any subsequent
++ * execution on its new CPU [c1].
++ *
++ * For migration (of runnable tasks) this is provided by the following means:
++ *
++ * A) UNLOCK of the rq(c0)->lock scheduling out task t
++ * B) migration for t is required to synchronize *both* rq(c0)->lock and
++ * rq(c1)->lock (if not at the same time, then in that order).
++ * C) LOCK of the rq(c1)->lock scheduling in task
++ *
++ * Transitivity guarantees that B happens after A and C after B.
++ * Note: we only require RCpc transitivity.
++ * Note: the CPU doing B need not be c0 or c1
++ *
++ * Example:
++ *
++ * CPU0 CPU1 CPU2
++ *
++ * LOCK rq(0)->lock
++ * sched-out X
++ * sched-in Y
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(0)->lock // orders against CPU0
++ * dequeue X
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(1)->lock
++ * enqueue X
++ * UNLOCK rq(1)->lock
++ *
++ * LOCK rq(1)->lock // orders against CPU2
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(1)->lock
++ *
++ *
++ * BLOCKING -- aka. SLEEP + WAKEUP
++ *
++ * For blocking we (obviously) need to provide the same guarantee as for
++ * migration. However the means are completely different as there is no lock
++ * chain to provide order. Instead we do:
++ *
++ * 1) smp_store_release(X->on_cpu, 0) -- finish_task()
++ * 2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
++ *
++ * Example:
++ *
++ * CPU0 (schedule) CPU1 (try_to_wake_up) CPU2 (schedule)
++ *
++ * LOCK rq(0)->lock LOCK X->pi_lock
++ * dequeue X
++ * sched-out X
++ * smp_store_release(X->on_cpu, 0);
++ *
++ * smp_cond_load_acquire(&X->on_cpu, !VAL);
++ * X->state = WAKING
++ * set_task_cpu(X,2)
++ *
++ * LOCK rq(2)->lock
++ * enqueue X
++ * X->state = RUNNING
++ * UNLOCK rq(2)->lock
++ *
++ * LOCK rq(2)->lock // orders against CPU1
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(2)->lock
++ *
++ * UNLOCK X->pi_lock
++ * UNLOCK rq(0)->lock
++ *
++ *
++ * However; for wakeups there is a second guarantee we must provide, namely we
++ * must observe the state that led to our wakeup. That is, not only must our
++ * task observe its own prior state, it must also observe the stores prior to
++ * its wakeup.
++ *
++ * This means that any means of doing remote wakeups must order the CPU doing
++ * the wakeup against the CPU the task is going to end up running on. This,
++ * however, is already required for the regular Program-Order guarantee above,
++ * since the waking CPU is the one issuing the ACQUIRE (smp_cond_load_acquire).
++ *
++ */
++
++/**
++ * try_to_wake_up - wake up a thread
++ * @p: the thread to be awakened
++ * @state: the mask of task states that can be woken
++ * @wake_flags: wake modifier flags (WF_*)
++ *
++ * Conceptually does:
++ *
++ * If (@state & @p->state) @p->state = TASK_RUNNING.
++ *
++ * If the task was not queued/runnable, also place it back on a runqueue.
++ *
++ * This function is atomic against schedule() which would dequeue the task.
++ *
++ * It issues a full memory barrier before accessing @p->state, see the comment
++ * with set_current_state().
++ *
++ * Uses p->pi_lock to serialize against concurrent wake-ups.
++ *
++ * Relies on p->pi_lock stabilizing:
++ * - p->sched_class
++ * - p->cpus_ptr
++ * - p->sched_task_group
++ * in order to do migration, see its use of select_task_rq()/set_task_cpu().
++ *
++ * Tries really hard to only take one task_rq(p)->lock for performance.
++ * Takes rq->lock in:
++ * - ttwu_runnable() -- old rq, unavoidable, see comment there;
++ * - ttwu_queue() -- new rq, for enqueue of the task;
++ * - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
++ *
++ * As a consequence we race really badly with just about everything. See the
++ * many memory barriers and their comments for details.
++ *
++ * Return: %true if @p->state changes (an actual wakeup was done),
++ * %false otherwise.
++ */
++static int try_to_wake_up(struct task_struct *p, unsigned int state,
++ int wake_flags)
++{
++ unsigned long flags;
++ int cpu, success = 0;
++
++ preempt_disable();
++ if (p == current) {
++ /*
++ * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
++ * == smp_processor_id()'. Together this means we can special
++ * case the whole 'p->on_rq && ttwu_runnable()' case below
++ * without taking any locks.
++ *
++ * In particular:
++ * - we rely on Program-Order guarantees for all the ordering,
++ * - we're serialized against set_special_state() by virtue of
++ * it disabling IRQs (this allows not taking ->pi_lock).
++ */
++ if (!ttwu_state_match(p, state, &success))
++ goto out;
++
++ trace_sched_waking(p);
++ ttwu_do_wakeup(p);
++ goto out;
++ }
++
++ /*
++ * If we are going to wake up a thread waiting for CONDITION we
++ * need to ensure that CONDITION=1 done by the caller can not be
++ * reordered with p->state check below. This pairs with smp_store_mb()
++ * in set_current_state() that the waiting thread does.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ smp_mb__after_spinlock();
++ if (!ttwu_state_match(p, state, &success))
++ goto unlock;
++
++ trace_sched_waking(p);
++
++ /*
++ * Ensure we load p->on_rq _after_ p->state, otherwise it would
++ * be possible to, falsely, observe p->on_rq == 0 and get stuck
++ * in smp_cond_load_acquire() below.
++ *
++ * sched_ttwu_pending() try_to_wake_up()
++ * STORE p->on_rq = 1 LOAD p->state
++ * UNLOCK rq->lock
++ *
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * UNLOCK rq->lock
++ *
++ * [task p]
++ * STORE p->state = UNINTERRUPTIBLE LOAD p->on_rq
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++	 * A similar smp_rmb() lives in try_invoke_on_locked_down_task().
++ */
++ smp_rmb();
++ if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
++ goto unlock;
++
++#ifdef CONFIG_SMP
++ /*
++ * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
++ * possible to, falsely, observe p->on_cpu == 0.
++ *
++ * One must be running (->on_cpu == 1) in order to remove oneself
++ * from the runqueue.
++ *
++ * __schedule() (switch to task 'p') try_to_wake_up()
++ * STORE p->on_cpu = 1 LOAD p->on_rq
++ * UNLOCK rq->lock
++ *
++ * __schedule() (put 'p' to sleep)
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * STORE p->on_rq = 0 LOAD p->on_cpu
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++ * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
++ * schedule()'s deactivate_task() has 'happened' and p will no longer
++	 * care about its own p->state. See the comment in __schedule().
++ */
++ smp_acquire__after_ctrl_dep();
++
++ /*
++ * We're doing the wakeup (@success == 1), they did a dequeue (p->on_rq
++ * == 0), which means we need to do an enqueue, change p->state to
++ * TASK_WAKING such that we can unlock p->pi_lock before doing the
++ * enqueue, such as ttwu_queue_wakelist().
++ */
++ WRITE_ONCE(p->__state, TASK_WAKING);
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++	 * this task as prev, consider queueing p on the remote CPU's wake_list
++ * which potentially sends an IPI instead of spinning on p->on_cpu to
++ * let the waker make forward progress. This is safe because IRQs are
++ * disabled and the IPI will deliver after on_cpu is cleared.
++ *
++ * Ensure we load task_cpu(p) after p->on_cpu:
++ *
++ * set_task_cpu(p, cpu);
++ * STORE p->cpu = @cpu
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock
++ * smp_mb__after_spin_lock() smp_cond_load_acquire(&p->on_cpu)
++ * STORE p->on_cpu = 1 LOAD p->cpu
++ *
++ * to ensure we observe the correct CPU on which the task is currently
++ * scheduling.
++ */
++ if (smp_load_acquire(&p->on_cpu) &&
++ ttwu_queue_wakelist(p, task_cpu(p), wake_flags))
++ goto unlock;
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++ * this task as prev, wait until it's done referencing the task.
++ *
++ * Pairs with the smp_store_release() in finish_task().
++ *
++ * This ensures that tasks getting woken will be fully ordered against
++ * their previous state and preserve Program Order.
++ */
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ sched_task_ttwu(p);
++
++ cpu = select_task_rq(p);
++
++ if (cpu != task_cpu(p)) {
++ if (p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ wake_flags |= WF_MIGRATED;
++ set_task_cpu(p, cpu);
++ }
++#else
++ cpu = task_cpu(p);
++#endif /* CONFIG_SMP */
++
++ ttwu_queue(p, cpu, wake_flags);
++unlock:
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++out:
++ if (success)
++ ttwu_stat(p, task_cpu(p), wake_flags);
++ preempt_enable();
++
++ return success;
++}
++
++static bool __task_needs_rq_lock(struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
++ * the task is blocked. Make sure to check @state since ttwu() can drop
++ * locks at the end, see ttwu_queue_wakelist().
++ */
++ if (state == TASK_RUNNING || state == TASK_WAKING)
++ return true;
++
++ /*
++ * Ensure we load p->on_rq after p->__state, otherwise it would be
++ * possible to, falsely, observe p->on_rq == 0.
++ *
++ * See try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++ if (p->on_rq)
++ return true;
++
++#ifdef CONFIG_SMP
++ /*
++ * Ensure the task has finished __schedule() and will not be referenced
++ * anymore. Again, see try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++#endif
++
++ return false;
++}
++
++/**
++ * task_call_func - Invoke a function on task in fixed state
++ * @p: Process for which the function is to be invoked, can be @current.
++ * @func: Function to invoke.
++ * @arg: Argument to function.
++ *
++ * Fix the task in its current state by avoiding wakeups and/or rq operations
++ * and call @func(@arg) on it. This function can use ->on_rq and task_curr()
++ * to work out what the state is, if required. Given that @func can be invoked
++ * with a runqueue lock held, it had better be quite lightweight.
++ *
++ * Returns:
++ * Whatever @func returns
++ */
++int task_call_func(struct task_struct *p, task_call_f func, void *arg)
++{
++ struct rq *rq = NULL;
++ struct rq_flags rf;
++ int ret;
++
++ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
++
++ if (__task_needs_rq_lock(p))
++ rq = __task_rq_lock(p, &rf);
++
++ /*
++ * At this point the task is pinned; either:
++ * - blocked and we're holding off wakeups (pi->lock)
++ * - woken, and we're holding off enqueue (rq->lock)
++ * - queued, and we're holding off schedule (rq->lock)
++ * - running, and we're holding off de-schedule (rq->lock)
++ *
++ * The called function (@func) can use: task_curr(), p->on_rq and
++ * p->__state to differentiate between these states.
++ */
++ ret = func(p, arg);
++
++ if (rq)
++ __task_rq_unlock(rq, &rf);
++
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
++ return ret;
++}
++
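++/*
++ * Illustrative sketch, not part of the code above: a minimal task_call_f
++ * callback used with task_call_func().  The helper name and the use of the
++ * result are hypothetical; the callback only reads a field that the comment
++ * above describes as pinned (p->__state here).
++ *
++ *	static int read_state_cb(struct task_struct *p, void *arg)
++ *	{
++ *		*(unsigned int *)arg = READ_ONCE(p->__state);
++ *		return 0;
++ *	}
++ *
++ *	unsigned int state;
++ *
++ *	task_call_func(p, read_state_cb, &state);
++ */
++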
++/**
++ * cpu_curr_snapshot - Return a snapshot of the currently running task
++ * @cpu: The CPU on which to snapshot the task.
++ *
++ * Returns the task_struct pointer of the task "currently" running on
++ * the specified CPU. If the same task is running on that CPU throughout,
++ * the return value will be a pointer to that task's task_struct structure.
++ * If the CPU did any context switches even vaguely concurrently with the
++ * execution of this function, the return value will be a pointer to the
++ * task_struct structure of a randomly chosen task that was running on
++ * that CPU somewhere around the time that this function was executing.
++ *
++ * If the specified CPU was offline, the return value is whatever it
++ * is, perhaps a pointer to the task_struct structure of that CPU's idle
++ * task, but there is no guarantee. Callers wishing a useful return
++ * value must take some action to ensure that the specified CPU remains
++ * online throughout.
++ *
++ * This function executes full memory barriers before and after fetching
++ * the pointer, which permits the caller to confine this function's fetch
++ * with respect to the caller's accesses to other shared variables.
++ */
++struct task_struct *cpu_curr_snapshot(int cpu)
++{
++ struct task_struct *t;
++
++ smp_mb(); /* Pairing determined by caller's synchronization design. */
++ t = rcu_dereference(cpu_curr(cpu));
++ smp_mb(); /* Pairing determined by caller's synchronization design. */
++ return t;
++}
++
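++/*
++ * Illustrative sketch, not part of the code above: one plausible use of
++ * cpu_curr_snapshot() is to notice that a CPU has context switched between
++ * two points, assuming the caller keeps the CPU online as required above.
++ * do_something() is a placeholder.
++ *
++ *	struct task_struct *before = cpu_curr_snapshot(cpu);
++ *
++ *	do_something();
++ *	if (cpu_curr_snapshot(cpu) != before)
++ *		pr_debug("cpu %d switched tasks in the meantime\n", cpu);
++ */
++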
++/**
++ * wake_up_process - Wake up a specific process
++ * @p: The process to be woken up.
++ *
++ * Attempt to wake up the nominated process and move it to the set of runnable
++ * processes.
++ *
++ * Return: 1 if the process was woken up, 0 if it was already running.
++ *
++ * This function executes a full memory barrier before accessing the task state.
++ */
++int wake_up_process(struct task_struct *p)
++{
++ return try_to_wake_up(p, TASK_NORMAL, 0);
++}
++EXPORT_SYMBOL(wake_up_process);
++
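++/*
++ * Illustrative sketch, not part of the code above: the canonical pairing of
++ * a sleeper with wake_up_process(), mirroring the wait loop discussed in the
++ * ttwu_runnable() comment earlier.  'condition' and 'sleeper' are
++ * placeholders.
++ *
++ *	Sleeper:
++ *		set_current_state(TASK_INTERRUPTIBLE);
++ *		if (!condition)
++ *			schedule();
++ *		__set_current_state(TASK_RUNNING);
++ *
++ *	Waker:
++ *		condition = true;
++ *		wake_up_process(sleeper);
++ */
++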
++int wake_up_state(struct task_struct *p, unsigned int state)
++{
++ return try_to_wake_up(p, state, 0);
++}
++
++/*
++ * Perform scheduler related setup for a newly forked process p.
++ * p is forked by current.
++ *
++ * __sched_fork() is basic setup used by init_idle() too:
++ */
++static inline void __sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ p->on_rq = 0;
++ p->on_cpu = 0;
++ p->utime = 0;
++ p->stime = 0;
++ p->sched_time = 0;
++
++#ifdef CONFIG_SCHEDSTATS
++ /* Even if schedstat is disabled, there should not be garbage */
++ memset(&p->stats, 0, sizeof(p->stats));
++#endif
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++ INIT_HLIST_HEAD(&p->preempt_notifiers);
++#endif
++
++#ifdef CONFIG_COMPACTION
++ p->capture_control = NULL;
++#endif
++#ifdef CONFIG_SMP
++ p->wake_entry.u_flags = CSD_TYPE_TTWU;
++#endif
++ init_sched_mm_cid(p);
++}
++
++/*
++ * fork()/clone()-time setup:
++ */
++int sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ __sched_fork(clone_flags, p);
++ /*
++ * We mark the process as NEW here. This guarantees that
++ * nobody will actually run it, and a signal or other external
++ * event cannot wake it up and insert it on the runqueue either.
++ */
++ p->__state = TASK_NEW;
++
++ /*
++ * Make sure we do not leak PI boosting priority to the child.
++ */
++ p->prio = current->normal_prio;
++
++ /*
++ * Revert to default priority/policy on fork if requested.
++ */
++ if (unlikely(p->sched_reset_on_fork)) {
++ if (task_has_rt_policy(p)) {
++ p->policy = SCHED_NORMAL;
++ p->static_prio = NICE_TO_PRIO(0);
++ p->rt_priority = 0;
++ } else if (PRIO_TO_NICE(p->static_prio) < 0)
++ p->static_prio = NICE_TO_PRIO(0);
++
++ p->prio = p->normal_prio = p->static_prio;
++
++ /*
++ * We don't need the reset flag anymore after the fork. It has
++ * fulfilled its duty:
++ */
++ p->sched_reset_on_fork = 0;
++ }
++
++#ifdef CONFIG_SCHED_INFO
++ if (unlikely(sched_info_on()))
++ memset(&p->sched_info, 0, sizeof(p->sched_info));
++#endif
++ init_task_preempt_count(p);
++
++ return 0;
++}
++
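++/*
++ * Worked example for the reset-on-fork path above (values are only for
++ * illustration): a parent running SCHED_FIFO with rt_priority 50 and
++ * sched_reset_on_fork set forks; sched_fork() gives the child SCHED_NORMAL,
++ * static_prio = NICE_TO_PRIO(0) and rt_priority = 0, then clears the flag
++ * so later forks of the child are unaffected.
++ */
++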
++void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ /*
++ * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
++ * required yet, but lockdep gets upset if rules are violated.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ /*
++ * Share the timeslice between parent and child, thus the
++ * total amount of pending timeslices in the system doesn't change,
++ * resulting in more scheduling fairness.
++ */
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ rq->curr->time_slice /= 2;
++ p->time_slice = rq->curr->time_slice;
++#ifdef CONFIG_SCHED_HRTICK
++ hrtick_start(rq, rq->curr->time_slice);
++#endif
++
++ if (p->time_slice < RESCHED_NS) {
++ p->time_slice = sched_timeslice_ns;
++ resched_curr(rq);
++ }
++ sched_task_fork(p, rq);
++ raw_spin_unlock(&rq->lock);
++
++ rseq_migrate(p);
++ /*
++ * We're setting the CPU for the first time, we don't migrate,
++ * so use __set_task_cpu().
++ */
++ __set_task_cpu(p, smp_processor_id());
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
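++/*
++ * Worked example for the timeslice split above (numbers are only for
++ * illustration): if rq->curr has 4 ms of time_slice left at fork time, the
++ * parent and the child each end up with 2 ms.  If the halved slice falls
++ * below RESCHED_NS, the child is refilled to sched_timeslice_ns and the
++ * parent is marked for rescheduling via resched_curr().
++ */
++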
++void sched_post_fork(struct task_struct *p)
++{
++}
++
++#ifdef CONFIG_SCHEDSTATS
++
++DEFINE_STATIC_KEY_FALSE(sched_schedstats);
++
++static void set_schedstats(bool enabled)
++{
++ if (enabled)
++ static_branch_enable(&sched_schedstats);
++ else
++ static_branch_disable(&sched_schedstats);
++}
++
++void force_schedstat_enabled(void)
++{
++ if (!schedstat_enabled()) {
++ pr_info("kernel profiling enabled schedstats, disable via kernel.sched_schedstats.\n");
++ static_branch_enable(&sched_schedstats);
++ }
++}
++
++static int __init setup_schedstats(char *str)
++{
++ int ret = 0;
++ if (!str)
++ goto out;
++
++ if (!strcmp(str, "enable")) {
++ set_schedstats(true);
++ ret = 1;
++ } else if (!strcmp(str, "disable")) {
++ set_schedstats(false);
++ ret = 1;
++ }
++out:
++ if (!ret)
++ pr_warn("Unable to parse schedstats=\n");
++
++ return ret;
++}
++__setup("schedstats=", setup_schedstats);
++
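++/*
++ * Usage sketch (illustrative): schedstats can be enabled at boot with
++ * "schedstats=enable" on the kernel command line, or at runtime through the
++ * sysctl registered below, e.g.:
++ *
++ *	sysctl kernel.sched_schedstats=1
++ */
++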
++#ifdef CONFIG_PROC_SYSCTL
++static int sysctl_schedstats(struct ctl_table *table, int write, void *buffer,
++ size_t *lenp, loff_t *ppos)
++{
++ struct ctl_table t;
++ int err;
++ int state = static_branch_likely(&sched_schedstats);
++
++ if (write && !capable(CAP_SYS_ADMIN))
++ return -EPERM;
++
++ t = *table;
++ t.data = &state;
++ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
++ if (err < 0)
++ return err;
++ if (write)
++ set_schedstats(state);
++ return err;
++}
++
++static struct ctl_table sched_core_sysctls[] = {
++ {
++ .procname = "sched_schedstats",
++ .data = NULL,
++ .maxlen = sizeof(unsigned int),
++ .mode = 0644,
++ .proc_handler = sysctl_schedstats,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE,
++ },
++ {}
++};
++static int __init sched_core_sysctl_init(void)
++{
++ register_sysctl_init("kernel", sched_core_sysctls);
++ return 0;
++}
++late_initcall(sched_core_sysctl_init);
++#endif /* CONFIG_PROC_SYSCTL */
++#endif /* CONFIG_SCHEDSTATS */
++
++/*
++ * wake_up_new_task - wake up a newly created task for the first time.
++ *
++ * This function will do some initial scheduler statistics housekeeping
++ * that must be done for every newly created context, then puts the task
++ * on the runqueue and wakes it.
++ */
++void wake_up_new_task(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ rq = cpu_rq(select_task_rq(p));
++#ifdef CONFIG_SMP
++ rseq_migrate(p);
++ /*
++ * Fork balancing, do it here and not earlier because:
++ * - cpus_ptr can change in the fork path
++ * - any previously selected CPU might disappear through hotplug
++ *
++ * Use __set_task_cpu() to avoid calling sched_class::migrate_task_rq,
++ * as we're not fully set-up yet.
++ */
++ __set_task_cpu(p, cpu_of(rq));
++#endif
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ activate_task(p, rq);
++ trace_sched_wakeup_new(p);
++ check_preempt_curr(rq);
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++
++static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);
++
++void preempt_notifier_inc(void)
++{
++ static_branch_inc(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_inc);
++
++void preempt_notifier_dec(void)
++{
++ static_branch_dec(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_dec);
++
++/**
++ * preempt_notifier_register - tell me when current is being preempted & rescheduled
++ * @notifier: notifier struct to register
++ */
++void preempt_notifier_register(struct preempt_notifier *notifier)
++{
++ if (!static_branch_unlikely(&preempt_notifier_key))
++ WARN(1, "registering preempt_notifier while notifiers disabled\n");
++
++	hlist_add_head(&notifier->link, &current->preempt_notifiers);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_register);
++
++/**
++ * preempt_notifier_unregister - no longer interested in preemption notifications
++ * @notifier: notifier struct to unregister
++ *
++ * This is *not* safe to call from within a preemption notifier.
++ */
++void preempt_notifier_unregister(struct preempt_notifier *notifier)
++{
++	hlist_del(&notifier->link);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_unregister);
++
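++/*
++ * Illustrative sketch, not part of the code above: registering a preempt
++ * notifier.  The ops, callback names and the notifier variable are
++ * hypothetical; preempt_notifier_init() comes from <linux/preempt.h>.
++ *
++ *	static void my_sched_in(struct preempt_notifier *pn, int cpu) { }
++ *	static void my_sched_out(struct preempt_notifier *pn,
++ *				 struct task_struct *next) { }
++ *
++ *	static struct preempt_notifier_ops my_ops = {
++ *		.sched_in  = my_sched_in,
++ *		.sched_out = my_sched_out,
++ *	};
++ *	static struct preempt_notifier my_notifier;
++ *
++ *	preempt_notifier_inc();
++ *	preempt_notifier_init(&my_notifier, &my_ops);
++ *	preempt_notifier_register(&my_notifier);
++ */
++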
++static void __fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_in(notifier, raw_smp_processor_id());
++}
++
++static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_in_preempt_notifiers(curr);
++}
++
++static void
++__fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_out(notifier, next);
++}
++
++static __always_inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_out_preempt_notifiers(curr, next);
++}
++
++#else /* !CONFIG_PREEMPT_NOTIFIERS */
++
++static inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++}
++
++static inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++}
++
++#endif /* CONFIG_PREEMPT_NOTIFIERS */
++
++static inline void prepare_task(struct task_struct *next)
++{
++ /*
++ * Claim the task as running, we do this before switching to it
++ * such that any running task will have this set.
++ *
++ * See the smp_load_acquire(&p->on_cpu) case in ttwu() and
++ * its ordering comment.
++ */
++ WRITE_ONCE(next->on_cpu, 1);
++}
++
++static inline void finish_task(struct task_struct *prev)
++{
++#ifdef CONFIG_SMP
++ /*
++ * This must be the very last reference to @prev from this CPU. After
++ * p->on_cpu is cleared, the task can be moved to a different CPU. We
++ * must ensure this doesn't happen until the switch is completely
++ * finished.
++ *
++ * In particular, the load of prev->state in finish_task_switch() must
++ * happen before this.
++ *
++ * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
++ */
++ smp_store_release(&prev->on_cpu, 0);
++#else
++ prev->on_cpu = 0;
++#endif
++}
++
++#ifdef CONFIG_SMP
++
++static void do_balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++ void (*func)(struct rq *rq);
++ struct balance_callback *next;
++
++ lockdep_assert_held(&rq->lock);
++
++ while (head) {
++ func = (void (*)(struct rq *))head->func;
++ next = head->next;
++ head->next = NULL;
++ head = next;
++
++ func(rq);
++ }
++}
++
++static void balance_push(struct rq *rq);
++
++/*
++ * balance_push_callback is a right abuse of the callback interface and plays
++ * by significantly different rules.
++ *
++ * Where the normal balance_callback's purpose is to be run in the same context
++ * that queued it (only later, when it's safe to drop rq->lock again),
++ * balance_push_callback is specifically targeted at __schedule().
++ *
++ * This abuse is tolerated because it places all the unlikely/odd cases behind
++ * a single test, namely: rq->balance_callback == NULL.
++ */
++struct balance_callback balance_push_callback = {
++ .next = NULL,
++ .func = balance_push,
++};
++
++static inline struct balance_callback *
++__splice_balance_callbacks(struct rq *rq, bool split)
++{
++ struct balance_callback *head = rq->balance_callback;
++
++ if (likely(!head))
++ return NULL;
++
++ lockdep_assert_rq_held(rq);
++ /*
++ * Must not take balance_push_callback off the list when
++ * splice_balance_callbacks() and balance_callbacks() are not
++ * in the same rq->lock section.
++ *
++ * In that case it would be possible for __schedule() to interleave
++ * and observe the list empty.
++ */
++ if (split && head == &balance_push_callback)
++ head = NULL;
++ else
++ rq->balance_callback = NULL;
++
++ return head;
++}
++
++static inline struct balance_callback *splice_balance_callbacks(struct rq *rq)
++{
++ return __splice_balance_callbacks(rq, true);
++}
++
++static void __balance_callbacks(struct rq *rq)
++{
++ do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
++}
++
++static inline void balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++ unsigned long flags;
++
++ if (unlikely(head)) {
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ do_balance_callbacks(rq, head);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++ }
++}
++
++#else
++
++static inline void __balance_callbacks(struct rq *rq)
++{
++}
++
++static inline struct balance_callback *splice_balance_callbacks(struct rq *rq)
++{
++ return NULL;
++}
++
++static inline void balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++}
++
++#endif
++
++static inline void
++prepare_lock_switch(struct rq *rq, struct task_struct *next)
++{
++ /*
++	 * The runqueue lock will be released by the next
++ * task (which is an invalid locking op but in the case
++ * of the scheduler it's an obvious special-case), so we
++ * do an early lockdep release here:
++ */
++ spin_release(&rq->lock.dep_map, _THIS_IP_);
++#ifdef CONFIG_DEBUG_SPINLOCK
++ /* this is a valid case when another task releases the spinlock */
++ rq->lock.owner = next;
++#endif
++}
++
++static inline void finish_lock_switch(struct rq *rq)
++{
++ /*
++ * If we are tracking spinlock dependencies then we have to
++ * fix up the runqueue lock - which gets 'carried over' from
++ * prev into current:
++ */
++ spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
++ __balance_callbacks(rq);
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++/*
++ * NOP if the arch has not defined these:
++ */
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++static inline void kmap_local_sched_out(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_out();
++#endif
++}
++
++static inline void kmap_local_sched_in(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_in();
++#endif
++}
++
++/**
++ * prepare_task_switch - prepare to switch tasks
++ * @rq: the runqueue preparing to switch
++ * @next: the task we are going to switch to.
++ *
++ * This is called with the rq lock held and interrupts off. It must
++ * be paired with a subsequent finish_task_switch after the context
++ * switch.
++ *
++ * prepare_task_switch sets up locking and calls architecture specific
++ * hooks.
++ */
++static inline void
++prepare_task_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ kcov_prepare_switch(prev);
++ sched_info_switch(rq, prev, next);
++ perf_event_task_sched_out(prev, next);
++ rseq_preempt(prev);
++ fire_sched_out_preempt_notifiers(prev, next);
++ kmap_local_sched_out();
++ prepare_task(next);
++ prepare_arch_switch(next);
++}
++
++/**
++ * finish_task_switch - clean up after a task-switch
++ * @rq: runqueue associated with task-switch
++ * @prev: the thread we just switched away from.
++ *
++ * finish_task_switch must be called after the context switch, paired
++ * with a prepare_task_switch call before the context switch.
++ * finish_task_switch will reconcile locking set up by prepare_task_switch,
++ * and do any other architecture-specific cleanup actions.
++ *
++ * Note that we may have delayed dropping an mm in context_switch(). If
++ * so, we finish that here outside of the runqueue lock. (Doing it
++ * with the lock held can cause deadlocks; see schedule() for
++ * details.)
++ *
++ * The context switch has flipped the stack from under us and restored the
++ * local variables which were saved when this task called schedule() in the
++ * past. prev == current is still correct but we need to recalculate this_rq
++ * because prev may have moved to another CPU.
++ */
++static struct rq *finish_task_switch(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ struct rq *rq = this_rq();
++ struct mm_struct *mm = rq->prev_mm;
++ unsigned int prev_state;
++
++ /*
++ * The previous task will have left us with a preempt_count of 2
++ * because it left us after:
++ *
++ * schedule()
++ * preempt_disable(); // 1
++ * __schedule()
++ * raw_spin_lock_irq(&rq->lock) // 2
++ *
++ * Also, see FORK_PREEMPT_COUNT.
++ */
++ if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
++ "corrupted preempt_count: %s/%d/0x%x\n",
++ current->comm, current->pid, preempt_count()))
++ preempt_count_set(FORK_PREEMPT_COUNT);
++
++ rq->prev_mm = NULL;
++
++ /*
++ * A task struct has one reference for the use as "current".
++ * If a task dies, then it sets TASK_DEAD in tsk->state and calls
++ * schedule one last time. The schedule call will never return, and
++ * the scheduled task must drop that reference.
++ *
++ * We must observe prev->state before clearing prev->on_cpu (in
++ * finish_task), otherwise a concurrent wakeup can get prev
++	 * running on another CPU and we could race with its RUNNING -> DEAD
++ * transition, resulting in a double drop.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ vtime_task_switch(prev);
++ perf_event_task_sched_in(prev, current);
++ finish_task(prev);
++ tick_nohz_task_switch();
++ finish_lock_switch(rq);
++ finish_arch_post_lock_switch();
++ kcov_finish_switch(current);
++ /*
++ * kmap_local_sched_out() is invoked with rq::lock held and
++ * interrupts disabled. There is no requirement for that, but the
++ * sched out code does not have an interrupt enabled section.
++ * Restoring the maps on sched in does not require interrupts being
++ * disabled either.
++ */
++ kmap_local_sched_in();
++
++ fire_sched_in_preempt_notifiers(current);
++ /*
++ * When switching through a kernel thread, the loop in
++ * membarrier_{private,global}_expedited() may have observed that
++ * kernel thread and not issued an IPI. It is therefore possible to
++	 * schedule between user->kernel->user threads without passing through
++ * switch_mm(). Membarrier requires a barrier after storing to
++ * rq->curr, before returning to userspace, so provide them here:
++ *
++ * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
++ * provided by mmdrop(),
++ * - a sync_core for SYNC_CORE.
++ */
++ if (mm) {
++ membarrier_mm_sync_core_before_usermode(mm);
++ mmdrop_sched(mm);
++ }
++ if (unlikely(prev_state == TASK_DEAD)) {
++ /* Task is done with its stack. */
++ put_task_stack(prev);
++
++ put_task_struct_rcu_user(prev);
++ }
++
++ return rq;
++}
++
++/**
++ * schedule_tail - first thing a freshly forked thread must call.
++ * @prev: the thread we just switched away from.
++ */
++asmlinkage __visible void schedule_tail(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ /*
++ * New tasks start with FORK_PREEMPT_COUNT, see there and
++ * finish_task_switch() for details.
++ *
++ * finish_task_switch() will drop rq->lock() and lower preempt_count
++ * and the preempt_enable() will end up enabling preemption (on
++ * PREEMPT_COUNT kernels).
++ */
++
++ finish_task_switch(prev);
++ preempt_enable();
++
++ if (current->set_child_tid)
++ put_user(task_pid_vnr(current), current->set_child_tid);
++
++ calculate_sigpending();
++}
++
++/*
++ * context_switch - switch to the new MM and the new thread's register state.
++ */
++static __always_inline struct rq *
++context_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ prepare_task_switch(rq, prev, next);
++
++ /*
++ * For paravirt, this is coupled with an exit in switch_to to
++ * combine the page table reload and the switch backend into
++ * one hypercall.
++ */
++ arch_start_context_switch(prev);
++
++ /*
++ * kernel -> kernel lazy + transfer active
++ * user -> kernel lazy + mmgrab() active
++ *
++ * kernel -> user switch + mmdrop() active
++ * user -> user switch
++ *
++ * switch_mm_cid() needs to be updated if the barriers provided
++ * by context_switch() are modified.
++ */
++ if (!next->mm) { // to kernel
++ enter_lazy_tlb(prev->active_mm, next);
++
++ next->active_mm = prev->active_mm;
++ if (prev->mm) // from user
++ mmgrab(prev->active_mm);
++ else
++ prev->active_mm = NULL;
++ } else { // to user
++ membarrier_switch_mm(rq, prev->active_mm, next->mm);
++ /*
++ * sys_membarrier() requires an smp_mb() between setting
++ * rq->curr / membarrier_switch_mm() and returning to userspace.
++ *
++ * The below provides this either through switch_mm(), or in
++ * case 'prev->active_mm == next->mm' through
++ * finish_task_switch()'s mmdrop().
++ */
++ switch_mm_irqs_off(prev->active_mm, next->mm, next);
++ lru_gen_use_mm(next->mm);
++
++ if (!prev->mm) { // from kernel
++ /* will mmdrop() in finish_task_switch(). */
++ rq->prev_mm = prev->active_mm;
++ prev->active_mm = NULL;
++ }
++ }
++
++ /* switch_mm_cid() requires the memory barriers above. */
++ switch_mm_cid(rq, prev, next);
++
++ prepare_lock_switch(rq, next);
++
++ /* Here we just switch the register state and the stack. */
++ switch_to(prev, next, prev);
++ barrier();
++
++ return finish_task_switch(prev);
++}
++
++/*
++ * nr_running, nr_uninterruptible and nr_context_switches:
++ *
++ * externally visible scheduler statistics: current number of runnable
++ * threads, total number of context switches performed since bootup.
++ */
++unsigned int nr_running(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_online_cpu(i)
++ sum += cpu_rq(i)->nr_running;
++
++ return sum;
++}
++
++/*
++ * Check if only the current task is running on the CPU.
++ *
++ * Caution: this function does not check that the caller has disabled
++ * preemption, thus the result might have a time-of-check-to-time-of-use
++ * race. The caller is responsible for using it correctly, for example:
++ *
++ * - from a non-preemptible section (of course)
++ *
++ * - from a thread that is bound to a single CPU
++ *
++ * - in a loop with very short iterations (e.g. a polling loop)
++ */
++bool single_task_running(void)
++{
++ return raw_rq()->nr_running == 1;
++}
++EXPORT_SYMBOL(single_task_running);
++
++unsigned long long nr_context_switches_cpu(int cpu)
++{
++ return cpu_rq(cpu)->nr_switches;
++}
++
++unsigned long long nr_context_switches(void)
++{
++ int i;
++ unsigned long long sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += cpu_rq(i)->nr_switches;
++
++ return sum;
++}
++
++/*
++ * Consumers of these two interfaces, like for example the cpuidle menu
++ * governor, are using nonsensical data: they prefer shallow idle state
++ * selection for a CPU that has IO-wait pending, even though the waiting task
++ * might not even end up running on that CPU when it does become runnable.
++ */
++
++unsigned int nr_iowait_cpu(int cpu)
++{
++ return atomic_read(&cpu_rq(cpu)->nr_iowait);
++}
++
++/*
++ * IO-wait accounting, and how it's mostly bollocks (on SMP).
++ *
++ * The idea behind IO-wait accounting is to account the idle time that we could
++ * have spent running if it were not for IO. That is, if we were to improve the
++ * storage performance, we'd have a proportional reduction in IO-wait time.
++ *
++ * This all works nicely on UP, where, when a task blocks on IO, we account
++ * idle time as IO-wait, because if the storage were faster, it could've been
++ * running and we'd not be idle.
++ *
++ * This has been extended to SMP, by doing the same for each CPU. This however
++ * is broken.
++ *
++ * Imagine for instance the case where two tasks block on one CPU, only the one
++ * CPU will have IO-wait accounted, while the other has regular idle. Even
++ * though, if the storage were faster, both could have run at the same time,
++ * utilising both CPUs.
++ *
++ * This means that, when looking globally, the current IO-wait accounting on
++ * SMP is a lower bound, due to under-accounting.
++ *
++ * Worse, since the numbers are provided per CPU, they are sometimes
++ * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
++ * associated with any one particular CPU, it can wake to another CPU than it
++ * blocked on. This means the per CPU IO-wait number is meaningless.
++ *
++ * Task CPU affinities can make all that even more 'interesting'.
++ */
++
++unsigned int nr_iowait(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += nr_iowait_cpu(i);
++
++ return sum;
++}
++
++#ifdef CONFIG_SMP
++
++/*
++ * sched_exec - execve() is a valuable balancing opportunity, because at
++ * this point the task has the smallest effective memory and cache
++ * footprint.
++ */
++void sched_exec(void)
++{
++}
++
++#endif
++
++DEFINE_PER_CPU(struct kernel_stat, kstat);
++DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
++
++EXPORT_PER_CPU_SYMBOL(kstat);
++EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
++
++static inline void update_curr(struct rq *rq, struct task_struct *p)
++{
++ s64 ns = rq->clock_task - p->last_ran;
++
++ p->sched_time += ns;
++ cgroup_account_cputime(p, ns);
++ account_group_exec_runtime(p, ns);
++
++ p->time_slice -= ns;
++ p->last_ran = rq->clock_task;
++}
++
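++/*
++ * Worked example for update_curr() above (numbers are only for
++ * illustration): if 1.2 ms have passed since p->last_ran, ns is 1200000;
++ * p->sched_time grows by that amount, p->time_slice shrinks by it, and once
++ * time_slice drops below RESCHED_NS the tick path below asks for a
++ * reschedule.
++ */
++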
++/*
++ * Return accounted runtime for the task.
++ * Return separately the current's pending runtime that has not been
++ * accounted yet.
++ */
++unsigned long long task_sched_runtime(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ u64 ns;
++
++#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
++ /*
++ * 64-bit doesn't need locks to atomically read a 64-bit value.
++	 * So we have an optimization chance when the task's delta_exec is 0.
++ * Reading ->on_cpu is racy, but this is ok.
++ *
++ * If we race with it leaving CPU, we'll take a lock. So we're correct.
++ * If we race with it entering CPU, unaccounted time is 0. This is
++ * indistinguishable from the read occurring a few cycles earlier.
++ * If we see ->on_cpu without ->on_rq, the task is leaving, and has
++ * been accounted, so we're correct here as well.
++ */
++ if (!p->on_cpu || !task_on_rq_queued(p))
++ return tsk_seruntime(p);
++#endif
++
++ rq = task_access_lock_irqsave(p, &lock, &flags);
++ /*
++ * Must be ->curr _and_ ->on_rq. If dequeued, we would
++ * project cycles that may never be accounted to this
++ * thread, breaking clock_gettime().
++ */
++ if (p == rq->curr && task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ update_curr(rq, p);
++ }
++ ns = tsk_seruntime(p);
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ return ns;
++}
++
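++/*
++ * Usage note (general observation, hedged): per-task CPU clock readers
++ * typically end up here; for example the POSIX per-thread CPU clock
++ * (clock_gettime(CLOCK_THREAD_CPUTIME_ID, ...)) is normally serviced via the
++ * posix-cpu-timers code calling task_sched_runtime().
++ */
++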
++/* This manages tasks that have run out of timeslice during a scheduler_tick */
++static inline void scheduler_task_tick(struct rq *rq)
++{
++ struct task_struct *p = rq->curr;
++
++ if (is_idle_task(p))
++ return;
++
++ update_curr(rq, p);
++ cpufreq_update_util(rq, 0);
++
++ /*
++	 * Tasks that have less than RESCHED_NS of time slice left will be
++	 * rescheduled.
++ */
++ if (p->time_slice >= RESCHED_NS)
++ return;
++ set_tsk_need_resched(p);
++ set_preempt_need_resched();
++}
++
++#ifdef CONFIG_SCHED_DEBUG
++static u64 cpu_resched_latency(struct rq *rq)
++{
++ int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
++ u64 resched_latency, now = rq_clock(rq);
++ static bool warned_once;
++
++ if (sysctl_resched_latency_warn_once && warned_once)
++ return 0;
++
++ if (!need_resched() || !latency_warn_ms)
++ return 0;
++
++ if (system_state == SYSTEM_BOOTING)
++ return 0;
++
++ if (!rq->last_seen_need_resched_ns) {
++ rq->last_seen_need_resched_ns = now;
++ rq->ticks_without_resched = 0;
++ return 0;
++ }
++
++ rq->ticks_without_resched++;
++ resched_latency = now - rq->last_seen_need_resched_ns;
++ if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
++ return 0;
++
++ warned_once = true;
++
++ return resched_latency;
++}
++
++static int __init setup_resched_latency_warn_ms(char *str)
++{
++ long val;
++
++ if ((kstrtol(str, 0, &val))) {
++ pr_warn("Unable to set resched_latency_warn_ms\n");
++ return 1;
++ }
++
++ sysctl_resched_latency_warn_ms = val;
++ return 1;
++}
++__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
++#else
++static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
++#endif /* CONFIG_SCHED_DEBUG */
++
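++/*
++ * Usage sketch (illustrative): booting with "resched_latency_warn_ms=50"
++ * sets the warning threshold to 50 ms; with the LATENCY_WARN scheduler
++ * feature enabled, scheduler_tick() below reports via resched_latency_warn()
++ * when a pending need_resched has been ignored for longer than that.
++ */
++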
++/*
++ * This function gets called by the timer code, with HZ frequency.
++ * We call it with interrupts disabled.
++ */
++void scheduler_tick(void)
++{
++ int cpu __maybe_unused = smp_processor_id();
++ struct rq *rq = cpu_rq(cpu);
++ u64 resched_latency;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
++ arch_scale_freq_tick();
++
++ sched_clock_tick();
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ scheduler_task_tick(rq);
++ if (sched_feat(LATENCY_WARN))
++ resched_latency = cpu_resched_latency(rq);
++ calc_global_load_tick(rq);
++
++ task_tick_mm_cid(rq, rq->curr);
++
++ rq->last_tick = rq->clock;
++ raw_spin_unlock(&rq->lock);
++
++ if (sched_feat(LATENCY_WARN) && resched_latency)
++ resched_latency_warn(cpu, resched_latency);
++
++ perf_event_task_tick();
++}
++
++#ifdef CONFIG_SCHED_SMT
++static inline int sg_balance_cpu_stop(void *data)
++{
++ struct rq *rq = this_rq();
++ struct task_struct *p = data;
++ cpumask_t tmp;
++ unsigned long flags;
++
++ local_irq_save(flags);
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++
++ rq->active_balance = 0;
++ /* _something_ may have changed the task, double check again */
++ if (task_on_rq_queued(p) && task_rq(p) == rq &&
++ cpumask_and(&tmp, p->cpus_ptr, &sched_sg_idle_mask) &&
++ !is_migration_disabled(p)) {
++ int cpu = cpu_of(rq);
++ int dcpu = __best_mask_cpu(&tmp, per_cpu(sched_cpu_llc_mask, cpu));
++ rq = move_queued_task(rq, p, dcpu);
++ }
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock(&p->pi_lock);
++
++ local_irq_restore(flags);
++
++ return 0;
++}
++
++/* sg_balance_trigger - trigger sibling group balance for @cpu */
++static inline int sg_balance_trigger(const int cpu)
++{
++	struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++ struct task_struct *curr;
++ int res;
++
++ if (!raw_spin_trylock_irqsave(&rq->lock, flags))
++ return 0;
++ curr = rq->curr;
++	res = (!is_idle_task(curr)) && (1 == rq->nr_running) &&
++	      cpumask_intersects(curr->cpus_ptr, &sched_sg_idle_mask) &&
++	      !is_migration_disabled(curr) && (!rq->active_balance);
++
++ if (res)
++ rq->active_balance = 1;
++
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ if (res)
++ stop_one_cpu_nowait(cpu, sg_balance_cpu_stop, curr,
++ &rq->active_balance_work);
++ return res;
++}
++
++/*
++ * sg_balance - sibling group balance check for run queue @rq
++ */
++static inline void sg_balance(struct rq *rq, int cpu)
++{
++ cpumask_t chk;
++
++ /* exit when cpu is offline */
++ if (unlikely(!rq->online))
++ return;
++
++ /*
++	 * Only a cpu whose sibling group is idle will do the checking and then
++	 * look for busy cpus whose currently running task can be migrated here
++ */
++ if (cpumask_test_cpu(cpu, &sched_sg_idle_mask) &&
++ cpumask_andnot(&chk, cpu_online_mask, sched_idle_mask) &&
++ cpumask_andnot(&chk, &chk, &sched_rq_pending_mask)) {
++ int i;
++
++ for_each_cpu_wrap(i, &chk, cpu) {
++			if (!cpumask_intersects(cpu_smt_mask(i), sched_idle_mask) &&
++ sg_balance_trigger(i))
++ return;
++ }
++ }
++}
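++/*
++ * In short: sg_balance() runs at the end of __schedule() on a CPU sitting in
++ * sched_sg_idle_mask (i.e. its whole SMT sibling group is idle).  It scans
++ * the remaining busy CPUs and, via sg_balance_trigger() ->
++ * stop_one_cpu_nowait(sg_balance_cpu_stop), moves a lone running task whose
++ * affinity allows it onto a CPU picked from sched_sg_idle_mask.
++ */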
++#endif /* CONFIG_SCHED_SMT */
++
++#ifdef CONFIG_NO_HZ_FULL
++
++struct tick_work {
++ int cpu;
++ atomic_t state;
++ struct delayed_work work;
++};
++/* Values for ->state, see diagram below. */
++#define TICK_SCHED_REMOTE_OFFLINE 0
++#define TICK_SCHED_REMOTE_OFFLINING 1
++#define TICK_SCHED_REMOTE_RUNNING 2
++
++/*
++ * State diagram for ->state:
++ *
++ *
++ * TICK_SCHED_REMOTE_OFFLINE
++ * | ^
++ * | |
++ * | | sched_tick_remote()
++ * | |
++ * | |
++ * +--TICK_SCHED_REMOTE_OFFLINING
++ * | ^
++ * | |
++ * sched_tick_start() | | sched_tick_stop()
++ * | |
++ * V |
++ * TICK_SCHED_REMOTE_RUNNING
++ *
++ *
++ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
++ * and sched_tick_start() are happy to leave the state in RUNNING.
++ */
++
++static struct tick_work __percpu *tick_work_cpu;
++
++static void sched_tick_remote(struct work_struct *work)
++{
++ struct delayed_work *dwork = to_delayed_work(work);
++ struct tick_work *twork = container_of(dwork, struct tick_work, work);
++ int cpu = twork->cpu;
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *curr;
++ unsigned long flags;
++ u64 delta;
++ int os;
++
++ /*
++ * Handle the tick only if it appears the remote CPU is running in full
++ * dynticks mode. The check is racy by nature, but missing a tick or
++	 * having one too many is no big deal because the scheduler tick updates
++ * statistics and checks timeslices in a time-independent way, regardless
++ * of when exactly it is running.
++ */
++ if (!tick_nohz_tick_stopped_cpu(cpu))
++ goto out_requeue;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ curr = rq->curr;
++ if (cpu_is_offline(cpu))
++ goto out_unlock;
++
++ update_rq_clock(rq);
++ if (!is_idle_task(curr)) {
++ /*
++ * Make sure the next tick runs within a reasonable
++ * amount of time.
++ */
++ delta = rq_clock_task(rq) - curr->last_ran;
++ WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
++ }
++ scheduler_task_tick(rq);
++
++ calc_load_nohz_remote(rq);
++out_unlock:
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++out_requeue:
++ /*
++ * Run the remote tick once per second (1Hz). This arbitrary
++ * frequency is large enough to avoid overload but short enough
++ * to keep scheduler internal stats reasonably up to date. But
++ * first update state to reflect hotplug activity if required.
++ */
++ os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
++ if (os == TICK_SCHED_REMOTE_RUNNING)
++ queue_delayed_work(system_unbound_wq, dwork, HZ);
++}
++
++static void sched_tick_start(int cpu)
++{
++ int os;
++ struct tick_work *twork;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
++ if (os == TICK_SCHED_REMOTE_OFFLINE) {
++ twork->cpu = cpu;
++ INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
++ queue_delayed_work(system_unbound_wq, &twork->work, HZ);
++ }
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++static void sched_tick_stop(int cpu)
++{
++ struct tick_work *twork;
++ int os;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ /* There cannot be competing actions, but don't rely on stop-machine. */
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_OFFLINING);
++ WARN_ON_ONCE(os != TICK_SCHED_REMOTE_RUNNING);
++ /* Don't cancel, as this would mess up the state machine. */
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++int __init sched_tick_offload_init(void)
++{
++ tick_work_cpu = alloc_percpu(struct tick_work);
++ BUG_ON(!tick_work_cpu);
++ return 0;
++}
++
++#else /* !CONFIG_NO_HZ_FULL */
++static inline void sched_tick_start(int cpu) { }
++static inline void sched_tick_stop(int cpu) { }
++#endif
++
++#if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
++ defined(CONFIG_PREEMPT_TRACER))
++/*
++ * If the value passed in is equal to the current preempt count
++ * then we just disabled preemption. Start timing the latency.
++ */
++static inline void preempt_latency_start(int val)
++{
++ if (preempt_count() == val) {
++ unsigned long ip = get_lock_parent_ip();
++#ifdef CONFIG_DEBUG_PREEMPT
++ current->preempt_disable_ip = ip;
++#endif
++ trace_preempt_off(CALLER_ADDR0, ip);
++ }
++}
++
++void preempt_count_add(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON((preempt_count() < 0)))
++ return;
++#endif
++ __preempt_count_add(val);
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Spinlock count overflowing soon?
++ */
++ DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
++ PREEMPT_MASK - 10);
++#endif
++ preempt_latency_start(val);
++}
++EXPORT_SYMBOL(preempt_count_add);
++NOKPROBE_SYMBOL(preempt_count_add);
++
++/*
++ * If the value passed in equals the current preempt count
++ * then we just enabled preemption. Stop timing the latency.
++ */
++static inline void preempt_latency_stop(int val)
++{
++ if (preempt_count() == val)
++ trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
++}
++
++void preempt_count_sub(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
++ return;
++ /*
++ * Is the spinlock portion underflowing?
++ */
++ if (DEBUG_LOCKS_WARN_ON((val < PREEMPT_MASK) &&
++ !(preempt_count() & PREEMPT_MASK)))
++ return;
++#endif
++
++ preempt_latency_stop(val);
++ __preempt_count_sub(val);
++}
++EXPORT_SYMBOL(preempt_count_sub);
++NOKPROBE_SYMBOL(preempt_count_sub);
++
++#else
++static inline void preempt_latency_start(int val) { }
++static inline void preempt_latency_stop(int val) { }
++#endif
++
++static inline unsigned long get_preempt_disable_ip(struct task_struct *p)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ return p->preempt_disable_ip;
++#else
++ return 0;
++#endif
++}
++
++/*
++ * Print scheduling while atomic bug:
++ */
++static noinline void __schedule_bug(struct task_struct *prev)
++{
++ /* Save this before calling printk(), since that will clobber it */
++ unsigned long preempt_disable_ip = get_preempt_disable_ip(current);
++
++ if (oops_in_progress)
++ return;
++
++ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
++ prev->comm, prev->pid, preempt_count());
++
++ debug_show_held_locks(prev);
++ print_modules();
++ if (irqs_disabled())
++ print_irqtrace_events(prev);
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)
++ && in_atomic_preempt_off()) {
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, preempt_disable_ip);
++ }
++ check_panic_on_warn("scheduling while atomic");
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++
++/*
++ * Various schedule()-time debugging checks and statistics:
++ */
++static inline void schedule_debug(struct task_struct *prev, bool preempt)
++{
++#ifdef CONFIG_SCHED_STACK_END_CHECK
++ if (task_stack_end_corrupted(prev))
++ panic("corrupted stack end detected inside scheduler\n");
++
++ if (task_scs_end_corrupted(prev))
++ panic("corrupted shadow stack detected inside scheduler\n");
++#endif
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++ if (!preempt && READ_ONCE(prev->__state) && prev->non_block_count) {
++ printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
++ prev->comm, prev->pid, prev->non_block_count);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++ }
++#endif
++
++ if (unlikely(in_atomic_preempt_off())) {
++ __schedule_bug(prev);
++ preempt_count_set(PREEMPT_DISABLED);
++ }
++ rcu_sleep_check();
++ SCHED_WARN_ON(ct_state() == CONTEXT_USER);
++
++ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
++
++ schedstat_inc(this_rq()->sched_count);
++}
++
++#ifdef ALT_SCHED_DEBUG
++void alt_sched_debug(void)
++{
++ printk(KERN_INFO "sched: pending: 0x%04lx, idle: 0x%04lx, sg_idle: 0x%04lx\n",
++ sched_rq_pending_mask.bits[0],
++ sched_idle_mask->bits[0],
++ sched_sg_idle_mask.bits[0]);
++}
++#else
++inline void alt_sched_debug(void) {}
++#endif
++
++#ifdef CONFIG_SMP
++
++#ifdef CONFIG_PREEMPT_RT
++#define SCHED_NR_MIGRATE_BREAK 8
++#else
++#define SCHED_NR_MIGRATE_BREAK 32
++#endif
++
++const_debug unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
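++/*
++ * sysctl_sched_nr_migrate bounds the work done while both runqueue locks are
++ * held: migrate_pending_tasks() examines at most
++ * min(src->nr_running / 2, sysctl_sched_nr_migrate) tasks per source
++ * runqueue, i.e. by default no more than 8 on PREEMPT_RT and 32 otherwise.
++ */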
++
++/*
++ * Migrate pending tasks in @rq to @dest_cpu
++ */
++static inline int
++migrate_pending_tasks(struct rq *rq, struct rq *dest_rq, const int dest_cpu)
++{
++ struct task_struct *p, *skip = rq->curr;
++ int nr_migrated = 0;
++ int nr_tries = min(rq->nr_running / 2, sysctl_sched_nr_migrate);
++
++	/* Workaround to check that rq->curr is still on the rq */
++ if (!task_on_rq_queued(skip))
++ return 0;
++
++ while (skip != rq->idle && nr_tries &&
++ (p = sched_rq_next_task(skip, rq)) != rq->idle) {
++ skip = sched_rq_next_task(p, rq);
++ if (cpumask_test_cpu(dest_cpu, p->cpus_ptr)) {
++ __SCHED_DEQUEUE_TASK(p, rq, 0, );
++ set_task_cpu(p, dest_cpu);
++ sched_task_sanity_check(p, dest_rq);
++ sched_mm_cid_migrate_to(dest_rq, p, cpu_of(rq));
++ __SCHED_ENQUEUE_TASK(p, dest_rq, 0);
++ nr_migrated++;
++ }
++ nr_tries--;
++ }
++
++ return nr_migrated;
++}
++
++static inline int take_other_rq_tasks(struct rq *rq, int cpu)
++{
++ struct cpumask *topo_mask, *end_mask;
++
++ if (unlikely(!rq->online))
++ return 0;
++
++ if (cpumask_empty(&sched_rq_pending_mask))
++ return 0;
++
++ topo_mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++ end_mask = per_cpu(sched_cpu_topo_end_mask, cpu);
++ do {
++ int i;
++ for_each_cpu_and(i, &sched_rq_pending_mask, topo_mask) {
++ int nr_migrated;
++ struct rq *src_rq;
++
++ src_rq = cpu_rq(i);
++ if (!do_raw_spin_trylock(&src_rq->lock))
++ continue;
++ spin_acquire(&src_rq->lock.dep_map,
++ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
++
++ if ((nr_migrated = migrate_pending_tasks(src_rq, rq, cpu))) {
++ src_rq->nr_running -= nr_migrated;
++ if (src_rq->nr_running < 2)
++ cpumask_clear_cpu(i, &sched_rq_pending_mask);
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++
++ rq->nr_running += nr_migrated;
++ if (rq->nr_running > 1)
++ cpumask_set_cpu(cpu, &sched_rq_pending_mask);
++
++ update_sched_preempt_mask(rq);
++ cpufreq_update_util(rq, 0);
++
++ return 1;
++ }
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++ }
++ } while (++topo_mask < end_mask);
++
++ return 0;
++}
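++/*
++ * take_other_rq_tasks() walks the per-CPU topology masks (note the "+ 1": the
++ * walk starts one level above the CPU's own mask) and, for each pending source
++ * runqueue it manages to trylock, pulls whatever migrate_pending_tasks() can
++ * move before dropping that lock again.
++ */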
++#endif
++
++/*
++ * Timeslices below RESCHED_NS are considered as good as expired as there's no
++ * point rescheduling when there's so little time left.
++ */
++static inline void check_curr(struct task_struct *p, struct rq *rq)
++{
++ if (unlikely(rq->idle == p))
++ return;
++
++ update_curr(rq, p);
++
++ if (p->time_slice < RESCHED_NS)
++ time_slice_expired(p, rq);
++}
++
++static inline struct task_struct *
++choose_next_task(struct rq *rq, int cpu)
++{
++ struct task_struct *next;
++
++ if (unlikely(rq->skip)) {
++ next = rq_runnable_task(rq);
++ if (next == rq->idle) {
++#ifdef CONFIG_SMP
++ if (!take_other_rq_tasks(rq, cpu)) {
++#endif
++ rq->skip = NULL;
++ schedstat_inc(rq->sched_goidle);
++ return next;
++#ifdef CONFIG_SMP
++ }
++ next = rq_runnable_task(rq);
++#endif
++ }
++ rq->skip = NULL;
++#ifdef CONFIG_HIGH_RES_TIMERS
++ hrtick_start(rq, next->time_slice);
++#endif
++ return next;
++ }
++
++ next = sched_rq_first_task(rq);
++ if (next == rq->idle) {
++#ifdef CONFIG_SMP
++ if (!take_other_rq_tasks(rq, cpu)) {
++#endif
++ schedstat_inc(rq->sched_goidle);
++ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
++ return next;
++#ifdef CONFIG_SMP
++ }
++ next = sched_rq_first_task(rq);
++#endif
++ }
++#ifdef CONFIG_HIGH_RES_TIMERS
++ hrtick_start(rq, next->time_slice);
++#endif
++ /*printk(KERN_INFO "sched: choose_next_task(%d) next %px\n", cpu, next);*/
++ return next;
++}
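++/*
++ * Pick order in choose_next_task(): honour a one-shot rq->skip hint (set by
++ * sched_yield type 2) first, otherwise take the first queued task; if only
++ * the idle task is left, try to refill the queue from other runqueues via
++ * take_other_rq_tasks() before going idle.
++ */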
++
++/*
++ * Constants for the sched_mode argument of __schedule().
++ *
++ * The mode argument allows RT enabled kernels to differentiate a
++ * preemption from blocking on an 'sleeping' spin/rwlock. Note that
++ * SM_MASK_PREEMPT for !RT has all bits set, which allows the compiler to
++ * optimize the AND operation out and just check for zero.
++ */
++#define SM_NONE 0x0
++#define SM_PREEMPT 0x1
++#define SM_RTLOCK_WAIT 0x2
++
++#ifndef CONFIG_PREEMPT_RT
++# define SM_MASK_PREEMPT (~0U)
++#else
++# define SM_MASK_PREEMPT SM_PREEMPT
++#endif
++
++/*
++ * schedule() is the main scheduler function.
++ *
++ * The main means of driving the scheduler and thus entering this function are:
++ *
++ * 1. Explicit blocking: mutex, semaphore, waitqueue, etc.
++ *
++ * 2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
++ * paths. For example, see arch/x86/entry_64.S.
++ *
++ * To drive preemption between tasks, the scheduler sets the flag in timer
++ * interrupt handler scheduler_tick().
++ *
++ * 3. Wakeups don't really cause entry into schedule(). They add a
++ * task to the run-queue and that's it.
++ *
++ * Now, if the new task added to the run-queue preempts the current
++ * task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
++ * called on the nearest possible occasion:
++ *
++ * - If the kernel is preemptible (CONFIG_PREEMPTION=y):
++ *
++ *      - in syscall or exception context, at the next outermost
++ * preempt_enable(). (this might be as soon as the wake_up()'s
++ * spin_unlock()!)
++ *
++ * - in IRQ context, return from interrupt-handler to
++ * preemptible context
++ *
++ * - If the kernel is not preemptible (CONFIG_PREEMPTION is not set)
++ * then at the next:
++ *
++ * - cond_resched() call
++ * - explicit schedule() call
++ * - return from syscall or exception to user-space
++ * - return from interrupt-handler to user-space
++ *
++ * WARNING: must be called with preemption disabled!
++ */
++static void __sched notrace __schedule(unsigned int sched_mode)
++{
++ struct task_struct *prev, *next;
++ unsigned long *switch_count;
++ unsigned long prev_state;
++ struct rq *rq;
++ int cpu;
++
++ cpu = smp_processor_id();
++ rq = cpu_rq(cpu);
++ prev = rq->curr;
++
++ schedule_debug(prev, !!sched_mode);
++
++	/* bypass the sched_feat(HRTICK) check, which Alt schedule FW doesn't support */
++ hrtick_clear(rq);
++
++ local_irq_disable();
++ rcu_note_context_switch(!!sched_mode);
++
++ /*
++ * Make sure that signal_pending_state()->signal_pending() below
++ * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
++ * done by the caller to avoid the race with signal_wake_up():
++ *
++ * __set_current_state(@state) signal_wake_up()
++ * schedule() set_tsk_thread_flag(p, TIF_SIGPENDING)
++ * wake_up_state(p, state)
++ * LOCK rq->lock LOCK p->pi_state
++ * smp_mb__after_spinlock() smp_mb__after_spinlock()
++ * if (signal_pending_state()) if (p->state & @state)
++ *
++ * Also, the membarrier system call requires a full memory barrier
++ * after coming from user-space, before storing to rq->curr.
++ */
++ raw_spin_lock(&rq->lock);
++ smp_mb__after_spinlock();
++
++ update_rq_clock(rq);
++
++ switch_count = &prev->nivcsw;
++ /*
++ * We must load prev->state once (task_struct::state is volatile), such
++ * that we form a control dependency vs deactivate_task() below.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) {
++ if (signal_pending_state(prev_state, prev)) {
++ WRITE_ONCE(prev->__state, TASK_RUNNING);
++ } else {
++ prev->sched_contributes_to_load =
++ (prev_state & TASK_UNINTERRUPTIBLE) &&
++ !(prev_state & TASK_NOLOAD) &&
++ !(prev_state & TASK_FROZEN);
++
++ if (prev->sched_contributes_to_load)
++ rq->nr_uninterruptible++;
++
++ /*
++ * __schedule() ttwu()
++ * prev_state = prev->state; if (p->on_rq && ...)
++ * if (prev_state) goto out;
++ * p->on_rq = 0; smp_acquire__after_ctrl_dep();
++ * p->state = TASK_WAKING
++ *
++ * Where __schedule() and ttwu() have matching control dependencies.
++ *
++ * After this, schedule() must not care about p->state any more.
++ */
++ sched_task_deactivate(prev, rq);
++ deactivate_task(prev, rq);
++
++ if (prev->in_iowait) {
++ atomic_inc(&rq->nr_iowait);
++ delayacct_blkio_start();
++ }
++ }
++ switch_count = &prev->nvcsw;
++ }
++
++ check_curr(prev, rq);
++
++ next = choose_next_task(rq, cpu);
++ clear_tsk_need_resched(prev);
++ clear_preempt_need_resched();
++#ifdef CONFIG_SCHED_DEBUG
++ rq->last_seen_need_resched_ns = 0;
++#endif
++
++ if (likely(prev != next)) {
++ next->last_ran = rq->clock_task;
++ rq->last_ts_switch = rq->clock;
++
++ /*printk(KERN_INFO "sched: %px -> %px\n", prev, next);*/
++ rq->nr_switches++;
++ /*
++ * RCU users of rcu_dereference(rq->curr) may not see
++ * changes to task_struct made by pick_next_task().
++ */
++ RCU_INIT_POINTER(rq->curr, next);
++ /*
++ * The membarrier system call requires each architecture
++ * to have a full memory barrier after updating
++ * rq->curr, before returning to user-space.
++ *
++ * Here are the schemes providing that barrier on the
++ * various architectures:
++ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC.
++ * switch_mm() rely on membarrier_arch_switch_mm() on PowerPC.
++ * - finish_lock_switch() for weakly-ordered
++ * architectures where spin_unlock is a full barrier,
++ * - switch_to() for arm64 (weakly-ordered, spin_unlock
++ * is a RELEASE barrier),
++ */
++ ++*switch_count;
++
++ trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next, prev_state);
++
++ /* Also unlocks the rq: */
++ rq = context_switch(rq, prev, next);
++
++ cpu = cpu_of(rq);
++ } else {
++ __balance_callbacks(rq);
++ raw_spin_unlock_irq(&rq->lock);
++ }
++
++#ifdef CONFIG_SCHED_SMT
++ sg_balance(rq, cpu);
++#endif
++}
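++/*
++ * __schedule() in brief: with the rq lock held it deactivates a sleeping
++ * prev (unless this is a preemption), charges prev's timeslice via
++ * check_curr(), picks the successor with choose_next_task() and, if it
++ * differs from prev, context-switches; otherwise it just runs the balance
++ * callbacks and releases the lock.
++ */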
++
++void __noreturn do_task_dead(void)
++{
++ /* Causes final put_task_struct in finish_task_switch(): */
++ set_special_state(TASK_DEAD);
++
++ /* Tell freezer to ignore us: */
++ current->flags |= PF_NOFREEZE;
++
++ __schedule(SM_NONE);
++ BUG();
++
++ /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
++ for (;;)
++ cpu_relax();
++}
++
++static inline void sched_submit_work(struct task_struct *tsk)
++{
++ unsigned int task_flags;
++
++ if (task_is_running(tsk))
++ return;
++
++ task_flags = tsk->flags;
++ /*
++ * If a worker goes to sleep, notify and ask workqueue whether it
++ * wants to wake up a task to maintain concurrency.
++ */
++ if (task_flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
++ if (task_flags & PF_WQ_WORKER)
++ wq_worker_sleeping(tsk);
++ else
++ io_wq_worker_sleeping(tsk);
++ }
++
++ /*
++ * spinlock and rwlock must not flush block requests. This will
++ * deadlock if the callback attempts to acquire a lock which is
++ * already acquired.
++ */
++ SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
++
++ /*
++ * If we are going to sleep and we have plugged IO queued,
++ * make sure to submit it to avoid deadlocks.
++ */
++ blk_flush_plug(tsk->plug, true);
++}
++
++static void sched_update_worker(struct task_struct *tsk)
++{
++ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
++ if (tsk->flags & PF_WQ_WORKER)
++ wq_worker_running(tsk);
++ else
++ io_wq_worker_running(tsk);
++ }
++}
++
++asmlinkage __visible void __sched schedule(void)
++{
++ struct task_struct *tsk = current;
++
++ sched_submit_work(tsk);
++ do {
++ preempt_disable();
++ __schedule(SM_NONE);
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++ sched_update_worker(tsk);
++}
++EXPORT_SYMBOL(schedule);
++
++/*
++ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
++ * state (have scheduled out non-voluntarily) by making sure that all
++ * tasks have either left the run queue or have gone into user space.
++ * As idle tasks do not do either, they must not ever be preempted
++ * (schedule out non-voluntarily).
++ *
++ * schedule_idle() is similar to schedule_preempt_disabled() except that it
++ * never enables preemption because it does not call sched_submit_work().
++ */
++void __sched schedule_idle(void)
++{
++ /*
++ * As this skips calling sched_submit_work(), which the idle task does
++ * regardless because that function is a nop when the task is in a
++ * TASK_RUNNING state, make sure this isn't used someplace that the
++ * current task can be in any other state. Note, idle is always in the
++ * TASK_RUNNING state.
++ */
++ WARN_ON_ONCE(current->__state);
++ do {
++ __schedule(SM_NONE);
++ } while (need_resched());
++}
++
++#if defined(CONFIG_CONTEXT_TRACKING_USER) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)
++asmlinkage __visible void __sched schedule_user(void)
++{
++ /*
++ * If we come here after a random call to set_need_resched(),
++ * or we have been woken up remotely but the IPI has not yet arrived,
++ * we haven't yet exited the RCU idle mode. Do it here manually until
++ * we find a better solution.
++ *
++ * NB: There are buggy callers of this function. Ideally we
++ * should warn if prev_state != CONTEXT_USER, but that will trigger
++ * too frequently to make sense yet.
++ */
++ enum ctx_state prev_state = exception_enter();
++ schedule();
++ exception_exit(prev_state);
++}
++#endif
++
++/**
++ * schedule_preempt_disabled - called with preemption disabled
++ *
++ * Returns with preemption disabled. Note: preempt_count must be 1
++ */
++void __sched schedule_preempt_disabled(void)
++{
++ sched_preempt_enable_no_resched();
++ schedule();
++ preempt_disable();
++}
++
++#ifdef CONFIG_PREEMPT_RT
++void __sched notrace schedule_rtlock(void)
++{
++ do {
++ preempt_disable();
++ __schedule(SM_RTLOCK_WAIT);
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++}
++NOKPROBE_SYMBOL(schedule_rtlock);
++#endif
++
++static void __sched notrace preempt_schedule_common(void)
++{
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ __schedule(SM_PREEMPT);
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++
++ /*
++ * Check again in case we missed a preemption opportunity
++ * between schedule and now.
++ */
++ } while (need_resched());
++}
++
++#ifdef CONFIG_PREEMPTION
++/*
++ * This is the entry point to schedule() from in-kernel preemption
++ * off of preempt_enable.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule(void)
++{
++ /*
++ * If there is a non-zero preempt_count or interrupts are disabled,
++ * we do not want to preempt the current task. Just return..
++ */
++ if (likely(!preemptible()))
++ return;
++
++ preempt_schedule_common();
++}
++NOKPROBE_SYMBOL(preempt_schedule);
++EXPORT_SYMBOL(preempt_schedule);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#ifndef preempt_schedule_dynamic_enabled
++#define preempt_schedule_dynamic_enabled preempt_schedule
++#define preempt_schedule_dynamic_disabled NULL
++#endif
++DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
++void __sched notrace dynamic_preempt_schedule(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
++ return;
++ preempt_schedule();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule);
++EXPORT_SYMBOL(dynamic_preempt_schedule);
++#endif
++#endif
++
++/**
++ * preempt_schedule_notrace - preempt_schedule called by tracing
++ *
++ * The tracing infrastructure uses preempt_enable_notrace to prevent
++ * recursion and tracing preempt enabling caused by the tracing
++ * infrastructure itself. But as tracing can happen in areas coming
++ * from userspace or just about to enter userspace, a preempt enable
++ * can occur before user_exit() is called. This will cause the scheduler
++ * to be called when the system is still in usermode.
++ *
++ * To prevent this, the preempt_enable_notrace will use this function
++ * instead of preempt_schedule() to exit user context if needed before
++ * calling the scheduler.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule_notrace(void)
++{
++ enum ctx_state prev_ctx;
++
++ if (likely(!preemptible()))
++ return;
++
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ /*
++ * Needs preempt disabled in case user_exit() is traced
++ * and the tracer calls preempt_enable_notrace() causing
++ * an infinite recursion.
++ */
++ prev_ctx = exception_enter();
++ __schedule(SM_PREEMPT);
++ exception_exit(prev_ctx);
++
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++ } while (need_resched());
++}
++EXPORT_SYMBOL_GPL(preempt_schedule_notrace);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#ifndef preempt_schedule_notrace_dynamic_enabled
++#define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace
++#define preempt_schedule_notrace_dynamic_disabled NULL
++#endif
++DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
++void __sched notrace dynamic_preempt_schedule_notrace(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
++ return;
++ preempt_schedule_notrace();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
++EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
++#endif
++#endif
++
++#endif /* CONFIG_PREEMPTION */
++
++/*
++ * This is the entry point to schedule() from kernel preemption
++ * off of irq context.
++ * Note that this is called and returns with irqs disabled. This will
++ * protect us against recursive calls from irq context.
++ */
++asmlinkage __visible void __sched preempt_schedule_irq(void)
++{
++ enum ctx_state prev_state;
++
++ /* Catch callers which need to be fixed */
++ BUG_ON(preempt_count() || !irqs_disabled());
++
++ prev_state = exception_enter();
++
++ do {
++ preempt_disable();
++ local_irq_enable();
++ __schedule(SM_PREEMPT);
++ local_irq_disable();
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++
++ exception_exit(prev_state);
++}
++
++int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
++ void *key)
++{
++ WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~WF_SYNC);
++ return try_to_wake_up(curr->private, mode, wake_flags);
++}
++EXPORT_SYMBOL(default_wake_function);
++
++static inline void check_task_changed(struct task_struct *p, struct rq *rq)
++{
++ /* Trigger resched if task sched_prio has been modified. */
++ if (task_on_rq_queued(p)) {
++ int idx;
++
++ update_rq_clock(rq);
++ idx = task_sched_prio_idx(p, rq);
++ if (idx != p->sq_idx) {
++ requeue_task(p, rq, idx);
++ check_preempt_curr(rq);
++ }
++ }
++}
++
++static void __setscheduler_prio(struct task_struct *p, int prio)
++{
++ p->prio = prio;
++}
++
++#ifdef CONFIG_RT_MUTEXES
++
++static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
++{
++ if (pi_task)
++ prio = min(prio, pi_task->prio);
++
++ return prio;
++}
++
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ struct task_struct *pi_task = rt_mutex_get_top_task(p);
++
++ return __rt_effective_prio(pi_task, prio);
++}
++
++/*
++ * rt_mutex_setprio - set the current priority of a task
++ * @p: task to boost
++ * @pi_task: donor task
++ *
++ * This function changes the 'effective' priority of a task. It does
++ * not touch ->normal_prio like __setscheduler().
++ *
++ * Used by the rt_mutex code to implement priority inheritance
++ * logic. Call site only calls if the priority of the task changed.
++ */
++void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
++{
++ int prio;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ /* XXX used to be waiter->prio, not waiter->task->prio */
++ prio = __rt_effective_prio(pi_task, p->normal_prio);
++
++ /*
++ * If nothing changed; bail early.
++ */
++ if (p->pi_top_task == pi_task && prio == p->prio)
++ return;
++
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Set under pi_lock && rq->lock, such that the value can be used under
++ * either lock.
++ *
++	 * Note that there is plenty of trickery needed to make this pointer cache work
++ * right. rt_mutex_slowunlock()+rt_mutex_postunlock() work together to
++ * ensure a task is de-boosted (pi_task is set to NULL) before the
++ * task is allowed to run again (and can exit). This ensures the pointer
++ * points to a blocked task -- which guarantees the task is present.
++ */
++ p->pi_top_task = pi_task;
++
++ /*
++ * For FIFO/RR we only need to set prio, if that matches we're done.
++ */
++ if (prio == p->prio)
++ goto out_unlock;
++
++ /*
++ * Idle task boosting is a nono in general. There is one
++ * exception, when PREEMPT_RT and NOHZ is active:
++ *
++ * The idle task calls get_next_timer_interrupt() and holds
++ * the timer wheel base->lock on the CPU and another CPU wants
++ * to access the timer (probably to cancel it). We can safely
++ * ignore the boosting request, as the idle CPU runs this code
++ * with interrupts disabled and will complete the lock
++ * protected section without being interrupted. So there is no
++ * real need to boost.
++ */
++ if (unlikely(p == rq->idle)) {
++ WARN_ON(p != rq->curr);
++ WARN_ON(p->pi_blocked_on);
++ goto out_unlock;
++ }
++
++ trace_sched_pi_setprio(p, pi_task);
++
++ __setscheduler_prio(p, prio);
++
++ check_task_changed(p, rq);
++out_unlock:
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++
++ __balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++
++ preempt_enable();
++}
++#else
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ return prio;
++}
++#endif
++
++void set_user_nice(struct task_struct *p, long nice)
++{
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
++ return;
++ /*
++ * We have to be careful, if called from sys_setpriority(),
++ * the task might be in the middle of scheduling on another CPU.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ rq = __task_access_lock(p, &lock);
++
++ p->static_prio = NICE_TO_PRIO(nice);
++ /*
++ * The RT priorities are set via sched_setscheduler(), but we still
++ * allow the 'normal' nice value to be set - but as expected
++	 * it won't have any effect on scheduling as long as the task is
++	 * not SCHED_NORMAL/SCHED_BATCH:
++ */
++ if (task_has_rt_policy(p))
++ goto out_unlock;
++
++ p->prio = effective_prio(p);
++
++ check_task_changed(p, rq);
++out_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++EXPORT_SYMBOL(set_user_nice);
++
++/*
++ * is_nice_reduction - check if nice value is an actual reduction
++ *
++ * Similar to can_nice() but does not perform a capability check.
++ *
++ * @p: task
++ * @nice: nice value
++ */
++static bool is_nice_reduction(const struct task_struct *p, const int nice)
++{
++ /* Convert nice value [19,-20] to rlimit style value [1,40]: */
++ int nice_rlim = nice_to_rlimit(nice);
++
++ return (nice_rlim <= task_rlimit(p, RLIMIT_NICE));
++}
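++/*
++ * Example: nice_to_rlimit() maps nice 19 -> 1, nice 0 -> 20 and nice -20 -> 40,
++ * so a task with RLIMIT_NICE of 20 may lower its nice value to 0 but not below
++ * unless it also has CAP_SYS_NICE (see can_nice() below).
++ */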
++
++/*
++ * can_nice - check if a task can reduce its nice value
++ * @p: task
++ * @nice: nice value
++ */
++int can_nice(const struct task_struct *p, const int nice)
++{
++ return is_nice_reduction(p, nice) || capable(CAP_SYS_NICE);
++}
++
++#ifdef __ARCH_WANT_SYS_NICE
++
++/*
++ * sys_nice - change the priority of the current process.
++ * @increment: priority increment
++ *
++ * sys_setpriority is a more generic, but much slower function that
++ * does similar things.
++ */
++SYSCALL_DEFINE1(nice, int, increment)
++{
++ long nice, retval;
++
++ /*
++ * Setpriority might change our priority at the same moment.
++ * We don't have to worry. Conceptually one call occurs first
++ * and we have a single winner.
++ */
++
++ increment = clamp(increment, -NICE_WIDTH, NICE_WIDTH);
++ nice = task_nice(current) + increment;
++
++ nice = clamp_val(nice, MIN_NICE, MAX_NICE);
++ if (increment < 0 && !can_nice(current, nice))
++ return -EPERM;
++
++ retval = security_task_setnice(current, nice);
++ if (retval)
++ return retval;
++
++ set_user_nice(current, nice);
++ return 0;
++}
++
++#endif
++
++/**
++ * task_prio - return the priority value of a given task.
++ * @p: the task in question.
++ *
++ * Return: The priority value as seen by users in /proc.
++ *
++ * sched policy               return value   kernel prio    user prio/nice
++ *
++ * (BMQ) normal, batch, idle  [0 ... 53]     [100 ... 139]  0/[-20 ... 19]/[-7 ... 7]
++ * (PDS) normal, batch, idle  [0 ... 39]     100            0/[-20 ... 19]
++ * fifo, rr                   [-1 ... -100]  [99 ... 0]     [0 ... 99]
++ */
++int task_prio(const struct task_struct *p)
++{
++ return (p->prio < MAX_RT_PRIO) ? p->prio - MAX_RT_PRIO :
++ task_sched_prio_normal(p, task_rq(p));
++}
++
++/**
++ * idle_cpu - is a given CPU idle currently?
++ * @cpu: the processor in question.
++ *
++ * Return: 1 if the CPU is currently idle. 0 otherwise.
++ */
++int idle_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (rq->curr != rq->idle)
++ return 0;
++
++ if (rq->nr_running)
++ return 0;
++
++#ifdef CONFIG_SMP
++ if (rq->ttwu_pending)
++ return 0;
++#endif
++
++ return 1;
++}
++
++/**
++ * idle_task - return the idle task for a given CPU.
++ * @cpu: the processor in question.
++ *
++ * Return: The idle task for the cpu @cpu.
++ */
++struct task_struct *idle_task(int cpu)
++{
++ return cpu_rq(cpu)->idle;
++}
++
++/**
++ * find_process_by_pid - find a process with a matching PID value.
++ * @pid: the pid in question.
++ *
++ * The task of @pid, if found. %NULL otherwise.
++ */
++static inline struct task_struct *find_process_by_pid(pid_t pid)
++{
++ return pid ? find_task_by_vpid(pid) : current;
++}
++
++/*
++ * sched_setparam() passes in -1 for its policy, to let the functions
++ * it calls know not to change it.
++ */
++#define SETPARAM_POLICY -1
++
++static void __setscheduler_params(struct task_struct *p,
++ const struct sched_attr *attr)
++{
++ int policy = attr->sched_policy;
++
++ if (policy == SETPARAM_POLICY)
++ policy = p->policy;
++
++ p->policy = policy;
++
++ /*
++	 * allow the normal nice value to be set, but it will not have any
++	 * effect on scheduling as long as the task is not SCHED_NORMAL/
++	 * SCHED_BATCH
++ */
++ p->static_prio = NICE_TO_PRIO(attr->sched_nice);
++
++ /*
++ * __sched_setscheduler() ensures attr->sched_priority == 0 when
++ * !rt_policy. Always setting this ensures that things like
++ * getparam()/getattr() don't report silly values for !rt tasks.
++ */
++ p->rt_priority = attr->sched_priority;
++ p->normal_prio = normal_prio(p);
++}
++
++/*
++ * check the target process has a UID that matches the current process's
++ */
++static bool check_same_owner(struct task_struct *p)
++{
++ const struct cred *cred = current_cred(), *pcred;
++ bool match;
++
++ rcu_read_lock();
++ pcred = __task_cred(p);
++ match = (uid_eq(cred->euid, pcred->euid) ||
++ uid_eq(cred->euid, pcred->uid));
++ rcu_read_unlock();
++ return match;
++}
++
++/*
++ * Allow unprivileged RT tasks to decrease priority.
++ * Only issue a capable test if needed and only once to avoid an audit
++ * event on permitted non-privileged operations:
++ */
++static int user_check_sched_setscheduler(struct task_struct *p,
++ const struct sched_attr *attr,
++ int policy, int reset_on_fork)
++{
++ if (rt_policy(policy)) {
++ unsigned long rlim_rtprio = task_rlimit(p, RLIMIT_RTPRIO);
++
++ /* Can't set/change the rt policy: */
++ if (policy != p->policy && !rlim_rtprio)
++ goto req_priv;
++
++ /* Can't increase priority: */
++ if (attr->sched_priority > p->rt_priority &&
++ attr->sched_priority > rlim_rtprio)
++ goto req_priv;
++ }
++
++ /* Can't change other user's priorities: */
++ if (!check_same_owner(p))
++ goto req_priv;
++
++ /* Normal users shall not reset the sched_reset_on_fork flag: */
++ if (p->sched_reset_on_fork && !reset_on_fork)
++ goto req_priv;
++
++ return 0;
++
++req_priv:
++ if (!capable(CAP_SYS_NICE))
++ return -EPERM;
++
++ return 0;
++}
++
++static int __sched_setscheduler(struct task_struct *p,
++ const struct sched_attr *attr,
++ bool user, bool pi)
++{
++ const struct sched_attr dl_squash_attr = {
++ .size = sizeof(struct sched_attr),
++ .sched_policy = SCHED_FIFO,
++ .sched_nice = 0,
++ .sched_priority = 99,
++ };
++ int oldpolicy = -1, policy = attr->sched_policy;
++ int retval, newprio;
++ struct balance_callback *head;
++ unsigned long flags;
++ struct rq *rq;
++ int reset_on_fork;
++ raw_spinlock_t *lock;
++
++ /* The pi code expects interrupts enabled */
++ BUG_ON(pi && in_interrupt());
++
++ /*
++	 * Alt schedule FW supports SCHED_DEADLINE by squashing it into prio 0 SCHED_FIFO
++ */
++ if (unlikely(SCHED_DEADLINE == policy)) {
++ attr = &dl_squash_attr;
++ policy = attr->sched_policy;
++ }
++recheck:
++ /* Double check policy once rq lock held */
++ if (policy < 0) {
++ reset_on_fork = p->sched_reset_on_fork;
++ policy = oldpolicy = p->policy;
++ } else {
++ reset_on_fork = !!(attr->sched_flags & SCHED_RESET_ON_FORK);
++
++ if (policy > SCHED_IDLE)
++ return -EINVAL;
++ }
++
++ if (attr->sched_flags & ~(SCHED_FLAG_ALL))
++ return -EINVAL;
++
++ /*
++ * Valid priorities for SCHED_FIFO and SCHED_RR are
++ * 1..MAX_RT_PRIO-1, valid priority for SCHED_NORMAL and
++ * SCHED_BATCH and SCHED_IDLE is 0.
++ */
++ if (attr->sched_priority < 0 ||
++ (p->mm && attr->sched_priority > MAX_RT_PRIO - 1) ||
++ (!p->mm && attr->sched_priority > MAX_RT_PRIO - 1))
++ return -EINVAL;
++ if ((SCHED_RR == policy || SCHED_FIFO == policy) !=
++ (attr->sched_priority != 0))
++ return -EINVAL;
++
++ if (user) {
++ retval = user_check_sched_setscheduler(p, attr, policy, reset_on_fork);
++ if (retval)
++ return retval;
++
++ retval = security_task_setscheduler(p);
++ if (retval)
++ return retval;
++ }
++
++ if (pi)
++ cpuset_read_lock();
++
++ /*
++ * Make sure no PI-waiters arrive (or leave) while we are
++ * changing the priority of the task:
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++
++ /*
++ * To be able to change p->policy safely, task_access_lock()
++ * must be called.
++	 * If task_access_lock() is used here:
++	 * For a task p which is not running, reading rq->stop is
++	 * racy but acceptable as ->stop doesn't change much.
++	 * An enhancement could be made to read rq->stop safely.
++ */
++ rq = __task_access_lock(p, &lock);
++
++ /*
++	 * Changing the policy of the stop threads is a very bad idea
++ */
++ if (p == rq->stop) {
++ retval = -EINVAL;
++ goto unlock;
++ }
++
++ /*
++ * If not changing anything there's no need to proceed further:
++ */
++ if (unlikely(policy == p->policy)) {
++ if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
++ goto change;
++ if (!rt_policy(policy) &&
++ NICE_TO_PRIO(attr->sched_nice) != p->static_prio)
++ goto change;
++
++ p->sched_reset_on_fork = reset_on_fork;
++ retval = 0;
++ goto unlock;
++ }
++change:
++
++ /* Re-check policy now with rq lock held */
++ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
++ policy = oldpolicy = -1;
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ if (pi)
++ cpuset_read_unlock();
++ goto recheck;
++ }
++
++ p->sched_reset_on_fork = reset_on_fork;
++
++ newprio = __normal_prio(policy, attr->sched_priority, NICE_TO_PRIO(attr->sched_nice));
++ if (pi) {
++ /*
++ * Take priority boosted tasks into account. If the new
++ * effective priority is unchanged, we just store the new
++ * normal parameters and do not touch the scheduler class and
++		 * the runqueue. This will be done when the task deboosts
++		 * itself.
++ */
++ newprio = rt_effective_prio(p, newprio);
++ }
++
++ if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) {
++ __setscheduler_params(p, attr);
++ __setscheduler_prio(p, newprio);
++ }
++
++ check_task_changed(p, rq);
++
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++ head = splice_balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ if (pi) {
++ cpuset_read_unlock();
++ rt_mutex_adjust_pi(p);
++ }
++
++ /* Run balance callbacks after we've adjusted the PI chain: */
++ balance_callbacks(rq, head);
++ preempt_enable();
++
++ return 0;
++
++unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ if (pi)
++ cpuset_read_unlock();
++ return retval;
++}
++
++static int _sched_setscheduler(struct task_struct *p, int policy,
++ const struct sched_param *param, bool check)
++{
++ struct sched_attr attr = {
++ .sched_policy = policy,
++ .sched_priority = param->sched_priority,
++ .sched_nice = PRIO_TO_NICE(p->static_prio),
++ };
++
++ /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
++ if ((policy != SETPARAM_POLICY) && (policy & SCHED_RESET_ON_FORK)) {
++ attr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
++ policy &= ~SCHED_RESET_ON_FORK;
++ attr.sched_policy = policy;
++ }
++
++ return __sched_setscheduler(p, &attr, check, true);
++}
++
++/**
++ * sched_setscheduler - change the scheduling policy and/or RT priority of a thread.
++ * @p: the task in question.
++ * @policy: new policy.
++ * @param: structure containing the new RT priority.
++ *
++ * Use sched_set_fifo(), read its comment.
++ *
++ * Return: 0 on success. An error code otherwise.
++ *
++ * NOTE that the task may be already dead.
++ */
++int sched_setscheduler(struct task_struct *p, int policy,
++ const struct sched_param *param)
++{
++ return _sched_setscheduler(p, policy, param, true);
++}
++
++int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
++{
++ return __sched_setscheduler(p, attr, true, true);
++}
++
++int sched_setattr_nocheck(struct task_struct *p, const struct sched_attr *attr)
++{
++ return __sched_setscheduler(p, attr, false, true);
++}
++EXPORT_SYMBOL_GPL(sched_setattr_nocheck);
++
++/**
++ * sched_setscheduler_nocheck - change the scheduling policy and/or RT priority of a thread from kernelspace.
++ * @p: the task in question.
++ * @policy: new policy.
++ * @param: structure containing the new RT priority.
++ *
++ * Just like sched_setscheduler, only don't bother checking if the
++ * current context has permission. For example, this is needed in
++ * stop_machine(): we create temporary high priority worker threads,
++ * but our caller might not have that capability.
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++int sched_setscheduler_nocheck(struct task_struct *p, int policy,
++ const struct sched_param *param)
++{
++ return _sched_setscheduler(p, policy, param, false);
++}
++
++/*
++ * SCHED_FIFO is a broken scheduler model; that is, it is fundamentally
++ * incapable of resource management, which is the one thing an OS really should
++ * be doing.
++ *
++ * This is of course the reason it is limited to privileged users only.
++ *
++ * Worse still; it is fundamentally impossible to compose static priority
++ * workloads. You cannot take two correctly working static prio workloads
++ * and smash them together and still expect them to work.
++ *
++ * For this reason 'all' FIFO tasks the kernel creates are basically at:
++ *
++ * MAX_RT_PRIO / 2
++ *
++ * The administrator _MUST_ configure the system, the kernel simply doesn't
++ * know enough information to make a sensible choice.
++ */
++void sched_set_fifo(struct task_struct *p)
++{
++ struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
++ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_fifo);
++
++/*
++ * For when you don't much care about FIFO, but want to be above SCHED_NORMAL.
++ */
++void sched_set_fifo_low(struct task_struct *p)
++{
++ struct sched_param sp = { .sched_priority = 1 };
++ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_fifo_low);
++
++void sched_set_normal(struct task_struct *p, int nice)
++{
++ struct sched_attr attr = {
++ .sched_policy = SCHED_NORMAL,
++ .sched_nice = nice,
++ };
++ WARN_ON_ONCE(sched_setattr_nocheck(p, &attr) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_normal);
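++/*
++ * These helpers are meant for in-kernel callers: a latency-critical kthread
++ * would typically call sched_set_fifo(current), or sched_set_fifo_low() when
++ * it merely needs to stay above SCHED_NORMAL, while sched_set_normal(p, nice)
++ * drops a task back to the normal policy with the given nice value.
++ */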
++
++static int
++do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
++{
++ struct sched_param lparam;
++ struct task_struct *p;
++ int retval;
++
++ if (!param || pid < 0)
++ return -EINVAL;
++ if (copy_from_user(&lparam, param, sizeof(struct sched_param)))
++ return -EFAULT;
++
++ rcu_read_lock();
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (likely(p))
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (likely(p)) {
++ retval = sched_setscheduler(p, policy, &lparam);
++ put_task_struct(p);
++ }
++
++ return retval;
++}
++
++/*
++ * Mimics kernel/events/core.c perf_copy_attr().
++ */
++static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *attr)
++{
++ u32 size;
++ int ret;
++
++ /* Zero the full structure, so that a short copy will be nice: */
++ memset(attr, 0, sizeof(*attr));
++
++ ret = get_user(size, &uattr->size);
++ if (ret)
++ return ret;
++
++ /* ABI compatibility quirk: */
++ if (!size)
++ size = SCHED_ATTR_SIZE_VER0;
++
++ if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
++ goto err_size;
++
++ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
++ if (ret) {
++ if (ret == -E2BIG)
++ goto err_size;
++ return ret;
++ }
++
++ /*
++ * XXX: Do we want to be lenient like existing syscalls; or do we want
++ * to be strict and return an error on out-of-bounds values?
++ */
++ attr->sched_nice = clamp(attr->sched_nice, -20, 19);
++
++ /* sched/core.c uses zero here but we already know ret is zero */
++ return 0;
++
++err_size:
++ put_user(sizeof(*attr), &uattr->size);
++ return -E2BIG;
++}
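++/*
++ * The size handshake above: a zero uattr->size is treated as the original
++ * SCHED_ATTR_SIZE_VER0 layout, a smaller-than-kernel struct is accepted
++ * as-is, and a larger one is only accepted when the extra trailing bytes
++ * are zero (copy_struct_from_user() returns -E2BIG otherwise).
++ */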
++
++/**
++ * sys_sched_setscheduler - set/change the scheduler policy and RT priority
++ * @pid: the pid in question.
++ * @policy: new policy.
++ * @param: structure containing the new RT priority.
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++SYSCALL_DEFINE3(sched_setscheduler, pid_t, pid, int, policy, struct sched_param __user *, param)
++{
++ if (policy < 0)
++ return -EINVAL;
++
++ return do_sched_setscheduler(pid, policy, param);
++}
++
++/**
++ * sys_sched_setparam - set/change the RT priority of a thread
++ * @pid: the pid in question.
++ * @param: structure containing the new RT priority.
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++SYSCALL_DEFINE2(sched_setparam, pid_t, pid, struct sched_param __user *, param)
++{
++ return do_sched_setscheduler(pid, SETPARAM_POLICY, param);
++}
++
++/**
++ * sys_sched_setattr - same as above, but with extended sched_attr
++ * @pid: the pid in question.
++ * @uattr: structure containing the extended parameters.
++ * @flags: for future extension.
++ */
++SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
++ unsigned int, flags)
++{
++ struct sched_attr attr;
++ struct task_struct *p;
++ int retval;
++
++ if (!uattr || pid < 0 || flags)
++ return -EINVAL;
++
++ retval = sched_copy_attr(uattr, &attr);
++ if (retval)
++ return retval;
++
++ if ((int)attr.sched_policy < 0)
++ return -EINVAL;
++
++ rcu_read_lock();
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (likely(p))
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (likely(p)) {
++ retval = sched_setattr(p, &attr);
++ put_task_struct(p);
++ }
++
++ return retval;
++}
++
++/**
++ * sys_sched_getscheduler - get the policy (scheduling class) of a thread
++ * @pid: the pid in question.
++ *
++ * Return: On success, the policy of the thread. Otherwise, a negative error
++ * code.
++ */
++SYSCALL_DEFINE1(sched_getscheduler, pid_t, pid)
++{
++ struct task_struct *p;
++ int retval = -EINVAL;
++
++ if (pid < 0)
++ goto out_nounlock;
++
++ retval = -ESRCH;
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ if (p) {
++ retval = security_task_getscheduler(p);
++ if (!retval)
++ retval = p->policy;
++ }
++ rcu_read_unlock();
++
++out_nounlock:
++ return retval;
++}
++
++/**
++ * sys_sched_getparam - get the RT priority of a thread
++ * @pid: the pid in question.
++ * @param: structure containing the RT priority.
++ *
++ * Return: On success, 0 and the RT priority is in @param. Otherwise, an error
++ * code.
++ */
++SYSCALL_DEFINE2(sched_getparam, pid_t, pid, struct sched_param __user *, param)
++{
++ struct sched_param lp = { .sched_priority = 0 };
++ struct task_struct *p;
++ int retval = -EINVAL;
++
++ if (!param || pid < 0)
++ goto out_nounlock;
++
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ retval = -ESRCH;
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ if (task_has_rt_policy(p))
++ lp.sched_priority = p->rt_priority;
++ rcu_read_unlock();
++
++ /*
++ * This one might sleep, we cannot do it with a spinlock held ...
++ */
++ retval = copy_to_user(param, &lp, sizeof(*param)) ? -EFAULT : 0;
++
++out_nounlock:
++ return retval;
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++/*
++ * Copy the kernel size attribute structure (which might be larger
++ * than what user-space knows about) to user-space.
++ *
++ * Note that all cases are valid: user-space buffer can be larger or
++ * smaller than the kernel-space buffer. The usual case is that both
++ * have the same size.
++ */
++static int
++sched_attr_copy_to_user(struct sched_attr __user *uattr,
++ struct sched_attr *kattr,
++ unsigned int usize)
++{
++ unsigned int ksize = sizeof(*kattr);
++
++ if (!access_ok(uattr, usize))
++ return -EFAULT;
++
++ /*
++ * sched_getattr() ABI forwards and backwards compatibility:
++ *
++ * If usize == ksize then we just copy everything to user-space and all is good.
++ *
++ * If usize < ksize then we only copy as much as user-space has space for,
++ * this keeps ABI compatibility as well. We skip the rest.
++ *
++ * If usize > ksize then user-space is using a newer version of the ABI,
++	 * parts of which the kernel doesn't know about. Just ignore them - tooling can
++ * detect the kernel's knowledge of attributes from the attr->size value
++ * which is set to ksize in this case.
++ */
++ kattr->size = min(usize, ksize);
++
++ if (copy_to_user(uattr, kattr, kattr->size))
++ return -EFAULT;
++
++ return 0;
++}
++
++/**
++ * sys_sched_getattr - similar to sched_getparam, but with sched_attr
++ * @pid: the pid in question.
++ * @uattr: structure containing the extended parameters.
++ * @usize: sizeof(attr) for fwd/bwd comp.
++ * @flags: for future extension.
++ */
++SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
++ unsigned int, usize, unsigned int, flags)
++{
++ struct sched_attr kattr = { };
++ struct task_struct *p;
++ int retval;
++
++ if (!uattr || pid < 0 || usize > PAGE_SIZE ||
++ usize < SCHED_ATTR_SIZE_VER0 || flags)
++ return -EINVAL;
++
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ retval = -ESRCH;
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ kattr.sched_policy = p->policy;
++ if (p->sched_reset_on_fork)
++ kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
++ if (task_has_rt_policy(p))
++ kattr.sched_priority = p->rt_priority;
++ else
++ kattr.sched_nice = task_nice(p);
++ kattr.sched_flags &= SCHED_FLAG_ALL;
++
++#ifdef CONFIG_UCLAMP_TASK
++ kattr.sched_util_min = p->uclamp_req[UCLAMP_MIN].value;
++ kattr.sched_util_max = p->uclamp_req[UCLAMP_MAX].value;
++#endif
++
++ rcu_read_unlock();
++
++ return sched_attr_copy_to_user(uattr, &kattr, usize);
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++#ifdef CONFIG_SMP
++int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
++{
++ return 0;
++}
++#endif
++
++static int
++__sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
++{
++ int retval;
++ cpumask_var_t cpus_allowed, new_mask;
++
++ if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
++ return -ENOMEM;
++
++ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
++ retval = -ENOMEM;
++ goto out_free_cpus_allowed;
++ }
++
++ cpuset_cpus_allowed(p, cpus_allowed);
++ cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
++
++ ctx->new_mask = new_mask;
++ ctx->flags |= SCA_CHECK;
++
++ retval = __set_cpus_allowed_ptr(p, ctx);
++ if (retval)
++ goto out_free_new_mask;
++
++ cpuset_cpus_allowed(p, cpus_allowed);
++ if (!cpumask_subset(new_mask, cpus_allowed)) {
++ /*
++ * We must have raced with a concurrent cpuset
++ * update. Just reset the cpus_allowed to the
++ * cpuset's cpus_allowed
++ */
++ cpumask_copy(new_mask, cpus_allowed);
++
++ /*
++ * If SCA_USER is set, a 2nd call to __set_cpus_allowed_ptr()
++ * will restore the previous user_cpus_ptr value.
++ *
++ * In the unlikely event a previous user_cpus_ptr exists,
++ * we need to further restrict the mask to what is allowed
++ * by that old user_cpus_ptr.
++ */
++ if (unlikely((ctx->flags & SCA_USER) && ctx->user_mask)) {
++ bool empty = !cpumask_and(new_mask, new_mask,
++ ctx->user_mask);
++
++ if (WARN_ON_ONCE(empty))
++ cpumask_copy(new_mask, cpus_allowed);
++ }
++ __set_cpus_allowed_ptr(p, ctx);
++ retval = -EINVAL;
++ }
++
++out_free_new_mask:
++ free_cpumask_var(new_mask);
++out_free_cpus_allowed:
++ free_cpumask_var(cpus_allowed);
++ return retval;
++}
++
++long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
++{
++ struct affinity_context ac;
++ struct cpumask *user_mask;
++ struct task_struct *p;
++ int retval;
++
++ rcu_read_lock();
++
++ p = find_process_by_pid(pid);
++ if (!p) {
++ rcu_read_unlock();
++ return -ESRCH;
++ }
++
++ /* Prevent p going away */
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (p->flags & PF_NO_SETAFFINITY) {
++ retval = -EINVAL;
++ goto out_put_task;
++ }
++
++ if (!check_same_owner(p)) {
++ rcu_read_lock();
++ if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
++ rcu_read_unlock();
++ retval = -EPERM;
++ goto out_put_task;
++ }
++ rcu_read_unlock();
++ }
++
++ retval = security_task_setscheduler(p);
++ if (retval)
++ goto out_put_task;
++
++ /*
++ * With non-SMP configs, user_cpus_ptr/user_mask isn't used and
++ * alloc_user_cpus_ptr() returns NULL.
++ */
++ user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
++ if (user_mask) {
++ cpumask_copy(user_mask, in_mask);
++ } else if (IS_ENABLED(CONFIG_SMP)) {
++ retval = -ENOMEM;
++ goto out_put_task;
++ }
++
++ ac = (struct affinity_context){
++ .new_mask = in_mask,
++ .user_mask = user_mask,
++ .flags = SCA_USER,
++ };
++
++ retval = __sched_setaffinity(p, &ac);
++ kfree(ac.user_mask);
++
++out_put_task:
++ put_task_struct(p);
++ return retval;
++}
++
++static int get_user_cpu_mask(unsigned long __user *user_mask_ptr, unsigned len,
++ struct cpumask *new_mask)
++{
++ if (len < cpumask_size())
++ cpumask_clear(new_mask);
++ else if (len > cpumask_size())
++ len = cpumask_size();
++
++ return copy_from_user(new_mask, user_mask_ptr, len) ? -EFAULT : 0;
++}
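++/*
++ * Length handling above: a user mask shorter than cpumask_size() leaves the
++ * remaining bits cleared, while a longer one is silently truncated to the
++ * kernel's cpumask size before the copy_from_user().
++ */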
++
++/**
++ * sys_sched_setaffinity - set the CPU affinity of a process
++ * @pid: pid of the process
++ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
++ * @user_mask_ptr: user-space pointer to the new CPU mask
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++SYSCALL_DEFINE3(sched_setaffinity, pid_t, pid, unsigned int, len,
++ unsigned long __user *, user_mask_ptr)
++{
++ cpumask_var_t new_mask;
++ int retval;
++
++ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
++ return -ENOMEM;
++
++ retval = get_user_cpu_mask(user_mask_ptr, len, new_mask);
++ if (retval == 0)
++ retval = sched_setaffinity(pid, new_mask);
++ free_cpumask_var(new_mask);
++ return retval;
++}
++
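++/*
++ * A minimal userspace sketch of exercising this syscall through the glibc
++ * wrapper; the CPU number and error handling are illustrative only:
++ *
++ *	#define _GNU_SOURCE
++ *	#include <sched.h>
++ *	#include <stdio.h>
++ *
++ *	cpu_set_t set;
++ *
++ *	CPU_ZERO(&set);
++ *	CPU_SET(0, &set);				// pin the caller to CPU 0
++ *	if (sched_setaffinity(0, sizeof(set), &set))	// pid 0 == calling thread
++ *		perror("sched_setaffinity");
++ */
++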
++long sched_getaffinity(pid_t pid, cpumask_t *mask)
++{
++ struct task_struct *p;
++ raw_spinlock_t *lock;
++ unsigned long flags;
++ int retval;
++
++ rcu_read_lock();
++
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ task_access_lock_irqsave(p, &lock, &flags);
++ cpumask_and(mask, &p->cpus_mask, cpu_active_mask);
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++out_unlock:
++ rcu_read_unlock();
++
++ return retval;
++}
++
++/**
++ * sys_sched_getaffinity - get the CPU affinity of a process
++ * @pid: pid of the process
++ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
++ * @user_mask_ptr: user-space pointer to hold the current CPU mask
++ *
++ * Return: size of CPU mask copied to user_mask_ptr on success. An
++ * error code otherwise.
++ */
++SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
++ unsigned long __user *, user_mask_ptr)
++{
++ int ret;
++ cpumask_var_t mask;
++
++ if ((len * BITS_PER_BYTE) < nr_cpu_ids)
++ return -EINVAL;
++ if (len & (sizeof(unsigned long)-1))
++ return -EINVAL;
++
++ if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
++ return -ENOMEM;
++
++ ret = sched_getaffinity(pid, mask);
++ if (ret == 0) {
++ unsigned int retlen = min(len, cpumask_size());
++
++ if (copy_to_user(user_mask_ptr, cpumask_bits(mask), retlen))
++ ret = -EFAULT;
++ else
++ ret = retlen;
++ }
++ free_cpumask_var(mask);
++
++ return ret;
++}
++
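++/*
++ * From userspace, the glibc sched_getaffinity() wrapper hides the byte-count
++ * return value shown above and simply returns 0 on success. A rough sketch,
++ * illustrative only:
++ *
++ *	#define _GNU_SOURCE
++ *	#include <sched.h>
++ *	#include <stdio.h>
++ *
++ *	cpu_set_t set;
++ *
++ *	if (sched_getaffinity(0, sizeof(set), &set) == 0)
++ *		printf("%d CPUs in affinity mask\n", CPU_COUNT(&set));
++ */
++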
++static void do_sched_yield(void)
++{
++ struct rq *rq;
++ struct rq_flags rf;
++
++ if (!sched_yield_type)
++ return;
++
++ rq = this_rq_lock_irq(&rf);
++
++ schedstat_inc(rq->yld_count);
++
++ if (1 == sched_yield_type) {
++ if (!rt_task(current))
++ do_sched_yield_type_1(current, rq);
++ } else if (2 == sched_yield_type) {
++ if (rq->nr_running > 1)
++ rq->skip = current;
++ }
++
++ preempt_disable();
++ raw_spin_unlock_irq(&rq->lock);
++ sched_preempt_enable_no_resched();
++
++ schedule();
++}
++
++/**
++ * sys_sched_yield - yield the current processor to other threads.
++ *
++ * This function yields the current CPU to other tasks. If there are no
++ * other threads running on this CPU then this function will return.
++ *
++ * Return: 0.
++ */
++SYSCALL_DEFINE0(sched_yield)
++{
++ do_sched_yield();
++ return 0;
++}
++
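++/*
++ * In this scheduler the effect of sched_yield(2) depends on sched_yield_type,
++ * as implemented in do_sched_yield() above: 0 is a no-op, 1 is handled by
++ * do_sched_yield_type_1() for non-RT tasks, and 2 marks the current task as
++ * the one to skip. The knob is typically exposed as a sysctl (the exact name
++ * is an assumption here). A trivial userspace call:
++ *
++ *	#include <sched.h>
++ *
++ *	sched_yield();	// give other runnable tasks a chance on this CPU
++ */
++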
++#if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
++int __sched __cond_resched(void)
++{
++ if (should_resched(0)) {
++ preempt_schedule_common();
++ return 1;
++ }
++ /*
++ * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
++ * whether the current CPU is in an RCU read-side critical section,
++ * so the tick can report quiescent states even for CPUs looping
++ * in kernel context. In contrast, in non-preemptible kernels,
++ * RCU readers leave no in-memory hints, which means that CPU-bound
++ * processes executing in kernel context might never report an
++ * RCU quiescent state. Therefore, the following code causes
++ * cond_resched() to report a quiescent state, but only when RCU
++ * is in urgent need of one.
++ */
++#ifndef CONFIG_PREEMPT_RCU
++ rcu_all_qs();
++#endif
++ return 0;
++}
++EXPORT_SYMBOL(__cond_resched);
++#endif
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#define cond_resched_dynamic_enabled __cond_resched
++#define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(cond_resched);
++
++#define might_resched_dynamic_enabled __cond_resched
++#define might_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(might_resched);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
++int __sched dynamic_cond_resched(void)
++{
++ klp_sched_try_switch();
++ if (!static_branch_unlikely(&sk_dynamic_cond_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_cond_resched);
++
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
++int __sched dynamic_might_resched(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_might_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_might_resched);
++#endif
++#endif
++
++/*
++ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
++ * call schedule, and on return reacquire the lock.
++ *
++ * This works OK both with and without CONFIG_PREEMPTION. We do strange low-level
++ * operations here to prevent schedule() from being called twice (once via
++ * spin_unlock(), once by hand).
++ */
++int __cond_resched_lock(spinlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held(lock);
++
++ if (spin_needbreak(lock) || resched) {
++ spin_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ spin_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_lock);
++
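++/*
++ * A typical use of the cond_resched_lock() wrapper around the helper above:
++ * draining a long list under a spinlock while still rescheduling at safe
++ * points. The ctx/demo_item structures and process_item() are illustrative
++ * names only:
++ *
++ *	spin_lock(&ctx->lock);
++ *	while (!list_empty(&ctx->items)) {
++ *		struct demo_item *it =
++ *			list_first_entry(&ctx->items, struct demo_item, node);
++ *
++ *		list_del(&it->node);
++ *		process_item(it);		// must not sleep under the lock
++ *
++ *		// drop ctx->lock, reschedule if needed, then retake it
++ *		cond_resched_lock(&ctx->lock);
++ *	}
++ *	spin_unlock(&ctx->lock);
++ */
++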
++int __cond_resched_rwlock_read(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_read(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ read_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ read_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_read);
++
++int __cond_resched_rwlock_write(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_write(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ write_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ write_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_write);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++
++#ifdef CONFIG_GENERIC_ENTRY
++#include <linux/entry-common.h>
++#endif
++
++/*
++ * SC:cond_resched
++ * SC:might_resched
++ * SC:preempt_schedule
++ * SC:preempt_schedule_notrace
++ * SC:irqentry_exit_cond_resched
++ *
++ *
++ * NONE:
++ * cond_resched <- __cond_resched
++ * might_resched <- RET0
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ *
++ * VOLUNTARY:
++ * cond_resched <- __cond_resched
++ * might_resched <- __cond_resched
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ *
++ * FULL:
++ * cond_resched <- RET0
++ * might_resched <- RET0
++ * preempt_schedule <- preempt_schedule
++ * preempt_schedule_notrace <- preempt_schedule_notrace
++ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
++ */
++
++enum {
++ preempt_dynamic_undefined = -1,
++ preempt_dynamic_none,
++ preempt_dynamic_voluntary,
++ preempt_dynamic_full,
++};
++
++int preempt_dynamic_mode = preempt_dynamic_undefined;
++
++int sched_dynamic_mode(const char *str)
++{
++ if (!strcmp(str, "none"))
++ return preempt_dynamic_none;
++
++ if (!strcmp(str, "voluntary"))
++ return preempt_dynamic_voluntary;
++
++ if (!strcmp(str, "full"))
++ return preempt_dynamic_full;
++
++ return -EINVAL;
++}
++
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
++#define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++#define preempt_dynamic_enable(f) static_key_enable(&sk_dynamic_##f.key)
++#define preempt_dynamic_disable(f) static_key_disable(&sk_dynamic_##f.key)
++#else
++#error "Unsupported PREEMPT_DYNAMIC mechanism"
++#endif
++
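++/*
++ * The CONFIG_HAVE_PREEMPT_DYNAMIC_CALL variant above is built on the generic
++ * static_call API; a minimal sketch of that pattern, with illustrative names:
++ *
++ *	#include <linux/static_call.h>
++ *
++ *	static int impl_a(int x) { return x; }
++ *	static int impl_b(int x) { return 2 * x; }
++ *
++ *	DEFINE_STATIC_CALL(demo_call, impl_a);
++ *
++ *	int demo(int x)
++ *	{
++ *		return static_call(demo_call)(x);	// direct, text-patched call
++ *	}
++ *
++ *	// later, retarget every call site at runtime:
++ *	//	static_call_update(demo_call, impl_b);
++ */
++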
++static DEFINE_MUTEX(sched_dynamic_mutex);
++static bool klp_override;
++
++static void __sched_dynamic_update(int mode)
++{
++ /*
++ * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
++ * the ZERO state, which is invalid.
++ */
++ if (!klp_override)
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++
++ switch (mode) {
++ case preempt_dynamic_none:
++ if (!klp_override)
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: none\n");
++ break;
++
++ case preempt_dynamic_voluntary:
++ if (!klp_override)
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: voluntary\n");
++ break;
++
++ case preempt_dynamic_full:
++ if (!klp_override)
++			preempt_dynamic_disable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: full\n");
++ break;
++ }
++
++ preempt_dynamic_mode = mode;
++}
++
++void sched_dynamic_update(int mode)
++{
++ mutex_lock(&sched_dynamic_mutex);
++ __sched_dynamic_update(mode);
++ mutex_unlock(&sched_dynamic_mutex);
++}
++
++#ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
++
++static int klp_cond_resched(void)
++{
++ __klp_sched_try_switch();
++ return __cond_resched();
++}
++
++void sched_dynamic_klp_enable(void)
++{
++ mutex_lock(&sched_dynamic_mutex);
++
++ klp_override = true;
++ static_call_update(cond_resched, klp_cond_resched);
++
++ mutex_unlock(&sched_dynamic_mutex);
++}
++
++void sched_dynamic_klp_disable(void)
++{
++ mutex_lock(&sched_dynamic_mutex);
++
++ klp_override = false;
++ __sched_dynamic_update(preempt_dynamic_mode);
++
++ mutex_unlock(&sched_dynamic_mutex);
++}
++
++#endif /* CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
++
++
++static int __init setup_preempt_mode(char *str)
++{
++ int mode = sched_dynamic_mode(str);
++ if (mode < 0) {
++ pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
++ return 0;
++ }
++
++ sched_dynamic_update(mode);
++ return 1;
++}
++__setup("preempt=", setup_preempt_mode);
++
++static void __init preempt_dynamic_init(void)
++{
++ if (preempt_dynamic_mode == preempt_dynamic_undefined) {
++ if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
++ sched_dynamic_update(preempt_dynamic_none);
++ } else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
++ sched_dynamic_update(preempt_dynamic_voluntary);
++ } else {
++ /* Default static call setting, nothing to do */
++ WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
++ preempt_dynamic_mode = preempt_dynamic_full;
++ pr_info("Dynamic Preempt: full\n");
++ }
++ }
++}
++
++#define PREEMPT_MODEL_ACCESSOR(mode) \
++ bool preempt_model_##mode(void) \
++ { \
++ WARN_ON_ONCE(preempt_dynamic_mode == preempt_dynamic_undefined); \
++ return preempt_dynamic_mode == preempt_dynamic_##mode; \
++ } \
++ EXPORT_SYMBOL_GPL(preempt_model_##mode)
++
++PREEMPT_MODEL_ACCESSOR(none);
++PREEMPT_MODEL_ACCESSOR(voluntary);
++PREEMPT_MODEL_ACCESSOR(full);
++
++#else /* !CONFIG_PREEMPT_DYNAMIC */
++
++static inline void preempt_dynamic_init(void) { }
++
++#endif /* #ifdef CONFIG_PREEMPT_DYNAMIC */
++
++/**
++ * yield - yield the current processor to other threads.
++ *
++ * Do not ever use this function, there's a 99% chance you're doing it wrong.
++ *
++ * The scheduler is at all times free to pick the calling task as the most
++ * eligible task to run, if removing the yield() call from your code breaks
++ * it, it's already broken.
++ *
++ * Typical broken usage is:
++ *
++ * while (!event)
++ * yield();
++ *
++ * where one assumes that yield() will let 'the other' process run that will
++ * make event true. If the current task is a SCHED_FIFO task that will never
++ * happen. Never use yield() as a progress guarantee!!
++ *
++ * If you want to use yield() to wait for something, use wait_event().
++ * If you want to use yield() to be 'nice' for others, use cond_resched().
++ * If you still want to use yield(), do not!
++ */
++void __sched yield(void)
++{
++ set_current_state(TASK_RUNNING);
++ do_sched_yield();
++}
++EXPORT_SYMBOL(yield);
++
++/**
++ * yield_to - yield the current processor to another thread in
++ * your thread group, or accelerate that thread toward the
++ * processor it's on.
++ * @p: target task
++ * @preempt: whether task preemption is allowed or not
++ *
++ * It's the caller's job to ensure that the target task struct
++ * can't go away on us before we can do any checks.
++ *
++ * In Alt schedule FW, yield_to is not supported.
++ *
++ * Return:
++ * true (>0) if we indeed boosted the target task.
++ * false (0) if we failed to boost the target.
++ * -ESRCH if there's no task to yield to.
++ */
++int __sched yield_to(struct task_struct *p, bool preempt)
++{
++ return 0;
++}
++EXPORT_SYMBOL_GPL(yield_to);
++
++int io_schedule_prepare(void)
++{
++ int old_iowait = current->in_iowait;
++
++ current->in_iowait = 1;
++ blk_flush_plug(current->plug, true);
++ return old_iowait;
++}
++
++void io_schedule_finish(int token)
++{
++ current->in_iowait = token;
++}
++
++/*
++ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
++ * that process accounting knows that this is a task in IO wait state.
++ *
++ * But don't do that if it is a deliberate, throttling IO wait (this task
++ * has set its backing_dev_info: the queue against which it should throttle)
++ */
++
++long __sched io_schedule_timeout(long timeout)
++{
++ int token;
++ long ret;
++
++ token = io_schedule_prepare();
++ ret = schedule_timeout(timeout);
++ io_schedule_finish(token);
++
++ return ret;
++}
++EXPORT_SYMBOL(io_schedule_timeout);
++
++void __sched io_schedule(void)
++{
++ int token;
++
++ token = io_schedule_prepare();
++ schedule();
++ io_schedule_finish(token);
++}
++EXPORT_SYMBOL(io_schedule);
++
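++/*
++ * The prepare/finish pair above is how helpers such as mutex_lock_io() are
++ * built: any blocking wait placed between the two calls is accounted as
++ * iowait. A rough sketch (the completion name is an assumption):
++ *
++ *	int tok = io_schedule_prepare();
++ *
++ *	wait_for_completion(&io_done);	// sleeping here counts as iowait
++ *	io_schedule_finish(tok);
++ */
++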
++/**
++ * sys_sched_get_priority_max - return maximum RT priority.
++ * @policy: scheduling class.
++ *
++ * Return: On success, this syscall returns the maximum
++ * rt_priority that can be used by a given scheduling class.
++ * On failure, a negative error code is returned.
++ */
++SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
++{
++ int ret = -EINVAL;
++
++ switch (policy) {
++ case SCHED_FIFO:
++ case SCHED_RR:
++ ret = MAX_RT_PRIO - 1;
++ break;
++ case SCHED_NORMAL:
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ ret = 0;
++ break;
++ }
++ return ret;
++}
++
++/**
++ * sys_sched_get_priority_min - return minimum RT priority.
++ * @policy: scheduling class.
++ *
++ * Return: On success, this syscall returns the minimum
++ * rt_priority that can be used by a given scheduling class.
++ * On failure, a negative error code is returned.
++ */
++SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
++{
++ int ret = -EINVAL;
++
++ switch (policy) {
++ case SCHED_FIFO:
++ case SCHED_RR:
++ ret = 1;
++ break;
++ case SCHED_NORMAL:
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ ret = 0;
++ break;
++ }
++ return ret;
++}
++
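++/*
++ * Userspace can query these bounds through the POSIX wrappers; per the two
++ * handlers above, SCHED_FIFO/SCHED_RR report 1..MAX_RT_PRIO-1 and the normal
++ * policies report 0. Illustrative sketch:
++ *
++ *	#include <sched.h>
++ *	#include <stdio.h>
++ *
++ *	printf("SCHED_FIFO priority range: %d..%d\n",
++ *	       sched_get_priority_min(SCHED_FIFO),
++ *	       sched_get_priority_max(SCHED_FIFO));
++ */
++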
++static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
++{
++ struct task_struct *p;
++ int retval;
++
++ alt_sched_debug();
++
++ if (pid < 0)
++ return -EINVAL;
++
++ retval = -ESRCH;
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++ rcu_read_unlock();
++
++ *t = ns_to_timespec64(sched_timeslice_ns);
++ return 0;
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++/**
++ * sys_sched_rr_get_interval - return the default timeslice of a process.
++ * @pid: pid of the process.
++ * @interval: userspace pointer to the timeslice value.
++ *
++ *
++ * Return: On success, 0 and the timeslice is in @interval. Otherwise,
++ * an error code.
++ */
++SYSCALL_DEFINE2(sched_rr_get_interval, pid_t, pid,
++ struct __kernel_timespec __user *, interval)
++{
++ struct timespec64 t;
++ int retval = sched_rr_get_interval(pid, &t);
++
++ if (retval == 0)
++ retval = put_timespec64(&t, interval);
++
++ return retval;
++}
++
++#ifdef CONFIG_COMPAT_32BIT_TIME
++SYSCALL_DEFINE2(sched_rr_get_interval_time32, pid_t, pid,
++ struct old_timespec32 __user *, interval)
++{
++ struct timespec64 t;
++ int retval = sched_rr_get_interval(pid, &t);
++
++ if (retval == 0)
++ retval = put_old_timespec32(&t, interval);
++ return retval;
++}
++#endif
++
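++/*
++ * Note that sched_rr_get_interval() above reports the fixed
++ * sched_timeslice_ns for any task, regardless of policy. A minimal userspace
++ * query, illustrative only:
++ *
++ *	#include <sched.h>
++ *	#include <stdio.h>
++ *	#include <time.h>
++ *
++ *	struct timespec ts;
++ *
++ *	if (sched_rr_get_interval(0, &ts) == 0)	// pid 0 == calling process
++ *		printf("timeslice: %ld.%09ld s\n", (long)ts.tv_sec, ts.tv_nsec);
++ */
++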
++void sched_show_task(struct task_struct *p)
++{
++ unsigned long free = 0;
++ int ppid;
++
++ if (!try_get_task_stack(p))
++ return;
++
++ pr_info("task:%-15.15s state:%c", p->comm, task_state_to_char(p));
++
++ if (task_is_running(p))
++ pr_cont(" running task ");
++#ifdef CONFIG_DEBUG_STACK_USAGE
++ free = stack_not_used(p);
++#endif
++ ppid = 0;
++ rcu_read_lock();
++ if (pid_alive(p))
++ ppid = task_pid_nr(rcu_dereference(p->real_parent));
++ rcu_read_unlock();
++ pr_cont(" stack:%-5lu pid:%-5d ppid:%-6d flags:0x%08lx\n",
++ free, task_pid_nr(p), ppid,
++ read_task_thread_flags(p));
++
++ print_worker_info(KERN_INFO, p);
++ print_stop_info(KERN_INFO, p);
++ show_stack(p, NULL, KERN_INFO);
++ put_task_stack(p);
++}
++EXPORT_SYMBOL_GPL(sched_show_task);
++
++static inline bool
++state_filter_match(unsigned long state_filter, struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /* no filter, everything matches */
++ if (!state_filter)
++ return true;
++
++ /* filter, but doesn't match */
++ if (!(state & state_filter))
++ return false;
++
++ /*
++ * When looking for TASK_UNINTERRUPTIBLE skip TASK_IDLE (allows
++ * TASK_KILLABLE).
++ */
++ if (state_filter == TASK_UNINTERRUPTIBLE && (state & TASK_NOLOAD))
++ return false;
++
++ return true;
++}
++
++
++void show_state_filter(unsigned int state_filter)
++{
++ struct task_struct *g, *p;
++
++ rcu_read_lock();
++ for_each_process_thread(g, p) {
++ /*
++ * reset the NMI-timeout, listing all files on a slow
++ * console might take a lot of time:
++ * Also, reset softlockup watchdogs on all CPUs, because
++ * another CPU might be blocked waiting for us to process
++ * an IPI.
++ */
++ touch_nmi_watchdog();
++ touch_all_softlockup_watchdogs();
++ if (state_filter_match(state_filter, p))
++ sched_show_task(p);
++ }
++
++#ifdef CONFIG_SCHED_DEBUG
++ /* TODO: Alt schedule FW should support this
++ if (!state_filter)
++ sysrq_sched_debug_show();
++ */
++#endif
++ rcu_read_unlock();
++ /*
++ * Only show locks if all tasks are dumped:
++ */
++ if (!state_filter)
++ debug_show_all_locks();
++}
++
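++/*
++ * This dump is what the SysRq 't' handler produces (requires
++ * CONFIG_MAGIC_SYSRQ and sysrq being enabled); it can be requested from
++ * userspace roughly as follows, illustrative only:
++ *
++ *	#include <fcntl.h>
++ *	#include <unistd.h>
++ *
++ *	int fd = open("/proc/sysrq-trigger", O_WRONLY);
++ *
++ *	if (fd >= 0) {
++ *		write(fd, "t", 1);	// dump tasks via show_state_filter()
++ *		close(fd);
++ *	}
++ */
++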
++void dump_cpu_task(int cpu)
++{
++ if (cpu == smp_processor_id() && in_hardirq()) {
++ struct pt_regs *regs;
++
++ regs = get_irq_regs();
++ if (regs) {
++ show_regs(regs);
++ return;
++ }
++ }
++
++ if (trigger_single_cpu_backtrace(cpu))
++ return;
++
++ pr_info("Task dump for CPU %d:\n", cpu);
++ sched_show_task(cpu_curr(cpu));
++}
++
++/**
++ * init_idle - set up an idle thread for a given CPU
++ * @idle: task in question
++ * @cpu: CPU the idle task belongs to
++ *
++ * NOTE: this function does not set the idle thread's NEED_RESCHED
++ * flag, to make booting more robust.
++ */
++void __init init_idle(struct task_struct *idle, int cpu)
++{
++#ifdef CONFIG_SMP
++ struct affinity_context ac = (struct affinity_context) {
++ .new_mask = cpumask_of(cpu),
++ .flags = 0,
++ };
++#endif
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ __sched_fork(0, idle);
++
++ raw_spin_lock_irqsave(&idle->pi_lock, flags);
++ raw_spin_lock(&rq->lock);
++
++ idle->last_ran = rq->clock_task;
++ idle->__state = TASK_RUNNING;
++ /*
++ * PF_KTHREAD should already be set at this point; regardless, make it
++ * look like a proper per-CPU kthread.
++ */
++ idle->flags |= PF_IDLE | PF_KTHREAD | PF_NO_SETAFFINITY;
++ kthread_set_per_cpu(idle, cpu);
++
++ sched_queue_init_idle(&rq->queue, idle);
++
++#ifdef CONFIG_SMP
++ /*
++ * It's possible that init_idle() gets called multiple times on a task,
++ * in that case do_set_cpus_allowed() will not do the right thing.
++ *
++ * And since this is boot we can forgo the serialisation.
++ */
++ set_cpus_allowed_common(idle, &ac);
++#endif
++
++ /* Silence PROVE_RCU */
++ rcu_read_lock();
++ __set_task_cpu(idle, cpu);
++ rcu_read_unlock();
++
++ rq->idle = idle;
++ rcu_assign_pointer(rq->curr, idle);
++ idle->on_cpu = 1;
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
++
++ /* Set the preempt count _outside_ the spinlocks! */
++ init_idle_preempt_count(idle, cpu);
++
++ ftrace_graph_init_idle_task(idle, cpu);
++ vtime_init_idle(idle, cpu);
++#ifdef CONFIG_SMP
++ sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
++#endif
++}
++
++#ifdef CONFIG_SMP
++
++int cpuset_cpumask_can_shrink(const struct cpumask __maybe_unused *cur,
++ const struct cpumask __maybe_unused *trial)
++{
++ return 1;
++}
++
++int task_can_attach(struct task_struct *p,
++ const struct cpumask *cs_effective_cpus)
++{
++ int ret = 0;
++
++ /*
++ * Kthreads which disallow setaffinity shouldn't be moved
++ * to a new cpuset; we don't want to change their CPU
++ * affinity and isolating such threads by their set of
++ * allowed nodes is unnecessary. Thus, cpusets are not
++ * applicable for such threads. This prevents checking for
++ * success of set_cpus_allowed_ptr() on all attached tasks
++ * before cpus_mask may be changed.
++ */
++ if (p->flags & PF_NO_SETAFFINITY)
++ ret = -EINVAL;
++
++ return ret;
++}
++
++bool sched_smp_initialized __read_mostly;
++
++#ifdef CONFIG_HOTPLUG_CPU
++/*
++ * Ensures that the idle task is using init_mm right before its CPU goes
++ * offline.
++ */
++void idle_task_exit(void)
++{
++ struct mm_struct *mm = current->active_mm;
++
++ BUG_ON(current != this_rq()->idle);
++
++ if (mm != &init_mm) {
++ switch_mm(mm, &init_mm, current);
++ finish_arch_post_lock_switch();
++ }
++
++ /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
++}
++
++static int __balance_push_cpu_stop(void *arg)
++{
++ struct task_struct *p = arg;
++ struct rq *rq = this_rq();
++ struct rq_flags rf;
++ int cpu;
++
++ raw_spin_lock_irq(&p->pi_lock);
++ rq_lock(rq, &rf);
++
++ update_rq_clock(rq);
++
++ if (task_rq(p) == rq && task_on_rq_queued(p)) {
++ cpu = select_fallback_rq(rq->cpu, p);
++ rq = __migrate_task(rq, p, cpu);
++ }
++
++ rq_unlock(rq, &rf);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ put_task_struct(p);
++
++ return 0;
++}
++
++static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
++
++/*
++ * This is enabled below SCHED_AP_ACTIVE; when !cpu_active(), but only
++ * effective when the hotplug motion is down.
++ */
++static void balance_push(struct rq *rq)
++{
++ struct task_struct *push_task = rq->curr;
++
++ lockdep_assert_held(&rq->lock);
++
++ /*
++ * Ensure the thing is persistent until balance_push_set(.on = false);
++ */
++ rq->balance_callback = &balance_push_callback;
++
++ /*
++ * Only active while going offline and when invoked on the outgoing
++ * CPU.
++ */
++ if (!cpu_dying(rq->cpu) || rq != this_rq())
++ return;
++
++ /*
++ * Both the cpu-hotplug and stop task are in this case and are
++ * required to complete the hotplug process.
++ */
++ if (kthread_is_per_cpu(push_task) ||
++ is_migration_disabled(push_task)) {
++
++ /*
++ * If this is the idle task on the outgoing CPU try to wake
++ * up the hotplug control thread which might wait for the
++ * last task to vanish. The rcuwait_active() check is
++ * accurate here because the waiter is pinned on this CPU
++ * and can't obviously be running in parallel.
++ *
++ * On RT kernels this also has to check whether there are
++ * pinned and scheduled out tasks on the runqueue. They
++ * need to leave the migrate disabled section first.
++ */
++ if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
++ rcuwait_active(&rq->hotplug_wait)) {
++ raw_spin_unlock(&rq->lock);
++ rcuwait_wake_up(&rq->hotplug_wait);
++ raw_spin_lock(&rq->lock);
++ }
++ return;
++ }
++
++ get_task_struct(push_task);
++ /*
++ * Temporarily drop rq->lock such that we can wake-up the stop task.
++ * Both preemption and IRQs are still disabled.
++ */
++ raw_spin_unlock(&rq->lock);
++ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
++ this_cpu_ptr(&push_work));
++ /*
++ * At this point need_resched() is true and we'll take the loop in
++ * schedule(). The next pick is obviously going to be the stop task
++ * which kthread_is_per_cpu() and will push this task away.
++ */
++ raw_spin_lock(&rq->lock);
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct rq_flags rf;
++
++ rq_lock_irqsave(rq, &rf);
++ if (on) {
++ WARN_ON_ONCE(rq->balance_callback);
++ rq->balance_callback = &balance_push_callback;
++ } else if (rq->balance_callback == &balance_push_callback) {
++ rq->balance_callback = NULL;
++ }
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Invoked from a CPUs hotplug control thread after the CPU has been marked
++ * inactive. All tasks which are not per CPU kernel threads are either
++ * pushed off this CPU now via balance_push() or placed on a different CPU
++ * during wakeup. Wait until the CPU is quiescent.
++ */
++static void balance_hotplug_wait(void)
++{
++ struct rq *rq = this_rq();
++
++ rcuwait_wait_event(&rq->hotplug_wait,
++ rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
++ TASK_UNINTERRUPTIBLE);
++}
++
++#else
++
++static void balance_push(struct rq *rq)
++{
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++}
++
++static inline void balance_hotplug_wait(void)
++{
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++static void set_rq_offline(struct rq *rq)
++{
++ if (rq->online)
++ rq->online = false;
++}
++
++static void set_rq_online(struct rq *rq)
++{
++ if (!rq->online)
++ rq->online = true;
++}
++
++/*
++ * used to mark begin/end of suspend/resume:
++ */
++static int num_cpus_frozen;
++
++/*
++ * Update cpusets according to cpu_active mask. If cpusets are
++ * disabled, cpuset_update_active_cpus() becomes a simple wrapper
++ * around partition_sched_domains().
++ *
++ * If we come here as part of a suspend/resume, don't touch cpusets because we
++ * want to restore it back to its original state upon resume anyway.
++ */
++static void cpuset_cpu_active(void)
++{
++ if (cpuhp_tasks_frozen) {
++ /*
++ * num_cpus_frozen tracks how many CPUs are involved in suspend
++ * resume sequence. As long as this is not the last online
++ * operation in the resume sequence, just build a single sched
++ * domain, ignoring cpusets.
++ */
++ partition_sched_domains(1, NULL, NULL);
++ if (--num_cpus_frozen)
++ return;
++ /*
++ * This is the last CPU online operation. So fall through and
++ * restore the original sched domains by considering the
++ * cpuset configurations.
++ */
++ cpuset_force_rebuild();
++ }
++
++ cpuset_update_active_cpus();
++}
++
++static int cpuset_cpu_inactive(unsigned int cpu)
++{
++ if (!cpuhp_tasks_frozen) {
++ cpuset_update_active_cpus();
++ } else {
++ num_cpus_frozen++;
++ partition_sched_domains(1, NULL, NULL);
++ }
++ return 0;
++}
++
++int sched_cpu_activate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ /*
++ * Clear the balance_push callback and prepare to schedule
++ * regular tasks.
++ */
++ balance_push_set(cpu, false);
++
++#ifdef CONFIG_SCHED_SMT
++ /*
++ * When going up, increment the number of cores with SMT present.
++ */
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
++ static_branch_inc_cpuslocked(&sched_smt_present);
++#endif
++ set_cpu_active(cpu, true);
++
++ if (sched_smp_initialized)
++ cpuset_cpu_active();
++
++ /*
++ * Put the rq online, if not already. This happens:
++ *
++ * 1) In the early boot process, because we build the real domains
++ * after all cpus have been brought up.
++ *
++ * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
++ * domains.
++ */
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ set_rq_online(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ return 0;
++}
++
++int sched_cpu_deactivate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++ int ret;
++
++ set_cpu_active(cpu, false);
++
++ /*
++ * From this point forward, this CPU will refuse to run any task that
++ * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
++ * push those tasks away until this gets cleared, see
++ * sched_cpu_dying().
++ */
++ balance_push_set(cpu, true);
++
++ /*
++ * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
++ * users of this state to go away such that all new such users will
++ * observe it.
++ *
++ * Specifically, we rely on ttwu to no longer target this CPU, see
++ * ttwu_queue_cond() and is_cpu_allowed().
++ *
++	 * Do sync before parking smpboot threads to take care of the RCU boost case.
++ */
++ synchronize_rcu();
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ update_rq_clock(rq);
++ set_rq_offline(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++#ifdef CONFIG_SCHED_SMT
++ /*
++ * When going down, decrement the number of cores with SMT present.
++ */
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
++ static_branch_dec_cpuslocked(&sched_smt_present);
++ if (!static_branch_likely(&sched_smt_present))
++ cpumask_clear(&sched_sg_idle_mask);
++ }
++#endif
++
++ if (!sched_smp_initialized)
++ return 0;
++
++ ret = cpuset_cpu_inactive(cpu);
++ if (ret) {
++ balance_push_set(cpu, false);
++ set_cpu_active(cpu, true);
++ return ret;
++ }
++
++ return 0;
++}
++
++static void sched_rq_cpu_starting(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ rq->calc_load_update = calc_load_update;
++}
++
++int sched_cpu_starting(unsigned int cpu)
++{
++ sched_rq_cpu_starting(cpu);
++ sched_tick_start(cpu);
++ return 0;
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++
++/*
++ * Invoked immediately before the stopper thread is invoked to bring the
++ * CPU down completely. At this point all per CPU kthreads except the
++ * hotplug thread (current) and the stopper thread (inactive) have been
++ * either parked or have been unbound from the outgoing CPU. Ensure that
++ * any of those which might be on the way out are gone.
++ *
++ * If after this point a bound task is being woken on this CPU then the
++ * responsible hotplug callback has failed to do its job.
++ * sched_cpu_dying() will catch it with the appropriate fireworks.
++ */
++int sched_cpu_wait_empty(unsigned int cpu)
++{
++ balance_hotplug_wait();
++ return 0;
++}
++
++/*
++ * Since this CPU is going 'away' for a while, fold any nr_active delta we
++ * might have. Called from the CPU stopper task after ensuring that the
++ * stopper is the last running task on the CPU, so nr_active count is
++ * stable. We need to take the teardown thread which is calling this into
++ * account, so we hand in adjust = 1 to the load calculation.
++ *
++ * Also see the comment "Global load-average calculations".
++ */
++static void calc_load_migrate(struct rq *rq)
++{
++ long delta = calc_load_fold_active(rq, 1);
++
++ if (delta)
++ atomic_long_add(delta, &calc_load_tasks);
++}
++
++static void dump_rq_tasks(struct rq *rq, const char *loglvl)
++{
++ struct task_struct *g, *p;
++ int cpu = cpu_of(rq);
++
++ lockdep_assert_held(&rq->lock);
++
++ printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
++ for_each_process_thread(g, p) {
++ if (task_cpu(p) != cpu)
++ continue;
++
++ if (!task_on_rq_queued(p))
++ continue;
++
++ printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
++ }
++}
++
++int sched_cpu_dying(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ /* Handle pending wakeups and then migrate everything off */
++ sched_tick_stop(cpu);
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
++ WARN(true, "Dying CPU not properly vacated!");
++ dump_rq_tasks(rq, KERN_WARNING);
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ calc_load_migrate(rq);
++ hrtick_clear(rq);
++ return 0;
++}
++#endif
++
++#ifdef CONFIG_SMP
++static void sched_init_topology_cpumask_early(void)
++{
++ int cpu;
++ cpumask_t *tmp;
++
++ for_each_possible_cpu(cpu) {
++ /* init topo masks */
++ tmp = per_cpu(sched_cpu_topo_masks, cpu);
++
++ cpumask_copy(tmp, cpumask_of(cpu));
++ tmp++;
++ cpumask_copy(tmp, cpu_possible_mask);
++ per_cpu(sched_cpu_llc_mask, cpu) = tmp;
++ per_cpu(sched_cpu_topo_end_mask, cpu) = ++tmp;
++ /*per_cpu(sd_llc_id, cpu) = cpu;*/
++ }
++}
++
++#define TOPOLOGY_CPUMASK(name, mask, last)\
++ if (cpumask_and(topo, topo, mask)) { \
++ cpumask_copy(topo, mask); \
++ printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - "#name, \
++ cpu, (topo++)->bits[0]); \
++ } \
++ if (!last) \
++ bitmap_complement(cpumask_bits(topo), cpumask_bits(mask), \
++ nr_cpumask_bits);
++
++static void sched_init_topology_cpumask(void)
++{
++ int cpu;
++ cpumask_t *topo;
++
++ for_each_online_cpu(cpu) {
++ /* take chance to reset time slice for idle tasks */
++ cpu_rq(cpu)->idle->time_slice = sched_timeslice_ns;
++
++ topo = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++
++ bitmap_complement(cpumask_bits(topo), cpumask_bits(cpumask_of(cpu)),
++ nr_cpumask_bits);
++#ifdef CONFIG_SCHED_SMT
++ TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false);
++#endif
++ per_cpu(sd_llc_id, cpu) = cpumask_first(cpu_coregroup_mask(cpu));
++ per_cpu(sched_cpu_llc_mask, cpu) = topo;
++ TOPOLOGY_CPUMASK(coregroup, cpu_coregroup_mask(cpu), false);
++
++ TOPOLOGY_CPUMASK(core, topology_core_cpumask(cpu), false);
++
++ TOPOLOGY_CPUMASK(others, cpu_online_mask, true);
++
++ per_cpu(sched_cpu_topo_end_mask, cpu) = topo;
++ printk(KERN_INFO "sched: cpu#%02d llc_id = %d, llc_mask idx = %d\n",
++ cpu, per_cpu(sd_llc_id, cpu),
++ (int) (per_cpu(sched_cpu_llc_mask, cpu) -
++ per_cpu(sched_cpu_topo_masks, cpu)));
++ }
++}
++#endif
++
++void __init sched_init_smp(void)
++{
++ /* Move init over to a non-isolated CPU */
++ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
++ BUG();
++ current->flags &= ~PF_NO_SETAFFINITY;
++
++ sched_init_topology_cpumask();
++
++ sched_smp_initialized = true;
++}
++
++static int __init migration_init(void)
++{
++ sched_cpu_starting(smp_processor_id());
++ return 0;
++}
++early_initcall(migration_init);
++
++#else
++void __init sched_init_smp(void)
++{
++ cpu_rq(0)->idle->time_slice = sched_timeslice_ns;
++}
++#endif /* CONFIG_SMP */
++
++int in_sched_functions(unsigned long addr)
++{
++ return in_lock_functions(addr) ||
++ (addr >= (unsigned long)__sched_text_start
++ && addr < (unsigned long)__sched_text_end);
++}
++
++#ifdef CONFIG_CGROUP_SCHED
++/* task group related information */
++struct task_group {
++ struct cgroup_subsys_state css;
++
++ struct rcu_head rcu;
++ struct list_head list;
++
++ struct task_group *parent;
++ struct list_head siblings;
++ struct list_head children;
++#ifdef CONFIG_FAIR_GROUP_SCHED
++ unsigned long shares;
++#endif
++};
++
++/*
++ * Default task group.
++ * Every task in system belongs to this group at bootup.
++ */
++struct task_group root_task_group;
++LIST_HEAD(task_groups);
++
++/* Cacheline aligned slab cache for task_group */
++static struct kmem_cache *task_group_cache __read_mostly;
++#endif /* CONFIG_CGROUP_SCHED */
++
++void __init sched_init(void)
++{
++ int i;
++ struct rq *rq;
++
++ printk(KERN_INFO "sched/alt: "ALT_SCHED_NAME" CPU Scheduler "ALT_SCHED_VERSION\
++ " by Alfred Chen.\n");
++
++ wait_bit_init();
++
++#ifdef CONFIG_SMP
++ for (i = 0; i < SCHED_QUEUE_BITS; i++)
++ cpumask_copy(sched_preempt_mask + i, cpu_present_mask);
++#endif
++
++#ifdef CONFIG_CGROUP_SCHED
++ task_group_cache = KMEM_CACHE(task_group, 0);
++
++ list_add(&root_task_group.list, &task_groups);
++ INIT_LIST_HEAD(&root_task_group.children);
++ INIT_LIST_HEAD(&root_task_group.siblings);
++#endif /* CONFIG_CGROUP_SCHED */
++ for_each_possible_cpu(i) {
++ rq = cpu_rq(i);
++
++ sched_queue_init(&rq->queue);
++ rq->prio = IDLE_TASK_SCHED_PRIO;
++ rq->skip = NULL;
++
++ raw_spin_lock_init(&rq->lock);
++ rq->nr_running = rq->nr_uninterruptible = 0;
++ rq->calc_load_active = 0;
++ rq->calc_load_update = jiffies + LOAD_FREQ;
++#ifdef CONFIG_SMP
++ rq->online = false;
++ rq->cpu = i;
++
++#ifdef CONFIG_SCHED_SMT
++ rq->active_balance = 0;
++#endif
++
++#ifdef CONFIG_NO_HZ_COMMON
++ INIT_CSD(&rq->nohz_csd, nohz_csd_func, rq);
++#endif
++ rq->balance_callback = &balance_push_callback;
++#ifdef CONFIG_HOTPLUG_CPU
++ rcuwait_init(&rq->hotplug_wait);
++#endif
++#endif /* CONFIG_SMP */
++ rq->nr_switches = 0;
++
++ hrtick_rq_init(rq);
++ atomic_set(&rq->nr_iowait, 0);
++
++ zalloc_cpumask_var_node(&rq->scratch_mask, GFP_KERNEL, cpu_to_node(i));
++ }
++#ifdef CONFIG_SMP
++ /* Set rq->online for cpu 0 */
++ cpu_rq(0)->online = true;
++#endif
++ /*
++ * The boot idle thread does lazy MMU switching as well:
++ */
++ mmgrab(&init_mm);
++ enter_lazy_tlb(&init_mm, current);
++
++ /*
++ * The idle task doesn't need the kthread struct to function, but it
++ * is dressed up as a per-CPU kthread and thus needs to play the part
++ * if we want to avoid special-casing it in code that deals with per-CPU
++ * kthreads.
++ */
++ WARN_ON(!set_kthread_struct(current));
++
++ /*
++ * Make us the idle thread. Technically, schedule() should not be
++ * called from this thread, however somewhere below it might be,
++ * but because we are the idle thread, we just pick up running again
++ * when this runqueue becomes "idle".
++ */
++ init_idle(current, smp_processor_id());
++
++ calc_load_update = jiffies + LOAD_FREQ;
++
++#ifdef CONFIG_SMP
++ idle_thread_set_boot_cpu();
++ balance_push_set(smp_processor_id(), false);
++
++ sched_init_topology_cpumask_early();
++#endif /* SMP */
++
++ preempt_dynamic_init();
++}
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++
++void __might_sleep(const char *file, int line)
++{
++ unsigned int state = get_current_state();
++ /*
++ * Blocking primitives will set (and therefore destroy) current->state,
++ * since we will exit with TASK_RUNNING make sure we enter with it,
++ * otherwise we will destroy state.
++ */
++ WARN_ONCE(state != TASK_RUNNING && current->task_state_change,
++ "do not call blocking ops when !TASK_RUNNING; "
++ "state=%x set at [<%p>] %pS\n", state,
++ (void *)current->task_state_change,
++ (void *)current->task_state_change);
++
++ __might_resched(file, line, 0);
++}
++EXPORT_SYMBOL(__might_sleep);
++
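++/*
++ * The class of bug __might_sleep() catches: calling something that may sleep
++ * while atomic, e.g. a GFP_KERNEL allocation under a spinlock. With
++ * CONFIG_DEBUG_ATOMIC_SLEEP enabled, __might_resched() below prints the
++ * "BUG: sleeping function called from invalid context" report. Illustrative
++ * anti-pattern:
++ *
++ *	spin_lock(&lock);
++ *	buf = kmalloc(64, GFP_KERNEL);	// may sleep: triggers the warning
++ *	spin_unlock(&lock);
++ *
++ * The fix is to allocate outside the lock or to use GFP_ATOMIC.
++ */
++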
++static void print_preempt_disable_ip(int preempt_offset, unsigned long ip)
++{
++ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT))
++ return;
++
++ if (preempt_count() == preempt_offset)
++ return;
++
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, ip);
++}
++
++static inline bool resched_offsets_ok(unsigned int offsets)
++{
++ unsigned int nested = preempt_count();
++
++ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT;
++
++ return nested == offsets;
++}
++
++void __might_resched(const char *file, int line, unsigned int offsets)
++{
++ /* Ratelimiting timestamp: */
++ static unsigned long prev_jiffy;
++
++ unsigned long preempt_disable_ip;
++
++ /* WARN_ON_ONCE() by default, no rate limit required: */
++ rcu_sleep_check();
++
++ if ((resched_offsets_ok(offsets) && !irqs_disabled() &&
++ !is_idle_task(current) && !current->non_block_count) ||
++ system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
++ oops_in_progress)
++ return;
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ /* Save this before calling printk(), since that will clobber it: */
++ preempt_disable_ip = get_preempt_disable_ip(current);
++
++ pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
++ file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), current->non_block_count,
++ current->pid, current->comm);
++ pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
++ offsets & MIGHT_RESCHED_PREEMPT_MASK);
++
++ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
++ pr_err("RCU nest depth: %d, expected: %u\n",
++ rcu_preempt_depth(), offsets >> MIGHT_RESCHED_RCU_SHIFT);
++ }
++
++ if (task_stack_end_corrupted(current))
++ pr_emerg("Thread overran stack, or stack corrupted\n");
++
++ debug_show_held_locks(current);
++ if (irqs_disabled())
++ print_irqtrace_events(current);
++
++ print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
++ preempt_disable_ip);
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL(__might_resched);
++
++void __cant_sleep(const char *file, int line, int preempt_offset)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > preempt_offset)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ printk(KERN_ERR "BUG: assuming atomic context at %s:%d\n", file, line);
++ printk(KERN_ERR "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_sleep);
++
++#ifdef CONFIG_SMP
++void __cant_migrate(const char *file, int line)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (is_migration_disabled(current))
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > 0)
++ return;
++
++ if (current->migration_flags & MDF_FORCE_ENABLED)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), is_migration_disabled(current),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_migrate);
++#endif
++#endif
++
++#ifdef CONFIG_MAGIC_SYSRQ
++void normalize_rt_tasks(void)
++{
++ struct task_struct *g, *p;
++ struct sched_attr attr = {
++ .sched_policy = SCHED_NORMAL,
++ };
++
++ read_lock(&tasklist_lock);
++ for_each_process_thread(g, p) {
++ /*
++ * Only normalize user tasks:
++ */
++ if (p->flags & PF_KTHREAD)
++ continue;
++
++ schedstat_set(p->stats.wait_start, 0);
++ schedstat_set(p->stats.sleep_start, 0);
++ schedstat_set(p->stats.block_start, 0);
++
++ if (!rt_task(p)) {
++ /*
++ * Renice negative nice level userspace
++ * tasks back to 0:
++ */
++ if (task_nice(p) < 0)
++ set_user_nice(p, 0);
++ continue;
++ }
++
++ __sched_setscheduler(p, &attr, false, false);
++ }
++ read_unlock(&tasklist_lock);
++}
++#endif /* CONFIG_MAGIC_SYSRQ */
++
++#if defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB)
++/*
++ * These functions are only useful for the IA64 MCA handling, or kdb.
++ *
++ * They can only be called when the whole system has been
++ * stopped - every CPU needs to be quiescent, and no scheduling
++ * activity can take place. Using them for anything else would
++ * be a serious bug, and as a result, they aren't even visible
++ * under any other configuration.
++ */
++
++/**
++ * curr_task - return the current task for a given CPU.
++ * @cpu: the processor in question.
++ *
++ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
++ *
++ * Return: The current task for @cpu.
++ */
++struct task_struct *curr_task(int cpu)
++{
++ return cpu_curr(cpu);
++}
++
++#endif /* defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB) */
++
++#ifdef CONFIG_IA64
++/**
++ * ia64_set_curr_task - set the current task for a given CPU.
++ * @cpu: the processor in question.
++ * @p: the task pointer to set.
++ *
++ * Description: This function must only be used when non-maskable interrupts
++ * are serviced on a separate stack. It allows the architecture to switch the
++ * notion of the current task on a CPU in a non-blocking manner. This function
++ * must be called with all CPUs synchronised, and interrupts disabled; the
++ * caller must save the original value of the current task (see
++ * curr_task() above) and restore that value before reenabling interrupts and
++ * re-starting the system.
++ *
++ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
++ */
++void ia64_set_curr_task(int cpu, struct task_struct *p)
++{
++ cpu_curr(cpu) = p;
++}
++
++#endif
++
++#ifdef CONFIG_CGROUP_SCHED
++static void sched_free_group(struct task_group *tg)
++{
++ kmem_cache_free(task_group_cache, tg);
++}
++
++static void sched_free_group_rcu(struct rcu_head *rhp)
++{
++ sched_free_group(container_of(rhp, struct task_group, rcu));
++}
++
++static void sched_unregister_group(struct task_group *tg)
++{
++ /*
++ * We have to wait for yet another RCU grace period to expire, as
++ * print_cfs_stats() might run concurrently.
++ */
++ call_rcu(&tg->rcu, sched_free_group_rcu);
++}
++
++/* allocate runqueue etc for a new task group */
++struct task_group *sched_create_group(struct task_group *parent)
++{
++ struct task_group *tg;
++
++ tg = kmem_cache_alloc(task_group_cache, GFP_KERNEL | __GFP_ZERO);
++ if (!tg)
++ return ERR_PTR(-ENOMEM);
++
++ return tg;
++}
++
++void sched_online_group(struct task_group *tg, struct task_group *parent)
++{
++}
++
++/* rcu callback to free various structures associated with a task group */
++static void sched_unregister_group_rcu(struct rcu_head *rhp)
++{
++ /* Now it should be safe to free those cfs_rqs: */
++ sched_unregister_group(container_of(rhp, struct task_group, rcu));
++}
++
++void sched_destroy_group(struct task_group *tg)
++{
++	/* Wait for possible concurrent references to cfs_rqs to complete: */
++ call_rcu(&tg->rcu, sched_unregister_group_rcu);
++}
++
++void sched_release_group(struct task_group *tg)
++{
++}
++
++static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
++{
++ return css ? container_of(css, struct task_group, css) : NULL;
++}
++
++static struct cgroup_subsys_state *
++cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
++{
++ struct task_group *parent = css_tg(parent_css);
++ struct task_group *tg;
++
++ if (!parent) {
++ /* This is early initialization for the top cgroup */
++ return &root_task_group.css;
++ }
++
++ tg = sched_create_group(parent);
++ if (IS_ERR(tg))
++ return ERR_PTR(-ENOMEM);
++ return &tg->css;
++}
++
++/* Expose task group only after completing cgroup initialization */
++static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++ struct task_group *parent = css_tg(css->parent);
++
++ if (parent)
++ sched_online_group(tg, parent);
++ return 0;
++}
++
++static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ sched_release_group(tg);
++}
++
++static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ /*
++ * Relies on the RCU grace period between css_released() and this.
++ */
++ sched_unregister_group(tg);
++}
++
++#ifdef CONFIG_RT_GROUP_SCHED
++static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
++{
++ return 0;
++}
++#endif
++
++static void cpu_cgroup_attach(struct cgroup_taskset *tset)
++{
++}
++
++#ifdef CONFIG_FAIR_GROUP_SCHED
++static DEFINE_MUTEX(shares_mutex);
++
++int sched_group_set_shares(struct task_group *tg, unsigned long shares)
++{
++ /*
++ * We can't change the weight of the root cgroup.
++ */
++ if (&root_task_group == tg)
++ return -EINVAL;
++
++ shares = clamp(shares, scale_load(MIN_SHARES), scale_load(MAX_SHARES));
++
++ mutex_lock(&shares_mutex);
++ if (tg->shares == shares)
++ goto done;
++
++ tg->shares = shares;
++done:
++ mutex_unlock(&shares_mutex);
++ return 0;
++}
++
++static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 shareval)
++{
++ if (shareval > scale_load_down(ULONG_MAX))
++ shareval = MAX_SHARES;
++ return sched_group_set_shares(css_tg(css), scale_load(shareval));
++}
++
++static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ struct task_group *tg = css_tg(css);
++
++ return (u64) scale_load_down(tg->shares);
++}
++#endif
++
++static struct cftype cpu_legacy_files[] = {
++#ifdef CONFIG_FAIR_GROUP_SCHED
++ {
++ .name = "shares",
++ .read_u64 = cpu_shares_read_u64,
++ .write_u64 = cpu_shares_write_u64,
++ },
++#endif
++ { } /* Terminate */
++};
++
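++/*
++ * With the legacy (cgroup v1) cpu controller mounted, the "shares" file above
++ * is what userspace writes; here the value is only clamped and stored in
++ * tg->shares. A rough sketch, assuming a v1 mount at /sys/fs/cgroup/cpu and
++ * an existing "demo" group:
++ *
++ *	#include <fcntl.h>
++ *	#include <stdio.h>
++ *	#include <unistd.h>
++ *
++ *	int fd = open("/sys/fs/cgroup/cpu/demo/cpu.shares", O_WRONLY);
++ *
++ *	if (fd >= 0) {
++ *		dprintf(fd, "512\n");	// half of the default 1024
++ *		close(fd);
++ *	}
++ */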
++
++static struct cftype cpu_files[] = {
++ { } /* terminate */
++};
++
++static int cpu_extra_stat_show(struct seq_file *sf,
++ struct cgroup_subsys_state *css)
++{
++ return 0;
++}
++
++struct cgroup_subsys cpu_cgrp_subsys = {
++ .css_alloc = cpu_cgroup_css_alloc,
++ .css_online = cpu_cgroup_css_online,
++ .css_released = cpu_cgroup_css_released,
++ .css_free = cpu_cgroup_css_free,
++ .css_extra_stat_show = cpu_extra_stat_show,
++#ifdef CONFIG_RT_GROUP_SCHED
++ .can_attach = cpu_cgroup_can_attach,
++#endif
++ .attach = cpu_cgroup_attach,
++	.legacy_cftypes	= cpu_legacy_files,
++ .dfl_cftypes = cpu_files,
++ .early_init = true,
++ .threaded = true,
++};
++#endif /* CONFIG_CGROUP_SCHED */
++
++#undef CREATE_TRACE_POINTS
++
++#ifdef CONFIG_SCHED_MM_CID
++
++/*
++ * @cid_lock: Guarantee forward-progress of cid allocation.
++ *
++ * Concurrency ID allocation within a bitmap is mostly lock-free. The cid_lock
++ * is only used when contention is detected by the lock-free allocation so
++ * forward progress can be guaranteed.
++ */
++DEFINE_RAW_SPINLOCK(cid_lock);
++
++/*
++ * @use_cid_lock: Select cid allocation behavior: lock-free vs spinlock.
++ *
++ * When @use_cid_lock is 0, the cid allocation is lock-free. When contention is
++ * detected, it is set to 1 to ensure that all newly coming allocations are
++ * serialized by @cid_lock until the allocation which detected contention
++ * completes and sets @use_cid_lock back to 0. This guarantees forward progress
++ * of a cid allocation.
++ */
++int use_cid_lock;
++
++/*
++ * mm_cid remote-clear implements a lock-free algorithm to clear per-mm/cpu cid
++ * concurrently with respect to the execution of the source runqueue context
++ * switch.
++ *
++ * There is one basic property we want to guarantee here:
++ *
++ * (1) Remote-clear should _never_ mark a per-cpu cid UNSET when it is actively
++ * used by a task. That would lead to concurrent allocation of the cid and
++ * userspace corruption.
++ *
++ * Provide this guarantee by introducing a Dekker memory ordering to guarantee
++ * that a pair of loads observe at least one of a pair of stores, which can be
++ * shown as:
++ *
++ * X = Y = 0
++ *
++ * w[X]=1 w[Y]=1
++ * MB MB
++ * r[Y]=y r[X]=x
++ *
++ * Which guarantees that x==0 && y==0 is impossible. But rather than using
++ * values 0 and 1, this algorithm cares about specific state transitions of the
++ * runqueue current task (as updated by the scheduler context switch), and the
++ * per-mm/cpu cid value.
++ *
++ * Let's introduce task (Y) which has task->mm == mm and task (N) which has
++ * task->mm != mm for the rest of the discussion. There are two scheduler state
++ * transitions on context switch we care about:
++ *
++ * (TSA) Store to rq->curr with transition from (N) to (Y)
++ *
++ * (TSB) Store to rq->curr with transition from (Y) to (N)
++ *
++ * On the remote-clear side, there is one transition we care about:
++ *
++ * (TMA) cmpxchg to *pcpu_cid to set the LAZY flag
++ *
++ * There is also a transition to UNSET state which can be performed from all
++ * sides (scheduler, remote-clear). It is always performed with a cmpxchg which
++ * guarantees that only a single thread will succeed:
++ *
++ * (TMB) cmpxchg to *pcpu_cid to mark UNSET
++ *
++ * Just to be clear, what we do _not_ want to happen is a transition to UNSET
++ * when a thread is actively using the cid (property (1)).
++ *
++ * Let's look at the relevant combinations of TSA/TSB, and TMA transitions.
++ *
++ * Scenario A) (TSA)+(TMA) (from next task perspective)
++ *
++ * CPU0 CPU1
++ *
++ * Context switch CS-1 Remote-clear
++ * - store to rq->curr: (N)->(Y) (TSA) - cmpxchg to *pcpu_id to LAZY (TMA)
++ * (implied barrier after cmpxchg)
++ * - switch_mm_cid()
++ * - memory barrier (see switch_mm_cid()
++ * comment explaining how this barrier
++ * is combined with other scheduler
++ * barriers)
++ * - mm_cid_get (next)
++ * - READ_ONCE(*pcpu_cid) - rcu_dereference(src_rq->curr)
++ *
++ * This Dekker ensures that either task (Y) is observed by the
++ * rcu_dereference() or the LAZY flag is observed by READ_ONCE(), or both are
++ * observed.
++ *
++ * If task (Y) store is observed by rcu_dereference(), it means that there is
++ * still an active task on the cpu. Remote-clear will therefore not transition
++ * to UNSET, which fulfills property (1).
++ *
++ * If task (Y) is not observed, but the lazy flag is observed by READ_ONCE(),
++ * it will move its state to UNSET, which clears the percpu cid perhaps
++ * uselessly (which is not an issue for correctness). Because task (Y) is not
++ * observed, CPU1 can move ahead to set the state to UNSET. Because moving
++ * state to UNSET is done with a cmpxchg expecting that the old state has the
++ * LAZY flag set, only one thread will successfully UNSET.
++ *
++ * If both states (LAZY flag and task (Y)) are observed, the thread on CPU0
++ * will observe the LAZY flag and transition to UNSET (perhaps uselessly), and
++ * CPU1 will observe task (Y) and do nothing more, which is fine.
++ *
++ * What we are effectively preventing with this Dekker is a scenario where
++ * neither LAZY flag nor store (Y) are observed, which would fail property (1)
++ * because this would UNSET a cid which is actively used.
++ */
++
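++/*
++ * The Dekker/store-buffering pattern described above, stripped of the
++ * scheduler specifics, looks like the following standalone C11 sketch: with
++ * the sequentially consistent fences in place, r0 == 0 && r1 == 0 cannot be
++ * observed. Illustrative only:
++ *
++ *	#include <stdatomic.h>
++ *
++ *	static atomic_int X, Y;
++ *	static int r0, r1;
++ *
++ *	void thread0(void)		// analogous to the context-switch side
++ *	{
++ *		atomic_store_explicit(&X, 1, memory_order_relaxed);
++ *		atomic_thread_fence(memory_order_seq_cst);
++ *		r0 = atomic_load_explicit(&Y, memory_order_relaxed);
++ *	}
++ *
++ *	void thread1(void)		// analogous to the remote-clear side
++ *	{
++ *		atomic_store_explicit(&Y, 1, memory_order_relaxed);
++ *		atomic_thread_fence(memory_order_seq_cst);
++ *		r1 = atomic_load_explicit(&X, memory_order_relaxed);
++ *	}
++ */
++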
++void sched_mm_cid_migrate_from(struct task_struct *t)
++{
++ t->migrate_from_cpu = task_cpu(t);
++}
++
++static
++int __sched_mm_cid_migrate_from_fetch_cid(struct rq *src_rq,
++ struct task_struct *t,
++ struct mm_cid *src_pcpu_cid)
++{
++ struct mm_struct *mm = t->mm;
++ struct task_struct *src_task;
++ int src_cid, last_mm_cid;
++
++ if (!mm)
++ return -1;
++
++ last_mm_cid = t->last_mm_cid;
++ /*
++ * If the migrated task has no last cid, or if the current
++ * task on src rq uses the cid, it means the source cid does not need
++ * to be moved to the destination cpu.
++ */
++ if (last_mm_cid == -1)
++ return -1;
++ src_cid = READ_ONCE(src_pcpu_cid->cid);
++ if (!mm_cid_is_valid(src_cid) || last_mm_cid != src_cid)
++ return -1;
++
++ /*
++ * If we observe an active task using the mm on this rq, it means we
++ * are not the last task to be migrated from this cpu for this mm, so
++ * there is no need to move src_cid to the destination cpu.
++ */
++ rcu_read_lock();
++ src_task = rcu_dereference(src_rq->curr);
++ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
++ rcu_read_unlock();
++ t->last_mm_cid = -1;
++ return -1;
++ }
++ rcu_read_unlock();
++
++ return src_cid;
++}
++
++static
++int __sched_mm_cid_migrate_from_try_steal_cid(struct rq *src_rq,
++ struct task_struct *t,
++ struct mm_cid *src_pcpu_cid,
++ int src_cid)
++{
++ struct task_struct *src_task;
++ struct mm_struct *mm = t->mm;
++ int lazy_cid;
++
++ if (src_cid == -1)
++ return -1;
++
++ /*
++ * Attempt to clear the source cpu cid to move it to the destination
++ * cpu.
++ */
++ lazy_cid = mm_cid_set_lazy_put(src_cid);
++ if (!try_cmpxchg(&src_pcpu_cid->cid, &src_cid, lazy_cid))
++ return -1;
++
++ /*
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm matches the scheduler barrier in context_switch()
++ * between store to rq->curr and load of prev and next task's
++ * per-mm/cpu cid.
++ *
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm_cid_active matches the barrier in
++ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
++ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
++ * load of per-mm/cpu cid.
++ */
++
++ /*
++ * If we observe an active task using the mm on this rq after setting
++ * the lazy-put flag, this task will be responsible for transitioning
++ * from lazy-put flag set to MM_CID_UNSET.
++ */
++ rcu_read_lock();
++ src_task = rcu_dereference(src_rq->curr);
++ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
++ rcu_read_unlock();
++ /*
++ * We observed an active task for this mm, there is therefore
++ * no point in moving this cid to the destination cpu.
++ */
++ t->last_mm_cid = -1;
++ return -1;
++ }
++ rcu_read_unlock();
++
++ /*
++ * The src_cid is unused, so it can be unset.
++ */
++ if (!try_cmpxchg(&src_pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
++ return -1;
++ return src_cid;
++}
++
++/*
++ * Migration to dst cpu. Called with dst_rq lock held.
++ * Interrupts are disabled, which keeps the window of cid ownership without the
++ * source rq lock held small.
++ */
++void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu)
++{
++ struct mm_cid *src_pcpu_cid, *dst_pcpu_cid;
++ struct mm_struct *mm = t->mm;
++ int src_cid, dst_cid;
++ struct rq *src_rq;
++
++ lockdep_assert_rq_held(dst_rq);
++
++ if (!mm)
++ return;
++ if (src_cpu == -1) {
++ t->last_mm_cid = -1;
++ return;
++ }
++ /*
++ * Move the src cid if the dst cid is unset. This keeps id
++ * allocation closest to 0 in cases where few threads migrate around
++ * many cpus.
++ *
++ * If destination cid is already set, we may have to just clear
++ * the src cid to ensure compactness in frequent migrations
++ * scenarios.
++ *
++ * It is not useful to clear the src cid when the number of threads is
++ * greater than or equal to the number of allowed cpus, because user-space
++ * can expect that the number of allowed cids can reach the number of
++ * allowed cpus.
++ */
++ dst_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(dst_rq));
++ dst_cid = READ_ONCE(dst_pcpu_cid->cid);
++ if (!mm_cid_is_unset(dst_cid) &&
++ atomic_read(&mm->mm_users) >= t->nr_cpus_allowed)
++ return;
++ src_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, src_cpu);
++ src_rq = cpu_rq(src_cpu);
++ src_cid = __sched_mm_cid_migrate_from_fetch_cid(src_rq, t, src_pcpu_cid);
++ if (src_cid == -1)
++ return;
++ src_cid = __sched_mm_cid_migrate_from_try_steal_cid(src_rq, t, src_pcpu_cid,
++ src_cid);
++ if (src_cid == -1)
++ return;
++ if (!mm_cid_is_unset(dst_cid)) {
++ __mm_cid_put(mm, src_cid);
++ return;
++ }
++ /* Move src_cid to dst cpu. */
++ mm_cid_snapshot_time(dst_rq, mm);
++ WRITE_ONCE(dst_pcpu_cid->cid, src_cid);
++}
++
++static void sched_mm_cid_remote_clear(struct mm_struct *mm, struct mm_cid *pcpu_cid,
++ int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *t;
++ unsigned long flags;
++ int cid, lazy_cid;
++
++ cid = READ_ONCE(pcpu_cid->cid);
++ if (!mm_cid_is_valid(cid))
++ return;
++
++ /*
++ * Clear the cpu cid if it is set to keep cid allocation compact. If
++ * there happens to be other tasks left on the source cpu using this
++ * mm, the next task using this mm will reallocate its cid on context
++ * switch.
++ */
++ lazy_cid = mm_cid_set_lazy_put(cid);
++ if (!try_cmpxchg(&pcpu_cid->cid, &cid, lazy_cid))
++ return;
++
++ /*
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm matches the scheduler barrier in context_switch()
++ * between store to rq->curr and load of prev and next task's
++ * per-mm/cpu cid.
++ *
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm_cid_active matches the barrier in
++ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
++ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
++ * load of per-mm/cpu cid.
++ */
++
++ /*
++ * If we observe an active task using the mm on this rq after setting
++ * the lazy-put flag, that task will be responsible for transitioning
++ * from lazy-put flag set to MM_CID_UNSET.
++ */
++ rcu_read_lock();
++ t = rcu_dereference(rq->curr);
++ if (READ_ONCE(t->mm_cid_active) && t->mm == mm) {
++ rcu_read_unlock();
++ return;
++ }
++ rcu_read_unlock();
++
++ /*
++ * The cid is unused, so it can be unset.
++ * Disable interrupts to keep the window of cid ownership without rq
++ * lock small.
++ */
++ local_irq_save(flags);
++ if (try_cmpxchg(&pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
++ __mm_cid_put(mm, cid);
++ local_irq_restore(flags);
++}
++
++static void sched_mm_cid_remote_clear_old(struct mm_struct *mm, int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct mm_cid *pcpu_cid;
++ struct task_struct *curr;
++ u64 rq_clock;
++
++ /*
++ * rq->clock load is racy on 32-bit but one spurious clear once in a
++ * while is irrelevant.
++ */
++ rq_clock = READ_ONCE(rq->clock);
++ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
++
++ /*
++ * In order to take care of infrequently scheduled tasks, bump the time
++ * snapshot associated with this cid if an active task using the mm is
++ * observed on this rq.
++ */
++ rcu_read_lock();
++ curr = rcu_dereference(rq->curr);
++ if (READ_ONCE(curr->mm_cid_active) && curr->mm == mm) {
++ WRITE_ONCE(pcpu_cid->time, rq_clock);
++ rcu_read_unlock();
++ return;
++ }
++ rcu_read_unlock();
++
++ if (rq_clock < pcpu_cid->time + SCHED_MM_CID_PERIOD_NS)
++ return;
++ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
++}
++
++static void sched_mm_cid_remote_clear_weight(struct mm_struct *mm, int cpu,
++ int weight)
++{
++ struct mm_cid *pcpu_cid;
++ int cid;
++
++ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
++ cid = READ_ONCE(pcpu_cid->cid);
++ if (!mm_cid_is_valid(cid) || cid < weight)
++ return;
++ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
++}
++
++static void task_mm_cid_work(struct callback_head *work)
++{
++ unsigned long now = jiffies, old_scan, next_scan;
++ struct task_struct *t = current;
++ struct cpumask *cidmask;
++ struct mm_struct *mm;
++ int weight, cpu;
++
++ SCHED_WARN_ON(t != container_of(work, struct task_struct, cid_work));
++
++ work->next = work; /* Prevent double-add */
++ if (t->flags & PF_EXITING)
++ return;
++ mm = t->mm;
++ if (!mm)
++ return;
++ old_scan = READ_ONCE(mm->mm_cid_next_scan);
++ next_scan = now + msecs_to_jiffies(MM_CID_SCAN_DELAY);
++ if (!old_scan) {
++ unsigned long res;
++
++ res = cmpxchg(&mm->mm_cid_next_scan, old_scan, next_scan);
++ if (res != old_scan)
++ old_scan = res;
++ else
++ old_scan = next_scan;
++ }
++ if (time_before(now, old_scan))
++ return;
++ if (!try_cmpxchg(&mm->mm_cid_next_scan, &old_scan, next_scan))
++ return;
++ cidmask = mm_cidmask(mm);
++ /* Clear cids that were not recently used. */
++ for_each_possible_cpu(cpu)
++ sched_mm_cid_remote_clear_old(mm, cpu);
++ weight = cpumask_weight(cidmask);
++ /*
++ * Clear cids that are greater than or equal to the cidmask weight to
++ * recompact it.
++ */
++ for_each_possible_cpu(cpu)
++ sched_mm_cid_remote_clear_weight(mm, cpu, weight);
++}
++
++void init_sched_mm_cid(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ int mm_users = 0;
++
++ if (mm) {
++ mm_users = atomic_read(&mm->mm_users);
++ if (mm_users == 1)
++ mm->mm_cid_next_scan = jiffies + msecs_to_jiffies(MM_CID_SCAN_DELAY);
++ }
++ t->cid_work.next = &t->cid_work; /* Protect against double add */
++ init_task_work(&t->cid_work, task_mm_cid_work);
++}
++
++void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
++{
++ struct callback_head *work = &curr->cid_work;
++ unsigned long now = jiffies;
++
++ if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) ||
++ work->next != work)
++ return;
++ if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan)))
++ return;
++ task_work_add(curr, work, TWA_RESUME);
++}
++
++void sched_mm_cid_exit_signals(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq_flags rf;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ rq_lock_irqsave(rq, &rf);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 0);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ mm_cid_put(mm);
++ t->last_mm_cid = t->mm_cid = -1;
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++void sched_mm_cid_before_execve(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq_flags rf;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ rq_lock_irqsave(rq, &rf);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 0);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ mm_cid_put(mm);
++ t->last_mm_cid = t->mm_cid = -1;
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++void sched_mm_cid_after_execve(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq_flags rf;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ rq_lock_irqsave(rq, &rf);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 1);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ t->last_mm_cid = t->mm_cid = mm_cid_get(rq, mm);
++ rq_unlock_irqrestore(rq, &rf);
++ rseq_set_notify_resume(t);
++}
++
++void sched_mm_cid_fork(struct task_struct *t)
++{
++ WARN_ON_ONCE(!t->mm || t->mm_cid != -1);
++ t->mm_cid_active = 1;
++}
++#endif
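
For reference, a minimal user-space sketch of the lazy-put reclaim protocol
described in the comments above -- not part of the patch, simplified to a
single per-cpu slot with C11 atomics; the flag bit and helper names here are
illustrative only:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define CID_UNSET    (-1)
#define CID_LAZY_PUT (1 << 30)	/* assumed not to collide with real cids */

static _Atomic int pcpu_cid = 2;	/* pretend cid 2 is cached on this cpu */

/*
 * Mirrors the shape of sched_mm_cid_remote_clear(): mark lazy-put, re-check
 * for an active user, then race to transition to UNSET with a second cmpxchg
 * so that only one thread frees the cid.
 */
static bool remote_clear(bool active_user_seen)
{
	int cid = atomic_load(&pcpu_cid);

	if (cid == CID_UNSET || (cid & CID_LAZY_PUT))
		return false;
	if (!atomic_compare_exchange_strong(&pcpu_cid, &cid, cid | CID_LAZY_PUT))
		return false;
	if (active_user_seen)
		return false;	/* the active user performs the UNSET instead */
	cid |= CID_LAZY_PUT;
	return atomic_compare_exchange_strong(&pcpu_cid, &cid, CID_UNSET);
}

int main(void)
{
	printf("reclaimed: %d, cid now: %d\n",
	       remote_clear(false), atomic_load(&pcpu_cid));
	return 0;
}
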
+diff --git a/kernel/sched/alt_debug.c b/kernel/sched/alt_debug.c
+new file mode 100644
+index 000000000000..1212a031700e
+--- /dev/null
++++ b/kernel/sched/alt_debug.c
+@@ -0,0 +1,31 @@
++/*
++ * kernel/sched/alt_debug.c
++ *
++ * Print the alt scheduler debugging details
++ *
++ * Author: Alfred Chen
++ * Date : 2020
++ */
++#include "sched.h"
++
++/*
++ * This allows printing both to /proc/sched_debug and
++ * to the console
++ */
++#define SEQ_printf(m, x...) \
++ do { \
++ if (m) \
++ seq_printf(m, x); \
++ else \
++ pr_cont(x); \
++ } while (0)
++
++void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
++ struct seq_file *m)
++{
++ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, task_pid_nr_ns(p, ns),
++ get_nr_threads(p));
++}
++
++void proc_sched_set_task(struct task_struct *p)
++{}
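
The SEQ_printf() macro above prints either to a seq_file or to the console.
A hedged user-space sketch of the same dual-sink pattern (not part of the
patch), writing to a supplied stream or falling back to stderr:

#include <stdio.h>

#define DUAL_printf(f, ...)					\
	do {							\
		if (f)						\
			fprintf((f), __VA_ARGS__);		\
		else						\
			fprintf(stderr, __VA_ARGS__);		\
	} while (0)

int main(void)
{
	DUAL_printf(stdout, "to stdout: %d\n", 1);	/* stream supplied */
	DUAL_printf(NULL, "to stderr: %d\n", 2);	/* fallback sink */
	return 0;
}
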
+diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
+new file mode 100644
+index 000000000000..5494f27cdb04
+--- /dev/null
++++ b/kernel/sched/alt_sched.h
+@@ -0,0 +1,906 @@
++#ifndef ALT_SCHED_H
++#define ALT_SCHED_H
++
++#include <linux/context_tracking.h>
++#include <linux/profile.h>
++#include <linux/stop_machine.h>
++#include <linux/syscalls.h>
++#include <linux/tick.h>
++
++#include <trace/events/power.h>
++#include <trace/events/sched.h>
++
++#include "../workqueue_internal.h"
++
++#include "cpupri.h"
++
++#ifdef CONFIG_SCHED_BMQ
++/* bits:
++ * RT(0-99), (Low prio adj range, nice width, high prio adj range) / 2, cpu idle task */
++#define SCHED_LEVELS (MAX_RT_PRIO + NICE_WIDTH / 2 + MAX_PRIORITY_ADJ + 1)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++/* bits: RT(0-24), reserved(25-31), SCHED_NORMAL_PRIO_NUM(32), cpu idle task(1) */
++#define SCHED_LEVELS (64 + 1)
++#endif /* CONFIG_SCHED_PDS */
++
++#define IDLE_TASK_SCHED_PRIO (SCHED_LEVELS - 1)
++
++#ifdef CONFIG_SCHED_DEBUG
++# define SCHED_WARN_ON(x) WARN_ONCE(x, #x)
++extern void resched_latency_warn(int cpu, u64 latency);
++#else
++# define SCHED_WARN_ON(x) ({ (void)(x), 0; })
++static inline void resched_latency_warn(int cpu, u64 latency) {}
++#endif
++
++/*
++ * Increase resolution of nice-level calculations for 64-bit architectures.
++ * The extra resolution improves shares distribution and load balancing of
++ * low-weight task groups (eg. nice +19 on an autogroup), deeper taskgroup
++ * hierarchies, especially on larger systems. This is not a user-visible change
++ * and does not change the user-interface for setting shares/weights.
++ *
++ * We increase resolution only if we have enough bits to allow this increased
++ * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
++ * are pretty high and the returns do not justify the increased costs.
++ *
++ * Really only required when CONFIG_FAIR_GROUP_SCHED=y is also set, but to
++ * increase coverage and consistency always enable it on 64-bit platforms.
++ */
++#ifdef CONFIG_64BIT
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT + SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) ((w) << SCHED_FIXEDPOINT_SHIFT)
++# define scale_load_down(w) \
++({ \
++ unsigned long __w = (w); \
++ if (__w) \
++ __w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT); \
++ __w; \
++})
++#else
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) (w)
++# define scale_load_down(w) (w)
++#endif
++
++#ifdef CONFIG_FAIR_GROUP_SCHED
++#define ROOT_TASK_GROUP_LOAD NICE_0_LOAD
++
++/*
++ * A weight of 0 or 1 can cause arithmetic problems.
++ * The weight of a cfs_rq is the sum of the weights of the entities
++ * queued on it, so the weight of an entity should not be too large,
++ * and neither should the shares value of a task group.
++ * (The default weight is 1024 - so there's no practical
++ * limitation from this.)
++ */
++#define MIN_SHARES (1UL << 1)
++#define MAX_SHARES (1UL << 18)
++#endif
++
++/*
++ * Tunables that become constants when CONFIG_SCHED_DEBUG is off:
++ */
++#ifdef CONFIG_SCHED_DEBUG
++# define const_debug __read_mostly
++#else
++# define const_debug const
++#endif
++
++/* task_struct::on_rq states: */
++#define TASK_ON_RQ_QUEUED 1
++#define TASK_ON_RQ_MIGRATING 2
++
++static inline int task_on_rq_queued(struct task_struct *p)
++{
++ return p->on_rq == TASK_ON_RQ_QUEUED;
++}
++
++static inline int task_on_rq_migrating(struct task_struct *p)
++{
++ return READ_ONCE(p->on_rq) == TASK_ON_RQ_MIGRATING;
++}
++
++/*
++ * wake flags
++ */
++#define WF_SYNC 0x01 /* waker goes to sleep after wakeup */
++#define WF_FORK 0x02 /* child wakeup after fork */
++#define WF_MIGRATED 0x04 /* internal use, task got migrated */
++
++#define SCHED_QUEUE_BITS (SCHED_LEVELS - 1)
++
++struct sched_queue {
++ DECLARE_BITMAP(bitmap, SCHED_QUEUE_BITS);
++ struct list_head heads[SCHED_LEVELS];
++};
++
++struct rq;
++struct cpuidle_state;
++
++struct balance_callback {
++ struct balance_callback *next;
++ void (*func)(struct rq *rq);
++};
++
++/*
++ * This is the main, per-CPU runqueue data structure.
++ * This data should only be modified by the local cpu.
++ */
++struct rq {
++ /* runqueue lock: */
++ raw_spinlock_t lock;
++
++ struct task_struct __rcu *curr;
++ struct task_struct *idle, *stop, *skip;
++ struct mm_struct *prev_mm;
++
++ struct sched_queue queue;
++#ifdef CONFIG_SCHED_PDS
++ u64 time_edge;
++#endif
++ unsigned long prio;
++
++ /* switch count */
++ u64 nr_switches;
++
++ atomic_t nr_iowait;
++
++#ifdef CONFIG_SCHED_DEBUG
++ u64 last_seen_need_resched_ns;
++ int ticks_without_resched;
++#endif
++
++#ifdef CONFIG_MEMBARRIER
++ int membarrier_state;
++#endif
++
++#ifdef CONFIG_SMP
++ int cpu; /* cpu of this runqueue */
++ bool online;
++
++ unsigned int ttwu_pending;
++ unsigned char nohz_idle_balance;
++ unsigned char idle_balance;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ struct sched_avg avg_irq;
++#endif
++
++#ifdef CONFIG_SCHED_SMT
++ int active_balance;
++ struct cpu_stop_work active_balance_work;
++#endif
++ struct balance_callback *balance_callback;
++#ifdef CONFIG_HOTPLUG_CPU
++ struct rcuwait hotplug_wait;
++#endif
++ unsigned int nr_pinned;
++
++#endif /* CONFIG_SMP */
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ u64 prev_irq_time;
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++#ifdef CONFIG_PARAVIRT
++ u64 prev_steal_time;
++#endif /* CONFIG_PARAVIRT */
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++ u64 prev_steal_time_rq;
++#endif /* CONFIG_PARAVIRT_TIME_ACCOUNTING */
++
++ /* For general cpu load util */
++ s32 load_history;
++ u64 load_block;
++ u64 load_stamp;
++
++ /* calc_load related fields */
++ unsigned long calc_load_update;
++ long calc_load_active;
++
++ u64 clock, last_tick;
++ u64 last_ts_switch;
++ u64 clock_task;
++
++ unsigned int nr_running;
++ unsigned long nr_uninterruptible;
++
++#ifdef CONFIG_SCHED_HRTICK
++#ifdef CONFIG_SMP
++ call_single_data_t hrtick_csd;
++#endif
++ struct hrtimer hrtick_timer;
++ ktime_t hrtick_time;
++#endif
++
++#ifdef CONFIG_SCHEDSTATS
++
++ /* latency stats */
++ struct sched_info rq_sched_info;
++ unsigned long long rq_cpu_time;
++ /* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */
++
++ /* sys_sched_yield() stats */
++ unsigned int yld_count;
++
++ /* schedule() stats */
++ unsigned int sched_switch;
++ unsigned int sched_count;
++ unsigned int sched_goidle;
++
++ /* try_to_wake_up() stats */
++ unsigned int ttwu_count;
++ unsigned int ttwu_local;
++#endif /* CONFIG_SCHEDSTATS */
++
++#ifdef CONFIG_CPU_IDLE
++ /* Must be inspected within an rcu lock section */
++ struct cpuidle_state *idle_state;
++#endif
++
++#ifdef CONFIG_NO_HZ_COMMON
++#ifdef CONFIG_SMP
++ call_single_data_t nohz_csd;
++#endif
++ atomic_t nohz_flags;
++#endif /* CONFIG_NO_HZ_COMMON */
++
++ /* Scratch cpumask to be temporarily used under rq_lock */
++ cpumask_var_t scratch_mask;
++};
++
++extern unsigned long rq_load_util(struct rq *rq, unsigned long max);
++
++extern unsigned long calc_load_update;
++extern atomic_long_t calc_load_tasks;
++
++extern void calc_global_load_tick(struct rq *this_rq);
++extern long calc_load_fold_active(struct rq *this_rq, long adjust);
++
++DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
++#define this_rq() this_cpu_ptr(&runqueues)
++#define task_rq(p) cpu_rq(task_cpu(p))
++#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
++#define raw_rq() raw_cpu_ptr(&runqueues)
++
++#ifdef CONFIG_SMP
++#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_SYSCTL)
++void register_sched_domain_sysctl(void);
++void unregister_sched_domain_sysctl(void);
++#else
++static inline void register_sched_domain_sysctl(void)
++{
++}
++static inline void unregister_sched_domain_sysctl(void)
++{
++}
++#endif
++
++extern bool sched_smp_initialized;
++
++enum {
++ ITSELF_LEVEL_SPACE_HOLDER,
++#ifdef CONFIG_SCHED_SMT
++ SMT_LEVEL_SPACE_HOLDER,
++#endif
++ COREGROUP_LEVEL_SPACE_HOLDER,
++ CORE_LEVEL_SPACE_HOLDER,
++ OTHER_LEVEL_SPACE_HOLDER,
++ NR_CPU_AFFINITY_LEVELS
++};
++
++DECLARE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++
++static inline int
++__best_mask_cpu(const cpumask_t *cpumask, const cpumask_t *mask)
++{
++ int cpu;
++
++ while ((cpu = cpumask_any_and(cpumask, mask)) >= nr_cpu_ids)
++ mask++;
++
++ return cpu;
++}
++
++static inline int best_mask_cpu(int cpu, const cpumask_t *mask)
++{
++ return __best_mask_cpu(mask, per_cpu(sched_cpu_topo_masks, cpu));
++}
++
++extern void flush_smp_call_function_queue(void);
++
++#else /* !CONFIG_SMP */
++static inline void flush_smp_call_function_queue(void) { }
++#endif
++
++#ifndef arch_scale_freq_tick
++static __always_inline
++void arch_scale_freq_tick(void)
++{
++}
++#endif
++
++#ifndef arch_scale_freq_capacity
++static __always_inline
++unsigned long arch_scale_freq_capacity(int cpu)
++{
++ return SCHED_CAPACITY_SCALE;
++}
++#endif
++
++static inline u64 __rq_clock_broken(struct rq *rq)
++{
++ return READ_ONCE(rq->clock);
++}
++
++static inline u64 rq_clock(struct rq *rq)
++{
++ /*
++ * Relax lockdep_assert_held() checking as in VRQ: a call to
++ * sched_info_xxxx() may not hold rq->lock.
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock;
++}
++
++static inline u64 rq_clock_task(struct rq *rq)
++{
++ /*
++ * Relax lockdep_assert_held() checking as in VRQ: a call to
++ * sched_info_xxxx() may not hold rq->lock.
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock_task;
++}
++
++/*
++ * {de,en}queue flags:
++ *
++ * DEQUEUE_SLEEP - task is no longer runnable
++ * ENQUEUE_WAKEUP - task just became runnable
++ *
++ */
++
++#define DEQUEUE_SLEEP 0x01
++
++#define ENQUEUE_WAKEUP 0x01
++
++
++/*
++ * Below are scheduler APIs used by other kernel code.
++ * They use the dummy rq_flags.
++ * TODO: BMQ needs to support these APIs for compatibility with the
++ * mainline scheduler code.
++ */
++struct rq_flags {
++ unsigned long flags;
++};
++
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock);
++
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock);
++
++static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++}
++
++static inline void
++rq_lock(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock(&rq->lock);
++}
++
++static inline void
++rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++rq_lock_irq(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irq(&rq->lock);
++}
++
++static inline void
++rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++static inline struct rq *
++this_rq_lock_irq(struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ local_irq_disable();
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ return rq;
++}
++
++static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
++{
++ return &rq->lock;
++}
++
++static inline raw_spinlock_t *rq_lockp(struct rq *rq)
++{
++ return __rq_lockp(rq);
++}
++
++static inline void lockdep_assert_rq_held(struct rq *rq)
++{
++ lockdep_assert_held(__rq_lockp(rq));
++}
++
++extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
++extern void raw_spin_rq_unlock(struct rq *rq);
++
++static inline void raw_spin_rq_lock(struct rq *rq)
++{
++ raw_spin_rq_lock_nested(rq, 0);
++}
++
++static inline void raw_spin_rq_lock_irq(struct rq *rq)
++{
++ local_irq_disable();
++ raw_spin_rq_lock(rq);
++}
++
++static inline void raw_spin_rq_unlock_irq(struct rq *rq)
++{
++ raw_spin_rq_unlock(rq);
++ local_irq_enable();
++}
++
++static inline int task_current(struct rq *rq, struct task_struct *p)
++{
++ return rq->curr == p;
++}
++
++static inline bool task_on_cpu(struct task_struct *p)
++{
++ return p->on_cpu;
++}
++
++extern int task_running_nice(struct task_struct *p);
++
++extern struct static_key_false sched_schedstats;
++
++#ifdef CONFIG_CPU_IDLE
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++ rq->idle_state = idle_state;
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ WARN_ON(!rcu_read_lock_held());
++ return rq->idle_state;
++}
++#else
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ return NULL;
++}
++#endif
++
++static inline int cpu_of(const struct rq *rq)
++{
++#ifdef CONFIG_SMP
++ return rq->cpu;
++#else
++ return 0;
++#endif
++}
++
++#include "stats.h"
++
++#ifdef CONFIG_NO_HZ_COMMON
++#define NOHZ_BALANCE_KICK_BIT 0
++#define NOHZ_STATS_KICK_BIT 1
++
++#define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT)
++#define NOHZ_STATS_KICK BIT(NOHZ_STATS_KICK_BIT)
++
++#define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
++
++#define nohz_flags(cpu) (&cpu_rq(cpu)->nohz_flags)
++
++/* TODO: needed?
++extern void nohz_balance_exit_idle(struct rq *rq);
++#else
++static inline void nohz_balance_exit_idle(struct rq *rq) { }
++*/
++#endif
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++struct irqtime {
++ u64 total;
++ u64 tick_delta;
++ u64 irq_start_time;
++ struct u64_stats_sync sync;
++};
++
++DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
++
++/*
++ * Returns the irqtime minus the softirq time computed by ksoftirqd.
++ * Otherwise ksoftirqd's sum_exec_runtime would have its own runtime
++ * subtracted and would never move forward.
++ */
++static inline u64 irq_time_read(int cpu)
++{
++ struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
++ unsigned int seq;
++ u64 total;
++
++ do {
++ seq = __u64_stats_fetch_begin(&irqtime->sync);
++ total = irqtime->total;
++ } while (__u64_stats_fetch_retry(&irqtime->sync, seq));
++
++ return total;
++}
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++
++#ifdef CONFIG_CPU_FREQ
++DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
++#endif /* CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++extern int __init sched_tick_offload_init(void);
++#else
++static inline int sched_tick_offload_init(void) { return 0; }
++#endif
++
++#ifdef arch_scale_freq_capacity
++#ifndef arch_scale_freq_invariant
++#define arch_scale_freq_invariant() (true)
++#endif
++#else /* arch_scale_freq_capacity */
++#define arch_scale_freq_invariant() (false)
++#endif
++
++extern void schedule_idle(void);
++
++#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
++
++/*
++ * !! For sched_setattr_nocheck() (kernel) only !!
++ *
++ * This is actually gross. :(
++ *
++ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
++ * tasks, but still be able to sleep. We need this on platforms that cannot
++ * atomically change clock frequency. Remove once fast switching will be
++ * available on such platforms.
++ *
++ * SUGOV stands for SchedUtil GOVernor.
++ */
++#define SCHED_FLAG_SUGOV 0x10000000
++
++#ifdef CONFIG_MEMBARRIER
++/*
++ * The scheduler provides memory barriers required by membarrier between:
++ * - prior user-space memory accesses and store to rq->membarrier_state,
++ * - store to rq->membarrier_state and following user-space memory accesses.
++ * In the same way it provides those guarantees around store to rq->curr.
++ */
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++ int membarrier_state;
++
++ if (prev_mm == next_mm)
++ return;
++
++ membarrier_state = atomic_read(&next_mm->membarrier_state);
++ if (READ_ONCE(rq->membarrier_state) == membarrier_state)
++ return;
++
++ WRITE_ONCE(rq->membarrier_state, membarrier_state);
++}
++#else
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++}
++#endif
++
++#ifdef CONFIG_NUMA
++extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
++#else
++static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return nr_cpu_ids;
++}
++#endif
++
++extern void swake_up_all_locked(struct swait_queue_head *q);
++extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++extern int preempt_dynamic_mode;
++extern int sched_dynamic_mode(const char *str);
++extern void sched_dynamic_update(int mode);
++#endif
++
++static inline void nohz_run_idle_balance(int cpu) { }
++
++static inline
++unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
++ struct task_struct *p)
++{
++ return util;
++}
++
++static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
++
++#ifdef CONFIG_SCHED_MM_CID
++
++#define SCHED_MM_CID_PERIOD_NS (100ULL * 1000000) /* 100ms */
++#define MM_CID_SCAN_DELAY 100 /* 100ms */
++
++extern raw_spinlock_t cid_lock;
++extern int use_cid_lock;
++
++extern void sched_mm_cid_migrate_from(struct task_struct *t);
++extern void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu);
++extern void task_tick_mm_cid(struct rq *rq, struct task_struct *curr);
++extern void init_sched_mm_cid(struct task_struct *t);
++
++static inline void __mm_cid_put(struct mm_struct *mm, int cid)
++{
++ if (cid < 0)
++ return;
++ cpumask_clear_cpu(cid, mm_cidmask(mm));
++}
++
++/*
++ * The per-mm/cpu cid can have the MM_CID_LAZY_PUT flag set or transition to
++ * the MM_CID_UNSET state without holding the rq lock, but the rq lock needs to
++ * be held to transition to other states.
++ *
++ * State transitions synchronized with cmpxchg or try_cmpxchg need to be
++ * consistent across cpus, which prevents use of this_cpu_cmpxchg.
++ */
++static inline void mm_cid_put_lazy(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid;
++
++ lockdep_assert_irqs_disabled();
++ cid = __this_cpu_read(pcpu_cid->cid);
++ if (!mm_cid_is_lazy_put(cid) ||
++ !try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
++ return;
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++}
++
++static inline int mm_cid_pcpu_unset(struct mm_struct *mm)
++{
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid, res;
++
++ lockdep_assert_irqs_disabled();
++ cid = __this_cpu_read(pcpu_cid->cid);
++ for (;;) {
++ if (mm_cid_is_unset(cid))
++ return MM_CID_UNSET;
++ /*
++ * Attempt transition from valid or lazy-put to unset.
++ */
++ res = cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, cid, MM_CID_UNSET);
++ if (res == cid)
++ break;
++ cid = res;
++ }
++ return cid;
++}
++
++static inline void mm_cid_put(struct mm_struct *mm)
++{
++ int cid;
++
++ lockdep_assert_irqs_disabled();
++ cid = mm_cid_pcpu_unset(mm);
++ if (cid == MM_CID_UNSET)
++ return;
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++}
++
++static inline int __mm_cid_try_get(struct mm_struct *mm)
++{
++ struct cpumask *cpumask;
++ int cid;
++
++ cpumask = mm_cidmask(mm);
++ /*
++ * Retry finding first zero bit if the mask is temporarily
++ * filled. This only happens during concurrent remote-clear
++ * which owns a cid without holding a rq lock.
++ */
++ for (;;) {
++ cid = cpumask_first_zero(cpumask);
++ if (cid < nr_cpu_ids)
++ break;
++ cpu_relax();
++ }
++ if (cpumask_test_and_set_cpu(cid, cpumask))
++ return -1;
++ return cid;
++}
++
++/*
++ * Save a snapshot of the current runqueue time of this cpu
++ * with the per-cpu cid value, allowing to estimate how recently it was used.
++ */
++static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
++{
++ struct mm_cid *pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(rq));
++
++ lockdep_assert_rq_held(rq);
++ WRITE_ONCE(pcpu_cid->time, rq->clock);
++}
++
++static inline int __mm_cid_get(struct rq *rq, struct mm_struct *mm)
++{
++ int cid;
++
++ /*
++ * All allocations (even those using the cid_lock) are lock-free. If
++ * use_cid_lock is set, hold the cid_lock to perform cid allocation to
++ * guarantee forward progress.
++ */
++ if (!READ_ONCE(use_cid_lock)) {
++ cid = __mm_cid_try_get(mm);
++ if (cid >= 0)
++ goto end;
++ raw_spin_lock(&cid_lock);
++ } else {
++ raw_spin_lock(&cid_lock);
++ cid = __mm_cid_try_get(mm);
++ if (cid >= 0)
++ goto unlock;
++ }
++
++ /*
++ * The cid was concurrently allocated. Retry while forcing subsequent
++ * allocations to use the cid_lock to ensure forward progress.
++ */
++ WRITE_ONCE(use_cid_lock, 1);
++ /*
++ * Set use_cid_lock before allocation. Only care about program order
++ * because this is only required for forward progress.
++ */
++ barrier();
++ /*
++ * Retry until it succeeds. It is guaranteed to eventually succeed once
++ * all newly arriving allocations observe the use_cid_lock flag set.
++ */
++ do {
++ cid = __mm_cid_try_get(mm);
++ cpu_relax();
++ } while (cid < 0);
++ /*
++ * Allocate before clearing use_cid_lock. Only care about
++ * program order because this is for forward progress.
++ */
++ barrier();
++ WRITE_ONCE(use_cid_lock, 0);
++unlock:
++ raw_spin_unlock(&cid_lock);
++end:
++ mm_cid_snapshot_time(rq, mm);
++ return cid;
++}
++
++static inline int mm_cid_get(struct rq *rq, struct mm_struct *mm)
++{
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ struct cpumask *cpumask;
++ int cid;
++
++ lockdep_assert_rq_held(rq);
++ cpumask = mm_cidmask(mm);
++ cid = __this_cpu_read(pcpu_cid->cid);
++ if (mm_cid_is_valid(cid)) {
++ mm_cid_snapshot_time(rq, mm);
++ return cid;
++ }
++ if (mm_cid_is_lazy_put(cid)) {
++ if (try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++ }
++ cid = __mm_cid_get(rq, mm);
++ __this_cpu_write(pcpu_cid->cid, cid);
++ return cid;
++}
++
++static inline void switch_mm_cid(struct rq *rq,
++ struct task_struct *prev,
++ struct task_struct *next)
++{
++ /*
++ * Provide a memory barrier between rq->curr store and load of
++ * {prev,next}->mm->pcpu_cid[cpu] on rq->curr->mm transition.
++ *
++ * Should be adapted if context_switch() is modified.
++ */
++ if (!next->mm) { // to kernel
++ /*
++ * user -> kernel transition does not guarantee a barrier, but
++ * we can use the fact that it performs an atomic operation in
++ * mmgrab().
++ */
++ if (prev->mm) // from user
++ smp_mb__after_mmgrab();
++ /*
++ * kernel -> kernel transition does not change rq->curr->mm
++ * state. It stays NULL.
++ */
++ } else { // to user
++ /*
++ * kernel -> user transition does not provide a barrier
++ * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
++ * Provide it here.
++ */
++ if (!prev->mm) // from kernel
++ smp_mb();
++ /*
++ * user -> user transition guarantees a memory barrier through
++ * switch_mm() when current->mm changes. If current->mm is
++ * unchanged, no barrier is needed.
++ */
++ }
++ if (prev->mm_cid_active) {
++ mm_cid_snapshot_time(rq, prev->mm);
++ mm_cid_put_lazy(prev);
++ prev->mm_cid = -1;
++ }
++ if (next->mm_cid_active)
++ next->last_mm_cid = next->mm_cid = mm_cid_get(rq, next->mm);
++}
++
++#else
++static inline void switch_mm_cid(struct rq *rq, struct task_struct *prev, struct task_struct *next) { }
++static inline void sched_mm_cid_migrate_from(struct task_struct *t) { }
++static inline void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu) { }
++static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
++static inline void init_sched_mm_cid(struct task_struct *t) { }
++#endif
++
++#endif /* ALT_SCHED_H */
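
The __mm_cid_get() path above combines a lock-free fast path with a lock
fallback that guarantees forward progress under contention. A minimal
user-space sketch of that allocation pattern (not part of the patch; the
bitmask size and helper names are assumptions):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NR_IDS 64

static _Atomic unsigned long long idmask;	/* bit set => id in use */
static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool use_lock;

/* Lock-free attempt: claim the first clear bit, or return -1. */
static int try_get_id(void)
{
	unsigned long long mask = atomic_load(&idmask);

	for (int id = 0; id < NR_IDS; id++) {
		if (mask & (1ULL << id))
			continue;
		if (!(atomic_fetch_or(&idmask, 1ULL << id) & (1ULL << id)))
			return id;	/* we set the bit first */
		mask = atomic_load(&idmask);	/* lost the race, rescan */
	}
	return -1;
}

static int get_id(void)
{
	int id;

	if (!atomic_load(&use_lock)) {
		id = try_get_id();
		if (id >= 0)
			return id;
	}
	/* Contended: serialize allocations until this one succeeds. */
	pthread_mutex_lock(&id_lock);
	atomic_store(&use_lock, true);
	while ((id = try_get_id()) < 0)
		;	/* ids are assumed to be released eventually */
	atomic_store(&use_lock, false);
	pthread_mutex_unlock(&id_lock);
	return id;
}

int main(void)
{
	printf("got id %d\n", get_id());
	return 0;
}
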
+diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
+new file mode 100644
+index 000000000000..f29b8f3aa786
+--- /dev/null
++++ b/kernel/sched/bmq.h
+@@ -0,0 +1,110 @@
++#define ALT_SCHED_NAME "BMQ"
++
++/*
++ * BMQ only routines
++ */
++#define rq_switch_time(rq) ((rq)->clock - (rq)->last_ts_switch)
++#define boost_threshold(p) (sched_timeslice_ns >>\
++ (15 - MAX_PRIORITY_ADJ - (p)->boost_prio))
++
++static inline void boost_task(struct task_struct *p)
++{
++ int limit;
++
++ switch (p->policy) {
++ case SCHED_NORMAL:
++ limit = -MAX_PRIORITY_ADJ;
++ break;
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ limit = 0;
++ break;
++ default:
++ return;
++ }
++
++ if (p->boost_prio > limit)
++ p->boost_prio--;
++}
++
++static inline void deboost_task(struct task_struct *p)
++{
++ if (p->boost_prio < MAX_PRIORITY_ADJ)
++ p->boost_prio++;
++}
++
++/*
++ * Common interfaces
++ */
++static inline void sched_timeslice_imp(const int timeslice_ms) {}
++
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ return p->prio + p->boost_prio - MAX_RT_PRIO;
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MAX_RT_PRIO) ? p->prio : MAX_RT_PRIO / 2 + (p->prio + p->boost_prio) / 2;
++}
++
++static inline int
++task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
++{
++ return task_sched_prio(p);
++}
++
++static inline int sched_prio2idx(int prio, struct rq *rq)
++{
++ return prio;
++}
++
++static inline int sched_idx2prio(int idx, struct rq *rq)
++{
++ return idx;
++}
++
++static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sched_timeslice_ns;
++
++ if (SCHED_FIFO != p->policy && task_on_rq_queued(p)) {
++ if (SCHED_RR != p->policy)
++ deboost_task(p);
++ requeue_task(p, rq, task_sched_prio_idx(p, rq));
++ }
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq) {}
++
++inline int task_running_nice(struct task_struct *p)
++{
++ return (p->prio + p->boost_prio > DEFAULT_PRIO + MAX_PRIORITY_ADJ);
++}
++
++static void sched_task_fork(struct task_struct *p, struct rq *rq)
++{
++ p->boost_prio = MAX_PRIORITY_ADJ;
++}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ p->boost_prio = MAX_PRIORITY_ADJ;
++}
++
++#ifdef CONFIG_SMP
++static inline void sched_task_ttwu(struct task_struct *p)
++{
++ if (this_rq()->clock_task - p->last_ran > sched_timeslice_ns)
++ boost_task(p);
++}
++#endif
++
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq)
++{
++ if (rq_switch_time(rq) < boost_threshold(p))
++ boost_task(p);
++}
++
++static inline void update_rq_time_edge(struct rq *rq) {}
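
The BMQ mapping above folds boost_prio into the queue level for normal tasks
while leaving RT priorities untouched. A stand-alone sketch of that mapping
(not part of the patch; MAX_RT_PRIO = 100 is the usual kernel value and the
sample priorities are assumptions):

#include <stdio.h>

#define MAX_RT_PRIO 100

/* Same arithmetic as BMQ's task_sched_prio() above. */
static int bmq_level(int prio, int boost_prio)
{
	return prio < MAX_RT_PRIO ? prio
				  : MAX_RT_PRIO / 2 + (prio + boost_prio) / 2;
}

int main(void)
{
	/* RT prio 10, then nice-0 (prio 120) unboosted and boosted by -2. */
	printf("%d %d %d\n",
	       bmq_level(10, 0), bmq_level(120, 0), bmq_level(120, -2));
	return 0;
}
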
+diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
+index d9dc9ab3773f..71a25540d65e 100644
+--- a/kernel/sched/build_policy.c
++++ b/kernel/sched/build_policy.c
+@@ -42,13 +42,19 @@
+
+ #include "idle.c"
+
++#ifndef CONFIG_SCHED_ALT
+ #include "rt.c"
++#endif
+
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ # include "cpudeadline.c"
++#endif
+ # include "pelt.c"
+ #endif
+
+ #include "cputime.c"
+-#include "deadline.c"
+
++#ifndef CONFIG_SCHED_ALT
++#include "deadline.c"
++#endif
+diff --git a/kernel/sched/build_utility.c b/kernel/sched/build_utility.c
+index 99bdd96f454f..23f80a86d2d7 100644
+--- a/kernel/sched/build_utility.c
++++ b/kernel/sched/build_utility.c
+@@ -85,7 +85,9 @@
+
+ #ifdef CONFIG_SMP
+ # include "cpupri.c"
++#ifndef CONFIG_SCHED_ALT
+ # include "stop_task.c"
++#endif
+ # include "topology.c"
+ #endif
+
+diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
+index e3211455b203..87f7a4f732c8 100644
+--- a/kernel/sched/cpufreq_schedutil.c
++++ b/kernel/sched/cpufreq_schedutil.c
+@@ -157,9 +157,14 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
+ {
+ struct rq *rq = cpu_rq(sg_cpu->cpu);
+
++#ifndef CONFIG_SCHED_ALT
+ sg_cpu->bw_dl = cpu_bw_dl(rq);
+ sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu),
+ FREQUENCY_UTIL, NULL);
++#else
++ sg_cpu->bw_dl = 0;
++ sg_cpu->util = rq_load_util(rq, arch_scale_cpu_capacity(sg_cpu->cpu));
++#endif /* CONFIG_SCHED_ALT */
+ }
+
+ /**
+@@ -305,8 +310,10 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
+ */
+ static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
+ {
++#ifndef CONFIG_SCHED_ALT
+ if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_dl)
+ sg_cpu->sg_policy->limits_changed = true;
++#endif
+ }
+
+ static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
+@@ -609,6 +616,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
+ }
+
+ ret = sched_setattr_nocheck(thread, &attr);
++
+ if (ret) {
+ kthread_stop(thread);
+ pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
+@@ -841,7 +849,9 @@ cpufreq_governor_init(schedutil_gov);
+ #ifdef CONFIG_ENERGY_MODEL
+ static void rebuild_sd_workfn(struct work_struct *work)
+ {
++#ifndef CONFIG_SCHED_ALT
+ rebuild_sched_domains_energy();
++#endif /* CONFIG_SCHED_ALT */
+ }
+ static DECLARE_WORK(rebuild_sd_work, rebuild_sd_workfn);
+
+diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
+index af7952f12e6c..6461cbbb734d 100644
+--- a/kernel/sched/cputime.c
++++ b/kernel/sched/cputime.c
+@@ -126,7 +126,7 @@ void account_user_time(struct task_struct *p, u64 cputime)
+ p->utime += cputime;
+ account_group_user_time(p, cputime);
+
+- index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
++ index = task_running_nice(p) ? CPUTIME_NICE : CPUTIME_USER;
+
+ /* Add user time to cpustat. */
+ task_group_account_field(p, index, cputime);
+@@ -150,7 +150,7 @@ void account_guest_time(struct task_struct *p, u64 cputime)
+ p->gtime += cputime;
+
+ /* Add guest time to cpustat. */
+- if (task_nice(p) > 0) {
++ if (task_running_nice(p)) {
+ task_group_account_field(p, CPUTIME_NICE, cputime);
+ cpustat[CPUTIME_GUEST_NICE] += cputime;
+ } else {
+@@ -288,7 +288,7 @@ static inline u64 account_other_time(u64 max)
+ #ifdef CONFIG_64BIT
+ static inline u64 read_sum_exec_runtime(struct task_struct *t)
+ {
+- return t->se.sum_exec_runtime;
++ return tsk_seruntime(t);
+ }
+ #else
+ static u64 read_sum_exec_runtime(struct task_struct *t)
+@@ -298,7 +298,7 @@ static u64 read_sum_exec_runtime(struct task_struct *t)
+ struct rq *rq;
+
+ rq = task_rq_lock(t, &rf);
+- ns = t->se.sum_exec_runtime;
++ ns = tsk_seruntime(t);
+ task_rq_unlock(rq, t, &rf);
+
+ return ns;
+@@ -630,7 +630,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
+ void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
+ {
+ struct task_cputime cputime = {
+- .sum_exec_runtime = p->se.sum_exec_runtime,
++ .sum_exec_runtime = tsk_seruntime(p),
+ };
+
+ if (task_cputime(p, &cputime.utime, &cputime.stime))
+diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
+index 0b2340a79b65..1e5407b8a738 100644
+--- a/kernel/sched/debug.c
++++ b/kernel/sched/debug.c
+@@ -7,6 +7,7 @@
+ * Copyright(C) 2007, Red Hat, Inc., Ingo Molnar
+ */
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * This allows printing both to /proc/sched_debug and
+ * to the console
+@@ -215,6 +216,7 @@ static const struct file_operations sched_scaling_fops = {
+ };
+
+ #endif /* SMP */
++#endif /* !CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+
+@@ -278,6 +280,7 @@ static const struct file_operations sched_dynamic_fops = {
+
+ #endif /* CONFIG_PREEMPT_DYNAMIC */
+
++#ifndef CONFIG_SCHED_ALT
+ __read_mostly bool sched_debug_verbose;
+
+ #ifdef CONFIG_SMP
+@@ -332,6 +335,7 @@ static const struct file_operations sched_debug_fops = {
+ .llseek = seq_lseek,
+ .release = seq_release,
+ };
++#endif /* !CONFIG_SCHED_ALT */
+
+ static struct dentry *debugfs_sched;
+
+@@ -341,12 +345,16 @@ static __init int sched_init_debug(void)
+
+ debugfs_sched = debugfs_create_dir("sched", NULL);
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_file("features", 0644, debugfs_sched, NULL, &sched_feat_fops);
+ debugfs_create_file_unsafe("verbose", 0644, debugfs_sched, &sched_debug_verbose, &sched_verbose_fops);
++ debugfs_create_bool("verbose", 0644, debugfs_sched, &sched_debug_verbose);
++#endif /* !CONFIG_SCHED_ALT */
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+ debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
+ #endif
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_u32("latency_ns", 0644, debugfs_sched, &sysctl_sched_latency);
+ debugfs_create_u32("min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_min_granularity);
+ debugfs_create_u32("idle_min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_idle_min_granularity);
+@@ -376,11 +384,13 @@ static __init int sched_init_debug(void)
+ #endif
+
+ debugfs_create_file("debug", 0444, debugfs_sched, NULL, &sched_debug_fops);
++#endif /* !CONFIG_SCHED_ALT */
+
+ return 0;
+ }
+ late_initcall(sched_init_debug);
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_SMP
+
+ static cpumask_var_t sd_sysctl_cpus;
+@@ -1114,6 +1124,7 @@ void proc_sched_set_task(struct task_struct *p)
+ memset(&p->stats, 0, sizeof(p->stats));
+ #endif
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ void resched_latency_warn(int cpu, u64 latency)
+ {
+diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
+index 342f58a329f5..ab493e759084 100644
+--- a/kernel/sched/idle.c
++++ b/kernel/sched/idle.c
+@@ -379,6 +379,7 @@ void cpu_startup_entry(enum cpuhp_state state)
+ do_idle();
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * idle-task scheduling class.
+ */
+@@ -500,3 +501,4 @@ DEFINE_SCHED_CLASS(idle) = {
+ .switched_to = switched_to_idle,
+ .update_curr = update_curr_idle,
+ };
++#endif
+diff --git a/kernel/sched/pds.h b/kernel/sched/pds.h
+new file mode 100644
+index 000000000000..15cc4887efed
+--- /dev/null
++++ b/kernel/sched/pds.h
+@@ -0,0 +1,152 @@
++#define ALT_SCHED_NAME "PDS"
++
++#define MIN_SCHED_NORMAL_PRIO (32)
++static const u64 RT_MASK = ((1ULL << MIN_SCHED_NORMAL_PRIO) - 1);
++
++#define SCHED_NORMAL_PRIO_NUM (32)
++#define SCHED_EDGE_DELTA (SCHED_NORMAL_PRIO_NUM - NICE_WIDTH / 2)
++
++/* PDS assumes SCHED_NORMAL_PRIO_NUM is a power of 2 */
++#define SCHED_NORMAL_PRIO_MOD(x) ((x) & (SCHED_NORMAL_PRIO_NUM - 1))
++
++/* default time slice 4ms -> shift 22, 2 time slice slots -> shift 23 */
++static __read_mostly int sched_timeslice_shift = 23;
++
++/*
++ * Common interfaces
++ */
++static inline void sched_timeslice_imp(const int timeslice_ms)
++{
++ if (2 == timeslice_ms)
++ sched_timeslice_shift = 22;
++}
++
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ s64 delta = p->deadline - rq->time_edge + SCHED_EDGE_DELTA;
++
++#ifdef ALT_SCHED_DEBUG
++ if (WARN_ONCE(delta > SCHED_NORMAL_PRIO_NUM - 1,
++ "pds: task_sched_prio_normal() delta %lld\n", delta))
++ return SCHED_NORMAL_PRIO_NUM - 1;
++#endif
++
++ return max(0LL, delta);
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MIN_NORMAL_PRIO) ? (p->prio >> 2) :
++ MIN_SCHED_NORMAL_PRIO + task_sched_prio_normal(p, task_rq(p));
++}
++
++static inline int
++task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
++{
++ u64 idx;
++
++ if (p->prio < MIN_NORMAL_PRIO)
++ return p->prio >> 2;
++
++ idx = max(p->deadline + SCHED_EDGE_DELTA, rq->time_edge);
++ /*printk(KERN_INFO "sched: task_sched_prio_idx edge:%llu, deadline=%llu idx=%llu\n", rq->time_edge, p->deadline, idx);*/
++ return MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(idx);
++}
++
++static inline int sched_prio2idx(int sched_prio, struct rq *rq)
++{
++ return (IDLE_TASK_SCHED_PRIO == sched_prio || sched_prio < MIN_SCHED_NORMAL_PRIO) ?
++ sched_prio :
++ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_prio + rq->time_edge);
++}
++
++static inline int sched_idx2prio(int sched_idx, struct rq *rq)
++{
++ return (sched_idx < MIN_SCHED_NORMAL_PRIO) ?
++ sched_idx :
++ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_idx - rq->time_edge);
++}
++
++static inline void sched_renew_deadline(struct task_struct *p, const struct rq *rq)
++{
++ if (p->prio >= MIN_NORMAL_PRIO)
++ p->deadline = rq->time_edge + (p->static_prio - (MAX_PRIO - NICE_WIDTH)) / 2;
++}
++
++int task_running_nice(struct task_struct *p)
++{
++ return (p->prio > DEFAULT_PRIO);
++}
++
++static inline void update_rq_time_edge(struct rq *rq)
++{
++ struct list_head head;
++ u64 old = rq->time_edge;
++ u64 now = rq->clock >> sched_timeslice_shift;
++ u64 prio, delta;
++ DECLARE_BITMAP(normal, SCHED_QUEUE_BITS);
++
++ if (now == old)
++ return;
++
++ rq->time_edge = now;
++ delta = min_t(u64, SCHED_NORMAL_PRIO_NUM, now - old);
++ INIT_LIST_HEAD(&head);
++
++ /*printk(KERN_INFO "sched: update_rq_time_edge 0x%016lx %llu\n", rq->queue.bitmap[0], delta);*/
++ prio = MIN_SCHED_NORMAL_PRIO;
++ for_each_set_bit_from(prio, rq->queue.bitmap, MIN_SCHED_NORMAL_PRIO + delta)
++ list_splice_tail_init(rq->queue.heads + MIN_SCHED_NORMAL_PRIO +
++ SCHED_NORMAL_PRIO_MOD(prio + old), &head);
++
++ bitmap_shift_right(normal, rq->queue.bitmap, delta, SCHED_QUEUE_BITS);
++ if (!list_empty(&head)) {
++ struct task_struct *p;
++ u64 idx = MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(now);
++
++ list_for_each_entry(p, &head, sq_node)
++ p->sq_idx = idx;
++
++ list_splice(&head, rq->queue.heads + idx);
++ set_bit(MIN_SCHED_NORMAL_PRIO, normal);
++ }
++ bitmap_replace(rq->queue.bitmap, normal, rq->queue.bitmap,
++ (const unsigned long *)&RT_MASK, SCHED_QUEUE_BITS);
++
++ if (rq->prio < MIN_SCHED_NORMAL_PRIO || IDLE_TASK_SCHED_PRIO == rq->prio)
++ return;
++
++ rq->prio = (rq->prio < MIN_SCHED_NORMAL_PRIO + delta) ?
++ MIN_SCHED_NORMAL_PRIO : rq->prio - delta;
++}
++
++static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sched_timeslice_ns;
++ sched_renew_deadline(p, rq);
++ if (SCHED_FIFO != p->policy && task_on_rq_queued(p))
++ requeue_task(p, rq, task_sched_prio_idx(p, rq));
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq)
++{
++ u64 max_dl = rq->time_edge + NICE_WIDTH / 2 - 1;
++ if (unlikely(p->deadline > max_dl))
++ p->deadline = max_dl;
++}
++
++static void sched_task_fork(struct task_struct *p, struct rq *rq)
++{
++ sched_renew_deadline(p, rq);
++}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ time_slice_expired(p, rq);
++}
++
++#ifdef CONFIG_SMP
++static inline void sched_task_ttwu(struct task_struct *p) {}
++#endif
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq) {}
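
PDS above indexes its normal-priority queues modulo SCHED_NORMAL_PRIO_NUM
relative to the runqueue's time_edge, so advancing time_edge rotates the ring
instead of requeueing every task. A simplified sketch of that slot arithmetic
(not part of the patch; SCHED_EDGE_DELTA is omitted and the sample values are
assumptions):

#include <stdio.h>

#define SCHED_NORMAL_PRIO_NUM 32
#define MIN_SCHED_NORMAL_PRIO 32
#define SCHED_NORMAL_PRIO_MOD(x) ((x) & (SCHED_NORMAL_PRIO_NUM - 1))

/* Queue slot for a deadline, clamped so it never lands behind time_edge. */
static int slot_for(unsigned long long deadline, unsigned long long time_edge)
{
	unsigned long long idx = deadline > time_edge ? deadline : time_edge;

	return MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(idx);
}

int main(void)
{
	/* The same deadline keeps its slot until time_edge passes it. */
	printf("%d %d %d\n",
	       slot_for(37, 35), slot_for(37, 37), slot_for(37, 40));
	return 0;
}
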
+diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
+index 0f310768260c..bd38bf738fe9 100644
+--- a/kernel/sched/pelt.c
++++ b/kernel/sched/pelt.c
+@@ -266,6 +266,7 @@ ___update_load_avg(struct sched_avg *sa, unsigned long load)
+ WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * sched_entity:
+ *
+@@ -383,8 +384,9 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
+
+ return 0;
+ }
++#endif
+
+-#ifdef CONFIG_SCHED_THERMAL_PRESSURE
++#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * thermal:
+ *
+diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
+index 3a0e0dc28721..e8a7d84aa5a5 100644
+--- a/kernel/sched/pelt.h
++++ b/kernel/sched/pelt.h
+@@ -1,13 +1,15 @@
+ #ifdef CONFIG_SMP
+ #include "sched-pelt.h"
+
++#ifndef CONFIG_SCHED_ALT
+ int __update_load_avg_blocked_se(u64 now, struct sched_entity *se);
+ int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se);
+ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq);
+ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
+ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
++#endif
+
+-#ifdef CONFIG_SCHED_THERMAL_PRESSURE
++#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ int update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity);
+
+ static inline u64 thermal_load_avg(struct rq *rq)
+@@ -44,6 +46,7 @@ static inline u32 get_pelt_divider(struct sched_avg *avg)
+ return PELT_MIN_DIVIDER + avg->period_contrib;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void cfs_se_util_change(struct sched_avg *avg)
+ {
+ unsigned int enqueued;
+@@ -180,9 +183,11 @@ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
+ return rq_clock_pelt(rq_of(cfs_rq));
+ }
+ #endif
++#endif /* CONFIG_SCHED_ALT */
+
+ #else
+
++#ifndef CONFIG_SCHED_ALT
+ static inline int
+ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
+ {
+@@ -200,6 +205,7 @@ update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
+ {
+ return 0;
+ }
++#endif
+
+ static inline int
+ update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity)
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index ec7b3e0a2b20..3b4052dd7bee 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -5,6 +5,10 @@
+ #ifndef _KERNEL_SCHED_SCHED_H
+ #define _KERNEL_SCHED_SCHED_H
+
++#ifdef CONFIG_SCHED_ALT
++#include "alt_sched.h"
++#else
++
+ #include <linux/sched/affinity.h>
+ #include <linux/sched/autogroup.h>
+ #include <linux/sched/cpufreq.h>
+@@ -3487,4 +3491,9 @@ static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
+ static inline void init_sched_mm_cid(struct task_struct *t) { }
+ #endif
+
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (task_nice(p) > 0);
++}
++#endif /* !CONFIG_SCHED_ALT */
+ #endif /* _KERNEL_SCHED_SCHED_H */
+diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
+index 857f837f52cb..5486c63e4790 100644
+--- a/kernel/sched/stats.c
++++ b/kernel/sched/stats.c
+@@ -125,8 +125,10 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ } else {
+ struct rq *rq;
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ struct sched_domain *sd;
+ int dcount = 0;
++#endif
+ #endif
+ cpu = (unsigned long)(v - 2);
+ rq = cpu_rq(cpu);
+@@ -143,6 +145,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ seq_printf(seq, "\n");
+
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ /* domain-specific stats */
+ rcu_read_lock();
+ for_each_domain(cpu, sd) {
+@@ -171,6 +174,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ sd->ttwu_move_balance);
+ }
+ rcu_read_unlock();
++#endif
+ #endif
+ }
+ return 0;
+diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
+index 38f3698f5e5b..b9d597394316 100644
+--- a/kernel/sched/stats.h
++++ b/kernel/sched/stats.h
+@@ -89,6 +89,7 @@ static inline void rq_sched_info_depart (struct rq *rq, unsigned long long delt
+
+ #endif /* CONFIG_SCHEDSTATS */
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+ struct sched_entity_stats {
+ struct sched_entity se;
+@@ -105,6 +106,7 @@ __schedstats_from_se(struct sched_entity *se)
+ #endif
+ return &task_of(se)->stats;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_PSI
+ void psi_task_change(struct task_struct *task, int clear, int set);
+diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
+index 6682535e37c8..144875e2728d 100644
+--- a/kernel/sched/topology.c
++++ b/kernel/sched/topology.c
+@@ -3,6 +3,7 @@
+ * Scheduler topology setup/handling methods
+ */
+
++#ifndef CONFIG_SCHED_ALT
+ #include <linux/bsearch.h>
+
+ DEFINE_MUTEX(sched_domains_mutex);
+@@ -1415,8 +1416,10 @@ static void asym_cpu_capacity_scan(void)
+ */
+
+ static int default_relax_domain_level = -1;
++#endif /* CONFIG_SCHED_ALT */
+ int sched_domain_level_max;
+
++#ifndef CONFIG_SCHED_ALT
+ static int __init setup_relax_domain_level(char *str)
+ {
+ if (kstrtoint(str, 0, &default_relax_domain_level))
+@@ -1649,6 +1652,7 @@ sd_init(struct sched_domain_topology_level *tl,
+
+ return sd;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ /*
+ * Topology list, bottom-up.
+@@ -1685,6 +1689,7 @@ void set_sched_topology(struct sched_domain_topology_level *tl)
+ sched_domain_topology_saved = NULL;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_NUMA
+
+ static const struct cpumask *sd_numa_mask(int cpu)
+@@ -2740,3 +2745,20 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
+ mutex_unlock(&sched_domains_mutex);
+ }
++#else /* CONFIG_SCHED_ALT */
++void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
++ struct sched_domain_attr *dattr_new)
++{}
++
++#ifdef CONFIG_NUMA
++int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return best_mask_cpu(cpu, cpus);
++}
++
++int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
++{
++ return cpumask_nth(cpu, cpus);
++}
++#endif /* CONFIG_NUMA */
++#endif
+diff --git a/kernel/sysctl.c b/kernel/sysctl.c
+index bfe53e835524..943fa125064b 100644
+--- a/kernel/sysctl.c
++++ b/kernel/sysctl.c
+@@ -92,6 +92,10 @@ EXPORT_SYMBOL_GPL(sysctl_long_vals);
+
+ /* Constants used for minimum and maximum */
+
++#ifdef CONFIG_SCHED_ALT
++extern int sched_yield_type;
++#endif
++
+ #ifdef CONFIG_PERF_EVENTS
+ static const int six_hundred_forty_kb = 640 * 1024;
+ #endif
+@@ -1917,6 +1921,17 @@ static struct ctl_table kern_table[] = {
+ .proc_handler = proc_dointvec,
+ },
+ #endif
++#ifdef CONFIG_SCHED_ALT
++ {
++ .procname = "yield_type",
++ .data = &sched_yield_type,
++ .maxlen = sizeof (int),
++ .mode = 0644,
++ .proc_handler = &proc_dointvec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_TWO,
++ },
++#endif
+ #if defined(CONFIG_S390) && defined(CONFIG_SMP)
+ {
+ .procname = "spin_retry",
+diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
+index e8c08292defc..3823ff0ddc0f 100644
+--- a/kernel/time/hrtimer.c
++++ b/kernel/time/hrtimer.c
+@@ -2088,8 +2088,10 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum hrtimer_mode mode,
+ int ret = 0;
+ u64 slack;
+
++#ifndef CONFIG_SCHED_ALT
+ slack = current->timer_slack_ns;
+- if (rt_task(current))
++ if (dl_task(current) || rt_task(current))
++#endif
+ slack = 0;
+
+ hrtimer_init_sleeper_on_stack(&t, clockid, mode);
+diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
+index e9c6f9d0e42c..43ee0a94abdd 100644
+--- a/kernel/time/posix-cpu-timers.c
++++ b/kernel/time/posix-cpu-timers.c
+@@ -223,7 +223,7 @@ static void task_sample_cputime(struct task_struct *p, u64 *samples)
+ u64 stime, utime;
+
+ task_cputime(p, &utime, &stime);
+- store_samples(samples, stime, utime, p->se.sum_exec_runtime);
++ store_samples(samples, stime, utime, tsk_seruntime(p));
+ }
+
+ static void proc_sample_cputime_atomic(struct task_cputime_atomic *at,
+@@ -867,6 +867,7 @@ static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples,
+ }
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void check_dl_overrun(struct task_struct *tsk)
+ {
+ if (tsk->dl.dl_overrun) {
+@@ -874,6 +875,7 @@ static inline void check_dl_overrun(struct task_struct *tsk)
+ send_signal_locked(SIGXCPU, SEND_SIG_PRIV, tsk, PIDTYPE_TGID);
+ }
+ }
++#endif
+
+ static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard)
+ {
+@@ -901,8 +903,10 @@ static void check_thread_timers(struct task_struct *tsk,
+ u64 samples[CPUCLOCK_MAX];
+ unsigned long soft;
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk))
+ check_dl_overrun(tsk);
++#endif
+
+ if (expiry_cache_is_inactive(pct))
+ return;
+@@ -916,7 +920,7 @@ static void check_thread_timers(struct task_struct *tsk,
+ soft = task_rlimit(tsk, RLIMIT_RTTIME);
+ if (soft != RLIM_INFINITY) {
+ /* Task RT timeout is accounted in jiffies. RTTIME is usec */
+- unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ);
++ unsigned long rttime = tsk_rttimeout(tsk) * (USEC_PER_SEC / HZ);
+ unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME);
+
+ /* At the hard limit, send SIGKILL. No further action. */
+@@ -1152,8 +1156,10 @@ static inline bool fastpath_timer_check(struct task_struct *tsk)
+ return true;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk) && tsk->dl.dl_overrun)
+ return true;
++#endif
+
+ return false;
+ }
+diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
+index 529590499b1f..d04bb99b4f0e 100644
+--- a/kernel/trace/trace_selftest.c
++++ b/kernel/trace/trace_selftest.c
+@@ -1155,10 +1155,15 @@ static int trace_wakeup_test_thread(void *data)
+ {
+ /* Make this a -deadline thread */
+ static const struct sched_attr attr = {
++#ifdef CONFIG_SCHED_ALT
++ /* No deadline on BMQ/PDS, use RR */
++ .sched_policy = SCHED_RR,
++#else
+ .sched_policy = SCHED_DEADLINE,
+ .sched_runtime = 100000ULL,
+ .sched_deadline = 10000000ULL,
+ .sched_period = 10000000ULL
++#endif
+ };
+ struct wakeup_test_data *x = data;
+
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
new file mode 100644
index 00000000..6dc48eec
--- /dev/null
+++ b/5021_BMQ-and-PDS-gentoo-defaults.patch
@@ -0,0 +1,13 @@
+--- a/init/Kconfig 2023-02-13 08:16:09.534315265 -0500
++++ b/init/Kconfig 2023-02-13 08:17:24.130237204 -0500
+@@ -867,8 +867,9 @@ config UCLAMP_BUCKETS_COUNT
+ If in doubt, use the default value.
+
+ menuconfig SCHED_ALT
++ depends on X86_64
+ bool "Alternative CPU Schedulers"
+- default y
++ default n
+ help
+ This feature enable alternative CPU scheduler"
+
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-02 10:35 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-08-02 10:35 UTC (permalink / raw
To: gentoo-commits
commit: 266b5aee1131c032a89ed54563730a0a2bcb93ae
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Aug 2 10:35:07 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Aug 2 10:35:07 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=266b5aee
Remove BMQ temporarily as there is an issue with the patch
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 8 -
5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch | 11163 --------------------------
5021_BMQ-and-PDS-gentoo-defaults.patch | 13 -
3 files changed, 11184 deletions(-)
diff --git a/0000_README b/0000_README
index e76d9abd..58f42d41 100644
--- a/0000_README
+++ b/0000_README
@@ -114,11 +114,3 @@ Desc: Add Gentoo Linux support config settings and defaults.
Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: Kernel >= 5.15 patch enables gcc = v11.1+ optimizations for additional CPUs.
-
-Patch: 5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch
-From: https://github.com/Frogging-Family/linux-tkg https://gitlab.com/alfredchen/projectc
-Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
-
-Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
-From: https://gitweb.gentoo.org/proj/linux-patches.git/
-Desc: Set defaults for BMQ. Add archs as people test, default to N
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch b/5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch
deleted file mode 100644
index 3061b321..00000000
--- a/5020_BMQ-and-PDS-io-scheduler-v6.4-r0.patch
+++ /dev/null
@@ -1,11163 +0,0 @@
-diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
-index 9e5bab29685f..b942b7dd8c42 100644
---- a/Documentation/admin-guide/kernel-parameters.txt
-+++ b/Documentation/admin-guide/kernel-parameters.txt
-@@ -5496,6 +5496,12 @@
- sa1100ir [NET]
- See drivers/net/irda/sa1100_ir.c.
-
-+ sched_timeslice=
-+ [KNL] Time slice in ms for Project C BMQ/PDS scheduler.
-+ Format: integer 2, 4
-+ Default: 4
-+ See Documentation/scheduler/sched-BMQ.txt
-+
- sched_verbose [KNL] Enables verbose scheduler debug messages.
-
- schedstats= [KNL,X86] Enable or disable scheduled statistics.
-diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
-index d85d90f5d000..f730195a3adb 100644
---- a/Documentation/admin-guide/sysctl/kernel.rst
-+++ b/Documentation/admin-guide/sysctl/kernel.rst
-@@ -1616,3 +1616,13 @@ is 10 seconds.
-
- The softlockup threshold is (``2 * watchdog_thresh``). Setting this
- tunable to zero will disable lockup detection altogether.
-+
-+yield_type:
-+===========
-+
-+BMQ/PDS CPU scheduler only. This determines what type of yield a call
-+to sched_yield() will perform.
-+
-+ 0 - No yield.
-+ 1 - Deboost and requeue task. (default)
-+ 2 - Set run queue skip task.
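A minimal userspace sketch of setting this knob, assuming the sysctl is exposed through procfs as /proc/sys/kernel/yield_type (matching the "yield_type" entry the patch adds to kern_table); the error-message text is illustrative only:

#include <stdio.h>

int main(void)
{
	/* Switch sched_yield() behaviour to "deboost and requeue task" (1). */
	FILE *f = fopen("/proc/sys/kernel/yield_type", "w");

	if (!f) {
		perror("yield_type (kernel built without SCHED_ALT?)");
		return 1;
	}
	fprintf(f, "1\n");
	return fclose(f) ? 1 : 0;
}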
-diff --git a/Documentation/scheduler/sched-BMQ.txt b/Documentation/scheduler/sched-BMQ.txt
-new file mode 100644
-index 000000000000..05c84eec0f31
---- /dev/null
-+++ b/Documentation/scheduler/sched-BMQ.txt
-@@ -0,0 +1,110 @@
-+ BitMap queue CPU Scheduler
-+ --------------------------
-+
-+CONTENT
-+========
-+
-+ Background
-+ Design
-+ Overview
-+ Task policy
-+ Priority management
-+ BitMap Queue
-+ CPU Assignment and Migration
-+
-+
-+Background
-+==========
-+
-+BitMap Queue CPU scheduler, referred to as BMQ from here on, is an evolution
-+of the previous Priority and Deadline based Skiplist multiple queue scheduler
-+(PDS), and is inspired by the Zircon scheduler. Its goal is to keep the
-+scheduler code simple, while staying efficient and scalable for interactive
-+workloads such as desktop use, movie playback and gaming.
-+
-+Design
-+======
-+
-+Overview
-+--------
-+
-+BMQ uses a per-CPU run queue design: each (logical) CPU has its own run queue
-+and is responsible for scheduling the tasks that are put into that run
-+queue.
-+
-+The run queue is a set of priority queues. Note that, in terms of data
-+structure, these queues are FIFO queues for non-rt tasks and priority queues
-+for rt tasks; see BitMap Queue below for details. BMQ is optimized for non-rt
-+tasks, since most applications are non-rt tasks. Whether a queue is FIFO or
-+priority based, each queue is an ordered list of runnable tasks awaiting
-+execution and the data structures are the same. When it is time for a new task
-+to run, the scheduler simply looks at the lowest numbered queue that contains a
-+task and runs the first task from the head of that queue. The per-CPU idle task
-+is also kept in the run queue, so the scheduler can always find a task to run
-+from its run queue.
-+
-+Each task is assigned the same timeslice (default 4ms) when it is picked to
-+start running. A task is reinserted at the end of the appropriate priority
-+queue when it uses up its whole timeslice. When the scheduler selects a new
-+task from the priority queue it sets the CPU's preemption timer for the
-+remainder of the previous timeslice. When that timer fires, the scheduler stops
-+executing that task, selects another task and starts over again.
-+
-+If a task blocks waiting for a shared resource then it's taken out of its
-+priority queue and is placed in a wait queue for the shared resource. When it
-+is unblocked it will be reinserted in the appropriate priority queue of an
-+eligible CPU.
-+
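A standalone toy sketch of the "lowest numbered queue that contains a task" lookup described above; the struct name, the fixed 64 levels and the use of __builtin_ctzll() are stand-ins for the real bitmap plus per-level list heads that BMQ maintains:

#include <stdio.h>
#include <stdint.h>

#define NR_LEVELS 64

struct toy_rq {
	uint64_t bitmap;              /* bit i set => priority level i is non-empty */
	int      nr_tasks[NR_LEVELS]; /* stand-in for the per-level task lists */
};

static int pick_next_level(const struct toy_rq *rq)
{
	if (!rq->bitmap)
		return -1;  /* cannot happen in BMQ: the idle task is always queued */
	return __builtin_ctzll(rq->bitmap);  /* lowest set bit = next level to run */
}

int main(void)
{
	struct toy_rq rq = { 0 };

	rq.nr_tasks[30] = 3; rq.bitmap |= 1ULL << 30;
	rq.nr_tasks[12] = 1; rq.bitmap |= 1ULL << 12;

	printf("next level to run: %d\n", pick_next_level(&rq));  /* prints 12 */
	return 0;
}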
-+Task policy
-+-----------
-+
-+BMQ supports the DEADLINE, FIFO, RR, NORMAL, BATCH and IDLE task policies,
-+like the mainline CFS scheduler, but it is heavily optimized for non-rt tasks,
-+that is, NORMAL/BATCH/IDLE policy tasks. Below are the implementation details
-+of each policy.
-+
-+DEADLINE
-+ It is squashed as priority 0 FIFO task.
-+
-+FIFO/RR
-+ All RT tasks share one single priority queue in the BMQ run queue design. The
-+complexity of the insert operation is O(n). BMQ is not designed for systems
-+that run mostly rt policy tasks.
-+
-+NORMAL/BATCH/IDLE
-+ BATCH and IDLE tasks are treated as the same policy. They compete for CPU
-+with NORMAL policy tasks, but they just don't get boosted. To control the
-+priority of NORMAL/BATCH/IDLE tasks, simply use the nice level.
-+
-+ISO
-+ ISO policy is not supported in BMQ. Please use nice level -20 NORMAL policy
-+task instead.
-+
-+Priority management
-+-------------------
-+
-+RT tasks have priorities from 0-99. For non-rt tasks, there are three
-+different factors used to determine the effective priority of a task; the
-+effective priority is what determines which queue the task will be in.
-+
-+The first factor is simply the task's static priority, which is assigned from
-+the task's nice level: [-20, 19] from userland's point of view and [0, 39]
-+internally.
-+
-+The second factor is the priority boost. This is a value bounded to the range
-+[-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ], used to offset the base priority; it is
-+modified in the following cases:
-+
-+* When a thread has used up its entire timeslice, always deboost it by
-+increasing its boost value by one.
-+* When a thread gives up cpu control (voluntarily or not) to reschedule, and
-+its switch-in time (time since it last switched in and ran) is below the
-+threshold based on its priority boost, boost it by decreasing its boost value
-+by one, capped at 0 (it won't go negative).
-+
-+The intent in this system is to ensure that interactive threads are serviced
-+quickly. These are usually the threads that interact directly with the user
-+and cause user-perceivable latency. These threads usually do little work and
-+spend most of their time blocked awaiting another user event. So they get the
-+priority boost from unblocking while background threads that do most of the
-+processing receive the priority penalty for using their entire timeslice.
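A rough standalone sketch of the effective-priority idea above: static priority derived from the nice level plus a bounded boost. The MAX_PRIORITY_ADJ bound and the nice-to-[0, 39] mapping come from the text; the clamping of the final value is a simplification, not the exact kernel arithmetic:

#include <stdio.h>

#define MAX_PRIORITY_ADJ 7  /* BMQ bound on the boost value */

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

static int effective_prio(int nice, int boost)
{
	int static_prio = nice + 20;  /* nice [-20, 19] -> static prio [0, 39] */

	boost = clamp_int(boost, -MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ);
	return clamp_int(static_prio + boost, 0, 39);
}

int main(void)
{
	/* A boosted interactive task sorts ahead of a deboosted CPU hog. */
	printf("nice 0, boost -2 -> prio %d\n", effective_prio(0, -2));  /* 18 */
	printf("nice 0, boost +3 -> prio %d\n", effective_prio(0, +3));  /* 23 */
	return 0;
}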
-diff --git a/fs/proc/base.c b/fs/proc/base.c
-index 05452c3b9872..fa1ceb85ad24 100644
---- a/fs/proc/base.c
-+++ b/fs/proc/base.c
-@@ -480,7 +480,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
- seq_puts(m, "0 0 0\n");
- else
- seq_printf(m, "%llu %llu %lu\n",
-- (unsigned long long)task->se.sum_exec_runtime,
-+ (unsigned long long)tsk_seruntime(task),
- (unsigned long long)task->sched_info.run_delay,
- task->sched_info.pcount);
-
-diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
-index 8874f681b056..59eb72bf7d5f 100644
---- a/include/asm-generic/resource.h
-+++ b/include/asm-generic/resource.h
-@@ -23,7 +23,7 @@
- [RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
- [RLIMIT_SIGPENDING] = { 0, 0 }, \
- [RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
-- [RLIMIT_NICE] = { 0, 0 }, \
-+ [RLIMIT_NICE] = { 30, 30 }, \
- [RLIMIT_RTPRIO] = { 0, 0 }, \
- [RLIMIT_RTTIME] = { RLIM_INFINITY, RLIM_INFINITY }, \
- }
-diff --git a/include/linux/sched.h b/include/linux/sched.h
-index eed5d65b8d1f..cdfd9263ddd6 100644
---- a/include/linux/sched.h
-+++ b/include/linux/sched.h
-@@ -764,8 +764,14 @@ struct task_struct {
- unsigned int ptrace;
-
- #ifdef CONFIG_SMP
-- int on_cpu;
- struct __call_single_node wake_entry;
-+#endif
-+#if defined(CONFIG_SMP) || defined(CONFIG_SCHED_ALT)
-+ int on_cpu;
-+#endif
-+
-+#ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- unsigned int wakee_flips;
- unsigned long wakee_flip_decay_ts;
- struct task_struct *last_wakee;
-@@ -779,6 +785,7 @@ struct task_struct {
- */
- int recent_used_cpu;
- int wake_cpu;
-+#endif /* !CONFIG_SCHED_ALT */
- #endif
- int on_rq;
-
-@@ -787,6 +794,20 @@ struct task_struct {
- int normal_prio;
- unsigned int rt_priority;
-
-+#ifdef CONFIG_SCHED_ALT
-+ u64 last_ran;
-+ s64 time_slice;
-+ int sq_idx;
-+ struct list_head sq_node;
-+#ifdef CONFIG_SCHED_BMQ
-+ int boost_prio;
-+#endif /* CONFIG_SCHED_BMQ */
-+#ifdef CONFIG_SCHED_PDS
-+ u64 deadline;
-+#endif /* CONFIG_SCHED_PDS */
-+ /* sched_clock time spent running */
-+ u64 sched_time;
-+#else /* !CONFIG_SCHED_ALT */
- struct sched_entity se;
- struct sched_rt_entity rt;
- struct sched_dl_entity dl;
-@@ -797,6 +818,7 @@ struct task_struct {
- unsigned long core_cookie;
- unsigned int core_occupation;
- #endif
-+#endif /* !CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_CGROUP_SCHED
- struct task_group *sched_task_group;
-@@ -1551,6 +1573,15 @@ struct task_struct {
- */
- };
-
-+#ifdef CONFIG_SCHED_ALT
-+#define tsk_seruntime(t) ((t)->sched_time)
-+/* replace the uncertain rt_timeout with 0UL */
-+#define tsk_rttimeout(t) (0UL)
-+#else /* CFS */
-+#define tsk_seruntime(t) ((t)->se.sum_exec_runtime)
-+#define tsk_rttimeout(t) ((t)->rt.timeout)
-+#endif /* !CONFIG_SCHED_ALT */
-+
- static inline struct pid *task_pid(struct task_struct *task)
- {
- return task->thread_pid;
-diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
-index 7c83d4d5a971..fa30f98cb2be 100644
---- a/include/linux/sched/deadline.h
-+++ b/include/linux/sched/deadline.h
-@@ -1,5 +1,24 @@
- /* SPDX-License-Identifier: GPL-2.0 */
-
-+#ifdef CONFIG_SCHED_ALT
-+
-+static inline int dl_task(struct task_struct *p)
-+{
-+ return 0;
-+}
-+
-+#ifdef CONFIG_SCHED_BMQ
-+#define __tsk_deadline(p) (0UL)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+#define __tsk_deadline(p) ((((u64) ((p)->prio))<<56) | (p)->deadline)
-+#endif
-+
-+#else
-+
-+#define __tsk_deadline(p) ((p)->dl.deadline)
-+
- /*
- * SCHED_DEADLINE tasks has negative priorities, reflecting
- * the fact that any of them has higher prio than RT and
-@@ -21,6 +40,7 @@ static inline int dl_task(struct task_struct *p)
- {
- return dl_prio(p->prio);
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- static inline bool dl_time_before(u64 a, u64 b)
- {
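A small standalone demo of the PDS __tsk_deadline() packing in the hunk above: the prio goes into the top 8 bits and the virtual deadline into the low 56 bits, so a single u64 comparison orders first by prio and then by deadline (assuming, as the macro does, that the deadline fits in 56 bits):

#include <stdio.h>
#include <stdint.h>

static uint64_t pack_deadline(unsigned int prio, uint64_t deadline)
{
	return ((uint64_t)prio << 56) | deadline;  /* mirrors the PDS __tsk_deadline() */
}

int main(void)
{
	uint64_t a = pack_deadline(120, 5000);  /* better prio ...                          */
	uint64_t b = pack_deadline(121, 1000);  /* ... beats a worse prio, earlier deadline */
	uint64_t c = pack_deadline(120, 4000);  /* same prio: earlier deadline wins         */

	printf("a < b: %d, c < a: %d\n", a < b, c < a);  /* prints "a < b: 1, c < a: 1" */
	return 0;
}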
-diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
-index ab83d85e1183..6af9ae681116 100644
---- a/include/linux/sched/prio.h
-+++ b/include/linux/sched/prio.h
-@@ -18,6 +18,32 @@
- #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
- #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
-
-+#ifdef CONFIG_SCHED_ALT
-+
-+/* Undefine MAX_PRIO and DEFAULT_PRIO */
-+#undef MAX_PRIO
-+#undef DEFAULT_PRIO
-+
-+/* +/- priority levels from the base priority */
-+#ifdef CONFIG_SCHED_BMQ
-+#define MAX_PRIORITY_ADJ (7)
-+
-+#define MIN_NORMAL_PRIO (MAX_RT_PRIO)
-+#define MAX_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH)
-+#define DEFAULT_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH / 2)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+#define MAX_PRIORITY_ADJ (0)
-+
-+#define MIN_NORMAL_PRIO (128)
-+#define NORMAL_PRIO_NUM (64)
-+#define MAX_PRIO (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
-+#define DEFAULT_PRIO (MAX_PRIO - NICE_WIDTH / 2)
-+#endif
-+
-+#endif /* CONFIG_SCHED_ALT */
-+
- /*
- * Convert user-nice values [ -20 ... 0 ... 19 ]
- * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
-diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
-index 994c25640e15..8c050a59ece1 100644
---- a/include/linux/sched/rt.h
-+++ b/include/linux/sched/rt.h
-@@ -24,8 +24,10 @@ static inline bool task_is_realtime(struct task_struct *tsk)
-
- if (policy == SCHED_FIFO || policy == SCHED_RR)
- return true;
-+#ifndef CONFIG_SCHED_ALT
- if (policy == SCHED_DEADLINE)
- return true;
-+#endif
- return false;
- }
-
-diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
-index 816df6cc444e..c8da08e18c91 100644
---- a/include/linux/sched/topology.h
-+++ b/include/linux/sched/topology.h
-@@ -234,7 +234,8 @@ static inline bool cpus_share_cache(int this_cpu, int that_cpu)
-
- #endif /* !CONFIG_SMP */
-
--#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
-+#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) && \
-+ !defined(CONFIG_SCHED_ALT)
- extern void rebuild_sched_domains_energy(void);
- #else
- static inline void rebuild_sched_domains_energy(void)
-diff --git a/init/Kconfig b/init/Kconfig
-index 32c24950c4ce..cf951b739454 100644
---- a/init/Kconfig
-+++ b/init/Kconfig
-@@ -629,6 +629,7 @@ config TASK_IO_ACCOUNTING
-
- config PSI
- bool "Pressure stall information tracking"
-+ depends on !SCHED_ALT
- help
- Collect metrics that indicate how overcommitted the CPU, memory,
- and IO capacity are in the system.
-@@ -793,6 +794,7 @@ menu "Scheduler features"
- config UCLAMP_TASK
- bool "Enable utilization clamping for RT/FAIR tasks"
- depends on CPU_FREQ_GOV_SCHEDUTIL
-+ depends on !SCHED_ALT
- help
- This feature enables the scheduler to track the clamped utilization
- of each CPU based on RUNNABLE tasks scheduled on that CPU.
-@@ -839,6 +841,35 @@ config UCLAMP_BUCKETS_COUNT
-
- If in doubt, use the default value.
-
-+menuconfig SCHED_ALT
-+ bool "Alternative CPU Schedulers"
-+ default y
-+ help
-+ This feature enable alternative CPU scheduler"
-+
-+if SCHED_ALT
-+
-+choice
-+ prompt "Alternative CPU Scheduler"
-+ default SCHED_BMQ
-+
-+config SCHED_BMQ
-+ bool "BMQ CPU scheduler"
-+ help
-+ The BitMap Queue CPU scheduler for excellent interactivity and
-+ responsiveness on the desktop and solid scalability on normal
-+ hardware and commodity servers.
-+
-+config SCHED_PDS
-+ bool "PDS CPU scheduler"
-+ help
-+ The Priority and Deadline based Skip list multiple queue CPU
-+ Scheduler.
-+
-+endchoice
-+
-+endif
-+
- endmenu
-
- #
-@@ -892,6 +923,7 @@ config NUMA_BALANCING
- depends on ARCH_SUPPORTS_NUMA_BALANCING
- depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
- depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
-+ depends on !SCHED_ALT
- help
- This option adds support for automatic NUMA aware memory/task placement.
- The mechanism is quite primitive and is based on migrating memory when
-@@ -989,6 +1021,7 @@ config FAIR_GROUP_SCHED
- depends on CGROUP_SCHED
- default CGROUP_SCHED
-
-+if !SCHED_ALT
- config CFS_BANDWIDTH
- bool "CPU bandwidth provisioning for FAIR_GROUP_SCHED"
- depends on FAIR_GROUP_SCHED
-@@ -1011,6 +1044,7 @@ config RT_GROUP_SCHED
- realtime bandwidth for them.
- See Documentation/scheduler/sched-rt-group.rst for more information.
-
-+endif #!SCHED_ALT
- endif #CGROUP_SCHED
-
- config SCHED_MM_CID
-@@ -1259,6 +1293,7 @@ config CHECKPOINT_RESTORE
-
- config SCHED_AUTOGROUP
- bool "Automatic process group scheduling"
-+ depends on !SCHED_ALT
- select CGROUPS
- select CGROUP_SCHED
- select FAIR_GROUP_SCHED
-diff --git a/init/init_task.c b/init/init_task.c
-index ff6c4b9bfe6b..19e9c662d1a1 100644
---- a/init/init_task.c
-+++ b/init/init_task.c
-@@ -75,9 +75,15 @@ struct task_struct init_task
- .stack = init_stack,
- .usage = REFCOUNT_INIT(2),
- .flags = PF_KTHREAD,
-+#ifdef CONFIG_SCHED_ALT
-+ .prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
-+ .static_prio = DEFAULT_PRIO,
-+ .normal_prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
-+#else
- .prio = MAX_PRIO - 20,
- .static_prio = MAX_PRIO - 20,
- .normal_prio = MAX_PRIO - 20,
-+#endif
- .policy = SCHED_NORMAL,
- .cpus_ptr = &init_task.cpus_mask,
- .user_cpus_ptr = NULL,
-@@ -88,6 +94,17 @@ struct task_struct init_task
- .restart_block = {
- .fn = do_no_restart_syscall,
- },
-+#ifdef CONFIG_SCHED_ALT
-+ .sq_node = LIST_HEAD_INIT(init_task.sq_node),
-+#ifdef CONFIG_SCHED_BMQ
-+ .boost_prio = 0,
-+ .sq_idx = 15,
-+#endif
-+#ifdef CONFIG_SCHED_PDS
-+ .deadline = 0,
-+#endif
-+ .time_slice = HZ,
-+#else
- .se = {
- .group_node = LIST_HEAD_INIT(init_task.se.group_node),
- },
-@@ -95,6 +112,7 @@ struct task_struct init_task
- .run_list = LIST_HEAD_INIT(init_task.rt.run_list),
- .time_slice = RR_TIMESLICE,
- },
-+#endif
- .tasks = LIST_HEAD_INIT(init_task.tasks),
- #ifdef CONFIG_SMP
- .pushable_tasks = PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
-diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
-index c2f1fd95a821..41654679b1b2 100644
---- a/kernel/Kconfig.preempt
-+++ b/kernel/Kconfig.preempt
-@@ -117,7 +117,7 @@ config PREEMPT_DYNAMIC
-
- config SCHED_CORE
- bool "Core Scheduling for SMT"
-- depends on SCHED_SMT
-+ depends on SCHED_SMT && !SCHED_ALT
- help
- This option permits Core Scheduling, a means of coordinated task
- selection across SMT siblings. When enabled -- see
-diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
-index e4ca2dd2b764..82786dbb220c 100644
---- a/kernel/cgroup/cpuset.c
-+++ b/kernel/cgroup/cpuset.c
-@@ -791,7 +791,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
- return ret;
- }
-
--#ifdef CONFIG_SMP
-+#if defined(CONFIG_SMP) && !defined(CONFIG_SCHED_ALT)
- /*
- * Helper routine for generate_sched_domains().
- * Do cpusets a, b have overlapping effective cpus_allowed masks?
-@@ -1187,7 +1187,7 @@ static void rebuild_sched_domains_locked(void)
- /* Have scheduler rebuild the domains */
- partition_and_rebuild_sched_domains(ndoms, doms, attr);
- }
--#else /* !CONFIG_SMP */
-+#else /* !CONFIG_SMP || CONFIG_SCHED_ALT */
- static void rebuild_sched_domains_locked(void)
- {
- }
-diff --git a/kernel/delayacct.c b/kernel/delayacct.c
-index 6f0c358e73d8..8111481ce8b1 100644
---- a/kernel/delayacct.c
-+++ b/kernel/delayacct.c
-@@ -150,7 +150,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
- */
- t1 = tsk->sched_info.pcount;
- t2 = tsk->sched_info.run_delay;
-- t3 = tsk->se.sum_exec_runtime;
-+ t3 = tsk_seruntime(tsk);
-
- d->cpu_count += t1;
-
-diff --git a/kernel/exit.c b/kernel/exit.c
-index edb50b4c9972..09e72bba7cc2 100644
---- a/kernel/exit.c
-+++ b/kernel/exit.c
-@@ -173,7 +173,7 @@ static void __exit_signal(struct task_struct *tsk)
- sig->curr_target = next_thread(tsk);
- }
-
-- add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
-+ add_device_randomness((const void*) &tsk_seruntime(tsk),
- sizeof(unsigned long long));
-
- /*
-@@ -194,7 +194,7 @@ static void __exit_signal(struct task_struct *tsk)
- sig->inblock += task_io_get_inblock(tsk);
- sig->oublock += task_io_get_oublock(tsk);
- task_io_accounting_add(&sig->ioac, &tsk->ioac);
-- sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
-+ sig->sum_sched_runtime += tsk_seruntime(tsk);
- sig->nr_threads--;
- __unhash_process(tsk, group_dead);
- write_sequnlock(&sig->stats_lock);
-diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
-index 728f434de2bb..0e1082a4e878 100644
---- a/kernel/locking/rtmutex.c
-+++ b/kernel/locking/rtmutex.c
-@@ -337,21 +337,25 @@ static __always_inline void
- waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
- {
- waiter->prio = __waiter_prio(task);
-- waiter->deadline = task->dl.deadline;
-+ waiter->deadline = __tsk_deadline(task);
- }
-
- /*
- * Only use with rt_mutex_waiter_{less,equal}()
- */
- #define task_to_waiter(p) \
-- &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
-+ &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = __tsk_deadline(p) }
-
- static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
- struct rt_mutex_waiter *right)
- {
-+#ifdef CONFIG_SCHED_PDS
-+ return (left->deadline < right->deadline);
-+#else
- if (left->prio < right->prio)
- return 1;
-
-+#ifndef CONFIG_SCHED_BMQ
- /*
- * If both waiters have dl_prio(), we check the deadlines of the
- * associated tasks.
-@@ -360,16 +364,22 @@ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
- */
- if (dl_prio(left->prio))
- return dl_time_before(left->deadline, right->deadline);
-+#endif
-
- return 0;
-+#endif
- }
-
- static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
- struct rt_mutex_waiter *right)
- {
-+#ifdef CONFIG_SCHED_PDS
-+ return (left->deadline == right->deadline);
-+#else
- if (left->prio != right->prio)
- return 0;
-
-+#ifndef CONFIG_SCHED_BMQ
- /*
- * If both waiters have dl_prio(), we check the deadlines of the
- * associated tasks.
-@@ -378,8 +388,10 @@ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
- */
- if (dl_prio(left->prio))
- return left->deadline == right->deadline;
-+#endif
-
- return 1;
-+#endif
- }
-
- static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
-diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
-index 976092b7bd45..31d587c16ec1 100644
---- a/kernel/sched/Makefile
-+++ b/kernel/sched/Makefile
-@@ -28,7 +28,12 @@ endif
- # These compilation units have roughly the same size and complexity - so their
- # build parallelizes well and finishes roughly at once:
- #
-+ifdef CONFIG_SCHED_ALT
-+obj-y += alt_core.o
-+obj-$(CONFIG_SCHED_DEBUG) += alt_debug.o
-+else
- obj-y += core.o
- obj-y += fair.o
-+endif
- obj-y += build_policy.o
- obj-y += build_utility.o
-diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
-new file mode 100644
-index 000000000000..3e8ddbd8001c
---- /dev/null
-+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,8729 @@
-+/*
-+ * kernel/sched/alt_core.c
-+ *
-+ * Core alternative kernel scheduler code and related syscalls
-+ *
-+ * Copyright (C) 1991-2002 Linus Torvalds
-+ *
-+ * 2009-08-13 Brainfuck deadline scheduling policy by Con Kolivas deletes
-+ * a whole lot of those previous things.
-+ * 2017-09-06 Priority and Deadline based Skip list multiple queue kernel
-+ * scheduler by Alfred Chen.
-+ * 2019-02-20 BMQ(BitMap Queue) kernel scheduler by Alfred Chen.
-+ */
-+#include <linux/sched/clock.h>
-+#include <linux/sched/cputime.h>
-+#include <linux/sched/debug.h>
-+#include <linux/sched/isolation.h>
-+#include <linux/sched/loadavg.h>
-+#include <linux/sched/mm.h>
-+#include <linux/sched/nohz.h>
-+#include <linux/sched/stat.h>
-+#include <linux/sched/wake_q.h>
-+
-+#include <linux/blkdev.h>
-+#include <linux/context_tracking.h>
-+#include <linux/cpuset.h>
-+#include <linux/delayacct.h>
-+#include <linux/init_task.h>
-+#include <linux/kcov.h>
-+#include <linux/kprobes.h>
-+#include <linux/nmi.h>
-+#include <linux/scs.h>
-+
-+#include <uapi/linux/sched/types.h>
-+
-+#include <asm/irq_regs.h>
-+#include <asm/switch_to.h>
-+
-+#define CREATE_TRACE_POINTS
-+#include <trace/events/sched.h>
-+#include <trace/events/ipi.h>
-+#undef CREATE_TRACE_POINTS
-+
-+#include "sched.h"
-+
-+#include "pelt.h"
-+
-+#include "../../io_uring/io-wq.h"
-+#include "../smpboot.h"
-+
-+EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpu);
-+EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpumask);
-+
-+/*
-+ * Export tracepoints that act as a bare tracehook (ie: have no trace event
-+ * associated with them) to allow external modules to probe them.
-+ */
-+EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+#define sched_feat(x) (1)
-+/*
-+ * Print a warning if need_resched is set for the given duration (if
-+ * LATENCY_WARN is enabled).
-+ *
-+ * If sysctl_resched_latency_warn_once is set, only one warning will be shown
-+ * per boot.
-+ */
-+__read_mostly int sysctl_resched_latency_warn_ms = 100;
-+__read_mostly int sysctl_resched_latency_warn_once = 1;
-+#else
-+#define sched_feat(x) (0)
-+#endif /* CONFIG_SCHED_DEBUG */
-+
-+#define ALT_SCHED_VERSION "v6.4-r0"
-+
-+/*
-+ * Compile time debug macro
-+ * #define ALT_SCHED_DEBUG
-+ */
-+
-+/* rt_prio(prio) defined in include/linux/sched/rt.h */
-+#define rt_task(p) rt_prio((p)->prio)
-+#define rt_policy(policy) ((policy) == SCHED_FIFO || (policy) == SCHED_RR)
-+#define task_has_rt_policy(p) (rt_policy((p)->policy))
-+
-+#define STOP_PRIO (MAX_RT_PRIO - 1)
-+
-+/* Default time slice is 4 ms; it can be set via the kernel parameter "sched_timeslice" */
-+u64 sched_timeslice_ns __read_mostly = (4 << 20);
-+
-+static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx);
-+
-+#ifdef CONFIG_SCHED_BMQ
-+#include "bmq.h"
-+#endif
-+#ifdef CONFIG_SCHED_PDS
-+#include "pds.h"
-+#endif
-+
-+struct affinity_context {
-+ const struct cpumask *new_mask;
-+ struct cpumask *user_mask;
-+ unsigned int flags;
-+};
-+
-+static int __init sched_timeslice(char *str)
-+{
-+ int timeslice_ms;
-+
-+	get_option(&str, &timeslice_ms);
-+ if (2 != timeslice_ms)
-+ timeslice_ms = 4;
-+ sched_timeslice_ns = timeslice_ms << 20;
-+ sched_timeslice_imp(timeslice_ms);
-+
-+ return 0;
-+}
-+early_param("sched_timeslice", sched_timeslice);
-+
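Note that the conversion above is a shift, not a multiply: each "ms" of sched_timeslice becomes 1 << 20 = 1,048,576 ns, a power-of-two approximation of a millisecond. A quick standalone check of the resulting slice lengths:

#include <stdio.h>

int main(void)
{
	for (int ms = 2; ms <= 4; ms += 2)
		printf("sched_timeslice=%d -> %lld ns (~%.2f real ms)\n",
		       ms, (long long)ms << 20, ((long long)ms << 20) / 1e6);
	return 0;
}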
-+/* Reschedule if less than this many μs left */
-+#define RESCHED_NS (100 << 10)
-+
-+/**
-+ * sched_yield_type - Choose what sort of yield sched_yield will perform.
-+ * 0: No yield.
-+ * 1: Deboost and requeue task. (default)
-+ * 2: Set rq skip task.
-+ */
-+int sched_yield_type __read_mostly = 1;
-+
-+#ifdef CONFIG_SMP
-+static cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
-+
-+DEFINE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
-+DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_llc_mask);
-+DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_topo_end_mask);
-+
-+#ifdef CONFIG_SCHED_SMT
-+DEFINE_STATIC_KEY_FALSE(sched_smt_present);
-+EXPORT_SYMBOL_GPL(sched_smt_present);
-+#endif
-+
-+/*
-+ * Keep a unique ID per domain (we use the first CPUs number in the cpumask of
-+ * the domain), this allows us to quickly tell if two cpus are in the same cache
-+ * domain, see cpus_share_cache().
-+ */
-+DEFINE_PER_CPU(int, sd_llc_id);
-+#endif /* CONFIG_SMP */
-+
-+static DEFINE_MUTEX(sched_hotcpu_mutex);
-+
-+DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
-+
-+#ifndef prepare_arch_switch
-+# define prepare_arch_switch(next) do { } while (0)
-+#endif
-+#ifndef finish_arch_post_lock_switch
-+# define finish_arch_post_lock_switch() do { } while (0)
-+#endif
-+
-+#ifdef CONFIG_SCHED_SMT
-+static cpumask_t sched_sg_idle_mask ____cacheline_aligned_in_smp;
-+#endif
-+static cpumask_t sched_preempt_mask[SCHED_QUEUE_BITS] ____cacheline_aligned_in_smp;
-+static cpumask_t *const sched_idle_mask = &sched_preempt_mask[0];
-+
-+/* task function */
-+static inline const struct cpumask *task_user_cpus(struct task_struct *p)
-+{
-+ if (!p->user_cpus_ptr)
-+ return cpu_possible_mask; /* &init_task.cpus_mask */
-+ return p->user_cpus_ptr;
-+}
-+
-+/* sched_queue related functions */
-+static inline void sched_queue_init(struct sched_queue *q)
-+{
-+ int i;
-+
-+ bitmap_zero(q->bitmap, SCHED_QUEUE_BITS);
-+ for(i = 0; i < SCHED_LEVELS; i++)
-+ INIT_LIST_HEAD(&q->heads[i]);
-+}
-+
-+/*
-+ * Init idle task and put into queue structure of rq
-+ * IMPORTANT: may be called multiple times for a single cpu
-+ */
-+static inline void sched_queue_init_idle(struct sched_queue *q,
-+ struct task_struct *idle)
-+{
-+ idle->sq_idx = IDLE_TASK_SCHED_PRIO;
-+ INIT_LIST_HEAD(&q->heads[idle->sq_idx]);
-+ list_add(&idle->sq_node, &q->heads[idle->sq_idx]);
-+}
-+
-+static inline void
-+clear_recorded_preempt_mask(int pr, int low, int high, int cpu)
-+{
-+ if (low < pr && pr <= high)
-+ cpumask_clear_cpu(cpu, sched_preempt_mask + SCHED_QUEUE_BITS - pr);
-+}
-+
-+static inline void
-+set_recorded_preempt_mask(int pr, int low, int high, int cpu)
-+{
-+ if (low < pr && pr <= high)
-+ cpumask_set_cpu(cpu, sched_preempt_mask + SCHED_QUEUE_BITS - pr);
-+}
-+
-+static atomic_t sched_prio_record = ATOMIC_INIT(0);
-+
-+/* water mark related functions */
-+static inline void update_sched_preempt_mask(struct rq *rq)
-+{
-+ unsigned long prio = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
-+ unsigned long last_prio = rq->prio;
-+ int cpu, pr;
-+
-+ if (prio == last_prio)
-+ return;
-+
-+ rq->prio = prio;
-+ cpu = cpu_of(rq);
-+ pr = atomic_read(&sched_prio_record);
-+
-+ if (prio < last_prio) {
-+ if (IDLE_TASK_SCHED_PRIO == last_prio) {
-+#ifdef CONFIG_SCHED_SMT
-+ if (static_branch_likely(&sched_smt_present))
-+ cpumask_andnot(&sched_sg_idle_mask,
-+ &sched_sg_idle_mask, cpu_smt_mask(cpu));
-+#endif
-+ cpumask_clear_cpu(cpu, sched_idle_mask);
-+ last_prio -= 2;
-+ }
-+ clear_recorded_preempt_mask(pr, prio, last_prio, cpu);
-+
-+ return;
-+ }
-+ /* last_prio < prio */
-+ if (IDLE_TASK_SCHED_PRIO == prio) {
-+#ifdef CONFIG_SCHED_SMT
-+ if (static_branch_likely(&sched_smt_present) &&
-+ cpumask_intersects(cpu_smt_mask(cpu), sched_idle_mask))
-+ cpumask_or(&sched_sg_idle_mask,
-+ &sched_sg_idle_mask, cpu_smt_mask(cpu));
-+#endif
-+ cpumask_set_cpu(cpu, sched_idle_mask);
-+ prio -= 2;
-+ }
-+ set_recorded_preempt_mask(pr, last_prio, prio, cpu);
-+}
-+
-+/*
-+ * This routine assumes that the idle task is always in the queue
-+ */
-+static inline struct task_struct *sched_rq_first_task(struct rq *rq)
-+{
-+ const struct list_head *head = &rq->queue.heads[sched_prio2idx(rq->prio, rq)];
-+
-+ return list_first_entry(head, struct task_struct, sq_node);
-+}
-+
-+static inline struct task_struct *
-+sched_rq_next_task(struct task_struct *p, struct rq *rq)
-+{
-+ unsigned long idx = p->sq_idx;
-+ struct list_head *head = &rq->queue.heads[idx];
-+
-+ if (list_is_last(&p->sq_node, head)) {
-+ idx = find_next_bit(rq->queue.bitmap, SCHED_QUEUE_BITS,
-+ sched_idx2prio(idx, rq) + 1);
-+ head = &rq->queue.heads[sched_prio2idx(idx, rq)];
-+
-+ return list_first_entry(head, struct task_struct, sq_node);
-+ }
-+
-+ return list_next_entry(p, sq_node);
-+}
-+
-+static inline struct task_struct *rq_runnable_task(struct rq *rq)
-+{
-+ struct task_struct *next = sched_rq_first_task(rq);
-+
-+ if (unlikely(next == rq->skip))
-+ next = sched_rq_next_task(next, rq);
-+
-+ return next;
-+}
-+
-+/*
-+ * Serialization rules:
-+ *
-+ * Lock order:
-+ *
-+ * p->pi_lock
-+ * rq->lock
-+ * hrtimer_cpu_base->lock (hrtimer_start() for bandwidth controls)
-+ *
-+ * rq1->lock
-+ * rq2->lock where: rq1 < rq2
-+ *
-+ * Regular state:
-+ *
-+ * Normal scheduling state is serialized by rq->lock. __schedule() takes the
-+ * local CPU's rq->lock, it optionally removes the task from the runqueue and
-+ * always looks at the local rq data structures to find the most eligible task
-+ * to run next.
-+ *
-+ * Task enqueue is also under rq->lock, possibly taken from another CPU.
-+ * Wakeups from another LLC domain might use an IPI to transfer the enqueue to
-+ * the local CPU to avoid bouncing the runqueue state around [ see
-+ * ttwu_queue_wakelist() ]
-+ *
-+ * Task wakeup, specifically wakeups that involve migration, are horribly
-+ * complicated to avoid having to take two rq->locks.
-+ *
-+ * Special state:
-+ *
-+ * System-calls and anything external will use task_rq_lock() which acquires
-+ * both p->pi_lock and rq->lock. As a consequence the state they change is
-+ * stable while holding either lock:
-+ *
-+ * - sched_setaffinity()/
-+ * set_cpus_allowed_ptr(): p->cpus_ptr, p->nr_cpus_allowed
-+ * - set_user_nice(): p->se.load, p->*prio
-+ * - __sched_setscheduler(): p->sched_class, p->policy, p->*prio,
-+ * p->se.load, p->rt_priority,
-+ * p->dl.dl_{runtime, deadline, period, flags, bw, density}
-+ * - sched_setnuma(): p->numa_preferred_nid
-+ * - sched_move_task(): p->sched_task_group
-+ * - uclamp_update_active() p->uclamp*
-+ *
-+ * p->state <- TASK_*:
-+ *
-+ * is changed locklessly using set_current_state(), __set_current_state() or
-+ * set_special_state(), see their respective comments, or by
-+ * try_to_wake_up(). This latter uses p->pi_lock to serialize against
-+ * concurrent self.
-+ *
-+ * p->on_rq <- { 0, 1 = TASK_ON_RQ_QUEUED, 2 = TASK_ON_RQ_MIGRATING }:
-+ *
-+ * is set by activate_task() and cleared by deactivate_task(), under
-+ * rq->lock. Non-zero indicates the task is runnable, the special
-+ * ON_RQ_MIGRATING state is used for migration without holding both
-+ * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock().
-+ *
-+ * p->on_cpu <- { 0, 1 }:
-+ *
-+ * is set by prepare_task() and cleared by finish_task() such that it will be
-+ * set before p is scheduled-in and cleared after p is scheduled-out, both
-+ * under rq->lock. Non-zero indicates the task is running on its CPU.
-+ *
-+ * [ The astute reader will observe that it is possible for two tasks on one
-+ * CPU to have ->on_cpu = 1 at the same time. ]
-+ *
-+ * task_cpu(p): is changed by set_task_cpu(), the rules are:
-+ *
-+ * - Don't call set_task_cpu() on a blocked task:
-+ *
-+ * We don't care what CPU we're not running on, this simplifies hotplug,
-+ * the CPU assignment of blocked tasks isn't required to be valid.
-+ *
-+ * - for try_to_wake_up(), called under p->pi_lock:
-+ *
-+ * This allows try_to_wake_up() to only take one rq->lock, see its comment.
-+ *
-+ * - for migration called under rq->lock:
-+ * [ see task_on_rq_migrating() in task_rq_lock() ]
-+ *
-+ * o move_queued_task()
-+ * o detach_task()
-+ *
-+ * - for migration called under double_rq_lock():
-+ *
-+ * o __migrate_swap_task()
-+ * o push_rt_task() / pull_rt_task()
-+ * o push_dl_task() / pull_dl_task()
-+ * o dl_task_offline_migration()
-+ *
-+ */
-+
-+/*
-+ * Context: p->pi_lock
-+ */
-+static inline struct rq
-+*__task_access_lock(struct task_struct *p, raw_spinlock_t **plock)
-+{
-+ struct rq *rq;
-+ for (;;) {
-+ rq = task_rq(p);
-+ if (p->on_cpu || task_on_rq_queued(p)) {
-+ raw_spin_lock(&rq->lock);
-+ if (likely((p->on_cpu || task_on_rq_queued(p))
-+ && rq == task_rq(p))) {
-+ *plock = &rq->lock;
-+ return rq;
-+ }
-+ raw_spin_unlock(&rq->lock);
-+ } else if (task_on_rq_migrating(p)) {
-+ do {
-+ cpu_relax();
-+ } while (unlikely(task_on_rq_migrating(p)));
-+ } else {
-+ *plock = NULL;
-+ return rq;
-+ }
-+ }
-+}
-+
-+static inline void
-+__task_access_unlock(struct task_struct *p, raw_spinlock_t *lock)
-+{
-+ if (NULL != lock)
-+ raw_spin_unlock(lock);
-+}
-+
-+static inline struct rq
-+*task_access_lock_irqsave(struct task_struct *p, raw_spinlock_t **plock,
-+ unsigned long *flags)
-+{
-+ struct rq *rq;
-+ for (;;) {
-+ rq = task_rq(p);
-+ if (p->on_cpu || task_on_rq_queued(p)) {
-+ raw_spin_lock_irqsave(&rq->lock, *flags);
-+ if (likely((p->on_cpu || task_on_rq_queued(p))
-+ && rq == task_rq(p))) {
-+ *plock = &rq->lock;
-+ return rq;
-+ }
-+ raw_spin_unlock_irqrestore(&rq->lock, *flags);
-+ } else if (task_on_rq_migrating(p)) {
-+ do {
-+ cpu_relax();
-+ } while (unlikely(task_on_rq_migrating(p)));
-+ } else {
-+ raw_spin_lock_irqsave(&p->pi_lock, *flags);
-+ if (likely(!p->on_cpu && !p->on_rq &&
-+ rq == task_rq(p))) {
-+ *plock = &p->pi_lock;
-+ return rq;
-+ }
-+ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
-+ }
-+ }
-+}
-+
-+static inline void
-+task_access_unlock_irqrestore(struct task_struct *p, raw_spinlock_t *lock,
-+ unsigned long *flags)
-+{
-+ raw_spin_unlock_irqrestore(lock, *flags);
-+}
-+
-+/*
-+ * __task_rq_lock - lock the rq @p resides on.
-+ */
-+struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ lockdep_assert_held(&p->pi_lock);
-+
-+ for (;;) {
-+ rq = task_rq(p);
-+ raw_spin_lock(&rq->lock);
-+ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
-+ return rq;
-+ raw_spin_unlock(&rq->lock);
-+
-+ while (unlikely(task_on_rq_migrating(p)))
-+ cpu_relax();
-+ }
-+}
-+
-+/*
-+ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
-+ */
-+struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(p->pi_lock)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ for (;;) {
-+ raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
-+ rq = task_rq(p);
-+ raw_spin_lock(&rq->lock);
-+ /*
-+ * move_queued_task() task_rq_lock()
-+ *
-+ * ACQUIRE (rq->lock)
-+ * [S] ->on_rq = MIGRATING [L] rq = task_rq()
-+ * WMB (__set_task_cpu()) ACQUIRE (rq->lock);
-+ * [S] ->cpu = new_cpu [L] task_rq()
-+ * [L] ->on_rq
-+ * RELEASE (rq->lock)
-+ *
-+ * If we observe the old CPU in task_rq_lock(), the acquire of
-+ * the old rq->lock will fully serialize against the stores.
-+ *
-+ * If we observe the new CPU in task_rq_lock(), the address
-+ * dependency headed by '[L] rq = task_rq()' and the acquire
-+ * will pair with the WMB to ensure we then also see migrating.
-+ */
-+ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
-+ return rq;
-+ }
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
-+
-+ while (unlikely(task_on_rq_migrating(p)))
-+ cpu_relax();
-+ }
-+}
-+
-+static inline void
-+rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ raw_spin_lock_irqsave(&rq->lock, rf->flags);
-+}
-+
-+static inline void
-+rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
-+}
-+
-+void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
-+{
-+ raw_spinlock_t *lock;
-+
-+ /* Matches synchronize_rcu() in __sched_core_enable() */
-+ preempt_disable();
-+
-+ for (;;) {
-+ lock = __rq_lockp(rq);
-+ raw_spin_lock_nested(lock, subclass);
-+ if (likely(lock == __rq_lockp(rq))) {
-+ /* preempt_count *MUST* be > 1 */
-+ preempt_enable_no_resched();
-+ return;
-+ }
-+ raw_spin_unlock(lock);
-+ }
-+}
-+
-+void raw_spin_rq_unlock(struct rq *rq)
-+{
-+ raw_spin_unlock(rq_lockp(rq));
-+}
-+
-+/*
-+ * RQ-clock updating methods:
-+ */
-+
-+static void update_rq_clock_task(struct rq *rq, s64 delta)
-+{
-+/*
-+ * In theory, the compile should just see 0 here, and optimize out the call
-+ * to sched_rt_avg_update. But I don't trust it...
-+ */
-+ s64 __maybe_unused steal = 0, irq_delta = 0;
-+
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+ irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
-+
-+ /*
-+ * Since irq_time is only updated on {soft,}irq_exit, we might run into
-+ * this case when a previous update_rq_clock() happened inside a
-+ * {soft,}irq region.
-+ *
-+ * When this happens, we stop ->clock_task and only update the
-+ * prev_irq_time stamp to account for the part that fit, so that a next
-+ * update will consume the rest. This ensures ->clock_task is
-+ * monotonic.
-+ *
-+ * It does however cause some slight miss-attribution of {soft,}irq
-+ * time, a more accurate solution would be to update the irq_time using
-+ * the current rq->clock timestamp, except that would require using
-+ * atomic ops.
-+ */
-+ if (irq_delta > delta)
-+ irq_delta = delta;
-+
-+ rq->prev_irq_time += irq_delta;
-+ delta -= irq_delta;
-+ delayacct_irq(rq->curr, irq_delta);
-+#endif
-+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
-+	if (static_key_false((&paravirt_steal_rq_enabled))) {
-+ steal = paravirt_steal_clock(cpu_of(rq));
-+ steal -= rq->prev_steal_time_rq;
-+
-+ if (unlikely(steal > delta))
-+ steal = delta;
-+
-+ rq->prev_steal_time_rq += steal;
-+ delta -= steal;
-+ }
-+#endif
-+
-+ rq->clock_task += delta;
-+
-+#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
-+ if ((irq_delta + steal))
-+ update_irq_load_avg(rq, irq_delta + steal);
-+#endif
-+}
-+
-+static inline void update_rq_clock(struct rq *rq)
-+{
-+ s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
-+
-+ if (unlikely(delta <= 0))
-+ return;
-+ rq->clock += delta;
-+ update_rq_time_edge(rq);
-+ update_rq_clock_task(rq, delta);
-+}
-+
-+/*
-+ * RQ Load update routine
-+ */
-+#define RQ_LOAD_HISTORY_BITS (sizeof(s32) * 8ULL)
-+#define RQ_UTIL_SHIFT (8)
-+#define RQ_LOAD_HISTORY_TO_UTIL(l) (((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)
-+
-+#define LOAD_BLOCK(t) ((t) >> 17)
-+#define LOAD_HALF_BLOCK(t) ((t) >> 16)
-+#define BLOCK_MASK(t) ((t) & ((0x01 << 18) - 1))
-+#define LOAD_BLOCK_BIT(b) (1UL << (RQ_LOAD_HISTORY_BITS - 1 - (b)))
-+#define CURRENT_LOAD_BIT LOAD_BLOCK_BIT(0)
-+
-+static inline void rq_load_update(struct rq *rq)
-+{
-+ u64 time = rq->clock;
-+ u64 delta = min(LOAD_BLOCK(time) - LOAD_BLOCK(rq->load_stamp),
-+ RQ_LOAD_HISTORY_BITS - 1);
-+ u64 prev = !!(rq->load_history & CURRENT_LOAD_BIT);
-+ u64 curr = !!rq->nr_running;
-+
-+ if (delta) {
-+ rq->load_history = rq->load_history >> delta;
-+
-+ if (delta < RQ_UTIL_SHIFT) {
-+ rq->load_block += (~BLOCK_MASK(rq->load_stamp)) * prev;
-+ if (!!LOAD_HALF_BLOCK(rq->load_block) ^ curr)
-+ rq->load_history ^= LOAD_BLOCK_BIT(delta);
-+ }
-+
-+ rq->load_block = BLOCK_MASK(time) * prev;
-+ } else {
-+ rq->load_block += (time - rq->load_stamp) * prev;
-+ }
-+ if (prev ^ curr)
-+ rq->load_history ^= CURRENT_LOAD_BIT;
-+ rq->load_stamp = time;
-+}
-+
-+unsigned long rq_load_util(struct rq *rq, unsigned long max)
-+{
-+ return RQ_LOAD_HISTORY_TO_UTIL(rq->load_history) * (max >> RQ_UTIL_SHIFT);
-+}
-+
-+#ifdef CONFIG_SMP
-+unsigned long sched_cpu_util(int cpu)
-+{
-+ return rq_load_util(cpu_rq(cpu), arch_scale_cpu_capacity(cpu));
-+}
-+#endif /* CONFIG_SMP */
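A standalone restatement of the RQ_LOAD_HISTORY_TO_UTIL() mapping above, showing how the 32-bit busy/idle history collapses to an 8-bit sample that rq_load_util() then scales to the capacity passed in; the macro names and example capacity are local to this sketch:

#include <stdio.h>
#include <stdint.h>

#define HISTORY_BITS 32ULL
#define UTIL_SHIFT   8
#define HISTORY_TO_UTIL(l) (((l) >> (HISTORY_BITS - 1 - UTIL_SHIFT)) & 0xff)

static unsigned long load_util(uint32_t history, unsigned long max)
{
	return HISTORY_TO_UTIL(history) * (max >> UTIL_SHIFT);  /* as in rq_load_util() */
}

int main(void)
{
	printf("all-busy history -> %lu / 1024\n", load_util(0xffffffffu, 1024));  /* 1020 */
	printf("all-idle history -> %lu / 1024\n", load_util(0x00000000u, 1024));  /* 0 */
	return 0;
}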
-+
-+#ifdef CONFIG_CPU_FREQ
-+/**
-+ * cpufreq_update_util - Take a note about CPU utilization changes.
-+ * @rq: Runqueue to carry out the update for.
-+ * @flags: Update reason flags.
-+ *
-+ * This function is called by the scheduler on the CPU whose utilization is
-+ * being updated.
-+ *
-+ * It can only be called from RCU-sched read-side critical sections.
-+ *
-+ * The way cpufreq is currently arranged requires it to evaluate the CPU
-+ * performance state (frequency/voltage) on a regular basis to prevent it from
-+ * being stuck in a completely inadequate performance level for too long.
-+ * That is not guaranteed to happen if the updates are only triggered from CFS
-+ * and DL, though, because they may not be coming in if only RT tasks are
-+ * active all the time (or there are RT tasks only).
-+ *
-+ * As a workaround for that issue, this function is called periodically by the
-+ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
-+ * but that really is a band-aid. Going forward it should be replaced with
-+ * solutions targeted more specifically at RT tasks.
-+ */
-+static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
-+{
-+ struct update_util_data *data;
-+
-+#ifdef CONFIG_SMP
-+ rq_load_update(rq);
-+#endif
-+ data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
-+ cpu_of(rq)));
-+ if (data)
-+ data->func(data, rq_clock(rq), flags);
-+}
-+#else
-+static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
-+{
-+#ifdef CONFIG_SMP
-+ rq_load_update(rq);
-+#endif
-+}
-+#endif /* CONFIG_CPU_FREQ */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+/*
-+ * Tick may be needed by tasks in the runqueue depending on their policy and
-+ * requirements. If tick is needed, lets send the target an IPI to kick it out
-+ * of nohz mode if necessary.
-+ */
-+static inline void sched_update_tick_dependency(struct rq *rq)
-+{
-+ int cpu = cpu_of(rq);
-+
-+ if (!tick_nohz_full_cpu(cpu))
-+ return;
-+
-+ if (rq->nr_running < 2)
-+ tick_nohz_dep_clear_cpu(cpu, TICK_DEP_BIT_SCHED);
-+ else
-+ tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
-+}
-+#else /* !CONFIG_NO_HZ_FULL */
-+static inline void sched_update_tick_dependency(struct rq *rq) { }
-+#endif
-+
-+bool sched_task_on_rq(struct task_struct *p)
-+{
-+ return task_on_rq_queued(p);
-+}
-+
-+unsigned long get_wchan(struct task_struct *p)
-+{
-+ unsigned long ip = 0;
-+ unsigned int state;
-+
-+ if (!p || p == current)
-+ return 0;
-+
-+ /* Only get wchan if task is blocked and we can keep it that way. */
-+ raw_spin_lock_irq(&p->pi_lock);
-+ state = READ_ONCE(p->__state);
-+ smp_rmb(); /* see try_to_wake_up() */
-+ if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
-+ ip = __get_wchan(p);
-+ raw_spin_unlock_irq(&p->pi_lock);
-+
-+ return ip;
-+}
-+
-+/*
-+ * Add/Remove/Requeue task to/from the runqueue routines
-+ * Context: rq->lock
-+ */
-+#define __SCHED_DEQUEUE_TASK(p, rq, flags, func) \
-+ sched_info_dequeue(rq, p); \
-+ \
-+ list_del(&p->sq_node); \
-+ if (list_empty(&rq->queue.heads[p->sq_idx])) { \
-+ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap); \
-+ func; \
-+ }
-+
-+#define __SCHED_ENQUEUE_TASK(p, rq, flags) \
-+ sched_info_enqueue(rq, p); \
-+ \
-+ p->sq_idx = task_sched_prio_idx(p, rq); \
-+ list_add_tail(&p->sq_node, &rq->queue.heads[p->sq_idx]); \
-+ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+
-+static inline void dequeue_task(struct task_struct *p, struct rq *rq, int flags)
-+{
-+#ifdef ALT_SCHED_DEBUG
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*printk(KERN_INFO "sched: dequeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: dequeue task reside on cpu%d from cpu%d\n",
-+ task_cpu(p), cpu_of(rq));
-+#endif
-+
-+ __SCHED_DEQUEUE_TASK(p, rq, flags, update_sched_preempt_mask(rq));
-+ --rq->nr_running;
-+#ifdef CONFIG_SMP
-+ if (1 == rq->nr_running)
-+ cpumask_clear_cpu(cpu_of(rq), &sched_rq_pending_mask);
-+#endif
-+
-+ sched_update_tick_dependency(rq);
-+}
-+
-+static inline void enqueue_task(struct task_struct *p, struct rq *rq, int flags)
-+{
-+#ifdef ALT_SCHED_DEBUG
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*printk(KERN_INFO "sched: enqueue(%d) %px %d\n", cpu_of(rq), p, p->prio);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: enqueue task reside on cpu%d to cpu%d\n",
-+ task_cpu(p), cpu_of(rq));
-+#endif
-+
-+ __SCHED_ENQUEUE_TASK(p, rq, flags);
-+ update_sched_preempt_mask(rq);
-+ ++rq->nr_running;
-+#ifdef CONFIG_SMP
-+ if (2 == rq->nr_running)
-+ cpumask_set_cpu(cpu_of(rq), &sched_rq_pending_mask);
-+#endif
-+
-+ sched_update_tick_dependency(rq);
-+}
-+
-+static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx)
-+{
-+#ifdef ALT_SCHED_DEBUG
-+ lockdep_assert_held(&rq->lock);
-+ /*printk(KERN_INFO "sched: requeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: cpu[%d] requeue task reside on cpu%d\n",
-+ cpu_of(rq), task_cpu(p));
-+#endif
-+
-+ list_del(&p->sq_node);
-+ list_add_tail(&p->sq_node, &rq->queue.heads[idx]);
-+ if (idx != p->sq_idx) {
-+ if (list_empty(&rq->queue.heads[p->sq_idx]))
-+ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+ p->sq_idx = idx;
-+ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+ update_sched_preempt_mask(rq);
-+ }
-+}
-+
-+/*
-+ * cmpxchg based fetch_or, macro so it works for different integer types
-+ */
-+#define fetch_or(ptr, mask) \
-+ ({ \
-+ typeof(ptr) _ptr = (ptr); \
-+ typeof(mask) _mask = (mask); \
-+ typeof(*_ptr) _val = *_ptr; \
-+ \
-+ do { \
-+ } while (!try_cmpxchg(_ptr, &_val, _val | _mask)); \
-+ _val; \
-+})
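A self-contained equivalent of the fetch_or() macro above, written with the GCC/Clang __atomic builtins instead of the kernel's try_cmpxchg(); the helper name is just for this sketch:

#include <stdio.h>

static unsigned long fetch_or_ulong(unsigned long *ptr, unsigned long mask)
{
	unsigned long old = __atomic_load_n(ptr, __ATOMIC_RELAXED);

	/* On failure the builtin refreshes 'old' with the current value; retry. */
	while (!__atomic_compare_exchange_n(ptr, &old, old | mask, 0,
					    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
		;
	return old;
}

int main(void)
{
	unsigned long flags = 0x1;
	unsigned long old = fetch_or_ulong(&flags, 0x4);

	printf("old=%#lx new=%#lx\n", old, flags);  /* old=0x1 new=0x5 */
	return 0;
}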
-+
-+#if defined(CONFIG_SMP) && defined(TIF_POLLING_NRFLAG)
-+/*
-+ * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
-+ * this avoids any races wrt polling state changes and thereby avoids
-+ * spurious IPIs.
-+ */
-+static inline bool set_nr_and_not_polling(struct task_struct *p)
-+{
-+ struct thread_info *ti = task_thread_info(p);
-+ return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
-+}
-+
-+/*
-+ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
-+ *
-+ * If this returns true, then the idle task promises to call
-+ * sched_ttwu_pending() and reschedule soon.
-+ */
-+static bool set_nr_if_polling(struct task_struct *p)
-+{
-+ struct thread_info *ti = task_thread_info(p);
-+ typeof(ti->flags) val = READ_ONCE(ti->flags);
-+
-+ for (;;) {
-+ if (!(val & _TIF_POLLING_NRFLAG))
-+ return false;
-+ if (val & _TIF_NEED_RESCHED)
-+ return true;
-+ if (try_cmpxchg(&ti->flags, &val, val | _TIF_NEED_RESCHED))
-+ break;
-+ }
-+ return true;
-+}
-+
-+#else
-+static inline bool set_nr_and_not_polling(struct task_struct *p)
-+{
-+ set_tsk_need_resched(p);
-+ return true;
-+}
-+
-+#ifdef CONFIG_SMP
-+static inline bool set_nr_if_polling(struct task_struct *p)
-+{
-+ return false;
-+}
-+#endif
-+#endif
-+
-+static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
-+{
-+ struct wake_q_node *node = &task->wake_q;
-+
-+ /*
-+ * Atomically grab the task, if ->wake_q is !nil already it means
-+ * it's already queued (either by us or someone else) and will get the
-+ * wakeup due to that.
-+ *
-+ * In order to ensure that a pending wakeup will observe our pending
-+ * state, even in the failed case, an explicit smp_mb() must be used.
-+ */
-+ smp_mb__before_atomic();
-+ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
-+ return false;
-+
-+ /*
-+ * The head is context local, there can be no concurrency.
-+ */
-+ *head->lastp = node;
-+ head->lastp = &node->next;
-+ return true;
-+}
-+
-+/**
-+ * wake_q_add() - queue a wakeup for 'later' waking.
-+ * @head: the wake_q_head to add @task to
-+ * @task: the task to queue for 'later' wakeup
-+ *
-+ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
-+ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
-+ * instantly.
-+ *
-+ * This function must be used as-if it were wake_up_process(); IOW the task
-+ * must be ready to be woken at this location.
-+ */
-+void wake_q_add(struct wake_q_head *head, struct task_struct *task)
-+{
-+ if (__wake_q_add(head, task))
-+ get_task_struct(task);
-+}
-+
-+/**
-+ * wake_q_add_safe() - safely queue a wakeup for 'later' waking.
-+ * @head: the wake_q_head to add @task to
-+ * @task: the task to queue for 'later' wakeup
-+ *
-+ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
-+ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
-+ * instantly.
-+ *
-+ * This function must be used as-if it were wake_up_process(); IOW the task
-+ * must be ready to be woken at this location.
-+ *
-+ * This function is essentially a task-safe equivalent to wake_q_add(). Callers
-+ * that already hold reference to @task can call the 'safe' version and trust
-+ * wake_q to do the right thing depending whether or not the @task is already
-+ * queued for wakeup.
-+ */
-+void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
-+{
-+ if (!__wake_q_add(head, task))
-+ put_task_struct(task);
-+}
-+
-+void wake_up_q(struct wake_q_head *head)
-+{
-+ struct wake_q_node *node = head->first;
-+
-+ while (node != WAKE_Q_TAIL) {
-+ struct task_struct *task;
-+
-+ task = container_of(node, struct task_struct, wake_q);
-+ /* task can safely be re-inserted now: */
-+ node = node->next;
-+ task->wake_q.next = NULL;
-+
-+ /*
-+ * wake_up_process() executes a full barrier, which pairs with
-+ * the queueing in wake_q_add() so as not to miss wakeups.
-+ */
-+ wake_up_process(task);
-+ put_task_struct(task);
-+ }
-+}
-+
-+/*
-+ * resched_curr - mark rq's current task 'to be rescheduled now'.
-+ *
-+ * On UP this means the setting of the need_resched flag, on SMP it
-+ * might also involve a cross-CPU call to trigger the scheduler on
-+ * the target CPU.
-+ */
-+void resched_curr(struct rq *rq)
-+{
-+ struct task_struct *curr = rq->curr;
-+ int cpu;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ if (test_tsk_need_resched(curr))
-+ return;
-+
-+ cpu = cpu_of(rq);
-+ if (cpu == smp_processor_id()) {
-+ set_tsk_need_resched(curr);
-+ set_preempt_need_resched();
-+ return;
-+ }
-+
-+ if (set_nr_and_not_polling(curr))
-+ smp_send_reschedule(cpu);
-+ else
-+ trace_sched_wake_idle_without_ipi(cpu);
-+}
-+
-+void resched_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (cpu_online(cpu) || cpu == smp_processor_id())
-+ resched_curr(cpu_rq(cpu));
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+}
-+
-+#ifdef CONFIG_SMP
-+#ifdef CONFIG_NO_HZ_COMMON
-+void nohz_balance_enter_idle(int cpu) {}
-+
-+void select_nohz_load_balancer(int stop_tick) {}
-+
-+void set_cpu_sd_state_idle(void) {}
-+
-+/*
-+ * In the semi idle case, use the nearest busy CPU for migrating timers
-+ * from an idle CPU. This is good for power-savings.
-+ *
-+ * We don't do similar optimization for completely idle system, as
-+ * selecting an idle CPU will add more delays to the timers than intended
-+ * (as that CPU's timer base may not be uptodate wrt jiffies etc).
-+ */
-+int get_nohz_timer_target(void)
-+{
-+ int i, cpu = smp_processor_id(), default_cpu = -1;
-+ struct cpumask *mask;
-+ const struct cpumask *hk_mask;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TIMER)) {
-+ if (!idle_cpu(cpu))
-+ return cpu;
-+ default_cpu = cpu;
-+ }
-+
-+ hk_mask = housekeeping_cpumask(HK_TYPE_TIMER);
-+
-+ for (mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+ mask < per_cpu(sched_cpu_topo_end_mask, cpu); mask++)
-+ for_each_cpu_and(i, mask, hk_mask)
-+ if (!idle_cpu(i))
-+ return i;
-+
-+ if (default_cpu == -1)
-+ default_cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
-+ cpu = default_cpu;
-+
-+ return cpu;
-+}
-+
-+/*
-+ * When add_timer_on() enqueues a timer into the timer wheel of an
-+ * idle CPU then this timer might expire before the next timer event
-+ * which is scheduled to wake up that CPU. In case of a completely
-+ * idle system the next event might even be infinite time into the
-+ * future. wake_up_idle_cpu() ensures that the CPU is woken up and
-+ * leaves the inner idle loop so the newly added timer is taken into
-+ * account when the CPU goes back to idle and evaluates the timer
-+ * wheel for the next timer event.
-+ */
-+static inline void wake_up_idle_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (cpu == smp_processor_id())
-+ return;
-+
-+ if (set_nr_and_not_polling(rq->idle))
-+ smp_send_reschedule(cpu);
-+ else
-+ trace_sched_wake_idle_without_ipi(cpu);
-+}
-+
-+static inline bool wake_up_full_nohz_cpu(int cpu)
-+{
-+ /*
-+ * We just need the target to call irq_exit() and re-evaluate
-+ * the next tick. The nohz full kick at least implies that.
-+ * If needed we can still optimize that later with an
-+ * empty IRQ.
-+ */
-+ if (cpu_is_offline(cpu))
-+ return true; /* Don't try to wake offline CPUs. */
-+ if (tick_nohz_full_cpu(cpu)) {
-+ if (cpu != smp_processor_id() ||
-+ tick_nohz_tick_stopped())
-+ tick_nohz_full_kick_cpu(cpu);
-+ return true;
-+ }
-+
-+ return false;
-+}
-+
-+void wake_up_nohz_cpu(int cpu)
-+{
-+ if (!wake_up_full_nohz_cpu(cpu))
-+ wake_up_idle_cpu(cpu);
-+}
-+
-+static void nohz_csd_func(void *info)
-+{
-+ struct rq *rq = info;
-+ int cpu = cpu_of(rq);
-+ unsigned int flags;
-+
-+ /*
-+ * Release the rq::nohz_csd.
-+ */
-+ flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
-+ WARN_ON(!(flags & NOHZ_KICK_MASK));
-+
-+ rq->idle_balance = idle_cpu(cpu);
-+ if (rq->idle_balance && !need_resched()) {
-+ rq->nohz_idle_balance = flags;
-+ raise_softirq_irqoff(SCHED_SOFTIRQ);
-+ }
-+}
-+
-+#endif /* CONFIG_NO_HZ_COMMON */
-+#endif /* CONFIG_SMP */
-+
-+static inline void check_preempt_curr(struct rq *rq)
-+{
-+ if (sched_rq_first_task(rq) != rq->curr)
-+ resched_curr(rq);
-+}
-+
-+#ifdef CONFIG_SCHED_HRTICK
-+/*
-+ * Use HR-timers to deliver accurate preemption points.
-+ */
-+
-+static void hrtick_clear(struct rq *rq)
-+{
-+ if (hrtimer_active(&rq->hrtick_timer))
-+ hrtimer_cancel(&rq->hrtick_timer);
-+}
-+
-+/*
-+ * High-resolution timer tick.
-+ * Runs from hardirq context with interrupts disabled.
-+ */
-+static enum hrtimer_restart hrtick(struct hrtimer *timer)
-+{
-+ struct rq *rq = container_of(timer, struct rq, hrtick_timer);
-+
-+ WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
-+
-+ raw_spin_lock(&rq->lock);
-+ resched_curr(rq);
-+ raw_spin_unlock(&rq->lock);
-+
-+ return HRTIMER_NORESTART;
-+}
-+
-+/*
-+ * Use hrtick when:
-+ * - enabled by features
-+ * - hrtimer is actually high res
-+ */
-+static inline int hrtick_enabled(struct rq *rq)
-+{
-+ /**
-+ * Alt schedule FW doesn't support sched_feat yet
-+ if (!sched_feat(HRTICK))
-+ return 0;
-+ */
-+ if (!cpu_active(cpu_of(rq)))
-+ return 0;
-+ return hrtimer_is_hres_active(&rq->hrtick_timer);
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+static void __hrtick_restart(struct rq *rq)
-+{
-+ struct hrtimer *timer = &rq->hrtick_timer;
-+ ktime_t time = rq->hrtick_time;
-+
-+ hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
-+}
-+
-+/*
-+ * called from hardirq (IPI) context
-+ */
-+static void __hrtick_start(void *arg)
-+{
-+ struct rq *rq = arg;
-+
-+ raw_spin_lock(&rq->lock);
-+ __hrtick_restart(rq);
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+/*
-+ * Called to set the hrtick timer state.
-+ *
-+ * called with rq->lock held and irqs disabled
-+ */
-+void hrtick_start(struct rq *rq, u64 delay)
-+{
-+ struct hrtimer *timer = &rq->hrtick_timer;
-+ s64 delta;
-+
-+ /*
-+ * Don't schedule slices shorter than 10000ns, that just
-+ * doesn't make sense and can cause timer DoS.
-+ */
-+ delta = max_t(s64, delay, 10000LL);
-+
-+ rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
-+
-+ if (rq == this_rq())
-+ __hrtick_restart(rq);
-+ else
-+ smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
-+}
-+
-+#else
-+/*
-+ * Called to set the hrtick timer state.
-+ *
-+ * called with rq->lock held and irqs disabled
-+ */
-+void hrtick_start(struct rq *rq, u64 delay)
-+{
-+ /*
-+ * Don't schedule slices shorter than 10000ns, that just
-+ * doesn't make sense. Rely on vruntime for fairness.
-+ */
-+ delay = max_t(u64, delay, 10000LL);
-+ hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
-+ HRTIMER_MODE_REL_PINNED_HARD);
-+}
-+#endif /* CONFIG_SMP */
-+
-+static void hrtick_rq_init(struct rq *rq)
-+{
-+#ifdef CONFIG_SMP
-+ INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
-+#endif
-+
-+ hrtimer_init(&rq->hrtick_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
-+ rq->hrtick_timer.function = hrtick;
-+}
-+#else /* CONFIG_SCHED_HRTICK */
-+static inline int hrtick_enabled(struct rq *rq)
-+{
-+ return 0;
-+}
-+
-+static inline void hrtick_clear(struct rq *rq)
-+{
-+}
-+
-+static inline void hrtick_rq_init(struct rq *rq)
-+{
-+}
-+#endif /* CONFIG_SCHED_HRTICK */
-+
-+static inline int __normal_prio(int policy, int rt_prio, int static_prio)
-+{
-+ return rt_policy(policy) ? (MAX_RT_PRIO - 1 - rt_prio) :
-+ static_prio + MAX_PRIORITY_ADJ;
-+}
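-+
-+/*
-+ * Worked example (illustrative, assuming MAX_RT_PRIO == 100): a SCHED_FIFO
-+ * task with rt_prio 50 maps to 100 - 1 - 50 == 49, while a SCHED_NORMAL
-+ * task at nice 0 (static_prio == 120) maps to 120 + MAX_PRIORITY_ADJ;
-+ * lower values mean higher priority.
-+ */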
-+
-+/*
-+ * Calculate the expected normal priority: i.e. priority
-+ * without taking RT-inheritance into account. Might be
-+ * boosted by interactivity modifiers. Changes upon fork,
-+ * setprio syscalls, and whenever the interactivity
-+ * estimator recalculates.
-+ */
-+static inline int normal_prio(struct task_struct *p)
-+{
-+ return __normal_prio(p->policy, p->rt_priority, p->static_prio);
-+}
-+
-+/*
-+ * Calculate the current priority, i.e. the priority
-+ * taken into account by the scheduler. This value might
-+ * be boosted by RT tasks as it will be RT if the task got
-+ * RT-boosted. If not then it returns p->normal_prio.
-+ */
-+static int effective_prio(struct task_struct *p)
-+{
-+ p->normal_prio = normal_prio(p);
-+ /*
-+ * If we are RT tasks or we were boosted to RT priority,
-+ * keep the priority unchanged. Otherwise, update priority
-+ * to the normal priority:
-+ */
-+ if (!rt_prio(p->prio))
-+ return p->normal_prio;
-+ return p->prio;
-+}
-+
-+/*
-+ * activate_task - move a task to the runqueue.
-+ *
-+ * Context: rq->lock
-+ */
-+static void activate_task(struct task_struct *p, struct rq *rq)
-+{
-+ enqueue_task(p, rq, ENQUEUE_WAKEUP);
-+ p->on_rq = TASK_ON_RQ_QUEUED;
-+
-+ /*
-+ * If in_iowait is set, the code below may not trigger any cpufreq
-+ * utilization updates, so do it here explicitly with the IOWAIT flag
-+ * passed.
-+ */
-+ cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT * p->in_iowait);
-+}
-+
-+/*
-+ * deactivate_task - remove a task from the runqueue.
-+ *
-+ * Context: rq->lock
-+ */
-+static inline void deactivate_task(struct task_struct *p, struct rq *rq)
-+{
-+ dequeue_task(p, rq, DEQUEUE_SLEEP);
-+ p->on_rq = 0;
-+ cpufreq_update_util(rq, 0);
-+}
-+
-+static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
-+{
-+#ifdef CONFIG_SMP
-+ /*
-+ * After ->cpu is set up to a new value, task_access_lock(p, ...) can be
-+ * successfully executed on another CPU. We must ensure that updates of
-+ * per-task data have been completed by this moment.
-+ */
-+ smp_wmb();
-+
-+ WRITE_ONCE(task_thread_info(p)->cpu, cpu);
-+#endif
-+}
-+
-+static inline bool is_migration_disabled(struct task_struct *p)
-+{
-+#ifdef CONFIG_SMP
-+ return p->migration_disabled;
-+#else
-+ return false;
-+#endif
-+}
-+
-+#define SCA_CHECK 0x01
-+#define SCA_USER 0x08
-+
-+#ifdef CONFIG_SMP
-+
-+void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
-+{
-+#ifdef CONFIG_SCHED_DEBUG
-+ unsigned int state = READ_ONCE(p->__state);
-+
-+ /*
-+ * We should never call set_task_cpu() on a blocked task,
-+ * ttwu() will sort out the placement.
-+ */
-+ WARN_ON_ONCE(state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq);
-+
-+#ifdef CONFIG_LOCKDEP
-+ /*
-+ * The caller should hold either p->pi_lock or rq->lock, when changing
-+ * a task's CPU. ->pi_lock for waking tasks, rq->lock for runnable tasks.
-+ *
-+ * sched_move_task() holds both and thus holding either pins the cgroup,
-+ * see task_group().
-+ */
-+ WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
-+ lockdep_is_held(&task_rq(p)->lock)));
-+#endif
-+ /*
-+ * Clearly, migrating tasks to offline CPUs is a fairly daft thing.
-+ */
-+ WARN_ON_ONCE(!cpu_online(new_cpu));
-+
-+ WARN_ON_ONCE(is_migration_disabled(p));
-+#endif
-+ trace_sched_migrate_task(p, new_cpu);
-+
-+ if (task_cpu(p) != new_cpu) {
-+ rseq_migrate(p);
-+ perf_event_task_migrate(p);
-+ }
-+
-+ __set_task_cpu(p, new_cpu);
-+}
-+
-+#define MDF_FORCE_ENABLED 0x80
-+
-+static void
-+__do_set_cpus_ptr(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ /*
-+ * This here violates the locking rules for affinity, since we're only
-+ * supposed to change these variables while holding both rq->lock and
-+ * p->pi_lock.
-+ *
-+ * HOWEVER, it magically works, because ttwu() is the only code that
-+ * accesses these variables under p->pi_lock and only does so after
-+ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule()
-+ * before finish_task().
-+ *
-+ * XXX do further audits, this smells like something putrid.
-+ */
-+ SCHED_WARN_ON(!p->on_cpu);
-+ p->cpus_ptr = new_mask;
-+}
-+
-+void migrate_disable(void)
-+{
-+ struct task_struct *p = current;
-+ int cpu;
-+
-+ if (p->migration_disabled) {
-+ p->migration_disabled++;
-+ return;
-+ }
-+
-+ preempt_disable();
-+ cpu = smp_processor_id();
-+ if (cpumask_test_cpu(cpu, &p->cpus_mask)) {
-+ cpu_rq(cpu)->nr_pinned++;
-+ p->migration_disabled = 1;
-+ p->migration_flags &= ~MDF_FORCE_ENABLED;
-+
-+ /*
-+ * Violates locking rules! see comment in __do_set_cpus_ptr().
-+ */
-+ if (p->cpus_ptr == &p->cpus_mask)
-+ __do_set_cpus_ptr(p, cpumask_of(cpu));
-+ }
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(migrate_disable);
-+
-+void migrate_enable(void)
-+{
-+ struct task_struct *p = current;
-+
-+ if (0 == p->migration_disabled)
-+ return;
-+
-+ if (p->migration_disabled > 1) {
-+ p->migration_disabled--;
-+ return;
-+ }
-+
-+ if (WARN_ON_ONCE(!p->migration_disabled))
-+ return;
-+
-+ /*
-+ * Ensure stop_task runs either before or after this, and that
-+ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
-+ */
-+ preempt_disable();
-+ /*
-+ * Assumption: current should be running on allowed cpu
-+ */
-+ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &p->cpus_mask));
-+ if (p->cpus_ptr != &p->cpus_mask)
-+ __do_set_cpus_ptr(p, &p->cpus_mask);
-+ /*
-+ * Mustn't clear migration_disabled() until cpus_ptr points back at the
-+ * regular cpus_mask, otherwise things that race (eg.
-+ * select_fallback_rq) get confused.
-+ */
-+ barrier();
-+ p->migration_disabled = 0;
-+ this_rq()->nr_pinned--;
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(migrate_enable);
-+
-+static inline bool rq_has_pinned_tasks(struct rq *rq)
-+{
-+ return rq->nr_pinned;
-+}
-+
-+/*
-+ * Per-CPU kthreads are allowed to run on !active && online CPUs, see
-+ * __set_cpus_allowed_ptr() and select_fallback_rq().
-+ */
-+static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
-+{
-+ /* When not in the task's cpumask, no point in looking further. */
-+ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
-+ return false;
-+
-+ /* migrate_disabled() must be allowed to finish. */
-+ if (is_migration_disabled(p))
-+ return cpu_online(cpu);
-+
-+ /* Non kernel threads are not allowed during either online or offline. */
-+ if (!(p->flags & PF_KTHREAD))
-+ return cpu_active(cpu) && task_cpu_possible(cpu, p);
-+
-+ /* KTHREAD_IS_PER_CPU is always allowed. */
-+ if (kthread_is_per_cpu(p))
-+ return cpu_online(cpu);
-+
-+ /* Regular kernel threads don't get to stay during offline. */
-+ if (cpu_dying(cpu))
-+ return false;
-+
-+ /* But are allowed during online. */
-+ return cpu_online(cpu);
-+}
-+
-+/*
-+ * This is how migration works:
-+ *
-+ * 1) we invoke migration_cpu_stop() on the target CPU using
-+ * stop_one_cpu().
-+ * 2) stopper starts to run (implicitly forcing the migrated thread
-+ * off the CPU)
-+ * 3) it checks whether the migrated task is still in the wrong runqueue.
-+ * 4) if it's in the wrong runqueue then the migration thread removes
-+ * it and puts it into the right queue.
-+ * 5) stopper completes and stop_one_cpu() returns and the migration
-+ * is done.
-+ */
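-+
-+/*
-+ * A minimal caller sketch of the steps above (illustrative only; dest_cpu
-+ * is assumed to be a valid CPU for @p):
-+ *
-+ *	struct migration_arg arg = { .task = p, .dest_cpu = dest_cpu };
-+ *
-+ *	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+ *	stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
-+ *
-+ * affine_move_task() below follows this pattern when the task is currently
-+ * running and has to be chased off its CPU.
-+ */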
-+
-+/*
-+ * move_queued_task - move a queued task to new rq.
-+ *
-+ * Returns (locked) new rq. Old rq's lock is released.
-+ */
-+static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int
-+ new_cpu)
-+{
-+ int src_cpu;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ src_cpu = cpu_of(rq);
-+ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
-+ dequeue_task(p, rq, 0);
-+ set_task_cpu(p, new_cpu);
-+ raw_spin_unlock(&rq->lock);
-+
-+ rq = cpu_rq(new_cpu);
-+
-+ raw_spin_lock(&rq->lock);
-+ WARN_ON_ONCE(task_cpu(p) != new_cpu);
-+
-+ sched_mm_cid_migrate_to(rq, p, src_cpu);
-+
-+ sched_task_sanity_check(p, rq);
-+ enqueue_task(p, rq, 0);
-+ p->on_rq = TASK_ON_RQ_QUEUED;
-+ check_preempt_curr(rq);
-+
-+ return rq;
-+}
-+
-+struct migration_arg {
-+ struct task_struct *task;
-+ int dest_cpu;
-+};
-+
-+/*
-+ * Move (not current) task off this CPU, onto the destination CPU. We're doing
-+ * this because either it can't run here any more (set_cpus_allowed()
-+ * away from this CPU, or CPU going down), or because we're
-+ * attempting to rebalance this task on exec (sched_exec).
-+ *
-+ * So we race with normal scheduler movements, but that's OK, as long
-+ * as the task is no longer on this CPU.
-+ */
-+static struct rq *__migrate_task(struct rq *rq, struct task_struct *p, int
-+ dest_cpu)
-+{
-+ /* Affinity changed (again). */
-+ if (!is_cpu_allowed(p, dest_cpu))
-+ return rq;
-+
-+ update_rq_clock(rq);
-+ return move_queued_task(rq, p, dest_cpu);
-+}
-+
-+/*
-+ * migration_cpu_stop - this will be executed by a highprio stopper thread
-+ * and performs thread migration by bumping thread off CPU then
-+ * 'pushing' onto another runqueue.
-+ */
-+static int migration_cpu_stop(void *data)
-+{
-+ struct migration_arg *arg = data;
-+ struct task_struct *p = arg->task;
-+ struct rq *rq = this_rq();
-+ unsigned long flags;
-+
-+ /*
-+ * The original target CPU might have gone down and we might
-+ * be on another CPU but it doesn't matter.
-+ */
-+ local_irq_save(flags);
-+ /*
-+ * We need to explicitly wake pending tasks before running
-+ * __migrate_task() such that we will not miss enforcing cpus_ptr
-+ * during wakeups, see set_cpus_allowed_ptr()'s TASK_WAKING test.
-+ */
-+ flush_smp_call_function_queue();
-+
-+ raw_spin_lock(&p->pi_lock);
-+ raw_spin_lock(&rq->lock);
-+ /*
-+ * If task_rq(p) != rq, it cannot be migrated here, because we're
-+ * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
-+ * we're holding p->pi_lock.
-+ */
-+ if (task_rq(p) == rq && task_on_rq_queued(p))
-+ rq = __migrate_task(rq, p, arg->dest_cpu);
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+
-+ return 0;
-+}
-+
-+static inline void
-+set_cpus_allowed_common(struct task_struct *p, struct affinity_context *ctx)
-+{
-+ cpumask_copy(&p->cpus_mask, ctx->new_mask);
-+ p->nr_cpus_allowed = cpumask_weight(ctx->new_mask);
-+
-+ /*
-+ * Swap in a new user_cpus_ptr if SCA_USER flag set
-+ */
-+ if (ctx->flags & SCA_USER)
-+ swap(p->user_cpus_ptr, ctx->user_mask);
-+}
-+
-+static void
-+__do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
-+{
-+ lockdep_assert_held(&p->pi_lock);
-+ set_cpus_allowed_common(p, ctx);
-+}
-+
-+/*
-+ * Used for kthread_bind() and select_fallback_rq(), in both cases the user
-+ * affinity (if any) should be destroyed too.
-+ */
-+void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ struct affinity_context ac = {
-+ .new_mask = new_mask,
-+ .user_mask = NULL,
-+ .flags = SCA_USER, /* clear the user requested mask */
-+ };
-+ union cpumask_rcuhead {
-+ cpumask_t cpumask;
-+ struct rcu_head rcu;
-+ };
-+
-+ __do_set_cpus_allowed(p, &ac);
-+
-+ /*
-+ * Because this is called with p->pi_lock held, it is not possible
-+ * to use kfree() here (when PREEMPT_RT=y), therefore punt to using
-+ * kfree_rcu().
-+ */
-+ kfree_rcu((union cpumask_rcuhead *)ac.user_mask, rcu);
-+}
-+
-+static cpumask_t *alloc_user_cpus_ptr(int node)
-+{
-+ /*
-+ * See do_set_cpus_allowed() above for the rcu_head usage.
-+ */
-+ int size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
-+
-+ return kmalloc_node(size, GFP_KERNEL, node);
-+}
-+
-+int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
-+ int node)
-+{
-+ cpumask_t *user_mask;
-+ unsigned long flags;
-+
-+ /*
-+ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
-+ * may differ by now due to racing.
-+ */
-+ dst->user_cpus_ptr = NULL;
-+
-+ /*
-+ * This check is racy and losing the race is a valid situation.
-+ * It is not worth the extra overhead of taking the pi_lock on
-+ * every fork/clone.
-+ */
-+ if (data_race(!src->user_cpus_ptr))
-+ return 0;
-+
-+ user_mask = alloc_user_cpus_ptr(node);
-+ if (!user_mask)
-+ return -ENOMEM;
-+
-+ /*
-+ * Use pi_lock to protect content of user_cpus_ptr
-+ *
-+ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
-+ * do_set_cpus_allowed().
-+ */
-+ raw_spin_lock_irqsave(&src->pi_lock, flags);
-+ if (src->user_cpus_ptr) {
-+ swap(dst->user_cpus_ptr, user_mask);
-+ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
-+ }
-+ raw_spin_unlock_irqrestore(&src->pi_lock, flags);
-+
-+ if (unlikely(user_mask))
-+ kfree(user_mask);
-+
-+ return 0;
-+}
-+
-+static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
-+{
-+ struct cpumask *user_mask = NULL;
-+
-+ swap(p->user_cpus_ptr, user_mask);
-+
-+ return user_mask;
-+}
-+
-+void release_user_cpus_ptr(struct task_struct *p)
-+{
-+ kfree(clear_user_cpus_ptr(p));
-+}
-+
-+#endif
-+
-+/**
-+ * task_curr - is this task currently executing on a CPU?
-+ * @p: the task in question.
-+ *
-+ * Return: 1 if the task is currently executing. 0 otherwise.
-+ */
-+inline int task_curr(const struct task_struct *p)
-+{
-+ return cpu_curr(task_cpu(p)) == p;
-+}
-+
-+#ifdef CONFIG_SMP
-+/*
-+ * wait_task_inactive - wait for a thread to unschedule.
-+ *
-+ * Wait for the thread to block in any of the states set in @match_state.
-+ * If it changes, i.e. @p might have woken up, then return zero. When we
-+ * succeed in waiting for @p to be off its CPU, we return a positive number
-+ * (its total switch count). If a second call a short while later returns the
-+ * same number, the caller can be sure that @p has remained unscheduled the
-+ * whole time.
-+ *
-+ * The caller must ensure that the task *will* unschedule sometime soon,
-+ * else this function might spin for a *long* time. This function can't
-+ * be called with interrupts off, or it may introduce deadlock with
-+ * smp_call_function() if an IPI is sent by the same process we are
-+ * waiting to become inactive.
-+ */
-+unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
-+{
-+ unsigned long flags;
-+ bool running, on_rq;
-+ unsigned long ncsw;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ for (;;) {
-+ rq = task_rq(p);
-+
-+ /*
-+ * If the task is actively running on another CPU
-+ * still, just relax and busy-wait without holding
-+ * any locks.
-+ *
-+ * NOTE! Since we don't hold any locks, it's not
-+ * even sure that "rq" stays as the right runqueue!
-+ * But we don't care, since this will return false
-+ * if the runqueue has changed and p is actually now
-+ * running somewhere else!
-+ */
-+ while (task_on_cpu(p) && p == rq->curr) {
-+ if (!(READ_ONCE(p->__state) & match_state))
-+ return 0;
-+ cpu_relax();
-+ }
-+
-+ /*
-+ * Ok, time to look more closely! We need the rq
-+ * lock now, to be *sure*. If we're wrong, we'll
-+ * just go back and repeat.
-+ */
-+ task_access_lock_irqsave(p, &lock, &flags);
-+ trace_sched_wait_task(p);
-+ running = task_on_cpu(p);
-+ on_rq = p->on_rq;
-+ ncsw = 0;
-+ if (READ_ONCE(p->__state) & match_state)
-+ ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+ /*
-+ * If it changed from the expected state, bail out now.
-+ */
-+ if (unlikely(!ncsw))
-+ break;
-+
-+ /*
-+ * Was it really running after all now that we
-+ * checked with the proper locks actually held?
-+ *
-+ * Oops. Go back and try again..
-+ */
-+ if (unlikely(running)) {
-+ cpu_relax();
-+ continue;
-+ }
-+
-+ /*
-+ * It's not enough that it's not actively running,
-+ * it must be off the runqueue _entirely_, and not
-+ * preempted!
-+ *
-+ * So if it was still runnable (but just not actively
-+ * running right now), it's preempted, and we should
-+ * yield - it could be a while.
-+ */
-+ if (unlikely(on_rq)) {
-+ ktime_t to = NSEC_PER_SEC / HZ;
-+
-+ set_current_state(TASK_UNINTERRUPTIBLE);
-+ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
-+ continue;
-+ }
-+
-+ /*
-+ * Ahh, all good. It wasn't running, and it wasn't
-+ * runnable, which means that it will never become
-+ * running in the future either. We're all done!
-+ */
-+ break;
-+ }
-+
-+ return ncsw;
-+}
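-+
-+/*
-+ * Usage sketch (illustrative; do_work() is a hypothetical caller step):
-+ * sample the switch count twice and compare, as described above:
-+ *
-+ *	unsigned long ncsw = wait_task_inactive(p, TASK_UNINTERRUPTIBLE);
-+ *	do_work();
-+ *	if (ncsw && wait_task_inactive(p, TASK_UNINTERRUPTIBLE) == ncsw)
-+ *		pr_debug("%d stayed unscheduled\n", task_pid_nr(p));
-+ */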
-+
-+/***
-+ * kick_process - kick a running thread to enter/exit the kernel
-+ * @p: the to-be-kicked thread
-+ *
-+ * Cause a process which is running on another CPU to enter
-+ * kernel-mode, without any delay. (to get signals handled.)
-+ *
-+ * NOTE: this function doesn't have to take the runqueue lock,
-+ * because all it wants to ensure is that the remote task enters
-+ * the kernel. If the IPI races and the task has been migrated
-+ * to another CPU then no harm is done and the purpose has been
-+ * achieved as well.
-+ */
-+void kick_process(struct task_struct *p)
-+{
-+ int cpu;
-+
-+ preempt_disable();
-+ cpu = task_cpu(p);
-+ if ((cpu != smp_processor_id()) && task_curr(p))
-+ smp_send_reschedule(cpu);
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(kick_process);
-+
-+/*
-+ * ->cpus_ptr is protected by both rq->lock and p->pi_lock
-+ *
-+ * A few notes on cpu_active vs cpu_online:
-+ *
-+ * - cpu_active must be a subset of cpu_online
-+ *
-+ * - on CPU-up we allow per-CPU kthreads on the online && !active CPU,
-+ * see __set_cpus_allowed_ptr(). At this point the newly online
-+ * CPU isn't yet part of the sched domains, and balancing will not
-+ * see it.
-+ *
-+ * - on cpu-down we clear cpu_active() to mask the sched domains and
-+ * avoid the load balancer to place new tasks on the to be removed
-+ * CPU. Existing tasks will remain running there and will be taken
-+ * off.
-+ *
-+ * This means that fallback selection must not select !active CPUs.
-+ * And can assume that any active CPU must be online. Conversely
-+ * select_task_rq() below may allow selection of !active CPUs in order
-+ * to satisfy the above rules.
-+ */
-+static int select_fallback_rq(int cpu, struct task_struct *p)
-+{
-+ int nid = cpu_to_node(cpu);
-+ const struct cpumask *nodemask = NULL;
-+ enum { cpuset, possible, fail } state = cpuset;
-+ int dest_cpu;
-+
-+ /*
-+ * If the node that the CPU is on has been offlined, cpu_to_node()
-+ * will return -1. There is no CPU on the node, and we should
-+ * select the CPU on the other node.
-+ */
-+ if (nid != -1) {
-+ nodemask = cpumask_of_node(nid);
-+
-+ /* Look for allowed, online CPU in same node. */
-+ for_each_cpu(dest_cpu, nodemask) {
-+ if (is_cpu_allowed(p, dest_cpu))
-+ return dest_cpu;
-+ }
-+ }
-+
-+ for (;;) {
-+ /* Any allowed, online CPU? */
-+ for_each_cpu(dest_cpu, p->cpus_ptr) {
-+ if (!is_cpu_allowed(p, dest_cpu))
-+ continue;
-+ goto out;
-+ }
-+
-+ /* No more Mr. Nice Guy. */
-+ switch (state) {
-+ case cpuset:
-+ if (cpuset_cpus_allowed_fallback(p)) {
-+ state = possible;
-+ break;
-+ }
-+ fallthrough;
-+ case possible:
-+ /*
-+ * XXX When called from select_task_rq() we only
-+ * hold p->pi_lock and again violate locking order.
-+ *
-+ * More yuck to audit.
-+ */
-+ do_set_cpus_allowed(p, task_cpu_possible_mask(p));
-+ state = fail;
-+ break;
-+
-+ case fail:
-+ BUG();
-+ break;
-+ }
-+ }
-+
-+out:
-+ if (state != cpuset) {
-+ /*
-+ * Don't tell them about moving exiting tasks or
-+ * kernel threads (both mm NULL), since they never
-+ * leave kernel.
-+ */
-+ if (p->mm && printk_ratelimit()) {
-+ printk_deferred("process %d (%s) no longer affine to cpu%d\n",
-+ task_pid_nr(p), p->comm, cpu);
-+ }
-+ }
-+
-+ return dest_cpu;
-+}
-+
-+static inline void
-+sched_preempt_mask_flush(cpumask_t *mask, int prio)
-+{
-+ int cpu;
-+
-+ cpumask_copy(mask, sched_idle_mask);
-+
-+ for_each_clear_bit(cpu, cpumask_bits(mask), nr_cpumask_bits) {
-+ if (prio < cpu_rq(cpu)->prio)
-+ cpumask_set_cpu(cpu, mask);
-+ }
-+}
-+
-+static inline int
-+preempt_mask_check(struct task_struct *p, cpumask_t *allow_mask, cpumask_t *preempt_mask)
-+{
-+ int task_prio = task_sched_prio(p);
-+ cpumask_t *mask = sched_preempt_mask + SCHED_QUEUE_BITS - 1 - task_prio;
-+ int pr = atomic_read(&sched_prio_record);
-+
-+ if (pr != task_prio) {
-+ sched_preempt_mask_flush(mask, task_prio);
-+ atomic_set(&sched_prio_record, task_prio);
-+ }
-+
-+ return cpumask_and(preempt_mask, allow_mask, mask);
-+}
-+
-+static inline int select_task_rq(struct task_struct *p)
-+{
-+ cpumask_t allow_mask, mask;
-+
-+ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
-+ return select_fallback_rq(task_cpu(p), p);
-+
-+ if (
-+#ifdef CONFIG_SCHED_SMT
-+ cpumask_and(&mask, &allow_mask, &sched_sg_idle_mask) ||
-+#endif
-+ cpumask_and(&mask, &allow_mask, sched_idle_mask) ||
-+ preempt_mask_check(p, &allow_mask, &mask))
-+ return best_mask_cpu(task_cpu(p), &mask);
-+
-+ return best_mask_cpu(task_cpu(p), &allow_mask);
-+}
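-+
-+/*
-+ * Selection order sketch (illustrative summary of the cascade above): an
-+ * idle SMT group first (when CONFIG_SCHED_SMT), then any fully idle CPU,
-+ * then a CPU running a lower-priority task (preempt_mask_check()), and
-+ * finally any allowed CPU, always preferring one topologically close to
-+ * task_cpu(p) via best_mask_cpu(). E.g. if CPU1 is idle and CPU3 only runs
-+ * a lower-priority task, a task waking on CPU3 is placed on CPU1.
-+ */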
-+
-+void sched_set_stop_task(int cpu, struct task_struct *stop)
-+{
-+ static struct lock_class_key stop_pi_lock;
-+ struct sched_param stop_param = { .sched_priority = STOP_PRIO };
-+ struct sched_param start_param = { .sched_priority = 0 };
-+ struct task_struct *old_stop = cpu_rq(cpu)->stop;
-+
-+ if (stop) {
-+ /*
-+ * Make it appear like a SCHED_FIFO task, it's something
-+ * userspace knows about and won't get confused about.
-+ *
-+ * Also, it will make PI more or less work without too
-+ * much confusion -- but then, stop work should not
-+ * rely on PI working anyway.
-+ */
-+ sched_setscheduler_nocheck(stop, SCHED_FIFO, &stop_param);
-+
-+ /*
-+ * The PI code calls rt_mutex_setprio() with ->pi_lock held to
-+ * adjust the effective priority of a task. As a result,
-+ * rt_mutex_setprio() can trigger (RT) balancing operations,
-+ * which can then trigger wakeups of the stop thread to push
-+ * around the current task.
-+ *
-+ * The stop task itself will never be part of the PI-chain, it
-+ * never blocks, therefore that ->pi_lock recursion is safe.
-+ * Tell lockdep about this by placing the stop->pi_lock in its
-+ * own class.
-+ */
-+ lockdep_set_class(&stop->pi_lock, &stop_pi_lock);
-+ }
-+
-+ cpu_rq(cpu)->stop = stop;
-+
-+ if (old_stop) {
-+ /*
-+ * Reset it back to a normal scheduling policy so that
-+ * it can die in pieces.
-+ */
-+ sched_setscheduler_nocheck(old_stop, SCHED_NORMAL, &start_param);
-+ }
-+}
-+
-+static int affine_move_task(struct rq *rq, struct task_struct *p, int dest_cpu,
-+ raw_spinlock_t *lock, unsigned long irq_flags)
-+ __releases(rq->lock)
-+ __releases(p->pi_lock)
-+{
-+ /* Can the task run on the task's current CPU? If so, we're done */
-+ if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
-+ if (p->migration_disabled) {
-+ if (likely(p->cpus_ptr != &p->cpus_mask))
-+ __do_set_cpus_ptr(p, &p->cpus_mask);
-+ p->migration_disabled = 0;
-+ p->migration_flags |= MDF_FORCE_ENABLED;
-+ /* When p is migrate_disabled, rq->lock should be held */
-+ rq->nr_pinned--;
-+ }
-+
-+ if (task_on_cpu(p) || READ_ONCE(p->__state) == TASK_WAKING) {
-+ struct migration_arg arg = { p, dest_cpu };
-+
-+ /* Need help from migration thread: drop lock and wait. */
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
-+ return 0;
-+ }
-+ if (task_on_rq_queued(p)) {
-+ /*
-+ * OK, since we're going to drop the lock immediately
-+ * afterwards anyway.
-+ */
-+ update_rq_clock(rq);
-+ rq = move_queued_task(rq, p, dest_cpu);
-+ lock = &rq->lock;
-+ }
-+ }
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ return 0;
-+}
-+
-+static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
-+ struct affinity_context *ctx,
-+ struct rq *rq,
-+ raw_spinlock_t *lock,
-+ unsigned long irq_flags)
-+{
-+ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
-+ const struct cpumask *cpu_valid_mask = cpu_active_mask;
-+ bool kthread = p->flags & PF_KTHREAD;
-+ int dest_cpu;
-+ int ret = 0;
-+
-+ if (kthread || is_migration_disabled(p)) {
-+ /*
-+ * Kernel threads are allowed on online && !active CPUs,
-+ * however, during cpu-hot-unplug, even these might get pushed
-+ * away if not KTHREAD_IS_PER_CPU.
-+ *
-+ * Specifically, migration_disabled() tasks must not fail the
-+ * cpumask_any_and_distribute() pick below, esp. so on
-+ * SCA_MIGRATE_ENABLE, otherwise we'll not call
-+ * set_cpus_allowed_common() and actually reset p->cpus_ptr.
-+ */
-+ cpu_valid_mask = cpu_online_mask;
-+ }
-+
-+ if (!kthread && !cpumask_subset(ctx->new_mask, cpu_allowed_mask)) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ /*
-+ * Must re-check here, to close a race against __kthread_bind(),
-+ * sched_setaffinity() is not guaranteed to observe the flag.
-+ */
-+ if ((ctx->flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ if (cpumask_equal(&p->cpus_mask, ctx->new_mask))
-+ goto out;
-+
-+ dest_cpu = cpumask_any_and(cpu_valid_mask, ctx->new_mask);
-+ if (dest_cpu >= nr_cpu_ids) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ __do_set_cpus_allowed(p, ctx);
-+
-+ return affine_move_task(rq, p, dest_cpu, lock, irq_flags);
-+
-+out:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+
-+ return ret;
-+}
-+
-+/*
-+ * Change a given task's CPU affinity. Migrate the thread to a
-+ * proper CPU and schedule it away if the CPU it's executing on
-+ * is removed from the allowed bitmask.
-+ *
-+ * NOTE: the caller must have a valid reference to the task, the
-+ * task must not exit() & deallocate itself prematurely. The
-+ * call is not atomic; no spinlocks may be held.
-+ */
-+static int __set_cpus_allowed_ptr(struct task_struct *p,
-+ struct affinity_context *ctx)
-+{
-+ unsigned long irq_flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
-+ rq = __task_access_lock(p, &lock);
-+ /*
-+ * Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
-+ * flags are set.
-+ */
-+ if (p->user_cpus_ptr &&
-+ !(ctx->flags & SCA_USER) &&
-+ cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr))
-+ ctx->new_mask = rq->scratch_mask;
-+
-+ return __set_cpus_allowed_ptr_locked(p, ctx, rq, lock, irq_flags);
-+}
-+
-+int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ struct affinity_context ac = {
-+ .new_mask = new_mask,
-+ .flags = 0,
-+ };
-+
-+ return __set_cpus_allowed_ptr(p, &ac);
-+}
-+EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
-+
-+/*
-+ * Change a given task's CPU affinity to the intersection of its current
-+ * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
-+ * If user_cpus_ptr is defined, use it as the basis for restricting CPU
-+ * affinity or use cpu_online_mask instead.
-+ *
-+ * If the resulting mask is empty, leave the affinity unchanged and return
-+ * -EINVAL.
-+ */
-+static int restrict_cpus_allowed_ptr(struct task_struct *p,
-+ struct cpumask *new_mask,
-+ const struct cpumask *subset_mask)
-+{
-+ struct affinity_context ac = {
-+ .new_mask = new_mask,
-+ .flags = 0,
-+ };
-+ unsigned long irq_flags;
-+ raw_spinlock_t *lock;
-+ struct rq *rq;
-+ int err;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
-+ rq = __task_access_lock(p, &lock);
-+
-+ if (!cpumask_and(new_mask, task_user_cpus(p), subset_mask)) {
-+ err = -EINVAL;
-+ goto err_unlock;
-+ }
-+
-+ return __set_cpus_allowed_ptr_locked(p, &ac, rq, lock, irq_flags);
-+
-+err_unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ return err;
-+}
-+
-+/*
-+ * Restrict the CPU affinity of task @p so that it is a subset of
-+ * task_cpu_possible_mask() and point @p->user_cpus_ptr to a copy of the
-+ * old affinity mask. If the resulting mask is empty, we warn and walk
-+ * up the cpuset hierarchy until we find a suitable mask.
-+ */
-+void force_compatible_cpus_allowed_ptr(struct task_struct *p)
-+{
-+ cpumask_var_t new_mask;
-+ const struct cpumask *override_mask = task_cpu_possible_mask(p);
-+
-+ alloc_cpumask_var(&new_mask, GFP_KERNEL);
-+
-+ /*
-+ * __migrate_task() can fail silently in the face of concurrent
-+ * offlining of the chosen destination CPU, so take the hotplug
-+ * lock to ensure that the migration succeeds.
-+ */
-+ cpus_read_lock();
-+ if (!cpumask_available(new_mask))
-+ goto out_set_mask;
-+
-+ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
-+ goto out_free_mask;
-+
-+ /*
-+ * We failed to find a valid subset of the affinity mask for the
-+ * task, so override it based on its cpuset hierarchy.
-+ */
-+ cpuset_cpus_allowed(p, new_mask);
-+ override_mask = new_mask;
-+
-+out_set_mask:
-+ if (printk_ratelimit()) {
-+ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
-+ task_pid_nr(p), p->comm,
-+ cpumask_pr_args(override_mask));
-+ }
-+
-+ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
-+out_free_mask:
-+ cpus_read_unlock();
-+ free_cpumask_var(new_mask);
-+}
-+
-+static int
-+__sched_setaffinity(struct task_struct *p, struct affinity_context *ctx);
-+
-+/*
-+ * Restore the affinity of a task @p which was previously restricted by a
-+ * call to force_compatible_cpus_allowed_ptr().
-+ *
-+ * It is the caller's responsibility to serialise this with any calls to
-+ * force_compatible_cpus_allowed_ptr(@p).
-+ */
-+void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
-+{
-+ struct affinity_context ac = {
-+ .new_mask = task_user_cpus(p),
-+ .flags = 0,
-+ };
-+ int ret;
-+
-+ /*
-+ * Try to restore the old affinity mask with __sched_setaffinity().
-+ * Cpuset masking will be done there too.
-+ */
-+ ret = __sched_setaffinity(p, &ac);
-+ WARN_ON_ONCE(ret);
-+}
-+
-+#else /* CONFIG_SMP */
-+
-+static inline int select_task_rq(struct task_struct *p)
-+{
-+ return 0;
-+}
-+
-+static inline int
-+__set_cpus_allowed_ptr(struct task_struct *p,
-+ struct affinity_context *ctx)
-+{
-+ return set_cpus_allowed_ptr(p, ctx->new_mask);
-+}
-+
-+static inline bool rq_has_pinned_tasks(struct rq *rq)
-+{
-+ return false;
-+}
-+
-+static inline cpumask_t *alloc_user_cpus_ptr(int node)
-+{
-+ return NULL;
-+}
-+
-+#endif /* !CONFIG_SMP */
-+
-+static void
-+ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq;
-+
-+ if (!schedstat_enabled())
-+ return;
-+
-+ rq = this_rq();
-+
-+#ifdef CONFIG_SMP
-+ if (cpu == rq->cpu) {
-+ __schedstat_inc(rq->ttwu_local);
-+ __schedstat_inc(p->stats.nr_wakeups_local);
-+ } else {
-+ /** Alt schedule FW ToDo:
-+ * How to do ttwu_wake_remote
-+ */
-+ }
-+#endif /* CONFIG_SMP */
-+
-+ __schedstat_inc(rq->ttwu_count);
-+ __schedstat_inc(p->stats.nr_wakeups);
-+}
-+
-+/*
-+ * Mark the task runnable.
-+ */
-+static inline void ttwu_do_wakeup(struct task_struct *p)
-+{
-+ WRITE_ONCE(p->__state, TASK_RUNNING);
-+ trace_sched_wakeup(p);
-+}
-+
-+static inline void
-+ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags)
-+{
-+ if (p->sched_contributes_to_load)
-+ rq->nr_uninterruptible--;
-+
-+ if (
-+#ifdef CONFIG_SMP
-+ !(wake_flags & WF_MIGRATED) &&
-+#endif
-+ p->in_iowait) {
-+ delayacct_blkio_end(p);
-+ atomic_dec(&task_rq(p)->nr_iowait);
-+ }
-+
-+ activate_task(p, rq);
-+ check_preempt_curr(rq);
-+
-+ ttwu_do_wakeup(p);
-+}
-+
-+/*
-+ * Consider @p being inside a wait loop:
-+ *
-+ * for (;;) {
-+ * set_current_state(TASK_UNINTERRUPTIBLE);
-+ *
-+ * if (CONDITION)
-+ * break;
-+ *
-+ * schedule();
-+ * }
-+ * __set_current_state(TASK_RUNNING);
-+ *
-+ * between set_current_state() and schedule(). In this case @p is still
-+ * runnable, so all that needs doing is change p->state back to TASK_RUNNING in
-+ * an atomic manner.
-+ *
-+ * By taking task_rq(p)->lock we serialize against schedule(), if @p->on_rq
-+ * then schedule() must still happen and p->state can be changed to
-+ * TASK_RUNNING. Otherwise we lost the race, schedule() has happened, and we
-+ * need to do a full wakeup with enqueue.
-+ *
-+ * Returns: %true when the wakeup is done,
-+ * %false otherwise.
-+ */
-+static int ttwu_runnable(struct task_struct *p, int wake_flags)
-+{
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+ int ret = 0;
-+
-+ rq = __task_access_lock(p, &lock);
-+ if (task_on_rq_queued(p)) {
-+ if (!task_on_cpu(p)) {
-+ /*
-+ * When on_rq && !on_cpu the task is preempted, see if
-+ * it should preempt the task that is current now.
-+ */
-+ update_rq_clock(rq);
-+ check_preempt_curr(rq);
-+ }
-+ ttwu_do_wakeup(p);
-+ ret = 1;
-+ }
-+ __task_access_unlock(p, lock);
-+
-+ return ret;
-+}
-+
-+#ifdef CONFIG_SMP
-+void sched_ttwu_pending(void *arg)
-+{
-+ struct llist_node *llist = arg;
-+ struct rq *rq = this_rq();
-+ struct task_struct *p, *t;
-+ struct rq_flags rf;
-+
-+ if (!llist)
-+ return;
-+
-+ rq_lock_irqsave(rq, &rf);
-+ update_rq_clock(rq);
-+
-+ llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
-+ if (WARN_ON_ONCE(p->on_cpu))
-+ smp_cond_load_acquire(&p->on_cpu, !VAL);
-+
-+ if (WARN_ON_ONCE(task_cpu(p) != cpu_of(rq)))
-+ set_task_cpu(p, cpu_of(rq));
-+
-+ ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0);
-+ }
-+
-+ /*
-+ * Must be after enqueueing at least one task such that
-+ * idle_cpu() does not observe a false-negative -- if it does,
-+ * it is possible for select_idle_siblings() to stack a number
-+ * of tasks on this CPU during that window.
-+ *
-+ * It is ok to clear ttwu_pending when another task is pending.
-+ * We will receive an IPI after local irq is enabled and then enqueue it.
-+ * Since nr_running > 0 now, idle_cpu() will always get the correct result.
-+ */
-+ WRITE_ONCE(rq->ttwu_pending, 0);
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+/*
-+ * Prepare the scene for sending an IPI for a remote smp_call
-+ *
-+ * Returns true if the caller can proceed with sending the IPI.
-+ * Returns false otherwise.
-+ */
-+bool call_function_single_prep_ipi(int cpu)
-+{
-+ if (set_nr_if_polling(cpu_rq(cpu)->idle)) {
-+ trace_sched_wake_idle_without_ipi(cpu);
-+ return false;
-+ }
-+
-+ return true;
-+}
-+
-+/*
-+ * Queue a task on the target CPUs wake_list and wake the CPU via IPI if
-+ * necessary. The wakee CPU on receipt of the IPI will queue the task
-+ * via sched_ttwu_wakeup() for activation so the wakee incurs the cost
-+ * of the wakeup instead of the waker.
-+ */
-+static void __ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
-+
-+ WRITE_ONCE(rq->ttwu_pending, 1);
-+ __smp_call_single_queue(cpu, &p->wake_entry.llist);
-+}
-+
-+static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
-+{
-+ /*
-+ * Do not complicate things with the async wake_list while the CPU is
-+ * in hotplug state.
-+ */
-+ if (!cpu_active(cpu))
-+ return false;
-+
-+ /* Ensure the task will still be allowed to run on the CPU. */
-+ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
-+ return false;
-+
-+ /*
-+ * If the CPU does not share cache, then queue the task on the
-+ * remote rqs wakelist to avoid accessing remote data.
-+ */
-+ if (!cpus_share_cache(smp_processor_id(), cpu))
-+ return true;
-+
-+ if (cpu == smp_processor_id())
-+ return false;
-+
-+ /*
-+ * If the wakee cpu is idle, or the task is descheduling and the
-+ * only running task on the CPU, then use the wakelist to offload
-+ * the task activation to the idle (or soon-to-be-idle) CPU as
-+ * the current CPU is likely busy. nr_running is checked to
-+ * avoid unnecessary task stacking.
-+ *
-+ * Note that we can only get here with (wakee) p->on_rq=0,
-+ * p->on_cpu can be whatever, we've done the dequeue, so
-+ * the wakee has been accounted out of ->nr_running.
-+ */
-+ if (!cpu_rq(cpu)->nr_running)
-+ return true;
-+
-+ return false;
-+}
-+
-+static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ if (__is_defined(ALT_SCHED_TTWU_QUEUE) && ttwu_queue_cond(p, cpu)) {
-+ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
-+ __ttwu_queue_wakelist(p, cpu, wake_flags);
-+ return true;
-+ }
-+
-+ return false;
-+}
-+
-+void wake_up_if_idle(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ rcu_read_lock();
-+
-+ if (!is_idle_task(rcu_dereference(rq->curr)))
-+ goto out;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (is_idle_task(rq->curr))
-+ resched_curr(rq);
-+ /* Else CPU is not idle, do nothing here */
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+out:
-+ rcu_read_unlock();
-+}
-+
-+bool cpus_share_cache(int this_cpu, int that_cpu)
-+{
-+ if (this_cpu == that_cpu)
-+ return true;
-+
-+ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
-+}
-+#else /* !CONFIG_SMP */
-+
-+static inline bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ return false;
-+}
-+
-+#endif /* CONFIG_SMP */
-+
-+static inline void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (ttwu_queue_wakelist(p, cpu, wake_flags))
-+ return;
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+ ttwu_do_activate(rq, p, wake_flags);
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+/*
-+ * Invoked from try_to_wake_up() to check whether the task can be woken up.
-+ *
-+ * The caller holds p::pi_lock if p != current or has preemption
-+ * disabled when p == current.
-+ *
-+ * The rules of PREEMPT_RT saved_state:
-+ *
-+ * The related locking code always holds p::pi_lock when updating
-+ * p::saved_state, which means the code is fully serialized in both cases.
-+ *
-+ * The lock wait and lock wakeups happen via TASK_RTLOCK_WAIT. No other
-+ * bits set. This allows to distinguish all wakeup scenarios.
-+ */
-+static __always_inline
-+bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
-+{
-+ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
-+ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
-+ state != TASK_RTLOCK_WAIT);
-+ }
-+
-+ if (READ_ONCE(p->__state) & state) {
-+ *success = 1;
-+ return true;
-+ }
-+
-+#ifdef CONFIG_PREEMPT_RT
-+ /*
-+ * Saved state preserves the task state across blocking on
-+ * an RT lock. If the state matches, set p::saved_state to
-+ * TASK_RUNNING, but do not wake the task because it waits
-+ * for a lock wakeup. Also indicate success because from
-+ * the regular waker's point of view this has succeeded.
-+ *
-+ * After acquiring the lock the task will restore p::__state
-+ * from p::saved_state which ensures that the regular
-+ * wakeup is not lost. The restore will also set
-+ * p::saved_state to TASK_RUNNING so any further tests will
-+ * not result in false positives vs. @success
-+ */
-+ if (p->saved_state & state) {
-+ p->saved_state = TASK_RUNNING;
-+ *success = 1;
-+ }
-+#endif
-+ return false;
-+}
-+
-+/*
-+ * Notes on Program-Order guarantees on SMP systems.
-+ *
-+ * MIGRATION
-+ *
-+ * The basic program-order guarantee on SMP systems is that when a task [t]
-+ * migrates, all its activity on its old CPU [c0] happens-before any subsequent
-+ * execution on its new CPU [c1].
-+ *
-+ * For migration (of runnable tasks) this is provided by the following means:
-+ *
-+ * A) UNLOCK of the rq(c0)->lock scheduling out task t
-+ * B) migration for t is required to synchronize *both* rq(c0)->lock and
-+ * rq(c1)->lock (if not at the same time, then in that order).
-+ * C) LOCK of the rq(c1)->lock scheduling in task
-+ *
-+ * Transitivity guarantees that B happens after A and C after B.
-+ * Note: we only require RCpc transitivity.
-+ * Note: the CPU doing B need not be c0 or c1
-+ *
-+ * Example:
-+ *
-+ * CPU0 CPU1 CPU2
-+ *
-+ * LOCK rq(0)->lock
-+ * sched-out X
-+ * sched-in Y
-+ * UNLOCK rq(0)->lock
-+ *
-+ * LOCK rq(0)->lock // orders against CPU0
-+ * dequeue X
-+ * UNLOCK rq(0)->lock
-+ *
-+ * LOCK rq(1)->lock
-+ * enqueue X
-+ * UNLOCK rq(1)->lock
-+ *
-+ * LOCK rq(1)->lock // orders against CPU2
-+ * sched-out Z
-+ * sched-in X
-+ * UNLOCK rq(1)->lock
-+ *
-+ *
-+ * BLOCKING -- aka. SLEEP + WAKEUP
-+ *
-+ * For blocking we (obviously) need to provide the same guarantee as for
-+ * migration. However the means are completely different as there is no lock
-+ * chain to provide order. Instead we do:
-+ *
-+ * 1) smp_store_release(X->on_cpu, 0) -- finish_task()
-+ * 2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
-+ *
-+ * Example:
-+ *
-+ * CPU0 (schedule) CPU1 (try_to_wake_up) CPU2 (schedule)
-+ *
-+ * LOCK rq(0)->lock LOCK X->pi_lock
-+ * dequeue X
-+ * sched-out X
-+ * smp_store_release(X->on_cpu, 0);
-+ *
-+ * smp_cond_load_acquire(&X->on_cpu, !VAL);
-+ * X->state = WAKING
-+ * set_task_cpu(X,2)
-+ *
-+ * LOCK rq(2)->lock
-+ * enqueue X
-+ * X->state = RUNNING
-+ * UNLOCK rq(2)->lock
-+ *
-+ * LOCK rq(2)->lock // orders against CPU1
-+ * sched-out Z
-+ * sched-in X
-+ * UNLOCK rq(2)->lock
-+ *
-+ * UNLOCK X->pi_lock
-+ * UNLOCK rq(0)->lock
-+ *
-+ *
-+ * However; for wakeups there is a second guarantee we must provide, namely we
-+ * must observe the state that led to our wakeup. That is, not only must our
-+ * task observe its own prior state, it must also observe the stores prior to
-+ * its wakeup.
-+ *
-+ * This means that any means of doing remote wakeups must order the CPU doing
-+ * the wakeup against the CPU the task is going to end up running on. This,
-+ * however, is already required for the regular Program-Order guarantee above,
-+ * since the waking CPU is the one issuing the ACQUIRE (smp_cond_load_acquire).
-+ *
-+ */
-+
-+/**
-+ * try_to_wake_up - wake up a thread
-+ * @p: the thread to be awakened
-+ * @state: the mask of task states that can be woken
-+ * @wake_flags: wake modifier flags (WF_*)
-+ *
-+ * Conceptually does:
-+ *
-+ * If (@state & @p->state) @p->state = TASK_RUNNING.
-+ *
-+ * If the task was not queued/runnable, also place it back on a runqueue.
-+ *
-+ * This function is atomic against schedule() which would dequeue the task.
-+ *
-+ * It issues a full memory barrier before accessing @p->state, see the comment
-+ * with set_current_state().
-+ *
-+ * Uses p->pi_lock to serialize against concurrent wake-ups.
-+ *
-+ * Relies on p->pi_lock stabilizing:
-+ * - p->sched_class
-+ * - p->cpus_ptr
-+ * - p->sched_task_group
-+ * in order to do migration, see its use of select_task_rq()/set_task_cpu().
-+ *
-+ * Tries really hard to only take one task_rq(p)->lock for performance.
-+ * Takes rq->lock in:
-+ * - ttwu_runnable() -- old rq, unavoidable, see comment there;
-+ * - ttwu_queue() -- new rq, for enqueue of the task;
-+ * - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
-+ *
-+ * As a consequence we race really badly with just about everything. See the
-+ * many memory barriers and their comments for details.
-+ *
-+ * Return: %true if @p->state changes (an actual wakeup was done),
-+ * %false otherwise.
-+ */
-+static int try_to_wake_up(struct task_struct *p, unsigned int state,
-+ int wake_flags)
-+{
-+ unsigned long flags;
-+ int cpu, success = 0;
-+
-+ preempt_disable();
-+ if (p == current) {
-+ /*
-+ * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
-+ * == smp_processor_id()'. Together this means we can special
-+ * case the whole 'p->on_rq && ttwu_runnable()' case below
-+ * without taking any locks.
-+ *
-+ * In particular:
-+ * - we rely on Program-Order guarantees for all the ordering,
-+ * - we're serialized against set_special_state() by virtue of
-+ * it disabling IRQs (this allows not taking ->pi_lock).
-+ */
-+ if (!ttwu_state_match(p, state, &success))
-+ goto out;
-+
-+ trace_sched_waking(p);
-+ ttwu_do_wakeup(p);
-+ goto out;
-+ }
-+
-+ /*
-+ * If we are going to wake up a thread waiting for CONDITION we
-+ * need to ensure that CONDITION=1 done by the caller can not be
-+ * reordered with p->state check below. This pairs with smp_store_mb()
-+ * in set_current_state() that the waiting thread does.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ smp_mb__after_spinlock();
-+ if (!ttwu_state_match(p, state, &success))
-+ goto unlock;
-+
-+ trace_sched_waking(p);
-+
-+ /*
-+ * Ensure we load p->on_rq _after_ p->state, otherwise it would
-+ * be possible to, falsely, observe p->on_rq == 0 and get stuck
-+ * in smp_cond_load_acquire() below.
-+ *
-+ * sched_ttwu_pending() try_to_wake_up()
-+ * STORE p->on_rq = 1 LOAD p->state
-+ * UNLOCK rq->lock
-+ *
-+ * __schedule() (switch to task 'p')
-+ * LOCK rq->lock smp_rmb();
-+ * smp_mb__after_spinlock();
-+ * UNLOCK rq->lock
-+ *
-+ * [task p]
-+ * STORE p->state = UNINTERRUPTIBLE LOAD p->on_rq
-+ *
-+ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
-+ * __schedule(). See the comment for smp_mb__after_spinlock().
-+ *
-+ * A similar smb_rmb() lives in try_invoke_on_locked_down_task().
-+ */
-+ smp_rmb();
-+ if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
-+ goto unlock;
-+
-+#ifdef CONFIG_SMP
-+ /*
-+ * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
-+ * possible to, falsely, observe p->on_cpu == 0.
-+ *
-+ * One must be running (->on_cpu == 1) in order to remove oneself
-+ * from the runqueue.
-+ *
-+ * __schedule() (switch to task 'p') try_to_wake_up()
-+ * STORE p->on_cpu = 1 LOAD p->on_rq
-+ * UNLOCK rq->lock
-+ *
-+ * __schedule() (put 'p' to sleep)
-+ * LOCK rq->lock smp_rmb();
-+ * smp_mb__after_spinlock();
-+ * STORE p->on_rq = 0 LOAD p->on_cpu
-+ *
-+ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
-+ * __schedule(). See the comment for smp_mb__after_spinlock().
-+ *
-+ * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
-+ * schedule()'s deactivate_task() has 'happened' and p will no longer
-+ * care about it's own p->state. See the comment in __schedule().
-+ */
-+ smp_acquire__after_ctrl_dep();
-+
-+ /*
-+ * We're doing the wakeup (@success == 1), they did a dequeue (p->on_rq
-+ * == 0), which means we need to do an enqueue, change p->state to
-+ * TASK_WAKING such that we can unlock p->pi_lock before doing the
-+ * enqueue, such as ttwu_queue_wakelist().
-+ */
-+ WRITE_ONCE(p->__state, TASK_WAKING);
-+
-+ /*
-+ * If the owning (remote) CPU is still in the middle of schedule() with
-+ * this task as prev, consider queueing p on the remote CPU's wake_list
-+ * which potentially sends an IPI instead of spinning on p->on_cpu to
-+ * let the waker make forward progress. This is safe because IRQs are
-+ * disabled and the IPI will deliver after on_cpu is cleared.
-+ *
-+ * Ensure we load task_cpu(p) after p->on_cpu:
-+ *
-+ * set_task_cpu(p, cpu);
-+ * STORE p->cpu = @cpu
-+ * __schedule() (switch to task 'p')
-+ * LOCK rq->lock
-+ * smp_mb__after_spin_lock() smp_cond_load_acquire(&p->on_cpu)
-+ * STORE p->on_cpu = 1 LOAD p->cpu
-+ *
-+ * to ensure we observe the correct CPU on which the task is currently
-+ * scheduling.
-+ */
-+ if (smp_load_acquire(&p->on_cpu) &&
-+ ttwu_queue_wakelist(p, task_cpu(p), wake_flags))
-+ goto unlock;
-+
-+ /*
-+ * If the owning (remote) CPU is still in the middle of schedule() with
-+ * this task as prev, wait until it's done referencing the task.
-+ *
-+ * Pairs with the smp_store_release() in finish_task().
-+ *
-+ * This ensures that tasks getting woken will be fully ordered against
-+ * their previous state and preserve Program Order.
-+ */
-+ smp_cond_load_acquire(&p->on_cpu, !VAL);
-+
-+ sched_task_ttwu(p);
-+
-+ cpu = select_task_rq(p);
-+
-+ if (cpu != task_cpu(p)) {
-+ if (p->in_iowait) {
-+ delayacct_blkio_end(p);
-+ atomic_dec(&task_rq(p)->nr_iowait);
-+ }
-+
-+ wake_flags |= WF_MIGRATED;
-+ set_task_cpu(p, cpu);
-+ }
-+#else
-+ cpu = task_cpu(p);
-+#endif /* CONFIG_SMP */
-+
-+ ttwu_queue(p, cpu, wake_flags);
-+unlock:
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+out:
-+ if (success)
-+ ttwu_stat(p, task_cpu(p), wake_flags);
-+ preempt_enable();
-+
-+ return success;
-+}
-+
-+static bool __task_needs_rq_lock(struct task_struct *p)
-+{
-+ unsigned int state = READ_ONCE(p->__state);
-+
-+ /*
-+ * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
-+ * the task is blocked. Make sure to check @state since ttwu() can drop
-+ * locks at the end, see ttwu_queue_wakelist().
-+ */
-+ if (state == TASK_RUNNING || state == TASK_WAKING)
-+ return true;
-+
-+ /*
-+ * Ensure we load p->on_rq after p->__state, otherwise it would be
-+ * possible to, falsely, observe p->on_rq == 0.
-+ *
-+ * See try_to_wake_up() for a longer comment.
-+ */
-+ smp_rmb();
-+ if (p->on_rq)
-+ return true;
-+
-+#ifdef CONFIG_SMP
-+ /*
-+ * Ensure the task has finished __schedule() and will not be referenced
-+ * anymore. Again, see try_to_wake_up() for a longer comment.
-+ */
-+ smp_rmb();
-+ smp_cond_load_acquire(&p->on_cpu, !VAL);
-+#endif
-+
-+ return false;
-+}
-+
-+/**
-+ * task_call_func - Invoke a function on task in fixed state
-+ * @p: Process for which the function is to be invoked, can be @current.
-+ * @func: Function to invoke.
-+ * @arg: Argument to function.
-+ *
-+ * Fix the task in it's current state by avoiding wakeups and or rq operations
-+ * and call @func(@arg) on it. This function can use ->on_rq and task_curr()
-+ * to work out what the state is, if required. Given that @func can be invoked
-+ * with a runqueue lock held, it had better be quite lightweight.
-+ *
-+ * Returns:
-+ * Whatever @func returns
-+ */
-+int task_call_func(struct task_struct *p, task_call_f func, void *arg)
-+{
-+ struct rq *rq = NULL;
-+ struct rq_flags rf;
-+ int ret;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
-+
-+ if (__task_needs_rq_lock(p))
-+ rq = __task_rq_lock(p, &rf);
-+
-+ /*
-+ * At this point the task is pinned; either:
-+ * - blocked and we're holding off wakeups (pi->lock)
-+ * - woken, and we're holding off enqueue (rq->lock)
-+ * - queued, and we're holding off schedule (rq->lock)
-+ * - running, and we're holding off de-schedule (rq->lock)
-+ *
-+ * The called function (@func) can use: task_curr(), p->on_rq and
-+ * p->__state to differentiate between these states.
-+ */
-+ ret = func(p, arg);
-+
-+ if (rq)
-+ __task_rq_unlock(rq, &rf);
-+
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
-+ return ret;
-+}
-+
-+/**
-+ * cpu_curr_snapshot - Return a snapshot of the currently running task
-+ * @cpu: The CPU on which to snapshot the task.
-+ *
-+ * Returns the task_struct pointer of the task "currently" running on
-+ * the specified CPU. If the same task is running on that CPU throughout,
-+ * the return value will be a pointer to that task's task_struct structure.
-+ * If the CPU did any context switches even vaguely concurrently with the
-+ * execution of this function, the return value will be a pointer to the
-+ * task_struct structure of a randomly chosen task that was running on
-+ * that CPU somewhere around the time that this function was executing.
-+ *
-+ * If the specified CPU was offline, the return value is whatever it
-+ * is, perhaps a pointer to the task_struct structure of that CPU's idle
-+ * task, but there is no guarantee. Callers wishing a useful return
-+ * value must take some action to ensure that the specified CPU remains
-+ * online throughout.
-+ *
-+ * This function executes full memory barriers before and after fetching
-+ * the pointer, which permits the caller to confine this function's fetch
-+ * with respect to the caller's accesses to other shared variables.
-+ */
-+struct task_struct *cpu_curr_snapshot(int cpu)
-+{
-+ struct task_struct *t;
-+
-+ smp_mb(); /* Pairing determined by caller's synchronization design. */
-+ t = rcu_dereference(cpu_curr(cpu));
-+ smp_mb(); /* Pairing determined by caller's synchronization design. */
-+ return t;
-+}
-+
-+/**
-+ * wake_up_process - Wake up a specific process
-+ * @p: The process to be woken up.
-+ *
-+ * Attempt to wake up the nominated process and move it to the set of runnable
-+ * processes.
-+ *
-+ * Return: 1 if the process was woken up, 0 if it was already running.
-+ *
-+ * This function executes a full memory barrier before accessing the task state.
-+ */
-+int wake_up_process(struct task_struct *p)
-+{
-+ return try_to_wake_up(p, TASK_NORMAL, 0);
-+}
-+EXPORT_SYMBOL(wake_up_process);
-+
-+int wake_up_state(struct task_struct *p, unsigned int state)
-+{
-+ return try_to_wake_up(p, state, 0);
-+}
-+
-+/*
-+ * Perform scheduler related setup for a newly forked process p.
-+ * p is forked by current.
-+ *
-+ * __sched_fork() is basic setup used by init_idle() too:
-+ */
-+static inline void __sched_fork(unsigned long clone_flags, struct task_struct *p)
-+{
-+ p->on_rq = 0;
-+ p->on_cpu = 0;
-+ p->utime = 0;
-+ p->stime = 0;
-+ p->sched_time = 0;
-+
-+#ifdef CONFIG_SCHEDSTATS
-+ /* Even if schedstat is disabled, there should not be garbage */
-+ memset(&p->stats, 0, sizeof(p->stats));
-+#endif
-+
-+#ifdef CONFIG_PREEMPT_NOTIFIERS
-+ INIT_HLIST_HEAD(&p->preempt_notifiers);
-+#endif
-+
-+#ifdef CONFIG_COMPACTION
-+ p->capture_control = NULL;
-+#endif
-+#ifdef CONFIG_SMP
-+ p->wake_entry.u_flags = CSD_TYPE_TTWU;
-+#endif
-+ init_sched_mm_cid(p);
-+}
-+
-+/*
-+ * fork()/clone()-time setup:
-+ */
-+int sched_fork(unsigned long clone_flags, struct task_struct *p)
-+{
-+ __sched_fork(clone_flags, p);
-+ /*
-+ * We mark the process as NEW here. This guarantees that
-+ * nobody will actually run it, and a signal or other external
-+ * event cannot wake it up and insert it on the runqueue either.
-+ */
-+ p->__state = TASK_NEW;
-+
-+ /*
-+ * Make sure we do not leak PI boosting priority to the child.
-+ */
-+ p->prio = current->normal_prio;
-+
-+ /*
-+ * Revert to default priority/policy on fork if requested.
-+ */
-+ if (unlikely(p->sched_reset_on_fork)) {
-+ if (task_has_rt_policy(p)) {
-+ p->policy = SCHED_NORMAL;
-+ p->static_prio = NICE_TO_PRIO(0);
-+ p->rt_priority = 0;
-+ } else if (PRIO_TO_NICE(p->static_prio) < 0)
-+ p->static_prio = NICE_TO_PRIO(0);
-+
-+ p->prio = p->normal_prio = p->static_prio;
-+
-+ /*
-+ * We don't need the reset flag anymore after the fork. It has
-+ * fulfilled its duty:
-+ */
-+ p->sched_reset_on_fork = 0;
-+ }
-+
-+#ifdef CONFIG_SCHED_INFO
-+ if (unlikely(sched_info_on()))
-+ memset(&p->sched_info, 0, sizeof(p->sched_info));
-+#endif
-+ init_task_preempt_count(p);
-+
-+ return 0;
-+}
-+
-+void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+
-+ /*
-+ * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
-+ * required yet, but lockdep gets upset if rules are violated.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ /*
-+ * Share the timeslice between parent and child, thus the
-+ * total amount of pending timeslices in the system doesn't change,
-+ * resulting in more scheduling fairness.
-+ */
-+ rq = this_rq();
-+ raw_spin_lock(&rq->lock);
-+
-+ rq->curr->time_slice /= 2;
-+ p->time_slice = rq->curr->time_slice;
-+#ifdef CONFIG_SCHED_HRTICK
-+ hrtick_start(rq, rq->curr->time_slice);
-+#endif
-+
-+ if (p->time_slice < RESCHED_NS) {
-+ p->time_slice = sched_timeslice_ns;
-+ resched_curr(rq);
-+ }
-+ sched_task_fork(p, rq);
-+ raw_spin_unlock(&rq->lock);
-+
-+ rseq_migrate(p);
-+ /*
-+ * We're setting the CPU for the first time, we don't migrate,
-+ * so use __set_task_cpu().
-+ */
-+ __set_task_cpu(p, smp_processor_id());
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+
-+void sched_post_fork(struct task_struct *p)
-+{
-+}
-+
-+#ifdef CONFIG_SCHEDSTATS
-+
-+DEFINE_STATIC_KEY_FALSE(sched_schedstats);
-+
-+static void set_schedstats(bool enabled)
-+{
-+ if (enabled)
-+ static_branch_enable(&sched_schedstats);
-+ else
-+ static_branch_disable(&sched_schedstats);
-+}
-+
-+void force_schedstat_enabled(void)
-+{
-+ if (!schedstat_enabled()) {
-+ pr_info("kernel profiling enabled schedstats, disable via kernel.sched_schedstats.\n");
-+ static_branch_enable(&sched_schedstats);
-+ }
-+}
-+
-+static int __init setup_schedstats(char *str)
-+{
-+ int ret = 0;
-+ if (!str)
-+ goto out;
-+
-+ if (!strcmp(str, "enable")) {
-+ set_schedstats(true);
-+ ret = 1;
-+ } else if (!strcmp(str, "disable")) {
-+ set_schedstats(false);
-+ ret = 1;
-+ }
-+out:
-+ if (!ret)
-+ pr_warn("Unable to parse schedstats=\n");
-+
-+ return ret;
-+}
-+__setup("schedstats=", setup_schedstats);
-+
-+#ifdef CONFIG_PROC_SYSCTL
-+static int sysctl_schedstats(struct ctl_table *table, int write, void *buffer,
-+ size_t *lenp, loff_t *ppos)
-+{
-+ struct ctl_table t;
-+ int err;
-+ int state = static_branch_likely(&sched_schedstats);
-+
-+ if (write && !capable(CAP_SYS_ADMIN))
-+ return -EPERM;
-+
-+ t = *table;
-+ t.data = &state;
-+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
-+ if (err < 0)
-+ return err;
-+ if (write)
-+ set_schedstats(state);
-+ return err;
-+}
-+
-+static struct ctl_table sched_core_sysctls[] = {
-+ {
-+ .procname = "sched_schedstats",
-+ .data = NULL,
-+ .maxlen = sizeof(unsigned int),
-+ .mode = 0644,
-+ .proc_handler = sysctl_schedstats,
-+ .extra1 = SYSCTL_ZERO,
-+ .extra2 = SYSCTL_ONE,
-+ },
-+ {}
-+};
-+static int __init sched_core_sysctl_init(void)
-+{
-+ register_sysctl_init("kernel", sched_core_sysctls);
-+ return 0;
-+}
-+late_initcall(sched_core_sysctl_init);
-+#endif /* CONFIG_PROC_SYSCTL */
-+#endif /* CONFIG_SCHEDSTATS */
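For reference, the static key above is what the stock schedstat helpers from kernel/sched/stats.h test; it can be flipped at boot with schedstats=enable (parsed by setup_schedstats() above) or at runtime through the kernel.sched_schedstats sysctl registered above. A minimal consumer-side sketch (illustrative only):

static inline void account_sched_event(struct rq *rq)
{
	/* schedstat_inc() is already a no-op while the key is off; an explicit
	 * schedstat_enabled() check is only worthwhile around costlier work. */
	if (schedstat_enabled())
		schedstat_inc(rq->sched_count);
}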
-+
-+/*
-+ * wake_up_new_task - wake up a newly created task for the first time.
-+ *
-+ * This function will do some initial scheduler statistics housekeeping
-+ * that must be done for every newly created context, then puts the task
-+ * on the runqueue and wakes it.
-+ */
-+void wake_up_new_task(struct task_struct *p)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ WRITE_ONCE(p->__state, TASK_RUNNING);
-+ rq = cpu_rq(select_task_rq(p));
-+#ifdef CONFIG_SMP
-+ rseq_migrate(p);
-+ /*
-+ * Fork balancing, do it here and not earlier because:
-+ * - cpus_ptr can change in the fork path
-+ * - any previously selected CPU might disappear through hotplug
-+ *
-+ * Use __set_task_cpu() to avoid calling sched_class::migrate_task_rq,
-+ * as we're not fully set-up yet.
-+ */
-+ __set_task_cpu(p, cpu_of(rq));
-+#endif
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+
-+ activate_task(p, rq);
-+ trace_sched_wakeup_new(p);
-+ check_preempt_curr(rq);
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+
-+#ifdef CONFIG_PREEMPT_NOTIFIERS
-+
-+static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);
-+
-+void preempt_notifier_inc(void)
-+{
-+ static_branch_inc(&preempt_notifier_key);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_inc);
-+
-+void preempt_notifier_dec(void)
-+{
-+ static_branch_dec(&preempt_notifier_key);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_dec);
-+
-+/**
-+ * preempt_notifier_register - tell me when current is being preempted & rescheduled
-+ * @notifier: notifier struct to register
-+ */
-+void preempt_notifier_register(struct preempt_notifier *notifier)
-+{
-+ if (!static_branch_unlikely(&preempt_notifier_key))
-+ WARN(1, "registering preempt_notifier while notifiers disabled\n");
-+
-+ hlist_add_head(&notifier->link, &current->preempt_notifiers);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_register);
-+
-+/**
-+ * preempt_notifier_unregister - no longer interested in preemption notifications
-+ * @notifier: notifier struct to unregister
-+ *
-+ * This is *not* safe to call from within a preemption notifier.
-+ */
-+void preempt_notifier_unregister(struct preempt_notifier *notifier)
-+{
-+ hlist_del(&notifier->link);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_unregister);
-+
-+static void __fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+ struct preempt_notifier *notifier;
-+
-+ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
-+ notifier->ops->sched_in(notifier, raw_smp_processor_id());
-+}
-+
-+static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+ if (static_branch_unlikely(&preempt_notifier_key))
-+ __fire_sched_in_preempt_notifiers(curr);
-+}
-+
-+static void
-+__fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+ struct preempt_notifier *notifier;
-+
-+ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
-+ notifier->ops->sched_out(notifier, next);
-+}
-+
-+static __always_inline void
-+fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+ if (static_branch_unlikely(&preempt_notifier_key))
-+ __fire_sched_out_preempt_notifiers(curr, next);
-+}
-+
-+#else /* !CONFIG_PREEMPT_NOTIFIERS */
-+
-+static inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+}
-+
-+static inline void
-+fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+}
-+
-+#endif /* CONFIG_PREEMPT_NOTIFIERS */
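For orientation, a sketch of the consumer side of the preempt notifier API above (hypothetical module code, not part of the patch; preempt_notifier_init() and struct preempt_notifier_ops are assumed from <linux/preempt.h>, KVM being the usual in-tree user):

#include <linux/preempt.h>

static void my_sched_in(struct preempt_notifier *pn, int cpu)
{
	/* current has just been scheduled back in on @cpu */
}

static void my_sched_out(struct preempt_notifier *pn, struct task_struct *next)
{
	/* current is about to be switched out in favour of @next */
}

static struct preempt_notifier_ops my_ops = {
	.sched_in	= my_sched_in,
	.sched_out	= my_sched_out,
};

static struct preempt_notifier my_notifier;

static void my_attach_to_current(void)
{
	preempt_notifier_inc();				/* enable the static key */
	preempt_notifier_init(&my_notifier, &my_ops);
	preempt_notifier_register(&my_notifier);	/* hooks the current task only */
}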
-+
-+static inline void prepare_task(struct task_struct *next)
-+{
-+ /*
-+ * Claim the task as running; we do this before switching to it
-+ * such that any running task will have this set.
-+ *
-+ * See the smp_load_acquire(&p->on_cpu) case in ttwu() and
-+ * its ordering comment.
-+ */
-+ WRITE_ONCE(next->on_cpu, 1);
-+}
-+
-+static inline void finish_task(struct task_struct *prev)
-+{
-+#ifdef CONFIG_SMP
-+ /*
-+ * This must be the very last reference to @prev from this CPU. After
-+ * p->on_cpu is cleared, the task can be moved to a different CPU. We
-+ * must ensure this doesn't happen until the switch is completely
-+ * finished.
-+ *
-+ * In particular, the load of prev->state in finish_task_switch() must
-+ * happen before this.
-+ *
-+ * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
-+ */
-+ smp_store_release(&prev->on_cpu, 0);
-+#else
-+ prev->on_cpu = 0;
-+#endif
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+static void do_balance_callbacks(struct rq *rq, struct balance_callback *head)
-+{
-+ void (*func)(struct rq *rq);
-+ struct balance_callback *next;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ while (head) {
-+ func = (void (*)(struct rq *))head->func;
-+ next = head->next;
-+ head->next = NULL;
-+ head = next;
-+
-+ func(rq);
-+ }
-+}
-+
-+static void balance_push(struct rq *rq);
-+
-+/*
-+ * balance_push_callback is a right abuse of the callback interface and plays
-+ * by significantly different rules.
-+ *
-+ * Where the normal balance_callback's purpose is to be run in the same context
-+ * that queued it (only later, when it's safe to drop rq->lock again),
-+ * balance_push_callback is specifically targeted at __schedule().
-+ *
-+ * This abuse is tolerated because it places all the unlikely/odd cases behind
-+ * a single test, namely: rq->balance_callback == NULL.
-+ */
-+struct balance_callback balance_push_callback = {
-+ .next = NULL,
-+ .func = balance_push,
-+};
-+
-+static inline struct balance_callback *
-+__splice_balance_callbacks(struct rq *rq, bool split)
-+{
-+ struct balance_callback *head = rq->balance_callback;
-+
-+ if (likely(!head))
-+ return NULL;
-+
-+ lockdep_assert_rq_held(rq);
-+ /*
-+ * Must not take balance_push_callback off the list when
-+ * splice_balance_callbacks() and balance_callbacks() are not
-+ * in the same rq->lock section.
-+ *
-+ * In that case it would be possible for __schedule() to interleave
-+ * and observe the list empty.
-+ */
-+ if (split && head == &balance_push_callback)
-+ head = NULL;
-+ else
-+ rq->balance_callback = NULL;
-+
-+ return head;
-+}
-+
-+static inline struct balance_callback *splice_balance_callbacks(struct rq *rq)
-+{
-+ return __splice_balance_callbacks(rq, true);
-+}
-+
-+static void __balance_callbacks(struct rq *rq)
-+{
-+ do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
-+}
-+
-+static inline void balance_callbacks(struct rq *rq, struct balance_callback *head)
-+{
-+ unsigned long flags;
-+
-+ if (unlikely(head)) {
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ do_balance_callbacks(rq, head);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+ }
-+}
-+
-+#else
-+
-+static inline void __balance_callbacks(struct rq *rq)
-+{
-+}
-+
-+static inline struct balance_callback *splice_balance_callbacks(struct rq *rq)
-+{
-+ return NULL;
-+}
-+
-+static inline void balance_callbacks(struct rq *rq, struct balance_callback *head)
-+{
-+}
-+
-+#endif
-+
-+static inline void
-+prepare_lock_switch(struct rq *rq, struct task_struct *next)
-+{
-+ /*
-+ * Since the runqueue lock will be released by the next
-+ * task (which is an invalid locking op but in the case
-+ * of the scheduler it's an obvious special-case), so we
-+ * do an early lockdep release here:
-+ */
-+ spin_release(&rq->lock.dep_map, _THIS_IP_);
-+#ifdef CONFIG_DEBUG_SPINLOCK
-+ /* this is a valid case when another task releases the spinlock */
-+ rq->lock.owner = next;
-+#endif
-+}
-+
-+static inline void finish_lock_switch(struct rq *rq)
-+{
-+ /*
-+ * If we are tracking spinlock dependencies then we have to
-+ * fix up the runqueue lock - which gets 'carried over' from
-+ * prev into current:
-+ */
-+ spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
-+ __balance_callbacks(rq);
-+ raw_spin_unlock_irq(&rq->lock);
-+}
-+
-+/*
-+ * NOP if the arch has not defined these:
-+ */
-+
-+#ifndef prepare_arch_switch
-+# define prepare_arch_switch(next) do { } while (0)
-+#endif
-+
-+#ifndef finish_arch_post_lock_switch
-+# define finish_arch_post_lock_switch() do { } while (0)
-+#endif
-+
-+static inline void kmap_local_sched_out(void)
-+{
-+#ifdef CONFIG_KMAP_LOCAL
-+ if (unlikely(current->kmap_ctrl.idx))
-+ __kmap_local_sched_out();
-+#endif
-+}
-+
-+static inline void kmap_local_sched_in(void)
-+{
-+#ifdef CONFIG_KMAP_LOCAL
-+ if (unlikely(current->kmap_ctrl.idx))
-+ __kmap_local_sched_in();
-+#endif
-+}
-+
-+/**
-+ * prepare_task_switch - prepare to switch tasks
-+ * @rq: the runqueue preparing to switch
-+ * @next: the task we are going to switch to.
-+ *
-+ * This is called with the rq lock held and interrupts off. It must
-+ * be paired with a subsequent finish_task_switch after the context
-+ * switch.
-+ *
-+ * prepare_task_switch sets up locking and calls architecture specific
-+ * hooks.
-+ */
-+static inline void
-+prepare_task_switch(struct rq *rq, struct task_struct *prev,
-+ struct task_struct *next)
-+{
-+ kcov_prepare_switch(prev);
-+ sched_info_switch(rq, prev, next);
-+ perf_event_task_sched_out(prev, next);
-+ rseq_preempt(prev);
-+ fire_sched_out_preempt_notifiers(prev, next);
-+ kmap_local_sched_out();
-+ prepare_task(next);
-+ prepare_arch_switch(next);
-+}
-+
-+/**
-+ * finish_task_switch - clean up after a task-switch
-+ * @rq: runqueue associated with task-switch
-+ * @prev: the thread we just switched away from.
-+ *
-+ * finish_task_switch must be called after the context switch, paired
-+ * with a prepare_task_switch call before the context switch.
-+ * finish_task_switch will reconcile locking set up by prepare_task_switch,
-+ * and do any other architecture-specific cleanup actions.
-+ *
-+ * Note that we may have delayed dropping an mm in context_switch(). If
-+ * so, we finish that here outside of the runqueue lock. (Doing it
-+ * with the lock held can cause deadlocks; see schedule() for
-+ * details.)
-+ *
-+ * The context switch has flipped the stack from under us and restored the
-+ * local variables which were saved when this task called schedule() in the
-+ * past. prev == current is still correct but we need to recalculate this_rq
-+ * because prev may have moved to another CPU.
-+ */
-+static struct rq *finish_task_switch(struct task_struct *prev)
-+ __releases(rq->lock)
-+{
-+ struct rq *rq = this_rq();
-+ struct mm_struct *mm = rq->prev_mm;
-+ unsigned int prev_state;
-+
-+ /*
-+ * The previous task will have left us with a preempt_count of 2
-+ * because it left us after:
-+ *
-+ * schedule()
-+ * preempt_disable(); // 1
-+ * __schedule()
-+ * raw_spin_lock_irq(&rq->lock) // 2
-+ *
-+ * Also, see FORK_PREEMPT_COUNT.
-+ */
-+ if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
-+ "corrupted preempt_count: %s/%d/0x%x\n",
-+ current->comm, current->pid, preempt_count()))
-+ preempt_count_set(FORK_PREEMPT_COUNT);
-+
-+ rq->prev_mm = NULL;
-+
-+ /*
-+ * A task struct has one reference for the use as "current".
-+ * If a task dies, then it sets TASK_DEAD in tsk->state and calls
-+ * schedule one last time. The schedule call will never return, and
-+ * the scheduled task must drop that reference.
-+ *
-+ * We must observe prev->state before clearing prev->on_cpu (in
-+ * finish_task), otherwise a concurrent wakeup can get prev
-+ * running on another CPU and we could race with its RUNNING -> DEAD
-+ * transition, resulting in a double drop.
-+ */
-+ prev_state = READ_ONCE(prev->__state);
-+ vtime_task_switch(prev);
-+ perf_event_task_sched_in(prev, current);
-+ finish_task(prev);
-+ tick_nohz_task_switch();
-+ finish_lock_switch(rq);
-+ finish_arch_post_lock_switch();
-+ kcov_finish_switch(current);
-+ /*
-+ * kmap_local_sched_out() is invoked with rq::lock held and
-+ * interrupts disabled. There is no requirement for that, but the
-+ * sched out code does not have an interrupt enabled section.
-+ * Restoring the maps on sched in does not require interrupts being
-+ * disabled either.
-+ */
-+ kmap_local_sched_in();
-+
-+ fire_sched_in_preempt_notifiers(current);
-+ /*
-+ * When switching through a kernel thread, the loop in
-+ * membarrier_{private,global}_expedited() may have observed that
-+ * kernel thread and not issued an IPI. It is therefore possible to
-+ * schedule between user->kernel->user threads without passing through
-+ * switch_mm(). Membarrier requires a barrier after storing to
-+ * rq->curr, before returning to userspace, so provide them here:
-+ *
-+ * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
-+ * provided by mmdrop(),
-+ * - a sync_core for SYNC_CORE.
-+ */
-+ if (mm) {
-+ membarrier_mm_sync_core_before_usermode(mm);
-+ mmdrop_sched(mm);
-+ }
-+ if (unlikely(prev_state == TASK_DEAD)) {
-+ /* Task is done with its stack. */
-+ put_task_stack(prev);
-+
-+ put_task_struct_rcu_user(prev);
-+ }
-+
-+ return rq;
-+}
-+
-+/**
-+ * schedule_tail - first thing a freshly forked thread must call.
-+ * @prev: the thread we just switched away from.
-+ */
-+asmlinkage __visible void schedule_tail(struct task_struct *prev)
-+ __releases(rq->lock)
-+{
-+ /*
-+ * New tasks start with FORK_PREEMPT_COUNT, see there and
-+ * finish_task_switch() for details.
-+ *
-+ * finish_task_switch() will drop rq->lock() and lower preempt_count
-+ * and the preempt_enable() will end up enabling preemption (on
-+ * PREEMPT_COUNT kernels).
-+ */
-+
-+ finish_task_switch(prev);
-+ preempt_enable();
-+
-+ if (current->set_child_tid)
-+ put_user(task_pid_vnr(current), current->set_child_tid);
-+
-+ calculate_sigpending();
-+}
-+
-+/*
-+ * context_switch - switch to the new MM and the new thread's register state.
-+ */
-+static __always_inline struct rq *
-+context_switch(struct rq *rq, struct task_struct *prev,
-+ struct task_struct *next)
-+{
-+ prepare_task_switch(rq, prev, next);
-+
-+ /*
-+ * For paravirt, this is coupled with an exit in switch_to to
-+ * combine the page table reload and the switch backend into
-+ * one hypercall.
-+ */
-+ arch_start_context_switch(prev);
-+
-+ /*
-+ * kernel -> kernel lazy + transfer active
-+ * user -> kernel lazy + mmgrab() active
-+ *
-+ * kernel -> user switch + mmdrop() active
-+ * user -> user switch
-+ *
-+ * switch_mm_cid() needs to be updated if the barriers provided
-+ * by context_switch() are modified.
-+ */
-+ if (!next->mm) { // to kernel
-+ enter_lazy_tlb(prev->active_mm, next);
-+
-+ next->active_mm = prev->active_mm;
-+ if (prev->mm) // from user
-+ mmgrab(prev->active_mm);
-+ else
-+ prev->active_mm = NULL;
-+ } else { // to user
-+ membarrier_switch_mm(rq, prev->active_mm, next->mm);
-+ /*
-+ * sys_membarrier() requires an smp_mb() between setting
-+ * rq->curr / membarrier_switch_mm() and returning to userspace.
-+ *
-+ * The below provides this either through switch_mm(), or in
-+ * case 'prev->active_mm == next->mm' through
-+ * finish_task_switch()'s mmdrop().
-+ */
-+ switch_mm_irqs_off(prev->active_mm, next->mm, next);
-+ lru_gen_use_mm(next->mm);
-+
-+ if (!prev->mm) { // from kernel
-+ /* will mmdrop() in finish_task_switch(). */
-+ rq->prev_mm = prev->active_mm;
-+ prev->active_mm = NULL;
-+ }
-+ }
-+
-+ /* switch_mm_cid() requires the memory barriers above. */
-+ switch_mm_cid(rq, prev, next);
-+
-+ prepare_lock_switch(rq, next);
-+
-+ /* Here we just switch the register state and the stack. */
-+ switch_to(prev, next, prev);
-+ barrier();
-+
-+ return finish_task_switch(prev);
-+}
-+
-+/*
-+ * nr_running, nr_uninterruptible and nr_context_switches:
-+ *
-+ * externally visible scheduler statistics: current number of runnable
-+ * threads, total number of context switches performed since bootup.
-+ */
-+unsigned int nr_running(void)
-+{
-+ unsigned int i, sum = 0;
-+
-+ for_each_online_cpu(i)
-+ sum += cpu_rq(i)->nr_running;
-+
-+ return sum;
-+}
-+
-+/*
-+ * Check if only the current task is running on the CPU.
-+ *
-+ * Caution: this function does not check that the caller has disabled
-+ * preemption, thus the result might have a time-of-check-to-time-of-use
-+ * race. The caller is responsible for using it correctly, for example:
-+ *
-+ * - from a non-preemptible section (of course)
-+ *
-+ * - from a thread that is bound to a single CPU
-+ *
-+ * - in a loop with very short iterations (e.g. a polling loop)
-+ */
-+bool single_task_running(void)
-+{
-+ return raw_rq()->nr_running == 1;
-+}
-+EXPORT_SYMBOL(single_task_running);
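To make the polling-loop case from the comment above concrete, a hedged sketch (hypothetical helper; KVM's halt polling is the sort of in-tree user this is aimed at):

/* Spin only while nothing else is runnable on this CPU and a caller-supplied
 * deadline has not passed. */
static bool poll_for_event(bool (*event_pending)(void *), void *arg, u64 timeout_ns)
{
	u64 start = ktime_get_ns();

	do {
		if (event_pending(arg))
			return true;
		cpu_relax();
	} while (single_task_running() &&
		 ktime_get_ns() - start < timeout_ns);

	return event_pending(arg);
}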
-+
-+unsigned long long nr_context_switches_cpu(int cpu)
-+{
-+ return cpu_rq(cpu)->nr_switches;
-+}
-+
-+unsigned long long nr_context_switches(void)
-+{
-+ int i;
-+ unsigned long long sum = 0;
-+
-+ for_each_possible_cpu(i)
-+ sum += cpu_rq(i)->nr_switches;
-+
-+ return sum;
-+}
-+
-+/*
-+ * Consumers of these two interfaces, such as the cpuidle menu governor, are
-+ * using nonsensical data: they prefer shallow idle state selection for a CPU
-+ * that has IO-wait, even though the blocked task might not even end up running
-+ * on that CPU when it does become runnable.
-+ */
-+
-+unsigned int nr_iowait_cpu(int cpu)
-+{
-+ return atomic_read(&cpu_rq(cpu)->nr_iowait);
-+}
-+
-+/*
-+ * IO-wait accounting, and how it's mostly bollocks (on SMP).
-+ *
-+ * The idea behind IO-wait account is to account the idle time that we could
-+ * have spent running if it were not for IO. That is, if we were to improve the
-+ * storage performance, we'd have a proportional reduction in IO-wait time.
-+ *
-+ * This all works nicely on UP, where, when a task blocks on IO, we account
-+ * idle time as IO-wait, because if the storage were faster, it could've been
-+ * running and we'd not be idle.
-+ *
-+ * This has been extended to SMP, by doing the same for each CPU. This however
-+ * is broken.
-+ *
-+ * Imagine for instance the case where two tasks block on one CPU: only that
-+ * one CPU will have IO-wait accounted, while the other has regular idle. Even
-+ * though, if the storage were faster, both could've run at the same time,
-+ * utilising both CPUs.
-+ *
-+ * This means that, when looking globally, the current IO-wait accounting on
-+ * SMP is a lower bound, due to under-accounting.
-+ *
-+ * Worse, since the numbers are provided per CPU, they are sometimes
-+ * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
-+ * associated with any one particular CPU; it can wake up on a different CPU
-+ * than the one it blocked on. This means the per-CPU IO-wait number is
-+ * meaningless.
-+ *
-+ * Task CPU affinities can make all that even more 'interesting'.
-+ */
-+
-+unsigned int nr_iowait(void)
-+{
-+ unsigned int i, sum = 0;
-+
-+ for_each_possible_cpu(i)
-+ sum += nr_iowait_cpu(i);
-+
-+ return sum;
-+}
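To illustrate the point that only the aggregate IO-wait number carries meaning, a small userspace sketch (assumes the conventional /proc/stat layout; illustration only):

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[512];
	unsigned long long user, nice, sys, idle, iowait, total_iowait = 0;
	FILE *f = fopen("/proc/stat", "r");

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		/* per-CPU lines look like "cpuN user nice system idle iowait ..." */
		if (strncmp(line, "cpu", 3) == 0 && line[3] != ' ' &&
		    sscanf(line, "%*s %llu %llu %llu %llu %llu",
			   &user, &nice, &sys, &idle, &iowait) == 5)
			total_iowait += iowait;		/* only the sum is meaningful */
	}
	fclose(f);
	printf("aggregate iowait ticks: %llu\n", total_iowait);
	return 0;
}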
-+
-+#ifdef CONFIG_SMP
-+
-+/*
-+ * sched_exec - execve() is a valuable balancing opportunity, because at
-+ * this point the task has the smallest effective memory and cache
-+ * footprint.
-+ */
-+void sched_exec(void)
-+{
-+}
-+
-+#endif
-+
-+DEFINE_PER_CPU(struct kernel_stat, kstat);
-+DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
-+
-+EXPORT_PER_CPU_SYMBOL(kstat);
-+EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
-+
-+static inline void update_curr(struct rq *rq, struct task_struct *p)
-+{
-+ s64 ns = rq->clock_task - p->last_ran;
-+
-+ p->sched_time += ns;
-+ cgroup_account_cputime(p, ns);
-+ account_group_exec_runtime(p, ns);
-+
-+ p->time_slice -= ns;
-+ p->last_ran = rq->clock_task;
-+}
-+
-+/*
-+ * Return accounted runtime for the task.
-+ * Return separately the current's pending runtime that has not been
-+ * accounted yet.
-+ */
-+unsigned long long task_sched_runtime(struct task_struct *p)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+ u64 ns;
-+
-+#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
-+ /*
-+ * 64-bit doesn't need locks to atomically read a 64-bit value.
-+ * So we have an optimization chance when the task's delta_exec is 0.
-+ * Reading ->on_cpu is racy, but this is ok.
-+ *
-+ * If we race with it leaving CPU, we'll take a lock. So we're correct.
-+ * If we race with it entering CPU, unaccounted time is 0. This is
-+ * indistinguishable from the read occurring a few cycles earlier.
-+ * If we see ->on_cpu without ->on_rq, the task is leaving, and has
-+ * been accounted, so we're correct here as well.
-+ */
-+ if (!p->on_cpu || !task_on_rq_queued(p))
-+ return tsk_seruntime(p);
-+#endif
-+
-+ rq = task_access_lock_irqsave(p, &lock, &flags);
-+ /*
-+ * Must be ->curr _and_ ->on_rq. If dequeued, we would
-+ * project cycles that may never be accounted to this
-+ * thread, breaking clock_gettime().
-+ */
-+ if (p == rq->curr && task_on_rq_queued(p)) {
-+ update_rq_clock(rq);
-+ update_curr(rq, p);
-+ }
-+ ns = tsk_seruntime(p);
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+ return ns;
-+}
-+
-+/* This manages tasks that have run out of timeslice during a scheduler_tick */
-+static inline void scheduler_task_tick(struct rq *rq)
-+{
-+ struct task_struct *p = rq->curr;
-+
-+ if (is_idle_task(p))
-+ return;
-+
-+ update_curr(rq, p);
-+ cpufreq_update_util(rq, 0);
-+
-+ /*
-+ * Tasks with less than RESCHED_NS of time slice left will be
-+ * rescheduled.
-+ */
-+ if (p->time_slice >= RESCHED_NS)
-+ return;
-+ set_tsk_need_resched(p);
-+ set_preempt_need_resched();
-+}
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+static u64 cpu_resched_latency(struct rq *rq)
-+{
-+ int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
-+ u64 resched_latency, now = rq_clock(rq);
-+ static bool warned_once;
-+
-+ if (sysctl_resched_latency_warn_once && warned_once)
-+ return 0;
-+
-+ if (!need_resched() || !latency_warn_ms)
-+ return 0;
-+
-+ if (system_state == SYSTEM_BOOTING)
-+ return 0;
-+
-+ if (!rq->last_seen_need_resched_ns) {
-+ rq->last_seen_need_resched_ns = now;
-+ rq->ticks_without_resched = 0;
-+ return 0;
-+ }
-+
-+ rq->ticks_without_resched++;
-+ resched_latency = now - rq->last_seen_need_resched_ns;
-+ if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
-+ return 0;
-+
-+ warned_once = true;
-+
-+ return resched_latency;
-+}
-+
-+static int __init setup_resched_latency_warn_ms(char *str)
-+{
-+ long val;
-+
-+ if ((kstrtol(str, 0, &val))) {
-+ pr_warn("Unable to set resched_latency_warn_ms\n");
-+ return 1;
-+ }
-+
-+ sysctl_resched_latency_warn_ms = val;
-+ return 1;
-+}
-+__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
-+#else
-+static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
-+#endif /* CONFIG_SCHED_DEBUG */
-+
-+/*
-+ * This function gets called by the timer code, with HZ frequency.
-+ * We call it with interrupts disabled.
-+ */
-+void scheduler_tick(void)
-+{
-+ int cpu __maybe_unused = smp_processor_id();
-+ struct rq *rq = cpu_rq(cpu);
-+ u64 resched_latency;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
-+ arch_scale_freq_tick();
-+
-+ sched_clock_tick();
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+
-+ scheduler_task_tick(rq);
-+ if (sched_feat(LATENCY_WARN))
-+ resched_latency = cpu_resched_latency(rq);
-+ calc_global_load_tick(rq);
-+
-+ task_tick_mm_cid(rq, rq->curr);
-+
-+ rq->last_tick = rq->clock;
-+ raw_spin_unlock(&rq->lock);
-+
-+ if (sched_feat(LATENCY_WARN) && resched_latency)
-+ resched_latency_warn(cpu, resched_latency);
-+
-+ perf_event_task_tick();
-+}
-+
-+#ifdef CONFIG_SCHED_SMT
-+static inline int sg_balance_cpu_stop(void *data)
-+{
-+ struct rq *rq = this_rq();
-+ struct task_struct *p = data;
-+ cpumask_t tmp;
-+ unsigned long flags;
-+
-+ local_irq_save(flags);
-+
-+ raw_spin_lock(&p->pi_lock);
-+ raw_spin_lock(&rq->lock);
-+
-+ rq->active_balance = 0;
-+ /* _something_ may have changed the task, double check again */
-+ if (task_on_rq_queued(p) && task_rq(p) == rq &&
-+ cpumask_and(&tmp, p->cpus_ptr, &sched_sg_idle_mask) &&
-+ !is_migration_disabled(p)) {
-+ int cpu = cpu_of(rq);
-+ int dcpu = __best_mask_cpu(&tmp, per_cpu(sched_cpu_llc_mask, cpu));
-+ rq = move_queued_task(rq, p, dcpu);
-+ }
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock(&p->pi_lock);
-+
-+ local_irq_restore(flags);
-+
-+ return 0;
-+}
-+
-+/* sg_balance_trigger - trigger sibling group balance for @cpu */
-+static inline int sg_balance_trigger(const int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+ struct task_struct *curr;
-+ int res;
-+
-+ if (!raw_spin_trylock_irqsave(&rq->lock, flags))
-+ return 0;
-+ curr = rq->curr;
-+ res = (!is_idle_task(curr)) && (1 == rq->nr_running) &&\
-+ cpumask_intersects(curr->cpus_ptr, &sched_sg_idle_mask) &&\
-+ !is_migration_disabled(curr) && (!rq->active_balance);
-+
-+ if (res)
-+ rq->active_balance = 1;
-+
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ if (res)
-+ stop_one_cpu_nowait(cpu, sg_balance_cpu_stop, curr,
-+ &rq->active_balance_work);
-+ return res;
-+}
-+
-+/*
-+ * sg_balance - sibling group balance check for run queue @rq
-+ */
-+static inline void sg_balance(struct rq *rq, int cpu)
-+{
-+ cpumask_t chk;
-+
-+ /* exit when cpu is offline */
-+ if (unlikely(!rq->online))
-+ return;
-+
-+ /*
-+ * Only a cpu in the sibling idle group will do the checking and then
-+ * find potential cpus which can migrate the currently running task
-+ */
-+ if (cpumask_test_cpu(cpu, &sched_sg_idle_mask) &&
-+ cpumask_andnot(&chk, cpu_online_mask, sched_idle_mask) &&
-+ cpumask_andnot(&chk, &chk, &sched_rq_pending_mask)) {
-+ int i;
-+
-+ for_each_cpu_wrap(i, &chk, cpu) {
-+ if (!cpumask_intersects(cpu_smt_mask(i), sched_idle_mask) &&\
-+ sg_balance_trigger(i))
-+ return;
-+ }
-+ }
-+}
-+#endif /* CONFIG_SCHED_SMT */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+
-+struct tick_work {
-+ int cpu;
-+ atomic_t state;
-+ struct delayed_work work;
-+};
-+/* Values for ->state, see diagram below. */
-+#define TICK_SCHED_REMOTE_OFFLINE 0
-+#define TICK_SCHED_REMOTE_OFFLINING 1
-+#define TICK_SCHED_REMOTE_RUNNING 2
-+
-+/*
-+ * State diagram for ->state:
-+ *
-+ *
-+ * TICK_SCHED_REMOTE_OFFLINE
-+ * | ^
-+ * | |
-+ * | | sched_tick_remote()
-+ * | |
-+ * | |
-+ * +--TICK_SCHED_REMOTE_OFFLINING
-+ * | ^
-+ * | |
-+ * sched_tick_start() | | sched_tick_stop()
-+ * | |
-+ * V |
-+ * TICK_SCHED_REMOTE_RUNNING
-+ *
-+ *
-+ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
-+ * and sched_tick_start() are happy to leave the state in RUNNING.
-+ */
-+
-+static struct tick_work __percpu *tick_work_cpu;
-+
-+static void sched_tick_remote(struct work_struct *work)
-+{
-+ struct delayed_work *dwork = to_delayed_work(work);
-+ struct tick_work *twork = container_of(dwork, struct tick_work, work);
-+ int cpu = twork->cpu;
-+ struct rq *rq = cpu_rq(cpu);
-+ struct task_struct *curr;
-+ unsigned long flags;
-+ u64 delta;
-+ int os;
-+
-+ /*
-+ * Handle the tick only if it appears the remote CPU is running in full
-+ * dynticks mode. The check is racy by nature, but missing a tick or
-+ * having one too much is no big deal because the scheduler tick updates
-+ * statistics and checks timeslices in a time-independent way, regardless
-+ * of when exactly it is running.
-+ */
-+ if (!tick_nohz_tick_stopped_cpu(cpu))
-+ goto out_requeue;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ curr = rq->curr;
-+ if (cpu_is_offline(cpu))
-+ goto out_unlock;
-+
-+ update_rq_clock(rq);
-+ if (!is_idle_task(curr)) {
-+ /*
-+ * Make sure the next tick runs within a reasonable
-+ * amount of time.
-+ */
-+ delta = rq_clock_task(rq) - curr->last_ran;
-+ WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
-+ }
-+ scheduler_task_tick(rq);
-+
-+ calc_load_nohz_remote(rq);
-+out_unlock:
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+out_requeue:
-+ /*
-+ * Run the remote tick once per second (1Hz). This arbitrary
-+ * frequency is large enough to avoid overload but short enough
-+ * to keep scheduler internal stats reasonably up to date. But
-+ * first update state to reflect hotplug activity if required.
-+ */
-+ os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
-+ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
-+ if (os == TICK_SCHED_REMOTE_RUNNING)
-+ queue_delayed_work(system_unbound_wq, dwork, HZ);
-+}
-+
-+static void sched_tick_start(int cpu)
-+{
-+ int os;
-+ struct tick_work *twork;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
-+ return;
-+
-+ WARN_ON_ONCE(!tick_work_cpu);
-+
-+ twork = per_cpu_ptr(tick_work_cpu, cpu);
-+ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
-+ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
-+ if (os == TICK_SCHED_REMOTE_OFFLINE) {
-+ twork->cpu = cpu;
-+ INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
-+ queue_delayed_work(system_unbound_wq, &twork->work, HZ);
-+ }
-+}
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+static void sched_tick_stop(int cpu)
-+{
-+ struct tick_work *twork;
-+ int os;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
-+ return;
-+
-+ WARN_ON_ONCE(!tick_work_cpu);
-+
-+ twork = per_cpu_ptr(tick_work_cpu, cpu);
-+ /* There cannot be competing actions, but don't rely on stop-machine. */
-+ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_OFFLINING);
-+ WARN_ON_ONCE(os != TICK_SCHED_REMOTE_RUNNING);
-+ /* Don't cancel, as this would mess up the state machine. */
-+}
-+#endif /* CONFIG_HOTPLUG_CPU */
-+
-+int __init sched_tick_offload_init(void)
-+{
-+ tick_work_cpu = alloc_percpu(struct tick_work);
-+ BUG_ON(!tick_work_cpu);
-+ return 0;
-+}
-+
-+#else /* !CONFIG_NO_HZ_FULL */
-+static inline void sched_tick_start(int cpu) { }
-+static inline void sched_tick_stop(int cpu) { }
-+#endif
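As a side note, the remote-tick state machine documented above can be modelled in isolation; a standalone C11 sketch (illustration only, atomic_fetch_add_unless() emulated with a compare-exchange loop):

#include <assert.h>
#include <stdatomic.h>

enum { REMOTE_OFFLINE, REMOTE_OFFLINING, REMOTE_RUNNING };

static atomic_int state = REMOTE_OFFLINE;

static void tick_start(void)		/* sched_tick_start() analogue */
{
	int os = atomic_exchange(&state, REMOTE_RUNNING);
	assert(os != REMOTE_RUNNING);	/* double start would be a bug */
}

static void tick_stop(void)		/* sched_tick_stop() analogue */
{
	int os = atomic_exchange(&state, REMOTE_OFFLINING);
	assert(os == REMOTE_RUNNING);	/* must have been running */
}

static int tick_work(void)		/* tail of sched_tick_remote() */
{
	/* add -1 unless the state is RUNNING: OFFLINING decays to OFFLINE */
	int os = atomic_load(&state);

	while (os != REMOTE_RUNNING &&
	       !atomic_compare_exchange_weak(&state, &os, os - 1))
		;
	assert(os != REMOTE_OFFLINE);	/* mirrors the WARN_ON_ONCE() */
	return os == REMOTE_RUNNING;	/* caller re-queues the work if set */
}

int main(void)
{
	tick_start();
	assert(tick_work());		/* still running: work re-queues itself */
	tick_stop();
	assert(!tick_work());		/* offlining: decays to OFFLINE, no re-queue */
	return 0;
}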
-+
-+#if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
-+ defined(CONFIG_PREEMPT_TRACER))
-+/*
-+ * If the value passed in is equal to the current preempt count
-+ * then we just disabled preemption. Start timing the latency.
-+ */
-+static inline void preempt_latency_start(int val)
-+{
-+ if (preempt_count() == val) {
-+ unsigned long ip = get_lock_parent_ip();
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ current->preempt_disable_ip = ip;
-+#endif
-+ trace_preempt_off(CALLER_ADDR0, ip);
-+ }
-+}
-+
-+void preempt_count_add(int val)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Underflow?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON((preempt_count() < 0)))
-+ return;
-+#endif
-+ __preempt_count_add(val);
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Spinlock count overflowing soon?
-+ */
-+ DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
-+ PREEMPT_MASK - 10);
-+#endif
-+ preempt_latency_start(val);
-+}
-+EXPORT_SYMBOL(preempt_count_add);
-+NOKPROBE_SYMBOL(preempt_count_add);
-+
-+/*
-+ * If the value passed in equals the current preempt count
-+ * then we just enabled preemption. Stop timing the latency.
-+ */
-+static inline void preempt_latency_stop(int val)
-+{
-+ if (preempt_count() == val)
-+ trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
-+}
-+
-+void preempt_count_sub(int val)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Underflow?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
-+ return;
-+ /*
-+ * Is the spinlock portion underflowing?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON((val < PREEMPT_MASK) &&
-+ !(preempt_count() & PREEMPT_MASK)))
-+ return;
-+#endif
-+
-+ preempt_latency_stop(val);
-+ __preempt_count_sub(val);
-+}
-+EXPORT_SYMBOL(preempt_count_sub);
-+NOKPROBE_SYMBOL(preempt_count_sub);
-+
-+#else
-+static inline void preempt_latency_start(int val) { }
-+static inline void preempt_latency_stop(int val) { }
-+#endif
-+
-+static inline unsigned long get_preempt_disable_ip(struct task_struct *p)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ return p->preempt_disable_ip;
-+#else
-+ return 0;
-+#endif
-+}
-+
-+/*
-+ * Print scheduling while atomic bug:
-+ */
-+static noinline void __schedule_bug(struct task_struct *prev)
-+{
-+ /* Save this before calling printk(), since that will clobber it */
-+ unsigned long preempt_disable_ip = get_preempt_disable_ip(current);
-+
-+ if (oops_in_progress)
-+ return;
-+
-+ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
-+ prev->comm, prev->pid, preempt_count());
-+
-+ debug_show_held_locks(prev);
-+ print_modules();
-+ if (irqs_disabled())
-+ print_irqtrace_events(prev);
-+ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)
-+ && in_atomic_preempt_off()) {
-+ pr_err("Preemption disabled at:");
-+ print_ip_sym(KERN_ERR, preempt_disable_ip);
-+ }
-+ check_panic_on_warn("scheduling while atomic");
-+
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+
-+/*
-+ * Various schedule()-time debugging checks and statistics:
-+ */
-+static inline void schedule_debug(struct task_struct *prev, bool preempt)
-+{
-+#ifdef CONFIG_SCHED_STACK_END_CHECK
-+ if (task_stack_end_corrupted(prev))
-+ panic("corrupted stack end detected inside scheduler\n");
-+
-+ if (task_scs_end_corrupted(prev))
-+ panic("corrupted shadow stack detected inside scheduler\n");
-+#endif
-+
-+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-+ if (!preempt && READ_ONCE(prev->__state) && prev->non_block_count) {
-+ printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
-+ prev->comm, prev->pid, prev->non_block_count);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+ }
-+#endif
-+
-+ if (unlikely(in_atomic_preempt_off())) {
-+ __schedule_bug(prev);
-+ preempt_count_set(PREEMPT_DISABLED);
-+ }
-+ rcu_sleep_check();
-+ SCHED_WARN_ON(ct_state() == CONTEXT_USER);
-+
-+ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
-+
-+ schedstat_inc(this_rq()->sched_count);
-+}
-+
-+#ifdef ALT_SCHED_DEBUG
-+void alt_sched_debug(void)
-+{
-+ printk(KERN_INFO "sched: pending: 0x%04lx, idle: 0x%04lx, sg_idle: 0x%04lx\n",
-+ sched_rq_pending_mask.bits[0],
-+ sched_idle_mask->bits[0],
-+ sched_sg_idle_mask.bits[0]);
-+}
-+#else
-+inline void alt_sched_debug(void) {}
-+#endif
-+
-+#ifdef CONFIG_SMP
-+
-+#ifdef CONFIG_PREEMPT_RT
-+#define SCHED_NR_MIGRATE_BREAK 8
-+#else
-+#define SCHED_NR_MIGRATE_BREAK 32
-+#endif
-+
-+const_debug unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
-+
-+/*
-+ * Migrate pending tasks in @rq to @dest_cpu
-+ */
-+static inline int
-+migrate_pending_tasks(struct rq *rq, struct rq *dest_rq, const int dest_cpu)
-+{
-+ struct task_struct *p, *skip = rq->curr;
-+ int nr_migrated = 0;
-+ int nr_tries = min(rq->nr_running / 2, sysctl_sched_nr_migrate);
-+
-+ /* Workaround (WA) to check rq->curr is still on rq */
-+ if (!task_on_rq_queued(skip))
-+ return 0;
-+
-+ while (skip != rq->idle && nr_tries &&
-+ (p = sched_rq_next_task(skip, rq)) != rq->idle) {
-+ skip = sched_rq_next_task(p, rq);
-+ if (cpumask_test_cpu(dest_cpu, p->cpus_ptr)) {
-+ __SCHED_DEQUEUE_TASK(p, rq, 0, );
-+ set_task_cpu(p, dest_cpu);
-+ sched_task_sanity_check(p, dest_rq);
-+ sched_mm_cid_migrate_to(dest_rq, p, cpu_of(rq));
-+ __SCHED_ENQUEUE_TASK(p, dest_rq, 0);
-+ nr_migrated++;
-+ }
-+ nr_tries--;
-+ }
-+
-+ return nr_migrated;
-+}
-+
-+static inline int take_other_rq_tasks(struct rq *rq, int cpu)
-+{
-+ struct cpumask *topo_mask, *end_mask;
-+
-+ if (unlikely(!rq->online))
-+ return 0;
-+
-+ if (cpumask_empty(&sched_rq_pending_mask))
-+ return 0;
-+
-+ topo_mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+ end_mask = per_cpu(sched_cpu_topo_end_mask, cpu);
-+ do {
-+ int i;
-+ for_each_cpu_and(i, &sched_rq_pending_mask, topo_mask) {
-+ int nr_migrated;
-+ struct rq *src_rq;
-+
-+ src_rq = cpu_rq(i);
-+ if (!do_raw_spin_trylock(&src_rq->lock))
-+ continue;
-+ spin_acquire(&src_rq->lock.dep_map,
-+ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
-+
-+ if ((nr_migrated = migrate_pending_tasks(src_rq, rq, cpu))) {
-+ src_rq->nr_running -= nr_migrated;
-+ if (src_rq->nr_running < 2)
-+ cpumask_clear_cpu(i, &sched_rq_pending_mask);
-+
-+ spin_release(&src_rq->lock.dep_map, _RET_IP_);
-+ do_raw_spin_unlock(&src_rq->lock);
-+
-+ rq->nr_running += nr_migrated;
-+ if (rq->nr_running > 1)
-+ cpumask_set_cpu(cpu, &sched_rq_pending_mask);
-+
-+ update_sched_preempt_mask(rq);
-+ cpufreq_update_util(rq, 0);
-+
-+ return 1;
-+ }
-+
-+ spin_release(&src_rq->lock.dep_map, _RET_IP_);
-+ do_raw_spin_unlock(&src_rq->lock);
-+ }
-+ } while (++topo_mask < end_mask);
-+
-+ return 0;
-+}
-+#endif
-+
-+/*
-+ * Timeslices below RESCHED_NS are considered as good as expired as there's no
-+ * point rescheduling when there's so little time left.
-+ */
-+static inline void check_curr(struct task_struct *p, struct rq *rq)
-+{
-+ if (unlikely(rq->idle == p))
-+ return;
-+
-+ update_curr(rq, p);
-+
-+ if (p->time_slice < RESCHED_NS)
-+ time_slice_expired(p, rq);
-+}
-+
-+static inline struct task_struct *
-+choose_next_task(struct rq *rq, int cpu)
-+{
-+ struct task_struct *next;
-+
-+ if (unlikely(rq->skip)) {
-+ next = rq_runnable_task(rq);
-+ if (next == rq->idle) {
-+#ifdef CONFIG_SMP
-+ if (!take_other_rq_tasks(rq, cpu)) {
-+#endif
-+ rq->skip = NULL;
-+ schedstat_inc(rq->sched_goidle);
-+ return next;
-+#ifdef CONFIG_SMP
-+ }
-+ next = rq_runnable_task(rq);
-+#endif
-+ }
-+ rq->skip = NULL;
-+#ifdef CONFIG_HIGH_RES_TIMERS
-+ hrtick_start(rq, next->time_slice);
-+#endif
-+ return next;
-+ }
-+
-+ next = sched_rq_first_task(rq);
-+ if (next == rq->idle) {
-+#ifdef CONFIG_SMP
-+ if (!take_other_rq_tasks(rq, cpu)) {
-+#endif
-+ schedstat_inc(rq->sched_goidle);
-+ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
-+ return next;
-+#ifdef CONFIG_SMP
-+ }
-+ next = sched_rq_first_task(rq);
-+#endif
-+ }
-+#ifdef CONFIG_HIGH_RES_TIMERS
-+ hrtick_start(rq, next->time_slice);
-+#endif
-+ /*printk(KERN_INFO "sched: choose_next_task(%d) next %px\n", cpu, next);*/
-+ return next;
-+}
-+
-+/*
-+ * Constants for the sched_mode argument of __schedule().
-+ *
-+ * The mode argument allows RT enabled kernels to differentiate a
-+ * preemption from blocking on a 'sleeping' spin/rwlock. Note that
-+ * SM_MASK_PREEMPT for !RT has all bits set, which allows the compiler to
-+ * optimize the AND operation out and just check for zero.
-+ */
-+#define SM_NONE 0x0
-+#define SM_PREEMPT 0x1
-+#define SM_RTLOCK_WAIT 0x2
-+
-+#ifndef CONFIG_PREEMPT_RT
-+# define SM_MASK_PREEMPT (~0U)
-+#else
-+# define SM_MASK_PREEMPT SM_PREEMPT
-+#endif
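To illustrate the remark about the compiler optimizing the AND away on !PREEMPT_RT, a userspace-compilable sketch (illustration only): with SM_MASK_PREEMPT == ~0U the mask carries no information, so the voluntary-switch test in __schedule() degenerates into a plain check that sched_mode is zero.

#include <stdio.h>

#define SM_NONE		0x0u
#define SM_PREEMPT	0x1u
#define SM_MASK_PREEMPT	(~0u)		/* the !PREEMPT_RT definition */

static int takes_voluntary_path(unsigned int sched_mode, unsigned int prev_state)
{
	/* identical to "!sched_mode && prev_state" when the mask is ~0u */
	return !(sched_mode & SM_MASK_PREEMPT) && prev_state;
}

int main(void)
{
	printf("SM_NONE,    sleeping task -> %d\n", takes_voluntary_path(SM_NONE, 1));
	printf("SM_PREEMPT, sleeping task -> %d\n", takes_voluntary_path(SM_PREEMPT, 1));
	return 0;
}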
-+
-+/*
-+ * schedule() is the main scheduler function.
-+ *
-+ * The main means of driving the scheduler and thus entering this function are:
-+ *
-+ * 1. Explicit blocking: mutex, semaphore, waitqueue, etc.
-+ *
-+ * 2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
-+ * paths. For example, see arch/x86/entry_64.S.
-+ *
-+ * To drive preemption between tasks, the scheduler sets the flag in timer
-+ * interrupt handler scheduler_tick().
-+ *
-+ * 3. Wakeups don't really cause entry into schedule(). They add a
-+ * task to the run-queue and that's it.
-+ *
-+ * Now, if the new task added to the run-queue preempts the current
-+ * task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
-+ * called on the nearest possible occasion:
-+ *
-+ * - If the kernel is preemptible (CONFIG_PREEMPTION=y):
-+ *
-+ * - in syscall or exception context, at the next outermost
-+ * preempt_enable(). (this might be as soon as the wake_up()'s
-+ * spin_unlock()!)
-+ *
-+ * - in IRQ context, return from interrupt-handler to
-+ * preemptible context
-+ *
-+ * - If the kernel is not preemptible (CONFIG_PREEMPTION is not set)
-+ * then at the next:
-+ *
-+ * - cond_resched() call
-+ * - explicit schedule() call
-+ * - return from syscall or exception to user-space
-+ * - return from interrupt-handler to user-space
-+ *
-+ * WARNING: must be called with preemption disabled!
-+ */
-+static void __sched notrace __schedule(unsigned int sched_mode)
-+{
-+ struct task_struct *prev, *next;
-+ unsigned long *switch_count;
-+ unsigned long prev_state;
-+ struct rq *rq;
-+ int cpu;
-+
-+ cpu = smp_processor_id();
-+ rq = cpu_rq(cpu);
-+ prev = rq->curr;
-+
-+ schedule_debug(prev, !!sched_mode);
-+
-+ /* Bypass sched_feat(HRTICK) checking, which Alt schedule FW doesn't support */
-+ hrtick_clear(rq);
-+
-+ local_irq_disable();
-+ rcu_note_context_switch(!!sched_mode);
-+
-+ /*
-+ * Make sure that signal_pending_state()->signal_pending() below
-+ * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
-+ * done by the caller to avoid the race with signal_wake_up():
-+ *
-+ * __set_current_state(@state) signal_wake_up()
-+ * schedule() set_tsk_thread_flag(p, TIF_SIGPENDING)
-+ * wake_up_state(p, state)
-+ * LOCK rq->lock LOCK p->pi_state
-+ * smp_mb__after_spinlock() smp_mb__after_spinlock()
-+ * if (signal_pending_state()) if (p->state & @state)
-+ *
-+ * Also, the membarrier system call requires a full memory barrier
-+ * after coming from user-space, before storing to rq->curr.
-+ */
-+ raw_spin_lock(&rq->lock);
-+ smp_mb__after_spinlock();
-+
-+ update_rq_clock(rq);
-+
-+ switch_count = &prev->nivcsw;
-+ /*
-+ * We must load prev->state once (task_struct::state is volatile), such
-+ * that we form a control dependency vs deactivate_task() below.
-+ */
-+ prev_state = READ_ONCE(prev->__state);
-+ if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) {
-+ if (signal_pending_state(prev_state, prev)) {
-+ WRITE_ONCE(prev->__state, TASK_RUNNING);
-+ } else {
-+ prev->sched_contributes_to_load =
-+ (prev_state & TASK_UNINTERRUPTIBLE) &&
-+ !(prev_state & TASK_NOLOAD) &&
-+ !(prev_state & TASK_FROZEN);
-+
-+ if (prev->sched_contributes_to_load)
-+ rq->nr_uninterruptible++;
-+
-+ /*
-+ * __schedule() ttwu()
-+ * prev_state = prev->state; if (p->on_rq && ...)
-+ * if (prev_state) goto out;
-+ * p->on_rq = 0; smp_acquire__after_ctrl_dep();
-+ * p->state = TASK_WAKING
-+ *
-+ * Where __schedule() and ttwu() have matching control dependencies.
-+ *
-+ * After this, schedule() must not care about p->state any more.
-+ */
-+ sched_task_deactivate(prev, rq);
-+ deactivate_task(prev, rq);
-+
-+ if (prev->in_iowait) {
-+ atomic_inc(&rq->nr_iowait);
-+ delayacct_blkio_start();
-+ }
-+ }
-+ switch_count = &prev->nvcsw;
-+ }
-+
-+ check_curr(prev, rq);
-+
-+ next = choose_next_task(rq, cpu);
-+ clear_tsk_need_resched(prev);
-+ clear_preempt_need_resched();
-+#ifdef CONFIG_SCHED_DEBUG
-+ rq->last_seen_need_resched_ns = 0;
-+#endif
-+
-+ if (likely(prev != next)) {
-+ next->last_ran = rq->clock_task;
-+ rq->last_ts_switch = rq->clock;
-+
-+ /*printk(KERN_INFO "sched: %px -> %px\n", prev, next);*/
-+ rq->nr_switches++;
-+ /*
-+ * RCU users of rcu_dereference(rq->curr) may not see
-+ * changes to task_struct made by pick_next_task().
-+ */
-+ RCU_INIT_POINTER(rq->curr, next);
-+ /*
-+ * The membarrier system call requires each architecture
-+ * to have a full memory barrier after updating
-+ * rq->curr, before returning to user-space.
-+ *
-+ * Here are the schemes providing that barrier on the
-+ * various architectures:
-+ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC.
-+ * switch_mm() rely on membarrier_arch_switch_mm() on PowerPC.
-+ * - finish_lock_switch() for weakly-ordered
-+ * architectures where spin_unlock is a full barrier,
-+ * - switch_to() for arm64 (weakly-ordered, spin_unlock
-+ * is a RELEASE barrier),
-+ */
-+ ++*switch_count;
-+
-+ trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next, prev_state);
-+
-+ /* Also unlocks the rq: */
-+ rq = context_switch(rq, prev, next);
-+
-+ cpu = cpu_of(rq);
-+ } else {
-+ __balance_callbacks(rq);
-+ raw_spin_unlock_irq(&rq->lock);
-+ }
-+
-+#ifdef CONFIG_SCHED_SMT
-+ sg_balance(rq, cpu);
-+#endif
-+}
-+
-+void __noreturn do_task_dead(void)
-+{
-+ /* Causes final put_task_struct in finish_task_switch(): */
-+ set_special_state(TASK_DEAD);
-+
-+ /* Tell freezer to ignore us: */
-+ current->flags |= PF_NOFREEZE;
-+
-+ __schedule(SM_NONE);
-+ BUG();
-+
-+ /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
-+ for (;;)
-+ cpu_relax();
-+}
-+
-+static inline void sched_submit_work(struct task_struct *tsk)
-+{
-+ unsigned int task_flags;
-+
-+ if (task_is_running(tsk))
-+ return;
-+
-+ task_flags = tsk->flags;
-+ /*
-+ * If a worker goes to sleep, notify and ask workqueue whether it
-+ * wants to wake up a task to maintain concurrency.
-+ */
-+ if (task_flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
-+ if (task_flags & PF_WQ_WORKER)
-+ wq_worker_sleeping(tsk);
-+ else
-+ io_wq_worker_sleeping(tsk);
-+ }
-+
-+ /*
-+ * spinlock and rwlock must not flush block requests. This will
-+ * deadlock if the callback attempts to acquire a lock which is
-+ * already acquired.
-+ */
-+ SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
-+
-+ /*
-+ * If we are going to sleep and we have plugged IO queued,
-+ * make sure to submit it to avoid deadlocks.
-+ */
-+ blk_flush_plug(tsk->plug, true);
-+}
-+
-+static void sched_update_worker(struct task_struct *tsk)
-+{
-+ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
-+ if (tsk->flags & PF_WQ_WORKER)
-+ wq_worker_running(tsk);
-+ else
-+ io_wq_worker_running(tsk);
-+ }
-+}
-+
-+asmlinkage __visible void __sched schedule(void)
-+{
-+ struct task_struct *tsk = current;
-+
-+ sched_submit_work(tsk);
-+ do {
-+ preempt_disable();
-+ __schedule(SM_NONE);
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+ sched_update_worker(tsk);
-+}
-+EXPORT_SYMBOL(schedule);
-+
-+/*
-+ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
-+ * state (have scheduled out non-voluntarily) by making sure that all
-+ * tasks have either left the run queue or have gone into user space.
-+ * As idle tasks do not do either, they must not ever be preempted
-+ * (schedule out non-voluntarily).
-+ *
-+ * schedule_idle() is similar to schedule_preempt_disabled() except that it
-+ * never enables preemption because it does not call sched_submit_work().
-+ */
-+void __sched schedule_idle(void)
-+{
-+ /*
-+ * As this skips calling sched_submit_work(), which the idle task does
-+ * regardless because that function is a nop when the task is in a
-+ * TASK_RUNNING state, make sure this isn't used someplace that the
-+ * current task can be in any other state. Note, idle is always in the
-+ * TASK_RUNNING state.
-+ */
-+ WARN_ON_ONCE(current->__state);
-+ do {
-+ __schedule(SM_NONE);
-+ } while (need_resched());
-+}
-+
-+#if defined(CONFIG_CONTEXT_TRACKING_USER) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)
-+asmlinkage __visible void __sched schedule_user(void)
-+{
-+ /*
-+ * If we come here after a random call to set_need_resched(),
-+ * or we have been woken up remotely but the IPI has not yet arrived,
-+ * we haven't yet exited the RCU idle mode. Do it here manually until
-+ * we find a better solution.
-+ *
-+ * NB: There are buggy callers of this function. Ideally we
-+ * should warn if prev_state != CONTEXT_USER, but that will trigger
-+ * too frequently to make sense yet.
-+ */
-+ enum ctx_state prev_state = exception_enter();
-+ schedule();
-+ exception_exit(prev_state);
-+}
-+#endif
-+
-+/**
-+ * schedule_preempt_disabled - called with preemption disabled
-+ *
-+ * Returns with preemption disabled. Note: preempt_count must be 1
-+ */
-+void __sched schedule_preempt_disabled(void)
-+{
-+ sched_preempt_enable_no_resched();
-+ schedule();
-+ preempt_disable();
-+}
-+
-+#ifdef CONFIG_PREEMPT_RT
-+void __sched notrace schedule_rtlock(void)
-+{
-+ do {
-+ preempt_disable();
-+ __schedule(SM_RTLOCK_WAIT);
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+}
-+NOKPROBE_SYMBOL(schedule_rtlock);
-+#endif
-+
-+static void __sched notrace preempt_schedule_common(void)
-+{
-+ do {
-+ /*
-+ * Because the function tracer can trace preempt_count_sub()
-+ * and it also uses preempt_enable/disable_notrace(), if
-+ * NEED_RESCHED is set, the preempt_enable_notrace() called
-+ * by the function tracer will call this function again and
-+ * cause infinite recursion.
-+ *
-+ * Preemption must be disabled here before the function
-+ * tracer can trace. Break up preempt_disable() into two
-+ * calls. One to disable preemption without fear of being
-+ * traced. The other to still record the preemption latency,
-+ * which can also be traced by the function tracer.
-+ */
-+ preempt_disable_notrace();
-+ preempt_latency_start(1);
-+ __schedule(SM_PREEMPT);
-+ preempt_latency_stop(1);
-+ preempt_enable_no_resched_notrace();
-+
-+ /*
-+ * Check again in case we missed a preemption opportunity
-+ * between schedule and now.
-+ */
-+ } while (need_resched());
-+}
-+
-+#ifdef CONFIG_PREEMPTION
-+/*
-+ * This is the entry point to schedule() from in-kernel preemption
-+ * off of preempt_enable.
-+ */
-+asmlinkage __visible void __sched notrace preempt_schedule(void)
-+{
-+ /*
-+ * If there is a non-zero preempt_count or interrupts are disabled,
-+ * we do not want to preempt the current task. Just return.
-+ */
-+ if (likely(!preemptible()))
-+ return;
-+
-+ preempt_schedule_common();
-+}
-+NOKPROBE_SYMBOL(preempt_schedule);
-+EXPORT_SYMBOL(preempt_schedule);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#ifndef preempt_schedule_dynamic_enabled
-+#define preempt_schedule_dynamic_enabled preempt_schedule
-+#define preempt_schedule_dynamic_disabled NULL
-+#endif
-+DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled);
-+EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
-+void __sched notrace dynamic_preempt_schedule(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
-+ return;
-+ preempt_schedule();
-+}
-+NOKPROBE_SYMBOL(dynamic_preempt_schedule);
-+EXPORT_SYMBOL(dynamic_preempt_schedule);
-+#endif
-+#endif
-+
-+/**
-+ * preempt_schedule_notrace - preempt_schedule called by tracing
-+ *
-+ * The tracing infrastructure uses preempt_enable_notrace to prevent
-+ * recursion and tracing preempt enabling caused by the tracing
-+ * infrastructure itself. But as tracing can happen in areas coming
-+ * from userspace or just about to enter userspace, a preempt enable
-+ * can occur before user_exit() is called. This will cause the scheduler
-+ * to be called when the system is still in usermode.
-+ *
-+ * To prevent this, the preempt_enable_notrace will use this function
-+ * instead of preempt_schedule() to exit user context if needed before
-+ * calling the scheduler.
-+ */
-+asmlinkage __visible void __sched notrace preempt_schedule_notrace(void)
-+{
-+ enum ctx_state prev_ctx;
-+
-+ if (likely(!preemptible()))
-+ return;
-+
-+ do {
-+ /*
-+ * Because the function tracer can trace preempt_count_sub()
-+ * and it also uses preempt_enable/disable_notrace(), if
-+ * NEED_RESCHED is set, the preempt_enable_notrace() called
-+ * by the function tracer will call this function again and
-+ * cause infinite recursion.
-+ *
-+ * Preemption must be disabled here before the function
-+ * tracer can trace. Break up preempt_disable() into two
-+ * calls. One to disable preemption without fear of being
-+ * traced. The other to still record the preemption latency,
-+ * which can also be traced by the function tracer.
-+ */
-+ preempt_disable_notrace();
-+ preempt_latency_start(1);
-+ /*
-+ * Needs preempt disabled in case user_exit() is traced
-+ * and the tracer calls preempt_enable_notrace() causing
-+ * an infinite recursion.
-+ */
-+ prev_ctx = exception_enter();
-+ __schedule(SM_PREEMPT);
-+ exception_exit(prev_ctx);
-+
-+ preempt_latency_stop(1);
-+ preempt_enable_no_resched_notrace();
-+ } while (need_resched());
-+}
-+EXPORT_SYMBOL_GPL(preempt_schedule_notrace);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#ifndef preempt_schedule_notrace_dynamic_enabled
-+#define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace
-+#define preempt_schedule_notrace_dynamic_disabled NULL
-+#endif
-+DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled);
-+EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
-+void __sched notrace dynamic_preempt_schedule_notrace(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
-+ return;
-+ preempt_schedule_notrace();
-+}
-+NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
-+EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
-+#endif
-+#endif
-+
-+#endif /* CONFIG_PREEMPTION */
-+
-+/*
-+ * This is the entry point to schedule() from kernel preemption
-+ * off of irq context.
-+ * Note that this is called and returns with irqs disabled. This will
-+ * protect us against recursive calling from irq.
-+ */
-+asmlinkage __visible void __sched preempt_schedule_irq(void)
-+{
-+ enum ctx_state prev_state;
-+
-+ /* Catch callers which need to be fixed */
-+ BUG_ON(preempt_count() || !irqs_disabled());
-+
-+ prev_state = exception_enter();
-+
-+ do {
-+ preempt_disable();
-+ local_irq_enable();
-+ __schedule(SM_PREEMPT);
-+ local_irq_disable();
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+
-+ exception_exit(prev_state);
-+}
-+
-+int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
-+ void *key)
-+{
-+ WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~WF_SYNC);
-+ return try_to_wake_up(curr->private, mode, wake_flags);
-+}
-+EXPORT_SYMBOL(default_wake_function);
-+
-+static inline void check_task_changed(struct task_struct *p, struct rq *rq)
-+{
-+ /* Trigger resched if task sched_prio has been modified. */
-+ if (task_on_rq_queued(p)) {
-+ int idx;
-+
-+ update_rq_clock(rq);
-+ idx = task_sched_prio_idx(p, rq);
-+ if (idx != p->sq_idx) {
-+ requeue_task(p, rq, idx);
-+ check_preempt_curr(rq);
-+ }
-+ }
-+}
-+
-+static void __setscheduler_prio(struct task_struct *p, int prio)
-+{
-+ p->prio = prio;
-+}
-+
-+#ifdef CONFIG_RT_MUTEXES
-+
-+static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
-+{
-+ if (pi_task)
-+ prio = min(prio, pi_task->prio);
-+
-+ return prio;
-+}
-+
-+static inline int rt_effective_prio(struct task_struct *p, int prio)
-+{
-+ struct task_struct *pi_task = rt_mutex_get_top_task(p);
-+
-+ return __rt_effective_prio(pi_task, prio);
-+}
-+
-+/*
-+ * rt_mutex_setprio - set the current priority of a task
-+ * @p: task to boost
-+ * @pi_task: donor task
-+ *
-+ * This function changes the 'effective' priority of a task. It does
-+ * not touch ->normal_prio like __setscheduler().
-+ *
-+ * Used by the rt_mutex code to implement priority inheritance
-+ * logic. Call site only calls if the priority of the task changed.
-+ */
-+void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
-+{
-+ int prio;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ /* XXX used to be waiter->prio, not waiter->task->prio */
-+ prio = __rt_effective_prio(pi_task, p->normal_prio);
-+
-+ /*
-+ * If nothing changed; bail early.
-+ */
-+ if (p->pi_top_task == pi_task && prio == p->prio)
-+ return;
-+
-+ rq = __task_access_lock(p, &lock);
-+ /*
-+ * Set under pi_lock && rq->lock, such that the value can be used under
-+ * either lock.
-+ *
-+ * Note that there is loads of trickery to make this pointer cache work
-+ * right. rt_mutex_slowunlock()+rt_mutex_postunlock() work together to
-+ * ensure a task is de-boosted (pi_task is set to NULL) before the
-+ * task is allowed to run again (and can exit). This ensures the pointer
-+ * points to a blocked task -- which guarantees the task is present.
-+ */
-+ p->pi_top_task = pi_task;
-+
-+ /*
-+ * For FIFO/RR we only need to set prio, if that matches we're done.
-+ */
-+ if (prio == p->prio)
-+ goto out_unlock;
-+
-+ /*
-+ * Idle task boosting is a nono in general. There is one
-+ * exception, when PREEMPT_RT and NOHZ is active:
-+ *
-+ * The idle task calls get_next_timer_interrupt() and holds
-+ * the timer wheel base->lock on the CPU and another CPU wants
-+ * to access the timer (probably to cancel it). We can safely
-+ * ignore the boosting request, as the idle CPU runs this code
-+ * with interrupts disabled and will complete the lock
-+ * protected section without being interrupted. So there is no
-+ * real need to boost.
-+ */
-+ if (unlikely(p == rq->idle)) {
-+ WARN_ON(p != rq->curr);
-+ WARN_ON(p->pi_blocked_on);
-+ goto out_unlock;
-+ }
-+
-+ trace_sched_pi_setprio(p, pi_task);
-+
-+ __setscheduler_prio(p, prio);
-+
-+ check_task_changed(p, rq);
-+out_unlock:
-+ /* Avoid rq from going away on us: */
-+ preempt_disable();
-+
-+ __balance_callbacks(rq);
-+ __task_access_unlock(p, lock);
-+
-+ preempt_enable();
-+}
-+#else
-+static inline int rt_effective_prio(struct task_struct *p, int prio)
-+{
-+ return prio;
-+}
-+#endif
-+
-+void set_user_nice(struct task_struct *p, long nice)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
-+ return;
-+ /*
-+ * We have to be careful, if called from sys_setpriority(),
-+ * the task might be in the middle of scheduling on another CPU.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ rq = __task_access_lock(p, &lock);
-+
-+ p->static_prio = NICE_TO_PRIO(nice);
-+ /*
-+ * The RT priorities are set via sched_setscheduler(), but we still
-+ * allow the 'normal' nice value to be set - but as expected
-+ * it won't have any effect on scheduling as long as the task
-+ * is not SCHED_NORMAL/SCHED_BATCH:
-+ */
-+ if (task_has_rt_policy(p))
-+ goto out_unlock;
-+
-+ p->prio = effective_prio(p);
-+
-+ check_task_changed(p, rq);
-+out_unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+EXPORT_SYMBOL(set_user_nice);
-+
-+/*
-+ * is_nice_reduction - check if nice value is an actual reduction
-+ *
-+ * Similar to can_nice() but does not perform a capability check.
-+ *
-+ * @p: task
-+ * @nice: nice value
-+ */
-+static bool is_nice_reduction(const struct task_struct *p, const int nice)
-+{
-+ /* Convert nice value [19,-20] to rlimit style value [1,40]: */
-+ int nice_rlim = nice_to_rlimit(nice);
-+
-+ return (nice_rlim <= task_rlimit(p, RLIMIT_NICE));
-+}
-+
-+/*
-+ * can_nice - check if a task can reduce its nice value
-+ * @p: task
-+ * @nice: nice value
-+ */
-+int can_nice(const struct task_struct *p, const int nice)
-+{
-+ return is_nice_reduction(p, nice) || capable(CAP_SYS_NICE);
-+}
-+
-+#ifdef __ARCH_WANT_SYS_NICE
-+
-+/*
-+ * sys_nice - change the priority of the current process.
-+ * @increment: priority increment
-+ *
-+ * sys_setpriority is a more generic, but much slower function that
-+ * does similar things.
-+ */
-+SYSCALL_DEFINE1(nice, int, increment)
-+{
-+ long nice, retval;
-+
-+ /*
-+ * Setpriority might change our priority at the same moment.
-+ * We don't have to worry. Conceptually one call occurs first
-+ * and we have a single winner.
-+ */
-+
-+ increment = clamp(increment, -NICE_WIDTH, NICE_WIDTH);
-+ nice = task_nice(current) + increment;
-+
-+ nice = clamp_val(nice, MIN_NICE, MAX_NICE);
-+ if (increment < 0 && !can_nice(current, nice))
-+ return -EPERM;
-+
-+ retval = security_task_setnice(current, nice);
-+ if (retval)
-+ return retval;
-+
-+ set_user_nice(current, nice);
-+ return 0;
-+}
-+
-+#endif
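
For reference, the sys_nice() path above is normally reached through the glibc
nice(3) wrapper; a minimal userspace sketch (standard Linux/glibc assumed,
illustrative only, not part of the patch):

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	errno = 0;
	/* Raise our own nice value by 5; going below 0 needs CAP_SYS_NICE
	 * or a suitable RLIMIT_NICE, as checked by can_nice() above. */
	int nv = nice(5);

	if (nv == -1 && errno != 0) {
		perror("nice");
		return 1;
	}
	printf("new nice value: %d\n", nv);
	return 0;
}
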
-+
-+/**
-+ * task_prio - return the priority value of a given task.
-+ * @p: the task in question.
-+ *
-+ * Return: The priority value as seen by users in /proc.
-+ *
-+ * sched policy return value kernel prio user prio/nice
-+ *
-+ * (BMQ)normal, batch, idle[0 ... 53] [100 ... 139] 0/[-20 ... 19]/[-7 ... 7]
-+ * (PDS)normal, batch, idle[0 ... 39] 100 0/[-20 ... 19]
-+ * fifo, rr [-1 ... -100] [99 ... 0] [0 ... 99]
-+ */
-+int task_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MAX_RT_PRIO) ? p->prio - MAX_RT_PRIO :
-+ task_sched_prio_normal(p, task_rq(p));
-+}
-+
-+/**
-+ * idle_cpu - is a given CPU idle currently?
-+ * @cpu: the processor in question.
-+ *
-+ * Return: 1 if the CPU is currently idle. 0 otherwise.
-+ */
-+int idle_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (rq->curr != rq->idle)
-+ return 0;
-+
-+ if (rq->nr_running)
-+ return 0;
-+
-+#ifdef CONFIG_SMP
-+ if (rq->ttwu_pending)
-+ return 0;
-+#endif
-+
-+ return 1;
-+}
-+
-+/**
-+ * idle_task - return the idle task for a given CPU.
-+ * @cpu: the processor in question.
-+ *
-+ * Return: The idle task for the cpu @cpu.
-+ */
-+struct task_struct *idle_task(int cpu)
-+{
-+ return cpu_rq(cpu)->idle;
-+}
-+
-+/**
-+ * find_process_by_pid - find a process with a matching PID value.
-+ * @pid: the pid in question.
-+ *
-+ * The task of @pid, if found. %NULL otherwise.
-+ */
-+static inline struct task_struct *find_process_by_pid(pid_t pid)
-+{
-+ return pid ? find_task_by_vpid(pid) : current;
-+}
-+
-+/*
-+ * sched_setparam() passes in -1 for its policy, to let the functions
-+ * it calls know not to change it.
-+ */
-+#define SETPARAM_POLICY -1
-+
-+static void __setscheduler_params(struct task_struct *p,
-+ const struct sched_attr *attr)
-+{
-+ int policy = attr->sched_policy;
-+
-+ if (policy == SETPARAM_POLICY)
-+ policy = p->policy;
-+
-+ p->policy = policy;
-+
-+ /*
-+ * allow the normal nice value to be set, but it will not have any
-+ * effect on scheduling as long as the task is not SCHED_NORMAL/
-+ * SCHED_BATCH
-+ */
-+ p->static_prio = NICE_TO_PRIO(attr->sched_nice);
-+
-+ /*
-+ * __sched_setscheduler() ensures attr->sched_priority == 0 when
-+ * !rt_policy. Always setting this ensures that things like
-+ * getparam()/getattr() don't report silly values for !rt tasks.
-+ */
-+ p->rt_priority = attr->sched_priority;
-+ p->normal_prio = normal_prio(p);
-+}
-+
-+/*
-+ * check the target process has a UID that matches the current process's
-+ */
-+static bool check_same_owner(struct task_struct *p)
-+{
-+ const struct cred *cred = current_cred(), *pcred;
-+ bool match;
-+
-+ rcu_read_lock();
-+ pcred = __task_cred(p);
-+ match = (uid_eq(cred->euid, pcred->euid) ||
-+ uid_eq(cred->euid, pcred->uid));
-+ rcu_read_unlock();
-+ return match;
-+}
-+
-+/*
-+ * Allow unprivileged RT tasks to decrease priority.
-+ * Only issue a capable test if needed and only once to avoid an audit
-+ * event on permitted non-privileged operations:
-+ */
-+static int user_check_sched_setscheduler(struct task_struct *p,
-+ const struct sched_attr *attr,
-+ int policy, int reset_on_fork)
-+{
-+ if (rt_policy(policy)) {
-+ unsigned long rlim_rtprio = task_rlimit(p, RLIMIT_RTPRIO);
-+
-+ /* Can't set/change the rt policy: */
-+ if (policy != p->policy && !rlim_rtprio)
-+ goto req_priv;
-+
-+ /* Can't increase priority: */
-+ if (attr->sched_priority > p->rt_priority &&
-+ attr->sched_priority > rlim_rtprio)
-+ goto req_priv;
-+ }
-+
-+ /* Can't change other user's priorities: */
-+ if (!check_same_owner(p))
-+ goto req_priv;
-+
-+ /* Normal users shall not reset the sched_reset_on_fork flag: */
-+ if (p->sched_reset_on_fork && !reset_on_fork)
-+ goto req_priv;
-+
-+ return 0;
-+
-+req_priv:
-+ if (!capable(CAP_SYS_NICE))
-+ return -EPERM;
-+
-+ return 0;
-+}
-+
-+static int __sched_setscheduler(struct task_struct *p,
-+ const struct sched_attr *attr,
-+ bool user, bool pi)
-+{
-+ const struct sched_attr dl_squash_attr = {
-+ .size = sizeof(struct sched_attr),
-+ .sched_policy = SCHED_FIFO,
-+ .sched_nice = 0,
-+ .sched_priority = 99,
-+ };
-+ int oldpolicy = -1, policy = attr->sched_policy;
-+ int retval, newprio;
-+ struct balance_callback *head;
-+ unsigned long flags;
-+ struct rq *rq;
-+ int reset_on_fork;
-+ raw_spinlock_t *lock;
-+
-+ /* The pi code expects interrupts enabled */
-+ BUG_ON(pi && in_interrupt());
-+
-+ /*
-+ * Alt schedule FW supports SCHED_DEADLINE by squashing it into prio-0 SCHED_FIFO
-+ */
-+ if (unlikely(SCHED_DEADLINE == policy)) {
-+ attr = &dl_squash_attr;
-+ policy = attr->sched_policy;
-+ }
-+recheck:
-+ /* Double check policy once rq lock held */
-+ if (policy < 0) {
-+ reset_on_fork = p->sched_reset_on_fork;
-+ policy = oldpolicy = p->policy;
-+ } else {
-+ reset_on_fork = !!(attr->sched_flags & SCHED_RESET_ON_FORK);
-+
-+ if (policy > SCHED_IDLE)
-+ return -EINVAL;
-+ }
-+
-+ if (attr->sched_flags & ~(SCHED_FLAG_ALL))
-+ return -EINVAL;
-+
-+ /*
-+ * Valid priorities for SCHED_FIFO and SCHED_RR are
-+ * 1..MAX_RT_PRIO-1, valid priority for SCHED_NORMAL and
-+ * SCHED_BATCH and SCHED_IDLE is 0.
-+ */
-+ if (attr->sched_priority < 0 ||
-+ (p->mm && attr->sched_priority > MAX_RT_PRIO - 1) ||
-+ (!p->mm && attr->sched_priority > MAX_RT_PRIO - 1))
-+ return -EINVAL;
-+ if ((SCHED_RR == policy || SCHED_FIFO == policy) !=
-+ (attr->sched_priority != 0))
-+ return -EINVAL;
-+
-+ if (user) {
-+ retval = user_check_sched_setscheduler(p, attr, policy, reset_on_fork);
-+ if (retval)
-+ return retval;
-+
-+ retval = security_task_setscheduler(p);
-+ if (retval)
-+ return retval;
-+ }
-+
-+ if (pi)
-+ cpuset_read_lock();
-+
-+ /*
-+ * Make sure no PI-waiters arrive (or leave) while we are
-+ * changing the priority of the task:
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+
-+ /*
-+ * To be able to change p->policy safely, task_access_lock()
-+ * must be called.
-+ * If task_access_lock() is used here:
-+ * For a task p which is not running, reading rq->stop is
-+ * racy but acceptable as ->stop doesn't change much.
-+ * An enhancement could be made to read rq->stop safely.
-+ */
-+ rq = __task_access_lock(p, &lock);
-+
-+ /*
-+ * Changing the policy of the stop threads is a very bad idea
-+ */
-+ if (p == rq->stop) {
-+ retval = -EINVAL;
-+ goto unlock;
-+ }
-+
-+ /*
-+ * If not changing anything there's no need to proceed further:
-+ */
-+ if (unlikely(policy == p->policy)) {
-+ if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
-+ goto change;
-+ if (!rt_policy(policy) &&
-+ NICE_TO_PRIO(attr->sched_nice) != p->static_prio)
-+ goto change;
-+
-+ p->sched_reset_on_fork = reset_on_fork;
-+ retval = 0;
-+ goto unlock;
-+ }
-+change:
-+
-+ /* Re-check policy now with rq lock held */
-+ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
-+ policy = oldpolicy = -1;
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+ if (pi)
-+ cpuset_read_unlock();
-+ goto recheck;
-+ }
-+
-+ p->sched_reset_on_fork = reset_on_fork;
-+
-+ newprio = __normal_prio(policy, attr->sched_priority, NICE_TO_PRIO(attr->sched_nice));
-+ if (pi) {
-+ /*
-+ * Take priority boosted tasks into account. If the new
-+ * effective priority is unchanged, we just store the new
-+ * normal parameters and do not touch the scheduler class and
-+ * the runqueue. This will be done when the task deboosts
-+ * itself.
-+ */
-+ newprio = rt_effective_prio(p, newprio);
-+ }
-+
-+ if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) {
-+ __setscheduler_params(p, attr);
-+ __setscheduler_prio(p, newprio);
-+ }
-+
-+ check_task_changed(p, rq);
-+
-+ /* Avoid rq from going away on us: */
-+ preempt_disable();
-+ head = splice_balance_callbacks(rq);
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+
-+ if (pi) {
-+ cpuset_read_unlock();
-+ rt_mutex_adjust_pi(p);
-+ }
-+
-+ /* Run balance callbacks after we've adjusted the PI chain: */
-+ balance_callbacks(rq, head);
-+ preempt_enable();
-+
-+ return 0;
-+
-+unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+ if (pi)
-+ cpuset_read_unlock();
-+ return retval;
-+}
-+
-+static int _sched_setscheduler(struct task_struct *p, int policy,
-+ const struct sched_param *param, bool check)
-+{
-+ struct sched_attr attr = {
-+ .sched_policy = policy,
-+ .sched_priority = param->sched_priority,
-+ .sched_nice = PRIO_TO_NICE(p->static_prio),
-+ };
-+
-+ /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
-+ if ((policy != SETPARAM_POLICY) && (policy & SCHED_RESET_ON_FORK)) {
-+ attr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-+ policy &= ~SCHED_RESET_ON_FORK;
-+ attr.sched_policy = policy;
-+ }
-+
-+ return __sched_setscheduler(p, &attr, check, true);
-+}
-+
-+/**
-+ * sched_setscheduler - change the scheduling policy and/or RT priority of a thread.
-+ * @p: the task in question.
-+ * @policy: new policy.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Use sched_set_fifo(), read its comment.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ *
-+ * NOTE that the task may be already dead.
-+ */
-+int sched_setscheduler(struct task_struct *p, int policy,
-+ const struct sched_param *param)
-+{
-+ return _sched_setscheduler(p, policy, param, true);
-+}
-+
-+int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
-+{
-+ return __sched_setscheduler(p, attr, true, true);
-+}
-+
-+int sched_setattr_nocheck(struct task_struct *p, const struct sched_attr *attr)
-+{
-+ return __sched_setscheduler(p, attr, false, true);
-+}
-+EXPORT_SYMBOL_GPL(sched_setattr_nocheck);
-+
-+/**
-+ * sched_setscheduler_nocheck - change the scheduling policy and/or RT priority of a thread from kernelspace.
-+ * @p: the task in question.
-+ * @policy: new policy.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Just like sched_setscheduler, only don't bother checking if the
-+ * current context has permission. For example, this is needed in
-+ * stop_machine(): we create temporary high priority worker threads,
-+ * but our caller might not have that capability.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+int sched_setscheduler_nocheck(struct task_struct *p, int policy,
-+ const struct sched_param *param)
-+{
-+ return _sched_setscheduler(p, policy, param, false);
-+}
-+
-+/*
-+ * SCHED_FIFO is a broken scheduler model; that is, it is fundamentally
-+ * incapable of resource management, which is the one thing an OS really should
-+ * be doing.
-+ *
-+ * This is of course the reason it is limited to privileged users only.
-+ *
-+ * Worse still; it is fundamentally impossible to compose static priority
-+ * workloads. You cannot take two correctly working static prio workloads
-+ * and smash them together and still expect them to work.
-+ *
-+ * For this reason 'all' FIFO tasks the kernel creates are basically at:
-+ *
-+ * MAX_RT_PRIO / 2
-+ *
-+ * The administrator _MUST_ configure the system, the kernel simply doesn't
-+ * know enough information to make a sensible choice.
-+ */
-+void sched_set_fifo(struct task_struct *p)
-+{
-+ struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
-+ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_fifo);
-+
-+/*
-+ * For when you don't much care about FIFO, but want to be above SCHED_NORMAL.
-+ */
-+void sched_set_fifo_low(struct task_struct *p)
-+{
-+ struct sched_param sp = { .sched_priority = 1 };
-+ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_fifo_low);
-+
-+void sched_set_normal(struct task_struct *p, int nice)
-+{
-+ struct sched_attr attr = {
-+ .sched_policy = SCHED_NORMAL,
-+ .sched_nice = nice,
-+ };
-+ WARN_ON_ONCE(sched_setattr_nocheck(p, &attr) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_normal);
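
The kernel-internal helpers above exist because drivers should not pick FIFO
priorities themselves; from userspace the equivalent request goes through
sched_setscheduler(2) and runs the user_check_sched_setscheduler() path shown
earlier. A small illustrative sketch (needs CAP_SYS_NICE or an RLIMIT_RTPRIO
allowance; not part of the patch):

#include <sched.h>
#include <stdio.h>

int main(void)
{
	struct sched_param sp = { .sched_priority = 10 };

	/* pid 0 means the calling thread; SCHED_FIFO needs a priority in 1..99 */
	if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
		perror("sched_setscheduler");
		return 1;
	}
	printf("policy is now %d\n", sched_getscheduler(0));
	return 0;
}
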
-+
-+static int
-+do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
-+{
-+ struct sched_param lparam;
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!param || pid < 0)
-+ return -EINVAL;
-+ if (copy_from_user(&lparam, param, sizeof(struct sched_param)))
-+ return -EFAULT;
-+
-+ rcu_read_lock();
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (likely(p))
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (likely(p)) {
-+ retval = sched_setscheduler(p, policy, &lparam);
-+ put_task_struct(p);
-+ }
-+
-+ return retval;
-+}
-+
-+/*
-+ * Mimics kernel/events/core.c perf_copy_attr().
-+ */
-+static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *attr)
-+{
-+ u32 size;
-+ int ret;
-+
-+ /* Zero the full structure, so that a short copy will be nice: */
-+ memset(attr, 0, sizeof(*attr));
-+
-+ ret = get_user(size, &uattr->size);
-+ if (ret)
-+ return ret;
-+
-+ /* ABI compatibility quirk: */
-+ if (!size)
-+ size = SCHED_ATTR_SIZE_VER0;
-+
-+ if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
-+ goto err_size;
-+
-+ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
-+ if (ret) {
-+ if (ret == -E2BIG)
-+ goto err_size;
-+ return ret;
-+ }
-+
-+ /*
-+ * XXX: Do we want to be lenient like existing syscalls; or do we want
-+ * to be strict and return an error on out-of-bounds values?
-+ */
-+ attr->sched_nice = clamp(attr->sched_nice, -20, 19);
-+
-+ /* sched/core.c uses zero here but we already know ret is zero */
-+ return 0;
-+
-+err_size:
-+ put_user(sizeof(*attr), &uattr->size);
-+ return -E2BIG;
-+}
-+
-+/**
-+ * sys_sched_setscheduler - set/change the scheduler policy and RT priority
-+ * @pid: the pid in question.
-+ * @policy: new policy.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+SYSCALL_DEFINE3(sched_setscheduler, pid_t, pid, int, policy, struct sched_param __user *, param)
-+{
-+ if (policy < 0)
-+ return -EINVAL;
-+
-+ return do_sched_setscheduler(pid, policy, param);
-+}
-+
-+/**
-+ * sys_sched_setparam - set/change the RT priority of a thread
-+ * @pid: the pid in question.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+SYSCALL_DEFINE2(sched_setparam, pid_t, pid, struct sched_param __user *, param)
-+{
-+ return do_sched_setscheduler(pid, SETPARAM_POLICY, param);
-+}
-+
-+/**
-+ * sys_sched_setattr - same as above, but with extended sched_attr
-+ * @pid: the pid in question.
-+ * @uattr: structure containing the extended parameters.
-+ */
-+SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
-+ unsigned int, flags)
-+{
-+ struct sched_attr attr;
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!uattr || pid < 0 || flags)
-+ return -EINVAL;
-+
-+ retval = sched_copy_attr(uattr, &attr);
-+ if (retval)
-+ return retval;
-+
-+ if ((int)attr.sched_policy < 0)
-+ return -EINVAL;
-+
-+ rcu_read_lock();
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (likely(p))
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (likely(p)) {
-+ retval = sched_setattr(p, &attr);
-+ put_task_struct(p);
-+ }
-+
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getscheduler - get the policy (scheduling class) of a thread
-+ * @pid: the pid in question.
-+ *
-+ * Return: On success, the policy of the thread. Otherwise, a negative error
-+ * code.
-+ */
-+SYSCALL_DEFINE1(sched_getscheduler, pid_t, pid)
-+{
-+ struct task_struct *p;
-+ int retval = -EINVAL;
-+
-+ if (pid < 0)
-+ goto out_nounlock;
-+
-+ retval = -ESRCH;
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ if (p) {
-+ retval = security_task_getscheduler(p);
-+ if (!retval)
-+ retval = p->policy;
-+ }
-+ rcu_read_unlock();
-+
-+out_nounlock:
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getparam - get the RT priority of a thread
-+ * @pid: the pid in question.
-+ * @param: structure containing the RT priority.
-+ *
-+ * Return: On success, 0 and the RT priority is in @param. Otherwise, an error
-+ * code.
-+ */
-+SYSCALL_DEFINE2(sched_getparam, pid_t, pid, struct sched_param __user *, param)
-+{
-+ struct sched_param lp = { .sched_priority = 0 };
-+ struct task_struct *p;
-+ int retval = -EINVAL;
-+
-+ if (!param || pid < 0)
-+ goto out_nounlock;
-+
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ retval = -ESRCH;
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ if (task_has_rt_policy(p))
-+ lp.sched_priority = p->rt_priority;
-+ rcu_read_unlock();
-+
-+ /*
-+ * This one might sleep, we cannot do it with a spinlock held ...
-+ */
-+ retval = copy_to_user(param, &lp, sizeof(*param)) ? -EFAULT : 0;
-+
-+out_nounlock:
-+ return retval;
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
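
Both read-back syscalls above have thin glibc wrappers; a minimal sketch that
queries a PID given on the command line (illustrative only, not part of the
patch):

#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	pid_t pid = (argc > 1) ? (pid_t)atoi(argv[1]) : 0;	/* 0 == caller */
	struct sched_param sp;
	int policy;

	policy = sched_getscheduler(pid);
	if (policy == -1 || sched_getparam(pid, &sp) == -1) {
		perror("sched_getscheduler/sched_getparam");
		return 1;
	}
	printf("pid %ld: policy=%d rt_priority=%d\n",
	       (long)(pid ? pid : getpid()), policy, sp.sched_priority);
	return 0;
}
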
-+
-+/*
-+ * Copy the kernel size attribute structure (which might be larger
-+ * than what user-space knows about) to user-space.
-+ *
-+ * Note that all cases are valid: user-space buffer can be larger or
-+ * smaller than the kernel-space buffer. The usual case is that both
-+ * have the same size.
-+ */
-+static int
-+sched_attr_copy_to_user(struct sched_attr __user *uattr,
-+ struct sched_attr *kattr,
-+ unsigned int usize)
-+{
-+ unsigned int ksize = sizeof(*kattr);
-+
-+ if (!access_ok(uattr, usize))
-+ return -EFAULT;
-+
-+ /*
-+ * sched_getattr() ABI forwards and backwards compatibility:
-+ *
-+ * If usize == ksize then we just copy everything to user-space and all is good.
-+ *
-+ * If usize < ksize then we only copy as much as user-space has space for,
-+ * this keeps ABI compatibility as well. We skip the rest.
-+ *
-+ * If usize > ksize then user-space is using a newer version of the ABI,
-+ * which part the kernel doesn't know about. Just ignore it - tooling can
-+ * detect the kernel's knowledge of attributes from the attr->size value
-+ * which is set to ksize in this case.
-+ */
-+ kattr->size = min(usize, ksize);
-+
-+ if (copy_to_user(uattr, kattr, kattr->size))
-+ return -EFAULT;
-+
-+ return 0;
-+}
-+
-+/**
-+ * sys_sched_getattr - similar to sched_getparam, but with sched_attr
-+ * @pid: the pid in question.
-+ * @uattr: structure containing the extended parameters.
-+ * @usize: sizeof(attr) for fwd/bwd comp.
-+ * @flags: for future extension.
-+ */
-+SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
-+ unsigned int, usize, unsigned int, flags)
-+{
-+ struct sched_attr kattr = { };
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!uattr || pid < 0 || usize > PAGE_SIZE ||
-+ usize < SCHED_ATTR_SIZE_VER0 || flags)
-+ return -EINVAL;
-+
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ retval = -ESRCH;
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ kattr.sched_policy = p->policy;
-+ if (p->sched_reset_on_fork)
-+ kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-+ if (task_has_rt_policy(p))
-+ kattr.sched_priority = p->rt_priority;
-+ else
-+ kattr.sched_nice = task_nice(p);
-+ kattr.sched_flags &= SCHED_FLAG_ALL;
-+
-+#ifdef CONFIG_UCLAMP_TASK
-+ kattr.sched_util_min = p->uclamp_req[UCLAMP_MIN].value;
-+ kattr.sched_util_max = p->uclamp_req[UCLAMP_MAX].value;
-+#endif
-+
-+ rcu_read_unlock();
-+
-+ return sched_attr_copy_to_user(uattr, &kattr, usize);
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
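
glibc has no wrapper for sched_getattr(), so the usize/ksize handshake in
sched_attr_copy_to_user() above is normally exercised via syscall(2) with a
locally declared struct sched_attr. The sketch below assumes the VER0 layout
(48 bytes); the kernel simply reports its own size back in attr.size, which is
exactly what the copy helper above implements. Illustrative only, not part of
the patch:

#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Minimal local copy of the UAPI layout (SCHED_ATTR_SIZE_VER0). */
struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

int main(void)
{
	struct sched_attr attr = { 0 };

	/* pid 0 == calling thread, flags must be 0 */
	if (syscall(SYS_sched_getattr, 0, &attr, sizeof(attr), 0) == -1) {
		perror("sched_getattr");
		return 1;
	}
	printf("policy=%u nice=%d rt_priority=%u (kernel reported size %u)\n",
	       attr.sched_policy, attr.sched_nice, attr.sched_priority, attr.size);
	return 0;
}
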
-+
-+#ifdef CONFIG_SMP
-+int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
-+{
-+ return 0;
-+}
-+#endif
-+
-+static int
-+__sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
-+{
-+ int retval;
-+ cpumask_var_t cpus_allowed, new_mask;
-+
-+ if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
-+ retval = -ENOMEM;
-+ goto out_free_cpus_allowed;
-+ }
-+
-+ cpuset_cpus_allowed(p, cpus_allowed);
-+ cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
-+
-+ ctx->new_mask = new_mask;
-+ ctx->flags |= SCA_CHECK;
-+
-+ retval = __set_cpus_allowed_ptr(p, ctx);
-+ if (retval)
-+ goto out_free_new_mask;
-+
-+ cpuset_cpus_allowed(p, cpus_allowed);
-+ if (!cpumask_subset(new_mask, cpus_allowed)) {
-+ /*
-+ * We must have raced with a concurrent cpuset
-+ * update. Just reset the cpus_allowed to the
-+ * cpuset's cpus_allowed
-+ */
-+ cpumask_copy(new_mask, cpus_allowed);
-+
-+ /*
-+ * If SCA_USER is set, a 2nd call to __set_cpus_allowed_ptr()
-+ * will restore the previous user_cpus_ptr value.
-+ *
-+ * In the unlikely event a previous user_cpus_ptr exists,
-+ * we need to further restrict the mask to what is allowed
-+ * by that old user_cpus_ptr.
-+ */
-+ if (unlikely((ctx->flags & SCA_USER) && ctx->user_mask)) {
-+ bool empty = !cpumask_and(new_mask, new_mask,
-+ ctx->user_mask);
-+
-+ if (WARN_ON_ONCE(empty))
-+ cpumask_copy(new_mask, cpus_allowed);
-+ }
-+ __set_cpus_allowed_ptr(p, ctx);
-+ retval = -EINVAL;
-+ }
-+
-+out_free_new_mask:
-+ free_cpumask_var(new_mask);
-+out_free_cpus_allowed:
-+ free_cpumask_var(cpus_allowed);
-+ return retval;
-+}
-+
-+long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
-+{
-+ struct affinity_context ac;
-+ struct cpumask *user_mask;
-+ struct task_struct *p;
-+ int retval;
-+
-+ rcu_read_lock();
-+
-+ p = find_process_by_pid(pid);
-+ if (!p) {
-+ rcu_read_unlock();
-+ return -ESRCH;
-+ }
-+
-+ /* Prevent p going away */
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (p->flags & PF_NO_SETAFFINITY) {
-+ retval = -EINVAL;
-+ goto out_put_task;
-+ }
-+
-+ if (!check_same_owner(p)) {
-+ rcu_read_lock();
-+ if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
-+ rcu_read_unlock();
-+ retval = -EPERM;
-+ goto out_put_task;
-+ }
-+ rcu_read_unlock();
-+ }
-+
-+ retval = security_task_setscheduler(p);
-+ if (retval)
-+ goto out_put_task;
-+
-+ /*
-+ * With non-SMP configs, user_cpus_ptr/user_mask isn't used and
-+ * alloc_user_cpus_ptr() returns NULL.
-+ */
-+ user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
-+ if (user_mask) {
-+ cpumask_copy(user_mask, in_mask);
-+ } else if (IS_ENABLED(CONFIG_SMP)) {
-+ retval = -ENOMEM;
-+ goto out_put_task;
-+ }
-+
-+ ac = (struct affinity_context){
-+ .new_mask = in_mask,
-+ .user_mask = user_mask,
-+ .flags = SCA_USER,
-+ };
-+
-+ retval = __sched_setaffinity(p, &ac);
-+ kfree(ac.user_mask);
-+
-+out_put_task:
-+ put_task_struct(p);
-+ return retval;
-+}
-+
-+static int get_user_cpu_mask(unsigned long __user *user_mask_ptr, unsigned len,
-+ struct cpumask *new_mask)
-+{
-+ if (len < cpumask_size())
-+ cpumask_clear(new_mask);
-+ else if (len > cpumask_size())
-+ len = cpumask_size();
-+
-+ return copy_from_user(new_mask, user_mask_ptr, len) ? -EFAULT : 0;
-+}
-+
-+/**
-+ * sys_sched_setaffinity - set the CPU affinity of a process
-+ * @pid: pid of the process
-+ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
-+ * @user_mask_ptr: user-space pointer to the new CPU mask
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+SYSCALL_DEFINE3(sched_setaffinity, pid_t, pid, unsigned int, len,
-+ unsigned long __user *, user_mask_ptr)
-+{
-+ cpumask_var_t new_mask;
-+ int retval;
-+
-+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ retval = get_user_cpu_mask(user_mask_ptr, len, new_mask);
-+ if (retval == 0)
-+ retval = sched_setaffinity(pid, new_mask);
-+ free_cpumask_var(new_mask);
-+ return retval;
-+}
-+
-+long sched_getaffinity(pid_t pid, cpumask_t *mask)
-+{
-+ struct task_struct *p;
-+ raw_spinlock_t *lock;
-+ unsigned long flags;
-+ int retval;
-+
-+ rcu_read_lock();
-+
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ task_access_lock_irqsave(p, &lock, &flags);
-+ cpumask_and(mask, &p->cpus_mask, cpu_active_mask);
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+out_unlock:
-+ rcu_read_unlock();
-+
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getaffinity - get the CPU affinity of a process
-+ * @pid: pid of the process
-+ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
-+ * @user_mask_ptr: user-space pointer to hold the current CPU mask
-+ *
-+ * Return: size of CPU mask copied to user_mask_ptr on success. An
-+ * error code otherwise.
-+ */
-+SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
-+ unsigned long __user *, user_mask_ptr)
-+{
-+ int ret;
-+ cpumask_var_t mask;
-+
-+ if ((len * BITS_PER_BYTE) < nr_cpu_ids)
-+ return -EINVAL;
-+ if (len & (sizeof(unsigned long)-1))
-+ return -EINVAL;
-+
-+ if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ ret = sched_getaffinity(pid, mask);
-+ if (ret == 0) {
-+ unsigned int retlen = min(len, cpumask_size());
-+
-+ if (copy_to_user(user_mask_ptr, cpumask_bits(mask), retlen))
-+ ret = -EFAULT;
-+ else
-+ ret = retlen;
-+ }
-+ free_cpumask_var(mask);
-+
-+ return ret;
-+}
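
From userspace the two affinity syscalls above are driven through cpu_set_t
and the CPU_* macros; note that the raw sched_getaffinity syscall, as coded
above, returns the number of bytes copied, which the glibc wrapper turns into
0 on success. A minimal sketch pinning the caller to CPU 0 and reading the
mask back (illustrative only, not part of the patch):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(0, &set);
	if (sched_setaffinity(0, sizeof(set), &set) == -1) {	/* pid 0 == caller */
		perror("sched_setaffinity");
		return 1;
	}

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) == -1) {
		perror("sched_getaffinity");
		return 1;
	}
	printf("now allowed on %d CPU(s)\n", CPU_COUNT(&set));
	return 0;
}
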
-+
-+static void do_sched_yield(void)
-+{
-+ struct rq *rq;
-+ struct rq_flags rf;
-+
-+ if (!sched_yield_type)
-+ return;
-+
-+ rq = this_rq_lock_irq(&rf);
-+
-+ schedstat_inc(rq->yld_count);
-+
-+ if (1 == sched_yield_type) {
-+ if (!rt_task(current))
-+ do_sched_yield_type_1(current, rq);
-+ } else if (2 == sched_yield_type) {
-+ if (rq->nr_running > 1)
-+ rq->skip = current;
-+ }
-+
-+ preempt_disable();
-+ raw_spin_unlock_irq(&rq->lock);
-+ sched_preempt_enable_no_resched();
-+
-+ schedule();
-+}
-+
-+/**
-+ * sys_sched_yield - yield the current processor to other threads.
-+ *
-+ * This function yields the current CPU to other tasks. If there are no
-+ * other threads running on this CPU then this function will return.
-+ *
-+ * Return: 0.
-+ */
-+SYSCALL_DEFINE0(sched_yield)
-+{
-+ do_sched_yield();
-+ return 0;
-+}
-+
-+#if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
-+int __sched __cond_resched(void)
-+{
-+ if (should_resched(0)) {
-+ preempt_schedule_common();
-+ return 1;
-+ }
-+ /*
-+ * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
-+ * whether the current CPU is in an RCU read-side critical section,
-+ * so the tick can report quiescent states even for CPUs looping
-+ * in kernel context. In contrast, in non-preemptible kernels,
-+ * RCU readers leave no in-memory hints, which means that CPU-bound
-+ * processes executing in kernel context might never report an
-+ * RCU quiescent state. Therefore, the following code causes
-+ * cond_resched() to report a quiescent state, but only when RCU
-+ * is in urgent need of one.
-+ */
-+#ifndef CONFIG_PREEMPT_RCU
-+ rcu_all_qs();
-+#endif
-+ return 0;
-+}
-+EXPORT_SYMBOL(__cond_resched);
-+#endif
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#define cond_resched_dynamic_enabled __cond_resched
-+#define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
-+DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
-+EXPORT_STATIC_CALL_TRAMP(cond_resched);
-+
-+#define might_resched_dynamic_enabled __cond_resched
-+#define might_resched_dynamic_disabled ((void *)&__static_call_return0)
-+DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
-+EXPORT_STATIC_CALL_TRAMP(might_resched);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
-+int __sched dynamic_cond_resched(void)
-+{
-+ klp_sched_try_switch();
-+ if (!static_branch_unlikely(&sk_dynamic_cond_resched))
-+ return 0;
-+ return __cond_resched();
-+}
-+EXPORT_SYMBOL(dynamic_cond_resched);
-+
-+static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
-+int __sched dynamic_might_resched(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_might_resched))
-+ return 0;
-+ return __cond_resched();
-+}
-+EXPORT_SYMBOL(dynamic_might_resched);
-+#endif
-+#endif
-+
-+/*
-+ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
-+ * call schedule, and on return reacquire the lock.
-+ *
-+ * This works OK both with and without CONFIG_PREEMPTION. We do strange low-level
-+ * operations here to prevent schedule() from being called twice (once via
-+ * spin_unlock(), once by hand).
-+ */
-+int __cond_resched_lock(spinlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held(lock);
-+
-+ if (spin_needbreak(lock) || resched) {
-+ spin_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ spin_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_lock);
-+
-+int __cond_resched_rwlock_read(rwlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held_read(lock);
-+
-+ if (rwlock_needbreak(lock) || resched) {
-+ read_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ read_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_rwlock_read);
-+
-+int __cond_resched_rwlock_write(rwlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held_write(lock);
-+
-+ if (rwlock_needbreak(lock) || resched) {
-+ write_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ write_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_rwlock_write);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+
-+#ifdef CONFIG_GENERIC_ENTRY
-+#include <linux/entry-common.h>
-+#endif
-+
-+/*
-+ * SC:cond_resched
-+ * SC:might_resched
-+ * SC:preempt_schedule
-+ * SC:preempt_schedule_notrace
-+ * SC:irqentry_exit_cond_resched
-+ *
-+ *
-+ * NONE:
-+ * cond_resched <- __cond_resched
-+ * might_resched <- RET0
-+ * preempt_schedule <- NOP
-+ * preempt_schedule_notrace <- NOP
-+ * irqentry_exit_cond_resched <- NOP
-+ *
-+ * VOLUNTARY:
-+ * cond_resched <- __cond_resched
-+ * might_resched <- __cond_resched
-+ * preempt_schedule <- NOP
-+ * preempt_schedule_notrace <- NOP
-+ * irqentry_exit_cond_resched <- NOP
-+ *
-+ * FULL:
-+ * cond_resched <- RET0
-+ * might_resched <- RET0
-+ * preempt_schedule <- preempt_schedule
-+ * preempt_schedule_notrace <- preempt_schedule_notrace
-+ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
-+ */
-+
-+enum {
-+ preempt_dynamic_undefined = -1,
-+ preempt_dynamic_none,
-+ preempt_dynamic_voluntary,
-+ preempt_dynamic_full,
-+};
-+
-+int preempt_dynamic_mode = preempt_dynamic_undefined;
-+
-+int sched_dynamic_mode(const char *str)
-+{
-+ if (!strcmp(str, "none"))
-+ return preempt_dynamic_none;
-+
-+ if (!strcmp(str, "voluntary"))
-+ return preempt_dynamic_voluntary;
-+
-+ if (!strcmp(str, "full"))
-+ return preempt_dynamic_full;
-+
-+ return -EINVAL;
-+}
-+
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
-+#define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+#define preempt_dynamic_enable(f) static_key_enable(&sk_dynamic_##f.key)
-+#define preempt_dynamic_disable(f) static_key_disable(&sk_dynamic_##f.key)
-+#else
-+#error "Unsupported PREEMPT_DYNAMIC mechanism"
-+#endif
-+
-+static DEFINE_MUTEX(sched_dynamic_mutex);
-+static bool klp_override;
-+
-+static void __sched_dynamic_update(int mode)
-+{
-+ /*
-+ * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
-+ * the ZERO state, which is invalid.
-+ */
-+ if (!klp_override)
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_enable(might_resched);
-+ preempt_dynamic_enable(preempt_schedule);
-+ preempt_dynamic_enable(preempt_schedule_notrace);
-+ preempt_dynamic_enable(irqentry_exit_cond_resched);
-+
-+ switch (mode) {
-+ case preempt_dynamic_none:
-+ if (!klp_override)
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_disable(might_resched);
-+ preempt_dynamic_disable(preempt_schedule);
-+ preempt_dynamic_disable(preempt_schedule_notrace);
-+ preempt_dynamic_disable(irqentry_exit_cond_resched);
-+ if (mode != preempt_dynamic_mode)
-+ pr_info("Dynamic Preempt: none\n");
-+ break;
-+
-+ case preempt_dynamic_voluntary:
-+ if (!klp_override)
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_enable(might_resched);
-+ preempt_dynamic_disable(preempt_schedule);
-+ preempt_dynamic_disable(preempt_schedule_notrace);
-+ preempt_dynamic_disable(irqentry_exit_cond_resched);
-+ if (mode != preempt_dynamic_mode)
-+ pr_info("Dynamic Preempt: voluntary\n");
-+ break;
-+
-+ case preempt_dynamic_full:
-+ if (!klp_override)
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_disable(might_resched);
-+ preempt_dynamic_enable(preempt_schedule);
-+ preempt_dynamic_enable(preempt_schedule_notrace);
-+ preempt_dynamic_enable(irqentry_exit_cond_resched);
-+ if (mode != preempt_dynamic_mode)
-+ pr_info("Dynamic Preempt: full\n");
-+ break;
-+ }
-+
-+ preempt_dynamic_mode = mode;
-+}
-+
-+void sched_dynamic_update(int mode)
-+{
-+ mutex_lock(&sched_dynamic_mutex);
-+ __sched_dynamic_update(mode);
-+ mutex_unlock(&sched_dynamic_mutex);
-+}
-+
-+#ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
-+
-+static int klp_cond_resched(void)
-+{
-+ __klp_sched_try_switch();
-+ return __cond_resched();
-+}
-+
-+void sched_dynamic_klp_enable(void)
-+{
-+ mutex_lock(&sched_dynamic_mutex);
-+
-+ klp_override = true;
-+ static_call_update(cond_resched, klp_cond_resched);
-+
-+ mutex_unlock(&sched_dynamic_mutex);
-+}
-+
-+void sched_dynamic_klp_disable(void)
-+{
-+ mutex_lock(&sched_dynamic_mutex);
-+
-+ klp_override = false;
-+ __sched_dynamic_update(preempt_dynamic_mode);
-+
-+ mutex_unlock(&sched_dynamic_mutex);
-+}
-+
-+#endif /* CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
-+
-+
-+static int __init setup_preempt_mode(char *str)
-+{
-+ int mode = sched_dynamic_mode(str);
-+ if (mode < 0) {
-+ pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
-+ return 0;
-+ }
-+
-+ sched_dynamic_update(mode);
-+ return 1;
-+}
-+__setup("preempt=", setup_preempt_mode);
-+
-+static void __init preempt_dynamic_init(void)
-+{
-+ if (preempt_dynamic_mode == preempt_dynamic_undefined) {
-+ if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
-+ sched_dynamic_update(preempt_dynamic_none);
-+ } else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
-+ sched_dynamic_update(preempt_dynamic_voluntary);
-+ } else {
-+ /* Default static call setting, nothing to do */
-+ WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
-+ preempt_dynamic_mode = preempt_dynamic_full;
-+ pr_info("Dynamic Preempt: full\n");
-+ }
-+ }
-+}
-+
-+#define PREEMPT_MODEL_ACCESSOR(mode) \
-+ bool preempt_model_##mode(void) \
-+ { \
-+ WARN_ON_ONCE(preempt_dynamic_mode == preempt_dynamic_undefined); \
-+ return preempt_dynamic_mode == preempt_dynamic_##mode; \
-+ } \
-+ EXPORT_SYMBOL_GPL(preempt_model_##mode)
-+
-+PREEMPT_MODEL_ACCESSOR(none);
-+PREEMPT_MODEL_ACCESSOR(voluntary);
-+PREEMPT_MODEL_ACCESSOR(full);
-+
-+#else /* !CONFIG_PREEMPT_DYNAMIC */
-+
-+static inline void preempt_dynamic_init(void) { }
-+
-+#endif /* #ifdef CONFIG_PREEMPT_DYNAMIC */
-+
-+/**
-+ * yield - yield the current processor to other threads.
-+ *
-+ * Do not ever use this function, there's a 99% chance you're doing it wrong.
-+ *
-+ * The scheduler is at all times free to pick the calling task as the most
-+ * eligible task to run, if removing the yield() call from your code breaks
-+ * it, it's already broken.
-+ *
-+ * Typical broken usage is:
-+ *
-+ * while (!event)
-+ * yield();
-+ *
-+ * where one assumes that yield() will let 'the other' process run that will
-+ * make event true. If the current task is a SCHED_FIFO task that will never
-+ * happen. Never use yield() as a progress guarantee!!
-+ *
-+ * If you want to use yield() to wait for something, use wait_event().
-+ * If you want to use yield() to be 'nice' for others, use cond_resched().
-+ * If you still want to use yield(), do not!
-+ */
-+void __sched yield(void)
-+{
-+ set_current_state(TASK_RUNNING);
-+ do_sched_yield();
-+}
-+EXPORT_SYMBOL(yield);
-+
-+/**
-+ * yield_to - yield the current processor to another thread in
-+ * your thread group, or accelerate that thread toward the
-+ * processor it's on.
-+ * @p: target task
-+ * @preempt: whether task preemption is allowed or not
-+ *
-+ * It's the caller's job to ensure that the target task struct
-+ * can't go away on us before we can do any checks.
-+ *
-+ * In Alt schedule FW, yield_to is not supported.
-+ *
-+ * Return:
-+ * true (>0) if we indeed boosted the target task.
-+ * false (0) if we failed to boost the target.
-+ * -ESRCH if there's no task to yield to.
-+ */
-+int __sched yield_to(struct task_struct *p, bool preempt)
-+{
-+ return 0;
-+}
-+EXPORT_SYMBOL_GPL(yield_to);
-+
-+int io_schedule_prepare(void)
-+{
-+ int old_iowait = current->in_iowait;
-+
-+ current->in_iowait = 1;
-+ blk_flush_plug(current->plug, true);
-+ return old_iowait;
-+}
-+
-+void io_schedule_finish(int token)
-+{
-+ current->in_iowait = token;
-+}
-+
-+/*
-+ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
-+ * that process accounting knows that this is a task in IO wait state.
-+ *
-+ * But don't do that if it is a deliberate, throttling IO wait (this task
-+ * has set its backing_dev_info: the queue against which it should throttle)
-+ */
-+
-+long __sched io_schedule_timeout(long timeout)
-+{
-+ int token;
-+ long ret;
-+
-+ token = io_schedule_prepare();
-+ ret = schedule_timeout(timeout);
-+ io_schedule_finish(token);
-+
-+ return ret;
-+}
-+EXPORT_SYMBOL(io_schedule_timeout);
-+
-+void __sched io_schedule(void)
-+{
-+ int token;
-+
-+ token = io_schedule_prepare();
-+ schedule();
-+ io_schedule_finish(token);
-+}
-+EXPORT_SYMBOL(io_schedule);
-+
-+/**
-+ * sys_sched_get_priority_max - return maximum RT priority.
-+ * @policy: scheduling class.
-+ *
-+ * Return: On success, this syscall returns the maximum
-+ * rt_priority that can be used by a given scheduling class.
-+ * On failure, a negative error code is returned.
-+ */
-+SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
-+{
-+ int ret = -EINVAL;
-+
-+ switch (policy) {
-+ case SCHED_FIFO:
-+ case SCHED_RR:
-+ ret = MAX_RT_PRIO - 1;
-+ break;
-+ case SCHED_NORMAL:
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ ret = 0;
-+ break;
-+ }
-+ return ret;
-+}
-+
-+/**
-+ * sys_sched_get_priority_min - return minimum RT priority.
-+ * @policy: scheduling class.
-+ *
-+ * Return: On success, this syscall returns the minimum
-+ * rt_priority that can be used by a given scheduling class.
-+ * On failure, a negative error code is returned.
-+ */
-+SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
-+{
-+ int ret = -EINVAL;
-+
-+ switch (policy) {
-+ case SCHED_FIFO:
-+ case SCHED_RR:
-+ ret = 1;
-+ break;
-+ case SCHED_NORMAL:
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ ret = 0;
-+ break;
-+ }
-+ return ret;
-+}
-+
-+static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
-+{
-+ struct task_struct *p;
-+ int retval;
-+
-+ alt_sched_debug();
-+
-+ if (pid < 0)
-+ return -EINVAL;
-+
-+ retval = -ESRCH;
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+ rcu_read_unlock();
-+
-+ *t = ns_to_timespec64(sched_timeslice_ns);
-+ return 0;
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_rr_get_interval - return the default timeslice of a process.
-+ * @pid: pid of the process.
-+ * @interval: userspace pointer to the timeslice value.
-+ *
-+ *
-+ * Return: On success, 0 and the timeslice is in @interval. Otherwise,
-+ * an error code.
-+ */
-+SYSCALL_DEFINE2(sched_rr_get_interval, pid_t, pid,
-+ struct __kernel_timespec __user *, interval)
-+{
-+ struct timespec64 t;
-+ int retval = sched_rr_get_interval(pid, &t);
-+
-+ if (retval == 0)
-+ retval = put_timespec64(&t, interval);
-+
-+ return retval;
-+}
-+
-+#ifdef CONFIG_COMPAT_32BIT_TIME
-+SYSCALL_DEFINE2(sched_rr_get_interval_time32, pid_t, pid,
-+ struct old_timespec32 __user *, interval)
-+{
-+ struct timespec64 t;
-+ int retval = sched_rr_get_interval(pid, &t);
-+
-+ if (retval == 0)
-+ retval = put_old_timespec32(&t, interval);
-+ return retval;
-+}
-+#endif
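
The priority-range and timeslice queries above are also reachable through
plain glibc wrappers; under this scheduler sched_rr_get_interval(2) reports
the fixed sched_timeslice_ns rather than a per-task slice. A small sketch
(illustrative only, not part of the patch):

#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
	struct timespec ts;

	printf("SCHED_FIFO priority range: %d..%d\n",
	       sched_get_priority_min(SCHED_FIFO),
	       sched_get_priority_max(SCHED_FIFO));

	if (sched_rr_get_interval(0, &ts) == 0)	/* pid 0 == caller */
		printf("round-robin timeslice: %ld.%09ld s\n",
		       (long)ts.tv_sec, ts.tv_nsec);
	return 0;
}
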
-+
-+void sched_show_task(struct task_struct *p)
-+{
-+ unsigned long free = 0;
-+ int ppid;
-+
-+ if (!try_get_task_stack(p))
-+ return;
-+
-+ pr_info("task:%-15.15s state:%c", p->comm, task_state_to_char(p));
-+
-+ if (task_is_running(p))
-+ pr_cont(" running task ");
-+#ifdef CONFIG_DEBUG_STACK_USAGE
-+ free = stack_not_used(p);
-+#endif
-+ ppid = 0;
-+ rcu_read_lock();
-+ if (pid_alive(p))
-+ ppid = task_pid_nr(rcu_dereference(p->real_parent));
-+ rcu_read_unlock();
-+ pr_cont(" stack:%-5lu pid:%-5d ppid:%-6d flags:0x%08lx\n",
-+ free, task_pid_nr(p), ppid,
-+ read_task_thread_flags(p));
-+
-+ print_worker_info(KERN_INFO, p);
-+ print_stop_info(KERN_INFO, p);
-+ show_stack(p, NULL, KERN_INFO);
-+ put_task_stack(p);
-+}
-+EXPORT_SYMBOL_GPL(sched_show_task);
-+
-+static inline bool
-+state_filter_match(unsigned long state_filter, struct task_struct *p)
-+{
-+ unsigned int state = READ_ONCE(p->__state);
-+
-+ /* no filter, everything matches */
-+ if (!state_filter)
-+ return true;
-+
-+ /* filter, but doesn't match */
-+ if (!(state & state_filter))
-+ return false;
-+
-+ /*
-+ * When looking for TASK_UNINTERRUPTIBLE skip TASK_IDLE (allows
-+ * TASK_KILLABLE).
-+ */
-+ if (state_filter == TASK_UNINTERRUPTIBLE && (state & TASK_NOLOAD))
-+ return false;
-+
-+ return true;
-+}
-+
-+
-+void show_state_filter(unsigned int state_filter)
-+{
-+ struct task_struct *g, *p;
-+
-+ rcu_read_lock();
-+ for_each_process_thread(g, p) {
-+ /*
-+ * reset the NMI-timeout, listing all files on a slow
-+ * console might take a lot of time:
-+ * Also, reset softlockup watchdogs on all CPUs, because
-+ * another CPU might be blocked waiting for us to process
-+ * an IPI.
-+ */
-+ touch_nmi_watchdog();
-+ touch_all_softlockup_watchdogs();
-+ if (state_filter_match(state_filter, p))
-+ sched_show_task(p);
-+ }
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+ /* TODO: Alt schedule FW should support this
-+ if (!state_filter)
-+ sysrq_sched_debug_show();
-+ */
-+#endif
-+ rcu_read_unlock();
-+ /*
-+ * Only show locks if all tasks are dumped:
-+ */
-+ if (!state_filter)
-+ debug_show_all_locks();
-+}
-+
-+void dump_cpu_task(int cpu)
-+{
-+ if (cpu == smp_processor_id() && in_hardirq()) {
-+ struct pt_regs *regs;
-+
-+ regs = get_irq_regs();
-+ if (regs) {
-+ show_regs(regs);
-+ return;
-+ }
-+ }
-+
-+ if (trigger_single_cpu_backtrace(cpu))
-+ return;
-+
-+ pr_info("Task dump for CPU %d:\n", cpu);
-+ sched_show_task(cpu_curr(cpu));
-+}
-+
-+/**
-+ * init_idle - set up an idle thread for a given CPU
-+ * @idle: task in question
-+ * @cpu: CPU the idle task belongs to
-+ *
-+ * NOTE: this function does not set the idle thread's NEED_RESCHED
-+ * flag, to make booting more robust.
-+ */
-+void __init init_idle(struct task_struct *idle, int cpu)
-+{
-+#ifdef CONFIG_SMP
-+ struct affinity_context ac = (struct affinity_context) {
-+ .new_mask = cpumask_of(cpu),
-+ .flags = 0,
-+ };
-+#endif
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ __sched_fork(0, idle);
-+
-+ raw_spin_lock_irqsave(&idle->pi_lock, flags);
-+ raw_spin_lock(&rq->lock);
-+
-+ idle->last_ran = rq->clock_task;
-+ idle->__state = TASK_RUNNING;
-+ /*
-+ * PF_KTHREAD should already be set at this point; regardless, make it
-+ * look like a proper per-CPU kthread.
-+ */
-+ idle->flags |= PF_IDLE | PF_KTHREAD | PF_NO_SETAFFINITY;
-+ kthread_set_per_cpu(idle, cpu);
-+
-+ sched_queue_init_idle(&rq->queue, idle);
-+
-+#ifdef CONFIG_SMP
-+ /*
-+ * It's possible that init_idle() gets called multiple times on a task,
-+ * in that case do_set_cpus_allowed() will not do the right thing.
-+ *
-+ * And since this is boot we can forgo the serialisation.
-+ */
-+ set_cpus_allowed_common(idle, &ac);
-+#endif
-+
-+ /* Silence PROVE_RCU */
-+ rcu_read_lock();
-+ __set_task_cpu(idle, cpu);
-+ rcu_read_unlock();
-+
-+ rq->idle = idle;
-+ rcu_assign_pointer(rq->curr, idle);
-+ idle->on_cpu = 1;
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
-+
-+ /* Set the preempt count _outside_ the spinlocks! */
-+ init_idle_preempt_count(idle, cpu);
-+
-+ ftrace_graph_init_idle_task(idle, cpu);
-+ vtime_init_idle(idle, cpu);
-+#ifdef CONFIG_SMP
-+ sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
-+#endif
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+int cpuset_cpumask_can_shrink(const struct cpumask __maybe_unused *cur,
-+ const struct cpumask __maybe_unused *trial)
-+{
-+ return 1;
-+}
-+
-+int task_can_attach(struct task_struct *p,
-+ const struct cpumask *cs_effective_cpus)
-+{
-+ int ret = 0;
-+
-+ /*
-+ * Kthreads which disallow setaffinity shouldn't be moved
-+ * to a new cpuset; we don't want to change their CPU
-+ * affinity and isolating such threads by their set of
-+ * allowed nodes is unnecessary. Thus, cpusets are not
-+ * applicable for such threads. This prevents checking for
-+ * success of set_cpus_allowed_ptr() on all attached tasks
-+ * before cpus_mask may be changed.
-+ */
-+ if (p->flags & PF_NO_SETAFFINITY)
-+ ret = -EINVAL;
-+
-+ return ret;
-+}
-+
-+bool sched_smp_initialized __read_mostly;
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+/*
-+ * Ensures that the idle task is using init_mm right before its CPU goes
-+ * offline.
-+ */
-+void idle_task_exit(void)
-+{
-+ struct mm_struct *mm = current->active_mm;
-+
-+ BUG_ON(current != this_rq()->idle);
-+
-+ if (mm != &init_mm) {
-+ switch_mm(mm, &init_mm, current);
-+ finish_arch_post_lock_switch();
-+ }
-+
-+ /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
-+}
-+
-+static int __balance_push_cpu_stop(void *arg)
-+{
-+ struct task_struct *p = arg;
-+ struct rq *rq = this_rq();
-+ struct rq_flags rf;
-+ int cpu;
-+
-+ raw_spin_lock_irq(&p->pi_lock);
-+ rq_lock(rq, &rf);
-+
-+ update_rq_clock(rq);
-+
-+ if (task_rq(p) == rq && task_on_rq_queued(p)) {
-+ cpu = select_fallback_rq(rq->cpu, p);
-+ rq = __migrate_task(rq, p, cpu);
-+ }
-+
-+ rq_unlock(rq, &rf);
-+ raw_spin_unlock_irq(&p->pi_lock);
-+
-+ put_task_struct(p);
-+
-+ return 0;
-+}
-+
-+static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
-+
-+/*
-+ * This is enabled below SCHED_AP_ACTIVE; when !cpu_active(), but only
-+ * effective when the hotplug motion is down.
-+ */
-+static void balance_push(struct rq *rq)
-+{
-+ struct task_struct *push_task = rq->curr;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*
-+ * Ensure the thing is persistent until balance_push_set(.on = false);
-+ */
-+ rq->balance_callback = &balance_push_callback;
-+
-+ /*
-+ * Only active while going offline and when invoked on the outgoing
-+ * CPU.
-+ */
-+ if (!cpu_dying(rq->cpu) || rq != this_rq())
-+ return;
-+
-+ /*
-+ * Both the cpu-hotplug and stop task are in this case and are
-+ * required to complete the hotplug process.
-+ */
-+ if (kthread_is_per_cpu(push_task) ||
-+ is_migration_disabled(push_task)) {
-+
-+ /*
-+ * If this is the idle task on the outgoing CPU try to wake
-+ * up the hotplug control thread which might wait for the
-+ * last task to vanish. The rcuwait_active() check is
-+ * accurate here because the waiter is pinned on this CPU
-+ * and can't obviously be running in parallel.
-+ *
-+ * On RT kernels this also has to check whether there are
-+ * pinned and scheduled out tasks on the runqueue. They
-+ * need to leave the migrate disabled section first.
-+ */
-+ if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
-+ rcuwait_active(&rq->hotplug_wait)) {
-+ raw_spin_unlock(&rq->lock);
-+ rcuwait_wake_up(&rq->hotplug_wait);
-+ raw_spin_lock(&rq->lock);
-+ }
-+ return;
-+ }
-+
-+ get_task_struct(push_task);
-+ /*
-+ * Temporarily drop rq->lock such that we can wake-up the stop task.
-+ * Both preemption and IRQs are still disabled.
-+ */
-+ raw_spin_unlock(&rq->lock);
-+ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
-+ this_cpu_ptr(&push_work));
-+ /*
-+ * At this point need_resched() is true and we'll take the loop in
-+ * schedule(). The next pick is obviously going to be the stop task
-+ * which kthread_is_per_cpu() and will push this task away.
-+ */
-+ raw_spin_lock(&rq->lock);
-+}
-+
-+static void balance_push_set(int cpu, bool on)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ struct rq_flags rf;
-+
-+ rq_lock_irqsave(rq, &rf);
-+ if (on) {
-+ WARN_ON_ONCE(rq->balance_callback);
-+ rq->balance_callback = &balance_push_callback;
-+ } else if (rq->balance_callback == &balance_push_callback) {
-+ rq->balance_callback = NULL;
-+ }
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+/*
-+ * Invoked from a CPU's hotplug control thread after the CPU has been marked
-+ * inactive. All tasks which are not per CPU kernel threads are either
-+ * pushed off this CPU now via balance_push() or placed on a different CPU
-+ * during wakeup. Wait until the CPU is quiescent.
-+ */
-+static void balance_hotplug_wait(void)
-+{
-+ struct rq *rq = this_rq();
-+
-+ rcuwait_wait_event(&rq->hotplug_wait,
-+ rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
-+ TASK_UNINTERRUPTIBLE);
-+}
-+
-+#else
-+
-+static void balance_push(struct rq *rq)
-+{
-+}
-+
-+static void balance_push_set(int cpu, bool on)
-+{
-+}
-+
-+static inline void balance_hotplug_wait(void)
-+{
-+}
-+#endif /* CONFIG_HOTPLUG_CPU */
-+
-+static void set_rq_offline(struct rq *rq)
-+{
-+ if (rq->online)
-+ rq->online = false;
-+}
-+
-+static void set_rq_online(struct rq *rq)
-+{
-+ if (!rq->online)
-+ rq->online = true;
-+}
-+
-+/*
-+ * used to mark begin/end of suspend/resume:
-+ */
-+static int num_cpus_frozen;
-+
-+/*
-+ * Update cpusets according to cpu_active mask. If cpusets are
-+ * disabled, cpuset_update_active_cpus() becomes a simple wrapper
-+ * around partition_sched_domains().
-+ *
-+ * If we come here as part of a suspend/resume, don't touch cpusets because we
-+ * want to restore it back to its original state upon resume anyway.
-+ */
-+static void cpuset_cpu_active(void)
-+{
-+ if (cpuhp_tasks_frozen) {
-+ /*
-+ * num_cpus_frozen tracks how many CPUs are involved in suspend
-+ * resume sequence. As long as this is not the last online
-+ * operation in the resume sequence, just build a single sched
-+ * domain, ignoring cpusets.
-+ */
-+ partition_sched_domains(1, NULL, NULL);
-+ if (--num_cpus_frozen)
-+ return;
-+ /*
-+ * This is the last CPU online operation. So fall through and
-+ * restore the original sched domains by considering the
-+ * cpuset configurations.
-+ */
-+ cpuset_force_rebuild();
-+ }
-+
-+ cpuset_update_active_cpus();
-+}
-+
-+static int cpuset_cpu_inactive(unsigned int cpu)
-+{
-+ if (!cpuhp_tasks_frozen) {
-+ cpuset_update_active_cpus();
-+ } else {
-+ num_cpus_frozen++;
-+ partition_sched_domains(1, NULL, NULL);
-+ }
-+ return 0;
-+}
-+
-+int sched_cpu_activate(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ /*
-+ * Clear the balance_push callback and prepare to schedule
-+ * regular tasks.
-+ */
-+ balance_push_set(cpu, false);
-+
-+#ifdef CONFIG_SCHED_SMT
-+ /*
-+ * When going up, increment the number of cores with SMT present.
-+ */
-+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
-+ static_branch_inc_cpuslocked(&sched_smt_present);
-+#endif
-+ set_cpu_active(cpu, true);
-+
-+ if (sched_smp_initialized)
-+ cpuset_cpu_active();
-+
-+ /*
-+ * Put the rq online, if not already. This happens:
-+ *
-+ * 1) In the early boot process, because we build the real domains
-+ * after all cpus have been brought up.
-+ *
-+ * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
-+ * domains.
-+ */
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ set_rq_online(rq);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ return 0;
-+}
-+
-+int sched_cpu_deactivate(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+ int ret;
-+
-+ set_cpu_active(cpu, false);
-+
-+ /*
-+ * From this point forward, this CPU will refuse to run any task that
-+ * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
-+ * push those tasks away until this gets cleared, see
-+ * sched_cpu_dying().
-+ */
-+ balance_push_set(cpu, true);
-+
-+ /*
-+ * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
-+ * users of this state to go away such that all new such users will
-+ * observe it.
-+ *
-+ * Specifically, we rely on ttwu to no longer target this CPU, see
-+ * ttwu_queue_cond() and is_cpu_allowed().
-+ *
-+	 * Do the sync before parking smpboot threads to take care of the RCU boost case.
-+ */
-+ synchronize_rcu();
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ update_rq_clock(rq);
-+ set_rq_offline(rq);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+#ifdef CONFIG_SCHED_SMT
-+ /*
-+ * When going down, decrement the number of cores with SMT present.
-+ */
-+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
-+ static_branch_dec_cpuslocked(&sched_smt_present);
-+ if (!static_branch_likely(&sched_smt_present))
-+ cpumask_clear(&sched_sg_idle_mask);
-+ }
-+#endif
-+
-+ if (!sched_smp_initialized)
-+ return 0;
-+
-+ ret = cpuset_cpu_inactive(cpu);
-+ if (ret) {
-+ balance_push_set(cpu, false);
-+ set_cpu_active(cpu, true);
-+ return ret;
-+ }
-+
-+ return 0;
-+}
-+
-+static void sched_rq_cpu_starting(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ rq->calc_load_update = calc_load_update;
-+}
-+
-+int sched_cpu_starting(unsigned int cpu)
-+{
-+ sched_rq_cpu_starting(cpu);
-+ sched_tick_start(cpu);
-+ return 0;
-+}
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+
-+/*
-+ * Invoked immediately before the stopper thread is invoked to bring the
-+ * CPU down completely. At this point all per CPU kthreads except the
-+ * hotplug thread (current) and the stopper thread (inactive) have been
-+ * either parked or have been unbound from the outgoing CPU. Ensure that
-+ * any of those which might be on the way out are gone.
-+ *
-+ * If after this point a bound task is being woken on this CPU then the
-+ * responsible hotplug callback has failed to do its job.
-+ * sched_cpu_dying() will catch it with the appropriate fireworks.
-+ */
-+int sched_cpu_wait_empty(unsigned int cpu)
-+{
-+ balance_hotplug_wait();
-+ return 0;
-+}
-+
-+/*
-+ * Since this CPU is going 'away' for a while, fold any nr_active delta we
-+ * might have. Called from the CPU stopper task after ensuring that the
-+ * stopper is the last running task on the CPU, so nr_active count is
-+ * stable. We need to take the teardown thread which is calling this into
-+ * account, so we hand in adjust = 1 to the load calculation.
-+ *
-+ * Also see the comment "Global load-average calculations".
-+ */
-+static void calc_load_migrate(struct rq *rq)
-+{
-+ long delta = calc_load_fold_active(rq, 1);
-+
-+ if (delta)
-+ atomic_long_add(delta, &calc_load_tasks);
-+}
-+
-+static void dump_rq_tasks(struct rq *rq, const char *loglvl)
-+{
-+ struct task_struct *g, *p;
-+ int cpu = cpu_of(rq);
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
-+ for_each_process_thread(g, p) {
-+ if (task_cpu(p) != cpu)
-+ continue;
-+
-+ if (!task_on_rq_queued(p))
-+ continue;
-+
-+ printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
-+ }
-+}
-+
-+int sched_cpu_dying(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ /* Handle pending wakeups and then migrate everything off */
-+ sched_tick_stop(cpu);
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
-+ WARN(true, "Dying CPU not properly vacated!");
-+ dump_rq_tasks(rq, KERN_WARNING);
-+ }
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ calc_load_migrate(rq);
-+ hrtick_clear(rq);
-+ return 0;
-+}
-+#endif
-+
-+#ifdef CONFIG_SMP
-+static void sched_init_topology_cpumask_early(void)
-+{
-+ int cpu;
-+ cpumask_t *tmp;
-+
-+ for_each_possible_cpu(cpu) {
-+ /* init topo masks */
-+ tmp = per_cpu(sched_cpu_topo_masks, cpu);
-+
-+ cpumask_copy(tmp, cpumask_of(cpu));
-+ tmp++;
-+ cpumask_copy(tmp, cpu_possible_mask);
-+ per_cpu(sched_cpu_llc_mask, cpu) = tmp;
-+ per_cpu(sched_cpu_topo_end_mask, cpu) = ++tmp;
-+ /*per_cpu(sd_llc_id, cpu) = cpu;*/
-+ }
-+}
-+
-+#define TOPOLOGY_CPUMASK(name, mask, last)\
-+ if (cpumask_and(topo, topo, mask)) { \
-+ cpumask_copy(topo, mask); \
-+ printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - "#name, \
-+ cpu, (topo++)->bits[0]); \
-+ } \
-+ if (!last) \
-+ bitmap_complement(cpumask_bits(topo), cpumask_bits(mask), \
-+ nr_cpumask_bits);
-+
-+static void sched_init_topology_cpumask(void)
-+{
-+ int cpu;
-+ cpumask_t *topo;
-+
-+ for_each_online_cpu(cpu) {
-+ /* take chance to reset time slice for idle tasks */
-+ cpu_rq(cpu)->idle->time_slice = sched_timeslice_ns;
-+
-+ topo = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+
-+ bitmap_complement(cpumask_bits(topo), cpumask_bits(cpumask_of(cpu)),
-+ nr_cpumask_bits);
-+#ifdef CONFIG_SCHED_SMT
-+ TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false);
-+#endif
-+ per_cpu(sd_llc_id, cpu) = cpumask_first(cpu_coregroup_mask(cpu));
-+ per_cpu(sched_cpu_llc_mask, cpu) = topo;
-+ TOPOLOGY_CPUMASK(coregroup, cpu_coregroup_mask(cpu), false);
-+
-+ TOPOLOGY_CPUMASK(core, topology_core_cpumask(cpu), false);
-+
-+ TOPOLOGY_CPUMASK(others, cpu_online_mask, true);
-+
-+ per_cpu(sched_cpu_topo_end_mask, cpu) = topo;
-+ printk(KERN_INFO "sched: cpu#%02d llc_id = %d, llc_mask idx = %d\n",
-+ cpu, per_cpu(sd_llc_id, cpu),
-+ (int) (per_cpu(sched_cpu_llc_mask, cpu) -
-+ per_cpu(sched_cpu_topo_masks, cpu)));
-+ }
-+}
-+#endif
-+
-+void __init sched_init_smp(void)
-+{
-+ /* Move init over to a non-isolated CPU */
-+ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
-+ BUG();
-+ current->flags &= ~PF_NO_SETAFFINITY;
-+
-+ sched_init_topology_cpumask();
-+
-+ sched_smp_initialized = true;
-+}
-+
-+static int __init migration_init(void)
-+{
-+ sched_cpu_starting(smp_processor_id());
-+ return 0;
-+}
-+early_initcall(migration_init);
-+
-+#else
-+void __init sched_init_smp(void)
-+{
-+ cpu_rq(0)->idle->time_slice = sched_timeslice_ns;
-+}
-+#endif /* CONFIG_SMP */
-+
-+int in_sched_functions(unsigned long addr)
-+{
-+ return in_lock_functions(addr) ||
-+ (addr >= (unsigned long)__sched_text_start
-+ && addr < (unsigned long)__sched_text_end);
-+}
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+/* task group related information */
-+struct task_group {
-+ struct cgroup_subsys_state css;
-+
-+ struct rcu_head rcu;
-+ struct list_head list;
-+
-+ struct task_group *parent;
-+ struct list_head siblings;
-+ struct list_head children;
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+ unsigned long shares;
-+#endif
-+};
-+
-+/*
-+ * Default task group.
-+ * Every task in system belongs to this group at bootup.
-+ */
-+struct task_group root_task_group;
-+LIST_HEAD(task_groups);
-+
-+/* Cacheline aligned slab cache for task_group */
-+static struct kmem_cache *task_group_cache __read_mostly;
-+#endif /* CONFIG_CGROUP_SCHED */
-+
-+void __init sched_init(void)
-+{
-+ int i;
-+ struct rq *rq;
-+
-+ printk(KERN_INFO "sched/alt: "ALT_SCHED_NAME" CPU Scheduler "ALT_SCHED_VERSION\
-+ " by Alfred Chen.\n");
-+
-+ wait_bit_init();
-+
-+#ifdef CONFIG_SMP
-+ for (i = 0; i < SCHED_QUEUE_BITS; i++)
-+ cpumask_copy(sched_preempt_mask + i, cpu_present_mask);
-+#endif
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+ task_group_cache = KMEM_CACHE(task_group, 0);
-+
-+ list_add(&root_task_group.list, &task_groups);
-+ INIT_LIST_HEAD(&root_task_group.children);
-+ INIT_LIST_HEAD(&root_task_group.siblings);
-+#endif /* CONFIG_CGROUP_SCHED */
-+ for_each_possible_cpu(i) {
-+ rq = cpu_rq(i);
-+
-+ sched_queue_init(&rq->queue);
-+ rq->prio = IDLE_TASK_SCHED_PRIO;
-+ rq->skip = NULL;
-+
-+ raw_spin_lock_init(&rq->lock);
-+ rq->nr_running = rq->nr_uninterruptible = 0;
-+ rq->calc_load_active = 0;
-+ rq->calc_load_update = jiffies + LOAD_FREQ;
-+#ifdef CONFIG_SMP
-+ rq->online = false;
-+ rq->cpu = i;
-+
-+#ifdef CONFIG_SCHED_SMT
-+ rq->active_balance = 0;
-+#endif
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+ INIT_CSD(&rq->nohz_csd, nohz_csd_func, rq);
-+#endif
-+ rq->balance_callback = &balance_push_callback;
-+#ifdef CONFIG_HOTPLUG_CPU
-+ rcuwait_init(&rq->hotplug_wait);
-+#endif
-+#endif /* CONFIG_SMP */
-+ rq->nr_switches = 0;
-+
-+ hrtick_rq_init(rq);
-+ atomic_set(&rq->nr_iowait, 0);
-+
-+ zalloc_cpumask_var_node(&rq->scratch_mask, GFP_KERNEL, cpu_to_node(i));
-+ }
-+#ifdef CONFIG_SMP
-+ /* Set rq->online for cpu 0 */
-+ cpu_rq(0)->online = true;
-+#endif
-+ /*
-+ * The boot idle thread does lazy MMU switching as well:
-+ */
-+ mmgrab(&init_mm);
-+ enter_lazy_tlb(&init_mm, current);
-+
-+ /*
-+ * The idle task doesn't need the kthread struct to function, but it
-+ * is dressed up as a per-CPU kthread and thus needs to play the part
-+ * if we want to avoid special-casing it in code that deals with per-CPU
-+ * kthreads.
-+ */
-+ WARN_ON(!set_kthread_struct(current));
-+
-+ /*
-+ * Make us the idle thread. Technically, schedule() should not be
-+ * called from this thread, however somewhere below it might be,
-+ * but because we are the idle thread, we just pick up running again
-+ * when this runqueue becomes "idle".
-+ */
-+ init_idle(current, smp_processor_id());
-+
-+ calc_load_update = jiffies + LOAD_FREQ;
-+
-+#ifdef CONFIG_SMP
-+ idle_thread_set_boot_cpu();
-+ balance_push_set(smp_processor_id(), false);
-+
-+ sched_init_topology_cpumask_early();
-+#endif /* SMP */
-+
-+ preempt_dynamic_init();
-+}
-+
-+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-+
-+void __might_sleep(const char *file, int line)
-+{
-+ unsigned int state = get_current_state();
-+ /*
-+ * Blocking primitives will set (and therefore destroy) current->state,
-+ * since we will exit with TASK_RUNNING make sure we enter with it,
-+ * otherwise we will destroy state.
-+ */
-+ WARN_ONCE(state != TASK_RUNNING && current->task_state_change,
-+ "do not call blocking ops when !TASK_RUNNING; "
-+ "state=%x set at [<%p>] %pS\n", state,
-+ (void *)current->task_state_change,
-+ (void *)current->task_state_change);
-+
-+ __might_resched(file, line, 0);
-+}
-+EXPORT_SYMBOL(__might_sleep);
-+
-+static void print_preempt_disable_ip(int preempt_offset, unsigned long ip)
-+{
-+ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT))
-+ return;
-+
-+ if (preempt_count() == preempt_offset)
-+ return;
-+
-+ pr_err("Preemption disabled at:");
-+ print_ip_sym(KERN_ERR, ip);
-+}
-+
-+static inline bool resched_offsets_ok(unsigned int offsets)
-+{
-+ unsigned int nested = preempt_count();
-+
-+ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT;
-+
-+ return nested == offsets;
-+}
-+
-+void __might_resched(const char *file, int line, unsigned int offsets)
-+{
-+ /* Ratelimiting timestamp: */
-+ static unsigned long prev_jiffy;
-+
-+ unsigned long preempt_disable_ip;
-+
-+ /* WARN_ON_ONCE() by default, no rate limit required: */
-+ rcu_sleep_check();
-+
-+ if ((resched_offsets_ok(offsets) && !irqs_disabled() &&
-+ !is_idle_task(current) && !current->non_block_count) ||
-+ system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
-+ oops_in_progress)
-+ return;
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ /* Save this before calling printk(), since that will clobber it: */
-+ preempt_disable_ip = get_preempt_disable_ip(current);
-+
-+ pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
-+ file, line);
-+ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(), current->non_block_count,
-+ current->pid, current->comm);
-+ pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
-+ offsets & MIGHT_RESCHED_PREEMPT_MASK);
-+
-+ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
-+ pr_err("RCU nest depth: %d, expected: %u\n",
-+ rcu_preempt_depth(), offsets >> MIGHT_RESCHED_RCU_SHIFT);
-+ }
-+
-+ if (task_stack_end_corrupted(current))
-+ pr_emerg("Thread overran stack, or stack corrupted\n");
-+
-+ debug_show_held_locks(current);
-+ if (irqs_disabled())
-+ print_irqtrace_events(current);
-+
-+ print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
-+ preempt_disable_ip);
-+
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL(__might_resched);
-+
-+void __cant_sleep(const char *file, int line, int preempt_offset)
-+{
-+ static unsigned long prev_jiffy;
-+
-+ if (irqs_disabled())
-+ return;
-+
-+ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
-+ return;
-+
-+ if (preempt_count() > preempt_offset)
-+ return;
-+
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ printk(KERN_ERR "BUG: assuming atomic context at %s:%d\n", file, line);
-+ printk(KERN_ERR "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(),
-+ current->pid, current->comm);
-+
-+ debug_show_held_locks(current);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL_GPL(__cant_sleep);
-+
-+#ifdef CONFIG_SMP
-+void __cant_migrate(const char *file, int line)
-+{
-+ static unsigned long prev_jiffy;
-+
-+ if (irqs_disabled())
-+ return;
-+
-+ if (is_migration_disabled(current))
-+ return;
-+
-+ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
-+ return;
-+
-+ if (preempt_count() > 0)
-+ return;
-+
-+ if (current->migration_flags & MDF_FORCE_ENABLED)
-+ return;
-+
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line);
-+ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(), is_migration_disabled(current),
-+ current->pid, current->comm);
-+
-+ debug_show_held_locks(current);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL_GPL(__cant_migrate);
-+#endif
-+#endif
-+
-+#ifdef CONFIG_MAGIC_SYSRQ
-+void normalize_rt_tasks(void)
-+{
-+ struct task_struct *g, *p;
-+ struct sched_attr attr = {
-+ .sched_policy = SCHED_NORMAL,
-+ };
-+
-+ read_lock(&tasklist_lock);
-+ for_each_process_thread(g, p) {
-+ /*
-+ * Only normalize user tasks:
-+ */
-+ if (p->flags & PF_KTHREAD)
-+ continue;
-+
-+ schedstat_set(p->stats.wait_start, 0);
-+ schedstat_set(p->stats.sleep_start, 0);
-+ schedstat_set(p->stats.block_start, 0);
-+
-+ if (!rt_task(p)) {
-+ /*
-+ * Renice negative nice level userspace
-+ * tasks back to 0:
-+ */
-+ if (task_nice(p) < 0)
-+ set_user_nice(p, 0);
-+ continue;
-+ }
-+
-+ __sched_setscheduler(p, &attr, false, false);
-+ }
-+ read_unlock(&tasklist_lock);
-+}
-+#endif /* CONFIG_MAGIC_SYSRQ */
-+
-+#if defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB)
-+/*
-+ * These functions are only useful for the IA64 MCA handling, or kdb.
-+ *
-+ * They can only be called when the whole system has been
-+ * stopped - every CPU needs to be quiescent, and no scheduling
-+ * activity can take place. Using them for anything else would
-+ * be a serious bug, and as a result, they aren't even visible
-+ * under any other configuration.
-+ */
-+
-+/**
-+ * curr_task - return the current task for a given CPU.
-+ * @cpu: the processor in question.
-+ *
-+ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
-+ *
-+ * Return: The current task for @cpu.
-+ */
-+struct task_struct *curr_task(int cpu)
-+{
-+ return cpu_curr(cpu);
-+}
-+
-+#endif /* defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB) */
-+
-+#ifdef CONFIG_IA64
-+/**
-+ * ia64_set_curr_task - set the current task for a given CPU.
-+ * @cpu: the processor in question.
-+ * @p: the task pointer to set.
-+ *
-+ * Description: This function must only be used when non-maskable interrupts
-+ * are serviced on a separate stack. It allows the architecture to switch the
-+ * notion of the current task on a CPU in a non-blocking manner. This function
-+ * must be called with all CPU's synchronised, and interrupts disabled, the
-+ * and caller must save the original value of the current task (see
-+ * curr_task() above) and restore that value before reenabling interrupts and
-+ * re-starting the system.
-+ *
-+ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
-+ */
-+void ia64_set_curr_task(int cpu, struct task_struct *p)
-+{
-+ cpu_curr(cpu) = p;
-+}
-+
-+#endif
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+static void sched_free_group(struct task_group *tg)
-+{
-+ kmem_cache_free(task_group_cache, tg);
-+}
-+
-+static void sched_free_group_rcu(struct rcu_head *rhp)
-+{
-+ sched_free_group(container_of(rhp, struct task_group, rcu));
-+}
-+
-+static void sched_unregister_group(struct task_group *tg)
-+{
-+ /*
-+ * We have to wait for yet another RCU grace period to expire, as
-+ * print_cfs_stats() might run concurrently.
-+ */
-+ call_rcu(&tg->rcu, sched_free_group_rcu);
-+}
-+
-+/* allocate runqueue etc for a new task group */
-+struct task_group *sched_create_group(struct task_group *parent)
-+{
-+ struct task_group *tg;
-+
-+ tg = kmem_cache_alloc(task_group_cache, GFP_KERNEL | __GFP_ZERO);
-+ if (!tg)
-+ return ERR_PTR(-ENOMEM);
-+
-+ return tg;
-+}
-+
-+void sched_online_group(struct task_group *tg, struct task_group *parent)
-+{
-+}
-+
-+/* rcu callback to free various structures associated with a task group */
-+static void sched_unregister_group_rcu(struct rcu_head *rhp)
-+{
-+ /* Now it should be safe to free those cfs_rqs: */
-+ sched_unregister_group(container_of(rhp, struct task_group, rcu));
-+}
-+
-+void sched_destroy_group(struct task_group *tg)
-+{
-+	/* Wait for possible concurrent references to cfs_rqs to complete: */
-+ call_rcu(&tg->rcu, sched_unregister_group_rcu);
-+}
-+
-+void sched_release_group(struct task_group *tg)
-+{
-+}
-+
-+static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
-+{
-+ return css ? container_of(css, struct task_group, css) : NULL;
-+}
-+
-+static struct cgroup_subsys_state *
-+cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
-+{
-+ struct task_group *parent = css_tg(parent_css);
-+ struct task_group *tg;
-+
-+ if (!parent) {
-+ /* This is early initialization for the top cgroup */
-+ return &root_task_group.css;
-+ }
-+
-+ tg = sched_create_group(parent);
-+ if (IS_ERR(tg))
-+ return ERR_PTR(-ENOMEM);
-+ return &tg->css;
-+}
-+
-+/* Expose task group only after completing cgroup initialization */
-+static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+ struct task_group *parent = css_tg(css->parent);
-+
-+ if (parent)
-+ sched_online_group(tg, parent);
-+ return 0;
-+}
-+
-+static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ sched_release_group(tg);
-+}
-+
-+static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ /*
-+ * Relies on the RCU grace period between css_released() and this.
-+ */
-+ sched_unregister_group(tg);
-+}
-+
-+#ifdef CONFIG_RT_GROUP_SCHED
-+static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
-+{
-+ return 0;
-+}
-+#endif
-+
-+static void cpu_cgroup_attach(struct cgroup_taskset *tset)
-+{
-+}
-+
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+static DEFINE_MUTEX(shares_mutex);
-+
-+int sched_group_set_shares(struct task_group *tg, unsigned long shares)
-+{
-+ /*
-+ * We can't change the weight of the root cgroup.
-+ */
-+ if (&root_task_group == tg)
-+ return -EINVAL;
-+
-+ shares = clamp(shares, scale_load(MIN_SHARES), scale_load(MAX_SHARES));
-+
-+ mutex_lock(&shares_mutex);
-+ if (tg->shares == shares)
-+ goto done;
-+
-+ tg->shares = shares;
-+done:
-+ mutex_unlock(&shares_mutex);
-+ return 0;
-+}
-+
-+static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
-+ struct cftype *cftype, u64 shareval)
-+{
-+ if (shareval > scale_load_down(ULONG_MAX))
-+ shareval = MAX_SHARES;
-+ return sched_group_set_shares(css_tg(css), scale_load(shareval));
-+}
-+
-+static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
-+ struct cftype *cft)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ return (u64) scale_load_down(tg->shares);
-+}
-+#endif
-+
-+static struct cftype cpu_legacy_files[] = {
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+ {
-+ .name = "shares",
-+ .read_u64 = cpu_shares_read_u64,
-+ .write_u64 = cpu_shares_write_u64,
-+ },
-+#endif
-+ { } /* Terminate */
-+};
-+
-+
-+static struct cftype cpu_files[] = {
-+ { } /* terminate */
-+};
-+
-+static int cpu_extra_stat_show(struct seq_file *sf,
-+ struct cgroup_subsys_state *css)
-+{
-+ return 0;
-+}
-+
-+struct cgroup_subsys cpu_cgrp_subsys = {
-+ .css_alloc = cpu_cgroup_css_alloc,
-+ .css_online = cpu_cgroup_css_online,
-+ .css_released = cpu_cgroup_css_released,
-+ .css_free = cpu_cgroup_css_free,
-+ .css_extra_stat_show = cpu_extra_stat_show,
-+#ifdef CONFIG_RT_GROUP_SCHED
-+ .can_attach = cpu_cgroup_can_attach,
-+#endif
-+ .attach = cpu_cgroup_attach,
-+	.legacy_cftypes	= cpu_legacy_files,
-+ .dfl_cftypes = cpu_files,
-+ .early_init = true,
-+ .threaded = true,
-+};
-+#endif /* CONFIG_CGROUP_SCHED */
-+
-+#undef CREATE_TRACE_POINTS
-+
-+#ifdef CONFIG_SCHED_MM_CID
-+
-+/*
-+ * @cid_lock: Guarantee forward-progress of cid allocation.
-+ *
-+ * Concurrency ID allocation within a bitmap is mostly lock-free. The cid_lock
-+ * is only used when contention is detected by the lock-free allocation so
-+ * forward progress can be guaranteed.
-+ */
-+DEFINE_RAW_SPINLOCK(cid_lock);
-+
-+/*
-+ * @use_cid_lock: Select cid allocation behavior: lock-free vs spinlock.
-+ *
-+ * When @use_cid_lock is 0, the cid allocation is lock-free. When contention is
-+ * detected, it is set to 1 to ensure that all newly coming allocations are
-+ * serialized by @cid_lock until the allocation which detected contention
-+ * completes and sets @use_cid_lock back to 0. This guarantees forward progress
-+ * of a cid allocation.
-+ */
-+int use_cid_lock;
-+
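
The fallback described above can be condensed into the following shape (an illustrative sketch only: try_alloc() is a hypothetical stand-in for the patch's __mm_cid_try_get(), and the real implementation is __mm_cid_get() in alt_sched.h):

/*
 * Sketch of the forward-progress scheme: attempt the lock-free fast
 * path first; once contention is detected, force all allocators
 * through cid_lock until the contended allocation completes.
 * try_alloc() is hypothetical and stands in for __mm_cid_try_get().
 */
static int sketch_cid_alloc(void)
{
	int cid;

	if (!READ_ONCE(use_cid_lock)) {
		cid = try_alloc();		/* lock-free fast path */
		if (cid >= 0)
			return cid;
		raw_spin_lock(&cid_lock);	/* contention detected */
	} else {
		raw_spin_lock(&cid_lock);	/* already serialized */
		cid = try_alloc();
		if (cid >= 0)
			goto unlock;
	}

	WRITE_ONCE(use_cid_lock, 1);	/* serialize newcomers for forward progress */
	do {
		cid = try_alloc();	/* guaranteed to eventually succeed */
		cpu_relax();
	} while (cid < 0);
	WRITE_ONCE(use_cid_lock, 0);
unlock:
	raw_spin_unlock(&cid_lock);
	return cid;
}
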
-+/*
-+ * mm_cid remote-clear implements a lock-free algorithm to clear per-mm/cpu cid
-+ * concurrently with respect to the execution of the source runqueue context
-+ * switch.
-+ *
-+ * There is one basic property we want to guarantee here:
-+ *
-+ * (1) Remote-clear should _never_ mark a per-cpu cid UNSET when it is actively
-+ * used by a task. That would lead to concurrent allocation of the cid and
-+ * userspace corruption.
-+ *
-+ * Provide this guarantee by introducing a Dekker memory ordering to guarantee
-+ * that a pair of loads observe at least one of a pair of stores, which can be
-+ * shown as:
-+ *
-+ * X = Y = 0
-+ *
-+ * w[X]=1 w[Y]=1
-+ * MB MB
-+ * r[Y]=y r[X]=x
-+ *
-+ * Which guarantees that x==0 && y==0 is impossible. But rather than using
-+ * values 0 and 1, this algorithm cares about specific state transitions of the
-+ * runqueue current task (as updated by the scheduler context switch), and the
-+ * per-mm/cpu cid value.
-+ *
-+ * Let's introduce task (Y) which has task->mm == mm and task (N) which has
-+ * task->mm != mm for the rest of the discussion. There are two scheduler state
-+ * transitions on context switch we care about:
-+ *
-+ * (TSA) Store to rq->curr with transition from (N) to (Y)
-+ *
-+ * (TSB) Store to rq->curr with transition from (Y) to (N)
-+ *
-+ * On the remote-clear side, there is one transition we care about:
-+ *
-+ * (TMA) cmpxchg to *pcpu_cid to set the LAZY flag
-+ *
-+ * There is also a transition to UNSET state which can be performed from all
-+ * sides (scheduler, remote-clear). It is always performed with a cmpxchg which
-+ * guarantees that only a single thread will succeed:
-+ *
-+ * (TMB) cmpxchg to *pcpu_cid to mark UNSET
-+ *
-+ * Just to be clear, what we do _not_ want to happen is a transition to UNSET
-+ * when a thread is actively using the cid (property (1)).
-+ *
-+ * Let's look at the relevant combinations of TSA/TSB and TMA transitions.
-+ *
-+ * Scenario A) (TSA)+(TMA) (from next task perspective)
-+ *
-+ * CPU0 CPU1
-+ *
-+ * Context switch CS-1 Remote-clear
-+ * - store to rq->curr: (N)->(Y) (TSA) - cmpxchg to *pcpu_id to LAZY (TMA)
-+ * (implied barrier after cmpxchg)
-+ * - switch_mm_cid()
-+ * - memory barrier (see switch_mm_cid()
-+ * comment explaining how this barrier
-+ * is combined with other scheduler
-+ * barriers)
-+ * - mm_cid_get (next)
-+ * - READ_ONCE(*pcpu_cid) - rcu_dereference(src_rq->curr)
-+ *
-+ * This Dekker ensures that either task (Y) is observed by the
-+ * rcu_dereference() or the LAZY flag is observed by READ_ONCE(), or both are
-+ * observed.
-+ *
-+ * If task (Y) store is observed by rcu_dereference(), it means that there is
-+ * still an active task on the cpu. Remote-clear will therefore not transition
-+ * to UNSET, which fulfills property (1).
-+ *
-+ * If task (Y) is not observed, but the lazy flag is observed by READ_ONCE(),
-+ * it will move its state to UNSET, which clears the percpu cid perhaps
-+ * uselessly (which is not an issue for correctness). Because task (Y) is not
-+ * observed, CPU1 can move ahead to set the state to UNSET. Because moving
-+ * state to UNSET is done with a cmpxchg expecting that the old state has the
-+ * LAZY flag set, only one thread will successfully UNSET.
-+ *
-+ * If both states (LAZY flag and task (Y)) are observed, the thread on CPU0
-+ * will observe the LAZY flag and transition to UNSET (perhaps uselessly), and
-+ * CPU1 will observe task (Y) and do nothing more, which is fine.
-+ *
-+ * What we are effectively preventing with this Dekker is a scenario where
-+ * neither LAZY flag nor store (Y) are observed, which would fail property (1)
-+ * because this would UNSET a cid which is actively used.
-+ */
-+
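
For illustration, a minimal standalone rendering of the store/barrier/load pairing sketched above; the variables X and Y and the two functions are hypothetical stand-ins for the rq->curr store and the per-mm/cpu cid cmpxchg, and the code assumes the kernel's WRITE_ONCE()/READ_ONCE()/smp_mb() primitives:

/*
 * Dekker-style pairing: each side stores its flag, issues a full
 * barrier, then loads the other side's flag.  At least one of the
 * two loads must observe the other side's store, so the outcome
 * x == 0 && y == 0 is impossible.
 */
static int X, Y;	/* both initially 0 */

static void side_a(int *y_seen)		/* models (TSA): store to rq->curr */
{
	WRITE_ONCE(X, 1);		/* w[X]=1 */
	smp_mb();			/* MB     */
	*y_seen = READ_ONCE(Y);		/* r[Y]=y */
}

static void side_b(int *x_seen)		/* models (TMA): cmpxchg setting LAZY */
{
	WRITE_ONCE(Y, 1);		/* w[Y]=1 */
	smp_mb();			/* MB     */
	*x_seen = READ_ONCE(X);		/* r[X]=x */
}
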
-+void sched_mm_cid_migrate_from(struct task_struct *t)
-+{
-+ t->migrate_from_cpu = task_cpu(t);
-+}
-+
-+static
-+int __sched_mm_cid_migrate_from_fetch_cid(struct rq *src_rq,
-+ struct task_struct *t,
-+ struct mm_cid *src_pcpu_cid)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct task_struct *src_task;
-+ int src_cid, last_mm_cid;
-+
-+ if (!mm)
-+ return -1;
-+
-+ last_mm_cid = t->last_mm_cid;
-+ /*
-+ * If the migrated task has no last cid, or if the current
-+ * task on src rq uses the cid, it means the source cid does not need
-+ * to be moved to the destination cpu.
-+ */
-+ if (last_mm_cid == -1)
-+ return -1;
-+ src_cid = READ_ONCE(src_pcpu_cid->cid);
-+ if (!mm_cid_is_valid(src_cid) || last_mm_cid != src_cid)
-+ return -1;
-+
-+ /*
-+ * If we observe an active task using the mm on this rq, it means we
-+ * are not the last task to be migrated from this cpu for this mm, so
-+ * there is no need to move src_cid to the destination cpu.
-+ */
-+ rcu_read_lock();
-+ src_task = rcu_dereference(src_rq->curr);
-+ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
-+ rcu_read_unlock();
-+ t->last_mm_cid = -1;
-+ return -1;
-+ }
-+ rcu_read_unlock();
-+
-+ return src_cid;
-+}
-+
-+static
-+int __sched_mm_cid_migrate_from_try_steal_cid(struct rq *src_rq,
-+ struct task_struct *t,
-+ struct mm_cid *src_pcpu_cid,
-+ int src_cid)
-+{
-+ struct task_struct *src_task;
-+ struct mm_struct *mm = t->mm;
-+ int lazy_cid;
-+
-+ if (src_cid == -1)
-+ return -1;
-+
-+ /*
-+ * Attempt to clear the source cpu cid to move it to the destination
-+ * cpu.
-+ */
-+ lazy_cid = mm_cid_set_lazy_put(src_cid);
-+ if (!try_cmpxchg(&src_pcpu_cid->cid, &src_cid, lazy_cid))
-+ return -1;
-+
-+ /*
-+ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
-+ * rq->curr->mm matches the scheduler barrier in context_switch()
-+ * between store to rq->curr and load of prev and next task's
-+ * per-mm/cpu cid.
-+ *
-+ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
-+ * rq->curr->mm_cid_active matches the barrier in
-+ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
-+ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
-+ * load of per-mm/cpu cid.
-+ */
-+
-+ /*
-+ * If we observe an active task using the mm on this rq after setting
-+ * the lazy-put flag, this task will be responsible for transitioning
-+ * from lazy-put flag set to MM_CID_UNSET.
-+ */
-+ rcu_read_lock();
-+ src_task = rcu_dereference(src_rq->curr);
-+ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
-+ rcu_read_unlock();
-+ /*
-+ * We observed an active task for this mm, there is therefore
-+ * no point in moving this cid to the destination cpu.
-+ */
-+ t->last_mm_cid = -1;
-+ return -1;
-+ }
-+ rcu_read_unlock();
-+
-+ /*
-+ * The src_cid is unused, so it can be unset.
-+ */
-+ if (!try_cmpxchg(&src_pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
-+ return -1;
-+ return src_cid;
-+}
-+
-+/*
-+ * Migration to dst cpu. Called with dst_rq lock held.
-+ * Interrupts are disabled, which keeps the window of cid ownership without the
-+ * source rq lock held small.
-+ */
-+void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu)
-+{
-+ struct mm_cid *src_pcpu_cid, *dst_pcpu_cid;
-+ struct mm_struct *mm = t->mm;
-+ int src_cid, dst_cid;
-+ struct rq *src_rq;
-+
-+ lockdep_assert_rq_held(dst_rq);
-+
-+ if (!mm)
-+ return;
-+ if (src_cpu == -1) {
-+ t->last_mm_cid = -1;
-+ return;
-+ }
-+ /*
-+ * Move the src cid if the dst cid is unset. This keeps id
-+ * allocation closest to 0 in cases where few threads migrate around
-+ * many cpus.
-+ *
-+	 * If the destination cid is already set, we may have to just clear
-+	 * the src cid to ensure compactness in frequent migration
-+	 * scenarios.
-+ *
-+ * It is not useful to clear the src cid when the number of threads is
-+ * greater or equal to the number of allowed cpus, because user-space
-+ * can expect that the number of allowed cids can reach the number of
-+ * allowed cpus.
-+ */
-+ dst_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(dst_rq));
-+ dst_cid = READ_ONCE(dst_pcpu_cid->cid);
-+ if (!mm_cid_is_unset(dst_cid) &&
-+ atomic_read(&mm->mm_users) >= t->nr_cpus_allowed)
-+ return;
-+ src_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, src_cpu);
-+ src_rq = cpu_rq(src_cpu);
-+ src_cid = __sched_mm_cid_migrate_from_fetch_cid(src_rq, t, src_pcpu_cid);
-+ if (src_cid == -1)
-+ return;
-+ src_cid = __sched_mm_cid_migrate_from_try_steal_cid(src_rq, t, src_pcpu_cid,
-+ src_cid);
-+ if (src_cid == -1)
-+ return;
-+ if (!mm_cid_is_unset(dst_cid)) {
-+ __mm_cid_put(mm, src_cid);
-+ return;
-+ }
-+ /* Move src_cid to dst cpu. */
-+ mm_cid_snapshot_time(dst_rq, mm);
-+ WRITE_ONCE(dst_pcpu_cid->cid, src_cid);
-+}
-+
-+static void sched_mm_cid_remote_clear(struct mm_struct *mm, struct mm_cid *pcpu_cid,
-+ int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ struct task_struct *t;
-+ unsigned long flags;
-+ int cid, lazy_cid;
-+
-+ cid = READ_ONCE(pcpu_cid->cid);
-+ if (!mm_cid_is_valid(cid))
-+ return;
-+
-+ /*
-+ * Clear the cpu cid if it is set to keep cid allocation compact. If
-+ * there happens to be other tasks left on the source cpu using this
-+ * mm, the next task using this mm will reallocate its cid on context
-+ * switch.
-+ */
-+ lazy_cid = mm_cid_set_lazy_put(cid);
-+ if (!try_cmpxchg(&pcpu_cid->cid, &cid, lazy_cid))
-+ return;
-+
-+ /*
-+ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
-+ * rq->curr->mm matches the scheduler barrier in context_switch()
-+ * between store to rq->curr and load of prev and next task's
-+ * per-mm/cpu cid.
-+ *
-+ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
-+ * rq->curr->mm_cid_active matches the barrier in
-+ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
-+ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
-+ * load of per-mm/cpu cid.
-+ */
-+
-+ /*
-+ * If we observe an active task using the mm on this rq after setting
-+ * the lazy-put flag, that task will be responsible for transitioning
-+ * from lazy-put flag set to MM_CID_UNSET.
-+ */
-+ rcu_read_lock();
-+ t = rcu_dereference(rq->curr);
-+ if (READ_ONCE(t->mm_cid_active) && t->mm == mm) {
-+ rcu_read_unlock();
-+ return;
-+ }
-+ rcu_read_unlock();
-+
-+ /*
-+ * The cid is unused, so it can be unset.
-+ * Disable interrupts to keep the window of cid ownership without rq
-+ * lock small.
-+ */
-+ local_irq_save(flags);
-+ if (try_cmpxchg(&pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
-+ __mm_cid_put(mm, cid);
-+ local_irq_restore(flags);
-+}
-+
-+static void sched_mm_cid_remote_clear_old(struct mm_struct *mm, int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ struct mm_cid *pcpu_cid;
-+ struct task_struct *curr;
-+ u64 rq_clock;
-+
-+ /*
-+ * rq->clock load is racy on 32-bit but one spurious clear once in a
-+ * while is irrelevant.
-+ */
-+ rq_clock = READ_ONCE(rq->clock);
-+ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
-+
-+ /*
-+ * In order to take care of infrequently scheduled tasks, bump the time
-+ * snapshot associated with this cid if an active task using the mm is
-+ * observed on this rq.
-+ */
-+ rcu_read_lock();
-+ curr = rcu_dereference(rq->curr);
-+ if (READ_ONCE(curr->mm_cid_active) && curr->mm == mm) {
-+ WRITE_ONCE(pcpu_cid->time, rq_clock);
-+ rcu_read_unlock();
-+ return;
-+ }
-+ rcu_read_unlock();
-+
-+ if (rq_clock < pcpu_cid->time + SCHED_MM_CID_PERIOD_NS)
-+ return;
-+ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
-+}
-+
-+static void sched_mm_cid_remote_clear_weight(struct mm_struct *mm, int cpu,
-+ int weight)
-+{
-+ struct mm_cid *pcpu_cid;
-+ int cid;
-+
-+ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
-+ cid = READ_ONCE(pcpu_cid->cid);
-+ if (!mm_cid_is_valid(cid) || cid < weight)
-+ return;
-+ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
-+}
-+
-+static void task_mm_cid_work(struct callback_head *work)
-+{
-+ unsigned long now = jiffies, old_scan, next_scan;
-+ struct task_struct *t = current;
-+ struct cpumask *cidmask;
-+ struct mm_struct *mm;
-+ int weight, cpu;
-+
-+ SCHED_WARN_ON(t != container_of(work, struct task_struct, cid_work));
-+
-+ work->next = work; /* Prevent double-add */
-+ if (t->flags & PF_EXITING)
-+ return;
-+ mm = t->mm;
-+ if (!mm)
-+ return;
-+ old_scan = READ_ONCE(mm->mm_cid_next_scan);
-+ next_scan = now + msecs_to_jiffies(MM_CID_SCAN_DELAY);
-+ if (!old_scan) {
-+ unsigned long res;
-+
-+ res = cmpxchg(&mm->mm_cid_next_scan, old_scan, next_scan);
-+ if (res != old_scan)
-+ old_scan = res;
-+ else
-+ old_scan = next_scan;
-+ }
-+ if (time_before(now, old_scan))
-+ return;
-+ if (!try_cmpxchg(&mm->mm_cid_next_scan, &old_scan, next_scan))
-+ return;
-+ cidmask = mm_cidmask(mm);
-+ /* Clear cids that were not recently used. */
-+ for_each_possible_cpu(cpu)
-+ sched_mm_cid_remote_clear_old(mm, cpu);
-+ weight = cpumask_weight(cidmask);
-+ /*
-+ * Clear cids that are greater or equal to the cidmask weight to
-+ * recompact it.
-+ */
-+ for_each_possible_cpu(cpu)
-+ sched_mm_cid_remote_clear_weight(mm, cpu, weight);
-+}
-+
-+void init_sched_mm_cid(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ int mm_users = 0;
-+
-+ if (mm) {
-+ mm_users = atomic_read(&mm->mm_users);
-+ if (mm_users == 1)
-+ mm->mm_cid_next_scan = jiffies + msecs_to_jiffies(MM_CID_SCAN_DELAY);
-+ }
-+ t->cid_work.next = &t->cid_work; /* Protect against double add */
-+ init_task_work(&t->cid_work, task_mm_cid_work);
-+}
-+
-+void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
-+{
-+ struct callback_head *work = &curr->cid_work;
-+ unsigned long now = jiffies;
-+
-+ if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) ||
-+ work->next != work)
-+ return;
-+ if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan)))
-+ return;
-+ task_work_add(curr, work, TWA_RESUME);
-+}
-+
-+void sched_mm_cid_exit_signals(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct rq_flags rf;
-+ struct rq *rq;
-+
-+ if (!mm)
-+ return;
-+
-+ preempt_disable();
-+ rq = this_rq();
-+ rq_lock_irqsave(rq, &rf);
-+ preempt_enable_no_resched(); /* holding spinlock */
-+ WRITE_ONCE(t->mm_cid_active, 0);
-+ /*
-+ * Store t->mm_cid_active before loading per-mm/cpu cid.
-+ * Matches barrier in sched_mm_cid_remote_clear_old().
-+ */
-+ smp_mb();
-+ mm_cid_put(mm);
-+ t->last_mm_cid = t->mm_cid = -1;
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+void sched_mm_cid_before_execve(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct rq_flags rf;
-+ struct rq *rq;
-+
-+ if (!mm)
-+ return;
-+
-+ preempt_disable();
-+ rq = this_rq();
-+ rq_lock_irqsave(rq, &rf);
-+ preempt_enable_no_resched(); /* holding spinlock */
-+ WRITE_ONCE(t->mm_cid_active, 0);
-+ /*
-+ * Store t->mm_cid_active before loading per-mm/cpu cid.
-+ * Matches barrier in sched_mm_cid_remote_clear_old().
-+ */
-+ smp_mb();
-+ mm_cid_put(mm);
-+ t->last_mm_cid = t->mm_cid = -1;
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+void sched_mm_cid_after_execve(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct rq_flags rf;
-+ struct rq *rq;
-+
-+ if (!mm)
-+ return;
-+
-+ preempt_disable();
-+ rq = this_rq();
-+ rq_lock_irqsave(rq, &rf);
-+ preempt_enable_no_resched(); /* holding spinlock */
-+ WRITE_ONCE(t->mm_cid_active, 1);
-+ /*
-+ * Store t->mm_cid_active before loading per-mm/cpu cid.
-+ * Matches barrier in sched_mm_cid_remote_clear_old().
-+ */
-+ smp_mb();
-+ t->last_mm_cid = t->mm_cid = mm_cid_get(rq, mm);
-+ rq_unlock_irqrestore(rq, &rf);
-+ rseq_set_notify_resume(t);
-+}
-+
-+void sched_mm_cid_fork(struct task_struct *t)
-+{
-+ WARN_ON_ONCE(!t->mm || t->mm_cid != -1);
-+ t->mm_cid_active = 1;
-+}
-+#endif
-diff --git a/kernel/sched/alt_debug.c b/kernel/sched/alt_debug.c
-new file mode 100644
-index 000000000000..1212a031700e
---- /dev/null
-+++ b/kernel/sched/alt_debug.c
-@@ -0,0 +1,31 @@
-+/*
-+ * kernel/sched/alt_debug.c
-+ *
-+ * Print the alt scheduler debugging details
-+ *
-+ * Author: Alfred Chen
-+ * Date : 2020
-+ */
-+#include "sched.h"
-+
-+/*
-+ * This allows printing both to /proc/sched_debug and
-+ * to the console
-+ */
-+#define SEQ_printf(m, x...) \
-+ do { \
-+ if (m) \
-+ seq_printf(m, x); \
-+ else \
-+ pr_cont(x); \
-+ } while (0)
-+
-+void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
-+ struct seq_file *m)
-+{
-+ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, task_pid_nr_ns(p, ns),
-+ get_nr_threads(p));
-+}
-+
-+void proc_sched_set_task(struct task_struct *p)
-+{}
-diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
-new file mode 100644
-index 000000000000..5494f27cdb04
---- /dev/null
-+++ b/kernel/sched/alt_sched.h
-@@ -0,0 +1,906 @@
-+#ifndef ALT_SCHED_H
-+#define ALT_SCHED_H
-+
-+#include <linux/context_tracking.h>
-+#include <linux/profile.h>
-+#include <linux/stop_machine.h>
-+#include <linux/syscalls.h>
-+#include <linux/tick.h>
-+
-+#include <trace/events/power.h>
-+#include <trace/events/sched.h>
-+
-+#include "../workqueue_internal.h"
-+
-+#include "cpupri.h"
-+
-+#ifdef CONFIG_SCHED_BMQ
-+/* bits:
-+ * RT(0-99), (Low prio adj range, nice width, high prio adj range) / 2, cpu idle task */
-+#define SCHED_LEVELS (MAX_RT_PRIO + NICE_WIDTH / 2 + MAX_PRIORITY_ADJ + 1)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+/* bits: RT(0-24), reserved(25-31), SCHED_NORMAL_PRIO_NUM(32), cpu idle task(1) */
-+#define SCHED_LEVELS (64 + 1)
-+#endif /* CONFIG_SCHED_PDS */
-+
-+#define IDLE_TASK_SCHED_PRIO (SCHED_LEVELS - 1)
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+# define SCHED_WARN_ON(x) WARN_ONCE(x, #x)
-+extern void resched_latency_warn(int cpu, u64 latency);
-+#else
-+# define SCHED_WARN_ON(x) ({ (void)(x), 0; })
-+static inline void resched_latency_warn(int cpu, u64 latency) {}
-+#endif
-+
-+/*
-+ * Increase resolution of nice-level calculations for 64-bit architectures.
-+ * The extra resolution improves shares distribution and load balancing of
-+ * low-weight task groups (eg. nice +19 on an autogroup), deeper taskgroup
-+ * hierarchies, especially on larger systems. This is not a user-visible change
-+ * and does not change the user-interface for setting shares/weights.
-+ *
-+ * We increase resolution only if we have enough bits to allow this increased
-+ * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
-+ * are pretty high and the returns do not justify the increased costs.
-+ *
-+ * Really only required when CONFIG_FAIR_GROUP_SCHED=y is also set, but to
-+ * increase coverage and consistency always enable it on 64-bit platforms.
-+ */
-+#ifdef CONFIG_64BIT
-+# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT + SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load(w) ((w) << SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load_down(w) \
-+({ \
-+ unsigned long __w = (w); \
-+ if (__w) \
-+ __w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT); \
-+ __w; \
-+})
-+#else
-+# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load(w) (w)
-+# define scale_load_down(w) (w)
-+#endif
-+
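
A quick worked example of the fixed-point scaling above, assuming the mainline value SCHED_FIXEDPOINT_SHIFT == 10 (an assumption; the constant is not defined in this hunk):

/*
 * Illustrative arithmetic only, with SCHED_FIXEDPOINT_SHIFT assumed
 * to be 10:
 *
 *   64-bit: scale_load(1024)         == 1024 << 10 == 1048576
 *           scale_load_down(1048576) == max(2UL, 1048576 >> 10) == 1024
 *           NICE_0_LOAD_SHIFT        == 10 + 10 == 20
 *
 *   32-bit: scale_load() and scale_load_down() are identity macros,
 *           so the nice-0 weight stays at 1024 with no extra resolution.
 */
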
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+#define ROOT_TASK_GROUP_LOAD NICE_0_LOAD
-+
-+/*
-+ * A weight of 0 or 1 can cause arithmetic problems.
-+ * The weight of a cfs_rq is the sum of the weights of the entities
-+ * queued on this cfs_rq, so the weight of an entity should not be
-+ * too large, nor should the shares value of a task group.
-+ * (The default weight is 1024 - so there's no practical
-+ * limitation from this.)
-+ */
-+#define MIN_SHARES (1UL << 1)
-+#define MAX_SHARES (1UL << 18)
-+#endif
-+
-+/*
-+ * Tunables that become constants when CONFIG_SCHED_DEBUG is off:
-+ */
-+#ifdef CONFIG_SCHED_DEBUG
-+# define const_debug __read_mostly
-+#else
-+# define const_debug const
-+#endif
-+
-+/* task_struct::on_rq states: */
-+#define TASK_ON_RQ_QUEUED 1
-+#define TASK_ON_RQ_MIGRATING 2
-+
-+static inline int task_on_rq_queued(struct task_struct *p)
-+{
-+ return p->on_rq == TASK_ON_RQ_QUEUED;
-+}
-+
-+static inline int task_on_rq_migrating(struct task_struct *p)
-+{
-+ return READ_ONCE(p->on_rq) == TASK_ON_RQ_MIGRATING;
-+}
-+
-+/*
-+ * wake flags
-+ */
-+#define WF_SYNC 0x01 /* waker goes to sleep after wakeup */
-+#define WF_FORK 0x02 /* child wakeup after fork */
-+#define WF_MIGRATED 0x04 /* internal use, task got migrated */
-+
-+#define SCHED_QUEUE_BITS (SCHED_LEVELS - 1)
-+
-+struct sched_queue {
-+ DECLARE_BITMAP(bitmap, SCHED_QUEUE_BITS);
-+ struct list_head heads[SCHED_LEVELS];
-+};
-+
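
A sketch of how a bitmap-plus-list-heads queue like this is typically consumed (illustrative only, not the patch's actual dequeue path; the task_struct list member name sq_node is assumed here for the example):

/*
 * Illustrative peek: the first set bit marks the highest-priority
 * level with queued tasks; return the first task on that list.
 * find_first_bit() and list_first_entry() are standard kernel helpers;
 * the sq_node member name is an assumption for this sketch.
 */
static inline struct task_struct *
sketch_sched_queue_peek(struct sched_queue *q)
{
	unsigned long prio = find_first_bit(q->bitmap, SCHED_QUEUE_BITS);

	if (prio >= SCHED_QUEUE_BITS)
		return NULL;		/* no queued task at any tracked level */

	return list_first_entry(&q->heads[prio], struct task_struct, sq_node);
}
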
-+struct rq;
-+struct cpuidle_state;
-+
-+struct balance_callback {
-+ struct balance_callback *next;
-+ void (*func)(struct rq *rq);
-+};
-+
-+/*
-+ * This is the main, per-CPU runqueue data structure.
-+ * This data should only be modified by the local cpu.
-+ */
-+struct rq {
-+ /* runqueue lock: */
-+ raw_spinlock_t lock;
-+
-+ struct task_struct __rcu *curr;
-+ struct task_struct *idle, *stop, *skip;
-+ struct mm_struct *prev_mm;
-+
-+ struct sched_queue queue;
-+#ifdef CONFIG_SCHED_PDS
-+ u64 time_edge;
-+#endif
-+ unsigned long prio;
-+
-+ /* switch count */
-+ u64 nr_switches;
-+
-+ atomic_t nr_iowait;
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+ u64 last_seen_need_resched_ns;
-+ int ticks_without_resched;
-+#endif
-+
-+#ifdef CONFIG_MEMBARRIER
-+ int membarrier_state;
-+#endif
-+
-+#ifdef CONFIG_SMP
-+ int cpu; /* cpu of this runqueue */
-+ bool online;
-+
-+ unsigned int ttwu_pending;
-+ unsigned char nohz_idle_balance;
-+ unsigned char idle_balance;
-+
-+#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
-+ struct sched_avg avg_irq;
-+#endif
-+
-+#ifdef CONFIG_SCHED_SMT
-+ int active_balance;
-+ struct cpu_stop_work active_balance_work;
-+#endif
-+ struct balance_callback *balance_callback;
-+#ifdef CONFIG_HOTPLUG_CPU
-+ struct rcuwait hotplug_wait;
-+#endif
-+ unsigned int nr_pinned;
-+
-+#endif /* CONFIG_SMP */
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+ u64 prev_irq_time;
-+#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
-+#ifdef CONFIG_PARAVIRT
-+ u64 prev_steal_time;
-+#endif /* CONFIG_PARAVIRT */
-+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
-+ u64 prev_steal_time_rq;
-+#endif /* CONFIG_PARAVIRT_TIME_ACCOUNTING */
-+
-+	/* For general cpu load util */
-+ s32 load_history;
-+ u64 load_block;
-+ u64 load_stamp;
-+
-+ /* calc_load related fields */
-+ unsigned long calc_load_update;
-+ long calc_load_active;
-+
-+ u64 clock, last_tick;
-+ u64 last_ts_switch;
-+ u64 clock_task;
-+
-+ unsigned int nr_running;
-+ unsigned long nr_uninterruptible;
-+
-+#ifdef CONFIG_SCHED_HRTICK
-+#ifdef CONFIG_SMP
-+ call_single_data_t hrtick_csd;
-+#endif
-+ struct hrtimer hrtick_timer;
-+ ktime_t hrtick_time;
-+#endif
-+
-+#ifdef CONFIG_SCHEDSTATS
-+
-+ /* latency stats */
-+ struct sched_info rq_sched_info;
-+ unsigned long long rq_cpu_time;
-+ /* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */
-+
-+ /* sys_sched_yield() stats */
-+ unsigned int yld_count;
-+
-+ /* schedule() stats */
-+ unsigned int sched_switch;
-+ unsigned int sched_count;
-+ unsigned int sched_goidle;
-+
-+ /* try_to_wake_up() stats */
-+ unsigned int ttwu_count;
-+ unsigned int ttwu_local;
-+#endif /* CONFIG_SCHEDSTATS */
-+
-+#ifdef CONFIG_CPU_IDLE
-+ /* Must be inspected within a rcu lock section */
-+ struct cpuidle_state *idle_state;
-+#endif
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+#ifdef CONFIG_SMP
-+ call_single_data_t nohz_csd;
-+#endif
-+ atomic_t nohz_flags;
-+#endif /* CONFIG_NO_HZ_COMMON */
-+
-+ /* Scratch cpumask to be temporarily used under rq_lock */
-+ cpumask_var_t scratch_mask;
-+};
-+
-+extern unsigned long rq_load_util(struct rq *rq, unsigned long max);
-+
-+extern unsigned long calc_load_update;
-+extern atomic_long_t calc_load_tasks;
-+
-+extern void calc_global_load_tick(struct rq *this_rq);
-+extern long calc_load_fold_active(struct rq *this_rq, long adjust);
-+
-+DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
-+#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
-+#define this_rq() this_cpu_ptr(&runqueues)
-+#define task_rq(p) cpu_rq(task_cpu(p))
-+#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
-+#define raw_rq() raw_cpu_ptr(&runqueues)
-+
-+#ifdef CONFIG_SMP
-+#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_SYSCTL)
-+void register_sched_domain_sysctl(void);
-+void unregister_sched_domain_sysctl(void);
-+#else
-+static inline void register_sched_domain_sysctl(void)
-+{
-+}
-+static inline void unregister_sched_domain_sysctl(void)
-+{
-+}
-+#endif
-+
-+extern bool sched_smp_initialized;
-+
-+enum {
-+ ITSELF_LEVEL_SPACE_HOLDER,
-+#ifdef CONFIG_SCHED_SMT
-+ SMT_LEVEL_SPACE_HOLDER,
-+#endif
-+ COREGROUP_LEVEL_SPACE_HOLDER,
-+ CORE_LEVEL_SPACE_HOLDER,
-+ OTHER_LEVEL_SPACE_HOLDER,
-+ NR_CPU_AFFINITY_LEVELS
-+};
-+
-+DECLARE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
-+
-+static inline int
-+__best_mask_cpu(const cpumask_t *cpumask, const cpumask_t *mask)
-+{
-+ int cpu;
-+
-+ while ((cpu = cpumask_any_and(cpumask, mask)) >= nr_cpu_ids)
-+ mask++;
-+
-+ return cpu;
-+}
-+
-+static inline int best_mask_cpu(int cpu, const cpumask_t *mask)
-+{
-+ return __best_mask_cpu(mask, per_cpu(sched_cpu_topo_masks, cpu));
-+}
-+
-+extern void flush_smp_call_function_queue(void);
-+
-+#else /* !CONFIG_SMP */
-+static inline void flush_smp_call_function_queue(void) { }
-+#endif
-+
-+#ifndef arch_scale_freq_tick
-+static __always_inline
-+void arch_scale_freq_tick(void)
-+{
-+}
-+#endif
-+
-+#ifndef arch_scale_freq_capacity
-+static __always_inline
-+unsigned long arch_scale_freq_capacity(int cpu)
-+{
-+ return SCHED_CAPACITY_SCALE;
-+}
-+#endif
-+
-+static inline u64 __rq_clock_broken(struct rq *rq)
-+{
-+ return READ_ONCE(rq->clock);
-+}
-+
-+static inline u64 rq_clock(struct rq *rq)
-+{
-+ /*
-+	 * Relax lockdep_assert_held() checking as in VRQ: calls to
-+	 * sched_info_xxxx() may not hold rq->lock.
-+ * lockdep_assert_held(&rq->lock);
-+ */
-+ return rq->clock;
-+}
-+
-+static inline u64 rq_clock_task(struct rq *rq)
-+{
-+ /*
-+	 * Relax lockdep_assert_held() checking as in VRQ: calls to
-+	 * sched_info_xxxx() may not hold rq->lock.
-+ * lockdep_assert_held(&rq->lock);
-+ */
-+ return rq->clock_task;
-+}
-+
-+/*
-+ * {de,en}queue flags:
-+ *
-+ * DEQUEUE_SLEEP - task is no longer runnable
-+ * ENQUEUE_WAKEUP - task just became runnable
-+ *
-+ */
-+
-+#define DEQUEUE_SLEEP 0x01
-+
-+#define ENQUEUE_WAKEUP 0x01
-+
-+
-+/*
-+ * Below are scheduler APIs used in other kernel code.
-+ * They use the dummy rq_flags.
-+ * TODO: BMQ needs to support these APIs for compatibility with the mainline
-+ * scheduler code.
-+ */
-+struct rq_flags {
-+ unsigned long flags;
-+};
-+
-+struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(rq->lock);
-+
-+struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(p->pi_lock)
-+ __acquires(rq->lock);
-+
-+static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+static inline void
-+task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
-+ __releases(rq->lock)
-+ __releases(p->pi_lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
-+}
-+
-+static inline void
-+rq_lock(struct rq *rq, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ raw_spin_lock(&rq->lock);
-+}
-+
-+static inline void
-+rq_unlock(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+static inline void
-+rq_lock_irq(struct rq *rq, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ raw_spin_lock_irq(&rq->lock);
-+}
-+
-+static inline void
-+rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock_irq(&rq->lock);
-+}
-+
-+static inline struct rq *
-+this_rq_lock_irq(struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ local_irq_disable();
-+ rq = this_rq();
-+ raw_spin_lock(&rq->lock);
-+
-+ return rq;
-+}
-+
-+static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
-+{
-+ return &rq->lock;
-+}
-+
-+static inline raw_spinlock_t *rq_lockp(struct rq *rq)
-+{
-+ return __rq_lockp(rq);
-+}
-+
-+static inline void lockdep_assert_rq_held(struct rq *rq)
-+{
-+ lockdep_assert_held(__rq_lockp(rq));
-+}
-+
-+extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
-+extern void raw_spin_rq_unlock(struct rq *rq);
-+
-+static inline void raw_spin_rq_lock(struct rq *rq)
-+{
-+ raw_spin_rq_lock_nested(rq, 0);
-+}
-+
-+static inline void raw_spin_rq_lock_irq(struct rq *rq)
-+{
-+ local_irq_disable();
-+ raw_spin_rq_lock(rq);
-+}
-+
-+static inline void raw_spin_rq_unlock_irq(struct rq *rq)
-+{
-+ raw_spin_rq_unlock(rq);
-+ local_irq_enable();
-+}
-+
-+static inline int task_current(struct rq *rq, struct task_struct *p)
-+{
-+ return rq->curr == p;
-+}
-+
-+static inline bool task_on_cpu(struct task_struct *p)
-+{
-+ return p->on_cpu;
-+}
-+
-+extern int task_running_nice(struct task_struct *p);
-+
-+extern struct static_key_false sched_schedstats;
-+
-+#ifdef CONFIG_CPU_IDLE
-+static inline void idle_set_state(struct rq *rq,
-+ struct cpuidle_state *idle_state)
-+{
-+ rq->idle_state = idle_state;
-+}
-+
-+static inline struct cpuidle_state *idle_get_state(struct rq *rq)
-+{
-+ WARN_ON(!rcu_read_lock_held());
-+ return rq->idle_state;
-+}
-+#else
-+static inline void idle_set_state(struct rq *rq,
-+ struct cpuidle_state *idle_state)
-+{
-+}
-+
-+static inline struct cpuidle_state *idle_get_state(struct rq *rq)
-+{
-+ return NULL;
-+}
-+#endif
-+
-+static inline int cpu_of(const struct rq *rq)
-+{
-+#ifdef CONFIG_SMP
-+ return rq->cpu;
-+#else
-+ return 0;
-+#endif
-+}
-+
-+#include "stats.h"
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+#define NOHZ_BALANCE_KICK_BIT 0
-+#define NOHZ_STATS_KICK_BIT 1
-+
-+#define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT)
-+#define NOHZ_STATS_KICK BIT(NOHZ_STATS_KICK_BIT)
-+
-+#define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
-+
-+#define nohz_flags(cpu) (&cpu_rq(cpu)->nohz_flags)
-+
-+/* TODO: needed?
-+extern void nohz_balance_exit_idle(struct rq *rq);
-+#else
-+static inline void nohz_balance_exit_idle(struct rq *rq) { }
-+*/
-+#endif
-+
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+struct irqtime {
-+ u64 total;
-+ u64 tick_delta;
-+ u64 irq_start_time;
-+ struct u64_stats_sync sync;
-+};
-+
-+DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
-+
-+/*
-+ * Returns the irqtime minus the softirq time computed by ksoftirqd.
-+ * Otherwise ksoftirqd's sum_exec_runtime would have its own runtime
-+ * subtracted and would never move forward.
-+ */
-+static inline u64 irq_time_read(int cpu)
-+{
-+ struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
-+ unsigned int seq;
-+ u64 total;
-+
-+ do {
-+ seq = __u64_stats_fetch_begin(&irqtime->sync);
-+ total = irqtime->total;
-+ } while (__u64_stats_fetch_retry(&irqtime->sync, seq));
-+
-+ return total;
-+}
-+#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
-+
-+#ifdef CONFIG_CPU_FREQ
-+DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
-+#endif /* CONFIG_CPU_FREQ */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+extern int __init sched_tick_offload_init(void);
-+#else
-+static inline int sched_tick_offload_init(void) { return 0; }
-+#endif
-+
-+#ifdef arch_scale_freq_capacity
-+#ifndef arch_scale_freq_invariant
-+#define arch_scale_freq_invariant() (true)
-+#endif
-+#else /* arch_scale_freq_capacity */
-+#define arch_scale_freq_invariant() (false)
-+#endif
-+
-+extern void schedule_idle(void);
-+
-+#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
-+
-+/*
-+ * !! For sched_setattr_nocheck() (kernel) only !!
-+ *
-+ * This is actually gross. :(
-+ *
-+ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
-+ * tasks, but still be able to sleep. We need this on platforms that cannot
-+ * atomically change clock frequency. Remove once fast switching will be
-+ * available on such platforms.
-+ *
-+ * SUGOV stands for SchedUtil GOVernor.
-+ */
-+#define SCHED_FLAG_SUGOV 0x10000000
-+
-+#ifdef CONFIG_MEMBARRIER
-+/*
-+ * The scheduler provides memory barriers required by membarrier between:
-+ * - prior user-space memory accesses and store to rq->membarrier_state,
-+ * - store to rq->membarrier_state and following user-space memory accesses.
-+ * In the same way it provides those guarantees around store to rq->curr.
-+ */
-+static inline void membarrier_switch_mm(struct rq *rq,
-+ struct mm_struct *prev_mm,
-+ struct mm_struct *next_mm)
-+{
-+ int membarrier_state;
-+
-+ if (prev_mm == next_mm)
-+ return;
-+
-+ membarrier_state = atomic_read(&next_mm->membarrier_state);
-+ if (READ_ONCE(rq->membarrier_state) == membarrier_state)
-+ return;
-+
-+ WRITE_ONCE(rq->membarrier_state, membarrier_state);
-+}
-+#else
-+static inline void membarrier_switch_mm(struct rq *rq,
-+ struct mm_struct *prev_mm,
-+ struct mm_struct *next_mm)
-+{
-+}
-+#endif
-+
-+#ifdef CONFIG_NUMA
-+extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
-+#else
-+static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
-+{
-+ return nr_cpu_ids;
-+}
-+#endif
-+
-+extern void swake_up_all_locked(struct swait_queue_head *q);
-+extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+extern int preempt_dynamic_mode;
-+extern int sched_dynamic_mode(const char *str);
-+extern void sched_dynamic_update(int mode);
-+#endif
-+
-+static inline void nohz_run_idle_balance(int cpu) { }
-+
-+static inline
-+unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
-+ struct task_struct *p)
-+{
-+ return util;
-+}
-+
-+static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
-+
-+#ifdef CONFIG_SCHED_MM_CID
-+
-+#define SCHED_MM_CID_PERIOD_NS (100ULL * 1000000) /* 100ms */
-+#define MM_CID_SCAN_DELAY 100 /* 100ms */
-+
-+extern raw_spinlock_t cid_lock;
-+extern int use_cid_lock;
-+
-+extern void sched_mm_cid_migrate_from(struct task_struct *t);
-+extern void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu);
-+extern void task_tick_mm_cid(struct rq *rq, struct task_struct *curr);
-+extern void init_sched_mm_cid(struct task_struct *t);
-+
-+static inline void __mm_cid_put(struct mm_struct *mm, int cid)
-+{
-+ if (cid < 0)
-+ return;
-+ cpumask_clear_cpu(cid, mm_cidmask(mm));
-+}
-+
-+/*
-+ * The per-mm/cpu cid can have the MM_CID_LAZY_PUT flag set or transition to
-+ * the MM_CID_UNSET state without holding the rq lock, but the rq lock needs to
-+ * be held to transition to other states.
-+ *
-+ * State transitions synchronized with cmpxchg or try_cmpxchg need to be
-+ * consistent across cpus, which prevents use of this_cpu_cmpxchg.
-+ */
-+static inline void mm_cid_put_lazy(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
-+ int cid;
-+
-+ lockdep_assert_irqs_disabled();
-+ cid = __this_cpu_read(pcpu_cid->cid);
-+ if (!mm_cid_is_lazy_put(cid) ||
-+ !try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
-+ return;
-+ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
-+}
-+
-+static inline int mm_cid_pcpu_unset(struct mm_struct *mm)
-+{
-+ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
-+ int cid, res;
-+
-+ lockdep_assert_irqs_disabled();
-+ cid = __this_cpu_read(pcpu_cid->cid);
-+ for (;;) {
-+ if (mm_cid_is_unset(cid))
-+ return MM_CID_UNSET;
-+ /*
-+ * Attempt transition from valid or lazy-put to unset.
-+ */
-+ res = cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, cid, MM_CID_UNSET);
-+ if (res == cid)
-+ break;
-+ cid = res;
-+ }
-+ return cid;
-+}
-+
-+static inline void mm_cid_put(struct mm_struct *mm)
-+{
-+ int cid;
-+
-+ lockdep_assert_irqs_disabled();
-+ cid = mm_cid_pcpu_unset(mm);
-+ if (cid == MM_CID_UNSET)
-+ return;
-+ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
-+}
-+
-+static inline int __mm_cid_try_get(struct mm_struct *mm)
-+{
-+ struct cpumask *cpumask;
-+ int cid;
-+
-+ cpumask = mm_cidmask(mm);
-+ /*
-+ * Retry finding first zero bit if the mask is temporarily
-+ * filled. This only happens during concurrent remote-clear
-+ * which owns a cid without holding a rq lock.
-+ */
-+ for (;;) {
-+ cid = cpumask_first_zero(cpumask);
-+ if (cid < nr_cpu_ids)
-+ break;
-+ cpu_relax();
-+ }
-+ if (cpumask_test_and_set_cpu(cid, cpumask))
-+ return -1;
-+ return cid;
-+}
-+
-+/*
-+ * Save a snapshot of the current runqueue time of this cpu
-+ * with the per-cpu cid value, allowing to estimate how recently it was used.
-+ */
-+static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
-+{
-+ struct mm_cid *pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(rq));
-+
-+ lockdep_assert_rq_held(rq);
-+ WRITE_ONCE(pcpu_cid->time, rq->clock);
-+}
-+
-+static inline int __mm_cid_get(struct rq *rq, struct mm_struct *mm)
-+{
-+ int cid;
-+
-+ /*
-+ * All allocations (even those using the cid_lock) are lock-free. If
-+ * use_cid_lock is set, hold the cid_lock to perform cid allocation to
-+ * guarantee forward progress.
-+ */
-+ if (!READ_ONCE(use_cid_lock)) {
-+ cid = __mm_cid_try_get(mm);
-+ if (cid >= 0)
-+ goto end;
-+ raw_spin_lock(&cid_lock);
-+ } else {
-+ raw_spin_lock(&cid_lock);
-+ cid = __mm_cid_try_get(mm);
-+ if (cid >= 0)
-+ goto unlock;
-+ }
-+
-+ /*
-+ * cid concurrently allocated. Retry while forcing following
-+ * allocations to use the cid_lock to ensure forward progress.
-+ */
-+ WRITE_ONCE(use_cid_lock, 1);
-+ /*
-+ * Set use_cid_lock before allocation. Only care about program order
-+ * because this is only required for forward progress.
-+ */
-+ barrier();
-+ /*
-+ * Retry until it succeeds. It is guaranteed to eventually succeed once
-+ * all newcoming allocations observe the use_cid_lock flag set.
-+ */
-+ do {
-+ cid = __mm_cid_try_get(mm);
-+ cpu_relax();
-+ } while (cid < 0);
-+ /*
-+ * Allocate before clearing use_cid_lock. Only care about
-+ * program order because this is for forward progress.
-+ */
-+ barrier();
-+ WRITE_ONCE(use_cid_lock, 0);
-+unlock:
-+ raw_spin_unlock(&cid_lock);
-+end:
-+ mm_cid_snapshot_time(rq, mm);
-+ return cid;
-+}
-+
-+static inline int mm_cid_get(struct rq *rq, struct mm_struct *mm)
-+{
-+ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
-+ struct cpumask *cpumask;
-+ int cid;
-+
-+ lockdep_assert_rq_held(rq);
-+ cpumask = mm_cidmask(mm);
-+ cid = __this_cpu_read(pcpu_cid->cid);
-+ if (mm_cid_is_valid(cid)) {
-+ mm_cid_snapshot_time(rq, mm);
-+ return cid;
-+ }
-+ if (mm_cid_is_lazy_put(cid)) {
-+ if (try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
-+ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
-+ }
-+ cid = __mm_cid_get(rq, mm);
-+ __this_cpu_write(pcpu_cid->cid, cid);
-+ return cid;
-+}
-+
-+static inline void switch_mm_cid(struct rq *rq,
-+ struct task_struct *prev,
-+ struct task_struct *next)
-+{
-+ /*
-+ * Provide a memory barrier between rq->curr store and load of
-+ * {prev,next}->mm->pcpu_cid[cpu] on rq->curr->mm transition.
-+ *
-+ * Should be adapted if context_switch() is modified.
-+ */
-+ if (!next->mm) { // to kernel
-+ /*
-+ * user -> kernel transition does not guarantee a barrier, but
-+ * we can use the fact that it performs an atomic operation in
-+ * mmgrab().
-+ */
-+ if (prev->mm) // from user
-+ smp_mb__after_mmgrab();
-+ /*
-+ * kernel -> kernel transition does not change rq->curr->mm
-+ * state. It stays NULL.
-+ */
-+ } else { // to user
-+ /*
-+ * kernel -> user transition does not provide a barrier
-+ * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
-+ * Provide it here.
-+ */
-+ if (!prev->mm) // from kernel
-+ smp_mb();
-+ /*
-+ * user -> user transition guarantees a memory barrier through
-+ * switch_mm() when current->mm changes. If current->mm is
-+ * unchanged, no barrier is needed.
-+ */
-+ }
-+ if (prev->mm_cid_active) {
-+ mm_cid_snapshot_time(rq, prev->mm);
-+ mm_cid_put_lazy(prev);
-+ prev->mm_cid = -1;
-+ }
-+ if (next->mm_cid_active)
-+ next->last_mm_cid = next->mm_cid = mm_cid_get(rq, next->mm);
-+}
-+
-+#else
-+static inline void switch_mm_cid(struct rq *rq, struct task_struct *prev, struct task_struct *next) { }
-+static inline void sched_mm_cid_migrate_from(struct task_struct *t) { }
-+static inline void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu) { }
-+static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
-+static inline void init_sched_mm_cid(struct task_struct *t) { }
-+#endif
-+
-+#endif /* ALT_SCHED_H */
-diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
-new file mode 100644
-index 000000000000..f29b8f3aa786
---- /dev/null
-+++ b/kernel/sched/bmq.h
-@@ -0,0 +1,110 @@
-+#define ALT_SCHED_NAME "BMQ"
-+
-+/*
-+ * BMQ only routines
-+ */
-+#define rq_switch_time(rq) ((rq)->clock - (rq)->last_ts_switch)
-+#define boost_threshold(p) (sched_timeslice_ns >>\
-+ (15 - MAX_PRIORITY_ADJ - (p)->boost_prio))
-+
-+static inline void boost_task(struct task_struct *p)
-+{
-+ int limit;
-+
-+ switch (p->policy) {
-+ case SCHED_NORMAL:
-+ limit = -MAX_PRIORITY_ADJ;
-+ break;
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ limit = 0;
-+ break;
-+ default:
-+ return;
-+ }
-+
-+ if (p->boost_prio > limit)
-+ p->boost_prio--;
-+}
-+
-+static inline void deboost_task(struct task_struct *p)
-+{
-+ if (p->boost_prio < MAX_PRIORITY_ADJ)
-+ p->boost_prio++;
-+}
-+
-+/*
-+ * Common interfaces
-+ */
-+static inline void sched_timeslice_imp(const int timeslice_ms) {}
-+
-+static inline int
-+task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
-+{
-+ return p->prio + p->boost_prio - MAX_RT_PRIO;
-+}
-+
-+static inline int task_sched_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MAX_RT_PRIO)? p->prio : MAX_RT_PRIO / 2 + (p->prio + p->boost_prio) / 2;
-+}
-+
-+static inline int
-+task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
-+{
-+ return task_sched_prio(p);
-+}
-+
-+static inline int sched_prio2idx(int prio, struct rq *rq)
-+{
-+ return prio;
-+}
-+
-+static inline int sched_idx2prio(int idx, struct rq *rq)
-+{
-+ return idx;
-+}
-+
-+static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
-+{
-+ p->time_slice = sched_timeslice_ns;
-+
-+ if (SCHED_FIFO != p->policy && task_on_rq_queued(p)) {
-+ if (SCHED_RR != p->policy)
-+ deboost_task(p);
-+ requeue_task(p, rq, task_sched_prio_idx(p, rq));
-+ }
-+}
-+
-+static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq) {}
-+
-+inline int task_running_nice(struct task_struct *p)
-+{
-+ return (p->prio + p->boost_prio > DEFAULT_PRIO + MAX_PRIORITY_ADJ);
-+}
-+
-+static void sched_task_fork(struct task_struct *p, struct rq *rq)
-+{
-+ p->boost_prio = MAX_PRIORITY_ADJ;
-+}
-+
-+static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
-+{
-+ p->boost_prio = MAX_PRIORITY_ADJ;
-+}
-+
-+#ifdef CONFIG_SMP
-+static inline void sched_task_ttwu(struct task_struct *p)
-+{
-+ if(this_rq()->clock_task - p->last_ran > sched_timeslice_ns)
-+ boost_task(p);
-+}
-+#endif
-+
-+static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq)
-+{
-+ if (rq_switch_time(rq) < boost_threshold(p))
-+ boost_task(p);
-+}
-+
-+static inline void update_rq_time_edge(struct rq *rq) {}
-diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
-index d9dc9ab3773f..71a25540d65e 100644
---- a/kernel/sched/build_policy.c
-+++ b/kernel/sched/build_policy.c
-@@ -42,13 +42,19 @@
-
- #include "idle.c"
-
-+#ifndef CONFIG_SCHED_ALT
- #include "rt.c"
-+#endif
-
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- # include "cpudeadline.c"
-+#endif
- # include "pelt.c"
- #endif
-
- #include "cputime.c"
--#include "deadline.c"
-
-+#ifndef CONFIG_SCHED_ALT
-+#include "deadline.c"
-+#endif
-diff --git a/kernel/sched/build_utility.c b/kernel/sched/build_utility.c
-index 99bdd96f454f..23f80a86d2d7 100644
---- a/kernel/sched/build_utility.c
-+++ b/kernel/sched/build_utility.c
-@@ -85,7 +85,9 @@
-
- #ifdef CONFIG_SMP
- # include "cpupri.c"
-+#ifndef CONFIG_SCHED_ALT
- # include "stop_task.c"
-+#endif
- # include "topology.c"
- #endif
-
-diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
-index e3211455b203..87f7a4f732c8 100644
---- a/kernel/sched/cpufreq_schedutil.c
-+++ b/kernel/sched/cpufreq_schedutil.c
-@@ -157,9 +157,14 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
- {
- struct rq *rq = cpu_rq(sg_cpu->cpu);
-
-+#ifndef CONFIG_SCHED_ALT
- sg_cpu->bw_dl = cpu_bw_dl(rq);
- sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu),
- FREQUENCY_UTIL, NULL);
-+#else
-+ sg_cpu->bw_dl = 0;
-+ sg_cpu->util = rq_load_util(rq, arch_scale_cpu_capacity(sg_cpu->cpu));
-+#endif /* CONFIG_SCHED_ALT */
- }
-
- /**
-@@ -305,8 +310,10 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
- */
- static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
- {
-+#ifndef CONFIG_SCHED_ALT
- if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_dl)
- sg_cpu->sg_policy->limits_changed = true;
-+#endif
- }
-
- static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
-@@ -609,6 +616,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
- }
-
- ret = sched_setattr_nocheck(thread, &attr);
-+
- if (ret) {
- kthread_stop(thread);
- pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
-@@ -841,7 +849,9 @@ cpufreq_governor_init(schedutil_gov);
- #ifdef CONFIG_ENERGY_MODEL
- static void rebuild_sd_workfn(struct work_struct *work)
- {
-+#ifndef CONFIG_SCHED_ALT
- rebuild_sched_domains_energy();
-+#endif /* CONFIG_SCHED_ALT */
- }
- static DECLARE_WORK(rebuild_sd_work, rebuild_sd_workfn);
-
-diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
-index af7952f12e6c..6461cbbb734d 100644
---- a/kernel/sched/cputime.c
-+++ b/kernel/sched/cputime.c
-@@ -126,7 +126,7 @@ void account_user_time(struct task_struct *p, u64 cputime)
- p->utime += cputime;
- account_group_user_time(p, cputime);
-
-- index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
-+ index = task_running_nice(p) ? CPUTIME_NICE : CPUTIME_USER;
-
- /* Add user time to cpustat. */
- task_group_account_field(p, index, cputime);
-@@ -150,7 +150,7 @@ void account_guest_time(struct task_struct *p, u64 cputime)
- p->gtime += cputime;
-
- /* Add guest time to cpustat. */
-- if (task_nice(p) > 0) {
-+ if (task_running_nice(p)) {
- task_group_account_field(p, CPUTIME_NICE, cputime);
- cpustat[CPUTIME_GUEST_NICE] += cputime;
- } else {
-@@ -288,7 +288,7 @@ static inline u64 account_other_time(u64 max)
- #ifdef CONFIG_64BIT
- static inline u64 read_sum_exec_runtime(struct task_struct *t)
- {
-- return t->se.sum_exec_runtime;
-+ return tsk_seruntime(t);
- }
- #else
- static u64 read_sum_exec_runtime(struct task_struct *t)
-@@ -298,7 +298,7 @@ static u64 read_sum_exec_runtime(struct task_struct *t)
- struct rq *rq;
-
- rq = task_rq_lock(t, &rf);
-- ns = t->se.sum_exec_runtime;
-+ ns = tsk_seruntime(t);
- task_rq_unlock(rq, t, &rf);
-
- return ns;
-@@ -630,7 +630,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
- void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
- {
- struct task_cputime cputime = {
-- .sum_exec_runtime = p->se.sum_exec_runtime,
-+ .sum_exec_runtime = tsk_seruntime(p),
- };
-
- if (task_cputime(p, &cputime.utime, &cputime.stime))
-diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
-index 0b2340a79b65..1e5407b8a738 100644
---- a/kernel/sched/debug.c
-+++ b/kernel/sched/debug.c
-@@ -7,6 +7,7 @@
- * Copyright(C) 2007, Red Hat, Inc., Ingo Molnar
- */
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * This allows printing both to /proc/sched_debug and
- * to the console
-@@ -215,6 +216,7 @@ static const struct file_operations sched_scaling_fops = {
- };
-
- #endif /* SMP */
-+#endif /* !CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_PREEMPT_DYNAMIC
-
-@@ -278,6 +280,7 @@ static const struct file_operations sched_dynamic_fops = {
-
- #endif /* CONFIG_PREEMPT_DYNAMIC */
-
-+#ifndef CONFIG_SCHED_ALT
- __read_mostly bool sched_debug_verbose;
-
- #ifdef CONFIG_SMP
-@@ -332,6 +335,7 @@ static const struct file_operations sched_debug_fops = {
- .llseek = seq_lseek,
- .release = seq_release,
- };
-+#endif /* !CONFIG_SCHED_ALT */
-
- static struct dentry *debugfs_sched;
-
-@@ -341,12 +345,16 @@ static __init int sched_init_debug(void)
-
- debugfs_sched = debugfs_create_dir("sched", NULL);
-
-+#ifndef CONFIG_SCHED_ALT
- debugfs_create_file("features", 0644, debugfs_sched, NULL, &sched_feat_fops);
- debugfs_create_file_unsafe("verbose", 0644, debugfs_sched, &sched_debug_verbose, &sched_verbose_fops);
-+ debugfs_create_bool("verbose", 0644, debugfs_sched, &sched_debug_verbose);
-+#endif /* !CONFIG_SCHED_ALT */
- #ifdef CONFIG_PREEMPT_DYNAMIC
- debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
- #endif
-
-+#ifndef CONFIG_SCHED_ALT
- debugfs_create_u32("latency_ns", 0644, debugfs_sched, &sysctl_sched_latency);
- debugfs_create_u32("min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_min_granularity);
- debugfs_create_u32("idle_min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_idle_min_granularity);
-@@ -376,11 +384,13 @@ static __init int sched_init_debug(void)
- #endif
-
- debugfs_create_file("debug", 0444, debugfs_sched, NULL, &sched_debug_fops);
-+#endif /* !CONFIG_SCHED_ALT */
-
- return 0;
- }
- late_initcall(sched_init_debug);
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_SMP
-
- static cpumask_var_t sd_sysctl_cpus;
-@@ -1114,6 +1124,7 @@ void proc_sched_set_task(struct task_struct *p)
- memset(&p->stats, 0, sizeof(p->stats));
- #endif
- }
-+#endif /* !CONFIG_SCHED_ALT */
-
- void resched_latency_warn(int cpu, u64 latency)
- {
-diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
-index 342f58a329f5..ab493e759084 100644
---- a/kernel/sched/idle.c
-+++ b/kernel/sched/idle.c
-@@ -379,6 +379,7 @@ void cpu_startup_entry(enum cpuhp_state state)
- do_idle();
- }
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * idle-task scheduling class.
- */
-@@ -500,3 +501,4 @@ DEFINE_SCHED_CLASS(idle) = {
- .switched_to = switched_to_idle,
- .update_curr = update_curr_idle,
- };
-+#endif
-diff --git a/kernel/sched/pds.h b/kernel/sched/pds.h
-new file mode 100644
-index 000000000000..15cc4887efed
---- /dev/null
-+++ b/kernel/sched/pds.h
-@@ -0,0 +1,152 @@
-+#define ALT_SCHED_NAME "PDS"
-+
-+#define MIN_SCHED_NORMAL_PRIO (32)
-+static const u64 RT_MASK = ((1ULL << MIN_SCHED_NORMAL_PRIO) - 1);
-+
-+#define SCHED_NORMAL_PRIO_NUM (32)
-+#define SCHED_EDGE_DELTA (SCHED_NORMAL_PRIO_NUM - NICE_WIDTH / 2)
-+
-+/* PDS assume NORMAL_PRIO_NUM is power of 2 */
-+#define SCHED_NORMAL_PRIO_MOD(x) ((x) & (SCHED_NORMAL_PRIO_NUM - 1))
-+
-+/* default time slice 4ms -> shift 22, 2 time slice slots -> shift 23 */
-+static __read_mostly int sched_timeslice_shift = 23;
-+
-+/*
-+ * Common interfaces
-+ */
-+static inline void sched_timeslice_imp(const int timeslice_ms)
-+{
-+ if (2 == timeslice_ms)
-+ sched_timeslice_shift = 22;
-+}
-+
-+static inline int
-+task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
-+{
-+ s64 delta = p->deadline - rq->time_edge + SCHED_EDGE_DELTA;
-+
-+#ifdef ALT_SCHED_DEBUG
-+ if (WARN_ONCE(delta > NORMAL_PRIO_NUM - 1,
-+ "pds: task_sched_prio_normal() delta %lld\n", delta))
-+ return SCHED_NORMAL_PRIO_NUM - 1;
-+#endif
-+
-+ return max(0LL, delta);
-+}
-+
-+static inline int task_sched_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MIN_NORMAL_PRIO) ? (p->prio >> 2) :
-+ MIN_SCHED_NORMAL_PRIO + task_sched_prio_normal(p, task_rq(p));
-+}
-+
-+static inline int
-+task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
-+{
-+ u64 idx;
-+
-+ if (p->prio < MIN_NORMAL_PRIO)
-+ return p->prio >> 2;
-+
-+ idx = max(p->deadline + SCHED_EDGE_DELTA, rq->time_edge);
-+ /*printk(KERN_INFO "sched: task_sched_prio_idx edge:%llu, deadline=%llu idx=%llu\n", rq->time_edge, p->deadline, idx);*/
-+ return MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(idx);
-+}
-+
-+static inline int sched_prio2idx(int sched_prio, struct rq *rq)
-+{
-+ return (IDLE_TASK_SCHED_PRIO == sched_prio || sched_prio < MIN_SCHED_NORMAL_PRIO) ?
-+ sched_prio :
-+ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_prio + rq->time_edge);
-+}
-+
-+static inline int sched_idx2prio(int sched_idx, struct rq *rq)
-+{
-+ return (sched_idx < MIN_SCHED_NORMAL_PRIO) ?
-+ sched_idx :
-+ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_idx - rq->time_edge);
-+}
-+
-+static inline void sched_renew_deadline(struct task_struct *p, const struct rq *rq)
-+{
-+ if (p->prio >= MIN_NORMAL_PRIO)
-+ p->deadline = rq->time_edge + (p->static_prio - (MAX_PRIO - NICE_WIDTH)) / 2;
-+}
-+
-+int task_running_nice(struct task_struct *p)
-+{
-+ return (p->prio > DEFAULT_PRIO);
-+}
-+
-+static inline void update_rq_time_edge(struct rq *rq)
-+{
-+ struct list_head head;
-+ u64 old = rq->time_edge;
-+ u64 now = rq->clock >> sched_timeslice_shift;
-+ u64 prio, delta;
-+ DECLARE_BITMAP(normal, SCHED_QUEUE_BITS);
-+
-+ if (now == old)
-+ return;
-+
-+ rq->time_edge = now;
-+ delta = min_t(u64, SCHED_NORMAL_PRIO_NUM, now - old);
-+ INIT_LIST_HEAD(&head);
-+
-+ /*printk(KERN_INFO "sched: update_rq_time_edge 0x%016lx %llu\n", rq->queue.bitmap[0], delta);*/
-+ prio = MIN_SCHED_NORMAL_PRIO;
-+ for_each_set_bit_from(prio, rq->queue.bitmap, MIN_SCHED_NORMAL_PRIO + delta)
-+ list_splice_tail_init(rq->queue.heads + MIN_SCHED_NORMAL_PRIO +
-+ SCHED_NORMAL_PRIO_MOD(prio + old), &head);
-+
-+ bitmap_shift_right(normal, rq->queue.bitmap, delta, SCHED_QUEUE_BITS);
-+ if (!list_empty(&head)) {
-+ struct task_struct *p;
-+ u64 idx = MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(now);
-+
-+ list_for_each_entry(p, &head, sq_node)
-+ p->sq_idx = idx;
-+
-+ list_splice(&head, rq->queue.heads + idx);
-+ set_bit(MIN_SCHED_NORMAL_PRIO, normal);
-+ }
-+ bitmap_replace(rq->queue.bitmap, normal, rq->queue.bitmap,
-+ (const unsigned long *)&RT_MASK, SCHED_QUEUE_BITS);
-+
-+ if (rq->prio < MIN_SCHED_NORMAL_PRIO || IDLE_TASK_SCHED_PRIO == rq->prio)
-+ return;
-+
-+ rq->prio = (rq->prio < MIN_SCHED_NORMAL_PRIO + delta) ?
-+ MIN_SCHED_NORMAL_PRIO : rq->prio - delta;
-+}
-+
-+static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
-+{
-+ p->time_slice = sched_timeslice_ns;
-+ sched_renew_deadline(p, rq);
-+ if (SCHED_FIFO != p->policy && task_on_rq_queued(p))
-+ requeue_task(p, rq, task_sched_prio_idx(p, rq));
-+}
-+
-+static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq)
-+{
-+ u64 max_dl = rq->time_edge + NICE_WIDTH / 2 - 1;
-+ if (unlikely(p->deadline > max_dl))
-+ p->deadline = max_dl;
-+}
-+
-+static void sched_task_fork(struct task_struct *p, struct rq *rq)
-+{
-+ sched_renew_deadline(p, rq);
-+}
-+
-+static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
-+{
-+ time_slice_expired(p, rq);
-+}
-+
-+#ifdef CONFIG_SMP
-+static inline void sched_task_ttwu(struct task_struct *p) {}
-+#endif
-+static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq) {}
-diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
-index 0f310768260c..bd38bf738fe9 100644
---- a/kernel/sched/pelt.c
-+++ b/kernel/sched/pelt.c
-@@ -266,6 +266,7 @@ ___update_load_avg(struct sched_avg *sa, unsigned long load)
- WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
- }
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * sched_entity:
- *
-@@ -383,8 +384,9 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
-
- return 0;
- }
-+#endif
-
--#ifdef CONFIG_SCHED_THERMAL_PRESSURE
-+#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
- /*
- * thermal:
- *
-diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
-index 3a0e0dc28721..e8a7d84aa5a5 100644
---- a/kernel/sched/pelt.h
-+++ b/kernel/sched/pelt.h
-@@ -1,13 +1,15 @@
- #ifdef CONFIG_SMP
- #include "sched-pelt.h"
-
-+#ifndef CONFIG_SCHED_ALT
- int __update_load_avg_blocked_se(u64 now, struct sched_entity *se);
- int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se);
- int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq);
- int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
- int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
-+#endif
-
--#ifdef CONFIG_SCHED_THERMAL_PRESSURE
-+#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
- int update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity);
-
- static inline u64 thermal_load_avg(struct rq *rq)
-@@ -44,6 +46,7 @@ static inline u32 get_pelt_divider(struct sched_avg *avg)
- return PELT_MIN_DIVIDER + avg->period_contrib;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- static inline void cfs_se_util_change(struct sched_avg *avg)
- {
- unsigned int enqueued;
-@@ -180,9 +183,11 @@ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
- return rq_clock_pelt(rq_of(cfs_rq));
- }
- #endif
-+#endif /* CONFIG_SCHED_ALT */
-
- #else
-
-+#ifndef CONFIG_SCHED_ALT
- static inline int
- update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
- {
-@@ -200,6 +205,7 @@ update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
- {
- return 0;
- }
-+#endif
-
- static inline int
- update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity)
-diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
-index ec7b3e0a2b20..3b4052dd7bee 100644
---- a/kernel/sched/sched.h
-+++ b/kernel/sched/sched.h
-@@ -5,6 +5,10 @@
- #ifndef _KERNEL_SCHED_SCHED_H
- #define _KERNEL_SCHED_SCHED_H
-
-+#ifdef CONFIG_SCHED_ALT
-+#include "alt_sched.h"
-+#else
-+
- #include <linux/sched/affinity.h>
- #include <linux/sched/autogroup.h>
- #include <linux/sched/cpufreq.h>
-@@ -3487,4 +3491,9 @@ static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
- static inline void init_sched_mm_cid(struct task_struct *t) { }
- #endif
-
-+static inline int task_running_nice(struct task_struct *p)
-+{
-+ return (task_nice(p) > 0);
-+}
-+#endif /* !CONFIG_SCHED_ALT */
- #endif /* _KERNEL_SCHED_SCHED_H */
-diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
-index 857f837f52cb..5486c63e4790 100644
---- a/kernel/sched/stats.c
-+++ b/kernel/sched/stats.c
-@@ -125,8 +125,10 @@ static int show_schedstat(struct seq_file *seq, void *v)
- } else {
- struct rq *rq;
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- struct sched_domain *sd;
- int dcount = 0;
-+#endif
- #endif
- cpu = (unsigned long)(v - 2);
- rq = cpu_rq(cpu);
-@@ -143,6 +145,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
- seq_printf(seq, "\n");
-
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- /* domain-specific stats */
- rcu_read_lock();
- for_each_domain(cpu, sd) {
-@@ -171,6 +174,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
- sd->ttwu_move_balance);
- }
- rcu_read_unlock();
-+#endif
- #endif
- }
- return 0;
-diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
-index 38f3698f5e5b..b9d597394316 100644
---- a/kernel/sched/stats.h
-+++ b/kernel/sched/stats.h
-@@ -89,6 +89,7 @@ static inline void rq_sched_info_depart (struct rq *rq, unsigned long long delt
-
- #endif /* CONFIG_SCHEDSTATS */
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_FAIR_GROUP_SCHED
- struct sched_entity_stats {
- struct sched_entity se;
-@@ -105,6 +106,7 @@ __schedstats_from_se(struct sched_entity *se)
- #endif
- return &task_of(se)->stats;
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_PSI
- void psi_task_change(struct task_struct *task, int clear, int set);
-diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
-index 6682535e37c8..144875e2728d 100644
---- a/kernel/sched/topology.c
-+++ b/kernel/sched/topology.c
-@@ -3,6 +3,7 @@
- * Scheduler topology setup/handling methods
- */
-
-+#ifndef CONFIG_SCHED_ALT
- #include <linux/bsearch.h>
-
- DEFINE_MUTEX(sched_domains_mutex);
-@@ -1415,8 +1416,10 @@ static void asym_cpu_capacity_scan(void)
- */
-
- static int default_relax_domain_level = -1;
-+#endif /* CONFIG_SCHED_ALT */
- int sched_domain_level_max;
-
-+#ifndef CONFIG_SCHED_ALT
- static int __init setup_relax_domain_level(char *str)
- {
- if (kstrtoint(str, 0, &default_relax_domain_level))
-@@ -1649,6 +1652,7 @@ sd_init(struct sched_domain_topology_level *tl,
-
- return sd;
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- /*
- * Topology list, bottom-up.
-@@ -1685,6 +1689,7 @@ void set_sched_topology(struct sched_domain_topology_level *tl)
- sched_domain_topology_saved = NULL;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_NUMA
-
- static const struct cpumask *sd_numa_mask(int cpu)
-@@ -2740,3 +2745,20 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
- partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
- mutex_unlock(&sched_domains_mutex);
- }
-+#else /* CONFIG_SCHED_ALT */
-+void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
-+ struct sched_domain_attr *dattr_new)
-+{}
-+
-+#ifdef CONFIG_NUMA
-+int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
-+{
-+ return best_mask_cpu(cpu, cpus);
-+}
-+
-+int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
-+{
-+ return cpumask_nth(cpu, cpus);
-+}
-+#endif /* CONFIG_NUMA */
-+#endif
-diff --git a/kernel/sysctl.c b/kernel/sysctl.c
-index bfe53e835524..943fa125064b 100644
---- a/kernel/sysctl.c
-+++ b/kernel/sysctl.c
-@@ -92,6 +92,10 @@ EXPORT_SYMBOL_GPL(sysctl_long_vals);
-
- /* Constants used for minimum and maximum */
-
-+#ifdef CONFIG_SCHED_ALT
-+extern int sched_yield_type;
-+#endif
-+
- #ifdef CONFIG_PERF_EVENTS
- static const int six_hundred_forty_kb = 640 * 1024;
- #endif
-@@ -1917,6 +1921,17 @@ static struct ctl_table kern_table[] = {
- .proc_handler = proc_dointvec,
- },
- #endif
-+#ifdef CONFIG_SCHED_ALT
-+ {
-+ .procname = "yield_type",
-+ .data = &sched_yield_type,
-+ .maxlen = sizeof (int),
-+ .mode = 0644,
-+ .proc_handler = &proc_dointvec_minmax,
-+ .extra1 = SYSCTL_ZERO,
-+ .extra2 = SYSCTL_TWO,
-+ },
-+#endif
- #if defined(CONFIG_S390) && defined(CONFIG_SMP)
- {
- .procname = "spin_retry",
-diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
-index e8c08292defc..3823ff0ddc0f 100644
---- a/kernel/time/hrtimer.c
-+++ b/kernel/time/hrtimer.c
-@@ -2088,8 +2088,10 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum hrtimer_mode mode,
- int ret = 0;
- u64 slack;
-
-+#ifndef CONFIG_SCHED_ALT
- slack = current->timer_slack_ns;
-- if (rt_task(current))
-+ if (dl_task(current) || rt_task(current))
-+#endif
- slack = 0;
-
- hrtimer_init_sleeper_on_stack(&t, clockid, mode);
-diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
-index e9c6f9d0e42c..43ee0a94abdd 100644
---- a/kernel/time/posix-cpu-timers.c
-+++ b/kernel/time/posix-cpu-timers.c
-@@ -223,7 +223,7 @@ static void task_sample_cputime(struct task_struct *p, u64 *samples)
- u64 stime, utime;
-
- task_cputime(p, &utime, &stime);
-- store_samples(samples, stime, utime, p->se.sum_exec_runtime);
-+ store_samples(samples, stime, utime, tsk_seruntime(p));
- }
-
- static void proc_sample_cputime_atomic(struct task_cputime_atomic *at,
-@@ -867,6 +867,7 @@ static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples,
- }
- }
-
-+#ifndef CONFIG_SCHED_ALT
- static inline void check_dl_overrun(struct task_struct *tsk)
- {
- if (tsk->dl.dl_overrun) {
-@@ -874,6 +875,7 @@ static inline void check_dl_overrun(struct task_struct *tsk)
- send_signal_locked(SIGXCPU, SEND_SIG_PRIV, tsk, PIDTYPE_TGID);
- }
- }
-+#endif
-
- static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard)
- {
-@@ -901,8 +903,10 @@ static void check_thread_timers(struct task_struct *tsk,
- u64 samples[CPUCLOCK_MAX];
- unsigned long soft;
-
-+#ifndef CONFIG_SCHED_ALT
- if (dl_task(tsk))
- check_dl_overrun(tsk);
-+#endif
-
- if (expiry_cache_is_inactive(pct))
- return;
-@@ -916,7 +920,7 @@ static void check_thread_timers(struct task_struct *tsk,
- soft = task_rlimit(tsk, RLIMIT_RTTIME);
- if (soft != RLIM_INFINITY) {
- /* Task RT timeout is accounted in jiffies. RTTIME is usec */
-- unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ);
-+ unsigned long rttime = tsk_rttimeout(tsk) * (USEC_PER_SEC / HZ);
- unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME);
-
- /* At the hard limit, send SIGKILL. No further action. */
-@@ -1152,8 +1156,10 @@ static inline bool fastpath_timer_check(struct task_struct *tsk)
- return true;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- if (dl_task(tsk) && tsk->dl.dl_overrun)
- return true;
-+#endif
-
- return false;
- }
-diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
-index 529590499b1f..d04bb99b4f0e 100644
---- a/kernel/trace/trace_selftest.c
-+++ b/kernel/trace/trace_selftest.c
-@@ -1155,10 +1155,15 @@ static int trace_wakeup_test_thread(void *data)
- {
- /* Make this a -deadline thread */
- static const struct sched_attr attr = {
-+#ifdef CONFIG_SCHED_ALT
-+ /* No deadline on BMQ/PDS, use RR */
-+ .sched_policy = SCHED_RR,
-+#else
- .sched_policy = SCHED_DEADLINE,
- .sched_runtime = 100000ULL,
- .sched_deadline = 10000000ULL,
- .sched_period = 10000000ULL
-+#endif
- };
- struct wakeup_test_data *x = data;
-
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
deleted file mode 100644
index 6dc48eec..00000000
--- a/5021_BMQ-and-PDS-gentoo-defaults.patch
+++ /dev/null
@@ -1,13 +0,0 @@
---- a/init/Kconfig 2023-02-13 08:16:09.534315265 -0500
-+++ b/init/Kconfig 2023-02-13 08:17:24.130237204 -0500
-@@ -867,8 +867,9 @@ config UCLAMP_BUCKETS_COUNT
- If in doubt, use the default value.
-
- menuconfig SCHED_ALT
-+ depends on X86_64
- bool "Alternative CPU Schedulers"
-- default y
-+ default n
- help
- This feature enable alternative CPU scheduler"
-
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-03 11:47 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-08-03 11:47 UTC (permalink / raw
To: gentoo-commits
commit: 13be3c70c1e65f58ee3fc0a405121c2fc7637614
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Thu Aug 3 11:47:21 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Thu Aug 3 11:47:21 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=13be3c70
Linux patch 6.4.8
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1007_linux-6.4.8.patch | 12270 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 12274 insertions(+)
diff --git a/0000_README b/0000_README
index 58f42d41..22c14f8e 100644
--- a/0000_README
+++ b/0000_README
@@ -71,6 +71,10 @@ Patch: 1006_linux-6.4.7.patch
From: https://www.kernel.org
Desc: Linux 6.4.7
+Patch: 1007_linux-6.4.8.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.8
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1007_linux-6.4.8.patch b/1007_linux-6.4.8.patch
new file mode 100644
index 00000000..002dfc74
--- /dev/null
+++ b/1007_linux-6.4.8.patch
@@ -0,0 +1,12270 @@
+diff --git a/Documentation/ABI/testing/sysfs-module b/Documentation/ABI/testing/sysfs-module
+index 08886367d0470..62addab47d0c5 100644
+--- a/Documentation/ABI/testing/sysfs-module
++++ b/Documentation/ABI/testing/sysfs-module
+@@ -60,3 +60,14 @@ Description: Module taint flags:
+ C staging driver module
+ E unsigned module
+ == =====================
++
++What: /sys/module/grant_table/parameters/free_per_iteration
++Date: July 2023
++KernelVersion: 6.5 but backported to all supported stable branches
++Contact: Xen developer discussion <xen-devel@lists.xenproject.org>
++Description: Read and write number of grant entries to attempt to free per iteration.
++
++ Note: Future versions of Xen and Linux may provide a better
++ interface for controlling the rate of deferred grant reclaim
++ or may not need it at all.
++Users: Qubes OS (https://www.qubes-os.org)
+diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
+index 4d186f599d90f..32a8893e56177 100644
+--- a/Documentation/admin-guide/hw-vuln/spectre.rst
++++ b/Documentation/admin-guide/hw-vuln/spectre.rst
+@@ -484,11 +484,14 @@ Spectre variant 2
+
+ Systems which support enhanced IBRS (eIBRS) enable IBRS protection once at
+ boot, by setting the IBRS bit, and they're automatically protected against
+- Spectre v2 variant attacks, including cross-thread branch target injections
+- on SMT systems (STIBP). In other words, eIBRS enables STIBP too.
++ Spectre v2 variant attacks.
+
+- Legacy IBRS systems clear the IBRS bit on exit to userspace and
+- therefore explicitly enable STIBP for that
++ On Intel's enhanced IBRS systems, this includes cross-thread branch target
++ injections on SMT systems (STIBP). In other words, Intel eIBRS enables
++ STIBP, too.
++
++ AMD Automatic IBRS does not protect userspace, and Legacy IBRS systems clear
++ the IBRS bit on exit to userspace, therefore both explicitly enable STIBP.
+
+ The retpoline mitigation is turned on by default on vulnerable
+ CPUs. It can be forced on or off by the administrator
+diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
+index f18f46be5c0c7..2cd8fa332feb7 100644
+--- a/Documentation/filesystems/tmpfs.rst
++++ b/Documentation/filesystems/tmpfs.rst
+@@ -84,8 +84,6 @@ nr_inodes The maximum number of inodes for this instance. The default
+ is half of the number of your physical RAM pages, or (on a
+ machine with highmem) the number of lowmem RAM pages,
+ whichever is the lower.
+-noswap Disables swap. Remounts must respect the original settings.
+- By default swap is enabled.
+ ========= ============================================================
+
+ These parameters accept a suffix k, m or g for kilo, mega and giga and
+@@ -99,36 +97,31 @@ mount with such options, since it allows any user with write access to
+ use up all the memory on the machine; but enhances the scalability of
+ that instance in a system with many CPUs making intensive use of it.
+
++tmpfs blocks may be swapped out, when there is a shortage of memory.
++tmpfs has a mount option to disable its use of swap:
++
++====== ===========================================================
++noswap Disables swap. Remounts must respect the original settings.
++ By default swap is enabled.
++====== ===========================================================
++
+ tmpfs also supports Transparent Huge Pages which requires a kernel
+ configured with CONFIG_TRANSPARENT_HUGEPAGE and with huge supported for
+ your system (has_transparent_hugepage(), which is architecture specific).
+ The mount options for this are:
+
+-====== ============================================================
+-huge=0 never: disables huge pages for the mount
+-huge=1 always: enables huge pages for the mount
+-huge=2 within_size: only allocate huge pages if the page will be
+- fully within i_size, also respect fadvise()/madvise() hints.
+-huge=3 advise: only allocate huge pages if requested with
+- fadvise()/madvise()
+-====== ============================================================
+-
+-There is a sysfs file which you can also use to control system wide THP
+-configuration for all tmpfs mounts, the file is:
+-
+-/sys/kernel/mm/transparent_hugepage/shmem_enabled
+-
+-This sysfs file is placed on top of THP sysfs directory and so is registered
+-by THP code. It is however only used to control all tmpfs mounts with one
+-single knob. Since it controls all tmpfs mounts it should only be used either
+-for emergency or testing purposes. The values you can set for shmem_enabled are:
+-
+-== ============================================================
+--1 deny: disables huge on shm_mnt and all mounts, for
+- emergency use
+--2 force: enables huge on shm_mnt and all mounts, w/o needing
+- option, for testing
+-== ============================================================
++================ ==============================================================
++huge=never Do not allocate huge pages. This is the default.
++huge=always Attempt to allocate huge page every time a new page is needed.
++huge=within_size Only allocate huge page if it will be fully within i_size.
++ Also respect madvise(2) hints.
++huge=advise Only allocate huge page if requested with madvise(2).
++================ ==============================================================
++
++See also Documentation/admin-guide/mm/transhuge.rst, which describes the
++sysfs file /sys/kernel/mm/transparent_hugepage/shmem_enabled: which can
++be used to deny huge pages on all tmpfs mounts in an emergency, or to
++force huge pages on all tmpfs mounts for testing.
+
+ tmpfs has a mount option to set the NUMA memory allocation policy for
+ all files in that instance (if CONFIG_NUMA is enabled) - which can be
+diff --git a/Documentation/process/security-bugs.rst b/Documentation/process/security-bugs.rst
+index 82e29837d5898..5a6993795bd26 100644
+--- a/Documentation/process/security-bugs.rst
++++ b/Documentation/process/security-bugs.rst
+@@ -63,31 +63,28 @@ information submitted to the security list and any followup discussions
+ of the report are treated confidentially even after the embargo has been
+ lifted, in perpetuity.
+
+-Coordination
+-------------
+-
+-Fixes for sensitive bugs, such as those that might lead to privilege
+-escalations, may need to be coordinated with the private
+-<linux-distros@vs.openwall.org> mailing list so that distribution vendors
+-are well prepared to issue a fixed kernel upon public disclosure of the
+-upstream fix. Distros will need some time to test the proposed patch and
+-will generally request at least a few days of embargo, and vendor update
+-publication prefers to happen Tuesday through Thursday. When appropriate,
+-the security team can assist with this coordination, or the reporter can
+-include linux-distros from the start. In this case, remember to prefix
+-the email Subject line with "[vs]" as described in the linux-distros wiki:
+-<http://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists>
++Coordination with other groups
++------------------------------
++
++The kernel security team strongly recommends that reporters of potential
++security issues NEVER contact the "linux-distros" mailing list until
++AFTER discussing it with the kernel security team. Do not Cc: both
++lists at once. You may contact the linux-distros mailing list after a
++fix has been agreed on and you fully understand the requirements that
++doing so will impose on you and the kernel community.
++
++The different lists have different goals and the linux-distros rules do
++not contribute to actually fixing any potential security problems.
+
+ CVE assignment
+ --------------
+
+-The security team does not normally assign CVEs, nor do we require them
+-for reports or fixes, as this can needlessly complicate the process and
+-may delay the bug handling. If a reporter wishes to have a CVE identifier
+-assigned ahead of public disclosure, they will need to contact the private
+-linux-distros list, described above. When such a CVE identifier is known
+-before a patch is provided, it is desirable to mention it in the commit
+-message if the reporter agrees.
++The security team does not assign CVEs, nor do we require them for
++reports or fixes, as this can needlessly complicate the process and may
++delay the bug handling. If a reporter wishes to have a CVE identifier
++assigned, they should find one by themselves, for example by contacting
++MITRE directly. However under no circumstances will a patch inclusion
++be delayed to wait for a CVE identifier to arrive.
+
+ Non-disclosure agreements
+ -------------------------
+diff --git a/Makefile b/Makefile
+index b3dc3b5f14cae..9607ce0b8a10c 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 7
++SUBLEVEL = 8
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
+index 4eb601e7de507..06382da630123 100644
+--- a/arch/arm64/include/asm/virt.h
++++ b/arch/arm64/include/asm/virt.h
+@@ -78,6 +78,7 @@ extern u32 __boot_cpu_mode[2];
+
+ void __hyp_set_vectors(phys_addr_t phys_vector_base);
+ void __hyp_reset_vectors(void);
++bool is_kvm_arm_initialised(void);
+
+ DECLARE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
+
+diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
+index 9d7d10d60bfdc..520b681a07bb0 100644
+--- a/arch/arm64/kernel/fpsimd.c
++++ b/arch/arm64/kernel/fpsimd.c
+@@ -917,6 +917,8 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
+ if (task == current)
+ put_cpu_fpsimd_context();
+
++ task_set_vl(task, type, vl);
++
+ /*
+ * Free the changed states if they are not in use, SME will be
+ * reallocated to the correct size on next use and we just
+@@ -931,8 +933,6 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
+ if (free_sme)
+ sme_free(task);
+
+- task_set_vl(task, type, vl);
+-
+ out:
+ update_tsk_thread_flag(task, vec_vl_inherit_flag(type),
+ flags & PR_SVE_VL_INHERIT);
+diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
+index 7d8c3dd8b7ca9..3a2606ba3e583 100644
+--- a/arch/arm64/kvm/arm.c
++++ b/arch/arm64/kvm/arm.c
+@@ -51,11 +51,16 @@ DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);
+ DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
+ DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
+
+-static bool vgic_present;
++static bool vgic_present, kvm_arm_initialised;
+
+ static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
+ DEFINE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
+
++bool is_kvm_arm_initialised(void)
++{
++ return kvm_arm_initialised;
++}
++
+ int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
+ {
+ return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
+@@ -2396,6 +2401,8 @@ static __init int kvm_arm_init(void)
+ if (err)
+ goto out_subs;
+
++ kvm_arm_initialised = true;
++
+ return 0;
+
+ out_subs:
+diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
+index 6e9ece1ebbe72..3895416cb15ae 100644
+--- a/arch/arm64/kvm/pkvm.c
++++ b/arch/arm64/kvm/pkvm.c
+@@ -243,7 +243,7 @@ static int __init finalize_pkvm(void)
+ {
+ int ret;
+
+- if (!is_protected_kvm_enabled())
++ if (!is_protected_kvm_enabled() || !is_kvm_arm_initialised())
+ return 0;
+
+ /*
+diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
+index 73519e13bbb39..2570e7c1eb75f 100644
+--- a/arch/loongarch/Kconfig
++++ b/arch/loongarch/Kconfig
+@@ -12,6 +12,7 @@ config LOONGARCH
+ select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
+ select ARCH_HAS_FORTIFY_SOURCE
+ select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
++ select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ select ARCH_HAS_PTE_SPECIAL
+ select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
+ select ARCH_INLINE_READ_LOCK if !PREEMPTION
+diff --git a/arch/loongarch/lib/clear_user.S b/arch/loongarch/lib/clear_user.S
+index fd1d62b244f2f..9dcf717193874 100644
+--- a/arch/loongarch/lib/clear_user.S
++++ b/arch/loongarch/lib/clear_user.S
+@@ -108,6 +108,7 @@ SYM_FUNC_START(__clear_user_fast)
+ addi.d a3, a2, -8
+ bgeu a0, a3, .Llt8
+ 15: st.d zero, a0, 0
++ addi.d a0, a0, 8
+
+ .Llt8:
+ 16: st.d zero, a2, -8
+@@ -188,7 +189,7 @@ SYM_FUNC_START(__clear_user_fast)
+ _asm_extable 13b, .L_fixup_handle_0
+ _asm_extable 14b, .L_fixup_handle_1
+ _asm_extable 15b, .L_fixup_handle_0
+- _asm_extable 16b, .L_fixup_handle_1
++ _asm_extable 16b, .L_fixup_handle_0
+ _asm_extable 17b, .L_fixup_handle_s0
+ _asm_extable 18b, .L_fixup_handle_s0
+ _asm_extable 19b, .L_fixup_handle_s0
+diff --git a/arch/loongarch/lib/copy_user.S b/arch/loongarch/lib/copy_user.S
+index b21f6d5d38f51..fecd08cad702d 100644
+--- a/arch/loongarch/lib/copy_user.S
++++ b/arch/loongarch/lib/copy_user.S
+@@ -136,6 +136,7 @@ SYM_FUNC_START(__copy_user_fast)
+ bgeu a1, a4, .Llt8
+ 30: ld.d t0, a1, 0
+ 31: st.d t0, a0, 0
++ addi.d a0, a0, 8
+
+ .Llt8:
+ 32: ld.d t0, a3, -8
+@@ -246,7 +247,7 @@ SYM_FUNC_START(__copy_user_fast)
+ _asm_extable 30b, .L_fixup_handle_0
+ _asm_extable 31b, .L_fixup_handle_0
+ _asm_extable 32b, .L_fixup_handle_0
+- _asm_extable 33b, .L_fixup_handle_1
++ _asm_extable 33b, .L_fixup_handle_0
+ _asm_extable 34b, .L_fixup_handle_s0
+ _asm_extable 35b, .L_fixup_handle_s0
+ _asm_extable 36b, .L_fixup_handle_s0
+diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
+index c335dc4eed370..68586338ecf85 100644
+--- a/arch/loongarch/net/bpf_jit.h
++++ b/arch/loongarch/net/bpf_jit.h
+@@ -150,7 +150,7 @@ static inline void move_imm(struct jit_ctx *ctx, enum loongarch_gpr rd, long imm
+ * no need to call lu32id to do a new filled operation.
+ */
+ imm_51_31 = (imm >> 31) & 0x1fffff;
+- if (imm_51_31 != 0 || imm_51_31 != 0x1fffff) {
++ if (imm_51_31 != 0 && imm_51_31 != 0x1fffff) {
+ /* lu32id rd, imm_51_32 */
+ imm_51_32 = (imm >> 32) & 0xfffff;
+ emit_insn(ctx, lu32id, rd, imm_51_32);
+diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
+index 9a44a98ba3420..3fbc2a6aa319d 100644
+--- a/arch/powerpc/platforms/pseries/vas.c
++++ b/arch/powerpc/platforms/pseries/vas.c
+@@ -744,6 +744,12 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
+ }
+
+ task_ref = &win->vas_win.task_ref;
++ /*
++ * VAS mmap (coproc_mmap()) and its fault handler
++ * (vas_mmap_fault()) are called after holding mmap lock.
++ * So hold mmap mutex after mmap_lock to avoid deadlock.
++ */
++ mmap_write_lock(task_ref->mm);
+ mutex_lock(&task_ref->mmap_mutex);
+ vma = task_ref->vma;
+ /*
+@@ -752,7 +758,6 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
+ */
+ win->vas_win.status |= flag;
+
+- mmap_write_lock(task_ref->mm);
+ /*
+ * vma is set in the original mapping. But this mapping
+ * is done with mmap() after the window is opened with ioctl.
+@@ -762,8 +767,8 @@ static int reconfig_close_windows(struct vas_caps *vcap, int excess_creds,
+ if (vma)
+ zap_vma_pages(vma);
+
+- mmap_write_unlock(task_ref->mm);
+ mutex_unlock(&task_ref->mmap_mutex);
++ mmap_write_unlock(task_ref->mm);
+ /*
+ * Close VAS window in the hypervisor, but do not
+ * free vas_window struct since it may be reused
+diff --git a/arch/s390/kvm/pv.c b/arch/s390/kvm/pv.c
+index 3ce5f4351156a..899f3b8ac0110 100644
+--- a/arch/s390/kvm/pv.c
++++ b/arch/s390/kvm/pv.c
+@@ -411,8 +411,12 @@ int kvm_s390_pv_deinit_cleanup_all(struct kvm *kvm, u16 *rc, u16 *rrc)
+ u16 _rc, _rrc;
+ int cc = 0;
+
+- /* Make sure the counter does not reach 0 before calling s390_uv_destroy_range */
+- atomic_inc(&kvm->mm->context.protected_count);
++ /*
++ * Nothing to do if the counter was already 0. Otherwise make sure
++ * the counter does not reach 0 before calling s390_uv_destroy_range.
++ */
++ if (!atomic_inc_not_zero(&kvm->mm->context.protected_count))
++ return 0;
+
+ *rc = 1;
+ /* If the current VM is protected, destroy it */
+diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
+index dbe8394234e2b..2f123429a291b 100644
+--- a/arch/s390/mm/fault.c
++++ b/arch/s390/mm/fault.c
+@@ -421,6 +421,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
+ vma_end_read(vma);
+ if (!(fault & VM_FAULT_RETRY)) {
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
++ if (likely(!(fault & VM_FAULT_ERROR)))
++ fault = 0;
+ goto out;
+ }
+ count_vm_vma_lock_event(VMA_LOCK_RETRY);
+diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
+index dc90d1eb0d554..d7e8297d5642b 100644
+--- a/arch/s390/mm/gmap.c
++++ b/arch/s390/mm/gmap.c
+@@ -2846,6 +2846,7 @@ int s390_replace_asce(struct gmap *gmap)
+ page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER);
+ if (!page)
+ return -ENOMEM;
++ page->index = 0;
+ table = page_to_virt(page);
+ memcpy(table, gmap->table, 1UL << (CRST_ALLOC_ORDER + PAGE_SHIFT));
+
+diff --git a/arch/um/os-Linux/sigio.c b/arch/um/os-Linux/sigio.c
+index 37d60e72cf269..9e71794839e87 100644
+--- a/arch/um/os-Linux/sigio.c
++++ b/arch/um/os-Linux/sigio.c
+@@ -3,7 +3,6 @@
+ * Copyright (C) 2002 - 2008 Jeff Dike (jdike@{addtoit,linux.intel}.com)
+ */
+
+-#include <linux/minmax.h>
+ #include <unistd.h>
+ #include <errno.h>
+ #include <fcntl.h>
+@@ -51,7 +50,7 @@ static struct pollfds all_sigio_fds;
+
+ static int write_sigio_thread(void *unused)
+ {
+- struct pollfds *fds;
++ struct pollfds *fds, tmp;
+ struct pollfd *p;
+ int i, n, respond_fd;
+ char c;
+@@ -78,7 +77,9 @@ static int write_sigio_thread(void *unused)
+ "write_sigio_thread : "
+ "read on socket failed, "
+ "err = %d\n", errno);
+- swap(current_poll, next_poll);
++ tmp = current_poll;
++ current_poll = next_poll;
++ next_poll = tmp;
+ respond_fd = sigio_private[1];
+ }
+ else {
+diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
+index 13bc212cd4bc7..e3054e3e46d52 100644
+--- a/arch/x86/include/asm/kvm-x86-ops.h
++++ b/arch/x86/include/asm/kvm-x86-ops.h
+@@ -37,6 +37,7 @@ KVM_X86_OP(get_segment)
+ KVM_X86_OP(get_cpl)
+ KVM_X86_OP(set_segment)
+ KVM_X86_OP(get_cs_db_l_bits)
++KVM_X86_OP(is_valid_cr0)
+ KVM_X86_OP(set_cr0)
+ KVM_X86_OP_OPTIONAL(post_set_cr3)
+ KVM_X86_OP(is_valid_cr4)
+diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
+index fb9d1f2d6136c..938fe9572ae10 100644
+--- a/arch/x86/include/asm/kvm_host.h
++++ b/arch/x86/include/asm/kvm_host.h
+@@ -1566,9 +1566,10 @@ struct kvm_x86_ops {
+ void (*set_segment)(struct kvm_vcpu *vcpu,
+ struct kvm_segment *var, int seg);
+ void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l);
++ bool (*is_valid_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0);
+ void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0);
+ void (*post_set_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
+- bool (*is_valid_cr4)(struct kvm_vcpu *vcpu, unsigned long cr0);
++ bool (*is_valid_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
+ void (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
+ int (*set_efer)(struct kvm_vcpu *vcpu, u64 efer);
+ void (*get_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index 182af64387d06..dbf7443c42ebd 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -1199,19 +1199,21 @@ spectre_v2_user_select_mitigation(void)
+ }
+
+ /*
+- * If no STIBP, enhanced IBRS is enabled, or SMT impossible, STIBP
++ * If no STIBP, Intel enhanced IBRS is enabled, or SMT impossible, STIBP
+ * is not required.
+ *
+- * Enhanced IBRS also protects against cross-thread branch target
++ * Intel's Enhanced IBRS also protects against cross-thread branch target
+ * injection in user-mode as the IBRS bit remains always set which
+ * implicitly enables cross-thread protections. However, in legacy IBRS
+ * mode, the IBRS bit is set only on kernel entry and cleared on return
+- * to userspace. This disables the implicit cross-thread protection,
+- * so allow for STIBP to be selected in that case.
++ * to userspace. AMD Automatic IBRS also does not protect userspace.
++ * These modes therefore disable the implicit cross-thread protection,
++ * so allow for STIBP to be selected in those cases.
+ */
+ if (!boot_cpu_has(X86_FEATURE_STIBP) ||
+ !smt_possible ||
+- spectre_v2_in_eibrs_mode(spectre_v2_enabled))
++ (spectre_v2_in_eibrs_mode(spectre_v2_enabled) &&
++ !boot_cpu_has(X86_FEATURE_AUTOIBRS)))
+ return;
+
+ /*
+@@ -2343,7 +2345,8 @@ static ssize_t mmio_stale_data_show_state(char *buf)
+
+ static char *stibp_state(void)
+ {
+- if (spectre_v2_in_eibrs_mode(spectre_v2_enabled))
++ if (spectre_v2_in_eibrs_mode(spectre_v2_enabled) &&
++ !boot_cpu_has(X86_FEATURE_AUTOIBRS))
+ return "";
+
+ switch (spectre_v2_user_stibp) {
+diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
+index 0b971f9740964..542fe6915ab78 100644
+--- a/arch/x86/kernel/cpu/mce/amd.c
++++ b/arch/x86/kernel/cpu/mce/amd.c
+@@ -1259,10 +1259,10 @@ static void __threshold_remove_blocks(struct threshold_bank *b)
+ struct threshold_block *pos = NULL;
+ struct threshold_block *tmp = NULL;
+
+- kobject_del(b->kobj);
++ kobject_put(b->kobj);
+
+ list_for_each_entry_safe(pos, tmp, &b->blocks->miscj, miscj)
+- kobject_del(&pos->kobj);
++ kobject_put(b->kobj);
+ }
+
+ static void threshold_remove_bank(struct threshold_bank *bank)
+diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
+index 58b1f208eff51..4a817d20ce3bb 100644
+--- a/arch/x86/kernel/traps.c
++++ b/arch/x86/kernel/traps.c
+@@ -697,9 +697,10 @@ static bool try_fixup_enqcmd_gp(void)
+ }
+
+ static bool gp_try_fixup_and_notify(struct pt_regs *regs, int trapnr,
+- unsigned long error_code, const char *str)
++ unsigned long error_code, const char *str,
++ unsigned long address)
+ {
+- if (fixup_exception(regs, trapnr, error_code, 0))
++ if (fixup_exception(regs, trapnr, error_code, address))
+ return true;
+
+ current->thread.error_code = error_code;
+@@ -759,7 +760,7 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
+ goto exit;
+ }
+
+- if (gp_try_fixup_and_notify(regs, X86_TRAP_GP, error_code, desc))
++ if (gp_try_fixup_and_notify(regs, X86_TRAP_GP, error_code, desc, 0))
+ goto exit;
+
+ if (error_code)
+@@ -1357,17 +1358,20 @@ DEFINE_IDTENTRY(exc_device_not_available)
+
+ #define VE_FAULT_STR "VE fault"
+
+-static void ve_raise_fault(struct pt_regs *regs, long error_code)
++static void ve_raise_fault(struct pt_regs *regs, long error_code,
++ unsigned long address)
+ {
+ if (user_mode(regs)) {
+ gp_user_force_sig_segv(regs, X86_TRAP_VE, error_code, VE_FAULT_STR);
+ return;
+ }
+
+- if (gp_try_fixup_and_notify(regs, X86_TRAP_VE, error_code, VE_FAULT_STR))
++ if (gp_try_fixup_and_notify(regs, X86_TRAP_VE, error_code,
++ VE_FAULT_STR, address)) {
+ return;
++ }
+
+- die_addr(VE_FAULT_STR, regs, error_code, 0);
++ die_addr(VE_FAULT_STR, regs, error_code, address);
+ }
+
+ /*
+@@ -1431,7 +1435,7 @@ DEFINE_IDTENTRY(exc_virtualization_exception)
+ * it successfully, treat it as #GP(0) and handle it.
+ */
+ if (!tdx_handle_virt_exception(regs, &ve))
+- ve_raise_fault(regs, 0);
++ ve_raise_fault(regs, 0, ve.gla);
+
+ cond_local_irq_disable(regs);
+ }
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index 54089f990c8f8..5a0c2d6791a0a 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -1799,6 +1799,11 @@ static void sev_post_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
+ }
+ }
+
++static bool svm_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
++{
++ return true;
++}
++
+ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
+ {
+ struct vcpu_svm *svm = to_svm(vcpu);
+@@ -4838,6 +4843,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
+ .set_segment = svm_set_segment,
+ .get_cpl = svm_get_cpl,
+ .get_cs_db_l_bits = svm_get_cs_db_l_bits,
++ .is_valid_cr0 = svm_is_valid_cr0,
+ .set_cr0 = svm_set_cr0,
+ .post_set_cr3 = sev_post_set_cr3,
+ .is_valid_cr4 = svm_is_valid_cr4,
+diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
+index 44fb619803b89..40f54cbf3f333 100644
+--- a/arch/x86/kvm/vmx/vmx.c
++++ b/arch/x86/kvm/vmx/vmx.c
+@@ -1503,6 +1503,11 @@ void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ unsigned long old_rflags;
+
++ /*
++ * Unlike CR0 and CR4, RFLAGS handling requires checking if the vCPU
++ * is an unrestricted guest in order to mark L2 as needing emulation
++ * if L1 runs L2 as a restricted guest.
++ */
+ if (is_unrestricted_guest(vcpu)) {
+ kvm_register_mark_available(vcpu, VCPU_EXREG_RFLAGS);
+ vmx->rflags = rflags;
+@@ -3040,6 +3045,15 @@ static void enter_rmode(struct kvm_vcpu *vcpu)
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm);
+
++ /*
++ * KVM should never use VM86 to virtualize Real Mode when L2 is active,
++ * as using VM86 is unnecessary if unrestricted guest is enabled, and
++ * if unrestricted guest is disabled, VM-Enter (from L1) with CR0.PG=0
++ * should VM-Fail and KVM should reject userspace attempts to stuff
++ * CR0.PG=0 when L2 is active.
++ */
++ WARN_ON_ONCE(is_guest_mode(vcpu));
++
+ vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_TR], VCPU_SREG_TR);
+ vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_ES], VCPU_SREG_ES);
+ vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_DS], VCPU_SREG_DS);
+@@ -3229,6 +3243,17 @@ void ept_save_pdptrs(struct kvm_vcpu *vcpu)
+ #define CR3_EXITING_BITS (CPU_BASED_CR3_LOAD_EXITING | \
+ CPU_BASED_CR3_STORE_EXITING)
+
++static bool vmx_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
++{
++ if (is_guest_mode(vcpu))
++ return nested_guest_cr0_valid(vcpu, cr0);
++
++ if (to_vmx(vcpu)->nested.vmxon)
++ return nested_host_cr0_valid(vcpu, cr0);
++
++ return true;
++}
++
+ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
+ {
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+@@ -3238,7 +3263,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
+ old_cr0_pg = kvm_read_cr0_bits(vcpu, X86_CR0_PG);
+
+ hw_cr0 = (cr0 & ~KVM_VM_CR0_ALWAYS_OFF);
+- if (is_unrestricted_guest(vcpu))
++ if (enable_unrestricted_guest)
+ hw_cr0 |= KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST;
+ else {
+ hw_cr0 |= KVM_VM_CR0_ALWAYS_ON;
+@@ -3266,7 +3291,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
+ }
+ #endif
+
+- if (enable_ept && !is_unrestricted_guest(vcpu)) {
++ if (enable_ept && !enable_unrestricted_guest) {
+ /*
+ * Ensure KVM has an up-to-date snapshot of the guest's CR3. If
+ * the below code _enables_ CR3 exiting, vmx_cache_reg() will
+@@ -3397,7 +3422,7 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+ unsigned long hw_cr4;
+
+ hw_cr4 = (cr4_read_shadow() & X86_CR4_MCE) | (cr4 & ~X86_CR4_MCE);
+- if (is_unrestricted_guest(vcpu))
++ if (enable_unrestricted_guest)
+ hw_cr4 |= KVM_VM_CR4_ALWAYS_ON_UNRESTRICTED_GUEST;
+ else if (vmx->rmode.vm86_active)
+ hw_cr4 |= KVM_RMODE_VM_CR4_ALWAYS_ON;
+@@ -3417,7 +3442,7 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+ vcpu->arch.cr4 = cr4;
+ kvm_register_mark_available(vcpu, VCPU_EXREG_CR4);
+
+- if (!is_unrestricted_guest(vcpu)) {
++ if (!enable_unrestricted_guest) {
+ if (enable_ept) {
+ if (!is_paging(vcpu)) {
+ hw_cr4 &= ~X86_CR4_PAE;
+@@ -5367,18 +5392,11 @@ static int handle_set_cr0(struct kvm_vcpu *vcpu, unsigned long val)
+ val = (val & ~vmcs12->cr0_guest_host_mask) |
+ (vmcs12->guest_cr0 & vmcs12->cr0_guest_host_mask);
+
+- if (!nested_guest_cr0_valid(vcpu, val))
+- return 1;
+-
+ if (kvm_set_cr0(vcpu, val))
+ return 1;
+ vmcs_writel(CR0_READ_SHADOW, orig_val);
+ return 0;
+ } else {
+- if (to_vmx(vcpu)->nested.vmxon &&
+- !nested_host_cr0_valid(vcpu, val))
+- return 1;
+-
+ return kvm_set_cr0(vcpu, val);
+ }
+ }
+@@ -8160,6 +8178,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
+ .set_segment = vmx_set_segment,
+ .get_cpl = vmx_get_cpl,
+ .get_cs_db_l_bits = vmx_get_cs_db_l_bits,
++ .is_valid_cr0 = vmx_is_valid_cr0,
+ .set_cr0 = vmx_set_cr0,
+ .is_valid_cr4 = vmx_is_valid_cr4,
+ .set_cr4 = vmx_set_cr4,
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index 04b57a336b34e..f04bed5a5aff9 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -906,6 +906,22 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
+ }
+ EXPORT_SYMBOL_GPL(load_pdptrs);
+
++static bool kvm_is_valid_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
++{
++#ifdef CONFIG_X86_64
++ if (cr0 & 0xffffffff00000000UL)
++ return false;
++#endif
++
++ if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD))
++ return false;
++
++ if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
++ return false;
++
++ return static_call(kvm_x86_is_valid_cr0)(vcpu, cr0);
++}
++
+ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned long cr0)
+ {
+ /*
+@@ -952,20 +968,13 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
+ {
+ unsigned long old_cr0 = kvm_read_cr0(vcpu);
+
+- cr0 |= X86_CR0_ET;
+-
+-#ifdef CONFIG_X86_64
+- if (cr0 & 0xffffffff00000000UL)
++ if (!kvm_is_valid_cr0(vcpu, cr0))
+ return 1;
+-#endif
+-
+- cr0 &= ~CR0_RESERVED_BITS;
+
+- if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD))
+- return 1;
++ cr0 |= X86_CR0_ET;
+
+- if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
+- return 1;
++	/* Writes to CR0 reserved bits are ignored, even on Intel. */
++ cr0 &= ~CR0_RESERVED_BITS;
+
+ #ifdef CONFIG_X86_64
+ if ((vcpu->arch.efer & EFER_LME) && !is_paging(vcpu) &&
+@@ -11461,7 +11470,8 @@ static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
+ return false;
+ }
+
+- return kvm_is_valid_cr4(vcpu, sregs->cr4);
++ return kvm_is_valid_cr4(vcpu, sregs->cr4) &&
++ kvm_is_valid_cr0(vcpu, sregs->cr0);
+ }
+
+ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
+diff --git a/block/blk-core.c b/block/blk-core.c
+index 3fc68b9444791..0434f5a8151fe 100644
+--- a/block/blk-core.c
++++ b/block/blk-core.c
+@@ -1141,8 +1141,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
+ {
+ if (!list_empty(&plug->cb_list))
+ flush_plug_callbacks(plug, from_schedule);
+- if (!rq_list_empty(plug->mq_list))
+- blk_mq_flush_plug_list(plug, from_schedule);
++ blk_mq_flush_plug_list(plug, from_schedule);
+ /*
+ * Unconditionally flush out cached requests, even if the unplug
+ * event came from schedule. Since we know hold references to the
+diff --git a/block/blk-mq.c b/block/blk-mq.c
+index 73ed8ccb09ce8..58bf41e8e66c7 100644
+--- a/block/blk-mq.c
++++ b/block/blk-mq.c
+@@ -2754,7 +2754,14 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
+ {
+ struct request *rq;
+
+- if (rq_list_empty(plug->mq_list))
++ /*
++ * We may have been called recursively midway through handling
++ * plug->mq_list via a schedule() in the driver's queue_rq() callback.
++ * To avoid mq_list changing under our feet, clear rq_count early and
++ * bail out specifically if rq_count is 0 rather than checking
++ * whether the mq_list is empty.
++ */
++ if (plug->rq_count == 0)
+ return;
+ plug->rq_count = 0;
+
+diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
+index 38fb84974f352..8a384e6cfa132 100644
+--- a/drivers/acpi/arm64/iort.c
++++ b/drivers/acpi/arm64/iort.c
+@@ -1006,9 +1006,6 @@ static void iort_node_get_rmr_info(struct acpi_iort_node *node,
+ for (i = 0; i < node->mapping_count; i++, map++) {
+ struct acpi_iort_node *parent;
+
+- if (!map->id_count)
+- continue;
+-
+ parent = ACPI_ADD_PTR(struct acpi_iort_node, iort_table,
+ map->output_reference);
+ if (parent != iommu)
+diff --git a/drivers/ata/pata_ns87415.c b/drivers/ata/pata_ns87415.c
+index d60e1f69d7b02..c697219a61a2d 100644
+--- a/drivers/ata/pata_ns87415.c
++++ b/drivers/ata/pata_ns87415.c
+@@ -260,7 +260,7 @@ static u8 ns87560_check_status(struct ata_port *ap)
+ * LOCKING:
+ * Inherited from caller.
+ */
+-void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
++static void ns87560_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
+ {
+ struct ata_ioports *ioaddr = &ap->ioaddr;
+
+diff --git a/drivers/base/power/power.h b/drivers/base/power/power.h
+index 0eb7f02b3ad59..922ed457db191 100644
+--- a/drivers/base/power/power.h
++++ b/drivers/base/power/power.h
+@@ -29,6 +29,7 @@ extern u64 pm_runtime_active_time(struct device *dev);
+ #define WAKE_IRQ_DEDICATED_MASK (WAKE_IRQ_DEDICATED_ALLOCATED | \
+ WAKE_IRQ_DEDICATED_MANAGED | \
+ WAKE_IRQ_DEDICATED_REVERSE)
++#define WAKE_IRQ_DEDICATED_ENABLED BIT(3)
+
+ struct wake_irq {
+ struct device *dev;
+diff --git a/drivers/base/power/wakeirq.c b/drivers/base/power/wakeirq.c
+index d487a6bac630f..afd094dec5ca3 100644
+--- a/drivers/base/power/wakeirq.c
++++ b/drivers/base/power/wakeirq.c
+@@ -314,8 +314,10 @@ void dev_pm_enable_wake_irq_check(struct device *dev,
+ return;
+
+ enable:
+- if (!can_change_status || !(wirq->status & WAKE_IRQ_DEDICATED_REVERSE))
++ if (!can_change_status || !(wirq->status & WAKE_IRQ_DEDICATED_REVERSE)) {
+ enable_irq(wirq->irq);
++ wirq->status |= WAKE_IRQ_DEDICATED_ENABLED;
++ }
+ }
+
+ /**
+@@ -336,8 +338,10 @@ void dev_pm_disable_wake_irq_check(struct device *dev, bool cond_disable)
+ if (cond_disable && (wirq->status & WAKE_IRQ_DEDICATED_REVERSE))
+ return;
+
+- if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED)
++ if (wirq->status & WAKE_IRQ_DEDICATED_MANAGED) {
++ wirq->status &= ~WAKE_IRQ_DEDICATED_ENABLED;
+ disable_irq_nosync(wirq->irq);
++ }
+ }
+
+ /**
+@@ -376,7 +380,7 @@ void dev_pm_arm_wake_irq(struct wake_irq *wirq)
+
+ if (device_may_wakeup(wirq->dev)) {
+ if (wirq->status & WAKE_IRQ_DEDICATED_ALLOCATED &&
+- !pm_runtime_status_suspended(wirq->dev))
++ !(wirq->status & WAKE_IRQ_DEDICATED_ENABLED))
+ enable_irq(wirq->irq);
+
+ enable_irq_wake(wirq->irq);
+@@ -399,7 +403,7 @@ void dev_pm_disarm_wake_irq(struct wake_irq *wirq)
+ disable_irq_wake(wirq->irq);
+
+ if (wirq->status & WAKE_IRQ_DEDICATED_ALLOCATED &&
+- !pm_runtime_status_suspended(wirq->dev))
++ !(wirq->status & WAKE_IRQ_DEDICATED_ENABLED))
+ disable_irq_nosync(wirq->irq);
+ }
+ }
+diff --git a/drivers/base/regmap/regmap-kunit.c b/drivers/base/regmap/regmap-kunit.c
+index f76d416881349..0b3dacc7fa424 100644
+--- a/drivers/base/regmap/regmap-kunit.c
++++ b/drivers/base/regmap/regmap-kunit.c
+@@ -58,6 +58,9 @@ static struct regmap *gen_regmap(struct regmap_config *config,
+ int i;
+ struct reg_default *defaults;
+
++ config->disable_locking = config->cache_type == REGCACHE_RBTREE ||
++ config->cache_type == REGCACHE_MAPLE;
++
+ buf = kmalloc(size, GFP_KERNEL);
+ if (!buf)
+ return ERR_PTR(-ENOMEM);
+diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
+index 632751ddb2870..5c86151b0d3a5 100644
+--- a/drivers/block/rbd.c
++++ b/drivers/block/rbd.c
+@@ -3849,51 +3849,82 @@ static void wake_lock_waiters(struct rbd_device *rbd_dev, int result)
+ list_splice_tail_init(&rbd_dev->acquiring_list, &rbd_dev->running_list);
+ }
+
+-static int get_lock_owner_info(struct rbd_device *rbd_dev,
+- struct ceph_locker **lockers, u32 *num_lockers)
++static bool locker_equal(const struct ceph_locker *lhs,
++ const struct ceph_locker *rhs)
++{
++ return lhs->id.name.type == rhs->id.name.type &&
++ lhs->id.name.num == rhs->id.name.num &&
++ !strcmp(lhs->id.cookie, rhs->id.cookie) &&
++ ceph_addr_equal_no_type(&lhs->info.addr, &rhs->info.addr);
++}
++
++static void free_locker(struct ceph_locker *locker)
++{
++ if (locker)
++ ceph_free_lockers(locker, 1);
++}
++
++static struct ceph_locker *get_lock_owner_info(struct rbd_device *rbd_dev)
+ {
+ struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
++ struct ceph_locker *lockers;
++ u32 num_lockers;
+ u8 lock_type;
+ char *lock_tag;
++ u64 handle;
+ int ret;
+
+- dout("%s rbd_dev %p\n", __func__, rbd_dev);
+-
+ ret = ceph_cls_lock_info(osdc, &rbd_dev->header_oid,
+ &rbd_dev->header_oloc, RBD_LOCK_NAME,
+- &lock_type, &lock_tag, lockers, num_lockers);
+- if (ret)
+- return ret;
++ &lock_type, &lock_tag, &lockers, &num_lockers);
++ if (ret) {
++ rbd_warn(rbd_dev, "failed to retrieve lockers: %d", ret);
++ return ERR_PTR(ret);
++ }
+
+- if (*num_lockers == 0) {
++ if (num_lockers == 0) {
+ dout("%s rbd_dev %p no lockers detected\n", __func__, rbd_dev);
++ lockers = NULL;
+ goto out;
+ }
+
+ if (strcmp(lock_tag, RBD_LOCK_TAG)) {
+ rbd_warn(rbd_dev, "locked by external mechanism, tag %s",
+ lock_tag);
+- ret = -EBUSY;
+- goto out;
++ goto err_busy;
+ }
+
+- if (lock_type == CEPH_CLS_LOCK_SHARED) {
+- rbd_warn(rbd_dev, "shared lock type detected");
+- ret = -EBUSY;
+- goto out;
++ if (lock_type != CEPH_CLS_LOCK_EXCLUSIVE) {
++ rbd_warn(rbd_dev, "incompatible lock type detected");
++ goto err_busy;
+ }
+
+- if (strncmp((*lockers)[0].id.cookie, RBD_LOCK_COOKIE_PREFIX,
+- strlen(RBD_LOCK_COOKIE_PREFIX))) {
++ WARN_ON(num_lockers != 1);
++ ret = sscanf(lockers[0].id.cookie, RBD_LOCK_COOKIE_PREFIX " %llu",
++ &handle);
++ if (ret != 1) {
+ rbd_warn(rbd_dev, "locked by external mechanism, cookie %s",
+- (*lockers)[0].id.cookie);
+- ret = -EBUSY;
+- goto out;
++ lockers[0].id.cookie);
++ goto err_busy;
+ }
++ if (ceph_addr_is_blank(&lockers[0].info.addr)) {
++ rbd_warn(rbd_dev, "locker has a blank address");
++ goto err_busy;
++ }
++
++ dout("%s rbd_dev %p got locker %s%llu@%pISpc/%u handle %llu\n",
++ __func__, rbd_dev, ENTITY_NAME(lockers[0].id.name),
++ &lockers[0].info.addr.in_addr,
++ le32_to_cpu(lockers[0].info.addr.nonce), handle);
+
+ out:
+ kfree(lock_tag);
+- return ret;
++ return lockers;
++
++err_busy:
++ kfree(lock_tag);
++ ceph_free_lockers(lockers, num_lockers);
++ return ERR_PTR(-EBUSY);
+ }
+
+ static int find_watcher(struct rbd_device *rbd_dev,
+@@ -3947,51 +3978,68 @@ out:
+ static int rbd_try_lock(struct rbd_device *rbd_dev)
+ {
+ struct ceph_client *client = rbd_dev->rbd_client->client;
+- struct ceph_locker *lockers;
+- u32 num_lockers;
++ struct ceph_locker *locker, *refreshed_locker;
+ int ret;
+
+ for (;;) {
++ locker = refreshed_locker = NULL;
++
+ ret = rbd_lock(rbd_dev);
+ if (ret != -EBUSY)
+- return ret;
++ goto out;
+
+ /* determine if the current lock holder is still alive */
+- ret = get_lock_owner_info(rbd_dev, &lockers, &num_lockers);
+- if (ret)
+- return ret;
+-
+- if (num_lockers == 0)
++ locker = get_lock_owner_info(rbd_dev);
++ if (IS_ERR(locker)) {
++ ret = PTR_ERR(locker);
++ locker = NULL;
++ goto out;
++ }
++ if (!locker)
+ goto again;
+
+- ret = find_watcher(rbd_dev, lockers);
++ ret = find_watcher(rbd_dev, locker);
+ if (ret)
+ goto out; /* request lock or error */
+
++ refreshed_locker = get_lock_owner_info(rbd_dev);
++ if (IS_ERR(refreshed_locker)) {
++ ret = PTR_ERR(refreshed_locker);
++ refreshed_locker = NULL;
++ goto out;
++ }
++ if (!refreshed_locker ||
++ !locker_equal(locker, refreshed_locker))
++ goto again;
++
+ rbd_warn(rbd_dev, "breaking header lock owned by %s%llu",
+- ENTITY_NAME(lockers[0].id.name));
++ ENTITY_NAME(locker->id.name));
+
+ ret = ceph_monc_blocklist_add(&client->monc,
+- &lockers[0].info.addr);
++ &locker->info.addr);
+ if (ret) {
+- rbd_warn(rbd_dev, "blocklist of %s%llu failed: %d",
+- ENTITY_NAME(lockers[0].id.name), ret);
++ rbd_warn(rbd_dev, "failed to blocklist %s%llu: %d",
++ ENTITY_NAME(locker->id.name), ret);
+ goto out;
+ }
+
+ ret = ceph_cls_break_lock(&client->osdc, &rbd_dev->header_oid,
+ &rbd_dev->header_oloc, RBD_LOCK_NAME,
+- lockers[0].id.cookie,
+- &lockers[0].id.name);
+- if (ret && ret != -ENOENT)
++ locker->id.cookie, &locker->id.name);
++ if (ret && ret != -ENOENT) {
++ rbd_warn(rbd_dev, "failed to break header lock: %d",
++ ret);
+ goto out;
++ }
+
+ again:
+- ceph_free_lockers(lockers, num_lockers);
++ free_locker(refreshed_locker);
++ free_locker(locker);
+ }
+
+ out:
+- ceph_free_lockers(lockers, num_lockers);
++ free_locker(refreshed_locker);
++ free_locker(locker);
+ return ret;
+ }
+
+diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
+index 33d3298a0da16..e6b6e5eee4dea 100644
+--- a/drivers/block/ublk_drv.c
++++ b/drivers/block/ublk_drv.c
+@@ -1632,7 +1632,8 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd)
+ if (ublksrv_pid <= 0)
+ return -EINVAL;
+
+- wait_for_completion_interruptible(&ub->completion);
++ if (wait_for_completion_interruptible(&ub->completion) != 0)
++ return -EINTR;
+
+ schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
+
+@@ -1908,8 +1909,8 @@ static int ublk_ctrl_del_dev(struct ublk_device **p_ub)
+ * - the device number is freed already, we will not find this
+ * device via ublk_get_device_from_id()
+ */
+- wait_event_interruptible(ublk_idr_wq, ublk_idr_freed(idx));
+-
++ if (wait_event_interruptible(ublk_idr_wq, ublk_idr_freed(idx)))
++ return -EINTR;
+ return 0;
+ }
+
+@@ -2106,7 +2107,9 @@ static int ublk_ctrl_end_recovery(struct ublk_device *ub,
+ pr_devel("%s: Waiting for new ubq_daemons(nr: %d) are ready, dev id %d...\n",
+ __func__, ub->dev_info.nr_hw_queues, header->dev_id);
+ /* wait until new ubq_daemon sending all FETCH_REQ */
+- wait_for_completion_interruptible(&ub->completion);
++ if (wait_for_completion_interruptible(&ub->completion))
++ return -EINTR;
++
+ pr_devel("%s: All new ubq_daemons(nr: %d) are ready, dev id %d\n",
+ __func__, ub->dev_info.nr_hw_queues, header->dev_id);
+
+diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
+index 88a5384c09c02..b95963095729a 100644
+--- a/drivers/char/tpm/tpm_tis_core.c
++++ b/drivers/char/tpm/tpm_tis_core.c
+@@ -366,8 +366,13 @@ static int tpm_tis_recv(struct tpm_chip *chip, u8 *buf, size_t count)
+ goto out;
+ }
+
+- size += recv_data(chip, &buf[TPM_HEADER_SIZE],
+- expected - TPM_HEADER_SIZE);
++ rc = recv_data(chip, &buf[TPM_HEADER_SIZE],
++ expected - TPM_HEADER_SIZE);
++ if (rc < 0) {
++ size = rc;
++ goto out;
++ }
++ size += rc;
+ if (size < expected) {
+ dev_err(&chip->dev, "Unable to read remainder of result\n");
+ size = -ETIME;
+diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
+index 7e1765b09e04a..8757bf886207b 100644
+--- a/drivers/cxl/acpi.c
++++ b/drivers/cxl/acpi.c
+@@ -296,9 +296,8 @@ err_xormap:
+ else
+ rc = cxl_decoder_autoremove(dev, cxld);
+ if (rc) {
+- dev_err(dev, "Failed to add decode range [%#llx - %#llx]\n",
+- cxld->hpa_range.start, cxld->hpa_range.end);
+- return 0;
++ dev_err(dev, "Failed to add decode range: %pr", res);
++ return rc;
+ }
+ dev_dbg(dev, "add: %s node: %d range [%#llx - %#llx]\n",
+ dev_name(&cxld->dev),
+diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c
+index 7002bca792ff0..c625bb2b5d563 100644
+--- a/drivers/dma-buf/dma-fence-unwrap.c
++++ b/drivers/dma-buf/dma-fence-unwrap.c
+@@ -66,18 +66,36 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
+ {
+ struct dma_fence_array *result;
+ struct dma_fence *tmp, **array;
++ ktime_t timestamp;
+ unsigned int i;
+ size_t count;
+
+ count = 0;
++ timestamp = ns_to_ktime(0);
+ for (i = 0; i < num_fences; ++i) {
+- dma_fence_unwrap_for_each(tmp, &iter[i], fences[i])
+- if (!dma_fence_is_signaled(tmp))
++ dma_fence_unwrap_for_each(tmp, &iter[i], fences[i]) {
++ if (!dma_fence_is_signaled(tmp)) {
+ ++count;
++ } else if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
++ &tmp->flags)) {
++ if (ktime_after(tmp->timestamp, timestamp))
++ timestamp = tmp->timestamp;
++ } else {
++ /*
++ * Use the current time if the fence is
++ * currently signaling.
++ */
++ timestamp = ktime_get();
++ }
++ }
+ }
+
++ /*
++ * If we couldn't find a pending fence just return a private signaled
++ * fence with the timestamp of the last signaled one.
++ */
+ if (count == 0)
+- return dma_fence_get_stub();
++ return dma_fence_allocate_private_stub(timestamp);
+
+ array = kmalloc_array(count, sizeof(*array), GFP_KERNEL);
+ if (!array)
+@@ -138,7 +156,7 @@ restart:
+ } while (tmp);
+
+ if (count == 0) {
+- tmp = dma_fence_get_stub();
++ tmp = dma_fence_allocate_private_stub(ktime_get());
+ goto return_tmp;
+ }
+
+diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
+index f177c56269bb0..8aa8f8cb7071e 100644
+--- a/drivers/dma-buf/dma-fence.c
++++ b/drivers/dma-buf/dma-fence.c
+@@ -150,16 +150,17 @@ EXPORT_SYMBOL(dma_fence_get_stub);
+
+ /**
+ * dma_fence_allocate_private_stub - return a private, signaled fence
++ * @timestamp: timestamp when the fence was signaled
+ *
+ * Return a newly allocated and signaled stub fence.
+ */
+-struct dma_fence *dma_fence_allocate_private_stub(void)
++struct dma_fence *dma_fence_allocate_private_stub(ktime_t timestamp)
+ {
+ struct dma_fence *fence;
+
+ fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+ if (fence == NULL)
+- return ERR_PTR(-ENOMEM);
++ return NULL;
+
+ dma_fence_init(fence,
+ &dma_fence_stub_ops,
+@@ -169,7 +170,7 @@ struct dma_fence *dma_fence_allocate_private_stub(void)
+ set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+ &fence->flags);
+
+- dma_fence_signal(fence);
++ dma_fence_signal_timestamp(fence, timestamp);
+
+ return fence;
+ }
+diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c
+index a68f682aec012..67497116ce27d 100644
+--- a/drivers/gpio/gpio-mvebu.c
++++ b/drivers/gpio/gpio-mvebu.c
+@@ -874,7 +874,7 @@ static int mvebu_pwm_probe(struct platform_device *pdev,
+
+ spin_lock_init(&mvpwm->lock);
+
+- return pwmchip_add(&mvpwm->chip);
++ return devm_pwmchip_add(dev, &mvpwm->chip);
+ }
+
+ #ifdef CONFIG_DEBUG_FS
+@@ -1112,6 +1112,13 @@ static int mvebu_gpio_probe_syscon(struct platform_device *pdev,
+ return 0;
+ }
+
++static void mvebu_gpio_remove_irq_domain(void *data)
++{
++ struct irq_domain *domain = data;
++
++ irq_domain_remove(domain);
++}
++
+ static int mvebu_gpio_probe(struct platform_device *pdev)
+ {
+ struct mvebu_gpio_chip *mvchip;
+@@ -1243,17 +1250,21 @@ static int mvebu_gpio_probe(struct platform_device *pdev)
+ if (!mvchip->domain) {
+ dev_err(&pdev->dev, "couldn't allocate irq domain %s (DT).\n",
+ mvchip->chip.label);
+- err = -ENODEV;
+- goto err_pwm;
++ return -ENODEV;
+ }
+
++ err = devm_add_action_or_reset(&pdev->dev, mvebu_gpio_remove_irq_domain,
++ mvchip->domain);
++ if (err)
++ return err;
++
+ err = irq_alloc_domain_generic_chips(
+ mvchip->domain, ngpios, 2, np->name, handle_level_irq,
+ IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_LEVEL, 0, 0);
+ if (err) {
+ dev_err(&pdev->dev, "couldn't allocate irq chips %s (DT).\n",
+ mvchip->chip.label);
+- goto err_domain;
++ return err;
+ }
+
+ /*
+@@ -1293,13 +1304,6 @@ static int mvebu_gpio_probe(struct platform_device *pdev)
+ }
+
+ return 0;
+-
+-err_domain:
+- irq_domain_remove(mvchip->domain);
+-err_pwm:
+- pwmchip_remove(&mvchip->mvpwm->chip);
+-
+- return err;
+ }
+
+ static struct platform_driver mvebu_gpio_driver = {
+diff --git a/drivers/gpio/gpio-tps68470.c b/drivers/gpio/gpio-tps68470.c
+index aaddcabe9b359..532deaddfd4e2 100644
+--- a/drivers/gpio/gpio-tps68470.c
++++ b/drivers/gpio/gpio-tps68470.c
+@@ -91,13 +91,13 @@ static int tps68470_gpio_output(struct gpio_chip *gc, unsigned int offset,
+ struct tps68470_gpio_data *tps68470_gpio = gpiochip_get_data(gc);
+ struct regmap *regmap = tps68470_gpio->tps68470_regmap;
+
++ /* Set the initial value */
++ tps68470_gpio_set(gc, offset, value);
++
+ /* rest are always outputs */
+ if (offset >= TPS68470_N_REGULAR_GPIO)
+ return 0;
+
+- /* Set the initial value */
+- tps68470_gpio_set(gc, offset, value);
+-
+ return regmap_update_bits(regmap, TPS68470_GPIO_CTL_REG_A(offset),
+ TPS68470_GPIO_MODE_MASK,
+ TPS68470_GPIO_MODE_OUT_CMOS);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+index 02b827785e399..129081ffa0a5f 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+@@ -1246,6 +1246,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
+ void amdgpu_device_pci_config_reset(struct amdgpu_device *adev);
+ int amdgpu_device_pci_reset(struct amdgpu_device *adev);
+ bool amdgpu_device_need_post(struct amdgpu_device *adev);
++bool amdgpu_device_pcie_dynamic_switching_supported(void);
+ bool amdgpu_device_should_use_aspm(struct amdgpu_device *adev);
+ bool amdgpu_device_aspm_support_quirk(void);
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+index 5c7d40873ee20..167b2a1c416eb 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+@@ -1352,6 +1352,25 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
+ return true;
+ }
+
++/*
++ * Intel hosts such as Raptor Lake and Sapphire Rapids don't support dynamic
++ * speed switching. Until we have confirmation from Intel that a specific host
++ * supports it, it's safer that we keep it disabled for all.
++ *
++ * https://edc.intel.com/content/www/us/en/design/products/platforms/details/raptor-lake-s/13th-generation-core-processors-datasheet-volume-1-of-2/005/pci-express-support/
++ * https://gitlab.freedesktop.org/drm/amd/-/issues/2663
++ */
++bool amdgpu_device_pcie_dynamic_switching_supported(void)
++{
++#if IS_ENABLED(CONFIG_X86)
++ struct cpuinfo_x86 *c = &cpu_data(0);
++
++ if (c->x86_vendor == X86_VENDOR_INTEL)
++ return false;
++#endif
++ return true;
++}
++
+ /**
+ * amdgpu_device_should_use_aspm - check if the device should program ASPM
+ *
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+index e4757a2807d9a..db820331f2c61 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+@@ -491,11 +491,11 @@ static int psp_sw_init(void *handle)
+ return 0;
+
+ failed2:
+- amdgpu_bo_free_kernel(&psp->fw_pri_bo,
+- &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
+-failed1:
+ amdgpu_bo_free_kernel(&psp->fence_buf_bo,
+ &psp->fence_buf_mc_addr, &psp->fence_buf);
++failed1:
++ amdgpu_bo_free_kernel(&psp->fw_pri_bo,
++ &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
+ return ret;
+ }
+
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+index 888e80f498e97..9bc86deac9e8e 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+@@ -706,7 +706,7 @@ void dm_handle_mst_sideband_msg_ready_event(
+
+ if (retry == 3) {
+ DRM_ERROR("Failed to ack MST event.\n");
+- return;
++ break;
+ }
+
+ drm_dp_mst_hpd_irq_send_new_request(&aconnector->mst_mgr);
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
+index d647f68fd5630..4f61d4f257cd7 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
+@@ -24,6 +24,7 @@
+ */
+
+ #include "amdgpu_dm_psr.h"
++#include "dc_dmub_srv.h"
+ #include "dc.h"
+ #include "dm_helpers.h"
+ #include "amdgpu_dm.h"
+@@ -50,7 +51,7 @@ static bool link_supports_psrsu(struct dc_link *link)
+ !link->dpcd_caps.psr_info.psr2_su_y_granularity_cap)
+ return false;
+
+- return true;
++ return dc_dmub_check_min_version(dc->ctx->dmub_srv->dmub);
+ }
+
+ /*
+diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
+index 6eace83c9c6f5..d22095a3a265a 100644
+--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
++++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
+@@ -2626,8 +2626,11 @@ static enum surface_update_type check_update_surfaces_for_stream(
+
+ if (stream_update->mst_bw_update)
+ su_flags->bits.mst_bw = 1;
+- if (stream_update->crtc_timing_adjust && dc_extended_blank_supported(dc))
+- su_flags->bits.crtc_timing_adjust = 1;
++
++ if (stream_update->stream && stream_update->stream->freesync_on_desktop &&
++ (stream_update->vrr_infopacket || stream_update->allow_freesync ||
++ stream_update->vrr_active_variable))
++ su_flags->bits.fams_changed = 1;
+
+ if (su_flags->raw != 0)
+ overall_type = UPDATE_TYPE_FULL;
+@@ -4894,21 +4897,3 @@ void dc_notify_vsync_int_state(struct dc *dc, struct dc_stream_state *stream, bo
+ if (pipe->stream_res.abm && pipe->stream_res.abm->funcs->set_abm_pause)
+ pipe->stream_res.abm->funcs->set_abm_pause(pipe->stream_res.abm, !enable, i, pipe->stream_res.tg->inst);
+ }
+-
+-/**
+- * dc_extended_blank_supported - Decide whether extended blank is supported
+- *
+- * @dc: [in] Current DC state
+- *
+- * Extended blank is a freesync optimization feature to be enabled in the
+- * future. During the extra vblank period gained from freesync, we have the
+- * ability to enter z9/z10.
+- *
+- * Return:
+- * Indicate whether extended blank is supported (%true or %false)
+- */
+-bool dc_extended_blank_supported(struct dc *dc)
+-{
+- return dc->debug.extended_blank_optimization && !dc->debug.disable_z10
+- && dc->caps.zstate_support && dc->caps.is_apu;
+-}
+diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h
+index 4d93ca9c627b0..9279990e43694 100644
+--- a/drivers/gpu/drm/amd/display/dc/dc.h
++++ b/drivers/gpu/drm/amd/display/dc/dc.h
+@@ -855,7 +855,6 @@ struct dc_debug_options {
+ bool force_usr_allow;
+ /* uses value at boot and disables switch */
+ bool disable_dtb_ref_clk_switch;
+- uint32_t fixed_vs_aux_delay_config_wa;
+ bool extended_blank_optimization;
+ union aux_wake_wa_options aux_wake_wa;
+ uint32_t mst_start_top_delay;
+@@ -2126,8 +2125,6 @@ struct dc_sink_init_data {
+ bool converter_disable_audio;
+ };
+
+-bool dc_extended_blank_supported(struct dc *dc);
+-
+ struct dc_sink *dc_sink_create(const struct dc_sink_init_data *init_params);
+
+ /* Newer interfaces */
+diff --git a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
+index a9b9490a532c2..ab4542b57b9a3 100644
+--- a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
++++ b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
+@@ -1079,3 +1079,10 @@ void dc_send_update_cursor_info_to_dmu(
+ dc_send_cmd_to_dmu(pCtx->stream->ctx->dmub_srv, &cmd);
+ }
+ }
++
++bool dc_dmub_check_min_version(struct dmub_srv *srv)
++{
++ if (!srv->hw_funcs.is_psrsu_supported)
++ return true;
++ return srv->hw_funcs.is_psrsu_supported(srv);
++}
+diff --git a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h
+index d34f5563df2ec..9a248ced03b9c 100644
+--- a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h
++++ b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h
+@@ -89,4 +89,5 @@ void dc_dmub_setup_subvp_dmub_command(struct dc *dc, struct dc_state *context, b
+ void dc_dmub_srv_log_diagnostic_data(struct dc_dmub_srv *dc_dmub_srv);
+
+ void dc_send_update_cursor_info_to_dmu(struct pipe_ctx *pCtx, uint8_t pipe_idx);
++bool dc_dmub_check_min_version(struct dmub_srv *srv);
+ #endif /* _DMUB_DC_SRV_H_ */
+diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h b/drivers/gpu/drm/amd/display/dc/dc_stream.h
+index 25284006019c3..270282fbda4ab 100644
+--- a/drivers/gpu/drm/amd/display/dc/dc_stream.h
++++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h
+@@ -131,6 +131,7 @@ union stream_update_flags {
+ uint32_t dsc_changed : 1;
+ uint32_t mst_bw : 1;
+ uint32_t crtc_timing_adjust : 1;
++ uint32_t fams_changed : 1;
+ } bits;
+
+ uint32_t raw;
+diff --git a/drivers/gpu/drm/amd/display/dc/dc_types.h b/drivers/gpu/drm/amd/display/dc/dc_types.h
+index 45ab48fe5d004..139a77acd5d02 100644
+--- a/drivers/gpu/drm/amd/display/dc/dc_types.h
++++ b/drivers/gpu/drm/amd/display/dc/dc_types.h
+@@ -196,6 +196,7 @@ struct dc_panel_patch {
+ unsigned int disable_fams;
+ unsigned int skip_avmute;
+ unsigned int mst_start_top_delay;
++ unsigned int delay_disable_aux_intercept_ms;
+ };
+
+ struct dc_edid_caps {
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+index c38be3c6c234e..a621b6a27c1fc 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+@@ -2128,7 +2128,7 @@ void dcn20_optimize_bandwidth(
+ dc->clk_mgr,
+ context,
+ true);
+- if (dc_extended_blank_supported(dc) && context->bw_ctx.bw.dcn.clk.zstate_support == DCN_ZSTATE_SUPPORT_ALLOW) {
++ if (context->bw_ctx.bw.dcn.clk.zstate_support == DCN_ZSTATE_SUPPORT_ALLOW) {
+ for (i = 0; i < dc->res_pool->pipe_count; ++i) {
+ struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
+
+@@ -2136,7 +2136,7 @@ void dcn20_optimize_bandwidth(
+ && pipe_ctx->stream->adjust.v_total_min == pipe_ctx->stream->adjust.v_total_max
+ && pipe_ctx->stream->adjust.v_total_max > pipe_ctx->stream->timing.v_total)
+ pipe_ctx->plane_res.hubp->funcs->program_extended_blank(pipe_ctx->plane_res.hubp,
+- pipe_ctx->dlg_regs.optimized_min_dst_y_next_start);
++ pipe_ctx->dlg_regs.min_dst_y_next_start);
+ }
+ }
+ }
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c
+index c95f000b63b28..34b08d90dc1da 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c
+@@ -301,7 +301,12 @@ static void optc3_wait_drr_doublebuffer_pending_clear(struct timing_generator *o
+
+ void optc3_set_vtotal_min_max(struct timing_generator *optc, int vtotal_min, int vtotal_max)
+ {
+- optc1_set_vtotal_min_max(optc, vtotal_min, vtotal_max);
++ struct dc *dc = optc->ctx->dc;
++
++ if (dc->caps.dmub_caps.mclk_sw && !dc->debug.disable_fams)
++ dc_dmub_srv_drr_update_cmd(dc, optc->inst, vtotal_min, vtotal_max);
++ else
++ optc1_set_vtotal_min_max(optc, vtotal_min, vtotal_max);
+ }
+
+ void optc3_tg_init(struct timing_generator *optc)
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c
+index 7e7cd5b64e6a1..7445ed27852a1 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c
+@@ -103,6 +103,7 @@ static void dcn31_program_det_size(struct hubbub *hubbub, int hubp_inst, unsigne
+ default:
+ break;
+ }
++ DC_LOG_DEBUG("Set DET%d to %d segments\n", hubp_inst, det_size_segments);
+ /* Should never be hit, if it is we have an erroneous hw config*/
+ ASSERT(hubbub2->det0_size + hubbub2->det1_size + hubbub2->det2_size
+ + hubbub2->det3_size + hubbub2->compbuf_size_segments <= hubbub2->crb_size_segs);
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
+index 41c972c8eb198..ae99b2851e019 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c
+@@ -136,6 +136,9 @@
+
+ #define DCN3_15_MAX_DET_SIZE 384
+ #define DCN3_15_CRB_SEGMENT_SIZE_KB 64
++#define DCN3_15_MAX_DET_SEGS (DCN3_15_MAX_DET_SIZE / DCN3_15_CRB_SEGMENT_SIZE_KB)
++/* Minimum 2 extra segments need to be in compbuf and claimable to guarantee seamless mpo transitions */
++#define MIN_RESERVED_DET_SEGS 2
+
+ enum dcn31_clk_src_array_id {
+ DCN31_CLK_SRC_PLL0,
+@@ -1636,21 +1639,61 @@ static bool is_dual_plane(enum surface_pixel_format format)
+ return format >= SURFACE_PIXEL_FORMAT_VIDEO_BEGIN || format == SURFACE_PIXEL_FORMAT_GRPH_RGBE_ALPHA;
+ }
+
++static int source_format_to_bpp (enum source_format_class SourcePixelFormat)
++{
++ if (SourcePixelFormat == dm_444_64)
++ return 8;
++ else if (SourcePixelFormat == dm_444_16 || SourcePixelFormat == dm_444_16)
++ return 2;
++ else if (SourcePixelFormat == dm_444_8)
++ return 1;
++ else if (SourcePixelFormat == dm_rgbe_alpha)
++ return 5;
++ else if (SourcePixelFormat == dm_420_8)
++ return 3;
++ else if (SourcePixelFormat == dm_420_12)
++ return 6;
++ else
++ return 4;
++}
++
++static bool allow_pixel_rate_crb(struct dc *dc, struct dc_state *context)
++{
++ int i;
++ struct resource_context *res_ctx = &context->res_ctx;
++
++ /*Don't apply for single stream*/
++ if (context->stream_count < 2)
++ return false;
++
++ for (i = 0; i < dc->res_pool->pipe_count; i++) {
++ if (!res_ctx->pipe_ctx[i].stream)
++ continue;
++
++ /*Don't apply if MPO to avoid transition issues*/
++ if (res_ctx->pipe_ctx[i].top_pipe && res_ctx->pipe_ctx[i].top_pipe->plane_state != res_ctx->pipe_ctx[i].plane_state)
++ return false;
++ }
++ return true;
++}
++
+ static int dcn315_populate_dml_pipes_from_context(
+ struct dc *dc, struct dc_state *context,
+ display_e2e_pipe_params_st *pipes,
+ bool fast_validate)
+ {
+- int i, pipe_cnt;
++ int i, pipe_cnt, crb_idx, crb_pipes;
+ struct resource_context *res_ctx = &context->res_ctx;
+ struct pipe_ctx *pipe;
+ const int max_usable_det = context->bw_ctx.dml.ip.config_return_buffer_size_in_kbytes - DCN3_15_MIN_COMPBUF_SIZE_KB;
++ int remaining_det_segs = max_usable_det / DCN3_15_CRB_SEGMENT_SIZE_KB;
++ bool pixel_rate_crb = allow_pixel_rate_crb(dc, context);
+
+ DC_FP_START();
+ dcn31x_populate_dml_pipes_from_context(dc, context, pipes, fast_validate);
+ DC_FP_END();
+
+- for (i = 0, pipe_cnt = 0; i < dc->res_pool->pipe_count; i++) {
++ for (i = 0, pipe_cnt = 0, crb_pipes = 0; i < dc->res_pool->pipe_count; i++) {
+ struct dc_crtc_timing *timing;
+
+ if (!res_ctx->pipe_ctx[i].stream)
+@@ -1671,6 +1714,23 @@ static int dcn315_populate_dml_pipes_from_context(
+ pipes[pipe_cnt].dout.dsc_input_bpc = 0;
+ DC_FP_START();
+ dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
++ if (pixel_rate_crb && !pipe->top_pipe && !pipe->prev_odm_pipe) {
++ int bpp = source_format_to_bpp(pipes[pipe_cnt].pipe.src.source_format);
++ /* Ceil to crb segment size */
++ int approx_det_segs_required_for_pstate = dcn_get_approx_det_segs_required_for_pstate(
++ &context->bw_ctx.dml.soc, timing->pix_clk_100hz, bpp, DCN3_15_CRB_SEGMENT_SIZE_KB);
++ if (approx_det_segs_required_for_pstate <= 2 * DCN3_15_MAX_DET_SEGS) {
++ bool split_required = approx_det_segs_required_for_pstate > DCN3_15_MAX_DET_SEGS;
++ split_required = split_required || timing->pix_clk_100hz >= dcn_get_max_non_odm_pix_rate_100hz(&dc->dml.soc);
++ split_required = split_required || (pipe->plane_state && pipe->plane_state->src_rect.width > 5120);
++ if (split_required)
++ approx_det_segs_required_for_pstate += approx_det_segs_required_for_pstate % 2;
++ pipes[pipe_cnt].pipe.src.det_size_override = approx_det_segs_required_for_pstate;
++ remaining_det_segs -= approx_det_segs_required_for_pstate;
++ } else
++ remaining_det_segs = -1;
++ crb_pipes++;
++ }
+ DC_FP_END();
+
+ if (pipes[pipe_cnt].dout.dsc_enable) {
+@@ -1689,16 +1749,54 @@ static int dcn315_populate_dml_pipes_from_context(
+ break;
+ }
+ }
+-
+ pipe_cnt++;
+ }
+
++ /* Spread remaining unreserved crb evenly among all pipes*/
++ if (pixel_rate_crb) {
++ for (i = 0, pipe_cnt = 0, crb_idx = 0; i < dc->res_pool->pipe_count; i++) {
++ pipe = &res_ctx->pipe_ctx[i];
++ if (!pipe->stream)
++ continue;
++
++			/* Do not use asymmetric crb if not enough for pstate support */
++ if (remaining_det_segs < 0) {
++ pipes[pipe_cnt].pipe.src.det_size_override = 0;
++ continue;
++ }
++
++ if (!pipe->top_pipe && !pipe->prev_odm_pipe) {
++ bool split_required = pipe->stream->timing.pix_clk_100hz >= dcn_get_max_non_odm_pix_rate_100hz(&dc->dml.soc)
++ || (pipe->plane_state && pipe->plane_state->src_rect.width > 5120);
++
++ if (remaining_det_segs > MIN_RESERVED_DET_SEGS)
++ pipes[pipe_cnt].pipe.src.det_size_override += (remaining_det_segs - MIN_RESERVED_DET_SEGS) / crb_pipes +
++ (crb_idx < (remaining_det_segs - MIN_RESERVED_DET_SEGS) % crb_pipes ? 1 : 0);
++ if (pipes[pipe_cnt].pipe.src.det_size_override > 2 * DCN3_15_MAX_DET_SEGS) {
++ /* Clamp to 2 pipe split max det segments */
++ remaining_det_segs += pipes[pipe_cnt].pipe.src.det_size_override - 2 * (DCN3_15_MAX_DET_SEGS);
++ pipes[pipe_cnt].pipe.src.det_size_override = 2 * DCN3_15_MAX_DET_SEGS;
++ }
++ if (pipes[pipe_cnt].pipe.src.det_size_override > DCN3_15_MAX_DET_SEGS || split_required) {
++ /* If we are splitting we must have an even number of segments */
++ remaining_det_segs += pipes[pipe_cnt].pipe.src.det_size_override % 2;
++ pipes[pipe_cnt].pipe.src.det_size_override -= pipes[pipe_cnt].pipe.src.det_size_override % 2;
++ }
++ /* Convert segments into size for DML use */
++ pipes[pipe_cnt].pipe.src.det_size_override *= DCN3_15_CRB_SEGMENT_SIZE_KB;
++
++ crb_idx++;
++ }
++ pipe_cnt++;
++ }
++ }
++
+ if (pipe_cnt)
+ context->bw_ctx.dml.ip.det_buffer_size_kbytes =
+ (max_usable_det / DCN3_15_CRB_SEGMENT_SIZE_KB / pipe_cnt) * DCN3_15_CRB_SEGMENT_SIZE_KB;
+ if (context->bw_ctx.dml.ip.det_buffer_size_kbytes > DCN3_15_MAX_DET_SIZE)
+ context->bw_ctx.dml.ip.det_buffer_size_kbytes = DCN3_15_MAX_DET_SIZE;
+- ASSERT(context->bw_ctx.dml.ip.det_buffer_size_kbytes >= DCN3_15_DEFAULT_DET_SIZE);
++
+ dc->config.enable_4to1MPC = false;
+ if (pipe_cnt == 1 && pipe->plane_state && !dc->debug.disable_z9_mpc) {
+ if (is_dual_plane(pipe->plane_state->format)
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
+index f1c1a4b5fcac3..7661f8946aa31 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
+@@ -948,10 +948,10 @@ static enum dcn_zstate_support_state decide_zstate_support(struct dc *dc, struc
+ {
+ int plane_count;
+ int i;
+- unsigned int optimized_min_dst_y_next_start_us;
++ unsigned int min_dst_y_next_start_us;
+
+ plane_count = 0;
+- optimized_min_dst_y_next_start_us = 0;
++ min_dst_y_next_start_us = 0;
+ for (i = 0; i < dc->res_pool->pipe_count; i++) {
+ if (context->res_ctx.pipe_ctx[i].plane_state)
+ plane_count++;
+@@ -973,19 +973,18 @@ static enum dcn_zstate_support_state decide_zstate_support(struct dc *dc, struc
+ else if (context->stream_count == 1 && context->streams[0]->signal == SIGNAL_TYPE_EDP) {
+ struct dc_link *link = context->streams[0]->sink->link;
+ struct dc_stream_status *stream_status = &context->stream_status[0];
++ struct dc_stream_state *current_stream = context->streams[0];
+ int minmum_z8_residency = dc->debug.minimum_z8_residency_time > 0 ? dc->debug.minimum_z8_residency_time : 1000;
+ bool allow_z8 = context->bw_ctx.dml.vba.StutterPeriod > (double)minmum_z8_residency;
+ bool is_pwrseq0 = link->link_index == 0;
++ bool isFreesyncVideo;
+
+- if (dc_extended_blank_supported(dc)) {
+- for (i = 0; i < dc->res_pool->pipe_count; i++) {
+- if (context->res_ctx.pipe_ctx[i].stream == context->streams[0]
+- && context->res_ctx.pipe_ctx[i].stream->adjust.v_total_min == context->res_ctx.pipe_ctx[i].stream->adjust.v_total_max
+- && context->res_ctx.pipe_ctx[i].stream->adjust.v_total_min > context->res_ctx.pipe_ctx[i].stream->timing.v_total) {
+- optimized_min_dst_y_next_start_us =
+- context->res_ctx.pipe_ctx[i].dlg_regs.optimized_min_dst_y_next_start_us;
+- break;
+- }
++ isFreesyncVideo = current_stream->adjust.v_total_min == current_stream->adjust.v_total_max;
++ isFreesyncVideo = isFreesyncVideo && current_stream->timing.v_total < current_stream->adjust.v_total_min;
++ for (i = 0; i < dc->res_pool->pipe_count; i++) {
++ if (context->res_ctx.pipe_ctx[i].stream == current_stream && isFreesyncVideo) {
++ min_dst_y_next_start_us = context->res_ctx.pipe_ctx[i].dlg_regs.min_dst_y_next_start_us;
++ break;
+ }
+ }
+
+@@ -993,7 +992,7 @@ static enum dcn_zstate_support_state decide_zstate_support(struct dc *dc, struc
+ if (stream_status->plane_count > 1)
+ return DCN_ZSTATE_SUPPORT_DISALLOW;
+
+- if (is_pwrseq0 && (context->bw_ctx.dml.vba.StutterPeriod > 5000.0 || optimized_min_dst_y_next_start_us > 5000))
++ if (is_pwrseq0 && (context->bw_ctx.dml.vba.StutterPeriod > 5000.0 || min_dst_y_next_start_us > 5000))
+ return DCN_ZSTATE_SUPPORT_ALLOW;
+ else if (is_pwrseq0 && link->psr_settings.psr_version == DC_PSR_VERSION_1 && !link->panel_config.psr.disable_psr)
+ return allow_z8 ? DCN_ZSTATE_SUPPORT_ALLOW_Z8_Z10_ONLY : DCN_ZSTATE_SUPPORT_ALLOW_Z10_ONLY;
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
+index 59836570603ac..19d034341e640 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
+@@ -483,7 +483,7 @@ void dcn31_calculate_wm_and_dlg_fp(
+ int pipe_cnt,
+ int vlevel)
+ {
+- int i, pipe_idx, active_hubp_count = 0;
++ int i, pipe_idx, total_det = 0, active_hubp_count = 0;
+ double dcfclk = context->bw_ctx.dml.vba.DCFCLKState[vlevel][context->bw_ctx.dml.vba.maxMpcComb];
+
+ dc_assert_fp_enabled();
+@@ -563,6 +563,18 @@ void dcn31_calculate_wm_and_dlg_fp(
+ if (context->res_ctx.pipe_ctx[i].stream)
+ context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz = 0;
+ }
++ for (i = 0, pipe_idx = 0; i < dc->res_pool->pipe_count; i++) {
++ if (!context->res_ctx.pipe_ctx[i].stream)
++ continue;
++
++ context->res_ctx.pipe_ctx[i].det_buffer_size_kb =
++ get_det_buffer_size_kbytes(&context->bw_ctx.dml, pipes, pipe_cnt, pipe_idx);
++ if (context->res_ctx.pipe_ctx[i].det_buffer_size_kb > 384)
++ context->res_ctx.pipe_ctx[i].det_buffer_size_kb /= 2;
++ total_det += context->res_ctx.pipe_ctx[i].det_buffer_size_kb;
++ pipe_idx++;
++ }
++ context->bw_ctx.bw.dcn.compbuf_size_kb = context->bw_ctx.dml.ip.config_return_buffer_size_in_kbytes - total_det;
+ }
+
+ void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params)
+@@ -815,3 +827,14 @@ int dcn_get_max_non_odm_pix_rate_100hz(struct _vcs_dpi_soc_bounding_box_st *soc)
+ {
+ return soc->clock_limits[0].dispclk_mhz * 10000.0 / (1.0 + soc->dcn_downspread_percent / 100.0);
+ }
++
++int dcn_get_approx_det_segs_required_for_pstate(
++ struct _vcs_dpi_soc_bounding_box_st *soc,
++ int pix_clk_100hz, int bpp, int seg_size_kb)
++{
++ /* Roughly calculate required crb to hide latency. In practice there is slightly
++ * more buffer available for latency hiding
++ */
++ return (int)(soc->dram_clock_change_latency_us * pix_clk_100hz * bpp
++ / 10240000 + seg_size_kb - 1) / seg_size_kb;
++}
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
+index 687d3522cc33e..8f9c8faed2605 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h
+@@ -47,6 +47,9 @@ void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params
+ void dcn315_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params);
+ void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_params);
+ int dcn_get_max_non_odm_pix_rate_100hz(struct _vcs_dpi_soc_bounding_box_st *soc);
++int dcn_get_approx_det_segs_required_for_pstate(
++ struct _vcs_dpi_soc_bounding_box_st *soc,
++ int pix_clk_100hz, int bpp, int seg_size_kb);
+
+ int dcn31x_populate_dml_pipes_from_context(struct dc *dc,
+ struct dc_state *context,
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+index bd674dc30df33..a0f44eef7763f 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+@@ -532,7 +532,8 @@ static void CalculateStutterEfficiency(
+ static void CalculateSwathAndDETConfiguration(
+ bool ForceSingleDPP,
+ int NumberOfActivePlanes,
+- unsigned int DETBufferSizeInKByte,
++ bool DETSharedByAllDPP,
++ unsigned int DETBufferSizeInKByte[],
+ double MaximumSwathWidthLuma[],
+ double MaximumSwathWidthChroma[],
+ enum scan_direction_class SourceScan[],
+@@ -3118,7 +3119,7 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
+ v->SurfaceWidthC[k],
+ v->SurfaceHeightY[k],
+ v->SurfaceHeightC[k],
+- v->DETBufferSizeInKByte[0] * 1024,
++ v->DETBufferSizeInKByte[k] * 1024,
+ v->BlockHeight256BytesY[k],
+ v->BlockHeight256BytesC[k],
+ v->SurfaceTiling[k],
+@@ -3313,7 +3314,8 @@ static void DisplayPipeConfiguration(struct display_mode_lib *mode_lib)
+ CalculateSwathAndDETConfiguration(
+ false,
+ v->NumberOfActivePlanes,
+- v->DETBufferSizeInKByte[0],
++ mode_lib->project == DML_PROJECT_DCN315 && v->DETSizeOverride[0],
++ v->DETBufferSizeInKByte,
+ dummy1,
+ dummy2,
+ v->SourceScan,
+@@ -3779,14 +3781,16 @@ static noinline void CalculatePrefetchSchedulePerPlane(
+ &v->VReadyOffsetPix[k]);
+ }
+
+-static void PatchDETBufferSizeInKByte(unsigned int NumberOfActivePlanes, int NoOfDPPThisState[], unsigned int config_return_buffer_size_in_kbytes, unsigned int *DETBufferSizeInKByte)
++static void PatchDETBufferSizeInKByte(unsigned int NumberOfActivePlanes, int NoOfDPPThisState[], unsigned int config_return_buffer_size_in_kbytes, unsigned int DETBufferSizeInKByte[])
+ {
+ int i, total_pipes = 0;
+ for (i = 0; i < NumberOfActivePlanes; i++)
+ total_pipes += NoOfDPPThisState[i];
+- *DETBufferSizeInKByte = ((config_return_buffer_size_in_kbytes - DCN3_15_MIN_COMPBUF_SIZE_KB) / 64 / total_pipes) * 64;
+- if (*DETBufferSizeInKByte > DCN3_15_MAX_DET_SIZE)
+- *DETBufferSizeInKByte = DCN3_15_MAX_DET_SIZE;
++ DETBufferSizeInKByte[0] = ((config_return_buffer_size_in_kbytes - DCN3_15_MIN_COMPBUF_SIZE_KB) / 64 / total_pipes) * 64;
++ if (DETBufferSizeInKByte[0] > DCN3_15_MAX_DET_SIZE)
++ DETBufferSizeInKByte[0] = DCN3_15_MAX_DET_SIZE;
++ for (i = 1; i < NumberOfActivePlanes; i++)
++ DETBufferSizeInKByte[i] = DETBufferSizeInKByte[0];
+ }
+
+
+@@ -4026,7 +4030,8 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
+ CalculateSwathAndDETConfiguration(
+ true,
+ v->NumberOfActivePlanes,
+- v->DETBufferSizeInKByte[0],
++ mode_lib->project == DML_PROJECT_DCN315 && v->DETSizeOverride[0],
++ v->DETBufferSizeInKByte,
+ v->MaximumSwathWidthLuma,
+ v->MaximumSwathWidthChroma,
+ v->SourceScan,
+@@ -4166,6 +4171,10 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
+ || (v->PlaneRequiredDISPCLK > v->MaxDispclkRoundedDownToDFSGranularity)) {
+ v->DISPCLK_DPPCLK_Support[i][j] = false;
+ }
++ if (mode_lib->project == DML_PROJECT_DCN315 && v->DETSizeOverride[k] > DCN3_15_MAX_DET_SIZE && v->NoOfDPP[i][j][k] < 2) {
++ v->MPCCombine[i][j][k] = true;
++ v->NoOfDPP[i][j][k] = 2;
++ }
+ }
+ v->TotalNumberOfActiveDPP[i][j] = 0;
+ v->TotalNumberOfSingleDPPPlanes[i][j] = 0;
+@@ -4642,12 +4651,13 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
+ v->ODMCombineEnableThisState[k] = v->ODMCombineEnablePerState[i][k];
+ }
+
+- if (v->NumberOfActivePlanes > 1 && mode_lib->project == DML_PROJECT_DCN315)
+- PatchDETBufferSizeInKByte(v->NumberOfActivePlanes, v->NoOfDPPThisState, v->ip.config_return_buffer_size_in_kbytes, &v->DETBufferSizeInKByte[0]);
++ if (v->NumberOfActivePlanes > 1 && mode_lib->project == DML_PROJECT_DCN315 && !v->DETSizeOverride[0])
++ PatchDETBufferSizeInKByte(v->NumberOfActivePlanes, v->NoOfDPPThisState, v->ip.config_return_buffer_size_in_kbytes, v->DETBufferSizeInKByte);
+ CalculateSwathAndDETConfiguration(
+ false,
+ v->NumberOfActivePlanes,
+- v->DETBufferSizeInKByte[0],
++ mode_lib->project == DML_PROJECT_DCN315 && v->DETSizeOverride[0],
++ v->DETBufferSizeInKByte,
+ v->MaximumSwathWidthLuma,
+ v->MaximumSwathWidthChroma,
+ v->SourceScan,
+@@ -6611,7 +6621,8 @@ static void CalculateStutterEfficiency(
+ static void CalculateSwathAndDETConfiguration(
+ bool ForceSingleDPP,
+ int NumberOfActivePlanes,
+- unsigned int DETBufferSizeInKByte,
++ bool DETSharedByAllDPP,
++ unsigned int DETBufferSizeInKByteA[],
+ double MaximumSwathWidthLuma[],
+ double MaximumSwathWidthChroma[],
+ enum scan_direction_class SourceScan[],
+@@ -6695,6 +6706,10 @@ static void CalculateSwathAndDETConfiguration(
+
+ *ViewportSizeSupport = true;
+ for (k = 0; k < NumberOfActivePlanes; ++k) {
++ unsigned int DETBufferSizeInKByte = DETBufferSizeInKByteA[k];
++
++ if (DETSharedByAllDPP && DPPPerPlane[k])
++ DETBufferSizeInKByte /= DPPPerPlane[k];
+ if ((SourcePixelFormat[k] == dm_444_64 || SourcePixelFormat[k] == dm_444_32 || SourcePixelFormat[k] == dm_444_16 || SourcePixelFormat[k] == dm_mono_16
+ || SourcePixelFormat[k] == dm_mono_8 || SourcePixelFormat[k] == dm_rgbe)) {
+ if (SurfaceTiling[k] == dm_sw_linear
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c
+index 2244e4fb8c96d..fcde8f21b8be0 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c
+@@ -987,8 +987,7 @@ static void dml_rq_dlg_get_dlg_params(
+
+ dlg_vblank_start = interlaced ? (vblank_start / 2) : vblank_start;
+ disp_dlg_regs->min_dst_y_next_start = (unsigned int) (((double) dlg_vblank_start) * dml_pow(2, 2));
+- disp_dlg_regs->optimized_min_dst_y_next_start_us = 0;
+- disp_dlg_regs->optimized_min_dst_y_next_start = disp_dlg_regs->min_dst_y_next_start;
++ disp_dlg_regs->min_dst_y_next_start_us = 0;
+ ASSERT(disp_dlg_regs->min_dst_y_next_start < (unsigned int)dml_pow(2, 18));
+
+ dml_print("DML_DLG: %s: min_ttu_vblank (us) = %3.2f\n", __func__, min_ttu_vblank);
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+index 9e54e3d0eb780..b878effa2129b 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+@@ -33,7 +33,7 @@
+ #include "dml/display_mode_vba.h"
+
+ struct _vcs_dpi_ip_params_st dcn3_14_ip = {
+- .VBlankNomDefaultUS = 668,
++ .VBlankNomDefaultUS = 800,
+ .gpuvm_enable = 1,
+ .gpuvm_max_page_table_levels = 1,
+ .hostvm_enable = 1,
+@@ -286,6 +286,7 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c
+ struct resource_context *res_ctx = &context->res_ctx;
+ struct pipe_ctx *pipe;
+ bool upscaled = false;
++ const unsigned int max_allowed_vblank_nom = 1023;
+
+ dc_assert_fp_enabled();
+
+@@ -299,9 +300,15 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c
+ pipe = &res_ctx->pipe_ctx[i];
+ timing = &pipe->stream->timing;
+
+- if (dc_extended_blank_supported(dc) && pipe->stream->adjust.v_total_max == pipe->stream->adjust.v_total_min
+- && pipe->stream->adjust.v_total_min > timing->v_total)
++ if (pipe->stream->adjust.v_total_min != 0)
+ pipes[pipe_cnt].pipe.dest.vtotal = pipe->stream->adjust.v_total_min;
++ else
++ pipes[pipe_cnt].pipe.dest.vtotal = timing->v_total;
++
++ pipes[pipe_cnt].pipe.dest.vblank_nom = timing->v_total - pipes[pipe_cnt].pipe.dest.vactive;
++ pipes[pipe_cnt].pipe.dest.vblank_nom = min(pipes[pipe_cnt].pipe.dest.vblank_nom, dcn3_14_ip.VBlankNomDefaultUS);
++ pipes[pipe_cnt].pipe.dest.vblank_nom = max(pipes[pipe_cnt].pipe.dest.vblank_nom, timing->v_sync_width);
++ pipes[pipe_cnt].pipe.dest.vblank_nom = min(pipes[pipe_cnt].pipe.dest.vblank_nom, max_allowed_vblank_nom);
+
+ if (pipe->plane_state &&
+ (pipe->plane_state->src_rect.height < pipe->plane_state->dst_rect.height ||
+@@ -323,8 +330,6 @@ int dcn314_populate_dml_pipes_from_context_fpu(struct dc *dc, struct dc_state *c
+ pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_luma = 0;
+ pipes[pipe_cnt].pipe.src.dcc_fraction_of_zs_req_chroma = 0;
+ pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
+- pipes[pipe_cnt].pipe.dest.vblank_nom =
+- dcn3_14_ip.VBlankNomDefaultUS / (timing->h_total / (timing->pix_clk_100hz / 10000.0));
+ pipes[pipe_cnt].pipe.src.dcc_rate = 3;
+ pipes[pipe_cnt].dout.dsc_input_bpc = 0;
+
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_rq_dlg_calc_314.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_rq_dlg_calc_314.c
+index ea4eb66066c42..4f945458b2b7e 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_rq_dlg_calc_314.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/display_rq_dlg_calc_314.c
+@@ -1051,7 +1051,6 @@ static void dml_rq_dlg_get_dlg_params(
+
+ float vba__refcyc_per_req_delivery_pre_l = get_refcyc_per_req_delivery_pre_l_in_us(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz; // From VBA
+ float vba__refcyc_per_req_delivery_l = get_refcyc_per_req_delivery_l_in_us(mode_lib, e2e_pipe_param, num_pipes, pipe_idx) * refclk_freq_in_mhz; // From VBA
+- int blank_lines = 0;
+
+ memset(disp_dlg_regs, 0, sizeof(*disp_dlg_regs));
+ memset(disp_ttu_regs, 0, sizeof(*disp_ttu_regs));
+@@ -1075,17 +1074,10 @@ static void dml_rq_dlg_get_dlg_params(
+ min_ttu_vblank = get_min_ttu_vblank_in_us(mode_lib, e2e_pipe_param, num_pipes, pipe_idx); // From VBA
+
+ dlg_vblank_start = interlaced ? (vblank_start / 2) : vblank_start;
+- disp_dlg_regs->optimized_min_dst_y_next_start = disp_dlg_regs->min_dst_y_next_start;
+- disp_dlg_regs->optimized_min_dst_y_next_start_us = 0;
+- disp_dlg_regs->min_dst_y_next_start = (unsigned int) (((double) dlg_vblank_start) * dml_pow(2, 2));
+- blank_lines = (dst->vblank_end + dst->vtotal_min - dst->vblank_start - dst->vstartup_start - 1);
+- if (blank_lines < 0)
+- blank_lines = 0;
+- if (blank_lines != 0) {
+- disp_dlg_regs->optimized_min_dst_y_next_start = vba__min_dst_y_next_start;
+- disp_dlg_regs->optimized_min_dst_y_next_start_us = (disp_dlg_regs->optimized_min_dst_y_next_start * dst->hactive) / (unsigned int) dst->pixel_rate_mhz;
+- disp_dlg_regs->min_dst_y_next_start = disp_dlg_regs->optimized_min_dst_y_next_start;
+- }
++ disp_dlg_regs->min_dst_y_next_start_us =
++ (vba__min_dst_y_next_start * dst->hactive) / (unsigned int) dst->pixel_rate_mhz;
++ disp_dlg_regs->min_dst_y_next_start = vba__min_dst_y_next_start * dml_pow(2, 2);
++
+ ASSERT(disp_dlg_regs->min_dst_y_next_start < (unsigned int)dml_pow(2, 18));
+
+ dml_print("DML_DLG: %s: min_ttu_vblank (us) = %3.2f\n", __func__, min_ttu_vblank);
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h b/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h
+index 3c077164f3620..ff0246a9458fd 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h
++++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_structs.h
+@@ -619,8 +619,7 @@ struct _vcs_dpi_display_dlg_regs_st {
+ unsigned int refcyc_h_blank_end;
+ unsigned int dlg_vblank_end;
+ unsigned int min_dst_y_next_start;
+- unsigned int optimized_min_dst_y_next_start;
+- unsigned int optimized_min_dst_y_next_start_us;
++ unsigned int min_dst_y_next_start_us;
+ unsigned int refcyc_per_htotal;
+ unsigned int refcyc_x_after_scaler;
+ unsigned int dst_y_after_scaler;
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_vba.c b/drivers/gpu/drm/amd/display/dc/dml/display_mode_vba.c
+index f9653f511baa3..2f63ae954826c 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_vba.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_vba.c
+@@ -571,6 +571,10 @@ static void fetch_pipe_params(struct display_mode_lib *mode_lib)
+ mode_lib->vba.OutputLinkDPRate[mode_lib->vba.NumberOfActivePlanes] = dout->dp_rate;
+ mode_lib->vba.ODMUse[mode_lib->vba.NumberOfActivePlanes] = dst->odm_combine_policy;
+ mode_lib->vba.DETSizeOverride[mode_lib->vba.NumberOfActivePlanes] = src->det_size_override;
++ if (src->det_size_override)
++ mode_lib->vba.DETBufferSizeInKByte[mode_lib->vba.NumberOfActivePlanes] = src->det_size_override;
++ else
++ mode_lib->vba.DETBufferSizeInKByte[mode_lib->vba.NumberOfActivePlanes] = ip->det_buffer_size_kbytes;
+ //TODO: Need to assign correct values to dp_multistream vars
+ mode_lib->vba.OutputMultistreamEn[mode_lib->vba.NumberOfActiveSurfaces] = dout->dp_multistream_en;
+ mode_lib->vba.OutputMultistreamId[mode_lib->vba.NumberOfActiveSurfaces] = dout->dp_multistream_id;
+@@ -785,6 +789,8 @@ static void fetch_pipe_params(struct display_mode_lib *mode_lib)
+ mode_lib->vba.pipe_plane[k] =
+ mode_lib->vba.NumberOfActivePlanes;
+ mode_lib->vba.DPPPerPlane[mode_lib->vba.NumberOfActivePlanes]++;
++ if (src_k->det_size_override)
++ mode_lib->vba.DETBufferSizeInKByte[mode_lib->vba.NumberOfActivePlanes] = src_k->det_size_override;
+ if (mode_lib->vba.SourceScan[mode_lib->vba.NumberOfActivePlanes]
+ == dm_horz) {
+ mode_lib->vba.ViewportWidth[mode_lib->vba.NumberOfActivePlanes] +=
+diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c
+index 5731c4b61f9f0..15faaf645b145 100644
+--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c
++++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_training_fixed_vs_pe_retimer.c
+@@ -233,7 +233,7 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy(
+ link->dpcd_caps.lttpr_caps.phy_repeater_cnt);
+ const uint8_t vendor_lttpr_write_data_intercept_en[4] = {0x1, 0x55, 0x63, 0x0};
+ const uint8_t vendor_lttpr_write_data_intercept_dis[4] = {0x1, 0x55, 0x63, 0x68};
+- uint32_t pre_disable_intercept_delay_ms = link->dc->debug.fixed_vs_aux_delay_config_wa;
++ uint32_t pre_disable_intercept_delay_ms = 0;
+ uint8_t vendor_lttpr_write_data_vs[4] = {0x1, 0x51, 0x63, 0x0};
+ uint8_t vendor_lttpr_write_data_pe[4] = {0x1, 0x52, 0x63, 0x0};
+ uint32_t vendor_lttpr_write_address = 0xF004F;
+@@ -244,6 +244,10 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy(
+ uint8_t toggle_rate;
+ uint8_t rate;
+
++ if (link->local_sink)
++ pre_disable_intercept_delay_ms =
++ link->local_sink->edid_caps.panel_patch.delay_disable_aux_intercept_ms;
++
+ /* Only 8b/10b is supported */
+ ASSERT(link_dp_get_encoding_format(&lt_settings->link_settings) ==
+ DP_8b_10b_ENCODING);
+@@ -259,7 +263,7 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy(
+
+ /* Certain display and cable configuration require extra delay */
+ if (offset > 2)
+- pre_disable_intercept_delay_ms = link->dc->debug.fixed_vs_aux_delay_config_wa * 2;
++ pre_disable_intercept_delay_ms = pre_disable_intercept_delay_ms * 2;
+ }
+
+ /* Vendor specific: Reset lane settings */
+@@ -380,7 +384,8 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence_legacy(
+ 0);
+ /* Vendor specific: Disable intercept */
+ for (i = 0; i < max_vendor_dpcd_retries; i++) {
+- msleep(pre_disable_intercept_delay_ms);
++ if (pre_disable_intercept_delay_ms != 0)
++ msleep(pre_disable_intercept_delay_ms);
+ dpcd_status = core_link_write_dpcd(
+ link,
+ vendor_lttpr_write_address,
+@@ -591,10 +596,9 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence(
+ const uint8_t vendor_lttpr_write_data_adicora_eq1[4] = {0x1, 0x55, 0x63, 0x2E};
+ const uint8_t vendor_lttpr_write_data_adicora_eq2[4] = {0x1, 0x55, 0x63, 0x01};
+ const uint8_t vendor_lttpr_write_data_adicora_eq3[4] = {0x1, 0x55, 0x63, 0x68};
+- uint32_t pre_disable_intercept_delay_ms = link->dc->debug.fixed_vs_aux_delay_config_wa;
+ uint8_t vendor_lttpr_write_data_vs[4] = {0x1, 0x51, 0x63, 0x0};
+ uint8_t vendor_lttpr_write_data_pe[4] = {0x1, 0x52, 0x63, 0x0};
+-
++ uint32_t pre_disable_intercept_delay_ms = 0;
+ uint32_t vendor_lttpr_write_address = 0xF004F;
+ enum link_training_result status = LINK_TRAINING_SUCCESS;
+ uint8_t lane = 0;
+@@ -603,6 +607,10 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence(
+ uint8_t toggle_rate;
+ uint8_t rate;
+
++ if (link->local_sink)
++ pre_disable_intercept_delay_ms =
++ link->local_sink->edid_caps.panel_patch.delay_disable_aux_intercept_ms;
++
+ /* Only 8b/10b is supported */
+ ASSERT(link_dp_get_encoding_format(&lt_settings->link_settings) ==
+ DP_8b_10b_ENCODING);
+@@ -618,7 +626,7 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence(
+
+ /* Certain display and cable configuration require extra delay */
+ if (offset > 2)
+- pre_disable_intercept_delay_ms = link->dc->debug.fixed_vs_aux_delay_config_wa * 2;
++ pre_disable_intercept_delay_ms = pre_disable_intercept_delay_ms * 2;
+ }
+
+ /* Vendor specific: Reset lane settings */
+@@ -739,7 +747,8 @@ enum link_training_result dp_perform_fixed_vs_pe_training_sequence(
+ 0);
+ /* Vendor specific: Disable intercept */
+ for (i = 0; i < max_vendor_dpcd_retries; i++) {
+- msleep(pre_disable_intercept_delay_ms);
++ if (pre_disable_intercept_delay_ms != 0)
++ msleep(pre_disable_intercept_delay_ms);
+ dpcd_status = core_link_write_dpcd(
+ link,
+ vendor_lttpr_write_address,
+diff --git a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
+index 554ab48d4e647..9cad599b27094 100644
+--- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
++++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h
+@@ -364,6 +364,8 @@ struct dmub_srv_hw_funcs {
+
+ bool (*is_supported)(struct dmub_srv *dmub);
+
++ bool (*is_psrsu_supported)(struct dmub_srv *dmub);
++
+ bool (*is_hw_init)(struct dmub_srv *dmub);
+
+ bool (*is_phy_init)(struct dmub_srv *dmub);
+diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+index 598fa1de54ce3..1c55d3b01f53e 100644
+--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
++++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+@@ -360,7 +360,7 @@ union dmub_fw_boot_status {
+ uint32_t optimized_init_done : 1; /**< 1 if optimized init done */
+ uint32_t restore_required : 1; /**< 1 if driver should call restore */
+ uint32_t defer_load : 1; /**< 1 if VBIOS data is deferred programmed */
+- uint32_t reserved : 1;
++ uint32_t fams_enabled : 1; /**< 1 if VBIOS data is deferred programmed */
+ uint32_t detection_required: 1; /**< if detection need to be triggered by driver */
+ uint32_t hw_power_init_done: 1; /**< 1 if hw power init is completed */
+ } bits; /**< status bits */
+diff --git a/drivers/gpu/drm/amd/display/dmub/src/Makefile b/drivers/gpu/drm/amd/display/dmub/src/Makefile
+index 0589ad4778eea..caf095aca8f3f 100644
+--- a/drivers/gpu/drm/amd/display/dmub/src/Makefile
++++ b/drivers/gpu/drm/amd/display/dmub/src/Makefile
+@@ -22,7 +22,7 @@
+
+ DMUB = dmub_srv.o dmub_srv_stat.o dmub_reg.o dmub_dcn20.o dmub_dcn21.o
+ DMUB += dmub_dcn30.o dmub_dcn301.o dmub_dcn302.o dmub_dcn303.o
+-DMUB += dmub_dcn31.o dmub_dcn315.o dmub_dcn316.o
++DMUB += dmub_dcn31.o dmub_dcn314.o dmub_dcn315.o dmub_dcn316.o
+ DMUB += dmub_dcn32.o
+
+ AMD_DAL_DMUB = $(addprefix $(AMDDALPATH)/dmub/src/,$(DMUB))
+diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c
+index c90b9ee42e126..89d24fb7024e2 100644
+--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c
++++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c
+@@ -297,6 +297,11 @@ bool dmub_dcn31_is_supported(struct dmub_srv *dmub)
+ return supported;
+ }
+
++bool dmub_dcn31_is_psrsu_supported(struct dmub_srv *dmub)
++{
++ return dmub->fw_version >= DMUB_FW_VERSION(4, 0, 59);
++}
++
+ void dmub_dcn31_set_gpint(struct dmub_srv *dmub,
+ union dmub_gpint_data_register reg)
+ {
+diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h
+index f6db6f89d45dc..eb62410941473 100644
+--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h
++++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.h
+@@ -219,6 +219,8 @@ bool dmub_dcn31_is_hw_init(struct dmub_srv *dmub);
+
+ bool dmub_dcn31_is_supported(struct dmub_srv *dmub);
+
++bool dmub_dcn31_is_psrsu_supported(struct dmub_srv *dmub);
++
+ void dmub_dcn31_set_gpint(struct dmub_srv *dmub,
+ union dmub_gpint_data_register reg);
+
+diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c
+new file mode 100644
+index 0000000000000..f161aeb7e7c4a
+--- /dev/null
++++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.c
+@@ -0,0 +1,67 @@
++/*
++ * Copyright 2021 Advanced Micro Devices, Inc.
++ *
++ * Permission is hereby granted, free of charge, to any person obtaining a
++ * copy of this software and associated documentation files (the "Software"),
++ * to deal in the Software without restriction, including without limitation
++ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
++ * and/or sell copies of the Software, and to permit persons to whom the
++ * Software is furnished to do so, subject to the following conditions:
++ *
++ * The above copyright notice and this permission notice shall be included in
++ * all copies or substantial portions of the Software.
++ *
++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
++ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
++ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
++ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
++ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
++ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
++ * OTHER DEALINGS IN THE SOFTWARE.
++ *
++ * Authors: AMD
++ *
++ */
++
++#include "../dmub_srv.h"
++#include "dmub_reg.h"
++#include "dmub_dcn314.h"
++
++#include "dcn/dcn_3_1_4_offset.h"
++#include "dcn/dcn_3_1_4_sh_mask.h"
++
++#define DCN_BASE__INST0_SEG0 0x00000012
++#define DCN_BASE__INST0_SEG1 0x000000C0
++#define DCN_BASE__INST0_SEG2 0x000034C0
++#define DCN_BASE__INST0_SEG3 0x00009000
++#define DCN_BASE__INST0_SEG4 0x02403C00
++#define DCN_BASE__INST0_SEG5 0
++
++#define BASE_INNER(seg) DCN_BASE__INST0_SEG##seg
++#define CTX dmub
++#define REGS dmub->regs_dcn31
++#define REG_OFFSET_EXP(reg_name) (BASE(reg##reg_name##_BASE_IDX) + reg##reg_name)
++
++/* Registers. */
++
++const struct dmub_srv_dcn31_regs dmub_srv_dcn314_regs = {
++#define DMUB_SR(reg) REG_OFFSET_EXP(reg),
++ {
++ DMUB_DCN31_REGS()
++ DMCUB_INTERNAL_REGS()
++ },
++#undef DMUB_SR
++
++#define DMUB_SF(reg, field) FD_MASK(reg, field),
++ { DMUB_DCN31_FIELDS() },
++#undef DMUB_SF
++
++#define DMUB_SF(reg, field) FD_SHIFT(reg, field),
++ { DMUB_DCN31_FIELDS() },
++#undef DMUB_SF
++};
++
++bool dmub_dcn314_is_psrsu_supported(struct dmub_srv *dmub)
++{
++ return dmub->fw_version >= DMUB_FW_VERSION(8, 0, 16);
++}
+diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h
+new file mode 100644
+index 0000000000000..f213bd82c9110
+--- /dev/null
++++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_dcn314.h
+@@ -0,0 +1,35 @@
++/*
++ * Copyright 2021 Advanced Micro Devices, Inc.
++ *
++ * Permission is hereby granted, free of charge, to any person obtaining a
++ * copy of this software and associated documentation files (the "Software"),
++ * to deal in the Software without restriction, including without limitation
++ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
++ * and/or sell copies of the Software, and to permit persons to whom the
++ * Software is furnished to do so, subject to the following conditions:
++ *
++ * The above copyright notice and this permission notice shall be included in
++ * all copies or substantial portions of the Software.
++ *
++ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
++ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
++ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
++ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
++ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
++ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
++ * OTHER DEALINGS IN THE SOFTWARE.
++ *
++ * Authors: AMD
++ *
++ */
++
++#ifndef _DMUB_DCN314_H_
++#define _DMUB_DCN314_H_
++
++#include "dmub_dcn31.h"
++
++extern const struct dmub_srv_dcn31_regs dmub_srv_dcn314_regs;
++
++bool dmub_dcn314_is_psrsu_supported(struct dmub_srv *dmub);
++
++#endif /* _DMUB_DCN314_H_ */
+diff --git a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
+index 92c18bfb98b3b..0dab22d794808 100644
+--- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
++++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c
+@@ -32,6 +32,7 @@
+ #include "dmub_dcn302.h"
+ #include "dmub_dcn303.h"
+ #include "dmub_dcn31.h"
++#include "dmub_dcn314.h"
+ #include "dmub_dcn315.h"
+ #include "dmub_dcn316.h"
+ #include "dmub_dcn32.h"
+@@ -226,12 +227,17 @@ static bool dmub_srv_hw_setup(struct dmub_srv *dmub, enum dmub_asic asic)
+ case DMUB_ASIC_DCN314:
+ case DMUB_ASIC_DCN315:
+ case DMUB_ASIC_DCN316:
+- if (asic == DMUB_ASIC_DCN315)
++ if (asic == DMUB_ASIC_DCN314) {
++ dmub->regs_dcn31 = &dmub_srv_dcn314_regs;
++ funcs->is_psrsu_supported = dmub_dcn314_is_psrsu_supported;
++ } else if (asic == DMUB_ASIC_DCN315) {
+ dmub->regs_dcn31 = &dmub_srv_dcn315_regs;
+- else if (asic == DMUB_ASIC_DCN316)
++ } else if (asic == DMUB_ASIC_DCN316) {
+ dmub->regs_dcn31 = &dmub_srv_dcn316_regs;
+- else
++ } else {
+ dmub->regs_dcn31 = &dmub_srv_dcn31_regs;
++ funcs->is_psrsu_supported = dmub_dcn31_is_psrsu_supported;
++ }
+ funcs->reset = dmub_dcn31_reset;
+ funcs->reset_release = dmub_dcn31_reset_release;
+ funcs->backdoor_load = dmub_dcn31_backdoor_load;
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+index e22fc563b462f..0cda3b276f611 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+@@ -2081,89 +2081,36 @@ static int sienna_cichlid_display_disable_memory_clock_switch(struct smu_context
+ return ret;
+ }
+
+-static void sienna_cichlid_get_override_pcie_settings(struct smu_context *smu,
+- uint32_t *gen_speed_override,
+- uint32_t *lane_width_override)
+-{
+- struct amdgpu_device *adev = smu->adev;
+-
+- *gen_speed_override = 0xff;
+- *lane_width_override = 0xff;
+-
+- switch (adev->pdev->device) {
+- case 0x73A0:
+- case 0x73A1:
+- case 0x73A2:
+- case 0x73A3:
+- case 0x73AB:
+- case 0x73AE:
+- /* Bit 7:0: PCIE lane width, 1 to 7 corresponds is x1 to x32 */
+- *lane_width_override = 6;
+- break;
+- case 0x73E0:
+- case 0x73E1:
+- case 0x73E3:
+- *lane_width_override = 4;
+- break;
+- case 0x7420:
+- case 0x7421:
+- case 0x7422:
+- case 0x7423:
+- case 0x7424:
+- *lane_width_override = 3;
+- break;
+- default:
+- break;
+- }
+-}
+-
+-#define MAX(a, b) ((a) > (b) ? (a) : (b))
+-
+ static int sienna_cichlid_update_pcie_parameters(struct smu_context *smu,
+ uint32_t pcie_gen_cap,
+ uint32_t pcie_width_cap)
+ {
+ struct smu_11_0_dpm_context *dpm_context = smu->smu_dpm.dpm_context;
+ struct smu_11_0_pcie_table *pcie_table = &dpm_context->dpm_tables.pcie_table;
+- uint32_t gen_speed_override, lane_width_override;
+- uint8_t *table_member1, *table_member2;
+- uint32_t min_gen_speed, max_gen_speed;
+- uint32_t min_lane_width, max_lane_width;
+- uint32_t smu_pcie_arg;
++ u32 smu_pcie_arg;
+ int ret, i;
+
+- GET_PPTABLE_MEMBER(PcieGenSpeed, &table_member1);
+- GET_PPTABLE_MEMBER(PcieLaneCount, &table_member2);
+-
+- sienna_cichlid_get_override_pcie_settings(smu,
+- &gen_speed_override,
+- &lane_width_override);
++ /* PCIE gen speed and lane width override */
++ if (!amdgpu_device_pcie_dynamic_switching_supported()) {
++ if (pcie_table->pcie_gen[NUM_LINK_LEVELS - 1] < pcie_gen_cap)
++ pcie_gen_cap = pcie_table->pcie_gen[NUM_LINK_LEVELS - 1];
+
+- /* PCIE gen speed override */
+- if (gen_speed_override != 0xff) {
+- min_gen_speed = MIN(pcie_gen_cap, gen_speed_override);
+- max_gen_speed = MIN(pcie_gen_cap, gen_speed_override);
+- } else {
+- min_gen_speed = MAX(0, table_member1[0]);
+- max_gen_speed = MIN(pcie_gen_cap, table_member1[1]);
+- min_gen_speed = min_gen_speed > max_gen_speed ?
+- max_gen_speed : min_gen_speed;
+- }
+- pcie_table->pcie_gen[0] = min_gen_speed;
+- pcie_table->pcie_gen[1] = max_gen_speed;
++ if (pcie_table->pcie_lane[NUM_LINK_LEVELS - 1] < pcie_width_cap)
++ pcie_width_cap = pcie_table->pcie_lane[NUM_LINK_LEVELS - 1];
+
+- /* PCIE lane width override */
+- if (lane_width_override != 0xff) {
+- min_lane_width = MIN(pcie_width_cap, lane_width_override);
+- max_lane_width = MIN(pcie_width_cap, lane_width_override);
++ /* Force all levels to use the same settings */
++ for (i = 0; i < NUM_LINK_LEVELS; i++) {
++ pcie_table->pcie_gen[i] = pcie_gen_cap;
++ pcie_table->pcie_lane[i] = pcie_width_cap;
++ }
+ } else {
+- min_lane_width = MAX(1, table_member2[0]);
+- max_lane_width = MIN(pcie_width_cap, table_member2[1]);
+- min_lane_width = min_lane_width > max_lane_width ?
+- max_lane_width : min_lane_width;
++ for (i = 0; i < NUM_LINK_LEVELS; i++) {
++ if (pcie_table->pcie_gen[i] > pcie_gen_cap)
++ pcie_table->pcie_gen[i] = pcie_gen_cap;
++ if (pcie_table->pcie_lane[i] > pcie_width_cap)
++ pcie_table->pcie_lane[i] = pcie_width_cap;
++ }
+ }
+- pcie_table->pcie_lane[0] = min_lane_width;
+- pcie_table->pcie_lane[1] = max_lane_width;
+
+ for (i = 0; i < NUM_LINK_LEVELS; i++) {
+ smu_pcie_arg = (i << 16 |
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+index 7acf731a69ccf..79e9230fc7960 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+@@ -2454,25 +2454,6 @@ int smu_v13_0_mode1_reset(struct smu_context *smu)
+ return ret;
+ }
+
+-/*
+- * Intel hosts such as Raptor Lake and Sapphire Rapids don't support dynamic
+- * speed switching. Until we have confirmation from Intel that a specific host
+- * supports it, it's safer that we keep it disabled for all.
+- *
+- * https://edc.intel.com/content/www/us/en/design/products/platforms/details/raptor-lake-s/13th-generation-core-processors-datasheet-volume-1-of-2/005/pci-express-support/
+- * https://gitlab.freedesktop.org/drm/amd/-/issues/2663
+- */
+-static bool smu_v13_0_is_pcie_dynamic_switching_supported(void)
+-{
+-#if IS_ENABLED(CONFIG_X86)
+- struct cpuinfo_x86 *c = &cpu_data(0);
+-
+- if (c->x86_vendor == X86_VENDOR_INTEL)
+- return false;
+-#endif
+- return true;
+-}
+-
+ int smu_v13_0_update_pcie_parameters(struct smu_context *smu,
+ uint32_t pcie_gen_cap,
+ uint32_t pcie_width_cap)
+@@ -2484,7 +2465,7 @@ int smu_v13_0_update_pcie_parameters(struct smu_context *smu,
+ uint32_t smu_pcie_arg;
+ int ret, i;
+
+- if (!smu_v13_0_is_pcie_dynamic_switching_supported()) {
++ if (!amdgpu_device_pcie_dynamic_switching_supported()) {
+ if (pcie_table->pcie_gen[num_of_levels - 1] < pcie_gen_cap)
+ pcie_gen_cap = pcie_table->pcie_gen[num_of_levels - 1];
+
+diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
+index 0c2be83605258..e592c5da70cee 100644
+--- a/drivers/gpu/drm/drm_syncobj.c
++++ b/drivers/gpu/drm/drm_syncobj.c
+@@ -353,10 +353,10 @@ EXPORT_SYMBOL(drm_syncobj_replace_fence);
+ */
+ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+ {
+- struct dma_fence *fence = dma_fence_allocate_private_stub();
++ struct dma_fence *fence = dma_fence_allocate_private_stub(ktime_get());
+
+- if (IS_ERR(fence))
+- return PTR_ERR(fence);
++ if (!fence)
++ return -ENOMEM;
+
+ drm_syncobj_replace_fence(syncobj, fence);
+ dma_fence_put(fence);
+diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
+index b8027392144de..25b9a0ba29ebc 100644
+--- a/drivers/gpu/drm/i915/display/intel_dpt.c
++++ b/drivers/gpu/drm/i915/display/intel_dpt.c
+@@ -166,6 +166,8 @@ struct i915_vma *intel_dpt_pin(struct i915_address_space *vm)
+ i915_vma_get(vma);
+ }
+
++ dpt->obj->mm.dirty = true;
++
+ atomic_dec(&i915->gpu_error.pending_fb_pin);
+ intel_runtime_pm_put(&i915->runtime_pm, wakeref);
+
+@@ -261,7 +263,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
+ dpt_obj = i915_gem_object_create_stolen(i915, size);
+ if (IS_ERR(dpt_obj) && !HAS_LMEM(i915)) {
+ drm_dbg_kms(&i915->drm, "Allocating dpt from smem\n");
+- dpt_obj = i915_gem_object_create_internal(i915, size);
++ dpt_obj = i915_gem_object_create_shmem(i915, size);
+ }
+ if (IS_ERR(dpt_obj))
+ return ERR_CAST(dpt_obj);
+diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+index 99f39a5feca15..e86e75971ec60 100644
+--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
++++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+@@ -1190,8 +1190,10 @@ static int igt_write_huge(struct drm_i915_private *i915,
+ * times in succession a possibility by enlarging the permutation array.
+ */
+ order = i915_random_order(count * count, &prng);
+- if (!order)
+- return -ENOMEM;
++ if (!order) {
++ err = -ENOMEM;
++ goto out;
++ }
+
+ max_page_size = rounddown_pow_of_two(obj->mm.page_sizes.sg);
+ max = div_u64(max - size, max_page_size);
+diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+index a99310b687932..bbb1bf33f98ef 100644
+--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+@@ -89,7 +89,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
+ * since we've already mapped it once in
+ * submit_reloc()
+ */
+- if (WARN_ON(!ptr))
++ if (WARN_ON(IS_ERR_OR_NULL(ptr)))
+ return;
+
+ for (i = 0; i < dwords; i++) {
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
+index 790f55e245332..e788ed72eb0d3 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
+@@ -206,7 +206,7 @@ static const struct a6xx_shader_block {
+ SHADER(A6XX_SP_LB_3_DATA, 0x800),
+ SHADER(A6XX_SP_LB_4_DATA, 0x800),
+ SHADER(A6XX_SP_LB_5_DATA, 0x200),
+- SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x2000),
++ SHADER(A6XX_SP_CB_BINDLESS_DATA, 0x800),
+ SHADER(A6XX_SP_CB_LEGACY_DATA, 0x280),
+ SHADER(A6XX_SP_UAV_DATA, 0x80),
+ SHADER(A6XX_SP_INST_TAG, 0x80),
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h
+index e3795995e1454..29bb8ee2bc266 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h
+@@ -14,19 +14,6 @@
+
+ #define DPU_PERF_DEFAULT_MAX_CORE_CLK_RATE 412500000
+
+-/**
+- * enum dpu_core_perf_data_bus_id - data bus identifier
+- * @DPU_CORE_PERF_DATA_BUS_ID_MNOC: DPU/MNOC data bus
+- * @DPU_CORE_PERF_DATA_BUS_ID_LLCC: MNOC/LLCC data bus
+- * @DPU_CORE_PERF_DATA_BUS_ID_EBI: LLCC/EBI data bus
+- */
+-enum dpu_core_perf_data_bus_id {
+- DPU_CORE_PERF_DATA_BUS_ID_MNOC,
+- DPU_CORE_PERF_DATA_BUS_ID_LLCC,
+- DPU_CORE_PERF_DATA_BUS_ID_EBI,
+- DPU_CORE_PERF_DATA_BUS_ID_MAX,
+-};
+-
+ /**
+ * struct dpu_core_perf_params - definition of performance parameters
+ * @max_per_pipe_ib: maximum instantaneous bandwidth request
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+index f6270b7a0b140..5afbc16ec5bbb 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+@@ -51,7 +51,7 @@
+
+ static const u32 fetch_tbl[SSPP_MAX] = {CTL_INVALID_BIT, 16, 17, 18, 19,
+ CTL_INVALID_BIT, CTL_INVALID_BIT, CTL_INVALID_BIT, CTL_INVALID_BIT, 0,
+- 1, 2, 3, CTL_INVALID_BIT, CTL_INVALID_BIT};
++ 1, 2, 3, 4, 5};
+
+ static const struct dpu_ctl_cfg *_ctl_offset(enum dpu_ctl ctl,
+ const struct dpu_mdss_cfg *m,
+@@ -209,6 +209,12 @@ static void dpu_hw_ctl_update_pending_flush_sspp(struct dpu_hw_ctl *ctx,
+ case SSPP_DMA3:
+ ctx->pending_flush_mask |= BIT(25);
+ break;
++ case SSPP_DMA4:
++ ctx->pending_flush_mask |= BIT(13);
++ break;
++ case SSPP_DMA5:
++ ctx->pending_flush_mask |= BIT(14);
++ break;
+ case SSPP_CURSOR0:
+ ctx->pending_flush_mask |= BIT(22);
+ break;
+diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
+index 3ce45b023e637..31deda1c664ad 100644
+--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
++++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy_14nm.c
+@@ -1087,8 +1087,6 @@ const struct msm_dsi_phy_cfg dsi_phy_14nm_8953_cfgs = {
+
+ const struct msm_dsi_phy_cfg dsi_phy_14nm_2290_cfgs = {
+ .has_phy_lane = true,
+- .regulator_data = dsi_phy_14nm_17mA_regulators,
+- .num_regulators = ARRAY_SIZE(dsi_phy_14nm_17mA_regulators),
+ .ops = {
+ .enable = dsi_14nm_phy_enable,
+ .disable = dsi_14nm_phy_disable,
+diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
+index 96599ec3eb783..1a5d4f1c8b422 100644
+--- a/drivers/gpu/drm/msm/msm_fence.c
++++ b/drivers/gpu/drm/msm/msm_fence.c
+@@ -191,6 +191,12 @@ msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
+
+ f->fctx = fctx;
+
++ /*
++ * Until this point, the fence was just some pre-allocated memory,
++ * no-one should have taken a reference to it yet.
++ */
++ WARN_ON(kref_read(&fence->refcount));
++
+ dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock,
+ fctx->context, ++fctx->last_fence);
+ }
+diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
+index 9f5933c75e3df..1bd78041b4d0d 100644
+--- a/drivers/gpu/drm/msm/msm_gem_submit.c
++++ b/drivers/gpu/drm/msm/msm_gem_submit.c
+@@ -86,7 +86,19 @@ void __msm_gem_submit_destroy(struct kref *kref)
+ }
+
+ dma_fence_put(submit->user_fence);
+- dma_fence_put(submit->hw_fence);
++
++ /*
++ * If the submit is freed before msm_job_run(), then hw_fence is
++ * just some pre-allocated memory, not a reference counted fence.
++ * Once the job runs and the hw_fence is initialized, it will
++ * have a refcount of at least one, since the submit holds a ref
++ * to the hw_fence.
++ */
++ if (kref_read(&submit->hw_fence->refcount) == 0) {
++ kfree(submit->hw_fence);
++ } else {
++ dma_fence_put(submit->hw_fence);
++ }
+
+ put_pid(submit->pid);
+ msm_submitqueue_put(submit->queue);
+@@ -890,7 +902,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
+ * after the job is armed
+ */
+ if ((args->flags & MSM_SUBMIT_FENCE_SN_IN) &&
+- idr_find(&queue->fence_idr, args->fence)) {
++ (!args->fence || idr_find(&queue->fence_idr, args->fence))) {
+ spin_unlock(&queue->idr_lock);
+ idr_preload_end();
+ ret = -EINVAL;
+diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c
+index e8c93731aaa18..4ae6fac20e48c 100644
+--- a/drivers/gpu/drm/msm/msm_mdss.c
++++ b/drivers/gpu/drm/msm/msm_mdss.c
+@@ -189,6 +189,7 @@ static int _msm_mdss_irq_domain_add(struct msm_mdss *msm_mdss)
+ #define UBWC_2_0 0x20000000
+ #define UBWC_3_0 0x30000000
+ #define UBWC_4_0 0x40000000
++#define UBWC_4_3 0x40030000
+
+ static void msm_mdss_setup_ubwc_dec_20(struct msm_mdss *msm_mdss)
+ {
+@@ -227,7 +228,10 @@ static void msm_mdss_setup_ubwc_dec_40(struct msm_mdss *msm_mdss)
+ writel_relaxed(1, msm_mdss->mmio + UBWC_CTRL_2);
+ writel_relaxed(0, msm_mdss->mmio + UBWC_PREDICTION_MODE);
+ } else {
+- writel_relaxed(2, msm_mdss->mmio + UBWC_CTRL_2);
++ if (data->ubwc_dec_version == UBWC_4_3)
++ writel_relaxed(3, msm_mdss->mmio + UBWC_CTRL_2);
++ else
++ writel_relaxed(2, msm_mdss->mmio + UBWC_CTRL_2);
+ writel_relaxed(1, msm_mdss->mmio + UBWC_PREDICTION_MODE);
+ }
+ }
+@@ -271,6 +275,7 @@ static int msm_mdss_enable(struct msm_mdss *msm_mdss)
+ msm_mdss_setup_ubwc_dec_30(msm_mdss);
+ break;
+ case UBWC_4_0:
++ case UBWC_4_3:
+ msm_mdss_setup_ubwc_dec_40(msm_mdss);
+ break;
+ default:
+@@ -561,6 +566,16 @@ static const struct msm_mdss_data sm8250_data = {
+ .macrotile_mode = 1,
+ };
+
++static const struct msm_mdss_data sm8550_data = {
++ .ubwc_version = UBWC_4_0,
++ .ubwc_dec_version = UBWC_4_3,
++ .ubwc_swizzle = 6,
++ .ubwc_static = 1,
++ /* TODO: highest_bank_bit = 2 for LP_DDR4 */
++ .highest_bank_bit = 3,
++ .macrotile_mode = 1,
++};
++
+ static const struct of_device_id mdss_dt_match[] = {
+ { .compatible = "qcom,mdss" },
+ { .compatible = "qcom,msm8998-mdss" },
+@@ -575,7 +590,7 @@ static const struct of_device_id mdss_dt_match[] = {
+ { .compatible = "qcom,sm8250-mdss", .data = &sm8250_data },
+ { .compatible = "qcom,sm8350-mdss", .data = &sm8250_data },
+ { .compatible = "qcom,sm8450-mdss", .data = &sm8250_data },
+- { .compatible = "qcom,sm8550-mdss", .data = &sm8250_data },
++ { .compatible = "qcom,sm8550-mdss", .data = &sm8550_data },
+ {}
+ };
+ MODULE_DEVICE_TABLE(of, mdss_dt_match);
+diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
+index 1a1cfd675cc46..7139a522b2f3b 100644
+--- a/drivers/gpu/drm/ttm/ttm_bo.c
++++ b/drivers/gpu/drm/ttm/ttm_bo.c
+@@ -517,6 +517,12 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
+ {
+ bool ret = false;
+
++ if (bo->pin_count) {
++ *locked = false;
++ *busy = false;
++ return false;
++ }
++
+ if (bo->base.resv == ctx->resv) {
+ dma_resv_assert_held(bo->base.resv);
+ if (ctx->allow_res_evict)
+diff --git a/drivers/hwmon/aquacomputer_d5next.c b/drivers/hwmon/aquacomputer_d5next.c
+index a4fcd4ebf76c2..c2b99fd4f436c 100644
+--- a/drivers/hwmon/aquacomputer_d5next.c
++++ b/drivers/hwmon/aquacomputer_d5next.c
+@@ -969,7 +969,7 @@ static int aqc_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+ if (ret < 0)
+ return ret;
+
+- *val = aqc_percent_to_pwm(ret);
++ *val = aqc_percent_to_pwm(*val);
+ break;
+ }
+ break;
+diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c
+index 7b177b9fbb097..a267b11731a8a 100644
+--- a/drivers/hwmon/k10temp.c
++++ b/drivers/hwmon/k10temp.c
+@@ -77,6 +77,13 @@ static DEFINE_MUTEX(nb_smu_ind_mutex);
+ #define ZEN_CUR_TEMP_RANGE_SEL_MASK BIT(19)
+ #define ZEN_CUR_TEMP_TJ_SEL_MASK GENMASK(17, 16)
+
++/*
++ * AMD's Industrial processor 3255 supports temperature from -40 deg to 105 deg Celsius.
++ * Use the model name to identify 3255 CPUs and set a flag to display negative temperature.
++ * Do not round off to zero for negative Tctl or Tdie values if the flag is set
++ */
++#define AMD_I3255_STR "3255"
++
+ struct k10temp_data {
+ struct pci_dev *pdev;
+ void (*read_htcreg)(struct pci_dev *pdev, u32 *regval);
+@@ -86,6 +93,7 @@ struct k10temp_data {
+ u32 show_temp;
+ bool is_zen;
+ u32 ccd_offset;
++ bool disp_negative;
+ };
+
+ #define TCTL_BIT 0
+@@ -204,12 +212,12 @@ static int k10temp_read_temp(struct device *dev, u32 attr, int channel,
+ switch (channel) {
+ case 0: /* Tctl */
+ *val = get_raw_temp(data);
+- if (*val < 0)
++ if (*val < 0 && !data->disp_negative)
+ *val = 0;
+ break;
+ case 1: /* Tdie */
+ *val = get_raw_temp(data) - data->temp_offset;
+- if (*val < 0)
++ if (*val < 0 && !data->disp_negative)
+ *val = 0;
+ break;
+ case 2 ... 13: /* Tccd{1-12} */
+@@ -405,6 +413,11 @@ static int k10temp_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+ data->pdev = pdev;
+ data->show_temp |= BIT(TCTL_BIT); /* Always show Tctl */
+
++ if (boot_cpu_data.x86 == 0x17 &&
++ strstr(boot_cpu_data.x86_model_id, AMD_I3255_STR)) {
++ data->disp_negative = true;
++ }
++
+ if (boot_cpu_data.x86 == 0x15 &&
+ ((boot_cpu_data.x86_model & 0xf0) == 0x60 ||
+ (boot_cpu_data.x86_model & 0xf0) == 0x70)) {
+diff --git a/drivers/hwmon/nct7802.c b/drivers/hwmon/nct7802.c
+index a175f8283695e..e64c12d90a042 100644
+--- a/drivers/hwmon/nct7802.c
++++ b/drivers/hwmon/nct7802.c
+@@ -725,7 +725,7 @@ static umode_t nct7802_temp_is_visible(struct kobject *kobj,
+ if (index >= 38 && index < 46 && !(reg & 0x01)) /* PECI 0 */
+ return 0;
+
+- if (index >= 0x46 && (!(reg & 0x02))) /* PECI 1 */
++ if (index >= 46 && !(reg & 0x02)) /* PECI 1 */
+ return 0;
+
+ return attr->mode;
+diff --git a/drivers/hwmon/pmbus/pmbus_core.c b/drivers/hwmon/pmbus/pmbus_core.c
+index 9d14954da94fb..8e54be26c88f3 100644
+--- a/drivers/hwmon/pmbus/pmbus_core.c
++++ b/drivers/hwmon/pmbus/pmbus_core.c
+@@ -2745,9 +2745,8 @@ static const struct pmbus_status_category __maybe_unused pmbus_status_flag_map[]
+ },
+ };
+
+-static int _pmbus_is_enabled(struct device *dev, u8 page)
++static int _pmbus_is_enabled(struct i2c_client *client, u8 page)
+ {
+- struct i2c_client *client = to_i2c_client(dev->parent);
+ int ret;
+
+ ret = _pmbus_read_byte_data(client, page, PMBUS_OPERATION);
+@@ -2758,17 +2757,16 @@ static int _pmbus_is_enabled(struct device *dev, u8 page)
+ return !!(ret & PB_OPERATION_CONTROL_ON);
+ }
+
+-static int __maybe_unused pmbus_is_enabled(struct device *dev, u8 page)
++static int __maybe_unused pmbus_is_enabled(struct i2c_client *client, u8 page)
+ {
+- struct i2c_client *client = to_i2c_client(dev->parent);
+ struct pmbus_data *data = i2c_get_clientdata(client);
+ int ret;
+
+ mutex_lock(&data->update_lock);
+- ret = _pmbus_is_enabled(dev, page);
++ ret = _pmbus_is_enabled(client, page);
+ mutex_unlock(&data->update_lock);
+
+- return !!(ret & PB_OPERATION_CONTROL_ON);
++ return ret;
+ }
+
+ #define to_dev_attr(_dev_attr) \
+@@ -2844,7 +2842,7 @@ static int _pmbus_get_flags(struct pmbus_data *data, u8 page, unsigned int *flag
+ if (status < 0)
+ return status;
+
+- if (_pmbus_is_enabled(dev, page)) {
++ if (_pmbus_is_enabled(client, page)) {
+ if (status & PB_STATUS_OFF) {
+ *flags |= REGULATOR_ERROR_FAIL;
+ *event |= REGULATOR_EVENT_FAIL;
+@@ -2898,7 +2896,10 @@ static int __maybe_unused pmbus_get_flags(struct pmbus_data *data, u8 page, unsi
+ #if IS_ENABLED(CONFIG_REGULATOR)
+ static int pmbus_regulator_is_enabled(struct regulator_dev *rdev)
+ {
+- return pmbus_is_enabled(rdev_get_dev(rdev), rdev_get_id(rdev));
++ struct device *dev = rdev_get_dev(rdev);
++ struct i2c_client *client = to_i2c_client(dev->parent);
++
++ return pmbus_is_enabled(client, rdev_get_id(rdev));
+ }
+
+ static int _pmbus_regulator_on_off(struct regulator_dev *rdev, bool enable)
+@@ -2945,6 +2946,7 @@ static int pmbus_regulator_get_status(struct regulator_dev *rdev)
+ struct pmbus_data *data = i2c_get_clientdata(client);
+ u8 page = rdev_get_id(rdev);
+ int status, ret;
++ int event;
+
+ mutex_lock(&data->update_lock);
+ status = pmbus_get_status(client, page, PMBUS_STATUS_WORD);
+@@ -2964,7 +2966,7 @@ static int pmbus_regulator_get_status(struct regulator_dev *rdev)
+ goto unlock;
+ }
+
+- ret = pmbus_regulator_get_error_flags(rdev, &status);
++ ret = _pmbus_get_flags(data, rdev_get_id(rdev), &status, &event, false);
+ if (ret)
+ goto unlock;
+
+diff --git a/drivers/i2c/busses/i2c-ibm_iic.c b/drivers/i2c/busses/i2c-ibm_iic.c
+index eeb80e34f9ad7..de3b609515e08 100644
+--- a/drivers/i2c/busses/i2c-ibm_iic.c
++++ b/drivers/i2c/busses/i2c-ibm_iic.c
+@@ -694,10 +694,8 @@ static int iic_probe(struct platform_device *ofdev)
+ int ret;
+
+ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+- if (!dev) {
+- dev_err(&ofdev->dev, "failed to allocate device data\n");
++ if (!dev)
+ return -ENOMEM;
+- }
+
+ platform_set_drvdata(ofdev, dev);
+
+diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c
+index a2d12a5b1c34c..9c5d66bd6dc1c 100644
+--- a/drivers/i2c/busses/i2c-nomadik.c
++++ b/drivers/i2c/busses/i2c-nomadik.c
+@@ -970,12 +970,10 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
+ struct i2c_vendor_data *vendor = id->data;
+ u32 max_fifo_threshold = (vendor->fifodepth / 2) - 1;
+
+- dev = devm_kzalloc(&adev->dev, sizeof(struct nmk_i2c_dev), GFP_KERNEL);
+- if (!dev) {
+- dev_err(&adev->dev, "cannot allocate memory\n");
+- ret = -ENOMEM;
+- goto err_no_mem;
+- }
++ dev = devm_kzalloc(&adev->dev, sizeof(*dev), GFP_KERNEL);
++ if (!dev)
++ return -ENOMEM;
++
+ dev->vendor = vendor;
+ dev->adev = adev;
+ nmk_i2c_of_probe(np, dev);
+@@ -996,30 +994,21 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
+
+ dev->virtbase = devm_ioremap(&adev->dev, adev->res.start,
+ resource_size(&adev->res));
+- if (!dev->virtbase) {
+- ret = -ENOMEM;
+- goto err_no_mem;
+- }
++ if (!dev->virtbase)
++ return -ENOMEM;
+
+ dev->irq = adev->irq[0];
+ ret = devm_request_irq(&adev->dev, dev->irq, i2c_irq_handler, 0,
+ DRIVER_NAME, dev);
+ if (ret) {
+ dev_err(&adev->dev, "cannot claim the irq %d\n", dev->irq);
+- goto err_no_mem;
++ return ret;
+ }
+
+- dev->clk = devm_clk_get(&adev->dev, NULL);
++ dev->clk = devm_clk_get_enabled(&adev->dev, NULL);
+ if (IS_ERR(dev->clk)) {
+- dev_err(&adev->dev, "could not get i2c clock\n");
+- ret = PTR_ERR(dev->clk);
+- goto err_no_mem;
+- }
+-
+- ret = clk_prepare_enable(dev->clk);
+- if (ret) {
+- dev_err(&adev->dev, "can't prepare_enable clock\n");
+- goto err_no_mem;
++ dev_err(&adev->dev, "could enable i2c clock\n");
++ return PTR_ERR(dev->clk);
+ }
+
+ init_hw(dev);
+@@ -1042,22 +1031,15 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
+
+ ret = i2c_add_adapter(adap);
+ if (ret)
+- goto err_no_adap;
++ return ret;
+
+ pm_runtime_put(&adev->dev);
+
+ return 0;
+-
+- err_no_adap:
+- clk_disable_unprepare(dev->clk);
+- err_no_mem:
+-
+- return ret;
+ }
+
+ static void nmk_i2c_remove(struct amba_device *adev)
+ {
+- struct resource *res = &adev->res;
+ struct nmk_i2c_dev *dev = amba_get_drvdata(adev);
+
+ i2c_del_adapter(&dev->adap);
+@@ -1066,8 +1048,6 @@ static void nmk_i2c_remove(struct amba_device *adev)
+ clear_all_interrupts(dev);
+ /* disable the controller */
+ i2c_clr_bit(dev->virtbase + I2C_CR, I2C_CR_PE);
+- clk_disable_unprepare(dev->clk);
+- release_mem_region(res->start, resource_size(res));
+ }
+
+ static struct i2c_vendor_data vendor_stn8815 = {
+diff --git a/drivers/i2c/busses/i2c-sh7760.c b/drivers/i2c/busses/i2c-sh7760.c
+index 319d1fa617c88..051b904cb35f6 100644
+--- a/drivers/i2c/busses/i2c-sh7760.c
++++ b/drivers/i2c/busses/i2c-sh7760.c
+@@ -443,9 +443,8 @@ static int sh7760_i2c_probe(struct platform_device *pdev)
+ goto out0;
+ }
+
+- id = kzalloc(sizeof(struct cami2c), GFP_KERNEL);
++ id = kzalloc(sizeof(*id), GFP_KERNEL);
+ if (!id) {
+- dev_err(&pdev->dev, "no mem for private data\n");
+ ret = -ENOMEM;
+ goto out0;
+ }
+diff --git a/drivers/i2c/busses/i2c-tiny-usb.c b/drivers/i2c/busses/i2c-tiny-usb.c
+index 7279ca0eaa2d0..d1fa9ff5aeab4 100644
+--- a/drivers/i2c/busses/i2c-tiny-usb.c
++++ b/drivers/i2c/busses/i2c-tiny-usb.c
+@@ -226,10 +226,8 @@ static int i2c_tiny_usb_probe(struct usb_interface *interface,
+
+ /* allocate memory for our device state and initialize it */
+ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+- if (dev == NULL) {
+- dev_err(&interface->dev, "Out of memory\n");
++ if (!dev)
+ goto error;
+- }
+
+ dev->usb_dev = usb_get_dev(interface_to_usbdev(interface));
+ dev->interface = interface;
+diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
+index 6b3f4384e46ac..a60e587aea817 100644
+--- a/drivers/infiniband/core/cma.c
++++ b/drivers/infiniband/core/cma.c
+@@ -4062,6 +4062,8 @@ static int resolve_prepare_src(struct rdma_id_private *id_priv,
+ RDMA_CM_ADDR_QUERY)))
+ return -EINVAL;
+
++ } else {
++ memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr));
+ }
+
+ if (cma_family(id_priv) != dst_addr->sa_family) {
+diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+index 952811c40c54b..ebe6852c40e8c 100644
+--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
++++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+@@ -797,7 +797,10 @@ fail:
+ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
+ {
+ struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp);
++ struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
+ struct bnxt_re_dev *rdev = qp->rdev;
++ struct bnxt_qplib_nq *scq_nq = NULL;
++ struct bnxt_qplib_nq *rcq_nq = NULL;
+ unsigned int flags;
+ int rc;
+
+@@ -831,6 +834,15 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
+ ib_umem_release(qp->rumem);
+ ib_umem_release(qp->sumem);
+
++ /* Flush all the entries of notification queue associated with
++ * given qp.
++ */
++ scq_nq = qplib_qp->scq->nq;
++ rcq_nq = qplib_qp->rcq->nq;
++ bnxt_re_synchronize_nq(scq_nq);
++ if (scq_nq != rcq_nq)
++ bnxt_re_synchronize_nq(rcq_nq);
++
+ return 0;
+ }
+
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+index 55f092c2c8a88..b34cc500f51f3 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
++++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+@@ -381,6 +381,24 @@ static void bnxt_qplib_service_nq(struct tasklet_struct *t)
+ spin_unlock_bh(&hwq->lock);
+ }
+
++/* bnxt_re_synchronize_nq - self polling notification queue.
++ * @nq - notification queue pointer
++ *
++ * This function will start polling entries of a given notification queue
++ * for all pending entries.
++ * This function is useful to synchronize notification entries while resources
++ * are going away.
++ */
++
++void bnxt_re_synchronize_nq(struct bnxt_qplib_nq *nq)
++{
++ int budget = nq->budget;
++
++ nq->budget = nq->hwq.max_elements;
++ bnxt_qplib_service_nq(&nq->nq_tasklet);
++ nq->budget = budget;
++}
++
+ static irqreturn_t bnxt_qplib_nq_irq(int irq, void *dev_instance)
+ {
+ struct bnxt_qplib_nq *nq = dev_instance;
+@@ -402,19 +420,19 @@ void bnxt_qplib_nq_stop_irq(struct bnxt_qplib_nq *nq, bool kill)
+ if (!nq->requested)
+ return;
+
+- tasklet_disable(&nq->nq_tasklet);
++ nq->requested = false;
+ /* Mask h/w interrupt */
+ bnxt_qplib_ring_nq_db(&nq->nq_db.dbinfo, nq->res->cctx, false);
+ /* Sync with last running IRQ handler */
+ synchronize_irq(nq->msix_vec);
+- if (kill)
+- tasklet_kill(&nq->nq_tasklet);
+-
+ irq_set_affinity_hint(nq->msix_vec, NULL);
+ free_irq(nq->msix_vec, nq);
+ kfree(nq->name);
+ nq->name = NULL;
+- nq->requested = false;
++
++ if (kill)
++ tasklet_kill(&nq->nq_tasklet);
++ tasklet_disable(&nq->nq_tasklet);
+ }
+
+ void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq)
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+index a42820821c473..404b851091ca2 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h
++++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+@@ -553,6 +553,7 @@ int bnxt_qplib_process_flush_list(struct bnxt_qplib_cq *cq,
+ struct bnxt_qplib_cqe *cqe,
+ int num_cqes);
+ void bnxt_qplib_flush_cqn_wq(struct bnxt_qplib_qp *qp);
++void bnxt_re_synchronize_nq(struct bnxt_qplib_nq *nq);
+
+ static inline void *bnxt_qplib_get_swqe(struct bnxt_qplib_q *que, u32 *swq_idx)
+ {
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+index c11b8e708844c..05683ce64887f 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
++++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+@@ -53,70 +53,139 @@
+
+ static void bnxt_qplib_service_creq(struct tasklet_struct *t);
+
+-/* Hardware communication channel */
+-static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie)
++/**
++ * bnxt_qplib_map_rc - map return type based on opcode
++ * @opcode - roce slow path opcode
++ *
++ * In some cases like firmware halt is detected, the driver is supposed to
++ * remap the error code of the timed out command.
++ *
++ * It is not safe to assume hardware is really inactive so certain opcodes
++ * like destroy qp etc are not safe to be returned success, but this function
++ * will be called when FW already reports a timeout. This would be possible
++ * only when FW crashes and resets. This will clear all the HW resources.
++ *
++ * Returns:
++ * 0 to communicate success to caller.
++ * Non zero error code to communicate failure to caller.
++ */
++static int bnxt_qplib_map_rc(u8 opcode)
++{
++ switch (opcode) {
++ case CMDQ_BASE_OPCODE_DESTROY_QP:
++ case CMDQ_BASE_OPCODE_DESTROY_SRQ:
++ case CMDQ_BASE_OPCODE_DESTROY_CQ:
++ case CMDQ_BASE_OPCODE_DEALLOCATE_KEY:
++ case CMDQ_BASE_OPCODE_DEREGISTER_MR:
++ case CMDQ_BASE_OPCODE_DELETE_GID:
++ case CMDQ_BASE_OPCODE_DESTROY_QP1:
++ case CMDQ_BASE_OPCODE_DESTROY_AH:
++ case CMDQ_BASE_OPCODE_DEINITIALIZE_FW:
++ case CMDQ_BASE_OPCODE_MODIFY_ROCE_CC:
++ case CMDQ_BASE_OPCODE_SET_LINK_AGGR_MODE:
++ return 0;
++ default:
++ return -ETIMEDOUT;
++ }
++}
++
++/**
++ * __wait_for_resp - Don't hold the cpu context and wait for response
++ * @rcfw - rcfw channel instance of rdev
++ * @cookie - cookie to track the command
++ * @opcode - rcfw submitted for given opcode
++ *
++ * Wait for command completion in sleepable context.
++ *
++ * Returns:
++ * 0 if command is completed by firmware.
++ * Non zero error code for rest of the case.
++ */
++static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie, u8 opcode)
+ {
+ struct bnxt_qplib_cmdq_ctx *cmdq;
+ u16 cbit;
+- int rc;
++ int ret;
+
+ cmdq = &rcfw->cmdq;
+ cbit = cookie % rcfw->cmdq_depth;
+- rc = wait_event_timeout(cmdq->waitq,
+- !test_bit(cbit, cmdq->cmdq_bitmap),
+- msecs_to_jiffies(RCFW_CMD_WAIT_TIME_MS));
+- return rc ? 0 : -ETIMEDOUT;
++
++ do {
++ if (test_bit(ERR_DEVICE_DETACHED, &cmdq->flags))
++ return bnxt_qplib_map_rc(opcode);
++
++ /* Non zero means command completed */
++ ret = wait_event_timeout(cmdq->waitq,
++ !test_bit(cbit, cmdq->cmdq_bitmap),
++ msecs_to_jiffies(10000));
++
++ if (!test_bit(cbit, cmdq->cmdq_bitmap))
++ return 0;
++
++ bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet);
++
++ if (!test_bit(cbit, cmdq->cmdq_bitmap))
++ return 0;
++
++ } while (true);
+ };
+
+-static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie)
++/**
++ * __block_for_resp - hold the cpu context and wait for response
++ * @rcfw - rcfw channel instance of rdev
++ * @cookie - cookie to track the command
++ * @opcode - rcfw submitted for given opcode
++ *
++ * This function will hold the cpu (non-sleepable context) and
++ * wait for command completion. Maximum holding interval is 8 second.
++ *
++ * Returns:
++ * -ETIMEOUT if command is not completed in specific time interval.
++ * 0 if command is completed by firmware.
++ */
++static int __block_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie, u8 opcode)
+ {
+- u32 count = RCFW_BLOCKED_CMD_WAIT_COUNT;
+- struct bnxt_qplib_cmdq_ctx *cmdq;
++ struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq;
++ unsigned long issue_time = 0;
+ u16 cbit;
+
+- cmdq = &rcfw->cmdq;
+ cbit = cookie % rcfw->cmdq_depth;
+- if (!test_bit(cbit, cmdq->cmdq_bitmap))
+- goto done;
++ issue_time = jiffies;
++
+ do {
++ if (test_bit(ERR_DEVICE_DETACHED, &cmdq->flags))
++ return bnxt_qplib_map_rc(opcode);
++
+ udelay(1);
++
+ bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet);
+- } while (test_bit(cbit, cmdq->cmdq_bitmap) && --count);
+-done:
+- return count ? 0 : -ETIMEDOUT;
++ if (!test_bit(cbit, cmdq->cmdq_bitmap))
++ return 0;
++
++ } while (time_before(jiffies, issue_time + (8 * HZ)));
++
++ return -ETIMEDOUT;
+ };
+
+ static int __send_message(struct bnxt_qplib_rcfw *rcfw,
+ struct bnxt_qplib_cmdqmsg *msg)
+ {
+- struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq;
+- struct bnxt_qplib_hwq *hwq = &cmdq->hwq;
++ u32 bsize, opcode, free_slots, required_slots;
++ struct bnxt_qplib_cmdq_ctx *cmdq;
+ struct bnxt_qplib_crsqe *crsqe;
+ struct bnxt_qplib_cmdqe *cmdqe;
++ struct bnxt_qplib_hwq *hwq;
+ u32 sw_prod, cmdq_prod;
+ struct pci_dev *pdev;
+ unsigned long flags;
+- u32 bsize, opcode;
+ u16 cookie, cbit;
+ u8 *preq;
+
++ cmdq = &rcfw->cmdq;
++ hwq = &cmdq->hwq;
+ pdev = rcfw->pdev;
+
+ opcode = __get_cmdq_base_opcode(msg->req, msg->req_sz);
+- if (!test_bit(FIRMWARE_INITIALIZED_FLAG, &cmdq->flags) &&
+- (opcode != CMDQ_BASE_OPCODE_QUERY_FUNC &&
+- opcode != CMDQ_BASE_OPCODE_INITIALIZE_FW &&
+- opcode != CMDQ_BASE_OPCODE_QUERY_VERSION)) {
+- dev_err(&pdev->dev,
+- "RCFW not initialized, reject opcode 0x%x\n", opcode);
+- return -EINVAL;
+- }
+-
+- if (test_bit(FIRMWARE_INITIALIZED_FLAG, &cmdq->flags) &&
+- opcode == CMDQ_BASE_OPCODE_INITIALIZE_FW) {
+- dev_err(&pdev->dev, "RCFW already initialized!\n");
+- return -EINVAL;
+- }
+
+ if (test_bit(FIRMWARE_TIMED_OUT, &cmdq->flags))
+ return -ETIMEDOUT;
+@@ -125,40 +194,37 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw,
+ * cmdqe
+ */
+ spin_lock_irqsave(&hwq->lock, flags);
+- if (msg->req->cmd_size >= HWQ_FREE_SLOTS(hwq)) {
+- dev_err(&pdev->dev, "RCFW: CMDQ is full!\n");
++ required_slots = bnxt_qplib_get_cmd_slots(msg->req);
++ free_slots = HWQ_FREE_SLOTS(hwq);
++ cookie = cmdq->seq_num & RCFW_MAX_COOKIE_VALUE;
++ cbit = cookie % rcfw->cmdq_depth;
++
++ if (required_slots >= free_slots ||
++ test_bit(cbit, cmdq->cmdq_bitmap)) {
++ dev_info_ratelimited(&pdev->dev,
++ "CMDQ is full req/free %d/%d!",
++ required_slots, free_slots);
+ spin_unlock_irqrestore(&hwq->lock, flags);
+ return -EAGAIN;
+ }
+-
+-
+- cookie = cmdq->seq_num & RCFW_MAX_COOKIE_VALUE;
+- cbit = cookie % rcfw->cmdq_depth;
+ if (msg->block)
+ cookie |= RCFW_CMD_IS_BLOCKING;
+-
+ set_bit(cbit, cmdq->cmdq_bitmap);
+ __set_cmdq_base_cookie(msg->req, msg->req_sz, cpu_to_le16(cookie));
+ crsqe = &rcfw->crsqe_tbl[cbit];
+- if (crsqe->resp) {
+- spin_unlock_irqrestore(&hwq->lock, flags);
+- return -EBUSY;
+- }
+-
+- /* change the cmd_size to the number of 16byte cmdq unit.
+- * req->cmd_size is modified here
+- */
+ bsize = bnxt_qplib_set_cmd_slots(msg->req);
+-
+- memset(msg->resp, 0, sizeof(*msg->resp));
++ crsqe->free_slots = free_slots;
+ crsqe->resp = (struct creq_qp_event *)msg->resp;
+ crsqe->resp->cookie = cpu_to_le16(cookie);
+ crsqe->req_size = __get_cmdq_base_cmd_size(msg->req, msg->req_sz);
+ if (__get_cmdq_base_resp_size(msg->req, msg->req_sz) && msg->sb) {
+ struct bnxt_qplib_rcfw_sbuf *sbuf = msg->sb;
+- __set_cmdq_base_resp_addr(msg->req, msg->req_sz, cpu_to_le64(sbuf->dma_addr));
++
++ __set_cmdq_base_resp_addr(msg->req, msg->req_sz,
++ cpu_to_le64(sbuf->dma_addr));
+ __set_cmdq_base_resp_size(msg->req, msg->req_sz,
+- ALIGN(sbuf->size, BNXT_QPLIB_CMDQE_UNITS));
++ ALIGN(sbuf->size,
++ BNXT_QPLIB_CMDQE_UNITS));
+ }
+
+ preq = (u8 *)msg->req;
+@@ -166,11 +232,6 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw,
+ /* Locate the next cmdq slot */
+ sw_prod = HWQ_CMP(hwq->prod, hwq);
+ cmdqe = bnxt_qplib_get_qe(hwq, sw_prod, NULL);
+- if (!cmdqe) {
+- dev_err(&pdev->dev,
+- "RCFW request failed with no cmdqe!\n");
+- goto done;
+- }
+ /* Copy a segment of the req cmd to the cmdq */
+ memset(cmdqe, 0, sizeof(*cmdqe));
+ memcpy(cmdqe, preq, min_t(u32, bsize, sizeof(*cmdqe)));
+@@ -194,45 +255,121 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw,
+ wmb();
+ writel(cmdq_prod, cmdq->cmdq_mbox.prod);
+ writel(RCFW_CMDQ_TRIG_VAL, cmdq->cmdq_mbox.db);
+-done:
+ spin_unlock_irqrestore(&hwq->lock, flags);
+ /* Return the CREQ response pointer */
+ return 0;
+ }
+
+-int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
+- struct bnxt_qplib_cmdqmsg *msg)
++/**
++ * __poll_for_resp - self poll completion for rcfw command
++ * @rcfw - rcfw channel instance of rdev
++ * @cookie - cookie to track the command
++ * @opcode - rcfw submitted for given opcode
++ *
++ * It works same as __wait_for_resp except this function will
++ * do self polling in sort interval since interrupt is disabled.
++ * This function can not be called from non-sleepable context.
++ *
++ * Returns:
++ * -ETIMEOUT if command is not completed in specific time interval.
++ * 0 if command is completed by firmware.
++ */
++static int __poll_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie,
++ u8 opcode)
++{
++ struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq;
++ unsigned long issue_time;
++ u16 cbit;
++
++ cbit = cookie % rcfw->cmdq_depth;
++ issue_time = jiffies;
++
++ do {
++ if (test_bit(ERR_DEVICE_DETACHED, &cmdq->flags))
++ return bnxt_qplib_map_rc(opcode);
++
++ usleep_range(1000, 1001);
++
++ bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet);
++ if (!test_bit(cbit, cmdq->cmdq_bitmap))
++ return 0;
++ if (jiffies_to_msecs(jiffies - issue_time) > 10000)
++ return -ETIMEDOUT;
++ } while (true);
++};
++
++static int __send_message_basic_sanity(struct bnxt_qplib_rcfw *rcfw,
++ struct bnxt_qplib_cmdqmsg *msg)
++{
++ struct bnxt_qplib_cmdq_ctx *cmdq;
++ u32 opcode;
++
++ cmdq = &rcfw->cmdq;
++ opcode = __get_cmdq_base_opcode(msg->req, msg->req_sz);
++
++ /* Prevent posting if f/w is not in a state to process */
++ if (test_bit(ERR_DEVICE_DETACHED, &rcfw->cmdq.flags))
++ return -ENXIO;
++
++ if (test_bit(FIRMWARE_INITIALIZED_FLAG, &cmdq->flags) &&
++ opcode == CMDQ_BASE_OPCODE_INITIALIZE_FW) {
++ dev_err(&rcfw->pdev->dev, "QPLIB: RCFW already initialized!");
++ return -EINVAL;
++ }
++
++ if (!test_bit(FIRMWARE_INITIALIZED_FLAG, &cmdq->flags) &&
++ (opcode != CMDQ_BASE_OPCODE_QUERY_FUNC &&
++ opcode != CMDQ_BASE_OPCODE_INITIALIZE_FW &&
++ opcode != CMDQ_BASE_OPCODE_QUERY_VERSION)) {
++ dev_err(&rcfw->pdev->dev,
++ "QPLIB: RCFW not initialized, reject opcode 0x%x",
++ opcode);
++ return -EOPNOTSUPP;
++ }
++
++ return 0;
++}
++
++/**
++ * __bnxt_qplib_rcfw_send_message - qplib interface to send
++ * and complete rcfw command.
++ * @rcfw - rcfw channel instance of rdev
++ * @msg - qplib message internal
++ *
++ * This function does not account shadow queue depth. It will send
++ * all the command unconditionally as long as send queue is not full.
++ *
++ * Returns:
++ * 0 if command completed by firmware.
++ * Non zero if the command is not completed by firmware.
++ */
++static int __bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
++ struct bnxt_qplib_cmdqmsg *msg)
+ {
+ struct creq_qp_event *evnt = (struct creq_qp_event *)msg->resp;
+ u16 cookie;
+- u8 opcode, retry_cnt = 0xFF;
+ int rc = 0;
++ u8 opcode;
+
+- /* Prevent posting if f/w is not in a state to process */
+- if (test_bit(ERR_DEVICE_DETACHED, &rcfw->cmdq.flags))
+- return 0;
++ opcode = __get_cmdq_base_opcode(msg->req, msg->req_sz);
+
+- do {
+- opcode = __get_cmdq_base_opcode(msg->req, msg->req_sz);
+- rc = __send_message(rcfw, msg);
+- cookie = le16_to_cpu(__get_cmdq_base_cookie(msg->req, msg->req_sz)) &
+- RCFW_MAX_COOKIE_VALUE;
+- if (!rc)
+- break;
+- if (!retry_cnt || (rc != -EAGAIN && rc != -EBUSY)) {
+- /* send failed */
+- dev_err(&rcfw->pdev->dev, "cmdq[%#x]=%#x send failed\n",
+- cookie, opcode);
+- return rc;
+- }
+- msg->block ? mdelay(1) : usleep_range(500, 1000);
++ rc = __send_message_basic_sanity(rcfw, msg);
++ if (rc)
++ return rc == -ENXIO ? bnxt_qplib_map_rc(opcode) : rc;
+
+- } while (retry_cnt--);
++ rc = __send_message(rcfw, msg);
++ if (rc)
++ return rc;
++
++ cookie = le16_to_cpu(__get_cmdq_base_cookie(msg->req, msg->req_sz))
++ & RCFW_MAX_COOKIE_VALUE;
+
+ if (msg->block)
+- rc = __block_for_resp(rcfw, cookie);
++ rc = __block_for_resp(rcfw, cookie, opcode);
++ else if (atomic_read(&rcfw->rcfw_intr_enabled))
++ rc = __wait_for_resp(rcfw, cookie, opcode);
+ else
+- rc = __wait_for_resp(rcfw, cookie);
++ rc = __poll_for_resp(rcfw, cookie, opcode);
+ if (rc) {
+ /* timed out */
+ dev_err(&rcfw->pdev->dev, "cmdq[%#x]=%#x timedout (%d)msec\n",
+@@ -250,6 +387,48 @@ int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
+
+ return rc;
+ }
++
++/**
++ * bnxt_qplib_rcfw_send_message - qplib interface to send
++ * and complete rcfw command.
++ * @rcfw - rcfw channel instance of rdev
++ * @msg - qplib message internal
++ *
++ * Driver interact with Firmware through rcfw channel/slow path in two ways.
++ * a. Blocking rcfw command send. In this path, driver cannot hold
++ * the context for longer period since it is holding cpu until
++ * command is not completed.
++ * b. Non-blocking rcfw command send. In this path, driver can hold the
++ * context for longer period. There may be many pending command waiting
++ * for completion because of non-blocking nature.
++ *
++ * Driver will use shadow queue depth. Current queue depth of 8K
++ * (due to size of rcfw message there can be actual ~4K rcfw outstanding)
++ * is not optimal for rcfw command processing in firmware.
++ *
++ * Restrict at max #RCFW_CMD_NON_BLOCKING_SHADOW_QD Non-Blocking rcfw commands.
++ * Allow all blocking commands until there is no queue full.
++ *
++ * Returns:
++ * 0 if the command was completed by firmware.
++ * Non-zero if the command was not completed by firmware.
++ */
++int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
++ struct bnxt_qplib_cmdqmsg *msg)
++{
++ int ret;
++
++ if (!msg->block) {
++ down(&rcfw->rcfw_inflight);
++ ret = __bnxt_qplib_rcfw_send_message(rcfw, msg);
++ up(&rcfw->rcfw_inflight);
++ } else {
++ ret = __bnxt_qplib_rcfw_send_message(rcfw, msg);
++ }
++
++ return ret;
++}
++
+ /* Completions */
+ static int bnxt_qplib_process_func_event(struct bnxt_qplib_rcfw *rcfw,
+ struct creq_func_event *func_event)
+@@ -647,18 +826,18 @@ void bnxt_qplib_rcfw_stop_irq(struct bnxt_qplib_rcfw *rcfw, bool kill)
+ if (!creq->requested)
+ return;
+
+- tasklet_disable(&creq->creq_tasklet);
++ creq->requested = false;
+ /* Mask h/w interrupts */
+ bnxt_qplib_ring_nq_db(&creq->creq_db.dbinfo, rcfw->res->cctx, false);
+ /* Sync with last running IRQ-handler */
+ synchronize_irq(creq->msix_vec);
+- if (kill)
+- tasklet_kill(&creq->creq_tasklet);
+-
+ free_irq(creq->msix_vec, rcfw);
+ kfree(creq->irq_name);
+ creq->irq_name = NULL;
+- creq->requested = false;
++ atomic_set(&rcfw->rcfw_intr_enabled, 0);
++ if (kill)
++ tasklet_kill(&creq->creq_tasklet);
++ tasklet_disable(&creq->creq_tasklet);
+ }
+
+ void bnxt_qplib_disable_rcfw_channel(struct bnxt_qplib_rcfw *rcfw)
+@@ -720,6 +899,7 @@ int bnxt_qplib_rcfw_start_irq(struct bnxt_qplib_rcfw *rcfw, int msix_vector,
+ creq->requested = true;
+
+ bnxt_qplib_ring_nq_db(&creq->creq_db.dbinfo, res->cctx, true);
++ atomic_inc(&rcfw->rcfw_intr_enabled);
+
+ return 0;
+ }
+@@ -856,6 +1036,7 @@ int bnxt_qplib_enable_rcfw_channel(struct bnxt_qplib_rcfw *rcfw,
+ return rc;
+ }
+
++ sema_init(&rcfw->rcfw_inflight, RCFW_CMD_NON_BLOCKING_SHADOW_QD);
+ bnxt_qplib_start_rcfw(rcfw);
+
+ return 0;
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
+index 92f7a25533d3b..4608c0ef07a87 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
++++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
+@@ -67,6 +67,8 @@ static inline void bnxt_qplib_rcfw_cmd_prep(struct cmdq_base *req,
+ req->cmd_size = cmd_size;
+ }
+
++/* Shadow queue depth for non-blocking commands */
++#define RCFW_CMD_NON_BLOCKING_SHADOW_QD 64
+ #define RCFW_CMD_WAIT_TIME_MS 20000 /* 20 Seconds timeout */
+
+ /* CMDQ elements */
+@@ -89,6 +91,26 @@ static inline u32 bnxt_qplib_cmdqe_page_size(u32 depth)
+ return (bnxt_qplib_cmdqe_npages(depth) * PAGE_SIZE);
+ }
+
++/* Get the number of command units required for the req. The function
++ * returns the correct value only if it is called before the slots are
++ * set with bnxt_qplib_set_cmd_slots().
++ */
++static inline u32 bnxt_qplib_get_cmd_slots(struct cmdq_base *req)
++{
++ u32 cmd_units = 0;
++
++ if (HAS_TLV_HEADER(req)) {
++ struct roce_tlv *tlv_req = (struct roce_tlv *)req;
++
++ cmd_units = tlv_req->total_size;
++ } else {
++ cmd_units = (req->cmd_size + BNXT_QPLIB_CMDQE_UNITS - 1) /
++ BNXT_QPLIB_CMDQE_UNITS;
++ }
++
++ return cmd_units;
++}
++
+ static inline u32 bnxt_qplib_set_cmd_slots(struct cmdq_base *req)
+ {
+ u32 cmd_byte = 0;
+@@ -132,6 +154,8 @@ typedef int (*aeq_handler_t)(struct bnxt_qplib_rcfw *, void *, void *);
+ struct bnxt_qplib_crsqe {
+ struct creq_qp_event *resp;
+ u32 req_size;
++ /* Free slots at the time of submission */
++ u32 free_slots;
+ };
+
+ struct bnxt_qplib_rcfw_sbuf {
+@@ -201,6 +225,8 @@ struct bnxt_qplib_rcfw {
+ u64 oos_prev;
+ u32 init_oos_stats;
+ u32 cmdq_depth;
++ atomic_t rcfw_intr_enabled;
++ struct semaphore rcfw_inflight;
+ };
+
+ struct bnxt_qplib_cmdqmsg {
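Aside (illustration only, not part of the patch above): the bnxt_re hunks cap in-flight non-blocking RCFW commands with a counting semaphore initialized to the shadow queue depth (RCFW_CMD_NON_BLOCKING_SHADOW_QD, 64), while blocking callers bypass the throttle. A minimal sketch of that pattern follows; the demo_* names are hypothetical stand-ins, not driver symbols.

#include <linux/semaphore.h>
#include <linux/types.h>

struct demo_cmd_channel {
	struct semaphore inflight;	/* free non-blocking command slots */
};

static void demo_channel_init(struct demo_cmd_channel *ch, int shadow_qd)
{
	/* e.g. shadow_qd == RCFW_CMD_NON_BLOCKING_SHADOW_QD (64) */
	sema_init(&ch->inflight, shadow_qd);
}

static int demo_send(struct demo_cmd_channel *ch, bool block,
		     int (*send_and_wait)(struct demo_cmd_channel *ch))
{
	int ret;

	if (!block) {
		/* At most shadow_qd non-blocking commands run concurrently. */
		down(&ch->inflight);
		ret = send_and_wait(ch);
		up(&ch->inflight);
	} else {
		/* Blocking commands are not throttled by the shadow depth. */
		ret = send_and_wait(ch);
	}

	return ret;
}

In the real driver the slot is released only after the firmware response has been consumed, which is what bounds the number of outstanding non-blocking commands.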
+diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c
+index d88c9184007ea..45e3344daa048 100644
+--- a/drivers/infiniband/hw/irdma/ctrl.c
++++ b/drivers/infiniband/hw/irdma/ctrl.c
+@@ -2712,13 +2712,13 @@ static int irdma_sc_cq_modify(struct irdma_sc_cq *cq,
+ */
+ void irdma_check_cqp_progress(struct irdma_cqp_timeout *timeout, struct irdma_sc_dev *dev)
+ {
+- if (timeout->compl_cqp_cmds != dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]) {
+- timeout->compl_cqp_cmds = dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS];
++ u64 completed_ops = atomic64_read(&dev->cqp->completed_ops);
++
++ if (timeout->compl_cqp_cmds != completed_ops) {
++ timeout->compl_cqp_cmds = completed_ops;
+ timeout->count = 0;
+- } else {
+- if (dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS] !=
+- timeout->compl_cqp_cmds)
+- timeout->count++;
++ } else if (timeout->compl_cqp_cmds != dev->cqp->requested_ops) {
++ timeout->count++;
+ }
+ }
+
+@@ -2761,7 +2761,7 @@ static int irdma_cqp_poll_registers(struct irdma_sc_cqp *cqp, u32 tail,
+ if (newtail != tail) {
+ /* SUCCESS */
+ IRDMA_RING_MOVE_TAIL(cqp->sq_ring);
+- cqp->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]++;
++ atomic64_inc(&cqp->completed_ops);
+ return 0;
+ }
+ udelay(cqp->dev->hw_attrs.max_sleep_count);
+@@ -3121,8 +3121,8 @@ int irdma_sc_cqp_init(struct irdma_sc_cqp *cqp,
+ info->dev->cqp = cqp;
+
+ IRDMA_RING_INIT(cqp->sq_ring, cqp->sq_size);
+- cqp->dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS] = 0;
+- cqp->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS] = 0;
++ cqp->requested_ops = 0;
++ atomic64_set(&cqp->completed_ops, 0);
+ /* for the cqp commands backlog. */
+ INIT_LIST_HEAD(&cqp->dev->cqp_cmd_head);
+
+@@ -3274,7 +3274,7 @@ __le64 *irdma_sc_cqp_get_next_send_wqe_idx(struct irdma_sc_cqp *cqp, u64 scratch
+ if (ret_code)
+ return NULL;
+
+- cqp->dev->cqp_cmd_stats[IRDMA_OP_REQ_CMDS]++;
++ cqp->requested_ops++;
+ if (!*wqe_idx)
+ cqp->polarity = !cqp->polarity;
+ wqe = cqp->sq_base[*wqe_idx].elem;
+@@ -3363,6 +3363,9 @@ int irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq,
+ if (polarity != ccq->cq_uk.polarity)
+ return -ENOENT;
+
++ /* Ensure CEQE contents are read after valid bit is checked */
++ dma_rmb();
++
+ get_64bit_val(cqe, 8, &qp_ctx);
+ cqp = (struct irdma_sc_cqp *)(unsigned long)qp_ctx;
+ info->error = (bool)FIELD_GET(IRDMA_CQ_ERROR, temp);
+@@ -3397,7 +3400,7 @@ int irdma_sc_ccq_get_cqe_info(struct irdma_sc_cq *ccq,
+ dma_wmb(); /* make sure shadow area is updated before moving tail */
+
+ IRDMA_RING_MOVE_TAIL(cqp->sq_ring);
+- ccq->dev->cqp_cmd_stats[IRDMA_OP_CMPL_CMDS]++;
++ atomic64_inc(&cqp->completed_ops);
+
+ return ret_code;
+ }
+@@ -4009,13 +4012,17 @@ int irdma_sc_get_next_aeqe(struct irdma_sc_aeq *aeq,
+ u8 polarity;
+
+ aeqe = IRDMA_GET_CURRENT_AEQ_ELEM(aeq);
+- get_64bit_val(aeqe, 0, &compl_ctx);
+ get_64bit_val(aeqe, 8, &temp);
+ polarity = (u8)FIELD_GET(IRDMA_AEQE_VALID, temp);
+
+ if (aeq->polarity != polarity)
+ return -ENOENT;
+
++ /* Ensure AEQE contents are read after valid bit is checked */
++ dma_rmb();
++
++ get_64bit_val(aeqe, 0, &compl_ctx);
++
+ print_hex_dump_debug("WQE: AEQ_ENTRY WQE", DUMP_PREFIX_OFFSET, 16, 8,
+ aeqe, 16, false);
+
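Aside (illustration only, not from the patch): the dma_rmb() barriers added above ensure a completion entry's payload is read only after its valid/polarity bit has been observed; without the barrier the CPU may reorder the loads and read stale data written before the device finished the entry. A minimal sketch of the pattern, with a made-up descriptor layout:

#include <linux/errno.h>
#include <linux/kernel.h>
#include <asm/barrier.h>

/* Hypothetical DMA'd completion entry: the device writes the payload
 * first and sets the valid bit with its final store.
 */
struct demo_cqe {
	__le64 payload;
	u8 valid;
};

static int demo_poll_one(struct demo_cqe *cqe, u8 expected_polarity, u64 *out)
{
	if ((cqe->valid & 1) != expected_polarity)
		return -ENOENT;	/* entry not produced yet */

	/*
	 * Order the payload load after the valid-bit load; otherwise the
	 * CPU may speculatively read a payload from before the device
	 * finished writing this entry.
	 */
	dma_rmb();

	*out = le64_to_cpu(cqe->payload);
	return 0;
}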
+diff --git a/drivers/infiniband/hw/irdma/defs.h b/drivers/infiniband/hw/irdma/defs.h
+index 6014b9d06a9ba..d06e45d2c23fd 100644
+--- a/drivers/infiniband/hw/irdma/defs.h
++++ b/drivers/infiniband/hw/irdma/defs.h
+@@ -191,32 +191,30 @@ enum irdma_cqp_op_type {
+ IRDMA_OP_MANAGE_VF_PBLE_BP = 25,
+ IRDMA_OP_QUERY_FPM_VAL = 26,
+ IRDMA_OP_COMMIT_FPM_VAL = 27,
+- IRDMA_OP_REQ_CMDS = 28,
+- IRDMA_OP_CMPL_CMDS = 29,
+- IRDMA_OP_AH_CREATE = 30,
+- IRDMA_OP_AH_MODIFY = 31,
+- IRDMA_OP_AH_DESTROY = 32,
+- IRDMA_OP_MC_CREATE = 33,
+- IRDMA_OP_MC_DESTROY = 34,
+- IRDMA_OP_MC_MODIFY = 35,
+- IRDMA_OP_STATS_ALLOCATE = 36,
+- IRDMA_OP_STATS_FREE = 37,
+- IRDMA_OP_STATS_GATHER = 38,
+- IRDMA_OP_WS_ADD_NODE = 39,
+- IRDMA_OP_WS_MODIFY_NODE = 40,
+- IRDMA_OP_WS_DELETE_NODE = 41,
+- IRDMA_OP_WS_FAILOVER_START = 42,
+- IRDMA_OP_WS_FAILOVER_COMPLETE = 43,
+- IRDMA_OP_SET_UP_MAP = 44,
+- IRDMA_OP_GEN_AE = 45,
+- IRDMA_OP_QUERY_RDMA_FEATURES = 46,
+- IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY = 47,
+- IRDMA_OP_ADD_LOCAL_MAC_ENTRY = 48,
+- IRDMA_OP_DELETE_LOCAL_MAC_ENTRY = 49,
+- IRDMA_OP_CQ_MODIFY = 50,
++ IRDMA_OP_AH_CREATE = 28,
++ IRDMA_OP_AH_MODIFY = 29,
++ IRDMA_OP_AH_DESTROY = 30,
++ IRDMA_OP_MC_CREATE = 31,
++ IRDMA_OP_MC_DESTROY = 32,
++ IRDMA_OP_MC_MODIFY = 33,
++ IRDMA_OP_STATS_ALLOCATE = 34,
++ IRDMA_OP_STATS_FREE = 35,
++ IRDMA_OP_STATS_GATHER = 36,
++ IRDMA_OP_WS_ADD_NODE = 37,
++ IRDMA_OP_WS_MODIFY_NODE = 38,
++ IRDMA_OP_WS_DELETE_NODE = 39,
++ IRDMA_OP_WS_FAILOVER_START = 40,
++ IRDMA_OP_WS_FAILOVER_COMPLETE = 41,
++ IRDMA_OP_SET_UP_MAP = 42,
++ IRDMA_OP_GEN_AE = 43,
++ IRDMA_OP_QUERY_RDMA_FEATURES = 44,
++ IRDMA_OP_ALLOC_LOCAL_MAC_ENTRY = 45,
++ IRDMA_OP_ADD_LOCAL_MAC_ENTRY = 46,
++ IRDMA_OP_DELETE_LOCAL_MAC_ENTRY = 47,
++ IRDMA_OP_CQ_MODIFY = 48,
+
+ /* Must be last entry*/
+- IRDMA_MAX_CQP_OPS = 51,
++ IRDMA_MAX_CQP_OPS = 49,
+ };
+
+ /* CQP SQ WQES */
+diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c
+index 795f7fd4f2574..457368e324e10 100644
+--- a/drivers/infiniband/hw/irdma/hw.c
++++ b/drivers/infiniband/hw/irdma/hw.c
+@@ -191,6 +191,7 @@ static void irdma_set_flush_fields(struct irdma_sc_qp *qp,
+ case IRDMA_AE_AMP_MWBIND_INVALID_RIGHTS:
+ case IRDMA_AE_AMP_MWBIND_BIND_DISABLED:
+ case IRDMA_AE_AMP_MWBIND_INVALID_BOUNDS:
++ case IRDMA_AE_AMP_MWBIND_VALID_STAG:
+ qp->flush_code = FLUSH_MW_BIND_ERR;
+ qp->event_type = IRDMA_QP_EVENT_ACCESS_ERR;
+ break;
+@@ -2075,7 +2076,7 @@ void irdma_cqp_ce_handler(struct irdma_pci_f *rf, struct irdma_sc_cq *cq)
+ cqp_request->compl_info.error = info.error;
+
+ if (cqp_request->waiting) {
+- cqp_request->request_done = true;
++ WRITE_ONCE(cqp_request->request_done, true);
+ wake_up(&cqp_request->waitq);
+ irdma_put_cqp_request(&rf->cqp, cqp_request);
+ } else {
+diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h
+index def6dd58dcd48..2323962cdeacb 100644
+--- a/drivers/infiniband/hw/irdma/main.h
++++ b/drivers/infiniband/hw/irdma/main.h
+@@ -161,8 +161,8 @@ struct irdma_cqp_request {
+ void (*callback_fcn)(struct irdma_cqp_request *cqp_request);
+ void *param;
+ struct irdma_cqp_compl_info compl_info;
++ bool request_done; /* READ/WRITE_ONCE macros operate on it */
+ bool waiting:1;
+- bool request_done:1;
+ bool dynamic:1;
+ };
+
+diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c
+index 4ec9639f1bdbf..562531712ea44 100644
+--- a/drivers/infiniband/hw/irdma/puda.c
++++ b/drivers/infiniband/hw/irdma/puda.c
+@@ -230,6 +230,9 @@ static int irdma_puda_poll_info(struct irdma_sc_cq *cq,
+ if (valid_bit != cq_uk->polarity)
+ return -ENOENT;
+
++ /* Ensure CQE contents are read after valid bit is checked */
++ dma_rmb();
++
+ if (cq->dev->hw_attrs.uk_attrs.hw_rev >= IRDMA_GEN_2)
+ ext_valid = (bool)FIELD_GET(IRDMA_CQ_EXTCQE, qword3);
+
+@@ -243,6 +246,9 @@ static int irdma_puda_poll_info(struct irdma_sc_cq *cq,
+ if (polarity != cq_uk->polarity)
+ return -ENOENT;
+
++ /* Ensure ext CQE contents are read after ext valid bit is checked */
++ dma_rmb();
++
+ IRDMA_RING_MOVE_HEAD_NOCHECK(cq_uk->cq_ring);
+ if (!IRDMA_RING_CURRENT_HEAD(cq_uk->cq_ring))
+ cq_uk->polarity = !cq_uk->polarity;
+diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h
+index 5ee68604e59fc..a20709577ab0a 100644
+--- a/drivers/infiniband/hw/irdma/type.h
++++ b/drivers/infiniband/hw/irdma/type.h
+@@ -365,6 +365,8 @@ struct irdma_sc_cqp {
+ struct irdma_dcqcn_cc_params dcqcn_params;
+ __le64 *host_ctx;
+ u64 *scratch_array;
++ u64 requested_ops;
++ atomic64_t completed_ops;
+ u32 cqp_id;
+ u32 sq_size;
+ u32 hw_sq_size;
+diff --git a/drivers/infiniband/hw/irdma/uk.c b/drivers/infiniband/hw/irdma/uk.c
+index dd428d915c175..280d633d4ec4f 100644
+--- a/drivers/infiniband/hw/irdma/uk.c
++++ b/drivers/infiniband/hw/irdma/uk.c
+@@ -1161,7 +1161,7 @@ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq,
+ }
+ wqe_idx = (u32)FIELD_GET(IRDMA_CQ_WQEIDX, qword3);
+ info->qp_handle = (irdma_qp_handle)(unsigned long)qp;
+- info->op_type = (u8)FIELD_GET(IRDMA_CQ_SQ, qword3);
++ info->op_type = (u8)FIELD_GET(IRDMACQ_OP, qword3);
+
+ if (info->q_type == IRDMA_CQE_QTYPE_RQ) {
+ u32 array_idx;
+@@ -1527,6 +1527,9 @@ void irdma_uk_clean_cq(void *q, struct irdma_cq_uk *cq)
+ if (polarity != temp)
+ break;
+
++ /* Ensure CQE contents are read after valid bit is checked */
++ dma_rmb();
++
+ get_64bit_val(cqe, 8, &comp_ctx);
+ if ((void *)(unsigned long)comp_ctx == q)
+ set_64bit_val(cqe, 8, 0);
+diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c
+index 71e1c5d347092..eb083f70b09ff 100644
+--- a/drivers/infiniband/hw/irdma/utils.c
++++ b/drivers/infiniband/hw/irdma/utils.c
+@@ -481,7 +481,7 @@ void irdma_free_cqp_request(struct irdma_cqp *cqp,
+ if (cqp_request->dynamic) {
+ kfree(cqp_request);
+ } else {
+- cqp_request->request_done = false;
++ WRITE_ONCE(cqp_request->request_done, false);
+ cqp_request->callback_fcn = NULL;
+ cqp_request->waiting = false;
+
+@@ -515,7 +515,7 @@ irdma_free_pending_cqp_request(struct irdma_cqp *cqp,
+ {
+ if (cqp_request->waiting) {
+ cqp_request->compl_info.error = true;
+- cqp_request->request_done = true;
++ WRITE_ONCE(cqp_request->request_done, true);
+ wake_up(&cqp_request->waitq);
+ }
+ wait_event_timeout(cqp->remove_wq,
+@@ -567,11 +567,11 @@ static int irdma_wait_event(struct irdma_pci_f *rf,
+ bool cqp_error = false;
+ int err_code = 0;
+
+- cqp_timeout.compl_cqp_cmds = rf->sc_dev.cqp_cmd_stats[IRDMA_OP_CMPL_CMDS];
++ cqp_timeout.compl_cqp_cmds = atomic64_read(&rf->sc_dev.cqp->completed_ops);
+ do {
+ irdma_cqp_ce_handler(rf, &rf->ccq.sc_cq);
+ if (wait_event_timeout(cqp_request->waitq,
+- cqp_request->request_done,
++ READ_ONCE(cqp_request->request_done),
+ msecs_to_jiffies(CQP_COMPL_WAIT_TIME_MS)))
+ break;
+
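Aside (illustration only, not from the patch): the irdma hunks above turn request_done into a plain bool updated with WRITE_ONCE() and polled with READ_ONCE(), since a flag shared between the completion handler and a sleeping waiter must not live in a bitfield word touched by other writers, and must not be torn or cached across the wait loop. A minimal sketch of the flag-plus-waitqueue pattern, with invented names:

#include <linux/compiler.h>
#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/wait.h>

struct demo_request {
	wait_queue_head_t waitq;
	bool done;		/* accessed only via READ_ONCE()/WRITE_ONCE() */
};

static void demo_request_init(struct demo_request *req)
{
	init_waitqueue_head(&req->waitq);
	WRITE_ONCE(req->done, false);
}

/* Completion side, e.g. called from the CQ handler. */
static void demo_complete(struct demo_request *req)
{
	WRITE_ONCE(req->done, true);
	wake_up(&req->waitq);
}

/* Waiting side: 0 on completion, -ETIMEDOUT if nothing arrived in time. */
static int demo_wait(struct demo_request *req, unsigned int timeout_ms)
{
	if (!wait_event_timeout(req->waitq, READ_ONCE(req->done),
				msecs_to_jiffies(timeout_ms)))
		return -ETIMEDOUT;

	return 0;
}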
+diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
+index 456656617c33f..9d08aa99f3cb0 100644
+--- a/drivers/infiniband/hw/mlx4/qp.c
++++ b/drivers/infiniband/hw/mlx4/qp.c
+@@ -565,15 +565,15 @@ static int set_qp_rss(struct mlx4_ib_dev *dev, struct mlx4_ib_rss *rss_ctx,
+ return (-EOPNOTSUPP);
+ }
+
+- if (ucmd->rx_hash_fields_mask & ~(MLX4_IB_RX_HASH_SRC_IPV4 |
+- MLX4_IB_RX_HASH_DST_IPV4 |
+- MLX4_IB_RX_HASH_SRC_IPV6 |
+- MLX4_IB_RX_HASH_DST_IPV6 |
+- MLX4_IB_RX_HASH_SRC_PORT_TCP |
+- MLX4_IB_RX_HASH_DST_PORT_TCP |
+- MLX4_IB_RX_HASH_SRC_PORT_UDP |
+- MLX4_IB_RX_HASH_DST_PORT_UDP |
+- MLX4_IB_RX_HASH_INNER)) {
++ if (ucmd->rx_hash_fields_mask & ~(u64)(MLX4_IB_RX_HASH_SRC_IPV4 |
++ MLX4_IB_RX_HASH_DST_IPV4 |
++ MLX4_IB_RX_HASH_SRC_IPV6 |
++ MLX4_IB_RX_HASH_DST_IPV6 |
++ MLX4_IB_RX_HASH_SRC_PORT_TCP |
++ MLX4_IB_RX_HASH_DST_PORT_TCP |
++ MLX4_IB_RX_HASH_SRC_PORT_UDP |
++ MLX4_IB_RX_HASH_DST_PORT_UDP |
++ MLX4_IB_RX_HASH_INNER)) {
+ pr_debug("RX Hash fields_mask has unsupported mask (0x%llx)\n",
+ ucmd->rx_hash_fields_mask);
+ return (-EOPNOTSUPP);
+diff --git a/drivers/infiniband/hw/mthca/mthca_qp.c b/drivers/infiniband/hw/mthca/mthca_qp.c
+index 69bba0ef4a5df..53f43649f7d08 100644
+--- a/drivers/infiniband/hw/mthca/mthca_qp.c
++++ b/drivers/infiniband/hw/mthca/mthca_qp.c
+@@ -1393,7 +1393,7 @@ int mthca_alloc_sqp(struct mthca_dev *dev,
+ if (mthca_array_get(&dev->qp_table.qp, mqpn))
+ err = -EBUSY;
+ else
+- mthca_array_set(&dev->qp_table.qp, mqpn, qp->sqp);
++ mthca_array_set(&dev->qp_table.qp, mqpn, qp);
+ spin_unlock_irq(&dev->qp_table.lock);
+
+ if (err)
+diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
+index 29d05663d4d17..ed2937a4e196f 100644
+--- a/drivers/iommu/iommufd/device.c
++++ b/drivers/iommu/iommufd/device.c
+@@ -109,10 +109,7 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_bind, IOMMUFD);
+ */
+ void iommufd_device_unbind(struct iommufd_device *idev)
+ {
+- bool was_destroyed;
+-
+- was_destroyed = iommufd_object_destroy_user(idev->ictx, &idev->obj);
+- WARN_ON(!was_destroyed);
++ iommufd_object_destroy_user(idev->ictx, &idev->obj);
+ }
+ EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD);
+
+@@ -382,7 +379,7 @@ void iommufd_device_detach(struct iommufd_device *idev)
+ mutex_unlock(&hwpt->devices_lock);
+
+ if (hwpt->auto_domain)
+- iommufd_object_destroy_user(idev->ictx, &hwpt->obj);
++ iommufd_object_deref_user(idev->ictx, &hwpt->obj);
+ else
+ refcount_dec(&hwpt->obj.users);
+
+@@ -456,10 +453,7 @@ EXPORT_SYMBOL_NS_GPL(iommufd_access_create, IOMMUFD);
+ */
+ void iommufd_access_destroy(struct iommufd_access *access)
+ {
+- bool was_destroyed;
+-
+- was_destroyed = iommufd_object_destroy_user(access->ictx, &access->obj);
+- WARN_ON(!was_destroyed);
++ iommufd_object_destroy_user(access->ictx, &access->obj);
+ }
+ EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD);
+
+diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
+index b38e67d1988bd..f9790983699ce 100644
+--- a/drivers/iommu/iommufd/iommufd_private.h
++++ b/drivers/iommu/iommufd/iommufd_private.h
+@@ -176,8 +176,19 @@ void iommufd_object_abort_and_destroy(struct iommufd_ctx *ictx,
+ struct iommufd_object *obj);
+ void iommufd_object_finalize(struct iommufd_ctx *ictx,
+ struct iommufd_object *obj);
+-bool iommufd_object_destroy_user(struct iommufd_ctx *ictx,
+- struct iommufd_object *obj);
++void __iommufd_object_destroy_user(struct iommufd_ctx *ictx,
++ struct iommufd_object *obj, bool allow_fail);
++static inline void iommufd_object_destroy_user(struct iommufd_ctx *ictx,
++ struct iommufd_object *obj)
++{
++ __iommufd_object_destroy_user(ictx, obj, false);
++}
++static inline void iommufd_object_deref_user(struct iommufd_ctx *ictx,
++ struct iommufd_object *obj)
++{
++ __iommufd_object_destroy_user(ictx, obj, true);
++}
++
+ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
+ size_t size,
+ enum iommufd_object_type type);
+diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
+index 3fbe636c3d8a6..4cf5f73f27084 100644
+--- a/drivers/iommu/iommufd/main.c
++++ b/drivers/iommu/iommufd/main.c
+@@ -116,14 +116,56 @@ struct iommufd_object *iommufd_get_object(struct iommufd_ctx *ictx, u32 id,
+ return obj;
+ }
+
++/*
++ * Remove the given object id from the xarray if the only reference to the
++ * object is held by the xarray. The caller must call ops destroy().
++ */
++static struct iommufd_object *iommufd_object_remove(struct iommufd_ctx *ictx,
++ u32 id, bool extra_put)
++{
++ struct iommufd_object *obj;
++ XA_STATE(xas, &ictx->objects, id);
++
++ xa_lock(&ictx->objects);
++ obj = xas_load(&xas);
++ if (xa_is_zero(obj) || !obj) {
++ obj = ERR_PTR(-ENOENT);
++ goto out_xa;
++ }
++
++ /*
++ * If the caller is holding a ref on obj we put it here under the
++ * spinlock.
++ */
++ if (extra_put)
++ refcount_dec(&obj->users);
++
++ if (!refcount_dec_if_one(&obj->users)) {
++ obj = ERR_PTR(-EBUSY);
++ goto out_xa;
++ }
++
++ xas_store(&xas, NULL);
++ if (ictx->vfio_ioas == container_of(obj, struct iommufd_ioas, obj))
++ ictx->vfio_ioas = NULL;
++
++out_xa:
++ xa_unlock(&ictx->objects);
++
++ /* The returned object reference count is zero */
++ return obj;
++}
++
+ /*
+ * The caller holds a users refcount and wants to destroy the object. Returns
+ * true if the object was destroyed. In all cases the caller no longer has a
+ * reference on obj.
+ */
+-bool iommufd_object_destroy_user(struct iommufd_ctx *ictx,
+- struct iommufd_object *obj)
++void __iommufd_object_destroy_user(struct iommufd_ctx *ictx,
++ struct iommufd_object *obj, bool allow_fail)
+ {
++ struct iommufd_object *ret;
++
+ /*
+ * The purpose of the destroy_rwsem is to ensure deterministic
+ * destruction of objects used by external drivers and destroyed by this
+@@ -131,22 +173,22 @@ bool iommufd_object_destroy_user(struct iommufd_ctx *ictx,
+ * side of this, such as during ioctl execution.
+ */
+ down_write(&obj->destroy_rwsem);
+- xa_lock(&ictx->objects);
+- refcount_dec(&obj->users);
+- if (!refcount_dec_if_one(&obj->users)) {
+- xa_unlock(&ictx->objects);
+- up_write(&obj->destroy_rwsem);
+- return false;
+- }
+- __xa_erase(&ictx->objects, obj->id);
+- if (ictx->vfio_ioas && &ictx->vfio_ioas->obj == obj)
+- ictx->vfio_ioas = NULL;
+- xa_unlock(&ictx->objects);
++ ret = iommufd_object_remove(ictx, obj->id, true);
+ up_write(&obj->destroy_rwsem);
+
++ if (allow_fail && IS_ERR(ret))
++ return;
++
++ /*
++	 * If there is a bug and we couldn't destroy the object, we already put
++	 * back the caller's refcount and will eventually try to free the object
++	 * again during close.
++ */
++ if (WARN_ON(IS_ERR(ret)))
++ return;
++
+ iommufd_object_ops[obj->type].destroy(obj);
+ kfree(obj);
+- return true;
+ }
+
+ static int iommufd_destroy(struct iommufd_ucmd *ucmd)
+@@ -154,13 +196,11 @@ static int iommufd_destroy(struct iommufd_ucmd *ucmd)
+ struct iommu_destroy *cmd = ucmd->cmd;
+ struct iommufd_object *obj;
+
+- obj = iommufd_get_object(ucmd->ictx, cmd->id, IOMMUFD_OBJ_ANY);
++ obj = iommufd_object_remove(ucmd->ictx, cmd->id, false);
+ if (IS_ERR(obj))
+ return PTR_ERR(obj);
+- iommufd_ref_to_users(obj);
+- /* See iommufd_ref_to_users() */
+- if (!iommufd_object_destroy_user(ucmd->ictx, obj))
+- return -EBUSY;
++ iommufd_object_ops[obj->type].destroy(obj);
++ kfree(obj);
+ return 0;
+ }
+
+diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
+index 3c47846cc5efe..d80669c4caf4f 100644
+--- a/drivers/iommu/iommufd/pages.c
++++ b/drivers/iommu/iommufd/pages.c
+@@ -297,7 +297,7 @@ static void batch_clear_carry(struct pfn_batch *batch, unsigned int keep_pfns)
+ batch->pfns[0] = batch->pfns[batch->end - 1] +
+ (batch->npfns[batch->end - 1] - keep_pfns);
+ batch->npfns[0] = keep_pfns;
+- batch->end = 0;
++ batch->end = 1;
+ }
+
+ static void batch_skip_carry(struct pfn_batch *batch, unsigned int skip_pfns)
+diff --git a/drivers/irqchip/irq-bcm6345-l1.c b/drivers/irqchip/irq-bcm6345-l1.c
+index fa113cb2529a4..6341c0167c4ab 100644
+--- a/drivers/irqchip/irq-bcm6345-l1.c
++++ b/drivers/irqchip/irq-bcm6345-l1.c
+@@ -82,6 +82,7 @@ struct bcm6345_l1_chip {
+ };
+
+ struct bcm6345_l1_cpu {
++ struct bcm6345_l1_chip *intc;
+ void __iomem *map_base;
+ unsigned int parent_irq;
+ u32 enable_cache[];
+@@ -115,17 +116,11 @@ static inline unsigned int cpu_for_irq(struct bcm6345_l1_chip *intc,
+
+ static void bcm6345_l1_irq_handle(struct irq_desc *desc)
+ {
+- struct bcm6345_l1_chip *intc = irq_desc_get_handler_data(desc);
+- struct bcm6345_l1_cpu *cpu;
++ struct bcm6345_l1_cpu *cpu = irq_desc_get_handler_data(desc);
++ struct bcm6345_l1_chip *intc = cpu->intc;
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ unsigned int idx;
+
+-#ifdef CONFIG_SMP
+- cpu = intc->cpus[cpu_logical_map(smp_processor_id())];
+-#else
+- cpu = intc->cpus[0];
+-#endif
+-
+ chained_irq_enter(chip, desc);
+
+ for (idx = 0; idx < intc->n_words; idx++) {
+@@ -253,6 +248,7 @@ static int __init bcm6345_l1_init_one(struct device_node *dn,
+ if (!cpu)
+ return -ENOMEM;
+
++ cpu->intc = intc;
+ cpu->map_base = ioremap(res.start, sz);
+ if (!cpu->map_base)
+ return -ENOMEM;
+@@ -271,7 +267,7 @@ static int __init bcm6345_l1_init_one(struct device_node *dn,
+ return -EINVAL;
+ }
+ irq_set_chained_handler_and_data(cpu->parent_irq,
+- bcm6345_l1_irq_handle, intc);
++ bcm6345_l1_irq_handle, cpu);
+
+ return 0;
+ }
+diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
+index 0ec2b1e1df75b..c5cb2830e8537 100644
+--- a/drivers/irqchip/irq-gic-v3-its.c
++++ b/drivers/irqchip/irq-gic-v3-its.c
+@@ -273,13 +273,23 @@ static void vpe_to_cpuid_unlock(struct its_vpe *vpe, unsigned long flags)
+ raw_spin_unlock_irqrestore(&vpe->vpe_lock, flags);
+ }
+
++static struct irq_chip its_vpe_irq_chip;
++
+ static int irq_to_cpuid_lock(struct irq_data *d, unsigned long *flags)
+ {
+- struct its_vlpi_map *map = get_vlpi_map(d);
++ struct its_vpe *vpe = NULL;
+ int cpu;
+
+- if (map) {
+- cpu = vpe_to_cpuid_lock(map->vpe, flags);
++ if (d->chip == &its_vpe_irq_chip) {
++ vpe = irq_data_get_irq_chip_data(d);
++ } else {
++ struct its_vlpi_map *map = get_vlpi_map(d);
++ if (map)
++ vpe = map->vpe;
++ }
++
++ if (vpe) {
++ cpu = vpe_to_cpuid_lock(vpe, flags);
+ } else {
+ /* Physical LPIs are already locked via the irq_desc lock */
+ struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+@@ -293,10 +303,18 @@ static int irq_to_cpuid_lock(struct irq_data *d, unsigned long *flags)
+
+ static void irq_to_cpuid_unlock(struct irq_data *d, unsigned long flags)
+ {
+- struct its_vlpi_map *map = get_vlpi_map(d);
++ struct its_vpe *vpe = NULL;
++
++ if (d->chip == &its_vpe_irq_chip) {
++ vpe = irq_data_get_irq_chip_data(d);
++ } else {
++ struct its_vlpi_map *map = get_vlpi_map(d);
++ if (map)
++ vpe = map->vpe;
++ }
+
+- if (map)
+- vpe_to_cpuid_unlock(map->vpe, flags);
++ if (vpe)
++ vpe_to_cpuid_unlock(vpe, flags);
+ }
+
+ static struct its_collection *valid_col(struct its_collection *col)
+@@ -1433,14 +1451,29 @@ static void wait_for_syncr(void __iomem *rdbase)
+ cpu_relax();
+ }
+
+-static void direct_lpi_inv(struct irq_data *d)
++static void __direct_lpi_inv(struct irq_data *d, u64 val)
+ {
+- struct its_vlpi_map *map = get_vlpi_map(d);
+ void __iomem *rdbase;
+ unsigned long flags;
+- u64 val;
+ int cpu;
+
++ /* Target the redistributor this LPI is currently routed to */
++ cpu = irq_to_cpuid_lock(d, &flags);
++ raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock);
++
++ rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base;
++ gic_write_lpir(val, rdbase + GICR_INVLPIR);
++ wait_for_syncr(rdbase);
++
++ raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock);
++ irq_to_cpuid_unlock(d, flags);
++}
++
++static void direct_lpi_inv(struct irq_data *d)
++{
++ struct its_vlpi_map *map = get_vlpi_map(d);
++ u64 val;
++
+ if (map) {
+ struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+
+@@ -1453,15 +1486,7 @@ static void direct_lpi_inv(struct irq_data *d)
+ val = d->hwirq;
+ }
+
+- /* Target the redistributor this LPI is currently routed to */
+- cpu = irq_to_cpuid_lock(d, &flags);
+- raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock);
+- rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base;
+- gic_write_lpir(val, rdbase + GICR_INVLPIR);
+-
+- wait_for_syncr(rdbase);
+- raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock);
+- irq_to_cpuid_unlock(d, flags);
++ __direct_lpi_inv(d, val);
+ }
+
+ static void lpi_update_config(struct irq_data *d, u8 clr, u8 set)
+@@ -3952,18 +3977,10 @@ static void its_vpe_send_inv(struct irq_data *d)
+ {
+ struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
+
+- if (gic_rdists->has_direct_lpi) {
+- void __iomem *rdbase;
+-
+- /* Target the redistributor this VPE is currently known on */
+- raw_spin_lock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock);
+- rdbase = per_cpu_ptr(gic_rdists->rdist, vpe->col_idx)->rd_base;
+- gic_write_lpir(d->parent_data->hwirq, rdbase + GICR_INVLPIR);
+- wait_for_syncr(rdbase);
+- raw_spin_unlock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock);
+- } else {
++ if (gic_rdists->has_direct_lpi)
++ __direct_lpi_inv(d, d->parent_data->hwirq);
++ else
+ its_vpe_send_cmd(vpe, its_send_inv);
+- }
+ }
+
+ static void its_vpe_mask_irq(struct irq_data *d)
+diff --git a/drivers/md/dm-cache-policy-smq.c b/drivers/md/dm-cache-policy-smq.c
+index 493a8715dc8f8..8bd2ad743d9ae 100644
+--- a/drivers/md/dm-cache-policy-smq.c
++++ b/drivers/md/dm-cache-policy-smq.c
+@@ -857,7 +857,13 @@ struct smq_policy {
+
+ struct background_tracker *bg_work;
+
+- bool migrations_allowed;
++ bool migrations_allowed:1;
++
++ /*
++	 * If this is set, the policy will try to clean the whole cache
++ * even if the device is not idle.
++ */
++ bool cleaner:1;
+ };
+
+ /*----------------------------------------------------------------*/
+@@ -1138,7 +1144,7 @@ static bool clean_target_met(struct smq_policy *mq, bool idle)
+ * Cache entries may not be populated. So we cannot rely on the
+ * size of the clean queue.
+ */
+- if (idle) {
++ if (idle || mq->cleaner) {
+ /*
+ * We'd like to clean everything.
+ */
+@@ -1722,11 +1728,9 @@ static void calc_hotspot_params(sector_t origin_size,
+ *hotspot_block_size /= 2u;
+ }
+
+-static struct dm_cache_policy *__smq_create(dm_cblock_t cache_size,
+- sector_t origin_size,
+- sector_t cache_block_size,
+- bool mimic_mq,
+- bool migrations_allowed)
++static struct dm_cache_policy *
++__smq_create(dm_cblock_t cache_size, sector_t origin_size, sector_t cache_block_size,
++ bool mimic_mq, bool migrations_allowed, bool cleaner)
+ {
+ unsigned int i;
+ unsigned int nr_sentinels_per_queue = 2u * NR_CACHE_LEVELS;
+@@ -1813,6 +1817,7 @@ static struct dm_cache_policy *__smq_create(dm_cblock_t cache_size,
+ goto bad_btracker;
+
+ mq->migrations_allowed = migrations_allowed;
++ mq->cleaner = cleaner;
+
+ return &mq->policy;
+
+@@ -1836,21 +1841,24 @@ static struct dm_cache_policy *smq_create(dm_cblock_t cache_size,
+ sector_t origin_size,
+ sector_t cache_block_size)
+ {
+- return __smq_create(cache_size, origin_size, cache_block_size, false, true);
++ return __smq_create(cache_size, origin_size, cache_block_size,
++ false, true, false);
+ }
+
+ static struct dm_cache_policy *mq_create(dm_cblock_t cache_size,
+ sector_t origin_size,
+ sector_t cache_block_size)
+ {
+- return __smq_create(cache_size, origin_size, cache_block_size, true, true);
++ return __smq_create(cache_size, origin_size, cache_block_size,
++ true, true, false);
+ }
+
+ static struct dm_cache_policy *cleaner_create(dm_cblock_t cache_size,
+ sector_t origin_size,
+ sector_t cache_block_size)
+ {
+- return __smq_create(cache_size, origin_size, cache_block_size, false, false);
++ return __smq_create(cache_size, origin_size, cache_block_size,
++ false, false, true);
+ }
+
+ /*----------------------------------------------------------------*/
+diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
+index c8821fcb82998..de3dd6e6bb892 100644
+--- a/drivers/md/dm-raid.c
++++ b/drivers/md/dm-raid.c
+@@ -3251,8 +3251,7 @@ size_check:
+ r = md_start(&rs->md);
+ if (r) {
+ ti->error = "Failed to start raid array";
+- mddev_unlock(&rs->md);
+- goto bad_md_start;
++ goto bad_unlock;
+ }
+
+ /* If raid4/5/6 journal mode explicitly requested (only possible with journal dev) -> set it */
+@@ -3260,8 +3259,7 @@ size_check:
+ r = r5c_journal_mode_set(&rs->md, rs->journal_dev.mode);
+ if (r) {
+ ti->error = "Failed to set raid4/5/6 journal mode";
+- mddev_unlock(&rs->md);
+- goto bad_journal_mode_set;
++ goto bad_unlock;
+ }
+ }
+
+@@ -3272,14 +3270,14 @@ size_check:
+ if (rs_is_raid456(rs)) {
+ r = rs_set_raid456_stripe_cache(rs);
+ if (r)
+- goto bad_stripe_cache;
++ goto bad_unlock;
+ }
+
+ /* Now do an early reshape check */
+ if (test_bit(RT_FLAG_RESHAPE_RS, &rs->runtime_flags)) {
+ r = rs_check_reshape(rs);
+ if (r)
+- goto bad_check_reshape;
++ goto bad_unlock;
+
+ /* Restore new, ctr requested layout to perform check */
+ rs_config_restore(rs, &rs_layout);
+@@ -3288,7 +3286,7 @@ size_check:
+ r = rs->md.pers->check_reshape(&rs->md);
+ if (r) {
+ ti->error = "Reshape check failed";
+- goto bad_check_reshape;
++ goto bad_unlock;
+ }
+ }
+ }
+@@ -3299,11 +3297,9 @@ size_check:
+ mddev_unlock(&rs->md);
+ return 0;
+
+-bad_md_start:
+-bad_journal_mode_set:
+-bad_stripe_cache:
+-bad_check_reshape:
++bad_unlock:
+ md_stop(&rs->md);
++ mddev_unlock(&rs->md);
+ bad:
+ raid_set_free(rs);
+
+@@ -3314,7 +3310,9 @@ static void raid_dtr(struct dm_target *ti)
+ {
+ struct raid_set *rs = ti->private;
+
++ mddev_lock_nointr(&rs->md);
+ md_stop(&rs->md);
++ mddev_unlock(&rs->md);
+ raid_set_free(rs);
+ }
+
+diff --git a/drivers/md/md.c b/drivers/md/md.c
+index 18384251399ab..32d7ba8069aef 100644
+--- a/drivers/md/md.c
++++ b/drivers/md/md.c
+@@ -6260,6 +6260,8 @@ static void __md_stop(struct mddev *mddev)
+
+ void md_stop(struct mddev *mddev)
+ {
++ lockdep_assert_held(&mddev->reconfig_mutex);
++
+ /* stop the array and free an attached data structures.
+ * This is called from dm-raid
+ */
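Aside (illustration only, not from the patch): the md/dm-raid hunks above make every md_stop() caller hold mddev->reconfig_mutex and encode that rule with lockdep_assert_held(), so a missing lock shows up as a lockdep splat instead of a silent race. A tiny sketch of documenting a locking requirement this way, with invented names:

#include <linux/lockdep.h>
#include <linux/mutex.h>

struct demo_dev {
	struct mutex cfg_lock;
	int running;
};

/* Must be called with dev->cfg_lock held; lockdep enforces the rule. */
static void demo_stop(struct demo_dev *dev)
{
	lockdep_assert_held(&dev->cfg_lock);
	dev->running = 0;
}

static void demo_teardown(struct demo_dev *dev)
{
	mutex_lock(&dev->cfg_lock);
	demo_stop(dev);
	mutex_unlock(&dev->cfg_lock);
}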
+diff --git a/drivers/media/i2c/tc358746.c b/drivers/media/i2c/tc358746.c
+index ec1a193ba161a..25fbce5cabdaa 100644
+--- a/drivers/media/i2c/tc358746.c
++++ b/drivers/media/i2c/tc358746.c
+@@ -813,8 +813,8 @@ static unsigned long tc358746_find_pll_settings(struct tc358746 *tc358746,
+ u32 min_delta = 0xffffffff;
+ u16 prediv_max = 17;
+ u16 prediv_min = 1;
+- u16 m_best, mul;
+- u16 p_best, p;
++ u16 m_best = 0, mul;
++ u16 p_best = 1, p;
+ u8 postdiv;
+
+ if (fout > 1000 * HZ_PER_MHZ) {
+diff --git a/drivers/media/platform/amphion/vpu_core.c b/drivers/media/platform/amphion/vpu_core.c
+index de23627a119a0..82bf8b3be66a2 100644
+--- a/drivers/media/platform/amphion/vpu_core.c
++++ b/drivers/media/platform/amphion/vpu_core.c
+@@ -826,7 +826,7 @@ static const struct dev_pm_ops vpu_core_pm_ops = {
+
+ static struct vpu_core_resources imx8q_enc = {
+ .type = VPU_CORE_TYPE_ENC,
+- .fwname = "vpu/vpu_fw_imx8_enc.bin",
++ .fwname = "amphion/vpu/vpu_fw_imx8_enc.bin",
+ .stride = 16,
+ .max_width = 1920,
+ .max_height = 1920,
+@@ -841,7 +841,7 @@ static struct vpu_core_resources imx8q_enc = {
+
+ static struct vpu_core_resources imx8q_dec = {
+ .type = VPU_CORE_TYPE_DEC,
+- .fwname = "vpu/vpu_fw_imx8_dec.bin",
++ .fwname = "amphion/vpu/vpu_fw_imx8_dec.bin",
+ .stride = 256,
+ .max_width = 8188,
+ .max_height = 8188,
+diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+index 0051f372a66cf..40cb3cb87ba17 100644
+--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
++++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+@@ -936,148 +936,6 @@ static int mtk_jpeg_set_dec_dst(struct mtk_jpeg_ctx *ctx,
+ return 0;
+ }
+
+-static int mtk_jpegenc_get_hw(struct mtk_jpeg_ctx *ctx)
+-{
+- struct mtk_jpegenc_comp_dev *comp_jpeg;
+- struct mtk_jpeg_dev *jpeg = ctx->jpeg;
+- unsigned long flags;
+- int hw_id = -1;
+- int i;
+-
+- spin_lock_irqsave(&jpeg->hw_lock, flags);
+- for (i = 0; i < MTK_JPEGENC_HW_MAX; i++) {
+- comp_jpeg = jpeg->enc_hw_dev[i];
+- if (comp_jpeg->hw_state == MTK_JPEG_HW_IDLE) {
+- hw_id = i;
+- comp_jpeg->hw_state = MTK_JPEG_HW_BUSY;
+- break;
+- }
+- }
+- spin_unlock_irqrestore(&jpeg->hw_lock, flags);
+-
+- return hw_id;
+-}
+-
+-static int mtk_jpegenc_set_hw_param(struct mtk_jpeg_ctx *ctx,
+- int hw_id,
+- struct vb2_v4l2_buffer *src_buf,
+- struct vb2_v4l2_buffer *dst_buf)
+-{
+- struct mtk_jpegenc_comp_dev *jpeg = ctx->jpeg->enc_hw_dev[hw_id];
+-
+- jpeg->hw_param.curr_ctx = ctx;
+- jpeg->hw_param.src_buffer = src_buf;
+- jpeg->hw_param.dst_buffer = dst_buf;
+-
+- return 0;
+-}
+-
+-static int mtk_jpegenc_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id)
+-{
+- unsigned long flags;
+-
+- spin_lock_irqsave(&jpeg->hw_lock, flags);
+- jpeg->enc_hw_dev[hw_id]->hw_state = MTK_JPEG_HW_IDLE;
+- spin_unlock_irqrestore(&jpeg->hw_lock, flags);
+-
+- return 0;
+-}
+-
+-static void mtk_jpegenc_worker(struct work_struct *work)
+-{
+- struct mtk_jpegenc_comp_dev *comp_jpeg[MTK_JPEGENC_HW_MAX];
+- enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
+- struct mtk_jpeg_src_buf *jpeg_dst_buf;
+- struct vb2_v4l2_buffer *src_buf, *dst_buf;
+- int ret, i, hw_id = 0;
+- unsigned long flags;
+-
+- struct mtk_jpeg_ctx *ctx = container_of(work,
+- struct mtk_jpeg_ctx,
+- jpeg_work);
+- struct mtk_jpeg_dev *jpeg = ctx->jpeg;
+-
+- for (i = 0; i < MTK_JPEGENC_HW_MAX; i++)
+- comp_jpeg[i] = jpeg->enc_hw_dev[i];
+- i = 0;
+-
+-retry_select:
+- hw_id = mtk_jpegenc_get_hw(ctx);
+- if (hw_id < 0) {
+- ret = wait_event_interruptible(jpeg->hw_wq,
+- atomic_read(&jpeg->hw_rdy) > 0);
+- if (ret != 0 || (i++ > MTK_JPEG_MAX_RETRY_TIME)) {
+- dev_err(jpeg->dev, "%s : %d, all HW are busy\n",
+- __func__, __LINE__);
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+- return;
+- }
+-
+- goto retry_select;
+- }
+-
+- atomic_dec(&jpeg->hw_rdy);
+- src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+- if (!src_buf)
+- goto getbuf_fail;
+-
+- dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+- if (!dst_buf)
+- goto getbuf_fail;
+-
+- v4l2_m2m_buf_copy_metadata(src_buf, dst_buf, true);
+-
+- mtk_jpegenc_set_hw_param(ctx, hw_id, src_buf, dst_buf);
+- ret = pm_runtime_get_sync(comp_jpeg[hw_id]->dev);
+- if (ret < 0) {
+- dev_err(jpeg->dev, "%s : %d, pm_runtime_get_sync fail !!!\n",
+- __func__, __LINE__);
+- goto enc_end;
+- }
+-
+- ret = clk_prepare_enable(comp_jpeg[hw_id]->venc_clk.clks->clk);
+- if (ret) {
+- dev_err(jpeg->dev, "%s : %d, jpegenc clk_prepare_enable fail\n",
+- __func__, __LINE__);
+- goto enc_end;
+- }
+-
+- v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+- v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+-
+- schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work,
+- msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
+-
+- spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags);
+- jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf);
+- jpeg_dst_buf->curr_ctx = ctx;
+- jpeg_dst_buf->frame_num = ctx->total_frame_num;
+- ctx->total_frame_num++;
+- mtk_jpeg_enc_reset(comp_jpeg[hw_id]->reg_base);
+- mtk_jpeg_set_enc_dst(ctx,
+- comp_jpeg[hw_id]->reg_base,
+- &dst_buf->vb2_buf);
+- mtk_jpeg_set_enc_src(ctx,
+- comp_jpeg[hw_id]->reg_base,
+- &src_buf->vb2_buf);
+- mtk_jpeg_set_enc_params(ctx, comp_jpeg[hw_id]->reg_base);
+- mtk_jpeg_enc_start(comp_jpeg[hw_id]->reg_base);
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+- spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags);
+-
+- return;
+-
+-enc_end:
+- v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+- v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+- v4l2_m2m_buf_done(src_buf, buf_state);
+- v4l2_m2m_buf_done(dst_buf, buf_state);
+-getbuf_fail:
+- atomic_inc(&jpeg->hw_rdy);
+- mtk_jpegenc_put_hw(jpeg, hw_id);
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+-}
+-
+ static void mtk_jpeg_enc_device_run(void *priv)
+ {
+ struct mtk_jpeg_ctx *ctx = priv;
+@@ -1128,206 +986,39 @@ static void mtk_jpeg_multicore_enc_device_run(void *priv)
+ queue_work(jpeg->workqueue, &ctx->jpeg_work);
+ }
+
+-static int mtk_jpegdec_get_hw(struct mtk_jpeg_ctx *ctx)
++static void mtk_jpeg_multicore_dec_device_run(void *priv)
+ {
+- struct mtk_jpegdec_comp_dev *comp_jpeg;
++ struct mtk_jpeg_ctx *ctx = priv;
+ struct mtk_jpeg_dev *jpeg = ctx->jpeg;
+- unsigned long flags;
+- int hw_id = -1;
+- int i;
+-
+- spin_lock_irqsave(&jpeg->hw_lock, flags);
+- for (i = 0; i < MTK_JPEGDEC_HW_MAX; i++) {
+- comp_jpeg = jpeg->dec_hw_dev[i];
+- if (comp_jpeg->hw_state == MTK_JPEG_HW_IDLE) {
+- hw_id = i;
+- comp_jpeg->hw_state = MTK_JPEG_HW_BUSY;
+- break;
+- }
+- }
+- spin_unlock_irqrestore(&jpeg->hw_lock, flags);
+
+- return hw_id;
+-}
+-
+-static int mtk_jpegdec_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id)
+-{
+- unsigned long flags;
+-
+- spin_lock_irqsave(&jpeg->hw_lock, flags);
+- jpeg->dec_hw_dev[hw_id]->hw_state =
+- MTK_JPEG_HW_IDLE;
+- spin_unlock_irqrestore(&jpeg->hw_lock, flags);
+-
+- return 0;
+-}
+-
+-static int mtk_jpegdec_set_hw_param(struct mtk_jpeg_ctx *ctx,
+- int hw_id,
+- struct vb2_v4l2_buffer *src_buf,
+- struct vb2_v4l2_buffer *dst_buf)
+-{
+- struct mtk_jpegdec_comp_dev *jpeg =
+- ctx->jpeg->dec_hw_dev[hw_id];
+-
+- jpeg->hw_param.curr_ctx = ctx;
+- jpeg->hw_param.src_buffer = src_buf;
+- jpeg->hw_param.dst_buffer = dst_buf;
+-
+- return 0;
++ queue_work(jpeg->workqueue, &ctx->jpeg_work);
+ }
+
+-static void mtk_jpegdec_worker(struct work_struct *work)
++static void mtk_jpeg_dec_device_run(void *priv)
+ {
+- struct mtk_jpeg_ctx *ctx = container_of(work, struct mtk_jpeg_ctx,
+- jpeg_work);
+- struct mtk_jpegdec_comp_dev *comp_jpeg[MTK_JPEGDEC_HW_MAX];
+- enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
+- struct mtk_jpeg_src_buf *jpeg_src_buf, *jpeg_dst_buf;
+- struct vb2_v4l2_buffer *src_buf, *dst_buf;
++ struct mtk_jpeg_ctx *ctx = priv;
+ struct mtk_jpeg_dev *jpeg = ctx->jpeg;
+- int ret, i, hw_id = 0;
++ struct vb2_v4l2_buffer *src_buf, *dst_buf;
++ enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
++ unsigned long flags;
++ struct mtk_jpeg_src_buf *jpeg_src_buf;
+ struct mtk_jpeg_bs bs;
+ struct mtk_jpeg_fb fb;
+- unsigned long flags;
+-
+- for (i = 0; i < MTK_JPEGDEC_HW_MAX; i++)
+- comp_jpeg[i] = jpeg->dec_hw_dev[i];
+- i = 0;
+-
+-retry_select:
+- hw_id = mtk_jpegdec_get_hw(ctx);
+- if (hw_id < 0) {
+- ret = wait_event_interruptible_timeout(jpeg->hw_wq,
+- atomic_read(&jpeg->hw_rdy) > 0,
+- MTK_JPEG_HW_TIMEOUT_MSEC);
+- if (ret != 0 || (i++ > MTK_JPEG_MAX_RETRY_TIME)) {
+- dev_err(jpeg->dev, "%s : %d, all HW are busy\n",
+- __func__, __LINE__);
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+- return;
+- }
+-
+- goto retry_select;
+- }
++ int ret;
+
+- atomic_dec(&jpeg->hw_rdy);
+ src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+- if (!src_buf)
+- goto getbuf_fail;
+-
+ dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+- if (!dst_buf)
+- goto getbuf_fail;
+-
+- v4l2_m2m_buf_copy_metadata(src_buf, dst_buf, true);
+ jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf);
+- jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf);
+
+- if (mtk_jpeg_check_resolution_change(ctx,
+- &jpeg_src_buf->dec_param)) {
++ if (mtk_jpeg_check_resolution_change(ctx, &jpeg_src_buf->dec_param)) {
+ mtk_jpeg_queue_src_chg_event(ctx);
+ ctx->state = MTK_JPEG_SOURCE_CHANGE;
+- goto getbuf_fail;
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++ return;
+ }
+
+- jpeg_src_buf->curr_ctx = ctx;
+- jpeg_src_buf->frame_num = ctx->total_frame_num;
+- jpeg_dst_buf->curr_ctx = ctx;
+- jpeg_dst_buf->frame_num = ctx->total_frame_num;
+-
+- mtk_jpegdec_set_hw_param(ctx, hw_id, src_buf, dst_buf);
+- ret = pm_runtime_get_sync(comp_jpeg[hw_id]->dev);
+- if (ret < 0) {
+- dev_err(jpeg->dev, "%s : %d, pm_runtime_get_sync fail !!!\n",
+- __func__, __LINE__);
+- goto dec_end;
+- }
+-
+- ret = clk_prepare_enable(comp_jpeg[hw_id]->jdec_clk.clks->clk);
+- if (ret) {
+- dev_err(jpeg->dev, "%s : %d, jpegdec clk_prepare_enable fail\n",
+- __func__, __LINE__);
+- goto clk_end;
+- }
+-
+- v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+- v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+-
+- schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work,
+- msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
+-
+- mtk_jpeg_set_dec_src(ctx, &src_buf->vb2_buf, &bs);
+- if (mtk_jpeg_set_dec_dst(ctx,
+- &jpeg_src_buf->dec_param,
+- &dst_buf->vb2_buf, &fb)) {
+- dev_err(jpeg->dev, "%s : %d, mtk_jpeg_set_dec_dst fail\n",
+- __func__, __LINE__);
+- goto setdst_end;
+- }
+-
+- spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags);
+- ctx->total_frame_num++;
+- mtk_jpeg_dec_reset(comp_jpeg[hw_id]->reg_base);
+- mtk_jpeg_dec_set_config(comp_jpeg[hw_id]->reg_base,
+- &jpeg_src_buf->dec_param,
+- jpeg_src_buf->bs_size,
+- &bs,
+- &fb);
+- mtk_jpeg_dec_start(comp_jpeg[hw_id]->reg_base);
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+- spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags);
+-
+- return;
+-
+-setdst_end:
+- clk_disable_unprepare(comp_jpeg[hw_id]->jdec_clk.clks->clk);
+-clk_end:
+- pm_runtime_put(comp_jpeg[hw_id]->dev);
+-dec_end:
+- v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+- v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+- v4l2_m2m_buf_done(src_buf, buf_state);
+- v4l2_m2m_buf_done(dst_buf, buf_state);
+-getbuf_fail:
+- atomic_inc(&jpeg->hw_rdy);
+- mtk_jpegdec_put_hw(jpeg, hw_id);
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+-}
+-
+-static void mtk_jpeg_multicore_dec_device_run(void *priv)
+-{
+- struct mtk_jpeg_ctx *ctx = priv;
+- struct mtk_jpeg_dev *jpeg = ctx->jpeg;
+-
+- queue_work(jpeg->workqueue, &ctx->jpeg_work);
+-}
+-
+-static void mtk_jpeg_dec_device_run(void *priv)
+-{
+- struct mtk_jpeg_ctx *ctx = priv;
+- struct mtk_jpeg_dev *jpeg = ctx->jpeg;
+- struct vb2_v4l2_buffer *src_buf, *dst_buf;
+- enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
+- unsigned long flags;
+- struct mtk_jpeg_src_buf *jpeg_src_buf;
+- struct mtk_jpeg_bs bs;
+- struct mtk_jpeg_fb fb;
+- int ret;
+-
+- src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
+- dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
+- jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf);
+-
+- if (mtk_jpeg_check_resolution_change(ctx, &jpeg_src_buf->dec_param)) {
+- mtk_jpeg_queue_src_chg_event(ctx);
+- ctx->state = MTK_JPEG_SOURCE_CHANGE;
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+- return;
+- }
+-
+- ret = pm_runtime_resume_and_get(jpeg->dev);
+- if (ret < 0)
++ ret = pm_runtime_resume_and_get(jpeg->dev);
++ if (ret < 0)
+ goto dec_end;
+
+ schedule_delayed_work(&jpeg->job_timeout_work,
+@@ -1430,101 +1121,6 @@ static void mtk_jpeg_clk_off(struct mtk_jpeg_dev *jpeg)
+ jpeg->variant->clks);
+ }
+
+-static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg)
+-{
+- struct mtk_jpeg_ctx *ctx;
+- struct vb2_v4l2_buffer *src_buf, *dst_buf;
+- enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
+- u32 result_size;
+-
+- ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev);
+- if (!ctx) {
+- v4l2_err(&jpeg->v4l2_dev, "Context is NULL\n");
+- return IRQ_HANDLED;
+- }
+-
+- src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+- dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+-
+- result_size = mtk_jpeg_enc_get_file_size(jpeg->reg_base);
+- vb2_set_plane_payload(&dst_buf->vb2_buf, 0, result_size);
+-
+- buf_state = VB2_BUF_STATE_DONE;
+-
+- v4l2_m2m_buf_done(src_buf, buf_state);
+- v4l2_m2m_buf_done(dst_buf, buf_state);
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+- pm_runtime_put(ctx->jpeg->dev);
+- return IRQ_HANDLED;
+-}
+-
+-static irqreturn_t mtk_jpeg_enc_irq(int irq, void *priv)
+-{
+- struct mtk_jpeg_dev *jpeg = priv;
+- u32 irq_status;
+- irqreturn_t ret = IRQ_NONE;
+-
+- cancel_delayed_work(&jpeg->job_timeout_work);
+-
+- irq_status = readl(jpeg->reg_base + JPEG_ENC_INT_STS) &
+- JPEG_ENC_INT_STATUS_MASK_ALLIRQ;
+- if (irq_status)
+- writel(0, jpeg->reg_base + JPEG_ENC_INT_STS);
+-
+- if (!(irq_status & JPEG_ENC_INT_STATUS_DONE))
+- return ret;
+-
+- ret = mtk_jpeg_enc_done(jpeg);
+- return ret;
+-}
+-
+-static irqreturn_t mtk_jpeg_dec_irq(int irq, void *priv)
+-{
+- struct mtk_jpeg_dev *jpeg = priv;
+- struct mtk_jpeg_ctx *ctx;
+- struct vb2_v4l2_buffer *src_buf, *dst_buf;
+- struct mtk_jpeg_src_buf *jpeg_src_buf;
+- enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
+- u32 dec_irq_ret;
+- u32 dec_ret;
+- int i;
+-
+- cancel_delayed_work(&jpeg->job_timeout_work);
+-
+- dec_ret = mtk_jpeg_dec_get_int_status(jpeg->reg_base);
+- dec_irq_ret = mtk_jpeg_dec_enum_result(dec_ret);
+- ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev);
+- if (!ctx) {
+- v4l2_err(&jpeg->v4l2_dev, "Context is NULL\n");
+- return IRQ_HANDLED;
+- }
+-
+- src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
+- dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
+- jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf);
+-
+- if (dec_irq_ret >= MTK_JPEG_DEC_RESULT_UNDERFLOW)
+- mtk_jpeg_dec_reset(jpeg->reg_base);
+-
+- if (dec_irq_ret != MTK_JPEG_DEC_RESULT_EOF_DONE) {
+- dev_err(jpeg->dev, "decode failed\n");
+- goto dec_end;
+- }
+-
+- for (i = 0; i < dst_buf->vb2_buf.num_planes; i++)
+- vb2_set_plane_payload(&dst_buf->vb2_buf, i,
+- jpeg_src_buf->dec_param.comp_size[i]);
+-
+- buf_state = VB2_BUF_STATE_DONE;
+-
+-dec_end:
+- v4l2_m2m_buf_done(src_buf, buf_state);
+- v4l2_m2m_buf_done(dst_buf, buf_state);
+- v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
+- pm_runtime_put(ctx->jpeg->dev);
+- return IRQ_HANDLED;
+-}
+-
+ static void mtk_jpeg_set_default_params(struct mtk_jpeg_ctx *ctx)
+ {
+ struct mtk_jpeg_q_data *q = &ctx->out_q;
+@@ -1637,15 +1233,6 @@ static const struct v4l2_file_operations mtk_jpeg_fops = {
+ .mmap = v4l2_m2m_fop_mmap,
+ };
+
+-static struct clk_bulk_data mt8173_jpeg_dec_clocks[] = {
+- { .id = "jpgdec-smi" },
+- { .id = "jpgdec" },
+-};
+-
+-static struct clk_bulk_data mtk_jpeg_clocks[] = {
+- { .id = "jpgenc" },
+-};
+-
+ static void mtk_jpeg_job_timeout_work(struct work_struct *work)
+ {
+ struct mtk_jpeg_dev *jpeg = container_of(work, struct mtk_jpeg_dev,
+@@ -1866,7 +1453,419 @@ static const struct dev_pm_ops mtk_jpeg_pm_ops = {
+ SET_RUNTIME_PM_OPS(mtk_jpeg_pm_suspend, mtk_jpeg_pm_resume, NULL)
+ };
+
+-#if defined(CONFIG_OF)
++static int mtk_jpegenc_get_hw(struct mtk_jpeg_ctx *ctx)
++{
++ struct mtk_jpegenc_comp_dev *comp_jpeg;
++ struct mtk_jpeg_dev *jpeg = ctx->jpeg;
++ unsigned long flags;
++ int hw_id = -1;
++ int i;
++
++ spin_lock_irqsave(&jpeg->hw_lock, flags);
++ for (i = 0; i < MTK_JPEGENC_HW_MAX; i++) {
++ comp_jpeg = jpeg->enc_hw_dev[i];
++ if (comp_jpeg->hw_state == MTK_JPEG_HW_IDLE) {
++ hw_id = i;
++ comp_jpeg->hw_state = MTK_JPEG_HW_BUSY;
++ break;
++ }
++ }
++ spin_unlock_irqrestore(&jpeg->hw_lock, flags);
++
++ return hw_id;
++}
++
++static int mtk_jpegenc_set_hw_param(struct mtk_jpeg_ctx *ctx,
++ int hw_id,
++ struct vb2_v4l2_buffer *src_buf,
++ struct vb2_v4l2_buffer *dst_buf)
++{
++ struct mtk_jpegenc_comp_dev *jpeg = ctx->jpeg->enc_hw_dev[hw_id];
++
++ jpeg->hw_param.curr_ctx = ctx;
++ jpeg->hw_param.src_buffer = src_buf;
++ jpeg->hw_param.dst_buffer = dst_buf;
++
++ return 0;
++}
++
++static int mtk_jpegenc_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id)
++{
++ unsigned long flags;
++
++ spin_lock_irqsave(&jpeg->hw_lock, flags);
++ jpeg->enc_hw_dev[hw_id]->hw_state = MTK_JPEG_HW_IDLE;
++ spin_unlock_irqrestore(&jpeg->hw_lock, flags);
++
++ return 0;
++}
++
++static int mtk_jpegdec_get_hw(struct mtk_jpeg_ctx *ctx)
++{
++ struct mtk_jpegdec_comp_dev *comp_jpeg;
++ struct mtk_jpeg_dev *jpeg = ctx->jpeg;
++ unsigned long flags;
++ int hw_id = -1;
++ int i;
++
++ spin_lock_irqsave(&jpeg->hw_lock, flags);
++ for (i = 0; i < MTK_JPEGDEC_HW_MAX; i++) {
++ comp_jpeg = jpeg->dec_hw_dev[i];
++ if (comp_jpeg->hw_state == MTK_JPEG_HW_IDLE) {
++ hw_id = i;
++ comp_jpeg->hw_state = MTK_JPEG_HW_BUSY;
++ break;
++ }
++ }
++ spin_unlock_irqrestore(&jpeg->hw_lock, flags);
++
++ return hw_id;
++}
++
++static int mtk_jpegdec_put_hw(struct mtk_jpeg_dev *jpeg, int hw_id)
++{
++ unsigned long flags;
++
++ spin_lock_irqsave(&jpeg->hw_lock, flags);
++ jpeg->dec_hw_dev[hw_id]->hw_state =
++ MTK_JPEG_HW_IDLE;
++ spin_unlock_irqrestore(&jpeg->hw_lock, flags);
++
++ return 0;
++}
++
++static int mtk_jpegdec_set_hw_param(struct mtk_jpeg_ctx *ctx,
++ int hw_id,
++ struct vb2_v4l2_buffer *src_buf,
++ struct vb2_v4l2_buffer *dst_buf)
++{
++ struct mtk_jpegdec_comp_dev *jpeg =
++ ctx->jpeg->dec_hw_dev[hw_id];
++
++ jpeg->hw_param.curr_ctx = ctx;
++ jpeg->hw_param.src_buffer = src_buf;
++ jpeg->hw_param.dst_buffer = dst_buf;
++
++ return 0;
++}
++
++static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg)
++{
++ struct mtk_jpeg_ctx *ctx;
++ struct vb2_v4l2_buffer *src_buf, *dst_buf;
++ enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
++ u32 result_size;
++
++ ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev);
++ if (!ctx) {
++ v4l2_err(&jpeg->v4l2_dev, "Context is NULL\n");
++ return IRQ_HANDLED;
++ }
++
++ src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
++ dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
++
++ result_size = mtk_jpeg_enc_get_file_size(jpeg->reg_base);
++ vb2_set_plane_payload(&dst_buf->vb2_buf, 0, result_size);
++
++ buf_state = VB2_BUF_STATE_DONE;
++
++ v4l2_m2m_buf_done(src_buf, buf_state);
++ v4l2_m2m_buf_done(dst_buf, buf_state);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++ pm_runtime_put(ctx->jpeg->dev);
++ return IRQ_HANDLED;
++}
++
++static void mtk_jpegenc_worker(struct work_struct *work)
++{
++ struct mtk_jpegenc_comp_dev *comp_jpeg[MTK_JPEGENC_HW_MAX];
++ enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
++ struct mtk_jpeg_src_buf *jpeg_dst_buf;
++ struct vb2_v4l2_buffer *src_buf, *dst_buf;
++ int ret, i, hw_id = 0;
++ unsigned long flags;
++
++ struct mtk_jpeg_ctx *ctx = container_of(work,
++ struct mtk_jpeg_ctx,
++ jpeg_work);
++ struct mtk_jpeg_dev *jpeg = ctx->jpeg;
++
++ for (i = 0; i < MTK_JPEGENC_HW_MAX; i++)
++ comp_jpeg[i] = jpeg->enc_hw_dev[i];
++ i = 0;
++
++retry_select:
++ hw_id = mtk_jpegenc_get_hw(ctx);
++ if (hw_id < 0) {
++ ret = wait_event_interruptible(jpeg->hw_wq,
++ atomic_read(&jpeg->hw_rdy) > 0);
++ if (ret != 0 || (i++ > MTK_JPEG_MAX_RETRY_TIME)) {
++ dev_err(jpeg->dev, "%s : %d, all HW are busy\n",
++ __func__, __LINE__);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++ return;
++ }
++
++ goto retry_select;
++ }
++
++ atomic_dec(&jpeg->hw_rdy);
++ src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
++ if (!src_buf)
++ goto getbuf_fail;
++
++ dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
++ if (!dst_buf)
++ goto getbuf_fail;
++
++ v4l2_m2m_buf_copy_metadata(src_buf, dst_buf, true);
++
++ mtk_jpegenc_set_hw_param(ctx, hw_id, src_buf, dst_buf);
++ ret = pm_runtime_get_sync(comp_jpeg[hw_id]->dev);
++ if (ret < 0) {
++ dev_err(jpeg->dev, "%s : %d, pm_runtime_get_sync fail !!!\n",
++ __func__, __LINE__);
++ goto enc_end;
++ }
++
++ ret = clk_prepare_enable(comp_jpeg[hw_id]->venc_clk.clks->clk);
++ if (ret) {
++ dev_err(jpeg->dev, "%s : %d, jpegenc clk_prepare_enable fail\n",
++ __func__, __LINE__);
++ goto enc_end;
++ }
++
++ v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
++ v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
++
++ schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work,
++ msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
++
++ spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags);
++ jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf);
++ jpeg_dst_buf->curr_ctx = ctx;
++ jpeg_dst_buf->frame_num = ctx->total_frame_num;
++ ctx->total_frame_num++;
++ mtk_jpeg_enc_reset(comp_jpeg[hw_id]->reg_base);
++ mtk_jpeg_set_enc_dst(ctx,
++ comp_jpeg[hw_id]->reg_base,
++ &dst_buf->vb2_buf);
++ mtk_jpeg_set_enc_src(ctx,
++ comp_jpeg[hw_id]->reg_base,
++ &src_buf->vb2_buf);
++ mtk_jpeg_set_enc_params(ctx, comp_jpeg[hw_id]->reg_base);
++ mtk_jpeg_enc_start(comp_jpeg[hw_id]->reg_base);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++ spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags);
++
++ return;
++
++enc_end:
++ v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
++ v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
++ v4l2_m2m_buf_done(src_buf, buf_state);
++ v4l2_m2m_buf_done(dst_buf, buf_state);
++getbuf_fail:
++ atomic_inc(&jpeg->hw_rdy);
++ mtk_jpegenc_put_hw(jpeg, hw_id);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++}
++
++static void mtk_jpegdec_worker(struct work_struct *work)
++{
++ struct mtk_jpeg_ctx *ctx = container_of(work, struct mtk_jpeg_ctx,
++ jpeg_work);
++ struct mtk_jpegdec_comp_dev *comp_jpeg[MTK_JPEGDEC_HW_MAX];
++ enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
++ struct mtk_jpeg_src_buf *jpeg_src_buf, *jpeg_dst_buf;
++ struct vb2_v4l2_buffer *src_buf, *dst_buf;
++ struct mtk_jpeg_dev *jpeg = ctx->jpeg;
++ int ret, i, hw_id = 0;
++ struct mtk_jpeg_bs bs;
++ struct mtk_jpeg_fb fb;
++ unsigned long flags;
++
++ for (i = 0; i < MTK_JPEGDEC_HW_MAX; i++)
++ comp_jpeg[i] = jpeg->dec_hw_dev[i];
++ i = 0;
++
++retry_select:
++ hw_id = mtk_jpegdec_get_hw(ctx);
++ if (hw_id < 0) {
++ ret = wait_event_interruptible_timeout(jpeg->hw_wq,
++ atomic_read(&jpeg->hw_rdy) > 0,
++ MTK_JPEG_HW_TIMEOUT_MSEC);
++ if (ret != 0 || (i++ > MTK_JPEG_MAX_RETRY_TIME)) {
++ dev_err(jpeg->dev, "%s : %d, all HW are busy\n",
++ __func__, __LINE__);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++ return;
++ }
++
++ goto retry_select;
++ }
++
++ atomic_dec(&jpeg->hw_rdy);
++ src_buf = v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx);
++ if (!src_buf)
++ goto getbuf_fail;
++
++ dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
++ if (!dst_buf)
++ goto getbuf_fail;
++
++ v4l2_m2m_buf_copy_metadata(src_buf, dst_buf, true);
++ jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf);
++ jpeg_dst_buf = mtk_jpeg_vb2_to_srcbuf(&dst_buf->vb2_buf);
++
++ if (mtk_jpeg_check_resolution_change(ctx,
++ &jpeg_src_buf->dec_param)) {
++ mtk_jpeg_queue_src_chg_event(ctx);
++ ctx->state = MTK_JPEG_SOURCE_CHANGE;
++ goto getbuf_fail;
++ }
++
++ jpeg_src_buf->curr_ctx = ctx;
++ jpeg_src_buf->frame_num = ctx->total_frame_num;
++ jpeg_dst_buf->curr_ctx = ctx;
++ jpeg_dst_buf->frame_num = ctx->total_frame_num;
++
++ mtk_jpegdec_set_hw_param(ctx, hw_id, src_buf, dst_buf);
++ ret = pm_runtime_get_sync(comp_jpeg[hw_id]->dev);
++ if (ret < 0) {
++ dev_err(jpeg->dev, "%s : %d, pm_runtime_get_sync fail !!!\n",
++ __func__, __LINE__);
++ goto dec_end;
++ }
++
++ ret = clk_prepare_enable(comp_jpeg[hw_id]->jdec_clk.clks->clk);
++ if (ret) {
++ dev_err(jpeg->dev, "%s : %d, jpegdec clk_prepare_enable fail\n",
++ __func__, __LINE__);
++ goto clk_end;
++ }
++
++ v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
++ v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
++
++ schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work,
++ msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
++
++ mtk_jpeg_set_dec_src(ctx, &src_buf->vb2_buf, &bs);
++ if (mtk_jpeg_set_dec_dst(ctx,
++ &jpeg_src_buf->dec_param,
++ &dst_buf->vb2_buf, &fb)) {
++ dev_err(jpeg->dev, "%s : %d, mtk_jpeg_set_dec_dst fail\n",
++ __func__, __LINE__);
++ goto setdst_end;
++ }
++
++ spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags);
++ ctx->total_frame_num++;
++ mtk_jpeg_dec_reset(comp_jpeg[hw_id]->reg_base);
++ mtk_jpeg_dec_set_config(comp_jpeg[hw_id]->reg_base,
++ &jpeg_src_buf->dec_param,
++ jpeg_src_buf->bs_size,
++ &bs,
++ &fb);
++ mtk_jpeg_dec_start(comp_jpeg[hw_id]->reg_base);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++ spin_unlock_irqrestore(&comp_jpeg[hw_id]->hw_lock, flags);
++
++ return;
++
++setdst_end:
++ clk_disable_unprepare(comp_jpeg[hw_id]->jdec_clk.clks->clk);
++clk_end:
++ pm_runtime_put(comp_jpeg[hw_id]->dev);
++dec_end:
++ v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
++ v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
++ v4l2_m2m_buf_done(src_buf, buf_state);
++ v4l2_m2m_buf_done(dst_buf, buf_state);
++getbuf_fail:
++ atomic_inc(&jpeg->hw_rdy);
++ mtk_jpegdec_put_hw(jpeg, hw_id);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++}
++
++static irqreturn_t mtk_jpeg_enc_irq(int irq, void *priv)
++{
++ struct mtk_jpeg_dev *jpeg = priv;
++ u32 irq_status;
++ irqreturn_t ret = IRQ_NONE;
++
++ cancel_delayed_work(&jpeg->job_timeout_work);
++
++ irq_status = readl(jpeg->reg_base + JPEG_ENC_INT_STS) &
++ JPEG_ENC_INT_STATUS_MASK_ALLIRQ;
++ if (irq_status)
++ writel(0, jpeg->reg_base + JPEG_ENC_INT_STS);
++
++ if (!(irq_status & JPEG_ENC_INT_STATUS_DONE))
++ return ret;
++
++ ret = mtk_jpeg_enc_done(jpeg);
++ return ret;
++}
++
++static irqreturn_t mtk_jpeg_dec_irq(int irq, void *priv)
++{
++ struct mtk_jpeg_dev *jpeg = priv;
++ struct mtk_jpeg_ctx *ctx;
++ struct vb2_v4l2_buffer *src_buf, *dst_buf;
++ struct mtk_jpeg_src_buf *jpeg_src_buf;
++ enum vb2_buffer_state buf_state = VB2_BUF_STATE_ERROR;
++ u32 dec_irq_ret;
++ u32 dec_ret;
++ int i;
++
++ cancel_delayed_work(&jpeg->job_timeout_work);
++
++ dec_ret = mtk_jpeg_dec_get_int_status(jpeg->reg_base);
++ dec_irq_ret = mtk_jpeg_dec_enum_result(dec_ret);
++ ctx = v4l2_m2m_get_curr_priv(jpeg->m2m_dev);
++ if (!ctx) {
++ v4l2_err(&jpeg->v4l2_dev, "Context is NULL\n");
++ return IRQ_HANDLED;
++ }
++
++ src_buf = v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
++ dst_buf = v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
++ jpeg_src_buf = mtk_jpeg_vb2_to_srcbuf(&src_buf->vb2_buf);
++
++ if (dec_irq_ret >= MTK_JPEG_DEC_RESULT_UNDERFLOW)
++ mtk_jpeg_dec_reset(jpeg->reg_base);
++
++ if (dec_irq_ret != MTK_JPEG_DEC_RESULT_EOF_DONE) {
++ dev_err(jpeg->dev, "decode failed\n");
++ goto dec_end;
++ }
++
++ for (i = 0; i < dst_buf->vb2_buf.num_planes; i++)
++ vb2_set_plane_payload(&dst_buf->vb2_buf, i,
++ jpeg_src_buf->dec_param.comp_size[i]);
++
++ buf_state = VB2_BUF_STATE_DONE;
++
++dec_end:
++ v4l2_m2m_buf_done(src_buf, buf_state);
++ v4l2_m2m_buf_done(dst_buf, buf_state);
++ v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
++ pm_runtime_put(ctx->jpeg->dev);
++ return IRQ_HANDLED;
++}
++
++static struct clk_bulk_data mtk_jpeg_clocks[] = {
++ { .id = "jpgenc" },
++};
++
++static struct clk_bulk_data mt8173_jpeg_dec_clocks[] = {
++ { .id = "jpgdec-smi" },
++ { .id = "jpgdec" },
++};
++
+ static const struct mtk_jpeg_variant mt8173_jpeg_drvdata = {
+ .clks = mt8173_jpeg_dec_clocks,
+ .num_clks = ARRAY_SIZE(mt8173_jpeg_dec_clocks),
+@@ -1949,14 +1948,13 @@ static const struct of_device_id mtk_jpeg_match[] = {
+ };
+
+ MODULE_DEVICE_TABLE(of, mtk_jpeg_match);
+-#endif
+
+ static struct platform_driver mtk_jpeg_driver = {
+ .probe = mtk_jpeg_probe,
+ .remove_new = mtk_jpeg_remove,
+ .driver = {
+ .name = MTK_JPEG_NAME,
+- .of_match_table = of_match_ptr(mtk_jpeg_match),
++ .of_match_table = mtk_jpeg_match,
+ .pm = &mtk_jpeg_pm_ops,
+ },
+ };
+diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
+index 869068fac5e2f..baa7be58ce691 100644
+--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
++++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_dec_hw.c
+@@ -39,7 +39,6 @@ enum mtk_jpeg_color {
+ MTK_JPEG_COLOR_400 = 0x00110000
+ };
+
+-#if defined(CONFIG_OF)
+ static const struct of_device_id mtk_jpegdec_hw_ids[] = {
+ {
+ .compatible = "mediatek,mt8195-jpgdec-hw",
+@@ -47,7 +46,6 @@ static const struct of_device_id mtk_jpegdec_hw_ids[] = {
+ {},
+ };
+ MODULE_DEVICE_TABLE(of, mtk_jpegdec_hw_ids);
+-#endif
+
+ static inline int mtk_jpeg_verify_align(u32 val, int align, u32 reg)
+ {
+@@ -653,7 +651,7 @@ static struct platform_driver mtk_jpegdec_hw_driver = {
+ .probe = mtk_jpegdec_hw_probe,
+ .driver = {
+ .name = "mtk-jpegdec-hw",
+- .of_match_table = of_match_ptr(mtk_jpegdec_hw_ids),
++ .of_match_table = mtk_jpegdec_hw_ids,
+ },
+ };
+
+diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c
+index 71e85b4bbf127..244018365b6f1 100644
+--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c
++++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_enc_hw.c
+@@ -46,7 +46,6 @@ static const struct mtk_jpeg_enc_qlt mtk_jpeg_enc_quality[] = {
+ {.quality_param = 97, .hardware_value = JPEG_ENC_QUALITY_Q97},
+ };
+
+-#if defined(CONFIG_OF)
+ static const struct of_device_id mtk_jpegenc_drv_ids[] = {
+ {
+ .compatible = "mediatek,mt8195-jpgenc-hw",
+@@ -54,7 +53,6 @@ static const struct of_device_id mtk_jpegenc_drv_ids[] = {
+ {},
+ };
+ MODULE_DEVICE_TABLE(of, mtk_jpegenc_drv_ids);
+-#endif
+
+ void mtk_jpeg_enc_reset(void __iomem *base)
+ {
+@@ -377,7 +375,7 @@ static struct platform_driver mtk_jpegenc_hw_driver = {
+ .probe = mtk_jpegenc_hw_probe,
+ .driver = {
+ .name = "mtk-jpegenc-hw",
+- .of_match_table = of_match_ptr(mtk_jpegenc_drv_ids),
++ .of_match_table = mtk_jpegenc_drv_ids,
+ },
+ };
+
+diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
+index 091e035c76a6f..1a0776f9b008a 100644
+--- a/drivers/net/bonding/bond_main.c
++++ b/drivers/net/bonding/bond_main.c
+@@ -1507,6 +1507,11 @@ static void bond_setup_by_slave(struct net_device *bond_dev,
+
+ memcpy(bond_dev->broadcast, slave_dev->broadcast,
+ slave_dev->addr_len);
++
++ if (slave_dev->flags & IFF_POINTOPOINT) {
++ bond_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
++ bond_dev->flags |= (IFF_POINTOPOINT | IFF_NOARP);
++ }
+ }
+
+ /* On bonding slaves other than the currently active slave, suppress
+diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
+index f418066569fcc..bd9eb066ecf15 100644
+--- a/drivers/net/can/usb/gs_usb.c
++++ b/drivers/net/can/usb/gs_usb.c
+@@ -1030,6 +1030,8 @@ static int gs_can_close(struct net_device *netdev)
+ usb_kill_anchored_urbs(&dev->tx_submitted);
+ atomic_set(&dev->active_tx_urbs, 0);
+
++ dev->can.state = CAN_STATE_STOPPED;
++
+ /* reset the device */
+ rc = gs_cmd_reset(dev);
+ if (rc < 0)
+diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c
+index d775a14784f7e..613af28663d79 100644
+--- a/drivers/net/dsa/qca/qca8k-8xxx.c
++++ b/drivers/net/dsa/qca/qca8k-8xxx.c
+@@ -576,8 +576,11 @@ static struct regmap_config qca8k_regmap_config = {
+ .rd_table = &qca8k_readable_table,
+ .disable_locking = true, /* Locking is handled by qca8k read/write */
+ .cache_type = REGCACHE_NONE, /* Explicitly disable CACHE */
+- .max_raw_read = 32, /* mgmt eth can read/write up to 8 registers at time */
+- .max_raw_write = 32,
++ .max_raw_read = 32, /* mgmt eth can read up to 8 registers at time */
++ /* ATU regs suffer from a bug where some data are not correctly
++ * written. Disable bulk write to correctly write ATU entry.
++ */
++ .use_single_write = true,
+ };
+
+ static int
+diff --git a/drivers/net/dsa/qca/qca8k-common.c b/drivers/net/dsa/qca/qca8k-common.c
+index 96773e4325582..8536c4f6363e9 100644
+--- a/drivers/net/dsa/qca/qca8k-common.c
++++ b/drivers/net/dsa/qca/qca8k-common.c
+@@ -244,7 +244,7 @@ void qca8k_fdb_flush(struct qca8k_priv *priv)
+ }
+
+ static int qca8k_fdb_search_and_insert(struct qca8k_priv *priv, u8 port_mask,
+- const u8 *mac, u16 vid)
++ const u8 *mac, u16 vid, u8 aging)
+ {
+ struct qca8k_fdb fdb = { 0 };
+ int ret;
+@@ -261,10 +261,12 @@ static int qca8k_fdb_search_and_insert(struct qca8k_priv *priv, u8 port_mask,
+ goto exit;
+
+ /* Rule exist. Delete first */
+- if (!fdb.aging) {
++ if (fdb.aging) {
+ ret = qca8k_fdb_access(priv, QCA8K_FDB_PURGE, -1);
+ if (ret)
+ goto exit;
++ } else {
++ fdb.aging = aging;
+ }
+
+ /* Add port to fdb portmask */
+@@ -291,6 +293,10 @@ static int qca8k_fdb_search_and_del(struct qca8k_priv *priv, u8 port_mask,
+ if (ret < 0)
+ goto exit;
+
++ ret = qca8k_fdb_read(priv, &fdb);
++ if (ret < 0)
++ goto exit;
++
+ /* Rule doesn't exist. Why delete? */
+ if (!fdb.aging) {
+ ret = -EINVAL;
+@@ -810,7 +816,11 @@ int qca8k_port_mdb_add(struct dsa_switch *ds, int port,
+ const u8 *addr = mdb->addr;
+ u16 vid = mdb->vid;
+
+- return qca8k_fdb_search_and_insert(priv, BIT(port), addr, vid);
++ if (!vid)
++ vid = QCA8K_PORT_VID_DEF;
++
++ return qca8k_fdb_search_and_insert(priv, BIT(port), addr, vid,
++ QCA8K_ATU_STATUS_STATIC);
+ }
+
+ int qca8k_port_mdb_del(struct dsa_switch *ds, int port,
+@@ -821,6 +831,9 @@ int qca8k_port_mdb_del(struct dsa_switch *ds, int port,
+ const u8 *addr = mdb->addr;
+ u16 vid = mdb->vid;
+
++ if (!vid)
++ vid = QCA8K_PORT_VID_DEF;
++
+ return qca8k_fdb_search_and_del(priv, BIT(port), addr, vid);
+ }
+
+diff --git a/drivers/net/ethernet/atheros/atl1e/atl1e_main.c b/drivers/net/ethernet/atheros/atl1e/atl1e_main.c
+index 5db0f3495a32e..5935be190b9e2 100644
+--- a/drivers/net/ethernet/atheros/atl1e/atl1e_main.c
++++ b/drivers/net/ethernet/atheros/atl1e/atl1e_main.c
+@@ -1641,8 +1641,11 @@ static int atl1e_tso_csum(struct atl1e_adapter *adapter,
+ real_len = (((unsigned char *)ip_hdr(skb) - skb->data)
+ + ntohs(ip_hdr(skb)->tot_len));
+
+- if (real_len < skb->len)
+- pskb_trim(skb, real_len);
++ if (real_len < skb->len) {
++ err = pskb_trim(skb, real_len);
++ if (err)
++ return err;
++ }
+
+ hdr_len = skb_tcp_all_headers(skb);
+ if (unlikely(skb->len == hdr_len)) {
+diff --git a/drivers/net/ethernet/atheros/atlx/atl1.c b/drivers/net/ethernet/atheros/atlx/atl1.c
+index c8444bcdf5270..02aa6fd8ebc2d 100644
+--- a/drivers/net/ethernet/atheros/atlx/atl1.c
++++ b/drivers/net/ethernet/atheros/atlx/atl1.c
+@@ -2113,8 +2113,11 @@ static int atl1_tso(struct atl1_adapter *adapter, struct sk_buff *skb,
+
+ real_len = (((unsigned char *)iph - skb->data) +
+ ntohs(iph->tot_len));
+- if (real_len < skb->len)
+- pskb_trim(skb, real_len);
++ if (real_len < skb->len) {
++ err = pskb_trim(skb, real_len);
++ if (err)
++ return err;
++ }
+ hdr_len = skb_tcp_all_headers(skb);
+ if (skb->len == hdr_len) {
+ iph->check = 0;
+diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
+index 0defd519ba62e..7fa057d379c1a 100644
+--- a/drivers/net/ethernet/emulex/benet/be_main.c
++++ b/drivers/net/ethernet/emulex/benet/be_main.c
+@@ -1138,7 +1138,8 @@ static struct sk_buff *be_lancer_xmit_workarounds(struct be_adapter *adapter,
+ (lancer_chip(adapter) || BE3_chip(adapter) ||
+ skb_vlan_tag_present(skb)) && is_ipv4_pkt(skb)) {
+ ip = (struct iphdr *)ip_hdr(skb);
+- pskb_trim(skb, eth_hdr_len + ntohs(ip->tot_len));
++ if (unlikely(pskb_trim(skb, eth_hdr_len + ntohs(ip->tot_len))))
++ goto tx_drop;
+ }
+
+ /* If vlan tag is already inlined in the packet, skip HW VLAN
+diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
+index 7659888a96917..92410f30ad241 100644
+--- a/drivers/net/ethernet/freescale/fec_main.c
++++ b/drivers/net/ethernet/freescale/fec_main.c
+@@ -1372,7 +1372,7 @@ fec_enet_hwtstamp(struct fec_enet_private *fep, unsigned ts,
+ }
+
+ static void
+-fec_enet_tx_queue(struct net_device *ndev, u16 queue_id)
++fec_enet_tx_queue(struct net_device *ndev, u16 queue_id, int budget)
+ {
+ struct fec_enet_private *fep;
+ struct xdp_frame *xdpf;
+@@ -1416,6 +1416,14 @@ fec_enet_tx_queue(struct net_device *ndev, u16 queue_id)
+ if (!skb)
+ goto tx_buf_done;
+ } else {
++ /* Tx processing cannot call any XDP (or page pool) APIs if
++ * the "budget" is 0. Because NAPI is called with budget of
++ * 0 (such as netpoll) indicates we may be in an IRQ context,
++ * however, we can't use the page pool from IRQ context.
++ */
++ if (unlikely(!budget))
++ break;
++
+ xdpf = txq->tx_buf[index].xdp;
+ if (bdp->cbd_bufaddr)
+ dma_unmap_single(&fep->pdev->dev,
+@@ -1508,14 +1516,14 @@ tx_buf_done:
+ writel(0, txq->bd.reg_desc_active);
+ }
+
+-static void fec_enet_tx(struct net_device *ndev)
++static void fec_enet_tx(struct net_device *ndev, int budget)
+ {
+ struct fec_enet_private *fep = netdev_priv(ndev);
+ int i;
+
+ /* Make sure that AVB queues are processed first. */
+ for (i = fep->num_tx_queues - 1; i >= 0; i--)
+- fec_enet_tx_queue(ndev, i);
++ fec_enet_tx_queue(ndev, i, budget);
+ }
+
+ static void fec_enet_update_cbd(struct fec_enet_priv_rx_q *rxq,
+@@ -1858,7 +1866,7 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
+
+ do {
+ done += fec_enet_rx(ndev, budget - done);
+- fec_enet_tx(ndev);
++ fec_enet_tx(ndev, budget);
+ } while ((done < budget) && fec_enet_collect_events(fep));
+
+ if (done < budget) {
+@@ -3908,6 +3916,8 @@ static int fec_enet_xdp_xmit(struct net_device *dev,
+
+ __netif_tx_lock(nq, cpu);
+
++ /* Avoid tx timeout as XDP shares the queue with kernel stack */
++ txq_trans_cond_update(nq);
+ for (i = 0; i < num_frames; i++) {
+ if (fec_enet_txq_xmit_frame(fep, txq, frames[i]) != 0)
+ break;
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+index 9c9c72dc57e00..06f29e80104c0 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+@@ -31,6 +31,7 @@
+ #include <linux/pci.h>
+ #include <linux/pkt_sched.h>
+ #include <linux/types.h>
++#include <linux/bitmap.h>
+ #include <net/pkt_cls.h>
+ #include <net/pkt_sched.h>
+
+@@ -407,7 +408,7 @@ struct hnae3_ae_dev {
+ unsigned long hw_err_reset_req;
+ struct hnae3_dev_specs dev_specs;
+ u32 dev_version;
+- unsigned long caps[BITS_TO_LONGS(HNAE3_DEV_CAPS_MAX_NUM)];
++ DECLARE_BITMAP(caps, HNAE3_DEV_CAPS_MAX_NUM);
+ void *priv;
+ };
+
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
+index b85c412683ddc..16ba98ff2c9b1 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
+@@ -171,6 +171,20 @@ static const struct hclge_comm_caps_bit_map hclge_vf_cmd_caps[] = {
+ {HCLGE_COMM_CAP_GRO_B, HNAE3_DEV_SUPPORT_GRO_B},
+ };
+
++static void
++hclge_comm_capability_to_bitmap(unsigned long *bitmap, __le32 *caps)
++{
++ const unsigned int words = HCLGE_COMM_QUERY_CAP_LENGTH;
++ u32 val[HCLGE_COMM_QUERY_CAP_LENGTH];
++ unsigned int i;
++
++ for (i = 0; i < words; i++)
++ val[i] = __le32_to_cpu(caps[i]);
++
++ bitmap_from_arr32(bitmap, val,
++ HCLGE_COMM_QUERY_CAP_LENGTH * BITS_PER_TYPE(u32));
++}
++
+ static void
+ hclge_comm_parse_capability(struct hnae3_ae_dev *ae_dev, bool is_pf,
+ struct hclge_comm_query_version_cmd *cmd)
+@@ -179,11 +193,12 @@ hclge_comm_parse_capability(struct hnae3_ae_dev *ae_dev, bool is_pf,
+ is_pf ? hclge_pf_cmd_caps : hclge_vf_cmd_caps;
+ u32 size = is_pf ? ARRAY_SIZE(hclge_pf_cmd_caps) :
+ ARRAY_SIZE(hclge_vf_cmd_caps);
+- u32 caps, i;
++ DECLARE_BITMAP(caps, HCLGE_COMM_QUERY_CAP_LENGTH * BITS_PER_TYPE(u32));
++ u32 i;
+
+- caps = __le32_to_cpu(cmd->caps[0]);
++ hclge_comm_capability_to_bitmap(caps, cmd->caps);
+ for (i = 0; i < size; i++)
+- if (hnae3_get_bit(caps, caps_map[i].imp_bit))
++ if (test_bit(caps_map[i].imp_bit, caps))
+ set_bit(caps_map[i].local_bit, ae_dev->caps);
+ }
+
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
+index c4aded65e848b..09362823140d5 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
+@@ -52,7 +52,10 @@ static void hclge_tm_info_to_ieee_ets(struct hclge_dev *hdev,
+
+ for (i = 0; i < HNAE3_MAX_TC; i++) {
+ ets->prio_tc[i] = hdev->tm_info.prio_tc[i];
+- ets->tc_tx_bw[i] = hdev->tm_info.pg_info[0].tc_dwrr[i];
++ if (i < hdev->tm_info.num_tc)
++ ets->tc_tx_bw[i] = hdev->tm_info.pg_info[0].tc_dwrr[i];
++ else
++ ets->tc_tx_bw[i] = 0;
+
+ if (hdev->tm_info.tc_info[i].tc_sch_mode ==
+ HCLGE_SCH_MODE_SP)
+@@ -123,7 +126,8 @@ static u8 hclge_ets_tc_changed(struct hclge_dev *hdev, struct ieee_ets *ets,
+ }
+
+ static int hclge_ets_sch_mode_validate(struct hclge_dev *hdev,
+- struct ieee_ets *ets, bool *changed)
++ struct ieee_ets *ets, bool *changed,
++ u8 tc_num)
+ {
+ bool has_ets_tc = false;
+ u32 total_ets_bw = 0;
+@@ -137,6 +141,13 @@ static int hclge_ets_sch_mode_validate(struct hclge_dev *hdev,
+ *changed = true;
+ break;
+ case IEEE_8021QAZ_TSA_ETS:
++ if (i >= tc_num) {
++ dev_err(&hdev->pdev->dev,
++ "tc%u is disabled, cannot set ets bw\n",
++ i);
++ return -EINVAL;
++ }
++
+ /* The hardware will switch to sp mode if bandwidth is
+ * 0, so limit ets bandwidth must be greater than 0.
+ */
+@@ -176,7 +187,7 @@ static int hclge_ets_validate(struct hclge_dev *hdev, struct ieee_ets *ets,
+ if (ret)
+ return ret;
+
+- ret = hclge_ets_sch_mode_validate(hdev, ets, changed);
++ ret = hclge_ets_sch_mode_validate(hdev, ets, changed, tc_num);
+ if (ret)
+ return ret;
+
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+index 233c132dc513e..409db2e709651 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+@@ -693,8 +693,7 @@ static int hclge_dbg_dump_tc(struct hclge_dev *hdev, char *buf, int len)
+ for (i = 0; i < HNAE3_MAX_TC; i++) {
+ sch_mode_str = ets_weight->tc_weight[i] ? "dwrr" : "sp";
+ pos += scnprintf(buf + pos, len - pos, "%u %4s %3u\n",
+- i, sch_mode_str,
+- hdev->tm_info.pg_info[0].tc_dwrr[i]);
++ i, sch_mode_str, ets_weight->tc_weight[i]);
+ }
+
+ return 0;
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+index 922c0da3660c7..150f146fa24fb 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+@@ -785,6 +785,7 @@ static void hclge_tm_tc_info_init(struct hclge_dev *hdev)
+ static void hclge_tm_pg_info_init(struct hclge_dev *hdev)
+ {
+ #define BW_PERCENT 100
++#define DEFAULT_BW_WEIGHT 1
+
+ u8 i;
+
+@@ -806,7 +807,7 @@ static void hclge_tm_pg_info_init(struct hclge_dev *hdev)
+ for (k = 0; k < hdev->tm_info.num_tc; k++)
+ hdev->tm_info.pg_info[i].tc_dwrr[k] = BW_PERCENT;
+ for (; k < HNAE3_MAX_TC; k++)
+- hdev->tm_info.pg_info[i].tc_dwrr[k] = 0;
++ hdev->tm_info.pg_info[i].tc_dwrr[k] = DEFAULT_BW_WEIGHT;
+ }
+ }
+
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+index 9954493cd4489..62497f5565c59 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+@@ -1839,7 +1839,7 @@ void i40e_dbg_pf_exit(struct i40e_pf *pf)
+ void i40e_dbg_init(void)
+ {
+ i40e_dbg_root = debugfs_create_dir(i40e_driver_name, NULL);
+- if (!i40e_dbg_root)
++ if (IS_ERR(i40e_dbg_root))
+ pr_info("init of debugfs failed\n");
+ }
+
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
+index ba96312feb505..e48810e0627d2 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
+@@ -3286,9 +3286,6 @@ static void iavf_adminq_task(struct work_struct *work)
+ u32 val, oldval;
+ u16 pending;
+
+- if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED)
+- goto out;
+-
+ if (!mutex_trylock(&adapter->crit_lock)) {
+ if (adapter->state == __IAVF_REMOVE)
+ return;
+@@ -3297,10 +3294,13 @@ static void iavf_adminq_task(struct work_struct *work)
+ goto out;
+ }
+
++ if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED)
++ goto unlock;
++
+ event.buf_len = IAVF_MAX_AQ_BUF_SIZE;
+ event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
+ if (!event.msg_buf)
+- goto out;
++ goto unlock;
+
+ do {
+ ret = iavf_clean_arq_element(hw, &event, &pending);
+@@ -3315,7 +3315,6 @@ static void iavf_adminq_task(struct work_struct *work)
+ if (pending != 0)
+ memset(event.msg_buf, 0, IAVF_MAX_AQ_BUF_SIZE);
+ } while (pending);
+- mutex_unlock(&adapter->crit_lock);
+
+ if (iavf_is_reset_in_progress(adapter))
+ goto freedom;
+@@ -3359,6 +3358,8 @@ static void iavf_adminq_task(struct work_struct *work)
+
+ freedom:
+ kfree(event.msg_buf);
++unlock:
++ mutex_unlock(&adapter->crit_lock);
+ out:
+ /* re-enable Admin queue interrupt cause */
+ iavf_misc_irq_enable(adapter);
+diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c b/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c
+index ead6d50fc0adc..8c6e13f87b7d3 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c
++++ b/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c
+@@ -1281,16 +1281,21 @@ ice_cfg_fdir_xtrct_seq(struct ice_pf *pf, struct ethtool_rx_flow_spec *fsp,
+ ICE_FLOW_FLD_OFF_INVAL);
+ }
+
+- /* add filter for outer headers */
+ fltr_idx = ice_ethtool_flow_to_fltr(fsp->flow_type & ~FLOW_EXT);
++
++ assign_bit(fltr_idx, hw->fdir_perfect_fltr, perfect_filter);
++
++ /* add filter for outer headers */
+ ret = ice_fdir_set_hw_fltr_rule(pf, seg, fltr_idx,
+ ICE_FD_HW_SEG_NON_TUN);
+- if (ret == -EEXIST)
+- /* Rule already exists, free memory and continue */
+- devm_kfree(dev, seg);
+- else if (ret)
++ if (ret == -EEXIST) {
++ /* Rule already exists, free memory and count as success */
++ ret = 0;
++ goto err_exit;
++ } else if (ret) {
+ /* could not write filter, free memory */
+ goto err_exit;
++ }
+
+ /* make tunneled filter HW entries if possible */
+ memcpy(&tun_seg[1], seg, sizeof(*seg));
+@@ -1305,18 +1310,13 @@ ice_cfg_fdir_xtrct_seq(struct ice_pf *pf, struct ethtool_rx_flow_spec *fsp,
+ devm_kfree(dev, tun_seg);
+ }
+
+- if (perfect_filter)
+- set_bit(fltr_idx, hw->fdir_perfect_fltr);
+- else
+- clear_bit(fltr_idx, hw->fdir_perfect_fltr);
+-
+ return ret;
+
+ err_exit:
+ devm_kfree(dev, tun_seg);
+ devm_kfree(dev, seg);
+
+- return -EOPNOTSUPP;
++ return ret;
+ }
+
+ /**
+@@ -1914,7 +1914,9 @@ int ice_add_fdir_ethtool(struct ice_vsi *vsi, struct ethtool_rxnfc *cmd)
+ input->comp_report = ICE_FXD_FLTR_QW0_COMP_REPORT_SW_FAIL;
+
+ /* input struct is added to the HW filter list */
+- ice_fdir_update_list_entry(pf, input, fsp->location);
++ ret = ice_fdir_update_list_entry(pf, input, fsp->location);
++ if (ret)
++ goto release_lock;
+
+ ret = ice_fdir_write_all_fltr(pf, input, true);
+ if (ret)
+diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
+index 496a4eb687b00..3ccf2fedc5af7 100644
+--- a/drivers/net/ethernet/intel/igc/igc_main.c
++++ b/drivers/net/ethernet/intel/igc/igc_main.c
+@@ -316,6 +316,33 @@ static void igc_clean_all_tx_rings(struct igc_adapter *adapter)
+ igc_clean_tx_ring(adapter->tx_ring[i]);
+ }
+
++static void igc_disable_tx_ring_hw(struct igc_ring *ring)
++{
++ struct igc_hw *hw = &ring->q_vector->adapter->hw;
++ u8 idx = ring->reg_idx;
++ u32 txdctl;
++
++ txdctl = rd32(IGC_TXDCTL(idx));
++ txdctl &= ~IGC_TXDCTL_QUEUE_ENABLE;
++ txdctl |= IGC_TXDCTL_SWFLUSH;
++ wr32(IGC_TXDCTL(idx), txdctl);
++}
++
++/**
++ * igc_disable_all_tx_rings_hw - Disable all transmit queue operation
++ * @adapter: board private structure
++ */
++static void igc_disable_all_tx_rings_hw(struct igc_adapter *adapter)
++{
++ int i;
++
++ for (i = 0; i < adapter->num_tx_queues; i++) {
++ struct igc_ring *tx_ring = adapter->tx_ring[i];
++
++ igc_disable_tx_ring_hw(tx_ring);
++ }
++}
++
+ /**
+ * igc_setup_tx_resources - allocate Tx resources (Descriptors)
+ * @tx_ring: tx descriptor ring (for a specific queue) to setup
+@@ -5056,6 +5083,7 @@ void igc_down(struct igc_adapter *adapter)
+ /* clear VLAN promisc flag so VFTA will be updated if necessary */
+ adapter->flags &= ~IGC_FLAG_VLAN_PROMISC;
+
++ igc_disable_all_tx_rings_hw(adapter);
+ igc_clean_all_tx_rings(adapter);
+ igc_clean_all_rx_rings(adapter);
+ }
+@@ -7274,18 +7302,6 @@ void igc_enable_rx_ring(struct igc_ring *ring)
+ igc_alloc_rx_buffers(ring, igc_desc_unused(ring));
+ }
+
+-static void igc_disable_tx_ring_hw(struct igc_ring *ring)
+-{
+- struct igc_hw *hw = &ring->q_vector->adapter->hw;
+- u8 idx = ring->reg_idx;
+- u32 txdctl;
+-
+- txdctl = rd32(IGC_TXDCTL(idx));
+- txdctl &= ~IGC_TXDCTL_QUEUE_ENABLE;
+- txdctl |= IGC_TXDCTL_SWFLUSH;
+- wr32(IGC_TXDCTL(idx), txdctl);
+-}
+-
+ void igc_disable_tx_ring(struct igc_ring *ring)
+ {
+ igc_disable_tx_ring_hw(ring);
+diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+index 1726297f2e0df..8eb9839a3ca69 100644
+--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
++++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+@@ -8479,7 +8479,7 @@ static void ixgbe_atr(struct ixgbe_ring *ring,
+ struct ixgbe_adapter *adapter = q_vector->adapter;
+
+ if (unlikely(skb_tail_pointer(skb) < hdr.network +
+- VXLAN_HEADROOM))
++ vxlan_headroom(0)))
+ return;
+
+ /* verify the port is recognized as VXLAN */
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c
+index 6fe67f3a7f6f1..7e20282c12d00 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c
+@@ -218,13 +218,54 @@ void npc_config_secret_key(struct rvu *rvu, int blkaddr)
+
+ void npc_program_mkex_hash(struct rvu *rvu, int blkaddr)
+ {
++ struct npc_mcam_kex_hash *mh = rvu->kpu.mkex_hash;
+ struct hw_cap *hwcap = &rvu->hw->cap;
++ u8 intf, ld, hdr_offset, byte_len;
+ struct rvu_hwinfo *hw = rvu->hw;
+- u8 intf;
++ u64 cfg;
+
++ /* Check if hardware supports hash extraction */
+ if (!hwcap->npc_hash_extract)
+ return;
+
++ /* Check if IPv6 source/destination address
++ * should be hash enabled.
++ * Hashing reduces 128bit SIP/DIP fields to 32bit
++ * so that 224 bit X2 key can be used for IPv6 based filters as well,
++ * which in turn results in more number of MCAM entries available for
++ * use.
++ *
++ * Hashing of IPV6 SIP/DIP is enabled in below scenarios
++ * 1. If the silicon variant supports hashing feature
++ * 2. If the number of bytes of IP addr being extracted is 4 bytes ie
++ * 32bit. The assumption here is that if user wants 8bytes of LSB of
++ * IP addr or full 16 bytes then his intention is not to use 32bit
++ * hash.
++ */
++ for (intf = 0; intf < hw->npc_intfs; intf++) {
++ for (ld = 0; ld < NPC_MAX_LD; ld++) {
++ cfg = rvu_read64(rvu, blkaddr,
++ NPC_AF_INTFX_LIDX_LTX_LDX_CFG(intf,
++ NPC_LID_LC,
++ NPC_LT_LC_IP6,
++ ld));
++ hdr_offset = FIELD_GET(NPC_HDR_OFFSET, cfg);
++ byte_len = FIELD_GET(NPC_BYTESM, cfg);
++ /* Hashing of IPv6 source/destination address should be
++ * enabled if,
++ * hdr_offset == 8 (offset of source IPv6 address) or
++ * hdr_offset == 24 (offset of destination IPv6)
++ * address) and the number of byte to be
++ * extracted is 4. As per hardware configuration
++ * byte_len should be == actual byte_len - 1.
++			 * Hence byte_len is checked against 3 but not 4.
++ */
++ if ((hdr_offset == 8 || hdr_offset == 24) && byte_len == 3)
++ mh->lid_lt_ld_hash_en[intf][NPC_LID_LC][NPC_LT_LC_IP6][ld] = true;
++ }
++ }
++
++ /* Update hash configuration if the field is hash enabled */
+ for (intf = 0; intf < hw->npc_intfs; intf++) {
+ npc_program_mkex_hash_rx(rvu, blkaddr, intf);
+ npc_program_mkex_hash_tx(rvu, blkaddr, intf);
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.h b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.h
+index a1c3d987b8044..57a09328d46b5 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.h
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.h
+@@ -70,8 +70,8 @@ static struct npc_mcam_kex_hash npc_mkex_hash_default __maybe_unused = {
+ [NIX_INTF_RX] = {
+ [NPC_LID_LC] = {
+ [NPC_LT_LC_IP6] = {
+- true,
+- true,
++ false,
++ false,
+ },
+ },
+ },
+@@ -79,8 +79,8 @@ static struct npc_mcam_kex_hash npc_mkex_hash_default __maybe_unused = {
+ [NIX_INTF_TX] = {
+ [NPC_LID_LC] = {
+ [NPC_LT_LC_IP6] = {
+- true,
+- true,
++ false,
++ false,
+ },
+ },
+ },
+diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
+index b69122686407d..31fdecb414b6f 100644
+--- a/drivers/net/ethernet/realtek/r8169_main.c
++++ b/drivers/net/ethernet/realtek/r8169_main.c
+@@ -623,6 +623,7 @@ struct rtl8169_private {
+ int cfg9346_usage_count;
+
+ unsigned supports_gmii:1;
++ unsigned aspm_manageable:1;
+ dma_addr_t counters_phys_addr;
+ struct rtl8169_counters *counters;
+ struct rtl8169_tc_offsets tc_offset;
+@@ -2746,7 +2747,8 @@ static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
+ if (tp->mac_version < RTL_GIGA_MAC_VER_32)
+ return;
+
+- if (enable) {
++ /* Don't enable ASPM in the chip if OS can't control ASPM */
++ if (enable && tp->aspm_manageable) {
+ /* On these chip versions ASPM can even harm
+ * bus communication of other PCI devices.
+ */
+@@ -5156,6 +5158,16 @@ done:
+ rtl_rar_set(tp, mac_addr);
+ }
+
++/* register is set if system vendor successfully tested ASPM 1.2 */
++static bool rtl_aspm_is_safe(struct rtl8169_private *tp)
++{
++ if (tp->mac_version >= RTL_GIGA_MAC_VER_61 &&
++ r8168_mac_ocp_read(tp, 0xc0b2) & 0xf)
++ return true;
++
++ return false;
++}
++
+ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
+ {
+ struct rtl8169_private *tp;
+@@ -5227,6 +5239,19 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
+
+ tp->mac_version = chipset;
+
++ /* Disable ASPM L1 as that cause random device stop working
++ * problems as well as full system hangs for some PCIe devices users.
++ * Chips from RTL8168h partially have issues with L1.2, but seem
++ * to work fine with L1 and L1.1.
++ */
++ if (rtl_aspm_is_safe(tp))
++ rc = 0;
++ else if (tp->mac_version >= RTL_GIGA_MAC_VER_46)
++ rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1_2);
++ else
++ rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1);
++ tp->aspm_manageable = !rc;
++
+ tp->dash_type = rtl_check_dash(tp);
+
+ tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+index df41eac54058f..03ceb6a940732 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+@@ -240,13 +240,15 @@ void stmmac_dwmac4_set_mac_addr(void __iomem *ioaddr, const u8 addr[6],
+ void stmmac_dwmac4_set_mac(void __iomem *ioaddr, bool enable)
+ {
+ u32 value = readl(ioaddr + GMAC_CONFIG);
++ u32 old_val = value;
+
+ if (enable)
+ value |= GMAC_CONFIG_RE | GMAC_CONFIG_TE;
+ else
+ value &= ~(GMAC_CONFIG_TE | GMAC_CONFIG_RE);
+
+- writel(value, ioaddr + GMAC_CONFIG);
++ if (value != old_val)
++ writel(value, ioaddr + GMAC_CONFIG);
+ }
+
+ void stmmac_dwmac4_get_mac_addr(void __iomem *ioaddr, unsigned char *addr,
+diff --git a/drivers/net/ipa/ipa_table.c b/drivers/net/ipa/ipa_table.c
+index f0529c31d0b6e..7b637bb8b41c8 100644
+--- a/drivers/net/ipa/ipa_table.c
++++ b/drivers/net/ipa/ipa_table.c
+@@ -273,16 +273,15 @@ static int ipa_filter_reset(struct ipa *ipa, bool modem)
+ if (ret)
+ return ret;
+
+- ret = ipa_filter_reset_table(ipa, true, false, modem);
+- if (ret)
++ ret = ipa_filter_reset_table(ipa, false, true, modem);
++ if (ret || !ipa_table_hash_support(ipa))
+ return ret;
+
+- ret = ipa_filter_reset_table(ipa, false, true, modem);
++ ret = ipa_filter_reset_table(ipa, true, false, modem);
+ if (ret)
+ return ret;
+- ret = ipa_filter_reset_table(ipa, true, true, modem);
+
+- return ret;
++ return ipa_filter_reset_table(ipa, true, true, modem);
+ }
+
+ /* The AP routes and modem routes are each contiguous within the
+@@ -291,12 +290,13 @@ static int ipa_filter_reset(struct ipa *ipa, bool modem)
+ * */
+ static int ipa_route_reset(struct ipa *ipa, bool modem)
+ {
++ bool hash_support = ipa_table_hash_support(ipa);
+ u32 modem_route_count = ipa->modem_route_count;
+ struct gsi_trans *trans;
+ u16 first;
+ u16 count;
+
+- trans = ipa_cmd_trans_alloc(ipa, 4);
++ trans = ipa_cmd_trans_alloc(ipa, hash_support ? 4 : 2);
+ if (!trans) {
+ dev_err(&ipa->pdev->dev,
+ "no transaction for %s route reset\n",
+@@ -313,10 +313,12 @@ static int ipa_route_reset(struct ipa *ipa, bool modem)
+ }
+
+ ipa_table_reset_add(trans, false, false, false, first, count);
+- ipa_table_reset_add(trans, false, true, false, first, count);
+-
+ ipa_table_reset_add(trans, false, false, true, first, count);
+- ipa_table_reset_add(trans, false, true, true, first, count);
++
++ if (hash_support) {
++ ipa_table_reset_add(trans, false, true, false, first, count);
++ ipa_table_reset_add(trans, false, true, true, first, count);
++ }
+
+ gsi_trans_commit_wait(trans);
+
+diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
+index 4a53debf9d7c4..ed908165a8b4e 100644
+--- a/drivers/net/macvlan.c
++++ b/drivers/net/macvlan.c
+@@ -1746,6 +1746,7 @@ static const struct nla_policy macvlan_policy[IFLA_MACVLAN_MAX + 1] = {
+ [IFLA_MACVLAN_MACADDR_COUNT] = { .type = NLA_U32 },
+ [IFLA_MACVLAN_BC_QUEUE_LEN] = { .type = NLA_U32 },
+ [IFLA_MACVLAN_BC_QUEUE_LEN_USED] = { .type = NLA_REJECT },
++ [IFLA_MACVLAN_BC_CUTOFF] = { .type = NLA_S32 },
+ };
+
+ int macvlan_link_register(struct rtnl_link_ops *ops)
+diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
+index 55d9d7acc32eb..d4bb90d768811 100644
+--- a/drivers/net/phy/marvell10g.c
++++ b/drivers/net/phy/marvell10g.c
+@@ -328,6 +328,13 @@ static int mv3310_power_up(struct phy_device *phydev)
+ ret = phy_clear_bits_mmd(phydev, MDIO_MMD_VEND2, MV_V2_PORT_CTRL,
+ MV_V2_PORT_CTRL_PWRDOWN);
+
++ /* Sometimes, the power down bit doesn't clear immediately, and
++ * a read of this register causes the bit not to clear. Delay
++ * 100us to allow the PHY to come out of power down mode before
++ * the next access.
++ */
++ udelay(100);
++
+ if (phydev->drv->phy_id != MARVELL_PHY_ID_88X3310 ||
+ priv->firmware_ver < 0x00030000)
+ return ret;
+diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
+index 555b0b1e9a789..d3dc22509ea58 100644
+--- a/drivers/net/team/team.c
++++ b/drivers/net/team/team.c
+@@ -2135,6 +2135,15 @@ static void team_setup_by_port(struct net_device *dev,
+ dev->mtu = port_dev->mtu;
+ memcpy(dev->broadcast, port_dev->broadcast, port_dev->addr_len);
+ eth_hw_addr_inherit(dev, port_dev);
++
++ if (port_dev->flags & IFF_POINTOPOINT) {
++ dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST);
++ dev->flags |= (IFF_POINTOPOINT | IFF_NOARP);
++ } else if ((port_dev->flags & (IFF_BROADCAST | IFF_MULTICAST)) ==
++ (IFF_BROADCAST | IFF_MULTICAST)) {
++ dev->flags |= (IFF_BROADCAST | IFF_MULTICAST);
++ dev->flags &= ~(IFF_POINTOPOINT | IFF_NOARP);
++ }
+ }
+
+ static int team_dev_type_check_change(struct net_device *dev,
+diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
+index 486b5849033dc..2336a0e4befa5 100644
+--- a/drivers/net/virtio_net.c
++++ b/drivers/net/virtio_net.c
+@@ -4110,6 +4110,8 @@ static int virtnet_probe(struct virtio_device *vdev)
+ if (vi->has_rss || vi->has_rss_hash_report)
+ virtnet_init_default_rss(vi);
+
++ _virtnet_set_queues(vi, vi->curr_queue_pairs);
++
+ /* serialize netdev register + virtio_device_ready() with ndo_open() */
+ rtnl_lock();
+
+@@ -4148,8 +4150,6 @@ static int virtnet_probe(struct virtio_device *vdev)
+ goto free_unregister_netdev;
+ }
+
+- virtnet_set_queues(vi, vi->curr_queue_pairs);
+-
+ /* Assume link up if device can't report link status,
+ otherwise get link status from config. */
+ netif_carrier_off(dev);
+diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
+index 561fe1b314f5f..7532cac2154c5 100644
+--- a/drivers/net/vxlan/vxlan_core.c
++++ b/drivers/net/vxlan/vxlan_core.c
+@@ -623,6 +623,32 @@ static int vxlan_fdb_append(struct vxlan_fdb *f,
+ return 1;
+ }
+
++static bool vxlan_parse_gpe_proto(struct vxlanhdr *hdr, __be16 *protocol)
++{
++ struct vxlanhdr_gpe *gpe = (struct vxlanhdr_gpe *)hdr;
++
++ /* Need to have Next Protocol set for interfaces in GPE mode. */
++ if (!gpe->np_applied)
++ return false;
++ /* "The initial version is 0. If a receiver does not support the
++ * version indicated it MUST drop the packet.
++ */
++ if (gpe->version != 0)
++ return false;
++ /* "When the O bit is set to 1, the packet is an OAM packet and OAM
++ * processing MUST occur." However, we don't implement OAM
++ * processing, thus drop the packet.
++ */
++ if (gpe->oam_flag)
++ return false;
++
++ *protocol = tun_p_to_eth_p(gpe->next_protocol);
++ if (!*protocol)
++ return false;
++
++ return true;
++}
++
+ static struct vxlanhdr *vxlan_gro_remcsum(struct sk_buff *skb,
+ unsigned int off,
+ struct vxlanhdr *vh, size_t hdrlen,
+@@ -649,26 +675,24 @@ static struct vxlanhdr *vxlan_gro_remcsum(struct sk_buff *skb,
+ return vh;
+ }
+
+-static struct sk_buff *vxlan_gro_receive(struct sock *sk,
+- struct list_head *head,
+- struct sk_buff *skb)
++static struct vxlanhdr *vxlan_gro_prepare_receive(struct sock *sk,
++ struct list_head *head,
++ struct sk_buff *skb,
++ struct gro_remcsum *grc)
+ {
+- struct sk_buff *pp = NULL;
+ struct sk_buff *p;
+ struct vxlanhdr *vh, *vh2;
+ unsigned int hlen, off_vx;
+- int flush = 1;
+ struct vxlan_sock *vs = rcu_dereference_sk_user_data(sk);
+ __be32 flags;
+- struct gro_remcsum grc;
+
+- skb_gro_remcsum_init(&grc);
++ skb_gro_remcsum_init(grc);
+
+ off_vx = skb_gro_offset(skb);
+ hlen = off_vx + sizeof(*vh);
+ vh = skb_gro_header(skb, hlen, off_vx);
+ if (unlikely(!vh))
+- goto out;
++ return NULL;
+
+ skb_gro_postpull_rcsum(skb, vh, sizeof(struct vxlanhdr));
+
+@@ -676,12 +700,12 @@ static struct sk_buff *vxlan_gro_receive(struct sock *sk,
+
+ if ((flags & VXLAN_HF_RCO) && (vs->flags & VXLAN_F_REMCSUM_RX)) {
+ vh = vxlan_gro_remcsum(skb, off_vx, vh, sizeof(struct vxlanhdr),
+- vh->vx_vni, &grc,
++ vh->vx_vni, grc,
+ !!(vs->flags &
+ VXLAN_F_REMCSUM_NOPARTIAL));
+
+ if (!vh)
+- goto out;
++ return NULL;
+ }
+
+ skb_gro_pull(skb, sizeof(struct vxlanhdr)); /* pull vxlan header */
+@@ -698,12 +722,48 @@ static struct sk_buff *vxlan_gro_receive(struct sock *sk,
+ }
+ }
+
+- pp = call_gro_receive(eth_gro_receive, head, skb);
+- flush = 0;
++ return vh;
++}
+
+-out:
++static struct sk_buff *vxlan_gro_receive(struct sock *sk,
++ struct list_head *head,
++ struct sk_buff *skb)
++{
++ struct sk_buff *pp = NULL;
++ struct gro_remcsum grc;
++ int flush = 1;
++
++ if (vxlan_gro_prepare_receive(sk, head, skb, &grc)) {
++ pp = call_gro_receive(eth_gro_receive, head, skb);
++ flush = 0;
++ }
+ skb_gro_flush_final_remcsum(skb, pp, flush, &grc);
++ return pp;
++}
++
++static struct sk_buff *vxlan_gpe_gro_receive(struct sock *sk,
++ struct list_head *head,
++ struct sk_buff *skb)
++{
++ const struct packet_offload *ptype;
++ struct sk_buff *pp = NULL;
++ struct gro_remcsum grc;
++ struct vxlanhdr *vh;
++ __be16 protocol;
++ int flush = 1;
+
++ vh = vxlan_gro_prepare_receive(sk, head, skb, &grc);
++ if (vh) {
++ if (!vxlan_parse_gpe_proto(vh, &protocol))
++ goto out;
++ ptype = gro_find_receive_by_type(protocol);
++ if (!ptype)
++ goto out;
++ pp = call_gro_receive(ptype->callbacks.gro_receive, head, skb);
++ flush = 0;
++ }
++out:
++ skb_gro_flush_final_remcsum(skb, pp, flush, &grc);
+ return pp;
+ }
+
+@@ -715,6 +775,21 @@ static int vxlan_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff)
+ return eth_gro_complete(skb, nhoff + sizeof(struct vxlanhdr));
+ }
+
++static int vxlan_gpe_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff)
++{
++ struct vxlanhdr *vh = (struct vxlanhdr *)(skb->data + nhoff);
++ const struct packet_offload *ptype;
++ int err = -ENOSYS;
++ __be16 protocol;
++
++ if (!vxlan_parse_gpe_proto(vh, &protocol))
++ return err;
++ ptype = gro_find_complete_by_type(protocol);
++ if (ptype)
++ err = ptype->callbacks.gro_complete(skb, nhoff + sizeof(struct vxlanhdr));
++ return err;
++}
++
+ static struct vxlan_fdb *vxlan_fdb_alloc(struct vxlan_dev *vxlan, const u8 *mac,
+ __u16 state, __be32 src_vni,
+ __u16 ndm_flags)
+@@ -1525,35 +1600,6 @@ out:
+ unparsed->vx_flags &= ~VXLAN_GBP_USED_BITS;
+ }
+
+-static bool vxlan_parse_gpe_hdr(struct vxlanhdr *unparsed,
+- __be16 *protocol,
+- struct sk_buff *skb, u32 vxflags)
+-{
+- struct vxlanhdr_gpe *gpe = (struct vxlanhdr_gpe *)unparsed;
+-
+- /* Need to have Next Protocol set for interfaces in GPE mode. */
+- if (!gpe->np_applied)
+- return false;
+- /* "The initial version is 0. If a receiver does not support the
+- * version indicated it MUST drop the packet.
+- */
+- if (gpe->version != 0)
+- return false;
+- /* "When the O bit is set to 1, the packet is an OAM packet and OAM
+- * processing MUST occur." However, we don't implement OAM
+- * processing, thus drop the packet.
+- */
+- if (gpe->oam_flag)
+- return false;
+-
+- *protocol = tun_p_to_eth_p(gpe->next_protocol);
+- if (!*protocol)
+- return false;
+-
+- unparsed->vx_flags &= ~VXLAN_GPE_USED_BITS;
+- return true;
+-}
+-
+ static bool vxlan_set_mac(struct vxlan_dev *vxlan,
+ struct vxlan_sock *vs,
+ struct sk_buff *skb, __be32 vni)
+@@ -1655,8 +1701,9 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
+ * used by VXLAN extensions if explicitly requested.
+ */
+ if (vs->flags & VXLAN_F_GPE) {
+- if (!vxlan_parse_gpe_hdr(&unparsed, &protocol, skb, vs->flags))
++ if (!vxlan_parse_gpe_proto(&unparsed, &protocol))
+ goto drop;
++ unparsed.vx_flags &= ~VXLAN_GPE_USED_BITS;
+ raw_proto = true;
+ }
+
+@@ -2515,7 +2562,7 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
+ }
+
+ ndst = &rt->dst;
+- err = skb_tunnel_check_pmtu(skb, ndst, VXLAN_HEADROOM,
++ err = skb_tunnel_check_pmtu(skb, ndst, vxlan_headroom(flags & VXLAN_F_GPE),
+ netif_is_any_bridge_port(dev));
+ if (err < 0) {
+ goto tx_error;
+@@ -2576,7 +2623,8 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
+ goto out_unlock;
+ }
+
+- err = skb_tunnel_check_pmtu(skb, ndst, VXLAN6_HEADROOM,
++ err = skb_tunnel_check_pmtu(skb, ndst,
++ vxlan_headroom((flags & VXLAN_F_GPE) | VXLAN_F_IPV6),
+ netif_is_any_bridge_port(dev));
+ if (err < 0) {
+ goto tx_error;
+@@ -2988,14 +3036,12 @@ static int vxlan_change_mtu(struct net_device *dev, int new_mtu)
+ struct vxlan_rdst *dst = &vxlan->default_dst;
+ struct net_device *lowerdev = __dev_get_by_index(vxlan->net,
+ dst->remote_ifindex);
+- bool use_ipv6 = !!(vxlan->cfg.flags & VXLAN_F_IPV6);
+
+ /* This check is different than dev->max_mtu, because it looks at
+ * the lowerdev->mtu, rather than the static dev->max_mtu
+ */
+ if (lowerdev) {
+- int max_mtu = lowerdev->mtu -
+- (use_ipv6 ? VXLAN6_HEADROOM : VXLAN_HEADROOM);
++ int max_mtu = lowerdev->mtu - vxlan_headroom(vxlan->cfg.flags);
+ if (new_mtu > max_mtu)
+ return -EINVAL;
+ }
+@@ -3376,8 +3422,13 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, bool ipv6,
+ tunnel_cfg.encap_rcv = vxlan_rcv;
+ tunnel_cfg.encap_err_lookup = vxlan_err_lookup;
+ tunnel_cfg.encap_destroy = NULL;
+- tunnel_cfg.gro_receive = vxlan_gro_receive;
+- tunnel_cfg.gro_complete = vxlan_gro_complete;
++ if (vs->flags & VXLAN_F_GPE) {
++ tunnel_cfg.gro_receive = vxlan_gpe_gro_receive;
++ tunnel_cfg.gro_complete = vxlan_gpe_gro_complete;
++ } else {
++ tunnel_cfg.gro_receive = vxlan_gro_receive;
++ tunnel_cfg.gro_complete = vxlan_gro_complete;
++ }
+
+ setup_udp_tunnel_sock(net, sock, &tunnel_cfg);
+
+@@ -3641,11 +3692,11 @@ static void vxlan_config_apply(struct net_device *dev,
+ struct vxlan_dev *vxlan = netdev_priv(dev);
+ struct vxlan_rdst *dst = &vxlan->default_dst;
+ unsigned short needed_headroom = ETH_HLEN;
+- bool use_ipv6 = !!(conf->flags & VXLAN_F_IPV6);
+ int max_mtu = ETH_MAX_MTU;
++ u32 flags = conf->flags;
+
+ if (!changelink) {
+- if (conf->flags & VXLAN_F_GPE)
++ if (flags & VXLAN_F_GPE)
+ vxlan_raw_setup(dev);
+ else
+ vxlan_ether_setup(dev);
+@@ -3670,8 +3721,7 @@ static void vxlan_config_apply(struct net_device *dev,
+
+ dev->needed_tailroom = lowerdev->needed_tailroom;
+
+- max_mtu = lowerdev->mtu - (use_ipv6 ? VXLAN6_HEADROOM :
+- VXLAN_HEADROOM);
++ max_mtu = lowerdev->mtu - vxlan_headroom(flags);
+ if (max_mtu < ETH_MIN_MTU)
+ max_mtu = ETH_MIN_MTU;
+
+@@ -3682,10 +3732,9 @@ static void vxlan_config_apply(struct net_device *dev,
+ if (dev->mtu > max_mtu)
+ dev->mtu = max_mtu;
+
+- if (use_ipv6 || conf->flags & VXLAN_F_COLLECT_METADATA)
+- needed_headroom += VXLAN6_HEADROOM;
+- else
+- needed_headroom += VXLAN_HEADROOM;
++ if (flags & VXLAN_F_COLLECT_METADATA)
++ flags |= VXLAN_F_IPV6;
++ needed_headroom += vxlan_headroom(flags);
+ dev->needed_headroom = needed_headroom;
+
+ memcpy(&vxlan->cfg, conf, sizeof(*conf));
+diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
+index 827d91e73efab..0af0e965fb57e 100644
+--- a/drivers/pci/controller/pcie-rockchip-ep.c
++++ b/drivers/pci/controller/pcie-rockchip-ep.c
+@@ -61,65 +61,32 @@ static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip,
+ ROCKCHIP_PCIE_AT_OB_REGION_DESC0(region));
+ rockchip_pcie_write(rockchip, 0,
+ ROCKCHIP_PCIE_AT_OB_REGION_DESC1(region));
+- rockchip_pcie_write(rockchip, 0,
+- ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(region));
+- rockchip_pcie_write(rockchip, 0,
+- ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(region));
+ }
+
+ static void rockchip_pcie_prog_ep_ob_atu(struct rockchip_pcie *rockchip, u8 fn,
+- u32 r, u32 type, u64 cpu_addr,
+- u64 pci_addr, size_t size)
++ u32 r, u64 cpu_addr, u64 pci_addr,
++ size_t size)
+ {
+- u64 sz = 1ULL << fls64(size - 1);
+- int num_pass_bits = ilog2(sz);
+- u32 addr0, addr1, desc0, desc1;
+- bool is_nor_msg = (type == AXI_WRAPPER_NOR_MSG);
++ int num_pass_bits = fls64(size - 1);
++ u32 addr0, addr1, desc0;
+
+- /* The minimal region size is 1MB */
+ if (num_pass_bits < 8)
+ num_pass_bits = 8;
+
+- cpu_addr -= rockchip->mem_res->start;
+- addr0 = ((is_nor_msg ? 0x10 : (num_pass_bits - 1)) &
+- PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
+- (lower_32_bits(cpu_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
+- addr1 = upper_32_bits(is_nor_msg ? cpu_addr : pci_addr);
+- desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | type;
+- desc1 = 0;
+-
+- if (is_nor_msg) {
+- rockchip_pcie_write(rockchip, 0,
+- ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r));
+- rockchip_pcie_write(rockchip, 0,
+- ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r));
+- rockchip_pcie_write(rockchip, desc0,
+- ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r));
+- rockchip_pcie_write(rockchip, desc1,
+- ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r));
+- } else {
+- /* PCI bus address region */
+- rockchip_pcie_write(rockchip, addr0,
+- ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r));
+- rockchip_pcie_write(rockchip, addr1,
+- ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r));
+- rockchip_pcie_write(rockchip, desc0,
+- ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r));
+- rockchip_pcie_write(rockchip, desc1,
+- ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r));
+-
+- addr0 =
+- ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
+- (lower_32_bits(cpu_addr) &
+- PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
+- addr1 = upper_32_bits(cpu_addr);
+- }
++ addr0 = ((num_pass_bits - 1) & PCIE_CORE_OB_REGION_ADDR0_NUM_BITS) |
++ (lower_32_bits(pci_addr) & PCIE_CORE_OB_REGION_ADDR0_LO_ADDR);
++ addr1 = upper_32_bits(pci_addr);
++ desc0 = ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(fn) | AXI_WRAPPER_MEM_WRITE;
+
+- /* CPU bus address region */
++ /* PCI bus address region */
+ rockchip_pcie_write(rockchip, addr0,
+- ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r));
++ ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r));
+ rockchip_pcie_write(rockchip, addr1,
+- ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r));
++ ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r));
++ rockchip_pcie_write(rockchip, desc0,
++ ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r));
++ rockchip_pcie_write(rockchip, 0,
++ ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r));
+ }
+
+ static int rockchip_pcie_ep_write_header(struct pci_epc *epc, u8 fn, u8 vfn,
+@@ -258,26 +225,20 @@ static void rockchip_pcie_ep_clear_bar(struct pci_epc *epc, u8 fn, u8 vfn,
+ ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR1(fn, bar));
+ }
+
++static inline u32 rockchip_ob_region(phys_addr_t addr)
++{
++ return (addr >> ilog2(SZ_1M)) & 0x1f;
++}
++
+ static int rockchip_pcie_ep_map_addr(struct pci_epc *epc, u8 fn, u8 vfn,
+ phys_addr_t addr, u64 pci_addr,
+ size_t size)
+ {
+ struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
+ struct rockchip_pcie *pcie = &ep->rockchip;
+- u32 r;
+-
+- r = find_first_zero_bit(&ep->ob_region_map, BITS_PER_LONG);
+- /*
+- * Region 0 is reserved for configuration space and shouldn't
+- * be used elsewhere per TRM, so leave it out.
+- */
+- if (r >= ep->max_regions - 1) {
+- dev_err(&epc->dev, "no free outbound region\n");
+- return -EINVAL;
+- }
++ u32 r = rockchip_ob_region(addr);
+
+- rockchip_pcie_prog_ep_ob_atu(pcie, fn, r, AXI_WRAPPER_MEM_WRITE, addr,
+- pci_addr, size);
++ rockchip_pcie_prog_ep_ob_atu(pcie, fn, r, addr, pci_addr, size);
+
+ set_bit(r, &ep->ob_region_map);
+ ep->ob_addr[r] = addr;
+@@ -292,15 +253,11 @@ static void rockchip_pcie_ep_unmap_addr(struct pci_epc *epc, u8 fn, u8 vfn,
+ struct rockchip_pcie *rockchip = &ep->rockchip;
+ u32 r;
+
+- for (r = 0; r < ep->max_regions - 1; r++)
++ for (r = 0; r < ep->max_regions; r++)
+ if (ep->ob_addr[r] == addr)
+ break;
+
+- /*
+- * Region 0 is reserved for configuration space and shouldn't
+- * be used elsewhere per TRM, so leave it out.
+- */
+- if (r == ep->max_regions - 1)
++ if (r == ep->max_regions)
+ return;
+
+ rockchip_pcie_clear_ep_ob_atu(rockchip, r);
+@@ -397,7 +354,8 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
+ struct rockchip_pcie *rockchip = &ep->rockchip;
+ u32 flags, mme, data, data_mask;
+ u8 msi_count;
+- u64 pci_addr, pci_addr_mask = 0xff;
++ u64 pci_addr;
++ u32 r;
+
+ /* Check MSI enable bit */
+ flags = rockchip_pcie_read(&ep->rockchip,
+@@ -431,21 +389,20 @@ static int rockchip_pcie_ep_send_msi_irq(struct rockchip_pcie_ep *ep, u8 fn,
+ ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
+ ROCKCHIP_PCIE_EP_MSI_CTRL_REG +
+ PCI_MSI_ADDRESS_LO);
+- pci_addr &= GENMASK_ULL(63, 2);
+
+ /* Set the outbound region if needed. */
+- if (unlikely(ep->irq_pci_addr != (pci_addr & ~pci_addr_mask) ||
++ if (unlikely(ep->irq_pci_addr != (pci_addr & PCIE_ADDR_MASK) ||
+ ep->irq_pci_fn != fn)) {
+- rockchip_pcie_prog_ep_ob_atu(rockchip, fn, ep->max_regions - 1,
+- AXI_WRAPPER_MEM_WRITE,
++ r = rockchip_ob_region(ep->irq_phys_addr);
++ rockchip_pcie_prog_ep_ob_atu(rockchip, fn, r,
+ ep->irq_phys_addr,
+- pci_addr & ~pci_addr_mask,
+- pci_addr_mask + 1);
+- ep->irq_pci_addr = (pci_addr & ~pci_addr_mask);
++ pci_addr & PCIE_ADDR_MASK,
++ ~PCIE_ADDR_MASK + 1);
++ ep->irq_pci_addr = (pci_addr & PCIE_ADDR_MASK);
+ ep->irq_pci_fn = fn;
+ }
+
+- writew(data, ep->irq_cpu_addr + (pci_addr & pci_addr_mask));
++ writew(data, ep->irq_cpu_addr + (pci_addr & ~PCIE_ADDR_MASK));
+ return 0;
+ }
+
+@@ -527,6 +484,8 @@ static int rockchip_pcie_parse_ep_dt(struct rockchip_pcie *rockchip,
+ if (err < 0 || ep->max_regions > MAX_REGION_LIMIT)
+ ep->max_regions = MAX_REGION_LIMIT;
+
++ ep->ob_region_map = 0;
++
+ err = of_property_read_u8(dev->of_node, "max-functions",
+ &ep->epc->max_functions);
+ if (err < 0)
+@@ -547,7 +506,9 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
+ struct rockchip_pcie *rockchip;
+ struct pci_epc *epc;
+ size_t max_regions;
+- int err;
++ struct pci_epc_mem_window *windows = NULL;
++ int err, i;
++ u32 cfg_msi, cfg_msix_cp;
+
+ ep = devm_kzalloc(dev, sizeof(*ep), GFP_KERNEL);
+ if (!ep)
+@@ -594,15 +555,27 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
+ /* Only enable function 0 by default */
+ rockchip_pcie_write(rockchip, BIT(0), PCIE_CORE_PHY_FUNC_CFG);
+
+- err = pci_epc_mem_init(epc, rockchip->mem_res->start,
+- resource_size(rockchip->mem_res), PAGE_SIZE);
++ windows = devm_kcalloc(dev, ep->max_regions,
++ sizeof(struct pci_epc_mem_window), GFP_KERNEL);
++ if (!windows) {
++ err = -ENOMEM;
++ goto err_uninit_port;
++ }
++ for (i = 0; i < ep->max_regions; i++) {
++ windows[i].phys_base = rockchip->mem_res->start + (SZ_1M * i);
++ windows[i].size = SZ_1M;
++ windows[i].page_size = SZ_1M;
++ }
++ err = pci_epc_multi_mem_init(epc, windows, ep->max_regions);
++ devm_kfree(dev, windows);
++
+ if (err < 0) {
+ dev_err(dev, "failed to initialize the memory space\n");
+ goto err_uninit_port;
+ }
+
+ ep->irq_cpu_addr = pci_epc_mem_alloc_addr(epc, &ep->irq_phys_addr,
+- SZ_128K);
++ SZ_1M);
+ if (!ep->irq_cpu_addr) {
+ dev_err(dev, "failed to reserve memory space for MSI\n");
+ err = -ENOMEM;
+@@ -611,6 +584,29 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
+
+ ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;
+
++ /*
++ * MSI-X is not supported but the controller still advertises the MSI-X
++ * capability by default, which can lead to the Root Complex side
++ * allocating MSI-X vectors which cannot be used. Avoid this by skipping
++ * the MSI-X capability entry in the PCIe capabilities linked-list: get
++ * the next pointer from the MSI-X entry and set that in the MSI
++ * capability entry (which is the previous entry). This way the MSI-X
++ * entry is skipped (left out of the linked-list) and not advertised.
++ */
++ cfg_msi = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
++ ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
++
++ cfg_msi &= ~ROCKCHIP_PCIE_EP_MSI_CP1_MASK;
++
++ cfg_msix_cp = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
++ ROCKCHIP_PCIE_EP_MSIX_CAP_REG) &
++ ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK;
++
++ cfg_msi |= cfg_msix_cp;
++
++ rockchip_pcie_write(rockchip, cfg_msi,
++ PCIE_EP_CONFIG_BASE + ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
++
+ rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE,
+ PCIE_CLIENT_CONFIG);
+
+diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
+index 8e92dc3339ecc..fe0333778fd93 100644
+--- a/drivers/pci/controller/pcie-rockchip.h
++++ b/drivers/pci/controller/pcie-rockchip.h
+@@ -139,6 +139,7 @@
+
+ #define PCIE_RC_RP_ATS_BASE 0x400000
+ #define PCIE_RC_CONFIG_NORMAL_BASE 0x800000
++#define PCIE_EP_PF_CONFIG_REGS_BASE 0x800000
+ #define PCIE_RC_CONFIG_BASE 0xa00000
+ #define PCIE_EP_CONFIG_BASE 0xa00000
+ #define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00)
+@@ -157,10 +158,11 @@
+ #define PCIE_RC_CONFIG_THP_CAP (PCIE_RC_CONFIG_BASE + 0x274)
+ #define PCIE_RC_CONFIG_THP_CAP_NEXT_MASK GENMASK(31, 20)
+
++#define PCIE_ADDR_MASK 0xffffff00
+ #define PCIE_CORE_AXI_CONF_BASE 0xc00000
+ #define PCIE_CORE_OB_REGION_ADDR0 (PCIE_CORE_AXI_CONF_BASE + 0x0)
+ #define PCIE_CORE_OB_REGION_ADDR0_NUM_BITS 0x3f
+-#define PCIE_CORE_OB_REGION_ADDR0_LO_ADDR 0xffffff00
++#define PCIE_CORE_OB_REGION_ADDR0_LO_ADDR PCIE_ADDR_MASK
+ #define PCIE_CORE_OB_REGION_ADDR1 (PCIE_CORE_AXI_CONF_BASE + 0x4)
+ #define PCIE_CORE_OB_REGION_DESC0 (PCIE_CORE_AXI_CONF_BASE + 0x8)
+ #define PCIE_CORE_OB_REGION_DESC1 (PCIE_CORE_AXI_CONF_BASE + 0xc)
+@@ -168,7 +170,7 @@
+ #define PCIE_CORE_AXI_INBOUND_BASE 0xc00800
+ #define PCIE_RP_IB_ADDR0 (PCIE_CORE_AXI_INBOUND_BASE + 0x0)
+ #define PCIE_CORE_IB_REGION_ADDR0_NUM_BITS 0x3f
+-#define PCIE_CORE_IB_REGION_ADDR0_LO_ADDR 0xffffff00
++#define PCIE_CORE_IB_REGION_ADDR0_LO_ADDR PCIE_ADDR_MASK
+ #define PCIE_RP_IB_ADDR1 (PCIE_CORE_AXI_INBOUND_BASE + 0x4)
+
+ /* Size of one AXI Region (not Region 0) */
+@@ -225,6 +227,8 @@
+ #define ROCKCHIP_PCIE_EP_CMD_STATUS 0x4
+ #define ROCKCHIP_PCIE_EP_CMD_STATUS_IS BIT(19)
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG 0x90
++#define ROCKCHIP_PCIE_EP_MSI_CP1_OFFSET 8
++#define ROCKCHIP_PCIE_EP_MSI_CP1_MASK GENMASK(15, 8)
+ #define ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET 16
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET 17
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK GENMASK(19, 17)
+@@ -232,14 +236,19 @@
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MME_MASK GENMASK(22, 20)
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_ME BIT(16)
+ #define ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP BIT(24)
++#define ROCKCHIP_PCIE_EP_MSIX_CAP_REG 0xb0
++#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_OFFSET 8
++#define ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK GENMASK(15, 8)
+ #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR 0x1
+-#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) (((fn) << 12) & GENMASK(19, 12))
++#define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR 0x3
++#define ROCKCHIP_PCIE_EP_FUNC_BASE(fn) \
++ (PCIE_EP_PF_CONFIG_REGS_BASE + (((fn) << 12) & GENMASK(19, 12)))
++#define ROCKCHIP_PCIE_EP_VIRT_FUNC_BASE(fn) \
++ (PCIE_EP_PF_CONFIG_REGS_BASE + 0x10000 + (((fn) << 12) & GENMASK(19, 12)))
+ #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR0(fn, bar) \
+- (PCIE_RC_RP_ATS_BASE + 0x0840 + (fn) * 0x0040 + (bar) * 0x0008)
++ (PCIE_CORE_AXI_CONF_BASE + 0x0828 + (fn) * 0x0040 + (bar) * 0x0008)
+ #define ROCKCHIP_PCIE_AT_IB_EP_FUNC_BAR_ADDR1(fn, bar) \
+- (PCIE_RC_RP_ATS_BASE + 0x0844 + (fn) * 0x0040 + (bar) * 0x0008)
+-#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \
+- (PCIE_RC_RP_ATS_BASE + 0x0000 + ((r) & 0x1f) * 0x0020)
++ (PCIE_CORE_AXI_CONF_BASE + 0x082c + (fn) * 0x0040 + (bar) * 0x0008)
+ #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN_MASK GENMASK(19, 12)
+ #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_DEVFN(devfn) \
+ (((devfn) << 12) & \
+@@ -247,20 +256,21 @@
+ #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK GENMASK(27, 20)
+ #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS(bus) \
+ (((bus) << 20) & ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0_BUS_MASK)
++#define PCIE_RC_EP_ATR_OB_REGIONS_1_32 (PCIE_CORE_AXI_CONF_BASE + 0x0020)
++#define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR0(r) \
++ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0000 + ((r) & 0x1f) * 0x0020)
+ #define ROCKCHIP_PCIE_AT_OB_REGION_PCI_ADDR1(r) \
+- (PCIE_RC_RP_ATS_BASE + 0x0004 + ((r) & 0x1f) * 0x0020)
++ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0004 + ((r) & 0x1f) * 0x0020)
+ #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_HARDCODED_RID BIT(23)
+ #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK GENMASK(31, 24)
+ #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN(devfn) \
+ (((devfn) << 24) & ROCKCHIP_PCIE_AT_OB_REGION_DESC0_DEVFN_MASK)
+ #define ROCKCHIP_PCIE_AT_OB_REGION_DESC0(r) \
+- (PCIE_RC_RP_ATS_BASE + 0x0008 + ((r) & 0x1f) * 0x0020)
+-#define ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r) \
+- (PCIE_RC_RP_ATS_BASE + 0x000c + ((r) & 0x1f) * 0x0020)
+-#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR0(r) \
+- (PCIE_RC_RP_ATS_BASE + 0x0018 + ((r) & 0x1f) * 0x0020)
+-#define ROCKCHIP_PCIE_AT_OB_REGION_CPU_ADDR1(r) \
+- (PCIE_RC_RP_ATS_BASE + 0x001c + ((r) & 0x1f) * 0x0020)
++ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0008 + ((r) & 0x1f) * 0x0020)
++#define ROCKCHIP_PCIE_AT_OB_REGION_DESC1(r) \
++ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x000c + ((r) & 0x1f) * 0x0020)
++#define ROCKCHIP_PCIE_AT_OB_REGION_DESC2(r) \
++ (PCIE_RC_EP_ATR_OB_REGIONS_1_32 + 0x0010 + ((r) & 0x1f) * 0x0020)
+
+ #define ROCKCHIP_PCIE_CORE_EP_FUNC_BAR_CFG0(fn) \
+ (PCIE_CORE_CTRL_MGMT_BASE + 0x0240 + (fn) * 0x0008)
+diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
+index db32335039d61..998e26de2ad76 100644
+--- a/drivers/pci/pcie/aspm.c
++++ b/drivers/pci/pcie/aspm.c
+@@ -193,12 +193,39 @@ static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
+ link->clkpm_disable = blacklist ? 1 : 0;
+ }
+
+-static bool pcie_retrain_link(struct pcie_link_state *link)
++static int pcie_wait_for_retrain(struct pci_dev *pdev)
+ {
+- struct pci_dev *parent = link->pdev;
+ unsigned long end_jiffies;
+ u16 reg16;
+
++ /* Wait for Link Training to be cleared by hardware */
++ end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT;
++ do {
++		pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &reg16);
++ if (!(reg16 & PCI_EXP_LNKSTA_LT))
++ return 0;
++ msleep(1);
++ } while (time_before(jiffies, end_jiffies));
++
++ return -ETIMEDOUT;
++}
++
++static int pcie_retrain_link(struct pcie_link_state *link)
++{
++ struct pci_dev *parent = link->pdev;
++ int rc;
++ u16 reg16;
++
++ /*
++ * Ensure the updated LNKCTL parameters are used during link
++ * training by checking that there is no ongoing link training to
++ * avoid LTSSM race as recommended in Implementation Note at the
++ * end of PCIe r6.0.1 sec 7.5.3.7.
++ */
++ rc = pcie_wait_for_retrain(parent);
++ if (rc)
++ return rc;
++
+	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
+ reg16 |= PCI_EXP_LNKCTL_RL;
+ pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
+@@ -212,15 +239,7 @@ static bool pcie_retrain_link(struct pcie_link_state *link)
+ pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
+ }
+
+- /* Wait for link training end. Break out after waiting for timeout */
+- end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT;
+- do {
+-		pcie_capability_read_word(parent, PCI_EXP_LNKSTA, &reg16);
+- if (!(reg16 & PCI_EXP_LNKSTA_LT))
+- break;
+- msleep(1);
+- } while (time_before(jiffies, end_jiffies));
+- return !(reg16 & PCI_EXP_LNKSTA_LT);
++ return pcie_wait_for_retrain(parent);
+ }
+
+ /*
+@@ -289,15 +308,15 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
+ reg16 &= ~PCI_EXP_LNKCTL_CCC;
+ pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
+
+- if (pcie_retrain_link(link))
+- return;
++ if (pcie_retrain_link(link)) {
+
+- /* Training failed. Restore common clock configurations */
+- pci_err(parent, "ASPM: Could not configure common clock\n");
+- list_for_each_entry(child, &linkbus->devices, bus_list)
+- pcie_capability_write_word(child, PCI_EXP_LNKCTL,
++ /* Training failed. Restore common clock configurations */
++ pci_err(parent, "ASPM: Could not configure common clock\n");
++ list_for_each_entry(child, &linkbus->devices, bus_list)
++ pcie_capability_write_word(child, PCI_EXP_LNKCTL,
+ child_reg[PCI_FUNC(child->devfn)]);
+- pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg);
++ pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg);
++ }
+ }
+
+ /* Convert L0s latency encoding to ns */
+diff --git a/drivers/phy/hisilicon/phy-hisi-inno-usb2.c b/drivers/phy/hisilicon/phy-hisi-inno-usb2.c
+index b133ae06757ab..a922fb11a1092 100644
+--- a/drivers/phy/hisilicon/phy-hisi-inno-usb2.c
++++ b/drivers/phy/hisilicon/phy-hisi-inno-usb2.c
+@@ -158,7 +158,7 @@ static int hisi_inno_phy_probe(struct platform_device *pdev)
+ phy_set_drvdata(phy, &priv->ports[i]);
+ i++;
+
+- if (i > INNO_PHY_PORT_NUM) {
++ if (i >= INNO_PHY_PORT_NUM) {
+ dev_warn(dev, "Support %d ports in maximum\n", i);
+ of_node_put(child);
+ break;
+diff --git a/drivers/phy/mediatek/phy-mtk-dp.c b/drivers/phy/mediatek/phy-mtk-dp.c
+index 232fd3f1ff1b1..d7024a1443358 100644
+--- a/drivers/phy/mediatek/phy-mtk-dp.c
++++ b/drivers/phy/mediatek/phy-mtk-dp.c
+@@ -169,7 +169,7 @@ static int mtk_dp_phy_probe(struct platform_device *pdev)
+
+ regs = *(struct regmap **)dev->platform_data;
+ if (!regs)
+- return dev_err_probe(dev, EINVAL,
++ return dev_err_probe(dev, -EINVAL,
+ "No data passed, requires struct regmap**\n");
+
+ dp_phy = devm_kzalloc(dev, sizeof(*dp_phy), GFP_KERNEL);
+diff --git a/drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c b/drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c
+index 8aa7251de4a96..bbfe11d6a69d7 100644
+--- a/drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c
++++ b/drivers/phy/mediatek/phy-mtk-hdmi-mt8195.c
+@@ -253,7 +253,7 @@ static int mtk_hdmi_pll_calc(struct mtk_hdmi_phy *hdmi_phy, struct clk_hw *hw,
+ for (i = 0; i < ARRAY_SIZE(txpredivs); i++) {
+ ns_hdmipll_ck = 5 * tmds_clk * txposdiv * txpredivs[i];
+ if (ns_hdmipll_ck >= 5 * GIGA &&
+- ns_hdmipll_ck <= 1 * GIGA)
++ ns_hdmipll_ck <= 12 * GIGA)
+ break;
+ }
+ if (i == (ARRAY_SIZE(txpredivs) - 1) &&
+diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
+index 6c237f3cc66db..6170f8fd118e2 100644
+--- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
++++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
+@@ -110,11 +110,13 @@ struct phy_override_seq {
+ /**
+ * struct qcom_snps_hsphy - snps hs phy attributes
+ *
++ * @dev: device structure
++ *
+ * @phy: generic phy
+ * @base: iomapped memory space for snps hs phy
+ *
+- * @cfg_ahb_clk: AHB2PHY interface clock
+- * @ref_clk: phy reference clock
++ * @num_clks: number of clocks
++ * @clks: array of clocks
+ * @phy_reset: phy reset control
+ * @vregs: regulator supplies bulk data
+ * @phy_initialized: if PHY has been initialized correctly
+@@ -122,11 +124,13 @@ struct phy_override_seq {
+ * @update_seq_cfg: tuning parameters for phy init
+ */
+ struct qcom_snps_hsphy {
++ struct device *dev;
++
+ struct phy *phy;
+ void __iomem *base;
+
+- struct clk *cfg_ahb_clk;
+- struct clk *ref_clk;
++ int num_clks;
++ struct clk_bulk_data *clks;
+ struct reset_control *phy_reset;
+ struct regulator_bulk_data vregs[SNPS_HS_NUM_VREGS];
+
+@@ -135,6 +139,34 @@ struct qcom_snps_hsphy {
+ struct phy_override_seq update_seq_cfg[NUM_HSPHY_TUNING_PARAMS];
+ };
+
++static int qcom_snps_hsphy_clk_init(struct qcom_snps_hsphy *hsphy)
++{
++ struct device *dev = hsphy->dev;
++
++ hsphy->num_clks = 2;
++ hsphy->clks = devm_kcalloc(dev, hsphy->num_clks, sizeof(*hsphy->clks), GFP_KERNEL);
++ if (!hsphy->clks)
++ return -ENOMEM;
++
++ /*
++ * TODO: Currently no device tree instantiation of the PHY is using the clock.
++ * This needs to be fixed in order for this code to be able to use devm_clk_bulk_get().
++ */
++ hsphy->clks[0].id = "cfg_ahb";
++ hsphy->clks[0].clk = devm_clk_get_optional(dev, "cfg_ahb");
++ if (IS_ERR(hsphy->clks[0].clk))
++ return dev_err_probe(dev, PTR_ERR(hsphy->clks[0].clk),
++ "failed to get cfg_ahb clk\n");
++
++ hsphy->clks[1].id = "ref";
++ hsphy->clks[1].clk = devm_clk_get(dev, "ref");
++ if (IS_ERR(hsphy->clks[1].clk))
++ return dev_err_probe(dev, PTR_ERR(hsphy->clks[1].clk),
++ "failed to get ref clk\n");
++
++ return 0;
++}
++
+ static inline void qcom_snps_hsphy_write_mask(void __iomem *base, u32 offset,
+ u32 mask, u32 val)
+ {
+@@ -165,22 +197,13 @@ static int qcom_snps_hsphy_suspend(struct qcom_snps_hsphy *hsphy)
+ 0, USB2_AUTO_RESUME);
+ }
+
+- clk_disable_unprepare(hsphy->cfg_ahb_clk);
+ return 0;
+ }
+
+ static int qcom_snps_hsphy_resume(struct qcom_snps_hsphy *hsphy)
+ {
+- int ret;
+-
+ dev_dbg(&hsphy->phy->dev, "Resume QCOM SNPS PHY, mode\n");
+
+- ret = clk_prepare_enable(hsphy->cfg_ahb_clk);
+- if (ret) {
+- dev_err(&hsphy->phy->dev, "failed to enable cfg ahb clock\n");
+- return ret;
+- }
+-
+ return 0;
+ }
+
+@@ -374,16 +397,16 @@ static int qcom_snps_hsphy_init(struct phy *phy)
+ if (ret)
+ return ret;
+
+- ret = clk_prepare_enable(hsphy->cfg_ahb_clk);
++ ret = clk_bulk_prepare_enable(hsphy->num_clks, hsphy->clks);
+ if (ret) {
+- dev_err(&phy->dev, "failed to enable cfg ahb clock, %d\n", ret);
++ dev_err(&phy->dev, "failed to enable clocks, %d\n", ret);
+ goto poweroff_phy;
+ }
+
+ ret = reset_control_assert(hsphy->phy_reset);
+ if (ret) {
+ dev_err(&phy->dev, "failed to assert phy_reset, %d\n", ret);
+- goto disable_ahb_clk;
++ goto disable_clks;
+ }
+
+ usleep_range(100, 150);
+@@ -391,7 +414,7 @@ static int qcom_snps_hsphy_init(struct phy *phy)
+ ret = reset_control_deassert(hsphy->phy_reset);
+ if (ret) {
+ dev_err(&phy->dev, "failed to de-assert phy_reset, %d\n", ret);
+- goto disable_ahb_clk;
++ goto disable_clks;
+ }
+
+ qcom_snps_hsphy_write_mask(hsphy->base, USB2_PHY_USB_PHY_CFG0,
+@@ -448,8 +471,8 @@ static int qcom_snps_hsphy_init(struct phy *phy)
+
+ return 0;
+
+-disable_ahb_clk:
+- clk_disable_unprepare(hsphy->cfg_ahb_clk);
++disable_clks:
++ clk_bulk_disable_unprepare(hsphy->num_clks, hsphy->clks);
+ poweroff_phy:
+ regulator_bulk_disable(ARRAY_SIZE(hsphy->vregs), hsphy->vregs);
+
+@@ -461,7 +484,7 @@ static int qcom_snps_hsphy_exit(struct phy *phy)
+ struct qcom_snps_hsphy *hsphy = phy_get_drvdata(phy);
+
+ reset_control_assert(hsphy->phy_reset);
+- clk_disable_unprepare(hsphy->cfg_ahb_clk);
++ clk_bulk_disable_unprepare(hsphy->num_clks, hsphy->clks);
+ regulator_bulk_disable(ARRAY_SIZE(hsphy->vregs), hsphy->vregs);
+ hsphy->phy_initialized = false;
+
+@@ -554,14 +577,15 @@ static int qcom_snps_hsphy_probe(struct platform_device *pdev)
+ if (!hsphy)
+ return -ENOMEM;
+
++ hsphy->dev = dev;
++
+ hsphy->base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(hsphy->base))
+ return PTR_ERR(hsphy->base);
+
+- hsphy->ref_clk = devm_clk_get(dev, "ref");
+- if (IS_ERR(hsphy->ref_clk))
+- return dev_err_probe(dev, PTR_ERR(hsphy->ref_clk),
+- "failed to get ref clk\n");
++ ret = qcom_snps_hsphy_clk_init(hsphy);
++ if (ret)
++ return dev_err_probe(dev, ret, "failed to initialize clocks\n");
+
+ hsphy->phy_reset = devm_reset_control_get_exclusive(&pdev->dev, NULL);
+ if (IS_ERR(hsphy->phy_reset)) {
+diff --git a/drivers/platform/x86/amd/pmf/acpi.c b/drivers/platform/x86/amd/pmf/acpi.c
+index 081e84e116e79..3fc5e4547d9f2 100644
+--- a/drivers/platform/x86/amd/pmf/acpi.c
++++ b/drivers/platform/x86/amd/pmf/acpi.c
+@@ -106,6 +106,27 @@ int apmf_get_static_slider_granular(struct amd_pmf_dev *pdev,
+ data, sizeof(*data));
+ }
+
++int apmf_os_power_slider_update(struct amd_pmf_dev *pdev, u8 event)
++{
++ struct os_power_slider args;
++ struct acpi_buffer params;
++ union acpi_object *info;
++ int err = 0;
++
++ args.size = sizeof(args);
++ args.slider_event = event;
++
++ params.length = sizeof(args);
++ params.pointer = (void *)&args;
++
++	info = apmf_if_call(pdev, APMF_FUNC_OS_POWER_SLIDER_UPDATE, &params);
++ if (!info)
++ err = -EIO;
++
++ kfree(info);
++ return err;
++}
++
+ static void apmf_sbios_heartbeat_notify(struct work_struct *work)
+ {
+ struct amd_pmf_dev *dev = container_of(work, struct amd_pmf_dev, heart_beat.work);
+@@ -289,7 +310,7 @@ int apmf_acpi_init(struct amd_pmf_dev *pmf_dev)
+
+ ret = apmf_get_system_params(pmf_dev);
+ if (ret) {
+- dev_err(pmf_dev->dev, "APMF apmf_get_system_params failed :%d\n", ret);
++ dev_dbg(pmf_dev->dev, "APMF apmf_get_system_params failed :%d\n", ret);
+ goto out;
+ }
+
+diff --git a/drivers/platform/x86/amd/pmf/core.c b/drivers/platform/x86/amd/pmf/core.c
+index 7780705917b76..a022325161273 100644
+--- a/drivers/platform/x86/amd/pmf/core.c
++++ b/drivers/platform/x86/amd/pmf/core.c
+@@ -71,7 +71,11 @@ static int amd_pmf_pwr_src_notify_call(struct notifier_block *nb, unsigned long
+ return NOTIFY_DONE;
+ }
+
+- amd_pmf_set_sps_power_limits(pmf);
++ if (is_apmf_func_supported(pmf, APMF_FUNC_STATIC_SLIDER_GRANULAR))
++ amd_pmf_set_sps_power_limits(pmf);
++
++ if (is_apmf_func_supported(pmf, APMF_FUNC_OS_POWER_SLIDER_UPDATE))
++ amd_pmf_power_slider_update_event(pmf);
+
+ return NOTIFY_OK;
+ }
+@@ -295,7 +299,8 @@ static void amd_pmf_init_features(struct amd_pmf_dev *dev)
+ int ret;
+
+ /* Enable Static Slider */
+- if (is_apmf_func_supported(dev, APMF_FUNC_STATIC_SLIDER_GRANULAR)) {
++ if (is_apmf_func_supported(dev, APMF_FUNC_STATIC_SLIDER_GRANULAR) ||
++ is_apmf_func_supported(dev, APMF_FUNC_OS_POWER_SLIDER_UPDATE)) {
+ amd_pmf_init_sps(dev);
+ dev->pwr_src_notifier.notifier_call = amd_pmf_pwr_src_notify_call;
+ power_supply_reg_notifier(&dev->pwr_src_notifier);
+diff --git a/drivers/platform/x86/amd/pmf/pmf.h b/drivers/platform/x86/amd/pmf/pmf.h
+index 06c30cdc05733..deba88e6e4c8d 100644
+--- a/drivers/platform/x86/amd/pmf/pmf.h
++++ b/drivers/platform/x86/amd/pmf/pmf.h
+@@ -21,6 +21,7 @@
+ #define APMF_FUNC_SBIOS_HEARTBEAT 4
+ #define APMF_FUNC_AUTO_MODE 5
+ #define APMF_FUNC_SET_FAN_IDX 7
++#define APMF_FUNC_OS_POWER_SLIDER_UPDATE 8
+ #define APMF_FUNC_STATIC_SLIDER_GRANULAR 9
+ #define APMF_FUNC_DYN_SLIDER_AC 11
+ #define APMF_FUNC_DYN_SLIDER_DC 12
+@@ -44,6 +45,14 @@
+ #define GET_STT_LIMIT_APU 0x20
+ #define GET_STT_LIMIT_HS2 0x21
+
++/* OS slider update notification */
++#define DC_BEST_PERF 0
++#define DC_BETTER_PERF 1
++#define DC_BATTERY_SAVER 3
++#define AC_BEST_PERF 4
++#define AC_BETTER_PERF 5
++#define AC_BETTER_BATTERY 6
++
+ /* Fan Index for Auto Mode */
+ #define FAN_INDEX_AUTO 0xFFFFFFFF
+
+@@ -193,6 +202,11 @@ struct amd_pmf_static_slider_granular {
+ struct apmf_sps_prop_granular prop[POWER_SOURCE_MAX][POWER_MODE_MAX];
+ };
+
++struct os_power_slider {
++ u16 size;
++ u8 slider_event;
++} __packed;
++
+ struct fan_table_control {
+ bool manual;
+ unsigned long fan_id;
+@@ -383,6 +397,7 @@ int amd_pmf_send_cmd(struct amd_pmf_dev *dev, u8 message, bool get, u32 arg, u32
+ int amd_pmf_init_metrics_table(struct amd_pmf_dev *dev);
+ int amd_pmf_get_power_source(void);
+ int apmf_install_handler(struct amd_pmf_dev *pmf_dev);
++int apmf_os_power_slider_update(struct amd_pmf_dev *dev, u8 flag);
+
+ /* SPS Layer */
+ int amd_pmf_get_pprof_modes(struct amd_pmf_dev *pmf);
+@@ -393,6 +408,7 @@ void amd_pmf_deinit_sps(struct amd_pmf_dev *dev);
+ int apmf_get_static_slider_granular(struct amd_pmf_dev *pdev,
+ struct apmf_static_slider_granular_output *output);
+ bool is_pprof_balanced(struct amd_pmf_dev *pmf);
++int amd_pmf_power_slider_update_event(struct amd_pmf_dev *dev);
+
+
+ int apmf_update_fan_idx(struct amd_pmf_dev *pdev, bool manual, u32 idx);
+diff --git a/drivers/platform/x86/amd/pmf/sps.c b/drivers/platform/x86/amd/pmf/sps.c
+index bed762d47a14a..fd448844de206 100644
+--- a/drivers/platform/x86/amd/pmf/sps.c
++++ b/drivers/platform/x86/amd/pmf/sps.c
+@@ -119,14 +119,77 @@ int amd_pmf_get_pprof_modes(struct amd_pmf_dev *pmf)
+ return mode;
+ }
+
++int amd_pmf_power_slider_update_event(struct amd_pmf_dev *dev)
++{
++ u8 mode, flag = 0;
++ int src;
++
++ mode = amd_pmf_get_pprof_modes(dev);
++ if (mode < 0)
++ return mode;
++
++ src = amd_pmf_get_power_source();
++
++ if (src == POWER_SOURCE_AC) {
++ switch (mode) {
++ case POWER_MODE_PERFORMANCE:
++ flag |= BIT(AC_BEST_PERF);
++ break;
++ case POWER_MODE_BALANCED_POWER:
++ flag |= BIT(AC_BETTER_PERF);
++ break;
++ case POWER_MODE_POWER_SAVER:
++ flag |= BIT(AC_BETTER_BATTERY);
++ break;
++ default:
++ dev_err(dev->dev, "unsupported platform profile\n");
++ return -EOPNOTSUPP;
++ }
++
++ } else if (src == POWER_SOURCE_DC) {
++ switch (mode) {
++ case POWER_MODE_PERFORMANCE:
++ flag |= BIT(DC_BEST_PERF);
++ break;
++ case POWER_MODE_BALANCED_POWER:
++ flag |= BIT(DC_BETTER_PERF);
++ break;
++ case POWER_MODE_POWER_SAVER:
++ flag |= BIT(DC_BATTERY_SAVER);
++ break;
++ default:
++ dev_err(dev->dev, "unsupported platform profile\n");
++ return -EOPNOTSUPP;
++ }
++ }
++
++ apmf_os_power_slider_update(dev, flag);
++
++ return 0;
++}
++
+ static int amd_pmf_profile_set(struct platform_profile_handler *pprof,
+ enum platform_profile_option profile)
+ {
+ struct amd_pmf_dev *pmf = container_of(pprof, struct amd_pmf_dev, pprof);
++ int ret = 0;
+
+ pmf->current_profile = profile;
+
+- return amd_pmf_set_sps_power_limits(pmf);
++ /* Notify EC about the slider position change */
++ if (is_apmf_func_supported(pmf, APMF_FUNC_OS_POWER_SLIDER_UPDATE)) {
++ ret = amd_pmf_power_slider_update_event(pmf);
++ if (ret)
++ return ret;
++ }
++
++ if (is_apmf_func_supported(pmf, APMF_FUNC_STATIC_SLIDER_GRANULAR)) {
++ ret = amd_pmf_set_sps_power_limits(pmf);
++ if (ret)
++ return ret;
++ }
++
++ return 0;
+ }
+
+ int amd_pmf_init_sps(struct amd_pmf_dev *dev)
+@@ -134,10 +197,13 @@ int amd_pmf_init_sps(struct amd_pmf_dev *dev)
+ int err;
+
+ dev->current_profile = PLATFORM_PROFILE_BALANCED;
+- amd_pmf_load_defaults_sps(dev);
+
+- /* update SPS balanced power mode thermals */
+- amd_pmf_set_sps_power_limits(dev);
++ if (is_apmf_func_supported(dev, APMF_FUNC_STATIC_SLIDER_GRANULAR)) {
++ amd_pmf_load_defaults_sps(dev);
++
++ /* update SPS balanced power mode thermals */
++ amd_pmf_set_sps_power_limits(dev);
++ }
+
+ dev->pprof.profile_get = amd_pmf_profile_get;
+ dev->pprof.profile_set = amd_pmf_profile_set;
+diff --git a/drivers/platform/x86/msi-laptop.c b/drivers/platform/x86/msi-laptop.c
+index 6b18ec543ac3a..f4c6c36e05a52 100644
+--- a/drivers/platform/x86/msi-laptop.c
++++ b/drivers/platform/x86/msi-laptop.c
+@@ -208,7 +208,7 @@ static ssize_t set_device_state(const char *buf, size_t count, u8 mask)
+ return -EINVAL;
+
+ if (quirks->ec_read_only)
+- return -EOPNOTSUPP;
++ return 0;
+
+ /* read current device state */
+ result = ec_read(MSI_STANDARD_EC_COMMAND_ADDRESS, &rdata);
+@@ -838,15 +838,15 @@ static bool msi_laptop_i8042_filter(unsigned char data, unsigned char str,
+ static void msi_init_rfkill(struct work_struct *ignored)
+ {
+ if (rfk_wlan) {
+- rfkill_set_sw_state(rfk_wlan, !wlan_s);
++ msi_rfkill_set_state(rfk_wlan, !wlan_s);
+ rfkill_wlan_set(NULL, !wlan_s);
+ }
+ if (rfk_bluetooth) {
+- rfkill_set_sw_state(rfk_bluetooth, !bluetooth_s);
++ msi_rfkill_set_state(rfk_bluetooth, !bluetooth_s);
+ rfkill_bluetooth_set(NULL, !bluetooth_s);
+ }
+ if (rfk_threeg) {
+- rfkill_set_sw_state(rfk_threeg, !threeg_s);
++ msi_rfkill_set_state(rfk_threeg, !threeg_s);
+ rfkill_threeg_set(NULL, !threeg_s);
+ }
+ }
+diff --git a/drivers/s390/block/dasd_3990_erp.c b/drivers/s390/block/dasd_3990_erp.c
+index 9fd36c4687064..f0f210627cadf 100644
+--- a/drivers/s390/block/dasd_3990_erp.c
++++ b/drivers/s390/block/dasd_3990_erp.c
+@@ -1050,7 +1050,7 @@ dasd_3990_erp_com_rej(struct dasd_ccw_req * erp, char *sense)
+ dev_err(&device->cdev->dev, "An I/O request was rejected"
+ " because writing is inhibited\n");
+ erp = dasd_3990_erp_cleanup(erp, DASD_CQR_FAILED);
+- } else if (sense[7] & SNS7_INVALID_ON_SEC) {
++ } else if (sense[7] == SNS7_INVALID_ON_SEC) {
+ dev_err(&device->cdev->dev, "An I/O request was rejected on a copy pair secondary device\n");
+ /* suppress dump of sense data for this error */
+ set_bit(DASD_CQR_SUPPRESS_CR, &erp->refers->flags);
+diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
+index 8fca725b3daec..87890b6efcdcf 100644
+--- a/drivers/s390/block/dasd_ioctl.c
++++ b/drivers/s390/block/dasd_ioctl.c
+@@ -131,6 +131,7 @@ static int dasd_ioctl_resume(struct dasd_block *block)
+ spin_unlock_irqrestore(get_ccwdev_lock(base->cdev), flags);
+
+ dasd_schedule_block_bh(block);
++ dasd_schedule_device_bh(base);
+ return 0;
+ }
+
+diff --git a/drivers/soundwire/amd_manager.c b/drivers/soundwire/amd_manager.c
+index 9fb7f91ca1827..21c638e38c51f 100644
+--- a/drivers/soundwire/amd_manager.c
++++ b/drivers/soundwire/amd_manager.c
+@@ -910,9 +910,9 @@ static int amd_sdw_manager_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ amd_manager->acp_mmio = devm_ioremap(dev, res->start, resource_size(res));
+- if (IS_ERR(amd_manager->mmio)) {
++ if (!amd_manager->acp_mmio) {
+ dev_err(dev, "mmio not found\n");
+- return PTR_ERR(amd_manager->mmio);
++ return -ENOMEM;
+ }
+ amd_manager->instance = pdata->instance;
+ amd_manager->mmio = amd_manager->acp_mmio +
+diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
+index 1ea6a64f8c4a5..66e5dba919faa 100644
+--- a/drivers/soundwire/bus.c
++++ b/drivers/soundwire/bus.c
+@@ -908,8 +908,8 @@ static void sdw_modify_slave_status(struct sdw_slave *slave,
+ "initializing enumeration and init completion for Slave %d\n",
+ slave->dev_num);
+
+- init_completion(&slave->enumeration_complete);
+- init_completion(&slave->initialization_complete);
++ reinit_completion(&slave->enumeration_complete);
++ reinit_completion(&slave->initialization_complete);
+
+ } else if ((status == SDW_SLAVE_ATTACHED) &&
+ (slave->status == SDW_SLAVE_UNATTACHED)) {
+@@ -917,7 +917,7 @@ static void sdw_modify_slave_status(struct sdw_slave *slave,
+ "signaling enumeration completion for Slave %d\n",
+ slave->dev_num);
+
+- complete(&slave->enumeration_complete);
++ complete_all(&slave->enumeration_complete);
+ }
+ slave->status = status;
+ mutex_unlock(&bus->bus_lock);
+@@ -1941,7 +1941,7 @@ int sdw_handle_slave_status(struct sdw_bus *bus,
+ "signaling initialization completion for Slave %d\n",
+ slave->dev_num);
+
+- complete(&slave->initialization_complete);
++ complete_all(&slave->initialization_complete);
+
+ /*
+ * If the manager became pm_runtime active, the peripherals will be
+diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
+index e3ef5ebae6b7c..027979c66486c 100644
+--- a/drivers/soundwire/qcom.c
++++ b/drivers/soundwire/qcom.c
+@@ -437,7 +437,7 @@ static int qcom_swrm_get_alert_slave_dev_num(struct qcom_swrm_ctrl *ctrl)
+ status = (val >> (dev_num * SWRM_MCP_SLV_STATUS_SZ));
+
+ if ((status & SWRM_MCP_SLV_STATUS_MASK) == SDW_SLAVE_ALERT) {
+- ctrl->status[dev_num] = status;
++ ctrl->status[dev_num] = status & SWRM_MCP_SLV_STATUS_MASK;
+ return dev_num;
+ }
+ }
+diff --git a/drivers/staging/ks7010/ks_wlan_net.c b/drivers/staging/ks7010/ks_wlan_net.c
+index e03c87f0bfe7a..0fb97a79ad0b3 100644
+--- a/drivers/staging/ks7010/ks_wlan_net.c
++++ b/drivers/staging/ks7010/ks_wlan_net.c
+@@ -1583,8 +1583,10 @@ static int ks_wlan_set_encode_ext(struct net_device *dev,
+ commit |= SME_WEP_FLAG;
+ }
+ if (enc->key_len) {
+- memcpy(&key->key_val[0], &enc->key[0], enc->key_len);
+- key->key_len = enc->key_len;
++ int key_len = clamp_val(enc->key_len, 0, IW_ENCODING_TOKEN_MAX);
++
++ memcpy(&key->key_val[0], &enc->key[0], key_len);
++ key->key_len = key_len;
+ commit |= (SME_WEP_VAL1 << index);
+ }
+ break;
+diff --git a/drivers/staging/media/atomisp/Kconfig b/drivers/staging/media/atomisp/Kconfig
+index c9bff98e5309a..e9b168ba97bf1 100644
+--- a/drivers/staging/media/atomisp/Kconfig
++++ b/drivers/staging/media/atomisp/Kconfig
+@@ -13,6 +13,7 @@ config VIDEO_ATOMISP
+ tristate "Intel Atom Image Signal Processor Driver"
+ depends on VIDEO_DEV && INTEL_ATOMISP
+ depends on PMIC_OPREGION
++ select V4L2_FWNODE
+ select IOSF_MBI
+ select VIDEOBUF2_VMALLOC
+ select VIDEO_V4L2_SUBDEV_API
+diff --git a/drivers/staging/rtl8712/rtl871x_xmit.c b/drivers/staging/rtl8712/rtl871x_xmit.c
+index 090345bad2230..6353dbe554d3a 100644
+--- a/drivers/staging/rtl8712/rtl871x_xmit.c
++++ b/drivers/staging/rtl8712/rtl871x_xmit.c
+@@ -21,6 +21,7 @@
+ #include "osdep_intf.h"
+ #include "usb_ops.h"
+
++#include <linux/usb.h>
+ #include <linux/ieee80211.h>
+
+ static const u8 P802_1H_OUI[P80211_OUI_LEN] = {0x00, 0x00, 0xf8};
+@@ -55,6 +56,7 @@ int _r8712_init_xmit_priv(struct xmit_priv *pxmitpriv,
+ sint i;
+ struct xmit_buf *pxmitbuf;
+ struct xmit_frame *pxframe;
++ int j;
+
+ memset((unsigned char *)pxmitpriv, 0, sizeof(struct xmit_priv));
+ spin_lock_init(&pxmitpriv->lock);
+@@ -117,11 +119,8 @@ int _r8712_init_xmit_priv(struct xmit_priv *pxmitpriv,
+ _init_queue(&pxmitpriv->pending_xmitbuf_queue);
+ pxmitpriv->pallocated_xmitbuf =
+ kmalloc(NR_XMITBUFF * sizeof(struct xmit_buf) + 4, GFP_ATOMIC);
+- if (!pxmitpriv->pallocated_xmitbuf) {
+- kfree(pxmitpriv->pallocated_frame_buf);
+- pxmitpriv->pallocated_frame_buf = NULL;
+- return -ENOMEM;
+- }
++ if (!pxmitpriv->pallocated_xmitbuf)
++ goto clean_up_frame_buf;
+ pxmitpriv->pxmitbuf = pxmitpriv->pallocated_xmitbuf + 4 -
+ ((addr_t)(pxmitpriv->pallocated_xmitbuf) & 3);
+ pxmitbuf = (struct xmit_buf *)pxmitpriv->pxmitbuf;
+@@ -129,13 +128,17 @@ int _r8712_init_xmit_priv(struct xmit_priv *pxmitpriv,
+ INIT_LIST_HEAD(&pxmitbuf->list);
+ pxmitbuf->pallocated_buf =
+ kmalloc(MAX_XMITBUF_SZ + XMITBUF_ALIGN_SZ, GFP_ATOMIC);
+- if (!pxmitbuf->pallocated_buf)
+- return -ENOMEM;
++ if (!pxmitbuf->pallocated_buf) {
++ j = 0;
++ goto clean_up_alloc_buf;
++ }
+ pxmitbuf->pbuf = pxmitbuf->pallocated_buf + XMITBUF_ALIGN_SZ -
+ ((addr_t) (pxmitbuf->pallocated_buf) &
+ (XMITBUF_ALIGN_SZ - 1));
+- if (r8712_xmit_resource_alloc(padapter, pxmitbuf))
+- return -ENOMEM;
++ if (r8712_xmit_resource_alloc(padapter, pxmitbuf)) {
++ j = 1;
++ goto clean_up_alloc_buf;
++ }
+ list_add_tail(&pxmitbuf->list,
+ &(pxmitpriv->free_xmitbuf_queue.queue));
+ pxmitbuf++;
+@@ -146,6 +149,28 @@ int _r8712_init_xmit_priv(struct xmit_priv *pxmitpriv,
+ init_hwxmits(pxmitpriv->hwxmits, pxmitpriv->hwxmit_entry);
+ tasklet_setup(&pxmitpriv->xmit_tasklet, r8712_xmit_bh);
+ return 0;
++
++clean_up_alloc_buf:
++ if (j) {
++ /* failure happened in r8712_xmit_resource_alloc()
++ * delete extra pxmitbuf->pallocated_buf
++ */
++ kfree(pxmitbuf->pallocated_buf);
++ }
++ for (j = 0; j < i; j++) {
++ int k;
++
++ pxmitbuf--; /* reset pointer */
++ kfree(pxmitbuf->pallocated_buf);
++ for (k = 0; k < 8; k++) /* delete xmit urb's */
++ usb_free_urb(pxmitbuf->pxmit_urb[k]);
++ }
++ kfree(pxmitpriv->pallocated_xmitbuf);
++ pxmitpriv->pallocated_xmitbuf = NULL;
++clean_up_frame_buf:
++ kfree(pxmitpriv->pallocated_frame_buf);
++ pxmitpriv->pallocated_frame_buf = NULL;
++ return -ENOMEM;
+ }
+
+ void _free_xmit_priv(struct xmit_priv *pxmitpriv)
+diff --git a/drivers/staging/rtl8712/xmit_linux.c b/drivers/staging/rtl8712/xmit_linux.c
+index 132afbf49dde9..ceb6b590b310f 100644
+--- a/drivers/staging/rtl8712/xmit_linux.c
++++ b/drivers/staging/rtl8712/xmit_linux.c
+@@ -112,6 +112,12 @@ int r8712_xmit_resource_alloc(struct _adapter *padapter,
+ for (i = 0; i < 8; i++) {
+ pxmitbuf->pxmit_urb[i] = usb_alloc_urb(0, GFP_KERNEL);
+ if (!pxmitbuf->pxmit_urb[i]) {
++ int k;
++
++ for (k = i - 1; k >= 0; k--) {
++ /* handle allocation errors part way through loop */
++ usb_free_urb(pxmitbuf->pxmit_urb[k]);
++ }
+ netdev_err(padapter->pnetdev, "pxmitbuf->pxmit_urb[i] == NULL\n");
+ return -ENOMEM;
+ }
+diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
+index 6fb14e5211971..bc07ae1c284cf 100644
+--- a/drivers/thermal/thermal_of.c
++++ b/drivers/thermal/thermal_of.c
+@@ -238,17 +238,13 @@ static int thermal_of_monitor_init(struct device_node *np, int *delay, int *pdel
+ return 0;
+ }
+
+-static struct thermal_zone_params *thermal_of_parameters_init(struct device_node *np)
++static void thermal_of_parameters_init(struct device_node *np,
++ struct thermal_zone_params *tzp)
+ {
+- struct thermal_zone_params *tzp;
+ int coef[2];
+ int ncoef = ARRAY_SIZE(coef);
+ int prop, ret;
+
+- tzp = kzalloc(sizeof(*tzp), GFP_KERNEL);
+- if (!tzp)
+- return ERR_PTR(-ENOMEM);
+-
+ tzp->no_hwmon = true;
+
+ if (!of_property_read_u32(np, "sustainable-power", &prop))
+@@ -267,8 +263,6 @@ static struct thermal_zone_params *thermal_of_parameters_init(struct device_node
+
+ tzp->slope = coef[0];
+ tzp->offset = coef[1];
+-
+- return tzp;
+ }
+
+ static struct device_node *thermal_of_zone_get_by_name(struct thermal_zone_device *tz)
+@@ -442,13 +436,11 @@ static int thermal_of_unbind(struct thermal_zone_device *tz,
+ static void thermal_of_zone_unregister(struct thermal_zone_device *tz)
+ {
+ struct thermal_trip *trips = tz->trips;
+- struct thermal_zone_params *tzp = tz->tzp;
+ struct thermal_zone_device_ops *ops = tz->ops;
+
+ thermal_zone_device_disable(tz);
+ thermal_zone_device_unregister(tz);
+ kfree(trips);
+- kfree(tzp);
+ kfree(ops);
+ }
+
+@@ -477,7 +469,7 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
+ {
+ struct thermal_zone_device *tz;
+ struct thermal_trip *trips;
+- struct thermal_zone_params *tzp;
++ struct thermal_zone_params tzp = {};
+ struct thermal_zone_device_ops *of_ops;
+ struct device_node *np;
+ int delay, pdelay;
+@@ -509,12 +501,7 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
+ goto out_kfree_trips;
+ }
+
+- tzp = thermal_of_parameters_init(np);
+- if (IS_ERR(tzp)) {
+- ret = PTR_ERR(tzp);
+- pr_err("Failed to initialize parameter from %pOFn: %d\n", np, ret);
+- goto out_kfree_trips;
+- }
++ thermal_of_parameters_init(np, &tzp);
+
+ of_ops->bind = thermal_of_bind;
+ of_ops->unbind = thermal_of_unbind;
+@@ -522,12 +509,12 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
+ mask = GENMASK_ULL((ntrips) - 1, 0);
+
+ tz = thermal_zone_device_register_with_trips(np->name, trips, ntrips,
+- mask, data, of_ops, tzp,
++ mask, data, of_ops, &tzp,
+ pdelay, delay);
+ if (IS_ERR(tz)) {
+ ret = PTR_ERR(tz);
+ pr_err("Failed to register thermal zone %pOFn: %d\n", np, ret);
+- goto out_kfree_tzp;
++ goto out_kfree_trips;
+ }
+
+ ret = thermal_zone_device_enable(tz);
+@@ -540,8 +527,6 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
+
+ return tz;
+
+-out_kfree_tzp:
+- kfree(tzp);
+ out_kfree_trips:
+ kfree(trips);
+ out_kfree_of_ops:
+diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
+index b411a26cc092c..1cdefac4dd1b5 100644
+--- a/drivers/tty/n_gsm.c
++++ b/drivers/tty/n_gsm.c
+@@ -3070,8 +3070,10 @@ static void gsm_cleanup_mux(struct gsm_mux *gsm, bool disc)
+ gsm->has_devices = false;
+ }
+ for (i = NUM_DLCI - 1; i >= 0; i--)
+- if (gsm->dlci[i])
++ if (gsm->dlci[i]) {
+ gsm_dlci_release(gsm->dlci[i]);
++ gsm->dlci[i] = NULL;
++ }
+ mutex_unlock(&gsm->mutex);
+ /* Now wipe the queues */
+ tty_ldisc_flush(gsm->tty);
+diff --git a/drivers/tty/serial/8250/8250_dwlib.c b/drivers/tty/serial/8250/8250_dwlib.c
+index 75f32f054ebb1..84843e204a5e8 100644
+--- a/drivers/tty/serial/8250/8250_dwlib.c
++++ b/drivers/tty/serial/8250/8250_dwlib.c
+@@ -244,7 +244,7 @@ void dw8250_setup_port(struct uart_port *p)
+ struct dw8250_port_data *pd = p->private_data;
+ struct dw8250_data *data = to_dw8250_data(pd);
+ struct uart_8250_port *up = up_to_u8250p(p);
+- u32 reg;
++ u32 reg, old_dlf;
+
+ pd->hw_rs485_support = dw8250_detect_rs485_hw(p);
+ if (pd->hw_rs485_support) {
+@@ -270,9 +270,11 @@ void dw8250_setup_port(struct uart_port *p)
+ dev_dbg(p->dev, "Designware UART version %c.%c%c\n",
+ (reg >> 24) & 0xff, (reg >> 16) & 0xff, (reg >> 8) & 0xff);
+
++ /* Preserve value written by firmware or bootloader */
++ old_dlf = dw8250_readl_ext(p, DW_UART_DLF);
+ dw8250_writel_ext(p, DW_UART_DLF, ~0U);
+ reg = dw8250_readl_ext(p, DW_UART_DLF);
+- dw8250_writel_ext(p, DW_UART_DLF, 0);
++ dw8250_writel_ext(p, DW_UART_DLF, old_dlf);
+
+ if (reg) {
+ pd->dlf_size = fls(reg);
+diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
+index 8582479f0211a..22fe5a8ce9399 100644
+--- a/drivers/tty/serial/qcom_geni_serial.c
++++ b/drivers/tty/serial/qcom_geni_serial.c
+@@ -1676,13 +1676,6 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
+ if (ret)
+ return ret;
+
+- /*
+- * Set pm_runtime status as ACTIVE so that wakeup_irq gets
+- * enabled/disabled from dev_pm_arm_wake_irq during system
+- * suspend/resume respectively.
+- */
+- pm_runtime_set_active(&pdev->dev);
+-
+ if (port->wakeup_irq > 0) {
+ device_init_wakeup(&pdev->dev, true);
+ ret = dev_pm_set_dedicated_wake_irq(&pdev->dev,
+diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
+index 7c9457962a3df..8b7a42e05d6d5 100644
+--- a/drivers/tty/serial/sh-sci.c
++++ b/drivers/tty/serial/sh-sci.c
+@@ -590,7 +590,7 @@ static void sci_start_tx(struct uart_port *port)
+ dma_submit_error(s->cookie_tx)) {
+ if (s->cfg->regtype == SCIx_RZ_SCIFA_REGTYPE)
+ /* Switch irq from SCIF to DMA */
+- disable_irq(s->irqs[SCIx_TXI_IRQ]);
++ disable_irq_nosync(s->irqs[SCIx_TXI_IRQ]);
+
+ s->cookie_tx = 0;
+ schedule_work(&s->work_tx);
+diff --git a/drivers/tty/serial/sifive.c b/drivers/tty/serial/sifive.c
+index 1f565a216e748..a19db49327e29 100644
+--- a/drivers/tty/serial/sifive.c
++++ b/drivers/tty/serial/sifive.c
+@@ -811,7 +811,7 @@ static void sifive_serial_console_write(struct console *co, const char *s,
+ local_irq_restore(flags);
+ }
+
+-static int __init sifive_serial_console_setup(struct console *co, char *options)
++static int sifive_serial_console_setup(struct console *co, char *options)
+ {
+ struct sifive_serial_port *ssp;
+ int baud = SIFIVE_DEFAULT_BAUD_RATE;
+diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
+index c84be40fb8dfa..429beccdfeb1a 100644
+--- a/drivers/tty/tty_io.c
++++ b/drivers/tty/tty_io.c
+@@ -2276,7 +2276,7 @@ static int tiocsti(struct tty_struct *tty, char __user *p)
+ char ch, mbz = 0;
+ struct tty_ldisc *ld;
+
+- if (!tty_legacy_tiocsti)
++ if (!tty_legacy_tiocsti && !capable(CAP_SYS_ADMIN))
+ return -EIO;
+
+ if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
+diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
+index 1dcadef933e3a..69a44bd7e5d02 100644
+--- a/drivers/usb/cdns3/cdns3-gadget.c
++++ b/drivers/usb/cdns3/cdns3-gadget.c
+@@ -3012,12 +3012,14 @@ static int cdns3_gadget_udc_stop(struct usb_gadget *gadget)
+ static int cdns3_gadget_check_config(struct usb_gadget *gadget)
+ {
+ struct cdns3_device *priv_dev = gadget_to_cdns3_device(gadget);
++ struct cdns3_endpoint *priv_ep;
+ struct usb_ep *ep;
+ int n_in = 0;
+ int total;
+
+ list_for_each_entry(ep, &gadget->ep_list, ep_list) {
+- if (ep->claimed && (ep->address & USB_DIR_IN))
++ priv_ep = ep_to_cdns3_ep(ep);
++ if ((priv_ep->flags & EP_CLAIMED) && (ep->address & USB_DIR_IN))
+ n_in++;
+ }
+
+diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
+index 934b3d997702e..15e9bd180a1d2 100644
+--- a/drivers/usb/core/quirks.c
++++ b/drivers/usb/core/quirks.c
+@@ -436,6 +436,10 @@ static const struct usb_device_id usb_quirk_list[] = {
+ /* novation SoundControl XL */
+ { USB_DEVICE(0x1235, 0x0061), .driver_info = USB_QUIRK_RESET_RESUME },
+
++ /* Focusrite Scarlett Solo USB */
++ { USB_DEVICE(0x1235, 0x8211), .driver_info =
++ USB_QUIRK_DISCONNECT_SUSPEND },
++
+ /* Huawei 4G LTE module */
+ { USB_DEVICE(0x12d1, 0x15bb), .driver_info =
+ USB_QUIRK_DISCONNECT_SUSPEND },
+diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
+index d68958e151a78..99963e724b716 100644
+--- a/drivers/usb/dwc3/core.c
++++ b/drivers/usb/dwc3/core.c
+@@ -277,9 +277,9 @@ int dwc3_core_soft_reset(struct dwc3 *dwc)
+ /*
+ * We're resetting only the device side because, if we're in host mode,
+ * XHCI driver will reset the host block. If dwc3 was configured for
+- * host-only mode, then we can return early.
++ * host-only mode or current role is host, then we can return early.
+ */
+- if (dwc->current_dr_role == DWC3_GCTL_PRTCAP_HOST)
++ if (dwc->dr_mode == USB_DR_MODE_HOST || dwc->current_dr_role == DWC3_GCTL_PRTCAP_HOST)
+ return 0;
+
+ reg = dwc3_readl(dwc->regs, DWC3_DCTL);
+@@ -1209,22 +1209,6 @@ static int dwc3_core_init(struct dwc3 *dwc)
+ dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
+ }
+
+- if (dwc->dr_mode == USB_DR_MODE_HOST ||
+- dwc->dr_mode == USB_DR_MODE_OTG) {
+- reg = dwc3_readl(dwc->regs, DWC3_GUCTL);
+-
+- /*
+- * Enable Auto retry Feature to make the controller operating in
+- * Host mode on seeing transaction errors(CRC errors or internal
+- * overrun scenerios) on IN transfers to reply to the device
+- * with a non-terminating retry ACK (i.e, an ACK transcation
+- * packet with Retry=1 & Nump != 0)
+- */
+- reg |= DWC3_GUCTL_HSTINAUTORETRY;
+-
+- dwc3_writel(dwc->regs, DWC3_GUCTL, reg);
+- }
+-
+ /*
+ * Must config both number of packets and max burst settings to enable
+ * RX and/or TX threshold.
+diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
+index 1f043c31a0969..59a31cac30823 100644
+--- a/drivers/usb/dwc3/core.h
++++ b/drivers/usb/dwc3/core.h
+@@ -254,9 +254,6 @@
+ #define DWC3_GCTL_GBLHIBERNATIONEN BIT(1)
+ #define DWC3_GCTL_DSBLCLKGTNG BIT(0)
+
+-/* Global User Control Register */
+-#define DWC3_GUCTL_HSTINAUTORETRY BIT(14)
+-
+ /* Global User Control 1 Register */
+ #define DWC3_GUCTL1_DEV_DECOUPLE_L1L2_EVT BIT(31)
+ #define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS BIT(28)
+diff --git a/drivers/usb/dwc3/dwc3-pci.c b/drivers/usb/dwc3/dwc3-pci.c
+index 44a04c9b20735..6604845c397cd 100644
+--- a/drivers/usb/dwc3/dwc3-pci.c
++++ b/drivers/usb/dwc3/dwc3-pci.c
+@@ -233,10 +233,12 @@ static int dwc3_pci_quirks(struct dwc3_pci *dwc,
+
+ /*
+ * A lot of BYT devices lack ACPI resource entries for
+- * the GPIOs, add a fallback mapping to the reference
++ * the GPIOs. If the ACPI entry for the GPIO controller
++ * is present add a fallback mapping to the reference
+ * design GPIOs which all boards seem to use.
+ */
+- gpiod_add_lookup_table(&platform_bytcr_gpios);
++ if (acpi_dev_present("INT33FC", NULL, -1))
++ gpiod_add_lookup_table(&platform_bytcr_gpios);
+
+ /*
+ * These GPIOs will turn on the USB2 PHY. Note that we have to
+diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
+index 1b3489149e5ea..dd9b90481b4c2 100644
+--- a/drivers/usb/gadget/composite.c
++++ b/drivers/usb/gadget/composite.c
+@@ -1125,6 +1125,10 @@ int usb_add_config(struct usb_composite_dev *cdev,
+ goto done;
+
+ status = bind(config);
++
++ if (status == 0)
++ status = usb_gadget_check_config(cdev->gadget);
++
+ if (status < 0) {
+ while (!list_empty(&config->functions)) {
+ struct usb_function *f;
+diff --git a/drivers/usb/gadget/legacy/raw_gadget.c b/drivers/usb/gadget/legacy/raw_gadget.c
+index 2acece16b8900..e549022642e56 100644
+--- a/drivers/usb/gadget/legacy/raw_gadget.c
++++ b/drivers/usb/gadget/legacy/raw_gadget.c
+@@ -310,13 +310,15 @@ static int gadget_bind(struct usb_gadget *gadget,
+ dev->eps_num = i;
+ spin_unlock_irqrestore(&dev->lock, flags);
+
+- /* Matches kref_put() in gadget_unbind(). */
+- kref_get(&dev->count);
+-
+ ret = raw_queue_event(dev, USB_RAW_EVENT_CONNECT, 0, NULL);
+- if (ret < 0)
++ if (ret < 0) {
+ dev_err(&gadget->dev, "failed to queue event\n");
++ set_gadget_data(gadget, NULL);
++ return ret;
++ }
+
++ /* Matches kref_put() in gadget_unbind(). */
++ kref_get(&dev->count);
+ return ret;
+ }
+
+diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
+index 83fd1de14784f..0068d0c448658 100644
+--- a/drivers/usb/gadget/udc/core.c
++++ b/drivers/usb/gadget/udc/core.c
+@@ -878,7 +878,6 @@ int usb_gadget_activate(struct usb_gadget *gadget)
+ */
+ if (gadget->connected)
+ ret = usb_gadget_connect_locked(gadget);
+- mutex_unlock(&gadget->udc->connect_lock);
+
+ unlock:
+ mutex_unlock(&gadget->udc->connect_lock);
+diff --git a/drivers/usb/gadget/udc/tegra-xudc.c b/drivers/usb/gadget/udc/tegra-xudc.c
+index 34e9c1df54c79..a0c11f51873e5 100644
+--- a/drivers/usb/gadget/udc/tegra-xudc.c
++++ b/drivers/usb/gadget/udc/tegra-xudc.c
+@@ -3718,15 +3718,15 @@ static int tegra_xudc_powerdomain_init(struct tegra_xudc *xudc)
+ int err;
+
+ xudc->genpd_dev_device = dev_pm_domain_attach_by_name(dev, "dev");
+- if (IS_ERR_OR_NULL(xudc->genpd_dev_device)) {
+- err = PTR_ERR(xudc->genpd_dev_device) ? : -ENODATA;
++ if (IS_ERR(xudc->genpd_dev_device)) {
++ err = PTR_ERR(xudc->genpd_dev_device);
+ dev_err(dev, "failed to get device power domain: %d\n", err);
+ return err;
+ }
+
+ xudc->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "ss");
+- if (IS_ERR_OR_NULL(xudc->genpd_dev_ss)) {
+- err = PTR_ERR(xudc->genpd_dev_ss) ? : -ENODATA;
++ if (IS_ERR(xudc->genpd_dev_ss)) {
++ err = PTR_ERR(xudc->genpd_dev_ss);
+ dev_err(dev, "failed to get SuperSpeed power domain: %d\n", err);
+ return err;
+ }
+diff --git a/drivers/usb/host/ohci-at91.c b/drivers/usb/host/ohci-at91.c
+index 533537ef3c21d..360680769494b 100644
+--- a/drivers/usb/host/ohci-at91.c
++++ b/drivers/usb/host/ohci-at91.c
+@@ -673,7 +673,13 @@ ohci_hcd_at91_drv_resume(struct device *dev)
+ else
+ at91_start_clock(ohci_at91);
+
+- ohci_resume(hcd, false);
++ /*
++ * According to the comment in ohci_hcd_at91_drv_suspend()
++ * we need to do a reset if the 48Mhz clock was stopped,
++ * that is, if ohci_at91->wakeup is clear. Tell ohci_resume()
++ * to reset in this case by setting its "hibernated" flag.
++ */
++ ohci_resume(hcd, !ohci_at91->wakeup);
+
+ return 0;
+ }
+diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
+index 90cf40d6d0c31..b60521e1a9a63 100644
+--- a/drivers/usb/host/xhci-mtk.c
++++ b/drivers/usb/host/xhci-mtk.c
+@@ -592,6 +592,7 @@ static int xhci_mtk_probe(struct platform_device *pdev)
+ }
+
+ device_init_wakeup(dev, true);
++ dma_set_max_seg_size(dev, UINT_MAX);
+
+ xhci = hcd_to_xhci(hcd);
+ xhci->main_hcd = hcd;
+diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
+index a410162e15df1..db9826c38b20b 100644
+--- a/drivers/usb/host/xhci-pci.c
++++ b/drivers/usb/host/xhci-pci.c
+@@ -486,10 +486,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
+ pdev->device == 0x3432)
+ xhci->quirks |= XHCI_BROKEN_STREAMS;
+
+- if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483) {
++ if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483)
+ xhci->quirks |= XHCI_LPM_SUPPORT;
+- xhci->quirks |= XHCI_EP_CTX_BROKEN_DCS;
+- }
+
+ if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
+ pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) {
+diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
+index 2bc82b3a2f984..ad70f63593093 100644
+--- a/drivers/usb/host/xhci-ring.c
++++ b/drivers/usb/host/xhci-ring.c
+@@ -592,11 +592,8 @@ static int xhci_move_dequeue_past_td(struct xhci_hcd *xhci,
+ struct xhci_ring *ep_ring;
+ struct xhci_command *cmd;
+ struct xhci_segment *new_seg;
+- struct xhci_segment *halted_seg = NULL;
+ union xhci_trb *new_deq;
+ int new_cycle;
+- union xhci_trb *halted_trb;
+- int index = 0;
+ dma_addr_t addr;
+ u64 hw_dequeue;
+ bool cycle_found = false;
+@@ -634,27 +631,7 @@ static int xhci_move_dequeue_past_td(struct xhci_hcd *xhci,
+ hw_dequeue = xhci_get_hw_deq(xhci, dev, ep_index, stream_id);
+ new_seg = ep_ring->deq_seg;
+ new_deq = ep_ring->dequeue;
+-
+- /*
+- * Quirk: xHC write-back of the DCS field in the hardware dequeue
+- * pointer is wrong - use the cycle state of the TRB pointed to by
+- * the dequeue pointer.
+- */
+- if (xhci->quirks & XHCI_EP_CTX_BROKEN_DCS &&
+- !(ep->ep_state & EP_HAS_STREAMS))
+- halted_seg = trb_in_td(xhci, td->start_seg,
+- td->first_trb, td->last_trb,
+- hw_dequeue & ~0xf, false);
+- if (halted_seg) {
+- index = ((dma_addr_t)(hw_dequeue & ~0xf) - halted_seg->dma) /
+- sizeof(*halted_trb);
+- halted_trb = &halted_seg->trbs[index];
+- new_cycle = halted_trb->generic.field[3] & 0x1;
+- xhci_dbg(xhci, "Endpoint DCS = %d TRB index = %d cycle = %d\n",
+- (u8)(hw_dequeue & 0x1), index, new_cycle);
+- } else {
+- new_cycle = hw_dequeue & 0x1;
+- }
++ new_cycle = hw_dequeue & 0x1;
+
+ /*
+ * We want to find the pointer, segment and cycle state of the new trb
+diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
+index 8a9c7deb7686e..d28fa892c2866 100644
+--- a/drivers/usb/host/xhci-tegra.c
++++ b/drivers/usb/host/xhci-tegra.c
+@@ -1145,15 +1145,15 @@ static int tegra_xusb_powerdomain_init(struct device *dev,
+ int err;
+
+ tegra->genpd_dev_host = dev_pm_domain_attach_by_name(dev, "xusb_host");
+- if (IS_ERR_OR_NULL(tegra->genpd_dev_host)) {
+- err = PTR_ERR(tegra->genpd_dev_host) ? : -ENODATA;
++ if (IS_ERR(tegra->genpd_dev_host)) {
++ err = PTR_ERR(tegra->genpd_dev_host);
+ dev_err(dev, "failed to get host pm-domain: %d\n", err);
+ return err;
+ }
+
+ tegra->genpd_dev_ss = dev_pm_domain_attach_by_name(dev, "xusb_ss");
+- if (IS_ERR_OR_NULL(tegra->genpd_dev_ss)) {
+- err = PTR_ERR(tegra->genpd_dev_ss) ? : -ENODATA;
++ if (IS_ERR(tegra->genpd_dev_ss)) {
++ err = PTR_ERR(tegra->genpd_dev_ss);
+ dev_err(dev, "failed to get superspeed pm-domain: %d\n", err);
+ return err;
+ }
+diff --git a/drivers/usb/misc/ehset.c b/drivers/usb/misc/ehset.c
+index 986d6589f0535..36b6e9fa7ffb6 100644
+--- a/drivers/usb/misc/ehset.c
++++ b/drivers/usb/misc/ehset.c
+@@ -77,7 +77,7 @@ static int ehset_probe(struct usb_interface *intf,
+ switch (test_pid) {
+ case TEST_SE0_NAK_PID:
+ ret = ehset_prepare_port_for_testing(hub_udev, portnum);
+- if (!ret)
++ if (ret < 0)
+ break;
+ ret = usb_control_msg_send(hub_udev, 0, USB_REQ_SET_FEATURE,
+ USB_RT_PORT, USB_PORT_FEAT_TEST,
+@@ -86,7 +86,7 @@ static int ehset_probe(struct usb_interface *intf,
+ break;
+ case TEST_J_PID:
+ ret = ehset_prepare_port_for_testing(hub_udev, portnum);
+- if (!ret)
++ if (ret < 0)
+ break;
+ ret = usb_control_msg_send(hub_udev, 0, USB_REQ_SET_FEATURE,
+ USB_RT_PORT, USB_PORT_FEAT_TEST,
+@@ -95,7 +95,7 @@ static int ehset_probe(struct usb_interface *intf,
+ break;
+ case TEST_K_PID:
+ ret = ehset_prepare_port_for_testing(hub_udev, portnum);
+- if (!ret)
++ if (ret < 0)
+ break;
+ ret = usb_control_msg_send(hub_udev, 0, USB_REQ_SET_FEATURE,
+ USB_RT_PORT, USB_PORT_FEAT_TEST,
+@@ -104,7 +104,7 @@ static int ehset_probe(struct usb_interface *intf,
+ break;
+ case TEST_PACKET_PID:
+ ret = ehset_prepare_port_for_testing(hub_udev, portnum);
+- if (!ret)
++ if (ret < 0)
+ break;
+ ret = usb_control_msg_send(hub_udev, 0, USB_REQ_SET_FEATURE,
+ USB_RT_PORT, USB_PORT_FEAT_TEST,
+diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
+index 288a96a742661..8ac98e60fff56 100644
+--- a/drivers/usb/serial/option.c
++++ b/drivers/usb/serial/option.c
+@@ -251,6 +251,7 @@ static void option_instat_callback(struct urb *urb);
+ #define QUECTEL_PRODUCT_EM061K_LTA 0x0123
+ #define QUECTEL_PRODUCT_EM061K_LMS 0x0124
+ #define QUECTEL_PRODUCT_EC25 0x0125
++#define QUECTEL_PRODUCT_EM060K_128 0x0128
+ #define QUECTEL_PRODUCT_EG91 0x0191
+ #define QUECTEL_PRODUCT_EG95 0x0195
+ #define QUECTEL_PRODUCT_BG96 0x0296
+@@ -268,6 +269,7 @@ static void option_instat_callback(struct urb *urb);
+ #define QUECTEL_PRODUCT_RM520N 0x0801
+ #define QUECTEL_PRODUCT_EC200U 0x0901
+ #define QUECTEL_PRODUCT_EC200S_CN 0x6002
++#define QUECTEL_PRODUCT_EC200A 0x6005
+ #define QUECTEL_PRODUCT_EM061K_LWW 0x6008
+ #define QUECTEL_PRODUCT_EM061K_LCN 0x6009
+ #define QUECTEL_PRODUCT_EC200T 0x6026
+@@ -1197,6 +1199,9 @@ static const struct usb_device_id option_ids[] = {
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0x00, 0x40) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0xff, 0x30) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K, 0xff, 0xff, 0x40) },
++ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0xff, 0x30) },
++ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0x00, 0x40) },
++ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM060K_128, 0xff, 0xff, 0x40) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0xff, 0x30) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0x00, 0x40) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM061K_LCN, 0xff, 0xff, 0x40) },
+@@ -1225,6 +1230,7 @@ static const struct usb_device_id option_ids[] = {
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM520N, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, 0x0900, 0xff, 0, 0), /* RM500U-CN */
+ .driver_info = ZLP },
++ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200A, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200U, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200S_CN, 0xff, 0, 0) },
+ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC200T, 0xff, 0, 0) },
+diff --git a/drivers/usb/serial/usb-serial-simple.c b/drivers/usb/serial/usb-serial-simple.c
+index 4c6747889a194..24b8772a345e2 100644
+--- a/drivers/usb/serial/usb-serial-simple.c
++++ b/drivers/usb/serial/usb-serial-simple.c
+@@ -38,16 +38,6 @@ static struct usb_serial_driver vendor##_device = { \
+ { USB_DEVICE(0x0a21, 0x8001) } /* MMT-7305WW */
+ DEVICE(carelink, CARELINK_IDS);
+
+-/* ZIO Motherboard USB driver */
+-#define ZIO_IDS() \
+- { USB_DEVICE(0x1CBE, 0x0103) }
+-DEVICE(zio, ZIO_IDS);
+-
+-/* Funsoft Serial USB driver */
+-#define FUNSOFT_IDS() \
+- { USB_DEVICE(0x1404, 0xcddc) }
+-DEVICE(funsoft, FUNSOFT_IDS);
+-
+ /* Infineon Flashloader driver */
+ #define FLASHLOADER_IDS() \
+ { USB_DEVICE_INTERFACE_CLASS(0x058b, 0x0041, USB_CLASS_CDC_DATA) }, \
+@@ -55,6 +45,11 @@ DEVICE(funsoft, FUNSOFT_IDS);
+ { USB_DEVICE(0x8087, 0x0801) }
+ DEVICE(flashloader, FLASHLOADER_IDS);
+
++/* Funsoft Serial USB driver */
++#define FUNSOFT_IDS() \
++ { USB_DEVICE(0x1404, 0xcddc) }
++DEVICE(funsoft, FUNSOFT_IDS);
++
+ /* Google Serial USB SubClass */
+ #define GOOGLE_IDS() \
+ { USB_VENDOR_AND_INTERFACE_INFO(0x18d1, \
+@@ -63,16 +58,21 @@ DEVICE(flashloader, FLASHLOADER_IDS);
+ 0x01) }
+ DEVICE(google, GOOGLE_IDS);
+
++/* HP4x (48/49) Generic Serial driver */
++#define HP4X_IDS() \
++ { USB_DEVICE(0x03f0, 0x0121) }
++DEVICE(hp4x, HP4X_IDS);
++
++/* KAUFMANN RKS+CAN VCP */
++#define KAUFMANN_IDS() \
++ { USB_DEVICE(0x16d0, 0x0870) }
++DEVICE(kaufmann, KAUFMANN_IDS);
++
+ /* Libtransistor USB console */
+ #define LIBTRANSISTOR_IDS() \
+ { USB_DEVICE(0x1209, 0x8b00) }
+ DEVICE(libtransistor, LIBTRANSISTOR_IDS);
+
+-/* ViVOpay USB Serial Driver */
+-#define VIVOPAY_IDS() \
+- { USB_DEVICE(0x1d5f, 0x1004) } /* ViVOpay 8800 */
+-DEVICE(vivopay, VIVOPAY_IDS);
+-
+ /* Motorola USB Phone driver */
+ #define MOTO_IDS() \
+ { USB_DEVICE(0x05c6, 0x3197) }, /* unknown Motorola phone */ \
+@@ -101,10 +101,10 @@ DEVICE(nokia, NOKIA_IDS);
+ { USB_DEVICE(0x09d7, 0x0100) } /* NovAtel FlexPack GPS */
+ DEVICE_N(novatel_gps, NOVATEL_IDS, 3);
+
+-/* HP4x (48/49) Generic Serial driver */
+-#define HP4X_IDS() \
+- { USB_DEVICE(0x03f0, 0x0121) }
+-DEVICE(hp4x, HP4X_IDS);
++/* Siemens USB/MPI adapter */
++#define SIEMENS_IDS() \
++ { USB_DEVICE(0x908, 0x0004) }
++DEVICE(siemens_mpi, SIEMENS_IDS);
+
+ /* Suunto ANT+ USB Driver */
+ #define SUUNTO_IDS() \
+@@ -112,45 +112,52 @@ DEVICE(hp4x, HP4X_IDS);
+ { USB_DEVICE(0x0fcf, 0x1009) } /* Dynastream ANT USB-m Stick */
+ DEVICE(suunto, SUUNTO_IDS);
+
+-/* Siemens USB/MPI adapter */
+-#define SIEMENS_IDS() \
+- { USB_DEVICE(0x908, 0x0004) }
+-DEVICE(siemens_mpi, SIEMENS_IDS);
++/* ViVOpay USB Serial Driver */
++#define VIVOPAY_IDS() \
++ { USB_DEVICE(0x1d5f, 0x1004) } /* ViVOpay 8800 */
++DEVICE(vivopay, VIVOPAY_IDS);
++
++/* ZIO Motherboard USB driver */
++#define ZIO_IDS() \
++ { USB_DEVICE(0x1CBE, 0x0103) }
++DEVICE(zio, ZIO_IDS);
+
+ /* All of the above structures mushed into two lists */
+ static struct usb_serial_driver * const serial_drivers[] = {
+ &carelink_device,
+- &zio_device,
+- &funsoft_device,
+ &flashloader_device,
++ &funsoft_device,
+ &google_device,
++ &hp4x_device,
++ &kaufmann_device,
+ &libtransistor_device,
+- &vivopay_device,
+ &moto_modem_device,
+ &motorola_tetra_device,
+ &nokia_device,
+ &novatel_gps_device,
+- &hp4x_device,
+- &suunto_device,
+ &siemens_mpi_device,
++ &suunto_device,
++ &vivopay_device,
++ &zio_device,
+ NULL
+ };
+
+ static const struct usb_device_id id_table[] = {
+ CARELINK_IDS(),
+- ZIO_IDS(),
+- FUNSOFT_IDS(),
+ FLASHLOADER_IDS(),
++ FUNSOFT_IDS(),
+ GOOGLE_IDS(),
++ HP4X_IDS(),
++ KAUFMANN_IDS(),
+ LIBTRANSISTOR_IDS(),
+- VIVOPAY_IDS(),
+ MOTO_IDS(),
+ MOTOROLA_TETRA_IDS(),
+ NOKIA_IDS(),
+ NOVATEL_IDS(),
+- HP4X_IDS(),
+- SUUNTO_IDS(),
+ SIEMENS_IDS(),
++ SUUNTO_IDS(),
++ VIVOPAY_IDS(),
++ ZIO_IDS(),
+ { },
+ };
+ MODULE_DEVICE_TABLE(usb, id_table);
+diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
+index 349cc2030c903..5c6469548b203 100644
+--- a/drivers/usb/typec/class.c
++++ b/drivers/usb/typec/class.c
+@@ -1277,8 +1277,7 @@ static ssize_t select_usb_power_delivery_show(struct device *dev,
+ {
+ struct typec_port *port = to_typec_port(dev);
+ struct usb_power_delivery **pds;
+- struct usb_power_delivery *pd;
+- int ret = 0;
++ int i, ret = 0;
+
+ if (!port->ops || !port->ops->pd_get)
+ return -EOPNOTSUPP;
+@@ -1287,11 +1286,11 @@ static ssize_t select_usb_power_delivery_show(struct device *dev,
+ if (!pds)
+ return 0;
+
+- for (pd = pds[0]; pd; pd++) {
+- if (pd == port->pd)
+- ret += sysfs_emit(buf + ret, "[%s] ", dev_name(&pd->dev));
++ for (i = 0; pds[i]; i++) {
++ if (pds[i] == port->pd)
++ ret += sysfs_emit_at(buf, ret, "[%s] ", dev_name(&pds[i]->dev));
+ else
+- ret += sysfs_emit(buf + ret, "%s ", dev_name(&pd->dev));
++ ret += sysfs_emit_at(buf, ret, "%s ", dev_name(&pds[i]->dev));
+ }
+
+ buf[ret - 1] = '\n';
+@@ -2288,6 +2287,8 @@ struct typec_port *typec_register_port(struct device *parent,
+ return ERR_PTR(ret);
+ }
+
++ port->pd = cap->pd;
++
+ ret = device_add(&port->dev);
+ if (ret) {
+ dev_err(parent, "failed to register port (%d)\n", ret);
+@@ -2295,7 +2296,7 @@ struct typec_port *typec_register_port(struct device *parent,
+ return ERR_PTR(ret);
+ }
+
+- ret = typec_port_set_usb_power_delivery(port, cap->pd);
++ ret = usb_power_delivery_link_device(port->pd, &port->dev);
+ if (ret) {
+ dev_err(&port->dev, "failed to link pd\n");
+ device_unregister(&port->dev);
+diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
+index e1ec725c2819d..f13c3b76ad1eb 100644
+--- a/drivers/xen/grant-table.c
++++ b/drivers/xen/grant-table.c
+@@ -498,14 +498,21 @@ static LIST_HEAD(deferred_list);
+ static void gnttab_handle_deferred(struct timer_list *);
+ static DEFINE_TIMER(deferred_timer, gnttab_handle_deferred);
+
++static atomic64_t deferred_count;
++static atomic64_t leaked_count;
++static unsigned int free_per_iteration = 10;
++module_param(free_per_iteration, uint, 0600);
++
+ static void gnttab_handle_deferred(struct timer_list *unused)
+ {
+- unsigned int nr = 10;
++ unsigned int nr = READ_ONCE(free_per_iteration);
++ const bool ignore_limit = nr == 0;
+ struct deferred_entry *first = NULL;
+ unsigned long flags;
++ size_t freed = 0;
+
+ spin_lock_irqsave(&gnttab_list_lock, flags);
+- while (nr--) {
++ while ((ignore_limit || nr--) && !list_empty(&deferred_list)) {
+ struct deferred_entry *entry
+ = list_first_entry(&deferred_list,
+ struct deferred_entry, list);
+@@ -515,10 +522,14 @@ static void gnttab_handle_deferred(struct timer_list *unused)
+ list_del(&entry->list);
+ spin_unlock_irqrestore(&gnttab_list_lock, flags);
+ if (_gnttab_end_foreign_access_ref(entry->ref)) {
++ uint64_t ret = atomic64_dec_return(&deferred_count);
++
+ put_free_entry(entry->ref);
+- pr_debug("freeing g.e. %#x (pfn %#lx)\n",
+- entry->ref, page_to_pfn(entry->page));
++ pr_debug("freeing g.e. %#x (pfn %#lx), %llu remaining\n",
++ entry->ref, page_to_pfn(entry->page),
++ (unsigned long long)ret);
+ put_page(entry->page);
++ freed++;
+ kfree(entry);
+ entry = NULL;
+ } else {
+@@ -530,21 +541,22 @@ static void gnttab_handle_deferred(struct timer_list *unused)
+ spin_lock_irqsave(&gnttab_list_lock, flags);
+ if (entry)
+ list_add_tail(&entry->list, &deferred_list);
+- else if (list_empty(&deferred_list))
+- break;
+ }
+- if (!list_empty(&deferred_list) && !timer_pending(&deferred_timer)) {
++ if (list_empty(&deferred_list))
++ WARN_ON(atomic64_read(&deferred_count));
++ else if (!timer_pending(&deferred_timer)) {
+ deferred_timer.expires = jiffies + HZ;
+ add_timer(&deferred_timer);
+ }
+ spin_unlock_irqrestore(&gnttab_list_lock, flags);
++ pr_debug("Freed %zu references", freed);
+ }
+
+ static void gnttab_add_deferred(grant_ref_t ref, struct page *page)
+ {
+ struct deferred_entry *entry;
+ gfp_t gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;
+- const char *what = KERN_WARNING "leaking";
++ uint64_t leaked, deferred;
+
+ entry = kmalloc(sizeof(*entry), gfp);
+ if (!page) {
+@@ -567,10 +579,16 @@ static void gnttab_add_deferred(grant_ref_t ref, struct page *page)
+ add_timer(&deferred_timer);
+ }
+ spin_unlock_irqrestore(&gnttab_list_lock, flags);
+- what = KERN_DEBUG "deferring";
++ deferred = atomic64_inc_return(&deferred_count);
++ leaked = atomic64_read(&leaked_count);
++ pr_debug("deferring g.e. %#x (pfn %#lx) (total deferred %llu, total leaked %llu)\n",
++ ref, page ? page_to_pfn(page) : -1, deferred, leaked);
++ } else {
++ deferred = atomic64_read(&deferred_count);
++ leaked = atomic64_inc_return(&leaked_count);
++ pr_warn("leaking g.e. %#x (pfn %#lx) (total deferred %llu, total leaked %llu)\n",
++ ref, page ? page_to_pfn(page) : -1, deferred, leaked);
+ }
+- printk("%s g.e. %#x (pfn %#lx)\n",
+- what, ref, page ? page_to_pfn(page) : -1);
+ }
+
+ int gnttab_try_end_foreign_access(grant_ref_t ref)
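The grant-table hunk above turns the hard-coded batch of 10 into a tunable where 0 means "no limit", sampled once per timer run. A standalone sketch of that loop guard in plain C (the module_param()/READ_ONCE() plumbing and the real list handling are elided; names are illustrative):

    #include <stdbool.h>
    #include <stdio.h>

    static unsigned int free_per_iteration = 10;    /* 0 == drain everything */

    static void drain(unsigned int pending)
    {
        unsigned int nr = free_per_iteration;       /* kernel samples this with READ_ONCE() */
        const bool ignore_limit = nr == 0;
        size_t freed = 0;

        /* nr-- is short-circuited away when the limit is ignored */
        while ((ignore_limit || nr--) && pending) {
            pending--;
            freed++;
        }
        printf("freed %zu, %u still deferred\n", freed, pending);
    }

    int main(void)
    {
        drain(25);                  /* limited run: frees 10 */
        free_per_iteration = 0;
        drain(25);                  /* unlimited run: frees all 25 */
        return 0;
    }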
+diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
+index 58b732dcbfb83..639bf628389ba 100644
+--- a/drivers/xen/xenbus/xenbus_probe.c
++++ b/drivers/xen/xenbus/xenbus_probe.c
+@@ -811,6 +811,9 @@ static int xenbus_probe_thread(void *unused)
+
+ static int __init xenbus_probe_initcall(void)
+ {
++ if (!xen_domain())
++ return -ENODEV;
++
+ /*
+ * Probe XenBus here in the XS_PV case, and also XS_HVM unless we
+ * need to wait for the platform PCI device to come up or
+diff --git a/fs/9p/fid.h b/fs/9p/fid.h
+index 0c51889a60b33..29281b7c38870 100644
+--- a/fs/9p/fid.h
++++ b/fs/9p/fid.h
+@@ -46,8 +46,8 @@ static inline struct p9_fid *v9fs_fid_clone(struct dentry *dentry)
+ * NOTE: these are set after open so only reflect 9p client not
+ * underlying file system on server.
+ */
+-static inline void v9fs_fid_add_modes(struct p9_fid *fid, int s_flags,
+- int s_cache, unsigned int f_flags)
++static inline void v9fs_fid_add_modes(struct p9_fid *fid, unsigned int s_flags,
++ unsigned int s_cache, unsigned int f_flags)
+ {
+ if (fid->qid.type != P9_QTFILE)
+ return;
+@@ -57,7 +57,7 @@ static inline void v9fs_fid_add_modes(struct p9_fid *fid, int s_flags,
+ (s_flags & V9FS_DIRECT_IO) || (f_flags & O_DIRECT)) {
+ fid->mode |= P9L_DIRECT; /* no read or write cache */
+ } else if ((!(s_cache & CACHE_WRITEBACK)) ||
+- (f_flags & O_DSYNC) | (s_flags & V9FS_SYNC)) {
++ (f_flags & O_DSYNC) || (s_flags & V9FS_SYNC)) {
+ fid->mode |= P9L_NOWRITECACHE;
+ }
+ }
+diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
+index 06a2514f0d882..698c43dd5dc86 100644
+--- a/fs/9p/v9fs.h
++++ b/fs/9p/v9fs.h
+@@ -108,7 +108,7 @@ enum p9_cache_bits {
+
+ struct v9fs_session_info {
+ /* options */
+- unsigned char flags;
++ unsigned int flags;
+ unsigned char nodev;
+ unsigned short debug;
+ unsigned int afid;
+diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
+index 45b684b7d8d7c..4102759a5cb56 100644
+--- a/fs/9p/vfs_dir.c
++++ b/fs/9p/vfs_dir.c
+@@ -208,7 +208,7 @@ int v9fs_dir_release(struct inode *inode, struct file *filp)
+ struct p9_fid *fid;
+ __le32 version;
+ loff_t i_size;
+- int retval = 0;
++ int retval = 0, put_err;
+
+ fid = filp->private_data;
+ p9_debug(P9_DEBUG_VFS, "inode: %p filp: %p fid: %d\n",
+@@ -221,7 +221,8 @@ int v9fs_dir_release(struct inode *inode, struct file *filp)
+ spin_lock(&inode->i_lock);
+ hlist_del(&fid->ilist);
+ spin_unlock(&inode->i_lock);
+- retval = p9_fid_put(fid);
++ put_err = p9_fid_put(fid);
++ retval = retval < 0 ? retval : put_err;
+ }
+
+ if ((filp->f_mode & FMODE_WRITE)) {
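The 9p release hunk above keeps the first failure instead of letting the final p9_fid_put() overwrite it. The "preserve the first error" shape, as a standalone sketch with made-up step functions:

    #include <stdio.h>

    static int step1(void) { return -5; }   /* pretend this fails */
    static int step2(void) { return 0; }    /* later cleanup succeeds */

    int main(void)
    {
        int retval = 0, err;

        err = step1();
        retval = retval < 0 ? retval : err;
        err = step2();
        retval = retval < 0 ? retval : err; /* success does not mask the earlier error */

        printf("first error preserved: %d\n", retval);  /* -5 */
        return 0;
    }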
+diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
+index 6c31b8c8112d9..99cb4f04cbdbb 100644
+--- a/fs/9p/vfs_file.c
++++ b/fs/9p/vfs_file.c
+@@ -483,10 +483,7 @@ v9fs_file_mmap(struct file *filp, struct vm_area_struct *vma)
+ p9_debug(P9_DEBUG_MMAP, "filp :%p\n", filp);
+
+ if (!(v9ses->cache & CACHE_WRITEBACK)) {
+- p9_debug(P9_DEBUG_CACHE, "(no mmap mode)");
+- if (vma->vm_flags & VM_MAYSHARE)
+- return -ENODEV;
+- invalidate_inode_pages2(filp->f_mapping);
++ p9_debug(P9_DEBUG_CACHE, "(read-only mmap mode)");
+ return generic_file_readonly_mmap(filp, vma);
+ }
+
+diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
+index ac18c43fadadc..8e21c2faff625 100644
+--- a/fs/btrfs/block-rsv.c
++++ b/fs/btrfs/block-rsv.c
+@@ -349,6 +349,11 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info)
+ }
+ read_unlock(&fs_info->global_root_lock);
+
++ if (btrfs_fs_compat_ro(fs_info, BLOCK_GROUP_TREE)) {
++ num_bytes += btrfs_root_used(&fs_info->block_group_root->root_item);
++ min_items++;
++ }
++
+ /*
+ * But we also want to reserve enough space so we can do the fallback
+ * global reserve for an unlink, which is an additional
+diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
+index 795b30913c542..9f056ad41df04 100644
+--- a/fs/btrfs/disk-io.c
++++ b/fs/btrfs/disk-io.c
+@@ -3692,11 +3692,16 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
+ * For devices supporting discard turn on discard=async automatically,
+ * unless it's already set or disabled. This could be turned off by
+ * nodiscard for the same mount.
++ *
++ * The zoned mode piggy backs on the discard functionality for
++ * resetting a zone. There is no reason to delay the zone reset as it is
++ * fast enough. So, do not enable async discard for zoned mode.
+ */
+ if (!(btrfs_test_opt(fs_info, DISCARD_SYNC) ||
+ btrfs_test_opt(fs_info, DISCARD_ASYNC) ||
+ btrfs_test_opt(fs_info, NODISCARD)) &&
+- fs_info->fs_devices->discardable) {
++ fs_info->fs_devices->discardable &&
++ !btrfs_is_zoned(fs_info)) {
+ btrfs_set_and_info(fs_info, DISCARD_ASYNC,
+ "auto enabling async discard");
+ }
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index a37a6587efaf0..82b9779deaa88 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -478,6 +478,15 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
+ start, end, page_ops, NULL);
+ }
+
++static bool btrfs_verify_page(struct page *page, u64 start)
++{
++ if (!fsverity_active(page->mapping->host) ||
++ PageError(page) || PageUptodate(page) ||
++ start >= i_size_read(page->mapping->host))
++ return true;
++ return fsverity_verify_page(page);
++}
++
+ static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
+ {
+ struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
+@@ -485,16 +494,8 @@ static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
+ ASSERT(page_offset(page) <= start &&
+ start + len <= page_offset(page) + PAGE_SIZE);
+
+- if (uptodate) {
+- if (fsverity_active(page->mapping->host) &&
+- !PageError(page) &&
+- !PageUptodate(page) &&
+- start < i_size_read(page->mapping->host) &&
+- !fsverity_verify_page(page)) {
+- btrfs_page_set_error(fs_info, page, start, len);
+- } else {
+- btrfs_page_set_uptodate(fs_info, page, start, len);
+- }
++ if (uptodate && btrfs_verify_page(page, start)) {
++ btrfs_page_set_uptodate(fs_info, page, start, len);
+ } else {
+ btrfs_page_clear_uptodate(fs_info, page, start, len);
+ btrfs_page_set_error(fs_info, page, start, len);
+diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
+index 360bf2522a871..2637d6b157ff9 100644
+--- a/fs/btrfs/qgroup.c
++++ b/fs/btrfs/qgroup.c
+@@ -1232,12 +1232,23 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
+ int ret = 0;
+
+ /*
+- * We need to have subvol_sem write locked, to prevent races between
+- * concurrent tasks trying to disable quotas, because we will unlock
+- * and relock qgroup_ioctl_lock across BTRFS_FS_QUOTA_ENABLED changes.
++ * We need to have subvol_sem write locked to prevent races with
++ * snapshot creation.
+ */
+ lockdep_assert_held_write(&fs_info->subvol_sem);
+
++ /*
++ * Lock the cleaner mutex to prevent races with concurrent relocation,
++ * because relocation may be building backrefs for blocks of the quota
++ * root while we are deleting the root. This is like dropping fs roots
++ * of deleted snapshots/subvolumes, we need the same protection.
++ *
++ * This also prevents races between concurrent tasks trying to disable
++ * quotas, because we will unlock and relock qgroup_ioctl_lock across
++ * BTRFS_FS_QUOTA_ENABLED changes.
++ */
++ mutex_lock(&fs_info->cleaner_mutex);
++
+ mutex_lock(&fs_info->qgroup_ioctl_lock);
+ if (!fs_info->quota_root)
+ goto out;
+@@ -1319,6 +1330,7 @@ out:
+ btrfs_end_transaction(trans);
+ else if (trans)
+ ret = btrfs_end_transaction(trans);
++ mutex_unlock(&fs_info->cleaner_mutex);
+
+ return ret;
+ }
+diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
+index 8b6a99b8d7f6d..eaf7511bbc37e 100644
+--- a/fs/btrfs/transaction.c
++++ b/fs/btrfs/transaction.c
+@@ -828,8 +828,13 @@ btrfs_attach_transaction_barrier(struct btrfs_root *root)
+
+ trans = start_transaction(root, 0, TRANS_ATTACH,
+ BTRFS_RESERVE_NO_FLUSH, true);
+- if (trans == ERR_PTR(-ENOENT))
+- btrfs_wait_for_commit(root->fs_info, 0);
++ if (trans == ERR_PTR(-ENOENT)) {
++ int ret;
++
++ ret = btrfs_wait_for_commit(root->fs_info, 0);
++ if (ret)
++ return ERR_PTR(ret);
++ }
+
+ return trans;
+ }
+@@ -933,6 +938,7 @@ int btrfs_wait_for_commit(struct btrfs_fs_info *fs_info, u64 transid)
+ }
+
+ wait_for_commit(cur_trans, TRANS_STATE_COMPLETED);
++ ret = cur_trans->aborted;
+ btrfs_put_transaction(cur_trans);
+ out:
+ return ret;
+diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
+index 39828af4a4e8c..1c56e056afccf 100644
+--- a/fs/btrfs/zoned.c
++++ b/fs/btrfs/zoned.c
+@@ -804,6 +804,9 @@ int btrfs_check_mountopts_zoned(struct btrfs_fs_info *info)
+ return -EINVAL;
+ }
+
++ btrfs_clear_and_info(info, DISCARD_ASYNC,
++ "zoned: async discard ignored and disabled for zoned mode");
++
+ return 0;
+ }
+
+diff --git a/fs/ceph/metric.c b/fs/ceph/metric.c
+index c47347d2e84e3..9560b7bc6009a 100644
+--- a/fs/ceph/metric.c
++++ b/fs/ceph/metric.c
+@@ -208,7 +208,7 @@ static void metric_delayed_work(struct work_struct *work)
+ struct ceph_mds_client *mdsc =
+ container_of(m, struct ceph_mds_client, metric);
+
+- if (mdsc->stopping)
++ if (mdsc->stopping || disable_send_metrics)
+ return;
+
+ if (!m->session || !check_session_state(m->session)) {
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index fd4d12c58c3b4..3fa5de892d89d 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -4528,6 +4528,37 @@ ext4_mb_check_group_pa(ext4_fsblk_t goal_block,
+ return pa;
+ }
+
++/*
++ * check if found pa meets EXT4_MB_HINT_GOAL_ONLY
++ */
++static bool
++ext4_mb_pa_goal_check(struct ext4_allocation_context *ac,
++ struct ext4_prealloc_space *pa)
++{
++ struct ext4_sb_info *sbi = EXT4_SB(ac->ac_sb);
++ ext4_fsblk_t start;
++
++ if (likely(!(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY)))
++ return true;
++
++ /*
++ * If EXT4_MB_HINT_GOAL_ONLY is set, ac_g_ex will not be adjusted
++ * in ext4_mb_normalize_request and will keep same with ac_o_ex
++ * from ext4_mb_initialize_context. Choose ac_g_ex here to keep
++ * consistent with ext4_mb_find_by_goal.
++ */
++ start = pa->pa_pstart +
++ (ac->ac_g_ex.fe_logical - pa->pa_lstart);
++ if (ext4_grp_offs_to_block(ac->ac_sb, &ac->ac_g_ex) != start)
++ return false;
++
++ if (ac->ac_g_ex.fe_len > pa->pa_len -
++ EXT4_B2C(sbi, ac->ac_g_ex.fe_logical - pa->pa_lstart))
++ return false;
++
++ return true;
++}
++
+ /*
+ * search goal blocks in preallocated space
+ */
+@@ -4538,8 +4569,8 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
+ int order, i;
+ struct ext4_inode_info *ei = EXT4_I(ac->ac_inode);
+ struct ext4_locality_group *lg;
+- struct ext4_prealloc_space *tmp_pa, *cpa = NULL;
+- ext4_lblk_t tmp_pa_start, tmp_pa_end;
++ struct ext4_prealloc_space *tmp_pa = NULL, *cpa = NULL;
++ loff_t tmp_pa_end;
+ struct rb_node *iter;
+ ext4_fsblk_t goal_block;
+
+@@ -4547,47 +4578,151 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
+ if (!(ac->ac_flags & EXT4_MB_HINT_DATA))
+ return false;
+
+- /* first, try per-file preallocation */
++ /*
++ * first, try per-file preallocation by searching the inode pa rbtree.
++ *
++ * Here, we can't do a direct traversal of the tree because
++ * ext4_mb_discard_group_preallocation() can paralelly mark the pa
++ * deleted and that can cause direct traversal to skip some entries.
++ */
+ read_lock(&ei->i_prealloc_lock);
++
++ if (RB_EMPTY_ROOT(&ei->i_prealloc_node)) {
++ goto try_group_pa;
++ }
++
++ /*
++ * Step 1: Find a pa with logical start immediately adjacent to the
++ * original logical start. This could be on the left or right.
++ *
++ * (tmp_pa->pa_lstart never changes so we can skip locking for it).
++ */
+ for (iter = ei->i_prealloc_node.rb_node; iter;
+ iter = ext4_mb_pa_rb_next_iter(ac->ac_o_ex.fe_logical,
+- tmp_pa_start, iter)) {
++ tmp_pa->pa_lstart, iter)) {
+ tmp_pa = rb_entry(iter, struct ext4_prealloc_space,
+ pa_node.inode_node);
++ }
+
+- /* all fields in this condition don't change,
+- * so we can skip locking for them */
+- tmp_pa_start = tmp_pa->pa_lstart;
+- tmp_pa_end = tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
+-
+- /* original request start doesn't lie in this PA */
+- if (ac->ac_o_ex.fe_logical < tmp_pa_start ||
+- ac->ac_o_ex.fe_logical >= tmp_pa_end)
+- continue;
++ /*
++ * Step 2: The adjacent pa might be to the right of logical start, find
++ * the left adjacent pa. After this step we'd have a valid tmp_pa whose
++ * logical start is towards the left of original request's logical start
++ */
++ if (tmp_pa->pa_lstart > ac->ac_o_ex.fe_logical) {
++ struct rb_node *tmp;
++ tmp = rb_prev(&tmp_pa->pa_node.inode_node);
+
+- /* non-extent files can't have physical blocks past 2^32 */
+- if (!(ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS)) &&
+- (tmp_pa->pa_pstart + EXT4_C2B(sbi, tmp_pa->pa_len) >
+- EXT4_MAX_BLOCK_FILE_PHYS)) {
++ if (tmp) {
++ tmp_pa = rb_entry(tmp, struct ext4_prealloc_space,
++ pa_node.inode_node);
++ } else {
+ /*
+- * Since PAs don't overlap, we won't find any
+- * other PA to satisfy this.
++ * If there is no adjacent pa to the left then finding
++ * an overlapping pa is not possible hence stop searching
++ * inode pa tree
+ */
+- break;
++ goto try_group_pa;
+ }
++ }
++
++ BUG_ON(!(tmp_pa && tmp_pa->pa_lstart <= ac->ac_o_ex.fe_logical));
+
+- /* found preallocated blocks, use them */
++ /*
++ * Step 3: If the left adjacent pa is deleted, keep moving left to find
++ * the first non deleted adjacent pa. After this step we should have a
++ * valid tmp_pa which is guaranteed to be non deleted.
++ */
++ for (iter = &tmp_pa->pa_node.inode_node;; iter = rb_prev(iter)) {
++ if (!iter) {
++ /*
++ * no non deleted left adjacent pa, so stop searching
++ * inode pa tree
++ */
++ goto try_group_pa;
++ }
++ tmp_pa = rb_entry(iter, struct ext4_prealloc_space,
++ pa_node.inode_node);
+ spin_lock(&tmp_pa->pa_lock);
+- if (tmp_pa->pa_deleted == 0 && tmp_pa->pa_free) {
+- atomic_inc(&tmp_pa->pa_count);
+- ext4_mb_use_inode_pa(ac, tmp_pa);
++ if (tmp_pa->pa_deleted == 0) {
++ /*
++ * We will keep holding the pa_lock from
++ * this point on because we don't want group discard
++ * to delete this pa underneath us. Since group
++ * discard is anyways an ENOSPC operation it
++ * should be okay for it to wait a few more cycles.
++ */
++ break;
++ } else {
+ spin_unlock(&tmp_pa->pa_lock);
+- ac->ac_criteria = 10;
+- read_unlock(&ei->i_prealloc_lock);
+- return true;
+ }
++ }
++
++ BUG_ON(!(tmp_pa && tmp_pa->pa_lstart <= ac->ac_o_ex.fe_logical));
++ BUG_ON(tmp_pa->pa_deleted == 1);
++
++ /*
++ * Step 4: We now have the non deleted left adjacent pa. Only this
++ * pa can possibly satisfy the request hence check if it overlaps
++ * original logical start and stop searching if it doesn't.
++ */
++ tmp_pa_end = (loff_t)tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len);
++
++ if (ac->ac_o_ex.fe_logical >= tmp_pa_end) {
++ spin_unlock(&tmp_pa->pa_lock);
++ goto try_group_pa;
++ }
++
++ /* non-extent files can't have physical blocks past 2^32 */
++ if (!(ext4_test_inode_flag(ac->ac_inode, EXT4_INODE_EXTENTS)) &&
++ (tmp_pa->pa_pstart + EXT4_C2B(sbi, tmp_pa->pa_len) >
++ EXT4_MAX_BLOCK_FILE_PHYS)) {
++ /*
++ * Since PAs don't overlap, we won't find any other PA to
++ * satisfy this.
++ */
++ spin_unlock(&tmp_pa->pa_lock);
++ goto try_group_pa;
++ }
++
++ if (tmp_pa->pa_free && likely(ext4_mb_pa_goal_check(ac, tmp_pa))) {
++ atomic_inc(&tmp_pa->pa_count);
++ ext4_mb_use_inode_pa(ac, tmp_pa);
+ spin_unlock(&tmp_pa->pa_lock);
++ read_unlock(&ei->i_prealloc_lock);
++ return true;
++ } else {
++ /*
++ * We found a valid overlapping pa but couldn't use it because
++ * it had no free blocks. This should ideally never happen
++ * because:
++ *
++ * 1. When a new inode pa is added to rbtree it must have
++ * pa_free > 0 since otherwise we won't actually need
++ * preallocation.
++ *
++ * 2. An inode pa that is in the rbtree can only have it's
++ * pa_free become zero when another thread calls:
++ * ext4_mb_new_blocks
++ * ext4_mb_use_preallocated
++ * ext4_mb_use_inode_pa
++ *
++ * 3. Further, after the above calls make pa_free == 0, we will
++ * immediately remove it from the rbtree in:
++ * ext4_mb_new_blocks
++ * ext4_mb_release_context
++ * ext4_mb_put_pa
++ *
++ * 4. Since the pa_free becoming 0 and pa_free getting removed
++ * from tree both happen in ext4_mb_new_blocks, which is always
++ * called with i_data_sem held for data allocations, we can be
++ * sure that another process will never see a pa in rbtree with
++ * pa_free == 0.
++ */
++ WARN_ON_ONCE(tmp_pa->pa_free == 0);
+ }
++ spin_unlock(&tmp_pa->pa_lock);
++try_group_pa:
+ read_unlock(&ei->i_prealloc_lock);
+
+ /* can we use group allocation? */
+@@ -4625,7 +4760,6 @@ ext4_mb_use_preallocated(struct ext4_allocation_context *ac)
+ }
+ if (cpa) {
+ ext4_mb_use_group_pa(ac, cpa);
+- ac->ac_criteria = 20;
+ return true;
+ }
+ return false;
+@@ -5399,6 +5533,10 @@ static void ext4_mb_show_ac(struct ext4_allocation_context *ac)
+ (unsigned long)ac->ac_b_ex.fe_logical,
+ (int)ac->ac_criteria);
+ mb_debug(sb, "%u found", ac->ac_found);
++ mb_debug(sb, "used pa: %s, ", ac->ac_pa ? "yes" : "no");
++ if (ac->ac_pa)
++ mb_debug(sb, "pa_type %s\n", ac->ac_pa->pa_type == MB_GROUP_PA ?
++ "group pa" : "inode pa");
+ ext4_mb_show_pa(sb);
+ }
+ #else
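At its core, the reworked inode-PA lookup above finds the right-most preallocation whose logical start is at or before the request, then checks whether that PA actually covers the request before falling back to a group PA. A sorted-array sketch of that decision (the rbtree walk, the pa_deleted skipping and all locking from the hunk are intentionally left out; numbers are made up):

    #include <stdio.h>

    struct pa { unsigned long lstart, len; };

    /* array is sorted by lstart, mirroring the rbtree order */
    static const struct pa *find_left_adjacent(const struct pa *pas, int n,
                                               unsigned long target)
    {
        const struct pa *best = NULL;

        for (int i = 0; i < n; i++)
            if (pas[i].lstart <= target)
                best = &pas[i];
        return best;
    }

    int main(void)
    {
        const struct pa pas[] = { { 0, 8 }, { 16, 8 }, { 64, 32 } };
        unsigned long target = 70;
        const struct pa *pa = find_left_adjacent(pas, 3, target);

        if (pa && target < pa->lstart + pa->len)
            printf("covered by pa at %lu (+%lu)\n", pa->lstart, pa->len);
        else
            printf("no covering pa, fall back to group pa\n");
        return 0;
    }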
+diff --git a/fs/file.c b/fs/file.c
+index 7893ea161d770..35c62b54c9d65 100644
+--- a/fs/file.c
++++ b/fs/file.c
+@@ -1042,10 +1042,8 @@ unsigned long __fdget_pos(unsigned int fd)
+ struct file *file = (struct file *)(v & ~3);
+
+ if (file && (file->f_mode & FMODE_ATOMIC_POS)) {
+- if (file_count(file) > 1) {
+- v |= FDPUT_POS_UNLOCK;
+- mutex_lock(&file->f_pos_lock);
+- }
++ v |= FDPUT_POS_UNLOCK;
++ mutex_lock(&file->f_pos_lock);
+ }
+ return v;
+ }
+diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
+index 25e3c20eb19f6..c4e0da6db7195 100644
+--- a/fs/jbd2/checkpoint.c
++++ b/fs/jbd2/checkpoint.c
+@@ -221,20 +221,6 @@ restart:
+ jh = transaction->t_checkpoint_list;
+ bh = jh2bh(jh);
+
+- /*
+- * The buffer may be writing back, or flushing out in the
+- * last couple of cycles, or re-adding into a new transaction,
+- * need to check it again until it's unlocked.
+- */
+- if (buffer_locked(bh)) {
+- get_bh(bh);
+- spin_unlock(&journal->j_list_lock);
+- wait_on_buffer(bh);
+- /* the journal_head may have gone by now */
+- BUFFER_TRACE(bh, "brelse");
+- __brelse(bh);
+- goto retry;
+- }
+ if (jh->b_transaction != NULL) {
+ transaction_t *t = jh->b_transaction;
+ tid_t tid = t->t_tid;
+@@ -269,7 +255,22 @@ restart:
+ spin_lock(&journal->j_list_lock);
+ goto restart;
+ }
+- if (!buffer_dirty(bh)) {
++ if (!trylock_buffer(bh)) {
++ /*
++ * The buffer is locked, it may be writing back, or
++ * flushing out in the last couple of cycles, or
++ * re-adding into a new transaction, need to check
++ * it again until it's unlocked.
++ */
++ get_bh(bh);
++ spin_unlock(&journal->j_list_lock);
++ wait_on_buffer(bh);
++ /* the journal_head may have gone by now */
++ BUFFER_TRACE(bh, "brelse");
++ __brelse(bh);
++ goto retry;
++ } else if (!buffer_dirty(bh)) {
++ unlock_buffer(bh);
+ BUFFER_TRACE(bh, "remove from checkpoint");
+ /*
+ * If the transaction was released or the checkpoint
+@@ -279,6 +280,7 @@ restart:
+ !transaction->t_checkpoint_list)
+ goto out;
+ } else {
++ unlock_buffer(bh);
+ /*
+ * We are about to write the buffer, it could be
+ * raced by some other transaction shrink or buffer
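The checkpoint hunk above moves the "buffer is locked" handling to a trylock: take the buffer lock opportunistically, and only if that fails drop the list lock, wait, and restart the scan. A userspace pthread sketch of that ordering (the mutexes stand in for the buffer lock and j_list_lock; none of these are jbd2 APIs):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t buffer_lock = PTHREAD_MUTEX_INITIALIZER;

    static void process_one(void)
    {
    retry:
        pthread_mutex_lock(&list_lock);
        if (pthread_mutex_trylock(&buffer_lock)) {
            /* buffer busy: release the list lock, wait, start over */
            pthread_mutex_unlock(&list_lock);
            pthread_mutex_lock(&buffer_lock);   /* stand-in for wait_on_buffer() */
            pthread_mutex_unlock(&buffer_lock);
            goto retry;
        }
        /* buffer is ours and the list lock is still held: do the work */
        pthread_mutex_unlock(&buffer_lock);
        pthread_mutex_unlock(&list_lock);
    }

    int main(void)
    {
        process_one();
        puts("done");
        return 0;
    }

The point of the reordering is that the expensive wait never happens while the list lock is held, yet a locked buffer is still retried rather than skipped.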
+diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
+index 6e61fa3acaf11..3aefbad4cc099 100644
+--- a/fs/nfsd/nfs4state.c
++++ b/fs/nfsd/nfs4state.c
+@@ -6341,8 +6341,6 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
+ if (ZERO_STATEID(stateid) || ONE_STATEID(stateid) ||
+ CLOSE_STATEID(stateid))
+ return status;
+- if (!same_clid(&stateid->si_opaque.so_clid, &cl->cl_clientid))
+- return status;
+ spin_lock(&cl->cl_lock);
+ s = find_stateid_locked(cl, stateid);
+ if (!s)
+diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
+index 03f5963914a14..a0e1463c3fc4d 100644
+--- a/fs/proc/vmcore.c
++++ b/fs/proc/vmcore.c
+@@ -132,7 +132,7 @@ ssize_t read_from_oldmem(struct iov_iter *iter, size_t count,
+ u64 *ppos, bool encrypted)
+ {
+ unsigned long pfn, offset;
+- size_t nr_bytes;
++ ssize_t nr_bytes;
+ ssize_t read = 0, tmp;
+ int idx;
+
+diff --git a/fs/smb/client/sess.c b/fs/smb/client/sess.c
+index 335c078c42fb5..c57ca2050b73f 100644
+--- a/fs/smb/client/sess.c
++++ b/fs/smb/client/sess.c
+@@ -1013,6 +1013,7 @@ setup_ntlm_smb3_neg_ret:
+ }
+
+
++/* See MS-NLMP 2.2.1.3 */
+ int build_ntlmssp_auth_blob(unsigned char **pbuffer,
+ u16 *buflen,
+ struct cifs_ses *ses,
+@@ -1047,7 +1048,8 @@ int build_ntlmssp_auth_blob(unsigned char **pbuffer,
+
+ flags = ses->ntlmssp->server_flags | NTLMSSP_REQUEST_TARGET |
+ NTLMSSP_NEGOTIATE_TARGET_INFO | NTLMSSP_NEGOTIATE_WORKSTATION_SUPPLIED;
+-
++ /* we only send version information in ntlmssp negotiate, so do not set this flag */
++ flags = flags & ~NTLMSSP_NEGOTIATE_VERSION;
+ tmp = *pbuffer + sizeof(AUTHENTICATE_MESSAGE);
+ sec_blob->NegotiateFlags = cpu_to_le32(flags);
+
+diff --git a/fs/smb/server/ksmbd_netlink.h b/fs/smb/server/ksmbd_netlink.h
+index fb8b2d566efb6..b7521e41402e0 100644
+--- a/fs/smb/server/ksmbd_netlink.h
++++ b/fs/smb/server/ksmbd_netlink.h
+@@ -352,7 +352,8 @@ enum KSMBD_TREE_CONN_STATUS {
+ #define KSMBD_SHARE_FLAG_STREAMS BIT(11)
+ #define KSMBD_SHARE_FLAG_FOLLOW_SYMLINKS BIT(12)
+ #define KSMBD_SHARE_FLAG_ACL_XATTR BIT(13)
+-#define KSMBD_SHARE_FLAG_UPDATE BIT(14)
++#define KSMBD_SHARE_FLAG_UPDATE BIT(14)
++#define KSMBD_SHARE_FLAG_CROSSMNT BIT(15)
+
+ /*
+ * Tree connect request flags.
+diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
+index 1cc336f512851..d7e5196485604 100644
+--- a/fs/smb/server/smb2pdu.c
++++ b/fs/smb/server/smb2pdu.c
+@@ -2467,8 +2467,9 @@ static void smb2_update_xattrs(struct ksmbd_tree_connect *tcon,
+ }
+ }
+
+-static int smb2_creat(struct ksmbd_work *work, struct path *path, char *name,
+- int open_flags, umode_t posix_mode, bool is_dir)
++static int smb2_creat(struct ksmbd_work *work, struct path *parent_path,
++ struct path *path, char *name, int open_flags,
++ umode_t posix_mode, bool is_dir)
+ {
+ struct ksmbd_tree_connect *tcon = work->tcon;
+ struct ksmbd_share_config *share = tcon->share_conf;
+@@ -2495,7 +2496,7 @@ static int smb2_creat(struct ksmbd_work *work, struct path *path, char *name,
+ return rc;
+ }
+
+- rc = ksmbd_vfs_kern_path_locked(work, name, 0, path, 0);
++ rc = ksmbd_vfs_kern_path_locked(work, name, 0, parent_path, path, 0);
+ if (rc) {
+ pr_err("cannot get linux path (%s), err = %d\n",
+ name, rc);
+@@ -2565,7 +2566,7 @@ int smb2_open(struct ksmbd_work *work)
+ struct ksmbd_tree_connect *tcon = work->tcon;
+ struct smb2_create_req *req;
+ struct smb2_create_rsp *rsp;
+- struct path path;
++ struct path path, parent_path;
+ struct ksmbd_share_config *share = tcon->share_conf;
+ struct ksmbd_file *fp = NULL;
+ struct file *filp = NULL;
+@@ -2786,7 +2787,8 @@ int smb2_open(struct ksmbd_work *work)
+ goto err_out1;
+ }
+
+- rc = ksmbd_vfs_kern_path_locked(work, name, LOOKUP_NO_SYMLINKS, &path, 1);
++ rc = ksmbd_vfs_kern_path_locked(work, name, LOOKUP_NO_SYMLINKS,
++ &parent_path, &path, 1);
+ if (!rc) {
+ file_present = true;
+
+@@ -2908,7 +2910,8 @@ int smb2_open(struct ksmbd_work *work)
+
+ /*create file if not present */
+ if (!file_present) {
+- rc = smb2_creat(work, &path, name, open_flags, posix_mode,
++ rc = smb2_creat(work, &parent_path, &path, name, open_flags,
++ posix_mode,
+ req->CreateOptions & FILE_DIRECTORY_FILE_LE);
+ if (rc) {
+ if (rc == -ENOENT) {
+@@ -3323,8 +3326,9 @@ int smb2_open(struct ksmbd_work *work)
+
+ err_out:
+ if (file_present || created) {
+- inode_unlock(d_inode(path.dentry->d_parent));
+- dput(path.dentry);
++ inode_unlock(d_inode(parent_path.dentry));
++ path_put(&path);
++ path_put(&parent_path);
+ }
+ ksmbd_revert_fsids(work);
+ err_out1:
+@@ -5547,7 +5551,7 @@ static int smb2_create_link(struct ksmbd_work *work,
+ struct nls_table *local_nls)
+ {
+ char *link_name = NULL, *target_name = NULL, *pathname = NULL;
+- struct path path;
++ struct path path, parent_path;
+ bool file_present = false;
+ int rc;
+
+@@ -5577,7 +5581,7 @@ static int smb2_create_link(struct ksmbd_work *work,
+
+ ksmbd_debug(SMB, "target name is %s\n", target_name);
+ rc = ksmbd_vfs_kern_path_locked(work, link_name, LOOKUP_NO_SYMLINKS,
+- &path, 0);
++ &parent_path, &path, 0);
+ if (rc) {
+ if (rc != -ENOENT)
+ goto out;
+@@ -5607,8 +5611,9 @@ static int smb2_create_link(struct ksmbd_work *work,
+ rc = -EINVAL;
+ out:
+ if (file_present) {
+- inode_unlock(d_inode(path.dentry->d_parent));
++ inode_unlock(d_inode(parent_path.dentry));
+ path_put(&path);
++ path_put(&parent_path);
+ }
+ if (!IS_ERR(link_name))
+ kfree(link_name);
+diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
+index 81489fdedd8e0..911cb3d294b86 100644
+--- a/fs/smb/server/vfs.c
++++ b/fs/smb/server/vfs.c
+@@ -63,13 +63,13 @@ int ksmbd_vfs_lock_parent(struct dentry *parent, struct dentry *child)
+
+ static int ksmbd_vfs_path_lookup_locked(struct ksmbd_share_config *share_conf,
+ char *pathname, unsigned int flags,
++ struct path *parent_path,
+ struct path *path)
+ {
+ struct qstr last;
+ struct filename *filename;
+ struct path *root_share_path = &share_conf->vfs_path;
+ int err, type;
+- struct path parent_path;
+ struct dentry *d;
+
+ if (pathname[0] == '\0') {
+@@ -84,7 +84,7 @@ static int ksmbd_vfs_path_lookup_locked(struct ksmbd_share_config *share_conf,
+ return PTR_ERR(filename);
+
+ err = vfs_path_parent_lookup(filename, flags,
+- &parent_path, &last, &type,
++ parent_path, &last, &type,
+ root_share_path);
+ if (err) {
+ putname(filename);
+@@ -92,13 +92,13 @@ static int ksmbd_vfs_path_lookup_locked(struct ksmbd_share_config *share_conf,
+ }
+
+ if (unlikely(type != LAST_NORM)) {
+- path_put(&parent_path);
++ path_put(parent_path);
+ putname(filename);
+ return -ENOENT;
+ }
+
+- inode_lock_nested(parent_path.dentry->d_inode, I_MUTEX_PARENT);
+- d = lookup_one_qstr_excl(&last, parent_path.dentry, 0);
++ inode_lock_nested(parent_path->dentry->d_inode, I_MUTEX_PARENT);
++ d = lookup_one_qstr_excl(&last, parent_path->dentry, 0);
+ if (IS_ERR(d))
+ goto err_out;
+
+@@ -108,15 +108,22 @@ static int ksmbd_vfs_path_lookup_locked(struct ksmbd_share_config *share_conf,
+ }
+
+ path->dentry = d;
+- path->mnt = share_conf->vfs_path.mnt;
+- path_put(&parent_path);
+- putname(filename);
++ path->mnt = mntget(parent_path->mnt);
++
++ if (test_share_config_flag(share_conf, KSMBD_SHARE_FLAG_CROSSMNT)) {
++ err = follow_down(path, 0);
++ if (err < 0) {
++ path_put(path);
++ goto err_out;
++ }
++ }
+
++ putname(filename);
+ return 0;
+
+ err_out:
+- inode_unlock(parent_path.dentry->d_inode);
+- path_put(&parent_path);
++ inode_unlock(d_inode(parent_path->dentry));
++ path_put(parent_path);
+ putname(filename);
+ return -ENOENT;
+ }
+@@ -1198,14 +1205,14 @@ static int ksmbd_vfs_lookup_in_dir(const struct path *dir, char *name,
+ * Return: 0 on success, otherwise error
+ */
+ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
+- unsigned int flags, struct path *path,
+- bool caseless)
++ unsigned int flags, struct path *parent_path,
++ struct path *path, bool caseless)
+ {
+ struct ksmbd_share_config *share_conf = work->tcon->share_conf;
+ int err;
+- struct path parent_path;
+
+- err = ksmbd_vfs_path_lookup_locked(share_conf, name, flags, path);
++ err = ksmbd_vfs_path_lookup_locked(share_conf, name, flags, parent_path,
++ path);
+ if (!err)
+ return err;
+
+@@ -1220,10 +1227,10 @@ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
+ path_len = strlen(filepath);
+ remain_len = path_len;
+
+- parent_path = share_conf->vfs_path;
+- path_get(&parent_path);
++ *parent_path = share_conf->vfs_path;
++ path_get(parent_path);
+
+- while (d_can_lookup(parent_path.dentry)) {
++ while (d_can_lookup(parent_path->dentry)) {
+ char *filename = filepath + path_len - remain_len;
+ char *next = strchrnul(filename, '/');
+ size_t filename_len = next - filename;
+@@ -1232,7 +1239,7 @@ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
+ if (filename_len == 0)
+ break;
+
+- err = ksmbd_vfs_lookup_in_dir(&parent_path, filename,
++ err = ksmbd_vfs_lookup_in_dir(parent_path, filename,
+ filename_len,
+ work->conn->um);
+ if (err)
+@@ -1249,8 +1256,8 @@ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
+ goto out2;
+ else if (is_last)
+ goto out1;
+- path_put(&parent_path);
+- parent_path = *path;
++ path_put(parent_path);
++ *parent_path = *path;
+
+ next[0] = '/';
+ remain_len -= filename_len + 1;
+@@ -1258,16 +1265,17 @@ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
+
+ err = -EINVAL;
+ out2:
+- path_put(&parent_path);
++ path_put(parent_path);
+ out1:
+ kfree(filepath);
+ }
+
+ if (!err) {
+- err = ksmbd_vfs_lock_parent(parent_path.dentry, path->dentry);
+- if (err)
+- dput(path->dentry);
+- path_put(&parent_path);
++ err = ksmbd_vfs_lock_parent(parent_path->dentry, path->dentry);
++ if (err) {
++ path_put(path);
++ path_put(parent_path);
++ }
+ }
+ return err;
+ }
+diff --git a/fs/smb/server/vfs.h b/fs/smb/server/vfs.h
+index 8c0931d4d5310..9df4a2a6776b2 100644
+--- a/fs/smb/server/vfs.h
++++ b/fs/smb/server/vfs.h
+@@ -115,8 +115,8 @@ int ksmbd_vfs_xattr_stream_name(char *stream_name, char **xattr_stream_name,
+ int ksmbd_vfs_remove_xattr(struct mnt_idmap *idmap,
+ const struct path *path, char *attr_name);
+ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
+- unsigned int flags, struct path *path,
+- bool caseless);
++ unsigned int flags, struct path *parent_path,
++ struct path *path, bool caseless);
+ struct dentry *ksmbd_vfs_kern_path_create(struct ksmbd_work *work,
+ const char *name,
+ unsigned int flags,
+diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
+index d54b595a0fe0f..0d678e9a7b248 100644
+--- a/include/linux/dma-fence.h
++++ b/include/linux/dma-fence.h
+@@ -606,7 +606,7 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
+ void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline);
+
+ struct dma_fence *dma_fence_get_stub(void);
+-struct dma_fence *dma_fence_allocate_private_stub(void);
++struct dma_fence *dma_fence_allocate_private_stub(ktime_t timestamp);
+ u64 dma_fence_context_alloc(unsigned num);
+
+ extern const struct dma_fence_ops dma_fence_array_ops;
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index 9e10485f37e7f..d1fd7c544dcd8 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -641,8 +641,14 @@ static inline void vma_numab_state_free(struct vm_area_struct *vma) {}
+ */
+ static inline bool vma_start_read(struct vm_area_struct *vma)
+ {
+- /* Check before locking. A race might cause false locked result. */
+- if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
++ /*
++ * Check before locking. A race might cause false locked result.
++ * We can use READ_ONCE() for the mm_lock_seq here, and don't need
++ * ACQUIRE semantics, because this is just a lockless check whose result
++ * we don't rely on for anything - the mm_lock_seq read against which we
++ * need ordering is below.
++ */
++ if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq))
+ return false;
+
+ if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
+@@ -653,8 +659,13 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
+ * False unlocked result is impossible because we modify and check
+ * vma->vm_lock_seq under vma->vm_lock protection and mm->mm_lock_seq
+ * modification invalidates all existing locks.
++ *
++ * We must use ACQUIRE semantics for the mm_lock_seq so that if we are
++ * racing with vma_end_write_all(), we only start reading from the VMA
++ * after it has been unlocked.
++ * This pairs with RELEASE semantics in vma_end_write_all().
+ */
+- if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
++ if (unlikely(vma->vm_lock_seq == smp_load_acquire(&vma->vm_mm->mm_lock_seq))) {
+ up_read(&vma->vm_lock->lock);
+ return false;
+ }
+@@ -676,7 +687,7 @@ static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock_seq)
+ * current task is holding mmap_write_lock, both vma->vm_lock_seq and
+ * mm->mm_lock_seq can't be concurrently modified.
+ */
+- *mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq);
++ *mm_lock_seq = vma->vm_mm->mm_lock_seq;
+ return (vma->vm_lock_seq == *mm_lock_seq);
+ }
+
+@@ -688,7 +699,13 @@ static inline void vma_start_write(struct vm_area_struct *vma)
+ return;
+
+ down_write(&vma->vm_lock->lock);
+- vma->vm_lock_seq = mm_lock_seq;
++ /*
++ * We should use WRITE_ONCE() here because we can have concurrent reads
++ * from the early lockless pessimistic check in vma_start_read().
++ * We don't really care about the correctness of that early check, but
++ * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy.
++ */
++ WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
+ up_write(&vma->vm_lock->lock);
+ }
+
+@@ -702,7 +719,7 @@ static inline bool vma_try_start_write(struct vm_area_struct *vma)
+ if (!down_write_trylock(&vma->vm_lock->lock))
+ return false;
+
+- vma->vm_lock_seq = mm_lock_seq;
++ WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
+ up_write(&vma->vm_lock->lock);
+ return true;
+ }
+diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
+index de10fc797c8e9..5e74ce4a28cd6 100644
+--- a/include/linux/mm_types.h
++++ b/include/linux/mm_types.h
+@@ -514,6 +514,20 @@ struct vm_area_struct {
+ };
+
+ #ifdef CONFIG_PER_VMA_LOCK
++ /*
++ * Can only be written (using WRITE_ONCE()) while holding both:
++ * - mmap_lock (in write mode)
++ * - vm_lock->lock (in write mode)
++ * Can be read reliably while holding one of:
++ * - mmap_lock (in read or write mode)
++ * - vm_lock->lock (in read or write mode)
++ * Can be read unreliably (using READ_ONCE()) for pessimistic bailout
++ * while holding nothing (except RCU to keep the VMA struct allocated).
++ *
++ * This sequence counter is explicitly allowed to overflow; sequence
++ * counter reuse can only lead to occasional unnecessary use of the
++ * slowpath.
++ */
+ int vm_lock_seq;
+ struct vma_lock *vm_lock;
+
+@@ -679,6 +693,20 @@ struct mm_struct {
+ * by mmlist_lock
+ */
+ #ifdef CONFIG_PER_VMA_LOCK
++ /*
++ * This field has lock-like semantics, meaning it is sometimes
++ * accessed with ACQUIRE/RELEASE semantics.
++ * Roughly speaking, incrementing the sequence number is
++ * equivalent to releasing locks on VMAs; reading the sequence
++ * number can be part of taking a read lock on a VMA.
++ *
++ * Can be modified under write mmap_lock using RELEASE
++ * semantics.
++ * Can be read with no other protection when holding write
++ * mmap_lock.
++ * Can be read with ACQUIRE semantics if not holding write
++ * mmap_lock.
++ */
+ int mm_lock_seq;
+ #endif
+
+diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
+index aab8f1b28d262..e05e167dbd166 100644
+--- a/include/linux/mmap_lock.h
++++ b/include/linux/mmap_lock.h
+@@ -76,8 +76,14 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm)
+ static inline void vma_end_write_all(struct mm_struct *mm)
+ {
+ mmap_assert_write_locked(mm);
+- /* No races during update due to exclusive mmap_lock being held */
+- WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
++ /*
++ * Nobody can concurrently modify mm->mm_lock_seq due to exclusive
++ * mmap_lock being held.
++ * We need RELEASE semantics here to ensure that preceding stores into
++ * the VMA take effect before we unlock it with this store.
++ * Pairs with ACQUIRE semantics in vma_start_read().
++ */
++ smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
+ }
+ #else
+ static inline void vma_end_write_all(struct mm_struct *mm) {}
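The mm hunks above pair a release store in vma_end_write_all() with an acquire load in vma_start_read(): stores into the VMA made before the sequence bump must be visible to any reader that observes the new sequence. A C11-atomics sketch of that publish/observe contract (single-threaded driver, names invented, not kernel code):

    #include <stdatomic.h>
    #include <stdio.h>

    static _Atomic int mm_lock_seq;
    static int protected_data;

    static void writer(void)
    {
        protected_data = 42;        /* store into the "VMA" */
        /* release store: like smp_store_release() in vma_end_write_all() */
        atomic_store_explicit(&mm_lock_seq,
                              atomic_load_explicit(&mm_lock_seq,
                                                   memory_order_relaxed) + 1,
                              memory_order_release);
    }

    static void reader(void)
    {
        /* acquire load: like smp_load_acquire() in vma_start_read() */
        int seq = atomic_load_explicit(&mm_lock_seq, memory_order_acquire);

        if (seq)    /* saw the new sequence => guaranteed to see the data */
            printf("seq=%d data=%d\n", seq, protected_data);
    }

    int main(void)
    {
        writer();
        reader();
        return 0;
    }

The relaxed load inside the writer mirrors the kernel code: the writer holds mmap_lock exclusively, so no atomic read-modify-write is needed, only the ordering of the final store matters.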
+diff --git a/include/net/ipv6.h b/include/net/ipv6.h
+index 7332296eca44b..2acc4c808d45d 100644
+--- a/include/net/ipv6.h
++++ b/include/net/ipv6.h
+@@ -752,12 +752,8 @@ static inline u32 ipv6_addr_hash(const struct in6_addr *a)
+ /* more secured version of ipv6_addr_hash() */
+ static inline u32 __ipv6_addr_jhash(const struct in6_addr *a, const u32 initval)
+ {
+- u32 v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1];
+-
+- return jhash_3words(v,
+- (__force u32)a->s6_addr32[2],
+- (__force u32)a->s6_addr32[3],
+- initval);
++ return jhash2((__force const u32 *)a->s6_addr32,
++ ARRAY_SIZE(a->s6_addr32), initval);
+ }
+
+ static inline bool ipv6_addr_loopback(const struct in6_addr *a)
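The change above hashes all four 32-bit words of the IPv6 address with jhash2() instead of XOR-folding two of them first, so every address bit feeds the hash independently. A toy stand-in with the same call shape (this is not jhash2(), just an illustrative mixer):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    static uint32_t toy_hash_words(const uint32_t *k, unsigned int len,
                                   uint32_t initval)
    {
        uint32_t h = initval;

        for (unsigned int i = 0; i < len; i++) {
            h ^= k[i];
            h *= 0x9e3779b1u;       /* stand-in mixing step */
        }
        return h;
    }

    int main(void)
    {
        uint32_t addr[4] = { 0x20010db8u, 0x0u, 0x0u, 0x1u };  /* 2001:db8::1 */

        printf("%08" PRIx32 "\n", toy_hash_words(addr, 4, 0x12345678u));
        return 0;
    }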
+diff --git a/include/net/vxlan.h b/include/net/vxlan.h
+index 20bd7d893e10a..b57567296bc67 100644
+--- a/include/net/vxlan.h
++++ b/include/net/vxlan.h
+@@ -384,10 +384,15 @@ static inline netdev_features_t vxlan_features_check(struct sk_buff *skb,
+ return features;
+ }
+
+-/* IP header + UDP + VXLAN + Ethernet header */
+-#define VXLAN_HEADROOM (20 + 8 + 8 + 14)
+-/* IPv6 header + UDP + VXLAN + Ethernet header */
+-#define VXLAN6_HEADROOM (40 + 8 + 8 + 14)
++static inline int vxlan_headroom(u32 flags)
++{
++ /* VXLAN: IP4/6 header + UDP + VXLAN + Ethernet header */
++ /* VXLAN-GPE: IP4/6 header + UDP + VXLAN */
++ return (flags & VXLAN_F_IPV6 ? sizeof(struct ipv6hdr) :
++ sizeof(struct iphdr)) +
++ sizeof(struct udphdr) + sizeof(struct vxlanhdr) +
++ (flags & VXLAN_F_GPE ? 0 : ETH_HLEN);
++}
+
+ static inline struct vxlanhdr *vxlan_hdr(struct sk_buff *skb)
+ {
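For reference, the numbers vxlan_headroom() above computes, using the protocol header sizes the old macros hard-coded (20/40 bytes for IPv4/IPv6, 8 for UDP, 8 for VXLAN, 14 for the inner Ethernet header that VXLAN-GPE does not carry). A standalone C arithmetic check:

    #include <stdio.h>

    #define IP4_HDR   20    /* sizeof(struct iphdr)    */
    #define IP6_HDR   40    /* sizeof(struct ipv6hdr)  */
    #define UDP_HDR    8    /* sizeof(struct udphdr)   */
    #define VXLAN_HDR  8    /* sizeof(struct vxlanhdr) */
    #define ETH_LEN   14    /* inner Ethernet header, skipped for GPE */

    int main(void)
    {
        printf("IPv4 VXLAN:     %d\n", IP4_HDR + UDP_HDR + VXLAN_HDR + ETH_LEN); /* 50 */
        printf("IPv6 VXLAN:     %d\n", IP6_HDR + UDP_HDR + VXLAN_HDR + ETH_LEN); /* 70 */
        printf("IPv4 VXLAN-GPE: %d\n", IP4_HDR + UDP_HDR + VXLAN_HDR);           /* 36 */
        printf("IPv6 VXLAN-GPE: %d\n", IP6_HDR + UDP_HDR + VXLAN_HDR);           /* 56 */
        return 0;
    }

The first two values match the old VXLAN_HEADROOM/VXLAN6_HEADROOM constants; the helper adds the GPE-aware cases.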
+diff --git a/include/uapi/linux/blkzoned.h b/include/uapi/linux/blkzoned.h
+index b80fcc9ea5257..f85743ef6e7d1 100644
+--- a/include/uapi/linux/blkzoned.h
++++ b/include/uapi/linux/blkzoned.h
+@@ -51,13 +51,13 @@ enum blk_zone_type {
+ *
+ * The Zone Condition state machine in the ZBC/ZAC standards maps the above
+ * deinitions as:
+- * - ZC1: Empty | BLK_ZONE_EMPTY
++ * - ZC1: Empty | BLK_ZONE_COND_EMPTY
+ * - ZC2: Implicit Open | BLK_ZONE_COND_IMP_OPEN
+ * - ZC3: Explicit Open | BLK_ZONE_COND_EXP_OPEN
+- * - ZC4: Closed | BLK_ZONE_CLOSED
+- * - ZC5: Full | BLK_ZONE_FULL
+- * - ZC6: Read Only | BLK_ZONE_READONLY
+- * - ZC7: Offline | BLK_ZONE_OFFLINE
++ * - ZC4: Closed | BLK_ZONE_COND_CLOSED
++ * - ZC5: Full | BLK_ZONE_COND_FULL
++ * - ZC6: Read Only | BLK_ZONE_COND_READONLY
++ * - ZC7: Offline | BLK_ZONE_COND_OFFLINE
+ *
+ * Conditions 0x5 to 0xC are reserved by the current ZBC/ZAC spec and should
+ * be considered invalid.
+diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
+index d6667b435dd39..2989b81cca82a 100644
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -2579,11 +2579,20 @@ int io_run_task_work_sig(struct io_ring_ctx *ctx)
+ return 0;
+ }
+
++static bool current_pending_io(void)
++{
++ struct io_uring_task *tctx = current->io_uring;
++
++ if (!tctx)
++ return false;
++ return percpu_counter_read_positive(&tctx->inflight);
++}
++
+ /* when returns >0, the caller should retry */
+ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+ struct io_wait_queue *iowq)
+ {
+- int token, ret;
++ int io_wait, ret;
+
+ if (unlikely(READ_ONCE(ctx->check_cq)))
+ return 1;
+@@ -2597,17 +2606,19 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+ return 0;
+
+ /*
+- * Use io_schedule_prepare/finish, so cpufreq can take into account
+- * that the task is waiting for IO - turns out to be important for low
+- * QD IO.
++ * Mark us as being in io_wait if we have pending requests, so cpufreq
++ * can take into account that the task is waiting for IO - turns out
++ * to be important for low QD IO.
+ */
+- token = io_schedule_prepare();
++ io_wait = current->in_iowait;
++ if (current_pending_io())
++ current->in_iowait = 1;
+ ret = 0;
+ if (iowq->timeout == KTIME_MAX)
+ schedule();
+ else if (!schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS))
+ ret = -ETIME;
+- io_schedule_finish(token);
++ current->in_iowait = io_wait;
+ return ret;
+ }
+
+@@ -3859,7 +3870,7 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
+ ctx->syscall_iopoll = 1;
+
+ ctx->compat = in_compat_syscall();
+- if (!capable(CAP_IPC_LOCK))
++ if (!ns_capable_noaudit(&init_user_ns, CAP_IPC_LOCK))
+ ctx->user = get_uid(current_user());
+
+ /*
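The io_uring hunk above replaces io_schedule_prepare()/io_schedule_finish() with a manual save/set/restore of current->in_iowait, setting it only when the task really has inflight requests. The save/restore shape, as a standalone sketch with a made-up task struct (not the kernel's):

    #include <stdbool.h>
    #include <stdio.h>

    struct task { bool in_iowait; };

    static void wait_for_events(struct task *current_task, bool have_pending_io)
    {
        bool saved = current_task->in_iowait;

        if (have_pending_io)
            current_task->in_iowait = true;   /* hint for iowait-style accounting */

        /* ... sleep until woken ... */

        current_task->in_iowait = saved;      /* restore whatever the caller had */
    }

    int main(void)
    {
        struct task t = { .in_iowait = false };

        wait_for_events(&t, true);
        printf("in_iowait restored: %d\n", t.in_iowait);  /* 0 */
        return 0;
    }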
+diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
+index 728f434de2bbf..21db0df0eb000 100644
+--- a/kernel/locking/rtmutex.c
++++ b/kernel/locking/rtmutex.c
+@@ -333,21 +333,43 @@ static __always_inline int __waiter_prio(struct task_struct *task)
+ return prio;
+ }
+
++/*
++ * Update the waiter->tree copy of the sort keys.
++ */
+ static __always_inline void
+ waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
+ {
+- waiter->prio = __waiter_prio(task);
+- waiter->deadline = task->dl.deadline;
++ lockdep_assert_held(&waiter->lock->wait_lock);
++ lockdep_assert(RB_EMPTY_NODE(&waiter->tree.entry));
++
++ waiter->tree.prio = __waiter_prio(task);
++ waiter->tree.deadline = task->dl.deadline;
++}
++
++/*
++ * Update the waiter->pi_tree copy of the sort keys (from the tree copy).
++ */
++static __always_inline void
++waiter_clone_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
++{
++ lockdep_assert_held(&waiter->lock->wait_lock);
++ lockdep_assert_held(&task->pi_lock);
++ lockdep_assert(RB_EMPTY_NODE(&waiter->pi_tree.entry));
++
++ waiter->pi_tree.prio = waiter->tree.prio;
++ waiter->pi_tree.deadline = waiter->tree.deadline;
+ }
+
+ /*
+- * Only use with rt_mutex_waiter_{less,equal}()
++ * Only use with rt_waiter_node_{less,equal}()
+ */
++#define task_to_waiter_node(p) \
++ &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
+ #define task_to_waiter(p) \
+- &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
++ &(struct rt_mutex_waiter){ .tree = *task_to_waiter_node(p) }
+
+-static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
+- struct rt_mutex_waiter *right)
++static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left,
++ struct rt_waiter_node *right)
+ {
+ if (left->prio < right->prio)
+ return 1;
+@@ -364,8 +386,8 @@ static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
+ return 0;
+ }
+
+-static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
+- struct rt_mutex_waiter *right)
++static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
++ struct rt_waiter_node *right)
+ {
+ if (left->prio != right->prio)
+ return 0;
+@@ -385,7 +407,7 @@ static __always_inline int rt_mutex_waiter_equal(struct rt_mutex_waiter *left,
+ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
+ struct rt_mutex_waiter *top_waiter)
+ {
+- if (rt_mutex_waiter_less(waiter, top_waiter))
++ if (rt_waiter_node_less(&waiter->tree, &top_waiter->tree))
+ return true;
+
+ #ifdef RT_MUTEX_BUILD_SPINLOCKS
+@@ -393,30 +415,30 @@ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
+ * Note that RT tasks are excluded from same priority (lateral)
+ * steals to prevent the introduction of an unbounded latency.
+ */
+- if (rt_prio(waiter->prio) || dl_prio(waiter->prio))
++ if (rt_prio(waiter->tree.prio) || dl_prio(waiter->tree.prio))
+ return false;
+
+- return rt_mutex_waiter_equal(waiter, top_waiter);
++ return rt_waiter_node_equal(&waiter->tree, &top_waiter->tree);
+ #else
+ return false;
+ #endif
+ }
+
+ #define __node_2_waiter(node) \
+- rb_entry((node), struct rt_mutex_waiter, tree_entry)
++ rb_entry((node), struct rt_mutex_waiter, tree.entry)
+
+ static __always_inline bool __waiter_less(struct rb_node *a, const struct rb_node *b)
+ {
+ struct rt_mutex_waiter *aw = __node_2_waiter(a);
+ struct rt_mutex_waiter *bw = __node_2_waiter(b);
+
+- if (rt_mutex_waiter_less(aw, bw))
++ if (rt_waiter_node_less(&aw->tree, &bw->tree))
+ return 1;
+
+ if (!build_ww_mutex())
+ return 0;
+
+- if (rt_mutex_waiter_less(bw, aw))
++ if (rt_waiter_node_less(&bw->tree, &aw->tree))
+ return 0;
+
+ /* NOTE: relies on waiter->ww_ctx being set before insertion */
+@@ -434,48 +456,58 @@ static __always_inline bool __waiter_less(struct rb_node *a, const struct rb_nod
+ static __always_inline void
+ rt_mutex_enqueue(struct rt_mutex_base *lock, struct rt_mutex_waiter *waiter)
+ {
+- rb_add_cached(&waiter->tree_entry, &lock->waiters, __waiter_less);
++ lockdep_assert_held(&lock->wait_lock);
++
++ rb_add_cached(&waiter->tree.entry, &lock->waiters, __waiter_less);
+ }
+
+ static __always_inline void
+ rt_mutex_dequeue(struct rt_mutex_base *lock, struct rt_mutex_waiter *waiter)
+ {
+- if (RB_EMPTY_NODE(&waiter->tree_entry))
++ lockdep_assert_held(&lock->wait_lock);
++
++ if (RB_EMPTY_NODE(&waiter->tree.entry))
+ return;
+
+- rb_erase_cached(&waiter->tree_entry, &lock->waiters);
+- RB_CLEAR_NODE(&waiter->tree_entry);
++ rb_erase_cached(&waiter->tree.entry, &lock->waiters);
++ RB_CLEAR_NODE(&waiter->tree.entry);
+ }
+
+-#define __node_2_pi_waiter(node) \
+- rb_entry((node), struct rt_mutex_waiter, pi_tree_entry)
++#define __node_2_rt_node(node) \
++ rb_entry((node), struct rt_waiter_node, entry)
+
+-static __always_inline bool
+-__pi_waiter_less(struct rb_node *a, const struct rb_node *b)
++static __always_inline bool __pi_waiter_less(struct rb_node *a, const struct rb_node *b)
+ {
+- return rt_mutex_waiter_less(__node_2_pi_waiter(a), __node_2_pi_waiter(b));
++ return rt_waiter_node_less(__node_2_rt_node(a), __node_2_rt_node(b));
+ }
+
+ static __always_inline void
+ rt_mutex_enqueue_pi(struct task_struct *task, struct rt_mutex_waiter *waiter)
+ {
+- rb_add_cached(&waiter->pi_tree_entry, &task->pi_waiters, __pi_waiter_less);
++ lockdep_assert_held(&task->pi_lock);
++
++ rb_add_cached(&waiter->pi_tree.entry, &task->pi_waiters, __pi_waiter_less);
+ }
+
+ static __always_inline void
+ rt_mutex_dequeue_pi(struct task_struct *task, struct rt_mutex_waiter *waiter)
+ {
+- if (RB_EMPTY_NODE(&waiter->pi_tree_entry))
++ lockdep_assert_held(&task->pi_lock);
++
++ if (RB_EMPTY_NODE(&waiter->pi_tree.entry))
+ return;
+
+- rb_erase_cached(&waiter->pi_tree_entry, &task->pi_waiters);
+- RB_CLEAR_NODE(&waiter->pi_tree_entry);
++ rb_erase_cached(&waiter->pi_tree.entry, &task->pi_waiters);
++ RB_CLEAR_NODE(&waiter->pi_tree.entry);
+ }
+
+-static __always_inline void rt_mutex_adjust_prio(struct task_struct *p)
++static __always_inline void rt_mutex_adjust_prio(struct rt_mutex_base *lock,
++ struct task_struct *p)
+ {
+ struct task_struct *pi_task = NULL;
+
++ lockdep_assert_held(&lock->wait_lock);
++ lockdep_assert(rt_mutex_owner(lock) == p);
+ lockdep_assert_held(&p->pi_lock);
+
+ if (task_has_pi_waiters(p))
+@@ -571,9 +603,14 @@ static __always_inline struct rt_mutex_base *task_blocked_on_lock(struct task_st
+ * Chain walk basics and protection scope
+ *
+ * [R] refcount on task
+- * [P] task->pi_lock held
++ * [Pn] task->pi_lock held
+ * [L] rtmutex->wait_lock held
+ *
++ * Normal locking order:
++ *
++ * rtmutex->wait_lock
++ * task->pi_lock
++ *
+ * Step Description Protected by
+ * function arguments:
+ * @task [R]
+@@ -588,27 +625,32 @@ static __always_inline struct rt_mutex_base *task_blocked_on_lock(struct task_st
+ * again:
+ * loop_sanity_check();
+ * retry:
+- * [1] lock(task->pi_lock); [R] acquire [P]
+- * [2] waiter = task->pi_blocked_on; [P]
+- * [3] check_exit_conditions_1(); [P]
+- * [4] lock = waiter->lock; [P]
+- * [5] if (!try_lock(lock->wait_lock)) { [P] try to acquire [L]
+- * unlock(task->pi_lock); release [P]
++ * [1] lock(task->pi_lock); [R] acquire [P1]
++ * [2] waiter = task->pi_blocked_on; [P1]
++ * [3] check_exit_conditions_1(); [P1]
++ * [4] lock = waiter->lock; [P1]
++ * [5] if (!try_lock(lock->wait_lock)) { [P1] try to acquire [L]
++ * unlock(task->pi_lock); release [P1]
+ * goto retry;
+ * }
+- * [6] check_exit_conditions_2(); [P] + [L]
+- * [7] requeue_lock_waiter(lock, waiter); [P] + [L]
+- * [8] unlock(task->pi_lock); release [P]
++ * [6] check_exit_conditions_2(); [P1] + [L]
++ * [7] requeue_lock_waiter(lock, waiter); [P1] + [L]
++ * [8] unlock(task->pi_lock); release [P1]
+ * put_task_struct(task); release [R]
+ * [9] check_exit_conditions_3(); [L]
+ * [10] task = owner(lock); [L]
+ * get_task_struct(task); [L] acquire [R]
+- * lock(task->pi_lock); [L] acquire [P]
+- * [11] requeue_pi_waiter(tsk, waiters(lock));[P] + [L]
+- * [12] check_exit_conditions_4(); [P] + [L]
+- * [13] unlock(task->pi_lock); release [P]
++ * lock(task->pi_lock); [L] acquire [P2]
++ * [11] requeue_pi_waiter(tsk, waiters(lock));[P2] + [L]
++ * [12] check_exit_conditions_4(); [P2] + [L]
++ * [13] unlock(task->pi_lock); release [P2]
+ * unlock(lock->wait_lock); release [L]
+ * goto again;
++ *
++ * Where P1 is the blocking task and P2 is the lock owner; going up one step
++ * the owner becomes the next blocked task etc..
++ *
++*
+ */
+ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
+ enum rtmutex_chainwalk chwalk,
+@@ -756,7 +798,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
+ * enabled we continue, but stop the requeueing in the chain
+ * walk.
+ */
+- if (rt_mutex_waiter_equal(waiter, task_to_waiter(task))) {
++ if (rt_waiter_node_equal(&waiter->tree, task_to_waiter_node(task))) {
+ if (!detect_deadlock)
+ goto out_unlock_pi;
+ else
+@@ -764,13 +806,18 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
+ }
+
+ /*
+- * [4] Get the next lock
++ * [4] Get the next lock; per holding task->pi_lock we can't unblock
++ * and guarantee @lock's existence.
+ */
+ lock = waiter->lock;
+ /*
+ * [5] We need to trylock here as we are holding task->pi_lock,
+ * which is the reverse lock order versus the other rtmutex
+ * operations.
++ *
++ * Per the above, holding task->pi_lock guarantees lock exists, so
++ * inverting this lock order is infeasible from a life-time
++ * perspective.
+ */
+ if (!raw_spin_trylock(&lock->wait_lock)) {
+ raw_spin_unlock_irq(&task->pi_lock);
+@@ -874,17 +921,18 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
+ * or
+ *
+ * DL CBS enforcement advancing the effective deadline.
+- *
+- * Even though pi_waiters also uses these fields, and that tree is only
+- * updated in [11], we can do this here, since we hold [L], which
+- * serializes all pi_waiters access and rb_erase() does not care about
+- * the values of the node being removed.
+ */
+ waiter_update_prio(waiter, task);
+
+ rt_mutex_enqueue(lock, waiter);
+
+- /* [8] Release the task */
++ /*
++ * [8] Release the (blocking) task in preparation for
++ * taking the owner task in [10].
++ *
++ * Since we hold lock->waiter_lock, task cannot unblock, even if we
++ * release task->pi_lock.
++ */
+ raw_spin_unlock(&task->pi_lock);
+ put_task_struct(task);
+
+@@ -908,7 +956,12 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
+ return 0;
+ }
+
+- /* [10] Grab the next task, i.e. the owner of @lock */
++ /*
++ * [10] Grab the next task, i.e. the owner of @lock
++ *
++ * Per holding lock->wait_lock and checking for !owner above, there
++ * must be an owner and it cannot go away.
++ */
+ task = get_task_struct(rt_mutex_owner(lock));
+ raw_spin_lock(&task->pi_lock);
+
+@@ -921,8 +974,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
+ * and adjust the priority of the owner.
+ */
+ rt_mutex_dequeue_pi(task, prerequeue_top_waiter);
++ waiter_clone_prio(waiter, task);
+ rt_mutex_enqueue_pi(task, waiter);
+- rt_mutex_adjust_prio(task);
++ rt_mutex_adjust_prio(lock, task);
+
+ } else if (prerequeue_top_waiter == waiter) {
+ /*
+@@ -937,8 +991,9 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
+ */
+ rt_mutex_dequeue_pi(task, waiter);
+ waiter = rt_mutex_top_waiter(lock);
++ waiter_clone_prio(waiter, task);
+ rt_mutex_enqueue_pi(task, waiter);
+- rt_mutex_adjust_prio(task);
++ rt_mutex_adjust_prio(lock, task);
+ } else {
+ /*
+ * Nothing changed. No need to do any priority
+@@ -1154,6 +1209,7 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock,
+ waiter->task = task;
+ waiter->lock = lock;
+ waiter_update_prio(waiter, task);
++ waiter_clone_prio(waiter, task);
+
+ /* Get the top priority waiter on the lock */
+ if (rt_mutex_has_waiters(lock))
+@@ -1187,7 +1243,7 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock,
+ rt_mutex_dequeue_pi(owner, top_waiter);
+ rt_mutex_enqueue_pi(owner, waiter);
+
+- rt_mutex_adjust_prio(owner);
++ rt_mutex_adjust_prio(lock, owner);
+ if (owner->pi_blocked_on)
+ chain_walk = 1;
+ } else if (rt_mutex_cond_detect_deadlock(waiter, chwalk)) {
+@@ -1234,6 +1290,8 @@ static void __sched mark_wakeup_next_waiter(struct rt_wake_q_head *wqh,
+ {
+ struct rt_mutex_waiter *waiter;
+
++ lockdep_assert_held(&lock->wait_lock);
++
+ raw_spin_lock(&current->pi_lock);
+
+ waiter = rt_mutex_top_waiter(lock);
+@@ -1246,7 +1304,7 @@ static void __sched mark_wakeup_next_waiter(struct rt_wake_q_head *wqh,
+ * task unblocks.
+ */
+ rt_mutex_dequeue_pi(current, waiter);
+- rt_mutex_adjust_prio(current);
++ rt_mutex_adjust_prio(lock, current);
+
+ /*
+ * As we are waking up the top waiter, and the waiter stays
+@@ -1482,7 +1540,7 @@ static void __sched remove_waiter(struct rt_mutex_base *lock,
+ if (rt_mutex_has_waiters(lock))
+ rt_mutex_enqueue_pi(owner, rt_mutex_top_waiter(lock));
+
+- rt_mutex_adjust_prio(owner);
++ rt_mutex_adjust_prio(lock, owner);
+
+ /* Store the lock on which owner is blocked or NULL */
+ next_lock = task_blocked_on_lock(owner);
+diff --git a/kernel/locking/rtmutex_api.c b/kernel/locking/rtmutex_api.c
+index cb9fdff76a8a3..a6974d0445930 100644
+--- a/kernel/locking/rtmutex_api.c
++++ b/kernel/locking/rtmutex_api.c
+@@ -459,7 +459,7 @@ void __sched rt_mutex_adjust_pi(struct task_struct *task)
+ raw_spin_lock_irqsave(&task->pi_lock, flags);
+
+ waiter = task->pi_blocked_on;
+- if (!waiter || rt_mutex_waiter_equal(waiter, task_to_waiter(task))) {
++ if (!waiter || rt_waiter_node_equal(&waiter->tree, task_to_waiter_node(task))) {
+ raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+ return;
+ }
+diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
+index c47e8361bfb5c..1162e07cdaea1 100644
+--- a/kernel/locking/rtmutex_common.h
++++ b/kernel/locking/rtmutex_common.h
+@@ -17,27 +17,44 @@
+ #include <linux/rtmutex.h>
+ #include <linux/sched/wake_q.h>
+
++
++/*
++ * This is a helper for the struct rt_mutex_waiter below. A waiter goes in two
++ * separate trees and they need their own copy of the sort keys because of
++ * different locking requirements.
++ *
++ * @entry: rbtree node to enqueue into the waiters tree
++ * @prio: Priority of the waiter
++ * @deadline: Deadline of the waiter if applicable
++ *
++ * See rt_waiter_node_less() and waiter_*_prio().
++ */
++struct rt_waiter_node {
++ struct rb_node entry;
++ int prio;
++ u64 deadline;
++};
++
+ /*
+ * This is the control structure for tasks blocked on a rt_mutex,
+ * which is allocated on the kernel stack on of the blocked task.
+ *
+- * @tree_entry: pi node to enqueue into the mutex waiters tree
+- * @pi_tree_entry: pi node to enqueue into the mutex owner waiters tree
++ * @tree: node to enqueue into the mutex waiters tree
++ * @pi_tree: node to enqueue into the mutex owner waiters tree
+ * @task: task reference to the blocked task
+ * @lock: Pointer to the rt_mutex on which the waiter blocks
+ * @wake_state: Wakeup state to use (TASK_NORMAL or TASK_RTLOCK_WAIT)
+- * @prio: Priority of the waiter
+- * @deadline: Deadline of the waiter if applicable
+ * @ww_ctx: WW context pointer
++ *
++ * @tree is ordered by @lock->wait_lock
++ * @pi_tree is ordered by rt_mutex_owner(@lock)->pi_lock
+ */
+ struct rt_mutex_waiter {
+- struct rb_node tree_entry;
+- struct rb_node pi_tree_entry;
++ struct rt_waiter_node tree;
++ struct rt_waiter_node pi_tree;
+ struct task_struct *task;
+ struct rt_mutex_base *lock;
+ unsigned int wake_state;
+- int prio;
+- u64 deadline;
+ struct ww_acquire_ctx *ww_ctx;
+ };
+
+@@ -105,7 +122,7 @@ static inline bool rt_mutex_waiter_is_top_waiter(struct rt_mutex_base *lock,
+ {
+ struct rb_node *leftmost = rb_first_cached(&lock->waiters);
+
+- return rb_entry(leftmost, struct rt_mutex_waiter, tree_entry) == waiter;
++ return rb_entry(leftmost, struct rt_mutex_waiter, tree.entry) == waiter;
+ }
+
+ static inline struct rt_mutex_waiter *rt_mutex_top_waiter(struct rt_mutex_base *lock)
+@@ -113,8 +130,10 @@ static inline struct rt_mutex_waiter *rt_mutex_top_waiter(struct rt_mutex_base *
+ struct rb_node *leftmost = rb_first_cached(&lock->waiters);
+ struct rt_mutex_waiter *w = NULL;
+
++ lockdep_assert_held(&lock->wait_lock);
++
+ if (leftmost) {
+- w = rb_entry(leftmost, struct rt_mutex_waiter, tree_entry);
++ w = rb_entry(leftmost, struct rt_mutex_waiter, tree.entry);
+ BUG_ON(w->lock != lock);
+ }
+ return w;
+@@ -127,8 +146,10 @@ static inline int task_has_pi_waiters(struct task_struct *p)
+
+ static inline struct rt_mutex_waiter *task_top_pi_waiter(struct task_struct *p)
+ {
++ lockdep_assert_held(&p->pi_lock);
++
+ return rb_entry(p->pi_waiters.rb_leftmost, struct rt_mutex_waiter,
+- pi_tree_entry);
++ pi_tree.entry);
+ }
+
+ #define RT_MUTEX_HAS_WAITERS 1UL
+@@ -190,8 +211,8 @@ static inline void debug_rt_mutex_free_waiter(struct rt_mutex_waiter *waiter)
+ static inline void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter)
+ {
+ debug_rt_mutex_init_waiter(waiter);
+- RB_CLEAR_NODE(&waiter->pi_tree_entry);
+- RB_CLEAR_NODE(&waiter->tree_entry);
++ RB_CLEAR_NODE(&waiter->pi_tree.entry);
++ RB_CLEAR_NODE(&waiter->tree.entry);
+ waiter->wake_state = TASK_NORMAL;
+ waiter->task = NULL;
+ }
+diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
+index 56f139201f246..3ad2cc4823e59 100644
+--- a/kernel/locking/ww_mutex.h
++++ b/kernel/locking/ww_mutex.h
+@@ -96,25 +96,25 @@ __ww_waiter_first(struct rt_mutex *lock)
+ struct rb_node *n = rb_first(&lock->rtmutex.waiters.rb_root);
+ if (!n)
+ return NULL;
+- return rb_entry(n, struct rt_mutex_waiter, tree_entry);
++ return rb_entry(n, struct rt_mutex_waiter, tree.entry);
+ }
+
+ static inline struct rt_mutex_waiter *
+ __ww_waiter_next(struct rt_mutex *lock, struct rt_mutex_waiter *w)
+ {
+- struct rb_node *n = rb_next(&w->tree_entry);
++ struct rb_node *n = rb_next(&w->tree.entry);
+ if (!n)
+ return NULL;
+- return rb_entry(n, struct rt_mutex_waiter, tree_entry);
++ return rb_entry(n, struct rt_mutex_waiter, tree.entry);
+ }
+
+ static inline struct rt_mutex_waiter *
+ __ww_waiter_prev(struct rt_mutex *lock, struct rt_mutex_waiter *w)
+ {
+- struct rb_node *n = rb_prev(&w->tree_entry);
++ struct rb_node *n = rb_prev(&w->tree.entry);
+ if (!n)
+ return NULL;
+- return rb_entry(n, struct rt_mutex_waiter, tree_entry);
++ return rb_entry(n, struct rt_mutex_waiter, tree.entry);
+ }
+
+ static inline struct rt_mutex_waiter *
+@@ -123,7 +123,7 @@ __ww_waiter_last(struct rt_mutex *lock)
+ struct rb_node *n = rb_last(&lock->rtmutex.waiters.rb_root);
+ if (!n)
+ return NULL;
+- return rb_entry(n, struct rt_mutex_waiter, tree_entry);
++ return rb_entry(n, struct rt_mutex_waiter, tree.entry);
+ }
+
+ static inline void
+diff --git a/kernel/signal.c b/kernel/signal.c
+index 2547fa73bde51..1b39cba7dfd38 100644
+--- a/kernel/signal.c
++++ b/kernel/signal.c
+@@ -561,6 +561,10 @@ bool unhandled_signal(struct task_struct *tsk, int sig)
+ if (handler != SIG_IGN && handler != SIG_DFL)
+ return false;
+
++ /* If dying, we handle all new signals by ignoring them */
++ if (fatal_signal_pending(tsk))
++ return false;
++
+ /* if ptraced, let the tracer determine */
+ return !tsk->ptrace;
+ }
+diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
+index 14d8001140c82..99634b29a8b82 100644
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -523,6 +523,8 @@ struct ring_buffer_per_cpu {
+ rb_time_t before_stamp;
+ u64 event_stamp[MAX_NEST];
+ u64 read_stamp;
++ /* pages removed since last reset */
++ unsigned long pages_removed;
+ /* ring buffer pages to update, > 0 to add, < 0 to remove */
+ long nr_pages_to_update;
+ struct list_head new_pages; /* new pages to add */
+@@ -558,6 +560,7 @@ struct ring_buffer_iter {
+ struct buffer_page *head_page;
+ struct buffer_page *cache_reader_page;
+ unsigned long cache_read;
++ unsigned long cache_pages_removed;
+ u64 read_stamp;
+ u64 page_stamp;
+ struct ring_buffer_event *event;
+@@ -1956,6 +1959,8 @@ rb_remove_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned long nr_pages)
+ to_remove = rb_list_head(to_remove)->next;
+ head_bit |= (unsigned long)to_remove & RB_PAGE_HEAD;
+ }
++ /* Read iterators need to reset themselves when some pages removed */
++ cpu_buffer->pages_removed += nr_removed;
+
+ next_page = rb_list_head(to_remove)->next;
+
+@@ -1977,12 +1982,6 @@ rb_remove_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned long nr_pages)
+ cpu_buffer->head_page = list_entry(next_page,
+ struct buffer_page, list);
+
+- /*
+- * change read pointer to make sure any read iterators reset
+- * themselves
+- */
+- cpu_buffer->read = 0;
+-
+ /* pages are removed, resume tracing and then free the pages */
+ atomic_dec(&cpu_buffer->record_disabled);
+ raw_spin_unlock_irq(&cpu_buffer->reader_lock);
+@@ -4392,6 +4391,7 @@ static void rb_iter_reset(struct ring_buffer_iter *iter)
+
+ iter->cache_reader_page = iter->head_page;
+ iter->cache_read = cpu_buffer->read;
++ iter->cache_pages_removed = cpu_buffer->pages_removed;
+
+ if (iter->head) {
+ iter->read_stamp = cpu_buffer->read_stamp;
+@@ -4846,12 +4846,13 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
+ buffer = cpu_buffer->buffer;
+
+ /*
+- * Check if someone performed a consuming read to
+- * the buffer. A consuming read invalidates the iterator
+- * and we need to reset the iterator in this case.
++ * Check if someone performed a consuming read to the buffer
++ * or removed some pages from the buffer. In these cases,
++ * iterator was invalidated and we need to reset it.
+ */
+ if (unlikely(iter->cache_read != cpu_buffer->read ||
+- iter->cache_reader_page != cpu_buffer->reader_page))
++ iter->cache_reader_page != cpu_buffer->reader_page ||
++ iter->cache_pages_removed != cpu_buffer->pages_removed))
+ rb_iter_reset(iter);
+
+ again:
+@@ -5295,6 +5296,7 @@ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer)
+ cpu_buffer->last_overrun = 0;
+
+ rb_head_page_activate(cpu_buffer);
++ cpu_buffer->pages_removed = 0;
+ }
+
+ /* Must have disabled the cpu buffer then done a synchronize_rcu */
+diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
+index 57e539d479890..32f39eabc0716 100644
+--- a/kernel/trace/trace_events.c
++++ b/kernel/trace/trace_events.c
+@@ -611,7 +611,6 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file,
+ {
+ struct trace_event_call *call = file->event_call;
+ struct trace_array *tr = file->tr;
+- unsigned long file_flags = file->flags;
+ int ret = 0;
+ int disable;
+
+@@ -635,6 +634,8 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file,
+ break;
+ disable = file->flags & EVENT_FILE_FL_SOFT_DISABLED;
+ clear_bit(EVENT_FILE_FL_SOFT_MODE_BIT, &file->flags);
++ /* Disable use of trace_buffered_event */
++ trace_buffered_event_disable();
+ } else
+ disable = !(file->flags & EVENT_FILE_FL_SOFT_MODE);
+
+@@ -673,6 +674,8 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file,
+ if (atomic_inc_return(&file->sm_ref) > 1)
+ break;
+ set_bit(EVENT_FILE_FL_SOFT_MODE_BIT, &file->flags);
++ /* Enable use of trace_buffered_event */
++ trace_buffered_event_enable();
+ }
+
+ if (!(file->flags & EVENT_FILE_FL_ENABLED)) {
+@@ -712,15 +715,6 @@ static int __ftrace_event_enable_disable(struct trace_event_file *file,
+ break;
+ }
+
+- /* Enable or disable use of trace_buffered_event */
+- if ((file_flags & EVENT_FILE_FL_SOFT_DISABLED) !=
+- (file->flags & EVENT_FILE_FL_SOFT_DISABLED)) {
+- if (file->flags & EVENT_FILE_FL_SOFT_DISABLED)
+- trace_buffered_event_enable();
+- else
+- trace_buffered_event_disable();
+- }
+-
+ return ret;
+ }
+
+diff --git a/lib/test_maple_tree.c b/lib/test_maple_tree.c
+index f1db333270e9f..fad668042f3e7 100644
+--- a/lib/test_maple_tree.c
++++ b/lib/test_maple_tree.c
+@@ -30,54 +30,54 @@
+ #else
+ #define cond_resched() do {} while (0)
+ #endif
+-static
+-int mtree_insert_index(struct maple_tree *mt, unsigned long index, gfp_t gfp)
++static int __init mtree_insert_index(struct maple_tree *mt,
++ unsigned long index, gfp_t gfp)
+ {
+ return mtree_insert(mt, index, xa_mk_value(index & LONG_MAX), gfp);
+ }
+
+-static void mtree_erase_index(struct maple_tree *mt, unsigned long index)
++static void __init mtree_erase_index(struct maple_tree *mt, unsigned long index)
+ {
+ MT_BUG_ON(mt, mtree_erase(mt, index) != xa_mk_value(index & LONG_MAX));
+ MT_BUG_ON(mt, mtree_load(mt, index) != NULL);
+ }
+
+-static int mtree_test_insert(struct maple_tree *mt, unsigned long index,
++static int __init mtree_test_insert(struct maple_tree *mt, unsigned long index,
+ void *ptr)
+ {
+ return mtree_insert(mt, index, ptr, GFP_KERNEL);
+ }
+
+-static int mtree_test_store_range(struct maple_tree *mt, unsigned long start,
+- unsigned long end, void *ptr)
++static int __init mtree_test_store_range(struct maple_tree *mt,
++ unsigned long start, unsigned long end, void *ptr)
+ {
+ return mtree_store_range(mt, start, end, ptr, GFP_KERNEL);
+ }
+
+-static int mtree_test_store(struct maple_tree *mt, unsigned long start,
++static int __init mtree_test_store(struct maple_tree *mt, unsigned long start,
+ void *ptr)
+ {
+ return mtree_test_store_range(mt, start, start, ptr);
+ }
+
+-static int mtree_test_insert_range(struct maple_tree *mt, unsigned long start,
+- unsigned long end, void *ptr)
++static int __init mtree_test_insert_range(struct maple_tree *mt,
++ unsigned long start, unsigned long end, void *ptr)
+ {
+ return mtree_insert_range(mt, start, end, ptr, GFP_KERNEL);
+ }
+
+-static void *mtree_test_load(struct maple_tree *mt, unsigned long index)
++static void __init *mtree_test_load(struct maple_tree *mt, unsigned long index)
+ {
+ return mtree_load(mt, index);
+ }
+
+-static void *mtree_test_erase(struct maple_tree *mt, unsigned long index)
++static void __init *mtree_test_erase(struct maple_tree *mt, unsigned long index)
+ {
+ return mtree_erase(mt, index);
+ }
+
+ #if defined(CONFIG_64BIT)
+-static noinline void check_mtree_alloc_range(struct maple_tree *mt,
++static noinline void __init check_mtree_alloc_range(struct maple_tree *mt,
+ unsigned long start, unsigned long end, unsigned long size,
+ unsigned long expected, int eret, void *ptr)
+ {
+@@ -94,7 +94,7 @@ static noinline void check_mtree_alloc_range(struct maple_tree *mt,
+ MT_BUG_ON(mt, result != expected);
+ }
+
+-static noinline void check_mtree_alloc_rrange(struct maple_tree *mt,
++static noinline void __init check_mtree_alloc_rrange(struct maple_tree *mt,
+ unsigned long start, unsigned long end, unsigned long size,
+ unsigned long expected, int eret, void *ptr)
+ {
+@@ -112,8 +112,8 @@ static noinline void check_mtree_alloc_rrange(struct maple_tree *mt,
+ }
+ #endif
+
+-static noinline void check_load(struct maple_tree *mt, unsigned long index,
+- void *ptr)
++static noinline void __init check_load(struct maple_tree *mt,
++ unsigned long index, void *ptr)
+ {
+ void *ret = mtree_test_load(mt, index);
+
+@@ -122,7 +122,7 @@ static noinline void check_load(struct maple_tree *mt, unsigned long index,
+ MT_BUG_ON(mt, ret != ptr);
+ }
+
+-static noinline void check_store_range(struct maple_tree *mt,
++static noinline void __init check_store_range(struct maple_tree *mt,
+ unsigned long start, unsigned long end, void *ptr, int expected)
+ {
+ int ret = -EINVAL;
+@@ -138,7 +138,7 @@ static noinline void check_store_range(struct maple_tree *mt,
+ check_load(mt, i, ptr);
+ }
+
+-static noinline void check_insert_range(struct maple_tree *mt,
++static noinline void __init check_insert_range(struct maple_tree *mt,
+ unsigned long start, unsigned long end, void *ptr, int expected)
+ {
+ int ret = -EINVAL;
+@@ -154,8 +154,8 @@ static noinline void check_insert_range(struct maple_tree *mt,
+ check_load(mt, i, ptr);
+ }
+
+-static noinline void check_insert(struct maple_tree *mt, unsigned long index,
+- void *ptr)
++static noinline void __init check_insert(struct maple_tree *mt,
++ unsigned long index, void *ptr)
+ {
+ int ret = -EINVAL;
+
+@@ -163,7 +163,7 @@ static noinline void check_insert(struct maple_tree *mt, unsigned long index,
+ MT_BUG_ON(mt, ret != 0);
+ }
+
+-static noinline void check_dup_insert(struct maple_tree *mt,
++static noinline void __init check_dup_insert(struct maple_tree *mt,
+ unsigned long index, void *ptr)
+ {
+ int ret = -EINVAL;
+@@ -173,13 +173,13 @@ static noinline void check_dup_insert(struct maple_tree *mt,
+ }
+
+
+-static noinline
+-void check_index_load(struct maple_tree *mt, unsigned long index)
++static noinline void __init check_index_load(struct maple_tree *mt,
++ unsigned long index)
+ {
+ return check_load(mt, index, xa_mk_value(index & LONG_MAX));
+ }
+
+-static inline int not_empty(struct maple_node *node)
++static inline __init int not_empty(struct maple_node *node)
+ {
+ int i;
+
+@@ -194,8 +194,8 @@ static inline int not_empty(struct maple_node *node)
+ }
+
+
+-static noinline void check_rev_seq(struct maple_tree *mt, unsigned long max,
+- bool verbose)
++static noinline void __init check_rev_seq(struct maple_tree *mt,
++ unsigned long max, bool verbose)
+ {
+ unsigned long i = max, j;
+
+@@ -227,7 +227,7 @@ static noinline void check_rev_seq(struct maple_tree *mt, unsigned long max,
+ #endif
+ }
+
+-static noinline void check_seq(struct maple_tree *mt, unsigned long max,
++static noinline void __init check_seq(struct maple_tree *mt, unsigned long max,
+ bool verbose)
+ {
+ unsigned long i, j;
+@@ -256,7 +256,7 @@ static noinline void check_seq(struct maple_tree *mt, unsigned long max,
+ #endif
+ }
+
+-static noinline void check_lb_not_empty(struct maple_tree *mt)
++static noinline void __init check_lb_not_empty(struct maple_tree *mt)
+ {
+ unsigned long i, j;
+ unsigned long huge = 4000UL * 1000 * 1000;
+@@ -275,13 +275,13 @@ static noinline void check_lb_not_empty(struct maple_tree *mt)
+ mtree_destroy(mt);
+ }
+
+-static noinline void check_lower_bound_split(struct maple_tree *mt)
++static noinline void __init check_lower_bound_split(struct maple_tree *mt)
+ {
+ MT_BUG_ON(mt, !mtree_empty(mt));
+ check_lb_not_empty(mt);
+ }
+
+-static noinline void check_upper_bound_split(struct maple_tree *mt)
++static noinline void __init check_upper_bound_split(struct maple_tree *mt)
+ {
+ unsigned long i, j;
+ unsigned long huge;
+@@ -306,7 +306,7 @@ static noinline void check_upper_bound_split(struct maple_tree *mt)
+ mtree_destroy(mt);
+ }
+
+-static noinline void check_mid_split(struct maple_tree *mt)
++static noinline void __init check_mid_split(struct maple_tree *mt)
+ {
+ unsigned long huge = 8000UL * 1000 * 1000;
+
+@@ -315,7 +315,7 @@ static noinline void check_mid_split(struct maple_tree *mt)
+ check_lb_not_empty(mt);
+ }
+
+-static noinline void check_rev_find(struct maple_tree *mt)
++static noinline void __init check_rev_find(struct maple_tree *mt)
+ {
+ int i, nr_entries = 200;
+ void *val;
+@@ -354,7 +354,7 @@ static noinline void check_rev_find(struct maple_tree *mt)
+ rcu_read_unlock();
+ }
+
+-static noinline void check_find(struct maple_tree *mt)
++static noinline void __init check_find(struct maple_tree *mt)
+ {
+ unsigned long val = 0;
+ unsigned long count;
+@@ -571,7 +571,7 @@ static noinline void check_find(struct maple_tree *mt)
+ mtree_destroy(mt);
+ }
+
+-static noinline void check_find_2(struct maple_tree *mt)
++static noinline void __init check_find_2(struct maple_tree *mt)
+ {
+ unsigned long i, j;
+ void *entry;
+@@ -616,7 +616,7 @@ static noinline void check_find_2(struct maple_tree *mt)
+
+
+ #if defined(CONFIG_64BIT)
+-static noinline void check_alloc_rev_range(struct maple_tree *mt)
++static noinline void __init check_alloc_rev_range(struct maple_tree *mt)
+ {
+ /*
+ * Generated by:
+@@ -624,7 +624,7 @@ static noinline void check_alloc_rev_range(struct maple_tree *mt)
+ * awk -F "-" '{printf "0x%s, 0x%s, ", $1, $2}'
+ */
+
+- unsigned long range[] = {
++ static const unsigned long range[] = {
+ /* Inclusive , Exclusive. */
+ 0x565234af2000, 0x565234af4000,
+ 0x565234af4000, 0x565234af9000,
+@@ -652,7 +652,7 @@ static noinline void check_alloc_rev_range(struct maple_tree *mt)
+ 0x7fff58791000, 0x7fff58793000,
+ };
+
+- unsigned long holes[] = {
++ static const unsigned long holes[] = {
+ /*
+ * Note: start of hole is INCLUSIVE
+ * end of hole is EXCLUSIVE
+@@ -672,7 +672,7 @@ static noinline void check_alloc_rev_range(struct maple_tree *mt)
+ * 4. number that should be returned.
+ * 5. return value
+ */
+- unsigned long req_range[] = {
++ static const unsigned long req_range[] = {
+ 0x565234af9000, /* Min */
+ 0x7fff58791000, /* Max */
+ 0x1000, /* Size */
+@@ -783,7 +783,7 @@ static noinline void check_alloc_rev_range(struct maple_tree *mt)
+ mtree_destroy(mt);
+ }
+
+-static noinline void check_alloc_range(struct maple_tree *mt)
++static noinline void __init check_alloc_range(struct maple_tree *mt)
+ {
+ /*
+ * Generated by:
+@@ -791,7 +791,7 @@ static noinline void check_alloc_range(struct maple_tree *mt)
+ * awk -F "-" '{printf "0x%s, 0x%s, ", $1, $2}'
+ */
+
+- unsigned long range[] = {
++ static const unsigned long range[] = {
+ /* Inclusive , Exclusive. */
+ 0x565234af2000, 0x565234af4000,
+ 0x565234af4000, 0x565234af9000,
+@@ -818,7 +818,7 @@ static noinline void check_alloc_range(struct maple_tree *mt)
+ 0x7fff5878e000, 0x7fff58791000,
+ 0x7fff58791000, 0x7fff58793000,
+ };
+- unsigned long holes[] = {
++ static const unsigned long holes[] = {
+ /* Start of hole, end of hole, size of hole (+1) */
+ 0x565234afb000, 0x565234afc000, 0x1000,
+ 0x565234afe000, 0x565235def000, 0x12F1000,
+@@ -833,7 +833,7 @@ static noinline void check_alloc_range(struct maple_tree *mt)
+ * 4. number that should be returned.
+ * 5. return value
+ */
+- unsigned long req_range[] = {
++ static const unsigned long req_range[] = {
+ 0x565234af9000, /* Min */
+ 0x7fff58791000, /* Max */
+ 0x1000, /* Size */
+@@ -942,10 +942,10 @@ static noinline void check_alloc_range(struct maple_tree *mt)
+ }
+ #endif
+
+-static noinline void check_ranges(struct maple_tree *mt)
++static noinline void __init check_ranges(struct maple_tree *mt)
+ {
+ int i, val, val2;
+- unsigned long r[] = {
++ static const unsigned long r[] = {
+ 10, 15,
+ 20, 25,
+ 17, 22, /* Overlaps previous range. */
+@@ -1210,7 +1210,7 @@ static noinline void check_ranges(struct maple_tree *mt)
+ MT_BUG_ON(mt, mt_height(mt) != 4);
+ }
+
+-static noinline void check_next_entry(struct maple_tree *mt)
++static noinline void __init check_next_entry(struct maple_tree *mt)
+ {
+ void *entry = NULL;
+ unsigned long limit = 30, i = 0;
+@@ -1234,7 +1234,7 @@ static noinline void check_next_entry(struct maple_tree *mt)
+ mtree_destroy(mt);
+ }
+
+-static noinline void check_prev_entry(struct maple_tree *mt)
++static noinline void __init check_prev_entry(struct maple_tree *mt)
+ {
+ unsigned long index = 16;
+ void *value;
+@@ -1278,7 +1278,7 @@ static noinline void check_prev_entry(struct maple_tree *mt)
+ mas_unlock(&mas);
+ }
+
+-static noinline void check_root_expand(struct maple_tree *mt)
++static noinline void __init check_root_expand(struct maple_tree *mt)
+ {
+ MA_STATE(mas, mt, 0, 0);
+ void *ptr;
+@@ -1367,13 +1367,13 @@ static noinline void check_root_expand(struct maple_tree *mt)
+ mas_unlock(&mas);
+ }
+
+-static noinline void check_gap_combining(struct maple_tree *mt)
++static noinline void __init check_gap_combining(struct maple_tree *mt)
+ {
+ struct maple_enode *mn1, *mn2;
+ void *entry;
+ unsigned long singletons = 100;
+- unsigned long *seq100;
+- unsigned long seq100_64[] = {
++ static const unsigned long *seq100;
++ static const unsigned long seq100_64[] = {
+ /* 0-5 */
+ 74, 75, 76,
+ 50, 100, 2,
+@@ -1387,7 +1387,7 @@ static noinline void check_gap_combining(struct maple_tree *mt)
+ 76, 2, 79, 85, 4,
+ };
+
+- unsigned long seq100_32[] = {
++ static const unsigned long seq100_32[] = {
+ /* 0-5 */
+ 61, 62, 63,
+ 50, 100, 2,
+@@ -1401,11 +1401,11 @@ static noinline void check_gap_combining(struct maple_tree *mt)
+ 76, 2, 79, 85, 4,
+ };
+
+- unsigned long seq2000[] = {
++ static const unsigned long seq2000[] = {
+ 1152, 1151,
+ 1100, 1200, 2,
+ };
+- unsigned long seq400[] = {
++ static const unsigned long seq400[] = {
+ 286, 318,
+ 256, 260, 266, 270, 275, 280, 290, 398,
+ 286, 310,
+@@ -1564,7 +1564,7 @@ static noinline void check_gap_combining(struct maple_tree *mt)
+ mt_set_non_kernel(0);
+ mtree_destroy(mt);
+ }
+-static noinline void check_node_overwrite(struct maple_tree *mt)
++static noinline void __init check_node_overwrite(struct maple_tree *mt)
+ {
+ int i, max = 4000;
+
+@@ -1577,7 +1577,7 @@ static noinline void check_node_overwrite(struct maple_tree *mt)
+ }
+
+ #if defined(BENCH_SLOT_STORE)
+-static noinline void bench_slot_store(struct maple_tree *mt)
++static noinline void __init bench_slot_store(struct maple_tree *mt)
+ {
+ int i, brk = 105, max = 1040, brk_start = 100, count = 20000000;
+
+@@ -1593,7 +1593,7 @@ static noinline void bench_slot_store(struct maple_tree *mt)
+ #endif
+
+ #if defined(BENCH_NODE_STORE)
+-static noinline void bench_node_store(struct maple_tree *mt)
++static noinline void __init bench_node_store(struct maple_tree *mt)
+ {
+ int i, overwrite = 76, max = 240, count = 20000000;
+
+@@ -1612,7 +1612,7 @@ static noinline void bench_node_store(struct maple_tree *mt)
+ #endif
+
+ #if defined(BENCH_AWALK)
+-static noinline void bench_awalk(struct maple_tree *mt)
++static noinline void __init bench_awalk(struct maple_tree *mt)
+ {
+ int i, max = 2500, count = 50000000;
+ MA_STATE(mas, mt, 1470, 1470);
+@@ -1629,7 +1629,7 @@ static noinline void bench_awalk(struct maple_tree *mt)
+ }
+ #endif
+ #if defined(BENCH_WALK)
+-static noinline void bench_walk(struct maple_tree *mt)
++static noinline void __init bench_walk(struct maple_tree *mt)
+ {
+ int i, max = 2500, count = 550000000;
+ MA_STATE(mas, mt, 1470, 1470);
+@@ -1646,7 +1646,7 @@ static noinline void bench_walk(struct maple_tree *mt)
+ #endif
+
+ #if defined(BENCH_MT_FOR_EACH)
+-static noinline void bench_mt_for_each(struct maple_tree *mt)
++static noinline void __init bench_mt_for_each(struct maple_tree *mt)
+ {
+ int i, count = 1000000;
+ unsigned long max = 2500, index = 0;
+@@ -1670,7 +1670,7 @@ static noinline void bench_mt_for_each(struct maple_tree *mt)
+ #endif
+
+ /* check_forking - simulate the kernel forking sequence with the tree. */
+-static noinline void check_forking(struct maple_tree *mt)
++static noinline void __init check_forking(struct maple_tree *mt)
+ {
+
+ struct maple_tree newmt;
+@@ -1709,7 +1709,7 @@ static noinline void check_forking(struct maple_tree *mt)
+ mtree_destroy(&newmt);
+ }
+
+-static noinline void check_iteration(struct maple_tree *mt)
++static noinline void __init check_iteration(struct maple_tree *mt)
+ {
+ int i, nr_entries = 125;
+ void *val;
+@@ -1777,7 +1777,7 @@ static noinline void check_iteration(struct maple_tree *mt)
+ mt_set_non_kernel(0);
+ }
+
+-static noinline void check_mas_store_gfp(struct maple_tree *mt)
++static noinline void __init check_mas_store_gfp(struct maple_tree *mt)
+ {
+
+ struct maple_tree newmt;
+@@ -1810,7 +1810,7 @@ static noinline void check_mas_store_gfp(struct maple_tree *mt)
+ }
+
+ #if defined(BENCH_FORK)
+-static noinline void bench_forking(struct maple_tree *mt)
++static noinline void __init bench_forking(struct maple_tree *mt)
+ {
+
+ struct maple_tree newmt;
+@@ -1852,22 +1852,27 @@ static noinline void bench_forking(struct maple_tree *mt)
+ }
+ #endif
+
+-static noinline void next_prev_test(struct maple_tree *mt)
++static noinline void __init next_prev_test(struct maple_tree *mt)
+ {
+ int i, nr_entries;
+ void *val;
+ MA_STATE(mas, mt, 0, 0);
+ struct maple_enode *mn;
+- unsigned long *level2;
+- unsigned long level2_64[] = {707, 1000, 710, 715, 720, 725};
+- unsigned long level2_32[] = {1747, 2000, 1750, 1755, 1760, 1765};
++ static const unsigned long *level2;
++ static const unsigned long level2_64[] = { 707, 1000, 710, 715, 720,
++ 725};
++ static const unsigned long level2_32[] = { 1747, 2000, 1750, 1755,
++ 1760, 1765};
++ unsigned long last_index;
+
+ if (MAPLE_32BIT) {
+ nr_entries = 500;
+ level2 = level2_32;
++ last_index = 0x138e;
+ } else {
+ nr_entries = 200;
+ level2 = level2_64;
++ last_index = 0x7d6;
+ }
+
+ for (i = 0; i <= nr_entries; i++)
+@@ -1974,7 +1979,7 @@ static noinline void next_prev_test(struct maple_tree *mt)
+
+ val = mas_next(&mas, ULONG_MAX);
+ MT_BUG_ON(mt, val != NULL);
+- MT_BUG_ON(mt, mas.index != ULONG_MAX);
++ MT_BUG_ON(mt, mas.index != last_index);
+ MT_BUG_ON(mt, mas.last != ULONG_MAX);
+
+ val = mas_prev(&mas, 0);
+@@ -2028,7 +2033,7 @@ static noinline void next_prev_test(struct maple_tree *mt)
+
+
+ /* Test spanning writes that require balancing right sibling or right cousin */
+-static noinline void check_spanning_relatives(struct maple_tree *mt)
++static noinline void __init check_spanning_relatives(struct maple_tree *mt)
+ {
+
+ unsigned long i, nr_entries = 1000;
+@@ -2041,7 +2046,7 @@ static noinline void check_spanning_relatives(struct maple_tree *mt)
+ mtree_store_range(mt, 9365, 9955, NULL, GFP_KERNEL);
+ }
+
+-static noinline void check_fuzzer(struct maple_tree *mt)
++static noinline void __init check_fuzzer(struct maple_tree *mt)
+ {
+ /*
+ * 1. Causes a spanning rebalance of a single root node.
+@@ -2438,7 +2443,7 @@ static noinline void check_fuzzer(struct maple_tree *mt)
+ }
+
+ /* duplicate the tree with a specific gap */
+-static noinline void check_dup_gaps(struct maple_tree *mt,
++static noinline void __init check_dup_gaps(struct maple_tree *mt,
+ unsigned long nr_entries, bool zero_start,
+ unsigned long gap)
+ {
+@@ -2478,7 +2483,7 @@ static noinline void check_dup_gaps(struct maple_tree *mt,
+ }
+
+ /* Duplicate many sizes of trees. Mainly to test expected entry values */
+-static noinline void check_dup(struct maple_tree *mt)
++static noinline void __init check_dup(struct maple_tree *mt)
+ {
+ int i;
+ int big_start = 100010;
+@@ -2566,7 +2571,7 @@ static noinline void check_dup(struct maple_tree *mt)
+ }
+ }
+
+-static noinline void check_bnode_min_spanning(struct maple_tree *mt)
++static noinline void __init check_bnode_min_spanning(struct maple_tree *mt)
+ {
+ int i = 50;
+ MA_STATE(mas, mt, 0, 0);
+@@ -2585,7 +2590,7 @@ static noinline void check_bnode_min_spanning(struct maple_tree *mt)
+ mt_set_non_kernel(0);
+ }
+
+-static noinline void check_empty_area_window(struct maple_tree *mt)
++static noinline void __init check_empty_area_window(struct maple_tree *mt)
+ {
+ unsigned long i, nr_entries = 20;
+ MA_STATE(mas, mt, 0, 0);
+@@ -2670,7 +2675,7 @@ static noinline void check_empty_area_window(struct maple_tree *mt)
+ rcu_read_unlock();
+ }
+
+-static noinline void check_empty_area_fill(struct maple_tree *mt)
++static noinline void __init check_empty_area_fill(struct maple_tree *mt)
+ {
+ const unsigned long max = 0x25D78000;
+ unsigned long size;
+@@ -2714,11 +2719,11 @@ static noinline void check_empty_area_fill(struct maple_tree *mt)
+ }
+
+ static DEFINE_MTREE(tree);
+-static int maple_tree_seed(void)
++static int __init maple_tree_seed(void)
+ {
+- unsigned long set[] = {5015, 5014, 5017, 25, 1000,
+- 1001, 1002, 1003, 1005, 0,
+- 5003, 5002};
++ unsigned long set[] = { 5015, 5014, 5017, 25, 1000,
++ 1001, 1002, 1003, 1005, 0,
++ 5003, 5002};
+ void *ptr = &set;
+
+ pr_info("\nTEST STARTING\n\n");
+@@ -2988,7 +2993,7 @@ skip:
+ return -EINVAL;
+ }
+
+-static void maple_tree_harvest(void)
++static void __exit maple_tree_harvest(void)
+ {
+
+ }
+diff --git a/mm/memory-failure.c b/mm/memory-failure.c
+index 5b663eca1f293..47e2b545ffcc6 100644
+--- a/mm/memory-failure.c
++++ b/mm/memory-failure.c
+@@ -2490,7 +2490,7 @@ int unpoison_memory(unsigned long pfn)
+ goto unlock_mutex;
+ }
+
+- if (!folio_test_hwpoison(folio)) {
++ if (!PageHWPoison(p)) {
+ unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
+ pfn, &unpoison_rs);
+ goto unlock_mutex;
+diff --git a/mm/mempolicy.c b/mm/mempolicy.c
+index 1756389a06094..d524bf8d0e90c 100644
+--- a/mm/mempolicy.c
++++ b/mm/mempolicy.c
+@@ -384,8 +384,10 @@ void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
+ VMA_ITERATOR(vmi, mm, 0);
+
+ mmap_write_lock(mm);
+- for_each_vma(vmi, vma)
++ for_each_vma(vmi, vma) {
++ vma_start_write(vma);
+ mpol_rebind_policy(vma->vm_policy, new);
++ }
+ mmap_write_unlock(mm);
+ }
+
+@@ -765,6 +767,8 @@ static int vma_replace_policy(struct vm_area_struct *vma,
+ struct mempolicy *old;
+ struct mempolicy *new;
+
++ vma_assert_write_locked(vma);
++
+ pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
+ vma->vm_start, vma->vm_end, vma->vm_pgoff,
+ vma->vm_ops, vma->vm_file,
+@@ -1313,6 +1317,14 @@ static long do_mbind(unsigned long start, unsigned long len,
+ if (err)
+ goto mpol_out;
+
++ /*
++ * Lock the VMAs before scanning for pages to migrate, to ensure we don't
++ * miss a concurrently inserted page.
++ */
++ vma_iter_init(&vmi, mm, start);
++ for_each_vma_range(vmi, vma, end)
++ vma_start_write(vma);
++
+ ret = queue_pages_range(mm, start, end, nmask,
+ flags | MPOL_MF_INVERT, &pagelist);
+
+@@ -1538,6 +1550,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
+ break;
+ }
+
++ vma_start_write(vma);
+ new->home_node = home_node;
+ err = mbind_range(&vmi, vma, &prev, start, end, new);
+ mpol_put(new);
+diff --git a/mm/mmap.c b/mm/mmap.c
+index 5c5a917b261e7..224a9646a7dbd 100644
+--- a/mm/mmap.c
++++ b/mm/mmap.c
+@@ -647,6 +647,7 @@ static inline int dup_anon_vma(struct vm_area_struct *dst,
+ * anon pages imported.
+ */
+ if (src->anon_vma && !dst->anon_vma) {
++ vma_start_write(dst);
+ dst->anon_vma = src->anon_vma;
+ return anon_vma_clone(dst, src);
+ }
+diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
+index cd7b0bf5369ec..5eb4898cccd4c 100644
+--- a/net/ceph/messenger.c
++++ b/net/ceph/messenger.c
+@@ -1123,6 +1123,7 @@ bool ceph_addr_is_blank(const struct ceph_entity_addr *addr)
+ return true;
+ }
+ }
++EXPORT_SYMBOL(ceph_addr_is_blank);
+
+ int ceph_addr_port(const struct ceph_entity_addr *addr)
+ {
+diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
+index 5affca8e2f53a..c63f1d62d60a5 100644
+--- a/net/ipv6/addrconf.c
++++ b/net/ipv6/addrconf.c
+@@ -2561,12 +2561,18 @@ static void manage_tempaddrs(struct inet6_dev *idev,
+ ipv6_ifa_notify(0, ift);
+ }
+
+- if ((create || list_empty(&idev->tempaddr_list)) &&
+- idev->cnf.use_tempaddr > 0) {
++ /* Also create a temporary address if it's enabled but no temporary
++ * address currently exists.
++ * However, we get called with valid_lft == 0, prefered_lft == 0, create == false
++ * as part of cleanup (ie. deleting the mngtmpaddr).
++ * We don't want that to result in creating a new temporary ip address.
++ */
++ if (list_empty(&idev->tempaddr_list) && (valid_lft || prefered_lft))
++ create = true;
++
++ if (create && idev->cnf.use_tempaddr > 0) {
+ /* When a new public address is created as described
+ * in [ADDRCONF], also create a new temporary address.
+- * Also create a temporary address if it's enabled but
+- * no temporary address currently exists.
+ */
+ read_unlock_bh(&idev->lock);
+ ipv6_create_tempaddr(ifp, false);
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index b069826869d05..f2eeb8a850af2 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -3717,10 +3717,9 @@ static int mptcp_listen(struct socket *sock, int backlog)
+ if (!err) {
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
+ mptcp_copy_inaddrs(sk, ssock->sk);
++ mptcp_event_pm_listener(ssock->sk, MPTCP_EVENT_LISTENER_CREATED);
+ }
+
+- mptcp_event_pm_listener(ssock->sk, MPTCP_EVENT_LISTENER_CREATED);
+-
+ unlock:
+ release_sock(sk);
+ return err;
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index ccf0b3d80fd97..da00c411a9cd4 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -3810,8 +3810,6 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ NL_SET_BAD_ATTR(extack, nla[NFTA_RULE_CHAIN]);
+ return PTR_ERR(chain);
+ }
+- if (nft_chain_is_bound(chain))
+- return -EOPNOTSUPP;
+
+ } else if (nla[NFTA_RULE_CHAIN_ID]) {
+ chain = nft_chain_lookup_byid(net, table, nla[NFTA_RULE_CHAIN_ID],
+@@ -3824,6 +3822,9 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
+ return -EINVAL;
+ }
+
++ if (nft_chain_is_bound(chain))
++ return -EOPNOTSUPP;
++
+ if (nla[NFTA_RULE_HANDLE]) {
+ handle = be64_to_cpu(nla_get_be64(nla[NFTA_RULE_HANDLE]));
+ rule = __nft_rule_lookup(chain, handle);
+diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c
+index 407d7197f75bb..fccb3cf7749c1 100644
+--- a/net/netfilter/nft_immediate.c
++++ b/net/netfilter/nft_immediate.c
+@@ -125,15 +125,27 @@ static void nft_immediate_activate(const struct nft_ctx *ctx,
+ return nft_data_hold(&priv->data, nft_dreg_to_type(priv->dreg));
+ }
+
++static void nft_immediate_chain_deactivate(const struct nft_ctx *ctx,
++ struct nft_chain *chain,
++ enum nft_trans_phase phase)
++{
++ struct nft_ctx chain_ctx;
++ struct nft_rule *rule;
++
++ chain_ctx = *ctx;
++ chain_ctx.chain = chain;
++
++ list_for_each_entry(rule, &chain->rules, list)
++ nft_rule_expr_deactivate(&chain_ctx, rule, phase);
++}
++
+ static void nft_immediate_deactivate(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ enum nft_trans_phase phase)
+ {
+ const struct nft_immediate_expr *priv = nft_expr_priv(expr);
+ const struct nft_data *data = &priv->data;
+- struct nft_ctx chain_ctx;
+ struct nft_chain *chain;
+- struct nft_rule *rule;
+
+ if (priv->dreg == NFT_REG_VERDICT) {
+ switch (data->verdict.code) {
+@@ -143,20 +155,17 @@ static void nft_immediate_deactivate(const struct nft_ctx *ctx,
+ if (!nft_chain_binding(chain))
+ break;
+
+- chain_ctx = *ctx;
+- chain_ctx.chain = chain;
+-
+- list_for_each_entry(rule, &chain->rules, list)
+- nft_rule_expr_deactivate(&chain_ctx, rule, phase);
+-
+ switch (phase) {
+ case NFT_TRANS_PREPARE_ERROR:
+ nf_tables_unbind_chain(ctx, chain);
+- fallthrough;
++ nft_deactivate_next(ctx->net, chain);
++ break;
+ case NFT_TRANS_PREPARE:
++ nft_immediate_chain_deactivate(ctx, chain, phase);
+ nft_deactivate_next(ctx->net, chain);
+ break;
+ default:
++ nft_immediate_chain_deactivate(ctx, chain, phase);
+ nft_chain_del(chain);
+ chain->bound = false;
+ nft_use_dec(&chain->table->use);
+diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
+index 5c05c9b990fba..8d73fffd2d09d 100644
+--- a/net/netfilter/nft_set_rbtree.c
++++ b/net/netfilter/nft_set_rbtree.c
+@@ -217,29 +217,37 @@ static void *nft_rbtree_get(const struct net *net, const struct nft_set *set,
+
+ static int nft_rbtree_gc_elem(const struct nft_set *__set,
+ struct nft_rbtree *priv,
+- struct nft_rbtree_elem *rbe)
++ struct nft_rbtree_elem *rbe,
++ u8 genmask)
+ {
+ struct nft_set *set = (struct nft_set *)__set;
+ struct rb_node *prev = rb_prev(&rbe->node);
+- struct nft_rbtree_elem *rbe_prev = NULL;
++ struct nft_rbtree_elem *rbe_prev;
+ struct nft_set_gc_batch *gcb;
+
+ gcb = nft_set_gc_batch_check(set, NULL, GFP_ATOMIC);
+ if (!gcb)
+ return -ENOMEM;
+
+- /* search for expired end interval coming before this element. */
++ /* search for end interval coming before this element.
++ * end intervals don't carry a timeout extension, they
++ * are coupled with the interval start element.
++ */
+ while (prev) {
+ rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node);
+- if (nft_rbtree_interval_end(rbe_prev))
++ if (nft_rbtree_interval_end(rbe_prev) &&
++ nft_set_elem_active(&rbe_prev->ext, genmask))
+ break;
+
+ prev = rb_prev(prev);
+ }
+
+- if (rbe_prev) {
++ if (prev) {
++ rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node);
++
+ rb_erase(&rbe_prev->node, &priv->root);
+ atomic_dec(&set->nelems);
++ nft_set_gc_batch_add(gcb, rbe_prev);
+ }
+
+ rb_erase(&rbe->node, &priv->root);
+@@ -321,7 +329,7 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
+
+ /* perform garbage collection to avoid bogus overlap reports. */
+ if (nft_set_elem_expired(&rbe->ext)) {
+- err = nft_rbtree_gc_elem(set, priv, rbe);
++ err = nft_rbtree_gc_elem(set, priv, rbe, genmask);
+ if (err < 0)
+ return err;
+
+diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
+index ab69ff7577fc7..793009f445c03 100644
+--- a/net/sched/sch_mqprio.c
++++ b/net/sched/sch_mqprio.c
+@@ -290,6 +290,13 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt,
+ "Attribute type expected to be TCA_MQPRIO_MIN_RATE64");
+ return -EINVAL;
+ }
++
++ if (nla_len(attr) != sizeof(u64)) {
++ NL_SET_ERR_MSG_ATTR(extack, attr,
++ "Attribute TCA_MQPRIO_MIN_RATE64 expected to have 8 bytes length");
++ return -EINVAL;
++ }
++
+ if (i >= qopt->num_tc)
+ break;
+ priv->min_rate[i] = nla_get_u64(attr);
+@@ -312,6 +319,13 @@ static int mqprio_parse_nlattr(struct Qdisc *sch, struct tc_mqprio_qopt *qopt,
+ "Attribute type expected to be TCA_MQPRIO_MAX_RATE64");
+ return -EINVAL;
+ }
++
++ if (nla_len(attr) != sizeof(u64)) {
++ NL_SET_ERR_MSG_ATTR(extack, attr,
++ "Attribute TCA_MQPRIO_MAX_RATE64 expected to have 8 bytes length");
++ return -EINVAL;
++ }
++
+ if (i >= qopt->num_tc)
+ break;
+ priv->max_rate[i] = nla_get_u64(attr);
+diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
+index 577fa5af33ec7..302fd749c4249 100644
+--- a/net/tipc/crypto.c
++++ b/net/tipc/crypto.c
+@@ -1960,7 +1960,8 @@ rcv:
+
+ skb_reset_network_header(*skb);
+ skb_pull(*skb, tipc_ehdr_size(ehdr));
+- pskb_trim(*skb, (*skb)->len - aead->authsize);
++ if (pskb_trim(*skb, (*skb)->len - aead->authsize))
++ goto free_skb;
+
+ /* Validate TIPCv2 message */
+ if (unlikely(!tipc_msg_validate(skb))) {
+diff --git a/net/tipc/node.c b/net/tipc/node.c
+index 5e000fde80676..a9c5b6594889b 100644
+--- a/net/tipc/node.c
++++ b/net/tipc/node.c
+@@ -583,7 +583,7 @@ update:
+ n->capabilities, &n->bc_entry.inputq1,
+ &n->bc_entry.namedq, snd_l, &n->bc_entry.link)) {
+ pr_warn("Broadcast rcv link creation failed, no memory\n");
+- kfree(n);
++ tipc_node_put(n);
+ n = NULL;
+ goto exit;
+ }
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index 169572c8ed40f..bcd548e247fc8 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -9502,6 +9502,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x880d, "HP EliteBook 830 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8811, "HP Spectre x360 15-eb1xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1),
+ SND_PCI_QUIRK(0x103c, 0x8812, "HP Spectre x360 15-eb1xxx", ALC285_FIXUP_HP_SPECTRE_X360_EB1),
++ SND_PCI_QUIRK(0x103c, 0x881d, "HP 250 G8 Notebook PC", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
+ SND_PCI_QUIRK(0x103c, 0x8846, "HP EliteBook 850 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x8847, "HP EliteBook x360 830 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x884b, "HP EliteBook 840 Aero G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
+@@ -9628,6 +9629,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1043, 0x1c92, "ASUS ROG Strix G15", ALC285_FIXUP_ASUS_G533Z_PINS),
+ SND_PCI_QUIRK(0x1043, 0x1caf, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1ccd, "ASUS X555UB", ALC256_FIXUP_ASUS_MIC),
++ SND_PCI_QUIRK(0x1043, 0x1d1f, "ASUS ROG Strix G17 2023 (G713PV)", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x1043, 0x1d42, "ASUS Zephyrus G14 2022", ALC289_FIXUP_ASUS_GA401),
+ SND_PCI_QUIRK(0x1043, 0x1d4e, "ASUS TM420", ALC256_FIXUP_ASUS_HPE),
+ SND_PCI_QUIRK(0x1043, 0x1e02, "ASUS UX3402", ALC245_FIXUP_CS35L41_SPI_2),
+diff --git a/sound/soc/codecs/wm8904.c b/sound/soc/codecs/wm8904.c
+index 791d8738d1c0e..a05d4dafd3d77 100644
+--- a/sound/soc/codecs/wm8904.c
++++ b/sound/soc/codecs/wm8904.c
+@@ -2308,6 +2308,9 @@ static int wm8904_i2c_probe(struct i2c_client *i2c)
+ regmap_update_bits(wm8904->regmap, WM8904_BIAS_CONTROL_0,
+ WM8904_POBCTRL, 0);
+
++ /* Fill the cache for the ADC test register */
++ regmap_read(wm8904->regmap, WM8904_ADC_TEST_0, &val);
++
+ /* Can leave the device powered off until we need it */
+ regcache_cache_only(wm8904->regmap, true);
+ regulator_bulk_disable(ARRAY_SIZE(wm8904->supplies), wm8904->supplies);
+diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
+index 015c3708aa04e..3fd26f2cdd60f 100644
+--- a/sound/soc/fsl/fsl_spdif.c
++++ b/sound/soc/fsl/fsl_spdif.c
+@@ -751,6 +751,8 @@ static int fsl_spdif_trigger(struct snd_pcm_substream *substream,
+ case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
+ regmap_update_bits(regmap, REG_SPDIF_SCR, dmaen, 0);
+ regmap_update_bits(regmap, REG_SPDIF_SIE, intr, 0);
++ regmap_write(regmap, REG_SPDIF_STL, 0x0);
++ regmap_write(regmap, REG_SPDIF_STR, 0x0);
+ break;
+ default:
+ return -EINVAL;
+diff --git a/tools/net/ynl/lib/ynl.py b/tools/net/ynl/lib/ynl.py
+index 3144f33196be4..35462c7ce48b5 100644
+--- a/tools/net/ynl/lib/ynl.py
++++ b/tools/net/ynl/lib/ynl.py
+@@ -405,8 +405,8 @@ class YnlFamily(SpecFamily):
+ def _decode_enum(self, rsp, attr_spec):
+ raw = rsp[attr_spec['name']]
+ enum = self.consts[attr_spec['enum']]
+- i = attr_spec.get('value-start', 0)
+ if 'enum-as-flags' in attr_spec and attr_spec['enum-as-flags']:
++ i = 0
+ value = set()
+ while raw:
+ if raw & 1:
+@@ -414,7 +414,7 @@ class YnlFamily(SpecFamily):
+ raw >>= 1
+ i += 1
+ else:
+- value = enum.entries_by_val[raw - i].name
++ value = enum.entries_by_val[raw].name
+ rsp[attr_spec['name']] = value
+
+ def _decode_binary(self, attr, attr_spec):
+diff --git a/tools/testing/radix-tree/linux/init.h b/tools/testing/radix-tree/linux/init.h
+index 1bb0afc213099..81563c3dfce79 100644
+--- a/tools/testing/radix-tree/linux/init.h
++++ b/tools/testing/radix-tree/linux/init.h
+@@ -1 +1,2 @@
+ #define __init
++#define __exit
+diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c
+index adc5392df4009..67c56e9e92606 100644
+--- a/tools/testing/radix-tree/maple.c
++++ b/tools/testing/radix-tree/maple.c
+@@ -14,6 +14,7 @@
+ #include "test.h"
+ #include <stdlib.h>
+ #include <time.h>
++#include "linux/init.h"
+
+ #define module_init(x)
+ #define module_exit(x)
+@@ -81,7 +82,7 @@ static void check_mas_alloc_node_count(struct ma_state *mas)
+ * check_new_node() - Check the creation of new nodes and error path
+ * verification.
+ */
+-static noinline void check_new_node(struct maple_tree *mt)
++static noinline void __init check_new_node(struct maple_tree *mt)
+ {
+
+ struct maple_node *mn, *mn2, *mn3;
+@@ -455,7 +456,7 @@ static noinline void check_new_node(struct maple_tree *mt)
+ /*
+ * Check erasing including RCU.
+ */
+-static noinline void check_erase(struct maple_tree *mt, unsigned long index,
++static noinline void __init check_erase(struct maple_tree *mt, unsigned long index,
+ void *ptr)
+ {
+ MT_BUG_ON(mt, mtree_test_erase(mt, index) != ptr);
+@@ -465,24 +466,24 @@ static noinline void check_erase(struct maple_tree *mt, unsigned long index,
+ #define erase_check_insert(mt, i) check_insert(mt, set[i], entry[i%2])
+ #define erase_check_erase(mt, i) check_erase(mt, set[i], entry[i%2])
+
+-static noinline void check_erase_testset(struct maple_tree *mt)
++static noinline void __init check_erase_testset(struct maple_tree *mt)
+ {
+- unsigned long set[] = { 5015, 5014, 5017, 25, 1000,
+- 1001, 1002, 1003, 1005, 0,
+- 6003, 6002, 6008, 6012, 6015,
+- 7003, 7002, 7008, 7012, 7015,
+- 8003, 8002, 8008, 8012, 8015,
+- 9003, 9002, 9008, 9012, 9015,
+- 10003, 10002, 10008, 10012, 10015,
+- 11003, 11002, 11008, 11012, 11015,
+- 12003, 12002, 12008, 12012, 12015,
+- 13003, 13002, 13008, 13012, 13015,
+- 14003, 14002, 14008, 14012, 14015,
+- 15003, 15002, 15008, 15012, 15015,
+- };
+-
+-
+- void *ptr = &set;
++ static const unsigned long set[] = { 5015, 5014, 5017, 25, 1000,
++ 1001, 1002, 1003, 1005, 0,
++ 6003, 6002, 6008, 6012, 6015,
++ 7003, 7002, 7008, 7012, 7015,
++ 8003, 8002, 8008, 8012, 8015,
++ 9003, 9002, 9008, 9012, 9015,
++ 10003, 10002, 10008, 10012, 10015,
++ 11003, 11002, 11008, 11012, 11015,
++ 12003, 12002, 12008, 12012, 12015,
++ 13003, 13002, 13008, 13012, 13015,
++ 14003, 14002, 14008, 14012, 14015,
++ 15003, 15002, 15008, 15012, 15015,
++ };
++
++
++ void *ptr = &check_erase_testset;
+ void *entry[2] = { ptr, mt };
+ void *root_node;
+
+@@ -739,7 +740,7 @@ static noinline void check_erase_testset(struct maple_tree *mt)
+ int mas_ce2_over_count(struct ma_state *mas_start, struct ma_state *mas_end,
+ void *s_entry, unsigned long s_min,
+ void *e_entry, unsigned long e_max,
+- unsigned long *set, int i, bool null_entry)
++ const unsigned long *set, int i, bool null_entry)
+ {
+ int count = 0, span = 0;
+ unsigned long retry = 0;
+@@ -969,8 +970,8 @@ retry:
+ }
+
+ #if defined(CONFIG_64BIT)
+-static noinline void check_erase2_testset(struct maple_tree *mt,
+- unsigned long *set, unsigned long size)
++static noinline void __init check_erase2_testset(struct maple_tree *mt,
++ const unsigned long *set, unsigned long size)
+ {
+ int entry_count = 0;
+ int check = 0;
+@@ -1114,11 +1115,11 @@ static noinline void check_erase2_testset(struct maple_tree *mt,
+
+
+ /* These tests were pulled from KVM tree modifications which failed. */
+-static noinline void check_erase2_sets(struct maple_tree *mt)
++static noinline void __init check_erase2_sets(struct maple_tree *mt)
+ {
+ void *entry;
+ unsigned long start = 0;
+- unsigned long set[] = {
++ static const unsigned long set[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140721266458624, 140737488351231,
+ ERASE, 140721266458624, 140737488351231,
+@@ -1136,7 +1137,7 @@ ERASE, 140253902692352, 140253902864383,
+ STORE, 140253902692352, 140253902696447,
+ STORE, 140253902696448, 140253902864383,
+ };
+- unsigned long set2[] = {
++ static const unsigned long set2[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140735933583360, 140737488351231,
+ ERASE, 140735933583360, 140737488351231,
+@@ -1160,7 +1161,7 @@ STORE, 140277094813696, 140277094821887,
+ STORE, 140277094821888, 140277094825983,
+ STORE, 140735933906944, 140735933911039,
+ };
+- unsigned long set3[] = {
++ static const unsigned long set3[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140735790264320, 140737488351231,
+ ERASE, 140735790264320, 140737488351231,
+@@ -1203,7 +1204,7 @@ STORE, 47135835840512, 47135835885567,
+ STORE, 47135835885568, 47135835893759,
+ };
+
+- unsigned long set4[] = {
++ static const unsigned long set4[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140728251703296, 140737488351231,
+ ERASE, 140728251703296, 140737488351231,
+@@ -1224,7 +1225,7 @@ ERASE, 47646523277312, 47646523445247,
+ STORE, 47646523277312, 47646523400191,
+ };
+
+- unsigned long set5[] = {
++ static const unsigned long set5[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140726874062848, 140737488351231,
+ ERASE, 140726874062848, 140737488351231,
+@@ -1357,7 +1358,7 @@ STORE, 47884791619584, 47884791623679,
+ STORE, 47884791623680, 47884791627775,
+ };
+
+- unsigned long set6[] = {
++ static const unsigned long set6[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140722999021568, 140737488351231,
+ ERASE, 140722999021568, 140737488351231,
+@@ -1489,7 +1490,7 @@ ERASE, 47430432014336, 47430432022527,
+ STORE, 47430432014336, 47430432018431,
+ STORE, 47430432018432, 47430432022527,
+ };
+- unsigned long set7[] = {
++ static const unsigned long set7[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140729808330752, 140737488351231,
+ ERASE, 140729808330752, 140737488351231,
+@@ -1621,7 +1622,7 @@ ERASE, 47439987130368, 47439987138559,
+ STORE, 47439987130368, 47439987134463,
+ STORE, 47439987134464, 47439987138559,
+ };
+- unsigned long set8[] = {
++ static const unsigned long set8[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140722482974720, 140737488351231,
+ ERASE, 140722482974720, 140737488351231,
+@@ -1754,7 +1755,7 @@ STORE, 47708488638464, 47708488642559,
+ STORE, 47708488642560, 47708488646655,
+ };
+
+- unsigned long set9[] = {
++ static const unsigned long set9[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140736427839488, 140737488351231,
+ ERASE, 140736427839488, 140736427839488,
+@@ -5620,7 +5621,7 @@ ERASE, 47906195480576, 47906195480576,
+ STORE, 94641242615808, 94641242750975,
+ };
+
+- unsigned long set10[] = {
++ static const unsigned long set10[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140736427839488, 140737488351231,
+ ERASE, 140736427839488, 140736427839488,
+@@ -9484,7 +9485,7 @@ STORE, 139726599680000, 139726599684095,
+ ERASE, 47906195480576, 47906195480576,
+ STORE, 94641242615808, 94641242750975,
+ };
+- unsigned long set11[] = {
++ static const unsigned long set11[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140732658499584, 140737488351231,
+ ERASE, 140732658499584, 140732658499584,
+@@ -9510,7 +9511,7 @@ STORE, 140732658565120, 140732658569215,
+ STORE, 140732658552832, 140732658565119,
+ };
+
+- unsigned long set12[] = { /* contains 12 values. */
++ static const unsigned long set12[] = { /* contains 12 values. */
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140732658499584, 140737488351231,
+ ERASE, 140732658499584, 140732658499584,
+@@ -9537,7 +9538,7 @@ STORE, 140732658552832, 140732658565119,
+ STORE, 140014592741375, 140014592741375, /* contrived */
+ STORE, 140014592733184, 140014592741376, /* creates first entry retry. */
+ };
+- unsigned long set13[] = {
++ static const unsigned long set13[] = {
+ STORE, 140373516247040, 140373516251135,/*: ffffa2e7b0e10d80 */
+ STORE, 140373516251136, 140373516255231,/*: ffffa2e7b1195d80 */
+ STORE, 140373516255232, 140373516443647,/*: ffffa2e7b0e109c0 */
+@@ -9550,7 +9551,7 @@ STORE, 140373518684160, 140373518688254,/*: ffffa2e7b05fec00 */
+ STORE, 140373518688256, 140373518692351,/*: ffffa2e7bfbdcd80 */
+ STORE, 140373518692352, 140373518696447,/*: ffffa2e7b0749e40 */
+ };
+- unsigned long set14[] = {
++ static const unsigned long set14[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140731667996672, 140737488351231,
+ SNULL, 140731668000767, 140737488351231,
+@@ -9834,7 +9835,7 @@ SNULL, 139826136543232, 139826136809471,
+ STORE, 139826136809472, 139826136842239,
+ STORE, 139826136543232, 139826136809471,
+ };
+- unsigned long set15[] = {
++ static const unsigned long set15[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140722061451264, 140737488351231,
+ SNULL, 140722061455359, 140737488351231,
+@@ -10119,7 +10120,7 @@ STORE, 139906808958976, 139906808991743,
+ STORE, 139906808692736, 139906808958975,
+ };
+
+- unsigned long set16[] = {
++ static const unsigned long set16[] = {
+ STORE, 94174808662016, 94174809321471,
+ STORE, 94174811414528, 94174811426815,
+ STORE, 94174811426816, 94174811430911,
+@@ -10330,7 +10331,7 @@ STORE, 139921865613312, 139921865617407,
+ STORE, 139921865547776, 139921865564159,
+ };
+
+- unsigned long set17[] = {
++ static const unsigned long set17[] = {
+ STORE, 94397057224704, 94397057646591,
+ STORE, 94397057650688, 94397057691647,
+ STORE, 94397057691648, 94397057695743,
+@@ -10392,7 +10393,7 @@ STORE, 140720477511680, 140720477646847,
+ STORE, 140720478302208, 140720478314495,
+ STORE, 140720478314496, 140720478318591,
+ };
+- unsigned long set18[] = {
++ static const unsigned long set18[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140724953673728, 140737488351231,
+ SNULL, 140724953677823, 140737488351231,
+@@ -10425,7 +10426,7 @@ STORE, 140222970597376, 140222970605567,
+ ERASE, 140222970597376, 140222970605567,
+ STORE, 140222970597376, 140222970605567,
+ };
+- unsigned long set19[] = {
++ static const unsigned long set19[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140725182459904, 140737488351231,
+ SNULL, 140725182463999, 140737488351231,
+@@ -10694,7 +10695,7 @@ STORE, 140656836775936, 140656836780031,
+ STORE, 140656787476480, 140656791920639,
+ ERASE, 140656774639616, 140656779083775,
+ };
+- unsigned long set20[] = {
++ static const unsigned long set20[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140735952392192, 140737488351231,
+ SNULL, 140735952396287, 140737488351231,
+@@ -10850,7 +10851,7 @@ STORE, 140590386819072, 140590386823167,
+ STORE, 140590386823168, 140590386827263,
+ SNULL, 140590376591359, 140590376595455,
+ };
+- unsigned long set21[] = {
++ static const unsigned long set21[] = {
+ STORE, 93874710941696, 93874711363583,
+ STORE, 93874711367680, 93874711408639,
+ STORE, 93874711408640, 93874711412735,
+@@ -10920,7 +10921,7 @@ ERASE, 140708393312256, 140708393316351,
+ ERASE, 140708393308160, 140708393312255,
+ ERASE, 140708393291776, 140708393308159,
+ };
+- unsigned long set22[] = {
++ static const unsigned long set22[] = {
+ STORE, 93951397134336, 93951397183487,
+ STORE, 93951397183488, 93951397728255,
+ STORE, 93951397728256, 93951397826559,
+@@ -11047,7 +11048,7 @@ STORE, 140551361253376, 140551361519615,
+ ERASE, 140551361253376, 140551361519615,
+ };
+
+- unsigned long set23[] = {
++ static const unsigned long set23[] = {
+ STORE, 94014447943680, 94014448156671,
+ STORE, 94014450253824, 94014450257919,
+ STORE, 94014450257920, 94014450266111,
+@@ -14371,7 +14372,7 @@ SNULL, 140175956627455, 140175985139711,
+ STORE, 140175927242752, 140175956627455,
+ STORE, 140175956627456, 140175985139711,
+ };
+- unsigned long set24[] = {
++ static const unsigned long set24[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140735281639424, 140737488351231,
+ SNULL, 140735281643519, 140737488351231,
+@@ -15533,7 +15534,7 @@ ERASE, 139635393024000, 139635401412607,
+ ERASE, 139635384627200, 139635384631295,
+ ERASE, 139635384631296, 139635393019903,
+ };
+- unsigned long set25[] = {
++ static const unsigned long set25[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140737488343040, 140737488351231,
+ STORE, 140722547441664, 140737488351231,
+@@ -22321,7 +22322,7 @@ STORE, 140249652703232, 140249682087935,
+ STORE, 140249682087936, 140249710600191,
+ };
+
+- unsigned long set26[] = {
++ static const unsigned long set26[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140729464770560, 140737488351231,
+ SNULL, 140729464774655, 140737488351231,
+@@ -22345,7 +22346,7 @@ ERASE, 140109040951296, 140109040959487,
+ STORE, 140109040955392, 140109040959487,
+ ERASE, 140109040955392, 140109040959487,
+ };
+- unsigned long set27[] = {
++ static const unsigned long set27[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140726128070656, 140737488351231,
+ SNULL, 140726128074751, 140737488351231,
+@@ -22741,7 +22742,7 @@ STORE, 140415509696512, 140415535910911,
+ ERASE, 140415537422336, 140415562588159,
+ STORE, 140415482433536, 140415509696511,
+ };
+- unsigned long set28[] = {
++ static const unsigned long set28[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140722475622400, 140737488351231,
+ SNULL, 140722475626495, 140737488351231,
+@@ -22809,7 +22810,7 @@ STORE, 139918413348864, 139918413352959,
+ ERASE, 139918413316096, 139918413344767,
+ STORE, 93865848528896, 93865848664063,
+ };
+- unsigned long set29[] = {
++ static const unsigned long set29[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140734467944448, 140737488351231,
+ SNULL, 140734467948543, 140737488351231,
+@@ -23684,7 +23685,7 @@ ERASE, 140143079972864, 140143088361471,
+ ERASE, 140143205793792, 140143205797887,
+ ERASE, 140143205797888, 140143214186495,
+ };
+- unsigned long set30[] = {
++ static const unsigned long set30[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140733436743680, 140737488351231,
+ SNULL, 140733436747775, 140737488351231,
+@@ -24566,7 +24567,7 @@ ERASE, 140165225893888, 140165225897983,
+ ERASE, 140165225897984, 140165234286591,
+ ERASE, 140165058105344, 140165058109439,
+ };
+- unsigned long set31[] = {
++ static const unsigned long set31[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140730890784768, 140737488351231,
+ SNULL, 140730890788863, 140737488351231,
+@@ -25379,7 +25380,7 @@ ERASE, 140623906590720, 140623914979327,
+ ERASE, 140622950277120, 140622950281215,
+ ERASE, 140622950281216, 140622958669823,
+ };
+- unsigned long set32[] = {
++ static const unsigned long set32[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140731244212224, 140737488351231,
+ SNULL, 140731244216319, 140737488351231,
+@@ -26175,7 +26176,7 @@ ERASE, 140400417288192, 140400425676799,
+ ERASE, 140400283066368, 140400283070463,
+ ERASE, 140400283070464, 140400291459071,
+ };
+- unsigned long set33[] = {
++ static const unsigned long set33[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140734562918400, 140737488351231,
+ SNULL, 140734562922495, 140737488351231,
+@@ -26317,7 +26318,7 @@ STORE, 140582961786880, 140583003750399,
+ ERASE, 140582961786880, 140583003750399,
+ };
+
+- unsigned long set34[] = {
++ static const unsigned long set34[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140731327180800, 140737488351231,
+ SNULL, 140731327184895, 140737488351231,
+@@ -27198,7 +27199,7 @@ ERASE, 140012522094592, 140012530483199,
+ ERASE, 140012033142784, 140012033146879,
+ ERASE, 140012033146880, 140012041535487,
+ };
+- unsigned long set35[] = {
++ static const unsigned long set35[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140730536939520, 140737488351231,
+ SNULL, 140730536943615, 140737488351231,
+@@ -27955,7 +27956,7 @@ ERASE, 140474471936000, 140474480324607,
+ ERASE, 140474396430336, 140474396434431,
+ ERASE, 140474396434432, 140474404823039,
+ };
+- unsigned long set36[] = {
++ static const unsigned long set36[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140723893125120, 140737488351231,
+ SNULL, 140723893129215, 140737488351231,
+@@ -28816,7 +28817,7 @@ ERASE, 140121890357248, 140121898745855,
+ ERASE, 140121269587968, 140121269592063,
+ ERASE, 140121269592064, 140121277980671,
+ };
+- unsigned long set37[] = {
++ static const unsigned long set37[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140722404016128, 140737488351231,
+ SNULL, 140722404020223, 140737488351231,
+@@ -28942,7 +28943,7 @@ STORE, 139759821246464, 139759888355327,
+ ERASE, 139759821246464, 139759888355327,
+ ERASE, 139759888355328, 139759955464191,
+ };
+- unsigned long set38[] = {
++ static const unsigned long set38[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140730666221568, 140737488351231,
+ SNULL, 140730666225663, 140737488351231,
+@@ -29752,7 +29753,7 @@ ERASE, 140613504712704, 140613504716799,
+ ERASE, 140613504716800, 140613513105407,
+ };
+
+- unsigned long set39[] = {
++ static const unsigned long set39[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140736271417344, 140737488351231,
+ SNULL, 140736271421439, 140737488351231,
+@@ -30124,7 +30125,7 @@ STORE, 140325364428800, 140325372821503,
+ STORE, 140325356036096, 140325364428799,
+ SNULL, 140325364432895, 140325372821503,
+ };
+- unsigned long set40[] = {
++ static const unsigned long set40[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140734309167104, 140737488351231,
+ SNULL, 140734309171199, 140737488351231,
+@@ -30875,7 +30876,7 @@ ERASE, 140320289300480, 140320289304575,
+ ERASE, 140320289304576, 140320297693183,
+ ERASE, 140320163409920, 140320163414015,
+ };
+- unsigned long set41[] = {
++ static const unsigned long set41[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140728157171712, 140737488351231,
+ SNULL, 140728157175807, 140737488351231,
+@@ -31185,7 +31186,7 @@ STORE, 94376135090176, 94376135094271,
+ STORE, 94376135094272, 94376135098367,
+ SNULL, 94376135094272, 94377208836095,
+ };
+- unsigned long set42[] = {
++ static const unsigned long set42[] = {
+ STORE, 314572800, 1388314623,
+ STORE, 1462157312, 1462169599,
+ STORE, 1462169600, 1462185983,
+@@ -33862,7 +33863,7 @@ SNULL, 3798999040, 3799101439,
+ */
+ };
+
+- unsigned long set43[] = {
++ static const unsigned long set43[] = {
+ STORE, 140737488347136, 140737488351231,
+ STORE, 140734187720704, 140737488351231,
+ SNULL, 140734187724800, 140737488351231,
+@@ -34996,7 +34997,7 @@ void run_check_rcu_slowread(struct maple_tree *mt, struct rcu_test_struct *vals)
+ MT_BUG_ON(mt, !vals->seen_entry3);
+ MT_BUG_ON(mt, !vals->seen_both);
+ }
+-static noinline void check_rcu_simulated(struct maple_tree *mt)
++static noinline void __init check_rcu_simulated(struct maple_tree *mt)
+ {
+ unsigned long i, nr_entries = 1000;
+ unsigned long target = 4320;
+@@ -35157,7 +35158,7 @@ static noinline void check_rcu_simulated(struct maple_tree *mt)
+ rcu_unregister_thread();
+ }
+
+-static noinline void check_rcu_threaded(struct maple_tree *mt)
++static noinline void __init check_rcu_threaded(struct maple_tree *mt)
+ {
+ unsigned long i, nr_entries = 1000;
+ struct rcu_test_struct vals;
+@@ -35366,7 +35367,7 @@ static void check_dfs_preorder(struct maple_tree *mt)
+ /* End of depth first search tests */
+
+ /* Preallocation testing */
+-static noinline void check_prealloc(struct maple_tree *mt)
++static noinline void __init check_prealloc(struct maple_tree *mt)
+ {
+ unsigned long i, max = 100;
+ unsigned long allocated;
+@@ -35494,7 +35495,7 @@ static noinline void check_prealloc(struct maple_tree *mt)
+ /* End of preallocation testing */
+
+ /* Spanning writes, writes that span nodes and layers of the tree */
+-static noinline void check_spanning_write(struct maple_tree *mt)
++static noinline void __init check_spanning_write(struct maple_tree *mt)
+ {
+ unsigned long i, max = 5000;
+ MA_STATE(mas, mt, 1200, 2380);
+@@ -35662,7 +35663,7 @@ static noinline void check_spanning_write(struct maple_tree *mt)
+ /* End of spanning write testing */
+
+ /* Writes to a NULL area that are adjacent to other NULLs */
+-static noinline void check_null_expand(struct maple_tree *mt)
++static noinline void __init check_null_expand(struct maple_tree *mt)
+ {
+ unsigned long i, max = 100;
+ unsigned char data_end;
+@@ -35723,7 +35724,7 @@ static noinline void check_null_expand(struct maple_tree *mt)
+ /* End of NULL area expansions */
+
+ /* Checking for no memory is best done outside the kernel */
+-static noinline void check_nomem(struct maple_tree *mt)
++static noinline void __init check_nomem(struct maple_tree *mt)
+ {
+ MA_STATE(ms, mt, 1, 1);
+
+@@ -35758,7 +35759,7 @@ static noinline void check_nomem(struct maple_tree *mt)
+ mtree_destroy(mt);
+ }
+
+-static noinline void check_locky(struct maple_tree *mt)
++static noinline void __init check_locky(struct maple_tree *mt)
+ {
+ MA_STATE(ms, mt, 2, 2);
+ MA_STATE(reader, mt, 2, 2);
+diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
+index 0ae8cafde439f..a40c35c90c52b 100755
+--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
++++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
+@@ -156,9 +156,7 @@ check_tools()
+ elif ! iptables -V &> /dev/null; then
+ echo "SKIP: Could not run all tests without iptables tool"
+ exit $ksft_skip
+- fi
+-
+- if ! ip6tables -V &> /dev/null; then
++ elif ! ip6tables -V &> /dev/null; then
+ echo "SKIP: Could not run all tests without ip6tables tool"
+ exit $ksft_skip
+ fi
+diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
+index 65f94f592ff88..1f40d6bc17bc3 100644
+--- a/virt/kvm/kvm_main.c
++++ b/virt/kvm/kvm_main.c
+@@ -4047,8 +4047,17 @@ static ssize_t kvm_vcpu_stats_read(struct file *file, char __user *user_buffer,
+ sizeof(vcpu->stat), user_buffer, size, offset);
+ }
+
++static int kvm_vcpu_stats_release(struct inode *inode, struct file *file)
++{
++ struct kvm_vcpu *vcpu = file->private_data;
++
++ kvm_put_kvm(vcpu->kvm);
++ return 0;
++}
++
+ static const struct file_operations kvm_vcpu_stats_fops = {
+ .read = kvm_vcpu_stats_read,
++ .release = kvm_vcpu_stats_release,
+ .llseek = noop_llseek,
+ };
+
+@@ -4069,6 +4078,9 @@ static int kvm_vcpu_ioctl_get_stats_fd(struct kvm_vcpu *vcpu)
+ put_unused_fd(fd);
+ return PTR_ERR(file);
+ }
++
++ kvm_get_kvm(vcpu->kvm);
++
+ file->f_mode |= FMODE_PREAD;
+ fd_install(fd, file);
+
+@@ -4712,8 +4724,17 @@ static ssize_t kvm_vm_stats_read(struct file *file, char __user *user_buffer,
+ sizeof(kvm->stat), user_buffer, size, offset);
+ }
+
++static int kvm_vm_stats_release(struct inode *inode, struct file *file)
++{
++ struct kvm *kvm = file->private_data;
++
++ kvm_put_kvm(kvm);
++ return 0;
++}
++
+ static const struct file_operations kvm_vm_stats_fops = {
+ .read = kvm_vm_stats_read,
++ .release = kvm_vm_stats_release,
+ .llseek = noop_llseek,
+ };
+
+@@ -4732,6 +4753,9 @@ static int kvm_vm_ioctl_get_stats_fd(struct kvm *kvm)
+ put_unused_fd(fd);
+ return PTR_ERR(file);
+ }
++
++ kvm_get_kvm(kvm);
++
+ file->f_mode |= FMODE_PREAD;
+ fd_install(fd, file);
+
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-08 18:39 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-08-08 18:39 UTC (permalink / raw
To: gentoo-commits
commit: 78afbd5049dbaef69f2dbe1c2611f1a2f3292dce
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Tue Aug 8 18:39:34 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Tue Aug 8 18:39:34 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=78afbd50
Linux patch 6.4.9
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1008_linux-6.4.9.patch | 2648 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 2652 insertions(+)
diff --git a/0000_README b/0000_README
index 22c14f8e..65dbf206 100644
--- a/0000_README
+++ b/0000_README
@@ -75,6 +75,10 @@ Patch: 1007_linux-6.4.8.patch
From: https://www.kernel.org
Desc: Linux 6.4.8
+Patch: 1008_linux-6.4.9.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.9
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1008_linux-6.4.9.patch b/1008_linux-6.4.9.patch
new file mode 100644
index 00000000..e65d4323
--- /dev/null
+++ b/1008_linux-6.4.9.patch
@@ -0,0 +1,2648 @@
+diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
+index f54867cadb0f6..13c01b641dc70 100644
+--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
++++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
+@@ -513,17 +513,18 @@ Description: information about CPUs heterogeneity.
+ cpu_capacity: capacity of cpuX.
+
+ What: /sys/devices/system/cpu/vulnerabilities
++ /sys/devices/system/cpu/vulnerabilities/gather_data_sampling
++ /sys/devices/system/cpu/vulnerabilities/itlb_multihit
++ /sys/devices/system/cpu/vulnerabilities/l1tf
++ /sys/devices/system/cpu/vulnerabilities/mds
+ /sys/devices/system/cpu/vulnerabilities/meltdown
++ /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
++ /sys/devices/system/cpu/vulnerabilities/retbleed
++ /sys/devices/system/cpu/vulnerabilities/spec_store_bypass
+ /sys/devices/system/cpu/vulnerabilities/spectre_v1
+ /sys/devices/system/cpu/vulnerabilities/spectre_v2
+- /sys/devices/system/cpu/vulnerabilities/spec_store_bypass
+- /sys/devices/system/cpu/vulnerabilities/l1tf
+- /sys/devices/system/cpu/vulnerabilities/mds
+ /sys/devices/system/cpu/vulnerabilities/srbds
+ /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+- /sys/devices/system/cpu/vulnerabilities/itlb_multihit
+- /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
+- /sys/devices/system/cpu/vulnerabilities/retbleed
+ Date: January 2018
+ Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
+ Description: Information about CPU vulnerabilities
+diff --git a/Documentation/admin-guide/hw-vuln/gather_data_sampling.rst b/Documentation/admin-guide/hw-vuln/gather_data_sampling.rst
+new file mode 100644
+index 0000000000000..264bfa937f7de
+--- /dev/null
++++ b/Documentation/admin-guide/hw-vuln/gather_data_sampling.rst
+@@ -0,0 +1,109 @@
++.. SPDX-License-Identifier: GPL-2.0
++
++GDS - Gather Data Sampling
++==========================
++
++Gather Data Sampling is a hardware vulnerability which allows unprivileged
++speculative access to data which was previously stored in vector registers.
++
++Problem
++-------
++When a gather instruction performs loads from memory, different data elements
++are merged into the destination vector register. However, when a gather
++instruction that is transiently executed encounters a fault, stale data from
++architectural or internal vector registers may get transiently forwarded to the
++destination vector register instead. This will allow a malicious attacker to
++infer stale data using typical side channel techniques like cache timing
++attacks. GDS is a purely sampling-based attack.
++
++The attacker uses gather instructions to infer the stale vector register data.
++The victim does not need to do anything special other than use the vector
++registers. The victim does not need to use gather instructions to be
++vulnerable.
++
++Because the buffers are shared between Hyper-Threads, cross Hyper-Thread attacks
++are possible.
++
++Attack scenarios
++----------------
++Without mitigation, GDS can infer stale data across virtually all
++permission boundaries:
++
++ Non-enclaves can infer SGX enclave data
++ Userspace can infer kernel data
++ Guests can infer data from hosts
++ Guests can infer data from other guests
++ Users can infer data from other users
++
++Because of this, it is important to ensure that the mitigation stays enabled in
++lower-privilege contexts like guests and when running outside SGX enclaves.
++
++The hardware enforces the mitigation for SGX. Likewise, VMMs should ensure
++that guests are not allowed to disable the GDS mitigation. If a host erred and
++allowed this, a guest could theoretically disable GDS mitigation, mount an
++attack, and re-enable it.
++
++Mitigation mechanism
++--------------------
++This issue is mitigated in microcode. The microcode defines the following new
++bits:
++
++ ================================ === ============================
++ IA32_ARCH_CAPABILITIES[GDS_CTRL] R/O Enumerates GDS vulnerability
++ and mitigation support.
++ IA32_ARCH_CAPABILITIES[GDS_NO] R/O Processor is not vulnerable.
++ IA32_MCU_OPT_CTRL[GDS_MITG_DIS] R/W Disables the mitigation
++ 0 by default.
++ IA32_MCU_OPT_CTRL[GDS_MITG_LOCK] R/W Locks GDS_MITG_DIS=0. Writes
++ to GDS_MITG_DIS are ignored
++ Can't be cleared once set.
++ ================================ === ============================
++
++GDS can also be mitigated on systems that don't have updated microcode by
++disabling AVX. This can be done by setting gather_data_sampling="force" or
++"clearcpuid=avx" on the kernel command-line.
++
++If used, these options will disable AVX use by turning off XSAVE YMM support.
++However, the processor will still enumerate AVX support. Userspace that
++does not follow proper AVX enumeration to check both AVX *and* XSAVE YMM
++support will break.
++
++Mitigation control on the kernel command line
++---------------------------------------------
++The mitigation can be disabled by setting "gather_data_sampling=off" or
++"mitigations=off" on the kernel command line. Not specifying either will default
++to the mitigation being enabled. Specifying "gather_data_sampling=force" will
++use the microcode mitigation when available or disable AVX on affected systems
++where the microcode hasn't been updated to include the mitigation.
++
++GDS System Information
++------------------------
++The kernel provides vulnerability status information through sysfs. For
++GDS this can be accessed by the following sysfs file:
++
++/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
++
++The possible values contained in this file are:
++
++ ============================== =============================================
++ Not affected Processor not vulnerable.
++ Vulnerable Processor vulnerable and mitigation disabled.
++ Vulnerable: No microcode Processor vulnerable and microcode is missing
++ mitigation.
++ Mitigation: AVX disabled,
++ no microcode Processor is vulnerable and microcode is missing
++ mitigation. AVX disabled as mitigation.
++ Mitigation: Microcode Processor is vulnerable and mitigation is in
++ effect.
++ Mitigation: Microcode (locked) Processor is vulnerable and mitigation is in
++ effect and cannot be disabled.
++ Unknown: Dependent on
++ hypervisor status Running on a virtual guest processor that is
++ affected but with no way to know if host
++ processor is mitigated or vulnerable.
++ ============================== =============================================
++
++GDS Default mitigation
++----------------------
++The updated microcode will enable the mitigation by default. The kernel's
++default action is to leave the mitigation enabled.
+diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
+index e0614760a99e7..436fac0bd9c35 100644
+--- a/Documentation/admin-guide/hw-vuln/index.rst
++++ b/Documentation/admin-guide/hw-vuln/index.rst
+@@ -19,3 +19,4 @@ are configurable at compile, boot or run time.
+ l1d_flush.rst
+ processor_mmio_stale_data.rst
+ cross-thread-rsb.rst
++ gather_data_sampling.rst
+diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
+new file mode 100644
+index 0000000000000..2f923c805802f
+--- /dev/null
++++ b/Documentation/admin-guide/hw-vuln/srso.rst
+@@ -0,0 +1,133 @@
++.. SPDX-License-Identifier: GPL-2.0
++
++Speculative Return Stack Overflow (SRSO)
++========================================
++
++This is a mitigation for the speculative return stack overflow (SRSO)
++vulnerability found on AMD processors. The mechanism is by now the well
++known scenario of poisoning CPU functional units - the Branch Target
++Buffer (BTB) and Return Address Predictor (RAP) in this case - and then
++tricking the elevated privilege domain (the kernel) into leaking
++sensitive data.
++
++AMD CPUs predict RET instructions using a Return Address Predictor (aka
++Return Address Stack/Return Stack Buffer). In some cases, a non-architectural
++CALL instruction (i.e., an instruction predicted to be a CALL but is
++not actually a CALL) can create an entry in the RAP which may be used
++to predict the target of a subsequent RET instruction.
++
++The specific circumstances that lead to this vary by microarchitecture
++but the concern is that an attacker can mis-train the CPU BTB to predict
++non-architectural CALL instructions in kernel space and use this to
++control the speculative target of a subsequent kernel RET, potentially
++leading to information disclosure via a speculative side-channel.
++
++The issue is tracked under CVE-2023-20569.
++
++Affected processors
++-------------------
++
++AMD Zen, generations 1-4. That is, all families 0x17 and 0x19. Older
++processors have not been investigated.
++
++System information and options
++------------------------------
++
++First of all, it is required that the latest microcode be loaded for
++mitigations to be effective.
++
++The sysfs file showing SRSO mitigation status is:
++
++ /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
++
++The possible values in this file are:
++
++ - 'Not affected' The processor is not vulnerable
++
++ - 'Vulnerable: no microcode' The processor is vulnerable, no
++ microcode extending IBPB functionality
++ to address the vulnerability has been
++ applied.
++
++ - 'Mitigation: microcode' Extended IBPB functionality microcode
++ patch has been applied. It does not
++ address User->Kernel and Guest->Host
++ transitions protection but it does
++ address User->User and VM->VM attack
++ vectors.
++
++ (spec_rstack_overflow=microcode)
++
++ - 'Mitigation: safe RET' Software-only mitigation. It complements
++ the extended IBPB microcode patch
++ functionality by addressing User->Kernel
++ and Guest->Host transitions protection.
++
++ Selected by default or by
++ spec_rstack_overflow=safe-ret
++
++ - 'Mitigation: IBPB' Similar protection as "safe RET" above
++ but employs an IBPB barrier on privilege
++ domain crossings (User->Kernel,
++ Guest->Host).
++
++ (spec_rstack_overflow=ibpb)
++
++ - 'Mitigation: IBPB on VMEXIT' Mitigation addressing the cloud provider
++ scenario - the Guest->Host transitions
++ only.
++
++ (spec_rstack_overflow=ibpb-vmexit)
++
++In order to exploit the vulnerability, an attacker needs to:
++
++ - gain local access on the machine
++
++ - break kASLR
++
++ - find gadgets in the running kernel in order to use them in the exploit
++
++ - potentially create and pin an additional workload on the sibling
++ thread, depending on the microarchitecture (not necessary on fam 0x19)
++
++ - run the exploit
++
++Considering the performance implications of each mitigation type, the
++default one is 'Mitigation: safe RET' which should take care of most
++attack vectors, including the local User->Kernel one.
++
++As always, the user is advised to keep her/his system up-to-date by
++applying software updates regularly.
++
++The default setting will be reevaluated when needed and especially when
++new attack vectors appear.
++
++As one can surmise, 'Mitigation: safe RET' does come at the cost of some
++performance depending on the workload. If one trusts her/his userspace
++and does not want to suffer the performance impact, one can always
++disable the mitigation with spec_rstack_overflow=off.
++
++Similarly, 'Mitigation: IBPB' is another full mitigation type employing
++an indirect branch prediction barrier after having applied the required
++microcode patch for one's system. This mitigation also comes at
++a performance cost.
++
++Mitigation: safe RET
++--------------------
++
++The mitigation works by ensuring all RET instructions speculate to
++a controlled location, similar to how speculation is controlled in the
++retpoline sequence. To accomplish this, the __x86_return_thunk forces
++the CPU to mispredict every function return using a 'safe return'
++sequence.
++
++To ensure the safety of this mitigation, the kernel must ensure that the
++safe return sequence is itself free from attacker interference. In Zen3
++and Zen4, this is accomplished by creating a BTB alias between the
++untraining function srso_untrain_ret_alias() and the safe return
++function srso_safe_ret_alias() which results in evicting a potentially
++poisoned BTB entry and using that safe one for all function returns.
++
++In older Zen1 and Zen2, this is accomplished using a reinterpretation
++technique similar to the Retbleed one: srso_untrain_ret() and
++srso_safe_ret().
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index 9e5bab29685ff..a8fc0eb6fb1d6 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -1627,6 +1627,26 @@
+ Format: off | on
+ default: on
+
++ gather_data_sampling=
++ [X86,INTEL] Control the Gather Data Sampling (GDS)
++ mitigation.
++
++ Gather Data Sampling is a hardware vulnerability which
++ allows unprivileged speculative access to data which was
++ previously stored in vector registers.
++
++ This issue is mitigated by default in updated microcode.
++ The mitigation may have a performance impact but can be
++ disabled. On systems without the microcode mitigation
++ disabling AVX serves as a mitigation.
++
++ force: Disable AVX to mitigate systems without
++ microcode mitigation. No effect if the microcode
++ mitigation is present. Known to cause crashes in
++ userspace with buggy AVX enumeration.
++
++ off: Disable GDS mitigation.
++
+ gcov_persist= [GCOV] When non-zero (default), profiling data for
+ kernel modules is saved and remains accessible via
+ debugfs, even when the module is unloaded/reloaded.
+@@ -3262,24 +3282,25 @@
+ Disable all optional CPU mitigations. This
+ improves system performance, but it may also
+ expose users to several CPU vulnerabilities.
+- Equivalent to: nopti [X86,PPC]
+- if nokaslr then kpti=0 [ARM64]
+- nospectre_v1 [X86,PPC]
+- nobp=0 [S390]
+- nospectre_v2 [X86,PPC,S390,ARM64]
+- spectre_v2_user=off [X86]
+- spec_store_bypass_disable=off [X86,PPC]
+- ssbd=force-off [ARM64]
+- nospectre_bhb [ARM64]
++ Equivalent to: if nokaslr then kpti=0 [ARM64]
++ gather_data_sampling=off [X86]
++ kvm.nx_huge_pages=off [X86]
+ l1tf=off [X86]
+ mds=off [X86]
+- tsx_async_abort=off [X86]
+- kvm.nx_huge_pages=off [X86]
+- srbds=off [X86,INTEL]
++ mmio_stale_data=off [X86]
+ no_entry_flush [PPC]
+ no_uaccess_flush [PPC]
+- mmio_stale_data=off [X86]
++ nobp=0 [S390]
++ nopti [X86,PPC]
++ nospectre_bhb [ARM64]
++ nospectre_v1 [X86,PPC]
++ nospectre_v2 [X86,PPC,S390,ARM64]
+ retbleed=off [X86]
++ spec_store_bypass_disable=off [X86,PPC]
++ spectre_v2_user=off [X86]
++ srbds=off [X86,INTEL]
++ ssbd=force-off [ARM64]
++ tsx_async_abort=off [X86]
+
+ Exceptions:
+ This does not have any effect on
+@@ -5839,6 +5860,17 @@
+ Not specifying this option is equivalent to
+ spectre_v2_user=auto.
+
++ spec_rstack_overflow=
++ [X86] Control RAS overflow mitigation on AMD Zen CPUs
++
++ off - Disable mitigation
++ microcode - Enable microcode mitigation only
++ safe-ret - Enable sw-only safe RET mitigation (default)
++ ibpb - Enable mitigation by issuing IBPB on
++ kernel entry
++ ibpb-vmexit - Issue IBPB only on VMEXIT
++ (cloud-specific mitigation)
++
+ spec_store_bypass_disable=
+ [HW] Control Speculative Store Bypass (SSB) Disable mitigation
+ (Speculative Store Bypass vulnerability)
+diff --git a/Makefile b/Makefile
+index 9607ce0b8a10c..5547e02f6104a 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 8
++SUBLEVEL = 9
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/Kconfig b/arch/Kconfig
+index 205fd23e0cada..171e6b5e61b8a 100644
+--- a/arch/Kconfig
++++ b/arch/Kconfig
+@@ -285,6 +285,9 @@ config ARCH_HAS_DMA_SET_UNCACHED
+ config ARCH_HAS_DMA_CLEAR_UNCACHED
+ bool
+
++config ARCH_HAS_CPU_FINALIZE_INIT
++ bool
++
+ # Select if arch init_task must go in the __init_task_data section
+ config ARCH_TASK_STRUCT_ON_STACK
+ bool
+diff --git a/arch/alpha/include/asm/bugs.h b/arch/alpha/include/asm/bugs.h
+deleted file mode 100644
+index 78030d1c7e7e0..0000000000000
+--- a/arch/alpha/include/asm/bugs.h
++++ /dev/null
+@@ -1,20 +0,0 @@
+-/*
+- * include/asm-alpha/bugs.h
+- *
+- * Copyright (C) 1994 Linus Torvalds
+- */
+-
+-/*
+- * This is included by init/main.c to check for architecture-dependent bugs.
+- *
+- * Needs:
+- * void check_bugs(void);
+- */
+-
+-/*
+- * I don't know of any alpha bugs yet.. Nice chip
+- */
+-
+-static void check_bugs(void)
+-{
+-}
+diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
+index 9ed7f03ba15a3..7286fbedbe984 100644
+--- a/arch/arm/Kconfig
++++ b/arch/arm/Kconfig
+@@ -5,6 +5,7 @@ config ARM
+ select ARCH_32BIT_OFF_T
+ select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE if HAVE_KRETPROBES && FRAME_POINTER && !ARM_UNWIND
+ select ARCH_HAS_BINFMT_FLAT
++ select ARCH_HAS_CPU_FINALIZE_INIT if MMU
+ select ARCH_HAS_CURRENT_STACK_POINTER
+ select ARCH_HAS_DEBUG_VIRTUAL if MMU
+ select ARCH_HAS_DMA_WRITE_COMBINE if !ARM_DMA_MEM_BUFFERABLE
+diff --git a/arch/arm/include/asm/bugs.h b/arch/arm/include/asm/bugs.h
+index 97a312ba08401..fe385551edeca 100644
+--- a/arch/arm/include/asm/bugs.h
++++ b/arch/arm/include/asm/bugs.h
+@@ -1,7 +1,5 @@
+ /* SPDX-License-Identifier: GPL-2.0-only */
+ /*
+- * arch/arm/include/asm/bugs.h
+- *
+ * Copyright (C) 1995-2003 Russell King
+ */
+ #ifndef __ASM_BUGS_H
+@@ -10,10 +8,8 @@
+ extern void check_writebuffer_bugs(void);
+
+ #ifdef CONFIG_MMU
+-extern void check_bugs(void);
+ extern void check_other_bugs(void);
+ #else
+-#define check_bugs() do { } while (0)
+ #define check_other_bugs() do { } while (0)
+ #endif
+
+diff --git a/arch/arm/kernel/bugs.c b/arch/arm/kernel/bugs.c
+index 14c8dbbb7d2df..087bce6ec8e9b 100644
+--- a/arch/arm/kernel/bugs.c
++++ b/arch/arm/kernel/bugs.c
+@@ -1,5 +1,6 @@
+ // SPDX-License-Identifier: GPL-2.0
+ #include <linux/init.h>
++#include <linux/cpu.h>
+ #include <asm/bugs.h>
+ #include <asm/proc-fns.h>
+
+@@ -11,7 +12,7 @@ void check_other_bugs(void)
+ #endif
+ }
+
+-void __init check_bugs(void)
++void __init arch_cpu_finalize_init(void)
+ {
+ check_writebuffer_bugs();
+ check_other_bugs();
+diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
+index 21fa63ce5ffc0..2cd93e6bf0fec 100644
+--- a/arch/ia64/Kconfig
++++ b/arch/ia64/Kconfig
+@@ -9,6 +9,7 @@ menu "Processor type and features"
+ config IA64
+ bool
+ select ARCH_BINFMT_ELF_EXTRA_PHDRS
++ select ARCH_HAS_CPU_FINALIZE_INIT
+ select ARCH_HAS_DMA_MARK_CLEAN
+ select ARCH_HAS_STRNCPY_FROM_USER
+ select ARCH_HAS_STRNLEN_USER
+diff --git a/arch/ia64/include/asm/bugs.h b/arch/ia64/include/asm/bugs.h
+deleted file mode 100644
+index 0d6b9bded56c6..0000000000000
+--- a/arch/ia64/include/asm/bugs.h
++++ /dev/null
+@@ -1,20 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-/*
+- * This is included by init/main.c to check for architecture-dependent bugs.
+- *
+- * Needs:
+- * void check_bugs(void);
+- *
+- * Based on <asm-alpha/bugs.h>.
+- *
+- * Modified 1998, 1999, 2003
+- * David Mosberger-Tang <davidm@hpl.hp.com>, Hewlett-Packard Co.
+- */
+-#ifndef _ASM_IA64_BUGS_H
+-#define _ASM_IA64_BUGS_H
+-
+-#include <asm/processor.h>
+-
+-extern void check_bugs (void);
+-
+-#endif /* _ASM_IA64_BUGS_H */
+diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
+index c057280442727..9009f1871e3b8 100644
+--- a/arch/ia64/kernel/setup.c
++++ b/arch/ia64/kernel/setup.c
+@@ -1067,8 +1067,7 @@ cpu_init (void)
+ }
+ }
+
+-void __init
+-check_bugs (void)
++void __init arch_cpu_finalize_init(void)
+ {
+ ia64_patch_mckinley_e9((unsigned long) __start___mckinley_e9_bundles,
+ (unsigned long) __end___mckinley_e9_bundles);
+diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
+index 2570e7c1eb75f..72fbfa1cc1577 100644
+--- a/arch/loongarch/Kconfig
++++ b/arch/loongarch/Kconfig
+@@ -10,6 +10,7 @@ config LOONGARCH
+ select ARCH_ENABLE_MEMORY_HOTPLUG
+ select ARCH_ENABLE_MEMORY_HOTREMOVE
+ select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
++ select ARCH_HAS_CPU_FINALIZE_INIT
+ select ARCH_HAS_FORTIFY_SOURCE
+ select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
+ select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+diff --git a/arch/loongarch/include/asm/bugs.h b/arch/loongarch/include/asm/bugs.h
+deleted file mode 100644
+index 98396535163b3..0000000000000
+--- a/arch/loongarch/include/asm/bugs.h
++++ /dev/null
+@@ -1,15 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-/*
+- * This is included by init/main.c to check for architecture-dependent bugs.
+- *
+- * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
+- */
+-#ifndef _ASM_BUGS_H
+-#define _ASM_BUGS_H
+-
+-#include <asm/cpu.h>
+-#include <asm/cpu-info.h>
+-
+-extern void check_bugs(void);
+-
+-#endif /* _ASM_BUGS_H */
+diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
+index 4444b13418f0e..78a00359bde3c 100644
+--- a/arch/loongarch/kernel/setup.c
++++ b/arch/loongarch/kernel/setup.c
+@@ -12,6 +12,7 @@
+ */
+ #include <linux/init.h>
+ #include <linux/acpi.h>
++#include <linux/cpu.h>
+ #include <linux/dmi.h>
+ #include <linux/efi.h>
+ #include <linux/export.h>
+@@ -37,7 +38,6 @@
+ #include <asm/addrspace.h>
+ #include <asm/alternative.h>
+ #include <asm/bootinfo.h>
+-#include <asm/bugs.h>
+ #include <asm/cache.h>
+ #include <asm/cpu.h>
+ #include <asm/dma.h>
+@@ -87,7 +87,7 @@ const char *get_system_type(void)
+ return "generic-loongson-machine";
+ }
+
+-void __init check_bugs(void)
++void __init arch_cpu_finalize_init(void)
+ {
+ alternative_instructions();
+ }
+diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
+index 40198a1ebe274..dc792b321f1e9 100644
+--- a/arch/m68k/Kconfig
++++ b/arch/m68k/Kconfig
+@@ -4,6 +4,7 @@ config M68K
+ default y
+ select ARCH_32BIT_OFF_T
+ select ARCH_HAS_BINFMT_FLAT
++ select ARCH_HAS_CPU_FINALIZE_INIT if MMU
+ select ARCH_HAS_CURRENT_STACK_POINTER
+ select ARCH_HAS_DMA_PREP_COHERENT if HAS_DMA && MMU && !COLDFIRE
+ select ARCH_HAS_SYNC_DMA_FOR_DEVICE if HAS_DMA
+diff --git a/arch/m68k/include/asm/bugs.h b/arch/m68k/include/asm/bugs.h
+deleted file mode 100644
+index 745530651e0bf..0000000000000
+--- a/arch/m68k/include/asm/bugs.h
++++ /dev/null
+@@ -1,21 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-/*
+- * include/asm-m68k/bugs.h
+- *
+- * Copyright (C) 1994 Linus Torvalds
+- */
+-
+-/*
+- * This is included by init/main.c to check for architecture-dependent bugs.
+- *
+- * Needs:
+- * void check_bugs(void);
+- */
+-
+-#ifdef CONFIG_MMU
+-extern void check_bugs(void); /* in arch/m68k/kernel/setup.c */
+-#else
+-static void check_bugs(void)
+-{
+-}
+-#endif
+diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
+index fbff1cea62caa..6f1ae01f322cf 100644
+--- a/arch/m68k/kernel/setup_mm.c
++++ b/arch/m68k/kernel/setup_mm.c
+@@ -10,6 +10,7 @@
+ */
+
+ #include <linux/kernel.h>
++#include <linux/cpu.h>
+ #include <linux/mm.h>
+ #include <linux/sched.h>
+ #include <linux/delay.h>
+@@ -504,7 +505,7 @@ static int __init proc_hardware_init(void)
+ module_init(proc_hardware_init);
+ #endif
+
+-void check_bugs(void)
++void __init arch_cpu_finalize_init(void)
+ {
+ #if defined(CONFIG_FPU) && !defined(CONFIG_M68KFPU_EMU)
+ if (m68k_fputype == 0) {
+diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
+index 6796d839bcfdf..cb9b77e6f18c3 100644
+--- a/arch/mips/Kconfig
++++ b/arch/mips/Kconfig
+@@ -4,6 +4,7 @@ config MIPS
+ default y
+ select ARCH_32BIT_OFF_T if !64BIT
+ select ARCH_BINFMT_ELF_STATE if MIPS_FP_SUPPORT
++ select ARCH_HAS_CPU_FINALIZE_INIT
+ select ARCH_HAS_CURRENT_STACK_POINTER if !CC_IS_CLANG || CLANG_VERSION >= 140000
+ select ARCH_HAS_DEBUG_VIRTUAL if !64BIT
+ select ARCH_HAS_FORTIFY_SOURCE
+diff --git a/arch/mips/include/asm/bugs.h b/arch/mips/include/asm/bugs.h
+index 653f78f3a6852..84be74afcb9a3 100644
+--- a/arch/mips/include/asm/bugs.h
++++ b/arch/mips/include/asm/bugs.h
+@@ -1,17 +1,11 @@
+ /* SPDX-License-Identifier: GPL-2.0 */
+ /*
+- * This is included by init/main.c to check for architecture-dependent bugs.
+- *
+ * Copyright (C) 2007 Maciej W. Rozycki
+- *
+- * Needs:
+- * void check_bugs(void);
+ */
+ #ifndef _ASM_BUGS_H
+ #define _ASM_BUGS_H
+
+ #include <linux/bug.h>
+-#include <linux/delay.h>
+ #include <linux/smp.h>
+
+ #include <asm/cpu.h>
+@@ -24,17 +18,6 @@ extern void check_bugs64_early(void);
+ extern void check_bugs32(void);
+ extern void check_bugs64(void);
+
+-static inline void __init check_bugs(void)
+-{
+- unsigned int cpu = smp_processor_id();
+-
+- cpu_data[cpu].udelay_val = loops_per_jiffy;
+- check_bugs32();
+-
+- if (IS_ENABLED(CONFIG_CPU_R4X00_BUGS64))
+- check_bugs64();
+-}
+-
+ static inline int r4k_daddiu_bug(void)
+ {
+ if (!IS_ENABLED(CONFIG_CPU_R4X00_BUGS64))
+diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
+index c0e65135481b7..cb871eb784a7c 100644
+--- a/arch/mips/kernel/setup.c
++++ b/arch/mips/kernel/setup.c
+@@ -11,6 +11,8 @@
+ * Copyright (C) 2000, 2001, 2002, 2007 Maciej W. Rozycki
+ */
+ #include <linux/init.h>
++#include <linux/cpu.h>
++#include <linux/delay.h>
+ #include <linux/ioport.h>
+ #include <linux/export.h>
+ #include <linux/screen_info.h>
+@@ -841,3 +843,14 @@ static int __init setnocoherentio(char *str)
+ }
+ early_param("nocoherentio", setnocoherentio);
+ #endif
++
++void __init arch_cpu_finalize_init(void)
++{
++ unsigned int cpu = smp_processor_id();
++
++ cpu_data[cpu].udelay_val = loops_per_jiffy;
++ check_bugs32();
++
++ if (IS_ENABLED(CONFIG_CPU_R4X00_BUGS64))
++ check_bugs64();
++}
+diff --git a/arch/parisc/include/asm/bugs.h b/arch/parisc/include/asm/bugs.h
+deleted file mode 100644
+index 0a7f9db6bd1c7..0000000000000
+--- a/arch/parisc/include/asm/bugs.h
++++ /dev/null
+@@ -1,20 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-/*
+- * include/asm-parisc/bugs.h
+- *
+- * Copyright (C) 1999 Mike Shaver
+- */
+-
+-/*
+- * This is included by init/main.c to check for architecture-dependent bugs.
+- *
+- * Needs:
+- * void check_bugs(void);
+- */
+-
+-#include <asm/processor.h>
+-
+-static inline void check_bugs(void)
+-{
+-// identify_cpu(&boot_cpu_data);
+-}
+diff --git a/arch/powerpc/include/asm/bugs.h b/arch/powerpc/include/asm/bugs.h
+deleted file mode 100644
+index 01b8f6ca4dbbc..0000000000000
+--- a/arch/powerpc/include/asm/bugs.h
++++ /dev/null
+@@ -1,15 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0-or-later */
+-#ifndef _ASM_POWERPC_BUGS_H
+-#define _ASM_POWERPC_BUGS_H
+-
+-/*
+- */
+-
+-/*
+- * This file is included by 'init/main.c' to check for
+- * architecture-dependent bugs.
+- */
+-
+-static inline void check_bugs(void) { }
+-
+-#endif /* _ASM_POWERPC_BUGS_H */
+diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
+index 393023d092450..2b3ce4fd39563 100644
+--- a/arch/sh/Kconfig
++++ b/arch/sh/Kconfig
+@@ -6,6 +6,7 @@ config SUPERH
+ select ARCH_ENABLE_MEMORY_HOTREMOVE if SPARSEMEM && MMU
+ select ARCH_HAVE_NMI_SAFE_CMPXCHG if (GUSA_RB || CPU_SH4A)
+ select ARCH_HAS_BINFMT_FLAT if !MMU
++ select ARCH_HAS_CPU_FINALIZE_INIT
+ select ARCH_HAS_CURRENT_STACK_POINTER
+ select ARCH_HAS_GIGANTIC_PAGE
+ select ARCH_HAS_GCOV_PROFILE_ALL
+diff --git a/arch/sh/include/asm/bugs.h b/arch/sh/include/asm/bugs.h
+deleted file mode 100644
+index fe52abb69cea3..0000000000000
+--- a/arch/sh/include/asm/bugs.h
++++ /dev/null
+@@ -1,74 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-#ifndef __ASM_SH_BUGS_H
+-#define __ASM_SH_BUGS_H
+-
+-/*
+- * This is included by init/main.c to check for architecture-dependent bugs.
+- *
+- * Needs:
+- * void check_bugs(void);
+- */
+-
+-/*
+- * I don't know of any Super-H bugs yet.
+- */
+-
+-#include <asm/processor.h>
+-
+-extern void select_idle_routine(void);
+-
+-static void __init check_bugs(void)
+-{
+- extern unsigned long loops_per_jiffy;
+- char *p = &init_utsname()->machine[2]; /* "sh" */
+-
+- select_idle_routine();
+-
+- current_cpu_data.loops_per_jiffy = loops_per_jiffy;
+-
+- switch (current_cpu_data.family) {
+- case CPU_FAMILY_SH2:
+- *p++ = '2';
+- break;
+- case CPU_FAMILY_SH2A:
+- *p++ = '2';
+- *p++ = 'a';
+- break;
+- case CPU_FAMILY_SH3:
+- *p++ = '3';
+- break;
+- case CPU_FAMILY_SH4:
+- *p++ = '4';
+- break;
+- case CPU_FAMILY_SH4A:
+- *p++ = '4';
+- *p++ = 'a';
+- break;
+- case CPU_FAMILY_SH4AL_DSP:
+- *p++ = '4';
+- *p++ = 'a';
+- *p++ = 'l';
+- *p++ = '-';
+- *p++ = 'd';
+- *p++ = 's';
+- *p++ = 'p';
+- break;
+- case CPU_FAMILY_UNKNOWN:
+- /*
+- * Specifically use CPU_FAMILY_UNKNOWN rather than
+- * default:, so we're able to have the compiler whine
+- * about unhandled enumerations.
+- */
+- break;
+- }
+-
+- printk("CPU: %s\n", get_cpu_subtype(¤t_cpu_data));
+-
+-#ifndef __LITTLE_ENDIAN__
+- /* 'eb' means 'Endian Big' */
+- *p++ = 'e';
+- *p++ = 'b';
+-#endif
+- *p = '\0';
+-}
+-#endif /* __ASM_SH_BUGS_H */
+diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h
+index 85a6c1c3c16e7..73fba7c922f92 100644
+--- a/arch/sh/include/asm/processor.h
++++ b/arch/sh/include/asm/processor.h
+@@ -166,6 +166,8 @@ extern unsigned int instruction_size(unsigned int insn);
+ #define instruction_size(insn) (2)
+ #endif
+
++void select_idle_routine(void);
++
+ #endif /* __ASSEMBLY__ */
+
+ #include <asm/processor_32.h>
+diff --git a/arch/sh/kernel/idle.c b/arch/sh/kernel/idle.c
+index d662503b0665d..045d93f151fd7 100644
+--- a/arch/sh/kernel/idle.c
++++ b/arch/sh/kernel/idle.c
+@@ -15,6 +15,7 @@
+ #include <linux/irqflags.h>
+ #include <linux/smp.h>
+ #include <linux/atomic.h>
++#include <asm/processor.h>
+ #include <asm/smp.h>
+ #include <asm/bl_bit.h>
+
+diff --git a/arch/sh/kernel/setup.c b/arch/sh/kernel/setup.c
+index af977ec4ca5e5..cf7c0f72f2935 100644
+--- a/arch/sh/kernel/setup.c
++++ b/arch/sh/kernel/setup.c
+@@ -43,6 +43,7 @@
+ #include <asm/smp.h>
+ #include <asm/mmu_context.h>
+ #include <asm/mmzone.h>
++#include <asm/processor.h>
+ #include <asm/sparsemem.h>
+ #include <asm/platform_early.h>
+
+@@ -354,3 +355,57 @@ int test_mode_pin(int pin)
+ {
+ return sh_mv.mv_mode_pins() & pin;
+ }
++
++void __init arch_cpu_finalize_init(void)
++{
++ char *p = &init_utsname()->machine[2]; /* "sh" */
++
++ select_idle_routine();
++
++ current_cpu_data.loops_per_jiffy = loops_per_jiffy;
++
++ switch (current_cpu_data.family) {
++ case CPU_FAMILY_SH2:
++ *p++ = '2';
++ break;
++ case CPU_FAMILY_SH2A:
++ *p++ = '2';
++ *p++ = 'a';
++ break;
++ case CPU_FAMILY_SH3:
++ *p++ = '3';
++ break;
++ case CPU_FAMILY_SH4:
++ *p++ = '4';
++ break;
++ case CPU_FAMILY_SH4A:
++ *p++ = '4';
++ *p++ = 'a';
++ break;
++ case CPU_FAMILY_SH4AL_DSP:
++ *p++ = '4';
++ *p++ = 'a';
++ *p++ = 'l';
++ *p++ = '-';
++ *p++ = 'd';
++ *p++ = 's';
++ *p++ = 'p';
++ break;
++ case CPU_FAMILY_UNKNOWN:
++ /*
++ * Specifically use CPU_FAMILY_UNKNOWN rather than
++ * default:, so we're able to have the compiler whine
++ * about unhandled enumerations.
++ */
++ break;
++ }
++
++ pr_info("CPU: %s\n", get_cpu_subtype(¤t_cpu_data));
++
++#ifndef __LITTLE_ENDIAN__
++ /* 'eb' means 'Endian Big' */
++ *p++ = 'e';
++ *p++ = 'b';
++#endif
++ *p = '\0';
++}
+diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
+index 8c196990558b2..bb5e369cc6bdf 100644
+--- a/arch/sparc/Kconfig
++++ b/arch/sparc/Kconfig
+@@ -52,6 +52,7 @@ config SPARC
+ config SPARC32
+ def_bool !64BIT
+ select ARCH_32BIT_OFF_T
++ select ARCH_HAS_CPU_FINALIZE_INIT if !SMP
+ select ARCH_HAS_SYNC_DMA_FOR_CPU
+ select CLZ_TAB
+ select DMA_DIRECT_REMAP
+diff --git a/arch/sparc/include/asm/bugs.h b/arch/sparc/include/asm/bugs.h
+deleted file mode 100644
+index 02fa369b9c21f..0000000000000
+--- a/arch/sparc/include/asm/bugs.h
++++ /dev/null
+@@ -1,18 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-/* include/asm/bugs.h: Sparc probes for various bugs.
+- *
+- * Copyright (C) 1996, 2007 David S. Miller (davem@davemloft.net)
+- */
+-
+-#ifdef CONFIG_SPARC32
+-#include <asm/cpudata.h>
+-#endif
+-
+-extern unsigned long loops_per_jiffy;
+-
+-static void __init check_bugs(void)
+-{
+-#if defined(CONFIG_SPARC32) && !defined(CONFIG_SMP)
+- cpu_data(0).udelay_val = loops_per_jiffy;
+-#endif
+-}
+diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c
+index c8e0dd99f3700..c9d1ba4f311b9 100644
+--- a/arch/sparc/kernel/setup_32.c
++++ b/arch/sparc/kernel/setup_32.c
+@@ -412,3 +412,10 @@ static int __init topology_init(void)
+ }
+
+ subsys_initcall(topology_init);
++
++#if defined(CONFIG_SPARC32) && !defined(CONFIG_SMP)
++void __init arch_cpu_finalize_init(void)
++{
++ cpu_data(0).udelay_val = loops_per_jiffy;
++}
++#endif
+diff --git a/arch/um/Kconfig b/arch/um/Kconfig
+index 541a9b18e3435..887cfb636c268 100644
+--- a/arch/um/Kconfig
++++ b/arch/um/Kconfig
+@@ -6,6 +6,7 @@ config UML
+ bool
+ default y
+ select ARCH_EPHEMERAL_INODES
++ select ARCH_HAS_CPU_FINALIZE_INIT
+ select ARCH_HAS_FORTIFY_SOURCE
+ select ARCH_HAS_GCOV_PROFILE_ALL
+ select ARCH_HAS_KCOV
+diff --git a/arch/um/include/asm/bugs.h b/arch/um/include/asm/bugs.h
+deleted file mode 100644
+index 4473942a08397..0000000000000
+--- a/arch/um/include/asm/bugs.h
++++ /dev/null
+@@ -1,7 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-#ifndef __UM_BUGS_H
+-#define __UM_BUGS_H
+-
+-void check_bugs(void);
+-
+-#endif
+diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c
+index 0a23a98d4ca0a..918fed7ad4d8a 100644
+--- a/arch/um/kernel/um_arch.c
++++ b/arch/um/kernel/um_arch.c
+@@ -3,6 +3,7 @@
+ * Copyright (C) 2000 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
+ */
+
++#include <linux/cpu.h>
+ #include <linux/delay.h>
+ #include <linux/init.h>
+ #include <linux/mm.h>
+@@ -430,7 +431,7 @@ void __init setup_arch(char **cmdline_p)
+ }
+ }
+
+-void __init check_bugs(void)
++void __init arch_cpu_finalize_init(void)
+ {
+ arch_check_bugs();
+ os_check_bugs();
+diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
+index cb1031018afa5..5a9709cbd9e7c 100644
+--- a/arch/x86/Kconfig
++++ b/arch/x86/Kconfig
+@@ -71,6 +71,7 @@ config X86
+ select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
+ select ARCH_HAS_CACHE_LINE_SIZE
+ select ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
++ select ARCH_HAS_CPU_FINALIZE_INIT
+ select ARCH_HAS_CURRENT_STACK_POINTER
+ select ARCH_HAS_DEBUG_VIRTUAL
+ select ARCH_HAS_DEBUG_VM_PGTABLE if !X86_PAE
+@@ -2640,6 +2641,13 @@ config CPU_IBRS_ENTRY
+ This mitigates both spectre_v2 and retbleed at great cost to
+ performance.
+
++config CPU_SRSO
++ bool "Mitigate speculative RAS overflow on AMD"
++ depends on CPU_SUP_AMD && X86_64 && RETHUNK
++ default y
++ help
++ Enable the SRSO mitigation needed on AMD Zen1-4 machines.
++
+ config SLS
+ bool "Mitigate Straight-Line-Speculation"
+ depends on CC_HAS_SLS && X86_64
+@@ -2650,6 +2658,25 @@ config SLS
+ against straight line speculation. The kernel image might be slightly
+ larger.
+
++config GDS_FORCE_MITIGATION
++ bool "Force GDS Mitigation"
++ depends on CPU_SUP_INTEL
++ default n
++ help
++ Gather Data Sampling (GDS) is a hardware vulnerability which allows
++ unprivileged speculative access to data which was previously stored in
++ vector registers.
++
++ This option is equivalent to setting gather_data_sampling=force on the
++ command line. The microcode mitigation is used if present, otherwise
++ AVX is disabled as a mitigation. On affected systems that are missing
++ the microcode any userspace code that unconditionally uses AVX will
++ break with this option set.
++
++ Setting this option on systems not vulnerable to GDS has no effect.
++
++ If in doubt, say N.
++
+ endif
+
+ config ARCH_HAS_ADD_PAGES
+diff --git a/arch/x86/include/asm/bugs.h b/arch/x86/include/asm/bugs.h
+index 92ae283899409..f25ca2d709d40 100644
+--- a/arch/x86/include/asm/bugs.h
++++ b/arch/x86/include/asm/bugs.h
+@@ -4,8 +4,6 @@
+
+ #include <asm/processor.h>
+
+-extern void check_bugs(void);
+-
+ #if defined(CONFIG_CPU_SUP_INTEL) && defined(CONFIG_X86_32)
+ int ppro_with_ram_bug(void);
+ #else
+diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
+index cb8ca46213bed..094f88fee5369 100644
+--- a/arch/x86/include/asm/cpufeatures.h
++++ b/arch/x86/include/asm/cpufeatures.h
+@@ -14,7 +14,7 @@
+ * Defines x86 CPU feature bits
+ */
+ #define NCAPINTS 21 /* N 32-bit words worth of info */
+-#define NBUGINTS 1 /* N 32-bit bug flags */
++#define NBUGINTS 2 /* N 32-bit bug flags */
+
+ /*
+ * Note: If the comment begins with a quoted string, that string is used
+@@ -309,6 +309,10 @@
+ #define X86_FEATURE_SMBA (11*32+21) /* "" Slow Memory Bandwidth Allocation */
+ #define X86_FEATURE_BMEC (11*32+22) /* "" Bandwidth Monitoring Event Configuration */
+
++#define X86_FEATURE_SRSO (11*32+24) /* "" AMD BTB untrain RETs */
++#define X86_FEATURE_SRSO_ALIAS (11*32+25) /* "" AMD BTB untrain RETs through aliasing */
++#define X86_FEATURE_IBPB_ON_VMEXIT (11*32+26) /* "" Issue an IBPB only on VMEXIT */
++
+ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
+ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
+ #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */
+@@ -442,6 +446,10 @@
+ #define X86_FEATURE_AUTOIBRS (20*32+ 8) /* "" Automatic IBRS */
+ #define X86_FEATURE_NO_SMM_CTL_MSR (20*32+ 9) /* "" SMM_CTL MSR is not present */
+
++#define X86_FEATURE_SBPB (20*32+27) /* "" Selective Branch Prediction Barrier */
++#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* "" MSR_PRED_CMD[IBPB] flushes all branch type predictions */
++#define X86_FEATURE_SRSO_NO (20*32+29) /* "" CPU is not affected by SRSO */
++
+ /*
+ * BUG word(s)
+ */
+@@ -483,5 +491,8 @@
+ #define X86_BUG_RETBLEED X86_BUG(27) /* CPU is affected by RETBleed */
+ #define X86_BUG_EIBRS_PBRSB X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
+ #define X86_BUG_SMT_RSB X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */
++#define X86_BUG_GDS X86_BUG(30) /* CPU is affected by Gather Data Sampling */
+
++/* BUG word 2 */
++#define X86_BUG_SRSO X86_BUG(1*32 + 0) /* AMD SRSO bug */
+ #endif /* _ASM_X86_CPUFEATURES_H */
+diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
+index 503a577814b2e..b475d9a582b88 100644
+--- a/arch/x86/include/asm/fpu/api.h
++++ b/arch/x86/include/asm/fpu/api.h
+@@ -109,7 +109,7 @@ extern void fpu_reset_from_exception_fixup(void);
+
+ /* Boot, hotplug and resume */
+ extern void fpu__init_cpu(void);
+-extern void fpu__init_system(struct cpuinfo_x86 *c);
++extern void fpu__init_system(void);
+ extern void fpu__init_check_bugs(void);
+ extern void fpu__resume_cpu(void);
+
+diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
+index b7126701574c1..7f97a8a97e24a 100644
+--- a/arch/x86/include/asm/mem_encrypt.h
++++ b/arch/x86/include/asm/mem_encrypt.h
+@@ -17,6 +17,12 @@
+
+ #include <asm/bootparam.h>
+
++#ifdef CONFIG_X86_MEM_ENCRYPT
++void __init mem_encrypt_init(void);
++#else
++static inline void mem_encrypt_init(void) { }
++#endif
++
+ #ifdef CONFIG_AMD_MEM_ENCRYPT
+
+ extern u64 sme_me_mask;
+@@ -87,9 +93,6 @@ static inline void mem_encrypt_free_decrypted_mem(void) { }
+
+ #endif /* CONFIG_AMD_MEM_ENCRYPT */
+
+-/* Architecture __weak replacement functions */
+-void __init mem_encrypt_init(void);
+-
+ void add_encrypt_protection_map(void);
+
+ /*
+diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
+index a00a53e15ab73..1d111350197f3 100644
+--- a/arch/x86/include/asm/msr-index.h
++++ b/arch/x86/include/asm/msr-index.h
+@@ -57,6 +57,7 @@
+
+ #define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
+ #define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */
++#define PRED_CMD_SBPB BIT(7) /* Selective Branch Prediction Barrier */
+
+ #define MSR_PPIN_CTL 0x0000004e
+ #define MSR_PPIN 0x0000004f
+@@ -155,6 +156,15 @@
+ * Not susceptible to Post-Barrier
+ * Return Stack Buffer Predictions.
+ */
++#define ARCH_CAP_GDS_CTRL BIT(25) /*
++ * CPU is vulnerable to Gather
++ * Data Sampling (GDS) and
++ * has controls for mitigation.
++ */
++#define ARCH_CAP_GDS_NO BIT(26) /*
++ * CPU is not vulnerable to Gather
++ * Data Sampling (GDS).
++ */
+
+ #define ARCH_CAP_XAPIC_DISABLE BIT(21) /*
+ * IA32_XAPIC_DISABLE_STATUS MSR
+@@ -178,6 +188,8 @@
+ #define RNGDS_MITG_DIS BIT(0) /* SRBDS support */
+ #define RTM_ALLOW BIT(1) /* TSX development mode */
+ #define FB_CLEAR_DIS BIT(3) /* CPU Fill buffer clear disable */
++#define GDS_MITG_DIS BIT(4) /* Disable GDS mitigation */
++#define GDS_MITG_LOCKED BIT(5) /* GDS mitigation locked */
+
+ #define MSR_IA32_SYSENTER_CS 0x00000174
+ #define MSR_IA32_SYSENTER_ESP 0x00000175
+diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
+index edb2b0cb8efe5..e1e7b319fe78d 100644
+--- a/arch/x86/include/asm/nospec-branch.h
++++ b/arch/x86/include/asm/nospec-branch.h
+@@ -211,7 +211,8 @@
+ * eventually turn into it's own annotation.
+ */
+ .macro VALIDATE_UNRET_END
+-#if defined(CONFIG_NOINSTR_VALIDATION) && defined(CONFIG_CPU_UNRET_ENTRY)
++#if defined(CONFIG_NOINSTR_VALIDATION) && \
++ (defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_SRSO))
+ ANNOTATE_RETPOLINE_SAFE
+ nop
+ #endif
+@@ -285,13 +286,18 @@
+ */
+ .macro UNTRAIN_RET
+ #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
+- defined(CONFIG_CALL_DEPTH_TRACKING)
++ defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
+ VALIDATE_UNRET_END
+ ALTERNATIVE_3 "", \
+ CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET, \
+ "call entry_ibpb", X86_FEATURE_ENTRY_IBPB, \
+ __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+ #endif
++
++#ifdef CONFIG_CPU_SRSO
++ ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
++ "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
++#endif
+ .endm
+
+ .macro UNTRAIN_RET_FROM_CALL
+@@ -303,6 +309,11 @@
+ "call entry_ibpb", X86_FEATURE_ENTRY_IBPB, \
+ __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
+ #endif
++
++#ifdef CONFIG_CPU_SRSO
++ ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
++ "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
++#endif
+ .endm
+
+
+@@ -328,6 +339,8 @@ extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
+
+ extern void __x86_return_thunk(void);
+ extern void zen_untrain_ret(void);
++extern void srso_untrain_ret(void);
++extern void srso_untrain_ret_alias(void);
+ extern void entry_ibpb(void);
+
+ #ifdef CONFIG_CALL_THUNKS
+@@ -475,11 +488,11 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
+ : "memory");
+ }
+
++extern u64 x86_pred_cmd;
++
+ static inline void indirect_branch_prediction_barrier(void)
+ {
+- u64 val = PRED_CMD_IBPB;
+-
+- alternative_msr_write(MSR_IA32_PRED_CMD, val, X86_FEATURE_USE_IBPB);
++ alternative_msr_write(MSR_IA32_PRED_CMD, x86_pred_cmd, X86_FEATURE_USE_IBPB);
+ }
+
+ /* The Intel SPEC CTRL MSR base value cache */
+diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
+index a1e4fa58b3574..37f1826df2635 100644
+--- a/arch/x86/include/asm/processor.h
++++ b/arch/x86/include/asm/processor.h
+@@ -683,9 +683,11 @@ extern u16 get_llc_id(unsigned int cpu);
+ #ifdef CONFIG_CPU_SUP_AMD
+ extern u32 amd_get_nodes_per_socket(void);
+ extern u32 amd_get_highest_perf(void);
++extern bool cpu_has_ibpb_brtype_microcode(void);
+ #else
+ static inline u32 amd_get_nodes_per_socket(void) { return 0; }
+ static inline u32 amd_get_highest_perf(void) { return 0; }
++static inline bool cpu_has_ibpb_brtype_microcode(void) { return false; }
+ #endif
+
+ extern unsigned long arch_align_stack(unsigned long sp);
+diff --git a/arch/x86/include/asm/sigframe.h b/arch/x86/include/asm/sigframe.h
+index 5b1ed650b1248..84eab27248754 100644
+--- a/arch/x86/include/asm/sigframe.h
++++ b/arch/x86/include/asm/sigframe.h
+@@ -85,6 +85,4 @@ struct rt_sigframe_x32 {
+
+ #endif /* CONFIG_X86_64 */
+
+-void __init init_sigframe_size(void);
+-
+ #endif /* _ASM_X86_SIGFRAME_H */
+diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
+index 26ad7ca423e7c..4239b51e0bc50 100644
+--- a/arch/x86/kernel/cpu/amd.c
++++ b/arch/x86/kernel/cpu/amd.c
+@@ -1279,6 +1279,25 @@ u32 amd_get_highest_perf(void)
+ }
+ EXPORT_SYMBOL_GPL(amd_get_highest_perf);
+
++bool cpu_has_ibpb_brtype_microcode(void)
++{
++ switch (boot_cpu_data.x86) {
++ /* Zen1/2 IBPB flushes branch type predictions too. */
++ case 0x17:
++ return boot_cpu_has(X86_FEATURE_AMD_IBPB);
++ case 0x19:
++ /* Poke the MSR bit on Zen3/4 to check its presence. */
++ if (!wrmsrl_safe(MSR_IA32_PRED_CMD, PRED_CMD_SBPB)) {
++ setup_force_cpu_cap(X86_FEATURE_SBPB);
++ return true;
++ } else {
++ return false;
++ }
++ default:
++ return false;
++ }
++}
++
+ static void zenbleed_check_cpu(void *unused)
+ {
+ struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index dbf7443c42ebd..f3d627901d890 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -9,7 +9,6 @@
+ * - Andrew D. Balsa (code cleanup).
+ */
+ #include <linux/init.h>
+-#include <linux/utsname.h>
+ #include <linux/cpu.h>
+ #include <linux/module.h>
+ #include <linux/nospec.h>
+@@ -27,8 +26,6 @@
+ #include <asm/msr.h>
+ #include <asm/vmx.h>
+ #include <asm/paravirt.h>
+-#include <asm/alternative.h>
+-#include <asm/set_memory.h>
+ #include <asm/intel-family.h>
+ #include <asm/e820/api.h>
+ #include <asm/hypervisor.h>
+@@ -50,6 +47,8 @@ static void __init taa_select_mitigation(void);
+ static void __init mmio_select_mitigation(void);
+ static void __init srbds_select_mitigation(void);
+ static void __init l1d_flush_select_mitigation(void);
++static void __init gds_select_mitigation(void);
++static void __init srso_select_mitigation(void);
+
+ /* The base value of the SPEC_CTRL MSR without task-specific bits set */
+ u64 x86_spec_ctrl_base;
+@@ -59,6 +58,9 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
+ DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
+ EXPORT_SYMBOL_GPL(x86_spec_ctrl_current);
+
++u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
++EXPORT_SYMBOL_GPL(x86_pred_cmd);
++
+ static DEFINE_MUTEX(spec_ctrl_mutex);
+
+ /* Update SPEC_CTRL MSR and its cached copy unconditionally */
+@@ -125,21 +127,8 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
+ DEFINE_STATIC_KEY_FALSE(mmio_stale_data_clear);
+ EXPORT_SYMBOL_GPL(mmio_stale_data_clear);
+
+-void __init check_bugs(void)
++void __init cpu_select_mitigations(void)
+ {
+- identify_boot_cpu();
+-
+- /*
+- * identify_boot_cpu() initialized SMT support information, let the
+- * core code know.
+- */
+- cpu_smt_check_topology();
+-
+- if (!IS_ENABLED(CONFIG_SMP)) {
+- pr_info("CPU: ");
+- print_cpu_info(&boot_cpu_data);
+- }
+-
+ /*
+ * Read the SPEC_CTRL MSR to account for reserved bits which may
+ * have unknown values. AMD64_LS_CFG MSR is cached in the early AMD
+@@ -176,39 +165,8 @@ void __init check_bugs(void)
+ md_clear_select_mitigation();
+ srbds_select_mitigation();
+ l1d_flush_select_mitigation();
+-
+- arch_smt_update();
+-
+-#ifdef CONFIG_X86_32
+- /*
+- * Check whether we are able to run this kernel safely on SMP.
+- *
+- * - i386 is no longer supported.
+- * - In order to run on anything without a TSC, we need to be
+- * compiled for a i486.
+- */
+- if (boot_cpu_data.x86 < 4)
+- panic("Kernel requires i486+ for 'invlpg' and other features");
+-
+- init_utsname()->machine[1] =
+- '0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86);
+- alternative_instructions();
+-
+- fpu__init_check_bugs();
+-#else /* CONFIG_X86_64 */
+- alternative_instructions();
+-
+- /*
+- * Make sure the first 2MB area is not mapped by huge pages
+- * There are typically fixed size MTRRs in there and overlapping
+- * MTRRs into large pages causes slow downs.
+- *
+- * Right now we don't do that with gbpages because there seems
+- * very little benefit for that case.
+- */
+- if (!direct_gbpages)
+- set_memory_4k((unsigned long)__va(0), 1);
+-#endif
++ gds_select_mitigation();
++ srso_select_mitigation();
+ }
+
+ /*
+@@ -694,6 +652,149 @@ static int __init l1d_flush_parse_cmdline(char *str)
+ }
+ early_param("l1d_flush", l1d_flush_parse_cmdline);
+
++#undef pr_fmt
++#define pr_fmt(fmt) "GDS: " fmt
++
++enum gds_mitigations {
++ GDS_MITIGATION_OFF,
++ GDS_MITIGATION_UCODE_NEEDED,
++ GDS_MITIGATION_FORCE,
++ GDS_MITIGATION_FULL,
++ GDS_MITIGATION_FULL_LOCKED,
++ GDS_MITIGATION_HYPERVISOR,
++};
++
++#if IS_ENABLED(CONFIG_GDS_FORCE_MITIGATION)
++static enum gds_mitigations gds_mitigation __ro_after_init = GDS_MITIGATION_FORCE;
++#else
++static enum gds_mitigations gds_mitigation __ro_after_init = GDS_MITIGATION_FULL;
++#endif
++
++static const char * const gds_strings[] = {
++ [GDS_MITIGATION_OFF] = "Vulnerable",
++ [GDS_MITIGATION_UCODE_NEEDED] = "Vulnerable: No microcode",
++ [GDS_MITIGATION_FORCE] = "Mitigation: AVX disabled, no microcode",
++ [GDS_MITIGATION_FULL] = "Mitigation: Microcode",
++ [GDS_MITIGATION_FULL_LOCKED] = "Mitigation: Microcode (locked)",
++ [GDS_MITIGATION_HYPERVISOR] = "Unknown: Dependent on hypervisor status",
++};
++
++bool gds_ucode_mitigated(void)
++{
++ return (gds_mitigation == GDS_MITIGATION_FULL ||
++ gds_mitigation == GDS_MITIGATION_FULL_LOCKED);
++}
++EXPORT_SYMBOL_GPL(gds_ucode_mitigated);
++
++void update_gds_msr(void)
++{
++ u64 mcu_ctrl_after;
++ u64 mcu_ctrl;
++
++ switch (gds_mitigation) {
++ case GDS_MITIGATION_OFF:
++ rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
++ mcu_ctrl |= GDS_MITG_DIS;
++ break;
++ case GDS_MITIGATION_FULL_LOCKED:
++ /*
++ * The LOCKED state comes from the boot CPU. APs might not have
++ * the same state. Make sure the mitigation is enabled on all
++ * CPUs.
++ */
++ case GDS_MITIGATION_FULL:
++ rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
++ mcu_ctrl &= ~GDS_MITG_DIS;
++ break;
++ case GDS_MITIGATION_FORCE:
++ case GDS_MITIGATION_UCODE_NEEDED:
++ case GDS_MITIGATION_HYPERVISOR:
++ return;
++ };
++
++ wrmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
++
++ /*
++ * Check to make sure that the WRMSR value was not ignored. Writes to
++ * GDS_MITG_DIS will be ignored if this processor is locked but the boot
++ * processor was not.
++ */
++ rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl_after);
++ WARN_ON_ONCE(mcu_ctrl != mcu_ctrl_after);
++}
++
++static void __init gds_select_mitigation(void)
++{
++ u64 mcu_ctrl;
++
++ if (!boot_cpu_has_bug(X86_BUG_GDS))
++ return;
++
++ if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
++ gds_mitigation = GDS_MITIGATION_HYPERVISOR;
++ goto out;
++ }
++
++ if (cpu_mitigations_off())
++ gds_mitigation = GDS_MITIGATION_OFF;
++ /* Will verify below that mitigation _can_ be disabled */
++
++ /* No microcode */
++ if (!(x86_read_arch_cap_msr() & ARCH_CAP_GDS_CTRL)) {
++ if (gds_mitigation == GDS_MITIGATION_FORCE) {
++ /*
++ * This only needs to be done on the boot CPU so do it
++ * here rather than in update_gds_msr()
++ */
++ setup_clear_cpu_cap(X86_FEATURE_AVX);
++ pr_warn("Microcode update needed! Disabling AVX as mitigation.\n");
++ } else {
++ gds_mitigation = GDS_MITIGATION_UCODE_NEEDED;
++ }
++ goto out;
++ }
++
++ /* Microcode has mitigation, use it */
++ if (gds_mitigation == GDS_MITIGATION_FORCE)
++ gds_mitigation = GDS_MITIGATION_FULL;
++
++ rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
++ if (mcu_ctrl & GDS_MITG_LOCKED) {
++ if (gds_mitigation == GDS_MITIGATION_OFF)
++ pr_warn("Mitigation locked. Disable failed.\n");
++
++ /*
++ * The mitigation is selected from the boot CPU. All other CPUs
++ * _should_ have the same state. If the boot CPU isn't locked
++ * but others are then update_gds_msr() will WARN() of the state
++ * mismatch. If the boot CPU is locked update_gds_msr() will
++ * ensure the other CPUs have the mitigation enabled.
++ */
++ gds_mitigation = GDS_MITIGATION_FULL_LOCKED;
++ }
++
++ update_gds_msr();
++out:
++ pr_info("%s\n", gds_strings[gds_mitigation]);
++}
++
++static int __init gds_parse_cmdline(char *str)
++{
++ if (!str)
++ return -EINVAL;
++
++ if (!boot_cpu_has_bug(X86_BUG_GDS))
++ return 0;
++
++ if (!strcmp(str, "off"))
++ gds_mitigation = GDS_MITIGATION_OFF;
++ else if (!strcmp(str, "force"))
++ gds_mitigation = GDS_MITIGATION_FORCE;
++
++ return 0;
++}
++early_param("gather_data_sampling", gds_parse_cmdline);
++
+ #undef pr_fmt
+ #define pr_fmt(fmt) "Spectre V1 : " fmt
+
+@@ -2236,6 +2337,165 @@ static int __init l1tf_cmdline(char *str)
+ }
+ early_param("l1tf", l1tf_cmdline);
+
++#undef pr_fmt
++#define pr_fmt(fmt) "Speculative Return Stack Overflow: " fmt
++
++enum srso_mitigation {
++ SRSO_MITIGATION_NONE,
++ SRSO_MITIGATION_MICROCODE,
++ SRSO_MITIGATION_SAFE_RET,
++ SRSO_MITIGATION_IBPB,
++ SRSO_MITIGATION_IBPB_ON_VMEXIT,
++};
++
++enum srso_mitigation_cmd {
++ SRSO_CMD_OFF,
++ SRSO_CMD_MICROCODE,
++ SRSO_CMD_SAFE_RET,
++ SRSO_CMD_IBPB,
++ SRSO_CMD_IBPB_ON_VMEXIT,
++};
++
++static const char * const srso_strings[] = {
++ [SRSO_MITIGATION_NONE] = "Vulnerable",
++ [SRSO_MITIGATION_MICROCODE] = "Mitigation: microcode",
++ [SRSO_MITIGATION_SAFE_RET] = "Mitigation: safe RET",
++ [SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
++ [SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
++};
++
++static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
++static enum srso_mitigation_cmd srso_cmd __ro_after_init = SRSO_CMD_SAFE_RET;
++
++static int __init srso_parse_cmdline(char *str)
++{
++ if (!str)
++ return -EINVAL;
++
++ if (!strcmp(str, "off"))
++ srso_cmd = SRSO_CMD_OFF;
++ else if (!strcmp(str, "microcode"))
++ srso_cmd = SRSO_CMD_MICROCODE;
++ else if (!strcmp(str, "safe-ret"))
++ srso_cmd = SRSO_CMD_SAFE_RET;
++ else if (!strcmp(str, "ibpb"))
++ srso_cmd = SRSO_CMD_IBPB;
++ else if (!strcmp(str, "ibpb-vmexit"))
++ srso_cmd = SRSO_CMD_IBPB_ON_VMEXIT;
++ else
++ pr_err("Ignoring unknown SRSO option (%s).", str);
++
++ return 0;
++}
++early_param("spec_rstack_overflow", srso_parse_cmdline);
++
++#define SRSO_NOTICE "WARNING: See https://kernel.org/doc/html/latest/admin-guide/hw-vuln/srso.html for mitigation options."
++
++static void __init srso_select_mitigation(void)
++{
++ bool has_microcode;
++
++ if (!boot_cpu_has_bug(X86_BUG_SRSO) || cpu_mitigations_off())
++ goto pred_cmd;
++
++ /*
++ * The first check is for the kernel running as a guest in order
++ * for guests to verify whether IBPB is a viable mitigation.
++ */
++ has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
++ if (!has_microcode) {
++ pr_warn("IBPB-extending microcode not applied!\n");
++ pr_warn(SRSO_NOTICE);
++ } else {
++ /*
++ * Enable the synthetic (even if in a real CPUID leaf)
++ * flags for guests.
++ */
++ setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
++
++ /*
++ * Zen1/2 with SMT off aren't vulnerable after the right
++ * IBPB microcode has been applied.
++ */
++ if ((boot_cpu_data.x86 < 0x19) &&
++ (!cpu_smt_possible() || (cpu_smt_control == CPU_SMT_DISABLED)))
++ setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
++ }
++
++ if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
++ if (has_microcode) {
++ pr_err("Retbleed IBPB mitigation enabled, using same for SRSO\n");
++ srso_mitigation = SRSO_MITIGATION_IBPB;
++ goto pred_cmd;
++ }
++ }
++
++ switch (srso_cmd) {
++ case SRSO_CMD_OFF:
++ return;
++
++ case SRSO_CMD_MICROCODE:
++ if (has_microcode) {
++ srso_mitigation = SRSO_MITIGATION_MICROCODE;
++ pr_warn(SRSO_NOTICE);
++ }
++ break;
++
++ case SRSO_CMD_SAFE_RET:
++ if (IS_ENABLED(CONFIG_CPU_SRSO)) {
++ /*
++ * Enable the return thunk for generated code
++ * like ftrace, static_call, etc.
++ */
++ setup_force_cpu_cap(X86_FEATURE_RETHUNK);
++
++ if (boot_cpu_data.x86 == 0x19)
++ setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
++ else
++ setup_force_cpu_cap(X86_FEATURE_SRSO);
++ srso_mitigation = SRSO_MITIGATION_SAFE_RET;
++ } else {
++ pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
++ goto pred_cmd;
++ }
++ break;
++
++ case SRSO_CMD_IBPB:
++ if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY)) {
++ if (has_microcode) {
++ setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
++ srso_mitigation = SRSO_MITIGATION_IBPB;
++ }
++ } else {
++ pr_err("WARNING: kernel not compiled with CPU_IBPB_ENTRY.\n");
++ goto pred_cmd;
++ }
++ break;
++
++ case SRSO_CMD_IBPB_ON_VMEXIT:
++ if (IS_ENABLED(CONFIG_CPU_SRSO)) {
++ if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
++ setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
++ srso_mitigation = SRSO_MITIGATION_IBPB_ON_VMEXIT;
++ }
++ } else {
++ pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
++ goto pred_cmd;
++ }
++ break;
++
++ default:
++ break;
++ }
++
++ pr_info("%s%s\n", srso_strings[srso_mitigation], (has_microcode ? "" : ", no microcode"));
++
++pred_cmd:
++ if ((boot_cpu_has(X86_FEATURE_SRSO_NO) || srso_cmd == SRSO_CMD_OFF) &&
++ boot_cpu_has(X86_FEATURE_SBPB))
++ x86_pred_cmd = PRED_CMD_SBPB;
++}
++
+ #undef pr_fmt
+ #define pr_fmt(fmt) fmt
+
+@@ -2434,6 +2694,18 @@ static ssize_t retbleed_show_state(char *buf)
+ return sysfs_emit(buf, "%s\n", retbleed_strings[retbleed_mitigation]);
+ }
+
++static ssize_t gds_show_state(char *buf)
++{
++ return sysfs_emit(buf, "%s\n", gds_strings[gds_mitigation]);
++}
++
++static ssize_t srso_show_state(char *buf)
++{
++ return sysfs_emit(buf, "%s%s\n",
++ srso_strings[srso_mitigation],
++ (cpu_has_ibpb_brtype_microcode() ? "" : ", no microcode"));
++}
++
+ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
+ char *buf, unsigned int bug)
+ {
+@@ -2483,6 +2755,12 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
+ case X86_BUG_RETBLEED:
+ return retbleed_show_state(buf);
+
++ case X86_BUG_GDS:
++ return gds_show_state(buf);
++
++ case X86_BUG_SRSO:
++ return srso_show_state(buf);
++
+ default:
+ break;
+ }
+@@ -2547,4 +2825,14 @@ ssize_t cpu_show_retbleed(struct device *dev, struct device_attribute *attr, cha
+ {
+ return cpu_show_common(dev, attr, buf, X86_BUG_RETBLEED);
+ }
++
++ssize_t cpu_show_gds(struct device *dev, struct device_attribute *attr, char *buf)
++{
++ return cpu_show_common(dev, attr, buf, X86_BUG_GDS);
++}
++
++ssize_t cpu_show_spec_rstack_overflow(struct device *dev, struct device_attribute *attr, char *buf)
++{
++ return cpu_show_common(dev, attr, buf, X86_BUG_SRSO);
++}
+ #endif
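(For reference while reading the bugs.c hunks above: the two early_param() hooks they add register new kernel command-line switches. Taken directly from the parsers, the accepted values are

  gather_data_sampling=   off | force
  spec_rstack_overflow=   off | microcode | safe-ret | ibpb | ibpb-vmexit

so a boot line can carry, for example, "spec_rstack_overflow=safe-ret gather_data_sampling=force". Unknown SRSO values are logged and ignored, and both parsers return early on CPUs that do not have the corresponding bug bit set.)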
+diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
+index 04eebbacb5503..19c74e68c0a21 100644
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -18,12 +18,16 @@
+ #include <linux/init.h>
+ #include <linux/kprobes.h>
+ #include <linux/kgdb.h>
++#include <linux/mem_encrypt.h>
+ #include <linux/smp.h>
++#include <linux/cpu.h>
+ #include <linux/io.h>
+ #include <linux/syscore_ops.h>
+ #include <linux/pgtable.h>
+ #include <linux/stackprotector.h>
++#include <linux/utsname.h>
+
++#include <asm/alternative.h>
+ #include <asm/cmdline.h>
+ #include <asm/perf_event.h>
+ #include <asm/mmu_context.h>
+@@ -59,7 +63,7 @@
+ #include <asm/intel-family.h>
+ #include <asm/cpu_device_id.h>
+ #include <asm/uv/uv.h>
+-#include <asm/sigframe.h>
++#include <asm/set_memory.h>
+ #include <asm/traps.h>
+ #include <asm/sev.h>
+
+@@ -1263,6 +1267,10 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
+ #define RETBLEED BIT(3)
+ /* CPU is affected by SMT (cross-thread) return predictions */
+ #define SMT_RSB BIT(4)
++/* CPU is affected by SRSO */
++#define SRSO BIT(5)
++/* CPU is affected by GDS */
++#define GDS BIT(6)
+
+ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
+ VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS),
+@@ -1275,27 +1283,30 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
+ VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO),
+ VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
+- VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED | GDS),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED | GDS),
+ VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED),
+- VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
+- VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO),
+- VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO),
+- VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS),
++ VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPING_ANY, MMIO | GDS),
++ VULNBL_INTEL_STEPPINGS(COMETLAKE, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
+ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO | RETBLEED),
+- VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
++ VULNBL_INTEL_STEPPINGS(TIGERLAKE_L, X86_STEPPING_ANY, GDS),
++ VULNBL_INTEL_STEPPINGS(TIGERLAKE, X86_STEPPING_ANY, GDS),
+ VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED),
+- VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED),
++ VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS),
+ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPING_ANY, MMIO | MMIO_SBDS),
+ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_D, X86_STEPPING_ANY, MMIO),
+ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS),
+
+ VULNBL_AMD(0x15, RETBLEED),
+ VULNBL_AMD(0x16, RETBLEED),
+- VULNBL_AMD(0x17, RETBLEED | SMT_RSB),
++ VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
+ VULNBL_HYGON(0x18, RETBLEED | SMT_RSB),
++ VULNBL_AMD(0x19, SRSO),
+ {}
+ };
+
+@@ -1419,6 +1430,21 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+ if (cpu_matches(cpu_vuln_blacklist, SMT_RSB))
+ setup_force_cpu_bug(X86_BUG_SMT_RSB);
+
++ /*
++ * Check if CPU is vulnerable to GDS. If running in a virtual machine on
++ * an affected processor, the VMM may have disabled the use of GATHER by
++ * disabling AVX2. The only way to do this in HW is to clear XCR0[2],
++ * which means that AVX will be disabled.
++ */
++ if (cpu_matches(cpu_vuln_blacklist, GDS) && !(ia32_cap & ARCH_CAP_GDS_NO) &&
++ boot_cpu_has(X86_FEATURE_AVX))
++ setup_force_cpu_bug(X86_BUG_GDS);
++
++ if (!cpu_has(c, X86_FEATURE_SRSO_NO)) {
++ if (cpu_matches(cpu_vuln_blacklist, SRSO))
++ setup_force_cpu_bug(X86_BUG_SRSO);
++ }
++
+ if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
+ return;
+
+@@ -1600,10 +1626,6 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
+
+ sld_setup(c);
+
+- fpu__init_system(c);
+-
+- init_sigframe_size();
+-
+ #ifdef CONFIG_X86_32
+ /*
+ * Regardless of whether PCID is enumerated, the SDM says
+@@ -1983,6 +2005,8 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
+ validate_apic_and_package_id(c);
+ x86_spec_ctrl_setup_ap();
+ update_srbds_msr();
++ if (boot_cpu_has_bug(X86_BUG_GDS))
++ update_gds_msr();
+
+ tsx_ap_init();
+ }
+@@ -2285,8 +2309,6 @@ void cpu_init(void)
+
+ doublefault_init_cpu_tss();
+
+- fpu__init_cpu();
+-
+ if (is_uv_system())
+ uv_cpu_init();
+
+@@ -2302,6 +2324,7 @@ void cpu_init_secondary(void)
+ */
+ cpu_init_exception_handling();
+ cpu_init();
++ fpu__init_cpu();
+ }
+ #endif
+
+@@ -2364,3 +2387,69 @@ void arch_smt_update(void)
+ /* Check whether IPI broadcasting can be enabled */
+ apic_smt_update();
+ }
++
++void __init arch_cpu_finalize_init(void)
++{
++ identify_boot_cpu();
++
++ /*
++ * identify_boot_cpu() initialized SMT support information, let the
++ * core code know.
++ */
++ cpu_smt_check_topology();
++
++ if (!IS_ENABLED(CONFIG_SMP)) {
++ pr_info("CPU: ");
++ print_cpu_info(&boot_cpu_data);
++ }
++
++ cpu_select_mitigations();
++
++ arch_smt_update();
++
++ if (IS_ENABLED(CONFIG_X86_32)) {
++ /*
++ * Check whether this is a real i386 which is not longer
++ * supported and fixup the utsname.
++ */
++ if (boot_cpu_data.x86 < 4)
++ panic("Kernel requires i486+ for 'invlpg' and other features");
++
++ init_utsname()->machine[1] =
++ '0' + (boot_cpu_data.x86 > 6 ? 6 : boot_cpu_data.x86);
++ }
++
++ /*
++ * Must be before alternatives because it might set or clear
++ * feature bits.
++ */
++ fpu__init_system();
++ fpu__init_cpu();
++
++ alternative_instructions();
++
++ if (IS_ENABLED(CONFIG_X86_64)) {
++ /*
++ * Make sure the first 2MB area is not mapped by huge pages
++ * There are typically fixed size MTRRs in there and overlapping
++ * MTRRs into large pages causes slow downs.
++ *
++ * Right now we don't do that with gbpages because there seems
++ * very little benefit for that case.
++ */
++ if (!direct_gbpages)
++ set_memory_4k((unsigned long)__va(0), 1);
++ } else {
++ fpu__init_check_bugs();
++ }
++
++ /*
++ * This needs to be called before any devices perform DMA
++ * operations that might use the SWIOTLB bounce buffers. It will
++ * mark the bounce buffers as decrypted so that their usage will
++ * not cause "plain-text" data to be decrypted when accessed. It
++ * must be called after late_time_init() so that Hyper-V x86/x64
++ * hypercalls work when the SWIOTLB bounce buffers are decrypted.
++ */
++ mem_encrypt_init();
++}
+diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
+index f97b0fe13da80..1dcd7d4e38ef1 100644
+--- a/arch/x86/kernel/cpu/cpu.h
++++ b/arch/x86/kernel/cpu/cpu.h
+@@ -79,9 +79,11 @@ extern void detect_ht(struct cpuinfo_x86 *c);
+ extern void check_null_seg_clears_base(struct cpuinfo_x86 *c);
+
+ unsigned int aperfmperf_get_khz(int cpu);
++void cpu_select_mitigations(void);
+
+ extern void x86_spec_ctrl_setup_ap(void);
+ extern void update_srbds_msr(void);
++extern void update_gds_msr(void);
+
+ extern enum spectre_v2_mitigation spectre_v2_enabled;
+
+diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
+index 851eb13edc014..998a08f17e331 100644
+--- a/arch/x86/kernel/fpu/init.c
++++ b/arch/x86/kernel/fpu/init.c
+@@ -53,7 +53,7 @@ void fpu__init_cpu(void)
+ fpu__init_cpu_xstate();
+ }
+
+-static bool fpu__probe_without_cpuid(void)
++static bool __init fpu__probe_without_cpuid(void)
+ {
+ unsigned long cr0;
+ u16 fsw, fcw;
+@@ -71,7 +71,7 @@ static bool fpu__probe_without_cpuid(void)
+ return fsw == 0 && (fcw & 0x103f) == 0x003f;
+ }
+
+-static void fpu__init_system_early_generic(struct cpuinfo_x86 *c)
++static void __init fpu__init_system_early_generic(void)
+ {
+ if (!boot_cpu_has(X86_FEATURE_CPUID) &&
+ !test_bit(X86_FEATURE_FPU, (unsigned long *)cpu_caps_cleared)) {
+@@ -211,10 +211,10 @@ static void __init fpu__init_system_xstate_size_legacy(void)
+ * Called on the boot CPU once per system bootup, to set up the initial
+ * FPU state that is later cloned into all processes:
+ */
+-void __init fpu__init_system(struct cpuinfo_x86 *c)
++void __init fpu__init_system(void)
+ {
+ fpstate_reset(&current->thread.fpu);
+- fpu__init_system_early_generic(c);
++ fpu__init_system_early_generic();
+
+ /*
+ * The FPU has to be operational for some of the
+diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
+index 004cb30b74198..cfeec3ee877eb 100644
+--- a/arch/x86/kernel/signal.c
++++ b/arch/x86/kernel/signal.c
+@@ -182,7 +182,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
+ static unsigned long __ro_after_init max_frame_size;
+ static unsigned int __ro_after_init fpu_default_state_size;
+
+-void __init init_sigframe_size(void)
++static int __init init_sigframe_size(void)
+ {
+ fpu_default_state_size = fpu__get_fpstate_size();
+
+@@ -194,7 +194,9 @@ void __init init_sigframe_size(void)
+ max_frame_size = round_up(max_frame_size, FRAME_ALIGNMENT);
+
+ pr_info("max sigframe size: %lu\n", max_frame_size);
++ return 0;
+ }
++early_initcall(init_sigframe_size);
+
+ unsigned long get_sigframe_size(void)
+ {
+diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
+index 25f155205770c..84f741b06376f 100644
+--- a/arch/x86/kernel/vmlinux.lds.S
++++ b/arch/x86/kernel/vmlinux.lds.S
+@@ -134,13 +134,27 @@ SECTIONS
+ SOFTIRQENTRY_TEXT
+ #ifdef CONFIG_RETPOLINE
+ __indirect_thunk_start = .;
+- *(.text.__x86.*)
++ *(.text.__x86.indirect_thunk)
++ *(.text.__x86.return_thunk)
+ __indirect_thunk_end = .;
+ #endif
+ STATIC_CALL_TEXT
+
+ ALIGN_ENTRY_TEXT_BEGIN
++#ifdef CONFIG_CPU_SRSO
++ *(.text.__x86.rethunk_untrain)
++#endif
++
+ ENTRY_TEXT
++
++#ifdef CONFIG_CPU_SRSO
++ /*
++ * See the comment above srso_untrain_ret_alias()'s
++ * definition.
++ */
++ . = srso_untrain_ret_alias | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
++ *(.text.__x86.rethunk_safe)
++#endif
+ ALIGN_ENTRY_TEXT_END
+ *(.gnu.warning)
+
+@@ -508,4 +522,19 @@ INIT_PER_CPU(irq_stack_backing_store);
+ "fixed_percpu_data is not at start of per-cpu area");
+ #endif
+
++ #ifdef CONFIG_RETHUNK
++. = ASSERT((__ret & 0x3f) == 0, "__ret not cacheline-aligned");
++. = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
++#endif
++
++#ifdef CONFIG_CPU_SRSO
++/*
++ * GNU ld cannot do XOR so do: (A | B) - (A & B) in order to compute the XOR
++ * of the two function addresses:
++ */
++. = ASSERT(((srso_untrain_ret_alias | srso_safe_ret_alias) -
++ (srso_untrain_ret_alias & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
++ "SRSO function pair won't alias");
++#endif
++
+ #endif /* CONFIG_X86_64 */
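(The SRSO assertion above leans on the identity (A | B) - (A & B) == A ^ B because, as its comment notes, GNU ld cannot compute XOR directly. A small userspace sketch confirms the arithmetic for the bit mask used here; the two addresses are hypothetical stand-ins for the real symbol values, which the linker fills in at build time:

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		/* Bits 2, 8, 14 and 20, exactly as asserted above. */
		uint64_t mask = (1u << 2) | (1u << 8) | (1u << 14) | (1u << 20);
		/* Hypothetical stand-ins: a 2M-aligned "untrain" address and its alias. */
		uint64_t untrain = 0xffffffff82200000ULL;
		uint64_t safe    = untrain | mask;
		/* What the linker script computes, since GNU ld cannot XOR. */
		uint64_t ld_expr = (untrain | safe) - (untrain & safe);

		printf("ld expression %#llx, xor %#llx, mask %#llx\n",
		       (unsigned long long)ld_expr,
		       (unsigned long long)(untrain ^ safe),
		       (unsigned long long)mask);
		return 0;
	}

All three printed values come out equal, which is what the ASSERT() above checks for the real srso_untrain_ret_alias/srso_safe_ret_alias pair.)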
+diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
+index 0c9660a07b233..680e1611cf26d 100644
+--- a/arch/x86/kvm/cpuid.c
++++ b/arch/x86/kvm/cpuid.c
+@@ -734,6 +734,9 @@ void kvm_set_cpu_caps(void)
+ F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */
+ );
+
++ if (cpu_feature_enabled(X86_FEATURE_SRSO_NO))
++ kvm_cpu_cap_set(X86_FEATURE_SRSO_NO);
++
+ /*
+ * Synthesize "LFENCE is serializing" into the AMD-defined entry in
+ * KVM's supported CPUID if the feature is reported as supported by the
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index 5a0c2d6791a0a..af7b968f55703 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -1511,7 +1511,9 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
+
+ if (sd->current_vmcb != svm->vmcb) {
+ sd->current_vmcb = svm->vmcb;
+- indirect_branch_prediction_barrier();
++
++ if (!cpu_feature_enabled(X86_FEATURE_IBPB_ON_VMEXIT))
++ indirect_branch_prediction_barrier();
+ }
+ if (kvm_vcpu_apicv_active(vcpu))
+ avic_vcpu_load(vcpu, cpu);
+diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
+index 8e8295e774f0f..265452fc9ebe9 100644
+--- a/arch/x86/kvm/svm/vmenter.S
++++ b/arch/x86/kvm/svm/vmenter.S
+@@ -224,6 +224,9 @@ SYM_FUNC_START(__svm_vcpu_run)
+ */
+ UNTRAIN_RET
+
++ /* SRSO */
++ ALTERNATIVE "", "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT
++
+ /*
+ * Clear all general purpose registers except RSP and RAX to prevent
+ * speculative use of the guest's values, even those that are reloaded
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index f04bed5a5aff9..a96f0f775ae27 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -314,6 +314,8 @@ u64 __read_mostly host_xcr0;
+
+ static struct kmem_cache *x86_emulator_cache;
+
++extern bool gds_ucode_mitigated(void);
++
+ /*
+ * When called, it means the previous get/set msr reached an invalid msr.
+ * Return true if we want to ignore/silent this failed msr access.
+@@ -1617,7 +1619,7 @@ static bool kvm_is_immutable_feature_msr(u32 msr)
+ ARCH_CAP_SKIP_VMENTRY_L1DFLUSH | ARCH_CAP_SSB_NO | ARCH_CAP_MDS_NO | \
+ ARCH_CAP_PSCHANGE_MC_NO | ARCH_CAP_TSX_CTRL_MSR | ARCH_CAP_TAA_NO | \
+ ARCH_CAP_SBDR_SSDP_NO | ARCH_CAP_FBSDP_NO | ARCH_CAP_PSDP_NO | \
+- ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO)
++ ARCH_CAP_FB_CLEAR | ARCH_CAP_RRSBA | ARCH_CAP_PBRSB_NO | ARCH_CAP_GDS_NO)
+
+ static u64 kvm_get_arch_capabilities(void)
+ {
+@@ -1674,6 +1676,9 @@ static u64 kvm_get_arch_capabilities(void)
+ */
+ }
+
++ if (!boot_cpu_has_bug(X86_BUG_GDS) || gds_ucode_mitigated())
++ data |= ARCH_CAP_GDS_NO;
++
+ return data;
+ }
+
+diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
+index b3b1e376dce86..2cff585f22f29 100644
+--- a/arch/x86/lib/retpoline.S
++++ b/arch/x86/lib/retpoline.S
+@@ -11,6 +11,7 @@
+ #include <asm/unwind_hints.h>
+ #include <asm/percpu.h>
+ #include <asm/frame.h>
++#include <asm/nops.h>
+
+ .section .text.__x86.indirect_thunk
+
+@@ -131,6 +132,46 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
+ */
+ #ifdef CONFIG_RETHUNK
+
++/*
++ * srso_untrain_ret_alias() and srso_safe_ret_alias() are placed at
++ * special addresses:
++ *
++ * - srso_untrain_ret_alias() is 2M aligned
++ * - srso_safe_ret_alias() is also in the same 2M page but bits 2, 8, 14
++ * and 20 in its virtual address are set (while those bits in the
++ * srso_untrain_ret_alias() function are cleared).
++ *
++ * This guarantees that those two addresses will alias in the branch
++ * target buffer of Zen3/4 generations, leading to any potential
++ * poisoned entries at that BTB slot to get evicted.
++ *
++ * As a result, srso_safe_ret_alias() becomes a safe return.
++ */
++#ifdef CONFIG_CPU_SRSO
++ .section .text.__x86.rethunk_untrain
++
++SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
++ ANNOTATE_NOENDBR
++ ASM_NOP2
++ lfence
++ jmp __x86_return_thunk
++SYM_FUNC_END(srso_untrain_ret_alias)
++__EXPORT_THUNK(srso_untrain_ret_alias)
++
++ .section .text.__x86.rethunk_safe
++#endif
++
++/* Needs a definition for the __x86_return_thunk alternative below. */
++SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
++#ifdef CONFIG_CPU_SRSO
++ add $8, %_ASM_SP
++ UNWIND_HINT_FUNC
++#endif
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
++SYM_FUNC_END(srso_safe_ret_alias)
++
+ .section .text.__x86.return_thunk
+
+ /*
+@@ -143,7 +184,7 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
+ * from re-poisioning the BTB prediction.
+ */
+ .align 64
+- .skip 63, 0xcc
++ .skip 64 - (__ret - zen_untrain_ret), 0xcc
+ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
+ ANNOTATE_NOENDBR
+ /*
+@@ -175,10 +216,10 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
+ * evicted, __x86_return_thunk will suffer Straight Line Speculation
+ * which will be contained safely by the INT3.
+ */
+-SYM_INNER_LABEL(__x86_return_thunk, SYM_L_GLOBAL)
++SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+ ret
+ int3
+-SYM_CODE_END(__x86_return_thunk)
++SYM_CODE_END(__ret)
+
+ /*
+ * Ensure the TEST decoding / BTB invalidation is complete.
+@@ -189,11 +230,45 @@ SYM_CODE_END(__x86_return_thunk)
+ * Jump back and execute the RET in the middle of the TEST instruction.
+ * INT3 is for SLS protection.
+ */
+- jmp __x86_return_thunk
++ jmp __ret
+ int3
+ SYM_FUNC_END(zen_untrain_ret)
+ __EXPORT_THUNK(zen_untrain_ret)
+
++/*
++ * SRSO untraining sequence for Zen1/2, similar to zen_untrain_ret()
++ * above. On kernel entry, srso_untrain_ret() is executed which is a
++ *
++ * movabs $0xccccccc308c48348,%rax
++ *
++ * and when the return thunk executes the inner label srso_safe_ret()
++ * later, it is a stack manipulation and a RET which is mispredicted and
++ * thus a "safe" one to use.
++ */
++ .align 64
++ .skip 64 - (srso_safe_ret - srso_untrain_ret), 0xcc
++SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
++ ANNOTATE_NOENDBR
++ .byte 0x48, 0xb8
++
++SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
++ add $8, %_ASM_SP
++ ret
++ int3
++ int3
++ int3
++ lfence
++ call srso_safe_ret
++ int3
++SYM_CODE_END(srso_safe_ret)
++SYM_FUNC_END(srso_untrain_ret)
++__EXPORT_THUNK(srso_untrain_ret)
++
++SYM_FUNC_START(__x86_return_thunk)
++ ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
++ "call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
++ int3
++SYM_CODE_END(__x86_return_thunk)
+ EXPORT_SYMBOL(__x86_return_thunk)
+
+ #endif /* CONFIG_RETHUNK */
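(The srso_untrain_ret/srso_safe_ret pair above works by giving the same bytes two decodings. A byte-level view, using the standard x86 encodings of the instructions shown and laid out here only for illustration, makes the movabs claim in the comment concrete:

	/*
	 * 48 b8            .byte 0x48, 0xb8 emitted at srso_untrain_ret
	 * 48 83 c4 08      add $8, %_ASM_SP   \
	 * c3               ret                 } body of srso_safe_ret
	 * cc cc cc         int3 x 3            /
	 *
	 * Entered at srso_untrain_ret, the ten bytes decode as the single
	 * instruction movabs $0xccccccc308c48348, %rax, the "immediate"
	 * being the srso_safe_ret bytes read little-endian. Entered at
	 * srso_safe_ret, the same bytes are the stack fixup and the RET
	 * that the __x86_return_thunk alternative above dispatches to.
	 */

Either way the CPU ends up at a RET it has not been trained on.)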
+diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
+index a9cf8c8fa074c..0b6efc43faaf8 100644
+--- a/arch/x86/xen/smp_pv.c
++++ b/arch/x86/xen/smp_pv.c
+@@ -63,6 +63,7 @@ static void cpu_bringup(void)
+
+ cr4_init();
+ cpu_init();
++ fpu__init_cpu();
+ touch_softlockup_watchdog();
+
+ /* PVH runs in ring 0 and allows us to do native syscalls. Yay! */
+diff --git a/arch/xtensa/include/asm/bugs.h b/arch/xtensa/include/asm/bugs.h
+deleted file mode 100644
+index 69b29d1982494..0000000000000
+--- a/arch/xtensa/include/asm/bugs.h
++++ /dev/null
+@@ -1,18 +0,0 @@
+-/*
+- * include/asm-xtensa/bugs.h
+- *
+- * This is included by init/main.c to check for architecture-dependent bugs.
+- *
+- * Xtensa processors don't have any bugs. :)
+- *
+- * This file is subject to the terms and conditions of the GNU General
+- * Public License. See the file "COPYING" in the main directory of
+- * this archive for more details.
+- */
+-
+-#ifndef _XTENSA_BUGS_H
+-#define _XTENSA_BUGS_H
+-
+-static void check_bugs(void) { }
+-
+-#endif /* _XTENSA_BUGS_H */
+diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
+index c1815b9dae68e..1d36bfb4c097c 100644
+--- a/drivers/base/cpu.c
++++ b/drivers/base/cpu.c
+@@ -577,6 +577,18 @@ ssize_t __weak cpu_show_retbleed(struct device *dev,
+ return sysfs_emit(buf, "Not affected\n");
+ }
+
++ssize_t __weak cpu_show_gds(struct device *dev,
++ struct device_attribute *attr, char *buf)
++{
++ return sysfs_emit(buf, "Not affected\n");
++}
++
++ssize_t __weak cpu_show_spec_rstack_overflow(struct device *dev,
++ struct device_attribute *attr, char *buf)
++{
++ return sysfs_emit(buf, "Not affected\n");
++}
++
+ static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
+ static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
+ static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
+@@ -588,6 +600,8 @@ static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL);
+ static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL);
+ static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL);
+ static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL);
++static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL);
++static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL);
+
+ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
+ &dev_attr_meltdown.attr,
+@@ -601,6 +615,8 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
+ &dev_attr_srbds.attr,
+ &dev_attr_mmio_stale_data.attr,
+ &dev_attr_retbleed.attr,
++ &dev_attr_gather_data_sampling.attr,
++ &dev_attr_spec_rstack_overflow.attr,
+ NULL
+ };
+
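(The two DEVICE_ATTR() entries added above surface as files under /sys/devices/system/cpu/vulnerabilities/ next to the existing ones. A minimal userspace sketch that reports both new entries could look like this; the file names follow from the attribute names above, and error handling is kept deliberately small:

	#include <stdio.h>

	static void show(const char *name)
	{
		char path[160], line[256];
		FILE *f;

		/* Directory backing the cpu_root_vulnerabilities_attrs group above. */
		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/vulnerabilities/%s", name);
		f = fopen(path, "r");
		if (!f) {
			printf("%-24s (not present on this kernel)\n", name);
			return;
		}
		if (fgets(line, sizeof(line), f))
			printf("%-24s %s", name, line);
		fclose(f);
	}

	int main(void)
	{
		show("gather_data_sampling");
		show("spec_rstack_overflow");
		return 0;
	}

On a kernel without these patches the fopen() simply fails and the entry is reported as not present.)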
+diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
+index c1501f41e2d82..72091b3319639 100644
+--- a/drivers/net/xen-netback/netback.c
++++ b/drivers/net/xen-netback/netback.c
+@@ -396,7 +396,7 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
+ struct gnttab_map_grant_ref *gop = queue->tx_map_ops + *map_ops;
+ struct xen_netif_tx_request *txp = first;
+
+- nr_slots = shinfo->nr_frags + 1;
++ nr_slots = shinfo->nr_frags + frag_overflow + 1;
+
+ copy_count(skb) = 0;
+ XENVIF_TX_CB(skb)->split_mask = 0;
+@@ -462,8 +462,8 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
+ }
+ }
+
+- for (shinfo->nr_frags = 0; shinfo->nr_frags < nr_slots;
+- shinfo->nr_frags++, gop++) {
++ for (shinfo->nr_frags = 0; nr_slots > 0 && shinfo->nr_frags < MAX_SKB_FRAGS;
++ shinfo->nr_frags++, gop++, nr_slots--) {
+ index = pending_index(queue->pending_cons++);
+ pending_idx = queue->pending_ring[index];
+ xenvif_tx_create_map_op(queue, pending_idx, txp,
+@@ -476,12 +476,12 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
+ txp++;
+ }
+
+- if (frag_overflow) {
++ if (nr_slots > 0) {
+
+ shinfo = skb_shinfo(nskb);
+ frags = shinfo->frags;
+
+- for (shinfo->nr_frags = 0; shinfo->nr_frags < frag_overflow;
++ for (shinfo->nr_frags = 0; shinfo->nr_frags < nr_slots;
+ shinfo->nr_frags++, txp++, gop++) {
+ index = pending_index(queue->pending_cons++);
+ pending_idx = queue->pending_ring[index];
+@@ -492,6 +492,11 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
+ }
+
+ skb_shinfo(skb)->frag_list = nskb;
++ } else if (nskb) {
++ /* A frag_list skb was allocated but it is no longer needed
++ * because enough slots were converted to copy ops above.
++ */
++ kfree_skb(nskb);
+ }
+
+ (*copy_ops) = cop - queue->tx_copy_ops;
+diff --git a/include/asm-generic/bugs.h b/include/asm-generic/bugs.h
+deleted file mode 100644
+index 69021830f078d..0000000000000
+--- a/include/asm-generic/bugs.h
++++ /dev/null
+@@ -1,11 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-#ifndef __ASM_GENERIC_BUGS_H
+-#define __ASM_GENERIC_BUGS_H
+-/*
+- * This file is included by 'init/main.c' to check for
+- * architecture-dependent bugs.
+- */
+-
+-static inline void check_bugs(void) { }
+-
+-#endif /* __ASM_GENERIC_BUGS_H */
+diff --git a/include/linux/cpu.h b/include/linux/cpu.h
+index 8582a7142623d..ce41922470a5d 100644
+--- a/include/linux/cpu.h
++++ b/include/linux/cpu.h
+@@ -70,6 +70,8 @@ extern ssize_t cpu_show_mmio_stale_data(struct device *dev,
+ char *buf);
+ extern ssize_t cpu_show_retbleed(struct device *dev,
+ struct device_attribute *attr, char *buf);
++extern ssize_t cpu_show_spec_rstack_overflow(struct device *dev,
++ struct device_attribute *attr, char *buf);
+
+ extern __printf(4, 5)
+ struct device *cpu_device_create(struct device *parent, void *drvdata,
+@@ -184,6 +186,12 @@ void arch_cpu_idle_enter(void);
+ void arch_cpu_idle_exit(void);
+ void __noreturn arch_cpu_idle_dead(void);
+
++#ifdef CONFIG_ARCH_HAS_CPU_FINALIZE_INIT
++void arch_cpu_finalize_init(void);
++#else
++static inline void arch_cpu_finalize_init(void) { }
++#endif
++
+ int cpu_report_state(int cpu);
+ int cpu_check_up_prepare(int cpu);
+ void cpu_set_state_online(int cpu);
+diff --git a/init/main.c b/init/main.c
+index c445c1fb19b95..3bec87f4c4cdc 100644
+--- a/init/main.c
++++ b/init/main.c
+@@ -95,7 +95,6 @@
+ #include <linux/cache.h>
+ #include <linux/rodata_test.h>
+ #include <linux/jump_label.h>
+-#include <linux/mem_encrypt.h>
+ #include <linux/kcsan.h>
+ #include <linux/init_syscalls.h>
+ #include <linux/stackdepot.h>
+@@ -103,7 +102,6 @@
+ #include <net/net_namespace.h>
+
+ #include <asm/io.h>
+-#include <asm/bugs.h>
+ #include <asm/setup.h>
+ #include <asm/sections.h>
+ #include <asm/cacheflush.h>
+@@ -787,8 +785,6 @@ void __init __weak thread_stack_cache_init(void)
+ }
+ #endif
+
+-void __init __weak mem_encrypt_init(void) { }
+-
+ void __init __weak poking_init(void) { }
+
+ void __init __weak pgtable_cache_init(void) { }
+@@ -1043,15 +1039,7 @@ void start_kernel(void)
+ sched_clock_init();
+ calibrate_delay();
+
+- /*
+- * This needs to be called before any devices perform DMA
+- * operations that might use the SWIOTLB bounce buffers. It will
+- * mark the bounce buffers as decrypted so that their usage will
+- * not cause "plain-text" data to be decrypted when accessed. It
+- * must be called after late_time_init() so that Hyper-V x86/x64
+- * hypercalls work when the SWIOTLB bounce buffers are decrypted.
+- */
+- mem_encrypt_init();
++ arch_cpu_finalize_init();
+
+ pid_idr_init();
+ anon_vma_init();
+@@ -1079,8 +1067,6 @@ void start_kernel(void)
+ taskstats_init_early();
+ delayacct_init();
+
+- check_bugs();
+-
+ acpi_subsystem_init();
+ arch_post_acpi_subsys_init();
+ kcsan_init();
+diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
+index cb8ca46213bed..1f6d904c6481d 100644
+--- a/tools/arch/x86/include/asm/cpufeatures.h
++++ b/tools/arch/x86/include/asm/cpufeatures.h
+@@ -14,7 +14,7 @@
+ * Defines x86 CPU feature bits
+ */
+ #define NCAPINTS 21 /* N 32-bit words worth of info */
+-#define NBUGINTS 1 /* N 32-bit bug flags */
++#define NBUGINTS 2 /* N 32-bit bug flags */
+
+ /*
+ * Note: If the comment begins with a quoted string, that string is used
+diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
+index 9ef024fd648c1..a25fdc08c39e8 100644
+--- a/tools/objtool/arch/x86/decode.c
++++ b/tools/objtool/arch/x86/decode.c
+@@ -824,5 +824,8 @@ bool arch_is_retpoline(struct symbol *sym)
+
+ bool arch_is_rethunk(struct symbol *sym)
+ {
+- return !strcmp(sym->name, "__x86_return_thunk");
++ return !strcmp(sym->name, "__x86_return_thunk") ||
++ !strcmp(sym->name, "srso_untrain_ret") ||
++ !strcmp(sym->name, "srso_safe_ret") ||
++ !strcmp(sym->name, "__ret");
+ }
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-11 11:53 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-08-11 11:53 UTC (permalink / raw
To: gentoo-commits
commit: 4f32cadda502032bf59fcd46932f2446e7097052
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Fri Aug 11 11:53:29 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Fri Aug 11 11:53:29 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=4f32cadd
Linux patch 6.4.10
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1009_linux-6.4.10.patch | 9089 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 9093 insertions(+)
diff --git a/0000_README b/0000_README
index 65dbf206..f63d6a30 100644
--- a/0000_README
+++ b/0000_README
@@ -79,6 +79,10 @@ Patch: 1008_linux-6.4.9.patch
From: https://www.kernel.org
Desc: Linux 6.4.9
+Patch: 1009_linux-6.4.10.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.10
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1009_linux-6.4.10.patch b/1009_linux-6.4.10.patch
new file mode 100644
index 00000000..9488e2ad
--- /dev/null
+++ b/1009_linux-6.4.10.patch
@@ -0,0 +1,9089 @@
+diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst
+index c18d94fa64704..f8ebb63b6c5d2 100644
+--- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
++++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
+@@ -624,3 +624,9 @@ Used to get the correct ranges:
+ * VMALLOC_START ~ VMALLOC_END : vmalloc() / ioremap() space.
+ * VMEMMAP_START ~ VMEMMAP_END : vmemmap space, used for struct page array.
+ * KERNEL_LINK_ADDR : start address of Kernel link and BPF
++
++va_kernel_pa_offset
++-------------------
++
++Indicates the offset between the kernel virtual and physical mappings.
++Used to translate virtual to physical addresses.
+diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
+index cd46e2b20a814..3ce6e4aebdef6 100644
+--- a/Documentation/arm64/silicon-errata.rst
++++ b/Documentation/arm64/silicon-errata.rst
+@@ -143,6 +143,10 @@ stable kernels.
+ +----------------+-----------------+-----------------+-----------------------------+
+ | ARM | MMU-500 | #841119,826419 | N/A |
+ +----------------+-----------------+-----------------+-----------------------------+
++| ARM | MMU-600 | #1076982,1209401| N/A |
+++----------------+-----------------+-----------------+-----------------------------+
++| ARM | MMU-700 | #2268618,2812531| N/A |
+++----------------+-----------------+-----------------+-----------------------------+
+ +----------------+-----------------+-----------------+-----------------------------+
+ | Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 |
+ +----------------+-----------------+-----------------+-----------------------------+
+diff --git a/Makefile b/Makefile
+index 5547e02f6104a..bf463afef54bf 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 9
++SUBLEVEL = 10
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/arm/boot/dts/at91-qil_a9260.dts b/arch/arm/boot/dts/at91-qil_a9260.dts
+index 9d26f99963483..5ccb3c139592d 100644
+--- a/arch/arm/boot/dts/at91-qil_a9260.dts
++++ b/arch/arm/boot/dts/at91-qil_a9260.dts
+@@ -108,7 +108,7 @@
+ status = "okay";
+ };
+
+- shdwc@fffffd10 {
++ shdwc: poweroff@fffffd10 {
+ atmel,wakeup-counter = <10>;
+ atmel,wakeup-rtt-timer;
+ };
+diff --git a/arch/arm/boot/dts/at91-sama5d27_som1_ek.dts b/arch/arm/boot/dts/at91-sama5d27_som1_ek.dts
+index 52ddd0571f1c0..d0a6dbd377dfa 100644
+--- a/arch/arm/boot/dts/at91-sama5d27_som1_ek.dts
++++ b/arch/arm/boot/dts/at91-sama5d27_som1_ek.dts
+@@ -139,7 +139,7 @@
+ };
+ };
+
+- shdwc@f8048010 {
++ poweroff@f8048010 {
+ debounce-delay-us = <976>;
+ atmel,wakeup-rtc-timer;
+
+diff --git a/arch/arm/boot/dts/at91-sama5d2_ptc_ek.dts b/arch/arm/boot/dts/at91-sama5d2_ptc_ek.dts
+index bf1c9ca72a9f3..200b20515ab12 100644
+--- a/arch/arm/boot/dts/at91-sama5d2_ptc_ek.dts
++++ b/arch/arm/boot/dts/at91-sama5d2_ptc_ek.dts
+@@ -204,7 +204,7 @@
+ };
+ };
+
+- shdwc@f8048010 {
++ poweroff@f8048010 {
+ debounce-delay-us = <976>;
+
+ input@0 {
+diff --git a/arch/arm/boot/dts/at91-sama5d2_xplained.dts b/arch/arm/boot/dts/at91-sama5d2_xplained.dts
+index 2d53c47d7cc86..6680031387e8c 100644
+--- a/arch/arm/boot/dts/at91-sama5d2_xplained.dts
++++ b/arch/arm/boot/dts/at91-sama5d2_xplained.dts
+@@ -348,7 +348,7 @@
+ };
+ };
+
+- shdwc@f8048010 {
++ poweroff@f8048010 {
+ debounce-delay-us = <976>;
+ atmel,wakeup-rtc-timer;
+
+diff --git a/arch/arm/boot/dts/at91rm9200.dtsi b/arch/arm/boot/dts/at91rm9200.dtsi
+index 6f9004ebf4245..37b500f6f3956 100644
+--- a/arch/arm/boot/dts/at91rm9200.dtsi
++++ b/arch/arm/boot/dts/at91rm9200.dtsi
+@@ -102,7 +102,7 @@
+ reg = <0xffffff00 0x100>;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91rm9200-pmc", "syscon";
+ reg = <0xfffffc00 0x100>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+diff --git a/arch/arm/boot/dts/at91sam9260.dtsi b/arch/arm/boot/dts/at91sam9260.dtsi
+index 789fe356dbf60..35a007365b6a5 100644
+--- a/arch/arm/boot/dts/at91sam9260.dtsi
++++ b/arch/arm/boot/dts/at91sam9260.dtsi
+@@ -115,7 +115,7 @@
+ reg = <0xffffee00 0x200>;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9260-pmc", "syscon";
+ reg = <0xfffffc00 0x100>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -130,7 +130,7 @@
+ clocks = <&pmc PMC_TYPE_CORE PMC_SLOW>;
+ };
+
+- shdwc@fffffd10 {
++ shdwc: poweroff@fffffd10 {
+ compatible = "atmel,at91sam9260-shdwc";
+ reg = <0xfffffd10 0x10>;
+ clocks = <&pmc PMC_TYPE_CORE PMC_SLOW>;
+diff --git a/arch/arm/boot/dts/at91sam9260ek.dts b/arch/arm/boot/dts/at91sam9260ek.dts
+index bb72f050a4fef..720c15472c4a5 100644
+--- a/arch/arm/boot/dts/at91sam9260ek.dts
++++ b/arch/arm/boot/dts/at91sam9260ek.dts
+@@ -112,7 +112,7 @@
+ };
+ };
+
+- shdwc@fffffd10 {
++ shdwc: poweroff@fffffd10 {
+ atmel,wakeup-counter = <10>;
+ atmel,wakeup-rtt-timer;
+ };
+diff --git a/arch/arm/boot/dts/at91sam9261.dtsi b/arch/arm/boot/dts/at91sam9261.dtsi
+index ee0bd1aceb3f0..528ffc6f6f962 100644
+--- a/arch/arm/boot/dts/at91sam9261.dtsi
++++ b/arch/arm/boot/dts/at91sam9261.dtsi
+@@ -599,7 +599,7 @@
+ };
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9261-pmc", "syscon";
+ reg = <0xfffffc00 0x100>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -614,7 +614,7 @@
+ clocks = <&slow_xtal>;
+ };
+
+- shdwc@fffffd10 {
++ poweroff@fffffd10 {
+ compatible = "atmel,at91sam9260-shdwc";
+ reg = <0xfffffd10 0x10>;
+ clocks = <&slow_xtal>;
+diff --git a/arch/arm/boot/dts/at91sam9263.dtsi b/arch/arm/boot/dts/at91sam9263.dtsi
+index 3ce9ea9873129..75d8ff2d12c8a 100644
+--- a/arch/arm/boot/dts/at91sam9263.dtsi
++++ b/arch/arm/boot/dts/at91sam9263.dtsi
+@@ -101,7 +101,7 @@
+ atmel,external-irqs = <30 31>;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9263-pmc", "syscon";
+ reg = <0xfffffc00 0x100>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -158,7 +158,7 @@
+ clocks = <&slow_xtal>;
+ };
+
+- shdwc@fffffd10 {
++ poweroff@fffffd10 {
+ compatible = "atmel,at91sam9260-shdwc";
+ reg = <0xfffffd10 0x10>;
+ clocks = <&slow_xtal>;
+diff --git a/arch/arm/boot/dts/at91sam9g20.dtsi b/arch/arm/boot/dts/at91sam9g20.dtsi
+index 708e1646b7f46..738a43ffd2281 100644
+--- a/arch/arm/boot/dts/at91sam9g20.dtsi
++++ b/arch/arm/boot/dts/at91sam9g20.dtsi
+@@ -41,7 +41,7 @@
+ atmel,adc-startup-time = <40>;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9g20-pmc", "atmel,at91sam9260-pmc", "syscon";
+ };
+ };
+diff --git a/arch/arm/boot/dts/at91sam9g20ek_common.dtsi b/arch/arm/boot/dts/at91sam9g20ek_common.dtsi
+index 024af2db638eb..565b99e79c520 100644
+--- a/arch/arm/boot/dts/at91sam9g20ek_common.dtsi
++++ b/arch/arm/boot/dts/at91sam9g20ek_common.dtsi
+@@ -126,7 +126,7 @@
+ };
+ };
+
+- shdwc@fffffd10 {
++ shdwc: poweroff@fffffd10 {
+ atmel,wakeup-counter = <10>;
+ atmel,wakeup-rtt-timer;
+ };
+diff --git a/arch/arm/boot/dts/at91sam9g25.dtsi b/arch/arm/boot/dts/at91sam9g25.dtsi
+index d2f13afb35eaf..ec3c77221881c 100644
+--- a/arch/arm/boot/dts/at91sam9g25.dtsi
++++ b/arch/arm/boot/dts/at91sam9g25.dtsi
+@@ -26,7 +26,7 @@
+ >;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9g25-pmc", "atmel,at91sam9x5-pmc", "syscon";
+ };
+ };
+diff --git a/arch/arm/boot/dts/at91sam9g35.dtsi b/arch/arm/boot/dts/at91sam9g35.dtsi
+index 48c2bc4a7753d..c9cfb93092ee6 100644
+--- a/arch/arm/boot/dts/at91sam9g35.dtsi
++++ b/arch/arm/boot/dts/at91sam9g35.dtsi
+@@ -25,7 +25,7 @@
+ >;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9g35-pmc", "atmel,at91sam9x5-pmc", "syscon";
+ };
+ };
+diff --git a/arch/arm/boot/dts/at91sam9g45.dtsi b/arch/arm/boot/dts/at91sam9g45.dtsi
+index 95f5d76234dbb..7cccc606e36cd 100644
+--- a/arch/arm/boot/dts/at91sam9g45.dtsi
++++ b/arch/arm/boot/dts/at91sam9g45.dtsi
+@@ -129,7 +129,7 @@
+ reg = <0xffffea00 0x200>;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9g45-pmc", "syscon";
+ reg = <0xfffffc00 0x100>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -152,7 +152,7 @@
+ };
+
+
+- shdwc@fffffd10 {
++ poweroff@fffffd10 {
+ compatible = "atmel,at91sam9rl-shdwc";
+ reg = <0xfffffd10 0x10>;
+ clocks = <&clk32k>;
+@@ -923,7 +923,7 @@
+ status = "disabled";
+ };
+
+- clk32k: sckc@fffffd50 {
++ clk32k: clock-controller@fffffd50 {
+ compatible = "atmel,at91sam9x5-sckc";
+ reg = <0xfffffd50 0x4>;
+ clocks = <&slow_xtal>;
+diff --git a/arch/arm/boot/dts/at91sam9n12.dtsi b/arch/arm/boot/dts/at91sam9n12.dtsi
+index 83114d26f10d0..16a9a908985da 100644
+--- a/arch/arm/boot/dts/at91sam9n12.dtsi
++++ b/arch/arm/boot/dts/at91sam9n12.dtsi
+@@ -118,7 +118,7 @@
+ reg = <0xffffea00 0x200>;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9n12-pmc", "syscon";
+ reg = <0xfffffc00 0x200>;
+ #clock-cells = <2>;
+@@ -140,7 +140,7 @@
+ clocks = <&pmc PMC_TYPE_CORE PMC_MCK>;
+ };
+
+- shdwc@fffffe10 {
++ poweroff@fffffe10 {
+ compatible = "atmel,at91sam9x5-shdwc";
+ reg = <0xfffffe10 0x10>;
+ clocks = <&clk32k>;
+diff --git a/arch/arm/boot/dts/at91sam9rl.dtsi b/arch/arm/boot/dts/at91sam9rl.dtsi
+index 364a2ff0a763d..3d089ffbe1626 100644
+--- a/arch/arm/boot/dts/at91sam9rl.dtsi
++++ b/arch/arm/boot/dts/at91sam9rl.dtsi
+@@ -763,7 +763,7 @@
+ };
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9rl-pmc", "syscon";
+ reg = <0xfffffc00 0x100>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -778,7 +778,7 @@
+ clocks = <&clk32k>;
+ };
+
+- shdwc@fffffd10 {
++ poweroff@fffffd10 {
+ compatible = "atmel,at91sam9260-shdwc";
+ reg = <0xfffffd10 0x10>;
+ clocks = <&clk32k>;
+@@ -799,7 +799,7 @@
+ status = "disabled";
+ };
+
+- clk32k: sckc@fffffd50 {
++ clk32k: clock-controller@fffffd50 {
+ compatible = "atmel,at91sam9x5-sckc";
+ reg = <0xfffffd50 0x4>;
+ clocks = <&slow_xtal>;
+diff --git a/arch/arm/boot/dts/at91sam9x25.dtsi b/arch/arm/boot/dts/at91sam9x25.dtsi
+index 0fe8802e1242b..7036f5f045715 100644
+--- a/arch/arm/boot/dts/at91sam9x25.dtsi
++++ b/arch/arm/boot/dts/at91sam9x25.dtsi
+@@ -27,7 +27,7 @@
+ >;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9x25-pmc", "atmel,at91sam9x5-pmc", "syscon";
+ };
+ };
+diff --git a/arch/arm/boot/dts/at91sam9x35.dtsi b/arch/arm/boot/dts/at91sam9x35.dtsi
+index 0bfa21f18f870..eb03b0497e371 100644
+--- a/arch/arm/boot/dts/at91sam9x35.dtsi
++++ b/arch/arm/boot/dts/at91sam9x35.dtsi
+@@ -26,7 +26,7 @@
+ >;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9x35-pmc", "atmel,at91sam9x5-pmc", "syscon";
+ };
+ };
+diff --git a/arch/arm/boot/dts/at91sam9x5.dtsi b/arch/arm/boot/dts/at91sam9x5.dtsi
+index 0c26c925761b2..a1fed912f2eea 100644
+--- a/arch/arm/boot/dts/at91sam9x5.dtsi
++++ b/arch/arm/boot/dts/at91sam9x5.dtsi
+@@ -126,7 +126,7 @@
+ reg = <0xffffea00 0x200>;
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,at91sam9x5-pmc", "syscon";
+ reg = <0xfffffc00 0x200>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -141,7 +141,7 @@
+ clocks = <&clk32k>;
+ };
+
+- shutdown_controller: shdwc@fffffe10 {
++ shutdown_controller: poweroff@fffffe10 {
+ compatible = "atmel,at91sam9x5-shdwc";
+ reg = <0xfffffe10 0x10>;
+ clocks = <&clk32k>;
+@@ -154,7 +154,7 @@
+ clocks = <&pmc PMC_TYPE_CORE PMC_MCK>;
+ };
+
+- clk32k: sckc@fffffe50 {
++ clk32k: clock-controller@fffffe50 {
+ compatible = "atmel,at91sam9x5-sckc";
+ reg = <0xfffffe50 0x4>;
+ clocks = <&slow_xtal>;
+diff --git a/arch/arm/boot/dts/imx53-sk-imx53.dts b/arch/arm/boot/dts/imx53-sk-imx53.dts
+index 103e73176e47d..1a00d290092ad 100644
+--- a/arch/arm/boot/dts/imx53-sk-imx53.dts
++++ b/arch/arm/boot/dts/imx53-sk-imx53.dts
+@@ -60,6 +60,16 @@
+ status = "okay";
+ };
+
++&cpu0 {
++ /* CPU rated to 800 MHz, not the default 1.2GHz. */
++ operating-points = <
++ /* kHz uV */
++ 166666 850000
++ 400000 900000
++ 800000 1050000
++ >;
++};
++
+ &ecspi1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_ecspi1>;
+diff --git a/arch/arm/boot/dts/imx6sll.dtsi b/arch/arm/boot/dts/imx6sll.dtsi
+index 2873369a57c02..3659fd5ecfa62 100644
+--- a/arch/arm/boot/dts/imx6sll.dtsi
++++ b/arch/arm/boot/dts/imx6sll.dtsi
+@@ -552,7 +552,7 @@
+ reg = <0x020ca000 0x1000>;
+ interrupts = <GIC_SPI 41 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&clks IMX6SLL_CLK_USBPHY2>;
+- phy-reg_3p0-supply = <&reg_3p0>;
++ phy-3p0-supply = <&reg_3p0>;
+ fsl,anatop = <&anatop>;
+ };
+
+diff --git a/arch/arm/boot/dts/sam9x60.dtsi b/arch/arm/boot/dts/sam9x60.dtsi
+index e67ede940071f..73d570a172690 100644
+--- a/arch/arm/boot/dts/sam9x60.dtsi
++++ b/arch/arm/boot/dts/sam9x60.dtsi
+@@ -172,7 +172,7 @@
+ status = "disabled";
+
+ uart4: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <13 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -240,7 +240,7 @@
+ status = "disabled";
+
+ uart5: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ atmel,usart-mode = <AT91_USART_MODE_SERIAL>;
+ interrupts = <14 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -370,7 +370,7 @@
+ status = "disabled";
+
+ uart11: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <32 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -419,7 +419,7 @@
+ status = "disabled";
+
+ uart12: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <33 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -576,7 +576,7 @@
+ status = "disabled";
+
+ uart6: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <9 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -625,7 +625,7 @@
+ status = "disabled";
+
+ uart7: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <10 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -674,7 +674,7 @@
+ status = "disabled";
+
+ uart8: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <11 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -723,7 +723,7 @@
+ status = "disabled";
+
+ uart0: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <5 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -791,7 +791,7 @@
+ status = "disabled";
+
+ uart1: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <6 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -859,7 +859,7 @@
+ status = "disabled";
+
+ uart2: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <7 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -927,7 +927,7 @@
+ status = "disabled";
+
+ uart3: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <8 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -1050,7 +1050,7 @@
+ status = "disabled";
+
+ uart9: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <15 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -1099,7 +1099,7 @@
+ status = "disabled";
+
+ uart10: serial@200 {
+- compatible = "microchip,sam9x60-dbgu", "microchip,sam9x60-usart", "atmel,at91sam9260-dbgu", "atmel,at91sam9260-usart";
++ compatible = "microchip,sam9x60-usart", "atmel,at91sam9260-usart";
+ reg = <0x200 0x200>;
+ interrupts = <16 IRQ_TYPE_LEVEL_HIGH 7>;
+ dmas = <&dma0
+@@ -1282,7 +1282,7 @@
+ };
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "microchip,sam9x60-pmc", "syscon";
+ reg = <0xfffffc00 0x200>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -1297,7 +1297,7 @@
+ clocks = <&clk32k 0>;
+ };
+
+- shutdown_controller: shdwc@fffffe10 {
++ shutdown_controller: poweroff@fffffe10 {
+ compatible = "microchip,sam9x60-shdwc";
+ reg = <0xfffffe10 0x10>;
+ clocks = <&clk32k 0>;
+@@ -1322,7 +1322,7 @@
+ clocks = <&pmc PMC_TYPE_CORE PMC_MCK>;
+ };
+
+- clk32k: sckc@fffffe50 {
++ clk32k: clock-controller@fffffe50 {
+ compatible = "microchip,sam9x60-sckc";
+ reg = <0xfffffe50 0x4>;
+ clocks = <&slow_xtal>;
+diff --git a/arch/arm/boot/dts/sama5d2.dtsi b/arch/arm/boot/dts/sama5d2.dtsi
+index 14c35c12a115f..8ae270fabfa82 100644
+--- a/arch/arm/boot/dts/sama5d2.dtsi
++++ b/arch/arm/boot/dts/sama5d2.dtsi
+@@ -284,7 +284,7 @@
+ clock-names = "dma_clk";
+ };
+
+- pmc: pmc@f0014000 {
++ pmc: clock-controller@f0014000 {
+ compatible = "atmel,sama5d2-pmc", "syscon";
+ reg = <0xf0014000 0x160>;
+ interrupts = <74 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -680,7 +680,7 @@
+ clocks = <&clk32k>;
+ };
+
+- shutdown_controller: shdwc@f8048010 {
++ shutdown_controller: poweroff@f8048010 {
+ compatible = "atmel,sama5d2-shdwc";
+ reg = <0xf8048010 0x10>;
+ clocks = <&clk32k>;
+@@ -704,7 +704,7 @@
+ status = "disabled";
+ };
+
+- clk32k: sckc@f8048050 {
++ clk32k: clock-controller@f8048050 {
+ compatible = "atmel,sama5d4-sckc";
+ reg = <0xf8048050 0x4>;
+
+diff --git a/arch/arm/boot/dts/sama5d3.dtsi b/arch/arm/boot/dts/sama5d3.dtsi
+index bde8e92d60bb1..d9e66700d1c20 100644
+--- a/arch/arm/boot/dts/sama5d3.dtsi
++++ b/arch/arm/boot/dts/sama5d3.dtsi
+@@ -1001,7 +1001,7 @@
+ };
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ compatible = "atmel,sama5d3-pmc", "syscon";
+ reg = <0xfffffc00 0x120>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -1016,7 +1016,7 @@
+ clocks = <&clk32k>;
+ };
+
+- shutdown_controller: shutdown-controller@fffffe10 {
++ shutdown_controller: poweroff@fffffe10 {
+ compatible = "atmel,at91sam9x5-shdwc";
+ reg = <0xfffffe10 0x10>;
+ clocks = <&clk32k>;
+@@ -1040,7 +1040,7 @@
+ status = "disabled";
+ };
+
+- clk32k: sckc@fffffe50 {
++ clk32k: clock-controller@fffffe50 {
+ compatible = "atmel,sama5d3-sckc";
+ reg = <0xfffffe50 0x4>;
+ clocks = <&slow_xtal>;
+diff --git a/arch/arm/boot/dts/sama5d3_emac.dtsi b/arch/arm/boot/dts/sama5d3_emac.dtsi
+index 45226108850d2..5d7ce13de8ccf 100644
+--- a/arch/arm/boot/dts/sama5d3_emac.dtsi
++++ b/arch/arm/boot/dts/sama5d3_emac.dtsi
+@@ -30,7 +30,7 @@
+ };
+ };
+
+- pmc: pmc@fffffc00 {
++ pmc: clock-controller@fffffc00 {
+ };
+
+ macb1: ethernet@f802c000 {
+diff --git a/arch/arm/boot/dts/sama5d4.dtsi b/arch/arm/boot/dts/sama5d4.dtsi
+index af62157ae214f..41284e013f531 100644
+--- a/arch/arm/boot/dts/sama5d4.dtsi
++++ b/arch/arm/boot/dts/sama5d4.dtsi
+@@ -250,7 +250,7 @@
+ clock-names = "dma_clk";
+ };
+
+- pmc: pmc@f0018000 {
++ pmc: clock-controller@f0018000 {
+ compatible = "atmel,sama5d4-pmc", "syscon";
+ reg = <0xf0018000 0x120>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH 7>;
+@@ -740,7 +740,7 @@
+ clocks = <&clk32k>;
+ };
+
+- shutdown_controller: shdwc@fc068610 {
++ shutdown_controller: poweroff@fc068610 {
+ compatible = "atmel,at91sam9x5-shdwc";
+ reg = <0xfc068610 0x10>;
+ clocks = <&clk32k>;
+@@ -761,7 +761,7 @@
+ status = "disabled";
+ };
+
+- clk32k: sckc@fc068650 {
++ clk32k: clock-controller@fc068650 {
+ compatible = "atmel,sama5d4-sckc";
+ reg = <0xfc068650 0x4>;
+ #clock-cells = <0>;
+diff --git a/arch/arm/boot/dts/sama7g5.dtsi b/arch/arm/boot/dts/sama7g5.dtsi
+index 929ba73702e93..9642a42d84e60 100644
+--- a/arch/arm/boot/dts/sama7g5.dtsi
++++ b/arch/arm/boot/dts/sama7g5.dtsi
+@@ -241,7 +241,7 @@
+ clocks = <&pmc PMC_TYPE_PERIPHERAL 11>;
+ };
+
+- pmc: pmc@e0018000 {
++ pmc: clock-controller@e0018000 {
+ compatible = "microchip,sama7g5-pmc", "syscon";
+ reg = <0xe0018000 0x200>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+@@ -257,7 +257,7 @@
+ clocks = <&clk32k 0>;
+ };
+
+- shdwc: shdwc@e001d010 {
++ shdwc: poweroff@e001d010 {
+ compatible = "microchip,sama7g5-shdwc", "syscon";
+ reg = <0xe001d010 0x10>;
+ clocks = <&clk32k 0>;
+diff --git a/arch/arm/boot/dts/usb_a9260.dts b/arch/arm/boot/dts/usb_a9260.dts
+index 6cfa83921ac26..66f8da89007db 100644
+--- a/arch/arm/boot/dts/usb_a9260.dts
++++ b/arch/arm/boot/dts/usb_a9260.dts
+@@ -22,7 +22,7 @@
+
+ ahb {
+ apb {
+- shdwc@fffffd10 {
++ shdwc: poweroff@fffffd10 {
+ atmel,wakeup-counter = <10>;
+ atmel,wakeup-rtt-timer;
+ };
+diff --git a/arch/arm/boot/dts/usb_a9263.dts b/arch/arm/boot/dts/usb_a9263.dts
+index b6cb9cdf81973..45745915b2e16 100644
+--- a/arch/arm/boot/dts/usb_a9263.dts
++++ b/arch/arm/boot/dts/usb_a9263.dts
+@@ -67,7 +67,7 @@
+ };
+ };
+
+- shdwc@fffffd10 {
++ poweroff@fffffd10 {
+ atmel,wakeup-counter = <10>;
+ atmel,wakeup-rtt-timer;
+ };
+diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
+index 38ae674f2f02a..3037f58057c9f 100644
+--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
++++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
+@@ -145,7 +145,7 @@
+ status = "okay";
+ clock-frequency = <100000>;
+ i2c-sda-falling-time-ns = <890>; /* hcnt */
+- i2c-sdl-falling-time-ns = <890>; /* lcnt */
++ i2c-scl-falling-time-ns = <890>; /* lcnt */
+
+ pinctrl-names = "default", "gpio";
+ pinctrl-0 = <&i2c1_pmx_func>;
+diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
+index ede99dcc05580..f4cf30bac5574 100644
+--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
++++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
+@@ -141,7 +141,7 @@
+ status = "okay";
+ clock-frequency = <100000>;
+ i2c-sda-falling-time-ns = <890>; /* hcnt */
+- i2c-sdl-falling-time-ns = <890>; /* lcnt */
++ i2c-scl-falling-time-ns = <890>; /* lcnt */
+
+ adc@14 {
+ compatible = "lltc,ltc2497";
+diff --git a/arch/arm64/boot/dts/freescale/imx8mm-phyboard-polis-rdk.dts b/arch/arm64/boot/dts/freescale/imx8mm-phyboard-polis-rdk.dts
+index 03e7679217b24..479948f8a4b75 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mm-phyboard-polis-rdk.dts
++++ b/arch/arm64/boot/dts/freescale/imx8mm-phyboard-polis-rdk.dts
+@@ -141,7 +141,7 @@
+ };
+
+ &gpio1 {
+- gpio-line-names = "nINT_ETHPHY", "LED_RED", "WDOG_INT", "X_RTC_INT",
++ gpio-line-names = "", "LED_RED", "WDOG_INT", "X_RTC_INT",
+ "", "", "", "RESET_ETHPHY",
+ "CAN_nINT", "CAN_EN", "nENABLE_FLATLINK", "",
+ "USB_OTG_VBUS_EN", "", "LED_GREEN", "LED_BLUE";
+diff --git a/arch/arm64/boot/dts/freescale/imx8mm-phycore-som.dtsi b/arch/arm64/boot/dts/freescale/imx8mm-phycore-som.dtsi
+index 92616bc4f71f5..847f08537b48a 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mm-phycore-som.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx8mm-phycore-som.dtsi
+@@ -111,7 +111,7 @@
+ };
+
+ &gpio1 {
+- gpio-line-names = "nINT_ETHPHY", "", "WDOG_INT", "X_RTC_INT",
++ gpio-line-names = "", "", "WDOG_INT", "X_RTC_INT",
+ "", "", "", "RESET_ETHPHY",
+ "", "", "nENABLE_FLATLINK";
+ };
+@@ -210,7 +210,7 @@
+ };
+ };
+
+- reg_vdd_gpu: buck3 {
++ reg_vdd_vpu: buck3 {
+ regulator-always-on;
+ regulator-boot-on;
+ regulator-max-microvolt = <1000000>;
+diff --git a/arch/arm64/boot/dts/freescale/imx8mm-venice-gw7903.dts b/arch/arm64/boot/dts/freescale/imx8mm-venice-gw7903.dts
+index 363020a08c9b8..4660d086cb099 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mm-venice-gw7903.dts
++++ b/arch/arm64/boot/dts/freescale/imx8mm-venice-gw7903.dts
+@@ -567,6 +567,10 @@
+ status = "okay";
+ };
+
++&disp_blk_ctrl {
++ status = "disabled";
++};
++
+ &pgc_mipi {
+ status = "disabled";
+ };
+diff --git a/arch/arm64/boot/dts/freescale/imx8mm-venice-gw7904.dts b/arch/arm64/boot/dts/freescale/imx8mm-venice-gw7904.dts
+index 93088fa1c3b9c..d5b7168558124 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mm-venice-gw7904.dts
++++ b/arch/arm64/boot/dts/freescale/imx8mm-venice-gw7904.dts
+@@ -628,6 +628,10 @@
+ status = "okay";
+ };
+
++&disp_blk_ctrl {
++ status = "disabled";
++};
++
+ &pgc_mipi {
+ status = "disabled";
+ };
+diff --git a/arch/arm64/boot/dts/freescale/imx8mn-var-som.dtsi b/arch/arm64/boot/dts/freescale/imx8mn-var-som.dtsi
+index cbd9d124c80d0..c9d4fb75c21d3 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mn-var-som.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx8mn-var-som.dtsi
+@@ -351,7 +351,7 @@
+ MX8MN_IOMUXC_ENET_RXC_ENET1_RGMII_RXC 0x91
+ MX8MN_IOMUXC_ENET_RX_CTL_ENET1_RGMII_RX_CTL 0x91
+ MX8MN_IOMUXC_ENET_TX_CTL_ENET1_RGMII_TX_CTL 0x1f
+- MX8MN_IOMUXC_GPIO1_IO09_GPIO1_IO9 0x19
++ MX8MN_IOMUXC_GPIO1_IO09_GPIO1_IO9 0x159
+ >;
+ };
+
+diff --git a/arch/arm64/boot/dts/freescale/imx8mq.dtsi b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
+index 0492556a10dbc..345c70c6c697a 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
+@@ -770,7 +770,7 @@
+ <&clk IMX8MQ_SYS1_PLL_800M>,
+ <&clk IMX8MQ_VPU_PLL>;
+ assigned-clock-rates = <600000000>,
+- <600000000>,
++ <300000000>,
+ <800000000>,
+ <0>;
+ };
+diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
+index 520b681a07bb0..75c37b1c55aaf 100644
+--- a/arch/arm64/kernel/fpsimd.c
++++ b/arch/arm64/kernel/fpsimd.c
+@@ -679,7 +679,7 @@ static void fpsimd_to_sve(struct task_struct *task)
+ void *sst = task->thread.sve_state;
+ struct user_fpsimd_state const *fst = &task->thread.uw.fpsimd_state;
+
+- if (!system_supports_sve())
++ if (!system_supports_sve() && !system_supports_sme())
+ return;
+
+ vq = sve_vq_from_vl(thread_get_cur_vl(&task->thread));
+@@ -705,7 +705,7 @@ static void sve_to_fpsimd(struct task_struct *task)
+ unsigned int i;
+ __uint128_t const *p;
+
+- if (!system_supports_sve())
++ if (!system_supports_sve() && !system_supports_sme())
+ return;
+
+ vl = thread_get_cur_vl(&task->thread);
+@@ -835,7 +835,8 @@ void sve_sync_from_fpsimd_zeropad(struct task_struct *task)
+ void *sst = task->thread.sve_state;
+ struct user_fpsimd_state const *fst = &task->thread.uw.fpsimd_state;
+
+- if (!test_tsk_thread_flag(task, TIF_SVE))
++ if (!test_tsk_thread_flag(task, TIF_SVE) &&
++ !thread_sm_enabled(&task->thread))
+ return;
+
+ vq = sve_vq_from_vl(thread_get_cur_vl(&task->thread));
+@@ -909,7 +910,7 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
+ */
+ task->thread.svcr &= ~(SVCR_SM_MASK |
+ SVCR_ZA_MASK);
+- clear_thread_flag(TIF_SME);
++ clear_tsk_thread_flag(task, TIF_SME);
+ free_sme = true;
+ }
+ }
+diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
+index d7f4f0d1ae120..5b9b4305248b8 100644
+--- a/arch/arm64/kernel/ptrace.c
++++ b/arch/arm64/kernel/ptrace.c
+@@ -932,11 +932,13 @@ static int sve_set_common(struct task_struct *target,
+ /*
+ * Ensure target->thread.sve_state is up to date with target's
+ * FPSIMD regs, so that a short copyin leaves trailing
+- * registers unmodified. Always enable SVE even if going into
+- * streaming mode.
++ * registers unmodified. Only enable SVE if we are
++ * configuring normal SVE, a system with streaming SVE may not
++ * have normal SVE.
+ */
+ fpsimd_sync_to_sve(target);
+- set_tsk_thread_flag(target, TIF_SVE);
++ if (type == ARM64_VEC_SVE)
++ set_tsk_thread_flag(target, TIF_SVE);
+ target->thread.fp_type = FP_STATE_SVE;
+
+ BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
+@@ -1180,6 +1182,8 @@ static int zt_set(struct task_struct *target,
+ if (ret == 0)
+ target->thread.svcr |= SVCR_ZA_MASK;
+
++ fpsimd_flush_task_state(target);
++
+ return ret;
+ }
+
+diff --git a/arch/parisc/mm/fixmap.c b/arch/parisc/mm/fixmap.c
+index cc15d737fda64..ae3493dae9dc9 100644
+--- a/arch/parisc/mm/fixmap.c
++++ b/arch/parisc/mm/fixmap.c
+@@ -19,9 +19,6 @@ void notrace set_fixmap(enum fixed_addresses idx, phys_addr_t phys)
+ pmd_t *pmd = pmd_offset(pud, vaddr);
+ pte_t *pte;
+
+- if (pmd_none(*pmd))
+- pte = pte_alloc_kernel(pmd, vaddr);
+-
+ pte = pte_offset_kernel(pmd, vaddr);
+ set_pte_at(&init_mm, vaddr, pte, __mk_pte(phys, PAGE_KERNEL_RWX));
+ flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE);
+diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
+index b0c43f3b0a5f8..16b3ef4b89763 100644
+--- a/arch/parisc/mm/init.c
++++ b/arch/parisc/mm/init.c
+@@ -671,6 +671,39 @@ static void __init gateway_init(void)
+ PAGE_SIZE, PAGE_GATEWAY, 1);
+ }
+
++static void __init fixmap_init(void)
++{
++ unsigned long addr = FIXMAP_START;
++ unsigned long end = FIXMAP_START + FIXMAP_SIZE;
++ pgd_t *pgd = pgd_offset_k(addr);
++ p4d_t *p4d = p4d_offset(pgd, addr);
++ pud_t *pud = pud_offset(p4d, addr);
++ pmd_t *pmd;
++
++ BUILD_BUG_ON(FIXMAP_SIZE > PMD_SIZE);
++
++#if CONFIG_PGTABLE_LEVELS == 3
++ if (pud_none(*pud)) {
++ pmd = memblock_alloc(PAGE_SIZE << PMD_TABLE_ORDER,
++ PAGE_SIZE << PMD_TABLE_ORDER);
++ if (!pmd)
++ panic("fixmap: pmd allocation failed.\n");
++ pud_populate(NULL, pud, pmd);
++ }
++#endif
++
++ pmd = pmd_offset(pud, addr);
++ do {
++ pte_t *pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
++ if (!pte)
++ panic("fixmap: pte allocation failed.\n");
++
++ pmd_populate_kernel(&init_mm, pmd, pte);
++
++ addr += PAGE_SIZE;
++ } while (addr < end);
++}
++
+ static void __init parisc_bootmem_free(void)
+ {
+ unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0, };
+@@ -685,6 +718,7 @@ void __init paging_init(void)
+ setup_bootmem();
+ pagetable_init();
+ gateway_init();
++ fixmap_init();
+ flush_cache_all_local(); /* start with known state */
+ flush_tlb_all_local(NULL);
+
+diff --git a/arch/powerpc/include/asm/word-at-a-time.h b/arch/powerpc/include/asm/word-at-a-time.h
+index 46c31fb8748d5..30a12d2086871 100644
+--- a/arch/powerpc/include/asm/word-at-a-time.h
++++ b/arch/powerpc/include/asm/word-at-a-time.h
+@@ -34,7 +34,7 @@ static inline long find_zero(unsigned long mask)
+ return leading_zero_bits >> 3;
+ }
+
+-static inline bool has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
++static inline unsigned long has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
+ {
+ unsigned long rhs = val | c->low_bits;
+ *data = rhs;
+diff --git a/arch/powerpc/kernel/trace/ftrace_mprofile.S b/arch/powerpc/kernel/trace/ftrace_mprofile.S
+index ffb1db3868499..1f7d86de1538e 100644
+--- a/arch/powerpc/kernel/trace/ftrace_mprofile.S
++++ b/arch/powerpc/kernel/trace/ftrace_mprofile.S
+@@ -33,6 +33,9 @@
+ * and then arrange for the ftrace function to be called.
+ */
+ .macro ftrace_regs_entry allregs
++ /* Create a minimal stack frame for representing B */
++ PPC_STLU r1, -STACK_FRAME_MIN_SIZE(r1)
++
+ /* Create our stack frame + pt_regs */
+ PPC_STLU r1,-SWITCH_FRAME_SIZE(r1)
+
+@@ -42,7 +45,7 @@
+
+ #ifdef CONFIG_PPC64
+ /* Save the original return address in A's stack frame */
+- std r0, LRSAVE+SWITCH_FRAME_SIZE(r1)
++ std r0, LRSAVE+SWITCH_FRAME_SIZE+STACK_FRAME_MIN_SIZE(r1)
+ /* Ok to continue? */
+ lbz r3, PACA_FTRACE_ENABLED(r13)
+ cmpdi r3, 0
+@@ -77,6 +80,8 @@
+ mflr r7
+ /* Save it as pt_regs->nip */
+ PPC_STL r7, _NIP(r1)
++ /* Also save it in B's stackframe header for proper unwind */
++ PPC_STL r7, LRSAVE+SWITCH_FRAME_SIZE(r1)
+ /* Save the read LR in pt_regs->link */
+ PPC_STL r0, _LINK(r1)
+
+@@ -142,7 +147,7 @@
+ #endif
+
+ /* Pop our stack frame */
+- addi r1, r1, SWITCH_FRAME_SIZE
++ addi r1, r1, SWITCH_FRAME_SIZE+STACK_FRAME_MIN_SIZE
+
+ #ifdef CONFIG_LIVEPATCH_64
+ /* Based on the cmpd above, if the NIP was altered handle livepatch */
+diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
+index fe1b83020e0df..0ec5b45b1e86a 100644
+--- a/arch/powerpc/mm/init_64.c
++++ b/arch/powerpc/mm/init_64.c
+@@ -314,8 +314,7 @@ void __ref vmemmap_free(unsigned long start, unsigned long end,
+ start = ALIGN_DOWN(start, page_size);
+ if (altmap) {
+ alt_start = altmap->base_pfn;
+- alt_end = altmap->base_pfn + altmap->reserve +
+- altmap->free + altmap->alloc + altmap->align;
++ alt_end = altmap->base_pfn + altmap->reserve + altmap->free;
+ }
+
+ pr_debug("vmemmap_free %lx...%lx\n", start, end);
+diff --git a/arch/riscv/kernel/crash_core.c b/arch/riscv/kernel/crash_core.c
+index b351a3c013555..55f1d7856b544 100644
+--- a/arch/riscv/kernel/crash_core.c
++++ b/arch/riscv/kernel/crash_core.c
+@@ -18,4 +18,6 @@ void arch_crash_save_vmcoreinfo(void)
+ vmcoreinfo_append_str("NUMBER(MODULES_END)=0x%lx\n", MODULES_END);
+ #endif
+ vmcoreinfo_append_str("NUMBER(KERNEL_LINK_ADDR)=0x%lx\n", KERNEL_LINK_ADDR);
++ vmcoreinfo_append_str("NUMBER(va_kernel_pa_offset)=0x%lx\n",
++ kernel_map.va_kernel_pa_offset);
+ }
+diff --git a/arch/s390/kernel/sthyi.c b/arch/s390/kernel/sthyi.c
+index 4d141e2c132e5..2ea7f208f0e73 100644
+--- a/arch/s390/kernel/sthyi.c
++++ b/arch/s390/kernel/sthyi.c
+@@ -459,9 +459,9 @@ static int sthyi_update_cache(u64 *rc)
+ *
+ * Fills the destination with system information returned by the STHYI
+ * instruction. The data is generated by emulation or execution of STHYI,
+- * if available. The return value is the condition code that would be
+- * returned, the rc parameter is the return code which is passed in
+- * register R2 + 1.
++ * if available. The return value is either a negative error value or
++ * the condition code that would be returned, the rc parameter is the
++ * return code which is passed in register R2 + 1.
+ */
+ int sthyi_fill(void *dst, u64 *rc)
+ {
+diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
+index 2cda8d9d7c6ef..f817006f9f936 100644
+--- a/arch/s390/kvm/intercept.c
++++ b/arch/s390/kvm/intercept.c
+@@ -389,8 +389,8 @@ static int handle_partial_execution(struct kvm_vcpu *vcpu)
+ */
+ int handle_sthyi(struct kvm_vcpu *vcpu)
+ {
+- int reg1, reg2, r = 0;
+- u64 code, addr, cc = 0, rc = 0;
++ int reg1, reg2, cc = 0, r = 0;
++ u64 code, addr, rc = 0;
+ struct sthyi_sctns *sctns = NULL;
+
+ if (!test_kvm_facility(vcpu->kvm, 74))
+@@ -421,7 +421,10 @@ int handle_sthyi(struct kvm_vcpu *vcpu)
+ return -ENOMEM;
+
+ cc = sthyi_fill(sctns, &rc);
+-
++ if (cc < 0) {
++ free_page((unsigned long)sctns);
++ return cc;
++ }
+ out:
+ if (!cc) {
+ if (kvm_s390_pv_cpu_is_protected(vcpu)) {
+diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
+index b9dcb4ae6c59a..05f4912380fac 100644
+--- a/arch/s390/mm/vmem.c
++++ b/arch/s390/mm/vmem.c
+@@ -761,6 +761,8 @@ void __init vmem_map_init(void)
+ if (static_key_enabled(&cpu_has_bear))
+ set_memory_nx(0, 1);
+ set_memory_nx(PAGE_SIZE, 1);
++ if (debug_pagealloc_enabled())
++ set_memory_4k(0, ident_map_size >> PAGE_SHIFT);
+
+ pr_info("Write protected kernel read-only data: %luk\n",
+ (unsigned long)(__end_rodata - _stext) >> 10);
+diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
+index 6c04b52f139b5..953e280c07c38 100644
+--- a/arch/x86/hyperv/hv_init.c
++++ b/arch/x86/hyperv/hv_init.c
+@@ -14,6 +14,7 @@
+ #include <asm/apic.h>
+ #include <asm/desc.h>
+ #include <asm/sev.h>
++#include <asm/ibt.h>
+ #include <asm/hypervisor.h>
+ #include <asm/hyperv-tlfs.h>
+ #include <asm/mshyperv.h>
+@@ -471,6 +472,26 @@ void __init hyperv_init(void)
+ wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
+ }
+
++ /*
++ * Some versions of Hyper-V that provide IBT in guest VMs have a bug
++ * in that there's no ENDBR64 instruction at the entry to the
++ * hypercall page. Because hypercalls are invoked via an indirect call
++ * to the hypercall page, all hypercall attempts fail when IBT is
++ * enabled, and Linux panics. For such buggy versions, disable IBT.
++ *
++ * Fixed versions of Hyper-V always provide ENDBR64 on the hypercall
++ * page, so if future Linux kernel versions enable IBT for 32-bit
++ * builds, additional hypercall page hackery will be required here
++ * to provide an ENDBR32.
++ */
++#ifdef CONFIG_X86_KERNEL_IBT
++ if (cpu_feature_enabled(X86_FEATURE_IBT) &&
++ *(u32 *)hv_hypercall_pg != gen_endbr()) {
++ setup_clear_cpu_cap(X86_FEATURE_IBT);
++ pr_warn("Hyper-V: Disabling IBT because of Hyper-V bug\n");
++ }
++#endif
++
+ /*
+ * hyperv_init() is called before LAPIC is initialized: see
+ * apic_intr_mode_init() -> x86_platform.apic_post_init() and
+diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
+index 094f88fee5369..b69b0d7756aab 100644
+--- a/arch/x86/include/asm/cpufeatures.h
++++ b/arch/x86/include/asm/cpufeatures.h
+@@ -495,4 +495,5 @@
+
+ /* BUG word 2 */
+ #define X86_BUG_SRSO X86_BUG(1*32 + 0) /* AMD SRSO bug */
++#define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* AMD DIV0 speculation bug */
+ #endif /* _ASM_X86_CPUFEATURES_H */
+diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
+index 37f1826df2635..e8db1cff76fda 100644
+--- a/arch/x86/include/asm/processor.h
++++ b/arch/x86/include/asm/processor.h
+@@ -684,10 +684,12 @@ extern u16 get_llc_id(unsigned int cpu);
+ extern u32 amd_get_nodes_per_socket(void);
+ extern u32 amd_get_highest_perf(void);
+ extern bool cpu_has_ibpb_brtype_microcode(void);
++extern void amd_clear_divider(void);
+ #else
+ static inline u32 amd_get_nodes_per_socket(void) { return 0; }
+ static inline u32 amd_get_highest_perf(void) { return 0; }
+ static inline bool cpu_has_ibpb_brtype_microcode(void) { return false; }
++static inline void amd_clear_divider(void) { }
+ #endif
+
+ extern unsigned long arch_align_stack(unsigned long sp);
+diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
+index 4239b51e0bc50..c37a3a5cdabd3 100644
+--- a/arch/x86/kernel/cpu/amd.c
++++ b/arch/x86/kernel/cpu/amd.c
+@@ -75,6 +75,10 @@ static const int amd_zenbleed[] =
+ AMD_MODEL_RANGE(0x17, 0x60, 0x0, 0x7f, 0xf),
+ AMD_MODEL_RANGE(0x17, 0xa0, 0x0, 0xaf, 0xf));
+
++static const int amd_div0[] =
++ AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0x00, 0x0, 0x2f, 0xf),
++ AMD_MODEL_RANGE(0x17, 0x50, 0x0, 0x5f, 0xf));
++
+ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
+ {
+ int osvw_id = *erratum++;
+@@ -1130,6 +1134,11 @@ static void init_amd(struct cpuinfo_x86 *c)
+ WARN_ON_ONCE(msr_set_bit(MSR_EFER, _EFER_AUTOIBRS));
+
+ zenbleed_check(c);
++
++ if (cpu_has_amd_erratum(c, amd_div0)) {
++ pr_notice_once("AMD Zen1 DIV0 bug detected. Disable SMT for full protection.\n");
++ setup_force_cpu_bug(X86_BUG_DIV0);
++ }
+ }
+
+ #ifdef CONFIG_X86_32
+@@ -1309,3 +1318,13 @@ void amd_check_microcode(void)
+ {
+ on_each_cpu(zenbleed_check_cpu, NULL, 1);
+ }
++
++/*
++ * Issue a DIV 0/1 insn to clear any division data from previous DIV
++ * operations.
++ */
++void noinstr amd_clear_divider(void)
++{
++ asm volatile(ALTERNATIVE("", "div %2\n\t", X86_BUG_DIV0)
++ :: "a" (0), "d" (0), "r" (1));
++}
+diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
+index 4a817d20ce3bb..1885326a8f659 100644
+--- a/arch/x86/kernel/traps.c
++++ b/arch/x86/kernel/traps.c
+@@ -206,6 +206,8 @@ DEFINE_IDTENTRY(exc_divide_error)
+ {
+ do_error_trap(regs, 0, "divide error", X86_TRAP_DE, SIGFPE,
+ FPE_INTDIV, error_get_trap_addr(regs));
++
++ amd_clear_divider();
+ }
+
+ DEFINE_IDTENTRY(exc_overflow)
+diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
+index 5c86151b0d3a5..e8af9d8f024eb 100644
+--- a/drivers/block/rbd.c
++++ b/drivers/block/rbd.c
+@@ -3675,7 +3675,7 @@ static int rbd_lock(struct rbd_device *rbd_dev)
+ ret = ceph_cls_lock(osdc, &rbd_dev->header_oid, &rbd_dev->header_oloc,
+ RBD_LOCK_NAME, CEPH_CLS_LOCK_EXCLUSIVE, cookie,
+ RBD_LOCK_TAG, "", 0);
+- if (ret)
++ if (ret && ret != -EEXIST)
+ return ret;
+
+ __rbd_lock(rbd_dev, cookie);
+@@ -3878,7 +3878,7 @@ static struct ceph_locker *get_lock_owner_info(struct rbd_device *rbd_dev)
+ &rbd_dev->header_oloc, RBD_LOCK_NAME,
+ &lock_type, &lock_tag, &lockers, &num_lockers);
+ if (ret) {
+- rbd_warn(rbd_dev, "failed to retrieve lockers: %d", ret);
++ rbd_warn(rbd_dev, "failed to get header lockers: %d", ret);
+ return ERR_PTR(ret);
+ }
+
+@@ -3940,8 +3940,10 @@ static int find_watcher(struct rbd_device *rbd_dev,
+ ret = ceph_osdc_list_watchers(osdc, &rbd_dev->header_oid,
+ &rbd_dev->header_oloc, &watchers,
+ &num_watchers);
+- if (ret)
++ if (ret) {
++ rbd_warn(rbd_dev, "failed to get watchers: %d", ret);
+ return ret;
++ }
+
+ sscanf(locker->id.cookie, RBD_LOCK_COOKIE_PREFIX " %llu", &cookie);
+ for (i = 0; i < num_watchers; i++) {
+@@ -3985,8 +3987,12 @@ static int rbd_try_lock(struct rbd_device *rbd_dev)
+ locker = refreshed_locker = NULL;
+
+ ret = rbd_lock(rbd_dev);
+- if (ret != -EBUSY)
++ if (!ret)
++ goto out;
++ if (ret != -EBUSY) {
++ rbd_warn(rbd_dev, "failed to lock header: %d", ret);
+ goto out;
++ }
+
+ /* determine if the current lock holder is still alive */
+ locker = get_lock_owner_info(rbd_dev);
+@@ -4089,11 +4095,8 @@ static int rbd_try_acquire_lock(struct rbd_device *rbd_dev)
+
+ ret = rbd_try_lock(rbd_dev);
+ if (ret < 0) {
+- rbd_warn(rbd_dev, "failed to lock header: %d", ret);
+- if (ret == -EBLOCKLISTED)
+- goto out;
+-
+- ret = 1; /* request lock anyway */
++ rbd_warn(rbd_dev, "failed to acquire lock: %d", ret);
++ goto out;
+ }
+ if (ret > 0) {
+ up_write(&rbd_dev->lock_rwsem);
+@@ -6627,12 +6630,11 @@ static int rbd_add_acquire_lock(struct rbd_device *rbd_dev)
+ cancel_delayed_work_sync(&rbd_dev->lock_dwork);
+ if (!ret)
+ ret = -ETIMEDOUT;
+- }
+
+- if (ret) {
+- rbd_warn(rbd_dev, "failed to acquire exclusive lock: %ld", ret);
+- return ret;
++ rbd_warn(rbd_dev, "failed to acquire lock: %ld", ret);
+ }
++ if (ret)
++ return ret;
+
+ /*
+ * The lock may have been released by now, unless automatic lock
+diff --git a/drivers/clk/imx/clk-imx93.c b/drivers/clk/imx/clk-imx93.c
+index b6c7c2725906c..44f435103c65a 100644
+--- a/drivers/clk/imx/clk-imx93.c
++++ b/drivers/clk/imx/clk-imx93.c
+@@ -291,7 +291,7 @@ static int imx93_clocks_probe(struct platform_device *pdev)
+ anatop_base = devm_of_iomap(dev, np, 0, NULL);
+ of_node_put(np);
+ if (WARN_ON(IS_ERR(anatop_base))) {
+- ret = PTR_ERR(base);
++ ret = PTR_ERR(anatop_base);
+ goto unregister_hws;
+ }
+
+diff --git a/drivers/clk/mediatek/clk-mt8183.c b/drivers/clk/mediatek/clk-mt8183.c
+index 2336a1b69c093..3b605c30e8494 100644
+--- a/drivers/clk/mediatek/clk-mt8183.c
++++ b/drivers/clk/mediatek/clk-mt8183.c
+@@ -328,6 +328,14 @@ static const char * const atb_parents[] = {
+ "syspll_d5"
+ };
+
++static const char * const sspm_parents[] = {
++ "clk26m",
++ "univpll_d2_d4",
++ "syspll_d2_d2",
++ "univpll_d2_d2",
++ "syspll_d3"
++};
++
+ static const char * const dpi0_parents[] = {
+ "clk26m",
+ "tvdpll_d2",
+@@ -506,6 +514,9 @@ static const struct mtk_mux top_muxes[] = {
+ /* CLK_CFG_6 */
+ MUX_GATE_CLR_SET_UPD(CLK_TOP_MUX_ATB, "atb_sel",
+ atb_parents, 0xa0, 0xa4, 0xa8, 0, 2, 7, 0x004, 24),
++ MUX_GATE_CLR_SET_UPD_FLAGS(CLK_TOP_MUX_SSPM, "sspm_sel",
++ sspm_parents, 0xa0, 0xa4, 0xa8, 8, 3, 15, 0x004, 25,
++ CLK_IS_CRITICAL | CLK_SET_RATE_PARENT),
+ MUX_GATE_CLR_SET_UPD(CLK_TOP_MUX_DPI0, "dpi0_sel",
+ dpi0_parents, 0xa0, 0xa4, 0xa8, 16, 4, 23, 0x004, 26),
+ MUX_GATE_CLR_SET_UPD(CLK_TOP_MUX_SCAM, "scam_sel",
+@@ -671,10 +682,18 @@ static const struct mtk_gate_regs infra3_cg_regs = {
+ GATE_MTK(_id, _name, _parent, &infra2_cg_regs, _shift, \
+ &mtk_clk_gate_ops_setclr)
+
++#define GATE_INFRA2_FLAGS(_id, _name, _parent, _shift, _flag) \
++ GATE_MTK_FLAGS(_id, _name, _parent, &infra2_cg_regs, \
++ _shift, &mtk_clk_gate_ops_setclr, _flag)
++
+ #define GATE_INFRA3(_id, _name, _parent, _shift) \
+ GATE_MTK(_id, _name, _parent, &infra3_cg_regs, _shift, \
+ &mtk_clk_gate_ops_setclr)
+
++#define GATE_INFRA3_FLAGS(_id, _name, _parent, _shift, _flag) \
++ GATE_MTK_FLAGS(_id, _name, _parent, &infra3_cg_regs, \
++ _shift, &mtk_clk_gate_ops_setclr, _flag)
++
+ static const struct mtk_gate infra_clks[] = {
+ /* INFRA0 */
+ GATE_INFRA0(CLK_INFRA_PMIC_TMR, "infra_pmic_tmr", "axi_sel", 0),
+@@ -746,7 +765,11 @@ static const struct mtk_gate infra_clks[] = {
+ GATE_INFRA2(CLK_INFRA_UNIPRO_TICK, "infra_unipro_tick", "fufs_sel", 12),
+ GATE_INFRA2(CLK_INFRA_UFS_MP_SAP_BCLK, "infra_ufs_mp_sap_bck", "fufs_sel", 13),
+ GATE_INFRA2(CLK_INFRA_MD32_BCLK, "infra_md32_bclk", "axi_sel", 14),
++ /* infra_sspm is main clock in co-processor, should not be closed in Linux. */
++ GATE_INFRA2_FLAGS(CLK_INFRA_SSPM, "infra_sspm", "sspm_sel", 15, CLK_IS_CRITICAL),
+ GATE_INFRA2(CLK_INFRA_UNIPRO_MBIST, "infra_unipro_mbist", "axi_sel", 16),
++ /* infra_sspm_bus_hclk is main clock in co-processor, should not be closed in Linux. */
++ GATE_INFRA2_FLAGS(CLK_INFRA_SSPM_BUS_HCLK, "infra_sspm_bus_hclk", "axi_sel", 17, CLK_IS_CRITICAL),
+ GATE_INFRA2(CLK_INFRA_I2C5, "infra_i2c5", "i2c_sel", 18),
+ GATE_INFRA2(CLK_INFRA_I2C5_ARBITER, "infra_i2c5_arbiter", "i2c_sel", 19),
+ GATE_INFRA2(CLK_INFRA_I2C5_IMM, "infra_i2c5_imm", "i2c_sel", 20),
+@@ -764,6 +787,10 @@ static const struct mtk_gate infra_clks[] = {
+ GATE_INFRA3(CLK_INFRA_MSDC0_SELF, "infra_msdc0_self", "msdc50_0_sel", 0),
+ GATE_INFRA3(CLK_INFRA_MSDC1_SELF, "infra_msdc1_self", "msdc50_0_sel", 1),
+ GATE_INFRA3(CLK_INFRA_MSDC2_SELF, "infra_msdc2_self", "msdc50_0_sel", 2),
++ /* infra_sspm_26m_self is main clock in co-processor, should not be closed in Linux. */
++ GATE_INFRA3_FLAGS(CLK_INFRA_SSPM_26M_SELF, "infra_sspm_26m_self", "f_f26m_ck", 3, CLK_IS_CRITICAL),
++ /* infra_sspm_32k_self is main clock in co-processor, should not be closed in Linux. */
++ GATE_INFRA3_FLAGS(CLK_INFRA_SSPM_32K_SELF, "infra_sspm_32k_self", "f_f26m_ck", 4, CLK_IS_CRITICAL),
+ GATE_INFRA3(CLK_INFRA_UFS_AXI, "infra_ufs_axi", "axi_sel", 5),
+ GATE_INFRA3(CLK_INFRA_I2C6, "infra_i2c6", "i2c_sel", 6),
+ GATE_INFRA3(CLK_INFRA_AP_MSDC0, "infra_ap_msdc0", "msdc50_hclk_sel", 7),
+diff --git a/drivers/firmware/arm_scmi/mailbox.c b/drivers/firmware/arm_scmi/mailbox.c
+index 1efa5e9392c42..19246ed1f01ff 100644
+--- a/drivers/firmware/arm_scmi/mailbox.c
++++ b/drivers/firmware/arm_scmi/mailbox.c
+@@ -166,8 +166,10 @@ static int mailbox_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
+ return -ENOMEM;
+
+ shmem = of_parse_phandle(cdev->of_node, "shmem", idx);
+- if (!of_device_is_compatible(shmem, "arm,scmi-shmem"))
++ if (!of_device_is_compatible(shmem, "arm,scmi-shmem")) {
++ of_node_put(shmem);
+ return -ENXIO;
++ }
+
+ ret = of_address_to_resource(shmem, 0, &res);
+ of_node_put(shmem);
+diff --git a/drivers/firmware/arm_scmi/raw_mode.c b/drivers/firmware/arm_scmi/raw_mode.c
+index 6971dcf72fb99..0493aa3c12bf5 100644
+--- a/drivers/firmware/arm_scmi/raw_mode.c
++++ b/drivers/firmware/arm_scmi/raw_mode.c
+@@ -818,10 +818,13 @@ static ssize_t scmi_dbg_raw_mode_common_write(struct file *filp,
+ * before sending it with a single RAW xfer.
+ */
+ if (rd->tx_size < rd->tx_req_size) {
+- size_t cnt;
++ ssize_t cnt;
+
+ cnt = simple_write_to_buffer(rd->tx.buf, rd->tx.len, ppos,
+ buf, count);
++ if (cnt < 0)
++ return cnt;
++
+ rd->tx_size += cnt;
+ if (cnt < count)
+ return cnt;
+diff --git a/drivers/firmware/arm_scmi/smc.c b/drivers/firmware/arm_scmi/smc.c
+index 93272e4bbd12b..9ba0aab8ce22d 100644
+--- a/drivers/firmware/arm_scmi/smc.c
++++ b/drivers/firmware/arm_scmi/smc.c
+@@ -23,6 +23,7 @@
+ /**
+ * struct scmi_smc - Structure representing a SCMI smc transport
+ *
++ * @irq: An optional IRQ for completion
+ * @cinfo: SCMI channel info
+ * @shmem: Transmit/Receive shared memory area
+ * @shmem_lock: Lock to protect access to Tx/Rx shared memory area.
+@@ -33,6 +34,7 @@
+ */
+
+ struct scmi_smc {
++ int irq;
+ struct scmi_chan_info *cinfo;
+ struct scmi_shared_mem __iomem *shmem;
+ /* Protect access to shmem area */
+@@ -106,7 +108,7 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
+ struct resource res;
+ struct device_node *np;
+ u32 func_id;
+- int ret, irq;
++ int ret;
+
+ if (!tx)
+ return -ENODEV;
+@@ -116,8 +118,10 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
+ return -ENOMEM;
+
+ np = of_parse_phandle(cdev->of_node, "shmem", 0);
+- if (!of_device_is_compatible(np, "arm,scmi-shmem"))
++ if (!of_device_is_compatible(np, "arm,scmi-shmem")) {
++ of_node_put(np);
+ return -ENXIO;
++ }
+
+ ret = of_address_to_resource(np, 0, &res);
+ of_node_put(np);
+@@ -142,11 +146,10 @@ static int smc_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
+ * completion of a message is signaled by an interrupt rather than by
+ * the return of the SMC call.
+ */
+- irq = of_irq_get_byname(cdev->of_node, "a2p");
+- if (irq > 0) {
+- ret = devm_request_irq(dev, irq, smc_msg_done_isr,
+- IRQF_NO_SUSPEND,
+- dev_name(dev), scmi_info);
++ scmi_info->irq = of_irq_get_byname(cdev->of_node, "a2p");
++ if (scmi_info->irq > 0) {
++ ret = request_irq(scmi_info->irq, smc_msg_done_isr,
++ IRQF_NO_SUSPEND, dev_name(dev), scmi_info);
+ if (ret) {
+ dev_err(dev, "failed to setup SCMI smc irq\n");
+ return ret;
+@@ -168,6 +171,10 @@ static int smc_chan_free(int id, void *p, void *data)
+ struct scmi_chan_info *cinfo = p;
+ struct scmi_smc *scmi_info = cinfo->transport_info;
+
++ /* Ignore any possible further reception on the IRQ path */
++ if (scmi_info->irq > 0)
++ free_irq(scmi_info->irq, scmi_info);
++
+ cinfo->transport_info = NULL;
+ scmi_info->cinfo = NULL;
+
+diff --git a/drivers/firmware/smccc/soc_id.c b/drivers/firmware/smccc/soc_id.c
+index 890eb454599a3..1990263fbba0e 100644
+--- a/drivers/firmware/smccc/soc_id.c
++++ b/drivers/firmware/smccc/soc_id.c
+@@ -34,7 +34,6 @@ static struct soc_device_attribute *soc_dev_attr;
+
+ static int __init smccc_soc_init(void)
+ {
+- struct arm_smccc_res res;
+ int soc_id_rev, soc_id_version;
+ static char soc_id_str[20], soc_id_rev_str[12];
+ static char soc_id_jep106_id_str[12];
+@@ -49,13 +48,13 @@ static int __init smccc_soc_init(void)
+ }
+
+ if (soc_id_version < 0) {
+- pr_err("ARCH_SOC_ID(0) returned error: %lx\n", res.a0);
++ pr_err("Invalid SoC Version: %x\n", soc_id_version);
+ return -EINVAL;
+ }
+
+ soc_id_rev = arm_smccc_get_soc_id_revision();
+ if (soc_id_rev < 0) {
+- pr_err("ARCH_SOC_ID(1) returned error: %lx\n", res.a0);
++ pr_err("Invalid SoC Revision: %x\n", soc_id_rev);
+ return -EINVAL;
+ }
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+index 2cd081cbf7062..59ffb9389c697 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+@@ -1623,14 +1623,15 @@ static int amdgpu_ttm_training_reserve_vram_fini(struct amdgpu_device *adev)
+ return 0;
+ }
+
+-static void amdgpu_ttm_training_data_block_init(struct amdgpu_device *adev)
++static void amdgpu_ttm_training_data_block_init(struct amdgpu_device *adev,
++ uint32_t reserve_size)
+ {
+ struct psp_memory_training_context *ctx = &adev->psp.mem_train_ctx;
+
+ memset(ctx, 0, sizeof(*ctx));
+
+ ctx->c2p_train_data_offset =
+- ALIGN((adev->gmc.mc_vram_size - adev->mman.discovery_tmr_size - SZ_1M), SZ_1M);
++ ALIGN((adev->gmc.mc_vram_size - reserve_size - SZ_1M), SZ_1M);
+ ctx->p2c_train_data_offset =
+ (adev->gmc.mc_vram_size - GDDR6_MEM_TRAINING_OFFSET);
+ ctx->train_data_size =
+@@ -1648,9 +1649,10 @@ static void amdgpu_ttm_training_data_block_init(struct amdgpu_device *adev)
+ */
+ static int amdgpu_ttm_reserve_tmr(struct amdgpu_device *adev)
+ {
+- int ret;
+ struct psp_memory_training_context *ctx = &adev->psp.mem_train_ctx;
+ bool mem_train_support = false;
++ uint32_t reserve_size = 0;
++ int ret;
+
+ if (!amdgpu_sriov_vf(adev)) {
+ if (amdgpu_atomfirmware_mem_training_supported(adev))
+@@ -1666,14 +1668,15 @@ static int amdgpu_ttm_reserve_tmr(struct amdgpu_device *adev)
+ * Otherwise, fallback to legacy approach to check and reserve tmr block for ip
+ * discovery data and G6 memory training data respectively
+ */
+- adev->mman.discovery_tmr_size =
+- amdgpu_atomfirmware_get_fw_reserved_fb_size(adev);
+- if (!adev->mman.discovery_tmr_size)
+- adev->mman.discovery_tmr_size = DISCOVERY_TMR_OFFSET;
++ if (adev->bios)
++ reserve_size =
++ amdgpu_atomfirmware_get_fw_reserved_fb_size(adev);
++ if (!reserve_size)
++ reserve_size = DISCOVERY_TMR_OFFSET;
+
+ if (mem_train_support) {
+ /* reserve vram for mem train according to TMR location */
+- amdgpu_ttm_training_data_block_init(adev);
++ amdgpu_ttm_training_data_block_init(adev, reserve_size);
+ ret = amdgpu_bo_create_kernel_at(adev,
+ ctx->c2p_train_data_offset,
+ ctx->train_data_size,
+@@ -1687,14 +1690,13 @@ static int amdgpu_ttm_reserve_tmr(struct amdgpu_device *adev)
+ ctx->init = PSP_MEM_TRAIN_RESERVE_SUCCESS;
+ }
+
+- ret = amdgpu_bo_create_kernel_at(adev,
+- adev->gmc.real_vram_size - adev->mman.discovery_tmr_size,
+- adev->mman.discovery_tmr_size,
+- &adev->mman.discovery_memory,
+- NULL);
++ ret = amdgpu_bo_create_kernel_at(
++ adev, adev->gmc.real_vram_size - reserve_size,
++ reserve_size, &adev->mman.fw_reserved_memory, NULL);
+ if (ret) {
+ DRM_ERROR("alloc tmr failed(%d)!\n", ret);
+- amdgpu_bo_free_kernel(&adev->mman.discovery_memory, NULL, NULL);
++ amdgpu_bo_free_kernel(&adev->mman.fw_reserved_memory,
++ NULL, NULL);
+ return ret;
+ }
+
+@@ -1881,8 +1883,9 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev)
+ /* return the stolen vga memory back to VRAM */
+ amdgpu_bo_free_kernel(&adev->mman.stolen_vga_memory, NULL, NULL);
+ amdgpu_bo_free_kernel(&adev->mman.stolen_extended_memory, NULL, NULL);
+- /* return the IP Discovery TMR memory back to VRAM */
+- amdgpu_bo_free_kernel(&adev->mman.discovery_memory, NULL, NULL);
++ /* return the FW reserved memory back to VRAM */
++ amdgpu_bo_free_kernel(&adev->mman.fw_reserved_memory, NULL,
++ NULL);
+ if (adev->mman.stolen_reserved_size)
+ amdgpu_bo_free_kernel(&adev->mman.stolen_reserved_memory,
+ NULL, NULL);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+index e2cd5894afc9d..da6544fdc8ddd 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+@@ -78,7 +78,8 @@ struct amdgpu_mman {
+ /* discovery */
+ uint8_t *discovery_bin;
+ uint32_t discovery_tmr_size;
+- struct amdgpu_bo *discovery_memory;
++ /* fw reserved memory */
++ struct amdgpu_bo *fw_reserved_memory;
+
+ /* firmware VRAM reservation */
+ u64 fw_vram_usage_start_offset;
+diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+index e1c76e5bfa827..2702ad4c26c88 100644
+--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
++++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+@@ -165,58 +165,148 @@ static u32 preparser_disable(bool state)
+ return MI_ARB_CHECK | 1 << 8 | state;
+ }
+
+-u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg)
++static i915_reg_t gen12_get_aux_inv_reg(struct intel_engine_cs *engine)
+ {
+- u32 gsi_offset = gt->uncore->gsi_offset;
++ switch (engine->id) {
++ case RCS0:
++ return GEN12_CCS_AUX_INV;
++ case BCS0:
++ return GEN12_BCS0_AUX_INV;
++ case VCS0:
++ return GEN12_VD0_AUX_INV;
++ case VCS2:
++ return GEN12_VD2_AUX_INV;
++ case VECS0:
++ return GEN12_VE0_AUX_INV;
++ case CCS0:
++ return GEN12_CCS0_AUX_INV;
++ default:
++ return INVALID_MMIO_REG;
++ }
++}
++
++static bool gen12_needs_ccs_aux_inv(struct intel_engine_cs *engine)
++{
++ i915_reg_t reg = gen12_get_aux_inv_reg(engine);
++
++ if (IS_PONTEVECCHIO(engine->i915))
++ return false;
++
++ /*
++ * So far platforms supported by i915 having flat ccs do not require
++ * AUX invalidation. Check also whether the engine requires it.
++ */
++ return i915_mmio_reg_valid(reg) && !HAS_FLAT_CCS(engine->i915);
++}
++
++u32 *gen12_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
++{
++ i915_reg_t inv_reg = gen12_get_aux_inv_reg(engine);
++ u32 gsi_offset = engine->gt->uncore->gsi_offset;
++
++ if (!gen12_needs_ccs_aux_inv(engine))
++ return cs;
+
+ *cs++ = MI_LOAD_REGISTER_IMM(1) | MI_LRI_MMIO_REMAP_EN;
+ *cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
+ *cs++ = AUX_INV;
+- *cs++ = MI_NOOP;
++
++ *cs++ = MI_SEMAPHORE_WAIT_TOKEN |
++ MI_SEMAPHORE_REGISTER_POLL |
++ MI_SEMAPHORE_POLL |
++ MI_SEMAPHORE_SAD_EQ_SDD;
++ *cs++ = 0;
++ *cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
++ *cs++ = 0;
++ *cs++ = 0;
+
+ return cs;
+ }
+
++static int mtl_dummy_pipe_control(struct i915_request *rq)
++{
++ /* Wa_14016712196 */
++ if (IS_MTL_GRAPHICS_STEP(rq->engine->i915, M, STEP_A0, STEP_B0) ||
++ IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0)) {
++ u32 *cs;
++
++ /* dummy PIPE_CONTROL + depth flush */
++ cs = intel_ring_begin(rq, 6);
++ if (IS_ERR(cs))
++ return PTR_ERR(cs);
++ cs = gen12_emit_pipe_control(cs,
++ 0,
++ PIPE_CONTROL_DEPTH_CACHE_FLUSH,
++ LRC_PPHWSP_SCRATCH_ADDR);
++ intel_ring_advance(rq, cs);
++ }
++
++ return 0;
++}
++
+ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
+ {
+ struct intel_engine_cs *engine = rq->engine;
+
+- if (mode & EMIT_FLUSH) {
+- u32 flags = 0;
++ /*
++ * On Aux CCS platforms the invalidation of the Aux
++ * table requires quiescing memory traffic beforehand
++ */
++ if (mode & EMIT_FLUSH || gen12_needs_ccs_aux_inv(engine)) {
++ u32 bit_group_0 = 0;
++ u32 bit_group_1 = 0;
++ int err;
+ u32 *cs;
+
+- flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
+- flags |= PIPE_CONTROL_FLUSH_L3;
+- flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+- flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
++ err = mtl_dummy_pipe_control(rq);
++ if (err)
++ return err;
++
++ bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
++
++ /*
++ * When required, in MTL and beyond platforms we
++ * need to set the CCS_FLUSH bit in the pipe control
++ */
++ if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
++ bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
++
++ bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
++ bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
++ bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
++ bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+ /* Wa_1409600907:tgl,adl-p */
+- flags |= PIPE_CONTROL_DEPTH_STALL;
+- flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+- flags |= PIPE_CONTROL_FLUSH_ENABLE;
++ bit_group_1 |= PIPE_CONTROL_DEPTH_STALL;
++ bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
++ bit_group_1 |= PIPE_CONTROL_FLUSH_ENABLE;
+
+- flags |= PIPE_CONTROL_STORE_DATA_INDEX;
+- flags |= PIPE_CONTROL_QW_WRITE;
++ bit_group_1 |= PIPE_CONTROL_STORE_DATA_INDEX;
++ bit_group_1 |= PIPE_CONTROL_QW_WRITE;
+
+- flags |= PIPE_CONTROL_CS_STALL;
++ bit_group_1 |= PIPE_CONTROL_CS_STALL;
+
+ if (!HAS_3D_PIPELINE(engine->i915))
+- flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
++ bit_group_1 &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
+ else if (engine->class == COMPUTE_CLASS)
+- flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
++ bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
+
+ cs = intel_ring_begin(rq, 6);
+ if (IS_ERR(cs))
+ return PTR_ERR(cs);
+
+- cs = gen12_emit_pipe_control(cs,
+- PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
+- flags, LRC_PPHWSP_SCRATCH_ADDR);
++ cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
++ LRC_PPHWSP_SCRATCH_ADDR);
+ intel_ring_advance(rq, cs);
+ }
+
+ if (mode & EMIT_INVALIDATE) {
+ u32 flags = 0;
+ u32 *cs, count;
++ int err;
++
++ err = mtl_dummy_pipe_control(rq);
++ if (err)
++ return err;
+
+ flags |= PIPE_CONTROL_COMMAND_CACHE_INVALIDATE;
+ flags |= PIPE_CONTROL_TLB_INVALIDATE;
+@@ -236,10 +326,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
+ else if (engine->class == COMPUTE_CLASS)
+ flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
+
+- if (!HAS_FLAT_CCS(rq->engine->i915))
+- count = 8 + 4;
+- else
+- count = 8;
++ count = 8;
++ if (gen12_needs_ccs_aux_inv(rq->engine))
++ count += 8;
+
+ cs = intel_ring_begin(rq, count);
+ if (IS_ERR(cs))
+@@ -254,11 +343,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
+
+ cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
+
+- if (!HAS_FLAT_CCS(rq->engine->i915)) {
+- /* hsdes: 1809175790 */
+- cs = gen12_emit_aux_table_inv(rq->engine->gt,
+- cs, GEN12_GFX_CCS_AUX_NV);
+- }
++ cs = gen12_emit_aux_table_inv(engine, cs);
+
+ *cs++ = preparser_disable(false);
+ intel_ring_advance(rq, cs);
+@@ -269,21 +354,14 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
+
+ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
+ {
+- intel_engine_mask_t aux_inv = 0;
+- u32 cmd, *cs;
++ u32 cmd = 4;
++ u32 *cs;
+
+- cmd = 4;
+ if (mode & EMIT_INVALIDATE) {
+ cmd += 2;
+
+- if (!HAS_FLAT_CCS(rq->engine->i915) &&
+- (rq->engine->class == VIDEO_DECODE_CLASS ||
+- rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
+- aux_inv = rq->engine->mask &
+- ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
+- if (aux_inv)
+- cmd += 4;
+- }
++ if (gen12_needs_ccs_aux_inv(rq->engine))
++ cmd += 8;
+ }
+
+ cs = intel_ring_begin(rq, cmd);
+@@ -307,6 +385,10 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
+ cmd |= MI_INVALIDATE_TLB;
+ if (rq->engine->class == VIDEO_DECODE_CLASS)
+ cmd |= MI_INVALIDATE_BSD;
++
++ if (gen12_needs_ccs_aux_inv(rq->engine) &&
++ rq->engine->class == COPY_ENGINE_CLASS)
++ cmd |= MI_FLUSH_DW_CCS;
+ }
+
+ *cs++ = cmd;
+@@ -314,14 +396,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
+ *cs++ = 0; /* upper addr */
+ *cs++ = 0; /* value */
+
+- if (aux_inv) { /* hsdes: 1809175790 */
+- if (rq->engine->class == VIDEO_DECODE_CLASS)
+- cs = gen12_emit_aux_table_inv(rq->engine->gt,
+- cs, GEN12_VD0_AUX_NV);
+- else
+- cs = gen12_emit_aux_table_inv(rq->engine->gt,
+- cs, GEN12_VE0_AUX_NV);
+- }
++ cs = gen12_emit_aux_table_inv(rq->engine, cs);
+
+ if (mode & EMIT_INVALIDATE)
+ *cs++ = preparser_disable(false);
+@@ -733,6 +808,13 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
+ PIPE_CONTROL_DC_FLUSH_ENABLE |
+ PIPE_CONTROL_FLUSH_ENABLE);
+
++ /* Wa_14016712196 */
++ if (IS_MTL_GRAPHICS_STEP(i915, M, STEP_A0, STEP_B0) ||
++ IS_MTL_GRAPHICS_STEP(i915, P, STEP_A0, STEP_B0))
++ /* dummy PIPE_CONTROL + depth flush */
++ cs = gen12_emit_pipe_control(cs, 0,
++ PIPE_CONTROL_DEPTH_CACHE_FLUSH, 0);
++
+ if (GRAPHICS_VER(i915) == 12 && GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
+ /* Wa_1409600907 */
+ flags |= PIPE_CONTROL_DEPTH_STALL;
+diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+index 655e5c00ddc27..867ba697aceb8 100644
+--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
++++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+@@ -13,6 +13,7 @@
+ #include "intel_gt_regs.h"
+ #include "intel_gpu_commands.h"
+
++struct intel_engine_cs;
+ struct intel_gt;
+ struct i915_request;
+
+@@ -46,28 +47,32 @@ u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
+ u32 *gen11_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
+ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs);
+
+-u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg);
++u32 *gen12_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs);
+
+ static inline u32 *
+-__gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
++__gen8_emit_pipe_control(u32 *batch, u32 bit_group_0,
++ u32 bit_group_1, u32 offset)
+ {
+ memset(batch, 0, 6 * sizeof(u32));
+
+- batch[0] = GFX_OP_PIPE_CONTROL(6) | flags0;
+- batch[1] = flags1;
++ batch[0] = GFX_OP_PIPE_CONTROL(6) | bit_group_0;
++ batch[1] = bit_group_1;
+ batch[2] = offset;
+
+ return batch + 6;
+ }
+
+-static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
++static inline u32 *gen8_emit_pipe_control(u32 *batch,
++ u32 bit_group_1, u32 offset)
+ {
+- return __gen8_emit_pipe_control(batch, 0, flags, offset);
++ return __gen8_emit_pipe_control(batch, 0, bit_group_1, offset);
+ }
+
+-static inline u32 *gen12_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
++static inline u32 *gen12_emit_pipe_control(u32 *batch, u32 bit_group_0,
++ u32 bit_group_1, u32 offset)
+ {
+- return __gen8_emit_pipe_control(batch, flags0, flags1, offset);
++ return __gen8_emit_pipe_control(batch, bit_group_0,
++ bit_group_1, offset);
+ }
+
+ static inline u32 *
+diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+index 5d143e2a8db03..2bd8d98d21102 100644
+--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
++++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+@@ -121,6 +121,7 @@
+ #define MI_SEMAPHORE_TARGET(engine) ((engine)<<15)
+ #define MI_SEMAPHORE_WAIT MI_INSTR(0x1c, 2) /* GEN8+ */
+ #define MI_SEMAPHORE_WAIT_TOKEN MI_INSTR(0x1c, 3) /* GEN12+ */
++#define MI_SEMAPHORE_REGISTER_POLL (1 << 16)
+ #define MI_SEMAPHORE_POLL (1 << 15)
+ #define MI_SEMAPHORE_SAD_GT_SDD (0 << 12)
+ #define MI_SEMAPHORE_SAD_GTE_SDD (1 << 12)
+@@ -299,6 +300,7 @@
+ #define PIPE_CONTROL_QW_WRITE (1<<14)
+ #define PIPE_CONTROL_POST_SYNC_OP_MASK (3<<14)
+ #define PIPE_CONTROL_DEPTH_STALL (1<<13)
++#define PIPE_CONTROL_CCS_FLUSH (1<<13) /* MTL+ */
+ #define PIPE_CONTROL_WRITE_FLUSH (1<<12)
+ #define PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1<<12) /* gen6+ */
+ #define PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE (1<<11) /* MBZ on ILK */
+diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+index fd1f9cd35e9d7..b8b7992e72537 100644
+--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
++++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+@@ -331,9 +331,11 @@
+ #define GEN8_PRIVATE_PAT_HI _MMIO(0x40e0 + 4)
+ #define GEN10_PAT_INDEX(index) _MMIO(0x40e0 + (index) * 4)
+ #define BSD_HWS_PGA_GEN7 _MMIO(0x4180)
+-#define GEN12_GFX_CCS_AUX_NV _MMIO(0x4208)
+-#define GEN12_VD0_AUX_NV _MMIO(0x4218)
+-#define GEN12_VD1_AUX_NV _MMIO(0x4228)
++
++#define GEN12_CCS_AUX_INV _MMIO(0x4208)
++#define GEN12_VD0_AUX_INV _MMIO(0x4218)
++#define GEN12_VE0_AUX_INV _MMIO(0x4238)
++#define GEN12_BCS0_AUX_INV _MMIO(0x4248)
+
+ #define GEN8_RTCR _MMIO(0x4260)
+ #define GEN8_M1TCR _MMIO(0x4264)
+@@ -341,14 +343,12 @@
+ #define GEN8_BTCR _MMIO(0x426c)
+ #define GEN8_VTCR _MMIO(0x4270)
+
+-#define GEN12_VD2_AUX_NV _MMIO(0x4298)
+-#define GEN12_VD3_AUX_NV _MMIO(0x42a8)
+-#define GEN12_VE0_AUX_NV _MMIO(0x4238)
+-
+ #define BLT_HWS_PGA_GEN7 _MMIO(0x4280)
+
+-#define GEN12_VE1_AUX_NV _MMIO(0x42b8)
++#define GEN12_VD2_AUX_INV _MMIO(0x4298)
++#define GEN12_CCS0_AUX_INV _MMIO(0x42c8)
+ #define AUX_INV REG_BIT(0)
++
+ #define VEBOX_HWS_PGA_GEN7 _MMIO(0x4380)
+
+ #define GEN12_AUX_ERR_DBG _MMIO(0x43f4)
+diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
+index 81a96c52a92b3..502a1c0093aab 100644
+--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
++++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
+@@ -1364,10 +1364,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs)
+ IS_DG2_G11(ce->engine->i915))
+ cs = gen8_emit_pipe_control(cs, PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE, 0);
+
+- /* hsdes: 1809175790 */
+- if (!HAS_FLAT_CCS(ce->engine->i915))
+- cs = gen12_emit_aux_table_inv(ce->engine->gt,
+- cs, GEN12_GFX_CCS_AUX_NV);
++ cs = gen12_emit_aux_table_inv(ce->engine, cs);
+
+ /* Wa_16014892111 */
+ if (IS_DG2(ce->engine->i915))
+@@ -1390,17 +1387,7 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context *ce, u32 *cs)
+ PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE,
+ 0);
+
+- /* hsdes: 1809175790 */
+- if (!HAS_FLAT_CCS(ce->engine->i915)) {
+- if (ce->engine->class == VIDEO_DECODE_CLASS)
+- cs = gen12_emit_aux_table_inv(ce->engine->gt,
+- cs, GEN12_VD0_AUX_NV);
+- else if (ce->engine->class == VIDEO_ENHANCEMENT_CLASS)
+- cs = gen12_emit_aux_table_inv(ce->engine->gt,
+- cs, GEN12_VE0_AUX_NV);
+- }
+-
+- return cs;
++ return gen12_emit_aux_table_inv(ce->engine, cs);
+ }
+
+ static void
+diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
+index 8ef93889061a6..5ec293011d990 100644
+--- a/drivers/gpu/drm/i915/i915_active.c
++++ b/drivers/gpu/drm/i915/i915_active.c
+@@ -449,8 +449,11 @@ int i915_active_add_request(struct i915_active *ref, struct i915_request *rq)
+ }
+ } while (unlikely(is_barrier(active)));
+
+- if (!__i915_active_fence_set(active, fence))
++ fence = __i915_active_fence_set(active, fence);
++ if (!fence)
+ __i915_active_acquire(ref);
++ else
++ dma_fence_put(fence);
+
+ out:
+ i915_active_release(ref);
+@@ -469,13 +472,9 @@ __i915_active_set_fence(struct i915_active *ref,
+ return NULL;
+ }
+
+- rcu_read_lock();
+ prev = __i915_active_fence_set(active, fence);
+- if (prev)
+- prev = dma_fence_get_rcu(prev);
+- else
++ if (!prev)
+ __i915_active_acquire(ref);
+- rcu_read_unlock();
+
+ return prev;
+ }
+@@ -1019,10 +1018,11 @@ void i915_request_add_active_barriers(struct i915_request *rq)
+ *
+ * Records the new @fence as the last active fence along its timeline in
+ * this active tracker, moving the tracking callbacks from the previous
+- * fence onto this one. Returns the previous fence (if not already completed),
+- * which the caller must ensure is executed before the new fence. To ensure
+- * that the order of fences within the timeline of the i915_active_fence is
+- * understood, it should be locked by the caller.
++ * fence onto this one. Gets and returns a reference to the previous fence
++ * (if not already completed), which the caller must put after making sure
++ * that it is executed before the new fence. To ensure that the order of
++ * fences within the timeline of the i915_active_fence is understood, it
++ * should be locked by the caller.
+ */
+ struct dma_fence *
+ __i915_active_fence_set(struct i915_active_fence *active,
+@@ -1031,7 +1031,23 @@ __i915_active_fence_set(struct i915_active_fence *active,
+ struct dma_fence *prev;
+ unsigned long flags;
+
+- if (fence == rcu_access_pointer(active->fence))
++ /*
++ * In case of fences embedded in i915_requests, their memory is
++ * SLAB_FAILSAFE_BY_RCU, then it can be reused right after release
++ * by new requests. Then, there is a risk of passing back a pointer
++ * to a new, completely unrelated fence that reuses the same memory
++ * while tracked under a different active tracker. Combined with i915
++ * perf open/close operations that build await dependencies between
++ * engine kernel context requests and user requests from different
++ * timelines, this can lead to dependency loops and infinite waits.
++ *
++ * As a countermeasure, we try to get a reference to the active->fence
++ * first, so if we succeed and pass it back to our user then it is not
++ * released and potentially reused by an unrelated request before the
++ * user has a chance to set up an await dependency on it.
++ */
++ prev = i915_active_fence_get(active);
++ if (fence == prev)
+ return fence;
+
+ GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags));
+@@ -1040,27 +1056,56 @@ __i915_active_fence_set(struct i915_active_fence *active,
+ * Consider that we have two threads arriving (A and B), with
+ * C already resident as the active->fence.
+ *
+- * A does the xchg first, and so it sees C or NULL depending
+- * on the timing of the interrupt handler. If it is NULL, the
+- * previous fence must have been signaled and we know that
+- * we are first on the timeline. If it is still present,
+- * we acquire the lock on that fence and serialise with the interrupt
+- * handler, in the process removing it from any future interrupt
+- * callback. A will then wait on C before executing (if present).
+- *
+- * As B is second, it sees A as the previous fence and so waits for
+- * it to complete its transition and takes over the occupancy for
+- * itself -- remembering that it needs to wait on A before executing.
++ * Both A and B have got a reference to C or NULL, depending on the
++ * timing of the interrupt handler. Let's assume that if A has got C
++ * then it has locked C first (before B).
+ *
+ * Note the strong ordering of the timeline also provides consistent
+ * nesting rules for the fence->lock; the inner lock is always the
+ * older lock.
+ */
+ spin_lock_irqsave(fence->lock, flags);
+- prev = xchg(__active_fence_slot(active), fence);
+- if (prev) {
+- GEM_BUG_ON(prev == fence);
++ if (prev)
+ spin_lock_nested(prev->lock, SINGLE_DEPTH_NESTING);
++
++ /*
++ * A does the cmpxchg first, and so it sees C or NULL, as before, or
++ * something else, depending on the timing of other threads and/or
++ * interrupt handler. If not the same as before then A unlocks C if
++ * applicable and retries, starting from an attempt to get a new
++ * active->fence. Meanwhile, B follows the same path as A.
++ * Once A succeeds with cmpxch, B fails again, retires, gets A from
++ * active->fence, locks it as soon as A completes, and possibly
++ * succeeds with cmpxchg.
++ */
++ while (cmpxchg(__active_fence_slot(active), prev, fence) != prev) {
++ if (prev) {
++ spin_unlock(prev->lock);
++ dma_fence_put(prev);
++ }
++ spin_unlock_irqrestore(fence->lock, flags);
++
++ prev = i915_active_fence_get(active);
++ GEM_BUG_ON(prev == fence);
++
++ spin_lock_irqsave(fence->lock, flags);
++ if (prev)
++ spin_lock_nested(prev->lock, SINGLE_DEPTH_NESTING);
++ }
++
++ /*
++ * If prev is NULL then the previous fence must have been signaled
++ * and we know that we are first on the timeline. If it is still
++ * present then, having the lock on that fence already acquired, we
++ * serialise with the interrupt handler, in the process of removing it
++ * from any future interrupt callback. A will then wait on C before
++ * executing (if present).
++ *
++ * As B is second, it sees A as the previous fence and so waits for
++ * it to complete its transition and takes over the occupancy for
++ * itself -- remembering that it needs to wait on A before executing.
++ */
++ if (prev) {
+ __list_del_entry(&active->cb.node);
+ spin_unlock(prev->lock); /* serialise with prev->cb_list */
+ }
+@@ -1077,11 +1122,7 @@ int i915_active_fence_set(struct i915_active_fence *active,
+ int err = 0;
+
+ /* Must maintain timeline ordering wrt previous active requests */
+- rcu_read_lock();
+ fence = __i915_active_fence_set(active, &rq->fence);
+- if (fence) /* but the previous fence may not belong to that timeline! */
+- fence = dma_fence_get_rcu(fence);
+- rcu_read_unlock();
+ if (fence) {
+ err = i915_request_await_dma_fence(rq, fence);
+ dma_fence_put(fence);
+diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
+index 630a732aaecca..b620d3c3fe724 100644
+--- a/drivers/gpu/drm/i915/i915_request.c
++++ b/drivers/gpu/drm/i915/i915_request.c
+@@ -1661,6 +1661,11 @@ __i915_request_ensure_parallel_ordering(struct i915_request *rq,
+
+ request_to_parent(rq)->parallel.last_rq = i915_request_get(rq);
+
++ /*
++	 * Users have to put the reference potentially taken by
++	 * __i915_active_fence_set() on the returned request
++	 * when it is no longer needed.
++ */
+ return to_request(__i915_active_fence_set(&timeline->last_request,
+ &rq->fence));
+ }
+@@ -1707,6 +1712,10 @@ __i915_request_ensure_ordering(struct i915_request *rq,
+ 0);
+ }
+
++ /*
++	 * Users have to put the reference to prev, potentially taken
++	 * by __i915_active_fence_set(), when it is no longer needed.
++ */
+ return prev;
+ }
+
+@@ -1760,6 +1769,8 @@ __i915_request_add_to_timeline(struct i915_request *rq)
+ prev = __i915_request_ensure_ordering(rq, timeline);
+ else
+ prev = __i915_request_ensure_parallel_ordering(rq, timeline);
++ if (prev)
++ i915_request_put(prev);
+
+ /*
+ * Make sure that no request gazumped us - if it was allocated after
+diff --git a/drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c b/drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
+index 5f26090b0c985..89585b31b985e 100644
+--- a/drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
++++ b/drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
+@@ -310,7 +310,7 @@ static void ipu_crtc_mode_set_nofb(struct drm_crtc *crtc)
+ dev_warn(ipu_crtc->dev, "8-pixel align hactive %d -> %d\n",
+ sig_cfg.mode.hactive, new_hactive);
+
+- sig_cfg.mode.hfront_porch = new_hactive - sig_cfg.mode.hactive;
++ sig_cfg.mode.hfront_porch -= new_hactive - sig_cfg.mode.hactive;
+ sig_cfg.mode.hactive = new_hactive;
+ }
+
+diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
+index 7139a522b2f3b..54e3083076b78 100644
+--- a/drivers/gpu/drm/ttm/ttm_bo.c
++++ b/drivers/gpu/drm/ttm/ttm_bo.c
+@@ -519,7 +519,8 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
+
+ if (bo->pin_count) {
+ *locked = false;
+- *busy = false;
++ if (busy)
++ *busy = false;
+ return false;
+ }
+
+diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+index 3fd83fb757227..bbad54aa6c8ca 100644
+--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
++++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+@@ -894,6 +894,12 @@ static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
+ {
+ int index;
+
++ if (cmds->num == CMDQ_BATCH_ENTRIES - 1 &&
++ (smmu->options & ARM_SMMU_OPT_CMDQ_FORCE_SYNC)) {
++ arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, true);
++ cmds->num = 0;
++ }
++
+ if (cmds->num == CMDQ_BATCH_ENTRIES) {
+ arm_smmu_cmdq_issue_cmdlist(smmu, cmds->cmds, cmds->num, false);
+ cmds->num = 0;
+@@ -3429,6 +3435,44 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu, bool bypass)
+ return 0;
+ }
+
++#define IIDR_IMPLEMENTER_ARM 0x43b
++#define IIDR_PRODUCTID_ARM_MMU_600 0x483
++#define IIDR_PRODUCTID_ARM_MMU_700 0x487
++
++static void arm_smmu_device_iidr_probe(struct arm_smmu_device *smmu)
++{
++ u32 reg;
++ unsigned int implementer, productid, variant, revision;
++
++ reg = readl_relaxed(smmu->base + ARM_SMMU_IIDR);
++ implementer = FIELD_GET(IIDR_IMPLEMENTER, reg);
++ productid = FIELD_GET(IIDR_PRODUCTID, reg);
++ variant = FIELD_GET(IIDR_VARIANT, reg);
++ revision = FIELD_GET(IIDR_REVISION, reg);
++
++ switch (implementer) {
++ case IIDR_IMPLEMENTER_ARM:
++ switch (productid) {
++ case IIDR_PRODUCTID_ARM_MMU_600:
++ /* Arm erratum 1076982 */
++ if (variant == 0 && revision <= 2)
++ smmu->features &= ~ARM_SMMU_FEAT_SEV;
++ /* Arm erratum 1209401 */
++ if (variant < 2)
++ smmu->features &= ~ARM_SMMU_FEAT_NESTING;
++ break;
++ case IIDR_PRODUCTID_ARM_MMU_700:
++ /* Arm erratum 2812531 */
++ smmu->features &= ~ARM_SMMU_FEAT_BTM;
++ smmu->options |= ARM_SMMU_OPT_CMDQ_FORCE_SYNC;
++ /* Arm errata 2268618, 2812531 */
++ smmu->features &= ~ARM_SMMU_FEAT_NESTING;
++ break;
++ }
++ break;
++ }
++}
++
+ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
+ {
+ u32 reg;
+@@ -3635,6 +3679,12 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
+
+ smmu->ias = max(smmu->ias, smmu->oas);
+
++ if ((smmu->features & ARM_SMMU_FEAT_TRANS_S1) &&
++ (smmu->features & ARM_SMMU_FEAT_TRANS_S2))
++ smmu->features |= ARM_SMMU_FEAT_NESTING;
++
++ arm_smmu_device_iidr_probe(smmu);
++
+ if (arm_smmu_sva_supported(smmu))
+ smmu->features |= ARM_SMMU_FEAT_SVA;
+
+diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+index b574c58a34876..dcab85698a4e2 100644
+--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
++++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+@@ -69,6 +69,12 @@
+ #define IDR5_VAX GENMASK(11, 10)
+ #define IDR5_VAX_52_BIT 1
+
++#define ARM_SMMU_IIDR 0x18
++#define IIDR_PRODUCTID GENMASK(31, 20)
++#define IIDR_VARIANT GENMASK(19, 16)
++#define IIDR_REVISION GENMASK(15, 12)
++#define IIDR_IMPLEMENTER GENMASK(11, 0)
++
+ #define ARM_SMMU_CR0 0x20
+ #define CR0_ATSCHK (1 << 4)
+ #define CR0_CMDQEN (1 << 3)
+@@ -639,11 +645,13 @@ struct arm_smmu_device {
+ #define ARM_SMMU_FEAT_BTM (1 << 16)
+ #define ARM_SMMU_FEAT_SVA (1 << 17)
+ #define ARM_SMMU_FEAT_E2H (1 << 18)
++#define ARM_SMMU_FEAT_NESTING (1 << 19)
+ u32 features;
+
+ #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
+ #define ARM_SMMU_OPT_PAGE0_REGS_ONLY (1 << 1)
+ #define ARM_SMMU_OPT_MSIPOLL (1 << 2)
++#define ARM_SMMU_OPT_CMDQ_FORCE_SYNC (1 << 3)
+ u32 options;
+
+ struct arm_smmu_cmdq cmdq;
+diff --git a/drivers/isdn/hardware/mISDN/hfcpci.c b/drivers/isdn/hardware/mISDN/hfcpci.c
+index c0331b2680108..fe391de1aba32 100644
+--- a/drivers/isdn/hardware/mISDN/hfcpci.c
++++ b/drivers/isdn/hardware/mISDN/hfcpci.c
+@@ -839,7 +839,7 @@ hfcpci_fill_fifo(struct bchannel *bch)
+ *z1t = cpu_to_le16(new_z1); /* now send data */
+ if (bch->tx_idx < bch->tx_skb->len)
+ return;
+- dev_kfree_skb(bch->tx_skb);
++ dev_kfree_skb_any(bch->tx_skb);
+ if (get_next_bframe(bch))
+ goto next_t_frame;
+ return;
+@@ -895,7 +895,7 @@ hfcpci_fill_fifo(struct bchannel *bch)
+ }
+ bz->za[new_f1].z1 = cpu_to_le16(new_z1); /* for next buffer */
+ bz->f1 = new_f1; /* next frame */
+- dev_kfree_skb(bch->tx_skb);
++ dev_kfree_skb_any(bch->tx_skb);
+ get_next_bframe(bch);
+ }
+
+@@ -1119,7 +1119,7 @@ tx_birq(struct bchannel *bch)
+ if (bch->tx_skb && bch->tx_idx < bch->tx_skb->len)
+ hfcpci_fill_fifo(bch);
+ else {
+- dev_kfree_skb(bch->tx_skb);
++ dev_kfree_skb_any(bch->tx_skb);
+ if (get_next_bframe(bch))
+ hfcpci_fill_fifo(bch);
+ }
+@@ -2277,7 +2277,7 @@ _hfcpci_softirq(struct device *dev, void *unused)
+ return 0;
+
+ if (hc->hw.int_m2 & HFCPCI_IRQ_ENABLE) {
+- spin_lock(&hc->lock);
++ spin_lock_irq(&hc->lock);
+ bch = Sel_BCS(hc, hc->hw.bswapped ? 2 : 1);
+ if (bch && bch->state == ISDN_P_B_RAW) { /* B1 rx&tx */
+ main_rec_hfcpci(bch);
+@@ -2288,7 +2288,7 @@ _hfcpci_softirq(struct device *dev, void *unused)
+ main_rec_hfcpci(bch);
+ tx_birq(bch);
+ }
+- spin_unlock(&hc->lock);
++ spin_unlock_irq(&hc->lock);
+ }
+ return 0;
+ }
+diff --git a/drivers/mtd/nand/raw/fsl_upm.c b/drivers/mtd/nand/raw/fsl_upm.c
+index 086426139173f..7366e85c09fd9 100644
+--- a/drivers/mtd/nand/raw/fsl_upm.c
++++ b/drivers/mtd/nand/raw/fsl_upm.c
+@@ -135,7 +135,7 @@ static int fun_exec_op(struct nand_chip *chip, const struct nand_operation *op,
+ unsigned int i;
+ int ret;
+
+- if (op->cs > NAND_MAX_CHIPS)
++ if (op->cs >= NAND_MAX_CHIPS)
+ return -EINVAL;
+
+ if (check_only)
+diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
+index 4efb96e4e1c7a..b1325cf7abba6 100644
+--- a/drivers/mtd/nand/raw/meson_nand.c
++++ b/drivers/mtd/nand/raw/meson_nand.c
+@@ -1184,7 +1184,6 @@ static int meson_nand_attach_chip(struct nand_chip *nand)
+ struct meson_nfc *nfc = nand_get_controller_data(nand);
+ struct meson_nfc_nand_chip *meson_chip = to_meson_nand(nand);
+ struct mtd_info *mtd = nand_to_mtd(nand);
+- int nsectors = mtd->writesize / 1024;
+ int ret;
+
+ if (!mtd->name) {
+@@ -1202,7 +1201,7 @@ static int meson_nand_attach_chip(struct nand_chip *nand)
+ nand->options |= NAND_NO_SUBPAGE_WRITE;
+
+ ret = nand_ecc_choose_conf(nand, nfc->data->ecc_caps,
+- mtd->oobsize - 2 * nsectors);
++ mtd->oobsize - 2);
+ if (ret) {
+ dev_err(nfc->dev, "failed to ECC init\n");
+ return -EINVAL;
+diff --git a/drivers/mtd/nand/raw/omap_elm.c b/drivers/mtd/nand/raw/omap_elm.c
+index 6e1eac6644a66..4a97d4a76454a 100644
+--- a/drivers/mtd/nand/raw/omap_elm.c
++++ b/drivers/mtd/nand/raw/omap_elm.c
+@@ -177,17 +177,17 @@ static void elm_load_syndrome(struct elm_info *info,
+ switch (info->bch_type) {
+ case BCH8_ECC:
+ /* syndrome fragment 0 = ecc[9-12B] */
+- val = cpu_to_be32(*(u32 *) &ecc[9]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[9]);
+ elm_write_reg(info, offset, val);
+
+ /* syndrome fragment 1 = ecc[5-8B] */
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[5]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[5]);
+ elm_write_reg(info, offset, val);
+
+ /* syndrome fragment 2 = ecc[1-4B] */
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[1]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[1]);
+ elm_write_reg(info, offset, val);
+
+ /* syndrome fragment 3 = ecc[0B] */
+@@ -197,35 +197,35 @@ static void elm_load_syndrome(struct elm_info *info,
+ break;
+ case BCH4_ECC:
+ /* syndrome fragment 0 = ecc[20-52b] bits */
+- val = (cpu_to_be32(*(u32 *) &ecc[3]) >> 4) |
++ val = ((__force u32)cpu_to_be32(*(u32 *)&ecc[3]) >> 4) |
+ ((ecc[2] & 0xf) << 28);
+ elm_write_reg(info, offset, val);
+
+ /* syndrome fragment 1 = ecc[0-20b] bits */
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[0]) >> 12;
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[0]) >> 12;
+ elm_write_reg(info, offset, val);
+ break;
+ case BCH16_ECC:
+- val = cpu_to_be32(*(u32 *) &ecc[22]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[22]);
+ elm_write_reg(info, offset, val);
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[18]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[18]);
+ elm_write_reg(info, offset, val);
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[14]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[14]);
+ elm_write_reg(info, offset, val);
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[10]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[10]);
+ elm_write_reg(info, offset, val);
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[6]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[6]);
+ elm_write_reg(info, offset, val);
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[2]);
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[2]);
+ elm_write_reg(info, offset, val);
+ offset += 4;
+- val = cpu_to_be32(*(u32 *) &ecc[0]) >> 16;
++ val = (__force u32)cpu_to_be32(*(u32 *)&ecc[0]) >> 16;
+ elm_write_reg(info, offset, val);
+ break;
+ default:
+diff --git a/drivers/mtd/nand/raw/rockchip-nand-controller.c b/drivers/mtd/nand/raw/rockchip-nand-controller.c
+index 2312e27362cbe..5a04680342c32 100644
+--- a/drivers/mtd/nand/raw/rockchip-nand-controller.c
++++ b/drivers/mtd/nand/raw/rockchip-nand-controller.c
+@@ -562,9 +562,10 @@ static int rk_nfc_write_page_raw(struct nand_chip *chip, const u8 *buf,
+ * BBM OOB1 OOB2 OOB3 |......| PA0 PA1 PA2 PA3
+ *
+ * The rk_nfc_ooblayout_free() function already has reserved
+- * these 4 bytes with:
++ * these 4 bytes together with 2 bytes for BBM
++	 * by reducing its length:
+ *
+- * oob_region->offset = NFC_SYS_DATA_SIZE + 2;
++ * oob_region->length = rknand->metadata_size - NFC_SYS_DATA_SIZE - 2;
+ */
+ if (!i)
+ memcpy(rk_nfc_oob_ptr(chip, i),
+@@ -597,7 +598,7 @@ static int rk_nfc_write_page_hwecc(struct nand_chip *chip, const u8 *buf,
+ int pages_per_blk = mtd->erasesize / mtd->writesize;
+ int ret = 0, i, boot_rom_mode = 0;
+ dma_addr_t dma_data, dma_oob;
+- u32 reg;
++ u32 tmp;
+ u8 *oob;
+
+ nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+@@ -624,6 +625,13 @@ static int rk_nfc_write_page_hwecc(struct nand_chip *chip, const u8 *buf,
+ *
+ * 0xFF 0xFF 0xFF 0xFF | BBM OOB1 OOB2 OOB3 | ...
+ *
++ * The code here just swaps the first 4 bytes with the last
++ * 4 bytes without losing any data.
++ *
++ * The chip->oob_poi data layout:
++ *
++ * BBM OOB1 OOB2 OOB3 |......| PA0 PA1 PA2 PA3
++ *
+ * Configure the ECC algorithm supported by the boot ROM.
+ */
+ if ((page < (pages_per_blk * rknand->boot_blks)) &&
+@@ -634,21 +642,17 @@ static int rk_nfc_write_page_hwecc(struct nand_chip *chip, const u8 *buf,
+ }
+
+ for (i = 0; i < ecc->steps; i++) {
+- if (!i) {
+- reg = 0xFFFFFFFF;
+- } else {
++ if (!i)
++ oob = chip->oob_poi + (ecc->steps - 1) * NFC_SYS_DATA_SIZE;
++ else
+ oob = chip->oob_poi + (i - 1) * NFC_SYS_DATA_SIZE;
+- reg = oob[0] | oob[1] << 8 | oob[2] << 16 |
+- oob[3] << 24;
+- }
+
+- if (!i && boot_rom_mode)
+- reg = (page & (pages_per_blk - 1)) * 4;
++ tmp = oob[0] | oob[1] << 8 | oob[2] << 16 | oob[3] << 24;
+
+ if (nfc->cfg->type == NFC_V9)
+- nfc->oob_buf[i] = reg;
++ nfc->oob_buf[i] = tmp;
+ else
+- nfc->oob_buf[i * (oob_step / 4)] = reg;
++ nfc->oob_buf[i * (oob_step / 4)] = tmp;
+ }
+
+ dma_data = dma_map_single(nfc->dev, (void *)nfc->page_buf,
+@@ -811,12 +815,17 @@ static int rk_nfc_read_page_hwecc(struct nand_chip *chip, u8 *buf, int oob_on,
+ goto timeout_err;
+ }
+
+- for (i = 1; i < ecc->steps; i++) {
+- oob = chip->oob_poi + (i - 1) * NFC_SYS_DATA_SIZE;
++ for (i = 0; i < ecc->steps; i++) {
++ if (!i)
++ oob = chip->oob_poi + (ecc->steps - 1) * NFC_SYS_DATA_SIZE;
++ else
++ oob = chip->oob_poi + (i - 1) * NFC_SYS_DATA_SIZE;
++
+ if (nfc->cfg->type == NFC_V9)
+ tmp = nfc->oob_buf[i];
+ else
+ tmp = nfc->oob_buf[i * (oob_step / 4)];
++
+ *oob++ = (u8)tmp;
+ *oob++ = (u8)(tmp >> 8);
+ *oob++ = (u8)(tmp >> 16);
+@@ -933,12 +942,8 @@ static int rk_nfc_ooblayout_free(struct mtd_info *mtd, int section,
+ if (section)
+ return -ERANGE;
+
+- /*
+- * The beginning of the OOB area stores the reserved data for the NFC,
+- * the size of the reserved data is NFC_SYS_DATA_SIZE bytes.
+- */
+ oob_region->length = rknand->metadata_size - NFC_SYS_DATA_SIZE - 2;
+- oob_region->offset = NFC_SYS_DATA_SIZE + 2;
++ oob_region->offset = 2;
+
+ return 0;
+ }
+diff --git a/drivers/mtd/nand/spi/toshiba.c b/drivers/mtd/nand/spi/toshiba.c
+index 7380b1ebaccd5..a80427c131216 100644
+--- a/drivers/mtd/nand/spi/toshiba.c
++++ b/drivers/mtd/nand/spi/toshiba.c
+@@ -73,7 +73,7 @@ static int tx58cxgxsxraix_ecc_get_status(struct spinand_device *spinand,
+ {
+ struct nand_device *nand = spinand_to_nand(spinand);
+ u8 mbf = 0;
+- struct spi_mem_op op = SPINAND_GET_FEATURE_OP(0x30, &mbf);
++ struct spi_mem_op op = SPINAND_GET_FEATURE_OP(0x30, spinand->scratchbuf);
+
+ switch (status & STATUS_ECC_MASK) {
+ case STATUS_ECC_NO_BITFLIPS:
+@@ -92,7 +92,7 @@ static int tx58cxgxsxraix_ecc_get_status(struct spinand_device *spinand,
+ if (spi_mem_exec_op(spinand->spimem, &op))
+ return nanddev_get_ecc_conf(nand)->strength;
+
+- mbf >>= 4;
++ mbf = *(spinand->scratchbuf) >> 4;
+
+ if (WARN_ON(mbf > nanddev_get_ecc_conf(nand)->strength || !mbf))
+ return nanddev_get_ecc_conf(nand)->strength;
+diff --git a/drivers/mtd/nand/spi/winbond.c b/drivers/mtd/nand/spi/winbond.c
+index 3ad58cd284d8b..f507e37593012 100644
+--- a/drivers/mtd/nand/spi/winbond.c
++++ b/drivers/mtd/nand/spi/winbond.c
+@@ -108,7 +108,7 @@ static int w25n02kv_ecc_get_status(struct spinand_device *spinand,
+ {
+ struct nand_device *nand = spinand_to_nand(spinand);
+ u8 mbf = 0;
+- struct spi_mem_op op = SPINAND_GET_FEATURE_OP(0x30, &mbf);
++ struct spi_mem_op op = SPINAND_GET_FEATURE_OP(0x30, spinand->scratchbuf);
+
+ switch (status & STATUS_ECC_MASK) {
+ case STATUS_ECC_NO_BITFLIPS:
+@@ -126,7 +126,7 @@ static int w25n02kv_ecc_get_status(struct spinand_device *spinand,
+ if (spi_mem_exec_op(spinand->spimem, &op))
+ return nanddev_get_ecc_conf(nand)->strength;
+
+- mbf >>= 4;
++ mbf = *(spinand->scratchbuf) >> 4;
+
+ if (WARN_ON(mbf > nanddev_get_ecc_conf(nand)->strength || !mbf))
+ return nanddev_get_ecc_conf(nand)->strength;
+diff --git a/drivers/mtd/spi-nor/spansion.c b/drivers/mtd/spi-nor/spansion.c
+index 36876aa849ede..15f9a80c10b9b 100644
+--- a/drivers/mtd/spi-nor/spansion.c
++++ b/drivers/mtd/spi-nor/spansion.c
+@@ -361,7 +361,7 @@ static int cypress_nor_determine_addr_mode_by_sr1(struct spi_nor *nor,
+ */
+ static int cypress_nor_set_addr_mode_nbytes(struct spi_nor *nor)
+ {
+- struct spi_mem_op op = {};
++ struct spi_mem_op op;
+ u8 addr_mode;
+ int ret;
+
+@@ -492,7 +492,7 @@ s25fs256t_post_bfpt_fixup(struct spi_nor *nor,
+ const struct sfdp_parameter_header *bfpt_header,
+ const struct sfdp_bfpt *bfpt)
+ {
+- struct spi_mem_op op = {};
++ struct spi_mem_op op;
+ int ret;
+
+ ret = cypress_nor_set_addr_mode_nbytes(nor);
+diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
+index cde253d27bd08..72374b066f64a 100644
+--- a/drivers/net/dsa/bcm_sf2.c
++++ b/drivers/net/dsa/bcm_sf2.c
+@@ -1436,7 +1436,9 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev)
+ if (IS_ERR(priv->clk))
+ return PTR_ERR(priv->clk);
+
+- clk_prepare_enable(priv->clk);
++ ret = clk_prepare_enable(priv->clk);
++ if (ret)
++ return ret;
+
+ priv->clk_mdiv = devm_clk_get_optional(&pdev->dev, "sw_switch_mdiv");
+ if (IS_ERR(priv->clk_mdiv)) {
+@@ -1444,7 +1446,9 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev)
+ goto out_clk;
+ }
+
+- clk_prepare_enable(priv->clk_mdiv);
++ ret = clk_prepare_enable(priv->clk_mdiv);
++ if (ret)
++ goto out_clk;
+
+ ret = bcm_sf2_sw_rst(priv);
+ if (ret) {
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+index b499bc9c4e067..e481960cb6c7a 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+@@ -633,12 +633,13 @@ tx_kick_pending:
+ return NETDEV_TX_OK;
+ }
+
+-static void bnxt_tx_int(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
++static void bnxt_tx_int(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
+ {
+ struct bnxt_tx_ring_info *txr = bnapi->tx_ring;
+ struct netdev_queue *txq = netdev_get_tx_queue(bp->dev, txr->txq_index);
+ u16 cons = txr->tx_cons;
+ struct pci_dev *pdev = bp->pdev;
++ int nr_pkts = bnapi->tx_pkts;
+ int i;
+ unsigned int tx_bytes = 0;
+
+@@ -688,6 +689,7 @@ next_tx_int:
+ dev_kfree_skb_any(skb);
+ }
+
++ bnapi->tx_pkts = 0;
+ WRITE_ONCE(txr->tx_cons, cons);
+
+ __netif_txq_completed_wake(txq, nr_pkts, tx_bytes,
+@@ -697,17 +699,24 @@ next_tx_int:
+
+ static struct page *__bnxt_alloc_rx_page(struct bnxt *bp, dma_addr_t *mapping,
+ struct bnxt_rx_ring_info *rxr,
++ unsigned int *offset,
+ gfp_t gfp)
+ {
+ struct device *dev = &bp->pdev->dev;
+ struct page *page;
+
+- page = page_pool_dev_alloc_pages(rxr->page_pool);
++ if (PAGE_SIZE > BNXT_RX_PAGE_SIZE) {
++ page = page_pool_dev_alloc_frag(rxr->page_pool, offset,
++ BNXT_RX_PAGE_SIZE);
++ } else {
++ page = page_pool_dev_alloc_pages(rxr->page_pool);
++ *offset = 0;
++ }
+ if (!page)
+ return NULL;
+
+- *mapping = dma_map_page_attrs(dev, page, 0, PAGE_SIZE, bp->rx_dir,
+- DMA_ATTR_WEAK_ORDERING);
++ *mapping = dma_map_page_attrs(dev, page, *offset, BNXT_RX_PAGE_SIZE,
++ bp->rx_dir, DMA_ATTR_WEAK_ORDERING);
+ if (dma_mapping_error(dev, *mapping)) {
+ page_pool_recycle_direct(rxr->page_pool, page);
+ return NULL;
+@@ -747,15 +756,16 @@ int bnxt_alloc_rx_data(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
+ dma_addr_t mapping;
+
+ if (BNXT_RX_PAGE_MODE(bp)) {
++ unsigned int offset;
+ struct page *page =
+- __bnxt_alloc_rx_page(bp, &mapping, rxr, gfp);
++ __bnxt_alloc_rx_page(bp, &mapping, rxr, &offset, gfp);
+
+ if (!page)
+ return -ENOMEM;
+
+ mapping += bp->rx_dma_offset;
+ rx_buf->data = page;
+- rx_buf->data_ptr = page_address(page) + bp->rx_offset;
++ rx_buf->data_ptr = page_address(page) + offset + bp->rx_offset;
+ } else {
+ u8 *data = __bnxt_alloc_rx_frag(bp, &mapping, gfp);
+
+@@ -815,7 +825,7 @@ static inline int bnxt_alloc_rx_page(struct bnxt *bp,
+ unsigned int offset = 0;
+
+ if (BNXT_RX_PAGE_MODE(bp)) {
+- page = __bnxt_alloc_rx_page(bp, &mapping, rxr, gfp);
++ page = __bnxt_alloc_rx_page(bp, &mapping, rxr, &offset, gfp);
+
+ if (!page)
+ return -ENOMEM;
+@@ -962,15 +972,15 @@ static struct sk_buff *bnxt_rx_multi_page_skb(struct bnxt *bp,
+ return NULL;
+ }
+ dma_addr -= bp->rx_dma_offset;
+- dma_unmap_page_attrs(&bp->pdev->dev, dma_addr, PAGE_SIZE, bp->rx_dir,
+- DMA_ATTR_WEAK_ORDERING);
+- skb = build_skb(page_address(page), PAGE_SIZE);
++ dma_unmap_page_attrs(&bp->pdev->dev, dma_addr, BNXT_RX_PAGE_SIZE,
++ bp->rx_dir, DMA_ATTR_WEAK_ORDERING);
++ skb = build_skb(data_ptr - bp->rx_offset, BNXT_RX_PAGE_SIZE);
+ if (!skb) {
+ page_pool_recycle_direct(rxr->page_pool, page);
+ return NULL;
+ }
+ skb_mark_for_recycle(skb);
+- skb_reserve(skb, bp->rx_dma_offset);
++ skb_reserve(skb, bp->rx_offset);
+ __skb_put(skb, len);
+
+ return skb;
+@@ -996,8 +1006,8 @@ static struct sk_buff *bnxt_rx_page_skb(struct bnxt *bp,
+ return NULL;
+ }
+ dma_addr -= bp->rx_dma_offset;
+- dma_unmap_page_attrs(&bp->pdev->dev, dma_addr, PAGE_SIZE, bp->rx_dir,
+- DMA_ATTR_WEAK_ORDERING);
++ dma_unmap_page_attrs(&bp->pdev->dev, dma_addr, BNXT_RX_PAGE_SIZE,
++ bp->rx_dir, DMA_ATTR_WEAK_ORDERING);
+
+ if (unlikely(!payload))
+ payload = eth_get_headlen(bp->dev, data_ptr, len);
+@@ -1010,7 +1020,7 @@ static struct sk_buff *bnxt_rx_page_skb(struct bnxt *bp,
+
+ skb_mark_for_recycle(skb);
+ off = (void *)data_ptr - page_address(page);
+- skb_add_rx_frag(skb, 0, page, off, len, PAGE_SIZE);
++ skb_add_rx_frag(skb, 0, page, off, len, BNXT_RX_PAGE_SIZE);
+ memcpy(skb->data - NET_IP_ALIGN, data_ptr - NET_IP_ALIGN,
+ payload + NET_IP_ALIGN);
+
+@@ -1145,7 +1155,7 @@ static struct sk_buff *bnxt_rx_agg_pages_skb(struct bnxt *bp,
+
+ skb->data_len += total_frag_len;
+ skb->len += total_frag_len;
+- skb->truesize += PAGE_SIZE * agg_bufs;
++ skb->truesize += BNXT_RX_PAGE_SIZE * agg_bufs;
+ return skb;
+ }
+
+@@ -2573,12 +2583,11 @@ static int __bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
+ return rx_pkts;
+ }
+
+-static void __bnxt_poll_work_done(struct bnxt *bp, struct bnxt_napi *bnapi)
++static void __bnxt_poll_work_done(struct bnxt *bp, struct bnxt_napi *bnapi,
++ int budget)
+ {
+- if (bnapi->tx_pkts) {
+- bnapi->tx_int(bp, bnapi, bnapi->tx_pkts);
+- bnapi->tx_pkts = 0;
+- }
++ if (bnapi->tx_pkts)
++ bnapi->tx_int(bp, bnapi, budget);
+
+ if ((bnapi->events & BNXT_RX_EVENT) && !(bnapi->in_reset)) {
+ struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
+@@ -2607,7 +2616,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
+ */
+ bnxt_db_cq(bp, &cpr->cp_db, cpr->cp_raw_cons);
+
+- __bnxt_poll_work_done(bp, bnapi);
++ __bnxt_poll_work_done(bp, bnapi, budget);
+ return rx_pkts;
+ }
+
+@@ -2738,7 +2747,7 @@ static int __bnxt_poll_cqs(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
+ }
+
+ static void __bnxt_poll_cqs_done(struct bnxt *bp, struct bnxt_napi *bnapi,
+- u64 dbr_type)
++ u64 dbr_type, int budget)
+ {
+ struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+ int i;
+@@ -2754,7 +2763,7 @@ static void __bnxt_poll_cqs_done(struct bnxt *bp, struct bnxt_napi *bnapi,
+ cpr2->had_work_done = 0;
+ }
+ }
+- __bnxt_poll_work_done(bp, bnapi);
++ __bnxt_poll_work_done(bp, bnapi, budget);
+ }
+
+ static int bnxt_poll_p5(struct napi_struct *napi, int budget)
+@@ -2784,7 +2793,8 @@ static int bnxt_poll_p5(struct napi_struct *napi, int budget)
+ if (cpr->has_more_work)
+ break;
+
+- __bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ_ARMALL);
++ __bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ_ARMALL,
++ budget);
+ cpr->cp_raw_cons = raw_cons;
+ if (napi_complete_done(napi, work_done))
+ BNXT_DB_NQ_ARM_P5(&cpr->cp_db,
+@@ -2814,7 +2824,7 @@ static int bnxt_poll_p5(struct napi_struct *napi, int budget)
+ }
+ raw_cons = NEXT_RAW_CMP(raw_cons);
+ }
+- __bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ);
++ __bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ, budget);
+ if (raw_cons != cpr->cp_raw_cons) {
+ cpr->cp_raw_cons = raw_cons;
+ BNXT_DB_NQ_P5(&cpr->cp_db, raw_cons);
+@@ -2947,8 +2957,8 @@ skip_rx_tpa_free:
+ rx_buf->data = NULL;
+ if (BNXT_RX_PAGE_MODE(bp)) {
+ mapping -= bp->rx_dma_offset;
+- dma_unmap_page_attrs(&pdev->dev, mapping, PAGE_SIZE,
+- bp->rx_dir,
++ dma_unmap_page_attrs(&pdev->dev, mapping,
++ BNXT_RX_PAGE_SIZE, bp->rx_dir,
+ DMA_ATTR_WEAK_ORDERING);
+ page_pool_recycle_direct(rxr->page_pool, data);
+ } else {
+@@ -3217,6 +3227,8 @@ static int bnxt_alloc_rx_page_pool(struct bnxt *bp,
+ pp.napi = &rxr->bnapi->napi;
+ pp.dev = &bp->pdev->dev;
+ pp.dma_dir = DMA_BIDIRECTIONAL;
++ if (PAGE_SIZE > BNXT_RX_PAGE_SIZE)
++ pp.flags |= PP_FLAG_PAGE_FRAG;
+
+ rxr->page_pool = page_pool_create(&pp);
+ if (IS_ERR(rxr->page_pool)) {
+@@ -3993,26 +4005,29 @@ void bnxt_set_ring_params(struct bnxt *bp)
+ */
+ int bnxt_set_rx_skb_mode(struct bnxt *bp, bool page_mode)
+ {
++ struct net_device *dev = bp->dev;
++
+ if (page_mode) {
+ bp->flags &= ~BNXT_FLAG_AGG_RINGS;
+ bp->flags |= BNXT_FLAG_RX_PAGE_MODE;
+
+- if (bp->dev->mtu > BNXT_MAX_PAGE_MODE_MTU) {
++ if (bp->xdp_prog->aux->xdp_has_frags)
++ dev->max_mtu = min_t(u16, bp->max_mtu, BNXT_MAX_MTU);
++ else
++ dev->max_mtu =
++ min_t(u16, bp->max_mtu, BNXT_MAX_PAGE_MODE_MTU);
++ if (dev->mtu > BNXT_MAX_PAGE_MODE_MTU) {
+ bp->flags |= BNXT_FLAG_JUMBO;
+ bp->rx_skb_func = bnxt_rx_multi_page_skb;
+- bp->dev->max_mtu =
+- min_t(u16, bp->max_mtu, BNXT_MAX_MTU);
+ } else {
+ bp->flags |= BNXT_FLAG_NO_AGG_RINGS;
+ bp->rx_skb_func = bnxt_rx_page_skb;
+- bp->dev->max_mtu =
+- min_t(u16, bp->max_mtu, BNXT_MAX_PAGE_MODE_MTU);
+ }
+ bp->rx_dir = DMA_BIDIRECTIONAL;
+ /* Disable LRO or GRO_HW */
+- netdev_update_features(bp->dev);
++ netdev_update_features(dev);
+ } else {
+- bp->dev->max_mtu = bp->max_mtu;
++ dev->max_mtu = bp->max_mtu;
+ bp->flags &= ~BNXT_FLAG_RX_PAGE_MODE;
+ bp->rx_dir = DMA_FROM_DEVICE;
+ bp->rx_skb_func = bnxt_rx_skb;
+@@ -9433,6 +9448,8 @@ static void bnxt_enable_napi(struct bnxt *bp)
+ cpr->sw_stats.rx.rx_resets++;
+ bnapi->in_reset = false;
+
++ bnapi->tx_pkts = 0;
++
+ if (bnapi->rx_ring) {
+ INIT_WORK(&cpr->dim.work, bnxt_dim_work);
+ cpr->dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_EQE;
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+index 080e73496066b..bb95c3dc5270f 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+@@ -1005,7 +1005,7 @@ struct bnxt_napi {
+ struct bnxt_tx_ring_info *tx_ring;
+
+ void (*tx_int)(struct bnxt *, struct bnxt_napi *,
+- int);
++ int budget);
+ int tx_pkts;
+ u8 events;
+
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+index 4efa5fe6972b2..fb43232310b2d 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+@@ -125,16 +125,20 @@ static void __bnxt_xmit_xdp_redirect(struct bnxt *bp,
+ dma_unmap_len_set(tx_buf, len, 0);
+ }
+
+-void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
++void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
+ {
+ struct bnxt_tx_ring_info *txr = bnapi->tx_ring;
+ struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
+ bool rx_doorbell_needed = false;
++ int nr_pkts = bnapi->tx_pkts;
+ struct bnxt_sw_tx_bd *tx_buf;
+ u16 tx_cons = txr->tx_cons;
+ u16 last_tx_cons = tx_cons;
+ int i, j, frags;
+
++ if (!budget)
++ return;
++
+ for (i = 0; i < nr_pkts; i++) {
+ tx_buf = &txr->tx_buf_ring[tx_cons];
+
+@@ -161,6 +165,8 @@ void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
+ }
+ tx_cons = NEXT_TX(tx_cons);
+ }
++
++ bnapi->tx_pkts = 0;
+ WRITE_ONCE(txr->tx_cons, tx_cons);
+ if (rx_doorbell_needed) {
+ tx_buf = &txr->tx_buf_ring[last_tx_cons];
+@@ -180,8 +186,8 @@ void bnxt_xdp_buff_init(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
+ u16 cons, u8 *data_ptr, unsigned int len,
+ struct xdp_buff *xdp)
+ {
++ u32 buflen = BNXT_RX_PAGE_SIZE;
+ struct bnxt_sw_rx_bd *rx_buf;
+- u32 buflen = PAGE_SIZE;
+ struct pci_dev *pdev;
+ dma_addr_t mapping;
+ u32 offset;
+@@ -297,7 +303,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
+ rx_buf = &rxr->rx_buf_ring[cons];
+ mapping = rx_buf->mapping - bp->rx_dma_offset;
+ dma_unmap_page_attrs(&pdev->dev, mapping,
+- PAGE_SIZE, bp->rx_dir,
++ BNXT_RX_PAGE_SIZE, bp->rx_dir,
+ DMA_ATTR_WEAK_ORDERING);
+
+ /* if we are unable to allocate a new buffer, abort and reuse */
+@@ -480,7 +486,7 @@ bnxt_xdp_build_skb(struct bnxt *bp, struct sk_buff *skb, u8 num_frags,
+ }
+ xdp_update_skb_shared_info(skb, num_frags,
+ sinfo->xdp_frags_size,
+- PAGE_SIZE * sinfo->nr_frags,
++ BNXT_RX_PAGE_SIZE * sinfo->nr_frags,
+ xdp_buff_is_frag_pfmemalloc(xdp));
+ return skb;
+ }
+diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
+index ea430d6961df3..5e412c5655ba5 100644
+--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
++++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h
+@@ -16,7 +16,7 @@ struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp,
+ struct bnxt_tx_ring_info *txr,
+ dma_addr_t mapping, u32 len,
+ struct xdp_buff *xdp);
+-void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts);
++void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int budget);
+ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
+ struct xdp_buff xdp, struct page *page, u8 **data_ptr,
+ unsigned int *len, u8 *event);
+diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
+index a52cf9aae4988..5ef073a79ce94 100644
+--- a/drivers/net/ethernet/broadcom/tg3.c
++++ b/drivers/net/ethernet/broadcom/tg3.c
+@@ -57,6 +57,7 @@
+ #include <linux/crc32poly.h>
+
+ #include <net/checksum.h>
++#include <net/gso.h>
+ #include <net/ip.h>
+
+ #include <linux/io.h>
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index fbe70458fda27..34e8e7cb1bc54 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -9055,6 +9055,7 @@ ice_setup_tc(struct net_device *netdev, enum tc_setup_type type,
+ {
+ struct ice_netdev_priv *np = netdev_priv(netdev);
+ struct ice_pf *pf = np->vsi->back;
++ bool locked = false;
+ int err;
+
+ switch (type) {
+@@ -9064,10 +9065,27 @@ ice_setup_tc(struct net_device *netdev, enum tc_setup_type type,
+ ice_setup_tc_block_cb,
+ np, np, true);
+ case TC_SETUP_QDISC_MQPRIO:
++ if (pf->adev) {
++ mutex_lock(&pf->adev_mutex);
++ device_lock(&pf->adev->dev);
++ locked = true;
++ if (pf->adev->dev.driver) {
++ netdev_err(netdev, "Cannot change qdisc when RDMA is active\n");
++ err = -EBUSY;
++ goto adev_unlock;
++ }
++ }
++
+ /* setup traffic classifier for receive side */
+ mutex_lock(&pf->tc_mutex);
+ err = ice_setup_tc_mqprio_qdisc(netdev, type_data);
+ mutex_unlock(&pf->tc_mutex);
++
++adev_unlock:
++ if (locked) {
++ device_unlock(&pf->adev->dev);
++ mutex_unlock(&pf->adev_mutex);
++ }
+ return err;
+ default:
+ return -EOPNOTSUPP;
+diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
+index 2b9335cb4bb3a..8537578e1cf1d 100644
+--- a/drivers/net/ethernet/korina.c
++++ b/drivers/net/ethernet/korina.c
+@@ -1302,11 +1302,10 @@ static int korina_probe(struct platform_device *pdev)
+ else if (of_get_ethdev_address(pdev->dev.of_node, dev) < 0)
+ eth_hw_addr_random(dev);
+
+- clk = devm_clk_get_optional(&pdev->dev, "mdioclk");
++ clk = devm_clk_get_optional_enabled(&pdev->dev, "mdioclk");
+ if (IS_ERR(clk))
+ return PTR_ERR(clk);
+ if (clk) {
+- clk_prepare_enable(clk);
+ lp->mii_clock_freq = clk_get_rate(clk);
+ } else {
+ lp->mii_clock_freq = 200000000; /* max possible input clk */
+diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.c b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.c
+index 035ead7935c74..dab61cc1acb57 100644
+--- a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.c
++++ b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.c
+@@ -98,6 +98,9 @@ int octep_ctrl_mbox_init(struct octep_ctrl_mbox *mbox)
+ writeq(OCTEP_CTRL_MBOX_STATUS_INIT,
+ OCTEP_CTRL_MBOX_INFO_HOST_STATUS(mbox->barmem));
+
++ mutex_init(&mbox->h2fq_lock);
++ mutex_init(&mbox->f2hq_lock);
++
+ mbox->h2fq.sz = readl(OCTEP_CTRL_MBOX_H2FQ_SZ(mbox->barmem));
+ mbox->h2fq.hw_prod = OCTEP_CTRL_MBOX_H2FQ_PROD(mbox->barmem);
+ mbox->h2fq.hw_cons = OCTEP_CTRL_MBOX_H2FQ_CONS(mbox->barmem);
+diff --git a/drivers/net/ethernet/marvell/prestera/prestera_pci.c b/drivers/net/ethernet/marvell/prestera/prestera_pci.c
+index f328d957b2db7..35857dc19542f 100644
+--- a/drivers/net/ethernet/marvell/prestera/prestera_pci.c
++++ b/drivers/net/ethernet/marvell/prestera/prestera_pci.c
+@@ -727,7 +727,8 @@ pick_fw_ver:
+
+ err = request_firmware_direct(&fw->bin, fw_path, fw->dev.dev);
+ if (err) {
+- if (ver_maj == PRESTERA_SUPP_FW_MAJ_VER) {
++ if (ver_maj != PRESTERA_PREV_FW_MAJ_VER ||
++ ver_min != PRESTERA_PREV_FW_MIN_VER) {
+ ver_maj = PRESTERA_PREV_FW_MAJ_VER;
+ ver_min = PRESTERA_PREV_FW_MIN_VER;
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+index f0c3464f037f4..0c88cf47af01b 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+@@ -1030,9 +1030,6 @@ int mlx5e_tc_tun_encap_dests_set(struct mlx5e_priv *priv,
+ int out_index;
+ int err = 0;
+
+- if (!mlx5e_is_eswitch_flow(flow))
+- return 0;
+-
+ parse_attr = attr->parse_attr;
+ esw_attr = attr->esw_attr;
+ *vf_tun = false;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+index d97e6df66f454..b8dd744536553 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+@@ -323,8 +323,11 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
+ net_prefetch(mxbuf->xdp.data);
+
+ prog = rcu_dereference(rq->xdp_prog);
+- if (likely(prog && mlx5e_xdp_handle(rq, prog, mxbuf)))
++ if (likely(prog && mlx5e_xdp_handle(rq, prog, mxbuf))) {
++ if (likely(__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)))
++ wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
+ return NULL; /* page/packet was consumed by XDP */
++ }
+
+ /* XDP_PASS: copy the data from the UMEM to a new SKB. The frame reuse
+ * will be handled by mlx5e_free_rx_wqe.
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
+index dbe87bf89c0dd..832d36be4a17b 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
+@@ -808,9 +808,9 @@ static void setup_fte_upper_proto_match(struct mlx5_flow_spec *spec, struct upsp
+ }
+
+ if (upspec->sport) {
+- MLX5_SET(fte_match_set_lyr_2_4, spec->match_criteria, udp_dport,
++ MLX5_SET(fte_match_set_lyr_2_4, spec->match_criteria, udp_sport,
+ upspec->sport_mask);
+- MLX5_SET(fte_match_set_lyr_2_4, spec->match_value, udp_dport, upspec->sport);
++ MLX5_SET(fte_match_set_lyr_2_4, spec->match_value, udp_sport, upspec->sport);
+ }
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
+index eab5bc718771f..8d995e3048692 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c
+@@ -58,7 +58,9 @@ static int mlx5e_ipsec_remove_trailer(struct sk_buff *skb, struct xfrm_state *x)
+
+ trailer_len = alen + plen + 2;
+
+- pskb_trim(skb, skb->len - trailer_len);
++ ret = pskb_trim(skb, skb->len - trailer_len);
++ if (unlikely(ret))
++ return ret;
+ if (skb->protocol == htons(ETH_P_IP)) {
+ ipv4hdr->tot_len = htons(ntohs(ipv4hdr->tot_len) - trailer_len);
+ ip_send_check(ipv4hdr);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
+index cf704f106b7c2..984fa04bd331b 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.c
+@@ -188,7 +188,6 @@ static void mlx5e_tls_debugfs_init(struct mlx5e_tls *tls,
+
+ int mlx5e_ktls_init(struct mlx5e_priv *priv)
+ {
+- struct mlx5_crypto_dek_pool *dek_pool;
+ struct mlx5e_tls *tls;
+
+ if (!mlx5e_is_ktls_device(priv->mdev))
+@@ -199,12 +198,6 @@ int mlx5e_ktls_init(struct mlx5e_priv *priv)
+ return -ENOMEM;
+ tls->mdev = priv->mdev;
+
+- dek_pool = mlx5_crypto_dek_pool_create(priv->mdev, MLX5_ACCEL_OBJ_TLS_KEY);
+- if (IS_ERR(dek_pool)) {
+- kfree(tls);
+- return PTR_ERR(dek_pool);
+- }
+- tls->dek_pool = dek_pool;
+ priv->tls = tls;
+
+ mlx5e_tls_debugfs_init(tls, priv->dfs_root);
+@@ -222,7 +215,6 @@ void mlx5e_ktls_cleanup(struct mlx5e_priv *priv)
+ debugfs_remove_recursive(tls->debugfs.dfs);
+ tls->debugfs.dfs = NULL;
+
+- mlx5_crypto_dek_pool_destroy(tls->dek_pool);
+ kfree(priv->tls);
+ priv->tls = NULL;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+index 0e4c0a093293a..c49363dd6bf9a 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
+@@ -908,28 +908,51 @@ static void mlx5e_tls_tx_debugfs_init(struct mlx5e_tls *tls,
+
+ int mlx5e_ktls_init_tx(struct mlx5e_priv *priv)
+ {
++ struct mlx5_crypto_dek_pool *dek_pool;
+ struct mlx5e_tls *tls = priv->tls;
++ int err;
++
++ if (!mlx5e_is_ktls_device(priv->mdev))
++ return 0;
++
++ /* DEK pool could be used by either or both of TX and RX. But we have to
++ * put the creation here to avoid syndrome when doing devlink reload.
++ */
++ dek_pool = mlx5_crypto_dek_pool_create(priv->mdev, MLX5_ACCEL_OBJ_TLS_KEY);
++ if (IS_ERR(dek_pool))
++ return PTR_ERR(dek_pool);
++ tls->dek_pool = dek_pool;
+
+ if (!mlx5e_is_ktls_tx(priv->mdev))
+ return 0;
+
+ priv->tls->tx_pool = mlx5e_tls_tx_pool_init(priv->mdev, &priv->tls->sw_stats);
+- if (!priv->tls->tx_pool)
+- return -ENOMEM;
++ if (!priv->tls->tx_pool) {
++ err = -ENOMEM;
++ goto err_tx_pool_init;
++ }
+
+ mlx5e_tls_tx_debugfs_init(tls, tls->debugfs.dfs);
+
+ return 0;
++
++err_tx_pool_init:
++ mlx5_crypto_dek_pool_destroy(dek_pool);
++ return err;
+ }
+
+ void mlx5e_ktls_cleanup_tx(struct mlx5e_priv *priv)
+ {
+ if (!mlx5e_is_ktls_tx(priv->mdev))
+- return;
++ goto dek_pool_destroy;
+
+ debugfs_remove_recursive(priv->tls->debugfs.dfs_tx);
+ priv->tls->debugfs.dfs_tx = NULL;
+
+ mlx5e_tls_tx_pool_cleanup(priv->tls->tx_pool);
+ priv->tls->tx_pool = NULL;
++
++dek_pool_destroy:
++ if (mlx5e_is_ktls_device(priv->mdev))
++ mlx5_crypto_dek_pool_destroy(priv->tls->dek_pool);
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c
+index 7fc901a6ec5fc..414e285848813 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c
+@@ -161,6 +161,7 @@ static int macsec_fs_tx_create_crypto_table_groups(struct mlx5e_flow_table *ft)
+
+ if (!in) {
+ kfree(ft->g);
++ ft->g = NULL;
+ return -ENOMEM;
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c
+index bed0c2d043e70..329d8c90facdd 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c
+@@ -135,6 +135,16 @@ static void arfs_del_rules(struct mlx5e_flow_steering *fs);
+
+ int mlx5e_arfs_disable(struct mlx5e_flow_steering *fs)
+ {
++ /* Moving to switchdev mode, fs->arfs is freed by mlx5e_nic_profile
++ * cleanup_rx callback and it is not recreated when
++ * mlx5e_uplink_rep_profile is loaded as mlx5e_create_flow_steering()
++ * is not called by the uplink_rep profile init_rx callback. Thus, if
++ * ntuple is set, moving to switchdev flow will enter this function
++ * with fs->arfs nullified.
++ */
++ if (!mlx5e_fs_get_arfs(fs))
++ return 0;
++
+ arfs_del_rules(fs);
+
+ return arfs_disable(fs);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+index a5bdf78955d76..f084513fbead4 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+@@ -1036,7 +1036,23 @@ static int mlx5e_modify_rq_state(struct mlx5e_rq *rq, int curr_state, int next_s
+ return err;
+ }
+
+-static int mlx5e_rq_to_ready(struct mlx5e_rq *rq, int curr_state)
++static void mlx5e_flush_rq_cq(struct mlx5e_rq *rq)
++{
++ struct mlx5_cqwq *cqwq = &rq->cq.wq;
++ struct mlx5_cqe64 *cqe;
++
++ if (test_bit(MLX5E_RQ_STATE_MINI_CQE_ENHANCED, &rq->state)) {
++ while ((cqe = mlx5_cqwq_get_cqe_enahnced_comp(cqwq)))
++ mlx5_cqwq_pop(cqwq);
++ } else {
++ while ((cqe = mlx5_cqwq_get_cqe(cqwq)))
++ mlx5_cqwq_pop(cqwq);
++ }
++
++ mlx5_cqwq_update_db_record(cqwq);
++}
++
++int mlx5e_flush_rq(struct mlx5e_rq *rq, int curr_state)
+ {
+ struct net_device *dev = rq->netdev;
+ int err;
+@@ -1046,6 +1062,10 @@ static int mlx5e_rq_to_ready(struct mlx5e_rq *rq, int curr_state)
+ netdev_err(dev, "Failed to move rq 0x%x to reset\n", rq->rqn);
+ return err;
+ }
++
++ mlx5e_free_rx_descs(rq);
++ mlx5e_flush_rq_cq(rq);
++
+ err = mlx5e_modify_rq_state(rq, MLX5_RQC_STATE_RST, MLX5_RQC_STATE_RDY);
+ if (err) {
+ netdev_err(dev, "Failed to move rq 0x%x to ready\n", rq->rqn);
+@@ -1055,13 +1075,6 @@ static int mlx5e_rq_to_ready(struct mlx5e_rq *rq, int curr_state)
+ return 0;
+ }
+
+-int mlx5e_flush_rq(struct mlx5e_rq *rq, int curr_state)
+-{
+- mlx5e_free_rx_descs(rq);
+-
+- return mlx5e_rq_to_ready(rq, curr_state);
+-}
+-
+ static int mlx5e_modify_rq_vsd(struct mlx5e_rq *rq, bool vsd)
+ {
+ struct mlx5_core_dev *mdev = rq->mdev;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+index 3e7041bd5705e..ad63d1f9a611f 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+@@ -964,7 +964,7 @@ static int mlx5e_init_rep_rx(struct mlx5e_priv *priv)
+ err = mlx5e_open_drop_rq(priv, &priv->drop_rq);
+ if (err) {
+ mlx5_core_err(mdev, "open drop rq failed, %d\n", err);
+- return err;
++ goto err_rx_res_free;
+ }
+
+ err = mlx5e_rx_res_init(priv->rx_res, priv->mdev, 0,
+@@ -998,6 +998,7 @@ err_destroy_rx_res:
+ mlx5e_rx_res_destroy(priv->rx_res);
+ err_close_drop_rq:
+ mlx5e_close_drop_rq(&priv->drop_rq);
++err_rx_res_free:
+ mlx5e_rx_res_free(priv->rx_res);
+ priv->rx_res = NULL;
+ err_free_fs:
+@@ -1111,6 +1112,10 @@ static int mlx5e_init_rep_tx(struct mlx5e_priv *priv)
+ return err;
+ }
+
++ err = mlx5e_rep_neigh_init(rpriv);
++ if (err)
++ goto err_neigh_init;
++
+ if (rpriv->rep->vport == MLX5_VPORT_UPLINK) {
+ err = mlx5e_init_uplink_rep_tx(rpriv);
+ if (err)
+@@ -1127,6 +1132,8 @@ err_ht_init:
+ if (rpriv->rep->vport == MLX5_VPORT_UPLINK)
+ mlx5e_cleanup_uplink_rep_tx(rpriv);
+ err_init_tx:
++ mlx5e_rep_neigh_cleanup(rpriv);
++err_neigh_init:
+ mlx5e_destroy_tises(priv);
+ return err;
+ }
+@@ -1140,22 +1147,17 @@ static void mlx5e_cleanup_rep_tx(struct mlx5e_priv *priv)
+ if (rpriv->rep->vport == MLX5_VPORT_UPLINK)
+ mlx5e_cleanup_uplink_rep_tx(rpriv);
+
++ mlx5e_rep_neigh_cleanup(rpriv);
+ mlx5e_destroy_tises(priv);
+ }
+
+ static void mlx5e_rep_enable(struct mlx5e_priv *priv)
+ {
+- struct mlx5e_rep_priv *rpriv = priv->ppriv;
+-
+ mlx5e_set_netdev_mtu_boundaries(priv);
+- mlx5e_rep_neigh_init(rpriv);
+ }
+
+ static void mlx5e_rep_disable(struct mlx5e_priv *priv)
+ {
+- struct mlx5e_rep_priv *rpriv = priv->ppriv;
+-
+- mlx5e_rep_neigh_cleanup(rpriv);
+ }
+
+ static int mlx5e_update_rep_rx(struct mlx5e_priv *priv)
+@@ -1205,7 +1207,6 @@ static int uplink_rep_async_event(struct notifier_block *nb, unsigned long event
+
+ static void mlx5e_uplink_rep_enable(struct mlx5e_priv *priv)
+ {
+- struct mlx5e_rep_priv *rpriv = priv->ppriv;
+ struct net_device *netdev = priv->netdev;
+ struct mlx5_core_dev *mdev = priv->mdev;
+ u16 max_mtu;
+@@ -1227,7 +1228,6 @@ static void mlx5e_uplink_rep_enable(struct mlx5e_priv *priv)
+ mlx5_notifier_register(mdev, &priv->events_nb);
+ mlx5e_dcbnl_initialize(priv);
+ mlx5e_dcbnl_init_app(priv);
+- mlx5e_rep_neigh_init(rpriv);
+ mlx5e_rep_bridge_init(priv);
+
+ netdev->wanted_features |= NETIF_F_HW_TC;
+@@ -1242,7 +1242,6 @@ static void mlx5e_uplink_rep_enable(struct mlx5e_priv *priv)
+
+ static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv)
+ {
+- struct mlx5e_rep_priv *rpriv = priv->ppriv;
+ struct mlx5_core_dev *mdev = priv->mdev;
+
+ rtnl_lock();
+@@ -1252,7 +1251,6 @@ static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv)
+ rtnl_unlock();
+
+ mlx5e_rep_bridge_cleanup(priv);
+- mlx5e_rep_neigh_cleanup(rpriv);
+ mlx5e_dcbnl_delete_app(priv);
+ mlx5_notifier_unregister(mdev, &priv->events_nb);
+ mlx5e_rep_tc_disable(priv);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+index ed05ac8ae1de5..e002f013fa015 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+@@ -1725,6 +1725,19 @@ verify_attr_actions(u32 actions, struct netlink_ext_ack *extack)
+ return 0;
+ }
+
++static bool
++has_encap_dests(struct mlx5_flow_attr *attr)
++{
++ struct mlx5_esw_flow_attr *esw_attr = attr->esw_attr;
++ int out_index;
++
++ for (out_index = 0; out_index < MLX5_MAX_FLOW_FWD_VPORTS; out_index++)
++ if (esw_attr->dests[out_index].flags & MLX5_ESW_DEST_ENCAP)
++ return true;
++
++ return false;
++}
++
+ static int
+ post_process_attr(struct mlx5e_tc_flow *flow,
+ struct mlx5_flow_attr *attr,
+@@ -1737,9 +1750,11 @@ post_process_attr(struct mlx5e_tc_flow *flow,
+ if (err)
+ goto err_out;
+
+- err = mlx5e_tc_tun_encap_dests_set(flow->priv, flow, attr, extack, &vf_tun);
+- if (err)
+- goto err_out;
++ if (mlx5e_is_eswitch_flow(flow) && has_encap_dests(attr)) {
++ err = mlx5e_tc_tun_encap_dests_set(flow->priv, flow, attr, extack, &vf_tun);
++ if (err)
++ goto err_out;
++ }
+
+ if (attr->action & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) {
+ err = mlx5e_tc_attach_mod_hdr(flow->priv, flow, attr);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+index 8d19c20d3447e..c1f419b36289c 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+@@ -1376,7 +1376,6 @@ esw_chains_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *miss_fdb)
+
+ esw_init_chains_offload_flags(esw, &attr.flags);
+ attr.ns = MLX5_FLOW_NAMESPACE_FDB;
+- attr.fs_base_prio = FDB_TC_OFFLOAD;
+ attr.max_grp_num = esw->params.large_group_num;
+ attr.default_ft = miss_fdb;
+ attr.mapping = esw->offloads.reg_c0_obj_pool;
+@@ -4073,7 +4072,7 @@ int mlx5_devlink_port_fn_migratable_set(struct devlink_port *port, bool enable,
+ }
+
+ hca_caps = MLX5_ADDR_OF(query_hca_cap_out, query_ctx, capability);
+- MLX5_SET(cmd_hca_cap_2, hca_caps, migratable, 1);
++ MLX5_SET(cmd_hca_cap_2, hca_caps, migratable, enable);
+
+ err = mlx5_vport_set_other_func_cap(esw->dev, hca_caps, vport->vport,
+ MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE2);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+index 19da02c416161..5f87c446d3d97 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+@@ -889,7 +889,7 @@ static struct mlx5_flow_table *find_closest_ft_recursive(struct fs_node *root,
+ struct fs_node *iter = list_entry(start, struct fs_node, list);
+ struct mlx5_flow_table *ft = NULL;
+
+- if (!root || root->type == FS_TYPE_PRIO_CHAINS)
++ if (!root)
+ return NULL;
+
+ list_for_each_advance_continue(iter, &root->children, reverse) {
+@@ -905,20 +905,42 @@ static struct mlx5_flow_table *find_closest_ft_recursive(struct fs_node *root,
+ return ft;
+ }
+
+-/* If reverse is false then return the first flow table in next priority of
+- * prio in the tree, else return the last flow table in the previous priority
+- * of prio in the tree.
++static struct fs_node *find_prio_chains_parent(struct fs_node *parent,
++ struct fs_node **child)
++{
++ struct fs_node *node = NULL;
++
++ while (parent && parent->type != FS_TYPE_PRIO_CHAINS) {
++ node = parent;
++ parent = parent->parent;
++ }
++
++ if (child)
++ *child = node;
++
++ return parent;
++}
++
++/* If reverse is false then return the first flow table next to the passed node
++ * in the tree, else return the last flow table before the node in the tree.
++ * If skip is true, skip the flow tables in the same prio_chains prio.
+ */
+-static struct mlx5_flow_table *find_closest_ft(struct fs_prio *prio, bool reverse)
++static struct mlx5_flow_table *find_closest_ft(struct fs_node *node, bool reverse,
++ bool skip)
+ {
++ struct fs_node *prio_chains_parent = NULL;
+ struct mlx5_flow_table *ft = NULL;
+ struct fs_node *curr_node;
+ struct fs_node *parent;
+
+- parent = prio->node.parent;
+- curr_node = &prio->node;
++ if (skip)
++ prio_chains_parent = find_prio_chains_parent(node, NULL);
++ parent = node->parent;
++ curr_node = node;
+ while (!ft && parent) {
+- ft = find_closest_ft_recursive(parent, &curr_node->list, reverse);
++ if (parent != prio_chains_parent)
++ ft = find_closest_ft_recursive(parent, &curr_node->list,
++ reverse);
+ curr_node = parent;
+ parent = curr_node->parent;
+ }
+@@ -926,15 +948,15 @@ static struct mlx5_flow_table *find_closest_ft(struct fs_prio *prio, bool revers
+ }
+
+ /* Assuming all the tree is locked by mutex chain lock */
+-static struct mlx5_flow_table *find_next_chained_ft(struct fs_prio *prio)
++static struct mlx5_flow_table *find_next_chained_ft(struct fs_node *node)
+ {
+- return find_closest_ft(prio, false);
++ return find_closest_ft(node, false, true);
+ }
+
+ /* Assuming all the tree is locked by mutex chain lock */
+-static struct mlx5_flow_table *find_prev_chained_ft(struct fs_prio *prio)
++static struct mlx5_flow_table *find_prev_chained_ft(struct fs_node *node)
+ {
+- return find_closest_ft(prio, true);
++ return find_closest_ft(node, true, true);
+ }
+
+ static struct mlx5_flow_table *find_next_fwd_ft(struct mlx5_flow_table *ft,
+@@ -946,7 +968,7 @@ static struct mlx5_flow_table *find_next_fwd_ft(struct mlx5_flow_table *ft,
+ next_ns = flow_act->action & MLX5_FLOW_CONTEXT_ACTION_FWD_NEXT_NS;
+ fs_get_obj(prio, next_ns ? ft->ns->node.parent : ft->node.parent);
+
+- return find_next_chained_ft(prio);
++ return find_next_chained_ft(&prio->node);
+ }
+
+ static int connect_fts_in_prio(struct mlx5_core_dev *dev,
+@@ -970,21 +992,55 @@ static int connect_fts_in_prio(struct mlx5_core_dev *dev,
+ return 0;
+ }
+
++static struct mlx5_flow_table *find_closet_ft_prio_chains(struct fs_node *node,
++ struct fs_node *parent,
++ struct fs_node **child,
++ bool reverse)
++{
++ struct mlx5_flow_table *ft;
++
++ ft = find_closest_ft(node, reverse, false);
++
++ if (ft && parent == find_prio_chains_parent(&ft->node, child))
++ return ft;
++
++ return NULL;
++}
++
+ /* Connect flow tables from previous priority of prio to ft */
+ static int connect_prev_fts(struct mlx5_core_dev *dev,
+ struct mlx5_flow_table *ft,
+ struct fs_prio *prio)
+ {
++ struct fs_node *prio_parent, *parent = NULL, *child, *node;
+ struct mlx5_flow_table *prev_ft;
++ int err = 0;
++
++ prio_parent = find_prio_chains_parent(&prio->node, &child);
++
++ /* return directly if not under the first sub ns of prio_chains prio */
++ if (prio_parent && !list_is_first(&child->list, &prio_parent->children))
++ return 0;
+
+- prev_ft = find_prev_chained_ft(prio);
+- if (prev_ft) {
++ prev_ft = find_prev_chained_ft(&prio->node);
++ while (prev_ft) {
+ struct fs_prio *prev_prio;
+
+ fs_get_obj(prev_prio, prev_ft->node.parent);
+- return connect_fts_in_prio(dev, prev_prio, ft);
++ err = connect_fts_in_prio(dev, prev_prio, ft);
++ if (err)
++ break;
++
++ if (!parent) {
++ parent = find_prio_chains_parent(&prev_prio->node, &child);
++ if (!parent)
++ break;
++ }
++
++ node = child;
++ prev_ft = find_closet_ft_prio_chains(node, parent, &child, true);
+ }
+- return 0;
++ return err;
+ }
+
+ static int update_root_ft_create(struct mlx5_flow_table *ft, struct fs_prio
+@@ -1123,7 +1179,7 @@ static int connect_flow_table(struct mlx5_core_dev *dev, struct mlx5_flow_table
+ if (err)
+ return err;
+
+- next_ft = first_ft ? first_ft : find_next_chained_ft(prio);
++ next_ft = first_ft ? first_ft : find_next_chained_ft(&prio->node);
+ err = connect_fwd_rules(dev, ft, next_ft);
+ if (err)
+ return err;
+@@ -1198,7 +1254,7 @@ static struct mlx5_flow_table *__mlx5_create_flow_table(struct mlx5_flow_namespa
+
+ tree_init_node(&ft->node, del_hw_flow_table, del_sw_flow_table);
+ next_ft = unmanaged ? ft_attr->next_ft :
+- find_next_chained_ft(fs_prio);
++ find_next_chained_ft(&fs_prio->node);
+ ft->def_miss_action = ns->def_miss_action;
+ ft->ns = ns;
+ err = root->cmds->create_flow_table(root, ft, ft_attr, next_ft);
+@@ -2195,13 +2251,20 @@ EXPORT_SYMBOL(mlx5_del_flow_rules);
+ /* Assuming prio->node.children(flow tables) is sorted by level */
+ static struct mlx5_flow_table *find_next_ft(struct mlx5_flow_table *ft)
+ {
++ struct fs_node *prio_parent, *child;
+ struct fs_prio *prio;
+
+ fs_get_obj(prio, ft->node.parent);
+
+ if (!list_is_last(&ft->node.list, &prio->node.children))
+ return list_next_entry(ft, node.list);
+- return find_next_chained_ft(prio);
++
++ prio_parent = find_prio_chains_parent(&prio->node, &child);
++
++ if (prio_parent && list_is_first(&child->list, &prio_parent->children))
++ return find_closest_ft(&prio->node, false, false);
++
++ return find_next_chained_ft(&prio->node);
+ }
+
+ static int update_root_ft_destroy(struct mlx5_flow_table *ft)
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
+index db9df9798ffac..a80ecb672f33d 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
+@@ -178,7 +178,7 @@ mlx5_chains_create_table(struct mlx5_fs_chains *chains,
+ if (!mlx5_chains_ignore_flow_level_supported(chains) ||
+ (chain == 0 && prio == 1 && level == 0)) {
+ ft_attr.level = chains->fs_base_level;
+- ft_attr.prio = chains->fs_base_prio;
++ ft_attr.prio = chains->fs_base_prio + prio - 1;
+ ns = (chains->ns == MLX5_FLOW_NAMESPACE_FDB) ?
+ mlx5_get_fdb_sub_ns(chains->dev, chain) :
+ mlx5_get_flow_namespace(chains->dev, chains->ns);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+index d6ee016deae17..c7a06c8bbb7a3 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+@@ -1456,6 +1456,7 @@ void mlx5_uninit_one(struct mlx5_core_dev *dev)
+ if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) {
+ mlx5_core_warn(dev, "%s: interface is down, NOP\n",
+ __func__);
++ mlx5_devlink_params_unregister(priv_to_devlink(dev));
+ mlx5_cleanup_once(dev);
+ goto out;
+ }
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
+index 1aa525e509f10..293d2edd03d59 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
+@@ -562,11 +562,12 @@ int mlx5dr_cmd_create_reformat_ctx(struct mlx5_core_dev *mdev,
+
+ err = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out));
+ if (err)
+- return err;
++ goto err_free_in;
+
+ *reformat_id = MLX5_GET(alloc_packet_reformat_context_out, out, packet_reformat_id);
+- kvfree(in);
+
++err_free_in:
++ kvfree(in);
+ return err;
+ }
+
+diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
+index c5687d94ea885..7b7e1c5b00f47 100644
+--- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
++++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
+@@ -66,6 +66,7 @@
+ #include <linux/slab.h>
+ #include <linux/prefetch.h>
+ #include <net/checksum.h>
++#include <net/gso.h>
+ #include <net/ip.h>
+ #include <net/tcp.h>
+ #include <asm/byteorder.h>
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev_api.h b/drivers/net/ethernet/qlogic/qed/qed_dev_api.h
+index f8682356d0cf4..94d4f9413ab7a 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_dev_api.h
++++ b/drivers/net/ethernet/qlogic/qed/qed_dev_api.h
+@@ -193,6 +193,22 @@ void qed_hw_remove(struct qed_dev *cdev);
+ */
+ struct qed_ptt *qed_ptt_acquire(struct qed_hwfn *p_hwfn);
+
++/**
++ * qed_ptt_acquire_context(): Allocate a PTT window honoring the context
++ * atomicity.
++ *
++ * @p_hwfn: HW device data.
++ * @is_atomic: Hint from the caller - if the func can sleep or not.
++ *
++ * Context: The function should not sleep in case is_atomic == true.
++ * Return: struct qed_ptt.
++ *
++ * Should be called at the entry point to the driver
++ * (at the beginning of an exported function).
++ */
++struct qed_ptt *qed_ptt_acquire_context(struct qed_hwfn *p_hwfn,
++ bool is_atomic);
++
+ /**
+ * qed_ptt_release(): Release PTT Window.
+ *
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_fcoe.c b/drivers/net/ethernet/qlogic/qed/qed_fcoe.c
+index 3764190b948eb..04602ac947087 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_fcoe.c
++++ b/drivers/net/ethernet/qlogic/qed/qed_fcoe.c
+@@ -693,13 +693,14 @@ static void _qed_fcoe_get_pstats(struct qed_hwfn *p_hwfn,
+ }
+
+ static int qed_fcoe_get_stats(struct qed_hwfn *p_hwfn,
+- struct qed_fcoe_stats *p_stats)
++ struct qed_fcoe_stats *p_stats,
++ bool is_atomic)
+ {
+ struct qed_ptt *p_ptt;
+
+ memset(p_stats, 0, sizeof(*p_stats));
+
+- p_ptt = qed_ptt_acquire(p_hwfn);
++ p_ptt = qed_ptt_acquire_context(p_hwfn, is_atomic);
+
+ if (!p_ptt) {
+ DP_ERR(p_hwfn, "Failed to acquire ptt\n");
+@@ -973,19 +974,27 @@ static int qed_fcoe_destroy_conn(struct qed_dev *cdev,
+ QED_SPQ_MODE_EBLOCK, NULL);
+ }
+
++static int qed_fcoe_stats_context(struct qed_dev *cdev,
++ struct qed_fcoe_stats *stats,
++ bool is_atomic)
++{
++ return qed_fcoe_get_stats(QED_AFFIN_HWFN(cdev), stats, is_atomic);
++}
++
+ static int qed_fcoe_stats(struct qed_dev *cdev, struct qed_fcoe_stats *stats)
+ {
+- return qed_fcoe_get_stats(QED_AFFIN_HWFN(cdev), stats);
++ return qed_fcoe_stats_context(cdev, stats, false);
+ }
+
+ void qed_get_protocol_stats_fcoe(struct qed_dev *cdev,
+- struct qed_mcp_fcoe_stats *stats)
++ struct qed_mcp_fcoe_stats *stats,
++ bool is_atomic)
+ {
+ struct qed_fcoe_stats proto_stats;
+
+ /* Retrieve FW statistics */
+ memset(&proto_stats, 0, sizeof(proto_stats));
+- if (qed_fcoe_stats(cdev, &proto_stats)) {
++ if (qed_fcoe_stats_context(cdev, &proto_stats, is_atomic)) {
+ DP_VERBOSE(cdev, QED_MSG_STORAGE,
+ "Failed to collect FCoE statistics\n");
+ return;
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_fcoe.h b/drivers/net/ethernet/qlogic/qed/qed_fcoe.h
+index 19c85adf4ceb1..214e8299ecb4e 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_fcoe.h
++++ b/drivers/net/ethernet/qlogic/qed/qed_fcoe.h
+@@ -28,8 +28,20 @@ int qed_fcoe_alloc(struct qed_hwfn *p_hwfn);
+ void qed_fcoe_setup(struct qed_hwfn *p_hwfn);
+
+ void qed_fcoe_free(struct qed_hwfn *p_hwfn);
++/**
++ * qed_get_protocol_stats_fcoe(): Fills provided statistics
++ * struct with statistics.
++ *
++ * @cdev: Qed dev pointer.
++ * @stats: Points to struct that will be filled with statistics.
++ * @is_atomic: Hint from the caller - if the func can sleep or not.
++ *
++ * Context: The function should not sleep in case is_atomic == true.
++ * Return: Void.
++ */
+ void qed_get_protocol_stats_fcoe(struct qed_dev *cdev,
+- struct qed_mcp_fcoe_stats *stats);
++ struct qed_mcp_fcoe_stats *stats,
++ bool is_atomic);
+ #else /* CONFIG_QED_FCOE */
+ static inline int qed_fcoe_alloc(struct qed_hwfn *p_hwfn)
+ {
+@@ -40,7 +52,8 @@ static inline void qed_fcoe_setup(struct qed_hwfn *p_hwfn) {}
+ static inline void qed_fcoe_free(struct qed_hwfn *p_hwfn) {}
+
+ static inline void qed_get_protocol_stats_fcoe(struct qed_dev *cdev,
+- struct qed_mcp_fcoe_stats *stats)
++ struct qed_mcp_fcoe_stats *stats,
++ bool is_atomic)
+ {
+ }
+ #endif /* CONFIG_QED_FCOE */
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_hw.c b/drivers/net/ethernet/qlogic/qed/qed_hw.c
+index 554f30b0cfd5e..6263f847b6b92 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_hw.c
++++ b/drivers/net/ethernet/qlogic/qed/qed_hw.c
+@@ -23,7 +23,10 @@
+ #include "qed_reg_addr.h"
+ #include "qed_sriov.h"
+
+-#define QED_BAR_ACQUIRE_TIMEOUT 1000
++#define QED_BAR_ACQUIRE_TIMEOUT_USLEEP_CNT 1000
++#define QED_BAR_ACQUIRE_TIMEOUT_USLEEP 1000
++#define QED_BAR_ACQUIRE_TIMEOUT_UDELAY_CNT 100000
++#define QED_BAR_ACQUIRE_TIMEOUT_UDELAY 10
+
+ /* Invalid values */
+ #define QED_BAR_INVALID_OFFSET (cpu_to_le32(-1))
+@@ -84,12 +87,22 @@ void qed_ptt_pool_free(struct qed_hwfn *p_hwfn)
+ }
+
+ struct qed_ptt *qed_ptt_acquire(struct qed_hwfn *p_hwfn)
++{
++ return qed_ptt_acquire_context(p_hwfn, false);
++}
++
++struct qed_ptt *qed_ptt_acquire_context(struct qed_hwfn *p_hwfn, bool is_atomic)
+ {
+ struct qed_ptt *p_ptt;
+- unsigned int i;
++ unsigned int i, count;
++
++ if (is_atomic)
++ count = QED_BAR_ACQUIRE_TIMEOUT_UDELAY_CNT;
++ else
++ count = QED_BAR_ACQUIRE_TIMEOUT_USLEEP_CNT;
+
+ /* Take the free PTT from the list */
+- for (i = 0; i < QED_BAR_ACQUIRE_TIMEOUT; i++) {
++ for (i = 0; i < count; i++) {
+ spin_lock_bh(&p_hwfn->p_ptt_pool->lock);
+
+ if (!list_empty(&p_hwfn->p_ptt_pool->free_list)) {
+@@ -105,7 +118,12 @@ struct qed_ptt *qed_ptt_acquire(struct qed_hwfn *p_hwfn)
+ }
+
+ spin_unlock_bh(&p_hwfn->p_ptt_pool->lock);
+- usleep_range(1000, 2000);
++
++ if (is_atomic)
++ udelay(QED_BAR_ACQUIRE_TIMEOUT_UDELAY);
++ else
++ usleep_range(QED_BAR_ACQUIRE_TIMEOUT_USLEEP,
++ QED_BAR_ACQUIRE_TIMEOUT_USLEEP * 2);
+ }
+
+ DP_NOTICE(p_hwfn, "PTT acquire timeout - failed to allocate PTT\n");
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_iscsi.c b/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
+index 511ab214eb9c8..980e7289b4814 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
++++ b/drivers/net/ethernet/qlogic/qed/qed_iscsi.c
+@@ -999,13 +999,14 @@ static void _qed_iscsi_get_pstats(struct qed_hwfn *p_hwfn,
+ }
+
+ static int qed_iscsi_get_stats(struct qed_hwfn *p_hwfn,
+- struct qed_iscsi_stats *stats)
++ struct qed_iscsi_stats *stats,
++ bool is_atomic)
+ {
+ struct qed_ptt *p_ptt;
+
+ memset(stats, 0, sizeof(*stats));
+
+- p_ptt = qed_ptt_acquire(p_hwfn);
++ p_ptt = qed_ptt_acquire_context(p_hwfn, is_atomic);
+ if (!p_ptt) {
+ DP_ERR(p_hwfn, "Failed to acquire ptt\n");
+ return -EAGAIN;
+@@ -1336,9 +1337,16 @@ static int qed_iscsi_destroy_conn(struct qed_dev *cdev,
+ QED_SPQ_MODE_EBLOCK, NULL);
+ }
+
++static int qed_iscsi_stats_context(struct qed_dev *cdev,
++ struct qed_iscsi_stats *stats,
++ bool is_atomic)
++{
++ return qed_iscsi_get_stats(QED_AFFIN_HWFN(cdev), stats, is_atomic);
++}
++
+ static int qed_iscsi_stats(struct qed_dev *cdev, struct qed_iscsi_stats *stats)
+ {
+- return qed_iscsi_get_stats(QED_AFFIN_HWFN(cdev), stats);
++ return qed_iscsi_stats_context(cdev, stats, false);
+ }
+
+ static int qed_iscsi_change_mac(struct qed_dev *cdev,
+@@ -1358,13 +1366,14 @@ static int qed_iscsi_change_mac(struct qed_dev *cdev,
+ }
+
+ void qed_get_protocol_stats_iscsi(struct qed_dev *cdev,
+- struct qed_mcp_iscsi_stats *stats)
++ struct qed_mcp_iscsi_stats *stats,
++ bool is_atomic)
+ {
+ struct qed_iscsi_stats proto_stats;
+
+ /* Retrieve FW statistics */
+ memset(&proto_stats, 0, sizeof(proto_stats));
+- if (qed_iscsi_stats(cdev, &proto_stats)) {
++ if (qed_iscsi_stats_context(cdev, &proto_stats, is_atomic)) {
+ DP_VERBOSE(cdev, QED_MSG_STORAGE,
+ "Failed to collect ISCSI statistics\n");
+ return;
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_iscsi.h b/drivers/net/ethernet/qlogic/qed/qed_iscsi.h
+index dec2b00259d42..974cb8d26608c 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_iscsi.h
++++ b/drivers/net/ethernet/qlogic/qed/qed_iscsi.h
+@@ -39,11 +39,14 @@ void qed_iscsi_free(struct qed_hwfn *p_hwfn);
+ *
+ * @cdev: Qed dev pointer.
+ * @stats: Points to struct that will be filled with statistics.
++ * @is_atomic: Hint from the caller - if the func can sleep or not.
+ *
++ * Context: The function should not sleep in case is_atomic == true.
+ * Return: Void.
+ */
+ void qed_get_protocol_stats_iscsi(struct qed_dev *cdev,
+- struct qed_mcp_iscsi_stats *stats);
++ struct qed_mcp_iscsi_stats *stats,
++ bool is_atomic);
+ #else /* IS_ENABLED(CONFIG_QED_ISCSI) */
+ static inline int qed_iscsi_alloc(struct qed_hwfn *p_hwfn)
+ {
+@@ -56,7 +59,8 @@ static inline void qed_iscsi_free(struct qed_hwfn *p_hwfn) {}
+
+ static inline void
+ qed_get_protocol_stats_iscsi(struct qed_dev *cdev,
+- struct qed_mcp_iscsi_stats *stats) {}
++ struct qed_mcp_iscsi_stats *stats,
++ bool is_atomic) {}
+ #endif /* IS_ENABLED(CONFIG_QED_ISCSI) */
+
+ #endif
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.c b/drivers/net/ethernet/qlogic/qed/qed_l2.c
+index 7776d3bdd459a..970b9aabbc3d7 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_l2.c
++++ b/drivers/net/ethernet/qlogic/qed/qed_l2.c
+@@ -1863,7 +1863,8 @@ static void __qed_get_vport_stats(struct qed_hwfn *p_hwfn,
+ }
+
+ static void _qed_get_vport_stats(struct qed_dev *cdev,
+- struct qed_eth_stats *stats)
++ struct qed_eth_stats *stats,
++ bool is_atomic)
+ {
+ u8 fw_vport = 0;
+ int i;
+@@ -1872,10 +1873,11 @@ static void _qed_get_vport_stats(struct qed_dev *cdev,
+
+ for_each_hwfn(cdev, i) {
+ struct qed_hwfn *p_hwfn = &cdev->hwfns[i];
+- struct qed_ptt *p_ptt = IS_PF(cdev) ? qed_ptt_acquire(p_hwfn)
+- : NULL;
++ struct qed_ptt *p_ptt;
+ bool b_get_port_stats;
+
++ p_ptt = IS_PF(cdev) ? qed_ptt_acquire_context(p_hwfn, is_atomic)
++ : NULL;
+ if (IS_PF(cdev)) {
+ /* The main vport index is relative first */
+ if (qed_fw_vport(p_hwfn, 0, &fw_vport)) {
+@@ -1900,6 +1902,13 @@ out:
+ }
+
+ void qed_get_vport_stats(struct qed_dev *cdev, struct qed_eth_stats *stats)
++{
++ qed_get_vport_stats_context(cdev, stats, false);
++}
++
++void qed_get_vport_stats_context(struct qed_dev *cdev,
++ struct qed_eth_stats *stats,
++ bool is_atomic)
+ {
+ u32 i;
+
+@@ -1908,7 +1917,7 @@ void qed_get_vport_stats(struct qed_dev *cdev, struct qed_eth_stats *stats)
+ return;
+ }
+
+- _qed_get_vport_stats(cdev, stats);
++ _qed_get_vport_stats(cdev, stats, is_atomic);
+
+ if (!cdev->reset_stats)
+ return;
+@@ -1960,7 +1969,7 @@ void qed_reset_vport_stats(struct qed_dev *cdev)
+ if (!cdev->reset_stats) {
+ DP_INFO(cdev, "Reset stats not allocated\n");
+ } else {
+- _qed_get_vport_stats(cdev, cdev->reset_stats);
++ _qed_get_vport_stats(cdev, cdev->reset_stats, false);
+ cdev->reset_stats->common.link_change_count = 0;
+ }
+ }
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.h b/drivers/net/ethernet/qlogic/qed/qed_l2.h
+index a538cf478c14e..2d2f82c785ad2 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_l2.h
++++ b/drivers/net/ethernet/qlogic/qed/qed_l2.h
+@@ -249,8 +249,32 @@ qed_sp_eth_rx_queues_update(struct qed_hwfn *p_hwfn,
+ enum spq_mode comp_mode,
+ struct qed_spq_comp_cb *p_comp_data);
+
++/**
++ * qed_get_vport_stats(): Fills provided statistics
++ * struct with statistics.
++ *
++ * @cdev: Qed dev pointer.
++ * @stats: Points to struct that will be filled with statistics.
++ *
++ * Return: Void.
++ */
+ void qed_get_vport_stats(struct qed_dev *cdev, struct qed_eth_stats *stats);
+
++/**
++ * qed_get_vport_stats_context(): Fills provided statistics
++ * struct with statistics.
++ *
++ * @cdev: Qed dev pointer.
++ * @stats: Points to struct that will be filled with statistics.
++ * @is_atomic: Hint from the caller - if the func can sleep or not.
++ *
++ * Context: The function should not sleep in case is_atomic == true.
++ * Return: Void.
++ */
++void qed_get_vport_stats_context(struct qed_dev *cdev,
++ struct qed_eth_stats *stats,
++ bool is_atomic);
++
+ void qed_reset_vport_stats(struct qed_dev *cdev);
+
+ /**
+diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
+index f5af83342856f..c278f8893042b 100644
+--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
++++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
+@@ -3092,7 +3092,7 @@ void qed_get_protocol_stats(struct qed_dev *cdev,
+
+ switch (type) {
+ case QED_MCP_LAN_STATS:
+-		qed_get_vport_stats(cdev, &eth_stats);
++		qed_get_vport_stats_context(cdev, &eth_stats, true);
+ stats->lan_stats.ucast_rx_pkts =
+ eth_stats.common.rx_ucast_pkts;
+ stats->lan_stats.ucast_tx_pkts =
+@@ -3100,10 +3100,10 @@ void qed_get_protocol_stats(struct qed_dev *cdev,
+ stats->lan_stats.fcs_err = -1;
+ break;
+ case QED_MCP_FCOE_STATS:
+- qed_get_protocol_stats_fcoe(cdev, &stats->fcoe_stats);
++ qed_get_protocol_stats_fcoe(cdev, &stats->fcoe_stats, true);
+ break;
+ case QED_MCP_ISCSI_STATS:
+- qed_get_protocol_stats_iscsi(cdev, &stats->iscsi_stats);
++ qed_get_protocol_stats_iscsi(cdev, &stats->iscsi_stats, true);
+ break;
+ default:
+ DP_VERBOSE(cdev, QED_MSG_SP,
+diff --git a/drivers/net/ethernet/sfc/siena/tx_common.c b/drivers/net/ethernet/sfc/siena/tx_common.c
+index 93a32d61944f0..a7a9ab304e136 100644
+--- a/drivers/net/ethernet/sfc/siena/tx_common.c
++++ b/drivers/net/ethernet/sfc/siena/tx_common.c
+@@ -12,6 +12,7 @@
+ #include "efx.h"
+ #include "nic_common.h"
+ #include "tx_common.h"
++#include <net/gso.h>
+
+ static unsigned int efx_tx_cb_page_count(struct efx_tx_queue *tx_queue)
+ {
+diff --git a/drivers/net/ethernet/sfc/tx_common.c b/drivers/net/ethernet/sfc/tx_common.c
+index 755aa92bf8236..9f2393d343715 100644
+--- a/drivers/net/ethernet/sfc/tx_common.c
++++ b/drivers/net/ethernet/sfc/tx_common.c
+@@ -12,6 +12,7 @@
+ #include "efx.h"
+ #include "nic_common.h"
+ #include "tx_common.h"
++#include <net/gso.h>
+
+ static unsigned int efx_tx_cb_page_count(struct efx_tx_queue *tx_queue)
+ {
+diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c
+index 2d7347b71c41b..0dcd6a568b061 100644
+--- a/drivers/net/ethernet/socionext/netsec.c
++++ b/drivers/net/ethernet/socionext/netsec.c
+@@ -1851,6 +1851,17 @@ static int netsec_of_probe(struct platform_device *pdev,
+ return err;
+ }
+
++ /*
++ * SynQuacer is physically configured with TX and RX delays
++ * but the standard firmware claimed otherwise for a long
++ * time, ignore it.
++ */
++ if (of_machine_is_compatible("socionext,developer-box") &&
++ priv->phy_interface != PHY_INTERFACE_MODE_RGMII_ID) {
++ dev_warn(&pdev->dev, "Outdated firmware reports incorrect PHY mode, overriding\n");
++ priv->phy_interface = PHY_INTERFACE_MODE_RGMII_ID;
++ }
++
+ priv->phy_np = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0);
+ if (!priv->phy_np) {
+ dev_err(&pdev->dev, "missing required property 'phy-handle'\n");
+diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c
+index bdf990cf2f310..0880048ccdddc 100644
+--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c
++++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c
+@@ -234,7 +234,8 @@ static int tegra_mgbe_probe(struct platform_device *pdev)
+ res.addr = mgbe->regs;
+ res.irq = irq;
+
+- mgbe->clks = devm_kzalloc(&pdev->dev, sizeof(*mgbe->clks), GFP_KERNEL);
++ mgbe->clks = devm_kcalloc(&pdev->dev, ARRAY_SIZE(mgbe_clks),
++ sizeof(*mgbe->clks), GFP_KERNEL);
+ if (!mgbe->clks)
+ return -ENOMEM;
+
+diff --git a/drivers/net/ethernet/sun/sunvnet_common.c b/drivers/net/ethernet/sun/sunvnet_common.c
+index a6211b95ed178..3525d5c0d694c 100644
+--- a/drivers/net/ethernet/sun/sunvnet_common.c
++++ b/drivers/net/ethernet/sun/sunvnet_common.c
+@@ -25,6 +25,7 @@
+ #endif
+
+ #include <net/ip.h>
++#include <net/gso.h>
+ #include <net/icmp.h>
+ #include <net/route.h>
+
+diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
+index e0ac1bcd9925c..49f303353ecb0 100644
+--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
++++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
+@@ -1567,12 +1567,16 @@ static int temac_probe(struct platform_device *pdev)
+ }
+
+ /* Error handle returned DMA RX and TX interrupts */
+- if (lp->rx_irq < 0)
+- return dev_err_probe(&pdev->dev, lp->rx_irq,
++ if (lp->rx_irq <= 0) {
++ rc = lp->rx_irq ?: -EINVAL;
++ return dev_err_probe(&pdev->dev, rc,
+ "could not get DMA RX irq\n");
+- if (lp->tx_irq < 0)
+- return dev_err_probe(&pdev->dev, lp->tx_irq,
++ }
++ if (lp->tx_irq <= 0) {
++ rc = lp->tx_irq ?: -EINVAL;
++ return dev_err_probe(&pdev->dev, rc,
+ "could not get DMA TX irq\n");
++ }
+
+ if (temac_np) {
+ /* Retrieve the MAC address */
+diff --git a/drivers/net/tap.c b/drivers/net/tap.c
+index d30d730ed5a71..49d1d6acf95eb 100644
+--- a/drivers/net/tap.c
++++ b/drivers/net/tap.c
+@@ -18,6 +18,7 @@
+ #include <linux/fs.h>
+ #include <linux/uio.h>
+
++#include <net/gso.h>
+ #include <net/net_namespace.h>
+ #include <net/rtnetlink.h>
+ #include <net/sock.h>
+@@ -533,7 +534,7 @@ static int tap_open(struct inode *inode, struct file *file)
+ q->sock.state = SS_CONNECTED;
+ q->sock.file = file;
+ q->sock.ops = &tap_socket_ops;
+- sock_init_data_uid(&q->sock, &q->sk, inode->i_uid);
++ sock_init_data_uid(&q->sock, &q->sk, current_fsuid());
+ q->sk.sk_write_space = tap_sock_write_space;
+ q->sk.sk_destruct = tap_sock_destruct;
+ q->flags = IFF_VNET_HDR | IFF_NO_PI | IFF_TAP;
+diff --git a/drivers/net/tun.c b/drivers/net/tun.c
+index d75456adc62ac..25f0191df00bf 100644
+--- a/drivers/net/tun.c
++++ b/drivers/net/tun.c
+@@ -3469,7 +3469,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
+ tfile->socket.file = file;
+ tfile->socket.ops = &tun_socket_ops;
+
+- sock_init_data_uid(&tfile->socket, &tfile->sk, inode->i_uid);
++ sock_init_data_uid(&tfile->socket, &tfile->sk, current_fsuid());
+
+ tfile->sk.sk_write_space = tun_sock_write_space;
+ tfile->sk.sk_sndbuf = INT_MAX;
+diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c
+index 80849d115e5dd..c1a75ef4fd68c 100644
+--- a/drivers/net/usb/cdc_ether.c
++++ b/drivers/net/usb/cdc_ether.c
+@@ -618,9 +618,23 @@ static const struct usb_device_id products[] = {
+ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
+ | USB_DEVICE_ID_MATCH_DEVICE,
+ .idVendor = 0x04DD,
++ .idProduct = 0x8005, /* A-300 */
++ ZAURUS_FAKE_INTERFACE,
++ .driver_info = 0,
++}, {
++ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
++ | USB_DEVICE_ID_MATCH_DEVICE,
++ .idVendor = 0x04DD,
+ .idProduct = 0x8006, /* B-500/SL-5600 */
+ ZAURUS_MASTER_INTERFACE,
+ .driver_info = 0,
++}, {
++ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
++ | USB_DEVICE_ID_MATCH_DEVICE,
++ .idVendor = 0x04DD,
++ .idProduct = 0x8006, /* B-500/SL-5600 */
++ ZAURUS_FAKE_INTERFACE,
++ .driver_info = 0,
+ }, {
+ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
+ | USB_DEVICE_ID_MATCH_DEVICE,
+@@ -628,6 +642,13 @@ static const struct usb_device_id products[] = {
+ .idProduct = 0x8007, /* C-700 */
+ ZAURUS_MASTER_INTERFACE,
+ .driver_info = 0,
++}, {
++ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
++ | USB_DEVICE_ID_MATCH_DEVICE,
++ .idVendor = 0x04DD,
++ .idProduct = 0x8007, /* C-700 */
++ ZAURUS_FAKE_INTERFACE,
++ .driver_info = 0,
+ }, {
+ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
+ | USB_DEVICE_ID_MATCH_DEVICE,
+diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
+index c458c030fadf6..59cde06aa7f60 100644
+--- a/drivers/net/usb/lan78xx.c
++++ b/drivers/net/usb/lan78xx.c
+@@ -4224,8 +4224,6 @@ static void lan78xx_disconnect(struct usb_interface *intf)
+ if (!dev)
+ return;
+
+- set_bit(EVENT_DEV_DISCONNECT, &dev->flags);
+-
+ netif_napi_del(&dev->napi);
+
+ udev = interface_to_usbdev(intf);
+@@ -4233,6 +4231,8 @@ static void lan78xx_disconnect(struct usb_interface *intf)
+
+ unregister_netdev(net);
+
++ timer_shutdown_sync(&dev->stat_monitor);
++ set_bit(EVENT_DEV_DISCONNECT, &dev->flags);
+ cancel_delayed_work_sync(&dev->wq);
+
+ phydev = net->phydev;
+@@ -4247,9 +4247,6 @@ static void lan78xx_disconnect(struct usb_interface *intf)
+
+ usb_scuttle_anchored_urbs(&dev->deferred);
+
+- if (timer_pending(&dev->stat_monitor))
+- del_timer_sync(&dev->stat_monitor);
+-
+ lan78xx_unbind(dev, intf);
+
+ lan78xx_free_tx_resources(dev);
+diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
+index 0999a58ca9d26..0738baa5b82e4 100644
+--- a/drivers/net/usb/r8152.c
++++ b/drivers/net/usb/r8152.c
+@@ -27,6 +27,7 @@
+ #include <linux/firmware.h>
+ #include <crypto/hash.h>
+ #include <linux/usb/r8152.h>
++#include <net/gso.h>
+
+ /* Information for net-next */
+ #define NETNEXT_VERSION "12"
+diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
+index 283ffddda821d..2d14b0d78541a 100644
+--- a/drivers/net/usb/usbnet.c
++++ b/drivers/net/usb/usbnet.c
+@@ -1775,6 +1775,10 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod)
+ } else if (!info->in || !info->out)
+ status = usbnet_get_endpoints (dev, udev);
+ else {
++ u8 ep_addrs[3] = {
++ info->in + USB_DIR_IN, info->out + USB_DIR_OUT, 0
++ };
++
+ dev->in = usb_rcvbulkpipe (xdev, info->in);
+ dev->out = usb_sndbulkpipe (xdev, info->out);
+ if (!(info->flags & FLAG_NO_SETINT))
+@@ -1784,6 +1788,8 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod)
+ else
+ status = 0;
+
++ if (status == 0 && !usb_check_bulk_endpoints(udev, ep_addrs))
++ status = -EINVAL;
+ }
+ if (status >= 0 && dev->status)
+ status = init_status (dev, udev);
+diff --git a/drivers/net/usb/zaurus.c b/drivers/net/usb/zaurus.c
+index 7984f2157d222..df3617c4c44e8 100644
+--- a/drivers/net/usb/zaurus.c
++++ b/drivers/net/usb/zaurus.c
+@@ -289,9 +289,23 @@ static const struct usb_device_id products [] = {
+ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
+ | USB_DEVICE_ID_MATCH_DEVICE,
+ .idVendor = 0x04DD,
++ .idProduct = 0x8005, /* A-300 */
++ ZAURUS_FAKE_INTERFACE,
++ .driver_info = (unsigned long)&bogus_mdlm_info,
++}, {
++ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
++ | USB_DEVICE_ID_MATCH_DEVICE,
++ .idVendor = 0x04DD,
+ .idProduct = 0x8006, /* B-500/SL-5600 */
+ ZAURUS_MASTER_INTERFACE,
+ .driver_info = ZAURUS_PXA_INFO,
++}, {
++ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
++ | USB_DEVICE_ID_MATCH_DEVICE,
++ .idVendor = 0x04DD,
++ .idProduct = 0x8006, /* B-500/SL-5600 */
++ ZAURUS_FAKE_INTERFACE,
++ .driver_info = (unsigned long)&bogus_mdlm_info,
+ }, {
+ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
+ | USB_DEVICE_ID_MATCH_DEVICE,
+@@ -299,6 +313,13 @@ static const struct usb_device_id products [] = {
+ .idProduct = 0x8007, /* C-700 */
+ ZAURUS_MASTER_INTERFACE,
+ .driver_info = ZAURUS_PXA_INFO,
++}, {
++ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
++ | USB_DEVICE_ID_MATCH_DEVICE,
++ .idVendor = 0x04DD,
++ .idProduct = 0x8007, /* C-700 */
++ ZAURUS_FAKE_INTERFACE,
++ .driver_info = (unsigned long)&bogus_mdlm_info,
+ }, {
+ .match_flags = USB_DEVICE_ID_MATCH_INT_INFO
+ | USB_DEVICE_ID_MATCH_DEVICE,
+diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
+index d58e9f818d3b7..258dcc1039216 100644
+--- a/drivers/net/wireguard/device.c
++++ b/drivers/net/wireguard/device.c
+@@ -20,6 +20,7 @@
+ #include <linux/icmp.h>
+ #include <linux/suspend.h>
+ #include <net/dst_metadata.h>
++#include <net/gso.h>
+ #include <net/icmp.h>
+ #include <net/rtnetlink.h>
+ #include <net/ip_tunnels.h>
+diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
+index 00719e1304386..682733193d3de 100644
+--- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
++++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
+@@ -7,6 +7,7 @@
+ #include <linux/ieee80211.h>
+ #include <linux/etherdevice.h>
+ #include <linux/tcp.h>
++#include <net/gso.h>
+ #include <net/ip.h>
+ #include <net/ipv6.h>
+
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/eeprom.c b/drivers/net/wireless/mediatek/mt76/mt7615/eeprom.c
+index 68e88224b8b1f..ccedea7e8a50d 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7615/eeprom.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7615/eeprom.c
+@@ -128,12 +128,12 @@ mt7615_eeprom_parse_hw_band_cap(struct mt7615_dev *dev)
+ case MT_EE_5GHZ:
+ dev->mphy.cap.has_5ghz = true;
+ break;
+- case MT_EE_2GHZ:
+- dev->mphy.cap.has_2ghz = true;
+- break;
+ case MT_EE_DBDC:
+ dev->dbdc_support = true;
+ fallthrough;
++ case MT_EE_2GHZ:
++ dev->mphy.cap.has_2ghz = true;
++ break;
+ default:
+ dev->mphy.cap.has_2ghz = true;
+ dev->mphy.cap.has_5ghz = true;
+diff --git a/drivers/s390/net/qeth_core.h b/drivers/s390/net/qeth_core.h
+index 1d195429753dd..613eab7297046 100644
+--- a/drivers/s390/net/qeth_core.h
++++ b/drivers/s390/net/qeth_core.h
+@@ -716,7 +716,6 @@ struct qeth_card_info {
+ u16 chid;
+ u8 ids_valid:1; /* cssid,iid,chid */
+ u8 dev_addr_is_registered:1;
+- u8 open_when_online:1;
+ u8 promisc_mode:1;
+ u8 use_v1_blkt:1;
+ u8 is_vm_nic:1;
+diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
+index 1d5b207c2b9e9..cd783290bde5e 100644
+--- a/drivers/s390/net/qeth_core_main.c
++++ b/drivers/s390/net/qeth_core_main.c
+@@ -5373,8 +5373,6 @@ int qeth_set_offline(struct qeth_card *card, const struct qeth_discipline *disc,
+ qeth_clear_ipacmd_list(card);
+
+ rtnl_lock();
+- card->info.open_when_online = card->dev->flags & IFF_UP;
+- dev_close(card->dev);
+ netif_device_detach(card->dev);
+ netif_carrier_off(card->dev);
+ rtnl_unlock();
+diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
+index 9f13ed170a437..75910c0bcc2bc 100644
+--- a/drivers/s390/net/qeth_l2_main.c
++++ b/drivers/s390/net/qeth_l2_main.c
+@@ -2388,9 +2388,12 @@ static int qeth_l2_set_online(struct qeth_card *card, bool carrier_ok)
+ qeth_enable_hw_features(dev);
+ qeth_l2_enable_brport_features(card);
+
+- if (card->info.open_when_online) {
+- card->info.open_when_online = 0;
+- dev_open(dev, NULL);
++ if (netif_running(dev)) {
++ local_bh_disable();
++ napi_schedule(&card->napi);
++ /* kick-start the NAPI softirq: */
++ local_bh_enable();
++ qeth_l2_set_rx_mode(dev);
+ }
+ rtnl_unlock();
+ }
+diff --git a/drivers/s390/net/qeth_l3_main.c b/drivers/s390/net/qeth_l3_main.c
+index af4e60d2917e9..b92a32b4b1141 100644
+--- a/drivers/s390/net/qeth_l3_main.c
++++ b/drivers/s390/net/qeth_l3_main.c
+@@ -2018,9 +2018,11 @@ static int qeth_l3_set_online(struct qeth_card *card, bool carrier_ok)
+ netif_device_attach(dev);
+ qeth_enable_hw_features(dev);
+
+- if (card->info.open_when_online) {
+- card->info.open_when_online = 0;
+- dev_open(dev, NULL);
++ if (netif_running(dev)) {
++ local_bh_disable();
++ napi_schedule(&card->napi);
++ /* kick-start the NAPI softirq: */
++ local_bh_enable();
+ }
+ rtnl_unlock();
+ }
+diff --git a/drivers/s390/scsi/zfcp_fc.c b/drivers/s390/scsi/zfcp_fc.c
+index f21307537829b..4f0d0e55f0d46 100644
+--- a/drivers/s390/scsi/zfcp_fc.c
++++ b/drivers/s390/scsi/zfcp_fc.c
+@@ -534,8 +534,7 @@ static void zfcp_fc_adisc_handler(void *data)
+
+ /* re-init to undo drop from zfcp_fc_adisc() */
+ port->d_id = ntoh24(adisc_resp->adisc_port_id);
+- /* port is good, unblock rport without going through erp */
+- zfcp_scsi_schedule_rport_register(port);
++ /* port is still good, nothing to do */
+ out:
+ atomic_andnot(ZFCP_STATUS_PORT_LINK_TEST, &port->status);
+ put_device(&port->dev);
+@@ -595,9 +594,6 @@ void zfcp_fc_link_test_work(struct work_struct *work)
+ int retval;
+
+ set_worker_desc("zadisc%16llx", port->wwpn); /* < WORKER_DESC_LEN=24 */
+- get_device(&port->dev);
+- port->rport_task = RPORT_DEL;
+- zfcp_scsi_rport_work(&port->rport_work);
+
+ /* only issue one test command at one time per port */
+ if (atomic_read(&port->status) & ZFCP_STATUS_PORT_LINK_TEST)
+diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
+index 659196a2f63ad..4d72d82f73586 100644
+--- a/drivers/scsi/storvsc_drv.c
++++ b/drivers/scsi/storvsc_drv.c
+@@ -365,6 +365,7 @@ static void storvsc_on_channel_callback(void *context);
+ #define STORVSC_FC_MAX_LUNS_PER_TARGET 255
+ #define STORVSC_FC_MAX_TARGETS 128
+ #define STORVSC_FC_MAX_CHANNELS 8
++#define STORVSC_FC_MAX_XFER_SIZE ((u32)(512 * 1024))
+
+ #define STORVSC_IDE_MAX_LUNS_PER_TARGET 64
+ #define STORVSC_IDE_MAX_TARGETS 1
+@@ -2004,6 +2005,9 @@ static int storvsc_probe(struct hv_device *device,
+ * protecting it from any weird value.
+ */
+ max_xfer_bytes = round_down(stor_device->max_transfer_bytes, HV_HYP_PAGE_SIZE);
++ if (is_fc)
++ max_xfer_bytes = min(max_xfer_bytes, STORVSC_FC_MAX_XFER_SIZE);
++
+ /* max_hw_sectors_kb */
+ host->max_sectors = max_xfer_bytes >> 9;
+ /*
+diff --git a/drivers/soc/imx/imx8mp-blk-ctrl.c b/drivers/soc/imx/imx8mp-blk-ctrl.c
+index 870aecc0202ae..1c1fcab4979a4 100644
+--- a/drivers/soc/imx/imx8mp-blk-ctrl.c
++++ b/drivers/soc/imx/imx8mp-blk-ctrl.c
+@@ -164,7 +164,7 @@ static int imx8mp_hsio_blk_ctrl_probe(struct imx8mp_blk_ctrl *bc)
+ clk_hsio_pll->hw.init = &init;
+
+ hw = &clk_hsio_pll->hw;
+- ret = devm_clk_hw_register(bc->dev, hw);
++ ret = devm_clk_hw_register(bc->bus_power_dev, hw);
+ if (ret)
+ return ret;
+
+diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
+index 152b3ec911599..ad14dd745e4ae 100644
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -499,12 +499,16 @@ static void fragment_free_space(struct btrfs_block_group *block_group)
+ * used yet since their free space will be released as soon as the transaction
+ * commits.
+ */
+-u64 add_new_free_space(struct btrfs_block_group *block_group, u64 start, u64 end)
++int add_new_free_space(struct btrfs_block_group *block_group, u64 start, u64 end,
++ u64 *total_added_ret)
+ {
+ struct btrfs_fs_info *info = block_group->fs_info;
+- u64 extent_start, extent_end, size, total_added = 0;
++ u64 extent_start, extent_end, size;
+ int ret;
+
++ if (total_added_ret)
++ *total_added_ret = 0;
++
+ while (start < end) {
+ ret = find_first_extent_bit(&info->excluded_extents, start,
+ &extent_start, &extent_end,
+@@ -517,10 +521,12 @@ u64 add_new_free_space(struct btrfs_block_group *block_group, u64 start, u64 end
+ start = extent_end + 1;
+ } else if (extent_start > start && extent_start < end) {
+ size = extent_start - start;
+- total_added += size;
+ ret = btrfs_add_free_space_async_trimmed(block_group,
+ start, size);
+- BUG_ON(ret); /* -ENOMEM or logic error */
++ if (ret)
++ return ret;
++ if (total_added_ret)
++ *total_added_ret += size;
+ start = extent_end + 1;
+ } else {
+ break;
+@@ -529,13 +535,15 @@ u64 add_new_free_space(struct btrfs_block_group *block_group, u64 start, u64 end
+
+ if (start < end) {
+ size = end - start;
+- total_added += size;
+ ret = btrfs_add_free_space_async_trimmed(block_group, start,
+ size);
+- BUG_ON(ret); /* -ENOMEM or logic error */
++ if (ret)
++ return ret;
++ if (total_added_ret)
++ *total_added_ret += size;
+ }
+
+- return total_added;
++ return 0;
+ }
+
+ /*
+@@ -779,8 +787,13 @@ next:
+
+ if (key.type == BTRFS_EXTENT_ITEM_KEY ||
+ key.type == BTRFS_METADATA_ITEM_KEY) {
+- total_found += add_new_free_space(block_group, last,
+- key.objectid);
++ u64 space_added;
++
++ ret = add_new_free_space(block_group, last, key.objectid,
++ &space_added);
++ if (ret)
++ goto out;
++ total_found += space_added;
+ if (key.type == BTRFS_METADATA_ITEM_KEY)
+ last = key.objectid +
+ fs_info->nodesize;
+@@ -795,11 +808,10 @@ next:
+ }
+ path->slots[0]++;
+ }
+- ret = 0;
+-
+- total_found += add_new_free_space(block_group, last,
+- block_group->start + block_group->length);
+
++ ret = add_new_free_space(block_group, last,
++ block_group->start + block_group->length,
++ NULL);
+ out:
+ btrfs_free_path(path);
+ return ret;
+@@ -2290,9 +2302,11 @@ static int read_one_block_group(struct btrfs_fs_info *info,
+ btrfs_free_excluded_extents(cache);
+ } else if (cache->used == 0) {
+ cache->cached = BTRFS_CACHE_FINISHED;
+- add_new_free_space(cache, cache->start,
+- cache->start + cache->length);
++ ret = add_new_free_space(cache, cache->start,
++ cache->start + cache->length, NULL);
+ btrfs_free_excluded_extents(cache);
++ if (ret)
++ goto error;
+ }
+
+ ret = btrfs_add_block_group_cache(info, cache);
+@@ -2728,9 +2742,12 @@ struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *tran
+ return ERR_PTR(ret);
+ }
+
+- add_new_free_space(cache, chunk_offset, chunk_offset + size);
+-
++ ret = add_new_free_space(cache, chunk_offset, chunk_offset + size, NULL);
+ btrfs_free_excluded_extents(cache);
++ if (ret) {
++ btrfs_put_block_group(cache);
++ return ERR_PTR(ret);
++ }
+
+ /*
+ * Ensure the corresponding space_info object is created and
+diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
+index cc0e4b37db2da..3195d0b0dbed8 100644
+--- a/fs/btrfs/block-group.h
++++ b/fs/btrfs/block-group.h
+@@ -277,8 +277,8 @@ int btrfs_cache_block_group(struct btrfs_block_group *cache, bool wait);
+ void btrfs_put_caching_control(struct btrfs_caching_control *ctl);
+ struct btrfs_caching_control *btrfs_get_caching_control(
+ struct btrfs_block_group *cache);
+-u64 add_new_free_space(struct btrfs_block_group *block_group,
+- u64 start, u64 end);
++int add_new_free_space(struct btrfs_block_group *block_group,
++ u64 start, u64 end, u64 *total_added_ret);
+ struct btrfs_trans_handle *btrfs_start_trans_remove_block_group(
+ struct btrfs_fs_info *fs_info,
+ const u64 chunk_offset);
+diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
+index 045ddce32eca4..f169378e2ca6e 100644
+--- a/fs/btrfs/free-space-tree.c
++++ b/fs/btrfs/free-space-tree.c
+@@ -1515,9 +1515,13 @@ static int load_free_space_bitmaps(struct btrfs_caching_control *caching_ctl,
+ if (prev_bit == 0 && bit == 1) {
+ extent_start = offset;
+ } else if (prev_bit == 1 && bit == 0) {
+- total_found += add_new_free_space(block_group,
+- extent_start,
+- offset);
++ u64 space_added;
++
++ ret = add_new_free_space(block_group, extent_start,
++ offset, &space_added);
++ if (ret)
++ goto out;
++ total_found += space_added;
+ if (total_found > CACHING_CTL_WAKE_UP) {
+ total_found = 0;
+ wake_up(&caching_ctl->wait);
+@@ -1529,8 +1533,9 @@ static int load_free_space_bitmaps(struct btrfs_caching_control *caching_ctl,
+ }
+ }
+ if (prev_bit == 1) {
+- total_found += add_new_free_space(block_group, extent_start,
+- end);
++ ret = add_new_free_space(block_group, extent_start, end, NULL);
++ if (ret)
++ goto out;
+ extent_count++;
+ }
+
+@@ -1569,6 +1574,8 @@ static int load_free_space_extents(struct btrfs_caching_control *caching_ctl,
+ end = block_group->start + block_group->length;
+
+ while (1) {
++ u64 space_added;
++
+ ret = btrfs_next_item(root, path);
+ if (ret < 0)
+ goto out;
+@@ -1583,8 +1590,11 @@ static int load_free_space_extents(struct btrfs_caching_control *caching_ctl,
+ ASSERT(key.type == BTRFS_FREE_SPACE_EXTENT_KEY);
+ ASSERT(key.objectid < end && key.objectid + key.offset <= end);
+
+- total_found += add_new_free_space(block_group, key.objectid,
+- key.objectid + key.offset);
++ ret = add_new_free_space(block_group, key.objectid,
++ key.objectid + key.offset, &space_added);
++ if (ret)
++ goto out;
++ total_found += space_added;
+ if (total_found > CACHING_CTL_WAKE_UP) {
+ total_found = 0;
+ wake_up(&caching_ctl->wait);
+diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
+index 4c0f22acf53d2..83c4abff496da 100644
+--- a/fs/ceph/mds_client.c
++++ b/fs/ceph/mds_client.c
+@@ -4762,7 +4762,7 @@ static void delayed_work(struct work_struct *work)
+
+ dout("mdsc delayed_work\n");
+
+- if (mdsc->stopping)
++ if (mdsc->stopping >= CEPH_MDSC_STOPPING_FLUSHED)
+ return;
+
+ mutex_lock(&mdsc->mutex);
+@@ -4941,7 +4941,7 @@ void send_flush_mdlog(struct ceph_mds_session *s)
+ void ceph_mdsc_pre_umount(struct ceph_mds_client *mdsc)
+ {
+ dout("pre_umount\n");
+- mdsc->stopping = 1;
++ mdsc->stopping = CEPH_MDSC_STOPPING_BEGIN;
+
+ ceph_mdsc_iterate_sessions(mdsc, send_flush_mdlog, true);
+ ceph_mdsc_iterate_sessions(mdsc, lock_unlock_session, false);
+diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
+index 724307ff89cd9..86d2965e68a1f 100644
+--- a/fs/ceph/mds_client.h
++++ b/fs/ceph/mds_client.h
+@@ -380,6 +380,11 @@ struct cap_wait {
+ int want;
+ };
+
++enum {
++ CEPH_MDSC_STOPPING_BEGIN = 1,
++ CEPH_MDSC_STOPPING_FLUSHED = 2,
++};
++
+ /*
+ * mds client state
+ */
+diff --git a/fs/ceph/super.c b/fs/ceph/super.c
+index 3fc48b43cab0a..a5f52013314d6 100644
+--- a/fs/ceph/super.c
++++ b/fs/ceph/super.c
+@@ -1374,6 +1374,16 @@ static void ceph_kill_sb(struct super_block *s)
+ ceph_mdsc_pre_umount(fsc->mdsc);
+ flush_fs_workqueues(fsc);
+
++ /*
++ * Though the kill_anon_super() will finally trigger the
++ * sync_filesystem() anyway, we still need to do it here
++ * and then bump the stage of shutdown to stop the work
++ * queue as early as possible.
++ */
++ sync_filesystem(s);
++
++ fsc->mdsc->stopping = CEPH_MDSC_STOPPING_FLUSHED;
++
+ kill_anon_super(s);
+
+ fsc->client->extra_mon_dispatch = NULL;
+diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
+index 4a1c238600c52..470988bb7867e 100644
+--- a/fs/erofs/zdata.c
++++ b/fs/erofs/zdata.c
+@@ -1110,10 +1110,11 @@ static void z_erofs_do_decompressed_bvec(struct z_erofs_decompress_backend *be,
+ struct z_erofs_bvec *bvec)
+ {
+ struct z_erofs_bvec_item *item;
++ unsigned int pgnr;
+
+- if (!((bvec->offset + be->pcl->pageofs_out) & ~PAGE_MASK)) {
+- unsigned int pgnr;
+-
++ if (!((bvec->offset + be->pcl->pageofs_out) & ~PAGE_MASK) &&
++ (bvec->end == PAGE_SIZE ||
++ bvec->offset + bvec->end == be->pcl->length)) {
+ pgnr = (bvec->offset + be->pcl->pageofs_out) >> PAGE_SHIFT;
+ DBG_BUGON(pgnr >= be->nr_pages);
+ if (!be->decompressed_pages[pgnr]) {
+diff --git a/fs/exfat/balloc.c b/fs/exfat/balloc.c
+index 9f42f25fab920..e918decb37358 100644
+--- a/fs/exfat/balloc.c
++++ b/fs/exfat/balloc.c
+@@ -69,7 +69,7 @@ static int exfat_allocate_bitmap(struct super_block *sb,
+ }
+ sbi->map_sectors = ((need_map_size - 1) >>
+ (sb->s_blocksize_bits)) + 1;
+- sbi->vol_amap = kmalloc_array(sbi->map_sectors,
++ sbi->vol_amap = kvmalloc_array(sbi->map_sectors,
+ sizeof(struct buffer_head *), GFP_KERNEL);
+ if (!sbi->vol_amap)
+ return -ENOMEM;
+@@ -84,7 +84,7 @@ static int exfat_allocate_bitmap(struct super_block *sb,
+ while (j < i)
+ brelse(sbi->vol_amap[j++]);
+
+- kfree(sbi->vol_amap);
++ kvfree(sbi->vol_amap);
+ sbi->vol_amap = NULL;
+ return -EIO;
+ }
+@@ -138,7 +138,7 @@ void exfat_free_bitmap(struct exfat_sb_info *sbi)
+ for (i = 0; i < sbi->map_sectors; i++)
+ __brelse(sbi->vol_amap[i]);
+
+- kfree(sbi->vol_amap);
++ kvfree(sbi->vol_amap);
+ }
+
+ int exfat_set_bitmap(struct inode *inode, unsigned int clu, bool sync)
+diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
+index 957574180a5e3..598081d0d0595 100644
+--- a/fs/exfat/dir.c
++++ b/fs/exfat/dir.c
+@@ -34,6 +34,7 @@ static int exfat_get_uniname_from_ext_entry(struct super_block *sb,
+ {
+ int i, err;
+ struct exfat_entry_set_cache es;
++ unsigned int uni_len = 0, len;
+
+ err = exfat_get_dentry_set(&es, sb, p_dir, entry, ES_ALL_ENTRIES);
+ if (err)
+@@ -52,7 +53,10 @@ static int exfat_get_uniname_from_ext_entry(struct super_block *sb,
+ if (exfat_get_entry_type(ep) != TYPE_EXTEND)
+ break;
+
+- exfat_extract_uni_name(ep, uniname);
++ len = exfat_extract_uni_name(ep, uniname);
++ uni_len += len;
++ if (len != EXFAT_FILE_NAME_LEN || uni_len >= MAX_NAME_LENGTH)
++ break;
+ uniname += EXFAT_FILE_NAME_LEN;
+ }
+
+@@ -214,7 +218,10 @@ static void exfat_free_namebuf(struct exfat_dentry_namebuf *nb)
+ exfat_init_namebuf(nb);
+ }
+
+-/* skip iterating emit_dots when dir is empty */
++/*
++ * Before calling dir_emit*(), sbi->s_lock should be released
++ * because page fault can occur in dir_emit*().
++ */
+ #define ITER_POS_FILLED_DOTS (2)
+ static int exfat_iterate(struct file *file, struct dir_context *ctx)
+ {
+@@ -229,11 +236,10 @@ static int exfat_iterate(struct file *file, struct dir_context *ctx)
+ int err = 0, fake_offset = 0;
+
+ exfat_init_namebuf(nb);
+- mutex_lock(&EXFAT_SB(sb)->s_lock);
+
+ cpos = ctx->pos;
+ if (!dir_emit_dots(file, ctx))
+- goto unlock;
++ goto out;
+
+ if (ctx->pos == ITER_POS_FILLED_DOTS) {
+ cpos = 0;
+@@ -245,16 +251,18 @@ static int exfat_iterate(struct file *file, struct dir_context *ctx)
+ /* name buffer should be allocated before use */
+ err = exfat_alloc_namebuf(nb);
+ if (err)
+- goto unlock;
++ goto out;
+ get_new:
++ mutex_lock(&EXFAT_SB(sb)->s_lock);
++
+ if (ei->flags == ALLOC_NO_FAT_CHAIN && cpos >= i_size_read(inode))
+ goto end_of_dir;
+
+ err = exfat_readdir(inode, &cpos, &de);
+ if (err) {
+ /*
+- * At least we tried to read a sector. Move cpos to next sector
+- * position (should be aligned).
++ * At least we tried to read a sector.
++ * Move cpos to next sector position (should be aligned).
+ */
+ if (err == -EIO) {
+ cpos += 1 << (sb->s_blocksize_bits);
+@@ -277,16 +285,10 @@ get_new:
+ inum = iunique(sb, EXFAT_ROOT_INO);
+ }
+
+- /*
+- * Before calling dir_emit(), sb_lock should be released.
+- * Because page fault can occur in dir_emit() when the size
+- * of buffer given from user is larger than one page size.
+- */
+ mutex_unlock(&EXFAT_SB(sb)->s_lock);
+ if (!dir_emit(ctx, nb->lfn, strlen(nb->lfn), inum,
+ (de.attr & ATTR_SUBDIR) ? DT_DIR : DT_REG))
+- goto out_unlocked;
+- mutex_lock(&EXFAT_SB(sb)->s_lock);
++ goto out;
+ ctx->pos = cpos;
+ goto get_new;
+
+@@ -294,9 +296,8 @@ end_of_dir:
+ if (!cpos && fake_offset)
+ cpos = ITER_POS_FILLED_DOTS;
+ ctx->pos = cpos;
+-unlock:
+ mutex_unlock(&EXFAT_SB(sb)->s_lock);
+-out_unlocked:
++out:
+ /*
+ * To improve performance, free namebuf after unlock sb_lock.
+ * If namebuf is not allocated, this function do nothing
+@@ -1079,7 +1080,8 @@ rewind:
+ if (entry_type == TYPE_EXTEND) {
+ unsigned short entry_uniname[16], unichar;
+
+- if (step != DIRENT_STEP_NAME) {
++ if (step != DIRENT_STEP_NAME ||
++ name_len >= MAX_NAME_LENGTH) {
+ step = DIRENT_STEP_FILE;
+ continue;
+ }
+diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
+index 8244366862e4c..11572cc60d0e9 100644
+--- a/fs/ext2/ext2.h
++++ b/fs/ext2/ext2.h
+@@ -70,10 +70,7 @@ struct mb_cache;
+ * second extended-fs super-block data in memory
+ */
+ struct ext2_sb_info {
+- unsigned long s_frag_size; /* Size of a fragment in bytes */
+- unsigned long s_frags_per_block;/* Number of fragments per block */
+ unsigned long s_inodes_per_block;/* Number of inodes per block */
+- unsigned long s_frags_per_group;/* Number of fragments in a group */
+ unsigned long s_blocks_per_group;/* Number of blocks in a group */
+ unsigned long s_inodes_per_group;/* Number of inodes in a group */
+ unsigned long s_itb_per_group; /* Number of inode table blocks per group */
+@@ -188,15 +185,6 @@ static inline struct ext2_sb_info *EXT2_SB(struct super_block *sb)
+ #define EXT2_INODE_SIZE(s) (EXT2_SB(s)->s_inode_size)
+ #define EXT2_FIRST_INO(s) (EXT2_SB(s)->s_first_ino)
+
+-/*
+- * Macro-instructions used to manage fragments
+- */
+-#define EXT2_MIN_FRAG_SIZE 1024
+-#define EXT2_MAX_FRAG_SIZE 4096
+-#define EXT2_MIN_FRAG_LOG_SIZE 10
+-#define EXT2_FRAG_SIZE(s) (EXT2_SB(s)->s_frag_size)
+-#define EXT2_FRAGS_PER_BLOCK(s) (EXT2_SB(s)->s_frags_per_block)
+-
+ /*
+ * Structure of a blocks group descriptor
+ */
+diff --git a/fs/ext2/super.c b/fs/ext2/super.c
+index f342f347a695f..2959afc7541c7 100644
+--- a/fs/ext2/super.c
++++ b/fs/ext2/super.c
+@@ -668,10 +668,9 @@ static int ext2_setup_super (struct super_block * sb,
+ es->s_max_mnt_count = cpu_to_le16(EXT2_DFL_MAX_MNT_COUNT);
+ le16_add_cpu(&es->s_mnt_count, 1);
+ if (test_opt (sb, DEBUG))
+- ext2_msg(sb, KERN_INFO, "%s, %s, bs=%lu, fs=%lu, gc=%lu, "
++ ext2_msg(sb, KERN_INFO, "%s, %s, bs=%lu, gc=%lu, "
+ "bpg=%lu, ipg=%lu, mo=%04lx]",
+ EXT2FS_VERSION, EXT2FS_DATE, sb->s_blocksize,
+- sbi->s_frag_size,
+ sbi->s_groups_count,
+ EXT2_BLOCKS_PER_GROUP(sb),
+ EXT2_INODES_PER_GROUP(sb),
+@@ -1012,14 +1011,7 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
+ }
+ }
+
+- sbi->s_frag_size = EXT2_MIN_FRAG_SIZE <<
+- le32_to_cpu(es->s_log_frag_size);
+- if (sbi->s_frag_size == 0)
+- goto cantfind_ext2;
+- sbi->s_frags_per_block = sb->s_blocksize / sbi->s_frag_size;
+-
+ sbi->s_blocks_per_group = le32_to_cpu(es->s_blocks_per_group);
+- sbi->s_frags_per_group = le32_to_cpu(es->s_frags_per_group);
+ sbi->s_inodes_per_group = le32_to_cpu(es->s_inodes_per_group);
+
+ sbi->s_inodes_per_block = sb->s_blocksize / EXT2_INODE_SIZE(sb);
+@@ -1045,11 +1037,10 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
+ goto failed_mount;
+ }
+
+- if (sb->s_blocksize != sbi->s_frag_size) {
++ if (es->s_log_frag_size != es->s_log_block_size) {
+ ext2_msg(sb, KERN_ERR,
+- "error: fragsize %lu != blocksize %lu"
+- "(not supported yet)",
+- sbi->s_frag_size, sb->s_blocksize);
++ "error: fragsize log %u != blocksize log %u",
++ le32_to_cpu(es->s_log_frag_size), sb->s_blocksize_bits);
+ goto failed_mount;
+ }
+
+@@ -1066,12 +1057,6 @@ static int ext2_fill_super(struct super_block *sb, void *data, int silent)
+ sbi->s_blocks_per_group, sbi->s_inodes_per_group + 3);
+ goto failed_mount;
+ }
+- if (sbi->s_frags_per_group > sb->s_blocksize * 8) {
+- ext2_msg(sb, KERN_ERR,
+- "error: #fragments per group too big: %lu",
+- sbi->s_frags_per_group);
+- goto failed_mount;
+- }
+ if (sbi->s_inodes_per_group < sbi->s_inodes_per_block ||
+ sbi->s_inodes_per_group > sb->s_blocksize * 8) {
+ ext2_msg(sb, KERN_ERR,
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
+index d867056a01f65..271d4e7b22c91 100644
+--- a/fs/f2fs/f2fs.h
++++ b/fs/f2fs/f2fs.h
+@@ -3445,7 +3445,6 @@ static inline bool __is_valid_data_blkaddr(block_t blkaddr)
+ * file.c
+ */
+ int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync);
+-void f2fs_truncate_data_blocks(struct dnode_of_data *dn);
+ int f2fs_do_truncate_blocks(struct inode *inode, u64 from, bool lock);
+ int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock);
+ int f2fs_truncate(struct inode *inode);
+diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
+index 015ed274dc312..ead75c4e833d2 100644
+--- a/fs/f2fs/file.c
++++ b/fs/f2fs/file.c
+@@ -627,11 +627,6 @@ void f2fs_truncate_data_blocks_range(struct dnode_of_data *dn, int count)
+ dn->ofs_in_node, nr_free);
+ }
+
+-void f2fs_truncate_data_blocks(struct dnode_of_data *dn)
+-{
+- f2fs_truncate_data_blocks_range(dn, ADDRS_PER_BLOCK(dn->inode));
+-}
+-
+ static int truncate_partial_data_page(struct inode *inode, u64 from,
+ bool cache_only)
+ {
+diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
+index 6bdb1bed29ec9..f8e1fd32e3e4f 100644
+--- a/fs/f2fs/node.c
++++ b/fs/f2fs/node.c
+@@ -925,6 +925,7 @@ static int truncate_node(struct dnode_of_data *dn)
+
+ static int truncate_dnode(struct dnode_of_data *dn)
+ {
++ struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode);
+ struct page *page;
+ int err;
+
+@@ -932,16 +933,25 @@ static int truncate_dnode(struct dnode_of_data *dn)
+ return 1;
+
+ /* get direct node */
+- page = f2fs_get_node_page(F2FS_I_SB(dn->inode), dn->nid);
++ page = f2fs_get_node_page(sbi, dn->nid);
+ if (PTR_ERR(page) == -ENOENT)
+ return 1;
+ else if (IS_ERR(page))
+ return PTR_ERR(page);
+
++ if (IS_INODE(page) || ino_of_node(page) != dn->inode->i_ino) {
++ f2fs_err(sbi, "incorrect node reference, ino: %lu, nid: %u, ino_of_node: %u",
++ dn->inode->i_ino, dn->nid, ino_of_node(page));
++ set_sbi_flag(sbi, SBI_NEED_FSCK);
++ f2fs_handle_error(sbi, ERROR_INVALID_NODE_REFERENCE);
++ f2fs_put_page(page, 1);
++ return -EFSCORRUPTED;
++ }
++
+ /* Make dnode_of_data for parameter */
+ dn->node_page = page;
+ dn->ofs_in_node = 0;
+- f2fs_truncate_data_blocks(dn);
++ f2fs_truncate_data_blocks_range(dn, ADDRS_PER_BLOCK(dn->inode));
+ err = truncate_node(dn);
+ if (err) {
+ f2fs_put_page(page, 1);
+diff --git a/fs/file.c b/fs/file.c
+index 35c62b54c9d65..dbca26ef7a01a 100644
+--- a/fs/file.c
++++ b/fs/file.c
+@@ -1036,12 +1036,28 @@ unsigned long __fdget_raw(unsigned int fd)
+ return __fget_light(fd, 0);
+ }
+
++/*
++ * Try to avoid f_pos locking. We only need it if the
++ * file is marked for FMODE_ATOMIC_POS, and it can be
++ * accessed multiple ways.
++ *
++ * Always do it for directories, because pidfd_getfd()
++ * can make a file accessible even if it otherwise would
++ * not be, and for directories this is a correctness
++ * issue, not a "POSIX requirement".
++ */
++static inline bool file_needs_f_pos_lock(struct file *file)
++{
++ return (file->f_mode & FMODE_ATOMIC_POS) &&
++ (file_count(file) > 1 || S_ISDIR(file_inode(file)->i_mode));
++}
++
+ unsigned long __fdget_pos(unsigned int fd)
+ {
+ unsigned long v = __fdget(fd);
+ struct file *file = (struct file *)(v & ~3);
+
+- if (file && (file->f_mode & FMODE_ATOMIC_POS)) {
++ if (file && file_needs_f_pos_lock(file)) {
+ v |= FDPUT_POS_UNLOCK;
+ mutex_lock(&file->f_pos_lock);
+ }
+diff --git a/fs/ntfs3/attrlist.c b/fs/ntfs3/attrlist.c
+index c0c6bcbc8c05c..81c22df27c725 100644
+--- a/fs/ntfs3/attrlist.c
++++ b/fs/ntfs3/attrlist.c
+@@ -52,7 +52,7 @@ int ntfs_load_attr_list(struct ntfs_inode *ni, struct ATTRIB *attr)
+
+ if (!attr->non_res) {
+ lsize = le32_to_cpu(attr->res.data_size);
+- le = kmalloc(al_aligned(lsize), GFP_NOFS);
++ le = kmalloc(al_aligned(lsize), GFP_NOFS | __GFP_NOWARN);
+ if (!le) {
+ err = -ENOMEM;
+ goto out;
+@@ -80,7 +80,7 @@ int ntfs_load_attr_list(struct ntfs_inode *ni, struct ATTRIB *attr)
+ if (err < 0)
+ goto out;
+
+- le = kmalloc(al_aligned(lsize), GFP_NOFS);
++ le = kmalloc(al_aligned(lsize), GFP_NOFS | __GFP_NOWARN);
+ if (!le) {
+ err = -ENOMEM;
+ goto out;
+diff --git a/fs/open.c b/fs/open.c
+index 4478adcc4f3a0..15ab413d03458 100644
+--- a/fs/open.c
++++ b/fs/open.c
+@@ -1271,7 +1271,7 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op)
+ lookup_flags |= LOOKUP_IN_ROOT;
+ if (how->resolve & RESOLVE_CACHED) {
+ /* Don't bother even trying for create/truncate/tmpfile open */
+- if (flags & (O_TRUNC | O_CREAT | O_TMPFILE))
++ if (flags & (O_TRUNC | O_CREAT | __O_TMPFILE))
+ return -EAGAIN;
+ lookup_flags |= LOOKUP_CACHED;
+ }
+diff --git a/fs/smb/client/dfs.c b/fs/smb/client/dfs.c
+index cf83617236d8b..a9410e976bb07 100644
+--- a/fs/smb/client/dfs.c
++++ b/fs/smb/client/dfs.c
+@@ -178,8 +178,12 @@ static int __dfs_mount_share(struct cifs_mount_ctx *mnt_ctx)
+ struct dfs_cache_tgt_list tl = DFS_CACHE_TGT_LIST_INIT(tl);
+
+ rc = dfs_get_referral(mnt_ctx, ref_path + 1, NULL, &tl);
+- if (rc)
++ if (rc) {
++ rc = cifs_mount_get_tcon(mnt_ctx);
++ if (!rc)
++ rc = cifs_is_path_remote(mnt_ctx);
+ break;
++ }
+
+ tit = dfs_cache_get_tgt_iterator(&tl);
+ if (!tit) {
+diff --git a/fs/super.c b/fs/super.c
+index 04bc62ab7dfea..5c72c59c5153b 100644
+--- a/fs/super.c
++++ b/fs/super.c
+@@ -903,6 +903,7 @@ int reconfigure_super(struct fs_context *fc)
+ struct super_block *sb = fc->root->d_sb;
+ int retval;
+ bool remount_ro = false;
++ bool remount_rw = false;
+ bool force = fc->sb_flags & SB_FORCE;
+
+ if (fc->sb_flags_mask & ~MS_RMT_MASK)
+@@ -920,7 +921,7 @@ int reconfigure_super(struct fs_context *fc)
+ bdev_read_only(sb->s_bdev))
+ return -EACCES;
+ #endif
+-
++ remount_rw = !(fc->sb_flags & SB_RDONLY) && sb_rdonly(sb);
+ remount_ro = (fc->sb_flags & SB_RDONLY) && !sb_rdonly(sb);
+ }
+
+@@ -950,6 +951,14 @@ int reconfigure_super(struct fs_context *fc)
+ if (retval)
+ return retval;
+ }
++ } else if (remount_rw) {
++ /*
++ * We set s_readonly_remount here to protect filesystem's
++ * reconfigure code from writes from userspace until
++ * reconfigure finishes.
++ */
++ sb->s_readonly_remount = 1;
++ smp_wmb();
+ }
+
+ if (fc->ops->reconfigure) {
+diff --git a/fs/sysv/itree.c b/fs/sysv/itree.c
+index b22764fe669c8..58d7f43a13712 100644
+--- a/fs/sysv/itree.c
++++ b/fs/sysv/itree.c
+@@ -145,6 +145,10 @@ static int alloc_branch(struct inode *inode,
+ */
+ parent = block_to_cpu(SYSV_SB(inode->i_sb), branch[n-1].key);
+ bh = sb_getblk(inode->i_sb, parent);
++ if (!bh) {
++ sysv_free_block(inode->i_sb, branch[n].key);
++ break;
++ }
+ lock_buffer(bh);
+ memset(bh->b_data, 0, blocksize);
+ branch[n].bh = bh;
+diff --git a/include/asm-generic/word-at-a-time.h b/include/asm-generic/word-at-a-time.h
+index 20c93f08c9933..95a1d214108a5 100644
+--- a/include/asm-generic/word-at-a-time.h
++++ b/include/asm-generic/word-at-a-time.h
+@@ -38,7 +38,7 @@ static inline long find_zero(unsigned long mask)
+ return (mask >> 8) ? byte : byte + 1;
+ }
+
+-static inline bool has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
++static inline unsigned long has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
+ {
+ unsigned long rhs = val | c->low_bits;
+ *data = rhs;
+diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
+index 1d6402529d10c..a82a4bb6ce68b 100644
+--- a/include/linux/f2fs_fs.h
++++ b/include/linux/f2fs_fs.h
+@@ -103,6 +103,7 @@ enum f2fs_error {
+ ERROR_INCONSISTENT_SIT,
+ ERROR_CORRUPTED_VERITY_XATTR,
+ ERROR_CORRUPTED_XATTR,
++ ERROR_INVALID_NODE_REFERENCE,
+ ERROR_MAX,
+ };
+
+diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
+index 68adc8af29efb..9291c04a2e09d 100644
+--- a/include/linux/netdevice.h
++++ b/include/linux/netdevice.h
+@@ -4827,13 +4827,6 @@ int skb_crc32c_csum_help(struct sk_buff *skb);
+ int skb_csum_hwoffload_help(struct sk_buff *skb,
+ const netdev_features_t features);
+
+-struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
+- netdev_features_t features, bool tx_path);
+-struct sk_buff *skb_eth_gso_segment(struct sk_buff *skb,
+- netdev_features_t features, __be16 type);
+-struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
+- netdev_features_t features);
+-
+ struct netdev_bonding_info {
+ ifslave slave;
+ ifbond master;
+@@ -4856,11 +4849,6 @@ static inline void ethtool_notify(struct net_device *dev, unsigned int cmd,
+ }
+ #endif
+
+-static inline
+-struct sk_buff *skb_gso_segment(struct sk_buff *skb, netdev_features_t features)
+-{
+- return __skb_gso_segment(skb, features, true);
+-}
+ __be16 skb_network_protocol(struct sk_buff *skb, int *depth);
+
+ static inline bool can_checksum_protocol(netdev_features_t features,
+@@ -4987,6 +4975,7 @@ netdev_features_t passthru_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features);
+ netdev_features_t netif_skb_features(struct sk_buff *skb);
++void skb_warn_bad_offload(const struct sk_buff *skb);
+
+ static inline bool net_gso_ok(netdev_features_t features, int gso_type)
+ {
+@@ -5035,19 +5024,6 @@ void netif_set_tso_max_segs(struct net_device *dev, unsigned int segs);
+ void netif_inherit_tso_max(struct net_device *to,
+ const struct net_device *from);
+
+-static inline void skb_gso_error_unwind(struct sk_buff *skb, __be16 protocol,
+- int pulled_hlen, u16 mac_offset,
+- int mac_len)
+-{
+- skb->protocol = protocol;
+- skb->encapsulation = 1;
+- skb_push(skb, pulled_hlen);
+- skb_reset_transport_header(skb);
+- skb->mac_header = mac_offset;
+- skb->network_header = skb->mac_header + mac_len;
+- skb->mac_len = mac_len;
+-}
+-
+ static inline bool netif_is_macsec(const struct net_device *dev)
+ {
+ return dev->priv_flags & IFF_MACSEC;
+diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
+index 0b40417457cd1..fdd9db2612968 100644
+--- a/include/linux/skbuff.h
++++ b/include/linux/skbuff.h
+@@ -3992,8 +3992,6 @@ int skb_zerocopy(struct sk_buff *to, struct sk_buff *from,
+ void skb_split(struct sk_buff *skb, struct sk_buff *skb1, const u32 len);
+ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen);
+ void skb_scrub_packet(struct sk_buff *skb, bool xnet);
+-bool skb_gso_validate_network_len(const struct sk_buff *skb, unsigned int mtu);
+-bool skb_gso_validate_mac_len(const struct sk_buff *skb, unsigned int len);
+ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
+ struct sk_buff *skb_segment_list(struct sk_buff *skb, netdev_features_t features,
+ unsigned int offset);
+@@ -4859,75 +4857,6 @@ static inline struct sec_path *skb_sec_path(const struct sk_buff *skb)
+ #endif
+ }
+
+-/* Keeps track of mac header offset relative to skb->head.
+- * It is useful for TSO of Tunneling protocol. e.g. GRE.
+- * For non-tunnel skb it points to skb_mac_header() and for
+- * tunnel skb it points to outer mac header.
+- * Keeps track of level of encapsulation of network headers.
+- */
+-struct skb_gso_cb {
+- union {
+- int mac_offset;
+- int data_offset;
+- };
+- int encap_level;
+- __wsum csum;
+- __u16 csum_start;
+-};
+-#define SKB_GSO_CB_OFFSET 32
+-#define SKB_GSO_CB(skb) ((struct skb_gso_cb *)((skb)->cb + SKB_GSO_CB_OFFSET))
+-
+-static inline int skb_tnl_header_len(const struct sk_buff *inner_skb)
+-{
+- return (skb_mac_header(inner_skb) - inner_skb->head) -
+- SKB_GSO_CB(inner_skb)->mac_offset;
+-}
+-
+-static inline int gso_pskb_expand_head(struct sk_buff *skb, int extra)
+-{
+- int new_headroom, headroom;
+- int ret;
+-
+- headroom = skb_headroom(skb);
+- ret = pskb_expand_head(skb, extra, 0, GFP_ATOMIC);
+- if (ret)
+- return ret;
+-
+- new_headroom = skb_headroom(skb);
+- SKB_GSO_CB(skb)->mac_offset += (new_headroom - headroom);
+- return 0;
+-}
+-
+-static inline void gso_reset_checksum(struct sk_buff *skb, __wsum res)
+-{
+- /* Do not update partial checksums if remote checksum is enabled. */
+- if (skb->remcsum_offload)
+- return;
+-
+- SKB_GSO_CB(skb)->csum = res;
+- SKB_GSO_CB(skb)->csum_start = skb_checksum_start(skb) - skb->head;
+-}
+-
+-/* Compute the checksum for a gso segment. First compute the checksum value
+- * from the start of transport header to SKB_GSO_CB(skb)->csum_start, and
+- * then add in skb->csum (checksum from csum_start to end of packet).
+- * skb->csum and csum_start are then updated to reflect the checksum of the
+- * resultant packet starting from the transport header-- the resultant checksum
+- * is in the res argument (i.e. normally zero or ~ of checksum of a pseudo
+- * header.
+- */
+-static inline __sum16 gso_make_checksum(struct sk_buff *skb, __wsum res)
+-{
+- unsigned char *csum_start = skb_transport_header(skb);
+- int plen = (skb->head + SKB_GSO_CB(skb)->csum_start) - csum_start;
+- __wsum partial = SKB_GSO_CB(skb)->csum;
+-
+- SKB_GSO_CB(skb)->csum = res;
+- SKB_GSO_CB(skb)->csum_start = csum_start - skb->head;
+-
+- return csum_fold(csum_partial(csum_start, plen, partial));
+-}
+-
+ static inline bool skb_is_gso(const struct sk_buff *skb)
+ {
+ return skb_shinfo(skb)->gso_size;
+diff --git a/include/linux/spi/spi-mem.h b/include/linux/spi/spi-mem.h
+index 8e984d75f5b6c..6b0a7dc48a4b7 100644
+--- a/include/linux/spi/spi-mem.h
++++ b/include/linux/spi/spi-mem.h
+@@ -101,6 +101,7 @@ struct spi_mem_op {
+ u8 nbytes;
+ u8 buswidth;
+ u8 dtr : 1;
++ u8 __pad : 7;
+ u16 opcode;
+ } cmd;
+
+@@ -108,6 +109,7 @@ struct spi_mem_op {
+ u8 nbytes;
+ u8 buswidth;
+ u8 dtr : 1;
++ u8 __pad : 7;
+ u64 val;
+ } addr;
+
+@@ -115,12 +117,14 @@ struct spi_mem_op {
+ u8 nbytes;
+ u8 buswidth;
+ u8 dtr : 1;
++ u8 __pad : 7;
+ } dummy;
+
+ struct {
+ u8 buswidth;
+ u8 dtr : 1;
+ u8 ecc : 1;
++ u8 __pad : 6;
+ enum spi_mem_data_dir dir;
+ unsigned int nbytes;
+ union {
+diff --git a/include/net/gro.h b/include/net/gro.h
+index a4fab706240d2..d3d318e7d917b 100644
+--- a/include/net/gro.h
++++ b/include/net/gro.h
+@@ -446,5 +446,49 @@ static inline void gro_normal_one(struct napi_struct *napi, struct sk_buff *skb,
+ gro_normal_list(napi);
+ }
+
++/* This function is an alternative to the 'inet_iif' and 'inet_sdif'
++ * functions for the case where we cannot rely on the fields of IPCB.
++ *
++ * The caller must verify skb_valid_dst(skb) is false and skb->dev is initialized.
++ * The caller must hold the RCU read lock.
++ */
++static inline void inet_get_iif_sdif(const struct sk_buff *skb, int *iif, int *sdif)
++{
++ *iif = inet_iif(skb) ?: skb->dev->ifindex;
++ *sdif = 0;
++
++#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
++ if (netif_is_l3_slave(skb->dev)) {
++ struct net_device *master = netdev_master_upper_dev_get_rcu(skb->dev);
++
++ *sdif = *iif;
++ *iif = master ? master->ifindex : 0;
++ }
++#endif
++}
++
++/* This function is an alternative to the 'inet6_iif' and 'inet6_sdif'
++ * functions for the case where we cannot rely on the fields of IP6CB.
++ *
++ * The caller must verify skb_valid_dst(skb) is false and skb->dev is initialized.
++ * The caller must hold the RCU read lock.
++ */
++static inline void inet6_get_iif_sdif(const struct sk_buff *skb, int *iif, int *sdif)
++{
++ /* using skb->dev->ifindex because skb_dst(skb) is not initialized */
++ *iif = skb->dev->ifindex;
++ *sdif = 0;
++
++#if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
++ if (netif_is_l3_slave(skb->dev)) {
++ struct net_device *master = netdev_master_upper_dev_get_rcu(skb->dev);
++
++ *sdif = *iif;
++ *iif = master ? master->ifindex : 0;
++ }
++#endif
++}
++
++extern struct list_head offload_base;
+
+ #endif /* _NET_IPV6_GRO_H */
+diff --git a/include/net/gso.h b/include/net/gso.h
+new file mode 100644
+index 0000000000000..29975440cad51
+--- /dev/null
++++ b/include/net/gso.h
+@@ -0,0 +1,109 @@
++/* SPDX-License-Identifier: GPL-2.0-or-later */
++
++#ifndef _NET_GSO_H
++#define _NET_GSO_H
++
++#include <linux/skbuff.h>
++
++/* Keeps track of mac header offset relative to skb->head.
++ * It is useful for TSO of Tunneling protocol. e.g. GRE.
++ * For non-tunnel skb it points to skb_mac_header() and for
++ * tunnel skb it points to outer mac header.
++ * Keeps track of level of encapsulation of network headers.
++ */
++struct skb_gso_cb {
++ union {
++ int mac_offset;
++ int data_offset;
++ };
++ int encap_level;
++ __wsum csum;
++ __u16 csum_start;
++};
++#define SKB_GSO_CB_OFFSET 32
++#define SKB_GSO_CB(skb) ((struct skb_gso_cb *)((skb)->cb + SKB_GSO_CB_OFFSET))
++
++static inline int skb_tnl_header_len(const struct sk_buff *inner_skb)
++{
++ return (skb_mac_header(inner_skb) - inner_skb->head) -
++ SKB_GSO_CB(inner_skb)->mac_offset;
++}
++
++static inline int gso_pskb_expand_head(struct sk_buff *skb, int extra)
++{
++ int new_headroom, headroom;
++ int ret;
++
++ headroom = skb_headroom(skb);
++ ret = pskb_expand_head(skb, extra, 0, GFP_ATOMIC);
++ if (ret)
++ return ret;
++
++ new_headroom = skb_headroom(skb);
++ SKB_GSO_CB(skb)->mac_offset += (new_headroom - headroom);
++ return 0;
++}
++
++static inline void gso_reset_checksum(struct sk_buff *skb, __wsum res)
++{
++ /* Do not update partial checksums if remote checksum is enabled. */
++ if (skb->remcsum_offload)
++ return;
++
++ SKB_GSO_CB(skb)->csum = res;
++ SKB_GSO_CB(skb)->csum_start = skb_checksum_start(skb) - skb->head;
++}
++
++/* Compute the checksum for a gso segment. First compute the checksum value
++ * from the start of transport header to SKB_GSO_CB(skb)->csum_start, and
++ * then add in skb->csum (checksum from csum_start to end of packet).
++ * skb->csum and csum_start are then updated to reflect the checksum of the
++ * resultant packet starting from the transport header-- the resultant checksum
++ * is in the res argument (i.e. normally zero or ~ of checksum of a pseudo
++ * header).
++ */
++static inline __sum16 gso_make_checksum(struct sk_buff *skb, __wsum res)
++{
++ unsigned char *csum_start = skb_transport_header(skb);
++ int plen = (skb->head + SKB_GSO_CB(skb)->csum_start) - csum_start;
++ __wsum partial = SKB_GSO_CB(skb)->csum;
++
++ SKB_GSO_CB(skb)->csum = res;
++ SKB_GSO_CB(skb)->csum_start = csum_start - skb->head;
++
++ return csum_fold(csum_partial(csum_start, plen, partial));
++}
++
++struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
++ netdev_features_t features, bool tx_path);
++
++static inline struct sk_buff *skb_gso_segment(struct sk_buff *skb,
++ netdev_features_t features)
++{
++ return __skb_gso_segment(skb, features, true);
++}
++
++struct sk_buff *skb_eth_gso_segment(struct sk_buff *skb,
++ netdev_features_t features, __be16 type);
++
++struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
++ netdev_features_t features);
++
++bool skb_gso_validate_network_len(const struct sk_buff *skb, unsigned int mtu);
++
++bool skb_gso_validate_mac_len(const struct sk_buff *skb, unsigned int len);
++
++static inline void skb_gso_error_unwind(struct sk_buff *skb, __be16 protocol,
++ int pulled_hlen, u16 mac_offset,
++ int mac_len)
++{
++ skb->protocol = protocol;
++ skb->encapsulation = 1;
++ skb_push(skb, pulled_hlen);
++ skb_reset_transport_header(skb);
++ skb->mac_header = mac_offset;
++ skb->network_header = skb->mac_header + mac_len;
++ skb->mac_len = mac_len;
++}
++
++#endif /* _NET_GSO_H */
+diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
+index caa20a9055310..0bb32bfc61832 100644
+--- a/include/net/inet_sock.h
++++ b/include/net/inet_sock.h
+@@ -107,11 +107,12 @@ static inline struct inet_request_sock *inet_rsk(const struct request_sock *sk)
+
+ static inline u32 inet_request_mark(const struct sock *sk, struct sk_buff *skb)
+ {
+- if (!sk->sk_mark &&
+- READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fwmark_accept))
++ u32 mark = READ_ONCE(sk->sk_mark);
++
++ if (!mark && READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_fwmark_accept))
+ return skb->mark;
+
+- return sk->sk_mark;
++ return mark;
+ }
+
+ static inline int inet_request_bound_dev_if(const struct sock *sk,
+diff --git a/include/net/ip.h b/include/net/ip.h
+index 83a1a9bc3ceb1..530e7257e4389 100644
+--- a/include/net/ip.h
++++ b/include/net/ip.h
+@@ -93,7 +93,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
+ {
+ ipcm_init(ipcm);
+
+- ipcm->sockc.mark = inet->sk.sk_mark;
++ ipcm->sockc.mark = READ_ONCE(inet->sk.sk_mark);
+ ipcm->sockc.tsflags = inet->sk.sk_tsflags;
+ ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
+ ipcm->addr = inet->inet_saddr;
+diff --git a/include/net/route.h b/include/net/route.h
+index bcc367cf3aa2d..9ca0f72868b76 100644
+--- a/include/net/route.h
++++ b/include/net/route.h
+@@ -168,7 +168,7 @@ static inline struct rtable *ip_route_output_ports(struct net *net, struct flowi
+ __be16 dport, __be16 sport,
+ __u8 proto, __u8 tos, int oif)
+ {
+- flowi4_init_output(fl4, oif, sk ? sk->sk_mark : 0, tos,
++ flowi4_init_output(fl4, oif, sk ? READ_ONCE(sk->sk_mark) : 0, tos,
+ RT_SCOPE_UNIVERSE, proto,
+ sk ? inet_sk_flowi_flags(sk) : 0,
+ daddr, saddr, dport, sport, sock_net_uid(net, sk));
+@@ -301,7 +301,7 @@ static inline void ip_route_connect_init(struct flowi4 *fl4, __be32 dst,
+ if (inet_sk(sk)->transparent)
+ flow_flags |= FLOWI_FLAG_ANYSRC;
+
+- flowi4_init_output(fl4, oif, sk->sk_mark, ip_sock_rt_tos(sk),
++ flowi4_init_output(fl4, oif, READ_ONCE(sk->sk_mark), ip_sock_rt_tos(sk),
+ ip_sock_rt_scope(sk), protocol, flow_flags, dst,
+ src, dport, sport, sk->sk_uid);
+ }
+diff --git a/include/net/udp.h b/include/net/udp.h
+index de4b528522bb9..94f3486c43e33 100644
+--- a/include/net/udp.h
++++ b/include/net/udp.h
+@@ -21,6 +21,7 @@
+ #include <linux/list.h>
+ #include <linux/bug.h>
+ #include <net/inet_sock.h>
++#include <net/gso.h>
+ #include <net/sock.h>
+ #include <net/snmp.h>
+ #include <net/ip.h>
+diff --git a/include/net/vxlan.h b/include/net/vxlan.h
+index b57567296bc67..fae2893613aa2 100644
+--- a/include/net/vxlan.h
++++ b/include/net/vxlan.h
+@@ -554,12 +554,12 @@ static inline void vxlan_flag_attr_error(int attrtype,
+ }
+
+ static inline bool vxlan_fdb_nh_path_select(struct nexthop *nh,
+- int hash,
++ u32 hash,
+ struct vxlan_rdst *rdst)
+ {
+ struct fib_nh_common *nhc;
+
+- nhc = nexthop_path_fdb_result(nh, hash);
++ nhc = nexthop_path_fdb_result(nh, hash >> 1);
+ if (unlikely(!nhc))
+ return false;
+
+diff --git a/io_uring/timeout.c b/io_uring/timeout.c
+index fc950177e2e1d..350eb830b4855 100644
+--- a/io_uring/timeout.c
++++ b/io_uring/timeout.c
+@@ -594,7 +594,7 @@ int io_timeout(struct io_kiocb *req, unsigned int issue_flags)
+ goto add;
+ }
+
+- tail = ctx->cached_cq_tail - atomic_read(&ctx->cq_timeouts);
++ tail = data_race(ctx->cached_cq_tail) - atomic_read(&ctx->cq_timeouts);
+ timeout->target_seq = tail + off;
+
+ /* Update the last seq here in case io_flush_timeouts() hasn't.
+diff --git a/kernel/bpf/bloom_filter.c b/kernel/bpf/bloom_filter.c
+index 540331b610a97..addf3dd57b59b 100644
+--- a/kernel/bpf/bloom_filter.c
++++ b/kernel/bpf/bloom_filter.c
+@@ -86,9 +86,6 @@ static struct bpf_map *bloom_map_alloc(union bpf_attr *attr)
+ int numa_node = bpf_map_attr_numa_node(attr);
+ struct bpf_bloom_filter *bloom;
+
+- if (!bpf_capable())
+- return ERR_PTR(-EPERM);
+-
+ if (attr->key_size != 0 || attr->value_size == 0 ||
+ attr->max_entries == 0 ||
+ attr->map_flags & ~BLOOM_CREATE_FLAG_MASK ||
+diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
+index 47d9948d768f0..b5149cfce7d4d 100644
+--- a/kernel/bpf/bpf_local_storage.c
++++ b/kernel/bpf/bpf_local_storage.c
+@@ -723,9 +723,6 @@ int bpf_local_storage_map_alloc_check(union bpf_attr *attr)
+ !attr->btf_key_type_id || !attr->btf_value_type_id)
+ return -EINVAL;
+
+- if (!bpf_capable())
+- return -EPERM;
+-
+ if (attr->value_size > BPF_LOCAL_STORAGE_MAX_VALUE_SIZE)
+ return -E2BIG;
+
+diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
+index d3f0a4825fa61..116a0ce378ecd 100644
+--- a/kernel/bpf/bpf_struct_ops.c
++++ b/kernel/bpf/bpf_struct_ops.c
+@@ -655,9 +655,6 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
+ const struct btf_type *t, *vt;
+ struct bpf_map *map;
+
+- if (!bpf_capable())
+- return ERR_PTR(-EPERM);
+-
+ st_ops = bpf_struct_ops_find_value(attr->btf_vmlinux_value_type_id);
+ if (!st_ops)
+ return ERR_PTR(-ENOTSUPP);
+diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
+index 3da63be602d1c..286ab3db0fde8 100644
+--- a/kernel/bpf/cpumap.c
++++ b/kernel/bpf/cpumap.c
+@@ -28,7 +28,7 @@
+ #include <linux/sched.h>
+ #include <linux/workqueue.h>
+ #include <linux/kthread.h>
+-#include <linux/capability.h>
++#include <linux/completion.h>
+ #include <trace/events/xdp.h>
+ #include <linux/btf_ids.h>
+
+@@ -74,6 +74,7 @@ struct bpf_cpu_map_entry {
+ struct rcu_head rcu;
+
+ struct work_struct kthread_stop_wq;
++ struct completion kthread_running;
+ };
+
+ struct bpf_cpu_map {
+@@ -89,9 +90,6 @@ static struct bpf_map *cpu_map_alloc(union bpf_attr *attr)
+ u32 value_size = attr->value_size;
+ struct bpf_cpu_map *cmap;
+
+- if (!bpf_capable())
+- return ERR_PTR(-EPERM);
+-
+ /* check sanity of attributes */
+ if (attr->max_entries == 0 || attr->key_size != 4 ||
+ (value_size != offsetofend(struct bpf_cpumap_val, qsize) &&
+@@ -133,11 +131,17 @@ static void __cpu_map_ring_cleanup(struct ptr_ring *ring)
+ * invoked cpu_map_kthread_stop(). Catch any broken behaviour
+ * gracefully and warn once.
+ */
+- struct xdp_frame *xdpf;
++ void *ptr;
+
+- while ((xdpf = ptr_ring_consume(ring)))
+- if (WARN_ON_ONCE(xdpf))
+- xdp_return_frame(xdpf);
++ while ((ptr = ptr_ring_consume(ring))) {
++ WARN_ON_ONCE(1);
++ if (unlikely(__ptr_test_bit(0, &ptr))) {
++ __ptr_clear_bit(0, &ptr);
++ kfree_skb(ptr);
++ continue;
++ }
++ xdp_return_frame(ptr);
++ }
+ }
+
+ static void put_cpu_map_entry(struct bpf_cpu_map_entry *rcpu)
+@@ -157,7 +161,6 @@ static void put_cpu_map_entry(struct bpf_cpu_map_entry *rcpu)
+ static void cpu_map_kthread_stop(struct work_struct *work)
+ {
+ struct bpf_cpu_map_entry *rcpu;
+- int err;
+
+ rcpu = container_of(work, struct bpf_cpu_map_entry, kthread_stop_wq);
+
+@@ -167,14 +170,7 @@ static void cpu_map_kthread_stop(struct work_struct *work)
+ rcu_barrier();
+
+ /* kthread_stop will wake_up_process and wait for it to complete */
+- err = kthread_stop(rcpu->kthread);
+- if (err) {
+- /* kthread_stop may be called before cpu_map_kthread_run
+- * is executed, so we need to release the memory related
+- * to rcpu.
+- */
+- put_cpu_map_entry(rcpu);
+- }
++ kthread_stop(rcpu->kthread);
+ }
+
+ static void cpu_map_bpf_prog_run_skb(struct bpf_cpu_map_entry *rcpu,
+@@ -302,11 +298,11 @@ static int cpu_map_bpf_prog_run(struct bpf_cpu_map_entry *rcpu, void **frames,
+ return nframes;
+ }
+
+-
+ static int cpu_map_kthread_run(void *data)
+ {
+ struct bpf_cpu_map_entry *rcpu = data;
+
++ complete(&rcpu->kthread_running);
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ /* When kthread gives stop order, then rcpu have been disconnected
+@@ -471,6 +467,7 @@ __cpu_map_entry_alloc(struct bpf_map *map, struct bpf_cpumap_val *value,
+ goto free_ptr_ring;
+
+ /* Setup kthread */
++ init_completion(&rcpu->kthread_running);
+ rcpu->kthread = kthread_create_on_node(cpu_map_kthread_run, rcpu, numa,
+ "cpumap/%d/map:%d", cpu,
+ map->id);
+@@ -484,6 +481,12 @@ __cpu_map_entry_alloc(struct bpf_map *map, struct bpf_cpumap_val *value,
+ kthread_bind(rcpu->kthread, cpu);
+ wake_up_process(rcpu->kthread);
+
++ /* Make sure kthread has been running, so kthread_stop() will not
++ * stop the kthread prematurely and all pending frames or skbs
++ * will be handled by the kthread before kthread_stop() returns.
++ */
++ wait_for_completion(&rcpu->kthread_running);
++
+ return rcpu;
+
+ free_prog:
+diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
+index 802692fa3905c..49cc0b5671c61 100644
+--- a/kernel/bpf/devmap.c
++++ b/kernel/bpf/devmap.c
+@@ -160,9 +160,6 @@ static struct bpf_map *dev_map_alloc(union bpf_attr *attr)
+ struct bpf_dtab *dtab;
+ int err;
+
+- if (!capable(CAP_NET_ADMIN))
+- return ERR_PTR(-EPERM);
+-
+ dtab = bpf_map_area_alloc(sizeof(*dtab), NUMA_NO_NODE);
+ if (!dtab)
+ return ERR_PTR(-ENOMEM);
+diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
+index 9901efee4339d..56d3da7d0bc66 100644
+--- a/kernel/bpf/hashtab.c
++++ b/kernel/bpf/hashtab.c
+@@ -422,12 +422,6 @@ static int htab_map_alloc_check(union bpf_attr *attr)
+ BUILD_BUG_ON(offsetof(struct htab_elem, fnode.next) !=
+ offsetof(struct htab_elem, hash_node.pprev));
+
+- if (lru && !bpf_capable())
+- /* LRU implementation is much complicated than other
+- * maps. Hence, limit to CAP_BPF.
+- */
+- return -EPERM;
+-
+ if (zero_seed && !capable(CAP_SYS_ADMIN))
+ /* Guard against local DoS, and discourage production use. */
+ return -EPERM;
+diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
+index e0d3ddf2037ab..17c7e7782a1f7 100644
+--- a/kernel/bpf/lpm_trie.c
++++ b/kernel/bpf/lpm_trie.c
+@@ -544,9 +544,6 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
+ {
+ struct lpm_trie *trie;
+
+- if (!bpf_capable())
+- return ERR_PTR(-EPERM);
+-
+ /* check sanity of attributes */
+ if (attr->max_entries == 0 ||
+ !(attr->map_flags & BPF_F_NO_PREALLOC) ||
+diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c
+index 601609164ef34..8d2ddcb7566b7 100644
+--- a/kernel/bpf/queue_stack_maps.c
++++ b/kernel/bpf/queue_stack_maps.c
+@@ -7,7 +7,6 @@
+ #include <linux/bpf.h>
+ #include <linux/list.h>
+ #include <linux/slab.h>
+-#include <linux/capability.h>
+ #include <linux/btf_ids.h>
+ #include "percpu_freelist.h"
+
+@@ -46,9 +45,6 @@ static bool queue_stack_map_is_full(struct bpf_queue_stack *qs)
+ /* Called from syscall */
+ static int queue_stack_map_alloc_check(union bpf_attr *attr)
+ {
+- if (!bpf_capable())
+- return -EPERM;
+-
+ /* check sanity of attributes */
+ if (attr->max_entries == 0 || attr->key_size != 0 ||
+ attr->value_size == 0 ||
+diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c
+index cbf2d8d784b89..4b4f9670f1a9a 100644
+--- a/kernel/bpf/reuseport_array.c
++++ b/kernel/bpf/reuseport_array.c
+@@ -151,9 +151,6 @@ static struct bpf_map *reuseport_array_alloc(union bpf_attr *attr)
+ int numa_node = bpf_map_attr_numa_node(attr);
+ struct reuseport_array *array;
+
+- if (!bpf_capable())
+- return ERR_PTR(-EPERM);
+-
+ /* allocate all map elements and zero-initialize them */
+ array = bpf_map_area_alloc(struct_size(array, ptrs, attr->max_entries), numa_node);
+ if (!array)
+diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
+index b25fce425b2c6..458bb80b14d57 100644
+--- a/kernel/bpf/stackmap.c
++++ b/kernel/bpf/stackmap.c
+@@ -74,9 +74,6 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
+ u64 cost, n_buckets;
+ int err;
+
+- if (!bpf_capable())
+- return ERR_PTR(-EPERM);
+-
+ if (attr->map_flags & ~STACK_CREATE_FLAG_MASK)
+ return ERR_PTR(-EINVAL);
+
+diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
+index 5524fcf6fb2a4..f715ec5d541ad 100644
+--- a/kernel/bpf/syscall.c
++++ b/kernel/bpf/syscall.c
+@@ -109,37 +109,6 @@ const struct bpf_map_ops bpf_map_offload_ops = {
+ .map_mem_usage = bpf_map_offload_map_mem_usage,
+ };
+
+-static struct bpf_map *find_and_alloc_map(union bpf_attr *attr)
+-{
+- const struct bpf_map_ops *ops;
+- u32 type = attr->map_type;
+- struct bpf_map *map;
+- int err;
+-
+- if (type >= ARRAY_SIZE(bpf_map_types))
+- return ERR_PTR(-EINVAL);
+- type = array_index_nospec(type, ARRAY_SIZE(bpf_map_types));
+- ops = bpf_map_types[type];
+- if (!ops)
+- return ERR_PTR(-EINVAL);
+-
+- if (ops->map_alloc_check) {
+- err = ops->map_alloc_check(attr);
+- if (err)
+- return ERR_PTR(err);
+- }
+- if (attr->map_ifindex)
+- ops = &bpf_map_offload_ops;
+- if (!ops->map_mem_usage)
+- return ERR_PTR(-EINVAL);
+- map = ops->map_alloc(attr);
+- if (IS_ERR(map))
+- return map;
+- map->ops = ops;
+- map->map_type = type;
+- return map;
+-}
+-
+ static void bpf_map_write_active_inc(struct bpf_map *map)
+ {
+ atomic64_inc(&map->writecnt);
+@@ -1127,7 +1096,9 @@ free_map_tab:
+ /* called via syscall */
+ static int map_create(union bpf_attr *attr)
+ {
++ const struct bpf_map_ops *ops;
+ int numa_node = bpf_map_attr_numa_node(attr);
++ u32 map_type = attr->map_type;
+ struct bpf_map *map;
+ int f_flags;
+ int err;
+@@ -1158,9 +1129,85 @@ static int map_create(union bpf_attr *attr)
+ return -EINVAL;
+
+ /* find map type and init map: hashtable vs rbtree vs bloom vs ... */
+- map = find_and_alloc_map(attr);
++ map_type = attr->map_type;
++ if (map_type >= ARRAY_SIZE(bpf_map_types))
++ return -EINVAL;
++ map_type = array_index_nospec(map_type, ARRAY_SIZE(bpf_map_types));
++ ops = bpf_map_types[map_type];
++ if (!ops)
++ return -EINVAL;
++
++ if (ops->map_alloc_check) {
++ err = ops->map_alloc_check(attr);
++ if (err)
++ return err;
++ }
++ if (attr->map_ifindex)
++ ops = &bpf_map_offload_ops;
++ if (!ops->map_mem_usage)
++ return -EINVAL;
++
++ /* Intent here is for unprivileged_bpf_disabled to block BPF map
++ * creation for unprivileged users; other actions depend
++ * on fd availability and access to bpffs, so are dependent on
++ * object creation success. Even with unprivileged BPF disabled,
++ * capability checks are still carried out.
++ */
++ if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
++ return -EPERM;
++
++ /* check privileged map type permissions */
++ switch (map_type) {
++ case BPF_MAP_TYPE_ARRAY:
++ case BPF_MAP_TYPE_PERCPU_ARRAY:
++ case BPF_MAP_TYPE_PROG_ARRAY:
++ case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
++ case BPF_MAP_TYPE_CGROUP_ARRAY:
++ case BPF_MAP_TYPE_ARRAY_OF_MAPS:
++ case BPF_MAP_TYPE_HASH:
++ case BPF_MAP_TYPE_PERCPU_HASH:
++ case BPF_MAP_TYPE_HASH_OF_MAPS:
++ case BPF_MAP_TYPE_RINGBUF:
++ case BPF_MAP_TYPE_USER_RINGBUF:
++ case BPF_MAP_TYPE_CGROUP_STORAGE:
++ case BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE:
++ /* unprivileged */
++ break;
++ case BPF_MAP_TYPE_SK_STORAGE:
++ case BPF_MAP_TYPE_INODE_STORAGE:
++ case BPF_MAP_TYPE_TASK_STORAGE:
++ case BPF_MAP_TYPE_CGRP_STORAGE:
++ case BPF_MAP_TYPE_BLOOM_FILTER:
++ case BPF_MAP_TYPE_LPM_TRIE:
++ case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
++ case BPF_MAP_TYPE_STACK_TRACE:
++ case BPF_MAP_TYPE_QUEUE:
++ case BPF_MAP_TYPE_STACK:
++ case BPF_MAP_TYPE_LRU_HASH:
++ case BPF_MAP_TYPE_LRU_PERCPU_HASH:
++ case BPF_MAP_TYPE_STRUCT_OPS:
++ case BPF_MAP_TYPE_CPUMAP:
++ if (!bpf_capable())
++ return -EPERM;
++ break;
++ case BPF_MAP_TYPE_SOCKMAP:
++ case BPF_MAP_TYPE_SOCKHASH:
++ case BPF_MAP_TYPE_DEVMAP:
++ case BPF_MAP_TYPE_DEVMAP_HASH:
++ case BPF_MAP_TYPE_XSKMAP:
++ if (!capable(CAP_NET_ADMIN))
++ return -EPERM;
++ break;
++ default:
++ WARN(1, "unsupported map type %d", map_type);
++ return -EPERM;
++ }
++
++ map = ops->map_alloc(attr);
+ if (IS_ERR(map))
+ return PTR_ERR(map);
++ map->ops = ops;
++ map->map_type = map_type;
+
+ err = bpf_obj_name_cpy(map->name, attr->map_name,
+ sizeof(attr->map_name));
+@@ -2535,6 +2582,16 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
+ /* eBPF programs must be GPL compatible to use GPL-ed functions */
+ is_gpl = license_is_gpl_compatible(license);
+
++ /* Intent here is for unprivileged_bpf_disabled to block BPF program
++ * creation for unprivileged users; other actions depend
++ * on fd availability and access to bpffs, so are dependent on
++ * object creation success. Even with unprivileged BPF disabled,
++ * capability checks are still carried out for these
++ * and other operations.
++ */
++ if (sysctl_unprivileged_bpf_disabled && !bpf_capable())
++ return -EPERM;
++
+ if (attr->insn_cnt == 0 ||
+ attr->insn_cnt > (bpf_capable() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS))
+ return -E2BIG;
+@@ -5018,23 +5075,8 @@ out_prog_put:
+ static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
+ {
+ union bpf_attr attr;
+- bool capable;
+ int err;
+
+- capable = bpf_capable() || !sysctl_unprivileged_bpf_disabled;
+-
+- /* Intent here is for unprivileged_bpf_disabled to block key object
+- * creation commands for unprivileged users; other actions depend
+- * of fd availability and access to bpffs, so are dependent on
+- * object creation success. Capabilities are later verified for
+- * operations such as load and map create, so even with unprivileged
+- * BPF disabled, capability checks are still carried out for these
+- * and other operations.
+- */
+- if (!capable &&
+- (cmd == BPF_MAP_CREATE || cmd == BPF_PROG_LOAD))
+- return -EPERM;
+-
+ err = bpf_check_uarg_tail_zero(uattr, sizeof(attr), size);
+ if (err)
+ return err;
+diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
+index 1f4b07da327a6..a53524f3f7d82 100644
+--- a/kernel/trace/bpf_trace.c
++++ b/kernel/trace/bpf_trace.c
+@@ -661,8 +661,7 @@ static DEFINE_PER_CPU(int, bpf_trace_nest_level);
+ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
+ u64, flags, void *, data, u64, size)
+ {
+- struct bpf_trace_sample_data *sds = this_cpu_ptr(&bpf_trace_sds);
+- int nest_level = this_cpu_inc_return(bpf_trace_nest_level);
++ struct bpf_trace_sample_data *sds;
+ struct perf_raw_record raw = {
+ .frag = {
+ .size = size,
+@@ -670,7 +669,11 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
+ },
+ };
+ struct perf_sample_data *sd;
+- int err;
++ int nest_level, err;
++
++ preempt_disable();
++ sds = this_cpu_ptr(&bpf_trace_sds);
++ nest_level = this_cpu_inc_return(bpf_trace_nest_level);
+
+ if (WARN_ON_ONCE(nest_level > ARRAY_SIZE(sds->sds))) {
+ err = -EBUSY;
+@@ -688,9 +691,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,
+ perf_sample_save_raw_data(sd, &raw);
+
+ err = __bpf_perf_event_output(regs, map, flags, sd);
+-
+ out:
+ this_cpu_dec(bpf_trace_nest_level);
++ preempt_enable();
+ return err;
+ }
+
+@@ -715,7 +718,6 @@ static DEFINE_PER_CPU(struct bpf_trace_sample_data, bpf_misc_sds);
+ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
+ void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy)
+ {
+- int nest_level = this_cpu_inc_return(bpf_event_output_nest_level);
+ struct perf_raw_frag frag = {
+ .copy = ctx_copy,
+ .size = ctx_size,
+@@ -732,8 +734,12 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
+ };
+ struct perf_sample_data *sd;
+ struct pt_regs *regs;
++ int nest_level;
+ u64 ret;
+
++ preempt_disable();
++ nest_level = this_cpu_inc_return(bpf_event_output_nest_level);
++
+ if (WARN_ON_ONCE(nest_level > ARRAY_SIZE(bpf_misc_sds.sds))) {
+ ret = -EBUSY;
+ goto out;
+@@ -748,6 +754,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
+ ret = __bpf_perf_event_output(regs, map, flags, sd);
+ out:
+ this_cpu_dec(bpf_event_output_nest_level);
++ preempt_enable();
+ return ret;
+ }
+
+diff --git a/lib/Makefile b/lib/Makefile
+index 876fcdeae34ec..05d8ec332baac 100644
+--- a/lib/Makefile
++++ b/lib/Makefile
+@@ -82,7 +82,13 @@ obj-$(CONFIG_TEST_STATIC_KEYS) += test_static_key_base.o
+ obj-$(CONFIG_TEST_DYNAMIC_DEBUG) += test_dynamic_debug.o
+ obj-$(CONFIG_TEST_PRINTF) += test_printf.o
+ obj-$(CONFIG_TEST_SCANF) += test_scanf.o
++
+ obj-$(CONFIG_TEST_BITMAP) += test_bitmap.o
++ifeq ($(CONFIG_CC_IS_CLANG)$(CONFIG_KASAN),yy)
++# FIXME: Clang breaks test_bitmap_const_eval when KASAN and GCOV are enabled
++GCOV_PROFILE_test_bitmap.o := n
++endif
++
+ obj-$(CONFIG_TEST_UUID) += test_uuid.o
+ obj-$(CONFIG_TEST_XARRAY) += test_xarray.o
+ obj-$(CONFIG_TEST_MAPLE_TREE) += test_maple_tree.o
+diff --git a/lib/debugobjects.c b/lib/debugobjects.c
+index 984985c39c9b0..a517256a270b7 100644
+--- a/lib/debugobjects.c
++++ b/lib/debugobjects.c
+@@ -498,6 +498,15 @@ static void debug_print_object(struct debug_obj *obj, char *msg)
+ const struct debug_obj_descr *descr = obj->descr;
+ static int limit;
+
++ /*
++ * Don't report if lookup_object_or_alloc() by the current thread
++ * failed because lookup_object_or_alloc()/debug_objects_oom() by a
++ * concurrent thread turned off debug_objects_enabled and cleared
++ * the hash buckets.
++ */
++ if (!debug_objects_enabled)
++ return;
++
+ if (limit < 5 && descr != descr_test) {
+ void *hint = descr->debug_hint ?
+ descr->debug_hint(obj->object) : NULL;
+diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
+index a8005ad3bd589..37a9108c4f588 100644
+--- a/lib/test_bitmap.c
++++ b/lib/test_bitmap.c
+@@ -1149,6 +1149,10 @@ static void __init test_bitmap_print_buf(void)
+ }
+ }
+
++/*
++ * FIXME: Clang breaks compile-time evaluations when KASAN and GCOV are enabled.
++ * To work around it, GCOV is force-disabled in Makefile for this configuration.
++ */
+ static void __init test_bitmap_const_eval(void)
+ {
+ DECLARE_BITMAP(bitmap, BITS_PER_LONG);
+@@ -1174,11 +1178,7 @@ static void __init test_bitmap_const_eval(void)
+ * the compiler is fixed.
+ */
+ bitmap_clear(bitmap, 0, BITS_PER_LONG);
+-#if defined(__s390__) && defined(__clang__)
+- if (!const_test_bit(7, bitmap))
+-#else
+ if (!test_bit(7, bitmap))
+-#endif
+ bitmap_set(bitmap, 5, 2);
+
+ /* Equals to `unsigned long bitopvar = BIT(20)` */
+diff --git a/mm/filemap.c b/mm/filemap.c
+index 8abce63b259c9..a2006936a6ae2 100644
+--- a/mm/filemap.c
++++ b/mm/filemap.c
+@@ -1760,9 +1760,7 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
+ *
+ * Return: The index of the gap if found, otherwise an index outside the
+ * range specified (in which case 'return - index >= max_scan' will be true).
+- * In the rare case of index wrap-around, 0 will be returned. 0 will also
+- * be returned if index == 0 and there is a gap at the index. We can not
+- * wrap-around if passed index == 0.
++ * In the rare case of index wrap-around, 0 will be returned.
+ */
+ pgoff_t page_cache_next_miss(struct address_space *mapping,
+ pgoff_t index, unsigned long max_scan)
+@@ -1772,13 +1770,12 @@ pgoff_t page_cache_next_miss(struct address_space *mapping,
+ while (max_scan--) {
+ void *entry = xas_next(&xas);
+ if (!entry || xa_is_value(entry))
+- return xas.xa_index;
+- if (xas.xa_index == 0 && index != 0)
+- return xas.xa_index;
++ break;
++ if (xas.xa_index == 0)
++ break;
+ }
+
+- /* No gaps in range and no wrap-around, return index beyond range */
+- return xas.xa_index + 1;
++ return xas.xa_index;
+ }
+ EXPORT_SYMBOL(page_cache_next_miss);
+
+@@ -1799,9 +1796,7 @@ EXPORT_SYMBOL(page_cache_next_miss);
+ *
+ * Return: The index of the gap if found, otherwise an index outside the
+ * range specified (in which case 'index - return >= max_scan' will be true).
+- * In the rare case of wrap-around, ULONG_MAX will be returned. ULONG_MAX
+- * will also be returned if index == ULONG_MAX and there is a gap at the
+- * index. We can not wrap-around if passed index == ULONG_MAX.
++ * In the rare case of wrap-around, ULONG_MAX will be returned.
+ */
+ pgoff_t page_cache_prev_miss(struct address_space *mapping,
+ pgoff_t index, unsigned long max_scan)
+@@ -1811,13 +1806,12 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping,
+ while (max_scan--) {
+ void *entry = xas_prev(&xas);
+ if (!entry || xa_is_value(entry))
+- return xas.xa_index;
+- if (xas.xa_index == ULONG_MAX && index != ULONG_MAX)
+- return xas.xa_index;
++ break;
++ if (xas.xa_index == ULONG_MAX)
++ break;
+ }
+
+- /* No gaps in range and no wrap-around, return index beyond range */
+- return xas.xa_index - 1;
++ return xas.xa_index;
+ }
+ EXPORT_SYMBOL(page_cache_prev_miss);
+
+diff --git a/mm/gup.c b/mm/gup.c
+index 94102390b273a..e3e6c473bbc16 100644
+--- a/mm/gup.c
++++ b/mm/gup.c
+@@ -2977,7 +2977,7 @@ static int internal_get_user_pages_fast(unsigned long start,
+ start = untagged_addr(start) & PAGE_MASK;
+ len = nr_pages << PAGE_SHIFT;
+ if (check_add_overflow(start, len, &end))
+- return 0;
++ return -EOVERFLOW;
+ if (end > TASK_SIZE_MAX)
+ return -EFAULT;
+ if (unlikely(!access_ok((void __user *)start, len)))
+diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
+index f9cb5af9894c6..4d837ab83f083 100644
+--- a/mm/kasan/generic.c
++++ b/mm/kasan/generic.c
+@@ -489,7 +489,7 @@ static void __kasan_record_aux_stack(void *addr, bool can_alloc)
+ return;
+
+ alloc_meta->aux_stack[1] = alloc_meta->aux_stack[0];
+- alloc_meta->aux_stack[0] = kasan_save_stack(GFP_NOWAIT, can_alloc);
++ alloc_meta->aux_stack[0] = kasan_save_stack(0, can_alloc);
+ }
+
+ void kasan_record_aux_stack(void *addr)
+@@ -519,7 +519,7 @@ void kasan_save_free_info(struct kmem_cache *cache, void *object)
+ if (!free_meta)
+ return;
+
+- kasan_set_track(&free_meta->free_track, GFP_NOWAIT);
++ kasan_set_track(&free_meta->free_track, 0);
+ /* The object was freed and has free track set. */
+ *(u8 *)kasan_mem_to_shadow(object) = KASAN_SLAB_FREETRACK;
+ }
+diff --git a/mm/kasan/tags.c b/mm/kasan/tags.c
+index 67a222586846e..7dcfe341d48e3 100644
+--- a/mm/kasan/tags.c
++++ b/mm/kasan/tags.c
+@@ -140,5 +140,5 @@ void kasan_save_alloc_info(struct kmem_cache *cache, void *object, gfp_t flags)
+
+ void kasan_save_free_info(struct kmem_cache *cache, void *object)
+ {
+- save_stack_info(cache, object, GFP_NOWAIT, true);
++ save_stack_info(cache, object, 0, true);
+ }
+diff --git a/mm/kmsan/core.c b/mm/kmsan/core.c
+index 7d1e4aa30bae6..3adb4c1d3b193 100644
+--- a/mm/kmsan/core.c
++++ b/mm/kmsan/core.c
+@@ -74,7 +74,7 @@ depot_stack_handle_t kmsan_save_stack_with_flags(gfp_t flags,
+ nr_entries = stack_trace_save(entries, KMSAN_STACK_DEPTH, 0);
+
+ /* Don't sleep. */
+- flags &= ~__GFP_DIRECT_RECLAIM;
++ flags &= ~(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM);
+
+ handle = __stack_depot_save(entries, nr_entries, flags, true);
+ return stack_depot_set_extra_bits(handle, extra);
+@@ -245,7 +245,7 @@ depot_stack_handle_t kmsan_internal_chain_origin(depot_stack_handle_t id)
+ extra_bits = kmsan_extra_bits(depth, uaf);
+
+ entries[0] = KMSAN_CHAIN_MAGIC_ORIGIN;
+- entries[1] = kmsan_save_stack_with_flags(GFP_ATOMIC, 0);
++ entries[1] = kmsan_save_stack_with_flags(__GFP_HIGH, 0);
+ entries[2] = id;
+ /*
+ * @entries is a local var in non-instrumented code, so KMSAN does not
+@@ -253,7 +253,7 @@ depot_stack_handle_t kmsan_internal_chain_origin(depot_stack_handle_t id)
+ * positives when __stack_depot_save() passes it to instrumented code.
+ */
+ kmsan_internal_unpoison_memory(entries, sizeof(entries), false);
+- handle = __stack_depot_save(entries, ARRAY_SIZE(entries), GFP_ATOMIC,
++ handle = __stack_depot_save(entries, ARRAY_SIZE(entries), __GFP_HIGH,
+ true);
+ return stack_depot_set_extra_bits(handle, extra_bits);
+ }
+diff --git a/mm/kmsan/instrumentation.c b/mm/kmsan/instrumentation.c
+index cf12e9616b243..cc3907a9c33a0 100644
+--- a/mm/kmsan/instrumentation.c
++++ b/mm/kmsan/instrumentation.c
+@@ -282,7 +282,7 @@ void __msan_poison_alloca(void *address, uintptr_t size, char *descr)
+
+ /* stack_depot_save() may allocate memory. */
+ kmsan_enter_runtime();
+- handle = stack_depot_save(entries, ARRAY_SIZE(entries), GFP_ATOMIC);
++ handle = stack_depot_save(entries, ARRAY_SIZE(entries), __GFP_HIGH);
+ kmsan_leave_runtime();
+
+ kmsan_internal_set_shadow_origin(address, size, -1, handle,
+diff --git a/mm/memcontrol.c b/mm/memcontrol.c
+index 4b27e245a055f..c823c35c2ed46 100644
+--- a/mm/memcontrol.c
++++ b/mm/memcontrol.c
+@@ -3208,12 +3208,12 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
+ * accumulating over a page of vmstat data or when pgdat or idx
+ * changes.
+ */
+- if (stock->cached_objcg != objcg) {
++ if (READ_ONCE(stock->cached_objcg) != objcg) {
+ old = drain_obj_stock(stock);
+ obj_cgroup_get(objcg);
+ stock->nr_bytes = atomic_read(&objcg->nr_charged_bytes)
+ ? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0;
+- stock->cached_objcg = objcg;
++ WRITE_ONCE(stock->cached_objcg, objcg);
+ stock->cached_pgdat = pgdat;
+ } else if (stock->cached_pgdat != pgdat) {
+ /* Flush the existing cached vmstat data */
+@@ -3267,7 +3267,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
+ local_lock_irqsave(&memcg_stock.stock_lock, flags);
+
+ stock = this_cpu_ptr(&memcg_stock);
+- if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
++ if (objcg == READ_ONCE(stock->cached_objcg) && stock->nr_bytes >= nr_bytes) {
+ stock->nr_bytes -= nr_bytes;
+ ret = true;
+ }
+@@ -3279,7 +3279,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
+
+ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
+ {
+- struct obj_cgroup *old = stock->cached_objcg;
++ struct obj_cgroup *old = READ_ONCE(stock->cached_objcg);
+
+ if (!old)
+ return NULL;
+@@ -3332,7 +3332,7 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
+ stock->cached_pgdat = NULL;
+ }
+
+- stock->cached_objcg = NULL;
++ WRITE_ONCE(stock->cached_objcg, NULL);
+ /*
+ * The `old' objects needs to be released by the caller via
+ * obj_cgroup_put() outside of memcg_stock_pcp::stock_lock.
+@@ -3343,10 +3343,11 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
+ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
+ struct mem_cgroup *root_memcg)
+ {
++ struct obj_cgroup *objcg = READ_ONCE(stock->cached_objcg);
+ struct mem_cgroup *memcg;
+
+- if (stock->cached_objcg) {
+- memcg = obj_cgroup_memcg(stock->cached_objcg);
++ if (objcg) {
++ memcg = obj_cgroup_memcg(objcg);
+ if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
+ return true;
+ }
+@@ -3365,10 +3366,10 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
+ local_lock_irqsave(&memcg_stock.stock_lock, flags);
+
+ stock = this_cpu_ptr(&memcg_stock);
+- if (stock->cached_objcg != objcg) { /* reset if necessary */
++ if (READ_ONCE(stock->cached_objcg) != objcg) { /* reset if necessary */
+ old = drain_obj_stock(stock);
+ obj_cgroup_get(objcg);
+- stock->cached_objcg = objcg;
++ WRITE_ONCE(stock->cached_objcg, objcg);
+ stock->nr_bytes = atomic_read(&objcg->nr_charged_bytes)
+ ? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0;
+ allow_uncharge = true; /* Allow uncharge when objcg changes */
+diff --git a/mm/memory.c b/mm/memory.c
+index 07bab1e774994..402ee697698e6 100644
+--- a/mm/memory.c
++++ b/mm/memory.c
+@@ -5410,27 +5410,28 @@ retry:
+ if (!vma_is_anonymous(vma))
+ goto inval;
+
+- /* find_mergeable_anon_vma uses adjacent vmas which are not locked */
+- if (!vma->anon_vma)
+- goto inval;
+-
+ if (!vma_start_read(vma))
+ goto inval;
+
++ /*
++ * find_mergeable_anon_vma uses adjacent vmas which are not locked.
++ * This check must happen after vma_start_read(); otherwise, a
++ * concurrent mremap() with MREMAP_DONTUNMAP could dissociate the VMA
++ * from its anon_vma.
++ */
++ if (unlikely(!vma->anon_vma))
++ goto inval_end_read;
++
+ /*
+ * Due to the possibility of userfault handler dropping mmap_lock, avoid
+ * it for now and fall back to page fault handling under mmap_lock.
+ */
+- if (userfaultfd_armed(vma)) {
+- vma_end_read(vma);
+- goto inval;
+- }
++ if (userfaultfd_armed(vma))
++ goto inval_end_read;
+
+ /* Check since vm_start/vm_end might change before we lock the VMA */
+- if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
+- vma_end_read(vma);
+- goto inval;
+- }
++ if (unlikely(address < vma->vm_start || address >= vma->vm_end))
++ goto inval_end_read;
+
+ /* Check if the VMA got isolated after we found it */
+ if (vma->detached) {
+@@ -5442,6 +5443,9 @@ retry:
+
+ rcu_read_unlock();
+ return vma;
++
++inval_end_read:
++ vma_end_read(vma);
+ inval:
+ rcu_read_unlock();
+ count_vm_vma_lock_event(VMA_LOCK_ABORT);
+diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
+index eebe256104bc0..947ca580bb9a2 100644
+--- a/net/bluetooth/l2cap_sock.c
++++ b/net/bluetooth/l2cap_sock.c
+@@ -46,6 +46,7 @@ static const struct proto_ops l2cap_sock_ops;
+ static void l2cap_sock_init(struct sock *sk, struct sock *parent);
+ static struct sock *l2cap_sock_alloc(struct net *net, struct socket *sock,
+ int proto, gfp_t prio, int kern);
++static void l2cap_sock_cleanup_listen(struct sock *parent);
+
+ bool l2cap_is_socket(struct socket *sock)
+ {
+@@ -1415,6 +1416,7 @@ static int l2cap_sock_release(struct socket *sock)
+ if (!sk)
+ return 0;
+
++ l2cap_sock_cleanup_listen(sk);
+ bt_sock_unlink(&l2cap_sk_list, sk);
+
+ err = l2cap_sock_shutdown(sock, SHUT_RDWR);
+diff --git a/net/can/raw.c b/net/can/raw.c
+index f64469b98260f..f8e3866157a33 100644
+--- a/net/can/raw.c
++++ b/net/can/raw.c
+@@ -873,7 +873,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+
+ skb->dev = dev;
+ skb->priority = sk->sk_priority;
+- skb->mark = sk->sk_mark;
++ skb->mark = READ_ONCE(sk->sk_mark);
+ skb->tstamp = sockc.transmit_time;
+
+ skb_setup_tx_timestamp(skb, sockc.tsflags);
+diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
+index 11c04e7d928eb..658a6f2320cfa 100644
+--- a/net/ceph/osd_client.c
++++ b/net/ceph/osd_client.c
+@@ -3334,17 +3334,24 @@ static int linger_reg_commit_wait(struct ceph_osd_linger_request *lreq)
+ int ret;
+
+ dout("%s lreq %p linger_id %llu\n", __func__, lreq, lreq->linger_id);
+- ret = wait_for_completion_interruptible(&lreq->reg_commit_wait);
++ ret = wait_for_completion_killable(&lreq->reg_commit_wait);
+ return ret ?: lreq->reg_commit_error;
+ }
+
+-static int linger_notify_finish_wait(struct ceph_osd_linger_request *lreq)
++static int linger_notify_finish_wait(struct ceph_osd_linger_request *lreq,
++ unsigned long timeout)
+ {
+- int ret;
++ long left;
+
+ dout("%s lreq %p linger_id %llu\n", __func__, lreq, lreq->linger_id);
+- ret = wait_for_completion_interruptible(&lreq->notify_finish_wait);
+- return ret ?: lreq->notify_finish_error;
++ left = wait_for_completion_killable_timeout(&lreq->notify_finish_wait,
++ ceph_timeout_jiffies(timeout));
++ if (left <= 0)
++ left = left ?: -ETIMEDOUT;
++ else
++ left = lreq->notify_finish_error; /* completed */
++
++ return left;
+ }
+
+ /*
+@@ -4896,7 +4903,8 @@ int ceph_osdc_notify(struct ceph_osd_client *osdc,
+ linger_submit(lreq);
+ ret = linger_reg_commit_wait(lreq);
+ if (!ret)
+- ret = linger_notify_finish_wait(lreq);
++ ret = linger_notify_finish_wait(lreq,
++ msecs_to_jiffies(2 * timeout * MSEC_PER_SEC));
+ else
+ dout("lreq %p failed to initiate notify %d\n", lreq, ret);
+
+diff --git a/net/core/Makefile b/net/core/Makefile
+index 8f367813bc681..731db2eaa6107 100644
+--- a/net/core/Makefile
++++ b/net/core/Makefile
+@@ -13,7 +13,7 @@ obj-y += dev.o dev_addr_lists.o dst.o netevent.o \
+ neighbour.o rtnetlink.o utils.o link_watch.o filter.o \
+ sock_diag.o dev_ioctl.o tso.o sock_reuseport.o \
+ fib_notifier.o xdp.o flow_offload.o gro.o \
+- netdev-genl.o netdev-genl-gen.o
++ netdev-genl.o netdev-genl-gen.o gso.o
+
+ obj-$(CONFIG_NETDEV_ADDR_LIST_TEST) += dev_addr_lists_test.o
+
+diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
+index d4172534dfa8d..cca7594be92ec 100644
+--- a/net/core/bpf_sk_storage.c
++++ b/net/core/bpf_sk_storage.c
+@@ -496,8 +496,11 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
+ return ERR_PTR(-EPERM);
+
+ nla_for_each_nested(nla, nla_stgs, rem) {
+- if (nla_type(nla) == SK_DIAG_BPF_STORAGE_REQ_MAP_FD)
++ if (nla_type(nla) == SK_DIAG_BPF_STORAGE_REQ_MAP_FD) {
++ if (nla_len(nla) != sizeof(u32))
++ return ERR_PTR(-EINVAL);
+ nr_maps++;
++ }
+ }
+
+ diag = kzalloc(struct_size(diag, maps, nr_maps), GFP_KERNEL);
+diff --git a/net/core/dev.c b/net/core/dev.c
+index c29f3e1db3ca7..44a4eb76a659e 100644
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -3209,7 +3209,7 @@ static u16 skb_tx_hash(const struct net_device *dev,
+ return (u16) reciprocal_scale(skb_get_hash(skb), qcount) + qoffset;
+ }
+
+-static void skb_warn_bad_offload(const struct sk_buff *skb)
++void skb_warn_bad_offload(const struct sk_buff *skb)
+ {
+ static const netdev_features_t null_features;
+ struct net_device *dev = skb->dev;
+@@ -3338,74 +3338,6 @@ __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
+ return vlan_get_protocol_and_depth(skb, type, depth);
+ }
+
+-/* openvswitch calls this on rx path, so we need a different check.
+- */
+-static inline bool skb_needs_check(struct sk_buff *skb, bool tx_path)
+-{
+- if (tx_path)
+- return skb->ip_summed != CHECKSUM_PARTIAL &&
+- skb->ip_summed != CHECKSUM_UNNECESSARY;
+-
+- return skb->ip_summed == CHECKSUM_NONE;
+-}
+-
+-/**
+- * __skb_gso_segment - Perform segmentation on skb.
+- * @skb: buffer to segment
+- * @features: features for the output path (see dev->features)
+- * @tx_path: whether it is called in TX path
+- *
+- * This function segments the given skb and returns a list of segments.
+- *
+- * It may return NULL if the skb requires no segmentation. This is
+- * only possible when GSO is used for verifying header integrity.
+- *
+- * Segmentation preserves SKB_GSO_CB_OFFSET bytes of previous skb cb.
+- */
+-struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
+- netdev_features_t features, bool tx_path)
+-{
+- struct sk_buff *segs;
+-
+- if (unlikely(skb_needs_check(skb, tx_path))) {
+- int err;
+-
+- /* We're going to init ->check field in TCP or UDP header */
+- err = skb_cow_head(skb, 0);
+- if (err < 0)
+- return ERR_PTR(err);
+- }
+-
+- /* Only report GSO partial support if it will enable us to
+- * support segmentation on this frame without needing additional
+- * work.
+- */
+- if (features & NETIF_F_GSO_PARTIAL) {
+- netdev_features_t partial_features = NETIF_F_GSO_ROBUST;
+- struct net_device *dev = skb->dev;
+-
+- partial_features |= dev->features & dev->gso_partial_features;
+- if (!skb_gso_ok(skb, features | partial_features))
+- features &= ~NETIF_F_GSO_PARTIAL;
+- }
+-
+- BUILD_BUG_ON(SKB_GSO_CB_OFFSET +
+- sizeof(*SKB_GSO_CB(skb)) > sizeof(skb->cb));
+-
+- SKB_GSO_CB(skb)->mac_offset = skb_headroom(skb);
+- SKB_GSO_CB(skb)->encap_level = 0;
+-
+- skb_reset_mac_header(skb);
+- skb_reset_mac_len(skb);
+-
+- segs = skb_mac_gso_segment(skb, features);
+-
+- if (segs != skb && unlikely(skb_needs_check(skb, tx_path) && !IS_ERR(segs)))
+- skb_warn_bad_offload(skb);
+-
+- return segs;
+-}
+-EXPORT_SYMBOL(__skb_gso_segment);
+
+ /* Take action when hardware reception checksum errors are detected. */
+ #ifdef CONFIG_BUG
+diff --git a/net/core/gro.c b/net/core/gro.c
+index 2d84165cb4f1d..2f1b6524bddc5 100644
+--- a/net/core/gro.c
++++ b/net/core/gro.c
+@@ -10,7 +10,7 @@
+ #define GRO_MAX_HEAD (MAX_HEADER + 128)
+
+ static DEFINE_SPINLOCK(offload_lock);
+-static struct list_head offload_base __read_mostly = LIST_HEAD_INIT(offload_base);
++struct list_head offload_base __read_mostly = LIST_HEAD_INIT(offload_base);
+ /* Maximum number of GRO_NORMAL skbs to batch up for list-RX */
+ int gro_normal_batch __read_mostly = 8;
+
+@@ -92,63 +92,6 @@ void dev_remove_offload(struct packet_offload *po)
+ }
+ EXPORT_SYMBOL(dev_remove_offload);
+
+-/**
+- * skb_eth_gso_segment - segmentation handler for ethernet protocols.
+- * @skb: buffer to segment
+- * @features: features for the output path (see dev->features)
+- * @type: Ethernet Protocol ID
+- */
+-struct sk_buff *skb_eth_gso_segment(struct sk_buff *skb,
+- netdev_features_t features, __be16 type)
+-{
+- struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT);
+- struct packet_offload *ptype;
+-
+- rcu_read_lock();
+- list_for_each_entry_rcu(ptype, &offload_base, list) {
+- if (ptype->type == type && ptype->callbacks.gso_segment) {
+- segs = ptype->callbacks.gso_segment(skb, features);
+- break;
+- }
+- }
+- rcu_read_unlock();
+-
+- return segs;
+-}
+-EXPORT_SYMBOL(skb_eth_gso_segment);
+-
+-/**
+- * skb_mac_gso_segment - mac layer segmentation handler.
+- * @skb: buffer to segment
+- * @features: features for the output path (see dev->features)
+- */
+-struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
+- netdev_features_t features)
+-{
+- struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT);
+- struct packet_offload *ptype;
+- int vlan_depth = skb->mac_len;
+- __be16 type = skb_network_protocol(skb, &vlan_depth);
+-
+- if (unlikely(!type))
+- return ERR_PTR(-EINVAL);
+-
+- __skb_pull(skb, vlan_depth);
+-
+- rcu_read_lock();
+- list_for_each_entry_rcu(ptype, &offload_base, list) {
+- if (ptype->type == type && ptype->callbacks.gso_segment) {
+- segs = ptype->callbacks.gso_segment(skb, features);
+- break;
+- }
+- }
+- rcu_read_unlock();
+-
+- __skb_push(skb, skb->data - skb_mac_header(skb));
+-
+- return segs;
+-}
+-EXPORT_SYMBOL(skb_mac_gso_segment);
+
+ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
+ {
+diff --git a/net/core/gso.c b/net/core/gso.c
+new file mode 100644
+index 0000000000000..9e1803bfc9c6c
+--- /dev/null
++++ b/net/core/gso.c
+@@ -0,0 +1,273 @@
++// SPDX-License-Identifier: GPL-2.0-or-later
++#include <linux/skbuff.h>
++#include <linux/sctp.h>
++#include <net/gso.h>
++#include <net/gro.h>
++
++/**
++ * skb_eth_gso_segment - segmentation handler for ethernet protocols.
++ * @skb: buffer to segment
++ * @features: features for the output path (see dev->features)
++ * @type: Ethernet Protocol ID
++ */
++struct sk_buff *skb_eth_gso_segment(struct sk_buff *skb,
++ netdev_features_t features, __be16 type)
++{
++ struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT);
++ struct packet_offload *ptype;
++
++ rcu_read_lock();
++ list_for_each_entry_rcu(ptype, &offload_base, list) {
++ if (ptype->type == type && ptype->callbacks.gso_segment) {
++ segs = ptype->callbacks.gso_segment(skb, features);
++ break;
++ }
++ }
++ rcu_read_unlock();
++
++ return segs;
++}
++EXPORT_SYMBOL(skb_eth_gso_segment);
++
++/**
++ * skb_mac_gso_segment - mac layer segmentation handler.
++ * @skb: buffer to segment
++ * @features: features for the output path (see dev->features)
++ */
++struct sk_buff *skb_mac_gso_segment(struct sk_buff *skb,
++ netdev_features_t features)
++{
++ struct sk_buff *segs = ERR_PTR(-EPROTONOSUPPORT);
++ struct packet_offload *ptype;
++ int vlan_depth = skb->mac_len;
++ __be16 type = skb_network_protocol(skb, &vlan_depth);
++
++ if (unlikely(!type))
++ return ERR_PTR(-EINVAL);
++
++ __skb_pull(skb, vlan_depth);
++
++ rcu_read_lock();
++ list_for_each_entry_rcu(ptype, &offload_base, list) {
++ if (ptype->type == type && ptype->callbacks.gso_segment) {
++ segs = ptype->callbacks.gso_segment(skb, features);
++ break;
++ }
++ }
++ rcu_read_unlock();
++
++ __skb_push(skb, skb->data - skb_mac_header(skb));
++
++ return segs;
++}
++EXPORT_SYMBOL(skb_mac_gso_segment);
++/* openvswitch calls this on rx path, so we need a different check.
++ */
++static bool skb_needs_check(const struct sk_buff *skb, bool tx_path)
++{
++ if (tx_path)
++ return skb->ip_summed != CHECKSUM_PARTIAL &&
++ skb->ip_summed != CHECKSUM_UNNECESSARY;
++
++ return skb->ip_summed == CHECKSUM_NONE;
++}
++
++/**
++ * __skb_gso_segment - Perform segmentation on skb.
++ * @skb: buffer to segment
++ * @features: features for the output path (see dev->features)
++ * @tx_path: whether it is called in TX path
++ *
++ * This function segments the given skb and returns a list of segments.
++ *
++ * It may return NULL if the skb requires no segmentation. This is
++ * only possible when GSO is used for verifying header integrity.
++ *
++ * Segmentation preserves SKB_GSO_CB_OFFSET bytes of previous skb cb.
++ */
++struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
++ netdev_features_t features, bool tx_path)
++{
++ struct sk_buff *segs;
++
++ if (unlikely(skb_needs_check(skb, tx_path))) {
++ int err;
++
++ /* We're going to init ->check field in TCP or UDP header */
++ err = skb_cow_head(skb, 0);
++ if (err < 0)
++ return ERR_PTR(err);
++ }
++
++ /* Only report GSO partial support if it will enable us to
++ * support segmentation on this frame without needing additional
++ * work.
++ */
++ if (features & NETIF_F_GSO_PARTIAL) {
++ netdev_features_t partial_features = NETIF_F_GSO_ROBUST;
++ struct net_device *dev = skb->dev;
++
++ partial_features |= dev->features & dev->gso_partial_features;
++ if (!skb_gso_ok(skb, features | partial_features))
++ features &= ~NETIF_F_GSO_PARTIAL;
++ }
++
++ BUILD_BUG_ON(SKB_GSO_CB_OFFSET +
++ sizeof(*SKB_GSO_CB(skb)) > sizeof(skb->cb));
++
++ SKB_GSO_CB(skb)->mac_offset = skb_headroom(skb);
++ SKB_GSO_CB(skb)->encap_level = 0;
++
++ skb_reset_mac_header(skb);
++ skb_reset_mac_len(skb);
++
++ segs = skb_mac_gso_segment(skb, features);
++
++ if (segs != skb && unlikely(skb_needs_check(skb, tx_path) && !IS_ERR(segs)))
++ skb_warn_bad_offload(skb);
++
++ return segs;
++}
++EXPORT_SYMBOL(__skb_gso_segment);
++
++/**
++ * skb_gso_transport_seglen - Return length of individual segments of a gso packet
++ *
++ * @skb: GSO skb
++ *
++ * skb_gso_transport_seglen is used to determine the real size of the
++ * individual segments, including Layer4 headers (TCP/UDP).
++ *
++ * The MAC/L2 or network (IP, IPv6) headers are not accounted for.
++ */
++static unsigned int skb_gso_transport_seglen(const struct sk_buff *skb)
++{
++ const struct skb_shared_info *shinfo = skb_shinfo(skb);
++ unsigned int thlen = 0;
++
++ if (skb->encapsulation) {
++ thlen = skb_inner_transport_header(skb) -
++ skb_transport_header(skb);
++
++ if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
++ thlen += inner_tcp_hdrlen(skb);
++ } else if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) {
++ thlen = tcp_hdrlen(skb);
++ } else if (unlikely(skb_is_gso_sctp(skb))) {
++ thlen = sizeof(struct sctphdr);
++ } else if (shinfo->gso_type & SKB_GSO_UDP_L4) {
++ thlen = sizeof(struct udphdr);
++ }
++ /* UFO sets gso_size to the size of the fragmentation
++ * payload, i.e. the size of the L4 (UDP) header is already
++ * accounted for.
++ */
++ return thlen + shinfo->gso_size;
++}
++
++/**
++ * skb_gso_network_seglen - Return length of individual segments of a gso packet
++ *
++ * @skb: GSO skb
++ *
++ * skb_gso_network_seglen is used to determine the real size of the
++ * individual segments, including Layer3 (IP, IPv6) and L4 headers (TCP/UDP).
++ *
++ * The MAC/L2 header is not accounted for.
++ */
++static unsigned int skb_gso_network_seglen(const struct sk_buff *skb)
++{
++ unsigned int hdr_len = skb_transport_header(skb) -
++ skb_network_header(skb);
++
++ return hdr_len + skb_gso_transport_seglen(skb);
++}
++
++/**
++ * skb_gso_mac_seglen - Return length of individual segments of a gso packet
++ *
++ * @skb: GSO skb
++ *
++ * skb_gso_mac_seglen is used to determine the real size of the
++ * individual segments, including MAC/L2, Layer3 (IP, IPv6) and L4
++ * headers (TCP/UDP).
++ */
++static unsigned int skb_gso_mac_seglen(const struct sk_buff *skb)
++{
++ unsigned int hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
++
++ return hdr_len + skb_gso_transport_seglen(skb);
++}
++
++/**
++ * skb_gso_size_check - check the skb size, considering GSO_BY_FRAGS
++ *
++ * There are a couple of instances where we have a GSO skb, and we
++ * want to determine what size it would be after it is segmented.
++ *
++ * We might want to check:
++ * - L3+L4+payload size (e.g. IP forwarding)
++ * - L2+L3+L4+payload size (e.g. sanity check before passing to driver)
++ *
++ * This is a helper to do that correctly considering GSO_BY_FRAGS.
++ *
++ * @skb: GSO skb
++ *
++ * @seg_len: The segmented length (from skb_gso_*_seglen). In the
++ * GSO_BY_FRAGS case this will be [header sizes + GSO_BY_FRAGS].
++ *
++ * @max_len: The maximum permissible length.
++ *
++ * Returns true if the segmented length <= max length.
++ */
++static inline bool skb_gso_size_check(const struct sk_buff *skb,
++ unsigned int seg_len,
++ unsigned int max_len) {
++ const struct skb_shared_info *shinfo = skb_shinfo(skb);
++ const struct sk_buff *iter;
++
++ if (shinfo->gso_size != GSO_BY_FRAGS)
++ return seg_len <= max_len;
++
++ /* Undo this so we can re-use header sizes */
++ seg_len -= GSO_BY_FRAGS;
++
++ skb_walk_frags(skb, iter) {
++ if (seg_len + skb_headlen(iter) > max_len)
++ return false;
++ }
++
++ return true;
++}
++
++/**
++ * skb_gso_validate_network_len - Will a split GSO skb fit into a given MTU?
++ *
++ * @skb: GSO skb
++ * @mtu: MTU to validate against
++ *
++ * skb_gso_validate_network_len validates if a given skb will fit a
++ * wanted MTU once split. It considers L3 headers, L4 headers, and the
++ * payload.
++ */
++bool skb_gso_validate_network_len(const struct sk_buff *skb, unsigned int mtu)
++{
++ return skb_gso_size_check(skb, skb_gso_network_seglen(skb), mtu);
++}
++EXPORT_SYMBOL_GPL(skb_gso_validate_network_len);
++
++/**
++ * skb_gso_validate_mac_len - Will a split GSO skb fit in a given length?
++ *
++ * @skb: GSO skb
++ * @len: length to validate against
++ *
++ * skb_gso_validate_mac_len validates if a given skb will fit a wanted
++ * length once split, including L2, L3 and L4 headers and the payload.
++ */
++bool skb_gso_validate_mac_len(const struct sk_buff *skb, unsigned int len)
++{
++ return skb_gso_size_check(skb, skb_gso_mac_seglen(skb), len);
++}
++EXPORT_SYMBOL_GPL(skb_gso_validate_mac_len);
++
+diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
+index 2fe6a3379aaed..aa1743b2b770b 100644
+--- a/net/core/rtnetlink.c
++++ b/net/core/rtnetlink.c
+@@ -5139,13 +5139,17 @@ static int rtnl_bridge_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
+ br_spec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
+ if (br_spec) {
+ nla_for_each_nested(attr, br_spec, rem) {
+- if (nla_type(attr) == IFLA_BRIDGE_FLAGS) {
++ if (nla_type(attr) == IFLA_BRIDGE_FLAGS && !have_flags) {
+ if (nla_len(attr) < sizeof(flags))
+ return -EINVAL;
+
+ have_flags = true;
+ flags = nla_get_u16(attr);
+- break;
++ }
++
++ if (nla_type(attr) == IFLA_BRIDGE_MODE) {
++ if (nla_len(attr) < sizeof(u16))
++ return -EINVAL;
+ }
+ }
+ }
+diff --git a/net/core/skbuff.c b/net/core/skbuff.c
+index 1b6a1d99869dc..593ec18e3f007 100644
+--- a/net/core/skbuff.c
++++ b/net/core/skbuff.c
+@@ -67,6 +67,7 @@
+ #include <net/dst.h>
+ #include <net/sock.h>
+ #include <net/checksum.h>
++#include <net/gso.h>
+ #include <net/ip6_checksum.h>
+ #include <net/xfrm.h>
+ #include <net/mpls.h>
+@@ -5789,147 +5790,6 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet)
+ }
+ EXPORT_SYMBOL_GPL(skb_scrub_packet);
+
+-/**
+- * skb_gso_transport_seglen - Return length of individual segments of a gso packet
+- *
+- * @skb: GSO skb
+- *
+- * skb_gso_transport_seglen is used to determine the real size of the
+- * individual segments, including Layer4 headers (TCP/UDP).
+- *
+- * The MAC/L2 or network (IP, IPv6) headers are not accounted for.
+- */
+-static unsigned int skb_gso_transport_seglen(const struct sk_buff *skb)
+-{
+- const struct skb_shared_info *shinfo = skb_shinfo(skb);
+- unsigned int thlen = 0;
+-
+- if (skb->encapsulation) {
+- thlen = skb_inner_transport_header(skb) -
+- skb_transport_header(skb);
+-
+- if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
+- thlen += inner_tcp_hdrlen(skb);
+- } else if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) {
+- thlen = tcp_hdrlen(skb);
+- } else if (unlikely(skb_is_gso_sctp(skb))) {
+- thlen = sizeof(struct sctphdr);
+- } else if (shinfo->gso_type & SKB_GSO_UDP_L4) {
+- thlen = sizeof(struct udphdr);
+- }
+- /* UFO sets gso_size to the size of the fragmentation
+- * payload, i.e. the size of the L4 (UDP) header is already
+- * accounted for.
+- */
+- return thlen + shinfo->gso_size;
+-}
+-
+-/**
+- * skb_gso_network_seglen - Return length of individual segments of a gso packet
+- *
+- * @skb: GSO skb
+- *
+- * skb_gso_network_seglen is used to determine the real size of the
+- * individual segments, including Layer3 (IP, IPv6) and L4 headers (TCP/UDP).
+- *
+- * The MAC/L2 header is not accounted for.
+- */
+-static unsigned int skb_gso_network_seglen(const struct sk_buff *skb)
+-{
+- unsigned int hdr_len = skb_transport_header(skb) -
+- skb_network_header(skb);
+-
+- return hdr_len + skb_gso_transport_seglen(skb);
+-}
+-
+-/**
+- * skb_gso_mac_seglen - Return length of individual segments of a gso packet
+- *
+- * @skb: GSO skb
+- *
+- * skb_gso_mac_seglen is used to determine the real size of the
+- * individual segments, including MAC/L2, Layer3 (IP, IPv6) and L4
+- * headers (TCP/UDP).
+- */
+-static unsigned int skb_gso_mac_seglen(const struct sk_buff *skb)
+-{
+- unsigned int hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
+-
+- return hdr_len + skb_gso_transport_seglen(skb);
+-}
+-
+-/**
+- * skb_gso_size_check - check the skb size, considering GSO_BY_FRAGS
+- *
+- * There are a couple of instances where we have a GSO skb, and we
+- * want to determine what size it would be after it is segmented.
+- *
+- * We might want to check:
+- * - L3+L4+payload size (e.g. IP forwarding)
+- * - L2+L3+L4+payload size (e.g. sanity check before passing to driver)
+- *
+- * This is a helper to do that correctly considering GSO_BY_FRAGS.
+- *
+- * @skb: GSO skb
+- *
+- * @seg_len: The segmented length (from skb_gso_*_seglen). In the
+- * GSO_BY_FRAGS case this will be [header sizes + GSO_BY_FRAGS].
+- *
+- * @max_len: The maximum permissible length.
+- *
+- * Returns true if the segmented length <= max length.
+- */
+-static inline bool skb_gso_size_check(const struct sk_buff *skb,
+- unsigned int seg_len,
+- unsigned int max_len) {
+- const struct skb_shared_info *shinfo = skb_shinfo(skb);
+- const struct sk_buff *iter;
+-
+- if (shinfo->gso_size != GSO_BY_FRAGS)
+- return seg_len <= max_len;
+-
+- /* Undo this so we can re-use header sizes */
+- seg_len -= GSO_BY_FRAGS;
+-
+- skb_walk_frags(skb, iter) {
+- if (seg_len + skb_headlen(iter) > max_len)
+- return false;
+- }
+-
+- return true;
+-}
+-
+-/**
+- * skb_gso_validate_network_len - Will a split GSO skb fit into a given MTU?
+- *
+- * @skb: GSO skb
+- * @mtu: MTU to validate against
+- *
+- * skb_gso_validate_network_len validates if a given skb will fit a
+- * wanted MTU once split. It considers L3 headers, L4 headers, and the
+- * payload.
+- */
+-bool skb_gso_validate_network_len(const struct sk_buff *skb, unsigned int mtu)
+-{
+- return skb_gso_size_check(skb, skb_gso_network_seglen(skb), mtu);
+-}
+-EXPORT_SYMBOL_GPL(skb_gso_validate_network_len);
+-
+-/**
+- * skb_gso_validate_mac_len - Will a split GSO skb fit in a given length?
+- *
+- * @skb: GSO skb
+- * @len: length to validate against
+- *
+- * skb_gso_validate_mac_len validates if a given skb will fit a wanted
+- * length once split, including L2, L3 and L4 headers and the payload.
+- */
+-bool skb_gso_validate_mac_len(const struct sk_buff *skb, unsigned int len)
+-{
+- return skb_gso_size_check(skb, skb_gso_mac_seglen(skb), len);
+-}
+-EXPORT_SYMBOL_GPL(skb_gso_validate_mac_len);
+-
+ static struct sk_buff *skb_reorder_vlan_header(struct sk_buff *skb)
+ {
+ int mac_len, meta_len;
+diff --git a/net/core/sock.c b/net/core/sock.c
+index 4a0edccf86066..1f31a97100d4f 100644
+--- a/net/core/sock.c
++++ b/net/core/sock.c
+@@ -800,7 +800,7 @@ EXPORT_SYMBOL(sock_no_linger);
+ void sock_set_priority(struct sock *sk, u32 priority)
+ {
+ lock_sock(sk);
+- sk->sk_priority = priority;
++ WRITE_ONCE(sk->sk_priority, priority);
+ release_sock(sk);
+ }
+ EXPORT_SYMBOL(sock_set_priority);
+@@ -984,7 +984,7 @@ EXPORT_SYMBOL(sock_set_rcvbuf);
+ static void __sock_set_mark(struct sock *sk, u32 val)
+ {
+ if (val != sk->sk_mark) {
+- sk->sk_mark = val;
++ WRITE_ONCE(sk->sk_mark, val);
+ sk_dst_reset(sk);
+ }
+ }
+@@ -1003,7 +1003,7 @@ static void sock_release_reserved_memory(struct sock *sk, int bytes)
+ bytes = round_down(bytes, PAGE_SIZE);
+
+ WARN_ON(bytes > sk->sk_reserved_mem);
+- sk->sk_reserved_mem -= bytes;
++ WRITE_ONCE(sk->sk_reserved_mem, sk->sk_reserved_mem - bytes);
+ sk_mem_reclaim(sk);
+ }
+
+@@ -1040,7 +1040,8 @@ static int sock_reserve_memory(struct sock *sk, int bytes)
+ }
+ sk->sk_forward_alloc += pages << PAGE_SHIFT;
+
+- sk->sk_reserved_mem += pages << PAGE_SHIFT;
++ WRITE_ONCE(sk->sk_reserved_mem,
++ sk->sk_reserved_mem + (pages << PAGE_SHIFT));
+
+ return 0;
+ }
+@@ -1209,7 +1210,7 @@ set_sndbuf:
+ if ((val >= 0 && val <= 6) ||
+ sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
+ sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))
+- sk->sk_priority = val;
++ WRITE_ONCE(sk->sk_priority, val);
+ else
+ ret = -EPERM;
+ break;
+@@ -1427,7 +1428,8 @@ set_sndbuf:
+ cmpxchg(&sk->sk_pacing_status,
+ SK_PACING_NONE,
+ SK_PACING_NEEDED);
+- sk->sk_max_pacing_rate = ulval;
++ /* Pairs with READ_ONCE() from sk_getsockopt() */
++ WRITE_ONCE(sk->sk_max_pacing_rate, ulval);
+ sk->sk_pacing_rate = min(sk->sk_pacing_rate, ulval);
+ break;
+ }
+@@ -1522,7 +1524,9 @@ set_sndbuf:
+ }
+ if ((u8)val == SOCK_TXREHASH_DEFAULT)
+ val = READ_ONCE(sock_net(sk)->core.sysctl_txrehash);
+- /* Paired with READ_ONCE() in tcp_rtx_synack() */
++ /* Paired with READ_ONCE() in tcp_rtx_synack()
++ * and sk_getsockopt().
++ */
+ WRITE_ONCE(sk->sk_txrehash, (u8)val);
+ break;
+
+@@ -1622,11 +1626,11 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+ break;
+
+ case SO_SNDBUF:
+- v.val = sk->sk_sndbuf;
++ v.val = READ_ONCE(sk->sk_sndbuf);
+ break;
+
+ case SO_RCVBUF:
+- v.val = sk->sk_rcvbuf;
++ v.val = READ_ONCE(sk->sk_rcvbuf);
+ break;
+
+ case SO_REUSEADDR:
+@@ -1668,7 +1672,7 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+ break;
+
+ case SO_PRIORITY:
+- v.val = sk->sk_priority;
++ v.val = READ_ONCE(sk->sk_priority);
+ break;
+
+ case SO_LINGER:
+@@ -1715,7 +1719,7 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+ break;
+
+ case SO_RCVLOWAT:
+- v.val = sk->sk_rcvlowat;
++ v.val = READ_ONCE(sk->sk_rcvlowat);
+ break;
+
+ case SO_SNDLOWAT:
+@@ -1795,7 +1799,7 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+ optval, optlen, len);
+
+ case SO_MARK:
+- v.val = sk->sk_mark;
++ v.val = READ_ONCE(sk->sk_mark);
+ break;
+
+ case SO_RCVMARK:
+@@ -1814,7 +1818,7 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+ if (!sock->ops->set_peek_off)
+ return -EOPNOTSUPP;
+
+- v.val = sk->sk_peek_off;
++ v.val = READ_ONCE(sk->sk_peek_off);
+ break;
+ case SO_NOFCS:
+ v.val = sock_flag(sk, SOCK_NOFCS);
+@@ -1844,7 +1848,7 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+
+ #ifdef CONFIG_NET_RX_BUSY_POLL
+ case SO_BUSY_POLL:
+- v.val = sk->sk_ll_usec;
++ v.val = READ_ONCE(sk->sk_ll_usec);
+ break;
+ case SO_PREFER_BUSY_POLL:
+ v.val = READ_ONCE(sk->sk_prefer_busy_poll);
+@@ -1852,12 +1856,14 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+ #endif
+
+ case SO_MAX_PACING_RATE:
++ /* The READ_ONCE() pair with the WRITE_ONCE() in sk_setsockopt() */
+ if (sizeof(v.ulval) != sizeof(v.val) && len >= sizeof(v.ulval)) {
+ lv = sizeof(v.ulval);
+- v.ulval = sk->sk_max_pacing_rate;
++ v.ulval = READ_ONCE(sk->sk_max_pacing_rate);
+ } else {
+ /* 32bit version */
+- v.val = min_t(unsigned long, sk->sk_max_pacing_rate, ~0U);
++ v.val = min_t(unsigned long, ~0U,
++ READ_ONCE(sk->sk_max_pacing_rate));
+ }
+ break;
+
+@@ -1925,11 +1931,12 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+ break;
+
+ case SO_RESERVE_MEM:
+- v.val = sk->sk_reserved_mem;
++ v.val = READ_ONCE(sk->sk_reserved_mem);
+ break;
+
+ case SO_TXREHASH:
+- v.val = sk->sk_txrehash;
++ /* Paired with WRITE_ONCE() in sk_setsockopt() */
++ v.val = READ_ONCE(sk->sk_txrehash);
+ break;
+
+ default:
+@@ -3120,7 +3127,7 @@ EXPORT_SYMBOL(__sk_mem_reclaim);
+
+ int sk_set_peek_off(struct sock *sk, int val)
+ {
+- sk->sk_peek_off = val;
++ WRITE_ONCE(sk->sk_peek_off, val);
+ return 0;
+ }
+ EXPORT_SYMBOL_GPL(sk_set_peek_off);
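Most of the sock.c hunks above convert plain loads and stores of fields such as sk_mark, sk_priority and sk_reserved_mem into READ_ONCE()/WRITE_ONCE() pairs, because sk_getsockopt() now reads them without holding the socket lock. A standalone C sketch of that annotation pattern, using simplified stand-ins for the kernel macros (the real ones do more, e.g. size checks):

	#include <stdio.h>

	/* Simplified stand-ins: force a single, untorn access through a volatile
	 * cast so the compiler cannot re-read, cache, or split the load/store.
	 * The kernel's READ_ONCE()/WRITE_ONCE() are the real thing.
	 */
	#define TOY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
	#define TOY_READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

	struct toy_sock {
		unsigned int sk_mark;
		unsigned int sk_priority;
	};

	/* Writer side: typically called with the socket lock held. */
	static void toy_set_mark(struct toy_sock *sk, unsigned int val)
	{
		TOY_WRITE_ONCE(sk->sk_mark, val);
	}

	/* Reader side: may run concurrently without the lock, e.g. getsockopt(). */
	static unsigned int toy_get_mark(struct toy_sock *sk)
	{
		return TOY_READ_ONCE(sk->sk_mark);
	}

	int main(void)
	{
		struct toy_sock sk = { 0 };

		toy_set_mark(&sk, 42);
		printf("mark = %u\n", toy_get_mark(&sk));
		return 0;
	}

The same pairing explains the later hunks in ip_output.c, raw.c, packet/af_packet.c and friends, which switch their lockless readers of sk_mark/sk_priority over to READ_ONCE().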
+diff --git a/net/core/sock_map.c b/net/core/sock_map.c
+index 00afb66cd0950..08ab108206bf8 100644
+--- a/net/core/sock_map.c
++++ b/net/core/sock_map.c
+@@ -32,8 +32,6 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
+ {
+ struct bpf_stab *stab;
+
+- if (!capable(CAP_NET_ADMIN))
+- return ERR_PTR(-EPERM);
+ if (attr->max_entries == 0 ||
+ attr->key_size != 4 ||
+ (attr->value_size != sizeof(u32) &&
+@@ -117,7 +115,6 @@ static void sock_map_sk_acquire(struct sock *sk)
+ __acquires(&sk->sk_lock.slock)
+ {
+ lock_sock(sk);
+- preempt_disable();
+ rcu_read_lock();
+ }
+
+@@ -125,7 +122,6 @@ static void sock_map_sk_release(struct sock *sk)
+ __releases(&sk->sk_lock.slock)
+ {
+ rcu_read_unlock();
+- preempt_enable();
+ release_sock(sk);
+ }
+
+@@ -1085,8 +1081,6 @@ static struct bpf_map *sock_hash_alloc(union bpf_attr *attr)
+ struct bpf_shtab *htab;
+ int i, err;
+
+- if (!capable(CAP_NET_ADMIN))
+- return ERR_PTR(-EPERM);
+ if (attr->max_entries == 0 ||
+ attr->key_size == 0 ||
+ (attr->value_size != sizeof(u32) &&
+diff --git a/net/dcb/dcbnl.c b/net/dcb/dcbnl.c
+index c0c4381285759..2e6b8c8fd2ded 100644
+--- a/net/dcb/dcbnl.c
++++ b/net/dcb/dcbnl.c
+@@ -980,7 +980,7 @@ static int dcbnl_bcn_setcfg(struct net_device *netdev, struct nlmsghdr *nlh,
+ return -EOPNOTSUPP;
+
+ ret = nla_parse_nested_deprecated(data, DCB_BCN_ATTR_MAX,
+- tb[DCB_ATTR_BCN], dcbnl_pfc_up_nest,
++ tb[DCB_ATTR_BCN], dcbnl_bcn_nest,
+ NULL);
+ if (ret)
+ return ret;
+diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
+index 93c98990d7263..94b69a50c8b50 100644
+--- a/net/dccp/ipv6.c
++++ b/net/dccp/ipv6.c
+@@ -238,8 +238,8 @@ static int dccp_v6_send_response(const struct sock *sk, struct request_sock *req
+ opt = ireq->ipv6_opt;
+ if (!opt)
+ opt = rcu_dereference(np->opt);
+- err = ip6_xmit(sk, skb, &fl6, sk->sk_mark, opt, np->tclass,
+- sk->sk_priority);
++ err = ip6_xmit(sk, skb, &fl6, READ_ONCE(sk->sk_mark), opt,
++ np->tclass, sk->sk_priority);
+ rcu_read_unlock();
+ err = net_xmit_eval(err);
+ }
+diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
+index 4a76ebf793b85..10ebe39dcc873 100644
+--- a/net/ipv4/af_inet.c
++++ b/net/ipv4/af_inet.c
+@@ -100,6 +100,7 @@
+ #include <net/ip_fib.h>
+ #include <net/inet_connection_sock.h>
+ #include <net/gro.h>
++#include <net/gso.h>
+ #include <net/tcp.h>
+ #include <net/udp.h>
+ #include <net/udplite.h>
+diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c
+index ee848be59e65a..10e96ed6c9e39 100644
+--- a/net/ipv4/esp4_offload.c
++++ b/net/ipv4/esp4_offload.c
+@@ -17,6 +17,7 @@
+ #include <linux/err.h>
+ #include <linux/module.h>
+ #include <net/gro.h>
++#include <net/gso.h>
+ #include <net/ip.h>
+ #include <net/xfrm.h>
+ #include <net/esp.h>
+diff --git a/net/ipv4/gre_offload.c b/net/ipv4/gre_offload.c
+index 2b9cb5398335b..311e70bfce407 100644
+--- a/net/ipv4/gre_offload.c
++++ b/net/ipv4/gre_offload.c
+@@ -11,6 +11,7 @@
+ #include <net/protocol.h>
+ #include <net/gre.h>
+ #include <net/gro.h>
++#include <net/gso.h>
+
+ static struct sk_buff *gre_gso_segment(struct sk_buff *skb,
+ netdev_features_t features)
+diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
+index b812eb36f0e36..f7426926a1041 100644
+--- a/net/ipv4/inet_diag.c
++++ b/net/ipv4/inet_diag.c
+@@ -150,7 +150,7 @@ int inet_diag_msg_attrs_fill(struct sock *sk, struct sk_buff *skb,
+ }
+ #endif
+
+- if (net_admin && nla_put_u32(skb, INET_DIAG_MARK, sk->sk_mark))
++ if (net_admin && nla_put_u32(skb, INET_DIAG_MARK, READ_ONCE(sk->sk_mark)))
+ goto errout;
+
+ if (ext & (1 << (INET_DIAG_CLASS_ID - 1)) ||
+@@ -799,7 +799,7 @@ int inet_diag_bc_sk(const struct nlattr *bc, struct sock *sk)
+ entry.ifindex = sk->sk_bound_dev_if;
+ entry.userlocks = sk_fullsock(sk) ? sk->sk_userlocks : 0;
+ if (sk_fullsock(sk))
+- entry.mark = sk->sk_mark;
++ entry.mark = READ_ONCE(sk->sk_mark);
+ else if (sk->sk_state == TCP_NEW_SYN_RECV)
+ entry.mark = inet_rsk(inet_reqsk(sk))->ir_mark;
+ else if (sk->sk_state == TCP_TIME_WAIT)
+diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
+index a1bead441026e..6f6f63cf9224f 100644
+--- a/net/ipv4/ip_output.c
++++ b/net/ipv4/ip_output.c
+@@ -73,6 +73,7 @@
+ #include <net/arp.h>
+ #include <net/icmp.h>
+ #include <net/checksum.h>
++#include <net/gso.h>
+ #include <net/inetpeer.h>
+ #include <net/inet_ecn.h>
+ #include <net/lwtunnel.h>
+@@ -183,9 +184,9 @@ int ip_build_and_send_pkt(struct sk_buff *skb, const struct sock *sk,
+ ip_options_build(skb, &opt->opt, daddr, rt);
+ }
+
+- skb->priority = sk->sk_priority;
++ skb->priority = READ_ONCE(sk->sk_priority);
+ if (!skb->mark)
+- skb->mark = sk->sk_mark;
++ skb->mark = READ_ONCE(sk->sk_mark);
+
+ /* Send it out. */
+ return ip_local_out(net, skb->sk, skb);
+@@ -527,8 +528,8 @@ packet_routed:
+ skb_shinfo(skb)->gso_segs ?: 1);
+
+ /* TODO : should we use skb->sk here instead of sk ? */
+- skb->priority = sk->sk_priority;
+- skb->mark = sk->sk_mark;
++ skb->priority = READ_ONCE(sk->sk_priority);
++ skb->mark = READ_ONCE(sk->sk_mark);
+
+ res = ip_local_out(net, sk, skb);
+ rcu_read_unlock();
+diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
+index 8e97d8d4cc9d9..d41bce8927b2c 100644
+--- a/net/ipv4/ip_sockglue.c
++++ b/net/ipv4/ip_sockglue.c
+@@ -592,7 +592,7 @@ void __ip_sock_set_tos(struct sock *sk, int val)
+ }
+ if (inet_sk(sk)->tos != val) {
+ inet_sk(sk)->tos = val;
+- sk->sk_priority = rt_tos2priority(val);
++ WRITE_ONCE(sk->sk_priority, rt_tos2priority(val));
+ sk_dst_reset(sk);
+ }
+ }
+diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
+index eadf1c9ef7e49..fb31624019435 100644
+--- a/net/ipv4/raw.c
++++ b/net/ipv4/raw.c
+@@ -348,7 +348,7 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
+ goto error;
+ skb_reserve(skb, hlen);
+
+- skb->priority = sk->sk_priority;
++ skb->priority = READ_ONCE(sk->sk_priority);
+ skb->mark = sockc->mark;
+ skb->tstamp = sockc->transmit_time;
+ skb_dst_set(skb, &rt->dst);
+diff --git a/net/ipv4/route.c b/net/ipv4/route.c
+index 98d7e6ba7493b..92fede388d520 100644
+--- a/net/ipv4/route.c
++++ b/net/ipv4/route.c
+@@ -518,7 +518,7 @@ static void __build_flow_key(const struct net *net, struct flowi4 *fl4,
+ const struct inet_sock *inet = inet_sk(sk);
+
+ oif = sk->sk_bound_dev_if;
+- mark = sk->sk_mark;
++ mark = READ_ONCE(sk->sk_mark);
+ tos = ip_sock_rt_tos(sk);
+ scope = ip_sock_rt_scope(sk);
+ prot = inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol;
+@@ -552,7 +552,7 @@ static void build_sk_flow_key(struct flowi4 *fl4, const struct sock *sk)
+ inet_opt = rcu_dereference(inet->inet_opt);
+ if (inet_opt && inet_opt->opt.srr)
+ daddr = inet_opt->opt.faddr;
+- flowi4_init_output(fl4, sk->sk_bound_dev_if, sk->sk_mark,
++ flowi4_init_output(fl4, sk->sk_bound_dev_if, READ_ONCE(sk->sk_mark),
+ ip_sock_rt_tos(sk) & IPTOS_RT_MASK,
+ ip_sock_rt_scope(sk),
+ inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
+index f37d13ee7b4cc..498dd4acdeec8 100644
+--- a/net/ipv4/tcp_ipv4.c
++++ b/net/ipv4/tcp_ipv4.c
+@@ -931,9 +931,9 @@ static void tcp_v4_send_ack(const struct sock *sk,
+ ctl_sk = this_cpu_read(ipv4_tcp_sk);
+ sock_net_set(ctl_sk, net);
+ ctl_sk->sk_mark = (sk->sk_state == TCP_TIME_WAIT) ?
+- inet_twsk(sk)->tw_mark : sk->sk_mark;
++ inet_twsk(sk)->tw_mark : READ_ONCE(sk->sk_mark);
+ ctl_sk->sk_priority = (sk->sk_state == TCP_TIME_WAIT) ?
+- inet_twsk(sk)->tw_priority : sk->sk_priority;
++ inet_twsk(sk)->tw_priority : READ_ONCE(sk->sk_priority);
+ transmit_time = tcp_transmit_time(sk);
+ ip_send_unicast_reply(ctl_sk,
+ skb, &TCP_SKB_CB(skb)->header.h4.opt,
+diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c
+index 82f4575f9cd90..99ac5efe244d3 100644
+--- a/net/ipv4/tcp_metrics.c
++++ b/net/ipv4/tcp_metrics.c
+@@ -40,7 +40,7 @@ struct tcp_fastopen_metrics {
+
+ struct tcp_metrics_block {
+ struct tcp_metrics_block __rcu *tcpm_next;
+- possible_net_t tcpm_net;
++ struct net *tcpm_net;
+ struct inetpeer_addr tcpm_saddr;
+ struct inetpeer_addr tcpm_daddr;
+ unsigned long tcpm_stamp;
+@@ -51,34 +51,38 @@ struct tcp_metrics_block {
+ struct rcu_head rcu_head;
+ };
+
+-static inline struct net *tm_net(struct tcp_metrics_block *tm)
++static inline struct net *tm_net(const struct tcp_metrics_block *tm)
+ {
+- return read_pnet(&tm->tcpm_net);
++ /* Paired with the WRITE_ONCE() in tcpm_new() */
++ return READ_ONCE(tm->tcpm_net);
+ }
+
+ static bool tcp_metric_locked(struct tcp_metrics_block *tm,
+ enum tcp_metric_index idx)
+ {
+- return tm->tcpm_lock & (1 << idx);
++ /* Paired with WRITE_ONCE() in tcpm_suck_dst() */
++ return READ_ONCE(tm->tcpm_lock) & (1 << idx);
+ }
+
+-static u32 tcp_metric_get(struct tcp_metrics_block *tm,
++static u32 tcp_metric_get(const struct tcp_metrics_block *tm,
+ enum tcp_metric_index idx)
+ {
+- return tm->tcpm_vals[idx];
++ /* Paired with WRITE_ONCE() in tcp_metric_set() */
++ return READ_ONCE(tm->tcpm_vals[idx]);
+ }
+
+ static void tcp_metric_set(struct tcp_metrics_block *tm,
+ enum tcp_metric_index idx,
+ u32 val)
+ {
+- tm->tcpm_vals[idx] = val;
++ /* Paired with READ_ONCE() in tcp_metric_get() */
++ WRITE_ONCE(tm->tcpm_vals[idx], val);
+ }
+
+ static bool addr_same(const struct inetpeer_addr *a,
+ const struct inetpeer_addr *b)
+ {
+- return inetpeer_addr_cmp(a, b) == 0;
++ return (a->family == b->family) && !inetpeer_addr_cmp(a, b);
+ }
+
+ struct tcpm_hash_bucket {
+@@ -89,6 +93,7 @@ static struct tcpm_hash_bucket *tcp_metrics_hash __read_mostly;
+ static unsigned int tcp_metrics_hash_log __read_mostly;
+
+ static DEFINE_SPINLOCK(tcp_metrics_lock);
++static DEFINE_SEQLOCK(fastopen_seqlock);
+
+ static void tcpm_suck_dst(struct tcp_metrics_block *tm,
+ const struct dst_entry *dst,
+@@ -97,7 +102,7 @@ static void tcpm_suck_dst(struct tcp_metrics_block *tm,
+ u32 msval;
+ u32 val;
+
+- tm->tcpm_stamp = jiffies;
++ WRITE_ONCE(tm->tcpm_stamp, jiffies);
+
+ val = 0;
+ if (dst_metric_locked(dst, RTAX_RTT))
+@@ -110,30 +115,42 @@ static void tcpm_suck_dst(struct tcp_metrics_block *tm,
+ val |= 1 << TCP_METRIC_CWND;
+ if (dst_metric_locked(dst, RTAX_REORDERING))
+ val |= 1 << TCP_METRIC_REORDERING;
+- tm->tcpm_lock = val;
++ /* Paired with READ_ONCE() in tcp_metric_locked() */
++ WRITE_ONCE(tm->tcpm_lock, val);
+
+ msval = dst_metric_raw(dst, RTAX_RTT);
+- tm->tcpm_vals[TCP_METRIC_RTT] = msval * USEC_PER_MSEC;
++ tcp_metric_set(tm, TCP_METRIC_RTT, msval * USEC_PER_MSEC);
+
+ msval = dst_metric_raw(dst, RTAX_RTTVAR);
+- tm->tcpm_vals[TCP_METRIC_RTTVAR] = msval * USEC_PER_MSEC;
+- tm->tcpm_vals[TCP_METRIC_SSTHRESH] = dst_metric_raw(dst, RTAX_SSTHRESH);
+- tm->tcpm_vals[TCP_METRIC_CWND] = dst_metric_raw(dst, RTAX_CWND);
+- tm->tcpm_vals[TCP_METRIC_REORDERING] = dst_metric_raw(dst, RTAX_REORDERING);
++ tcp_metric_set(tm, TCP_METRIC_RTTVAR, msval * USEC_PER_MSEC);
++ tcp_metric_set(tm, TCP_METRIC_SSTHRESH,
++ dst_metric_raw(dst, RTAX_SSTHRESH));
++ tcp_metric_set(tm, TCP_METRIC_CWND,
++ dst_metric_raw(dst, RTAX_CWND));
++ tcp_metric_set(tm, TCP_METRIC_REORDERING,
++ dst_metric_raw(dst, RTAX_REORDERING));
+ if (fastopen_clear) {
++ write_seqlock(&fastopen_seqlock);
+ tm->tcpm_fastopen.mss = 0;
+ tm->tcpm_fastopen.syn_loss = 0;
+ tm->tcpm_fastopen.try_exp = 0;
+ tm->tcpm_fastopen.cookie.exp = false;
+ tm->tcpm_fastopen.cookie.len = 0;
++ write_sequnlock(&fastopen_seqlock);
+ }
+ }
+
+ #define TCP_METRICS_TIMEOUT (60 * 60 * HZ)
+
+-static void tcpm_check_stamp(struct tcp_metrics_block *tm, struct dst_entry *dst)
++static void tcpm_check_stamp(struct tcp_metrics_block *tm,
++ const struct dst_entry *dst)
+ {
+- if (tm && unlikely(time_after(jiffies, tm->tcpm_stamp + TCP_METRICS_TIMEOUT)))
++ unsigned long limit;
++
++ if (!tm)
++ return;
++ limit = READ_ONCE(tm->tcpm_stamp) + TCP_METRICS_TIMEOUT;
++ if (unlikely(time_after(jiffies, limit)))
+ tcpm_suck_dst(tm, dst, false);
+ }
+
+@@ -174,20 +191,23 @@ static struct tcp_metrics_block *tcpm_new(struct dst_entry *dst,
+ oldest = deref_locked(tcp_metrics_hash[hash].chain);
+ for (tm = deref_locked(oldest->tcpm_next); tm;
+ tm = deref_locked(tm->tcpm_next)) {
+- if (time_before(tm->tcpm_stamp, oldest->tcpm_stamp))
++ if (time_before(READ_ONCE(tm->tcpm_stamp),
++ READ_ONCE(oldest->tcpm_stamp)))
+ oldest = tm;
+ }
+ tm = oldest;
+ } else {
+- tm = kmalloc(sizeof(*tm), GFP_ATOMIC);
++ tm = kzalloc(sizeof(*tm), GFP_ATOMIC);
+ if (!tm)
+ goto out_unlock;
+ }
+- write_pnet(&tm->tcpm_net, net);
++ /* Paired with the READ_ONCE() in tm_net() */
++ WRITE_ONCE(tm->tcpm_net, net);
++
+ tm->tcpm_saddr = *saddr;
+ tm->tcpm_daddr = *daddr;
+
+- tcpm_suck_dst(tm, dst, true);
++ tcpm_suck_dst(tm, dst, reclaim);
+
+ if (likely(!reclaim)) {
+ tm->tcpm_next = tcp_metrics_hash[hash].chain;
+@@ -434,7 +454,7 @@ void tcp_update_metrics(struct sock *sk)
+ tp->reordering);
+ }
+ }
+- tm->tcpm_stamp = jiffies;
++ WRITE_ONCE(tm->tcpm_stamp, jiffies);
+ out_unlock:
+ rcu_read_unlock();
+ }
+@@ -539,8 +559,6 @@ bool tcp_peer_is_proven(struct request_sock *req, struct dst_entry *dst)
+ return ret;
+ }
+
+-static DEFINE_SEQLOCK(fastopen_seqlock);
+-
+ void tcp_fastopen_cache_get(struct sock *sk, u16 *mss,
+ struct tcp_fastopen_cookie *cookie)
+ {
+@@ -647,7 +665,7 @@ static int tcp_metrics_fill_info(struct sk_buff *msg,
+ }
+
+ if (nla_put_msecs(msg, TCP_METRICS_ATTR_AGE,
+- jiffies - tm->tcpm_stamp,
++ jiffies - READ_ONCE(tm->tcpm_stamp),
+ TCP_METRICS_ATTR_PAD) < 0)
+ goto nla_put_failure;
+
+@@ -658,7 +676,7 @@ static int tcp_metrics_fill_info(struct sk_buff *msg,
+ if (!nest)
+ goto nla_put_failure;
+ for (i = 0; i < TCP_METRIC_MAX_KERNEL + 1; i++) {
+- u32 val = tm->tcpm_vals[i];
++ u32 val = tcp_metric_get(tm, i);
+
+ if (!val)
+ continue;
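The tcp_metrics.c changes above also move fastopen_seqlock up and take it in tcpm_suck_dst(), so a reader such as tcp_fastopen_cache_get() can never observe a half-cleared fastopen entry. A standalone C11 sketch of the seqcount retry protocol behind those primitives; it is deliberately simplified — the kernel's seqlock.h additionally inserts the barriers that make the concurrent plain data accesses safe:

	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stdio.h>

	/* Toy seqcount, illustrating only the retry protocol: writers bump the
	 * counter to an odd value while updating, readers retry if they saw an
	 * odd value or the counter changed under them.
	 */
	struct toy_seqcount { atomic_uint seq; };

	static void toy_write_begin(struct toy_seqcount *s)
	{
		atomic_fetch_add(&s->seq, 1);          /* now odd: update in progress */
	}

	static void toy_write_end(struct toy_seqcount *s)
	{
		atomic_fetch_add(&s->seq, 1);          /* back to even: update done */
	}

	static unsigned int toy_read_begin(struct toy_seqcount *s)
	{
		unsigned int v;

		do {
			v = atomic_load(&s->seq);
		} while (v & 1);                       /* writer active, wait */
		return v;
	}

	static bool toy_read_retry(struct toy_seqcount *s, unsigned int start)
	{
		return atomic_load(&s->seq) != start;  /* snapshot may be torn, retry */
	}

	/* Stand-in for the fastopen metrics protected by the new write_seqlock(). */
	struct toy_fastopen { unsigned int mss; unsigned int syn_loss; };

	int main(void)
	{
		struct toy_seqcount sc = { 0 };
		struct toy_fastopen fo = { .mss = 1460, .syn_loss = 2 }, snap;
		unsigned int start;

		/* Writer: clear the cookie state atomically w.r.t. readers. */
		toy_write_begin(&sc);
		fo.mss = 0;
		fo.syn_loss = 0;
		toy_write_end(&sc);

		/* Reader: take a consistent snapshot, retrying if a writer interfered. */
		do {
			start = toy_read_begin(&sc);
			snap = fo;
		} while (toy_read_retry(&sc, start));

		printf("mss=%u syn_loss=%u\n", snap.mss, snap.syn_loss);
		return 0;
	}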
+diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
+index 4851211aa60d6..9c51ee9ccd4c0 100644
+--- a/net/ipv4/tcp_offload.c
++++ b/net/ipv4/tcp_offload.c
+@@ -9,6 +9,7 @@
+ #include <linux/indirect_call_wrapper.h>
+ #include <linux/skbuff.h>
+ #include <net/gro.h>
++#include <net/gso.h>
+ #include <net/tcp.h>
+ #include <net/protocol.h>
+
+diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
+index 9482def1f3103..6d327d6d978c5 100644
+--- a/net/ipv4/udp.c
++++ b/net/ipv4/udp.c
+@@ -103,6 +103,7 @@
+ #include <net/ip_tunnels.h>
+ #include <net/route.h>
+ #include <net/checksum.h>
++#include <net/gso.h>
+ #include <net/xfrm.h>
+ #include <trace/events/udp.h>
+ #include <linux/static_key.h>
+@@ -113,6 +114,7 @@
+ #include <net/sock_reuseport.h>
+ #include <net/addrconf.h>
+ #include <net/udp_tunnel.h>
++#include <net/gro.h>
+ #if IS_ENABLED(CONFIG_IPV6)
+ #include <net/ipv6_stubs.h>
+ #endif
+@@ -554,10 +556,13 @@ struct sock *udp4_lib_lookup_skb(const struct sk_buff *skb,
+ {
+ const struct iphdr *iph = ip_hdr(skb);
+ struct net *net = dev_net(skb->dev);
++ int iif, sdif;
++
++ inet_get_iif_sdif(skb, &iif, &sdif);
+
+ return __udp4_lib_lookup(net, iph->saddr, sport,
+- iph->daddr, dport, inet_iif(skb),
+- inet_sdif(skb), net->ipv4.udp_table, NULL);
++ iph->daddr, dport, iif,
++ sdif, net->ipv4.udp_table, NULL);
+ }
+
+ /* Must be called under rcu_read_lock().
+diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
+index 4a61832e7f69b..0f46b3c2e4ac5 100644
+--- a/net/ipv4/udp_offload.c
++++ b/net/ipv4/udp_offload.c
+@@ -8,6 +8,7 @@
+
+ #include <linux/skbuff.h>
+ #include <net/gro.h>
++#include <net/gso.h>
+ #include <net/udp.h>
+ #include <net/protocol.h>
+ #include <net/inet_common.h>
+@@ -608,10 +609,13 @@ static struct sock *udp4_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
+ {
+ const struct iphdr *iph = skb_gro_network_header(skb);
+ struct net *net = dev_net(skb->dev);
++ int iif, sdif;
++
++ inet_get_iif_sdif(skb, &iif, &sdif);
+
+ return __udp4_lib_lookup(net, iph->saddr, sport,
+- iph->daddr, dport, inet_iif(skb),
+- inet_sdif(skb), net->ipv4.udp_table, NULL);
++ iph->daddr, dport, iif,
++ sdif, net->ipv4.udp_table, NULL);
+ }
+
+ INDIRECT_CALLABLE_SCOPE
+diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c
+index 7723402689973..a189e08370a5e 100644
+--- a/net/ipv6/esp6_offload.c
++++ b/net/ipv6/esp6_offload.c
+@@ -17,6 +17,7 @@
+ #include <linux/err.h>
+ #include <linux/module.h>
+ #include <net/gro.h>
++#include <net/gso.h>
+ #include <net/ip.h>
+ #include <net/xfrm.h>
+ #include <net/esp.h>
+diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
+index 00dc2e3b01845..d6314287338da 100644
+--- a/net/ipv6/ip6_offload.c
++++ b/net/ipv6/ip6_offload.c
+@@ -16,6 +16,7 @@
+ #include <net/tcp.h>
+ #include <net/udp.h>
+ #include <net/gro.h>
++#include <net/gso.h>
+
+ #include "ip6_offload.h"
+
+diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
+index 9554cf46ed888..4a27fab1d09a3 100644
+--- a/net/ipv6/ip6_output.c
++++ b/net/ipv6/ip6_output.c
+@@ -42,6 +42,7 @@
+ #include <net/sock.h>
+ #include <net/snmp.h>
+
++#include <net/gso.h>
+ #include <net/ipv6.h>
+ #include <net/ndisc.h>
+ #include <net/protocol.h>
+diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
+index 51cf37abd142d..b4152b5d68ffb 100644
+--- a/net/ipv6/ip6mr.c
++++ b/net/ipv6/ip6mr.c
+@@ -1073,7 +1073,7 @@ static int ip6mr_cache_report(const struct mr_table *mrt, struct sk_buff *pkt,
+ And all this only to mangle msg->im6_msgtype and
+ to set msg->im6_mbz to "mbz" :-)
+ */
+- skb_push(skb, -skb_network_offset(pkt));
++ __skb_pull(skb, skb_network_offset(pkt));
+
+ skb_push(skb, sizeof(*msg));
+ skb_reset_transport_header(skb);
+diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
+index f804c11e2146c..c2c291827a2ce 100644
+--- a/net/ipv6/ping.c
++++ b/net/ipv6/ping.c
+@@ -120,7 +120,7 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+
+ ipcm6_init_sk(&ipc6, np);
+ ipc6.sockc.tsflags = sk->sk_tsflags;
+- ipc6.sockc.mark = sk->sk_mark;
++ ipc6.sockc.mark = READ_ONCE(sk->sk_mark);
+
+ fl6.flowi6_oif = oif;
+
+diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
+index 44ee7a2e72ac2..d85d2082aeb77 100644
+--- a/net/ipv6/raw.c
++++ b/net/ipv6/raw.c
+@@ -614,7 +614,7 @@ static int rawv6_send_hdrinc(struct sock *sk, struct msghdr *msg, int length,
+ skb_reserve(skb, hlen);
+
+ skb->protocol = htons(ETH_P_IPV6);
+- skb->priority = sk->sk_priority;
++ skb->priority = READ_ONCE(sk->sk_priority);
+ skb->mark = sockc->mark;
+ skb->tstamp = sockc->transmit_time;
+
+@@ -774,12 +774,12 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+ */
+ memset(&fl6, 0, sizeof(fl6));
+
+- fl6.flowi6_mark = sk->sk_mark;
++ fl6.flowi6_mark = READ_ONCE(sk->sk_mark);
+ fl6.flowi6_uid = sk->sk_uid;
+
+ ipcm6_init(&ipc6);
+ ipc6.sockc.tsflags = sk->sk_tsflags;
+- ipc6.sockc.mark = sk->sk_mark;
++ ipc6.sockc.mark = fl6.flowi6_mark;
+
+ if (sin6) {
+ if (addr_len < SIN6_LEN_RFC2133)
+diff --git a/net/ipv6/route.c b/net/ipv6/route.c
+index 392aaa373b667..d5c6be77ec1ea 100644
+--- a/net/ipv6/route.c
++++ b/net/ipv6/route.c
+@@ -2951,7 +2951,8 @@ void ip6_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, __be32 mtu)
+ if (!oif && skb->dev)
+ oif = l3mdev_master_ifindex(skb->dev);
+
+- ip6_update_pmtu(skb, sock_net(sk), mtu, oif, sk->sk_mark, sk->sk_uid);
++ ip6_update_pmtu(skb, sock_net(sk), mtu, oif, READ_ONCE(sk->sk_mark),
++ sk->sk_uid);
+
+ dst = __sk_dst_get(sk);
+ if (!dst || !dst->obsolete ||
+@@ -3172,8 +3173,8 @@ void ip6_redirect_no_header(struct sk_buff *skb, struct net *net, int oif)
+
+ void ip6_sk_redirect(struct sk_buff *skb, struct sock *sk)
+ {
+- ip6_redirect(skb, sock_net(sk), sk->sk_bound_dev_if, sk->sk_mark,
+- sk->sk_uid);
++ ip6_redirect(skb, sock_net(sk), sk->sk_bound_dev_if,
++ READ_ONCE(sk->sk_mark), sk->sk_uid);
+ }
+ EXPORT_SYMBOL_GPL(ip6_sk_redirect);
+
+diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
+index f7c248a7f8d1d..3155692a0e06b 100644
+--- a/net/ipv6/tcp_ipv6.c
++++ b/net/ipv6/tcp_ipv6.c
+@@ -568,8 +568,8 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst,
+ opt = ireq->ipv6_opt;
+ if (!opt)
+ opt = rcu_dereference(np->opt);
+- err = ip6_xmit(sk, skb, fl6, skb->mark ? : sk->sk_mark, opt,
+- tclass, sk->sk_priority);
++ err = ip6_xmit(sk, skb, fl6, skb->mark ? : READ_ONCE(sk->sk_mark),
++ opt, tclass, sk->sk_priority);
+ rcu_read_unlock();
+ err = net_xmit_eval(err);
+ }
+@@ -943,7 +943,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
+ if (sk->sk_state == TCP_TIME_WAIT)
+ mark = inet_twsk(sk)->tw_mark;
+ else
+- mark = sk->sk_mark;
++ mark = READ_ONCE(sk->sk_mark);
+ skb_set_delivery_time(buff, tcp_transmit_time(sk), true);
+ }
+ if (txhash) {
+@@ -1132,7 +1132,8 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
+ tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
+ READ_ONCE(req->ts_recent), sk->sk_bound_dev_if,
+ tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, l3index),
+- ipv6_get_dsfield(ipv6_hdr(skb)), 0, sk->sk_priority,
++ ipv6_get_dsfield(ipv6_hdr(skb)), 0,
++ READ_ONCE(sk->sk_priority),
+ READ_ONCE(tcp_rsk(req)->txhash));
+ }
+
+diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
+index d594a0425749b..8521729fb2375 100644
+--- a/net/ipv6/udp.c
++++ b/net/ipv6/udp.c
+@@ -51,6 +51,7 @@
+ #include <net/inet6_hashtables.h>
+ #include <net/busy_poll.h>
+ #include <net/sock_reuseport.h>
++#include <net/gro.h>
+
+ #include <linux/proc_fs.h>
+ #include <linux/seq_file.h>
+@@ -300,10 +301,13 @@ struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
+ {
+ const struct ipv6hdr *iph = ipv6_hdr(skb);
+ struct net *net = dev_net(skb->dev);
++ int iif, sdif;
++
++ inet6_get_iif_sdif(skb, &iif, &sdif);
+
+ return __udp6_lib_lookup(net, &iph->saddr, sport,
+- &iph->daddr, dport, inet6_iif(skb),
+- inet6_sdif(skb), net->ipv4.udp_table, NULL);
++ &iph->daddr, dport, iif,
++ sdif, net->ipv4.udp_table, NULL);
+ }
+
+ /* Must be called under rcu_read_lock().
+@@ -624,7 +628,7 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
+ if (type == NDISC_REDIRECT) {
+ if (tunnel) {
+ ip6_redirect(skb, sock_net(sk), inet6_iif(skb),
+- sk->sk_mark, sk->sk_uid);
++ READ_ONCE(sk->sk_mark), sk->sk_uid);
+ } else {
+ ip6_sk_redirect(skb, sk);
+ }
+@@ -1356,7 +1360,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+ ipcm6_init(&ipc6);
+ ipc6.gso_size = READ_ONCE(up->gso_size);
+ ipc6.sockc.tsflags = sk->sk_tsflags;
+- ipc6.sockc.mark = sk->sk_mark;
++ ipc6.sockc.mark = READ_ONCE(sk->sk_mark);
+
+ /* destination address check */
+ if (sin6) {
+diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
+index e0e10f6bcdc18..6b95ba241ebe2 100644
+--- a/net/ipv6/udp_offload.c
++++ b/net/ipv6/udp_offload.c
+@@ -14,6 +14,7 @@
+ #include <net/ip6_checksum.h>
+ #include "ip6_offload.h"
+ #include <net/gro.h>
++#include <net/gso.h>
+
+ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb,
+ netdev_features_t features)
+@@ -117,10 +118,13 @@ static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
+ {
+ const struct ipv6hdr *iph = skb_gro_network_header(skb);
+ struct net *net = dev_net(skb->dev);
++ int iif, sdif;
++
++ inet6_get_iif_sdif(skb, &iif, &sdif);
+
+ return __udp6_lib_lookup(net, &iph->saddr, sport,
+- &iph->daddr, dport, inet6_iif(skb),
+- inet6_sdif(skb), net->ipv4.udp_table, NULL);
++ &iph->daddr, dport, iif,
++ sdif, net->ipv4.udp_table, NULL);
+ }
+
+ INDIRECT_CALLABLE_SCOPE
+diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
+index 5137ea1861ce2..bce4132b0a5c8 100644
+--- a/net/l2tp/l2tp_ip6.c
++++ b/net/l2tp/l2tp_ip6.c
+@@ -519,7 +519,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+ /* Get and verify the address */
+ memset(&fl6, 0, sizeof(fl6));
+
+- fl6.flowi6_mark = sk->sk_mark;
++ fl6.flowi6_mark = READ_ONCE(sk->sk_mark);
+ fl6.flowi6_uid = sk->sk_uid;
+
+ ipcm6_init(&ipc6);
+diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
+index 13b522dab0a3d..39ca4a8fe7b32 100644
+--- a/net/mac80211/tx.c
++++ b/net/mac80211/tx.c
+@@ -26,6 +26,7 @@
+ #include <net/codel_impl.h>
+ #include <asm/unaligned.h>
+ #include <net/fq_impl.h>
++#include <net/gso.h>
+
+ #include "ieee80211_i.h"
+ #include "driver-ops.h"
+diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
+index dc5165d3eec4e..bf6e81d562631 100644
+--- a/net/mpls/af_mpls.c
++++ b/net/mpls/af_mpls.c
+@@ -12,6 +12,7 @@
+ #include <linux/nospec.h>
+ #include <linux/vmalloc.h>
+ #include <linux/percpu.h>
++#include <net/gso.h>
+ #include <net/ip.h>
+ #include <net/dst.h>
+ #include <net/sock.h>
+diff --git a/net/mpls/mpls_gso.c b/net/mpls/mpls_gso.c
+index 1482259de9b5d..533d082f0701e 100644
+--- a/net/mpls/mpls_gso.c
++++ b/net/mpls/mpls_gso.c
+@@ -14,6 +14,7 @@
+ #include <linux/netdev_features.h>
+ #include <linux/netdevice.h>
+ #include <linux/skbuff.h>
++#include <net/gso.h>
+ #include <net/mpls.h>
+
+ static struct sk_buff *mpls_gso_segment(struct sk_buff *skb,
+diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
+index d4258869ac48f..64fcfc3d5270f 100644
+--- a/net/mptcp/sockopt.c
++++ b/net/mptcp/sockopt.c
+@@ -102,7 +102,7 @@ static void mptcp_sol_socket_sync_intval(struct mptcp_sock *msk, int optname, in
+ break;
+ case SO_MARK:
+ if (READ_ONCE(ssk->sk_mark) != sk->sk_mark) {
+- ssk->sk_mark = sk->sk_mark;
++ WRITE_ONCE(ssk->sk_mark, sk->sk_mark);
+ sk_dst_reset(ssk);
+ }
+ break;
+diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c
+index 3bbaf9c7ea46a..7eba00f6c6b6a 100644
+--- a/net/netfilter/nf_flow_table_ip.c
++++ b/net/netfilter/nf_flow_table_ip.c
+@@ -8,6 +8,7 @@
+ #include <linux/ipv6.h>
+ #include <linux/netdevice.h>
+ #include <linux/if_ether.h>
++#include <net/gso.h>
+ #include <net/ip.h>
+ #include <net/ipv6.h>
+ #include <net/ip6_route.h>
+diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
+index e311462f6d98d..556bc902af00f 100644
+--- a/net/netfilter/nfnetlink_queue.c
++++ b/net/netfilter/nfnetlink_queue.c
+@@ -30,6 +30,7 @@
+ #include <linux/netfilter/nf_conntrack_common.h>
+ #include <linux/list.h>
+ #include <linux/cgroup-defs.h>
++#include <net/gso.h>
+ #include <net/sock.h>
+ #include <net/tcp_states.h>
+ #include <net/netfilter/nf_queue.h>
+diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c
+index 85f8df87efdaa..1dd336a3ce786 100644
+--- a/net/netfilter/nft_socket.c
++++ b/net/netfilter/nft_socket.c
+@@ -107,7 +107,7 @@ static void nft_socket_eval(const struct nft_expr *expr,
+ break;
+ case NFT_SOCKET_MARK:
+ if (sk_fullsock(sk)) {
+- *dest = sk->sk_mark;
++ *dest = READ_ONCE(sk->sk_mark);
+ } else {
+ regs->verdict.code = NFT_BREAK;
+ return;
+diff --git a/net/netfilter/xt_socket.c b/net/netfilter/xt_socket.c
+index 7013f55f05d1e..76e01f292aaff 100644
+--- a/net/netfilter/xt_socket.c
++++ b/net/netfilter/xt_socket.c
+@@ -77,7 +77,7 @@ socket_match(const struct sk_buff *skb, struct xt_action_param *par,
+
+ if (info->flags & XT_SOCKET_RESTORESKMARK && !wildcard &&
+ transparent && sk_fullsock(sk))
+- pskb->mark = sk->sk_mark;
++ pskb->mark = READ_ONCE(sk->sk_mark);
+
+ if (sk != skb->sk)
+ sock_gen_put(sk);
+@@ -138,7 +138,7 @@ socket_mt6_v1_v2_v3(const struct sk_buff *skb, struct xt_action_param *par)
+
+ if (info->flags & XT_SOCKET_RESTORESKMARK && !wildcard &&
+ transparent && sk_fullsock(sk))
+- pskb->mark = sk->sk_mark;
++ pskb->mark = READ_ONCE(sk->sk_mark);
+
+ if (sk != skb->sk)
+ sock_gen_put(sk);
+diff --git a/net/nsh/nsh.c b/net/nsh/nsh.c
+index 0f23e5e8e03eb..f4a38bd6a7e04 100644
+--- a/net/nsh/nsh.c
++++ b/net/nsh/nsh.c
+@@ -8,6 +8,7 @@
+ #include <linux/module.h>
+ #include <linux/netdevice.h>
+ #include <linux/skbuff.h>
++#include <net/gso.h>
+ #include <net/nsh.h>
+ #include <net/tun_proto.h>
+
+diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
+index a8cf9a88758ef..8074ea00d577e 100644
+--- a/net/openvswitch/actions.c
++++ b/net/openvswitch/actions.c
+@@ -17,6 +17,7 @@
+ #include <linux/if_vlan.h>
+
+ #include <net/dst.h>
++#include <net/gso.h>
+ #include <net/ip.h>
+ #include <net/ipv6.h>
+ #include <net/ip6_fib.h>
+diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
+index 58f530f60172a..a6d2a0b1aa21e 100644
+--- a/net/openvswitch/datapath.c
++++ b/net/openvswitch/datapath.c
+@@ -35,6 +35,7 @@
+ #include <linux/rculist.h>
+ #include <linux/dmi.h>
+ #include <net/genetlink.h>
++#include <net/gso.h>
+ #include <net/net_namespace.h>
+ #include <net/netns/generic.h>
+ #include <net/pkt_cls.h>
+diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
+index a2dbeb264f260..a753246ef1657 100644
+--- a/net/packet/af_packet.c
++++ b/net/packet/af_packet.c
+@@ -2050,8 +2050,8 @@ retry:
+
+ skb->protocol = proto;
+ skb->dev = dev;
+- skb->priority = sk->sk_priority;
+- skb->mark = sk->sk_mark;
++ skb->priority = READ_ONCE(sk->sk_priority);
++ skb->mark = READ_ONCE(sk->sk_mark);
+ skb->tstamp = sockc.transmit_time;
+
+ skb_setup_tx_timestamp(skb, sockc.tsflags);
+@@ -2585,8 +2585,8 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
+
+ skb->protocol = proto;
+ skb->dev = dev;
+- skb->priority = po->sk.sk_priority;
+- skb->mark = po->sk.sk_mark;
++ skb->priority = READ_ONCE(po->sk.sk_priority);
++ skb->mark = READ_ONCE(po->sk.sk_mark);
+ skb->tstamp = sockc->transmit_time;
+ skb_setup_tx_timestamp(skb, sockc->tsflags);
+ skb_zcopy_set_nouarg(skb, ph.raw);
+@@ -2988,7 +2988,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
+ goto out_unlock;
+
+ sockcm_init(&sockc, sk);
+- sockc.mark = sk->sk_mark;
++ sockc.mark = READ_ONCE(sk->sk_mark);
+ if (msg->msg_controllen) {
+ err = sock_cmsg_send(sk, msg, &sockc);
+ if (unlikely(err))
+@@ -3061,7 +3061,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
+
+ skb->protocol = proto;
+ skb->dev = dev;
+- skb->priority = sk->sk_priority;
++ skb->priority = READ_ONCE(sk->sk_priority);
+ skb->mark = sockc.mark;
+ skb->tstamp = sockc.transmit_time;
+
+diff --git a/net/sched/act_police.c b/net/sched/act_police.c
+index 2e9dce03d1ecc..f3121c5a85e9f 100644
+--- a/net/sched/act_police.c
++++ b/net/sched/act_police.c
+@@ -16,6 +16,7 @@
+ #include <linux/init.h>
+ #include <linux/slab.h>
+ #include <net/act_api.h>
++#include <net/gso.h>
+ #include <net/netlink.h>
+ #include <net/pkt_cls.h>
+ #include <net/tc_act/tc_police.h>
+diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c
+index 8641f80593179..c49d6af0e0480 100644
+--- a/net/sched/cls_fw.c
++++ b/net/sched/cls_fw.c
+@@ -267,7 +267,6 @@ static int fw_change(struct net *net, struct sk_buff *in_skb,
+ return -ENOBUFS;
+
+ fnew->id = f->id;
+- fnew->res = f->res;
+ fnew->ifindex = f->ifindex;
+ fnew->tp = f->tp;
+
+diff --git a/net/sched/cls_route.c b/net/sched/cls_route.c
+index d0c53724d3e86..1e20bbd687f1d 100644
+--- a/net/sched/cls_route.c
++++ b/net/sched/cls_route.c
+@@ -513,7 +513,6 @@ static int route4_change(struct net *net, struct sk_buff *in_skb,
+ if (fold) {
+ f->id = fold->id;
+ f->iif = fold->iif;
+- f->res = fold->res;
+ f->handle = fold->handle;
+
+ f->tp = fold->tp;
+diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
+index 5abf31e432caf..da4c179a4d418 100644
+--- a/net/sched/cls_u32.c
++++ b/net/sched/cls_u32.c
+@@ -826,7 +826,6 @@ static struct tc_u_knode *u32_init_knode(struct net *net, struct tcf_proto *tp,
+
+ new->ifindex = n->ifindex;
+ new->fshift = n->fshift;
+- new->res = n->res;
+ new->flags = n->flags;
+ RCU_INIT_POINTER(new->ht_down, ht);
+
+@@ -1024,18 +1023,62 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
+ return -EINVAL;
+ }
+
++ /* At this point, we need to derive the new handle that will be used to
++ * uniquely map the identity of this table match entry. The
++ * identity of the entry that we need to construct is 32 bits made of:
++ * htid(12b):bucketid(8b):node/entryid(12b)
++ *
++ * At this point _we have the table(ht)_ in which we will insert this
++ * entry. We carry the table's id in variable "htid".
++ * Note that earlier code picked the ht selection either by a) the user
++ * providing the htid specified via TCA_U32_HASH attribute or b) when
++ * no such attribute is passed then the root ht, is default to at ID
++ * 0x[800][00][000]. Rule: the root table has a single bucket with ID 0.
++ * If OTOH the user passed us the htid, they may also pass a bucketid of
++ * choice. 0 is fine. For example a user htid is 0x[600][01][000] it is
++ * indicating hash bucketid of 1. Rule: the entry/node ID _cannot_ be
++ * passed via the htid, so even if it was non-zero it will be ignored.
++ *
++ * We may also have a handle, if the user passed one. The handle also
++ * carries the same addressing of htid(12b):bucketid(8b):node/entryid(12b).
++ * Rule: the bucketid on the handle is ignored even if one was passed;
++ * rather the value on "htid" is always assumed to be the bucketid.
++ */
+ if (handle) {
++ /* Rule: The htid from handle and tableid from htid must match */
+ if (TC_U32_HTID(handle) && TC_U32_HTID(handle ^ htid)) {
+ NL_SET_ERR_MSG_MOD(extack, "Handle specified hash table address mismatch");
+ return -EINVAL;
+ }
+- handle = htid | TC_U32_NODE(handle);
+- err = idr_alloc_u32(&ht->handle_idr, NULL, &handle, handle,
+- GFP_KERNEL);
+- if (err)
+- return err;
+- } else
++ /* Ok, so far we have a valid htid(12b):bucketid(8b) but we
++ * need to finalize the table entry identification with the last
++ * part - the node/entryid(12b)). Rule: Nodeid _cannot be 0_ for
++ * entries. Rule: nodeid of 0 is reserved only for tables(see
++ * earlier code which processes TC_U32_DIVISOR attribute).
++ * Rule: The nodeid can only be derived from the handle (and not
++ * htid).
++ * Rule: if the handle specified zero for the node id example
++ * 0x60000000, then pick a new nodeid from the pool of IDs
++ * this hash table has been allocating from.
++ * If OTOH it is specified (i.e for example the user passed a
++ * handle such as 0x60000123), then we use it generate our final
++ * handle which is used to uniquely identify the match entry.
++ */
++ if (!TC_U32_NODE(handle)) {
++ handle = gen_new_kid(ht, htid);
++ } else {
++ handle = htid | TC_U32_NODE(handle);
++ err = idr_alloc_u32(&ht->handle_idr, NULL, &handle,
++ handle, GFP_KERNEL);
++ if (err)
++ return err;
++ }
++ } else {
++ /* The user did not give us a handle; lets just generate one
++ * from the table's pool of nodeids.
++ */
+ handle = gen_new_kid(ht, htid);
++ }
+
+ if (tb[TCA_U32_SEL] == NULL) {
+ NL_SET_ERR_MSG_MOD(extack, "Selector not specified");
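The comment added to u32_change() above describes the 32-bit u32 handle as htid(12b):bucketid(8b):node/entryid(12b), and the new code only calls gen_new_kid() when the node part of a user-supplied handle is zero. A small standalone C sketch of that bit layout and of the htid | node combination; the masks below are derived from the layout in the comment and are meant as illustration, not as the authoritative uapi macros:

	#include <stdio.h>

	/* 32-bit u32 handle: htid(12) : bucketid(8) : node/entry id(12). */
	#define U32_HTID(h)   ((h) & 0xFFF00000u)   /* table id, top 12 bits */
	#define U32_HASH(h)   (((h) >> 12) & 0xFFu) /* bucket id, middle 8 bits */
	#define U32_NODE(h)   ((h) & 0xFFFu)        /* entry id, low 12 bits */

	int main(void)
	{
		unsigned int htid = 0x60001000;   /* table 0x600, bucket 0x01 */
		unsigned int handle = 0x60000123; /* user-supplied handle */
		unsigned int final;

		/* Same shape as u32_change(): table and bucket come from htid,
		 * the entry id comes from the handle, or from the table's ID pool
		 * (gen_new_kid() in the kernel) when the handle's node part is 0.
		 */
		if (U32_NODE(handle))
			final = htid | U32_NODE(handle);
		else
			final = htid | 0x001; /* stand-in for gen_new_kid() */

		printf("htid=0x%03x bucket=0x%02x node=0x%03x -> handle=0x%08x\n",
		       U32_HTID(final) >> 20, U32_HASH(final), U32_NODE(final), final);
		return 0;
	}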
+diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
+index 891e007d5c0bf..9cff99558694d 100644
+--- a/net/sched/sch_cake.c
++++ b/net/sched/sch_cake.c
+@@ -65,6 +65,7 @@
+ #include <linux/reciprocal_div.h>
+ #include <net/netlink.h>
+ #include <linux/if_vlan.h>
++#include <net/gso.h>
+ #include <net/pkt_sched.h>
+ #include <net/pkt_cls.h>
+ #include <net/tcp.h>
+diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
+index b93ec2a3454eb..38d9aa0cd30e7 100644
+--- a/net/sched/sch_netem.c
++++ b/net/sched/sch_netem.c
+@@ -21,6 +21,7 @@
+ #include <linux/reciprocal_div.h>
+ #include <linux/rbtree.h>
+
++#include <net/gso.h>
+ #include <net/netlink.h>
+ #include <net/pkt_sched.h>
+ #include <net/inet_ecn.h>
+diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
+index 4caf80ddc6721..97afa244e54f5 100644
+--- a/net/sched/sch_taprio.c
++++ b/net/sched/sch_taprio.c
+@@ -20,6 +20,7 @@
+ #include <linux/spinlock.h>
+ #include <linux/rcupdate.h>
+ #include <linux/time.h>
++#include <net/gso.h>
+ #include <net/netlink.h>
+ #include <net/pkt_sched.h>
+ #include <net/pkt_cls.h>
+@@ -1012,6 +1013,11 @@ static const struct nla_policy taprio_tc_policy[TCA_TAPRIO_TC_ENTRY_MAX + 1] = {
+ TC_FP_PREEMPTIBLE),
+ };
+
++static struct netlink_range_validation_signed taprio_cycle_time_range = {
++ .min = 0,
++ .max = INT_MAX,
++};
++
+ static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
+ [TCA_TAPRIO_ATTR_PRIOMAP] = {
+ .len = sizeof(struct tc_mqprio_qopt)
+@@ -1020,7 +1026,8 @@ static const struct nla_policy taprio_policy[TCA_TAPRIO_ATTR_MAX + 1] = {
+ [TCA_TAPRIO_ATTR_SCHED_BASE_TIME] = { .type = NLA_S64 },
+ [TCA_TAPRIO_ATTR_SCHED_SINGLE_ENTRY] = { .type = NLA_NESTED },
+ [TCA_TAPRIO_ATTR_SCHED_CLOCKID] = { .type = NLA_S32 },
+- [TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME] = { .type = NLA_S64 },
++ [TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME] =
++ NLA_POLICY_FULL_RANGE_SIGNED(NLA_S64, &taprio_cycle_time_range),
+ [TCA_TAPRIO_ATTR_SCHED_CYCLE_TIME_EXTENSION] = { .type = NLA_S64 },
+ [TCA_TAPRIO_ATTR_FLAGS] = { .type = NLA_U32 },
+ [TCA_TAPRIO_ATTR_TXTIME_DELAY] = { .type = NLA_U32 },
+@@ -1156,6 +1163,11 @@ static int parse_taprio_schedule(struct taprio_sched *q, struct nlattr **tb,
+ return -EINVAL;
+ }
+
++ if (cycle < 0 || cycle > INT_MAX) {
++ NL_SET_ERR_MSG(extack, "'cycle_time' is too big");
++ return -EINVAL;
++ }
++
+ new->cycle_time = cycle;
+ }
+
+@@ -1344,7 +1356,7 @@ static void setup_txtime(struct taprio_sched *q,
+ struct sched_gate_list *sched, ktime_t base)
+ {
+ struct sched_entry *entry;
+- u32 interval = 0;
++ u64 interval = 0;
+
+ list_for_each_entry(entry, &sched->entries, list) {
+ entry->next_txtime = ktime_add_ns(base, interval);
+diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
+index 277ad11f4d613..17d2d00ddb182 100644
+--- a/net/sched/sch_tbf.c
++++ b/net/sched/sch_tbf.c
+@@ -13,6 +13,7 @@
+ #include <linux/string.h>
+ #include <linux/errno.h>
+ #include <linux/skbuff.h>
++#include <net/gso.h>
+ #include <net/netlink.h>
+ #include <net/sch_generic.h>
+ #include <net/pkt_cls.h>
+diff --git a/net/sctp/offload.c b/net/sctp/offload.c
+index eb874e3c399a5..502095173d885 100644
+--- a/net/sctp/offload.c
++++ b/net/sctp/offload.c
+@@ -22,6 +22,7 @@
+ #include <net/sctp/sctp.h>
+ #include <net/sctp/checksum.h>
+ #include <net/protocol.h>
++#include <net/gso.h>
+
+ static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
+ {
+diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
+index 538e9c6ec8c98..fa6b54c1411cb 100644
+--- a/net/smc/af_smc.c
++++ b/net/smc/af_smc.c
+@@ -445,7 +445,7 @@ static void smc_copy_sock_settings(struct sock *nsk, struct sock *osk,
+ nsk->sk_rcvbuf = osk->sk_rcvbuf;
+ nsk->sk_sndtimeo = osk->sk_sndtimeo;
+ nsk->sk_rcvtimeo = osk->sk_rcvtimeo;
+- nsk->sk_mark = osk->sk_mark;
++ nsk->sk_mark = READ_ONCE(osk->sk_mark);
+ nsk->sk_priority = osk->sk_priority;
+ nsk->sk_rcvlowat = osk->sk_rcvlowat;
+ nsk->sk_bound_dev_if = osk->sk_bound_dev_if;
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index e7728b57a8c70..10615878e3961 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -780,7 +780,7 @@ static int unix_set_peek_off(struct sock *sk, int val)
+ if (mutex_lock_interruptible(&u->iolock))
+ return -EINTR;
+
+- sk->sk_peek_off = val;
++ WRITE_ONCE(sk->sk_peek_off, val);
+ mutex_unlock(&u->iolock);
+
+ return 0;
+diff --git a/net/wireless/scan.c b/net/wireless/scan.c
+index 396c63431e1f3..e9a3b0f724f18 100644
+--- a/net/wireless/scan.c
++++ b/net/wireless/scan.c
+@@ -640,7 +640,7 @@ static int cfg80211_parse_colocated_ap(const struct cfg80211_bss_ies *ies,
+
+ ret = cfg80211_calc_short_ssid(ies, &ssid_elem, &s_ssid_tmp);
+ if (ret)
+- return ret;
++ return 0;
+
+ /* RNR IE may contain more than one NEIGHBOR_AP_INFO */
+ while (pos + sizeof(*ap_info) <= end) {
+diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
+index 32dd55b9ce8a8..35e518eaaebae 100644
+--- a/net/xdp/xsk.c
++++ b/net/xdp/xsk.c
+@@ -505,7 +505,7 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
+
+ skb->dev = dev;
+ skb->priority = xs->sk.sk_priority;
+- skb->mark = xs->sk.sk_mark;
++ skb->mark = READ_ONCE(xs->sk.sk_mark);
+ skb_shinfo(skb)->destructor_arg = (void *)(long)desc->addr;
+ skb->destructor = xsk_destruct_skb;
+
+diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
+index 2c1427074a3bb..e1c526f97ce31 100644
+--- a/net/xdp/xskmap.c
++++ b/net/xdp/xskmap.c
+@@ -5,7 +5,6 @@
+
+ #include <linux/bpf.h>
+ #include <linux/filter.h>
+-#include <linux/capability.h>
+ #include <net/xdp_sock.h>
+ #include <linux/slab.h>
+ #include <linux/sched.h>
+@@ -68,9 +67,6 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
+ int numa_node;
+ u64 size;
+
+- if (!capable(CAP_NET_ADMIN))
+- return ERR_PTR(-EPERM);
+-
+ if (attr->max_entries == 0 || attr->key_size != 4 ||
+ attr->value_size != 4 ||
+ attr->map_flags & ~(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY))
+diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
+index 408f5e55744ed..533697e2488f2 100644
+--- a/net/xfrm/xfrm_device.c
++++ b/net/xfrm/xfrm_device.c
+@@ -15,6 +15,7 @@
+ #include <linux/slab.h>
+ #include <linux/spinlock.h>
+ #include <net/dst.h>
++#include <net/gso.h>
+ #include <net/xfrm.h>
+ #include <linux/notifier.h>
+
+diff --git a/net/xfrm/xfrm_interface_core.c b/net/xfrm/xfrm_interface_core.c
+index 35279c220bd78..a3319965470a7 100644
+--- a/net/xfrm/xfrm_interface_core.c
++++ b/net/xfrm/xfrm_interface_core.c
+@@ -33,6 +33,7 @@
+ #include <linux/uaccess.h>
+ #include <linux/atomic.h>
+
++#include <net/gso.h>
+ #include <net/icmp.h>
+ #include <net/ip.h>
+ #include <net/ipv6.h>
+diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
+index 369e5de8558ff..662c83beb345e 100644
+--- a/net/xfrm/xfrm_output.c
++++ b/net/xfrm/xfrm_output.c
+@@ -13,6 +13,7 @@
+ #include <linux/slab.h>
+ #include <linux/spinlock.h>
+ #include <net/dst.h>
++#include <net/gso.h>
+ #include <net/icmp.h>
+ #include <net/inet_ecn.h>
+ #include <net/xfrm.h>
+diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
+index e7617c9959c31..d6b405782b636 100644
+--- a/net/xfrm/xfrm_policy.c
++++ b/net/xfrm/xfrm_policy.c
+@@ -2250,7 +2250,7 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(const struct sock *sk, int dir,
+
+ match = xfrm_selector_match(&pol->selector, fl, family);
+ if (match) {
+- if ((sk->sk_mark & pol->mark.m) != pol->mark.v ||
++ if ((READ_ONCE(sk->sk_mark) & pol->mark.m) != pol->mark.v ||
+ pol->if_id != if_id) {
+ pol = NULL;
+ goto out;
+diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
+index 50e7a76d54550..d17da84d3929f 100644
+--- a/rust/bindings/bindings_helper.h
++++ b/rust/bindings/bindings_helper.h
+@@ -12,5 +12,6 @@
+ #include <linux/sched.h>
+
+ /* `bindgen` gets confused at certain things. */
++const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
+ const gfp_t BINDINGS_GFP_KERNEL = GFP_KERNEL;
+ const gfp_t BINDINGS___GFP_ZERO = __GFP_ZERO;
+diff --git a/rust/kernel/allocator.rs b/rust/kernel/allocator.rs
+index 397a3dd57a9b1..9363b527be664 100644
+--- a/rust/kernel/allocator.rs
++++ b/rust/kernel/allocator.rs
+@@ -9,6 +9,36 @@ use crate::bindings;
+
+ struct KernelAllocator;
+
++/// Calls `krealloc` with a proper size to alloc a new object aligned to `new_layout`'s alignment.
++///
++/// # Safety
++///
++/// - `ptr` can be either null or a pointer which has been allocated by this allocator.
++/// - `new_layout` must have a non-zero size.
++unsafe fn krealloc_aligned(ptr: *mut u8, new_layout: Layout, flags: bindings::gfp_t) -> *mut u8 {
++ // Customized layouts from `Layout::from_size_align()` can have size < align, so pad first.
++ let layout = new_layout.pad_to_align();
++
++ let mut size = layout.size();
++
++ if layout.align() > bindings::BINDINGS_ARCH_SLAB_MINALIGN {
++ // The alignment requirement exceeds the slab guarantee, thus try to enlarge the size
++ // to use the "power-of-two" size/alignment guarantee (see comments in `kmalloc()` for
++ // more information).
++ //
++ // Note that `layout.size()` (after padding) is guaranteed to be a multiple of
++ // `layout.align()`, so `next_power_of_two` gives enough alignment guarantee.
++ size = size.next_power_of_two();
++ }
++
++ // SAFETY:
++ // - `ptr` is either null or a pointer returned from a previous `k{re}alloc()` by the
++ // function safety requirement.
++ // - `size` is greater than 0 since it's either a `layout.size()` (which cannot be zero
++ // according to the function safety requirement) or a result from `next_power_of_two()`.
++ unsafe { bindings::krealloc(ptr as *const core::ffi::c_void, size, flags) as *mut u8 }
++}
++
+ unsafe impl GlobalAlloc for KernelAllocator {
+ unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
+ // `krealloc()` is used instead of `kmalloc()` because the latter is
+@@ -30,10 +60,20 @@ static ALLOCATOR: KernelAllocator = KernelAllocator;
+ // to extract the object file that has them from the archive. For the moment,
+ // let's generate them ourselves instead.
+ //
++// Note: Although these are *safe* functions, they are called by the compiler
++// with parameters that obey the same `GlobalAlloc` function safety
++// requirements: size and align should form a valid layout, and size is
++// greater than 0.
++//
+ // Note that `#[no_mangle]` implies exported too, nowadays.
+ #[no_mangle]
+-fn __rust_alloc(size: usize, _align: usize) -> *mut u8 {
+- unsafe { bindings::krealloc(core::ptr::null(), size, bindings::GFP_KERNEL) as *mut u8 }
++fn __rust_alloc(size: usize, align: usize) -> *mut u8 {
++ // SAFETY: See assumption above.
++ let layout = unsafe { Layout::from_size_align_unchecked(size, align) };
++
++ // SAFETY: `ptr::null_mut()` is null, per assumption above the size of `layout` is greater
++ // than 0.
++ unsafe { krealloc_aligned(ptr::null_mut(), layout, bindings::GFP_KERNEL) }
+ }
+
+ #[no_mangle]
+@@ -42,23 +82,27 @@ fn __rust_dealloc(ptr: *mut u8, _size: usize, _align: usize) {
+ }
+
+ #[no_mangle]
+-fn __rust_realloc(ptr: *mut u8, _old_size: usize, _align: usize, new_size: usize) -> *mut u8 {
+- unsafe {
+- bindings::krealloc(
+- ptr as *const core::ffi::c_void,
+- new_size,
+- bindings::GFP_KERNEL,
+- ) as *mut u8
+- }
++fn __rust_realloc(ptr: *mut u8, _old_size: usize, align: usize, new_size: usize) -> *mut u8 {
++ // SAFETY: See assumption above.
++ let new_layout = unsafe { Layout::from_size_align_unchecked(new_size, align) };
++
++ // SAFETY: Per assumption above, `ptr` is allocated by `__rust_*` before, and the size of
++ // `new_layout` is greater than 0.
++ unsafe { krealloc_aligned(ptr, new_layout, bindings::GFP_KERNEL) }
+ }
+
+ #[no_mangle]
+-fn __rust_alloc_zeroed(size: usize, _align: usize) -> *mut u8 {
++fn __rust_alloc_zeroed(size: usize, align: usize) -> *mut u8 {
++ // SAFETY: See assumption above.
++ let layout = unsafe { Layout::from_size_align_unchecked(size, align) };
++
++ // SAFETY: `ptr::null_mut()` is null, per assumption above the size of `layout` is greater
++ // than 0.
+ unsafe {
+- bindings::krealloc(
+- core::ptr::null(),
+- size,
++ krealloc_aligned(
++ ptr::null_mut(),
++ layout,
+ bindings::GFP_KERNEL | bindings::__GFP_ZERO,
+- ) as *mut u8
++ )
+ }
+ }
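The new krealloc_aligned() helper in rust/kernel/allocator.rs relies on kmalloc's rule that power-of-two-sized allocations are aligned to their size: when the requested alignment exceeds ARCH_SLAB_MINALIGN, it rounds the align-padded size up to the next power of two before calling krealloc(). A standalone C sketch of just that size computation (names are illustrative):

	#include <stddef.h>
	#include <stdio.h>

	/* Round v up to the next power of two (v > 0). */
	static size_t next_pow2(size_t v)
	{
		size_t p = 1;

		while (p < v)
			p <<= 1;
		return p;
	}

	/* Size that krealloc_aligned() would pass on: pad to the alignment first
	 * (Layout::pad_to_align), then, if the alignment exceeds the slab's
	 * minimum guarantee, use the power-of-two size/alignment rule.
	 */
	static size_t aligned_alloc_size(size_t size, size_t align, size_t slab_minalign)
	{
		size_t padded = (size + align - 1) & ~(align - 1);

		if (align <= slab_minalign)
			return padded;
		return next_pow2(padded);
	}

	int main(void)
	{
		/* e.g. 24 bytes with 64-byte alignment on a slab that only guarantees 8 */
		printf("%zu\n", aligned_alloc_size(24, 64, 8));  /* -> 64 */
		printf("%zu\n", aligned_alloc_size(96, 64, 8));  /* -> 128 */
		printf("%zu\n", aligned_alloc_size(24, 8, 8));   /* -> 24 */
		return 0;
	}

The exported BINDINGS_ARCH_SLAB_MINALIGN constant in bindings_helper.h is what lets the Rust side perform this comparison against the slab minimum.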
+diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
+index ef1ed645097c6..ce0d1c7578348 100644
+--- a/tools/perf/arch/arm64/util/pmu.c
++++ b/tools/perf/arch/arm64/util/pmu.c
+@@ -56,10 +56,11 @@ double perf_pmu__cpu_slots_per_cycle(void)
+ perf_pmu__pathname_scnprintf(path, sizeof(path),
+ pmu->name, "caps/slots");
+ /*
+- * The value of slots is not greater than 32 bits, but sysfs__read_int
+- * can't read value with 0x prefix, so use sysfs__read_ull instead.
++ * The value of slots is not greater than 32 bits, but
++ * filename__read_int can't read value with 0x prefix,
++ * so use filename__read_ull instead.
+ */
+- sysfs__read_ull(path, &slots);
++ filename__read_ull(path, &slots);
+ }
+
+ return slots ? (double)slots : NAN;
+diff --git a/tools/perf/tests/shell/test_uprobe_from_different_cu.sh b/tools/perf/tests/shell/test_uprobe_from_different_cu.sh
+index 00d2e0e2e0c28..319f36ebb9a40 100644
+--- a/tools/perf/tests/shell/test_uprobe_from_different_cu.sh
++++ b/tools/perf/tests/shell/test_uprobe_from_different_cu.sh
+@@ -4,6 +4,12 @@
+
+ set -e
+
++# skip if there's no gcc
++if ! [ -x "$(command -v gcc)" ]; then
++ echo "failed: no gcc compiler"
++ exit 2
++fi
++
+ temp_dir=$(mktemp -d /tmp/perf-uprobe-different-cu-sh.XXXXXXXXXX)
+
+ cleanup()
+@@ -11,7 +17,7 @@ cleanup()
+ trap - EXIT TERM INT
+ if [[ "${temp_dir}" =~ ^/tmp/perf-uprobe-different-cu-sh.*$ ]]; then
+ echo "--- Cleaning up ---"
+- perf probe -x ${temp_dir}/testfile -d foo
++ perf probe -x ${temp_dir}/testfile -d foo || true
+ rm -f "${temp_dir}/"*
+ rmdir "${temp_dir}"
+ fi
+diff --git a/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c b/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c
+index 8383a99f610fd..0adf8d9475cb2 100644
+--- a/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c
++++ b/tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c
+@@ -171,7 +171,11 @@ static void test_unpriv_bpf_disabled_negative(struct test_unpriv_bpf_disabled *s
+ prog_insns, prog_insn_cnt, &load_opts),
+ -EPERM, "prog_load_fails");
+
+- for (i = BPF_MAP_TYPE_HASH; i <= BPF_MAP_TYPE_BLOOM_FILTER; i++)
++ /* some map types require particular correct parameters which could be
++ * sanity-checked before enforcing -EPERM, so only validate that
++ * the simple ARRAY and HASH maps are failing with -EPERM
++ */
++ for (i = BPF_MAP_TYPE_HASH; i <= BPF_MAP_TYPE_ARRAY; i++)
+ ASSERT_EQ(bpf_map_create(i, NULL, sizeof(int), sizeof(int), 1, NULL),
+ -EPERM, "map_create_fails");
+
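Editor's note: the loop above asserts that unprivileged map creation fails with -EPERM for the simple map types. A rough user-space sketch of the same call, assuming libbpf >= 1.0 is installed (build with -lbpf); the error handling mirrors the selftest's expectation that the negative errno is returned directly:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <bpf/bpf.h>

int main(void)
{
	/* With kernel.unprivileged_bpf_disabled set and no CAP_BPF or
	 * CAP_SYS_ADMIN, the selftest above expects -EPERM here. */
	int fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, NULL,
				sizeof(int), sizeof(int), 1, NULL);

	if (fd < 0)
		printf("bpf_map_create failed: %s\n", strerror(-fd));
	else
		close(fd);
	return 0;
}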
+diff --git a/tools/testing/selftests/net/so_incoming_cpu.c b/tools/testing/selftests/net/so_incoming_cpu.c
+index 0e04f9fef9867..a148181641026 100644
+--- a/tools/testing/selftests/net/so_incoming_cpu.c
++++ b/tools/testing/selftests/net/so_incoming_cpu.c
+@@ -159,7 +159,7 @@ void create_clients(struct __test_metadata *_metadata,
+ /* Make sure SYN will be processed on the i-th CPU
+ * and finally distributed to the i-th listener.
+ */
+- sched_setaffinity(0, sizeof(cpu_set), &cpu_set);
++ ret = sched_setaffinity(0, sizeof(cpu_set), &cpu_set);
+ ASSERT_EQ(ret, 0);
+
+ for (j = 0; j < CLIENT_PER_SERVER; j++) {
+diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
+index 4e4aa006004c8..a723da2532441 100644
+--- a/tools/testing/selftests/rseq/rseq.c
++++ b/tools/testing/selftests/rseq/rseq.c
+@@ -34,9 +34,17 @@
+ #include "../kselftest.h"
+ #include "rseq.h"
+
+-static const ptrdiff_t *libc_rseq_offset_p;
+-static const unsigned int *libc_rseq_size_p;
+-static const unsigned int *libc_rseq_flags_p;
++/*
++ * Define weak versions to play nice with binaries that are statically linked
++ * against a libc that doesn't support registering its own rseq.
++ */
++__weak ptrdiff_t __rseq_offset;
++__weak unsigned int __rseq_size;
++__weak unsigned int __rseq_flags;
++
++static const ptrdiff_t *libc_rseq_offset_p = &__rseq_offset;
++static const unsigned int *libc_rseq_size_p = &__rseq_size;
++static const unsigned int *libc_rseq_flags_p = &__rseq_flags;
+
+ /* Offset from the thread pointer to the rseq area. */
+ ptrdiff_t rseq_offset;
+@@ -155,9 +163,17 @@ unsigned int get_rseq_feature_size(void)
+ static __attribute__((constructor))
+ void rseq_init(void)
+ {
+- libc_rseq_offset_p = dlsym(RTLD_NEXT, "__rseq_offset");
+- libc_rseq_size_p = dlsym(RTLD_NEXT, "__rseq_size");
+- libc_rseq_flags_p = dlsym(RTLD_NEXT, "__rseq_flags");
++ /*
++ * If the libc's registered rseq size isn't already valid, it may be
++ * because the binary is dynamically linked and not necessarily due to
++ * libc not having registered a restartable sequence. Try to find the
++ * symbols if that's the case.
++ */
++ if (!*libc_rseq_size_p) {
++ libc_rseq_offset_p = dlsym(RTLD_NEXT, "__rseq_offset");
++ libc_rseq_size_p = dlsym(RTLD_NEXT, "__rseq_size");
++ libc_rseq_flags_p = dlsym(RTLD_NEXT, "__rseq_flags");
++ }
+ if (libc_rseq_size_p && libc_rseq_offset_p && libc_rseq_flags_p &&
+ *libc_rseq_size_p != 0) {
+ /* rseq registration owned by glibc */
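Editor's note: the rseq change combines two lookup strategies: weak definitions of __rseq_offset/__rseq_size/__rseq_flags so statically linked binaries resolve to local zero-initialized storage, and a dlsym(RTLD_NEXT, ...) fallback for dynamically linked binaries whose libc exports the real symbols. A stripped-down user-space sketch of that pattern (my_feature_size is a made-up symbol; link with -ldl on older glibc):

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Weak fallback: used when no other object defines the symbol. */
__attribute__((weak)) unsigned int my_feature_size;

static const unsigned int *feature_size_p = &my_feature_size;

static void resolve(void)
{
	/* If the weak copy is still zero, look further down the link
	 * chain (e.g. libc) for a strong definition. */
	if (!*feature_size_p) {
		const unsigned int *p = dlsym(RTLD_NEXT, "my_feature_size");
		if (p)
			feature_size_p = p;
	}
}

int main(void)
{
	resolve();
	printf("feature size: %u\n", *feature_size_p);
	return 0;
}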
+diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/taprio.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/taprio.json
+index a44455372646a..08d4861c2e782 100644
+--- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/taprio.json
++++ b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/taprio.json
+@@ -131,5 +131,30 @@
+ "teardown": [
+ "echo \"1\" > /sys/bus/netdevsim/del_device"
+ ]
++ },
++ {
++ "id": "3e1e",
++ "name": "Add taprio Qdisc with an invalid cycle-time",
++ "category": [
++ "qdisc",
++ "taprio"
++ ],
++ "plugins": {
++ "requires": "nsPlugin"
++ },
++ "setup": [
++ "echo \"1 1 8\" > /sys/bus/netdevsim/new_device",
++ "$TC qdisc add dev $ETH root handle 1: taprio num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@0 1@0 base-time 1000000000 sched-entry S 01 300000 flags 0x1 clockid CLOCK_TAI cycle-time 4294967296 || /bin/true",
++ "$IP link set dev $ETH up",
++ "$IP addr add 10.10.10.10/24 dev $ETH"
++ ],
++ "cmdUnderTest": "/bin/true",
++ "expExitCode": "0",
++ "verifyCmd": "$TC qdisc show dev $ETH",
++ "matchPattern": "qdisc taprio 1: root refcnt",
++ "matchCount": "0",
++ "teardown": [
++ "echo \"1\" > /sys/bus/netdevsim/del_device"
++ ]
+ }
+ ]
+diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile
+index 43a254f0e14dd..21a98ba565ab5 100644
+--- a/tools/testing/vsock/Makefile
++++ b/tools/testing/vsock/Makefile
+@@ -8,5 +8,5 @@ vsock_perf: vsock_perf.o
+ CFLAGS += -g -O2 -Werror -Wall -I. -I../../include -I../../../usr/include -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -D_GNU_SOURCE
+ .PHONY: all test clean
+ clean:
+- ${RM} *.o *.d vsock_test vsock_diag_test
++ ${RM} *.o *.d vsock_test vsock_diag_test vsock_perf
+ -include *.d
^ permalink raw reply related [flat|nested] 29+ messages in thread
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-16 17:28 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-08-16 17:28 UTC (permalink / raw)
To: gentoo-commits
commit: 6ac7d6d91eb0fb742af4fcd5f29cb68433c8a24b
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Aug 16 17:16:44 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Aug 16 17:28:24 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=6ac7d6d9
Linux patch 6.4.11
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1010_linux-6.4.11.patch | 9168 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 9172 insertions(+)
diff --git a/0000_README b/0000_README
index f63d6a30..c16c1b6b 100644
--- a/0000_README
+++ b/0000_README
@@ -83,6 +83,10 @@ Patch: 1009_linux-6.4.10.patch
From: https://www.kernel.org
Desc: Linux 6.4.10
+Patch: 1010_linux-6.4.11.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.11
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1010_linux-6.4.11.patch b/1010_linux-6.4.11.patch
new file mode 100644
index 00000000..0632e5be
--- /dev/null
+++ b/1010_linux-6.4.11.patch
@@ -0,0 +1,9168 @@
+diff --git a/Makefile b/Makefile
+index bf463afef54bf..d0efd84bb7d0f 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 10
++SUBLEVEL = 11
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c
+index 33bf3a6270027..45a920ba4921d 100644
+--- a/arch/alpha/kernel/setup.c
++++ b/arch/alpha/kernel/setup.c
+@@ -385,8 +385,7 @@ setup_memory(void *kernel_end)
+ #endif /* CONFIG_BLK_DEV_INITRD */
+ }
+
+-int __init
+-page_is_ram(unsigned long pfn)
++int page_is_ram(unsigned long pfn)
+ {
+ struct memclust_struct * cluster;
+ struct memdesc_struct * memdesc;
+diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
+index 3a2606ba3e583..62b164a0041fb 100644
+--- a/arch/arm64/kvm/arm.c
++++ b/arch/arm64/kvm/arm.c
+@@ -1800,8 +1800,6 @@ static void _kvm_arch_hardware_enable(void *discard)
+
+ int kvm_arch_hardware_enable(void)
+ {
+- int was_enabled;
+-
+ /*
+ * Most calls to this function are made with migration
+ * disabled, but not with preemption disabled. The former is
+@@ -1810,13 +1808,10 @@ int kvm_arch_hardware_enable(void)
+ */
+ preempt_disable();
+
+- was_enabled = __this_cpu_read(kvm_arm_hardware_enabled);
+ _kvm_arch_hardware_enable(NULL);
+
+- if (!was_enabled) {
+- kvm_vgic_cpu_up();
+- kvm_timer_cpu_up();
+- }
++ kvm_vgic_cpu_up();
++ kvm_timer_cpu_up();
+
+ preempt_enable();
+
+@@ -1833,10 +1828,8 @@ static void _kvm_arch_hardware_disable(void *discard)
+
+ void kvm_arch_hardware_disable(void)
+ {
+- if (__this_cpu_read(kvm_arm_hardware_enabled)) {
+- kvm_timer_cpu_down();
+- kvm_vgic_cpu_down();
+- }
++ kvm_timer_cpu_down();
++ kvm_vgic_cpu_down();
+
+ if (!is_protected_kvm_enabled())
+ _kvm_arch_hardware_disable(NULL);
+diff --git a/arch/parisc/Kconfig.debug b/arch/parisc/Kconfig.debug
+index 3a059cb5e112f..56d35315f88ed 100644
+--- a/arch/parisc/Kconfig.debug
++++ b/arch/parisc/Kconfig.debug
+@@ -2,7 +2,7 @@
+ #
+ config LIGHTWEIGHT_SPINLOCK_CHECK
+ bool "Enable lightweight spinlock checks"
+- depends on SMP && !DEBUG_SPINLOCK
++ depends on DEBUG_KERNEL && SMP && !DEBUG_SPINLOCK
+ default y
+ help
+ Add checks with low performance impact to the spinlock functions
+diff --git a/arch/parisc/include/asm/spinlock.h b/arch/parisc/include/asm/spinlock.h
+index edfcb9858bcb7..0b326e52255e1 100644
+--- a/arch/parisc/include/asm/spinlock.h
++++ b/arch/parisc/include/asm/spinlock.h
+@@ -7,8 +7,6 @@
+ #include <asm/processor.h>
+ #include <asm/spinlock_types.h>
+
+-#define SPINLOCK_BREAK_INSN 0x0000c006 /* break 6,6 */
+-
+ static inline void arch_spin_val_check(int lock_val)
+ {
+ if (IS_ENABLED(CONFIG_LIGHTWEIGHT_SPINLOCK_CHECK))
+diff --git a/arch/parisc/include/asm/spinlock_types.h b/arch/parisc/include/asm/spinlock_types.h
+index d65934079ebdb..efd06a897c6a3 100644
+--- a/arch/parisc/include/asm/spinlock_types.h
++++ b/arch/parisc/include/asm/spinlock_types.h
+@@ -4,6 +4,10 @@
+
+ #define __ARCH_SPIN_LOCK_UNLOCKED_VAL 0x1a46
+
++#define SPINLOCK_BREAK_INSN 0x0000c006 /* break 6,6 */
++
++#ifndef __ASSEMBLY__
++
+ typedef struct {
+ #ifdef CONFIG_PA20
+ volatile unsigned int slock;
+@@ -27,6 +31,8 @@ typedef struct {
+ volatile unsigned int counter;
+ } arch_rwlock_t;
+
++#endif /* __ASSEMBLY__ */
++
+ #define __ARCH_RW_LOCK_UNLOCKED__ 0x01000000
+ #define __ARCH_RW_LOCK_UNLOCKED { .lock_mutex = __ARCH_SPIN_LOCK_UNLOCKED, \
+ .counter = __ARCH_RW_LOCK_UNLOCKED__ }
+diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
+index 465b7cb9d44f4..39acccabf2ede 100644
+--- a/arch/parisc/kernel/sys_parisc.c
++++ b/arch/parisc/kernel/sys_parisc.c
+@@ -26,17 +26,12 @@
+ #include <linux/compat.h>
+
+ /*
+- * Construct an artificial page offset for the mapping based on the virtual
++ * Construct an artificial page offset for the mapping based on the physical
+ * address of the kernel file mapping variable.
+- * If filp is zero the calculated pgoff value aliases the memory of the given
+- * address. This is useful for io_uring where the mapping shall alias a kernel
+- * address and a userspace adress where both the kernel and the userspace
+- * access the same memory region.
+ */
+-#define GET_FILP_PGOFF(filp, addr) \
+- ((filp ? (((unsigned long) filp->f_mapping) >> 8) \
+- & ((SHM_COLOUR-1) >> PAGE_SHIFT) : 0UL) \
+- + (addr >> PAGE_SHIFT))
++#define GET_FILP_PGOFF(filp) \
++ (filp ? (((unsigned long) filp->f_mapping) >> 8) \
++ & ((SHM_COLOUR-1) >> PAGE_SHIFT) : 0UL)
+
+ static unsigned long shared_align_offset(unsigned long filp_pgoff,
+ unsigned long pgoff)
+@@ -116,7 +111,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
+ do_color_align = 0;
+ if (filp || (flags & MAP_SHARED))
+ do_color_align = 1;
+- filp_pgoff = GET_FILP_PGOFF(filp, addr);
++ filp_pgoff = GET_FILP_PGOFF(filp);
+
+ if (flags & MAP_FIXED) {
+ /* Even MAP_FIXED mappings must reside within TASK_SIZE */
+diff --git a/arch/parisc/kernel/syscall.S b/arch/parisc/kernel/syscall.S
+index 1373e5129868f..1f51aa9c8230c 100644
+--- a/arch/parisc/kernel/syscall.S
++++ b/arch/parisc/kernel/syscall.S
+@@ -39,6 +39,7 @@ registers).
+ #include <asm/assembly.h>
+ #include <asm/processor.h>
+ #include <asm/cache.h>
++#include <asm/spinlock_types.h>
+
+ #include <linux/linkage.h>
+
+@@ -66,6 +67,16 @@ registers).
+ stw \reg1, 0(%sr2,\reg2)
+ .endm
+
++ /* raise exception if spinlock content is not zero or
++ * __ARCH_SPIN_LOCK_UNLOCKED_VAL */
++ .macro spinlock_check spin_val,tmpreg
++#ifdef CONFIG_LIGHTWEIGHT_SPINLOCK_CHECK
++ ldi __ARCH_SPIN_LOCK_UNLOCKED_VAL, \tmpreg
++ andcm,= \spin_val, \tmpreg, %r0
++ .word SPINLOCK_BREAK_INSN
++#endif
++ .endm
++
+ .text
+
+ .import syscall_exit,code
+@@ -508,7 +519,8 @@ lws_start:
+
+ lws_exit_noerror:
+ lws_pagefault_enable %r1,%r21
+- stw,ma %r20, 0(%sr2,%r20)
++ ldi __ARCH_SPIN_LOCK_UNLOCKED_VAL, %r21
++ stw,ma %r21, 0(%sr2,%r20)
+ ssm PSW_SM_I, %r0
+ b lws_exit
+ copy %r0, %r21
+@@ -521,7 +533,8 @@ lws_wouldblock:
+
+ lws_pagefault:
+ lws_pagefault_enable %r1,%r21
+- stw,ma %r20, 0(%sr2,%r20)
++ ldi __ARCH_SPIN_LOCK_UNLOCKED_VAL, %r21
++ stw,ma %r21, 0(%sr2,%r20)
+ ssm PSW_SM_I, %r0
+ ldo 3(%r0),%r28
+ b lws_exit
+@@ -619,6 +632,7 @@ lws_compare_and_swap:
+
+ /* Try to acquire the lock */
+ LDCW 0(%sr2,%r20), %r28
++ spinlock_check %r28, %r21
+ comclr,<> %r0, %r28, %r0
+ b,n lws_wouldblock
+
+@@ -772,6 +786,7 @@ cas2_lock_start:
+
+ /* Try to acquire the lock */
+ LDCW 0(%sr2,%r20), %r28
++ spinlock_check %r28, %r21
+ comclr,<> %r0, %r28, %r0
+ b,n lws_wouldblock
+
+@@ -1001,6 +1016,7 @@ atomic_xchg_start:
+
+ /* Try to acquire the lock */
+ LDCW 0(%sr2,%r20), %r28
++ spinlock_check %r28, %r21
+ comclr,<> %r0, %r28, %r0
+ b,n lws_wouldblock
+
+@@ -1199,6 +1215,7 @@ atomic_store_start:
+
+ /* Try to acquire the lock */
+ LDCW 0(%sr2,%r20), %r28
++ spinlock_check %r28, %r21
+ comclr,<> %r0, %r28, %r0
+ b,n lws_wouldblock
+
+@@ -1330,7 +1347,7 @@ ENTRY(lws_lock_start)
+ /* lws locks */
+ .rept 256
+ /* Keep locks aligned at 16-bytes */
+- .word 1
++ .word __ARCH_SPIN_LOCK_UNLOCKED_VAL
+ .word 0
+ .word 0
+ .word 0
+diff --git a/arch/riscv/include/asm/mmio.h b/arch/riscv/include/asm/mmio.h
+index aff6c33ab0c08..4c58ee7f95ecf 100644
+--- a/arch/riscv/include/asm/mmio.h
++++ b/arch/riscv/include/asm/mmio.h
+@@ -101,9 +101,9 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
+ * Relaxed I/O memory access primitives. These follow the Device memory
+ * ordering rules but do not guarantee any ordering relative to Normal memory
+ * accesses. These are defined to order the indicated access (either a read or
+- * write) with all other I/O memory accesses. Since the platform specification
+- * defines that all I/O regions are strongly ordered on channel 2, no explicit
+- * fences are required to enforce this ordering.
++ * write) with all other I/O memory accesses to the same peripheral. Since the
++ * platform specification defines that all I/O regions are strongly ordered on
++ * channel 0, no explicit fences are required to enforce this ordering.
+ */
+ /* FIXME: These are now the same as asm-generic */
+ #define __io_rbr() do {} while (0)
+@@ -125,14 +125,14 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
+ #endif
+
+ /*
+- * I/O memory access primitives. Reads are ordered relative to any
+- * following Normal memory access. Writes are ordered relative to any prior
+- * Normal memory access. The memory barriers here are necessary as RISC-V
++ * I/O memory access primitives. Reads are ordered relative to any following
++ * Normal memory read and delay() loop. Writes are ordered relative to any
++ * prior Normal memory write. The memory barriers here are necessary as RISC-V
+ * doesn't define any ordering between the memory space and the I/O space.
+ */
+ #define __io_br() do {} while (0)
+-#define __io_ar(v) __asm__ __volatile__ ("fence i,r" : : : "memory")
+-#define __io_bw() __asm__ __volatile__ ("fence w,o" : : : "memory")
++#define __io_ar(v) ({ __asm__ __volatile__ ("fence i,ir" : : : "memory"); })
++#define __io_bw() ({ __asm__ __volatile__ ("fence w,o" : : : "memory"); })
+ #define __io_aw() mmiowb_set_pending()
+
+ #define readb(c) ({ u8 __v; __io_br(); __v = readb_cpu(c); __io_ar(__v); __v; })
+diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
+index 75970ee2bda22..b5680c940c1e9 100644
+--- a/arch/riscv/include/asm/pgtable.h
++++ b/arch/riscv/include/asm/pgtable.h
+@@ -188,6 +188,8 @@ extern struct pt_alloc_ops pt_ops __initdata;
+ #define PAGE_KERNEL_IO __pgprot(_PAGE_IOREMAP)
+
+ extern pgd_t swapper_pg_dir[];
++extern pgd_t trampoline_pg_dir[];
++extern pgd_t early_pg_dir[];
+
+ #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ static inline int pmd_present(pmd_t pmd)
+diff --git a/arch/riscv/kernel/elf_kexec.c b/arch/riscv/kernel/elf_kexec.c
+index 5372b708fae21..c08bb5c3b3857 100644
+--- a/arch/riscv/kernel/elf_kexec.c
++++ b/arch/riscv/kernel/elf_kexec.c
+@@ -281,7 +281,7 @@ static void *elf_kexec_load(struct kimage *image, char *kernel_buf,
+ kbuf.buffer = initrd;
+ kbuf.bufsz = kbuf.memsz = initrd_len;
+ kbuf.buf_align = PAGE_SIZE;
+- kbuf.top_down = false;
++ kbuf.top_down = true;
+ kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
+ ret = kexec_add_buffer(&kbuf);
+ if (ret)
+@@ -425,6 +425,7 @@ int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
+ * sym, instead of searching the whole relsec.
+ */
+ case R_RISCV_PCREL_HI20:
++ case R_RISCV_CALL_PLT:
+ case R_RISCV_CALL:
+ *(u64 *)loc = CLEAN_IMM(UITYPE, *(u64 *)loc) |
+ ENCODE_UJTYPE_IMM(val - addr);
+diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
+index 93e7bb9f67fd4..7e62b4dec650f 100644
+--- a/arch/riscv/mm/init.c
++++ b/arch/riscv/mm/init.c
+@@ -26,12 +26,13 @@
+ #include <linux/kfence.h>
+
+ #include <asm/fixmap.h>
+-#include <asm/tlbflush.h>
+-#include <asm/sections.h>
+-#include <asm/soc.h>
+ #include <asm/io.h>
+-#include <asm/ptdump.h>
+ #include <asm/numa.h>
++#include <asm/pgtable.h>
++#include <asm/ptdump.h>
++#include <asm/sections.h>
++#include <asm/soc.h>
++#include <asm/tlbflush.h>
+
+ #include "../kernel/head.h"
+
+@@ -214,8 +215,13 @@ static void __init setup_bootmem(void)
+ memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
+
+ phys_ram_end = memblock_end_of_DRAM();
++
++ /*
++ * Make sure we align the start of the memory on a PMD boundary so that
++ * at worst, we map the linear mapping with PMD mappings.
++ */
+ if (!IS_ENABLED(CONFIG_XIP_KERNEL))
+- phys_ram_base = memblock_start_of_DRAM();
++ phys_ram_base = memblock_start_of_DRAM() & PMD_MASK;
+
+ /*
+ * In 64-bit, any use of __va/__pa before this point is wrong as we
+diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
+index 8fc0efcf905c9..a01bc15dce244 100644
+--- a/arch/riscv/mm/kasan_init.c
++++ b/arch/riscv/mm/kasan_init.c
+@@ -22,7 +22,6 @@
+ * region is not and then we have to go down to the PUD level.
+ */
+
+-extern pgd_t early_pg_dir[PTRS_PER_PGD];
+ pgd_t tmp_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
+ p4d_t tmp_p4d[PTRS_PER_P4D] __page_aligned_bss;
+ pud_t tmp_pud[PTRS_PER_PUD] __page_aligned_bss;
+diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
+index 6debb816e83dc..3cdf94b414567 100644
+--- a/arch/x86/boot/compressed/idt_64.c
++++ b/arch/x86/boot/compressed/idt_64.c
+@@ -63,7 +63,14 @@ void load_stage2_idt(void)
+ set_idt_entry(X86_TRAP_PF, boot_page_fault);
+
+ #ifdef CONFIG_AMD_MEM_ENCRYPT
+- set_idt_entry(X86_TRAP_VC, boot_stage2_vc);
++ /*
++ * Clear the second stage #VC handler in case guest types
++ * needing #VC have not been detected.
++ */
++ if (sev_status & BIT(1))
++ set_idt_entry(X86_TRAP_VC, boot_stage2_vc);
++ else
++ set_idt_entry(X86_TRAP_VC, NULL);
+ #endif
+
+ load_boot_idt(&boot_idt_desc);
+diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
+index 014b89c890887..91832133c1e8b 100644
+--- a/arch/x86/boot/compressed/sev.c
++++ b/arch/x86/boot/compressed/sev.c
+@@ -352,13 +352,46 @@ void sev_enable(struct boot_params *bp)
+ if (bp)
+ bp->cc_blob_address = 0;
+
++ /*
++ * Do an initial SEV capability check before snp_init() which
++ * loads the CPUID page and the same checks afterwards are done
++ * without the hypervisor and are trustworthy.
++ *
++ * If the HV fakes SEV support, the guest will crash'n'burn
++ * which is good enough.
++ */
++
++ /* Check for the SME/SEV support leaf */
++ eax = 0x80000000;
++ ecx = 0;
++ native_cpuid(&eax, &ebx, &ecx, &edx);
++ if (eax < 0x8000001f)
++ return;
++
++ /*
++ * Check for the SME/SEV feature:
++ * CPUID Fn8000_001F[EAX]
++ * - Bit 0 - Secure Memory Encryption support
++ * - Bit 1 - Secure Encrypted Virtualization support
++ * CPUID Fn8000_001F[EBX]
++ * - Bits 5:0 - Pagetable bit position used to indicate encryption
++ */
++ eax = 0x8000001f;
++ ecx = 0;
++ native_cpuid(&eax, &ebx, &ecx, &edx);
++ /* Check whether SEV is supported */
++ if (!(eax & BIT(1)))
++ return;
++
+ /*
+ * Setup/preliminary detection of SNP. This will be sanity-checked
+ * against CPUID/MSR values later.
+ */
+ snp = snp_init(bp);
+
+- /* Check for the SME/SEV support leaf */
++ /* Now repeat the checks with the SNP CPUID table. */
++
++ /* Recheck the SME/SEV support leaf */
+ eax = 0x80000000;
+ ecx = 0;
+ native_cpuid(&eax, &ebx, &ecx, &edx);
+@@ -366,7 +399,7 @@ void sev_enable(struct boot_params *bp)
+ return;
+
+ /*
+- * Check for the SME/SEV feature:
++ * Recheck for the SME/SEV feature:
+ * CPUID Fn8000_001F[EAX]
+ * - Bit 0 - Secure Memory Encryption support
+ * - Bit 1 - Secure Encrypted Virtualization support
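Editor's note: the added pre-snp_init() check mirrors what can be probed from CPUID directly: leaf 0x8000001f must exist and EAX bit 1 must be set for SEV. A user-space sketch with GCC's cpuid.h, reporting capability bits only (it does not tell you whether the current guest is actually encrypted):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* The highest extended leaf must reach 0x8000001f. */
	if (!__get_cpuid(0x80000000, &eax, &ebx, &ecx, &edx) ||
	    eax < 0x8000001f) {
		puts("no SME/SEV capability leaf");
		return 1;
	}

	/* Fn8000_001F[EAX]: bit 0 = SME, bit 1 = SEV; EBX[5:0] is the
	 * pagetable encryption bit position, as documented in the patch. */
	__get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx);
	printf("SME: %u, SEV: %u, C-bit position: %u\n",
	       eax & 1, (eax >> 1) & 1, ebx & 0x3f);
	return 0;
}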
+diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
+index 11a5c68d12185..7645730dc228f 100644
+--- a/arch/x86/entry/vdso/vma.c
++++ b/arch/x86/entry/vdso/vma.c
+@@ -299,8 +299,8 @@ static unsigned long vdso_addr(unsigned long start, unsigned len)
+
+ /* Round the lowest possible end address up to a PMD boundary. */
+ end = (start + len + PMD_SIZE - 1) & PMD_MASK;
+- if (end >= TASK_SIZE_MAX)
+- end = TASK_SIZE_MAX;
++ if (end >= DEFAULT_MAP_WINDOW)
++ end = DEFAULT_MAP_WINDOW;
+ end -= len;
+
+ if (end > start) {
+diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
+index 8eb74cf386dbe..2888c0ee4df04 100644
+--- a/arch/x86/include/asm/acpi.h
++++ b/arch/x86/include/asm/acpi.h
+@@ -15,6 +15,7 @@
+ #include <asm/mpspec.h>
+ #include <asm/x86_init.h>
+ #include <asm/cpufeature.h>
++#include <asm/irq_vectors.h>
+
+ #ifdef CONFIG_ACPI_APEI
+ # include <asm/pgtable_types.h>
+@@ -31,6 +32,7 @@ extern int acpi_skip_timer_override;
+ extern int acpi_use_timer_override;
+ extern int acpi_fix_pin2_polarity;
+ extern int acpi_disable_cmcff;
++extern bool acpi_int_src_ovr[NR_IRQS_LEGACY];
+
+ extern u8 acpi_sci_flags;
+ extern u32 acpi_sci_override_gsi;
+diff --git a/arch/x86/include/asm/linkage.h b/arch/x86/include/asm/linkage.h
+index 0953aa32a324a..97a3de7892d3f 100644
+--- a/arch/x86/include/asm/linkage.h
++++ b/arch/x86/include/asm/linkage.h
+@@ -21,7 +21,7 @@
+ #define FUNCTION_PADDING
+ #endif
+
+-#if (CONFIG_FUNCTION_ALIGNMENT > 8) && !defined(__DISABLE_EXPORTS) && !defined(BULID_VDSO)
++#if (CONFIG_FUNCTION_ALIGNMENT > 8) && !defined(__DISABLE_EXPORTS) && !defined(BUILD_VDSO)
+ # define __FUNC_ALIGN __ALIGN; FUNCTION_PADDING
+ #else
+ # define __FUNC_ALIGN __ALIGN
+diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
+index e8db1cff76fda..54100710605f5 100644
+--- a/arch/x86/include/asm/processor.h
++++ b/arch/x86/include/asm/processor.h
+@@ -732,4 +732,6 @@ bool arch_is_platform_page(u64 paddr);
+ #define arch_is_platform_page arch_is_platform_page
+ #endif
+
++extern bool gds_ucode_mitigated(void);
++
+ #endif /* _ASM_X86_PROCESSOR_H */
+diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
+index 794f696257801..9d6411c659205 100644
+--- a/arch/x86/include/asm/segment.h
++++ b/arch/x86/include/asm/segment.h
+@@ -56,7 +56,7 @@
+
+ #define GDT_ENTRY_INVALID_SEG 0
+
+-#ifdef CONFIG_X86_32
++#if defined(CONFIG_X86_32) && !defined(BUILD_VDSO32_64)
+ /*
+ * The layout of the per-CPU GDT under Linux:
+ *
+diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
+index 21b542a6866cf..53369c57751ec 100644
+--- a/arch/x86/kernel/acpi/boot.c
++++ b/arch/x86/kernel/acpi/boot.c
+@@ -52,6 +52,7 @@ int acpi_lapic;
+ int acpi_ioapic;
+ int acpi_strict;
+ int acpi_disable_cmcff;
++bool acpi_int_src_ovr[NR_IRQS_LEGACY];
+
+ /* ACPI SCI override configuration */
+ u8 acpi_sci_flags __initdata;
+@@ -588,6 +589,9 @@ acpi_parse_int_src_ovr(union acpi_subtable_headers * header,
+
+ acpi_table_print_madt_entry(&header->common);
+
++ if (intsrc->source_irq < NR_IRQS_LEGACY)
++ acpi_int_src_ovr[intsrc->source_irq] = true;
++
+ if (intsrc->source_irq == acpi_gbl_FADT.sci_interrupt) {
+ acpi_sci_ioapic_setup(intsrc->source_irq,
+ intsrc->inti_flags & ACPI_MADT_POLARITY_MASK,
+diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
+index c37a3a5cdabd3..0b5f33cb32b59 100644
+--- a/arch/x86/kernel/cpu/amd.c
++++ b/arch/x86/kernel/cpu/amd.c
+@@ -73,6 +73,7 @@ static const int amd_erratum_1054[] =
+ static const int amd_zenbleed[] =
+ AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0x30, 0x0, 0x4f, 0xf),
+ AMD_MODEL_RANGE(0x17, 0x60, 0x0, 0x7f, 0xf),
++ AMD_MODEL_RANGE(0x17, 0x90, 0x0, 0x91, 0xf),
+ AMD_MODEL_RANGE(0x17, 0xa0, 0x0, 0xaf, 0xf));
+
+ static const int amd_div0[] =
+diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
+index 84f741b06376f..bac2e2949f01d 100644
+--- a/arch/x86/kernel/vmlinux.lds.S
++++ b/arch/x86/kernel/vmlinux.lds.S
+@@ -529,11 +529,17 @@ INIT_PER_CPU(irq_stack_backing_store);
+
+ #ifdef CONFIG_CPU_SRSO
+ /*
+- * GNU ld cannot do XOR so do: (A | B) - (A & B) in order to compute the XOR
++ * GNU ld cannot do XOR until 2.41.
++ * https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f6f78318fca803c4907fb8d7f6ded8295f1947b1
++ *
++ * LLVM lld cannot do XOR until lld-17.
++ * https://github.com/llvm/llvm-project/commit/fae96104d4378166cbe5c875ef8ed808a356f3fb
++ *
++ * Instead do: (A | B) - (A & B) in order to compute the XOR
+ * of the two function addresses:
+ */
+-. = ASSERT(((srso_untrain_ret_alias | srso_safe_ret_alias) -
+- (srso_untrain_ret_alias & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
++. = ASSERT(((ABSOLUTE(srso_untrain_ret_alias) | srso_safe_ret_alias) -
++ (ABSOLUTE(srso_untrain_ret_alias) & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
+ "SRSO function pair won't alias");
+ #endif
+
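Editor's note: the linker-script comment relies on the identity A ^ B == (A | B) - (A & B). It holds because OR keeps every bit set in either operand and AND removes the bits set in both, leaving exactly the bits set in one operand but not the other. A quick self-contained check (the addresses are arbitrary samples):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t a = 0xffffffff82000000ULL;  /* arbitrary sample addresses */
	uint64_t b = 0xffffffff82104204ULL;

	/* The linker cannot emit XOR, so it computes (A | B) - (A & B). */
	assert(((a | b) - (a & b)) == (a ^ b));
	printf("xor via or/and: %#llx\n",
	       (unsigned long long)((a | b) - (a & b)));
	return 0;
}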
+diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
+index 69ae5e1b31207..3148ed6e57789 100644
+--- a/arch/x86/kvm/svm/sev.c
++++ b/arch/x86/kvm/svm/sev.c
+@@ -2414,15 +2414,18 @@ static void sev_es_sync_from_ghcb(struct vcpu_svm *svm)
+ */
+ memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
+
+- vcpu->arch.regs[VCPU_REGS_RAX] = ghcb_get_rax_if_valid(ghcb);
+- vcpu->arch.regs[VCPU_REGS_RBX] = ghcb_get_rbx_if_valid(ghcb);
+- vcpu->arch.regs[VCPU_REGS_RCX] = ghcb_get_rcx_if_valid(ghcb);
+- vcpu->arch.regs[VCPU_REGS_RDX] = ghcb_get_rdx_if_valid(ghcb);
+- vcpu->arch.regs[VCPU_REGS_RSI] = ghcb_get_rsi_if_valid(ghcb);
++ BUILD_BUG_ON(sizeof(svm->sev_es.valid_bitmap) != sizeof(ghcb->save.valid_bitmap));
++ memcpy(&svm->sev_es.valid_bitmap, &ghcb->save.valid_bitmap, sizeof(ghcb->save.valid_bitmap));
+
+- svm->vmcb->save.cpl = ghcb_get_cpl_if_valid(ghcb);
++ vcpu->arch.regs[VCPU_REGS_RAX] = kvm_ghcb_get_rax_if_valid(svm, ghcb);
++ vcpu->arch.regs[VCPU_REGS_RBX] = kvm_ghcb_get_rbx_if_valid(svm, ghcb);
++ vcpu->arch.regs[VCPU_REGS_RCX] = kvm_ghcb_get_rcx_if_valid(svm, ghcb);
++ vcpu->arch.regs[VCPU_REGS_RDX] = kvm_ghcb_get_rdx_if_valid(svm, ghcb);
++ vcpu->arch.regs[VCPU_REGS_RSI] = kvm_ghcb_get_rsi_if_valid(svm, ghcb);
+
+- if (ghcb_xcr0_is_valid(ghcb)) {
++ svm->vmcb->save.cpl = kvm_ghcb_get_cpl_if_valid(svm, ghcb);
++
++ if (kvm_ghcb_xcr0_is_valid(svm)) {
+ vcpu->arch.xcr0 = ghcb_get_xcr0(ghcb);
+ kvm_update_cpuid_runtime(vcpu);
+ }
+@@ -2433,14 +2436,21 @@ static void sev_es_sync_from_ghcb(struct vcpu_svm *svm)
+ control->exit_code_hi = upper_32_bits(exit_code);
+ control->exit_info_1 = ghcb_get_sw_exit_info_1(ghcb);
+ control->exit_info_2 = ghcb_get_sw_exit_info_2(ghcb);
++ svm->sev_es.sw_scratch = kvm_ghcb_get_sw_scratch_if_valid(svm, ghcb);
+
+ /* Clear the valid entries fields */
+ memset(ghcb->save.valid_bitmap, 0, sizeof(ghcb->save.valid_bitmap));
+ }
+
++static u64 kvm_ghcb_get_sw_exit_code(struct vmcb_control_area *control)
++{
++ return (((u64)control->exit_code_hi) << 32) | control->exit_code;
++}
++
+ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
+ {
+- struct kvm_vcpu *vcpu;
++ struct vmcb_control_area *control = &svm->vmcb->control;
++ struct kvm_vcpu *vcpu = &svm->vcpu;
+ struct ghcb *ghcb;
+ u64 exit_code;
+ u64 reason;
+@@ -2451,7 +2461,7 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
+ * Retrieve the exit code now even though it may not be marked valid
+ * as it could help with debugging.
+ */
+- exit_code = ghcb_get_sw_exit_code(ghcb);
++ exit_code = kvm_ghcb_get_sw_exit_code(control);
+
+ /* Only GHCB Usage code 0 is supported */
+ if (ghcb->ghcb_usage) {
+@@ -2461,56 +2471,56 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
+
+ reason = GHCB_ERR_MISSING_INPUT;
+
+- if (!ghcb_sw_exit_code_is_valid(ghcb) ||
+- !ghcb_sw_exit_info_1_is_valid(ghcb) ||
+- !ghcb_sw_exit_info_2_is_valid(ghcb))
++ if (!kvm_ghcb_sw_exit_code_is_valid(svm) ||
++ !kvm_ghcb_sw_exit_info_1_is_valid(svm) ||
++ !kvm_ghcb_sw_exit_info_2_is_valid(svm))
+ goto vmgexit_err;
+
+- switch (ghcb_get_sw_exit_code(ghcb)) {
++ switch (exit_code) {
+ case SVM_EXIT_READ_DR7:
+ break;
+ case SVM_EXIT_WRITE_DR7:
+- if (!ghcb_rax_is_valid(ghcb))
++ if (!kvm_ghcb_rax_is_valid(svm))
+ goto vmgexit_err;
+ break;
+ case SVM_EXIT_RDTSC:
+ break;
+ case SVM_EXIT_RDPMC:
+- if (!ghcb_rcx_is_valid(ghcb))
++ if (!kvm_ghcb_rcx_is_valid(svm))
+ goto vmgexit_err;
+ break;
+ case SVM_EXIT_CPUID:
+- if (!ghcb_rax_is_valid(ghcb) ||
+- !ghcb_rcx_is_valid(ghcb))
++ if (!kvm_ghcb_rax_is_valid(svm) ||
++ !kvm_ghcb_rcx_is_valid(svm))
+ goto vmgexit_err;
+- if (ghcb_get_rax(ghcb) == 0xd)
+- if (!ghcb_xcr0_is_valid(ghcb))
++ if (vcpu->arch.regs[VCPU_REGS_RAX] == 0xd)
++ if (!kvm_ghcb_xcr0_is_valid(svm))
+ goto vmgexit_err;
+ break;
+ case SVM_EXIT_INVD:
+ break;
+ case SVM_EXIT_IOIO:
+- if (ghcb_get_sw_exit_info_1(ghcb) & SVM_IOIO_STR_MASK) {
+- if (!ghcb_sw_scratch_is_valid(ghcb))
++ if (control->exit_info_1 & SVM_IOIO_STR_MASK) {
++ if (!kvm_ghcb_sw_scratch_is_valid(svm))
+ goto vmgexit_err;
+ } else {
+- if (!(ghcb_get_sw_exit_info_1(ghcb) & SVM_IOIO_TYPE_MASK))
+- if (!ghcb_rax_is_valid(ghcb))
++ if (!(control->exit_info_1 & SVM_IOIO_TYPE_MASK))
++ if (!kvm_ghcb_rax_is_valid(svm))
+ goto vmgexit_err;
+ }
+ break;
+ case SVM_EXIT_MSR:
+- if (!ghcb_rcx_is_valid(ghcb))
++ if (!kvm_ghcb_rcx_is_valid(svm))
+ goto vmgexit_err;
+- if (ghcb_get_sw_exit_info_1(ghcb)) {
+- if (!ghcb_rax_is_valid(ghcb) ||
+- !ghcb_rdx_is_valid(ghcb))
++ if (control->exit_info_1) {
++ if (!kvm_ghcb_rax_is_valid(svm) ||
++ !kvm_ghcb_rdx_is_valid(svm))
+ goto vmgexit_err;
+ }
+ break;
+ case SVM_EXIT_VMMCALL:
+- if (!ghcb_rax_is_valid(ghcb) ||
+- !ghcb_cpl_is_valid(ghcb))
++ if (!kvm_ghcb_rax_is_valid(svm) ||
++ !kvm_ghcb_cpl_is_valid(svm))
+ goto vmgexit_err;
+ break;
+ case SVM_EXIT_RDTSCP:
+@@ -2518,19 +2528,19 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
+ case SVM_EXIT_WBINVD:
+ break;
+ case SVM_EXIT_MONITOR:
+- if (!ghcb_rax_is_valid(ghcb) ||
+- !ghcb_rcx_is_valid(ghcb) ||
+- !ghcb_rdx_is_valid(ghcb))
++ if (!kvm_ghcb_rax_is_valid(svm) ||
++ !kvm_ghcb_rcx_is_valid(svm) ||
++ !kvm_ghcb_rdx_is_valid(svm))
+ goto vmgexit_err;
+ break;
+ case SVM_EXIT_MWAIT:
+- if (!ghcb_rax_is_valid(ghcb) ||
+- !ghcb_rcx_is_valid(ghcb))
++ if (!kvm_ghcb_rax_is_valid(svm) ||
++ !kvm_ghcb_rcx_is_valid(svm))
+ goto vmgexit_err;
+ break;
+ case SVM_VMGEXIT_MMIO_READ:
+ case SVM_VMGEXIT_MMIO_WRITE:
+- if (!ghcb_sw_scratch_is_valid(ghcb))
++ if (!kvm_ghcb_sw_scratch_is_valid(svm))
+ goto vmgexit_err;
+ break;
+ case SVM_VMGEXIT_NMI_COMPLETE:
+@@ -2546,8 +2556,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
+ return 0;
+
+ vmgexit_err:
+- vcpu = &svm->vcpu;
+-
+ if (reason == GHCB_ERR_INVALID_USAGE) {
+ vcpu_unimpl(vcpu, "vmgexit: ghcb usage %#x is not valid\n",
+ ghcb->ghcb_usage);
+@@ -2560,9 +2568,6 @@ vmgexit_err:
+ dump_ghcb(svm);
+ }
+
+- /* Clear the valid entries fields */
+- memset(ghcb->save.valid_bitmap, 0, sizeof(ghcb->save.valid_bitmap));
+-
+ ghcb_set_sw_exit_info_1(ghcb, 2);
+ ghcb_set_sw_exit_info_2(ghcb, reason);
+
+@@ -2583,7 +2588,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
+ */
+ if (svm->sev_es.ghcb_sa_sync) {
+ kvm_write_guest(svm->vcpu.kvm,
+- ghcb_get_sw_scratch(svm->sev_es.ghcb),
++ svm->sev_es.sw_scratch,
+ svm->sev_es.ghcb_sa,
+ svm->sev_es.ghcb_sa_len);
+ svm->sev_es.ghcb_sa_sync = false;
+@@ -2634,7 +2639,7 @@ static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
+ u64 scratch_gpa_beg, scratch_gpa_end;
+ void *scratch_va;
+
+- scratch_gpa_beg = ghcb_get_sw_scratch(ghcb);
++ scratch_gpa_beg = svm->sev_es.sw_scratch;
+ if (!scratch_gpa_beg) {
+ pr_err("vmgexit: scratch gpa not provided\n");
+ goto e_scratch;
+@@ -2848,16 +2853,15 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
+
+ trace_kvm_vmgexit_enter(vcpu->vcpu_id, ghcb);
+
+- exit_code = ghcb_get_sw_exit_code(ghcb);
+-
++ sev_es_sync_from_ghcb(svm);
+ ret = sev_es_validate_vmgexit(svm);
+ if (ret)
+ return ret;
+
+- sev_es_sync_from_ghcb(svm);
+ ghcb_set_sw_exit_info_1(ghcb, 0);
+ ghcb_set_sw_exit_info_2(ghcb, 0);
+
++ exit_code = kvm_ghcb_get_sw_exit_code(control);
+ switch (exit_code) {
+ case SVM_VMGEXIT_MMIO_READ:
+ ret = setup_vmgexit_scratch(svm, true, control->exit_info_2);
+diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
+index f44751dd8d5d9..ece0d5959567a 100644
+--- a/arch/x86/kvm/svm/svm.h
++++ b/arch/x86/kvm/svm/svm.h
+@@ -190,10 +190,12 @@ struct vcpu_sev_es_state {
+ /* SEV-ES support */
+ struct sev_es_save_area *vmsa;
+ struct ghcb *ghcb;
++ u8 valid_bitmap[16];
+ struct kvm_host_map ghcb_map;
+ bool received_first_sipi;
+
+ /* SEV-ES scratch area support */
++ u64 sw_scratch;
+ void *ghcb_sa;
+ u32 ghcb_sa_len;
+ bool ghcb_sa_sync;
+@@ -745,4 +747,28 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
+ void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
+ void __svm_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
+
++#define DEFINE_KVM_GHCB_ACCESSORS(field) \
++ static __always_inline bool kvm_ghcb_##field##_is_valid(const struct vcpu_svm *svm) \
++ { \
++ return test_bit(GHCB_BITMAP_IDX(field), \
++ (unsigned long *)&svm->sev_es.valid_bitmap); \
++ } \
++ \
++ static __always_inline u64 kvm_ghcb_get_##field##_if_valid(struct vcpu_svm *svm, struct ghcb *ghcb) \
++ { \
++ return kvm_ghcb_##field##_is_valid(svm) ? ghcb->save.field : 0; \
++ } \
++
++DEFINE_KVM_GHCB_ACCESSORS(cpl)
++DEFINE_KVM_GHCB_ACCESSORS(rax)
++DEFINE_KVM_GHCB_ACCESSORS(rcx)
++DEFINE_KVM_GHCB_ACCESSORS(rdx)
++DEFINE_KVM_GHCB_ACCESSORS(rbx)
++DEFINE_KVM_GHCB_ACCESSORS(rsi)
++DEFINE_KVM_GHCB_ACCESSORS(sw_exit_code)
++DEFINE_KVM_GHCB_ACCESSORS(sw_exit_info_1)
++DEFINE_KVM_GHCB_ACCESSORS(sw_exit_info_2)
++DEFINE_KVM_GHCB_ACCESSORS(sw_scratch)
++DEFINE_KVM_GHCB_ACCESSORS(xcr0)
++
+ #endif
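Editor's note: for reference, DEFINE_KVM_GHCB_ACCESSORS(rax) above expands (modulo whitespace) to a validity test against the cached bitmap plus a guarded getter. A hand-expanded sketch of that single instantiation:

/* Hand-expanded form of DEFINE_KVM_GHCB_ACCESSORS(rax) from svm.h above. */
static __always_inline bool kvm_ghcb_rax_is_valid(const struct vcpu_svm *svm)
{
	return test_bit(GHCB_BITMAP_IDX(rax),
			(unsigned long *)&svm->sev_es.valid_bitmap);
}

static __always_inline u64 kvm_ghcb_get_rax_if_valid(struct vcpu_svm *svm,
						     struct ghcb *ghcb)
{
	/* Read the register from the GHCB only if the guest marked it valid. */
	return kvm_ghcb_rax_is_valid(svm) ? ghcb->save.rax : 0;
}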
+diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
+index a96f0f775ae27..7c9f3b1b42bad 100644
+--- a/arch/x86/kvm/x86.c
++++ b/arch/x86/kvm/x86.c
+@@ -314,8 +314,6 @@ u64 __read_mostly host_xcr0;
+
+ static struct kmem_cache *x86_emulator_cache;
+
+-extern bool gds_ucode_mitigated(void);
+-
+ /*
+ * When called, it means the previous get/set msr reached an invalid msr.
+ * Return true if we want to ignore/silent this failed msr access.
+diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c
+index 52b339aefadca..9967fcfa27eca 100644
+--- a/drivers/accel/ivpu/ivpu_gem.c
++++ b/drivers/accel/ivpu/ivpu_gem.c
+@@ -173,6 +173,9 @@ static void internal_free_pages_locked(struct ivpu_bo *bo)
+ {
+ unsigned int i, npages = bo->base.size >> PAGE_SHIFT;
+
++ if (ivpu_bo_cache_mode(bo) != DRM_IVPU_BO_CACHED)
++ set_pages_array_wb(bo->pages, bo->base.size >> PAGE_SHIFT);
++
+ for (i = 0; i < npages; i++)
+ put_page(bo->pages[i]);
+
+@@ -587,6 +590,11 @@ ivpu_bo_alloc_internal(struct ivpu_device *vdev, u64 vpu_addr, u64 size, u32 fla
+ if (ivpu_bo_cache_mode(bo) != DRM_IVPU_BO_CACHED)
+ drm_clflush_pages(bo->pages, bo->base.size >> PAGE_SHIFT);
+
++ if (bo->flags & DRM_IVPU_BO_WC)
++ set_pages_array_wc(bo->pages, bo->base.size >> PAGE_SHIFT);
++ else if (bo->flags & DRM_IVPU_BO_UNCACHED)
++ set_pages_array_uc(bo->pages, bo->base.size >> PAGE_SHIFT);
++
+ prot = ivpu_bo_pgprot(bo, PAGE_KERNEL);
+ bo->kvaddr = vmap(bo->pages, bo->base.size >> PAGE_SHIFT, VM_MAP, prot);
+ if (!bo->kvaddr) {
+diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
+index 1dd8d5aebf678..a4d9f149b48d7 100644
+--- a/drivers/acpi/resource.c
++++ b/drivers/acpi/resource.c
+@@ -470,6 +470,45 @@ static const struct dmi_system_id asus_laptop[] = {
+ { }
+ };
+
++static const struct dmi_system_id tongfang_gm_rg[] = {
++ {
++ .ident = "TongFang GMxRGxx/XMG CORE 15 (M22)/TUXEDO Stellaris 15 Gen4 AMD",
++ .matches = {
++ DMI_MATCH(DMI_BOARD_NAME, "GMxRGxx"),
++ },
++ },
++ { }
++};
++
++static const struct dmi_system_id maingear_laptop[] = {
++ {
++ .ident = "MAINGEAR Vector Pro 2 15",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Micro Electronics Inc"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "MG-VCP2-15A3070T"),
++ }
++ },
++ {
++ .ident = "MAINGEAR Vector Pro 2 17",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Micro Electronics Inc"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "MG-VCP2-17A3070T"),
++ },
++ },
++ { }
++};
++
++static const struct dmi_system_id pcspecialist_laptop[] = {
++ {
++ .ident = "PCSpecialist Elimina Pro 16 M",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "PCSpecialist"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Elimina Pro 16 M"),
++ },
++ },
++ { }
++};
++
+ static const struct dmi_system_id lg_laptop[] = {
+ {
+ .ident = "LG Electronics 17U70P",
+@@ -493,6 +532,9 @@ struct irq_override_cmp {
+ static const struct irq_override_cmp override_table[] = {
+ { medion_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false },
+ { asus_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false },
++ { tongfang_gm_rg, 1, ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_LOW, 1, true },
++ { maingear_laptop, 1, ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_LOW, 1, true },
++ { pcspecialist_laptop, 1, ACPI_EDGE_SENSITIVE, ACPI_ACTIVE_LOW, 1, true },
+ { lg_laptop, 1, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW, 0, false },
+ };
+
+@@ -512,6 +554,28 @@ static bool acpi_dev_irq_override(u32 gsi, u8 triggering, u8 polarity,
+ return entry->override;
+ }
+
++#ifdef CONFIG_X86
++ /*
++ * Always use the MADT override info, except for the i8042 PS/2 ctrl
++ * IRQs (1 and 12). For these the DSDT IRQ settings should sometimes
++ * be used otherwise PS/2 keyboards / mice will not work.
++ */
++ if (gsi != 1 && gsi != 12)
++ return true;
++
++ /* If the override comes from an INT_SRC_OVR MADT entry, honor it. */
++ if (acpi_int_src_ovr[gsi])
++ return true;
++
++ /*
++ * IRQ override isn't needed on modern AMD Zen systems and
++ * this override breaks active low IRQs on AMD Ryzen 6000 and
++ * newer systems. Skip it.
++ */
++ if (boot_cpu_has(X86_FEATURE_ZEN))
++ return false;
++#endif
++
+ return true;
+ }
+
+diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
+index 0c6f06abe3f47..c28c8d5ca0c8d 100644
+--- a/drivers/acpi/scan.c
++++ b/drivers/acpi/scan.c
+@@ -1712,6 +1712,7 @@ static bool acpi_device_enumeration_by_parent(struct acpi_device *device)
+ {"BSG1160", },
+ {"BSG2150", },
+ {"CSC3551", },
++ {"CSC3556", },
+ {"INT33FE", },
+ {"INT3515", },
+ /* Non-conforming _HID for Cirrus Logic already released */
+diff --git a/drivers/android/binder.c b/drivers/android/binder.c
+index 8fb7672021ee2..16ec3cb143c36 100644
+--- a/drivers/android/binder.c
++++ b/drivers/android/binder.c
+@@ -6610,6 +6610,7 @@ err_init_binder_device_failed:
+
+ err_alloc_device_names_failed:
+ debugfs_remove_recursive(binder_debugfs_dir_entry_root);
++ binder_alloc_shrinker_exit();
+
+ return ret;
+ }
+diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
+index 662a2a2e2e84a..e3db8297095a2 100644
+--- a/drivers/android/binder_alloc.c
++++ b/drivers/android/binder_alloc.c
+@@ -1087,6 +1087,12 @@ int binder_alloc_shrinker_init(void)
+ return ret;
+ }
+
++void binder_alloc_shrinker_exit(void)
++{
++ unregister_shrinker(&binder_shrinker);
++ list_lru_destroy(&binder_alloc_lru);
++}
++
+ /**
+ * check_buffer() - verify that buffer/offset is safe to access
+ * @alloc: binder_alloc for this proc
+diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h
+index 138d1d5af9ce3..dc1e2b01dd64d 100644
+--- a/drivers/android/binder_alloc.h
++++ b/drivers/android/binder_alloc.h
+@@ -129,6 +129,7 @@ extern struct binder_buffer *binder_alloc_new_buf(struct binder_alloc *alloc,
+ int pid);
+ extern void binder_alloc_init(struct binder_alloc *alloc);
+ extern int binder_alloc_shrinker_init(void);
++extern void binder_alloc_shrinker_exit(void);
+ extern void binder_alloc_vma_close(struct binder_alloc *alloc);
+ extern struct binder_buffer *
+ binder_alloc_prepare_to_free(struct binder_alloc *alloc,
+diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
+index f6d90f1ba5cf7..7d00911f6c5a0 100644
+--- a/drivers/block/zram/zram_drv.c
++++ b/drivers/block/zram/zram_drv.c
+@@ -1870,15 +1870,16 @@ static void zram_bio_discard(struct zram *zram, struct bio *bio)
+
+ static void zram_bio_read(struct zram *zram, struct bio *bio)
+ {
+- struct bvec_iter iter;
+- struct bio_vec bv;
+- unsigned long start_time;
++ unsigned long start_time = bio_start_io_acct(bio);
++ struct bvec_iter iter = bio->bi_iter;
+
+- start_time = bio_start_io_acct(bio);
+- bio_for_each_segment(bv, bio, iter) {
++ do {
+ u32 index = iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
+ u32 offset = (iter.bi_sector & (SECTORS_PER_PAGE - 1)) <<
+ SECTOR_SHIFT;
++ struct bio_vec bv = bio_iter_iovec(bio, iter);
++
++ bv.bv_len = min_t(u32, bv.bv_len, PAGE_SIZE - offset);
+
+ if (zram_bvec_read(zram, &bv, index, offset, bio) < 0) {
+ atomic64_inc(&zram->stats.failed_reads);
+@@ -1890,22 +1891,26 @@ static void zram_bio_read(struct zram *zram, struct bio *bio)
+ zram_slot_lock(zram, index);
+ zram_accessed(zram, index);
+ zram_slot_unlock(zram, index);
+- }
++
++ bio_advance_iter_single(bio, &iter, bv.bv_len);
++ } while (iter.bi_size);
++
+ bio_end_io_acct(bio, start_time);
+ bio_endio(bio);
+ }
+
+ static void zram_bio_write(struct zram *zram, struct bio *bio)
+ {
+- struct bvec_iter iter;
+- struct bio_vec bv;
+- unsigned long start_time;
++ unsigned long start_time = bio_start_io_acct(bio);
++ struct bvec_iter iter = bio->bi_iter;
+
+- start_time = bio_start_io_acct(bio);
+- bio_for_each_segment(bv, bio, iter) {
++ do {
+ u32 index = iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
+ u32 offset = (iter.bi_sector & (SECTORS_PER_PAGE - 1)) <<
+ SECTOR_SHIFT;
++ struct bio_vec bv = bio_iter_iovec(bio, iter);
++
++ bv.bv_len = min_t(u32, bv.bv_len, PAGE_SIZE - offset);
+
+ if (zram_bvec_write(zram, &bv, index, offset, bio) < 0) {
+ atomic64_inc(&zram->stats.failed_writes);
+@@ -1916,7 +1921,10 @@ static void zram_bio_write(struct zram *zram, struct bio *bio)
+ zram_slot_lock(zram, index);
+ zram_accessed(zram, index);
+ zram_slot_unlock(zram, index);
+- }
++
++ bio_advance_iter_single(bio, &iter, bv.bv_len);
++ } while (iter.bi_size);
++
+ bio_end_io_acct(bio, start_time);
+ bio_endio(bio);
+ }
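Editor's note: the zram loops above clamp each segment with min_t(u32, bv.bv_len, PAGE_SIZE - offset) so no chunk crosses a page boundary. A tiny user-space sketch of the same arithmetic, splitting a byte range into per-page pieces:

#include <stdio.h>

#define PAGE_SIZE 4096u

/* Walk [pos, pos + len) in chunks that never cross a page boundary,
 * mirroring the bv_len clamp in the zram loops above. */
static void walk(unsigned int pos, unsigned int len)
{
	while (len) {
		unsigned int offset = pos % PAGE_SIZE;
		unsigned int chunk = PAGE_SIZE - offset;

		if (chunk > len)
			chunk = len;
		printf("page %u, offset %u, len %u\n",
		       pos / PAGE_SIZE, offset, chunk);
		pos += chunk;
		len -= chunk;
	}
}

int main(void)
{
	walk(4000, 1000);  /* crosses one page boundary: 96 + 904 bytes */
	return 0;
}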
+diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
+index cf5499e51999b..ea6b4013bc38f 100644
+--- a/drivers/char/tpm/tpm-chip.c
++++ b/drivers/char/tpm/tpm-chip.c
+@@ -510,70 +510,6 @@ static int tpm_add_legacy_sysfs(struct tpm_chip *chip)
+ return 0;
+ }
+
+-/*
+- * Some AMD fTPM versions may cause stutter
+- * https://www.amd.com/en/support/kb/faq/pa-410
+- *
+- * Fixes are available in two series of fTPM firmware:
+- * 6.x.y.z series: 6.0.18.6 +
+- * 3.x.y.z series: 3.57.y.5 +
+- */
+-#ifdef CONFIG_X86
+-static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
+-{
+- u32 val1, val2;
+- u64 version;
+- int ret;
+-
+- if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
+- return false;
+-
+- ret = tpm_request_locality(chip);
+- if (ret)
+- return false;
+-
+- ret = tpm2_get_tpm_pt(chip, TPM2_PT_MANUFACTURER, &val1, NULL);
+- if (ret)
+- goto release;
+- if (val1 != 0x414D4400U /* AMD */) {
+- ret = -ENODEV;
+- goto release;
+- }
+- ret = tpm2_get_tpm_pt(chip, TPM2_PT_FIRMWARE_VERSION_1, &val1, NULL);
+- if (ret)
+- goto release;
+- ret = tpm2_get_tpm_pt(chip, TPM2_PT_FIRMWARE_VERSION_2, &val2, NULL);
+-
+-release:
+- tpm_relinquish_locality(chip);
+-
+- if (ret)
+- return false;
+-
+- version = ((u64)val1 << 32) | val2;
+- if ((version >> 48) == 6) {
+- if (version >= 0x0006000000180006ULL)
+- return false;
+- } else if ((version >> 48) == 3) {
+- if (version >= 0x0003005700000005ULL)
+- return false;
+- } else {
+- return false;
+- }
+-
+- dev_warn(&chip->dev,
+- "AMD fTPM version 0x%llx causes system stutter; hwrng disabled\n",
+- version);
+-
+- return true;
+-}
+-#else
+-static inline bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
+-{
+- return false;
+-}
+-#endif /* CONFIG_X86 */
+-
+ static int tpm_hwrng_read(struct hwrng *rng, void *data, size_t max, bool wait)
+ {
+ struct tpm_chip *chip = container_of(rng, struct tpm_chip, hwrng);
+@@ -585,10 +521,20 @@ static int tpm_hwrng_read(struct hwrng *rng, void *data, size_t max, bool wait)
+ return tpm_get_random(chip, data, max);
+ }
+
++static bool tpm_is_hwrng_enabled(struct tpm_chip *chip)
++{
++ if (!IS_ENABLED(CONFIG_HW_RANDOM_TPM))
++ return false;
++ if (tpm_is_firmware_upgrade(chip))
++ return false;
++ if (chip->flags & TPM_CHIP_FLAG_HWRNG_DISABLED)
++ return false;
++ return true;
++}
++
+ static int tpm_add_hwrng(struct tpm_chip *chip)
+ {
+- if (!IS_ENABLED(CONFIG_HW_RANDOM_TPM) || tpm_is_firmware_upgrade(chip) ||
+- tpm_amd_is_rng_defective(chip))
++ if (!tpm_is_hwrng_enabled(chip))
+ return 0;
+
+ snprintf(chip->hwrng_name, sizeof(chip->hwrng_name),
+@@ -693,7 +639,7 @@ int tpm_chip_register(struct tpm_chip *chip)
+ return 0;
+
+ out_hwrng:
+- if (IS_ENABLED(CONFIG_HW_RANDOM_TPM) && !tpm_is_firmware_upgrade(chip))
++ if (tpm_is_hwrng_enabled(chip))
+ hwrng_unregister(&chip->hwrng);
+ out_ppi:
+ tpm_bios_log_teardown(chip);
+@@ -718,8 +664,7 @@ EXPORT_SYMBOL_GPL(tpm_chip_register);
+ void tpm_chip_unregister(struct tpm_chip *chip)
+ {
+ tpm_del_legacy_sysfs(chip);
+- if (IS_ENABLED(CONFIG_HW_RANDOM_TPM) && !tpm_is_firmware_upgrade(chip) &&
+- !tpm_amd_is_rng_defective(chip))
++ if (tpm_is_hwrng_enabled(chip))
+ hwrng_unregister(&chip->hwrng);
+ tpm_bios_log_teardown(chip);
+ if (chip->flags & TPM_CHIP_FLAG_TPM2 && !tpm_is_firmware_upgrade(chip))
+diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
+index 1a5d09b185134..9eb1a18590123 100644
+--- a/drivers/char/tpm/tpm_crb.c
++++ b/drivers/char/tpm/tpm_crb.c
+@@ -463,6 +463,28 @@ static bool crb_req_canceled(struct tpm_chip *chip, u8 status)
+ return (cancel & CRB_CANCEL_INVOKE) == CRB_CANCEL_INVOKE;
+ }
+
++static int crb_check_flags(struct tpm_chip *chip)
++{
++ u32 val;
++ int ret;
++
++ ret = crb_request_locality(chip, 0);
++ if (ret)
++ return ret;
++
++ ret = tpm2_get_tpm_pt(chip, TPM2_PT_MANUFACTURER, &val, NULL);
++ if (ret)
++ goto release;
++
++ if (val == 0x414D4400U /* AMD */)
++ chip->flags |= TPM_CHIP_FLAG_HWRNG_DISABLED;
++
++release:
++ crb_relinquish_locality(chip, 0);
++
++ return ret;
++}
++
+ static const struct tpm_class_ops tpm_crb = {
+ .flags = TPM_OPS_AUTO_STARTUP,
+ .status = crb_status,
+@@ -800,6 +822,14 @@ static int crb_acpi_add(struct acpi_device *device)
+ chip->acpi_dev_handle = device->handle;
+ chip->flags = TPM_CHIP_FLAG_TPM2;
+
++ rc = tpm_chip_bootstrap(chip);
++ if (rc)
++ goto out;
++
++ rc = crb_check_flags(chip);
++ if (rc)
++ goto out;
++
+ rc = tpm_chip_register(chip);
+
+ out:
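Editor's note: crb_check_flags() keys off TPM2_PT_MANUFACTURER returning 0x414D4400, which is simply the ASCII vendor ID "AMD" packed into the upper bytes of the 32-bit property. A short decode:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t manufacturer = 0x414D4400u;  /* value tested in crb_check_flags() */
	char vendor[5];

	/* The manufacturer property packs an ASCII vendor ID, high byte first. */
	for (int i = 0; i < 4; i++)
		vendor[i] = (manufacturer >> (24 - 8 * i)) & 0xff;
	vendor[4] = '\0';

	printf("vendor: \"%s\"\n", vendor);  /* prints: AMD */
	return 0;
}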
+diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
+index cc42cf3de960f..7fa3d91042b26 100644
+--- a/drivers/char/tpm/tpm_tis.c
++++ b/drivers/char/tpm/tpm_tis.c
+@@ -89,7 +89,7 @@ static inline void tpm_tis_iowrite32(u32 b, void __iomem *iobase, u32 addr)
+ tpm_tis_flush(iobase);
+ }
+
+-static int interrupts = -1;
++static int interrupts;
+ module_param(interrupts, int, 0444);
+ MODULE_PARM_DESC(interrupts, "Enable interrupts");
+
+@@ -162,12 +162,28 @@ static const struct dmi_system_id tpm_tis_dmi_table[] = {
+ DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad L590"),
+ },
+ },
++ {
++ .callback = tpm_tis_disable_irq,
++ .ident = "ThinkStation P620",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
++ DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkStation P620"),
++ },
++ },
++ {
++ .callback = tpm_tis_disable_irq,
++ .ident = "TUXEDO InfinityBook S 15/17 Gen7",
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "TUXEDO InfinityBook S 15/17 Gen7"),
++ },
++ },
+ {
+ .callback = tpm_tis_disable_irq,
+ .ident = "UPX-TGL",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "AAEON"),
+- DMI_MATCH(DMI_PRODUCT_VERSION, "UPX-TGL"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "UPX-TGL01"),
+ },
+ },
+ {}
+diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
+index a5764946434c6..71a766bb6d5ed 100644
+--- a/drivers/cpufreq/amd-pstate.c
++++ b/drivers/cpufreq/amd-pstate.c
+@@ -986,8 +986,8 @@ static int amd_pstate_update_status(const char *buf, size_t size)
+ return 0;
+ }
+
+-static ssize_t show_status(struct kobject *kobj,
+- struct kobj_attribute *attr, char *buf)
++static ssize_t status_show(struct device *dev,
++ struct device_attribute *attr, char *buf)
+ {
+ ssize_t ret;
+
+@@ -998,7 +998,7 @@ static ssize_t show_status(struct kobject *kobj,
+ return ret;
+ }
+
+-static ssize_t store_status(struct kobject *a, struct kobj_attribute *b,
++static ssize_t status_store(struct device *a, struct device_attribute *b,
+ const char *buf, size_t count)
+ {
+ char *p = memchr(buf, '\n', count);
+@@ -1017,7 +1017,7 @@ cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
+ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
+ cpufreq_freq_attr_rw(energy_performance_preference);
+ cpufreq_freq_attr_ro(energy_performance_available_preferences);
+-define_one_global_rw(status);
++static DEVICE_ATTR_RW(status);
+
+ static struct freq_attr *amd_pstate_attr[] = {
+ &amd_pstate_max_freq,
+@@ -1036,7 +1036,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
+ };
+
+ static struct attribute *pstate_global_attributes[] = {
+- &status.attr,
++ &dev_attr_status.attr,
+ NULL
+ };
+
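Editor's note: the amd-pstate hunk converts a bare kobj_attribute into the driver-core convention, where DEVICE_ATTR_RW(status) stitches together status_show() and status_store() by name; that is why the functions had to be renamed. A minimal sketch of the same convention for a made-up "foo" attribute (not part of amd-pstate):

#include <linux/device.h>
#include <linux/sysfs.h>

static ssize_t foo_show(struct device *dev, struct device_attribute *attr,
			char *buf)
{
	return sysfs_emit(buf, "%d\n", 42);
}

static ssize_t foo_store(struct device *dev, struct device_attribute *attr,
			 const char *buf, size_t count)
{
	/* Parse and apply the new value here. */
	return count;
}

/* Expands to struct device_attribute dev_attr_foo, wired to foo_show/foo_store. */
static DEVICE_ATTR_RW(foo);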
+diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
+index c2d6d9c3c930d..b88af1262f1ab 100644
+--- a/drivers/cpuidle/cpuidle-psci-domain.c
++++ b/drivers/cpuidle/cpuidle-psci-domain.c
+@@ -120,20 +120,6 @@ static void psci_pd_remove(void)
+ }
+ }
+
+-static bool psci_pd_try_set_osi_mode(void)
+-{
+- int ret;
+-
+- if (!psci_has_osi_support())
+- return false;
+-
+- ret = psci_set_osi_mode(true);
+- if (ret)
+- return false;
+-
+- return true;
+-}
+-
+ static void psci_cpuidle_domain_sync_state(struct device *dev)
+ {
+ /*
+@@ -152,15 +138,12 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
+ {
+ struct device_node *np = pdev->dev.of_node;
+ struct device_node *node;
+- bool use_osi;
++ bool use_osi = psci_has_osi_support();
+ int ret = 0, pd_count = 0;
+
+ if (!np)
+ return -ENODEV;
+
+- /* If OSI mode is supported, let's try to enable it. */
+- use_osi = psci_pd_try_set_osi_mode();
+-
+ /*
+ * Parse child nodes for the "#power-domain-cells" property and
+ * initialize a genpd/genpd-of-provider pair when it's found.
+@@ -170,33 +153,37 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
+ continue;
+
+ ret = psci_pd_init(node, use_osi);
+- if (ret)
+- goto put_node;
++ if (ret) {
++ of_node_put(node);
++ goto exit;
++ }
+
+ pd_count++;
+ }
+
+ /* Bail out if not using the hierarchical CPU topology. */
+ if (!pd_count)
+- goto no_pd;
++ return 0;
+
+ /* Link genpd masters/subdomains to model the CPU topology. */
+ ret = dt_idle_pd_init_topology(np);
+ if (ret)
+ goto remove_pd;
+
++ /* let's try to enable OSI. */
++ ret = psci_set_osi_mode(use_osi);
++ if (ret)
++ goto remove_pd;
++
+ pr_info("Initialized CPU PM domain topology using %s mode\n",
+ use_osi ? "OSI" : "PC");
+ return 0;
+
+-put_node:
+- of_node_put(node);
+ remove_pd:
++ dt_idle_pd_remove_topology(np);
+ psci_pd_remove();
++exit:
+ pr_err("failed to create CPU PM domains ret=%d\n", ret);
+-no_pd:
+- if (use_osi)
+- psci_set_osi_mode(false);
+ return ret;
+ }
+
+diff --git a/drivers/cpuidle/dt_idle_genpd.c b/drivers/cpuidle/dt_idle_genpd.c
+index b37165514d4e7..1af63c189039e 100644
+--- a/drivers/cpuidle/dt_idle_genpd.c
++++ b/drivers/cpuidle/dt_idle_genpd.c
+@@ -152,6 +152,30 @@ int dt_idle_pd_init_topology(struct device_node *np)
+ return 0;
+ }
+
++int dt_idle_pd_remove_topology(struct device_node *np)
++{
++ struct device_node *node;
++ struct of_phandle_args child, parent;
++ int ret;
++
++ for_each_child_of_node(np, node) {
++ if (of_parse_phandle_with_args(node, "power-domains",
++ "#power-domain-cells", 0, &parent))
++ continue;
++
++ child.np = node;
++ child.args_count = 0;
++ ret = of_genpd_remove_subdomain(&parent, &child);
++ of_node_put(parent.np);
++ if (ret) {
++ of_node_put(node);
++ return ret;
++ }
++ }
++
++ return 0;
++}
++
+ struct device *dt_idle_attach_cpu(int cpu, const char *name)
+ {
+ struct device *dev;
+diff --git a/drivers/cpuidle/dt_idle_genpd.h b/drivers/cpuidle/dt_idle_genpd.h
+index a95483d08a02a..3be1f70f55b5c 100644
+--- a/drivers/cpuidle/dt_idle_genpd.h
++++ b/drivers/cpuidle/dt_idle_genpd.h
+@@ -14,6 +14,8 @@ struct generic_pm_domain *dt_idle_pd_alloc(struct device_node *np,
+
+ int dt_idle_pd_init_topology(struct device_node *np);
+
++int dt_idle_pd_remove_topology(struct device_node *np);
++
+ struct device *dt_idle_attach_cpu(int cpu, const char *name);
+
+ void dt_idle_detach_cpu(struct device *dev);
+@@ -36,6 +38,11 @@ static inline int dt_idle_pd_init_topology(struct device_node *np)
+ return 0;
+ }
+
++static inline int dt_idle_pd_remove_topology(struct device_node *np)
++{
++ return 0;
++}
++
+ static inline struct device *dt_idle_attach_cpu(int cpu, const char *name)
+ {
+ return NULL;
+diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
+index 5abbcc61c5288..9a15f0d12c799 100644
+--- a/drivers/dma/idxd/device.c
++++ b/drivers/dma/idxd/device.c
+@@ -384,9 +384,7 @@ static void idxd_wq_disable_cleanup(struct idxd_wq *wq)
+ wq->threshold = 0;
+ wq->priority = 0;
+ wq->enqcmds_retries = IDXD_ENQCMDS_RETRIES;
+- clear_bit(WQ_FLAG_DEDICATED, &wq->flags);
+- clear_bit(WQ_FLAG_BLOCK_ON_FAULT, &wq->flags);
+- clear_bit(WQ_FLAG_ATS_DISABLE, &wq->flags);
++ wq->flags = 0;
+ memset(wq->name, 0, WQ_NAME_SIZE);
+ wq->max_xfer_bytes = WQ_DEFAULT_MAX_XFER;
+ idxd_wq_set_max_batch_size(idxd->data->type, wq, WQ_DEFAULT_MAX_BATCH);
+diff --git a/drivers/dma/mcf-edma.c b/drivers/dma/mcf-edma.c
+index ebd8733f72ad4..9413fad08a60c 100644
+--- a/drivers/dma/mcf-edma.c
++++ b/drivers/dma/mcf-edma.c
+@@ -190,7 +190,13 @@ static int mcf_edma_probe(struct platform_device *pdev)
+ return -EINVAL;
+ }
+
+- chans = pdata->dma_channels;
++ if (!pdata->dma_channels) {
++ dev_info(&pdev->dev, "setting default channel number to 64");
++ chans = 64;
++ } else {
++ chans = pdata->dma_channels;
++ }
++
+ len = sizeof(*mcf_edma) + sizeof(*mcf_chan) * chans;
+ mcf_edma = devm_kzalloc(&pdev->dev, len, GFP_KERNEL);
+ if (!mcf_edma)
+@@ -202,11 +208,6 @@ static int mcf_edma_probe(struct platform_device *pdev)
+ mcf_edma->drvdata = &mcf_data;
+ mcf_edma->big_endian = 1;
+
+- if (!mcf_edma->n_chans) {
+- dev_info(&pdev->dev, "setting default channel number to 64");
+- mcf_edma->n_chans = 64;
+- }
+-
+ mutex_init(&mcf_edma->fsl_edma_mutex);
+
+ mcf_edma->membase = devm_platform_ioremap_resource(pdev, 0);
+diff --git a/drivers/dma/owl-dma.c b/drivers/dma/owl-dma.c
+index 95a462a1f5111..b6e0ac8314e5c 100644
+--- a/drivers/dma/owl-dma.c
++++ b/drivers/dma/owl-dma.c
+@@ -192,7 +192,7 @@ struct owl_dma_pchan {
+ };
+
+ /**
+- * struct owl_dma_pchan - Wrapper for DMA ENGINE channel
++ * struct owl_dma_vchan - Wrapper for DMA ENGINE channel
+ * @vc: wrapped virtual channel
+ * @pchan: the physical channel utilized by this channel
+ * @txd: active transaction on this channel
+diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
+index b4731fe6bbc14..3cf0b38387ae5 100644
+--- a/drivers/dma/pl330.c
++++ b/drivers/dma/pl330.c
+@@ -403,6 +403,12 @@ enum desc_status {
+ * of a channel can be BUSY at any time.
+ */
+ BUSY,
++ /*
++ * Pause was called while descriptor was BUSY. Due to hardware
++ * limitations, only termination is possible for descriptors
++ * that have been paused.
++ */
++ PAUSED,
+ /*
+ * Sitting on the channel work_list but xfer done
+ * by PL330 core
+@@ -2041,7 +2047,7 @@ static inline void fill_queue(struct dma_pl330_chan *pch)
+ list_for_each_entry(desc, &pch->work_list, node) {
+
+ /* If already submitted */
+- if (desc->status == BUSY)
++ if (desc->status == BUSY || desc->status == PAUSED)
+ continue;
+
+ ret = pl330_submit_req(pch->thread, desc);
+@@ -2326,6 +2332,7 @@ static int pl330_pause(struct dma_chan *chan)
+ {
+ struct dma_pl330_chan *pch = to_pchan(chan);
+ struct pl330_dmac *pl330 = pch->dmac;
++ struct dma_pl330_desc *desc;
+ unsigned long flags;
+
+ pm_runtime_get_sync(pl330->ddma.dev);
+@@ -2335,6 +2342,10 @@ static int pl330_pause(struct dma_chan *chan)
+ _stop(pch->thread);
+ spin_unlock(&pl330->lock);
+
++ list_for_each_entry(desc, &pch->work_list, node) {
++ if (desc->status == BUSY)
++ desc->status = PAUSED;
++ }
+ spin_unlock_irqrestore(&pch->lock, flags);
+ pm_runtime_mark_last_busy(pl330->ddma.dev);
+ pm_runtime_put_autosuspend(pl330->ddma.dev);
+@@ -2425,7 +2436,7 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie,
+ else if (running && desc == running)
+ transferred =
+ pl330_get_current_xferred_count(pch, desc);
+- else if (desc->status == BUSY)
++ else if (desc->status == BUSY || desc->status == PAUSED)
+ /*
+ * Busy but not running means either just enqueued,
+ * or finished and not yet marked done
+@@ -2442,6 +2453,9 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie,
+ case DONE:
+ ret = DMA_COMPLETE;
+ break;
++ case PAUSED:
++ ret = DMA_PAUSED;
++ break;
+ case PREP:
+ case BUSY:
+ ret = DMA_IN_PROGRESS;
+diff --git a/drivers/dma/xilinx/xdma.c b/drivers/dma/xilinx/xdma.c
+index 93ee298d52b89..e0bfd129d563f 100644
+--- a/drivers/dma/xilinx/xdma.c
++++ b/drivers/dma/xilinx/xdma.c
+@@ -668,6 +668,8 @@ static int xdma_set_vector_reg(struct xdma_device *xdev, u32 vec_tbl_start,
+ val |= irq_start << shift;
+ irq_start++;
+ irq_num--;
++ if (!irq_num)
++ break;
+ }
+
+ /* write IRQ register */
+@@ -715,7 +717,7 @@ static int xdma_irq_init(struct xdma_device *xdev)
+ ret = request_irq(irq, xdma_channel_isr, 0,
+ "xdma-c2h-channel", &xdev->c2h_chans[j]);
+ if (ret) {
+- xdma_err(xdev, "H2C channel%d request irq%d failed: %d",
++ xdma_err(xdev, "C2H channel%d request irq%d failed: %d",
+ j, irq, ret);
+ goto failed_init_c2h;
+ }
+@@ -892,7 +894,7 @@ static int xdma_probe(struct platform_device *pdev)
+ }
+
+ reg_base = devm_ioremap_resource(&pdev->dev, res);
+- if (!reg_base) {
++ if (IS_ERR(reg_base)) {
+ xdma_err(xdev, "ioremap failed");
+ goto failed;
+ }
+diff --git a/drivers/gpio/gpio-sim.c b/drivers/gpio/gpio-sim.c
+index 8b49b0abacd51..f1f6f1c329877 100644
+--- a/drivers/gpio/gpio-sim.c
++++ b/drivers/gpio/gpio-sim.c
+@@ -429,6 +429,7 @@ static int gpio_sim_add_bank(struct fwnode_handle *swnode, struct device *dev)
+ gc->set_config = gpio_sim_set_config;
+ gc->to_irq = gpio_sim_to_irq;
+ gc->free = gpio_sim_free;
++ gc->can_sleep = true;
+
+ ret = devm_gpiochip_add_data(dev, gc, chip);
+ if (ret)
+diff --git a/drivers/gpio/gpio-ws16c48.c b/drivers/gpio/gpio-ws16c48.c
+index e73885a4dc328..afb42a8e916fe 100644
+--- a/drivers/gpio/gpio-ws16c48.c
++++ b/drivers/gpio/gpio-ws16c48.c
+@@ -18,7 +18,7 @@
+ #include <linux/spinlock.h>
+ #include <linux/types.h>
+
+-#define WS16C48_EXTENT 10
++#define WS16C48_EXTENT 11
+ #define MAX_NUM_WS16C48 max_num_isa_dev(WS16C48_EXTENT)
+
+ static unsigned int base[MAX_NUM_WS16C48];
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+index 129081ffa0a5f..5e9df7158ea4e 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+@@ -282,6 +282,9 @@ extern int amdgpu_sg_display;
+ #define AMDGPU_SMARTSHIFT_MAX_BIAS (100)
+ #define AMDGPU_SMARTSHIFT_MIN_BIAS (-100)
+
++/* Extra time delay(in ms) to eliminate the influence of temperature momentary fluctuation */
++#define AMDGPU_SWCTF_EXTRA_DELAY 50
++
+ struct amdgpu_device;
+ struct amdgpu_irq_src;
+ struct amdgpu_fpriv;
+@@ -1246,6 +1249,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
+ void amdgpu_device_pci_config_reset(struct amdgpu_device *adev);
+ int amdgpu_device_pci_reset(struct amdgpu_device *adev);
+ bool amdgpu_device_need_post(struct amdgpu_device *adev);
++bool amdgpu_sg_display_supported(struct amdgpu_device *adev);
+ bool amdgpu_device_pcie_dynamic_switching_supported(void);
+ bool amdgpu_device_should_use_aspm(struct amdgpu_device *adev);
+ bool amdgpu_device_aspm_support_quirk(void);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+index 5612caf77dd65..a989ae72a58a9 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+@@ -291,7 +291,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
+
+ if (!p->gang_size) {
+ ret = -EINVAL;
+- goto free_partial_kdata;
++ goto free_all_kdata;
+ }
+
+ for (i = 0; i < p->gang_size; ++i) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+index 167b2a1c416eb..44a902d9b5c7b 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+@@ -1352,6 +1352,32 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
+ return true;
+ }
+
++/*
++ * On APUs with >= 64GB white flickering has been observed w/ SG enabled.
++ * Disable S/G on such systems until we have a proper fix.
++ * https://gitlab.freedesktop.org/drm/amd/-/issues/2354
++ * https://gitlab.freedesktop.org/drm/amd/-/issues/2735
++ */
++bool amdgpu_sg_display_supported(struct amdgpu_device *adev)
++{
++ switch (amdgpu_sg_display) {
++ case -1:
++ break;
++ case 0:
++ return false;
++ case 1:
++ return true;
++ default:
++ return false;
++ }
++ if ((totalram_pages() << (PAGE_SHIFT - 10)) +
++ (adev->gmc.real_vram_size / 1024) >= 64000000) {
++ DRM_WARN("Disabling S/G due to >=64GB RAM\n");
++ return false;
++ }
++ return true;
++}
++
+ /*
+ * Intel hosts such as Raptor Lake and Sapphire Rapids don't support dynamic
+ * speed switching. Until we have confirmation from Intel that a specific host
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 812d7dd4c04b4..bdce367544368 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -1630,9 +1630,8 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ }
+ break;
+ }
+- if (init_data.flags.gpu_vm_support &&
+- (amdgpu_sg_display == 0))
+- init_data.flags.gpu_vm_support = false;
++ if (init_data.flags.gpu_vm_support)
++ init_data.flags.gpu_vm_support = amdgpu_sg_display_supported(adev);
+
+ if (init_data.flags.gpu_vm_support)
+ adev->mode_info.gpu_vm_support = true;
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+index 9bc86deac9e8e..b885c39bd16ba 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+@@ -1320,7 +1320,7 @@ int compute_mst_dsc_configs_for_state(struct drm_atomic_state *state,
+ if (computed_streams[i])
+ continue;
+
+- if (!res_pool->funcs->remove_stream_from_ctx ||
++ if (res_pool->funcs->remove_stream_from_ctx &&
+ res_pool->funcs->remove_stream_from_ctx(stream->ctx->dc, dc_state, stream) != DC_OK)
+ return -EINVAL;
+
+diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+index 8d2460d06bced..58e8fda04b861 100644
+--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
++++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+@@ -780,7 +780,8 @@ void dce110_edp_wait_for_hpd_ready(
+ dal_gpio_destroy_irq(&hpd);
+
+ /* ensure that the panel is detected */
+- ASSERT(edp_hpd_high);
++ if (!edp_hpd_high)
++ DC_LOG_DC("%s: wait timed out!\n", __func__);
+ }
+
+ void dce110_edp_power_control(
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
+index e5b7ef7422b83..50dc834046446 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c
+@@ -357,8 +357,11 @@ void dpp3_set_cursor_attributes(
+ int cur_rom_en = 0;
+
+ if (color_format == CURSOR_MODE_COLOR_PRE_MULTIPLIED_ALPHA ||
+- color_format == CURSOR_MODE_COLOR_UN_PRE_MULTIPLIED_ALPHA)
+- cur_rom_en = 1;
++ color_format == CURSOR_MODE_COLOR_UN_PRE_MULTIPLIED_ALPHA) {
++ if (cursor_attributes->attribute_flags.bits.ENABLE_CURSOR_DEGAMMA) {
++ cur_rom_en = 1;
++ }
++ }
+
+ REG_UPDATE_3(CURSOR0_CONTROL,
+ CUR0_MODE, color_format,
+diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
+index d178f3f440816..42172b00be66d 100644
+--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
++++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
+@@ -89,6 +89,8 @@ struct amdgpu_dpm_thermal {
+ int max_mem_crit_temp;
+ /* memory max emergency(shutdown) temp */
+ int max_mem_emergency_temp;
++ /* SWCTF threshold */
++ int sw_ctf_threshold;
+ /* was last interrupt low to high or high to low */
+ bool high_to_low;
+ /* interrupt source */
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
+index 11b7b4cffaae0..ff360c6991712 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
++++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
+@@ -26,6 +26,7 @@
+ #include <linux/gfp.h>
+ #include <linux/slab.h>
+ #include <linux/firmware.h>
++#include <linux/reboot.h>
+ #include "amd_shared.h"
+ #include "amd_powerplay.h"
+ #include "power_state.h"
+@@ -91,6 +92,45 @@ static int pp_early_init(void *handle)
+ return 0;
+ }
+
++static void pp_swctf_delayed_work_handler(struct work_struct *work)
++{
++ struct pp_hwmgr *hwmgr =
++ container_of(work, struct pp_hwmgr, swctf_delayed_work.work);
++ struct amdgpu_device *adev = hwmgr->adev;
++ struct amdgpu_dpm_thermal *range =
++ &adev->pm.dpm.thermal;
++ uint32_t gpu_temperature, size;
++ int ret;
++
++ /*
++ * If the hotspot/edge temperature is confirmed as below SW CTF setting point
++ * after the delay enforced, nothing will be done.
++ * Otherwise, a graceful shutdown will be performed to prevent further damage.
++ */
++ if (range->sw_ctf_threshold &&
++ hwmgr->hwmgr_func->read_sensor) {
++ ret = hwmgr->hwmgr_func->read_sensor(hwmgr,
++ AMDGPU_PP_SENSOR_HOTSPOT_TEMP,
++ &gpu_temperature,
++ &size);
++ /*
++ * For some legacy ASICs, hotspot temperature retrieving might be not
++ * supported. Check the edge temperature instead then.
++ */
++ if (ret == -EOPNOTSUPP)
++ ret = hwmgr->hwmgr_func->read_sensor(hwmgr,
++ AMDGPU_PP_SENSOR_EDGE_TEMP,
++ &gpu_temperature,
++ &size);
++ if (!ret && gpu_temperature / 1000 < range->sw_ctf_threshold)
++ return;
++ }
++
++ dev_emerg(adev->dev, "ERROR: GPU over temperature range(SW CTF) detected!\n");
++ dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU SW CTF!\n");
++ orderly_poweroff(true);
++}
++
+ static int pp_sw_init(void *handle)
+ {
+ struct amdgpu_device *adev = handle;
+@@ -101,6 +141,10 @@ static int pp_sw_init(void *handle)
+
+ pr_debug("powerplay sw init %s\n", ret ? "failed" : "successfully");
+
++ if (!ret)
++ INIT_DELAYED_WORK(&hwmgr->swctf_delayed_work,
++ pp_swctf_delayed_work_handler);
++
+ return ret;
+ }
+
+@@ -135,6 +179,8 @@ static int pp_hw_fini(void *handle)
+ struct amdgpu_device *adev = handle;
+ struct pp_hwmgr *hwmgr = adev->powerplay.pp_handle;
+
++ cancel_delayed_work_sync(&hwmgr->swctf_delayed_work);
++
+ hwmgr_hw_fini(hwmgr);
+
+ return 0;
+@@ -221,6 +267,8 @@ static int pp_suspend(void *handle)
+ struct amdgpu_device *adev = handle;
+ struct pp_hwmgr *hwmgr = adev->powerplay.pp_handle;
+
++ cancel_delayed_work_sync(&hwmgr->swctf_delayed_work);
++
+ return hwmgr_suspend(hwmgr);
+ }
+
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hardwaremanager.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hardwaremanager.c
+index 981dc8c7112d6..90452b66e1071 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hardwaremanager.c
++++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hardwaremanager.c
+@@ -241,7 +241,8 @@ int phm_start_thermal_controller(struct pp_hwmgr *hwmgr)
+ TEMP_RANGE_MAX,
+ TEMP_RANGE_MIN,
+ TEMP_RANGE_MAX,
+- TEMP_RANGE_MAX};
++ TEMP_RANGE_MAX,
++ 0};
+ struct amdgpu_device *adev = hwmgr->adev;
+
+ if (!hwmgr->not_vf)
+@@ -265,6 +266,7 @@ int phm_start_thermal_controller(struct pp_hwmgr *hwmgr)
+ adev->pm.dpm.thermal.min_mem_temp = range.mem_min;
+ adev->pm.dpm.thermal.max_mem_crit_temp = range.mem_crit_max;
+ adev->pm.dpm.thermal.max_mem_emergency_temp = range.mem_emergency_max;
++ adev->pm.dpm.thermal.sw_ctf_threshold = range.sw_ctf_threshold;
+
+ return ret;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+index e10cc5e7928e6..6841a4bce186f 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
++++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+@@ -5432,6 +5432,8 @@ static int smu7_get_thermal_temperature_range(struct pp_hwmgr *hwmgr,
+ thermal_data->max = data->thermal_temp_setting.temperature_shutdown *
+ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
+
++ thermal_data->sw_ctf_threshold = thermal_data->max;
++
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c
+index bfe80ac0ad8c8..d0b1ab6c45231 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c
++++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c
+@@ -603,21 +603,17 @@ int phm_irq_process(struct amdgpu_device *adev,
+ struct amdgpu_irq_src *source,
+ struct amdgpu_iv_entry *entry)
+ {
++ struct pp_hwmgr *hwmgr = adev->powerplay.pp_handle;
+ uint32_t client_id = entry->client_id;
+ uint32_t src_id = entry->src_id;
+
+ if (client_id == AMDGPU_IRQ_CLIENTID_LEGACY) {
+ if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_LOW_TO_HIGH) {
+- dev_emerg(adev->dev, "ERROR: GPU over temperature range(SW CTF) detected!\n");
+- /*
+- * SW CTF just occurred.
+- * Try to do a graceful shutdown to prevent further damage.
+- */
+- dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU SW CTF!\n");
+- orderly_poweroff(true);
+- } else if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_HIGH_TO_LOW)
++ schedule_delayed_work(&hwmgr->swctf_delayed_work,
++ msecs_to_jiffies(AMDGPU_SWCTF_EXTRA_DELAY));
++ } else if (src_id == VISLANDS30_IV_SRCID_CG_TSS_THERMAL_HIGH_TO_LOW) {
+ dev_emerg(adev->dev, "ERROR: GPU under temperature range detected!\n");
+- else if (src_id == VISLANDS30_IV_SRCID_GPIO_19) {
++ } else if (src_id == VISLANDS30_IV_SRCID_GPIO_19) {
+ dev_emerg(adev->dev, "ERROR: GPU HW Critical Temperature Fault(aka CTF) detected!\n");
+ /*
+ * HW CTF just occurred. Shutdown to prevent further damage.
+@@ -626,15 +622,10 @@ int phm_irq_process(struct amdgpu_device *adev,
+ orderly_poweroff(true);
+ }
+ } else if (client_id == SOC15_IH_CLIENTID_THM) {
+- if (src_id == 0) {
+- dev_emerg(adev->dev, "ERROR: GPU over temperature range(SW CTF) detected!\n");
+- /*
+- * SW CTF just occurred.
+- * Try to do a graceful shutdown to prevent further damage.
+- */
+- dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU SW CTF!\n");
+- orderly_poweroff(true);
+- } else
++ if (src_id == 0)
++ schedule_delayed_work(&hwmgr->swctf_delayed_work,
++ msecs_to_jiffies(AMDGPU_SWCTF_EXTRA_DELAY));
++ else
+ dev_emerg(adev->dev, "ERROR: GPU under temperature range detected!\n");
+ } else if (client_id == SOC15_IH_CLIENTID_ROM_SMUIO) {
+ dev_emerg(adev->dev, "ERROR: GPU HW Critical Temperature Fault(aka CTF) detected!\n");
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
+index 99cd2e63afdd4..c51dd4c74fe9d 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
++++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c
+@@ -5241,6 +5241,9 @@ static int vega10_get_thermal_temperature_range(struct pp_hwmgr *hwmgr,
+ {
+ struct vega10_hwmgr *data = hwmgr->backend;
+ PPTable_t *pp_table = &(data->smc_state_table.pp_table);
++ struct phm_ppt_v2_information *pp_table_info =
++ (struct phm_ppt_v2_information *)(hwmgr->pptable);
++ struct phm_tdp_table *tdp_table = pp_table_info->tdp_table;
+
+ memcpy(thermal_data, &SMU7ThermalWithDelayPolicy[0], sizeof(struct PP_TemperatureRange));
+
+@@ -5257,6 +5260,13 @@ static int vega10_get_thermal_temperature_range(struct pp_hwmgr *hwmgr,
+ thermal_data->mem_emergency_max = (pp_table->ThbmLimit + CTF_OFFSET_HBM)*
+ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
+
++ if (tdp_table->usSoftwareShutdownTemp > pp_table->ThotspotLimit &&
++ tdp_table->usSoftwareShutdownTemp < VEGA10_THERMAL_MAXIMUM_ALERT_TEMP)
++ thermal_data->sw_ctf_threshold = tdp_table->usSoftwareShutdownTemp;
++ else
++ thermal_data->sw_ctf_threshold = VEGA10_THERMAL_MAXIMUM_ALERT_TEMP;
++ thermal_data->sw_ctf_threshold *= PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
++
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
+index e9db137cd1c6c..1937be1cf5b46 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
++++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega12_hwmgr.c
+@@ -2763,6 +2763,8 @@ static int vega12_notify_cac_buffer_info(struct pp_hwmgr *hwmgr,
+ static int vega12_get_thermal_temperature_range(struct pp_hwmgr *hwmgr,
+ struct PP_TemperatureRange *thermal_data)
+ {
++ struct phm_ppt_v3_information *pptable_information =
++ (struct phm_ppt_v3_information *)hwmgr->pptable;
+ struct vega12_hwmgr *data =
+ (struct vega12_hwmgr *)(hwmgr->backend);
+ PPTable_t *pp_table = &(data->smc_state_table.pp_table);
+@@ -2781,6 +2783,8 @@ static int vega12_get_thermal_temperature_range(struct pp_hwmgr *hwmgr,
+ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
+ thermal_data->mem_emergency_max = (pp_table->ThbmLimit + CTF_OFFSET_HBM)*
+ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
++ thermal_data->sw_ctf_threshold = pptable_information->us_software_shutdown_temp *
++ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
+
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
+index 0d4d4811527c6..4e19ccbdb8077 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
++++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega20_hwmgr.c
+@@ -4206,6 +4206,8 @@ static int vega20_notify_cac_buffer_info(struct pp_hwmgr *hwmgr,
+ static int vega20_get_thermal_temperature_range(struct pp_hwmgr *hwmgr,
+ struct PP_TemperatureRange *thermal_data)
+ {
++ struct phm_ppt_v3_information *pptable_information =
++ (struct phm_ppt_v3_information *)hwmgr->pptable;
+ struct vega20_hwmgr *data =
+ (struct vega20_hwmgr *)(hwmgr->backend);
+ PPTable_t *pp_table = &(data->smc_state_table.pp_table);
+@@ -4224,6 +4226,8 @@ static int vega20_get_thermal_temperature_range(struct pp_hwmgr *hwmgr,
+ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
+ thermal_data->mem_emergency_max = (pp_table->ThbmLimit + CTF_OFFSET_HBM)*
+ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
++ thermal_data->sw_ctf_threshold = pptable_information->us_software_shutdown_temp *
++ PP_TEMPERATURE_UNITS_PER_CENTIGRADES;
+
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/powerplay/inc/hwmgr.h
+index 5ce433e2c16a5..ec10643edea3e 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/inc/hwmgr.h
++++ b/drivers/gpu/drm/amd/pm/powerplay/inc/hwmgr.h
+@@ -811,6 +811,8 @@ struct pp_hwmgr {
+ bool gfxoff_state_changed_by_workload;
+ uint32_t pstate_sclk_peak;
+ uint32_t pstate_mclk_peak;
++
++ struct delayed_work swctf_delayed_work;
+ };
+
+ int hwmgr_early_init(struct pp_hwmgr *hwmgr);
+diff --git a/drivers/gpu/drm/amd/pm/powerplay/inc/power_state.h b/drivers/gpu/drm/amd/pm/powerplay/inc/power_state.h
+index a5f2227a3971c..0ffc2347829d0 100644
+--- a/drivers/gpu/drm/amd/pm/powerplay/inc/power_state.h
++++ b/drivers/gpu/drm/amd/pm/powerplay/inc/power_state.h
+@@ -131,6 +131,7 @@ struct PP_TemperatureRange {
+ int mem_min;
+ int mem_crit_max;
+ int mem_emergency_max;
++ int sw_ctf_threshold;
+ };
+
+ struct PP_StateValidationBlock {
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+index 2ddf5198e5c48..ea03e8d9a3f6c 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+@@ -24,6 +24,7 @@
+
+ #include <linux/firmware.h>
+ #include <linux/pci.h>
++#include <linux/reboot.h>
+
+ #include "amdgpu.h"
+ #include "amdgpu_smu.h"
+@@ -1070,6 +1071,34 @@ static void smu_interrupt_work_fn(struct work_struct *work)
+ smu->ppt_funcs->interrupt_work(smu);
+ }
+
++static void smu_swctf_delayed_work_handler(struct work_struct *work)
++{
++ struct smu_context *smu =
++ container_of(work, struct smu_context, swctf_delayed_work.work);
++ struct smu_temperature_range *range =
++ &smu->thermal_range;
++ struct amdgpu_device *adev = smu->adev;
++ uint32_t hotspot_tmp, size;
++
++ /*
++ * If the hotspot temperature is confirmed as below SW CTF setting point
++ * after the delay enforced, nothing will be done.
++ * Otherwise, a graceful shutdown will be performed to prevent further damage.
++ */
++ if (range->software_shutdown_temp &&
++ smu->ppt_funcs->read_sensor &&
++ !smu->ppt_funcs->read_sensor(smu,
++ AMDGPU_PP_SENSOR_HOTSPOT_TEMP,
++ &hotspot_tmp,
++ &size) &&
++ hotspot_tmp / 1000 < range->software_shutdown_temp)
++ return;
++
++ dev_emerg(adev->dev, "ERROR: GPU over temperature range(SW CTF) detected!\n");
++ dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU SW CTF!\n");
++ orderly_poweroff(true);
++}
++
+ static int smu_sw_init(void *handle)
+ {
+ struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+@@ -1112,6 +1141,9 @@ static int smu_sw_init(void *handle)
+ smu->smu_dpm.dpm_level = AMD_DPM_FORCED_LEVEL_AUTO;
+ smu->smu_dpm.requested_dpm_level = AMD_DPM_FORCED_LEVEL_AUTO;
+
++ INIT_DELAYED_WORK(&smu->swctf_delayed_work,
++ smu_swctf_delayed_work_handler);
++
+ ret = smu_smc_table_sw_init(smu);
+ if (ret) {
+ dev_err(adev->dev, "Failed to sw init smc table!\n");
+@@ -1592,6 +1624,8 @@ static int smu_smc_hw_cleanup(struct smu_context *smu)
+ return ret;
+ }
+
++ cancel_delayed_work_sync(&smu->swctf_delayed_work);
++
+ ret = smu_disable_dpms(smu);
+ if (ret) {
+ dev_err(adev->dev, "Fail to disable dpm features!\n");
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+index 09469c750a96b..6e2069dcb6b9d 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
++++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+@@ -573,6 +573,8 @@ struct smu_context
+ u32 debug_param_reg;
+ u32 debug_msg_reg;
+ u32 debug_resp_reg;
++
++ struct delayed_work swctf_delayed_work;
+ };
+
+ struct i2c_adapter;
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+index e1ef88ee1ed39..aa4a5498a12f7 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c
+@@ -1412,13 +1412,8 @@ static int smu_v11_0_irq_process(struct amdgpu_device *adev,
+ if (client_id == SOC15_IH_CLIENTID_THM) {
+ switch (src_id) {
+ case THM_11_0__SRCID__THM_DIG_THERM_L2H:
+- dev_emerg(adev->dev, "ERROR: GPU over temperature range(SW CTF) detected!\n");
+- /*
+- * SW CTF just occurred.
+- * Try to do a graceful shutdown to prevent further damage.
+- */
+- dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU SW CTF!\n");
+- orderly_poweroff(true);
++ schedule_delayed_work(&smu->swctf_delayed_work,
++ msecs_to_jiffies(AMDGPU_SWCTF_EXTRA_DELAY));
+ break;
+ case THM_11_0__SRCID__THM_DIG_THERM_H2L:
+ dev_emerg(adev->dev, "ERROR: GPU under temperature range detected\n");
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+index 79e9230fc7960..048f4018d0b90 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
+@@ -1377,13 +1377,8 @@ static int smu_v13_0_irq_process(struct amdgpu_device *adev,
+ if (client_id == SOC15_IH_CLIENTID_THM) {
+ switch (src_id) {
+ case THM_11_0__SRCID__THM_DIG_THERM_L2H:
+- dev_emerg(adev->dev, "ERROR: GPU over temperature range(SW CTF) detected!\n");
+- /*
+- * SW CTF just occurred.
+- * Try to do a graceful shutdown to prevent further damage.
+- */
+- dev_emerg(adev->dev, "ERROR: System is going to shutdown due to GPU SW CTF!\n");
+- orderly_poweroff(true);
++ schedule_delayed_work(&smu->swctf_delayed_work,
++ msecs_to_jiffies(AMDGPU_SWCTF_EXTRA_DELAY));
+ break;
+ case THM_11_0__SRCID__THM_DIG_THERM_H2L:
+ dev_emerg(adev->dev, "ERROR: GPU under temperature range detected\n");
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+index 907cc43d16a90..d7f09af2fb018 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+@@ -1030,7 +1030,6 @@ static int smu_v13_0_0_print_clk_levels(struct smu_context *smu,
+ struct smu_13_0_dpm_context *dpm_context = smu_dpm->dpm_context;
+ struct smu_13_0_dpm_table *single_dpm_table;
+ struct smu_13_0_pcie_table *pcie_table;
+- const int link_width[] = {0, 1, 2, 4, 8, 12, 16};
+ uint32_t gen_speed, lane_width;
+ int i, curr_freq, size = 0;
+ int ret = 0;
+@@ -1145,7 +1144,7 @@ static int smu_v13_0_0_print_clk_levels(struct smu_context *smu,
+ (pcie_table->pcie_lane[i] == 6) ? "x16" : "",
+ pcie_table->clk_freq[i],
+ (gen_speed == DECODE_GEN_SPEED(pcie_table->pcie_gen[i])) &&
+- (lane_width == DECODE_LANE_WIDTH(link_width[pcie_table->pcie_lane[i]])) ?
++ (lane_width == DECODE_LANE_WIDTH(pcie_table->pcie_lane[i])) ?
+ "*" : "");
+ break;
+
+diff --git a/drivers/gpu/drm/bridge/ite-it6505.c b/drivers/gpu/drm/bridge/ite-it6505.c
+index 45f579c365e7f..94b1497dfb017 100644
+--- a/drivers/gpu/drm/bridge/ite-it6505.c
++++ b/drivers/gpu/drm/bridge/ite-it6505.c
+@@ -2517,9 +2517,11 @@ static irqreturn_t it6505_int_threaded_handler(int unused, void *data)
+ };
+ int int_status[3], i;
+
+- if (it6505->enable_drv_hold || pm_runtime_get_if_in_use(dev) <= 0)
++ if (it6505->enable_drv_hold || !it6505->powered)
+ return IRQ_HANDLED;
+
++ pm_runtime_get_sync(dev);
++
+ int_status[0] = it6505_read(it6505, INT_STATUS_01);
+ int_status[1] = it6505_read(it6505, INT_STATUS_02);
+ int_status[2] = it6505_read(it6505, INT_STATUS_03);
+diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
+index 4ea6507a77e5d..baaf0e0feb063 100644
+--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
++++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
+@@ -623,7 +623,13 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
+ int ret;
+
+ if (obj->import_attach) {
++ /* Reset both vm_ops and vm_private_data, so we don't end up with
++ * vm_ops pointing to our implementation if the dma-buf backend
++ * doesn't set those fields.
++ */
+ vma->vm_private_data = NULL;
++ vma->vm_ops = NULL;
++
+ ret = dma_buf_mmap(obj->dma_buf, vma, 0);
+
+ /* Drop the reference drm_gem_mmap_obj() acquired.*/
+diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
+index f75c6f09dd2af..a2e0033e8a260 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
++++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
+@@ -967,7 +967,7 @@ nouveau_connector_get_modes(struct drm_connector *connector)
+ /* Determine display colour depth for everything except LVDS now,
+ * DP requires this before mode_valid() is called.
+ */
+- if (connector->connector_type != DRM_MODE_CONNECTOR_LVDS && nv_connector->native_mode)
++ if (connector->connector_type != DRM_MODE_CONNECTOR_LVDS)
+ nouveau_connector_detect_depth(connector);
+
+ /* Find the native mode if this is a digital panel, if we didn't
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dp.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dp.c
+index 40c8ea43c42f2..b8ac66b4a2c4b 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dp.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dp.c
+@@ -26,6 +26,8 @@
+ #include "head.h"
+ #include "ior.h"
+
++#include <drm/display/drm_dp.h>
++
+ #include <subdev/bios.h>
+ #include <subdev/bios/init.h>
+ #include <subdev/gpio.h>
+@@ -634,6 +636,50 @@ nvkm_dp_enable_supported_link_rates(struct nvkm_outp *outp)
+ return outp->dp.rates != 0;
+ }
+
++/* XXX: This is a big fat hack, and this is just drm_dp_read_dpcd_caps()
++ * converted to work inside nvkm. This is a temporary holdover until we start
++ * passing the drm_dp_aux device through NVKM
++ */
++static int
++nvkm_dp_read_dpcd_caps(struct nvkm_outp *outp)
++{
++ struct nvkm_i2c_aux *aux = outp->dp.aux;
++ u8 dpcd_ext[DP_RECEIVER_CAP_SIZE];
++ int ret;
++
++ ret = nvkm_rdaux(aux, DPCD_RC00_DPCD_REV, outp->dp.dpcd, DP_RECEIVER_CAP_SIZE);
++ if (ret < 0)
++ return ret;
++
++ /*
++ * Prior to DP1.3 the bit represented by
++ * DP_EXTENDED_RECEIVER_CAP_FIELD_PRESENT was reserved.
++ * If it is set DP_DPCD_REV at 0000h could be at a value less than
++ * the true capability of the panel. The only way to check is to
++ * then compare 0000h and 2200h.
++ */
++ if (!(outp->dp.dpcd[DP_TRAINING_AUX_RD_INTERVAL] &
++ DP_EXTENDED_RECEIVER_CAP_FIELD_PRESENT))
++ return 0;
++
++ ret = nvkm_rdaux(aux, DP_DP13_DPCD_REV, dpcd_ext, sizeof(dpcd_ext));
++ if (ret < 0)
++ return ret;
++
++ if (outp->dp.dpcd[DP_DPCD_REV] > dpcd_ext[DP_DPCD_REV]) {
++ OUTP_DBG(outp, "Extended DPCD rev less than base DPCD rev (%d > %d)\n",
++ outp->dp.dpcd[DP_DPCD_REV], dpcd_ext[DP_DPCD_REV]);
++ return 0;
++ }
++
++ if (!memcmp(outp->dp.dpcd, dpcd_ext, sizeof(dpcd_ext)))
++ return 0;
++
++ memcpy(outp->dp.dpcd, dpcd_ext, sizeof(dpcd_ext));
++
++ return 0;
++}
++
+ void
+ nvkm_dp_enable(struct nvkm_outp *outp, bool auxpwr)
+ {
+@@ -689,7 +735,7 @@ nvkm_dp_enable(struct nvkm_outp *outp, bool auxpwr)
+ memset(outp->dp.lttpr, 0x00, sizeof(outp->dp.lttpr));
+ }
+
+- if (!nvkm_rdaux(aux, DPCD_RC00_DPCD_REV, outp->dp.dpcd, sizeof(outp->dp.dpcd))) {
++ if (!nvkm_dp_read_dpcd_caps(outp)) {
+ const u8 rates[] = { 0x1e, 0x14, 0x0a, 0x06, 0 };
+ const u8 *rate;
+ int rate_max;
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h
+index 00dbeda7e3464..de161e7a04aa6 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgf100.h
+@@ -117,6 +117,7 @@ void gk104_grctx_generate_r418800(struct gf100_gr *);
+
+ extern const struct gf100_grctx_func gk110_grctx;
+ void gk110_grctx_generate_r419eb0(struct gf100_gr *);
++void gk110_grctx_generate_r419f78(struct gf100_gr *);
+
+ extern const struct gf100_grctx_func gk110b_grctx;
+ extern const struct gf100_grctx_func gk208_grctx;
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk104.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk104.c
+index 94233d0119dff..52a234b1ef010 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk104.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk104.c
+@@ -906,7 +906,9 @@ static void
+ gk104_grctx_generate_r419f78(struct gf100_gr *gr)
+ {
+ struct nvkm_device *device = gr->base.engine.subdev.device;
+- nvkm_mask(device, 0x419f78, 0x00000001, 0x00000000);
++
++ /* bit 3 set disables loads in fp helper invocations, we need it enabled */
++ nvkm_mask(device, 0x419f78, 0x00000009, 0x00000000);
+ }
+
+ void
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110.c
+index 4391458e1fb2f..3acdd9eeb74a7 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110.c
+@@ -820,6 +820,15 @@ gk110_grctx_generate_r419eb0(struct gf100_gr *gr)
+ nvkm_mask(device, 0x419eb0, 0x00001000, 0x00001000);
+ }
+
++void
++gk110_grctx_generate_r419f78(struct gf100_gr *gr)
++{
++ struct nvkm_device *device = gr->base.engine.subdev.device;
++
++ /* bit 3 set disables loads in fp helper invocations, we need it enabled */
++ nvkm_mask(device, 0x419f78, 0x00000008, 0x00000000);
++}
++
+ const struct gf100_grctx_func
+ gk110_grctx = {
+ .main = gf100_grctx_generate_main,
+@@ -854,4 +863,5 @@ gk110_grctx = {
+ .gpc_tpc_nr = gk104_grctx_generate_gpc_tpc_nr,
+ .r418800 = gk104_grctx_generate_r418800,
+ .r419eb0 = gk110_grctx_generate_r419eb0,
++ .r419f78 = gk110_grctx_generate_r419f78,
+ };
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110b.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110b.c
+index 7b9a34f9ec3c7..5597e87624acd 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110b.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk110b.c
+@@ -103,4 +103,5 @@ gk110b_grctx = {
+ .gpc_tpc_nr = gk104_grctx_generate_gpc_tpc_nr,
+ .r418800 = gk104_grctx_generate_r418800,
+ .r419eb0 = gk110_grctx_generate_r419eb0,
++ .r419f78 = gk110_grctx_generate_r419f78,
+ };
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk208.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk208.c
+index c78d07a8bb7df..612656496541d 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk208.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgk208.c
+@@ -568,4 +568,5 @@ gk208_grctx = {
+ .dist_skip_table = gf117_grctx_generate_dist_skip_table,
+ .gpc_tpc_nr = gk104_grctx_generate_gpc_tpc_nr,
+ .r418800 = gk104_grctx_generate_r418800,
++ .r419f78 = gk110_grctx_generate_r419f78,
+ };
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm107.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm107.c
+index beac66eb2a803..9906974ac3f07 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm107.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxgm107.c
+@@ -988,4 +988,5 @@ gm107_grctx = {
+ .r406500 = gm107_grctx_generate_r406500,
+ .gpc_tpc_nr = gk104_grctx_generate_gpc_tpc_nr,
+ .r419e00 = gm107_grctx_generate_r419e00,
++ .r419f78 = gk110_grctx_generate_r419f78,
+ };
+diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c
+index 3b6c8100a2428..a7775aa185415 100644
+--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c
++++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c
+@@ -206,19 +206,6 @@ tu102_gr_av_to_init_veid(struct nvkm_blob *blob, struct gf100_gr_pack **ppack)
+ return gk20a_gr_av_to_init_(blob, 64, 0x00100000, ppack);
+ }
+
+-int
+-tu102_gr_load(struct gf100_gr *gr, int ver, const struct gf100_gr_fwif *fwif)
+-{
+- int ret;
+-
+- ret = gm200_gr_load(gr, ver, fwif);
+- if (ret)
+- return ret;
+-
+- return gk20a_gr_load_net(gr, "gr/", "sw_veid_bundle_init", ver, tu102_gr_av_to_init_veid,
+- &gr->bundle_veid);
+-}
+-
+ static const struct gf100_gr_fwif
+ tu102_gr_fwif[] = {
+ { 0, gm200_gr_load, &tu102_gr, &gp108_gr_fecs_acr, &gp108_gr_gpccs_acr },
+diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+index a530ecc4d207c..bf34498c1b6d7 100644
+--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
++++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+@@ -833,12 +833,12 @@ static int vop_plane_atomic_check(struct drm_plane *plane,
+ * need align with 2 pixel.
+ */
+ if (fb->format->is_yuv && ((new_plane_state->src.x1 >> 16) % 2)) {
+- DRM_ERROR("Invalid Source: Yuv format not support odd xpos\n");
++ DRM_DEBUG_KMS("Invalid Source: Yuv format not support odd xpos\n");
+ return -EINVAL;
+ }
+
+ if (fb->format->is_yuv && new_plane_state->rotation & DRM_MODE_REFLECT_Y) {
+- DRM_ERROR("Invalid Source: Yuv format does not support this rotation\n");
++ DRM_DEBUG_KMS("Invalid Source: Yuv format does not support this rotation\n");
+ return -EINVAL;
+ }
+
+@@ -846,7 +846,7 @@ static int vop_plane_atomic_check(struct drm_plane *plane,
+ struct vop *vop = to_vop(crtc);
+
+ if (!vop->data->afbc) {
+- DRM_ERROR("vop does not support AFBC\n");
++ DRM_DEBUG_KMS("vop does not support AFBC\n");
+ return -EINVAL;
+ }
+
+@@ -855,15 +855,16 @@ static int vop_plane_atomic_check(struct drm_plane *plane,
+ return ret;
+
+ if (new_plane_state->src.x1 || new_plane_state->src.y1) {
+- DRM_ERROR("AFBC does not support offset display, xpos=%d, ypos=%d, offset=%d\n",
+- new_plane_state->src.x1,
+- new_plane_state->src.y1, fb->offsets[0]);
++ DRM_DEBUG_KMS("AFBC does not support offset display, " \
++ "xpos=%d, ypos=%d, offset=%d\n",
++ new_plane_state->src.x1, new_plane_state->src.y1,
++ fb->offsets[0]);
+ return -EINVAL;
+ }
+
+ if (new_plane_state->rotation && new_plane_state->rotation != DRM_MODE_ROTATE_0) {
+- DRM_ERROR("No rotation support in AFBC, rotation=%d\n",
+- new_plane_state->rotation);
++ DRM_DEBUG_KMS("No rotation support in AFBC, rotation=%d\n",
++ new_plane_state->rotation);
+ return -EINVAL;
+ }
+ }
+diff --git a/drivers/hwmon/aquacomputer_d5next.c b/drivers/hwmon/aquacomputer_d5next.c
+index c2b99fd4f436c..e6126a5319520 100644
+--- a/drivers/hwmon/aquacomputer_d5next.c
++++ b/drivers/hwmon/aquacomputer_d5next.c
+@@ -13,9 +13,11 @@
+
+ #include <linux/crc16.h>
+ #include <linux/debugfs.h>
++#include <linux/delay.h>
+ #include <linux/hid.h>
+ #include <linux/hwmon.h>
+ #include <linux/jiffies.h>
++#include <linux/ktime.h>
+ #include <linux/module.h>
+ #include <linux/mutex.h>
+ #include <linux/seq_file.h>
+@@ -61,6 +63,8 @@ static const char *const aqc_device_names[] = {
+ #define CTRL_REPORT_ID 0x03
+ #define AQUAERO_CTRL_REPORT_ID 0x0b
+
++#define CTRL_REPORT_DELAY 200 /* ms */
++
+ /* The HID report that the official software always sends
+ * after writing values, currently same for all devices
+ */
+@@ -496,6 +500,9 @@ struct aqc_data {
+ int secondary_ctrl_report_size;
+ u8 *secondary_ctrl_report;
+
++ ktime_t last_ctrl_report_op;
++ int ctrl_report_delay; /* Delay between two ctrl report operations, in ms */
++
+ int buffer_size;
+ u8 *buffer;
+ int checksum_start;
+@@ -577,17 +584,35 @@ static int aqc_aquastreamxt_convert_fan_rpm(u16 val)
+ return 0;
+ }
+
++static void aqc_delay_ctrl_report(struct aqc_data *priv)
++{
++ /*
++ * If previous read or write is too close to this one, delay the current operation
++ * to give the device enough time to process the previous one.
++ */
++ if (priv->ctrl_report_delay) {
++ s64 delta = ktime_ms_delta(ktime_get(), priv->last_ctrl_report_op);
++
++ if (delta < priv->ctrl_report_delay)
++ msleep(priv->ctrl_report_delay - delta);
++ }
++}
++
+ /* Expects the mutex to be locked */
+ static int aqc_get_ctrl_data(struct aqc_data *priv)
+ {
+ int ret;
+
++ aqc_delay_ctrl_report(priv);
++
+ memset(priv->buffer, 0x00, priv->buffer_size);
+ ret = hid_hw_raw_request(priv->hdev, priv->ctrl_report_id, priv->buffer, priv->buffer_size,
+ HID_FEATURE_REPORT, HID_REQ_GET_REPORT);
+ if (ret < 0)
+ ret = -ENODATA;
+
++ priv->last_ctrl_report_op = ktime_get();
++
+ return ret;
+ }
+
+@@ -597,6 +622,8 @@ static int aqc_send_ctrl_data(struct aqc_data *priv)
+ int ret;
+ u16 checksum;
+
++ aqc_delay_ctrl_report(priv);
++
+ /* Checksum is not needed for Aquaero */
+ if (priv->kind != aquaero) {
+ /* Init and xorout value for CRC-16/USB is 0xffff */
+@@ -612,12 +639,16 @@ static int aqc_send_ctrl_data(struct aqc_data *priv)
+ ret = hid_hw_raw_request(priv->hdev, priv->ctrl_report_id, priv->buffer, priv->buffer_size,
+ HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
+ if (ret < 0)
+- return ret;
++ goto record_access_and_ret;
+
+ /* The official software sends this report after every change, so do it here as well */
+ ret = hid_hw_raw_request(priv->hdev, priv->secondary_ctrl_report_id,
+ priv->secondary_ctrl_report, priv->secondary_ctrl_report_size,
+ HID_FEATURE_REPORT, HID_REQ_SET_REPORT);
++
++record_access_and_ret:
++ priv->last_ctrl_report_op = ktime_get();
++
+ return ret;
+ }
+
+@@ -1443,6 +1474,7 @@ static int aqc_probe(struct hid_device *hdev, const struct hid_device_id *id)
+
+ priv->buffer_size = AQUAERO_CTRL_REPORT_SIZE;
+ priv->temp_ctrl_offset = AQUAERO_TEMP_CTRL_OFFSET;
++ priv->ctrl_report_delay = CTRL_REPORT_DELAY;
+
+ priv->temp_label = label_temp_sensors;
+ priv->virtual_temp_label = label_virtual_temp_sensors;
+@@ -1466,6 +1498,7 @@ static int aqc_probe(struct hid_device *hdev, const struct hid_device_id *id)
+ priv->temp_ctrl_offset = D5NEXT_TEMP_CTRL_OFFSET;
+
+ priv->buffer_size = D5NEXT_CTRL_REPORT_SIZE;
++ priv->ctrl_report_delay = CTRL_REPORT_DELAY;
+
+ priv->power_cycle_count_offset = D5NEXT_POWER_CYCLES;
+
+@@ -1516,6 +1549,7 @@ static int aqc_probe(struct hid_device *hdev, const struct hid_device_id *id)
+ priv->temp_ctrl_offset = OCTO_TEMP_CTRL_OFFSET;
+
+ priv->buffer_size = OCTO_CTRL_REPORT_SIZE;
++ priv->ctrl_report_delay = CTRL_REPORT_DELAY;
+
+ priv->power_cycle_count_offset = OCTO_POWER_CYCLES;
+
+@@ -1543,6 +1577,7 @@ static int aqc_probe(struct hid_device *hdev, const struct hid_device_id *id)
+ priv->temp_ctrl_offset = QUADRO_TEMP_CTRL_OFFSET;
+
+ priv->buffer_size = QUADRO_CTRL_REPORT_SIZE;
++ priv->ctrl_report_delay = CTRL_REPORT_DELAY;
+
+ priv->flow_pulses_ctrl_offset = QUADRO_FLOW_PULSES_CTRL_OFFSET;
+ priv->power_cycle_count_offset = QUADRO_POWER_CYCLES;
+diff --git a/drivers/hwmon/pmbus/bel-pfe.c b/drivers/hwmon/pmbus/bel-pfe.c
+index 4100eefb7ac32..61c195f8fd3b8 100644
+--- a/drivers/hwmon/pmbus/bel-pfe.c
++++ b/drivers/hwmon/pmbus/bel-pfe.c
+@@ -17,12 +17,13 @@
+ enum chips {pfe1100, pfe3000};
+
+ /*
+- * Disable status check for pfe3000 devices, because some devices report
+- * communication error (invalid command) for VOUT_MODE command (0x20)
+- * although correct VOUT_MODE (0x16) is returned: it leads to incorrect
+- * exponent in linear mode.
++ * Disable status check because some devices report communication error
++ * (invalid command) for VOUT_MODE command (0x20) although the correct
++ * VOUT_MODE (0x16) is returned: it leads to incorrect exponent in linear
++ * mode.
++ * This affects both pfe3000 and pfe1100.
+ */
+-static struct pmbus_platform_data pfe3000_plat_data = {
++static struct pmbus_platform_data pfe_plat_data = {
+ .flags = PMBUS_SKIP_STATUS_CHECK,
+ };
+
+@@ -94,16 +95,15 @@ static int pfe_pmbus_probe(struct i2c_client *client)
+ int model;
+
+ model = (int)i2c_match_id(pfe_device_id, client)->driver_data;
++ client->dev.platform_data = &pfe_plat_data;
+
+ /*
+ * PFE3000-12-069RA devices may not stay in page 0 during device
+ * probe which leads to probe failure (read status word failed).
+ * So let's set the device to page 0 at the beginning.
+ */
+- if (model == pfe3000) {
+- client->dev.platform_data = &pfe3000_plat_data;
++ if (model == pfe3000)
+ i2c_smbus_write_byte_data(client, PMBUS_PAGE, 0);
+- }
+
+ return pmbus_do_probe(client, &pfe_driver_info[model]);
+ }
+diff --git a/drivers/iio/adc/ad7192.c b/drivers/iio/adc/ad7192.c
+index 8685e0b58a838..7bc3ebfe8081b 100644
+--- a/drivers/iio/adc/ad7192.c
++++ b/drivers/iio/adc/ad7192.c
+@@ -62,7 +62,6 @@
+ #define AD7192_MODE_STA_MASK BIT(20) /* Status Register transmission Mask */
+ #define AD7192_MODE_CLKSRC(x) (((x) & 0x3) << 18) /* Clock Source Select */
+ #define AD7192_MODE_SINC3 BIT(15) /* SINC3 Filter Select */
+-#define AD7192_MODE_ACX BIT(14) /* AC excitation enable(AD7195 only)*/
+ #define AD7192_MODE_ENPAR BIT(13) /* Parity Enable */
+ #define AD7192_MODE_CLKDIV BIT(12) /* Clock divide by 2 (AD7190/2 only)*/
+ #define AD7192_MODE_SCYCLE BIT(11) /* Single cycle conversion */
+@@ -91,6 +90,7 @@
+ /* Configuration Register Bit Designations (AD7192_REG_CONF) */
+
+ #define AD7192_CONF_CHOP BIT(23) /* CHOP enable */
++#define AD7192_CONF_ACX BIT(22) /* AC excitation enable(AD7195 only) */
+ #define AD7192_CONF_REFSEL BIT(20) /* REFIN1/REFIN2 Reference Select */
+ #define AD7192_CONF_CHAN(x) ((x) << 8) /* Channel select */
+ #define AD7192_CONF_CHAN_MASK (0x7FF << 8) /* Channel select mask */
+@@ -472,7 +472,7 @@ static ssize_t ad7192_show_ac_excitation(struct device *dev,
+ struct iio_dev *indio_dev = dev_to_iio_dev(dev);
+ struct ad7192_state *st = iio_priv(indio_dev);
+
+- return sysfs_emit(buf, "%d\n", !!(st->mode & AD7192_MODE_ACX));
++ return sysfs_emit(buf, "%d\n", !!(st->conf & AD7192_CONF_ACX));
+ }
+
+ static ssize_t ad7192_show_bridge_switch(struct device *dev,
+@@ -513,13 +513,13 @@ static ssize_t ad7192_set(struct device *dev,
+
+ ad_sd_write_reg(&st->sd, AD7192_REG_GPOCON, 1, st->gpocon);
+ break;
+- case AD7192_REG_MODE:
++ case AD7192_REG_CONF:
+ if (val)
+- st->mode |= AD7192_MODE_ACX;
++ st->conf |= AD7192_CONF_ACX;
+ else
+- st->mode &= ~AD7192_MODE_ACX;
++ st->conf &= ~AD7192_CONF_ACX;
+
+- ad_sd_write_reg(&st->sd, AD7192_REG_MODE, 3, st->mode);
++ ad_sd_write_reg(&st->sd, AD7192_REG_CONF, 3, st->conf);
+ break;
+ default:
+ ret = -EINVAL;
+@@ -579,12 +579,11 @@ static IIO_DEVICE_ATTR(bridge_switch_en, 0644,
+
+ static IIO_DEVICE_ATTR(ac_excitation_en, 0644,
+ ad7192_show_ac_excitation, ad7192_set,
+- AD7192_REG_MODE);
++ AD7192_REG_CONF);
+
+ static struct attribute *ad7192_attributes[] = {
+ &iio_dev_attr_filter_low_pass_3db_frequency_available.dev_attr.attr,
+ &iio_dev_attr_bridge_switch_en.dev_attr.attr,
+- &iio_dev_attr_ac_excitation_en.dev_attr.attr,
+ NULL
+ };
+
+@@ -595,6 +594,7 @@ static const struct attribute_group ad7192_attribute_group = {
+ static struct attribute *ad7195_attributes[] = {
+ &iio_dev_attr_filter_low_pass_3db_frequency_available.dev_attr.attr,
+ &iio_dev_attr_bridge_switch_en.dev_attr.attr,
++ &iio_dev_attr_ac_excitation_en.dev_attr.attr,
+ NULL
+ };
+
+diff --git a/drivers/iio/adc/ina2xx-adc.c b/drivers/iio/adc/ina2xx-adc.c
+index 38d9d7b2313ea..2090bdf03cbee 100644
+--- a/drivers/iio/adc/ina2xx-adc.c
++++ b/drivers/iio/adc/ina2xx-adc.c
+@@ -124,6 +124,7 @@ static const struct regmap_config ina2xx_regmap_config = {
+ enum ina2xx_ids { ina219, ina226 };
+
+ struct ina2xx_config {
++ const char *name;
+ u16 config_default;
+ int calibration_value;
+ int shunt_voltage_lsb; /* nV */
+@@ -155,6 +156,7 @@ struct ina2xx_chip_info {
+
+ static const struct ina2xx_config ina2xx_config[] = {
+ [ina219] = {
++ .name = "ina219",
+ .config_default = INA219_CONFIG_DEFAULT,
+ .calibration_value = 4096,
+ .shunt_voltage_lsb = 10000,
+@@ -164,6 +166,7 @@ static const struct ina2xx_config ina2xx_config[] = {
+ .chip_id = ina219,
+ },
+ [ina226] = {
++ .name = "ina226",
+ .config_default = INA226_CONFIG_DEFAULT,
+ .calibration_value = 2048,
+ .shunt_voltage_lsb = 2500,
+@@ -996,7 +999,7 @@ static int ina2xx_probe(struct i2c_client *client)
+ /* Patch the current config register with default. */
+ val = chip->config->config_default;
+
+- if (id->driver_data == ina226) {
++ if (type == ina226) {
+ ina226_set_average(chip, INA226_DEFAULT_AVG, &val);
+ ina226_set_int_time_vbus(chip, INA226_DEFAULT_IT, &val);
+ ina226_set_int_time_vshunt(chip, INA226_DEFAULT_IT, &val);
+@@ -1015,7 +1018,7 @@ static int ina2xx_probe(struct i2c_client *client)
+ }
+
+ indio_dev->modes = INDIO_DIRECT_MODE;
+- if (id->driver_data == ina226) {
++ if (type == ina226) {
+ indio_dev->channels = ina226_channels;
+ indio_dev->num_channels = ARRAY_SIZE(ina226_channels);
+ indio_dev->info = &ina226_info;
+@@ -1024,7 +1027,7 @@ static int ina2xx_probe(struct i2c_client *client)
+ indio_dev->num_channels = ARRAY_SIZE(ina219_channels);
+ indio_dev->info = &ina219_info;
+ }
+- indio_dev->name = id->name;
++ indio_dev->name = id ? id->name : chip->config->name;
+
+ ret = devm_iio_kfifo_buffer_setup(&client->dev, indio_dev,
+ &ina2xx_setup_ops);
+diff --git a/drivers/iio/adc/meson_saradc.c b/drivers/iio/adc/meson_saradc.c
+index af6bfcc190752..eb78a6f17fd07 100644
+--- a/drivers/iio/adc/meson_saradc.c
++++ b/drivers/iio/adc/meson_saradc.c
+@@ -916,12 +916,6 @@ static int meson_sar_adc_hw_enable(struct iio_dev *indio_dev)
+ goto err_vref;
+ }
+
+- ret = clk_prepare_enable(priv->core_clk);
+- if (ret) {
+- dev_err(dev, "failed to enable core clk\n");
+- goto err_core_clk;
+- }
+-
+ regval = FIELD_PREP(MESON_SAR_ADC_REG0_FIFO_CNT_IRQ_MASK, 1);
+ regmap_update_bits(priv->regmap, MESON_SAR_ADC_REG0,
+ MESON_SAR_ADC_REG0_FIFO_CNT_IRQ_MASK, regval);
+@@ -948,8 +942,6 @@ err_adc_clk:
+ regmap_update_bits(priv->regmap, MESON_SAR_ADC_REG3,
+ MESON_SAR_ADC_REG3_ADC_EN, 0);
+ meson_sar_adc_set_bandgap(indio_dev, false);
+- clk_disable_unprepare(priv->core_clk);
+-err_core_clk:
+ regulator_disable(priv->vref);
+ err_vref:
+ meson_sar_adc_unlock(indio_dev);
+@@ -977,8 +969,6 @@ static void meson_sar_adc_hw_disable(struct iio_dev *indio_dev)
+
+ meson_sar_adc_set_bandgap(indio_dev, false);
+
+- clk_disable_unprepare(priv->core_clk);
+-
+ regulator_disable(priv->vref);
+
+ if (!ret)
+@@ -1211,7 +1201,7 @@ static int meson_sar_adc_probe(struct platform_device *pdev)
+ if (IS_ERR(priv->clkin))
+ return dev_err_probe(dev, PTR_ERR(priv->clkin), "failed to get clkin\n");
+
+- priv->core_clk = devm_clk_get(dev, "core");
++ priv->core_clk = devm_clk_get_enabled(dev, "core");
+ if (IS_ERR(priv->core_clk))
+ return dev_err_probe(dev, PTR_ERR(priv->core_clk), "failed to get core clk\n");
+
+@@ -1294,15 +1284,26 @@ static int meson_sar_adc_remove(struct platform_device *pdev)
+ static int meson_sar_adc_suspend(struct device *dev)
+ {
+ struct iio_dev *indio_dev = dev_get_drvdata(dev);
++ struct meson_sar_adc_priv *priv = iio_priv(indio_dev);
+
+ meson_sar_adc_hw_disable(indio_dev);
+
++ clk_disable_unprepare(priv->core_clk);
++
+ return 0;
+ }
+
+ static int meson_sar_adc_resume(struct device *dev)
+ {
+ struct iio_dev *indio_dev = dev_get_drvdata(dev);
++ struct meson_sar_adc_priv *priv = iio_priv(indio_dev);
++ int ret;
++
++ ret = clk_prepare_enable(priv->core_clk);
++ if (ret) {
++ dev_err(dev, "failed to enable core clk\n");
++ return ret;
++ }
+
+ return meson_sar_adc_hw_enable(indio_dev);
+ }
+diff --git a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
+index 943e9e14d1e99..b72d39fc2434e 100644
+--- a/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
++++ b/drivers/iio/common/cros_ec_sensors/cros_ec_sensors_core.c
+@@ -253,7 +253,7 @@ int cros_ec_sensors_core_init(struct platform_device *pdev,
+ platform_set_drvdata(pdev, indio_dev);
+
+ state->ec = ec->ec_dev;
+- state->msg = devm_kzalloc(&pdev->dev,
++ state->msg = devm_kzalloc(&pdev->dev, sizeof(*state->msg) +
+ max((u16)sizeof(struct ec_params_motion_sense),
+ state->ec->max_response), GFP_KERNEL);
+ if (!state->msg)
+diff --git a/drivers/iio/frequency/admv1013.c b/drivers/iio/frequency/admv1013.c
+index 9bf8337806fcf..8c8e0bbfc99f2 100644
+--- a/drivers/iio/frequency/admv1013.c
++++ b/drivers/iio/frequency/admv1013.c
+@@ -344,9 +344,12 @@ static int admv1013_update_quad_filters(struct admv1013_state *st)
+
+ static int admv1013_update_mixer_vgate(struct admv1013_state *st)
+ {
+- unsigned int vcm, mixer_vgate;
++ unsigned int mixer_vgate;
++ int vcm;
+
+ vcm = regulator_get_voltage(st->reg);
++ if (vcm < 0)
++ return vcm;
+
+ if (vcm < 1800000)
+ mixer_vgate = (2389 * vcm / 1000000 + 8100) / 100;
+diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
+index 6a18b363cf73b..b6e6b1df8a618 100644
+--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
++++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
+@@ -2687,7 +2687,7 @@ unknown_format:
+ static int lsm6dsx_get_acpi_mount_matrix(struct device *dev,
+ struct iio_mount_matrix *orientation)
+ {
+- return false;
++ return -EOPNOTSUPP;
+ }
+
+ #endif
+diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
+index c117f50d0cf37..adcba832e6fa1 100644
+--- a/drivers/iio/industrialio-core.c
++++ b/drivers/iio/industrialio-core.c
+@@ -1888,7 +1888,7 @@ static const struct iio_buffer_setup_ops noop_ring_setup_ops;
+ int __iio_device_register(struct iio_dev *indio_dev, struct module *this_mod)
+ {
+ struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev);
+- struct fwnode_handle *fwnode;
++ struct fwnode_handle *fwnode = NULL;
+ int ret;
+
+ if (!indio_dev->info)
+@@ -1899,7 +1899,8 @@ int __iio_device_register(struct iio_dev *indio_dev, struct module *this_mod)
+ /* If the calling driver did not initialize firmware node, do it here */
+ if (dev_fwnode(&indio_dev->dev))
+ fwnode = dev_fwnode(&indio_dev->dev);
+- else
++ /* The default dummy IIO device has no parent */
++ else if (indio_dev->dev.parent)
+ fwnode = dev_fwnode(indio_dev->dev.parent);
+ device_set_node(&indio_dev->dev, fwnode);
+
+diff --git a/drivers/iio/light/rohm-bu27034.c b/drivers/iio/light/rohm-bu27034.c
+index f85194fda6b09..021a622ad6116 100644
+--- a/drivers/iio/light/rohm-bu27034.c
++++ b/drivers/iio/light/rohm-bu27034.c
+@@ -575,7 +575,7 @@ static int bu27034_set_scale(struct bu27034_data *data, int chan,
+ return -EINVAL;
+
+ if (chan == BU27034_CHAN_ALS) {
+- if (val == 0 && val2 == 1000)
++ if (val == 0 && val2 == 1000000)
+ return 0;
+
+ return -EINVAL;
+@@ -587,7 +587,7 @@ static int bu27034_set_scale(struct bu27034_data *data, int chan,
+ goto unlock_out;
+
+ ret = iio_gts_find_gain_sel_for_scale_using_time(&data->gts, time_sel,
+- val, val2 * 1000, &gain_sel);
++ val, val2, &gain_sel);
+ if (ret) {
+ /*
+ * Could not support scale with given time. Need to change time.
+@@ -624,7 +624,7 @@ static int bu27034_set_scale(struct bu27034_data *data, int chan,
+
+ /* Can we provide requested scale with this time? */
+ ret = iio_gts_find_gain_sel_for_scale_using_time(
+- &data->gts, new_time_sel, val, val2 * 1000,
++ &data->gts, new_time_sel, val, val2,
+ &gain_sel);
+ if (ret)
+ continue;
+@@ -1217,6 +1217,21 @@ static int bu27034_read_raw(struct iio_dev *idev,
+ }
+ }
+
++static int bu27034_write_raw_get_fmt(struct iio_dev *indio_dev,
++ struct iio_chan_spec const *chan,
++ long mask)
++{
++
++ switch (mask) {
++ case IIO_CHAN_INFO_SCALE:
++ return IIO_VAL_INT_PLUS_NANO;
++ case IIO_CHAN_INFO_INT_TIME:
++ return IIO_VAL_INT_PLUS_MICRO;
++ default:
++ return -EINVAL;
++ }
++}
++
+ static int bu27034_write_raw(struct iio_dev *idev,
+ struct iio_chan_spec const *chan,
+ int val, int val2, long mask)
+@@ -1267,6 +1282,7 @@ static int bu27034_read_avail(struct iio_dev *idev,
+ static const struct iio_info bu27034_info = {
+ .read_raw = &bu27034_read_raw,
+ .write_raw = &bu27034_write_raw,
++ .write_raw_get_fmt = &bu27034_write_raw_get_fmt,
+ .read_avail = &bu27034_read_avail,
+ };
+
+diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
+index 755a9c57db6f3..f9ab671c8eda5 100644
+--- a/drivers/infiniband/core/umem.c
++++ b/drivers/infiniband/core/umem.c
+@@ -85,6 +85,8 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
+ dma_addr_t mask;
+ int i;
+
++ umem->iova = va = virt;
++
+ if (umem->is_odp) {
+ unsigned int page_size = BIT(to_ib_umem_odp(umem)->page_shift);
+
+@@ -100,7 +102,6 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
+ */
+ pgsz_bitmap &= GENMASK(BITS_PER_LONG - 1, PAGE_SHIFT);
+
+- umem->iova = va = virt;
+ /* The best result is the smallest page size that results in the minimum
+ * number of required pages. Compute the largest page size that could
+ * work based on VA address bits that don't change.
+diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
+index 1936f4b4002a7..4f00fb7869f8e 100644
+--- a/drivers/infiniband/hw/bnxt_re/main.c
++++ b/drivers/infiniband/hw/bnxt_re/main.c
+@@ -1152,6 +1152,8 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 wqe_mode)
+
+ rc = bnxt_re_setup_chip_ctx(rdev, wqe_mode);
+ if (rc) {
++ bnxt_unregister_dev(rdev->en_dev);
++ clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
+ ibdev_err(&rdev->ibdev, "Failed to get chip context\n");
+ return -EINVAL;
+ }
+@@ -1425,8 +1427,8 @@ static void bnxt_re_remove(struct auxiliary_device *adev)
+ }
+ bnxt_re_setup_cc(rdev, false);
+ ib_unregister_device(&rdev->ibdev);
+- ib_dealloc_device(&rdev->ibdev);
+ bnxt_re_dev_uninit(rdev);
++ ib_dealloc_device(&rdev->ibdev);
+ skip_remove:
+ mutex_unlock(&bnxt_re_mutex);
+ }
+diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
+index 9dbb89e9f4afc..baaa4406d5e60 100644
+--- a/drivers/infiniband/hw/hfi1/chip.c
++++ b/drivers/infiniband/hw/hfi1/chip.c
+@@ -12307,6 +12307,7 @@ static void free_cntrs(struct hfi1_devdata *dd)
+
+ if (dd->synth_stats_timer.function)
+ del_timer_sync(&dd->synth_stats_timer);
++ cancel_work_sync(&dd->update_cntr_work);
+ ppd = (struct hfi1_pportdata *)(dd + 1);
+ for (i = 0; i < dd->num_pports; i++, ppd++) {
+ kfree(ppd->cntrs);
+diff --git a/drivers/interconnect/qcom/bcm-voter.c b/drivers/interconnect/qcom/bcm-voter.c
+index 8f385f9c2dd38..d5f2a6b5376bd 100644
+--- a/drivers/interconnect/qcom/bcm-voter.c
++++ b/drivers/interconnect/qcom/bcm-voter.c
+@@ -83,6 +83,11 @@ static void bcm_aggregate(struct qcom_icc_bcm *bcm)
+
+ temp = agg_peak[bucket] * bcm->vote_scale;
+ bcm->vote_y[bucket] = bcm_div(temp, bcm->aux_data.unit);
++
++ if (bcm->enable_mask && (bcm->vote_x[bucket] || bcm->vote_y[bucket])) {
++ bcm->vote_x[bucket] = 0;
++ bcm->vote_y[bucket] = bcm->enable_mask;
++ }
+ }
+
+ if (bcm->keepalive && bcm->vote_x[QCOM_ICC_BUCKET_AMC] == 0 &&
+diff --git a/drivers/interconnect/qcom/icc-rpmh.h b/drivers/interconnect/qcom/icc-rpmh.h
+index 04391c1ba465c..7843d8864d6ba 100644
+--- a/drivers/interconnect/qcom/icc-rpmh.h
++++ b/drivers/interconnect/qcom/icc-rpmh.h
+@@ -81,6 +81,7 @@ struct qcom_icc_node {
+ * @vote_x: aggregated threshold values, represents sum_bw when @type is bw bcm
+ * @vote_y: aggregated threshold values, represents peak_bw when @type is bw bcm
+ * @vote_scale: scaling factor for vote_x and vote_y
++ * @enable_mask: optional mask to send as vote instead of vote_x/vote_y
+ * @dirty: flag used to indicate whether the bcm needs to be committed
+ * @keepalive: flag used to indicate whether a keepalive is required
+ * @aux_data: auxiliary data used when calculating threshold values and
+@@ -97,6 +98,7 @@ struct qcom_icc_bcm {
+ u64 vote_x[QCOM_ICC_NUM_BUCKETS];
+ u64 vote_y[QCOM_ICC_NUM_BUCKETS];
+ u64 vote_scale;
++ u32 enable_mask;
+ bool dirty;
+ bool keepalive;
+ struct bcm_db aux_data;
+diff --git a/drivers/interconnect/qcom/sa8775p.c b/drivers/interconnect/qcom/sa8775p.c
+index da21cc31a5808..f56538669de0e 100644
+--- a/drivers/interconnect/qcom/sa8775p.c
++++ b/drivers/interconnect/qcom/sa8775p.c
+@@ -1873,6 +1873,7 @@ static struct qcom_icc_node srvc_snoc = {
+
+ static struct qcom_icc_bcm bcm_acv = {
+ .name = "ACV",
++ .enable_mask = 0x8,
+ .num_nodes = 1,
+ .nodes = { &ebi },
+ };
+diff --git a/drivers/interconnect/qcom/sm8450.c b/drivers/interconnect/qcom/sm8450.c
+index 2d7a8e7b85ec2..e64c214b40209 100644
+--- a/drivers/interconnect/qcom/sm8450.c
++++ b/drivers/interconnect/qcom/sm8450.c
+@@ -1337,6 +1337,7 @@ static struct qcom_icc_node qns_mem_noc_sf_disp = {
+
+ static struct qcom_icc_bcm bcm_acv = {
+ .name = "ACV",
++ .enable_mask = 0x8,
+ .num_nodes = 1,
+ .nodes = { &ebi },
+ };
+@@ -1349,6 +1350,7 @@ static struct qcom_icc_bcm bcm_ce0 = {
+
+ static struct qcom_icc_bcm bcm_cn0 = {
+ .name = "CN0",
++ .enable_mask = 0x1,
+ .keepalive = true,
+ .num_nodes = 55,
+ .nodes = { &qnm_gemnoc_cnoc, &qnm_gemnoc_pcie,
+@@ -1383,6 +1385,7 @@ static struct qcom_icc_bcm bcm_cn0 = {
+
+ static struct qcom_icc_bcm bcm_co0 = {
+ .name = "CO0",
++ .enable_mask = 0x1,
+ .num_nodes = 2,
+ .nodes = { &qxm_nsp, &qns_nsp_gemnoc },
+ };
+@@ -1403,6 +1406,7 @@ static struct qcom_icc_bcm bcm_mm0 = {
+
+ static struct qcom_icc_bcm bcm_mm1 = {
+ .name = "MM1",
++ .enable_mask = 0x1,
+ .num_nodes = 12,
+ .nodes = { &qnm_camnoc_hf, &qnm_camnoc_icp,
+ &qnm_camnoc_sf, &qnm_mdp,
+@@ -1445,6 +1449,7 @@ static struct qcom_icc_bcm bcm_sh0 = {
+
+ static struct qcom_icc_bcm bcm_sh1 = {
+ .name = "SH1",
++ .enable_mask = 0x1,
+ .num_nodes = 7,
+ .nodes = { &alm_gpu_tcu, &alm_sys_tcu,
+ &qnm_nsp_gemnoc, &qnm_pcie,
+@@ -1461,6 +1466,7 @@ static struct qcom_icc_bcm bcm_sn0 = {
+
+ static struct qcom_icc_bcm bcm_sn1 = {
+ .name = "SN1",
++ .enable_mask = 0x1,
+ .num_nodes = 4,
+ .nodes = { &qhm_gic, &qxm_pimem,
+ &xm_gic, &qns_gemnoc_gc },
+@@ -1492,6 +1498,7 @@ static struct qcom_icc_bcm bcm_sn7 = {
+
+ static struct qcom_icc_bcm bcm_acv_disp = {
+ .name = "ACV",
++ .enable_mask = 0x1,
+ .num_nodes = 1,
+ .nodes = { &ebi_disp },
+ };
+@@ -1510,6 +1517,7 @@ static struct qcom_icc_bcm bcm_mm0_disp = {
+
+ static struct qcom_icc_bcm bcm_mm1_disp = {
+ .name = "MM1",
++ .enable_mask = 0x1,
+ .num_nodes = 3,
+ .nodes = { &qnm_mdp_disp, &qnm_rot_disp,
+ &qns_mem_noc_sf_disp },
+@@ -1523,6 +1531,7 @@ static struct qcom_icc_bcm bcm_sh0_disp = {
+
+ static struct qcom_icc_bcm bcm_sh1_disp = {
+ .name = "SH1",
++ .enable_mask = 0x1,
+ .num_nodes = 1,
+ .nodes = { &qnm_pcie_disp },
+ };
+diff --git a/drivers/interconnect/qcom/sm8550.c b/drivers/interconnect/qcom/sm8550.c
+index d823ba988ef68..0864ed285375e 100644
+--- a/drivers/interconnect/qcom/sm8550.c
++++ b/drivers/interconnect/qcom/sm8550.c
+@@ -1473,6 +1473,7 @@ static struct qcom_icc_node qns_mem_noc_sf_cam_ife_2 = {
+
+ static struct qcom_icc_bcm bcm_acv = {
+ .name = "ACV",
++ .enable_mask = 0x8,
+ .num_nodes = 1,
+ .nodes = { &ebi },
+ };
+@@ -1485,6 +1486,7 @@ static struct qcom_icc_bcm bcm_ce0 = {
+
+ static struct qcom_icc_bcm bcm_cn0 = {
+ .name = "CN0",
++ .enable_mask = 0x1,
+ .keepalive = true,
+ .num_nodes = 54,
+ .nodes = { &qsm_cfg, &qhs_ahb2phy0,
+@@ -1524,6 +1526,7 @@ static struct qcom_icc_bcm bcm_cn1 = {
+
+ static struct qcom_icc_bcm bcm_co0 = {
+ .name = "CO0",
++ .enable_mask = 0x1,
+ .num_nodes = 2,
+ .nodes = { &qxm_nsp, &qns_nsp_gemnoc },
+ };
+@@ -1549,6 +1552,7 @@ static struct qcom_icc_bcm bcm_mm0 = {
+
+ static struct qcom_icc_bcm bcm_mm1 = {
+ .name = "MM1",
++ .enable_mask = 0x1,
+ .num_nodes = 8,
+ .nodes = { &qnm_camnoc_hf, &qnm_camnoc_icp,
+ &qnm_camnoc_sf, &qnm_vapss_hcp,
+@@ -1589,6 +1593,7 @@ static struct qcom_icc_bcm bcm_sh0 = {
+
+ static struct qcom_icc_bcm bcm_sh1 = {
+ .name = "SH1",
++ .enable_mask = 0x1,
+ .num_nodes = 13,
+ .nodes = { &alm_gpu_tcu, &alm_sys_tcu,
+ &chm_apps, &qnm_gpu,
+@@ -1608,6 +1613,7 @@ static struct qcom_icc_bcm bcm_sn0 = {
+
+ static struct qcom_icc_bcm bcm_sn1 = {
+ .name = "SN1",
++ .enable_mask = 0x1,
+ .num_nodes = 3,
+ .nodes = { &qhm_gic, &xm_gic,
+ &qns_gemnoc_gc },
+@@ -1633,6 +1639,7 @@ static struct qcom_icc_bcm bcm_sn7 = {
+
+ static struct qcom_icc_bcm bcm_acv_disp = {
+ .name = "ACV",
++ .enable_mask = 0x1,
+ .num_nodes = 1,
+ .nodes = { &ebi_disp },
+ };
+@@ -1657,12 +1664,14 @@ static struct qcom_icc_bcm bcm_sh0_disp = {
+
+ static struct qcom_icc_bcm bcm_sh1_disp = {
+ .name = "SH1",
++ .enable_mask = 0x1,
+ .num_nodes = 2,
+ .nodes = { &qnm_mnoc_hf_disp, &qnm_pcie_disp },
+ };
+
+ static struct qcom_icc_bcm bcm_acv_cam_ife_0 = {
+ .name = "ACV",
++ .enable_mask = 0x0,
+ .num_nodes = 1,
+ .nodes = { &ebi_cam_ife_0 },
+ };
+@@ -1681,6 +1690,7 @@ static struct qcom_icc_bcm bcm_mm0_cam_ife_0 = {
+
+ static struct qcom_icc_bcm bcm_mm1_cam_ife_0 = {
+ .name = "MM1",
++ .enable_mask = 0x1,
+ .num_nodes = 4,
+ .nodes = { &qnm_camnoc_hf_cam_ife_0, &qnm_camnoc_icp_cam_ife_0,
+ &qnm_camnoc_sf_cam_ife_0, &qns_mem_noc_sf_cam_ife_0 },
+@@ -1694,6 +1704,7 @@ static struct qcom_icc_bcm bcm_sh0_cam_ife_0 = {
+
+ static struct qcom_icc_bcm bcm_sh1_cam_ife_0 = {
+ .name = "SH1",
++ .enable_mask = 0x1,
+ .num_nodes = 3,
+ .nodes = { &qnm_mnoc_hf_cam_ife_0, &qnm_mnoc_sf_cam_ife_0,
+ &qnm_pcie_cam_ife_0 },
+@@ -1701,6 +1712,7 @@ static struct qcom_icc_bcm bcm_sh1_cam_ife_0 = {
+
+ static struct qcom_icc_bcm bcm_acv_cam_ife_1 = {
+ .name = "ACV",
++ .enable_mask = 0x0,
+ .num_nodes = 1,
+ .nodes = { &ebi_cam_ife_1 },
+ };
+@@ -1719,6 +1731,7 @@ static struct qcom_icc_bcm bcm_mm0_cam_ife_1 = {
+
+ static struct qcom_icc_bcm bcm_mm1_cam_ife_1 = {
+ .name = "MM1",
++ .enable_mask = 0x1,
+ .num_nodes = 4,
+ .nodes = { &qnm_camnoc_hf_cam_ife_1, &qnm_camnoc_icp_cam_ife_1,
+ &qnm_camnoc_sf_cam_ife_1, &qns_mem_noc_sf_cam_ife_1 },
+@@ -1732,6 +1745,7 @@ static struct qcom_icc_bcm bcm_sh0_cam_ife_1 = {
+
+ static struct qcom_icc_bcm bcm_sh1_cam_ife_1 = {
+ .name = "SH1",
++ .enable_mask = 0x1,
+ .num_nodes = 3,
+ .nodes = { &qnm_mnoc_hf_cam_ife_1, &qnm_mnoc_sf_cam_ife_1,
+ &qnm_pcie_cam_ife_1 },
+@@ -1739,6 +1753,7 @@ static struct qcom_icc_bcm bcm_sh1_cam_ife_1 = {
+
+ static struct qcom_icc_bcm bcm_acv_cam_ife_2 = {
+ .name = "ACV",
++ .enable_mask = 0x0,
+ .num_nodes = 1,
+ .nodes = { &ebi_cam_ife_2 },
+ };
+@@ -1757,6 +1772,7 @@ static struct qcom_icc_bcm bcm_mm0_cam_ife_2 = {
+
+ static struct qcom_icc_bcm bcm_mm1_cam_ife_2 = {
+ .name = "MM1",
++ .enable_mask = 0x1,
+ .num_nodes = 4,
+ .nodes = { &qnm_camnoc_hf_cam_ife_2, &qnm_camnoc_icp_cam_ife_2,
+ &qnm_camnoc_sf_cam_ife_2, &qns_mem_noc_sf_cam_ife_2 },
+@@ -1770,6 +1786,7 @@ static struct qcom_icc_bcm bcm_sh0_cam_ife_2 = {
+
+ static struct qcom_icc_bcm bcm_sh1_cam_ife_2 = {
+ .name = "SH1",
++ .enable_mask = 0x1,
+ .num_nodes = 3,
+ .nodes = { &qnm_mnoc_hf_cam_ife_2, &qnm_mnoc_sf_cam_ife_2,
+ &qnm_pcie_cam_ife_2 },
+diff --git a/drivers/isdn/mISDN/dsp.h b/drivers/isdn/mISDN/dsp.h
+index fa09d511a8eda..baf31258f5c90 100644
+--- a/drivers/isdn/mISDN/dsp.h
++++ b/drivers/isdn/mISDN/dsp.h
+@@ -247,7 +247,7 @@ extern void dsp_cmx_hardware(struct dsp_conf *conf, struct dsp *dsp);
+ extern int dsp_cmx_conf(struct dsp *dsp, u32 conf_id);
+ extern void dsp_cmx_receive(struct dsp *dsp, struct sk_buff *skb);
+ extern void dsp_cmx_hdlc(struct dsp *dsp, struct sk_buff *skb);
+-extern void dsp_cmx_send(void *arg);
++extern void dsp_cmx_send(struct timer_list *arg);
+ extern void dsp_cmx_transmit(struct dsp *dsp, struct sk_buff *skb);
+ extern int dsp_cmx_del_conf_member(struct dsp *dsp);
+ extern int dsp_cmx_del_conf(struct dsp_conf *conf);
+diff --git a/drivers/isdn/mISDN/dsp_cmx.c b/drivers/isdn/mISDN/dsp_cmx.c
+index 357b87592eb48..61cb45c5d0d84 100644
+--- a/drivers/isdn/mISDN/dsp_cmx.c
++++ b/drivers/isdn/mISDN/dsp_cmx.c
+@@ -1614,7 +1614,7 @@ static u16 dsp_count; /* last sample count */
+ static int dsp_count_valid; /* if we have last sample count */
+
+ void
+-dsp_cmx_send(void *arg)
++dsp_cmx_send(struct timer_list *arg)
+ {
+ struct dsp_conf *conf;
+ struct dsp_conf_member *member;
+diff --git a/drivers/isdn/mISDN/dsp_core.c b/drivers/isdn/mISDN/dsp_core.c
+index 386084530c2f8..fae95f1666883 100644
+--- a/drivers/isdn/mISDN/dsp_core.c
++++ b/drivers/isdn/mISDN/dsp_core.c
+@@ -1195,7 +1195,7 @@ static int __init dsp_init(void)
+ }
+
+ /* set sample timer */
+- timer_setup(&dsp_spl_tl, (void *)dsp_cmx_send, 0);
++ timer_setup(&dsp_spl_tl, dsp_cmx_send, 0);
+ dsp_spl_tl.expires = jiffies + dsp_tics;
+ dsp_spl_jiffies = dsp_spl_tl.expires;
+ add_timer(&dsp_spl_tl);
+diff --git a/drivers/misc/cardreader/rts5227.c b/drivers/misc/cardreader/rts5227.c
+index d676cf63a9669..3dae5e3a16976 100644
+--- a/drivers/misc/cardreader/rts5227.c
++++ b/drivers/misc/cardreader/rts5227.c
+@@ -195,7 +195,7 @@ static int rts5227_extra_init_hw(struct rtsx_pcr *pcr)
+ }
+ }
+
+- if (option->force_clkreq_0)
++ if (option->force_clkreq_0 && pcr->aspm_mode == ASPM_MODE_CFG)
+ rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, PETXCFG,
+ FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_LOW);
+ else
+diff --git a/drivers/misc/cardreader/rts5228.c b/drivers/misc/cardreader/rts5228.c
+index cfebad51d1d80..f4ab09439da70 100644
+--- a/drivers/misc/cardreader/rts5228.c
++++ b/drivers/misc/cardreader/rts5228.c
+@@ -435,17 +435,10 @@ static void rts5228_init_from_cfg(struct rtsx_pcr *pcr)
+ option->ltr_enabled = false;
+ }
+ }
+-
+- if (rtsx_check_dev_flag(pcr, ASPM_L1_1_EN | ASPM_L1_2_EN
+- | PM_L1_1_EN | PM_L1_2_EN))
+- option->force_clkreq_0 = false;
+- else
+- option->force_clkreq_0 = true;
+ }
+
+ static int rts5228_extra_init_hw(struct rtsx_pcr *pcr)
+ {
+- struct rtsx_cr_option *option = &pcr->option;
+
+ rtsx_pci_write_register(pcr, RTS5228_AUTOLOAD_CFG1,
+ CD_RESUME_EN_MASK, CD_RESUME_EN_MASK);
+@@ -476,17 +469,6 @@ static int rts5228_extra_init_hw(struct rtsx_pcr *pcr)
+ else
+ rtsx_pci_write_register(pcr, PETXCFG, 0x30, 0x00);
+
+- /*
+- * If u_force_clkreq_0 is enabled, CLKREQ# PIN will be forced
+- * to drive low, and we forcibly request clock.
+- */
+- if (option->force_clkreq_0)
+- rtsx_pci_write_register(pcr, PETXCFG,
+- FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_LOW);
+- else
+- rtsx_pci_write_register(pcr, PETXCFG,
+- FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_HIGH);
+-
+ rtsx_pci_write_register(pcr, PWD_SUSPEND_EN, 0xFF, 0xFB);
+
+ if (pcr->rtd3_en) {
+diff --git a/drivers/misc/cardreader/rts5249.c b/drivers/misc/cardreader/rts5249.c
+index 91d240dd68faa..47ab72a43256b 100644
+--- a/drivers/misc/cardreader/rts5249.c
++++ b/drivers/misc/cardreader/rts5249.c
+@@ -327,12 +327,11 @@ static int rts5249_extra_init_hw(struct rtsx_pcr *pcr)
+ }
+ }
+
+-
+ /*
+ * If u_force_clkreq_0 is enabled, CLKREQ# PIN will be forced
+ * to drive low, and we forcibly request clock.
+ */
+- if (option->force_clkreq_0)
++ if (option->force_clkreq_0 && pcr->aspm_mode == ASPM_MODE_CFG)
+ rtsx_pci_write_register(pcr, PETXCFG,
+ FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_LOW);
+ else
+diff --git a/drivers/misc/cardreader/rts5260.c b/drivers/misc/cardreader/rts5260.c
+index 9b42b20a3e5ae..79b18f6f73a8a 100644
+--- a/drivers/misc/cardreader/rts5260.c
++++ b/drivers/misc/cardreader/rts5260.c
+@@ -517,17 +517,10 @@ static void rts5260_init_from_cfg(struct rtsx_pcr *pcr)
+ option->ltr_enabled = false;
+ }
+ }
+-
+- if (rtsx_check_dev_flag(pcr, ASPM_L1_1_EN | ASPM_L1_2_EN
+- | PM_L1_1_EN | PM_L1_2_EN))
+- option->force_clkreq_0 = false;
+- else
+- option->force_clkreq_0 = true;
+ }
+
+ static int rts5260_extra_init_hw(struct rtsx_pcr *pcr)
+ {
+- struct rtsx_cr_option *option = &pcr->option;
+
+ /* Set mcu_cnt to 7 to ensure data can be sampled properly */
+ rtsx_pci_write_register(pcr, 0xFC03, 0x7F, 0x07);
+@@ -546,17 +539,6 @@ static int rts5260_extra_init_hw(struct rtsx_pcr *pcr)
+
+ rts5260_init_hw(pcr);
+
+- /*
+- * If u_force_clkreq_0 is enabled, CLKREQ# PIN will be forced
+- * to drive low, and we forcibly request clock.
+- */
+- if (option->force_clkreq_0)
+- rtsx_pci_write_register(pcr, PETXCFG,
+- FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_LOW);
+- else
+- rtsx_pci_write_register(pcr, PETXCFG,
+- FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_HIGH);
+-
+ rtsx_pci_write_register(pcr, pcr->reg_pm_ctrl3, 0x10, 0x00);
+
+ return 0;
+diff --git a/drivers/misc/cardreader/rts5261.c b/drivers/misc/cardreader/rts5261.c
+index b1e76030cafda..94af6bf8a25a6 100644
+--- a/drivers/misc/cardreader/rts5261.c
++++ b/drivers/misc/cardreader/rts5261.c
+@@ -498,17 +498,10 @@ static void rts5261_init_from_cfg(struct rtsx_pcr *pcr)
+ option->ltr_enabled = false;
+ }
+ }
+-
+- if (rtsx_check_dev_flag(pcr, ASPM_L1_1_EN | ASPM_L1_2_EN
+- | PM_L1_1_EN | PM_L1_2_EN))
+- option->force_clkreq_0 = false;
+- else
+- option->force_clkreq_0 = true;
+ }
+
+ static int rts5261_extra_init_hw(struct rtsx_pcr *pcr)
+ {
+- struct rtsx_cr_option *option = &pcr->option;
+ u32 val;
+
+ rtsx_pci_write_register(pcr, RTS5261_AUTOLOAD_CFG1,
+@@ -554,17 +547,6 @@ static int rts5261_extra_init_hw(struct rtsx_pcr *pcr)
+ else
+ rtsx_pci_write_register(pcr, PETXCFG, 0x30, 0x00);
+
+- /*
+- * If u_force_clkreq_0 is enabled, CLKREQ# PIN will be forced
+- * to drive low, and we forcibly request clock.
+- */
+- if (option->force_clkreq_0)
+- rtsx_pci_write_register(pcr, PETXCFG,
+- FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_LOW);
+- else
+- rtsx_pci_write_register(pcr, PETXCFG,
+- FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_HIGH);
+-
+ rtsx_pci_write_register(pcr, PWD_SUSPEND_EN, 0xFF, 0xFB);
+
+ if (pcr->rtd3_en) {
+diff --git a/drivers/misc/cardreader/rtsx_pcr.c b/drivers/misc/cardreader/rtsx_pcr.c
+index 32b7783e9d4fa..a3f4b52bb159f 100644
+--- a/drivers/misc/cardreader/rtsx_pcr.c
++++ b/drivers/misc/cardreader/rtsx_pcr.c
+@@ -1326,8 +1326,11 @@ static int rtsx_pci_init_hw(struct rtsx_pcr *pcr)
+ return err;
+ }
+
+- if (pcr->aspm_mode == ASPM_MODE_REG)
++ if (pcr->aspm_mode == ASPM_MODE_REG) {
+ rtsx_pci_write_register(pcr, ASPM_FORCE_CTL, 0x30, 0x30);
++ rtsx_pci_write_register(pcr, PETXCFG,
++ FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_HIGH);
++ }
+
+ /* No CD interrupt if probing driver with card inserted.
+ * So we need to initialize pcr->card_exist here.
+diff --git a/drivers/mmc/host/moxart-mmc.c b/drivers/mmc/host/moxart-mmc.c
+index 2d002c81dcf36..d0d6ffcf78d40 100644
+--- a/drivers/mmc/host/moxart-mmc.c
++++ b/drivers/mmc/host/moxart-mmc.c
+@@ -338,13 +338,7 @@ static void moxart_transfer_pio(struct moxart_host *host)
+ return;
+ }
+ for (len = 0; len < remain && len < host->fifo_width;) {
+- /* SCR data must be read in big endian. */
+- if (data->mrq->cmd->opcode == SD_APP_SEND_SCR)
+- *sgp = ioread32be(host->base +
+- REG_DATA_WINDOW);
+- else
+- *sgp = ioread32(host->base +
+- REG_DATA_WINDOW);
++ *sgp = ioread32(host->base + REG_DATA_WINDOW);
+ sgp++;
+ len += 4;
+ }
+diff --git a/drivers/mmc/host/sdhci_f_sdh30.c b/drivers/mmc/host/sdhci_f_sdh30.c
+index a202a69a4b084..b01ffb4d09737 100644
+--- a/drivers/mmc/host/sdhci_f_sdh30.c
++++ b/drivers/mmc/host/sdhci_f_sdh30.c
+@@ -29,9 +29,16 @@ struct f_sdhost_priv {
+ bool enable_cmd_dat_delay;
+ };
+
++static void *sdhci_f_sdhost_priv(struct sdhci_host *host)
++{
++ struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
++
++ return sdhci_pltfm_priv(pltfm_host);
++}
++
+ static void sdhci_f_sdh30_soft_voltage_switch(struct sdhci_host *host)
+ {
+- struct f_sdhost_priv *priv = sdhci_priv(host);
++ struct f_sdhost_priv *priv = sdhci_f_sdhost_priv(host);
+ u32 ctrl = 0;
+
+ usleep_range(2500, 3000);
+@@ -64,7 +71,7 @@ static unsigned int sdhci_f_sdh30_get_min_clock(struct sdhci_host *host)
+
+ static void sdhci_f_sdh30_reset(struct sdhci_host *host, u8 mask)
+ {
+- struct f_sdhost_priv *priv = sdhci_priv(host);
++ struct f_sdhost_priv *priv = sdhci_f_sdhost_priv(host);
+ u32 ctl;
+
+ if (sdhci_readw(host, SDHCI_CLOCK_CONTROL) == 0)
+@@ -95,30 +102,32 @@ static const struct sdhci_ops sdhci_f_sdh30_ops = {
+ .set_uhs_signaling = sdhci_set_uhs_signaling,
+ };
+
++static const struct sdhci_pltfm_data sdhci_f_sdh30_pltfm_data = {
++ .ops = &sdhci_f_sdh30_ops,
++ .quirks = SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC
++ | SDHCI_QUIRK_INVERTED_WRITE_PROTECT,
++ .quirks2 = SDHCI_QUIRK2_SUPPORT_SINGLE
++ | SDHCI_QUIRK2_TUNING_WORK_AROUND,
++};
++
+ static int sdhci_f_sdh30_probe(struct platform_device *pdev)
+ {
+ struct sdhci_host *host;
+ struct device *dev = &pdev->dev;
+- int irq, ctrl = 0, ret = 0;
++ int ctrl = 0, ret = 0;
+ struct f_sdhost_priv *priv;
++ struct sdhci_pltfm_host *pltfm_host;
+ u32 reg = 0;
+
+- irq = platform_get_irq(pdev, 0);
+- if (irq < 0)
+- return irq;
+-
+- host = sdhci_alloc_host(dev, sizeof(struct f_sdhost_priv));
++ host = sdhci_pltfm_init(pdev, &sdhci_f_sdh30_pltfm_data,
++ sizeof(struct f_sdhost_priv));
+ if (IS_ERR(host))
+ return PTR_ERR(host);
+
+- priv = sdhci_priv(host);
++ pltfm_host = sdhci_priv(host);
++ priv = sdhci_pltfm_priv(pltfm_host);
+ priv->dev = dev;
+
+- host->quirks = SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC |
+- SDHCI_QUIRK_INVERTED_WRITE_PROTECT;
+- host->quirks2 = SDHCI_QUIRK2_SUPPORT_SINGLE |
+- SDHCI_QUIRK2_TUNING_WORK_AROUND;
+-
+ priv->enable_cmd_dat_delay = device_property_read_bool(dev,
+ "fujitsu,cmd-dat-delay-select");
+
+@@ -126,18 +135,6 @@ static int sdhci_f_sdh30_probe(struct platform_device *pdev)
+ if (ret)
+ goto err;
+
+- platform_set_drvdata(pdev, host);
+-
+- host->hw_name = "f_sdh30";
+- host->ops = &sdhci_f_sdh30_ops;
+- host->irq = irq;
+-
+- host->ioaddr = devm_platform_ioremap_resource(pdev, 0);
+- if (IS_ERR(host->ioaddr)) {
+- ret = PTR_ERR(host->ioaddr);
+- goto err;
+- }
+-
+ if (dev_of_node(dev)) {
+ sdhci_get_of_property(pdev);
+
+@@ -204,24 +201,21 @@ err_rst:
+ err_clk:
+ clk_disable_unprepare(priv->clk_iface);
+ err:
+- sdhci_free_host(host);
++ sdhci_pltfm_free(pdev);
++
+ return ret;
+ }
+
+ static int sdhci_f_sdh30_remove(struct platform_device *pdev)
+ {
+ struct sdhci_host *host = platform_get_drvdata(pdev);
+- struct f_sdhost_priv *priv = sdhci_priv(host);
+-
+- sdhci_remove_host(host, readl(host->ioaddr + SDHCI_INT_STATUS) ==
+- 0xffffffff);
++ struct f_sdhost_priv *priv = sdhci_f_sdhost_priv(host);
+
+ reset_control_assert(priv->rst);
+ clk_disable_unprepare(priv->clk);
+ clk_disable_unprepare(priv->clk_iface);
+
+- sdhci_free_host(host);
+- platform_set_drvdata(pdev, NULL);
++ sdhci_pltfm_unregister(pdev);
+
+ return 0;
+ }
+diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
+index 1a0776f9b008a..7be484f2c9264 100644
+--- a/drivers/net/bonding/bond_main.c
++++ b/drivers/net/bonding/bond_main.c
+@@ -5898,7 +5898,9 @@ void bond_setup(struct net_device *bond_dev)
+
+ bond_dev->hw_features = BOND_VLAN_FEATURES |
+ NETIF_F_HW_VLAN_CTAG_RX |
+- NETIF_F_HW_VLAN_CTAG_FILTER;
++ NETIF_F_HW_VLAN_CTAG_FILTER |
++ NETIF_F_HW_VLAN_STAG_RX |
++ NETIF_F_HW_VLAN_STAG_FILTER;
+
+ bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
+ bond_dev->features |= bond_dev->hw_features;
+diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c
+index d78b4bd4787e8..b090b4408e3a0 100644
+--- a/drivers/net/dsa/ocelot/felix.c
++++ b/drivers/net/dsa/ocelot/felix.c
+@@ -1625,8 +1625,10 @@ static void felix_teardown(struct dsa_switch *ds)
+ struct felix *felix = ocelot_to_felix(ocelot);
+ struct dsa_port *dp;
+
++ rtnl_lock();
+ if (felix->tag_proto_ops)
+ felix->tag_proto_ops->teardown(ds);
++ rtnl_unlock();
+
+ dsa_switch_for_each_available_port(dp, ds)
+ ocelot_deinit_port(ocelot, dp->index);
+diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+index 7cd22d370caa3..4b371fbe67eac 100644
+--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
++++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+@@ -1222,50 +1222,81 @@ static int enetc_pf_register_with_ierb(struct pci_dev *pdev)
+ return enetc_ierb_register_pf(ierb_pdev, pdev);
+ }
+
+-static int enetc_pf_probe(struct pci_dev *pdev,
+- const struct pci_device_id *ent)
++static struct enetc_si *enetc_psi_create(struct pci_dev *pdev)
+ {
+- struct device_node *node = pdev->dev.of_node;
+- struct enetc_ndev_priv *priv;
+- struct net_device *ndev;
+ struct enetc_si *si;
+- struct enetc_pf *pf;
+ int err;
+
+- err = enetc_pf_register_with_ierb(pdev);
+- if (err == -EPROBE_DEFER)
+- return err;
+- if (err)
+- dev_warn(&pdev->dev,
+- "Could not register with IERB driver: %pe, please update the device tree\n",
+- ERR_PTR(err));
+-
+- err = enetc_pci_probe(pdev, KBUILD_MODNAME, sizeof(*pf));
+- if (err)
+- return dev_err_probe(&pdev->dev, err, "PCI probing failed\n");
++ err = enetc_pci_probe(pdev, KBUILD_MODNAME, sizeof(struct enetc_pf));
++ if (err) {
++ dev_err_probe(&pdev->dev, err, "PCI probing failed\n");
++ goto out;
++ }
+
+ si = pci_get_drvdata(pdev);
+ if (!si->hw.port || !si->hw.global) {
+ err = -ENODEV;
+ dev_err(&pdev->dev, "could not map PF space, probing a VF?\n");
+- goto err_map_pf_space;
++ goto out_pci_remove;
+ }
+
+ err = enetc_setup_cbdr(&pdev->dev, &si->hw, ENETC_CBDR_DEFAULT_SIZE,
+ &si->cbd_ring);
+ if (err)
+- goto err_setup_cbdr;
++ goto out_pci_remove;
+
+ err = enetc_init_port_rfs_memory(si);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to initialize RFS memory\n");
+- goto err_init_port_rfs;
++ goto out_teardown_cbdr;
+ }
+
+ err = enetc_init_port_rss_memory(si);
+ if (err) {
+ dev_err(&pdev->dev, "Failed to initialize RSS memory\n");
+- goto err_init_port_rss;
++ goto out_teardown_cbdr;
++ }
++
++ return si;
++
++out_teardown_cbdr:
++ enetc_teardown_cbdr(&si->cbd_ring);
++out_pci_remove:
++ enetc_pci_remove(pdev);
++out:
++ return ERR_PTR(err);
++}
++
++static void enetc_psi_destroy(struct pci_dev *pdev)
++{
++ struct enetc_si *si = pci_get_drvdata(pdev);
++
++ enetc_teardown_cbdr(&si->cbd_ring);
++ enetc_pci_remove(pdev);
++}
++
++static int enetc_pf_probe(struct pci_dev *pdev,
++ const struct pci_device_id *ent)
++{
++ struct device_node *node = pdev->dev.of_node;
++ struct enetc_ndev_priv *priv;
++ struct net_device *ndev;
++ struct enetc_si *si;
++ struct enetc_pf *pf;
++ int err;
++
++ err = enetc_pf_register_with_ierb(pdev);
++ if (err == -EPROBE_DEFER)
++ return err;
++ if (err)
++ dev_warn(&pdev->dev,
++ "Could not register with IERB driver: %pe, please update the device tree\n",
++ ERR_PTR(err));
++
++ si = enetc_psi_create(pdev);
++ if (IS_ERR(si)) {
++ err = PTR_ERR(si);
++ goto err_psi_create;
+ }
+
+ if (node && !of_device_is_available(node)) {
+@@ -1353,15 +1384,10 @@ err_alloc_si_res:
+ si->ndev = NULL;
+ free_netdev(ndev);
+ err_alloc_netdev:
+-err_init_port_rss:
+-err_init_port_rfs:
+ err_device_disabled:
+ err_setup_mac_addresses:
+- enetc_teardown_cbdr(&si->cbd_ring);
+-err_setup_cbdr:
+-err_map_pf_space:
+- enetc_pci_remove(pdev);
+-
++ enetc_psi_destroy(pdev);
++err_psi_create:
+ return err;
+ }
+
+@@ -1384,12 +1410,29 @@ static void enetc_pf_remove(struct pci_dev *pdev)
+ enetc_free_msix(priv);
+
+ enetc_free_si_resources(priv);
+- enetc_teardown_cbdr(&si->cbd_ring);
+
+ free_netdev(si->ndev);
+
+- enetc_pci_remove(pdev);
++ enetc_psi_destroy(pdev);
++}
++
++static void enetc_fixup_clear_rss_rfs(struct pci_dev *pdev)
++{
++ struct device_node *node = pdev->dev.of_node;
++ struct enetc_si *si;
++
++ /* Only apply quirk for disabled functions. For the ones
++ * that are enabled, enetc_pf_probe() will apply it.
++ */
++ if (node && of_device_is_available(node))
++ return;
++
++ si = enetc_psi_create(pdev);
++ if (si)
++ enetc_psi_destroy(pdev);
+ }
++DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_FREESCALE, ENETC_DEV_ID_PF,
++ enetc_fixup_clear_rss_rfs);
+
+ static const struct pci_device_id enetc_pf_id_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_FREESCALE, ENETC_DEV_ID_PF) },
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+index 32bb14303473b..207b2e3f3fc2b 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+@@ -461,9 +461,9 @@ static void hns3_dbg_fill_content(char *content, u16 len,
+ if (result) {
+ if (item_len < strlen(result[i]))
+ break;
+- strscpy(pos, result[i], strlen(result[i]));
++ memcpy(pos, result[i], strlen(result[i]));
+ } else {
+- strscpy(pos, items[i].name, strlen(items[i].name));
++ memcpy(pos, items[i].name, strlen(items[i].name));
+ }
+ pos += item_len;
+ len -= item_len;
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+index b676496ec6d7c..94acefd153bf7 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+@@ -5854,6 +5854,9 @@ void hns3_external_lb_prepare(struct net_device *ndev, bool if_running)
+ if (!if_running)
+ return;
+
++ if (test_and_set_bit(HNS3_NIC_STATE_DOWN, &priv->state))
++ return;
++
+ netif_carrier_off(ndev);
+ netif_tx_disable(ndev);
+
+@@ -5882,7 +5885,16 @@ void hns3_external_lb_restore(struct net_device *ndev, bool if_running)
+ if (!if_running)
+ return;
+
+- hns3_nic_reset_all_ring(priv->ae_handle);
++ if (hns3_nic_resetting(ndev))
++ return;
++
++ if (!test_bit(HNS3_NIC_STATE_DOWN, &priv->state))
++ return;
++
++ if (hns3_nic_reset_all_ring(priv->ae_handle))
++ return;
++
++ clear_bit(HNS3_NIC_STATE_DOWN, &priv->state);
+
+ for (i = 0; i < priv->vector_num; i++)
+ hns3_vector_enable(&priv->tqp_vector[i]);
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+index 409db2e709651..0fb2eaee3e8a0 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+@@ -111,9 +111,9 @@ static void hclge_dbg_fill_content(char *content, u16 len,
+ if (result) {
+ if (item_len < strlen(result[i]))
+ break;
+- strscpy(pos, result[i], strlen(result[i]));
++ memcpy(pos, result[i], strlen(result[i]));
+ } else {
+- strscpy(pos, items[i].name, strlen(items[i].name));
++ memcpy(pos, items[i].name, strlen(items[i].name));
+ }
+ pos += item_len;
+ len -= item_len;
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+index 2689b108f7df7..c3e94598f3983 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+@@ -72,6 +72,8 @@ static void hclge_restore_hw_table(struct hclge_dev *hdev);
+ static void hclge_sync_promisc_mode(struct hclge_dev *hdev);
+ static void hclge_sync_fd_table(struct hclge_dev *hdev);
+ static void hclge_update_fec_stats(struct hclge_dev *hdev);
++static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int link_ret,
++ int wait_cnt);
+
+ static struct hnae3_ae_algo ae_algo;
+
+@@ -7567,6 +7569,8 @@ static void hclge_enable_fd(struct hnae3_handle *handle, bool enable)
+
+ static void hclge_cfg_mac_mode(struct hclge_dev *hdev, bool enable)
+ {
++#define HCLGE_LINK_STATUS_WAIT_CNT 3
++
+ struct hclge_desc desc;
+ struct hclge_config_mac_mode_cmd *req =
+ (struct hclge_config_mac_mode_cmd *)desc.data;
+@@ -7591,9 +7595,15 @@ static void hclge_cfg_mac_mode(struct hclge_dev *hdev, bool enable)
+ req->txrx_pad_fcs_loop_en = cpu_to_le32(loop_en);
+
+ ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+- if (ret)
++ if (ret) {
+ dev_err(&hdev->pdev->dev,
+ "mac enable fail, ret =%d.\n", ret);
++ return;
++ }
++
++ if (!enable)
++ hclge_mac_link_status_wait(hdev, HCLGE_LINK_STATUS_DOWN,
++ HCLGE_LINK_STATUS_WAIT_CNT);
+ }
+
+ static int hclge_config_switch_param(struct hclge_dev *hdev, int vfid,
+@@ -7656,10 +7666,9 @@ static void hclge_phy_link_status_wait(struct hclge_dev *hdev,
+ } while (++i < HCLGE_PHY_LINK_STATUS_NUM);
+ }
+
+-static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int link_ret)
++static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int link_ret,
++ int wait_cnt)
+ {
+-#define HCLGE_MAC_LINK_STATUS_NUM 100
+-
+ int link_status;
+ int i = 0;
+ int ret;
+@@ -7672,13 +7681,15 @@ static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int link_ret)
+ return 0;
+
+ msleep(HCLGE_LINK_STATUS_MS);
+- } while (++i < HCLGE_MAC_LINK_STATUS_NUM);
++ } while (++i < wait_cnt);
+ return -EBUSY;
+ }
+
+ static int hclge_mac_phy_link_status_wait(struct hclge_dev *hdev, bool en,
+ bool is_phy)
+ {
++#define HCLGE_MAC_LINK_STATUS_NUM 100
++
+ int link_ret;
+
+ link_ret = en ? HCLGE_LINK_STATUS_UP : HCLGE_LINK_STATUS_DOWN;
+@@ -7686,7 +7697,8 @@ static int hclge_mac_phy_link_status_wait(struct hclge_dev *hdev, bool en,
+ if (is_phy)
+ hclge_phy_link_status_wait(hdev, link_ret);
+
+- return hclge_mac_link_status_wait(hdev, link_ret);
++ return hclge_mac_link_status_wait(hdev, link_ret,
++ HCLGE_MAC_LINK_STATUS_NUM);
+ }
+
+ static int hclge_set_app_loopback(struct hclge_dev *hdev, bool en)
+diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
+index 763d613adbcc0..df76cdaddcfb0 100644
+--- a/drivers/net/ethernet/ibm/ibmvnic.c
++++ b/drivers/net/ethernet/ibm/ibmvnic.c
+@@ -97,6 +97,8 @@ static int pending_scrq(struct ibmvnic_adapter *,
+ static union sub_crq *ibmvnic_next_scrq(struct ibmvnic_adapter *,
+ struct ibmvnic_sub_crq_queue *);
+ static int ibmvnic_poll(struct napi_struct *napi, int data);
++static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter);
++static inline void reinit_init_done(struct ibmvnic_adapter *adapter);
+ static void send_query_map(struct ibmvnic_adapter *adapter);
+ static int send_request_map(struct ibmvnic_adapter *, dma_addr_t, u32, u8);
+ static int send_request_unmap(struct ibmvnic_adapter *, u8);
+@@ -114,6 +116,7 @@ static void ibmvnic_tx_scrq_clean_buffer(struct ibmvnic_adapter *adapter,
+ static void free_long_term_buff(struct ibmvnic_adapter *adapter,
+ struct ibmvnic_long_term_buff *ltb);
+ static void ibmvnic_disable_irqs(struct ibmvnic_adapter *adapter);
++static void flush_reset_queue(struct ibmvnic_adapter *adapter);
+
+ struct ibmvnic_stat {
+ char name[ETH_GSTRING_LEN];
+@@ -1505,8 +1508,8 @@ static const char *adapter_state_to_string(enum vnic_state state)
+
+ static int ibmvnic_login(struct net_device *netdev)
+ {
++ unsigned long flags, timeout = msecs_to_jiffies(20000);
+ struct ibmvnic_adapter *adapter = netdev_priv(netdev);
+- unsigned long timeout = msecs_to_jiffies(20000);
+ int retry_count = 0;
+ int retries = 10;
+ bool retry;
+@@ -1527,11 +1530,9 @@ static int ibmvnic_login(struct net_device *netdev)
+
+ if (!wait_for_completion_timeout(&adapter->init_done,
+ timeout)) {
+- netdev_warn(netdev, "Login timed out, retrying...\n");
+- retry = true;
+- adapter->init_done_rc = 0;
+- retry_count++;
+- continue;
++ netdev_warn(netdev, "Login timed out\n");
++ adapter->login_pending = false;
++ goto partial_reset;
+ }
+
+ if (adapter->init_done_rc == ABORTED) {
+@@ -1573,10 +1574,69 @@ static int ibmvnic_login(struct net_device *netdev)
+ "SCRQ irq initialization failed\n");
+ return rc;
+ }
++ /* Default/timeout error handling, reset and start fresh */
+ } else if (adapter->init_done_rc) {
+ netdev_warn(netdev, "Adapter login failed, init_done_rc = %d\n",
+ adapter->init_done_rc);
+- return -EIO;
++
++partial_reset:
++ /* adapter login failed, so free any CRQs or sub-CRQs
++ * and register again before attempting to login again.
++ * If we don't do this then the VIOS may think that
++ * we are already logged in and reject any subsequent
++ * attempts
++ */
++ netdev_warn(netdev,
++ "Freeing and re-registering CRQs before attempting to login again\n");
++ retry = true;
++ adapter->init_done_rc = 0;
++ release_sub_crqs(adapter, true);
++ /* Much of this is similar logic as ibmvnic_probe(),
++ * we are essentially re-initializing communication
++ * with the server. We really should not run any
++ * resets/failovers here because this is already a form
++ * of reset and we do not want parallel resets occurring
++ */
++ do {
++ reinit_init_done(adapter);
++ /* Clear any failovers we got in the previous
++ * pass since we are re-initializing the CRQ
++ */
++ adapter->failover_pending = false;
++ release_crq_queue(adapter);
++ /* If we don't sleep here then we risk an
++ * unnecessary failover event from the VIOS.
++ * This is a known VIOS issue caused by a vnic
++ * device freeing and registering a CRQ too
++ * quickly.
++ */
++ msleep(1500);
++ /* Avoid any resets, since we are currently
++ * resetting.
++ */
++ spin_lock_irqsave(&adapter->rwi_lock, flags);
++ flush_reset_queue(adapter);
++ spin_unlock_irqrestore(&adapter->rwi_lock,
++ flags);
++
++ rc = init_crq_queue(adapter);
++ if (rc) {
++ netdev_err(netdev, "login recovery: init CRQ failed %d\n",
++ rc);
++ return -EIO;
++ }
++
++ rc = ibmvnic_reset_init(adapter, false);
++ if (rc)
++ netdev_err(netdev, "login recovery: Reset init failed %d\n",
++ rc);
++ /* IBMVNIC_CRQ_INIT will return EAGAIN if it
++ * fails, since ibmvnic_reset_init will free
++ * irq's in failure, we won't be able to receive
++ * new CRQs so we need to keep trying. probe()
++ * handles this similarly.
++ */
++ } while (rc == -EAGAIN && retry_count++ < retries);
+ }
+ } while (retry);
+
+@@ -1588,12 +1648,22 @@ static int ibmvnic_login(struct net_device *netdev)
+
+ static void release_login_buffer(struct ibmvnic_adapter *adapter)
+ {
++ if (!adapter->login_buf)
++ return;
++
++ dma_unmap_single(&adapter->vdev->dev, adapter->login_buf_token,
++ adapter->login_buf_sz, DMA_TO_DEVICE);
+ kfree(adapter->login_buf);
+ adapter->login_buf = NULL;
+ }
+
+ static void release_login_rsp_buffer(struct ibmvnic_adapter *adapter)
+ {
++ if (!adapter->login_rsp_buf)
++ return;
++
++ dma_unmap_single(&adapter->vdev->dev, adapter->login_rsp_buf_token,
++ adapter->login_rsp_buf_sz, DMA_FROM_DEVICE);
+ kfree(adapter->login_rsp_buf);
+ adapter->login_rsp_buf = NULL;
+ }
+@@ -4830,11 +4900,14 @@ static int send_login(struct ibmvnic_adapter *adapter)
+ if (rc) {
+ adapter->login_pending = false;
+ netdev_err(adapter->netdev, "Failed to send login, rc=%d\n", rc);
+- goto buf_rsp_map_failed;
++ goto buf_send_failed;
+ }
+
+ return 0;
+
++buf_send_failed:
++ dma_unmap_single(dev, rsp_buffer_token, rsp_buffer_size,
++ DMA_FROM_DEVICE);
+ buf_rsp_map_failed:
+ kfree(login_rsp_buffer);
+ adapter->login_rsp_buf = NULL;
+@@ -5396,6 +5469,7 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
+ int num_tx_pools;
+ int num_rx_pools;
+ u64 *size_array;
++ u32 rsp_len;
+ int i;
+
+ /* CHECK: Test/set of login_pending does not need to be atomic
+@@ -5407,11 +5481,6 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
+ }
+ adapter->login_pending = false;
+
+- dma_unmap_single(dev, adapter->login_buf_token, adapter->login_buf_sz,
+- DMA_TO_DEVICE);
+- dma_unmap_single(dev, adapter->login_rsp_buf_token,
+- adapter->login_rsp_buf_sz, DMA_FROM_DEVICE);
+-
+ /* If the number of queues requested can't be allocated by the
+ * server, the login response will return with code 1. We will need
+ * to resend the login buffer with fewer queues requested.
+@@ -5447,6 +5516,23 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq,
+ ibmvnic_reset(adapter, VNIC_RESET_FATAL);
+ return -EIO;
+ }
++
++ rsp_len = be32_to_cpu(login_rsp->len);
++ if (be32_to_cpu(login->login_rsp_len) < rsp_len ||
++ rsp_len <= be32_to_cpu(login_rsp->off_txsubm_subcrqs) ||
++ rsp_len <= be32_to_cpu(login_rsp->off_rxadd_subcrqs) ||
++ rsp_len <= be32_to_cpu(login_rsp->off_rxadd_buff_size) ||
++ rsp_len <= be32_to_cpu(login_rsp->off_supp_tx_desc)) {
++ /* This can happen if a login request times out and there are
++ * 2 outstanding login requests sent, the LOGIN_RSP crq
++ * could have been for the older login request. So we are
++ * parsing the newer response buffer which may be incomplete
++ */
++ dev_err(dev, "FATAL: Login rsp offsets/lengths invalid\n");
++ ibmvnic_reset(adapter, VNIC_RESET_FATAL);
++ return -EIO;
++ }
++
+ size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
+ be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size));
+ /* variable buffer sizes are not supported, so just read the
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+index 2f47cfa7f06e2..460ca561819a9 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+@@ -1401,14 +1401,15 @@ static int iavf_add_fdir_ethtool(struct iavf_adapter *adapter, struct ethtool_rx
+ if (fsp->flow_type & FLOW_MAC_EXT)
+ return -EINVAL;
+
++ spin_lock_bh(&adapter->fdir_fltr_lock);
+ if (adapter->fdir_active_fltr >= IAVF_MAX_FDIR_FILTERS) {
++ spin_unlock_bh(&adapter->fdir_fltr_lock);
+ dev_err(&adapter->pdev->dev,
+ "Unable to add Flow Director filter because VF reached the limit of max allowed filters (%u)\n",
+ IAVF_MAX_FDIR_FILTERS);
+ return -ENOSPC;
+ }
+
+- spin_lock_bh(&adapter->fdir_fltr_lock);
+ if (iavf_find_fdir_fltr_by_loc(adapter, fsp->location)) {
+ dev_err(&adapter->pdev->dev, "Failed to add Flow Director filter, it already exists\n");
+ spin_unlock_bh(&adapter->fdir_fltr_lock);
+@@ -1781,7 +1782,9 @@ static int iavf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd,
+ case ETHTOOL_GRXCLSRLCNT:
+ if (!FDIR_FLTR_SUPPORT(adapter))
+ break;
++ spin_lock_bh(&adapter->fdir_fltr_lock);
+ cmd->rule_cnt = adapter->fdir_active_fltr;
++ spin_unlock_bh(&adapter->fdir_fltr_lock);
+ cmd->data = IAVF_MAX_FDIR_FILTERS;
+ ret = 0;
+ break;
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_fdir.c b/drivers/net/ethernet/intel/iavf/iavf_fdir.c
+index 6146203efd84a..505e82ebafe47 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_fdir.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_fdir.c
+@@ -722,7 +722,9 @@ void iavf_print_fdir_fltr(struct iavf_adapter *adapter, struct iavf_fdir_fltr *f
+ bool iavf_fdir_is_dup_fltr(struct iavf_adapter *adapter, struct iavf_fdir_fltr *fltr)
+ {
+ struct iavf_fdir_fltr *tmp;
++ bool ret = false;
+
++ spin_lock_bh(&adapter->fdir_fltr_lock);
+ list_for_each_entry(tmp, &adapter->fdir_list_head, list) {
+ if (tmp->flow_type != fltr->flow_type)
+ continue;
+@@ -732,11 +734,14 @@ bool iavf_fdir_is_dup_fltr(struct iavf_adapter *adapter, struct iavf_fdir_fltr *
+ !memcmp(&tmp->ip_data, &fltr->ip_data,
+ sizeof(fltr->ip_data)) &&
+ !memcmp(&tmp->ext_data, &fltr->ext_data,
+- sizeof(fltr->ext_data)))
+- return true;
++ sizeof(fltr->ext_data))) {
++ ret = true;
++ break;
++ }
+ }
++ spin_unlock_bh(&adapter->fdir_fltr_lock);
+
+- return false;
++ return ret;
+ }
+
+ /**
+diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
+index 345d3a4e8ed44..2bd042dcd00f9 100644
+--- a/drivers/net/ethernet/intel/igc/igc.h
++++ b/drivers/net/ethernet/intel/igc/igc.h
+@@ -195,6 +195,10 @@ struct igc_adapter {
+ u32 qbv_config_change_errors;
+ bool qbv_transition;
+ unsigned int qbv_count;
++ /* Access to oper_gate_closed, admin_gate_closed and qbv_transition
++ * are protected by the qbv_tx_lock.
++ */
++ spinlock_t qbv_tx_lock;
+
+ /* OS defined structs */
+ struct pci_dev *pdev;
+diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
+index 3ccf2fedc5af7..2ae74870bbae2 100644
+--- a/drivers/net/ethernet/intel/igc/igc_main.c
++++ b/drivers/net/ethernet/intel/igc/igc_main.c
+@@ -4799,6 +4799,7 @@ static int igc_sw_init(struct igc_adapter *adapter)
+ adapter->nfc_rule_count = 0;
+
+ spin_lock_init(&adapter->stats64_lock);
++ spin_lock_init(&adapter->qbv_tx_lock);
+ /* Assume MSI-X interrupts, will be checked during IRQ allocation */
+ adapter->flags |= IGC_FLAG_HAS_MSIX;
+
+@@ -6117,15 +6118,15 @@ static int igc_tsn_enable_launchtime(struct igc_adapter *adapter,
+ return igc_tsn_offload_apply(adapter);
+ }
+
+-static int igc_tsn_clear_schedule(struct igc_adapter *adapter)
++static int igc_qbv_clear_schedule(struct igc_adapter *adapter)
+ {
++ unsigned long flags;
+ int i;
+
+ adapter->base_time = 0;
+ adapter->cycle_time = NSEC_PER_SEC;
+ adapter->taprio_offload_enable = false;
+ adapter->qbv_config_change_errors = 0;
+- adapter->qbv_transition = false;
+ adapter->qbv_count = 0;
+
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+@@ -6134,10 +6135,28 @@ static int igc_tsn_clear_schedule(struct igc_adapter *adapter)
+ ring->start_time = 0;
+ ring->end_time = NSEC_PER_SEC;
+ ring->max_sdu = 0;
++ }
++
++ spin_lock_irqsave(&adapter->qbv_tx_lock, flags);
++
++ adapter->qbv_transition = false;
++
++ for (i = 0; i < adapter->num_tx_queues; i++) {
++ struct igc_ring *ring = adapter->tx_ring[i];
++
+ ring->oper_gate_closed = false;
+ ring->admin_gate_closed = false;
+ }
+
++ spin_unlock_irqrestore(&adapter->qbv_tx_lock, flags);
++
++ return 0;
++}
++
++static int igc_tsn_clear_schedule(struct igc_adapter *adapter)
++{
++ igc_qbv_clear_schedule(adapter);
++
+ return 0;
+ }
+
+@@ -6148,6 +6167,7 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
+ struct igc_hw *hw = &adapter->hw;
+ u32 start_time = 0, end_time = 0;
+ struct timespec64 now;
++ unsigned long flags;
+ size_t n;
+ int i;
+
+@@ -6215,6 +6235,8 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
+ start_time += e->interval;
+ }
+
++ spin_lock_irqsave(&adapter->qbv_tx_lock, flags);
++
+ /* Check whether a queue gets configured.
+ * If not, set the start and end time to be end time.
+ */
+@@ -6239,6 +6261,8 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
+ }
+ }
+
++ spin_unlock_irqrestore(&adapter->qbv_tx_lock, flags);
++
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ struct igc_ring *ring = adapter->tx_ring[i];
+ struct net_device *dev = adapter->netdev;
+@@ -6603,8 +6627,11 @@ static enum hrtimer_restart igc_qbv_scheduling_timer(struct hrtimer *timer)
+ {
+ struct igc_adapter *adapter = container_of(timer, struct igc_adapter,
+ hrtimer);
++ unsigned long flags;
+ unsigned int i;
+
++ spin_lock_irqsave(&adapter->qbv_tx_lock, flags);
++
+ adapter->qbv_transition = true;
+ for (i = 0; i < adapter->num_tx_queues; i++) {
+ struct igc_ring *tx_ring = adapter->tx_ring[i];
+@@ -6617,6 +6644,9 @@ static enum hrtimer_restart igc_qbv_scheduling_timer(struct hrtimer *timer)
+ }
+ }
+ adapter->qbv_transition = false;
++
++ spin_unlock_irqrestore(&adapter->qbv_tx_lock, flags);
++
+ return HRTIMER_NORESTART;
+ }
+
+diff --git a/drivers/net/ethernet/marvell/prestera/prestera_router.c b/drivers/net/ethernet/marvell/prestera/prestera_router.c
+index a9a1028cb17bb..de317179a7dcc 100644
+--- a/drivers/net/ethernet/marvell/prestera/prestera_router.c
++++ b/drivers/net/ethernet/marvell/prestera/prestera_router.c
+@@ -166,11 +166,11 @@ prestera_util_neigh2nc_key(struct prestera_switch *sw, struct neighbour *n,
+
+ static bool __prestera_fi_is_direct(struct fib_info *fi)
+ {
+- struct fib_nh *fib_nh;
++ struct fib_nh_common *fib_nhc;
+
+ if (fib_info_num_path(fi) == 1) {
+- fib_nh = fib_info_nh(fi, 0);
+- if (fib_nh->fib_nh_gw_family == AF_UNSPEC)
++ fib_nhc = fib_info_nhc(fi, 0);
++ if (fib_nhc->nhc_gw_family == AF_UNSPEC)
+ return true;
+ }
+
+@@ -261,7 +261,7 @@ static bool
+ __prestera_util_kern_n_is_reachable_v4(u32 tb_id, __be32 *addr,
+ struct net_device *dev)
+ {
+- struct fib_nh *fib_nh;
++ struct fib_nh_common *fib_nhc;
+ struct fib_result res;
+ bool reachable;
+
+@@ -269,8 +269,8 @@ __prestera_util_kern_n_is_reachable_v4(u32 tb_id, __be32 *addr,
+
+ if (!prestera_util_kern_get_route(&res, tb_id, addr))
+ if (prestera_fi_is_direct(res.fi)) {
+- fib_nh = fib_info_nh(res.fi, 0);
+- if (dev == fib_nh->fib_nh_dev)
++ fib_nhc = fib_info_nhc(res.fi, 0);
++ if (dev == fib_nhc->nhc_dev)
+ reachable = true;
+ }
+
+@@ -324,7 +324,7 @@ prestera_kern_fib_info_nhc(struct fib_notifier_info *info, int n)
+ if (info->family == AF_INET) {
+ fen4_info = container_of(info, struct fib_entry_notifier_info,
+ info);
+- return &fib_info_nh(fen4_info->fi, n)->nh_common;
++ return fib_info_nhc(fen4_info->fi, n);
+ } else if (info->family == AF_INET6) {
+ fen6_info = container_of(info, struct fib6_entry_notifier_info,
+ info);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+index 0c88cf47af01b..1730f6a716eea 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+@@ -1461,10 +1461,12 @@ static void mlx5e_invalidate_encap(struct mlx5e_priv *priv,
+ attr = mlx5e_tc_get_encap_attr(flow);
+ esw_attr = attr->esw_attr;
+
+- if (flow_flag_test(flow, SLOW))
++ if (flow_flag_test(flow, SLOW)) {
+ mlx5e_tc_unoffload_from_slow_path(esw, flow);
+- else
++ } else {
+ mlx5e_tc_unoffload_fdb_rules(esw, flow, flow->attr);
++ mlx5e_tc_unoffload_flow_post_acts(flow);
++ }
+
+ mlx5e_tc_detach_mod_hdr(priv, flow, attr);
+ attr->modify_hdr = NULL;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+index f084513fbead4..7e6d0489854e3 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+@@ -5266,6 +5266,7 @@ void mlx5e_destroy_q_counters(struct mlx5e_priv *priv)
+ static int mlx5e_nic_init(struct mlx5_core_dev *mdev,
+ struct net_device *netdev)
+ {
++ const bool take_rtnl = netdev->reg_state == NETREG_REGISTERED;
+ struct mlx5e_priv *priv = netdev_priv(netdev);
+ struct mlx5e_flow_steering *fs;
+ int err;
+@@ -5294,9 +5295,19 @@ static int mlx5e_nic_init(struct mlx5_core_dev *mdev,
+ mlx5_core_err(mdev, "TLS initialization failed, %d\n", err);
+
+ mlx5e_health_create_reporters(priv);
++
++ /* If netdev is already registered (e.g. move from uplink to nic profile),
++ * RTNL lock must be held before triggering netdev notifiers.
++ */
++ if (take_rtnl)
++ rtnl_lock();
++
+ /* update XDP supported features */
+ mlx5e_set_xdp_feature(netdev);
+
++ if (take_rtnl)
++ rtnl_unlock();
++
+ return 0;
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+index e002f013fa015..73c827ee1a94e 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+@@ -1943,9 +1943,7 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv,
+ {
+ struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
+ struct mlx5_flow_attr *attr = flow->attr;
+- struct mlx5_esw_flow_attr *esw_attr;
+
+- esw_attr = attr->esw_attr;
+ mlx5e_put_flow_tunnel_id(flow);
+
+ remove_unready_flow(flow);
+@@ -1966,12 +1964,6 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv,
+
+ mlx5_tc_ct_match_del(get_ct_priv(priv), &flow->attr->ct_attr);
+
+- if (esw_attr->int_port)
+- mlx5e_tc_int_port_put(mlx5e_get_int_port_priv(priv), esw_attr->int_port);
+-
+- if (esw_attr->dest_int_port)
+- mlx5e_tc_int_port_put(mlx5e_get_int_port_priv(priv), esw_attr->dest_int_port);
+-
+ if (flow_flag_test(flow, L3_TO_L2_DECAP))
+ mlx5e_detach_decap(priv, flow);
+
+@@ -4250,6 +4242,7 @@ static void
+ mlx5_free_flow_attr_actions(struct mlx5e_tc_flow *flow, struct mlx5_flow_attr *attr)
+ {
+ struct mlx5_core_dev *counter_dev = get_flow_counter_dev(flow);
++ struct mlx5_esw_flow_attr *esw_attr;
+
+ if (!attr)
+ return;
+@@ -4267,6 +4260,18 @@ mlx5_free_flow_attr_actions(struct mlx5e_tc_flow *flow, struct mlx5_flow_attr *a
+ mlx5e_tc_detach_mod_hdr(flow->priv, flow, attr);
+ }
+
++ if (mlx5e_is_eswitch_flow(flow)) {
++ esw_attr = attr->esw_attr;
++
++ if (esw_attr->int_port)
++ mlx5e_tc_int_port_put(mlx5e_get_int_port_priv(flow->priv),
++ esw_attr->int_port);
++
++ if (esw_attr->dest_int_port)
++ mlx5e_tc_int_port_put(mlx5e_get_int_port_priv(flow->priv),
++ esw_attr->dest_int_port);
++ }
++
+ mlx5_tc_ct_delete_flow(get_ct_priv(flow->priv), attr);
+
+ free_branch_attr(flow, attr->branch_true);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
+index d3a3fe4ce6702..7d9bbb494d95b 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
+@@ -574,7 +574,7 @@ static int __mlx5_lag_modify_definers_destinations(struct mlx5_lag *ldev,
+ for (i = 0; i < ldev->ports; i++) {
+ for (j = 0; j < ldev->buckets; j++) {
+ idx = i * ldev->buckets + j;
+- if (ldev->v2p_map[i] == ports[i])
++ if (ldev->v2p_map[idx] == ports[idx])
+ continue;
+
+ dest.vport.vhca_id = MLX5_CAP_GEN(ldev->pf[ports[idx] - 1].dev,
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+index 932fbc843c692..dba4c5e2f7667 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+@@ -221,10 +221,15 @@ static void mlx5_timestamp_overflow(struct work_struct *work)
+ clock = container_of(timer, struct mlx5_clock, timer);
+ mdev = container_of(clock, struct mlx5_core_dev, clock);
+
++ if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
++ goto out;
++
+ write_seqlock_irqsave(&clock->lock, flags);
+ timecounter_read(&timer->tc);
+ mlx5_update_clock_info_page(mdev);
+ write_sequnlock_irqrestore(&clock->lock, flags);
++
++out:
+ schedule_delayed_work(&timer->overflow_work, timer->overflow_period);
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+index c7a06c8bbb7a3..3216839776548 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+@@ -1845,7 +1845,7 @@ static pci_ers_result_t mlx5_pci_err_detected(struct pci_dev *pdev,
+
+ mlx5_enter_error_state(dev, false);
+ mlx5_error_sw_reset(dev);
+- mlx5_unload_one(dev, true);
++ mlx5_unload_one(dev, false);
+ mlx5_drain_health_wq(dev);
+ mlx5_pci_disable_device(dev);
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+index 20d7662c10fb6..5f2195e65dd62 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+@@ -264,8 +264,7 @@ static u16 mlx5_get_max_vfs(struct mlx5_core_dev *dev)
+ host_total_vfs = MLX5_GET(query_esw_functions_out, out,
+ host_params_context.host_total_vfs);
+ kvfree(out);
+- if (host_total_vfs)
+- return host_total_vfs;
++ return host_total_vfs;
+ }
+
+ done:
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c
+index d6947fe13d560..8ca534ef5d031 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c
+@@ -82,7 +82,7 @@ dr_ptrn_alloc_pattern(struct mlx5dr_ptrn_mgr *mgr,
+ u32 chunk_size;
+ u32 index;
+
+- chunk_size = ilog2(num_of_actions);
++ chunk_size = ilog2(roundup_pow_of_two(num_of_actions));
+ /* HW modify action index granularity is at least 64B */
+ chunk_size = max_t(u32, chunk_size, DR_CHUNK_SIZE_8);
+
+diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
+index d907727c7b7a5..96c78f7db2543 100644
+--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
++++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
+@@ -8,6 +8,7 @@
+ #include <linux/ethtool.h>
+ #include <linux/filter.h>
+ #include <linux/mm.h>
++#include <linux/pci.h>
+
+ #include <net/checksum.h>
+ #include <net/ip6_checksum.h>
+@@ -2328,9 +2329,12 @@ int mana_attach(struct net_device *ndev)
+ static int mana_dealloc_queues(struct net_device *ndev)
+ {
+ struct mana_port_context *apc = netdev_priv(ndev);
++ unsigned long timeout = jiffies + 120 * HZ;
+ struct gdma_dev *gd = apc->ac->gdma_dev;
+ struct mana_txq *txq;
++ struct sk_buff *skb;
+ int i, err;
++ u32 tsleep;
+
+ if (apc->port_is_up)
+ return -EINVAL;
+@@ -2346,15 +2350,40 @@ static int mana_dealloc_queues(struct net_device *ndev)
+ * to false, but it doesn't matter since mana_start_xmit() drops any
+ * new packets due to apc->port_is_up being false.
+ *
+- * Drain all the in-flight TX packets
++ * Drain all the in-flight TX packets.
++ * A timeout of 120 seconds for all the queues is used.
++ * This will break the while loop when h/w is not responding.
++ * This value of 120 has been decided here considering max
++ * number of queues.
+ */
++
+ for (i = 0; i < apc->num_queues; i++) {
+ txq = &apc->tx_qp[i].txq;
+-
+- while (atomic_read(&txq->pending_sends) > 0)
+- usleep_range(1000, 2000);
++ tsleep = 1000;
++ while (atomic_read(&txq->pending_sends) > 0 &&
++ time_before(jiffies, timeout)) {
++ usleep_range(tsleep, tsleep + 1000);
++ tsleep <<= 1;
++ }
++ if (atomic_read(&txq->pending_sends)) {
++ err = pcie_flr(to_pci_dev(gd->gdma_context->dev));
++ if (err) {
++ netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n",
++ err, atomic_read(&txq->pending_sends),
++ txq->gdma_txq_id);
++ }
++ break;
++ }
+ }
+
++ for (i = 0; i < apc->num_queues; i++) {
++ txq = &apc->tx_qp[i].txq;
++ while ((skb = skb_dequeue(&txq->pending_skbs))) {
++ mana_unmap_skb(skb, apc);
++ dev_kfree_skb_any(skb);
++ }
++ atomic_set(&txq->pending_sends, 0);
++ }
+ /* We're 100% sure the queues can no longer be woken up, because
+ * we're sure now mana_poll_tx_cq() can't be running.
+ */
+diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+index e03a94f2469ab..a25a202ad75ae 100644
+--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
++++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+@@ -1816,6 +1816,7 @@ static int ionic_change_mtu(struct net_device *netdev, int new_mtu)
+ static void ionic_tx_timeout_work(struct work_struct *ws)
+ {
+ struct ionic_lif *lif = container_of(ws, struct ionic_lif, tx_timeout_work);
++ int err;
+
+ if (test_bit(IONIC_LIF_F_FW_RESET, lif->state))
+ return;
+@@ -1828,8 +1829,11 @@ static void ionic_tx_timeout_work(struct work_struct *ws)
+
+ mutex_lock(&lif->queue_lock);
+ ionic_stop_queues_reconfig(lif);
+- ionic_start_queues_reconfig(lif);
++ err = ionic_start_queues_reconfig(lif);
+ mutex_unlock(&lif->queue_lock);
++
++ if (err)
++ dev_err(lif->ionic->dev, "%s: Restarting queues failed\n", __func__);
+ }
+
+ static void ionic_tx_timeout(struct net_device *netdev, unsigned int txqueue)
+@@ -2799,17 +2803,22 @@ static int ionic_cmb_reconfig(struct ionic_lif *lif,
+ if (err) {
+ dev_err(lif->ionic->dev,
+ "CMB restore failed: %d\n", err);
+- goto errout;
++ goto err_out;
+ }
+ }
+
+- ionic_start_queues_reconfig(lif);
+- } else {
+- /* This was detached in ionic_stop_queues_reconfig() */
+- netif_device_attach(lif->netdev);
++ err = ionic_start_queues_reconfig(lif);
++ if (err) {
++ dev_err(lif->ionic->dev,
++ "CMB reconfig failed: %d\n", err);
++ goto err_out;
++ }
+ }
+
+-errout:
++err_out:
++ /* This was detached in ionic_stop_queues_reconfig() */
++ netif_device_attach(lif->netdev);
++
+ return err;
+ }
+
+diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
+index 984dfa5d6c11c..144ec756c796a 100644
+--- a/drivers/net/macsec.c
++++ b/drivers/net/macsec.c
+@@ -743,7 +743,7 @@ static bool macsec_post_decrypt(struct sk_buff *skb, struct macsec_secy *secy, u
+ u64_stats_update_begin(&rxsc_stats->syncp);
+ rxsc_stats->stats.InPktsLate++;
+ u64_stats_update_end(&rxsc_stats->syncp);
+- secy->netdev->stats.rx_dropped++;
++ DEV_STATS_INC(secy->netdev, rx_dropped);
+ return false;
+ }
+
+@@ -767,7 +767,7 @@ static bool macsec_post_decrypt(struct sk_buff *skb, struct macsec_secy *secy, u
+ rxsc_stats->stats.InPktsNotValid++;
+ u64_stats_update_end(&rxsc_stats->syncp);
+ this_cpu_inc(rx_sa->stats->InPktsNotValid);
+- secy->netdev->stats.rx_errors++;
++ DEV_STATS_INC(secy->netdev, rx_errors);
+ return false;
+ }
+
+@@ -1069,7 +1069,7 @@ static enum rx_handler_result handle_not_macsec(struct sk_buff *skb)
+ u64_stats_update_begin(&secy_stats->syncp);
+ secy_stats->stats.InPktsNoTag++;
+ u64_stats_update_end(&secy_stats->syncp);
+- macsec->secy.netdev->stats.rx_dropped++;
++ DEV_STATS_INC(macsec->secy.netdev, rx_dropped);
+ continue;
+ }
+
+@@ -1179,7 +1179,7 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
+ u64_stats_update_begin(&secy_stats->syncp);
+ secy_stats->stats.InPktsBadTag++;
+ u64_stats_update_end(&secy_stats->syncp);
+- secy->netdev->stats.rx_errors++;
++ DEV_STATS_INC(secy->netdev, rx_errors);
+ goto drop_nosa;
+ }
+
+@@ -1196,7 +1196,7 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
+ u64_stats_update_begin(&rxsc_stats->syncp);
+ rxsc_stats->stats.InPktsNotUsingSA++;
+ u64_stats_update_end(&rxsc_stats->syncp);
+- secy->netdev->stats.rx_errors++;
++ DEV_STATS_INC(secy->netdev, rx_errors);
+ if (active_rx_sa)
+ this_cpu_inc(active_rx_sa->stats->InPktsNotUsingSA);
+ goto drop_nosa;
+@@ -1230,7 +1230,7 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
+ u64_stats_update_begin(&rxsc_stats->syncp);
+ rxsc_stats->stats.InPktsLate++;
+ u64_stats_update_end(&rxsc_stats->syncp);
+- macsec->secy.netdev->stats.rx_dropped++;
++ DEV_STATS_INC(macsec->secy.netdev, rx_dropped);
+ goto drop;
+ }
+ }
+@@ -1271,7 +1271,7 @@ deliver:
+ if (ret == NET_RX_SUCCESS)
+ count_rx(dev, len);
+ else
+- macsec->secy.netdev->stats.rx_dropped++;
++ DEV_STATS_INC(macsec->secy.netdev, rx_dropped);
+
+ rcu_read_unlock();
+
+@@ -1308,7 +1308,7 @@ nosci:
+ u64_stats_update_begin(&secy_stats->syncp);
+ secy_stats->stats.InPktsNoSCI++;
+ u64_stats_update_end(&secy_stats->syncp);
+- macsec->secy.netdev->stats.rx_errors++;
++ DEV_STATS_INC(macsec->secy.netdev, rx_errors);
+ continue;
+ }
+
+@@ -1327,7 +1327,7 @@ nosci:
+ secy_stats->stats.InPktsUnknownSCI++;
+ u64_stats_update_end(&secy_stats->syncp);
+ } else {
+- macsec->secy.netdev->stats.rx_dropped++;
++ DEV_STATS_INC(macsec->secy.netdev, rx_dropped);
+ }
+ }
+
+@@ -3422,7 +3422,7 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb,
+
+ if (!secy->operational) {
+ kfree_skb(skb);
+- dev->stats.tx_dropped++;
++ DEV_STATS_INC(dev, tx_dropped);
+ return NETDEV_TX_OK;
+ }
+
+@@ -3430,7 +3430,7 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb,
+ skb = macsec_encrypt(skb, dev);
+ if (IS_ERR(skb)) {
+ if (PTR_ERR(skb) != -EINPROGRESS)
+- dev->stats.tx_dropped++;
++ DEV_STATS_INC(dev, tx_dropped);
+ return NETDEV_TX_OK;
+ }
+
+@@ -3667,9 +3667,9 @@ static void macsec_get_stats64(struct net_device *dev,
+
+ dev_fetch_sw_netstats(s, dev->tstats);
+
+- s->rx_dropped = dev->stats.rx_dropped;
+- s->tx_dropped = dev->stats.tx_dropped;
+- s->rx_errors = dev->stats.rx_errors;
++ s->rx_dropped = atomic_long_read(&dev->stats.__rx_dropped);
++ s->tx_dropped = atomic_long_read(&dev->stats.__tx_dropped);
++ s->rx_errors = atomic_long_read(&dev->stats.__rx_errors);
+ }
+
+ static int macsec_get_iflink(const struct net_device *dev)
+diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
+index 656136628ffd8..ef6dc008e4c50 100644
+--- a/drivers/net/phy/at803x.c
++++ b/drivers/net/phy/at803x.c
+@@ -2086,8 +2086,6 @@ static struct phy_driver at803x_driver[] = {
+ .flags = PHY_POLL_CABLE_TEST,
+ .config_init = at803x_config_init,
+ .link_change_notify = at803x_link_change_notify,
+- .set_wol = at803x_set_wol,
+- .get_wol = at803x_get_wol,
+ .suspend = at803x_suspend,
+ .resume = at803x_resume,
+ /* PHY_BASIC_FEATURES */
+diff --git a/drivers/net/tun.c b/drivers/net/tun.c
+index 25f0191df00bf..100339bc8b04a 100644
+--- a/drivers/net/tun.c
++++ b/drivers/net/tun.c
+@@ -1594,7 +1594,7 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
+ if (zerocopy)
+ return false;
+
+- if (SKB_DATA_ALIGN(len + TUN_RX_PAD) +
++ if (SKB_DATA_ALIGN(len + TUN_RX_PAD + XDP_PACKET_HEADROOM) +
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) > PAGE_SIZE)
+ return false;
+
+diff --git a/drivers/net/vxlan/vxlan_vnifilter.c b/drivers/net/vxlan/vxlan_vnifilter.c
+index a3de081cda5ee..c3ff30ab782e9 100644
+--- a/drivers/net/vxlan/vxlan_vnifilter.c
++++ b/drivers/net/vxlan/vxlan_vnifilter.c
+@@ -713,6 +713,12 @@ static struct vxlan_vni_node *vxlan_vni_alloc(struct vxlan_dev *vxlan,
+ return vninode;
+ }
+
++static void vxlan_vni_free(struct vxlan_vni_node *vninode)
++{
++ free_percpu(vninode->stats);
++ kfree(vninode);
++}
++
+ static int vxlan_vni_add(struct vxlan_dev *vxlan,
+ struct vxlan_vni_group *vg,
+ u32 vni, union vxlan_addr *group,
+@@ -740,7 +746,7 @@ static int vxlan_vni_add(struct vxlan_dev *vxlan,
+ &vninode->vnode,
+ vxlan_vni_rht_params);
+ if (err) {
+- kfree(vninode);
++ vxlan_vni_free(vninode);
+ return err;
+ }
+
+@@ -763,8 +769,7 @@ static void vxlan_vni_node_rcu_free(struct rcu_head *rcu)
+ struct vxlan_vni_node *v;
+
+ v = container_of(rcu, struct vxlan_vni_node, rcu);
+- free_percpu(v->stats);
+- kfree(v);
++ vxlan_vni_free(v);
+ }
+
+ static int vxlan_vni_del(struct vxlan_dev *vxlan,
+diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c
+index 5bf7822c53f18..0ba714ca5185c 100644
+--- a/drivers/net/wireguard/allowedips.c
++++ b/drivers/net/wireguard/allowedips.c
+@@ -6,7 +6,7 @@
+ #include "allowedips.h"
+ #include "peer.h"
+
+-enum { MAX_ALLOWEDIPS_BITS = 128 };
++enum { MAX_ALLOWEDIPS_DEPTH = 129 };
+
+ static struct kmem_cache *node_cache;
+
+@@ -42,7 +42,7 @@ static void push_rcu(struct allowedips_node **stack,
+ struct allowedips_node __rcu *p, unsigned int *len)
+ {
+ if (rcu_access_pointer(p)) {
+- if (WARN_ON(IS_ENABLED(DEBUG) && *len >= MAX_ALLOWEDIPS_BITS))
++ if (WARN_ON(IS_ENABLED(DEBUG) && *len >= MAX_ALLOWEDIPS_DEPTH))
+ return;
+ stack[(*len)++] = rcu_dereference_raw(p);
+ }
+@@ -55,7 +55,7 @@ static void node_free_rcu(struct rcu_head *rcu)
+
+ static void root_free_rcu(struct rcu_head *rcu)
+ {
+- struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = {
++ struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_DEPTH] = {
+ container_of(rcu, struct allowedips_node, rcu) };
+ unsigned int len = 1;
+
+@@ -68,7 +68,7 @@ static void root_free_rcu(struct rcu_head *rcu)
+
+ static void root_remove_peer_lists(struct allowedips_node *root)
+ {
+- struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_BITS] = { root };
++ struct allowedips_node *node, *stack[MAX_ALLOWEDIPS_DEPTH] = { root };
+ unsigned int len = 1;
+
+ while (len > 0 && (node = stack[--len])) {
+diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c
+index 78ebe2892a788..3d1f64ff2e122 100644
+--- a/drivers/net/wireguard/selftest/allowedips.c
++++ b/drivers/net/wireguard/selftest/allowedips.c
+@@ -593,16 +593,20 @@ bool __init wg_allowedips_selftest(void)
+ wg_allowedips_remove_by_peer(&t, a, &mutex);
+ test_negative(4, a, 192, 168, 0, 1);
+
+- /* These will hit the WARN_ON(len >= MAX_ALLOWEDIPS_BITS) in free_node
++ /* These will hit the WARN_ON(len >= MAX_ALLOWEDIPS_DEPTH) in free_node
+ * if something goes wrong.
+ */
+- for (i = 0; i < MAX_ALLOWEDIPS_BITS; ++i) {
+- part = cpu_to_be64(~(1LLU << (i % 64)));
+- memset(&ip, 0xff, 16);
+- memcpy((u8 *)&ip + (i < 64) * 8, &part, 8);
++ for (i = 0; i < 64; ++i) {
++ part = cpu_to_be64(~0LLU << i);
++ memset(&ip, 0xff, 8);
++ memcpy((u8 *)&ip + 8, &part, 8);
++ wg_allowedips_insert_v6(&t, &ip, 128, a, &mutex);
++ memcpy(&ip, &part, 8);
++ memset((u8 *)&ip + 8, 0, 8);
+ wg_allowedips_insert_v6(&t, &ip, 128, a, &mutex);
+ }
+-
++ memset(&ip, 0, 16);
++ wg_allowedips_insert_v6(&t, &ip, 128, a, &mutex);
+ wg_allowedips_free(&t, &mutex);
+
+ wg_allowedips_init(&t);
+diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
+index de8a2e27f49c7..2a90bb24ba77f 100644
+--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
++++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
+@@ -1456,6 +1456,10 @@ brcmf_run_escan(struct brcmf_cfg80211_info *cfg, struct brcmf_if *ifp,
+ params_size -= BRCMF_SCAN_PARAMS_V2_FIXED_SIZE;
+ params_size += BRCMF_SCAN_PARAMS_FIXED_SIZE;
+ params_v1 = kzalloc(params_size, GFP_KERNEL);
++ if (!params_v1) {
++ err = -ENOMEM;
++ goto exit_params;
++ }
+ params_v1->version = cpu_to_le32(BRCMF_ESCAN_REQ_VERSION);
+ brcmf_scan_params_v2_to_v1(&params->params_v2_le, &params_v1->params_le);
+ kfree(params);
+@@ -1473,6 +1477,7 @@ brcmf_run_escan(struct brcmf_cfg80211_info *cfg, struct brcmf_if *ifp,
+ bphy_err(drvr, "error (%d)\n", err);
+ }
+
++exit_params:
+ kfree(params);
+ exit:
+ return err;
+diff --git a/drivers/net/wireless/realtek/rtw89/mac.c b/drivers/net/wireless/realtek/rtw89/mac.c
+index 512de491a064b..e31c0cdfd16c8 100644
+--- a/drivers/net/wireless/realtek/rtw89/mac.c
++++ b/drivers/net/wireless/realtek/rtw89/mac.c
+@@ -2484,7 +2484,7 @@ static int cmac_dma_init(struct rtw89_dev *rtwdev, u8 mac_idx)
+ u32 reg;
+ int ret;
+
+- if (chip_id != RTL8852A && chip_id != RTL8852B)
++ if (chip_id != RTL8852B)
+ return 0;
+
+ ret = rtw89_mac_check_mac_en(rtwdev, mac_idx, RTW89_CMAC_SEL);
+diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
+index 45f1dac07685d..c61173be41270 100644
+--- a/drivers/nvme/host/core.c
++++ b/drivers/nvme/host/core.c
+@@ -4728,6 +4728,12 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
+ */
+ nvme_mpath_clear_ctrl_paths(ctrl);
+
++ /*
++ * Unquiesce io queues so any pending IO won't hang, especially
++ * those submitted from scan work
++ */
++ nvme_unquiesce_io_queues(ctrl);
++
+ /* prevent racing with ns scanning */
+ flush_work(&ctrl->scan_work);
+
+@@ -4737,10 +4743,8 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
+ * removing the namespaces' disks; fail all the queues now to avoid
+ * potentially having to clean up the failed sync later.
+ */
+- if (ctrl->state == NVME_CTRL_DEAD) {
++ if (ctrl->state == NVME_CTRL_DEAD)
+ nvme_mark_namespaces_dead(ctrl);
+- nvme_unquiesce_io_queues(ctrl);
+- }
+
+ /* this is a no-op when called from the controller reset handler */
+ nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO);
+diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
+index 5b5303f0e2c20..a277ef16392ca 100644
+--- a/drivers/nvme/host/pci.c
++++ b/drivers/nvme/host/pci.c
+@@ -3391,7 +3391,8 @@ static const struct pci_device_id nvme_id_table[] = {
+ { PCI_DEVICE(0x1d97, 0x2263), /* SPCC */
+ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
+ { PCI_DEVICE(0x144d, 0xa80b), /* Samsung PM9B1 256G and 512G */
+- .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
++ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES |
++ NVME_QUIRK_BOGUS_NID, },
+ { PCI_DEVICE(0x144d, 0xa809), /* Samsung MZALQ256HBJD 256G */
+ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
+ { PCI_DEVICE(0x1cc4, 0x6303), /* UMIS RPJTJ512MGE1QDY 512G */
+diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
+index 0eb79696fb736..354cce8853c1c 100644
+--- a/drivers/nvme/host/rdma.c
++++ b/drivers/nvme/host/rdma.c
+@@ -918,6 +918,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
+ goto out_cleanup_tagset;
+
+ if (!new) {
++ nvme_start_freeze(&ctrl->ctrl);
+ nvme_unquiesce_io_queues(&ctrl->ctrl);
+ if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+ /*
+@@ -926,6 +927,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
+ * to be safe.
+ */
+ ret = -ENODEV;
++ nvme_unfreeze(&ctrl->ctrl);
+ goto out_wait_freeze_timed_out;
+ }
+ blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
+@@ -975,7 +977,6 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
+ bool remove)
+ {
+ if (ctrl->ctrl.queue_count > 1) {
+- nvme_start_freeze(&ctrl->ctrl);
+ nvme_quiesce_io_queues(&ctrl->ctrl);
+ nvme_sync_io_queues(&ctrl->ctrl);
+ nvme_rdma_stop_io_queues(ctrl);
+diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
+index bf0230442d570..5ae08e9cb16de 100644
+--- a/drivers/nvme/host/tcp.c
++++ b/drivers/nvme/host/tcp.c
+@@ -1909,6 +1909,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
+ goto out_cleanup_connect_q;
+
+ if (!new) {
++ nvme_start_freeze(ctrl);
+ nvme_unquiesce_io_queues(ctrl);
+ if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
+ /*
+@@ -1917,6 +1918,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
+ * to be safe.
+ */
+ ret = -ENODEV;
++ nvme_unfreeze(ctrl);
+ goto out_wait_freeze_timed_out;
+ }
+ blk_mq_update_nr_hw_queues(ctrl->tagset,
+@@ -2021,7 +2023,6 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
+ if (ctrl->queue_count <= 1)
+ return;
+ nvme_quiesce_admin_queue(ctrl);
+- nvme_start_freeze(ctrl);
+ nvme_quiesce_io_queues(ctrl);
+ nvme_sync_io_queues(ctrl);
+ nvme_tcp_stop_io_queues(ctrl);
+diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
+index 5bc81cc0a2de4..46b252bbe5000 100644
+--- a/drivers/pci/bus.c
++++ b/drivers/pci/bus.c
+@@ -11,6 +11,7 @@
+ #include <linux/pci.h>
+ #include <linux/errno.h>
+ #include <linux/ioport.h>
++#include <linux/of.h>
+ #include <linux/proc_fs.h>
+ #include <linux/slab.h>
+
+@@ -332,6 +333,7 @@ void __weak pcibios_bus_add_device(struct pci_dev *pdev) { }
+ */
+ void pci_bus_add_device(struct pci_dev *dev)
+ {
++ struct device_node *dn = dev->dev.of_node;
+ int retval;
+
+ /*
+@@ -344,7 +346,7 @@ void pci_bus_add_device(struct pci_dev *dev)
+ pci_proc_attach_device(dev);
+ pci_bridge_d3_update(dev);
+
+- dev->match_driver = true;
++ dev->match_driver = !dn || of_device_is_available(dn);
+ retval = device_attach(&dev->dev);
+ if (retval < 0 && retval != -EPROBE_DEFER)
+ pci_warn(dev, "device attach failed (%d)\n", retval);
+diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
+index 8d49bad7f8472..0859be86e7183 100644
+--- a/drivers/pci/controller/Kconfig
++++ b/drivers/pci/controller/Kconfig
+@@ -179,7 +179,6 @@ config PCI_MVEBU
+ depends on MVEBU_MBUS
+ depends on ARM
+ depends on OF
+- depends on BROKEN
+ select PCI_BRIDGE_EMUL
+ help
+ Add support for Marvell EBU PCIe controller. This PCIe controller
+diff --git a/drivers/pci/of.c b/drivers/pci/of.c
+index 2c25f4fa0225a..6f305362ba304 100644
+--- a/drivers/pci/of.c
++++ b/drivers/pci/of.c
+@@ -34,11 +34,6 @@ int pci_set_of_node(struct pci_dev *dev)
+ if (!node)
+ return 0;
+
+- if (!of_device_is_available(node)) {
+- of_node_put(node);
+- return -ENODEV;
+- }
+-
+ dev->dev.of_node = node;
+ dev->dev.fwnode = &node->fwnode;
+ return 0;
+diff --git a/drivers/platform/x86/lenovo-ymc.c b/drivers/platform/x86/lenovo-ymc.c
+index 41676188b3738..f360370d50027 100644
+--- a/drivers/platform/x86/lenovo-ymc.c
++++ b/drivers/platform/x86/lenovo-ymc.c
+@@ -24,6 +24,10 @@ static bool ec_trigger __read_mostly;
+ module_param(ec_trigger, bool, 0444);
+ MODULE_PARM_DESC(ec_trigger, "Enable EC triggering work-around to force emitting tablet mode events");
+
++static bool force;
++module_param(force, bool, 0444);
++MODULE_PARM_DESC(force, "Force loading on boards without a convertible DMI chassis-type");
++
+ static const struct dmi_system_id ec_trigger_quirk_dmi_table[] = {
+ {
+ /* Lenovo Yoga 7 14ARB7 */
+@@ -35,6 +39,20 @@ static const struct dmi_system_id ec_trigger_quirk_dmi_table[] = {
+ { }
+ };
+
++static const struct dmi_system_id allowed_chasis_types_dmi_table[] = {
++ {
++ .matches = {
++ DMI_EXACT_MATCH(DMI_CHASSIS_TYPE, "31" /* Convertible */),
++ },
++ },
++ {
++ .matches = {
++ DMI_EXACT_MATCH(DMI_CHASSIS_TYPE, "32" /* Detachable */),
++ },
++ },
++ { }
++};
++
+ struct lenovo_ymc_private {
+ struct input_dev *input_dev;
+ struct acpi_device *ec_acpi_dev;
+@@ -111,6 +129,13 @@ static int lenovo_ymc_probe(struct wmi_device *wdev, const void *ctx)
+ struct input_dev *input_dev;
+ int err;
+
++ if (!dmi_check_system(allowed_chasis_types_dmi_table)) {
++ if (force)
++ dev_info(&wdev->dev, "Force loading Lenovo YMC support\n");
++ else
++ return -ENODEV;
++ }
++
+ ec_trigger |= dmi_check_system(ec_trigger_quirk_dmi_table);
+
+ priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
+diff --git a/drivers/platform/x86/mlx-platform.c b/drivers/platform/x86/mlx-platform.c
+index 67367f010139e..7d33977d9c609 100644
+--- a/drivers/platform/x86/mlx-platform.c
++++ b/drivers/platform/x86/mlx-platform.c
+@@ -62,10 +62,6 @@
+ #define MLXPLAT_CPLD_LPC_REG_PWM_CONTROL_OFFSET 0x37
+ #define MLXPLAT_CPLD_LPC_REG_AGGR_OFFSET 0x3a
+ #define MLXPLAT_CPLD_LPC_REG_AGGR_MASK_OFFSET 0x3b
+-#define MLXPLAT_CPLD_LPC_REG_DBG1_OFFSET 0x3c
+-#define MLXPLAT_CPLD_LPC_REG_DBG2_OFFSET 0x3d
+-#define MLXPLAT_CPLD_LPC_REG_DBG3_OFFSET 0x3e
+-#define MLXPLAT_CPLD_LPC_REG_DBG4_OFFSET 0x3f
+ #define MLXPLAT_CPLD_LPC_REG_AGGRLO_OFFSET 0x40
+ #define MLXPLAT_CPLD_LPC_REG_AGGRLO_MASK_OFFSET 0x41
+ #define MLXPLAT_CPLD_LPC_REG_AGGRCO_OFFSET 0x42
+@@ -126,6 +122,10 @@
+ #define MLXPLAT_CPLD_LPC_REG_LC_SD_EVENT_OFFSET 0xaa
+ #define MLXPLAT_CPLD_LPC_REG_LC_SD_MASK_OFFSET 0xab
+ #define MLXPLAT_CPLD_LPC_REG_LC_PWR_ON 0xb2
++#define MLXPLAT_CPLD_LPC_REG_DBG1_OFFSET 0xb6
++#define MLXPLAT_CPLD_LPC_REG_DBG2_OFFSET 0xb7
++#define MLXPLAT_CPLD_LPC_REG_DBG3_OFFSET 0xb8
++#define MLXPLAT_CPLD_LPC_REG_DBG4_OFFSET 0xb9
+ #define MLXPLAT_CPLD_LPC_REG_GP4_RO_OFFSET 0xc2
+ #define MLXPLAT_CPLD_LPC_REG_SPI_CHNL_SELECT 0xc3
+ #define MLXPLAT_CPLD_LPC_REG_WD_CLEAR_OFFSET 0xc7
+@@ -222,7 +222,7 @@
+ MLXPLAT_CPLD_AGGR_MASK_LC_SDWN)
+ #define MLXPLAT_CPLD_LOW_AGGR_MASK_LOW 0xc1
+ #define MLXPLAT_CPLD_LOW_AGGR_MASK_ASIC2 BIT(2)
+-#define MLXPLAT_CPLD_LOW_AGGR_MASK_PWR_BUT BIT(4)
++#define MLXPLAT_CPLD_LOW_AGGR_MASK_PWR_BUT GENMASK(5, 4)
+ #define MLXPLAT_CPLD_LOW_AGGR_MASK_I2C BIT(6)
+ #define MLXPLAT_CPLD_PSU_MASK GENMASK(1, 0)
+ #define MLXPLAT_CPLD_PWR_MASK GENMASK(1, 0)
+@@ -237,7 +237,7 @@
+ #define MLXPLAT_CPLD_GWP_MASK GENMASK(0, 0)
+ #define MLXPLAT_CPLD_EROT_MASK GENMASK(1, 0)
+ #define MLXPLAT_CPLD_PWR_BUTTON_MASK BIT(0)
+-#define MLXPLAT_CPLD_LATCH_RST_MASK BIT(5)
++#define MLXPLAT_CPLD_LATCH_RST_MASK BIT(6)
+ #define MLXPLAT_CPLD_THERMAL1_PDB_MASK BIT(3)
+ #define MLXPLAT_CPLD_THERMAL2_PDB_MASK BIT(4)
+ #define MLXPLAT_CPLD_INTRUSION_MASK BIT(6)
+@@ -2356,7 +2356,7 @@ mlxplat_mlxcpld_l1_switch_pwr_events_handler(void *handle, enum mlxreg_hotplug_k
+ u8 action)
+ {
+ dev_info(&mlxplat_dev->dev, "System shutdown due to short press of power button");
+- kernel_halt();
++ kernel_power_off();
+ return 0;
+ }
+
+@@ -2475,7 +2475,7 @@ static struct mlxreg_core_item mlxplat_mlxcpld_l1_switch_events_items[] = {
+ .reg = MLXPLAT_CPLD_LPC_REG_PWRB_OFFSET,
+ .mask = MLXPLAT_CPLD_PWR_BUTTON_MASK,
+ .count = ARRAY_SIZE(mlxplat_mlxcpld_l1_switch_pwr_events_items_data),
+- .inversed = 0,
++ .inversed = 1,
+ .health = false,
+ },
+ {
+@@ -2484,7 +2484,7 @@ static struct mlxreg_core_item mlxplat_mlxcpld_l1_switch_events_items[] = {
+ .reg = MLXPLAT_CPLD_LPC_REG_BRD_OFFSET,
+ .mask = MLXPLAT_CPLD_L1_CHA_HEALTH_MASK,
+ .count = ARRAY_SIZE(mlxplat_mlxcpld_l1_switch_health_events_items_data),
+- .inversed = 0,
++ .inversed = 1,
+ .health = false,
+ .ind = 8,
+ },
+@@ -3677,7 +3677,7 @@ static struct mlxreg_core_data mlxplat_mlxcpld_default_ng_regs_io_data[] = {
+ {
+ .label = "latch_reset",
+ .reg = MLXPLAT_CPLD_LPC_REG_GP1_OFFSET,
+- .mask = GENMASK(7, 0) & ~BIT(5),
++ .mask = GENMASK(7, 0) & ~BIT(6),
+ .mode = 0200,
+ },
+ {
+@@ -6238,8 +6238,6 @@ static void mlxplat_i2c_mux_topolgy_exit(struct mlxplat_priv *priv)
+ if (priv->pdev_mux[i])
+ platform_device_unregister(priv->pdev_mux[i]);
+ }
+-
+- mlxplat_post_exit();
+ }
+
+ static int mlxplat_i2c_main_complition_notify(void *handle, int id)
+@@ -6369,6 +6367,7 @@ static void __exit mlxplat_exit(void)
+ pm_power_off = NULL;
+ mlxplat_pre_exit(priv);
+ mlxplat_i2c_main_exit(priv);
++ mlxplat_post_exit();
+ }
+ module_exit(mlxplat_exit);
+
+diff --git a/drivers/platform/x86/msi-ec.c b/drivers/platform/x86/msi-ec.c
+index ff93986e3d35a..f26a3121092f9 100644
+--- a/drivers/platform/x86/msi-ec.c
++++ b/drivers/platform/x86/msi-ec.c
+@@ -27,15 +27,15 @@
+ #include <linux/seq_file.h>
+ #include <linux/string.h>
+
+-static const char *const SM_ECO_NAME = "eco";
+-static const char *const SM_COMFORT_NAME = "comfort";
+-static const char *const SM_SPORT_NAME = "sport";
+-static const char *const SM_TURBO_NAME = "turbo";
+-
+-static const char *const FM_AUTO_NAME = "auto";
+-static const char *const FM_SILENT_NAME = "silent";
+-static const char *const FM_BASIC_NAME = "basic";
+-static const char *const FM_ADVANCED_NAME = "advanced";
++#define SM_ECO_NAME "eco"
++#define SM_COMFORT_NAME "comfort"
++#define SM_SPORT_NAME "sport"
++#define SM_TURBO_NAME "turbo"
++
++#define FM_AUTO_NAME "auto"
++#define FM_SILENT_NAME "silent"
++#define FM_BASIC_NAME "basic"
++#define FM_ADVANCED_NAME "advanced"
+
+ static const char * const ALLOWED_FW_0[] __initconst = {
+ "14C1EMS1.012",
+diff --git a/drivers/platform/x86/serial-multi-instantiate.c b/drivers/platform/x86/serial-multi-instantiate.c
+index f3dcbdd72fec7..8158e3cf5d6de 100644
+--- a/drivers/platform/x86/serial-multi-instantiate.c
++++ b/drivers/platform/x86/serial-multi-instantiate.c
+@@ -21,6 +21,7 @@
+ #define IRQ_RESOURCE_NONE 0
+ #define IRQ_RESOURCE_GPIO 1
+ #define IRQ_RESOURCE_APIC 2
++#define IRQ_RESOURCE_AUTO 3
+
+ enum smi_bus_type {
+ SMI_I2C,
+@@ -52,6 +53,18 @@ static int smi_get_irq(struct platform_device *pdev, struct acpi_device *adev,
+ int ret;
+
+ switch (inst->flags & IRQ_RESOURCE_TYPE) {
++ case IRQ_RESOURCE_AUTO:
++ ret = acpi_dev_gpio_irq_get(adev, inst->irq_idx);
++ if (ret > 0) {
++ dev_dbg(&pdev->dev, "Using gpio irq\n");
++ break;
++ }
++ ret = platform_get_irq(pdev, inst->irq_idx);
++ if (ret > 0) {
++ dev_dbg(&pdev->dev, "Using platform irq\n");
++ break;
++ }
++ break;
+ case IRQ_RESOURCE_GPIO:
+ ret = acpi_dev_gpio_irq_get(adev, inst->irq_idx);
+ break;
+@@ -307,10 +320,23 @@ static const struct smi_node int3515_data = {
+
+ static const struct smi_node cs35l41_hda = {
+ .instances = {
+- { "cs35l41-hda", IRQ_RESOURCE_GPIO, 0 },
+- { "cs35l41-hda", IRQ_RESOURCE_GPIO, 0 },
+- { "cs35l41-hda", IRQ_RESOURCE_GPIO, 0 },
+- { "cs35l41-hda", IRQ_RESOURCE_GPIO, 0 },
++ { "cs35l41-hda", IRQ_RESOURCE_AUTO, 0 },
++ { "cs35l41-hda", IRQ_RESOURCE_AUTO, 0 },
++ { "cs35l41-hda", IRQ_RESOURCE_AUTO, 0 },
++ { "cs35l41-hda", IRQ_RESOURCE_AUTO, 0 },
++ {}
++ },
++ .bus_type = SMI_AUTO_DETECT,
++};
++
++static const struct smi_node cs35l56_hda = {
++ .instances = {
++ { "cs35l56-hda", IRQ_RESOURCE_AUTO, 0 },
++ { "cs35l56-hda", IRQ_RESOURCE_AUTO, 0 },
++ { "cs35l56-hda", IRQ_RESOURCE_AUTO, 0 },
++ { "cs35l56-hda", IRQ_RESOURCE_AUTO, 0 },
++ /* a 5th entry is an alias address, not a real device */
++ { "cs35l56-hda_dummy_dev" },
+ {}
+ },
+ .bus_type = SMI_AUTO_DETECT,
+@@ -324,6 +350,7 @@ static const struct acpi_device_id smi_acpi_ids[] = {
+ { "BSG1160", (unsigned long)&bsg1160_data },
+ { "BSG2150", (unsigned long)&bsg2150_data },
+ { "CSC3551", (unsigned long)&cs35l41_hda },
++ { "CSC3556", (unsigned long)&cs35l56_hda },
+ { "INT3515", (unsigned long)&int3515_data },
+ /* Non-conforming _HID for Cirrus Logic already released */
+ { "CLSA0100", (unsigned long)&cs35l41_hda },
+diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
+index e1e4f9d108879..857be0f3ae5b9 100644
+--- a/drivers/scsi/53c700.c
++++ b/drivers/scsi/53c700.c
+@@ -1598,7 +1598,7 @@ NCR_700_intr(int irq, void *dev_id)
+ printk("scsi%d (%d:%d) PHASE MISMATCH IN SEND MESSAGE %d remain, return %p[%04x], phase %s\n", host->host_no, pun, lun, count, (void *)temp, temp - hostdata->pScript, sbcl_to_string(NCR_700_readb(host, SBCL_REG)));
+ #endif
+ resume_offset = hostdata->pScript + Ent_SendMessagePhaseMismatch;
+- } else if(dsp >= to32bit(&slot->pSG[0].ins) &&
++ } else if (slot && dsp >= to32bit(&slot->pSG[0].ins) &&
+ dsp <= to32bit(&slot->pSG[NCR_700_SG_SEGMENTS].ins)) {
+ int data_transfer = NCR_700_readl(host, DBC_REG) & 0xffffff;
+ int SGcount = (dsp - to32bit(&slot->pSG[0].ins))/sizeof(struct NCR_700_SG_List);
+diff --git a/drivers/scsi/fnic/fnic.h b/drivers/scsi/fnic/fnic.h
+index d82de34f6fd73..e51e92f932fa8 100644
+--- a/drivers/scsi/fnic/fnic.h
++++ b/drivers/scsi/fnic/fnic.h
+@@ -27,7 +27,7 @@
+
+ #define DRV_NAME "fnic"
+ #define DRV_DESCRIPTION "Cisco FCoE HBA Driver"
+-#define DRV_VERSION "1.6.0.54"
++#define DRV_VERSION "1.6.0.55"
+ #define PFX DRV_NAME ": "
+ #define DFX DRV_NAME "%d: "
+
+diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
+index 26dbd347156ef..be89ce96df46c 100644
+--- a/drivers/scsi/fnic/fnic_scsi.c
++++ b/drivers/scsi/fnic/fnic_scsi.c
+@@ -2139,7 +2139,7 @@ static int fnic_clean_pending_aborts(struct fnic *fnic,
+ bool new_sc)
+
+ {
+- int ret = SUCCESS;
++ int ret = 0;
+ struct fnic_pending_aborts_iter_data iter_data = {
+ .fnic = fnic,
+ .lun_dev = lr_sc->device,
+@@ -2159,9 +2159,11 @@ static int fnic_clean_pending_aborts(struct fnic *fnic,
+
+ /* walk again to check, if IOs are still pending in fw */
+ if (fnic_is_abts_pending(fnic, lr_sc))
+- ret = FAILED;
++ ret = 1;
+
+ clean_pending_aborts_end:
++ FNIC_SCSI_DBG(KERN_INFO, fnic->lport->host,
++ "%s: exit status: %d\n", __func__, ret);
+ return ret;
+ }
+
+diff --git a/drivers/scsi/qedf/qedf_main.c b/drivers/scsi/qedf/qedf_main.c
+index 2a31ddc99dde5..7825765c936cd 100644
+--- a/drivers/scsi/qedf/qedf_main.c
++++ b/drivers/scsi/qedf/qedf_main.c
+@@ -31,6 +31,7 @@ static void qedf_remove(struct pci_dev *pdev);
+ static void qedf_shutdown(struct pci_dev *pdev);
+ static void qedf_schedule_recovery_handler(void *dev);
+ static void qedf_recovery_handler(struct work_struct *work);
++static int qedf_suspend(struct pci_dev *pdev, pm_message_t state);
+
+ /*
+ * Driver module parameters.
+@@ -3271,6 +3272,7 @@ static struct pci_driver qedf_pci_driver = {
+ .probe = qedf_probe,
+ .remove = qedf_remove,
+ .shutdown = qedf_shutdown,
++ .suspend = qedf_suspend,
+ };
+
+ static int __qedf_probe(struct pci_dev *pdev, int mode)
+@@ -4000,6 +4002,22 @@ static void qedf_shutdown(struct pci_dev *pdev)
+ __qedf_remove(pdev, QEDF_MODE_NORMAL);
+ }
+
++static int qedf_suspend(struct pci_dev *pdev, pm_message_t state)
++{
++ struct qedf_ctx *qedf;
++
++ if (!pdev) {
++ QEDF_ERR(NULL, "pdev is NULL.\n");
++ return -ENODEV;
++ }
++
++ qedf = pci_get_drvdata(pdev);
++
++ QEDF_ERR(&qedf->dbg_ctx, "%s: Device does not support suspend operation\n", __func__);
++
++ return -EPERM;
++}
++
+ /*
+ * Recovery handler code
+ */
+diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
+index 45d3595541820..ef62dbbc1868e 100644
+--- a/drivers/scsi/qedi/qedi_main.c
++++ b/drivers/scsi/qedi/qedi_main.c
+@@ -69,6 +69,7 @@ static struct nvm_iscsi_block *qedi_get_nvram_block(struct qedi_ctx *qedi);
+ static void qedi_recovery_handler(struct work_struct *work);
+ static void qedi_schedule_hw_err_handler(void *dev,
+ enum qed_hw_err_type err_type);
++static int qedi_suspend(struct pci_dev *pdev, pm_message_t state);
+
+ static int qedi_iscsi_event_cb(void *context, u8 fw_event_code, void *fw_handle)
+ {
+@@ -2510,6 +2511,22 @@ static void qedi_shutdown(struct pci_dev *pdev)
+ __qedi_remove(pdev, QEDI_MODE_SHUTDOWN);
+ }
+
++static int qedi_suspend(struct pci_dev *pdev, pm_message_t state)
++{
++ struct qedi_ctx *qedi;
++
++ if (!pdev) {
++ QEDI_ERR(NULL, "pdev is NULL.\n");
++ return -ENODEV;
++ }
++
++ qedi = pci_get_drvdata(pdev);
++
++ QEDI_ERR(&qedi->dbg_ctx, "%s: Device does not support suspend operation\n", __func__);
++
++ return -EPERM;
++}
++
+ static int __qedi_probe(struct pci_dev *pdev, int mode)
+ {
+ struct qedi_ctx *qedi;
+@@ -2868,6 +2885,7 @@ static struct pci_driver qedi_pci_driver = {
+ .remove = qedi_remove,
+ .shutdown = qedi_shutdown,
+ .err_handler = &qedi_err_handler,
++ .suspend = qedi_suspend,
+ };
+
+ static int __init qedi_init(void)
+diff --git a/drivers/scsi/raid_class.c b/drivers/scsi/raid_class.c
+index 898a0bdf8df67..711252e52d8e1 100644
+--- a/drivers/scsi/raid_class.c
++++ b/drivers/scsi/raid_class.c
+@@ -248,6 +248,7 @@ int raid_component_add(struct raid_template *r,struct device *raid_dev,
+ return 0;
+
+ err_out:
++ put_device(&rc->dev);
+ list_del(&rc->node);
+ rd->component_count--;
+ put_device(component_dev);
+diff --git a/drivers/scsi/scsi_proc.c b/drivers/scsi/scsi_proc.c
+index 4a6eb1741be0d..41f23cd0bfb45 100644
+--- a/drivers/scsi/scsi_proc.c
++++ b/drivers/scsi/scsi_proc.c
+@@ -406,7 +406,7 @@ static ssize_t proc_scsi_write(struct file *file, const char __user *buf,
+ size_t length, loff_t *ppos)
+ {
+ int host, channel, id, lun;
+- char *buffer, *p;
++ char *buffer, *end, *p;
+ int err;
+
+ if (!buf || length > PAGE_SIZE)
+@@ -421,10 +421,14 @@ static ssize_t proc_scsi_write(struct file *file, const char __user *buf,
+ goto out;
+
+ err = -EINVAL;
+- if (length < PAGE_SIZE)
+- buffer[length] = '\0';
+- else if (buffer[PAGE_SIZE-1])
+- goto out;
++ if (length < PAGE_SIZE) {
++ end = buffer + length;
++ *end = '\0';
++ } else {
++ end = buffer + PAGE_SIZE - 1;
++ if (*end)
++ goto out;
++ }
+
+ /*
+ * Usage: echo "scsi add-single-device 0 1 2 3" >/proc/scsi/scsi
+@@ -433,10 +437,10 @@ static ssize_t proc_scsi_write(struct file *file, const char __user *buf,
+ if (!strncmp("scsi add-single-device", buffer, 22)) {
+ p = buffer + 23;
+
+- host = simple_strtoul(p, &p, 0);
+- channel = simple_strtoul(p + 1, &p, 0);
+- id = simple_strtoul(p + 1, &p, 0);
+- lun = simple_strtoul(p + 1, &p, 0);
++ host = (p < end) ? simple_strtoul(p, &p, 0) : 0;
++ channel = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0;
++ id = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0;
++ lun = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0;
+
+ err = scsi_add_single_device(host, channel, id, lun);
+
+@@ -447,10 +451,10 @@ static ssize_t proc_scsi_write(struct file *file, const char __user *buf,
+ } else if (!strncmp("scsi remove-single-device", buffer, 25)) {
+ p = buffer + 26;
+
+- host = simple_strtoul(p, &p, 0);
+- channel = simple_strtoul(p + 1, &p, 0);
+- id = simple_strtoul(p + 1, &p, 0);
+- lun = simple_strtoul(p + 1, &p, 0);
++ host = (p < end) ? simple_strtoul(p, &p, 0) : 0;
++ channel = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0;
++ id = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0;
++ lun = (p + 1 < end) ? simple_strtoul(p + 1, &p, 0) : 0;
+
+ err = scsi_remove_single_device(host, channel, id, lun);
+ }
+diff --git a/drivers/scsi/snic/snic_disc.c b/drivers/scsi/snic/snic_disc.c
+index 8fbf3c1b1311d..cd27562ec922e 100644
+--- a/drivers/scsi/snic/snic_disc.c
++++ b/drivers/scsi/snic/snic_disc.c
+@@ -303,6 +303,7 @@ snic_tgt_create(struct snic *snic, struct snic_tgt_id *tgtid)
+ "Snic Tgt: device_add, with err = %d\n",
+ ret);
+
++ put_device(&tgt->dev);
+ put_device(&snic->shost->shost_gendev);
+ spin_lock_irqsave(snic->shost->host_lock, flags);
+ list_del(&tgt->list);
+diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
+index 4d72d82f73586..182cf916f0835 100644
+--- a/drivers/scsi/storvsc_drv.c
++++ b/drivers/scsi/storvsc_drv.c
+@@ -1672,10 +1672,6 @@ static int storvsc_host_reset_handler(struct scsi_cmnd *scmnd)
+ */
+ static enum scsi_timeout_action storvsc_eh_timed_out(struct scsi_cmnd *scmnd)
+ {
+-#if IS_ENABLED(CONFIG_SCSI_FC_ATTRS)
+- if (scmnd->device->host->transportt == fc_transport_template)
+- return fc_eh_timed_out(scmnd);
+-#endif
+ return SCSI_EH_RESET_TIMER;
+ }
+
+diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
+index c1af712ca7288..7c67476efa78d 100644
+--- a/drivers/thunderbolt/tb.c
++++ b/drivers/thunderbolt/tb.c
+@@ -1810,6 +1810,8 @@ unlock:
+
+ pm_runtime_mark_last_busy(&tb->dev);
+ pm_runtime_put_autosuspend(&tb->dev);
++
++ kfree(ev);
+ }
+
+ static void tb_queue_dp_bandwidth_request(struct tb *tb, u64 route, u8 port)
+diff --git a/drivers/ufs/host/ufs-renesas.c b/drivers/ufs/host/ufs-renesas.c
+index f8a5e79ed3b4e..ab0652d8705ac 100644
+--- a/drivers/ufs/host/ufs-renesas.c
++++ b/drivers/ufs/host/ufs-renesas.c
+@@ -359,7 +359,7 @@ static int ufs_renesas_init(struct ufs_hba *hba)
+ {
+ struct ufs_renesas_priv *priv;
+
+- priv = devm_kmalloc(hba->dev, sizeof(*priv), GFP_KERNEL);
++ priv = devm_kzalloc(hba->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+ ufshcd_set_variant(hba, priv);
+diff --git a/drivers/usb/common/usb-conn-gpio.c b/drivers/usb/common/usb-conn-gpio.c
+index e20874caba363..3f5180d64931b 100644
+--- a/drivers/usb/common/usb-conn-gpio.c
++++ b/drivers/usb/common/usb-conn-gpio.c
+@@ -42,6 +42,7 @@ struct usb_conn_info {
+
+ struct power_supply_desc desc;
+ struct power_supply *charger;
++ bool initial_detection;
+ };
+
+ /*
+@@ -86,11 +87,13 @@ static void usb_conn_detect_cable(struct work_struct *work)
+ dev_dbg(info->dev, "role %s -> %s, gpios: id %d, vbus %d\n",
+ usb_role_string(info->last_role), usb_role_string(role), id, vbus);
+
+- if (info->last_role == role) {
++ if (!info->initial_detection && info->last_role == role) {
+ dev_warn(info->dev, "repeated role: %s\n", usb_role_string(role));
+ return;
+ }
+
++ info->initial_detection = false;
++
+ if (info->last_role == USB_ROLE_HOST && info->vbus)
+ regulator_disable(info->vbus);
+
+@@ -258,6 +261,7 @@ static int usb_conn_probe(struct platform_device *pdev)
+ device_set_wakeup_capable(&pdev->dev, true);
+
+ /* Perform initial detection */
++ info->initial_detection = true;
+ usb_conn_queue_dwork(info, 0);
+
+ return 0;
+diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
+index 550dc8f4d16ad..f7957a93abd26 100644
+--- a/drivers/usb/dwc3/gadget.c
++++ b/drivers/usb/dwc3/gadget.c
+@@ -4448,9 +4448,14 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
+ u32 count;
+
+ if (pm_runtime_suspended(dwc->dev)) {
++ dwc->pending_events = true;
++ /*
++ * Trigger runtime resume. The get() function will be balanced
++ * after processing the pending events in dwc3_process_pending
++ * events().
++ */
+ pm_runtime_get(dwc->dev);
+ disable_irq_nosync(dwc->irq_gadget);
+- dwc->pending_events = true;
+ return IRQ_HANDLED;
+ }
+
+@@ -4711,6 +4716,8 @@ void dwc3_gadget_process_pending_events(struct dwc3 *dwc)
+ {
+ if (dwc->pending_events) {
+ dwc3_interrupt(dwc->irq_gadget, dwc->ev_buf);
++ dwc3_thread_interrupt(dwc->irq_gadget, dwc->ev_buf);
++ pm_runtime_put(dwc->dev);
+ dwc->pending_events = false;
+ enable_irq(dwc->irq_gadget);
+ }
+diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
+index 0068d0c448658..d5bc2892184ca 100644
+--- a/drivers/usb/gadget/udc/core.c
++++ b/drivers/usb/gadget/udc/core.c
+@@ -822,6 +822,9 @@ EXPORT_SYMBOL_GPL(usb_gadget_disconnect);
+ * usb_gadget_activate() is called. For example, user mode components may
+ * need to be activated before the system can talk to hosts.
+ *
++ * This routine may sleep; it must not be called in interrupt context
++ * (such as from within a gadget driver's disconnect() callback).
++ *
+ * Returns zero on success, else negative errno.
+ */
+ int usb_gadget_deactivate(struct usb_gadget *gadget)
+@@ -860,6 +863,8 @@ EXPORT_SYMBOL_GPL(usb_gadget_deactivate);
+ * This routine activates gadget which was previously deactivated with
+ * usb_gadget_deactivate() call. It calls usb_gadget_connect() if needed.
+ *
++ * This routine may sleep; it must not be called in interrupt context.
++ *
+ * Returns zero on success, else negative errno.
+ */
+ int usb_gadget_activate(struct usb_gadget *gadget)
+@@ -1638,7 +1643,11 @@ static void gadget_unbind_driver(struct device *dev)
+ usb_gadget_disable_async_callbacks(udc);
+ if (gadget->irq)
+ synchronize_irq(gadget->irq);
++ mutex_unlock(&udc->connect_lock);
++
+ udc->driver->unbind(gadget);
++
++ mutex_lock(&udc->connect_lock);
+ usb_gadget_udc_stop_locked(udc);
+ mutex_unlock(&udc->connect_lock);
+
+diff --git a/drivers/usb/storage/alauda.c b/drivers/usb/storage/alauda.c
+index 5e912dd29b4c9..115f05a6201a1 100644
+--- a/drivers/usb/storage/alauda.c
++++ b/drivers/usb/storage/alauda.c
+@@ -318,7 +318,8 @@ static int alauda_get_media_status(struct us_data *us, unsigned char *data)
+ rc = usb_stor_ctrl_transfer(us, us->recv_ctrl_pipe,
+ command, 0xc0, 0, 1, data, 2);
+
+- usb_stor_dbg(us, "Media status %02X %02X\n", data[0], data[1]);
++ if (rc == USB_STOR_XFER_GOOD)
++ usb_stor_dbg(us, "Media status %02X %02X\n", data[0], data[1]);
+
+ return rc;
+ }
+@@ -454,9 +455,14 @@ static int alauda_init_media(struct us_data *us)
+ static int alauda_check_media(struct us_data *us)
+ {
+ struct alauda_info *info = (struct alauda_info *) us->extra;
+- unsigned char status[2];
++ unsigned char *status = us->iobuf;
++ int rc;
+
+- alauda_get_media_status(us, status);
++ rc = alauda_get_media_status(us, status);
++ if (rc != USB_STOR_XFER_GOOD) {
++ status[0] = 0xF0; /* Pretend there's no media */
++ status[1] = 0;
++ }
+
+ /* Check for no media or door open */
+ if ((status[0] & 0x80) || ((status[0] & 0x1F) == 0x10)
+diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c
+index 66de880b28d01..cdf8261e22dbd 100644
+--- a/drivers/usb/typec/altmodes/displayport.c
++++ b/drivers/usb/typec/altmodes/displayport.c
+@@ -60,6 +60,7 @@ struct dp_altmode {
+
+ enum dp_state state;
+ bool hpd;
++ bool pending_hpd;
+
+ struct mutex lock; /* device lock */
+ struct work_struct work;
+@@ -144,8 +145,13 @@ static int dp_altmode_status_update(struct dp_altmode *dp)
+ dp->state = DP_STATE_EXIT;
+ } else if (!(con & DP_CONF_CURRENTLY(dp->data.conf))) {
+ ret = dp_altmode_configure(dp, con);
+- if (!ret)
++ if (!ret) {
+ dp->state = DP_STATE_CONFIGURE;
++ if (dp->hpd != hpd) {
++ dp->hpd = hpd;
++ dp->pending_hpd = true;
++ }
++ }
+ } else {
+ if (dp->hpd != hpd) {
+ drm_connector_oob_hotplug_event(dp->connector_fwnode);
+@@ -161,6 +167,16 @@ static int dp_altmode_configured(struct dp_altmode *dp)
+ {
+ sysfs_notify(&dp->alt->dev.kobj, "displayport", "configuration");
+ sysfs_notify(&dp->alt->dev.kobj, "displayport", "pin_assignment");
++ /*
++ * If the DFP_D/UFP_D sends a change in HPD when first notifying the
++ * DisplayPort driver that it is connected, then we wait until
++ * configuration is complete to signal HPD.
++ */
++ if (dp->pending_hpd) {
++ drm_connector_oob_hotplug_event(dp->connector_fwnode);
++ sysfs_notify(&dp->alt->dev.kobj, "displayport", "hpd");
++ dp->pending_hpd = false;
++ }
+
+ return dp_altmode_notify(dp);
+ }
+diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
+index 3c6b0c8e2d3ae..dc113cbb3bed8 100644
+--- a/drivers/usb/typec/tcpm/tcpm.c
++++ b/drivers/usb/typec/tcpm/tcpm.c
+@@ -5348,6 +5348,10 @@ static void _tcpm_pd_vbus_off(struct tcpm_port *port)
+ /* Do nothing, vbus drop expected */
+ break;
+
++ case SNK_HARD_RESET_WAIT_VBUS:
++ /* Do nothing, its OK to receive vbus off events */
++ break;
++
+ default:
+ if (port->pwr_role == TYPEC_SINK && port->attached)
+ tcpm_set_state(port, SNK_UNATTACHED, tcpm_wait_for_discharge(port));
+@@ -5394,6 +5398,9 @@ static void _tcpm_pd_vbus_vsafe0v(struct tcpm_port *port)
+ case SNK_DEBOUNCED:
+ /*Do nothing, still waiting for VSAFE5V for connect */
+ break;
++ case SNK_HARD_RESET_WAIT_VBUS:
++ /* Do nothing, its OK to receive vbus off events */
++ break;
+ default:
+ if (port->pwr_role == TYPEC_SINK && port->auto_vbus_discharge_enabled)
+ tcpm_set_state(port, SNK_UNATTACHED, 0);
+diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
+index ad14dd745e4ae..2a60033d907bf 100644
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -441,13 +441,23 @@ void btrfs_wait_block_group_cache_progress(struct btrfs_block_group *cache,
+ u64 num_bytes)
+ {
+ struct btrfs_caching_control *caching_ctl;
++ int progress;
+
+ caching_ctl = btrfs_get_caching_control(cache);
+ if (!caching_ctl)
+ return;
+
++ /*
++ * We've already failed to allocate from this block group, so even if
++ * there's enough space in the block group it isn't contiguous enough to
++ * allow for an allocation, so wait for at least the next wakeup tick,
++ * or for the thing to be done.
++ */
++ progress = atomic_read(&caching_ctl->progress);
++
+ wait_event(caching_ctl->wait, btrfs_block_group_done(cache) ||
+- (cache->free_space_ctl->free_space >= num_bytes));
++ (progress != atomic_read(&caching_ctl->progress) &&
++ (cache->free_space_ctl->free_space >= num_bytes)));
+
+ btrfs_put_caching_control(caching_ctl);
+ }
+@@ -802,8 +812,10 @@ next:
+
+ if (total_found > CACHING_CTL_WAKE_UP) {
+ total_found = 0;
+- if (wakeup)
++ if (wakeup) {
++ atomic_inc(&caching_ctl->progress);
+ wake_up(&caching_ctl->wait);
++ }
+ }
+ }
+ path->slots[0]++;
+@@ -910,6 +922,7 @@ int btrfs_cache_block_group(struct btrfs_block_group *cache, bool wait)
+ init_waitqueue_head(&caching_ctl->wait);
+ caching_ctl->block_group = cache;
+ refcount_set(&caching_ctl->count, 2);
++ atomic_set(&caching_ctl->progress, 0);
+ btrfs_init_work(&caching_ctl->work, caching_thread, NULL, NULL);
+
+ spin_lock(&cache->lock);
+diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
+index 3195d0b0dbed8..471f591db7c0c 100644
+--- a/fs/btrfs/block-group.h
++++ b/fs/btrfs/block-group.h
+@@ -85,6 +85,8 @@ struct btrfs_caching_control {
+ wait_queue_head_t wait;
+ struct btrfs_work work;
+ struct btrfs_block_group *block_group;
++ /* Track progress of caching during allocation. */
++ atomic_t progress;
+ refcount_t count;
+ };
+
+diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
+index 9f056ad41df04..f890c4c71cdaf 100644
+--- a/fs/btrfs/disk-io.c
++++ b/fs/btrfs/disk-io.c
+@@ -1351,7 +1351,8 @@ static int btrfs_init_fs_root(struct btrfs_root *root, dev_t anon_dev)
+ btrfs_drew_lock_init(&root->snapshot_lock);
+
+ if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
+- !btrfs_is_data_reloc_root(root)) {
++ !btrfs_is_data_reloc_root(root) &&
++ is_fstree(root->root_key.objectid)) {
+ set_bit(BTRFS_ROOT_SHAREABLE, &root->state);
+ btrfs_check_and_init_root_item(&root->root_item);
+ }
+diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
+index 5cd289de4e92e..10bffcb379148 100644
+--- a/fs/btrfs/extent-tree.c
++++ b/fs/btrfs/extent-tree.c
+@@ -4318,8 +4318,11 @@ have_block_group:
+ ret = 0;
+ }
+
+- if (unlikely(block_group->cached == BTRFS_CACHE_ERROR))
++ if (unlikely(block_group->cached == BTRFS_CACHE_ERROR)) {
++ if (!cache_block_group_error)
++ cache_block_group_error = -EIO;
+ goto loop;
++ }
+
+ if (!find_free_extent_check_size_class(ffe_ctl, block_group))
+ goto loop;
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index 82b9779deaa88..54eed5a8a412b 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -2181,11 +2181,12 @@ retry:
+ }
+
+ /*
+- * the filesystem may choose to bump up nr_to_write.
++ * The filesystem may choose to bump up nr_to_write.
+ * We have to make sure to honor the new nr_to_write
+- * at any time
++ * at any time.
+ */
+- nr_to_write_done = wbc->nr_to_write <= 0;
++ nr_to_write_done = (wbc->sync_mode == WB_SYNC_NONE &&
++ wbc->nr_to_write <= 0);
+ }
+ folio_batch_release(&fbatch);
+ cond_resched();
+@@ -2344,6 +2345,12 @@ retry:
+ continue;
+ }
+
++ if (!folio_test_dirty(folio)) {
++ /* Someone wrote it for us. */
++ folio_unlock(folio);
++ continue;
++ }
++
+ if (wbc->sync_mode != WB_SYNC_NONE) {
+ if (folio_test_writeback(folio))
+ submit_write_bio(bio_ctrl, 0);
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index c89071186388b..ace949bc75059 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -1453,8 +1453,6 @@ out_unlock:
+ clear_bits,
+ page_ops);
+ start += cur_alloc_size;
+- if (start >= end)
+- return ret;
+ }
+
+ /*
+@@ -1463,9 +1461,11 @@ out_unlock:
+ * space_info's bytes_may_use counter, reserved in
+ * btrfs_check_data_free_space().
+ */
+- extent_clear_unlock_delalloc(inode, start, end, locked_page,
+- clear_bits | EXTENT_CLEAR_DATA_RESV,
+- page_ops);
++ if (start < end) {
++ clear_bits |= EXTENT_CLEAR_DATA_RESV;
++ extent_clear_unlock_delalloc(inode, start, end, locked_page,
++ clear_bits, page_ops);
++ }
+ return ret;
+ }
+
+diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
+index 59a06499c647e..5b077e57cdd5e 100644
+--- a/fs/btrfs/relocation.c
++++ b/fs/btrfs/relocation.c
+@@ -1916,7 +1916,39 @@ again:
+ err = PTR_ERR(root);
+ break;
+ }
+- ASSERT(root->reloc_root == reloc_root);
++
++ if (unlikely(root->reloc_root != reloc_root)) {
++ if (root->reloc_root) {
++ btrfs_err(fs_info,
++"reloc tree mismatch, root %lld has reloc root key (%lld %u %llu) gen %llu, expect reloc root key (%lld %u %llu) gen %llu",
++ root->root_key.objectid,
++ root->reloc_root->root_key.objectid,
++ root->reloc_root->root_key.type,
++ root->reloc_root->root_key.offset,
++ btrfs_root_generation(
++ &root->reloc_root->root_item),
++ reloc_root->root_key.objectid,
++ reloc_root->root_key.type,
++ reloc_root->root_key.offset,
++ btrfs_root_generation(
++ &reloc_root->root_item));
++ } else {
++ btrfs_err(fs_info,
++"reloc tree mismatch, root %lld has no reloc root, expect reloc root key (%lld %u %llu) gen %llu",
++ root->root_key.objectid,
++ reloc_root->root_key.objectid,
++ reloc_root->root_key.type,
++ reloc_root->root_key.offset,
++ btrfs_root_generation(
++ &reloc_root->root_item));
++ }
++ list_add(&reloc_root->root_list, &reloc_roots);
++ btrfs_put_root(root);
++ btrfs_abort_transaction(trans, -EUCLEAN);
++ if (!err)
++ err = -EUCLEAN;
++ break;
++ }
+
+ /*
+ * set reference count to 1, so btrfs_recover_relocation
+@@ -1989,7 +2021,7 @@ again:
+ root = btrfs_get_fs_root(fs_info, reloc_root->root_key.offset,
+ false);
+ if (btrfs_root_refs(&reloc_root->root_item) > 0) {
+- if (IS_ERR(root)) {
++ if (WARN_ON(IS_ERR(root))) {
+ /*
+ * For recovery we read the fs roots on mount,
+ * and if we didn't find the root then we marked
+@@ -1998,17 +2030,14 @@ again:
+ * memory. However there's no reason we can't
+ * handle the error properly here just in case.
+ */
+- ASSERT(0);
+ ret = PTR_ERR(root);
+ goto out;
+ }
+- if (root->reloc_root != reloc_root) {
++ if (WARN_ON(root->reloc_root != reloc_root)) {
+ /*
+- * This is actually impossible without something
+- * going really wrong (like weird race condition
+- * or cosmic rays).
++ * This can happen if on-disk metadata has some
++ * corruption, e.g. bad reloc tree key offset.
+ */
+- ASSERT(0);
+ ret = -EINVAL;
+ goto out;
+ }
+diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
+index 2138e9fc05647..9bbcb93755300 100644
+--- a/fs/btrfs/tree-checker.c
++++ b/fs/btrfs/tree-checker.c
+@@ -446,6 +446,20 @@ static int check_root_key(struct extent_buffer *leaf, struct btrfs_key *key,
+ btrfs_item_key_to_cpu(leaf, &item_key, slot);
+ is_root_item = (item_key.type == BTRFS_ROOT_ITEM_KEY);
+
++ /*
++ * Bad rootid for reloc trees.
++ *
++ * Reloc trees are only for subvolume trees, other trees only need
++ * to be COWed to be relocated.
++ */
++ if (unlikely(is_root_item && key->objectid == BTRFS_TREE_RELOC_OBJECTID &&
++ !is_fstree(key->offset))) {
++ generic_err(leaf, slot,
++ "invalid reloc tree for root %lld, root id is not a subvolume tree",
++ key->offset);
++ return -EUCLEAN;
++ }
++
+ /* No such tree id */
+ if (unlikely(key->objectid == 0)) {
+ if (is_root_item)
+diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
+index a8ce522ac7479..35bc793053180 100644
+--- a/fs/nilfs2/inode.c
++++ b/fs/nilfs2/inode.c
+@@ -1101,9 +1101,17 @@ int nilfs_set_file_dirty(struct inode *inode, unsigned int nr_dirty)
+
+ int __nilfs_mark_inode_dirty(struct inode *inode, int flags)
+ {
++ struct the_nilfs *nilfs = inode->i_sb->s_fs_info;
+ struct buffer_head *ibh;
+ int err;
+
++ /*
++ * Do not dirty inodes after the log writer has been detached
++ * and its nilfs_root struct has been freed.
++ */
++ if (unlikely(nilfs_purging(nilfs)))
++ return 0;
++
+ err = nilfs_load_inode_block(inode, &ibh);
+ if (unlikely(err)) {
+ nilfs_warn(inode->i_sb,
+diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
+index c2553024bd25e..581691e4be491 100644
+--- a/fs/nilfs2/segment.c
++++ b/fs/nilfs2/segment.c
+@@ -2845,6 +2845,7 @@ void nilfs_detach_log_writer(struct super_block *sb)
+ nilfs_segctor_destroy(nilfs->ns_writer);
+ nilfs->ns_writer = NULL;
+ }
++ set_nilfs_purging(nilfs);
+
+ /* Force to free the list of dirty files */
+ spin_lock(&nilfs->ns_inode_lock);
+@@ -2857,4 +2858,5 @@ void nilfs_detach_log_writer(struct super_block *sb)
+ up_write(&nilfs->ns_segctor_sem);
+
+ nilfs_dispose_list(nilfs, &garbage_list, 1);
++ clear_nilfs_purging(nilfs);
+ }
+diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
+index 47c7dfbb7ea58..cd4ae1b8ae165 100644
+--- a/fs/nilfs2/the_nilfs.h
++++ b/fs/nilfs2/the_nilfs.h
+@@ -29,6 +29,7 @@ enum {
+ THE_NILFS_DISCONTINUED, /* 'next' pointer chain has broken */
+ THE_NILFS_GC_RUNNING, /* gc process is running */
+ THE_NILFS_SB_DIRTY, /* super block is dirty */
++ THE_NILFS_PURGING, /* disposing dirty files for cleanup */
+ };
+
+ /**
+@@ -208,6 +209,7 @@ THE_NILFS_FNS(INIT, init)
+ THE_NILFS_FNS(DISCONTINUED, discontinued)
+ THE_NILFS_FNS(GC_RUNNING, gc_running)
+ THE_NILFS_FNS(SB_DIRTY, sb_dirty)
++THE_NILFS_FNS(PURGING, purging)
+
+ /*
+ * Mount option operations
+diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
+index 25b44b303b355..2669035f7eb1f 100644
+--- a/fs/proc/kcore.c
++++ b/fs/proc/kcore.c
+@@ -309,6 +309,8 @@ static void append_kcore_note(char *notes, size_t *i, const char *name,
+
+ static ssize_t read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
+ {
++ struct file *file = iocb->ki_filp;
++ char *buf = file->private_data;
+ loff_t *fpos = &iocb->ki_pos;
+ size_t phdrs_offset, notes_offset, data_offset;
+ size_t page_offline_frozen = 1;
+@@ -555,10 +557,21 @@ static ssize_t read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
+ case KCORE_VMEMMAP:
+ case KCORE_TEXT:
+ /*
+- * We use _copy_to_iter() to bypass usermode hardening
+- * which would otherwise prevent this operation.
++ * Sadly we must use a bounce buffer here to be able to
++ * make use of copy_from_kernel_nofault(), as these
++ * memory regions might not always be mapped on all
++ * architectures.
+ */
+- if (_copy_to_iter((char *)start, tsz, iter) != tsz) {
++ if (copy_from_kernel_nofault(buf, (void *)start, tsz)) {
++ if (iov_iter_zero(tsz, iter) != tsz) {
++ ret = -EFAULT;
++ goto out;
++ }
++ /*
++ * We know the bounce buffer is safe to copy from, so
++ * use _copy_to_iter() directly.
++ */
++ } else if (_copy_to_iter(buf, tsz, iter) != tsz) {
+ ret = -EFAULT;
+ goto out;
+ }
+@@ -595,6 +608,10 @@ static int open_kcore(struct inode *inode, struct file *filp)
+ if (ret)
+ return ret;
+
++ filp->private_data = kmalloc(PAGE_SIZE, GFP_KERNEL);
++ if (!filp->private_data)
++ return -ENOMEM;
++
+ if (kcore_need_update)
+ kcore_update_ram();
+ if (i_size_read(inode) != proc_root_kcore->size) {
+@@ -605,9 +622,16 @@ static int open_kcore(struct inode *inode, struct file *filp)
+ return 0;
+ }
+
++static int release_kcore(struct inode *inode, struct file *file)
++{
++ kfree(file->private_data);
++ return 0;
++}
++
+ static const struct proc_ops kcore_proc_ops = {
+ .proc_read_iter = read_kcore_iter,
+ .proc_open = open_kcore,
++ .proc_release = release_kcore,
+ .proc_lseek = default_llseek,
+ };
+
+diff --git a/fs/smb/server/smb2misc.c b/fs/smb/server/smb2misc.c
+index 33b7e6c4ceffb..e881df1d10cbd 100644
+--- a/fs/smb/server/smb2misc.c
++++ b/fs/smb/server/smb2misc.c
+@@ -380,13 +380,13 @@ int ksmbd_smb2_check_message(struct ksmbd_work *work)
+ }
+
+ if (smb2_req_struct_sizes[command] != pdu->StructureSize2) {
+- if (command == SMB2_OPLOCK_BREAK_HE &&
+- le16_to_cpu(pdu->StructureSize2) != OP_BREAK_STRUCT_SIZE_20 &&
+- le16_to_cpu(pdu->StructureSize2) != OP_BREAK_STRUCT_SIZE_21) {
++ if (!(command == SMB2_OPLOCK_BREAK_HE &&
++ (le16_to_cpu(pdu->StructureSize2) == OP_BREAK_STRUCT_SIZE_20 ||
++ le16_to_cpu(pdu->StructureSize2) == OP_BREAK_STRUCT_SIZE_21))) {
+ /* special case for SMB2.1 lease break message */
+ ksmbd_debug(SMB,
+- "Illegal request size %d for oplock break\n",
+- le16_to_cpu(pdu->StructureSize2));
++ "Illegal request size %u for command %d\n",
++ le16_to_cpu(pdu->StructureSize2), command);
+ return 1;
+ }
+ }
+diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
+index d7e5196485604..4b4764abcdffa 100644
+--- a/fs/smb/server/smb2pdu.c
++++ b/fs/smb/server/smb2pdu.c
+@@ -2324,9 +2324,16 @@ next:
+ break;
+ buf_len -= next;
+ eabuf = (struct smb2_ea_info *)((char *)eabuf + next);
+- if (next < (u32)eabuf->EaNameLength + le16_to_cpu(eabuf->EaValueLength))
++ if (buf_len < sizeof(struct smb2_ea_info)) {
++ rc = -EINVAL;
+ break;
++ }
+
++ if (buf_len < sizeof(struct smb2_ea_info) + eabuf->EaNameLength +
++ le16_to_cpu(eabuf->EaValueLength)) {
++ rc = -EINVAL;
++ break;
++ }
+ } while (next != 0);
+
+ kfree(attr_name);
+diff --git a/include/linux/cpu.h b/include/linux/cpu.h
+index ce41922470a5d..e81edf076f291 100644
+--- a/include/linux/cpu.h
++++ b/include/linux/cpu.h
+@@ -72,6 +72,8 @@ extern ssize_t cpu_show_retbleed(struct device *dev,
+ struct device_attribute *attr, char *buf);
+ extern ssize_t cpu_show_spec_rstack_overflow(struct device *dev,
+ struct device_attribute *attr, char *buf);
++extern ssize_t cpu_show_gds(struct device *dev,
++ struct device_attribute *attr, char *buf);
+
+ extern __printf(4, 5)
+ struct device *cpu_device_create(struct device *parent, void *drvdata,
+diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
+index 054d7911bfc9f..c1637515a8a41 100644
+--- a/include/linux/skmsg.h
++++ b/include/linux/skmsg.h
+@@ -62,6 +62,7 @@ struct sk_psock_progs {
+
+ enum sk_psock_state_bits {
+ SK_PSOCK_TX_ENABLED,
++ SK_PSOCK_RX_STRP_ENABLED,
+ };
+
+ struct sk_psock_link {
+diff --git a/include/linux/tpm.h b/include/linux/tpm.h
+index 6a1e8f1572551..4ee9d13749adc 100644
+--- a/include/linux/tpm.h
++++ b/include/linux/tpm.h
+@@ -283,6 +283,7 @@ enum tpm_chip_flags {
+ TPM_CHIP_FLAG_FIRMWARE_POWER_MANAGED = BIT(6),
+ TPM_CHIP_FLAG_FIRMWARE_UPGRADE = BIT(7),
+ TPM_CHIP_FLAG_SUSPENDED = BIT(8),
++ TPM_CHIP_FLAG_HWRNG_DISABLED = BIT(9),
+ };
+
+ #define to_tpm_chip(d) container_of(d, struct tpm_chip, dev)
+diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
+index 9e04f69712b16..f67f705c5ad03 100644
+--- a/include/net/cfg80211.h
++++ b/include/net/cfg80211.h
+@@ -562,6 +562,9 @@ ieee80211_get_sband_iftype_data(const struct ieee80211_supported_band *sband,
+ if (WARN_ON(iftype >= NL80211_IFTYPE_MAX))
+ return NULL;
+
++ if (iftype == NL80211_IFTYPE_AP_VLAN)
++ iftype = NL80211_IFTYPE_AP;
++
+ for (i = 0; i < sband->n_iftype_data; i++) {
+ const struct ieee80211_sband_iftype_data *data =
+ &sband->iftype_data[i];
+diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
+index 1b0beb8f08aee..ad97049e28881 100644
+--- a/include/net/netfilter/nf_tables.h
++++ b/include/net/netfilter/nf_tables.h
+@@ -512,6 +512,7 @@ struct nft_set_elem_expr {
+ *
+ * @list: table set list node
+ * @bindings: list of set bindings
++ * @refs: internal refcounting for async set destruction
+ * @table: table this set belongs to
+ * @net: netnamespace this set belongs to
+ * @name: name of the set
+@@ -541,6 +542,7 @@ struct nft_set_elem_expr {
+ struct nft_set {
+ struct list_head list;
+ struct list_head bindings;
++ refcount_t refs;
+ struct nft_table *table;
+ possible_net_t net;
+ char *name;
+@@ -562,7 +564,8 @@ struct nft_set {
+ struct list_head pending_update;
+ /* runtime data below here */
+ const struct nft_set_ops *ops ____cacheline_aligned;
+- u16 flags:14,
++ u16 flags:13,
++ dead:1,
+ genmask:2;
+ u8 klen;
+ u8 dlen;
+@@ -1592,6 +1595,32 @@ static inline void nft_set_elem_clear_busy(struct nft_set_ext *ext)
+ clear_bit(NFT_SET_ELEM_BUSY_BIT, word);
+ }
+
++#define NFT_SET_ELEM_DEAD_MASK (1 << 3)
++
++#if defined(__LITTLE_ENDIAN_BITFIELD)
++#define NFT_SET_ELEM_DEAD_BIT 3
++#elif defined(__BIG_ENDIAN_BITFIELD)
++#define NFT_SET_ELEM_DEAD_BIT (BITS_PER_LONG - BITS_PER_BYTE + 3)
++#else
++#error
++#endif
++
++static inline void nft_set_elem_dead(struct nft_set_ext *ext)
++{
++ unsigned long *word = (unsigned long *)ext;
++
++ BUILD_BUG_ON(offsetof(struct nft_set_ext, genmask) != 0);
++ set_bit(NFT_SET_ELEM_DEAD_BIT, word);
++}
++
++static inline int nft_set_elem_is_dead(const struct nft_set_ext *ext)
++{
++ unsigned long *word = (unsigned long *)ext;
++
++ BUILD_BUG_ON(offsetof(struct nft_set_ext, genmask) != 0);
++ return test_bit(NFT_SET_ELEM_DEAD_BIT, word);
++}
++
+ /**
+ * struct nft_trans - nf_tables object update in transaction
+ *
+@@ -1729,6 +1758,38 @@ struct nft_trans_flowtable {
+ #define nft_trans_flowtable_flags(trans) \
+ (((struct nft_trans_flowtable *)trans->data)->flags)
+
++#define NFT_TRANS_GC_BATCHCOUNT 256
++
++struct nft_trans_gc {
++ struct list_head list;
++ struct net *net;
++ struct nft_set *set;
++ u32 seq;
++ u8 count;
++ void *priv[NFT_TRANS_GC_BATCHCOUNT];
++ struct rcu_head rcu;
++};
++
++struct nft_trans_gc *nft_trans_gc_alloc(struct nft_set *set,
++ unsigned int gc_seq, gfp_t gfp);
++void nft_trans_gc_destroy(struct nft_trans_gc *trans);
++
++struct nft_trans_gc *nft_trans_gc_queue_async(struct nft_trans_gc *gc,
++ unsigned int gc_seq, gfp_t gfp);
++void nft_trans_gc_queue_async_done(struct nft_trans_gc *gc);
++
++struct nft_trans_gc *nft_trans_gc_queue_sync(struct nft_trans_gc *gc, gfp_t gfp);
++void nft_trans_gc_queue_sync_done(struct nft_trans_gc *trans);
++
++void nft_trans_gc_elem_add(struct nft_trans_gc *gc, void *priv);
++
++struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc,
++ unsigned int gc_seq);
++
++void nft_setelem_data_deactivate(const struct net *net,
++ const struct nft_set *set,
++ struct nft_set_elem *elem);
++
+ int __init nft_chain_filter_init(void);
+ void nft_chain_filter_fini(void);
+
+@@ -1755,6 +1816,7 @@ struct nftables_pernet {
+ struct mutex commit_mutex;
+ u64 table_handle;
+ unsigned int base_seq;
++ unsigned int gc_seq;
+ };
+
+ extern unsigned int nf_tables_net_id;
+diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
+index bf06db8d2046c..7b1ddffa3dfc8 100644
+--- a/include/trace/events/tcp.h
++++ b/include/trace/events/tcp.h
+@@ -381,6 +381,7 @@ TRACE_EVENT(tcp_cong_state_set,
+ __field(const void *, skaddr)
+ __field(__u16, sport)
+ __field(__u16, dport)
++ __field(__u16, family)
+ __array(__u8, saddr, 4)
+ __array(__u8, daddr, 4)
+ __array(__u8, saddr_v6, 16)
+@@ -396,6 +397,7 @@ TRACE_EVENT(tcp_cong_state_set,
+
+ __entry->sport = ntohs(inet->inet_sport);
+ __entry->dport = ntohs(inet->inet_dport);
++ __entry->family = sk->sk_family;
+
+ p32 = (__be32 *) __entry->saddr;
+ *p32 = inet->inet_saddr;
+@@ -409,7 +411,8 @@ TRACE_EVENT(tcp_cong_state_set,
+ __entry->cong_state = ca_state;
+ ),
+
+- TP_printk("sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c cong_state=%u",
++ TP_printk("family=%s sport=%hu dport=%hu saddr=%pI4 daddr=%pI4 saddrv6=%pI6c daddrv6=%pI6c cong_state=%u",
++ show_family_name(__entry->family),
+ __entry->sport, __entry->dport,
+ __entry->saddr, __entry->daddr,
+ __entry->saddr_v6, __entry->daddr_v6,
+diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
+index 2989b81cca82a..a57bdf336ca8a 100644
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -3466,6 +3466,8 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+ * - use the kernel virtual address of the shared io_uring context
+ * (instead of the userspace-provided address, which has to be 0UL
+ * anyway).
++ * - use the same pgoff which the get_unmapped_area() uses to
++ * calculate the page colouring.
+ * For architectures without such aliasing requirements, the
+ * architecture will return any suitable mapping because addr is 0.
+ */
+@@ -3474,6 +3476,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+ pgoff = 0; /* has been translated to ptr above */
+ #ifdef SHM_COLOUR
+ addr = (uintptr_t) ptr;
++ pgoff = addr >> PAGE_SHIFT;
+ #else
+ addr = 0UL;
+ #endif
+diff --git a/io_uring/openclose.c b/io_uring/openclose.c
+index a1b98c81a52d9..1b4a06a8572df 100644
+--- a/io_uring/openclose.c
++++ b/io_uring/openclose.c
+@@ -35,9 +35,11 @@ static bool io_openat_force_async(struct io_open *open)
+ {
+ /*
+ * Don't bother trying for O_TRUNC, O_CREAT, or O_TMPFILE open,
+- * it'll always -EAGAIN
++ * it'll always -EAGAIN. Note that we test for __O_TMPFILE because
++ * O_TMPFILE includes O_DIRECTORY, which isn't a flag we need to force
++ * async for.
+ */
+- return open->how.flags & (O_TRUNC | O_CREAT | O_TMPFILE);
++ return open->how.flags & (O_TRUNC | O_CREAT | __O_TMPFILE);
+ }
+
+ static int __io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+diff --git a/mm/damon/core.c b/mm/damon/core.c
+index 91cff7f2997ef..eb9580942a5c3 100644
+--- a/mm/damon/core.c
++++ b/mm/damon/core.c
+@@ -273,6 +273,7 @@ struct damos_filter *damos_new_filter(enum damos_filter_type type,
+ return NULL;
+ filter->type = type;
+ filter->matching = matching;
++ INIT_LIST_HEAD(&filter->list);
+ return filter;
+ }
+
+diff --git a/mm/hugetlb.c b/mm/hugetlb.c
+index f791076da157c..a9fae660d1635 100644
+--- a/mm/hugetlb.c
++++ b/mm/hugetlb.c
+@@ -1580,9 +1580,37 @@ static inline void destroy_compound_gigantic_folio(struct folio *folio,
+ unsigned int order) { }
+ #endif
+
++static inline void __clear_hugetlb_destructor(struct hstate *h,
++ struct folio *folio)
++{
++ lockdep_assert_held(&hugetlb_lock);
++
++ /*
++ * Very subtle
++ *
++ * For non-gigantic pages set the destructor to the normal compound
++ * page dtor. This is needed in case someone takes an additional
++ * temporary ref to the page, and freeing is delayed until they drop
++ * their reference.
++ *
++ * For gigantic pages set the destructor to the null dtor. This
++ * destructor will never be called. Before freeing the gigantic
++ * page destroy_compound_gigantic_folio will turn the folio into a
++ * simple group of pages. After this the destructor does not
++ * apply.
++ *
++ */
++ if (hstate_is_gigantic(h))
++ folio_set_compound_dtor(folio, NULL_COMPOUND_DTOR);
++ else
++ folio_set_compound_dtor(folio, COMPOUND_PAGE_DTOR);
++}
++
+ /*
+- * Remove hugetlb folio from lists, and update dtor so that the folio appears
+- * as just a compound page.
++ * Remove hugetlb folio from lists.
++ * If vmemmap exists for the folio, update dtor so that the folio appears
++ * as just a compound page. Otherwise, wait until after allocating vmemmap
++ * to update dtor.
+ *
+ * A reference is held on the folio, except in the case of demote.
+ *
+@@ -1613,31 +1641,19 @@ static void __remove_hugetlb_folio(struct hstate *h, struct folio *folio,
+ }
+
+ /*
+- * Very subtle
+- *
+- * For non-gigantic pages set the destructor to the normal compound
+- * page dtor. This is needed in case someone takes an additional
+- * temporary ref to the page, and freeing is delayed until they drop
+- * their reference.
+- *
+- * For gigantic pages set the destructor to the null dtor. This
+- * destructor will never be called. Before freeing the gigantic
+- * page destroy_compound_gigantic_folio will turn the folio into a
+- * simple group of pages. After this the destructor does not
+- * apply.
+- *
+- * This handles the case where more than one ref is held when and
+- * after update_and_free_hugetlb_folio is called.
+- *
+- * In the case of demote we do not ref count the page as it will soon
+- * be turned into a page of smaller size.
++ * We can only clear the hugetlb destructor after allocating vmemmap
++ * pages. Otherwise, someone (memory error handling) may try to write
++ * to tail struct pages.
++ */
++ if (!folio_test_hugetlb_vmemmap_optimized(folio))
++ __clear_hugetlb_destructor(h, folio);
++
++ /*
++ * In the case of demote we do not ref count the page as it will soon
++ * be turned into a page of smaller size.
+ */
+ if (!demote)
+ folio_ref_unfreeze(folio, 1);
+- if (hstate_is_gigantic(h))
+- folio_set_compound_dtor(folio, NULL_COMPOUND_DTOR);
+- else
+- folio_set_compound_dtor(folio, COMPOUND_PAGE_DTOR);
+
+ h->nr_huge_pages--;
+ h->nr_huge_pages_node[nid]--;
+@@ -1706,6 +1722,7 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
+ {
+ int i;
+ struct page *subpage;
++ bool clear_dtor = folio_test_hugetlb_vmemmap_optimized(folio);
+
+ if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
+ return;
+@@ -1736,6 +1753,16 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
+ if (unlikely(folio_test_hwpoison(folio)))
+ folio_clear_hugetlb_hwpoison(folio);
+
++ /*
++ * If vmemmap pages were allocated above, then we need to clear the
++ * hugetlb destructor under the hugetlb lock.
++ */
++ if (clear_dtor) {
++ spin_lock_irq(&hugetlb_lock);
++ __clear_hugetlb_destructor(h, folio);
++ spin_unlock_irq(&hugetlb_lock);
++ }
++
+ for (i = 0; i < pages_per_huge_page(h); i++) {
+ subpage = folio_page(folio, i);
+ subpage->flags &= ~(1 << PG_locked | 1 << PG_error |
+diff --git a/mm/memory-failure.c b/mm/memory-failure.c
+index 47e2b545ffcc6..244dbfe075a25 100644
+--- a/mm/memory-failure.c
++++ b/mm/memory-failure.c
+@@ -2469,7 +2469,7 @@ int unpoison_memory(unsigned long pfn)
+ {
+ struct folio *folio;
+ struct page *p;
+- int ret = -EBUSY;
++ int ret = -EBUSY, ghp;
+ unsigned long count = 1;
+ bool huge = false;
+ static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
+@@ -2502,6 +2502,13 @@ int unpoison_memory(unsigned long pfn)
+ goto unlock_mutex;
+ }
+
++ if (folio_test_slab(folio) || PageTable(&folio->page) || folio_test_reserved(folio))
++ goto unlock_mutex;
++
++ /*
++ * Note that folio->_mapcount is overloaded in SLAB, so the simple test
++ * in folio_mapped() has to be done after folio_test_slab() is checked.
++ */
+ if (folio_mapped(folio)) {
+ unpoison_pr_info("Unpoison: Someone maps the hwpoison page %#lx\n",
+ pfn, &unpoison_rs);
+@@ -2514,32 +2521,28 @@ int unpoison_memory(unsigned long pfn)
+ goto unlock_mutex;
+ }
+
+- if (folio_test_slab(folio) || PageTable(&folio->page) || folio_test_reserved(folio))
+- goto unlock_mutex;
+-
+- ret = get_hwpoison_page(p, MF_UNPOISON);
+- if (!ret) {
++ ghp = get_hwpoison_page(p, MF_UNPOISON);
++ if (!ghp) {
+ if (PageHuge(p)) {
+ huge = true;
+ count = folio_free_raw_hwp(folio, false);
+- if (count == 0) {
+- ret = -EBUSY;
++ if (count == 0)
+ goto unlock_mutex;
+- }
+ }
+ ret = folio_test_clear_hwpoison(folio) ? 0 : -EBUSY;
+- } else if (ret < 0) {
+- if (ret == -EHWPOISON) {
++ } else if (ghp < 0) {
++ if (ghp == -EHWPOISON) {
+ ret = put_page_back_buddy(p) ? 0 : -EBUSY;
+- } else
++ } else {
++ ret = ghp;
+ unpoison_pr_info("Unpoison: failed to grab page %#lx\n",
+ pfn, &unpoison_rs);
++ }
+ } else {
+ if (PageHuge(p)) {
+ huge = true;
+ count = folio_free_raw_hwp(folio, false);
+ if (count == 0) {
+- ret = -EBUSY;
+ folio_put(folio);
+ goto unlock_mutex;
+ }
+diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
+index 02f7f414aade0..4def13ee071c7 100644
+--- a/mm/zsmalloc.c
++++ b/mm/zsmalloc.c
+@@ -1977,6 +1977,7 @@ static void replace_sub_page(struct size_class *class, struct zspage *zspage,
+
+ static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
+ {
++ struct zs_pool *pool;
+ struct zspage *zspage;
+
+ /*
+@@ -1986,9 +1987,10 @@ static bool zs_page_isolate(struct page *page, isolate_mode_t mode)
+ VM_BUG_ON_PAGE(PageIsolated(page), page);
+
+ zspage = get_zspage(page);
+- migrate_write_lock(zspage);
++ pool = zspage->pool;
++ spin_lock(&pool->lock);
+ inc_zspage_isolation(zspage);
+- migrate_write_unlock(zspage);
++ spin_unlock(&pool->lock);
+
+ return true;
+ }
+@@ -2054,12 +2056,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
+ kunmap_atomic(s_addr);
+
+ replace_sub_page(class, zspage, newpage, page);
++ dec_zspage_isolation(zspage);
+ /*
+ * Since we complete the data copy and set up new zspage structure,
+ * it's okay to release the pool's lock.
+ */
+ spin_unlock(&pool->lock);
+- dec_zspage_isolation(zspage);
+ migrate_write_unlock(zspage);
+
+ get_page(newpage);
+@@ -2076,14 +2078,16 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
+
+ static void zs_page_putback(struct page *page)
+ {
++ struct zs_pool *pool;
+ struct zspage *zspage;
+
+ VM_BUG_ON_PAGE(!PageIsolated(page), page);
+
+ zspage = get_zspage(page);
+- migrate_write_lock(zspage);
++ pool = zspage->pool;
++ spin_lock(&pool->lock);
+ dec_zspage_isolation(zspage);
+- migrate_write_unlock(zspage);
++ spin_unlock(&pool->lock);
+ }
+
+ static const struct movable_operations zsmalloc_mops = {
+diff --git a/net/core/filter.c b/net/core/filter.c
+index 1c959794a8862..f15ae393c2767 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -4115,12 +4115,6 @@ BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
+ if (unlikely(data_end > data_hard_end))
+ return -EINVAL;
+
+- /* ALL drivers MUST init xdp->frame_sz, chicken check below */
+- if (unlikely(xdp->frame_sz > PAGE_SIZE)) {
+- WARN_ONCE(1, "Too BIG xdp->frame_sz = %d\n", xdp->frame_sz);
+- return -EINVAL;
+- }
+-
+ if (unlikely(data_end < xdp->data + ETH_HLEN))
+ return -EINVAL;
+
+diff --git a/net/core/skmsg.c b/net/core/skmsg.c
+index a29508e1ff356..ef1a2eb6520bf 100644
+--- a/net/core/skmsg.c
++++ b/net/core/skmsg.c
+@@ -1120,13 +1120,19 @@ static void sk_psock_strp_data_ready(struct sock *sk)
+
+ int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock)
+ {
++ int ret;
++
+ static const struct strp_callbacks cb = {
+ .rcv_msg = sk_psock_strp_read,
+ .read_sock_done = sk_psock_strp_read_done,
+ .parse_msg = sk_psock_strp_parse,
+ };
+
+- return strp_init(&psock->strp, sk, &cb);
++ ret = strp_init(&psock->strp, sk, &cb);
++ if (!ret)
++ sk_psock_set_state(psock, SK_PSOCK_RX_STRP_ENABLED);
++
++ return ret;
+ }
+
+ void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock)
+@@ -1154,7 +1160,7 @@ void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock)
+ static void sk_psock_done_strp(struct sk_psock *psock)
+ {
+ /* Parser has been stopped */
+- if (psock->progs.stream_parser)
++ if (sk_psock_test_state(psock, SK_PSOCK_RX_STRP_ENABLED))
+ strp_done(&psock->strp);
+ }
+ #else
+diff --git a/net/core/sock_map.c b/net/core/sock_map.c
+index 08ab108206bf8..8f07fea39d9ea 100644
+--- a/net/core/sock_map.c
++++ b/net/core/sock_map.c
+@@ -146,13 +146,13 @@ static void sock_map_del_link(struct sock *sk,
+ list_for_each_entry_safe(link, tmp, &psock->link, list) {
+ if (link->link_raw == link_raw) {
+ struct bpf_map *map = link->map;
+- struct bpf_stab *stab = container_of(map, struct bpf_stab,
+- map);
+- if (psock->saved_data_ready && stab->progs.stream_parser)
++ struct sk_psock_progs *progs = sock_map_progs(map);
++
++ if (psock->saved_data_ready && progs->stream_parser)
+ strp_stop = true;
+- if (psock->saved_data_ready && stab->progs.stream_verdict)
++ if (psock->saved_data_ready && progs->stream_verdict)
+ verdict_stop = true;
+- if (psock->saved_data_ready && stab->progs.skb_verdict)
++ if (psock->saved_data_ready && progs->skb_verdict)
+ verdict_stop = true;
+ list_del(&link->list);
+ sk_psock_free_link(link);
+diff --git a/net/dccp/output.c b/net/dccp/output.c
+index b8a24734385ef..fd2eb148d24de 100644
+--- a/net/dccp/output.c
++++ b/net/dccp/output.c
+@@ -187,7 +187,7 @@ unsigned int dccp_sync_mss(struct sock *sk, u32 pmtu)
+
+ /* And store cached results */
+ icsk->icsk_pmtu_cookie = pmtu;
+- dp->dccps_mss_cache = cur_mps;
++ WRITE_ONCE(dp->dccps_mss_cache, cur_mps);
+
+ return cur_mps;
+ }
+diff --git a/net/dccp/proto.c b/net/dccp/proto.c
+index b0ebf853cb07b..18873f2308ec8 100644
+--- a/net/dccp/proto.c
++++ b/net/dccp/proto.c
+@@ -630,7 +630,7 @@ static int do_dccp_getsockopt(struct sock *sk, int level, int optname,
+ return dccp_getsockopt_service(sk, len,
+ (__be32 __user *)optval, optlen);
+ case DCCP_SOCKOPT_GET_CUR_MPS:
+- val = dp->dccps_mss_cache;
++ val = READ_ONCE(dp->dccps_mss_cache);
+ break;
+ case DCCP_SOCKOPT_AVAILABLE_CCIDS:
+ return ccid_getsockopt_builtin_ccids(sk, len, optval, optlen);
+@@ -739,7 +739,7 @@ int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+
+ trace_dccp_probe(sk, len);
+
+- if (len > dp->dccps_mss_cache)
++ if (len > READ_ONCE(dp->dccps_mss_cache))
+ return -EMSGSIZE;
+
+ lock_sock(sk);
+@@ -772,6 +772,12 @@ int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+ goto out_discard;
+ }
+
++ /* We need to check dccps_mss_cache after socket is locked. */
++ if (len > dp->dccps_mss_cache) {
++ rc = -EMSGSIZE;
++ goto out_discard;
++ }
++
+ skb_reserve(skb, sk->sk_prot->max_header);
+ rc = memcpy_from_msg(skb_put(skb, len), msg, len);
+ if (rc != 0)
+diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
+index 92c02c886fe73..586b1b3e35b80 100644
+--- a/net/ipv4/ip_tunnel_core.c
++++ b/net/ipv4/ip_tunnel_core.c
+@@ -224,7 +224,7 @@ static int iptunnel_pmtud_build_icmp(struct sk_buff *skb, int mtu)
+ .un.frag.__unused = 0,
+ .un.frag.mtu = htons(mtu),
+ };
+- icmph->checksum = ip_compute_csum(icmph, len);
++ icmph->checksum = csum_fold(skb_checksum(skb, 0, len, 0));
+ skb_reset_transport_header(skb);
+
+ niph = skb_push(skb, sizeof(*niph));
+diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
+index f95142e56da05..be5498f5dd319 100644
+--- a/net/ipv4/nexthop.c
++++ b/net/ipv4/nexthop.c
+@@ -3221,13 +3221,9 @@ static int rtm_dump_nexthop(struct sk_buff *skb, struct netlink_callback *cb)
+ &rtm_dump_nexthop_cb, &filter);
+ if (err < 0) {
+ if (likely(skb->len))
+- goto out;
+- goto out_err;
++ err = skb->len;
+ }
+
+-out:
+- err = skb->len;
+-out_err:
+ cb->seq = net->nexthop.seq;
+ nl_dump_check_consistent(cb, nlmsg_hdr(skb));
+ return err;
+@@ -3367,25 +3363,19 @@ static int rtm_dump_nexthop_bucket_nh(struct sk_buff *skb,
+ dd->filter.res_bucket_nh_id != nhge->nh->id)
+ continue;
+
++ dd->ctx->bucket_index = bucket_index;
+ err = nh_fill_res_bucket(skb, nh, bucket, bucket_index,
+ RTM_NEWNEXTHOPBUCKET, portid,
+ cb->nlh->nlmsg_seq, NLM_F_MULTI,
+ cb->extack);
+- if (err < 0) {
+- if (likely(skb->len))
+- goto out;
+- goto out_err;
+- }
++ if (err)
++ return err;
+ }
+
+ dd->ctx->done_nh_idx = dd->ctx->nh.idx + 1;
+- bucket_index = 0;
++ dd->ctx->bucket_index = 0;
+
+-out:
+- err = skb->len;
+-out_err:
+- dd->ctx->bucket_index = bucket_index;
+- return err;
++ return 0;
+ }
+
+ static int rtm_dump_nexthop_bucket_cb(struct sk_buff *skb,
+@@ -3434,13 +3424,9 @@ static int rtm_dump_nexthop_bucket(struct sk_buff *skb,
+
+ if (err < 0) {
+ if (likely(skb->len))
+- goto out;
+- goto out_err;
++ err = skb->len;
+ }
+
+-out:
+- err = skb->len;
+-out_err:
+ cb->seq = net->nexthop.seq;
+ nl_dump_check_consistent(cb, nlmsg_hdr(skb));
+ return err;
+diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
+index 18634ebd20a47..a42be96ae209b 100644
+--- a/net/ipv6/ndisc.c
++++ b/net/ipv6/ndisc.c
+@@ -197,7 +197,8 @@ static struct nd_opt_hdr *ndisc_next_option(struct nd_opt_hdr *cur,
+ static inline int ndisc_is_useropt(const struct net_device *dev,
+ struct nd_opt_hdr *opt)
+ {
+- return opt->nd_opt_type == ND_OPT_RDNSS ||
++ return opt->nd_opt_type == ND_OPT_PREFIX_INFO ||
++ opt->nd_opt_type == ND_OPT_RDNSS ||
+ opt->nd_opt_type == ND_OPT_DNSSL ||
+ opt->nd_opt_type == ND_OPT_CAPTIVE_PORTAL ||
+ opt->nd_opt_type == ND_OPT_PREF64 ||
+diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
+index f2eeb8a850af2..39daf5915bae2 100644
+--- a/net/mptcp/protocol.c
++++ b/net/mptcp/protocol.c
+@@ -2335,7 +2335,7 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
+
+ lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
+
+- if (flags & MPTCP_CF_FASTCLOSE) {
++ if ((flags & MPTCP_CF_FASTCLOSE) && !__mptcp_check_fallback(msk)) {
+ /* be sure to force the tcp_disconnect() path,
+ * to generate the egress reset
+ */
+@@ -3321,7 +3321,7 @@ static void mptcp_release_cb(struct sock *sk)
+
+ if (__test_and_clear_bit(MPTCP_CLEAN_UNA, &msk->cb_flags))
+ __mptcp_clean_una_wakeup(sk);
+- if (unlikely(&msk->cb_flags)) {
++ if (unlikely(msk->cb_flags)) {
+ /* be sure to set the current sk state before tacking actions
+ * depending on sk_state, that is processing MPTCP_ERROR_REPORT
+ */
+diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
+index d3783a7056e17..16c9c3197adad 100644
+--- a/net/mptcp/protocol.h
++++ b/net/mptcp/protocol.h
+@@ -320,7 +320,6 @@ struct mptcp_sock {
+
+ u32 setsockopt_seq;
+ char ca_name[TCP_CA_NAME_MAX];
+- struct mptcp_sock *dl_next;
+ };
+
+ #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
+diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
+index d9c8b21c6076e..521d6817464a9 100644
+--- a/net/mptcp/subflow.c
++++ b/net/mptcp/subflow.c
+@@ -1785,16 +1785,31 @@ static void subflow_state_change(struct sock *sk)
+ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_ssk)
+ {
+ struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue;
+- struct mptcp_sock *msk, *next, *head = NULL;
+- struct request_sock *req;
+- struct sock *sk;
++ struct request_sock *req, *head, *tail;
++ struct mptcp_subflow_context *subflow;
++ struct sock *sk, *ssk;
+
+- /* build a list of all unaccepted mptcp sockets */
++ /* Due to lock dependencies no relevant lock can be acquired under rskq_lock.
++ * Splice the req list, so that accept() can not reach the pending ssk after
++ * the listener socket is released below.
++ */
+ spin_lock_bh(&queue->rskq_lock);
+- for (req = queue->rskq_accept_head; req; req = req->dl_next) {
+- struct mptcp_subflow_context *subflow;
+- struct sock *ssk = req->sk;
++ head = queue->rskq_accept_head;
++ tail = queue->rskq_accept_tail;
++ queue->rskq_accept_head = NULL;
++ queue->rskq_accept_tail = NULL;
++ spin_unlock_bh(&queue->rskq_lock);
++
++ if (!head)
++ return;
+
++ /* can't acquire the msk socket lock under the subflow one,
++ * or will cause ABBA deadlock
++ */
++ release_sock(listener_ssk);
++
++ for (req = head; req; req = req->dl_next) {
++ ssk = req->sk;
+ if (!sk_is_mptcp(ssk))
+ continue;
+
+@@ -1802,32 +1817,10 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
+ if (!subflow || !subflow->conn)
+ continue;
+
+- /* skip if already in list */
+ sk = subflow->conn;
+- msk = mptcp_sk(sk);
+- if (msk->dl_next || msk == head)
+- continue;
+-
+ sock_hold(sk);
+- msk->dl_next = head;
+- head = msk;
+- }
+- spin_unlock_bh(&queue->rskq_lock);
+- if (!head)
+- return;
+-
+- /* can't acquire the msk socket lock under the subflow one,
+- * or will cause ABBA deadlock
+- */
+- release_sock(listener_ssk);
+-
+- for (msk = head; msk; msk = next) {
+- sk = (struct sock *)msk;
+
+ lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
+- next = msk->dl_next;
+- msk->dl_next = NULL;
+-
+ __mptcp_unaccepted_force_close(sk);
+ release_sock(sk);
+
+@@ -1851,6 +1844,13 @@ void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_s
+
+ /* we are still under the listener msk socket lock */
+ lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING);
++
++ /* restore the listener queue, to let the TCP code clean it up */
++ spin_lock_bh(&queue->rskq_lock);
++ WARN_ON_ONCE(queue->rskq_accept_head);
++ queue->rskq_accept_head = head;
++ queue->rskq_accept_tail = tail;
++ spin_unlock_bh(&queue->rskq_lock);
+ }
+
+ static int subflow_ulp_init(struct sock *sk)
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index da00c411a9cd4..c6de10f458fa4 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -31,7 +31,9 @@ static LIST_HEAD(nf_tables_expressions);
+ static LIST_HEAD(nf_tables_objects);
+ static LIST_HEAD(nf_tables_flowtables);
+ static LIST_HEAD(nf_tables_destroy_list);
++static LIST_HEAD(nf_tables_gc_list);
+ static DEFINE_SPINLOCK(nf_tables_destroy_list_lock);
++static DEFINE_SPINLOCK(nf_tables_gc_list_lock);
+
+ enum {
+ NFT_VALIDATE_SKIP = 0,
+@@ -120,6 +122,9 @@ static void nft_validate_state_update(struct nft_table *table, u8 new_validate_s
+ static void nf_tables_trans_destroy_work(struct work_struct *w);
+ static DECLARE_WORK(trans_destroy_work, nf_tables_trans_destroy_work);
+
++static void nft_trans_gc_work(struct work_struct *work);
++static DECLARE_WORK(trans_gc_work, nft_trans_gc_work);
++
+ static void nft_ctx_init(struct nft_ctx *ctx,
+ struct net *net,
+ const struct sk_buff *skb,
+@@ -581,10 +586,6 @@ static int nft_trans_set_add(const struct nft_ctx *ctx, int msg_type,
+ return __nft_trans_set_add(ctx, msg_type, set, NULL);
+ }
+
+-static void nft_setelem_data_deactivate(const struct net *net,
+- const struct nft_set *set,
+- struct nft_set_elem *elem);
+-
+ static int nft_mapelem_deactivate(const struct nft_ctx *ctx,
+ struct nft_set *set,
+ const struct nft_set_iter *iter,
+@@ -5054,6 +5055,7 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
+
+ INIT_LIST_HEAD(&set->bindings);
+ INIT_LIST_HEAD(&set->catchall_list);
++ refcount_set(&set->refs, 1);
+ set->table = table;
+ write_pnet(&set->net, net);
+ set->ops = ops;
+@@ -5121,6 +5123,14 @@ static void nft_set_catchall_destroy(const struct nft_ctx *ctx,
+ }
+ }
+
++static void nft_set_put(struct nft_set *set)
++{
++ if (refcount_dec_and_test(&set->refs)) {
++ kfree(set->name);
++ kvfree(set);
++ }
++}
++
+ static void nft_set_destroy(const struct nft_ctx *ctx, struct nft_set *set)
+ {
+ int i;
+@@ -5133,8 +5143,7 @@ static void nft_set_destroy(const struct nft_ctx *ctx, struct nft_set *set)
+
+ set->ops->destroy(ctx, set);
+ nft_set_catchall_destroy(ctx, set);
+- kfree(set->name);
+- kvfree(set);
++ nft_set_put(set);
+ }
+
+ static int nf_tables_delset(struct sk_buff *skb, const struct nfnl_info *info,
+@@ -5590,8 +5599,12 @@ static int nf_tables_dump_setelem(const struct nft_ctx *ctx,
+ const struct nft_set_iter *iter,
+ struct nft_set_elem *elem)
+ {
++ const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv);
+ struct nft_set_dump_args *args;
+
++ if (nft_set_elem_expired(ext))
++ return 0;
++
+ args = container_of(iter, struct nft_set_dump_args, iter);
+ return nf_tables_fill_setelem(args->skb, set, elem);
+ }
+@@ -6251,7 +6264,8 @@ struct nft_set_ext *nft_set_catchall_lookup(const struct net *net,
+ list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
+ ext = nft_set_elem_ext(set, catchall->elem);
+ if (nft_set_elem_active(ext, genmask) &&
+- !nft_set_elem_expired(ext))
++ !nft_set_elem_expired(ext) &&
++ !nft_set_elem_is_dead(ext))
+ return ext;
+ }
+
+@@ -6343,7 +6357,6 @@ static void nft_setelem_activate(struct net *net, struct nft_set *set,
+
+ if (nft_setelem_is_catchall(set, elem)) {
+ nft_set_elem_change_active(net, set, ext);
+- nft_set_elem_clear_busy(ext);
+ } else {
+ set->ops->activate(net, set, elem);
+ }
+@@ -6358,8 +6371,7 @@ static int nft_setelem_catchall_deactivate(const struct net *net,
+
+ list_for_each_entry(catchall, &set->catchall_list, list) {
+ ext = nft_set_elem_ext(set, catchall->elem);
+- if (!nft_is_active(net, ext) ||
+- nft_set_elem_mark_busy(ext))
++ if (!nft_is_active(net, ext))
+ continue;
+
+ kfree(elem->priv);
+@@ -6903,9 +6915,9 @@ static void nft_setelem_data_activate(const struct net *net,
+ nft_use_inc_restore(&(*nft_set_ext_obj(ext))->use);
+ }
+
+-static void nft_setelem_data_deactivate(const struct net *net,
+- const struct nft_set *set,
+- struct nft_set_elem *elem)
++void nft_setelem_data_deactivate(const struct net *net,
++ const struct nft_set *set,
++ struct nft_set_elem *elem)
+ {
+ const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv);
+
+@@ -7069,8 +7081,7 @@ static int nft_set_catchall_flush(const struct nft_ctx *ctx,
+
+ list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
+ ext = nft_set_elem_ext(set, catchall->elem);
+- if (!nft_set_elem_active(ext, genmask) ||
+- nft_set_elem_mark_busy(ext))
++ if (!nft_set_elem_active(ext, genmask))
+ continue;
+
+ elem.priv = catchall->elem;
+@@ -9382,6 +9393,207 @@ void nft_chain_del(struct nft_chain *chain)
+ list_del_rcu(&chain->list);
+ }
+
++static void nft_trans_gc_setelem_remove(struct nft_ctx *ctx,
++ struct nft_trans_gc *trans)
++{
++ void **priv = trans->priv;
++ unsigned int i;
++
++ for (i = 0; i < trans->count; i++) {
++ struct nft_set_elem elem = {
++ .priv = priv[i],
++ };
++
++ nft_setelem_data_deactivate(ctx->net, trans->set, &elem);
++ nft_setelem_remove(ctx->net, trans->set, &elem);
++ }
++}
++
++void nft_trans_gc_destroy(struct nft_trans_gc *trans)
++{
++ nft_set_put(trans->set);
++ put_net(trans->net);
++ kfree(trans);
++}
++
++static void nft_trans_gc_trans_free(struct rcu_head *rcu)
++{
++ struct nft_set_elem elem = {};
++ struct nft_trans_gc *trans;
++ struct nft_ctx ctx = {};
++ unsigned int i;
++
++ trans = container_of(rcu, struct nft_trans_gc, rcu);
++ ctx.net = read_pnet(&trans->set->net);
++
++ for (i = 0; i < trans->count; i++) {
++ elem.priv = trans->priv[i];
++ if (!nft_setelem_is_catchall(trans->set, &elem))
++ atomic_dec(&trans->set->nelems);
++
++ nf_tables_set_elem_destroy(&ctx, trans->set, elem.priv);
++ }
++
++ nft_trans_gc_destroy(trans);
++}
++
++static bool nft_trans_gc_work_done(struct nft_trans_gc *trans)
++{
++ struct nftables_pernet *nft_net;
++ struct nft_ctx ctx = {};
++
++ nft_net = nft_pernet(trans->net);
++
++ mutex_lock(&nft_net->commit_mutex);
++
++ /* Check for race with transaction, otherwise this batch refers to
++ * stale objects that might not be there anymore. Skip transaction if
++ * set has been destroyed from control plane transaction in case gc
++ * worker loses race.
++ */
++ if (READ_ONCE(nft_net->gc_seq) != trans->seq || trans->set->dead) {
++ mutex_unlock(&nft_net->commit_mutex);
++ return false;
++ }
++
++ ctx.net = trans->net;
++ ctx.table = trans->set->table;
++
++ nft_trans_gc_setelem_remove(&ctx, trans);
++ mutex_unlock(&nft_net->commit_mutex);
++
++ return true;
++}
++
++static void nft_trans_gc_work(struct work_struct *work)
++{
++ struct nft_trans_gc *trans, *next;
++ LIST_HEAD(trans_gc_list);
++
++ spin_lock(&nf_tables_destroy_list_lock);
++ list_splice_init(&nf_tables_gc_list, &trans_gc_list);
++ spin_unlock(&nf_tables_destroy_list_lock);
++
++ list_for_each_entry_safe(trans, next, &trans_gc_list, list) {
++ list_del(&trans->list);
++ if (!nft_trans_gc_work_done(trans)) {
++ nft_trans_gc_destroy(trans);
++ continue;
++ }
++ call_rcu(&trans->rcu, nft_trans_gc_trans_free);
++ }
++}
++
++struct nft_trans_gc *nft_trans_gc_alloc(struct nft_set *set,
++ unsigned int gc_seq, gfp_t gfp)
++{
++ struct net *net = read_pnet(&set->net);
++ struct nft_trans_gc *trans;
++
++ trans = kzalloc(sizeof(*trans), gfp);
++ if (!trans)
++ return NULL;
++
++ refcount_inc(&set->refs);
++ trans->set = set;
++ trans->net = get_net(net);
++ trans->seq = gc_seq;
++
++ return trans;
++}
++
++void nft_trans_gc_elem_add(struct nft_trans_gc *trans, void *priv)
++{
++ trans->priv[trans->count++] = priv;
++}
++
++static void nft_trans_gc_queue_work(struct nft_trans_gc *trans)
++{
++ spin_lock(&nf_tables_gc_list_lock);
++ list_add_tail(&trans->list, &nf_tables_gc_list);
++ spin_unlock(&nf_tables_gc_list_lock);
++
++ schedule_work(&trans_gc_work);
++}
++
++static int nft_trans_gc_space(struct nft_trans_gc *trans)
++{
++ return NFT_TRANS_GC_BATCHCOUNT - trans->count;
++}
++
++struct nft_trans_gc *nft_trans_gc_queue_async(struct nft_trans_gc *gc,
++ unsigned int gc_seq, gfp_t gfp)
++{
++ if (nft_trans_gc_space(gc))
++ return gc;
++
++ nft_trans_gc_queue_work(gc);
++
++ return nft_trans_gc_alloc(gc->set, gc_seq, gfp);
++}
++
++void nft_trans_gc_queue_async_done(struct nft_trans_gc *trans)
++{
++ if (trans->count == 0) {
++ nft_trans_gc_destroy(trans);
++ return;
++ }
++
++ nft_trans_gc_queue_work(trans);
++}
++
++struct nft_trans_gc *nft_trans_gc_queue_sync(struct nft_trans_gc *gc, gfp_t gfp)
++{
++ if (WARN_ON_ONCE(!lockdep_commit_lock_is_held(gc->net)))
++ return NULL;
++
++ if (nft_trans_gc_space(gc))
++ return gc;
++
++ call_rcu(&gc->rcu, nft_trans_gc_trans_free);
++
++ return nft_trans_gc_alloc(gc->set, 0, gfp);
++}
++
++void nft_trans_gc_queue_sync_done(struct nft_trans_gc *trans)
++{
++ WARN_ON_ONCE(!lockdep_commit_lock_is_held(trans->net));
++
++ if (trans->count == 0) {
++ nft_trans_gc_destroy(trans);
++ return;
++ }
++
++ call_rcu(&trans->rcu, nft_trans_gc_trans_free);
++}
++
++struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc,
++ unsigned int gc_seq)
++{
++ struct nft_set_elem_catchall *catchall;
++ const struct nft_set *set = gc->set;
++ struct nft_set_ext *ext;
++
++ list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
++ ext = nft_set_elem_ext(set, catchall->elem);
++
++ if (!nft_set_elem_expired(ext))
++ continue;
++ if (nft_set_elem_is_dead(ext))
++ goto dead_elem;
++
++ nft_set_elem_dead(ext);
++dead_elem:
++ gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
++ if (!gc)
++ return NULL;
++
++ nft_trans_gc_elem_add(gc, catchall->elem);
++ }
++
++ return gc;
++}
++
+ static void nf_tables_module_autoload_cleanup(struct net *net)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+@@ -9544,11 +9756,11 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+ struct nft_trans *trans, *next;
++ unsigned int base_seq, gc_seq;
+ LIST_HEAD(set_update_list);
+ struct nft_trans_elem *te;
+ struct nft_chain *chain;
+ struct nft_table *table;
+- unsigned int base_seq;
+ LIST_HEAD(adl);
+ int err;
+
+@@ -9625,6 +9837,10 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+
+ WRITE_ONCE(nft_net->base_seq, base_seq);
+
++ /* Bump gc counter, it becomes odd, this is the busy mark. */
++ gc_seq = READ_ONCE(nft_net->gc_seq);
++ WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
++
+ /* step 3. Start new generation, rules_gen_X now in use. */
+ net->nft.gencursor = nft_gencursor_next(net);
+
+@@ -9729,6 +9945,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ break;
+ case NFT_MSG_DELSET:
+ case NFT_MSG_DESTROYSET:
++ nft_trans_set(trans)->dead = 1;
+ list_del_rcu(&nft_trans_set(trans)->list);
+ nf_tables_set_notify(&trans->ctx, nft_trans_set(trans),
+ trans->msg_type, GFP_KERNEL);
+@@ -9831,6 +10048,8 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ nft_commit_notify(net, NETLINK_CB(skb).portid);
+ nf_tables_gen_notify(net, skb, NFT_MSG_NEWGEN);
+ nf_tables_commit_audit_log(&adl, nft_net->base_seq);
++
++ WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
+ nf_tables_commit_release(net);
+
+ return 0;
+@@ -10880,6 +11099,7 @@ static int __net_init nf_tables_init_net(struct net *net)
+ INIT_LIST_HEAD(&nft_net->notify_list);
+ mutex_init(&nft_net->commit_mutex);
+ nft_net->base_seq = 1;
++ nft_net->gc_seq = 0;
+
+ return 0;
+ }
+@@ -10908,10 +11128,16 @@ static void __net_exit nf_tables_exit_net(struct net *net)
+ WARN_ON_ONCE(!list_empty(&nft_net->notify_list));
+ }
+
++static void nf_tables_exit_batch(struct list_head *net_exit_list)
++{
++ flush_work(&trans_gc_work);
++}
++
+ static struct pernet_operations nf_tables_net_ops = {
+ .init = nf_tables_init_net,
+ .pre_exit = nf_tables_pre_exit_net,
+ .exit = nf_tables_exit_net,
++ .exit_batch = nf_tables_exit_batch,
+ .id = &nf_tables_net_id,
+ .size = sizeof(struct nftables_pernet),
+ };
+@@ -10983,6 +11209,7 @@ static void __exit nf_tables_module_exit(void)
+ nft_chain_filter_fini();
+ nft_chain_route_fini();
+ unregister_pernet_subsys(&nf_tables_net_ops);
++ cancel_work_sync(&trans_gc_work);
+ cancel_work_sync(&trans_destroy_work);
+ rcu_barrier();
+ rhltable_destroy(&nft_objname_ht);
+diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
+index 0b73cb0e752f7..cef5df8460009 100644
+--- a/net/netfilter/nft_set_hash.c
++++ b/net/netfilter/nft_set_hash.c
+@@ -59,6 +59,8 @@ static inline int nft_rhash_cmp(struct rhashtable_compare_arg *arg,
+
+ if (memcmp(nft_set_ext_key(&he->ext), x->key, x->set->klen))
+ return 1;
++ if (nft_set_elem_is_dead(&he->ext))
++ return 1;
+ if (nft_set_elem_expired(&he->ext))
+ return 1;
+ if (!nft_set_elem_active(&he->ext, x->genmask))
+@@ -188,7 +190,6 @@ static void nft_rhash_activate(const struct net *net, const struct nft_set *set,
+ struct nft_rhash_elem *he = elem->priv;
+
+ nft_set_elem_change_active(net, set, &he->ext);
+- nft_set_elem_clear_busy(&he->ext);
+ }
+
+ static bool nft_rhash_flush(const struct net *net,
+@@ -196,12 +197,9 @@ static bool nft_rhash_flush(const struct net *net,
+ {
+ struct nft_rhash_elem *he = priv;
+
+- if (!nft_set_elem_mark_busy(&he->ext) ||
+- !nft_is_active(net, &he->ext)) {
+- nft_set_elem_change_active(net, set, &he->ext);
+- return true;
+- }
+- return false;
++ nft_set_elem_change_active(net, set, &he->ext);
++
++ return true;
+ }
+
+ static void *nft_rhash_deactivate(const struct net *net,
+@@ -218,9 +216,8 @@ static void *nft_rhash_deactivate(const struct net *net,
+
+ rcu_read_lock();
+ he = rhashtable_lookup(&priv->ht, &arg, nft_rhash_params);
+- if (he != NULL &&
+- !nft_rhash_flush(net, set, he))
+- he = NULL;
++ if (he)
++ nft_set_elem_change_active(net, set, &he->ext);
+
+ rcu_read_unlock();
+
+@@ -252,7 +249,9 @@ static bool nft_rhash_delete(const struct nft_set *set,
+ if (he == NULL)
+ return false;
+
+- return rhashtable_remove_fast(&priv->ht, &he->node, nft_rhash_params) == 0;
++ nft_set_elem_dead(&he->ext);
++
++ return true;
+ }
+
+ static void nft_rhash_walk(const struct nft_ctx *ctx, struct nft_set *set,
+@@ -278,8 +277,6 @@ static void nft_rhash_walk(const struct nft_ctx *ctx, struct nft_set *set,
+
+ if (iter->count < iter->skip)
+ goto cont;
+- if (nft_set_elem_expired(&he->ext))
+- goto cont;
+ if (!nft_set_elem_active(&he->ext, iter->genmask))
+ goto cont;
+
+@@ -314,25 +311,48 @@ static bool nft_rhash_expr_needs_gc_run(const struct nft_set *set,
+
+ static void nft_rhash_gc(struct work_struct *work)
+ {
++ struct nftables_pernet *nft_net;
+ struct nft_set *set;
+ struct nft_rhash_elem *he;
+ struct nft_rhash *priv;
+- struct nft_set_gc_batch *gcb = NULL;
+ struct rhashtable_iter hti;
++ struct nft_trans_gc *gc;
++ struct net *net;
++ u32 gc_seq;
+
+ priv = container_of(work, struct nft_rhash, gc_work.work);
+ set = nft_set_container_of(priv);
++ net = read_pnet(&set->net);
++ nft_net = nft_pernet(net);
++ gc_seq = READ_ONCE(nft_net->gc_seq);
++
++ gc = nft_trans_gc_alloc(set, gc_seq, GFP_KERNEL);
++ if (!gc)
++ goto done;
+
+ rhashtable_walk_enter(&priv->ht, &hti);
+ rhashtable_walk_start(&hti);
+
+ while ((he = rhashtable_walk_next(&hti))) {
+ if (IS_ERR(he)) {
+- if (PTR_ERR(he) != -EAGAIN)
+- break;
++ if (PTR_ERR(he) != -EAGAIN) {
++ nft_trans_gc_destroy(gc);
++ gc = NULL;
++ goto try_later;
++ }
+ continue;
+ }
+
++ /* Ruleset has been updated, try later. */
++ if (READ_ONCE(nft_net->gc_seq) != gc_seq) {
++ nft_trans_gc_destroy(gc);
++ gc = NULL;
++ goto try_later;
++ }
++
++ if (nft_set_elem_is_dead(&he->ext))
++ goto dead_elem;
++
+ if (nft_set_ext_exists(&he->ext, NFT_SET_EXT_EXPRESSIONS) &&
+ nft_rhash_expr_needs_gc_run(set, &he->ext))
+ goto needs_gc_run;
+@@ -340,26 +360,26 @@ static void nft_rhash_gc(struct work_struct *work)
+ if (!nft_set_elem_expired(&he->ext))
+ continue;
+ needs_gc_run:
+- if (nft_set_elem_mark_busy(&he->ext))
+- continue;
++ nft_set_elem_dead(&he->ext);
++dead_elem:
++ gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
++ if (!gc)
++ goto try_later;
+
+- gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC);
+- if (gcb == NULL)
+- break;
+- rhashtable_remove_fast(&priv->ht, &he->node, nft_rhash_params);
+- atomic_dec(&set->nelems);
+- nft_set_gc_batch_add(gcb, he);
++ nft_trans_gc_elem_add(gc, he);
+ }
++
++ gc = nft_trans_gc_catchall(gc, gc_seq);
++
++try_later:
++ /* catchall list iteration requires rcu read side lock. */
+ rhashtable_walk_stop(&hti);
+ rhashtable_walk_exit(&hti);
+
+- he = nft_set_catchall_gc(set);
+- if (he) {
+- gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC);
+- if (gcb)
+- nft_set_gc_batch_add(gcb, he);
+- }
+- nft_set_gc_batch_complete(gcb);
++ if (gc)
++ nft_trans_gc_queue_async_done(gc);
++
++done:
+ queue_delayed_work(system_power_efficient_wq, &priv->gc_work,
+ nft_set_gc_interval(set));
+ }
+@@ -394,7 +414,7 @@ static int nft_rhash_init(const struct nft_set *set,
+ return err;
+
+ INIT_DEFERRABLE_WORK(&priv->gc_work, nft_rhash_gc);
+- if (set->flags & NFT_SET_TIMEOUT)
++ if (set->flags & (NFT_SET_TIMEOUT | NFT_SET_EVAL))
+ nft_rhash_gc_init(set);
+
+ return 0;
+@@ -422,7 +442,6 @@ static void nft_rhash_destroy(const struct nft_ctx *ctx,
+ };
+
+ cancel_delayed_work_sync(&priv->gc_work);
+- rcu_barrier();
+ rhashtable_free_and_destroy(&priv->ht, nft_rhash_elem_destroy,
+ (void *)&rhash_ctx);
+ }
+diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
+index a81829c10feab..92b108e3000eb 100644
+--- a/net/netfilter/nft_set_pipapo.c
++++ b/net/netfilter/nft_set_pipapo.c
+@@ -566,8 +566,7 @@ next_match:
+ goto out;
+
+ if (last) {
+- if (nft_set_elem_expired(&f->mt[b].e->ext) ||
+- (genmask &&
++ if ((genmask &&
+ !nft_set_elem_active(&f->mt[b].e->ext, genmask)))
+ goto next_match;
+
+@@ -601,8 +600,17 @@ out:
+ static void *nft_pipapo_get(const struct net *net, const struct nft_set *set,
+ const struct nft_set_elem *elem, unsigned int flags)
+ {
+- return pipapo_get(net, set, (const u8 *)elem->key.val.data,
+- nft_genmask_cur(net));
++ struct nft_pipapo_elem *ret;
++
++ ret = pipapo_get(net, set, (const u8 *)elem->key.val.data,
++ nft_genmask_cur(net));
++ if (IS_ERR(ret))
++ return ret;
++
++ if (nft_set_elem_expired(&ret->ext))
++ return ERR_PTR(-ENOENT);
++
++ return ret;
+ }
+
+ /**
+@@ -1529,16 +1537,34 @@ static void pipapo_drop(struct nft_pipapo_match *m,
+ }
+ }
+
++static void nft_pipapo_gc_deactivate(struct net *net, struct nft_set *set,
++ struct nft_pipapo_elem *e)
++
++{
++ struct nft_set_elem elem = {
++ .priv = e,
++ };
++
++ nft_setelem_data_deactivate(net, set, &elem);
++}
++
+ /**
+ * pipapo_gc() - Drop expired entries from set, destroy start and end elements
+ * @set: nftables API set representation
+ * @m: Matching data
+ */
+-static void pipapo_gc(const struct nft_set *set, struct nft_pipapo_match *m)
++static void pipapo_gc(const struct nft_set *_set, struct nft_pipapo_match *m)
+ {
++ struct nft_set *set = (struct nft_set *) _set;
+ struct nft_pipapo *priv = nft_set_priv(set);
++ struct net *net = read_pnet(&set->net);
+ int rules_f0, first_rule = 0;
+ struct nft_pipapo_elem *e;
++ struct nft_trans_gc *gc;
++
++ gc = nft_trans_gc_alloc(set, 0, GFP_KERNEL);
++ if (!gc)
++ return;
+
+ while ((rules_f0 = pipapo_rules_same_key(m->f, first_rule))) {
+ union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS];
+@@ -1562,13 +1588,20 @@ static void pipapo_gc(const struct nft_set *set, struct nft_pipapo_match *m)
+ f--;
+ i--;
+ e = f->mt[rulemap[i].to].e;
+- if (nft_set_elem_expired(&e->ext) &&
+- !nft_set_elem_mark_busy(&e->ext)) {
++
++ /* synchronous gc never fails, there is no need to set on
++ * NFT_SET_ELEM_DEAD_BIT.
++ */
++ if (nft_set_elem_expired(&e->ext)) {
+ priv->dirty = true;
+- pipapo_drop(m, rulemap);
+
+- rcu_barrier();
+- nft_set_elem_destroy(set, e, true);
++ gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC);
++ if (!gc)
++ break;
++
++ nft_pipapo_gc_deactivate(net, set, e);
++ pipapo_drop(m, rulemap);
++ nft_trans_gc_elem_add(gc, e);
+
+ /* And check again current first rule, which is now the
+ * first we haven't checked.
+@@ -1578,11 +1611,11 @@ static void pipapo_gc(const struct nft_set *set, struct nft_pipapo_match *m)
+ }
+ }
+
+- e = nft_set_catchall_gc(set);
+- if (e)
+- nft_set_elem_destroy(set, e, true);
+-
+- priv->last_gc = jiffies;
++ gc = nft_trans_gc_catchall(gc, 0);
++ if (gc) {
++ nft_trans_gc_queue_sync_done(gc);
++ priv->last_gc = jiffies;
++ }
+ }
+
+ /**
+@@ -1707,7 +1740,6 @@ static void nft_pipapo_activate(const struct net *net,
+ return;
+
+ nft_set_elem_change_active(net, set, &e->ext);
+- nft_set_elem_clear_busy(&e->ext);
+ }
+
+ /**
+@@ -2006,8 +2038,6 @@ static void nft_pipapo_walk(const struct nft_ctx *ctx, struct nft_set *set,
+ goto cont;
+
+ e = f->mt[r].e;
+- if (nft_set_elem_expired(&e->ext))
+- goto cont;
+
+ elem.priv = e;
+
+diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
+index 8d73fffd2d09d..f9d4c8fcbbf82 100644
+--- a/net/netfilter/nft_set_rbtree.c
++++ b/net/netfilter/nft_set_rbtree.c
+@@ -46,6 +46,12 @@ static int nft_rbtree_cmp(const struct nft_set *set,
+ set->klen);
+ }
+
++static bool nft_rbtree_elem_expired(const struct nft_rbtree_elem *rbe)
++{
++ return nft_set_elem_expired(&rbe->ext) ||
++ nft_set_elem_is_dead(&rbe->ext);
++}
++
+ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set,
+ const u32 *key, const struct nft_set_ext **ext,
+ unsigned int seq)
+@@ -80,7 +86,7 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set
+ continue;
+ }
+
+- if (nft_set_elem_expired(&rbe->ext))
++ if (nft_rbtree_elem_expired(rbe))
+ return false;
+
+ if (nft_rbtree_interval_end(rbe)) {
+@@ -98,7 +104,7 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set
+
+ if (set->flags & NFT_SET_INTERVAL && interval != NULL &&
+ nft_set_elem_active(&interval->ext, genmask) &&
+- !nft_set_elem_expired(&interval->ext) &&
++ !nft_rbtree_elem_expired(interval) &&
+ nft_rbtree_interval_start(interval)) {
+ *ext = &interval->ext;
+ return true;
+@@ -215,6 +221,18 @@ static void *nft_rbtree_get(const struct net *net, const struct nft_set *set,
+ return rbe;
+ }
+
++static void nft_rbtree_gc_remove(struct net *net, struct nft_set *set,
++ struct nft_rbtree *priv,
++ struct nft_rbtree_elem *rbe)
++{
++ struct nft_set_elem elem = {
++ .priv = rbe,
++ };
++
++ nft_setelem_data_deactivate(net, set, &elem);
++ rb_erase(&rbe->node, &priv->root);
++}
++
+ static int nft_rbtree_gc_elem(const struct nft_set *__set,
+ struct nft_rbtree *priv,
+ struct nft_rbtree_elem *rbe,
+@@ -222,11 +240,12 @@ static int nft_rbtree_gc_elem(const struct nft_set *__set,
+ {
+ struct nft_set *set = (struct nft_set *)__set;
+ struct rb_node *prev = rb_prev(&rbe->node);
++ struct net *net = read_pnet(&set->net);
+ struct nft_rbtree_elem *rbe_prev;
+- struct nft_set_gc_batch *gcb;
++ struct nft_trans_gc *gc;
+
+- gcb = nft_set_gc_batch_check(set, NULL, GFP_ATOMIC);
+- if (!gcb)
++ gc = nft_trans_gc_alloc(set, 0, GFP_ATOMIC);
++ if (!gc)
+ return -ENOMEM;
+
+ /* search for end interval coming before this element.
+@@ -244,17 +263,28 @@ static int nft_rbtree_gc_elem(const struct nft_set *__set,
+
+ if (prev) {
+ rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node);
++ nft_rbtree_gc_remove(net, set, priv, rbe_prev);
+
+- rb_erase(&rbe_prev->node, &priv->root);
+- atomic_dec(&set->nelems);
+- nft_set_gc_batch_add(gcb, rbe_prev);
++ /* There is always room in this trans gc for this element,
++ * memory allocation never actually happens, hence, the warning
++ * splat in such case. No need to set NFT_SET_ELEM_DEAD_BIT,
++ * this is synchronous gc which never fails.
++ */
++ gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC);
++ if (WARN_ON_ONCE(!gc))
++ return -ENOMEM;
++
++ nft_trans_gc_elem_add(gc, rbe_prev);
+ }
+
+- rb_erase(&rbe->node, &priv->root);
+- atomic_dec(&set->nelems);
++ nft_rbtree_gc_remove(net, set, priv, rbe);
++ gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC);
++ if (WARN_ON_ONCE(!gc))
++ return -ENOMEM;
++
++ nft_trans_gc_elem_add(gc, rbe);
+
+- nft_set_gc_batch_add(gcb, rbe);
+- nft_set_gc_batch_complete(gcb);
++ nft_trans_gc_queue_sync_done(gc);
+
+ return 0;
+ }
+@@ -482,7 +512,6 @@ static void nft_rbtree_activate(const struct net *net,
+ struct nft_rbtree_elem *rbe = elem->priv;
+
+ nft_set_elem_change_active(net, set, &rbe->ext);
+- nft_set_elem_clear_busy(&rbe->ext);
+ }
+
+ static bool nft_rbtree_flush(const struct net *net,
+@@ -490,12 +519,9 @@ static bool nft_rbtree_flush(const struct net *net,
+ {
+ struct nft_rbtree_elem *rbe = priv;
+
+- if (!nft_set_elem_mark_busy(&rbe->ext) ||
+- !nft_is_active(net, &rbe->ext)) {
+- nft_set_elem_change_active(net, set, &rbe->ext);
+- return true;
+- }
+- return false;
++ nft_set_elem_change_active(net, set, &rbe->ext);
++
++ return true;
+ }
+
+ static void *nft_rbtree_deactivate(const struct net *net,
+@@ -552,8 +578,6 @@ static void nft_rbtree_walk(const struct nft_ctx *ctx,
+
+ if (iter->count < iter->skip)
+ goto cont;
+- if (nft_set_elem_expired(&rbe->ext))
+- goto cont;
+ if (!nft_set_elem_active(&rbe->ext, iter->genmask))
+ goto cont;
+
+@@ -572,26 +596,40 @@ cont:
+
+ static void nft_rbtree_gc(struct work_struct *work)
+ {
+- struct nft_rbtree_elem *rbe, *rbe_end = NULL, *rbe_prev = NULL;
+- struct nft_set_gc_batch *gcb = NULL;
++ struct nft_rbtree_elem *rbe, *rbe_end = NULL;
++ struct nftables_pernet *nft_net;
+ struct nft_rbtree *priv;
++ struct nft_trans_gc *gc;
+ struct rb_node *node;
+ struct nft_set *set;
++ unsigned int gc_seq;
+ struct net *net;
+- u8 genmask;
+
+ priv = container_of(work, struct nft_rbtree, gc_work.work);
+ set = nft_set_container_of(priv);
+ net = read_pnet(&set->net);
+- genmask = nft_genmask_cur(net);
++ nft_net = nft_pernet(net);
++ gc_seq = READ_ONCE(nft_net->gc_seq);
++
++ gc = nft_trans_gc_alloc(set, gc_seq, GFP_KERNEL);
++ if (!gc)
++ goto done;
+
+ write_lock_bh(&priv->lock);
+ write_seqcount_begin(&priv->count);
+ for (node = rb_first(&priv->root); node != NULL; node = rb_next(node)) {
++
++ /* Ruleset has been updated, try later. */
++ if (READ_ONCE(nft_net->gc_seq) != gc_seq) {
++ nft_trans_gc_destroy(gc);
++ gc = NULL;
++ goto try_later;
++ }
++
+ rbe = rb_entry(node, struct nft_rbtree_elem, node);
+
+- if (!nft_set_elem_active(&rbe->ext, genmask))
+- continue;
++ if (nft_set_elem_is_dead(&rbe->ext))
++ goto dead_elem;
+
+ /* elements are reversed in the rbtree for historical reasons,
+ * from highest to lowest value, that is why end element is
+@@ -604,46 +642,36 @@ static void nft_rbtree_gc(struct work_struct *work)
+ if (!nft_set_elem_expired(&rbe->ext))
+ continue;
+
+- if (nft_set_elem_mark_busy(&rbe->ext)) {
+- rbe_end = NULL;
++ nft_set_elem_dead(&rbe->ext);
++
++ if (!rbe_end)
+ continue;
+- }
+
+- if (rbe_prev) {
+- rb_erase(&rbe_prev->node, &priv->root);
+- rbe_prev = NULL;
+- }
+- gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC);
+- if (!gcb)
+- break;
++ nft_set_elem_dead(&rbe_end->ext);
+
+- atomic_dec(&set->nelems);
+- nft_set_gc_batch_add(gcb, rbe);
+- rbe_prev = rbe;
++ gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
++ if (!gc)
++ goto try_later;
+
+- if (rbe_end) {
+- atomic_dec(&set->nelems);
+- nft_set_gc_batch_add(gcb, rbe_end);
+- rb_erase(&rbe_end->node, &priv->root);
+- rbe_end = NULL;
+- }
+- node = rb_next(node);
+- if (!node)
+- break;
++ nft_trans_gc_elem_add(gc, rbe_end);
++ rbe_end = NULL;
++dead_elem:
++ gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
++ if (!gc)
++ goto try_later;
++
++ nft_trans_gc_elem_add(gc, rbe);
+ }
+- if (rbe_prev)
+- rb_erase(&rbe_prev->node, &priv->root);
++
++ gc = nft_trans_gc_catchall(gc, gc_seq);
++
++try_later:
+ write_seqcount_end(&priv->count);
+ write_unlock_bh(&priv->lock);
+
+- rbe = nft_set_catchall_gc(set);
+- if (rbe) {
+- gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC);
+- if (gcb)
+- nft_set_gc_batch_add(gcb, rbe);
+- }
+- nft_set_gc_batch_complete(gcb);
+-
++ if (gc)
++ nft_trans_gc_queue_async_done(gc);
++done:
+ queue_delayed_work(system_power_efficient_wq, &priv->gc_work,
+ nft_set_gc_interval(set));
+ }
+diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
+index a753246ef1657..96a017f78539f 100644
+--- a/net/packet/af_packet.c
++++ b/net/packet/af_packet.c
+@@ -401,18 +401,20 @@ static void __packet_set_status(struct packet_sock *po, void *frame, int status)
+ {
+ union tpacket_uhdr h;
+
++ /* WRITE_ONCE() are paired with READ_ONCE() in __packet_get_status */
++
+ h.raw = frame;
+ switch (po->tp_version) {
+ case TPACKET_V1:
+- h.h1->tp_status = status;
++ WRITE_ONCE(h.h1->tp_status, status);
+ flush_dcache_page(pgv_to_page(&h.h1->tp_status));
+ break;
+ case TPACKET_V2:
+- h.h2->tp_status = status;
++ WRITE_ONCE(h.h2->tp_status, status);
+ flush_dcache_page(pgv_to_page(&h.h2->tp_status));
+ break;
+ case TPACKET_V3:
+- h.h3->tp_status = status;
++ WRITE_ONCE(h.h3->tp_status, status);
+ flush_dcache_page(pgv_to_page(&h.h3->tp_status));
+ break;
+ default:
+@@ -429,17 +431,19 @@ static int __packet_get_status(const struct packet_sock *po, void *frame)
+
+ smp_rmb();
+
++ /* READ_ONCE() are paired with WRITE_ONCE() in __packet_set_status */
++
+ h.raw = frame;
+ switch (po->tp_version) {
+ case TPACKET_V1:
+ flush_dcache_page(pgv_to_page(&h.h1->tp_status));
+- return h.h1->tp_status;
++ return READ_ONCE(h.h1->tp_status);
+ case TPACKET_V2:
+ flush_dcache_page(pgv_to_page(&h.h2->tp_status));
+- return h.h2->tp_status;
++ return READ_ONCE(h.h2->tp_status);
+ case TPACKET_V3:
+ flush_dcache_page(pgv_to_page(&h.h3->tp_status));
+- return h.h3->tp_status;
++ return READ_ONCE(h.h3->tp_status);
+ default:
+ WARN(1, "TPACKET version not supported.\n");
+ BUG();
+diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
+index fa6b54c1411cb..f94e7a04e33d0 100644
+--- a/net/smc/af_smc.c
++++ b/net/smc/af_smc.c
+@@ -378,8 +378,8 @@ static struct sock *smc_sock_alloc(struct net *net, struct socket *sock,
+ sk->sk_state = SMC_INIT;
+ sk->sk_destruct = smc_destruct;
+ sk->sk_protocol = protocol;
+- WRITE_ONCE(sk->sk_sndbuf, READ_ONCE(net->smc.sysctl_wmem));
+- WRITE_ONCE(sk->sk_rcvbuf, READ_ONCE(net->smc.sysctl_rmem));
++ WRITE_ONCE(sk->sk_sndbuf, 2 * READ_ONCE(net->smc.sysctl_wmem));
++ WRITE_ONCE(sk->sk_rcvbuf, 2 * READ_ONCE(net->smc.sysctl_rmem));
+ smc = smc_sk(sk);
+ INIT_WORK(&smc->tcp_listen_work, smc_tcp_listen_work);
+ INIT_WORK(&smc->connect_work, smc_connect_work);
+@@ -436,13 +436,60 @@ out:
+ return rc;
+ }
+
++/* copy only relevant settings and flags of SOL_SOCKET level from smc to
++ * clc socket (since smc is not called for these options from net/core)
++ */
++
++#define SK_FLAGS_SMC_TO_CLC ((1UL << SOCK_URGINLINE) | \
++ (1UL << SOCK_KEEPOPEN) | \
++ (1UL << SOCK_LINGER) | \
++ (1UL << SOCK_BROADCAST) | \
++ (1UL << SOCK_TIMESTAMP) | \
++ (1UL << SOCK_DBG) | \
++ (1UL << SOCK_RCVTSTAMP) | \
++ (1UL << SOCK_RCVTSTAMPNS) | \
++ (1UL << SOCK_LOCALROUTE) | \
++ (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE) | \
++ (1UL << SOCK_RXQ_OVFL) | \
++ (1UL << SOCK_WIFI_STATUS) | \
++ (1UL << SOCK_NOFCS) | \
++ (1UL << SOCK_FILTER_LOCKED) | \
++ (1UL << SOCK_TSTAMP_NEW))
++
++/* if set, use value set by setsockopt() - else use IPv4 or SMC sysctl value */
++static void smc_adjust_sock_bufsizes(struct sock *nsk, struct sock *osk,
++ unsigned long mask)
++{
++ struct net *nnet = sock_net(nsk);
++
++ nsk->sk_userlocks = osk->sk_userlocks;
++ if (osk->sk_userlocks & SOCK_SNDBUF_LOCK) {
++ nsk->sk_sndbuf = osk->sk_sndbuf;
++ } else {
++ if (mask == SK_FLAGS_SMC_TO_CLC)
++ WRITE_ONCE(nsk->sk_sndbuf,
++ READ_ONCE(nnet->ipv4.sysctl_tcp_wmem[1]));
++ else
++ WRITE_ONCE(nsk->sk_sndbuf,
++ 2 * READ_ONCE(nnet->smc.sysctl_wmem));
++ }
++ if (osk->sk_userlocks & SOCK_RCVBUF_LOCK) {
++ nsk->sk_rcvbuf = osk->sk_rcvbuf;
++ } else {
++ if (mask == SK_FLAGS_SMC_TO_CLC)
++ WRITE_ONCE(nsk->sk_rcvbuf,
++ READ_ONCE(nnet->ipv4.sysctl_tcp_rmem[1]));
++ else
++ WRITE_ONCE(nsk->sk_rcvbuf,
++ 2 * READ_ONCE(nnet->smc.sysctl_rmem));
++ }
++}
++
+ static void smc_copy_sock_settings(struct sock *nsk, struct sock *osk,
+ unsigned long mask)
+ {
+ /* options we don't get control via setsockopt for */
+ nsk->sk_type = osk->sk_type;
+- nsk->sk_sndbuf = osk->sk_sndbuf;
+- nsk->sk_rcvbuf = osk->sk_rcvbuf;
+ nsk->sk_sndtimeo = osk->sk_sndtimeo;
+ nsk->sk_rcvtimeo = osk->sk_rcvtimeo;
+ nsk->sk_mark = READ_ONCE(osk->sk_mark);
+@@ -453,26 +500,10 @@ static void smc_copy_sock_settings(struct sock *nsk, struct sock *osk,
+
+ nsk->sk_flags &= ~mask;
+ nsk->sk_flags |= osk->sk_flags & mask;
++
++ smc_adjust_sock_bufsizes(nsk, osk, mask);
+ }
+
+-#define SK_FLAGS_SMC_TO_CLC ((1UL << SOCK_URGINLINE) | \
+- (1UL << SOCK_KEEPOPEN) | \
+- (1UL << SOCK_LINGER) | \
+- (1UL << SOCK_BROADCAST) | \
+- (1UL << SOCK_TIMESTAMP) | \
+- (1UL << SOCK_DBG) | \
+- (1UL << SOCK_RCVTSTAMP) | \
+- (1UL << SOCK_RCVTSTAMPNS) | \
+- (1UL << SOCK_LOCALROUTE) | \
+- (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE) | \
+- (1UL << SOCK_RXQ_OVFL) | \
+- (1UL << SOCK_WIFI_STATUS) | \
+- (1UL << SOCK_NOFCS) | \
+- (1UL << SOCK_FILTER_LOCKED) | \
+- (1UL << SOCK_TSTAMP_NEW))
+-/* copy only relevant settings and flags of SOL_SOCKET level from smc to
+- * clc socket (since smc is not called for these options from net/core)
+- */
+ static void smc_copy_sock_settings_to_clc(struct smc_sock *smc)
+ {
+ smc_copy_sock_settings(smc->clcsock->sk, &smc->sk, SK_FLAGS_SMC_TO_CLC);
+@@ -2479,8 +2510,6 @@ static void smc_tcp_listen_work(struct work_struct *work)
+ sock_hold(lsk); /* sock_put in smc_listen_work */
+ INIT_WORK(&new_smc->smc_listen_work, smc_listen_work);
+ smc_copy_sock_settings_to_smc(new_smc);
+- new_smc->sk.sk_sndbuf = lsmc->sk.sk_sndbuf;
+- new_smc->sk.sk_rcvbuf = lsmc->sk.sk_rcvbuf;
+ sock_hold(&new_smc->sk); /* sock_put in passive closing */
+ if (!queue_work(smc_hs_wq, &new_smc->smc_listen_work))
+ sock_put(&new_smc->sk);
+diff --git a/net/smc/smc.h b/net/smc/smc.h
+index 2eeea4cdc7187..1f2b912c43d10 100644
+--- a/net/smc/smc.h
++++ b/net/smc/smc.h
+@@ -161,7 +161,7 @@ struct smc_connection {
+
+ struct smc_buf_desc *sndbuf_desc; /* send buffer descriptor */
+ struct smc_buf_desc *rmb_desc; /* RMBE descriptor */
+- int rmbe_size_short;/* compressed notation */
++ int rmbe_size_comp; /* compressed notation */
+ int rmbe_update_limit;
+ /* lower limit for consumer
+ * cursor update
+diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
+index b9b8b07aa7023..c90d9e5dda540 100644
+--- a/net/smc/smc_clc.c
++++ b/net/smc/smc_clc.c
+@@ -1007,7 +1007,7 @@ static int smc_clc_send_confirm_accept(struct smc_sock *smc,
+ clc->d0.gid =
+ conn->lgr->smcd->ops->get_local_gid(conn->lgr->smcd);
+ clc->d0.token = conn->rmb_desc->token;
+- clc->d0.dmbe_size = conn->rmbe_size_short;
++ clc->d0.dmbe_size = conn->rmbe_size_comp;
+ clc->d0.dmbe_idx = 0;
+ memcpy(&clc->d0.linkid, conn->lgr->id, SMC_LGR_ID_SIZE);
+ if (version == SMC_V1) {
+@@ -1050,7 +1050,7 @@ static int smc_clc_send_confirm_accept(struct smc_sock *smc,
+ clc->r0.qp_mtu = min(link->path_mtu, link->peer_mtu);
+ break;
+ }
+- clc->r0.rmbe_size = conn->rmbe_size_short;
++ clc->r0.rmbe_size = conn->rmbe_size_comp;
+ clc->r0.rmb_dma_addr = conn->rmb_desc->is_vm ?
+ cpu_to_be64((uintptr_t)conn->rmb_desc->cpu_addr) :
+ cpu_to_be64((u64)sg_dma_address
+diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
+index 3f465faf2b681..6b78075404d7d 100644
+--- a/net/smc/smc_core.c
++++ b/net/smc/smc_core.c
+@@ -2309,31 +2309,30 @@ static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb)
+ struct smc_connection *conn = &smc->conn;
+ struct smc_link_group *lgr = conn->lgr;
+ struct list_head *buf_list;
+- int bufsize, bufsize_short;
++ int bufsize, bufsize_comp;
+ struct rw_semaphore *lock; /* lock buffer list */
+ bool is_dgraded = false;
+- int sk_buf_size;
+
+ if (is_rmb)
+ /* use socket recv buffer size (w/o overhead) as start value */
+- sk_buf_size = smc->sk.sk_rcvbuf;
++ bufsize = smc->sk.sk_rcvbuf / 2;
+ else
+ /* use socket send buffer size (w/o overhead) as start value */
+- sk_buf_size = smc->sk.sk_sndbuf;
++ bufsize = smc->sk.sk_sndbuf / 2;
+
+- for (bufsize_short = smc_compress_bufsize(sk_buf_size, is_smcd, is_rmb);
+- bufsize_short >= 0; bufsize_short--) {
++ for (bufsize_comp = smc_compress_bufsize(bufsize, is_smcd, is_rmb);
++ bufsize_comp >= 0; bufsize_comp--) {
+ if (is_rmb) {
+ lock = &lgr->rmbs_lock;
+- buf_list = &lgr->rmbs[bufsize_short];
++ buf_list = &lgr->rmbs[bufsize_comp];
+ } else {
+ lock = &lgr->sndbufs_lock;
+- buf_list = &lgr->sndbufs[bufsize_short];
++ buf_list = &lgr->sndbufs[bufsize_comp];
+ }
+- bufsize = smc_uncompress_bufsize(bufsize_short);
++ bufsize = smc_uncompress_bufsize(bufsize_comp);
+
+ /* check for reusable slot in the link group */
+- buf_desc = smc_buf_get_slot(bufsize_short, lock, buf_list);
++ buf_desc = smc_buf_get_slot(bufsize_comp, lock, buf_list);
+ if (buf_desc) {
+ buf_desc->is_dma_need_sync = 0;
+ SMC_STAT_RMB_SIZE(smc, is_smcd, is_rmb, bufsize);
+@@ -2377,8 +2376,8 @@ static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb)
+
+ if (is_rmb) {
+ conn->rmb_desc = buf_desc;
+- conn->rmbe_size_short = bufsize_short;
+- smc->sk.sk_rcvbuf = bufsize;
++ conn->rmbe_size_comp = bufsize_comp;
++ smc->sk.sk_rcvbuf = bufsize * 2;
+ atomic_set(&conn->bytes_to_rcv, 0);
+ conn->rmbe_update_limit =
+ smc_rmb_wnd_update_limit(buf_desc->len);
+@@ -2386,7 +2385,7 @@ static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb)
+ smc_ism_set_conn(conn); /* map RMB/smcd_dev to conn */
+ } else {
+ conn->sndbuf_desc = buf_desc;
+- smc->sk.sk_sndbuf = bufsize;
++ smc->sk.sk_sndbuf = bufsize * 2;
+ atomic_set(&conn->sndbuf_space, bufsize);
+ }
+ return 0;
+diff --git a/net/smc/smc_sysctl.c b/net/smc/smc_sysctl.c
+index b6f79fabb9d3f..0b2a957ca5f5f 100644
+--- a/net/smc/smc_sysctl.c
++++ b/net/smc/smc_sysctl.c
+@@ -21,6 +21,10 @@
+
+ static int min_sndbuf = SMC_BUF_MIN_SIZE;
+ static int min_rcvbuf = SMC_BUF_MIN_SIZE;
++static int max_sndbuf = INT_MAX / 2;
++static int max_rcvbuf = INT_MAX / 2;
++static const int net_smc_wmem_init = (64 * 1024);
++static const int net_smc_rmem_init = (64 * 1024);
+
+ static struct ctl_table smc_table[] = {
+ {
+@@ -53,6 +57,7 @@ static struct ctl_table smc_table[] = {
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &min_sndbuf,
++ .extra2 = &max_sndbuf,
+ },
+ {
+ .procname = "rmem",
+@@ -61,6 +66,7 @@ static struct ctl_table smc_table[] = {
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &min_rcvbuf,
++ .extra2 = &max_rcvbuf,
+ },
+ { }
+ };
+@@ -88,8 +94,8 @@ int __net_init smc_sysctl_net_init(struct net *net)
+ net->smc.sysctl_autocorking_size = SMC_AUTOCORKING_DEFAULT_SIZE;
+ net->smc.sysctl_smcr_buf_type = SMCR_PHYS_CONT_BUFS;
+ net->smc.sysctl_smcr_testlink_time = SMC_LLC_TESTLINK_DEFAULT_TIME;
+- WRITE_ONCE(net->smc.sysctl_wmem, READ_ONCE(net->ipv4.sysctl_tcp_wmem[1]));
+- WRITE_ONCE(net->smc.sysctl_rmem, READ_ONCE(net->ipv4.sysctl_tcp_rmem[1]));
++ WRITE_ONCE(net->smc.sysctl_wmem, net_smc_wmem_init);
++ WRITE_ONCE(net->smc.sysctl_rmem, net_smc_rmem_init);
+
+ return 0;
+
+diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
+index bf69c9d6d06c0..1849827884735 100644
+--- a/net/tls/tls_device.c
++++ b/net/tls/tls_device.c
+@@ -52,6 +52,8 @@ static LIST_HEAD(tls_device_list);
+ static LIST_HEAD(tls_device_down_list);
+ static DEFINE_SPINLOCK(tls_device_lock);
+
++static struct page *dummy_page;
++
+ static void tls_device_free_ctx(struct tls_context *ctx)
+ {
+ if (ctx->tx_conf == TLS_HW) {
+@@ -313,36 +315,33 @@ static int tls_push_record(struct sock *sk,
+ return tls_push_sg(sk, ctx, offload_ctx->sg_tx_data, 0, flags);
+ }
+
+-static int tls_device_record_close(struct sock *sk,
+- struct tls_context *ctx,
+- struct tls_record_info *record,
+- struct page_frag *pfrag,
+- unsigned char record_type)
++static void tls_device_record_close(struct sock *sk,
++ struct tls_context *ctx,
++ struct tls_record_info *record,
++ struct page_frag *pfrag,
++ unsigned char record_type)
+ {
+ struct tls_prot_info *prot = &ctx->prot_info;
+- int ret;
++ struct page_frag dummy_tag_frag;
+
+ /* append tag
+ * device will fill in the tag, we just need to append a placeholder
+ * use socket memory to improve coalescing (re-using a single buffer
+ * increases frag count)
+- * if we can't allocate memory now, steal some back from data
++ * if we can't allocate memory now use the dummy page
+ */
+- if (likely(skb_page_frag_refill(prot->tag_size, pfrag,
+- sk->sk_allocation))) {
+- ret = 0;
+- tls_append_frag(record, pfrag, prot->tag_size);
+- } else {
+- ret = prot->tag_size;
+- if (record->len <= prot->overhead_size)
+- return -ENOMEM;
++ if (unlikely(pfrag->size - pfrag->offset < prot->tag_size) &&
++ !skb_page_frag_refill(prot->tag_size, pfrag, sk->sk_allocation)) {
++ dummy_tag_frag.page = dummy_page;
++ dummy_tag_frag.offset = 0;
++ pfrag = &dummy_tag_frag;
+ }
++ tls_append_frag(record, pfrag, prot->tag_size);
+
+ /* fill prepend */
+ tls_fill_prepend(ctx, skb_frag_address(&record->frags[0]),
+ record->len - prot->overhead_size,
+ record_type);
+- return ret;
+ }
+
+ static int tls_create_new_record(struct tls_offload_context_tx *offload_ctx,
+@@ -535,18 +534,8 @@ last_record:
+
+ if (done || record->len >= max_open_record_len ||
+ (record->num_frags >= MAX_SKB_FRAGS - 1)) {
+- rc = tls_device_record_close(sk, tls_ctx, record,
+- pfrag, record_type);
+- if (rc) {
+- if (rc > 0) {
+- size += rc;
+- } else {
+- size = orig_size;
+- destroy_record(record);
+- ctx->open_record = NULL;
+- break;
+- }
+- }
++ tls_device_record_close(sk, tls_ctx, record,
++ pfrag, record_type);
+
+ rc = tls_push_record(sk,
+ tls_ctx,
+@@ -1466,14 +1455,26 @@ int __init tls_device_init(void)
+ {
+ int err;
+
+- destruct_wq = alloc_workqueue("ktls_device_destruct", 0, 0);
+- if (!destruct_wq)
++ dummy_page = alloc_page(GFP_KERNEL);
++ if (!dummy_page)
+ return -ENOMEM;
+
++ destruct_wq = alloc_workqueue("ktls_device_destruct", 0, 0);
++ if (!destruct_wq) {
++ err = -ENOMEM;
++ goto err_free_dummy;
++ }
++
+ err = register_netdevice_notifier(&tls_dev_notifier);
+ if (err)
+- destroy_workqueue(destruct_wq);
++ goto err_destroy_wq;
+
++ return 0;
++
++err_destroy_wq:
++ destroy_workqueue(destruct_wq);
++err_free_dummy:
++ put_page(dummy_page);
+ return err;
+ }
+
+@@ -1482,4 +1483,5 @@ void __exit tls_device_cleanup(void)
+ unregister_netdevice_notifier(&tls_dev_notifier);
+ destroy_workqueue(destruct_wq);
+ clean_acked_data_flush();
++ put_page(dummy_page);
+ }
+diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
+index 087d60c0f6e4f..1b688745ce0a1 100644
+--- a/net/wireless/nl80211.c
++++ b/net/wireless/nl80211.c
+@@ -5426,8 +5426,11 @@ nl80211_parse_mbssid_elems(struct wiphy *wiphy, struct nlattr *attrs)
+ if (!wiphy->mbssid_max_interfaces)
+ return ERR_PTR(-EINVAL);
+
+- nla_for_each_nested(nl_elems, attrs, rem_elems)
++ nla_for_each_nested(nl_elems, attrs, rem_elems) {
++ if (num_elems >= 255)
++ return ERR_PTR(-EINVAL);
+ num_elems++;
++ }
+
+ elems = kzalloc(struct_size(elems, elem, num_elems), GFP_KERNEL);
+ if (!elems)
+diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
+index 35e518eaaebae..5f249fa969985 100644
+--- a/net/xdp/xsk.c
++++ b/net/xdp/xsk.c
+@@ -994,6 +994,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
+ err = xp_alloc_tx_descs(xs->pool, xs);
+ if (err) {
+ xp_put_pool(xs->pool);
++ xs->pool = NULL;
+ sockfd_put(sock);
+ goto out_unlock;
+ }
+diff --git a/tools/testing/radix-tree/regression1.c b/tools/testing/radix-tree/regression1.c
+index a61c7bcbc72da..63f468bf8245c 100644
+--- a/tools/testing/radix-tree/regression1.c
++++ b/tools/testing/radix-tree/regression1.c
+@@ -177,7 +177,7 @@ void regression1_test(void)
+ nr_threads = 2;
+ pthread_barrier_init(&worker_barrier, NULL, nr_threads);
+
+- threads = malloc(nr_threads * sizeof(pthread_t *));
++ threads = malloc(nr_threads * sizeof(*threads));
+
+ for (i = 0; i < nr_threads; i++) {
+ arg = i;
+diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+index b4f6f3a50ae58..ba35bcc66e7e9 100644
+--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
++++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+@@ -1432,7 +1432,7 @@ static void vsock_unix_redir_connectible(int sock_mapfd, int verd_mapfd,
+ if (n < 1)
+ goto out;
+
+- n = recv(mode == REDIR_INGRESS ? u0 : u1, &b, sizeof(b), MSG_DONTWAIT);
++ n = xrecv_nonblock(mode == REDIR_INGRESS ? u0 : u1, &b, sizeof(b), 0);
+ if (n < 0)
+ FAIL("%s: recv() err, errno=%d", log_prefix, errno);
+ if (n == 0)
+diff --git a/tools/testing/selftests/mm/ksm_tests.c b/tools/testing/selftests/mm/ksm_tests.c
+index 435acebdc325f..380b691d3eb9f 100644
+--- a/tools/testing/selftests/mm/ksm_tests.c
++++ b/tools/testing/selftests/mm/ksm_tests.c
+@@ -831,6 +831,7 @@ int main(int argc, char *argv[])
+ printf("Size must be greater than 0\n");
+ return KSFT_FAIL;
+ }
++ break;
+ case 't':
+ {
+ int tmp = atoi(optarg);
+diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
+index 0f5e88c8f4ffe..df8d90b51867a 100755
+--- a/tools/testing/selftests/net/fib_nexthops.sh
++++ b/tools/testing/selftests/net/fib_nexthops.sh
+@@ -1981,6 +1981,11 @@ basic()
+
+ run_cmd "$IP link set dev lo up"
+
++ # Dump should not loop endlessly when maximum nexthop ID is configured.
++ run_cmd "$IP nexthop add id $((2**32-1)) blackhole"
++ run_cmd "timeout 5 $IP nexthop"
++ log_test $? 0 "Maximum nexthop ID dump"
++
+ #
+ # groups
+ #
+@@ -2201,6 +2206,11 @@ basic_res()
+ run_cmd "$IP nexthop bucket list fdb"
+ log_test $? 255 "Dump all nexthop buckets with invalid 'fdb' keyword"
+
++ # Dump should not loop endlessly when maximum nexthop ID is configured.
++ run_cmd "$IP nexthop add id $((2**32-1)) group 1/2 type resilient buckets 4"
++ run_cmd "timeout 5 $IP nexthop bucket"
++ log_test $? 0 "Maximum nexthop ID dump"
++
+ #
+ # resilient nexthop buckets get requests
+ #
+diff --git a/tools/testing/selftests/net/forwarding/bridge_mdb.sh b/tools/testing/selftests/net/forwarding/bridge_mdb.sh
+index ae3f9462a2b61..d0c6c499d5dab 100755
+--- a/tools/testing/selftests/net/forwarding/bridge_mdb.sh
++++ b/tools/testing/selftests/net/forwarding/bridge_mdb.sh
+@@ -617,7 +617,7 @@ __cfg_test_port_ip_sg()
+ grep -q "permanent"
+ check_err $? "Entry not added as \"permanent\" when should"
+ bridge -d -s mdb show dev br0 vid 10 | grep "$grp_key" | \
+- grep -q "0.00"
++ grep -q " 0.00"
+ check_err $? "\"permanent\" entry has a pending group timer"
+ bridge mdb del dev br0 port $swp1 $grp_key vid 10
+
+@@ -626,7 +626,7 @@ __cfg_test_port_ip_sg()
+ grep -q "temp"
+ check_err $? "Entry not added as \"temp\" when should"
+ bridge -d -s mdb show dev br0 vid 10 | grep "$grp_key" | \
+- grep -q "0.00"
++ grep -q " 0.00"
+ check_fail $? "\"temp\" entry has an unpending group timer"
+ bridge mdb del dev br0 port $swp1 $grp_key vid 10
+
+@@ -659,7 +659,7 @@ __cfg_test_port_ip_sg()
+ grep -q "permanent"
+ check_err $? "Entry not marked as \"permanent\" after replace"
+ bridge -d -s mdb show dev br0 vid 10 | grep "$grp_key" | \
+- grep -q "0.00"
++ grep -q " 0.00"
+ check_err $? "Entry has a pending group timer after replace"
+
+ bridge mdb replace dev br0 port $swp1 $grp_key vid 10 temp
+@@ -667,7 +667,7 @@ __cfg_test_port_ip_sg()
+ grep -q "temp"
+ check_err $? "Entry not marked as \"temp\" after replace"
+ bridge -d -s mdb show dev br0 vid 10 | grep "$grp_key" | \
+- grep -q "0.00"
++ grep -q " 0.00"
+ check_fail $? "Entry has an unpending group timer after replace"
+ bridge mdb del dev br0 port $swp1 $grp_key vid 10
+
+@@ -850,6 +850,7 @@ cfg_test()
+ __fwd_test_host_ip()
+ {
+ local grp=$1; shift
++ local dmac=$1; shift
+ local src=$1; shift
+ local mode=$1; shift
+ local name
+@@ -872,27 +873,27 @@ __fwd_test_host_ip()
+ # Packet should only be flooded to multicast router ports when there is
+ # no matching MDB entry. The bridge is not configured as a multicast
+ # router port.
+- $MZ $mode $h1.10 -c 1 -p 128 -A $src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $src -B $grp -t udp -q
+ tc_check_packets "dev br0 ingress" 1 0
+ check_err $? "Packet locally received after flood"
+
+ # Install a regular port group entry and expect the packet to not be
+ # locally received.
+ bridge mdb add dev br0 port $swp2 grp $grp temp vid 10
+- $MZ $mode $h1.10 -c 1 -p 128 -A $src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $src -B $grp -t udp -q
+ tc_check_packets "dev br0 ingress" 1 0
+ check_err $? "Packet locally received after installing a regular entry"
+
+ # Add a host entry and expect the packet to be locally received.
+ bridge mdb add dev br0 port br0 grp $grp temp vid 10
+- $MZ $mode $h1.10 -c 1 -p 128 -A $src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $src -B $grp -t udp -q
+ tc_check_packets "dev br0 ingress" 1 1
+ check_err $? "Packet not locally received after adding a host entry"
+
+ # Remove the host entry and expect the packet to not be locally
+ # received.
+ bridge mdb del dev br0 port br0 grp $grp vid 10
+- $MZ $mode $h1.10 -c 1 -p 128 -A $src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $src -B $grp -t udp -q
+ tc_check_packets "dev br0 ingress" 1 1
+ check_err $? "Packet locally received after removing a host entry"
+
+@@ -905,8 +906,8 @@ __fwd_test_host_ip()
+
+ fwd_test_host_ip()
+ {
+- __fwd_test_host_ip "239.1.1.1" "192.0.2.1" "-4"
+- __fwd_test_host_ip "ff0e::1" "2001:db8:1::1" "-6"
++ __fwd_test_host_ip "239.1.1.1" "01:00:5e:01:01:01" "192.0.2.1" "-4"
++ __fwd_test_host_ip "ff0e::1" "33:33:00:00:00:01" "2001:db8:1::1" "-6"
+ }
+
+ fwd_test_host_l2()
+@@ -966,6 +967,7 @@ fwd_test_host()
+ __fwd_test_port_ip()
+ {
+ local grp=$1; shift
++ local dmac=$1; shift
+ local valid_src=$1; shift
+ local invalid_src=$1; shift
+ local mode=$1; shift
+@@ -999,43 +1001,43 @@ __fwd_test_port_ip()
+ vlan_ethtype $eth_type vlan_id 10 dst_ip $grp \
+ src_ip $invalid_src action drop
+
+- $MZ $mode $h1.10 -c 1 -p 128 -A $valid_src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $valid_src -B $grp -t udp -q
+ tc_check_packets "dev $h2 ingress" 1 0
+ check_err $? "Packet from valid source received on H2 before adding entry"
+
+- $MZ $mode $h1.10 -c 1 -p 128 -A $invalid_src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $invalid_src -B $grp -t udp -q
+ tc_check_packets "dev $h2 ingress" 2 0
+ check_err $? "Packet from invalid source received on H2 before adding entry"
+
+ bridge mdb add dev br0 port $swp2 grp $grp vid 10 \
+ filter_mode $filter_mode source_list $src_list
+
+- $MZ $mode $h1.10 -c 1 -p 128 -A $valid_src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $valid_src -B $grp -t udp -q
+ tc_check_packets "dev $h2 ingress" 1 1
+ check_err $? "Packet from valid source not received on H2 after adding entry"
+
+- $MZ $mode $h1.10 -c 1 -p 128 -A $invalid_src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $invalid_src -B $grp -t udp -q
+ tc_check_packets "dev $h2 ingress" 2 0
+ check_err $? "Packet from invalid source received on H2 after adding entry"
+
+ bridge mdb replace dev br0 port $swp2 grp $grp vid 10 \
+ filter_mode exclude
+
+- $MZ $mode $h1.10 -c 1 -p 128 -A $valid_src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $valid_src -B $grp -t udp -q
+ tc_check_packets "dev $h2 ingress" 1 2
+ check_err $? "Packet from valid source not received on H2 after allowing all sources"
+
+- $MZ $mode $h1.10 -c 1 -p 128 -A $invalid_src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $invalid_src -B $grp -t udp -q
+ tc_check_packets "dev $h2 ingress" 2 1
+ check_err $? "Packet from invalid source not received on H2 after allowing all sources"
+
+ bridge mdb del dev br0 port $swp2 grp $grp vid 10
+
+- $MZ $mode $h1.10 -c 1 -p 128 -A $valid_src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $valid_src -B $grp -t udp -q
+ tc_check_packets "dev $h2 ingress" 1 2
+ check_err $? "Packet from valid source received on H2 after deleting entry"
+
+- $MZ $mode $h1.10 -c 1 -p 128 -A $invalid_src -B $grp -t udp -q
++ $MZ $mode $h1.10 -a own -b $dmac -c 1 -p 128 -A $invalid_src -B $grp -t udp -q
+ tc_check_packets "dev $h2 ingress" 2 1
+ check_err $? "Packet from invalid source received on H2 after deleting entry"
+
+@@ -1047,11 +1049,11 @@ __fwd_test_port_ip()
+
+ fwd_test_port_ip()
+ {
+- __fwd_test_port_ip "239.1.1.1" "192.0.2.1" "192.0.2.2" "-4" "exclude"
+- __fwd_test_port_ip "ff0e::1" "2001:db8:1::1" "2001:db8:1::2" "-6" \
++ __fwd_test_port_ip "239.1.1.1" "01:00:5e:01:01:01" "192.0.2.1" "192.0.2.2" "-4" "exclude"
++ __fwd_test_port_ip "ff0e::1" "33:33:00:00:00:01" "2001:db8:1::1" "2001:db8:1::2" "-6" \
+ "exclude"
+- __fwd_test_port_ip "239.1.1.1" "192.0.2.1" "192.0.2.2" "-4" "include"
+- __fwd_test_port_ip "ff0e::1" "2001:db8:1::1" "2001:db8:1::2" "-6" \
++ __fwd_test_port_ip "239.1.1.1" "01:00:5e:01:01:01" "192.0.2.1" "192.0.2.2" "-4" "include"
++ __fwd_test_port_ip "ff0e::1" "33:33:00:00:00:01" "2001:db8:1::1" "2001:db8:1::2" "-6" \
+ "include"
+ }
+
+@@ -1127,7 +1129,7 @@ ctrl_igmpv3_is_in_test()
+ filter_mode include source_list 192.0.2.1
+
+ # IS_IN ( 192.0.2.2 )
+- $MZ $h1.10 -c 1 -A 192.0.2.1 -B 239.1.1.1 \
++ $MZ $h1.10 -c 1 -a own -b 01:00:5e:01:01:01 -A 192.0.2.1 -B 239.1.1.1 \
+ -t ip proto=2,p=$(igmpv3_is_in_get 239.1.1.1 192.0.2.2) -q
+
+ bridge -d mdb show dev br0 vid 10 | grep 239.1.1.1 | grep -q 192.0.2.2
+@@ -1140,7 +1142,7 @@ ctrl_igmpv3_is_in_test()
+ filter_mode include source_list 192.0.2.1
+
+ # IS_IN ( 192.0.2.2 )
+- $MZ $h1.10 -c 1 -A 192.0.2.1 -B 239.1.1.1 \
++ $MZ $h1.10 -a own -b 01:00:5e:01:01:01 -c 1 -A 192.0.2.1 -B 239.1.1.1 \
+ -t ip proto=2,p=$(igmpv3_is_in_get 239.1.1.1 192.0.2.2) -q
+
+ bridge -d mdb show dev br0 vid 10 | grep 239.1.1.1 | grep -v "src" | \
+@@ -1167,7 +1169,7 @@ ctrl_mldv2_is_in_test()
+
+ # IS_IN ( 2001:db8:1::2 )
+ local p=$(mldv2_is_in_get fe80::1 ff0e::1 2001:db8:1::2)
+- $MZ -6 $h1.10 -c 1 -A fe80::1 -B ff0e::1 \
++ $MZ -6 $h1.10 -a own -b 33:33:00:00:00:01 -c 1 -A fe80::1 -B ff0e::1 \
+ -t ip hop=1,next=0,p="$p" -q
+
+ bridge -d mdb show dev br0 vid 10 | grep ff0e::1 | \
+@@ -1181,7 +1183,7 @@ ctrl_mldv2_is_in_test()
+ filter_mode include source_list 2001:db8:1::1
+
+ # IS_IN ( 2001:db8:1::2 )
+- $MZ -6 $h1.10 -c 1 -A fe80::1 -B ff0e::1 \
++ $MZ -6 $h1.10 -a own -b 33:33:00:00:00:01 -c 1 -A fe80::1 -B ff0e::1 \
+ -t ip hop=1,next=0,p="$p" -q
+
+ bridge -d mdb show dev br0 vid 10 | grep ff0e::1 | grep -v "src" | \
+@@ -1206,6 +1208,11 @@ ctrl_test()
+ ctrl_mldv2_is_in_test
+ }
+
++if ! bridge mdb help 2>&1 | grep -q "replace"; then
++ echo "SKIP: iproute2 too old, missing bridge mdb replace support"
++ exit $ksft_skip
++fi
++
+ trap cleanup EXIT
+
+ setup_prepare
+diff --git a/tools/testing/selftests/net/forwarding/bridge_mdb_max.sh b/tools/testing/selftests/net/forwarding/bridge_mdb_max.sh
+index ae255b662ba38..3da9d93ab36fb 100755
+--- a/tools/testing/selftests/net/forwarding/bridge_mdb_max.sh
++++ b/tools/testing/selftests/net/forwarding/bridge_mdb_max.sh
+@@ -252,7 +252,8 @@ ctl4_entries_add()
+ local IPs=$(seq -f 192.0.2.%g 1 $((n - 1)))
+ local peer=$(locus_dev_peer $locus)
+ local GRP=239.1.1.${grp}
+- $MZ $peer -c 1 -A 192.0.2.1 -B $GRP \
++ local dmac=01:00:5e:01:01:$(printf "%02x" $grp)
++ $MZ $peer -a own -b $dmac -c 1 -A 192.0.2.1 -B $GRP \
+ -t ip proto=2,p=$(igmpv3_is_in_get $GRP $IPs) -q
+ sleep 1
+
+@@ -272,7 +273,8 @@ ctl4_entries_del()
+
+ local peer=$(locus_dev_peer $locus)
+ local GRP=239.1.1.${grp}
+- $MZ $peer -c 1 -A 192.0.2.1 -B 224.0.0.2 \
++ local dmac=01:00:5e:00:00:02
++ $MZ $peer -a own -b $dmac -c 1 -A 192.0.2.1 -B 224.0.0.2 \
+ -t ip proto=2,p=$(igmpv2_leave_get $GRP) -q
+ sleep 1
+ ! bridge mdb show dev br0 | grep -q $GRP
+@@ -289,8 +291,10 @@ ctl6_entries_add()
+ local peer=$(locus_dev_peer $locus)
+ local SIP=fe80::1
+ local GRP=ff0e::${grp}
++ local dmac=33:33:00:00:00:$(printf "%02x" $grp)
+ local p=$(mldv2_is_in_get $SIP $GRP $IPs)
+- $MZ -6 $peer -c 1 -A $SIP -B $GRP -t ip hop=1,next=0,p="$p" -q
++ $MZ -6 $peer -a own -b $dmac -c 1 -A $SIP -B $GRP \
++ -t ip hop=1,next=0,p="$p" -q
+ sleep 1
+
+ local nn=$(bridge mdb show dev br0 | grep $GRP | wc -l)
+@@ -310,8 +314,10 @@ ctl6_entries_del()
+ local peer=$(locus_dev_peer $locus)
+ local SIP=fe80::1
+ local GRP=ff0e::${grp}
++ local dmac=33:33:00:00:00:$(printf "%02x" $grp)
+ local p=$(mldv1_done_get $SIP $GRP)
+- $MZ -6 $peer -c 1 -A $SIP -B $GRP -t ip hop=1,next=0,p="$p" -q
++ $MZ -6 $peer -a own -b $dmac -c 1 -A $SIP -B $GRP \
++ -t ip hop=1,next=0,p="$p" -q
+ sleep 1
+ ! bridge mdb show dev br0 | grep -q $GRP
+ }
+@@ -1328,6 +1334,11 @@ test_8021qvs()
+ switch_destroy
+ }
+
++if ! bridge link help 2>&1 | grep -q "mcast_max_groups"; then
++ echo "SKIP: iproute2 too old, missing bridge \"mcast_max_groups\" support"
++ exit $ksft_skip
++fi
++
+ trap cleanup EXIT
+
+ setup_prepare
+diff --git a/tools/testing/selftests/net/forwarding/ethtool.sh b/tools/testing/selftests/net/forwarding/ethtool.sh
+index dbb9fcf759e0f..aa2eafb7b2437 100755
+--- a/tools/testing/selftests/net/forwarding/ethtool.sh
++++ b/tools/testing/selftests/net/forwarding/ethtool.sh
+@@ -286,6 +286,8 @@ different_speeds_autoneg_on()
+ ethtool -s $h1 autoneg on
+ }
+
++skip_on_veth
++
+ trap cleanup EXIT
+
+ setup_prepare
+diff --git a/tools/testing/selftests/net/forwarding/ethtool_extended_state.sh b/tools/testing/selftests/net/forwarding/ethtool_extended_state.sh
+index 072faa77f53bd..17f89c3b7c020 100755
+--- a/tools/testing/selftests/net/forwarding/ethtool_extended_state.sh
++++ b/tools/testing/selftests/net/forwarding/ethtool_extended_state.sh
+@@ -108,6 +108,8 @@ no_cable()
+ ip link set dev $swp3 down
+ }
+
++skip_on_veth
++
+ setup_prepare
+
+ tests_run
+diff --git a/tools/testing/selftests/net/forwarding/ethtool_mm.sh b/tools/testing/selftests/net/forwarding/ethtool_mm.sh
+index c580ad6238483..39e736f30322a 100755
+--- a/tools/testing/selftests/net/forwarding/ethtool_mm.sh
++++ b/tools/testing/selftests/net/forwarding/ethtool_mm.sh
+@@ -258,11 +258,6 @@ h2_destroy()
+
+ setup_prepare()
+ {
+- check_ethtool_mm_support
+- check_tc_fp_support
+- require_command lldptool
+- bail_on_lldpad "autoconfigure the MAC Merge layer" "configure it manually"
+-
+ h1=${NETIFS[p1]}
+ h2=${NETIFS[p2]}
+
+@@ -278,6 +273,19 @@ cleanup()
+ h1_destroy
+ }
+
++check_ethtool_mm_support
++check_tc_fp_support
++require_command lldptool
++bail_on_lldpad "autoconfigure the MAC Merge layer" "configure it manually"
++
++for netif in ${NETIFS[@]}; do
++ ethtool --show-mm $netif 2>&1 &> /dev/null
++ if [[ $? -ne 0 ]]; then
++ echo "SKIP: $netif does not support MAC Merge"
++ exit $ksft_skip
++ fi
++done
++
+ trap cleanup EXIT
+
+ setup_prepare
+diff --git a/tools/testing/selftests/net/forwarding/hw_stats_l3_gre.sh b/tools/testing/selftests/net/forwarding/hw_stats_l3_gre.sh
+index eb9ec4a68f84b..7594bbb490292 100755
+--- a/tools/testing/selftests/net/forwarding/hw_stats_l3_gre.sh
++++ b/tools/testing/selftests/net/forwarding/hw_stats_l3_gre.sh
+@@ -99,6 +99,8 @@ test_stats_rx()
+ test_stats g2a rx
+ }
+
++skip_on_veth
++
+ trap cleanup EXIT
+
+ setup_prepare
+diff --git a/tools/testing/selftests/net/forwarding/ip6_forward_instats_vrf.sh b/tools/testing/selftests/net/forwarding/ip6_forward_instats_vrf.sh
+index 9f5b3e2e5e954..49fa94b53a1ca 100755
+--- a/tools/testing/selftests/net/forwarding/ip6_forward_instats_vrf.sh
++++ b/tools/testing/selftests/net/forwarding/ip6_forward_instats_vrf.sh
+@@ -14,6 +14,8 @@ ALL_TESTS="
+ NUM_NETIFS=4
+ source lib.sh
+
++require_command $TROUTE6
++
+ h1_create()
+ {
+ simple_if_init $h1 2001:1:1::2/64
+diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
+index 9ddb68dd6a089..f69015bf2dea9 100755
+--- a/tools/testing/selftests/net/forwarding/lib.sh
++++ b/tools/testing/selftests/net/forwarding/lib.sh
+@@ -30,6 +30,7 @@ REQUIRE_MZ=${REQUIRE_MZ:=yes}
+ REQUIRE_MTOOLS=${REQUIRE_MTOOLS:=no}
+ STABLE_MAC_ADDRS=${STABLE_MAC_ADDRS:=no}
+ TCPDUMP_EXTRA_FLAGS=${TCPDUMP_EXTRA_FLAGS:=}
++TROUTE6=${TROUTE6:=traceroute6}
+
+ relative_path="${BASH_SOURCE%/*}"
+ if [[ "$relative_path" == "${BASH_SOURCE}" ]]; then
+@@ -163,6 +164,17 @@ check_port_mab_support()
+ fi
+ }
+
++skip_on_veth()
++{
++ local kind=$(ip -j -d link show dev ${NETIFS[p1]} |
++ jq -r '.[].linkinfo.info_kind')
++
++ if [[ $kind == veth ]]; then
++ echo "SKIP: Test cannot be run with veth pairs"
++ exit $ksft_skip
++ fi
++}
++
+ if [[ "$(id -u)" -ne 0 ]]; then
+ echo "SKIP: need root privileges"
+ exit $ksft_skip
+@@ -225,6 +237,11 @@ create_netif_veth()
+ for ((i = 1; i <= NUM_NETIFS; ++i)); do
+ local j=$((i+1))
+
++ if [ -z ${NETIFS[p$i]} ]; then
++ echo "SKIP: Cannot create interface. Name not specified"
++ exit $ksft_skip
++ fi
++
+ ip link show dev ${NETIFS[p$i]} &> /dev/null
+ if [[ $? -ne 0 ]]; then
+ ip link add ${NETIFS[p$i]} type veth \
+diff --git a/tools/testing/selftests/net/forwarding/settings b/tools/testing/selftests/net/forwarding/settings
+new file mode 100644
+index 0000000000000..e7b9417537fbc
+--- /dev/null
++++ b/tools/testing/selftests/net/forwarding/settings
+@@ -0,0 +1 @@
++timeout=0
+diff --git a/tools/testing/selftests/net/forwarding/tc_actions.sh b/tools/testing/selftests/net/forwarding/tc_actions.sh
+index a96cff8e72197..b0f5e55d2d0b2 100755
+--- a/tools/testing/selftests/net/forwarding/tc_actions.sh
++++ b/tools/testing/selftests/net/forwarding/tc_actions.sh
+@@ -9,6 +9,8 @@ NUM_NETIFS=4
+ source tc_common.sh
+ source lib.sh
+
++require_command ncat
++
+ tcflags="skip_hw"
+
+ h1_create()
+@@ -220,9 +222,9 @@ mirred_egress_to_ingress_tcp_test()
+ ip_proto icmp \
+ action drop
+
+- ip vrf exec v$h1 nc --recv-only -w10 -l -p 12345 -o $mirred_e2i_tf2 &
++ ip vrf exec v$h1 ncat --recv-only -w10 -l -p 12345 -o $mirred_e2i_tf2 &
+ local rpid=$!
+- ip vrf exec v$h1 nc -w1 --send-only 192.0.2.2 12345 <$mirred_e2i_tf1
++ ip vrf exec v$h1 ncat -w1 --send-only 192.0.2.2 12345 <$mirred_e2i_tf1
+ wait -n $rpid
+ cmp -s $mirred_e2i_tf1 $mirred_e2i_tf2
+ check_err $? "server output check failed"
+diff --git a/tools/testing/selftests/net/forwarding/tc_flower.sh b/tools/testing/selftests/net/forwarding/tc_flower.sh
+index 683711f41aa9b..b1daad19b01ec 100755
+--- a/tools/testing/selftests/net/forwarding/tc_flower.sh
++++ b/tools/testing/selftests/net/forwarding/tc_flower.sh
+@@ -52,8 +52,8 @@ match_dst_mac_test()
+ tc_check_packets "dev $h2 ingress" 101 1
+ check_fail $? "Matched on a wrong filter"
+
+- tc_check_packets "dev $h2 ingress" 102 1
+- check_err $? "Did not match on correct filter"
++ tc_check_packets "dev $h2 ingress" 102 0
++ check_fail $? "Did not match on correct filter"
+
+ tc filter del dev $h2 ingress protocol ip pref 1 handle 101 flower
+ tc filter del dev $h2 ingress protocol ip pref 2 handle 102 flower
+@@ -78,8 +78,8 @@ match_src_mac_test()
+ tc_check_packets "dev $h2 ingress" 101 1
+ check_fail $? "Matched on a wrong filter"
+
+- tc_check_packets "dev $h2 ingress" 102 1
+- check_err $? "Did not match on correct filter"
++ tc_check_packets "dev $h2 ingress" 102 0
++ check_fail $? "Did not match on correct filter"
+
+ tc filter del dev $h2 ingress protocol ip pref 1 handle 101 flower
+ tc filter del dev $h2 ingress protocol ip pref 2 handle 102 flower
+diff --git a/tools/testing/selftests/net/forwarding/tc_tunnel_key.sh b/tools/testing/selftests/net/forwarding/tc_tunnel_key.sh
+index 5ac184d518099..5a5dd90348195 100755
+--- a/tools/testing/selftests/net/forwarding/tc_tunnel_key.sh
++++ b/tools/testing/selftests/net/forwarding/tc_tunnel_key.sh
+@@ -104,11 +104,14 @@ tunnel_key_nofrag_test()
+ local i
+
+ tc filter add dev $swp1 ingress protocol ip pref 100 handle 100 \
+- flower ip_flags nofrag action drop
++ flower src_ip 192.0.2.1 dst_ip 192.0.2.2 ip_proto udp \
++ ip_flags nofrag action drop
+ tc filter add dev $swp1 ingress protocol ip pref 101 handle 101 \
+- flower ip_flags firstfrag action drop
++ flower src_ip 192.0.2.1 dst_ip 192.0.2.2 ip_proto udp \
++ ip_flags firstfrag action drop
+ tc filter add dev $swp1 ingress protocol ip pref 102 handle 102 \
+- flower ip_flags nofirstfrag action drop
++ flower src_ip 192.0.2.1 dst_ip 192.0.2.2 ip_proto udp \
++ ip_flags nofirstfrag action drop
+
+ # test 'nofrag' set
+ tc filter add dev h1-et egress protocol all pref 1 handle 1 matchall $tcflags \
+diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
+index a40c35c90c52b..de0d04b0e2469 100755
+--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
++++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
+@@ -676,6 +676,7 @@ pm_nl_del_endpoint()
+ local addr=$3
+
+ if [ $ip_mptcp -eq 1 ]; then
++ [ $id -ne 0 ] && addr=''
+ ip -n $ns mptcp endpoint delete id $id $addr
+ else
+ ip netns exec $ns ./pm_nl_ctl del $id $addr
+@@ -766,10 +767,11 @@ pm_nl_check_endpoint()
+ fi
+
+ if [ $ip_mptcp -eq 1 ]; then
++ # get line and trim trailing whitespace
+ line=$(ip -n $ns mptcp endpoint show $id)
++ line="${line% }"
+ # the dump order is: address id flags port dev
+- expected_line="$addr"
+- [ -n "$addr" ] && expected_line="$expected_line $addr"
++ [ -n "$addr" ] && expected_line="$addr"
+ expected_line="$expected_line $id"
+ [ -n "$_flags" ] && expected_line="$expected_line ${_flags//","/" "}"
+ [ -n "$dev" ] && expected_line="$expected_line $dev"
+diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
+index b357ba24af06f..7a957c7d459ae 100644
+--- a/tools/testing/selftests/rseq/Makefile
++++ b/tools/testing/selftests/rseq/Makefile
+@@ -4,8 +4,10 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
+ CLANG_FLAGS += -no-integrated-as
+ endif
+
++top_srcdir = ../../../..
++
+ CFLAGS += -O2 -Wall -g -I./ $(KHDR_INCLUDES) -L$(OUTPUT) -Wl,-rpath=./ \
+- $(CLANG_FLAGS)
++ $(CLANG_FLAGS) -I$(top_srcdir)/tools/include
+ LDLIBS += -lpthread -ldl
+
+ # Own dependencies because we only want to build against 1st prerequisite, but
+diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
+index a723da2532441..96e812bdf8a45 100644
+--- a/tools/testing/selftests/rseq/rseq.c
++++ b/tools/testing/selftests/rseq/rseq.c
+@@ -31,6 +31,8 @@
+ #include <sys/auxv.h>
+ #include <linux/auxvec.h>
+
++#include <linux/compiler.h>
++
+ #include "../kselftest.h"
+ #include "rseq.h"
+
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-16 20:20 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-08-16 20:20 UTC (permalink / raw
To: gentoo-commits
commit: 81e50a0a0f2211c36ba03aa501b7d1b2a67f32d5
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Aug 16 20:19:52 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Aug 16 20:19:52 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=81e50a0a
Adding back BMQ, thanks to holgerh
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 8 +
...MQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch | 11164 +++++++++++++++++++
5021_BMQ-and-PDS-gentoo-defaults.patch | 13 +
3 files changed, 11185 insertions(+)
diff --git a/0000_README b/0000_README
index c16c1b6b..9ce881e3 100644
--- a/0000_README
+++ b/0000_README
@@ -130,3 +130,11 @@ Desc: Add Gentoo Linux support config settings and defaults.
Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: Kernel >= 5.15 patch enables gcc = v11.1+ optimizations for additional CPUs.
+
+Patch: 5020_BMQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch
+From: https://github.com/hhoffstaette/kernel-patches/
+Desc: BMQ (BitMap Queue) scheduler. A new CPU scheduler developed from PDS (included). Inspired by the scheduler in Zircon.
+
+Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
+From: https://gitweb.gentoo.org/proj/linux-patches.git/
+Desc: Set defaults for BMQ. Add architectures as people test them; default to N.
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch b/5020_BMQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch
new file mode 100644
index 00000000..5e870849
--- /dev/null
+++ b/5020_BMQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch
@@ -0,0 +1,11164 @@
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index 9e5bab29685f..b942b7dd8c42 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -5496,6 +5496,12 @@
+ sa1100ir [NET]
+ See drivers/net/irda/sa1100_ir.c.
+
++ sched_timeslice=
++ [KNL] Time slice in ms for Project C BMQ/PDS scheduler.
++ Format: integer 2, 4
++ Default: 4
++ See Documentation/scheduler/sched-BMQ.txt
++
+ sched_verbose [KNL] Enables verbose scheduler debug messages.
+
+ schedstats= [KNL,X86] Enable or disable scheduled statistics.
+diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
+index d85d90f5d000..f730195a3adb 100644
+--- a/Documentation/admin-guide/sysctl/kernel.rst
++++ b/Documentation/admin-guide/sysctl/kernel.rst
+@@ -1616,3 +1616,13 @@ is 10 seconds.
+
+ The softlockup threshold is (``2 * watchdog_thresh``). Setting this
+ tunable to zero will disable lockup detection altogether.
++
++yield_type:
++===========
++
++BMQ/PDS CPU scheduler only. This determines what type of yield a call
++to sched_yield() will perform.
++
++ 0 - No yield.
++ 1 - Deboost and requeue task. (default)
++ 2 - Set run queue skip task.
+diff --git a/Documentation/scheduler/sched-BMQ.txt b/Documentation/scheduler/sched-BMQ.txt
+new file mode 100644
+index 000000000000..05c84eec0f31
+--- /dev/null
++++ b/Documentation/scheduler/sched-BMQ.txt
+@@ -0,0 +1,110 @@
++ BitMap queue CPU Scheduler
++ --------------------------
++
++CONTENT
++========
++
++ Background
++ Design
++ Overview
++ Task policy
++ Priority management
++ BitMap Queue
++ CPU Assignment and Migration
++
++
++Background
++==========
++
++BitMap Queue CPU scheduler, referred to as BMQ from here on, is an evolution
++of the previous Priority and Deadline based Skiplist multiple queue scheduler
++(PDS), and is inspired by the Zircon scheduler. The goal is to keep the
++scheduler code simple while staying efficient and scalable for interactive
++tasks such as desktop use, movie playback and gaming.
++
++Design
++======
++
++Overview
++--------
++
++BMQ uses a per-CPU run queue design: each (logical) CPU has its own run queue
++and is responsible for scheduling the tasks that are put into its
++run queue.
++
++The run queue is a set of priority queues. In terms of data structure these
++queues are FIFO queues for non-rt tasks and priority queues for rt tasks. See
++BitMap Queue below for details. BMQ is optimized for non-rt tasks, given that
++most applications are non-rt tasks. Whether the queue is FIFO or priority,
++each queue is an ordered list of runnable tasks awaiting execution
++and the data structures are the same. When it is time for a new task to run,
++the scheduler simply looks for the lowest numbered queue that contains a task
++and runs the first task from the head of that queue. The per-CPU idle task is
++also in the run queue, so the scheduler can always find a task to run from
++its run queue.
++
++Each task is assigned the same timeslice (default 4 ms) when it is picked to
++start running. A task is reinserted at the end of the appropriate priority
++queue when it uses up its whole timeslice. When the scheduler selects a new task
++from the priority queue it sets the CPU's preemption timer for the remainder of
++the previous timeslice. When that timer fires, the scheduler stops execution
++of that task, selects another task and starts over again.
++
++If a task blocks waiting for a shared resource then it's taken out of its
++priority queue and is placed in a wait queue for the shared resource. When it
++is unblocked it will be reinserted in the appropriate priority queue of an
++eligible CPU.
++
++Task policy
++-----------
++
++BMQ supports the DEADLINE, FIFO, RR, NORMAL, BATCH and IDLE task policies like
++the mainline CFS scheduler, but BMQ is heavily optimized for non-rt tasks, that
++is, NORMAL/BATCH/IDLE policy tasks. Below are the implementation details of
++each policy.
++
++DEADLINE
++ It is squashed as a priority 0 FIFO task.
++
++FIFO/RR
++ All RT tasks share a single priority queue in the BMQ run queue design. The
++complexity of the insert operation is O(n). BMQ is not designed for systems
++that mostly run rt policy tasks.
++
++NORMAL/BATCH/IDLE
++ BATCH and IDLE tasks are treated as the same policy. They compete for CPU
++with NORMAL policy tasks, but they just don't get boosted. To control the
++priority of NORMAL/BATCH/IDLE tasks, simply use the nice level.
++
++ISO
++ ISO policy is not supported in BMQ. Please use nice level -20 NORMAL policy
++task instead.
++
++Priority management
++-------------------
++
++RT tasks have priorities from 0-99. For non-rt tasks, there are three different
++factors used to determine the effective priority of a task; the effective
++priority is what is used to determine which queue it will be in.
++
++The first factor is simply the task's static priority, which is assigned from
++the task's nice level, within [-20, 19] from userland's point of view and
++[0, 39] internally.
++
++The second factor is the priority boost. This is a value bounded between
++[-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] used to offset the base priority; it is
++modified in the following cases:
++
++*When a thread has used up its entire timeslice, its boost is always deboosted
++by increasing it by one.
++*When a thread gives up CPU control (voluntarily or not) to reschedule, and its
++switch-in time (the time since it was last switched in and run) is below the
++threshold based on its priority boost, its boost is boosted by decreasing it by
++one, but it is capped at 0 (won't go negative).
++
++The intent in this system is to ensure that interactive threads are serviced
++quickly. These are usually the threads that interact directly with the user
++and cause user-perceivable latency. These threads usually do little work and
++spend most of their time blocked awaiting another user event. So they get the
++priority boost from unblocking while background threads that do most of the
++processing receive the priority penalty for using their entire timeslice.
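
As a rough worked example of how the two factors above combine, the following
standalone C sketch maps a nice value onto the internal [0, 39] static priority
and offsets it by a boost clamped to [-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ].
MAX_PRIORITY_ADJ (7) matches the BMQ value defined in
include/linux/sched/prio.h later in this patch; the helper names and the exact
way the two values are combined are simplifications invented for this
illustration, not the real alt_core.c code.

#include <stdio.h>

#define MAX_PRIORITY_ADJ 7	/* BMQ value from the prio.h change in this patch */

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Internal static priority [0, 39] from the userland nice value [-20, 19]. */
static int static_prio_from_nice(int nice)
{
	return nice + 20;
}

/* Simplified effective level: static priority shifted by the clamped boost.
 * Higher numbers mean lower priority, matching the text above. */
static int effective_level(int nice, int boost)
{
	boost = clamp_int(boost, -MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ);
	return static_prio_from_nice(nice) + boost;
}

int main(void)
{
	/* nice 0 task with no boost adjustment */
	printf("nice 0, boost 0  -> level %d\n", effective_level(0, 0));	/* 20 */
	/* nice 0 CPU hog repeatedly deboosted; a request of +9 clamps to +7 */
	printf("nice 0, boost +9 -> level %d\n", effective_level(0, 9));	/* 27 */
	return 0;
}

The point is only that a task's queue level is its nice-derived static priority
nudged up or down by the bounded boost.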
+diff --git a/fs/proc/base.c b/fs/proc/base.c
+index 05452c3b9872..fa1ceb85ad24 100644
+--- a/fs/proc/base.c
++++ b/fs/proc/base.c
+@@ -480,7 +480,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
+ seq_puts(m, "0 0 0\n");
+ else
+ seq_printf(m, "%llu %llu %lu\n",
+- (unsigned long long)task->se.sum_exec_runtime,
++ (unsigned long long)tsk_seruntime(task),
+ (unsigned long long)task->sched_info.run_delay,
+ task->sched_info.pcount);
+
+diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
+index 8874f681b056..59eb72bf7d5f 100644
+--- a/include/asm-generic/resource.h
++++ b/include/asm-generic/resource.h
+@@ -23,7 +23,7 @@
+ [RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ [RLIMIT_SIGPENDING] = { 0, 0 }, \
+ [RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
+- [RLIMIT_NICE] = { 0, 0 }, \
++ [RLIMIT_NICE] = { 30, 30 }, \
+ [RLIMIT_RTPRIO] = { 0, 0 }, \
+ [RLIMIT_RTTIME] = { RLIM_INFINITY, RLIM_INFINITY }, \
+ }
+diff --git a/include/linux/sched.h b/include/linux/sched.h
+index eed5d65b8d1f..cdfd9263ddd6 100644
+--- a/include/linux/sched.h
++++ b/include/linux/sched.h
+@@ -764,8 +764,14 @@ struct task_struct {
+ unsigned int ptrace;
+
+ #ifdef CONFIG_SMP
+- int on_cpu;
+ struct __call_single_node wake_entry;
++#endif
++#if defined(CONFIG_SMP) || defined(CONFIG_SCHED_ALT)
++ int on_cpu;
++#endif
++
++#ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ unsigned int wakee_flips;
+ unsigned long wakee_flip_decay_ts;
+ struct task_struct *last_wakee;
+@@ -779,6 +785,7 @@ struct task_struct {
+ */
+ int recent_used_cpu;
+ int wake_cpu;
++#endif /* !CONFIG_SCHED_ALT */
+ #endif
+ int on_rq;
+
+@@ -787,6 +794,20 @@ struct task_struct {
+ int normal_prio;
+ unsigned int rt_priority;
+
++#ifdef CONFIG_SCHED_ALT
++ u64 last_ran;
++ s64 time_slice;
++ int sq_idx;
++ struct list_head sq_node;
++#ifdef CONFIG_SCHED_BMQ
++ int boost_prio;
++#endif /* CONFIG_SCHED_BMQ */
++#ifdef CONFIG_SCHED_PDS
++ u64 deadline;
++#endif /* CONFIG_SCHED_PDS */
++ /* sched_clock time spent running */
++ u64 sched_time;
++#else /* !CONFIG_SCHED_ALT */
+ struct sched_entity se;
+ struct sched_rt_entity rt;
+ struct sched_dl_entity dl;
+@@ -797,6 +818,7 @@ struct task_struct {
+ unsigned long core_cookie;
+ unsigned int core_occupation;
+ #endif
++#endif /* !CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_CGROUP_SCHED
+ struct task_group *sched_task_group;
+@@ -1551,6 +1573,15 @@ struct task_struct {
+ */
+ };
+
++#ifdef CONFIG_SCHED_ALT
++#define tsk_seruntime(t) ((t)->sched_time)
++/* replace the uncertain rt_timeout with 0UL */
++#define tsk_rttimeout(t) (0UL)
++#else /* CFS */
++#define tsk_seruntime(t) ((t)->se.sum_exec_runtime)
++#define tsk_rttimeout(t) ((t)->rt.timeout)
++#endif /* !CONFIG_SCHED_ALT */
++
+ static inline struct pid *task_pid(struct task_struct *task)
+ {
+ return task->thread_pid;
+diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
+index 7c83d4d5a971..fa30f98cb2be 100644
+--- a/include/linux/sched/deadline.h
++++ b/include/linux/sched/deadline.h
+@@ -1,5 +1,24 @@
+ /* SPDX-License-Identifier: GPL-2.0 */
+
++#ifdef CONFIG_SCHED_ALT
++
++static inline int dl_task(struct task_struct *p)
++{
++ return 0;
++}
++
++#ifdef CONFIG_SCHED_BMQ
++#define __tsk_deadline(p) (0UL)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
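++/* Packing the priority into the top 8 bits lets a single 64-bit comparison
++ * order tasks by priority first and then by deadline, assuming the deadline
++ * itself fits in the low 56 bits.
++ */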
++#define __tsk_deadline(p) ((((u64) ((p)->prio))<<56) | (p)->deadline)
++#endif
++
++#else
++
++#define __tsk_deadline(p) ((p)->dl.deadline)
++
+ /*
+ * SCHED_DEADLINE tasks has negative priorities, reflecting
+ * the fact that any of them has higher prio than RT and
+@@ -21,6 +40,7 @@ static inline int dl_task(struct task_struct *p)
+ {
+ return dl_prio(p->prio);
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ static inline bool dl_time_before(u64 a, u64 b)
+ {
+diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
+index ab83d85e1183..6af9ae681116 100644
+--- a/include/linux/sched/prio.h
++++ b/include/linux/sched/prio.h
+@@ -18,6 +18,32 @@
+ #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
+ #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
+
++#ifdef CONFIG_SCHED_ALT
++
++/* Undefine MAX_PRIO and DEFAULT_PRIO */
++#undef MAX_PRIO
++#undef DEFAULT_PRIO
++
++/* +/- priority levels from the base priority */
++#ifdef CONFIG_SCHED_BMQ
++#define MAX_PRIORITY_ADJ (7)
++
++#define MIN_NORMAL_PRIO (MAX_RT_PRIO)
++#define MAX_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH)
++#define DEFAULT_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH / 2)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++#define MAX_PRIORITY_ADJ (0)
++
++#define MIN_NORMAL_PRIO (128)
++#define NORMAL_PRIO_NUM (64)
++#define MAX_PRIO (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
++#define DEFAULT_PRIO (MAX_PRIO - NICE_WIDTH / 2)
++#endif
++
++#endif /* CONFIG_SCHED_ALT */
++
+ /*
+ * Convert user-nice values [ -20 ... 0 ... 19 ]
+ * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
+diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
+index 994c25640e15..8c050a59ece1 100644
+--- a/include/linux/sched/rt.h
++++ b/include/linux/sched/rt.h
+@@ -24,8 +24,10 @@ static inline bool task_is_realtime(struct task_struct *tsk)
+
+ if (policy == SCHED_FIFO || policy == SCHED_RR)
+ return true;
++#ifndef CONFIG_SCHED_ALT
+ if (policy == SCHED_DEADLINE)
+ return true;
++#endif
+ return false;
+ }
+
+diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
+index 816df6cc444e..c8da08e18c91 100644
+--- a/include/linux/sched/topology.h
++++ b/include/linux/sched/topology.h
+@@ -234,7 +234,8 @@ static inline bool cpus_share_cache(int this_cpu, int that_cpu)
+
+ #endif /* !CONFIG_SMP */
+
+-#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
++#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) && \
++ !defined(CONFIG_SCHED_ALT)
+ extern void rebuild_sched_domains_energy(void);
+ #else
+ static inline void rebuild_sched_domains_energy(void)
+diff --git a/init/Kconfig b/init/Kconfig
+index 32c24950c4ce..cf951b739454 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -629,6 +629,7 @@ config TASK_IO_ACCOUNTING
+
+ config PSI
+ bool "Pressure stall information tracking"
++ depends on !SCHED_ALT
+ help
+ Collect metrics that indicate how overcommitted the CPU, memory,
+ and IO capacity are in the system.
+@@ -793,6 +794,7 @@ menu "Scheduler features"
+ config UCLAMP_TASK
+ bool "Enable utilization clamping for RT/FAIR tasks"
+ depends on CPU_FREQ_GOV_SCHEDUTIL
++ depends on !SCHED_ALT
+ help
+ This feature enables the scheduler to track the clamped utilization
+ of each CPU based on RUNNABLE tasks scheduled on that CPU.
+@@ -839,6 +841,35 @@ config UCLAMP_BUCKETS_COUNT
+
+ If in doubt, use the default value.
+
++menuconfig SCHED_ALT
++ bool "Alternative CPU Schedulers"
++ default y
++ help
++	  This feature enables alternative CPU schedulers.
++
++if SCHED_ALT
++
++choice
++ prompt "Alternative CPU Scheduler"
++ default SCHED_BMQ
++
++config SCHED_BMQ
++ bool "BMQ CPU scheduler"
++ help
++ The BitMap Queue CPU scheduler for excellent interactivity and
++ responsiveness on the desktop and solid scalability on normal
++ hardware and commodity servers.
++
++config SCHED_PDS
++ bool "PDS CPU scheduler"
++ help
++ The Priority and Deadline based Skip list multiple queue CPU
++ Scheduler.
++
++endchoice
++
++endif
++
+ endmenu
+
+ #
+@@ -892,6 +923,7 @@ config NUMA_BALANCING
+ depends on ARCH_SUPPORTS_NUMA_BALANCING
+ depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
+ depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
++ depends on !SCHED_ALT
+ help
+ This option adds support for automatic NUMA aware memory/task placement.
+ The mechanism is quite primitive and is based on migrating memory when
+@@ -989,6 +1021,7 @@ config FAIR_GROUP_SCHED
+ depends on CGROUP_SCHED
+ default CGROUP_SCHED
+
++if !SCHED_ALT
+ config CFS_BANDWIDTH
+ bool "CPU bandwidth provisioning for FAIR_GROUP_SCHED"
+ depends on FAIR_GROUP_SCHED
+@@ -1011,6 +1044,7 @@ config RT_GROUP_SCHED
+ realtime bandwidth for them.
+ See Documentation/scheduler/sched-rt-group.rst for more information.
+
++endif #!SCHED_ALT
+ endif #CGROUP_SCHED
+
+ config SCHED_MM_CID
+@@ -1259,6 +1293,7 @@ config CHECKPOINT_RESTORE
+
+ config SCHED_AUTOGROUP
+ bool "Automatic process group scheduling"
++ depends on !SCHED_ALT
+ select CGROUPS
+ select CGROUP_SCHED
+ select FAIR_GROUP_SCHED
+diff --git a/init/init_task.c b/init/init_task.c
+index ff6c4b9bfe6b..19e9c662d1a1 100644
+--- a/init/init_task.c
++++ b/init/init_task.c
+@@ -75,9 +75,15 @@ struct task_struct init_task
+ .stack = init_stack,
+ .usage = REFCOUNT_INIT(2),
+ .flags = PF_KTHREAD,
++#ifdef CONFIG_SCHED_ALT
++ .prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
++ .static_prio = DEFAULT_PRIO,
++ .normal_prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
++#else
+ .prio = MAX_PRIO - 20,
+ .static_prio = MAX_PRIO - 20,
+ .normal_prio = MAX_PRIO - 20,
++#endif
+ .policy = SCHED_NORMAL,
+ .cpus_ptr = &init_task.cpus_mask,
+ .user_cpus_ptr = NULL,
+@@ -88,6 +94,17 @@ struct task_struct init_task
+ .restart_block = {
+ .fn = do_no_restart_syscall,
+ },
++#ifdef CONFIG_SCHED_ALT
++ .sq_node = LIST_HEAD_INIT(init_task.sq_node),
++#ifdef CONFIG_SCHED_BMQ
++ .boost_prio = 0,
++ .sq_idx = 15,
++#endif
++#ifdef CONFIG_SCHED_PDS
++ .deadline = 0,
++#endif
++ .time_slice = HZ,
++#else
+ .se = {
+ .group_node = LIST_HEAD_INIT(init_task.se.group_node),
+ },
+@@ -95,6 +112,7 @@ struct task_struct init_task
+ .run_list = LIST_HEAD_INIT(init_task.rt.run_list),
+ .time_slice = RR_TIMESLICE,
+ },
++#endif
+ .tasks = LIST_HEAD_INIT(init_task.tasks),
+ #ifdef CONFIG_SMP
+ .pushable_tasks = PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
+diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
+index c2f1fd95a821..41654679b1b2 100644
+--- a/kernel/Kconfig.preempt
++++ b/kernel/Kconfig.preempt
+@@ -117,7 +117,7 @@ config PREEMPT_DYNAMIC
+
+ config SCHED_CORE
+ bool "Core Scheduling for SMT"
+- depends on SCHED_SMT
++ depends on SCHED_SMT && !SCHED_ALT
+ help
+ This option permits Core Scheduling, a means of coordinated task
+ selection across SMT siblings. When enabled -- see
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index e4ca2dd2b764..82786dbb220c 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -791,7 +791,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
+ return ret;
+ }
+
+-#ifdef CONFIG_SMP
++#if defined(CONFIG_SMP) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * Helper routine for generate_sched_domains().
+ * Do cpusets a, b have overlapping effective cpus_allowed masks?
+@@ -1187,7 +1187,7 @@ static void rebuild_sched_domains_locked(void)
+ /* Have scheduler rebuild the domains */
+ partition_and_rebuild_sched_domains(ndoms, doms, attr);
+ }
+-#else /* !CONFIG_SMP */
++#else /* !CONFIG_SMP || CONFIG_SCHED_ALT */
+ static void rebuild_sched_domains_locked(void)
+ {
+ }
+diff --git a/kernel/delayacct.c b/kernel/delayacct.c
+index 6f0c358e73d8..8111481ce8b1 100644
+--- a/kernel/delayacct.c
++++ b/kernel/delayacct.c
+@@ -150,7 +150,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
+ */
+ t1 = tsk->sched_info.pcount;
+ t2 = tsk->sched_info.run_delay;
+- t3 = tsk->se.sum_exec_runtime;
++ t3 = tsk_seruntime(tsk);
+
+ d->cpu_count += t1;
+
+diff --git a/kernel/exit.c b/kernel/exit.c
+index edb50b4c9972..09e72bba7cc2 100644
+--- a/kernel/exit.c
++++ b/kernel/exit.c
+@@ -173,7 +173,7 @@ static void __exit_signal(struct task_struct *tsk)
+ sig->curr_target = next_thread(tsk);
+ }
+
+- add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
++ add_device_randomness((const void*) &tsk_seruntime(tsk),
+ sizeof(unsigned long long));
+
+ /*
+@@ -194,7 +194,7 @@ static void __exit_signal(struct task_struct *tsk)
+ sig->inblock += task_io_get_inblock(tsk);
+ sig->oublock += task_io_get_oublock(tsk);
+ task_io_accounting_add(&sig->ioac, &tsk->ioac);
+- sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
++ sig->sum_sched_runtime += tsk_seruntime(tsk);
+ sig->nr_threads--;
+ __unhash_process(tsk, group_dead);
+ write_sequnlock(&sig->stats_lock);
+--- a/kernel/locking/rtmutex.c 2023-08-01 15:40:26.000000000 +0200
++++ b/kernel/locking/rtmutex.c 2023-08-02 16:05:00.952812874 +0200
+@@ -343,7 +343,7 @@ waiter_update_prio(struct rt_mutex_waite
+ lockdep_assert(RB_EMPTY_NODE(&waiter->tree.entry));
+
+ waiter->tree.prio = __waiter_prio(task);
+- waiter->tree.deadline = task->dl.deadline;
++ waiter->tree.deadline = __tsk_deadline(task);
+ }
+
+ /*
+@@ -364,16 +364,20 @@ waiter_clone_prio(struct rt_mutex_waiter
+ * Only use with rt_waiter_node_{less,equal}()
+ */
+ #define task_to_waiter_node(p) \
+- &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
++ &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = __tsk_deadline(p) }
+ #define task_to_waiter(p) \
+ &(struct rt_mutex_waiter){ .tree = *task_to_waiter_node(p) }
+
+ static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left,
+ struct rt_waiter_node *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline < right->deadline);
++#else
+ if (left->prio < right->prio)
+ return 1;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -382,16 +386,22 @@ static __always_inline int rt_waiter_nod
+ */
+ if (dl_prio(left->prio))
+ return dl_time_before(left->deadline, right->deadline);
++#endif
+
+ return 0;
++#endif
+ }
+
+ static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
+ struct rt_waiter_node *right)
+ {
++#ifdef CONFIG_SCHED_PDS
++ return (left->deadline == right->deadline);
++#else
+ if (left->prio != right->prio)
+ return 0;
+
++#ifndef CONFIG_SCHED_BMQ
+ /*
+ * If both waiters have dl_prio(), we check the deadlines of the
+ * associated tasks.
+@@ -400,8 +410,10 @@ static __always_inline int rt_waiter_nod
+ */
+ if (dl_prio(left->prio))
+ return left->deadline == right->deadline;
++#endif
+
+ return 1;
++#endif
+ }
+
+ static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
+diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
+index 976092b7bd45..31d587c16ec1 100644
+--- a/kernel/sched/Makefile
++++ b/kernel/sched/Makefile
+@@ -28,7 +28,12 @@ endif
+ # These compilation units have roughly the same size and complexity - so their
+ # build parallelizes well and finishes roughly at once:
+ #
++ifdef CONFIG_SCHED_ALT
++obj-y += alt_core.o
++obj-$(CONFIG_SCHED_DEBUG) += alt_debug.o
++else
+ obj-y += core.o
+ obj-y += fair.o
++endif
+ obj-y += build_policy.o
+ obj-y += build_utility.o
+diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
+new file mode 100644
+index 000000000000..3e8ddbd8001c
+--- /dev/null
++++ b/kernel/sched/alt_core.c
+@@ -0,0 +1,8729 @@
++/*
++ * kernel/sched/alt_core.c
++ *
++ * Core alternative kernel scheduler code and related syscalls
++ *
++ * Copyright (C) 1991-2002 Linus Torvalds
++ *
++ * 2009-08-13 Brainfuck deadline scheduling policy by Con Kolivas deletes
++ * a whole lot of those previous things.
++ * 2017-09-06 Priority and Deadline based Skip list multiple queue kernel
++ * scheduler by Alfred Chen.
++ * 2019-02-20 BMQ(BitMap Queue) kernel scheduler by Alfred Chen.
++ */
++#include <linux/sched/clock.h>
++#include <linux/sched/cputime.h>
++#include <linux/sched/debug.h>
++#include <linux/sched/isolation.h>
++#include <linux/sched/loadavg.h>
++#include <linux/sched/mm.h>
++#include <linux/sched/nohz.h>
++#include <linux/sched/stat.h>
++#include <linux/sched/wake_q.h>
++
++#include <linux/blkdev.h>
++#include <linux/context_tracking.h>
++#include <linux/cpuset.h>
++#include <linux/delayacct.h>
++#include <linux/init_task.h>
++#include <linux/kcov.h>
++#include <linux/kprobes.h>
++#include <linux/nmi.h>
++#include <linux/scs.h>
++
++#include <uapi/linux/sched/types.h>
++
++#include <asm/irq_regs.h>
++#include <asm/switch_to.h>
++
++#define CREATE_TRACE_POINTS
++#include <trace/events/sched.h>
++#include <trace/events/ipi.h>
++#undef CREATE_TRACE_POINTS
++
++#include "sched.h"
++
++#include "pelt.h"
++
++#include "../../io_uring/io-wq.h"
++#include "../smpboot.h"
++
++EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpu);
++EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpumask);
++
++/*
++ * Export tracepoints that act as a bare tracehook (ie: have no trace event
++ * associated with them) to allow external modules to probe them.
++ */
++EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
++
++#ifdef CONFIG_SCHED_DEBUG
++#define sched_feat(x) (1)
++/*
++ * Print a warning if need_resched is set for the given duration (if
++ * LATENCY_WARN is enabled).
++ *
++ * If sysctl_resched_latency_warn_once is set, only one warning will be shown
++ * per boot.
++ */
++__read_mostly int sysctl_resched_latency_warn_ms = 100;
++__read_mostly int sysctl_resched_latency_warn_once = 1;
++#else
++#define sched_feat(x) (0)
++#endif /* CONFIG_SCHED_DEBUG */
++
++#define ALT_SCHED_VERSION "v6.4-r1"
++
++/*
++ * Compile time debug macro
++ * #define ALT_SCHED_DEBUG
++ */
++
++/* rt_prio(prio) defined in include/linux/sched/rt.h */
++#define rt_task(p) rt_prio((p)->prio)
++#define rt_policy(policy) ((policy) == SCHED_FIFO || (policy) == SCHED_RR)
++#define task_has_rt_policy(p) (rt_policy((p)->policy))
++
++#define STOP_PRIO (MAX_RT_PRIO - 1)
++
++/* Default time slice is 4 ms; it can be set via the kernel parameter "sched_timeslice" */
++u64 sched_timeslice_ns __read_mostly = (4 << 20);
++
++static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx);
++
++#ifdef CONFIG_SCHED_BMQ
++#include "bmq.h"
++#endif
++#ifdef CONFIG_SCHED_PDS
++#include "pds.h"
++#endif
++
++struct affinity_context {
++ const struct cpumask *new_mask;
++ struct cpumask *user_mask;
++ unsigned int flags;
++};
++
++static int __init sched_timeslice(char *str)
++{
++ int timeslice_ms;
++
++	get_option(&str, &timeslice_ms);
++ if (2 != timeslice_ms)
++ timeslice_ms = 4;
++ sched_timeslice_ns = timeslice_ms << 20;
++ sched_timeslice_imp(timeslice_ms);
++
++ return 0;
++}
++early_param("sched_timeslice", sched_timeslice);
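++
++/*
++ * Usage note (sketch): booting with "sched_timeslice=2" selects a 2 ms
++ * slice; any other value falls back to the 4 ms default. The result is
++ * stored in sched_timeslice_ns as timeslice_ms << 20, i.e. in ~1 ms
++ * binary units.
++ */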
++
++/* Reschedule if less than this many ns (about 100 us) are left */
++#define RESCHED_NS (100 << 10)
++
++/**
++ * sched_yield_type - Choose what sort of yield sched_yield will perform.
++ * 0: No yield.
++ * 1: Deboost and requeue task. (default)
++ * 2: Set rq skip task.
++ */
++int sched_yield_type __read_mostly = 1;
++
++#ifdef CONFIG_SMP
++static cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
++
++DEFINE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_llc_mask);
++DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_topo_end_mask);
++
++#ifdef CONFIG_SCHED_SMT
++DEFINE_STATIC_KEY_FALSE(sched_smt_present);
++EXPORT_SYMBOL_GPL(sched_smt_present);
++#endif
++
++/*
++ * Keep a unique ID per domain (we use the first CPU's number in the cpumask of
++ * the domain); this allows us to quickly tell if two CPUs are in the same cache
++ * domain, see cpus_share_cache().
++ */
++DEFINE_PER_CPU(int, sd_llc_id);
++#endif /* CONFIG_SMP */
++
++static DEFINE_MUTEX(sched_hotcpu_mutex);
++
++DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++#ifdef CONFIG_SCHED_SMT
++static cpumask_t sched_sg_idle_mask ____cacheline_aligned_in_smp;
++#endif
++static cpumask_t sched_preempt_mask[SCHED_QUEUE_BITS] ____cacheline_aligned_in_smp;
++static cpumask_t *const sched_idle_mask = &sched_preempt_mask[0];
++
++/* task function */
++static inline const struct cpumask *task_user_cpus(struct task_struct *p)
++{
++ if (!p->user_cpus_ptr)
++ return cpu_possible_mask; /* &init_task.cpus_mask */
++ return p->user_cpus_ptr;
++}
++
++/* sched_queue related functions */
++static inline void sched_queue_init(struct sched_queue *q)
++{
++ int i;
++
++ bitmap_zero(q->bitmap, SCHED_QUEUE_BITS);
++ for(i = 0; i < SCHED_LEVELS; i++)
++ INIT_LIST_HEAD(&q->heads[i]);
++}
++
++/*
++ * Init idle task and put into queue structure of rq
++ * IMPORTANT: may be called multiple times for a single cpu
++ */
++static inline void sched_queue_init_idle(struct sched_queue *q,
++ struct task_struct *idle)
++{
++ idle->sq_idx = IDLE_TASK_SCHED_PRIO;
++ INIT_LIST_HEAD(&q->heads[idle->sq_idx]);
++ list_add(&idle->sq_node, &q->heads[idle->sq_idx]);
++}
++
++static inline void
++clear_recorded_preempt_mask(int pr, int low, int high, int cpu)
++{
++ if (low < pr && pr <= high)
++ cpumask_clear_cpu(cpu, sched_preempt_mask + SCHED_QUEUE_BITS - pr);
++}
++
++static inline void
++set_recorded_preempt_mask(int pr, int low, int high, int cpu)
++{
++ if (low < pr && pr <= high)
++ cpumask_set_cpu(cpu, sched_preempt_mask + SCHED_QUEUE_BITS - pr);
++}
++
++static atomic_t sched_prio_record = ATOMIC_INIT(0);
++
++/* water mark related functions */
++static inline void update_sched_preempt_mask(struct rq *rq)
++{
++ unsigned long prio = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
++ unsigned long last_prio = rq->prio;
++ int cpu, pr;
++
++ if (prio == last_prio)
++ return;
++
++ rq->prio = prio;
++ cpu = cpu_of(rq);
++ pr = atomic_read(&sched_prio_record);
++
++ if (prio < last_prio) {
++ if (IDLE_TASK_SCHED_PRIO == last_prio) {
++#ifdef CONFIG_SCHED_SMT
++ if (static_branch_likely(&sched_smt_present))
++ cpumask_andnot(&sched_sg_idle_mask,
++ &sched_sg_idle_mask, cpu_smt_mask(cpu));
++#endif
++ cpumask_clear_cpu(cpu, sched_idle_mask);
++ last_prio -= 2;
++ }
++ clear_recorded_preempt_mask(pr, prio, last_prio, cpu);
++
++ return;
++ }
++ /* last_prio < prio */
++ if (IDLE_TASK_SCHED_PRIO == prio) {
++#ifdef CONFIG_SCHED_SMT
++ if (static_branch_likely(&sched_smt_present) &&
++ cpumask_intersects(cpu_smt_mask(cpu), sched_idle_mask))
++ cpumask_or(&sched_sg_idle_mask,
++ &sched_sg_idle_mask, cpu_smt_mask(cpu));
++#endif
++ cpumask_set_cpu(cpu, sched_idle_mask);
++ prio -= 2;
++ }
++ set_recorded_preempt_mask(pr, last_prio, prio, cpu);
++}
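++
++/*
++ * Note: sched_preempt_mask[] caches, per priority level, the CPUs whose
++ * runqueues currently run at a lower priority than that level, with
++ * entry 0 (sched_idle_mask) tracking idle CPUs. update_sched_preempt_mask()
++ * keeps the recorded masks in sync when a runqueue's highest queued
++ * priority changes; select_task_rq() consults them to find a CPU that a
++ * waking task could preempt.
++ */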
++
++/*
++ * This routine assumes that the idle task is always in the queue
++ */
++static inline struct task_struct *sched_rq_first_task(struct rq *rq)
++{
++ const struct list_head *head = &rq->queue.heads[sched_prio2idx(rq->prio, rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++}
++
++static inline struct task_struct *
++sched_rq_next_task(struct task_struct *p, struct rq *rq)
++{
++ unsigned long idx = p->sq_idx;
++ struct list_head *head = &rq->queue.heads[idx];
++
++ if (list_is_last(&p->sq_node, head)) {
++ idx = find_next_bit(rq->queue.bitmap, SCHED_QUEUE_BITS,
++ sched_idx2prio(idx, rq) + 1);
++ head = &rq->queue.heads[sched_prio2idx(idx, rq)];
++
++ return list_first_entry(head, struct task_struct, sq_node);
++ }
++
++ return list_next_entry(p, sq_node);
++}
++
++static inline struct task_struct *rq_runnable_task(struct rq *rq)
++{
++ struct task_struct *next = sched_rq_first_task(rq);
++
++ if (unlikely(next == rq->skip))
++ next = sched_rq_next_task(next, rq);
++
++ return next;
++}
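++
++/*
++ * Note: the run queue is an array of per-priority lists plus a bitmap of
++ * non-empty levels, so picking the first/next runnable task is a
++ * find_first_bit()/find_next_bit() followed by a list walk; rq->skip is
++ * honoured by rq_runnable_task() for the "skip" yield mode.
++ */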
++
++/*
++ * Serialization rules:
++ *
++ * Lock order:
++ *
++ * p->pi_lock
++ * rq->lock
++ * hrtimer_cpu_base->lock (hrtimer_start() for bandwidth controls)
++ *
++ * rq1->lock
++ * rq2->lock where: rq1 < rq2
++ *
++ * Regular state:
++ *
++ * Normal scheduling state is serialized by rq->lock. __schedule() takes the
++ * local CPU's rq->lock, it optionally removes the task from the runqueue and
++ * always looks at the local rq data structures to find the most eligible task
++ * to run next.
++ *
++ * Task enqueue is also under rq->lock, possibly taken from another CPU.
++ * Wakeups from another LLC domain might use an IPI to transfer the enqueue to
++ * the local CPU to avoid bouncing the runqueue state around [ see
++ * ttwu_queue_wakelist() ]
++ *
++ * Task wakeup, specifically wakeups that involve migration, are horribly
++ * complicated to avoid having to take two rq->locks.
++ *
++ * Special state:
++ *
++ * System-calls and anything external will use task_rq_lock() which acquires
++ * both p->pi_lock and rq->lock. As a consequence the state they change is
++ * stable while holding either lock:
++ *
++ * - sched_setaffinity()/
++ * set_cpus_allowed_ptr(): p->cpus_ptr, p->nr_cpus_allowed
++ * - set_user_nice(): p->se.load, p->*prio
++ * - __sched_setscheduler(): p->sched_class, p->policy, p->*prio,
++ * p->se.load, p->rt_priority,
++ * p->dl.dl_{runtime, deadline, period, flags, bw, density}
++ * - sched_setnuma(): p->numa_preferred_nid
++ * - sched_move_task(): p->sched_task_group
++ * - uclamp_update_active() p->uclamp*
++ *
++ * p->state <- TASK_*:
++ *
++ * is changed locklessly using set_current_state(), __set_current_state() or
++ * set_special_state(), see their respective comments, or by
++ * try_to_wake_up(). This latter uses p->pi_lock to serialize against
++ * concurrent self.
++ *
++ * p->on_rq <- { 0, 1 = TASK_ON_RQ_QUEUED, 2 = TASK_ON_RQ_MIGRATING }:
++ *
++ * is set by activate_task() and cleared by deactivate_task(), under
++ * rq->lock. Non-zero indicates the task is runnable, the special
++ * ON_RQ_MIGRATING state is used for migration without holding both
++ * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock().
++ *
++ * p->on_cpu <- { 0, 1 }:
++ *
++ * is set by prepare_task() and cleared by finish_task() such that it will be
++ * set before p is scheduled-in and cleared after p is scheduled-out, both
++ * under rq->lock. Non-zero indicates the task is running on its CPU.
++ *
++ * [ The astute reader will observe that it is possible for two tasks on one
++ * CPU to have ->on_cpu = 1 at the same time. ]
++ *
++ * task_cpu(p): is changed by set_task_cpu(), the rules are:
++ *
++ * - Don't call set_task_cpu() on a blocked task:
++ *
++ * We don't care what CPU we're not running on, this simplifies hotplug,
++ * the CPU assignment of blocked tasks isn't required to be valid.
++ *
++ * - for try_to_wake_up(), called under p->pi_lock:
++ *
++ * This allows try_to_wake_up() to only take one rq->lock, see its comment.
++ *
++ * - for migration called under rq->lock:
++ * [ see task_on_rq_migrating() in task_rq_lock() ]
++ *
++ * o move_queued_task()
++ * o detach_task()
++ *
++ * - for migration called under double_rq_lock():
++ *
++ * o __migrate_swap_task()
++ * o push_rt_task() / pull_rt_task()
++ * o push_dl_task() / pull_dl_task()
++ * o dl_task_offline_migration()
++ *
++ */
++
++/*
++ * Context: p->pi_lock
++ */
++static inline struct rq
++*__task_access_lock(struct task_struct *p, raw_spinlock_t **plock)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock(&rq->lock);
++ if (likely((p->on_cpu || task_on_rq_queued(p))
++ && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ *plock = NULL;
++ return rq;
++ }
++ }
++}
++
++static inline void
++__task_access_unlock(struct task_struct *p, raw_spinlock_t *lock)
++{
++ if (NULL != lock)
++ raw_spin_unlock(lock);
++}
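++
++/*
++ * Typical usage (sketch, with p->pi_lock held):
++ *
++ *	raw_spinlock_t *lock;
++ *	struct rq *rq = __task_access_lock(p, &lock);
++ *	... inspect or modify p's scheduling state ...
++ *	__task_access_unlock(p, lock);
++ *
++ * lock is rq->lock while p is queued or running, or NULL for a blocked task.
++ */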
++
++static inline struct rq
++*task_access_lock_irqsave(struct task_struct *p, raw_spinlock_t **plock,
++ unsigned long *flags)
++{
++ struct rq *rq;
++ for (;;) {
++ rq = task_rq(p);
++ if (p->on_cpu || task_on_rq_queued(p)) {
++ raw_spin_lock_irqsave(&rq->lock, *flags);
++ if (likely((p->on_cpu || task_on_rq_queued(p))
++ && rq == task_rq(p))) {
++ *plock = &rq->lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, *flags);
++ } else if (task_on_rq_migrating(p)) {
++ do {
++ cpu_relax();
++ } while (unlikely(task_on_rq_migrating(p)));
++ } else {
++ raw_spin_lock_irqsave(&p->pi_lock, *flags);
++ if (likely(!p->on_cpu && !p->on_rq &&
++ rq == task_rq(p))) {
++ *plock = &p->pi_lock;
++ return rq;
++ }
++ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
++ }
++ }
++}
++
++static inline void
++task_access_unlock_irqrestore(struct task_struct *p, raw_spinlock_t *lock,
++ unsigned long *flags)
++{
++ raw_spin_unlock_irqrestore(lock, *flags);
++}
++
++/*
++ * __task_rq_lock - lock the rq @p resides on.
++ */
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ lockdep_assert_held(&p->pi_lock);
++
++ for (;;) {
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
++ return rq;
++ raw_spin_unlock(&rq->lock);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++/*
++ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
++ */
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ for (;;) {
++ raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
++ rq = task_rq(p);
++ raw_spin_lock(&rq->lock);
++ /*
++ * move_queued_task() task_rq_lock()
++ *
++ * ACQUIRE (rq->lock)
++ * [S] ->on_rq = MIGRATING [L] rq = task_rq()
++ * WMB (__set_task_cpu()) ACQUIRE (rq->lock);
++ * [S] ->cpu = new_cpu [L] task_rq()
++ * [L] ->on_rq
++ * RELEASE (rq->lock)
++ *
++ * If we observe the old CPU in task_rq_lock(), the acquire of
++ * the old rq->lock will fully serialize against the stores.
++ *
++ * If we observe the new CPU in task_rq_lock(), the address
++ * dependency headed by '[L] rq = task_rq()' and the acquire
++ * will pair with the WMB to ensure we then also see migrating.
++ */
++ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
++ return rq;
++ }
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++
++ while (unlikely(task_on_rq_migrating(p)))
++ cpu_relax();
++ }
++}
++
++static inline void
++rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irqsave(&rq->lock, rf->flags);
++}
++
++static inline void
++rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
++}
++
++void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
++{
++ raw_spinlock_t *lock;
++
++ /* Matches synchronize_rcu() in __sched_core_enable() */
++ preempt_disable();
++
++ for (;;) {
++ lock = __rq_lockp(rq);
++ raw_spin_lock_nested(lock, subclass);
++ if (likely(lock == __rq_lockp(rq))) {
++ /* preempt_count *MUST* be > 1 */
++ preempt_enable_no_resched();
++ return;
++ }
++ raw_spin_unlock(lock);
++ }
++}
++
++void raw_spin_rq_unlock(struct rq *rq)
++{
++ raw_spin_unlock(rq_lockp(rq));
++}
++
++/*
++ * RQ-clock updating methods:
++ */
++
++static void update_rq_clock_task(struct rq *rq, s64 delta)
++{
++/*
++ * In theory, the compiler should just see 0 here, and optimize out the call
++ * to sched_rt_avg_update. But I don't trust it...
++ */
++ s64 __maybe_unused steal = 0, irq_delta = 0;
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
++
++ /*
++ * Since irq_time is only updated on {soft,}irq_exit, we might run into
++ * this case when a previous update_rq_clock() happened inside a
++ * {soft,}irq region.
++ *
++ * When this happens, we stop ->clock_task and only update the
++ * prev_irq_time stamp to account for the part that fit, so that a next
++ * update will consume the rest. This ensures ->clock_task is
++ * monotonic.
++ *
++ * It does, however, cause some slight mis-attribution of {soft,}irq
++ * time, a more accurate solution would be to update the irq_time using
++ * the current rq->clock timestamp, except that would require using
++ * atomic ops.
++ */
++ if (irq_delta > delta)
++ irq_delta = delta;
++
++ rq->prev_irq_time += irq_delta;
++ delta -= irq_delta;
++ delayacct_irq(rq->curr, irq_delta);
++#endif
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++	if (static_key_false((&paravirt_steal_rq_enabled))) {
++ steal = paravirt_steal_clock(cpu_of(rq));
++ steal -= rq->prev_steal_time_rq;
++
++ if (unlikely(steal > delta))
++ steal = delta;
++
++ rq->prev_steal_time_rq += steal;
++ delta -= steal;
++ }
++#endif
++
++ rq->clock_task += delta;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ if ((irq_delta + steal))
++ update_irq_load_avg(rq, irq_delta + steal);
++#endif
++}
++
++static inline void update_rq_clock(struct rq *rq)
++{
++ s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
++
++ if (unlikely(delta <= 0))
++ return;
++ rq->clock += delta;
++ update_rq_time_edge(rq);
++ update_rq_clock_task(rq, delta);
++}
++
++/*
++ * RQ Load update routine
++ */
++#define RQ_LOAD_HISTORY_BITS (sizeof(s32) * 8ULL)
++#define RQ_UTIL_SHIFT (8)
++#define RQ_LOAD_HISTORY_TO_UTIL(l) (((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)
++
++#define LOAD_BLOCK(t) ((t) >> 17)
++#define LOAD_HALF_BLOCK(t) ((t) >> 16)
++#define BLOCK_MASK(t) ((t) & ((0x01 << 18) - 1))
++#define LOAD_BLOCK_BIT(b) (1UL << (RQ_LOAD_HISTORY_BITS - 1 - (b)))
++#define CURRENT_LOAD_BIT LOAD_BLOCK_BIT(0)
++
++static inline void rq_load_update(struct rq *rq)
++{
++ u64 time = rq->clock;
++ u64 delta = min(LOAD_BLOCK(time) - LOAD_BLOCK(rq->load_stamp),
++ RQ_LOAD_HISTORY_BITS - 1);
++ u64 prev = !!(rq->load_history & CURRENT_LOAD_BIT);
++ u64 curr = !!rq->nr_running;
++
++ if (delta) {
++ rq->load_history = rq->load_history >> delta;
++
++ if (delta < RQ_UTIL_SHIFT) {
++ rq->load_block += (~BLOCK_MASK(rq->load_stamp)) * prev;
++ if (!!LOAD_HALF_BLOCK(rq->load_block) ^ curr)
++ rq->load_history ^= LOAD_BLOCK_BIT(delta);
++ }
++
++ rq->load_block = BLOCK_MASK(time) * prev;
++ } else {
++ rq->load_block += (time - rq->load_stamp) * prev;
++ }
++ if (prev ^ curr)
++ rq->load_history ^= CURRENT_LOAD_BIT;
++ rq->load_stamp = time;
++}
++
++unsigned long rq_load_util(struct rq *rq, unsigned long max)
++{
++ return RQ_LOAD_HISTORY_TO_UTIL(rq->load_history) * (max >> RQ_UTIL_SHIFT);
++}
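++
++/*
++ * Note: rq->load_history is a 32-bit sliding window recording whether the
++ * runqueue had runnable tasks during recent fixed-size time blocks
++ * (LOAD_BLOCK() buckets of rq->clock). rq_load_util() converts the most
++ * recent bits into a utilization value in the range 0..max; sched_cpu_util()
++ * reports it scaled to the CPU's capacity.
++ */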
++
++#ifdef CONFIG_SMP
++unsigned long sched_cpu_util(int cpu)
++{
++ return rq_load_util(cpu_rq(cpu), arch_scale_cpu_capacity(cpu));
++}
++#endif /* CONFIG_SMP */
++
++#ifdef CONFIG_CPU_FREQ
++/**
++ * cpufreq_update_util - Take a note about CPU utilization changes.
++ * @rq: Runqueue to carry out the update for.
++ * @flags: Update reason flags.
++ *
++ * This function is called by the scheduler on the CPU whose utilization is
++ * being updated.
++ *
++ * It can only be called from RCU-sched read-side critical sections.
++ *
++ * The way cpufreq is currently arranged requires it to evaluate the CPU
++ * performance state (frequency/voltage) on a regular basis to prevent it from
++ * being stuck in a completely inadequate performance level for too long.
++ * That is not guaranteed to happen if the updates are only triggered from CFS
++ * and DL, though, because they may not be coming in if only RT tasks are
++ * active all the time (or there are RT tasks only).
++ *
++ * As a workaround for that issue, this function is called periodically by the
++ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
++ * but that really is a band-aid. Going forward it should be replaced with
++ * solutions targeted more specifically at RT tasks.
++ */
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++ struct update_util_data *data;
++
++#ifdef CONFIG_SMP
++ rq_load_update(rq);
++#endif
++ data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
++ cpu_of(rq)));
++ if (data)
++ data->func(data, rq_clock(rq), flags);
++}
++#else
++static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
++{
++#ifdef CONFIG_SMP
++ rq_load_update(rq);
++#endif
++}
++#endif /* CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++/*
++ * Tick may be needed by tasks in the runqueue depending on their policy and
++ * requirements. If tick is needed, lets send the target an IPI to kick it out
++ * of nohz mode if necessary.
++ */
++static inline void sched_update_tick_dependency(struct rq *rq)
++{
++ int cpu = cpu_of(rq);
++
++ if (!tick_nohz_full_cpu(cpu))
++ return;
++
++ if (rq->nr_running < 2)
++ tick_nohz_dep_clear_cpu(cpu, TICK_DEP_BIT_SCHED);
++ else
++ tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
++}
++#else /* !CONFIG_NO_HZ_FULL */
++static inline void sched_update_tick_dependency(struct rq *rq) { }
++#endif
++
++bool sched_task_on_rq(struct task_struct *p)
++{
++ return task_on_rq_queued(p);
++}
++
++unsigned long get_wchan(struct task_struct *p)
++{
++ unsigned long ip = 0;
++ unsigned int state;
++
++ if (!p || p == current)
++ return 0;
++
++ /* Only get wchan if task is blocked and we can keep it that way. */
++ raw_spin_lock_irq(&p->pi_lock);
++ state = READ_ONCE(p->__state);
++ smp_rmb(); /* see try_to_wake_up() */
++ if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
++ ip = __get_wchan(p);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ return ip;
++}
++
++/*
++ * Add/Remove/Requeue task to/from the runqueue routines
++ * Context: rq->lock
++ */
++#define __SCHED_DEQUEUE_TASK(p, rq, flags, func) \
++ sched_info_dequeue(rq, p); \
++ \
++ list_del(&p->sq_node); \
++ if (list_empty(&rq->queue.heads[p->sq_idx])) { \
++ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap); \
++ func; \
++ }
++
++#define __SCHED_ENQUEUE_TASK(p, rq, flags) \
++ sched_info_enqueue(rq, p); \
++ \
++ p->sq_idx = task_sched_prio_idx(p, rq); \
++ list_add_tail(&p->sq_node, &rq->queue.heads[p->sq_idx]); \
++ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++
++static inline void dequeue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: dequeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: dequeue task reside on cpu%d from cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++#endif
++
++ __SCHED_DEQUEUE_TASK(p, rq, flags, update_sched_preempt_mask(rq));
++ --rq->nr_running;
++#ifdef CONFIG_SMP
++ if (1 == rq->nr_running)
++ cpumask_clear_cpu(cpu_of(rq), &sched_rq_pending_mask);
++#endif
++
++ sched_update_tick_dependency(rq);
++}
++
++static inline void enqueue_task(struct task_struct *p, struct rq *rq, int flags)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++
++ /*printk(KERN_INFO "sched: enqueue(%d) %px %d\n", cpu_of(rq), p, p->prio);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: enqueue task reside on cpu%d to cpu%d\n",
++ task_cpu(p), cpu_of(rq));
++#endif
++
++ __SCHED_ENQUEUE_TASK(p, rq, flags);
++ update_sched_preempt_mask(rq);
++ ++rq->nr_running;
++#ifdef CONFIG_SMP
++ if (2 == rq->nr_running)
++ cpumask_set_cpu(cpu_of(rq), &sched_rq_pending_mask);
++#endif
++
++ sched_update_tick_dependency(rq);
++}
++
++static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx)
++{
++#ifdef ALT_SCHED_DEBUG
++ lockdep_assert_held(&rq->lock);
++ /*printk(KERN_INFO "sched: requeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
++ WARN_ONCE(task_rq(p) != rq, "sched: cpu[%d] requeue task reside on cpu%d\n",
++ cpu_of(rq), task_cpu(p));
++#endif
++
++ list_del(&p->sq_node);
++ list_add_tail(&p->sq_node, &rq->queue.heads[idx]);
++ if (idx != p->sq_idx) {
++ if (list_empty(&rq->queue.heads[p->sq_idx]))
++ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++ p->sq_idx = idx;
++ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
++ update_sched_preempt_mask(rq);
++ }
++}
++
++/*
++ * cmpxchg based fetch_or, macro so it works for different integer types
++ */
++#define fetch_or(ptr, mask) \
++ ({ \
++ typeof(ptr) _ptr = (ptr); \
++ typeof(mask) _mask = (mask); \
++ typeof(*_ptr) _val = *_ptr; \
++ \
++ do { \
++ } while (!try_cmpxchg(_ptr, &_val, _val | _mask)); \
++ _val; \
++})
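++
++/*
++ * e.g. fetch_or(&ti->flags, _TIF_NEED_RESCHED) atomically sets the flag and
++ * returns the previous flags value, which is how set_nr_and_not_polling()
++ * below decides whether an IPI is needed.
++ */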
++
++#if defined(CONFIG_SMP) && defined(TIF_POLLING_NRFLAG)
++/*
++ * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
++ * this avoids any races wrt polling state changes and thereby avoids
++ * spurious IPIs.
++ */
++static inline bool set_nr_and_not_polling(struct task_struct *p)
++{
++ struct thread_info *ti = task_thread_info(p);
++ return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
++}
++
++/*
++ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
++ *
++ * If this returns true, then the idle task promises to call
++ * sched_ttwu_pending() and reschedule soon.
++ */
++static bool set_nr_if_polling(struct task_struct *p)
++{
++ struct thread_info *ti = task_thread_info(p);
++ typeof(ti->flags) val = READ_ONCE(ti->flags);
++
++ for (;;) {
++ if (!(val & _TIF_POLLING_NRFLAG))
++ return false;
++ if (val & _TIF_NEED_RESCHED)
++ return true;
++ if (try_cmpxchg(&ti->flags, &val, val | _TIF_NEED_RESCHED))
++ break;
++ }
++ return true;
++}
++
++#else
++static inline bool set_nr_and_not_polling(struct task_struct *p)
++{
++ set_tsk_need_resched(p);
++ return true;
++}
++
++#ifdef CONFIG_SMP
++static inline bool set_nr_if_polling(struct task_struct *p)
++{
++ return false;
++}
++#endif
++#endif
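++
++/*
++ * Summary: set_nr_and_not_polling() returns true when the target CPU must
++ * be IPI'd (it is not polling need_resched); set_nr_if_polling() only sets
++ * TIF_NEED_RESCHED when the remote idle task is polling, in which case no
++ * IPI is required at all.
++ */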
++
++static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ struct wake_q_node *node = &task->wake_q;
++
++ /*
++ * Atomically grab the task, if ->wake_q is !nil already it means
++ * it's already queued (either by us or someone else) and will get the
++ * wakeup due to that.
++ *
++ * In order to ensure that a pending wakeup will observe our pending
++ * state, even in the failed case, an explicit smp_mb() must be used.
++ */
++ smp_mb__before_atomic();
++ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
++ return false;
++
++ /*
++ * The head is context local, there can be no concurrency.
++ */
++ *head->lastp = node;
++ head->lastp = &node->next;
++ return true;
++}
++
++/**
++ * wake_q_add() - queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ */
++void wake_q_add(struct wake_q_head *head, struct task_struct *task)
++{
++ if (__wake_q_add(head, task))
++ get_task_struct(task);
++}
++
++/**
++ * wake_q_add_safe() - safely queue a wakeup for 'later' waking.
++ * @head: the wake_q_head to add @task to
++ * @task: the task to queue for 'later' wakeup
++ *
++ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
++ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
++ * instantly.
++ *
++ * This function must be used as-if it were wake_up_process(); IOW the task
++ * must be ready to be woken at this location.
++ *
++ * This function is essentially a task-safe equivalent to wake_q_add(). Callers
++ * that already hold reference to @task can call the 'safe' version and trust
++ * wake_q to do the right thing depending whether or not the @task is already
++ * queued for wakeup.
++ */
++void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
++{
++ if (!__wake_q_add(head, task))
++ put_task_struct(task);
++}
++
++void wake_up_q(struct wake_q_head *head)
++{
++ struct wake_q_node *node = head->first;
++
++ while (node != WAKE_Q_TAIL) {
++ struct task_struct *task;
++
++ task = container_of(node, struct task_struct, wake_q);
++ /* task can safely be re-inserted now: */
++ node = node->next;
++ task->wake_q.next = NULL;
++
++ /*
++ * wake_up_process() executes a full barrier, which pairs with
++ * the queueing in wake_q_add() so as not to miss wakeups.
++ */
++ wake_up_process(task);
++ put_task_struct(task);
++ }
++}
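++
++/*
++ * Typical wake_q usage (sketch):
++ *
++ *	DEFINE_WAKE_Q(wake_q);
++ *
++ *	lock(); wake_q_add(&wake_q, p); unlock();
++ *	wake_up_q(&wake_q);
++ *
++ * i.e. wakeups are batched while a lock is held and issued after it is
++ * dropped.
++ */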
++
++/*
++ * resched_curr - mark rq's current task 'to be rescheduled now'.
++ *
++ * On UP this means the setting of the need_resched flag, on SMP it
++ * might also involve a cross-CPU call to trigger the scheduler on
++ * the target CPU.
++ */
++void resched_curr(struct rq *rq)
++{
++ struct task_struct *curr = rq->curr;
++ int cpu;
++
++ lockdep_assert_held(&rq->lock);
++
++ if (test_tsk_need_resched(curr))
++ return;
++
++ cpu = cpu_of(rq);
++ if (cpu == smp_processor_id()) {
++ set_tsk_need_resched(curr);
++ set_preempt_need_resched();
++ return;
++ }
++
++ if (set_nr_and_not_polling(curr))
++ smp_send_reschedule(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++void resched_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (cpu_online(cpu) || cpu == smp_processor_id())
++ resched_curr(cpu_rq(cpu));
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++}
++
++#ifdef CONFIG_SMP
++#ifdef CONFIG_NO_HZ_COMMON
++void nohz_balance_enter_idle(int cpu) {}
++
++void select_nohz_load_balancer(int stop_tick) {}
++
++void set_cpu_sd_state_idle(void) {}
++
++/*
++ * In the semi idle case, use the nearest busy CPU for migrating timers
++ * from an idle CPU. This is good for power-savings.
++ *
++ * We don't do a similar optimization for a completely idle system, as
++ * selecting an idle CPU would add more delay to the timers than intended
++ * (as that CPU's timer base may not be up to date wrt jiffies etc).
++ */
++int get_nohz_timer_target(void)
++{
++ int i, cpu = smp_processor_id(), default_cpu = -1;
++ struct cpumask *mask;
++ const struct cpumask *hk_mask;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TIMER)) {
++ if (!idle_cpu(cpu))
++ return cpu;
++ default_cpu = cpu;
++ }
++
++ hk_mask = housekeeping_cpumask(HK_TYPE_TIMER);
++
++ for (mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++ mask < per_cpu(sched_cpu_topo_end_mask, cpu); mask++)
++ for_each_cpu_and(i, mask, hk_mask)
++ if (!idle_cpu(i))
++ return i;
++
++ if (default_cpu == -1)
++ default_cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
++ cpu = default_cpu;
++
++ return cpu;
++}
++
++/*
++ * When add_timer_on() enqueues a timer into the timer wheel of an
++ * idle CPU then this timer might expire before the next timer event
++ * which is scheduled to wake up that CPU. In case of a completely
++ * idle system the next event might even be infinite time into the
++ * future. wake_up_idle_cpu() ensures that the CPU is woken up and
++ * leaves the inner idle loop so the newly added timer is taken into
++ * account when the CPU goes back to idle and evaluates the timer
++ * wheel for the next timer event.
++ */
++static inline void wake_up_idle_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (cpu == smp_processor_id())
++ return;
++
++ if (set_nr_and_not_polling(rq->idle))
++ smp_send_reschedule(cpu);
++ else
++ trace_sched_wake_idle_without_ipi(cpu);
++}
++
++static inline bool wake_up_full_nohz_cpu(int cpu)
++{
++ /*
++ * We just need the target to call irq_exit() and re-evaluate
++ * the next tick. The nohz full kick at least implies that.
++ * If needed we can still optimize that later with an
++ * empty IRQ.
++ */
++ if (cpu_is_offline(cpu))
++ return true; /* Don't try to wake offline CPUs. */
++ if (tick_nohz_full_cpu(cpu)) {
++ if (cpu != smp_processor_id() ||
++ tick_nohz_tick_stopped())
++ tick_nohz_full_kick_cpu(cpu);
++ return true;
++ }
++
++ return false;
++}
++
++void wake_up_nohz_cpu(int cpu)
++{
++ if (!wake_up_full_nohz_cpu(cpu))
++ wake_up_idle_cpu(cpu);
++}
++
++static void nohz_csd_func(void *info)
++{
++ struct rq *rq = info;
++ int cpu = cpu_of(rq);
++ unsigned int flags;
++
++ /*
++ * Release the rq::nohz_csd.
++ */
++ flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
++ WARN_ON(!(flags & NOHZ_KICK_MASK));
++
++ rq->idle_balance = idle_cpu(cpu);
++ if (rq->idle_balance && !need_resched()) {
++ rq->nohz_idle_balance = flags;
++ raise_softirq_irqoff(SCHED_SOFTIRQ);
++ }
++}
++
++#endif /* CONFIG_NO_HZ_COMMON */
++#endif /* CONFIG_SMP */
++
++static inline void check_preempt_curr(struct rq *rq)
++{
++ if (sched_rq_first_task(rq) != rq->curr)
++ resched_curr(rq);
++}
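++
++/*
++ * Note: with a single ordered queue per runqueue, the preemption check is
++ * simply "is the highest-priority queued task someone other than rq->curr".
++ */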
++
++#ifdef CONFIG_SCHED_HRTICK
++/*
++ * Use HR-timers to deliver accurate preemption points.
++ */
++
++static void hrtick_clear(struct rq *rq)
++{
++ if (hrtimer_active(&rq->hrtick_timer))
++ hrtimer_cancel(&rq->hrtick_timer);
++}
++
++/*
++ * High-resolution timer tick.
++ * Runs from hardirq context with interrupts disabled.
++ */
++static enum hrtimer_restart hrtick(struct hrtimer *timer)
++{
++ struct rq *rq = container_of(timer, struct rq, hrtick_timer);
++
++ WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
++
++ raw_spin_lock(&rq->lock);
++ resched_curr(rq);
++ raw_spin_unlock(&rq->lock);
++
++ return HRTIMER_NORESTART;
++}
++
++/*
++ * Use hrtick when:
++ * - enabled by features
++ * - hrtimer is actually high res
++ */
++static inline int hrtick_enabled(struct rq *rq)
++{
++	/*
++	 * Alt schedule FW doesn't support sched_feat yet:
++	 *
++	 *	if (!sched_feat(HRTICK))
++	 *		return 0;
++	 */
++ if (!cpu_active(cpu_of(rq)))
++ return 0;
++ return hrtimer_is_hres_active(&rq->hrtick_timer);
++}
++
++#ifdef CONFIG_SMP
++
++static void __hrtick_restart(struct rq *rq)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ ktime_t time = rq->hrtick_time;
++
++ hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
++}
++
++/*
++ * called from hardirq (IPI) context
++ */
++static void __hrtick_start(void *arg)
++{
++ struct rq *rq = arg;
++
++ raw_spin_lock(&rq->lock);
++ __hrtick_restart(rq);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Called to set the hrtick timer state.
++ *
++ * called with rq->lock held and irqs disabled
++ */
++void hrtick_start(struct rq *rq, u64 delay)
++{
++ struct hrtimer *timer = &rq->hrtick_timer;
++ s64 delta;
++
++ /*
++ * Don't schedule slices shorter than 10000ns, that just
++ * doesn't make sense and can cause timer DoS.
++ */
++ delta = max_t(s64, delay, 10000LL);
++
++ rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
++
++ if (rq == this_rq())
++ __hrtick_restart(rq);
++ else
++ smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
++}
++
++#else
++/*
++ * Called to set the hrtick timer state.
++ *
++ * called with rq->lock held and irqs disabled
++ */
++void hrtick_start(struct rq *rq, u64 delay)
++{
++ /*
++ * Don't schedule slices shorter than 10000ns, that just
++ * doesn't make sense. Rely on vruntime for fairness.
++ */
++ delay = max_t(u64, delay, 10000LL);
++ hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
++ HRTIMER_MODE_REL_PINNED_HARD);
++}
++#endif /* CONFIG_SMP */
++
++static void hrtick_rq_init(struct rq *rq)
++{
++#ifdef CONFIG_SMP
++ INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
++#endif
++
++ hrtimer_init(&rq->hrtick_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
++ rq->hrtick_timer.function = hrtick;
++}
++#else /* CONFIG_SCHED_HRTICK */
++static inline int hrtick_enabled(struct rq *rq)
++{
++ return 0;
++}
++
++static inline void hrtick_clear(struct rq *rq)
++{
++}
++
++static inline void hrtick_rq_init(struct rq *rq)
++{
++}
++#endif /* CONFIG_SCHED_HRTICK */
++
++static inline int __normal_prio(int policy, int rt_prio, int static_prio)
++{
++ return rt_policy(policy) ? (MAX_RT_PRIO - 1 - rt_prio) :
++ static_prio + MAX_PRIORITY_ADJ;
++}
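++
++/*
++ * e.g. an RT task with rt_priority 99 maps to prio 0 (MAX_RT_PRIO - 1 - 99),
++ * while a SCHED_NORMAL task gets static_prio plus the BMQ/PDS adjustment
++ * MAX_PRIORITY_ADJ defined in the alternative scheduler headers.
++ */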
++
++/*
++ * Calculate the expected normal priority: i.e. priority
++ * without taking RT-inheritance into account. Might be
++ * boosted by interactivity modifiers. Changes upon fork,
++ * setprio syscalls, and whenever the interactivity
++ * estimator recalculates.
++ */
++static inline int normal_prio(struct task_struct *p)
++{
++ return __normal_prio(p->policy, p->rt_priority, p->static_prio);
++}
++
++/*
++ * Calculate the current priority, i.e. the priority
++ * taken into account by the scheduler. This value might
++ * be boosted by RT tasks as it will be RT if the task got
++ * RT-boosted. If not then it returns p->normal_prio.
++ */
++static int effective_prio(struct task_struct *p)
++{
++ p->normal_prio = normal_prio(p);
++ /*
++ * If we are RT tasks or we were boosted to RT priority,
++ * keep the priority unchanged. Otherwise, update priority
++ * to the normal priority:
++ */
++ if (!rt_prio(p->prio))
++ return p->normal_prio;
++ return p->prio;
++}
++
++/*
++ * activate_task - move a task to the runqueue.
++ *
++ * Context: rq->lock
++ */
++static void activate_task(struct task_struct *p, struct rq *rq)
++{
++ enqueue_task(p, rq, ENQUEUE_WAKEUP);
++ p->on_rq = TASK_ON_RQ_QUEUED;
++
++ /*
++ * If in_iowait is set, the code below may not trigger any cpufreq
++ * utilization updates, so do it here explicitly with the IOWAIT flag
++ * passed.
++ */
++ cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT * p->in_iowait);
++}
++
++/*
++ * deactivate_task - remove a task from the runqueue.
++ *
++ * Context: rq->lock
++ */
++static inline void deactivate_task(struct task_struct *p, struct rq *rq)
++{
++ dequeue_task(p, rq, DEQUEUE_SLEEP);
++ p->on_rq = 0;
++ cpufreq_update_util(rq, 0);
++}
++
++static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
++{
++#ifdef CONFIG_SMP
++ /*
++ * After ->cpu is set up to a new value, task_access_lock(p, ...) can be
++ * successfully executed on another CPU. We must ensure that updates of
++ * per-task data have been completed by this moment.
++ */
++ smp_wmb();
++
++ WRITE_ONCE(task_thread_info(p)->cpu, cpu);
++#endif
++}
++
++static inline bool is_migration_disabled(struct task_struct *p)
++{
++#ifdef CONFIG_SMP
++ return p->migration_disabled;
++#else
++ return false;
++#endif
++}
++
++#define SCA_CHECK 0x01
++#define SCA_USER 0x08
++
++#ifdef CONFIG_SMP
++
++void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
++{
++#ifdef CONFIG_SCHED_DEBUG
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * We should never call set_task_cpu() on a blocked task,
++ * ttwu() will sort out the placement.
++ */
++ WARN_ON_ONCE(state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq);
++
++#ifdef CONFIG_LOCKDEP
++ /*
++ * The caller should hold either p->pi_lock or rq->lock, when changing
++ * a task's CPU. ->pi_lock for waking tasks, rq->lock for runnable tasks.
++ *
++ * sched_move_task() holds both and thus holding either pins the cgroup,
++ * see task_group().
++ */
++ WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
++ lockdep_is_held(&task_rq(p)->lock)));
++#endif
++ /*
++ * Clearly, migrating tasks to offline CPUs is a fairly daft thing.
++ */
++ WARN_ON_ONCE(!cpu_online(new_cpu));
++
++ WARN_ON_ONCE(is_migration_disabled(p));
++#endif
++ trace_sched_migrate_task(p, new_cpu);
++
++ if (task_cpu(p) != new_cpu)
++ {
++ rseq_migrate(p);
++ perf_event_task_migrate(p);
++ }
++
++ __set_task_cpu(p, new_cpu);
++}
++
++#define MDF_FORCE_ENABLED 0x80
++
++static void
++__do_set_cpus_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ /*
++ * This here violates the locking rules for affinity, since we're only
++ * supposed to change these variables while holding both rq->lock and
++ * p->pi_lock.
++ *
++ * HOWEVER, it magically works, because ttwu() is the only code that
++ * accesses these variables under p->pi_lock and only does so after
++ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule()
++ * before finish_task().
++ *
++ * XXX do further audits, this smells like something putrid.
++ */
++ SCHED_WARN_ON(!p->on_cpu);
++ p->cpus_ptr = new_mask;
++}
++
++void migrate_disable(void)
++{
++ struct task_struct *p = current;
++ int cpu;
++
++ if (p->migration_disabled) {
++ p->migration_disabled++;
++ return;
++ }
++
++ preempt_disable();
++ cpu = smp_processor_id();
++ if (cpumask_test_cpu(cpu, &p->cpus_mask)) {
++ cpu_rq(cpu)->nr_pinned++;
++ p->migration_disabled = 1;
++ p->migration_flags &= ~MDF_FORCE_ENABLED;
++
++ /*
++ * Violates locking rules! see comment in __do_set_cpus_ptr().
++ */
++ if (p->cpus_ptr == &p->cpus_mask)
++ __do_set_cpus_ptr(p, cpumask_of(cpu));
++ }
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(migrate_disable);
++
++void migrate_enable(void)
++{
++ struct task_struct *p = current;
++
++ if (0 == p->migration_disabled)
++ return;
++
++ if (p->migration_disabled > 1) {
++ p->migration_disabled--;
++ return;
++ }
++
++ if (WARN_ON_ONCE(!p->migration_disabled))
++ return;
++
++ /*
++ * Ensure stop_task runs either before or after this, and that
++ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
++ */
++ preempt_disable();
++ /*
++ * Assumption: current should be running on allowed cpu
++ */
++ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &p->cpus_mask));
++ if (p->cpus_ptr != &p->cpus_mask)
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ /*
++ * Mustn't clear migration_disabled() until cpus_ptr points back at the
++ * regular cpus_mask, otherwise things that race (eg.
++ * select_fallback_rq) get confused.
++ */
++ barrier();
++ p->migration_disabled = 0;
++ this_rq()->nr_pinned--;
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(migrate_enable);
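++
++/*
++ * Note: migrate_disable() pins the current task to its CPU by temporarily
++ * pointing p->cpus_ptr at cpumask_of(cpu) and bumping rq->nr_pinned;
++ * migrate_enable() restores p->cpus_mask and drops the pin count.
++ */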
++
++static inline bool rq_has_pinned_tasks(struct rq *rq)
++{
++ return rq->nr_pinned;
++}
++
++/*
++ * Per-CPU kthreads are allowed to run on !active && online CPUs, see
++ * __set_cpus_allowed_ptr() and select_fallback_rq().
++ */
++static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
++{
++ /* When not in the task's cpumask, no point in looking further. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /* migrate_disabled() must be allowed to finish. */
++ if (is_migration_disabled(p))
++ return cpu_online(cpu);
++
++ /* Non kernel threads are not allowed during either online or offline. */
++ if (!(p->flags & PF_KTHREAD))
++ return cpu_active(cpu) && task_cpu_possible(cpu, p);
++
++ /* KTHREAD_IS_PER_CPU is always allowed. */
++ if (kthread_is_per_cpu(p))
++ return cpu_online(cpu);
++
++ /* Regular kernel threads don't get to stay during offline. */
++ if (cpu_dying(cpu))
++ return false;
++
++ /* But are allowed during online. */
++ return cpu_online(cpu);
++}
++
++/*
++ * This is how migration works:
++ *
++ * 1) we invoke migration_cpu_stop() on the target CPU using
++ * stop_one_cpu().
++ * 2) stopper starts to run (implicitly forcing the migrated thread
++ * off the CPU)
++ * 3) it checks whether the migrated task is still in the wrong runqueue.
++ * 4) if it's in the wrong runqueue then the migration thread removes
++ * it and puts it into the right queue.
++ * 5) stopper completes and stop_one_cpu() returns and the migration
++ * is done.
++ */
++
++/*
++ * move_queued_task - move a queued task to new rq.
++ *
++ * Returns (locked) new rq. Old rq's lock is released.
++ */
++static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int
++ new_cpu)
++{
++ int src_cpu;
++
++ lockdep_assert_held(&rq->lock);
++
++ src_cpu = cpu_of(rq);
++ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
++ dequeue_task(p, rq, 0);
++ set_task_cpu(p, new_cpu);
++ raw_spin_unlock(&rq->lock);
++
++ rq = cpu_rq(new_cpu);
++
++ raw_spin_lock(&rq->lock);
++ WARN_ON_ONCE(task_cpu(p) != new_cpu);
++
++ sched_mm_cid_migrate_to(rq, p, src_cpu);
++
++ sched_task_sanity_check(p, rq);
++ enqueue_task(p, rq, 0);
++ p->on_rq = TASK_ON_RQ_QUEUED;
++ check_preempt_curr(rq);
++
++ return rq;
++}
++
++struct migration_arg {
++ struct task_struct *task;
++ int dest_cpu;
++};
++
++/*
++ * Move (not current) task off this CPU, onto the destination CPU. We're doing
++ * this because either it can't run here any more (set_cpus_allowed()
++ * away from this CPU, or CPU going down), or because we're
++ * attempting to rebalance this task on exec (sched_exec).
++ *
++ * So we race with normal scheduler movements, but that's OK, as long
++ * as the task is no longer on this CPU.
++ */
++static struct rq *__migrate_task(struct rq *rq, struct task_struct *p, int
++ dest_cpu)
++{
++ /* Affinity changed (again). */
++ if (!is_cpu_allowed(p, dest_cpu))
++ return rq;
++
++ update_rq_clock(rq);
++ return move_queued_task(rq, p, dest_cpu);
++}
++
++/*
++ * migration_cpu_stop - this will be executed by a highprio stopper thread
++ * and performs thread migration by bumping thread off CPU then
++ * 'pushing' onto another runqueue.
++ */
++static int migration_cpu_stop(void *data)
++{
++ struct migration_arg *arg = data;
++ struct task_struct *p = arg->task;
++ struct rq *rq = this_rq();
++ unsigned long flags;
++
++ /*
++ * The original target CPU might have gone down and we might
++ * be on another CPU but it doesn't matter.
++ */
++ local_irq_save(flags);
++ /*
++ * We need to explicitly wake pending tasks before running
++ * __migrate_task() such that we will not miss enforcing cpus_ptr
++ * during wakeups, see set_cpus_allowed_ptr()'s TASK_WAKING test.
++ */
++ flush_smp_call_function_queue();
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++ /*
++ * If task_rq(p) != rq, it cannot be migrated here, because we're
++ * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
++ * we're holding p->pi_lock.
++ */
++ if (task_rq(p) == rq && task_on_rq_queued(p))
++ rq = __migrate_task(rq, p, arg->dest_cpu);
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ return 0;
++}
++
++static inline void
++set_cpus_allowed_common(struct task_struct *p, struct affinity_context *ctx)
++{
++ cpumask_copy(&p->cpus_mask, ctx->new_mask);
++ p->nr_cpus_allowed = cpumask_weight(ctx->new_mask);
++
++ /*
++ * Swap in a new user_cpus_ptr if SCA_USER flag set
++ */
++ if (ctx->flags & SCA_USER)
++ swap(p->user_cpus_ptr, ctx->user_mask);
++}
++
++static void
++__do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
++{
++ lockdep_assert_held(&p->pi_lock);
++ set_cpus_allowed_common(p, ctx);
++}
++
++/*
++ * Used for kthread_bind() and select_fallback_rq(), in both cases the user
++ * affinity (if any) should be destroyed too.
++ */
++void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .user_mask = NULL,
++ .flags = SCA_USER, /* clear the user requested mask */
++ };
++ union cpumask_rcuhead {
++ cpumask_t cpumask;
++ struct rcu_head rcu;
++ };
++
++ __do_set_cpus_allowed(p, &ac);
++
++ /*
++ * Because this is called with p->pi_lock held, it is not possible
++ * to use kfree() here (when PREEMPT_RT=y), therefore punt to using
++ * kfree_rcu().
++ */
++ kfree_rcu((union cpumask_rcuhead *)ac.user_mask, rcu);
++}
++
++static cpumask_t *alloc_user_cpus_ptr(int node)
++{
++ /*
++ * See do_set_cpus_allowed() above for the rcu_head usage.
++ */
++ int size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
++
++ return kmalloc_node(size, GFP_KERNEL, node);
++}
++
++int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
++ int node)
++{
++ cpumask_t *user_mask;
++ unsigned long flags;
++
++ /*
++ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
++ * may differ by now due to racing.
++ */
++ dst->user_cpus_ptr = NULL;
++
++ /*
++ * This check is racy and losing the race is a valid situation.
++ * It is not worth the extra overhead of taking the pi_lock on
++ * every fork/clone.
++ */
++ if (data_race(!src->user_cpus_ptr))
++ return 0;
++
++ user_mask = alloc_user_cpus_ptr(node);
++ if (!user_mask)
++ return -ENOMEM;
++
++ /*
++ * Use pi_lock to protect content of user_cpus_ptr
++ *
++ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
++ * do_set_cpus_allowed().
++ */
++ raw_spin_lock_irqsave(&src->pi_lock, flags);
++ if (src->user_cpus_ptr) {
++ swap(dst->user_cpus_ptr, user_mask);
++ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
++ }
++ raw_spin_unlock_irqrestore(&src->pi_lock, flags);
++
++ if (unlikely(user_mask))
++ kfree(user_mask);
++
++ return 0;
++}
++
++static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
++{
++ struct cpumask *user_mask = NULL;
++
++ swap(p->user_cpus_ptr, user_mask);
++
++ return user_mask;
++}
++
++void release_user_cpus_ptr(struct task_struct *p)
++{
++ kfree(clear_user_cpus_ptr(p));
++}
++
++#endif
++
++/**
++ * task_curr - is this task currently executing on a CPU?
++ * @p: the task in question.
++ *
++ * Return: 1 if the task is currently executing. 0 otherwise.
++ */
++inline int task_curr(const struct task_struct *p)
++{
++ return cpu_curr(task_cpu(p)) == p;
++}
++
++#ifdef CONFIG_SMP
++/*
++ * wait_task_inactive - wait for a thread to unschedule.
++ *
++ * Wait for the thread to block in any of the states set in @match_state.
++ * If it changes, i.e. @p might have woken up, then return zero. When we
++ * succeed in waiting for @p to be off its CPU, we return a positive number
++ * (its total switch count). If a second call a short while later returns the
++ * same number, the caller can be sure that @p has remained unscheduled the
++ * whole time.
++ *
++ * The caller must ensure that the task *will* unschedule sometime soon,
++ * else this function might spin for a *long* time. This function can't
++ * be called with interrupts off, or it may introduce deadlock with
++ * smp_call_function() if an IPI is sent by the same process we are
++ * waiting to become inactive.
++ */
++unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
++{
++ unsigned long flags;
++ bool running, on_rq;
++ unsigned long ncsw;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ for (;;) {
++ rq = task_rq(p);
++
++ /*
++ * If the task is actively running on another CPU
++ * still, just relax and busy-wait without holding
++ * any locks.
++ *
++ * NOTE! Since we don't hold any locks, it's not
++ * even sure that "rq" stays as the right runqueue!
++ * But we don't care, since this will return false
++ * if the runqueue has changed and p is actually now
++ * running somewhere else!
++ */
++ while (task_on_cpu(p) && p == rq->curr) {
++ if (!(READ_ONCE(p->__state) & match_state))
++ return 0;
++ cpu_relax();
++ }
++
++ /*
++ * Ok, time to look more closely! We need the rq
++ * lock now, to be *sure*. If we're wrong, we'll
++ * just go back and repeat.
++ */
++ task_access_lock_irqsave(p, &lock, &flags);
++ trace_sched_wait_task(p);
++ running = task_on_cpu(p);
++ on_rq = p->on_rq;
++ ncsw = 0;
++ if (READ_ONCE(p->__state) & match_state)
++ ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ /*
++ * If it changed from the expected state, bail out now.
++ */
++ if (unlikely(!ncsw))
++ break;
++
++ /*
++ * Was it really running after all now that we
++ * checked with the proper locks actually held?
++ *
++ * Oops. Go back and try again..
++ */
++ if (unlikely(running)) {
++ cpu_relax();
++ continue;
++ }
++
++ /*
++ * It's not enough that it's not actively running,
++ * it must be off the runqueue _entirely_, and not
++ * preempted!
++ *
++ * So if it was still runnable (but just not actively
++ * running right now), it's preempted, and we should
++ * yield - it could be a while.
++ */
++ if (unlikely(on_rq)) {
++ ktime_t to = NSEC_PER_SEC / HZ;
++
++ set_current_state(TASK_UNINTERRUPTIBLE);
++ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
++ continue;
++ }
++
++ /*
++ * Ahh, all good. It wasn't running, and it wasn't
++ * runnable, which means that it will never become
++ * running in the future either. We're all done!
++ */
++ break;
++ }
++
++ return ncsw;
++}
++
++/***
++ * kick_process - kick a running thread to enter/exit the kernel
++ * @p: the to-be-kicked thread
++ *
++ * Cause a process which is running on another CPU to enter
++ * kernel-mode, without any delay. (to get signals handled.)
++ *
++ * NOTE: this function doesn't have to take the runqueue lock,
++ * because all it wants to ensure is that the remote task enters
++ * the kernel. If the IPI races and the task has been migrated
++ * to another CPU then no harm is done and the purpose has been
++ * achieved as well.
++ */
++void kick_process(struct task_struct *p)
++{
++ int cpu;
++
++ preempt_disable();
++ cpu = task_cpu(p);
++ if ((cpu != smp_processor_id()) && task_curr(p))
++ smp_send_reschedule(cpu);
++ preempt_enable();
++}
++EXPORT_SYMBOL_GPL(kick_process);
++
++/*
++ * ->cpus_ptr is protected by both rq->lock and p->pi_lock
++ *
++ * A few notes on cpu_active vs cpu_online:
++ *
++ * - cpu_active must be a subset of cpu_online
++ *
++ * - on CPU-up we allow per-CPU kthreads on the online && !active CPU,
++ * see __set_cpus_allowed_ptr(). At this point the newly online
++ * CPU isn't yet part of the sched domains, and balancing will not
++ * see it.
++ *
++ * - on cpu-down we clear cpu_active() to mask the sched domains and
++ *   prevent the load balancer from placing new tasks on the to-be-removed
++ *   CPU. Existing tasks will remain running there and will be taken
++ * off.
++ *
++ * This means that fallback selection must not select !active CPUs.
++ * And can assume that any active CPU must be online. Conversely
++ * select_task_rq() below may allow selection of !active CPUs in order
++ * to satisfy the above rules.
++ */
++static int select_fallback_rq(int cpu, struct task_struct *p)
++{
++ int nid = cpu_to_node(cpu);
++ const struct cpumask *nodemask = NULL;
++ enum { cpuset, possible, fail } state = cpuset;
++ int dest_cpu;
++
++ /*
++ * If the node that the CPU is on has been offlined, cpu_to_node()
++ * will return -1. There is no CPU on the node, and we should
++ * select the CPU on the other node.
++ */
++ if (nid != -1) {
++ nodemask = cpumask_of_node(nid);
++
++ /* Look for allowed, online CPU in same node. */
++ for_each_cpu(dest_cpu, nodemask) {
++ if (is_cpu_allowed(p, dest_cpu))
++ return dest_cpu;
++ }
++ }
++
++ for (;;) {
++ /* Any allowed, online CPU? */
++ for_each_cpu(dest_cpu, p->cpus_ptr) {
++ if (!is_cpu_allowed(p, dest_cpu))
++ continue;
++ goto out;
++ }
++
++ /* No more Mr. Nice Guy. */
++ switch (state) {
++ case cpuset:
++ if (cpuset_cpus_allowed_fallback(p)) {
++ state = possible;
++ break;
++ }
++ fallthrough;
++ case possible:
++ /*
++ * XXX When called from select_task_rq() we only
++ * hold p->pi_lock and again violate locking order.
++ *
++ * More yuck to audit.
++ */
++ do_set_cpus_allowed(p, task_cpu_possible_mask(p));
++ state = fail;
++ break;
++
++ case fail:
++ BUG();
++ break;
++ }
++ }
++
++out:
++ if (state != cpuset) {
++ /*
++ * Don't tell them about moving exiting tasks or
++ * kernel threads (both mm NULL), since they never
++ * leave kernel.
++ */
++ if (p->mm && printk_ratelimit()) {
++ printk_deferred("process %d (%s) no longer affine to cpu%d\n",
++ task_pid_nr(p), p->comm, cpu);
++ }
++ }
++
++ return dest_cpu;
++}
++
++static inline void
++sched_preempt_mask_flush(cpumask_t *mask, int prio)
++{
++ int cpu;
++
++ cpumask_copy(mask, sched_idle_mask);
++
++ for_each_clear_bit(cpu, cpumask_bits(mask), nr_cpumask_bits) {
++ if (prio < cpu_rq(cpu)->prio)
++ cpumask_set_cpu(cpu, mask);
++ }
++}
++
++static inline int
++preempt_mask_check(struct task_struct *p, cpumask_t *allow_mask, cpumask_t *preempt_mask)
++{
++ int task_prio = task_sched_prio(p);
++ cpumask_t *mask = sched_preempt_mask + SCHED_QUEUE_BITS - 1 - task_prio;
++ int pr = atomic_read(&sched_prio_record);
++
++ if (pr != task_prio) {
++ sched_preempt_mask_flush(mask, task_prio);
++ atomic_set(&sched_prio_record, task_prio);
++ }
++
++ return cpumask_and(preempt_mask, allow_mask, mask);
++}
++
++static inline int select_task_rq(struct task_struct *p)
++{
++ cpumask_t allow_mask, mask;
++
++ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
++ return select_fallback_rq(task_cpu(p), p);
++
++ if (
++#ifdef CONFIG_SCHED_SMT
++ cpumask_and(&mask, &allow_mask, &sched_sg_idle_mask) ||
++#endif
++ cpumask_and(&mask, &allow_mask, sched_idle_mask) ||
++ preempt_mask_check(p, &allow_mask, &mask))
++ return best_mask_cpu(task_cpu(p), &mask);
++
++ return best_mask_cpu(task_cpu(p), &allow_mask);
++}
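++
++/*
++ * Note: CPU selection order is: a fully idle SMT group (if any), then any
++ * idle CPU, then a CPU running at lower priority than the waking task
++ * (via the preempt masks above), and finally any allowed CPU;
++ * best_mask_cpu() picks the preferred (topologically nearest) CPU from the
++ * chosen mask.
++ */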
++
++void sched_set_stop_task(int cpu, struct task_struct *stop)
++{
++ static struct lock_class_key stop_pi_lock;
++ struct sched_param stop_param = { .sched_priority = STOP_PRIO };
++ struct sched_param start_param = { .sched_priority = 0 };
++ struct task_struct *old_stop = cpu_rq(cpu)->stop;
++
++ if (stop) {
++ /*
++	 * Make it appear like a SCHED_FIFO task, it's something
++ * userspace knows about and won't get confused about.
++ *
++ * Also, it will make PI more or less work without too
++ * much confusion -- but then, stop work should not
++ * rely on PI working anyway.
++ */
++ sched_setscheduler_nocheck(stop, SCHED_FIFO, &stop_param);
++
++ /*
++ * The PI code calls rt_mutex_setprio() with ->pi_lock held to
++ * adjust the effective priority of a task. As a result,
++ * rt_mutex_setprio() can trigger (RT) balancing operations,
++ * which can then trigger wakeups of the stop thread to push
++ * around the current task.
++ *
++ * The stop task itself will never be part of the PI-chain, it
++ * never blocks, therefore that ->pi_lock recursion is safe.
++ * Tell lockdep about this by placing the stop->pi_lock in its
++ * own class.
++ */
++ lockdep_set_class(&stop->pi_lock, &stop_pi_lock);
++ }
++
++ cpu_rq(cpu)->stop = stop;
++
++ if (old_stop) {
++ /*
++ * Reset it back to a normal scheduling policy so that
++ * it can die in pieces.
++ */
++ sched_setscheduler_nocheck(old_stop, SCHED_NORMAL, &start_param);
++ }
++}
++
++static int affine_move_task(struct rq *rq, struct task_struct *p, int dest_cpu,
++ raw_spinlock_t *lock, unsigned long irq_flags)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ /* Can the task run on the task's current CPU? If so, we're done */
++ if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
++ if (p->migration_disabled) {
++ if (likely(p->cpus_ptr != &p->cpus_mask))
++ __do_set_cpus_ptr(p, &p->cpus_mask);
++ p->migration_disabled = 0;
++ p->migration_flags |= MDF_FORCE_ENABLED;
++ /* When p is migrate_disabled, rq->lock should be held */
++ rq->nr_pinned--;
++ }
++
++ if (task_on_cpu(p) || READ_ONCE(p->__state) == TASK_WAKING) {
++ struct migration_arg arg = { p, dest_cpu };
++
++ /* Need help from migration thread: drop lock and wait. */
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
++ return 0;
++ }
++ if (task_on_rq_queued(p)) {
++ /*
++ * OK, since we're going to drop the lock immediately
++ * afterwards anyway.
++ */
++ update_rq_clock(rq);
++ rq = move_queued_task(rq, p, dest_cpu);
++ lock = &rq->lock;
++ }
++ }
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return 0;
++}
++
++static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
++ struct affinity_context *ctx,
++ struct rq *rq,
++ raw_spinlock_t *lock,
++ unsigned long irq_flags)
++{
++ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
++ const struct cpumask *cpu_valid_mask = cpu_active_mask;
++ bool kthread = p->flags & PF_KTHREAD;
++ int dest_cpu;
++ int ret = 0;
++
++ if (kthread || is_migration_disabled(p)) {
++ /*
++ * Kernel threads are allowed on online && !active CPUs,
++ * however, during cpu-hot-unplug, even these might get pushed
++ * away if not KTHREAD_IS_PER_CPU.
++ *
++ * Specifically, migration_disabled() tasks must not fail the
++ * cpumask_any_and_distribute() pick below, esp. so on
++ * SCA_MIGRATE_ENABLE, otherwise we'll not call
++ * set_cpus_allowed_common() and actually reset p->cpus_ptr.
++ */
++ cpu_valid_mask = cpu_online_mask;
++ }
++
++ if (!kthread && !cpumask_subset(ctx->new_mask, cpu_allowed_mask)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ /*
++ * Must re-check here, to close a race against __kthread_bind(),
++ * sched_setaffinity() is not guaranteed to observe the flag.
++ */
++ if ((ctx->flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ if (cpumask_equal(&p->cpus_mask, ctx->new_mask))
++ goto out;
++
++ dest_cpu = cpumask_any_and(cpu_valid_mask, ctx->new_mask);
++ if (dest_cpu >= nr_cpu_ids) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ __do_set_cpus_allowed(p, ctx);
++
++ return affine_move_task(rq, p, dest_cpu, lock, irq_flags);
++
++out:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++
++ return ret;
++}
++
++/*
++ * Change a given task's CPU affinity. Migrate the thread to a
++ * proper CPU and schedule it away if the CPU it's executing on
++ * is removed from the allowed bitmask.
++ *
++ * NOTE: the caller must have a valid reference to the task, the
++ * task must not exit() & deallocate itself prematurely. The
++ * call is not atomic; no spinlocks may be held.
++ */
++static int __set_cpus_allowed_ptr(struct task_struct *p,
++ struct affinity_context *ctx)
++{
++ unsigned long irq_flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
++ * flags are set.
++ */
++ if (p->user_cpus_ptr &&
++ !(ctx->flags & SCA_USER) &&
++ cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr))
++ ctx->new_mask = rq->scratch_mask;
++
++ return __set_cpus_allowed_ptr_locked(p, ctx, rq, lock, irq_flags);
++}
++
++int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .flags = 0,
++ };
++
++ return __set_cpus_allowed_ptr(p, &ac);
++}
++EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
++
++/*
++ * Change a given task's CPU affinity to the intersection of its current
++ * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
++ * If user_cpus_ptr is defined, use it as the basis for restricting CPU
++ * affinity or use cpu_online_mask instead.
++ *
++ * If the resulting mask is empty, leave the affinity unchanged and return
++ * -EINVAL.
++ */
++static int restrict_cpus_allowed_ptr(struct task_struct *p,
++ struct cpumask *new_mask,
++ const struct cpumask *subset_mask)
++{
++ struct affinity_context ac = {
++ .new_mask = new_mask,
++ .flags = 0,
++ };
++ unsigned long irq_flags;
++ raw_spinlock_t *lock;
++ struct rq *rq;
++ int err;
++
++ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
++ rq = __task_access_lock(p, &lock);
++
++ if (!cpumask_and(new_mask, task_user_cpus(p), subset_mask)) {
++ err = -EINVAL;
++ goto err_unlock;
++ }
++
++ return __set_cpus_allowed_ptr_locked(p, &ac, rq, lock, irq_flags);
++
++err_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
++ return err;
++}
++
++/*
++ * Restrict the CPU affinity of task @p so that it is a subset of
++ * task_cpu_possible_mask() and point @p->user_cpus_ptr to a copy of the
++ * old affinity mask. If the resulting mask is empty, we warn and walk
++ * up the cpuset hierarchy until we find a suitable mask.
++ */
++void force_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ cpumask_var_t new_mask;
++ const struct cpumask *override_mask = task_cpu_possible_mask(p);
++
++ alloc_cpumask_var(&new_mask, GFP_KERNEL);
++
++ /*
++ * __migrate_task() can fail silently in the face of concurrent
++ * offlining of the chosen destination CPU, so take the hotplug
++ * lock to ensure that the migration succeeds.
++ */
++ cpus_read_lock();
++ if (!cpumask_available(new_mask))
++ goto out_set_mask;
++
++ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
++ goto out_free_mask;
++
++ /*
++ * We failed to find a valid subset of the affinity mask for the
++ * task, so override it based on its cpuset hierarchy.
++ */
++ cpuset_cpus_allowed(p, new_mask);
++ override_mask = new_mask;
++
++out_set_mask:
++ if (printk_ratelimit()) {
++ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
++ task_pid_nr(p), p->comm,
++ cpumask_pr_args(override_mask));
++ }
++
++ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
++out_free_mask:
++ cpus_read_unlock();
++ free_cpumask_var(new_mask);
++}
++
++static int
++__sched_setaffinity(struct task_struct *p, struct affinity_context *ctx);
++
++/*
++ * Restore the affinity of a task @p which was previously restricted by a
++ * call to force_compatible_cpus_allowed_ptr().
++ *
++ * It is the caller's responsibility to serialise this with any calls to
++ * force_compatible_cpus_allowed_ptr(@p).
++ */
++void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
++{
++ struct affinity_context ac = {
++ .new_mask = task_user_cpus(p),
++ .flags = 0,
++ };
++ int ret;
++
++ /*
++ * Try to restore the old affinity mask with __sched_setaffinity().
++ * Cpuset masking will be done there too.
++ */
++ ret = __sched_setaffinity(p, &ac);
++ WARN_ON_ONCE(ret);
++}
++
++#else /* CONFIG_SMP */
++
++static inline int select_task_rq(struct task_struct *p)
++{
++ return 0;
++}
++
++static inline int
++__set_cpus_allowed_ptr(struct task_struct *p,
++ struct affinity_context *ctx)
++{
++ return set_cpus_allowed_ptr(p, ctx->new_mask);
++}
++
++static inline bool rq_has_pinned_tasks(struct rq *rq)
++{
++ return false;
++}
++
++static inline cpumask_t *alloc_user_cpus_ptr(int node)
++{
++ return NULL;
++}
++
++#endif /* !CONFIG_SMP */
++
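++/*
++ * Update wakeup schedstats for @p being woken onto @cpu;
++ * a no-op unless schedstats are enabled.
++ */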
++static void
++ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq;
++
++ if (!schedstat_enabled())
++ return;
++
++ rq = this_rq();
++
++#ifdef CONFIG_SMP
++ if (cpu == rq->cpu) {
++ __schedstat_inc(rq->ttwu_local);
++ __schedstat_inc(p->stats.nr_wakeups_local);
++ } else {
++ /** Alt schedule FW ToDo:
++ * How to do ttwu_wake_remote
++ */
++ }
++#endif /* CONFIG_SMP */
++
++ __schedstat_inc(rq->ttwu_count);
++ __schedstat_inc(p->stats.nr_wakeups);
++}
++
++/*
++ * Mark the task runnable.
++ */
++static inline void ttwu_do_wakeup(struct task_struct *p)
++{
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ trace_sched_wakeup(p);
++}
++
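++/*
++ * Finish a wakeup on @rq: update nr_uninterruptible, end any iowait
++ * accounting, enqueue the task, check for preemption and mark it running.
++ */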
++static inline void
++ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags)
++{
++ if (p->sched_contributes_to_load)
++ rq->nr_uninterruptible--;
++
++ if (
++#ifdef CONFIG_SMP
++ !(wake_flags & WF_MIGRATED) &&
++#endif
++ p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ activate_task(p, rq);
++ check_preempt_curr(rq);
++
++ ttwu_do_wakeup(p);
++}
++
++/*
++ * Consider @p being inside a wait loop:
++ *
++ * for (;;) {
++ * set_current_state(TASK_UNINTERRUPTIBLE);
++ *
++ * if (CONDITION)
++ * break;
++ *
++ * schedule();
++ * }
++ * __set_current_state(TASK_RUNNING);
++ *
++ * and a wakeup arriving between set_current_state() and schedule(). In this
++ * case @p is still runnable, so all that needs doing is change p->state back
++ * to TASK_RUNNING in an atomic manner.
++ *
++ * By taking task_rq(p)->lock we serialize against schedule(), if @p->on_rq
++ * then schedule() must still happen and p->state can be changed to
++ * TASK_RUNNING. Otherwise we lost the race, schedule() has happened, and we
++ * need to do a full wakeup with enqueue.
++ *
++ * Returns: %true when the wakeup is done,
++ * %false otherwise.
++ */
++static int ttwu_runnable(struct task_struct *p, int wake_flags)
++{
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ int ret = 0;
++
++ rq = __task_access_lock(p, &lock);
++ if (task_on_rq_queued(p)) {
++ if (!task_on_cpu(p)) {
++ /*
++ * When on_rq && !on_cpu the task is preempted, see if
++ * it should preempt the task that is current now.
++ */
++ update_rq_clock(rq);
++ check_preempt_curr(rq);
++ }
++ ttwu_do_wakeup(p);
++ ret = 1;
++ }
++ __task_access_unlock(p, lock);
++
++ return ret;
++}
++
++#ifdef CONFIG_SMP
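++/*
++ * Flush this CPU's remote wakeup list (@arg): activate every task queued
++ * here by other CPUs via __ttwu_queue_wakelist(), then clear ttwu_pending.
++ */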
++void sched_ttwu_pending(void *arg)
++{
++ struct llist_node *llist = arg;
++ struct rq *rq = this_rq();
++ struct task_struct *p, *t;
++ struct rq_flags rf;
++
++ if (!llist)
++ return;
++
++ rq_lock_irqsave(rq, &rf);
++ update_rq_clock(rq);
++
++ llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
++ if (WARN_ON_ONCE(p->on_cpu))
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ if (WARN_ON_ONCE(task_cpu(p) != cpu_of(rq)))
++ set_task_cpu(p, cpu_of(rq));
++
++ ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0);
++ }
++
++ /*
++	 * Must be after enqueueing at least one task such that
++ * idle_cpu() does not observe a false-negative -- if it does,
++ * it is possible for select_idle_siblings() to stack a number
++ * of tasks on this CPU during that window.
++ *
++	 * It is OK to clear ttwu_pending while another task is still pending.
++	 * We will receive an IPI after local IRQs are enabled and then enqueue it.
++	 * Since nr_running > 0 now, idle_cpu() will always get the correct result.
++ */
++ WRITE_ONCE(rq->ttwu_pending, 0);
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Prepare the scene for sending an IPI for a remote smp_call
++ *
++ * Returns true if the caller can proceed with sending the IPI.
++ * Returns false otherwise.
++ */
++bool call_function_single_prep_ipi(int cpu)
++{
++ if (set_nr_if_polling(cpu_rq(cpu)->idle)) {
++ trace_sched_wake_idle_without_ipi(cpu);
++ return false;
++ }
++
++ return true;
++}
++
++/*
++ * Queue a task on the target CPU's wake_list and wake the CPU via IPI if
++ * necessary. The wakee CPU on receipt of the IPI will queue the task
++ * via sched_ttwu_wakeup() for activation so the wakee incurs the cost
++ * of the wakeup instead of the waker.
++ */
++static void __ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
++
++ WRITE_ONCE(rq->ttwu_pending, 1);
++ __smp_call_single_queue(cpu, &p->wake_entry.llist);
++}
++
++static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
++{
++ /*
++ * Do not complicate things with the async wake_list while the CPU is
++ * in hotplug state.
++ */
++ if (!cpu_active(cpu))
++ return false;
++
++ /* Ensure the task will still be allowed to run on the CPU. */
++ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
++ return false;
++
++ /*
++ * If the CPU does not share cache, then queue the task on the
++	 * remote rq's wakelist to avoid accessing remote data.
++ */
++ if (!cpus_share_cache(smp_processor_id(), cpu))
++ return true;
++
++ if (cpu == smp_processor_id())
++ return false;
++
++ /*
++ * If the wakee cpu is idle, or the task is descheduling and the
++ * only running task on the CPU, then use the wakelist to offload
++ * the task activation to the idle (or soon-to-be-idle) CPU as
++ * the current CPU is likely busy. nr_running is checked to
++ * avoid unnecessary task stacking.
++ *
++ * Note that we can only get here with (wakee) p->on_rq=0,
++ * p->on_cpu can be whatever, we've done the dequeue, so
++ * the wakee has been accounted out of ->nr_running.
++ */
++ if (!cpu_rq(cpu)->nr_running)
++ return true;
++
++ return false;
++}
++
++static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ if (__is_defined(ALT_SCHED_TTWU_QUEUE) && ttwu_queue_cond(p, cpu)) {
++ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
++ __ttwu_queue_wakelist(p, cpu, wake_flags);
++ return true;
++ }
++
++ return false;
++}
++
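++/* If @cpu is currently running its idle task, ask it to reschedule. */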
++void wake_up_if_idle(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ rcu_read_lock();
++
++ if (!is_idle_task(rcu_dereference(rq->curr)))
++ goto out;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (is_idle_task(rq->curr))
++ resched_curr(rq);
++ /* Else CPU is not idle, do nothing here */
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++out:
++ rcu_read_unlock();
++}
++
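++/* CPUs share cache when they are the same CPU or have the same LLC id. */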
++bool cpus_share_cache(int this_cpu, int that_cpu)
++{
++ if (this_cpu == that_cpu)
++ return true;
++
++ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
++}
++#else /* !CONFIG_SMP */
++
++static inline bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
++{
++ return false;
++}
++
++#endif /* CONFIG_SMP */
++
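++/*
++ * Complete the wakeup of @p on @cpu: either hand it off via the remote
++ * wake list or take the rq lock and activate it locally.
++ */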
++static inline void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (ttwu_queue_wakelist(p, cpu, wake_flags))
++ return;
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++ ttwu_do_activate(rq, p, wake_flags);
++ raw_spin_unlock(&rq->lock);
++}
++
++/*
++ * Invoked from try_to_wake_up() to check whether the task can be woken up.
++ *
++ * The caller holds p::pi_lock if p != current or has preemption
++ * disabled when p == current.
++ *
++ * The rules of PREEMPT_RT saved_state:
++ *
++ * The related locking code always holds p::pi_lock when updating
++ * p::saved_state, which means the code is fully serialized in both cases.
++ *
++ * The lock wait and lock wakeups happen via TASK_RTLOCK_WAIT. No other
++ * bits are set. This allows us to distinguish all wakeup scenarios.
++ */
++static __always_inline
++bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
++{
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
++ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
++ state != TASK_RTLOCK_WAIT);
++ }
++
++ if (READ_ONCE(p->__state) & state) {
++ *success = 1;
++ return true;
++ }
++
++#ifdef CONFIG_PREEMPT_RT
++ /*
++ * Saved state preserves the task state across blocking on
++ * an RT lock. If the state matches, set p::saved_state to
++ * TASK_RUNNING, but do not wake the task because it waits
++ * for a lock wakeup. Also indicate success because from
++ * the regular waker's point of view this has succeeded.
++ *
++ * After acquiring the lock the task will restore p::__state
++ * from p::saved_state which ensures that the regular
++ * wakeup is not lost. The restore will also set
++ * p::saved_state to TASK_RUNNING so any further tests will
++ * not result in false positives vs. @success
++ */
++ if (p->saved_state & state) {
++ p->saved_state = TASK_RUNNING;
++ *success = 1;
++ }
++#endif
++ return false;
++}
++
++/*
++ * Notes on Program-Order guarantees on SMP systems.
++ *
++ * MIGRATION
++ *
++ * The basic program-order guarantee on SMP systems is that when a task [t]
++ * migrates, all its activity on its old CPU [c0] happens-before any subsequent
++ * execution on its new CPU [c1].
++ *
++ * For migration (of runnable tasks) this is provided by the following means:
++ *
++ * A) UNLOCK of the rq(c0)->lock scheduling out task t
++ * B) migration for t is required to synchronize *both* rq(c0)->lock and
++ * rq(c1)->lock (if not at the same time, then in that order).
++ * C) LOCK of the rq(c1)->lock scheduling in task
++ *
++ * Transitivity guarantees that B happens after A and C after B.
++ * Note: we only require RCpc transitivity.
++ * Note: the CPU doing B need not be c0 or c1
++ *
++ * Example:
++ *
++ * CPU0 CPU1 CPU2
++ *
++ * LOCK rq(0)->lock
++ * sched-out X
++ * sched-in Y
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(0)->lock // orders against CPU0
++ * dequeue X
++ * UNLOCK rq(0)->lock
++ *
++ * LOCK rq(1)->lock
++ * enqueue X
++ * UNLOCK rq(1)->lock
++ *
++ * LOCK rq(1)->lock // orders against CPU2
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(1)->lock
++ *
++ *
++ * BLOCKING -- aka. SLEEP + WAKEUP
++ *
++ * For blocking we (obviously) need to provide the same guarantee as for
++ * migration. However the means are completely different as there is no lock
++ * chain to provide order. Instead we do:
++ *
++ * 1) smp_store_release(X->on_cpu, 0) -- finish_task()
++ * 2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
++ *
++ * Example:
++ *
++ * CPU0 (schedule) CPU1 (try_to_wake_up) CPU2 (schedule)
++ *
++ * LOCK rq(0)->lock LOCK X->pi_lock
++ * dequeue X
++ * sched-out X
++ * smp_store_release(X->on_cpu, 0);
++ *
++ * smp_cond_load_acquire(&X->on_cpu, !VAL);
++ * X->state = WAKING
++ * set_task_cpu(X,2)
++ *
++ * LOCK rq(2)->lock
++ * enqueue X
++ * X->state = RUNNING
++ * UNLOCK rq(2)->lock
++ *
++ * LOCK rq(2)->lock // orders against CPU1
++ * sched-out Z
++ * sched-in X
++ * UNLOCK rq(2)->lock
++ *
++ * UNLOCK X->pi_lock
++ * UNLOCK rq(0)->lock
++ *
++ *
++ * However, for wakeups there is a second guarantee we must provide, namely we
++ * must observe the state that led to our wakeup. That is, not only must our
++ * task observe its own prior state, it must also observe the stores prior to
++ * its wakeup.
++ *
++ * This means that any means of doing remote wakeups must order the CPU doing
++ * the wakeup against the CPU the task is going to end up running on. This,
++ * however, is already required for the regular Program-Order guarantee above,
++ * since the waking CPU is the one issuing the ACQUIRE (smp_cond_load_acquire).
++ *
++ */
++
++/**
++ * try_to_wake_up - wake up a thread
++ * @p: the thread to be awakened
++ * @state: the mask of task states that can be woken
++ * @wake_flags: wake modifier flags (WF_*)
++ *
++ * Conceptually does:
++ *
++ * If (@state & @p->state) @p->state = TASK_RUNNING.
++ *
++ * If the task was not queued/runnable, also place it back on a runqueue.
++ *
++ * This function is atomic against schedule() which would dequeue the task.
++ *
++ * It issues a full memory barrier before accessing @p->state, see the comment
++ * with set_current_state().
++ *
++ * Uses p->pi_lock to serialize against concurrent wake-ups.
++ *
++ * Relies on p->pi_lock stabilizing:
++ * - p->sched_class
++ * - p->cpus_ptr
++ * - p->sched_task_group
++ * in order to do migration, see its use of select_task_rq()/set_task_cpu().
++ *
++ * Tries really hard to only take one task_rq(p)->lock for performance.
++ * Takes rq->lock in:
++ * - ttwu_runnable() -- old rq, unavoidable, see comment there;
++ * - ttwu_queue() -- new rq, for enqueue of the task;
++ * - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
++ *
++ * As a consequence we race really badly with just about everything. See the
++ * many memory barriers and their comments for details.
++ *
++ * Return: %true if @p->state changes (an actual wakeup was done),
++ * %false otherwise.
++ */
++static int try_to_wake_up(struct task_struct *p, unsigned int state,
++ int wake_flags)
++{
++ unsigned long flags;
++ int cpu, success = 0;
++
++ preempt_disable();
++ if (p == current) {
++ /*
++ * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
++ * == smp_processor_id()'. Together this means we can special
++ * case the whole 'p->on_rq && ttwu_runnable()' case below
++ * without taking any locks.
++ *
++ * In particular:
++ * - we rely on Program-Order guarantees for all the ordering,
++ * - we're serialized against set_special_state() by virtue of
++ * it disabling IRQs (this allows not taking ->pi_lock).
++ */
++ if (!ttwu_state_match(p, state, &success))
++ goto out;
++
++ trace_sched_waking(p);
++ ttwu_do_wakeup(p);
++ goto out;
++ }
++
++ /*
++ * If we are going to wake up a thread waiting for CONDITION we
++ * need to ensure that CONDITION=1 done by the caller can not be
++ * reordered with p->state check below. This pairs with smp_store_mb()
++ * in set_current_state() that the waiting thread does.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ smp_mb__after_spinlock();
++ if (!ttwu_state_match(p, state, &success))
++ goto unlock;
++
++ trace_sched_waking(p);
++
++ /*
++ * Ensure we load p->on_rq _after_ p->state, otherwise it would
++ * be possible to, falsely, observe p->on_rq == 0 and get stuck
++ * in smp_cond_load_acquire() below.
++ *
++ * sched_ttwu_pending() try_to_wake_up()
++ * STORE p->on_rq = 1 LOAD p->state
++ * UNLOCK rq->lock
++ *
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * UNLOCK rq->lock
++ *
++ * [task p]
++ * STORE p->state = UNINTERRUPTIBLE LOAD p->on_rq
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++	 * A similar smp_rmb() lives in __task_needs_rq_lock().
++ */
++ smp_rmb();
++ if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
++ goto unlock;
++
++#ifdef CONFIG_SMP
++ /*
++ * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
++ * possible to, falsely, observe p->on_cpu == 0.
++ *
++ * One must be running (->on_cpu == 1) in order to remove oneself
++ * from the runqueue.
++ *
++ * __schedule() (switch to task 'p') try_to_wake_up()
++ * STORE p->on_cpu = 1 LOAD p->on_rq
++ * UNLOCK rq->lock
++ *
++ * __schedule() (put 'p' to sleep)
++ * LOCK rq->lock smp_rmb();
++ * smp_mb__after_spinlock();
++ * STORE p->on_rq = 0 LOAD p->on_cpu
++ *
++ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
++ * __schedule(). See the comment for smp_mb__after_spinlock().
++ *
++ * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
++ * schedule()'s deactivate_task() has 'happened' and p will no longer
++	 * care about its own p->state. See the comment in __schedule().
++ */
++ smp_acquire__after_ctrl_dep();
++
++ /*
++ * We're doing the wakeup (@success == 1), they did a dequeue (p->on_rq
++ * == 0), which means we need to do an enqueue, change p->state to
++ * TASK_WAKING such that we can unlock p->pi_lock before doing the
++ * enqueue, such as ttwu_queue_wakelist().
++ */
++ WRITE_ONCE(p->__state, TASK_WAKING);
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++	 * this task as prev, consider queueing p on the remote CPU's wake_list
++ * which potentially sends an IPI instead of spinning on p->on_cpu to
++ * let the waker make forward progress. This is safe because IRQs are
++ * disabled and the IPI will deliver after on_cpu is cleared.
++ *
++ * Ensure we load task_cpu(p) after p->on_cpu:
++ *
++ * set_task_cpu(p, cpu);
++ * STORE p->cpu = @cpu
++ * __schedule() (switch to task 'p')
++ * LOCK rq->lock
++ * smp_mb__after_spin_lock() smp_cond_load_acquire(&p->on_cpu)
++ * STORE p->on_cpu = 1 LOAD p->cpu
++ *
++ * to ensure we observe the correct CPU on which the task is currently
++ * scheduling.
++ */
++ if (smp_load_acquire(&p->on_cpu) &&
++ ttwu_queue_wakelist(p, task_cpu(p), wake_flags))
++ goto unlock;
++
++ /*
++ * If the owning (remote) CPU is still in the middle of schedule() with
++ * this task as prev, wait until it's done referencing the task.
++ *
++ * Pairs with the smp_store_release() in finish_task().
++ *
++ * This ensures that tasks getting woken will be fully ordered against
++ * their previous state and preserve Program Order.
++ */
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++
++ sched_task_ttwu(p);
++
++ cpu = select_task_rq(p);
++
++ if (cpu != task_cpu(p)) {
++ if (p->in_iowait) {
++ delayacct_blkio_end(p);
++ atomic_dec(&task_rq(p)->nr_iowait);
++ }
++
++ wake_flags |= WF_MIGRATED;
++ set_task_cpu(p, cpu);
++ }
++#else
++ cpu = task_cpu(p);
++#endif /* CONFIG_SMP */
++
++ ttwu_queue(p, cpu, wake_flags);
++unlock:
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++out:
++ if (success)
++ ttwu_stat(p, task_cpu(p), wake_flags);
++ preempt_enable();
++
++ return success;
++}
++
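++/*
++ * Tell task_call_func() whether it must take the rq lock to pin @p;
++ * p->pi_lock alone is enough only once the task is fully blocked
++ * (not running, not queued and off the CPU).
++ */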
++static bool __task_needs_rq_lock(struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /*
++ * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
++ * the task is blocked. Make sure to check @state since ttwu() can drop
++ * locks at the end, see ttwu_queue_wakelist().
++ */
++ if (state == TASK_RUNNING || state == TASK_WAKING)
++ return true;
++
++ /*
++ * Ensure we load p->on_rq after p->__state, otherwise it would be
++ * possible to, falsely, observe p->on_rq == 0.
++ *
++ * See try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++ if (p->on_rq)
++ return true;
++
++#ifdef CONFIG_SMP
++ /*
++ * Ensure the task has finished __schedule() and will not be referenced
++ * anymore. Again, see try_to_wake_up() for a longer comment.
++ */
++ smp_rmb();
++ smp_cond_load_acquire(&p->on_cpu, !VAL);
++#endif
++
++ return false;
++}
++
++/**
++ * task_call_func - Invoke a function on task in fixed state
++ * @p: Process for which the function is to be invoked, can be @current.
++ * @func: Function to invoke.
++ * @arg: Argument to function.
++ *
++ * Fix the task in its current state by avoiding wakeups and/or rq operations
++ * and call @func(@arg) on it. This function can use ->on_rq and task_curr()
++ * to work out what the state is, if required. Given that @func can be invoked
++ * with a runqueue lock held, it had better be quite lightweight.
++ *
++ * Returns:
++ * Whatever @func returns
++ */
++int task_call_func(struct task_struct *p, task_call_f func, void *arg)
++{
++ struct rq *rq = NULL;
++ struct rq_flags rf;
++ int ret;
++
++ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
++
++ if (__task_needs_rq_lock(p))
++ rq = __task_rq_lock(p, &rf);
++
++ /*
++ * At this point the task is pinned; either:
++ * - blocked and we're holding off wakeups (pi->lock)
++ * - woken, and we're holding off enqueue (rq->lock)
++ * - queued, and we're holding off schedule (rq->lock)
++ * - running, and we're holding off de-schedule (rq->lock)
++ *
++ * The called function (@func) can use: task_curr(), p->on_rq and
++ * p->__state to differentiate between these states.
++ */
++ ret = func(p, arg);
++
++ if (rq)
++ __task_rq_unlock(rq, &rf);
++
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
++ return ret;
++}
++
++/**
++ * cpu_curr_snapshot - Return a snapshot of the currently running task
++ * @cpu: The CPU on which to snapshot the task.
++ *
++ * Returns the task_struct pointer of the task "currently" running on
++ * the specified CPU. If the same task is running on that CPU throughout,
++ * the return value will be a pointer to that task's task_struct structure.
++ * If the CPU did any context switches even vaguely concurrently with the
++ * execution of this function, the return value will be a pointer to the
++ * task_struct structure of a randomly chosen task that was running on
++ * that CPU somewhere around the time that this function was executing.
++ *
++ * If the specified CPU was offline, the return value is whatever it
++ * is, perhaps a pointer to the task_struct structure of that CPU's idle
++ * task, but there is no guarantee. Callers wishing a useful return
++ * value must take some action to ensure that the specified CPU remains
++ * online throughout.
++ *
++ * This function executes full memory barriers before and after fetching
++ * the pointer, which permits the caller to confine this function's fetch
++ * with respect to the caller's accesses to other shared variables.
++ */
++struct task_struct *cpu_curr_snapshot(int cpu)
++{
++ struct task_struct *t;
++
++ smp_mb(); /* Pairing determined by caller's synchronization design. */
++ t = rcu_dereference(cpu_curr(cpu));
++ smp_mb(); /* Pairing determined by caller's synchronization design. */
++ return t;
++}
++
++/**
++ * wake_up_process - Wake up a specific process
++ * @p: The process to be woken up.
++ *
++ * Attempt to wake up the nominated process and move it to the set of runnable
++ * processes.
++ *
++ * Return: 1 if the process was woken up, 0 if it was already running.
++ *
++ * This function executes a full memory barrier before accessing the task state.
++ */
++int wake_up_process(struct task_struct *p)
++{
++ return try_to_wake_up(p, TASK_NORMAL, 0);
++}
++EXPORT_SYMBOL(wake_up_process);
++
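++/* Like wake_up_process(), but only wake @p if its state matches @state. */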
++int wake_up_state(struct task_struct *p, unsigned int state)
++{
++ return try_to_wake_up(p, state, 0);
++}
++
++/*
++ * Perform scheduler related setup for a newly forked process p.
++ * p is forked by current.
++ *
++ * __sched_fork() is basic setup used by init_idle() too:
++ */
++static inline void __sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ p->on_rq = 0;
++ p->on_cpu = 0;
++ p->utime = 0;
++ p->stime = 0;
++ p->sched_time = 0;
++
++#ifdef CONFIG_SCHEDSTATS
++ /* Even if schedstat is disabled, there should not be garbage */
++ memset(&p->stats, 0, sizeof(p->stats));
++#endif
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++ INIT_HLIST_HEAD(&p->preempt_notifiers);
++#endif
++
++#ifdef CONFIG_COMPACTION
++ p->capture_control = NULL;
++#endif
++#ifdef CONFIG_SMP
++ p->wake_entry.u_flags = CSD_TYPE_TTWU;
++#endif
++ init_sched_mm_cid(p);
++}
++
++/*
++ * fork()/clone()-time setup:
++ */
++int sched_fork(unsigned long clone_flags, struct task_struct *p)
++{
++ __sched_fork(clone_flags, p);
++ /*
++ * We mark the process as NEW here. This guarantees that
++ * nobody will actually run it, and a signal or other external
++ * event cannot wake it up and insert it on the runqueue either.
++ */
++ p->__state = TASK_NEW;
++
++ /*
++ * Make sure we do not leak PI boosting priority to the child.
++ */
++ p->prio = current->normal_prio;
++
++ /*
++ * Revert to default priority/policy on fork if requested.
++ */
++ if (unlikely(p->sched_reset_on_fork)) {
++ if (task_has_rt_policy(p)) {
++ p->policy = SCHED_NORMAL;
++ p->static_prio = NICE_TO_PRIO(0);
++ p->rt_priority = 0;
++ } else if (PRIO_TO_NICE(p->static_prio) < 0)
++ p->static_prio = NICE_TO_PRIO(0);
++
++ p->prio = p->normal_prio = p->static_prio;
++
++ /*
++ * We don't need the reset flag anymore after the fork. It has
++ * fulfilled its duty:
++ */
++ p->sched_reset_on_fork = 0;
++ }
++
++#ifdef CONFIG_SCHED_INFO
++ if (unlikely(sched_info_on()))
++ memset(&p->sched_info, 0, sizeof(p->sched_info));
++#endif
++ init_task_preempt_count(p);
++
++ return 0;
++}
++
++void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ /*
++ * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
++ * required yet, but lockdep gets upset if rules are violated.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ /*
++ * Share the timeslice between parent and child, thus the
++ * total amount of pending timeslices in the system doesn't change,
++ * resulting in more scheduling fairness.
++ */
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ rq->curr->time_slice /= 2;
++ p->time_slice = rq->curr->time_slice;
++#ifdef CONFIG_SCHED_HRTICK
++ hrtick_start(rq, rq->curr->time_slice);
++#endif
++
++ if (p->time_slice < RESCHED_NS) {
++ p->time_slice = sched_timeslice_ns;
++ resched_curr(rq);
++ }
++ sched_task_fork(p, rq);
++ raw_spin_unlock(&rq->lock);
++
++ rseq_migrate(p);
++ /*
++ * We're setting the CPU for the first time, we don't migrate,
++ * so use __set_task_cpu().
++ */
++ __set_task_cpu(p, smp_processor_id());
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
++void sched_post_fork(struct task_struct *p)
++{
++}
++
++#ifdef CONFIG_SCHEDSTATS
++
++DEFINE_STATIC_KEY_FALSE(sched_schedstats);
++
++static void set_schedstats(bool enabled)
++{
++ if (enabled)
++ static_branch_enable(&sched_schedstats);
++ else
++ static_branch_disable(&sched_schedstats);
++}
++
++void force_schedstat_enabled(void)
++{
++ if (!schedstat_enabled()) {
++ pr_info("kernel profiling enabled schedstats, disable via kernel.sched_schedstats.\n");
++ static_branch_enable(&sched_schedstats);
++ }
++}
++
++static int __init setup_schedstats(char *str)
++{
++ int ret = 0;
++ if (!str)
++ goto out;
++
++ if (!strcmp(str, "enable")) {
++ set_schedstats(true);
++ ret = 1;
++ } else if (!strcmp(str, "disable")) {
++ set_schedstats(false);
++ ret = 1;
++ }
++out:
++ if (!ret)
++ pr_warn("Unable to parse schedstats=\n");
++
++ return ret;
++}
++__setup("schedstats=", setup_schedstats);
++
++#ifdef CONFIG_PROC_SYSCTL
++static int sysctl_schedstats(struct ctl_table *table, int write, void *buffer,
++ size_t *lenp, loff_t *ppos)
++{
++ struct ctl_table t;
++ int err;
++ int state = static_branch_likely(&sched_schedstats);
++
++ if (write && !capable(CAP_SYS_ADMIN))
++ return -EPERM;
++
++ t = *table;
++ t.data = &state;
++ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
++ if (err < 0)
++ return err;
++ if (write)
++ set_schedstats(state);
++ return err;
++}
++
++static struct ctl_table sched_core_sysctls[] = {
++ {
++ .procname = "sched_schedstats",
++ .data = NULL,
++ .maxlen = sizeof(unsigned int),
++ .mode = 0644,
++ .proc_handler = sysctl_schedstats,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_ONE,
++ },
++ {}
++};
++static int __init sched_core_sysctl_init(void)
++{
++ register_sysctl_init("kernel", sched_core_sysctls);
++ return 0;
++}
++late_initcall(sched_core_sysctl_init);
++#endif /* CONFIG_PROC_SYSCTL */
++#endif /* CONFIG_SCHEDSTATS */
++
++/*
++ * wake_up_new_task - wake up a newly created task for the first time.
++ *
++ * This function will do some initial scheduler statistics housekeeping
++ * that must be done for every newly created context, then puts the task
++ * on the runqueue and wakes it.
++ */
++void wake_up_new_task(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ WRITE_ONCE(p->__state, TASK_RUNNING);
++ rq = cpu_rq(select_task_rq(p));
++#ifdef CONFIG_SMP
++ rseq_migrate(p);
++ /*
++ * Fork balancing, do it here and not earlier because:
++ * - cpus_ptr can change in the fork path
++ * - any previously selected CPU might disappear through hotplug
++ *
++ * Use __set_task_cpu() to avoid calling sched_class::migrate_task_rq,
++ * as we're not fully set-up yet.
++ */
++ __set_task_cpu(p, cpu_of(rq));
++#endif
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ activate_task(p, rq);
++ trace_sched_wakeup_new(p);
++ check_preempt_curr(rq);
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++
++#ifdef CONFIG_PREEMPT_NOTIFIERS
++
++static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);
++
++void preempt_notifier_inc(void)
++{
++ static_branch_inc(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_inc);
++
++void preempt_notifier_dec(void)
++{
++ static_branch_dec(&preempt_notifier_key);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_dec);
++
++/**
++ * preempt_notifier_register - tell me when current is being preempted & rescheduled
++ * @notifier: notifier struct to register
++ */
++void preempt_notifier_register(struct preempt_notifier *notifier)
++{
++ if (!static_branch_unlikely(&preempt_notifier_key))
++ WARN(1, "registering preempt_notifier while notifiers disabled\n");
++
++ hlist_add_head(¬ifier->link, ¤t->preempt_notifiers);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_register);
++
++/**
++ * preempt_notifier_unregister - no longer interested in preemption notifications
++ * @notifier: notifier struct to unregister
++ *
++ * This is *not* safe to call from within a preemption notifier.
++ */
++void preempt_notifier_unregister(struct preempt_notifier *notifier)
++{
++ hlist_del(¬ifier->link);
++}
++EXPORT_SYMBOL_GPL(preempt_notifier_unregister);
++
++static void __fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_in(notifier, raw_smp_processor_id());
++}
++
++static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_in_preempt_notifiers(curr);
++}
++
++static void
++__fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ struct preempt_notifier *notifier;
++
++ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
++ notifier->ops->sched_out(notifier, next);
++}
++
++static __always_inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++ if (static_branch_unlikely(&preempt_notifier_key))
++ __fire_sched_out_preempt_notifiers(curr, next);
++}
++
++#else /* !CONFIG_PREEMPT_NOTIFIERS */
++
++static inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
++{
++}
++
++static inline void
++fire_sched_out_preempt_notifiers(struct task_struct *curr,
++ struct task_struct *next)
++{
++}
++
++#endif /* CONFIG_PREEMPT_NOTIFIERS */
++
++static inline void prepare_task(struct task_struct *next)
++{
++ /*
++ * Claim the task as running, we do this before switching to it
++ * such that any running task will have this set.
++ *
++ * See the smp_load_acquire(&p->on_cpu) case in ttwu() and
++ * its ordering comment.
++ */
++ WRITE_ONCE(next->on_cpu, 1);
++}
++
++static inline void finish_task(struct task_struct *prev)
++{
++#ifdef CONFIG_SMP
++ /*
++ * This must be the very last reference to @prev from this CPU. After
++ * p->on_cpu is cleared, the task can be moved to a different CPU. We
++ * must ensure this doesn't happen until the switch is completely
++ * finished.
++ *
++ * In particular, the load of prev->state in finish_task_switch() must
++ * happen before this.
++ *
++ * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
++ */
++ smp_store_release(&prev->on_cpu, 0);
++#else
++ prev->on_cpu = 0;
++#endif
++}
++
++#ifdef CONFIG_SMP
++
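++/* Run and clear every queued balance callback on @rq; rq->lock must be held. */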
++static void do_balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++ void (*func)(struct rq *rq);
++ struct balance_callback *next;
++
++ lockdep_assert_held(&rq->lock);
++
++ while (head) {
++ func = (void (*)(struct rq *))head->func;
++ next = head->next;
++ head->next = NULL;
++ head = next;
++
++ func(rq);
++ }
++}
++
++static void balance_push(struct rq *rq);
++
++/*
++ * balance_push_callback is a right abuse of the callback interface and plays
++ * by significantly different rules.
++ *
++ * Where the normal balance_callback's purpose is to be run in the same context
++ * that queued it (only later, when it's safe to drop rq->lock again),
++ * balance_push_callback is specifically targeted at __schedule().
++ *
++ * This abuse is tolerated because it places all the unlikely/odd cases behind
++ * a single test, namely: rq->balance_callback == NULL.
++ */
++struct balance_callback balance_push_callback = {
++ .next = NULL,
++ .func = balance_push,
++};
++
++static inline struct balance_callback *
++__splice_balance_callbacks(struct rq *rq, bool split)
++{
++ struct balance_callback *head = rq->balance_callback;
++
++ if (likely(!head))
++ return NULL;
++
++ lockdep_assert_rq_held(rq);
++ /*
++ * Must not take balance_push_callback off the list when
++ * splice_balance_callbacks() and balance_callbacks() are not
++ * in the same rq->lock section.
++ *
++ * In that case it would be possible for __schedule() to interleave
++ * and observe the list empty.
++ */
++ if (split && head == &balance_push_callback)
++ head = NULL;
++ else
++ rq->balance_callback = NULL;
++
++ return head;
++}
++
++static inline struct balance_callback *splice_balance_callbacks(struct rq *rq)
++{
++ return __splice_balance_callbacks(rq, true);
++}
++
++static void __balance_callbacks(struct rq *rq)
++{
++ do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
++}
++
++static inline void balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++ unsigned long flags;
++
++ if (unlikely(head)) {
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ do_balance_callbacks(rq, head);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++ }
++}
++
++#else
++
++static inline void __balance_callbacks(struct rq *rq)
++{
++}
++
++static inline struct balance_callback *splice_balance_callbacks(struct rq *rq)
++{
++ return NULL;
++}
++
++static inline void balance_callbacks(struct rq *rq, struct balance_callback *head)
++{
++}
++
++#endif
++
++static inline void
++prepare_lock_switch(struct rq *rq, struct task_struct *next)
++{
++ /*
++	 * The runqueue lock will be released by the next
++	 * task (which is an invalid locking op, but in the case
++	 * of the scheduler it's an obvious special case), so we
++	 * do an early lockdep release here:
++ */
++ spin_release(&rq->lock.dep_map, _THIS_IP_);
++#ifdef CONFIG_DEBUG_SPINLOCK
++ /* this is a valid case when another task releases the spinlock */
++ rq->lock.owner = next;
++#endif
++}
++
++static inline void finish_lock_switch(struct rq *rq)
++{
++ /*
++ * If we are tracking spinlock dependencies then we have to
++ * fix up the runqueue lock - which gets 'carried over' from
++ * prev into current:
++ */
++ spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
++ __balance_callbacks(rq);
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++/*
++ * NOP if the arch has not defined these:
++ */
++
++#ifndef prepare_arch_switch
++# define prepare_arch_switch(next) do { } while (0)
++#endif
++
++#ifndef finish_arch_post_lock_switch
++# define finish_arch_post_lock_switch() do { } while (0)
++#endif
++
++static inline void kmap_local_sched_out(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_out();
++#endif
++}
++
++static inline void kmap_local_sched_in(void)
++{
++#ifdef CONFIG_KMAP_LOCAL
++ if (unlikely(current->kmap_ctrl.idx))
++ __kmap_local_sched_in();
++#endif
++}
++
++/**
++ * prepare_task_switch - prepare to switch tasks
++ * @rq: the runqueue preparing to switch
++ * @next: the task we are going to switch to.
++ *
++ * This is called with the rq lock held and interrupts off. It must
++ * be paired with a subsequent finish_task_switch after the context
++ * switch.
++ *
++ * prepare_task_switch sets up locking and calls architecture specific
++ * hooks.
++ */
++static inline void
++prepare_task_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ kcov_prepare_switch(prev);
++ sched_info_switch(rq, prev, next);
++ perf_event_task_sched_out(prev, next);
++ rseq_preempt(prev);
++ fire_sched_out_preempt_notifiers(prev, next);
++ kmap_local_sched_out();
++ prepare_task(next);
++ prepare_arch_switch(next);
++}
++
++/**
++ * finish_task_switch - clean up after a task-switch
++ * @rq: runqueue associated with task-switch
++ * @prev: the thread we just switched away from.
++ *
++ * finish_task_switch must be called after the context switch, paired
++ * with a prepare_task_switch call before the context switch.
++ * finish_task_switch will reconcile locking set up by prepare_task_switch,
++ * and do any other architecture-specific cleanup actions.
++ *
++ * Note that we may have delayed dropping an mm in context_switch(). If
++ * so, we finish that here outside of the runqueue lock. (Doing it
++ * with the lock held can cause deadlocks; see schedule() for
++ * details.)
++ *
++ * The context switch has flipped the stack from under us and restored the
++ * local variables which were saved when this task called schedule() in the
++ * past. prev == current is still correct but we need to recalculate this_rq
++ * because prev may have moved to another CPU.
++ */
++static struct rq *finish_task_switch(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ struct rq *rq = this_rq();
++ struct mm_struct *mm = rq->prev_mm;
++ unsigned int prev_state;
++
++ /*
++ * The previous task will have left us with a preempt_count of 2
++ * because it left us after:
++ *
++ * schedule()
++ * preempt_disable(); // 1
++ * __schedule()
++ * raw_spin_lock_irq(&rq->lock) // 2
++ *
++ * Also, see FORK_PREEMPT_COUNT.
++ */
++ if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
++ "corrupted preempt_count: %s/%d/0x%x\n",
++ current->comm, current->pid, preempt_count()))
++ preempt_count_set(FORK_PREEMPT_COUNT);
++
++ rq->prev_mm = NULL;
++
++ /*
++ * A task struct has one reference for the use as "current".
++ * If a task dies, then it sets TASK_DEAD in tsk->state and calls
++ * schedule one last time. The schedule call will never return, and
++ * the scheduled task must drop that reference.
++ *
++ * We must observe prev->state before clearing prev->on_cpu (in
++ * finish_task), otherwise a concurrent wakeup can get prev
++	 * running on another CPU and we could race with its RUNNING -> DEAD
++ * transition, resulting in a double drop.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ vtime_task_switch(prev);
++ perf_event_task_sched_in(prev, current);
++ finish_task(prev);
++ tick_nohz_task_switch();
++ finish_lock_switch(rq);
++ finish_arch_post_lock_switch();
++ kcov_finish_switch(current);
++ /*
++ * kmap_local_sched_out() is invoked with rq::lock held and
++ * interrupts disabled. There is no requirement for that, but the
++ * sched out code does not have an interrupt enabled section.
++ * Restoring the maps on sched in does not require interrupts being
++ * disabled either.
++ */
++ kmap_local_sched_in();
++
++ fire_sched_in_preempt_notifiers(current);
++ /*
++ * When switching through a kernel thread, the loop in
++ * membarrier_{private,global}_expedited() may have observed that
++ * kernel thread and not issued an IPI. It is therefore possible to
++	 * schedule between user->kernel->user threads without passing through
++ * switch_mm(). Membarrier requires a barrier after storing to
++ * rq->curr, before returning to userspace, so provide them here:
++ *
++ * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
++ * provided by mmdrop(),
++ * - a sync_core for SYNC_CORE.
++ */
++ if (mm) {
++ membarrier_mm_sync_core_before_usermode(mm);
++ mmdrop_sched(mm);
++ }
++ if (unlikely(prev_state == TASK_DEAD)) {
++ /* Task is done with its stack. */
++ put_task_stack(prev);
++
++ put_task_struct_rcu_user(prev);
++ }
++
++ return rq;
++}
++
++/**
++ * schedule_tail - first thing a freshly forked thread must call.
++ * @prev: the thread we just switched away from.
++ */
++asmlinkage __visible void schedule_tail(struct task_struct *prev)
++ __releases(rq->lock)
++{
++ /*
++ * New tasks start with FORK_PREEMPT_COUNT, see there and
++ * finish_task_switch() for details.
++ *
++ * finish_task_switch() will drop rq->lock() and lower preempt_count
++ * and the preempt_enable() will end up enabling preemption (on
++ * PREEMPT_COUNT kernels).
++ */
++
++ finish_task_switch(prev);
++ preempt_enable();
++
++ if (current->set_child_tid)
++ put_user(task_pid_vnr(current), current->set_child_tid);
++
++ calculate_sigpending();
++}
++
++/*
++ * context_switch - switch to the new MM and the new thread's register state.
++ */
++static __always_inline struct rq *
++context_switch(struct rq *rq, struct task_struct *prev,
++ struct task_struct *next)
++{
++ prepare_task_switch(rq, prev, next);
++
++ /*
++ * For paravirt, this is coupled with an exit in switch_to to
++ * combine the page table reload and the switch backend into
++ * one hypercall.
++ */
++ arch_start_context_switch(prev);
++
++ /*
++ * kernel -> kernel lazy + transfer active
++ * user -> kernel lazy + mmgrab() active
++ *
++ * kernel -> user switch + mmdrop() active
++ * user -> user switch
++ *
++ * switch_mm_cid() needs to be updated if the barriers provided
++ * by context_switch() are modified.
++ */
++ if (!next->mm) { // to kernel
++ enter_lazy_tlb(prev->active_mm, next);
++
++ next->active_mm = prev->active_mm;
++ if (prev->mm) // from user
++ mmgrab(prev->active_mm);
++ else
++ prev->active_mm = NULL;
++ } else { // to user
++ membarrier_switch_mm(rq, prev->active_mm, next->mm);
++ /*
++ * sys_membarrier() requires an smp_mb() between setting
++ * rq->curr / membarrier_switch_mm() and returning to userspace.
++ *
++ * The below provides this either through switch_mm(), or in
++ * case 'prev->active_mm == next->mm' through
++ * finish_task_switch()'s mmdrop().
++ */
++ switch_mm_irqs_off(prev->active_mm, next->mm, next);
++ lru_gen_use_mm(next->mm);
++
++ if (!prev->mm) { // from kernel
++ /* will mmdrop() in finish_task_switch(). */
++ rq->prev_mm = prev->active_mm;
++ prev->active_mm = NULL;
++ }
++ }
++
++ /* switch_mm_cid() requires the memory barriers above. */
++ switch_mm_cid(rq, prev, next);
++
++ prepare_lock_switch(rq, next);
++
++ /* Here we just switch the register state and the stack. */
++ switch_to(prev, next, prev);
++ barrier();
++
++ return finish_task_switch(prev);
++}
++
++/*
++ * nr_running, nr_uninterruptible and nr_context_switches:
++ *
++ * externally visible scheduler statistics: current number of runnable
++ * threads, total number of context switches performed since bootup.
++ */
++unsigned int nr_running(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_online_cpu(i)
++ sum += cpu_rq(i)->nr_running;
++
++ return sum;
++}
++
++/*
++ * Check if only the current task is running on the CPU.
++ *
++ * Caution: this function does not check that the caller has disabled
++ * preemption, thus the result might have a time-of-check-to-time-of-use
++ * race. The caller is responsible to use it correctly, for example:
++ *
++ * - from a non-preemptible section (of course)
++ *
++ * - from a thread that is bound to a single CPU
++ *
++ * - in a loop with very short iterations (e.g. a polling loop)
++ */
++bool single_task_running(void)
++{
++ return raw_rq()->nr_running == 1;
++}
++EXPORT_SYMBOL(single_task_running);
++
++unsigned long long nr_context_switches_cpu(int cpu)
++{
++ return cpu_rq(cpu)->nr_switches;
++}
++
++unsigned long long nr_context_switches(void)
++{
++ int i;
++ unsigned long long sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += cpu_rq(i)->nr_switches;
++
++ return sum;
++}
++
++/*
++ * Consumers of these two interfaces, like for example the cpuidle menu
++ * governor, are using nonsensical data: they prefer shallow idle state
++ * selection for a CPU that has IO-wait but might not even end up running
++ * the task when it does become runnable.
++ */
++
++unsigned int nr_iowait_cpu(int cpu)
++{
++ return atomic_read(&cpu_rq(cpu)->nr_iowait);
++}
++
++/*
++ * IO-wait accounting, and how it's mostly bollocks (on SMP).
++ *
++ * The idea behind IO-wait accounting is to account the idle time that we could
++ * have spent running if it were not for IO. That is, if we were to improve the
++ * storage performance, we'd have a proportional reduction in IO-wait time.
++ *
++ * This all works nicely on UP, where, when a task blocks on IO, we account
++ * idle time as IO-wait, because if the storage were faster, it could've been
++ * running and we'd not be idle.
++ *
++ * This has been extended to SMP, by doing the same for each CPU. This however
++ * is broken.
++ *
++ * Imagine for instance the case where two tasks block on one CPU; only that
++ * CPU will have IO-wait accounted, while the other has regular idle. Yet,
++ * if the storage were faster, both could've run at the same time,
++ * utilising both CPUs.
++ *
++ * This means, that when looking globally, the current IO-wait accounting on
++ * SMP is a lower bound, by reason of under accounting.
++ *
++ * Worse, since the numbers are provided per CPU, they are sometimes
++ * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
++ * associated with any one particular CPU, it can wake to another CPU than it
++ * blocked on. This means the per CPU IO-wait number is meaningless.
++ *
++ * Task CPU affinities can make all that even more 'interesting'.
++ */
++
++unsigned int nr_iowait(void)
++{
++ unsigned int i, sum = 0;
++
++ for_each_possible_cpu(i)
++ sum += nr_iowait_cpu(i);
++
++ return sum;
++}
++
++#ifdef CONFIG_SMP
++
++/*
++ * sched_exec - execve() is a valuable balancing opportunity, because at
++ * this point the task has the smallest effective memory and cache
++ * footprint.
++ */
++void sched_exec(void)
++{
++}
++
++#endif
++
++DEFINE_PER_CPU(struct kernel_stat, kstat);
++DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
++
++EXPORT_PER_CPU_SYMBOL(kstat);
++EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
++
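++/*
++ * Account the time @p has run since last_ran: add it to the task's
++ * sched_time, charge its cgroup and thread group, and shrink the
++ * remaining time slice.
++ */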
++static inline void update_curr(struct rq *rq, struct task_struct *p)
++{
++ s64 ns = rq->clock_task - p->last_ran;
++
++ p->sched_time += ns;
++ cgroup_account_cputime(p, ns);
++ account_group_exec_runtime(p, ns);
++
++ p->time_slice -= ns;
++ p->last_ran = rq->clock_task;
++}
++
++/*
++ * Return accounted runtime for the task.
++ * If the task is currently running, also include current's pending runtime
++ * that has not been accounted yet.
++ */
++unsigned long long task_sched_runtime(struct task_struct *p)
++{
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++ u64 ns;
++
++#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
++ /*
++ * 64-bit doesn't need locks to atomically read a 64-bit value.
++	 * So we have an optimization chance when the task's delta_exec is 0.
++ * Reading ->on_cpu is racy, but this is ok.
++ *
++ * If we race with it leaving CPU, we'll take a lock. So we're correct.
++ * If we race with it entering CPU, unaccounted time is 0. This is
++ * indistinguishable from the read occurring a few cycles earlier.
++ * If we see ->on_cpu without ->on_rq, the task is leaving, and has
++ * been accounted, so we're correct here as well.
++ */
++ if (!p->on_cpu || !task_on_rq_queued(p))
++ return tsk_seruntime(p);
++#endif
++
++ rq = task_access_lock_irqsave(p, &lock, &flags);
++ /*
++ * Must be ->curr _and_ ->on_rq. If dequeued, we would
++ * project cycles that may never be accounted to this
++ * thread, breaking clock_gettime().
++ */
++ if (p == rq->curr && task_on_rq_queued(p)) {
++ update_rq_clock(rq);
++ update_curr(rq, p);
++ }
++ ns = tsk_seruntime(p);
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++ return ns;
++}
++
++/* This manages tasks that have run out of timeslice during a scheduler_tick */
++static inline void scheduler_task_tick(struct rq *rq)
++{
++ struct task_struct *p = rq->curr;
++
++ if (is_idle_task(p))
++ return;
++
++ update_curr(rq, p);
++ cpufreq_update_util(rq, 0);
++
++ /*
++	 * Tasks that have less than RESCHED_NS of time slice left will be
++	 * rescheduled.
++ */
++ if (p->time_slice >= RESCHED_NS)
++ return;
++ set_tsk_need_resched(p);
++ set_preempt_need_resched();
++}
++
++#ifdef CONFIG_SCHED_DEBUG
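++/*
++ * How long has this rq carried a pending need_resched? Returns the latency
++ * once it exceeds sysctl_resched_latency_warn_ms, 0 otherwise (reported at
++ * most once when sysctl_resched_latency_warn_once is set).
++ */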
++static u64 cpu_resched_latency(struct rq *rq)
++{
++ int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
++ u64 resched_latency, now = rq_clock(rq);
++ static bool warned_once;
++
++ if (sysctl_resched_latency_warn_once && warned_once)
++ return 0;
++
++ if (!need_resched() || !latency_warn_ms)
++ return 0;
++
++ if (system_state == SYSTEM_BOOTING)
++ return 0;
++
++ if (!rq->last_seen_need_resched_ns) {
++ rq->last_seen_need_resched_ns = now;
++ rq->ticks_without_resched = 0;
++ return 0;
++ }
++
++ rq->ticks_without_resched++;
++ resched_latency = now - rq->last_seen_need_resched_ns;
++ if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
++ return 0;
++
++ warned_once = true;
++
++ return resched_latency;
++}
++
++static int __init setup_resched_latency_warn_ms(char *str)
++{
++ long val;
++
++ if ((kstrtol(str, 0, &val))) {
++ pr_warn("Unable to set resched_latency_warn_ms\n");
++ return 1;
++ }
++
++ sysctl_resched_latency_warn_ms = val;
++ return 1;
++}
++__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
++#else
++static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
++#endif /* CONFIG_SCHED_DEBUG */
++
++/*
++ * This function gets called by the timer code, with HZ frequency.
++ * We call it with interrupts disabled.
++ */
++void scheduler_tick(void)
++{
++ int cpu __maybe_unused = smp_processor_id();
++ struct rq *rq = cpu_rq(cpu);
++ u64 resched_latency;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
++ arch_scale_freq_tick();
++
++ sched_clock_tick();
++
++ raw_spin_lock(&rq->lock);
++ update_rq_clock(rq);
++
++ scheduler_task_tick(rq);
++ if (sched_feat(LATENCY_WARN))
++ resched_latency = cpu_resched_latency(rq);
++ calc_global_load_tick(rq);
++
++ task_tick_mm_cid(rq, rq->curr);
++
++ rq->last_tick = rq->clock;
++ raw_spin_unlock(&rq->lock);
++
++ if (sched_feat(LATENCY_WARN) && resched_latency)
++ resched_latency_warn(cpu, resched_latency);
++
++ perf_event_task_tick();
++}
++
++#ifdef CONFIG_SCHED_SMT
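++/*
++ * Stopper callback used by sg_balance_trigger(): re-check the task under
++ * pi_lock and rq->lock, and if it is still queued here, still affine to
++ * sched_sg_idle_mask and not migration-disabled, move it to the best
++ * idle sibling-group CPU.
++ */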
++static inline int sg_balance_cpu_stop(void *data)
++{
++ struct rq *rq = this_rq();
++ struct task_struct *p = data;
++ cpumask_t tmp;
++ unsigned long flags;
++
++ local_irq_save(flags);
++
++ raw_spin_lock(&p->pi_lock);
++ raw_spin_lock(&rq->lock);
++
++ rq->active_balance = 0;
++ /* _something_ may have changed the task, double check again */
++ if (task_on_rq_queued(p) && task_rq(p) == rq &&
++ cpumask_and(&tmp, p->cpus_ptr, &sched_sg_idle_mask) &&
++ !is_migration_disabled(p)) {
++ int cpu = cpu_of(rq);
++ int dcpu = __best_mask_cpu(&tmp, per_cpu(sched_cpu_llc_mask, cpu));
++ rq = move_queued_task(rq, p, dcpu);
++ }
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock(&p->pi_lock);
++
++ local_irq_restore(flags);
++
++ return 0;
++}
++
++/* sg_balance_trigger - trigger sibling group balance for @cpu */
++static inline int sg_balance_trigger(const int cpu)
++{
++	struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++ struct task_struct *curr;
++ int res;
++
++ if (!raw_spin_trylock_irqsave(&rq->lock, flags))
++ return 0;
++ curr = rq->curr;
++ res = (!is_idle_task(curr)) && (1 == rq->nr_running) &&\
++ cpumask_intersects(curr->cpus_ptr, &sched_sg_idle_mask) &&\
++ !is_migration_disabled(curr) && (!rq->active_balance);
++
++ if (res)
++ rq->active_balance = 1;
++
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ if (res)
++ stop_one_cpu_nowait(cpu, sg_balance_cpu_stop, curr,
++ &rq->active_balance_work);
++ return res;
++}
++
++/*
++ * sg_balance - sibling group balance check for run queue @rq
++ */
++static inline void sg_balance(struct rq *rq, int cpu)
++{
++ cpumask_t chk;
++
++ /* exit when cpu is offline */
++ if (unlikely(!rq->online))
++ return;
++
++ /*
++	 * Only a CPU in an idle sibling group will do the checking, and then
++	 * find potential CPUs to which the currently running task can migrate
++ */
++ if (cpumask_test_cpu(cpu, &sched_sg_idle_mask) &&
++ cpumask_andnot(&chk, cpu_online_mask, sched_idle_mask) &&
++ cpumask_andnot(&chk, &chk, &sched_rq_pending_mask)) {
++ int i;
++
++ for_each_cpu_wrap(i, &chk, cpu) {
++ if (!cpumask_intersects(cpu_smt_mask(i), sched_idle_mask) &&\
++ sg_balance_trigger(i))
++ return;
++ }
++ }
++}
++#endif /* CONFIG_SCHED_SMT */
++
++#ifdef CONFIG_NO_HZ_FULL
++
++struct tick_work {
++ int cpu;
++ atomic_t state;
++ struct delayed_work work;
++};
++/* Values for ->state, see diagram below. */
++#define TICK_SCHED_REMOTE_OFFLINE 0
++#define TICK_SCHED_REMOTE_OFFLINING 1
++#define TICK_SCHED_REMOTE_RUNNING 2
++
++/*
++ * State diagram for ->state:
++ *
++ *
++ * TICK_SCHED_REMOTE_OFFLINE
++ * | ^
++ * | |
++ * | | sched_tick_remote()
++ * | |
++ * | |
++ * +--TICK_SCHED_REMOTE_OFFLINING
++ * | ^
++ * | |
++ * sched_tick_start() | | sched_tick_stop()
++ * | |
++ * V |
++ * TICK_SCHED_REMOTE_RUNNING
++ *
++ *
++ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
++ * and sched_tick_start() are happy to leave the state in RUNNING.
++ */
++
++static struct tick_work __percpu *tick_work_cpu;
++
++static void sched_tick_remote(struct work_struct *work)
++{
++ struct delayed_work *dwork = to_delayed_work(work);
++ struct tick_work *twork = container_of(dwork, struct tick_work, work);
++ int cpu = twork->cpu;
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *curr;
++ unsigned long flags;
++ u64 delta;
++ int os;
++
++ /*
++ * Handle the tick only if it appears the remote CPU is running in full
++ * dynticks mode. The check is racy by nature, but missing a tick or
++	 * having one too many is no big deal because the scheduler tick updates
++ * statistics and checks timeslices in a time-independent way, regardless
++ * of when exactly it is running.
++ */
++ if (!tick_nohz_tick_stopped_cpu(cpu))
++ goto out_requeue;
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ curr = rq->curr;
++ if (cpu_is_offline(cpu))
++ goto out_unlock;
++
++ update_rq_clock(rq);
++ if (!is_idle_task(curr)) {
++ /*
++ * Make sure the next tick runs within a reasonable
++ * amount of time.
++ */
++ delta = rq_clock_task(rq) - curr->last_ran;
++ WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
++ }
++ scheduler_task_tick(rq);
++
++ calc_load_nohz_remote(rq);
++out_unlock:
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++out_requeue:
++ /*
++ * Run the remote tick once per second (1Hz). This arbitrary
++ * frequency is large enough to avoid overload but short enough
++ * to keep scheduler internal stats reasonably up to date. But
++ * first update state to reflect hotplug activity if required.
++ */
++ os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
++ if (os == TICK_SCHED_REMOTE_RUNNING)
++ queue_delayed_work(system_unbound_wq, dwork, HZ);
++}
++
++static void sched_tick_start(int cpu)
++{
++ int os;
++ struct tick_work *twork;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
++ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
++ if (os == TICK_SCHED_REMOTE_OFFLINE) {
++ twork->cpu = cpu;
++ INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
++ queue_delayed_work(system_unbound_wq, &twork->work, HZ);
++ }
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++static void sched_tick_stop(int cpu)
++{
++ struct tick_work *twork;
++ int os;
++
++ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
++ return;
++
++ WARN_ON_ONCE(!tick_work_cpu);
++
++ twork = per_cpu_ptr(tick_work_cpu, cpu);
++ /* There cannot be competing actions, but don't rely on stop-machine. */
++ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_OFFLINING);
++ WARN_ON_ONCE(os != TICK_SCHED_REMOTE_RUNNING);
++ /* Don't cancel, as this would mess up the state machine. */
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++int __init sched_tick_offload_init(void)
++{
++ tick_work_cpu = alloc_percpu(struct tick_work);
++ BUG_ON(!tick_work_cpu);
++ return 0;
++}
++
++#else /* !CONFIG_NO_HZ_FULL */
++static inline void sched_tick_start(int cpu) { }
++static inline void sched_tick_stop(int cpu) { }
++#endif
++
++#if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
++ defined(CONFIG_PREEMPT_TRACER))
++/*
++ * If the value passed in is equal to the current preempt count
++ * then we just disabled preemption. Start timing the latency.
++ */
++static inline void preempt_latency_start(int val)
++{
++ if (preempt_count() == val) {
++ unsigned long ip = get_lock_parent_ip();
++#ifdef CONFIG_DEBUG_PREEMPT
++ current->preempt_disable_ip = ip;
++#endif
++ trace_preempt_off(CALLER_ADDR0, ip);
++ }
++}
++
++void preempt_count_add(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON((preempt_count() < 0)))
++ return;
++#endif
++ __preempt_count_add(val);
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Spinlock count overflowing soon?
++ */
++ DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
++ PREEMPT_MASK - 10);
++#endif
++ preempt_latency_start(val);
++}
++EXPORT_SYMBOL(preempt_count_add);
++NOKPROBE_SYMBOL(preempt_count_add);
++
++/*
++ * If the value passed in is equal to the current preempt count
++ * then we just enabled preemption. Stop timing the latency.
++ */
++static inline void preempt_latency_stop(int val)
++{
++ if (preempt_count() == val)
++ trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
++}
++
++void preempt_count_sub(int val)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ /*
++ * Underflow?
++ */
++ if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
++ return;
++ /*
++ * Is the spinlock portion underflowing?
++ */
++ if (DEBUG_LOCKS_WARN_ON((val < PREEMPT_MASK) &&
++ !(preempt_count() & PREEMPT_MASK)))
++ return;
++#endif
++
++ preempt_latency_stop(val);
++ __preempt_count_sub(val);
++}
++EXPORT_SYMBOL(preempt_count_sub);
++NOKPROBE_SYMBOL(preempt_count_sub);
++
++#else
++static inline void preempt_latency_start(int val) { }
++static inline void preempt_latency_stop(int val) { }
++#endif
++
++static inline unsigned long get_preempt_disable_ip(struct task_struct *p)
++{
++#ifdef CONFIG_DEBUG_PREEMPT
++ return p->preempt_disable_ip;
++#else
++ return 0;
++#endif
++}
++
++/*
++ * Print scheduling while atomic bug:
++ */
++static noinline void __schedule_bug(struct task_struct *prev)
++{
++ /* Save this before calling printk(), since that will clobber it */
++ unsigned long preempt_disable_ip = get_preempt_disable_ip(current);
++
++ if (oops_in_progress)
++ return;
++
++ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
++ prev->comm, prev->pid, preempt_count());
++
++ debug_show_held_locks(prev);
++ print_modules();
++ if (irqs_disabled())
++ print_irqtrace_events(prev);
++ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)
++ && in_atomic_preempt_off()) {
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, preempt_disable_ip);
++ }
++ check_panic_on_warn("scheduling while atomic");
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++
++/*
++ * Various schedule()-time debugging checks and statistics:
++ */
++static inline void schedule_debug(struct task_struct *prev, bool preempt)
++{
++#ifdef CONFIG_SCHED_STACK_END_CHECK
++ if (task_stack_end_corrupted(prev))
++ panic("corrupted stack end detected inside scheduler\n");
++
++ if (task_scs_end_corrupted(prev))
++ panic("corrupted shadow stack detected inside scheduler\n");
++#endif
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++ if (!preempt && READ_ONCE(prev->__state) && prev->non_block_count) {
++ printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
++ prev->comm, prev->pid, prev->non_block_count);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++ }
++#endif
++
++ if (unlikely(in_atomic_preempt_off())) {
++ __schedule_bug(prev);
++ preempt_count_set(PREEMPT_DISABLED);
++ }
++ rcu_sleep_check();
++ SCHED_WARN_ON(ct_state() == CONTEXT_USER);
++
++ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
++
++ schedstat_inc(this_rq()->sched_count);
++}
++
++#ifdef ALT_SCHED_DEBUG
++void alt_sched_debug(void)
++{
++ printk(KERN_INFO "sched: pending: 0x%04lx, idle: 0x%04lx, sg_idle: 0x%04lx\n",
++ sched_rq_pending_mask.bits[0],
++ sched_idle_mask->bits[0],
++ sched_sg_idle_mask.bits[0]);
++}
++#else
++inline void alt_sched_debug(void) {}
++#endif
++
++#ifdef CONFIG_SMP
++
++#ifdef CONFIG_PREEMPT_RT
++#define SCHED_NR_MIGRATE_BREAK 8
++#else
++#define SCHED_NR_MIGRATE_BREAK 32
++#endif
++
++const_debug unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
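++
++/*
++ * Note: each pull below examines at most min(src->nr_running / 2,
++ * sysctl_sched_nr_migrate) queued tasks, so a single take_other_rq_tasks()
++ * pass never walks more than half of a busy source run queue.
++ */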
++
++/*
++ * Migrate pending tasks in @rq to @dest_cpu
++ */
++static inline int
++migrate_pending_tasks(struct rq *rq, struct rq *dest_rq, const int dest_cpu)
++{
++ struct task_struct *p, *skip = rq->curr;
++ int nr_migrated = 0;
++ int nr_tries = min(rq->nr_running / 2, sysctl_sched_nr_migrate);
++
++	/* Workaround to check that rq->curr is still on the rq */
++ if (!task_on_rq_queued(skip))
++ return 0;
++
++ while (skip != rq->idle && nr_tries &&
++ (p = sched_rq_next_task(skip, rq)) != rq->idle) {
++ skip = sched_rq_next_task(p, rq);
++ if (cpumask_test_cpu(dest_cpu, p->cpus_ptr)) {
++ __SCHED_DEQUEUE_TASK(p, rq, 0, );
++ set_task_cpu(p, dest_cpu);
++ sched_task_sanity_check(p, dest_rq);
++ sched_mm_cid_migrate_to(dest_rq, p, cpu_of(rq));
++ __SCHED_ENQUEUE_TASK(p, dest_rq, 0);
++ nr_migrated++;
++ }
++ nr_tries--;
++ }
++
++ return nr_migrated;
++}
++
++static inline int take_other_rq_tasks(struct rq *rq, int cpu)
++{
++ struct cpumask *topo_mask, *end_mask;
++
++ if (unlikely(!rq->online))
++ return 0;
++
++ if (cpumask_empty(&sched_rq_pending_mask))
++ return 0;
++
++ topo_mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++ end_mask = per_cpu(sched_cpu_topo_end_mask, cpu);
++ do {
++ int i;
++ for_each_cpu_and(i, &sched_rq_pending_mask, topo_mask) {
++ int nr_migrated;
++ struct rq *src_rq;
++
++ src_rq = cpu_rq(i);
++ if (!do_raw_spin_trylock(&src_rq->lock))
++ continue;
++ spin_acquire(&src_rq->lock.dep_map,
++ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
++
++ if ((nr_migrated = migrate_pending_tasks(src_rq, rq, cpu))) {
++ src_rq->nr_running -= nr_migrated;
++ if (src_rq->nr_running < 2)
++ cpumask_clear_cpu(i, &sched_rq_pending_mask);
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++
++ rq->nr_running += nr_migrated;
++ if (rq->nr_running > 1)
++ cpumask_set_cpu(cpu, &sched_rq_pending_mask);
++
++ update_sched_preempt_mask(rq);
++ cpufreq_update_util(rq, 0);
++
++ return 1;
++ }
++
++ spin_release(&src_rq->lock.dep_map, _RET_IP_);
++ do_raw_spin_unlock(&src_rq->lock);
++ }
++ } while (++topo_mask < end_mask);
++
++ return 0;
++}
++#endif
++
++/*
++ * Timeslices below RESCHED_NS are considered as good as expired as there's no
++ * point rescheduling when there's so little time left.
++ */
++static inline void check_curr(struct task_struct *p, struct rq *rq)
++{
++ if (unlikely(rq->idle == p))
++ return;
++
++ update_curr(rq, p);
++
++ if (p->time_slice < RESCHED_NS)
++ time_slice_expired(p, rq);
++}
++
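++/*
++ * Pick the next task for @rq: honour a one-shot rq->skip hint first; when
++ * only the idle task is left, try take_other_rq_tasks() before going idle.
++ */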
++static inline struct task_struct *
++choose_next_task(struct rq *rq, int cpu)
++{
++ struct task_struct *next;
++
++ if (unlikely(rq->skip)) {
++ next = rq_runnable_task(rq);
++ if (next == rq->idle) {
++#ifdef CONFIG_SMP
++ if (!take_other_rq_tasks(rq, cpu)) {
++#endif
++ rq->skip = NULL;
++ schedstat_inc(rq->sched_goidle);
++ return next;
++#ifdef CONFIG_SMP
++ }
++ next = rq_runnable_task(rq);
++#endif
++ }
++ rq->skip = NULL;
++#ifdef CONFIG_HIGH_RES_TIMERS
++ hrtick_start(rq, next->time_slice);
++#endif
++ return next;
++ }
++
++ next = sched_rq_first_task(rq);
++ if (next == rq->idle) {
++#ifdef CONFIG_SMP
++ if (!take_other_rq_tasks(rq, cpu)) {
++#endif
++ schedstat_inc(rq->sched_goidle);
++ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
++ return next;
++#ifdef CONFIG_SMP
++ }
++ next = sched_rq_first_task(rq);
++#endif
++ }
++#ifdef CONFIG_HIGH_RES_TIMERS
++ hrtick_start(rq, next->time_slice);
++#endif
++ /*printk(KERN_INFO "sched: choose_next_task(%d) next %px\n", cpu, next);*/
++ return next;
++}
++
++/*
++ * Constants for the sched_mode argument of __schedule().
++ *
++ * The mode argument allows RT enabled kernels to differentiate a
++ * preemption from blocking on an 'sleeping' spin/rwlock. Note that
++ * SM_MASK_PREEMPT for !RT has all bits set, which allows the compiler to
++ * optimize the AND operation out and just check for zero.
++ */
++#define SM_NONE 0x0
++#define SM_PREEMPT 0x1
++#define SM_RTLOCK_WAIT 0x2
++
++#ifndef CONFIG_PREEMPT_RT
++# define SM_MASK_PREEMPT (~0U)
++#else
++# define SM_MASK_PREEMPT SM_PREEMPT
++#endif
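++
++/*
++ * Example: on !PREEMPT_RT, (sched_mode & SM_MASK_PREEMPT) is just
++ * sched_mode, so the AND is optimized away; on PREEMPT_RT only SM_PREEMPT
++ * is tested, so an SM_RTLOCK_WAIT block is treated like a regular sleep.
++ */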
++
++/*
++ * schedule() is the main scheduler function.
++ *
++ * The main means of driving the scheduler and thus entering this function are:
++ *
++ * 1. Explicit blocking: mutex, semaphore, waitqueue, etc.
++ *
++ * 2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
++ * paths. For example, see arch/x86/entry_64.S.
++ *
++ * To drive preemption between tasks, the scheduler sets the flag in timer
++ * interrupt handler scheduler_tick().
++ *
++ * 3. Wakeups don't really cause entry into schedule(). They add a
++ * task to the run-queue and that's it.
++ *
++ * Now, if the new task added to the run-queue preempts the current
++ * task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
++ * called on the nearest possible occasion:
++ *
++ * - If the kernel is preemptible (CONFIG_PREEMPTION=y):
++ *
++ * - in syscall or exception context, at the next outmost
++ * preempt_enable(). (this might be as soon as the wake_up()'s
++ * spin_unlock()!)
++ *
++ * - in IRQ context, return from interrupt-handler to
++ * preemptible context
++ *
++ * - If the kernel is not preemptible (CONFIG_PREEMPTION is not set)
++ * then at the next:
++ *
++ * - cond_resched() call
++ * - explicit schedule() call
++ * - return from syscall or exception to user-space
++ * - return from interrupt-handler to user-space
++ *
++ * WARNING: must be called with preemption disabled!
++ */
++static void __sched notrace __schedule(unsigned int sched_mode)
++{
++ struct task_struct *prev, *next;
++ unsigned long *switch_count;
++ unsigned long prev_state;
++ struct rq *rq;
++ int cpu;
++
++ cpu = smp_processor_id();
++ rq = cpu_rq(cpu);
++ prev = rq->curr;
++
++ schedule_debug(prev, !!sched_mode);
++
++	/* bypass the sched_feat(HRTICK) check, which Alt schedule FW doesn't support */
++ hrtick_clear(rq);
++
++ local_irq_disable();
++ rcu_note_context_switch(!!sched_mode);
++
++ /*
++ * Make sure that signal_pending_state()->signal_pending() below
++ * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
++ * done by the caller to avoid the race with signal_wake_up():
++ *
++ * __set_current_state(@state) signal_wake_up()
++ * schedule() set_tsk_thread_flag(p, TIF_SIGPENDING)
++ * wake_up_state(p, state)
++ * LOCK rq->lock LOCK p->pi_state
++ * smp_mb__after_spinlock() smp_mb__after_spinlock()
++ * if (signal_pending_state()) if (p->state & @state)
++ *
++ * Also, the membarrier system call requires a full memory barrier
++ * after coming from user-space, before storing to rq->curr.
++ */
++ raw_spin_lock(&rq->lock);
++ smp_mb__after_spinlock();
++
++ update_rq_clock(rq);
++
++ switch_count = &prev->nivcsw;
++ /*
++ * We must load prev->state once (task_struct::state is volatile), such
++ * that we form a control dependency vs deactivate_task() below.
++ */
++ prev_state = READ_ONCE(prev->__state);
++ if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) {
++ if (signal_pending_state(prev_state, prev)) {
++ WRITE_ONCE(prev->__state, TASK_RUNNING);
++ } else {
++ prev->sched_contributes_to_load =
++ (prev_state & TASK_UNINTERRUPTIBLE) &&
++ !(prev_state & TASK_NOLOAD) &&
++ !(prev_state & TASK_FROZEN);
++
++ if (prev->sched_contributes_to_load)
++ rq->nr_uninterruptible++;
++
++ /*
++ * __schedule() ttwu()
++ * prev_state = prev->state; if (p->on_rq && ...)
++ * if (prev_state) goto out;
++ * p->on_rq = 0; smp_acquire__after_ctrl_dep();
++ * p->state = TASK_WAKING
++ *
++ * Where __schedule() and ttwu() have matching control dependencies.
++ *
++ * After this, schedule() must not care about p->state any more.
++ */
++ sched_task_deactivate(prev, rq);
++ deactivate_task(prev, rq);
++
++ if (prev->in_iowait) {
++ atomic_inc(&rq->nr_iowait);
++ delayacct_blkio_start();
++ }
++ }
++ switch_count = &prev->nvcsw;
++ }
++
++ check_curr(prev, rq);
++
++ next = choose_next_task(rq, cpu);
++ clear_tsk_need_resched(prev);
++ clear_preempt_need_resched();
++#ifdef CONFIG_SCHED_DEBUG
++ rq->last_seen_need_resched_ns = 0;
++#endif
++
++ if (likely(prev != next)) {
++ next->last_ran = rq->clock_task;
++ rq->last_ts_switch = rq->clock;
++
++ /*printk(KERN_INFO "sched: %px -> %px\n", prev, next);*/
++ rq->nr_switches++;
++ /*
++ * RCU users of rcu_dereference(rq->curr) may not see
++ * changes to task_struct made by pick_next_task().
++ */
++ RCU_INIT_POINTER(rq->curr, next);
++ /*
++ * The membarrier system call requires each architecture
++ * to have a full memory barrier after updating
++ * rq->curr, before returning to user-space.
++ *
++ * Here are the schemes providing that barrier on the
++ * various architectures:
++ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC.
++		 * switch_mm() relies on membarrier_arch_switch_mm() on PowerPC.
++ * - finish_lock_switch() for weakly-ordered
++ * architectures where spin_unlock is a full barrier,
++ * - switch_to() for arm64 (weakly-ordered, spin_unlock
++ * is a RELEASE barrier),
++ */
++ ++*switch_count;
++
++ trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next, prev_state);
++
++ /* Also unlocks the rq: */
++ rq = context_switch(rq, prev, next);
++
++ cpu = cpu_of(rq);
++ } else {
++ __balance_callbacks(rq);
++ raw_spin_unlock_irq(&rq->lock);
++ }
++
++#ifdef CONFIG_SCHED_SMT
++ sg_balance(rq, cpu);
++#endif
++}
++
++void __noreturn do_task_dead(void)
++{
++ /* Causes final put_task_struct in finish_task_switch(): */
++ set_special_state(TASK_DEAD);
++
++ /* Tell freezer to ignore us: */
++ current->flags |= PF_NOFREEZE;
++
++ __schedule(SM_NONE);
++ BUG();
++
++ /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
++ for (;;)
++ cpu_relax();
++}
++
++static inline void sched_submit_work(struct task_struct *tsk)
++{
++ unsigned int task_flags;
++
++ if (task_is_running(tsk))
++ return;
++
++ task_flags = tsk->flags;
++ /*
++ * If a worker goes to sleep, notify and ask workqueue whether it
++ * wants to wake up a task to maintain concurrency.
++ */
++ if (task_flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
++ if (task_flags & PF_WQ_WORKER)
++ wq_worker_sleeping(tsk);
++ else
++ io_wq_worker_sleeping(tsk);
++ }
++
++ /*
++ * spinlock and rwlock must not flush block requests. This will
++ * deadlock if the callback attempts to acquire a lock which is
++ * already acquired.
++ */
++ SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
++
++ /*
++ * If we are going to sleep and we have plugged IO queued,
++ * make sure to submit it to avoid deadlocks.
++ */
++ blk_flush_plug(tsk->plug, true);
++}
++
++static void sched_update_worker(struct task_struct *tsk)
++{
++ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
++ if (tsk->flags & PF_WQ_WORKER)
++ wq_worker_running(tsk);
++ else
++ io_wq_worker_running(tsk);
++ }
++}
++
++asmlinkage __visible void __sched schedule(void)
++{
++ struct task_struct *tsk = current;
++
++ sched_submit_work(tsk);
++ do {
++ preempt_disable();
++ __schedule(SM_NONE);
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++ sched_update_worker(tsk);
++}
++EXPORT_SYMBOL(schedule);
++
++/*
++ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
++ * state (have scheduled out non-voluntarily) by making sure that all
++ * tasks have either left the run queue or have gone into user space.
++ * As idle tasks do not do either, they must not ever be preempted
++ * (schedule out non-voluntarily).
++ *
++ * schedule_idle() is similar to schedule_preempt_disabled() except that it
++ * never enables preemption because it does not call sched_submit_work().
++ */
++void __sched schedule_idle(void)
++{
++ /*
++ * As this skips calling sched_submit_work(), which the idle task does
++ * regardless because that function is a nop when the task is in a
++ * TASK_RUNNING state, make sure this isn't used someplace that the
++ * current task can be in any other state. Note, idle is always in the
++ * TASK_RUNNING state.
++ */
++ WARN_ON_ONCE(current->__state);
++ do {
++ __schedule(SM_NONE);
++ } while (need_resched());
++}
++
++#if defined(CONFIG_CONTEXT_TRACKING_USER) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)
++asmlinkage __visible void __sched schedule_user(void)
++{
++ /*
++ * If we come here after a random call to set_need_resched(),
++ * or we have been woken up remotely but the IPI has not yet arrived,
++ * we haven't yet exited the RCU idle mode. Do it here manually until
++ * we find a better solution.
++ *
++ * NB: There are buggy callers of this function. Ideally we
++ * should warn if prev_state != CONTEXT_USER, but that will trigger
++ * too frequently to make sense yet.
++ */
++ enum ctx_state prev_state = exception_enter();
++ schedule();
++ exception_exit(prev_state);
++}
++#endif
++
++/**
++ * schedule_preempt_disabled - called with preemption disabled
++ *
++ * Returns with preemption disabled. Note: preempt_count must be 1
++ */
++void __sched schedule_preempt_disabled(void)
++{
++ sched_preempt_enable_no_resched();
++ schedule();
++ preempt_disable();
++}
++
++#ifdef CONFIG_PREEMPT_RT
++void __sched notrace schedule_rtlock(void)
++{
++ do {
++ preempt_disable();
++ __schedule(SM_RTLOCK_WAIT);
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++}
++NOKPROBE_SYMBOL(schedule_rtlock);
++#endif
++
++static void __sched notrace preempt_schedule_common(void)
++{
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ __schedule(SM_PREEMPT);
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++
++ /*
++ * Check again in case we missed a preemption opportunity
++ * between schedule and now.
++ */
++ } while (need_resched());
++}
++
++#ifdef CONFIG_PREEMPTION
++/*
++ * This is the entry point to schedule() from in-kernel preemption
++ * off of preempt_enable.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule(void)
++{
++ /*
++ * If there is a non-zero preempt_count or interrupts are disabled,
++ * we do not want to preempt the current task. Just return..
++ */
++ if (likely(!preemptible()))
++ return;
++
++ preempt_schedule_common();
++}
++NOKPROBE_SYMBOL(preempt_schedule);
++EXPORT_SYMBOL(preempt_schedule);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#ifndef preempt_schedule_dynamic_enabled
++#define preempt_schedule_dynamic_enabled preempt_schedule
++#define preempt_schedule_dynamic_disabled NULL
++#endif
++DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
++void __sched notrace dynamic_preempt_schedule(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
++ return;
++ preempt_schedule();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule);
++EXPORT_SYMBOL(dynamic_preempt_schedule);
++#endif
++#endif
++
++/**
++ * preempt_schedule_notrace - preempt_schedule called by tracing
++ *
++ * The tracing infrastructure uses preempt_enable_notrace to prevent
++ * recursion and tracing preempt enabling caused by the tracing
++ * infrastructure itself. But as tracing can happen in areas coming
++ * from userspace or just about to enter userspace, a preempt enable
++ * can occur before user_exit() is called. This will cause the scheduler
++ * to be called when the system is still in usermode.
++ *
++ * To prevent this, the preempt_enable_notrace will use this function
++ * instead of preempt_schedule() to exit user context if needed before
++ * calling the scheduler.
++ */
++asmlinkage __visible void __sched notrace preempt_schedule_notrace(void)
++{
++ enum ctx_state prev_ctx;
++
++ if (likely(!preemptible()))
++ return;
++
++ do {
++ /*
++ * Because the function tracer can trace preempt_count_sub()
++ * and it also uses preempt_enable/disable_notrace(), if
++ * NEED_RESCHED is set, the preempt_enable_notrace() called
++ * by the function tracer will call this function again and
++ * cause infinite recursion.
++ *
++ * Preemption must be disabled here before the function
++ * tracer can trace. Break up preempt_disable() into two
++ * calls. One to disable preemption without fear of being
++ * traced. The other to still record the preemption latency,
++ * which can also be traced by the function tracer.
++ */
++ preempt_disable_notrace();
++ preempt_latency_start(1);
++ /*
++ * Needs preempt disabled in case user_exit() is traced
++ * and the tracer calls preempt_enable_notrace() causing
++ * an infinite recursion.
++ */
++ prev_ctx = exception_enter();
++ __schedule(SM_PREEMPT);
++ exception_exit(prev_ctx);
++
++ preempt_latency_stop(1);
++ preempt_enable_no_resched_notrace();
++ } while (need_resched());
++}
++EXPORT_SYMBOL_GPL(preempt_schedule_notrace);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#ifndef preempt_schedule_notrace_dynamic_enabled
++#define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace
++#define preempt_schedule_notrace_dynamic_disabled NULL
++#endif
++DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled);
++EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
++void __sched notrace dynamic_preempt_schedule_notrace(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
++ return;
++ preempt_schedule_notrace();
++}
++NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
++EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
++#endif
++#endif
++
++#endif /* CONFIG_PREEMPTION */
++
++/*
++ * This is the entry point to schedule() from kernel preemption
++ * off of irq context.
++ * Note that this is called and returns with irqs disabled. This will
++ * protect us against recursive calling from irq.
++ */
++asmlinkage __visible void __sched preempt_schedule_irq(void)
++{
++ enum ctx_state prev_state;
++
++ /* Catch callers which need to be fixed */
++ BUG_ON(preempt_count() || !irqs_disabled());
++
++ prev_state = exception_enter();
++
++ do {
++ preempt_disable();
++ local_irq_enable();
++ __schedule(SM_PREEMPT);
++ local_irq_disable();
++ sched_preempt_enable_no_resched();
++ } while (need_resched());
++
++ exception_exit(prev_state);
++}
++
++int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
++ void *key)
++{
++ WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~WF_SYNC);
++ return try_to_wake_up(curr->private, mode, wake_flags);
++}
++EXPORT_SYMBOL(default_wake_function);
++
++static inline void check_task_changed(struct task_struct *p, struct rq *rq)
++{
++ /* Trigger resched if task sched_prio has been modified. */
++ if (task_on_rq_queued(p)) {
++ int idx;
++
++ update_rq_clock(rq);
++ idx = task_sched_prio_idx(p, rq);
++ if (idx != p->sq_idx) {
++ requeue_task(p, rq, idx);
++ check_preempt_curr(rq);
++ }
++ }
++}
++
++static void __setscheduler_prio(struct task_struct *p, int prio)
++{
++ p->prio = prio;
++}
++
++#ifdef CONFIG_RT_MUTEXES
++
++static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
++{
++ if (pi_task)
++ prio = min(prio, pi_task->prio);
++
++ return prio;
++}
++
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ struct task_struct *pi_task = rt_mutex_get_top_task(p);
++
++ return __rt_effective_prio(pi_task, prio);
++}
++
++/*
++ * rt_mutex_setprio - set the current priority of a task
++ * @p: task to boost
++ * @pi_task: donor task
++ *
++ * This function changes the 'effective' priority of a task. It does
++ * not touch ->normal_prio like __setscheduler().
++ *
++ * Used by the rt_mutex code to implement priority inheritance
++ * logic. Call site only calls if the priority of the task changed.
++ */
++void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
++{
++ int prio;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ /* XXX used to be waiter->prio, not waiter->task->prio */
++ prio = __rt_effective_prio(pi_task, p->normal_prio);
++
++ /*
++ * If nothing changed; bail early.
++ */
++ if (p->pi_top_task == pi_task && prio == p->prio)
++ return;
++
++ rq = __task_access_lock(p, &lock);
++ /*
++ * Set under pi_lock && rq->lock, such that the value can be used under
++ * either lock.
++ *
++	 * Note that a load of trickery is needed to make this pointer cache
++	 * work right. rt_mutex_slowunlock()+rt_mutex_postunlock() work together to
++ * ensure a task is de-boosted (pi_task is set to NULL) before the
++ * task is allowed to run again (and can exit). This ensures the pointer
++ * points to a blocked task -- which guarantees the task is present.
++ */
++ p->pi_top_task = pi_task;
++
++ /*
++ * For FIFO/RR we only need to set prio, if that matches we're done.
++ */
++ if (prio == p->prio)
++ goto out_unlock;
++
++ /*
++	 * Idle task boosting is a no-no in general. There is one
++ * exception, when PREEMPT_RT and NOHZ is active:
++ *
++ * The idle task calls get_next_timer_interrupt() and holds
++ * the timer wheel base->lock on the CPU and another CPU wants
++ * to access the timer (probably to cancel it). We can safely
++ * ignore the boosting request, as the idle CPU runs this code
++ * with interrupts disabled and will complete the lock
++ * protected section without being interrupted. So there is no
++ * real need to boost.
++ */
++ if (unlikely(p == rq->idle)) {
++ WARN_ON(p != rq->curr);
++ WARN_ON(p->pi_blocked_on);
++ goto out_unlock;
++ }
++
++ trace_sched_pi_setprio(p, pi_task);
++
++ __setscheduler_prio(p, prio);
++
++ check_task_changed(p, rq);
++out_unlock:
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++
++ __balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++
++ preempt_enable();
++}
++#else
++static inline int rt_effective_prio(struct task_struct *p, int prio)
++{
++ return prio;
++}
++#endif
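++
++/*
++ * Without CONFIG_RT_MUTEXES there is no priority-inheritance boosting, so
++ * the rt_effective_prio() fallback above is simply the identity function.
++ */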
++
++void set_user_nice(struct task_struct *p, long nice)
++{
++ unsigned long flags;
++ struct rq *rq;
++ raw_spinlock_t *lock;
++
++ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
++ return;
++ /*
++ * We have to be careful, if called from sys_setpriority(),
++ * the task might be in the middle of scheduling on another CPU.
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++ rq = __task_access_lock(p, &lock);
++
++ p->static_prio = NICE_TO_PRIO(nice);
++ /*
++ * The RT priorities are set via sched_setscheduler(), but we still
++ * allow the 'normal' nice value to be set - but as expected
++	 * it won't have any effect on scheduling until the task returns
++	 * to SCHED_NORMAL/SCHED_BATCH:
++ */
++ if (task_has_rt_policy(p))
++ goto out_unlock;
++
++ p->prio = effective_prio(p);
++
++ check_task_changed(p, rq);
++out_unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++}
++EXPORT_SYMBOL(set_user_nice);
++
++/*
++ * is_nice_reduction - check if nice value is an actual reduction
++ *
++ * Similar to can_nice() but does not perform a capability check.
++ *
++ * @p: task
++ * @nice: nice value
++ */
++static bool is_nice_reduction(const struct task_struct *p, const int nice)
++{
++ /* Convert nice value [19,-20] to rlimit style value [1,40]: */
++ int nice_rlim = nice_to_rlimit(nice);
++
++ return (nice_rlim <= task_rlimit(p, RLIMIT_NICE));
++}
++
++/*
++ * can_nice - check if a task can reduce its nice value
++ * @p: task
++ * @nice: nice value
++ */
++int can_nice(const struct task_struct *p, const int nice)
++{
++ return is_nice_reduction(p, nice) || capable(CAP_SYS_NICE);
++}
++
++#ifdef __ARCH_WANT_SYS_NICE
++
++/*
++ * sys_nice - change the priority of the current process.
++ * @increment: priority increment
++ *
++ * sys_setpriority is a more generic, but much slower function that
++ * does similar things.
++ */
++SYSCALL_DEFINE1(nice, int, increment)
++{
++ long nice, retval;
++
++ /*
++ * Setpriority might change our priority at the same moment.
++ * We don't have to worry. Conceptually one call occurs first
++ * and we have a single winner.
++ */
++
++ increment = clamp(increment, -NICE_WIDTH, NICE_WIDTH);
++ nice = task_nice(current) + increment;
++
++ nice = clamp_val(nice, MIN_NICE, MAX_NICE);
++ if (increment < 0 && !can_nice(current, nice))
++ return -EPERM;
++
++ retval = security_task_setnice(current, nice);
++ if (retval)
++ return retval;
++
++ set_user_nice(current, nice);
++ return 0;
++}
++
++#endif
++
++/**
++ * task_prio - return the priority value of a given task.
++ * @p: the task in question.
++ *
++ * Return: The priority value as seen by users in /proc.
++ *
++ * sched policy return value kernel prio user prio/nice
++ *
++ * (BMQ)normal, batch, idle[0 ... 53] [100 ... 139] 0/[-20 ... 19]/[-7 ... 7]
++ * (PDS)normal, batch, idle[0 ... 39] 100 0/[-20 ... 19]
++ * fifo, rr [-1 ... -100] [99 ... 0] [0 ... 99]
++ */
++int task_prio(const struct task_struct *p)
++{
++ return (p->prio < MAX_RT_PRIO) ? p->prio - MAX_RT_PRIO :
++ task_sched_prio_normal(p, task_rq(p));
++}
++
++/**
++ * idle_cpu - is a given CPU idle currently?
++ * @cpu: the processor in question.
++ *
++ * Return: 1 if the CPU is currently idle. 0 otherwise.
++ */
++int idle_cpu(int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ if (rq->curr != rq->idle)
++ return 0;
++
++ if (rq->nr_running)
++ return 0;
++
++#ifdef CONFIG_SMP
++ if (rq->ttwu_pending)
++ return 0;
++#endif
++
++ return 1;
++}
++
++/**
++ * idle_task - return the idle task for a given CPU.
++ * @cpu: the processor in question.
++ *
++ * Return: The idle task for the cpu @cpu.
++ */
++struct task_struct *idle_task(int cpu)
++{
++ return cpu_rq(cpu)->idle;
++}
++
++/**
++ * find_process_by_pid - find a process with a matching PID value.
++ * @pid: the pid in question.
++ *
++ * The task of @pid, if found. %NULL otherwise.
++ */
++static inline struct task_struct *find_process_by_pid(pid_t pid)
++{
++ return pid ? find_task_by_vpid(pid) : current;
++}
++
++/*
++ * sched_setparam() passes in -1 for its policy, to let the functions
++ * it calls know not to change it.
++ */
++#define SETPARAM_POLICY -1
++
++static void __setscheduler_params(struct task_struct *p,
++ const struct sched_attr *attr)
++{
++ int policy = attr->sched_policy;
++
++ if (policy == SETPARAM_POLICY)
++ policy = p->policy;
++
++ p->policy = policy;
++
++ /*
++	 * Allow the normal nice value to be set, but it will not have any
++	 * effect on scheduling until the task returns to SCHED_NORMAL/
++	 * SCHED_BATCH.
++ */
++ p->static_prio = NICE_TO_PRIO(attr->sched_nice);
++
++ /*
++ * __sched_setscheduler() ensures attr->sched_priority == 0 when
++ * !rt_policy. Always setting this ensures that things like
++ * getparam()/getattr() don't report silly values for !rt tasks.
++ */
++ p->rt_priority = attr->sched_priority;
++ p->normal_prio = normal_prio(p);
++}
++
++/*
++ * check the target process has a UID that matches the current process's
++ */
++static bool check_same_owner(struct task_struct *p)
++{
++ const struct cred *cred = current_cred(), *pcred;
++ bool match;
++
++ rcu_read_lock();
++ pcred = __task_cred(p);
++ match = (uid_eq(cred->euid, pcred->euid) ||
++ uid_eq(cred->euid, pcred->uid));
++ rcu_read_unlock();
++ return match;
++}
++
++/*
++ * Allow unprivileged RT tasks to decrease priority.
++ * Only issue a capable test if needed and only once to avoid an audit
++ * event on permitted non-privileged operations:
++ */
++static int user_check_sched_setscheduler(struct task_struct *p,
++ const struct sched_attr *attr,
++ int policy, int reset_on_fork)
++{
++ if (rt_policy(policy)) {
++ unsigned long rlim_rtprio = task_rlimit(p, RLIMIT_RTPRIO);
++
++ /* Can't set/change the rt policy: */
++ if (policy != p->policy && !rlim_rtprio)
++ goto req_priv;
++
++ /* Can't increase priority: */
++ if (attr->sched_priority > p->rt_priority &&
++ attr->sched_priority > rlim_rtprio)
++ goto req_priv;
++ }
++
++ /* Can't change other user's priorities: */
++ if (!check_same_owner(p))
++ goto req_priv;
++
++ /* Normal users shall not reset the sched_reset_on_fork flag: */
++ if (p->sched_reset_on_fork && !reset_on_fork)
++ goto req_priv;
++
++ return 0;
++
++req_priv:
++ if (!capable(CAP_SYS_NICE))
++ return -EPERM;
++
++ return 0;
++}
++
++static int __sched_setscheduler(struct task_struct *p,
++ const struct sched_attr *attr,
++ bool user, bool pi)
++{
++ const struct sched_attr dl_squash_attr = {
++ .size = sizeof(struct sched_attr),
++ .sched_policy = SCHED_FIFO,
++ .sched_nice = 0,
++ .sched_priority = 99,
++ };
++ int oldpolicy = -1, policy = attr->sched_policy;
++ int retval, newprio;
++ struct balance_callback *head;
++ unsigned long flags;
++ struct rq *rq;
++ int reset_on_fork;
++ raw_spinlock_t *lock;
++
++ /* The pi code expects interrupts enabled */
++ BUG_ON(pi && in_interrupt());
++
++ /*
++	 * Alt schedule FW supports SCHED_DEADLINE by squashing it into prio 0 SCHED_FIFO
++ */
++ if (unlikely(SCHED_DEADLINE == policy)) {
++ attr = &dl_squash_attr;
++ policy = attr->sched_policy;
++ }
++recheck:
++ /* Double check policy once rq lock held */
++ if (policy < 0) {
++ reset_on_fork = p->sched_reset_on_fork;
++ policy = oldpolicy = p->policy;
++ } else {
++ reset_on_fork = !!(attr->sched_flags & SCHED_RESET_ON_FORK);
++
++ if (policy > SCHED_IDLE)
++ return -EINVAL;
++ }
++
++ if (attr->sched_flags & ~(SCHED_FLAG_ALL))
++ return -EINVAL;
++
++ /*
++ * Valid priorities for SCHED_FIFO and SCHED_RR are
++ * 1..MAX_RT_PRIO-1, valid priority for SCHED_NORMAL and
++ * SCHED_BATCH and SCHED_IDLE is 0.
++ */
++ if (attr->sched_priority < 0 ||
++ (p->mm && attr->sched_priority > MAX_RT_PRIO - 1) ||
++ (!p->mm && attr->sched_priority > MAX_RT_PRIO - 1))
++ return -EINVAL;
++ if ((SCHED_RR == policy || SCHED_FIFO == policy) !=
++ (attr->sched_priority != 0))
++ return -EINVAL;
++
++ if (user) {
++ retval = user_check_sched_setscheduler(p, attr, policy, reset_on_fork);
++ if (retval)
++ return retval;
++
++ retval = security_task_setscheduler(p);
++ if (retval)
++ return retval;
++ }
++
++ if (pi)
++ cpuset_read_lock();
++
++ /*
++ * Make sure no PI-waiters arrive (or leave) while we are
++ * changing the priority of the task:
++ */
++ raw_spin_lock_irqsave(&p->pi_lock, flags);
++
++ /*
++ * To be able to change p->policy safely, task_access_lock()
++ * must be called.
++	 * If task_access_lock() is used here:
++	 * for a task p which is not running, reading rq->stop is
++	 * racy but acceptable as ->stop doesn't change much.
++	 * An enhancement can be made to read rq->stop safely.
++ */
++ rq = __task_access_lock(p, &lock);
++
++ /*
++	 * Changing the policy of the stop threads is a very bad idea.
++ */
++ if (p == rq->stop) {
++ retval = -EINVAL;
++ goto unlock;
++ }
++
++ /*
++ * If not changing anything there's no need to proceed further:
++ */
++ if (unlikely(policy == p->policy)) {
++ if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
++ goto change;
++ if (!rt_policy(policy) &&
++ NICE_TO_PRIO(attr->sched_nice) != p->static_prio)
++ goto change;
++
++ p->sched_reset_on_fork = reset_on_fork;
++ retval = 0;
++ goto unlock;
++ }
++change:
++
++ /* Re-check policy now with rq lock held */
++ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
++ policy = oldpolicy = -1;
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ if (pi)
++ cpuset_read_unlock();
++ goto recheck;
++ }
++
++ p->sched_reset_on_fork = reset_on_fork;
++
++ newprio = __normal_prio(policy, attr->sched_priority, NICE_TO_PRIO(attr->sched_nice));
++ if (pi) {
++ /*
++ * Take priority boosted tasks into account. If the new
++ * effective priority is unchanged, we just store the new
++ * normal parameters and do not touch the scheduler class and
++		 * the runqueue. This will be done when the task deboosts
++		 * itself.
++ */
++ newprio = rt_effective_prio(p, newprio);
++ }
++
++ if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) {
++ __setscheduler_params(p, attr);
++ __setscheduler_prio(p, newprio);
++ }
++
++ check_task_changed(p, rq);
++
++ /* Avoid rq from going away on us: */
++ preempt_disable();
++ head = splice_balance_callbacks(rq);
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++
++ if (pi) {
++ cpuset_read_unlock();
++ rt_mutex_adjust_pi(p);
++ }
++
++ /* Run balance callbacks after we've adjusted the PI chain: */
++ balance_callbacks(rq, head);
++ preempt_enable();
++
++ return 0;
++
++unlock:
++ __task_access_unlock(p, lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
++ if (pi)
++ cpuset_read_unlock();
++ return retval;
++}
++
++static int _sched_setscheduler(struct task_struct *p, int policy,
++ const struct sched_param *param, bool check)
++{
++ struct sched_attr attr = {
++ .sched_policy = policy,
++ .sched_priority = param->sched_priority,
++ .sched_nice = PRIO_TO_NICE(p->static_prio),
++ };
++
++ /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
++ if ((policy != SETPARAM_POLICY) && (policy & SCHED_RESET_ON_FORK)) {
++ attr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
++ policy &= ~SCHED_RESET_ON_FORK;
++ attr.sched_policy = policy;
++ }
++
++ return __sched_setscheduler(p, &attr, check, true);
++}
++
++/**
++ * sched_setscheduler - change the scheduling policy and/or RT priority of a thread.
++ * @p: the task in question.
++ * @policy: new policy.
++ * @param: structure containing the new RT priority.
++ *
++ * Use sched_set_fifo(), read its comment.
++ *
++ * Return: 0 on success. An error code otherwise.
++ *
++ * NOTE that the task may be already dead.
++ */
++int sched_setscheduler(struct task_struct *p, int policy,
++ const struct sched_param *param)
++{
++ return _sched_setscheduler(p, policy, param, true);
++}
++
++int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
++{
++ return __sched_setscheduler(p, attr, true, true);
++}
++
++int sched_setattr_nocheck(struct task_struct *p, const struct sched_attr *attr)
++{
++ return __sched_setscheduler(p, attr, false, true);
++}
++EXPORT_SYMBOL_GPL(sched_setattr_nocheck);
++
++/**
++ * sched_setscheduler_nocheck - change the scheduling policy and/or RT priority of a thread from kernelspace.
++ * @p: the task in question.
++ * @policy: new policy.
++ * @param: structure containing the new RT priority.
++ *
++ * Just like sched_setscheduler, only don't bother checking if the
++ * current context has permission. For example, this is needed in
++ * stop_machine(): we create temporary high priority worker threads,
++ * but our caller might not have that capability.
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++int sched_setscheduler_nocheck(struct task_struct *p, int policy,
++ const struct sched_param *param)
++{
++ return _sched_setscheduler(p, policy, param, false);
++}
++
++/*
++ * SCHED_FIFO is a broken scheduler model; that is, it is fundamentally
++ * incapable of resource management, which is the one thing an OS really should
++ * be doing.
++ *
++ * This is of course the reason it is limited to privileged users only.
++ *
++ * Worse still; it is fundamentally impossible to compose static priority
++ * workloads. You cannot take two correctly working static prio workloads
++ * and smash them together and still expect them to work.
++ *
++ * For this reason 'all' FIFO tasks the kernel creates are basically at:
++ *
++ * MAX_RT_PRIO / 2
++ *
++ * The administrator _MUST_ configure the system, the kernel simply doesn't
++ * know enough information to make a sensible choice.
++ */
++void sched_set_fifo(struct task_struct *p)
++{
++ struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
++ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_fifo);
++
++/*
++ * For when you don't much care about FIFO, but want to be above SCHED_NORMAL.
++ */
++void sched_set_fifo_low(struct task_struct *p)
++{
++ struct sched_param sp = { .sched_priority = 1 };
++ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_fifo_low);
++
++void sched_set_normal(struct task_struct *p, int nice)
++{
++ struct sched_attr attr = {
++ .sched_policy = SCHED_NORMAL,
++ .sched_nice = nice,
++ };
++ WARN_ON_ONCE(sched_setattr_nocheck(p, &attr) != 0);
++}
++EXPORT_SYMBOL_GPL(sched_set_normal);
++
++static int
++do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
++{
++ struct sched_param lparam;
++ struct task_struct *p;
++ int retval;
++
++ if (!param || pid < 0)
++ return -EINVAL;
++ if (copy_from_user(&lparam, param, sizeof(struct sched_param)))
++ return -EFAULT;
++
++ rcu_read_lock();
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (likely(p))
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (likely(p)) {
++ retval = sched_setscheduler(p, policy, &lparam);
++ put_task_struct(p);
++ }
++
++ return retval;
++}
++
++/*
++ * Mimics kernel/events/core.c perf_copy_attr().
++ */
++static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *attr)
++{
++ u32 size;
++ int ret;
++
++ /* Zero the full structure, so that a short copy will be nice: */
++ memset(attr, 0, sizeof(*attr));
++
++ ret = get_user(size, &uattr->size);
++ if (ret)
++ return ret;
++
++ /* ABI compatibility quirk: */
++ if (!size)
++ size = SCHED_ATTR_SIZE_VER0;
++
++ if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
++ goto err_size;
++
++ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
++ if (ret) {
++ if (ret == -E2BIG)
++ goto err_size;
++ return ret;
++ }
++
++ /*
++ * XXX: Do we want to be lenient like existing syscalls; or do we want
++ * to be strict and return an error on out-of-bounds values?
++ */
++ attr->sched_nice = clamp(attr->sched_nice, -20, 19);
++
++ /* sched/core.c uses zero here but we already know ret is zero */
++ return 0;
++
++err_size:
++ put_user(sizeof(*attr), &uattr->size);
++ return -E2BIG;
++}
++
++/**
++ * sys_sched_setscheduler - set/change the scheduler policy and RT priority
++ * @pid: the pid in question.
++ * @policy: new policy.
++ * @param: structure containing the new RT priority.
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++SYSCALL_DEFINE3(sched_setscheduler, pid_t, pid, int, policy, struct sched_param __user *, param)
++{
++ if (policy < 0)
++ return -EINVAL;
++
++ return do_sched_setscheduler(pid, policy, param);
++}
++
++/**
++ * sys_sched_setparam - set/change the RT priority of a thread
++ * @pid: the pid in question.
++ * @param: structure containing the new RT priority.
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++SYSCALL_DEFINE2(sched_setparam, pid_t, pid, struct sched_param __user *, param)
++{
++ return do_sched_setscheduler(pid, SETPARAM_POLICY, param);
++}
++
++/**
++ * sys_sched_setattr - same as above, but with extended sched_attr
++ * @pid: the pid in question.
++ * @uattr: structure containing the extended parameters.
++ * @flags: for future extension.
++ */
++SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
++ unsigned int, flags)
++{
++ struct sched_attr attr;
++ struct task_struct *p;
++ int retval;
++
++ if (!uattr || pid < 0 || flags)
++ return -EINVAL;
++
++ retval = sched_copy_attr(uattr, &attr);
++ if (retval)
++ return retval;
++
++ if ((int)attr.sched_policy < 0)
++ return -EINVAL;
++
++ rcu_read_lock();
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (likely(p))
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (likely(p)) {
++ retval = sched_setattr(p, &attr);
++ put_task_struct(p);
++ }
++
++ return retval;
++}
++
++/**
++ * sys_sched_getscheduler - get the policy (scheduling class) of a thread
++ * @pid: the pid in question.
++ *
++ * Return: On success, the policy of the thread. Otherwise, a negative error
++ * code.
++ */
++SYSCALL_DEFINE1(sched_getscheduler, pid_t, pid)
++{
++ struct task_struct *p;
++ int retval = -EINVAL;
++
++ if (pid < 0)
++ goto out_nounlock;
++
++ retval = -ESRCH;
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ if (p) {
++ retval = security_task_getscheduler(p);
++ if (!retval)
++ retval = p->policy;
++ }
++ rcu_read_unlock();
++
++out_nounlock:
++ return retval;
++}
++
++/**
++ * sys_sched_getparam - get the RT priority of a thread
++ * @pid: the pid in question.
++ * @param: structure containing the RT priority.
++ *
++ * Return: On success, 0 and the RT priority is in @param. Otherwise, an error
++ * code.
++ */
++SYSCALL_DEFINE2(sched_getparam, pid_t, pid, struct sched_param __user *, param)
++{
++ struct sched_param lp = { .sched_priority = 0 };
++ struct task_struct *p;
++ int retval = -EINVAL;
++
++ if (!param || pid < 0)
++ goto out_nounlock;
++
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ retval = -ESRCH;
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ if (task_has_rt_policy(p))
++ lp.sched_priority = p->rt_priority;
++ rcu_read_unlock();
++
++ /*
++ * This one might sleep, we cannot do it with a spinlock held ...
++ */
++ retval = copy_to_user(param, &lp, sizeof(*param)) ? -EFAULT : 0;
++
++out_nounlock:
++ return retval;
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++/*
++ * Copy the kernel size attribute structure (which might be larger
++ * than what user-space knows about) to user-space.
++ *
++ * Note that all cases are valid: user-space buffer can be larger or
++ * smaller than the kernel-space buffer. The usual case is that both
++ * have the same size.
++ */
++static int
++sched_attr_copy_to_user(struct sched_attr __user *uattr,
++ struct sched_attr *kattr,
++ unsigned int usize)
++{
++ unsigned int ksize = sizeof(*kattr);
++
++ if (!access_ok(uattr, usize))
++ return -EFAULT;
++
++ /*
++ * sched_getattr() ABI forwards and backwards compatibility:
++ *
++ * If usize == ksize then we just copy everything to user-space and all is good.
++ *
++ * If usize < ksize then we only copy as much as user-space has space for,
++ * this keeps ABI compatibility as well. We skip the rest.
++ *
++ * If usize > ksize then user-space is using a newer version of the ABI,
++ * which part the kernel doesn't know about. Just ignore it - tooling can
++ * detect the kernel's knowledge of attributes from the attr->size value
++ * which is set to ksize in this case.
++ */
++ kattr->size = min(usize, ksize);
++
++ if (copy_to_user(uattr, kattr, kattr->size))
++ return -EFAULT;
++
++ return 0;
++}
++
++/**
++ * sys_sched_getattr - similar to sched_getparam, but with sched_attr
++ * @pid: the pid in question.
++ * @uattr: structure containing the extended parameters.
++ * @usize: sizeof(attr) for fwd/bwd comp.
++ * @flags: for future extension.
++ */
++SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
++ unsigned int, usize, unsigned int, flags)
++{
++ struct sched_attr kattr = { };
++ struct task_struct *p;
++ int retval;
++
++ if (!uattr || pid < 0 || usize > PAGE_SIZE ||
++ usize < SCHED_ATTR_SIZE_VER0 || flags)
++ return -EINVAL;
++
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ retval = -ESRCH;
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ kattr.sched_policy = p->policy;
++ if (p->sched_reset_on_fork)
++ kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
++ if (task_has_rt_policy(p))
++ kattr.sched_priority = p->rt_priority;
++ else
++ kattr.sched_nice = task_nice(p);
++ kattr.sched_flags &= SCHED_FLAG_ALL;
++
++#ifdef CONFIG_UCLAMP_TASK
++ kattr.sched_util_min = p->uclamp_req[UCLAMP_MIN].value;
++ kattr.sched_util_max = p->uclamp_req[UCLAMP_MAX].value;
++#endif
++
++ rcu_read_unlock();
++
++ return sched_attr_copy_to_user(uattr, &kattr, usize);
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++#ifdef CONFIG_SMP
++int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
++{
++ return 0;
++}
++#endif
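++
++/*
++ * SCHED_DEADLINE is squashed into SCHED_FIFO here (see __sched_setscheduler()),
++ * so the SMP-only dl_task_check_affinity() above has no deadline bandwidth to
++ * validate against the affinity mask and simply returns 0.
++ */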
++
++static int
++__sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
++{
++ int retval;
++ cpumask_var_t cpus_allowed, new_mask;
++
++ if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
++ return -ENOMEM;
++
++ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
++ retval = -ENOMEM;
++ goto out_free_cpus_allowed;
++ }
++
++ cpuset_cpus_allowed(p, cpus_allowed);
++ cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
++
++ ctx->new_mask = new_mask;
++ ctx->flags |= SCA_CHECK;
++
++ retval = __set_cpus_allowed_ptr(p, ctx);
++ if (retval)
++ goto out_free_new_mask;
++
++ cpuset_cpus_allowed(p, cpus_allowed);
++ if (!cpumask_subset(new_mask, cpus_allowed)) {
++ /*
++ * We must have raced with a concurrent cpuset
++ * update. Just reset the cpus_allowed to the
++ * cpuset's cpus_allowed
++ */
++ cpumask_copy(new_mask, cpus_allowed);
++
++ /*
++ * If SCA_USER is set, a 2nd call to __set_cpus_allowed_ptr()
++ * will restore the previous user_cpus_ptr value.
++ *
++ * In the unlikely event a previous user_cpus_ptr exists,
++ * we need to further restrict the mask to what is allowed
++ * by that old user_cpus_ptr.
++ */
++ if (unlikely((ctx->flags & SCA_USER) && ctx->user_mask)) {
++ bool empty = !cpumask_and(new_mask, new_mask,
++ ctx->user_mask);
++
++ if (WARN_ON_ONCE(empty))
++ cpumask_copy(new_mask, cpus_allowed);
++ }
++ __set_cpus_allowed_ptr(p, ctx);
++ retval = -EINVAL;
++ }
++
++out_free_new_mask:
++ free_cpumask_var(new_mask);
++out_free_cpus_allowed:
++ free_cpumask_var(cpus_allowed);
++ return retval;
++}
++
++long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
++{
++ struct affinity_context ac;
++ struct cpumask *user_mask;
++ struct task_struct *p;
++ int retval;
++
++ rcu_read_lock();
++
++ p = find_process_by_pid(pid);
++ if (!p) {
++ rcu_read_unlock();
++ return -ESRCH;
++ }
++
++ /* Prevent p going away */
++ get_task_struct(p);
++ rcu_read_unlock();
++
++ if (p->flags & PF_NO_SETAFFINITY) {
++ retval = -EINVAL;
++ goto out_put_task;
++ }
++
++ if (!check_same_owner(p)) {
++ rcu_read_lock();
++ if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
++ rcu_read_unlock();
++ retval = -EPERM;
++ goto out_put_task;
++ }
++ rcu_read_unlock();
++ }
++
++ retval = security_task_setscheduler(p);
++ if (retval)
++ goto out_put_task;
++
++ /*
++ * With non-SMP configs, user_cpus_ptr/user_mask isn't used and
++ * alloc_user_cpus_ptr() returns NULL.
++ */
++ user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
++ if (user_mask) {
++ cpumask_copy(user_mask, in_mask);
++ } else if (IS_ENABLED(CONFIG_SMP)) {
++ retval = -ENOMEM;
++ goto out_put_task;
++ }
++
++ ac = (struct affinity_context){
++ .new_mask = in_mask,
++ .user_mask = user_mask,
++ .flags = SCA_USER,
++ };
++
++ retval = __sched_setaffinity(p, &ac);
++ kfree(ac.user_mask);
++
++out_put_task:
++ put_task_struct(p);
++ return retval;
++}
++
++static int get_user_cpu_mask(unsigned long __user *user_mask_ptr, unsigned len,
++ struct cpumask *new_mask)
++{
++ if (len < cpumask_size())
++ cpumask_clear(new_mask);
++ else if (len > cpumask_size())
++ len = cpumask_size();
++
++ return copy_from_user(new_mask, user_mask_ptr, len) ? -EFAULT : 0;
++}
++
++/**
++ * sys_sched_setaffinity - set the CPU affinity of a process
++ * @pid: pid of the process
++ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
++ * @user_mask_ptr: user-space pointer to the new CPU mask
++ *
++ * Return: 0 on success. An error code otherwise.
++ */
++SYSCALL_DEFINE3(sched_setaffinity, pid_t, pid, unsigned int, len,
++ unsigned long __user *, user_mask_ptr)
++{
++ cpumask_var_t new_mask;
++ int retval;
++
++ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
++ return -ENOMEM;
++
++ retval = get_user_cpu_mask(user_mask_ptr, len, new_mask);
++ if (retval == 0)
++ retval = sched_setaffinity(pid, new_mask);
++ free_cpumask_var(new_mask);
++ return retval;
++}
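++
++/*
++ * Minimal user-space sketch of the matching libc interface (the glibc wrapper
++ * passes sizeof(cpu_set_t) as @len; error handling omitted):
++ *
++ *   cpu_set_t set;
++ *
++ *   CPU_ZERO(&set);
++ *   CPU_SET(2, &set);                           // pin to CPU 2
++ *   sched_setaffinity(0, sizeof(set), &set);    // pid 0 == calling thread
++ */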
++
++long sched_getaffinity(pid_t pid, cpumask_t *mask)
++{
++ struct task_struct *p;
++ raw_spinlock_t *lock;
++ unsigned long flags;
++ int retval;
++
++ rcu_read_lock();
++
++ retval = -ESRCH;
++ p = find_process_by_pid(pid);
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++
++ task_access_lock_irqsave(p, &lock, &flags);
++ cpumask_and(mask, &p->cpus_mask, cpu_active_mask);
++ task_access_unlock_irqrestore(p, lock, &flags);
++
++out_unlock:
++ rcu_read_unlock();
++
++ return retval;
++}
++
++/**
++ * sys_sched_getaffinity - get the CPU affinity of a process
++ * @pid: pid of the process
++ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
++ * @user_mask_ptr: user-space pointer to hold the current CPU mask
++ *
++ * Return: size of CPU mask copied to user_mask_ptr on success. An
++ * error code otherwise.
++ */
++SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
++ unsigned long __user *, user_mask_ptr)
++{
++ int ret;
++ cpumask_var_t mask;
++
++ if ((len * BITS_PER_BYTE) < nr_cpu_ids)
++ return -EINVAL;
++ if (len & (sizeof(unsigned long)-1))
++ return -EINVAL;
++
++ if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
++ return -ENOMEM;
++
++ ret = sched_getaffinity(pid, mask);
++ if (ret == 0) {
++ unsigned int retlen = min(len, cpumask_size());
++
++ if (copy_to_user(user_mask_ptr, cpumask_bits(mask), retlen))
++ ret = -EFAULT;
++ else
++ ret = retlen;
++ }
++ free_cpumask_var(mask);
++
++ return ret;
++}
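++
++/*
++ * User-space sketch; note the raw syscall returns the number of bytes written
++ * on success (see retlen above), which the glibc wrapper hides by returning 0:
++ *
++ *   cpu_set_t set;
++ *
++ *   if (sched_getaffinity(0, sizeof(set), &set) == 0)
++ *       printf("%d CPUs allowed\n", CPU_COUNT(&set));
++ */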
++
++static void do_sched_yield(void)
++{
++ struct rq *rq;
++ struct rq_flags rf;
++
++ if (!sched_yield_type)
++ return;
++
++ rq = this_rq_lock_irq(&rf);
++
++ schedstat_inc(rq->yld_count);
++
++ if (1 == sched_yield_type) {
++ if (!rt_task(current))
++ do_sched_yield_type_1(current, rq);
++ } else if (2 == sched_yield_type) {
++ if (rq->nr_running > 1)
++ rq->skip = current;
++ }
++
++ preempt_disable();
++ raw_spin_unlock_irq(&rq->lock);
++ sched_preempt_enable_no_resched();
++
++ schedule();
++}
++
++/**
++ * sys_sched_yield - yield the current processor to other threads.
++ *
++ * This function yields the current CPU to other tasks. If there are no
++ * other threads running on this CPU then this function will return.
++ *
++ * Return: 0.
++ */
++SYSCALL_DEFINE0(sched_yield)
++{
++ do_sched_yield();
++ return 0;
++}
++
++#if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
++int __sched __cond_resched(void)
++{
++ if (should_resched(0)) {
++ preempt_schedule_common();
++ return 1;
++ }
++ /*
++ * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
++ * whether the current CPU is in an RCU read-side critical section,
++ * so the tick can report quiescent states even for CPUs looping
++ * in kernel context. In contrast, in non-preemptible kernels,
++ * RCU readers leave no in-memory hints, which means that CPU-bound
++ * processes executing in kernel context might never report an
++ * RCU quiescent state. Therefore, the following code causes
++ * cond_resched() to report a quiescent state, but only when RCU
++ * is in urgent need of one.
++ */
++#ifndef CONFIG_PREEMPT_RCU
++ rcu_all_qs();
++#endif
++ return 0;
++}
++EXPORT_SYMBOL(__cond_resched);
++#endif
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#define cond_resched_dynamic_enabled __cond_resched
++#define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(cond_resched);
++
++#define might_resched_dynamic_enabled __cond_resched
++#define might_resched_dynamic_disabled ((void *)&__static_call_return0)
++DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
++EXPORT_STATIC_CALL_TRAMP(might_resched);
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
++int __sched dynamic_cond_resched(void)
++{
++ klp_sched_try_switch();
++ if (!static_branch_unlikely(&sk_dynamic_cond_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_cond_resched);
++
++static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
++int __sched dynamic_might_resched(void)
++{
++ if (!static_branch_unlikely(&sk_dynamic_might_resched))
++ return 0;
++ return __cond_resched();
++}
++EXPORT_SYMBOL(dynamic_might_resched);
++#endif
++#endif
++
++/*
++ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
++ * call schedule, and on return reacquire the lock.
++ *
++ * This works OK both with and without CONFIG_PREEMPTION. We do strange low-level
++ * operations here to prevent schedule() from being called twice (once via
++ * spin_unlock(), once by hand).
++ */
++int __cond_resched_lock(spinlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held(lock);
++
++ if (spin_needbreak(lock) || resched) {
++ spin_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ spin_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_lock);
++
++int __cond_resched_rwlock_read(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_read(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ read_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ read_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_read);
++
++int __cond_resched_rwlock_write(rwlock_t *lock)
++{
++ int resched = should_resched(PREEMPT_LOCK_OFFSET);
++ int ret = 0;
++
++ lockdep_assert_held_write(lock);
++
++ if (rwlock_needbreak(lock) || resched) {
++ write_unlock(lock);
++ if (!_cond_resched())
++ cpu_relax();
++ ret = 1;
++ write_lock(lock);
++ }
++ return ret;
++}
++EXPORT_SYMBOL(__cond_resched_rwlock_write);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++
++#ifdef CONFIG_GENERIC_ENTRY
++#include <linux/entry-common.h>
++#endif
++
++/*
++ * SC:cond_resched
++ * SC:might_resched
++ * SC:preempt_schedule
++ * SC:preempt_schedule_notrace
++ * SC:irqentry_exit_cond_resched
++ *
++ *
++ * NONE:
++ * cond_resched <- __cond_resched
++ * might_resched <- RET0
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ *
++ * VOLUNTARY:
++ * cond_resched <- __cond_resched
++ * might_resched <- __cond_resched
++ * preempt_schedule <- NOP
++ * preempt_schedule_notrace <- NOP
++ * irqentry_exit_cond_resched <- NOP
++ *
++ * FULL:
++ * cond_resched <- RET0
++ * might_resched <- RET0
++ * preempt_schedule <- preempt_schedule
++ * preempt_schedule_notrace <- preempt_schedule_notrace
++ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
++ */
++
++enum {
++ preempt_dynamic_undefined = -1,
++ preempt_dynamic_none,
++ preempt_dynamic_voluntary,
++ preempt_dynamic_full,
++};
++
++int preempt_dynamic_mode = preempt_dynamic_undefined;
++
++int sched_dynamic_mode(const char *str)
++{
++ if (!strcmp(str, "none"))
++ return preempt_dynamic_none;
++
++ if (!strcmp(str, "voluntary"))
++ return preempt_dynamic_voluntary;
++
++ if (!strcmp(str, "full"))
++ return preempt_dynamic_full;
++
++ return -EINVAL;
++}
++
++#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
++#define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
++#define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
++#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
++#define preempt_dynamic_enable(f) static_key_enable(&sk_dynamic_##f.key)
++#define preempt_dynamic_disable(f) static_key_disable(&sk_dynamic_##f.key)
++#else
++#error "Unsupported PREEMPT_DYNAMIC mechanism"
++#endif
++
++static DEFINE_MUTEX(sched_dynamic_mutex);
++static bool klp_override;
++
++static void __sched_dynamic_update(int mode)
++{
++ /*
++ * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
++ * the ZERO state, which is invalid.
++ */
++ if (!klp_override)
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++
++ switch (mode) {
++ case preempt_dynamic_none:
++ if (!klp_override)
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: none\n");
++ break;
++
++ case preempt_dynamic_voluntary:
++ if (!klp_override)
++ preempt_dynamic_enable(cond_resched);
++ preempt_dynamic_enable(might_resched);
++ preempt_dynamic_disable(preempt_schedule);
++ preempt_dynamic_disable(preempt_schedule_notrace);
++ preempt_dynamic_disable(irqentry_exit_cond_resched);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: voluntary\n");
++ break;
++
++ case preempt_dynamic_full:
++ if (!klp_override)
++			preempt_dynamic_disable(cond_resched);
++ preempt_dynamic_disable(might_resched);
++ preempt_dynamic_enable(preempt_schedule);
++ preempt_dynamic_enable(preempt_schedule_notrace);
++ preempt_dynamic_enable(irqentry_exit_cond_resched);
++ if (mode != preempt_dynamic_mode)
++ pr_info("Dynamic Preempt: full\n");
++ break;
++ }
++
++ preempt_dynamic_mode = mode;
++}
++
++void sched_dynamic_update(int mode)
++{
++ mutex_lock(&sched_dynamic_mutex);
++ __sched_dynamic_update(mode);
++ mutex_unlock(&sched_dynamic_mutex);
++}
++
++#ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
++
++static int klp_cond_resched(void)
++{
++ __klp_sched_try_switch();
++ return __cond_resched();
++}
++
++void sched_dynamic_klp_enable(void)
++{
++ mutex_lock(&sched_dynamic_mutex);
++
++ klp_override = true;
++ static_call_update(cond_resched, klp_cond_resched);
++
++ mutex_unlock(&sched_dynamic_mutex);
++}
++
++void sched_dynamic_klp_disable(void)
++{
++ mutex_lock(&sched_dynamic_mutex);
++
++ klp_override = false;
++ __sched_dynamic_update(preempt_dynamic_mode);
++
++ mutex_unlock(&sched_dynamic_mutex);
++}
++
++#endif /* CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
++
++
++static int __init setup_preempt_mode(char *str)
++{
++ int mode = sched_dynamic_mode(str);
++ if (mode < 0) {
++ pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
++ return 0;
++ }
++
++ sched_dynamic_update(mode);
++ return 1;
++}
++__setup("preempt=", setup_preempt_mode);
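++
++/*
++ * Example usage, assuming a PREEMPT_DYNAMIC kernel: select the model at boot
++ * with e.g. "preempt=voluntary" on the kernel command line, or - where the
++ * debugfs knob is wired up - inspect and switch it at runtime:
++ *
++ *   cat /sys/kernel/debug/sched/preempt
++ *   echo full > /sys/kernel/debug/sched/preempt
++ */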
++
++static void __init preempt_dynamic_init(void)
++{
++ if (preempt_dynamic_mode == preempt_dynamic_undefined) {
++ if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
++ sched_dynamic_update(preempt_dynamic_none);
++ } else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
++ sched_dynamic_update(preempt_dynamic_voluntary);
++ } else {
++ /* Default static call setting, nothing to do */
++ WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
++ preempt_dynamic_mode = preempt_dynamic_full;
++ pr_info("Dynamic Preempt: full\n");
++ }
++ }
++}
++
++#define PREEMPT_MODEL_ACCESSOR(mode) \
++ bool preempt_model_##mode(void) \
++ { \
++ WARN_ON_ONCE(preempt_dynamic_mode == preempt_dynamic_undefined); \
++ return preempt_dynamic_mode == preempt_dynamic_##mode; \
++ } \
++ EXPORT_SYMBOL_GPL(preempt_model_##mode)
++
++PREEMPT_MODEL_ACCESSOR(none);
++PREEMPT_MODEL_ACCESSOR(voluntary);
++PREEMPT_MODEL_ACCESSOR(full);
++
++#else /* !CONFIG_PREEMPT_DYNAMIC */
++
++static inline void preempt_dynamic_init(void) { }
++
++#endif /* #ifdef CONFIG_PREEMPT_DYNAMIC */
++
++/**
++ * yield - yield the current processor to other threads.
++ *
++ * Do not ever use this function, there's a 99% chance you're doing it wrong.
++ *
++ * The scheduler is at all times free to pick the calling task as the most
++ * eligible task to run, if removing the yield() call from your code breaks
++ * it, it's already broken.
++ *
++ * Typical broken usage is:
++ *
++ * while (!event)
++ * yield();
++ *
++ * where one assumes that yield() will let 'the other' process run that will
++ * make event true. If the current task is a SCHED_FIFO task that will never
++ * happen. Never use yield() as a progress guarantee!!
++ *
++ * If you want to use yield() to wait for something, use wait_event().
++ * If you want to use yield() to be 'nice' for others, use cond_resched().
++ * If you still want to use yield(), do not!
++ */
++void __sched yield(void)
++{
++ set_current_state(TASK_RUNNING);
++ do_sched_yield();
++}
++EXPORT_SYMBOL(yield);
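++
++/*
++ * Sketch of the recommended replacement for the broken yield() loop shown in
++ * the comment above, assuming wq is a wait_queue_head_t woken by whoever sets
++ * event:
++ *
++ *   // waiter
++ *   wait_event(wq, event);
++ *
++ *   // producer
++ *   event = true;
++ *   wake_up(&wq);
++ */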
++
++/**
++ * yield_to - yield the current processor to another thread in
++ * your thread group, or accelerate that thread toward the
++ * processor it's on.
++ * @p: target task
++ * @preempt: whether task preemption is allowed or not
++ *
++ * It's the caller's job to ensure that the target task struct
++ * can't go away on us before we can do any checks.
++ *
++ * In Alt schedule FW, yield_to is not supported.
++ *
++ * Return:
++ * true (>0) if we indeed boosted the target task.
++ * false (0) if we failed to boost the target.
++ * -ESRCH if there's no task to yield to.
++ */
++int __sched yield_to(struct task_struct *p, bool preempt)
++{
++ return 0;
++}
++EXPORT_SYMBOL_GPL(yield_to);
++
++int io_schedule_prepare(void)
++{
++ int old_iowait = current->in_iowait;
++
++ current->in_iowait = 1;
++ blk_flush_plug(current->plug, true);
++ return old_iowait;
++}
++
++void io_schedule_finish(int token)
++{
++ current->in_iowait = token;
++}
++
++/*
++ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
++ * that process accounting knows that this is a task in IO wait state.
++ *
++ * But don't do that if it is a deliberate, throttling IO wait (this task
++ * has set its backing_dev_info: the queue against which it should throttle)
++ */
++
++long __sched io_schedule_timeout(long timeout)
++{
++ int token;
++ long ret;
++
++ token = io_schedule_prepare();
++ ret = schedule_timeout(timeout);
++ io_schedule_finish(token);
++
++ return ret;
++}
++EXPORT_SYMBOL(io_schedule_timeout);
++
++void __sched io_schedule(void)
++{
++ int token;
++
++ token = io_schedule_prepare();
++ schedule();
++ io_schedule_finish(token);
++}
++EXPORT_SYMBOL(io_schedule);
++
++/**
++ * sys_sched_get_priority_max - return maximum RT priority.
++ * @policy: scheduling class.
++ *
++ * Return: On success, this syscall returns the maximum
++ * rt_priority that can be used by a given scheduling class.
++ * On failure, a negative error code is returned.
++ */
++SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
++{
++ int ret = -EINVAL;
++
++ switch (policy) {
++ case SCHED_FIFO:
++ case SCHED_RR:
++ ret = MAX_RT_PRIO - 1;
++ break;
++ case SCHED_NORMAL:
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ ret = 0;
++ break;
++ }
++ return ret;
++}
++
++/**
++ * sys_sched_get_priority_min - return minimum RT priority.
++ * @policy: scheduling class.
++ *
++ * Return: On success, this syscall returns the minimum
++ * rt_priority that can be used by a given scheduling class.
++ * On failure, a negative error code is returned.
++ */
++SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
++{
++ int ret = -EINVAL;
++
++ switch (policy) {
++ case SCHED_FIFO:
++ case SCHED_RR:
++ ret = 1;
++ break;
++ case SCHED_NORMAL:
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ ret = 0;
++ break;
++ }
++ return ret;
++}
++
++static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
++{
++ struct task_struct *p;
++ int retval;
++
++ alt_sched_debug();
++
++ if (pid < 0)
++ return -EINVAL;
++
++ retval = -ESRCH;
++ rcu_read_lock();
++ p = find_process_by_pid(pid);
++ if (!p)
++ goto out_unlock;
++
++ retval = security_task_getscheduler(p);
++ if (retval)
++ goto out_unlock;
++ rcu_read_unlock();
++
++ *t = ns_to_timespec64(sched_timeslice_ns);
++ return 0;
++
++out_unlock:
++ rcu_read_unlock();
++ return retval;
++}
++
++/**
++ * sys_sched_rr_get_interval - return the default timeslice of a process.
++ * @pid: pid of the process.
++ * @interval: userspace pointer to the timeslice value.
++ *
++ * Return: On success, 0 and the timeslice is in @interval. Otherwise,
++ * an error code.
++ */
++SYSCALL_DEFINE2(sched_rr_get_interval, pid_t, pid,
++ struct __kernel_timespec __user *, interval)
++{
++ struct timespec64 t;
++ int retval = sched_rr_get_interval(pid, &t);
++
++ if (retval == 0)
++ retval = put_timespec64(&t, interval);
++
++ return retval;
++}
++
++#ifdef CONFIG_COMPAT_32BIT_TIME
++SYSCALL_DEFINE2(sched_rr_get_interval_time32, pid_t, pid,
++ struct old_timespec32 __user *, interval)
++{
++ struct timespec64 t;
++ int retval = sched_rr_get_interval(pid, &t);
++
++ if (retval == 0)
++ retval = put_old_timespec32(&t, interval);
++ return retval;
++}
++#endif
++
++void sched_show_task(struct task_struct *p)
++{
++ unsigned long free = 0;
++ int ppid;
++
++ if (!try_get_task_stack(p))
++ return;
++
++ pr_info("task:%-15.15s state:%c", p->comm, task_state_to_char(p));
++
++ if (task_is_running(p))
++ pr_cont(" running task ");
++#ifdef CONFIG_DEBUG_STACK_USAGE
++ free = stack_not_used(p);
++#endif
++ ppid = 0;
++ rcu_read_lock();
++ if (pid_alive(p))
++ ppid = task_pid_nr(rcu_dereference(p->real_parent));
++ rcu_read_unlock();
++ pr_cont(" stack:%-5lu pid:%-5d ppid:%-6d flags:0x%08lx\n",
++ free, task_pid_nr(p), ppid,
++ read_task_thread_flags(p));
++
++ print_worker_info(KERN_INFO, p);
++ print_stop_info(KERN_INFO, p);
++ show_stack(p, NULL, KERN_INFO);
++ put_task_stack(p);
++}
++EXPORT_SYMBOL_GPL(sched_show_task);
++
++static inline bool
++state_filter_match(unsigned long state_filter, struct task_struct *p)
++{
++ unsigned int state = READ_ONCE(p->__state);
++
++ /* no filter, everything matches */
++ if (!state_filter)
++ return true;
++
++ /* filter, but doesn't match */
++ if (!(state & state_filter))
++ return false;
++
++ /*
++ * When looking for TASK_UNINTERRUPTIBLE skip TASK_IDLE (allows
++ * TASK_KILLABLE).
++ */
++ if (state_filter == TASK_UNINTERRUPTIBLE && (state & TASK_NOLOAD))
++ return false;
++
++ return true;
++}
++
++
++void show_state_filter(unsigned int state_filter)
++{
++ struct task_struct *g, *p;
++
++ rcu_read_lock();
++ for_each_process_thread(g, p) {
++ /*
++ * reset the NMI-timeout, listing all files on a slow
++ * console might take a lot of time:
++ * Also, reset softlockup watchdogs on all CPUs, because
++ * another CPU might be blocked waiting for us to process
++ * an IPI.
++ */
++ touch_nmi_watchdog();
++ touch_all_softlockup_watchdogs();
++ if (state_filter_match(state_filter, p))
++ sched_show_task(p);
++ }
++
++#ifdef CONFIG_SCHED_DEBUG
++ /* TODO: Alt schedule FW should support this
++ if (!state_filter)
++ sysrq_sched_debug_show();
++ */
++#endif
++ rcu_read_unlock();
++ /*
++ * Only show locks if all tasks are dumped:
++ */
++ if (!state_filter)
++ debug_show_all_locks();
++}
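++
++/*
++ * This is what backs the SysRq task dumps; for example, assuming SysRq is
++ * enabled:
++ *
++ *   echo t > /proc/sysrq-trigger    # show_state_filter(0): dump all tasks
++ *   echo w > /proc/sysrq-trigger    # dump uninterruptible (blocked) tasks
++ */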
++
++void dump_cpu_task(int cpu)
++{
++ if (cpu == smp_processor_id() && in_hardirq()) {
++ struct pt_regs *regs;
++
++ regs = get_irq_regs();
++ if (regs) {
++ show_regs(regs);
++ return;
++ }
++ }
++
++ if (trigger_single_cpu_backtrace(cpu))
++ return;
++
++ pr_info("Task dump for CPU %d:\n", cpu);
++ sched_show_task(cpu_curr(cpu));
++}
++
++/**
++ * init_idle - set up an idle thread for a given CPU
++ * @idle: task in question
++ * @cpu: CPU the idle task belongs to
++ *
++ * NOTE: this function does not set the idle thread's NEED_RESCHED
++ * flag, to make booting more robust.
++ */
++void __init init_idle(struct task_struct *idle, int cpu)
++{
++#ifdef CONFIG_SMP
++ struct affinity_context ac = (struct affinity_context) {
++ .new_mask = cpumask_of(cpu),
++ .flags = 0,
++ };
++#endif
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ __sched_fork(0, idle);
++
++ raw_spin_lock_irqsave(&idle->pi_lock, flags);
++ raw_spin_lock(&rq->lock);
++
++ idle->last_ran = rq->clock_task;
++ idle->__state = TASK_RUNNING;
++ /*
++ * PF_KTHREAD should already be set at this point; regardless, make it
++ * look like a proper per-CPU kthread.
++ */
++ idle->flags |= PF_IDLE | PF_KTHREAD | PF_NO_SETAFFINITY;
++ kthread_set_per_cpu(idle, cpu);
++
++ sched_queue_init_idle(&rq->queue, idle);
++
++#ifdef CONFIG_SMP
++ /*
++ * It's possible that init_idle() gets called multiple times on a task,
++ * in that case do_set_cpus_allowed() will not do the right thing.
++ *
++ * And since this is boot we can forgo the serialisation.
++ */
++ set_cpus_allowed_common(idle, &ac);
++#endif
++
++ /* Silence PROVE_RCU */
++ rcu_read_lock();
++ __set_task_cpu(idle, cpu);
++ rcu_read_unlock();
++
++ rq->idle = idle;
++ rcu_assign_pointer(rq->curr, idle);
++ idle->on_cpu = 1;
++
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
++
++ /* Set the preempt count _outside_ the spinlocks! */
++ init_idle_preempt_count(idle, cpu);
++
++ ftrace_graph_init_idle_task(idle, cpu);
++ vtime_init_idle(idle, cpu);
++#ifdef CONFIG_SMP
++ sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
++#endif
++}
++
++#ifdef CONFIG_SMP
++
++int cpuset_cpumask_can_shrink(const struct cpumask __maybe_unused *cur,
++ const struct cpumask __maybe_unused *trial)
++{
++ return 1;
++}
++
++int task_can_attach(struct task_struct *p,
++ const struct cpumask *cs_effective_cpus)
++{
++ int ret = 0;
++
++ /*
++ * Kthreads which disallow setaffinity shouldn't be moved
++ * to a new cpuset; we don't want to change their CPU
++ * affinity and isolating such threads by their set of
++ * allowed nodes is unnecessary. Thus, cpusets are not
++ * applicable for such threads. This prevents checking for
++ * success of set_cpus_allowed_ptr() on all attached tasks
++ * before cpus_mask may be changed.
++ */
++ if (p->flags & PF_NO_SETAFFINITY)
++ ret = -EINVAL;
++
++ return ret;
++}
++
++bool sched_smp_initialized __read_mostly;
++
++#ifdef CONFIG_HOTPLUG_CPU
++/*
++ * Ensures that the idle task is using init_mm right before its CPU goes
++ * offline.
++ */
++void idle_task_exit(void)
++{
++ struct mm_struct *mm = current->active_mm;
++
++ BUG_ON(current != this_rq()->idle);
++
++ if (mm != &init_mm) {
++ switch_mm(mm, &init_mm, current);
++ finish_arch_post_lock_switch();
++ }
++
++ /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
++}
++
++static int __balance_push_cpu_stop(void *arg)
++{
++ struct task_struct *p = arg;
++ struct rq *rq = this_rq();
++ struct rq_flags rf;
++ int cpu;
++
++ raw_spin_lock_irq(&p->pi_lock);
++ rq_lock(rq, &rf);
++
++ update_rq_clock(rq);
++
++ if (task_rq(p) == rq && task_on_rq_queued(p)) {
++ cpu = select_fallback_rq(rq->cpu, p);
++ rq = __migrate_task(rq, p, cpu);
++ }
++
++ rq_unlock(rq, &rf);
++ raw_spin_unlock_irq(&p->pi_lock);
++
++ put_task_struct(p);
++
++ return 0;
++}
++
++static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
++
++/*
++ * This is enabled below SCHED_AP_ACTIVE; when !cpu_active(), but only
++ * effective when the hotplug motion is down.
++ */
++static void balance_push(struct rq *rq)
++{
++ struct task_struct *push_task = rq->curr;
++
++ lockdep_assert_held(&rq->lock);
++
++ /*
++ * Ensure the thing is persistent until balance_push_set(.on = false);
++ */
++ rq->balance_callback = &balance_push_callback;
++
++ /*
++ * Only active while going offline and when invoked on the outgoing
++ * CPU.
++ */
++ if (!cpu_dying(rq->cpu) || rq != this_rq())
++ return;
++
++ /*
++ * Both the cpu-hotplug and stop task are in this case and are
++ * required to complete the hotplug process.
++ */
++ if (kthread_is_per_cpu(push_task) ||
++ is_migration_disabled(push_task)) {
++
++ /*
++ * If this is the idle task on the outgoing CPU try to wake
++ * up the hotplug control thread which might wait for the
++ * last task to vanish. The rcuwait_active() check is
++ * accurate here because the waiter is pinned on this CPU
++ * and can't obviously be running in parallel.
++ *
++ * On RT kernels this also has to check whether there are
++ * pinned and scheduled out tasks on the runqueue. They
++ * need to leave the migrate disabled section first.
++ */
++ if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
++ rcuwait_active(&rq->hotplug_wait)) {
++ raw_spin_unlock(&rq->lock);
++ rcuwait_wake_up(&rq->hotplug_wait);
++ raw_spin_lock(&rq->lock);
++ }
++ return;
++ }
++
++ get_task_struct(push_task);
++ /*
++ * Temporarily drop rq->lock such that we can wake-up the stop task.
++ * Both preemption and IRQs are still disabled.
++ */
++ raw_spin_unlock(&rq->lock);
++ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
++ this_cpu_ptr(&push_work));
++ /*
++ * At this point need_resched() is true and we'll take the loop in
++ * schedule(). The next pick is obviously going to be the stop task
++ * which kthread_is_per_cpu() and will push this task away.
++ */
++ raw_spin_lock(&rq->lock);
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct rq_flags rf;
++
++ rq_lock_irqsave(rq, &rf);
++ if (on) {
++ WARN_ON_ONCE(rq->balance_callback);
++ rq->balance_callback = &balance_push_callback;
++ } else if (rq->balance_callback == &balance_push_callback) {
++ rq->balance_callback = NULL;
++ }
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++/*
++ * Invoked from a CPU's hotplug control thread after the CPU has been marked
++ * inactive. All tasks which are not per CPU kernel threads are either
++ * pushed off this CPU now via balance_push() or placed on a different CPU
++ * during wakeup. Wait until the CPU is quiescent.
++ */
++static void balance_hotplug_wait(void)
++{
++ struct rq *rq = this_rq();
++
++ rcuwait_wait_event(&rq->hotplug_wait,
++ rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
++ TASK_UNINTERRUPTIBLE);
++}
++
++#else
++
++static void balance_push(struct rq *rq)
++{
++}
++
++static void balance_push_set(int cpu, bool on)
++{
++}
++
++static inline void balance_hotplug_wait(void)
++{
++}
++#endif /* CONFIG_HOTPLUG_CPU */
++
++static void set_rq_offline(struct rq *rq)
++{
++ if (rq->online)
++ rq->online = false;
++}
++
++static void set_rq_online(struct rq *rq)
++{
++ if (!rq->online)
++ rq->online = true;
++}
++
++/*
++ * used to mark begin/end of suspend/resume:
++ */
++static int num_cpus_frozen;
++
++/*
++ * Update cpusets according to cpu_active mask. If cpusets are
++ * disabled, cpuset_update_active_cpus() becomes a simple wrapper
++ * around partition_sched_domains().
++ *
++ * If we come here as part of a suspend/resume, don't touch cpusets because we
++ * want to restore it back to its original state upon resume anyway.
++ */
++static void cpuset_cpu_active(void)
++{
++ if (cpuhp_tasks_frozen) {
++ /*
++ * num_cpus_frozen tracks how many CPUs are involved in the suspend/resume
++ * sequence. As long as this is not the last online
++ * operation in the resume sequence, just build a single sched
++ * domain, ignoring cpusets.
++ */
++ partition_sched_domains(1, NULL, NULL);
++ if (--num_cpus_frozen)
++ return;
++ /*
++ * This is the last CPU online operation. So fall through and
++ * restore the original sched domains by considering the
++ * cpuset configurations.
++ */
++ cpuset_force_rebuild();
++ }
++
++ cpuset_update_active_cpus();
++}
++
++static int cpuset_cpu_inactive(unsigned int cpu)
++{
++ if (!cpuhp_tasks_frozen) {
++ cpuset_update_active_cpus();
++ } else {
++ num_cpus_frozen++;
++ partition_sched_domains(1, NULL, NULL);
++ }
++ return 0;
++}
++
++int sched_cpu_activate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ /*
++ * Clear the balance_push callback and prepare to schedule
++ * regular tasks.
++ */
++ balance_push_set(cpu, false);
++
++#ifdef CONFIG_SCHED_SMT
++ /*
++ * When going up, increment the number of cores with SMT present.
++ */
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
++ static_branch_inc_cpuslocked(&sched_smt_present);
++#endif
++ set_cpu_active(cpu, true);
++
++ if (sched_smp_initialized)
++ cpuset_cpu_active();
++
++ /*
++ * Put the rq online, if not already. This happens:
++ *
++ * 1) In the early boot process, because we build the real domains
++ * after all cpus have been brought up.
++ *
++ * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
++ * domains.
++ */
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ set_rq_online(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ return 0;
++}
++
++int sched_cpu_deactivate(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++ int ret;
++
++ set_cpu_active(cpu, false);
++
++ /*
++ * From this point forward, this CPU will refuse to run any task that
++ * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
++ * push those tasks away until this gets cleared, see
++ * sched_cpu_dying().
++ */
++ balance_push_set(cpu, true);
++
++ /*
++ * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
++ * users of this state to go away such that all new such users will
++ * observe it.
++ *
++ * Specifically, we rely on ttwu to no longer target this CPU, see
++ * ttwu_queue_cond() and is_cpu_allowed().
++ *
++ * Do the sync before parking smpboot threads to take care of the RCU boost case.
++ */
++ synchronize_rcu();
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ update_rq_clock(rq);
++ set_rq_offline(rq);
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++#ifdef CONFIG_SCHED_SMT
++ /*
++ * When going down, decrement the number of cores with SMT present.
++ */
++ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
++ static_branch_dec_cpuslocked(&sched_smt_present);
++ if (!static_branch_likely(&sched_smt_present))
++ cpumask_clear(&sched_sg_idle_mask);
++ }
++#endif
++
++ if (!sched_smp_initialized)
++ return 0;
++
++ ret = cpuset_cpu_inactive(cpu);
++ if (ret) {
++ balance_push_set(cpu, false);
++ set_cpu_active(cpu, true);
++ return ret;
++ }
++
++ return 0;
++}
++
++static void sched_rq_cpu_starting(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++
++ rq->calc_load_update = calc_load_update;
++}
++
++int sched_cpu_starting(unsigned int cpu)
++{
++ sched_rq_cpu_starting(cpu);
++ sched_tick_start(cpu);
++ return 0;
++}
++
++#ifdef CONFIG_HOTPLUG_CPU
++
++/*
++ * Invoked immediately before the stopper thread is invoked to bring the
++ * CPU down completely. At this point all per CPU kthreads except the
++ * hotplug thread (current) and the stopper thread (inactive) have been
++ * either parked or have been unbound from the outgoing CPU. Ensure that
++ * any of those which might be on the way out are gone.
++ *
++ * If after this point a bound task is being woken on this CPU then the
++ * responsible hotplug callback has failed to do its job.
++ * sched_cpu_dying() will catch it with the appropriate fireworks.
++ */
++int sched_cpu_wait_empty(unsigned int cpu)
++{
++ balance_hotplug_wait();
++ return 0;
++}
++
++/*
++ * Since this CPU is going 'away' for a while, fold any nr_active delta we
++ * might have. Called from the CPU stopper task after ensuring that the
++ * stopper is the last running task on the CPU, so nr_active count is
++ * stable. We need to take the teardown thread which is calling this into
++ * account, so we hand in adjust = 1 to the load calculation.
++ *
++ * Also see the comment "Global load-average calculations".
++ */
++static void calc_load_migrate(struct rq *rq)
++{
++ long delta = calc_load_fold_active(rq, 1);
++
++ if (delta)
++ atomic_long_add(delta, &calc_load_tasks);
++}
++
++static void dump_rq_tasks(struct rq *rq, const char *loglvl)
++{
++ struct task_struct *g, *p;
++ int cpu = cpu_of(rq);
++
++ lockdep_assert_held(&rq->lock);
++
++ printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
++ for_each_process_thread(g, p) {
++ if (task_cpu(p) != cpu)
++ continue;
++
++ if (!task_on_rq_queued(p))
++ continue;
++
++ printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
++ }
++}
++
++int sched_cpu_dying(unsigned int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ unsigned long flags;
++
++ /* Handle pending wakeups and then migrate everything off */
++ sched_tick_stop(cpu);
++
++ raw_spin_lock_irqsave(&rq->lock, flags);
++ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
++ WARN(true, "Dying CPU not properly vacated!");
++ dump_rq_tasks(rq, KERN_WARNING);
++ }
++ raw_spin_unlock_irqrestore(&rq->lock, flags);
++
++ calc_load_migrate(rq);
++ hrtick_clear(rq);
++ return 0;
++}
++#endif
++
++#ifdef CONFIG_SMP
++static void sched_init_topology_cpumask_early(void)
++{
++ int cpu;
++ cpumask_t *tmp;
++
++ for_each_possible_cpu(cpu) {
++ /* init topo masks */
++ tmp = per_cpu(sched_cpu_topo_masks, cpu);
++
++ cpumask_copy(tmp, cpumask_of(cpu));
++ tmp++;
++ cpumask_copy(tmp, cpu_possible_mask);
++ per_cpu(sched_cpu_llc_mask, cpu) = tmp;
++ per_cpu(sched_cpu_topo_end_mask, cpu) = ++tmp;
++ /*per_cpu(sd_llc_id, cpu) = cpu;*/
++ }
++}
++
++#define TOPOLOGY_CPUMASK(name, mask, last)\
++ if (cpumask_and(topo, topo, mask)) { \
++ cpumask_copy(topo, mask); \
++ printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - "#name, \
++ cpu, (topo++)->bits[0]); \
++ } \
++ if (!last) \
++ bitmap_complement(cpumask_bits(topo), cpumask_bits(mask), \
++ nr_cpumask_bits);
++
++static void sched_init_topology_cpumask(void)
++{
++ int cpu;
++ cpumask_t *topo;
++
++ for_each_online_cpu(cpu) {
++		/* take the chance to reset the time slice for idle tasks */
++ cpu_rq(cpu)->idle->time_slice = sched_timeslice_ns;
++
++ topo = per_cpu(sched_cpu_topo_masks, cpu) + 1;
++
++ bitmap_complement(cpumask_bits(topo), cpumask_bits(cpumask_of(cpu)),
++ nr_cpumask_bits);
++#ifdef CONFIG_SCHED_SMT
++ TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false);
++#endif
++ per_cpu(sd_llc_id, cpu) = cpumask_first(cpu_coregroup_mask(cpu));
++ per_cpu(sched_cpu_llc_mask, cpu) = topo;
++ TOPOLOGY_CPUMASK(coregroup, cpu_coregroup_mask(cpu), false);
++
++ TOPOLOGY_CPUMASK(core, topology_core_cpumask(cpu), false);
++
++ TOPOLOGY_CPUMASK(others, cpu_online_mask, true);
++
++ per_cpu(sched_cpu_topo_end_mask, cpu) = topo;
++ printk(KERN_INFO "sched: cpu#%02d llc_id = %d, llc_mask idx = %d\n",
++ cpu, per_cpu(sd_llc_id, cpu),
++ (int) (per_cpu(sched_cpu_llc_mask, cpu) -
++ per_cpu(sched_cpu_topo_masks, cpu)));
++ }
++}
++#endif
++
++void __init sched_init_smp(void)
++{
++ /* Move init over to a non-isolated CPU */
++ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
++ BUG();
++ current->flags &= ~PF_NO_SETAFFINITY;
++
++ sched_init_topology_cpumask();
++
++ sched_smp_initialized = true;
++}
++
++static int __init migration_init(void)
++{
++ sched_cpu_starting(smp_processor_id());
++ return 0;
++}
++early_initcall(migration_init);
++
++#else
++void __init sched_init_smp(void)
++{
++ cpu_rq(0)->idle->time_slice = sched_timeslice_ns;
++}
++#endif /* CONFIG_SMP */
++
++int in_sched_functions(unsigned long addr)
++{
++ return in_lock_functions(addr) ||
++ (addr >= (unsigned long)__sched_text_start
++ && addr < (unsigned long)__sched_text_end);
++}
++
++#ifdef CONFIG_CGROUP_SCHED
++/* task group related information */
++struct task_group {
++ struct cgroup_subsys_state css;
++
++ struct rcu_head rcu;
++ struct list_head list;
++
++ struct task_group *parent;
++ struct list_head siblings;
++ struct list_head children;
++#ifdef CONFIG_FAIR_GROUP_SCHED
++ unsigned long shares;
++#endif
++};
++
++/*
++ * Default task group.
++ * Every task in the system belongs to this group at bootup.
++ */
++struct task_group root_task_group;
++LIST_HEAD(task_groups);
++
++/* Cacheline aligned slab cache for task_group */
++static struct kmem_cache *task_group_cache __read_mostly;
++#endif /* CONFIG_CGROUP_SCHED */
++
++void __init sched_init(void)
++{
++ int i;
++ struct rq *rq;
++
++ printk(KERN_INFO "sched/alt: "ALT_SCHED_NAME" CPU Scheduler "ALT_SCHED_VERSION\
++ " by Alfred Chen.\n");
++
++ wait_bit_init();
++
++#ifdef CONFIG_SMP
++ for (i = 0; i < SCHED_QUEUE_BITS; i++)
++ cpumask_copy(sched_preempt_mask + i, cpu_present_mask);
++#endif
++
++#ifdef CONFIG_CGROUP_SCHED
++ task_group_cache = KMEM_CACHE(task_group, 0);
++
++ list_add(&root_task_group.list, &task_groups);
++ INIT_LIST_HEAD(&root_task_group.children);
++ INIT_LIST_HEAD(&root_task_group.siblings);
++#endif /* CONFIG_CGROUP_SCHED */
++ for_each_possible_cpu(i) {
++ rq = cpu_rq(i);
++
++ sched_queue_init(&rq->queue);
++ rq->prio = IDLE_TASK_SCHED_PRIO;
++ rq->skip = NULL;
++
++ raw_spin_lock_init(&rq->lock);
++ rq->nr_running = rq->nr_uninterruptible = 0;
++ rq->calc_load_active = 0;
++ rq->calc_load_update = jiffies + LOAD_FREQ;
++#ifdef CONFIG_SMP
++ rq->online = false;
++ rq->cpu = i;
++
++#ifdef CONFIG_SCHED_SMT
++ rq->active_balance = 0;
++#endif
++
++#ifdef CONFIG_NO_HZ_COMMON
++ INIT_CSD(&rq->nohz_csd, nohz_csd_func, rq);
++#endif
++ rq->balance_callback = &balance_push_callback;
++#ifdef CONFIG_HOTPLUG_CPU
++ rcuwait_init(&rq->hotplug_wait);
++#endif
++#endif /* CONFIG_SMP */
++ rq->nr_switches = 0;
++
++ hrtick_rq_init(rq);
++ atomic_set(&rq->nr_iowait, 0);
++
++ zalloc_cpumask_var_node(&rq->scratch_mask, GFP_KERNEL, cpu_to_node(i));
++ }
++#ifdef CONFIG_SMP
++ /* Set rq->online for cpu 0 */
++ cpu_rq(0)->online = true;
++#endif
++ /*
++ * The boot idle thread does lazy MMU switching as well:
++ */
++ mmgrab(&init_mm);
++ enter_lazy_tlb(&init_mm, current);
++
++ /*
++ * The idle task doesn't need the kthread struct to function, but it
++ * is dressed up as a per-CPU kthread and thus needs to play the part
++ * if we want to avoid special-casing it in code that deals with per-CPU
++ * kthreads.
++ */
++ WARN_ON(!set_kthread_struct(current));
++
++ /*
++ * Make us the idle thread. Technically, schedule() should not be
++ * called from this thread, however somewhere below it might be,
++ * but because we are the idle thread, we just pick up running again
++ * when this runqueue becomes "idle".
++ */
++ init_idle(current, smp_processor_id());
++
++ calc_load_update = jiffies + LOAD_FREQ;
++
++#ifdef CONFIG_SMP
++ idle_thread_set_boot_cpu();
++ balance_push_set(smp_processor_id(), false);
++
++ sched_init_topology_cpumask_early();
++#endif /* SMP */
++
++ preempt_dynamic_init();
++}
++
++#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
++
++void __might_sleep(const char *file, int line)
++{
++ unsigned int state = get_current_state();
++ /*
++ * Blocking primitives will set (and therefore destroy) current->state,
++ * since we will exit with TASK_RUNNING, make sure we enter with it;
++ * otherwise we will destroy state.
++ */
++ WARN_ONCE(state != TASK_RUNNING && current->task_state_change,
++ "do not call blocking ops when !TASK_RUNNING; "
++ "state=%x set at [<%p>] %pS\n", state,
++ (void *)current->task_state_change,
++ (void *)current->task_state_change);
++
++ __might_resched(file, line, 0);
++}
++EXPORT_SYMBOL(__might_sleep);
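++
++/*
++ * Minimal sketch of the kind of bug these checks catch: with
++ * CONFIG_DEBUG_ATOMIC_SLEEP enabled, the allocation below triggers the
++ * "BUG: sleeping function called from invalid context" report, because
++ * GFP_KERNEL may sleep while the spinlock holds preemption off:
++ *
++ *   spin_lock(&lock);
++ *   buf = kmalloc(size, GFP_KERNEL);   // __might_resched() fires here
++ *   spin_unlock(&lock);
++ */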
++
++static void print_preempt_disable_ip(int preempt_offset, unsigned long ip)
++{
++ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT))
++ return;
++
++ if (preempt_count() == preempt_offset)
++ return;
++
++ pr_err("Preemption disabled at:");
++ print_ip_sym(KERN_ERR, ip);
++}
++
++static inline bool resched_offsets_ok(unsigned int offsets)
++{
++ unsigned int nested = preempt_count();
++
++ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT;
++
++ return nested == offsets;
++}
++
++void __might_resched(const char *file, int line, unsigned int offsets)
++{
++ /* Ratelimiting timestamp: */
++ static unsigned long prev_jiffy;
++
++ unsigned long preempt_disable_ip;
++
++ /* WARN_ON_ONCE() by default, no rate limit required: */
++ rcu_sleep_check();
++
++ if ((resched_offsets_ok(offsets) && !irqs_disabled() &&
++ !is_idle_task(current) && !current->non_block_count) ||
++ system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
++ oops_in_progress)
++ return;
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ /* Save this before calling printk(), since that will clobber it: */
++ preempt_disable_ip = get_preempt_disable_ip(current);
++
++ pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
++ file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), current->non_block_count,
++ current->pid, current->comm);
++ pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
++ offsets & MIGHT_RESCHED_PREEMPT_MASK);
++
++ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
++ pr_err("RCU nest depth: %d, expected: %u\n",
++ rcu_preempt_depth(), offsets >> MIGHT_RESCHED_RCU_SHIFT);
++ }
++
++ if (task_stack_end_corrupted(current))
++ pr_emerg("Thread overran stack, or stack corrupted\n");
++
++ debug_show_held_locks(current);
++ if (irqs_disabled())
++ print_irqtrace_events(current);
++
++ print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
++ preempt_disable_ip);
++
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL(__might_resched);
++
++void __cant_sleep(const char *file, int line, int preempt_offset)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > preempt_offset)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ printk(KERN_ERR "BUG: assuming atomic context at %s:%d\n", file, line);
++ printk(KERN_ERR "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_sleep);
++
++#ifdef CONFIG_SMP
++void __cant_migrate(const char *file, int line)
++{
++ static unsigned long prev_jiffy;
++
++ if (irqs_disabled())
++ return;
++
++ if (is_migration_disabled(current))
++ return;
++
++ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
++ return;
++
++ if (preempt_count() > 0)
++ return;
++
++ if (current->migration_flags & MDF_FORCE_ENABLED)
++ return;
++
++ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
++ return;
++ prev_jiffy = jiffies;
++
++ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line);
++ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n",
++ in_atomic(), irqs_disabled(), is_migration_disabled(current),
++ current->pid, current->comm);
++
++ debug_show_held_locks(current);
++ dump_stack();
++ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
++}
++EXPORT_SYMBOL_GPL(__cant_migrate);
++#endif
++#endif
++
++#ifdef CONFIG_MAGIC_SYSRQ
++void normalize_rt_tasks(void)
++{
++ struct task_struct *g, *p;
++ struct sched_attr attr = {
++ .sched_policy = SCHED_NORMAL,
++ };
++
++ read_lock(&tasklist_lock);
++ for_each_process_thread(g, p) {
++ /*
++ * Only normalize user tasks:
++ */
++ if (p->flags & PF_KTHREAD)
++ continue;
++
++ schedstat_set(p->stats.wait_start, 0);
++ schedstat_set(p->stats.sleep_start, 0);
++ schedstat_set(p->stats.block_start, 0);
++
++ if (!rt_task(p)) {
++ /*
++ * Renice negative nice level userspace
++ * tasks back to 0:
++ */
++ if (task_nice(p) < 0)
++ set_user_nice(p, 0);
++ continue;
++ }
++
++ __sched_setscheduler(p, &attr, false, false);
++ }
++ read_unlock(&tasklist_lock);
++}
++#endif /* CONFIG_MAGIC_SYSRQ */
++
++#if defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB)
++/*
++ * These functions are only useful for the IA64 MCA handling, or kdb.
++ *
++ * They can only be called when the whole system has been
++ * stopped - every CPU needs to be quiescent, and no scheduling
++ * activity can take place. Using them for anything else would
++ * be a serious bug, and as a result, they aren't even visible
++ * under any other configuration.
++ */
++
++/**
++ * curr_task - return the current task for a given CPU.
++ * @cpu: the processor in question.
++ *
++ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
++ *
++ * Return: The current task for @cpu.
++ */
++struct task_struct *curr_task(int cpu)
++{
++ return cpu_curr(cpu);
++}
++
++#endif /* defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB) */
++
++#ifdef CONFIG_IA64
++/**
++ * ia64_set_curr_task - set the current task for a given CPU.
++ * @cpu: the processor in question.
++ * @p: the task pointer to set.
++ *
++ * Description: This function must only be used when non-maskable interrupts
++ * are serviced on a separate stack. It allows the architecture to switch the
++ * notion of the current task on a CPU in a non-blocking manner. This function
++ * must be called with all CPUs synchronised and interrupts disabled, and the
++ * caller must save the original value of the current task (see
++ * curr_task() above) and restore that value before re-enabling interrupts and
++ * restarting the system.
++ *
++ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
++ */
++void ia64_set_curr_task(int cpu, struct task_struct *p)
++{
++ cpu_curr(cpu) = p;
++}
++
++#endif
++
++#ifdef CONFIG_CGROUP_SCHED
++static void sched_free_group(struct task_group *tg)
++{
++ kmem_cache_free(task_group_cache, tg);
++}
++
++static void sched_free_group_rcu(struct rcu_head *rhp)
++{
++ sched_free_group(container_of(rhp, struct task_group, rcu));
++}
++
++static void sched_unregister_group(struct task_group *tg)
++{
++ /*
++ * We have to wait for yet another RCU grace period to expire, as
++ * print_cfs_stats() might run concurrently.
++ */
++ call_rcu(&tg->rcu, sched_free_group_rcu);
++}
++
++/* allocate runqueue etc for a new task group */
++struct task_group *sched_create_group(struct task_group *parent)
++{
++ struct task_group *tg;
++
++ tg = kmem_cache_alloc(task_group_cache, GFP_KERNEL | __GFP_ZERO);
++ if (!tg)
++ return ERR_PTR(-ENOMEM);
++
++ return tg;
++}
++
++void sched_online_group(struct task_group *tg, struct task_group *parent)
++{
++}
++
++/* rcu callback to free various structures associated with a task group */
++static void sched_unregister_group_rcu(struct rcu_head *rhp)
++{
++ /* Now it should be safe to free those cfs_rqs: */
++ sched_unregister_group(container_of(rhp, struct task_group, rcu));
++}
++
++void sched_destroy_group(struct task_group *tg)
++{
++	/* Wait for possible concurrent references to cfs_rqs to complete: */
++ call_rcu(&tg->rcu, sched_unregister_group_rcu);
++}
++
++void sched_release_group(struct task_group *tg)
++{
++}
++
++static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
++{
++ return css ? container_of(css, struct task_group, css) : NULL;
++}
++
++static struct cgroup_subsys_state *
++cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
++{
++ struct task_group *parent = css_tg(parent_css);
++ struct task_group *tg;
++
++ if (!parent) {
++ /* This is early initialization for the top cgroup */
++ return &root_task_group.css;
++ }
++
++ tg = sched_create_group(parent);
++ if (IS_ERR(tg))
++ return ERR_PTR(-ENOMEM);
++ return &tg->css;
++}
++
++/* Expose task group only after completing cgroup initialization */
++static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++ struct task_group *parent = css_tg(css->parent);
++
++ if (parent)
++ sched_online_group(tg, parent);
++ return 0;
++}
++
++static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ sched_release_group(tg);
++}
++
++static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
++{
++ struct task_group *tg = css_tg(css);
++
++ /*
++ * Relies on the RCU grace period between css_released() and this.
++ */
++ sched_unregister_group(tg);
++}
++
++#ifdef CONFIG_RT_GROUP_SCHED
++static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
++{
++ return 0;
++}
++#endif
++
++static void cpu_cgroup_attach(struct cgroup_taskset *tset)
++{
++}
++
++#ifdef CONFIG_FAIR_GROUP_SCHED
++static DEFINE_MUTEX(shares_mutex);
++
++int sched_group_set_shares(struct task_group *tg, unsigned long shares)
++{
++ /*
++ * We can't change the weight of the root cgroup.
++ */
++ if (&root_task_group == tg)
++ return -EINVAL;
++
++ shares = clamp(shares, scale_load(MIN_SHARES), scale_load(MAX_SHARES));
++
++ mutex_lock(&shares_mutex);
++ if (tg->shares == shares)
++ goto done;
++
++ tg->shares = shares;
++done:
++ mutex_unlock(&shares_mutex);
++ return 0;
++}
++
++static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
++ struct cftype *cftype, u64 shareval)
++{
++ if (shareval > scale_load_down(ULONG_MAX))
++ shareval = MAX_SHARES;
++ return sched_group_set_shares(css_tg(css), scale_load(shareval));
++}
++
++static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
++ struct cftype *cft)
++{
++ struct task_group *tg = css_tg(css);
++
++ return (u64) scale_load_down(tg->shares);
++}
++#endif
++
++static struct cftype cpu_legacy_files[] = {
++#ifdef CONFIG_FAIR_GROUP_SCHED
++ {
++ .name = "shares",
++ .read_u64 = cpu_shares_read_u64,
++ .write_u64 = cpu_shares_write_u64,
++ },
++#endif
++ { } /* Terminate */
++};
++
++
++static struct cftype cpu_files[] = {
++ { } /* terminate */
++};
++
++static int cpu_extra_stat_show(struct seq_file *sf,
++ struct cgroup_subsys_state *css)
++{
++ return 0;
++}
++
++struct cgroup_subsys cpu_cgrp_subsys = {
++ .css_alloc = cpu_cgroup_css_alloc,
++ .css_online = cpu_cgroup_css_online,
++ .css_released = cpu_cgroup_css_released,
++ .css_free = cpu_cgroup_css_free,
++ .css_extra_stat_show = cpu_extra_stat_show,
++#ifdef CONFIG_RT_GROUP_SCHED
++ .can_attach = cpu_cgroup_can_attach,
++#endif
++ .attach = cpu_cgroup_attach,
++ .legacy_cftypes = cpu_legacy_files,
++ .dfl_cftypes = cpu_files,
++ .early_init = true,
++ .threaded = true,
++};
++#endif /* CONFIG_CGROUP_SCHED */
++
++#undef CREATE_TRACE_POINTS
++
++#ifdef CONFIG_SCHED_MM_CID
++
++/*
++ * @cid_lock: Guarantee forward-progress of cid allocation.
++ *
++ * Concurrency ID allocation within a bitmap is mostly lock-free. The cid_lock
++ * is only used when contention is detected by the lock-free allocation so
++ * forward progress can be guaranteed.
++ */
++DEFINE_RAW_SPINLOCK(cid_lock);
++
++/*
++ * @use_cid_lock: Select cid allocation behavior: lock-free vs spinlock.
++ *
++ * When @use_cid_lock is 0, the cid allocation is lock-free. When contention is
++ * detected, it is set to 1 to ensure that all newly coming allocations are
++ * serialized by @cid_lock until the allocation which detected contention
++ * completes and sets @use_cid_lock back to 0. This guarantees forward progress
++ * of a cid allocation.
++ */
++int use_cid_lock;
++
++/*
++ * mm_cid remote-clear implements a lock-free algorithm to clear per-mm/cpu cid
++ * concurrently with respect to the execution of the source runqueue context
++ * switch.
++ *
++ * There is one basic property we want to guarantee here:
++ *
++ * (1) Remote-clear should _never_ mark a per-cpu cid UNSET when it is actively
++ * used by a task. That would lead to concurrent allocation of the cid and
++ * userspace corruption.
++ *
++ * Provide this guarantee by introducing a Dekker memory ordering to guarantee
++ * that a pair of loads observe at least one of a pair of stores, which can be
++ * shown as:
++ *
++ * X = Y = 0
++ *
++ * w[X]=1 w[Y]=1
++ * MB MB
++ * r[Y]=y r[X]=x
++ *
++ * Which guarantees that x==0 && y==0 is impossible. But rather than using
++ * values 0 and 1, this algorithm cares about specific state transitions of the
++ * runqueue current task (as updated by the scheduler context switch), and the
++ * per-mm/cpu cid value.
++ *
++ * Let's introduce task (Y) which has task->mm == mm and task (N) which has
++ * task->mm != mm for the rest of the discussion. There are two scheduler state
++ * transitions on context switch we care about:
++ *
++ * (TSA) Store to rq->curr with transition from (N) to (Y)
++ *
++ * (TSB) Store to rq->curr with transition from (Y) to (N)
++ *
++ * On the remote-clear side, there is one transition we care about:
++ *
++ * (TMA) cmpxchg to *pcpu_cid to set the LAZY flag
++ *
++ * There is also a transition to UNSET state which can be performed from all
++ * sides (scheduler, remote-clear). It is always performed with a cmpxchg which
++ * guarantees that only a single thread will succeed:
++ *
++ * (TMB) cmpxchg to *pcpu_cid to mark UNSET
++ *
++ * Just to be clear, what we do _not_ want to happen is a transition to UNSET
++ * when a thread is actively using the cid (property (1)).
++ *
++ * Let's look at the relevant combinations of TSA/TSB, and TMA transitions.
++ *
++ * Scenario A) (TSA)+(TMA) (from next task perspective)
++ *
++ * CPU0 CPU1
++ *
++ * Context switch CS-1 Remote-clear
++ * - store to rq->curr: (N)->(Y) (TSA) - cmpxchg to *pcpu_id to LAZY (TMA)
++ * (implied barrier after cmpxchg)
++ * - switch_mm_cid()
++ * - memory barrier (see switch_mm_cid()
++ * comment explaining how this barrier
++ * is combined with other scheduler
++ * barriers)
++ * - mm_cid_get (next)
++ * - READ_ONCE(*pcpu_cid) - rcu_dereference(src_rq->curr)
++ *
++ * This Dekker ensures that either task (Y) is observed by the
++ * rcu_dereference() or the LAZY flag is observed by READ_ONCE(), or both are
++ * observed.
++ *
++ * If task (Y) store is observed by rcu_dereference(), it means that there is
++ * still an active task on the cpu. Remote-clear will therefore not transition
++ * to UNSET, which fulfills property (1).
++ *
++ * If task (Y) is not observed, but the lazy flag is observed by READ_ONCE(),
++ * it will move its state to UNSET, which clears the percpu cid perhaps
++ * uselessly (which is not an issue for correctness). Because task (Y) is not
++ * observed, CPU1 can move ahead to set the state to UNSET. Because moving
++ * state to UNSET is done with a cmpxchg expecting that the old state has the
++ * LAZY flag set, only one thread will successfully UNSET.
++ *
++ * If both states (LAZY flag and task (Y)) are observed, the thread on CPU0
++ * will observe the LAZY flag and transition to UNSET (perhaps uselessly), and
++ * CPU1 will observe task (Y) and do nothing more, which is fine.
++ *
++ * What we are effectively preventing with this Dekker is a scenario where
++ * neither LAZY flag nor store (Y) are observed, which would fail property (1)
++ * because this would UNSET a cid which is actively used.
++ */
++
++void sched_mm_cid_migrate_from(struct task_struct *t)
++{
++ t->migrate_from_cpu = task_cpu(t);
++}
++
++static
++int __sched_mm_cid_migrate_from_fetch_cid(struct rq *src_rq,
++ struct task_struct *t,
++ struct mm_cid *src_pcpu_cid)
++{
++ struct mm_struct *mm = t->mm;
++ struct task_struct *src_task;
++ int src_cid, last_mm_cid;
++
++ if (!mm)
++ return -1;
++
++ last_mm_cid = t->last_mm_cid;
++ /*
++ * If the migrated task has no last cid, or if the current
++ * task on src rq uses the cid, it means the source cid does not need
++ * to be moved to the destination cpu.
++ */
++ if (last_mm_cid == -1)
++ return -1;
++ src_cid = READ_ONCE(src_pcpu_cid->cid);
++ if (!mm_cid_is_valid(src_cid) || last_mm_cid != src_cid)
++ return -1;
++
++ /*
++ * If we observe an active task using the mm on this rq, it means we
++ * are not the last task to be migrated from this cpu for this mm, so
++ * there is no need to move src_cid to the destination cpu.
++ */
++ rcu_read_lock();
++ src_task = rcu_dereference(src_rq->curr);
++ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
++ rcu_read_unlock();
++ t->last_mm_cid = -1;
++ return -1;
++ }
++ rcu_read_unlock();
++
++ return src_cid;
++}
++
++static
++int __sched_mm_cid_migrate_from_try_steal_cid(struct rq *src_rq,
++ struct task_struct *t,
++ struct mm_cid *src_pcpu_cid,
++ int src_cid)
++{
++ struct task_struct *src_task;
++ struct mm_struct *mm = t->mm;
++ int lazy_cid;
++
++ if (src_cid == -1)
++ return -1;
++
++ /*
++ * Attempt to clear the source cpu cid to move it to the destination
++ * cpu.
++ */
++ lazy_cid = mm_cid_set_lazy_put(src_cid);
++ if (!try_cmpxchg(&src_pcpu_cid->cid, &src_cid, lazy_cid))
++ return -1;
++
++ /*
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm matches the scheduler barrier in context_switch()
++ * between store to rq->curr and load of prev and next task's
++ * per-mm/cpu cid.
++ *
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm_cid_active matches the barrier in
++ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
++ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
++ * load of per-mm/cpu cid.
++ */
++
++ /*
++ * If we observe an active task using the mm on this rq after setting
++ * the lazy-put flag, this task will be responsible for transitioning
++ * from lazy-put flag set to MM_CID_UNSET.
++ */
++ rcu_read_lock();
++ src_task = rcu_dereference(src_rq->curr);
++ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
++ rcu_read_unlock();
++ /*
++ * We observed an active task for this mm, there is therefore
++ * no point in moving this cid to the destination cpu.
++ */
++ t->last_mm_cid = -1;
++ return -1;
++ }
++ rcu_read_unlock();
++
++ /*
++ * The src_cid is unused, so it can be unset.
++ */
++ if (!try_cmpxchg(&src_pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
++ return -1;
++ return src_cid;
++}
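The two try_cmpxchg() calls above form a small state machine: a valid cid is first marked with the lazy-put flag, and only a thread that later wins the second cmpxchg may move it to UNSET. A rough stand-alone sketch of that two-step transition follows; the flag and UNSET encodings are made up for illustration and are not the kernel's MM_CID_* values.

/* Editorial sketch: VALID(cid) -> cid|LAZY_PUT -> UNSET with two CAS steps. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_CID_UNSET	(-1)
#define SKETCH_CID_LAZY_PUT	(1 << 30)

static bool sketch_try_steal(atomic_int *pcpu_cid, int expected_cid)
{
	int cur = expected_cid;

	/* Step 1: VALID -> VALID|LAZY_PUT. Fails if someone else moved it. */
	if (!atomic_compare_exchange_strong(pcpu_cid, &cur,
					    expected_cid | SKETCH_CID_LAZY_PUT))
		return false;

	/* ...here the real code re-checks rq->curr and may bail out... */

	/* Step 2: VALID|LAZY_PUT -> UNSET. Only one thread can win this. */
	cur = expected_cid | SKETCH_CID_LAZY_PUT;
	return atomic_compare_exchange_strong(pcpu_cid, &cur, SKETCH_CID_UNSET);
}

int main(void)
{
	atomic_int pcpu_cid = 3;

	assert(sketch_try_steal(&pcpu_cid, 3));
	assert(atomic_load(&pcpu_cid) == SKETCH_CID_UNSET);
	return 0;
}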
++
++/*
++ * Migration to dst cpu. Called with dst_rq lock held.
++ * Interrupts are disabled, which keeps small the window during which the cid
++ * is owned without the source rq lock held.
++ */
++void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu)
++{
++ struct mm_cid *src_pcpu_cid, *dst_pcpu_cid;
++ struct mm_struct *mm = t->mm;
++ int src_cid, dst_cid;
++ struct rq *src_rq;
++
++ lockdep_assert_rq_held(dst_rq);
++
++ if (!mm)
++ return;
++ if (src_cpu == -1) {
++ t->last_mm_cid = -1;
++ return;
++ }
++ /*
++	 * Move the src cid if the dst cid is unset. This keeps cid
++	 * allocation as close to 0 as possible in cases where few threads
++	 * migrate around many cpus.
++	 *
++	 * If the destination cid is already set, we may have to just clear
++	 * the src cid to ensure compactness in frequent-migration
++	 * scenarios.
++	 *
++	 * It is not useful to clear the src cid when the number of threads is
++	 * greater than or equal to the number of allowed cpus, because
++	 * user-space can expect that the number of allowed cids can reach
++	 * the number of allowed cpus.
++ */
++ dst_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(dst_rq));
++ dst_cid = READ_ONCE(dst_pcpu_cid->cid);
++ if (!mm_cid_is_unset(dst_cid) &&
++ atomic_read(&mm->mm_users) >= t->nr_cpus_allowed)
++ return;
++ src_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, src_cpu);
++ src_rq = cpu_rq(src_cpu);
++ src_cid = __sched_mm_cid_migrate_from_fetch_cid(src_rq, t, src_pcpu_cid);
++ if (src_cid == -1)
++ return;
++ src_cid = __sched_mm_cid_migrate_from_try_steal_cid(src_rq, t, src_pcpu_cid,
++ src_cid);
++ if (src_cid == -1)
++ return;
++ if (!mm_cid_is_unset(dst_cid)) {
++ __mm_cid_put(mm, src_cid);
++ return;
++ }
++ /* Move src_cid to dst cpu. */
++ mm_cid_snapshot_time(dst_rq, mm);
++ WRITE_ONCE(dst_pcpu_cid->cid, src_cid);
++}
++
++static void sched_mm_cid_remote_clear(struct mm_struct *mm, struct mm_cid *pcpu_cid,
++ int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct task_struct *t;
++ unsigned long flags;
++ int cid, lazy_cid;
++
++ cid = READ_ONCE(pcpu_cid->cid);
++ if (!mm_cid_is_valid(cid))
++ return;
++
++ /*
++	 * If the cpu cid is set, clear it to keep cid allocation compact.  If
++	 * there happen to be other tasks left on the source cpu using this
++	 * mm, the next task using this mm will reallocate its cid on context
++	 * switch.
++ */
++ lazy_cid = mm_cid_set_lazy_put(cid);
++ if (!try_cmpxchg(&pcpu_cid->cid, &cid, lazy_cid))
++ return;
++
++ /*
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm matches the scheduler barrier in context_switch()
++ * between store to rq->curr and load of prev and next task's
++ * per-mm/cpu cid.
++ *
++ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
++ * rq->curr->mm_cid_active matches the barrier in
++ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
++ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
++ * load of per-mm/cpu cid.
++ */
++
++ /*
++ * If we observe an active task using the mm on this rq after setting
++ * the lazy-put flag, that task will be responsible for transitioning
++ * from lazy-put flag set to MM_CID_UNSET.
++ */
++ rcu_read_lock();
++ t = rcu_dereference(rq->curr);
++ if (READ_ONCE(t->mm_cid_active) && t->mm == mm) {
++ rcu_read_unlock();
++ return;
++ }
++ rcu_read_unlock();
++
++ /*
++ * The cid is unused, so it can be unset.
++ * Disable interrupts to keep the window of cid ownership without rq
++ * lock small.
++ */
++ local_irq_save(flags);
++ if (try_cmpxchg(&pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
++ __mm_cid_put(mm, cid);
++ local_irq_restore(flags);
++}
++
++static void sched_mm_cid_remote_clear_old(struct mm_struct *mm, int cpu)
++{
++ struct rq *rq = cpu_rq(cpu);
++ struct mm_cid *pcpu_cid;
++ struct task_struct *curr;
++ u64 rq_clock;
++
++ /*
++ * rq->clock load is racy on 32-bit but one spurious clear once in a
++ * while is irrelevant.
++ */
++ rq_clock = READ_ONCE(rq->clock);
++ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
++
++ /*
++ * In order to take care of infrequently scheduled tasks, bump the time
++ * snapshot associated with this cid if an active task using the mm is
++ * observed on this rq.
++ */
++ rcu_read_lock();
++ curr = rcu_dereference(rq->curr);
++ if (READ_ONCE(curr->mm_cid_active) && curr->mm == mm) {
++ WRITE_ONCE(pcpu_cid->time, rq_clock);
++ rcu_read_unlock();
++ return;
++ }
++ rcu_read_unlock();
++
++ if (rq_clock < pcpu_cid->time + SCHED_MM_CID_PERIOD_NS)
++ return;
++ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
++}
++
++static void sched_mm_cid_remote_clear_weight(struct mm_struct *mm, int cpu,
++ int weight)
++{
++ struct mm_cid *pcpu_cid;
++ int cid;
++
++ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
++ cid = READ_ONCE(pcpu_cid->cid);
++ if (!mm_cid_is_valid(cid) || cid < weight)
++ return;
++ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
++}
++
++static void task_mm_cid_work(struct callback_head *work)
++{
++ unsigned long now = jiffies, old_scan, next_scan;
++ struct task_struct *t = current;
++ struct cpumask *cidmask;
++ struct mm_struct *mm;
++ int weight, cpu;
++
++ SCHED_WARN_ON(t != container_of(work, struct task_struct, cid_work));
++
++ work->next = work; /* Prevent double-add */
++ if (t->flags & PF_EXITING)
++ return;
++ mm = t->mm;
++ if (!mm)
++ return;
++ old_scan = READ_ONCE(mm->mm_cid_next_scan);
++ next_scan = now + msecs_to_jiffies(MM_CID_SCAN_DELAY);
++ if (!old_scan) {
++ unsigned long res;
++
++ res = cmpxchg(&mm->mm_cid_next_scan, old_scan, next_scan);
++ if (res != old_scan)
++ old_scan = res;
++ else
++ old_scan = next_scan;
++ }
++ if (time_before(now, old_scan))
++ return;
++ if (!try_cmpxchg(&mm->mm_cid_next_scan, &old_scan, next_scan))
++ return;
++ cidmask = mm_cidmask(mm);
++ /* Clear cids that were not recently used. */
++ for_each_possible_cpu(cpu)
++ sched_mm_cid_remote_clear_old(mm, cpu);
++ weight = cpumask_weight(cidmask);
++ /*
++	 * Clear cids that are greater than or equal to the cidmask weight to
++ * recompact it.
++ */
++ for_each_possible_cpu(cpu)
++ sched_mm_cid_remote_clear_weight(mm, cpu, weight);
++}
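The scan gating at the top of task_mm_cid_work() is a reusable pattern: install a deadline on first use, return early until it expires, then let only the winner of a cmpxchg on the deadline run the expensive scan. A rough user-space sketch of that shape (a plain monotonic counter replaces jiffies/time_before(), so wraparound is ignored; names are made up):

/* Editorial sketch: one-winner periodic work gating via compare-exchange. */
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_SCAN_DELAY 100

static bool sketch_should_scan(atomic_ulong *next_scan, unsigned long now)
{
	unsigned long old = atomic_load(next_scan);

	if (!old) {
		/* First caller installs the initial deadline. */
		unsigned long zero = 0;

		if (!atomic_compare_exchange_strong(next_scan, &zero,
						    now + SKETCH_SCAN_DELAY))
			old = zero;	/* someone else installed it first */
		else
			old = now + SKETCH_SCAN_DELAY;
	}
	if (now < old)			/* not due yet */
		return false;
	/* Due: only the winner of this cmpxchg performs the scan. */
	return atomic_compare_exchange_strong(next_scan, &old,
					      now + SKETCH_SCAN_DELAY);
}

int main(void)
{
	atomic_ulong next_scan = 0;

	sketch_should_scan(&next_scan, 500);			/* installs deadline */
	return sketch_should_scan(&next_scan, 700) ? 0 : 1;	/* due: wins the scan */
}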
++
++void init_sched_mm_cid(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ int mm_users = 0;
++
++ if (mm) {
++ mm_users = atomic_read(&mm->mm_users);
++ if (mm_users == 1)
++ mm->mm_cid_next_scan = jiffies + msecs_to_jiffies(MM_CID_SCAN_DELAY);
++ }
++ t->cid_work.next = &t->cid_work; /* Protect against double add */
++ init_task_work(&t->cid_work, task_mm_cid_work);
++}
++
++void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
++{
++ struct callback_head *work = &curr->cid_work;
++ unsigned long now = jiffies;
++
++ if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) ||
++ work->next != work)
++ return;
++ if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan)))
++ return;
++ task_work_add(curr, work, TWA_RESUME);
++}
++
++void sched_mm_cid_exit_signals(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq_flags rf;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ rq_lock_irqsave(rq, &rf);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 0);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ mm_cid_put(mm);
++ t->last_mm_cid = t->mm_cid = -1;
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++void sched_mm_cid_before_execve(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq_flags rf;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ rq_lock_irqsave(rq, &rf);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 0);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ mm_cid_put(mm);
++ t->last_mm_cid = t->mm_cid = -1;
++ rq_unlock_irqrestore(rq, &rf);
++}
++
++void sched_mm_cid_after_execve(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct rq_flags rf;
++ struct rq *rq;
++
++ if (!mm)
++ return;
++
++ preempt_disable();
++ rq = this_rq();
++ rq_lock_irqsave(rq, &rf);
++ preempt_enable_no_resched(); /* holding spinlock */
++ WRITE_ONCE(t->mm_cid_active, 1);
++ /*
++ * Store t->mm_cid_active before loading per-mm/cpu cid.
++ * Matches barrier in sched_mm_cid_remote_clear_old().
++ */
++ smp_mb();
++ t->last_mm_cid = t->mm_cid = mm_cid_get(rq, mm);
++ rq_unlock_irqrestore(rq, &rf);
++ rseq_set_notify_resume(t);
++}
++
++void sched_mm_cid_fork(struct task_struct *t)
++{
++ WARN_ON_ONCE(!t->mm || t->mm_cid != -1);
++ t->mm_cid_active = 1;
++}
++#endif
+diff --git a/kernel/sched/alt_debug.c b/kernel/sched/alt_debug.c
+new file mode 100644
+index 000000000000..1212a031700e
+--- /dev/null
++++ b/kernel/sched/alt_debug.c
+@@ -0,0 +1,31 @@
++/*
++ * kernel/sched/alt_debug.c
++ *
++ * Print the alt scheduler debugging details
++ *
++ * Author: Alfred Chen
++ * Date : 2020
++ */
++#include "sched.h"
++
++/*
++ * This allows printing both to /proc/sched_debug and
++ * to the console
++ */
++#define SEQ_printf(m, x...) \
++ do { \
++ if (m) \
++ seq_printf(m, x); \
++ else \
++ pr_cont(x); \
++ } while (0)
++
++void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
++ struct seq_file *m)
++{
++ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, task_pid_nr_ns(p, ns),
++ get_nr_threads(p));
++}
++
++void proc_sched_set_task(struct task_struct *p)
++{}
+diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
+new file mode 100644
+index 000000000000..5494f27cdb04
+--- /dev/null
++++ b/kernel/sched/alt_sched.h
+@@ -0,0 +1,906 @@
++#ifndef ALT_SCHED_H
++#define ALT_SCHED_H
++
++#include <linux/context_tracking.h>
++#include <linux/profile.h>
++#include <linux/stop_machine.h>
++#include <linux/syscalls.h>
++#include <linux/tick.h>
++
++#include <trace/events/power.h>
++#include <trace/events/sched.h>
++
++#include "../workqueue_internal.h"
++
++#include "cpupri.h"
++
++#ifdef CONFIG_SCHED_BMQ
++/* bits:
++ * RT(0-99), (Low prio adj range, nice width, high prio adj range) / 2, cpu idle task */
++#define SCHED_LEVELS (MAX_RT_PRIO + NICE_WIDTH / 2 + MAX_PRIORITY_ADJ + 1)
++#endif
++
++#ifdef CONFIG_SCHED_PDS
++/* bits: RT(0-24), reserved(25-31), SCHED_NORMAL_PRIO_NUM(32), cpu idle task(1) */
++#define SCHED_LEVELS (64 + 1)
++#endif /* CONFIG_SCHED_PDS */
++
++#define IDLE_TASK_SCHED_PRIO (SCHED_LEVELS - 1)
++
++#ifdef CONFIG_SCHED_DEBUG
++# define SCHED_WARN_ON(x) WARN_ONCE(x, #x)
++extern void resched_latency_warn(int cpu, u64 latency);
++#else
++# define SCHED_WARN_ON(x) ({ (void)(x), 0; })
++static inline void resched_latency_warn(int cpu, u64 latency) {}
++#endif
++
++/*
++ * Increase resolution of nice-level calculations for 64-bit architectures.
++ * The extra resolution improves shares distribution and load balancing of
++ * low-weight task groups (e.g. nice +19 on an autogroup), deeper taskgroup
++ * hierarchies, especially on larger systems. This is not a user-visible change
++ * and does not change the user-interface for setting shares/weights.
++ *
++ * We increase resolution only if we have enough bits to allow this increased
++ * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
++ * are pretty high and the returns do not justify the increased costs.
++ *
++ * Really only required when CONFIG_FAIR_GROUP_SCHED=y is also set, but to
++ * increase coverage and consistency always enable it on 64-bit platforms.
++ */
++#ifdef CONFIG_64BIT
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT + SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) ((w) << SCHED_FIXEDPOINT_SHIFT)
++# define scale_load_down(w) \
++({ \
++ unsigned long __w = (w); \
++ if (__w) \
++ __w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT); \
++ __w; \
++})
++#else
++# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT)
++# define scale_load(w) (w)
++# define scale_load_down(w) (w)
++#endif
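A quick worked example of the scaling above, assuming mainline's SCHED_FIXEDPOINT_SHIFT of 10 (an assumption here, since that constant is defined elsewhere): the nice-0 weight 1024 becomes 1048576 internally and maps back to 1024, while tiny non-zero weights are clamped to 2 on the way down. A rough stand-alone sketch:

/* Editorial sketch of the 64-bit load scaling, with an assumed shift of 10. */
#include <stdio.h>

#define SKETCH_FIXEDPOINT_SHIFT 10

static unsigned long sketch_scale_load(unsigned long w)
{
	return w << SKETCH_FIXEDPOINT_SHIFT;
}

static unsigned long sketch_scale_load_down(unsigned long w)
{
	if (w) {
		w >>= SKETCH_FIXEDPOINT_SHIFT;
		if (w < 2UL)
			w = 2UL;	/* clamp so later divisions stay sane */
	}
	return w;
}

int main(void)
{
	/* NICE_0 weight 1024 scales up to 1048576 and back down to 1024. */
	printf("%lu %lu\n", sketch_scale_load(1024),
	       sketch_scale_load_down(sketch_scale_load(1024)));
	/* A tiny weight such as 15 is clamped to 2. */
	printf("%lu\n", sketch_scale_load_down(15));
	return 0;
}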
++
++#ifdef CONFIG_FAIR_GROUP_SCHED
++#define ROOT_TASK_GROUP_LOAD NICE_0_LOAD
++
++/*
++ * A weight of 0 or 1 can cause arithmetic problems.
++ * The weight of a cfs_rq is the sum of the weights of the entities
++ * queued on it, so the weight of an entity should not be too large,
++ * and neither should the shares value of a task group.
++ * (The default weight is 1024 - so there's no practical
++ * limitation from this.)
++ */
++#define MIN_SHARES (1UL << 1)
++#define MAX_SHARES (1UL << 18)
++#endif
++
++/*
++ * Tunables that become constants when CONFIG_SCHED_DEBUG is off:
++ */
++#ifdef CONFIG_SCHED_DEBUG
++# define const_debug __read_mostly
++#else
++# define const_debug const
++#endif
++
++/* task_struct::on_rq states: */
++#define TASK_ON_RQ_QUEUED 1
++#define TASK_ON_RQ_MIGRATING 2
++
++static inline int task_on_rq_queued(struct task_struct *p)
++{
++ return p->on_rq == TASK_ON_RQ_QUEUED;
++}
++
++static inline int task_on_rq_migrating(struct task_struct *p)
++{
++ return READ_ONCE(p->on_rq) == TASK_ON_RQ_MIGRATING;
++}
++
++/*
++ * wake flags
++ */
++#define WF_SYNC 0x01 /* waker goes to sleep after wakeup */
++#define WF_FORK 0x02 /* child wakeup after fork */
++#define WF_MIGRATED 0x04 /* internal use, task got migrated */
++
++#define SCHED_QUEUE_BITS (SCHED_LEVELS - 1)
++
++struct sched_queue {
++ DECLARE_BITMAP(bitmap, SCHED_QUEUE_BITS);
++ struct list_head heads[SCHED_LEVELS];
++};
++
++struct rq;
++struct cpuidle_state;
++
++struct balance_callback {
++ struct balance_callback *next;
++ void (*func)(struct rq *rq);
++};
++
++/*
++ * This is the main, per-CPU runqueue data structure.
++ * This data should only be modified by the local cpu.
++ */
++struct rq {
++ /* runqueue lock: */
++ raw_spinlock_t lock;
++
++ struct task_struct __rcu *curr;
++ struct task_struct *idle, *stop, *skip;
++ struct mm_struct *prev_mm;
++
++ struct sched_queue queue;
++#ifdef CONFIG_SCHED_PDS
++ u64 time_edge;
++#endif
++ unsigned long prio;
++
++ /* switch count */
++ u64 nr_switches;
++
++ atomic_t nr_iowait;
++
++#ifdef CONFIG_SCHED_DEBUG
++ u64 last_seen_need_resched_ns;
++ int ticks_without_resched;
++#endif
++
++#ifdef CONFIG_MEMBARRIER
++ int membarrier_state;
++#endif
++
++#ifdef CONFIG_SMP
++ int cpu; /* cpu of this runqueue */
++ bool online;
++
++ unsigned int ttwu_pending;
++ unsigned char nohz_idle_balance;
++ unsigned char idle_balance;
++
++#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
++ struct sched_avg avg_irq;
++#endif
++
++#ifdef CONFIG_SCHED_SMT
++ int active_balance;
++ struct cpu_stop_work active_balance_work;
++#endif
++ struct balance_callback *balance_callback;
++#ifdef CONFIG_HOTPLUG_CPU
++ struct rcuwait hotplug_wait;
++#endif
++ unsigned int nr_pinned;
++
++#endif /* CONFIG_SMP */
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++ u64 prev_irq_time;
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
++#ifdef CONFIG_PARAVIRT
++ u64 prev_steal_time;
++#endif /* CONFIG_PARAVIRT */
++#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
++ u64 prev_steal_time_rq;
++#endif /* CONFIG_PARAVIRT_TIME_ACCOUNTING */
++
++	/* For general cpu load util */
++ s32 load_history;
++ u64 load_block;
++ u64 load_stamp;
++
++ /* calc_load related fields */
++ unsigned long calc_load_update;
++ long calc_load_active;
++
++ u64 clock, last_tick;
++ u64 last_ts_switch;
++ u64 clock_task;
++
++ unsigned int nr_running;
++ unsigned long nr_uninterruptible;
++
++#ifdef CONFIG_SCHED_HRTICK
++#ifdef CONFIG_SMP
++ call_single_data_t hrtick_csd;
++#endif
++ struct hrtimer hrtick_timer;
++ ktime_t hrtick_time;
++#endif
++
++#ifdef CONFIG_SCHEDSTATS
++
++ /* latency stats */
++ struct sched_info rq_sched_info;
++ unsigned long long rq_cpu_time;
++ /* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */
++
++ /* sys_sched_yield() stats */
++ unsigned int yld_count;
++
++ /* schedule() stats */
++ unsigned int sched_switch;
++ unsigned int sched_count;
++ unsigned int sched_goidle;
++
++ /* try_to_wake_up() stats */
++ unsigned int ttwu_count;
++ unsigned int ttwu_local;
++#endif /* CONFIG_SCHEDSTATS */
++
++#ifdef CONFIG_CPU_IDLE
++	/* Must be inspected within an RCU lock section */
++ struct cpuidle_state *idle_state;
++#endif
++
++#ifdef CONFIG_NO_HZ_COMMON
++#ifdef CONFIG_SMP
++ call_single_data_t nohz_csd;
++#endif
++ atomic_t nohz_flags;
++#endif /* CONFIG_NO_HZ_COMMON */
++
++ /* Scratch cpumask to be temporarily used under rq_lock */
++ cpumask_var_t scratch_mask;
++};
++
++extern unsigned long rq_load_util(struct rq *rq, unsigned long max);
++
++extern unsigned long calc_load_update;
++extern atomic_long_t calc_load_tasks;
++
++extern void calc_global_load_tick(struct rq *this_rq);
++extern long calc_load_fold_active(struct rq *this_rq, long adjust);
++
++DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
++#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
++#define this_rq() this_cpu_ptr(&runqueues)
++#define task_rq(p) cpu_rq(task_cpu(p))
++#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
++#define raw_rq() raw_cpu_ptr(&runqueues)
++
++#ifdef CONFIG_SMP
++#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_SYSCTL)
++void register_sched_domain_sysctl(void);
++void unregister_sched_domain_sysctl(void);
++#else
++static inline void register_sched_domain_sysctl(void)
++{
++}
++static inline void unregister_sched_domain_sysctl(void)
++{
++}
++#endif
++
++extern bool sched_smp_initialized;
++
++enum {
++ ITSELF_LEVEL_SPACE_HOLDER,
++#ifdef CONFIG_SCHED_SMT
++ SMT_LEVEL_SPACE_HOLDER,
++#endif
++ COREGROUP_LEVEL_SPACE_HOLDER,
++ CORE_LEVEL_SPACE_HOLDER,
++ OTHER_LEVEL_SPACE_HOLDER,
++ NR_CPU_AFFINITY_LEVELS
++};
++
++DECLARE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
++
++static inline int
++__best_mask_cpu(const cpumask_t *cpumask, const cpumask_t *mask)
++{
++ int cpu;
++
++ while ((cpu = cpumask_any_and(cpumask, mask)) >= nr_cpu_ids)
++ mask++;
++
++ return cpu;
++}
++
++static inline int best_mask_cpu(int cpu, const cpumask_t *mask)
++{
++ return __best_mask_cpu(mask, per_cpu(sched_cpu_topo_masks, cpu));
++}
++
++extern void flush_smp_call_function_queue(void);
++
++#else /* !CONFIG_SMP */
++static inline void flush_smp_call_function_queue(void) { }
++#endif
++
++#ifndef arch_scale_freq_tick
++static __always_inline
++void arch_scale_freq_tick(void)
++{
++}
++#endif
++
++#ifndef arch_scale_freq_capacity
++static __always_inline
++unsigned long arch_scale_freq_capacity(int cpu)
++{
++ return SCHED_CAPACITY_SCALE;
++}
++#endif
++
++static inline u64 __rq_clock_broken(struct rq *rq)
++{
++ return READ_ONCE(rq->clock);
++}
++
++static inline u64 rq_clock(struct rq *rq)
++{
++ /*
++	 * Relax lockdep_assert_held() checking: as in VRQ, sched_info_xxxx()
++	 * may be called without holding rq->lock.
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock;
++}
++
++static inline u64 rq_clock_task(struct rq *rq)
++{
++ /*
++	 * Relax lockdep_assert_held() checking: as in VRQ, sched_info_xxxx()
++	 * may be called without holding rq->lock.
++ * lockdep_assert_held(&rq->lock);
++ */
++ return rq->clock_task;
++}
++
++/*
++ * {de,en}queue flags:
++ *
++ * DEQUEUE_SLEEP - task is no longer runnable
++ * ENQUEUE_WAKEUP - task just became runnable
++ *
++ */
++
++#define DEQUEUE_SLEEP 0x01
++
++#define ENQUEUE_WAKEUP 0x01
++
++
++/*
++ * Below are scheduler APIs used by other kernel code.
++ * They use a dummy rq_flags.
++ * TODO: BMQ needs to support these APIs for compatibility with mainline
++ * scheduler code.
++ */
++struct rq_flags {
++ unsigned long flags;
++};
++
++struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(rq->lock);
++
++struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
++ __acquires(p->pi_lock)
++ __acquires(rq->lock);
++
++static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
++ __releases(rq->lock)
++ __releases(p->pi_lock)
++{
++ raw_spin_unlock(&rq->lock);
++ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
++}
++
++static inline void
++rq_lock(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock(&rq->lock);
++}
++
++static inline void
++rq_unlock(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock(&rq->lock);
++}
++
++static inline void
++rq_lock_irq(struct rq *rq, struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ raw_spin_lock_irq(&rq->lock);
++}
++
++static inline void
++rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
++ __releases(rq->lock)
++{
++ raw_spin_unlock_irq(&rq->lock);
++}
++
++static inline struct rq *
++this_rq_lock_irq(struct rq_flags *rf)
++ __acquires(rq->lock)
++{
++ struct rq *rq;
++
++ local_irq_disable();
++ rq = this_rq();
++ raw_spin_lock(&rq->lock);
++
++ return rq;
++}
++
++static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
++{
++ return &rq->lock;
++}
++
++static inline raw_spinlock_t *rq_lockp(struct rq *rq)
++{
++ return __rq_lockp(rq);
++}
++
++static inline void lockdep_assert_rq_held(struct rq *rq)
++{
++ lockdep_assert_held(__rq_lockp(rq));
++}
++
++extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
++extern void raw_spin_rq_unlock(struct rq *rq);
++
++static inline void raw_spin_rq_lock(struct rq *rq)
++{
++ raw_spin_rq_lock_nested(rq, 0);
++}
++
++static inline void raw_spin_rq_lock_irq(struct rq *rq)
++{
++ local_irq_disable();
++ raw_spin_rq_lock(rq);
++}
++
++static inline void raw_spin_rq_unlock_irq(struct rq *rq)
++{
++ raw_spin_rq_unlock(rq);
++ local_irq_enable();
++}
++
++static inline int task_current(struct rq *rq, struct task_struct *p)
++{
++ return rq->curr == p;
++}
++
++static inline bool task_on_cpu(struct task_struct *p)
++{
++ return p->on_cpu;
++}
++
++extern int task_running_nice(struct task_struct *p);
++
++extern struct static_key_false sched_schedstats;
++
++#ifdef CONFIG_CPU_IDLE
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++ rq->idle_state = idle_state;
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ WARN_ON(!rcu_read_lock_held());
++ return rq->idle_state;
++}
++#else
++static inline void idle_set_state(struct rq *rq,
++ struct cpuidle_state *idle_state)
++{
++}
++
++static inline struct cpuidle_state *idle_get_state(struct rq *rq)
++{
++ return NULL;
++}
++#endif
++
++static inline int cpu_of(const struct rq *rq)
++{
++#ifdef CONFIG_SMP
++ return rq->cpu;
++#else
++ return 0;
++#endif
++}
++
++#include "stats.h"
++
++#ifdef CONFIG_NO_HZ_COMMON
++#define NOHZ_BALANCE_KICK_BIT 0
++#define NOHZ_STATS_KICK_BIT 1
++
++#define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT)
++#define NOHZ_STATS_KICK BIT(NOHZ_STATS_KICK_BIT)
++
++#define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
++
++#define nohz_flags(cpu) (&cpu_rq(cpu)->nohz_flags)
++
++/* TODO: needed?
++extern void nohz_balance_exit_idle(struct rq *rq);
++#else
++static inline void nohz_balance_exit_idle(struct rq *rq) { }
++*/
++#endif
++
++#ifdef CONFIG_IRQ_TIME_ACCOUNTING
++struct irqtime {
++ u64 total;
++ u64 tick_delta;
++ u64 irq_start_time;
++ struct u64_stats_sync sync;
++};
++
++DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
++
++/*
++ * Returns the irqtime minus the softirq time computed by ksoftirqd.
++ * Otherwise ksoftirqd's own runtime would be subtracted from its
++ * sum_exec_runtime and it would never move forward.
++ */
++static inline u64 irq_time_read(int cpu)
++{
++ struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
++ unsigned int seq;
++ u64 total;
++
++ do {
++ seq = __u64_stats_fetch_begin(&irqtime->sync);
++ total = irqtime->total;
++ } while (__u64_stats_fetch_retry(&irqtime->sync, seq));
++
++ return total;
++}
++#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
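irq_time_read() above is the classic seqcount read side: retry whenever the writer was mid-update (odd sequence) or the sequence changed underneath the read. A rough user-space approximation of the same retry loop (simplified, single writer assumed; this is not the kernel's u64_stats API):

/* Editorial sketch: sequence-counter protected 64-bit read. */
#include <stdatomic.h>
#include <stdint.h>

struct sketch_irqtime {
	atomic_uint	seq;
	uint64_t	total;
};

static void sketch_write(struct sketch_irqtime *it, uint64_t delta)
{
	atomic_fetch_add(&it->seq, 1);	/* odd: update in progress */
	it->total += delta;
	atomic_fetch_add(&it->seq, 1);	/* even: update complete */
}

static uint64_t sketch_read(struct sketch_irqtime *it)
{
	unsigned int start;
	uint64_t total;

	do {
		start = atomic_load(&it->seq);
		total = it->total;
	} while ((start & 1) || start != atomic_load(&it->seq));

	return total;
}

int main(void)
{
	struct sketch_irqtime it = { 0 };

	sketch_write(&it, 123);
	return sketch_read(&it) == 123 ? 0 : 1;
}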
++
++#ifdef CONFIG_CPU_FREQ
++DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
++#endif /* CONFIG_CPU_FREQ */
++
++#ifdef CONFIG_NO_HZ_FULL
++extern int __init sched_tick_offload_init(void);
++#else
++static inline int sched_tick_offload_init(void) { return 0; }
++#endif
++
++#ifdef arch_scale_freq_capacity
++#ifndef arch_scale_freq_invariant
++#define arch_scale_freq_invariant() (true)
++#endif
++#else /* arch_scale_freq_capacity */
++#define arch_scale_freq_invariant() (false)
++#endif
++
++extern void schedule_idle(void);
++
++#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
++
++/*
++ * !! For sched_setattr_nocheck() (kernel) only !!
++ *
++ * This is actually gross. :(
++ *
++ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
++ * tasks, but still be able to sleep. We need this on platforms that cannot
++ * atomically change clock frequency. Remove once fast switching will be
++ * available on such platforms.
++ *
++ * SUGOV stands for SchedUtil GOVernor.
++ */
++#define SCHED_FLAG_SUGOV 0x10000000
++
++#ifdef CONFIG_MEMBARRIER
++/*
++ * The scheduler provides memory barriers required by membarrier between:
++ * - prior user-space memory accesses and store to rq->membarrier_state,
++ * - store to rq->membarrier_state and following user-space memory accesses.
++ * In the same way it provides those guarantees around store to rq->curr.
++ */
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++ int membarrier_state;
++
++ if (prev_mm == next_mm)
++ return;
++
++ membarrier_state = atomic_read(&next_mm->membarrier_state);
++ if (READ_ONCE(rq->membarrier_state) == membarrier_state)
++ return;
++
++ WRITE_ONCE(rq->membarrier_state, membarrier_state);
++}
++#else
++static inline void membarrier_switch_mm(struct rq *rq,
++ struct mm_struct *prev_mm,
++ struct mm_struct *next_mm)
++{
++}
++#endif
++
++#ifdef CONFIG_NUMA
++extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
++#else
++static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return nr_cpu_ids;
++}
++#endif
++
++extern void swake_up_all_locked(struct swait_queue_head *q);
++extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
++
++#ifdef CONFIG_PREEMPT_DYNAMIC
++extern int preempt_dynamic_mode;
++extern int sched_dynamic_mode(const char *str);
++extern void sched_dynamic_update(int mode);
++#endif
++
++static inline void nohz_run_idle_balance(int cpu) { }
++
++static inline
++unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
++ struct task_struct *p)
++{
++ return util;
++}
++
++static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
++
++#ifdef CONFIG_SCHED_MM_CID
++
++#define SCHED_MM_CID_PERIOD_NS (100ULL * 1000000) /* 100ms */
++#define MM_CID_SCAN_DELAY 100 /* 100ms */
++
++extern raw_spinlock_t cid_lock;
++extern int use_cid_lock;
++
++extern void sched_mm_cid_migrate_from(struct task_struct *t);
++extern void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu);
++extern void task_tick_mm_cid(struct rq *rq, struct task_struct *curr);
++extern void init_sched_mm_cid(struct task_struct *t);
++
++static inline void __mm_cid_put(struct mm_struct *mm, int cid)
++{
++ if (cid < 0)
++ return;
++ cpumask_clear_cpu(cid, mm_cidmask(mm));
++}
++
++/*
++ * The per-mm/cpu cid can have the MM_CID_LAZY_PUT flag set or transition to
++ * the MM_CID_UNSET state without holding the rq lock, but the rq lock needs to
++ * be held to transition to other states.
++ *
++ * State transitions synchronized with cmpxchg or try_cmpxchg need to be
++ * consistent across cpus, which prevents use of this_cpu_cmpxchg.
++ */
++static inline void mm_cid_put_lazy(struct task_struct *t)
++{
++ struct mm_struct *mm = t->mm;
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid;
++
++ lockdep_assert_irqs_disabled();
++ cid = __this_cpu_read(pcpu_cid->cid);
++ if (!mm_cid_is_lazy_put(cid) ||
++ !try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
++ return;
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++}
++
++static inline int mm_cid_pcpu_unset(struct mm_struct *mm)
++{
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ int cid, res;
++
++ lockdep_assert_irqs_disabled();
++ cid = __this_cpu_read(pcpu_cid->cid);
++ for (;;) {
++ if (mm_cid_is_unset(cid))
++ return MM_CID_UNSET;
++ /*
++ * Attempt transition from valid or lazy-put to unset.
++ */
++ res = cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, cid, MM_CID_UNSET);
++ if (res == cid)
++ break;
++ cid = res;
++ }
++ return cid;
++}
++
++static inline void mm_cid_put(struct mm_struct *mm)
++{
++ int cid;
++
++ lockdep_assert_irqs_disabled();
++ cid = mm_cid_pcpu_unset(mm);
++ if (cid == MM_CID_UNSET)
++ return;
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++}
++
++static inline int __mm_cid_try_get(struct mm_struct *mm)
++{
++ struct cpumask *cpumask;
++ int cid;
++
++ cpumask = mm_cidmask(mm);
++ /*
++	 * Retry finding the first zero bit if the mask is temporarily
++	 * filled. This only happens during a concurrent remote-clear
++	 * which owns a cid without holding an rq lock.
++ */
++ for (;;) {
++ cid = cpumask_first_zero(cpumask);
++ if (cid < nr_cpu_ids)
++ break;
++ cpu_relax();
++ }
++ if (cpumask_test_and_set_cpu(cid, cpumask))
++ return -1;
++ return cid;
++}
++
++/*
++ * Save a snapshot of the current runqueue time of this cpu alongside the
++ * per-cpu cid value, which allows estimating how recently the cid was used.
++ */
++static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
++{
++ struct mm_cid *pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(rq));
++
++ lockdep_assert_rq_held(rq);
++ WRITE_ONCE(pcpu_cid->time, rq->clock);
++}
++
++static inline int __mm_cid_get(struct rq *rq, struct mm_struct *mm)
++{
++ int cid;
++
++ /*
++ * All allocations (even those using the cid_lock) are lock-free. If
++ * use_cid_lock is set, hold the cid_lock to perform cid allocation to
++ * guarantee forward progress.
++ */
++ if (!READ_ONCE(use_cid_lock)) {
++ cid = __mm_cid_try_get(mm);
++ if (cid >= 0)
++ goto end;
++ raw_spin_lock(&cid_lock);
++ } else {
++ raw_spin_lock(&cid_lock);
++ cid = __mm_cid_try_get(mm);
++ if (cid >= 0)
++ goto unlock;
++ }
++
++ /*
++	 * A cid was concurrently allocated. Retry while forcing subsequent
++ * allocations to use the cid_lock to ensure forward progress.
++ */
++ WRITE_ONCE(use_cid_lock, 1);
++ /*
++ * Set use_cid_lock before allocation. Only care about program order
++ * because this is only required for forward progress.
++ */
++ barrier();
++ /*
++ * Retry until it succeeds. It is guaranteed to eventually succeed once
++	 * all newly arriving allocations observe the use_cid_lock flag set.
++ */
++ do {
++ cid = __mm_cid_try_get(mm);
++ cpu_relax();
++ } while (cid < 0);
++ /*
++ * Allocate before clearing use_cid_lock. Only care about
++ * program order because this is for forward progress.
++ */
++ barrier();
++ WRITE_ONCE(use_cid_lock, 0);
++unlock:
++ raw_spin_unlock(&cid_lock);
++end:
++ mm_cid_snapshot_time(rq, mm);
++ return cid;
++}
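__mm_cid_get() above pairs a lock-free fast path with an escape hatch for forward progress: when the lock-free attempt keeps losing races, the allocator takes cid_lock and forces every subsequent allocation through the lock until it succeeds. A rough user-space sketch of that shape (a single 64-bit bitmap stands in for mm_cidmask(), a pthread mutex for cid_lock; all names are made up):

/* Editorial sketch: lock-free id allocation with a locked fallback for progress. */
#include <pthread.h>
#include <stdatomic.h>

#define SKETCH_NR_CIDS 64

static atomic_ulong sketch_cidmask;	/* one bit per in-use cid */
static atomic_int sketch_use_lock;
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

static int sketch_try_get(void)
{
	unsigned long mask = atomic_load(&sketch_cidmask);
	int cid;

	for (cid = 0; cid < SKETCH_NR_CIDS; cid++) {
		if (mask & (1UL << cid))
			continue;
		/* Try to claim the first free bit we saw. */
		if (atomic_compare_exchange_strong(&sketch_cidmask, &mask,
						   mask | (1UL << cid)))
			return cid;
		return -1;	/* lost a race; the caller decides how to retry */
	}
	return -1;		/* mask temporarily full */
}

static int sketch_get(void)
{
	int cid;

	if (!atomic_load(&sketch_use_lock)) {
		cid = sketch_try_get();
		if (cid >= 0)
			return cid;
		pthread_mutex_lock(&sketch_lock);
	} else {
		pthread_mutex_lock(&sketch_lock);
		cid = sketch_try_get();
		if (cid >= 0)
			goto unlock;
	}
	/* Force later allocators through the lock until this one succeeds. */
	atomic_store(&sketch_use_lock, 1);
	do {
		cid = sketch_try_get();
	} while (cid < 0);
	atomic_store(&sketch_use_lock, 0);
unlock:
	pthread_mutex_unlock(&sketch_lock);
	return cid;
}

int main(void)
{
	return sketch_get() == 0 ? 0 : 1;	/* first allocation gets cid 0 */
}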
++
++static inline int mm_cid_get(struct rq *rq, struct mm_struct *mm)
++{
++ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
++ struct cpumask *cpumask;
++ int cid;
++
++ lockdep_assert_rq_held(rq);
++ cpumask = mm_cidmask(mm);
++ cid = __this_cpu_read(pcpu_cid->cid);
++ if (mm_cid_is_valid(cid)) {
++ mm_cid_snapshot_time(rq, mm);
++ return cid;
++ }
++ if (mm_cid_is_lazy_put(cid)) {
++ if (try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
++ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
++ }
++ cid = __mm_cid_get(rq, mm);
++ __this_cpu_write(pcpu_cid->cid, cid);
++ return cid;
++}
++
++static inline void switch_mm_cid(struct rq *rq,
++ struct task_struct *prev,
++ struct task_struct *next)
++{
++ /*
++ * Provide a memory barrier between rq->curr store and load of
++ * {prev,next}->mm->pcpu_cid[cpu] on rq->curr->mm transition.
++ *
++ * Should be adapted if context_switch() is modified.
++ */
++ if (!next->mm) { // to kernel
++ /*
++ * user -> kernel transition does not guarantee a barrier, but
++ * we can use the fact that it performs an atomic operation in
++ * mmgrab().
++ */
++ if (prev->mm) // from user
++ smp_mb__after_mmgrab();
++ /*
++ * kernel -> kernel transition does not change rq->curr->mm
++ * state. It stays NULL.
++ */
++ } else { // to user
++ /*
++ * kernel -> user transition does not provide a barrier
++ * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
++ * Provide it here.
++ */
++ if (!prev->mm) // from kernel
++ smp_mb();
++ /*
++ * user -> user transition guarantees a memory barrier through
++ * switch_mm() when current->mm changes. If current->mm is
++ * unchanged, no barrier is needed.
++ */
++ }
++ if (prev->mm_cid_active) {
++ mm_cid_snapshot_time(rq, prev->mm);
++ mm_cid_put_lazy(prev);
++ prev->mm_cid = -1;
++ }
++ if (next->mm_cid_active)
++ next->last_mm_cid = next->mm_cid = mm_cid_get(rq, next->mm);
++}
++
++#else
++static inline void switch_mm_cid(struct rq *rq, struct task_struct *prev, struct task_struct *next) { }
++static inline void sched_mm_cid_migrate_from(struct task_struct *t) { }
++static inline void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu) { }
++static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
++static inline void init_sched_mm_cid(struct task_struct *t) { }
++#endif
++
++#endif /* ALT_SCHED_H */
+diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
+new file mode 100644
+index 000000000000..f29b8f3aa786
+--- /dev/null
++++ b/kernel/sched/bmq.h
+@@ -0,0 +1,110 @@
++#define ALT_SCHED_NAME "BMQ"
++
++/*
++ * BMQ only routines
++ */
++#define rq_switch_time(rq) ((rq)->clock - (rq)->last_ts_switch)
++#define boost_threshold(p) (sched_timeslice_ns >>\
++ (15 - MAX_PRIORITY_ADJ - (p)->boost_prio))
++
++static inline void boost_task(struct task_struct *p)
++{
++ int limit;
++
++ switch (p->policy) {
++ case SCHED_NORMAL:
++ limit = -MAX_PRIORITY_ADJ;
++ break;
++ case SCHED_BATCH:
++ case SCHED_IDLE:
++ limit = 0;
++ break;
++ default:
++ return;
++ }
++
++ if (p->boost_prio > limit)
++ p->boost_prio--;
++}
++
++static inline void deboost_task(struct task_struct *p)
++{
++ if (p->boost_prio < MAX_PRIORITY_ADJ)
++ p->boost_prio++;
++}
++
++/*
++ * Common interfaces
++ */
++static inline void sched_timeslice_imp(const int timeslice_ms) {}
++
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ return p->prio + p->boost_prio - MAX_RT_PRIO;
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MAX_RT_PRIO)? p->prio : MAX_RT_PRIO / 2 + (p->prio + p->boost_prio) / 2;
++}
++
++static inline int
++task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
++{
++ return task_sched_prio(p);
++}
++
++static inline int sched_prio2idx(int prio, struct rq *rq)
++{
++ return prio;
++}
++
++static inline int sched_idx2prio(int idx, struct rq *rq)
++{
++ return idx;
++}
++
++static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sched_timeslice_ns;
++
++ if (SCHED_FIFO != p->policy && task_on_rq_queued(p)) {
++ if (SCHED_RR != p->policy)
++ deboost_task(p);
++ requeue_task(p, rq, task_sched_prio_idx(p, rq));
++ }
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq) {}
++
++inline int task_running_nice(struct task_struct *p)
++{
++ return (p->prio + p->boost_prio > DEFAULT_PRIO + MAX_PRIORITY_ADJ);
++}
++
++static void sched_task_fork(struct task_struct *p, struct rq *rq)
++{
++ p->boost_prio = MAX_PRIORITY_ADJ;
++}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ p->boost_prio = MAX_PRIORITY_ADJ;
++}
++
++#ifdef CONFIG_SMP
++static inline void sched_task_ttwu(struct task_struct *p)
++{
++ if(this_rq()->clock_task - p->last_ran > sched_timeslice_ns)
++ boost_task(p);
++}
++#endif
++
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq)
++{
++ if (rq_switch_time(rq) < boost_threshold(p))
++ boost_task(p);
++}
++
++static inline void update_rq_time_edge(struct rq *rq) {}
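As a quick illustration of the BMQ mapping above: an RT task keeps its prio as its queue level, while a normal task's level is MAX_RT_PRIO/2 plus the average of its prio and boost_prio, so a boosted task (more negative boost_prio) sits on a higher-priority queue. A rough stand-alone example, assuming mainline's MAX_RT_PRIO == 100 and nice-0 prio 120; the boost values are just sample inputs:

/* Editorial sketch: BMQ-style prio folding, with assumed mainline constants. */
#include <stdio.h>

#define SKETCH_MAX_RT_PRIO 100

static int sketch_task_sched_prio(int prio, int boost_prio)
{
	return (prio < SKETCH_MAX_RT_PRIO) ?
		prio : SKETCH_MAX_RT_PRIO / 2 + (prio + boost_prio) / 2;
}

int main(void)
{
	printf("%d\n", sketch_task_sched_prio(42, 0));		/* RT task: 42 */
	printf("%d\n", sketch_task_sched_prio(120, -2));	/* boosted nice 0: 109 */
	printf("%d\n", sketch_task_sched_prio(120, 0));		/* plain nice 0: 110 */
	return 0;
}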
+diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
+index d9dc9ab3773f..71a25540d65e 100644
+--- a/kernel/sched/build_policy.c
++++ b/kernel/sched/build_policy.c
+@@ -42,13 +42,19 @@
+
+ #include "idle.c"
+
++#ifndef CONFIG_SCHED_ALT
+ #include "rt.c"
++#endif
+
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ # include "cpudeadline.c"
++#endif
+ # include "pelt.c"
+ #endif
+
+ #include "cputime.c"
+-#include "deadline.c"
+
++#ifndef CONFIG_SCHED_ALT
++#include "deadline.c"
++#endif
+diff --git a/kernel/sched/build_utility.c b/kernel/sched/build_utility.c
+index 99bdd96f454f..23f80a86d2d7 100644
+--- a/kernel/sched/build_utility.c
++++ b/kernel/sched/build_utility.c
+@@ -85,7 +85,9 @@
+
+ #ifdef CONFIG_SMP
+ # include "cpupri.c"
++#ifndef CONFIG_SCHED_ALT
+ # include "stop_task.c"
++#endif
+ # include "topology.c"
+ #endif
+
+diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
+index e3211455b203..87f7a4f732c8 100644
+--- a/kernel/sched/cpufreq_schedutil.c
++++ b/kernel/sched/cpufreq_schedutil.c
+@@ -157,9 +157,14 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
+ {
+ struct rq *rq = cpu_rq(sg_cpu->cpu);
+
++#ifndef CONFIG_SCHED_ALT
+ sg_cpu->bw_dl = cpu_bw_dl(rq);
+ sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu),
+ FREQUENCY_UTIL, NULL);
++#else
++ sg_cpu->bw_dl = 0;
++ sg_cpu->util = rq_load_util(rq, arch_scale_cpu_capacity(sg_cpu->cpu));
++#endif /* CONFIG_SCHED_ALT */
+ }
+
+ /**
+@@ -305,8 +310,10 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
+ */
+ static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
+ {
++#ifndef CONFIG_SCHED_ALT
+ if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_dl)
+ sg_cpu->sg_policy->limits_changed = true;
++#endif
+ }
+
+ static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
+@@ -609,6 +616,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
+ }
+
+ ret = sched_setattr_nocheck(thread, &attr);
++
+ if (ret) {
+ kthread_stop(thread);
+ pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
+@@ -841,7 +849,9 @@ cpufreq_governor_init(schedutil_gov);
+ #ifdef CONFIG_ENERGY_MODEL
+ static void rebuild_sd_workfn(struct work_struct *work)
+ {
++#ifndef CONFIG_SCHED_ALT
+ rebuild_sched_domains_energy();
++#endif /* CONFIG_SCHED_ALT */
+ }
+ static DECLARE_WORK(rebuild_sd_work, rebuild_sd_workfn);
+
+diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
+index af7952f12e6c..6461cbbb734d 100644
+--- a/kernel/sched/cputime.c
++++ b/kernel/sched/cputime.c
+@@ -126,7 +126,7 @@ void account_user_time(struct task_struct *p, u64 cputime)
+ p->utime += cputime;
+ account_group_user_time(p, cputime);
+
+- index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
++ index = task_running_nice(p) ? CPUTIME_NICE : CPUTIME_USER;
+
+ /* Add user time to cpustat. */
+ task_group_account_field(p, index, cputime);
+@@ -150,7 +150,7 @@ void account_guest_time(struct task_struct *p, u64 cputime)
+ p->gtime += cputime;
+
+ /* Add guest time to cpustat. */
+- if (task_nice(p) > 0) {
++ if (task_running_nice(p)) {
+ task_group_account_field(p, CPUTIME_NICE, cputime);
+ cpustat[CPUTIME_GUEST_NICE] += cputime;
+ } else {
+@@ -288,7 +288,7 @@ static inline u64 account_other_time(u64 max)
+ #ifdef CONFIG_64BIT
+ static inline u64 read_sum_exec_runtime(struct task_struct *t)
+ {
+- return t->se.sum_exec_runtime;
++ return tsk_seruntime(t);
+ }
+ #else
+ static u64 read_sum_exec_runtime(struct task_struct *t)
+@@ -298,7 +298,7 @@ static u64 read_sum_exec_runtime(struct task_struct *t)
+ struct rq *rq;
+
+ rq = task_rq_lock(t, &rf);
+- ns = t->se.sum_exec_runtime;
++ ns = tsk_seruntime(t);
+ task_rq_unlock(rq, t, &rf);
+
+ return ns;
+@@ -630,7 +630,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
+ void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
+ {
+ struct task_cputime cputime = {
+- .sum_exec_runtime = p->se.sum_exec_runtime,
++ .sum_exec_runtime = tsk_seruntime(p),
+ };
+
+ if (task_cputime(p, &cputime.utime, &cputime.stime))
+diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
+index 0b2340a79b65..1e5407b8a738 100644
+--- a/kernel/sched/debug.c
++++ b/kernel/sched/debug.c
+@@ -7,6 +7,7 @@
+ * Copyright(C) 2007, Red Hat, Inc., Ingo Molnar
+ */
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * This allows printing both to /proc/sched_debug and
+ * to the console
+@@ -215,6 +216,7 @@ static const struct file_operations sched_scaling_fops = {
+ };
+
+ #endif /* SMP */
++#endif /* !CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+
+@@ -278,6 +280,7 @@ static const struct file_operations sched_dynamic_fops = {
+
+ #endif /* CONFIG_PREEMPT_DYNAMIC */
+
++#ifndef CONFIG_SCHED_ALT
+ __read_mostly bool sched_debug_verbose;
+
+ #ifdef CONFIG_SMP
+@@ -332,6 +335,7 @@ static const struct file_operations sched_debug_fops = {
+ .llseek = seq_lseek,
+ .release = seq_release,
+ };
++#endif /* !CONFIG_SCHED_ALT */
+
+ static struct dentry *debugfs_sched;
+
+@@ -341,12 +345,16 @@ static __init int sched_init_debug(void)
+
+ debugfs_sched = debugfs_create_dir("sched", NULL);
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_file("features", 0644, debugfs_sched, NULL, &sched_feat_fops);
+ debugfs_create_file_unsafe("verbose", 0644, debugfs_sched, &sched_debug_verbose, &sched_verbose_fops);
++ debugfs_create_bool("verbose", 0644, debugfs_sched, &sched_debug_verbose);
++#endif /* !CONFIG_SCHED_ALT */
+ #ifdef CONFIG_PREEMPT_DYNAMIC
+ debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
+ #endif
+
++#ifndef CONFIG_SCHED_ALT
+ debugfs_create_u32("latency_ns", 0644, debugfs_sched, &sysctl_sched_latency);
+ debugfs_create_u32("min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_min_granularity);
+ debugfs_create_u32("idle_min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_idle_min_granularity);
+@@ -376,11 +384,13 @@ static __init int sched_init_debug(void)
+ #endif
+
+ debugfs_create_file("debug", 0444, debugfs_sched, NULL, &sched_debug_fops);
++#endif /* !CONFIG_SCHED_ALT */
+
+ return 0;
+ }
+ late_initcall(sched_init_debug);
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_SMP
+
+ static cpumask_var_t sd_sysctl_cpus;
+@@ -1114,6 +1124,7 @@ void proc_sched_set_task(struct task_struct *p)
+ memset(&p->stats, 0, sizeof(p->stats));
+ #endif
+ }
++#endif /* !CONFIG_SCHED_ALT */
+
+ void resched_latency_warn(int cpu, u64 latency)
+ {
+diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
+index 342f58a329f5..ab493e759084 100644
+--- a/kernel/sched/idle.c
++++ b/kernel/sched/idle.c
+@@ -379,6 +379,7 @@ void cpu_startup_entry(enum cpuhp_state state)
+ do_idle();
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * idle-task scheduling class.
+ */
+@@ -500,3 +501,4 @@ DEFINE_SCHED_CLASS(idle) = {
+ .switched_to = switched_to_idle,
+ .update_curr = update_curr_idle,
+ };
++#endif
+diff --git a/kernel/sched/pds.h b/kernel/sched/pds.h
+new file mode 100644
+index 000000000000..15cc4887efed
+--- /dev/null
++++ b/kernel/sched/pds.h
+@@ -0,0 +1,152 @@
++#define ALT_SCHED_NAME "PDS"
++
++#define MIN_SCHED_NORMAL_PRIO (32)
++static const u64 RT_MASK = ((1ULL << MIN_SCHED_NORMAL_PRIO) - 1);
++
++#define SCHED_NORMAL_PRIO_NUM (32)
++#define SCHED_EDGE_DELTA (SCHED_NORMAL_PRIO_NUM - NICE_WIDTH / 2)
++
++/* PDS assumes SCHED_NORMAL_PRIO_NUM is a power of 2 */
++#define SCHED_NORMAL_PRIO_MOD(x) ((x) & (SCHED_NORMAL_PRIO_NUM - 1))
++
++/* default time slice 4ms -> shift 22, 2 time slice slots -> shift 23 */
++static __read_mostly int sched_timeslice_shift = 23;
++
++/*
++ * Common interfaces
++ */
++static inline void sched_timeslice_imp(const int timeslice_ms)
++{
++ if (2 == timeslice_ms)
++ sched_timeslice_shift = 22;
++}
++
++static inline int
++task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
++{
++ s64 delta = p->deadline - rq->time_edge + SCHED_EDGE_DELTA;
++
++#ifdef ALT_SCHED_DEBUG
++ if (WARN_ONCE(delta > NORMAL_PRIO_NUM - 1,
++ "pds: task_sched_prio_normal() delta %lld\n", delta))
++ return SCHED_NORMAL_PRIO_NUM - 1;
++#endif
++
++ return max(0LL, delta);
++}
++
++static inline int task_sched_prio(const struct task_struct *p)
++{
++ return (p->prio < MIN_NORMAL_PRIO) ? (p->prio >> 2) :
++ MIN_SCHED_NORMAL_PRIO + task_sched_prio_normal(p, task_rq(p));
++}
++
++static inline int
++task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
++{
++ u64 idx;
++
++ if (p->prio < MIN_NORMAL_PRIO)
++ return p->prio >> 2;
++
++ idx = max(p->deadline + SCHED_EDGE_DELTA, rq->time_edge);
++ /*printk(KERN_INFO "sched: task_sched_prio_idx edge:%llu, deadline=%llu idx=%llu\n", rq->time_edge, p->deadline, idx);*/
++ return MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(idx);
++}
++
++static inline int sched_prio2idx(int sched_prio, struct rq *rq)
++{
++ return (IDLE_TASK_SCHED_PRIO == sched_prio || sched_prio < MIN_SCHED_NORMAL_PRIO) ?
++ sched_prio :
++ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_prio + rq->time_edge);
++}
++
++static inline int sched_idx2prio(int sched_idx, struct rq *rq)
++{
++ return (sched_idx < MIN_SCHED_NORMAL_PRIO) ?
++ sched_idx :
++ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_idx - rq->time_edge);
++}
++
++static inline void sched_renew_deadline(struct task_struct *p, const struct rq *rq)
++{
++ if (p->prio >= MIN_NORMAL_PRIO)
++ p->deadline = rq->time_edge + (p->static_prio - (MAX_PRIO - NICE_WIDTH)) / 2;
++}
++
++int task_running_nice(struct task_struct *p)
++{
++ return (p->prio > DEFAULT_PRIO);
++}
++
++static inline void update_rq_time_edge(struct rq *rq)
++{
++ struct list_head head;
++ u64 old = rq->time_edge;
++ u64 now = rq->clock >> sched_timeslice_shift;
++ u64 prio, delta;
++ DECLARE_BITMAP(normal, SCHED_QUEUE_BITS);
++
++ if (now == old)
++ return;
++
++ rq->time_edge = now;
++ delta = min_t(u64, SCHED_NORMAL_PRIO_NUM, now - old);
++ INIT_LIST_HEAD(&head);
++
++ /*printk(KERN_INFO "sched: update_rq_time_edge 0x%016lx %llu\n", rq->queue.bitmap[0], delta);*/
++ prio = MIN_SCHED_NORMAL_PRIO;
++ for_each_set_bit_from(prio, rq->queue.bitmap, MIN_SCHED_NORMAL_PRIO + delta)
++ list_splice_tail_init(rq->queue.heads + MIN_SCHED_NORMAL_PRIO +
++ SCHED_NORMAL_PRIO_MOD(prio + old), &head);
++
++ bitmap_shift_right(normal, rq->queue.bitmap, delta, SCHED_QUEUE_BITS);
++ if (!list_empty(&head)) {
++ struct task_struct *p;
++ u64 idx = MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(now);
++
++ list_for_each_entry(p, &head, sq_node)
++ p->sq_idx = idx;
++
++ list_splice(&head, rq->queue.heads + idx);
++ set_bit(MIN_SCHED_NORMAL_PRIO, normal);
++ }
++ bitmap_replace(rq->queue.bitmap, normal, rq->queue.bitmap,
++ (const unsigned long *)&RT_MASK, SCHED_QUEUE_BITS);
++
++ if (rq->prio < MIN_SCHED_NORMAL_PRIO || IDLE_TASK_SCHED_PRIO == rq->prio)
++ return;
++
++ rq->prio = (rq->prio < MIN_SCHED_NORMAL_PRIO + delta) ?
++ MIN_SCHED_NORMAL_PRIO : rq->prio - delta;
++}
++
++static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
++{
++ p->time_slice = sched_timeslice_ns;
++ sched_renew_deadline(p, rq);
++ if (SCHED_FIFO != p->policy && task_on_rq_queued(p))
++ requeue_task(p, rq, task_sched_prio_idx(p, rq));
++}
++
++static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq)
++{
++ u64 max_dl = rq->time_edge + NICE_WIDTH / 2 - 1;
++ if (unlikely(p->deadline > max_dl))
++ p->deadline = max_dl;
++}
++
++static void sched_task_fork(struct task_struct *p, struct rq *rq)
++{
++ sched_renew_deadline(p, rq);
++}
++
++static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
++{
++ time_slice_expired(p, rq);
++}
++
++#ifdef CONFIG_SMP
++static inline void sched_task_ttwu(struct task_struct *p) {}
++#endif
++static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq) {}
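To make the PDS deadline arithmetic above concrete: assuming mainline's MAX_PRIO == 140 and NICE_WIDTH == 40, sched_renew_deadline() gives a task a deadline of time_edge plus half its nice offset, and task_sched_prio_normal() maps that back to a level between 12 and 31 inside the 32-entry normal range. A rough stand-alone example using the constants defined above:

/* Editorial sketch: PDS nice level -> deadline offset -> queue level. */
#include <stdio.h>

#define SKETCH_MAX_PRIO			140
#define SKETCH_NICE_WIDTH		40
#define SKETCH_NORMAL_PRIO_NUM		32
#define SKETCH_MIN_SCHED_NORMAL_PRIO	32
#define SKETCH_EDGE_DELTA (SKETCH_NORMAL_PRIO_NUM - SKETCH_NICE_WIDTH / 2)

static long long sketch_deadline(long long time_edge, int static_prio)
{
	return time_edge +
		(static_prio - (SKETCH_MAX_PRIO - SKETCH_NICE_WIDTH)) / 2;
}

static int sketch_prio_normal(long long deadline, long long time_edge)
{
	long long delta = deadline - time_edge + SKETCH_EDGE_DELTA;

	return delta < 0 ? 0 : (int)delta;
}

int main(void)
{
	long long edge = 1000;
	int nices[] = { -20, 0, 19 };

	/* nice -20, 0 and +19 map to normal levels 12, 22 and 31, i.e.
	 * absolute queue levels 44, 54 and 63 at this time_edge. */
	for (int i = 0; i < 3; i++) {
		long long dl = sketch_deadline(edge, 120 + nices[i]);

		printf("nice %3d -> level %d\n", nices[i],
		       SKETCH_MIN_SCHED_NORMAL_PRIO +
		       sketch_prio_normal(dl, edge));
	}
	return 0;
}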
+diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
+index 0f310768260c..bd38bf738fe9 100644
+--- a/kernel/sched/pelt.c
++++ b/kernel/sched/pelt.c
+@@ -266,6 +266,7 @@ ___update_load_avg(struct sched_avg *sa, unsigned long load)
+ WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ /*
+ * sched_entity:
+ *
+@@ -383,8 +384,9 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
+
+ return 0;
+ }
++#endif
+
+-#ifdef CONFIG_SCHED_THERMAL_PRESSURE
++#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ /*
+ * thermal:
+ *
+diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
+index 3a0e0dc28721..e8a7d84aa5a5 100644
+--- a/kernel/sched/pelt.h
++++ b/kernel/sched/pelt.h
+@@ -1,13 +1,15 @@
+ #ifdef CONFIG_SMP
+ #include "sched-pelt.h"
+
++#ifndef CONFIG_SCHED_ALT
+ int __update_load_avg_blocked_se(u64 now, struct sched_entity *se);
+ int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se);
+ int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq);
+ int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
+ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
++#endif
+
+-#ifdef CONFIG_SCHED_THERMAL_PRESSURE
++#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
+ int update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity);
+
+ static inline u64 thermal_load_avg(struct rq *rq)
+@@ -44,6 +46,7 @@ static inline u32 get_pelt_divider(struct sched_avg *avg)
+ return PELT_MIN_DIVIDER + avg->period_contrib;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void cfs_se_util_change(struct sched_avg *avg)
+ {
+ unsigned int enqueued;
+@@ -180,9 +183,11 @@ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
+ return rq_clock_pelt(rq_of(cfs_rq));
+ }
+ #endif
++#endif /* CONFIG_SCHED_ALT */
+
+ #else
+
++#ifndef CONFIG_SCHED_ALT
+ static inline int
+ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
+ {
+@@ -200,6 +205,7 @@ update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
+ {
+ return 0;
+ }
++#endif
+
+ static inline int
+ update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity)
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index ec7b3e0a2b20..3b4052dd7bee 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -5,6 +5,10 @@
+ #ifndef _KERNEL_SCHED_SCHED_H
+ #define _KERNEL_SCHED_SCHED_H
+
++#ifdef CONFIG_SCHED_ALT
++#include "alt_sched.h"
++#else
++
+ #include <linux/sched/affinity.h>
+ #include <linux/sched/autogroup.h>
+ #include <linux/sched/cpufreq.h>
+@@ -3487,4 +3491,9 @@ static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
+ static inline void init_sched_mm_cid(struct task_struct *t) { }
+ #endif
+
++static inline int task_running_nice(struct task_struct *p)
++{
++ return (task_nice(p) > 0);
++}
++#endif /* !CONFIG_SCHED_ALT */
+ #endif /* _KERNEL_SCHED_SCHED_H */
+diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
+index 857f837f52cb..5486c63e4790 100644
+--- a/kernel/sched/stats.c
++++ b/kernel/sched/stats.c
+@@ -125,8 +125,10 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ } else {
+ struct rq *rq;
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ struct sched_domain *sd;
+ int dcount = 0;
++#endif
+ #endif
+ cpu = (unsigned long)(v - 2);
+ rq = cpu_rq(cpu);
+@@ -143,6 +145,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ seq_printf(seq, "\n");
+
+ #ifdef CONFIG_SMP
++#ifndef CONFIG_SCHED_ALT
+ /* domain-specific stats */
+ rcu_read_lock();
+ for_each_domain(cpu, sd) {
+@@ -171,6 +174,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
+ sd->ttwu_move_balance);
+ }
+ rcu_read_unlock();
++#endif
+ #endif
+ }
+ return 0;
+diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
+index 38f3698f5e5b..b9d597394316 100644
+--- a/kernel/sched/stats.h
++++ b/kernel/sched/stats.h
+@@ -89,6 +89,7 @@ static inline void rq_sched_info_depart (struct rq *rq, unsigned long long delt
+
+ #endif /* CONFIG_SCHEDSTATS */
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+ struct sched_entity_stats {
+ struct sched_entity se;
+@@ -105,6 +106,7 @@ __schedstats_from_se(struct sched_entity *se)
+ #endif
+ return &task_of(se)->stats;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ #ifdef CONFIG_PSI
+ void psi_task_change(struct task_struct *task, int clear, int set);
+diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
+index 6682535e37c8..144875e2728d 100644
+--- a/kernel/sched/topology.c
++++ b/kernel/sched/topology.c
+@@ -3,6 +3,7 @@
+ * Scheduler topology setup/handling methods
+ */
+
++#ifndef CONFIG_SCHED_ALT
+ #include <linux/bsearch.h>
+
+ DEFINE_MUTEX(sched_domains_mutex);
+@@ -1415,8 +1416,10 @@ static void asym_cpu_capacity_scan(void)
+ */
+
+ static int default_relax_domain_level = -1;
++#endif /* CONFIG_SCHED_ALT */
+ int sched_domain_level_max;
+
++#ifndef CONFIG_SCHED_ALT
+ static int __init setup_relax_domain_level(char *str)
+ {
+ if (kstrtoint(str, 0, &default_relax_domain_level))
+@@ -1649,6 +1652,7 @@ sd_init(struct sched_domain_topology_level *tl,
+
+ return sd;
+ }
++#endif /* CONFIG_SCHED_ALT */
+
+ /*
+ * Topology list, bottom-up.
+@@ -1685,6 +1689,7 @@ void set_sched_topology(struct sched_domain_topology_level *tl)
+ sched_domain_topology_saved = NULL;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ #ifdef CONFIG_NUMA
+
+ static const struct cpumask *sd_numa_mask(int cpu)
+@@ -2740,3 +2745,20 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
+ mutex_unlock(&sched_domains_mutex);
+ }
++#else /* CONFIG_SCHED_ALT */
++void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
++ struct sched_domain_attr *dattr_new)
++{}
++
++#ifdef CONFIG_NUMA
++int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
++{
++ return best_mask_cpu(cpu, cpus);
++}
++
++int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
++{
++ return cpumask_nth(cpu, cpus);
++}
++#endif /* CONFIG_NUMA */
++#endif
+diff --git a/kernel/sysctl.c b/kernel/sysctl.c
+index bfe53e835524..943fa125064b 100644
+--- a/kernel/sysctl.c
++++ b/kernel/sysctl.c
+@@ -92,6 +92,10 @@ EXPORT_SYMBOL_GPL(sysctl_long_vals);
+
+ /* Constants used for minimum and maximum */
+
++#ifdef CONFIG_SCHED_ALT
++extern int sched_yield_type;
++#endif
++
+ #ifdef CONFIG_PERF_EVENTS
+ static const int six_hundred_forty_kb = 640 * 1024;
+ #endif
+@@ -1917,6 +1921,17 @@ static struct ctl_table kern_table[] = {
+ .proc_handler = proc_dointvec,
+ },
+ #endif
++#ifdef CONFIG_SCHED_ALT
++ {
++ .procname = "yield_type",
++ .data = &sched_yield_type,
++ .maxlen = sizeof (int),
++ .mode = 0644,
++ .proc_handler = &proc_dointvec_minmax,
++ .extra1 = SYSCTL_ZERO,
++ .extra2 = SYSCTL_TWO,
++ },
++#endif
+ #if defined(CONFIG_S390) && defined(CONFIG_SMP)
+ {
+ .procname = "spin_retry",
+diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
+index e8c08292defc..3823ff0ddc0f 100644
+--- a/kernel/time/hrtimer.c
++++ b/kernel/time/hrtimer.c
+@@ -2088,8 +2088,10 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum hrtimer_mode mode,
+ int ret = 0;
+ u64 slack;
+
++#ifndef CONFIG_SCHED_ALT
+ slack = current->timer_slack_ns;
+- if (rt_task(current))
++ if (dl_task(current) || rt_task(current))
++#endif
+ slack = 0;
+
+ hrtimer_init_sleeper_on_stack(&t, clockid, mode);
+diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
+index e9c6f9d0e42c..43ee0a94abdd 100644
+--- a/kernel/time/posix-cpu-timers.c
++++ b/kernel/time/posix-cpu-timers.c
+@@ -223,7 +223,7 @@ static void task_sample_cputime(struct task_struct *p, u64 *samples)
+ u64 stime, utime;
+
+ task_cputime(p, &utime, &stime);
+- store_samples(samples, stime, utime, p->se.sum_exec_runtime);
++ store_samples(samples, stime, utime, tsk_seruntime(p));
+ }
+
+ static void proc_sample_cputime_atomic(struct task_cputime_atomic *at,
+@@ -867,6 +867,7 @@ static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples,
+ }
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ static inline void check_dl_overrun(struct task_struct *tsk)
+ {
+ if (tsk->dl.dl_overrun) {
+@@ -874,6 +875,7 @@ static inline void check_dl_overrun(struct task_struct *tsk)
+ send_signal_locked(SIGXCPU, SEND_SIG_PRIV, tsk, PIDTYPE_TGID);
+ }
+ }
++#endif
+
+ static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard)
+ {
+@@ -901,8 +903,10 @@ static void check_thread_timers(struct task_struct *tsk,
+ u64 samples[CPUCLOCK_MAX];
+ unsigned long soft;
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk))
+ check_dl_overrun(tsk);
++#endif
+
+ if (expiry_cache_is_inactive(pct))
+ return;
+@@ -916,7 +920,7 @@ static void check_thread_timers(struct task_struct *tsk,
+ soft = task_rlimit(tsk, RLIMIT_RTTIME);
+ if (soft != RLIM_INFINITY) {
+ /* Task RT timeout is accounted in jiffies. RTTIME is usec */
+- unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ);
++ unsigned long rttime = tsk_rttimeout(tsk) * (USEC_PER_SEC / HZ);
+ unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME);
+
+ /* At the hard limit, send SIGKILL. No further action. */
+@@ -1152,8 +1156,10 @@ static inline bool fastpath_timer_check(struct task_struct *tsk)
+ return true;
+ }
+
++#ifndef CONFIG_SCHED_ALT
+ if (dl_task(tsk) && tsk->dl.dl_overrun)
+ return true;
++#endif
+
+ return false;
+ }
+diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
+index 529590499b1f..d04bb99b4f0e 100644
+--- a/kernel/trace/trace_selftest.c
++++ b/kernel/trace/trace_selftest.c
+@@ -1155,10 +1155,15 @@ static int trace_wakeup_test_thread(void *data)
+ {
+ /* Make this a -deadline thread */
+ static const struct sched_attr attr = {
++#ifdef CONFIG_SCHED_ALT
++ /* No deadline on BMQ/PDS, use RR */
++ .sched_policy = SCHED_RR,
++#else
+ .sched_policy = SCHED_DEADLINE,
+ .sched_runtime = 100000ULL,
+ .sched_deadline = 10000000ULL,
+ .sched_period = 10000000ULL
++#endif
+ };
+ struct wakeup_test_data *x = data;
+
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
new file mode 100644
index 00000000..6dc48eec
--- /dev/null
+++ b/5021_BMQ-and-PDS-gentoo-defaults.patch
@@ -0,0 +1,13 @@
+--- a/init/Kconfig 2023-02-13 08:16:09.534315265 -0500
++++ b/init/Kconfig 2023-02-13 08:17:24.130237204 -0500
+@@ -867,8 +867,9 @@ config UCLAMP_BUCKETS_COUNT
+ If in doubt, use the default value.
+
+ menuconfig SCHED_ALT
++ depends on X86_64
+ bool "Alternative CPU Schedulers"
+- default y
++ default n
+ help
+ This feature enable alternative CPU scheduler"
+
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-23 15:57 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-08-23 15:57 UTC (permalink / raw
To: gentoo-commits
commit: b2e6d876d0a9bd37cb2b8df364a102109268df7b
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Aug 23 15:57:15 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Aug 23 15:57:15 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=b2e6d876
Linux patch 6.4.12
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1011_linux-6.4.12.patch | 8395 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 8399 insertions(+)
diff --git a/0000_README b/0000_README
index 9ce881e3..5da232d8 100644
--- a/0000_README
+++ b/0000_README
@@ -87,6 +87,10 @@ Patch: 1010_linux-6.4.11.patch
From: https://www.kernel.org
Desc: Linux 6.4.11
+Patch: 1011_linux-6.4.12.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.12
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1011_linux-6.4.12.patch b/1011_linux-6.4.12.patch
new file mode 100644
index 00000000..e7ae9487
--- /dev/null
+++ b/1011_linux-6.4.12.patch
@@ -0,0 +1,8395 @@
+diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
+index 2f923c805802f..f79cb11b080f6 100644
+--- a/Documentation/admin-guide/hw-vuln/srso.rst
++++ b/Documentation/admin-guide/hw-vuln/srso.rst
+@@ -124,8 +124,8 @@ sequence.
+ To ensure the safety of this mitigation, the kernel must ensure that the
+ safe return sequence is itself free from attacker interference. In Zen3
+ and Zen4, this is accomplished by creating a BTB alias between the
+-untraining function srso_untrain_ret_alias() and the safe return
+-function srso_safe_ret_alias() which results in evicting a potentially
++untraining function srso_alias_untrain_ret() and the safe return
++function srso_alias_safe_ret() which results in evicting a potentially
+ poisoned BTB entry and using that safe one for all function returns.
+
+ In older Zen1 and Zen2, this is accomplished using a reinterpretation
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index a8fc0eb6fb1d6..7323911931828 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -323,6 +323,7 @@
+ option with care.
+ pgtbl_v1 - Use v1 page table for DMA-API (Default).
+ pgtbl_v2 - Use v2 page table for DMA-API.
++ irtcachedis - Disable Interrupt Remapping Table (IRT) caching.
+
+ amd_iommu_dump= [HW,X86-64]
+ Enable AMD IOMMU driver option to dump the ACPI table
+diff --git a/Documentation/devicetree/bindings/input/goodix,gt7375p.yaml b/Documentation/devicetree/bindings/input/goodix,gt7375p.yaml
+index ce18d7dadae23..1edad1da1196d 100644
+--- a/Documentation/devicetree/bindings/input/goodix,gt7375p.yaml
++++ b/Documentation/devicetree/bindings/input/goodix,gt7375p.yaml
+@@ -43,6 +43,15 @@ properties:
+ itself as long as it allows the main board to make signals compatible
+ with what the touchscreen is expecting for its IO rails.
+
++ goodix,no-reset-during-suspend:
++ description:
++ Set this to true to enforce the driver to not assert the reset GPIO
++ during suspend.
++ Due to potential touchscreen hardware flaw, back-powering could happen in
++ suspend if the power supply is on and with active-low reset GPIO asserted.
++ This property is used to avoid the back-powering issue.
++ type: boolean
++
+ required:
+ - compatible
+ - reg
+diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,sa8775p-tlmm.yaml b/Documentation/devicetree/bindings/pinctrl/qcom,sa8775p-tlmm.yaml
+index e608a4f1bcaec..e119a226a4b18 100644
+--- a/Documentation/devicetree/bindings/pinctrl/qcom,sa8775p-tlmm.yaml
++++ b/Documentation/devicetree/bindings/pinctrl/qcom,sa8775p-tlmm.yaml
+@@ -87,7 +87,7 @@ $defs:
+ emac0_mdc, emac0_mdio, emac0_ptp_aux, emac0_ptp_pps, emac1_mcg0,
+ emac1_mcg1, emac1_mcg2, emac1_mcg3, emac1_mdc, emac1_mdio,
+ emac1_ptp_aux, emac1_ptp_pps, gcc_gp1, gcc_gp2, gcc_gp3,
+- gcc_gp4, gcc_gp5, hs0_mi2s, hs1_mi2s, hs2_mi2s, ibi_i3c,
++ gcc_gp4, gcc_gp5, gpio, hs0_mi2s, hs1_mi2s, hs2_mi2s, ibi_i3c,
+ jitter_bist, mdp0_vsync0, mdp0_vsync1, mdp0_vsync2, mdp0_vsync3,
+ mdp0_vsync4, mdp0_vsync5, mdp0_vsync6, mdp0_vsync7, mdp0_vsync8,
+ mdp1_vsync0, mdp1_vsync1, mdp1_vsync2, mdp1_vsync3, mdp1_vsync4,
+diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst
+index 8b1045c3b59e0..c383a394c6656 100644
+--- a/Documentation/networking/nf_conntrack-sysctl.rst
++++ b/Documentation/networking/nf_conntrack-sysctl.rst
+@@ -178,10 +178,10 @@ nf_conntrack_sctp_timeout_established - INTEGER (seconds)
+ Default is set to (hb_interval * path_max_retrans + rto_max)
+
+ nf_conntrack_sctp_timeout_shutdown_sent - INTEGER (seconds)
+- default 0.3
++ default 3
+
+ nf_conntrack_sctp_timeout_shutdown_recd - INTEGER (seconds)
+- default 0.3
++ default 3
+
+ nf_conntrack_sctp_timeout_shutdown_ack_sent - INTEGER (seconds)
+ default 3
+diff --git a/Makefile b/Makefile
+index d0efd84bb7d0f..0ff13b943f994 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 11
++SUBLEVEL = 12
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/arm/boot/dts/imx23.dtsi b/arch/arm/boot/dts/imx23.dtsi
+index d19508c8f9ed6..a3668a0827fc8 100644
+--- a/arch/arm/boot/dts/imx23.dtsi
++++ b/arch/arm/boot/dts/imx23.dtsi
+@@ -59,7 +59,7 @@
+ reg = <0x80000000 0x2000>;
+ };
+
+- dma_apbh: dma-apbh@80004000 {
++ dma_apbh: dma-controller@80004000 {
+ compatible = "fsl,imx23-dma-apbh";
+ reg = <0x80004000 0x2000>;
+ interrupts = <0 14 20 0
+diff --git a/arch/arm/boot/dts/imx28.dtsi b/arch/arm/boot/dts/imx28.dtsi
+index a8d3c3113e0f6..29e37b1fae66f 100644
+--- a/arch/arm/boot/dts/imx28.dtsi
++++ b/arch/arm/boot/dts/imx28.dtsi
+@@ -78,7 +78,7 @@
+ status = "disabled";
+ };
+
+- dma_apbh: dma-apbh@80004000 {
++ dma_apbh: dma-controller@80004000 {
+ compatible = "fsl,imx28-dma-apbh";
+ reg = <0x80004000 0x2000>;
+ interrupts = <82 83 84 85
+diff --git a/arch/arm/boot/dts/imx6dl-prtrvt.dts b/arch/arm/boot/dts/imx6dl-prtrvt.dts
+index 56bb1ca56a2df..36b031236e475 100644
+--- a/arch/arm/boot/dts/imx6dl-prtrvt.dts
++++ b/arch/arm/boot/dts/imx6dl-prtrvt.dts
+@@ -124,6 +124,10 @@
+ status = "disabled";
+ };
+
++&usbotg {
++ disable-over-current;
++};
++
+ &vpu {
+ status = "disabled";
+ };
+diff --git a/arch/arm/boot/dts/imx6qdl-phytec-mira.dtsi b/arch/arm/boot/dts/imx6qdl-phytec-mira.dtsi
+index 1a599c294ab86..1ca4d219609f6 100644
+--- a/arch/arm/boot/dts/imx6qdl-phytec-mira.dtsi
++++ b/arch/arm/boot/dts/imx6qdl-phytec-mira.dtsi
+@@ -182,7 +182,7 @@
+ pinctrl-0 = <&pinctrl_rtc_int>;
+ reg = <0x68>;
+ interrupt-parent = <&gpio7>;
+- interrupts = <8 IRQ_TYPE_LEVEL_HIGH>;
++ interrupts = <8 IRQ_TYPE_LEVEL_LOW>;
+ status = "disabled";
+ };
+ };
+diff --git a/arch/arm/boot/dts/imx6qdl-prti6q.dtsi b/arch/arm/boot/dts/imx6qdl-prti6q.dtsi
+index f0db0d4471f40..36f84f4da6b0d 100644
+--- a/arch/arm/boot/dts/imx6qdl-prti6q.dtsi
++++ b/arch/arm/boot/dts/imx6qdl-prti6q.dtsi
+@@ -69,6 +69,7 @@
+ vbus-supply = <®_usb_h1_vbus>;
+ phy_type = "utmi";
+ dr_mode = "host";
++ disable-over-current;
+ status = "okay";
+ };
+
+@@ -78,10 +79,18 @@
+ pinctrl-0 = <&pinctrl_usbotg>;
+ phy_type = "utmi";
+ dr_mode = "host";
+- disable-over-current;
++ over-current-active-low;
+ status = "okay";
+ };
+
++&usbphynop1 {
++ status = "disabled";
++};
++
++&usbphynop2 {
++ status = "disabled";
++};
++
+ &usdhc1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_usdhc1>;
+diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
+index b72ec745f6d12..bda182edc5891 100644
+--- a/arch/arm/boot/dts/imx6qdl.dtsi
++++ b/arch/arm/boot/dts/imx6qdl.dtsi
+@@ -150,7 +150,7 @@
+ interrupt-parent = <&gpc>;
+ ranges;
+
+- dma_apbh: dma-apbh@110000 {
++ dma_apbh: dma-controller@110000 {
+ compatible = "fsl,imx6q-dma-apbh", "fsl,imx28-dma-apbh";
+ reg = <0x00110000 0x2000>;
+ interrupts = <0 13 IRQ_TYPE_LEVEL_HIGH>,
+diff --git a/arch/arm/boot/dts/imx6sx.dtsi b/arch/arm/boot/dts/imx6sx.dtsi
+index 93ac2380ca1ec..fc0654e3fe950 100644
+--- a/arch/arm/boot/dts/imx6sx.dtsi
++++ b/arch/arm/boot/dts/imx6sx.dtsi
+@@ -209,7 +209,7 @@
+ power-domains = <&pd_pu>;
+ };
+
+- dma_apbh: dma-apbh@1804000 {
++ dma_apbh: dma-controller@1804000 {
+ compatible = "fsl,imx6sx-dma-apbh", "fsl,imx28-dma-apbh";
+ reg = <0x01804000 0x2000>;
+ interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>,
+@@ -980,6 +980,8 @@
+ <&clks IMX6SX_CLK_USDHC1>;
+ clock-names = "ipg", "ahb", "per";
+ bus-width = <4>;
++ fsl,tuning-start-tap = <20>;
++ fsl,tuning-step= <2>;
+ status = "disabled";
+ };
+
+@@ -992,6 +994,8 @@
+ <&clks IMX6SX_CLK_USDHC2>;
+ clock-names = "ipg", "ahb", "per";
+ bus-width = <4>;
++ fsl,tuning-start-tap = <20>;
++ fsl,tuning-step= <2>;
+ status = "disabled";
+ };
+
+@@ -1004,6 +1008,8 @@
+ <&clks IMX6SX_CLK_USDHC3>;
+ clock-names = "ipg", "ahb", "per";
+ bus-width = <4>;
++ fsl,tuning-start-tap = <20>;
++ fsl,tuning-step= <2>;
+ status = "disabled";
+ };
+
+diff --git a/arch/arm/boot/dts/imx6ul.dtsi b/arch/arm/boot/dts/imx6ul.dtsi
+index 3d9d0f8235685..118764c50d921 100644
+--- a/arch/arm/boot/dts/imx6ul.dtsi
++++ b/arch/arm/boot/dts/imx6ul.dtsi
+@@ -164,7 +164,7 @@
+ <0x00a06000 0x2000>;
+ };
+
+- dma_apbh: dma-apbh@1804000 {
++ dma_apbh: dma-controller@1804000 {
+ compatible = "fsl,imx6q-dma-apbh", "fsl,imx28-dma-apbh";
+ reg = <0x01804000 0x2000>;
+ interrupts = <0 13 IRQ_TYPE_LEVEL_HIGH>,
+diff --git a/arch/arm/boot/dts/imx7s.dtsi b/arch/arm/boot/dts/imx7s.dtsi
+index efe2525b62fa1..6ffb428dc939c 100644
+--- a/arch/arm/boot/dts/imx7s.dtsi
++++ b/arch/arm/boot/dts/imx7s.dtsi
+@@ -1184,6 +1184,8 @@
+ <&clks IMX7D_USDHC1_ROOT_CLK>;
+ clock-names = "ipg", "ahb", "per";
+ bus-width = <4>;
++ fsl,tuning-step = <2>;
++ fsl,tuning-start-tap = <20>;
+ status = "disabled";
+ };
+
+@@ -1196,6 +1198,8 @@
+ <&clks IMX7D_USDHC2_ROOT_CLK>;
+ clock-names = "ipg", "ahb", "per";
+ bus-width = <4>;
++ fsl,tuning-step = <2>;
++ fsl,tuning-start-tap = <20>;
+ status = "disabled";
+ };
+
+@@ -1208,6 +1212,8 @@
+ <&clks IMX7D_USDHC3_ROOT_CLK>;
+ clock-names = "ipg", "ahb", "per";
+ bus-width = <4>;
++ fsl,tuning-step = <2>;
++ fsl,tuning-start-tap = <20>;
+ status = "disabled";
+ };
+
+@@ -1257,7 +1263,7 @@
+ };
+ };
+
+- dma_apbh: dma-apbh@33000000 {
++ dma_apbh: dma-controller@33000000 {
+ compatible = "fsl,imx7d-dma-apbh", "fsl,imx28-dma-apbh";
+ reg = <0x33000000 0x2000>;
+ interrupts = <GIC_SPI 12 IRQ_TYPE_LEVEL_HIGH>,
+diff --git a/arch/arm64/boot/dts/freescale/imx8mm.dtsi b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
+index d6b36f04f3dc1..1a647d4072ba0 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
+@@ -1221,10 +1221,9 @@
+ compatible = "fsl,imx8mm-mipi-csi2";
+ reg = <0x32e30000 0x1000>;
+ interrupts = <GIC_SPI 17 IRQ_TYPE_LEVEL_HIGH>;
+- assigned-clocks = <&clk IMX8MM_CLK_CSI1_CORE>,
+- <&clk IMX8MM_CLK_CSI1_PHY_REF>;
+- assigned-clock-parents = <&clk IMX8MM_SYS_PLL2_1000M>,
+- <&clk IMX8MM_SYS_PLL2_1000M>;
++ assigned-clocks = <&clk IMX8MM_CLK_CSI1_CORE>;
++ assigned-clock-parents = <&clk IMX8MM_SYS_PLL2_1000M>;
++
+ clock-frequency = <333000000>;
+ clocks = <&clk IMX8MM_CLK_DISP_APB_ROOT>,
+ <&clk IMX8MM_CLK_CSI1_ROOT>,
+diff --git a/arch/arm64/boot/dts/freescale/imx93.dtsi b/arch/arm64/boot/dts/freescale/imx93.dtsi
+index e8d49660ac85b..c0f49fedaf9ea 100644
+--- a/arch/arm64/boot/dts/freescale/imx93.dtsi
++++ b/arch/arm64/boot/dts/freescale/imx93.dtsi
+@@ -306,7 +306,7 @@
+
+ anatop: anatop@44480000 {
+ compatible = "fsl,imx93-anatop", "syscon";
+- reg = <0x44480000 0x10000>;
++ reg = <0x44480000 0x2000>;
+ };
+
+ adc1: adc@44530000 {
+diff --git a/arch/arm64/boot/dts/qcom/ipq5332.dtsi b/arch/arm64/boot/dts/qcom/ipq5332.dtsi
+index af4d97143bcf5..c2d6cc65a323a 100644
+--- a/arch/arm64/boot/dts/qcom/ipq5332.dtsi
++++ b/arch/arm64/boot/dts/qcom/ipq5332.dtsi
+@@ -135,6 +135,13 @@
+ #size-cells = <1>;
+ ranges = <0 0 0 0xffffffff>;
+
++ qfprom: efuse@a4000 {
++ compatible = "qcom,ipq5332-qfprom", "qcom,qfprom";
++ reg = <0x000a4000 0x721>;
++ #address-cells = <1>;
++ #size-cells = <1>;
++ };
++
+ rng: rng@e3000 {
+ compatible = "qcom,prng-ee";
+ reg = <0x000e3000 0x1000>;
+diff --git a/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts b/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
+index dd924331b0eea..ec066a89436a8 100644
+--- a/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
++++ b/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
+@@ -121,7 +121,7 @@
+ };
+ };
+
+- pm8150l-thermal {
++ pm8150l-pcb-thermal {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&pm8150l_adc_tm 1>;
+diff --git a/arch/arm64/boot/dts/rockchip/rk3399-rock-4c-plus.dts b/arch/arm64/boot/dts/rockchip/rk3399-rock-4c-plus.dts
+index 028eb508ae302..8bfd5f88d1ef6 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3399-rock-4c-plus.dts
++++ b/arch/arm64/boot/dts/rockchip/rk3399-rock-4c-plus.dts
+@@ -548,9 +548,8 @@
+ &sdhci {
+ max-frequency = <150000000>;
+ bus-width = <8>;
+- mmc-hs400-1_8v;
++ mmc-hs200-1_8v;
+ non-removable;
+- mmc-hs400-enhanced-strobe;
+ status = "okay";
+ };
+
+diff --git a/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
+index 907071d4fe804..980c4534313a2 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
++++ b/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4.dtsi
+@@ -45,7 +45,7 @@
+ sdio_pwrseq: sdio-pwrseq {
+ compatible = "mmc-pwrseq-simple";
+ clocks = <&rk808 1>;
+- clock-names = "ext_clock";
++ clock-names = "lpo";
+ pinctrl-names = "default";
+ pinctrl-0 = <&wifi_enable_h>;
+ reset-gpios = <&gpio0 RK_PB2 GPIO_ACTIVE_LOW>;
+@@ -645,9 +645,9 @@
+ };
+
+ &sdhci {
++ max-frequency = <150000000>;
+ bus-width = <8>;
+- mmc-hs400-1_8v;
+- mmc-hs400-enhanced-strobe;
++ mmc-hs200-1_8v;
+ non-removable;
+ status = "okay";
+ };
+diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
+index 67f2fb781f59e..8df46f186c64b 100644
+--- a/arch/arm64/include/asm/fpsimd.h
++++ b/arch/arm64/include/asm/fpsimd.h
+@@ -356,7 +356,7 @@ static inline int sme_max_virtualisable_vl(void)
+ return vec_max_virtualisable_vl(ARM64_VEC_SME);
+ }
+
+-extern void sme_alloc(struct task_struct *task);
++extern void sme_alloc(struct task_struct *task, bool flush);
+ extern unsigned int sme_get_vl(void);
+ extern int sme_set_current_vl(unsigned long arg);
+ extern int sme_get_current_vl(void);
+@@ -388,7 +388,7 @@ static inline void sme_smstart_sm(void) { }
+ static inline void sme_smstop_sm(void) { }
+ static inline void sme_smstop(void) { }
+
+-static inline void sme_alloc(struct task_struct *task) { }
++static inline void sme_alloc(struct task_struct *task, bool flush) { }
+ static inline void sme_setup(void) { }
+ static inline unsigned int sme_get_vl(void) { return 0; }
+ static inline int sme_max_vl(void) { return 0; }
+diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
+index 75c37b1c55aaf..087c05aa960ea 100644
+--- a/arch/arm64/kernel/fpsimd.c
++++ b/arch/arm64/kernel/fpsimd.c
+@@ -1285,9 +1285,9 @@ void fpsimd_release_task(struct task_struct *dead_task)
+ * the interest of testability and predictability, the architecture
+ * guarantees that when ZA is enabled it will be zeroed.
+ */
+-void sme_alloc(struct task_struct *task)
++void sme_alloc(struct task_struct *task, bool flush)
+ {
+- if (task->thread.sme_state) {
++ if (task->thread.sme_state && flush) {
+ memset(task->thread.sme_state, 0, sme_state_size(task));
+ return;
+ }
+@@ -1515,7 +1515,7 @@ void do_sme_acc(unsigned long esr, struct pt_regs *regs)
+ }
+
+ sve_alloc(current, false);
+- sme_alloc(current);
++ sme_alloc(current, true);
+ if (!current->thread.sve_state || !current->thread.sme_state) {
+ force_sig(SIGKILL);
+ return;
+diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
+index 5b9b4305248b8..187aa2b175b4f 100644
+--- a/arch/arm64/kernel/ptrace.c
++++ b/arch/arm64/kernel/ptrace.c
+@@ -881,6 +881,13 @@ static int sve_set_common(struct task_struct *target,
+ break;
+ case ARM64_VEC_SME:
+ target->thread.svcr |= SVCR_SM_MASK;
++
++ /*
++ * Disable traps and ensure there is SME storage but
++ * preserve any currently set values in ZA/ZT.
++ */
++ sme_alloc(target, false);
++ set_tsk_thread_flag(target, TIF_SME);
+ break;
+ default:
+ WARN_ON_ONCE(1);
+@@ -1100,7 +1107,7 @@ static int za_set(struct task_struct *target,
+ }
+
+ /* Allocate/reinit ZA storage */
+- sme_alloc(target);
++ sme_alloc(target, true);
+ if (!target->thread.sme_state) {
+ ret = -ENOMEM;
+ goto out;
+@@ -1170,8 +1177,13 @@ static int zt_set(struct task_struct *target,
+ if (!system_supports_sme2())
+ return -EINVAL;
+
++ /* Ensure SVE storage in case this is first use of SME */
++ sve_alloc(target, false);
++ if (!target->thread.sve_state)
++ return -ENOMEM;
++
+ if (!thread_za_enabled(&target->thread)) {
+- sme_alloc(target);
++ sme_alloc(target, true);
+ if (!target->thread.sme_state)
+ return -ENOMEM;
+ }
+@@ -1179,8 +1191,10 @@ static int zt_set(struct task_struct *target,
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+ thread_zt_state(&target->thread),
+ 0, ZT_SIG_REG_BYTES);
+- if (ret == 0)
++ if (ret == 0) {
+ target->thread.svcr |= SVCR_ZA_MASK;
++ set_tsk_thread_flag(target, TIF_SME);
++ }
+
+ fpsimd_flush_task_state(target);
+
+diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
+index 10b407672c427..bcd1ebb21da66 100644
+--- a/arch/arm64/kernel/signal.c
++++ b/arch/arm64/kernel/signal.c
+@@ -474,7 +474,7 @@ static int restore_za_context(struct user_ctxs *user)
+ fpsimd_flush_task_state(current);
+ /* From now, fpsimd_thread_switch() won't touch thread.sve_state */
+
+- sme_alloc(current);
++ sme_alloc(current, true);
+ if (!current->thread.sme_state) {
+ current->thread.svcr &= ~SVCR_ZA_MASK;
+ clear_thread_flag(TIF_SME);
+diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
+index 0e5ebfe8d9d29..ae03b8679696e 100644
+--- a/arch/parisc/kernel/entry.S
++++ b/arch/parisc/kernel/entry.S
+@@ -25,6 +25,7 @@
+ #include <asm/traps.h>
+ #include <asm/thread_info.h>
+ #include <asm/alternative.h>
++#include <asm/spinlock_types.h>
+
+ #include <linux/linkage.h>
+ #include <linux/pgtable.h>
+@@ -406,7 +407,7 @@
+ LDREG 0(\ptp),\pte
+ bb,<,n \pte,_PAGE_PRESENT_BIT,3f
+ b \fault
+- stw \spc,0(\tmp)
++ stw \tmp1,0(\tmp)
+ 99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP)
+ #endif
+ 2: LDREG 0(\ptp),\pte
+@@ -415,24 +416,22 @@
+ .endm
+
+ /* Release page_table_lock without reloading lock address.
+- Note that the values in the register spc are limited to
+- NR_SPACE_IDS (262144). Thus, the stw instruction always
+- stores a nonzero value even when register spc is 64 bits.
+ We use an ordered store to ensure all prior accesses are
+ performed prior to releasing the lock. */
+- .macro ptl_unlock0 spc,tmp
++ .macro ptl_unlock0 spc,tmp,tmp2
+ #ifdef CONFIG_TLB_PTLOCK
+-98: or,COND(=) %r0,\spc,%r0
+- stw,ma \spc,0(\tmp)
++98: ldi __ARCH_SPIN_LOCK_UNLOCKED_VAL, \tmp2
++ or,COND(=) %r0,\spc,%r0
++ stw,ma \tmp2,0(\tmp)
+ 99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP)
+ #endif
+ .endm
+
+ /* Release page_table_lock. */
+- .macro ptl_unlock1 spc,tmp
++ .macro ptl_unlock1 spc,tmp,tmp2
+ #ifdef CONFIG_TLB_PTLOCK
+ 98: get_ptl \tmp
+- ptl_unlock0 \spc,\tmp
++ ptl_unlock0 \spc,\tmp,\tmp2
+ 99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP)
+ #endif
+ .endm
+@@ -1125,7 +1124,7 @@ dtlb_miss_20w:
+
+ idtlbt pte,prot
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1151,7 +1150,7 @@ nadtlb_miss_20w:
+
+ idtlbt pte,prot
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1185,7 +1184,7 @@ dtlb_miss_11:
+
+ mtsp t1, %sr1 /* Restore sr1 */
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1218,7 +1217,7 @@ nadtlb_miss_11:
+
+ mtsp t1, %sr1 /* Restore sr1 */
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1247,7 +1246,7 @@ dtlb_miss_20:
+
+ idtlbt pte,prot
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1275,7 +1274,7 @@ nadtlb_miss_20:
+
+ idtlbt pte,prot
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1320,7 +1319,7 @@ itlb_miss_20w:
+
+ iitlbt pte,prot
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1344,7 +1343,7 @@ naitlb_miss_20w:
+
+ iitlbt pte,prot
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1378,7 +1377,7 @@ itlb_miss_11:
+
+ mtsp t1, %sr1 /* Restore sr1 */
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1402,7 +1401,7 @@ naitlb_miss_11:
+
+ mtsp t1, %sr1 /* Restore sr1 */
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1432,7 +1431,7 @@ itlb_miss_20:
+
+ iitlbt pte,prot
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1452,7 +1451,7 @@ naitlb_miss_20:
+
+ iitlbt pte,prot
+
+- ptl_unlock1 spc,t0
++ ptl_unlock1 spc,t0,t1
+ rfir
+ nop
+
+@@ -1482,7 +1481,7 @@ dbit_trap_20w:
+
+ idtlbt pte,prot
+
+- ptl_unlock0 spc,t0
++ ptl_unlock0 spc,t0,t1
+ rfir
+ nop
+ #else
+@@ -1508,7 +1507,7 @@ dbit_trap_11:
+
+ mtsp t1, %sr1 /* Restore sr1 */
+
+- ptl_unlock0 spc,t0
++ ptl_unlock0 spc,t0,t1
+ rfir
+ nop
+
+@@ -1528,7 +1527,7 @@ dbit_trap_20:
+
+ idtlbt pte,prot
+
+- ptl_unlock0 spc,t0
++ ptl_unlock0 spc,t0,t1
+ rfir
+ nop
+ #endif
+diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
+index 4caf5e3079eb4..359577ec16801 100644
+--- a/arch/powerpc/kernel/rtas_flash.c
++++ b/arch/powerpc/kernel/rtas_flash.c
+@@ -709,9 +709,9 @@ static int __init rtas_flash_init(void)
+ if (!rtas_validate_flash_data.buf)
+ return -ENOMEM;
+
+- flash_block_cache = kmem_cache_create("rtas_flash_cache",
+- RTAS_BLK_SIZE, RTAS_BLK_SIZE, 0,
+- NULL);
++ flash_block_cache = kmem_cache_create_usercopy("rtas_flash_cache",
++ RTAS_BLK_SIZE, RTAS_BLK_SIZE,
++ 0, 0, RTAS_BLK_SIZE, NULL);
+ if (!flash_block_cache) {
+ printk(KERN_ERR "%s: failed to create block cache\n",
+ __func__);
+diff --git a/arch/powerpc/mm/kasan/Makefile b/arch/powerpc/mm/kasan/Makefile
+index 699eeffd9f551..f9522fd70b2f3 100644
+--- a/arch/powerpc/mm/kasan/Makefile
++++ b/arch/powerpc/mm/kasan/Makefile
+@@ -1,6 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+
+ KASAN_SANITIZE := n
++KCOV_INSTRUMENT := n
+
+ obj-$(CONFIG_PPC32) += init_32.o
+ obj-$(CONFIG_PPC_8xx) += 8xx.o
+diff --git a/arch/riscv/include/asm/insn.h b/arch/riscv/include/asm/insn.h
+index 8d5c84f2d5ef7..603095c913e37 100644
+--- a/arch/riscv/include/asm/insn.h
++++ b/arch/riscv/include/asm/insn.h
+@@ -110,6 +110,7 @@
+ #define RVC_INSN_FUNCT4_OPOFF 12
+ #define RVC_INSN_FUNCT3_MASK GENMASK(15, 13)
+ #define RVC_INSN_FUNCT3_OPOFF 13
++#define RVC_INSN_J_RS1_MASK GENMASK(11, 7)
+ #define RVC_INSN_J_RS2_MASK GENMASK(6, 2)
+ #define RVC_INSN_OPCODE_MASK GENMASK(1, 0)
+ #define RVC_ENCODE_FUNCT3(f_) (RVC_FUNCT3_##f_ << RVC_INSN_FUNCT3_OPOFF)
+@@ -225,8 +226,6 @@ __RISCV_INSN_FUNCS(c_jal, RVC_MASK_C_JAL, RVC_MATCH_C_JAL)
+ __RISCV_INSN_FUNCS(auipc, RVG_MASK_AUIPC, RVG_MATCH_AUIPC)
+ __RISCV_INSN_FUNCS(jalr, RVG_MASK_JALR, RVG_MATCH_JALR)
+ __RISCV_INSN_FUNCS(jal, RVG_MASK_JAL, RVG_MATCH_JAL)
+-__RISCV_INSN_FUNCS(c_jr, RVC_MASK_C_JR, RVC_MATCH_C_JR)
+-__RISCV_INSN_FUNCS(c_jalr, RVC_MASK_C_JALR, RVC_MATCH_C_JALR)
+ __RISCV_INSN_FUNCS(c_j, RVC_MASK_C_J, RVC_MATCH_C_J)
+ __RISCV_INSN_FUNCS(beq, RVG_MASK_BEQ, RVG_MATCH_BEQ)
+ __RISCV_INSN_FUNCS(bne, RVG_MASK_BNE, RVG_MATCH_BNE)
+@@ -253,6 +252,18 @@ static __always_inline bool riscv_insn_is_branch(u32 code)
+ return (code & RV_INSN_OPCODE_MASK) == RVG_OPCODE_BRANCH;
+ }
+
++static __always_inline bool riscv_insn_is_c_jr(u32 code)
++{
++ return (code & RVC_MASK_C_JR) == RVC_MATCH_C_JR &&
++ (code & RVC_INSN_J_RS1_MASK) != 0;
++}
++
++static __always_inline bool riscv_insn_is_c_jalr(u32 code)
++{
++ return (code & RVC_MASK_C_JALR) == RVC_MATCH_C_JALR &&
++ (code & RVC_INSN_J_RS1_MASK) != 0;
++}
++
+ #define RV_IMM_SIGN(x) (-(((x) >> 31) & 1))
+ #define RVC_IMM_SIGN(x) (-(((x) >> 12) & 1))
+ #define RV_X(X, s, mask) (((X) >> (s)) & (mask))
+diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
+index 8c258b78c925c..bd19e885dcec1 100644
+--- a/arch/riscv/kernel/traps.c
++++ b/arch/riscv/kernel/traps.c
+@@ -268,16 +268,16 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
+ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
+ {
+ if (user_mode(regs)) {
+- ulong syscall = regs->a7;
++ long syscall = regs->a7;
+
+ regs->epc += 4;
+ regs->orig_a0 = regs->a0;
+
+ syscall = syscall_enter_from_user_mode(regs, syscall);
+
+- if (syscall < NR_syscalls)
++ if (syscall >= 0 && syscall < NR_syscalls)
+ syscall_handler(regs, syscall);
+- else
++ else if (syscall != -1)
+ regs->a0 = -ENOSYS;
+
+ syscall_exit_to_user_mode(regs);
+diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
+index ec486e5369d9b..09b47ebacf2e8 100644
+--- a/arch/riscv/lib/uaccess.S
++++ b/arch/riscv/lib/uaccess.S
+@@ -17,8 +17,11 @@ ENTRY(__asm_copy_from_user)
+ li t6, SR_SUM
+ csrs CSR_STATUS, t6
+
+- /* Save for return value */
+- mv t5, a2
++ /*
++ * Save the terminal address which will be used to compute the number
++ * of bytes copied in case of a fixup exception.
++ */
++ add t5, a0, a2
+
+ /*
+ * Register allocation for code below:
+@@ -176,7 +179,7 @@ ENTRY(__asm_copy_from_user)
+ 10:
+ /* Disable access to user memory */
+ csrc CSR_STATUS, t6
+- mv a0, t5
++ sub a0, t5, a0
+ ret
+ ENDPROC(__asm_copy_to_user)
+ ENDPROC(__asm_copy_from_user)
+@@ -228,7 +231,7 @@ ENTRY(__clear_user)
+ 11:
+ /* Disable access to user memory */
+ csrc CSR_STATUS, t6
+- mv a0, a1
++ sub a0, a3, a0
+ ret
+ ENDPROC(__clear_user)
+ EXPORT_SYMBOL(__clear_user)
+diff --git a/arch/um/os-Linux/user_syms.c b/arch/um/os-Linux/user_syms.c
+index 9b62a9d352b3a..a310ae27b479a 100644
+--- a/arch/um/os-Linux/user_syms.c
++++ b/arch/um/os-Linux/user_syms.c
+@@ -37,13 +37,6 @@ EXPORT_SYMBOL(vsyscall_ehdr);
+ EXPORT_SYMBOL(vsyscall_end);
+ #endif
+
+-/* Export symbols used by GCC for the stack protector. */
+-extern void __stack_smash_handler(void *) __attribute__((weak));
+-EXPORT_SYMBOL(__stack_smash_handler);
+-
+-extern long __guard __attribute__((weak));
+-EXPORT_SYMBOL(__guard);
+-
+ #ifdef _FORTIFY_SOURCE
+ extern int __sprintf_chk(char *str, int flag, size_t len, const char *format);
+ EXPORT_SYMBOL(__sprintf_chk);
+diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
+index 117903881fe43..ce8f50192ae3e 100644
+--- a/arch/x86/include/asm/entry-common.h
++++ b/arch/x86/include/asm/entry-common.h
+@@ -92,6 +92,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+ static __always_inline void arch_exit_to_user_mode(void)
+ {
+ mds_user_clear_cpu_buffers();
++ amd_clear_divider();
+ }
+ #define arch_exit_to_user_mode arch_exit_to_user_mode
+
+diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
+index e1e7b319fe78d..8da84e1e56581 100644
+--- a/arch/x86/include/asm/nospec-branch.h
++++ b/arch/x86/include/asm/nospec-branch.h
+@@ -268,9 +268,9 @@
+ .endm
+
+ #ifdef CONFIG_CPU_UNRET_ENTRY
+-#define CALL_ZEN_UNTRAIN_RET "call zen_untrain_ret"
++#define CALL_UNTRAIN_RET "call entry_untrain_ret"
+ #else
+-#define CALL_ZEN_UNTRAIN_RET ""
++#define CALL_UNTRAIN_RET ""
+ #endif
+
+ /*
+@@ -278,7 +278,7 @@
+ * return thunk isn't mapped into the userspace tables (then again, AMD
+ * typically has NO_MELTDOWN).
+ *
+- * While zen_untrain_ret() doesn't clobber anything but requires stack,
++ * While retbleed_untrain_ret() doesn't clobber anything but requires stack,
+ * entry_ibpb() will clobber AX, CX, DX.
+ *
+ * As such, this must be placed after every *SWITCH_TO_KERNEL_CR3 at a point
+@@ -289,14 +289,20 @@
+ defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
+ VALIDATE_UNRET_END
+ ALTERNATIVE_3 "", \
+- CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET, \
++ CALL_UNTRAIN_RET, X86_FEATURE_UNRET, \
+ "call entry_ibpb", X86_FEATURE_ENTRY_IBPB, \
+ __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+ #endif
++.endm
+
+-#ifdef CONFIG_CPU_SRSO
+- ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
+- "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
++.macro UNTRAIN_RET_VM
++#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
++ defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
++ VALIDATE_UNRET_END
++ ALTERNATIVE_3 "", \
++ CALL_UNTRAIN_RET, X86_FEATURE_UNRET, \
++ "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT, \
++ __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+ #endif
+ .endm
+
+@@ -305,15 +311,10 @@
+ defined(CONFIG_CALL_DEPTH_TRACKING)
+ VALIDATE_UNRET_END
+ ALTERNATIVE_3 "", \
+- CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET, \
++ CALL_UNTRAIN_RET, X86_FEATURE_UNRET, \
+ "call entry_ibpb", X86_FEATURE_ENTRY_IBPB, \
+ __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
+ #endif
+-
+-#ifdef CONFIG_CPU_SRSO
+- ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
+- "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+-#endif
+ .endm
+
+
+@@ -337,17 +338,24 @@ extern retpoline_thunk_t __x86_indirect_thunk_array[];
+ extern retpoline_thunk_t __x86_indirect_call_thunk_array[];
+ extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
+
++#ifdef CONFIG_RETHUNK
+ extern void __x86_return_thunk(void);
+-extern void zen_untrain_ret(void);
++#else
++static inline void __x86_return_thunk(void) {}
++#endif
++
++extern void retbleed_return_thunk(void);
++extern void srso_return_thunk(void);
++extern void srso_alias_return_thunk(void);
++
++extern void retbleed_untrain_ret(void);
+ extern void srso_untrain_ret(void);
+-extern void srso_untrain_ret_alias(void);
++extern void srso_alias_untrain_ret(void);
++
++extern void entry_untrain_ret(void);
+ extern void entry_ibpb(void);
+
+-#ifdef CONFIG_CALL_THUNKS
+ extern void (*x86_return_thunk)(void);
+-#else
+-#define x86_return_thunk (&__x86_return_thunk)
+-#endif
+
+ #ifdef CONFIG_CALL_DEPTH_TRACKING
+ extern void __x86_return_skl(void);
+@@ -474,9 +482,6 @@ enum ssb_mitigation {
+ SPEC_STORE_BYPASS_SECCOMP,
+ };
+
+-extern char __indirect_thunk_start[];
+-extern char __indirect_thunk_end[];
+-
+ static __always_inline
+ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
+ {
+diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
+index f615e0cb6d932..94b42fbb6ffa6 100644
+--- a/arch/x86/kernel/alternative.c
++++ b/arch/x86/kernel/alternative.c
+@@ -571,10 +571,6 @@ void __init_or_module noinline apply_retpolines(s32 *start, s32 *end)
+
+ #ifdef CONFIG_RETHUNK
+
+-#ifdef CONFIG_CALL_THUNKS
+-void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
+-#endif
+-
+ /*
+ * Rewrite the compiler generated return thunk tail-calls.
+ *
+diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
+index 0b5f33cb32b59..13b0da82cb5fb 100644
+--- a/arch/x86/kernel/cpu/amd.c
++++ b/arch/x86/kernel/cpu/amd.c
+@@ -1329,3 +1329,4 @@ void noinstr amd_clear_divider(void)
+ asm volatile(ALTERNATIVE("", "div %2\n\t", X86_BUG_DIV0)
+ :: "a" (0), "d" (0), "r" (1));
+ }
++EXPORT_SYMBOL_GPL(amd_clear_divider);
+diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
+index f3d627901d890..d5319779da585 100644
+--- a/arch/x86/kernel/cpu/bugs.c
++++ b/arch/x86/kernel/cpu/bugs.c
+@@ -63,6 +63,8 @@ EXPORT_SYMBOL_GPL(x86_pred_cmd);
+
+ static DEFINE_MUTEX(spec_ctrl_mutex);
+
++void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
++
+ /* Update SPEC_CTRL MSR and its cached copy unconditionally */
+ static void update_spec_ctrl(u64 val)
+ {
+@@ -165,8 +167,13 @@ void __init cpu_select_mitigations(void)
+ md_clear_select_mitigation();
+ srbds_select_mitigation();
+ l1d_flush_select_mitigation();
+- gds_select_mitigation();
++
++ /*
++ * srso_select_mitigation() depends and must run after
++ * retbleed_select_mitigation().
++ */
+ srso_select_mitigation();
++ gds_select_mitigation();
+ }
+
+ /*
+@@ -1035,6 +1042,9 @@ do_cmd_auto:
+ setup_force_cpu_cap(X86_FEATURE_RETHUNK);
+ setup_force_cpu_cap(X86_FEATURE_UNRET);
+
++ if (IS_ENABLED(CONFIG_RETHUNK))
++ x86_return_thunk = retbleed_return_thunk;
++
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
+ boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
+ pr_err(RETBLEED_UNTRAIN_MSG);
+@@ -1044,6 +1054,7 @@ do_cmd_auto:
+
+ case RETBLEED_MITIGATION_IBPB:
+ setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
++ setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
+ mitigate_smt = true;
+ break;
+
+@@ -2417,9 +2428,10 @@ static void __init srso_select_mitigation(void)
+ * Zen1/2 with SMT off aren't vulnerable after the right
+ * IBPB microcode has been applied.
+ */
+- if ((boot_cpu_data.x86 < 0x19) &&
+- (!cpu_smt_possible() || (cpu_smt_control == CPU_SMT_DISABLED)))
++ if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
+ setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
++ return;
++ }
+ }
+
+ if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
+@@ -2448,11 +2460,15 @@ static void __init srso_select_mitigation(void)
+ * like ftrace, static_call, etc.
+ */
+ setup_force_cpu_cap(X86_FEATURE_RETHUNK);
++ setup_force_cpu_cap(X86_FEATURE_UNRET);
+
+- if (boot_cpu_data.x86 == 0x19)
++ if (boot_cpu_data.x86 == 0x19) {
+ setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
+- else
++ x86_return_thunk = srso_alias_return_thunk;
++ } else {
+ setup_force_cpu_cap(X86_FEATURE_SRSO);
++ x86_return_thunk = srso_return_thunk;
++ }
+ srso_mitigation = SRSO_MITIGATION_SAFE_RET;
+ } else {
+ pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
+@@ -2701,6 +2717,9 @@ static ssize_t gds_show_state(char *buf)
+
+ static ssize_t srso_show_state(char *buf)
+ {
++ if (boot_cpu_has(X86_FEATURE_SRSO_NO))
++ return sysfs_emit(buf, "Mitigation: SMT disabled\n");
++
+ return sysfs_emit(buf, "%s%s\n",
+ srso_strings[srso_mitigation],
+ (cpu_has_ibpb_brtype_microcode() ? "" : ", no microcode"));
+diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
+index 57b0037d0a996..517821b48391a 100644
+--- a/arch/x86/kernel/kprobes/opt.c
++++ b/arch/x86/kernel/kprobes/opt.c
+@@ -226,7 +226,7 @@ static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
+ }
+
+ /* Check whether insn is indirect jump */
+-static int __insn_is_indirect_jump(struct insn *insn)
++static int insn_is_indirect_jump(struct insn *insn)
+ {
+ return ((insn->opcode.bytes[0] == 0xff &&
+ (X86_MODRM_REG(insn->modrm.value) & 6) == 4) || /* Jump */
+@@ -260,26 +260,6 @@ static int insn_jump_into_range(struct insn *insn, unsigned long start, int len)
+ return (start <= target && target <= start + len);
+ }
+
+-static int insn_is_indirect_jump(struct insn *insn)
+-{
+- int ret = __insn_is_indirect_jump(insn);
+-
+-#ifdef CONFIG_RETPOLINE
+- /*
+- * Jump to x86_indirect_thunk_* is treated as an indirect jump.
+- * Note that even with CONFIG_RETPOLINE=y, the kernel compiled with
+- * older gcc may use indirect jump. So we add this check instead of
+- * replace indirect-jump check.
+- */
+- if (!ret)
+- ret = insn_jump_into_range(insn,
+- (unsigned long)__indirect_thunk_start,
+- (unsigned long)__indirect_thunk_end -
+- (unsigned long)__indirect_thunk_start);
+-#endif
+- return ret;
+-}
+-
+ /* Decode whole function to ensure any instructions don't jump into target */
+ static int can_optimize(unsigned long paddr)
+ {
+@@ -334,9 +314,21 @@ static int can_optimize(unsigned long paddr)
+ /* Recover address */
+ insn.kaddr = (void *)addr;
+ insn.next_byte = (void *)(addr + insn.length);
+- /* Check any instructions don't jump into target */
+- if (insn_is_indirect_jump(&insn) ||
+- insn_jump_into_range(&insn, paddr + INT3_INSN_SIZE,
++ /*
++ * Check any instructions don't jump into target, indirectly or
++ * directly.
++ *
++ * The indirect case is present to handle a code with jump
++ * tables. When the kernel uses retpolines, the check should in
++ * theory additionally look for jumps to indirect thunks.
++ * However, the kernel built with retpolines or IBT has jump
++ * tables disabled so the check can be skipped altogether.
++ */
++ if (!IS_ENABLED(CONFIG_RETPOLINE) &&
++ !IS_ENABLED(CONFIG_X86_KERNEL_IBT) &&
++ insn_is_indirect_jump(&insn))
++ return 0;
++ if (insn_jump_into_range(&insn, paddr + INT3_INSN_SIZE,
+ DISP32_SIZE))
+ return 0;
+ addr += insn.length;
+diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c
+index b70670a985978..77a9316da4357 100644
+--- a/arch/x86/kernel/static_call.c
++++ b/arch/x86/kernel/static_call.c
+@@ -186,6 +186,19 @@ EXPORT_SYMBOL_GPL(arch_static_call_transform);
+ */
+ bool __static_call_fixup(void *tramp, u8 op, void *dest)
+ {
++ unsigned long addr = (unsigned long)tramp;
++ /*
++ * Not all .return_sites are a static_call trampoline (most are not).
++ * Check if the 3 bytes after the return are still kernel text, if not,
++ * then this definitely is not a trampoline and we need not worry
++ * further.
++ *
++ * This avoids the memcmp() below tripping over pagefaults etc..
++ */
++ if (((addr >> PAGE_SHIFT) != ((addr + 7) >> PAGE_SHIFT)) &&
++ !kernel_text_address(addr + 7))
++ return false;
++
+ if (memcmp(tramp+5, tramp_ud, 3)) {
+ /* Not a trampoline site, not our problem. */
+ return false;
+diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
+index 1885326a8f659..4a817d20ce3bb 100644
+--- a/arch/x86/kernel/traps.c
++++ b/arch/x86/kernel/traps.c
+@@ -206,8 +206,6 @@ DEFINE_IDTENTRY(exc_divide_error)
+ {
+ do_error_trap(regs, 0, "divide error", X86_TRAP_DE, SIGFPE,
+ FPE_INTDIV, error_get_trap_addr(regs));
+-
+- amd_clear_divider();
+ }
+
+ DEFINE_IDTENTRY(exc_overflow)
+diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
+index bac2e2949f01d..83d41c2601d7b 100644
+--- a/arch/x86/kernel/vmlinux.lds.S
++++ b/arch/x86/kernel/vmlinux.lds.S
+@@ -133,27 +133,25 @@ SECTIONS
+ KPROBES_TEXT
+ SOFTIRQENTRY_TEXT
+ #ifdef CONFIG_RETPOLINE
+- __indirect_thunk_start = .;
+- *(.text.__x86.indirect_thunk)
+- *(.text.__x86.return_thunk)
+- __indirect_thunk_end = .;
++ *(.text..__x86.indirect_thunk)
++ *(.text..__x86.return_thunk)
+ #endif
+ STATIC_CALL_TEXT
+
+ ALIGN_ENTRY_TEXT_BEGIN
+ #ifdef CONFIG_CPU_SRSO
+- *(.text.__x86.rethunk_untrain)
++ *(.text..__x86.rethunk_untrain)
+ #endif
+
+ ENTRY_TEXT
+
+ #ifdef CONFIG_CPU_SRSO
+ /*
+- * See the comment above srso_untrain_ret_alias()'s
++ * See the comment above srso_alias_untrain_ret()'s
+ * definition.
+ */
+- . = srso_untrain_ret_alias | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
+- *(.text.__x86.rethunk_safe)
++ . = srso_alias_untrain_ret | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
++ *(.text..__x86.rethunk_safe)
+ #endif
+ ALIGN_ENTRY_TEXT_END
+ *(.gnu.warning)
+@@ -522,8 +520,8 @@ INIT_PER_CPU(irq_stack_backing_store);
+ "fixed_percpu_data is not at start of per-cpu area");
+ #endif
+
+- #ifdef CONFIG_RETHUNK
+-. = ASSERT((__ret & 0x3f) == 0, "__ret not cacheline-aligned");
++#ifdef CONFIG_RETHUNK
++. = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
+ . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
+ #endif
+
+@@ -538,8 +536,8 @@ INIT_PER_CPU(irq_stack_backing_store);
+ * Instead do: (A | B) - (A & B) in order to compute the XOR
+ * of the two function addresses:
+ */
+-. = ASSERT(((ABSOLUTE(srso_untrain_ret_alias) | srso_safe_ret_alias) -
+- (ABSOLUTE(srso_untrain_ret_alias) & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
++. = ASSERT(((ABSOLUTE(srso_alias_untrain_ret) | srso_alias_safe_ret) -
++ (ABSOLUTE(srso_alias_untrain_ret) & srso_alias_safe_ret)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
+ "SRSO function pair won't alias");
+ #endif
+
+diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
+index af7b968f55703..c3b557aca2494 100644
+--- a/arch/x86/kvm/svm/svm.c
++++ b/arch/x86/kvm/svm/svm.c
+@@ -4034,6 +4034,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
+
+ guest_state_enter_irqoff();
+
++ amd_clear_divider();
++
+ if (sev_es_guest(vcpu->kvm))
+ __svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
+ else
+diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
+index 265452fc9ebe9..ef2ebabb059c8 100644
+--- a/arch/x86/kvm/svm/vmenter.S
++++ b/arch/x86/kvm/svm/vmenter.S
+@@ -222,10 +222,7 @@ SYM_FUNC_START(__svm_vcpu_run)
+ * because interrupt handlers won't sanitize 'ret' if the return is
+ * from the kernel.
+ */
+- UNTRAIN_RET
+-
+- /* SRSO */
+- ALTERNATIVE "", "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT
++ UNTRAIN_RET_VM
+
+ /*
+ * Clear all general purpose registers except RSP and RAX to prevent
+@@ -362,7 +359,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
+ * because interrupt handlers won't sanitize RET if the return is
+ * from the kernel.
+ */
+- UNTRAIN_RET
++ UNTRAIN_RET_VM
+
+ /* "Pop" @spec_ctrl_intercepted. */
+ pop %_ASM_BX
+diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
+index 2cff585f22f29..cd86aeb5fdd3e 100644
+--- a/arch/x86/lib/retpoline.S
++++ b/arch/x86/lib/retpoline.S
+@@ -13,7 +13,7 @@
+ #include <asm/frame.h>
+ #include <asm/nops.h>
+
+- .section .text.__x86.indirect_thunk
++ .section .text..__x86.indirect_thunk
+
+
+ .macro POLINE reg
+@@ -133,75 +133,106 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
+ #ifdef CONFIG_RETHUNK
+
+ /*
+- * srso_untrain_ret_alias() and srso_safe_ret_alias() are placed at
++ * srso_alias_untrain_ret() and srso_alias_safe_ret() are placed at
+ * special addresses:
+ *
+- * - srso_untrain_ret_alias() is 2M aligned
+- * - srso_safe_ret_alias() is also in the same 2M page but bits 2, 8, 14
++ * - srso_alias_untrain_ret() is 2M aligned
++ * - srso_alias_safe_ret() is also in the same 2M page but bits 2, 8, 14
+ * and 20 in its virtual address are set (while those bits in the
+- * srso_untrain_ret_alias() function are cleared).
++ * srso_alias_untrain_ret() function are cleared).
+ *
+ * This guarantees that those two addresses will alias in the branch
+ * target buffer of Zen3/4 generations, leading to any potential
+ * poisoned entries at that BTB slot to get evicted.
+ *
+- * As a result, srso_safe_ret_alias() becomes a safe return.
++ * As a result, srso_alias_safe_ret() becomes a safe return.
+ */
+ #ifdef CONFIG_CPU_SRSO
+- .section .text.__x86.rethunk_untrain
++ .section .text..__x86.rethunk_untrain
+
+-SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
++SYM_START(srso_alias_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
++ UNWIND_HINT_FUNC
+ ANNOTATE_NOENDBR
+ ASM_NOP2
+ lfence
+- jmp __x86_return_thunk
+-SYM_FUNC_END(srso_untrain_ret_alias)
+-__EXPORT_THUNK(srso_untrain_ret_alias)
+-
+- .section .text.__x86.rethunk_safe
++ jmp srso_alias_return_thunk
++SYM_FUNC_END(srso_alias_untrain_ret)
++__EXPORT_THUNK(srso_alias_untrain_ret)
++
++ .section .text..__x86.rethunk_safe
++#else
++/* dummy definition for alternatives */
++SYM_START(srso_alias_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
++ ANNOTATE_UNRET_SAFE
++ ret
++ int3
++SYM_FUNC_END(srso_alias_untrain_ret)
+ #endif
+
+-/* Needs a definition for the __x86_return_thunk alternative below. */
+-SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+-#ifdef CONFIG_CPU_SRSO
+- add $8, %_ASM_SP
++SYM_START(srso_alias_safe_ret, SYM_L_GLOBAL, SYM_A_NONE)
++ lea 8(%_ASM_SP), %_ASM_SP
+ UNWIND_HINT_FUNC
+-#endif
+ ANNOTATE_UNRET_SAFE
+ ret
+ int3
+-SYM_FUNC_END(srso_safe_ret_alias)
++SYM_FUNC_END(srso_alias_safe_ret)
+
+- .section .text.__x86.return_thunk
++ .section .text..__x86.return_thunk
++
++SYM_CODE_START(srso_alias_return_thunk)
++ UNWIND_HINT_FUNC
++ ANNOTATE_NOENDBR
++ call srso_alias_safe_ret
++ ud2
++SYM_CODE_END(srso_alias_return_thunk)
++
++/*
++ * Some generic notes on the untraining sequences:
++ *
++ * They are interchangeable when it comes to flushing potentially wrong
++ * RET predictions from the BTB.
++ *
++ * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
++ * Retbleed sequence because the return sequence done there
++ * (srso_safe_ret()) is longer and the return sequence must fully nest
++ * (end before) the untraining sequence. Therefore, the untraining
++ * sequence must fully overlap the return sequence.
++ *
++ * Regarding alignment - the instructions which need to be untrained,
++ * must all start at a cacheline boundary for Zen1/2 generations. That
++ * is, instruction sequences starting at srso_safe_ret() and
++ * the respective instruction sequences at retbleed_return_thunk()
++ * must start at a cacheline boundary.
++ */
+
+ /*
+ * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
+- * 1) The RET at __x86_return_thunk must be on a 64 byte boundary, for
++ * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
+ * alignment within the BTB.
+- * 2) The instruction at zen_untrain_ret must contain, and not
++ * 2) The instruction at retbleed_untrain_ret must contain, and not
+ * end with, the 0xc3 byte of the RET.
+ * 3) STIBP must be enabled, or SMT disabled, to prevent the sibling thread
+ * from re-poisioning the BTB prediction.
+ */
+ .align 64
+- .skip 64 - (__ret - zen_untrain_ret), 0xcc
+-SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
++ .skip 64 - (retbleed_return_thunk - retbleed_untrain_ret), 0xcc
++SYM_START(retbleed_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
+ ANNOTATE_NOENDBR
+ /*
+- * As executed from zen_untrain_ret, this is:
++ * As executed from retbleed_untrain_ret, this is:
+ *
+ * TEST $0xcc, %bl
+ * LFENCE
+- * JMP __x86_return_thunk
++ * JMP retbleed_return_thunk
+ *
+ * Executing the TEST instruction has a side effect of evicting any BTB
+ * prediction (potentially attacker controlled) attached to the RET, as
+- * __x86_return_thunk + 1 isn't an instruction boundary at the moment.
++ * retbleed_return_thunk + 1 isn't an instruction boundary at the moment.
+ */
+ .byte 0xf6
+
+ /*
+- * As executed from __x86_return_thunk, this is a plain RET.
++ * As executed from retbleed_return_thunk, this is a plain RET.
+ *
+ * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
+ *
+@@ -213,13 +244,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
+ * With SMT enabled and STIBP active, a sibling thread cannot poison
+ * RET's prediction to a type of its choice, but can evict the
+ * prediction due to competitive sharing. If the prediction is
+- * evicted, __x86_return_thunk will suffer Straight Line Speculation
++ * evicted, retbleed_return_thunk will suffer Straight Line Speculation
+ * which will be contained safely by the INT3.
+ */
+-SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
++SYM_INNER_LABEL(retbleed_return_thunk, SYM_L_GLOBAL)
+ ret
+ int3
+-SYM_CODE_END(__ret)
++SYM_CODE_END(retbleed_return_thunk)
+
+ /*
+ * Ensure the TEST decoding / BTB invalidation is complete.
+@@ -230,16 +261,16 @@ SYM_CODE_END(__ret)
+ * Jump back and execute the RET in the middle of the TEST instruction.
+ * INT3 is for SLS protection.
+ */
+- jmp __ret
++ jmp retbleed_return_thunk
+ int3
+-SYM_FUNC_END(zen_untrain_ret)
+-__EXPORT_THUNK(zen_untrain_ret)
++SYM_FUNC_END(retbleed_untrain_ret)
++__EXPORT_THUNK(retbleed_untrain_ret)
+
+ /*
+- * SRSO untraining sequence for Zen1/2, similar to zen_untrain_ret()
++ * SRSO untraining sequence for Zen1/2, similar to retbleed_untrain_ret()
+ * above. On kernel entry, srso_untrain_ret() is executed which is a
+ *
+- * movabs $0xccccccc308c48348,%rax
++ * movabs $0xccccc30824648d48,%rax
+ *
+ * and when the return thunk executes the inner label srso_safe_ret()
+ * later, it is a stack manipulation and a RET which is mispredicted and
+@@ -251,22 +282,44 @@ SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
+ ANNOTATE_NOENDBR
+ .byte 0x48, 0xb8
+
++/*
++ * This forces the function return instruction to speculate into a trap
++ * (UD2 in srso_return_thunk() below). This RET will then mispredict
++ * and execution will continue at the return site read from the top of
++ * the stack.
++ */
+ SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
+- add $8, %_ASM_SP
++ lea 8(%_ASM_SP), %_ASM_SP
+ ret
+ int3
+ int3
+- int3
++ /* end of movabs */
+ lfence
+ call srso_safe_ret
+- int3
++ ud2
+ SYM_CODE_END(srso_safe_ret)
+ SYM_FUNC_END(srso_untrain_ret)
+ __EXPORT_THUNK(srso_untrain_ret)
+
+-SYM_FUNC_START(__x86_return_thunk)
+- ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
+- "call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
++SYM_CODE_START(srso_return_thunk)
++ UNWIND_HINT_FUNC
++ ANNOTATE_NOENDBR
++ call srso_safe_ret
++ ud2
++SYM_CODE_END(srso_return_thunk)
++
++SYM_FUNC_START(entry_untrain_ret)
++ ALTERNATIVE_2 "jmp retbleed_untrain_ret", \
++ "jmp srso_untrain_ret", X86_FEATURE_SRSO, \
++ "jmp srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
++SYM_FUNC_END(entry_untrain_ret)
++__EXPORT_THUNK(entry_untrain_ret)
++
++SYM_CODE_START(__x86_return_thunk)
++ UNWIND_HINT_FUNC
++ ANNOTATE_NOENDBR
++ ANNOTATE_UNRET_SAFE
++ ret
+ int3
+ SYM_CODE_END(__x86_return_thunk)
+ EXPORT_SYMBOL(__x86_return_thunk)
+diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
+index fc49be622e05b..9faafcd10e177 100644
+--- a/block/blk-cgroup.c
++++ b/block/blk-cgroup.c
+@@ -136,7 +136,9 @@ static void blkg_free_workfn(struct work_struct *work)
+ blkcg_policy[i]->pd_free_fn(blkg->pd[i]);
+ if (blkg->parent)
+ blkg_put(blkg->parent);
++ spin_lock_irq(&q->queue_lock);
+ list_del_init(&blkg->q_node);
++ spin_unlock_irq(&q->queue_lock);
+ mutex_unlock(&q->blkcg_mutex);
+
+ blk_put_queue(q);
+diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
+index ad9844c5b40cb..e6468eab2681e 100644
+--- a/block/blk-crypto-fallback.c
++++ b/block/blk-crypto-fallback.c
+@@ -78,7 +78,7 @@ static struct blk_crypto_fallback_keyslot {
+ struct crypto_skcipher *tfms[BLK_ENCRYPTION_MODE_MAX];
+ } *blk_crypto_keyslots;
+
+-static struct blk_crypto_profile blk_crypto_fallback_profile;
++static struct blk_crypto_profile *blk_crypto_fallback_profile;
+ static struct workqueue_struct *blk_crypto_wq;
+ static mempool_t *blk_crypto_bounce_page_pool;
+ static struct bio_set crypto_bio_split;
+@@ -292,7 +292,7 @@ static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr)
+ * Get a blk-crypto-fallback keyslot that contains a crypto_skcipher for
+ * this bio's algorithm and key.
+ */
+- blk_st = blk_crypto_get_keyslot(&blk_crypto_fallback_profile,
++ blk_st = blk_crypto_get_keyslot(blk_crypto_fallback_profile,
+ bc->bc_key, &slot);
+ if (blk_st != BLK_STS_OK) {
+ src_bio->bi_status = blk_st;
+@@ -395,7 +395,7 @@ static void blk_crypto_fallback_decrypt_bio(struct work_struct *work)
+ * Get a blk-crypto-fallback keyslot that contains a crypto_skcipher for
+ * this bio's algorithm and key.
+ */
+- blk_st = blk_crypto_get_keyslot(&blk_crypto_fallback_profile,
++ blk_st = blk_crypto_get_keyslot(blk_crypto_fallback_profile,
+ bc->bc_key, &slot);
+ if (blk_st != BLK_STS_OK) {
+ bio->bi_status = blk_st;
+@@ -499,7 +499,7 @@ bool blk_crypto_fallback_bio_prep(struct bio **bio_ptr)
+ return false;
+ }
+
+- if (!__blk_crypto_cfg_supported(&blk_crypto_fallback_profile,
++ if (!__blk_crypto_cfg_supported(blk_crypto_fallback_profile,
+ &bc->bc_key->crypto_cfg)) {
+ bio->bi_status = BLK_STS_NOTSUPP;
+ return false;
+@@ -526,7 +526,7 @@ bool blk_crypto_fallback_bio_prep(struct bio **bio_ptr)
+
+ int blk_crypto_fallback_evict_key(const struct blk_crypto_key *key)
+ {
+- return __blk_crypto_evict_key(&blk_crypto_fallback_profile, key);
++ return __blk_crypto_evict_key(blk_crypto_fallback_profile, key);
+ }
+
+ static bool blk_crypto_fallback_inited;
+@@ -534,7 +534,6 @@ static int blk_crypto_fallback_init(void)
+ {
+ int i;
+ int err;
+- struct blk_crypto_profile *profile = &blk_crypto_fallback_profile;
+
+ if (blk_crypto_fallback_inited)
+ return 0;
+@@ -545,18 +544,27 @@ static int blk_crypto_fallback_init(void)
+ if (err)
+ goto out;
+
+- err = blk_crypto_profile_init(profile, blk_crypto_num_keyslots);
+- if (err)
++ /* Dynamic allocation is needed because of lockdep_register_key(). */
++ blk_crypto_fallback_profile =
++ kzalloc(sizeof(*blk_crypto_fallback_profile), GFP_KERNEL);
++ if (!blk_crypto_fallback_profile) {
++ err = -ENOMEM;
+ goto fail_free_bioset;
++ }
++
++ err = blk_crypto_profile_init(blk_crypto_fallback_profile,
++ blk_crypto_num_keyslots);
++ if (err)
++ goto fail_free_profile;
+ err = -ENOMEM;
+
+- profile->ll_ops = blk_crypto_fallback_ll_ops;
+- profile->max_dun_bytes_supported = BLK_CRYPTO_MAX_IV_SIZE;
++ blk_crypto_fallback_profile->ll_ops = blk_crypto_fallback_ll_ops;
++ blk_crypto_fallback_profile->max_dun_bytes_supported = BLK_CRYPTO_MAX_IV_SIZE;
+
+ /* All blk-crypto modes have a crypto API fallback. */
+ for (i = 0; i < BLK_ENCRYPTION_MODE_MAX; i++)
+- profile->modes_supported[i] = 0xFFFFFFFF;
+- profile->modes_supported[BLK_ENCRYPTION_MODE_INVALID] = 0;
++ blk_crypto_fallback_profile->modes_supported[i] = 0xFFFFFFFF;
++ blk_crypto_fallback_profile->modes_supported[BLK_ENCRYPTION_MODE_INVALID] = 0;
+
+ blk_crypto_wq = alloc_workqueue("blk_crypto_wq",
+ WQ_UNBOUND | WQ_HIGHPRI |
+@@ -597,7 +605,9 @@ fail_free_keyslots:
+ fail_free_wq:
+ destroy_workqueue(blk_crypto_wq);
+ fail_destroy_profile:
+- blk_crypto_profile_destroy(profile);
++ blk_crypto_profile_destroy(blk_crypto_fallback_profile);
++fail_free_profile:
++ kfree(blk_crypto_fallback_profile);
+ fail_free_bioset:
+ bioset_exit(&crypto_bio_split);
+ out:
+diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
+index fabfc501ef543..a39dd346a1678 100644
+--- a/drivers/accel/habanalabs/common/device.c
++++ b/drivers/accel/habanalabs/common/device.c
+@@ -981,6 +981,18 @@ static void device_early_fini(struct hl_device *hdev)
+ hdev->asic_funcs->early_fini(hdev);
+ }
+
++static bool is_pci_link_healthy(struct hl_device *hdev)
++{
++ u16 vendor_id;
++
++ if (!hdev->pdev)
++ return false;
++
++ pci_read_config_word(hdev->pdev, PCI_VENDOR_ID, &vendor_id);
++
++ return (vendor_id == PCI_VENDOR_ID_HABANALABS);
++}
++
+ static void hl_device_heartbeat(struct work_struct *work)
+ {
+ struct hl_device *hdev = container_of(work, struct hl_device,
+@@ -995,7 +1007,8 @@ static void hl_device_heartbeat(struct work_struct *work)
+ goto reschedule;
+
+ if (hl_device_operational(hdev, NULL))
+- dev_err(hdev->dev, "Device heartbeat failed!\n");
++ dev_err(hdev->dev, "Device heartbeat failed! PCI link is %s\n",
++ is_pci_link_healthy(hdev) ? "healthy" : "broken");
+
+ info.err_type = HL_INFO_FW_HEARTBEAT_ERR;
+ info.event_mask = &event_mask;
+diff --git a/drivers/accel/habanalabs/common/habanalabs.h b/drivers/accel/habanalabs/common/habanalabs.h
+index eaae69a9f8178..7f5d1b6e3fb08 100644
+--- a/drivers/accel/habanalabs/common/habanalabs.h
++++ b/drivers/accel/habanalabs/common/habanalabs.h
+@@ -36,6 +36,8 @@
+ struct hl_device;
+ struct hl_fpriv;
+
++#define PCI_VENDOR_ID_HABANALABS 0x1da3
++
+ /* Use upper bits of mmap offset to store habana driver specific information.
+ * bits[63:59] - Encode mmap type
+ * bits[45:0] - mmap offset value
+diff --git a/drivers/accel/habanalabs/common/habanalabs_drv.c b/drivers/accel/habanalabs/common/habanalabs_drv.c
+index d9df64e75f33a..70fb2df9a93b8 100644
+--- a/drivers/accel/habanalabs/common/habanalabs_drv.c
++++ b/drivers/accel/habanalabs/common/habanalabs_drv.c
+@@ -13,6 +13,7 @@
+
+ #include <linux/pci.h>
+ #include <linux/module.h>
++#include <linux/vmalloc.h>
+
+ #define CREATE_TRACE_POINTS
+ #include <trace/events/habanalabs.h>
+@@ -54,8 +55,6 @@ module_param(boot_error_status_mask, ulong, 0444);
+ MODULE_PARM_DESC(boot_error_status_mask,
+ "Mask of the error status during device CPU boot (If bitX is cleared then error X is masked. Default all 1's)");
+
+-#define PCI_VENDOR_ID_HABANALABS 0x1da3
+-
+ #define PCI_IDS_GOYA 0x0001
+ #define PCI_IDS_GAUDI 0x1000
+ #define PCI_IDS_GAUDI_SEC 0x1010
+@@ -220,6 +219,7 @@ int hl_device_open(struct inode *inode, struct file *filp)
+
+ hl_debugfs_add_file(hpriv);
+
++ vfree(hdev->captured_err_info.page_fault_info.user_mappings);
+ memset(&hdev->captured_err_info, 0, sizeof(hdev->captured_err_info));
+ atomic_set(&hdev->captured_err_info.cs_timeout.write_enable, 1);
+ hdev->captured_err_info.undef_opcode.write_enable = true;
+diff --git a/drivers/accel/qaic/qaic_control.c b/drivers/accel/qaic/qaic_control.c
+index cfbc92da426fa..388abd40024ba 100644
+--- a/drivers/accel/qaic/qaic_control.c
++++ b/drivers/accel/qaic/qaic_control.c
+@@ -392,18 +392,31 @@ static int find_and_map_user_pages(struct qaic_device *qdev,
+ struct qaic_manage_trans_dma_xfer *in_trans,
+ struct ioctl_resources *resources, struct dma_xfer *xfer)
+ {
++ u64 xfer_start_addr, remaining, end, total;
+ unsigned long need_pages;
+ struct page **page_list;
+ unsigned long nr_pages;
+ struct sg_table *sgt;
+- u64 xfer_start_addr;
+ int ret;
+ int i;
+
+- xfer_start_addr = in_trans->addr + resources->xferred_dma_size;
++ if (check_add_overflow(in_trans->addr, resources->xferred_dma_size, &xfer_start_addr))
++ return -EINVAL;
+
+- need_pages = DIV_ROUND_UP(in_trans->size + offset_in_page(xfer_start_addr) -
+- resources->xferred_dma_size, PAGE_SIZE);
++ if (in_trans->size < resources->xferred_dma_size)
++ return -EINVAL;
++ remaining = in_trans->size - resources->xferred_dma_size;
++ if (remaining == 0)
++ return 0;
++
++ if (check_add_overflow(xfer_start_addr, remaining, &end))
++ return -EINVAL;
++
++ total = remaining + offset_in_page(xfer_start_addr);
++ if (total >= SIZE_MAX)
++ return -EINVAL;
++
++ need_pages = DIV_ROUND_UP(total, PAGE_SIZE);
+
+ nr_pages = need_pages;
+
+@@ -435,7 +448,7 @@ static int find_and_map_user_pages(struct qaic_device *qdev,
+
+ ret = sg_alloc_table_from_pages(sgt, page_list, nr_pages,
+ offset_in_page(xfer_start_addr),
+- in_trans->size - resources->xferred_dma_size, GFP_KERNEL);
++ remaining, GFP_KERNEL);
+ if (ret) {
+ ret = -ENOMEM;
+ goto free_sgt;
+@@ -566,9 +579,6 @@ static int encode_dma(struct qaic_device *qdev, void *trans, struct wrapper_list
+ QAIC_MANAGE_EXT_MSG_LENGTH)
+ return -ENOMEM;
+
+- if (in_trans->addr + in_trans->size < in_trans->addr || !in_trans->size)
+- return -EINVAL;
+-
+ xfer = kmalloc(sizeof(*xfer), GFP_KERNEL);
+ if (!xfer)
+ return -ENOMEM;
+diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c
+index e9a1cb779b305..6b6d981a71be7 100644
+--- a/drivers/accel/qaic/qaic_data.c
++++ b/drivers/accel/qaic/qaic_data.c
+@@ -1021,6 +1021,7 @@ int qaic_attach_slice_bo_ioctl(struct drm_device *dev, void *data, struct drm_fi
+ bo->dbc = dbc;
+ srcu_read_unlock(&dbc->ch_lock, rcu_id);
+ drm_gem_object_put(obj);
++ kfree(slice_ent);
+ srcu_read_unlock(&qdev->dev_lock, qdev_rcu_id);
+ srcu_read_unlock(&usr->qddev_lock, usr_rcu_id);
+
+diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
+index 50e23762ec5e9..025e803ba55c2 100644
+--- a/drivers/bluetooth/btusb.c
++++ b/drivers/bluetooth/btusb.c
+@@ -613,6 +613,9 @@ static const struct usb_device_id blacklist_table[] = {
+ { USB_DEVICE(0x0489, 0xe0d9), .driver_info = BTUSB_MEDIATEK |
+ BTUSB_WIDEBAND_SPEECH |
+ BTUSB_VALID_LE_STATES },
++ { USB_DEVICE(0x0489, 0xe0f5), .driver_info = BTUSB_MEDIATEK |
++ BTUSB_WIDEBAND_SPEECH |
++ BTUSB_VALID_LE_STATES },
+ { USB_DEVICE(0x13d3, 0x3568), .driver_info = BTUSB_MEDIATEK |
+ BTUSB_WIDEBAND_SPEECH |
+ BTUSB_VALID_LE_STATES },
+diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
+index 21fe9854703f9..4cb23b9e06ea4 100644
+--- a/drivers/bus/ti-sysc.c
++++ b/drivers/bus/ti-sysc.c
+@@ -2142,6 +2142,8 @@ static int sysc_reset(struct sysc *ddata)
+ sysc_val = sysc_read_sysconfig(ddata);
+ sysc_val |= sysc_mask;
+ sysc_write(ddata, sysc_offset, sysc_val);
++ /* Flush posted write */
++ sysc_val = sysc_read_sysconfig(ddata);
+ }
+
+ if (ddata->cfg.srst_udelay)
+diff --git a/drivers/firewire/net.c b/drivers/firewire/net.c
+index 538bd677c254a..7a4d1a478e33e 100644
+--- a/drivers/firewire/net.c
++++ b/drivers/firewire/net.c
+@@ -479,7 +479,7 @@ static int fwnet_finish_incoming_packet(struct net_device *net,
+ struct sk_buff *skb, u16 source_node_id,
+ bool is_broadcast, u16 ether_type)
+ {
+- int status;
++ int status, len;
+
+ switch (ether_type) {
+ case ETH_P_ARP:
+@@ -533,13 +533,15 @@ static int fwnet_finish_incoming_packet(struct net_device *net,
+ }
+ skb->protocol = protocol;
+ }
++
++ len = skb->len;
+ status = netif_rx(skb);
+ if (status == NET_RX_DROP) {
+ net->stats.rx_errors++;
+ net->stats.rx_dropped++;
+ } else {
+ net->stats.rx_packets++;
+- net->stats.rx_bytes += skb->len;
++ net->stats.rx_bytes += len;
+ }
+
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+index a989ae72a58a9..0c023269aadaa 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+@@ -189,7 +189,7 @@ static int amdgpu_cs_pass1(struct amdgpu_cs_parser *p,
+ uint64_t *chunk_array_user;
+ uint64_t *chunk_array;
+ uint32_t uf_offset = 0;
+- unsigned int size;
++ size_t size;
+ int ret;
+ int i;
+
+@@ -1625,15 +1625,15 @@ static int amdgpu_cs_wait_all_fences(struct amdgpu_device *adev,
+ continue;
+
+ r = dma_fence_wait_timeout(fence, true, timeout);
++ if (r > 0 && fence->error)
++ r = fence->error;
++
+ dma_fence_put(fence);
+ if (r < 0)
+ return r;
+
+ if (r == 0)
+ break;
+-
+- if (fence->error)
+- return fence->error;
+ }
+
+ memset(wait, 0, sizeof(*wait));
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+index c6d4d41c4393e..23d054526e7c7 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+@@ -106,3 +106,41 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
+ ttm_eu_backoff_reservation(&ticket, &list);
+ return 0;
+ }
++
++int amdgpu_unmap_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
++ struct amdgpu_bo *bo, struct amdgpu_bo_va *bo_va,
++ uint64_t csa_addr)
++{
++ struct ww_acquire_ctx ticket;
++ struct list_head list;
++ struct amdgpu_bo_list_entry pd;
++ struct ttm_validate_buffer csa_tv;
++ int r;
++
++ INIT_LIST_HEAD(&list);
++ INIT_LIST_HEAD(&csa_tv.head);
++ csa_tv.bo = &bo->tbo;
++ csa_tv.num_shared = 1;
++
++ list_add(&csa_tv.head, &list);
++ amdgpu_vm_get_pd_bo(vm, &list, &pd);
++
++ r = ttm_eu_reserve_buffers(&ticket, &list, true, NULL);
++ if (r) {
++ DRM_ERROR("failed to reserve CSA,PD BOs: err=%d\n", r);
++ return r;
++ }
++
++ r = amdgpu_vm_bo_unmap(adev, bo_va, csa_addr);
++ if (r) {
++ DRM_ERROR("failed to do bo_unmap on static CSA, err=%d\n", r);
++ ttm_eu_backoff_reservation(&ticket, &list);
++ return r;
++ }
++
++ amdgpu_vm_bo_del(adev, bo_va);
++
++ ttm_eu_backoff_reservation(&ticket, &list);
++
++ return 0;
++}
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h
+index 524b4437a0217..7dfc1f2012ebf 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.h
+@@ -34,6 +34,9 @@ int amdgpu_allocate_static_csa(struct amdgpu_device *adev, struct amdgpu_bo **bo
+ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
+ struct amdgpu_bo *bo, struct amdgpu_bo_va **bo_va,
+ uint64_t csa_addr, uint32_t size);
++int amdgpu_unmap_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
++ struct amdgpu_bo *bo, struct amdgpu_bo_va *bo_va,
++ uint64_t csa_addr);
+ void amdgpu_free_static_csa(struct amdgpu_bo **bo);
+
+ #endif
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+index 44a902d9b5c7b..3108f5219cf3b 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+@@ -4250,6 +4250,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool fbcon)
+ drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
+
+ cancel_delayed_work_sync(&adev->delayed_init_work);
++ flush_delayed_work(&adev->gfx.gfx_off_delay_work);
+
+ amdgpu_ras_suspend(adev);
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+index a7d250809da99..b9ba01b4c9925 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+@@ -555,6 +555,41 @@ int amdgpu_fence_driver_sw_init(struct amdgpu_device *adev)
+ return 0;
+ }
+
++/**
++ * amdgpu_fence_need_ring_interrupt_restore - helper function to check whether
++ * fence driver interrupts need to be restored.
++ *
++ * @ring: ring to be checked
++ *
++ * Interrupts for rings that belong to GFX IP don't need to be restored
++ * when the target power state is s0ix.
++ *
++ * Return true if interrupts need to be restored, false otherwise.
++ */
++static bool amdgpu_fence_need_ring_interrupt_restore(struct amdgpu_ring *ring)
++{
++ struct amdgpu_device *adev = ring->adev;
++ bool is_gfx_power_domain = false;
++
++ switch (ring->funcs->type) {
++ case AMDGPU_RING_TYPE_SDMA:
++ /* SDMA 5.x+ is part of GFX power domain so it's covered by GFXOFF */
++ if (adev->ip_versions[SDMA0_HWIP][0] >= IP_VERSION(5, 0, 0))
++ is_gfx_power_domain = true;
++ break;
++ case AMDGPU_RING_TYPE_GFX:
++ case AMDGPU_RING_TYPE_COMPUTE:
++ case AMDGPU_RING_TYPE_KIQ:
++ case AMDGPU_RING_TYPE_MES:
++ is_gfx_power_domain = true;
++ break;
++ default:
++ break;
++ }
++
++ return !(adev->in_s0ix && is_gfx_power_domain);
++}
++
+ /**
+ * amdgpu_fence_driver_hw_fini - tear down the fence driver
+ * for all possible rings.
+@@ -583,7 +618,8 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)
+ amdgpu_fence_driver_force_completion(ring);
+
+ if (!drm_dev_is_unplugged(adev_to_drm(adev)) &&
+- ring->fence_drv.irq_src)
++ ring->fence_drv.irq_src &&
++ amdgpu_fence_need_ring_interrupt_restore(ring))
+ amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+ ring->fence_drv.irq_type);
+
+@@ -658,7 +694,8 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev)
+ continue;
+
+ /* enable the interrupt */
+- if (ring->fence_drv.irq_src)
++ if (ring->fence_drv.irq_src &&
++ amdgpu_fence_need_ring_interrupt_restore(ring))
+ amdgpu_irq_get(adev, ring->fence_drv.irq_src,
+ ring->fence_drv.irq_type);
+ }
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+index f3f541ba0acaa..bff5b6eac39b5 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+@@ -589,15 +589,8 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable)
+
+ if (adev->gfx.gfx_off_req_count == 0 &&
+ !adev->gfx.gfx_off_state) {
+- /* If going to s2idle, no need to wait */
+- if (adev->in_s0ix) {
+- if (!amdgpu_dpm_set_powergating_by_smu(adev,
+- AMD_IP_BLOCK_TYPE_GFX, true))
+- adev->gfx.gfx_off_state = true;
+- } else {
+- schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
++ schedule_delayed_work(&adev->gfx.gfx_off_delay_work,
+ delay);
+- }
+ }
+ } else {
+ if (adev->gfx.gfx_off_req_count == 0) {
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+index fafebec5b7b66..9581c020d815d 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+@@ -124,7 +124,6 @@ void amdgpu_irq_disable_all(struct amdgpu_device *adev)
+ continue;
+
+ for (k = 0; k < src->num_types; ++k) {
+- atomic_set(&src->enabled_types[k], 0);
+ r = src->funcs->set(adev, src, k,
+ AMDGPU_IRQ_STATE_DISABLE);
+ if (r)
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+index 0efb38539d70c..724e80c192973 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+@@ -1284,12 +1284,12 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
+ if (amdgpu_device_ip_get_ip_block(adev, AMD_IP_BLOCK_TYPE_VCE) != NULL)
+ amdgpu_vce_free_handles(adev, file_priv);
+
+- if (amdgpu_mcbp) {
+- /* TODO: how to handle reserve failure */
+- BUG_ON(amdgpu_bo_reserve(adev->virt.csa_obj, true));
+- amdgpu_vm_bo_del(adev, fpriv->csa_va);
++ if (fpriv->csa_va) {
++ uint64_t csa_addr = amdgpu_csa_vaddr(adev) & AMDGPU_GMC_HOLE_MASK;
++
++ WARN_ON(amdgpu_unmap_static_csa(adev, &fpriv->vm, adev->virt.csa_obj,
++ fpriv->csa_va, csa_addr));
+ fpriv->csa_va = NULL;
+- amdgpu_bo_unreserve(adev->virt.csa_obj);
+ }
+
+ pasid = fpriv->vm.pasid;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+index db820331f2c61..39e54685653cc 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+@@ -520,6 +520,8 @@ static int psp_sw_fini(void *handle)
+ kfree(cmd);
+ cmd = NULL;
+
++ psp_free_shared_bufs(psp);
++
+ if (psp->km_ring.ring_mem)
+ amdgpu_bo_free_kernel(&adev->firmware.rbuf,
+ &psp->km_ring.ring_mem_mc_addr,
+@@ -2657,8 +2659,6 @@ static int psp_hw_fini(void *handle)
+
+ psp_ring_destroy(psp, PSP_RING_TYPE__KM);
+
+- psp_free_shared_bufs(psp);
+-
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+index 49de3a3eebc78..de04606c2061e 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+@@ -361,6 +361,8 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring)
+ amdgpu_bo_free_kernel(&ring->ring_obj,
+ &ring->gpu_addr,
+ (void **)&ring->ring);
++ } else {
++ kfree(ring->fence_drv.fences);
+ }
+
+ dma_fence_put(ring->vmid_wait);
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+index 23f52150ebef4..fd029d91a3402 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+@@ -1367,6 +1367,7 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct amdgpu_device *adev,
+ amdgpu_vm_bo_base_init(&bo_va->base, vm, bo);
+
+ bo_va->ref_count = 1;
++ bo_va->last_pt_update = dma_fence_get_stub();
+ INIT_LIST_HEAD(&bo_va->valids);
+ INIT_LIST_HEAD(&bo_va->invalids);
+
+@@ -2088,7 +2089,8 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm)
+ vm->update_funcs = &amdgpu_vm_cpu_funcs;
+ else
+ vm->update_funcs = &amdgpu_vm_sdma_funcs;
+- vm->last_update = NULL;
++
++ vm->last_update = dma_fence_get_stub();
+ vm->last_unlocked = dma_fence_get_stub();
+ vm->last_tlb_flush = dma_fence_get_stub();
+
+@@ -2213,7 +2215,7 @@ int amdgpu_vm_make_compute(struct amdgpu_device *adev, struct amdgpu_vm *vm)
+ goto unreserve_bo;
+
+ dma_fence_put(vm->last_update);
+- vm->last_update = NULL;
++ vm->last_update = dma_fence_get_stub();
+ vm->is_compute_context = true;
+
+ /* Free the shadow bo for compute VM */
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index bdce367544368..4dd9a85f5c724 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -1653,11 +1653,6 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
+ if (amdgpu_dc_feature_mask & DC_DISABLE_LTTPR_DP2_0)
+ init_data.flags.allow_lttpr_non_transparent_mode.bits.DP2_0 = true;
+
+- /* Disable SubVP + DRR config by default */
+- init_data.flags.disable_subvp_drr = true;
+- if (amdgpu_dc_feature_mask & DC_ENABLE_SUBVP_DRR)
+- init_data.flags.disable_subvp_drr = false;
+-
+ init_data.flags.seamless_boot_edp_requested = false;
+
+ if (check_seamless_boot_capability(adev)) {
+diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
+index 8d9444db092ab..eea103908b09f 100644
+--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
++++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c
+@@ -233,6 +233,32 @@ void dcn32_init_clocks(struct clk_mgr *clk_mgr_base)
+ DC_FP_END();
+ }
+
++static void dcn32_update_clocks_update_dtb_dto(struct clk_mgr_internal *clk_mgr,
++ struct dc_state *context,
++ int ref_dtbclk_khz)
++{
++ struct dccg *dccg = clk_mgr->dccg;
++ uint32_t tg_mask = 0;
++ int i;
++
++ for (i = 0; i < clk_mgr->base.ctx->dc->res_pool->pipe_count; i++) {
++ struct pipe_ctx *pipe_ctx = &context->res_ctx.pipe_ctx[i];
++ struct dtbclk_dto_params dto_params = {0};
++
++ /* use mask to program DTO once per tg */
++ if (pipe_ctx->stream_res.tg &&
++ !(tg_mask & (1 << pipe_ctx->stream_res.tg->inst))) {
++ tg_mask |= (1 << pipe_ctx->stream_res.tg->inst);
++
++ dto_params.otg_inst = pipe_ctx->stream_res.tg->inst;
++ dto_params.ref_dtbclk_khz = ref_dtbclk_khz;
++
++ dccg->funcs->set_dtbclk_dto(clk_mgr->dccg, &dto_params);
++ //dccg->funcs->set_audio_dtbclk_dto(clk_mgr->dccg, &dto_params);
++ }
++ }
++}
++
+ /* Since DPPCLK request to PMFW needs to be exact (due to DPP DTO programming),
+ * update DPPCLK to be the exact frequency that will be set after the DPPCLK
+ * divider is updated. This will prevent rounding issues that could cause DPP
+@@ -570,6 +596,7 @@ static void dcn32_update_clocks(struct clk_mgr *clk_mgr_base,
+ /* DCCG requires KHz precision for DTBCLK */
+ clk_mgr_base->clks.ref_dtbclk_khz =
+ dcn32_smu_set_hard_min_by_freq(clk_mgr, PPCLK_DTBCLK, khz_to_mhz_ceil(new_clocks->ref_dtbclk_khz));
++ dcn32_update_clocks_update_dtb_dto(clk_mgr, context, clk_mgr_base->clks.ref_dtbclk_khz);
+ }
+
+ if (dc->config.forced_clocks == false || (force_reset && safe_to_lower)) {
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+index 1d8c5805ef20c..77ef474ced071 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+@@ -712,7 +712,7 @@ static const struct dc_debug_options debug_defaults_drv = {
+ .timing_trace = false,
+ .clock_trace = true,
+ .disable_pplib_clock_request = true,
+- .pipe_split_policy = MPC_SPLIT_DYNAMIC,
++ .pipe_split_policy = MPC_SPLIT_AVOID_MULT_DISP,
+ .force_single_disp_pipe_split = false,
+ .disable_dcc = DCC_ENABLE,
+ .vsr_support = true,
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
+index 4c2fdfea162f5..65c1d754e2d6b 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
+@@ -47,6 +47,14 @@ void dccg31_update_dpp_dto(struct dccg *dccg, int dpp_inst, int req_dppclk)
+ {
+ struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
++ if (dccg->dpp_clock_gated[dpp_inst]) {
++ /*
++ * Do not update the DPPCLK DTO if the clock is stopped.
++ * It is treated the same as if the pipe itself were in PG.
++ */
++ return;
++ }
++
+ if (dccg->ref_dppclk && req_dppclk) {
+ int ref_dppclk = dccg->ref_dppclk;
+ int modulo, phase;
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
+index de7bfba2c1798..afeb9f4d53441 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dccg.c
+@@ -322,6 +322,9 @@ static void dccg314_dpp_root_clock_control(
+ {
+ struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
++ if (dccg->dpp_clock_gated[dpp_inst] != clock_on)
++ return;
++
+ if (clock_on) {
+ /* turn off the DTO and leave phase/modulo at max */
+ REG_UPDATE(DPPCLK_DTO_CTRL, DPPCLK_DTO_ENABLE[dpp_inst], 0);
+@@ -335,6 +338,8 @@ static void dccg314_dpp_root_clock_control(
+ DPPCLK0_DTO_PHASE, 0,
+ DPPCLK0_DTO_MODULO, 1);
+ }
++
++ dccg->dpp_clock_gated[dpp_inst] = !clock_on;
+ }
+
+ static const struct dccg_funcs dccg314_funcs = {
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
+index abeeede38fb39..653b5f15d4ca7 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c
+@@ -921,6 +921,22 @@ static const struct dc_debug_options debug_defaults_drv = {
+ .afmt = true,
+ }
+ },
++
++ .root_clock_optimization = {
++ .bits = {
++ .dpp = true,
++ .dsc = false,
++ .hdmistream = false,
++ .hdmichar = false,
++ .dpstream = false,
++ .symclk32_se = false,
++ .symclk32_le = false,
++ .symclk_fe = false,
++ .physymclk = false,
++ .dpiasymclk = false,
++ }
++ },
++
+ .seamless_boot_odm_combine = true
+ };
+
+@@ -1920,6 +1936,10 @@ static bool dcn314_resource_construct(
+ dc->debug = debug_defaults_drv;
+ else
+ dc->debug = debug_defaults_diags;
++
++ /* Disable root clock optimization */
++ dc->debug.root_clock_optimization.u32All = 0;
++
+ // Init the vm_helper
+ if (dc->vm_helper)
+ vm_helper_init(dc->vm_helper, 16);
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
+index 7661f8946aa31..9ec767ebf5d16 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
+@@ -1097,10 +1097,6 @@ void dcn20_calculate_dlg_params(struct dc *dc,
+ context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz =
+ pipes[pipe_idx].clks_cfg.dppclk_mhz * 1000;
+ context->res_ctx.pipe_ctx[i].pipe_dlg_param = pipes[pipe_idx].pipe.dest;
+- if (context->res_ctx.pipe_ctx[i].stream->adaptive_sync_infopacket.valid)
+- dcn20_adjust_freesync_v_startup(
+- &context->res_ctx.pipe_ctx[i].stream->timing,
+- &context->res_ctx.pipe_ctx[i].pipe_dlg_param.vstartup_start);
+
+ pipe_idx++;
+ }
+@@ -1914,6 +1910,7 @@ static bool dcn20_validate_bandwidth_internal(struct dc *dc, struct dc_state *co
+ int vlevel = 0;
+ int pipe_split_from[MAX_PIPES];
+ int pipe_cnt = 0;
++ int i = 0;
+ display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), GFP_ATOMIC);
+ DC_LOGGER_INIT(dc->ctx->logger);
+
+@@ -1937,6 +1934,15 @@ static bool dcn20_validate_bandwidth_internal(struct dc *dc, struct dc_state *co
+ dcn20_calculate_wm(dc, context, pipes, &pipe_cnt, pipe_split_from, vlevel, fast_validate);
+ dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel);
+
++ for (i = 0; i < dc->res_pool->pipe_count; i++) {
++ if (!context->res_ctx.pipe_ctx[i].stream)
++ continue;
++ if (context->res_ctx.pipe_ctx[i].stream->adaptive_sync_infopacket.valid)
++ dcn20_adjust_freesync_v_startup(
++ &context->res_ctx.pipe_ctx[i].stream->timing,
++ &context->res_ctx.pipe_ctx[i].pipe_dlg_param.vstartup_start);
++ }
++
+ BW_VAL_TRACE_END_WATERMARKS();
+
+ goto validate_out;
+@@ -2209,6 +2215,7 @@ bool dcn21_validate_bandwidth_fp(struct dc *dc,
+ int vlevel = 0;
+ int pipe_split_from[MAX_PIPES];
+ int pipe_cnt = 0;
++ int i = 0;
+ display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), GFP_ATOMIC);
+ DC_LOGGER_INIT(dc->ctx->logger);
+
+@@ -2237,6 +2244,15 @@ bool dcn21_validate_bandwidth_fp(struct dc *dc,
+ dcn21_calculate_wm(dc, context, pipes, &pipe_cnt, pipe_split_from, vlevel, fast_validate);
+ dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel);
+
++ for (i = 0; i < dc->res_pool->pipe_count; i++) {
++ if (!context->res_ctx.pipe_ctx[i].stream)
++ continue;
++ if (context->res_ctx.pipe_ctx[i].stream->adaptive_sync_infopacket.valid)
++ dcn20_adjust_freesync_v_startup(
++ &context->res_ctx.pipe_ctx[i].stream->timing,
++ &context->res_ctx.pipe_ctx[i].pipe_dlg_param.vstartup_start);
++ }
++
+ BW_VAL_TRACE_END_WATERMARKS();
+
+ goto validate_out;
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+index d8b4119820bfc..1bfda6e2b3070 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+@@ -880,10 +880,6 @@ static bool subvp_drr_schedulable(struct dc *dc, struct dc_state *context, struc
+ int16_t stretched_drr_us = 0;
+ int16_t drr_stretched_vblank_us = 0;
+ int16_t max_vblank_mallregion = 0;
+- const struct dc_config *config = &dc->config;
+-
+- if (config->disable_subvp_drr)
+- return false;
+
+ // Find SubVP pipe
+ for (i = 0; i < dc->res_pool->pipe_count; i++) {
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+index d75248b6cae99..9a5150e96017a 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+@@ -811,7 +811,7 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
+ v->SwathHeightC[k],
+ TWait,
+ (v->DRAMSpeedPerState[mode_lib->vba.VoltageLevel] <= MEM_STROBE_FREQ_MHZ ||
+- v->DCFCLKPerState[mode_lib->vba.VoltageLevel] <= MIN_DCFCLK_FREQ_MHZ) ?
++ v->DCFCLKPerState[mode_lib->vba.VoltageLevel] <= DCFCLK_FREQ_EXTRA_PREFETCH_REQ_MHZ) ?
+ mode_lib->vba.ip.min_prefetch_in_strobe_us : 0,
+ /* Output */
+ &v->DSTXAfterScaler[k],
+@@ -3311,7 +3311,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
+ v->swath_width_chroma_ub_this_state[k],
+ v->SwathHeightYThisState[k],
+ v->SwathHeightCThisState[k], v->TWait,
+- (v->DRAMSpeedPerState[i] <= MEM_STROBE_FREQ_MHZ || v->DCFCLKState[i][j] <= MIN_DCFCLK_FREQ_MHZ) ?
++ (v->DRAMSpeedPerState[i] <= MEM_STROBE_FREQ_MHZ || v->DCFCLKState[i][j] <= DCFCLK_FREQ_EXTRA_PREFETCH_REQ_MHZ) ?
+ mode_lib->vba.ip.min_prefetch_in_strobe_us : 0,
+
+ /* Output */
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.h b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.h
+index d98e36a9a09cc..c4745d63039bb 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.h
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.h
+@@ -53,7 +53,7 @@
+ #define BPP_BLENDED_PIPE 0xffffffff
+
+ #define MEM_STROBE_FREQ_MHZ 1600
+-#define MIN_DCFCLK_FREQ_MHZ 200
++#define DCFCLK_FREQ_EXTRA_PREFETCH_REQ_MHZ 300
+ #define MEM_STROBE_MAX_DELIVERY_TIME_US 60.0
+
+ struct display_mode_lib;
+diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h b/drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h
+index ad6acd1b34e1d..9651cccb084a3 100644
+--- a/drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h
++++ b/drivers/gpu/drm/amd/display/dc/inc/hw/dccg.h
+@@ -68,6 +68,7 @@ struct dccg {
+ const struct dccg_funcs *funcs;
+ int pipe_dppclk_khz[MAX_PIPES];
+ int ref_dppclk;
++ bool dpp_clock_gated[MAX_PIPES];
+ //int dtbclk_khz[MAX_PIPES];/* TODO needs to be removed */
+ //int audio_dtbclk_khz;/* TODO needs to be removed */
+ //int ref_dtbclk_khz;/* TODO needs to be removed */
+diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
+index e4a22c68517d1..f175e65b853a0 100644
+--- a/drivers/gpu/drm/amd/include/amd_shared.h
++++ b/drivers/gpu/drm/amd/include/amd_shared.h
+@@ -240,7 +240,6 @@ enum DC_FEATURE_MASK {
+ DC_DISABLE_LTTPR_DP2_0 = (1 << 6), //0x40, disabled by default
+ DC_PSR_ALLOW_SMU_OPT = (1 << 7), //0x80, disabled by default
+ DC_PSR_ALLOW_MULTI_DISP_OPT = (1 << 8), //0x100, disabled by default
+- DC_ENABLE_SUBVP_DRR = (1 << 9), // 0x200, disabled by default
+ };
+
+ enum DC_DEBUG_MASK {
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+index ea03e8d9a3f6c..818379276a582 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+@@ -1573,9 +1573,9 @@ static int smu_disable_dpms(struct smu_context *smu)
+
+ /*
+ * For SMU 13.0.4/11, PMFW will handle the features disablement properly
+- * for gpu reset case. Driver involvement is unnecessary.
++ * for gpu reset and S0i3 cases. Driver involvement is unnecessary.
+ */
+- if (amdgpu_in_reset(adev)) {
++ if (amdgpu_in_reset(adev) || adev->in_s0ix) {
+ switch (adev->ip_versions[MP1_HWIP][0]) {
+ case IP_VERSION(13, 0, 4):
+ case IP_VERSION(13, 0, 11):
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+index 0cda3b276f611..f0800c0c5168c 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+@@ -588,7 +588,9 @@ err0_out:
+ return -ENOMEM;
+ }
+
+-static uint32_t sienna_cichlid_get_throttler_status_locked(struct smu_context *smu)
++static uint32_t sienna_cichlid_get_throttler_status_locked(struct smu_context *smu,
++ bool use_metrics_v3,
++ bool use_metrics_v2)
+ {
+ struct smu_table_context *smu_table= &smu->smu_table;
+ SmuMetricsExternal_t *metrics_ext =
+@@ -596,13 +598,11 @@ static uint32_t sienna_cichlid_get_throttler_status_locked(struct smu_context *s
+ uint32_t throttler_status = 0;
+ int i;
+
+- if ((smu->adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 7)) &&
+- (smu->smc_fw_version >= 0x3A4900)) {
++ if (use_metrics_v3) {
+ for (i = 0; i < THROTTLER_COUNT; i++)
+ throttler_status |=
+ (metrics_ext->SmuMetrics_V3.ThrottlingPercentage[i] ? 1U << i : 0);
+- } else if ((smu->adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 7)) &&
+- (smu->smc_fw_version >= 0x3A4300)) {
++ } else if (use_metrics_v2) {
+ for (i = 0; i < THROTTLER_COUNT; i++)
+ throttler_status |=
+ (metrics_ext->SmuMetrics_V2.ThrottlingPercentage[i] ? 1U << i : 0);
+@@ -864,7 +864,7 @@ static int sienna_cichlid_get_smu_metrics_data(struct smu_context *smu,
+ metrics->TemperatureVrSoc) * SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+ break;
+ case METRICS_THROTTLER_STATUS:
+- *value = sienna_cichlid_get_throttler_status_locked(smu);
++ *value = sienna_cichlid_get_throttler_status_locked(smu, use_metrics_v3, use_metrics_v2);
+ break;
+ case METRICS_CURR_FANSPEED:
+ *value = use_metrics_v3 ? metrics_v3->CurrFanSpeed :
+@@ -4017,7 +4017,7 @@ static ssize_t sienna_cichlid_get_gpu_metrics(struct smu_context *smu,
+ gpu_metrics->current_dclk1 = use_metrics_v3 ? metrics_v3->CurrClock[PPCLK_DCLK_1] :
+ use_metrics_v2 ? metrics_v2->CurrClock[PPCLK_DCLK_1] : metrics->CurrClock[PPCLK_DCLK_1];
+
+- gpu_metrics->throttle_status = sienna_cichlid_get_throttler_status_locked(smu);
++ gpu_metrics->throttle_status = sienna_cichlid_get_throttler_status_locked(smu, use_metrics_v3, use_metrics_v2);
+ gpu_metrics->indep_throttle_status =
+ smu_cmn_get_indep_throttler_status(gpu_metrics->throttle_status,
+ sienna_cichlid_throttler_map);
+diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
+index 0454da505687b..e1a04461ba884 100644
+--- a/drivers/gpu/drm/drm_edid.c
++++ b/drivers/gpu/drm/drm_edid.c
+@@ -3424,6 +3424,10 @@ static struct drm_display_mode *drm_mode_detailed(struct drm_connector *connecto
+ connector->base.id, connector->name);
+ return NULL;
+ }
++ if (!(pt->misc & DRM_EDID_PT_SEPARATE_SYNC)) {
++ drm_dbg_kms(dev, "[CONNECTOR:%d:%s] Composite sync not supported\n",
++ connector->base.id, connector->name);
++ }
+
+ /* it is incorrect if hsync/vsync width is zero */
+ if (!hsync_pulse_width || !vsync_pulse_width) {
+@@ -3470,27 +3474,10 @@ static struct drm_display_mode *drm_mode_detailed(struct drm_connector *connecto
+ if (info->quirks & EDID_QUIRK_DETAILED_SYNC_PP) {
+ mode->flags |= DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC;
+ } else {
+- switch (pt->misc & DRM_EDID_PT_SYNC_MASK) {
+- case DRM_EDID_PT_ANALOG_CSYNC:
+- case DRM_EDID_PT_BIPOLAR_ANALOG_CSYNC:
+- drm_dbg_kms(dev, "[CONNECTOR:%d:%s] Analog composite sync!\n",
+- connector->base.id, connector->name);
+- mode->flags |= DRM_MODE_FLAG_CSYNC | DRM_MODE_FLAG_NCSYNC;
+- break;
+- case DRM_EDID_PT_DIGITAL_CSYNC:
+- drm_dbg_kms(dev, "[CONNECTOR:%d:%s] Digital composite sync!\n",
+- connector->base.id, connector->name);
+- mode->flags |= DRM_MODE_FLAG_CSYNC;
+- mode->flags |= (pt->misc & DRM_EDID_PT_HSYNC_POSITIVE) ?
+- DRM_MODE_FLAG_PCSYNC : DRM_MODE_FLAG_NCSYNC;
+- break;
+- case DRM_EDID_PT_DIGITAL_SEPARATE_SYNC:
+- mode->flags |= (pt->misc & DRM_EDID_PT_HSYNC_POSITIVE) ?
+- DRM_MODE_FLAG_PHSYNC : DRM_MODE_FLAG_NHSYNC;
+- mode->flags |= (pt->misc & DRM_EDID_PT_VSYNC_POSITIVE) ?
+- DRM_MODE_FLAG_PVSYNC : DRM_MODE_FLAG_NVSYNC;
+- break;
+- }
++ mode->flags |= (pt->misc & DRM_EDID_PT_HSYNC_POSITIVE) ?
++ DRM_MODE_FLAG_PHSYNC : DRM_MODE_FLAG_NHSYNC;
++ mode->flags |= (pt->misc & DRM_EDID_PT_VSYNC_POSITIVE) ?
++ DRM_MODE_FLAG_PVSYNC : DRM_MODE_FLAG_NVSYNC;
+ }
+
+ set_size:
+diff --git a/drivers/gpu/drm/i915/display/intel_sdvo.c b/drivers/gpu/drm/i915/display/intel_sdvo.c
+index e12ba458636c1..5ee0479ae6de3 100644
+--- a/drivers/gpu/drm/i915/display/intel_sdvo.c
++++ b/drivers/gpu/drm/i915/display/intel_sdvo.c
+@@ -2752,7 +2752,7 @@ static struct intel_sdvo_connector *intel_sdvo_connector_alloc(void)
+ __drm_atomic_helper_connector_reset(&sdvo_connector->base.base,
+ &conn_state->base.base);
+
+- INIT_LIST_HEAD(&sdvo_connector->base.panel.fixed_modes);
++ intel_panel_init_alloc(&sdvo_connector->base);
+
+ return sdvo_connector;
+ }
+diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+index cc18e8f664864..78822331f1b7f 100644
+--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
++++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+@@ -470,12 +470,19 @@ int intel_guc_slpc_set_ignore_eff_freq(struct intel_guc_slpc *slpc, bool val)
+ ret = slpc_set_param(slpc,
+ SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
+ val);
+- if (ret)
++ if (ret) {
+ guc_probe_error(slpc_to_guc(slpc), "Failed to set efficient freq(%d): %pe\n",
+ val, ERR_PTR(ret));
+- else
++ } else {
+ slpc->ignore_eff_freq = val;
+
++ /* Set min to RPn when we disable efficient freq */
++ if (val)
++ ret = slpc_set_param(slpc,
++ SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
++ slpc->min_freq);
++ }
++
+ intel_runtime_pm_put(&i915->runtime_pm, wakeref);
+ mutex_unlock(&slpc->lock);
+ return ret;
+@@ -602,9 +609,8 @@ static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
+ return ret;
+
+ if (!slpc->min_freq_softlimit) {
+- ret = intel_guc_slpc_get_min_freq(slpc, &slpc->min_freq_softlimit);
+- if (unlikely(ret))
+- return ret;
++ /* Min softlimit is initialized to RPn */
++ slpc->min_freq_softlimit = slpc->min_freq;
+ slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
+ } else {
+ return intel_guc_slpc_set_min_freq(slpc,
+@@ -755,6 +761,9 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
+ return ret;
+ }
+
++ /* Set cached value of ignore efficient freq */
++ intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
++
+ /* Revert SLPC min/max to softlimits if necessary */
+ ret = slpc_set_softlimits(slpc);
+ if (unlikely(ret)) {
+@@ -765,9 +774,6 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
+ /* Set cached media freq ratio mode */
+ intel_guc_slpc_set_media_ratio_mode(slpc, slpc->media_ratio_mode);
+
+- /* Set cached value of ignore efficient freq */
+- intel_guc_slpc_set_ignore_eff_freq(slpc, slpc->ignore_eff_freq);
+-
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
+index a2e0033e8a260..622f6eb9a8bfd 100644
+--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
++++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
+@@ -1408,8 +1408,7 @@ nouveau_connector_create(struct drm_device *dev,
+ ret = nvif_conn_ctor(&disp->disp, nv_connector->base.name, nv_connector->index,
+ &nv_connector->conn);
+ if (ret) {
+- kfree(nv_connector);
+- return ERR_PTR(ret);
++ goto drm_conn_err;
+ }
+
+ ret = nvif_conn_event_ctor(&nv_connector->conn, "kmsHotplug",
+@@ -1426,8 +1425,7 @@ nouveau_connector_create(struct drm_device *dev,
+ if (ret) {
+ nvif_event_dtor(&nv_connector->hpd);
+ nvif_conn_dtor(&nv_connector->conn);
+- kfree(nv_connector);
+- return ERR_PTR(ret);
++ goto drm_conn_err;
+ }
+ }
+ }
+@@ -1475,4 +1473,9 @@ nouveau_connector_create(struct drm_device *dev,
+
+ drm_connector_register(connector);
+ return connector;
++
++drm_conn_err:
++ drm_connector_cleanup(connector);
++ kfree(nv_connector);
++ return ERR_PTR(ret);
+ }
+diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
+index e02249b212c2a..cf6b146acc323 100644
+--- a/drivers/gpu/drm/panel/panel-simple.c
++++ b/drivers/gpu/drm/panel/panel-simple.c
+@@ -969,21 +969,21 @@ static const struct panel_desc auo_g104sn02 = {
+ .connector_type = DRM_MODE_CONNECTOR_LVDS,
+ };
+
+-static const struct drm_display_mode auo_g121ean01_mode = {
+- .clock = 66700,
+- .hdisplay = 1280,
+- .hsync_start = 1280 + 58,
+- .hsync_end = 1280 + 58 + 8,
+- .htotal = 1280 + 58 + 8 + 70,
+- .vdisplay = 800,
+- .vsync_start = 800 + 6,
+- .vsync_end = 800 + 6 + 4,
+- .vtotal = 800 + 6 + 4 + 10,
++static const struct display_timing auo_g121ean01_timing = {
++ .pixelclock = { 60000000, 74400000, 90000000 },
++ .hactive = { 1280, 1280, 1280 },
++ .hfront_porch = { 20, 50, 100 },
++ .hback_porch = { 20, 50, 100 },
++ .hsync_len = { 30, 100, 200 },
++ .vactive = { 800, 800, 800 },
++ .vfront_porch = { 2, 10, 25 },
++ .vback_porch = { 2, 10, 25 },
++ .vsync_len = { 4, 18, 50 },
+ };
+
+ static const struct panel_desc auo_g121ean01 = {
+- .modes = &auo_g121ean01_mode,
+- .num_modes = 1,
++ .timings = &auo_g121ean01_timing,
++ .num_timings = 1,
+ .bpc = 8,
+ .size = {
+ .width = 261,
+diff --git a/drivers/gpu/drm/qxl/qxl_drv.h b/drivers/gpu/drm/qxl/qxl_drv.h
+index ea993d7162e8c..307a890fde133 100644
+--- a/drivers/gpu/drm/qxl/qxl_drv.h
++++ b/drivers/gpu/drm/qxl/qxl_drv.h
+@@ -310,7 +310,7 @@ int qxl_gem_object_create_with_handle(struct qxl_device *qdev,
+ u32 domain,
+ size_t size,
+ struct qxl_surface *surf,
+- struct qxl_bo **qobj,
++ struct drm_gem_object **gobj,
+ uint32_t *handle);
+ void qxl_gem_object_free(struct drm_gem_object *gobj);
+ int qxl_gem_object_open(struct drm_gem_object *obj, struct drm_file *file_priv);
+diff --git a/drivers/gpu/drm/qxl/qxl_dumb.c b/drivers/gpu/drm/qxl/qxl_dumb.c
+index d636ba6854513..17df5c7ccf691 100644
+--- a/drivers/gpu/drm/qxl/qxl_dumb.c
++++ b/drivers/gpu/drm/qxl/qxl_dumb.c
+@@ -34,6 +34,7 @@ int qxl_mode_dumb_create(struct drm_file *file_priv,
+ {
+ struct qxl_device *qdev = to_qxl(dev);
+ struct qxl_bo *qobj;
++ struct drm_gem_object *gobj;
+ uint32_t handle;
+ int r;
+ struct qxl_surface surf;
+@@ -62,11 +63,13 @@ int qxl_mode_dumb_create(struct drm_file *file_priv,
+
+ r = qxl_gem_object_create_with_handle(qdev, file_priv,
+ QXL_GEM_DOMAIN_CPU,
+- args->size, &surf, &qobj,
++ args->size, &surf, &gobj,
+ &handle);
+ if (r)
+ return r;
++ qobj = gem_to_qxl_bo(gobj);
+ qobj->is_dumb = true;
++ drm_gem_object_put(gobj);
+ args->pitch = pitch;
+ args->handle = handle;
+ return 0;
+diff --git a/drivers/gpu/drm/qxl/qxl_gem.c b/drivers/gpu/drm/qxl/qxl_gem.c
+index a08da0bd9098b..fc5e3763c3595 100644
+--- a/drivers/gpu/drm/qxl/qxl_gem.c
++++ b/drivers/gpu/drm/qxl/qxl_gem.c
+@@ -72,32 +72,41 @@ int qxl_gem_object_create(struct qxl_device *qdev, int size,
+ return 0;
+ }
+
++/*
++ * If the caller passed a valid gobj pointer, it is responsible for calling
++ * drm_gem_object_put() when it no longer needs to access the object.
++ *
++ * If gobj is NULL, it is handled internally.
++ */
+ int qxl_gem_object_create_with_handle(struct qxl_device *qdev,
+ struct drm_file *file_priv,
+ u32 domain,
+ size_t size,
+ struct qxl_surface *surf,
+- struct qxl_bo **qobj,
++ struct drm_gem_object **gobj,
+ uint32_t *handle)
+ {
+- struct drm_gem_object *gobj;
+ int r;
++ struct drm_gem_object *local_gobj;
+
+- BUG_ON(!qobj);
+ BUG_ON(!handle);
+
+ r = qxl_gem_object_create(qdev, size, 0,
+ domain,
+ false, false, surf,
+- &gobj);
++ &local_gobj);
+ if (r)
+ return -ENOMEM;
+- r = drm_gem_handle_create(file_priv, gobj, handle);
++ r = drm_gem_handle_create(file_priv, local_gobj, handle);
+ if (r)
+ return r;
+- /* drop reference from allocate - handle holds it now */
+- *qobj = gem_to_qxl_bo(gobj);
+- drm_gem_object_put(gobj);
++
++ if (gobj)
++ *gobj = local_gobj;
++ else
++ /* drop reference from allocate - handle holds it now */
++ drm_gem_object_put(local_gobj);
++
+ return 0;
+ }
+
+diff --git a/drivers/gpu/drm/qxl/qxl_ioctl.c b/drivers/gpu/drm/qxl/qxl_ioctl.c
+index 30f58b21372aa..dd0f834d881ce 100644
+--- a/drivers/gpu/drm/qxl/qxl_ioctl.c
++++ b/drivers/gpu/drm/qxl/qxl_ioctl.c
+@@ -38,7 +38,6 @@ int qxl_alloc_ioctl(struct drm_device *dev, void *data, struct drm_file *file_pr
+ struct qxl_device *qdev = to_qxl(dev);
+ struct drm_qxl_alloc *qxl_alloc = data;
+ int ret;
+- struct qxl_bo *qobj;
+ uint32_t handle;
+ u32 domain = QXL_GEM_DOMAIN_VRAM;
+
+@@ -50,7 +49,7 @@ int qxl_alloc_ioctl(struct drm_device *dev, void *data, struct drm_file *file_pr
+ domain,
+ qxl_alloc->size,
+ NULL,
+- &qobj, &handle);
++ NULL, &handle);
+ if (ret) {
+ DRM_ERROR("%s: failed to create gem ret=%d\n",
+ __func__, ret);
+@@ -386,7 +385,6 @@ int qxl_alloc_surf_ioctl(struct drm_device *dev, void *data, struct drm_file *fi
+ {
+ struct qxl_device *qdev = to_qxl(dev);
+ struct drm_qxl_alloc_surf *param = data;
+- struct qxl_bo *qobj;
+ int handle;
+ int ret;
+ int size, actual_stride;
+@@ -406,7 +404,7 @@ int qxl_alloc_surf_ioctl(struct drm_device *dev, void *data, struct drm_file *fi
+ QXL_GEM_DOMAIN_SURFACE,
+ size,
+ &surf,
+- &qobj, &handle);
++ NULL, &handle);
+ if (ret) {
+ DRM_ERROR("%s: failed to create gem ret=%d\n",
+ __func__, ret);
+diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+index d6d29be6b4f48..7e175dbfd8924 100644
+--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
++++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+@@ -223,20 +223,6 @@ static void rcar_du_crtc_set_display_timing(struct rcar_du_crtc *rcrtc)
+ * DU channels that have a display PLL can't use the internal
+ * system clock, and have no internal clock divider.
+ */
+-
+- /*
+- * The H3 ES1.x exhibits dot clock duty cycle stability issues.
+- * We can work around them by configuring the DPLL to twice the
+- * desired frequency, coupled with a /2 post-divider. Restrict
+- * the workaround to H3 ES1.x as ES2.0 and all other SoCs have
+- * no post-divider when a display PLL is present (as shown by
+- * the workaround breaking HDMI output on M3-W during testing).
+- */
+- if (rcdu->info->quirks & RCAR_DU_QUIRK_H3_ES1_PCLK_STABILITY) {
+- target *= 2;
+- div = 1;
+- }
+-
+ extclk = clk_get_rate(rcrtc->extclock);
+ rcar_du_dpll_divider(rcrtc, &dpll, extclk, target);
+
+@@ -245,30 +231,13 @@ static void rcar_du_crtc_set_display_timing(struct rcar_du_crtc *rcrtc)
+ | DPLLCR_N(dpll.n) | DPLLCR_M(dpll.m)
+ | DPLLCR_STBY;
+
+- if (rcrtc->index == 1) {
++ if (rcrtc->index == 1)
+ dpllcr |= DPLLCR_PLCS1
+ | DPLLCR_INCS_DOTCLKIN1;
+- } else {
+- dpllcr |= DPLLCR_PLCS0_PLL
++ else
++ dpllcr |= DPLLCR_PLCS0
+ | DPLLCR_INCS_DOTCLKIN0;
+
+- /*
+- * On ES2.x we have a single mux controlled via bit 21,
+- * which selects between DCLKIN source (bit 21 = 0) and
+- * a PLL source (bit 21 = 1), where the PLL is always
+- * PLL1.
+- *
+- * On ES1.x we have an additional mux, controlled
+- * via bit 20, for choosing between PLL0 (bit 20 = 0)
+- * and PLL1 (bit 20 = 1). We always want to use PLL1,
+- * so on ES1.x, in addition to setting bit 21, we need
+- * to set the bit 20.
+- */
+-
+- if (rcdu->info->quirks & RCAR_DU_QUIRK_H3_ES1_PLL)
+- dpllcr |= DPLLCR_PLCS0_H3ES1X_PLL1;
+- }
+-
+ rcar_du_group_write(rcrtc->group, DPLLCR, dpllcr);
+
+ escr = ESCR_DCLKSEL_DCLKIN | div;
+diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+index b9a94c5260e9d..1ffde19cb87fe 100644
+--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
++++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+@@ -16,7 +16,6 @@
+ #include <linux/platform_device.h>
+ #include <linux/pm.h>
+ #include <linux/slab.h>
+-#include <linux/sys_soc.h>
+ #include <linux/wait.h>
+
+ #include <drm/drm_atomic_helper.h>
+@@ -387,43 +386,6 @@ static const struct rcar_du_device_info rcar_du_r8a7795_info = {
+ .dpll_mask = BIT(2) | BIT(1),
+ };
+
+-static const struct rcar_du_device_info rcar_du_r8a7795_es1_info = {
+- .gen = 3,
+- .features = RCAR_DU_FEATURE_CRTC_IRQ
+- | RCAR_DU_FEATURE_CRTC_CLOCK
+- | RCAR_DU_FEATURE_VSP1_SOURCE
+- | RCAR_DU_FEATURE_INTERLACED
+- | RCAR_DU_FEATURE_TVM_SYNC,
+- .quirks = RCAR_DU_QUIRK_H3_ES1_PCLK_STABILITY
+- | RCAR_DU_QUIRK_H3_ES1_PLL,
+- .channels_mask = BIT(3) | BIT(2) | BIT(1) | BIT(0),
+- .routes = {
+- /*
+- * R8A7795 has one RGB output, two HDMI outputs and one
+- * LVDS output.
+- */
+- [RCAR_DU_OUTPUT_DPAD0] = {
+- .possible_crtcs = BIT(3),
+- .port = 0,
+- },
+- [RCAR_DU_OUTPUT_HDMI0] = {
+- .possible_crtcs = BIT(1),
+- .port = 1,
+- },
+- [RCAR_DU_OUTPUT_HDMI1] = {
+- .possible_crtcs = BIT(2),
+- .port = 2,
+- },
+- [RCAR_DU_OUTPUT_LVDS0] = {
+- .possible_crtcs = BIT(0),
+- .port = 3,
+- },
+- },
+- .num_lvds = 1,
+- .num_rpf = 5,
+- .dpll_mask = BIT(2) | BIT(1),
+-};
+-
+ static const struct rcar_du_device_info rcar_du_r8a7796_info = {
+ .gen = 3,
+ .features = RCAR_DU_FEATURE_CRTC_IRQ
+@@ -614,11 +576,6 @@ static const struct of_device_id rcar_du_of_table[] = {
+
+ MODULE_DEVICE_TABLE(of, rcar_du_of_table);
+
+-static const struct soc_device_attribute rcar_du_soc_table[] = {
+- { .soc_id = "r8a7795", .revision = "ES1.*", .data = &rcar_du_r8a7795_es1_info },
+- { /* sentinel */ }
+-};
+-
+ const char *rcar_du_output_name(enum rcar_du_output output)
+ {
+ static const char * const names[] = {
+@@ -707,7 +664,6 @@ static void rcar_du_shutdown(struct platform_device *pdev)
+
+ static int rcar_du_probe(struct platform_device *pdev)
+ {
+- const struct soc_device_attribute *soc_attr;
+ struct rcar_du_device *rcdu;
+ unsigned int mask;
+ int ret;
+@@ -725,10 +681,6 @@ static int rcar_du_probe(struct platform_device *pdev)
+
+ rcdu->info = of_device_get_match_data(rcdu->dev);
+
+- soc_attr = soc_device_match(rcar_du_soc_table);
+- if (soc_attr)
+- rcdu->info = soc_attr->data;
+-
+ platform_set_drvdata(pdev, rcdu);
+
+ /* I/O resources */
+diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.h b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
+index acc3673fefe18..5cfa2bb7ad93d 100644
+--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.h
++++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.h
+@@ -34,8 +34,6 @@ struct rcar_du_device;
+ #define RCAR_DU_FEATURE_NO_BLENDING BIT(5) /* PnMR.SPIM does not have ALP nor EOR bits */
+
+ #define RCAR_DU_QUIRK_ALIGN_128B BIT(0) /* Align pitches to 128 bytes */
+-#define RCAR_DU_QUIRK_H3_ES1_PCLK_STABILITY BIT(1) /* H3 ES1 has pclk stability issue */
+-#define RCAR_DU_QUIRK_H3_ES1_PLL BIT(2) /* H3 ES1 PLL setup differs from non-ES1 */
+
+ enum rcar_du_output {
+ RCAR_DU_OUTPUT_DPAD0,
+diff --git a/drivers/gpu/drm/rcar-du/rcar_du_regs.h b/drivers/gpu/drm/rcar-du/rcar_du_regs.h
+index 6c750fab6ebb7..391de6661d8bc 100644
+--- a/drivers/gpu/drm/rcar-du/rcar_du_regs.h
++++ b/drivers/gpu/drm/rcar-du/rcar_du_regs.h
+@@ -283,8 +283,7 @@
+ #define DPLLCR 0x20044
+ #define DPLLCR_CODE (0x95 << 24)
+ #define DPLLCR_PLCS1 (1 << 23)
+-#define DPLLCR_PLCS0_PLL (1 << 21)
+-#define DPLLCR_PLCS0_H3ES1X_PLL1 (1 << 20)
++#define DPLLCR_PLCS0 (1 << 21)
+ #define DPLLCR_CLKE (1 << 18)
+ #define DPLLCR_FDPLL(n) ((n) << 12)
+ #define DPLLCR_N(n) ((n) << 5)
+diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
+index e0a8890a62e23..3e2a31d8190eb 100644
+--- a/drivers/gpu/drm/scheduler/sched_entity.c
++++ b/drivers/gpu/drm/scheduler/sched_entity.c
+@@ -448,6 +448,12 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
+ drm_sched_rq_update_fifo(entity, next->submit_ts);
+ }
+
++ /* Jobs and entities might have different lifecycles. Since we're
++ * removing the job from the entity's queue, set the job's entity pointer
++ * to NULL to prevent any future access of the entity through this job.
++ */
++ sched_job->entity = NULL;
++
+ return sched_job;
+ }
+
+diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
+index aea5a90ff98b9..cdd67676c3d1b 100644
+--- a/drivers/gpu/drm/scheduler/sched_main.c
++++ b/drivers/gpu/drm/scheduler/sched_main.c
+@@ -42,6 +42,10 @@
+ * the hardware.
+ *
+ * The jobs in an entity are always scheduled in the order that they were pushed.
++ *
++ * Note that once a job has been taken from the entity's queue and pushed to
++ * the hardware, i.e. the pending queue, the entity must not be referenced
++ * anymore through the job's entity pointer.
+ */
+
+ #include <linux/kthread.h>
+diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
+index 03c6becda795c..b8be4c1db4235 100644
+--- a/drivers/gpu/drm/stm/ltdc.c
++++ b/drivers/gpu/drm/stm/ltdc.c
+@@ -1145,7 +1145,7 @@ static void ltdc_crtc_disable_vblank(struct drm_crtc *crtc)
+
+ static int ltdc_crtc_set_crc_source(struct drm_crtc *crtc, const char *source)
+ {
+- struct ltdc_device *ldev = crtc_to_ltdc(crtc);
++ struct ltdc_device *ldev;
+ int ret;
+
+ DRM_DEBUG_DRIVER("\n");
+@@ -1153,6 +1153,8 @@ static int ltdc_crtc_set_crc_source(struct drm_crtc *crtc, const char *source)
+ if (!crtc)
+ return -ENODEV;
+
++ ldev = crtc_to_ltdc(crtc);
++
+ if (source && strcmp(source, "auto") == 0) {
+ ldev->crc_active = true;
+ ret = regmap_set_bits(ldev->regmap, LTDC_GCR, GCR_CRCEN);
+diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c
+index f7e06d433a915..dfe8e09a18de0 100644
+--- a/drivers/hid/hid-logitech-hidpp.c
++++ b/drivers/hid/hid-logitech-hidpp.c
+@@ -4608,6 +4608,8 @@ static const struct hid_device_id hidpp_devices[] = {
+ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC086) },
+ { /* Logitech G903 Hero Gaming Mouse over USB */
+ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC091) },
++ { /* Logitech G915 TKL Keyboard over USB */
++ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, 0xC343) },
+ { /* Logitech G920 Wheel over USB */
+ HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_LOGITECH_G920_WHEEL),
+ .driver_data = HIDPP_QUIRK_CLASS_G920 | HIDPP_QUIRK_FORCE_OUTPUT_REPORTS},
+@@ -4630,6 +4632,8 @@ static const struct hid_device_id hidpp_devices[] = {
+ { /* MX5500 keyboard over Bluetooth */
+ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb30b),
+ .driver_data = HIDPP_QUIRK_HIDPP_CONSUMER_VENDOR_KEYS },
++ { /* Logitech G915 TKL keyboard over Bluetooth */
++ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb35f) },
+ { /* M-RCQ142 V470 Cordless Laser Mouse over Bluetooth */
+ HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LOGITECH, 0xb008) },
+ { /* MX Master mouse over Bluetooth */
+diff --git a/drivers/hid/i2c-hid/i2c-hid-of-goodix.c b/drivers/hid/i2c-hid/i2c-hid-of-goodix.c
+index 0060e3dcd775d..db4639db98407 100644
+--- a/drivers/hid/i2c-hid/i2c-hid-of-goodix.c
++++ b/drivers/hid/i2c-hid/i2c-hid-of-goodix.c
+@@ -28,6 +28,7 @@ struct i2c_hid_of_goodix {
+ struct regulator *vdd;
+ struct regulator *vddio;
+ struct gpio_desc *reset_gpio;
++ bool no_reset_during_suspend;
+ const struct goodix_i2c_hid_timing_data *timings;
+ };
+
+@@ -37,6 +38,14 @@ static int goodix_i2c_hid_power_up(struct i2chid_ops *ops)
+ container_of(ops, struct i2c_hid_of_goodix, ops);
+ int ret;
+
++ /*
++ * We assert the reset GPIO here (instead of during power-down) to ensure
++ * the device powers up in a clean state, just as it does in the normal
++ * scenarios.
++ */
++ if (ihid_goodix->no_reset_during_suspend)
++ gpiod_set_value_cansleep(ihid_goodix->reset_gpio, 1);
++
+ ret = regulator_enable(ihid_goodix->vdd);
+ if (ret)
+ return ret;
+@@ -60,7 +69,9 @@ static void goodix_i2c_hid_power_down(struct i2chid_ops *ops)
+ struct i2c_hid_of_goodix *ihid_goodix =
+ container_of(ops, struct i2c_hid_of_goodix, ops);
+
+- gpiod_set_value_cansleep(ihid_goodix->reset_gpio, 1);
++ if (!ihid_goodix->no_reset_during_suspend)
++ gpiod_set_value_cansleep(ihid_goodix->reset_gpio, 1);
++
+ regulator_disable(ihid_goodix->vddio);
+ regulator_disable(ihid_goodix->vdd);
+ }
+@@ -91,6 +102,9 @@ static int i2c_hid_of_goodix_probe(struct i2c_client *client)
+ if (IS_ERR(ihid_goodix->vddio))
+ return PTR_ERR(ihid_goodix->vddio);
+
++ ihid_goodix->no_reset_during_suspend =
++ of_property_read_bool(client->dev.of_node, "goodix,no-reset-during-suspend");
++
+ ihid_goodix->timings = device_get_match_data(&client->dev);
+
+ return i2c_hid_core_probe(client, &ihid_goodix->ops, 0x0001, 0);
+diff --git a/drivers/hid/intel-ish-hid/ipc/hw-ish.h b/drivers/hid/intel-ish-hid/ipc/hw-ish.h
+index fc108f19a64c3..e99f3a3c65e15 100644
+--- a/drivers/hid/intel-ish-hid/ipc/hw-ish.h
++++ b/drivers/hid/intel-ish-hid/ipc/hw-ish.h
+@@ -33,6 +33,7 @@
+ #define ADL_N_DEVICE_ID 0x54FC
+ #define RPL_S_DEVICE_ID 0x7A78
+ #define MTL_P_DEVICE_ID 0x7E45
++#define ARL_H_DEVICE_ID 0x7745
+
+ #define REVISION_ID_CHT_A0 0x6
+ #define REVISION_ID_CHT_Ax_SI 0x0
+diff --git a/drivers/hid/intel-ish-hid/ipc/pci-ish.c b/drivers/hid/intel-ish-hid/ipc/pci-ish.c
+index 7120b30ac51d0..55cb25038e632 100644
+--- a/drivers/hid/intel-ish-hid/ipc/pci-ish.c
++++ b/drivers/hid/intel-ish-hid/ipc/pci-ish.c
+@@ -44,6 +44,7 @@ static const struct pci_device_id ish_pci_tbl[] = {
+ {PCI_DEVICE(PCI_VENDOR_ID_INTEL, ADL_N_DEVICE_ID)},
+ {PCI_DEVICE(PCI_VENDOR_ID_INTEL, RPL_S_DEVICE_ID)},
+ {PCI_DEVICE(PCI_VENDOR_ID_INTEL, MTL_P_DEVICE_ID)},
++ {PCI_DEVICE(PCI_VENDOR_ID_INTEL, ARL_H_DEVICE_ID)},
+ {0, }
+ };
+ MODULE_DEVICE_TABLE(pci, ish_pci_tbl);
+diff --git a/drivers/i2c/busses/i2c-bcm-iproc.c b/drivers/i2c/busses/i2c-bcm-iproc.c
+index 85d8a6b048856..30a2a3200bed9 100644
+--- a/drivers/i2c/busses/i2c-bcm-iproc.c
++++ b/drivers/i2c/busses/i2c-bcm-iproc.c
+@@ -233,13 +233,14 @@ static inline u32 iproc_i2c_rd_reg(struct bcm_iproc_i2c_dev *iproc_i2c,
+ u32 offset)
+ {
+ u32 val;
++ unsigned long flags;
+
+ if (iproc_i2c->idm_base) {
+- spin_lock(&iproc_i2c->idm_lock);
++ spin_lock_irqsave(&iproc_i2c->idm_lock, flags);
+ writel(iproc_i2c->ape_addr_mask,
+ iproc_i2c->idm_base + IDM_CTRL_DIRECT_OFFSET);
+ val = readl(iproc_i2c->base + offset);
+- spin_unlock(&iproc_i2c->idm_lock);
++ spin_unlock_irqrestore(&iproc_i2c->idm_lock, flags);
+ } else {
+ val = readl(iproc_i2c->base + offset);
+ }
+@@ -250,12 +251,14 @@ static inline u32 iproc_i2c_rd_reg(struct bcm_iproc_i2c_dev *iproc_i2c,
+ static inline void iproc_i2c_wr_reg(struct bcm_iproc_i2c_dev *iproc_i2c,
+ u32 offset, u32 val)
+ {
++ unsigned long flags;
++
+ if (iproc_i2c->idm_base) {
+- spin_lock(&iproc_i2c->idm_lock);
++ spin_lock_irqsave(&iproc_i2c->idm_lock, flags);
+ writel(iproc_i2c->ape_addr_mask,
+ iproc_i2c->idm_base + IDM_CTRL_DIRECT_OFFSET);
+ writel(val, iproc_i2c->base + offset);
+- spin_unlock(&iproc_i2c->idm_lock);
++ spin_unlock_irqrestore(&iproc_i2c->idm_lock, flags);
+ } else {
+ writel(val, iproc_i2c->base + offset);
+ }
+diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c
+index 55ea91a633829..c51fc1f4b97eb 100644
+--- a/drivers/i2c/busses/i2c-designware-master.c
++++ b/drivers/i2c/busses/i2c-designware-master.c
+@@ -526,9 +526,21 @@ i2c_dw_read(struct dw_i2c_dev *dev)
+ u32 flags = msgs[dev->msg_read_idx].flags;
+
+ regmap_read(dev->map, DW_IC_DATA_CMD, &tmp);
++ tmp &= DW_IC_DATA_CMD_DAT;
+ /* Ensure length byte is a valid value */
+- if (flags & I2C_M_RECV_LEN &&
+- (tmp & DW_IC_DATA_CMD_DAT) <= I2C_SMBUS_BLOCK_MAX && tmp > 0) {
++ if (flags & I2C_M_RECV_LEN) {
++ /*
++ * If IC_EMPTYFIFO_HOLD_MASTER_EN is set, which cannot be
++ * detected from the registers, the controller can only be
++ * disabled once the STOP bit is set. But the STOP bit is
++ * only set after the block data response length has been
++ * received in the I2C_FUNC_SMBUS_BLOCK_DATA case, so another
++ * byte with the STOP bit set must be read when the block data
++ * response length is invalid to complete the transaction.
++ */
++ if (!tmp || tmp > I2C_SMBUS_BLOCK_MAX)
++ tmp = 1;
++
+ len = i2c_dw_recv_len(dev, tmp);
+ }
+ *buf++ = tmp;
+diff --git a/drivers/i2c/busses/i2c-hisi.c b/drivers/i2c/busses/i2c-hisi.c
+index e067671b3ce2e..0980c773cb5b1 100644
+--- a/drivers/i2c/busses/i2c-hisi.c
++++ b/drivers/i2c/busses/i2c-hisi.c
+@@ -330,6 +330,14 @@ static irqreturn_t hisi_i2c_irq(int irq, void *context)
+ struct hisi_i2c_controller *ctlr = context;
+ u32 int_stat;
+
++ /*
++ * Don't handle the interrupt if ctlr->completion is NULL. We may
++ * reach here because the interrupt is spurious or the transfer is
++ * started by another port (e.g. firmware) rather than us.
++ */
++ if (!ctlr->completion)
++ return IRQ_NONE;
++
+ int_stat = readl(ctlr->iobase + HISI_I2C_INT_MSTAT);
+ hisi_i2c_clear_int(ctlr, int_stat);
+ if (!(int_stat & HISI_I2C_INT_ALL))
+diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
+index 157066f06a32d..d561cf066d705 100644
+--- a/drivers/i2c/busses/i2c-tegra.c
++++ b/drivers/i2c/busses/i2c-tegra.c
+@@ -449,7 +449,7 @@ static int tegra_i2c_init_dma(struct tegra_i2c_dev *i2c_dev)
+ if (i2c_dev->is_vi)
+ return 0;
+
+- if (!i2c_dev->hw->has_apb_dma) {
++ if (i2c_dev->hw->has_apb_dma) {
+ if (!IS_ENABLED(CONFIG_TEGRA20_APB_DMA)) {
+ dev_dbg(i2c_dev->dev, "APB DMA support not enabled\n");
+ return 0;
+diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+index 2c95e6f3d47ac..eef3ef3fabb42 100644
+--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
++++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+@@ -179,6 +179,8 @@ struct bnxt_re_dev {
+ #define BNXT_RE_ROCEV2_IPV4_PACKET 2
+ #define BNXT_RE_ROCEV2_IPV6_PACKET 3
+
++#define BNXT_RE_CHECK_RC(x) ((x) && ((x) != -ETIMEDOUT))
++
+ static inline struct device *rdev_to_dev(struct bnxt_re_dev *rdev)
+ {
+ if (rdev)
+diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+index ebe6852c40e8c..e7f153ee27541 100644
+--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
++++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+@@ -614,12 +614,20 @@ int bnxt_re_destroy_ah(struct ib_ah *ib_ah, u32 flags)
+ {
+ struct bnxt_re_ah *ah = container_of(ib_ah, struct bnxt_re_ah, ib_ah);
+ struct bnxt_re_dev *rdev = ah->rdev;
++ bool block = true;
++ int rc = 0;
+
+- bnxt_qplib_destroy_ah(&rdev->qplib_res, &ah->qplib_ah,
+- !(flags & RDMA_DESTROY_AH_SLEEPABLE));
++ block = !(flags & RDMA_DESTROY_AH_SLEEPABLE);
++ rc = bnxt_qplib_destroy_ah(&rdev->qplib_res, &ah->qplib_ah, block);
++ if (BNXT_RE_CHECK_RC(rc)) {
++ if (rc == -ETIMEDOUT)
++ rc = 0;
++ else
++ goto fail;
++ }
+ atomic_dec(&rdev->ah_count);
+-
+- return 0;
++fail:
++ return rc;
+ }
+
+ static u8 bnxt_re_stack_to_dev_nw_type(enum rdma_network_type ntype)
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+index b967a17a44beb..10919532bca29 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
++++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+@@ -468,13 +468,14 @@ int bnxt_qplib_create_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
+ return 0;
+ }
+
+-void bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
+- bool block)
++int bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
++ bool block)
+ {
+ struct bnxt_qplib_rcfw *rcfw = res->rcfw;
+ struct creq_destroy_ah_resp resp = {};
+ struct bnxt_qplib_cmdqmsg msg = {};
+ struct cmdq_destroy_ah req = {};
++ int rc;
+
+ /* Clean up the AH table in the device */
+ bnxt_qplib_rcfw_cmd_prep((struct cmdq_base *)&req,
+@@ -485,7 +486,8 @@ void bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
+
+ bnxt_qplib_fill_cmdqmsg(&msg, &req, &resp, NULL, sizeof(req),
+ sizeof(resp), block);
+- bnxt_qplib_rcfw_send_message(rcfw, &msg);
++ rc = bnxt_qplib_rcfw_send_message(rcfw, &msg);
++ return rc;
+ }
+
+ /* MRW */
+diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
+index 5de874659cdfa..4061616048e85 100644
+--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
++++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
+@@ -327,8 +327,8 @@ int bnxt_qplib_set_func_resources(struct bnxt_qplib_res *res,
+ struct bnxt_qplib_ctx *ctx);
+ int bnxt_qplib_create_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
+ bool block);
+-void bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
+- bool block);
++int bnxt_qplib_destroy_ah(struct bnxt_qplib_res *res, struct bnxt_qplib_ah *ah,
++ bool block);
+ int bnxt_qplib_alloc_mrw(struct bnxt_qplib_res *res,
+ struct bnxt_qplib_mrw *mrw);
+ int bnxt_qplib_dereg_mrw(struct bnxt_qplib_res *res, struct bnxt_qplib_mrw *mrw,
+diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
+index 54b61930a7fdb..4b3b5b274e849 100644
+--- a/drivers/infiniband/hw/mana/qp.c
++++ b/drivers/infiniband/hw/mana/qp.c
+@@ -13,7 +13,7 @@ static int mana_ib_cfg_vport_steering(struct mana_ib_dev *dev,
+ u8 *rx_hash_key)
+ {
+ struct mana_port_context *mpc = netdev_priv(ndev);
+- struct mana_cfg_rx_steer_req *req = NULL;
++ struct mana_cfg_rx_steer_req_v2 *req;
+ struct mana_cfg_rx_steer_resp resp = {};
+ mana_handle_t *req_indir_tab;
+ struct gdma_context *gc;
+@@ -33,6 +33,8 @@ static int mana_ib_cfg_vport_steering(struct mana_ib_dev *dev,
+ mana_gd_init_req_hdr(&req->hdr, MANA_CONFIG_VPORT_RX, req_buf_size,
+ sizeof(resp));
+
++ req->hdr.req.msg_version = GDMA_MESSAGE_V2;
++
+ req->vport = mpc->port_handle;
+ req->rx_enable = 1;
+ req->update_default_rxobj = 1;
+@@ -46,6 +48,7 @@ static int mana_ib_cfg_vport_steering(struct mana_ib_dev *dev,
+ req->num_indir_entries = MANA_INDIRECT_TABLE_SIZE;
+ req->indir_tab_offset = sizeof(*req);
+ req->update_indir_tab = true;
++ req->cqe_coalescing_enable = 1;
+
+ req_indir_tab = (mana_handle_t *)(req + 1);
+ /* The ind table passed to the hardware must have
+diff --git a/drivers/infiniband/hw/mlx5/qpc.c b/drivers/infiniband/hw/mlx5/qpc.c
+index bae0334d6e7f1..aec011557b4a7 100644
+--- a/drivers/infiniband/hw/mlx5/qpc.c
++++ b/drivers/infiniband/hw/mlx5/qpc.c
+@@ -298,8 +298,7 @@ int mlx5_core_destroy_qp(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp)
+ MLX5_SET(destroy_qp_in, in, opcode, MLX5_CMD_OP_DESTROY_QP);
+ MLX5_SET(destroy_qp_in, in, qpn, qp->qpn);
+ MLX5_SET(destroy_qp_in, in, uid, qp->uid);
+- mlx5_cmd_exec_in(dev->mdev, destroy_qp, in);
+- return 0;
++ return mlx5_cmd_exec_in(dev->mdev, destroy_qp, in);
+ }
+
+ int mlx5_core_set_delay_drop(struct mlx5_ib_dev *dev,
+@@ -551,14 +550,14 @@ int mlx5_core_xrcd_dealloc(struct mlx5_ib_dev *dev, u32 xrcdn)
+ return mlx5_cmd_exec_in(dev->mdev, dealloc_xrcd, in);
+ }
+
+-static void destroy_rq_tracked(struct mlx5_ib_dev *dev, u32 rqn, u16 uid)
++static int destroy_rq_tracked(struct mlx5_ib_dev *dev, u32 rqn, u16 uid)
+ {
+ u32 in[MLX5_ST_SZ_DW(destroy_rq_in)] = {};
+
+ MLX5_SET(destroy_rq_in, in, opcode, MLX5_CMD_OP_DESTROY_RQ);
+ MLX5_SET(destroy_rq_in, in, rqn, rqn);
+ MLX5_SET(destroy_rq_in, in, uid, uid);
+- mlx5_cmd_exec_in(dev->mdev, destroy_rq, in);
++ return mlx5_cmd_exec_in(dev->mdev, destroy_rq, in);
+ }
+
+ int mlx5_core_create_rq_tracked(struct mlx5_ib_dev *dev, u32 *in, int inlen,
+@@ -589,8 +588,7 @@ int mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
+ struct mlx5_core_qp *rq)
+ {
+ destroy_resource_common(dev, rq);
+- destroy_rq_tracked(dev, rq->qpn, rq->uid);
+- return 0;
++ return destroy_rq_tracked(dev, rq->qpn, rq->uid);
+ }
+
+ static void destroy_sq_tracked(struct mlx5_ib_dev *dev, u32 sqn, u16 uid)
+diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
+index 2ddbda3a43746..5a224e244be8a 100644
+--- a/drivers/iommu/amd/amd_iommu_types.h
++++ b/drivers/iommu/amd/amd_iommu_types.h
+@@ -174,6 +174,7 @@
+ #define CONTROL_GAINT_EN 29
+ #define CONTROL_XT_EN 50
+ #define CONTROL_INTCAPXT_EN 51
++#define CONTROL_IRTCACHEDIS 59
+ #define CONTROL_SNPAVIC_EN 61
+
+ #define CTRL_INV_TO_MASK (7 << CONTROL_INV_TIMEOUT)
+@@ -716,6 +717,9 @@ struct amd_iommu {
+ /* if one, we need to send a completion wait command */
+ bool need_sync;
+
++ /* true if IRTE caching is disabled */
++ bool irtcachedis_enabled;
++
+ /* Handle for IOMMU core code */
+ struct iommu_device iommu;
+
+diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
+index c2d80a4e5fb06..02846299af0ef 100644
+--- a/drivers/iommu/amd/init.c
++++ b/drivers/iommu/amd/init.c
+@@ -162,6 +162,7 @@ static int amd_iommu_xt_mode = IRQ_REMAP_XAPIC_MODE;
+ static bool amd_iommu_detected;
+ static bool amd_iommu_disabled __initdata;
+ static bool amd_iommu_force_enable __initdata;
++static bool amd_iommu_irtcachedis;
+ static int amd_iommu_target_ivhd_type;
+
+ /* Global EFR and EFR2 registers */
+@@ -484,6 +485,9 @@ static void iommu_disable(struct amd_iommu *iommu)
+
+ /* Disable IOMMU hardware itself */
+ iommu_feature_disable(iommu, CONTROL_IOMMU_EN);
++
++ /* Clear IRTE cache disabling bit */
++ iommu_feature_disable(iommu, CONTROL_IRTCACHEDIS);
+ }
+
+ /*
+@@ -2710,6 +2714,33 @@ static void iommu_enable_ga(struct amd_iommu *iommu)
+ #endif
+ }
+
++static void iommu_disable_irtcachedis(struct amd_iommu *iommu)
++{
++ iommu_feature_disable(iommu, CONTROL_IRTCACHEDIS);
++}
++
++static void iommu_enable_irtcachedis(struct amd_iommu *iommu)
++{
++ u64 ctrl;
++
++ if (!amd_iommu_irtcachedis)
++ return;
++
++ /*
++ * Note:
++ * Support for the IRTCacheDis feature is determined by
++ * checking whether the bit is writable.
++ */
++ iommu_feature_enable(iommu, CONTROL_IRTCACHEDIS);
++ ctrl = readq(iommu->mmio_base + MMIO_CONTROL_OFFSET);
++ ctrl &= (1ULL << CONTROL_IRTCACHEDIS);
++ if (ctrl)
++ iommu->irtcachedis_enabled = true;
++ pr_info("iommu%d (%#06x) : IRT cache is %s\n",
++ iommu->index, iommu->devid,
++ iommu->irtcachedis_enabled ? "disabled" : "enabled");
++}
++
+ static void early_enable_iommu(struct amd_iommu *iommu)
+ {
+ iommu_disable(iommu);
+@@ -2720,6 +2751,7 @@ static void early_enable_iommu(struct amd_iommu *iommu)
+ iommu_set_exclusion_range(iommu);
+ iommu_enable_ga(iommu);
+ iommu_enable_xt(iommu);
++ iommu_enable_irtcachedis(iommu);
+ iommu_enable(iommu);
+ iommu_flush_all_caches(iommu);
+ }
+@@ -2770,10 +2802,12 @@ static void early_enable_iommus(void)
+ for_each_iommu(iommu) {
+ iommu_disable_command_buffer(iommu);
+ iommu_disable_event_buffer(iommu);
++ iommu_disable_irtcachedis(iommu);
+ iommu_enable_command_buffer(iommu);
+ iommu_enable_event_buffer(iommu);
+ iommu_enable_ga(iommu);
+ iommu_enable_xt(iommu);
++ iommu_enable_irtcachedis(iommu);
+ iommu_set_device_table(iommu);
+ iommu_flush_all_caches(iommu);
+ }
+@@ -3426,6 +3460,8 @@ static int __init parse_amd_iommu_options(char *str)
+ amd_iommu_pgtable = AMD_IOMMU_V1;
+ } else if (strncmp(str, "pgtbl_v2", 8) == 0) {
+ amd_iommu_pgtable = AMD_IOMMU_V2;
++ } else if (strncmp(str, "irtcachedis", 11) == 0) {
++ amd_iommu_irtcachedis = true;
+ } else {
+ pr_notice("Unknown option - '%s'\n", str);
+ }
+diff --git a/drivers/leds/rgb/leds-qcom-lpg.c b/drivers/leds/rgb/leds-qcom-lpg.c
+index 1c849814a4917..212df2e3d3502 100644
+--- a/drivers/leds/rgb/leds-qcom-lpg.c
++++ b/drivers/leds/rgb/leds-qcom-lpg.c
+@@ -1173,8 +1173,10 @@ static int lpg_add_led(struct lpg *lpg, struct device_node *np)
+ i = 0;
+ for_each_available_child_of_node(np, child) {
+ ret = lpg_parse_channel(lpg, child, &led->channels[i]);
+- if (ret < 0)
++ if (ret < 0) {
++ of_node_put(child);
+ return ret;
++ }
+
+ info[i].color_index = led->channels[i]->color;
+ info[i].intensity = 0;
+@@ -1352,8 +1354,10 @@ static int lpg_probe(struct platform_device *pdev)
+
+ for_each_available_child_of_node(pdev->dev.of_node, np) {
+ ret = lpg_add_led(lpg, np);
+- if (ret)
++ if (ret) {
++ of_node_put(np);
+ return ret;
++ }
+ }
+
+ for (i = 0; i < lpg->num_channels; i++)
+diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+index 40cb3cb87ba17..60425c99a2b8b 100644
+--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
++++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+@@ -1310,6 +1310,8 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
+ jpeg->dev = &pdev->dev;
+ jpeg->variant = of_device_get_match_data(jpeg->dev);
+
++ platform_set_drvdata(pdev, jpeg);
++
+ ret = devm_of_platform_populate(&pdev->dev);
+ if (ret) {
+ v4l2_err(&jpeg->v4l2_dev, "Master of platform populate failed.");
+@@ -1381,8 +1383,6 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
+ jpeg->variant->dev_name, jpeg->vdev->num,
+ VIDEO_MAJOR, jpeg->vdev->minor);
+
+- platform_set_drvdata(pdev, jpeg);
+-
+ pm_runtime_enable(&pdev->dev);
+
+ return 0;
+diff --git a/drivers/media/platform/mediatek/vpu/mtk_vpu.c b/drivers/media/platform/mediatek/vpu/mtk_vpu.c
+index 5e2bc286f168e..1a95958a1f908 100644
+--- a/drivers/media/platform/mediatek/vpu/mtk_vpu.c
++++ b/drivers/media/platform/mediatek/vpu/mtk_vpu.c
+@@ -562,15 +562,17 @@ static int load_requested_vpu(struct mtk_vpu *vpu,
+ int vpu_load_firmware(struct platform_device *pdev)
+ {
+ struct mtk_vpu *vpu;
+- struct device *dev = &pdev->dev;
++ struct device *dev;
+ struct vpu_run *run;
+ int ret;
+
+ if (!pdev) {
+- dev_err(dev, "VPU platform device is invalid\n");
++ pr_err("VPU platform device is invalid\n");
+ return -EINVAL;
+ }
+
++ dev = &pdev->dev;
++
+ vpu = platform_get_drvdata(pdev);
+ run = &vpu->run;
+
+diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/media/platform/qcom/camss/camss-vfe.c
+index e0832f3f4f25c..06c95568e5af4 100644
+--- a/drivers/media/platform/qcom/camss/camss-vfe.c
++++ b/drivers/media/platform/qcom/camss/camss-vfe.c
+@@ -1541,7 +1541,11 @@ int msm_vfe_register_entities(struct vfe_device *vfe,
+ }
+
+ video_out->ops = &vfe->video_ops;
+- video_out->bpl_alignment = 8;
++ if (vfe->camss->version == CAMSS_845 ||
++ vfe->camss->version == CAMSS_8250)
++ video_out->bpl_alignment = 16;
++ else
++ video_out->bpl_alignment = 8;
+ video_out->line_based = 0;
+ if (i == VFE_LINE_PIX) {
+ video_out->bpl_alignment = 16;
+diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c
+index 35453f81c1d97..c06f8ca9e09ec 100644
+--- a/drivers/media/usb/uvc/uvc_v4l2.c
++++ b/drivers/media/usb/uvc/uvc_v4l2.c
+@@ -45,7 +45,7 @@ static int uvc_control_add_xu_mapping(struct uvc_video_chain *chain,
+ map->menu_names = NULL;
+ map->menu_mapping = NULL;
+
+- map->menu_mask = BIT_MASK(xmap->menu_count);
++ map->menu_mask = GENMASK(xmap->menu_count - 1, 0);
+
+ size = xmap->menu_count * sizeof(*map->menu_mapping);
+ map->menu_mapping = kzalloc(size, GFP_KERNEL);
+diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
+index e46330815484d..5d6c16adb50da 100644
+--- a/drivers/mmc/core/block.c
++++ b/drivers/mmc/core/block.c
+@@ -2097,14 +2097,14 @@ static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
+ mmc_blk_urgent_bkops(mq, mqrq);
+ }
+
+-static void mmc_blk_mq_dec_in_flight(struct mmc_queue *mq, struct request *req)
++static void mmc_blk_mq_dec_in_flight(struct mmc_queue *mq, enum mmc_issue_type issue_type)
+ {
+ unsigned long flags;
+ bool put_card;
+
+ spin_lock_irqsave(&mq->lock, flags);
+
+- mq->in_flight[mmc_issue_type(mq, req)] -= 1;
++ mq->in_flight[issue_type] -= 1;
+
+ put_card = (mmc_tot_in_flight(mq) == 0);
+
+@@ -2117,6 +2117,7 @@ static void mmc_blk_mq_dec_in_flight(struct mmc_queue *mq, struct request *req)
+ static void mmc_blk_mq_post_req(struct mmc_queue *mq, struct request *req,
+ bool can_sleep)
+ {
++ enum mmc_issue_type issue_type = mmc_issue_type(mq, req);
+ struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+ struct mmc_request *mrq = &mqrq->brq.mrq;
+ struct mmc_host *host = mq->card->host;
+@@ -2136,7 +2137,7 @@ static void mmc_blk_mq_post_req(struct mmc_queue *mq, struct request *req,
+ blk_mq_complete_request(req);
+ }
+
+- mmc_blk_mq_dec_in_flight(mq, req);
++ mmc_blk_mq_dec_in_flight(mq, issue_type);
+ }
+
+ void mmc_blk_mq_recovery(struct mmc_queue *mq)
+diff --git a/drivers/mmc/host/sdhci_f_sdh30.c b/drivers/mmc/host/sdhci_f_sdh30.c
+index b01ffb4d09737..3215063bcf868 100644
+--- a/drivers/mmc/host/sdhci_f_sdh30.c
++++ b/drivers/mmc/host/sdhci_f_sdh30.c
+@@ -210,13 +210,16 @@ static int sdhci_f_sdh30_remove(struct platform_device *pdev)
+ {
+ struct sdhci_host *host = platform_get_drvdata(pdev);
+ struct f_sdhost_priv *priv = sdhci_f_sdhost_priv(host);
+-
+- reset_control_assert(priv->rst);
+- clk_disable_unprepare(priv->clk);
+- clk_disable_unprepare(priv->clk_iface);
++ struct clk *clk_iface = priv->clk_iface;
++ struct reset_control *rst = priv->rst;
++ struct clk *clk = priv->clk;
+
+ sdhci_pltfm_unregister(pdev);
+
++ reset_control_assert(rst);
++ clk_disable_unprepare(clk);
++ clk_disable_unprepare(clk_iface);
++
+ return 0;
+ }
+
+diff --git a/drivers/mmc/host/sunplus-mmc.c b/drivers/mmc/host/sunplus-mmc.c
+index db5e0dcdfa7f3..2bdebeb1f8e49 100644
+--- a/drivers/mmc/host/sunplus-mmc.c
++++ b/drivers/mmc/host/sunplus-mmc.c
+@@ -863,11 +863,9 @@ static int spmmc_drv_probe(struct platform_device *pdev)
+ struct spmmc_host *host;
+ int ret = 0;
+
+- mmc = mmc_alloc_host(sizeof(*host), &pdev->dev);
+- if (!mmc) {
+- ret = -ENOMEM;
+- goto probe_free_host;
+- }
++ mmc = devm_mmc_alloc_host(&pdev->dev, sizeof(struct spmmc_host));
++ if (!mmc)
++ return -ENOMEM;
+
+ host = mmc_priv(mmc);
+ host->mmc = mmc;
+@@ -902,7 +900,7 @@ static int spmmc_drv_probe(struct platform_device *pdev)
+
+ ret = mmc_of_parse(mmc);
+ if (ret)
+- goto probe_free_host;
++ goto clk_disable;
+
+ mmc->ops = &spmmc_ops;
+ mmc->f_min = SPMMC_MIN_CLK;
+@@ -911,7 +909,7 @@ static int spmmc_drv_probe(struct platform_device *pdev)
+
+ ret = mmc_regulator_get_supply(mmc);
+ if (ret)
+- goto probe_free_host;
++ goto clk_disable;
+
+ if (!mmc->ocr_avail)
+ mmc->ocr_avail = MMC_VDD_32_33 | MMC_VDD_33_34;
+@@ -927,14 +925,17 @@ static int spmmc_drv_probe(struct platform_device *pdev)
+ host->tuning_info.enable_tuning = 1;
+ pm_runtime_set_active(&pdev->dev);
+ pm_runtime_enable(&pdev->dev);
+- mmc_add_host(mmc);
++ ret = mmc_add_host(mmc);
++ if (ret)
++ goto pm_disable;
+
+- return ret;
++ return 0;
+
+-probe_free_host:
+- if (mmc)
+- mmc_free_host(mmc);
++pm_disable:
++ pm_runtime_disable(&pdev->dev);
+
++clk_disable:
++ clk_disable_unprepare(host->clk);
+ return ret;
+ }
+
+@@ -948,7 +949,6 @@ static int spmmc_drv_remove(struct platform_device *dev)
+ pm_runtime_put_noidle(&dev->dev);
+ pm_runtime_disable(&dev->dev);
+ platform_set_drvdata(dev, NULL);
+- mmc_free_host(host->mmc);
+
+ return 0;
+ }
+diff --git a/drivers/mmc/host/wbsd.c b/drivers/mmc/host/wbsd.c
+index 521af9251f335..bf2a92fba0ed8 100644
+--- a/drivers/mmc/host/wbsd.c
++++ b/drivers/mmc/host/wbsd.c
+@@ -1705,8 +1705,6 @@ static int wbsd_init(struct device *dev, int base, int irq, int dma,
+
+ wbsd_release_resources(host);
+ wbsd_free_mmc(dev);
+-
+- mmc_free_host(mmc);
+ return ret;
+ }
+
+diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
+index 642e93e8623eb..8c9d05a1fe667 100644
+--- a/drivers/net/dsa/mv88e6xxx/chip.c
++++ b/drivers/net/dsa/mv88e6xxx/chip.c
+@@ -3006,6 +3006,14 @@ static void mv88e6xxx_hardware_reset(struct mv88e6xxx_chip *chip)
+
+ /* If there is a GPIO connected to the reset pin, toggle it */
+ if (gpiod) {
++ /* If the switch has just been reset and not yet completed
++ * loading EEPROM, the reset may interrupt the I2C transaction
++ * mid-byte, causing the first EEPROM read after the reset
++ * to come from the wrong location, resulting in the switch
++ * booting into the wrong mode and becoming inoperable.
++ */
++ mv88e6xxx_g1_wait_eeprom_done(chip);
++
+ gpiod_set_value_cansleep(gpiod, 1);
+ usleep_range(10000, 20000);
+ gpiod_set_value_cansleep(gpiod, 0);
+diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
+index 29a1199dad146..3fbe15b3ac627 100644
+--- a/drivers/net/ethernet/cadence/macb_main.c
++++ b/drivers/net/ethernet/cadence/macb_main.c
+@@ -5159,6 +5159,9 @@ static int __maybe_unused macb_suspend(struct device *dev)
+ unsigned int q;
+ int err;
+
++ if (!device_may_wakeup(&bp->dev->dev))
++ phy_exit(bp->sgmii_phy);
++
+ if (!netif_running(netdev))
+ return 0;
+
+@@ -5219,7 +5222,6 @@ static int __maybe_unused macb_suspend(struct device *dev)
+ if (!(bp->wol & MACB_WOL_ENABLED)) {
+ rtnl_lock();
+ phylink_stop(bp->phylink);
+- phy_exit(bp->sgmii_phy);
+ rtnl_unlock();
+ spin_lock_irqsave(&bp->lock, flags);
+ macb_reset_hw(bp);
+@@ -5249,6 +5251,9 @@ static int __maybe_unused macb_resume(struct device *dev)
+ unsigned int q;
+ int err;
+
++ if (!device_may_wakeup(&bp->dev->dev))
++ phy_init(bp->sgmii_phy);
++
+ if (!netif_running(netdev))
+ return 0;
+
+@@ -5309,8 +5314,6 @@ static int __maybe_unused macb_resume(struct device *dev)
+ macb_set_rx_mode(netdev);
+ macb_restore_features(bp);
+ rtnl_lock();
+- if (!device_may_wakeup(&bp->dev->dev))
+- phy_init(bp->sgmii_phy);
+
+ phylink_start(bp->phylink);
+ rtnl_unlock();
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+index 9da0c87f03288..f99c1f7fec406 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+@@ -210,11 +210,11 @@ read_nvm_exit:
+ * @hw: pointer to the HW structure.
+ * @module_pointer: module pointer location in words from the NVM beginning
+ * @offset: offset in words from module start
+- * @words: number of words to write
+- * @data: buffer with words to write to the Shadow RAM
++ * @words: number of words to read
++ * @data: buffer for the words read from the Shadow RAM
+ * @last_command: tells the AdminQ that this is the last command
+ *
+- * Writes a 16 bit words buffer to the Shadow RAM using the admin command.
++ * Reads a buffer of 16 bit words from the Shadow RAM using the admin command.
+ **/
+ static int i40e_read_nvm_aq(struct i40e_hw *hw,
+ u8 module_pointer, u32 offset,
+@@ -234,18 +234,18 @@ static int i40e_read_nvm_aq(struct i40e_hw *hw,
+ */
+ if ((offset + words) > hw->nvm.sr_size)
+ i40e_debug(hw, I40E_DEBUG_NVM,
+- "NVM write error: offset %d beyond Shadow RAM limit %d\n",
++ "NVM read error: offset %d beyond Shadow RAM limit %d\n",
+ (offset + words), hw->nvm.sr_size);
+ else if (words > I40E_SR_SECTOR_SIZE_IN_WORDS)
+- /* We can write only up to 4KB (one sector), in one AQ write */
++ /* We can read only up to 4KB (one sector) in one AQ read */
+ i40e_debug(hw, I40E_DEBUG_NVM,
+- "NVM write fail error: tried to write %d words, limit is %d.\n",
++ "NVM read fail error: tried to read %d words, limit is %d.\n",
+ words, I40E_SR_SECTOR_SIZE_IN_WORDS);
+ else if (((offset + (words - 1)) / I40E_SR_SECTOR_SIZE_IN_WORDS)
+ != (offset / I40E_SR_SECTOR_SIZE_IN_WORDS))
+- /* A single write cannot spread over two sectors */
++ /* A single read cannot spread over two sectors */
+ i40e_debug(hw, I40E_DEBUG_NVM,
+- "NVM write error: cannot spread over two sectors in a single write offset=%d words=%d\n",
++ "NVM read error: cannot spread over two sectors in a single read offset=%d words=%d\n",
+ offset, words);
+ else
+ ret_code = i40e_aq_read_nvm(hw, module_pointer,
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+index 460ca561819a9..a34303ad057d0 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+@@ -1289,6 +1289,7 @@ iavf_add_fdir_fltr_info(struct iavf_adapter *adapter, struct ethtool_rx_flow_spe
+ fltr->ip_mask.src_port = fsp->m_u.tcp_ip4_spec.psrc;
+ fltr->ip_mask.dst_port = fsp->m_u.tcp_ip4_spec.pdst;
+ fltr->ip_mask.tos = fsp->m_u.tcp_ip4_spec.tos;
++ fltr->ip_ver = 4;
+ break;
+ case AH_V4_FLOW:
+ case ESP_V4_FLOW:
+@@ -1300,6 +1301,7 @@ iavf_add_fdir_fltr_info(struct iavf_adapter *adapter, struct ethtool_rx_flow_spe
+ fltr->ip_mask.v4_addrs.dst_ip = fsp->m_u.ah_ip4_spec.ip4dst;
+ fltr->ip_mask.spi = fsp->m_u.ah_ip4_spec.spi;
+ fltr->ip_mask.tos = fsp->m_u.ah_ip4_spec.tos;
++ fltr->ip_ver = 4;
+ break;
+ case IPV4_USER_FLOW:
+ fltr->ip_data.v4_addrs.src_ip = fsp->h_u.usr_ip4_spec.ip4src;
+@@ -1312,6 +1314,7 @@ iavf_add_fdir_fltr_info(struct iavf_adapter *adapter, struct ethtool_rx_flow_spe
+ fltr->ip_mask.l4_header = fsp->m_u.usr_ip4_spec.l4_4_bytes;
+ fltr->ip_mask.tos = fsp->m_u.usr_ip4_spec.tos;
+ fltr->ip_mask.proto = fsp->m_u.usr_ip4_spec.proto;
++ fltr->ip_ver = 4;
+ break;
+ case TCP_V6_FLOW:
+ case UDP_V6_FLOW:
+@@ -1330,6 +1333,7 @@ iavf_add_fdir_fltr_info(struct iavf_adapter *adapter, struct ethtool_rx_flow_spe
+ fltr->ip_mask.src_port = fsp->m_u.tcp_ip6_spec.psrc;
+ fltr->ip_mask.dst_port = fsp->m_u.tcp_ip6_spec.pdst;
+ fltr->ip_mask.tclass = fsp->m_u.tcp_ip6_spec.tclass;
++ fltr->ip_ver = 6;
+ break;
+ case AH_V6_FLOW:
+ case ESP_V6_FLOW:
+@@ -1345,6 +1349,7 @@ iavf_add_fdir_fltr_info(struct iavf_adapter *adapter, struct ethtool_rx_flow_spe
+ sizeof(struct in6_addr));
+ fltr->ip_mask.spi = fsp->m_u.ah_ip6_spec.spi;
+ fltr->ip_mask.tclass = fsp->m_u.ah_ip6_spec.tclass;
++ fltr->ip_ver = 6;
+ break;
+ case IPV6_USER_FLOW:
+ memcpy(&fltr->ip_data.v6_addrs.src_ip, fsp->h_u.usr_ip6_spec.ip6src,
+@@ -1361,6 +1366,7 @@ iavf_add_fdir_fltr_info(struct iavf_adapter *adapter, struct ethtool_rx_flow_spe
+ fltr->ip_mask.l4_header = fsp->m_u.usr_ip6_spec.l4_4_bytes;
+ fltr->ip_mask.tclass = fsp->m_u.usr_ip6_spec.tclass;
+ fltr->ip_mask.proto = fsp->m_u.usr_ip6_spec.l4_proto;
++ fltr->ip_ver = 6;
+ break;
+ case ETHER_FLOW:
+ fltr->eth_data.etype = fsp->h_u.ether_spec.h_proto;
+@@ -1371,6 +1377,10 @@ iavf_add_fdir_fltr_info(struct iavf_adapter *adapter, struct ethtool_rx_flow_spe
+ return -EINVAL;
+ }
+
++ err = iavf_validate_fdir_fltr_masks(adapter, fltr);
++ if (err)
++ return err;
++
+ if (iavf_fdir_is_dup_fltr(adapter, fltr))
+ return -EEXIST;
+
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_fdir.c b/drivers/net/ethernet/intel/iavf/iavf_fdir.c
+index 505e82ebafe47..03e774bd2a5b4 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_fdir.c
++++ b/drivers/net/ethernet/intel/iavf/iavf_fdir.c
+@@ -18,6 +18,79 @@ static const struct in6_addr ipv6_addr_full_mask = {
+ }
+ };
+
++static const struct in6_addr ipv6_addr_zero_mask = {
++ .in6_u = {
++ .u6_addr8 = {
++ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
++ }
++ }
++};
++
++/**
++ * iavf_validate_fdir_fltr_masks - validate Flow Director filter fields masks
++ * @adapter: pointer to the VF adapter structure
++ * @fltr: Flow Director filter data structure
++ *
++ * Returns 0 if all masks of packet fields are either full or empty. Returns
++ * error on at least one partial mask.
++ */
++int iavf_validate_fdir_fltr_masks(struct iavf_adapter *adapter,
++ struct iavf_fdir_fltr *fltr)
++{
++ if (fltr->eth_mask.etype && fltr->eth_mask.etype != htons(U16_MAX))
++ goto partial_mask;
++
++ if (fltr->ip_ver == 4) {
++ if (fltr->ip_mask.v4_addrs.src_ip &&
++ fltr->ip_mask.v4_addrs.src_ip != htonl(U32_MAX))
++ goto partial_mask;
++
++ if (fltr->ip_mask.v4_addrs.dst_ip &&
++ fltr->ip_mask.v4_addrs.dst_ip != htonl(U32_MAX))
++ goto partial_mask;
++
++ if (fltr->ip_mask.tos && fltr->ip_mask.tos != U8_MAX)
++ goto partial_mask;
++ } else if (fltr->ip_ver == 6) {
++ if (memcmp(&fltr->ip_mask.v6_addrs.src_ip, &ipv6_addr_zero_mask,
++ sizeof(struct in6_addr)) &&
++ memcmp(&fltr->ip_mask.v6_addrs.src_ip, &ipv6_addr_full_mask,
++ sizeof(struct in6_addr)))
++ goto partial_mask;
++
++ if (memcmp(&fltr->ip_mask.v6_addrs.dst_ip, &ipv6_addr_zero_mask,
++ sizeof(struct in6_addr)) &&
++ memcmp(&fltr->ip_mask.v6_addrs.dst_ip, &ipv6_addr_full_mask,
++ sizeof(struct in6_addr)))
++ goto partial_mask;
++
++ if (fltr->ip_mask.tclass && fltr->ip_mask.tclass != U8_MAX)
++ goto partial_mask;
++ }
++
++ if (fltr->ip_mask.proto && fltr->ip_mask.proto != U8_MAX)
++ goto partial_mask;
++
++ if (fltr->ip_mask.src_port && fltr->ip_mask.src_port != htons(U16_MAX))
++ goto partial_mask;
++
++ if (fltr->ip_mask.dst_port && fltr->ip_mask.dst_port != htons(U16_MAX))
++ goto partial_mask;
++
++ if (fltr->ip_mask.spi && fltr->ip_mask.spi != htonl(U32_MAX))
++ goto partial_mask;
++
++ if (fltr->ip_mask.l4_header &&
++ fltr->ip_mask.l4_header != htonl(U32_MAX))
++ goto partial_mask;
++
++ return 0;
++
++partial_mask:
++ dev_err(&adapter->pdev->dev, "Failed to add Flow Director filter, partial masks are not supported\n");
++ return -EOPNOTSUPP;
++}
++
+ /**
+ * iavf_pkt_udp_no_pay_len - the length of UDP packet without payload
+ * @fltr: Flow Director filter data structure
+@@ -263,8 +336,6 @@ iavf_fill_fdir_ip4_hdr(struct iavf_fdir_fltr *fltr,
+ VIRTCHNL_ADD_PROTO_HDR_FIELD_BIT(hdr, IPV4, DST);
+ }
+
+- fltr->ip_ver = 4;
+-
+ return 0;
+ }
+
+@@ -309,8 +380,6 @@ iavf_fill_fdir_ip6_hdr(struct iavf_fdir_fltr *fltr,
+ VIRTCHNL_ADD_PROTO_HDR_FIELD_BIT(hdr, IPV6, DST);
+ }
+
+- fltr->ip_ver = 6;
+-
+ return 0;
+ }
+
+diff --git a/drivers/net/ethernet/intel/iavf/iavf_fdir.h b/drivers/net/ethernet/intel/iavf/iavf_fdir.h
+index 33c55c366315b..9eb9f73f6adf3 100644
+--- a/drivers/net/ethernet/intel/iavf/iavf_fdir.h
++++ b/drivers/net/ethernet/intel/iavf/iavf_fdir.h
+@@ -110,6 +110,8 @@ struct iavf_fdir_fltr {
+ struct virtchnl_fdir_add vc_add_msg;
+ };
+
++int iavf_validate_fdir_fltr_masks(struct iavf_adapter *adapter,
++ struct iavf_fdir_fltr *fltr);
+ int iavf_fill_fdir_add_msg(struct iavf_adapter *adapter, struct iavf_fdir_fltr *fltr);
+ void iavf_print_fdir_fltr(struct iavf_adapter *adapter, struct iavf_fdir_fltr *fltr);
+ bool iavf_fdir_is_dup_fltr(struct iavf_adapter *adapter, struct iavf_fdir_fltr *fltr);
+diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c
+index f6dd3f8fd936e..03e5139849462 100644
+--- a/drivers/net/ethernet/intel/ice/ice_eswitch.c
++++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c
+@@ -568,6 +568,12 @@ ice_eswitch_mode_set(struct devlink *devlink, u16 mode,
+ break;
+ case DEVLINK_ESWITCH_MODE_SWITCHDEV:
+ {
++ if (ice_is_adq_active(pf)) {
++ dev_err(ice_pf_to_dev(pf), "Couldn't change eswitch mode to switchdev - ADQ is active. Delete ADQ configs and try again, e.g. tc qdisc del dev $PF root");
++ NL_SET_ERR_MSG_MOD(extack, "Couldn't change eswitch mode to switchdev - ADQ is active. Delete ADQ configs and try again, e.g. tc qdisc del dev $PF root");
++ return -EOPNOTSUPP;
++ }
++
+ dev_info(ice_pf_to_dev(pf), "PF %d changed eswitch mode to switchdev",
+ pf->hw.pf_id);
+ NL_SET_ERR_MSG_MOD(extack, "Changed eswitch mode to switchdev");
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index 34e8e7cb1bc54..cfb76612bd2f9 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -9065,6 +9065,11 @@ ice_setup_tc(struct net_device *netdev, enum tc_setup_type type,
+ ice_setup_tc_block_cb,
+ np, np, true);
+ case TC_SETUP_QDISC_MQPRIO:
++ if (ice_is_eswitch_mode_switchdev(pf)) {
++ netdev_err(netdev, "TC MQPRIO offload not supported, switchdev is enabled\n");
++ return -EOPNOTSUPP;
++ }
++
+ if (pf->adev) {
+ mutex_lock(&pf->adev_mutex);
+ device_lock(&pf->adev->dev);
+diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.c b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.c
+index 1cc6af2feb38a..565320ec24f81 100644
+--- a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.c
++++ b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.c
+@@ -55,7 +55,7 @@ static int octep_send_mbox_req(struct octep_device *oct,
+ list_add_tail(&d->list, &oct->ctrl_req_wait_list);
+ ret = wait_event_interruptible_timeout(oct->ctrl_req_wait_q,
+ (d->done != 0),
+- jiffies + msecs_to_jiffies(500));
++ msecs_to_jiffies(500));
+ list_del(&d->list);
+ if (ret == 0 || ret == 1)
+ return -EAGAIN;
+diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
+index 43eb6e8713511..4424de2ffd70c 100644
+--- a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
++++ b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
+@@ -1038,6 +1038,10 @@ static void octep_device_cleanup(struct octep_device *oct)
+ {
+ int i;
+
++ oct->poll_non_ioq_intr = false;
++ cancel_delayed_work_sync(&oct->intr_poll_task);
++ cancel_work_sync(&oct->ctrl_mbox_task);
++
+ dev_info(&oct->pdev->dev, "Cleaning up Octeon Device ...\n");
+
+ for (i = 0; i < OCTEP_MAX_VF; i++) {
+@@ -1200,14 +1204,11 @@ static void octep_remove(struct pci_dev *pdev)
+ if (!oct)
+ return;
+
+- cancel_work_sync(&oct->tx_timeout_task);
+- cancel_work_sync(&oct->ctrl_mbox_task);
+ netdev = oct->netdev;
+ if (netdev->reg_state == NETREG_REGISTERED)
+ unregister_netdev(netdev);
+
+- oct->poll_non_ioq_intr = false;
+- cancel_delayed_work_sync(&oct->intr_poll_task);
++ cancel_work_sync(&oct->tx_timeout_task);
+ octep_device_cleanup(oct);
+ pci_release_mem_regions(pdev);
+ free_netdev(netdev);
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+index 9e8e6184f9e43..ecfe93a479da8 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+@@ -84,6 +84,8 @@ enum mlx5e_xdp_xmit_mode {
+ * MLX5E_XDP_XMIT_MODE_XSK:
+ * none.
+ */
++#define MLX5E_XDP_FIFO_ENTRIES2DS_MAX_RATIO 4
++
+ union mlx5e_xdp_info {
+ enum mlx5e_xdp_xmit_mode mode;
+ union {
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+index 7e6d0489854e3..975c82df345cd 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+@@ -1298,11 +1298,13 @@ static int mlx5e_alloc_xdpsq_fifo(struct mlx5e_xdpsq *sq, int numa)
+ {
+ struct mlx5e_xdp_info_fifo *xdpi_fifo = &sq->db.xdpi_fifo;
+ int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
+- int entries = wq_sz * MLX5_SEND_WQEBB_NUM_DS * 2; /* upper bound for maximum num of
+- * entries of all xmit_modes.
+- */
++ int entries;
+ size_t size;
+
++ /* upper bound for maximum num of entries of all xmit_modes. */
++ entries = roundup_pow_of_two(wq_sz * MLX5_SEND_WQEBB_NUM_DS *
++ MLX5E_XDP_FIFO_ENTRIES2DS_MAX_RATIO);
++
+ size = array_size(sizeof(*xdpi_fifo->xi), entries);
+ xdpi_fifo->xi = kvzalloc_node(size, GFP_KERNEL, numa);
+ if (!xdpi_fifo->xi)
+diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
+index 96c78f7db2543..7441577294bad 100644
+--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
++++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
+@@ -973,7 +973,7 @@ static int mana_cfg_vport_steering(struct mana_port_context *apc,
+ bool update_tab)
+ {
+ u16 num_entries = MANA_INDIRECT_TABLE_SIZE;
+- struct mana_cfg_rx_steer_req *req = NULL;
++ struct mana_cfg_rx_steer_req_v2 *req;
+ struct mana_cfg_rx_steer_resp resp = {};
+ struct net_device *ndev = apc->ndev;
+ mana_handle_t *req_indir_tab;
+@@ -988,6 +988,8 @@ static int mana_cfg_vport_steering(struct mana_port_context *apc,
+ mana_gd_init_req_hdr(&req->hdr, MANA_CONFIG_VPORT_RX, req_buf_size,
+ sizeof(resp));
+
++ req->hdr.req.msg_version = GDMA_MESSAGE_V2;
++
+ req->vport = apc->port_handle;
+ req->num_indir_entries = num_entries;
+ req->indir_tab_offset = sizeof(*req);
+@@ -997,6 +999,7 @@ static int mana_cfg_vport_steering(struct mana_port_context *apc,
+ req->update_hashkey = update_key;
+ req->update_indir_tab = update_tab;
+ req->default_rxobj = apc->default_rxobj;
++ req->cqe_coalescing_enable = 0;
+
+ if (update_key)
+ memcpy(&req->hashkey, apc->hashkey, MANA_HASH_KEY_SIZE);
+diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
+index 4b004a7281903..99df00c30b8c6 100644
+--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
++++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
+@@ -176,6 +176,15 @@ static int qede_sriov_configure(struct pci_dev *pdev, int num_vfs_param)
+ }
+ #endif
+
++static int __maybe_unused qede_suspend(struct device *dev)
++{
++ dev_info(dev, "Device does not support suspend operation\n");
++
++ return -EOPNOTSUPP;
++}
++
++static DEFINE_SIMPLE_DEV_PM_OPS(qede_pm_ops, qede_suspend, NULL);
++
+ static const struct pci_error_handlers qede_err_handler = {
+ .error_detected = qede_io_error_detected,
+ };
+@@ -190,6 +199,7 @@ static struct pci_driver qede_pci_driver = {
+ .sriov_configure = qede_sriov_configure,
+ #endif
+ .err_handler = &qede_err_handler,
++ .driver.pm = &qede_pm_ops,
+ };
+
+ static struct qed_eth_cb_ops qede_ll_ops = {
+diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
+index 7adde9639c8ab..35d8e9811998d 100644
+--- a/drivers/net/ethernet/sfc/ef100_nic.c
++++ b/drivers/net/ethernet/sfc/ef100_nic.c
+@@ -1194,7 +1194,7 @@ int ef100_probe_netdev_pf(struct efx_nic *efx)
+ net_dev->features |= NETIF_F_HW_TC;
+ efx->fixed_features |= NETIF_F_HW_TC;
+ }
+- return rc;
++ return 0;
+ }
+
+ int ef100_probe_vf(struct efx_nic *efx)
+diff --git a/drivers/net/ethernet/sfc/tc.c b/drivers/net/ethernet/sfc/tc.c
+index d7827ab3761f9..6c8dfe0a64824 100644
+--- a/drivers/net/ethernet/sfc/tc.c
++++ b/drivers/net/ethernet/sfc/tc.c
+@@ -1310,6 +1310,58 @@ void efx_tc_deconfigure_default_rule(struct efx_nic *efx,
+ rule->fw_id = MC_CMD_MAE_ACTION_RULE_INSERT_OUT_ACTION_RULE_ID_NULL;
+ }
+
++static int efx_tc_configure_fallback_acts(struct efx_nic *efx, u32 eg_port,
++ struct efx_tc_action_set_list *acts)
++{
++ struct efx_tc_action_set *act;
++ int rc;
++
++ act = kzalloc(sizeof(*act), GFP_KERNEL);
++ if (!act)
++ return -ENOMEM;
++ act->deliver = 1;
++ act->dest_mport = eg_port;
++ rc = efx_mae_alloc_action_set(efx, act);
++ if (rc)
++ goto fail1;
++ EFX_WARN_ON_PARANOID(!list_empty(&acts->list));
++ list_add_tail(&act->list, &acts->list);
++ rc = efx_mae_alloc_action_set_list(efx, acts);
++ if (rc)
++ goto fail2;
++ return 0;
++fail2:
++ list_del(&act->list);
++ efx_mae_free_action_set(efx, act->fw_id);
++fail1:
++ kfree(act);
++ return rc;
++}
++
++static int efx_tc_configure_fallback_acts_pf(struct efx_nic *efx)
++{
++ struct efx_tc_action_set_list *acts = &efx->tc->facts.pf;
++ u32 eg_port;
++
++ efx_mae_mport_uplink(efx, &eg_port);
++ return efx_tc_configure_fallback_acts(efx, eg_port, acts);
++}
++
++static int efx_tc_configure_fallback_acts_reps(struct efx_nic *efx)
++{
++ struct efx_tc_action_set_list *acts = &efx->tc->facts.reps;
++ u32 eg_port;
++
++ efx_mae_mport_mport(efx, efx->tc->reps_mport_id, &eg_port);
++ return efx_tc_configure_fallback_acts(efx, eg_port, acts);
++}
++
++static void efx_tc_deconfigure_fallback_acts(struct efx_nic *efx,
++ struct efx_tc_action_set_list *acts)
++{
++ efx_tc_free_action_set_list(efx, acts, true);
++}
++
+ static int efx_tc_configure_rep_mport(struct efx_nic *efx)
+ {
+ u32 rep_mport_label;
+@@ -1402,10 +1454,16 @@ int efx_init_tc(struct efx_nic *efx)
+ rc = efx_tc_configure_rep_mport(efx);
+ if (rc)
+ return rc;
+- efx->tc->up = true;
++ rc = efx_tc_configure_fallback_acts_pf(efx);
++ if (rc)
++ return rc;
++ rc = efx_tc_configure_fallback_acts_reps(efx);
++ if (rc)
++ return rc;
+ rc = flow_indr_dev_register(efx_tc_indr_setup_cb, efx);
+ if (rc)
+ return rc;
++ efx->tc->up = true;
+ return 0;
+ }
+
+@@ -1419,6 +1477,8 @@ void efx_fini_tc(struct efx_nic *efx)
+ efx_tc_deconfigure_rep_mport(efx);
+ efx_tc_deconfigure_default_rule(efx, &efx->tc->dflt.pf);
+ efx_tc_deconfigure_default_rule(efx, &efx->tc->dflt.wire);
++ efx_tc_deconfigure_fallback_acts(efx, &efx->tc->facts.pf);
++ efx_tc_deconfigure_fallback_acts(efx, &efx->tc->facts.reps);
+ efx->tc->up = false;
+ }
+
+@@ -1483,6 +1543,10 @@ int efx_init_struct_tc(struct efx_nic *efx)
+ efx->tc->dflt.pf.fw_id = MC_CMD_MAE_ACTION_RULE_INSERT_OUT_ACTION_RULE_ID_NULL;
+ INIT_LIST_HEAD(&efx->tc->dflt.wire.acts.list);
+ efx->tc->dflt.wire.fw_id = MC_CMD_MAE_ACTION_RULE_INSERT_OUT_ACTION_RULE_ID_NULL;
++ INIT_LIST_HEAD(&efx->tc->facts.pf.list);
++ efx->tc->facts.pf.fw_id = MC_CMD_MAE_ACTION_SET_ALLOC_OUT_ACTION_SET_ID_NULL;
++ INIT_LIST_HEAD(&efx->tc->facts.reps.list);
++ efx->tc->facts.reps.fw_id = MC_CMD_MAE_ACTION_SET_ALLOC_OUT_ACTION_SET_ID_NULL;
+ efx->extra_channel_type[EFX_EXTRA_CHANNEL_TC] = &efx_tc_channel_type;
+ return 0;
+ fail_match_action_ht:
+@@ -1508,6 +1572,10 @@ void efx_fini_struct_tc(struct efx_nic *efx)
+ MC_CMD_MAE_ACTION_RULE_INSERT_OUT_ACTION_RULE_ID_NULL);
+ EFX_WARN_ON_PARANOID(efx->tc->dflt.wire.fw_id !=
+ MC_CMD_MAE_ACTION_RULE_INSERT_OUT_ACTION_RULE_ID_NULL);
++ EFX_WARN_ON_PARANOID(efx->tc->facts.pf.fw_id !=
++ MC_CMD_MAE_ACTION_SET_LIST_ALLOC_OUT_ACTION_SET_LIST_ID_NULL);
++ EFX_WARN_ON_PARANOID(efx->tc->facts.reps.fw_id !=
++ MC_CMD_MAE_ACTION_SET_LIST_ALLOC_OUT_ACTION_SET_LIST_ID_NULL);
+ rhashtable_free_and_destroy(&efx->tc->match_action_ht, efx_tc_flow_free,
+ efx);
+ rhashtable_free_and_destroy(&efx->tc->encap_match_ht,
+diff --git a/drivers/net/ethernet/sfc/tc.h b/drivers/net/ethernet/sfc/tc.h
+index 04cced6a2d39f..2b6782e9c7226 100644
+--- a/drivers/net/ethernet/sfc/tc.h
++++ b/drivers/net/ethernet/sfc/tc.h
+@@ -133,6 +133,11 @@ enum efx_tc_rule_prios {
+ * %EFX_TC_PRIO_DFLT. Named by *ingress* port
+ * @dflt.pf: rule for traffic ingressing from PF (egresses to wire)
+ * @dflt.wire: rule for traffic ingressing from wire (egresses to PF)
++ * @facts: Fallback action-set-lists for unready rules. Named by *egress* port
++ * @facts.pf: action-set-list for unready rules on PF netdev, hence applying to
++ * traffic from wire, and egressing to PF
++ * @facts.reps: action-set-list for unready rules on representors, hence
++ * applying to traffic from representees, and egressing to the reps mport
+ * @up: have TC datastructures been set up?
+ */
+ struct efx_tc_state {
+@@ -153,6 +158,10 @@ struct efx_tc_state {
+ struct efx_tc_flow_rule pf;
+ struct efx_tc_flow_rule wire;
+ } dflt;
++ struct {
++ struct efx_tc_action_set_list pf;
++ struct efx_tc_action_set_list reps;
++ } facts;
+ bool up;
+ };
+
+diff --git a/drivers/net/pcs/pcs-rzn1-miic.c b/drivers/net/pcs/pcs-rzn1-miic.c
+index 323bec5e57f83..3560991690038 100644
+--- a/drivers/net/pcs/pcs-rzn1-miic.c
++++ b/drivers/net/pcs/pcs-rzn1-miic.c
+@@ -313,15 +313,21 @@ struct phylink_pcs *miic_create(struct device *dev, struct device_node *np)
+
+ pdev = of_find_device_by_node(pcs_np);
+ of_node_put(pcs_np);
+- if (!pdev || !platform_get_drvdata(pdev))
++ if (!pdev || !platform_get_drvdata(pdev)) {
++ if (pdev)
++ put_device(&pdev->dev);
+ return ERR_PTR(-EPROBE_DEFER);
++ }
+
+ miic_port = kzalloc(sizeof(*miic_port), GFP_KERNEL);
+- if (!miic_port)
++ if (!miic_port) {
++ put_device(&pdev->dev);
+ return ERR_PTR(-ENOMEM);
++ }
+
+ miic = platform_get_drvdata(pdev);
+ device_link_add(dev, miic->dev, DL_FLAG_AUTOREMOVE_CONSUMER);
++ put_device(&pdev->dev);
+
+ miic_port->miic = miic;
+ miic_port->port = port - 1;
+diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
+index ef6dc008e4c50..8a77ec33b4172 100644
+--- a/drivers/net/phy/at803x.c
++++ b/drivers/net/phy/at803x.c
+@@ -304,7 +304,6 @@ struct at803x_priv {
+ bool is_1000basex;
+ struct regulator_dev *vddio_rdev;
+ struct regulator_dev *vddh_rdev;
+- struct regulator *vddio;
+ u64 stats[ARRAY_SIZE(at803x_hw_stats)];
+ };
+
+@@ -460,21 +459,27 @@ static int at803x_set_wol(struct phy_device *phydev,
+ phy_write_mmd(phydev, MDIO_MMD_PCS, offsets[i],
+ mac[(i * 2) + 1] | (mac[(i * 2)] << 8));
+
+- /* Enable WOL function */
+- ret = phy_modify_mmd(phydev, MDIO_MMD_PCS, AT803X_PHY_MMD3_WOL_CTRL,
+- 0, AT803X_WOL_EN);
+- if (ret)
+- return ret;
++ /* Enable WOL function for 1588 */
++ if (phydev->drv->phy_id == ATH8031_PHY_ID) {
++ ret = phy_modify_mmd(phydev, MDIO_MMD_PCS,
++ AT803X_PHY_MMD3_WOL_CTRL,
++ 0, AT803X_WOL_EN);
++ if (ret)
++ return ret;
++ }
+ /* Enable WOL interrupt */
+ ret = phy_modify(phydev, AT803X_INTR_ENABLE, 0, AT803X_INTR_ENABLE_WOL);
+ if (ret)
+ return ret;
+ } else {
+- /* Disable WoL function */
+- ret = phy_modify_mmd(phydev, MDIO_MMD_PCS, AT803X_PHY_MMD3_WOL_CTRL,
+- AT803X_WOL_EN, 0);
+- if (ret)
+- return ret;
++ /* Disable WoL function for 1588 */
++ if (phydev->drv->phy_id == ATH8031_PHY_ID) {
++ ret = phy_modify_mmd(phydev, MDIO_MMD_PCS,
++ AT803X_PHY_MMD3_WOL_CTRL,
++ AT803X_WOL_EN, 0);
++ if (ret)
++ return ret;
++ }
+ /* Disable WOL interrupt */
+ ret = phy_modify(phydev, AT803X_INTR_ENABLE, AT803X_INTR_ENABLE_WOL, 0);
+ if (ret)
+@@ -509,11 +514,11 @@ static void at803x_get_wol(struct phy_device *phydev,
+ wol->supported = WAKE_MAGIC;
+ wol->wolopts = 0;
+
+- value = phy_read_mmd(phydev, MDIO_MMD_PCS, AT803X_PHY_MMD3_WOL_CTRL);
++ value = phy_read(phydev, AT803X_INTR_ENABLE);
+ if (value < 0)
+ return;
+
+- if (value & AT803X_WOL_EN)
++ if (value & AT803X_INTR_ENABLE_WOL)
+ wol->wolopts |= WAKE_MAGIC;
+ }
+
+@@ -824,11 +829,11 @@ static int at803x_parse_dt(struct phy_device *phydev)
+ if (ret < 0)
+ return ret;
+
+- priv->vddio = devm_regulator_get_optional(&phydev->mdio.dev,
+- "vddio");
+- if (IS_ERR(priv->vddio)) {
++ ret = devm_regulator_get_enable_optional(&phydev->mdio.dev,
++ "vddio");
++ if (ret) {
+ phydev_err(phydev, "failed to get VDDIO regulator\n");
+- return PTR_ERR(priv->vddio);
++ return ret;
+ }
+
+ /* Only AR8031/8033 support 1000Base-X for SFP modules */
+@@ -856,23 +861,12 @@ static int at803x_probe(struct phy_device *phydev)
+ if (ret)
+ return ret;
+
+- if (priv->vddio) {
+- ret = regulator_enable(priv->vddio);
+- if (ret < 0)
+- return ret;
+- }
+-
+ if (phydev->drv->phy_id == ATH8031_PHY_ID) {
+ int ccr = phy_read(phydev, AT803X_REG_CHIP_CONFIG);
+ int mode_cfg;
+- struct ethtool_wolinfo wol = {
+- .wolopts = 0,
+- };
+
+- if (ccr < 0) {
+- ret = ccr;
+- goto err;
+- }
++ if (ccr < 0)
++ return ccr;
+ mode_cfg = ccr & AT803X_MODE_CFG_MASK;
+
+ switch (mode_cfg) {
+@@ -886,29 +880,17 @@ static int at803x_probe(struct phy_device *phydev)
+ break;
+ }
+
+- /* Disable WOL by default */
+- ret = at803x_set_wol(phydev, &wol);
+- if (ret < 0) {
+- phydev_err(phydev, "failed to disable WOL on probe: %d\n", ret);
+- goto err;
+- }
++ /* Disable WoL in 1588 register which is enabled
++ * by default
++ */
++ ret = phy_modify_mmd(phydev, MDIO_MMD_PCS,
++ AT803X_PHY_MMD3_WOL_CTRL,
++ AT803X_WOL_EN, 0);
++ if (ret)
++ return ret;
+ }
+
+ return 0;
+-
+-err:
+- if (priv->vddio)
+- regulator_disable(priv->vddio);
+-
+- return ret;
+-}
+-
+-static void at803x_remove(struct phy_device *phydev)
+-{
+- struct at803x_priv *priv = phydev->priv;
+-
+- if (priv->vddio)
+- regulator_disable(priv->vddio);
+ }
+
+ static int at803x_get_features(struct phy_device *phydev)
+@@ -2021,7 +2003,6 @@ static struct phy_driver at803x_driver[] = {
+ .name = "Qualcomm Atheros AR8035",
+ .flags = PHY_POLL_CABLE_TEST,
+ .probe = at803x_probe,
+- .remove = at803x_remove,
+ .config_aneg = at803x_config_aneg,
+ .config_init = at803x_config_init,
+ .soft_reset = genphy_soft_reset,
+@@ -2043,7 +2024,6 @@ static struct phy_driver at803x_driver[] = {
+ .name = "Qualcomm Atheros AR8030",
+ .phy_id_mask = AT8030_PHY_ID_MASK,
+ .probe = at803x_probe,
+- .remove = at803x_remove,
+ .config_init = at803x_config_init,
+ .link_change_notify = at803x_link_change_notify,
+ .set_wol = at803x_set_wol,
+@@ -2059,7 +2039,6 @@ static struct phy_driver at803x_driver[] = {
+ .name = "Qualcomm Atheros AR8031/AR8033",
+ .flags = PHY_POLL_CABLE_TEST,
+ .probe = at803x_probe,
+- .remove = at803x_remove,
+ .config_init = at803x_config_init,
+ .config_aneg = at803x_config_aneg,
+ .soft_reset = genphy_soft_reset,
+@@ -2082,7 +2061,6 @@ static struct phy_driver at803x_driver[] = {
+ PHY_ID_MATCH_EXACT(ATH8032_PHY_ID),
+ .name = "Qualcomm Atheros AR8032",
+ .probe = at803x_probe,
+- .remove = at803x_remove,
+ .flags = PHY_POLL_CABLE_TEST,
+ .config_init = at803x_config_init,
+ .link_change_notify = at803x_link_change_notify,
+@@ -2098,7 +2076,6 @@ static struct phy_driver at803x_driver[] = {
+ PHY_ID_MATCH_EXACT(ATH9331_PHY_ID),
+ .name = "Qualcomm Atheros AR9331 built-in PHY",
+ .probe = at803x_probe,
+- .remove = at803x_remove,
+ .suspend = at803x_suspend,
+ .resume = at803x_resume,
+ .flags = PHY_POLL_CABLE_TEST,
+@@ -2115,7 +2092,6 @@ static struct phy_driver at803x_driver[] = {
+ PHY_ID_MATCH_EXACT(QCA9561_PHY_ID),
+ .name = "Qualcomm Atheros QCA9561 built-in PHY",
+ .probe = at803x_probe,
+- .remove = at803x_remove,
+ .suspend = at803x_suspend,
+ .resume = at803x_resume,
+ .flags = PHY_POLL_CABLE_TEST,
+@@ -2181,7 +2157,6 @@ static struct phy_driver at803x_driver[] = {
+ .name = "Qualcomm QCA8081",
+ .flags = PHY_POLL_CABLE_TEST,
+ .probe = at803x_probe,
+- .remove = at803x_remove,
+ .config_intr = at803x_config_intr,
+ .handle_interrupt = at803x_handle_interrupt,
+ .get_tunable = at803x_get_tunable,
+diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
+index ad71c88c87e78..f9ad8902100f3 100644
+--- a/drivers/net/phy/broadcom.c
++++ b/drivers/net/phy/broadcom.c
+@@ -486,6 +486,17 @@ static int bcm54xx_resume(struct phy_device *phydev)
+ return bcm54xx_config_init(phydev);
+ }
+
++static int bcm54810_read_mmd(struct phy_device *phydev, int devnum, u16 regnum)
++{
++ return -EOPNOTSUPP;
++}
++
++static int bcm54810_write_mmd(struct phy_device *phydev, int devnum, u16 regnum,
++ u16 val)
++{
++ return -EOPNOTSUPP;
++}
++
+ static int bcm54811_config_init(struct phy_device *phydev)
+ {
+ int err, reg;
+@@ -981,6 +992,8 @@ static struct phy_driver broadcom_drivers[] = {
+ .get_strings = bcm_phy_get_strings,
+ .get_stats = bcm54xx_get_stats,
+ .probe = bcm54xx_phy_probe,
++ .read_mmd = bcm54810_read_mmd,
++ .write_mmd = bcm54810_write_mmd,
+ .config_init = bcm54xx_config_init,
+ .config_aneg = bcm5481_config_aneg,
+ .config_intr = bcm_phy_config_intr,
+diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
+index 2c4e6de8f4d9f..7958ea0e8714a 100644
+--- a/drivers/net/phy/phy_device.c
++++ b/drivers/net/phy/phy_device.c
+@@ -3217,6 +3217,8 @@ static int phy_probe(struct device *dev)
+ goto out;
+ }
+
++ phy_disable_interrupts(phydev);
++
+ /* Start out supporting everything. Eventually,
+ * a controller will attach, and may modify one
+ * or both of these values
+@@ -3334,16 +3336,6 @@ static int phy_remove(struct device *dev)
+ return 0;
+ }
+
+-static void phy_shutdown(struct device *dev)
+-{
+- struct phy_device *phydev = to_phy_device(dev);
+-
+- if (phydev->state == PHY_READY || !phydev->attached_dev)
+- return;
+-
+- phy_disable_interrupts(phydev);
+-}
+-
+ /**
+ * phy_driver_register - register a phy_driver with the PHY layer
+ * @new_driver: new phy_driver to register
+@@ -3377,7 +3369,6 @@ int phy_driver_register(struct phy_driver *new_driver, struct module *owner)
+ new_driver->mdiodrv.driver.bus = &mdio_bus_type;
+ new_driver->mdiodrv.driver.probe = phy_probe;
+ new_driver->mdiodrv.driver.remove = phy_remove;
+- new_driver->mdiodrv.driver.shutdown = phy_shutdown;
+ new_driver->mdiodrv.driver.owner = owner;
+ new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS;
+
+diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
+index d3dc22509ea58..382756c3fb837 100644
+--- a/drivers/net/team/team.c
++++ b/drivers/net/team/team.c
+@@ -2200,7 +2200,9 @@ static void team_setup(struct net_device *dev)
+
+ dev->hw_features = TEAM_VLAN_FEATURES |
+ NETIF_F_HW_VLAN_CTAG_RX |
+- NETIF_F_HW_VLAN_CTAG_FILTER;
++ NETIF_F_HW_VLAN_CTAG_FILTER |
++ NETIF_F_HW_VLAN_STAG_RX |
++ NETIF_F_HW_VLAN_STAG_FILTER;
+
+ dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
+ dev->features |= dev->hw_features;
+diff --git a/drivers/net/veth.c b/drivers/net/veth.c
+index dce9f9d63e04e..76019949e3fe9 100644
+--- a/drivers/net/veth.c
++++ b/drivers/net/veth.c
+@@ -1071,8 +1071,9 @@ static int __veth_napi_enable_range(struct net_device *dev, int start, int end)
+ err_xdp_ring:
+ for (i--; i >= start; i--)
+ ptr_ring_cleanup(&priv->rq[i].xdp_ring, veth_ptr_free);
++ i = end;
+ err_page_pool:
+- for (i = start; i < end; i++) {
++ for (i--; i >= start; i--) {
+ page_pool_destroy(priv->rq[i].page_pool);
+ priv->rq[i].page_pool = NULL;
+ }
+diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
+index 2336a0e4befa5..9b310795617c8 100644
+--- a/drivers/net/virtio_net.c
++++ b/drivers/net/virtio_net.c
+@@ -2652,7 +2652,7 @@ static void virtnet_init_default_rss(struct virtnet_info *vi)
+ vi->ctrl->rss.indirection_table[i] = indir_val;
+ }
+
+- vi->ctrl->rss.max_tx_vq = vi->curr_queue_pairs;
++ vi->ctrl->rss.max_tx_vq = vi->has_rss ? vi->curr_queue_pairs : 0;
+ vi->ctrl->rss.hash_key_length = vi->rss_key_size;
+
+ netdev_rss_key_fill(vi->ctrl->rss.key, vi->rss_key_size);
+@@ -4110,8 +4110,6 @@ static int virtnet_probe(struct virtio_device *vdev)
+ if (vi->has_rss || vi->has_rss_hash_report)
+ virtnet_init_default_rss(vi);
+
+- _virtnet_set_queues(vi, vi->curr_queue_pairs);
+-
+ /* serialize netdev register + virtio_device_ready() with ndo_open() */
+ rtnl_lock();
+
+@@ -4124,6 +4122,8 @@ static int virtnet_probe(struct virtio_device *vdev)
+
+ virtio_device_ready(vdev);
+
++ _virtnet_set_queues(vi, vi->curr_queue_pairs);
++
+ /* a random MAC address has been assigned, notify the device.
+ * We don't fail probe if VIRTIO_NET_F_CTRL_MAC_ADDR is not there
+ * because many devices work fine without getting MAC explicitly
+diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
+index 09825b4a075e5..e6eec85480ca9 100644
+--- a/drivers/pci/controller/dwc/pcie-tegra194.c
++++ b/drivers/pci/controller/dwc/pcie-tegra194.c
+@@ -223,6 +223,7 @@
+ #define EP_STATE_ENABLED 1
+
+ static const unsigned int pcie_gen_freq[] = {
++ GEN1_CORE_CLK_FREQ, /* PCI_EXP_LNKSTA_CLS == 0; undefined */
+ GEN1_CORE_CLK_FREQ,
+ GEN2_CORE_CLK_FREQ,
+ GEN3_CORE_CLK_FREQ,
+@@ -459,7 +460,11 @@ static irqreturn_t tegra_pcie_ep_irq_thread(int irq, void *arg)
+
+ speed = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA) &
+ PCI_EXP_LNKSTA_CLS;
+- clk_set_rate(pcie->core_clk, pcie_gen_freq[speed - 1]);
++
++ if (speed >= ARRAY_SIZE(pcie_gen_freq))
++ speed = 0;
++
++ clk_set_rate(pcie->core_clk, pcie_gen_freq[speed]);
+
+ if (pcie->of_data->has_ltr_req_fix)
+ return IRQ_HANDLED;
+@@ -1020,7 +1025,11 @@ retry_link:
+
+ speed = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA) &
+ PCI_EXP_LNKSTA_CLS;
+- clk_set_rate(pcie->core_clk, pcie_gen_freq[speed - 1]);
++
++ if (speed >= ARRAY_SIZE(pcie_gen_freq))
++ speed = 0;
++
++ clk_set_rate(pcie->core_clk, pcie_gen_freq[speed]);
+
+ tegra_pcie_enable_interrupts(pp);
+
+diff --git a/drivers/pcmcia/rsrc_nonstatic.c b/drivers/pcmcia/rsrc_nonstatic.c
+index 471e0c5815f39..bf9d070a44966 100644
+--- a/drivers/pcmcia/rsrc_nonstatic.c
++++ b/drivers/pcmcia/rsrc_nonstatic.c
+@@ -1053,6 +1053,8 @@ static void nonstatic_release_resource_db(struct pcmcia_socket *s)
+ q = p->next;
+ kfree(p);
+ }
++
++ kfree(data);
+ }
+
+
+diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
+index c5f52d4f7781b..1fb0a24356bf5 100644
+--- a/drivers/pinctrl/qcom/pinctrl-msm.c
++++ b/drivers/pinctrl/qcom/pinctrl-msm.c
+@@ -1039,6 +1039,7 @@ static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
+ struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+ struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
+ const struct msm_pingroup *g;
++ u32 intr_target_mask = GENMASK(2, 0);
+ unsigned long flags;
+ bool was_enabled;
+ u32 val;
+@@ -1075,13 +1076,15 @@ static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
+ * With intr_target_use_scm interrupts are routed to
+ * application cpu using scm calls.
+ */
++ if (g->intr_target_width)
++ intr_target_mask = GENMASK(g->intr_target_width - 1, 0);
++
+ if (pctrl->intr_target_use_scm) {
+ u32 addr = pctrl->phys_base[0] + g->intr_target_reg;
+ int ret;
+
+ qcom_scm_io_readl(addr, &val);
+-
+- val &= ~(7 << g->intr_target_bit);
++ val &= ~(intr_target_mask << g->intr_target_bit);
+ val |= g->intr_target_kpss_val << g->intr_target_bit;
+
+ ret = qcom_scm_io_writel(addr, val);
+@@ -1091,7 +1094,7 @@ static int msm_gpio_irq_set_type(struct irq_data *d, unsigned int type)
+ d->hwirq);
+ } else {
+ val = msm_readl_intr_target(pctrl, g);
+- val &= ~(7 << g->intr_target_bit);
++ val &= ~(intr_target_mask << g->intr_target_bit);
+ val |= g->intr_target_kpss_val << g->intr_target_bit;
+ msm_writel_intr_target(val, pctrl, g);
+ }
+diff --git a/drivers/pinctrl/qcom/pinctrl-msm.h b/drivers/pinctrl/qcom/pinctrl-msm.h
+index 985eceda25173..7f30416be127b 100644
+--- a/drivers/pinctrl/qcom/pinctrl-msm.h
++++ b/drivers/pinctrl/qcom/pinctrl-msm.h
+@@ -51,6 +51,7 @@ struct msm_function {
+ * @intr_status_bit: Offset in @intr_status_reg for reading and acking the interrupt
+ * status.
+ * @intr_target_bit: Offset in @intr_target_reg for configuring the interrupt routing.
++ * @intr_target_width: Number of bits used for specifying interrupt routing target.
+ * @intr_target_kpss_val: Value in @intr_target_bit for specifying that the interrupt from
+ * this gpio should get routed to the KPSS processor.
+ * @intr_raw_status_bit: Offset in @intr_cfg_reg for the raw status bit.
+@@ -94,6 +95,7 @@ struct msm_pingroup {
+ unsigned intr_ack_high:1;
+
+ unsigned intr_target_bit:5;
++ unsigned intr_target_width:5;
+ unsigned intr_target_kpss_val:5;
+ unsigned intr_raw_status_bit:5;
+ unsigned intr_polarity_bit:5;
+diff --git a/drivers/pinctrl/qcom/pinctrl-sa8775p.c b/drivers/pinctrl/qcom/pinctrl-sa8775p.c
+index 2ae7cdca65d3e..62f7a36d290cb 100644
+--- a/drivers/pinctrl/qcom/pinctrl-sa8775p.c
++++ b/drivers/pinctrl/qcom/pinctrl-sa8775p.c
+@@ -54,6 +54,7 @@
+ .intr_enable_bit = 0, \
+ .intr_status_bit = 0, \
+ .intr_target_bit = 5, \
++ .intr_target_width = 4, \
+ .intr_target_kpss_val = 3, \
+ .intr_raw_status_bit = 4, \
+ .intr_polarity_bit = 1, \
+diff --git a/drivers/regulator/da9063-regulator.c b/drivers/regulator/da9063-regulator.c
+index dfd5ec9f75c90..a0621665a6d22 100644
+--- a/drivers/regulator/da9063-regulator.c
++++ b/drivers/regulator/da9063-regulator.c
+@@ -778,9 +778,6 @@ static int da9063_check_xvp_constraints(struct regulator_config *config)
+ const struct notification_limit *uv_l = &constr->under_voltage_limits;
+ const struct notification_limit *ov_l = &constr->over_voltage_limits;
+
+- if (!config->init_data) /* No config in DT, pointers will be invalid */
+- return 0;
+-
+ /* make sure that only one severity is used to clarify if unchanged, enabled or disabled */
+ if ((!!uv_l->prot + !!uv_l->err + !!uv_l->warn) > 1) {
+ dev_err(config->dev, "%s: at most one voltage monitoring severity allowed!\n",
+@@ -1031,9 +1028,12 @@ static int da9063_regulator_probe(struct platform_device *pdev)
+ config.of_node = da9063_reg_matches[id].of_node;
+ config.regmap = da9063->regmap;
+
+- ret = da9063_check_xvp_constraints(&config);
+- if (ret)
+- return ret;
++ /* Checking constraints requires init_data from DT. */
++ if (config.init_data) {
++ ret = da9063_check_xvp_constraints(&config);
++ if (ret)
++ return ret;
++ }
+
+ regl->rdev = devm_regulator_register(&pdev->dev, &regl->desc,
+ &config);
+diff --git a/drivers/regulator/qcom-rpmh-regulator.c b/drivers/regulator/qcom-rpmh-regulator.c
+index f3b280af07737..cd077b7c4aff3 100644
+--- a/drivers/regulator/qcom-rpmh-regulator.c
++++ b/drivers/regulator/qcom-rpmh-regulator.c
+@@ -1068,7 +1068,7 @@ static const struct rpmh_vreg_init_data pm8550_vreg_data[] = {
+ RPMH_VREG("ldo9", "ldo%s9", &pmic5_pldo, "vdd-l8-l9"),
+ RPMH_VREG("ldo10", "ldo%s10", &pmic5_nldo515, "vdd-l1-l4-l10"),
+ RPMH_VREG("ldo11", "ldo%s11", &pmic5_nldo515, "vdd-l11"),
+- RPMH_VREG("ldo12", "ldo%s12", &pmic5_pldo, "vdd-l12"),
++ RPMH_VREG("ldo12", "ldo%s12", &pmic5_nldo515, "vdd-l12"),
+ RPMH_VREG("ldo13", "ldo%s13", &pmic5_pldo, "vdd-l2-l13-l14"),
+ RPMH_VREG("ldo14", "ldo%s14", &pmic5_pldo, "vdd-l2-l13-l14"),
+ RPMH_VREG("ldo15", "ldo%s15", &pmic5_nldo515, "vdd-l15"),
+diff --git a/drivers/soc/aspeed/aspeed-socinfo.c b/drivers/soc/aspeed/aspeed-socinfo.c
+index 1ca140356a084..3f759121dc00a 100644
+--- a/drivers/soc/aspeed/aspeed-socinfo.c
++++ b/drivers/soc/aspeed/aspeed-socinfo.c
+@@ -137,6 +137,7 @@ static int __init aspeed_socinfo_init(void)
+
+ soc_dev = soc_device_register(attrs);
+ if (IS_ERR(soc_dev)) {
++ kfree(attrs->machine);
+ kfree(attrs->soc_id);
+ kfree(attrs->serial_number);
+ kfree(attrs);
+diff --git a/drivers/soc/aspeed/aspeed-uart-routing.c b/drivers/soc/aspeed/aspeed-uart-routing.c
+index ef8b24fd18518..59123e1f27acb 100644
+--- a/drivers/soc/aspeed/aspeed-uart-routing.c
++++ b/drivers/soc/aspeed/aspeed-uart-routing.c
+@@ -524,7 +524,7 @@ static ssize_t aspeed_uart_routing_store(struct device *dev,
+ struct aspeed_uart_routing_selector *sel = to_routing_selector(attr);
+ int val;
+
+- val = match_string(sel->options, -1, buf);
++ val = __sysfs_match_string(sel->options, -1, buf);
+ if (val < 0) {
+ dev_err(dev, "invalid value \"%s\"\n", buf);
+ return -EINVAL;
+diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
+index e58beac442958..1257d1c41f8e5 100644
+--- a/drivers/thunderbolt/nhi.c
++++ b/drivers/thunderbolt/nhi.c
+@@ -1480,6 +1480,8 @@ static struct pci_device_id nhi_ids[] = {
+ .driver_data = (kernel_ulong_t)&icl_nhi_ops },
+ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_MTL_P_NHI1),
+ .driver_data = (kernel_ulong_t)&icl_nhi_ops },
++ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_80G_NHI) },
++ { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_40G_NHI) },
+
+ /* Any USB4 compliant host */
+ { PCI_DEVICE_CLASS(PCI_CLASS_SERIAL_USB_USB4, ~0) },
+diff --git a/drivers/thunderbolt/nhi.h b/drivers/thunderbolt/nhi.h
+index b0718020c6f59..0f029ce758825 100644
+--- a/drivers/thunderbolt/nhi.h
++++ b/drivers/thunderbolt/nhi.h
+@@ -75,6 +75,10 @@ extern const struct tb_nhi_ops icl_nhi_ops;
+ #define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_DD_BRIDGE 0x15ef
+ #define PCI_DEVICE_ID_INTEL_ADL_NHI0 0x463e
+ #define PCI_DEVICE_ID_INTEL_ADL_NHI1 0x466d
++#define PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_80G_NHI 0x5781
++#define PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_40G_NHI 0x5784
++#define PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HUB_80G_BRIDGE 0x5786
++#define PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HUB_40G_BRIDGE 0x57a4
+ #define PCI_DEVICE_ID_INTEL_MTL_M_NHI0 0x7eb2
+ #define PCI_DEVICE_ID_INTEL_MTL_P_NHI0 0x7ec2
+ #define PCI_DEVICE_ID_INTEL_MTL_P_NHI1 0x7ec3
+diff --git a/drivers/thunderbolt/quirks.c b/drivers/thunderbolt/quirks.c
+index 1157b8869bcca..8c2ee431fcde8 100644
+--- a/drivers/thunderbolt/quirks.c
++++ b/drivers/thunderbolt/quirks.c
+@@ -74,6 +74,14 @@ static const struct tb_quirk tb_quirks[] = {
+ quirk_usb3_maximum_bandwidth },
+ { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI1, 0x0000, 0x0000,
+ quirk_usb3_maximum_bandwidth },
++ { 0x8087, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_80G_NHI, 0x0000, 0x0000,
++ quirk_usb3_maximum_bandwidth },
++ { 0x8087, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HOST_40G_NHI, 0x0000, 0x0000,
++ quirk_usb3_maximum_bandwidth },
++ { 0x8087, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HUB_80G_BRIDGE, 0x0000, 0x0000,
++ quirk_usb3_maximum_bandwidth },
++ { 0x8087, PCI_DEVICE_ID_INTEL_BARLOW_RIDGE_HUB_40G_BRIDGE, 0x0000, 0x0000,
++ quirk_usb3_maximum_bandwidth },
+ /*
+ * CLx is not supported on AMD USB4 Yellow Carp and Pink Sardine platforms.
+ */
+diff --git a/drivers/thunderbolt/retimer.c b/drivers/thunderbolt/retimer.c
+index 9cc28197dbc45..edbd92435b41a 100644
+--- a/drivers/thunderbolt/retimer.c
++++ b/drivers/thunderbolt/retimer.c
+@@ -187,6 +187,21 @@ static ssize_t nvm_authenticate_show(struct device *dev,
+ return ret;
+ }
+
++static void tb_retimer_nvm_authenticate_status(struct tb_port *port, u32 *status)
++{
++ int i;
++
++ tb_port_dbg(port, "reading NVM authentication status of retimers\n");
++
++ /*
++ * Before doing anything else, read the authentication status.
++ * If the retimer has it set, store it for the new retimer
++ * device instance.
++ */
++ for (i = 1; i <= TB_MAX_RETIMER_INDEX; i++)
++ usb4_port_retimer_nvm_authenticate_status(port, i, &status[i]);
++}
++
+ static void tb_retimer_set_inbound_sbtx(struct tb_port *port)
+ {
+ int i;
+@@ -455,18 +470,16 @@ int tb_retimer_scan(struct tb_port *port, bool add)
+ return ret;
+
+ /*
+- * Enable sideband channel for each retimer. We can do this
+- * regardless whether there is device connected or not.
++ * Immediately after sending enumerate retimers read the
++ * authentication status of each retimer.
+ */
+- tb_retimer_set_inbound_sbtx(port);
++ tb_retimer_nvm_authenticate_status(port, status);
+
+ /*
+- * Before doing anything else, read the authentication status.
+- * If the retimer has it set, store it for the new retimer
+- * device instance.
++ * Enable sideband channel for each retimer. We can do this
++ * regardless whether there is device connected or not.
+ */
+- for (i = 1; i <= TB_MAX_RETIMER_INDEX; i++)
+- usb4_port_retimer_nvm_authenticate_status(port, i, &status[i]);
++ tb_retimer_set_inbound_sbtx(port);
+
+ for (i = 1; i <= TB_MAX_RETIMER_INDEX; i++) {
+ /*
+diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
+index 1cdefac4dd1b5..739f522cb893c 100644
+--- a/drivers/tty/n_gsm.c
++++ b/drivers/tty/n_gsm.c
+@@ -3042,12 +3042,13 @@ static void gsm_error(struct gsm_mux *gsm)
+ static void gsm_cleanup_mux(struct gsm_mux *gsm, bool disc)
+ {
+ int i;
+- struct gsm_dlci *dlci = gsm->dlci[0];
++ struct gsm_dlci *dlci;
+ struct gsm_msg *txq, *ntxq;
+
+ gsm->dead = true;
+ mutex_lock(&gsm->mutex);
+
++ dlci = gsm->dlci[0];
+ if (dlci) {
+ if (disc && dlci->state != DLCI_CLOSED) {
+ gsm_dlci_begin_close(dlci);
+diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
+index 053d44412e42f..0a67dff575f78 100644
+--- a/drivers/tty/serial/8250/8250_port.c
++++ b/drivers/tty/serial/8250/8250_port.c
+@@ -3288,6 +3288,7 @@ void serial8250_init_port(struct uart_8250_port *up)
+ struct uart_port *port = &up->port;
+
+ spin_lock_init(&port->lock);
++ port->pm = NULL;
+ port->ops = &serial8250_pops;
+ port->has_sysrq = IS_ENABLED(CONFIG_SERIAL_8250_CONSOLE);
+
+diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
+index f38606b750967..3e4992b281132 100644
+--- a/drivers/tty/serial/fsl_lpuart.c
++++ b/drivers/tty/serial/fsl_lpuart.c
+@@ -1137,8 +1137,8 @@ static void lpuart_copy_rx_to_tty(struct lpuart_port *sport)
+ unsigned long sr = lpuart32_read(&sport->port, UARTSTAT);
+
+ if (sr & (UARTSTAT_PE | UARTSTAT_FE)) {
+- /* Read DR to clear the error flags */
+- lpuart32_read(&sport->port, UARTDATA);
++ /* Clear the error flags */
++ lpuart32_write(&sport->port, sr, UARTSTAT);
+
+ if (sr & UARTSTAT_PE)
+ sport->port.icount.parity++;
+diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
+index 1e38fc9b10c11..e9e11a2596211 100644
+--- a/drivers/tty/serial/stm32-usart.c
++++ b/drivers/tty/serial/stm32-usart.c
+@@ -1755,13 +1755,10 @@ static int stm32_usart_serial_remove(struct platform_device *pdev)
+ struct uart_port *port = platform_get_drvdata(pdev);
+ struct stm32_port *stm32_port = to_stm32_port(port);
+ const struct stm32_usart_offsets *ofs = &stm32_port->info->ofs;
+- int err;
+ u32 cr3;
+
+ pm_runtime_get_sync(&pdev->dev);
+- err = uart_remove_one_port(&stm32_usart_driver, port);
+- if (err)
+- return(err);
++ uart_remove_one_port(&stm32_usart_driver, port);
+
+ pm_runtime_disable(&pdev->dev);
+ pm_runtime_set_suspended(&pdev->dev);
+diff --git a/drivers/usb/chipidea/ci_hdrc_imx.c b/drivers/usb/chipidea/ci_hdrc_imx.c
+index 2855ac3030014..f7577f2bd2c5d 100644
+--- a/drivers/usb/chipidea/ci_hdrc_imx.c
++++ b/drivers/usb/chipidea/ci_hdrc_imx.c
+@@ -70,6 +70,10 @@ static const struct ci_hdrc_imx_platform_flag imx7ulp_usb_data = {
+ CI_HDRC_PMQOS,
+ };
+
++static const struct ci_hdrc_imx_platform_flag imx8ulp_usb_data = {
++ .flags = CI_HDRC_SUPPORTS_RUNTIME_PM,
++};
++
+ static const struct of_device_id ci_hdrc_imx_dt_ids[] = {
+ { .compatible = "fsl,imx23-usb", .data = &imx23_usb_data},
+ { .compatible = "fsl,imx28-usb", .data = &imx28_usb_data},
+@@ -80,6 +84,7 @@ static const struct of_device_id ci_hdrc_imx_dt_ids[] = {
+ { .compatible = "fsl,imx6ul-usb", .data = &imx6ul_usb_data},
+ { .compatible = "fsl,imx7d-usb", .data = &imx7d_usb_data},
+ { .compatible = "fsl,imx7ulp-usb", .data = &imx7ulp_usb_data},
++ { .compatible = "fsl,imx8ulp-usb", .data = &imx8ulp_usb_data},
+ { /* sentinel */ }
+ };
+ MODULE_DEVICE_TABLE(of, ci_hdrc_imx_dt_ids);
+diff --git a/drivers/usb/chipidea/usbmisc_imx.c b/drivers/usb/chipidea/usbmisc_imx.c
+index c57c1a71a5132..681c2ddc83fa5 100644
+--- a/drivers/usb/chipidea/usbmisc_imx.c
++++ b/drivers/usb/chipidea/usbmisc_imx.c
+@@ -135,7 +135,7 @@
+ #define TXVREFTUNE0_MASK (0xf << 20)
+
+ #define MX6_USB_OTG_WAKEUP_BITS (MX6_BM_WAKEUP_ENABLE | MX6_BM_VBUS_WAKEUP | \
+- MX6_BM_ID_WAKEUP)
++ MX6_BM_ID_WAKEUP | MX6SX_BM_DPDM_WAKEUP_EN)
+
+ struct usbmisc_ops {
+ /* It's called once when probe a usb device */
+@@ -152,6 +152,7 @@ struct usbmisc_ops {
+ int (*charger_detection)(struct imx_usbmisc_data *data);
+ /* It's called when system resume from usb power lost */
+ int (*power_lost_check)(struct imx_usbmisc_data *data);
++ void (*vbus_comparator_on)(struct imx_usbmisc_data *data, bool on);
+ };
+
+ struct imx_usbmisc {
+@@ -875,6 +876,33 @@ static int imx7d_charger_detection(struct imx_usbmisc_data *data)
+ return ret;
+ }
+
++static void usbmisc_imx7d_vbus_comparator_on(struct imx_usbmisc_data *data,
++ bool on)
++{
++ unsigned long flags;
++ struct imx_usbmisc *usbmisc = dev_get_drvdata(data->dev);
++ u32 val;
++
++ if (data->hsic)
++ return;
++
++ spin_lock_irqsave(&usbmisc->lock, flags);
++ /*
++ * Disable the VBUS valid comparator when in suspend mode:
++ * when OTG is disabled and DRVVBUS0 is asserted, the
++ * bandgap circuitry and VBUS valid comparator remain
++ * powered, even in Suspend or Sleep mode.
++ */
++ val = readl(usbmisc->base + MX7D_USB_OTG_PHY_CFG2);
++ if (on)
++ val |= MX7D_USB_OTG_PHY_CFG2_DRVVBUS0;
++ else
++ val &= ~MX7D_USB_OTG_PHY_CFG2_DRVVBUS0;
++
++ writel(val, usbmisc->base + MX7D_USB_OTG_PHY_CFG2);
++ spin_unlock_irqrestore(&usbmisc->lock, flags);
++}
++
+ static int usbmisc_imx7ulp_init(struct imx_usbmisc_data *data)
+ {
+ struct imx_usbmisc *usbmisc = dev_get_drvdata(data->dev);
+@@ -1018,6 +1046,7 @@ static const struct usbmisc_ops imx7d_usbmisc_ops = {
+ .set_wakeup = usbmisc_imx7d_set_wakeup,
+ .charger_detection = imx7d_charger_detection,
+ .power_lost_check = usbmisc_imx7d_power_lost_check,
++ .vbus_comparator_on = usbmisc_imx7d_vbus_comparator_on,
+ };
+
+ static const struct usbmisc_ops imx7ulp_usbmisc_ops = {
+@@ -1132,6 +1161,9 @@ int imx_usbmisc_suspend(struct imx_usbmisc_data *data, bool wakeup)
+
+ usbmisc = dev_get_drvdata(data->dev);
+
++ if (usbmisc->ops->vbus_comparator_on)
++ usbmisc->ops->vbus_comparator_on(data, false);
++
+ if (wakeup && usbmisc->ops->set_wakeup)
+ ret = usbmisc->ops->set_wakeup(data, true);
+ if (ret) {
+@@ -1185,6 +1217,9 @@ int imx_usbmisc_resume(struct imx_usbmisc_data *data, bool wakeup)
+ goto hsic_set_clk_fail;
+ }
+
++ if (usbmisc->ops->vbus_comparator_on)
++ usbmisc->ops->vbus_comparator_on(data, true);
++
+ return 0;
+
+ hsic_set_clk_fail:
+diff --git a/drivers/usb/gadget/function/u_serial.c b/drivers/usb/gadget/function/u_serial.c
+index e5d522d54f6a3..97f07757d19e3 100644
+--- a/drivers/usb/gadget/function/u_serial.c
++++ b/drivers/usb/gadget/function/u_serial.c
+@@ -916,8 +916,11 @@ static void __gs_console_push(struct gs_console *cons)
+ }
+
+ req->length = size;
++
++ spin_unlock_irq(&cons->lock);
+ if (usb_ep_queue(ep, req, GFP_ATOMIC))
+ req->length = 0;
++ spin_lock_irq(&cons->lock);
+ }
+
+ static void gs_console_work(struct work_struct *work)
+diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
+index dd1c6b2ca7c6f..e81865978299c 100644
+--- a/drivers/usb/gadget/function/uvc_video.c
++++ b/drivers/usb/gadget/function/uvc_video.c
+@@ -386,6 +386,9 @@ static void uvcg_video_pump(struct work_struct *work)
+ struct uvc_buffer *buf;
+ unsigned long flags;
+ int ret;
++ bool buf_int;
++ /* video->max_payload_size is only set when using bulk transfer */
++ bool is_bulk = video->max_payload_size;
+
+ while (video->ep->enabled) {
+ /*
+@@ -408,20 +411,35 @@ static void uvcg_video_pump(struct work_struct *work)
+ */
+ spin_lock_irqsave(&queue->irqlock, flags);
+ buf = uvcg_queue_head(queue);
+- if (buf == NULL) {
++
++ if (buf != NULL) {
++ video->encode(req, video, buf);
++ /* Always interrupt for the last request of a video buffer */
++ buf_int = buf->state == UVC_BUF_STATE_DONE;
++ } else if (!(queue->flags & UVC_QUEUE_DISCONNECTED) && !is_bulk) {
++ /*
++ * No video buffer available; the queue is still connected and
++ * we're transferring over ISOC. Queue a 0 length request to
++ * prevent missed ISOC transfers.
++ */
++ req->length = 0;
++ buf_int = false;
++ } else {
++ /*
++ * Either the queue has been disconnected or no video buffer
++ * is available for bulk transfer. Either way, stop processing
++ * further.
++ */
+ spin_unlock_irqrestore(&queue->irqlock, flags);
+ break;
+ }
+
+- video->encode(req, video, buf);
+-
+ /*
+ * With usb3 we have more requests. This will decrease the
+ * interrupt load to a quarter but also catches the corner
+ * cases, which needs to be handled.
+ */
+- if (list_empty(&video->req_free) ||
+- buf->state == UVC_BUF_STATE_DONE ||
++ if (list_empty(&video->req_free) || buf_int ||
+ !(video->req_int_count %
+ DIV_ROUND_UP(video->uvc_num_requests, 4))) {
+ video->req_int_count = 0;
+@@ -441,8 +459,7 @@ static void uvcg_video_pump(struct work_struct *work)
+
+ /* Endpoint now owns the request */
+ req = NULL;
+- if (buf->state != UVC_BUF_STATE_DONE)
+- video->req_int_count++;
++ video->req_int_count++;
+ }
+
+ if (!req)
+@@ -527,4 +544,3 @@ int uvcg_video_init(struct uvc_video *video, struct uvc_device *uvc)
+ V4L2_BUF_TYPE_VIDEO_OUTPUT, &video->mutex);
+ return 0;
+ }
+-
+diff --git a/drivers/usb/host/xhci-histb.c b/drivers/usb/host/xhci-histb.c
+index 91ce97821de51..7c20477550830 100644
+--- a/drivers/usb/host/xhci-histb.c
++++ b/drivers/usb/host/xhci-histb.c
+@@ -164,16 +164,6 @@ static void xhci_histb_host_disable(struct xhci_hcd_histb *histb)
+ clk_disable_unprepare(histb->bus_clk);
+ }
+
+-static void xhci_histb_quirks(struct device *dev, struct xhci_hcd *xhci)
+-{
+- /*
+- * As of now platform drivers don't provide MSI support so we ensure
+- * here that the generic code does not try to make a pci_dev from our
+- * dev struct in order to setup MSI
+- */
+- xhci->quirks |= XHCI_PLAT;
+-}
+-
+ /* called during probe() after chip reset completes */
+ static int xhci_histb_setup(struct usb_hcd *hcd)
+ {
+@@ -186,7 +176,7 @@ static int xhci_histb_setup(struct usb_hcd *hcd)
+ return ret;
+ }
+
+- return xhci_gen_setup(hcd, xhci_histb_quirks);
++ return xhci_gen_setup(hcd, NULL);
+ }
+
+ static const struct xhci_driver_overrides xhci_histb_overrides __initconst = {
+diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
+index b60521e1a9a63..9a40da3b0064b 100644
+--- a/drivers/usb/host/xhci-mtk.c
++++ b/drivers/usb/host/xhci-mtk.c
+@@ -418,12 +418,6 @@ static void xhci_mtk_quirks(struct device *dev, struct xhci_hcd *xhci)
+ struct usb_hcd *hcd = xhci_to_hcd(xhci);
+ struct xhci_hcd_mtk *mtk = hcd_to_mtk(hcd);
+
+- /*
+- * As of now platform drivers don't provide MSI support so we ensure
+- * here that the generic code does not try to make a pci_dev from our
+- * dev struct in order to setup MSI
+- */
+- xhci->quirks |= XHCI_PLAT;
+ xhci->quirks |= XHCI_MTK_HOST;
+ /*
+ * MTK host controller gives a spurious successful event after a
+diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
+index db9826c38b20b..9540f0e48c215 100644
+--- a/drivers/usb/host/xhci-pci.c
++++ b/drivers/usb/host/xhci-pci.c
+@@ -108,9 +108,6 @@ static void xhci_cleanup_msix(struct xhci_hcd *xhci)
+ struct usb_hcd *hcd = xhci_to_hcd(xhci);
+ struct pci_dev *pdev = to_pci_dev(hcd->self.controller);
+
+- if (xhci->quirks & XHCI_PLAT)
+- return;
+-
+ /* return if using legacy interrupt */
+ if (hcd->irq > 0)
+ return;
+@@ -208,10 +205,6 @@ static int xhci_try_enable_msi(struct usb_hcd *hcd)
+ struct pci_dev *pdev;
+ int ret;
+
+- /* The xhci platform device has set up IRQs through usb_add_hcd. */
+- if (xhci->quirks & XHCI_PLAT)
+- return 0;
+-
+ pdev = to_pci_dev(xhci_to_hcd(xhci)->self.controller);
+ /*
+ * Some Fresco Logic host controllers advertise MSI, but fail to
+diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
+index f36633fa83624..80da67a6c3bf2 100644
+--- a/drivers/usb/host/xhci-plat.c
++++ b/drivers/usb/host/xhci-plat.c
+@@ -78,12 +78,7 @@ static void xhci_plat_quirks(struct device *dev, struct xhci_hcd *xhci)
+ {
+ struct xhci_plat_priv *priv = xhci_to_priv(xhci);
+
+- /*
+- * As of now platform drivers don't provide MSI support so we ensure
+- * here that the generic code does not try to make a pci_dev from our
+- * dev struct in order to setup MSI
+- */
+- xhci->quirks |= XHCI_PLAT | priv->quirks;
++ xhci->quirks |= priv->quirks;
+ }
+
+ /* called during probe() after chip reset completes */
+diff --git a/drivers/usb/host/xhci-tegra.c b/drivers/usb/host/xhci-tegra.c
+index d28fa892c2866..07a319db58034 100644
+--- a/drivers/usb/host/xhci-tegra.c
++++ b/drivers/usb/host/xhci-tegra.c
+@@ -2662,7 +2662,6 @@ static void tegra_xhci_quirks(struct device *dev, struct xhci_hcd *xhci)
+ {
+ struct tegra_xusb *tegra = dev_get_drvdata(dev);
+
+- xhci->quirks |= XHCI_PLAT;
+ if (tegra && tegra->soc->lpm_support)
+ xhci->quirks |= XHCI_LPM_SUPPORT;
+ }
+diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
+index 4474d540f6b49..0b1928851a2a9 100644
+--- a/drivers/usb/host/xhci.h
++++ b/drivers/usb/host/xhci.h
+@@ -1874,7 +1874,7 @@ struct xhci_hcd {
+ #define XHCI_SPURIOUS_REBOOT BIT_ULL(13)
+ #define XHCI_COMP_MODE_QUIRK BIT_ULL(14)
+ #define XHCI_AVOID_BEI BIT_ULL(15)
+-#define XHCI_PLAT BIT_ULL(16)
++#define XHCI_PLAT BIT_ULL(16) /* Deprecated */
+ #define XHCI_SLOW_SUSPEND BIT_ULL(17)
+ #define XHCI_SPURIOUS_WAKEUP BIT_ULL(18)
+ /* For controllers with a broken beyond repair streams implementation */
+diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+index 25fc4120b618d..b53420e874acb 100644
+--- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
++++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
+@@ -31,6 +31,7 @@ struct mlx5_vdpa_mr {
+ struct list_head head;
+ unsigned long num_directs;
+ unsigned long num_klms;
++ /* state of dvq mr */
+ bool initialized;
+
+ /* serialize mkey creation and destruction */
+@@ -121,6 +122,7 @@ int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *io
+ int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
+ unsigned int asid);
+ void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev);
++void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid);
+
+ #define mlx5_vdpa_warn(__dev, format, ...) \
+ dev_warn((__dev)->mdev->device, "%s:%d:(pid %d) warning: " format, __func__, __LINE__, \
+diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
+index 03e5432297912..5a1971fcd87b1 100644
+--- a/drivers/vdpa/mlx5/core/mr.c
++++ b/drivers/vdpa/mlx5/core/mr.c
+@@ -489,60 +489,103 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr
+ }
+ }
+
+-void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev)
++static void _mlx5_vdpa_destroy_cvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
++{
++ if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
++ return;
++
++ prune_iotlb(mvdev);
++}
++
++static void _mlx5_vdpa_destroy_dvq_mr(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
+ {
+ struct mlx5_vdpa_mr *mr = &mvdev->mr;
+
+- mutex_lock(&mr->mkey_mtx);
++ if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
++ return;
++
+ if (!mr->initialized)
+- goto out;
++ return;
+
+- prune_iotlb(mvdev);
+ if (mr->user_mr)
+ destroy_user_mr(mvdev, mr);
+ else
+ destroy_dma_mr(mvdev, mr);
+
+ mr->initialized = false;
+-out:
++}
++
++void mlx5_vdpa_destroy_mr_asid(struct mlx5_vdpa_dev *mvdev, unsigned int asid)
++{
++ struct mlx5_vdpa_mr *mr = &mvdev->mr;
++
++ mutex_lock(&mr->mkey_mtx);
++
++ _mlx5_vdpa_destroy_dvq_mr(mvdev, asid);
++ _mlx5_vdpa_destroy_cvq_mr(mvdev, asid);
++
+ mutex_unlock(&mr->mkey_mtx);
+ }
+
+-static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
+- struct vhost_iotlb *iotlb, unsigned int asid)
++void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev)
++{
++ mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_CVQ_GROUP]);
++ mlx5_vdpa_destroy_mr_asid(mvdev, mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP]);
++}
++
++static int _mlx5_vdpa_create_cvq_mr(struct mlx5_vdpa_dev *mvdev,
++ struct vhost_iotlb *iotlb,
++ unsigned int asid)
++{
++ if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] != asid)
++ return 0;
++
++ return dup_iotlb(mvdev, iotlb);
++}
++
++static int _mlx5_vdpa_create_dvq_mr(struct mlx5_vdpa_dev *mvdev,
++ struct vhost_iotlb *iotlb,
++ unsigned int asid)
+ {
+ struct mlx5_vdpa_mr *mr = &mvdev->mr;
+ int err;
+
+- if (mr->initialized)
++ if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] != asid)
+ return 0;
+
+- if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) {
+- if (iotlb)
+- err = create_user_mr(mvdev, iotlb);
+- else
+- err = create_dma_mr(mvdev, mr);
++ if (mr->initialized)
++ return 0;
+
+- if (err)
+- return err;
+- }
++ if (iotlb)
++ err = create_user_mr(mvdev, iotlb);
++ else
++ err = create_dma_mr(mvdev, mr);
+
+- if (mvdev->group2asid[MLX5_VDPA_CVQ_GROUP] == asid) {
+- err = dup_iotlb(mvdev, iotlb);
+- if (err)
+- goto out_err;
+- }
++ if (err)
++ return err;
+
+ mr->initialized = true;
++
++ return 0;
++}
++
++static int _mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
++ struct vhost_iotlb *iotlb, unsigned int asid)
++{
++ int err;
++
++ err = _mlx5_vdpa_create_dvq_mr(mvdev, iotlb, asid);
++ if (err)
++ return err;
++
++ err = _mlx5_vdpa_create_cvq_mr(mvdev, iotlb, asid);
++ if (err)
++ goto out_err;
++
+ return 0;
+
+ out_err:
+- if (mvdev->group2asid[MLX5_VDPA_DATAVQ_GROUP] == asid) {
+- if (iotlb)
+- destroy_user_mr(mvdev, mr);
+- else
+- destroy_dma_mr(mvdev, mr);
+- }
++ _mlx5_vdpa_destroy_dvq_mr(mvdev, asid);
+
+ return err;
+ }
+diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
+index 279ac6a558d29..f18a9301ab94e 100644
+--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
++++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
+@@ -2564,7 +2564,7 @@ static int mlx5_vdpa_change_map(struct mlx5_vdpa_dev *mvdev,
+ goto err_mr;
+
+ teardown_driver(ndev);
+- mlx5_vdpa_destroy_mr(mvdev);
++ mlx5_vdpa_destroy_mr_asid(mvdev, asid);
+ err = mlx5_vdpa_create_mr(mvdev, iotlb, asid);
+ if (err)
+ goto err_mr;
+@@ -2580,7 +2580,7 @@ static int mlx5_vdpa_change_map(struct mlx5_vdpa_dev *mvdev,
+ return 0;
+
+ err_setup:
+- mlx5_vdpa_destroy_mr(mvdev);
++ mlx5_vdpa_destroy_mr_asid(mvdev, asid);
+ err_mr:
+ return err;
+ }
+diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
+index 965e32529eb85..a7612e0783b36 100644
+--- a/drivers/vdpa/vdpa.c
++++ b/drivers/vdpa/vdpa.c
+@@ -1247,44 +1247,41 @@ static const struct nla_policy vdpa_nl_policy[VDPA_ATTR_MAX + 1] = {
+ [VDPA_ATTR_MGMTDEV_DEV_NAME] = { .type = NLA_STRING },
+ [VDPA_ATTR_DEV_NAME] = { .type = NLA_STRING },
+ [VDPA_ATTR_DEV_NET_CFG_MACADDR] = NLA_POLICY_ETH_ADDR,
++ [VDPA_ATTR_DEV_NET_CFG_MAX_VQP] = { .type = NLA_U16 },
+ /* virtio spec 1.1 section 5.1.4.1 for valid MTU range */
+ [VDPA_ATTR_DEV_NET_CFG_MTU] = NLA_POLICY_MIN(NLA_U16, 68),
++ [VDPA_ATTR_DEV_QUEUE_INDEX] = { .type = NLA_U32 },
++ [VDPA_ATTR_DEV_FEATURES] = { .type = NLA_U64 },
+ };
+
+ static const struct genl_ops vdpa_nl_ops[] = {
+ {
+ .cmd = VDPA_CMD_MGMTDEV_GET,
+- .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+ .doit = vdpa_nl_cmd_mgmtdev_get_doit,
+ .dumpit = vdpa_nl_cmd_mgmtdev_get_dumpit,
+ },
+ {
+ .cmd = VDPA_CMD_DEV_NEW,
+- .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+ .doit = vdpa_nl_cmd_dev_add_set_doit,
+ .flags = GENL_ADMIN_PERM,
+ },
+ {
+ .cmd = VDPA_CMD_DEV_DEL,
+- .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+ .doit = vdpa_nl_cmd_dev_del_set_doit,
+ .flags = GENL_ADMIN_PERM,
+ },
+ {
+ .cmd = VDPA_CMD_DEV_GET,
+- .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+ .doit = vdpa_nl_cmd_dev_get_doit,
+ .dumpit = vdpa_nl_cmd_dev_get_dumpit,
+ },
+ {
+ .cmd = VDPA_CMD_DEV_CONFIG_GET,
+- .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+ .doit = vdpa_nl_cmd_dev_config_get_doit,
+ .dumpit = vdpa_nl_cmd_dev_config_get_dumpit,
+ },
+ {
+ .cmd = VDPA_CMD_DEV_VSTATS_GET,
+- .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+ .doit = vdpa_nl_cmd_dev_stats_get_doit,
+ .flags = GENL_ADMIN_PERM,
+ },
+diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
+index 0d84e6a9c3cca..76d4ab451f599 100644
+--- a/drivers/vdpa/vdpa_user/vduse_dev.c
++++ b/drivers/vdpa/vdpa_user/vduse_dev.c
+@@ -935,10 +935,10 @@ static void vduse_dev_irq_inject(struct work_struct *work)
+ {
+ struct vduse_dev *dev = container_of(work, struct vduse_dev, inject);
+
+- spin_lock_irq(&dev->irq_lock);
++ spin_lock_bh(&dev->irq_lock);
+ if (dev->config_cb.callback)
+ dev->config_cb.callback(dev->config_cb.private);
+- spin_unlock_irq(&dev->irq_lock);
++ spin_unlock_bh(&dev->irq_lock);
+ }
+
+ static void vduse_vq_irq_inject(struct work_struct *work)
+@@ -946,10 +946,10 @@ static void vduse_vq_irq_inject(struct work_struct *work)
+ struct vduse_virtqueue *vq = container_of(work,
+ struct vduse_virtqueue, inject);
+
+- spin_lock_irq(&vq->irq_lock);
++ spin_lock_bh(&vq->irq_lock);
+ if (vq->ready && vq->cb.callback)
+ vq->cb.callback(vq->cb.private);
+- spin_unlock_irq(&vq->irq_lock);
++ spin_unlock_bh(&vq->irq_lock);
+ }
+
+ static bool vduse_vq_signal_irqfd(struct vduse_virtqueue *vq)
+diff --git a/drivers/video/fbdev/mmp/hw/mmp_ctrl.c b/drivers/video/fbdev/mmp/hw/mmp_ctrl.c
+index 51fbf02a03430..76b50b6c98ad9 100644
+--- a/drivers/video/fbdev/mmp/hw/mmp_ctrl.c
++++ b/drivers/video/fbdev/mmp/hw/mmp_ctrl.c
+@@ -519,7 +519,9 @@ static int mmphw_probe(struct platform_device *pdev)
+ "unable to get clk %s\n", mi->clk_name);
+ goto failed;
+ }
+- clk_prepare_enable(ctrl->clk);
++ ret = clk_prepare_enable(ctrl->clk);
++ if (ret)
++ goto failed;
+
+ /* init global regs */
+ ctrl_set_default(ctrl);
+diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
+index a46a4a29e9295..97760f6112959 100644
+--- a/drivers/virtio/virtio_mmio.c
++++ b/drivers/virtio/virtio_mmio.c
+@@ -607,9 +607,8 @@ static void virtio_mmio_release_dev(struct device *_d)
+ struct virtio_device *vdev =
+ container_of(_d, struct virtio_device, dev);
+ struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+- struct platform_device *pdev = vm_dev->pdev;
+
+- devm_kfree(&pdev->dev, vm_dev);
++ kfree(vm_dev);
+ }
+
+ /* Platform device */
+@@ -620,7 +619,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
+ unsigned long magic;
+ int rc;
+
+- vm_dev = devm_kzalloc(&pdev->dev, sizeof(*vm_dev), GFP_KERNEL);
++ vm_dev = kzalloc(sizeof(*vm_dev), GFP_KERNEL);
+ if (!vm_dev)
+ return -ENOMEM;
+
+diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
+index 989e2d7184ce4..961161da59000 100644
+--- a/drivers/virtio/virtio_vdpa.c
++++ b/drivers/virtio/virtio_vdpa.c
+@@ -393,11 +393,13 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
+ cb.callback = virtio_vdpa_config_cb;
+ cb.private = vd_dev;
+ ops->set_config_cb(vdpa, &cb);
++ kfree(masks);
+
+ return 0;
+
+ err_setup_vq:
+ virtio_vdpa_del_vqs(vdev);
++ kfree(masks);
+ return err;
+ }
+
+diff --git a/drivers/watchdog/sp5100_tco.c b/drivers/watchdog/sp5100_tco.c
+index 14f8d8d90920f..2bd3dc25cb030 100644
+--- a/drivers/watchdog/sp5100_tco.c
++++ b/drivers/watchdog/sp5100_tco.c
+@@ -96,7 +96,7 @@ static enum tco_reg_layout tco_reg_layout(struct pci_dev *dev)
+ sp5100_tco_pci->device == PCI_DEVICE_ID_AMD_KERNCZ_SMBUS &&
+ sp5100_tco_pci->revision >= AMD_ZEN_SMBUS_PCI_REV) {
+ return efch_mmio;
+- } else if (dev->vendor == PCI_VENDOR_ID_AMD &&
++ } else if ((dev->vendor == PCI_VENDOR_ID_AMD || dev->vendor == PCI_VENDOR_ID_HYGON) &&
+ ((dev->device == PCI_DEVICE_ID_AMD_HUDSON2_SMBUS &&
+ dev->revision >= 0x41) ||
+ (dev->device == PCI_DEVICE_ID_AMD_KERNCZ_SMBUS &&
+@@ -579,6 +579,8 @@ static const struct pci_device_id sp5100_tco_pci_tbl[] = {
+ PCI_ANY_ID, },
+ { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_KERNCZ_SMBUS, PCI_ANY_ID,
+ PCI_ANY_ID, },
++ { PCI_VENDOR_ID_HYGON, PCI_DEVICE_ID_AMD_KERNCZ_SMBUS, PCI_ANY_ID,
++ PCI_ANY_ID, },
+ { 0, }, /* End of list */
+ };
+ MODULE_DEVICE_TABLE(pci, sp5100_tco_pci_tbl);
+diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
+index 2a60033d907bf..a250afa655d5c 100644
+--- a/fs/btrfs/block-group.c
++++ b/fs/btrfs/block-group.c
+@@ -1670,6 +1670,10 @@ void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
+ btrfs_get_block_group(bg);
+ trace_btrfs_add_unused_block_group(bg);
+ list_add_tail(&bg->bg_list, &fs_info->unused_bgs);
++ } else if (!test_bit(BLOCK_GROUP_FLAG_NEW, &bg->runtime_flags)) {
++ /* Pull out the block group from the reclaim_bgs list. */
++ trace_btrfs_add_unused_block_group(bg);
++ list_move_tail(&bg->bg_list, &fs_info->unused_bgs);
+ }
+ spin_unlock(&fs_info->unused_bgs_lock);
+ }
+@@ -2693,6 +2697,7 @@ void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans)
+ next:
+ btrfs_delayed_refs_rsv_release(fs_info, 1);
+ list_del_init(&block_group->bg_list);
++ clear_bit(BLOCK_GROUP_FLAG_NEW, &block_group->runtime_flags);
+ }
+ btrfs_trans_release_chunk_metadata(trans);
+ }
+@@ -2732,6 +2737,13 @@ struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *tran
+ if (!cache)
+ return ERR_PTR(-ENOMEM);
+
++ /*
++ * Mark it as new before adding it to the rbtree of block groups or any
++ * list, so that no other task finds it and calls btrfs_mark_bg_unused()
++ * before the new flag is set.
++ */
++ set_bit(BLOCK_GROUP_FLAG_NEW, &cache->runtime_flags);
++
+ cache->length = size;
+ set_free_space_tree_thresholds(cache);
+ cache->flags = type;
+diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
+index 471f591db7c0c..0852f6c101f82 100644
+--- a/fs/btrfs/block-group.h
++++ b/fs/btrfs/block-group.h
+@@ -70,6 +70,11 @@ enum btrfs_block_group_flags {
+ BLOCK_GROUP_FLAG_NEEDS_FREE_SPACE,
+ /* Indicate that the block group is placed on a sequential zone */
+ BLOCK_GROUP_FLAG_SEQUENTIAL_ZONE,
++ /*
++ * Indicate that the block group is in the list of new block groups of a
++ * transaction.
++ */
++ BLOCK_GROUP_FLAG_NEW,
+ };
+
+ enum btrfs_caching_type {
+diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
+index 4c1986cd5bed5..cff98526679e7 100644
+--- a/fs/btrfs/ctree.h
++++ b/fs/btrfs/ctree.h
+@@ -443,6 +443,7 @@ struct btrfs_drop_extents_args {
+
+ struct btrfs_file_private {
+ void *filldir_buf;
++ u64 last_index;
+ struct extent_state *llseek_cached_state;
+ };
+
+diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
+index 6b457b010cbc4..6d51db066503b 100644
+--- a/fs/btrfs/delayed-inode.c
++++ b/fs/btrfs/delayed-inode.c
+@@ -1632,6 +1632,7 @@ int btrfs_inode_delayed_dir_index_count(struct btrfs_inode *inode)
+ }
+
+ bool btrfs_readdir_get_delayed_items(struct inode *inode,
++ u64 last_index,
+ struct list_head *ins_list,
+ struct list_head *del_list)
+ {
+@@ -1651,14 +1652,14 @@ bool btrfs_readdir_get_delayed_items(struct inode *inode,
+
+ mutex_lock(&delayed_node->mutex);
+ item = __btrfs_first_delayed_insertion_item(delayed_node);
+- while (item) {
++ while (item && item->index <= last_index) {
+ refcount_inc(&item->refs);
+ list_add_tail(&item->readdir_list, ins_list);
+ item = __btrfs_next_delayed_item(item);
+ }
+
+ item = __btrfs_first_delayed_deletion_item(delayed_node);
+- while (item) {
++ while (item && item->index <= last_index) {
+ refcount_inc(&item->refs);
+ list_add_tail(&item->readdir_list, del_list);
+ item = __btrfs_next_delayed_item(item);
+diff --git a/fs/btrfs/delayed-inode.h b/fs/btrfs/delayed-inode.h
+index 4f21daa3dbc7b..dc1085b2a3976 100644
+--- a/fs/btrfs/delayed-inode.h
++++ b/fs/btrfs/delayed-inode.h
+@@ -148,6 +148,7 @@ void btrfs_destroy_delayed_inodes(struct btrfs_fs_info *fs_info);
+
+ /* Used for readdir() */
+ bool btrfs_readdir_get_delayed_items(struct inode *inode,
++ u64 last_index,
+ struct list_head *ins_list,
+ struct list_head *del_list);
+ void btrfs_readdir_put_delayed_items(struct inode *inode,
+diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
+index 54eed5a8a412b..00f260c8bd60a 100644
+--- a/fs/btrfs/extent_io.c
++++ b/fs/btrfs/extent_io.c
+@@ -962,7 +962,30 @@ static void submit_extent_page(struct btrfs_bio_ctrl *bio_ctrl,
+ size -= len;
+ pg_offset += len;
+ disk_bytenr += len;
+- bio_ctrl->len_to_oe_boundary -= len;
++
++ /*
++ * len_to_oe_boundary defaults to U32_MAX, which isn't page or
++ * sector aligned. alloc_new_bio() then sets it to the end of
++ * our ordered extent for writes into zoned devices.
++ *
++ * When len_to_oe_boundary is tracking an ordered extent, we
++ * trust the ordered extent code to align things properly, and
++ * the check above to cap our write to the ordered extent
++ * boundary is correct.
++ *
++ * When len_to_oe_boundary is U32_MAX, the cap above would
++ * result in a 4095 byte IO for the last page right before
++ * we hit the bio limit of UINT_MAX. bio_add_page() has all
++ * the checks required to make sure we don't overflow the bio,
++ * and we should just ignore len_to_oe_boundary completely
++ * unless we're using it to track an ordered extent.
++ *
++ * It's pretty hard to make a bio sized U32_MAX, but it can
++ * happen when the page cache is able to feed us contiguous
++ * pages for large extents.
++ */
++ if (bio_ctrl->len_to_oe_boundary != U32_MAX)
++ bio_ctrl->len_to_oe_boundary -= len;
+
+ /* Ordered extent boundary: move on to a new bio. */
+ if (bio_ctrl->len_to_oe_boundary == 0)
+diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
+index 138afa955370b..367ed73cb6c74 100644
+--- a/fs/btrfs/extent_map.c
++++ b/fs/btrfs/extent_map.c
+@@ -758,8 +758,6 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end,
+
+ if (skip_pinned && test_bit(EXTENT_FLAG_PINNED, &em->flags)) {
+ start = em_end;
+- if (end != (u64)-1)
+- len = start + len - em_end;
+ goto next;
+ }
+
+@@ -827,8 +825,8 @@ void btrfs_drop_extent_map_range(struct btrfs_inode *inode, u64 start, u64 end,
+ if (!split)
+ goto remove_em;
+ }
+- split->start = start + len;
+- split->len = em_end - (start + len);
++ split->start = end;
++ split->len = em_end - end;
+ split->block_start = em->block_start;
+ split->flags = flags;
+ split->compress_type = em->compress_type;
+diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
+index ace949bc75059..a446965d701db 100644
+--- a/fs/btrfs/inode.c
++++ b/fs/btrfs/inode.c
+@@ -5744,6 +5744,74 @@ static struct dentry *btrfs_lookup(struct inode *dir, struct dentry *dentry,
+ return d_splice_alias(inode, dentry);
+ }
+
++/*
++ * Find the highest existing sequence number in a directory and then set the
++ * in-memory index_cnt variable to the first free sequence number.
++ */
++static int btrfs_set_inode_index_count(struct btrfs_inode *inode)
++{
++ struct btrfs_root *root = inode->root;
++ struct btrfs_key key, found_key;
++ struct btrfs_path *path;
++ struct extent_buffer *leaf;
++ int ret;
++
++ key.objectid = btrfs_ino(inode);
++ key.type = BTRFS_DIR_INDEX_KEY;
++ key.offset = (u64)-1;
++
++ path = btrfs_alloc_path();
++ if (!path)
++ return -ENOMEM;
++
++ ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
++ if (ret < 0)
++ goto out;
++ /* FIXME: we should be able to handle this */
++ if (ret == 0)
++ goto out;
++ ret = 0;
++
++ if (path->slots[0] == 0) {
++ inode->index_cnt = BTRFS_DIR_START_INDEX;
++ goto out;
++ }
++
++ path->slots[0]--;
++
++ leaf = path->nodes[0];
++ btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
++
++ if (found_key.objectid != btrfs_ino(inode) ||
++ found_key.type != BTRFS_DIR_INDEX_KEY) {
++ inode->index_cnt = BTRFS_DIR_START_INDEX;
++ goto out;
++ }
++
++ inode->index_cnt = found_key.offset + 1;
++out:
++ btrfs_free_path(path);
++ return ret;
++}
++
++static int btrfs_get_dir_last_index(struct btrfs_inode *dir, u64 *index)
++{
++ if (dir->index_cnt == (u64)-1) {
++ int ret;
++
++ ret = btrfs_inode_delayed_dir_index_count(dir);
++ if (ret) {
++ ret = btrfs_set_inode_index_count(dir);
++ if (ret)
++ return ret;
++ }
++ }
++
++ *index = dir->index_cnt;
++
++ return 0;
++}
++
+ /*
+ * All this infrastructure exists because dir_emit can fault, and we are holding
+ * the tree lock when doing readdir. For now just allocate a buffer and copy
+@@ -5756,10 +5824,17 @@ static struct dentry *btrfs_lookup(struct inode *dir, struct dentry *dentry,
+ static int btrfs_opendir(struct inode *inode, struct file *file)
+ {
+ struct btrfs_file_private *private;
++ u64 last_index;
++ int ret;
++
++ ret = btrfs_get_dir_last_index(BTRFS_I(inode), &last_index);
++ if (ret)
++ return ret;
+
+ private = kzalloc(sizeof(struct btrfs_file_private), GFP_KERNEL);
+ if (!private)
+ return -ENOMEM;
++ private->last_index = last_index;
+ private->filldir_buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
+ if (!private->filldir_buf) {
+ kfree(private);
+@@ -5826,7 +5901,8 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
+
+ INIT_LIST_HEAD(&ins_list);
+ INIT_LIST_HEAD(&del_list);
+- put = btrfs_readdir_get_delayed_items(inode, &ins_list, &del_list);
++ put = btrfs_readdir_get_delayed_items(inode, private->last_index,
++ &ins_list, &del_list);
+
+ again:
+ key.type = BTRFS_DIR_INDEX_KEY;
+@@ -5844,6 +5920,8 @@ again:
+ break;
+ if (found_key.offset < ctx->pos)
+ continue;
++ if (found_key.offset > private->last_index)
++ break;
+ if (btrfs_should_delete_dir_index(&del_list, found_key.offset))
+ continue;
+ di = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_dir_item);
+@@ -5979,57 +6057,6 @@ static int btrfs_update_time(struct inode *inode, struct timespec64 *now,
+ return dirty ? btrfs_dirty_inode(BTRFS_I(inode)) : 0;
+ }
+
+-/*
+- * find the highest existing sequence number in a directory
+- * and then set the in-memory index_cnt variable to reflect
+- * free sequence numbers
+- */
+-static int btrfs_set_inode_index_count(struct btrfs_inode *inode)
+-{
+- struct btrfs_root *root = inode->root;
+- struct btrfs_key key, found_key;
+- struct btrfs_path *path;
+- struct extent_buffer *leaf;
+- int ret;
+-
+- key.objectid = btrfs_ino(inode);
+- key.type = BTRFS_DIR_INDEX_KEY;
+- key.offset = (u64)-1;
+-
+- path = btrfs_alloc_path();
+- if (!path)
+- return -ENOMEM;
+-
+- ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+- if (ret < 0)
+- goto out;
+- /* FIXME: we should be able to handle this */
+- if (ret == 0)
+- goto out;
+- ret = 0;
+-
+- if (path->slots[0] == 0) {
+- inode->index_cnt = BTRFS_DIR_START_INDEX;
+- goto out;
+- }
+-
+- path->slots[0]--;
+-
+- leaf = path->nodes[0];
+- btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
+-
+- if (found_key.objectid != btrfs_ino(inode) ||
+- found_key.type != BTRFS_DIR_INDEX_KEY) {
+- inode->index_cnt = BTRFS_DIR_START_INDEX;
+- goto out;
+- }
+-
+- inode->index_cnt = found_key.offset + 1;
+-out:
+- btrfs_free_path(path);
+- return ret;
+-}
+-
+ /*
+ * helper to find a free sequence number in a given directory. This current
+ * code is very simple, later versions will do smarter things in the btree
+diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
+index 16c228344cbb8..2feb7f2294233 100644
+--- a/fs/btrfs/scrub.c
++++ b/fs/btrfs/scrub.c
+@@ -655,7 +655,8 @@ static void scrub_verify_one_metadata(struct scrub_stripe *stripe, int sector_nr
+ btrfs_stack_header_bytenr(header), logical);
+ return;
+ }
+- if (memcmp(header->fsid, fs_info->fs_devices->fsid, BTRFS_FSID_SIZE) != 0) {
++ if (memcmp(header->fsid, fs_info->fs_devices->metadata_uuid,
++ BTRFS_FSID_SIZE) != 0) {
+ bitmap_set(&stripe->meta_error_bitmap, sector_nr, sectors_per_tree);
+ bitmap_set(&stripe->error_bitmap, sector_nr, sectors_per_tree);
+ btrfs_warn_rl(fs_info,
+diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
+index 436e15e3759da..30977f10e36b7 100644
+--- a/fs/btrfs/volumes.c
++++ b/fs/btrfs/volumes.c
+@@ -4631,8 +4631,7 @@ int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)
+ }
+ }
+
+- BUG_ON(fs_info->balance_ctl ||
+- test_bit(BTRFS_FS_BALANCE_RUNNING, &fs_info->flags));
++ ASSERT(!test_bit(BTRFS_FS_BALANCE_RUNNING, &fs_info->flags));
+ atomic_dec(&fs_info->balance_cancel_req);
+ mutex_unlock(&fs_info->balance_mutex);
+ return 0;
+diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
+index 83c4abff496da..5fb367b1d4b06 100644
+--- a/fs/ceph/mds_client.c
++++ b/fs/ceph/mds_client.c
+@@ -645,6 +645,7 @@ bad:
+ err = -EIO;
+ out_bad:
+ pr_err("mds parse_reply err %d\n", err);
++ ceph_msg_dump(msg);
+ return err;
+ }
+
+@@ -3538,6 +3539,7 @@ static void handle_forward(struct ceph_mds_client *mdsc,
+
+ bad:
+ pr_err("mdsc_handle_forward decode error err=%d\n", err);
++ ceph_msg_dump(msg);
+ }
+
+ static int __decode_session_metadata(void **p, void *end,
+@@ -5258,6 +5260,7 @@ void ceph_mdsc_handle_fsmap(struct ceph_mds_client *mdsc, struct ceph_msg *msg)
+ bad:
+ pr_err("error decoding fsmap %d. Shutting down mount.\n", err);
+ ceph_umount_begin(mdsc->fsc->sb);
++ ceph_msg_dump(msg);
+ err_out:
+ mutex_lock(&mdsc->mutex);
+ mdsc->mdsmap_err = err;
+@@ -5326,6 +5329,7 @@ bad_unlock:
+ bad:
+ pr_err("error decoding mdsmap %d. Shutting down mount.\n", err);
+ ceph_umount_begin(mdsc->fsc->sb);
++ ceph_msg_dump(msg);
+ return;
+ }
+
+diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
+index a84bf6444bba9..204ba7f8417e6 100644
+--- a/fs/gfs2/super.c
++++ b/fs/gfs2/super.c
+@@ -1004,7 +1004,14 @@ static int gfs2_show_options(struct seq_file *s, struct dentry *root)
+ {
+ struct gfs2_sbd *sdp = root->d_sb->s_fs_info;
+ struct gfs2_args *args = &sdp->sd_args;
+- int val;
++ unsigned int logd_secs, statfs_slow, statfs_quantum, quota_quantum;
++
++ spin_lock(&sdp->sd_tune.gt_spin);
++ logd_secs = sdp->sd_tune.gt_logd_secs;
++ quota_quantum = sdp->sd_tune.gt_quota_quantum;
++ statfs_quantum = sdp->sd_tune.gt_statfs_quantum;
++ statfs_slow = sdp->sd_tune.gt_statfs_slow;
++ spin_unlock(&sdp->sd_tune.gt_spin);
+
+ if (is_ancestor(root, sdp->sd_master_dir))
+ seq_puts(s, ",meta");
+@@ -1059,17 +1066,14 @@ static int gfs2_show_options(struct seq_file *s, struct dentry *root)
+ }
+ if (args->ar_discard)
+ seq_puts(s, ",discard");
+- val = sdp->sd_tune.gt_logd_secs;
+- if (val != 30)
+- seq_printf(s, ",commit=%d", val);
+- val = sdp->sd_tune.gt_statfs_quantum;
+- if (val != 30)
+- seq_printf(s, ",statfs_quantum=%d", val);
+- else if (sdp->sd_tune.gt_statfs_slow)
++ if (logd_secs != 30)
++ seq_printf(s, ",commit=%d", logd_secs);
++ if (statfs_quantum != 30)
++ seq_printf(s, ",statfs_quantum=%d", statfs_quantum);
++ else if (statfs_slow)
+ seq_puts(s, ",statfs_quantum=0");
+- val = sdp->sd_tune.gt_quota_quantum;
+- if (val != 60)
+- seq_printf(s, ",quota_quantum=%d", val);
++ if (quota_quantum != 60)
++ seq_printf(s, ",quota_quantum=%d", quota_quantum);
+ if (args->ar_statfs_percent)
+ seq_printf(s, ",statfs_percent=%d", args->ar_statfs_percent);
+ if (args->ar_errors != GFS2_ERRORS_DEFAULT) {
+diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
+index 8a4c866874297..facb84f262dc7 100644
+--- a/fs/netfs/iterator.c
++++ b/fs/netfs/iterator.c
+@@ -151,7 +151,7 @@ static ssize_t netfs_extract_user_to_sg(struct iov_iter *iter,
+
+ failed:
+ while (sgtable->nents > sgtable->orig_nents)
+- put_page(sg_page(&sgtable->sgl[--sgtable->nents]));
++ unpin_user_page(sg_page(&sgtable->sgl[--sgtable->nents]));
+ return res;
+ }
+
+diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c
+index 2bfcf1a989c95..50214b77c6a35 100644
+--- a/fs/ntfs3/frecord.c
++++ b/fs/ntfs3/frecord.c
+@@ -874,6 +874,7 @@ int ni_create_attr_list(struct ntfs_inode *ni)
+ if (err)
+ goto out1;
+
++ err = -EINVAL;
+ /* Call mi_remove_attr() in reverse order to keep pointers 'arr_move' valid. */
+ while (to_free > 0) {
+ struct ATTRIB *b = arr_move[--nb];
+@@ -882,7 +883,8 @@ int ni_create_attr_list(struct ntfs_inode *ni)
+
+ attr = mi_insert_attr(mi, b->type, Add2Ptr(b, name_off),
+ b->name_len, asize, name_off);
+- WARN_ON(!attr);
++ if (!attr)
++ goto out1;
+
+ mi_get_ref(mi, &le_b[nb]->ref);
+ le_b[nb]->id = attr->id;
+@@ -892,17 +894,20 @@ int ni_create_attr_list(struct ntfs_inode *ni)
+ attr->id = le_b[nb]->id;
+
+ /* Remove from primary record. */
+- WARN_ON(!mi_remove_attr(NULL, &ni->mi, b));
++ if (!mi_remove_attr(NULL, &ni->mi, b))
++ goto out1;
+
+ if (to_free <= asize)
+ break;
+ to_free -= asize;
+- WARN_ON(!nb);
++ if (!nb)
++ goto out1;
+ }
+
+ attr = mi_insert_attr(&ni->mi, ATTR_LIST, NULL, 0,
+ lsize + SIZEOF_RESIDENT, SIZEOF_RESIDENT);
+- WARN_ON(!attr);
++ if (!attr)
++ goto out1;
+
+ attr->non_res = 0;
+ attr->flags = 0;
+@@ -922,9 +927,10 @@ out1:
+ kfree(ni->attr_list.le);
+ ni->attr_list.le = NULL;
+ ni->attr_list.size = 0;
++ return err;
+
+ out:
+- return err;
++ return 0;
+ }
+
+ /*
+diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
+index 28cc421102e59..21567e58265c4 100644
+--- a/fs/ntfs3/fsntfs.c
++++ b/fs/ntfs3/fsntfs.c
+@@ -178,7 +178,7 @@ int ntfs_fix_post_read(struct NTFS_RECORD_HEADER *rhdr, size_t bytes,
+ /* Check errors. */
+ if ((fo & 1) || fo + fn * sizeof(short) > SECTOR_SIZE || !fn-- ||
+ fn * SECTOR_SIZE > bytes) {
+- return -EINVAL; /* Native chkntfs returns ok! */
++ return -E_NTFS_CORRUPT;
+ }
+
+ /* Get fixup pointer. */
+diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
+index 0a48d2d672198..b40da258e6848 100644
+--- a/fs/ntfs3/index.c
++++ b/fs/ntfs3/index.c
+@@ -1113,6 +1113,12 @@ ok:
+ *node = in;
+
+ out:
++ if (err == -E_NTFS_CORRUPT) {
++ ntfs_inode_err(&ni->vfs_inode, "directory corrupted");
++ ntfs_set_state(ni->mi.sbi, NTFS_DIRTY_ERROR);
++ err = -EINVAL;
++ }
++
+ if (ib != in->index)
+ kfree(ib);
+
+diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
+index eb01f7e76479a..2e4be773728df 100644
+--- a/fs/ntfs3/ntfs_fs.h
++++ b/fs/ntfs3/ntfs_fs.h
+@@ -53,6 +53,8 @@ enum utf16_endian;
+ #define E_NTFS_NONRESIDENT 556
+ /* NTFS specific error code about punch hole. */
+ #define E_NTFS_NOTALIGNED 557
++/* NTFS specific error code when on-disk struct is corrupted. */
++#define E_NTFS_CORRUPT 558
+
+
+ /* sbi->flags */
+diff --git a/fs/ntfs3/record.c b/fs/ntfs3/record.c
+index 2a281cead2bcc..7974ca35a15c6 100644
+--- a/fs/ntfs3/record.c
++++ b/fs/ntfs3/record.c
+@@ -124,7 +124,7 @@ int mi_read(struct mft_inode *mi, bool is_mft)
+ struct rw_semaphore *rw_lock = NULL;
+
+ if (is_mounted(sbi)) {
+- if (!is_mft) {
++ if (!is_mft && mft_ni) {
+ rw_lock = &mft_ni->file.run_lock;
+ down_read(rw_lock);
+ }
+@@ -148,7 +148,7 @@ int mi_read(struct mft_inode *mi, bool is_mft)
+ ni_lock(mft_ni);
+ down_write(rw_lock);
+ }
+- err = attr_load_runs_vcn(mft_ni, ATTR_DATA, NULL, 0, &mft_ni->file.run,
++ err = attr_load_runs_vcn(mft_ni, ATTR_DATA, NULL, 0, run,
+ vbo >> sbi->cluster_bits);
+ if (rw_lock) {
+ up_write(rw_lock);
+@@ -180,6 +180,12 @@ ok:
+ return 0;
+
+ out:
++ if (err == -E_NTFS_CORRUPT) {
++ ntfs_err(sbi->sb, "mft corrupted");
++ ntfs_set_state(sbi, NTFS_DIRTY_ERROR);
++ err = -EINVAL;
++ }
++
+ return err;
+ }
+
+diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
+index 5158dd31fd97f..ecf899d571d83 100644
+--- a/fs/ntfs3/super.c
++++ b/fs/ntfs3/super.c
+@@ -724,6 +724,8 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+ struct MFT_REC *rec;
+ u16 fn, ao;
+ u8 cluster_bits;
++ u32 boot_off = 0;
++ const char *hint = "Primary boot";
+
+ sbi->volume.blocks = dev_size >> PAGE_SHIFT;
+
+@@ -731,11 +733,12 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+ if (!bh)
+ return -EIO;
+
++check_boot:
+ err = -EINVAL;
+- boot = (struct NTFS_BOOT *)bh->b_data;
++ boot = (struct NTFS_BOOT *)Add2Ptr(bh->b_data, boot_off);
+
+ if (memcmp(boot->system_id, "NTFS ", sizeof("NTFS ") - 1)) {
+- ntfs_err(sb, "Boot's signature is not NTFS.");
++ ntfs_err(sb, "%s signature is not NTFS.", hint);
+ goto out;
+ }
+
+@@ -748,14 +751,16 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+ boot->bytes_per_sector[0];
+ if (boot_sector_size < SECTOR_SIZE ||
+ !is_power_of_2(boot_sector_size)) {
+- ntfs_err(sb, "Invalid bytes per sector %u.", boot_sector_size);
++ ntfs_err(sb, "%s: invalid bytes per sector %u.", hint,
++ boot_sector_size);
+ goto out;
+ }
+
+ /* cluster size: 512, 1K, 2K, 4K, ... 2M */
+ sct_per_clst = true_sectors_per_clst(boot);
+ if ((int)sct_per_clst < 0 || !is_power_of_2(sct_per_clst)) {
+- ntfs_err(sb, "Invalid sectors per cluster %u.", sct_per_clst);
++ ntfs_err(sb, "%s: invalid sectors per cluster %u.", hint,
++ sct_per_clst);
+ goto out;
+ }
+
+@@ -771,8 +776,8 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+ if (mlcn * sct_per_clst >= sectors || mlcn2 * sct_per_clst >= sectors) {
+ ntfs_err(
+ sb,
+- "Start of MFT 0x%llx (0x%llx) is out of volume 0x%llx.",
+- mlcn, mlcn2, sectors);
++ "%s: start of MFT 0x%llx (0x%llx) is out of volume 0x%llx.",
++ hint, mlcn, mlcn2, sectors);
+ goto out;
+ }
+
+@@ -784,7 +789,7 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+
+ /* Check MFT record size. */
+ if (record_size < SECTOR_SIZE || !is_power_of_2(record_size)) {
+- ntfs_err(sb, "Invalid bytes per MFT record %u (%d).",
++ ntfs_err(sb, "%s: invalid bytes per MFT record %u (%d).", hint,
+ record_size, boot->record_size);
+ goto out;
+ }
+@@ -801,13 +806,13 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+
+ /* Check index record size. */
+ if (sbi->index_size < SECTOR_SIZE || !is_power_of_2(sbi->index_size)) {
+- ntfs_err(sb, "Invalid bytes per index %u(%d).", sbi->index_size,
+- boot->index_size);
++ ntfs_err(sb, "%s: invalid bytes per index %u(%d).", hint,
++ sbi->index_size, boot->index_size);
+ goto out;
+ }
+
+ if (sbi->index_size > MAXIMUM_BYTES_PER_INDEX) {
+- ntfs_err(sb, "Unsupported bytes per index %u.",
++ ntfs_err(sb, "%s: unsupported bytes per index %u.", hint,
+ sbi->index_size);
+ goto out;
+ }
+@@ -834,7 +839,7 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+
+ /* Compare boot's cluster and sector. */
+ if (sbi->cluster_size < boot_sector_size) {
+- ntfs_err(sb, "Invalid bytes per cluster (%u).",
++ ntfs_err(sb, "%s: invalid bytes per cluster (%u).", hint,
+ sbi->cluster_size);
+ goto out;
+ }
+@@ -930,7 +935,46 @@ static int ntfs_init_from_boot(struct super_block *sb, u32 sector_size,
+
+ err = 0;
+
++ if (bh->b_blocknr && !sb_rdonly(sb)) {
++ /*
++ * Alternative boot is ok but primary is not ok.
++ * Update primary boot.
++ */
++ struct buffer_head *bh0 = sb_getblk(sb, 0);
++ if (bh0) {
++ if (buffer_locked(bh0))
++ __wait_on_buffer(bh0);
++
++ lock_buffer(bh0);
++ memcpy(bh0->b_data, boot, sizeof(*boot));
++ set_buffer_uptodate(bh0);
++ mark_buffer_dirty(bh0);
++ unlock_buffer(bh0);
++ if (!sync_dirty_buffer(bh0))
++ ntfs_warn(sb, "primary boot is updated");
++ put_bh(bh0);
++ }
++ }
++
+ out:
++ if (err == -EINVAL && !bh->b_blocknr && dev_size > PAGE_SHIFT) {
++ u32 block_size = min_t(u32, sector_size, PAGE_SIZE);
++ u64 lbo = dev_size - sizeof(*boot);
++
++ /*
++ * Try alternative boot (last sector)
++ */
++ brelse(bh);
++
++ sb_set_blocksize(sb, block_size);
++ bh = ntfs_bread(sb, lbo >> blksize_bits(block_size));
++ if (!bh)
++ return -EINVAL;
++
++ boot_off = lbo & (block_size - 1);
++ hint = "Alternative boot";
++ goto check_boot;
++ }
+ brelse(bh);
+
+ return err;
+@@ -955,6 +999,7 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
+ struct ATTR_DEF_ENTRY *t;
+ u16 *shared;
+ struct MFT_REF ref;
++ bool ro = sb_rdonly(sb);
+
+ ref.high = 0;
+
+@@ -1035,6 +1080,10 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
+ sbi->volume.minor_ver = info->minor_ver;
+ sbi->volume.flags = info->flags;
+ sbi->volume.ni = ni;
++ if (info->flags & VOLUME_FLAG_DIRTY) {
++ sbi->volume.real_dirty = true;
++ ntfs_info(sb, "It is recommended to use chkdsk.");
++ }
+
+ /* Load $MFTMirr to estimate recs_mirr. */
+ ref.low = cpu_to_le32(MFT_REC_MIRR);
+@@ -1069,21 +1118,16 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
+
+ iput(inode);
+
+- if (sbi->flags & NTFS_FLAGS_NEED_REPLAY) {
+- if (!sb_rdonly(sb)) {
+- ntfs_warn(sb,
+- "failed to replay log file. Can't mount rw!");
+- err = -EINVAL;
+- goto out;
+- }
+- } else if (sbi->volume.flags & VOLUME_FLAG_DIRTY) {
+- if (!sb_rdonly(sb) && !options->force) {
+- ntfs_warn(
+- sb,
+- "volume is dirty and \"force\" flag is not set!");
+- err = -EINVAL;
+- goto out;
+- }
++ if ((sbi->flags & NTFS_FLAGS_NEED_REPLAY) && !ro) {
++ ntfs_warn(sb, "failed to replay log file. Can't mount rw!");
++ err = -EINVAL;
++ goto out;
++ }
++
++ if ((sbi->volume.flags & VOLUME_FLAG_DIRTY) && !ro && !options->force) {
++ ntfs_warn(sb, "volume is dirty and \"force\" flag is not set!");
++ err = -EINVAL;
++ goto out;
+ }
+
+ /* Load $MFT. */
+@@ -1173,7 +1217,7 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
+
+ bad_len += len;
+ bad_frags += 1;
+- if (sb_rdonly(sb))
++ if (ro)
+ continue;
+
+ if (wnd_set_used_safe(&sbi->used.bitmap, lcn, len, &tt) || tt) {
+diff --git a/fs/ntfs3/xattr.c b/fs/ntfs3/xattr.c
+index fd02fcf4d4091..26787c2bbf758 100644
+--- a/fs/ntfs3/xattr.c
++++ b/fs/ntfs3/xattr.c
+@@ -141,6 +141,7 @@ static int ntfs_read_ea(struct ntfs_inode *ni, struct EA_FULL **ea,
+
+ memset(Add2Ptr(ea_p, size), 0, add_bytes);
+
++ err = -EINVAL;
+ /* Check all attributes for consistency. */
+ for (off = 0; off < size; off += ea_size) {
+ const struct EA_FULL *ef = Add2Ptr(ea_p, off);
+diff --git a/fs/smb/client/cifs_debug.c b/fs/smb/client/cifs_debug.c
+index ed0f71137584f..d14e88e14fb2e 100644
+--- a/fs/smb/client/cifs_debug.c
++++ b/fs/smb/client/cifs_debug.c
+@@ -153,6 +153,11 @@ cifs_dump_channel(struct seq_file *m, int i, struct cifs_chan *chan)
+ in_flight(server),
+ atomic_read(&server->in_send),
+ atomic_read(&server->num_waiters));
++#ifdef CONFIG_NET_NS
++ if (server->net)
++ seq_printf(m, " Net namespace: %u ", server->net->ns.inum);
++#endif /* NET_NS */
++
+ }
+
+ static inline const char *smb_speed_to_str(size_t bps)
+@@ -429,10 +434,15 @@ skip_rdma:
+ server->reconnect_instance,
+ server->srv_count,
+ server->sec_mode, in_flight(server));
++#ifdef CONFIG_NET_NS
++ if (server->net)
++ seq_printf(m, " Net namespace: %u ", server->net->ns.inum);
++#endif /* NET_NS */
+
+ seq_printf(m, "\nIn Send: %d In MaxReq Wait: %d",
+ atomic_read(&server->in_send),
+ atomic_read(&server->num_waiters));
++
+ if (server->leaf_fullpath) {
+ seq_printf(m, "\nDFS leaf full path: %s",
+ server->leaf_fullpath);
+diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
+index 43a4d8603db34..30b03938f6d1d 100644
+--- a/fs/smb/client/cifsfs.c
++++ b/fs/smb/client/cifsfs.c
+@@ -884,11 +884,11 @@ struct dentry *
+ cifs_smb3_do_mount(struct file_system_type *fs_type,
+ int flags, struct smb3_fs_context *old_ctx)
+ {
+- int rc;
+- struct super_block *sb = NULL;
+- struct cifs_sb_info *cifs_sb = NULL;
+ struct cifs_mnt_data mnt_data;
++ struct cifs_sb_info *cifs_sb;
++ struct super_block *sb;
+ struct dentry *root;
++ int rc;
+
+ if (cifsFYI) {
+ cifs_dbg(FYI, "%s: devname=%s flags=0x%x\n", __func__,
+@@ -897,11 +897,9 @@ cifs_smb3_do_mount(struct file_system_type *fs_type,
+ cifs_info("Attempting to mount %s\n", old_ctx->source);
+ }
+
+- cifs_sb = kzalloc(sizeof(struct cifs_sb_info), GFP_KERNEL);
+- if (cifs_sb == NULL) {
+- root = ERR_PTR(-ENOMEM);
+- goto out;
+- }
++ cifs_sb = kzalloc(sizeof(*cifs_sb), GFP_KERNEL);
++ if (!cifs_sb)
++ return ERR_PTR(-ENOMEM);
+
+ cifs_sb->ctx = kzalloc(sizeof(struct smb3_fs_context), GFP_KERNEL);
+ if (!cifs_sb->ctx) {
+@@ -938,10 +936,8 @@ cifs_smb3_do_mount(struct file_system_type *fs_type,
+
+ sb = sget(fs_type, cifs_match_super, cifs_set_super, flags, &mnt_data);
+ if (IS_ERR(sb)) {
+- root = ERR_CAST(sb);
+ cifs_umount(cifs_sb);
+- cifs_sb = NULL;
+- goto out;
++ return ERR_CAST(sb);
+ }
+
+ if (sb->s_root) {
+@@ -972,13 +968,9 @@ out_super:
+ deactivate_locked_super(sb);
+ return root;
+ out:
+- if (cifs_sb) {
+- if (!sb || IS_ERR(sb)) { /* otherwise kill_sb will handle */
+- kfree(cifs_sb->prepath);
+- smb3_cleanup_fs_context(cifs_sb->ctx);
+- kfree(cifs_sb);
+- }
+- }
++ kfree(cifs_sb->prepath);
++ smb3_cleanup_fs_context(cifs_sb->ctx);
++ kfree(cifs_sb);
+ return root;
+ }
+
+diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
+index d554bca7e07eb..855454ff6cede 100644
+--- a/fs/smb/client/file.c
++++ b/fs/smb/client/file.c
+@@ -4681,9 +4681,9 @@ static int cifs_readpage_worker(struct file *file, struct page *page,
+
+ io_error:
+ kunmap(page);
+- unlock_page(page);
+
+ read_complete:
++ unlock_page(page);
+ return rc;
+ }
+
+@@ -4878,9 +4878,11 @@ void cifs_oplock_break(struct work_struct *work)
+ struct cifsFileInfo *cfile = container_of(work, struct cifsFileInfo,
+ oplock_break);
+ struct inode *inode = d_inode(cfile->dentry);
++ struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
+ struct cifsInodeInfo *cinode = CIFS_I(inode);
+- struct cifs_tcon *tcon = tlink_tcon(cfile->tlink);
+- struct TCP_Server_Info *server = tcon->ses->server;
++ struct cifs_tcon *tcon;
++ struct TCP_Server_Info *server;
++ struct tcon_link *tlink;
+ int rc = 0;
+ bool purge_cache = false, oplock_break_cancelled;
+ __u64 persistent_fid, volatile_fid;
+@@ -4889,6 +4891,12 @@ void cifs_oplock_break(struct work_struct *work)
+ wait_on_bit(&cinode->flags, CIFS_INODE_PENDING_WRITERS,
+ TASK_UNINTERRUPTIBLE);
+
++ tlink = cifs_sb_tlink(cifs_sb);
++ if (IS_ERR(tlink))
++ goto out;
++ tcon = tlink_tcon(tlink);
++ server = tcon->ses->server;
++
+ server->ops->downgrade_oplock(server, cinode, cfile->oplock_level,
+ cfile->oplock_epoch, &purge_cache);
+
+@@ -4938,18 +4946,19 @@ oplock_break_ack:
+ /*
+ * MS-SMB2 3.2.5.19.1 and 3.2.5.19.2 (and MS-CIFS 3.2.5.42) do not require
+ * an acknowledgment to be sent when the file has already been closed.
+- * check for server null, since can race with kill_sb calling tree disconnect.
+ */
+ spin_lock(&cinode->open_file_lock);
+- if (tcon->ses && tcon->ses->server && !oplock_break_cancelled &&
+- !list_empty(&cinode->openFileList)) {
++ /* check list empty since can race with kill_sb calling tree disconnect */
++ if (!oplock_break_cancelled && !list_empty(&cinode->openFileList)) {
+ spin_unlock(&cinode->open_file_lock);
+- rc = tcon->ses->server->ops->oplock_response(tcon, persistent_fid,
+- volatile_fid, net_fid, cinode);
++ rc = server->ops->oplock_response(tcon, persistent_fid,
++ volatile_fid, net_fid, cinode);
+ cifs_dbg(FYI, "Oplock release rc = %d\n", rc);
+ } else
+ spin_unlock(&cinode->open_file_lock);
+
++ cifs_put_tlink(tlink);
++out:
+ cifs_done_oplock_break(cinode);
+ }
+
+diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
+index 4946a0c596009..67e16c2ac90e6 100644
+--- a/fs/smb/client/fs_context.c
++++ b/fs/smb/client/fs_context.c
+@@ -231,6 +231,8 @@ cifs_parse_security_flavors(struct fs_context *fc, char *value, struct smb3_fs_c
+ break;
+ case Opt_sec_none:
+ ctx->nullauth = 1;
++ kfree(ctx->username);
++ ctx->username = NULL;
+ break;
+ default:
+ cifs_errorf(fc, "bad security option: %s\n", value);
+@@ -1201,6 +1203,8 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
+ case Opt_user:
+ kfree(ctx->username);
+ ctx->username = NULL;
++ if (ctx->nullauth)
++ break;
+ if (strlen(param->string) == 0) {
+ /* null user, ie. anonymous authentication */
+ ctx->nullauth = 1;
+diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
+index 17fe212ab895d..e04766fe6f803 100644
+--- a/fs/smb/client/smb2pdu.c
++++ b/fs/smb/client/smb2pdu.c
+@@ -3797,6 +3797,12 @@ void smb2_reconnect_server(struct work_struct *work)
+
+ spin_lock(&cifs_tcp_ses_lock);
+ list_for_each_entry(ses, &pserver->smb_ses_list, smb_ses_list) {
++ spin_lock(&ses->ses_lock);
++ if (ses->ses_status == SES_EXITING) {
++ spin_unlock(&ses->ses_lock);
++ continue;
++ }
++ spin_unlock(&ses->ses_lock);
+
+ tcon_selected = false;
+
+diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
+index 571885d32907b..70ae6c290bdc3 100644
+--- a/include/drm/drm_edid.h
++++ b/include/drm/drm_edid.h
+@@ -61,15 +61,9 @@ struct std_timing {
+ u8 vfreq_aspect;
+ } __attribute__((packed));
+
+-#define DRM_EDID_PT_SYNC_MASK (3 << 3)
+-# define DRM_EDID_PT_ANALOG_CSYNC (0 << 3)
+-# define DRM_EDID_PT_BIPOLAR_ANALOG_CSYNC (1 << 3)
+-# define DRM_EDID_PT_DIGITAL_CSYNC (2 << 3)
+-# define DRM_EDID_PT_CSYNC_ON_RGB (1 << 1) /* analog csync only */
+-# define DRM_EDID_PT_CSYNC_SERRATE (1 << 2)
+-# define DRM_EDID_PT_DIGITAL_SEPARATE_SYNC (3 << 3)
+-# define DRM_EDID_PT_HSYNC_POSITIVE (1 << 1) /* also digital csync */
+-# define DRM_EDID_PT_VSYNC_POSITIVE (1 << 2)
++#define DRM_EDID_PT_HSYNC_POSITIVE (1 << 1)
++#define DRM_EDID_PT_VSYNC_POSITIVE (1 << 2)
++#define DRM_EDID_PT_SEPARATE_SYNC (3 << 3)
+ #define DRM_EDID_PT_STEREO (1 << 5)
+ #define DRM_EDID_PT_INTERLACED (1 << 7)
+
+diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h
+index 2c8860e406bd8..0417360a6db9b 100644
+--- a/include/linux/iopoll.h
++++ b/include/linux/iopoll.h
+@@ -53,6 +53,7 @@
+ } \
+ if (__sleep_us) \
+ usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
++ cpu_relax(); \
+ } \
+ (cond) ? 0 : -ETIMEDOUT; \
+ })
+@@ -95,6 +96,7 @@
+ } \
+ if (__delay_us) \
+ udelay(__delay_us); \
++ cpu_relax(); \
+ } \
+ (cond) ? 0 : -ETIMEDOUT; \
+ })
+diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
+index bdf8de2cdd935..7b4dd69555e49 100644
+--- a/include/linux/virtio_net.h
++++ b/include/linux/virtio_net.h
+@@ -155,6 +155,10 @@ retry:
+ if (gso_type & SKB_GSO_UDP)
+ nh_off -= thlen;
+
++ /* Kernel has a special handling for GSO_BY_FRAGS. */
++ if (gso_size == GSO_BY_FRAGS)
++ return -EINVAL;
++
+ /* Too small packets are not really GSO ones. */
+ if (skb->len - nh_off > gso_size) {
+ shinfo->gso_size = gso_size;
+diff --git a/include/media/v4l2-mem2mem.h b/include/media/v4l2-mem2mem.h
+index bb9de6a899e07..d6c8eb2b52019 100644
+--- a/include/media/v4l2-mem2mem.h
++++ b/include/media/v4l2-mem2mem.h
+@@ -593,7 +593,14 @@ void v4l2_m2m_buf_queue(struct v4l2_m2m_ctx *m2m_ctx,
+ static inline
+ unsigned int v4l2_m2m_num_src_bufs_ready(struct v4l2_m2m_ctx *m2m_ctx)
+ {
+- return m2m_ctx->out_q_ctx.num_rdy;
++ unsigned int num_buf_rdy;
++ unsigned long flags;
++
++ spin_lock_irqsave(&m2m_ctx->out_q_ctx.rdy_spinlock, flags);
++ num_buf_rdy = m2m_ctx->out_q_ctx.num_rdy;
++ spin_unlock_irqrestore(&m2m_ctx->out_q_ctx.rdy_spinlock, flags);
++
++ return num_buf_rdy;
+ }
+
+ /**
+@@ -605,7 +612,14 @@ unsigned int v4l2_m2m_num_src_bufs_ready(struct v4l2_m2m_ctx *m2m_ctx)
+ static inline
+ unsigned int v4l2_m2m_num_dst_bufs_ready(struct v4l2_m2m_ctx *m2m_ctx)
+ {
+- return m2m_ctx->cap_q_ctx.num_rdy;
++ unsigned int num_buf_rdy;
++ unsigned long flags;
++
++ spin_lock_irqsave(&m2m_ctx->cap_q_ctx.rdy_spinlock, flags);
++ num_buf_rdy = m2m_ctx->cap_q_ctx.num_rdy;
++ spin_unlock_irqrestore(&m2m_ctx->cap_q_ctx.rdy_spinlock, flags);
++
++ return num_buf_rdy;
+ }
+
+ /**
+diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h
+index 9eef199728454..024ad8ddb27e5 100644
+--- a/include/net/mana/mana.h
++++ b/include/net/mana/mana.h
+@@ -579,7 +579,7 @@ struct mana_fence_rq_resp {
+ }; /* HW DATA */
+
+ /* Configure vPort Rx Steering */
+-struct mana_cfg_rx_steer_req {
++struct mana_cfg_rx_steer_req_v2 {
+ struct gdma_req_hdr hdr;
+ mana_handle_t vport;
+ u16 num_indir_entries;
+@@ -592,6 +592,8 @@ struct mana_cfg_rx_steer_req {
+ u8 reserved;
+ mana_handle_t default_rxobj;
+ u8 hashkey[MANA_HASH_KEY_SIZE];
++ u8 cqe_coalescing_enable;
++ u8 reserved2[7];
+ }; /* HW DATA */
+
+ struct mana_cfg_rx_steer_resp {
+diff --git a/include/net/sock.h b/include/net/sock.h
+index ad468fe71413a..415f3840a26aa 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -1421,6 +1421,12 @@ static inline bool sk_has_memory_pressure(const struct sock *sk)
+ return sk->sk_prot->memory_pressure != NULL;
+ }
+
++static inline bool sk_under_global_memory_pressure(const struct sock *sk)
++{
++ return sk->sk_prot->memory_pressure &&
++ !!*sk->sk_prot->memory_pressure;
++}
++
+ static inline bool sk_under_memory_pressure(const struct sock *sk)
+ {
+ if (!sk->sk_prot->memory_pressure)
+diff --git a/include/net/xfrm.h b/include/net/xfrm.h
+index 151ca95dd08db..363c7d5105542 100644
+--- a/include/net/xfrm.h
++++ b/include/net/xfrm.h
+@@ -1984,6 +1984,7 @@ static inline void xfrm_dev_state_free(struct xfrm_state *x)
+ if (dev->xfrmdev_ops->xdo_dev_state_free)
+ dev->xfrmdev_ops->xdo_dev_state_free(x);
+ xso->dev = NULL;
++ xso->type = XFRM_DEV_OFFLOAD_UNSPECIFIED;
+ netdev_put(dev, &xso->dev_tracker);
+ }
+ }
+diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
+index b4526668072e7..27596f3b4aef3 100644
+--- a/kernel/dma/remap.c
++++ b/kernel/dma/remap.c
+@@ -43,13 +43,13 @@ void *dma_common_contiguous_remap(struct page *page, size_t size,
+ void *vaddr;
+ int i;
+
+- pages = kmalloc_array(count, sizeof(struct page *), GFP_KERNEL);
++ pages = kvmalloc_array(count, sizeof(struct page *), GFP_KERNEL);
+ if (!pages)
+ return NULL;
+ for (i = 0; i < count; i++)
+ pages[i] = nth_page(page, i);
+ vaddr = vmap(pages, count, VM_DMA_COHERENT, prot);
+- kfree(pages);
++ kvfree(pages);
+
+ return vaddr;
+ }
+diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
+index 99634b29a8b82..46b4a3c7c3bf5 100644
+--- a/kernel/trace/ring_buffer.c
++++ b/kernel/trace/ring_buffer.c
+@@ -538,6 +538,7 @@ struct trace_buffer {
+ unsigned flags;
+ int cpus;
+ atomic_t record_disabled;
++ atomic_t resizing;
+ cpumask_var_t cpumask;
+
+ struct lock_class_key *reader_lock_key;
+@@ -2166,7 +2167,7 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
+
+ /* prevent another thread from changing buffer sizes */
+ mutex_lock(&buffer->mutex);
+-
++ atomic_inc(&buffer->resizing);
+
+ if (cpu_id == RING_BUFFER_ALL_CPUS) {
+ /*
+@@ -2321,6 +2322,7 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
+ atomic_dec(&buffer->record_disabled);
+ }
+
++ atomic_dec(&buffer->resizing);
+ mutex_unlock(&buffer->mutex);
+ return 0;
+
+@@ -2341,6 +2343,7 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
+ }
+ }
+ out_err_unlock:
++ atomic_dec(&buffer->resizing);
+ mutex_unlock(&buffer->mutex);
+ return err;
+ }
+@@ -5543,6 +5546,15 @@ int ring_buffer_swap_cpu(struct trace_buffer *buffer_a,
+ if (local_read(&cpu_buffer_b->committing))
+ goto out_dec;
+
++ /*
++ * When resize is in progress, we cannot swap it because
++ * it will mess the state of the cpu buffer.
++ */
++ if (atomic_read(&buffer_a->resizing))
++ goto out_dec;
++ if (atomic_read(&buffer_b->resizing))
++ goto out_dec;
++
+ buffer_a->buffers[cpu] = cpu_buffer_b;
+ buffer_b->buffers[cpu] = cpu_buffer_a;
+
+diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
+index c80ff6f5b2cc1..fd051f85efd4b 100644
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -1928,9 +1928,10 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
+ * place on this CPU. We fail to record, but we reset
+ * the max trace buffer (no one writes directly to it)
+ * and flag that it failed.
++ * Another reason is resize is in progress.
+ */
+ trace_array_printk_buf(tr->max_buffer.buffer, _THIS_IP_,
+- "Failed to swap buffers due to commit in progress\n");
++ "Failed to swap buffers due to commit or resize in progress\n");
+ }
+
+ WARN_ON_ONCE(ret && ret != -EAGAIN && ret != -EBUSY);
+diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
+index c5e8798e297ca..17ca13e8c044c 100644
+--- a/net/bluetooth/l2cap_core.c
++++ b/net/bluetooth/l2cap_core.c
+@@ -6374,9 +6374,14 @@ static inline int l2cap_le_command_rej(struct l2cap_conn *conn,
+ if (!chan)
+ goto done;
+
++ chan = l2cap_chan_hold_unless_zero(chan);
++ if (!chan)
++ goto done;
++
+ l2cap_chan_lock(chan);
+ l2cap_chan_del(chan, ECONNREFUSED);
+ l2cap_chan_unlock(chan);
++ l2cap_chan_put(chan);
+
+ done:
+ mutex_unlock(&conn->chan_lock);
+diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
+index 1e07d0f289723..d4498037fadc6 100644
+--- a/net/bluetooth/mgmt.c
++++ b/net/bluetooth/mgmt.c
+@@ -7285,7 +7285,7 @@ static void get_conn_info_complete(struct hci_dev *hdev, void *data, int err)
+
+ bt_dev_dbg(hdev, "err %d", err);
+
+- memcpy(&rp.addr, &cp->addr.bdaddr, sizeof(rp.addr));
++ memcpy(&rp.addr, &cp->addr, sizeof(rp.addr));
+
+ status = mgmt_status(err);
+ if (status == MGMT_STATUS_SUCCESS) {
+diff --git a/net/core/sock.c b/net/core/sock.c
+index 1f31a97100d4f..8451a95266bf0 100644
+--- a/net/core/sock.c
++++ b/net/core/sock.c
+@@ -3107,7 +3107,7 @@ void __sk_mem_reduce_allocated(struct sock *sk, int amount)
+ if (mem_cgroup_sockets_enabled && sk->sk_memcg)
+ mem_cgroup_uncharge_skmem(sk->sk_memcg, amount);
+
+- if (sk_under_memory_pressure(sk) &&
++ if (sk_under_global_memory_pressure(sk) &&
+ (sk_memory_allocated(sk) < sk_prot_mem_limits(sk, 0)))
+ sk_leave_memory_pressure(sk);
+ }
+diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
+index 53bfd8af69203..d1e7d0ceb7edd 100644
+--- a/net/ipv4/ip_vti.c
++++ b/net/ipv4/ip_vti.c
+@@ -287,12 +287,12 @@ static netdev_tx_t vti_tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
+
+ switch (skb->protocol) {
+ case htons(ETH_P_IP):
+- xfrm_decode_session(skb, &fl, AF_INET);
+ memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
++ xfrm_decode_session(skb, &fl, AF_INET);
+ break;
+ case htons(ETH_P_IPV6):
+- xfrm_decode_session(skb, &fl, AF_INET6);
+ memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
++ xfrm_decode_session(skb, &fl, AF_INET6);
+ break;
+ default:
+ goto tx_err;
+diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
+index 39eb947fe3920..366c3c25ebe20 100644
+--- a/net/ipv4/tcp_timer.c
++++ b/net/ipv4/tcp_timer.c
+@@ -586,7 +586,9 @@ out_reset_timer:
+ tcp_stream_is_thin(tp) &&
+ icsk->icsk_retransmits <= TCP_THIN_LINEAR_RETRIES) {
+ icsk->icsk_backoff = 0;
+- icsk->icsk_rto = min(__tcp_set_rto(tp), TCP_RTO_MAX);
++ icsk->icsk_rto = clamp(__tcp_set_rto(tp),
++ tcp_rto_min(sk),
++ TCP_RTO_MAX);
+ } else {
+ /* Use normal (exponential) backoff */
+ icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
+diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
+index 10b222865d46a..73c85d4e0e9cd 100644
+--- a/net/ipv6/ip6_vti.c
++++ b/net/ipv6/ip6_vti.c
+@@ -568,12 +568,12 @@ vti6_tnl_xmit(struct sk_buff *skb, struct net_device *dev)
+ vti6_addr_conflict(t, ipv6_hdr(skb)))
+ goto tx_err;
+
+- xfrm_decode_session(skb, &fl, AF_INET6);
+ memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
++ xfrm_decode_session(skb, &fl, AF_INET6);
+ break;
+ case htons(ETH_P_IP):
+- xfrm_decode_session(skb, &fl, AF_INET);
+ memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
++ xfrm_decode_session(skb, &fl, AF_INET);
+ break;
+ default:
+ goto tx_err;
+diff --git a/net/key/af_key.c b/net/key/af_key.c
+index 31ab12fd720ae..203131ad0dfe1 100644
+--- a/net/key/af_key.c
++++ b/net/key/af_key.c
+@@ -1848,9 +1848,9 @@ static int pfkey_dump(struct sock *sk, struct sk_buff *skb, const struct sadb_ms
+ if (ext_hdrs[SADB_X_EXT_FILTER - 1]) {
+ struct sadb_x_filter *xfilter = ext_hdrs[SADB_X_EXT_FILTER - 1];
+
+- if ((xfilter->sadb_x_filter_splen >=
++ if ((xfilter->sadb_x_filter_splen >
+ (sizeof(xfrm_address_t) << 3)) ||
+- (xfilter->sadb_x_filter_dplen >=
++ (xfilter->sadb_x_filter_dplen >
+ (sizeof(xfrm_address_t) << 3))) {
+ mutex_unlock(&pfk->dump_lock);
+ return -EINVAL;
+diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
+index 62606fb44d027..4bb0d90eca1cd 100644
+--- a/net/netfilter/ipvs/ip_vs_ctl.c
++++ b/net/netfilter/ipvs/ip_vs_ctl.c
+@@ -1876,6 +1876,7 @@ static int
+ proc_do_sync_threshold(struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+ {
++ struct netns_ipvs *ipvs = table->extra2;
+ int *valp = table->data;
+ int val[2];
+ int rc;
+@@ -1885,6 +1886,7 @@ proc_do_sync_threshold(struct ctl_table *table, int write,
+ .mode = table->mode,
+ };
+
++ mutex_lock(&ipvs->sync_mutex);
+ memcpy(val, valp, sizeof(val));
+ rc = proc_dointvec(&tmp, write, buffer, lenp, ppos);
+ if (write) {
+@@ -1894,6 +1896,7 @@ proc_do_sync_threshold(struct ctl_table *table, int write,
+ else
+ memcpy(valp, val, sizeof(val));
+ }
++ mutex_unlock(&ipvs->sync_mutex);
+ return rc;
+ }
+
+@@ -4321,6 +4324,7 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
+ ipvs->sysctl_sync_threshold[0] = DEFAULT_SYNC_THRESHOLD;
+ ipvs->sysctl_sync_threshold[1] = DEFAULT_SYNC_PERIOD;
+ tbl[idx].data = &ipvs->sysctl_sync_threshold;
++ tbl[idx].extra2 = ipvs;
+ tbl[idx++].maxlen = sizeof(ipvs->sysctl_sync_threshold);
+ ipvs->sysctl_sync_refresh_period = DEFAULT_SYNC_REFRESH_PERIOD;
+ tbl[idx++].data = &ipvs->sysctl_sync_refresh_period;
+diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c
+index 91eacc9b0b987..b6bcc8f2f46b7 100644
+--- a/net/netfilter/nf_conntrack_proto_sctp.c
++++ b/net/netfilter/nf_conntrack_proto_sctp.c
+@@ -49,8 +49,8 @@ static const unsigned int sctp_timeouts[SCTP_CONNTRACK_MAX] = {
+ [SCTP_CONNTRACK_COOKIE_WAIT] = 3 SECS,
+ [SCTP_CONNTRACK_COOKIE_ECHOED] = 3 SECS,
+ [SCTP_CONNTRACK_ESTABLISHED] = 210 SECS,
+- [SCTP_CONNTRACK_SHUTDOWN_SENT] = 300 SECS / 1000,
+- [SCTP_CONNTRACK_SHUTDOWN_RECD] = 300 SECS / 1000,
++ [SCTP_CONNTRACK_SHUTDOWN_SENT] = 3 SECS,
++ [SCTP_CONNTRACK_SHUTDOWN_RECD] = 3 SECS,
+ [SCTP_CONNTRACK_SHUTDOWN_ACK_SENT] = 3 SECS,
+ [SCTP_CONNTRACK_HEARTBEAT_SENT] = 30 SECS,
+ };
+@@ -105,7 +105,7 @@ static const u8 sctp_conntracks[2][11][SCTP_CONNTRACK_MAX] = {
+ {
+ /* ORIGINAL */
+ /* sNO, sCL, sCW, sCE, sES, sSS, sSR, sSA, sHS */
+-/* init */ {sCL, sCL, sCW, sCE, sES, sSS, sSR, sSA, sCW},
++/* init */ {sCL, sCL, sCW, sCE, sES, sCL, sCL, sSA, sCW},
+ /* init_ack */ {sCL, sCL, sCW, sCE, sES, sSS, sSR, sSA, sCL},
+ /* abort */ {sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL, sCL},
+ /* shutdown */ {sCL, sCL, sCW, sCE, sSS, sSS, sSR, sSA, sCL},
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index c6de10f458fa4..b280b151a9e98 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -7088,6 +7088,7 @@ static int nft_set_catchall_flush(const struct nft_ctx *ctx,
+ ret = __nft_set_catchall_flush(ctx, set, &elem);
+ if (ret < 0)
+ break;
++ nft_set_elem_change_active(ctx->net, set, ext);
+ }
+
+ return ret;
+@@ -9494,9 +9495,14 @@ struct nft_trans_gc *nft_trans_gc_alloc(struct nft_set *set,
+ if (!trans)
+ return NULL;
+
++ trans->net = maybe_get_net(net);
++ if (!trans->net) {
++ kfree(trans);
++ return NULL;
++ }
++
+ refcount_inc(&set->refs);
+ trans->set = set;
+- trans->net = get_net(net);
+ trans->seq = gc_seq;
+
+ return trans;
+@@ -9752,6 +9758,22 @@ static void nft_set_commit_update(struct list_head *set_update_list)
+ }
+ }
+
++static unsigned int nft_gc_seq_begin(struct nftables_pernet *nft_net)
++{
++ unsigned int gc_seq;
++
++ /* Bump gc counter, it becomes odd, this is the busy mark. */
++ gc_seq = READ_ONCE(nft_net->gc_seq);
++ WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
++
++ return gc_seq;
++}
++
++static void nft_gc_seq_end(struct nftables_pernet *nft_net, unsigned int gc_seq)
++{
++ WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
++}
++
+ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+@@ -9837,9 +9859,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+
+ WRITE_ONCE(nft_net->base_seq, base_seq);
+
+- /* Bump gc counter, it becomes odd, this is the busy mark. */
+- gc_seq = READ_ONCE(nft_net->gc_seq);
+- WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
++ gc_seq = nft_gc_seq_begin(nft_net);
+
+ /* step 3. Start new generation, rules_gen_X now in use. */
+ net->nft.gencursor = nft_gencursor_next(net);
+@@ -10049,7 +10069,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ nf_tables_gen_notify(net, skb, NFT_MSG_NEWGEN);
+ nf_tables_commit_audit_log(&adl, nft_net->base_seq);
+
+- WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
++ nft_gc_seq_end(nft_net, gc_seq);
+ nf_tables_commit_release(net);
+
+ return 0;
+@@ -11050,6 +11070,7 @@ static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event,
+ struct net *net = n->net;
+ unsigned int deleted;
+ bool restart = false;
++ unsigned int gc_seq;
+
+ if (event != NETLINK_URELEASE || n->protocol != NETLINK_NETFILTER)
+ return NOTIFY_DONE;
+@@ -11057,6 +11078,9 @@ static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event,
+ nft_net = nft_pernet(net);
+ deleted = 0;
+ mutex_lock(&nft_net->commit_mutex);
++
++ gc_seq = nft_gc_seq_begin(nft_net);
++
+ if (!list_empty(&nf_tables_destroy_list))
+ rcu_barrier();
+ again:
+@@ -11079,6 +11103,8 @@ again:
+ if (restart)
+ goto again;
+ }
++ nft_gc_seq_end(nft_net, gc_seq);
++
+ mutex_unlock(&nft_net->commit_mutex);
+
+ return NOTIFY_DONE;
+@@ -11116,12 +11142,20 @@ static void __net_exit nf_tables_pre_exit_net(struct net *net)
+ static void __net_exit nf_tables_exit_net(struct net *net)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
++ unsigned int gc_seq;
+
+ mutex_lock(&nft_net->commit_mutex);
++
++ gc_seq = nft_gc_seq_begin(nft_net);
++
+ if (!list_empty(&nft_net->commit_list) ||
+ !list_empty(&nft_net->module_list))
+ __nf_tables_abort(net, NFNL_ABORT_NONE);
++
+ __nft_release_tables(net);
++
++ nft_gc_seq_end(nft_net, gc_seq);
++
+ mutex_unlock(&nft_net->commit_mutex);
+ WARN_ON_ONCE(!list_empty(&nft_net->tables));
+ WARN_ON_ONCE(!list_empty(&nft_net->module_list));
+diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c
+index bd19c7aec92ee..c98a273c3006d 100644
+--- a/net/netfilter/nft_dynset.c
++++ b/net/netfilter/nft_dynset.c
+@@ -191,6 +191,9 @@ static int nft_dynset_init(const struct nft_ctx *ctx,
+ if (IS_ERR(set))
+ return PTR_ERR(set);
+
++ if (set->flags & NFT_SET_OBJECT)
++ return -EOPNOTSUPP;
++
+ if (set->ops->update == NULL)
+ return -EOPNOTSUPP;
+
+diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
+index 92b108e3000eb..352180b123fc7 100644
+--- a/net/netfilter/nft_set_pipapo.c
++++ b/net/netfilter/nft_set_pipapo.c
+@@ -566,6 +566,8 @@ next_match:
+ goto out;
+
+ if (last) {
++ if (nft_set_elem_expired(&f->mt[b].e->ext))
++ goto next_match;
+ if ((genmask &&
+ !nft_set_elem_active(&f->mt[b].e->ext, genmask)))
+ goto next_match;
+@@ -600,17 +602,8 @@ out:
+ static void *nft_pipapo_get(const struct net *net, const struct nft_set *set,
+ const struct nft_set_elem *elem, unsigned int flags)
+ {
+- struct nft_pipapo_elem *ret;
+-
+- ret = pipapo_get(net, set, (const u8 *)elem->key.val.data,
++ return pipapo_get(net, set, (const u8 *)elem->key.val.data,
+ nft_genmask_cur(net));
+- if (IS_ERR(ret))
+- return ret;
+-
+- if (nft_set_elem_expired(&ret->ext))
+- return ERR_PTR(-ENOENT);
+-
+- return ret;
+ }
+
+ /**
+@@ -1698,6 +1691,17 @@ static void nft_pipapo_commit(const struct nft_set *set)
+ priv->clone = new_clone;
+ }
+
++static bool nft_pipapo_transaction_mutex_held(const struct nft_set *set)
++{
++#ifdef CONFIG_PROVE_LOCKING
++ const struct net *net = read_pnet(&set->net);
++
++ return lockdep_is_held(&nft_pernet(net)->commit_mutex);
++#else
++ return true;
++#endif
++}
++
+ static void nft_pipapo_abort(const struct nft_set *set)
+ {
+ struct nft_pipapo *priv = nft_set_priv(set);
+@@ -1706,7 +1710,7 @@ static void nft_pipapo_abort(const struct nft_set *set)
+ if (!priv->dirty)
+ return;
+
+- m = rcu_dereference(priv->match);
++ m = rcu_dereference_protected(priv->match, nft_pipapo_transaction_mutex_held(set));
+
+ new_clone = pipapo_clone(m);
+ if (IS_ERR(new_clone))
+@@ -1733,11 +1737,7 @@ static void nft_pipapo_activate(const struct net *net,
+ const struct nft_set *set,
+ const struct nft_set_elem *elem)
+ {
+- struct nft_pipapo_elem *e;
+-
+- e = pipapo_get(net, set, (const u8 *)elem->key.val.data, 0);
+- if (IS_ERR(e))
+- return;
++ struct nft_pipapo_elem *e = elem->priv;
+
+ nft_set_elem_change_active(net, set, &e->ext);
+ }
+@@ -1951,10 +1951,6 @@ static void nft_pipapo_remove(const struct net *net, const struct nft_set *set,
+
+ data = (const u8 *)nft_set_ext_key(&e->ext);
+
+- e = pipapo_get(net, set, data, 0);
+- if (IS_ERR(e))
+- return;
+-
+ while ((rules_f0 = pipapo_rules_same_key(m->f, first_rule))) {
+ union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS];
+ const u8 *match_start, *match_end;
+diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
+index a6d2a0b1aa21e..3d7a91e64c88f 100644
+--- a/net/openvswitch/datapath.c
++++ b/net/openvswitch/datapath.c
+@@ -1829,7 +1829,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
+ parms.port_no = OVSP_LOCAL;
+ parms.upcall_portids = a[OVS_DP_ATTR_UPCALL_PID];
+ parms.desired_ifindex = a[OVS_DP_ATTR_IFINDEX]
+- ? nla_get_u32(a[OVS_DP_ATTR_IFINDEX]) : 0;
++ ? nla_get_s32(a[OVS_DP_ATTR_IFINDEX]) : 0;
+
+ /* So far only local changes have been made, now need the lock. */
+ ovs_lock();
+@@ -2049,7 +2049,7 @@ static const struct nla_policy datapath_policy[OVS_DP_ATTR_MAX + 1] = {
+ [OVS_DP_ATTR_USER_FEATURES] = { .type = NLA_U32 },
+ [OVS_DP_ATTR_MASKS_CACHE_SIZE] = NLA_POLICY_RANGE(NLA_U32, 0,
+ PCPU_MIN_UNIT_SIZE / sizeof(struct mask_cache_entry)),
+- [OVS_DP_ATTR_IFINDEX] = {.type = NLA_U32 },
++ [OVS_DP_ATTR_IFINDEX] = NLA_POLICY_MIN(NLA_S32, 0),
+ };
+
+ static const struct genl_small_ops dp_datapath_genl_ops[] = {
+@@ -2302,7 +2302,7 @@ restart:
+ parms.port_no = port_no;
+ parms.upcall_portids = a[OVS_VPORT_ATTR_UPCALL_PID];
+ parms.desired_ifindex = a[OVS_VPORT_ATTR_IFINDEX]
+- ? nla_get_u32(a[OVS_VPORT_ATTR_IFINDEX]) : 0;
++ ? nla_get_s32(a[OVS_VPORT_ATTR_IFINDEX]) : 0;
+
+ vport = new_vport(&parms);
+ err = PTR_ERR(vport);
+@@ -2539,7 +2539,7 @@ static const struct nla_policy vport_policy[OVS_VPORT_ATTR_MAX + 1] = {
+ [OVS_VPORT_ATTR_TYPE] = { .type = NLA_U32 },
+ [OVS_VPORT_ATTR_UPCALL_PID] = { .type = NLA_UNSPEC },
+ [OVS_VPORT_ATTR_OPTIONS] = { .type = NLA_NESTED },
+- [OVS_VPORT_ATTR_IFINDEX] = { .type = NLA_U32 },
++ [OVS_VPORT_ATTR_IFINDEX] = NLA_POLICY_MIN(NLA_S32, 0),
+ [OVS_VPORT_ATTR_NETNSID] = { .type = NLA_S32 },
+ [OVS_VPORT_ATTR_UPCALL_STATS] = { .type = NLA_NESTED },
+ };
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index 10615878e3961..714bd87f12d91 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -2291,6 +2291,7 @@ static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
+
+ if (false) {
+ alloc_skb:
++ spin_unlock(&other->sk_receive_queue.lock);
+ unix_state_unlock(other);
+ mutex_unlock(&unix_sk(other)->iolock);
+ newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT,
+@@ -2330,6 +2331,7 @@ alloc_skb:
+ init_scm = false;
+ }
+
++ spin_lock(&other->sk_receive_queue.lock);
+ skb = skb_peek_tail(&other->sk_receive_queue);
+ if (tail && tail == skb) {
+ skb = newskb;
+@@ -2360,14 +2362,11 @@ alloc_skb:
+ refcount_add(size, &sk->sk_wmem_alloc);
+
+ if (newskb) {
+- err = unix_scm_to_skb(&scm, skb, false);
+- if (err)
+- goto err_state_unlock;
+- spin_lock(&other->sk_receive_queue.lock);
++ unix_scm_to_skb(&scm, skb, false);
+ __skb_queue_tail(&other->sk_receive_queue, newskb);
+- spin_unlock(&other->sk_receive_queue.lock);
+ }
+
++ spin_unlock(&other->sk_receive_queue.lock);
+ unix_state_unlock(other);
+ mutex_unlock(&unix_sk(other)->iolock);
+
+diff --git a/net/xfrm/xfrm_compat.c b/net/xfrm/xfrm_compat.c
+index 8cbf45a8bcdc2..655fe4ff86212 100644
+--- a/net/xfrm/xfrm_compat.c
++++ b/net/xfrm/xfrm_compat.c
+@@ -108,7 +108,7 @@ static const struct nla_policy compat_policy[XFRMA_MAX+1] = {
+ [XFRMA_ALG_COMP] = { .len = sizeof(struct xfrm_algo) },
+ [XFRMA_ENCAP] = { .len = sizeof(struct xfrm_encap_tmpl) },
+ [XFRMA_TMPL] = { .len = sizeof(struct xfrm_user_tmpl) },
+- [XFRMA_SEC_CTX] = { .len = sizeof(struct xfrm_sec_ctx) },
++ [XFRMA_SEC_CTX] = { .len = sizeof(struct xfrm_user_sec_ctx) },
+ [XFRMA_LTIME_VAL] = { .len = sizeof(struct xfrm_lifetime_cur) },
+ [XFRMA_REPLAY_VAL] = { .len = sizeof(struct xfrm_replay_state) },
+ [XFRMA_REPLAY_THRESH] = { .type = NLA_U32 },
+diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
+index 815b380804011..d5ee96789d4bf 100644
+--- a/net/xfrm/xfrm_input.c
++++ b/net/xfrm/xfrm_input.c
+@@ -180,6 +180,8 @@ static int xfrm4_remove_beet_encap(struct xfrm_state *x, struct sk_buff *skb)
+ int optlen = 0;
+ int err = -EINVAL;
+
++ skb->protocol = htons(ETH_P_IP);
++
+ if (unlikely(XFRM_MODE_SKB_CB(skb)->protocol == IPPROTO_BEETPH)) {
+ struct ip_beet_phdr *ph;
+ int phlen;
+@@ -232,6 +234,8 @@ static int xfrm4_remove_tunnel_encap(struct xfrm_state *x, struct sk_buff *skb)
+ {
+ int err = -EINVAL;
+
++ skb->protocol = htons(ETH_P_IP);
++
+ if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+ goto out;
+
+@@ -267,6 +271,8 @@ static int xfrm6_remove_tunnel_encap(struct xfrm_state *x, struct sk_buff *skb)
+ {
+ int err = -EINVAL;
+
++ skb->protocol = htons(ETH_P_IPV6);
++
+ if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
+ goto out;
+
+@@ -296,6 +302,8 @@ static int xfrm6_remove_beet_encap(struct xfrm_state *x, struct sk_buff *skb)
+ int size = sizeof(struct ipv6hdr);
+ int err;
+
++ skb->protocol = htons(ETH_P_IPV6);
++
+ err = skb_cow_head(skb, size + skb->mac_len);
+ if (err)
+ goto out;
+@@ -346,6 +354,7 @@ xfrm_inner_mode_encap_remove(struct xfrm_state *x,
+ return xfrm6_remove_tunnel_encap(x, skb);
+ break;
+ }
++ return -EINVAL;
+ }
+
+ WARN_ON_ONCE(1);
+@@ -366,19 +375,6 @@ static int xfrm_prepare_input(struct xfrm_state *x, struct sk_buff *skb)
+ return -EAFNOSUPPORT;
+ }
+
+- switch (XFRM_MODE_SKB_CB(skb)->protocol) {
+- case IPPROTO_IPIP:
+- case IPPROTO_BEETPH:
+- skb->protocol = htons(ETH_P_IP);
+- break;
+- case IPPROTO_IPV6:
+- skb->protocol = htons(ETH_P_IPV6);
+- break;
+- default:
+- WARN_ON_ONCE(1);
+- break;
+- }
+-
+ return xfrm_inner_mode_encap_remove(x, skb);
+ }
+
+diff --git a/net/xfrm/xfrm_interface_core.c b/net/xfrm/xfrm_interface_core.c
+index a3319965470a7..b864740846902 100644
+--- a/net/xfrm/xfrm_interface_core.c
++++ b/net/xfrm/xfrm_interface_core.c
+@@ -537,8 +537,8 @@ static netdev_tx_t xfrmi_xmit(struct sk_buff *skb, struct net_device *dev)
+
+ switch (skb->protocol) {
+ case htons(ETH_P_IPV6):
+- xfrm_decode_session(skb, &fl, AF_INET6);
+ memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
++ xfrm_decode_session(skb, &fl, AF_INET6);
+ if (!dst) {
+ fl.u.ip6.flowi6_oif = dev->ifindex;
+ fl.u.ip6.flowi6_flags |= FLOWI_FLAG_ANYSRC;
+@@ -552,8 +552,8 @@ static netdev_tx_t xfrmi_xmit(struct sk_buff *skb, struct net_device *dev)
+ }
+ break;
+ case htons(ETH_P_IP):
+- xfrm_decode_session(skb, &fl, AF_INET);
+ memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
++ xfrm_decode_session(skb, &fl, AF_INET);
+ if (!dst) {
+ struct rtable *rt;
+
+diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
+index 49e63eea841dd..bda5327bf34df 100644
+--- a/net/xfrm/xfrm_state.c
++++ b/net/xfrm/xfrm_state.c
+@@ -1324,12 +1324,8 @@ found:
+ struct xfrm_dev_offload *xso = &x->xso;
+
+ if (xso->type == XFRM_DEV_OFFLOAD_PACKET) {
+- xso->dev->xfrmdev_ops->xdo_dev_state_delete(x);
+- xso->dir = 0;
+- netdev_put(xso->dev, &xso->dev_tracker);
+- xso->dev = NULL;
+- xso->real_dev = NULL;
+- xso->type = XFRM_DEV_OFFLOAD_UNSPECIFIED;
++ xfrm_dev_state_delete(x);
++ xfrm_dev_state_free(x);
+ }
+ #endif
+ x->km.state = XFRM_STATE_DEAD;
+diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
+index c34a2a06ca940..ad01997c3aa9d 100644
+--- a/net/xfrm/xfrm_user.c
++++ b/net/xfrm/xfrm_user.c
+@@ -628,7 +628,7 @@ static void xfrm_update_ae_params(struct xfrm_state *x, struct nlattr **attrs,
+ struct nlattr *rt = attrs[XFRMA_REPLAY_THRESH];
+ struct nlattr *mt = attrs[XFRMA_MTIMER_THRESH];
+
+- if (re) {
++ if (re && x->replay_esn && x->preplay_esn) {
+ struct xfrm_replay_state_esn *replay_esn;
+ replay_esn = nla_data(re);
+ memcpy(x->replay_esn, replay_esn,
+@@ -1267,6 +1267,15 @@ static int xfrm_dump_sa(struct sk_buff *skb, struct netlink_callback *cb)
+ sizeof(*filter), GFP_KERNEL);
+ if (filter == NULL)
+ return -ENOMEM;
++
++ /* see addr_match(), (prefix length >> 5) << 2
++ * will be used to compare xfrm_address_t
++ */
++ if (filter->splen > (sizeof(xfrm_address_t) << 3) ||
++ filter->dplen > (sizeof(xfrm_address_t) << 3)) {
++ kfree(filter);
++ return -EINVAL;
++ }
+ }
+
+ if (attrs[XFRMA_PROTO])
+@@ -2336,6 +2345,7 @@ static int xfrm_get_policy(struct sk_buff *skb, struct nlmsghdr *nlh,
+ NETLINK_CB(skb).portid);
+ }
+ } else {
++ xfrm_dev_policy_delete(xp);
+ xfrm_audit_policy_delete(xp, err ? 0 : 1, true);
+
+ if (err != 0)
+@@ -3015,7 +3025,7 @@ const struct nla_policy xfrma_policy[XFRMA_MAX+1] = {
+ [XFRMA_ALG_COMP] = { .len = sizeof(struct xfrm_algo) },
+ [XFRMA_ENCAP] = { .len = sizeof(struct xfrm_encap_tmpl) },
+ [XFRMA_TMPL] = { .len = sizeof(struct xfrm_user_tmpl) },
+- [XFRMA_SEC_CTX] = { .len = sizeof(struct xfrm_sec_ctx) },
++ [XFRMA_SEC_CTX] = { .len = sizeof(struct xfrm_user_sec_ctx) },
+ [XFRMA_LTIME_VAL] = { .len = sizeof(struct xfrm_lifetime_cur) },
+ [XFRMA_REPLAY_VAL] = { .len = sizeof(struct xfrm_replay_state) },
+ [XFRMA_REPLAY_THRESH] = { .type = NLA_U32 },
+@@ -3035,6 +3045,7 @@ const struct nla_policy xfrma_policy[XFRMA_MAX+1] = {
+ [XFRMA_SET_MARK] = { .type = NLA_U32 },
+ [XFRMA_SET_MARK_MASK] = { .type = NLA_U32 },
+ [XFRMA_IF_ID] = { .type = NLA_U32 },
++ [XFRMA_MTIMER_THRESH] = { .type = NLA_U32 },
+ };
+ EXPORT_SYMBOL_GPL(xfrma_policy);
+
+diff --git a/rust/macros/vtable.rs b/rust/macros/vtable.rs
+index 34d5e7fb5768a..ee06044fcd4f3 100644
+--- a/rust/macros/vtable.rs
++++ b/rust/macros/vtable.rs
+@@ -74,6 +74,7 @@ pub(crate) fn vtable(_attr: TokenStream, ts: TokenStream) -> TokenStream {
+ const {gen_const_name}: bool = false;",
+ )
+ .unwrap();
++ consts.insert(gen_const_name);
+ }
+ } else {
+ const_items = "const USE_VTABLE_ATTR: () = ();".to_owned();
+diff --git a/sound/hda/hdac_regmap.c b/sound/hda/hdac_regmap.c
+index fe3587547cfec..39610a15bcc98 100644
+--- a/sound/hda/hdac_regmap.c
++++ b/sound/hda/hdac_regmap.c
+@@ -597,10 +597,9 @@ EXPORT_SYMBOL_GPL(snd_hdac_regmap_update_raw_once);
+ */
+ void snd_hdac_regmap_sync(struct hdac_device *codec)
+ {
+- if (codec->regmap) {
+- mutex_lock(&codec->regmap_lock);
++ mutex_lock(&codec->regmap_lock);
++ if (codec->regmap)
+ regcache_sync(codec->regmap);
+- mutex_unlock(&codec->regmap_lock);
+- }
++ mutex_unlock(&codec->regmap_lock);
+ }
+ EXPORT_SYMBOL_GPL(snd_hdac_regmap_sync);
+diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
+index bcd548e247fc8..074aa06aa585c 100644
+--- a/sound/pci/hda/patch_realtek.c
++++ b/sound/pci/hda/patch_realtek.c
+@@ -7081,6 +7081,9 @@ enum {
+ ALC285_FIXUP_SPEAKER2_TO_DAC1,
+ ALC285_FIXUP_ASUS_SPEAKER2_TO_DAC1,
+ ALC285_FIXUP_ASUS_HEADSET_MIC,
++ ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS,
++ ALC285_FIXUP_ASUS_I2C_SPEAKER2_TO_DAC1,
++ ALC285_FIXUP_ASUS_I2C_HEADSET_MIC,
+ ALC280_FIXUP_HP_HEADSET_MIC,
+ ALC221_FIXUP_HP_FRONT_MIC,
+ ALC292_FIXUP_TPT460,
+@@ -8073,6 +8076,31 @@ static const struct hda_fixup alc269_fixups[] = {
+ .chained = true,
+ .chain_id = ALC285_FIXUP_ASUS_SPEAKER2_TO_DAC1
+ },
++ [ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS] = {
++ .type = HDA_FIXUP_PINS,
++ .v.pins = (const struct hda_pintbl[]) {
++ { 0x14, 0x90170120 },
++ { }
++ },
++ .chained = true,
++ .chain_id = ALC285_FIXUP_ASUS_HEADSET_MIC
++ },
++ [ALC285_FIXUP_ASUS_I2C_SPEAKER2_TO_DAC1] = {
++ .type = HDA_FIXUP_FUNC,
++ .v.func = alc285_fixup_speaker2_to_dac1,
++ .chained = true,
++ .chain_id = ALC287_FIXUP_CS35L41_I2C_2
++ },
++ [ALC285_FIXUP_ASUS_I2C_HEADSET_MIC] = {
++ .type = HDA_FIXUP_PINS,
++ .v.pins = (const struct hda_pintbl[]) {
++ { 0x19, 0x03a11050 },
++ { 0x1b, 0x03a11c30 },
++ { }
++ },
++ .chained = true,
++ .chain_id = ALC285_FIXUP_ASUS_I2C_SPEAKER2_TO_DAC1
++ },
+ [ALC256_FIXUP_DELL_INSPIRON_7559_SUBWOOFER] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+@@ -9578,7 +9606,13 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x103c, 0x8b96, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
+ SND_PCI_QUIRK(0x103c, 0x8b97, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
+ SND_PCI_QUIRK(0x103c, 0x8bf0, "HP", ALC236_FIXUP_HP_GPIO_LED),
+- SND_PCI_QUIRK(0x103c, 0x8c26, "HP HP EliteBook 800G11", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8c46, "HP EliteBook 830 G11", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8c47, "HP EliteBook 840 G11", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8c48, "HP EliteBook 860 G11", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8c49, "HP Elite x360 830 2-in-1 G11", ALC245_FIXUP_CS35L41_SPI_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8c70, "HP EliteBook 835 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8c71, "HP EliteBook 845 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
++ SND_PCI_QUIRK(0x103c, 0x8c72, "HP EliteBook 865 G11", ALC287_FIXUP_CS35L41_I2C_2_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300),
+ SND_PCI_QUIRK(0x1043, 0x106d, "Asus K53BE", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
+@@ -9598,10 +9632,13 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1043, 0x1313, "Asus K42JZ", ALC269VB_FIXUP_ASUS_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1043, 0x13b0, "ASUS Z550SA", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1427, "Asus Zenbook UX31E", ALC269VB_FIXUP_ASUS_ZENBOOK),
++ SND_PCI_QUIRK(0x1043, 0x1433, "ASUS GX650P", ALC285_FIXUP_ASUS_I2C_HEADSET_MIC),
++ SND_PCI_QUIRK(0x1043, 0x1463, "Asus GA402X", ALC285_FIXUP_ASUS_I2C_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1473, "ASUS GU604V", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1483, "ASUS GU603V", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1493, "ASUS GV601V", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1517, "Asus Zenbook UX31A", ALC269VB_FIXUP_ASUS_ZENBOOK_UX31A),
++ SND_PCI_QUIRK(0x1043, 0x1573, "ASUS GZ301V", ALC285_FIXUP_ASUS_HEADSET_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1662, "ASUS GV301QH", ALC294_FIXUP_ASUS_DUAL_SPK),
+ SND_PCI_QUIRK(0x1043, 0x1683, "ASUS UM3402YAR", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x1043, 0x16b2, "ASUS GU603", ALC289_FIXUP_ASUS_GA401),
+@@ -9627,7 +9664,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1043, 0x1c23, "Asus X55U", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
+ SND_PCI_QUIRK(0x1043, 0x1c62, "ASUS GU603", ALC289_FIXUP_ASUS_GA401),
+ SND_PCI_QUIRK(0x1043, 0x1c92, "ASUS ROG Strix G15", ALC285_FIXUP_ASUS_G533Z_PINS),
+- SND_PCI_QUIRK(0x1043, 0x1caf, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_HEADSET_MIC),
++ SND_PCI_QUIRK(0x1043, 0x1c9f, "ASUS G614JI", ALC285_FIXUP_ASUS_HEADSET_MIC),
++ SND_PCI_QUIRK(0x1043, 0x1caf, "ASUS G634JYR/JZR", ALC285_FIXUP_ASUS_SPI_REAR_SPEAKERS),
+ SND_PCI_QUIRK(0x1043, 0x1ccd, "ASUS X555UB", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1d1f, "ASUS ROG Strix G17 2023 (G713PV)", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x1043, 0x1d42, "ASUS Zephyrus G14 2022", ALC289_FIXUP_ASUS_GA401),
+@@ -10595,6 +10633,7 @@ static int patch_alc269(struct hda_codec *codec)
+ spec = codec->spec;
+ spec->gen.shared_mic_vref_pin = 0x18;
+ codec->power_save_node = 0;
++ spec->en_3kpull_low = true;
+
+ #ifdef CONFIG_PM
+ codec->patch_ops.suspend = alc269_suspend;
+@@ -10677,14 +10716,16 @@ static int patch_alc269(struct hda_codec *codec)
+ spec->shutup = alc256_shutup;
+ spec->init_hook = alc256_init;
+ spec->gen.mixer_nid = 0; /* ALC256 does not have any loopback mixer path */
+- if (codec->bus->pci->vendor == PCI_VENDOR_ID_AMD)
+- spec->en_3kpull_low = true;
++ if (codec->core.vendor_id == 0x10ec0236 &&
++ codec->bus->pci->vendor != PCI_VENDOR_ID_AMD)
++ spec->en_3kpull_low = false;
+ break;
+ case 0x10ec0257:
+ spec->codec_variant = ALC269_TYPE_ALC257;
+ spec->shutup = alc256_shutup;
+ spec->init_hook = alc256_init;
+ spec->gen.mixer_nid = 0;
++ spec->en_3kpull_low = false;
+ break;
+ case 0x10ec0215:
+ case 0x10ec0245:
+@@ -11316,6 +11357,7 @@ enum {
+ ALC897_FIXUP_HP_HSMIC_VERB,
+ ALC897_FIXUP_LENOVO_HEADSET_MODE,
+ ALC897_FIXUP_HEADSET_MIC_PIN2,
++ ALC897_FIXUP_UNIS_H3C_X500S,
+ };
+
+ static const struct hda_fixup alc662_fixups[] = {
+@@ -11755,6 +11797,13 @@ static const struct hda_fixup alc662_fixups[] = {
+ .chained = true,
+ .chain_id = ALC897_FIXUP_LENOVO_HEADSET_MODE
+ },
++ [ALC897_FIXUP_UNIS_H3C_X500S] = {
++ .type = HDA_FIXUP_VERBS,
++ .v.verbs = (const struct hda_verb[]) {
++ { 0x14, AC_VERB_SET_EAPD_BTLENABLE, 0 },
++ {}
++ },
++ },
+ };
+
+ static const struct snd_pci_quirk alc662_fixup_tbl[] = {
+@@ -11916,6 +11965,7 @@ static const struct hda_model_fixup alc662_fixup_models[] = {
+ {.id = ALC662_FIXUP_USI_HEADSET_MODE, .name = "usi-headset"},
+ {.id = ALC662_FIXUP_LENOVO_MULTI_CODECS, .name = "dual-codecs"},
+ {.id = ALC669_FIXUP_ACER_ASPIRE_ETHOS, .name = "aspire-ethos"},
++ {.id = ALC897_FIXUP_UNIS_H3C_X500S, .name = "unis-h3c-x500s"},
+ {}
+ };
+
+diff --git a/sound/soc/amd/Kconfig b/sound/soc/amd/Kconfig
+index 08e42082f5e96..e724cb3c70b74 100644
+--- a/sound/soc/amd/Kconfig
++++ b/sound/soc/amd/Kconfig
+@@ -81,6 +81,7 @@ config SND_SOC_AMD_VANGOGH_MACH
+ tristate "AMD Vangogh support for NAU8821 CS35L41"
+ select SND_SOC_NAU8821
+ select SND_SOC_CS35L41_SPI
++ select SND_AMD_ACP_CONFIG
+ depends on SND_SOC_AMD_ACP5x && I2C && SPI_MASTER
+ help
+ This option enables machine driver for Vangogh platform
+diff --git a/sound/soc/amd/vangogh/acp5x.h b/sound/soc/amd/vangogh/acp5x.h
+index bd9f1c5684d17..ac1936a8c43ff 100644
+--- a/sound/soc/amd/vangogh/acp5x.h
++++ b/sound/soc/amd/vangogh/acp5x.h
+@@ -147,6 +147,8 @@ static inline void acp_writel(u32 val, void __iomem *base_addr)
+ writel(val, base_addr - ACP5x_PHY_BASE_ADDRESS);
+ }
+
++int snd_amd_acp_find_config(struct pci_dev *pci);
++
+ static inline u64 acp_get_byte_count(struct i2s_stream_instance *rtd,
+ int direction)
+ {
+diff --git a/sound/soc/amd/vangogh/pci-acp5x.c b/sound/soc/amd/vangogh/pci-acp5x.c
+index e0df17c88e8e0..c4634a8a17cdc 100644
+--- a/sound/soc/amd/vangogh/pci-acp5x.c
++++ b/sound/soc/amd/vangogh/pci-acp5x.c
+@@ -125,10 +125,15 @@ static int snd_acp5x_probe(struct pci_dev *pci,
+ {
+ struct acp5x_dev_data *adata;
+ struct platform_device_info pdevinfo[ACP5x_DEVS];
+- unsigned int irqflags;
++ unsigned int irqflags, flag;
+ int ret, i;
+ u32 addr, val;
+
++ /* Return if acp config flag is defined */
++ flag = snd_amd_acp_find_config(pci);
++ if (flag)
++ return -ENODEV;
++
+ irqflags = IRQF_SHARED;
+ if (pci->revision != 0x50)
+ return -ENODEV;
+diff --git a/sound/soc/codecs/cs35l56.c b/sound/soc/codecs/cs35l56.c
+index e0d2b9bb23262..f3fee448d759e 100644
+--- a/sound/soc/codecs/cs35l56.c
++++ b/sound/soc/codecs/cs35l56.c
+@@ -834,12 +834,6 @@ static void cs35l56_dsp_work(struct work_struct *work)
+ if (!cs35l56->init_done)
+ return;
+
+- cs35l56->dsp.part = devm_kasprintf(cs35l56->dev, GFP_KERNEL, "cs35l56%s-%02x",
+- cs35l56->secured ? "s" : "", cs35l56->rev);
+-
+- if (!cs35l56->dsp.part)
+- return;
+-
+ pm_runtime_get_sync(cs35l56->dev);
+
+ /*
+@@ -1505,6 +1499,12 @@ int cs35l56_init(struct cs35l56_private *cs35l56)
+ dev_info(cs35l56->dev, "Cirrus Logic CS35L56%s Rev %02X OTP%d\n",
+ cs35l56->secured ? "s" : "", cs35l56->rev, otpid);
+
++ /* Populate the DSP information with the revision and security state */
++ cs35l56->dsp.part = devm_kasprintf(cs35l56->dev, GFP_KERNEL, "cs35l56%s-%02x",
++ cs35l56->secured ? "s" : "", cs35l56->rev);
++ if (!cs35l56->dsp.part)
++ return -ENOMEM;
++
+ /* Wake source and *_BLOCKED interrupts default to unmasked, so mask them */
+ regmap_write(cs35l56->regmap, CS35L56_IRQ1_MASK_20, 0xffffffff);
+ regmap_update_bits(cs35l56->regmap, CS35L56_IRQ1_MASK_1,
+diff --git a/sound/soc/codecs/max98363.c b/sound/soc/codecs/max98363.c
+index e6b84e222b504..169913ba76dd7 100644
+--- a/sound/soc/codecs/max98363.c
++++ b/sound/soc/codecs/max98363.c
+@@ -191,10 +191,10 @@ static int max98363_io_init(struct sdw_slave *slave)
+ pm_runtime_get_noresume(dev);
+
+ ret = regmap_read(max98363->regmap, MAX98363_R21FF_REV_ID, ®);
+- if (!ret) {
++ if (!ret)
+ dev_info(dev, "Revision ID: %X\n", reg);
+- return ret;
+- }
++ else
++ goto out;
+
+ if (max98363->first_hw_init) {
+ regcache_cache_bypass(max98363->regmap, false);
+@@ -204,10 +204,11 @@ static int max98363_io_init(struct sdw_slave *slave)
+ max98363->first_hw_init = true;
+ max98363->hw_init = true;
+
++out:
+ pm_runtime_mark_last_busy(dev);
+ pm_runtime_put_autosuspend(dev);
+
+- return 0;
++ return ret;
+ }
+
+ #define MAX98363_RATES SNDRV_PCM_RATE_8000_192000
+diff --git a/sound/soc/codecs/rt5665.c b/sound/soc/codecs/rt5665.c
+index 17afaef85c77a..382bdbcf7b59b 100644
+--- a/sound/soc/codecs/rt5665.c
++++ b/sound/soc/codecs/rt5665.c
+@@ -4472,6 +4472,8 @@ static void rt5665_remove(struct snd_soc_component *component)
+ struct rt5665_priv *rt5665 = snd_soc_component_get_drvdata(component);
+
+ regmap_write(rt5665->regmap, RT5665_RESET, 0);
++
++ regulator_bulk_disable(ARRAY_SIZE(rt5665->supplies), rt5665->supplies);
+ }
+
+ #ifdef CONFIG_PM
+diff --git a/sound/soc/intel/boards/sof_sdw.c b/sound/soc/intel/boards/sof_sdw.c
+index 5fa204897a52b..a6d13aae8f720 100644
+--- a/sound/soc/intel/boards/sof_sdw.c
++++ b/sound/soc/intel/boards/sof_sdw.c
+@@ -367,6 +367,16 @@ static const struct dmi_system_id sof_sdw_quirk_table[] = {
+ RT711_JD2),
+ },
+ /* RaptorLake devices */
++ {
++ .callback = sof_sdw_quirk_cb,
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc"),
++ DMI_EXACT_MATCH(DMI_PRODUCT_SKU, "0BDA")
++ },
++ .driver_data = (void *)(SOF_SDW_TGL_HDMI |
++ RT711_JD2 |
++ SOF_SDW_FOUR_SPK),
++ },
+ {
+ .callback = sof_sdw_quirk_cb,
+ .matches = {
+@@ -415,6 +425,31 @@ static const struct dmi_system_id sof_sdw_quirk_table[] = {
+ },
+ .driver_data = (void *)(RT711_JD1),
+ },
++ {
++ .callback = sof_sdw_quirk_cb,
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Intel Corporation"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Meteor Lake Client Platform"),
++ },
++ .driver_data = (void *)(RT711_JD2_100K),
++ },
++ {
++ .callback = sof_sdw_quirk_cb,
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Google"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Rex"),
++ },
++ .driver_data = (void *)(SOF_SDW_PCH_DMIC),
++ },
++ /* LunarLake devices */
++ {
++ .callback = sof_sdw_quirk_cb,
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Intel Corporation"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "Lunar Lake Client Platform"),
++ },
++ .driver_data = (void *)(RT711_JD2_100K),
++ },
+ {}
+ };
+
+diff --git a/sound/soc/intel/boards/sof_sdw_rt711_sdca.c b/sound/soc/intel/boards/sof_sdw_rt711_sdca.c
+index 7f16304d025be..cf8b9793fe0e5 100644
+--- a/sound/soc/intel/boards/sof_sdw_rt711_sdca.c
++++ b/sound/soc/intel/boards/sof_sdw_rt711_sdca.c
+@@ -143,6 +143,9 @@ int sof_sdw_rt711_sdca_exit(struct snd_soc_card *card, struct snd_soc_dai_link *
+ if (!ctx->headset_codec_dev)
+ return 0;
+
++ if (!SOF_RT711_JDSRC(sof_sdw_quirk))
++ return 0;
++
+ device_remove_software_node(ctx->headset_codec_dev);
+ put_device(ctx->headset_codec_dev);
+
+diff --git a/sound/soc/meson/axg-tdm-formatter.c b/sound/soc/meson/axg-tdm-formatter.c
+index 9883dc777f630..63333a2b0a9c3 100644
+--- a/sound/soc/meson/axg-tdm-formatter.c
++++ b/sound/soc/meson/axg-tdm-formatter.c
+@@ -30,27 +30,32 @@ int axg_tdm_formatter_set_channel_masks(struct regmap *map,
+ struct axg_tdm_stream *ts,
+ unsigned int offset)
+ {
+- unsigned int val, ch = ts->channels;
+- unsigned long mask;
+- int i, j;
++ unsigned int ch = ts->channels;
++ u32 val[AXG_TDM_NUM_LANES];
++ int i, j, k;
++
++ /*
++ * We need to mimic the slot distribution used by the HW to keep the
++ * channel placement consistent regardless of the number of channels
++ * in the stream. This is why the odd algorithm below is used.
++ */
++ memset(val, 0, sizeof(*val) * AXG_TDM_NUM_LANES);
+
+ /*
+ * Distribute the channels of the stream over the available slots
+- * of each TDM lane
++ * of each TDM lane. We need to go over the 32 slots ...
+ */
+- for (i = 0; i < AXG_TDM_NUM_LANES; i++) {
+- val = 0;
+- mask = ts->mask[i];
+-
+- for (j = find_first_bit(&mask, 32);
+- (j < 32) && ch;
+- j = find_next_bit(&mask, 32, j + 1)) {
+- val |= 1 << j;
+- ch -= 1;
++ for (i = 0; (i < 32) && ch; i += 2) {
++ /* ... of all the lanes ... */
++ for (j = 0; j < AXG_TDM_NUM_LANES; j++) {
++ /* ... then distribute the channels in pairs */
++ for (k = 0; k < 2; k++) {
++ if ((BIT(i + k) & ts->mask[j]) && ch) {
++ val[j] |= BIT(i + k);
++ ch -= 1;
++ }
++ }
+ }
+-
+- regmap_write(map, offset, val);
+- offset += regmap_get_reg_stride(map);
+ }
+
+ /*
+@@ -63,6 +68,11 @@ int axg_tdm_formatter_set_channel_masks(struct regmap *map,
+ return -EINVAL;
+ }
+
++ for (i = 0; i < AXG_TDM_NUM_LANES; i++) {
++ regmap_write(map, offset, val[i]);
++ offset += regmap_get_reg_stride(map);
++ }
++
+ return 0;
+ }
+ EXPORT_SYMBOL_GPL(axg_tdm_formatter_set_channel_masks);
+diff --git a/sound/soc/sof/amd/acp.h b/sound/soc/sof/amd/acp.h
+index 1c535cc6c3a95..dc624f727aa37 100644
+--- a/sound/soc/sof/amd/acp.h
++++ b/sound/soc/sof/amd/acp.h
+@@ -55,6 +55,9 @@
+
+ #define ACP_DSP_TO_HOST_IRQ 0x04
+
++#define ACP_RN_PCI_ID 0x01
++#define ACP_RMB_PCI_ID 0x6F
++
+ #define HOST_BRIDGE_CZN 0x1630
+ #define HOST_BRIDGE_RMB 0x14B5
+ #define ACP_SHA_STAT 0x8000
+diff --git a/sound/soc/sof/amd/pci-rmb.c b/sound/soc/sof/amd/pci-rmb.c
+index eaf70ea6e556e..58b3092425f1a 100644
+--- a/sound/soc/sof/amd/pci-rmb.c
++++ b/sound/soc/sof/amd/pci-rmb.c
+@@ -65,6 +65,9 @@ static int acp_pci_rmb_probe(struct pci_dev *pci, const struct pci_device_id *pc
+ {
+ unsigned int flag;
+
++ if (pci->revision != ACP_RMB_PCI_ID)
++ return -ENODEV;
++
+ flag = snd_amd_acp_find_config(pci);
+ if (flag != FLAG_AMD_SOF && flag != FLAG_AMD_SOF_ONLY_DMIC)
+ return -ENODEV;
+diff --git a/sound/soc/sof/amd/pci-rn.c b/sound/soc/sof/amd/pci-rn.c
+index 4809cb644152b..7409e21ce5aa7 100644
+--- a/sound/soc/sof/amd/pci-rn.c
++++ b/sound/soc/sof/amd/pci-rn.c
+@@ -65,6 +65,9 @@ static int acp_pci_rn_probe(struct pci_dev *pci, const struct pci_device_id *pci
+ {
+ unsigned int flag;
+
++ if (pci->revision != ACP_RN_PCI_ID)
++ return -ENODEV;
++
+ flag = snd_amd_acp_find_config(pci);
+ if (flag != FLAG_AMD_SOF && flag != FLAG_AMD_SOF_ONLY_DMIC)
+ return -ENODEV;
+diff --git a/sound/soc/sof/core.c b/sound/soc/sof/core.c
+index 9a9d82220fd0d..30db685cc5f4b 100644
+--- a/sound/soc/sof/core.c
++++ b/sound/soc/sof/core.c
+@@ -504,8 +504,10 @@ int snd_sof_device_shutdown(struct device *dev)
+ if (IS_ENABLED(CONFIG_SND_SOC_SOF_PROBE_WORK_QUEUE))
+ cancel_work_sync(&sdev->probe_work);
+
+- if (sdev->fw_state == SOF_FW_BOOT_COMPLETE)
++ if (sdev->fw_state == SOF_FW_BOOT_COMPLETE) {
++ sof_fw_trace_free(sdev);
+ return snd_sof_shutdown(sdev);
++ }
+
+ return 0;
+ }
+diff --git a/sound/soc/sof/intel/hda-dai-ops.c b/sound/soc/sof/intel/hda-dai-ops.c
+index 4b39cecacd68d..5938046f46b21 100644
+--- a/sound/soc/sof/intel/hda-dai-ops.c
++++ b/sound/soc/sof/intel/hda-dai-ops.c
+@@ -289,16 +289,27 @@ static const struct hda_dai_widget_dma_ops hda_ipc4_chain_dma_ops = {
+ static int hda_ipc3_post_trigger(struct snd_sof_dev *sdev, struct snd_soc_dai *cpu_dai,
+ struct snd_pcm_substream *substream, int cmd)
+ {
++ struct hdac_ext_stream *hext_stream = hda_get_hext_stream(sdev, cpu_dai, substream);
+ struct snd_soc_dapm_widget *w = snd_soc_dai_get_widget(cpu_dai, substream->stream);
++ struct snd_soc_pcm_runtime *rtd = asoc_substream_to_rtd(substream);
++ struct snd_soc_dai *codec_dai = asoc_rtd_to_codec(rtd, 0);
+
+ switch (cmd) {
+ case SNDRV_PCM_TRIGGER_SUSPEND:
+ case SNDRV_PCM_TRIGGER_STOP:
+ {
+ struct snd_sof_dai_config_data data = { 0 };
++ int ret;
+
+ data.dai_data = DMA_CHAN_INVALID;
+- return hda_dai_config(w, SOF_DAI_CONFIG_FLAGS_HW_FREE, &data);
++ ret = hda_dai_config(w, SOF_DAI_CONFIG_FLAGS_HW_FREE, &data);
++ if (ret < 0)
++ return ret;
++
++ if (cmd == SNDRV_PCM_TRIGGER_STOP)
++ return hda_link_dma_cleanup(substream, hext_stream, cpu_dai, codec_dai);
++
++ break;
+ }
+ case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
+ return hda_dai_config(w, SOF_DAI_CONFIG_FLAGS_PAUSE, NULL);
+diff --git a/sound/soc/sof/intel/hda-dai.c b/sound/soc/sof/intel/hda-dai.c
+index 44a5d94c5050f..8a76320c3b993 100644
+--- a/sound/soc/sof/intel/hda-dai.c
++++ b/sound/soc/sof/intel/hda-dai.c
+@@ -91,10 +91,10 @@ hda_dai_get_ops(struct snd_pcm_substream *substream, struct snd_soc_dai *cpu_dai
+ return sdai->platform_private;
+ }
+
+-static int hda_link_dma_cleanup(struct snd_pcm_substream *substream,
+- struct hdac_ext_stream *hext_stream,
+- struct snd_soc_dai *cpu_dai,
+- struct snd_soc_dai *codec_dai)
++int hda_link_dma_cleanup(struct snd_pcm_substream *substream,
++ struct hdac_ext_stream *hext_stream,
++ struct snd_soc_dai *cpu_dai,
++ struct snd_soc_dai *codec_dai)
+ {
+ struct snd_sof_dev *sdev = snd_soc_component_get_drvdata(cpu_dai->component);
+ const struct hda_dai_widget_dma_ops *ops = hda_dai_get_ops(substream, cpu_dai);
+diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c
+index 3153e21f100ab..3853582e32e12 100644
+--- a/sound/soc/sof/intel/hda.c
++++ b/sound/soc/sof/intel/hda.c
+@@ -1343,12 +1343,22 @@ static void hda_generic_machine_select(struct snd_sof_dev *sdev,
+ hda_mach->mach_params.dmic_num = dmic_num;
+ pdata->tplg_filename = tplg_filename;
+
+- if (codec_num == 2) {
++ if (codec_num == 2 ||
++ (codec_num == 1 && !HDA_IDISP_CODEC(bus->codec_mask))) {
+ /*
+ * Prevent SoundWire links from starting when an external
+ * HDaudio codec is used
+ */
+ hda_mach->mach_params.link_mask = 0;
++ } else {
++ /*
++ * Allow SoundWire links to start when no external HDaudio codec
++ * was detected. This will not create a SoundWire card but
++ * will help detect if any SoundWire codec reports as ATTACHED.
++ */
++ struct sof_intel_hda_dev *hdev = sdev->pdata->hw_pdata;
++
++ hda_mach->mach_params.link_mask = hdev->info.link_mask;
+ }
+
+ *mach = hda_mach;
+diff --git a/sound/soc/sof/intel/hda.h b/sound/soc/sof/intel/hda.h
+index c4befacde23e4..94c738eae751a 100644
+--- a/sound/soc/sof/intel/hda.h
++++ b/sound/soc/sof/intel/hda.h
+@@ -942,5 +942,7 @@ const struct hda_dai_widget_dma_ops *
+ hda_select_dai_widget_ops(struct snd_sof_dev *sdev, struct snd_sof_widget *swidget);
+ int hda_dai_config(struct snd_soc_dapm_widget *w, unsigned int flags,
+ struct snd_sof_dai_config_data *data);
++int hda_link_dma_cleanup(struct snd_pcm_substream *substream, struct hdac_ext_stream *hext_stream,
++ struct snd_soc_dai *cpu_dai, struct snd_soc_dai *codec_dai);
+
+ #endif
+diff --git a/sound/usb/quirks-table.h b/sound/usb/quirks-table.h
+index efb4a3311cc59..5d72dc8441cbb 100644
+--- a/sound/usb/quirks-table.h
++++ b/sound/usb/quirks-table.h
+@@ -4507,6 +4507,35 @@ YAMAHA_DEVICE(0x7010, "UB99"),
+ }
+ }
+ },
++{
++ /* Advanced modes of the Mythware XA001AU.
++ * For the standard mode, Mythware XA001AU has ID ffad:a001
++ */
++ USB_DEVICE_VENDOR_SPEC(0xffad, 0xa001),
++ .driver_info = (unsigned long) &(const struct snd_usb_audio_quirk) {
++ .vendor_name = "Mythware",
++ .product_name = "XA001AU",
++ .ifnum = QUIRK_ANY_INTERFACE,
++ .type = QUIRK_COMPOSITE,
++ .data = (const struct snd_usb_audio_quirk[]) {
++ {
++ .ifnum = 0,
++ .type = QUIRK_IGNORE_INTERFACE,
++ },
++ {
++ .ifnum = 1,
++ .type = QUIRK_AUDIO_STANDARD_INTERFACE,
++ },
++ {
++ .ifnum = 2,
++ .type = QUIRK_AUDIO_STANDARD_INTERFACE,
++ },
++ {
++ .ifnum = -1
++ }
++ }
++ }
++},
+
+ #undef USB_DEVICE_VENDOR_SPEC
+ #undef USB_AUDIO_DEVICE
+diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
+index a25fdc08c39e8..228b2bb086ac2 100644
+--- a/tools/objtool/arch/x86/decode.c
++++ b/tools/objtool/arch/x86/decode.c
+@@ -824,8 +824,11 @@ bool arch_is_retpoline(struct symbol *sym)
+
+ bool arch_is_rethunk(struct symbol *sym)
+ {
+- return !strcmp(sym->name, "__x86_return_thunk") ||
+- !strcmp(sym->name, "srso_untrain_ret") ||
+- !strcmp(sym->name, "srso_safe_ret") ||
+- !strcmp(sym->name, "__ret");
++ return !strcmp(sym->name, "__x86_return_thunk");
++}
++
++bool arch_is_embedded_insn(struct symbol *sym)
++{
++ return !strcmp(sym->name, "retbleed_return_thunk") ||
++ !strcmp(sym->name, "srso_safe_ret");
+ }
+diff --git a/tools/objtool/check.c b/tools/objtool/check.c
+index 0fcf99c914000..f7f34a0b101e1 100644
+--- a/tools/objtool/check.c
++++ b/tools/objtool/check.c
+@@ -429,7 +429,7 @@ static int decode_instructions(struct objtool_file *file)
+ if (!strcmp(sec->name, ".noinstr.text") ||
+ !strcmp(sec->name, ".entry.text") ||
+ !strcmp(sec->name, ".cpuidle.text") ||
+- !strncmp(sec->name, ".text.__x86.", 12))
++ !strncmp(sec->name, ".text..__x86.", 13))
+ sec->noinstr = true;
+
+ /*
+@@ -495,7 +495,7 @@ static int decode_instructions(struct objtool_file *file)
+ return -1;
+ }
+
+- if (func->return_thunk || func->alias != func)
++ if (func->embedded_insn || func->alias != func)
+ continue;
+
+ if (!find_insn(file, sec, func->offset)) {
+@@ -1346,16 +1346,33 @@ static int add_ignore_alternatives(struct objtool_file *file)
+ return 0;
+ }
+
++/*
++ * Symbols that replace INSN_CALL_DYNAMIC, every (tail) call to such a symbol
++ * will be added to the .retpoline_sites section.
++ */
+ __weak bool arch_is_retpoline(struct symbol *sym)
+ {
+ return false;
+ }
+
++/*
++ * Symbols that replace INSN_RETURN, every (tail) call to such a symbol
++ * will be added to the .return_sites section.
++ */
+ __weak bool arch_is_rethunk(struct symbol *sym)
+ {
+ return false;
+ }
+
++/*
++ * Symbols that are embedded inside other instructions, because sometimes crazy
++ * code exists. These are mostly ignored for validation purposes.
++ */
++__weak bool arch_is_embedded_insn(struct symbol *sym)
++{
++ return false;
++}
++
+ static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
+ {
+ struct reloc *reloc;
+@@ -1638,14 +1655,14 @@ static int add_jump_destinations(struct objtool_file *file)
+ struct symbol *sym = find_symbol_by_offset(dest_sec, dest_off);
+
+ /*
+- * This is a special case for zen_untrain_ret().
++ * This is a special case for retbleed_untrain_ret().
+ * It jumps to __x86_return_thunk(), but objtool
+ * can't find the thunk's starting RET
+ * instruction, because the RET is also in the
+ * middle of another instruction. Objtool only
+ * knows about the outer instruction.
+ */
+- if (sym && sym->return_thunk) {
++ if (sym && sym->embedded_insn) {
+ add_return_call(file, insn, false);
+ continue;
+ }
+@@ -2550,6 +2567,9 @@ static int classify_symbols(struct objtool_file *file)
+ if (arch_is_rethunk(func))
+ func->return_thunk = true;
+
++ if (arch_is_embedded_insn(func))
++ func->embedded_insn = true;
++
+ if (arch_ftrace_match(func->name))
+ func->fentry = true;
+
+@@ -2678,12 +2698,17 @@ static int decode_sections(struct objtool_file *file)
+ return 0;
+ }
+
+-static bool is_fentry_call(struct instruction *insn)
++static bool is_special_call(struct instruction *insn)
+ {
+- if (insn->type == INSN_CALL &&
+- insn_call_dest(insn) &&
+- insn_call_dest(insn)->fentry)
+- return true;
++ if (insn->type == INSN_CALL) {
++ struct symbol *dest = insn_call_dest(insn);
++
++ if (!dest)
++ return false;
++
++ if (dest->fentry || dest->embedded_insn)
++ return true;
++ }
+
+ return false;
+ }
+@@ -3681,7 +3706,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
+ if (ret)
+ return ret;
+
+- if (opts.stackval && func && !is_fentry_call(insn) &&
++ if (opts.stackval && func && !is_special_call(insn) &&
+ !has_valid_stack_frame(&state)) {
+ WARN_INSN(insn, "call without frame pointer save/setup");
+ return 1;
+diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
+index 2b6d2ce4f9a5b..0b303eba660e4 100644
+--- a/tools/objtool/include/objtool/arch.h
++++ b/tools/objtool/include/objtool/arch.h
+@@ -90,6 +90,7 @@ int arch_decode_hint_reg(u8 sp_reg, int *base);
+
+ bool arch_is_retpoline(struct symbol *sym);
+ bool arch_is_rethunk(struct symbol *sym);
++bool arch_is_embedded_insn(struct symbol *sym);
+
+ int arch_rewrite_retpolines(struct objtool_file *file);
+
+diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
+index e1ca588eb69d1..bfb4a69d0e91e 100644
+--- a/tools/objtool/include/objtool/elf.h
++++ b/tools/objtool/include/objtool/elf.h
+@@ -61,6 +61,7 @@ struct symbol {
+ u8 return_thunk : 1;
+ u8 fentry : 1;
+ u8 profiling_func : 1;
++ u8 embedded_insn : 1;
+ struct list_head pv_target;
+ struct list_head reloc_list;
+ };
+diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
+index 9e02e19c1b7a9..4d564e0698dfc 100644
+--- a/tools/perf/util/machine.c
++++ b/tools/perf/util/machine.c
+@@ -44,7 +44,6 @@
+ #include <linux/zalloc.h>
+
+ static void __machine__remove_thread(struct machine *machine, struct thread *th, bool lock);
+-static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip);
+
+ static struct dso *machine__kernel_dso(struct machine *machine)
+ {
+@@ -2371,10 +2370,6 @@ static int add_callchain_ip(struct thread *thread,
+ ms.maps = al.maps;
+ ms.map = al.map;
+ ms.sym = al.sym;
+-
+- if (!branch && append_inlines(cursor, &ms, ip) == 0)
+- return 0;
+-
+ srcline = callchain_srcline(&ms, al.addr);
+ err = callchain_cursor_append(cursor, ip, &ms,
+ branch, flags, nr_loop_iter,
+diff --git a/tools/perf/util/thread-stack.c b/tools/perf/util/thread-stack.c
+index 4b85c1728012c..e72bd059538c1 100644
+--- a/tools/perf/util/thread-stack.c
++++ b/tools/perf/util/thread-stack.c
+@@ -1037,9 +1037,7 @@ static int thread_stack__trace_end(struct thread_stack *ts,
+
+ static bool is_x86_retpoline(const char *name)
+ {
+- const char *p = strstr(name, "__x86_indirect_thunk_");
+-
+- return p == name || !strcmp(name, "__indirect_thunk_start");
++ return strstr(name, "__x86_indirect_thunk_") == name;
+ }
+
+ /*
+diff --git a/tools/testing/selftests/net/forwarding/mirror_gre_changes.sh b/tools/testing/selftests/net/forwarding/mirror_gre_changes.sh
+index aff88f78e3391..5ea9d63915f77 100755
+--- a/tools/testing/selftests/net/forwarding/mirror_gre_changes.sh
++++ b/tools/testing/selftests/net/forwarding/mirror_gre_changes.sh
+@@ -72,7 +72,8 @@ test_span_gre_ttl()
+
+ RET=0
+
+- mirror_install $swp1 ingress $tundev "matchall $tcflags"
++ mirror_install $swp1 ingress $tundev \
++ "prot ip flower $tcflags ip_prot icmp"
+ tc filter add dev $h3 ingress pref 77 prot $prot \
+ flower skip_hw ip_ttl 50 action pass
+
^ permalink raw reply related [flat|nested] 29+ messages in thread
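The objtool hunks in the patch above replace the old special-casing of the SRSO/zen return thunks with a generic "embedded instruction" symbol class: arch_is_embedded_insn() marks retbleed_return_thunk and srso_safe_ret, classify_symbols() records the flag on the symbol, and is_special_call() then exempts calls to such symbols (and to fentry symbols) from the "call without frame pointer save/setup" warning. The following is a minimal user-space sketch of that classification flow, not code from the patch; the struct layout and the test harness are illustrative assumptions, while the symbol names and the checks themselves follow the hunks above.

/*
 * Illustrative sketch (not from the patch) of objtool's new
 * "embedded instruction" classification.  Only the symbol names and
 * the classification logic mirror the patch; struct symbol here is a
 * simplified stand-in and main() is a made-up harness.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct symbol {
	const char *name;
	unsigned int embedded_insn : 1;	/* RET embedded inside another insn */
	unsigned int fentry        : 1;	/* __fentry__ style call target */
};

/* Same test the patch adds as arch_is_embedded_insn(). */
static bool arch_is_embedded_insn(const struct symbol *sym)
{
	return !strcmp(sym->name, "retbleed_return_thunk") ||
	       !strcmp(sym->name, "srso_safe_ret");
}

/*
 * Simplified stand-in for is_special_call(): the patch exempts calls
 * whose destination is an fentry or embedded-insn symbol from the
 * "call without frame pointer save/setup" warning.
 */
static bool is_special_call(const struct symbol *dest)
{
	return dest && (dest->fentry || dest->embedded_insn);
}

int main(void)
{
	struct symbol syms[] = {
		{ .name = "__x86_return_thunk" },
		{ .name = "retbleed_return_thunk" },
		{ .name = "srso_safe_ret" },
		{ .name = "some_regular_function" },
	};
	size_t i;

	for (i = 0; i < sizeof(syms) / sizeof(syms[0]); i++) {
		/* classify_symbols() in the patch sets this flag once. */
		syms[i].embedded_insn = arch_is_embedded_insn(&syms[i]);
		printf("%-24s embedded_insn=%d special_call=%d\n",
		       syms[i].name, (int)syms[i].embedded_insn,
		       (int)is_special_call(&syms[i]));
	}
	return 0;
}

In this sketch only retbleed_return_thunk and srso_safe_ret come out as embedded_insn=1, which is why the patch can narrow arch_is_rethunk() back to just __x86_return_thunk without losing the frame-pointer exemption for the retbleed/SRSO thunks.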
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-30 13:45 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-08-30 13:45 UTC (permalink / raw
To: gentoo-commits
commit: 2e53db233b997822f3cf66a2e51670b8bf7ad336
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Aug 30 13:44:49 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Aug 30 13:44:49 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=2e53db23
Remove BMQ due to new incompatibilities with kernel version 6.4.13
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 8 -
...MQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch | 11164 -------------------
5021_BMQ-and-PDS-gentoo-defaults.patch | 13 -
3 files changed, 11185 deletions(-)
diff --git a/0000_README b/0000_README
index 5da232d8..1c391fe4 100644
--- a/0000_README
+++ b/0000_README
@@ -134,11 +134,3 @@ Desc: Add Gentoo Linux support config settings and defaults.
Patch: 5010_enable-cpu-optimizations-universal.patch
From: https://github.com/graysky2/kernel_compiler_patch
Desc: Kernel >= 5.15 patch enables gcc = v11.1+ optimizations for additional CPUs.
-
-Patch: 5020_BMQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch
-From: https://github.com/hhoffstaette/kernel-patches/
-Desc: BMQ(BitMap Queue) Scheduler. A new CPU scheduler developed from PDS(incld). Inspired by the scheduler in zircon.
-
-Patch: 5021_BMQ-and-PDS-gentoo-defaults.patch
-From: https://gitweb.gentoo.org/proj/linux-patches.git/
-Desc: Set defaults for BMQ. Add archs as people test, default to N
diff --git a/5020_BMQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch b/5020_BMQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch
deleted file mode 100644
index 5e870849..00000000
--- a/5020_BMQ-and-PDS-io-scheduler-v6.4-r1-linux-tkg.patch
+++ /dev/null
@@ -1,11164 +0,0 @@
-diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
-index 9e5bab29685f..b942b7dd8c42 100644
---- a/Documentation/admin-guide/kernel-parameters.txt
-+++ b/Documentation/admin-guide/kernel-parameters.txt
-@@ -5496,6 +5496,12 @@
- sa1100ir [NET]
- See drivers/net/irda/sa1100_ir.c.
-
-+ sched_timeslice=
-+ [KNL] Time slice in ms for Project C BMQ/PDS scheduler.
-+ Format: integer 2, 4
-+ Default: 4
-+ See Documentation/scheduler/sched-BMQ.txt
-+
- sched_verbose [KNL] Enables verbose scheduler debug messages.
-
- schedstats= [KNL,X86] Enable or disable scheduled statistics.
-diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
-index d85d90f5d000..f730195a3adb 100644
---- a/Documentation/admin-guide/sysctl/kernel.rst
-+++ b/Documentation/admin-guide/sysctl/kernel.rst
-@@ -1616,3 +1616,13 @@ is 10 seconds.
-
- The softlockup threshold is (``2 * watchdog_thresh``). Setting this
- tunable to zero will disable lockup detection altogether.
-+
-+yield_type:
-+===========
-+
-+BMQ/PDS CPU scheduler only. This determines what type of yield calls
-+to sched_yield will perform.
-+
-+ 0 - No yield.
-+ 1 - Deboost and requeue task. (default)
-+ 2 - Set run queue skip task.
-diff --git a/Documentation/scheduler/sched-BMQ.txt b/Documentation/scheduler/sched-BMQ.txt
-new file mode 100644
-index 000000000000..05c84eec0f31
---- /dev/null
-+++ b/Documentation/scheduler/sched-BMQ.txt
-@@ -0,0 +1,110 @@
-+ BitMap queue CPU Scheduler
-+ --------------------------
-+
-+CONTENT
-+========
-+
-+ Background
-+ Design
-+ Overview
-+ Task policy
-+ Priority management
-+ BitMap Queue
-+ CPU Assignment and Migration
-+
-+
-+Background
-+==========
-+
-+BitMap Queue CPU scheduler, referred to as BMQ from here on, is an evolution
-+of previous Priority and Deadline based Skiplist multiple queue scheduler(PDS),
-+and inspired by the Zircon scheduler. Its goal is to keep the scheduler code
-+simple, while remaining efficient and scalable for interactive tasks such as
-+desktop use, movie playback and gaming.
-+
-+Design
-+======
-+
-+Overview
-+--------
-+
-+BMQ uses a per-CPU run queue design: each (logical) CPU has its own run queue,
-+and each CPU is responsible for scheduling the tasks that are put into its
-+run queue.
-+
-+The run queue is a set of priority queues. In the data structure, these queues
-+are fifo queues for non-rt tasks and priority queues for rt tasks. See
-+BitMap Queue below for details. BMQ is optimized for non-rt tasks, given that
-+most applications are non-rt tasks. Whether a queue is fifo or
-+priority, each queue is an ordered list of runnable tasks awaiting execution
-+and the data structures are the same. When it is time for a new task to run,
-+the scheduler simply looks at the lowest numbered queue that contains a task
-+and runs the first task from the head of that queue. The per-CPU idle task is
-+also in the run queue, so the scheduler can always find a task to run on from
-+its run queue.
-+
-+Each task is assigned the same timeslice (default 4ms) when it is picked to
-+start running. A task is reinserted at the end of the appropriate priority
-+queue when it uses its whole timeslice. When the scheduler selects a new task
-+from the priority queue it sets the CPU's preemption timer for the remainder of
-+the previous timeslice. When that timer fires the scheduler will stop execution
-+on that task, select another task and start over again.
-+
-+If a task blocks waiting for a shared resource then it's taken out of its
-+priority queue and is placed in a wait queue for the shared resource. When it
-+is unblocked it will be reinserted in the appropriate priority queue of an
-+eligible CPU.
-+
-+Task policy
-+-----------
-+
-+BMQ supports DEADLINE, FIFO, RR, NORMAL, BATCH and IDLE task policy like the
-+mainline CFS scheduler. But BMQ is heavily optimized for non-rt tasks, that is,
-+NORMAL/BATCH/IDLE policy tasks. Below is the implementation detail of each
-+policy.
-+
-+DEADLINE
-+ It is squashed as priority 0 FIFO task.
-+
-+FIFO/RR
-+ All RT tasks share one single priority queue in the BMQ run queue design. The
-+complexity of the insert operation is O(n). BMQ is not designed for systems
-+that run mainly rt policy tasks.
-+
-+NORMAL/BATCH/IDLE
-+ BATCH and IDLE tasks are treated as the same policy. They compete for CPU with
-+NORMAL policy tasks, but they just don't boost. To control the priority of
-+NORMAL/BATCH/IDLE tasks, simply use nice level.
-+
-+ISO
-+ ISO policy is not supported in BMQ. Please use nice level -20 NORMAL policy
-+task instead.
-+
-+Priority management
-+-------------------
-+
-+RT tasks have priority from 0-99. For non-rt tasks, there are three different
-+factors used to determine the effective priority of a task. The effective
-+priority is what determines which queue the task will be in.
-+
-+The first factor is simply the task's static priority, which is assigned from
-+the task's nice level: [-20, 19] from userland's point of view and [0, 39]
-+internally.
-+
-+The second factor is the priority boost. This is a value bounded between
-+[-MAX_PRIORITY_ADJ, MAX_PRIORITY_ADJ] used to offset the base priority, it is
-+modified by the following cases:
-+
-+*When a thread has used up its entire timeslice, its boost value is always
-+deboosted by increasing it by one.
-+*When a thread gives up cpu control (voluntarily or involuntarily) to reschedule,
-+and its switch-in time (time after last switch and run) is below the threshold
-+based on its priority boost, its boost value is boosted by decreasing it by one,
-+capped at 0 (it won't go negative).
-+
-+The intent in this system is to ensure that interactive threads are serviced
-+quickly. These are usually the threads that interact directly with the user
-+and cause user-perceivable latency. These threads usually do little work and
-+spend most of their time blocked awaiting another user event. So they get the
-+priority boost from unblocking while background threads that do most of the
-+processing receive the priority penalty for using their entire timeslice.
-diff --git a/fs/proc/base.c b/fs/proc/base.c
-index 05452c3b9872..fa1ceb85ad24 100644
---- a/fs/proc/base.c
-+++ b/fs/proc/base.c
-@@ -480,7 +480,7 @@ static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
- seq_puts(m, "0 0 0\n");
- else
- seq_printf(m, "%llu %llu %lu\n",
-- (unsigned long long)task->se.sum_exec_runtime,
-+ (unsigned long long)tsk_seruntime(task),
- (unsigned long long)task->sched_info.run_delay,
- task->sched_info.pcount);
-
-diff --git a/include/asm-generic/resource.h b/include/asm-generic/resource.h
-index 8874f681b056..59eb72bf7d5f 100644
---- a/include/asm-generic/resource.h
-+++ b/include/asm-generic/resource.h
-@@ -23,7 +23,7 @@
- [RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
- [RLIMIT_SIGPENDING] = { 0, 0 }, \
- [RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
-- [RLIMIT_NICE] = { 0, 0 }, \
-+ [RLIMIT_NICE] = { 30, 30 }, \
- [RLIMIT_RTPRIO] = { 0, 0 }, \
- [RLIMIT_RTTIME] = { RLIM_INFINITY, RLIM_INFINITY }, \
- }
-diff --git a/include/linux/sched.h b/include/linux/sched.h
-index eed5d65b8d1f..cdfd9263ddd6 100644
---- a/include/linux/sched.h
-+++ b/include/linux/sched.h
-@@ -764,8 +764,14 @@ struct task_struct {
- unsigned int ptrace;
-
- #ifdef CONFIG_SMP
-- int on_cpu;
- struct __call_single_node wake_entry;
-+#endif
-+#if defined(CONFIG_SMP) || defined(CONFIG_SCHED_ALT)
-+ int on_cpu;
-+#endif
-+
-+#ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- unsigned int wakee_flips;
- unsigned long wakee_flip_decay_ts;
- struct task_struct *last_wakee;
-@@ -779,6 +785,7 @@ struct task_struct {
- */
- int recent_used_cpu;
- int wake_cpu;
-+#endif /* !CONFIG_SCHED_ALT */
- #endif
- int on_rq;
-
-@@ -787,6 +794,20 @@ struct task_struct {
- int normal_prio;
- unsigned int rt_priority;
-
-+#ifdef CONFIG_SCHED_ALT
-+ u64 last_ran;
-+ s64 time_slice;
-+ int sq_idx;
-+ struct list_head sq_node;
-+#ifdef CONFIG_SCHED_BMQ
-+ int boost_prio;
-+#endif /* CONFIG_SCHED_BMQ */
-+#ifdef CONFIG_SCHED_PDS
-+ u64 deadline;
-+#endif /* CONFIG_SCHED_PDS */
-+ /* sched_clock time spent running */
-+ u64 sched_time;
-+#else /* !CONFIG_SCHED_ALT */
- struct sched_entity se;
- struct sched_rt_entity rt;
- struct sched_dl_entity dl;
-@@ -797,6 +818,7 @@ struct task_struct {
- unsigned long core_cookie;
- unsigned int core_occupation;
- #endif
-+#endif /* !CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_CGROUP_SCHED
- struct task_group *sched_task_group;
-@@ -1551,6 +1573,15 @@ struct task_struct {
- */
- };
-
-+#ifdef CONFIG_SCHED_ALT
-+#define tsk_seruntime(t) ((t)->sched_time)
-+/* replace the uncertain rt_timeout with 0UL */
-+#define tsk_rttimeout(t) (0UL)
-+#else /* CFS */
-+#define tsk_seruntime(t) ((t)->se.sum_exec_runtime)
-+#define tsk_rttimeout(t) ((t)->rt.timeout)
-+#endif /* !CONFIG_SCHED_ALT */
-+
- static inline struct pid *task_pid(struct task_struct *task)
- {
- return task->thread_pid;
-diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
-index 7c83d4d5a971..fa30f98cb2be 100644
---- a/include/linux/sched/deadline.h
-+++ b/include/linux/sched/deadline.h
-@@ -1,5 +1,24 @@
- /* SPDX-License-Identifier: GPL-2.0 */
-
-+#ifdef CONFIG_SCHED_ALT
-+
-+static inline int dl_task(struct task_struct *p)
-+{
-+ return 0;
-+}
-+
-+#ifdef CONFIG_SCHED_BMQ
-+#define __tsk_deadline(p) (0UL)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+#define __tsk_deadline(p) ((((u64) ((p)->prio))<<56) | (p)->deadline)
-+#endif
-+
-+#else
-+
-+#define __tsk_deadline(p) ((p)->dl.deadline)
-+
- /*
- * SCHED_DEADLINE tasks has negative priorities, reflecting
- * the fact that any of them has higher prio than RT and
-@@ -21,6 +40,7 @@ static inline int dl_task(struct task_struct *p)
- {
- return dl_prio(p->prio);
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- static inline bool dl_time_before(u64 a, u64 b)
- {
-diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
-index ab83d85e1183..6af9ae681116 100644
---- a/include/linux/sched/prio.h
-+++ b/include/linux/sched/prio.h
-@@ -18,6 +18,32 @@
- #define MAX_PRIO (MAX_RT_PRIO + NICE_WIDTH)
- #define DEFAULT_PRIO (MAX_RT_PRIO + NICE_WIDTH / 2)
-
-+#ifdef CONFIG_SCHED_ALT
-+
-+/* Undefine MAX_PRIO and DEFAULT_PRIO */
-+#undef MAX_PRIO
-+#undef DEFAULT_PRIO
-+
-+/* +/- priority levels from the base priority */
-+#ifdef CONFIG_SCHED_BMQ
-+#define MAX_PRIORITY_ADJ (7)
-+
-+#define MIN_NORMAL_PRIO (MAX_RT_PRIO)
-+#define MAX_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH)
-+#define DEFAULT_PRIO (MIN_NORMAL_PRIO + NICE_WIDTH / 2)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+#define MAX_PRIORITY_ADJ (0)
-+
-+#define MIN_NORMAL_PRIO (128)
-+#define NORMAL_PRIO_NUM (64)
-+#define MAX_PRIO (MIN_NORMAL_PRIO + NORMAL_PRIO_NUM)
-+#define DEFAULT_PRIO (MAX_PRIO - NICE_WIDTH / 2)
-+#endif
-+
-+#endif /* CONFIG_SCHED_ALT */
-+
- /*
- * Convert user-nice values [ -20 ... 0 ... 19 ]
- * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
-diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
-index 994c25640e15..8c050a59ece1 100644
---- a/include/linux/sched/rt.h
-+++ b/include/linux/sched/rt.h
-@@ -24,8 +24,10 @@ static inline bool task_is_realtime(struct task_struct *tsk)
-
- if (policy == SCHED_FIFO || policy == SCHED_RR)
- return true;
-+#ifndef CONFIG_SCHED_ALT
- if (policy == SCHED_DEADLINE)
- return true;
-+#endif
- return false;
- }
-
-diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
-index 816df6cc444e..c8da08e18c91 100644
---- a/include/linux/sched/topology.h
-+++ b/include/linux/sched/topology.h
-@@ -234,7 +234,8 @@ static inline bool cpus_share_cache(int this_cpu, int that_cpu)
-
- #endif /* !CONFIG_SMP */
-
--#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
-+#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) && \
-+ !defined(CONFIG_SCHED_ALT)
- extern void rebuild_sched_domains_energy(void);
- #else
- static inline void rebuild_sched_domains_energy(void)
-diff --git a/init/Kconfig b/init/Kconfig
-index 32c24950c4ce..cf951b739454 100644
---- a/init/Kconfig
-+++ b/init/Kconfig
-@@ -629,6 +629,7 @@ config TASK_IO_ACCOUNTING
-
- config PSI
- bool "Pressure stall information tracking"
-+ depends on !SCHED_ALT
- help
- Collect metrics that indicate how overcommitted the CPU, memory,
- and IO capacity are in the system.
-@@ -793,6 +794,7 @@ menu "Scheduler features"
- config UCLAMP_TASK
- bool "Enable utilization clamping for RT/FAIR tasks"
- depends on CPU_FREQ_GOV_SCHEDUTIL
-+ depends on !SCHED_ALT
- help
- This feature enables the scheduler to track the clamped utilization
- of each CPU based on RUNNABLE tasks scheduled on that CPU.
-@@ -839,6 +841,35 @@ config UCLAMP_BUCKETS_COUNT
-
- If in doubt, use the default value.
-
-+menuconfig SCHED_ALT
-+ bool "Alternative CPU Schedulers"
-+ default y
-+ help
-+	  This feature enables the alternative CPU schedulers.
-+
-+if SCHED_ALT
-+
-+choice
-+ prompt "Alternative CPU Scheduler"
-+ default SCHED_BMQ
-+
-+config SCHED_BMQ
-+ bool "BMQ CPU scheduler"
-+ help
-+ The BitMap Queue CPU scheduler for excellent interactivity and
-+ responsiveness on the desktop and solid scalability on normal
-+ hardware and commodity servers.
-+
-+config SCHED_PDS
-+ bool "PDS CPU scheduler"
-+ help
-+ The Priority and Deadline based Skip list multiple queue CPU
-+ Scheduler.
-+
-+endchoice
-+
-+endif
-+
- endmenu
-
- #
-@@ -892,6 +923,7 @@ config NUMA_BALANCING
- depends on ARCH_SUPPORTS_NUMA_BALANCING
- depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
- depends on SMP && NUMA && MIGRATION && !PREEMPT_RT
-+ depends on !SCHED_ALT
- help
- This option adds support for automatic NUMA aware memory/task placement.
- The mechanism is quite primitive and is based on migrating memory when
-@@ -989,6 +1021,7 @@ config FAIR_GROUP_SCHED
- depends on CGROUP_SCHED
- default CGROUP_SCHED
-
-+if !SCHED_ALT
- config CFS_BANDWIDTH
- bool "CPU bandwidth provisioning for FAIR_GROUP_SCHED"
- depends on FAIR_GROUP_SCHED
-@@ -1011,6 +1044,7 @@ config RT_GROUP_SCHED
- realtime bandwidth for them.
- See Documentation/scheduler/sched-rt-group.rst for more information.
-
-+endif #!SCHED_ALT
- endif #CGROUP_SCHED
-
- config SCHED_MM_CID
-@@ -1259,6 +1293,7 @@ config CHECKPOINT_RESTORE
-
- config SCHED_AUTOGROUP
- bool "Automatic process group scheduling"
-+ depends on !SCHED_ALT
- select CGROUPS
- select CGROUP_SCHED
- select FAIR_GROUP_SCHED
-diff --git a/init/init_task.c b/init/init_task.c
-index ff6c4b9bfe6b..19e9c662d1a1 100644
---- a/init/init_task.c
-+++ b/init/init_task.c
-@@ -75,9 +75,15 @@ struct task_struct init_task
- .stack = init_stack,
- .usage = REFCOUNT_INIT(2),
- .flags = PF_KTHREAD,
-+#ifdef CONFIG_SCHED_ALT
-+ .prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
-+ .static_prio = DEFAULT_PRIO,
-+ .normal_prio = DEFAULT_PRIO + MAX_PRIORITY_ADJ,
-+#else
- .prio = MAX_PRIO - 20,
- .static_prio = MAX_PRIO - 20,
- .normal_prio = MAX_PRIO - 20,
-+#endif
- .policy = SCHED_NORMAL,
- .cpus_ptr = &init_task.cpus_mask,
- .user_cpus_ptr = NULL,
-@@ -88,6 +94,17 @@ struct task_struct init_task
- .restart_block = {
- .fn = do_no_restart_syscall,
- },
-+#ifdef CONFIG_SCHED_ALT
-+ .sq_node = LIST_HEAD_INIT(init_task.sq_node),
-+#ifdef CONFIG_SCHED_BMQ
-+ .boost_prio = 0,
-+ .sq_idx = 15,
-+#endif
-+#ifdef CONFIG_SCHED_PDS
-+ .deadline = 0,
-+#endif
-+ .time_slice = HZ,
-+#else
- .se = {
- .group_node = LIST_HEAD_INIT(init_task.se.group_node),
- },
-@@ -95,6 +112,7 @@ struct task_struct init_task
- .run_list = LIST_HEAD_INIT(init_task.rt.run_list),
- .time_slice = RR_TIMESLICE,
- },
-+#endif
- .tasks = LIST_HEAD_INIT(init_task.tasks),
- #ifdef CONFIG_SMP
- .pushable_tasks = PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
-diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
-index c2f1fd95a821..41654679b1b2 100644
---- a/kernel/Kconfig.preempt
-+++ b/kernel/Kconfig.preempt
-@@ -117,7 +117,7 @@ config PREEMPT_DYNAMIC
-
- config SCHED_CORE
- bool "Core Scheduling for SMT"
-- depends on SCHED_SMT
-+ depends on SCHED_SMT && !SCHED_ALT
- help
- This option permits Core Scheduling, a means of coordinated task
- selection across SMT siblings. When enabled -- see
-diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
-index e4ca2dd2b764..82786dbb220c 100644
---- a/kernel/cgroup/cpuset.c
-+++ b/kernel/cgroup/cpuset.c
-@@ -791,7 +791,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
- return ret;
- }
-
--#ifdef CONFIG_SMP
-+#if defined(CONFIG_SMP) && !defined(CONFIG_SCHED_ALT)
- /*
- * Helper routine for generate_sched_domains().
- * Do cpusets a, b have overlapping effective cpus_allowed masks?
-@@ -1187,7 +1187,7 @@ static void rebuild_sched_domains_locked(void)
- /* Have scheduler rebuild the domains */
- partition_and_rebuild_sched_domains(ndoms, doms, attr);
- }
--#else /* !CONFIG_SMP */
-+#else /* !CONFIG_SMP || CONFIG_SCHED_ALT */
- static void rebuild_sched_domains_locked(void)
- {
- }
-diff --git a/kernel/delayacct.c b/kernel/delayacct.c
-index 6f0c358e73d8..8111481ce8b1 100644
---- a/kernel/delayacct.c
-+++ b/kernel/delayacct.c
-@@ -150,7 +150,7 @@ int delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
- */
- t1 = tsk->sched_info.pcount;
- t2 = tsk->sched_info.run_delay;
-- t3 = tsk->se.sum_exec_runtime;
-+ t3 = tsk_seruntime(tsk);
-
- d->cpu_count += t1;
-
-diff --git a/kernel/exit.c b/kernel/exit.c
-index edb50b4c9972..09e72bba7cc2 100644
---- a/kernel/exit.c
-+++ b/kernel/exit.c
-@@ -173,7 +173,7 @@ static void __exit_signal(struct task_struct *tsk)
- sig->curr_target = next_thread(tsk);
- }
-
-- add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
-+ add_device_randomness((const void*) &tsk_seruntime(tsk),
- sizeof(unsigned long long));
-
- /*
-@@ -194,7 +194,7 @@ static void __exit_signal(struct task_struct *tsk)
- sig->inblock += task_io_get_inblock(tsk);
- sig->oublock += task_io_get_oublock(tsk);
- task_io_accounting_add(&sig->ioac, &tsk->ioac);
-- sig->sum_sched_runtime += tsk->se.sum_exec_runtime;
-+ sig->sum_sched_runtime += tsk_seruntime(tsk);
- sig->nr_threads--;
- __unhash_process(tsk, group_dead);
- write_sequnlock(&sig->stats_lock);
---- a/kernel/locking/rtmutex.c 2023-08-01 15:40:26.000000000 +0200
-+++ b/kernel/locking/rtmutex.c 2023-08-02 16:05:00.952812874 +0200
-@@ -343,7 +343,7 @@ waiter_update_prio(struct rt_mutex_waite
- lockdep_assert(RB_EMPTY_NODE(&waiter->tree.entry));
-
- waiter->tree.prio = __waiter_prio(task);
-- waiter->tree.deadline = task->dl.deadline;
-+ waiter->tree.deadline = __tsk_deadline(task);
- }
-
- /*
-@@ -364,16 +364,20 @@ waiter_clone_prio(struct rt_mutex_waiter
- * Only use with rt_waiter_node_{less,equal}()
- */
- #define task_to_waiter_node(p) \
-- &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
-+ &(struct rt_waiter_node){ .prio = __waiter_prio(p), .deadline = __tsk_deadline(p) }
- #define task_to_waiter(p) \
- &(struct rt_mutex_waiter){ .tree = *task_to_waiter_node(p) }
-
- static __always_inline int rt_waiter_node_less(struct rt_waiter_node *left,
- struct rt_waiter_node *right)
- {
-+#ifdef CONFIG_SCHED_PDS
-+ return (left->deadline < right->deadline);
-+#else
- if (left->prio < right->prio)
- return 1;
-
-+#ifndef CONFIG_SCHED_BMQ
- /*
- * If both waiters have dl_prio(), we check the deadlines of the
- * associated tasks.
-@@ -382,16 +386,22 @@ static __always_inline int rt_waiter_nod
- */
- if (dl_prio(left->prio))
- return dl_time_before(left->deadline, right->deadline);
-+#endif
-
- return 0;
-+#endif
- }
-
- static __always_inline int rt_waiter_node_equal(struct rt_waiter_node *left,
- struct rt_waiter_node *right)
- {
-+#ifdef CONFIG_SCHED_PDS
-+ return (left->deadline == right->deadline);
-+#else
- if (left->prio != right->prio)
- return 0;
-
-+#ifndef CONFIG_SCHED_BMQ
- /*
- * If both waiters have dl_prio(), we check the deadlines of the
- * associated tasks.
-@@ -400,8 +410,10 @@ static __always_inline int rt_waiter_nod
- */
- if (dl_prio(left->prio))
- return left->deadline == right->deadline;
-+#endif
-
- return 1;
-+#endif
- }
-
- static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
-diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
-index 976092b7bd45..31d587c16ec1 100644
---- a/kernel/sched/Makefile
-+++ b/kernel/sched/Makefile
-@@ -28,7 +28,12 @@ endif
- # These compilation units have roughly the same size and complexity - so their
- # build parallelizes well and finishes roughly at once:
- #
-+ifdef CONFIG_SCHED_ALT
-+obj-y += alt_core.o
-+obj-$(CONFIG_SCHED_DEBUG) += alt_debug.o
-+else
- obj-y += core.o
- obj-y += fair.o
-+endif
- obj-y += build_policy.o
- obj-y += build_utility.o
-diff --git a/kernel/sched/alt_core.c b/kernel/sched/alt_core.c
-new file mode 100644
-index 000000000000..3e8ddbd8001c
---- /dev/null
-+++ b/kernel/sched/alt_core.c
-@@ -0,0 +1,8729 @@
-+/*
-+ * kernel/sched/alt_core.c
-+ *
-+ * Core alternative kernel scheduler code and related syscalls
-+ *
-+ * Copyright (C) 1991-2002 Linus Torvalds
-+ *
-+ * 2009-08-13 Brainfuck deadline scheduling policy by Con Kolivas deletes
-+ * a whole lot of those previous things.
-+ * 2017-09-06 Priority and Deadline based Skip list multiple queue kernel
-+ * scheduler by Alfred Chen.
-+ * 2019-02-20 BMQ(BitMap Queue) kernel scheduler by Alfred Chen.
-+ */
-+#include <linux/sched/clock.h>
-+#include <linux/sched/cputime.h>
-+#include <linux/sched/debug.h>
-+#include <linux/sched/isolation.h>
-+#include <linux/sched/loadavg.h>
-+#include <linux/sched/mm.h>
-+#include <linux/sched/nohz.h>
-+#include <linux/sched/stat.h>
-+#include <linux/sched/wake_q.h>
-+
-+#include <linux/blkdev.h>
-+#include <linux/context_tracking.h>
-+#include <linux/cpuset.h>
-+#include <linux/delayacct.h>
-+#include <linux/init_task.h>
-+#include <linux/kcov.h>
-+#include <linux/kprobes.h>
-+#include <linux/nmi.h>
-+#include <linux/scs.h>
-+
-+#include <uapi/linux/sched/types.h>
-+
-+#include <asm/irq_regs.h>
-+#include <asm/switch_to.h>
-+
-+#define CREATE_TRACE_POINTS
-+#include <trace/events/sched.h>
-+#include <trace/events/ipi.h>
-+#undef CREATE_TRACE_POINTS
-+
-+#include "sched.h"
-+
-+#include "pelt.h"
-+
-+#include "../../io_uring/io-wq.h"
-+#include "../smpboot.h"
-+
-+EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpu);
-+EXPORT_TRACEPOINT_SYMBOL_GPL(ipi_send_cpumask);
-+
-+/*
-+ * Export tracepoints that act as a bare tracehook (ie: have no trace event
-+ * associated with them) to allow external modules to probe them.
-+ */
-+EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp);
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+#define sched_feat(x) (1)
-+/*
-+ * Print a warning if need_resched is set for the given duration (if
-+ * LATENCY_WARN is enabled).
-+ *
-+ * If sysctl_resched_latency_warn_once is set, only one warning will be shown
-+ * per boot.
-+ */
-+__read_mostly int sysctl_resched_latency_warn_ms = 100;
-+__read_mostly int sysctl_resched_latency_warn_once = 1;
-+#else
-+#define sched_feat(x) (0)
-+#endif /* CONFIG_SCHED_DEBUG */
-+
-+#define ALT_SCHED_VERSION "v6.4-r1"
-+
-+/*
-+ * Compile time debug macro
-+ * #define ALT_SCHED_DEBUG
-+ */
-+
-+/* rt_prio(prio) defined in include/linux/sched/rt.h */
-+#define rt_task(p) rt_prio((p)->prio)
-+#define rt_policy(policy) ((policy) == SCHED_FIFO || (policy) == SCHED_RR)
-+#define task_has_rt_policy(p) (rt_policy((p)->policy))
-+
-+#define STOP_PRIO (MAX_RT_PRIO - 1)
-+
-+/* Default time slice is 4 in ms, can be set via kernel parameter "sched_timeslice" */
-+u64 sched_timeslice_ns __read_mostly = (4 << 20);
-+
-+static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx);
-+
-+#ifdef CONFIG_SCHED_BMQ
-+#include "bmq.h"
-+#endif
-+#ifdef CONFIG_SCHED_PDS
-+#include "pds.h"
-+#endif
-+
-+struct affinity_context {
-+ const struct cpumask *new_mask;
-+ struct cpumask *user_mask;
-+ unsigned int flags;
-+};
-+
-+static int __init sched_timeslice(char *str)
-+{
-+ int timeslice_ms;
-+
-+	get_option(&str, &timeslice_ms);
-+ if (2 != timeslice_ms)
-+ timeslice_ms = 4;
-+ sched_timeslice_ns = timeslice_ms << 20;
-+ sched_timeslice_imp(timeslice_ms);
-+
-+ return 0;
-+}
-+early_param("sched_timeslice", sched_timeslice);
-+
-+/* Reschedule if less than this many μs left */
-+#define RESCHED_NS (100 << 10)
-+
-+/**
-+ * sched_yield_type - Choose what sort of yield sched_yield will perform.
-+ * 0: No yield.
-+ * 1: Deboost and requeue task. (default)
-+ * 2: Set rq skip task.
-+ */
-+int sched_yield_type __read_mostly = 1;
-+
-+#ifdef CONFIG_SMP
-+static cpumask_t sched_rq_pending_mask ____cacheline_aligned_in_smp;
-+
-+DEFINE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
-+DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_llc_mask);
-+DEFINE_PER_CPU_ALIGNED(cpumask_t *, sched_cpu_topo_end_mask);
-+
-+#ifdef CONFIG_SCHED_SMT
-+DEFINE_STATIC_KEY_FALSE(sched_smt_present);
-+EXPORT_SYMBOL_GPL(sched_smt_present);
-+#endif
-+
-+/*
-+ * Keep a unique ID per domain (we use the first CPUs number in the cpumask of
-+ * the domain), this allows us to quickly tell if two cpus are in the same cache
-+ * domain, see cpus_share_cache().
-+ */
-+DEFINE_PER_CPU(int, sd_llc_id);
-+#endif /* CONFIG_SMP */
-+
-+static DEFINE_MUTEX(sched_hotcpu_mutex);
-+
-+DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
-+
-+#ifndef prepare_arch_switch
-+# define prepare_arch_switch(next) do { } while (0)
-+#endif
-+#ifndef finish_arch_post_lock_switch
-+# define finish_arch_post_lock_switch() do { } while (0)
-+#endif
-+
-+#ifdef CONFIG_SCHED_SMT
-+static cpumask_t sched_sg_idle_mask ____cacheline_aligned_in_smp;
-+#endif
-+static cpumask_t sched_preempt_mask[SCHED_QUEUE_BITS] ____cacheline_aligned_in_smp;
-+static cpumask_t *const sched_idle_mask = &sched_preempt_mask[0];
-+
-+/* task function */
-+static inline const struct cpumask *task_user_cpus(struct task_struct *p)
-+{
-+ if (!p->user_cpus_ptr)
-+ return cpu_possible_mask; /* &init_task.cpus_mask */
-+ return p->user_cpus_ptr;
-+}
-+
-+/* sched_queue related functions */
-+static inline void sched_queue_init(struct sched_queue *q)
-+{
-+ int i;
-+
-+ bitmap_zero(q->bitmap, SCHED_QUEUE_BITS);
-+ for(i = 0; i < SCHED_LEVELS; i++)
-+ INIT_LIST_HEAD(&q->heads[i]);
-+}
-+
-+/*
-+ * Init idle task and put into queue structure of rq
-+ * IMPORTANT: may be called multiple times for a single cpu
-+ */
-+static inline void sched_queue_init_idle(struct sched_queue *q,
-+ struct task_struct *idle)
-+{
-+ idle->sq_idx = IDLE_TASK_SCHED_PRIO;
-+ INIT_LIST_HEAD(&q->heads[idle->sq_idx]);
-+ list_add(&idle->sq_node, &q->heads[idle->sq_idx]);
-+}
-+
-+static inline void
-+clear_recorded_preempt_mask(int pr, int low, int high, int cpu)
-+{
-+ if (low < pr && pr <= high)
-+ cpumask_clear_cpu(cpu, sched_preempt_mask + SCHED_QUEUE_BITS - pr);
-+}
-+
-+static inline void
-+set_recorded_preempt_mask(int pr, int low, int high, int cpu)
-+{
-+ if (low < pr && pr <= high)
-+ cpumask_set_cpu(cpu, sched_preempt_mask + SCHED_QUEUE_BITS - pr);
-+}
-+
-+static atomic_t sched_prio_record = ATOMIC_INIT(0);
-+
-+/* water mark related functions */
-+static inline void update_sched_preempt_mask(struct rq *rq)
-+{
-+ unsigned long prio = find_first_bit(rq->queue.bitmap, SCHED_QUEUE_BITS);
-+ unsigned long last_prio = rq->prio;
-+ int cpu, pr;
-+
-+ if (prio == last_prio)
-+ return;
-+
-+ rq->prio = prio;
-+ cpu = cpu_of(rq);
-+ pr = atomic_read(&sched_prio_record);
-+
-+ if (prio < last_prio) {
-+ if (IDLE_TASK_SCHED_PRIO == last_prio) {
-+#ifdef CONFIG_SCHED_SMT
-+ if (static_branch_likely(&sched_smt_present))
-+ cpumask_andnot(&sched_sg_idle_mask,
-+ &sched_sg_idle_mask, cpu_smt_mask(cpu));
-+#endif
-+ cpumask_clear_cpu(cpu, sched_idle_mask);
-+ last_prio -= 2;
-+ }
-+ clear_recorded_preempt_mask(pr, prio, last_prio, cpu);
-+
-+ return;
-+ }
-+ /* last_prio < prio */
-+ if (IDLE_TASK_SCHED_PRIO == prio) {
-+#ifdef CONFIG_SCHED_SMT
-+ if (static_branch_likely(&sched_smt_present) &&
-+ cpumask_intersects(cpu_smt_mask(cpu), sched_idle_mask))
-+ cpumask_or(&sched_sg_idle_mask,
-+ &sched_sg_idle_mask, cpu_smt_mask(cpu));
-+#endif
-+ cpumask_set_cpu(cpu, sched_idle_mask);
-+ prio -= 2;
-+ }
-+ set_recorded_preempt_mask(pr, last_prio, prio, cpu);
-+}
-+
-+/*
-+ * This routine assume that the idle task always in queue
-+ */
-+static inline struct task_struct *sched_rq_first_task(struct rq *rq)
-+{
-+ const struct list_head *head = &rq->queue.heads[sched_prio2idx(rq->prio, rq)];
-+
-+ return list_first_entry(head, struct task_struct, sq_node);
-+}
-+
-+static inline struct task_struct *
-+sched_rq_next_task(struct task_struct *p, struct rq *rq)
-+{
-+ unsigned long idx = p->sq_idx;
-+ struct list_head *head = &rq->queue.heads[idx];
-+
-+ if (list_is_last(&p->sq_node, head)) {
-+ idx = find_next_bit(rq->queue.bitmap, SCHED_QUEUE_BITS,
-+ sched_idx2prio(idx, rq) + 1);
-+ head = &rq->queue.heads[sched_prio2idx(idx, rq)];
-+
-+ return list_first_entry(head, struct task_struct, sq_node);
-+ }
-+
-+ return list_next_entry(p, sq_node);
-+}
-+
-+static inline struct task_struct *rq_runnable_task(struct rq *rq)
-+{
-+ struct task_struct *next = sched_rq_first_task(rq);
-+
-+ if (unlikely(next == rq->skip))
-+ next = sched_rq_next_task(next, rq);
-+
-+ return next;
-+}
-+
-+/*
-+ * Serialization rules:
-+ *
-+ * Lock order:
-+ *
-+ * p->pi_lock
-+ * rq->lock
-+ * hrtimer_cpu_base->lock (hrtimer_start() for bandwidth controls)
-+ *
-+ * rq1->lock
-+ * rq2->lock where: rq1 < rq2
-+ *
-+ * Regular state:
-+ *
-+ * Normal scheduling state is serialized by rq->lock. __schedule() takes the
-+ * local CPU's rq->lock, it optionally removes the task from the runqueue and
-+ * always looks at the local rq data structures to find the most eligible task
-+ * to run next.
-+ *
-+ * Task enqueue is also under rq->lock, possibly taken from another CPU.
-+ * Wakeups from another LLC domain might use an IPI to transfer the enqueue to
-+ * the local CPU to avoid bouncing the runqueue state around [ see
-+ * ttwu_queue_wakelist() ]
-+ *
-+ * Task wakeup, specifically wakeups that involve migration, are horribly
-+ * complicated to avoid having to take two rq->locks.
-+ *
-+ * Special state:
-+ *
-+ * System-calls and anything external will use task_rq_lock() which acquires
-+ * both p->pi_lock and rq->lock. As a consequence the state they change is
-+ * stable while holding either lock:
-+ *
-+ * - sched_setaffinity()/
-+ * set_cpus_allowed_ptr(): p->cpus_ptr, p->nr_cpus_allowed
-+ * - set_user_nice(): p->se.load, p->*prio
-+ * - __sched_setscheduler(): p->sched_class, p->policy, p->*prio,
-+ * p->se.load, p->rt_priority,
-+ * p->dl.dl_{runtime, deadline, period, flags, bw, density}
-+ * - sched_setnuma(): p->numa_preferred_nid
-+ * - sched_move_task(): p->sched_task_group
-+ * - uclamp_update_active() p->uclamp*
-+ *
-+ * p->state <- TASK_*:
-+ *
-+ * is changed locklessly using set_current_state(), __set_current_state() or
-+ * set_special_state(), see their respective comments, or by
-+ * try_to_wake_up(). This latter uses p->pi_lock to serialize against
-+ * concurrent self.
-+ *
-+ * p->on_rq <- { 0, 1 = TASK_ON_RQ_QUEUED, 2 = TASK_ON_RQ_MIGRATING }:
-+ *
-+ * is set by activate_task() and cleared by deactivate_task(), under
-+ * rq->lock. Non-zero indicates the task is runnable, the special
-+ * ON_RQ_MIGRATING state is used for migration without holding both
-+ * rq->locks. It indicates task_cpu() is not stable, see task_rq_lock().
-+ *
-+ * p->on_cpu <- { 0, 1 }:
-+ *
-+ * is set by prepare_task() and cleared by finish_task() such that it will be
-+ * set before p is scheduled-in and cleared after p is scheduled-out, both
-+ * under rq->lock. Non-zero indicates the task is running on its CPU.
-+ *
-+ * [ The astute reader will observe that it is possible for two tasks on one
-+ * CPU to have ->on_cpu = 1 at the same time. ]
-+ *
-+ * task_cpu(p): is changed by set_task_cpu(), the rules are:
-+ *
-+ * - Don't call set_task_cpu() on a blocked task:
-+ *
-+ * We don't care what CPU we're not running on, this simplifies hotplug,
-+ * the CPU assignment of blocked tasks isn't required to be valid.
-+ *
-+ * - for try_to_wake_up(), called under p->pi_lock:
-+ *
-+ * This allows try_to_wake_up() to only take one rq->lock, see its comment.
-+ *
-+ * - for migration called under rq->lock:
-+ * [ see task_on_rq_migrating() in task_rq_lock() ]
-+ *
-+ * o move_queued_task()
-+ * o detach_task()
-+ *
-+ * - for migration called under double_rq_lock():
-+ *
-+ * o __migrate_swap_task()
-+ * o push_rt_task() / pull_rt_task()
-+ * o push_dl_task() / pull_dl_task()
-+ * o dl_task_offline_migration()
-+ *
-+ */
-+
-+/*
-+ * Context: p->pi_lock
-+ */
-+static inline struct rq
-+*__task_access_lock(struct task_struct *p, raw_spinlock_t **plock)
-+{
-+ struct rq *rq;
-+ for (;;) {
-+ rq = task_rq(p);
-+ if (p->on_cpu || task_on_rq_queued(p)) {
-+ raw_spin_lock(&rq->lock);
-+ if (likely((p->on_cpu || task_on_rq_queued(p))
-+ && rq == task_rq(p))) {
-+ *plock = &rq->lock;
-+ return rq;
-+ }
-+ raw_spin_unlock(&rq->lock);
-+ } else if (task_on_rq_migrating(p)) {
-+ do {
-+ cpu_relax();
-+ } while (unlikely(task_on_rq_migrating(p)));
-+ } else {
-+ *plock = NULL;
-+ return rq;
-+ }
-+ }
-+}
-+
-+static inline void
-+__task_access_unlock(struct task_struct *p, raw_spinlock_t *lock)
-+{
-+ if (NULL != lock)
-+ raw_spin_unlock(lock);
-+}
-+
-+static inline struct rq
-+*task_access_lock_irqsave(struct task_struct *p, raw_spinlock_t **plock,
-+ unsigned long *flags)
-+{
-+ struct rq *rq;
-+ for (;;) {
-+ rq = task_rq(p);
-+ if (p->on_cpu || task_on_rq_queued(p)) {
-+ raw_spin_lock_irqsave(&rq->lock, *flags);
-+ if (likely((p->on_cpu || task_on_rq_queued(p))
-+ && rq == task_rq(p))) {
-+ *plock = &rq->lock;
-+ return rq;
-+ }
-+ raw_spin_unlock_irqrestore(&rq->lock, *flags);
-+ } else if (task_on_rq_migrating(p)) {
-+ do {
-+ cpu_relax();
-+ } while (unlikely(task_on_rq_migrating(p)));
-+ } else {
-+ raw_spin_lock_irqsave(&p->pi_lock, *flags);
-+ if (likely(!p->on_cpu && !p->on_rq &&
-+ rq == task_rq(p))) {
-+ *plock = &p->pi_lock;
-+ return rq;
-+ }
-+ raw_spin_unlock_irqrestore(&p->pi_lock, *flags);
-+ }
-+ }
-+}
-+
-+static inline void
-+task_access_unlock_irqrestore(struct task_struct *p, raw_spinlock_t *lock,
-+ unsigned long *flags)
-+{
-+ raw_spin_unlock_irqrestore(lock, *flags);
-+}
-+
-+/*
-+ * __task_rq_lock - lock the rq @p resides on.
-+ */
-+struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ lockdep_assert_held(&p->pi_lock);
-+
-+ for (;;) {
-+ rq = task_rq(p);
-+ raw_spin_lock(&rq->lock);
-+ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p)))
-+ return rq;
-+ raw_spin_unlock(&rq->lock);
-+
-+ while (unlikely(task_on_rq_migrating(p)))
-+ cpu_relax();
-+ }
-+}
-+
-+/*
-+ * task_rq_lock - lock p->pi_lock and lock the rq @p resides on.
-+ */
-+struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(p->pi_lock)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ for (;;) {
-+ raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
-+ rq = task_rq(p);
-+ raw_spin_lock(&rq->lock);
-+ /*
-+ * move_queued_task() task_rq_lock()
-+ *
-+ * ACQUIRE (rq->lock)
-+ * [S] ->on_rq = MIGRATING [L] rq = task_rq()
-+ * WMB (__set_task_cpu()) ACQUIRE (rq->lock);
-+ * [S] ->cpu = new_cpu [L] task_rq()
-+ * [L] ->on_rq
-+ * RELEASE (rq->lock)
-+ *
-+ * If we observe the old CPU in task_rq_lock(), the acquire of
-+ * the old rq->lock will fully serialize against the stores.
-+ *
-+ * If we observe the new CPU in task_rq_lock(), the address
-+ * dependency headed by '[L] rq = task_rq()' and the acquire
-+ * will pair with the WMB to ensure we then also see migrating.
-+ */
-+ if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
-+ return rq;
-+ }
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
-+
-+ while (unlikely(task_on_rq_migrating(p)))
-+ cpu_relax();
-+ }
-+}
-+
-+static inline void
-+rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ raw_spin_lock_irqsave(&rq->lock, rf->flags);
-+}
-+
-+static inline void
-+rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock_irqrestore(&rq->lock, rf->flags);
-+}
-+
-+void raw_spin_rq_lock_nested(struct rq *rq, int subclass)
-+{
-+ raw_spinlock_t *lock;
-+
-+ /* Matches synchronize_rcu() in __sched_core_enable() */
-+ preempt_disable();
-+
-+ for (;;) {
-+ lock = __rq_lockp(rq);
-+ raw_spin_lock_nested(lock, subclass);
-+ if (likely(lock == __rq_lockp(rq))) {
-+ /* preempt_count *MUST* be > 1 */
-+ preempt_enable_no_resched();
-+ return;
-+ }
-+ raw_spin_unlock(lock);
-+ }
-+}
-+
-+void raw_spin_rq_unlock(struct rq *rq)
-+{
-+ raw_spin_unlock(rq_lockp(rq));
-+}
-+
-+/*
-+ * RQ-clock updating methods:
-+ */
-+
-+static void update_rq_clock_task(struct rq *rq, s64 delta)
-+{
-+/*
-+ * In theory, the compile should just see 0 here, and optimize out the call
-+ * to sched_rt_avg_update. But I don't trust it...
-+ */
-+ s64 __maybe_unused steal = 0, irq_delta = 0;
-+
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+ irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
-+
-+ /*
-+ * Since irq_time is only updated on {soft,}irq_exit, we might run into
-+ * this case when a previous update_rq_clock() happened inside a
-+ * {soft,}irq region.
-+ *
-+ * When this happens, we stop ->clock_task and only update the
-+ * prev_irq_time stamp to account for the part that fit, so that a next
-+ * update will consume the rest. This ensures ->clock_task is
-+ * monotonic.
-+ *
-+ * It does however cause some slight miss-attribution of {soft,}irq
-+ * time, a more accurate solution would be to update the irq_time using
-+ * the current rq->clock timestamp, except that would require using
-+ * atomic ops.
-+ */
-+ if (irq_delta > delta)
-+ irq_delta = delta;
-+
-+ rq->prev_irq_time += irq_delta;
-+ delta -= irq_delta;
-+ delayacct_irq(rq->curr, irq_delta);
-+#endif
-+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
-+	if (static_key_false((&paravirt_steal_rq_enabled))) {
-+ steal = paravirt_steal_clock(cpu_of(rq));
-+ steal -= rq->prev_steal_time_rq;
-+
-+ if (unlikely(steal > delta))
-+ steal = delta;
-+
-+ rq->prev_steal_time_rq += steal;
-+ delta -= steal;
-+ }
-+#endif
-+
-+ rq->clock_task += delta;
-+
-+#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
-+ if ((irq_delta + steal))
-+ update_irq_load_avg(rq, irq_delta + steal);
-+#endif
-+}
-+
-+static inline void update_rq_clock(struct rq *rq)
-+{
-+ s64 delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
-+
-+ if (unlikely(delta <= 0))
-+ return;
-+ rq->clock += delta;
-+ update_rq_time_edge(rq);
-+ update_rq_clock_task(rq, delta);
-+}
-+
-+/*
-+ * RQ Load update routine
-+ */
-+#define RQ_LOAD_HISTORY_BITS (sizeof(s32) * 8ULL)
-+#define RQ_UTIL_SHIFT (8)
-+#define RQ_LOAD_HISTORY_TO_UTIL(l) (((l) >> (RQ_LOAD_HISTORY_BITS - 1 - RQ_UTIL_SHIFT)) & 0xff)
-+
-+#define LOAD_BLOCK(t) ((t) >> 17)
-+#define LOAD_HALF_BLOCK(t) ((t) >> 16)
-+#define BLOCK_MASK(t) ((t) & ((0x01 << 18) - 1))
-+#define LOAD_BLOCK_BIT(b) (1UL << (RQ_LOAD_HISTORY_BITS - 1 - (b)))
-+#define CURRENT_LOAD_BIT LOAD_BLOCK_BIT(0)
-+
-+static inline void rq_load_update(struct rq *rq)
-+{
-+ u64 time = rq->clock;
-+ u64 delta = min(LOAD_BLOCK(time) - LOAD_BLOCK(rq->load_stamp),
-+ RQ_LOAD_HISTORY_BITS - 1);
-+ u64 prev = !!(rq->load_history & CURRENT_LOAD_BIT);
-+ u64 curr = !!rq->nr_running;
-+
-+ if (delta) {
-+ rq->load_history = rq->load_history >> delta;
-+
-+ if (delta < RQ_UTIL_SHIFT) {
-+ rq->load_block += (~BLOCK_MASK(rq->load_stamp)) * prev;
-+ if (!!LOAD_HALF_BLOCK(rq->load_block) ^ curr)
-+ rq->load_history ^= LOAD_BLOCK_BIT(delta);
-+ }
-+
-+ rq->load_block = BLOCK_MASK(time) * prev;
-+ } else {
-+ rq->load_block += (time - rq->load_stamp) * prev;
-+ }
-+ if (prev ^ curr)
-+ rq->load_history ^= CURRENT_LOAD_BIT;
-+ rq->load_stamp = time;
-+}
-+
-+unsigned long rq_load_util(struct rq *rq, unsigned long max)
-+{
-+ return RQ_LOAD_HISTORY_TO_UTIL(rq->load_history) * (max >> RQ_UTIL_SHIFT);
-+}
-+
-+#ifdef CONFIG_SMP
-+unsigned long sched_cpu_util(int cpu)
-+{
-+ return rq_load_util(cpu_rq(cpu), arch_scale_cpu_capacity(cpu));
-+}
-+#endif /* CONFIG_SMP */
-+
-+#ifdef CONFIG_CPU_FREQ
-+/**
-+ * cpufreq_update_util - Take a note about CPU utilization changes.
-+ * @rq: Runqueue to carry out the update for.
-+ * @flags: Update reason flags.
-+ *
-+ * This function is called by the scheduler on the CPU whose utilization is
-+ * being updated.
-+ *
-+ * It can only be called from RCU-sched read-side critical sections.
-+ *
-+ * The way cpufreq is currently arranged requires it to evaluate the CPU
-+ * performance state (frequency/voltage) on a regular basis to prevent it from
-+ * being stuck in a completely inadequate performance level for too long.
-+ * That is not guaranteed to happen if the updates are only triggered from CFS
-+ * and DL, though, because they may not be coming in if only RT tasks are
-+ * active all the time (or there are RT tasks only).
-+ *
-+ * As a workaround for that issue, this function is called periodically by the
-+ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
-+ * but that really is a band-aid. Going forward it should be replaced with
-+ * solutions targeted more specifically at RT tasks.
-+ */
-+static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
-+{
-+ struct update_util_data *data;
-+
-+#ifdef CONFIG_SMP
-+ rq_load_update(rq);
-+#endif
-+ data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
-+ cpu_of(rq)));
-+ if (data)
-+ data->func(data, rq_clock(rq), flags);
-+}
-+#else
-+static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
-+{
-+#ifdef CONFIG_SMP
-+ rq_load_update(rq);
-+#endif
-+}
-+#endif /* CONFIG_CPU_FREQ */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+/*
-+ * Tick may be needed by tasks in the runqueue depending on their policy and
-+ * requirements. If tick is needed, lets send the target an IPI to kick it out
-+ * of nohz mode if necessary.
-+ */
-+static inline void sched_update_tick_dependency(struct rq *rq)
-+{
-+ int cpu = cpu_of(rq);
-+
-+ if (!tick_nohz_full_cpu(cpu))
-+ return;
-+
-+ if (rq->nr_running < 2)
-+ tick_nohz_dep_clear_cpu(cpu, TICK_DEP_BIT_SCHED);
-+ else
-+ tick_nohz_dep_set_cpu(cpu, TICK_DEP_BIT_SCHED);
-+}
-+#else /* !CONFIG_NO_HZ_FULL */
-+static inline void sched_update_tick_dependency(struct rq *rq) { }
-+#endif
-+
-+bool sched_task_on_rq(struct task_struct *p)
-+{
-+ return task_on_rq_queued(p);
-+}
-+
-+unsigned long get_wchan(struct task_struct *p)
-+{
-+ unsigned long ip = 0;
-+ unsigned int state;
-+
-+ if (!p || p == current)
-+ return 0;
-+
-+ /* Only get wchan if task is blocked and we can keep it that way. */
-+ raw_spin_lock_irq(&p->pi_lock);
-+ state = READ_ONCE(p->__state);
-+ smp_rmb(); /* see try_to_wake_up() */
-+ if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
-+ ip = __get_wchan(p);
-+ raw_spin_unlock_irq(&p->pi_lock);
-+
-+ return ip;
-+}
-+
-+/*
-+ * Add/Remove/Requeue task to/from the runqueue routines
-+ * Context: rq->lock
-+ */
-+#define __SCHED_DEQUEUE_TASK(p, rq, flags, func) \
-+ sched_info_dequeue(rq, p); \
-+ \
-+ list_del(&p->sq_node); \
-+ if (list_empty(&rq->queue.heads[p->sq_idx])) { \
-+ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap); \
-+ func; \
-+ }
-+
-+#define __SCHED_ENQUEUE_TASK(p, rq, flags) \
-+ sched_info_enqueue(rq, p); \
-+ \
-+ p->sq_idx = task_sched_prio_idx(p, rq); \
-+ list_add_tail(&p->sq_node, &rq->queue.heads[p->sq_idx]); \
-+ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+
-+static inline void dequeue_task(struct task_struct *p, struct rq *rq, int flags)
-+{
-+#ifdef ALT_SCHED_DEBUG
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*printk(KERN_INFO "sched: dequeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: dequeue task reside on cpu%d from cpu%d\n",
-+ task_cpu(p), cpu_of(rq));
-+#endif
-+
-+ __SCHED_DEQUEUE_TASK(p, rq, flags, update_sched_preempt_mask(rq));
-+ --rq->nr_running;
-+#ifdef CONFIG_SMP
-+ if (1 == rq->nr_running)
-+ cpumask_clear_cpu(cpu_of(rq), &sched_rq_pending_mask);
-+#endif
-+
-+ sched_update_tick_dependency(rq);
-+}
-+
-+static inline void enqueue_task(struct task_struct *p, struct rq *rq, int flags)
-+{
-+#ifdef ALT_SCHED_DEBUG
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*printk(KERN_INFO "sched: enqueue(%d) %px %d\n", cpu_of(rq), p, p->prio);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: enqueue task reside on cpu%d to cpu%d\n",
-+ task_cpu(p), cpu_of(rq));
-+#endif
-+
-+ __SCHED_ENQUEUE_TASK(p, rq, flags);
-+ update_sched_preempt_mask(rq);
-+ ++rq->nr_running;
-+#ifdef CONFIG_SMP
-+ if (2 == rq->nr_running)
-+ cpumask_set_cpu(cpu_of(rq), &sched_rq_pending_mask);
-+#endif
-+
-+ sched_update_tick_dependency(rq);
-+}
-+
-+static inline void requeue_task(struct task_struct *p, struct rq *rq, int idx)
-+{
-+#ifdef ALT_SCHED_DEBUG
-+ lockdep_assert_held(&rq->lock);
-+ /*printk(KERN_INFO "sched: requeue(%d) %px %016llx\n", cpu_of(rq), p, p->deadline);*/
-+ WARN_ONCE(task_rq(p) != rq, "sched: cpu[%d] requeue task reside on cpu%d\n",
-+ cpu_of(rq), task_cpu(p));
-+#endif
-+
-+ list_del(&p->sq_node);
-+ list_add_tail(&p->sq_node, &rq->queue.heads[idx]);
-+ if (idx != p->sq_idx) {
-+ if (list_empty(&rq->queue.heads[p->sq_idx]))
-+ clear_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+ p->sq_idx = idx;
-+ set_bit(sched_idx2prio(p->sq_idx, rq), rq->queue.bitmap);
-+ update_sched_preempt_mask(rq);
-+ }
-+}
-+
-+/*
-+ * cmpxchg based fetch_or, macro so it works for different integer types
-+ */
-+#define fetch_or(ptr, mask) \
-+ ({ \
-+ typeof(ptr) _ptr = (ptr); \
-+ typeof(mask) _mask = (mask); \
-+ typeof(*_ptr) _val = *_ptr; \
-+ \
-+ do { \
-+ } while (!try_cmpxchg(_ptr, &_val, _val | _mask)); \
-+ _val; \
-+})
-+
-+#if defined(CONFIG_SMP) && defined(TIF_POLLING_NRFLAG)
-+/*
-+ * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG,
-+ * this avoids any races wrt polling state changes and thereby avoids
-+ * spurious IPIs.
-+ */
-+static inline bool set_nr_and_not_polling(struct task_struct *p)
-+{
-+ struct thread_info *ti = task_thread_info(p);
-+ return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
-+}
-+
-+/*
-+ * Atomically set TIF_NEED_RESCHED if TIF_POLLING_NRFLAG is set.
-+ *
-+ * If this returns true, then the idle task promises to call
-+ * sched_ttwu_pending() and reschedule soon.
-+ */
-+static bool set_nr_if_polling(struct task_struct *p)
-+{
-+ struct thread_info *ti = task_thread_info(p);
-+ typeof(ti->flags) val = READ_ONCE(ti->flags);
-+
-+ for (;;) {
-+ if (!(val & _TIF_POLLING_NRFLAG))
-+ return false;
-+ if (val & _TIF_NEED_RESCHED)
-+ return true;
-+ if (try_cmpxchg(&ti->flags, &val, val | _TIF_NEED_RESCHED))
-+ break;
-+ }
-+ return true;
-+}
-+
-+#else
-+static inline bool set_nr_and_not_polling(struct task_struct *p)
-+{
-+ set_tsk_need_resched(p);
-+ return true;
-+}
-+
-+#ifdef CONFIG_SMP
-+static inline bool set_nr_if_polling(struct task_struct *p)
-+{
-+ return false;
-+}
-+#endif
-+#endif
-+
-+static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task)
-+{
-+ struct wake_q_node *node = &task->wake_q;
-+
-+ /*
-+ * Atomically grab the task, if ->wake_q is !nil already it means
-+ * it's already queued (either by us or someone else) and will get the
-+ * wakeup due to that.
-+ *
-+ * In order to ensure that a pending wakeup will observe our pending
-+ * state, even in the failed case, an explicit smp_mb() must be used.
-+ */
-+ smp_mb__before_atomic();
-+ if (unlikely(cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)))
-+ return false;
-+
-+ /*
-+ * The head is context local, there can be no concurrency.
-+ */
-+ *head->lastp = node;
-+ head->lastp = &node->next;
-+ return true;
-+}
-+
-+/**
-+ * wake_q_add() - queue a wakeup for 'later' waking.
-+ * @head: the wake_q_head to add @task to
-+ * @task: the task to queue for 'later' wakeup
-+ *
-+ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
-+ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
-+ * instantly.
-+ *
-+ * This function must be used as-if it were wake_up_process(); IOW the task
-+ * must be ready to be woken at this location.
-+ */
-+void wake_q_add(struct wake_q_head *head, struct task_struct *task)
-+{
-+ if (__wake_q_add(head, task))
-+ get_task_struct(task);
-+}
-+
-+/**
-+ * wake_q_add_safe() - safely queue a wakeup for 'later' waking.
-+ * @head: the wake_q_head to add @task to
-+ * @task: the task to queue for 'later' wakeup
-+ *
-+ * Queue a task for later wakeup, most likely by the wake_up_q() call in the
-+ * same context, _HOWEVER_ this is not guaranteed, the wakeup can come
-+ * instantly.
-+ *
-+ * This function must be used as-if it were wake_up_process(); IOW the task
-+ * must be ready to be woken at this location.
-+ *
-+ * This function is essentially a task-safe equivalent to wake_q_add(). Callers
-+ * that already hold reference to @task can call the 'safe' version and trust
-+ * wake_q to do the right thing depending whether or not the @task is already
-+ * queued for wakeup.
-+ */
-+void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
-+{
-+ if (!__wake_q_add(head, task))
-+ put_task_struct(task);
-+}
-+
-+void wake_up_q(struct wake_q_head *head)
-+{
-+ struct wake_q_node *node = head->first;
-+
-+ while (node != WAKE_Q_TAIL) {
-+ struct task_struct *task;
-+
-+ task = container_of(node, struct task_struct, wake_q);
-+ /* task can safely be re-inserted now: */
-+ node = node->next;
-+ task->wake_q.next = NULL;
-+
-+ /*
-+ * wake_up_process() executes a full barrier, which pairs with
-+ * the queueing in wake_q_add() so as not to miss wakeups.
-+ */
-+ wake_up_process(task);
-+ put_task_struct(task);
-+ }
-+}
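
The wake_q_add()/wake_up_q() pair above implements a deferred-wakeup pattern: wakeups are
queued while a lock is held and only issued after the lock is dropped. A minimal usage
sketch (the lock and the woken task are hypothetical; DEFINE_WAKE_Q() comes from
<linux/sched/wake_q.h>):

    DEFINE_WAKE_Q(wq);

    raw_spin_lock(&my_lock);        /* 'my_lock' is a placeholder lock */
    wake_q_add(&wq, waiter);        /* takes a reference on 'waiter' */
    raw_spin_unlock(&my_lock);

    wake_up_q(&wq);                 /* performs the wakeups and drops the references */
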
-+
-+/*
-+ * resched_curr - mark rq's current task 'to be rescheduled now'.
-+ *
-+ * On UP this means the setting of the need_resched flag, on SMP it
-+ * might also involve a cross-CPU call to trigger the scheduler on
-+ * the target CPU.
-+ */
-+void resched_curr(struct rq *rq)
-+{
-+ struct task_struct *curr = rq->curr;
-+ int cpu;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ if (test_tsk_need_resched(curr))
-+ return;
-+
-+ cpu = cpu_of(rq);
-+ if (cpu == smp_processor_id()) {
-+ set_tsk_need_resched(curr);
-+ set_preempt_need_resched();
-+ return;
-+ }
-+
-+ if (set_nr_and_not_polling(curr))
-+ smp_send_reschedule(cpu);
-+ else
-+ trace_sched_wake_idle_without_ipi(cpu);
-+}
-+
-+void resched_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (cpu_online(cpu) || cpu == smp_processor_id())
-+ resched_curr(cpu_rq(cpu));
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+}
-+
-+#ifdef CONFIG_SMP
-+#ifdef CONFIG_NO_HZ_COMMON
-+void nohz_balance_enter_idle(int cpu) {}
-+
-+void select_nohz_load_balancer(int stop_tick) {}
-+
-+void set_cpu_sd_state_idle(void) {}
-+
-+/*
-+ * In the semi idle case, use the nearest busy CPU for migrating timers
-+ * from an idle CPU. This is good for power-savings.
-+ *
-+ * We don't do similar optimization for completely idle system, as
-+ * selecting an idle CPU will add more delays to the timers than intended
-+ * (as that CPU's timer base may not be uptodate wrt jiffies etc).
-+ */
-+int get_nohz_timer_target(void)
-+{
-+ int i, cpu = smp_processor_id(), default_cpu = -1;
-+ struct cpumask *mask;
-+ const struct cpumask *hk_mask;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TIMER)) {
-+ if (!idle_cpu(cpu))
-+ return cpu;
-+ default_cpu = cpu;
-+ }
-+
-+ hk_mask = housekeeping_cpumask(HK_TYPE_TIMER);
-+
-+ for (mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+ mask < per_cpu(sched_cpu_topo_end_mask, cpu); mask++)
-+ for_each_cpu_and(i, mask, hk_mask)
-+ if (!idle_cpu(i))
-+ return i;
-+
-+ if (default_cpu == -1)
-+ default_cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
-+ cpu = default_cpu;
-+
-+ return cpu;
-+}
-+
-+/*
-+ * When add_timer_on() enqueues a timer into the timer wheel of an
-+ * idle CPU then this timer might expire before the next timer event
-+ * which is scheduled to wake up that CPU. In case of a completely
-+ * idle system the next event might even be infinite time into the
-+ * future. wake_up_idle_cpu() ensures that the CPU is woken up and
-+ * leaves the inner idle loop so the newly added timer is taken into
-+ * account when the CPU goes back to idle and evaluates the timer
-+ * wheel for the next timer event.
-+ */
-+static inline void wake_up_idle_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (cpu == smp_processor_id())
-+ return;
-+
-+ if (set_nr_and_not_polling(rq->idle))
-+ smp_send_reschedule(cpu);
-+ else
-+ trace_sched_wake_idle_without_ipi(cpu);
-+}
-+
-+static inline bool wake_up_full_nohz_cpu(int cpu)
-+{
-+ /*
-+ * We just need the target to call irq_exit() and re-evaluate
-+ * the next tick. The nohz full kick at least implies that.
-+ * If needed we can still optimize that later with an
-+ * empty IRQ.
-+ */
-+ if (cpu_is_offline(cpu))
-+ return true; /* Don't try to wake offline CPUs. */
-+ if (tick_nohz_full_cpu(cpu)) {
-+ if (cpu != smp_processor_id() ||
-+ tick_nohz_tick_stopped())
-+ tick_nohz_full_kick_cpu(cpu);
-+ return true;
-+ }
-+
-+ return false;
-+}
-+
-+void wake_up_nohz_cpu(int cpu)
-+{
-+ if (!wake_up_full_nohz_cpu(cpu))
-+ wake_up_idle_cpu(cpu);
-+}
-+
-+static void nohz_csd_func(void *info)
-+{
-+ struct rq *rq = info;
-+ int cpu = cpu_of(rq);
-+ unsigned int flags;
-+
-+ /*
-+ * Release the rq::nohz_csd.
-+ */
-+ flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(cpu));
-+ WARN_ON(!(flags & NOHZ_KICK_MASK));
-+
-+ rq->idle_balance = idle_cpu(cpu);
-+ if (rq->idle_balance && !need_resched()) {
-+ rq->nohz_idle_balance = flags;
-+ raise_softirq_irqoff(SCHED_SOFTIRQ);
-+ }
-+}
-+
-+#endif /* CONFIG_NO_HZ_COMMON */
-+#endif /* CONFIG_SMP */
-+
-+static inline void check_preempt_curr(struct rq *rq)
-+{
-+ if (sched_rq_first_task(rq) != rq->curr)
-+ resched_curr(rq);
-+}
-+
-+#ifdef CONFIG_SCHED_HRTICK
-+/*
-+ * Use HR-timers to deliver accurate preemption points.
-+ */
-+
-+static void hrtick_clear(struct rq *rq)
-+{
-+ if (hrtimer_active(&rq->hrtick_timer))
-+ hrtimer_cancel(&rq->hrtick_timer);
-+}
-+
-+/*
-+ * High-resolution timer tick.
-+ * Runs from hardirq context with interrupts disabled.
-+ */
-+static enum hrtimer_restart hrtick(struct hrtimer *timer)
-+{
-+ struct rq *rq = container_of(timer, struct rq, hrtick_timer);
-+
-+ WARN_ON_ONCE(cpu_of(rq) != smp_processor_id());
-+
-+ raw_spin_lock(&rq->lock);
-+ resched_curr(rq);
-+ raw_spin_unlock(&rq->lock);
-+
-+ return HRTIMER_NORESTART;
-+}
-+
-+/*
-+ * Use hrtick when:
-+ * - enabled by features
-+ * - hrtimer is actually high res
-+ */
-+static inline int hrtick_enabled(struct rq *rq)
-+{
-+ /**
-+ * Alt schedule FW doesn't support sched_feat yet
-+ if (!sched_feat(HRTICK))
-+ return 0;
-+ */
-+ if (!cpu_active(cpu_of(rq)))
-+ return 0;
-+ return hrtimer_is_hres_active(&rq->hrtick_timer);
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+static void __hrtick_restart(struct rq *rq)
-+{
-+ struct hrtimer *timer = &rq->hrtick_timer;
-+ ktime_t time = rq->hrtick_time;
-+
-+ hrtimer_start(timer, time, HRTIMER_MODE_ABS_PINNED_HARD);
-+}
-+
-+/*
-+ * called from hardirq (IPI) context
-+ */
-+static void __hrtick_start(void *arg)
-+{
-+ struct rq *rq = arg;
-+
-+ raw_spin_lock(&rq->lock);
-+ __hrtick_restart(rq);
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+/*
-+ * Called to set the hrtick timer state.
-+ *
-+ * called with rq->lock held and irqs disabled
-+ */
-+void hrtick_start(struct rq *rq, u64 delay)
-+{
-+ struct hrtimer *timer = &rq->hrtick_timer;
-+ s64 delta;
-+
-+ /*
-+ * Don't schedule slices shorter than 10000ns, that just
-+ * doesn't make sense and can cause timer DoS.
-+ */
-+ delta = max_t(s64, delay, 10000LL);
-+
-+ rq->hrtick_time = ktime_add_ns(timer->base->get_time(), delta);
-+
-+ if (rq == this_rq())
-+ __hrtick_restart(rq);
-+ else
-+ smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
-+}
-+
-+#else
-+/*
-+ * Called to set the hrtick timer state.
-+ *
-+ * called with rq->lock held and irqs disabled
-+ */
-+void hrtick_start(struct rq *rq, u64 delay)
-+{
-+ /*
-+ * Don't schedule slices shorter than 10000ns, that just
-+ * doesn't make sense. Rely on vruntime for fairness.
-+ */
-+ delay = max_t(u64, delay, 10000LL);
-+ hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay),
-+ HRTIMER_MODE_REL_PINNED_HARD);
-+}
-+#endif /* CONFIG_SMP */
-+
-+static void hrtick_rq_init(struct rq *rq)
-+{
-+#ifdef CONFIG_SMP
-+ INIT_CSD(&rq->hrtick_csd, __hrtick_start, rq);
-+#endif
-+
-+ hrtimer_init(&rq->hrtick_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
-+ rq->hrtick_timer.function = hrtick;
-+}
-+#else /* CONFIG_SCHED_HRTICK */
-+static inline int hrtick_enabled(struct rq *rq)
-+{
-+ return 0;
-+}
-+
-+static inline void hrtick_clear(struct rq *rq)
-+{
-+}
-+
-+static inline void hrtick_rq_init(struct rq *rq)
-+{
-+}
-+#endif /* CONFIG_SCHED_HRTICK */
-+
-+static inline int __normal_prio(int policy, int rt_prio, int static_prio)
-+{
-+ return rt_policy(policy) ? (MAX_RT_PRIO - 1 - rt_prio) :
-+ static_prio + MAX_PRIORITY_ADJ;
-+}
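
For reference, with the usual MAX_RT_PRIO of 100 the formula above maps a SCHED_FIFO task
with rt_priority 50 to prio 100 - 1 - 50 = 49, while a nice-0 SCHED_NORMAL task gets its
static_prio of 120 plus the BMQ/PDS boost offset MAX_PRIORITY_ADJ.
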
-+
-+/*
-+ * Calculate the expected normal priority: i.e. priority
-+ * without taking RT-inheritance into account. Might be
-+ * boosted by interactivity modifiers. Changes upon fork,
-+ * setprio syscalls, and whenever the interactivity
-+ * estimator recalculates.
-+ */
-+static inline int normal_prio(struct task_struct *p)
-+{
-+ return __normal_prio(p->policy, p->rt_priority, p->static_prio);
-+}
-+
-+/*
-+ * Calculate the current priority, i.e. the priority
-+ * taken into account by the scheduler. This value might
-+ * be boosted by RT tasks as it will be RT if the task got
-+ * RT-boosted. If not then it returns p->normal_prio.
-+ */
-+static int effective_prio(struct task_struct *p)
-+{
-+ p->normal_prio = normal_prio(p);
-+ /*
-+ * If we are RT tasks or we were boosted to RT priority,
-+ * keep the priority unchanged. Otherwise, update priority
-+ * to the normal priority:
-+ */
-+ if (!rt_prio(p->prio))
-+ return p->normal_prio;
-+ return p->prio;
-+}
-+
-+/*
-+ * activate_task - move a task to the runqueue.
-+ *
-+ * Context: rq->lock
-+ */
-+static void activate_task(struct task_struct *p, struct rq *rq)
-+{
-+ enqueue_task(p, rq, ENQUEUE_WAKEUP);
-+ p->on_rq = TASK_ON_RQ_QUEUED;
-+
-+ /*
-+ * If in_iowait is set, the code below may not trigger any cpufreq
-+ * utilization updates, so do it here explicitly with the IOWAIT flag
-+ * passed.
-+ */
-+ cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT * p->in_iowait);
-+}
-+
-+/*
-+ * deactivate_task - remove a task from the runqueue.
-+ *
-+ * Context: rq->lock
-+ */
-+static inline void deactivate_task(struct task_struct *p, struct rq *rq)
-+{
-+ dequeue_task(p, rq, DEQUEUE_SLEEP);
-+ p->on_rq = 0;
-+ cpufreq_update_util(rq, 0);
-+}
-+
-+static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
-+{
-+#ifdef CONFIG_SMP
-+ /*
-+ * After ->cpu is set up to a new value, task_access_lock(p, ...) can be
-+ * successfully executed on another CPU. We must ensure that updates of
-+ * per-task data have been completed by this moment.
-+ */
-+ smp_wmb();
-+
-+ WRITE_ONCE(task_thread_info(p)->cpu, cpu);
-+#endif
-+}
-+
-+static inline bool is_migration_disabled(struct task_struct *p)
-+{
-+#ifdef CONFIG_SMP
-+ return p->migration_disabled;
-+#else
-+ return false;
-+#endif
-+}
-+
-+#define SCA_CHECK 0x01
-+#define SCA_USER 0x08
-+
-+#ifdef CONFIG_SMP
-+
-+void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
-+{
-+#ifdef CONFIG_SCHED_DEBUG
-+ unsigned int state = READ_ONCE(p->__state);
-+
-+ /*
-+ * We should never call set_task_cpu() on a blocked task,
-+ * ttwu() will sort out the placement.
-+ */
-+ WARN_ON_ONCE(state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq);
-+
-+#ifdef CONFIG_LOCKDEP
-+ /*
-+ * The caller should hold either p->pi_lock or rq->lock, when changing
-+ * a task's CPU. ->pi_lock for waking tasks, rq->lock for runnable tasks.
-+ *
-+ * sched_move_task() holds both and thus holding either pins the cgroup,
-+ * see task_group().
-+ */
-+ WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
-+ lockdep_is_held(&task_rq(p)->lock)));
-+#endif
-+ /*
-+ * Clearly, migrating tasks to offline CPUs is a fairly daft thing.
-+ */
-+ WARN_ON_ONCE(!cpu_online(new_cpu));
-+
-+ WARN_ON_ONCE(is_migration_disabled(p));
-+#endif
-+ trace_sched_migrate_task(p, new_cpu);
-+
-+ if (task_cpu(p) != new_cpu)
-+ {
-+ rseq_migrate(p);
-+ perf_event_task_migrate(p);
-+ }
-+
-+ __set_task_cpu(p, new_cpu);
-+}
-+
-+#define MDF_FORCE_ENABLED 0x80
-+
-+static void
-+__do_set_cpus_ptr(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ /*
-+ * This here violates the locking rules for affinity, since we're only
-+ * supposed to change these variables while holding both rq->lock and
-+ * p->pi_lock.
-+ *
-+ * HOWEVER, it magically works, because ttwu() is the only code that
-+ * accesses these variables under p->pi_lock and only does so after
-+ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule()
-+ * before finish_task().
-+ *
-+ * XXX do further audits, this smells like something putrid.
-+ */
-+ SCHED_WARN_ON(!p->on_cpu);
-+ p->cpus_ptr = new_mask;
-+}
-+
-+void migrate_disable(void)
-+{
-+ struct task_struct *p = current;
-+ int cpu;
-+
-+ if (p->migration_disabled) {
-+ p->migration_disabled++;
-+ return;
-+ }
-+
-+ preempt_disable();
-+ cpu = smp_processor_id();
-+ if (cpumask_test_cpu(cpu, &p->cpus_mask)) {
-+ cpu_rq(cpu)->nr_pinned++;
-+ p->migration_disabled = 1;
-+ p->migration_flags &= ~MDF_FORCE_ENABLED;
-+
-+ /*
-+ * Violates locking rules! see comment in __do_set_cpus_ptr().
-+ */
-+ if (p->cpus_ptr == &p->cpus_mask)
-+ __do_set_cpus_ptr(p, cpumask_of(cpu));
-+ }
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(migrate_disable);
-+
-+void migrate_enable(void)
-+{
-+ struct task_struct *p = current;
-+
-+ if (0 == p->migration_disabled)
-+ return;
-+
-+ if (p->migration_disabled > 1) {
-+ p->migration_disabled--;
-+ return;
-+ }
-+
-+ if (WARN_ON_ONCE(!p->migration_disabled))
-+ return;
-+
-+ /*
-+ * Ensure stop_task runs either before or after this, and that
-+ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule().
-+ */
-+ preempt_disable();
-+ /*
-+ * Assumption: current should be running on allowed cpu
-+ */
-+ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &p->cpus_mask));
-+ if (p->cpus_ptr != &p->cpus_mask)
-+ __do_set_cpus_ptr(p, &p->cpus_mask);
-+ /*
-+ * Mustn't clear migration_disabled() until cpus_ptr points back at the
-+ * regular cpus_mask, otherwise things that race (eg.
-+ * select_fallback_rq) get confused.
-+ */
-+ barrier();
-+ p->migration_disabled = 0;
-+ this_rq()->nr_pinned--;
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(migrate_enable);
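
A minimal sketch of how a migrate_disable()/migrate_enable() section is meant to be used
(the per-CPU variable and counter are hypothetical): the task stays pinned to its current
CPU, but unlike preempt_disable() it may still be preempted or sleep:

    migrate_disable();
    {
            /* hypothetical per-CPU data, stable because migration is disabled */
            struct my_stat *st = this_cpu_ptr(&my_stat_pcpu);

            st->hits++;
    }
    migrate_enable();
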
-+
-+static inline bool rq_has_pinned_tasks(struct rq *rq)
-+{
-+ return rq->nr_pinned;
-+}
-+
-+/*
-+ * Per-CPU kthreads are allowed to run on !active && online CPUs, see
-+ * __set_cpus_allowed_ptr() and select_fallback_rq().
-+ */
-+static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
-+{
-+ /* When not in the task's cpumask, no point in looking further. */
-+ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
-+ return false;
-+
-+ /* migrate_disabled() must be allowed to finish. */
-+ if (is_migration_disabled(p))
-+ return cpu_online(cpu);
-+
-+ /* Non kernel threads are not allowed during either online or offline. */
-+ if (!(p->flags & PF_KTHREAD))
-+ return cpu_active(cpu) && task_cpu_possible(cpu, p);
-+
-+ /* KTHREAD_IS_PER_CPU is always allowed. */
-+ if (kthread_is_per_cpu(p))
-+ return cpu_online(cpu);
-+
-+ /* Regular kernel threads don't get to stay during offline. */
-+ if (cpu_dying(cpu))
-+ return false;
-+
-+ /* But are allowed during online. */
-+ return cpu_online(cpu);
-+}
-+
-+/*
-+ * This is how migration works:
-+ *
-+ * 1) we invoke migration_cpu_stop() on the target CPU using
-+ * stop_one_cpu().
-+ * 2) stopper starts to run (implicitly forcing the migrated thread
-+ * off the CPU)
-+ * 3) it checks whether the migrated task is still in the wrong runqueue.
-+ * 4) if it's in the wrong runqueue then the migration thread removes
-+ * it and puts it into the right queue.
-+ * 5) stopper completes and stop_one_cpu() returns and the migration
-+ * is done.
-+ */
-+
-+/*
-+ * move_queued_task - move a queued task to new rq.
-+ *
-+ * Returns (locked) new rq. Old rq's lock is released.
-+ */
-+static struct rq *move_queued_task(struct rq *rq, struct task_struct *p, int
-+ new_cpu)
-+{
-+ int src_cpu;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ src_cpu = cpu_of(rq);
-+ WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING);
-+ dequeue_task(p, rq, 0);
-+ set_task_cpu(p, new_cpu);
-+ raw_spin_unlock(&rq->lock);
-+
-+ rq = cpu_rq(new_cpu);
-+
-+ raw_spin_lock(&rq->lock);
-+ WARN_ON_ONCE(task_cpu(p) != new_cpu);
-+
-+ sched_mm_cid_migrate_to(rq, p, src_cpu);
-+
-+ sched_task_sanity_check(p, rq);
-+ enqueue_task(p, rq, 0);
-+ p->on_rq = TASK_ON_RQ_QUEUED;
-+ check_preempt_curr(rq);
-+
-+ return rq;
-+}
-+
-+struct migration_arg {
-+ struct task_struct *task;
-+ int dest_cpu;
-+};
-+
-+/*
-+ * Move (not current) task off this CPU, onto the destination CPU. We're doing
-+ * this because either it can't run here any more (set_cpus_allowed()
-+ * away from this CPU, or CPU going down), or because we're
-+ * attempting to rebalance this task on exec (sched_exec).
-+ *
-+ * So we race with normal scheduler movements, but that's OK, as long
-+ * as the task is no longer on this CPU.
-+ */
-+static struct rq *__migrate_task(struct rq *rq, struct task_struct *p, int
-+ dest_cpu)
-+{
-+ /* Affinity changed (again). */
-+ if (!is_cpu_allowed(p, dest_cpu))
-+ return rq;
-+
-+ update_rq_clock(rq);
-+ return move_queued_task(rq, p, dest_cpu);
-+}
-+
-+/*
-+ * migration_cpu_stop - this will be executed by a highprio stopper thread
-+ * and performs thread migration by bumping thread off CPU then
-+ * 'pushing' onto another runqueue.
-+ */
-+static int migration_cpu_stop(void *data)
-+{
-+ struct migration_arg *arg = data;
-+ struct task_struct *p = arg->task;
-+ struct rq *rq = this_rq();
-+ unsigned long flags;
-+
-+ /*
-+ * The original target CPU might have gone down and we might
-+ * be on another CPU but it doesn't matter.
-+ */
-+ local_irq_save(flags);
-+ /*
-+ * We need to explicitly wake pending tasks before running
-+ * __migrate_task() such that we will not miss enforcing cpus_ptr
-+ * during wakeups, see set_cpus_allowed_ptr()'s TASK_WAKING test.
-+ */
-+ flush_smp_call_function_queue();
-+
-+ raw_spin_lock(&p->pi_lock);
-+ raw_spin_lock(&rq->lock);
-+ /*
-+ * If task_rq(p) != rq, it cannot be migrated here, because we're
-+ * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because
-+ * we're holding p->pi_lock.
-+ */
-+ if (task_rq(p) == rq && task_on_rq_queued(p))
-+ rq = __migrate_task(rq, p, arg->dest_cpu);
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+
-+ return 0;
-+}
-+
-+static inline void
-+set_cpus_allowed_common(struct task_struct *p, struct affinity_context *ctx)
-+{
-+ cpumask_copy(&p->cpus_mask, ctx->new_mask);
-+ p->nr_cpus_allowed = cpumask_weight(ctx->new_mask);
-+
-+ /*
-+ * Swap in a new user_cpus_ptr if SCA_USER flag set
-+ */
-+ if (ctx->flags & SCA_USER)
-+ swap(p->user_cpus_ptr, ctx->user_mask);
-+}
-+
-+static void
-+__do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
-+{
-+ lockdep_assert_held(&p->pi_lock);
-+ set_cpus_allowed_common(p, ctx);
-+}
-+
-+/*
-+ * Used for kthread_bind() and select_fallback_rq(), in both cases the user
-+ * affinity (if any) should be destroyed too.
-+ */
-+void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ struct affinity_context ac = {
-+ .new_mask = new_mask,
-+ .user_mask = NULL,
-+ .flags = SCA_USER, /* clear the user requested mask */
-+ };
-+ union cpumask_rcuhead {
-+ cpumask_t cpumask;
-+ struct rcu_head rcu;
-+ };
-+
-+ __do_set_cpus_allowed(p, &ac);
-+
-+ /*
-+ * Because this is called with p->pi_lock held, it is not possible
-+ * to use kfree() here (when PREEMPT_RT=y), therefore punt to using
-+ * kfree_rcu().
-+ */
-+ kfree_rcu((union cpumask_rcuhead *)ac.user_mask, rcu);
-+}
-+
-+static cpumask_t *alloc_user_cpus_ptr(int node)
-+{
-+ /*
-+ * See do_set_cpus_allowed() above for the rcu_head usage.
-+ */
-+ int size = max_t(int, cpumask_size(), sizeof(struct rcu_head));
-+
-+ return kmalloc_node(size, GFP_KERNEL, node);
-+}
-+
-+int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
-+ int node)
-+{
-+ cpumask_t *user_mask;
-+ unsigned long flags;
-+
-+ /*
-+ * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
-+ * may differ by now due to racing.
-+ */
-+ dst->user_cpus_ptr = NULL;
-+
-+ /*
-+ * This check is racy and losing the race is a valid situation.
-+ * It is not worth the extra overhead of taking the pi_lock on
-+ * every fork/clone.
-+ */
-+ if (data_race(!src->user_cpus_ptr))
-+ return 0;
-+
-+ user_mask = alloc_user_cpus_ptr(node);
-+ if (!user_mask)
-+ return -ENOMEM;
-+
-+ /*
-+ * Use pi_lock to protect content of user_cpus_ptr
-+ *
-+ * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
-+ * do_set_cpus_allowed().
-+ */
-+ raw_spin_lock_irqsave(&src->pi_lock, flags);
-+ if (src->user_cpus_ptr) {
-+ swap(dst->user_cpus_ptr, user_mask);
-+ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
-+ }
-+ raw_spin_unlock_irqrestore(&src->pi_lock, flags);
-+
-+ if (unlikely(user_mask))
-+ kfree(user_mask);
-+
-+ return 0;
-+}
-+
-+static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
-+{
-+ struct cpumask *user_mask = NULL;
-+
-+ swap(p->user_cpus_ptr, user_mask);
-+
-+ return user_mask;
-+}
-+
-+void release_user_cpus_ptr(struct task_struct *p)
-+{
-+ kfree(clear_user_cpus_ptr(p));
-+}
-+
-+#endif
-+
-+/**
-+ * task_curr - is this task currently executing on a CPU?
-+ * @p: the task in question.
-+ *
-+ * Return: 1 if the task is currently executing. 0 otherwise.
-+ */
-+inline int task_curr(const struct task_struct *p)
-+{
-+ return cpu_curr(task_cpu(p)) == p;
-+}
-+
-+#ifdef CONFIG_SMP
-+/*
-+ * wait_task_inactive - wait for a thread to unschedule.
-+ *
-+ * Wait for the thread to block in any of the states set in @match_state.
-+ * If it changes, i.e. @p might have woken up, then return zero. When we
-+ * succeed in waiting for @p to be off its CPU, we return a positive number
-+ * (its total switch count). If a second call a short while later returns the
-+ * same number, the caller can be sure that @p has remained unscheduled the
-+ * whole time.
-+ *
-+ * The caller must ensure that the task *will* unschedule sometime soon,
-+ * else this function might spin for a *long* time. This function can't
-+ * be called with interrupts off, or it may introduce deadlock with
-+ * smp_call_function() if an IPI is sent by the same process we are
-+ * waiting to become inactive.
-+ */
-+unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
-+{
-+ unsigned long flags;
-+ bool running, on_rq;
-+ unsigned long ncsw;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ for (;;) {
-+ rq = task_rq(p);
-+
-+ /*
-+ * If the task is actively running on another CPU
-+ * still, just relax and busy-wait without holding
-+ * any locks.
-+ *
-+ * NOTE! Since we don't hold any locks, it's not
-+ * even sure that "rq" stays as the right runqueue!
-+ * But we don't care, since this will return false
-+ * if the runqueue has changed and p is actually now
-+ * running somewhere else!
-+ */
-+ while (task_on_cpu(p) && p == rq->curr) {
-+ if (!(READ_ONCE(p->__state) & match_state))
-+ return 0;
-+ cpu_relax();
-+ }
-+
-+ /*
-+ * Ok, time to look more closely! We need the rq
-+ * lock now, to be *sure*. If we're wrong, we'll
-+ * just go back and repeat.
-+ */
-+ task_access_lock_irqsave(p, &lock, &flags);
-+ trace_sched_wait_task(p);
-+ running = task_on_cpu(p);
-+ on_rq = p->on_rq;
-+ ncsw = 0;
-+ if (READ_ONCE(p->__state) & match_state)
-+ ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+ /*
-+ * If it changed from the expected state, bail out now.
-+ */
-+ if (unlikely(!ncsw))
-+ break;
-+
-+ /*
-+ * Was it really running after all now that we
-+ * checked with the proper locks actually held?
-+ *
-+ * Oops. Go back and try again..
-+ */
-+ if (unlikely(running)) {
-+ cpu_relax();
-+ continue;
-+ }
-+
-+ /*
-+ * It's not enough that it's not actively running,
-+ * it must be off the runqueue _entirely_, and not
-+ * preempted!
-+ *
-+ * So if it was still runnable (but just not actively
-+ * running right now), it's preempted, and we should
-+ * yield - it could be a while.
-+ */
-+ if (unlikely(on_rq)) {
-+ ktime_t to = NSEC_PER_SEC / HZ;
-+
-+ set_current_state(TASK_UNINTERRUPTIBLE);
-+ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
-+ continue;
-+ }
-+
-+ /*
-+ * Ahh, all good. It wasn't running, and it wasn't
-+ * runnable, which means that it will never become
-+ * running in the future either. We're all done!
-+ */
-+ break;
-+ }
-+
-+ return ncsw;
-+}
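
A hypothetical caller pattern for the helper above, matching its kerneldoc: call it twice
and compare the returned switch counts to confirm @p never ran in between:

    unsigned long ncsw1, ncsw2;

    ncsw1 = wait_task_inactive(p, TASK_UNINTERRUPTIBLE);
    /* ... inspect or modify *p while it is known to be off-CPU ... */
    ncsw2 = wait_task_inactive(p, TASK_UNINTERRUPTIBLE);
    if (ncsw1 && ncsw1 == ncsw2)
            pr_debug("p stayed unscheduled the whole time\n");
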
-+
-+/***
-+ * kick_process - kick a running thread to enter/exit the kernel
-+ * @p: the to-be-kicked thread
-+ *
-+ * Cause a process which is running on another CPU to enter
-+ * kernel-mode, without any delay. (to get signals handled.)
-+ *
-+ * NOTE: this function doesn't have to take the runqueue lock,
-+ * because all it wants to ensure is that the remote task enters
-+ * the kernel. If the IPI races and the task has been migrated
-+ * to another CPU then no harm is done and the purpose has been
-+ * achieved as well.
-+ */
-+void kick_process(struct task_struct *p)
-+{
-+ int cpu;
-+
-+ preempt_disable();
-+ cpu = task_cpu(p);
-+ if ((cpu != smp_processor_id()) && task_curr(p))
-+ smp_send_reschedule(cpu);
-+ preempt_enable();
-+}
-+EXPORT_SYMBOL_GPL(kick_process);
-+
-+/*
-+ * ->cpus_ptr is protected by both rq->lock and p->pi_lock
-+ *
-+ * A few notes on cpu_active vs cpu_online:
-+ *
-+ * - cpu_active must be a subset of cpu_online
-+ *
-+ * - on CPU-up we allow per-CPU kthreads on the online && !active CPU,
-+ * see __set_cpus_allowed_ptr(). At this point the newly online
-+ * CPU isn't yet part of the sched domains, and balancing will not
-+ * see it.
-+ *
-+ * - on cpu-down we clear cpu_active() to mask the sched domains and
-+ * avoid the load balancer to place new tasks on the to be removed
-+ * CPU. Existing tasks will remain running there and will be taken
-+ * off.
-+ *
-+ * This means that fallback selection must not select !active CPUs.
-+ * And can assume that any active CPU must be online. Conversely
-+ * select_task_rq() below may allow selection of !active CPUs in order
-+ * to satisfy the above rules.
-+ */
-+static int select_fallback_rq(int cpu, struct task_struct *p)
-+{
-+ int nid = cpu_to_node(cpu);
-+ const struct cpumask *nodemask = NULL;
-+ enum { cpuset, possible, fail } state = cpuset;
-+ int dest_cpu;
-+
-+ /*
-+ * If the node that the CPU is on has been offlined, cpu_to_node()
-+ * will return -1. There is no CPU on the node, and we should
-+ * select a CPU on another node.
-+ */
-+ if (nid != -1) {
-+ nodemask = cpumask_of_node(nid);
-+
-+ /* Look for allowed, online CPU in same node. */
-+ for_each_cpu(dest_cpu, nodemask) {
-+ if (is_cpu_allowed(p, dest_cpu))
-+ return dest_cpu;
-+ }
-+ }
-+
-+ for (;;) {
-+ /* Any allowed, online CPU? */
-+ for_each_cpu(dest_cpu, p->cpus_ptr) {
-+ if (!is_cpu_allowed(p, dest_cpu))
-+ continue;
-+ goto out;
-+ }
-+
-+ /* No more Mr. Nice Guy. */
-+ switch (state) {
-+ case cpuset:
-+ if (cpuset_cpus_allowed_fallback(p)) {
-+ state = possible;
-+ break;
-+ }
-+ fallthrough;
-+ case possible:
-+ /*
-+ * XXX When called from select_task_rq() we only
-+ * hold p->pi_lock and again violate locking order.
-+ *
-+ * More yuck to audit.
-+ */
-+ do_set_cpus_allowed(p, task_cpu_possible_mask(p));
-+ state = fail;
-+ break;
-+
-+ case fail:
-+ BUG();
-+ break;
-+ }
-+ }
-+
-+out:
-+ if (state != cpuset) {
-+ /*
-+ * Don't tell them about moving exiting tasks or
-+ * kernel threads (both mm NULL), since they never
-+ * leave kernel.
-+ */
-+ if (p->mm && printk_ratelimit()) {
-+ printk_deferred("process %d (%s) no longer affine to cpu%d\n",
-+ task_pid_nr(p), p->comm, cpu);
-+ }
-+ }
-+
-+ return dest_cpu;
-+}
-+
-+static inline void
-+sched_preempt_mask_flush(cpumask_t *mask, int prio)
-+{
-+ int cpu;
-+
-+ cpumask_copy(mask, sched_idle_mask);
-+
-+ for_each_clear_bit(cpu, cpumask_bits(mask), nr_cpumask_bits) {
-+ if (prio < cpu_rq(cpu)->prio)
-+ cpumask_set_cpu(cpu, mask);
-+ }
-+}
-+
-+static inline int
-+preempt_mask_check(struct task_struct *p, cpumask_t *allow_mask, cpumask_t *preempt_mask)
-+{
-+ int task_prio = task_sched_prio(p);
-+ cpumask_t *mask = sched_preempt_mask + SCHED_QUEUE_BITS - 1 - task_prio;
-+ int pr = atomic_read(&sched_prio_record);
-+
-+ if (pr != task_prio) {
-+ sched_preempt_mask_flush(mask, task_prio);
-+ atomic_set(&sched_prio_record, task_prio);
-+ }
-+
-+ return cpumask_and(preempt_mask, allow_mask, mask);
-+}
-+
-+static inline int select_task_rq(struct task_struct *p)
-+{
-+ cpumask_t allow_mask, mask;
-+
-+ if (unlikely(!cpumask_and(&allow_mask, p->cpus_ptr, cpu_active_mask)))
-+ return select_fallback_rq(task_cpu(p), p);
-+
-+ if (
-+#ifdef CONFIG_SCHED_SMT
-+ cpumask_and(&mask, &allow_mask, &sched_sg_idle_mask) ||
-+#endif
-+ cpumask_and(&mask, &allow_mask, sched_idle_mask) ||
-+ preempt_mask_check(p, &allow_mask, &mask))
-+ return best_mask_cpu(task_cpu(p), &mask);
-+
-+ return best_mask_cpu(task_cpu(p), &allow_mask);
-+}
-+
-+void sched_set_stop_task(int cpu, struct task_struct *stop)
-+{
-+ static struct lock_class_key stop_pi_lock;
-+ struct sched_param stop_param = { .sched_priority = STOP_PRIO };
-+ struct sched_param start_param = { .sched_priority = 0 };
-+ struct task_struct *old_stop = cpu_rq(cpu)->stop;
-+
-+ if (stop) {
-+ /*
-+ * Make it appear like a SCHED_FIFO task, it's something
-+ * userspace knows about and won't get confused about.
-+ *
-+ * Also, it will make PI more or less work without too
-+ * much confusion -- but then, stop work should not
-+ * rely on PI working anyway.
-+ */
-+ sched_setscheduler_nocheck(stop, SCHED_FIFO, &stop_param);
-+
-+ /*
-+ * The PI code calls rt_mutex_setprio() with ->pi_lock held to
-+ * adjust the effective priority of a task. As a result,
-+ * rt_mutex_setprio() can trigger (RT) balancing operations,
-+ * which can then trigger wakeups of the stop thread to push
-+ * around the current task.
-+ *
-+ * The stop task itself will never be part of the PI-chain, it
-+ * never blocks, therefore that ->pi_lock recursion is safe.
-+ * Tell lockdep about this by placing the stop->pi_lock in its
-+ * own class.
-+ */
-+ lockdep_set_class(&stop->pi_lock, &stop_pi_lock);
-+ }
-+
-+ cpu_rq(cpu)->stop = stop;
-+
-+ if (old_stop) {
-+ /*
-+ * Reset it back to a normal scheduling policy so that
-+ * it can die in pieces.
-+ */
-+ sched_setscheduler_nocheck(old_stop, SCHED_NORMAL, &start_param);
-+ }
-+}
-+
-+static int affine_move_task(struct rq *rq, struct task_struct *p, int dest_cpu,
-+ raw_spinlock_t *lock, unsigned long irq_flags)
-+ __releases(rq->lock)
-+ __releases(p->pi_lock)
-+{
-+ /* Can the task run on the task's current CPU? If so, we're done */
-+ if (!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) {
-+ if (p->migration_disabled) {
-+ if (likely(p->cpus_ptr != &p->cpus_mask))
-+ __do_set_cpus_ptr(p, &p->cpus_mask);
-+ p->migration_disabled = 0;
-+ p->migration_flags |= MDF_FORCE_ENABLED;
-+ /* When p is migrate_disabled, rq->lock should be held */
-+ rq->nr_pinned--;
-+ }
-+
-+ if (task_on_cpu(p) || READ_ONCE(p->__state) == TASK_WAKING) {
-+ struct migration_arg arg = { p, dest_cpu };
-+
-+ /* Need help from migration thread: drop lock and wait. */
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg);
-+ return 0;
-+ }
-+ if (task_on_rq_queued(p)) {
-+ /*
-+ * OK, since we're going to drop the lock immediately
-+ * afterwards anyway.
-+ */
-+ update_rq_clock(rq);
-+ rq = move_queued_task(rq, p, dest_cpu);
-+ lock = &rq->lock;
-+ }
-+ }
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ return 0;
-+}
-+
-+static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
-+ struct affinity_context *ctx,
-+ struct rq *rq,
-+ raw_spinlock_t *lock,
-+ unsigned long irq_flags)
-+{
-+ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
-+ const struct cpumask *cpu_valid_mask = cpu_active_mask;
-+ bool kthread = p->flags & PF_KTHREAD;
-+ int dest_cpu;
-+ int ret = 0;
-+
-+ if (kthread || is_migration_disabled(p)) {
-+ /*
-+ * Kernel threads are allowed on online && !active CPUs,
-+ * however, during cpu-hot-unplug, even these might get pushed
-+ * away if not KTHREAD_IS_PER_CPU.
-+ *
-+ * Specifically, migration_disabled() tasks must not fail the
-+ * cpumask_any_and_distribute() pick below, esp. so on
-+ * SCA_MIGRATE_ENABLE, otherwise we'll not call
-+ * set_cpus_allowed_common() and actually reset p->cpus_ptr.
-+ */
-+ cpu_valid_mask = cpu_online_mask;
-+ }
-+
-+ if (!kthread && !cpumask_subset(ctx->new_mask, cpu_allowed_mask)) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ /*
-+ * Must re-check here, to close a race against __kthread_bind(),
-+ * sched_setaffinity() is not guaranteed to observe the flag.
-+ */
-+ if ((ctx->flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ if (cpumask_equal(&p->cpus_mask, ctx->new_mask))
-+ goto out;
-+
-+ dest_cpu = cpumask_any_and(cpu_valid_mask, ctx->new_mask);
-+ if (dest_cpu >= nr_cpu_ids) {
-+ ret = -EINVAL;
-+ goto out;
-+ }
-+
-+ __do_set_cpus_allowed(p, ctx);
-+
-+ return affine_move_task(rq, p, dest_cpu, lock, irq_flags);
-+
-+out:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+
-+ return ret;
-+}
-+
-+/*
-+ * Change a given task's CPU affinity. Migrate the thread to a
-+ * proper CPU and schedule it away if the CPU it's executing on
-+ * is removed from the allowed bitmask.
-+ *
-+ * NOTE: the caller must have a valid reference to the task, the
-+ * task must not exit() & deallocate itself prematurely. The
-+ * call is not atomic; no spinlocks may be held.
-+ */
-+static int __set_cpus_allowed_ptr(struct task_struct *p,
-+ struct affinity_context *ctx)
-+{
-+ unsigned long irq_flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
-+ rq = __task_access_lock(p, &lock);
-+ /*
-+ * Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
-+ * flags are set.
-+ */
-+ if (p->user_cpus_ptr &&
-+ !(ctx->flags & SCA_USER) &&
-+ cpumask_and(rq->scratch_mask, ctx->new_mask, p->user_cpus_ptr))
-+ ctx->new_mask = rq->scratch_mask;
-+
-+ return __set_cpus_allowed_ptr_locked(p, ctx, rq, lock, irq_flags);
-+}
-+
-+int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
-+{
-+ struct affinity_context ac = {
-+ .new_mask = new_mask,
-+ .flags = 0,
-+ };
-+
-+ return __set_cpus_allowed_ptr(p, &ac);
-+}
-+EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
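
A minimal sketch of the exported interface (the kthread and its worker function are
hypothetical; kthread_bind() is the usual alternative for strictly per-CPU threads):

    struct task_struct *tsk = kthread_create(worker_fn, NULL, "my_worker/%d", cpu);

    if (!IS_ERR(tsk)) {
            set_cpus_allowed_ptr(tsk, cpumask_of(cpu));
            wake_up_process(tsk);
    }
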
-+
-+/*
-+ * Change a given task's CPU affinity to the intersection of its current
-+ * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
-+ * If user_cpus_ptr is defined, use it as the basis for restricting CPU
-+ * affinity or use cpu_online_mask instead.
-+ *
-+ * If the resulting mask is empty, leave the affinity unchanged and return
-+ * -EINVAL.
-+ */
-+static int restrict_cpus_allowed_ptr(struct task_struct *p,
-+ struct cpumask *new_mask,
-+ const struct cpumask *subset_mask)
-+{
-+ struct affinity_context ac = {
-+ .new_mask = new_mask,
-+ .flags = 0,
-+ };
-+ unsigned long irq_flags;
-+ raw_spinlock_t *lock;
-+ struct rq *rq;
-+ int err;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, irq_flags);
-+ rq = __task_access_lock(p, &lock);
-+
-+ if (!cpumask_and(new_mask, task_user_cpus(p), subset_mask)) {
-+ err = -EINVAL;
-+ goto err_unlock;
-+ }
-+
-+ return __set_cpus_allowed_ptr_locked(p, &ac, rq, lock, irq_flags);
-+
-+err_unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, irq_flags);
-+ return err;
-+}
-+
-+/*
-+ * Restrict the CPU affinity of task @p so that it is a subset of
-+ * task_cpu_possible_mask() and point @p->user_cpus_ptr to a copy of the
-+ * old affinity mask. If the resulting mask is empty, we warn and walk
-+ * up the cpuset hierarchy until we find a suitable mask.
-+ */
-+void force_compatible_cpus_allowed_ptr(struct task_struct *p)
-+{
-+ cpumask_var_t new_mask;
-+ const struct cpumask *override_mask = task_cpu_possible_mask(p);
-+
-+ alloc_cpumask_var(&new_mask, GFP_KERNEL);
-+
-+ /*
-+ * __migrate_task() can fail silently in the face of concurrent
-+ * offlining of the chosen destination CPU, so take the hotplug
-+ * lock to ensure that the migration succeeds.
-+ */
-+ cpus_read_lock();
-+ if (!cpumask_available(new_mask))
-+ goto out_set_mask;
-+
-+ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
-+ goto out_free_mask;
-+
-+ /*
-+ * We failed to find a valid subset of the affinity mask for the
-+ * task, so override it based on its cpuset hierarchy.
-+ */
-+ cpuset_cpus_allowed(p, new_mask);
-+ override_mask = new_mask;
-+
-+out_set_mask:
-+ if (printk_ratelimit()) {
-+ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
-+ task_pid_nr(p), p->comm,
-+ cpumask_pr_args(override_mask));
-+ }
-+
-+ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
-+out_free_mask:
-+ cpus_read_unlock();
-+ free_cpumask_var(new_mask);
-+}
-+
-+static int
-+__sched_setaffinity(struct task_struct *p, struct affinity_context *ctx);
-+
-+/*
-+ * Restore the affinity of a task @p which was previously restricted by a
-+ * call to force_compatible_cpus_allowed_ptr().
-+ *
-+ * It is the caller's responsibility to serialise this with any calls to
-+ * force_compatible_cpus_allowed_ptr(@p).
-+ */
-+void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
-+{
-+ struct affinity_context ac = {
-+ .new_mask = task_user_cpus(p),
-+ .flags = 0,
-+ };
-+ int ret;
-+
-+ /*
-+ * Try to restore the old affinity mask with __sched_setaffinity().
-+ * Cpuset masking will be done there too.
-+ */
-+ ret = __sched_setaffinity(p, &ac);
-+ WARN_ON_ONCE(ret);
-+}
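
A hypothetical balanced use of the pair above, following their comments: restrict the task
while it must stay on a task_cpu_possible_mask()-compatible CPU, then restore the
user-requested affinity (do_restricted_work() is a placeholder):

    force_compatible_cpus_allowed_ptr(current);
    do_restricted_work();
    relax_compatible_cpus_allowed_ptr(current);
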
-+
-+#else /* CONFIG_SMP */
-+
-+static inline int select_task_rq(struct task_struct *p)
-+{
-+ return 0;
-+}
-+
-+static inline int
-+__set_cpus_allowed_ptr(struct task_struct *p,
-+ struct affinity_context *ctx)
-+{
-+ return set_cpus_allowed_ptr(p, ctx->new_mask);
-+}
-+
-+static inline bool rq_has_pinned_tasks(struct rq *rq)
-+{
-+ return false;
-+}
-+
-+static inline cpumask_t *alloc_user_cpus_ptr(int node)
-+{
-+ return NULL;
-+}
-+
-+#endif /* !CONFIG_SMP */
-+
-+static void
-+ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq;
-+
-+ if (!schedstat_enabled())
-+ return;
-+
-+ rq = this_rq();
-+
-+#ifdef CONFIG_SMP
-+ if (cpu == rq->cpu) {
-+ __schedstat_inc(rq->ttwu_local);
-+ __schedstat_inc(p->stats.nr_wakeups_local);
-+ } else {
-+ /** Alt schedule FW ToDo:
-+ * How to do ttwu_wake_remote
-+ */
-+ }
-+#endif /* CONFIG_SMP */
-+
-+ __schedstat_inc(rq->ttwu_count);
-+ __schedstat_inc(p->stats.nr_wakeups);
-+}
-+
-+/*
-+ * Mark the task runnable.
-+ */
-+static inline void ttwu_do_wakeup(struct task_struct *p)
-+{
-+ WRITE_ONCE(p->__state, TASK_RUNNING);
-+ trace_sched_wakeup(p);
-+}
-+
-+static inline void
-+ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags)
-+{
-+ if (p->sched_contributes_to_load)
-+ rq->nr_uninterruptible--;
-+
-+ if (
-+#ifdef CONFIG_SMP
-+ !(wake_flags & WF_MIGRATED) &&
-+#endif
-+ p->in_iowait) {
-+ delayacct_blkio_end(p);
-+ atomic_dec(&task_rq(p)->nr_iowait);
-+ }
-+
-+ activate_task(p, rq);
-+ check_preempt_curr(rq);
-+
-+ ttwu_do_wakeup(p);
-+}
-+
-+/*
-+ * Consider @p being inside a wait loop:
-+ *
-+ * for (;;) {
-+ * set_current_state(TASK_UNINTERRUPTIBLE);
-+ *
-+ * if (CONDITION)
-+ * break;
-+ *
-+ * schedule();
-+ * }
-+ * __set_current_state(TASK_RUNNING);
-+ *
-+ * between set_current_state() and schedule(). In this case @p is still
-+ * runnable, so all that needs doing is change p->state back to TASK_RUNNING in
-+ * an atomic manner.
-+ *
-+ * By taking task_rq(p)->lock we serialize against schedule(), if @p->on_rq
-+ * then schedule() must still happen and p->state can be changed to
-+ * TASK_RUNNING. Otherwise we lost the race, schedule() has happened, and we
-+ * need to do a full wakeup with enqueue.
-+ *
-+ * Returns: %true when the wakeup is done,
-+ * %false otherwise.
-+ */
-+static int ttwu_runnable(struct task_struct *p, int wake_flags)
-+{
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+ int ret = 0;
-+
-+ rq = __task_access_lock(p, &lock);
-+ if (task_on_rq_queued(p)) {
-+ if (!task_on_cpu(p)) {
-+ /*
-+ * When on_rq && !on_cpu the task is preempted, see if
-+ * it should preempt the task that is current now.
-+ */
-+ update_rq_clock(rq);
-+ check_preempt_curr(rq);
-+ }
-+ ttwu_do_wakeup(p);
-+ ret = 1;
-+ }
-+ __task_access_unlock(p, lock);
-+
-+ return ret;
-+}
-+
-+#ifdef CONFIG_SMP
-+void sched_ttwu_pending(void *arg)
-+{
-+ struct llist_node *llist = arg;
-+ struct rq *rq = this_rq();
-+ struct task_struct *p, *t;
-+ struct rq_flags rf;
-+
-+ if (!llist)
-+ return;
-+
-+ rq_lock_irqsave(rq, &rf);
-+ update_rq_clock(rq);
-+
-+ llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
-+ if (WARN_ON_ONCE(p->on_cpu))
-+ smp_cond_load_acquire(&p->on_cpu, !VAL);
-+
-+ if (WARN_ON_ONCE(task_cpu(p) != cpu_of(rq)))
-+ set_task_cpu(p, cpu_of(rq));
-+
-+ ttwu_do_activate(rq, p, p->sched_remote_wakeup ? WF_MIGRATED : 0);
-+ }
-+
-+ /*
-+ * Must be after enqueueing at least one task such that
-+ * idle_cpu() does not observe a false-negative -- if it does,
-+ * it is possible for select_idle_siblings() to stack a number
-+ * of tasks on this CPU during that window.
-+ *
-+ * It is ok to clear ttwu_pending when another task is pending.
-+ * We will receive IPI after local irq enabled and then enqueue it.
-+ * Since now nr_running > 0, idle_cpu() will always get correct result.
-+ */
-+ WRITE_ONCE(rq->ttwu_pending, 0);
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+/*
-+ * Prepare the scene for sending an IPI for a remote smp_call
-+ *
-+ * Returns true if the caller can proceed with sending the IPI.
-+ * Returns false otherwise.
-+ */
-+bool call_function_single_prep_ipi(int cpu)
-+{
-+ if (set_nr_if_polling(cpu_rq(cpu)->idle)) {
-+ trace_sched_wake_idle_without_ipi(cpu);
-+ return false;
-+ }
-+
-+ return true;
-+}
-+
-+/*
-+ * Queue a task on the target CPUs wake_list and wake the CPU via IPI if
-+ * necessary. The wakee CPU on receipt of the IPI will queue the task
-+ * via sched_ttwu_wakeup() for activation so the wakee incurs the cost
-+ * of the wakeup instead of the waker.
-+ */
-+static void __ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
-+
-+ WRITE_ONCE(rq->ttwu_pending, 1);
-+ __smp_call_single_queue(cpu, &p->wake_entry.llist);
-+}
-+
-+static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
-+{
-+ /*
-+ * Do not complicate things with the async wake_list while the CPU is
-+ * in hotplug state.
-+ */
-+ if (!cpu_active(cpu))
-+ return false;
-+
-+ /* Ensure the task will still be allowed to run on the CPU. */
-+ if (!cpumask_test_cpu(cpu, p->cpus_ptr))
-+ return false;
-+
-+ /*
-+ * If the CPU does not share cache, then queue the task on the
-+ * remote rqs wakelist to avoid accessing remote data.
-+ */
-+ if (!cpus_share_cache(smp_processor_id(), cpu))
-+ return true;
-+
-+ if (cpu == smp_processor_id())
-+ return false;
-+
-+ /*
-+ * If the wakee cpu is idle, or the task is descheduling and the
-+ * only running task on the CPU, then use the wakelist to offload
-+ * the task activation to the idle (or soon-to-be-idle) CPU as
-+ * the current CPU is likely busy. nr_running is checked to
-+ * avoid unnecessary task stacking.
-+ *
-+ * Note that we can only get here with (wakee) p->on_rq=0,
-+ * p->on_cpu can be whatever, we've done the dequeue, so
-+ * the wakee has been accounted out of ->nr_running.
-+ */
-+ if (!cpu_rq(cpu)->nr_running)
-+ return true;
-+
-+ return false;
-+}
-+
-+static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ if (__is_defined(ALT_SCHED_TTWU_QUEUE) && ttwu_queue_cond(p, cpu)) {
-+ sched_clock_cpu(cpu); /* Sync clocks across CPUs */
-+ __ttwu_queue_wakelist(p, cpu, wake_flags);
-+ return true;
-+ }
-+
-+ return false;
-+}
-+
-+void wake_up_if_idle(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ rcu_read_lock();
-+
-+ if (!is_idle_task(rcu_dereference(rq->curr)))
-+ goto out;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (is_idle_task(rq->curr))
-+ resched_curr(rq);
-+ /* Else CPU is not idle, do nothing here */
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+out:
-+ rcu_read_unlock();
-+}
-+
-+bool cpus_share_cache(int this_cpu, int that_cpu)
-+{
-+ if (this_cpu == that_cpu)
-+ return true;
-+
-+ return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
-+}
-+#else /* !CONFIG_SMP */
-+
-+static inline bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ return false;
-+}
-+
-+#endif /* CONFIG_SMP */
-+
-+static inline void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (ttwu_queue_wakelist(p, cpu, wake_flags))
-+ return;
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+ ttwu_do_activate(rq, p, wake_flags);
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+/*
-+ * Invoked from try_to_wake_up() to check whether the task can be woken up.
-+ *
-+ * The caller holds p::pi_lock if p != current or has preemption
-+ * disabled when p == current.
-+ *
-+ * The rules of PREEMPT_RT saved_state:
-+ *
-+ * The related locking code always holds p::pi_lock when updating
-+ * p::saved_state, which means the code is fully serialized in both cases.
-+ *
-+ * The lock wait and lock wakeups happen via TASK_RTLOCK_WAIT. No other
-+ * bits set. This allows us to distinguish all wakeup scenarios.
-+ */
-+static __always_inline
-+bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
-+{
-+ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
-+ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
-+ state != TASK_RTLOCK_WAIT);
-+ }
-+
-+ if (READ_ONCE(p->__state) & state) {
-+ *success = 1;
-+ return true;
-+ }
-+
-+#ifdef CONFIG_PREEMPT_RT
-+ /*
-+ * Saved state preserves the task state across blocking on
-+ * an RT lock. If the state matches, set p::saved_state to
-+ * TASK_RUNNING, but do not wake the task because it waits
-+ * for a lock wakeup. Also indicate success because from
-+ * the regular waker's point of view this has succeeded.
-+ *
-+ * After acquiring the lock the task will restore p::__state
-+ * from p::saved_state which ensures that the regular
-+ * wakeup is not lost. The restore will also set
-+ * p::saved_state to TASK_RUNNING so any further tests will
-+ * not result in false positives vs. @success
-+ */
-+ if (p->saved_state & state) {
-+ p->saved_state = TASK_RUNNING;
-+ *success = 1;
-+ }
-+#endif
-+ return false;
-+}
-+
-+/*
-+ * Notes on Program-Order guarantees on SMP systems.
-+ *
-+ * MIGRATION
-+ *
-+ * The basic program-order guarantee on SMP systems is that when a task [t]
-+ * migrates, all its activity on its old CPU [c0] happens-before any subsequent
-+ * execution on its new CPU [c1].
-+ *
-+ * For migration (of runnable tasks) this is provided by the following means:
-+ *
-+ * A) UNLOCK of the rq(c0)->lock scheduling out task t
-+ * B) migration for t is required to synchronize *both* rq(c0)->lock and
-+ * rq(c1)->lock (if not at the same time, then in that order).
-+ * C) LOCK of the rq(c1)->lock scheduling in task
-+ *
-+ * Transitivity guarantees that B happens after A and C after B.
-+ * Note: we only require RCpc transitivity.
-+ * Note: the CPU doing B need not be c0 or c1
-+ *
-+ * Example:
-+ *
-+ * CPU0 CPU1 CPU2
-+ *
-+ * LOCK rq(0)->lock
-+ * sched-out X
-+ * sched-in Y
-+ * UNLOCK rq(0)->lock
-+ *
-+ * LOCK rq(0)->lock // orders against CPU0
-+ * dequeue X
-+ * UNLOCK rq(0)->lock
-+ *
-+ * LOCK rq(1)->lock
-+ * enqueue X
-+ * UNLOCK rq(1)->lock
-+ *
-+ * LOCK rq(1)->lock // orders against CPU2
-+ * sched-out Z
-+ * sched-in X
-+ * UNLOCK rq(1)->lock
-+ *
-+ *
-+ * BLOCKING -- aka. SLEEP + WAKEUP
-+ *
-+ * For blocking we (obviously) need to provide the same guarantee as for
-+ * migration. However the means are completely different as there is no lock
-+ * chain to provide order. Instead we do:
-+ *
-+ * 1) smp_store_release(X->on_cpu, 0) -- finish_task()
-+ * 2) smp_cond_load_acquire(!X->on_cpu) -- try_to_wake_up()
-+ *
-+ * Example:
-+ *
-+ * CPU0 (schedule) CPU1 (try_to_wake_up) CPU2 (schedule)
-+ *
-+ * LOCK rq(0)->lock LOCK X->pi_lock
-+ * dequeue X
-+ * sched-out X
-+ * smp_store_release(X->on_cpu, 0);
-+ *
-+ * smp_cond_load_acquire(&X->on_cpu, !VAL);
-+ * X->state = WAKING
-+ * set_task_cpu(X,2)
-+ *
-+ * LOCK rq(2)->lock
-+ * enqueue X
-+ * X->state = RUNNING
-+ * UNLOCK rq(2)->lock
-+ *
-+ * LOCK rq(2)->lock // orders against CPU1
-+ * sched-out Z
-+ * sched-in X
-+ * UNLOCK rq(2)->lock
-+ *
-+ * UNLOCK X->pi_lock
-+ * UNLOCK rq(0)->lock
-+ *
-+ *
-+ * However; for wakeups there is a second guarantee we must provide, namely we
-+ * must observe the state that led to our wakeup. That is, not only must our
-+ * task observe its own prior state, it must also observe the stores prior to
-+ * its wakeup.
-+ *
-+ * This means that any means of doing remote wakeups must order the CPU doing
-+ * the wakeup against the CPU the task is going to end up running on. This,
-+ * however, is already required for the regular Program-Order guarantee above,
-+ * since the waking CPU is the one issuing the ACQUIRE (smp_cond_load_acquire).
-+ *
-+ */
-+
-+/**
-+ * try_to_wake_up - wake up a thread
-+ * @p: the thread to be awakened
-+ * @state: the mask of task states that can be woken
-+ * @wake_flags: wake modifier flags (WF_*)
-+ *
-+ * Conceptually does:
-+ *
-+ * If (@state & @p->state) @p->state = TASK_RUNNING.
-+ *
-+ * If the task was not queued/runnable, also place it back on a runqueue.
-+ *
-+ * This function is atomic against schedule() which would dequeue the task.
-+ *
-+ * It issues a full memory barrier before accessing @p->state, see the comment
-+ * with set_current_state().
-+ *
-+ * Uses p->pi_lock to serialize against concurrent wake-ups.
-+ *
-+ * Relies on p->pi_lock stabilizing:
-+ * - p->sched_class
-+ * - p->cpus_ptr
-+ * - p->sched_task_group
-+ * in order to do migration, see its use of select_task_rq()/set_task_cpu().
-+ *
-+ * Tries really hard to only take one task_rq(p)->lock for performance.
-+ * Takes rq->lock in:
-+ * - ttwu_runnable() -- old rq, unavoidable, see comment there;
-+ * - ttwu_queue() -- new rq, for enqueue of the task;
-+ * - psi_ttwu_dequeue() -- much sadness :-( accounting will kill us.
-+ *
-+ * As a consequence we race really badly with just about everything. See the
-+ * many memory barriers and their comments for details.
-+ *
-+ * Return: %true if @p->state changes (an actual wakeup was done),
-+ * %false otherwise.
-+ */
-+static int try_to_wake_up(struct task_struct *p, unsigned int state,
-+ int wake_flags)
-+{
-+ unsigned long flags;
-+ int cpu, success = 0;
-+
-+ preempt_disable();
-+ if (p == current) {
-+ /*
-+ * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
-+ * == smp_processor_id()'. Together this means we can special
-+ * case the whole 'p->on_rq && ttwu_runnable()' case below
-+ * without taking any locks.
-+ *
-+ * In particular:
-+ * - we rely on Program-Order guarantees for all the ordering,
-+ * - we're serialized against set_special_state() by virtue of
-+ * it disabling IRQs (this allows not taking ->pi_lock).
-+ */
-+ if (!ttwu_state_match(p, state, &success))
-+ goto out;
-+
-+ trace_sched_waking(p);
-+ ttwu_do_wakeup(p);
-+ goto out;
-+ }
-+
-+ /*
-+ * If we are going to wake up a thread waiting for CONDITION we
-+ * need to ensure that CONDITION=1 done by the caller can not be
-+ * reordered with p->state check below. This pairs with smp_store_mb()
-+ * in set_current_state() that the waiting thread does.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ smp_mb__after_spinlock();
-+ if (!ttwu_state_match(p, state, &success))
-+ goto unlock;
-+
-+ trace_sched_waking(p);
-+
-+ /*
-+ * Ensure we load p->on_rq _after_ p->state, otherwise it would
-+ * be possible to, falsely, observe p->on_rq == 0 and get stuck
-+ * in smp_cond_load_acquire() below.
-+ *
-+ * sched_ttwu_pending() try_to_wake_up()
-+ * STORE p->on_rq = 1 LOAD p->state
-+ * UNLOCK rq->lock
-+ *
-+ * __schedule() (switch to task 'p')
-+ * LOCK rq->lock smp_rmb();
-+ * smp_mb__after_spinlock();
-+ * UNLOCK rq->lock
-+ *
-+ * [task p]
-+ * STORE p->state = UNINTERRUPTIBLE LOAD p->on_rq
-+ *
-+ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
-+ * __schedule(). See the comment for smp_mb__after_spinlock().
-+ *
-+ * A similar smp_rmb() lives in try_invoke_on_locked_down_task().
-+ */
-+ smp_rmb();
-+ if (READ_ONCE(p->on_rq) && ttwu_runnable(p, wake_flags))
-+ goto unlock;
-+
-+#ifdef CONFIG_SMP
-+ /*
-+ * Ensure we load p->on_cpu _after_ p->on_rq, otherwise it would be
-+ * possible to, falsely, observe p->on_cpu == 0.
-+ *
-+ * One must be running (->on_cpu == 1) in order to remove oneself
-+ * from the runqueue.
-+ *
-+ * __schedule() (switch to task 'p') try_to_wake_up()
-+ * STORE p->on_cpu = 1 LOAD p->on_rq
-+ * UNLOCK rq->lock
-+ *
-+ * __schedule() (put 'p' to sleep)
-+ * LOCK rq->lock smp_rmb();
-+ * smp_mb__after_spinlock();
-+ * STORE p->on_rq = 0 LOAD p->on_cpu
-+ *
-+ * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
-+ * __schedule(). See the comment for smp_mb__after_spinlock().
-+ *
-+ * Form a control-dep-acquire with p->on_rq == 0 above, to ensure
-+ * schedule()'s deactivate_task() has 'happened' and p will no longer
-+ * care about its own p->state. See the comment in __schedule().
-+ */
-+ smp_acquire__after_ctrl_dep();
-+
-+ /*
-+ * We're doing the wakeup (@success == 1), they did a dequeue (p->on_rq
-+ * == 0), which means we need to do an enqueue, change p->state to
-+ * TASK_WAKING such that we can unlock p->pi_lock before doing the
-+ * enqueue, such as ttwu_queue_wakelist().
-+ */
-+ WRITE_ONCE(p->__state, TASK_WAKING);
-+
-+ /*
-+ * If the owning (remote) CPU is still in the middle of schedule() with
-+ * this task as prev, consider queueing p on the remote CPU's wake_list
-+ * which potentially sends an IPI instead of spinning on p->on_cpu to
-+ * let the waker make forward progress. This is safe because IRQs are
-+ * disabled and the IPI will deliver after on_cpu is cleared.
-+ *
-+ * Ensure we load task_cpu(p) after p->on_cpu:
-+ *
-+ * set_task_cpu(p, cpu);
-+ * STORE p->cpu = @cpu
-+ * __schedule() (switch to task 'p')
-+ * LOCK rq->lock
-+ * smp_mb__after_spin_lock() smp_cond_load_acquire(&p->on_cpu)
-+ * STORE p->on_cpu = 1 LOAD p->cpu
-+ *
-+ * to ensure we observe the correct CPU on which the task is currently
-+ * scheduling.
-+ */
-+ if (smp_load_acquire(&p->on_cpu) &&
-+ ttwu_queue_wakelist(p, task_cpu(p), wake_flags))
-+ goto unlock;
-+
-+ /*
-+ * If the owning (remote) CPU is still in the middle of schedule() with
-+ * this task as prev, wait until it's done referencing the task.
-+ *
-+ * Pairs with the smp_store_release() in finish_task().
-+ *
-+ * This ensures that tasks getting woken will be fully ordered against
-+ * their previous state and preserve Program Order.
-+ */
-+ smp_cond_load_acquire(&p->on_cpu, !VAL);
-+
-+ sched_task_ttwu(p);
-+
-+ cpu = select_task_rq(p);
-+
-+ if (cpu != task_cpu(p)) {
-+ if (p->in_iowait) {
-+ delayacct_blkio_end(p);
-+ atomic_dec(&task_rq(p)->nr_iowait);
-+ }
-+
-+ wake_flags |= WF_MIGRATED;
-+ set_task_cpu(p, cpu);
-+ }
-+#else
-+ cpu = task_cpu(p);
-+#endif /* CONFIG_SMP */
-+
-+ ttwu_queue(p, cpu, wake_flags);
-+unlock:
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+out:
-+ if (success)
-+ ttwu_stat(p, task_cpu(p), wake_flags);
-+ preempt_enable();
-+
-+ return success;
-+}
-+
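The CONDITION pairing documented above is easiest to see from the caller's side. A minimal sketch of the classic sleeper/waker pattern that try_to_wake_up() is written against; worker_task, data_ready and both functions are made up, only set_current_state(), schedule() and wake_up_process() are the real kernel API:

#include <linux/kthread.h>
#include <linux/sched.h>

static struct task_struct *worker_task;	/* hypothetical */
static bool data_ready;			/* the CONDITION */

static int worker_fn(void *unused)
{
	while (!kthread_should_stop()) {
		set_current_state(TASK_INTERRUPTIBLE);	/* publish state first */
		if (!data_ready)
			schedule();			/* sleep until woken */
		__set_current_state(TASK_RUNNING);
		data_ready = false;
		/* ... consume the work ... */
	}
	return 0;
}

static void producer(void)
{
	data_ready = true;		/* CONDITION = 1 ... */
	wake_up_process(worker_task);	/* ... then the barrier + state check above */
}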
-+static bool __task_needs_rq_lock(struct task_struct *p)
-+{
-+ unsigned int state = READ_ONCE(p->__state);
-+
-+ /*
-+ * Since pi->lock blocks try_to_wake_up(), we don't need rq->lock when
-+ * the task is blocked. Make sure to check @state since ttwu() can drop
-+ * locks at the end, see ttwu_queue_wakelist().
-+ */
-+ if (state == TASK_RUNNING || state == TASK_WAKING)
-+ return true;
-+
-+ /*
-+ * Ensure we load p->on_rq after p->__state, otherwise it would be
-+ * possible to, falsely, observe p->on_rq == 0.
-+ *
-+ * See try_to_wake_up() for a longer comment.
-+ */
-+ smp_rmb();
-+ if (p->on_rq)
-+ return true;
-+
-+#ifdef CONFIG_SMP
-+ /*
-+ * Ensure the task has finished __schedule() and will not be referenced
-+ * anymore. Again, see try_to_wake_up() for a longer comment.
-+ */
-+ smp_rmb();
-+ smp_cond_load_acquire(&p->on_cpu, !VAL);
-+#endif
-+
-+ return false;
-+}
-+
-+/**
-+ * task_call_func - Invoke a function on task in fixed state
-+ * @p: Process for which the function is to be invoked, can be @current.
-+ * @func: Function to invoke.
-+ * @arg: Argument to function.
-+ *
-+ * Fix the task in its current state by avoiding wakeups and/or rq operations
-+ * and call @func(@arg) on it. This function can use ->on_rq and task_curr()
-+ * to work out what the state is, if required. Given that @func can be invoked
-+ * with a runqueue lock held, it had better be quite lightweight.
-+ *
-+ * Returns:
-+ * Whatever @func returns
-+ */
-+int task_call_func(struct task_struct *p, task_call_f func, void *arg)
-+{
-+ struct rq *rq = NULL;
-+ struct rq_flags rf;
-+ int ret;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
-+
-+ if (__task_needs_rq_lock(p))
-+ rq = __task_rq_lock(p, &rf);
-+
-+ /*
-+ * At this point the task is pinned; either:
-+ * - blocked and we're holding off wakeups (pi->lock)
-+ * - woken, and we're holding off enqueue (rq->lock)
-+ * - queued, and we're holding off schedule (rq->lock)
-+ * - running, and we're holding off de-schedule (rq->lock)
-+ *
-+ * The called function (@func) can use: task_curr(), p->on_rq and
-+ * p->__state to differentiate between these states.
-+ */
-+ ret = func(p, arg);
-+
-+ if (rq)
-+ __task_rq_unlock(rq, &rf);
-+
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
-+ return ret;
-+}
-+
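A minimal usage sketch for the helper above; the callback and wrapper names are made up, but the callback signature matches task_call_f and it stays lightweight because it may run under rq->lock:

/* Hypothetical: snapshot the pinned task's state, report whether it is
 * the CPU's current task. */
static int snapshot_state(struct task_struct *p, void *arg)
{
	*(unsigned int *)arg = READ_ONCE(p->__state);
	return task_curr(p);
}

static bool task_is_on_cpu_sketch(struct task_struct *p)
{
	unsigned int state;

	return task_call_func(p, snapshot_state, &state) != 0;
}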
-+/**
-+ * cpu_curr_snapshot - Return a snapshot of the currently running task
-+ * @cpu: The CPU on which to snapshot the task.
-+ *
-+ * Returns the task_struct pointer of the task "currently" running on
-+ * the specified CPU. If the same task is running on that CPU throughout,
-+ * the return value will be a pointer to that task's task_struct structure.
-+ * If the CPU did any context switches even vaguely concurrently with the
-+ * execution of this function, the return value will be a pointer to the
-+ * task_struct structure of a randomly chosen task that was running on
-+ * that CPU somewhere around the time that this function was executing.
-+ *
-+ * If the specified CPU was offline, the return value is whatever it
-+ * is, perhaps a pointer to the task_struct structure of that CPU's idle
-+ * task, but there is no guarantee. Callers wishing a useful return
-+ * value must take some action to ensure that the specified CPU remains
-+ * online throughout.
-+ *
-+ * This function executes full memory barriers before and after fetching
-+ * the pointer, which permits the caller to confine this function's fetch
-+ * with respect to the caller's accesses to other shared variables.
-+ */
-+struct task_struct *cpu_curr_snapshot(int cpu)
-+{
-+ struct task_struct *t;
-+
-+ smp_mb(); /* Pairing determined by caller's synchronization design. */
-+ t = rcu_dereference(cpu_curr(cpu));
-+ smp_mb(); /* Pairing determined by caller's synchronization design. */
-+ return t;
-+}
-+
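A hedged usage sketch for cpu_curr_snapshot() under the constraints stated above: the caller keeps the CPU online and stays in an RCU read-side section while it looks at the snapshot (the function name is made up):

static void report_cpu_curr(int cpu)
{
	struct task_struct *t;

	cpus_read_lock();		/* keep @cpu from going offline */
	rcu_read_lock();
	t = cpu_curr_snapshot(cpu);
	pr_info("cpu%d is running %s/%d\n", cpu, t->comm, t->pid);
	rcu_read_unlock();
	cpus_read_unlock();
}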
-+/**
-+ * wake_up_process - Wake up a specific process
-+ * @p: The process to be woken up.
-+ *
-+ * Attempt to wake up the nominated process and move it to the set of runnable
-+ * processes.
-+ *
-+ * Return: 1 if the process was woken up, 0 if it was already running.
-+ *
-+ * This function executes a full memory barrier before accessing the task state.
-+ */
-+int wake_up_process(struct task_struct *p)
-+{
-+ return try_to_wake_up(p, TASK_NORMAL, 0);
-+}
-+EXPORT_SYMBOL(wake_up_process);
-+
-+int wake_up_state(struct task_struct *p, unsigned int state)
-+{
-+ return try_to_wake_up(p, state, 0);
-+}
-+
-+/*
-+ * Perform scheduler related setup for a newly forked process p.
-+ * p is forked by current.
-+ *
-+ * __sched_fork() is basic setup used by init_idle() too:
-+ */
-+static inline void __sched_fork(unsigned long clone_flags, struct task_struct *p)
-+{
-+ p->on_rq = 0;
-+ p->on_cpu = 0;
-+ p->utime = 0;
-+ p->stime = 0;
-+ p->sched_time = 0;
-+
-+#ifdef CONFIG_SCHEDSTATS
-+ /* Even if schedstat is disabled, there should not be garbage */
-+ memset(&p->stats, 0, sizeof(p->stats));
-+#endif
-+
-+#ifdef CONFIG_PREEMPT_NOTIFIERS
-+ INIT_HLIST_HEAD(&p->preempt_notifiers);
-+#endif
-+
-+#ifdef CONFIG_COMPACTION
-+ p->capture_control = NULL;
-+#endif
-+#ifdef CONFIG_SMP
-+ p->wake_entry.u_flags = CSD_TYPE_TTWU;
-+#endif
-+ init_sched_mm_cid(p);
-+}
-+
-+/*
-+ * fork()/clone()-time setup:
-+ */
-+int sched_fork(unsigned long clone_flags, struct task_struct *p)
-+{
-+ __sched_fork(clone_flags, p);
-+ /*
-+ * We mark the process as NEW here. This guarantees that
-+ * nobody will actually run it, and a signal or other external
-+ * event cannot wake it up and insert it on the runqueue either.
-+ */
-+ p->__state = TASK_NEW;
-+
-+ /*
-+ * Make sure we do not leak PI boosting priority to the child.
-+ */
-+ p->prio = current->normal_prio;
-+
-+ /*
-+ * Revert to default priority/policy on fork if requested.
-+ */
-+ if (unlikely(p->sched_reset_on_fork)) {
-+ if (task_has_rt_policy(p)) {
-+ p->policy = SCHED_NORMAL;
-+ p->static_prio = NICE_TO_PRIO(0);
-+ p->rt_priority = 0;
-+ } else if (PRIO_TO_NICE(p->static_prio) < 0)
-+ p->static_prio = NICE_TO_PRIO(0);
-+
-+ p->prio = p->normal_prio = p->static_prio;
-+
-+ /*
-+ * We don't need the reset flag anymore after the fork. It has
-+ * fulfilled its duty:
-+ */
-+ p->sched_reset_on_fork = 0;
-+ }
-+
-+#ifdef CONFIG_SCHED_INFO
-+ if (unlikely(sched_info_on()))
-+ memset(&p->sched_info, 0, sizeof(p->sched_info));
-+#endif
-+ init_task_preempt_count(p);
-+
-+ return 0;
-+}
-+
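The sched_reset_on_fork branch above is driven from user space through the SCHED_RESET_ON_FORK policy flag. A sketch using the standard Linux API (the helper name is made up; the fallback define matches the UAPI value):

#define _GNU_SOURCE
#include <sched.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK	0x40000000
#endif

/* Make the caller SCHED_FIFO, but let its children start back at
 * SCHED_NORMAL/nice 0 via the reset path in sched_fork() above. */
static int fifo_but_normal_children(void)
{
	struct sched_param sp = { .sched_priority = 10 };

	return sched_setscheduler(0, SCHED_FIFO | SCHED_RESET_ON_FORK, &sp);
}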
-+void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+
-+ /*
-+ * Because we're not yet on the pid-hash, p->pi_lock isn't strictly
-+ * required yet, but lockdep gets upset if rules are violated.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ /*
-+ * Share the timeslice between parent and child, thus the
-+ * total amount of pending timeslices in the system doesn't change,
-+ * resulting in more scheduling fairness.
-+ */
-+ rq = this_rq();
-+ raw_spin_lock(&rq->lock);
-+
-+ rq->curr->time_slice /= 2;
-+ p->time_slice = rq->curr->time_slice;
-+#ifdef CONFIG_SCHED_HRTICK
-+ hrtick_start(rq, rq->curr->time_slice);
-+#endif
-+
-+ if (p->time_slice < RESCHED_NS) {
-+ p->time_slice = sched_timeslice_ns;
-+ resched_curr(rq);
-+ }
-+ sched_task_fork(p, rq);
-+ raw_spin_unlock(&rq->lock);
-+
-+ rseq_migrate(p);
-+ /*
-+ * We're setting the CPU for the first time, we don't migrate,
-+ * so use __set_task_cpu().
-+ */
-+ __set_task_cpu(p, smp_processor_id());
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+
-+void sched_post_fork(struct task_struct *p)
-+{
-+}
-+
-+#ifdef CONFIG_SCHEDSTATS
-+
-+DEFINE_STATIC_KEY_FALSE(sched_schedstats);
-+
-+static void set_schedstats(bool enabled)
-+{
-+ if (enabled)
-+ static_branch_enable(&sched_schedstats);
-+ else
-+ static_branch_disable(&sched_schedstats);
-+}
-+
-+void force_schedstat_enabled(void)
-+{
-+ if (!schedstat_enabled()) {
-+ pr_info("kernel profiling enabled schedstats, disable via kernel.sched_schedstats.\n");
-+ static_branch_enable(&sched_schedstats);
-+ }
-+}
-+
-+static int __init setup_schedstats(char *str)
-+{
-+ int ret = 0;
-+ if (!str)
-+ goto out;
-+
-+ if (!strcmp(str, "enable")) {
-+ set_schedstats(true);
-+ ret = 1;
-+ } else if (!strcmp(str, "disable")) {
-+ set_schedstats(false);
-+ ret = 1;
-+ }
-+out:
-+ if (!ret)
-+ pr_warn("Unable to parse schedstats=\n");
-+
-+ return ret;
-+}
-+__setup("schedstats=", setup_schedstats);
-+
-+#ifdef CONFIG_PROC_SYSCTL
-+static int sysctl_schedstats(struct ctl_table *table, int write, void *buffer,
-+ size_t *lenp, loff_t *ppos)
-+{
-+ struct ctl_table t;
-+ int err;
-+ int state = static_branch_likely(&sched_schedstats);
-+
-+ if (write && !capable(CAP_SYS_ADMIN))
-+ return -EPERM;
-+
-+ t = *table;
-+ t.data = &state;
-+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
-+ if (err < 0)
-+ return err;
-+ if (write)
-+ set_schedstats(state);
-+ return err;
-+}
-+
-+static struct ctl_table sched_core_sysctls[] = {
-+ {
-+ .procname = "sched_schedstats",
-+ .data = NULL,
-+ .maxlen = sizeof(unsigned int),
-+ .mode = 0644,
-+ .proc_handler = sysctl_schedstats,
-+ .extra1 = SYSCTL_ZERO,
-+ .extra2 = SYSCTL_ONE,
-+ },
-+ {}
-+};
-+static int __init sched_core_sysctl_init(void)
-+{
-+ register_sysctl_init("kernel", sched_core_sysctls);
-+ return 0;
-+}
-+late_initcall(sched_core_sysctl_init);
-+#endif /* CONFIG_PROC_SYSCTL */
-+#endif /* CONFIG_SCHEDSTATS */
-+
-+/*
-+ * wake_up_new_task - wake up a newly created task for the first time.
-+ *
-+ * This function will do some initial scheduler statistics housekeeping
-+ * that must be done for every newly created context, then puts the task
-+ * on the runqueue and wakes it.
-+ */
-+void wake_up_new_task(struct task_struct *p)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ WRITE_ONCE(p->__state, TASK_RUNNING);
-+ rq = cpu_rq(select_task_rq(p));
-+#ifdef CONFIG_SMP
-+ rseq_migrate(p);
-+ /*
-+ * Fork balancing, do it here and not earlier because:
-+ * - cpus_ptr can change in the fork path
-+ * - any previously selected CPU might disappear through hotplug
-+ *
-+ * Use __set_task_cpu() to avoid calling sched_class::migrate_task_rq,
-+ * as we're not fully set-up yet.
-+ */
-+ __set_task_cpu(p, cpu_of(rq));
-+#endif
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+
-+ activate_task(p, rq);
-+ trace_sched_wakeup_new(p);
-+ check_preempt_curr(rq);
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+
-+#ifdef CONFIG_PREEMPT_NOTIFIERS
-+
-+static DEFINE_STATIC_KEY_FALSE(preempt_notifier_key);
-+
-+void preempt_notifier_inc(void)
-+{
-+ static_branch_inc(&preempt_notifier_key);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_inc);
-+
-+void preempt_notifier_dec(void)
-+{
-+ static_branch_dec(&preempt_notifier_key);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_dec);
-+
-+/**
-+ * preempt_notifier_register - tell me when current is being preempted & rescheduled
-+ * @notifier: notifier struct to register
-+ */
-+void preempt_notifier_register(struct preempt_notifier *notifier)
-+{
-+ if (!static_branch_unlikely(&preempt_notifier_key))
-+ WARN(1, "registering preempt_notifier while notifiers disabled\n");
-+
-+ hlist_add_head(&notifier->link, &current->preempt_notifiers);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_register);
-+
-+/**
-+ * preempt_notifier_unregister - no longer interested in preemption notifications
-+ * @notifier: notifier struct to unregister
-+ *
-+ * This is *not* safe to call from within a preemption notifier.
-+ */
-+void preempt_notifier_unregister(struct preempt_notifier *notifier)
-+{
-+ hlist_del(&notifier->link);
-+}
-+EXPORT_SYMBOL_GPL(preempt_notifier_unregister);
-+
-+static void __fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+ struct preempt_notifier *notifier;
-+
-+ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
-+ notifier->ops->sched_in(notifier, raw_smp_processor_id());
-+}
-+
-+static __always_inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+ if (static_branch_unlikely(&preempt_notifier_key))
-+ __fire_sched_in_preempt_notifiers(curr);
-+}
-+
-+static void
-+__fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+ struct preempt_notifier *notifier;
-+
-+ hlist_for_each_entry(notifier, &curr->preempt_notifiers, link)
-+ notifier->ops->sched_out(notifier, next);
-+}
-+
-+static __always_inline void
-+fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+ if (static_branch_unlikely(&preempt_notifier_key))
-+ __fire_sched_out_preempt_notifiers(curr, next);
-+}
-+
-+#else /* !CONFIG_PREEMPT_NOTIFIERS */
-+
-+static inline void fire_sched_in_preempt_notifiers(struct task_struct *curr)
-+{
-+}
-+
-+static inline void
-+fire_sched_out_preempt_notifiers(struct task_struct *curr,
-+ struct task_struct *next)
-+{
-+}
-+
-+#endif /* CONFIG_PREEMPT_NOTIFIERS */
-+
-+static inline void prepare_task(struct task_struct *next)
-+{
-+ /*
-+ * Claim the task as running, we do this before switching to it
-+ * such that any running task will have this set.
-+ *
-+ * See the smp_load_acquire(&p->on_cpu) case in ttwu() and
-+ * its ordering comment.
-+ */
-+ WRITE_ONCE(next->on_cpu, 1);
-+}
-+
-+static inline void finish_task(struct task_struct *prev)
-+{
-+#ifdef CONFIG_SMP
-+ /*
-+ * This must be the very last reference to @prev from this CPU. After
-+ * p->on_cpu is cleared, the task can be moved to a different CPU. We
-+ * must ensure this doesn't happen until the switch is completely
-+ * finished.
-+ *
-+ * In particular, the load of prev->state in finish_task_switch() must
-+ * happen before this.
-+ *
-+ * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
-+ */
-+ smp_store_release(&prev->on_cpu, 0);
-+#else
-+ prev->on_cpu = 0;
-+#endif
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+static void do_balance_callbacks(struct rq *rq, struct balance_callback *head)
-+{
-+ void (*func)(struct rq *rq);
-+ struct balance_callback *next;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ while (head) {
-+ func = (void (*)(struct rq *))head->func;
-+ next = head->next;
-+ head->next = NULL;
-+ head = next;
-+
-+ func(rq);
-+ }
-+}
-+
-+static void balance_push(struct rq *rq);
-+
-+/*
-+ * balance_push_callback is a right abuse of the callback interface and plays
-+ * by significantly different rules.
-+ *
-+ * Where the normal balance_callback's purpose is to be run in the same context
-+ * that queued it (only later, when it's safe to drop rq->lock again),
-+ * balance_push_callback is specifically targeted at __schedule().
-+ *
-+ * This abuse is tolerated because it places all the unlikely/odd cases behind
-+ * a single test, namely: rq->balance_callback == NULL.
-+ */
-+struct balance_callback balance_push_callback = {
-+ .next = NULL,
-+ .func = balance_push,
-+};
-+
-+static inline struct balance_callback *
-+__splice_balance_callbacks(struct rq *rq, bool split)
-+{
-+ struct balance_callback *head = rq->balance_callback;
-+
-+ if (likely(!head))
-+ return NULL;
-+
-+ lockdep_assert_rq_held(rq);
-+ /*
-+ * Must not take balance_push_callback off the list when
-+ * splice_balance_callbacks() and balance_callbacks() are not
-+ * in the same rq->lock section.
-+ *
-+ * In that case it would be possible for __schedule() to interleave
-+ * and observe the list empty.
-+ */
-+ if (split && head == &balance_push_callback)
-+ head = NULL;
-+ else
-+ rq->balance_callback = NULL;
-+
-+ return head;
-+}
-+
-+static inline struct balance_callback *splice_balance_callbacks(struct rq *rq)
-+{
-+ return __splice_balance_callbacks(rq, true);
-+}
-+
-+static void __balance_callbacks(struct rq *rq)
-+{
-+ do_balance_callbacks(rq, __splice_balance_callbacks(rq, false));
-+}
-+
-+static inline void balance_callbacks(struct rq *rq, struct balance_callback *head)
-+{
-+ unsigned long flags;
-+
-+ if (unlikely(head)) {
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ do_balance_callbacks(rq, head);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+ }
-+}
-+
-+#else
-+
-+static inline void __balance_callbacks(struct rq *rq)
-+{
-+}
-+
-+static inline struct balance_callback *splice_balance_callbacks(struct rq *rq)
-+{
-+ return NULL;
-+}
-+
-+static inline void balance_callbacks(struct rq *rq, struct balance_callback *head)
-+{
-+}
-+
-+#endif
-+
-+static inline void
-+prepare_lock_switch(struct rq *rq, struct task_struct *next)
-+{
-+ /*
-+ * The runqueue lock will be released by the next
-+ * task (which is an invalid locking op but in the case
-+ * of the scheduler it's an obvious special-case), so we
-+ * do an early lockdep release here:
-+ */
-+ spin_release(&rq->lock.dep_map, _THIS_IP_);
-+#ifdef CONFIG_DEBUG_SPINLOCK
-+ /* this is a valid case when another task releases the spinlock */
-+ rq->lock.owner = next;
-+#endif
-+}
-+
-+static inline void finish_lock_switch(struct rq *rq)
-+{
-+ /*
-+ * If we are tracking spinlock dependencies then we have to
-+ * fix up the runqueue lock - which gets 'carried over' from
-+ * prev into current:
-+ */
-+ spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
-+ __balance_callbacks(rq);
-+ raw_spin_unlock_irq(&rq->lock);
-+}
-+
-+/*
-+ * NOP if the arch has not defined these:
-+ */
-+
-+#ifndef prepare_arch_switch
-+# define prepare_arch_switch(next) do { } while (0)
-+#endif
-+
-+#ifndef finish_arch_post_lock_switch
-+# define finish_arch_post_lock_switch() do { } while (0)
-+#endif
-+
-+static inline void kmap_local_sched_out(void)
-+{
-+#ifdef CONFIG_KMAP_LOCAL
-+ if (unlikely(current->kmap_ctrl.idx))
-+ __kmap_local_sched_out();
-+#endif
-+}
-+
-+static inline void kmap_local_sched_in(void)
-+{
-+#ifdef CONFIG_KMAP_LOCAL
-+ if (unlikely(current->kmap_ctrl.idx))
-+ __kmap_local_sched_in();
-+#endif
-+}
-+
-+/**
-+ * prepare_task_switch - prepare to switch tasks
-+ * @rq: the runqueue preparing to switch
-+ * @prev: the task we are switching away from.
-+ * @next: the task we are going to switch to.
-+ *
-+ * This is called with the rq lock held and interrupts off. It must
-+ * be paired with a subsequent finish_task_switch after the context
-+ * switch.
-+ *
-+ * prepare_task_switch sets up locking and calls architecture specific
-+ * hooks.
-+ */
-+static inline void
-+prepare_task_switch(struct rq *rq, struct task_struct *prev,
-+ struct task_struct *next)
-+{
-+ kcov_prepare_switch(prev);
-+ sched_info_switch(rq, prev, next);
-+ perf_event_task_sched_out(prev, next);
-+ rseq_preempt(prev);
-+ fire_sched_out_preempt_notifiers(prev, next);
-+ kmap_local_sched_out();
-+ prepare_task(next);
-+ prepare_arch_switch(next);
-+}
-+
-+/**
-+ * finish_task_switch - clean up after a task-switch
-+ * @rq: runqueue associated with task-switch
-+ * @prev: the thread we just switched away from.
-+ *
-+ * finish_task_switch must be called after the context switch, paired
-+ * with a prepare_task_switch call before the context switch.
-+ * finish_task_switch will reconcile locking set up by prepare_task_switch,
-+ * and do any other architecture-specific cleanup actions.
-+ *
-+ * Note that we may have delayed dropping an mm in context_switch(). If
-+ * so, we finish that here outside of the runqueue lock. (Doing it
-+ * with the lock held can cause deadlocks; see schedule() for
-+ * details.)
-+ *
-+ * The context switch has flipped the stack from under us and restored the
-+ * local variables which were saved when this task called schedule() in the
-+ * past. prev == current is still correct but we need to recalculate this_rq
-+ * because prev may have moved to another CPU.
-+ */
-+static struct rq *finish_task_switch(struct task_struct *prev)
-+ __releases(rq->lock)
-+{
-+ struct rq *rq = this_rq();
-+ struct mm_struct *mm = rq->prev_mm;
-+ unsigned int prev_state;
-+
-+ /*
-+ * The previous task will have left us with a preempt_count of 2
-+ * because it left us after:
-+ *
-+ * schedule()
-+ * preempt_disable(); // 1
-+ * __schedule()
-+ * raw_spin_lock_irq(&rq->lock) // 2
-+ *
-+ * Also, see FORK_PREEMPT_COUNT.
-+ */
-+ if (WARN_ONCE(preempt_count() != 2*PREEMPT_DISABLE_OFFSET,
-+ "corrupted preempt_count: %s/%d/0x%x\n",
-+ current->comm, current->pid, preempt_count()))
-+ preempt_count_set(FORK_PREEMPT_COUNT);
-+
-+ rq->prev_mm = NULL;
-+
-+ /*
-+ * A task struct has one reference for the use as "current".
-+ * If a task dies, then it sets TASK_DEAD in tsk->state and calls
-+ * schedule one last time. The schedule call will never return, and
-+ * the scheduled task must drop that reference.
-+ *
-+ * We must observe prev->state before clearing prev->on_cpu (in
-+ * finish_task), otherwise a concurrent wakeup can get prev
-+ * running on another CPU and we could race with its RUNNING -> DEAD
-+ * transition, resulting in a double drop.
-+ */
-+ prev_state = READ_ONCE(prev->__state);
-+ vtime_task_switch(prev);
-+ perf_event_task_sched_in(prev, current);
-+ finish_task(prev);
-+ tick_nohz_task_switch();
-+ finish_lock_switch(rq);
-+ finish_arch_post_lock_switch();
-+ kcov_finish_switch(current);
-+ /*
-+ * kmap_local_sched_out() is invoked with rq::lock held and
-+ * interrupts disabled. There is no requirement for that, but the
-+ * sched out code does not have an interrupt enabled section.
-+ * Restoring the maps on sched in does not require interrupts being
-+ * disabled either.
-+ */
-+ kmap_local_sched_in();
-+
-+ fire_sched_in_preempt_notifiers(current);
-+ /*
-+ * When switching through a kernel thread, the loop in
-+ * membarrier_{private,global}_expedited() may have observed that
-+ * kernel thread and not issued an IPI. It is therefore possible to
-+ * schedule between user->kernel->user threads without passing through
-+ * switch_mm(). Membarrier requires a barrier after storing to
-+ * rq->curr, before returning to userspace, so provide them here:
-+ *
-+ * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
-+ * provided by mmdrop(),
-+ * - a sync_core for SYNC_CORE.
-+ */
-+ if (mm) {
-+ membarrier_mm_sync_core_before_usermode(mm);
-+ mmdrop_sched(mm);
-+ }
-+ if (unlikely(prev_state == TASK_DEAD)) {
-+ /* Task is done with its stack. */
-+ put_task_stack(prev);
-+
-+ put_task_struct_rcu_user(prev);
-+ }
-+
-+ return rq;
-+}
-+
-+/**
-+ * schedule_tail - first thing a freshly forked thread must call.
-+ * @prev: the thread we just switched away from.
-+ */
-+asmlinkage __visible void schedule_tail(struct task_struct *prev)
-+ __releases(rq->lock)
-+{
-+ /*
-+ * New tasks start with FORK_PREEMPT_COUNT, see there and
-+ * finish_task_switch() for details.
-+ *
-+ * finish_task_switch() will drop rq->lock() and lower preempt_count
-+ * and the preempt_enable() will end up enabling preemption (on
-+ * PREEMPT_COUNT kernels).
-+ */
-+
-+ finish_task_switch(prev);
-+ preempt_enable();
-+
-+ if (current->set_child_tid)
-+ put_user(task_pid_vnr(current), current->set_child_tid);
-+
-+ calculate_sigpending();
-+}
-+
-+/*
-+ * context_switch - switch to the new MM and the new thread's register state.
-+ */
-+static __always_inline struct rq *
-+context_switch(struct rq *rq, struct task_struct *prev,
-+ struct task_struct *next)
-+{
-+ prepare_task_switch(rq, prev, next);
-+
-+ /*
-+ * For paravirt, this is coupled with an exit in switch_to to
-+ * combine the page table reload and the switch backend into
-+ * one hypercall.
-+ */
-+ arch_start_context_switch(prev);
-+
-+ /*
-+ * kernel -> kernel lazy + transfer active
-+ * user -> kernel lazy + mmgrab() active
-+ *
-+ * kernel -> user switch + mmdrop() active
-+ * user -> user switch
-+ *
-+ * switch_mm_cid() needs to be updated if the barriers provided
-+ * by context_switch() are modified.
-+ */
-+ if (!next->mm) { // to kernel
-+ enter_lazy_tlb(prev->active_mm, next);
-+
-+ next->active_mm = prev->active_mm;
-+ if (prev->mm) // from user
-+ mmgrab(prev->active_mm);
-+ else
-+ prev->active_mm = NULL;
-+ } else { // to user
-+ membarrier_switch_mm(rq, prev->active_mm, next->mm);
-+ /*
-+ * sys_membarrier() requires an smp_mb() between setting
-+ * rq->curr / membarrier_switch_mm() and returning to userspace.
-+ *
-+ * The below provides this either through switch_mm(), or in
-+ * case 'prev->active_mm == next->mm' through
-+ * finish_task_switch()'s mmdrop().
-+ */
-+ switch_mm_irqs_off(prev->active_mm, next->mm, next);
-+ lru_gen_use_mm(next->mm);
-+
-+ if (!prev->mm) { // from kernel
-+ /* will mmdrop() in finish_task_switch(). */
-+ rq->prev_mm = prev->active_mm;
-+ prev->active_mm = NULL;
-+ }
-+ }
-+
-+ /* switch_mm_cid() requires the memory barriers above. */
-+ switch_mm_cid(rq, prev, next);
-+
-+ prepare_lock_switch(rq, next);
-+
-+ /* Here we just switch the register state and the stack. */
-+ switch_to(prev, next, prev);
-+ barrier();
-+
-+ return finish_task_switch(prev);
-+}
-+
-+/*
-+ * nr_running, nr_uninterruptible and nr_context_switches:
-+ *
-+ * externally visible scheduler statistics: current number of runnable
-+ * threads, total number of context switches performed since bootup.
-+ */
-+unsigned int nr_running(void)
-+{
-+ unsigned int i, sum = 0;
-+
-+ for_each_online_cpu(i)
-+ sum += cpu_rq(i)->nr_running;
-+
-+ return sum;
-+}
-+
-+/*
-+ * Check if only the current task is running on the CPU.
-+ *
-+ * Caution: this function does not check that the caller has disabled
-+ * preemption, thus the result might have a time-of-check-to-time-of-use
-+ * race. The caller is responsible to use it correctly, for example:
-+ *
-+ * - from a non-preemptible section (of course)
-+ *
-+ * - from a thread that is bound to a single CPU
-+ *
-+ * - in a loop with very short iterations (e.g. a polling loop)
-+ */
-+bool single_task_running(void)
-+{
-+ return raw_rq()->nr_running == 1;
-+}
-+EXPORT_SYMBOL(single_task_running);
-+
-+unsigned long long nr_context_switches_cpu(int cpu)
-+{
-+ return cpu_rq(cpu)->nr_switches;
-+}
-+
-+unsigned long long nr_context_switches(void)
-+{
-+ int i;
-+ unsigned long long sum = 0;
-+
-+ for_each_possible_cpu(i)
-+ sum += cpu_rq(i)->nr_switches;
-+
-+ return sum;
-+}
-+
-+/*
-+ * Consumers of these two interfaces, like for example the cpuidle menu
-+ * governor, are using nonsensical data, preferring shallow idle state selection
-+ * for a CPU that has IO-wait which might not even end up running the task when
-+ * it does become runnable.
-+ */
-+
-+unsigned int nr_iowait_cpu(int cpu)
-+{
-+ return atomic_read(&cpu_rq(cpu)->nr_iowait);
-+}
-+
-+/*
-+ * IO-wait accounting, and how it's mostly bollocks (on SMP).
-+ *
-+ * The idea behind IO-wait accounting is to account the idle time that we could
-+ * have spent running if it were not for IO. That is, if we were to improve the
-+ * storage performance, we'd have a proportional reduction in IO-wait time.
-+ *
-+ * This all works nicely on UP, where, when a task blocks on IO, we account
-+ * idle time as IO-wait, because if the storage were faster, it could've been
-+ * running and we'd not be idle.
-+ *
-+ * This has been extended to SMP, by doing the same for each CPU. This however
-+ * is broken.
-+ *
-+ * Imagine for instance the case where two tasks block on one CPU, only the one
-+ * CPU will have IO-wait accounted, while the other has regular idle. Even
-+ * though, if the storage were faster, both could've run at the same time,
-+ * utilising both CPUs.
-+ *
-+ * This means, that when looking globally, the current IO-wait accounting on
-+ * SMP is a lower bound, by reason of under accounting.
-+ *
-+ * Worse, since the numbers are provided per CPU, they are sometimes
-+ * interpreted per CPU, and that is nonsensical. A blocked task isn't strictly
-+ * associated with any one particular CPU; it can wake up on a different CPU than
-+ * the one it blocked on. This means the per-CPU IO-wait number is meaningless.
-+ *
-+ * Task CPU affinities can make all that even more 'interesting'.
-+ */
-+
-+unsigned int nr_iowait(void)
-+{
-+ unsigned int i, sum = 0;
-+
-+ for_each_possible_cpu(i)
-+ sum += nr_iowait_cpu(i);
-+
-+ return sum;
-+}
-+
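These counters are the numbers behind /proc/stat: nr_context_switches() feeds the "ctxt" line, while nr_running() and nr_iowait() feed "procs_running" and "procs_blocked". A small user-space sketch, assuming the standard procfs layout:

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/stat", "r");

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, "ctxt ", 5) ||
		    !strncmp(line, "procs_running ", 14) ||
		    !strncmp(line, "procs_blocked ", 14))
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}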
-+#ifdef CONFIG_SMP
-+
-+/*
-+ * sched_exec - execve() is a valuable balancing opportunity, because at
-+ * this point the task has the smallest effective memory and cache
-+ * footprint.
-+ */
-+void sched_exec(void)
-+{
-+}
-+
-+#endif
-+
-+DEFINE_PER_CPU(struct kernel_stat, kstat);
-+DEFINE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
-+
-+EXPORT_PER_CPU_SYMBOL(kstat);
-+EXPORT_PER_CPU_SYMBOL(kernel_cpustat);
-+
-+static inline void update_curr(struct rq *rq, struct task_struct *p)
-+{
-+ s64 ns = rq->clock_task - p->last_ran;
-+
-+ p->sched_time += ns;
-+ cgroup_account_cputime(p, ns);
-+ account_group_exec_runtime(p, ns);
-+
-+ p->time_slice -= ns;
-+ p->last_ran = rq->clock_task;
-+}
-+
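A user-space model of the bookkeeping done by update_curr() above (a sketch only: the struct is a stand-in and the cgroup/group accounting calls are left out):

struct task_model {
	long long sched_time;	/* total ns of CPU time consumed */
	long long time_slice;	/* ns left before a reschedule is due */
	long long last_ran;	/* rq->clock_task at the last update */
};

static void update_curr_model(struct task_model *p, long long clock_task)
{
	long long ns = clock_task - p->last_ran;

	p->sched_time += ns;
	p->time_slice -= ns;
	p->last_ran = clock_task;
}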
-+/*
-+ * Return accounted runtime for the task.
-+ * Return separately the current task's pending runtime that has not been
-+ * accounted yet.
-+ */
-+unsigned long long task_sched_runtime(struct task_struct *p)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+ u64 ns;
-+
-+#if defined(CONFIG_64BIT) && defined(CONFIG_SMP)
-+ /*
-+ * 64-bit doesn't need locks to atomically read a 64-bit value.
-+ * So we have an optimization chance when the task's delta_exec is 0.
-+ * Reading ->on_cpu is racy, but this is ok.
-+ *
-+ * If we race with it leaving CPU, we'll take a lock. So we're correct.
-+ * If we race with it entering CPU, unaccounted time is 0. This is
-+ * indistinguishable from the read occurring a few cycles earlier.
-+ * If we see ->on_cpu without ->on_rq, the task is leaving, and has
-+ * been accounted, so we're correct here as well.
-+ */
-+ if (!p->on_cpu || !task_on_rq_queued(p))
-+ return tsk_seruntime(p);
-+#endif
-+
-+ rq = task_access_lock_irqsave(p, &lock, &flags);
-+ /*
-+ * Must be ->curr _and_ ->on_rq. If dequeued, we would
-+ * project cycles that may never be accounted to this
-+ * thread, breaking clock_gettime().
-+ */
-+ if (p == rq->curr && task_on_rq_queued(p)) {
-+ update_rq_clock(rq);
-+ update_curr(rq, p);
-+ }
-+ ns = tsk_seruntime(p);
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+ return ns;
-+}
-+
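From user space, the value computed above is what the per-thread CPU clock reports (this is the clock_gettime() the comment above alludes to); a sketch using the standard POSIX API:

#include <stdio.h>
#include <time.h>

int main(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts))
		return 1;
	printf("this thread has run for %lld.%09ld s\n",
	       (long long)ts.tv_sec, ts.tv_nsec);
	return 0;
}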
-+/* This manages tasks that have run out of timeslice during a scheduler_tick */
-+static inline void scheduler_task_tick(struct rq *rq)
-+{
-+ struct task_struct *p = rq->curr;
-+
-+ if (is_idle_task(p))
-+ return;
-+
-+ update_curr(rq, p);
-+ cpufreq_update_util(rq, 0);
-+
-+ /*
-+ * Tasks that have less than RESCHED_NS of time slice left will be
-+ * rescheduled.
-+ */
-+ if (p->time_slice >= RESCHED_NS)
-+ return;
-+ set_tsk_need_resched(p);
-+ set_preempt_need_resched();
-+}
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+static u64 cpu_resched_latency(struct rq *rq)
-+{
-+ int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
-+ u64 resched_latency, now = rq_clock(rq);
-+ static bool warned_once;
-+
-+ if (sysctl_resched_latency_warn_once && warned_once)
-+ return 0;
-+
-+ if (!need_resched() || !latency_warn_ms)
-+ return 0;
-+
-+ if (system_state == SYSTEM_BOOTING)
-+ return 0;
-+
-+ if (!rq->last_seen_need_resched_ns) {
-+ rq->last_seen_need_resched_ns = now;
-+ rq->ticks_without_resched = 0;
-+ return 0;
-+ }
-+
-+ rq->ticks_without_resched++;
-+ resched_latency = now - rq->last_seen_need_resched_ns;
-+ if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
-+ return 0;
-+
-+ warned_once = true;
-+
-+ return resched_latency;
-+}
-+
-+static int __init setup_resched_latency_warn_ms(char *str)
-+{
-+ long val;
-+
-+ if ((kstrtol(str, 0, &val))) {
-+ pr_warn("Unable to set resched_latency_warn_ms\n");
-+ return 1;
-+ }
-+
-+ sysctl_resched_latency_warn_ms = val;
-+ return 1;
-+}
-+__setup("resched_latency_warn_ms=", setup_resched_latency_warn_ms);
-+#else
-+static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
-+#endif /* CONFIG_SCHED_DEBUG */
-+
-+/*
-+ * This function gets called by the timer code, with HZ frequency.
-+ * We call it with interrupts disabled.
-+ */
-+void scheduler_tick(void)
-+{
-+ int cpu __maybe_unused = smp_processor_id();
-+ struct rq *rq = cpu_rq(cpu);
-+ u64 resched_latency;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
-+ arch_scale_freq_tick();
-+
-+ sched_clock_tick();
-+
-+ raw_spin_lock(&rq->lock);
-+ update_rq_clock(rq);
-+
-+ scheduler_task_tick(rq);
-+ if (sched_feat(LATENCY_WARN))
-+ resched_latency = cpu_resched_latency(rq);
-+ calc_global_load_tick(rq);
-+
-+ task_tick_mm_cid(rq, rq->curr);
-+
-+ rq->last_tick = rq->clock;
-+ raw_spin_unlock(&rq->lock);
-+
-+ if (sched_feat(LATENCY_WARN) && resched_latency)
-+ resched_latency_warn(cpu, resched_latency);
-+
-+ perf_event_task_tick();
-+}
-+
-+#ifdef CONFIG_SCHED_SMT
-+static inline int sg_balance_cpu_stop(void *data)
-+{
-+ struct rq *rq = this_rq();
-+ struct task_struct *p = data;
-+ cpumask_t tmp;
-+ unsigned long flags;
-+
-+ local_irq_save(flags);
-+
-+ raw_spin_lock(&p->pi_lock);
-+ raw_spin_lock(&rq->lock);
-+
-+ rq->active_balance = 0;
-+ /* _something_ may have changed the task, double check again */
-+ if (task_on_rq_queued(p) && task_rq(p) == rq &&
-+ cpumask_and(&tmp, p->cpus_ptr, &sched_sg_idle_mask) &&
-+ !is_migration_disabled(p)) {
-+ int cpu = cpu_of(rq);
-+ int dcpu = __best_mask_cpu(&tmp, per_cpu(sched_cpu_llc_mask, cpu));
-+ rq = move_queued_task(rq, p, dcpu);
-+ }
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock(&p->pi_lock);
-+
-+ local_irq_restore(flags);
-+
-+ return 0;
-+}
-+
-+/* sg_balance_trigger - trigger sibling group balance for @cpu */
-+static inline int sg_balance_trigger(const int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+ struct task_struct *curr;
-+ int res;
-+
-+ if (!raw_spin_trylock_irqsave(&rq->lock, flags))
-+ return 0;
-+ curr = rq->curr;
-+ res = (!is_idle_task(curr)) && (1 == rq->nr_running) &&\
-+ cpumask_intersects(curr->cpus_ptr, &sched_sg_idle_mask) &&\
-+ !is_migration_disabled(curr) && (!rq->active_balance);
-+
-+ if (res)
-+ rq->active_balance = 1;
-+
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ if (res)
-+ stop_one_cpu_nowait(cpu, sg_balance_cpu_stop, curr,
-+ &rq->active_balance_work);
-+ return res;
-+}
-+
-+/*
-+ * sg_balance - sibling group balance check for run queue @rq
-+ */
-+static inline void sg_balance(struct rq *rq, int cpu)
-+{
-+ cpumask_t chk;
-+
-+ /* exit when cpu is offline */
-+ if (unlikely(!rq->online))
-+ return;
-+
-+ /*
-+ * Only a cpu in the sibling idle group will do the checking and then
-+ * find potential cpus which can migrate the currently running task
-+ */
-+ if (cpumask_test_cpu(cpu, &sched_sg_idle_mask) &&
-+ cpumask_andnot(&chk, cpu_online_mask, sched_idle_mask) &&
-+ cpumask_andnot(&chk, &chk, &sched_rq_pending_mask)) {
-+ int i;
-+
-+ for_each_cpu_wrap(i, &chk, cpu) {
-+ if (!cpumask_intersects(cpu_smt_mask(i), sched_idle_mask) &&\
-+ sg_balance_trigger(i))
-+ return;
-+ }
-+ }
-+}
-+#endif /* CONFIG_SCHED_SMT */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+
-+struct tick_work {
-+ int cpu;
-+ atomic_t state;
-+ struct delayed_work work;
-+};
-+/* Values for ->state, see diagram below. */
-+#define TICK_SCHED_REMOTE_OFFLINE 0
-+#define TICK_SCHED_REMOTE_OFFLINING 1
-+#define TICK_SCHED_REMOTE_RUNNING 2
-+
-+/*
-+ * State diagram for ->state:
-+ *
-+ *
-+ * TICK_SCHED_REMOTE_OFFLINE
-+ * | ^
-+ * | |
-+ * | | sched_tick_remote()
-+ * | |
-+ * | |
-+ * +--TICK_SCHED_REMOTE_OFFLINING
-+ * | ^
-+ * | |
-+ * sched_tick_start() | | sched_tick_stop()
-+ * | |
-+ * V |
-+ * TICK_SCHED_REMOTE_RUNNING
-+ *
-+ *
-+ * Other transitions get WARN_ON_ONCE(), except that sched_tick_remote()
-+ * and sched_tick_start() are happy to leave the state in RUNNING.
-+ */
-+
-+static struct tick_work __percpu *tick_work_cpu;
-+
-+static void sched_tick_remote(struct work_struct *work)
-+{
-+ struct delayed_work *dwork = to_delayed_work(work);
-+ struct tick_work *twork = container_of(dwork, struct tick_work, work);
-+ int cpu = twork->cpu;
-+ struct rq *rq = cpu_rq(cpu);
-+ struct task_struct *curr;
-+ unsigned long flags;
-+ u64 delta;
-+ int os;
-+
-+ /*
-+ * Handle the tick only if it appears the remote CPU is running in full
-+ * dynticks mode. The check is racy by nature, but missing a tick or
-+ * having one too much is no big deal because the scheduler tick updates
-+ * statistics and checks timeslices in a time-independent way, regardless
-+ * of when exactly it is running.
-+ */
-+ if (!tick_nohz_tick_stopped_cpu(cpu))
-+ goto out_requeue;
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ curr = rq->curr;
-+ if (cpu_is_offline(cpu))
-+ goto out_unlock;
-+
-+ update_rq_clock(rq);
-+ if (!is_idle_task(curr)) {
-+ /*
-+ * Make sure the next tick runs within a reasonable
-+ * amount of time.
-+ */
-+ delta = rq_clock_task(rq) - curr->last_ran;
-+ WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
-+ }
-+ scheduler_task_tick(rq);
-+
-+ calc_load_nohz_remote(rq);
-+out_unlock:
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+out_requeue:
-+ /*
-+ * Run the remote tick once per second (1Hz). This arbitrary
-+ * frequency is large enough to avoid overload but short enough
-+ * to keep scheduler internal stats reasonably up to date. But
-+ * first update state to reflect hotplug activity if required.
-+ */
-+ os = atomic_fetch_add_unless(&twork->state, -1, TICK_SCHED_REMOTE_RUNNING);
-+ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_OFFLINE);
-+ if (os == TICK_SCHED_REMOTE_RUNNING)
-+ queue_delayed_work(system_unbound_wq, dwork, HZ);
-+}
-+
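The requeue decision above leans on atomic_fetch_add_unless(): decrement the state unless it is already RUNNING, then look at the old value. Spelled out (a descriptive note, nothing new over the code above):

/*
 * old ->state                      effect of the call        requeue?
 * TICK_SCHED_REMOTE_RUNNING  (2)   unchanged                 yes
 * TICK_SCHED_REMOTE_OFFLINING(1)   becomes OFFLINE (0)       no
 * TICK_SCHED_REMOTE_OFFLINE  (0)   never valid, would WARN   no
 */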
-+static void sched_tick_start(int cpu)
-+{
-+ int os;
-+ struct tick_work *twork;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
-+ return;
-+
-+ WARN_ON_ONCE(!tick_work_cpu);
-+
-+ twork = per_cpu_ptr(tick_work_cpu, cpu);
-+ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_RUNNING);
-+ WARN_ON_ONCE(os == TICK_SCHED_REMOTE_RUNNING);
-+ if (os == TICK_SCHED_REMOTE_OFFLINE) {
-+ twork->cpu = cpu;
-+ INIT_DELAYED_WORK(&twork->work, sched_tick_remote);
-+ queue_delayed_work(system_unbound_wq, &twork->work, HZ);
-+ }
-+}
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+static void sched_tick_stop(int cpu)
-+{
-+ struct tick_work *twork;
-+ int os;
-+
-+ if (housekeeping_cpu(cpu, HK_TYPE_TICK))
-+ return;
-+
-+ WARN_ON_ONCE(!tick_work_cpu);
-+
-+ twork = per_cpu_ptr(tick_work_cpu, cpu);
-+ /* There cannot be competing actions, but don't rely on stop-machine. */
-+ os = atomic_xchg(&twork->state, TICK_SCHED_REMOTE_OFFLINING);
-+ WARN_ON_ONCE(os != TICK_SCHED_REMOTE_RUNNING);
-+ /* Don't cancel, as this would mess up the state machine. */
-+}
-+#endif /* CONFIG_HOTPLUG_CPU */
-+
-+int __init sched_tick_offload_init(void)
-+{
-+ tick_work_cpu = alloc_percpu(struct tick_work);
-+ BUG_ON(!tick_work_cpu);
-+ return 0;
-+}
-+
-+#else /* !CONFIG_NO_HZ_FULL */
-+static inline void sched_tick_start(int cpu) { }
-+static inline void sched_tick_stop(int cpu) { }
-+#endif
-+
-+#if defined(CONFIG_PREEMPTION) && (defined(CONFIG_DEBUG_PREEMPT) || \
-+ defined(CONFIG_PREEMPT_TRACER))
-+/*
-+ * If the value passed in is equal to the current preempt count
-+ * then we just disabled preemption. Start timing the latency.
-+ */
-+static inline void preempt_latency_start(int val)
-+{
-+ if (preempt_count() == val) {
-+ unsigned long ip = get_lock_parent_ip();
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ current->preempt_disable_ip = ip;
-+#endif
-+ trace_preempt_off(CALLER_ADDR0, ip);
-+ }
-+}
-+
-+void preempt_count_add(int val)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Underflow?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON((preempt_count() < 0)))
-+ return;
-+#endif
-+ __preempt_count_add(val);
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Spinlock count overflowing soon?
-+ */
-+ DEBUG_LOCKS_WARN_ON((preempt_count() & PREEMPT_MASK) >=
-+ PREEMPT_MASK - 10);
-+#endif
-+ preempt_latency_start(val);
-+}
-+EXPORT_SYMBOL(preempt_count_add);
-+NOKPROBE_SYMBOL(preempt_count_add);
-+
-+/*
-+ * If the value passed in equals the current preempt count
-+ * then we just enabled preemption. Stop timing the latency.
-+ */
-+static inline void preempt_latency_stop(int val)
-+{
-+ if (preempt_count() == val)
-+ trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
-+}
-+
-+void preempt_count_sub(int val)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ /*
-+ * Underflow?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON(val > preempt_count()))
-+ return;
-+ /*
-+ * Is the spinlock portion underflowing?
-+ */
-+ if (DEBUG_LOCKS_WARN_ON((val < PREEMPT_MASK) &&
-+ !(preempt_count() & PREEMPT_MASK)))
-+ return;
-+#endif
-+
-+ preempt_latency_stop(val);
-+ __preempt_count_sub(val);
-+}
-+EXPORT_SYMBOL(preempt_count_sub);
-+NOKPROBE_SYMBOL(preempt_count_sub);
-+
-+#else
-+static inline void preempt_latency_start(int val) { }
-+static inline void preempt_latency_stop(int val) { }
-+#endif
-+
-+static inline unsigned long get_preempt_disable_ip(struct task_struct *p)
-+{
-+#ifdef CONFIG_DEBUG_PREEMPT
-+ return p->preempt_disable_ip;
-+#else
-+ return 0;
-+#endif
-+}
-+
-+/*
-+ * Print scheduling while atomic bug:
-+ */
-+static noinline void __schedule_bug(struct task_struct *prev)
-+{
-+ /* Save this before calling printk(), since that will clobber it */
-+ unsigned long preempt_disable_ip = get_preempt_disable_ip(current);
-+
-+ if (oops_in_progress)
-+ return;
-+
-+ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
-+ prev->comm, prev->pid, preempt_count());
-+
-+ debug_show_held_locks(prev);
-+ print_modules();
-+ if (irqs_disabled())
-+ print_irqtrace_events(prev);
-+ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)
-+ && in_atomic_preempt_off()) {
-+ pr_err("Preemption disabled at:");
-+ print_ip_sym(KERN_ERR, preempt_disable_ip);
-+ }
-+ check_panic_on_warn("scheduling while atomic");
-+
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+
-+/*
-+ * Various schedule()-time debugging checks and statistics:
-+ */
-+static inline void schedule_debug(struct task_struct *prev, bool preempt)
-+{
-+#ifdef CONFIG_SCHED_STACK_END_CHECK
-+ if (task_stack_end_corrupted(prev))
-+ panic("corrupted stack end detected inside scheduler\n");
-+
-+ if (task_scs_end_corrupted(prev))
-+ panic("corrupted shadow stack detected inside scheduler\n");
-+#endif
-+
-+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-+ if (!preempt && READ_ONCE(prev->__state) && prev->non_block_count) {
-+ printk(KERN_ERR "BUG: scheduling in a non-blocking section: %s/%d/%i\n",
-+ prev->comm, prev->pid, prev->non_block_count);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+ }
-+#endif
-+
-+ if (unlikely(in_atomic_preempt_off())) {
-+ __schedule_bug(prev);
-+ preempt_count_set(PREEMPT_DISABLED);
-+ }
-+ rcu_sleep_check();
-+ SCHED_WARN_ON(ct_state() == CONTEXT_USER);
-+
-+ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
-+
-+ schedstat_inc(this_rq()->sched_count);
-+}
-+
-+#ifdef ALT_SCHED_DEBUG
-+void alt_sched_debug(void)
-+{
-+ printk(KERN_INFO "sched: pending: 0x%04lx, idle: 0x%04lx, sg_idle: 0x%04lx\n",
-+ sched_rq_pending_mask.bits[0],
-+ sched_idle_mask->bits[0],
-+ sched_sg_idle_mask.bits[0]);
-+}
-+#else
-+inline void alt_sched_debug(void) {}
-+#endif
-+
-+#ifdef CONFIG_SMP
-+
-+#ifdef CONFIG_PREEMPT_RT
-+#define SCHED_NR_MIGRATE_BREAK 8
-+#else
-+#define SCHED_NR_MIGRATE_BREAK 32
-+#endif
-+
-+const_debug unsigned int sysctl_sched_nr_migrate = SCHED_NR_MIGRATE_BREAK;
-+
-+/*
-+ * Migrate pending tasks in @rq to @dest_cpu
-+ */
-+static inline int
-+migrate_pending_tasks(struct rq *rq, struct rq *dest_rq, const int dest_cpu)
-+{
-+ struct task_struct *p, *skip = rq->curr;
-+ int nr_migrated = 0;
-+ int nr_tries = min(rq->nr_running / 2, sysctl_sched_nr_migrate);
-+
-+ /* Workaround to check that rq->curr is still on the rq */
-+ if (!task_on_rq_queued(skip))
-+ return 0;
-+
-+ while (skip != rq->idle && nr_tries &&
-+ (p = sched_rq_next_task(skip, rq)) != rq->idle) {
-+ skip = sched_rq_next_task(p, rq);
-+ if (cpumask_test_cpu(dest_cpu, p->cpus_ptr)) {
-+ __SCHED_DEQUEUE_TASK(p, rq, 0, );
-+ set_task_cpu(p, dest_cpu);
-+ sched_task_sanity_check(p, dest_rq);
-+ sched_mm_cid_migrate_to(dest_rq, p, cpu_of(rq));
-+ __SCHED_ENQUEUE_TASK(p, dest_rq, 0);
-+ nr_migrated++;
-+ }
-+ nr_tries--;
-+ }
-+
-+ return nr_migrated;
-+}
-+
-+static inline int take_other_rq_tasks(struct rq *rq, int cpu)
-+{
-+ struct cpumask *topo_mask, *end_mask;
-+
-+ if (unlikely(!rq->online))
-+ return 0;
-+
-+ if (cpumask_empty(&sched_rq_pending_mask))
-+ return 0;
-+
-+ topo_mask = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+ end_mask = per_cpu(sched_cpu_topo_end_mask, cpu);
-+ do {
-+ int i;
-+ for_each_cpu_and(i, &sched_rq_pending_mask, topo_mask) {
-+ int nr_migrated;
-+ struct rq *src_rq;
-+
-+ src_rq = cpu_rq(i);
-+ if (!do_raw_spin_trylock(&src_rq->lock))
-+ continue;
-+ spin_acquire(&src_rq->lock.dep_map,
-+ SINGLE_DEPTH_NESTING, 1, _RET_IP_);
-+
-+ if ((nr_migrated = migrate_pending_tasks(src_rq, rq, cpu))) {
-+ src_rq->nr_running -= nr_migrated;
-+ if (src_rq->nr_running < 2)
-+ cpumask_clear_cpu(i, &sched_rq_pending_mask);
-+
-+ spin_release(&src_rq->lock.dep_map, _RET_IP_);
-+ do_raw_spin_unlock(&src_rq->lock);
-+
-+ rq->nr_running += nr_migrated;
-+ if (rq->nr_running > 1)
-+ cpumask_set_cpu(cpu, &sched_rq_pending_mask);
-+
-+ update_sched_preempt_mask(rq);
-+ cpufreq_update_util(rq, 0);
-+
-+ return 1;
-+ }
-+
-+ spin_release(&src_rq->lock.dep_map, _RET_IP_);
-+ do_raw_spin_unlock(&src_rq->lock);
-+ }
-+ } while (++topo_mask < end_mask);
-+
-+ return 0;
-+}
-+#endif
-+
-+/*
-+ * Timeslices below RESCHED_NS are considered as good as expired, since there's no
-+ * point rescheduling when there's so little time left.
-+ */
-+static inline void check_curr(struct task_struct *p, struct rq *rq)
-+{
-+ if (unlikely(rq->idle == p))
-+ return;
-+
-+ update_curr(rq, p);
-+
-+ if (p->time_slice < RESCHED_NS)
-+ time_slice_expired(p, rq);
-+}
-+
-+static inline struct task_struct *
-+choose_next_task(struct rq *rq, int cpu)
-+{
-+ struct task_struct *next;
-+
-+ if (unlikely(rq->skip)) {
-+ next = rq_runnable_task(rq);
-+ if (next == rq->idle) {
-+#ifdef CONFIG_SMP
-+ if (!take_other_rq_tasks(rq, cpu)) {
-+#endif
-+ rq->skip = NULL;
-+ schedstat_inc(rq->sched_goidle);
-+ return next;
-+#ifdef CONFIG_SMP
-+ }
-+ next = rq_runnable_task(rq);
-+#endif
-+ }
-+ rq->skip = NULL;
-+#ifdef CONFIG_HIGH_RES_TIMERS
-+ hrtick_start(rq, next->time_slice);
-+#endif
-+ return next;
-+ }
-+
-+ next = sched_rq_first_task(rq);
-+ if (next == rq->idle) {
-+#ifdef CONFIG_SMP
-+ if (!take_other_rq_tasks(rq, cpu)) {
-+#endif
-+ schedstat_inc(rq->sched_goidle);
-+ /*printk(KERN_INFO "sched: choose_next_task(%d) idle %px\n", cpu, next);*/
-+ return next;
-+#ifdef CONFIG_SMP
-+ }
-+ next = sched_rq_first_task(rq);
-+#endif
-+ }
-+#ifdef CONFIG_HIGH_RES_TIMERS
-+ hrtick_start(rq, next->time_slice);
-+#endif
-+ /*printk(KERN_INFO "sched: choose_next_task(%d) next %px\n", cpu, next);*/
-+ return next;
-+}
-+
-+/*
-+ * Constants for the sched_mode argument of __schedule().
-+ *
-+ * The mode argument allows RT enabled kernels to differentiate a
-+ * preemption from blocking on an 'sleeping' spin/rwlock. Note that
-+ * SM_MASK_PREEMPT for !RT has all bits set, which allows the compiler to
-+ * optimize the AND operation out and just check for zero.
-+ */
-+#define SM_NONE 0x0
-+#define SM_PREEMPT 0x1
-+#define SM_RTLOCK_WAIT 0x2
-+
-+#ifndef CONFIG_PREEMPT_RT
-+# define SM_MASK_PREEMPT (~0U)
-+#else
-+# define SM_MASK_PREEMPT SM_PREEMPT
-+#endif
-+
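Spelling out the mask behaviour from the comment above (a descriptive note only):

/*
 * - !PREEMPT_RT: SM_MASK_PREEMPT is ~0U, so (sched_mode & SM_MASK_PREEMPT)
 *   folds to "sched_mode != SM_NONE" and the AND disappears at compile time.
 * - PREEMPT_RT: only SM_PREEMPT survives the mask, so an SM_RTLOCK_WAIT
 *   schedule still takes the normal blocking path (the task is deactivated)
 *   rather than the preemption path.
 */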
-+/*
-+ * schedule() is the main scheduler function.
-+ *
-+ * The main means of driving the scheduler and thus entering this function are:
-+ *
-+ * 1. Explicit blocking: mutex, semaphore, waitqueue, etc.
-+ *
-+ * 2. TIF_NEED_RESCHED flag is checked on interrupt and userspace return
-+ * paths. For example, see arch/x86/entry_64.S.
-+ *
-+ * To drive preemption between tasks, the scheduler sets the flag in timer
-+ * interrupt handler scheduler_tick().
-+ *
-+ * 3. Wakeups don't really cause entry into schedule(). They add a
-+ * task to the run-queue and that's it.
-+ *
-+ * Now, if the new task added to the run-queue preempts the current
-+ * task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets
-+ * called on the nearest possible occasion:
-+ *
-+ * - If the kernel is preemptible (CONFIG_PREEMPTION=y):
-+ *
-+ * - in syscall or exception context, at the next outermost
-+ * preempt_enable(). (this might be as soon as the wake_up()'s
-+ * spin_unlock()!)
-+ *
-+ * - in IRQ context, return from interrupt-handler to
-+ * preemptible context
-+ *
-+ * - If the kernel is not preemptible (CONFIG_PREEMPTION is not set)
-+ * then at the next:
-+ *
-+ * - cond_resched() call
-+ * - explicit schedule() call
-+ * - return from syscall or exception to user-space
-+ * - return from interrupt-handler to user-space
-+ *
-+ * WARNING: must be called with preemption disabled!
-+ */
-+static void __sched notrace __schedule(unsigned int sched_mode)
-+{
-+ struct task_struct *prev, *next;
-+ unsigned long *switch_count;
-+ unsigned long prev_state;
-+ struct rq *rq;
-+ int cpu;
-+
-+ cpu = smp_processor_id();
-+ rq = cpu_rq(cpu);
-+ prev = rq->curr;
-+
-+ schedule_debug(prev, !!sched_mode);
-+
-+ /* Bypass the sched_feat(HRTICK) check, which Alt schedule FW doesn't support */
-+ hrtick_clear(rq);
-+
-+ local_irq_disable();
-+ rcu_note_context_switch(!!sched_mode);
-+
-+ /*
-+ * Make sure that signal_pending_state()->signal_pending() below
-+ * can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
-+ * done by the caller to avoid the race with signal_wake_up():
-+ *
-+ * __set_current_state(@state) signal_wake_up()
-+ * schedule() set_tsk_thread_flag(p, TIF_SIGPENDING)
-+ * wake_up_state(p, state)
-+ * LOCK rq->lock LOCK p->pi_state
-+ * smp_mb__after_spinlock() smp_mb__after_spinlock()
-+ * if (signal_pending_state()) if (p->state & @state)
-+ *
-+ * Also, the membarrier system call requires a full memory barrier
-+ * after coming from user-space, before storing to rq->curr.
-+ */
-+ raw_spin_lock(&rq->lock);
-+ smp_mb__after_spinlock();
-+
-+ update_rq_clock(rq);
-+
-+ switch_count = &prev->nivcsw;
-+ /*
-+ * We must load prev->state once (task_struct::state is volatile), such
-+ * that we form a control dependency vs deactivate_task() below.
-+ */
-+ prev_state = READ_ONCE(prev->__state);
-+ if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) {
-+ if (signal_pending_state(prev_state, prev)) {
-+ WRITE_ONCE(prev->__state, TASK_RUNNING);
-+ } else {
-+ prev->sched_contributes_to_load =
-+ (prev_state & TASK_UNINTERRUPTIBLE) &&
-+ !(prev_state & TASK_NOLOAD) &&
-+ !(prev_state & TASK_FROZEN);
-+
-+ if (prev->sched_contributes_to_load)
-+ rq->nr_uninterruptible++;
-+
-+ /*
-+ * __schedule() ttwu()
-+ * prev_state = prev->state; if (p->on_rq && ...)
-+ * if (prev_state) goto out;
-+ * p->on_rq = 0; smp_acquire__after_ctrl_dep();
-+ * p->state = TASK_WAKING
-+ *
-+ * Where __schedule() and ttwu() have matching control dependencies.
-+ *
-+ * After this, schedule() must not care about p->state any more.
-+ */
-+ sched_task_deactivate(prev, rq);
-+ deactivate_task(prev, rq);
-+
-+ if (prev->in_iowait) {
-+ atomic_inc(&rq->nr_iowait);
-+ delayacct_blkio_start();
-+ }
-+ }
-+ switch_count = &prev->nvcsw;
-+ }
-+
-+ check_curr(prev, rq);
-+
-+ next = choose_next_task(rq, cpu);
-+ clear_tsk_need_resched(prev);
-+ clear_preempt_need_resched();
-+#ifdef CONFIG_SCHED_DEBUG
-+ rq->last_seen_need_resched_ns = 0;
-+#endif
-+
-+ if (likely(prev != next)) {
-+ next->last_ran = rq->clock_task;
-+ rq->last_ts_switch = rq->clock;
-+
-+ /*printk(KERN_INFO "sched: %px -> %px\n", prev, next);*/
-+ rq->nr_switches++;
-+ /*
-+ * RCU users of rcu_dereference(rq->curr) may not see
-+ * changes to task_struct made by pick_next_task().
-+ */
-+ RCU_INIT_POINTER(rq->curr, next);
-+ /*
-+ * The membarrier system call requires each architecture
-+ * to have a full memory barrier after updating
-+ * rq->curr, before returning to user-space.
-+ *
-+ * Here are the schemes providing that barrier on the
-+ * various architectures:
-+ * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC.
-+ * switch_mm() rely on membarrier_arch_switch_mm() on PowerPC.
-+ * - finish_lock_switch() for weakly-ordered
-+ * architectures where spin_unlock is a full barrier,
-+ * - switch_to() for arm64 (weakly-ordered, spin_unlock
-+ * is a RELEASE barrier),
-+ */
-+ ++*switch_count;
-+
-+ trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next, prev_state);
-+
-+ /* Also unlocks the rq: */
-+ rq = context_switch(rq, prev, next);
-+
-+ cpu = cpu_of(rq);
-+ } else {
-+ __balance_callbacks(rq);
-+ raw_spin_unlock_irq(&rq->lock);
-+ }
-+
-+#ifdef CONFIG_SCHED_SMT
-+ sg_balance(rq, cpu);
-+#endif
-+}
-+
-+void __noreturn do_task_dead(void)
-+{
-+ /* Causes final put_task_struct in finish_task_switch(): */
-+ set_special_state(TASK_DEAD);
-+
-+ /* Tell freezer to ignore us: */
-+ current->flags |= PF_NOFREEZE;
-+
-+ __schedule(SM_NONE);
-+ BUG();
-+
-+ /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
-+ for (;;)
-+ cpu_relax();
-+}
-+
-+static inline void sched_submit_work(struct task_struct *tsk)
-+{
-+ unsigned int task_flags;
-+
-+ if (task_is_running(tsk))
-+ return;
-+
-+ task_flags = tsk->flags;
-+ /*
-+ * If a worker goes to sleep, notify and ask workqueue whether it
-+ * wants to wake up a task to maintain concurrency.
-+ */
-+ if (task_flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
-+ if (task_flags & PF_WQ_WORKER)
-+ wq_worker_sleeping(tsk);
-+ else
-+ io_wq_worker_sleeping(tsk);
-+ }
-+
-+ /*
-+ * spinlock and rwlock must not flush block requests. This will
-+ * deadlock if the callback attempts to acquire a lock which is
-+ * already acquired.
-+ */
-+ SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
-+
-+ /*
-+ * If we are going to sleep and we have plugged IO queued,
-+ * make sure to submit it to avoid deadlocks.
-+ */
-+ blk_flush_plug(tsk->plug, true);
-+}
-+
-+static void sched_update_worker(struct task_struct *tsk)
-+{
-+ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
-+ if (tsk->flags & PF_WQ_WORKER)
-+ wq_worker_running(tsk);
-+ else
-+ io_wq_worker_running(tsk);
-+ }
-+}
-+
-+asmlinkage __visible void __sched schedule(void)
-+{
-+ struct task_struct *tsk = current;
-+
-+ sched_submit_work(tsk);
-+ do {
-+ preempt_disable();
-+ __schedule(SM_NONE);
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+ sched_update_worker(tsk);
-+}
-+EXPORT_SYMBOL(schedule);
-+
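In most kernel code, schedule() is reached through the wait-queue helpers rather than called directly; a short sketch (the queue, flag and function names are made up):

#include <linux/wait.h>

static DECLARE_WAIT_QUEUE_HEAD(demo_wq);	/* hypothetical */
static bool demo_cond;

static int wait_for_demo(void)
{
	/* Sets TASK_INTERRUPTIBLE, re-checks demo_cond, calls schedule(). */
	return wait_event_interruptible(demo_wq, demo_cond);
}

static void signal_demo(void)
{
	demo_cond = true;
	wake_up(&demo_wq);	/* ends up in try_to_wake_up() above */
}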
-+/*
-+ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
-+ * state (have scheduled out non-voluntarily) by making sure that all
-+ * tasks have either left the run queue or have gone into user space.
-+ * As idle tasks do not do either, they must not ever be preempted
-+ * (schedule out non-voluntarily).
-+ *
-+ * schedule_idle() is similar to schedule_preempt_disable() except that it
-+ * never enables preemption because it does not call sched_submit_work().
-+ */
-+void __sched schedule_idle(void)
-+{
-+ /*
-+ * As this skips calling sched_submit_work(), which the idle task does
-+ * regardless because that function is a nop when the task is in a
-+ * TASK_RUNNING state, make sure this isn't used someplace that the
-+ * current task can be in any other state. Note, idle is always in the
-+ * TASK_RUNNING state.
-+ */
-+ WARN_ON_ONCE(current->__state);
-+ do {
-+ __schedule(SM_NONE);
-+ } while (need_resched());
-+}
-+
-+#if defined(CONFIG_CONTEXT_TRACKING_USER) && !defined(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK)
-+asmlinkage __visible void __sched schedule_user(void)
-+{
-+ /*
-+ * If we come here after a random call to set_need_resched(),
-+ * or we have been woken up remotely but the IPI has not yet arrived,
-+ * we haven't yet exited the RCU idle mode. Do it here manually until
-+ * we find a better solution.
-+ *
-+ * NB: There are buggy callers of this function. Ideally we
-+ * should warn if prev_state != CONTEXT_USER, but that will trigger
-+ * too frequently to make sense yet.
-+ */
-+ enum ctx_state prev_state = exception_enter();
-+ schedule();
-+ exception_exit(prev_state);
-+}
-+#endif
-+
-+/**
-+ * schedule_preempt_disabled - called with preemption disabled
-+ *
-+ * Returns with preemption disabled. Note: preempt_count must be 1
-+ */
-+void __sched schedule_preempt_disabled(void)
-+{
-+ sched_preempt_enable_no_resched();
-+ schedule();
-+ preempt_disable();
-+}
-+
-+#ifdef CONFIG_PREEMPT_RT
-+void __sched notrace schedule_rtlock(void)
-+{
-+ do {
-+ preempt_disable();
-+ __schedule(SM_RTLOCK_WAIT);
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+}
-+NOKPROBE_SYMBOL(schedule_rtlock);
-+#endif
-+
-+static void __sched notrace preempt_schedule_common(void)
-+{
-+ do {
-+ /*
-+ * Because the function tracer can trace preempt_count_sub()
-+ * and it also uses preempt_enable/disable_notrace(), if
-+ * NEED_RESCHED is set, the preempt_enable_notrace() called
-+ * by the function tracer will call this function again and
-+ * cause infinite recursion.
-+ *
-+ * Preemption must be disabled here before the function
-+ * tracer can trace. Break up preempt_disable() into two
-+ * calls. One to disable preemption without fear of being
-+ * traced. The other to still record the preemption latency,
-+ * which can also be traced by the function tracer.
-+ */
-+ preempt_disable_notrace();
-+ preempt_latency_start(1);
-+ __schedule(SM_PREEMPT);
-+ preempt_latency_stop(1);
-+ preempt_enable_no_resched_notrace();
-+
-+ /*
-+ * Check again in case we missed a preemption opportunity
-+ * between schedule and now.
-+ */
-+ } while (need_resched());
-+}
-+
-+#ifdef CONFIG_PREEMPTION
-+/*
-+ * This is the entry point to schedule() from in-kernel preemption
-+ * off of preempt_enable.
-+ */
-+asmlinkage __visible void __sched notrace preempt_schedule(void)
-+{
-+ /*
-+ * If there is a non-zero preempt_count or interrupts are disabled,
-+ * we do not want to preempt the current task. Just return..
-+ */
-+ if (likely(!preemptible()))
-+ return;
-+
-+ preempt_schedule_common();
-+}
-+NOKPROBE_SYMBOL(preempt_schedule);
-+EXPORT_SYMBOL(preempt_schedule);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#ifndef preempt_schedule_dynamic_enabled
-+#define preempt_schedule_dynamic_enabled preempt_schedule
-+#define preempt_schedule_dynamic_disabled NULL
-+#endif
-+DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled);
-+EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
-+void __sched notrace dynamic_preempt_schedule(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
-+ return;
-+ preempt_schedule();
-+}
-+NOKPROBE_SYMBOL(dynamic_preempt_schedule);
-+EXPORT_SYMBOL(dynamic_preempt_schedule);
-+#endif
-+#endif
-+
-+/**
-+ * preempt_schedule_notrace - preempt_schedule called by tracing
-+ *
-+ * The tracing infrastructure uses preempt_enable_notrace to prevent
-+ * recursion and tracing preempt enabling caused by the tracing
-+ * infrastructure itself. But as tracing can happen in areas coming
-+ * from userspace or just about to enter userspace, a preempt enable
-+ * can occur before user_exit() is called. This will cause the scheduler
-+ * to be called when the system is still in usermode.
-+ *
-+ * To prevent this, the preempt_enable_notrace will use this function
-+ * instead of preempt_schedule() to exit user context if needed before
-+ * calling the scheduler.
-+ */
-+asmlinkage __visible void __sched notrace preempt_schedule_notrace(void)
-+{
-+ enum ctx_state prev_ctx;
-+
-+ if (likely(!preemptible()))
-+ return;
-+
-+ do {
-+ /*
-+ * Because the function tracer can trace preempt_count_sub()
-+ * and it also uses preempt_enable/disable_notrace(), if
-+ * NEED_RESCHED is set, the preempt_enable_notrace() called
-+ * by the function tracer will call this function again and
-+ * cause infinite recursion.
-+ *
-+ * Preemption must be disabled here before the function
-+ * tracer can trace. Break up preempt_disable() into two
-+ * calls. One to disable preemption without fear of being
-+ * traced. The other to still record the preemption latency,
-+ * which can also be traced by the function tracer.
-+ */
-+ preempt_disable_notrace();
-+ preempt_latency_start(1);
-+ /*
-+ * Needs preempt disabled in case user_exit() is traced
-+ * and the tracer calls preempt_enable_notrace() causing
-+ * an infinite recursion.
-+ */
-+ prev_ctx = exception_enter();
-+ __schedule(SM_PREEMPT);
-+ exception_exit(prev_ctx);
-+
-+ preempt_latency_stop(1);
-+ preempt_enable_no_resched_notrace();
-+ } while (need_resched());
-+}
-+EXPORT_SYMBOL_GPL(preempt_schedule_notrace);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#ifndef preempt_schedule_notrace_dynamic_enabled
-+#define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace
-+#define preempt_schedule_notrace_dynamic_disabled NULL
-+#endif
-+DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled);
-+EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
-+void __sched notrace dynamic_preempt_schedule_notrace(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
-+ return;
-+ preempt_schedule_notrace();
-+}
-+NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
-+EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
-+#endif
-+#endif
-+
-+#endif /* CONFIG_PREEMPTION */
-+
-+/*
-+ * This is the entry point to schedule() from kernel preemption
-+ * off of irq context.
-+ * Note that this is called and returns with irqs disabled. This will
-+ * protect us against recursive calling from irq.
-+ */
-+asmlinkage __visible void __sched preempt_schedule_irq(void)
-+{
-+ enum ctx_state prev_state;
-+
-+ /* Catch callers which need to be fixed */
-+ BUG_ON(preempt_count() || !irqs_disabled());
-+
-+ prev_state = exception_enter();
-+
-+ do {
-+ preempt_disable();
-+ local_irq_enable();
-+ __schedule(SM_PREEMPT);
-+ local_irq_disable();
-+ sched_preempt_enable_no_resched();
-+ } while (need_resched());
-+
-+ exception_exit(prev_state);
-+}
-+
-+int default_wake_function(wait_queue_entry_t *curr, unsigned mode, int wake_flags,
-+ void *key)
-+{
-+ WARN_ON_ONCE(IS_ENABLED(CONFIG_SCHED_DEBUG) && wake_flags & ~WF_SYNC);
-+ return try_to_wake_up(curr->private, mode, wake_flags);
-+}
-+EXPORT_SYMBOL(default_wake_function);
-+
-+static inline void check_task_changed(struct task_struct *p, struct rq *rq)
-+{
-+ /* Trigger resched if task sched_prio has been modified. */
-+ if (task_on_rq_queued(p)) {
-+ int idx;
-+
-+ update_rq_clock(rq);
-+ idx = task_sched_prio_idx(p, rq);
-+ if (idx != p->sq_idx) {
-+ requeue_task(p, rq, idx);
-+ check_preempt_curr(rq);
-+ }
-+ }
-+}
-+
-+static void __setscheduler_prio(struct task_struct *p, int prio)
-+{
-+ p->prio = prio;
-+}
-+
-+#ifdef CONFIG_RT_MUTEXES
-+
-+static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
-+{
-+ if (pi_task)
-+ prio = min(prio, pi_task->prio);
-+
-+ return prio;
-+}
-+
-+static inline int rt_effective_prio(struct task_struct *p, int prio)
-+{
-+ struct task_struct *pi_task = rt_mutex_get_top_task(p);
-+
-+ return __rt_effective_prio(pi_task, prio);
-+}
-+
-+/*
-+ * rt_mutex_setprio - set the current priority of a task
-+ * @p: task to boost
-+ * @pi_task: donor task
-+ *
-+ * This function changes the 'effective' priority of a task. It does
-+ * not touch ->normal_prio like __setscheduler().
-+ *
-+ * Used by the rt_mutex code to implement priority inheritance
-+ * logic. Call site only calls if the priority of the task changed.
-+ */
-+void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
-+{
-+ int prio;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ /* XXX used to be waiter->prio, not waiter->task->prio */
-+ prio = __rt_effective_prio(pi_task, p->normal_prio);
-+
-+ /*
-+ * If nothing changed; bail early.
-+ */
-+ if (p->pi_top_task == pi_task && prio == p->prio)
-+ return;
-+
-+ rq = __task_access_lock(p, &lock);
-+ /*
-+ * Set under pi_lock && rq->lock, such that the value can be used under
-+ * either lock.
-+ *
-+	 * Note that it takes a load of trickery to make this pointer cache work
-+ * right. rt_mutex_slowunlock()+rt_mutex_postunlock() work together to
-+ * ensure a task is de-boosted (pi_task is set to NULL) before the
-+ * task is allowed to run again (and can exit). This ensures the pointer
-+ * points to a blocked task -- which guarantees the task is present.
-+ */
-+ p->pi_top_task = pi_task;
-+
-+ /*
-+ * For FIFO/RR we only need to set prio, if that matches we're done.
-+ */
-+ if (prio == p->prio)
-+ goto out_unlock;
-+
-+ /*
-+ * Idle task boosting is a nono in general. There is one
-+ * exception, when PREEMPT_RT and NOHZ is active:
-+ *
-+ * The idle task calls get_next_timer_interrupt() and holds
-+ * the timer wheel base->lock on the CPU and another CPU wants
-+ * to access the timer (probably to cancel it). We can safely
-+ * ignore the boosting request, as the idle CPU runs this code
-+ * with interrupts disabled and will complete the lock
-+ * protected section without being interrupted. So there is no
-+ * real need to boost.
-+ */
-+ if (unlikely(p == rq->idle)) {
-+ WARN_ON(p != rq->curr);
-+ WARN_ON(p->pi_blocked_on);
-+ goto out_unlock;
-+ }
-+
-+ trace_sched_pi_setprio(p, pi_task);
-+
-+ __setscheduler_prio(p, prio);
-+
-+ check_task_changed(p, rq);
-+out_unlock:
-+ /* Avoid rq from going away on us: */
-+ preempt_disable();
-+
-+ __balance_callbacks(rq);
-+ __task_access_unlock(p, lock);
-+
-+ preempt_enable();
-+}
-+#else
-+static inline int rt_effective_prio(struct task_struct *p, int prio)
-+{
-+ return prio;
-+}
-+#endif
-+
-+void set_user_nice(struct task_struct *p, long nice)
-+{
-+ unsigned long flags;
-+ struct rq *rq;
-+ raw_spinlock_t *lock;
-+
-+ if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
-+ return;
-+ /*
-+ * We have to be careful, if called from sys_setpriority(),
-+ * the task might be in the middle of scheduling on another CPU.
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+ rq = __task_access_lock(p, &lock);
-+
-+ p->static_prio = NICE_TO_PRIO(nice);
-+ /*
-+ * The RT priorities are set via sched_setscheduler(), but we still
-+ * allow the 'normal' nice value to be set - but as expected
-+	 * it won't have any effect on scheduling until the task
-+	 * becomes SCHED_NORMAL/SCHED_BATCH:
-+ */
-+ if (task_has_rt_policy(p))
-+ goto out_unlock;
-+
-+ p->prio = effective_prio(p);
-+
-+ check_task_changed(p, rq);
-+out_unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+}
-+EXPORT_SYMBOL(set_user_nice);
-+
-+/*
-+ * is_nice_reduction - check if nice value is an actual reduction
-+ *
-+ * Similar to can_nice() but does not perform a capability check.
-+ *
-+ * @p: task
-+ * @nice: nice value
-+ */
-+static bool is_nice_reduction(const struct task_struct *p, const int nice)
-+{
-+ /* Convert nice value [19,-20] to rlimit style value [1,40]: */
-+ int nice_rlim = nice_to_rlimit(nice);
-+
-+ return (nice_rlim <= task_rlimit(p, RLIMIT_NICE));
-+}
-+
-+/*
-+ * can_nice - check if a task can reduce its nice value
-+ * @p: task
-+ * @nice: nice value
-+ */
-+int can_nice(const struct task_struct *p, const int nice)
-+{
-+ return is_nice_reduction(p, nice) || capable(CAP_SYS_NICE);
-+}
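For reference, a worked example of the rlimit conversion used above: with the usual nice_to_rlimit() definition (MAX_NICE - nice + 1, where MAX_NICE is 19; the helper itself is not part of this hunk), nice 19 maps to 1, nice 0 to 20 and nice -20 to 40, so an unprivileged task whose RLIMIT_NICE is 20 may lower its nice value to 0 but needs CAP_SYS_NICE to go below that.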
-+
-+#ifdef __ARCH_WANT_SYS_NICE
-+
-+/*
-+ * sys_nice - change the priority of the current process.
-+ * @increment: priority increment
-+ *
-+ * sys_setpriority is a more generic, but much slower function that
-+ * does similar things.
-+ */
-+SYSCALL_DEFINE1(nice, int, increment)
-+{
-+ long nice, retval;
-+
-+ /*
-+ * Setpriority might change our priority at the same moment.
-+ * We don't have to worry. Conceptually one call occurs first
-+ * and we have a single winner.
-+ */
-+
-+ increment = clamp(increment, -NICE_WIDTH, NICE_WIDTH);
-+ nice = task_nice(current) + increment;
-+
-+ nice = clamp_val(nice, MIN_NICE, MAX_NICE);
-+ if (increment < 0 && !can_nice(current, nice))
-+ return -EPERM;
-+
-+ retval = security_task_setnice(current, nice);
-+ if (retval)
-+ return retval;
-+
-+ set_user_nice(current, nice);
-+ return 0;
-+}
-+
-+#endif
-+
-+/**
-+ * task_prio - return the priority value of a given task.
-+ * @p: the task in question.
-+ *
-+ * Return: The priority value as seen by users in /proc.
-+ *
-+ * sched policy              return value    kernel prio     user prio/nice
-+ *
-+ * (BMQ)normal, batch, idle  [0 ... 53]      [100 ... 139]   0/[-20 ... 19]/[-7 ... 7]
-+ * (PDS)normal, batch, idle  [0 ... 39]      100             0/[-20 ... 19]
-+ * fifo, rr                  [-1 ... -100]   [99 ... 0]      [0 ... 99]
-+ */
-+int task_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MAX_RT_PRIO) ? p->prio - MAX_RT_PRIO :
-+ task_sched_prio_normal(p, task_rq(p));
-+}
-+
-+/**
-+ * idle_cpu - is a given CPU idle currently?
-+ * @cpu: the processor in question.
-+ *
-+ * Return: 1 if the CPU is currently idle. 0 otherwise.
-+ */
-+int idle_cpu(int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ if (rq->curr != rq->idle)
-+ return 0;
-+
-+ if (rq->nr_running)
-+ return 0;
-+
-+#ifdef CONFIG_SMP
-+ if (rq->ttwu_pending)
-+ return 0;
-+#endif
-+
-+ return 1;
-+}
-+
-+/**
-+ * idle_task - return the idle task for a given CPU.
-+ * @cpu: the processor in question.
-+ *
-+ * Return: The idle task for the cpu @cpu.
-+ */
-+struct task_struct *idle_task(int cpu)
-+{
-+ return cpu_rq(cpu)->idle;
-+}
-+
-+/**
-+ * find_process_by_pid - find a process with a matching PID value.
-+ * @pid: the pid in question.
-+ *
-+ * The task of @pid, if found. %NULL otherwise.
-+ */
-+static inline struct task_struct *find_process_by_pid(pid_t pid)
-+{
-+ return pid ? find_task_by_vpid(pid) : current;
-+}
-+
-+/*
-+ * sched_setparam() passes in -1 for its policy, to let the functions
-+ * it calls know not to change it.
-+ */
-+#define SETPARAM_POLICY -1
-+
-+static void __setscheduler_params(struct task_struct *p,
-+ const struct sched_attr *attr)
-+{
-+ int policy = attr->sched_policy;
-+
-+ if (policy == SETPARAM_POLICY)
-+ policy = p->policy;
-+
-+ p->policy = policy;
-+
-+ /*
-+	 * Allow the normal nice value to be set, but it will not have any
-+	 * effect on scheduling until the task becomes SCHED_NORMAL/
-+	 * SCHED_BATCH.
-+ */
-+ p->static_prio = NICE_TO_PRIO(attr->sched_nice);
-+
-+ /*
-+ * __sched_setscheduler() ensures attr->sched_priority == 0 when
-+ * !rt_policy. Always setting this ensures that things like
-+ * getparam()/getattr() don't report silly values for !rt tasks.
-+ */
-+ p->rt_priority = attr->sched_priority;
-+ p->normal_prio = normal_prio(p);
-+}
-+
-+/*
-+ * check the target process has a UID that matches the current process's
-+ */
-+static bool check_same_owner(struct task_struct *p)
-+{
-+ const struct cred *cred = current_cred(), *pcred;
-+ bool match;
-+
-+ rcu_read_lock();
-+ pcred = __task_cred(p);
-+ match = (uid_eq(cred->euid, pcred->euid) ||
-+ uid_eq(cred->euid, pcred->uid));
-+ rcu_read_unlock();
-+ return match;
-+}
-+
-+/*
-+ * Allow unprivileged RT tasks to decrease priority.
-+ * Only issue a capable test if needed and only once to avoid an audit
-+ * event on permitted non-privileged operations:
-+ */
-+static int user_check_sched_setscheduler(struct task_struct *p,
-+ const struct sched_attr *attr,
-+ int policy, int reset_on_fork)
-+{
-+ if (rt_policy(policy)) {
-+ unsigned long rlim_rtprio = task_rlimit(p, RLIMIT_RTPRIO);
-+
-+ /* Can't set/change the rt policy: */
-+ if (policy != p->policy && !rlim_rtprio)
-+ goto req_priv;
-+
-+ /* Can't increase priority: */
-+ if (attr->sched_priority > p->rt_priority &&
-+ attr->sched_priority > rlim_rtprio)
-+ goto req_priv;
-+ }
-+
-+ /* Can't change other user's priorities: */
-+ if (!check_same_owner(p))
-+ goto req_priv;
-+
-+ /* Normal users shall not reset the sched_reset_on_fork flag: */
-+ if (p->sched_reset_on_fork && !reset_on_fork)
-+ goto req_priv;
-+
-+ return 0;
-+
-+req_priv:
-+ if (!capable(CAP_SYS_NICE))
-+ return -EPERM;
-+
-+ return 0;
-+}
-+
-+static int __sched_setscheduler(struct task_struct *p,
-+ const struct sched_attr *attr,
-+ bool user, bool pi)
-+{
-+ const struct sched_attr dl_squash_attr = {
-+ .size = sizeof(struct sched_attr),
-+ .sched_policy = SCHED_FIFO,
-+ .sched_nice = 0,
-+ .sched_priority = 99,
-+ };
-+ int oldpolicy = -1, policy = attr->sched_policy;
-+ int retval, newprio;
-+ struct balance_callback *head;
-+ unsigned long flags;
-+ struct rq *rq;
-+ int reset_on_fork;
-+ raw_spinlock_t *lock;
-+
-+ /* The pi code expects interrupts enabled */
-+ BUG_ON(pi && in_interrupt());
-+
-+ /*
-+	 * Alt schedule FW supports SCHED_DEADLINE by squashing it into prio 0 SCHED_FIFO
-+ */
-+ if (unlikely(SCHED_DEADLINE == policy)) {
-+ attr = &dl_squash_attr;
-+ policy = attr->sched_policy;
-+ }
-+recheck:
-+ /* Double check policy once rq lock held */
-+ if (policy < 0) {
-+ reset_on_fork = p->sched_reset_on_fork;
-+ policy = oldpolicy = p->policy;
-+ } else {
-+ reset_on_fork = !!(attr->sched_flags & SCHED_RESET_ON_FORK);
-+
-+ if (policy > SCHED_IDLE)
-+ return -EINVAL;
-+ }
-+
-+ if (attr->sched_flags & ~(SCHED_FLAG_ALL))
-+ return -EINVAL;
-+
-+ /*
-+ * Valid priorities for SCHED_FIFO and SCHED_RR are
-+ * 1..MAX_RT_PRIO-1, valid priority for SCHED_NORMAL and
-+ * SCHED_BATCH and SCHED_IDLE is 0.
-+ */
-+ if (attr->sched_priority < 0 ||
-+ (p->mm && attr->sched_priority > MAX_RT_PRIO - 1) ||
-+ (!p->mm && attr->sched_priority > MAX_RT_PRIO - 1))
-+ return -EINVAL;
-+ if ((SCHED_RR == policy || SCHED_FIFO == policy) !=
-+ (attr->sched_priority != 0))
-+ return -EINVAL;
-+
-+ if (user) {
-+ retval = user_check_sched_setscheduler(p, attr, policy, reset_on_fork);
-+ if (retval)
-+ return retval;
-+
-+ retval = security_task_setscheduler(p);
-+ if (retval)
-+ return retval;
-+ }
-+
-+ if (pi)
-+ cpuset_read_lock();
-+
-+ /*
-+ * Make sure no PI-waiters arrive (or leave) while we are
-+ * changing the priority of the task:
-+ */
-+ raw_spin_lock_irqsave(&p->pi_lock, flags);
-+
-+ /*
-+ * To be able to change p->policy safely, task_access_lock()
-+ * must be called.
-+	 * If task_access_lock() is used here:
-+ * For the task p which is not running, reading rq->stop is
-+ * racy but acceptable as ->stop doesn't change much.
-+	 * An enhancement can be made to read rq->stop safely.
-+ */
-+ rq = __task_access_lock(p, &lock);
-+
-+ /*
-+	 * Changing the policy of the stop thread is a very bad idea
-+ */
-+ if (p == rq->stop) {
-+ retval = -EINVAL;
-+ goto unlock;
-+ }
-+
-+ /*
-+ * If not changing anything there's no need to proceed further:
-+ */
-+ if (unlikely(policy == p->policy)) {
-+ if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
-+ goto change;
-+ if (!rt_policy(policy) &&
-+ NICE_TO_PRIO(attr->sched_nice) != p->static_prio)
-+ goto change;
-+
-+ p->sched_reset_on_fork = reset_on_fork;
-+ retval = 0;
-+ goto unlock;
-+ }
-+change:
-+
-+ /* Re-check policy now with rq lock held */
-+ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
-+ policy = oldpolicy = -1;
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+ if (pi)
-+ cpuset_read_unlock();
-+ goto recheck;
-+ }
-+
-+ p->sched_reset_on_fork = reset_on_fork;
-+
-+ newprio = __normal_prio(policy, attr->sched_priority, NICE_TO_PRIO(attr->sched_nice));
-+ if (pi) {
-+ /*
-+ * Take priority boosted tasks into account. If the new
-+ * effective priority is unchanged, we just store the new
-+ * normal parameters and do not touch the scheduler class and
-+		 * the runqueue. This will be done when the task deboosts
-+ * itself.
-+ */
-+ newprio = rt_effective_prio(p, newprio);
-+ }
-+
-+ if (!(attr->sched_flags & SCHED_FLAG_KEEP_PARAMS)) {
-+ __setscheduler_params(p, attr);
-+ __setscheduler_prio(p, newprio);
-+ }
-+
-+ check_task_changed(p, rq);
-+
-+ /* Avoid rq from going away on us: */
-+ preempt_disable();
-+ head = splice_balance_callbacks(rq);
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+
-+ if (pi) {
-+ cpuset_read_unlock();
-+ rt_mutex_adjust_pi(p);
-+ }
-+
-+ /* Run balance callbacks after we've adjusted the PI chain: */
-+ balance_callbacks(rq, head);
-+ preempt_enable();
-+
-+ return 0;
-+
-+unlock:
-+ __task_access_unlock(p, lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
-+ if (pi)
-+ cpuset_read_unlock();
-+ return retval;
-+}
-+
-+static int _sched_setscheduler(struct task_struct *p, int policy,
-+ const struct sched_param *param, bool check)
-+{
-+ struct sched_attr attr = {
-+ .sched_policy = policy,
-+ .sched_priority = param->sched_priority,
-+ .sched_nice = PRIO_TO_NICE(p->static_prio),
-+ };
-+
-+ /* Fixup the legacy SCHED_RESET_ON_FORK hack. */
-+ if ((policy != SETPARAM_POLICY) && (policy & SCHED_RESET_ON_FORK)) {
-+ attr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-+ policy &= ~SCHED_RESET_ON_FORK;
-+ attr.sched_policy = policy;
-+ }
-+
-+ return __sched_setscheduler(p, &attr, check, true);
-+}
-+
-+/**
-+ * sched_setscheduler - change the scheduling policy and/or RT priority of a thread.
-+ * @p: the task in question.
-+ * @policy: new policy.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Use sched_set_fifo(), read its comment.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ *
-+ * NOTE that the task may be already dead.
-+ */
-+int sched_setscheduler(struct task_struct *p, int policy,
-+ const struct sched_param *param)
-+{
-+ return _sched_setscheduler(p, policy, param, true);
-+}
-+
-+int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
-+{
-+ return __sched_setscheduler(p, attr, true, true);
-+}
-+
-+int sched_setattr_nocheck(struct task_struct *p, const struct sched_attr *attr)
-+{
-+ return __sched_setscheduler(p, attr, false, true);
-+}
-+EXPORT_SYMBOL_GPL(sched_setattr_nocheck);
-+
-+/**
-+ * sched_setscheduler_nocheck - change the scheduling policy and/or RT priority of a thread from kernelspace.
-+ * @p: the task in question.
-+ * @policy: new policy.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Just like sched_setscheduler, only don't bother checking if the
-+ * current context has permission. For example, this is needed in
-+ * stop_machine(): we create temporary high priority worker threads,
-+ * but our caller might not have that capability.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+int sched_setscheduler_nocheck(struct task_struct *p, int policy,
-+ const struct sched_param *param)
-+{
-+ return _sched_setscheduler(p, policy, param, false);
-+}
-+
-+/*
-+ * SCHED_FIFO is a broken scheduler model; that is, it is fundamentally
-+ * incapable of resource management, which is the one thing an OS really should
-+ * be doing.
-+ *
-+ * This is of course the reason it is limited to privileged users only.
-+ *
-+ * Worse still; it is fundamentally impossible to compose static priority
-+ * workloads. You cannot take two correctly working static prio workloads
-+ * and smash them together and still expect them to work.
-+ *
-+ * For this reason 'all' FIFO tasks the kernel creates are basically at:
-+ *
-+ * MAX_RT_PRIO / 2
-+ *
-+ * The administrator _MUST_ configure the system, the kernel simply doesn't
-+ * know enough information to make a sensible choice.
-+ */
-+void sched_set_fifo(struct task_struct *p)
-+{
-+ struct sched_param sp = { .sched_priority = MAX_RT_PRIO / 2 };
-+ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_fifo);
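As a rough usage sketch, not taken from this patch, this is how a driver kthread would typically consume the helper exported above; the worker name and the placeholder loop are invented for illustration:

    #include <linux/jiffies.h>
    #include <linux/kthread.h>
    #include <linux/sched.h>

    static int my_rt_worker(void *data)	/* hypothetical kthread body */
    {
    	/* Ask for a mid-range FIFO priority; the core picks the actual value. */
    	sched_set_fifo(current);

    	while (!kthread_should_stop())
    		schedule_timeout_interruptible(HZ);	/* placeholder for real work */

    	return 0;
    }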
-+
-+/*
-+ * For when you don't much care about FIFO, but want to be above SCHED_NORMAL.
-+ */
-+void sched_set_fifo_low(struct task_struct *p)
-+{
-+ struct sched_param sp = { .sched_priority = 1 };
-+ WARN_ON_ONCE(sched_setscheduler_nocheck(p, SCHED_FIFO, &sp) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_fifo_low);
-+
-+void sched_set_normal(struct task_struct *p, int nice)
-+{
-+ struct sched_attr attr = {
-+ .sched_policy = SCHED_NORMAL,
-+ .sched_nice = nice,
-+ };
-+ WARN_ON_ONCE(sched_setattr_nocheck(p, &attr) != 0);
-+}
-+EXPORT_SYMBOL_GPL(sched_set_normal);
-+
-+static int
-+do_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
-+{
-+ struct sched_param lparam;
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!param || pid < 0)
-+ return -EINVAL;
-+ if (copy_from_user(&lparam, param, sizeof(struct sched_param)))
-+ return -EFAULT;
-+
-+ rcu_read_lock();
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (likely(p))
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (likely(p)) {
-+ retval = sched_setscheduler(p, policy, &lparam);
-+ put_task_struct(p);
-+ }
-+
-+ return retval;
-+}
-+
-+/*
-+ * Mimics kernel/events/core.c perf_copy_attr().
-+ */
-+static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *attr)
-+{
-+ u32 size;
-+ int ret;
-+
-+ /* Zero the full structure, so that a short copy will be nice: */
-+ memset(attr, 0, sizeof(*attr));
-+
-+ ret = get_user(size, &uattr->size);
-+ if (ret)
-+ return ret;
-+
-+ /* ABI compatibility quirk: */
-+ if (!size)
-+ size = SCHED_ATTR_SIZE_VER0;
-+
-+ if (size < SCHED_ATTR_SIZE_VER0 || size > PAGE_SIZE)
-+ goto err_size;
-+
-+ ret = copy_struct_from_user(attr, sizeof(*attr), uattr, size);
-+ if (ret) {
-+ if (ret == -E2BIG)
-+ goto err_size;
-+ return ret;
-+ }
-+
-+ /*
-+ * XXX: Do we want to be lenient like existing syscalls; or do we want
-+ * to be strict and return an error on out-of-bounds values?
-+ */
-+ attr->sched_nice = clamp(attr->sched_nice, -20, 19);
-+
-+ /* sched/core.c uses zero here but we already know ret is zero */
-+ return 0;
-+
-+err_size:
-+ put_user(sizeof(*attr), &uattr->size);
-+ return -E2BIG;
-+}
-+
-+/**
-+ * sys_sched_setscheduler - set/change the scheduler policy and RT priority
-+ * @pid: the pid in question.
-+ * @policy: new policy.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+SYSCALL_DEFINE3(sched_setscheduler, pid_t, pid, int, policy, struct sched_param __user *, param)
-+{
-+ if (policy < 0)
-+ return -EINVAL;
-+
-+ return do_sched_setscheduler(pid, policy, param);
-+}
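From user space this syscall is normally reached through the sched_setscheduler(2) wrapper; a minimal, illustrative caller, which needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO as enforced by user_check_sched_setscheduler() above, could look like this:

    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
    	struct sched_param sp = { .sched_priority = 10 };

    	/* pid 0 means the calling thread */
    	if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
    		perror("sched_setscheduler");
    	return 0;
    }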
-+
-+/**
-+ * sys_sched_setparam - set/change the RT priority of a thread
-+ * @pid: the pid in question.
-+ * @param: structure containing the new RT priority.
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+SYSCALL_DEFINE2(sched_setparam, pid_t, pid, struct sched_param __user *, param)
-+{
-+ return do_sched_setscheduler(pid, SETPARAM_POLICY, param);
-+}
-+
-+/**
-+ * sys_sched_setattr - same as above, but with extended sched_attr
-+ * @pid: the pid in question.
-+ * @uattr: structure containing the extended parameters.
-+ */
-+SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
-+ unsigned int, flags)
-+{
-+ struct sched_attr attr;
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!uattr || pid < 0 || flags)
-+ return -EINVAL;
-+
-+ retval = sched_copy_attr(uattr, &attr);
-+ if (retval)
-+ return retval;
-+
-+ if ((int)attr.sched_policy < 0)
-+ return -EINVAL;
-+
-+ rcu_read_lock();
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (likely(p))
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (likely(p)) {
-+ retval = sched_setattr(p, &attr);
-+ put_task_struct(p);
-+ }
-+
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getscheduler - get the policy (scheduling class) of a thread
-+ * @pid: the pid in question.
-+ *
-+ * Return: On success, the policy of the thread. Otherwise, a negative error
-+ * code.
-+ */
-+SYSCALL_DEFINE1(sched_getscheduler, pid_t, pid)
-+{
-+ struct task_struct *p;
-+ int retval = -EINVAL;
-+
-+ if (pid < 0)
-+ goto out_nounlock;
-+
-+ retval = -ESRCH;
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ if (p) {
-+ retval = security_task_getscheduler(p);
-+ if (!retval)
-+ retval = p->policy;
-+ }
-+ rcu_read_unlock();
-+
-+out_nounlock:
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getparam - get the RT priority of a thread
-+ * @pid: the pid in question.
-+ * @param: structure containing the RT priority.
-+ *
-+ * Return: On success, 0 and the RT priority is in @param. Otherwise, an error
-+ * code.
-+ */
-+SYSCALL_DEFINE2(sched_getparam, pid_t, pid, struct sched_param __user *, param)
-+{
-+ struct sched_param lp = { .sched_priority = 0 };
-+ struct task_struct *p;
-+ int retval = -EINVAL;
-+
-+ if (!param || pid < 0)
-+ goto out_nounlock;
-+
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ retval = -ESRCH;
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ if (task_has_rt_policy(p))
-+ lp.sched_priority = p->rt_priority;
-+ rcu_read_unlock();
-+
-+ /*
-+ * This one might sleep, we cannot do it with a spinlock held ...
-+ */
-+ retval = copy_to_user(param, &lp, sizeof(*param)) ? -EFAULT : 0;
-+
-+out_nounlock:
-+ return retval;
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
-+
-+/*
-+ * Copy the kernel size attribute structure (which might be larger
-+ * than what user-space knows about) to user-space.
-+ *
-+ * Note that all cases are valid: user-space buffer can be larger or
-+ * smaller than the kernel-space buffer. The usual case is that both
-+ * have the same size.
-+ */
-+static int
-+sched_attr_copy_to_user(struct sched_attr __user *uattr,
-+ struct sched_attr *kattr,
-+ unsigned int usize)
-+{
-+ unsigned int ksize = sizeof(*kattr);
-+
-+ if (!access_ok(uattr, usize))
-+ return -EFAULT;
-+
-+ /*
-+ * sched_getattr() ABI forwards and backwards compatibility:
-+ *
-+ * If usize == ksize then we just copy everything to user-space and all is good.
-+ *
-+ * If usize < ksize then we only copy as much as user-space has space for,
-+ * this keeps ABI compatibility as well. We skip the rest.
-+ *
-+ * If usize > ksize then user-space is using a newer version of the ABI,
-+ * which part the kernel doesn't know about. Just ignore it - tooling can
-+ * detect the kernel's knowledge of attributes from the attr->size value
-+ * which is set to ksize in this case.
-+ */
-+ kattr->size = min(usize, ksize);
-+
-+ if (copy_to_user(uattr, kattr, kattr->size))
-+ return -EFAULT;
-+
-+ return 0;
-+}
-+
-+/**
-+ * sys_sched_getattr - similar to sched_getparam, but with sched_attr
-+ * @pid: the pid in question.
-+ * @uattr: structure containing the extended parameters.
-+ * @usize: sizeof(attr) for fwd/bwd comp.
-+ * @flags: for future extension.
-+ */
-+SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
-+ unsigned int, usize, unsigned int, flags)
-+{
-+ struct sched_attr kattr = { };
-+ struct task_struct *p;
-+ int retval;
-+
-+ if (!uattr || pid < 0 || usize > PAGE_SIZE ||
-+ usize < SCHED_ATTR_SIZE_VER0 || flags)
-+ return -EINVAL;
-+
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ retval = -ESRCH;
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ kattr.sched_policy = p->policy;
-+ if (p->sched_reset_on_fork)
-+ kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-+ if (task_has_rt_policy(p))
-+ kattr.sched_priority = p->rt_priority;
-+ else
-+ kattr.sched_nice = task_nice(p);
-+ kattr.sched_flags &= SCHED_FLAG_ALL;
-+
-+#ifdef CONFIG_UCLAMP_TASK
-+ kattr.sched_util_min = p->uclamp_req[UCLAMP_MIN].value;
-+ kattr.sched_util_max = p->uclamp_req[UCLAMP_MAX].value;
-+#endif
-+
-+ rcu_read_unlock();
-+
-+ return sched_attr_copy_to_user(uattr, &kattr, usize);
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
-+
-+#ifdef CONFIG_SMP
-+int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
-+{
-+ return 0;
-+}
-+#endif
-+
-+static int
-+__sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
-+{
-+ int retval;
-+ cpumask_var_t cpus_allowed, new_mask;
-+
-+ if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
-+ retval = -ENOMEM;
-+ goto out_free_cpus_allowed;
-+ }
-+
-+ cpuset_cpus_allowed(p, cpus_allowed);
-+ cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
-+
-+ ctx->new_mask = new_mask;
-+ ctx->flags |= SCA_CHECK;
-+
-+ retval = __set_cpus_allowed_ptr(p, ctx);
-+ if (retval)
-+ goto out_free_new_mask;
-+
-+ cpuset_cpus_allowed(p, cpus_allowed);
-+ if (!cpumask_subset(new_mask, cpus_allowed)) {
-+ /*
-+ * We must have raced with a concurrent cpuset
-+ * update. Just reset the cpus_allowed to the
-+ * cpuset's cpus_allowed
-+ */
-+ cpumask_copy(new_mask, cpus_allowed);
-+
-+ /*
-+ * If SCA_USER is set, a 2nd call to __set_cpus_allowed_ptr()
-+ * will restore the previous user_cpus_ptr value.
-+ *
-+ * In the unlikely event a previous user_cpus_ptr exists,
-+ * we need to further restrict the mask to what is allowed
-+ * by that old user_cpus_ptr.
-+ */
-+ if (unlikely((ctx->flags & SCA_USER) && ctx->user_mask)) {
-+ bool empty = !cpumask_and(new_mask, new_mask,
-+ ctx->user_mask);
-+
-+ if (WARN_ON_ONCE(empty))
-+ cpumask_copy(new_mask, cpus_allowed);
-+ }
-+ __set_cpus_allowed_ptr(p, ctx);
-+ retval = -EINVAL;
-+ }
-+
-+out_free_new_mask:
-+ free_cpumask_var(new_mask);
-+out_free_cpus_allowed:
-+ free_cpumask_var(cpus_allowed);
-+ return retval;
-+}
-+
-+long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
-+{
-+ struct affinity_context ac;
-+ struct cpumask *user_mask;
-+ struct task_struct *p;
-+ int retval;
-+
-+ rcu_read_lock();
-+
-+ p = find_process_by_pid(pid);
-+ if (!p) {
-+ rcu_read_unlock();
-+ return -ESRCH;
-+ }
-+
-+ /* Prevent p going away */
-+ get_task_struct(p);
-+ rcu_read_unlock();
-+
-+ if (p->flags & PF_NO_SETAFFINITY) {
-+ retval = -EINVAL;
-+ goto out_put_task;
-+ }
-+
-+ if (!check_same_owner(p)) {
-+ rcu_read_lock();
-+ if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
-+ rcu_read_unlock();
-+ retval = -EPERM;
-+ goto out_put_task;
-+ }
-+ rcu_read_unlock();
-+ }
-+
-+ retval = security_task_setscheduler(p);
-+ if (retval)
-+ goto out_put_task;
-+
-+ /*
-+ * With non-SMP configs, user_cpus_ptr/user_mask isn't used and
-+ * alloc_user_cpus_ptr() returns NULL.
-+ */
-+ user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
-+ if (user_mask) {
-+ cpumask_copy(user_mask, in_mask);
-+ } else if (IS_ENABLED(CONFIG_SMP)) {
-+ retval = -ENOMEM;
-+ goto out_put_task;
-+ }
-+
-+ ac = (struct affinity_context){
-+ .new_mask = in_mask,
-+ .user_mask = user_mask,
-+ .flags = SCA_USER,
-+ };
-+
-+ retval = __sched_setaffinity(p, &ac);
-+ kfree(ac.user_mask);
-+
-+out_put_task:
-+ put_task_struct(p);
-+ return retval;
-+}
-+
-+static int get_user_cpu_mask(unsigned long __user *user_mask_ptr, unsigned len,
-+ struct cpumask *new_mask)
-+{
-+ if (len < cpumask_size())
-+ cpumask_clear(new_mask);
-+ else if (len > cpumask_size())
-+ len = cpumask_size();
-+
-+ return copy_from_user(new_mask, user_mask_ptr, len) ? -EFAULT : 0;
-+}
-+
-+/**
-+ * sys_sched_setaffinity - set the CPU affinity of a process
-+ * @pid: pid of the process
-+ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
-+ * @user_mask_ptr: user-space pointer to the new CPU mask
-+ *
-+ * Return: 0 on success. An error code otherwise.
-+ */
-+SYSCALL_DEFINE3(sched_setaffinity, pid_t, pid, unsigned int, len,
-+ unsigned long __user *, user_mask_ptr)
-+{
-+ cpumask_var_t new_mask;
-+ int retval;
-+
-+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ retval = get_user_cpu_mask(user_mask_ptr, len, new_mask);
-+ if (retval == 0)
-+ retval = sched_setaffinity(pid, new_mask);
-+ free_cpumask_var(new_mask);
-+ return retval;
-+}
-+
-+long sched_getaffinity(pid_t pid, cpumask_t *mask)
-+{
-+ struct task_struct *p;
-+ raw_spinlock_t *lock;
-+ unsigned long flags;
-+ int retval;
-+
-+ rcu_read_lock();
-+
-+ retval = -ESRCH;
-+ p = find_process_by_pid(pid);
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+
-+ task_access_lock_irqsave(p, &lock, &flags);
-+ cpumask_and(mask, &p->cpus_mask, cpu_active_mask);
-+ task_access_unlock_irqrestore(p, lock, &flags);
-+
-+out_unlock:
-+ rcu_read_unlock();
-+
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_getaffinity - get the CPU affinity of a process
-+ * @pid: pid of the process
-+ * @len: length in bytes of the bitmask pointed to by user_mask_ptr
-+ * @user_mask_ptr: user-space pointer to hold the current CPU mask
-+ *
-+ * Return: size of CPU mask copied to user_mask_ptr on success. An
-+ * error code otherwise.
-+ */
-+SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
-+ unsigned long __user *, user_mask_ptr)
-+{
-+ int ret;
-+ cpumask_var_t mask;
-+
-+ if ((len * BITS_PER_BYTE) < nr_cpu_ids)
-+ return -EINVAL;
-+ if (len & (sizeof(unsigned long)-1))
-+ return -EINVAL;
-+
-+ if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
-+ return -ENOMEM;
-+
-+ ret = sched_getaffinity(pid, mask);
-+ if (ret == 0) {
-+ unsigned int retlen = min(len, cpumask_size());
-+
-+ if (copy_to_user(user_mask_ptr, cpumask_bits(mask), retlen))
-+ ret = -EFAULT;
-+ else
-+ ret = retlen;
-+ }
-+ free_cpumask_var(mask);
-+
-+ return ret;
-+}
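For completeness, a hedged user-space sketch of the matching call; the raw syscall returns the number of mask bytes copied, as documented above, while the glibc sched_getaffinity() wrapper used here hides that and returns 0 on success:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
    	cpu_set_t set;

    	CPU_ZERO(&set);
    	if (sched_getaffinity(0, sizeof(set), &set) != 0) {	/* 0 == calling thread */
    		perror("sched_getaffinity");
    		return 1;
    	}
    	printf("CPU 0 allowed: %s\n", CPU_ISSET(0, &set) ? "yes" : "no");
    	return 0;
    }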
-+
-+static void do_sched_yield(void)
-+{
-+ struct rq *rq;
-+ struct rq_flags rf;
-+
-+ if (!sched_yield_type)
-+ return;
-+
-+ rq = this_rq_lock_irq(&rf);
-+
-+ schedstat_inc(rq->yld_count);
-+
-+ if (1 == sched_yield_type) {
-+ if (!rt_task(current))
-+ do_sched_yield_type_1(current, rq);
-+ } else if (2 == sched_yield_type) {
-+ if (rq->nr_running > 1)
-+ rq->skip = current;
-+ }
-+
-+ preempt_disable();
-+ raw_spin_unlock_irq(&rq->lock);
-+ sched_preempt_enable_no_resched();
-+
-+ schedule();
-+}
-+
-+/**
-+ * sys_sched_yield - yield the current processor to other threads.
-+ *
-+ * This function yields the current CPU to other tasks. If there are no
-+ * other threads running on this CPU then this function will return.
-+ *
-+ * Return: 0.
-+ */
-+SYSCALL_DEFINE0(sched_yield)
-+{
-+ do_sched_yield();
-+ return 0;
-+}
-+
-+#if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
-+int __sched __cond_resched(void)
-+{
-+ if (should_resched(0)) {
-+ preempt_schedule_common();
-+ return 1;
-+ }
-+ /*
-+ * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
-+ * whether the current CPU is in an RCU read-side critical section,
-+ * so the tick can report quiescent states even for CPUs looping
-+ * in kernel context. In contrast, in non-preemptible kernels,
-+ * RCU readers leave no in-memory hints, which means that CPU-bound
-+ * processes executing in kernel context might never report an
-+ * RCU quiescent state. Therefore, the following code causes
-+ * cond_resched() to report a quiescent state, but only when RCU
-+ * is in urgent need of one.
-+ */
-+#ifndef CONFIG_PREEMPT_RCU
-+ rcu_all_qs();
-+#endif
-+ return 0;
-+}
-+EXPORT_SYMBOL(__cond_resched);
-+#endif
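For orientation, the usual consumer of the helper above is a long-running kernel loop that offers a reschedule point on each iteration; a minimal sketch with a hypothetical element type and no real per-item work:

    #include <linux/list.h>
    #include <linux/sched.h>

    struct my_item {			/* hypothetical element type */
    	struct list_head node;
    };

    static void process_items(struct list_head *items)
    {
    	struct my_item *it;

    	list_for_each_entry(it, items, node) {
    		/* ... per-item work would go here ... */
    		cond_resched();		/* yield if a reschedule is pending */
    	}
    }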
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#define cond_resched_dynamic_enabled __cond_resched
-+#define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
-+DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
-+EXPORT_STATIC_CALL_TRAMP(cond_resched);
-+
-+#define might_resched_dynamic_enabled __cond_resched
-+#define might_resched_dynamic_disabled ((void *)&__static_call_return0)
-+DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
-+EXPORT_STATIC_CALL_TRAMP(might_resched);
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
-+int __sched dynamic_cond_resched(void)
-+{
-+ klp_sched_try_switch();
-+ if (!static_branch_unlikely(&sk_dynamic_cond_resched))
-+ return 0;
-+ return __cond_resched();
-+}
-+EXPORT_SYMBOL(dynamic_cond_resched);
-+
-+static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
-+int __sched dynamic_might_resched(void)
-+{
-+ if (!static_branch_unlikely(&sk_dynamic_might_resched))
-+ return 0;
-+ return __cond_resched();
-+}
-+EXPORT_SYMBOL(dynamic_might_resched);
-+#endif
-+#endif
-+
-+/*
-+ * __cond_resched_lock() - if a reschedule is pending, drop the given lock,
-+ * call schedule, and on return reacquire the lock.
-+ *
-+ * This works OK both with and without CONFIG_PREEMPTION. We do strange low-level
-+ * operations here to prevent schedule() from being called twice (once via
-+ * spin_unlock(), once by hand).
-+ */
-+int __cond_resched_lock(spinlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held(lock);
-+
-+ if (spin_needbreak(lock) || resched) {
-+ spin_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ spin_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_lock);
-+
-+int __cond_resched_rwlock_read(rwlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held_read(lock);
-+
-+ if (rwlock_needbreak(lock) || resched) {
-+ read_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ read_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_rwlock_read);
-+
-+int __cond_resched_rwlock_write(rwlock_t *lock)
-+{
-+ int resched = should_resched(PREEMPT_LOCK_OFFSET);
-+ int ret = 0;
-+
-+ lockdep_assert_held_write(lock);
-+
-+ if (rwlock_needbreak(lock) || resched) {
-+ write_unlock(lock);
-+ if (!_cond_resched())
-+ cpu_relax();
-+ ret = 1;
-+ write_lock(lock);
-+ }
-+ return ret;
-+}
-+EXPORT_SYMBOL(__cond_resched_rwlock_write);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+
-+#ifdef CONFIG_GENERIC_ENTRY
-+#include <linux/entry-common.h>
-+#endif
-+
-+/*
-+ * SC:cond_resched
-+ * SC:might_resched
-+ * SC:preempt_schedule
-+ * SC:preempt_schedule_notrace
-+ * SC:irqentry_exit_cond_resched
-+ *
-+ *
-+ * NONE:
-+ * cond_resched <- __cond_resched
-+ * might_resched <- RET0
-+ * preempt_schedule <- NOP
-+ * preempt_schedule_notrace <- NOP
-+ * irqentry_exit_cond_resched <- NOP
-+ *
-+ * VOLUNTARY:
-+ * cond_resched <- __cond_resched
-+ * might_resched <- __cond_resched
-+ * preempt_schedule <- NOP
-+ * preempt_schedule_notrace <- NOP
-+ * irqentry_exit_cond_resched <- NOP
-+ *
-+ * FULL:
-+ * cond_resched <- RET0
-+ * might_resched <- RET0
-+ * preempt_schedule <- preempt_schedule
-+ * preempt_schedule_notrace <- preempt_schedule_notrace
-+ * irqentry_exit_cond_resched <- irqentry_exit_cond_resched
-+ */
-+
-+enum {
-+ preempt_dynamic_undefined = -1,
-+ preempt_dynamic_none,
-+ preempt_dynamic_voluntary,
-+ preempt_dynamic_full,
-+};
-+
-+int preempt_dynamic_mode = preempt_dynamic_undefined;
-+
-+int sched_dynamic_mode(const char *str)
-+{
-+ if (!strcmp(str, "none"))
-+ return preempt_dynamic_none;
-+
-+ if (!strcmp(str, "voluntary"))
-+ return preempt_dynamic_voluntary;
-+
-+ if (!strcmp(str, "full"))
-+ return preempt_dynamic_full;
-+
-+ return -EINVAL;
-+}
-+
-+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
-+#define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
-+#define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
-+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
-+#define preempt_dynamic_enable(f) static_key_enable(&sk_dynamic_##f.key)
-+#define preempt_dynamic_disable(f) static_key_disable(&sk_dynamic_##f.key)
-+#else
-+#error "Unsupported PREEMPT_DYNAMIC mechanism"
-+#endif
-+
-+static DEFINE_MUTEX(sched_dynamic_mutex);
-+static bool klp_override;
-+
-+static void __sched_dynamic_update(int mode)
-+{
-+ /*
-+ * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
-+ * the ZERO state, which is invalid.
-+ */
-+ if (!klp_override)
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_enable(might_resched);
-+ preempt_dynamic_enable(preempt_schedule);
-+ preempt_dynamic_enable(preempt_schedule_notrace);
-+ preempt_dynamic_enable(irqentry_exit_cond_resched);
-+
-+ switch (mode) {
-+ case preempt_dynamic_none:
-+ if (!klp_override)
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_disable(might_resched);
-+ preempt_dynamic_disable(preempt_schedule);
-+ preempt_dynamic_disable(preempt_schedule_notrace);
-+ preempt_dynamic_disable(irqentry_exit_cond_resched);
-+ if (mode != preempt_dynamic_mode)
-+ pr_info("Dynamic Preempt: none\n");
-+ break;
-+
-+ case preempt_dynamic_voluntary:
-+ if (!klp_override)
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_enable(might_resched);
-+ preempt_dynamic_disable(preempt_schedule);
-+ preempt_dynamic_disable(preempt_schedule_notrace);
-+ preempt_dynamic_disable(irqentry_exit_cond_resched);
-+ if (mode != preempt_dynamic_mode)
-+ pr_info("Dynamic Preempt: voluntary\n");
-+ break;
-+
-+ case preempt_dynamic_full:
-+ if (!klp_override)
-+ preempt_dynamic_enable(cond_resched);
-+ preempt_dynamic_disable(might_resched);
-+ preempt_dynamic_enable(preempt_schedule);
-+ preempt_dynamic_enable(preempt_schedule_notrace);
-+ preempt_dynamic_enable(irqentry_exit_cond_resched);
-+ if (mode != preempt_dynamic_mode)
-+ pr_info("Dynamic Preempt: full\n");
-+ break;
-+ }
-+
-+ preempt_dynamic_mode = mode;
-+}
-+
-+void sched_dynamic_update(int mode)
-+{
-+ mutex_lock(&sched_dynamic_mutex);
-+ __sched_dynamic_update(mode);
-+ mutex_unlock(&sched_dynamic_mutex);
-+}
-+
-+#ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
-+
-+static int klp_cond_resched(void)
-+{
-+ __klp_sched_try_switch();
-+ return __cond_resched();
-+}
-+
-+void sched_dynamic_klp_enable(void)
-+{
-+ mutex_lock(&sched_dynamic_mutex);
-+
-+ klp_override = true;
-+ static_call_update(cond_resched, klp_cond_resched);
-+
-+ mutex_unlock(&sched_dynamic_mutex);
-+}
-+
-+void sched_dynamic_klp_disable(void)
-+{
-+ mutex_lock(&sched_dynamic_mutex);
-+
-+ klp_override = false;
-+ __sched_dynamic_update(preempt_dynamic_mode);
-+
-+ mutex_unlock(&sched_dynamic_mutex);
-+}
-+
-+#endif /* CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
-+
-+
-+static int __init setup_preempt_mode(char *str)
-+{
-+ int mode = sched_dynamic_mode(str);
-+ if (mode < 0) {
-+ pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
-+ return 0;
-+ }
-+
-+ sched_dynamic_update(mode);
-+ return 1;
-+}
-+__setup("preempt=", setup_preempt_mode);
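The __setup() hook above feeds sched_dynamic_mode(), so on a PREEMPT_DYNAMIC build the preemption model can be picked on the kernel command line, for example:

    preempt=voluntary

Any other string is rejected with the "unsupported mode" warning printed by setup_preempt_mode().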
-+
-+static void __init preempt_dynamic_init(void)
-+{
-+ if (preempt_dynamic_mode == preempt_dynamic_undefined) {
-+ if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
-+ sched_dynamic_update(preempt_dynamic_none);
-+ } else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
-+ sched_dynamic_update(preempt_dynamic_voluntary);
-+ } else {
-+ /* Default static call setting, nothing to do */
-+ WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
-+ preempt_dynamic_mode = preempt_dynamic_full;
-+ pr_info("Dynamic Preempt: full\n");
-+ }
-+ }
-+}
-+
-+#define PREEMPT_MODEL_ACCESSOR(mode) \
-+ bool preempt_model_##mode(void) \
-+ { \
-+ WARN_ON_ONCE(preempt_dynamic_mode == preempt_dynamic_undefined); \
-+ return preempt_dynamic_mode == preempt_dynamic_##mode; \
-+ } \
-+ EXPORT_SYMBOL_GPL(preempt_model_##mode)
-+
-+PREEMPT_MODEL_ACCESSOR(none);
-+PREEMPT_MODEL_ACCESSOR(voluntary);
-+PREEMPT_MODEL_ACCESSOR(full);
-+
-+#else /* !CONFIG_PREEMPT_DYNAMIC */
-+
-+static inline void preempt_dynamic_init(void) { }
-+
-+#endif /* #ifdef CONFIG_PREEMPT_DYNAMIC */
-+
-+/**
-+ * yield - yield the current processor to other threads.
-+ *
-+ * Do not ever use this function, there's a 99% chance you're doing it wrong.
-+ *
-+ * The scheduler is at all times free to pick the calling task as the most
-+ * eligible task to run, if removing the yield() call from your code breaks
-+ * it, it's already broken.
-+ *
-+ * Typical broken usage is:
-+ *
-+ * while (!event)
-+ * yield();
-+ *
-+ * where one assumes that yield() will let 'the other' process run that will
-+ * make event true. If the current task is a SCHED_FIFO task that will never
-+ * happen. Never use yield() as a progress guarantee!!
-+ *
-+ * If you want to use yield() to wait for something, use wait_event().
-+ * If you want to use yield() to be 'nice' for others, use cond_resched().
-+ * If you still want to use yield(), do not!
-+ */
-+void __sched yield(void)
-+{
-+ set_current_state(TASK_RUNNING);
-+ do_sched_yield();
-+}
-+EXPORT_SYMBOL(yield);
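To make the advice in the comment above concrete, a hedged sketch of the wait_event() pattern it recommends instead of a yield() polling loop; the wait queue and flag names are invented for illustration:

    #include <linux/wait.h>

    static DECLARE_WAIT_QUEUE_HEAD(my_waitq);	/* hypothetical wait queue */
    static bool my_event_ready;			/* hypothetical condition  */

    static void consumer(void)
    {
    	/* Sleeps until the producer sets the flag and wakes the queue. */
    	wait_event(my_waitq, my_event_ready);
    }

    static void producer(void)
    {
    	my_event_ready = true;
    	wake_up(&my_waitq);
    }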
-+
-+/**
-+ * yield_to - yield the current processor to another thread in
-+ * your thread group, or accelerate that thread toward the
-+ * processor it's on.
-+ * @p: target task
-+ * @preempt: whether task preemption is allowed or not
-+ *
-+ * It's the caller's job to ensure that the target task struct
-+ * can't go away on us before we can do any checks.
-+ *
-+ * In Alt schedule FW, yield_to is not supported.
-+ *
-+ * Return:
-+ * true (>0) if we indeed boosted the target task.
-+ * false (0) if we failed to boost the target.
-+ * -ESRCH if there's no task to yield to.
-+ */
-+int __sched yield_to(struct task_struct *p, bool preempt)
-+{
-+ return 0;
-+}
-+EXPORT_SYMBOL_GPL(yield_to);
-+
-+int io_schedule_prepare(void)
-+{
-+ int old_iowait = current->in_iowait;
-+
-+ current->in_iowait = 1;
-+ blk_flush_plug(current->plug, true);
-+ return old_iowait;
-+}
-+
-+void io_schedule_finish(int token)
-+{
-+ current->in_iowait = token;
-+}
-+
-+/*
-+ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
-+ * that process accounting knows that this is a task in IO wait state.
-+ *
-+ * But don't do that if it is a deliberate, throttling IO wait (this task
-+ * has set its backing_dev_info: the queue against which it should throttle)
-+ */
-+
-+long __sched io_schedule_timeout(long timeout)
-+{
-+ int token;
-+ long ret;
-+
-+ token = io_schedule_prepare();
-+ ret = schedule_timeout(timeout);
-+ io_schedule_finish(token);
-+
-+ return ret;
-+}
-+EXPORT_SYMBOL(io_schedule_timeout);
-+
-+void __sched io_schedule(void)
-+{
-+ int token;
-+
-+ token = io_schedule_prepare();
-+ schedule();
-+ io_schedule_finish(token);
-+}
-+EXPORT_SYMBOL(io_schedule);
-+
-+/**
-+ * sys_sched_get_priority_max - return maximum RT priority.
-+ * @policy: scheduling class.
-+ *
-+ * Return: On success, this syscall returns the maximum
-+ * rt_priority that can be used by a given scheduling class.
-+ * On failure, a negative error code is returned.
-+ */
-+SYSCALL_DEFINE1(sched_get_priority_max, int, policy)
-+{
-+ int ret = -EINVAL;
-+
-+ switch (policy) {
-+ case SCHED_FIFO:
-+ case SCHED_RR:
-+ ret = MAX_RT_PRIO - 1;
-+ break;
-+ case SCHED_NORMAL:
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ ret = 0;
-+ break;
-+ }
-+ return ret;
-+}
-+
-+/**
-+ * sys_sched_get_priority_min - return minimum RT priority.
-+ * @policy: scheduling class.
-+ *
-+ * Return: On success, this syscall returns the minimum
-+ * rt_priority that can be used by a given scheduling class.
-+ * On failure, a negative error code is returned.
-+ */
-+SYSCALL_DEFINE1(sched_get_priority_min, int, policy)
-+{
-+ int ret = -EINVAL;
-+
-+ switch (policy) {
-+ case SCHED_FIFO:
-+ case SCHED_RR:
-+ ret = 1;
-+ break;
-+ case SCHED_NORMAL:
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ ret = 0;
-+ break;
-+ }
-+ return ret;
-+}
-+
-+static int sched_rr_get_interval(pid_t pid, struct timespec64 *t)
-+{
-+ struct task_struct *p;
-+ int retval;
-+
-+ alt_sched_debug();
-+
-+ if (pid < 0)
-+ return -EINVAL;
-+
-+ retval = -ESRCH;
-+ rcu_read_lock();
-+ p = find_process_by_pid(pid);
-+ if (!p)
-+ goto out_unlock;
-+
-+ retval = security_task_getscheduler(p);
-+ if (retval)
-+ goto out_unlock;
-+ rcu_read_unlock();
-+
-+ *t = ns_to_timespec64(sched_timeslice_ns);
-+ return 0;
-+
-+out_unlock:
-+ rcu_read_unlock();
-+ return retval;
-+}
-+
-+/**
-+ * sys_sched_rr_get_interval - return the default timeslice of a process.
-+ * @pid: pid of the process.
-+ * @interval: userspace pointer to the timeslice value.
-+ *
-+ *
-+ * Return: On success, 0 and the timeslice is in @interval. Otherwise,
-+ * an error code.
-+ */
-+SYSCALL_DEFINE2(sched_rr_get_interval, pid_t, pid,
-+ struct __kernel_timespec __user *, interval)
-+{
-+ struct timespec64 t;
-+ int retval = sched_rr_get_interval(pid, &t);
-+
-+ if (retval == 0)
-+ retval = put_timespec64(&t, interval);
-+
-+ return retval;
-+}
-+
-+#ifdef CONFIG_COMPAT_32BIT_TIME
-+SYSCALL_DEFINE2(sched_rr_get_interval_time32, pid_t, pid,
-+ struct old_timespec32 __user *, interval)
-+{
-+ struct timespec64 t;
-+ int retval = sched_rr_get_interval(pid, &t);
-+
-+ if (retval == 0)
-+ retval = put_old_timespec32(&t, interval);
-+ return retval;
-+}
-+#endif
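User space can read the value filled in above, which under this scheduler is always the fixed sched_timeslice_ns regardless of policy, through the standard wrapper; a small illustrative reader:

    #include <sched.h>
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
    	struct timespec ts;

    	if (sched_rr_get_interval(0, &ts) != 0) {	/* 0 == calling process */
    		perror("sched_rr_get_interval");
    		return 1;
    	}
    	printf("timeslice: %ld.%09ld s\n", (long)ts.tv_sec, (long)ts.tv_nsec);
    	return 0;
    }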
-+
-+void sched_show_task(struct task_struct *p)
-+{
-+ unsigned long free = 0;
-+ int ppid;
-+
-+ if (!try_get_task_stack(p))
-+ return;
-+
-+ pr_info("task:%-15.15s state:%c", p->comm, task_state_to_char(p));
-+
-+ if (task_is_running(p))
-+ pr_cont(" running task ");
-+#ifdef CONFIG_DEBUG_STACK_USAGE
-+ free = stack_not_used(p);
-+#endif
-+ ppid = 0;
-+ rcu_read_lock();
-+ if (pid_alive(p))
-+ ppid = task_pid_nr(rcu_dereference(p->real_parent));
-+ rcu_read_unlock();
-+ pr_cont(" stack:%-5lu pid:%-5d ppid:%-6d flags:0x%08lx\n",
-+ free, task_pid_nr(p), ppid,
-+ read_task_thread_flags(p));
-+
-+ print_worker_info(KERN_INFO, p);
-+ print_stop_info(KERN_INFO, p);
-+ show_stack(p, NULL, KERN_INFO);
-+ put_task_stack(p);
-+}
-+EXPORT_SYMBOL_GPL(sched_show_task);
-+
-+static inline bool
-+state_filter_match(unsigned long state_filter, struct task_struct *p)
-+{
-+ unsigned int state = READ_ONCE(p->__state);
-+
-+ /* no filter, everything matches */
-+ if (!state_filter)
-+ return true;
-+
-+ /* filter, but doesn't match */
-+ if (!(state & state_filter))
-+ return false;
-+
-+ /*
-+ * When looking for TASK_UNINTERRUPTIBLE skip TASK_IDLE (allows
-+ * TASK_KILLABLE).
-+ */
-+ if (state_filter == TASK_UNINTERRUPTIBLE && (state & TASK_NOLOAD))
-+ return false;
-+
-+ return true;
-+}
-+
-+
-+void show_state_filter(unsigned int state_filter)
-+{
-+ struct task_struct *g, *p;
-+
-+ rcu_read_lock();
-+ for_each_process_thread(g, p) {
-+ /*
-+ * reset the NMI-timeout, listing all files on a slow
-+ * console might take a lot of time:
-+ * Also, reset softlockup watchdogs on all CPUs, because
-+ * another CPU might be blocked waiting for us to process
-+ * an IPI.
-+ */
-+ touch_nmi_watchdog();
-+ touch_all_softlockup_watchdogs();
-+ if (state_filter_match(state_filter, p))
-+ sched_show_task(p);
-+ }
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+ /* TODO: Alt schedule FW should support this
-+ if (!state_filter)
-+ sysrq_sched_debug_show();
-+ */
-+#endif
-+ rcu_read_unlock();
-+ /*
-+ * Only show locks if all tasks are dumped:
-+ */
-+ if (!state_filter)
-+ debug_show_all_locks();
-+}
-+
-+void dump_cpu_task(int cpu)
-+{
-+ if (cpu == smp_processor_id() && in_hardirq()) {
-+ struct pt_regs *regs;
-+
-+ regs = get_irq_regs();
-+ if (regs) {
-+ show_regs(regs);
-+ return;
-+ }
-+ }
-+
-+ if (trigger_single_cpu_backtrace(cpu))
-+ return;
-+
-+ pr_info("Task dump for CPU %d:\n", cpu);
-+ sched_show_task(cpu_curr(cpu));
-+}
-+
-+/**
-+ * init_idle - set up an idle thread for a given CPU
-+ * @idle: task in question
-+ * @cpu: CPU the idle task belongs to
-+ *
-+ * NOTE: this function does not set the idle thread's NEED_RESCHED
-+ * flag, to make booting more robust.
-+ */
-+void __init init_idle(struct task_struct *idle, int cpu)
-+{
-+#ifdef CONFIG_SMP
-+ struct affinity_context ac = (struct affinity_context) {
-+ .new_mask = cpumask_of(cpu),
-+ .flags = 0,
-+ };
-+#endif
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ __sched_fork(0, idle);
-+
-+ raw_spin_lock_irqsave(&idle->pi_lock, flags);
-+ raw_spin_lock(&rq->lock);
-+
-+ idle->last_ran = rq->clock_task;
-+ idle->__state = TASK_RUNNING;
-+ /*
-+ * PF_KTHREAD should already be set at this point; regardless, make it
-+ * look like a proper per-CPU kthread.
-+ */
-+ idle->flags |= PF_IDLE | PF_KTHREAD | PF_NO_SETAFFINITY;
-+ kthread_set_per_cpu(idle, cpu);
-+
-+ sched_queue_init_idle(&rq->queue, idle);
-+
-+#ifdef CONFIG_SMP
-+ /*
-+ * It's possible that init_idle() gets called multiple times on a task,
-+ * in that case do_set_cpus_allowed() will not do the right thing.
-+ *
-+ * And since this is boot we can forgo the serialisation.
-+ */
-+ set_cpus_allowed_common(idle, &ac);
-+#endif
-+
-+ /* Silence PROVE_RCU */
-+ rcu_read_lock();
-+ __set_task_cpu(idle, cpu);
-+ rcu_read_unlock();
-+
-+ rq->idle = idle;
-+ rcu_assign_pointer(rq->curr, idle);
-+ idle->on_cpu = 1;
-+
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
-+
-+ /* Set the preempt count _outside_ the spinlocks! */
-+ init_idle_preempt_count(idle, cpu);
-+
-+ ftrace_graph_init_idle_task(idle, cpu);
-+ vtime_init_idle(idle, cpu);
-+#ifdef CONFIG_SMP
-+ sprintf(idle->comm, "%s/%d", INIT_TASK_COMM, cpu);
-+#endif
-+}
-+
-+#ifdef CONFIG_SMP
-+
-+int cpuset_cpumask_can_shrink(const struct cpumask __maybe_unused *cur,
-+ const struct cpumask __maybe_unused *trial)
-+{
-+ return 1;
-+}
-+
-+int task_can_attach(struct task_struct *p,
-+ const struct cpumask *cs_effective_cpus)
-+{
-+ int ret = 0;
-+
-+ /*
-+ * Kthreads which disallow setaffinity shouldn't be moved
-+ * to a new cpuset; we don't want to change their CPU
-+ * affinity and isolating such threads by their set of
-+ * allowed nodes is unnecessary. Thus, cpusets are not
-+ * applicable for such threads. This prevents checking for
-+ * success of set_cpus_allowed_ptr() on all attached tasks
-+ * before cpus_mask may be changed.
-+ */
-+ if (p->flags & PF_NO_SETAFFINITY)
-+ ret = -EINVAL;
-+
-+ return ret;
-+}
-+
-+bool sched_smp_initialized __read_mostly;
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+/*
-+ * Ensures that the idle task is using init_mm right before its CPU goes
-+ * offline.
-+ */
-+void idle_task_exit(void)
-+{
-+ struct mm_struct *mm = current->active_mm;
-+
-+ BUG_ON(current != this_rq()->idle);
-+
-+ if (mm != &init_mm) {
-+ switch_mm(mm, &init_mm, current);
-+ finish_arch_post_lock_switch();
-+ }
-+
-+ /* finish_cpu(), as ran on the BP, will clean up the active_mm state */
-+}
-+
-+static int __balance_push_cpu_stop(void *arg)
-+{
-+ struct task_struct *p = arg;
-+ struct rq *rq = this_rq();
-+ struct rq_flags rf;
-+ int cpu;
-+
-+ raw_spin_lock_irq(&p->pi_lock);
-+ rq_lock(rq, &rf);
-+
-+ update_rq_clock(rq);
-+
-+ if (task_rq(p) == rq && task_on_rq_queued(p)) {
-+ cpu = select_fallback_rq(rq->cpu, p);
-+ rq = __migrate_task(rq, p, cpu);
-+ }
-+
-+ rq_unlock(rq, &rf);
-+ raw_spin_unlock_irq(&p->pi_lock);
-+
-+ put_task_struct(p);
-+
-+ return 0;
-+}
-+
-+static DEFINE_PER_CPU(struct cpu_stop_work, push_work);
-+
-+/*
-+ * This is enabled below SCHED_AP_ACTIVE; when !cpu_active(), but only
-+ * effective when the hotplug direction is down (the CPU is going offline).
-+ */
-+static void balance_push(struct rq *rq)
-+{
-+ struct task_struct *push_task = rq->curr;
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ /*
-+ * Ensure the thing is persistent until balance_push_set(.on = false);
-+ */
-+ rq->balance_callback = &balance_push_callback;
-+
-+ /*
-+ * Only active while going offline and when invoked on the outgoing
-+ * CPU.
-+ */
-+ if (!cpu_dying(rq->cpu) || rq != this_rq())
-+ return;
-+
-+ /*
-+ * Both the cpu-hotplug and stop task are in this case and are
-+ * required to complete the hotplug process.
-+ */
-+ if (kthread_is_per_cpu(push_task) ||
-+ is_migration_disabled(push_task)) {
-+
-+ /*
-+ * If this is the idle task on the outgoing CPU try to wake
-+ * up the hotplug control thread which might wait for the
-+ * last task to vanish. The rcuwait_active() check is
-+ * accurate here because the waiter is pinned on this CPU
-+ * and can't obviously be running in parallel.
-+ *
-+ * On RT kernels this also has to check whether there are
-+ * pinned and scheduled out tasks on the runqueue. They
-+ * need to leave the migrate disabled section first.
-+ */
-+ if (!rq->nr_running && !rq_has_pinned_tasks(rq) &&
-+ rcuwait_active(&rq->hotplug_wait)) {
-+ raw_spin_unlock(&rq->lock);
-+ rcuwait_wake_up(&rq->hotplug_wait);
-+ raw_spin_lock(&rq->lock);
-+ }
-+ return;
-+ }
-+
-+ get_task_struct(push_task);
-+ /*
-+ * Temporarily drop rq->lock such that we can wake-up the stop task.
-+ * Both preemption and IRQs are still disabled.
-+ */
-+ raw_spin_unlock(&rq->lock);
-+ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
-+ this_cpu_ptr(&push_work));
-+ /*
-+ * At this point need_resched() is true and we'll take the loop in
-+ * schedule(). The next pick is obviously going to be the stop task
-+	 * which is kthread_is_per_cpu() and will push this task away.
-+ */
-+ raw_spin_lock(&rq->lock);
-+}
-+
-+static void balance_push_set(int cpu, bool on)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ struct rq_flags rf;
-+
-+ rq_lock_irqsave(rq, &rf);
-+ if (on) {
-+ WARN_ON_ONCE(rq->balance_callback);
-+ rq->balance_callback = &balance_push_callback;
-+ } else if (rq->balance_callback == &balance_push_callback) {
-+ rq->balance_callback = NULL;
-+ }
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+/*
-+ * Invoked from a CPUs hotplug control thread after the CPU has been marked
-+ * inactive. All tasks which are not per CPU kernel threads are either
-+ * pushed off this CPU now via balance_push() or placed on a different CPU
-+ * during wakeup. Wait until the CPU is quiescent.
-+ */
-+static void balance_hotplug_wait(void)
-+{
-+ struct rq *rq = this_rq();
-+
-+ rcuwait_wait_event(&rq->hotplug_wait,
-+ rq->nr_running == 1 && !rq_has_pinned_tasks(rq),
-+ TASK_UNINTERRUPTIBLE);
-+}
-+
-+#else
-+
-+static void balance_push(struct rq *rq)
-+{
-+}
-+
-+static void balance_push_set(int cpu, bool on)
-+{
-+}
-+
-+static inline void balance_hotplug_wait(void)
-+{
-+}
-+#endif /* CONFIG_HOTPLUG_CPU */
-+
-+static void set_rq_offline(struct rq *rq)
-+{
-+ if (rq->online)
-+ rq->online = false;
-+}
-+
-+static void set_rq_online(struct rq *rq)
-+{
-+ if (!rq->online)
-+ rq->online = true;
-+}
-+
-+/*
-+ * used to mark begin/end of suspend/resume:
-+ */
-+static int num_cpus_frozen;
-+
-+/*
-+ * Update cpusets according to cpu_active mask. If cpusets are
-+ * disabled, cpuset_update_active_cpus() becomes a simple wrapper
-+ * around partition_sched_domains().
-+ *
-+ * If we come here as part of a suspend/resume, don't touch cpusets because we
-+ * want to restore it back to its original state upon resume anyway.
-+ */
-+static void cpuset_cpu_active(void)
-+{
-+ if (cpuhp_tasks_frozen) {
-+ /*
-+ * num_cpus_frozen tracks how many CPUs are involved in suspend
-+ * resume sequence. As long as this is not the last online
-+ * operation in the resume sequence, just build a single sched
-+ * domain, ignoring cpusets.
-+ */
-+ partition_sched_domains(1, NULL, NULL);
-+ if (--num_cpus_frozen)
-+ return;
-+ /*
-+ * This is the last CPU online operation. So fall through and
-+ * restore the original sched domains by considering the
-+ * cpuset configurations.
-+ */
-+ cpuset_force_rebuild();
-+ }
-+
-+ cpuset_update_active_cpus();
-+}
-+
-+static int cpuset_cpu_inactive(unsigned int cpu)
-+{
-+ if (!cpuhp_tasks_frozen) {
-+ cpuset_update_active_cpus();
-+ } else {
-+ num_cpus_frozen++;
-+ partition_sched_domains(1, NULL, NULL);
-+ }
-+ return 0;
-+}
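/*
 * Standalone sketch of the num_cpus_frozen bookkeeping in
 * cpuset_cpu_active()/cpuset_cpu_inactive() above: during suspend each
 * offlined CPU bumps the counter and keeps a single sched domain; on
 * resume only the last CPU to come back triggers the cpuset rebuild.
 * Userspace illustration only - the printf()s stand in for
 * partition_sched_domains()/cpuset_update_active_cpus().
 */
#include <stdbool.h>
#include <stdio.h>

static int num_cpus_frozen;

static void cpu_inactive(bool tasks_frozen)
{
	if (!tasks_frozen) {
		printf("runtime hot-unplug: update cpusets now\n");
		return;
	}
	num_cpus_frozen++;
	printf("suspend: %d CPU(s) frozen, single sched domain only\n",
	       num_cpus_frozen);
}

static void cpu_active(bool tasks_frozen)
{
	if (!tasks_frozen) {
		printf("runtime hotplug: update cpusets now\n");
		return;
	}
	if (--num_cpus_frozen) {
		printf("resume: %d CPU(s) still to come back, keep single domain\n",
		       num_cpus_frozen);
		return;
	}
	printf("resume: last CPU back online, rebuild cpuset sched domains\n");
}

int main(void)
{
	/* suspend: CPUs 1..3 go down with tasks frozen */
	cpu_inactive(true); cpu_inactive(true); cpu_inactive(true);
	/* resume: the same CPUs come back */
	cpu_active(true); cpu_active(true); cpu_active(true);
	return 0;
}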
-+
-+int sched_cpu_activate(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ /*
-+ * Clear the balance_push callback and prepare to schedule
-+ * regular tasks.
-+ */
-+ balance_push_set(cpu, false);
-+
-+#ifdef CONFIG_SCHED_SMT
-+ /*
-+ * When going up, increment the number of cores with SMT present.
-+ */
-+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
-+ static_branch_inc_cpuslocked(&sched_smt_present);
-+#endif
-+ set_cpu_active(cpu, true);
-+
-+ if (sched_smp_initialized)
-+ cpuset_cpu_active();
-+
-+ /*
-+ * Put the rq online, if not already. This happens:
-+ *
-+ * 1) In the early boot process, because we build the real domains
-+ * after all cpus have been brought up.
-+ *
-+ * 2) At runtime, if cpuset_cpu_active() fails to rebuild the
-+ * domains.
-+ */
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ set_rq_online(rq);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ return 0;
-+}
-+
-+int sched_cpu_deactivate(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+ int ret;
-+
-+ set_cpu_active(cpu, false);
-+
-+ /*
-+ * From this point forward, this CPU will refuse to run any task that
-+ * is not: migrate_disable() or KTHREAD_IS_PER_CPU, and will actively
-+ * push those tasks away until this gets cleared, see
-+ * sched_cpu_dying().
-+ */
-+ balance_push_set(cpu, true);
-+
-+ /*
-+ * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
-+ * users of this state to go away such that all new such users will
-+ * observe it.
-+ *
-+ * Specifically, we rely on ttwu to no longer target this CPU, see
-+ * ttwu_queue_cond() and is_cpu_allowed().
-+ *
-+	 * Do sync before parking smpboot threads to take care of the rcu boost case.
-+ */
-+ synchronize_rcu();
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ update_rq_clock(rq);
-+ set_rq_offline(rq);
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+#ifdef CONFIG_SCHED_SMT
-+ /*
-+ * When going down, decrement the number of cores with SMT present.
-+ */
-+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2) {
-+ static_branch_dec_cpuslocked(&sched_smt_present);
-+ if (!static_branch_likely(&sched_smt_present))
-+ cpumask_clear(&sched_sg_idle_mask);
-+ }
-+#endif
-+
-+ if (!sched_smp_initialized)
-+ return 0;
-+
-+ ret = cpuset_cpu_inactive(cpu);
-+ if (ret) {
-+ balance_push_set(cpu, false);
-+ set_cpu_active(cpu, true);
-+ return ret;
-+ }
-+
-+ return 0;
-+}
-+
-+static void sched_rq_cpu_starting(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+
-+ rq->calc_load_update = calc_load_update;
-+}
-+
-+int sched_cpu_starting(unsigned int cpu)
-+{
-+ sched_rq_cpu_starting(cpu);
-+ sched_tick_start(cpu);
-+ return 0;
-+}
-+
-+#ifdef CONFIG_HOTPLUG_CPU
-+
-+/*
-+ * Invoked immediately before the stopper thread is invoked to bring the
-+ * CPU down completely. At this point all per CPU kthreads except the
-+ * hotplug thread (current) and the stopper thread (inactive) have been
-+ * either parked or have been unbound from the outgoing CPU. Ensure that
-+ * any of those which might be on the way out are gone.
-+ *
-+ * If after this point a bound task is being woken on this CPU then the
-+ * responsible hotplug callback has failed to do its job.
-+ * sched_cpu_dying() will catch it with the appropriate fireworks.
-+ */
-+int sched_cpu_wait_empty(unsigned int cpu)
-+{
-+ balance_hotplug_wait();
-+ return 0;
-+}
-+
-+/*
-+ * Since this CPU is going 'away' for a while, fold any nr_active delta we
-+ * might have. Called from the CPU stopper task after ensuring that the
-+ * stopper is the last running task on the CPU, so nr_active count is
-+ * stable. We need to take the teardown thread which is calling this into
-+ * account, so we hand in adjust = 1 to the load calculation.
-+ *
-+ * Also see the comment "Global load-average calculations".
-+ */
-+static void calc_load_migrate(struct rq *rq)
-+{
-+ long delta = calc_load_fold_active(rq, 1);
-+
-+ if (delta)
-+ atomic_long_add(delta, &calc_load_tasks);
-+}
-+
-+static void dump_rq_tasks(struct rq *rq, const char *loglvl)
-+{
-+ struct task_struct *g, *p;
-+ int cpu = cpu_of(rq);
-+
-+ lockdep_assert_held(&rq->lock);
-+
-+ printk("%sCPU%d enqueued tasks (%u total):\n", loglvl, cpu, rq->nr_running);
-+ for_each_process_thread(g, p) {
-+ if (task_cpu(p) != cpu)
-+ continue;
-+
-+ if (!task_on_rq_queued(p))
-+ continue;
-+
-+ printk("%s\tpid: %d, name: %s\n", loglvl, p->pid, p->comm);
-+ }
-+}
-+
-+int sched_cpu_dying(unsigned int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ unsigned long flags;
-+
-+ /* Handle pending wakeups and then migrate everything off */
-+ sched_tick_stop(cpu);
-+
-+ raw_spin_lock_irqsave(&rq->lock, flags);
-+ if (rq->nr_running != 1 || rq_has_pinned_tasks(rq)) {
-+ WARN(true, "Dying CPU not properly vacated!");
-+ dump_rq_tasks(rq, KERN_WARNING);
-+ }
-+ raw_spin_unlock_irqrestore(&rq->lock, flags);
-+
-+ calc_load_migrate(rq);
-+ hrtick_clear(rq);
-+ return 0;
-+}
-+#endif
-+
-+#ifdef CONFIG_SMP
-+static void sched_init_topology_cpumask_early(void)
-+{
-+ int cpu;
-+ cpumask_t *tmp;
-+
-+ for_each_possible_cpu(cpu) {
-+ /* init topo masks */
-+ tmp = per_cpu(sched_cpu_topo_masks, cpu);
-+
-+ cpumask_copy(tmp, cpumask_of(cpu));
-+ tmp++;
-+ cpumask_copy(tmp, cpu_possible_mask);
-+ per_cpu(sched_cpu_llc_mask, cpu) = tmp;
-+ per_cpu(sched_cpu_topo_end_mask, cpu) = ++tmp;
-+ /*per_cpu(sd_llc_id, cpu) = cpu;*/
-+ }
-+}
-+
-+#define TOPOLOGY_CPUMASK(name, mask, last)\
-+ if (cpumask_and(topo, topo, mask)) { \
-+ cpumask_copy(topo, mask); \
-+ printk(KERN_INFO "sched: cpu#%02d topo: 0x%08lx - "#name, \
-+ cpu, (topo++)->bits[0]); \
-+ } \
-+ if (!last) \
-+ bitmap_complement(cpumask_bits(topo), cpumask_bits(mask), \
-+ nr_cpumask_bits);
-+
-+static void sched_init_topology_cpumask(void)
-+{
-+ int cpu;
-+ cpumask_t *topo;
-+
-+ for_each_online_cpu(cpu) {
-+ /* take chance to reset time slice for idle tasks */
-+ cpu_rq(cpu)->idle->time_slice = sched_timeslice_ns;
-+
-+ topo = per_cpu(sched_cpu_topo_masks, cpu) + 1;
-+
-+ bitmap_complement(cpumask_bits(topo), cpumask_bits(cpumask_of(cpu)),
-+ nr_cpumask_bits);
-+#ifdef CONFIG_SCHED_SMT
-+ TOPOLOGY_CPUMASK(smt, topology_sibling_cpumask(cpu), false);
-+#endif
-+ per_cpu(sd_llc_id, cpu) = cpumask_first(cpu_coregroup_mask(cpu));
-+ per_cpu(sched_cpu_llc_mask, cpu) = topo;
-+ TOPOLOGY_CPUMASK(coregroup, cpu_coregroup_mask(cpu), false);
-+
-+ TOPOLOGY_CPUMASK(core, topology_core_cpumask(cpu), false);
-+
-+ TOPOLOGY_CPUMASK(others, cpu_online_mask, true);
-+
-+ per_cpu(sched_cpu_topo_end_mask, cpu) = topo;
-+ printk(KERN_INFO "sched: cpu#%02d llc_id = %d, llc_mask idx = %d\n",
-+ cpu, per_cpu(sd_llc_id, cpu),
-+ (int) (per_cpu(sched_cpu_llc_mask, cpu) -
-+ per_cpu(sched_cpu_topo_masks, cpu)));
-+ }
-+}
-+#endif
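/*
 * Illustrative userspace model of the per-CPU topology level masks built
 * by sched_init_topology_cpumask() above: every CPU gets an ordered array
 * of masks from "nearest" (itself, then its SMT sibling pair) out to
 * "everything else", and picking the closest allowed CPU is a scan over
 * those levels (the role of __best_mask_cpu() in alt_sched.h). The 8-CPU,
 * 2-way-SMT layout and the uint8_t bitmaps are assumptions made for this
 * example, not the kernel data structures.
 */
#include <stdint.h>
#include <stdio.h>

#define NR_CPUS	8
#define LEVELS	3	/* self, SMT pair, everything else */

static uint8_t topo_masks[NR_CPUS][LEVELS];

static void init_topo_masks(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		topo_masks[cpu][0] = 1u << cpu;		/* the CPU itself */
		topo_masks[cpu][1] = 3u << (cpu & ~1);	/* its SMT sibling pair */
		topo_masks[cpu][2] = 0xff;		/* all CPUs */
	}
}

static int best_mask_cpu(int cpu, uint8_t allowed)
{
	for (int level = 0; level < LEVELS; level++) {
		uint8_t hit = topo_masks[cpu][level] & allowed;

		if (hit)
			return __builtin_ctz(hit);	/* an allowed CPU at this level */
	}
	return -1;
}

int main(void)
{
	init_topo_masks();
	/* task may run on CPUs 3 and 6: CPU 2 prefers its sibling, CPU 3 */
	printf("best CPU near 2: %d\n", best_mask_cpu(2, (1u << 3) | (1u << 6)));
	/* nothing close to CPU 5 is allowed, so fall back to the widest level */
	printf("best CPU near 5: %d\n", best_mask_cpu(5, 1u << 6));
	return 0;
}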
-+
-+void __init sched_init_smp(void)
-+{
-+ /* Move init over to a non-isolated CPU */
-+ if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
-+ BUG();
-+ current->flags &= ~PF_NO_SETAFFINITY;
-+
-+ sched_init_topology_cpumask();
-+
-+ sched_smp_initialized = true;
-+}
-+
-+static int __init migration_init(void)
-+{
-+ sched_cpu_starting(smp_processor_id());
-+ return 0;
-+}
-+early_initcall(migration_init);
-+
-+#else
-+void __init sched_init_smp(void)
-+{
-+ cpu_rq(0)->idle->time_slice = sched_timeslice_ns;
-+}
-+#endif /* CONFIG_SMP */
-+
-+int in_sched_functions(unsigned long addr)
-+{
-+ return in_lock_functions(addr) ||
-+ (addr >= (unsigned long)__sched_text_start
-+ && addr < (unsigned long)__sched_text_end);
-+}
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+/* task group related information */
-+struct task_group {
-+ struct cgroup_subsys_state css;
-+
-+ struct rcu_head rcu;
-+ struct list_head list;
-+
-+ struct task_group *parent;
-+ struct list_head siblings;
-+ struct list_head children;
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+ unsigned long shares;
-+#endif
-+};
-+
-+/*
-+ * Default task group.
-+ * Every task in the system belongs to this group at bootup.
-+ */
-+struct task_group root_task_group;
-+LIST_HEAD(task_groups);
-+
-+/* Cacheline aligned slab cache for task_group */
-+static struct kmem_cache *task_group_cache __read_mostly;
-+#endif /* CONFIG_CGROUP_SCHED */
-+
-+void __init sched_init(void)
-+{
-+ int i;
-+ struct rq *rq;
-+
-+ printk(KERN_INFO "sched/alt: "ALT_SCHED_NAME" CPU Scheduler "ALT_SCHED_VERSION\
-+ " by Alfred Chen.\n");
-+
-+ wait_bit_init();
-+
-+#ifdef CONFIG_SMP
-+ for (i = 0; i < SCHED_QUEUE_BITS; i++)
-+ cpumask_copy(sched_preempt_mask + i, cpu_present_mask);
-+#endif
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+ task_group_cache = KMEM_CACHE(task_group, 0);
-+
-+ list_add(&root_task_group.list, &task_groups);
-+ INIT_LIST_HEAD(&root_task_group.children);
-+ INIT_LIST_HEAD(&root_task_group.siblings);
-+#endif /* CONFIG_CGROUP_SCHED */
-+ for_each_possible_cpu(i) {
-+ rq = cpu_rq(i);
-+
-+ sched_queue_init(&rq->queue);
-+ rq->prio = IDLE_TASK_SCHED_PRIO;
-+ rq->skip = NULL;
-+
-+ raw_spin_lock_init(&rq->lock);
-+ rq->nr_running = rq->nr_uninterruptible = 0;
-+ rq->calc_load_active = 0;
-+ rq->calc_load_update = jiffies + LOAD_FREQ;
-+#ifdef CONFIG_SMP
-+ rq->online = false;
-+ rq->cpu = i;
-+
-+#ifdef CONFIG_SCHED_SMT
-+ rq->active_balance = 0;
-+#endif
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+ INIT_CSD(&rq->nohz_csd, nohz_csd_func, rq);
-+#endif
-+ rq->balance_callback = &balance_push_callback;
-+#ifdef CONFIG_HOTPLUG_CPU
-+ rcuwait_init(&rq->hotplug_wait);
-+#endif
-+#endif /* CONFIG_SMP */
-+ rq->nr_switches = 0;
-+
-+ hrtick_rq_init(rq);
-+ atomic_set(&rq->nr_iowait, 0);
-+
-+ zalloc_cpumask_var_node(&rq->scratch_mask, GFP_KERNEL, cpu_to_node(i));
-+ }
-+#ifdef CONFIG_SMP
-+ /* Set rq->online for cpu 0 */
-+ cpu_rq(0)->online = true;
-+#endif
-+ /*
-+ * The boot idle thread does lazy MMU switching as well:
-+ */
-+ mmgrab(&init_mm);
-+ enter_lazy_tlb(&init_mm, current);
-+
-+ /*
-+ * The idle task doesn't need the kthread struct to function, but it
-+ * is dressed up as a per-CPU kthread and thus needs to play the part
-+ * if we want to avoid special-casing it in code that deals with per-CPU
-+ * kthreads.
-+ */
-+ WARN_ON(!set_kthread_struct(current));
-+
-+ /*
-+ * Make us the idle thread. Technically, schedule() should not be
-+ * called from this thread, however somewhere below it might be,
-+ * but because we are the idle thread, we just pick up running again
-+ * when this runqueue becomes "idle".
-+ */
-+ init_idle(current, smp_processor_id());
-+
-+ calc_load_update = jiffies + LOAD_FREQ;
-+
-+#ifdef CONFIG_SMP
-+ idle_thread_set_boot_cpu();
-+ balance_push_set(smp_processor_id(), false);
-+
-+ sched_init_topology_cpumask_early();
-+#endif /* SMP */
-+
-+ preempt_dynamic_init();
-+}
-+
-+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-+
-+void __might_sleep(const char *file, int line)
-+{
-+ unsigned int state = get_current_state();
-+ /*
-+ * Blocking primitives will set (and therefore destroy) current->state,
-+ * since we will exit with TASK_RUNNING make sure we enter with it,
-+ * otherwise we will destroy state.
-+ */
-+ WARN_ONCE(state != TASK_RUNNING && current->task_state_change,
-+ "do not call blocking ops when !TASK_RUNNING; "
-+ "state=%x set at [<%p>] %pS\n", state,
-+ (void *)current->task_state_change,
-+ (void *)current->task_state_change);
-+
-+ __might_resched(file, line, 0);
-+}
-+EXPORT_SYMBOL(__might_sleep);
-+
-+static void print_preempt_disable_ip(int preempt_offset, unsigned long ip)
-+{
-+ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT))
-+ return;
-+
-+ if (preempt_count() == preempt_offset)
-+ return;
-+
-+ pr_err("Preemption disabled at:");
-+ print_ip_sym(KERN_ERR, ip);
-+}
-+
-+static inline bool resched_offsets_ok(unsigned int offsets)
-+{
-+ unsigned int nested = preempt_count();
-+
-+ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT;
-+
-+ return nested == offsets;
-+}
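/*
 * Worked example of the "offsets" encoding checked by resched_offsets_ok()
 * above: the expected preempt count lives in the low bits and the expected
 * RCU nest depth is packed above MIGHT_RESCHED_RCU_SHIFT. The shift value 8
 * matches what mainline uses but is an assumption here; this is a plain
 * userspace demo of the packing, not kernel code.
 */
#include <stdio.h>

#define MIGHT_RESCHED_RCU_SHIFT		8
#define MIGHT_RESCHED_PREEMPT_MASK	((1U << MIGHT_RESCHED_RCU_SHIFT) - 1)

int main(void)
{
	/* caller expects preempt_count == 1 and an RCU read-side depth of 2 */
	unsigned int offsets = 1 | (2 << MIGHT_RESCHED_RCU_SHIFT);

	printf("expected preempt_count: %u\n",
	       offsets & MIGHT_RESCHED_PREEMPT_MASK);	/* 1 */
	printf("expected RCU nest depth: %u\n",
	       offsets >> MIGHT_RESCHED_RCU_SHIFT);	/* 2 */
	return 0;
}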
-+
-+void __might_resched(const char *file, int line, unsigned int offsets)
-+{
-+ /* Ratelimiting timestamp: */
-+ static unsigned long prev_jiffy;
-+
-+ unsigned long preempt_disable_ip;
-+
-+ /* WARN_ON_ONCE() by default, no rate limit required: */
-+ rcu_sleep_check();
-+
-+ if ((resched_offsets_ok(offsets) && !irqs_disabled() &&
-+ !is_idle_task(current) && !current->non_block_count) ||
-+ system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING ||
-+ oops_in_progress)
-+ return;
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ /* Save this before calling printk(), since that will clobber it: */
-+ preempt_disable_ip = get_preempt_disable_ip(current);
-+
-+ pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
-+ file, line);
-+ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(), current->non_block_count,
-+ current->pid, current->comm);
-+ pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
-+ offsets & MIGHT_RESCHED_PREEMPT_MASK);
-+
-+ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
-+ pr_err("RCU nest depth: %d, expected: %u\n",
-+ rcu_preempt_depth(), offsets >> MIGHT_RESCHED_RCU_SHIFT);
-+ }
-+
-+ if (task_stack_end_corrupted(current))
-+ pr_emerg("Thread overran stack, or stack corrupted\n");
-+
-+ debug_show_held_locks(current);
-+ if (irqs_disabled())
-+ print_irqtrace_events(current);
-+
-+ print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
-+ preempt_disable_ip);
-+
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL(__might_resched);
-+
-+void __cant_sleep(const char *file, int line, int preempt_offset)
-+{
-+ static unsigned long prev_jiffy;
-+
-+ if (irqs_disabled())
-+ return;
-+
-+ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
-+ return;
-+
-+ if (preempt_count() > preempt_offset)
-+ return;
-+
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ printk(KERN_ERR "BUG: assuming atomic context at %s:%d\n", file, line);
-+ printk(KERN_ERR "in_atomic(): %d, irqs_disabled(): %d, pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(),
-+ current->pid, current->comm);
-+
-+ debug_show_held_locks(current);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL_GPL(__cant_sleep);
-+
-+#ifdef CONFIG_SMP
-+void __cant_migrate(const char *file, int line)
-+{
-+ static unsigned long prev_jiffy;
-+
-+ if (irqs_disabled())
-+ return;
-+
-+ if (is_migration_disabled(current))
-+ return;
-+
-+ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
-+ return;
-+
-+ if (preempt_count() > 0)
-+ return;
-+
-+ if (current->migration_flags & MDF_FORCE_ENABLED)
-+ return;
-+
-+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
-+ return;
-+ prev_jiffy = jiffies;
-+
-+ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line);
-+ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n",
-+ in_atomic(), irqs_disabled(), is_migration_disabled(current),
-+ current->pid, current->comm);
-+
-+ debug_show_held_locks(current);
-+ dump_stack();
-+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
-+}
-+EXPORT_SYMBOL_GPL(__cant_migrate);
-+#endif
-+#endif
-+
-+#ifdef CONFIG_MAGIC_SYSRQ
-+void normalize_rt_tasks(void)
-+{
-+ struct task_struct *g, *p;
-+ struct sched_attr attr = {
-+ .sched_policy = SCHED_NORMAL,
-+ };
-+
-+ read_lock(&tasklist_lock);
-+ for_each_process_thread(g, p) {
-+ /*
-+ * Only normalize user tasks:
-+ */
-+ if (p->flags & PF_KTHREAD)
-+ continue;
-+
-+ schedstat_set(p->stats.wait_start, 0);
-+ schedstat_set(p->stats.sleep_start, 0);
-+ schedstat_set(p->stats.block_start, 0);
-+
-+ if (!rt_task(p)) {
-+ /*
-+ * Renice negative nice level userspace
-+ * tasks back to 0:
-+ */
-+ if (task_nice(p) < 0)
-+ set_user_nice(p, 0);
-+ continue;
-+ }
-+
-+ __sched_setscheduler(p, &attr, false, false);
-+ }
-+ read_unlock(&tasklist_lock);
-+}
-+#endif /* CONFIG_MAGIC_SYSRQ */
-+
-+#if defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB)
-+/*
-+ * These functions are only useful for the IA64 MCA handling, or kdb.
-+ *
-+ * They can only be called when the whole system has been
-+ * stopped - every CPU needs to be quiescent, and no scheduling
-+ * activity can take place. Using them for anything else would
-+ * be a serious bug, and as a result, they aren't even visible
-+ * under any other configuration.
-+ */
-+
-+/**
-+ * curr_task - return the current task for a given CPU.
-+ * @cpu: the processor in question.
-+ *
-+ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
-+ *
-+ * Return: The current task for @cpu.
-+ */
-+struct task_struct *curr_task(int cpu)
-+{
-+ return cpu_curr(cpu);
-+}
-+
-+#endif /* defined(CONFIG_IA64) || defined(CONFIG_KGDB_KDB) */
-+
-+#ifdef CONFIG_IA64
-+/**
-+ * ia64_set_curr_task - set the current task for a given CPU.
-+ * @cpu: the processor in question.
-+ * @p: the task pointer to set.
-+ *
-+ * Description: This function must only be used when non-maskable interrupts
-+ * are serviced on a separate stack. It allows the architecture to switch the
-+ * notion of the current task on a CPU in a non-blocking manner. This function
-+ * must be called with all CPUs synchronised and interrupts disabled, and the
-+ * caller must save the original value of the current task (see
-+ * curr_task() above) and restore that value before reenabling interrupts and
-+ * re-starting the system.
-+ *
-+ * ONLY VALID WHEN THE WHOLE SYSTEM IS STOPPED!
-+ */
-+void ia64_set_curr_task(int cpu, struct task_struct *p)
-+{
-+ cpu_curr(cpu) = p;
-+}
-+
-+#endif
-+
-+#ifdef CONFIG_CGROUP_SCHED
-+static void sched_free_group(struct task_group *tg)
-+{
-+ kmem_cache_free(task_group_cache, tg);
-+}
-+
-+static void sched_free_group_rcu(struct rcu_head *rhp)
-+{
-+ sched_free_group(container_of(rhp, struct task_group, rcu));
-+}
-+
-+static void sched_unregister_group(struct task_group *tg)
-+{
-+ /*
-+ * We have to wait for yet another RCU grace period to expire, as
-+ * print_cfs_stats() might run concurrently.
-+ */
-+ call_rcu(&tg->rcu, sched_free_group_rcu);
-+}
-+
-+/* allocate runqueue etc for a new task group */
-+struct task_group *sched_create_group(struct task_group *parent)
-+{
-+ struct task_group *tg;
-+
-+ tg = kmem_cache_alloc(task_group_cache, GFP_KERNEL | __GFP_ZERO);
-+ if (!tg)
-+ return ERR_PTR(-ENOMEM);
-+
-+ return tg;
-+}
-+
-+void sched_online_group(struct task_group *tg, struct task_group *parent)
-+{
-+}
-+
-+/* rcu callback to free various structures associated with a task group */
-+static void sched_unregister_group_rcu(struct rcu_head *rhp)
-+{
-+ /* Now it should be safe to free those cfs_rqs: */
-+ sched_unregister_group(container_of(rhp, struct task_group, rcu));
-+}
-+
-+void sched_destroy_group(struct task_group *tg)
-+{
-+ /* Wait for possible concurrent references to cfs_rqs complete: */
-+ call_rcu(&tg->rcu, sched_unregister_group_rcu);
-+}
-+
-+void sched_release_group(struct task_group *tg)
-+{
-+}
-+
-+static inline struct task_group *css_tg(struct cgroup_subsys_state *css)
-+{
-+ return css ? container_of(css, struct task_group, css) : NULL;
-+}
-+
-+static struct cgroup_subsys_state *
-+cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
-+{
-+ struct task_group *parent = css_tg(parent_css);
-+ struct task_group *tg;
-+
-+ if (!parent) {
-+ /* This is early initialization for the top cgroup */
-+ return &root_task_group.css;
-+ }
-+
-+ tg = sched_create_group(parent);
-+ if (IS_ERR(tg))
-+ return ERR_PTR(-ENOMEM);
-+ return &tg->css;
-+}
-+
-+/* Expose task group only after completing cgroup initialization */
-+static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+ struct task_group *parent = css_tg(css->parent);
-+
-+ if (parent)
-+ sched_online_group(tg, parent);
-+ return 0;
-+}
-+
-+static void cpu_cgroup_css_released(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ sched_release_group(tg);
-+}
-+
-+static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ /*
-+ * Relies on the RCU grace period between css_released() and this.
-+ */
-+ sched_unregister_group(tg);
-+}
-+
-+#ifdef CONFIG_RT_GROUP_SCHED
-+static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
-+{
-+ return 0;
-+}
-+#endif
-+
-+static void cpu_cgroup_attach(struct cgroup_taskset *tset)
-+{
-+}
-+
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+static DEFINE_MUTEX(shares_mutex);
-+
-+int sched_group_set_shares(struct task_group *tg, unsigned long shares)
-+{
-+ /*
-+ * We can't change the weight of the root cgroup.
-+ */
-+ if (&root_task_group == tg)
-+ return -EINVAL;
-+
-+ shares = clamp(shares, scale_load(MIN_SHARES), scale_load(MAX_SHARES));
-+
-+ mutex_lock(&shares_mutex);
-+ if (tg->shares == shares)
-+ goto done;
-+
-+ tg->shares = shares;
-+done:
-+ mutex_unlock(&shares_mutex);
-+ return 0;
-+}
-+
-+static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
-+ struct cftype *cftype, u64 shareval)
-+{
-+ if (shareval > scale_load_down(ULONG_MAX))
-+ shareval = MAX_SHARES;
-+ return sched_group_set_shares(css_tg(css), scale_load(shareval));
-+}
-+
-+static u64 cpu_shares_read_u64(struct cgroup_subsys_state *css,
-+ struct cftype *cft)
-+{
-+ struct task_group *tg = css_tg(css);
-+
-+ return (u64) scale_load_down(tg->shares);
-+}
-+#endif
-+
-+static struct cftype cpu_legacy_files[] = {
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+ {
-+ .name = "shares",
-+ .read_u64 = cpu_shares_read_u64,
-+ .write_u64 = cpu_shares_write_u64,
-+ },
-+#endif
-+ { } /* Terminate */
-+};
-+
-+
-+static struct cftype cpu_files[] = {
-+ { } /* terminate */
-+};
-+
-+static int cpu_extra_stat_show(struct seq_file *sf,
-+ struct cgroup_subsys_state *css)
-+{
-+ return 0;
-+}
-+
-+struct cgroup_subsys cpu_cgrp_subsys = {
-+ .css_alloc = cpu_cgroup_css_alloc,
-+ .css_online = cpu_cgroup_css_online,
-+ .css_released = cpu_cgroup_css_released,
-+ .css_free = cpu_cgroup_css_free,
-+ .css_extra_stat_show = cpu_extra_stat_show,
-+#ifdef CONFIG_RT_GROUP_SCHED
-+ .can_attach = cpu_cgroup_can_attach,
-+#endif
-+ .attach = cpu_cgroup_attach,
-+ .legacy_cftypes = cpu_legacy_files,
-+ .dfl_cftypes = cpu_files,
-+ .early_init = true,
-+ .threaded = true,
-+};
-+#endif /* CONFIG_CGROUP_SCHED */
-+
-+#undef CREATE_TRACE_POINTS
-+
-+#ifdef CONFIG_SCHED_MM_CID
-+
-+/*
-+ * @cid_lock: Guarantee forward-progress of cid allocation.
-+ *
-+ * Concurrency ID allocation within a bitmap is mostly lock-free. The cid_lock
-+ * is only used when contention is detected by the lock-free allocation so
-+ * forward progress can be guaranteed.
-+ */
-+DEFINE_RAW_SPINLOCK(cid_lock);
-+
-+/*
-+ * @use_cid_lock: Select cid allocation behavior: lock-free vs spinlock.
-+ *
-+ * When @use_cid_lock is 0, the cid allocation is lock-free. When contention is
-+ * detected, it is set to 1 to ensure that all newly coming allocations are
-+ * serialized by @cid_lock until the allocation which detected contention
-+ * completes and sets @use_cid_lock back to 0. This guarantees forward progress
-+ * of a cid allocation.
-+ */
-+int use_cid_lock;
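/*
 * Simplified userspace sketch of the "lock-free with a spinlock fallback"
 * idea described for cid_lock/use_cid_lock above: allocation normally
 * claims a free bit in a bitmap with atomics only; once the fast path
 * fails, the allocator serializes through a lock so forward progress is
 * guaranteed. The bitmap size, helper names and the exact hand-off of
 * use_cid_lock are assumptions - the kernel's cid allocation is more
 * subtle. Build with -pthread.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NR_CIDS 64

static atomic_ulong cid_bitmap;			/* bit set => cid in use */
static pthread_mutex_t cid_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int use_cid_lock;

static int try_alloc_cid(void)
{
	for (int cid = 0; cid < NR_CIDS; cid++) {
		unsigned long bit = 1UL << cid;

		if (atomic_load(&cid_bitmap) & bit)
			continue;
		/* fetch_or tells us whether another thread beat us to the bit */
		if (!(atomic_fetch_or(&cid_bitmap, bit) & bit))
			return cid;
	}
	return -1;
}

static int alloc_cid(void)
{
	int cid;

	/* fast path: fully lock-free unless contention has been flagged */
	if (!atomic_load(&use_cid_lock)) {
		cid = try_alloc_cid();
		if (cid >= 0)
			return cid;
	}
	/* slow path: serialize through cid_lock to guarantee progress */
	pthread_mutex_lock(&cid_lock);
	atomic_store(&use_cid_lock, 1);
	cid = try_alloc_cid();
	atomic_store(&use_cid_lock, 0);
	pthread_mutex_unlock(&cid_lock);
	return cid;
}

int main(void)
{
	printf("first cid:  %d\n", alloc_cid());	/* 0 */
	printf("second cid: %d\n", alloc_cid());	/* 1 */
	return 0;
}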
-+
-+/*
-+ * mm_cid remote-clear implements a lock-free algorithm to clear per-mm/cpu cid
-+ * concurrently with respect to the execution of the source runqueue context
-+ * switch.
-+ *
-+ * There is one basic property we want to guarantee here:
-+ *
-+ * (1) Remote-clear should _never_ mark a per-cpu cid UNSET when it is actively
-+ * used by a task. That would lead to concurrent allocation of the cid and
-+ * userspace corruption.
-+ *
-+ * Provide this guarantee by introducing a Dekker memory ordering to guarantee
-+ * that a pair of loads observe at least one of a pair of stores, which can be
-+ * shown as:
-+ *
-+ * X = Y = 0
-+ *
-+ * w[X]=1 w[Y]=1
-+ * MB MB
-+ * r[Y]=y r[X]=x
-+ *
-+ * Which guarantees that x==0 && y==0 is impossible. But rather than using
-+ * values 0 and 1, this algorithm cares about specific state transitions of the
-+ * runqueue current task (as updated by the scheduler context switch), and the
-+ * per-mm/cpu cid value.
-+ *
-+ * Let's introduce task (Y) which has task->mm == mm and task (N) which has
-+ * task->mm != mm for the rest of the discussion. There are two scheduler state
-+ * transitions on context switch we care about:
-+ *
-+ * (TSA) Store to rq->curr with transition from (N) to (Y)
-+ *
-+ * (TSB) Store to rq->curr with transition from (Y) to (N)
-+ *
-+ * On the remote-clear side, there is one transition we care about:
-+ *
-+ * (TMA) cmpxchg to *pcpu_cid to set the LAZY flag
-+ *
-+ * There is also a transition to UNSET state which can be performed from all
-+ * sides (scheduler, remote-clear). It is always performed with a cmpxchg which
-+ * guarantees that only a single thread will succeed:
-+ *
-+ * (TMB) cmpxchg to *pcpu_cid to mark UNSET
-+ *
-+ * Just to be clear, what we do _not_ want to happen is a transition to UNSET
-+ * when a thread is actively using the cid (property (1)).
-+ *
-+ * Let's look at the relevant combinations of TSA/TSB and TMA transitions.
-+ *
-+ * Scenario A) (TSA)+(TMA) (from next task perspective)
-+ *
-+ * CPU0 CPU1
-+ *
-+ * Context switch CS-1 Remote-clear
-+ * - store to rq->curr: (N)->(Y) (TSA) - cmpxchg to *pcpu_id to LAZY (TMA)
-+ * (implied barrier after cmpxchg)
-+ * - switch_mm_cid()
-+ * - memory barrier (see switch_mm_cid()
-+ * comment explaining how this barrier
-+ * is combined with other scheduler
-+ * barriers)
-+ * - mm_cid_get (next)
-+ * - READ_ONCE(*pcpu_cid) - rcu_dereference(src_rq->curr)
-+ *
-+ * This Dekker ensures that either task (Y) is observed by the
-+ * rcu_dereference() or the LAZY flag is observed by READ_ONCE(), or both are
-+ * observed.
-+ *
-+ * If task (Y) store is observed by rcu_dereference(), it means that there is
-+ * still an active task on the cpu. Remote-clear will therefore not transition
-+ * to UNSET, which fulfills property (1).
-+ *
-+ * If task (Y) is not observed, but the lazy flag is observed by READ_ONCE(),
-+ * it will move its state to UNSET, which clears the percpu cid perhaps
-+ * uselessly (which is not an issue for correctness). Because task (Y) is not
-+ * observed, CPU1 can move ahead to set the state to UNSET. Because moving
-+ * state to UNSET is done with a cmpxchg expecting that the old state has the
-+ * LAZY flag set, only one thread will successfully UNSET.
-+ *
-+ * If both states (LAZY flag and task (Y)) are observed, the thread on CPU0
-+ * will observe the LAZY flag and transition to UNSET (perhaps uselessly), and
-+ * CPU1 will observe task (Y) and do nothing more, which is fine.
-+ *
-+ * What we are effectively preventing with this Dekker is a scenario where
-+ * neither LAZY flag nor store (Y) are observed, which would fail property (1)
-+ * because this would UNSET a cid which is actively used.
-+ */
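/*
 * Minimal store-buffering ("Dekker") demo of the guarantee the comment
 * above relies on: with a full memory barrier between each thread's store
 * and its load, the outcome rx == 0 && ry == 0 is impossible, i.e. at
 * least one side always observes the other's store. Userspace C11
 * illustration only; build with -pthread.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int X, Y;
static int rx, ry;

static void *thread0(void *arg)
{
	atomic_store_explicit(&X, 1, memory_order_relaxed);	/* w[X]=1 */
	atomic_thread_fence(memory_order_seq_cst);		/* MB */
	ry = atomic_load_explicit(&Y, memory_order_relaxed);	/* r[Y] */
	return NULL;
}

static void *thread1(void *arg)
{
	atomic_store_explicit(&Y, 1, memory_order_relaxed);	/* w[Y]=1 */
	atomic_thread_fence(memory_order_seq_cst);		/* MB */
	rx = atomic_load_explicit(&X, memory_order_relaxed);	/* r[X] */
	return NULL;
}

int main(void)
{
	for (int i = 0; i < 100000; i++) {
		pthread_t a, b;

		atomic_store(&X, 0);
		atomic_store(&Y, 0);
		pthread_create(&a, NULL, thread0, NULL);
		pthread_create(&b, NULL, thread1, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		if (rx == 0 && ry == 0)
			printf("ordering violation at iteration %d\n", i);
	}
	printf("done: rx==0 && ry==0 never observed, as the barriers guarantee\n");
	return 0;
}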
-+
-+void sched_mm_cid_migrate_from(struct task_struct *t)
-+{
-+ t->migrate_from_cpu = task_cpu(t);
-+}
-+
-+static
-+int __sched_mm_cid_migrate_from_fetch_cid(struct rq *src_rq,
-+ struct task_struct *t,
-+ struct mm_cid *src_pcpu_cid)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct task_struct *src_task;
-+ int src_cid, last_mm_cid;
-+
-+ if (!mm)
-+ return -1;
-+
-+ last_mm_cid = t->last_mm_cid;
-+ /*
-+ * If the migrated task has no last cid, or if the current
-+ * task on src rq uses the cid, it means the source cid does not need
-+ * to be moved to the destination cpu.
-+ */
-+ if (last_mm_cid == -1)
-+ return -1;
-+ src_cid = READ_ONCE(src_pcpu_cid->cid);
-+ if (!mm_cid_is_valid(src_cid) || last_mm_cid != src_cid)
-+ return -1;
-+
-+ /*
-+ * If we observe an active task using the mm on this rq, it means we
-+ * are not the last task to be migrated from this cpu for this mm, so
-+ * there is no need to move src_cid to the destination cpu.
-+ */
-+ rcu_read_lock();
-+ src_task = rcu_dereference(src_rq->curr);
-+ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
-+ rcu_read_unlock();
-+ t->last_mm_cid = -1;
-+ return -1;
-+ }
-+ rcu_read_unlock();
-+
-+ return src_cid;
-+}
-+
-+static
-+int __sched_mm_cid_migrate_from_try_steal_cid(struct rq *src_rq,
-+ struct task_struct *t,
-+ struct mm_cid *src_pcpu_cid,
-+ int src_cid)
-+{
-+ struct task_struct *src_task;
-+ struct mm_struct *mm = t->mm;
-+ int lazy_cid;
-+
-+ if (src_cid == -1)
-+ return -1;
-+
-+ /*
-+ * Attempt to clear the source cpu cid to move it to the destination
-+ * cpu.
-+ */
-+ lazy_cid = mm_cid_set_lazy_put(src_cid);
-+ if (!try_cmpxchg(&src_pcpu_cid->cid, &src_cid, lazy_cid))
-+ return -1;
-+
-+ /*
-+ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
-+ * rq->curr->mm matches the scheduler barrier in context_switch()
-+ * between store to rq->curr and load of prev and next task's
-+ * per-mm/cpu cid.
-+ *
-+ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
-+ * rq->curr->mm_cid_active matches the barrier in
-+ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
-+ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
-+ * load of per-mm/cpu cid.
-+ */
-+
-+ /*
-+ * If we observe an active task using the mm on this rq after setting
-+ * the lazy-put flag, this task will be responsible for transitioning
-+ * from lazy-put flag set to MM_CID_UNSET.
-+ */
-+ rcu_read_lock();
-+ src_task = rcu_dereference(src_rq->curr);
-+ if (READ_ONCE(src_task->mm_cid_active) && src_task->mm == mm) {
-+ rcu_read_unlock();
-+ /*
-+ * We observed an active task for this mm, there is therefore
-+ * no point in moving this cid to the destination cpu.
-+ */
-+ t->last_mm_cid = -1;
-+ return -1;
-+ }
-+ rcu_read_unlock();
-+
-+ /*
-+ * The src_cid is unused, so it can be unset.
-+ */
-+ if (!try_cmpxchg(&src_pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
-+ return -1;
-+ return src_cid;
-+}
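/*
 * Userspace sketch of the two-step lazy-put protocol used above: a remote
 * thread may only UNSET a cid after (1) publishing a LAZY mark with
 * cmpxchg, (2) re-checking that no task is actively using the mm, and
 * (3) cmpxchg'ing LAZY -> UNSET. Losing any cmpxchg means another thread
 * owns the transition and we back off. CID_LAZY_BIT and the helper names
 * are hypothetical stand-ins for the kernel's MM_CID_* encoding.
 */
#include <stdatomic.h>
#include <stdio.h>

#define CID_UNSET	(-1)
#define CID_LAZY_BIT	0x40000000

static atomic_int pcpu_cid = 3;		/* some valid cid */
static atomic_int cid_in_active_use;	/* stands in for "rq->curr uses this mm" */

static int try_remote_clear(void)
{
	int cid = atomic_load(&pcpu_cid);

	if (cid == CID_UNSET || (cid & CID_LAZY_BIT))
		return -1;

	int lazy = cid | CID_LAZY_BIT;

	/* step 1: publish the lazy-put mark; only one thread can win this */
	if (!atomic_compare_exchange_strong(&pcpu_cid, &cid, lazy))
		return -1;
	/* step 2: re-check for an active user now that the mark is visible */
	if (atomic_load(&cid_in_active_use))
		return -1;	/* the active user finishes the transition */
	/* step 3: LAZY -> UNSET; again only one thread can win */
	if (!atomic_compare_exchange_strong(&pcpu_cid, &lazy, CID_UNSET))
		return -1;
	return cid;		/* cid reclaimed and may go back to the bitmap */
}

int main(void)
{
	int reclaimed = try_remote_clear();

	printf("reclaimed cid %d, per-cpu slot now %d\n",
	       reclaimed, atomic_load(&pcpu_cid));
	return 0;
}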
-+
-+/*
-+ * Migration to dst cpu. Called with dst_rq lock held.
-+ * Interrupts are disabled, which keeps the window of cid ownership without the
-+ * source rq lock held small.
-+ */
-+void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu)
-+{
-+ struct mm_cid *src_pcpu_cid, *dst_pcpu_cid;
-+ struct mm_struct *mm = t->mm;
-+ int src_cid, dst_cid;
-+ struct rq *src_rq;
-+
-+ lockdep_assert_rq_held(dst_rq);
-+
-+ if (!mm)
-+ return;
-+ if (src_cpu == -1) {
-+ t->last_mm_cid = -1;
-+ return;
-+ }
-+ /*
-+ * Move the src cid if the dst cid is unset. This keeps id
-+ * allocation closest to 0 in cases where few threads migrate around
-+ * many cpus.
-+ *
-+ * If destination cid is already set, we may have to just clear
-+ * the src cid to ensure compactness in frequent migrations
-+ * scenarios.
-+ *
-+ * It is not useful to clear the src cid when the number of threads is
-+ * greater or equal to the number of allowed cpus, because user-space
-+ * can expect that the number of allowed cids can reach the number of
-+ * allowed cpus.
-+ */
-+ dst_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(dst_rq));
-+ dst_cid = READ_ONCE(dst_pcpu_cid->cid);
-+ if (!mm_cid_is_unset(dst_cid) &&
-+ atomic_read(&mm->mm_users) >= t->nr_cpus_allowed)
-+ return;
-+ src_pcpu_cid = per_cpu_ptr(mm->pcpu_cid, src_cpu);
-+ src_rq = cpu_rq(src_cpu);
-+ src_cid = __sched_mm_cid_migrate_from_fetch_cid(src_rq, t, src_pcpu_cid);
-+ if (src_cid == -1)
-+ return;
-+ src_cid = __sched_mm_cid_migrate_from_try_steal_cid(src_rq, t, src_pcpu_cid,
-+ src_cid);
-+ if (src_cid == -1)
-+ return;
-+ if (!mm_cid_is_unset(dst_cid)) {
-+ __mm_cid_put(mm, src_cid);
-+ return;
-+ }
-+ /* Move src_cid to dst cpu. */
-+ mm_cid_snapshot_time(dst_rq, mm);
-+ WRITE_ONCE(dst_pcpu_cid->cid, src_cid);
-+}
-+
-+static void sched_mm_cid_remote_clear(struct mm_struct *mm, struct mm_cid *pcpu_cid,
-+ int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ struct task_struct *t;
-+ unsigned long flags;
-+ int cid, lazy_cid;
-+
-+ cid = READ_ONCE(pcpu_cid->cid);
-+ if (!mm_cid_is_valid(cid))
-+ return;
-+
-+ /*
-+ * Clear the cpu cid if it is set to keep cid allocation compact. If
-+ * there happens to be other tasks left on the source cpu using this
-+ * mm, the next task using this mm will reallocate its cid on context
-+ * switch.
-+ */
-+ lazy_cid = mm_cid_set_lazy_put(cid);
-+ if (!try_cmpxchg(&pcpu_cid->cid, &cid, lazy_cid))
-+ return;
-+
-+ /*
-+ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
-+ * rq->curr->mm matches the scheduler barrier in context_switch()
-+ * between store to rq->curr and load of prev and next task's
-+ * per-mm/cpu cid.
-+ *
-+ * The implicit barrier after cmpxchg per-mm/cpu cid before loading
-+ * rq->curr->mm_cid_active matches the barrier in
-+ * sched_mm_cid_exit_signals(), sched_mm_cid_before_execve(), and
-+ * sched_mm_cid_after_execve() between store to t->mm_cid_active and
-+ * load of per-mm/cpu cid.
-+ */
-+
-+ /*
-+ * If we observe an active task using the mm on this rq after setting
-+ * the lazy-put flag, that task will be responsible for transitioning
-+ * from lazy-put flag set to MM_CID_UNSET.
-+ */
-+ rcu_read_lock();
-+ t = rcu_dereference(rq->curr);
-+ if (READ_ONCE(t->mm_cid_active) && t->mm == mm) {
-+ rcu_read_unlock();
-+ return;
-+ }
-+ rcu_read_unlock();
-+
-+ /*
-+ * The cid is unused, so it can be unset.
-+ * Disable interrupts to keep the window of cid ownership without rq
-+ * lock small.
-+ */
-+ local_irq_save(flags);
-+ if (try_cmpxchg(&pcpu_cid->cid, &lazy_cid, MM_CID_UNSET))
-+ __mm_cid_put(mm, cid);
-+ local_irq_restore(flags);
-+}
-+
-+static void sched_mm_cid_remote_clear_old(struct mm_struct *mm, int cpu)
-+{
-+ struct rq *rq = cpu_rq(cpu);
-+ struct mm_cid *pcpu_cid;
-+ struct task_struct *curr;
-+ u64 rq_clock;
-+
-+ /*
-+ * rq->clock load is racy on 32-bit but one spurious clear once in a
-+ * while is irrelevant.
-+ */
-+ rq_clock = READ_ONCE(rq->clock);
-+ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
-+
-+ /*
-+ * In order to take care of infrequently scheduled tasks, bump the time
-+ * snapshot associated with this cid if an active task using the mm is
-+ * observed on this rq.
-+ */
-+ rcu_read_lock();
-+ curr = rcu_dereference(rq->curr);
-+ if (READ_ONCE(curr->mm_cid_active) && curr->mm == mm) {
-+ WRITE_ONCE(pcpu_cid->time, rq_clock);
-+ rcu_read_unlock();
-+ return;
-+ }
-+ rcu_read_unlock();
-+
-+ if (rq_clock < pcpu_cid->time + SCHED_MM_CID_PERIOD_NS)
-+ return;
-+ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
-+}
-+
-+static void sched_mm_cid_remote_clear_weight(struct mm_struct *mm, int cpu,
-+ int weight)
-+{
-+ struct mm_cid *pcpu_cid;
-+ int cid;
-+
-+ pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu);
-+ cid = READ_ONCE(pcpu_cid->cid);
-+ if (!mm_cid_is_valid(cid) || cid < weight)
-+ return;
-+ sched_mm_cid_remote_clear(mm, pcpu_cid, cpu);
-+}
-+
-+static void task_mm_cid_work(struct callback_head *work)
-+{
-+ unsigned long now = jiffies, old_scan, next_scan;
-+ struct task_struct *t = current;
-+ struct cpumask *cidmask;
-+ struct mm_struct *mm;
-+ int weight, cpu;
-+
-+ SCHED_WARN_ON(t != container_of(work, struct task_struct, cid_work));
-+
-+ work->next = work; /* Prevent double-add */
-+ if (t->flags & PF_EXITING)
-+ return;
-+ mm = t->mm;
-+ if (!mm)
-+ return;
-+ old_scan = READ_ONCE(mm->mm_cid_next_scan);
-+ next_scan = now + msecs_to_jiffies(MM_CID_SCAN_DELAY);
-+ if (!old_scan) {
-+ unsigned long res;
-+
-+ res = cmpxchg(&mm->mm_cid_next_scan, old_scan, next_scan);
-+ if (res != old_scan)
-+ old_scan = res;
-+ else
-+ old_scan = next_scan;
-+ }
-+ if (time_before(now, old_scan))
-+ return;
-+ if (!try_cmpxchg(&mm->mm_cid_next_scan, &old_scan, next_scan))
-+ return;
-+ cidmask = mm_cidmask(mm);
-+ /* Clear cids that were not recently used. */
-+ for_each_possible_cpu(cpu)
-+ sched_mm_cid_remote_clear_old(mm, cpu);
-+ weight = cpumask_weight(cidmask);
-+ /*
-+ * Clear cids that are greater or equal to the cidmask weight to
-+ * recompact it.
-+ */
-+ for_each_possible_cpu(cpu)
-+ sched_mm_cid_remote_clear_weight(mm, cpu, weight);
-+}
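/*
 * Tiny worked example of the compaction rule applied by task_mm_cid_work()
 * above: with cids 0, 2 and 5 in use the cidmask weight is 3, so any cid
 * >= 3 (here: 5) gets cleared and the next allocation will land below the
 * number of concurrent users. Plain userspace arithmetic, assuming a
 * 64-bit cid space for the example.
 */
#include <stdio.h>

int main(void)
{
	unsigned long cidmask = (1UL << 0) | (1UL << 2) | (1UL << 5);
	int weight = __builtin_popcountl(cidmask);	/* 3 concurrent users */

	printf("cidmask weight = %d\n", weight);
	for (int cid = 0; cid < 64; cid++)
		if ((cidmask & (1UL << cid)) && cid >= weight)
			printf("cid %d would be cleared to recompact below %d\n",
			       cid, weight);
	return 0;
}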
-+
-+void init_sched_mm_cid(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ int mm_users = 0;
-+
-+ if (mm) {
-+ mm_users = atomic_read(&mm->mm_users);
-+ if (mm_users == 1)
-+ mm->mm_cid_next_scan = jiffies + msecs_to_jiffies(MM_CID_SCAN_DELAY);
-+ }
-+ t->cid_work.next = &t->cid_work; /* Protect against double add */
-+ init_task_work(&t->cid_work, task_mm_cid_work);
-+}
-+
-+void task_tick_mm_cid(struct rq *rq, struct task_struct *curr)
-+{
-+ struct callback_head *work = &curr->cid_work;
-+ unsigned long now = jiffies;
-+
-+ if (!curr->mm || (curr->flags & (PF_EXITING | PF_KTHREAD)) ||
-+ work->next != work)
-+ return;
-+ if (time_before(now, READ_ONCE(curr->mm->mm_cid_next_scan)))
-+ return;
-+ task_work_add(curr, work, TWA_RESUME);
-+}
-+
-+void sched_mm_cid_exit_signals(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct rq_flags rf;
-+ struct rq *rq;
-+
-+ if (!mm)
-+ return;
-+
-+ preempt_disable();
-+ rq = this_rq();
-+ rq_lock_irqsave(rq, &rf);
-+ preempt_enable_no_resched(); /* holding spinlock */
-+ WRITE_ONCE(t->mm_cid_active, 0);
-+ /*
-+ * Store t->mm_cid_active before loading per-mm/cpu cid.
-+ * Matches barrier in sched_mm_cid_remote_clear_old().
-+ */
-+ smp_mb();
-+ mm_cid_put(mm);
-+ t->last_mm_cid = t->mm_cid = -1;
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+void sched_mm_cid_before_execve(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct rq_flags rf;
-+ struct rq *rq;
-+
-+ if (!mm)
-+ return;
-+
-+ preempt_disable();
-+ rq = this_rq();
-+ rq_lock_irqsave(rq, &rf);
-+ preempt_enable_no_resched(); /* holding spinlock */
-+ WRITE_ONCE(t->mm_cid_active, 0);
-+ /*
-+ * Store t->mm_cid_active before loading per-mm/cpu cid.
-+ * Matches barrier in sched_mm_cid_remote_clear_old().
-+ */
-+ smp_mb();
-+ mm_cid_put(mm);
-+ t->last_mm_cid = t->mm_cid = -1;
-+ rq_unlock_irqrestore(rq, &rf);
-+}
-+
-+void sched_mm_cid_after_execve(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct rq_flags rf;
-+ struct rq *rq;
-+
-+ if (!mm)
-+ return;
-+
-+ preempt_disable();
-+ rq = this_rq();
-+ rq_lock_irqsave(rq, &rf);
-+ preempt_enable_no_resched(); /* holding spinlock */
-+ WRITE_ONCE(t->mm_cid_active, 1);
-+ /*
-+ * Store t->mm_cid_active before loading per-mm/cpu cid.
-+ * Matches barrier in sched_mm_cid_remote_clear_old().
-+ */
-+ smp_mb();
-+ t->last_mm_cid = t->mm_cid = mm_cid_get(rq, mm);
-+ rq_unlock_irqrestore(rq, &rf);
-+ rseq_set_notify_resume(t);
-+}
-+
-+void sched_mm_cid_fork(struct task_struct *t)
-+{
-+ WARN_ON_ONCE(!t->mm || t->mm_cid != -1);
-+ t->mm_cid_active = 1;
-+}
-+#endif
-diff --git a/kernel/sched/alt_debug.c b/kernel/sched/alt_debug.c
-new file mode 100644
-index 000000000000..1212a031700e
---- /dev/null
-+++ b/kernel/sched/alt_debug.c
-@@ -0,0 +1,31 @@
-+/*
-+ * kernel/sched/alt_debug.c
-+ *
-+ * Print the alt scheduler debugging details
-+ *
-+ * Author: Alfred Chen
-+ * Date : 2020
-+ */
-+#include "sched.h"
-+
-+/*
-+ * This allows printing both to /proc/sched_debug and
-+ * to the console
-+ */
-+#define SEQ_printf(m, x...) \
-+ do { \
-+ if (m) \
-+ seq_printf(m, x); \
-+ else \
-+ pr_cont(x); \
-+ } while (0)
-+
-+void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
-+ struct seq_file *m)
-+{
-+ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, task_pid_nr_ns(p, ns),
-+ get_nr_threads(p));
-+}
-+
-+void proc_sched_set_task(struct task_struct *p)
-+{}
-diff --git a/kernel/sched/alt_sched.h b/kernel/sched/alt_sched.h
-new file mode 100644
-index 000000000000..5494f27cdb04
---- /dev/null
-+++ b/kernel/sched/alt_sched.h
-@@ -0,0 +1,906 @@
-+#ifndef ALT_SCHED_H
-+#define ALT_SCHED_H
-+
-+#include <linux/context_tracking.h>
-+#include <linux/profile.h>
-+#include <linux/stop_machine.h>
-+#include <linux/syscalls.h>
-+#include <linux/tick.h>
-+
-+#include <trace/events/power.h>
-+#include <trace/events/sched.h>
-+
-+#include "../workqueue_internal.h"
-+
-+#include "cpupri.h"
-+
-+#ifdef CONFIG_SCHED_BMQ
-+/* bits:
-+ * RT(0-99), (Low prio adj range, nice width, high prio adj range) / 2, cpu idle task */
-+#define SCHED_LEVELS (MAX_RT_PRIO + NICE_WIDTH / 2 + MAX_PRIORITY_ADJ + 1)
-+#endif
-+
-+#ifdef CONFIG_SCHED_PDS
-+/* bits: RT(0-24), reserved(25-31), SCHED_NORMAL_PRIO_NUM(32), cpu idle task(1) */
-+#define SCHED_LEVELS (64 + 1)
-+#endif /* CONFIG_SCHED_PDS */
-+
-+#define IDLE_TASK_SCHED_PRIO (SCHED_LEVELS - 1)
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+# define SCHED_WARN_ON(x) WARN_ONCE(x, #x)
-+extern void resched_latency_warn(int cpu, u64 latency);
-+#else
-+# define SCHED_WARN_ON(x) ({ (void)(x), 0; })
-+static inline void resched_latency_warn(int cpu, u64 latency) {}
-+#endif
-+
-+/*
-+ * Increase resolution of nice-level calculations for 64-bit architectures.
-+ * The extra resolution improves shares distribution and load balancing of
-+ * low-weight task groups (eg. nice +19 on an autogroup), deeper taskgroup
-+ * hierarchies, especially on larger systems. This is not a user-visible change
-+ * and does not change the user-interface for setting shares/weights.
-+ *
-+ * We increase resolution only if we have enough bits to allow this increased
-+ * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
-+ * are pretty high and the returns do not justify the increased costs.
-+ *
-+ * Really only required when CONFIG_FAIR_GROUP_SCHED=y is also set, but to
-+ * increase coverage and consistency always enable it on 64-bit platforms.
-+ */
-+#ifdef CONFIG_64BIT
-+# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT + SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load(w) ((w) << SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load_down(w) \
-+({ \
-+ unsigned long __w = (w); \
-+ if (__w) \
-+ __w = max(2UL, __w >> SCHED_FIXEDPOINT_SHIFT); \
-+ __w; \
-+})
-+#else
-+# define NICE_0_LOAD_SHIFT (SCHED_FIXEDPOINT_SHIFT)
-+# define scale_load(w) (w)
-+# define scale_load_down(w) (w)
-+#endif
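/*
 * Worked example of the fixed-point weight scaling above, assuming the
 * usual SCHED_FIXEDPOINT_SHIFT of 10 on a 64-bit build: user-visible
 * weights (e.g. a cpu.shares write of 1024) are shifted up into the
 * high-resolution domain internally and shifted back down (clamped to a
 * minimum of 2) when read. Userspace sketch, not the kernel macros.
 */
#include <stdio.h>

#define SCHED_FIXEDPOINT_SHIFT 10	/* assumed to match the kernel */

static unsigned long scale_load(unsigned long w)
{
	return w << SCHED_FIXEDPOINT_SHIFT;
}

static unsigned long scale_load_down(unsigned long w)
{
	unsigned long s;

	if (!w)
		return 0;
	s = w >> SCHED_FIXEDPOINT_SHIFT;
	return s < 2 ? 2 : s;
}

int main(void)
{
	printf("cpu.shares write of 1024 stored as %lu\n", scale_load(1024));
	printf("stored %lu read back as %lu\n", 1048576UL,
	       scale_load_down(1048576UL));
	printf("a tiny stored weight of 15 reads back as %lu (clamped)\n",
	       scale_load_down(15));
	return 0;
}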
-+
-+#ifdef CONFIG_FAIR_GROUP_SCHED
-+#define ROOT_TASK_GROUP_LOAD NICE_0_LOAD
-+
-+/*
-+ * A weight of 0 or 1 can cause arithmetic problems.
-+ * The weight of a cfs_rq is the sum of the weights of the entities
-+ * queued on that cfs_rq, so the weight of an entity should not be
-+ * too large, and neither should the shares value of a task group.
-+ * (The default weight is 1024 - so there's no practical
-+ * limitation from this.)
-+ */
-+#define MIN_SHARES (1UL << 1)
-+#define MAX_SHARES (1UL << 18)
-+#endif
-+
-+/*
-+ * Tunables that become constants when CONFIG_SCHED_DEBUG is off:
-+ */
-+#ifdef CONFIG_SCHED_DEBUG
-+# define const_debug __read_mostly
-+#else
-+# define const_debug const
-+#endif
-+
-+/* task_struct::on_rq states: */
-+#define TASK_ON_RQ_QUEUED 1
-+#define TASK_ON_RQ_MIGRATING 2
-+
-+static inline int task_on_rq_queued(struct task_struct *p)
-+{
-+ return p->on_rq == TASK_ON_RQ_QUEUED;
-+}
-+
-+static inline int task_on_rq_migrating(struct task_struct *p)
-+{
-+ return READ_ONCE(p->on_rq) == TASK_ON_RQ_MIGRATING;
-+}
-+
-+/*
-+ * wake flags
-+ */
-+#define WF_SYNC 0x01 /* waker goes to sleep after wakeup */
-+#define WF_FORK 0x02 /* child wakeup after fork */
-+#define WF_MIGRATED 0x04 /* internal use, task got migrated */
-+
-+#define SCHED_QUEUE_BITS (SCHED_LEVELS - 1)
-+
-+struct sched_queue {
-+ DECLARE_BITMAP(bitmap, SCHED_QUEUE_BITS);
-+ struct list_head heads[SCHED_LEVELS];
-+};
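/*
 * Simplified userspace model of the sched_queue idea above: one list head
 * per priority level plus a bitmap marking the non-empty levels, so the
 * highest-priority runnable entry is found with a single find-first-bit.
 * The 64 levels, singly linked lists and helper names are assumptions for
 * this example; the kernel uses DECLARE_BITMAP and struct list_head.
 */
#include <stdint.h>
#include <stdio.h>

#define LEVELS 64

struct node {
	struct node *next;
	int val;
};

struct prio_queue {
	uint64_t bitmap;		/* bit p set => heads[p] non-empty */
	struct node *heads[LEVELS];
};

static void pq_push(struct prio_queue *q, struct node *n, int prio)
{
	n->next = q->heads[prio];
	q->heads[prio] = n;
	q->bitmap |= 1ULL << prio;
}

static struct node *pq_pop_highest(struct prio_queue *q)
{
	if (!q->bitmap)
		return NULL;

	int prio = __builtin_ctzll(q->bitmap);	/* lowest set bit = best prio */
	struct node *n = q->heads[prio];

	q->heads[prio] = n->next;
	if (!q->heads[prio])
		q->bitmap &= ~(1ULL << prio);
	return n;
}

int main(void)
{
	struct prio_queue q = { 0 };
	struct node a = { .val = 1 }, b = { .val = 2 };

	pq_push(&q, &a, 10);
	pq_push(&q, &b, 3);
	printf("picked value %d first (priority 3 beats 10)\n",
	       pq_pop_highest(&q)->val);
	return 0;
}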
-+
-+struct rq;
-+struct cpuidle_state;
-+
-+struct balance_callback {
-+ struct balance_callback *next;
-+ void (*func)(struct rq *rq);
-+};
-+
-+/*
-+ * This is the main, per-CPU runqueue data structure.
-+ * This data should only be modified by the local cpu.
-+ */
-+struct rq {
-+ /* runqueue lock: */
-+ raw_spinlock_t lock;
-+
-+ struct task_struct __rcu *curr;
-+ struct task_struct *idle, *stop, *skip;
-+ struct mm_struct *prev_mm;
-+
-+ struct sched_queue queue;
-+#ifdef CONFIG_SCHED_PDS
-+ u64 time_edge;
-+#endif
-+ unsigned long prio;
-+
-+ /* switch count */
-+ u64 nr_switches;
-+
-+ atomic_t nr_iowait;
-+
-+#ifdef CONFIG_SCHED_DEBUG
-+ u64 last_seen_need_resched_ns;
-+ int ticks_without_resched;
-+#endif
-+
-+#ifdef CONFIG_MEMBARRIER
-+ int membarrier_state;
-+#endif
-+
-+#ifdef CONFIG_SMP
-+ int cpu; /* cpu of this runqueue */
-+ bool online;
-+
-+ unsigned int ttwu_pending;
-+ unsigned char nohz_idle_balance;
-+ unsigned char idle_balance;
-+
-+#ifdef CONFIG_HAVE_SCHED_AVG_IRQ
-+ struct sched_avg avg_irq;
-+#endif
-+
-+#ifdef CONFIG_SCHED_SMT
-+ int active_balance;
-+ struct cpu_stop_work active_balance_work;
-+#endif
-+ struct balance_callback *balance_callback;
-+#ifdef CONFIG_HOTPLUG_CPU
-+ struct rcuwait hotplug_wait;
-+#endif
-+ unsigned int nr_pinned;
-+
-+#endif /* CONFIG_SMP */
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+ u64 prev_irq_time;
-+#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
-+#ifdef CONFIG_PARAVIRT
-+ u64 prev_steal_time;
-+#endif /* CONFIG_PARAVIRT */
-+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
-+ u64 prev_steal_time_rq;
-+#endif /* CONFIG_PARAVIRT_TIME_ACCOUNTING */
-+
-+	/* For general cpu load util */
-+ s32 load_history;
-+ u64 load_block;
-+ u64 load_stamp;
-+
-+ /* calc_load related fields */
-+ unsigned long calc_load_update;
-+ long calc_load_active;
-+
-+ u64 clock, last_tick;
-+ u64 last_ts_switch;
-+ u64 clock_task;
-+
-+ unsigned int nr_running;
-+ unsigned long nr_uninterruptible;
-+
-+#ifdef CONFIG_SCHED_HRTICK
-+#ifdef CONFIG_SMP
-+ call_single_data_t hrtick_csd;
-+#endif
-+ struct hrtimer hrtick_timer;
-+ ktime_t hrtick_time;
-+#endif
-+
-+#ifdef CONFIG_SCHEDSTATS
-+
-+ /* latency stats */
-+ struct sched_info rq_sched_info;
-+ unsigned long long rq_cpu_time;
-+ /* could above be rq->cfs_rq.exec_clock + rq->rt_rq.rt_runtime ? */
-+
-+ /* sys_sched_yield() stats */
-+ unsigned int yld_count;
-+
-+ /* schedule() stats */
-+ unsigned int sched_switch;
-+ unsigned int sched_count;
-+ unsigned int sched_goidle;
-+
-+ /* try_to_wake_up() stats */
-+ unsigned int ttwu_count;
-+ unsigned int ttwu_local;
-+#endif /* CONFIG_SCHEDSTATS */
-+
-+#ifdef CONFIG_CPU_IDLE
-+ /* Must be inspected within a rcu lock section */
-+ struct cpuidle_state *idle_state;
-+#endif
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+#ifdef CONFIG_SMP
-+ call_single_data_t nohz_csd;
-+#endif
-+ atomic_t nohz_flags;
-+#endif /* CONFIG_NO_HZ_COMMON */
-+
-+ /* Scratch cpumask to be temporarily used under rq_lock */
-+ cpumask_var_t scratch_mask;
-+};
-+
-+extern unsigned long rq_load_util(struct rq *rq, unsigned long max);
-+
-+extern unsigned long calc_load_update;
-+extern atomic_long_t calc_load_tasks;
-+
-+extern void calc_global_load_tick(struct rq *this_rq);
-+extern long calc_load_fold_active(struct rq *this_rq, long adjust);
-+
-+DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
-+#define cpu_rq(cpu) (&per_cpu(runqueues, (cpu)))
-+#define this_rq() this_cpu_ptr(&runqueues)
-+#define task_rq(p) cpu_rq(task_cpu(p))
-+#define cpu_curr(cpu) (cpu_rq(cpu)->curr)
-+#define raw_rq() raw_cpu_ptr(&runqueues)
-+
-+#ifdef CONFIG_SMP
-+#if defined(CONFIG_SCHED_DEBUG) && defined(CONFIG_SYSCTL)
-+void register_sched_domain_sysctl(void);
-+void unregister_sched_domain_sysctl(void);
-+#else
-+static inline void register_sched_domain_sysctl(void)
-+{
-+}
-+static inline void unregister_sched_domain_sysctl(void)
-+{
-+}
-+#endif
-+
-+extern bool sched_smp_initialized;
-+
-+enum {
-+ ITSELF_LEVEL_SPACE_HOLDER,
-+#ifdef CONFIG_SCHED_SMT
-+ SMT_LEVEL_SPACE_HOLDER,
-+#endif
-+ COREGROUP_LEVEL_SPACE_HOLDER,
-+ CORE_LEVEL_SPACE_HOLDER,
-+ OTHER_LEVEL_SPACE_HOLDER,
-+ NR_CPU_AFFINITY_LEVELS
-+};
-+
-+DECLARE_PER_CPU_ALIGNED(cpumask_t [NR_CPU_AFFINITY_LEVELS], sched_cpu_topo_masks);
-+
-+static inline int
-+__best_mask_cpu(const cpumask_t *cpumask, const cpumask_t *mask)
-+{
-+ int cpu;
-+
-+ while ((cpu = cpumask_any_and(cpumask, mask)) >= nr_cpu_ids)
-+ mask++;
-+
-+ return cpu;
-+}
-+
-+static inline int best_mask_cpu(int cpu, const cpumask_t *mask)
-+{
-+ return __best_mask_cpu(mask, per_cpu(sched_cpu_topo_masks, cpu));
-+}
-+
-+extern void flush_smp_call_function_queue(void);
-+
-+#else /* !CONFIG_SMP */
-+static inline void flush_smp_call_function_queue(void) { }
-+#endif
-+
-+#ifndef arch_scale_freq_tick
-+static __always_inline
-+void arch_scale_freq_tick(void)
-+{
-+}
-+#endif
-+
-+#ifndef arch_scale_freq_capacity
-+static __always_inline
-+unsigned long arch_scale_freq_capacity(int cpu)
-+{
-+ return SCHED_CAPACITY_SCALE;
-+}
-+#endif
-+
-+static inline u64 __rq_clock_broken(struct rq *rq)
-+{
-+ return READ_ONCE(rq->clock);
-+}
-+
-+static inline u64 rq_clock(struct rq *rq)
-+{
-+ /*
-+	 * Relax lockdep_assert_held() checking as in VRQ, a call to
-+	 * sched_info_xxxx() may not hold rq->lock
-+ * lockdep_assert_held(&rq->lock);
-+ */
-+ return rq->clock;
-+}
-+
-+static inline u64 rq_clock_task(struct rq *rq)
-+{
-+ /*
-+	 * Relax lockdep_assert_held() checking as in VRQ, a call to
-+	 * sched_info_xxxx() may not hold rq->lock
-+ * lockdep_assert_held(&rq->lock);
-+ */
-+ return rq->clock_task;
-+}
-+
-+/*
-+ * {de,en}queue flags:
-+ *
-+ * DEQUEUE_SLEEP - task is no longer runnable
-+ * ENQUEUE_WAKEUP - task just became runnable
-+ *
-+ */
-+
-+#define DEQUEUE_SLEEP 0x01
-+
-+#define ENQUEUE_WAKEUP 0x01
-+
-+
-+/*
-+ * Below are scheduler APIs which are used in other kernel code.
-+ * They use the dummy rq_flags.
-+ * TODO: BMQ needs to support these APIs for compatibility with mainline
-+ * scheduler code.
-+ */
-+struct rq_flags {
-+ unsigned long flags;
-+};
-+
-+struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(rq->lock);
-+
-+struct rq *task_rq_lock(struct task_struct *p, struct rq_flags *rf)
-+ __acquires(p->pi_lock)
-+ __acquires(rq->lock);
-+
-+static inline void __task_rq_unlock(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+static inline void
-+task_rq_unlock(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
-+ __releases(rq->lock)
-+ __releases(p->pi_lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+ raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);
-+}
-+
-+static inline void
-+rq_lock(struct rq *rq, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ raw_spin_lock(&rq->lock);
-+}
-+
-+static inline void
-+rq_unlock(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock(&rq->lock);
-+}
-+
-+static inline void
-+rq_lock_irq(struct rq *rq, struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ raw_spin_lock_irq(&rq->lock);
-+}
-+
-+static inline void
-+rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
-+ __releases(rq->lock)
-+{
-+ raw_spin_unlock_irq(&rq->lock);
-+}
-+
-+static inline struct rq *
-+this_rq_lock_irq(struct rq_flags *rf)
-+ __acquires(rq->lock)
-+{
-+ struct rq *rq;
-+
-+ local_irq_disable();
-+ rq = this_rq();
-+ raw_spin_lock(&rq->lock);
-+
-+ return rq;
-+}
-+
-+static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
-+{
-+ return &rq->lock;
-+}
-+
-+static inline raw_spinlock_t *rq_lockp(struct rq *rq)
-+{
-+ return __rq_lockp(rq);
-+}
-+
-+static inline void lockdep_assert_rq_held(struct rq *rq)
-+{
-+ lockdep_assert_held(__rq_lockp(rq));
-+}
-+
-+extern void raw_spin_rq_lock_nested(struct rq *rq, int subclass);
-+extern void raw_spin_rq_unlock(struct rq *rq);
-+
-+static inline void raw_spin_rq_lock(struct rq *rq)
-+{
-+ raw_spin_rq_lock_nested(rq, 0);
-+}
-+
-+static inline void raw_spin_rq_lock_irq(struct rq *rq)
-+{
-+ local_irq_disable();
-+ raw_spin_rq_lock(rq);
-+}
-+
-+static inline void raw_spin_rq_unlock_irq(struct rq *rq)
-+{
-+ raw_spin_rq_unlock(rq);
-+ local_irq_enable();
-+}
-+
-+static inline int task_current(struct rq *rq, struct task_struct *p)
-+{
-+ return rq->curr == p;
-+}
-+
-+static inline bool task_on_cpu(struct task_struct *p)
-+{
-+ return p->on_cpu;
-+}
-+
-+extern int task_running_nice(struct task_struct *p);
-+
-+extern struct static_key_false sched_schedstats;
-+
-+#ifdef CONFIG_CPU_IDLE
-+static inline void idle_set_state(struct rq *rq,
-+ struct cpuidle_state *idle_state)
-+{
-+ rq->idle_state = idle_state;
-+}
-+
-+static inline struct cpuidle_state *idle_get_state(struct rq *rq)
-+{
-+ WARN_ON(!rcu_read_lock_held());
-+ return rq->idle_state;
-+}
-+#else
-+static inline void idle_set_state(struct rq *rq,
-+ struct cpuidle_state *idle_state)
-+{
-+}
-+
-+static inline struct cpuidle_state *idle_get_state(struct rq *rq)
-+{
-+ return NULL;
-+}
-+#endif
-+
-+static inline int cpu_of(const struct rq *rq)
-+{
-+#ifdef CONFIG_SMP
-+ return rq->cpu;
-+#else
-+ return 0;
-+#endif
-+}
-+
-+#include "stats.h"
-+
-+#ifdef CONFIG_NO_HZ_COMMON
-+#define NOHZ_BALANCE_KICK_BIT 0
-+#define NOHZ_STATS_KICK_BIT 1
-+
-+#define NOHZ_BALANCE_KICK BIT(NOHZ_BALANCE_KICK_BIT)
-+#define NOHZ_STATS_KICK BIT(NOHZ_STATS_KICK_BIT)
-+
-+#define NOHZ_KICK_MASK (NOHZ_BALANCE_KICK | NOHZ_STATS_KICK)
-+
-+#define nohz_flags(cpu) (&cpu_rq(cpu)->nohz_flags)
-+
-+/* TODO: needed?
-+extern void nohz_balance_exit_idle(struct rq *rq);
-+#else
-+static inline void nohz_balance_exit_idle(struct rq *rq) { }
-+*/
-+#endif
-+
-+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
-+struct irqtime {
-+ u64 total;
-+ u64 tick_delta;
-+ u64 irq_start_time;
-+ struct u64_stats_sync sync;
-+};
-+
-+DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
-+
-+/*
-+ * Returns the irqtime minus the softirq time computed by ksoftirqd.
-+ * Otherwise ksoftirqd's sum_exec_runtime is substracted its own runtime
-+ * and never move forward.
-+ */
-+static inline u64 irq_time_read(int cpu)
-+{
-+ struct irqtime *irqtime = &per_cpu(cpu_irqtime, cpu);
-+ unsigned int seq;
-+ u64 total;
-+
-+ do {
-+ seq = __u64_stats_fetch_begin(&irqtime->sync);
-+ total = irqtime->total;
-+ } while (__u64_stats_fetch_retry(&irqtime->sync, seq));
-+
-+ return total;
-+}
-+#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
-+
-+#ifdef CONFIG_CPU_FREQ
-+DECLARE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);
-+#endif /* CONFIG_CPU_FREQ */
-+
-+#ifdef CONFIG_NO_HZ_FULL
-+extern int __init sched_tick_offload_init(void);
-+#else
-+static inline int sched_tick_offload_init(void) { return 0; }
-+#endif
-+
-+#ifdef arch_scale_freq_capacity
-+#ifndef arch_scale_freq_invariant
-+#define arch_scale_freq_invariant() (true)
-+#endif
-+#else /* arch_scale_freq_capacity */
-+#define arch_scale_freq_invariant() (false)
-+#endif
-+
-+extern void schedule_idle(void);
-+
-+#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
-+
-+/*
-+ * !! For sched_setattr_nocheck() (kernel) only !!
-+ *
-+ * This is actually gross. :(
-+ *
-+ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
-+ * tasks, but still be able to sleep. We need this on platforms that cannot
-+ * atomically change clock frequency. Remove once fast switching will be
-+ * available on such platforms.
-+ *
-+ * SUGOV stands for SchedUtil GOVernor.
-+ */
-+#define SCHED_FLAG_SUGOV 0x10000000
-+
-+#ifdef CONFIG_MEMBARRIER
-+/*
-+ * The scheduler provides memory barriers required by membarrier between:
-+ * - prior user-space memory accesses and store to rq->membarrier_state,
-+ * - store to rq->membarrier_state and following user-space memory accesses.
-+ * In the same way it provides those guarantees around store to rq->curr.
-+ */
-+static inline void membarrier_switch_mm(struct rq *rq,
-+ struct mm_struct *prev_mm,
-+ struct mm_struct *next_mm)
-+{
-+ int membarrier_state;
-+
-+ if (prev_mm == next_mm)
-+ return;
-+
-+ membarrier_state = atomic_read(&next_mm->membarrier_state);
-+ if (READ_ONCE(rq->membarrier_state) == membarrier_state)
-+ return;
-+
-+ WRITE_ONCE(rq->membarrier_state, membarrier_state);
-+}
-+#else
-+static inline void membarrier_switch_mm(struct rq *rq,
-+ struct mm_struct *prev_mm,
-+ struct mm_struct *next_mm)
-+{
-+}
-+#endif
-+
-+#ifdef CONFIG_NUMA
-+extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
-+#else
-+static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
-+{
-+ return nr_cpu_ids;
-+}
-+#endif
-+
-+extern void swake_up_all_locked(struct swait_queue_head *q);
-+extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
-+
-+#ifdef CONFIG_PREEMPT_DYNAMIC
-+extern int preempt_dynamic_mode;
-+extern int sched_dynamic_mode(const char *str);
-+extern void sched_dynamic_update(int mode);
-+#endif
-+
-+static inline void nohz_run_idle_balance(int cpu) { }
-+
-+static inline
-+unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
-+ struct task_struct *p)
-+{
-+ return util;
-+}
-+
-+static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
-+
-+#ifdef CONFIG_SCHED_MM_CID
-+
-+#define SCHED_MM_CID_PERIOD_NS (100ULL * 1000000) /* 100ms */
-+#define MM_CID_SCAN_DELAY 100 /* 100ms */
-+
-+extern raw_spinlock_t cid_lock;
-+extern int use_cid_lock;
-+
-+extern void sched_mm_cid_migrate_from(struct task_struct *t);
-+extern void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu);
-+extern void task_tick_mm_cid(struct rq *rq, struct task_struct *curr);
-+extern void init_sched_mm_cid(struct task_struct *t);
-+
-+static inline void __mm_cid_put(struct mm_struct *mm, int cid)
-+{
-+ if (cid < 0)
-+ return;
-+ cpumask_clear_cpu(cid, mm_cidmask(mm));
-+}
-+
-+/*
-+ * The per-mm/cpu cid can have the MM_CID_LAZY_PUT flag set or transition to
-+ * the MM_CID_UNSET state without holding the rq lock, but the rq lock needs to
-+ * be held to transition to other states.
-+ *
-+ * State transitions synchronized with cmpxchg or try_cmpxchg need to be
-+ * consistent across cpus, which prevents use of this_cpu_cmpxchg.
-+ */
-+static inline void mm_cid_put_lazy(struct task_struct *t)
-+{
-+ struct mm_struct *mm = t->mm;
-+ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
-+ int cid;
-+
-+ lockdep_assert_irqs_disabled();
-+ cid = __this_cpu_read(pcpu_cid->cid);
-+ if (!mm_cid_is_lazy_put(cid) ||
-+ !try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
-+ return;
-+ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
-+}
-+
-+static inline int mm_cid_pcpu_unset(struct mm_struct *mm)
-+{
-+ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
-+ int cid, res;
-+
-+ lockdep_assert_irqs_disabled();
-+ cid = __this_cpu_read(pcpu_cid->cid);
-+ for (;;) {
-+ if (mm_cid_is_unset(cid))
-+ return MM_CID_UNSET;
-+ /*
-+ * Attempt transition from valid or lazy-put to unset.
-+ */
-+ res = cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, cid, MM_CID_UNSET);
-+ if (res == cid)
-+ break;
-+ cid = res;
-+ }
-+ return cid;
-+}
-+
-+static inline void mm_cid_put(struct mm_struct *mm)
-+{
-+ int cid;
-+
-+ lockdep_assert_irqs_disabled();
-+ cid = mm_cid_pcpu_unset(mm);
-+ if (cid == MM_CID_UNSET)
-+ return;
-+ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
-+}
-+
-+static inline int __mm_cid_try_get(struct mm_struct *mm)
-+{
-+ struct cpumask *cpumask;
-+ int cid;
-+
-+ cpumask = mm_cidmask(mm);
-+ /*
-+ * Retry finding first zero bit if the mask is temporarily
-+ * filled. This only happens during concurrent remote-clear
-+ * which owns a cid without holding a rq lock.
-+ */
-+ for (;;) {
-+ cid = cpumask_first_zero(cpumask);
-+ if (cid < nr_cpu_ids)
-+ break;
-+ cpu_relax();
-+ }
-+ if (cpumask_test_and_set_cpu(cid, cpumask))
-+ return -1;
-+ return cid;
-+}
-+
-+/*
-+ * Save a snapshot of the current runqueue time of this cpu
-+ * with the per-cpu cid value, allowing to estimate how recently it was used.
-+ */
-+static inline void mm_cid_snapshot_time(struct rq *rq, struct mm_struct *mm)
-+{
-+ struct mm_cid *pcpu_cid = per_cpu_ptr(mm->pcpu_cid, cpu_of(rq));
-+
-+ lockdep_assert_rq_held(rq);
-+ WRITE_ONCE(pcpu_cid->time, rq->clock);
-+}
-+
-+static inline int __mm_cid_get(struct rq *rq, struct mm_struct *mm)
-+{
-+ int cid;
-+
-+ /*
-+ * All allocations (even those using the cid_lock) are lock-free. If
-+ * use_cid_lock is set, hold the cid_lock to perform cid allocation to
-+ * guarantee forward progress.
-+ */
-+ if (!READ_ONCE(use_cid_lock)) {
-+ cid = __mm_cid_try_get(mm);
-+ if (cid >= 0)
-+ goto end;
-+ raw_spin_lock(&cid_lock);
-+ } else {
-+ raw_spin_lock(&cid_lock);
-+ cid = __mm_cid_try_get(mm);
-+ if (cid >= 0)
-+ goto unlock;
-+ }
-+
-+ /*
-+ * cid concurrently allocated. Retry while forcing following
-+ * allocations to use the cid_lock to ensure forward progress.
-+ */
-+ WRITE_ONCE(use_cid_lock, 1);
-+ /*
-+ * Set use_cid_lock before allocation. Only care about program order
-+ * because this is only required for forward progress.
-+ */
-+ barrier();
-+ /*
-+ * Retry until it succeeds. It is guaranteed to eventually succeed once
-+ * all newcoming allocations observe the use_cid_lock flag set.
-+ */
-+ do {
-+ cid = __mm_cid_try_get(mm);
-+ cpu_relax();
-+ } while (cid < 0);
-+ /*
-+ * Allocate before clearing use_cid_lock. Only care about
-+ * program order because this is for forward progress.
-+ */
-+ barrier();
-+ WRITE_ONCE(use_cid_lock, 0);
-+unlock:
-+ raw_spin_unlock(&cid_lock);
-+end:
-+ mm_cid_snapshot_time(rq, mm);
-+ return cid;
-+}
-+
-+static inline int mm_cid_get(struct rq *rq, struct mm_struct *mm)
-+{
-+ struct mm_cid __percpu *pcpu_cid = mm->pcpu_cid;
-+ struct cpumask *cpumask;
-+ int cid;
-+
-+ lockdep_assert_rq_held(rq);
-+ cpumask = mm_cidmask(mm);
-+ cid = __this_cpu_read(pcpu_cid->cid);
-+ if (mm_cid_is_valid(cid)) {
-+ mm_cid_snapshot_time(rq, mm);
-+ return cid;
-+ }
-+ if (mm_cid_is_lazy_put(cid)) {
-+ if (try_cmpxchg(&this_cpu_ptr(pcpu_cid)->cid, &cid, MM_CID_UNSET))
-+ __mm_cid_put(mm, mm_cid_clear_lazy_put(cid));
-+ }
-+ cid = __mm_cid_get(rq, mm);
-+ __this_cpu_write(pcpu_cid->cid, cid);
-+ return cid;
-+}
-+
-+static inline void switch_mm_cid(struct rq *rq,
-+ struct task_struct *prev,
-+ struct task_struct *next)
-+{
-+ /*
-+ * Provide a memory barrier between rq->curr store and load of
-+ * {prev,next}->mm->pcpu_cid[cpu] on rq->curr->mm transition.
-+ *
-+ * Should be adapted if context_switch() is modified.
-+ */
-+ if (!next->mm) { // to kernel
-+ /*
-+ * user -> kernel transition does not guarantee a barrier, but
-+ * we can use the fact that it performs an atomic operation in
-+ * mmgrab().
-+ */
-+ if (prev->mm) // from user
-+ smp_mb__after_mmgrab();
-+ /*
-+ * kernel -> kernel transition does not change rq->curr->mm
-+ * state. It stays NULL.
-+ */
-+ } else { // to user
-+ /*
-+ * kernel -> user transition does not provide a barrier
-+ * between rq->curr store and load of {prev,next}->mm->pcpu_cid[cpu].
-+ * Provide it here.
-+ */
-+ if (!prev->mm) // from kernel
-+ smp_mb();
-+ /*
-+ * user -> user transition guarantees a memory barrier through
-+ * switch_mm() when current->mm changes. If current->mm is
-+ * unchanged, no barrier is needed.
-+ */
-+ }
-+ if (prev->mm_cid_active) {
-+ mm_cid_snapshot_time(rq, prev->mm);
-+ mm_cid_put_lazy(prev);
-+ prev->mm_cid = -1;
-+ }
-+ if (next->mm_cid_active)
-+ next->last_mm_cid = next->mm_cid = mm_cid_get(rq, next->mm);
-+}
-+
-+#else
-+static inline void switch_mm_cid(struct rq *rq, struct task_struct *prev, struct task_struct *next) { }
-+static inline void sched_mm_cid_migrate_from(struct task_struct *t) { }
-+static inline void sched_mm_cid_migrate_to(struct rq *dst_rq, struct task_struct *t, int src_cpu) { }
-+static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
-+static inline void init_sched_mm_cid(struct task_struct *t) { }
-+#endif
-+
-+#endif /* ALT_SCHED_H */
-diff --git a/kernel/sched/bmq.h b/kernel/sched/bmq.h
-new file mode 100644
-index 000000000000..f29b8f3aa786
---- /dev/null
-+++ b/kernel/sched/bmq.h
-@@ -0,0 +1,110 @@
-+#define ALT_SCHED_NAME "BMQ"
-+
-+/*
-+ * BMQ only routines
-+ */
-+#define rq_switch_time(rq) ((rq)->clock - (rq)->last_ts_switch)
-+#define boost_threshold(p) (sched_timeslice_ns >>\
-+ (15 - MAX_PRIORITY_ADJ - (p)->boost_prio))
-+
-+static inline void boost_task(struct task_struct *p)
-+{
-+ int limit;
-+
-+ switch (p->policy) {
-+ case SCHED_NORMAL:
-+ limit = -MAX_PRIORITY_ADJ;
-+ break;
-+ case SCHED_BATCH:
-+ case SCHED_IDLE:
-+ limit = 0;
-+ break;
-+ default:
-+ return;
-+ }
-+
-+ if (p->boost_prio > limit)
-+ p->boost_prio--;
-+}
-+
-+static inline void deboost_task(struct task_struct *p)
-+{
-+ if (p->boost_prio < MAX_PRIORITY_ADJ)
-+ p->boost_prio++;
-+}
-+
-+/*
-+ * Common interfaces
-+ */
-+static inline void sched_timeslice_imp(const int timeslice_ms) {}
-+
-+static inline int
-+task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
-+{
-+ return p->prio + p->boost_prio - MAX_RT_PRIO;
-+}
-+
-+static inline int task_sched_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MAX_RT_PRIO)? p->prio : MAX_RT_PRIO / 2 + (p->prio + p->boost_prio) / 2;
-+}
-+
-+static inline int
-+task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
-+{
-+ return task_sched_prio(p);
-+}
-+
-+static inline int sched_prio2idx(int prio, struct rq *rq)
-+{
-+ return prio;
-+}
-+
-+static inline int sched_idx2prio(int idx, struct rq *rq)
-+{
-+ return idx;
-+}
-+
-+static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
-+{
-+ p->time_slice = sched_timeslice_ns;
-+
-+ if (SCHED_FIFO != p->policy && task_on_rq_queued(p)) {
-+ if (SCHED_RR != p->policy)
-+ deboost_task(p);
-+ requeue_task(p, rq, task_sched_prio_idx(p, rq));
-+ }
-+}
-+
-+static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq) {}
-+
-+inline int task_running_nice(struct task_struct *p)
-+{
-+ return (p->prio + p->boost_prio > DEFAULT_PRIO + MAX_PRIORITY_ADJ);
-+}
-+
-+static void sched_task_fork(struct task_struct *p, struct rq *rq)
-+{
-+ p->boost_prio = MAX_PRIORITY_ADJ;
-+}
-+
-+static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
-+{
-+ p->boost_prio = MAX_PRIORITY_ADJ;
-+}
-+
-+#ifdef CONFIG_SMP
-+static inline void sched_task_ttwu(struct task_struct *p)
-+{
-+ if(this_rq()->clock_task - p->last_ran > sched_timeslice_ns)
-+ boost_task(p);
-+}
-+#endif
-+
-+static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq)
-+{
-+ if (rq_switch_time(rq) < boost_threshold(p))
-+ boost_task(p);
-+}
-+
-+static inline void update_rq_time_edge(struct rq *rq) {}
-diff --git a/kernel/sched/build_policy.c b/kernel/sched/build_policy.c
-index d9dc9ab3773f..71a25540d65e 100644
---- a/kernel/sched/build_policy.c
-+++ b/kernel/sched/build_policy.c
-@@ -42,13 +42,19 @@
-
- #include "idle.c"
-
-+#ifndef CONFIG_SCHED_ALT
- #include "rt.c"
-+#endif
-
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- # include "cpudeadline.c"
-+#endif
- # include "pelt.c"
- #endif
-
- #include "cputime.c"
--#include "deadline.c"
-
-+#ifndef CONFIG_SCHED_ALT
-+#include "deadline.c"
-+#endif
-diff --git a/kernel/sched/build_utility.c b/kernel/sched/build_utility.c
-index 99bdd96f454f..23f80a86d2d7 100644
---- a/kernel/sched/build_utility.c
-+++ b/kernel/sched/build_utility.c
-@@ -85,7 +85,9 @@
-
- #ifdef CONFIG_SMP
- # include "cpupri.c"
-+#ifndef CONFIG_SCHED_ALT
- # include "stop_task.c"
-+#endif
- # include "topology.c"
- #endif
-
-diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
-index e3211455b203..87f7a4f732c8 100644
---- a/kernel/sched/cpufreq_schedutil.c
-+++ b/kernel/sched/cpufreq_schedutil.c
-@@ -157,9 +157,14 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
- {
- struct rq *rq = cpu_rq(sg_cpu->cpu);
-
-+#ifndef CONFIG_SCHED_ALT
- sg_cpu->bw_dl = cpu_bw_dl(rq);
- sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu),
- FREQUENCY_UTIL, NULL);
-+#else
-+ sg_cpu->bw_dl = 0;
-+ sg_cpu->util = rq_load_util(rq, arch_scale_cpu_capacity(sg_cpu->cpu));
-+#endif /* CONFIG_SCHED_ALT */
- }
-
- /**
-@@ -305,8 +310,10 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
- */
- static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu)
- {
-+#ifndef CONFIG_SCHED_ALT
- if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_dl)
- sg_cpu->sg_policy->limits_changed = true;
-+#endif
- }
-
- static inline bool sugov_update_single_common(struct sugov_cpu *sg_cpu,
-@@ -609,6 +616,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
- }
-
- ret = sched_setattr_nocheck(thread, &attr);
-+
- if (ret) {
- kthread_stop(thread);
- pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
-@@ -841,7 +849,9 @@ cpufreq_governor_init(schedutil_gov);
- #ifdef CONFIG_ENERGY_MODEL
- static void rebuild_sd_workfn(struct work_struct *work)
- {
-+#ifndef CONFIG_SCHED_ALT
- rebuild_sched_domains_energy();
-+#endif /* CONFIG_SCHED_ALT */
- }
- static DECLARE_WORK(rebuild_sd_work, rebuild_sd_workfn);
-
-diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
-index af7952f12e6c..6461cbbb734d 100644
---- a/kernel/sched/cputime.c
-+++ b/kernel/sched/cputime.c
-@@ -126,7 +126,7 @@ void account_user_time(struct task_struct *p, u64 cputime)
- p->utime += cputime;
- account_group_user_time(p, cputime);
-
-- index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
-+ index = task_running_nice(p) ? CPUTIME_NICE : CPUTIME_USER;
-
- /* Add user time to cpustat. */
- task_group_account_field(p, index, cputime);
-@@ -150,7 +150,7 @@ void account_guest_time(struct task_struct *p, u64 cputime)
- p->gtime += cputime;
-
- /* Add guest time to cpustat. */
-- if (task_nice(p) > 0) {
-+ if (task_running_nice(p)) {
- task_group_account_field(p, CPUTIME_NICE, cputime);
- cpustat[CPUTIME_GUEST_NICE] += cputime;
- } else {
-@@ -288,7 +288,7 @@ static inline u64 account_other_time(u64 max)
- #ifdef CONFIG_64BIT
- static inline u64 read_sum_exec_runtime(struct task_struct *t)
- {
-- return t->se.sum_exec_runtime;
-+ return tsk_seruntime(t);
- }
- #else
- static u64 read_sum_exec_runtime(struct task_struct *t)
-@@ -298,7 +298,7 @@ static u64 read_sum_exec_runtime(struct task_struct *t)
- struct rq *rq;
-
- rq = task_rq_lock(t, &rf);
-- ns = t->se.sum_exec_runtime;
-+ ns = tsk_seruntime(t);
- task_rq_unlock(rq, t, &rf);
-
- return ns;
-@@ -630,7 +630,7 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
- void task_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
- {
- struct task_cputime cputime = {
-- .sum_exec_runtime = p->se.sum_exec_runtime,
-+ .sum_exec_runtime = tsk_seruntime(p),
- };
-
- if (task_cputime(p, &cputime.utime, &cputime.stime))
-diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
-index 0b2340a79b65..1e5407b8a738 100644
---- a/kernel/sched/debug.c
-+++ b/kernel/sched/debug.c
-@@ -7,6 +7,7 @@
- * Copyright(C) 2007, Red Hat, Inc., Ingo Molnar
- */
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * This allows printing both to /proc/sched_debug and
- * to the console
-@@ -215,6 +216,7 @@ static const struct file_operations sched_scaling_fops = {
- };
-
- #endif /* SMP */
-+#endif /* !CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_PREEMPT_DYNAMIC
-
-@@ -278,6 +280,7 @@ static const struct file_operations sched_dynamic_fops = {
-
- #endif /* CONFIG_PREEMPT_DYNAMIC */
-
-+#ifndef CONFIG_SCHED_ALT
- __read_mostly bool sched_debug_verbose;
-
- #ifdef CONFIG_SMP
-@@ -332,6 +335,7 @@ static const struct file_operations sched_debug_fops = {
- .llseek = seq_lseek,
- .release = seq_release,
- };
-+#endif /* !CONFIG_SCHED_ALT */
-
- static struct dentry *debugfs_sched;
-
-@@ -341,12 +345,16 @@ static __init int sched_init_debug(void)
-
- debugfs_sched = debugfs_create_dir("sched", NULL);
-
-+#ifndef CONFIG_SCHED_ALT
- debugfs_create_file("features", 0644, debugfs_sched, NULL, &sched_feat_fops);
- debugfs_create_file_unsafe("verbose", 0644, debugfs_sched, &sched_debug_verbose, &sched_verbose_fops);
-+ debugfs_create_bool("verbose", 0644, debugfs_sched, &sched_debug_verbose);
-+#endif /* !CONFIG_SCHED_ALT */
- #ifdef CONFIG_PREEMPT_DYNAMIC
- debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
- #endif
-
-+#ifndef CONFIG_SCHED_ALT
- debugfs_create_u32("latency_ns", 0644, debugfs_sched, &sysctl_sched_latency);
- debugfs_create_u32("min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_min_granularity);
- debugfs_create_u32("idle_min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_idle_min_granularity);
-@@ -376,11 +384,13 @@ static __init int sched_init_debug(void)
- #endif
-
- debugfs_create_file("debug", 0444, debugfs_sched, NULL, &sched_debug_fops);
-+#endif /* !CONFIG_SCHED_ALT */
-
- return 0;
- }
- late_initcall(sched_init_debug);
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_SMP
-
- static cpumask_var_t sd_sysctl_cpus;
-@@ -1114,6 +1124,7 @@ void proc_sched_set_task(struct task_struct *p)
- memset(&p->stats, 0, sizeof(p->stats));
- #endif
- }
-+#endif /* !CONFIG_SCHED_ALT */
-
- void resched_latency_warn(int cpu, u64 latency)
- {
-diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
-index 342f58a329f5..ab493e759084 100644
---- a/kernel/sched/idle.c
-+++ b/kernel/sched/idle.c
-@@ -379,6 +379,7 @@ void cpu_startup_entry(enum cpuhp_state state)
- do_idle();
- }
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * idle-task scheduling class.
- */
-@@ -500,3 +501,4 @@ DEFINE_SCHED_CLASS(idle) = {
- .switched_to = switched_to_idle,
- .update_curr = update_curr_idle,
- };
-+#endif
-diff --git a/kernel/sched/pds.h b/kernel/sched/pds.h
-new file mode 100644
-index 000000000000..15cc4887efed
---- /dev/null
-+++ b/kernel/sched/pds.h
-@@ -0,0 +1,152 @@
-+#define ALT_SCHED_NAME "PDS"
-+
-+#define MIN_SCHED_NORMAL_PRIO (32)
-+static const u64 RT_MASK = ((1ULL << MIN_SCHED_NORMAL_PRIO) - 1);
-+
-+#define SCHED_NORMAL_PRIO_NUM (32)
-+#define SCHED_EDGE_DELTA (SCHED_NORMAL_PRIO_NUM - NICE_WIDTH / 2)
-+
-+/* PDS assume NORMAL_PRIO_NUM is power of 2 */
-+#define SCHED_NORMAL_PRIO_MOD(x) ((x) & (SCHED_NORMAL_PRIO_NUM - 1))
-+
-+/* default time slice 4ms -> shift 22, 2 time slice slots -> shift 23 */
-+static __read_mostly int sched_timeslice_shift = 23;
-+
-+/*
-+ * Common interfaces
-+ */
-+static inline void sched_timeslice_imp(const int timeslice_ms)
-+{
-+ if (2 == timeslice_ms)
-+ sched_timeslice_shift = 22;
-+}
-+
-+static inline int
-+task_sched_prio_normal(const struct task_struct *p, const struct rq *rq)
-+{
-+ s64 delta = p->deadline - rq->time_edge + SCHED_EDGE_DELTA;
-+
-+#ifdef ALT_SCHED_DEBUG
-+ if (WARN_ONCE(delta > NORMAL_PRIO_NUM - 1,
-+ "pds: task_sched_prio_normal() delta %lld\n", delta))
-+ return SCHED_NORMAL_PRIO_NUM - 1;
-+#endif
-+
-+ return max(0LL, delta);
-+}
-+
-+static inline int task_sched_prio(const struct task_struct *p)
-+{
-+ return (p->prio < MIN_NORMAL_PRIO) ? (p->prio >> 2) :
-+ MIN_SCHED_NORMAL_PRIO + task_sched_prio_normal(p, task_rq(p));
-+}
-+
-+static inline int
-+task_sched_prio_idx(const struct task_struct *p, const struct rq *rq)
-+{
-+ u64 idx;
-+
-+ if (p->prio < MIN_NORMAL_PRIO)
-+ return p->prio >> 2;
-+
-+ idx = max(p->deadline + SCHED_EDGE_DELTA, rq->time_edge);
-+ /*printk(KERN_INFO "sched: task_sched_prio_idx edge:%llu, deadline=%llu idx=%llu\n", rq->time_edge, p->deadline, idx);*/
-+ return MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(idx);
-+}
-+
-+static inline int sched_prio2idx(int sched_prio, struct rq *rq)
-+{
-+ return (IDLE_TASK_SCHED_PRIO == sched_prio || sched_prio < MIN_SCHED_NORMAL_PRIO) ?
-+ sched_prio :
-+ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_prio + rq->time_edge);
-+}
-+
-+static inline int sched_idx2prio(int sched_idx, struct rq *rq)
-+{
-+ return (sched_idx < MIN_SCHED_NORMAL_PRIO) ?
-+ sched_idx :
-+ MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(sched_idx - rq->time_edge);
-+}
-+
-+static inline void sched_renew_deadline(struct task_struct *p, const struct rq *rq)
-+{
-+ if (p->prio >= MIN_NORMAL_PRIO)
-+ p->deadline = rq->time_edge + (p->static_prio - (MAX_PRIO - NICE_WIDTH)) / 2;
-+}
-+
-+int task_running_nice(struct task_struct *p)
-+{
-+ return (p->prio > DEFAULT_PRIO);
-+}
-+
-+static inline void update_rq_time_edge(struct rq *rq)
-+{
-+ struct list_head head;
-+ u64 old = rq->time_edge;
-+ u64 now = rq->clock >> sched_timeslice_shift;
-+ u64 prio, delta;
-+ DECLARE_BITMAP(normal, SCHED_QUEUE_BITS);
-+
-+ if (now == old)
-+ return;
-+
-+ rq->time_edge = now;
-+ delta = min_t(u64, SCHED_NORMAL_PRIO_NUM, now - old);
-+ INIT_LIST_HEAD(&head);
-+
-+ /*printk(KERN_INFO "sched: update_rq_time_edge 0x%016lx %llu\n", rq->queue.bitmap[0], delta);*/
-+ prio = MIN_SCHED_NORMAL_PRIO;
-+ for_each_set_bit_from(prio, rq->queue.bitmap, MIN_SCHED_NORMAL_PRIO + delta)
-+ list_splice_tail_init(rq->queue.heads + MIN_SCHED_NORMAL_PRIO +
-+ SCHED_NORMAL_PRIO_MOD(prio + old), &head);
-+
-+ bitmap_shift_right(normal, rq->queue.bitmap, delta, SCHED_QUEUE_BITS);
-+ if (!list_empty(&head)) {
-+ struct task_struct *p;
-+ u64 idx = MIN_SCHED_NORMAL_PRIO + SCHED_NORMAL_PRIO_MOD(now);
-+
-+ list_for_each_entry(p, &head, sq_node)
-+ p->sq_idx = idx;
-+
-+ list_splice(&head, rq->queue.heads + idx);
-+ set_bit(MIN_SCHED_NORMAL_PRIO, normal);
-+ }
-+ bitmap_replace(rq->queue.bitmap, normal, rq->queue.bitmap,
-+ (const unsigned long *)&RT_MASK, SCHED_QUEUE_BITS);
-+
-+ if (rq->prio < MIN_SCHED_NORMAL_PRIO || IDLE_TASK_SCHED_PRIO == rq->prio)
-+ return;
-+
-+ rq->prio = (rq->prio < MIN_SCHED_NORMAL_PRIO + delta) ?
-+ MIN_SCHED_NORMAL_PRIO : rq->prio - delta;
-+}
-+
-+static inline void time_slice_expired(struct task_struct *p, struct rq *rq)
-+{
-+ p->time_slice = sched_timeslice_ns;
-+ sched_renew_deadline(p, rq);
-+ if (SCHED_FIFO != p->policy && task_on_rq_queued(p))
-+ requeue_task(p, rq, task_sched_prio_idx(p, rq));
-+}
-+
-+static inline void sched_task_sanity_check(struct task_struct *p, struct rq *rq)
-+{
-+ u64 max_dl = rq->time_edge + NICE_WIDTH / 2 - 1;
-+ if (unlikely(p->deadline > max_dl))
-+ p->deadline = max_dl;
-+}
-+
-+static void sched_task_fork(struct task_struct *p, struct rq *rq)
-+{
-+ sched_renew_deadline(p, rq);
-+}
-+
-+static inline void do_sched_yield_type_1(struct task_struct *p, struct rq *rq)
-+{
-+ time_slice_expired(p, rq);
-+}
-+
-+#ifdef CONFIG_SMP
-+static inline void sched_task_ttwu(struct task_struct *p) {}
-+#endif
-+static inline void sched_task_deactivate(struct task_struct *p, struct rq *rq) {}
-diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
-index 0f310768260c..bd38bf738fe9 100644
---- a/kernel/sched/pelt.c
-+++ b/kernel/sched/pelt.c
-@@ -266,6 +266,7 @@ ___update_load_avg(struct sched_avg *sa, unsigned long load)
- WRITE_ONCE(sa->util_avg, sa->util_sum / divider);
- }
-
-+#ifndef CONFIG_SCHED_ALT
- /*
- * sched_entity:
- *
-@@ -383,8 +384,9 @@ int update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
-
- return 0;
- }
-+#endif
-
--#ifdef CONFIG_SCHED_THERMAL_PRESSURE
-+#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
- /*
- * thermal:
- *
-diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
-index 3a0e0dc28721..e8a7d84aa5a5 100644
---- a/kernel/sched/pelt.h
-+++ b/kernel/sched/pelt.h
-@@ -1,13 +1,15 @@
- #ifdef CONFIG_SMP
- #include "sched-pelt.h"
-
-+#ifndef CONFIG_SCHED_ALT
- int __update_load_avg_blocked_se(u64 now, struct sched_entity *se);
- int __update_load_avg_se(u64 now, struct cfs_rq *cfs_rq, struct sched_entity *se);
- int __update_load_avg_cfs_rq(u64 now, struct cfs_rq *cfs_rq);
- int update_rt_rq_load_avg(u64 now, struct rq *rq, int running);
- int update_dl_rq_load_avg(u64 now, struct rq *rq, int running);
-+#endif
-
--#ifdef CONFIG_SCHED_THERMAL_PRESSURE
-+#if defined(CONFIG_SCHED_THERMAL_PRESSURE) && !defined(CONFIG_SCHED_ALT)
- int update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity);
-
- static inline u64 thermal_load_avg(struct rq *rq)
-@@ -44,6 +46,7 @@ static inline u32 get_pelt_divider(struct sched_avg *avg)
- return PELT_MIN_DIVIDER + avg->period_contrib;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- static inline void cfs_se_util_change(struct sched_avg *avg)
- {
- unsigned int enqueued;
-@@ -180,9 +183,11 @@ static inline u64 cfs_rq_clock_pelt(struct cfs_rq *cfs_rq)
- return rq_clock_pelt(rq_of(cfs_rq));
- }
- #endif
-+#endif /* CONFIG_SCHED_ALT */
-
- #else
-
-+#ifndef CONFIG_SCHED_ALT
- static inline int
- update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
- {
-@@ -200,6 +205,7 @@ update_dl_rq_load_avg(u64 now, struct rq *rq, int running)
- {
- return 0;
- }
-+#endif
-
- static inline int
- update_thermal_load_avg(u64 now, struct rq *rq, u64 capacity)
-diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
-index ec7b3e0a2b20..3b4052dd7bee 100644
---- a/kernel/sched/sched.h
-+++ b/kernel/sched/sched.h
-@@ -5,6 +5,10 @@
- #ifndef _KERNEL_SCHED_SCHED_H
- #define _KERNEL_SCHED_SCHED_H
-
-+#ifdef CONFIG_SCHED_ALT
-+#include "alt_sched.h"
-+#else
-+
- #include <linux/sched/affinity.h>
- #include <linux/sched/autogroup.h>
- #include <linux/sched/cpufreq.h>
-@@ -3487,4 +3491,9 @@ static inline void task_tick_mm_cid(struct rq *rq, struct task_struct *curr) { }
- static inline void init_sched_mm_cid(struct task_struct *t) { }
- #endif
-
-+static inline int task_running_nice(struct task_struct *p)
-+{
-+ return (task_nice(p) > 0);
-+}
-+#endif /* !CONFIG_SCHED_ALT */
- #endif /* _KERNEL_SCHED_SCHED_H */
-diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
-index 857f837f52cb..5486c63e4790 100644
---- a/kernel/sched/stats.c
-+++ b/kernel/sched/stats.c
-@@ -125,8 +125,10 @@ static int show_schedstat(struct seq_file *seq, void *v)
- } else {
- struct rq *rq;
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- struct sched_domain *sd;
- int dcount = 0;
-+#endif
- #endif
- cpu = (unsigned long)(v - 2);
- rq = cpu_rq(cpu);
-@@ -143,6 +145,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
- seq_printf(seq, "\n");
-
- #ifdef CONFIG_SMP
-+#ifndef CONFIG_SCHED_ALT
- /* domain-specific stats */
- rcu_read_lock();
- for_each_domain(cpu, sd) {
-@@ -171,6 +174,7 @@ static int show_schedstat(struct seq_file *seq, void *v)
- sd->ttwu_move_balance);
- }
- rcu_read_unlock();
-+#endif
- #endif
- }
- return 0;
-diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
-index 38f3698f5e5b..b9d597394316 100644
---- a/kernel/sched/stats.h
-+++ b/kernel/sched/stats.h
-@@ -89,6 +89,7 @@ static inline void rq_sched_info_depart (struct rq *rq, unsigned long long delt
-
- #endif /* CONFIG_SCHEDSTATS */
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_FAIR_GROUP_SCHED
- struct sched_entity_stats {
- struct sched_entity se;
-@@ -105,6 +106,7 @@ __schedstats_from_se(struct sched_entity *se)
- #endif
- return &task_of(se)->stats;
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- #ifdef CONFIG_PSI
- void psi_task_change(struct task_struct *task, int clear, int set);
-diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
-index 6682535e37c8..144875e2728d 100644
---- a/kernel/sched/topology.c
-+++ b/kernel/sched/topology.c
-@@ -3,6 +3,7 @@
- * Scheduler topology setup/handling methods
- */
-
-+#ifndef CONFIG_SCHED_ALT
- #include <linux/bsearch.h>
-
- DEFINE_MUTEX(sched_domains_mutex);
-@@ -1415,8 +1416,10 @@ static void asym_cpu_capacity_scan(void)
- */
-
- static int default_relax_domain_level = -1;
-+#endif /* CONFIG_SCHED_ALT */
- int sched_domain_level_max;
-
-+#ifndef CONFIG_SCHED_ALT
- static int __init setup_relax_domain_level(char *str)
- {
- if (kstrtoint(str, 0, &default_relax_domain_level))
-@@ -1649,6 +1652,7 @@ sd_init(struct sched_domain_topology_level *tl,
-
- return sd;
- }
-+#endif /* CONFIG_SCHED_ALT */
-
- /*
- * Topology list, bottom-up.
-@@ -1685,6 +1689,7 @@ void set_sched_topology(struct sched_domain_topology_level *tl)
- sched_domain_topology_saved = NULL;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- #ifdef CONFIG_NUMA
-
- static const struct cpumask *sd_numa_mask(int cpu)
-@@ -2740,3 +2745,20 @@ void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
- partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
- mutex_unlock(&sched_domains_mutex);
- }
-+#else /* CONFIG_SCHED_ALT */
-+void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
-+ struct sched_domain_attr *dattr_new)
-+{}
-+
-+#ifdef CONFIG_NUMA
-+int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
-+{
-+ return best_mask_cpu(cpu, cpus);
-+}
-+
-+int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
-+{
-+ return cpumask_nth(cpu, cpus);
-+}
-+#endif /* CONFIG_NUMA */
-+#endif
-diff --git a/kernel/sysctl.c b/kernel/sysctl.c
-index bfe53e835524..943fa125064b 100644
---- a/kernel/sysctl.c
-+++ b/kernel/sysctl.c
-@@ -92,6 +92,10 @@ EXPORT_SYMBOL_GPL(sysctl_long_vals);
-
- /* Constants used for minimum and maximum */
-
-+#ifdef CONFIG_SCHED_ALT
-+extern int sched_yield_type;
-+#endif
-+
- #ifdef CONFIG_PERF_EVENTS
- static const int six_hundred_forty_kb = 640 * 1024;
- #endif
-@@ -1917,6 +1921,17 @@ static struct ctl_table kern_table[] = {
- .proc_handler = proc_dointvec,
- },
- #endif
-+#ifdef CONFIG_SCHED_ALT
-+ {
-+ .procname = "yield_type",
-+ .data = &sched_yield_type,
-+ .maxlen = sizeof (int),
-+ .mode = 0644,
-+ .proc_handler = &proc_dointvec_minmax,
-+ .extra1 = SYSCTL_ZERO,
-+ .extra2 = SYSCTL_TWO,
-+ },
-+#endif
- #if defined(CONFIG_S390) && defined(CONFIG_SMP)
- {
- .procname = "spin_retry",
-diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
-index e8c08292defc..3823ff0ddc0f 100644
---- a/kernel/time/hrtimer.c
-+++ b/kernel/time/hrtimer.c
-@@ -2088,8 +2088,10 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum hrtimer_mode mode,
- int ret = 0;
- u64 slack;
-
-+#ifndef CONFIG_SCHED_ALT
- slack = current->timer_slack_ns;
-- if (rt_task(current))
-+ if (dl_task(current) || rt_task(current))
-+#endif
- slack = 0;
-
- hrtimer_init_sleeper_on_stack(&t, clockid, mode);
-diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
-index e9c6f9d0e42c..43ee0a94abdd 100644
---- a/kernel/time/posix-cpu-timers.c
-+++ b/kernel/time/posix-cpu-timers.c
-@@ -223,7 +223,7 @@ static void task_sample_cputime(struct task_struct *p, u64 *samples)
- u64 stime, utime;
-
- task_cputime(p, &utime, &stime);
-- store_samples(samples, stime, utime, p->se.sum_exec_runtime);
-+ store_samples(samples, stime, utime, tsk_seruntime(p));
- }
-
- static void proc_sample_cputime_atomic(struct task_cputime_atomic *at,
-@@ -867,6 +867,7 @@ static void collect_posix_cputimers(struct posix_cputimers *pct, u64 *samples,
- }
- }
-
-+#ifndef CONFIG_SCHED_ALT
- static inline void check_dl_overrun(struct task_struct *tsk)
- {
- if (tsk->dl.dl_overrun) {
-@@ -874,6 +875,7 @@ static inline void check_dl_overrun(struct task_struct *tsk)
- send_signal_locked(SIGXCPU, SEND_SIG_PRIV, tsk, PIDTYPE_TGID);
- }
- }
-+#endif
-
- static bool check_rlimit(u64 time, u64 limit, int signo, bool rt, bool hard)
- {
-@@ -901,8 +903,10 @@ static void check_thread_timers(struct task_struct *tsk,
- u64 samples[CPUCLOCK_MAX];
- unsigned long soft;
-
-+#ifndef CONFIG_SCHED_ALT
- if (dl_task(tsk))
- check_dl_overrun(tsk);
-+#endif
-
- if (expiry_cache_is_inactive(pct))
- return;
-@@ -916,7 +920,7 @@ static void check_thread_timers(struct task_struct *tsk,
- soft = task_rlimit(tsk, RLIMIT_RTTIME);
- if (soft != RLIM_INFINITY) {
- /* Task RT timeout is accounted in jiffies. RTTIME is usec */
-- unsigned long rttime = tsk->rt.timeout * (USEC_PER_SEC / HZ);
-+ unsigned long rttime = tsk_rttimeout(tsk) * (USEC_PER_SEC / HZ);
- unsigned long hard = task_rlimit_max(tsk, RLIMIT_RTTIME);
-
- /* At the hard limit, send SIGKILL. No further action. */
-@@ -1152,8 +1156,10 @@ static inline bool fastpath_timer_check(struct task_struct *tsk)
- return true;
- }
-
-+#ifndef CONFIG_SCHED_ALT
- if (dl_task(tsk) && tsk->dl.dl_overrun)
- return true;
-+#endif
-
- return false;
- }
-diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
-index 529590499b1f..d04bb99b4f0e 100644
---- a/kernel/trace/trace_selftest.c
-+++ b/kernel/trace/trace_selftest.c
-@@ -1155,10 +1155,15 @@ static int trace_wakeup_test_thread(void *data)
- {
- /* Make this a -deadline thread */
- static const struct sched_attr attr = {
-+#ifdef CONFIG_SCHED_ALT
-+ /* No deadline on BMQ/PDS, use RR */
-+ .sched_policy = SCHED_RR,
-+#else
- .sched_policy = SCHED_DEADLINE,
- .sched_runtime = 100000ULL,
- .sched_deadline = 10000000ULL,
- .sched_period = 10000000ULL
-+#endif
- };
- struct wakeup_test_data *x = data;
-
diff --git a/5021_BMQ-and-PDS-gentoo-defaults.patch b/5021_BMQ-and-PDS-gentoo-defaults.patch
deleted file mode 100644
index 6dc48eec..00000000
--- a/5021_BMQ-and-PDS-gentoo-defaults.patch
+++ /dev/null
@@ -1,13 +0,0 @@
---- a/init/Kconfig 2023-02-13 08:16:09.534315265 -0500
-+++ b/init/Kconfig 2023-02-13 08:17:24.130237204 -0500
-@@ -867,8 +867,9 @@ config UCLAMP_BUCKETS_COUNT
- If in doubt, use the default value.
-
- menuconfig SCHED_ALT
-+ depends on X86_64
- bool "Alternative CPU Schedulers"
-- default y
-+ default n
- help
- This feature enable alternative CPU scheduler"
-
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-08-30 13:48 Mike Pagano
From: Mike Pagano @ 2023-08-30 13:48 UTC (permalink / raw
To: gentoo-commits
commit: 71da69a90eddf1faf621cb296f0ca23862107be1
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Aug 30 13:48:03 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Aug 30 13:48:03 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=71da69a9
Linux patch 6.4.13
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1012_linux-6.4.13.patch | 6523 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 6527 insertions(+)
diff --git a/0000_README b/0000_README
index 1c391fe4..38b60f4b 100644
--- a/0000_README
+++ b/0000_README
@@ -91,6 +91,10 @@ Patch: 1011_linux-6.4.12.patch
From: https://www.kernel.org
Desc: Linux 6.4.12
+Patch: 1012_linux-6.4.13.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.13
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1012_linux-6.4.13.patch b/1012_linux-6.4.13.patch
new file mode 100644
index 00000000..57b07c23
--- /dev/null
+++ b/1012_linux-6.4.13.patch
@@ -0,0 +1,6523 @@
+diff --git a/Makefile b/Makefile
+index 0ff13b943f994..900e515b87cf8 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 12
++SUBLEVEL = 13
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/loongarch/kernel/hw_breakpoint.c b/arch/loongarch/kernel/hw_breakpoint.c
+index 021b59c248fac..fc55c4de2a11f 100644
+--- a/arch/loongarch/kernel/hw_breakpoint.c
++++ b/arch/loongarch/kernel/hw_breakpoint.c
+@@ -207,8 +207,7 @@ static int hw_breakpoint_control(struct perf_event *bp,
+ write_wb_reg(CSR_CFG_CTRL, i, 0, CTRL_PLV_ENABLE);
+ } else {
+ ctrl = encode_ctrl_reg(info->ctrl);
+- write_wb_reg(CSR_CFG_CTRL, i, 1, ctrl | CTRL_PLV_ENABLE |
+- 1 << MWPnCFG3_LoadEn | 1 << MWPnCFG3_StoreEn);
++ write_wb_reg(CSR_CFG_CTRL, i, 1, ctrl | CTRL_PLV_ENABLE);
+ }
+ enable = csr_read64(LOONGARCH_CSR_CRMD);
+ csr_write64(CSR_CRMD_WE | enable, LOONGARCH_CSR_CRMD);
+diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
+index b75a9fb99599a..b0eea434ef08e 100644
+--- a/arch/powerpc/mm/book3s64/subpage_prot.c
++++ b/arch/powerpc/mm/book3s64/subpage_prot.c
+@@ -143,6 +143,7 @@ static int subpage_walk_pmd_entry(pmd_t *pmd, unsigned long addr,
+
+ static const struct mm_walk_ops subpage_walk_ops = {
+ .pmd_entry = subpage_walk_pmd_entry,
++ .walk_lock = PGWALK_WRLOCK_VERIFY,
+ };
+
+ static void subpage_mark_vma_nohuge(struct mm_struct *mm, unsigned long addr,
+diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
+index a11b1c038c6d1..052845384ed38 100644
+--- a/arch/riscv/Kconfig
++++ b/arch/riscv/Kconfig
+@@ -525,24 +525,30 @@ config TOOLCHAIN_HAS_ZIHINTPAUSE
+ config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI
+ def_bool y
+ # https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717a77d94b51ac3614e2ca6dc
+- depends on AS_IS_GNU && AS_VERSION >= 23800
+- help
+- Newer binutils versions default to ISA spec version 20191213 which
+- moves some instructions from the I extension to the Zicsr and Zifencei
+- extensions.
++ # https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=98416dbb0a62579d4a7a4a76bab51b5b52fec2cd
++ depends on AS_IS_GNU && AS_VERSION >= 23600
++ help
++ Binutils-2.38 and GCC-12.1.0 bumped the default ISA spec to the newer
++ 20191213 version, which moves some instructions from the I extension to
++ the Zicsr and Zifencei extensions. This requires explicitly specifying
++ Zicsr and Zifencei when binutils >= 2.38 or GCC >= 12.1.0. Zicsr
++ and Zifencei are supported in binutils from version 2.36 onwards.
++ To make life easier, and avoid forcing toolchains that default to a
++ newer ISA spec to version 2.2, relax the check to binutils >= 2.36.
++ For clang < 17 or GCC < 11.3.0, for which this is not possible or need
++ special treatment, this is dealt with in TOOLCHAIN_NEEDS_OLD_ISA_SPEC.
+
+ config TOOLCHAIN_NEEDS_OLD_ISA_SPEC
+ def_bool y
+ depends on TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI
+ # https://github.com/llvm/llvm-project/commit/22e199e6afb1263c943c0c0d4498694e15bf8a16
+- depends on CC_IS_CLANG && CLANG_VERSION < 170000
+- help
+- Certain versions of clang do not support zicsr and zifencei via -march
+- but newer versions of binutils require it for the reasons noted in the
+- help text of CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI. This
+- option causes an older ISA spec compatible with these older versions
+- of clang to be passed to GAS, which has the same result as passing zicsr
+- and zifencei to -march.
++ # https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d29f5d6ab513c52fd872f532c492e35ae9fd6671
++ depends on (CC_IS_CLANG && CLANG_VERSION < 170000) || (CC_IS_GCC && GCC_VERSION < 110300)
++ help
++ Certain versions of clang and GCC do not support zicsr and zifencei via
++ -march. This option causes an older ISA spec compatible with these older
++ versions of clang and GCC to be passed to GAS, which has the same result
++ as passing zicsr and zifencei to -march.
+
+ config FPU
+ bool "FPU support"
+diff --git a/arch/riscv/kernel/compat_vdso/Makefile b/arch/riscv/kernel/compat_vdso/Makefile
+index 189345773e7e1..b86e5e2c3aea9 100644
+--- a/arch/riscv/kernel/compat_vdso/Makefile
++++ b/arch/riscv/kernel/compat_vdso/Makefile
+@@ -11,7 +11,13 @@ compat_vdso-syms += flush_icache
+ COMPAT_CC := $(CC)
+ COMPAT_LD := $(LD)
+
+-COMPAT_CC_FLAGS := -march=rv32g -mabi=ilp32
++# binutils 2.35 does not support the zifencei extension, but in the ISA
++# spec 20191213, G stands for IMAFD_ZICSR_ZIFENCEI.
++ifdef CONFIG_TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI
++ COMPAT_CC_FLAGS := -march=rv32g -mabi=ilp32
++else
++ COMPAT_CC_FLAGS := -march=rv32imafd -mabi=ilp32
++endif
+ COMPAT_LD_FLAGS := -melf32lriscv
+
+ # Disable attributes, as they're useless and break the build.
+diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
+index ea3d61de065b3..161d0b34c2cb2 100644
+--- a/arch/riscv/mm/pageattr.c
++++ b/arch/riscv/mm/pageattr.c
+@@ -102,6 +102,7 @@ static const struct mm_walk_ops pageattr_ops = {
+ .pmd_entry = pageattr_pmd_entry,
+ .pte_entry = pageattr_pte_entry,
+ .pte_hole = pageattr_pte_hole,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask,
+diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
+index d7e8297d5642b..aeb06a811d8aa 100644
+--- a/arch/s390/mm/gmap.c
++++ b/arch/s390/mm/gmap.c
+@@ -2514,6 +2514,7 @@ static int thp_split_walk_pmd_entry(pmd_t *pmd, unsigned long addr,
+
+ static const struct mm_walk_ops thp_split_walk_ops = {
+ .pmd_entry = thp_split_walk_pmd_entry,
++ .walk_lock = PGWALK_WRLOCK_VERIFY,
+ };
+
+ static inline void thp_split_mm(struct mm_struct *mm)
+@@ -2558,6 +2559,7 @@ static int __zap_zero_pages(pmd_t *pmd, unsigned long start,
+
+ static const struct mm_walk_ops zap_zero_walk_ops = {
+ .pmd_entry = __zap_zero_pages,
++ .walk_lock = PGWALK_WRLOCK,
+ };
+
+ /*
+@@ -2648,6 +2650,7 @@ static const struct mm_walk_ops enable_skey_walk_ops = {
+ .hugetlb_entry = __s390_enable_skey_hugetlb,
+ .pte_entry = __s390_enable_skey_pte,
+ .pmd_entry = __s390_enable_skey_pmd,
++ .walk_lock = PGWALK_WRLOCK,
+ };
+
+ int s390_enable_skey(void)
+@@ -2685,6 +2688,7 @@ static int __s390_reset_cmma(pte_t *pte, unsigned long addr,
+
+ static const struct mm_walk_ops reset_cmma_walk_ops = {
+ .pte_entry = __s390_reset_cmma,
++ .walk_lock = PGWALK_WRLOCK,
+ };
+
+ void s390_reset_cmma(struct mm_struct *mm)
+@@ -2721,6 +2725,7 @@ static int s390_gather_pages(pte_t *ptep, unsigned long addr,
+
+ static const struct mm_walk_ops gather_pages_ops = {
+ .pte_entry = s390_gather_pages,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ /*
+diff --git a/arch/x86/kernel/fpu/context.h b/arch/x86/kernel/fpu/context.h
+index af5cbdd9bd29a..f6d856bd50bc5 100644
+--- a/arch/x86/kernel/fpu/context.h
++++ b/arch/x86/kernel/fpu/context.h
+@@ -19,8 +19,7 @@
+ * FPU state for a task MUST let the rest of the kernel know that the
+ * FPU registers are no longer valid for this task.
+ *
+- * Either one of these invalidation functions is enough. Invalidate
+- * a resource you control: CPU if using the CPU for something else
++ * Invalidate a resource you control: CPU if using the CPU for something else
+ * (with preemption disabled), FPU for the current task, or a task that
+ * is prevented from running by the current task.
+ */
+diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
+index 1015af1ae562b..98e507cc7d34c 100644
+--- a/arch/x86/kernel/fpu/core.c
++++ b/arch/x86/kernel/fpu/core.c
+@@ -679,7 +679,7 @@ static void fpu_reset_fpregs(void)
+ struct fpu *fpu = ¤t->thread.fpu;
+
+ fpregs_lock();
+- fpu__drop(fpu);
++ __fpu_invalidate_fpregs_state(fpu);
+ /*
+ * This does not change the actual hardware registers. It just
+ * resets the memory image and sets TIF_NEED_FPU_LOAD so a
+diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
+index 0bab497c94369..1afbc4866b100 100644
+--- a/arch/x86/kernel/fpu/xstate.c
++++ b/arch/x86/kernel/fpu/xstate.c
+@@ -882,6 +882,13 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
+ goto out_disable;
+ }
+
++ /*
++ * CPU capabilities initialization runs before FPU init. So
++ * X86_FEATURE_OSXSAVE is not set. Now that XSAVE is completely
++ * functional, set the feature bit so depending code works.
++ */
++ setup_force_cpu_cap(X86_FEATURE_OSXSAVE);
++
+ print_xstate_offset_size();
+ pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n",
+ fpu_kernel_cfg.max_features,
+diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
+index a4d9f149b48d7..32cfa3f4efd3d 100644
+--- a/drivers/acpi/resource.c
++++ b/drivers/acpi/resource.c
+@@ -501,9 +501,13 @@ static const struct dmi_system_id maingear_laptop[] = {
+ static const struct dmi_system_id pcspecialist_laptop[] = {
+ {
+ .ident = "PCSpecialist Elimina Pro 16 M",
++ /*
++ * Some models have product-name "Elimina Pro 16 M",
++ * others "GM6BGEQ". Match on board-name to match both.
++ */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "PCSpecialist"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "Elimina Pro 16 M"),
++ DMI_MATCH(DMI_BOARD_NAME, "GM6BGEQ"),
+ },
+ },
+ { }
+diff --git a/drivers/clk/clk-devres.c b/drivers/clk/clk-devres.c
+index 4fb4fd4b06bda..737aa70e2cb3d 100644
+--- a/drivers/clk/clk-devres.c
++++ b/drivers/clk/clk-devres.c
+@@ -205,18 +205,19 @@ EXPORT_SYMBOL(devm_clk_put);
+ struct clk *devm_get_clk_from_child(struct device *dev,
+ struct device_node *np, const char *con_id)
+ {
+- struct clk **ptr, *clk;
++ struct devm_clk_state *state;
++ struct clk *clk;
+
+- ptr = devres_alloc(devm_clk_release, sizeof(*ptr), GFP_KERNEL);
+- if (!ptr)
++ state = devres_alloc(devm_clk_release, sizeof(*state), GFP_KERNEL);
++ if (!state)
+ return ERR_PTR(-ENOMEM);
+
+ clk = of_clk_get_by_name(np, con_id);
+ if (!IS_ERR(clk)) {
+- *ptr = clk;
+- devres_add(dev, ptr);
++ state->clk = clk;
++ devres_add(dev, state);
+ } else {
+- devres_free(ptr);
++ devres_free(state);
+ }
+
+ return clk;
+diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
+index 348b3a9170fa4..7f5ed1aa7a9f8 100644
+--- a/drivers/dma-buf/sw_sync.c
++++ b/drivers/dma-buf/sw_sync.c
+@@ -191,6 +191,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
+ */
+ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
+ {
++ LIST_HEAD(signalled);
+ struct sync_pt *pt, *next;
+
+ trace_sync_timeline(obj);
+@@ -203,21 +204,20 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
+ if (!timeline_fence_signaled(&pt->base))
+ break;
+
+- list_del_init(&pt->link);
++ dma_fence_get(&pt->base);
++
++ list_move_tail(&pt->link, &signalled);
+ rb_erase(&pt->node, &obj->pt_tree);
+
+- /*
+- * A signal callback may release the last reference to this
+- * fence, causing it to be freed. That operation has to be
+- * last to avoid a use after free inside this loop, and must
+- * be after we remove the fence from the timeline in order to
+- * prevent deadlocking on timeline->lock inside
+- * timeline_fence_release().
+- */
+ dma_fence_signal_locked(&pt->base);
+ }
+
+ spin_unlock_irq(&obj->lock);
++
++ list_for_each_entry_safe(pt, next, &signalled, link) {
++ list_del_init(&pt->link);
++ dma_fence_put(&pt->base);
++ }
+ }
+
+ /**
+diff --git a/drivers/gpio/gpio-sim.c b/drivers/gpio/gpio-sim.c
+index f1f6f1c329877..533d815725794 100644
+--- a/drivers/gpio/gpio-sim.c
++++ b/drivers/gpio/gpio-sim.c
+@@ -291,6 +291,15 @@ static void gpio_sim_mutex_destroy(void *data)
+ mutex_destroy(lock);
+ }
+
++static void gpio_sim_dispose_mappings(void *data)
++{
++ struct gpio_sim_chip *chip = data;
++ unsigned int i;
++
++ for (i = 0; i < chip->gc.ngpio; i++)
++ irq_dispose_mapping(irq_find_mapping(chip->irq_sim, i));
++}
++
+ static void gpio_sim_sysfs_remove(void *data)
+ {
+ struct gpio_sim_chip *chip = data;
+@@ -402,10 +411,14 @@ static int gpio_sim_add_bank(struct fwnode_handle *swnode, struct device *dev)
+ if (!chip->pull_map)
+ return -ENOMEM;
+
+- chip->irq_sim = devm_irq_domain_create_sim(dev, NULL, num_lines);
++ chip->irq_sim = devm_irq_domain_create_sim(dev, swnode, num_lines);
+ if (IS_ERR(chip->irq_sim))
+ return PTR_ERR(chip->irq_sim);
+
++ ret = devm_add_action_or_reset(dev, gpio_sim_dispose_mappings, chip);
++ if (ret)
++ return ret;
++
+ mutex_init(&chip->lock);
+ ret = devm_add_action_or_reset(dev, gpio_sim_mutex_destroy,
+ &chip->lock);
+diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
+index 2fb9bf901a2cc..3f479483d7d80 100644
+--- a/drivers/gpu/drm/drm_probe_helper.c
++++ b/drivers/gpu/drm/drm_probe_helper.c
+@@ -262,6 +262,26 @@ static bool drm_kms_helper_enable_hpd(struct drm_device *dev)
+ }
+
+ #define DRM_OUTPUT_POLL_PERIOD (10*HZ)
++static void reschedule_output_poll_work(struct drm_device *dev)
++{
++ unsigned long delay = DRM_OUTPUT_POLL_PERIOD;
++
++ if (dev->mode_config.delayed_event)
++ /*
++ * FIXME:
++ *
++ * Use short (1s) delay to handle the initial delayed event.
++ * This delay should not be needed, but Optimus/nouveau will
++ * fail in a mysterious way if the delayed event is handled as
++ * soon as possible like it is done in
++ * drm_helper_probe_single_connector_modes() in case the poll
++ * was enabled before.
++ */
++ delay = HZ;
++
++ schedule_delayed_work(&dev->mode_config.output_poll_work, delay);
++}
++
+ /**
+ * drm_kms_helper_poll_enable - re-enable output polling.
+ * @dev: drm_device
+@@ -279,37 +299,41 @@ static bool drm_kms_helper_enable_hpd(struct drm_device *dev)
+ */
+ void drm_kms_helper_poll_enable(struct drm_device *dev)
+ {
+- bool poll = false;
+- unsigned long delay = DRM_OUTPUT_POLL_PERIOD;
+-
+ if (!dev->mode_config.poll_enabled || !drm_kms_helper_poll ||
+ dev->mode_config.poll_running)
+ return;
+
+- poll = drm_kms_helper_enable_hpd(dev);
+-
+- if (dev->mode_config.delayed_event) {
+- /*
+- * FIXME:
+- *
+- * Use short (1s) delay to handle the initial delayed event.
+- * This delay should not be needed, but Optimus/nouveau will
+- * fail in a mysterious way if the delayed event is handled as
+- * soon as possible like it is done in
+- * drm_helper_probe_single_connector_modes() in case the poll
+- * was enabled before.
+- */
+- poll = true;
+- delay = HZ;
+- }
+-
+- if (poll)
+- schedule_delayed_work(&dev->mode_config.output_poll_work, delay);
++ if (drm_kms_helper_enable_hpd(dev) ||
++ dev->mode_config.delayed_event)
++ reschedule_output_poll_work(dev);
+
+ dev->mode_config.poll_running = true;
+ }
+ EXPORT_SYMBOL(drm_kms_helper_poll_enable);
+
++/**
++ * drm_kms_helper_poll_reschedule - reschedule the output polling work
++ * @dev: drm_device
++ *
++ * This function reschedules the output polling work, after polling for a
++ * connector has been enabled.
++ *
++ * Drivers must call this helper after enabling polling for a connector by
++ * setting %DRM_CONNECTOR_POLL_CONNECT / %DRM_CONNECTOR_POLL_DISCONNECT flags
++ * in drm_connector::polled. Note that after disabling polling by clearing these
++ * flags for a connector will stop the output polling work automatically if
++ * the polling is disabled for all other connectors as well.
++ *
++ * The function can be called only after polling has been enabled by calling
++ * drm_kms_helper_poll_init() / drm_kms_helper_poll_enable().
++ */
++void drm_kms_helper_poll_reschedule(struct drm_device *dev)
++{
++ if (dev->mode_config.poll_running)
++ reschedule_output_poll_work(dev);
++}
++EXPORT_SYMBOL(drm_kms_helper_poll_reschedule);
++
+ static enum drm_connector_status
+ drm_helper_probe_detect_ctx(struct drm_connector *connector, bool force)
+ {
+diff --git a/drivers/gpu/drm/i915/display/intel_display_device.c b/drivers/gpu/drm/i915/display/intel_display_device.c
+index 8c57d48e8270f..95c413f713fbd 100644
+--- a/drivers/gpu/drm/i915/display/intel_display_device.c
++++ b/drivers/gpu/drm/i915/display/intel_display_device.c
+@@ -5,7 +5,10 @@
+
+ #include <drm/i915_pciids.h>
+ #include <drm/drm_color_mgmt.h>
++#include <linux/pci.h>
+
++#include "i915_drv.h"
++#include "i915_reg.h"
+ #include "intel_display_device.h"
+ #include "intel_display_power.h"
+ #include "intel_display_reg_defs.h"
+@@ -657,10 +660,24 @@ static const struct intel_display_device_info xe_lpdp_display = {
+ BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
+ };
+
++/*
++ * Separate detection for no display cases to keep the display id array simple.
++ *
++ * IVB Q requires subvendor and subdevice matching to differentiate from IVB D
++ * GT2 server.
++ */
++static bool has_no_display(struct pci_dev *pdev)
++{
++ static const struct pci_device_id ids[] = {
++ INTEL_IVB_Q_IDS(0),
++ {}
++ };
++
++ return pci_match_id(ids, pdev);
++}
++
+ #undef INTEL_VGA_DEVICE
+-#undef INTEL_QUANTA_VGA_DEVICE
+ #define INTEL_VGA_DEVICE(id, info) { id, info }
+-#define INTEL_QUANTA_VGA_DEVICE(info) { 0x16a, info }
+
+ static const struct {
+ u32 devid;
+@@ -685,7 +702,6 @@ static const struct {
+ INTEL_IRONLAKE_M_IDS(&ilk_m_display),
+ INTEL_SNB_D_IDS(&snb_display),
+ INTEL_SNB_M_IDS(&snb_display),
+- INTEL_IVB_Q_IDS(NULL), /* must be first IVB in list */
+ INTEL_IVB_M_IDS(&ivb_display),
+ INTEL_IVB_D_IDS(&ivb_display),
+ INTEL_HSW_IDS(&hsw_display),
+@@ -710,19 +726,78 @@ static const struct {
+ INTEL_RPLP_IDS(&xe_lpd_display),
+ INTEL_DG2_IDS(&xe_hpd_display),
+
+- /* FIXME: Replace this with a GMD_ID lookup */
+- INTEL_MTL_IDS(&xe_lpdp_display),
++ /*
++ * Do not add any GMD_ID-based platforms to this list. They will
++ * be probed automatically based on the IP version reported by
++ * the hardware.
++ */
+ };
+
++static const struct {
++ u16 ver;
++ u16 rel;
++ const struct intel_display_device_info *display;
++} gmdid_display_map[] = {
++ { 14, 0, &xe_lpdp_display },
++};
++
++static const struct intel_display_device_info *
++probe_gmdid_display(struct drm_i915_private *i915, u16 *ver, u16 *rel, u16 *step)
++{
++ struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
++ void __iomem *addr;
++ u32 val;
++ int i;
++
++ addr = pci_iomap_range(pdev, 0, i915_mmio_reg_offset(GMD_ID_DISPLAY), sizeof(u32));
++ if (!addr) {
++ drm_err(&i915->drm, "Cannot map MMIO BAR to read display GMD_ID\n");
++ return &no_display;
++ }
++
++ val = ioread32(addr);
++ pci_iounmap(pdev, addr);
++
++ if (val == 0)
++ /* Platform doesn't have display */
++ return &no_display;
++
++ *ver = REG_FIELD_GET(GMD_ID_ARCH_MASK, val);
++ *rel = REG_FIELD_GET(GMD_ID_RELEASE_MASK, val);
++ *step = REG_FIELD_GET(GMD_ID_STEP, val);
++
++ for (i = 0; i < ARRAY_SIZE(gmdid_display_map); i++)
++ if (*ver == gmdid_display_map[i].ver &&
++ *rel == gmdid_display_map[i].rel)
++ return gmdid_display_map[i].display;
++
++ drm_err(&i915->drm, "Unrecognized display IP version %d.%02d; disabling display.\n",
++ *ver, *rel);
++ return &no_display;
++}
++
+ const struct intel_display_device_info *
+-intel_display_device_probe(u16 pci_devid)
++intel_display_device_probe(struct drm_i915_private *i915, bool has_gmdid,
++ u16 *gmdid_ver, u16 *gmdid_rel, u16 *gmdid_step)
+ {
++ struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
+ int i;
+
++ if (has_gmdid)
++ return probe_gmdid_display(i915, gmdid_ver, gmdid_rel, gmdid_step);
++
++ if (has_no_display(pdev)) {
++ drm_dbg_kms(&i915->drm, "Device doesn't have display\n");
++ return &no_display;
++ }
++
+ for (i = 0; i < ARRAY_SIZE(intel_display_ids); i++) {
+- if (intel_display_ids[i].devid == pci_devid)
++ if (intel_display_ids[i].devid == pdev->device)
+ return intel_display_ids[i].info;
+ }
+
++ drm_dbg(&i915->drm, "No display ID found for device ID %04x; disabling display.\n",
++ pdev->device);
++
+ return &no_display;
+ }
+diff --git a/drivers/gpu/drm/i915/display/intel_display_device.h b/drivers/gpu/drm/i915/display/intel_display_device.h
+index 1f7d08b3ad6b1..d1d11581d85dc 100644
+--- a/drivers/gpu/drm/i915/display/intel_display_device.h
++++ b/drivers/gpu/drm/i915/display/intel_display_device.h
+@@ -10,6 +10,8 @@
+
+ #include "display/intel_display_limits.h"
+
++struct drm_i915_private;
++
+ #define DEV_INFO_DISPLAY_FOR_EACH_FLAG(func) \
+ /* Keep in alphabetical order */ \
+ func(cursor_needs_physical); \
+@@ -81,6 +83,7 @@ struct intel_display_device_info {
+ };
+
+ const struct intel_display_device_info *
+-intel_display_device_probe(u16 pci_devid);
++intel_display_device_probe(struct drm_i915_private *i915, bool has_gmdid,
++ u16 *ver, u16 *rel, u16 *step);
+
+ #endif
+diff --git a/drivers/gpu/drm/i915/display/intel_hotplug.c b/drivers/gpu/drm/i915/display/intel_hotplug.c
+index b12900446828a..6dd0d66e54f49 100644
+--- a/drivers/gpu/drm/i915/display/intel_hotplug.c
++++ b/drivers/gpu/drm/i915/display/intel_hotplug.c
+@@ -210,7 +210,7 @@ intel_hpd_irq_storm_switch_to_polling(struct drm_i915_private *dev_priv)
+
+ /* Enable polling and queue hotplug re-enabling. */
+ if (hpd_disabled) {
+- drm_kms_helper_poll_enable(&dev_priv->drm);
++ drm_kms_helper_poll_reschedule(&dev_priv->drm);
+ mod_delayed_work(system_wq, &dev_priv->display.hotplug.reenable_work,
+ msecs_to_jiffies(HPD_STORM_REENABLE_DELAY));
+ }
+@@ -644,7 +644,7 @@ static void i915_hpd_poll_init_work(struct work_struct *work)
+ drm_connector_list_iter_end(&conn_iter);
+
+ if (enabled)
+- drm_kms_helper_poll_enable(&dev_priv->drm);
++ drm_kms_helper_poll_reschedule(&dev_priv->drm);
+
+ mutex_unlock(&dev_priv->drm.mode_config.mutex);
+
+diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
+index 2980ccdef6cd6..46834299a7d57 100644
+--- a/drivers/gpu/drm/i915/i915_driver.c
++++ b/drivers/gpu/drm/i915/i915_driver.c
+@@ -433,7 +433,6 @@ static int i915_pcode_init(struct drm_i915_private *i915)
+ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
+ {
+ struct pci_dev *pdev = to_pci_dev(dev_priv->drm.dev);
+- struct pci_dev *root_pdev;
+ int ret;
+
+ if (i915_inject_probe_failure(dev_priv))
+@@ -547,15 +546,6 @@ static int i915_driver_hw_probe(struct drm_i915_private *dev_priv)
+
+ intel_bw_init_hw(dev_priv);
+
+- /*
+- * FIXME: Temporary hammer to avoid freezing the machine on our DGFX
+- * This should be totally removed when we handle the pci states properly
+- * on runtime PM and on s2idle cases.
+- */
+- root_pdev = pcie_find_root_port(pdev);
+- if (root_pdev)
+- pci_d3cold_disable(root_pdev);
+-
+ return 0;
+
+ err_opregion:
+@@ -581,7 +571,6 @@ err_perf:
+ static void i915_driver_hw_remove(struct drm_i915_private *dev_priv)
+ {
+ struct pci_dev *pdev = to_pci_dev(dev_priv->drm.dev);
+- struct pci_dev *root_pdev;
+
+ i915_perf_fini(dev_priv);
+
+@@ -589,10 +578,6 @@ static void i915_driver_hw_remove(struct drm_i915_private *dev_priv)
+
+ if (pdev->msi_enabled)
+ pci_disable_msi(pdev);
+-
+- root_pdev = pcie_find_root_port(pdev);
+- if (root_pdev)
+- pci_d3cold_enable(root_pdev);
+ }
+
+ /**
+@@ -754,13 +739,17 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+ struct drm_i915_private *i915;
+ int ret;
+
++ ret = pci_enable_device(pdev);
++ if (ret) {
++ pr_err("Failed to enable graphics device: %pe\n", ERR_PTR(ret));
++ return ret;
++ }
++
+ i915 = i915_driver_create(pdev, ent);
+- if (IS_ERR(i915))
++ if (IS_ERR(i915)) {
++ pci_disable_device(pdev);
+ return PTR_ERR(i915);
+-
+- ret = pci_enable_device(pdev);
+- if (ret)
+- goto out_fini;
++ }
+
+ ret = i915_driver_early_probe(i915);
+ if (ret < 0)
+@@ -843,7 +832,6 @@ out_runtime_pm_put:
+ i915_driver_late_release(i915);
+ out_pci_disable:
+ pci_disable_device(pdev);
+-out_fini:
+ i915_probe_error(i915, "Device initialization failed (%d)\n", ret);
+ return ret;
+ }
+@@ -1499,6 +1487,8 @@ static int intel_runtime_suspend(struct device *kdev)
+ {
+ struct drm_i915_private *dev_priv = kdev_to_i915(kdev);
+ struct intel_runtime_pm *rpm = &dev_priv->runtime_pm;
++ struct pci_dev *pdev = to_pci_dev(dev_priv->drm.dev);
++ struct pci_dev *root_pdev;
+ struct intel_gt *gt;
+ int ret, i;
+
+@@ -1550,6 +1540,15 @@ static int intel_runtime_suspend(struct device *kdev)
+ drm_err(&dev_priv->drm,
+ "Unclaimed access detected prior to suspending\n");
+
++ /*
++ * FIXME: Temporary hammer to avoid freezing the machine on our DGFX
++ * This should be totally removed when we handle the pci states properly
++ * on runtime PM.
++ */
++ root_pdev = pcie_find_root_port(pdev);
++ if (root_pdev)
++ pci_d3cold_disable(root_pdev);
++
+ rpm->suspended = true;
+
+ /*
+@@ -1588,6 +1587,8 @@ static int intel_runtime_resume(struct device *kdev)
+ {
+ struct drm_i915_private *dev_priv = kdev_to_i915(kdev);
+ struct intel_runtime_pm *rpm = &dev_priv->runtime_pm;
++ struct pci_dev *pdev = to_pci_dev(dev_priv->drm.dev);
++ struct pci_dev *root_pdev;
+ struct intel_gt *gt;
+ int ret, i;
+
+@@ -1601,6 +1602,11 @@ static int intel_runtime_resume(struct device *kdev)
+
+ intel_opregion_notify_adapter(dev_priv, PCI_D0);
+ rpm->suspended = false;
++
++ root_pdev = pcie_find_root_port(pdev);
++ if (root_pdev)
++ pci_d3cold_enable(root_pdev);
++
+ if (intel_uncore_unclaimed_mmio(&dev_priv->uncore))
+ drm_dbg(&dev_priv->drm,
+ "Unclaimed access during suspend, bios?\n");
+diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
+index 79523e55ca9c4..2f79d232b04a9 100644
+--- a/drivers/gpu/drm/i915/intel_device_info.c
++++ b/drivers/gpu/drm/i915/intel_device_info.c
+@@ -345,7 +345,6 @@ static void ip_ver_read(struct drm_i915_private *i915, u32 offset, struct intel_
+ static void intel_ipver_early_init(struct drm_i915_private *i915)
+ {
+ struct intel_runtime_info *runtime = RUNTIME_INFO(i915);
+- struct intel_display_runtime_info *display_runtime = DISPLAY_RUNTIME_INFO(i915);
+
+ if (!HAS_GMD_ID(i915)) {
+ drm_WARN_ON(&i915->drm, RUNTIME_INFO(i915)->graphics.ip.ver > 12);
+@@ -366,8 +365,6 @@ static void intel_ipver_early_init(struct drm_i915_private *i915)
+ RUNTIME_INFO(i915)->graphics.ip.ver = 12;
+ RUNTIME_INFO(i915)->graphics.ip.rel = 70;
+ }
+- ip_ver_read(i915, i915_mmio_reg_offset(GMD_ID_DISPLAY),
+- (struct intel_ip_version *)&display_runtime->ip);
+ ip_ver_read(i915, i915_mmio_reg_offset(GMD_ID_MEDIA),
+ &runtime->media.ip);
+ }
+@@ -574,6 +571,7 @@ void intel_device_info_driver_create(struct drm_i915_private *i915,
+ {
+ struct intel_device_info *info;
+ struct intel_runtime_info *runtime;
++ u16 ver, rel, step;
+
+ /* Setup the write-once "constant" device info */
+ info = mkwrite_device_info(i915);
+@@ -584,11 +582,18 @@ void intel_device_info_driver_create(struct drm_i915_private *i915,
+ memcpy(runtime, &INTEL_INFO(i915)->__runtime, sizeof(*runtime));
+
+ /* Probe display support */
+- info->display = intel_display_device_probe(device_id);
++ info->display = intel_display_device_probe(i915, info->has_gmd_id,
++ &ver, &rel, &step);
+ memcpy(DISPLAY_RUNTIME_INFO(i915),
+ &DISPLAY_INFO(i915)->__runtime_defaults,
+ sizeof(*DISPLAY_RUNTIME_INFO(i915)));
+
++ if (info->has_gmd_id) {
++ DISPLAY_RUNTIME_INFO(i915)->ip.ver = ver;
++ DISPLAY_RUNTIME_INFO(i915)->ip.rel = rel;
++ DISPLAY_RUNTIME_INFO(i915)->ip.step = step;
++ }
++
+ runtime->device_id = device_id;
+ }
+
+diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
+index 58dfb15a8757f..e78de99e99335 100644
+--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
++++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
+@@ -96,7 +96,7 @@ static int panfrost_read_speedbin(struct device *dev)
+ * keep going without it; any other error means that we are
+ * supposed to read the bin value, but we failed doing so.
+ */
+- if (ret != -ENOENT) {
++ if (ret != -ENOENT && ret != -EOPNOTSUPP) {
+ DRM_DEV_ERROR(dev, "Cannot read speed-bin (%d).", ret);
+ return ret;
+ }
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+index 82094c137855b..c43853597776f 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c
+@@ -497,10 +497,9 @@ static int vmw_user_bo_synccpu_release(struct drm_file *filp,
+ if (!(flags & drm_vmw_synccpu_allow_cs)) {
+ atomic_dec(&vmw_bo->cpu_writers);
+ }
+- ttm_bo_put(&vmw_bo->tbo);
++ vmw_user_bo_unref(vmw_bo);
+ }
+
+- drm_gem_object_put(&vmw_bo->tbo.base);
+ return ret;
+ }
+
+@@ -540,8 +539,7 @@ int vmw_user_bo_synccpu_ioctl(struct drm_device *dev, void *data,
+ return ret;
+
+ ret = vmw_user_bo_synccpu_grab(vbo, arg->flags);
+- vmw_bo_unreference(&vbo);
+- drm_gem_object_put(&vbo->tbo.base);
++ vmw_user_bo_unref(vbo);
+ if (unlikely(ret != 0)) {
+ if (ret == -ERESTARTSYS || ret == -EBUSY)
+ return -EBUSY;
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.h b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.h
+index 50a836e709949..1d433fceed3d8 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.h
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.h
+@@ -195,6 +195,14 @@ static inline struct vmw_bo *vmw_bo_reference(struct vmw_bo *buf)
+ return buf;
+ }
+
++static inline void vmw_user_bo_unref(struct vmw_bo *vbo)
++{
++ if (vbo) {
++ ttm_bo_put(&vbo->tbo);
++ drm_gem_object_put(&vbo->tbo.base);
++ }
++}
++
+ static inline struct vmw_bo *to_vmw_bo(struct drm_gem_object *gobj)
+ {
+ return container_of((gobj), struct vmw_bo, tbo.base);
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+index 3810a9984a7fd..58bfdf203ecae 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+@@ -1513,4 +1513,16 @@ static inline bool vmw_has_fences(struct vmw_private *vmw)
+ return (vmw_fifo_caps(vmw) & SVGA_FIFO_CAP_FENCE) != 0;
+ }
+
++static inline bool vmw_shadertype_is_valid(enum vmw_sm_type shader_model,
++ u32 shader_type)
++{
++ SVGA3dShaderType max_allowed = SVGA3D_SHADERTYPE_PREDX_MAX;
++
++ if (shader_model >= VMW_SM_5)
++ max_allowed = SVGA3D_SHADERTYPE_MAX;
++ else if (shader_model >= VMW_SM_4)
++ max_allowed = SVGA3D_SHADERTYPE_DX10_MAX;
++ return shader_type >= SVGA3D_SHADERTYPE_MIN && shader_type < max_allowed;
++}
++
+ #endif
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+index 6b9aa2b4ef54a..98e0723ca6f5e 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+@@ -1164,8 +1164,7 @@ static int vmw_translate_mob_ptr(struct vmw_private *dev_priv,
+ }
+ vmw_bo_placement_set(vmw_bo, VMW_BO_DOMAIN_MOB, VMW_BO_DOMAIN_MOB);
+ ret = vmw_validation_add_bo(sw_context->ctx, vmw_bo);
+- ttm_bo_put(&vmw_bo->tbo);
+- drm_gem_object_put(&vmw_bo->tbo.base);
++ vmw_user_bo_unref(vmw_bo);
+ if (unlikely(ret != 0))
+ return ret;
+
+@@ -1221,8 +1220,7 @@ static int vmw_translate_guest_ptr(struct vmw_private *dev_priv,
+ vmw_bo_placement_set(vmw_bo, VMW_BO_DOMAIN_GMR | VMW_BO_DOMAIN_VRAM,
+ VMW_BO_DOMAIN_GMR | VMW_BO_DOMAIN_VRAM);
+ ret = vmw_validation_add_bo(sw_context->ctx, vmw_bo);
+- ttm_bo_put(&vmw_bo->tbo);
+- drm_gem_object_put(&vmw_bo->tbo.base);
++ vmw_user_bo_unref(vmw_bo);
+ if (unlikely(ret != 0))
+ return ret;
+
+@@ -1992,7 +1990,7 @@ static int vmw_cmd_set_shader(struct vmw_private *dev_priv,
+
+ cmd = container_of(header, typeof(*cmd), header);
+
+- if (cmd->body.type >= SVGA3D_SHADERTYPE_PREDX_MAX) {
++ if (!vmw_shadertype_is_valid(VMW_SM_LEGACY, cmd->body.type)) {
+ VMW_DEBUG_USER("Illegal shader type %u.\n",
+ (unsigned int) cmd->body.type);
+ return -EINVAL;
+@@ -2115,8 +2113,6 @@ vmw_cmd_dx_set_single_constant_buffer(struct vmw_private *dev_priv,
+ SVGA3dCmdHeader *header)
+ {
+ VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXSetSingleConstantBuffer);
+- SVGA3dShaderType max_shader_num = has_sm5_context(dev_priv) ?
+- SVGA3D_NUM_SHADERTYPE : SVGA3D_NUM_SHADERTYPE_DX10;
+
+ struct vmw_resource *res = NULL;
+ struct vmw_ctx_validation_info *ctx_node = VMW_GET_CTX_NODE(sw_context);
+@@ -2133,6 +2129,14 @@ vmw_cmd_dx_set_single_constant_buffer(struct vmw_private *dev_priv,
+ if (unlikely(ret != 0))
+ return ret;
+
++ if (!vmw_shadertype_is_valid(dev_priv->sm_type, cmd->body.type) ||
++ cmd->body.slot >= SVGA3D_DX_MAX_CONSTBUFFERS) {
++ VMW_DEBUG_USER("Illegal const buffer shader %u slot %u.\n",
++ (unsigned int) cmd->body.type,
++ (unsigned int) cmd->body.slot);
++ return -EINVAL;
++ }
++
+ binding.bi.ctx = ctx_node->ctx;
+ binding.bi.res = res;
+ binding.bi.bt = vmw_ctx_binding_cb;
+@@ -2141,14 +2145,6 @@ vmw_cmd_dx_set_single_constant_buffer(struct vmw_private *dev_priv,
+ binding.size = cmd->body.sizeInBytes;
+ binding.slot = cmd->body.slot;
+
+- if (binding.shader_slot >= max_shader_num ||
+- binding.slot >= SVGA3D_DX_MAX_CONSTBUFFERS) {
+- VMW_DEBUG_USER("Illegal const buffer shader %u slot %u.\n",
+- (unsigned int) cmd->body.type,
+- (unsigned int) binding.slot);
+- return -EINVAL;
+- }
+-
+ vmw_binding_add(ctx_node->staged, &binding.bi, binding.shader_slot,
+ binding.slot);
+
+@@ -2207,15 +2203,13 @@ static int vmw_cmd_dx_set_shader_res(struct vmw_private *dev_priv,
+ {
+ VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXSetShaderResources) =
+ container_of(header, typeof(*cmd), header);
+- SVGA3dShaderType max_allowed = has_sm5_context(dev_priv) ?
+- SVGA3D_SHADERTYPE_MAX : SVGA3D_SHADERTYPE_DX10_MAX;
+
+ u32 num_sr_view = (cmd->header.size - sizeof(cmd->body)) /
+ sizeof(SVGA3dShaderResourceViewId);
+
+ if ((u64) cmd->body.startView + (u64) num_sr_view >
+ (u64) SVGA3D_DX_MAX_SRVIEWS ||
+- cmd->body.type >= max_allowed) {
++ !vmw_shadertype_is_valid(dev_priv->sm_type, cmd->body.type)) {
+ VMW_DEBUG_USER("Invalid shader binding.\n");
+ return -EINVAL;
+ }
+@@ -2239,8 +2233,6 @@ static int vmw_cmd_dx_set_shader(struct vmw_private *dev_priv,
+ SVGA3dCmdHeader *header)
+ {
+ VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdDXSetShader);
+- SVGA3dShaderType max_allowed = has_sm5_context(dev_priv) ?
+- SVGA3D_SHADERTYPE_MAX : SVGA3D_SHADERTYPE_DX10_MAX;
+ struct vmw_resource *res = NULL;
+ struct vmw_ctx_validation_info *ctx_node = VMW_GET_CTX_NODE(sw_context);
+ struct vmw_ctx_bindinfo_shader binding;
+@@ -2251,8 +2243,7 @@ static int vmw_cmd_dx_set_shader(struct vmw_private *dev_priv,
+
+ cmd = container_of(header, typeof(*cmd), header);
+
+- if (cmd->body.type >= max_allowed ||
+- cmd->body.type < SVGA3D_SHADERTYPE_MIN) {
++ if (!vmw_shadertype_is_valid(dev_priv->sm_type, cmd->body.type)) {
+ VMW_DEBUG_USER("Illegal shader type %u.\n",
+ (unsigned int) cmd->body.type);
+ return -EINVAL;
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+index b62207be3363e..1489ad73c103f 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+@@ -1665,10 +1665,8 @@ static struct drm_framebuffer *vmw_kms_fb_create(struct drm_device *dev,
+
+ err_out:
+ /* vmw_user_lookup_handle takes one ref so does new_fb */
+- if (bo) {
+- vmw_bo_unreference(&bo);
+- drm_gem_object_put(&bo->tbo.base);
+- }
++ if (bo)
++ vmw_user_bo_unref(bo);
+ if (surface)
+ vmw_surface_unreference(&surface);
+
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c b/drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c
+index 7e112319a23ce..fb85f244c3d02 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_overlay.c
+@@ -451,8 +451,7 @@ int vmw_overlay_ioctl(struct drm_device *dev, void *data,
+
+ ret = vmw_overlay_update_stream(dev_priv, buf, arg, true);
+
+- vmw_bo_unreference(&buf);
+- drm_gem_object_put(&buf->tbo.base);
++ vmw_user_bo_unref(buf);
+
+ out_unlock:
+ mutex_unlock(&overlay->mutex);
+diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c b/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c
+index e7226db8b2424..1e81ff2422cf6 100644
+--- a/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c
++++ b/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c
+@@ -809,8 +809,7 @@ static int vmw_shader_define(struct drm_device *dev, struct drm_file *file_priv,
+ shader_type, num_input_sig,
+ num_output_sig, tfile, shader_handle);
+ out_bad_arg:
+- vmw_bo_unreference(&buffer);
+- drm_gem_object_put(&buffer->tbo.base);
++ vmw_user_bo_unref(buffer);
+ return ret;
+ }
+
+diff --git a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_enc.c b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_enc.c
+index db65e77bd3733..664f052978305 100644
+--- a/drivers/media/platform/mediatek/vcodec/mtk_vcodec_enc.c
++++ b/drivers/media/platform/mediatek/vcodec/mtk_vcodec_enc.c
+@@ -821,6 +821,8 @@ static int vb2ops_venc_queue_setup(struct vb2_queue *vq,
+ return -EINVAL;
+
+ if (*nplanes) {
++ if (*nplanes != q_data->fmt->num_planes)
++ return -EINVAL;
+ for (i = 0; i < *nplanes; i++)
+ if (sizes[i] < q_data->sizeimage[i])
+ return -EINVAL;
+diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
+index b9dbad3a8af82..fc5da5d7744da 100644
+--- a/drivers/net/bonding/bond_alb.c
++++ b/drivers/net/bonding/bond_alb.c
+@@ -660,10 +660,10 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
+ return NULL;
+ arp = (struct arp_pkt *)skb_network_header(skb);
+
+- /* Don't modify or load balance ARPs that do not originate locally
+- * (e.g.,arrive via a bridge).
++ /* Don't modify or load balance ARPs that do not originate
++ * from the bond itself or a VLAN directly above the bond.
+ */
+- if (!bond_slave_has_mac_rx(bond, arp->mac_src))
++ if (!bond_slave_has_mac_rcu(bond, arp->mac_src))
+ return NULL;
+
+ dev = ip_dev_find(dev_net(bond->dev), arp->ip_src);
+diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
+index 4068d962203d6..98c669ad51414 100644
+--- a/drivers/net/can/vxcan.c
++++ b/drivers/net/can/vxcan.c
+@@ -192,12 +192,7 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
+
+ nla_peer = data[VXCAN_INFO_PEER];
+ ifmp = nla_data(nla_peer);
+- err = rtnl_nla_parse_ifla(peer_tb,
+- nla_data(nla_peer) +
+- sizeof(struct ifinfomsg),
+- nla_len(nla_peer) -
+- sizeof(struct ifinfomsg),
+- NULL);
++ err = rtnl_nla_parse_ifinfomsg(peer_tb, nla_peer, extack);
+ if (err < 0)
+ return err;
+
+diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
+index 7e773c4ba0463..32dc4f19c82c6 100644
+--- a/drivers/net/dsa/mt7530.c
++++ b/drivers/net/dsa/mt7530.c
+@@ -1006,6 +1006,10 @@ mt753x_trap_frames(struct mt7530_priv *priv)
+ mt7530_rmw(priv, MT753X_BPC, MT753X_BPDU_PORT_FW_MASK,
+ MT753X_BPDU_CPU_ONLY);
+
++ /* Trap 802.1X PAE frames to the CPU port(s) */
++ mt7530_rmw(priv, MT753X_BPC, MT753X_PAE_PORT_FW_MASK,
++ MT753X_PAE_PORT_FW(MT753X_BPDU_CPU_ONLY));
++
+ /* Trap LLDP frames with :0E MAC DA to the CPU port(s) */
+ mt7530_rmw(priv, MT753X_RGAC2, MT753X_R0E_PORT_FW_MASK,
+ MT753X_R0E_PORT_FW(MT753X_BPDU_CPU_ONLY));
+diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h
+index 08045b035e6ab..17e42d30fff4b 100644
+--- a/drivers/net/dsa/mt7530.h
++++ b/drivers/net/dsa/mt7530.h
+@@ -66,6 +66,8 @@ enum mt753x_id {
+ /* Registers for BPDU and PAE frame control*/
+ #define MT753X_BPC 0x24
+ #define MT753X_BPDU_PORT_FW_MASK GENMASK(2, 0)
++#define MT753X_PAE_PORT_FW_MASK GENMASK(18, 16)
++#define MT753X_PAE_PORT_FW(x) FIELD_PREP(MT753X_PAE_PORT_FW_MASK, x)
+
+ /* Register for :03 and :0E MAC DA frame control */
+ #define MT753X_RGAC2 0x2c
+diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c
+index ca69973ae91b9..b1ecd08cec96a 100644
+--- a/drivers/net/dsa/ocelot/felix_vsc9959.c
++++ b/drivers/net/dsa/ocelot/felix_vsc9959.c
+@@ -1081,6 +1081,9 @@ static u64 vsc9959_tas_remaining_gate_len_ps(u64 gate_len_ns)
+ if (gate_len_ns == U64_MAX)
+ return U64_MAX;
+
++ if (gate_len_ns < VSC9959_TAS_MIN_GATE_LEN_NS)
++ return 0;
++
+ return (gate_len_ns - VSC9959_TAS_MIN_GATE_LEN_NS) * PSEC_PER_NSEC;
+ }
+
+diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
+index 10c7c232cc4ec..52ee3751187a2 100644
+--- a/drivers/net/ethernet/broadcom/bgmac.c
++++ b/drivers/net/ethernet/broadcom/bgmac.c
+@@ -1448,7 +1448,7 @@ int bgmac_phy_connect_direct(struct bgmac *bgmac)
+ int err;
+
+ phy_dev = fixed_phy_register(PHY_POLL, &fphy_status, NULL);
+- if (!phy_dev || IS_ERR(phy_dev)) {
++ if (IS_ERR(phy_dev)) {
+ dev_err(bgmac->dev, "Failed to register fixed PHY device\n");
+ return -ENODEV;
+ }
+diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c
+index 0092e46c46f83..cc3afb605b1ec 100644
+--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
++++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
+@@ -617,7 +617,7 @@ static int bcmgenet_mii_pd_init(struct bcmgenet_priv *priv)
+ };
+
+ phydev = fixed_phy_register(PHY_POLL, &fphy_status, NULL);
+- if (!phydev || IS_ERR(phydev)) {
++ if (IS_ERR(phydev)) {
+ dev_err(kdev, "failed to register fixed PHY device\n");
+ return -ENODEV;
+ }
+diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
+index 5ef073a79ce94..cb2810f175ccd 100644
+--- a/drivers/net/ethernet/broadcom/tg3.c
++++ b/drivers/net/ethernet/broadcom/tg3.c
+@@ -6881,7 +6881,10 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
+
+ ri->data = NULL;
+
+- skb = build_skb(data, frag_size);
++ if (frag_size)
++ skb = build_skb(data, frag_size);
++ else
++ skb = slab_build_skb(data);
+ if (!skb) {
+ tg3_frag_free(frag_size != 0, data);
+ goto drop_it_no_recycle;
+diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
+index c2e7037c7ba1c..7750702900fa6 100644
+--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
++++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
+@@ -1466,7 +1466,7 @@ static void make_established(struct sock *sk, u32 snd_isn, unsigned int opt)
+ tp->write_seq = snd_isn;
+ tp->snd_nxt = snd_isn;
+ tp->snd_una = snd_isn;
+- inet_sk(sk)->inet_id = get_random_u16();
++ atomic_set(&inet_sk(sk)->inet_id, get_random_u16());
+ assign_rxopt(sk, opt);
+
+ if (tp->rcv_wnd > (RCV_BUFSIZ_M << 10))
+diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
+index 113fcb3e353ea..832a2ae019509 100644
+--- a/drivers/net/ethernet/ibm/ibmveth.c
++++ b/drivers/net/ethernet/ibm/ibmveth.c
+@@ -203,7 +203,7 @@ static inline void ibmveth_flush_buffer(void *addr, unsigned long length)
+ unsigned long offset;
+
+ for (offset = 0; offset < length; offset += SMP_CACHE_BYTES)
+- asm("dcbfl %0,%1" :: "b" (addr), "r" (offset));
++ asm("dcbf %0,%1,1" :: "b" (addr), "r" (offset));
+ }
+
+ /* replenish the buffers for a pool. note that we don't need to
+diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
+index b847bd105b16e..5d21cb4ef6301 100644
+--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
++++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
+@@ -2615,7 +2615,7 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
+ retval = i40e_correct_mac_vlan_filters
+ (vsi, &tmp_add_list, &tmp_del_list,
+ vlan_filters);
+- else
++ else if (pf->vf)
+ retval = i40e_correct_vf_mac_vlan_filters
+ (vsi, &tmp_add_list, &tmp_del_list,
+ vlan_filters, pf->vf[vsi->vf_id].trusted);
+@@ -2788,7 +2788,8 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
+ }
+
+ /* if the VF is not trusted do not do promisc */
+- if ((vsi->type == I40E_VSI_SRIOV) && !pf->vf[vsi->vf_id].trusted) {
++ if (vsi->type == I40E_VSI_SRIOV && pf->vf &&
++ !pf->vf[vsi->vf_id].trusted) {
+ clear_bit(__I40E_VSI_OVERFLOW_PROMISC, vsi->state);
+ goto out;
+ }
+diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
+index 619cb07a40691..25e09ab708ca1 100644
+--- a/drivers/net/ethernet/intel/ice/ice_base.c
++++ b/drivers/net/ethernet/intel/ice/ice_base.c
+@@ -393,7 +393,8 @@ static int ice_setup_rx_ctx(struct ice_rx_ring *ring)
+ /* Receive Packet Data Buffer Size.
+ * The Packet Data Buffer Size is defined in 128 byte units.
+ */
+- rlan_ctx.dbuf = ring->rx_buf_len >> ICE_RLAN_CTX_DBUF_S;
++ rlan_ctx.dbuf = DIV_ROUND_UP(ring->rx_buf_len,
++ BIT_ULL(ICE_RLAN_CTX_DBUF_S));
+
+ /* use 32 byte descriptors */
+ rlan_ctx.dsize = 1;
+diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
+index 588ad8696756d..f1dca59bd8449 100644
+--- a/drivers/net/ethernet/intel/ice/ice_sriov.c
++++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
+@@ -1171,7 +1171,7 @@ int ice_set_vf_spoofchk(struct net_device *netdev, int vf_id, bool ena)
+ if (!vf)
+ return -EINVAL;
+
+- ret = ice_check_vf_ready_for_reset(vf);
++ ret = ice_check_vf_ready_for_cfg(vf);
+ if (ret)
+ goto out_put_vf;
+
+@@ -1286,7 +1286,7 @@ int ice_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
+ goto out_put_vf;
+ }
+
+- ret = ice_check_vf_ready_for_reset(vf);
++ ret = ice_check_vf_ready_for_cfg(vf);
+ if (ret)
+ goto out_put_vf;
+
+@@ -1340,7 +1340,7 @@ int ice_set_vf_trust(struct net_device *netdev, int vf_id, bool trusted)
+ return -EOPNOTSUPP;
+ }
+
+- ret = ice_check_vf_ready_for_reset(vf);
++ ret = ice_check_vf_ready_for_cfg(vf);
+ if (ret)
+ goto out_put_vf;
+
+@@ -1653,7 +1653,7 @@ ice_set_vf_port_vlan(struct net_device *netdev, int vf_id, u16 vlan_id, u8 qos,
+ if (!vf)
+ return -EINVAL;
+
+- ret = ice_check_vf_ready_for_reset(vf);
++ ret = ice_check_vf_ready_for_cfg(vf);
+ if (ret)
+ goto out_put_vf;
+
+diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+index bf74a2f3a4f8c..14da7ebaaead7 100644
+--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
++++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+@@ -185,25 +185,6 @@ int ice_check_vf_ready_for_cfg(struct ice_vf *vf)
+ return 0;
+ }
+
+-/**
+- * ice_check_vf_ready_for_reset - check if VF is ready to be reset
+- * @vf: VF to check if it's ready to be reset
+- *
+- * The purpose of this function is to ensure that the VF is not in reset,
+- * disabled, and is both initialized and active, thus enabling us to safely
+- * initialize another reset.
+- */
+-int ice_check_vf_ready_for_reset(struct ice_vf *vf)
+-{
+- int ret;
+-
+- ret = ice_check_vf_ready_for_cfg(vf);
+- if (!ret && !test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states))
+- ret = -EAGAIN;
+-
+- return ret;
+-}
+-
+ /**
+ * ice_trigger_vf_reset - Reset a VF on HW
+ * @vf: pointer to the VF structure
+@@ -631,11 +612,17 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
+ return 0;
+ }
+
++ if (flags & ICE_VF_RESET_LOCK)
++ mutex_lock(&vf->cfg_lock);
++ else
++ lockdep_assert_held(&vf->cfg_lock);
++
+ if (ice_is_vf_disabled(vf)) {
+ vsi = ice_get_vf_vsi(vf);
+ if (!vsi) {
+ dev_dbg(dev, "VF is already removed\n");
+- return -EINVAL;
++ err = -EINVAL;
++ goto out_unlock;
+ }
+ ice_vsi_stop_lan_tx_rings(vsi, ICE_NO_RESET, vf->vf_id);
+
+@@ -644,14 +631,9 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
+
+ dev_dbg(dev, "VF is already disabled, there is no need for resetting it, telling VM, all is fine %d\n",
+ vf->vf_id);
+- return 0;
++ goto out_unlock;
+ }
+
+- if (flags & ICE_VF_RESET_LOCK)
+- mutex_lock(&vf->cfg_lock);
+- else
+- lockdep_assert_held(&vf->cfg_lock);
+-
+ /* Set VF disable bit state here, before triggering reset */
+ set_bit(ICE_VF_STATE_DIS, vf->vf_states);
+ ice_trigger_vf_reset(vf, flags & ICE_VF_RESET_VFLR, false);
+diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
+index a38ef00a36794..e3cda6fb71ab1 100644
+--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h
++++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
+@@ -215,7 +215,6 @@ u16 ice_get_num_vfs(struct ice_pf *pf);
+ struct ice_vsi *ice_get_vf_vsi(struct ice_vf *vf);
+ bool ice_is_vf_disabled(struct ice_vf *vf);
+ int ice_check_vf_ready_for_cfg(struct ice_vf *vf);
+-int ice_check_vf_ready_for_reset(struct ice_vf *vf);
+ void ice_set_vf_state_dis(struct ice_vf *vf);
+ bool ice_is_any_vf_in_unicast_promisc(struct ice_pf *pf);
+ void
+diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+index f4a524f80b110..97243c616d5d6 100644
+--- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c
++++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+@@ -3955,7 +3955,6 @@ error_handler:
+ ice_vc_notify_vf_link_state(vf);
+ break;
+ case VIRTCHNL_OP_RESET_VF:
+- clear_bit(ICE_VF_STATE_ACTIVE, vf->vf_states);
+ ops->reset_vf(vf);
+ break;
+ case VIRTCHNL_OP_ADD_ETH_ADDR:
+diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
+index 405886ee52615..319c544b9f04c 100644
+--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
++++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
+@@ -1385,18 +1385,6 @@ void igb_ptp_init(struct igb_adapter *adapter)
+ return;
+ }
+
+- spin_lock_init(&adapter->tmreg_lock);
+- INIT_WORK(&adapter->ptp_tx_work, igb_ptp_tx_work);
+-
+- if (adapter->ptp_flags & IGB_PTP_OVERFLOW_CHECK)
+- INIT_DELAYED_WORK(&adapter->ptp_overflow_work,
+- igb_ptp_overflow_check);
+-
+- adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
+- adapter->tstamp_config.tx_type = HWTSTAMP_TX_OFF;
+-
+- igb_ptp_reset(adapter);
+-
+ adapter->ptp_clock = ptp_clock_register(&adapter->ptp_caps,
+ &adapter->pdev->dev);
+ if (IS_ERR(adapter->ptp_clock)) {
+@@ -1406,6 +1394,18 @@ void igb_ptp_init(struct igb_adapter *adapter)
+ dev_info(&adapter->pdev->dev, "added PHC on %s\n",
+ adapter->netdev->name);
+ adapter->ptp_flags |= IGB_PTP_ENABLED;
++
++ spin_lock_init(&adapter->tmreg_lock);
++ INIT_WORK(&adapter->ptp_tx_work, igb_ptp_tx_work);
++
++ if (adapter->ptp_flags & IGB_PTP_OVERFLOW_CHECK)
++ INIT_DELAYED_WORK(&adapter->ptp_overflow_work,
++ igb_ptp_overflow_check);
++
++ adapter->tstamp_config.rx_filter = HWTSTAMP_FILTER_NONE;
++ adapter->tstamp_config.tx_type = HWTSTAMP_TX_OFF;
++
++ igb_ptp_reset(adapter);
+ }
+ }
+
+diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
+index 44a5070299465..2f780cc90883c 100644
+--- a/drivers/net/ethernet/intel/igc/igc_defines.h
++++ b/drivers/net/ethernet/intel/igc/igc_defines.h
+@@ -546,7 +546,7 @@
+ #define IGC_PTM_CTRL_START_NOW BIT(29) /* Start PTM Now */
+ #define IGC_PTM_CTRL_EN BIT(30) /* Enable PTM */
+ #define IGC_PTM_CTRL_TRIG BIT(31) /* PTM Cycle trigger */
+-#define IGC_PTM_CTRL_SHRT_CYC(usec) (((usec) & 0x2f) << 2)
++#define IGC_PTM_CTRL_SHRT_CYC(usec) (((usec) & 0x3f) << 2)
+ #define IGC_PTM_CTRL_PTM_TO(usec) (((usec) & 0xff) << 8)
+
+ #define IGC_PTM_SHORT_CYC_DEFAULT 10 /* Default Short/interrupted cycle interval */
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+index 8cdf91a5bf44f..49c1dbe5ec788 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+@@ -4016,9 +4016,10 @@ rx_frscfg:
+ if (link < 0)
+ return NIX_AF_ERR_RX_LINK_INVALID;
+
+- nix_find_link_frs(rvu, req, pcifunc);
+
+ linkcfg:
++ nix_find_link_frs(rvu, req, pcifunc);
++
+ cfg = rvu_read64(rvu, blkaddr, NIX_AF_RX_LINKX_CFG(link));
+ cfg = (cfg & ~(0xFFFFULL << 16)) | ((u64)req->maxlen << 16);
+ if (req->update_minlen)
+diff --git a/drivers/net/ethernet/mediatek/mtk_wed.c b/drivers/net/ethernet/mediatek/mtk_wed.c
+index 985cff910f30c..3b651efcc25e1 100644
+--- a/drivers/net/ethernet/mediatek/mtk_wed.c
++++ b/drivers/net/ethernet/mediatek/mtk_wed.c
+@@ -221,9 +221,13 @@ void mtk_wed_fe_reset(void)
+
+ for (i = 0; i < ARRAY_SIZE(hw_list); i++) {
+ struct mtk_wed_hw *hw = hw_list[i];
+- struct mtk_wed_device *dev = hw->wed_dev;
++ struct mtk_wed_device *dev;
+ int err;
+
++ if (!hw)
++ break;
++
++ dev = hw->wed_dev;
+ if (!dev || !dev->wlan.reset)
+ continue;
+
+@@ -244,8 +248,12 @@ void mtk_wed_fe_reset_complete(void)
+
+ for (i = 0; i < ARRAY_SIZE(hw_list); i++) {
+ struct mtk_wed_hw *hw = hw_list[i];
+- struct mtk_wed_device *dev = hw->wed_dev;
++ struct mtk_wed_device *dev;
++
++ if (!hw)
++ break;
+
++ dev = hw->wed_dev;
+ if (!dev || !dev->wlan.reset_complete)
+ continue;
+
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c b/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c
+index bd1a51a0a5408..f208a237d0b52 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c
+@@ -32,8 +32,8 @@ static const struct mlxsw_afk_element_info mlxsw_afk_element_infos[] = {
+ MLXSW_AFK_ELEMENT_INFO_U32(IP_TTL_, 0x18, 0, 8),
+ MLXSW_AFK_ELEMENT_INFO_U32(IP_ECN, 0x18, 9, 2),
+ MLXSW_AFK_ELEMENT_INFO_U32(IP_DSCP, 0x18, 11, 6),
+- MLXSW_AFK_ELEMENT_INFO_U32(VIRT_ROUTER_MSB, 0x18, 17, 3),
+- MLXSW_AFK_ELEMENT_INFO_U32(VIRT_ROUTER_LSB, 0x18, 20, 8),
++ MLXSW_AFK_ELEMENT_INFO_U32(VIRT_ROUTER_MSB, 0x18, 17, 4),
++ MLXSW_AFK_ELEMENT_INFO_U32(VIRT_ROUTER_LSB, 0x18, 21, 8),
+ MLXSW_AFK_ELEMENT_INFO_BUF(SRC_IP_96_127, 0x20, 4),
+ MLXSW_AFK_ELEMENT_INFO_BUF(SRC_IP_64_95, 0x24, 4),
+ MLXSW_AFK_ELEMENT_INFO_BUF(SRC_IP_32_63, 0x28, 4),
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
+index c968309657dd1..51eea1f0529c8 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
+@@ -517,11 +517,15 @@ static void mlxsw_pci_skb_cb_ts_set(struct mlxsw_pci *mlxsw_pci,
+ struct sk_buff *skb,
+ enum mlxsw_pci_cqe_v cqe_v, char *cqe)
+ {
++ u8 ts_type;
++
+ if (cqe_v != MLXSW_PCI_CQE_V2)
+ return;
+
+- if (mlxsw_pci_cqe2_time_stamp_type_get(cqe) !=
+- MLXSW_PCI_CQE_TIME_STAMP_TYPE_UTC)
++ ts_type = mlxsw_pci_cqe2_time_stamp_type_get(cqe);
++
++ if (ts_type != MLXSW_PCI_CQE_TIME_STAMP_TYPE_UTC &&
++ ts_type != MLXSW_PCI_CQE_TIME_STAMP_TYPE_MIRROR_UTC)
+ return;
+
+ mlxsw_skb_cb(skb)->cqe_ts.sec = mlxsw_pci_cqe2_time_stamp_sec_get(cqe);
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/reg.h b/drivers/net/ethernet/mellanox/mlxsw/reg.h
+index 8165bf31a99ae..17160e867befb 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/reg.h
++++ b/drivers/net/ethernet/mellanox/mlxsw/reg.h
+@@ -97,14 +97,6 @@ MLXSW_ITEM32(reg, sspr, m, 0x00, 31, 1);
+ */
+ MLXSW_ITEM32_LP(reg, sspr, 0x00, 16, 0x00, 12);
+
+-/* reg_sspr_sub_port
+- * Virtual port within the physical port.
+- * Should be set to 0 when virtual ports are not enabled on the port.
+- *
+- * Access: RW
+- */
+-MLXSW_ITEM32(reg, sspr, sub_port, 0x00, 8, 8);
+-
+ /* reg_sspr_system_port
+ * Unique identifier within the stacking domain that represents all the ports
+ * that are available in the system (external ports).
+@@ -120,7 +112,6 @@ static inline void mlxsw_reg_sspr_pack(char *payload, u16 local_port)
+ MLXSW_REG_ZERO(sspr, payload);
+ mlxsw_reg_sspr_m_set(payload, 1);
+ mlxsw_reg_sspr_local_port_set(payload, local_port);
+- mlxsw_reg_sspr_sub_port_set(payload, 0);
+ mlxsw_reg_sspr_system_port_set(payload, local_port);
+ }
+
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum2_mr_tcam.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum2_mr_tcam.c
+index e4f4cded2b6f9..b1178b7a7f51a 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum2_mr_tcam.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum2_mr_tcam.c
+@@ -193,7 +193,7 @@ mlxsw_sp2_mr_tcam_rule_parse(struct mlxsw_sp_acl_rule *rule,
+ key->vrid, GENMASK(7, 0));
+ mlxsw_sp_acl_rulei_keymask_u32(rulei,
+ MLXSW_AFK_ELEMENT_VIRT_ROUTER_MSB,
+- key->vrid >> 8, GENMASK(2, 0));
++ key->vrid >> 8, GENMASK(3, 0));
+ switch (key->proto) {
+ case MLXSW_SP_L3_PROTO_IPV4:
+ return mlxsw_sp2_mr_tcam_rule_parse4(rulei, key);
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c
+index 00c32320f8915..173808c096bab 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_flex_keys.c
+@@ -169,7 +169,7 @@ static struct mlxsw_afk_element_inst mlxsw_sp_afk_element_info_ipv4_2[] = {
+
+ static struct mlxsw_afk_element_inst mlxsw_sp_afk_element_info_ipv4_4[] = {
+ MLXSW_AFK_ELEMENT_INST_U32(VIRT_ROUTER_LSB, 0x04, 24, 8),
+- MLXSW_AFK_ELEMENT_INST_U32(VIRT_ROUTER_MSB, 0x00, 0, 3),
++ MLXSW_AFK_ELEMENT_INST_EXT_U32(VIRT_ROUTER_MSB, 0x00, 0, 3, 0, true),
+ };
+
+ static struct mlxsw_afk_element_inst mlxsw_sp_afk_element_info_ipv6_0[] = {
+@@ -319,7 +319,7 @@ static struct mlxsw_afk_element_inst mlxsw_sp_afk_element_info_mac_5b[] = {
+
+ static struct mlxsw_afk_element_inst mlxsw_sp_afk_element_info_ipv4_4b[] = {
+ MLXSW_AFK_ELEMENT_INST_U32(VIRT_ROUTER_LSB, 0x04, 13, 8),
+- MLXSW_AFK_ELEMENT_INST_EXT_U32(VIRT_ROUTER_MSB, 0x04, 21, 4, 0, true),
++ MLXSW_AFK_ELEMENT_INST_U32(VIRT_ROUTER_MSB, 0x04, 21, 4),
+ };
+
+ static struct mlxsw_afk_element_inst mlxsw_sp_afk_element_info_ipv6_2b[] = {
+diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
+index b15dd9a3ad540..1b55928e89b8a 100644
+--- a/drivers/net/ipvlan/ipvlan_main.c
++++ b/drivers/net/ipvlan/ipvlan_main.c
+@@ -748,7 +748,8 @@ static int ipvlan_device_event(struct notifier_block *unused,
+
+ write_pnet(&port->pnet, newnet);
+
+- ipvlan_migrate_l3s_hook(oldnet, newnet);
++ if (port->mode == IPVLAN_MODE_L3S)
++ ipvlan_migrate_l3s_hook(oldnet, newnet);
+ break;
+ }
+ case NETDEV_UNREGISTER:
+diff --git a/drivers/net/mdio/mdio-bitbang.c b/drivers/net/mdio/mdio-bitbang.c
+index b83932562be21..81b7748c10ce0 100644
+--- a/drivers/net/mdio/mdio-bitbang.c
++++ b/drivers/net/mdio/mdio-bitbang.c
+@@ -186,7 +186,7 @@ int mdiobb_read_c45(struct mii_bus *bus, int phy, int devad, int reg)
+ struct mdiobb_ctrl *ctrl = bus->priv;
+
+ mdiobb_cmd_addr(ctrl, phy, devad, reg);
+- mdiobb_cmd(ctrl, MDIO_C45_READ, phy, reg);
++ mdiobb_cmd(ctrl, MDIO_C45_READ, phy, devad);
+
+ return mdiobb_read_common(bus, phy);
+ }
+@@ -222,7 +222,7 @@ int mdiobb_write_c45(struct mii_bus *bus, int phy, int devad, int reg, u16 val)
+ struct mdiobb_ctrl *ctrl = bus->priv;
+
+ mdiobb_cmd_addr(ctrl, phy, devad, reg);
+- mdiobb_cmd(ctrl, MDIO_C45_WRITE, phy, reg);
++ mdiobb_cmd(ctrl, MDIO_C45_WRITE, phy, devad);
+
+ return mdiobb_write_common(bus, val);
+ }
+diff --git a/drivers/net/veth.c b/drivers/net/veth.c
+index 76019949e3fe9..c977b704f1342 100644
+--- a/drivers/net/veth.c
++++ b/drivers/net/veth.c
+@@ -1851,10 +1851,7 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
+
+ nla_peer = data[VETH_INFO_PEER];
+ ifmp = nla_data(nla_peer);
+- err = rtnl_nla_parse_ifla(peer_tb,
+- nla_data(nla_peer) + sizeof(struct ifinfomsg),
+- nla_len(nla_peer) - sizeof(struct ifinfomsg),
+- NULL);
++ err = rtnl_nla_parse_ifinfomsg(peer_tb, nla_peer, extack);
+ if (err < 0)
+ return err;
+
+diff --git a/drivers/net/wireless/intel/iwlwifi/Kconfig b/drivers/net/wireless/intel/iwlwifi/Kconfig
+index b20409f8c13ab..20971304fdef4 100644
+--- a/drivers/net/wireless/intel/iwlwifi/Kconfig
++++ b/drivers/net/wireless/intel/iwlwifi/Kconfig
+@@ -66,6 +66,7 @@ config IWLMVM
+ tristate "Intel Wireless WiFi MVM Firmware support"
+ select WANT_DEV_COREDUMP
+ depends on MAC80211
++ depends on PTP_1588_CLOCK_OPTIONAL
+ help
+ This is the driver that supports the MVM firmware. The list
+ of the devices that use this firmware is available here:
+diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
+index e311d406b1705..4999636eaa926 100644
+--- a/drivers/of/dynamic.c
++++ b/drivers/of/dynamic.c
+@@ -63,15 +63,14 @@ int of_reconfig_notifier_unregister(struct notifier_block *nb)
+ }
+ EXPORT_SYMBOL_GPL(of_reconfig_notifier_unregister);
+
+-#ifdef DEBUG
+-const char *action_names[] = {
++static const char *action_names[] = {
++ [0] = "INVALID",
+ [OF_RECONFIG_ATTACH_NODE] = "ATTACH_NODE",
+ [OF_RECONFIG_DETACH_NODE] = "DETACH_NODE",
+ [OF_RECONFIG_ADD_PROPERTY] = "ADD_PROPERTY",
+ [OF_RECONFIG_REMOVE_PROPERTY] = "REMOVE_PROPERTY",
+ [OF_RECONFIG_UPDATE_PROPERTY] = "UPDATE_PROPERTY",
+ };
+-#endif
+
+ int of_reconfig_notify(unsigned long action, struct of_reconfig_data *p)
+ {
+@@ -620,21 +619,9 @@ static int __of_changeset_entry_apply(struct of_changeset_entry *ce)
+ }
+
+ ret = __of_add_property(ce->np, ce->prop);
+- if (ret) {
+- pr_err("changeset: add_property failed @%pOF/%s\n",
+- ce->np,
+- ce->prop->name);
+- break;
+- }
+ break;
+ case OF_RECONFIG_REMOVE_PROPERTY:
+ ret = __of_remove_property(ce->np, ce->prop);
+- if (ret) {
+- pr_err("changeset: remove_property failed @%pOF/%s\n",
+- ce->np,
+- ce->prop->name);
+- break;
+- }
+ break;
+
+ case OF_RECONFIG_UPDATE_PROPERTY:
+@@ -648,20 +635,17 @@ static int __of_changeset_entry_apply(struct of_changeset_entry *ce)
+ }
+
+ ret = __of_update_property(ce->np, ce->prop, &old_prop);
+- if (ret) {
+- pr_err("changeset: update_property failed @%pOF/%s\n",
+- ce->np,
+- ce->prop->name);
+- break;
+- }
+ break;
+ default:
+ ret = -EINVAL;
+ }
+ raw_spin_unlock_irqrestore(&devtree_lock, flags);
+
+- if (ret)
++ if (ret) {
++ pr_err("changeset: apply failed: %-15s %pOF:%s\n",
++ action_names[ce->action], ce->np, ce->prop->name);
+ return ret;
++ }
+
+ switch (ce->action) {
+ case OF_RECONFIG_ATTACH_NODE:
+@@ -947,6 +931,9 @@ int of_changeset_action(struct of_changeset *ocs, unsigned long action,
+ if (!ce)
+ return -ENOMEM;
+
++ if (WARN_ON(action >= ARRAY_SIZE(action_names)))
++ return -EINVAL;
++
+ /* get a reference to the node */
+ ce->action = action;
+ ce->np = of_node_get(np);
+diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
+index f26d2ba8a3715..68278340cecfe 100644
+--- a/drivers/of/kexec.c
++++ b/drivers/of/kexec.c
+@@ -184,7 +184,8 @@ int __init ima_free_kexec_buffer(void)
+ if (ret)
+ return ret;
+
+- return memblock_phys_free(addr, size);
++ memblock_free_late(addr, size);
++ return 0;
+ }
+ #endif
+
+diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
+index 2191c01365317..4fe02e9f7dcdd 100644
+--- a/drivers/of/unittest.c
++++ b/drivers/of/unittest.c
+@@ -664,12 +664,12 @@ static void __init of_unittest_parse_phandle_with_args_map(void)
+ memset(&args, 0, sizeof(args));
+
+ EXPECT_BEGIN(KERN_INFO,
+- "OF: /testcase-data/phandle-tests/consumer-b: could not find phandle");
++ "OF: /testcase-data/phandle-tests/consumer-b: could not find phandle 12345678");
+
+ rc = of_parse_phandle_with_args_map(np, "phandle-list-bad-phandle",
+ "phandle", 0, &args);
+ EXPECT_END(KERN_INFO,
+- "OF: /testcase-data/phandle-tests/consumer-b: could not find phandle");
++ "OF: /testcase-data/phandle-tests/consumer-b: could not find phandle 12345678");
+
+ unittest(rc == -EINVAL, "expected:%i got:%i\n", -EINVAL, rc);
+
+diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
+index 5b1f271c6034b..601129772b2d5 100644
+--- a/drivers/pci/hotplug/acpiphp_glue.c
++++ b/drivers/pci/hotplug/acpiphp_glue.c
+@@ -512,12 +512,15 @@ static void enable_slot(struct acpiphp_slot *slot, bool bridge)
+ if (pass && dev->subordinate) {
+ check_hotplug_bridge(slot, dev);
+ pcibios_resource_survey_bus(dev->subordinate);
+- __pci_bus_size_bridges(dev->subordinate,
+- &add_list);
++ if (pci_is_root_bus(bus))
++ __pci_bus_size_bridges(dev->subordinate, &add_list);
+ }
+ }
+ }
+- __pci_bus_assign_resources(bus, &add_list, NULL);
++ if (pci_is_root_bus(bus))
++ __pci_bus_assign_resources(bus, &add_list, NULL);
++ else
++ pci_assign_unassigned_bridge_resources(bus->self);
+ }
+
+ acpiphp_sanitize_bus(bus);
+diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
+index b129d7c76b3e9..3b10e0a01b1d2 100644
+--- a/drivers/pinctrl/pinctrl-amd.c
++++ b/drivers/pinctrl/pinctrl-amd.c
+@@ -862,6 +862,33 @@ static const struct pinconf_ops amd_pinconf_ops = {
+ .pin_config_group_set = amd_pinconf_group_set,
+ };
+
++static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
++{
++ struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
++ unsigned long flags;
++ u32 pin_reg, mask;
++ int i;
++
++ mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
++ BIT(WAKE_CNTRL_OFF_S4);
++
++ for (i = 0; i < desc->npins; i++) {
++ int pin = desc->pins[i].number;
++ const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin);
++
++ if (!pd)
++ continue;
++
++ raw_spin_lock_irqsave(&gpio_dev->lock, flags);
++
++ pin_reg = readl(gpio_dev->base + pin * 4);
++ pin_reg &= ~mask;
++ writel(pin_reg, gpio_dev->base + pin * 4);
++
++ raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
++ }
++}
++
+ #ifdef CONFIG_PM_SLEEP
+ static bool amd_gpio_should_save(struct amd_gpio *gpio_dev, unsigned int pin)
+ {
+@@ -1099,6 +1126,9 @@ static int amd_gpio_probe(struct platform_device *pdev)
+ return PTR_ERR(gpio_dev->pctrl);
+ }
+
++ /* Disable and mask interrupts */
++ amd_gpio_irq_init(gpio_dev);
++
+ girq = &gpio_dev->gc.irq;
+ gpio_irq_chip_set_chip(girq, &amd_gpio_irqchip);
+ /* This will let us handle the parent IRQ in the driver */
+diff --git a/drivers/pinctrl/renesas/pinctrl-rza2.c b/drivers/pinctrl/renesas/pinctrl-rza2.c
+index 40b1326a10776..5591ddf16fdfd 100644
+--- a/drivers/pinctrl/renesas/pinctrl-rza2.c
++++ b/drivers/pinctrl/renesas/pinctrl-rza2.c
+@@ -14,6 +14,7 @@
+ #include <linux/gpio/driver.h>
+ #include <linux/io.h>
+ #include <linux/module.h>
++#include <linux/mutex.h>
+ #include <linux/of_device.h>
+ #include <linux/pinctrl/pinmux.h>
+
+@@ -46,6 +47,7 @@ struct rza2_pinctrl_priv {
+ struct pinctrl_dev *pctl;
+ struct pinctrl_gpio_range gpio_range;
+ int npins;
++ struct mutex mutex; /* serialize adding groups and functions */
+ };
+
+ #define RZA2_PDR(port) (0x0000 + (port) * 2) /* Direction 16-bit */
+@@ -358,10 +360,14 @@ static int rza2_dt_node_to_map(struct pinctrl_dev *pctldev,
+ psel_val[i] = MUX_FUNC(value);
+ }
+
++ mutex_lock(&priv->mutex);
++
+ /* Register a single pin group listing all the pins we read from DT */
+ gsel = pinctrl_generic_add_group(pctldev, np->name, pins, npins, NULL);
+- if (gsel < 0)
+- return gsel;
++ if (gsel < 0) {
++ ret = gsel;
++ goto unlock;
++ }
+
+ /*
+ * Register a single group function where the 'data' is an array PSEL
+@@ -390,6 +396,8 @@ static int rza2_dt_node_to_map(struct pinctrl_dev *pctldev,
+ (*map)->data.mux.function = np->name;
+ *num_maps = 1;
+
++ mutex_unlock(&priv->mutex);
++
+ return 0;
+
+ remove_function:
+@@ -398,6 +406,9 @@ remove_function:
+ remove_group:
+ pinctrl_generic_remove_group(pctldev, gsel);
+
++unlock:
++ mutex_unlock(&priv->mutex);
++
+ dev_err(priv->dev, "Unable to parse DT node %s\n", np->name);
+
+ return ret;
+@@ -473,6 +484,8 @@ static int rza2_pinctrl_probe(struct platform_device *pdev)
+ if (IS_ERR(priv->base))
+ return PTR_ERR(priv->base);
+
++ mutex_init(&priv->mutex);
++
+ platform_set_drvdata(pdev, priv);
+
+ priv->npins = (int)(uintptr_t)of_device_get_match_data(&pdev->dev) *
+diff --git a/drivers/pinctrl/renesas/pinctrl-rzg2l.c b/drivers/pinctrl/renesas/pinctrl-rzg2l.c
+index b53d26167da52..6e8a76556e238 100644
+--- a/drivers/pinctrl/renesas/pinctrl-rzg2l.c
++++ b/drivers/pinctrl/renesas/pinctrl-rzg2l.c
+@@ -11,6 +11,7 @@
+ #include <linux/interrupt.h>
+ #include <linux/io.h>
+ #include <linux/module.h>
++#include <linux/mutex.h>
+ #include <linux/of_device.h>
+ #include <linux/of_irq.h>
+ #include <linux/seq_file.h>
+@@ -149,10 +150,11 @@ struct rzg2l_pinctrl {
+ struct gpio_chip gpio_chip;
+ struct pinctrl_gpio_range gpio_range;
+ DECLARE_BITMAP(tint_slot, RZG2L_TINT_MAX_INTERRUPT);
+- spinlock_t bitmap_lock;
++ spinlock_t bitmap_lock; /* protect tint_slot bitmap */
+ unsigned int hwirq[RZG2L_TINT_MAX_INTERRUPT];
+
+- spinlock_t lock;
++ spinlock_t lock; /* lock read/write registers */
++ struct mutex mutex; /* serialize adding groups and functions */
+ };
+
+ static const unsigned int iolh_groupa_mA[] = { 2, 4, 8, 12 };
+@@ -362,11 +364,13 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ name = np->name;
+ }
+
++ mutex_lock(&pctrl->mutex);
++
+ /* Register a single pin group listing all the pins we read from DT */
+ gsel = pinctrl_generic_add_group(pctldev, name, pins, num_pinmux, NULL);
+ if (gsel < 0) {
+ ret = gsel;
+- goto done;
++ goto unlock;
+ }
+
+ /*
+@@ -380,6 +384,8 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ goto remove_group;
+ }
+
++ mutex_unlock(&pctrl->mutex);
++
+ maps[idx].type = PIN_MAP_TYPE_MUX_GROUP;
+ maps[idx].data.mux.group = name;
+ maps[idx].data.mux.function = name;
+@@ -391,6 +397,8 @@ static int rzg2l_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+
+ remove_group:
+ pinctrl_generic_remove_group(pctldev, gsel);
++unlock:
++ mutex_unlock(&pctrl->mutex);
+ done:
+ *index = idx;
+ kfree(configs);
+@@ -1509,6 +1517,7 @@ static int rzg2l_pinctrl_probe(struct platform_device *pdev)
+
+ spin_lock_init(&pctrl->lock);
+ spin_lock_init(&pctrl->bitmap_lock);
++ mutex_init(&pctrl->mutex);
+
+ platform_set_drvdata(pdev, pctrl);
+
+diff --git a/drivers/pinctrl/renesas/pinctrl-rzv2m.c b/drivers/pinctrl/renesas/pinctrl-rzv2m.c
+index 35b23c1a5684d..9146101ea9e2f 100644
+--- a/drivers/pinctrl/renesas/pinctrl-rzv2m.c
++++ b/drivers/pinctrl/renesas/pinctrl-rzv2m.c
+@@ -14,6 +14,7 @@
+ #include <linux/gpio/driver.h>
+ #include <linux/io.h>
+ #include <linux/module.h>
++#include <linux/mutex.h>
+ #include <linux/of_device.h>
+ #include <linux/spinlock.h>
+
+@@ -123,7 +124,8 @@ struct rzv2m_pinctrl {
+ struct gpio_chip gpio_chip;
+ struct pinctrl_gpio_range gpio_range;
+
+- spinlock_t lock;
++ spinlock_t lock; /* lock read/write registers */
++ struct mutex mutex; /* serialize adding groups and functions */
+ };
+
+ static const unsigned int drv_1_8V_group2_uA[] = { 1800, 3800, 7800, 11000 };
+@@ -322,11 +324,13 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ name = np->name;
+ }
+
++ mutex_lock(&pctrl->mutex);
++
+ /* Register a single pin group listing all the pins we read from DT */
+ gsel = pinctrl_generic_add_group(pctldev, name, pins, num_pinmux, NULL);
+ if (gsel < 0) {
+ ret = gsel;
+- goto done;
++ goto unlock;
+ }
+
+ /*
+@@ -340,6 +344,8 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+ goto remove_group;
+ }
+
++ mutex_unlock(&pctrl->mutex);
++
+ maps[idx].type = PIN_MAP_TYPE_MUX_GROUP;
+ maps[idx].data.mux.group = name;
+ maps[idx].data.mux.function = name;
+@@ -351,6 +357,8 @@ static int rzv2m_dt_subnode_to_map(struct pinctrl_dev *pctldev,
+
+ remove_group:
+ pinctrl_generic_remove_group(pctldev, gsel);
++unlock:
++ mutex_unlock(&pctrl->mutex);
+ done:
+ *index = idx;
+ kfree(configs);
+@@ -1071,6 +1079,7 @@ static int rzv2m_pinctrl_probe(struct platform_device *pdev)
+ }
+
+ spin_lock_init(&pctrl->lock);
++ mutex_init(&pctrl->mutex);
+
+ platform_set_drvdata(pdev, pctrl);
+
+diff --git a/drivers/platform/x86/ideapad-laptop.c b/drivers/platform/x86/ideapad-laptop.c
+index d2fee9a3e2390..6d9297c1d96c1 100644
+--- a/drivers/platform/x86/ideapad-laptop.c
++++ b/drivers/platform/x86/ideapad-laptop.c
+@@ -1049,6 +1049,11 @@ static const struct key_entry ideapad_keymap[] = {
+ { KE_IGNORE, 0x03 | IDEAPAD_WMI_KEY },
+ /* Customizable Lenovo Hotkey ("star" with 'S' inside) */
+ { KE_KEY, 0x01 | IDEAPAD_WMI_KEY, { KEY_FAVORITES } },
++ { KE_KEY, 0x04 | IDEAPAD_WMI_KEY, { KEY_SELECTIVE_SCREENSHOT } },
++ /* Lenovo Support */
++ { KE_KEY, 0x07 | IDEAPAD_WMI_KEY, { KEY_HELP } },
++ { KE_KEY, 0x0e | IDEAPAD_WMI_KEY, { KEY_PICKUP_PHONE } },
++ { KE_KEY, 0x0f | IDEAPAD_WMI_KEY, { KEY_HANGUP_PHONE } },
+ /* Dark mode toggle */
+ { KE_KEY, 0x13 | IDEAPAD_WMI_KEY, { KEY_PROG1 } },
+ /* Sound profile switch */
+diff --git a/drivers/platform/x86/lenovo-ymc.c b/drivers/platform/x86/lenovo-ymc.c
+index f360370d50027..e1fbc35504d49 100644
+--- a/drivers/platform/x86/lenovo-ymc.c
++++ b/drivers/platform/x86/lenovo-ymc.c
+@@ -36,6 +36,13 @@ static const struct dmi_system_id ec_trigger_quirk_dmi_table[] = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "82QF"),
+ },
+ },
++ {
++ /* Lenovo Yoga 7 14ACN6 */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "82N7"),
++ },
++ },
+ { }
+ };
+
+diff --git a/drivers/scsi/raid_class.c b/drivers/scsi/raid_class.c
+index 711252e52d8e1..95a86e0dfd77a 100644
+--- a/drivers/scsi/raid_class.c
++++ b/drivers/scsi/raid_class.c
+@@ -209,54 +209,6 @@ raid_attr_ro_state(level);
+ raid_attr_ro_fn(resync);
+ raid_attr_ro_state_fn(state);
+
+-static void raid_component_release(struct device *dev)
+-{
+- struct raid_component *rc =
+- container_of(dev, struct raid_component, dev);
+- dev_printk(KERN_ERR, rc->dev.parent, "COMPONENT RELEASE\n");
+- put_device(rc->dev.parent);
+- kfree(rc);
+-}
+-
+-int raid_component_add(struct raid_template *r,struct device *raid_dev,
+- struct device *component_dev)
+-{
+- struct device *cdev =
+- attribute_container_find_class_device(&r->raid_attrs.ac,
+- raid_dev);
+- struct raid_component *rc;
+- struct raid_data *rd = dev_get_drvdata(cdev);
+- int err;
+-
+- rc = kzalloc(sizeof(*rc), GFP_KERNEL);
+- if (!rc)
+- return -ENOMEM;
+-
+- INIT_LIST_HEAD(&rc->node);
+- device_initialize(&rc->dev);
+- rc->dev.release = raid_component_release;
+- rc->dev.parent = get_device(component_dev);
+- rc->num = rd->component_count++;
+-
+- dev_set_name(&rc->dev, "component-%d", rc->num);
+- list_add_tail(&rc->node, &rd->component_list);
+- rc->dev.class = &raid_class.class;
+- err = device_add(&rc->dev);
+- if (err)
+- goto err_out;
+-
+- return 0;
+-
+-err_out:
+- put_device(&rc->dev);
+- list_del(&rc->node);
+- rd->component_count--;
+- put_device(component_dev);
+- kfree(rc);
+- return err;
+-}
+-EXPORT_SYMBOL(raid_component_add);
+-
+ struct raid_template *
+ raid_class_attach(struct raid_function_template *ft)
+ {
+diff --git a/drivers/scsi/snic/snic_disc.c b/drivers/scsi/snic/snic_disc.c
+index cd27562ec922e..6c529b37f3b46 100644
+--- a/drivers/scsi/snic/snic_disc.c
++++ b/drivers/scsi/snic/snic_disc.c
+@@ -303,12 +303,11 @@ snic_tgt_create(struct snic *snic, struct snic_tgt_id *tgtid)
+ "Snic Tgt: device_add, with err = %d\n",
+ ret);
+
+- put_device(&tgt->dev);
+ put_device(&snic->shost->shost_gendev);
+ spin_lock_irqsave(snic->shost->host_lock, flags);
+ list_del(&tgt->list);
+ spin_unlock_irqrestore(snic->shost->host_lock, flags);
+- kfree(tgt);
++ put_device(&tgt->dev);
+ tgt = NULL;
+
+ return tgt;
+diff --git a/drivers/spi/spi-cadence.c b/drivers/spi/spi-cadence.c
+index 26e6633693196..f2825a2f1c178 100644
+--- a/drivers/spi/spi-cadence.c
++++ b/drivers/spi/spi-cadence.c
+@@ -316,12 +316,6 @@ static void cdns_spi_process_fifo(struct cdns_spi *xspi, int ntx, int nrx)
+ xspi->rx_bytes -= nrx;
+
+ while (ntx || nrx) {
+- /* When xspi in busy condition, bytes may send failed,
+- * then spi control did't work thoroughly, add one byte delay
+- */
+- if (cdns_spi_read(xspi, CDNS_SPI_ISR) & CDNS_SPI_IXR_TXFULL)
+- udelay(10);
+-
+ if (ntx) {
+ if (xspi->txbuf)
+ cdns_spi_write(xspi, CDNS_SPI_TXD, *xspi->txbuf++);
+@@ -391,6 +385,11 @@ static irqreturn_t cdns_spi_irq(int irq, void *dev_id)
+ if (xspi->tx_bytes) {
+ cdns_spi_process_fifo(xspi, trans_cnt, trans_cnt);
+ } else {
++ /* Fixed delay due to controller limitation with
++ * RX_NEMPTY incorrect status
++ * Xilinx AR:65885 contains more details
++ */
++ udelay(10);
+ cdns_spi_process_fifo(xspi, 0, trans_cnt);
+ cdns_spi_write(xspi, CDNS_SPI_IDR,
+ CDNS_SPI_IXR_DEFAULT);
+@@ -438,12 +437,18 @@ static int cdns_transfer_one(struct spi_controller *ctlr,
+ cdns_spi_setup_transfer(spi, transfer);
+ } else {
+ /* Set TX empty threshold to half of FIFO depth
+- * only if TX bytes are more than half FIFO depth.
++ * only if TX bytes are more than FIFO depth.
+ */
+ if (xspi->tx_bytes > xspi->tx_fifo_depth)
+ cdns_spi_write(xspi, CDNS_SPI_THLD, xspi->tx_fifo_depth >> 1);
+ }
+
++ /* When xspi in busy condition, bytes may send failed,
++ * then spi control didn't work thoroughly, add one byte delay
++ */
++ if (cdns_spi_read(xspi, CDNS_SPI_ISR) & CDNS_SPI_IXR_TXFULL)
++ udelay(10);
++
+ cdns_spi_process_fifo(xspi, xspi->tx_fifo_depth, 0);
+ spi_transfer_delay_exec(transfer);
+
+diff --git a/drivers/thunderbolt/tmu.c b/drivers/thunderbolt/tmu.c
+index 626aca3124b1c..d9544600b3867 100644
+--- a/drivers/thunderbolt/tmu.c
++++ b/drivers/thunderbolt/tmu.c
+@@ -415,7 +415,8 @@ int tb_switch_tmu_disable(struct tb_switch *sw)
+ * uni-directional mode and we don't want to change it's TMU
+ * mode.
+ */
+- tb_switch_tmu_rate_write(sw, TB_SWITCH_TMU_RATE_OFF);
++ ret = tb_switch_tmu_rate_write(sw, TB_SWITCH_TMU_RATE_OFF);
++ return ret;
+
+ tb_port_tmu_time_sync_disable(up);
+ ret = tb_port_tmu_time_sync_disable(down);
+diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig
+index 341abaed4ce2c..069de553127c4 100644
+--- a/drivers/tty/Kconfig
++++ b/drivers/tty/Kconfig
+@@ -164,6 +164,9 @@ config LEGACY_TIOCSTI
+ userspace depends on this functionality to continue operating
+ normally.
+
++ Processes which run with CAP_SYS_ADMIN, such as BRLTTY, can
++ use TIOCSTI even when this is set to N.
++
+ This functionality can be changed at runtime with the
+ dev.tty.legacy_tiocsti sysctl. This configuration option sets
+ the default value of the sysctl.
+diff --git a/drivers/ufs/host/ufs-qcom.c b/drivers/ufs/host/ufs-qcom.c
+index 82d02e7f3b4f3..6e6d0dc5ece56 100644
+--- a/drivers/ufs/host/ufs-qcom.c
++++ b/drivers/ufs/host/ufs-qcom.c
+@@ -225,7 +225,7 @@ static void ufs_qcom_select_unipro_mode(struct ufs_qcom_host *host)
+ ufs_qcom_cap_qunipro(host) ? QUNIPRO_SEL : 0,
+ REG_UFS_CFG1);
+
+- if (host->hw_ver.major == 0x05)
++ if (host->hw_ver.major >= 0x05)
+ ufshcd_rmwl(host->hba, QUNIPRO_G4_SEL, 0, REG_UFS_CFG0);
+
+ /* make sure above configuration is applied before we return */
+diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
+index c4e0da6db7195..9ec91017a7f3c 100644
+--- a/fs/jbd2/checkpoint.c
++++ b/fs/jbd2/checkpoint.c
+@@ -27,7 +27,7 @@
+ *
+ * Called with j_list_lock held.
+ */
+-static inline void __buffer_unlink_first(struct journal_head *jh)
++static inline void __buffer_unlink(struct journal_head *jh)
+ {
+ transaction_t *transaction = jh->b_cp_transaction;
+
+@@ -40,23 +40,6 @@ static inline void __buffer_unlink_first(struct journal_head *jh)
+ }
+ }
+
+-/*
+- * Unlink a buffer from a transaction checkpoint(io) list.
+- *
+- * Called with j_list_lock held.
+- */
+-static inline void __buffer_unlink(struct journal_head *jh)
+-{
+- transaction_t *transaction = jh->b_cp_transaction;
+-
+- __buffer_unlink_first(jh);
+- if (transaction->t_checkpoint_io_list == jh) {
+- transaction->t_checkpoint_io_list = jh->b_cpnext;
+- if (transaction->t_checkpoint_io_list == jh)
+- transaction->t_checkpoint_io_list = NULL;
+- }
+-}
+-
+ /*
+ * Check a checkpoint buffer could be release or not.
+ *
+@@ -366,50 +349,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
+
+ /* Checkpoint list management */
+
+-/*
+- * journal_clean_one_cp_list
+- *
+- * Find all the written-back checkpoint buffers in the given list and
+- * release them. If 'destroy' is set, clean all buffers unconditionally.
+- *
+- * Called with j_list_lock held.
+- * Returns 1 if we freed the transaction, 0 otherwise.
+- */
+-static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy)
+-{
+- struct journal_head *last_jh;
+- struct journal_head *next_jh = jh;
+-
+- if (!jh)
+- return 0;
+-
+- last_jh = jh->b_cpprev;
+- do {
+- jh = next_jh;
+- next_jh = jh->b_cpnext;
+-
+- if (!destroy && __cp_buffer_busy(jh))
+- return 0;
+-
+- if (__jbd2_journal_remove_checkpoint(jh))
+- return 1;
+- /*
+- * This function only frees up some memory
+- * if possible so we dont have an obligation
+- * to finish processing. Bail out if preemption
+- * requested:
+- */
+- if (need_resched())
+- return 0;
+- } while (jh != last_jh);
+-
+- return 0;
+-}
+-
+ /*
+ * journal_shrink_one_cp_list
+ *
+- * Find 'nr_to_scan' written-back checkpoint buffers in the given list
++ * Find all the written-back checkpoint buffers in the given list
+ * and try to release them. If the whole transaction is released, set
+ * the 'released' parameter. Return the number of released checkpointed
+ * buffers.
+@@ -417,15 +360,15 @@ static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy)
+ * Called with j_list_lock held.
+ */
+ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh,
+- unsigned long *nr_to_scan,
+- bool *released)
++ bool destroy, bool *released)
+ {
+ struct journal_head *last_jh;
+ struct journal_head *next_jh = jh;
+ unsigned long nr_freed = 0;
+ int ret;
+
+- if (!jh || *nr_to_scan == 0)
++ *released = false;
++ if (!jh)
+ return 0;
+
+ last_jh = jh->b_cpprev;
+@@ -433,12 +376,15 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh,
+ jh = next_jh;
+ next_jh = jh->b_cpnext;
+
+- (*nr_to_scan)--;
+- if (__cp_buffer_busy(jh))
+- continue;
++ if (destroy) {
++ ret = __jbd2_journal_remove_checkpoint(jh);
++ } else {
++ ret = jbd2_journal_try_remove_checkpoint(jh);
++ if (ret < 0)
++ continue;
++ }
+
+ nr_freed++;
+- ret = __jbd2_journal_remove_checkpoint(jh);
+ if (ret) {
+ *released = true;
+ break;
+@@ -446,7 +392,7 @@ static unsigned long journal_shrink_one_cp_list(struct journal_head *jh,
+
+ if (need_resched())
+ break;
+- } while (jh != last_jh && *nr_to_scan);
++ } while (jh != last_jh);
+
+ return nr_freed;
+ }
+@@ -464,11 +410,11 @@ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal,
+ unsigned long *nr_to_scan)
+ {
+ transaction_t *transaction, *last_transaction, *next_transaction;
+- bool released;
++ bool __maybe_unused released;
+ tid_t first_tid = 0, last_tid = 0, next_tid = 0;
+ tid_t tid = 0;
+ unsigned long nr_freed = 0;
+- unsigned long nr_scanned = *nr_to_scan;
++ unsigned long freed;
+
+ again:
+ spin_lock(&journal->j_list_lock);
+@@ -497,19 +443,11 @@ again:
+ transaction = next_transaction;
+ next_transaction = transaction->t_cpnext;
+ tid = transaction->t_tid;
+- released = false;
+
+- nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_list,
+- nr_to_scan, &released);
+- if (*nr_to_scan == 0)
+- break;
+- if (need_resched() || spin_needbreak(&journal->j_list_lock))
+- break;
+- if (released)
+- continue;
+-
+- nr_freed += journal_shrink_one_cp_list(transaction->t_checkpoint_io_list,
+- nr_to_scan, &released);
++ freed = journal_shrink_one_cp_list(transaction->t_checkpoint_list,
++ false, &released);
++ nr_freed += freed;
++ (*nr_to_scan) -= min(*nr_to_scan, freed);
+ if (*nr_to_scan == 0)
+ break;
+ if (need_resched() || spin_needbreak(&journal->j_list_lock))
+@@ -530,9 +468,8 @@ again:
+ if (*nr_to_scan && next_tid)
+ goto again;
+ out:
+- nr_scanned -= *nr_to_scan;
+ trace_jbd2_shrink_checkpoint_list(journal, first_tid, tid, last_tid,
+- nr_freed, nr_scanned, next_tid);
++ nr_freed, next_tid);
+
+ return nr_freed;
+ }
+@@ -548,7 +485,7 @@ out:
+ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy)
+ {
+ transaction_t *transaction, *last_transaction, *next_transaction;
+- int ret;
++ bool released;
+
+ transaction = journal->j_checkpoint_transactions;
+ if (!transaction)
+@@ -559,8 +496,8 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy)
+ do {
+ transaction = next_transaction;
+ next_transaction = transaction->t_cpnext;
+- ret = journal_clean_one_cp_list(transaction->t_checkpoint_list,
+- destroy);
++ journal_shrink_one_cp_list(transaction->t_checkpoint_list,
++ destroy, &released);
+ /*
+ * This function only frees up some memory if possible so we
+ * dont have an obligation to finish processing. Bail out if
+@@ -568,23 +505,12 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy)
+ */
+ if (need_resched())
+ return;
+- if (ret)
+- continue;
+- /*
+- * It is essential that we are as careful as in the case of
+- * t_checkpoint_list with removing the buffer from the list as
+- * we can possibly see not yet submitted buffers on io_list
+- */
+- ret = journal_clean_one_cp_list(transaction->
+- t_checkpoint_io_list, destroy);
+- if (need_resched())
+- return;
+ /*
+ * Stop scanning if we couldn't free the transaction. This
+ * avoids pointless scanning of transactions which still
+ * weren't checkpointed.
+ */
+- if (!ret)
++ if (!released)
+ return;
+ } while (transaction != last_transaction);
+ }
+@@ -663,7 +589,7 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh)
+ jbd2_journal_put_journal_head(jh);
+
+ /* Is this transaction empty? */
+- if (transaction->t_checkpoint_list || transaction->t_checkpoint_io_list)
++ if (transaction->t_checkpoint_list)
+ return 0;
+
+ /*
+@@ -694,6 +620,34 @@ int __jbd2_journal_remove_checkpoint(struct journal_head *jh)
+ return 1;
+ }
+
++/*
++ * Check the checkpoint buffer and try to remove it from the checkpoint
++ * list if it's clean. Returns -EBUSY if it is not clean, returns 1 if
++ * it frees the transaction, 0 otherwise.
++ *
++ * This function is called with j_list_lock held.
++ */
++int jbd2_journal_try_remove_checkpoint(struct journal_head *jh)
++{
++ struct buffer_head *bh = jh2bh(jh);
++
++ if (!trylock_buffer(bh))
++ return -EBUSY;
++ if (buffer_dirty(bh)) {
++ unlock_buffer(bh);
++ return -EBUSY;
++ }
++ unlock_buffer(bh);
++
++ /*
++ * Buffer is clean and the IO has finished (we held the buffer
++ * lock) so the checkpoint is done. We can safely remove the
++ * buffer from this transaction.
++ */
++ JBUFFER_TRACE(jh, "remove from checkpoint list");
++ return __jbd2_journal_remove_checkpoint(jh);
++}
++
+ /*
+ * journal_insert_checkpoint: put a committed buffer onto a checkpoint
+ * list so that we know when it is safe to clean the transaction out of
+@@ -755,7 +709,6 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
+ J_ASSERT(transaction->t_forget == NULL);
+ J_ASSERT(transaction->t_shadow_list == NULL);
+ J_ASSERT(transaction->t_checkpoint_list == NULL);
+- J_ASSERT(transaction->t_checkpoint_io_list == NULL);
+ J_ASSERT(atomic_read(&transaction->t_updates) == 0);
+ J_ASSERT(journal->j_committing_transaction != transaction);
+ J_ASSERT(journal->j_running_transaction != transaction);
+diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
+index b33155dd70017..1073259902a60 100644
+--- a/fs/jbd2/commit.c
++++ b/fs/jbd2/commit.c
+@@ -1141,8 +1141,7 @@ restart_loop:
+ spin_lock(&journal->j_list_lock);
+ commit_transaction->t_state = T_FINISHED;
+ /* Check if the transaction can be dropped now that we are finished */
+- if (commit_transaction->t_checkpoint_list == NULL &&
+- commit_transaction->t_checkpoint_io_list == NULL) {
++ if (commit_transaction->t_checkpoint_list == NULL) {
+ __jbd2_journal_drop_transaction(journal, commit_transaction);
+ jbd2_journal_free_transaction(commit_transaction);
+ }
+diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
+index 18611241f4513..6ef5022949c46 100644
+--- a/fs/jbd2/transaction.c
++++ b/fs/jbd2/transaction.c
+@@ -1784,8 +1784,7 @@ int jbd2_journal_forget(handle_t *handle, struct buffer_head *bh)
+ * Otherwise, if the buffer has been written to disk,
+ * it is safe to remove the checkpoint and drop it.
+ */
+- if (!buffer_dirty(bh)) {
+- __jbd2_journal_remove_checkpoint(jh);
++ if (jbd2_journal_try_remove_checkpoint(jh) >= 0) {
+ spin_unlock(&journal->j_list_lock);
+ goto drop;
+ }
+@@ -2112,20 +2111,14 @@ __journal_try_to_free_buffer(journal_t *journal, struct buffer_head *bh)
+
+ jh = bh2jh(bh);
+
+- if (buffer_locked(bh) || buffer_dirty(bh))
+- goto out;
+-
+ if (jh->b_next_transaction != NULL || jh->b_transaction != NULL)
+- goto out;
++ return;
+
+ spin_lock(&journal->j_list_lock);
+- if (jh->b_cp_transaction != NULL) {
+- /* written-back checkpointed metadata buffer */
+- JBUFFER_TRACE(jh, "remove from checkpoint list");
+- __jbd2_journal_remove_checkpoint(jh);
+- }
++ /* Remove written-back checkpointed metadata buffer */
++ if (jh->b_cp_transaction != NULL)
++ jbd2_journal_try_remove_checkpoint(jh);
+ spin_unlock(&journal->j_list_lock);
+-out:
+ return;
+ }
+
+diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
+index 9a18c5a69ace6..aaffaaa336cc5 100644
+--- a/fs/nfs/direct.c
++++ b/fs/nfs/direct.c
+@@ -472,20 +472,26 @@ out:
+ return result;
+ }
+
+-static void
+-nfs_direct_join_group(struct list_head *list, struct inode *inode)
++static void nfs_direct_join_group(struct list_head *list, struct inode *inode)
+ {
+- struct nfs_page *req, *next;
++ struct nfs_page *req, *subreq;
+
+ list_for_each_entry(req, list, wb_list) {
+- if (req->wb_head != req || req->wb_this_page == req)
++ if (req->wb_head != req)
+ continue;
+- for (next = req->wb_this_page;
+- next != req->wb_head;
+- next = next->wb_this_page) {
+- nfs_list_remove_request(next);
+- nfs_release_request(next);
+- }
++ subreq = req->wb_this_page;
++ if (subreq == req)
++ continue;
++ do {
++ /*
++ * Remove subrequests from this list before freeing
++ * them in the call to nfs_join_page_group().
++ */
++ if (!list_empty(&subreq->wb_list)) {
++ nfs_list_remove_request(subreq);
++ nfs_release_request(subreq);
++ }
++ } while ((subreq = subreq->wb_this_page) != req);
+ nfs_join_page_group(req, inode);
+ }
+ }
+diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
+index 93e306bf4430f..5d7e0511f3513 100644
+--- a/fs/nfs/nfs42proc.c
++++ b/fs/nfs/nfs42proc.c
+@@ -1360,7 +1360,6 @@ ssize_t nfs42_proc_getxattr(struct inode *inode, const char *name,
+ for (i = 0; i < np; i++) {
+ pages[i] = alloc_page(GFP_KERNEL);
+ if (!pages[i]) {
+- np = i + 1;
+ err = -ENOMEM;
+ goto out;
+ }
+@@ -1384,8 +1383,8 @@ ssize_t nfs42_proc_getxattr(struct inode *inode, const char *name,
+ } while (exception.retry);
+
+ out:
+- while (--np >= 0)
+- __free_page(pages[np]);
++ while (--i >= 0)
++ __free_page(pages[i]);
+ kfree(pages);
+
+ return err;
+diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
+index 9faba2dac11dd..fd752e0c4ec24 100644
+--- a/fs/nfs/nfs4proc.c
++++ b/fs/nfs/nfs4proc.c
+@@ -6004,9 +6004,8 @@ static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf,
+ out_ok:
+ ret = res.acl_len;
+ out_free:
+- for (i = 0; i < npages; i++)
+- if (pages[i])
+- __free_page(pages[i]);
++ while (--i >= 0)
++ __free_page(pages[i]);
+ if (res.acl_scratch)
+ __free_page(res.acl_scratch);
+ kfree(pages);
+@@ -7181,8 +7180,15 @@ static void nfs4_lock_done(struct rpc_task *task, void *calldata)
+ } else if (!nfs4_update_lock_stateid(lsp, &data->res.stateid))
+ goto out_restart;
+ break;
+- case -NFS4ERR_BAD_STATEID:
+ case -NFS4ERR_OLD_STATEID:
++ if (data->arg.new_lock_owner != 0 &&
++ nfs4_refresh_open_old_stateid(&data->arg.open_stateid,
++ lsp->ls_state))
++ goto out_restart;
++ if (nfs4_refresh_lock_old_stateid(&data->arg.lock_stateid, lsp))
++ goto out_restart;
++ fallthrough;
++ case -NFS4ERR_BAD_STATEID:
+ case -NFS4ERR_STALE_STATEID:
+ case -NFS4ERR_EXPIRED:
+ if (data->arg.new_lock_owner != 0) {
+diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
+index 3aefbad4cc099..daf305daa7516 100644
+--- a/fs/nfsd/nfs4state.c
++++ b/fs/nfsd/nfs4state.c
+@@ -1354,9 +1354,9 @@ static void revoke_delegation(struct nfs4_delegation *dp)
+ trace_nfsd_stid_revoke(&dp->dl_stid);
+
+ if (clp->cl_minorversion) {
++ spin_lock(&clp->cl_lock);
+ dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
+ refcount_inc(&dp->dl_stid.sc_count);
+- spin_lock(&clp->cl_lock);
+ list_add(&dp->dl_recall_lru, &clp->cl_revoked);
+ spin_unlock(&clp->cl_lock);
+ }
+diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
+index 581691e4be491..7ec16879756e8 100644
+--- a/fs/nilfs2/segment.c
++++ b/fs/nilfs2/segment.c
+@@ -725,6 +725,11 @@ static size_t nilfs_lookup_dirty_data_buffers(struct inode *inode,
+ struct folio *folio = fbatch.folios[i];
+
+ folio_lock(folio);
++ if (unlikely(folio->mapping != mapping)) {
++ /* Exclude folios removed from the address space */
++ folio_unlock(folio);
++ continue;
++ }
+ head = folio_buffers(folio);
+ if (!head) {
+ create_empty_buffers(&folio->page, i_blocksize(inode), 0);
+diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
+index 420510f6a545e..f05850d6bef7f 100644
+--- a/fs/proc/task_mmu.c
++++ b/fs/proc/task_mmu.c
+@@ -759,12 +759,14 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
+ static const struct mm_walk_ops smaps_walk_ops = {
+ .pmd_entry = smaps_pte_range,
+ .hugetlb_entry = smaps_hugetlb_range,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static const struct mm_walk_ops smaps_shmem_walk_ops = {
+ .pmd_entry = smaps_pte_range,
+ .hugetlb_entry = smaps_hugetlb_range,
+ .pte_hole = smaps_pte_hole,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ /*
+@@ -1245,6 +1247,7 @@ static int clear_refs_test_walk(unsigned long start, unsigned long end,
+ static const struct mm_walk_ops clear_refs_walk_ops = {
+ .pmd_entry = clear_refs_pte_range,
+ .test_walk = clear_refs_test_walk,
++ .walk_lock = PGWALK_WRLOCK,
+ };
+
+ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
+@@ -1621,6 +1624,7 @@ static const struct mm_walk_ops pagemap_ops = {
+ .pmd_entry = pagemap_pmd_range,
+ .pte_hole = pagemap_pte_hole,
+ .hugetlb_entry = pagemap_hugetlb_range,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ /*
+@@ -1932,6 +1936,7 @@ static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask,
+ static const struct mm_walk_ops show_numa_ops = {
+ .hugetlb_entry = gather_hugetlb_stats,
+ .pmd_entry = gather_pte_stats,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ /*
+diff --git a/include/drm/display/drm_dp.h b/include/drm/display/drm_dp.h
+index f8813c1e059be..40aa08741ef2f 100644
+--- a/include/drm/display/drm_dp.h
++++ b/include/drm/display/drm_dp.h
+@@ -1534,7 +1534,7 @@ enum drm_dp_phy {
+
+ #define DP_BRANCH_OUI_HEADER_SIZE 0xc
+ #define DP_RECEIVER_CAP_SIZE 0xf
+-#define DP_DSC_RECEIVER_CAP_SIZE 0xf
++#define DP_DSC_RECEIVER_CAP_SIZE 0x10 /* DSC Capabilities 0x60 through 0x6F */
+ #define EDP_PSR_RECEIVER_CAP_SIZE 2
+ #define EDP_DISPLAY_CTL_CAP_SIZE 3
+ #define DP_LTTPR_COMMON_CAP_SIZE 8
+diff --git a/include/drm/drm_probe_helper.h b/include/drm/drm_probe_helper.h
+index 4977e0ab72dbb..fad3c4003b2b5 100644
+--- a/include/drm/drm_probe_helper.h
++++ b/include/drm/drm_probe_helper.h
+@@ -25,6 +25,7 @@ void drm_kms_helper_connector_hotplug_event(struct drm_connector *connector);
+
+ void drm_kms_helper_poll_disable(struct drm_device *dev);
+ void drm_kms_helper_poll_enable(struct drm_device *dev);
++void drm_kms_helper_poll_reschedule(struct drm_device *dev);
+ bool drm_kms_helper_is_poll_worker(void);
+
+ enum drm_mode_status drm_crtc_helper_mode_valid_fixed(struct drm_crtc *crtc,
+diff --git a/include/linux/clk.h b/include/linux/clk.h
+index 1ef0133242374..06f1b292f8a00 100644
+--- a/include/linux/clk.h
++++ b/include/linux/clk.h
+@@ -183,6 +183,39 @@ int clk_get_scaled_duty_cycle(struct clk *clk, unsigned int scale);
+ */
+ bool clk_is_match(const struct clk *p, const struct clk *q);
+
++/**
++ * clk_rate_exclusive_get - get exclusivity over the rate control of a
++ * producer
++ * @clk: clock source
++ *
++ * This function allows drivers to get exclusive control over the rate of a
++ * provider. It prevents any other consumer to execute, even indirectly,
++ * opereation which could alter the rate of the provider or cause glitches
++ *
++ * If exlusivity is claimed more than once on clock, even by the same driver,
++ * the rate effectively gets locked as exclusivity can't be preempted.
++ *
++ * Must not be called from within atomic context.
++ *
++ * Returns success (0) or negative errno.
++ */
++int clk_rate_exclusive_get(struct clk *clk);
++
++/**
++ * clk_rate_exclusive_put - release exclusivity over the rate control of a
++ * producer
++ * @clk: clock source
++ *
++ * This function allows drivers to release the exclusivity it previously got
++ * from clk_rate_exclusive_get()
++ *
++ * The caller must balance the number of clk_rate_exclusive_get() and
++ * clk_rate_exclusive_put() calls.
++ *
++ * Must not be called from within atomic context.
++ */
++void clk_rate_exclusive_put(struct clk *clk);
++
+ #else
+
+ static inline int clk_notifier_register(struct clk *clk,
+@@ -236,6 +269,13 @@ static inline bool clk_is_match(const struct clk *p, const struct clk *q)
+ return p == q;
+ }
+
++static inline int clk_rate_exclusive_get(struct clk *clk)
++{
++ return 0;
++}
++
++static inline void clk_rate_exclusive_put(struct clk *clk) {}
++
+ #endif
+
+ #ifdef CONFIG_HAVE_CLK_PREPARE
+@@ -583,38 +623,6 @@ struct clk *devm_clk_get_optional_enabled(struct device *dev, const char *id);
+ */
+ struct clk *devm_get_clk_from_child(struct device *dev,
+ struct device_node *np, const char *con_id);
+-/**
+- * clk_rate_exclusive_get - get exclusivity over the rate control of a
+- * producer
+- * @clk: clock source
+- *
+- * This function allows drivers to get exclusive control over the rate of a
+- * provider. It prevents any other consumer to execute, even indirectly,
+- * opereation which could alter the rate of the provider or cause glitches
+- *
+- * If exlusivity is claimed more than once on clock, even by the same driver,
+- * the rate effectively gets locked as exclusivity can't be preempted.
+- *
+- * Must not be called from within atomic context.
+- *
+- * Returns success (0) or negative errno.
+- */
+-int clk_rate_exclusive_get(struct clk *clk);
+-
+-/**
+- * clk_rate_exclusive_put - release exclusivity over the rate control of a
+- * producer
+- * @clk: clock source
+- *
+- * This function allows drivers to release the exclusivity it previously got
+- * from clk_rate_exclusive_get()
+- *
+- * The caller must balance the number of clk_rate_exclusive_get() and
+- * clk_rate_exclusive_put() calls.
+- *
+- * Must not be called from within atomic context.
+- */
+-void clk_rate_exclusive_put(struct clk *clk);
+
+ /**
+ * clk_enable - inform the system when the clock source should be running.
+@@ -974,14 +982,6 @@ static inline void clk_bulk_put_all(int num_clks, struct clk_bulk_data *clks) {}
+
+ static inline void devm_clk_put(struct device *dev, struct clk *clk) {}
+
+-
+-static inline int clk_rate_exclusive_get(struct clk *clk)
+-{
+- return 0;
+-}
+-
+-static inline void clk_rate_exclusive_put(struct clk *clk) {}
+-
+ static inline int clk_enable(struct clk *clk)
+ {
+ return 0;
+diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
+index 980b76a1237ea..d629094fac6e6 100644
+--- a/include/linux/cpuset.h
++++ b/include/linux/cpuset.h
+@@ -71,8 +71,10 @@ extern void cpuset_init_smp(void);
+ extern void cpuset_force_rebuild(void);
+ extern void cpuset_update_active_cpus(void);
+ extern void cpuset_wait_for_hotplug(void);
+-extern void cpuset_read_lock(void);
+-extern void cpuset_read_unlock(void);
++extern void inc_dl_tasks_cs(struct task_struct *task);
++extern void dec_dl_tasks_cs(struct task_struct *task);
++extern void cpuset_lock(void);
++extern void cpuset_unlock(void);
+ extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
+ extern bool cpuset_cpus_allowed_fallback(struct task_struct *p);
+ extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
+@@ -189,8 +191,10 @@ static inline void cpuset_update_active_cpus(void)
+
+ static inline void cpuset_wait_for_hotplug(void) { }
+
+-static inline void cpuset_read_lock(void) { }
+-static inline void cpuset_read_unlock(void) { }
++static inline void inc_dl_tasks_cs(struct task_struct *task) { }
++static inline void dec_dl_tasks_cs(struct task_struct *task) { }
++static inline void cpuset_lock(void) { }
++static inline void cpuset_unlock(void) { }
+
+ static inline void cpuset_cpus_allowed(struct task_struct *p,
+ struct cpumask *mask)
+diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
+index f619bae1dcc5d..c212da35a052c 100644
+--- a/include/linux/jbd2.h
++++ b/include/linux/jbd2.h
+@@ -622,12 +622,6 @@ struct transaction_s
+ */
+ struct journal_head *t_checkpoint_list;
+
+- /*
+- * Doubly-linked circular list of all buffers submitted for IO while
+- * checkpointing. [j_list_lock]
+- */
+- struct journal_head *t_checkpoint_io_list;
+-
+ /*
+ * Doubly-linked circular list of metadata buffers being
+ * shadowed by log IO. The IO buffers on the iobuf list and
+@@ -1449,6 +1443,7 @@ extern void jbd2_journal_commit_transaction(journal_t *);
+ void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy);
+ unsigned long jbd2_journal_shrink_checkpoint_list(journal_t *journal, unsigned long *nr_to_scan);
+ int __jbd2_journal_remove_checkpoint(struct journal_head *);
++int jbd2_journal_try_remove_checkpoint(struct journal_head *jh);
+ void jbd2_journal_destroy_checkpoint(journal_t *journal);
+ void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *);
+
+diff --git a/include/linux/mm.h b/include/linux/mm.h
+index d1fd7c544dcd8..6585547c5c067 100644
+--- a/include/linux/mm.h
++++ b/include/linux/mm.h
+@@ -3381,15 +3381,24 @@ static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
+ * Indicates whether GUP can follow a PROT_NONE mapped page, or whether
+ * a (NUMA hinting) fault is required.
+ */
+-static inline bool gup_can_follow_protnone(unsigned int flags)
++static inline bool gup_can_follow_protnone(struct vm_area_struct *vma,
++ unsigned int flags)
+ {
+ /*
+- * FOLL_FORCE has to be able to make progress even if the VMA is
+- * inaccessible. Further, FOLL_FORCE access usually does not represent
+- * application behaviour and we should avoid triggering NUMA hinting
+- * faults.
++ * If callers don't want to honor NUMA hinting faults, no need to
++ * determine if we would actually have to trigger a NUMA hinting fault.
+ */
+- return flags & FOLL_FORCE;
++ if (!(flags & FOLL_HONOR_NUMA_FAULT))
++ return true;
++
++ /*
++ * NUMA hinting faults don't apply in inaccessible (PROT_NONE) VMAs.
++ *
++ * Requiring a fault here even for inaccessible VMAs would mean that
++ * FOLL_FORCE cannot make any progress, because handle_mm_fault()
++ * refuses to process NUMA hinting faults in inaccessible VMAs.
++ */
++ return !vma_is_accessible(vma);
+ }
+
+ typedef int (*pte_fn_t)(pte_t *pte, unsigned long addr, void *data);
+diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
+index 5e74ce4a28cd6..7d30dc4ff0ff1 100644
+--- a/include/linux/mm_types.h
++++ b/include/linux/mm_types.h
+@@ -1286,6 +1286,15 @@ enum {
+ FOLL_PCI_P2PDMA = 1 << 10,
+ /* allow interrupts from generic signals */
+ FOLL_INTERRUPTIBLE = 1 << 11,
++ /*
++ * Always honor (trigger) NUMA hinting faults.
++ *
++ * FOLL_WRITE implicitly honors NUMA hinting faults because a
++ * PROT_NONE-mapped page is not writable (exceptions with FOLL_FORCE
++ * apply). get_user_pages_fast_only() always implicitly honors NUMA
++ * hinting faults.
++ */
++ FOLL_HONOR_NUMA_FAULT = 1 << 12,
+
+ /* See also internal only FOLL flags in mm/internal.h */
+ };
+diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h
+index 27a6df448ee56..27cd1e59ccf77 100644
+--- a/include/linux/pagewalk.h
++++ b/include/linux/pagewalk.h
+@@ -6,6 +6,16 @@
+
+ struct mm_walk;
+
++/* Locking requirement during a page walk. */
++enum page_walk_lock {
++ /* mmap_lock should be locked for read to stabilize the vma tree */
++ PGWALK_RDLOCK = 0,
++ /* vma will be write-locked during the walk */
++ PGWALK_WRLOCK = 1,
++ /* vma is expected to be already write-locked during the walk */
++ PGWALK_WRLOCK_VERIFY = 2,
++};
++
+ /**
+ * struct mm_walk_ops - callbacks for walk_page_range
+ * @pgd_entry: if set, called for each non-empty PGD (top-level) entry
+@@ -66,6 +76,7 @@ struct mm_walk_ops {
+ int (*pre_vma)(unsigned long start, unsigned long end,
+ struct mm_walk *walk);
+ void (*post_vma)(struct mm_walk *walk);
++ enum page_walk_lock walk_lock;
+ };
+
+ /*
+diff --git a/include/linux/raid_class.h b/include/linux/raid_class.h
+index 6a9b177d5c414..e50416ba9cd93 100644
+--- a/include/linux/raid_class.h
++++ b/include/linux/raid_class.h
+@@ -77,7 +77,3 @@ DEFINE_RAID_ATTRIBUTE(enum raid_state, state)
+
+ struct raid_template *raid_class_attach(struct raid_function_template *);
+ void raid_class_release(struct raid_template *);
+-
+-int __must_check raid_component_add(struct raid_template *, struct device *,
+- struct device *);
+-
+diff --git a/include/linux/sched.h b/include/linux/sched.h
+index eed5d65b8d1f4..2553918f0b619 100644
+--- a/include/linux/sched.h
++++ b/include/linux/sched.h
+@@ -1852,7 +1852,9 @@ current_restore_flags(unsigned long orig_flags, unsigned long flags)
+ }
+
+ extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
+-extern int task_can_attach(struct task_struct *p, const struct cpumask *cs_effective_cpus);
++extern int task_can_attach(struct task_struct *p);
++extern int dl_bw_alloc(int cpu, u64 dl_bw);
++extern void dl_bw_free(int cpu, u64 dl_bw);
+ #ifdef CONFIG_SMP
+ extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
+ extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
+diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
+index 7c4a0b72334eb..c55fc453e33b5 100644
+--- a/include/linux/trace_events.h
++++ b/include/linux/trace_events.h
+@@ -59,6 +59,17 @@ int trace_raw_output_prep(struct trace_iterator *iter,
+ extern __printf(2, 3)
+ void trace_event_printf(struct trace_iterator *iter, const char *fmt, ...);
+
++/* Used to find the offset and length of dynamic fields in trace events */
++struct trace_dynamic_info {
++#ifdef CONFIG_CPU_BIG_ENDIAN
++ u16 offset;
++ u16 len;
++#else
++ u16 len;
++ u16 offset;
++#endif
++};
++
+ /*
+ * The trace entry - the most basic unit of tracing. This is what
+ * is printed in the end as a single line in the trace output, such as:
+diff --git a/include/net/bonding.h b/include/net/bonding.h
+index 59955ac331578..6e4e406d8cd20 100644
+--- a/include/net/bonding.h
++++ b/include/net/bonding.h
+@@ -724,23 +724,14 @@ static inline struct slave *bond_slave_has_mac(struct bonding *bond,
+ }
+
+ /* Caller must hold rcu_read_lock() for read */
+-static inline bool bond_slave_has_mac_rx(struct bonding *bond, const u8 *mac)
++static inline bool bond_slave_has_mac_rcu(struct bonding *bond, const u8 *mac)
+ {
+ struct list_head *iter;
+ struct slave *tmp;
+- struct netdev_hw_addr *ha;
+
+ bond_for_each_slave_rcu(bond, tmp, iter)
+ if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr))
+ return true;
+-
+- if (netdev_uc_empty(bond->dev))
+- return false;
+-
+- netdev_for_each_uc_addr(ha, bond->dev)
+- if (ether_addr_equal_64bits(mac, ha->addr))
+- return true;
+-
+ return false;
+ }
+
+diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
+index 0bb32bfc61832..491ceb7ebe5d1 100644
+--- a/include/net/inet_sock.h
++++ b/include/net/inet_sock.h
+@@ -222,8 +222,8 @@ struct inet_sock {
+ __s16 uc_ttl;
+ __u16 cmsg_flags;
+ struct ip_options_rcu __rcu *inet_opt;
++ atomic_t inet_id;
+ __be16 inet_sport;
+- __u16 inet_id;
+
+ __u8 tos;
+ __u8 min_ttl;
+diff --git a/include/net/ip.h b/include/net/ip.h
+index 530e7257e4389..1872f570abeda 100644
+--- a/include/net/ip.h
++++ b/include/net/ip.h
+@@ -532,8 +532,19 @@ static inline void ip_select_ident_segs(struct net *net, struct sk_buff *skb,
+ * generator as much as we can.
+ */
+ if (sk && inet_sk(sk)->inet_daddr) {
+- iph->id = htons(inet_sk(sk)->inet_id);
+- inet_sk(sk)->inet_id += segs;
++ int val;
++
++ /* avoid atomic operations for TCP,
++ * as we hold socket lock at this point.
++ */
++ if (sk_is_tcp(sk)) {
++ sock_owned_by_me(sk);
++ val = atomic_read(&inet_sk(sk)->inet_id);
++ atomic_set(&inet_sk(sk)->inet_id, val + segs);
++ } else {
++ val = atomic_add_return(segs, &inet_sk(sk)->inet_id);
++ }
++ iph->id = htons(val);
+ return;
+ }
+ if ((iph->frag_off & htons(IP_DF)) && !skb->ignore_df) {
+diff --git a/include/net/mac80211.h b/include/net/mac80211.h
+index 65510cfda37af..67d81f7186660 100644
+--- a/include/net/mac80211.h
++++ b/include/net/mac80211.h
+@@ -6578,6 +6578,7 @@ void ieee80211_stop_rx_ba_session(struct ieee80211_vif *vif, u16 ba_rx_bitmap,
+ * marks frames marked in the bitmap as having been filtered. Afterwards, it
+ * checks if any frames in the window starting from @ssn can now be released
+ * (in case they were only waiting for frames that were filtered.)
++ * (Only work correctly if @max_rx_aggregation_subframes <= 64 frames)
+ */
+ void ieee80211_mark_rx_ba_filtered_frames(struct ieee80211_sta *pubsta, u8 tid,
+ u16 ssn, u64 filtered,
+diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
+index ad97049e28881..9f7bf417b9481 100644
+--- a/include/net/netfilter/nf_tables.h
++++ b/include/net/netfilter/nf_tables.h
+@@ -534,6 +534,7 @@ struct nft_set_elem_expr {
+ * @expr: stateful expression
+ * @ops: set ops
+ * @flags: set flags
++ * @dead: set will be freed, never cleared
+ * @genmask: generation mask
+ * @klen: key length
+ * @dlen: data length
+@@ -586,6 +587,11 @@ static inline void *nft_set_priv(const struct nft_set *set)
+ return (void *)set->data;
+ }
+
++static inline bool nft_set_gc_is_pending(const struct nft_set *s)
++{
++ return refcount_read(&s->refs) != 1;
++}
++
+ static inline struct nft_set *nft_set_container_of(const void *priv)
+ {
+ return (void *)priv - offsetof(struct nft_set, data);
+@@ -1817,6 +1823,7 @@ struct nftables_pernet {
+ u64 table_handle;
+ unsigned int base_seq;
+ unsigned int gc_seq;
++ u8 validate_state;
+ };
+
+ extern unsigned int nf_tables_net_id;
+diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
+index d9076a7a430c2..6506221c5fe31 100644
+--- a/include/net/rtnetlink.h
++++ b/include/net/rtnetlink.h
+@@ -190,8 +190,8 @@ int rtnl_delete_link(struct net_device *dev, u32 portid, const struct nlmsghdr *
+ int rtnl_configure_link(struct net_device *dev, const struct ifinfomsg *ifm,
+ u32 portid, const struct nlmsghdr *nlh);
+
+-int rtnl_nla_parse_ifla(struct nlattr **tb, const struct nlattr *head, int len,
+- struct netlink_ext_ack *exterr);
++int rtnl_nla_parse_ifinfomsg(struct nlattr **tb, const struct nlattr *nla_peer,
++ struct netlink_ext_ack *exterr);
+ struct net *rtnl_get_net_ns_capable(struct sock *sk, int netnsid);
+
+ #define MODULE_ALIAS_RTNL_LINK(kind) MODULE_ALIAS("rtnl-link-" kind)
+diff --git a/include/net/sock.h b/include/net/sock.h
+index 415f3840a26aa..d0d796d51a504 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -1324,6 +1324,7 @@ struct proto {
+ /*
+ * Pressure flag: try to collapse.
+ * Technical note: it is used by multiple contexts non atomically.
++ * Make sure to use READ_ONCE()/WRITE_ONCE() for all reads/writes.
+ * All the __sk_mem_schedule() is of this nature: accounting
+ * is strict, actions are advisory and have some latency.
+ */
+@@ -1424,7 +1425,7 @@ static inline bool sk_has_memory_pressure(const struct sock *sk)
+ static inline bool sk_under_global_memory_pressure(const struct sock *sk)
+ {
+ return sk->sk_prot->memory_pressure &&
+- !!*sk->sk_prot->memory_pressure;
++ !!READ_ONCE(*sk->sk_prot->memory_pressure);
+ }
+
+ static inline bool sk_under_memory_pressure(const struct sock *sk)
+@@ -1436,7 +1437,7 @@ static inline bool sk_under_memory_pressure(const struct sock *sk)
+ mem_cgroup_under_socket_pressure(sk->sk_memcg))
+ return true;
+
+- return !!*sk->sk_prot->memory_pressure;
++ return !!READ_ONCE(*sk->sk_prot->memory_pressure);
+ }
+
+ static inline long
+@@ -1513,7 +1514,7 @@ proto_memory_pressure(struct proto *prot)
+ {
+ if (!prot->memory_pressure)
+ return false;
+- return !!*prot->memory_pressure;
++ return !!READ_ONCE(*prot->memory_pressure);
+ }
+
+
+diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
+index 8f5ee380d3093..5646ae15a957a 100644
+--- a/include/trace/events/jbd2.h
++++ b/include/trace/events/jbd2.h
+@@ -462,11 +462,9 @@ TRACE_EVENT(jbd2_shrink_scan_exit,
+ TRACE_EVENT(jbd2_shrink_checkpoint_list,
+
+ TP_PROTO(journal_t *journal, tid_t first_tid, tid_t tid, tid_t last_tid,
+- unsigned long nr_freed, unsigned long nr_scanned,
+- tid_t next_tid),
++ unsigned long nr_freed, tid_t next_tid),
+
+- TP_ARGS(journal, first_tid, tid, last_tid, nr_freed,
+- nr_scanned, next_tid),
++ TP_ARGS(journal, first_tid, tid, last_tid, nr_freed, next_tid),
+
+ TP_STRUCT__entry(
+ __field(dev_t, dev)
+@@ -474,7 +472,6 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list,
+ __field(tid_t, tid)
+ __field(tid_t, last_tid)
+ __field(unsigned long, nr_freed)
+- __field(unsigned long, nr_scanned)
+ __field(tid_t, next_tid)
+ ),
+
+@@ -484,15 +481,14 @@ TRACE_EVENT(jbd2_shrink_checkpoint_list,
+ __entry->tid = tid;
+ __entry->last_tid = last_tid;
+ __entry->nr_freed = nr_freed;
+- __entry->nr_scanned = nr_scanned;
+ __entry->next_tid = next_tid;
+ ),
+
+ TP_printk("dev %d,%d shrink transaction %u-%u(%u) freed %lu "
+- "scanned %lu next transaction %u",
++ "next transaction %u",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ __entry->first_tid, __entry->tid, __entry->last_tid,
+- __entry->nr_freed, __entry->nr_scanned, __entry->next_tid)
++ __entry->nr_freed, __entry->next_tid)
+ );
+
+ #endif /* _TRACE_JBD2_H */
+diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
+index 3299ec69ce0d1..5b9648c940b4f 100644
+--- a/kernel/cgroup/cgroup.c
++++ b/kernel/cgroup/cgroup.c
+@@ -57,6 +57,7 @@
+ #include <linux/file.h>
+ #include <linux/fs_parser.h>
+ #include <linux/sched/cputime.h>
++#include <linux/sched/deadline.h>
+ #include <linux/psi.h>
+ #include <net/sock.h>
+
+@@ -6696,6 +6697,9 @@ void cgroup_exit(struct task_struct *tsk)
+ list_add_tail(&tsk->cg_list, &cset->dying_tasks);
+ cset->nr_tasks--;
+
++ if (dl_task(tsk))
++ dec_dl_tasks_cs(tsk);
++
+ WARN_ON_ONCE(cgroup_task_frozen(tsk));
+ if (unlikely(!(tsk->flags & PF_KTHREAD) &&
+ test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index e4ca2dd2b7648..2c76fcd9f0bcb 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -193,6 +193,14 @@ struct cpuset {
+ int use_parent_ecpus;
+ int child_ecpus_count;
+
++ /*
++ * number of SCHED_DEADLINE tasks attached to this cpuset, so that we
++ * know when to rebuild associated root domain bandwidth information.
++ */
++ int nr_deadline_tasks;
++ int nr_migrate_dl_tasks;
++ u64 sum_migrate_dl_bw;
++
+ /* Invalid partition error code, not lock protected */
+ enum prs_errcode prs_err;
+
+@@ -245,6 +253,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs)
+ return css_cs(cs->css.parent);
+ }
+
++void inc_dl_tasks_cs(struct task_struct *p)
++{
++ struct cpuset *cs = task_cs(p);
++
++ cs->nr_deadline_tasks++;
++}
++
++void dec_dl_tasks_cs(struct task_struct *p)
++{
++ struct cpuset *cs = task_cs(p);
++
++ cs->nr_deadline_tasks--;
++}
++
+ /* bits in struct cpuset flags field */
+ typedef enum {
+ CS_ONLINE,
+@@ -366,22 +388,23 @@ static struct cpuset top_cpuset = {
+ if (is_cpuset_online(((des_cs) = css_cs((pos_css)))))
+
+ /*
+- * There are two global locks guarding cpuset structures - cpuset_rwsem and
++ * There are two global locks guarding cpuset structures - cpuset_mutex and
+ * callback_lock. We also require taking task_lock() when dereferencing a
+ * task's cpuset pointer. See "The task_lock() exception", at the end of this
+- * comment. The cpuset code uses only cpuset_rwsem write lock. Other
+- * kernel subsystems can use cpuset_read_lock()/cpuset_read_unlock() to
+- * prevent change to cpuset structures.
++ * comment. The cpuset code uses only cpuset_mutex. Other kernel subsystems
++ * can use cpuset_lock()/cpuset_unlock() to prevent change to cpuset
++ * structures. Note that cpuset_mutex needs to be a mutex as it is used in
++ * paths that rely on priority inheritance (e.g. scheduler - on RT) for
++ * correctness.
+ *
+ * A task must hold both locks to modify cpusets. If a task holds
+- * cpuset_rwsem, it blocks others wanting that rwsem, ensuring that it
+- * is the only task able to also acquire callback_lock and be able to
+- * modify cpusets. It can perform various checks on the cpuset structure
+- * first, knowing nothing will change. It can also allocate memory while
+- * just holding cpuset_rwsem. While it is performing these checks, various
+- * callback routines can briefly acquire callback_lock to query cpusets.
+- * Once it is ready to make the changes, it takes callback_lock, blocking
+- * everyone else.
++ * cpuset_mutex, it blocks others, ensuring that it is the only task able to
++ * also acquire callback_lock and be able to modify cpusets. It can perform
++ * various checks on the cpuset structure first, knowing nothing will change.
++ * It can also allocate memory while just holding cpuset_mutex. While it is
++ * performing these checks, various callback routines can briefly acquire
++ * callback_lock to query cpusets. Once it is ready to make the changes, it
++ * takes callback_lock, blocking everyone else.
+ *
+ * Calls to the kernel memory allocator can not be made while holding
+ * callback_lock, as that would risk double tripping on callback_lock
+@@ -403,16 +426,16 @@ static struct cpuset top_cpuset = {
+ * guidelines for accessing subsystem state in kernel/cgroup.c
+ */
+
+-DEFINE_STATIC_PERCPU_RWSEM(cpuset_rwsem);
++static DEFINE_MUTEX(cpuset_mutex);
+
+-void cpuset_read_lock(void)
++void cpuset_lock(void)
+ {
+- percpu_down_read(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ }
+
+-void cpuset_read_unlock(void)
++void cpuset_unlock(void)
+ {
+- percpu_up_read(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ }
+
+ static DEFINE_SPINLOCK(callback_lock);
+@@ -496,7 +519,7 @@ static inline bool partition_is_populated(struct cpuset *cs,
+ * One way or another, we guarantee to return some non-empty subset
+ * of cpu_online_mask.
+ *
+- * Call with callback_lock or cpuset_rwsem held.
++ * Call with callback_lock or cpuset_mutex held.
+ */
+ static void guarantee_online_cpus(struct task_struct *tsk,
+ struct cpumask *pmask)
+@@ -538,7 +561,7 @@ out_unlock:
+ * One way or another, we guarantee to return some non-empty subset
+ * of node_states[N_MEMORY].
+ *
+- * Call with callback_lock or cpuset_rwsem held.
++ * Call with callback_lock or cpuset_mutex held.
+ */
+ static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
+ {
+@@ -550,7 +573,7 @@ static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
+ /*
+ * update task's spread flag if cpuset's page/slab spread flag is set
+ *
+- * Call with callback_lock or cpuset_rwsem held. The check can be skipped
++ * Call with callback_lock or cpuset_mutex held. The check can be skipped
+ * if on default hierarchy.
+ */
+ static void cpuset_update_task_spread_flags(struct cpuset *cs,
+@@ -575,7 +598,7 @@ static void cpuset_update_task_spread_flags(struct cpuset *cs,
+ *
+ * One cpuset is a subset of another if all its allowed CPUs and
+ * Memory Nodes are a subset of the other, and its exclusive flags
+- * are only set if the other's are set. Call holding cpuset_rwsem.
++ * are only set if the other's are set. Call holding cpuset_mutex.
+ */
+
+ static int is_cpuset_subset(const struct cpuset *p, const struct cpuset *q)
+@@ -713,7 +736,7 @@ out:
+ * If we replaced the flag and mask values of the current cpuset
+ * (cur) with those values in the trial cpuset (trial), would
+ * our various subset and exclusive rules still be valid? Presumes
+- * cpuset_rwsem held.
++ * cpuset_mutex held.
+ *
+ * 'cur' is the address of an actual, in-use cpuset. Operations
+ * such as list traversal that depend on the actual address of the
+@@ -829,7 +852,7 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
+ rcu_read_unlock();
+ }
+
+-/* Must be called with cpuset_rwsem held. */
++/* Must be called with cpuset_mutex held. */
+ static inline int nr_cpusets(void)
+ {
+ /* jump label reference count + the top-level cpuset */
+@@ -855,7 +878,7 @@ static inline int nr_cpusets(void)
+ * domains when operating in the severe memory shortage situations
+ * that could cause allocation failures below.
+ *
+- * Must be called with cpuset_rwsem held.
++ * Must be called with cpuset_mutex held.
+ *
+ * The three key local variables below are:
+ * cp - cpuset pointer, used (together with pos_css) to perform a
+@@ -1066,11 +1089,14 @@ done:
+ return ndoms;
+ }
+
+-static void update_tasks_root_domain(struct cpuset *cs)
++static void dl_update_tasks_root_domain(struct cpuset *cs)
+ {
+ struct css_task_iter it;
+ struct task_struct *task;
+
++ if (cs->nr_deadline_tasks == 0)
++ return;
++
+ css_task_iter_start(&cs->css, 0, &it);
+
+ while ((task = css_task_iter_next(&it)))
+@@ -1079,12 +1105,12 @@ static void update_tasks_root_domain(struct cpuset *cs)
+ css_task_iter_end(&it);
+ }
+
+-static void rebuild_root_domains(void)
++static void dl_rebuild_rd_accounting(void)
+ {
+ struct cpuset *cs = NULL;
+ struct cgroup_subsys_state *pos_css;
+
+- percpu_rwsem_assert_held(&cpuset_rwsem);
++ lockdep_assert_held(&cpuset_mutex);
+ lockdep_assert_cpus_held();
+ lockdep_assert_held(&sched_domains_mutex);
+
+@@ -1107,7 +1133,7 @@ static void rebuild_root_domains(void)
+
+ rcu_read_unlock();
+
+- update_tasks_root_domain(cs);
++ dl_update_tasks_root_domain(cs);
+
+ rcu_read_lock();
+ css_put(&cs->css);
+@@ -1121,7 +1147,7 @@ partition_and_rebuild_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ {
+ mutex_lock(&sched_domains_mutex);
+ partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
+- rebuild_root_domains();
++ dl_rebuild_rd_accounting();
+ mutex_unlock(&sched_domains_mutex);
+ }
+
+@@ -1134,7 +1160,7 @@ partition_and_rebuild_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
+ * 'cpus' is removed, then call this routine to rebuild the
+ * scheduler's dynamic sched domains.
+ *
+- * Call with cpuset_rwsem held. Takes cpus_read_lock().
++ * Call with cpuset_mutex held. Takes cpus_read_lock().
+ */
+ static void rebuild_sched_domains_locked(void)
+ {
+@@ -1145,7 +1171,7 @@ static void rebuild_sched_domains_locked(void)
+ int ndoms;
+
+ lockdep_assert_cpus_held();
+- percpu_rwsem_assert_held(&cpuset_rwsem);
++ lockdep_assert_held(&cpuset_mutex);
+
+ /*
+ * If we have raced with CPU hotplug, return early to avoid
+@@ -1196,9 +1222,9 @@ static void rebuild_sched_domains_locked(void)
+ void rebuild_sched_domains(void)
+ {
+ cpus_read_lock();
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ rebuild_sched_domains_locked();
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ cpus_read_unlock();
+ }
+
+@@ -1208,7 +1234,7 @@ void rebuild_sched_domains(void)
+ * @new_cpus: the temp variable for the new effective_cpus mask
+ *
+ * Iterate through each task of @cs updating its cpus_allowed to the
+- * effective cpuset's. As this function is called with cpuset_rwsem held,
++ * effective cpuset's. As this function is called with cpuset_mutex held,
+ * cpuset membership stays stable. For top_cpuset, task_cpu_possible_mask()
+ * is used instead of effective_cpus to make sure all offline CPUs are also
+ * included as hotplug code won't update cpumasks for tasks in top_cpuset.
+@@ -1322,7 +1348,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
+ int old_prs, new_prs;
+ int part_error = PERR_NONE; /* Partition error? */
+
+- percpu_rwsem_assert_held(&cpuset_rwsem);
++ lockdep_assert_held(&cpuset_mutex);
+
+ /*
+ * The parent must be a partition root.
+@@ -1545,7 +1571,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd,
+ *
+ * On legacy hierarchy, effective_cpus will be the same with cpu_allowed.
+ *
+- * Called with cpuset_rwsem held
++ * Called with cpuset_mutex held
+ */
+ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
+ bool force)
+@@ -1705,7 +1731,7 @@ static void update_sibling_cpumasks(struct cpuset *parent, struct cpuset *cs,
+ struct cpuset *sibling;
+ struct cgroup_subsys_state *pos_css;
+
+- percpu_rwsem_assert_held(&cpuset_rwsem);
++ lockdep_assert_held(&cpuset_mutex);
+
+ /*
+ * Check all its siblings and call update_cpumasks_hier()
+@@ -1955,12 +1981,12 @@ static void *cpuset_being_rebound;
+ * @cs: the cpuset in which each task's mems_allowed mask needs to be changed
+ *
+ * Iterate through each task of @cs updating its mems_allowed to the
+- * effective cpuset's. As this function is called with cpuset_rwsem held,
++ * effective cpuset's. As this function is called with cpuset_mutex held,
+ * cpuset membership stays stable.
+ */
+ static void update_tasks_nodemask(struct cpuset *cs)
+ {
+- static nodemask_t newmems; /* protected by cpuset_rwsem */
++ static nodemask_t newmems; /* protected by cpuset_mutex */
+ struct css_task_iter it;
+ struct task_struct *task;
+
+@@ -1973,7 +1999,7 @@ static void update_tasks_nodemask(struct cpuset *cs)
+ * take while holding tasklist_lock. Forks can happen - the
+ * mpol_dup() cpuset_being_rebound check will catch such forks,
+ * and rebind their vma mempolicies too. Because we still hold
+- * the global cpuset_rwsem, we know that no other rebind effort
++ * the global cpuset_mutex, we know that no other rebind effort
+ * will be contending for the global variable cpuset_being_rebound.
+ * It's ok if we rebind the same mm twice; mpol_rebind_mm()
+ * is idempotent. Also migrate pages in each mm to new nodes.
+@@ -2019,7 +2045,7 @@ static void update_tasks_nodemask(struct cpuset *cs)
+ *
+ * On legacy hierarchy, effective_mems will be the same with mems_allowed.
+ *
+- * Called with cpuset_rwsem held
++ * Called with cpuset_mutex held
+ */
+ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
+ {
+@@ -2072,7 +2098,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
+ * mempolicies and if the cpuset is marked 'memory_migrate',
+ * migrate the tasks pages to the new memory.
+ *
+- * Call with cpuset_rwsem held. May take callback_lock during call.
++ * Call with cpuset_mutex held. May take callback_lock during call.
+ * Will take tasklist_lock, scan tasklist for tasks in cpuset cs,
+ * lock each such tasks mm->mmap_lock, scan its vma's and rebind
+ * their mempolicies to the cpusets new mems_allowed.
+@@ -2164,7 +2190,7 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val)
+ * @cs: the cpuset in which each task's spread flags needs to be changed
+ *
+ * Iterate through each task of @cs updating its spread flags. As this
+- * function is called with cpuset_rwsem held, cpuset membership stays
++ * function is called with cpuset_mutex held, cpuset membership stays
+ * stable.
+ */
+ static void update_tasks_flags(struct cpuset *cs)
+@@ -2184,7 +2210,7 @@ static void update_tasks_flags(struct cpuset *cs)
+ * cs: the cpuset to update
+ * turning_on: whether the flag is being set or cleared
+ *
+- * Call with cpuset_rwsem held.
++ * Call with cpuset_mutex held.
+ */
+
+ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
+@@ -2234,7 +2260,7 @@ out:
+ * @new_prs: new partition root state
+ * Return: 0 if successful, != 0 if error
+ *
+- * Call with cpuset_rwsem held.
++ * Call with cpuset_mutex held.
+ */
+ static int update_prstate(struct cpuset *cs, int new_prs)
+ {
+@@ -2472,19 +2498,26 @@ static int cpuset_can_attach_check(struct cpuset *cs)
+ return 0;
+ }
+
+-/* Called by cgroups to determine if a cpuset is usable; cpuset_rwsem held */
++static void reset_migrate_dl_data(struct cpuset *cs)
++{
++ cs->nr_migrate_dl_tasks = 0;
++ cs->sum_migrate_dl_bw = 0;
++}
++
++/* Called by cgroups to determine if a cpuset is usable; cpuset_mutex held */
+ static int cpuset_can_attach(struct cgroup_taskset *tset)
+ {
+ struct cgroup_subsys_state *css;
+- struct cpuset *cs;
++ struct cpuset *cs, *oldcs;
+ struct task_struct *task;
+ int ret;
+
+ /* used later by cpuset_attach() */
+ cpuset_attach_old_cs = task_cs(cgroup_taskset_first(tset, &css));
++ oldcs = cpuset_attach_old_cs;
+ cs = css_cs(css);
+
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+
+ /* Check to see if task is allowed in the cpuset */
+ ret = cpuset_can_attach_check(cs);
+@@ -2492,21 +2525,46 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
+ goto out_unlock;
+
+ cgroup_taskset_for_each(task, css, tset) {
+- ret = task_can_attach(task, cs->effective_cpus);
++ ret = task_can_attach(task);
+ if (ret)
+ goto out_unlock;
+ ret = security_task_setscheduler(task);
+ if (ret)
+ goto out_unlock;
++
++ if (dl_task(task)) {
++ cs->nr_migrate_dl_tasks++;
++ cs->sum_migrate_dl_bw += task->dl.dl_bw;
++ }
+ }
+
++ if (!cs->nr_migrate_dl_tasks)
++ goto out_success;
++
++ if (!cpumask_intersects(oldcs->effective_cpus, cs->effective_cpus)) {
++ int cpu = cpumask_any_and(cpu_active_mask, cs->effective_cpus);
++
++ if (unlikely(cpu >= nr_cpu_ids)) {
++ reset_migrate_dl_data(cs);
++ ret = -EINVAL;
++ goto out_unlock;
++ }
++
++ ret = dl_bw_alloc(cpu, cs->sum_migrate_dl_bw);
++ if (ret) {
++ reset_migrate_dl_data(cs);
++ goto out_unlock;
++ }
++ }
++
++out_success:
+ /*
+ * Mark attach is in progress. This makes validate_change() fail
+ * changes which zero cpus/mems_allowed.
+ */
+ cs->attach_in_progress++;
+ out_unlock:
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ return ret;
+ }
+
+@@ -2518,15 +2576,23 @@ static void cpuset_cancel_attach(struct cgroup_taskset *tset)
+ cgroup_taskset_first(tset, &css);
+ cs = css_cs(css);
+
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ cs->attach_in_progress--;
+ if (!cs->attach_in_progress)
+ wake_up(&cpuset_attach_wq);
+- percpu_up_write(&cpuset_rwsem);
++
++ if (cs->nr_migrate_dl_tasks) {
++ int cpu = cpumask_any(cs->effective_cpus);
++
++ dl_bw_free(cpu, cs->sum_migrate_dl_bw);
++ reset_migrate_dl_data(cs);
++ }
++
++ mutex_unlock(&cpuset_mutex);
+ }
+
+ /*
+- * Protected by cpuset_rwsem. cpus_attach is used only by cpuset_attach_task()
++ * Protected by cpuset_mutex. cpus_attach is used only by cpuset_attach_task()
+ * but we can't allocate it dynamically there. Define it global and
+ * allocate from cpuset_init().
+ */
+@@ -2535,7 +2601,7 @@ static nodemask_t cpuset_attach_nodemask_to;
+
+ static void cpuset_attach_task(struct cpuset *cs, struct task_struct *task)
+ {
+- percpu_rwsem_assert_held(&cpuset_rwsem);
++ lockdep_assert_held(&cpuset_mutex);
+
+ if (cs != &top_cpuset)
+ guarantee_online_cpus(task, cpus_attach);
+@@ -2565,7 +2631,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
+ cs = css_cs(css);
+
+ lockdep_assert_cpus_held(); /* see cgroup_attach_lock() */
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ cpus_updated = !cpumask_equal(cs->effective_cpus,
+ oldcs->effective_cpus);
+ mems_updated = !nodes_equal(cs->effective_mems, oldcs->effective_mems);
+@@ -2622,11 +2688,17 @@ static void cpuset_attach(struct cgroup_taskset *tset)
+ out:
+ cs->old_mems_allowed = cpuset_attach_nodemask_to;
+
++ if (cs->nr_migrate_dl_tasks) {
++ cs->nr_deadline_tasks += cs->nr_migrate_dl_tasks;
++ oldcs->nr_deadline_tasks -= cs->nr_migrate_dl_tasks;
++ reset_migrate_dl_data(cs);
++ }
++
+ cs->attach_in_progress--;
+ if (!cs->attach_in_progress)
+ wake_up(&cpuset_attach_wq);
+
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ }
+
+ /* The various types of files and directories in a cpuset file system */
+@@ -2658,7 +2730,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
+ int retval = 0;
+
+ cpus_read_lock();
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ if (!is_cpuset_online(cs)) {
+ retval = -ENODEV;
+ goto out_unlock;
+@@ -2694,7 +2766,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
+ break;
+ }
+ out_unlock:
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ cpus_read_unlock();
+ return retval;
+ }
+@@ -2707,7 +2779,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
+ int retval = -ENODEV;
+
+ cpus_read_lock();
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ if (!is_cpuset_online(cs))
+ goto out_unlock;
+
+@@ -2720,7 +2792,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
+ break;
+ }
+ out_unlock:
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ cpus_read_unlock();
+ return retval;
+ }
+@@ -2753,7 +2825,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
+ * operation like this one can lead to a deadlock through kernfs
+ * active_ref protection. Let's break the protection. Losing the
+ * protection is okay as we check whether @cs is online after
+- * grabbing cpuset_rwsem anyway. This only happens on the legacy
++ * grabbing cpuset_mutex anyway. This only happens on the legacy
+ * hierarchies.
+ */
+ css_get(&cs->css);
+@@ -2761,7 +2833,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
+ flush_work(&cpuset_hotplug_work);
+
+ cpus_read_lock();
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ if (!is_cpuset_online(cs))
+ goto out_unlock;
+
+@@ -2785,7 +2857,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
+
+ free_cpuset(trialcs);
+ out_unlock:
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ cpus_read_unlock();
+ kernfs_unbreak_active_protection(of->kn);
+ css_put(&cs->css);
+@@ -2933,13 +3005,13 @@ static ssize_t sched_partition_write(struct kernfs_open_file *of, char *buf,
+
+ css_get(&cs->css);
+ cpus_read_lock();
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ if (!is_cpuset_online(cs))
+ goto out_unlock;
+
+ retval = update_prstate(cs, val);
+ out_unlock:
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ cpus_read_unlock();
+ css_put(&cs->css);
+ return retval ?: nbytes;
+@@ -3156,7 +3228,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
+ return 0;
+
+ cpus_read_lock();
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+
+ set_bit(CS_ONLINE, &cs->flags);
+ if (is_spread_page(parent))
+@@ -3207,7 +3279,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
+ cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
+ spin_unlock_irq(&callback_lock);
+ out_unlock:
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ cpus_read_unlock();
+ return 0;
+ }
+@@ -3228,7 +3300,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
+ struct cpuset *cs = css_cs(css);
+
+ cpus_read_lock();
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+
+ if (is_partition_valid(cs))
+ update_prstate(cs, 0);
+@@ -3247,7 +3319,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
+ cpuset_dec();
+ clear_bit(CS_ONLINE, &cs->flags);
+
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ cpus_read_unlock();
+ }
+
+@@ -3260,7 +3332,7 @@ static void cpuset_css_free(struct cgroup_subsys_state *css)
+
+ static void cpuset_bind(struct cgroup_subsys_state *root_css)
+ {
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ spin_lock_irq(&callback_lock);
+
+ if (is_in_v2_mode()) {
+@@ -3273,7 +3345,7 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css)
+ }
+
+ spin_unlock_irq(&callback_lock);
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ }
+
+ /*
+@@ -3294,14 +3366,14 @@ static int cpuset_can_fork(struct task_struct *task, struct css_set *cset)
+ return 0;
+
+ lockdep_assert_held(&cgroup_mutex);
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+
+ /* Check to see if task is allowed in the cpuset */
+ ret = cpuset_can_attach_check(cs);
+ if (ret)
+ goto out_unlock;
+
+- ret = task_can_attach(task, cs->effective_cpus);
++ ret = task_can_attach(task);
+ if (ret)
+ goto out_unlock;
+
+@@ -3315,7 +3387,7 @@ static int cpuset_can_fork(struct task_struct *task, struct css_set *cset)
+ */
+ cs->attach_in_progress++;
+ out_unlock:
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ return ret;
+ }
+
+@@ -3331,11 +3403,11 @@ static void cpuset_cancel_fork(struct task_struct *task, struct css_set *cset)
+ if (same_cs)
+ return;
+
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ cs->attach_in_progress--;
+ if (!cs->attach_in_progress)
+ wake_up(&cpuset_attach_wq);
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ }
+
+ /*
+@@ -3363,7 +3435,7 @@ static void cpuset_fork(struct task_struct *task)
+ }
+
+ /* CLONE_INTO_CGROUP */
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
+ cpuset_attach_task(cs, task);
+
+@@ -3371,7 +3443,7 @@ static void cpuset_fork(struct task_struct *task)
+ if (!cs->attach_in_progress)
+ wake_up(&cpuset_attach_wq);
+
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ }
+
+ struct cgroup_subsys cpuset_cgrp_subsys = {
+@@ -3472,7 +3544,7 @@ hotplug_update_tasks_legacy(struct cpuset *cs,
+ is_empty = cpumask_empty(cs->cpus_allowed) ||
+ nodes_empty(cs->mems_allowed);
+
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+
+ /*
+ * Move tasks to the nearest ancestor with execution resources,
+@@ -3482,7 +3554,7 @@ hotplug_update_tasks_legacy(struct cpuset *cs,
+ if (is_empty)
+ remove_tasks_in_empty_cpuset(cs);
+
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+ }
+
+ static void
+@@ -3533,14 +3605,14 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
+ retry:
+ wait_event(cpuset_attach_wq, cs->attach_in_progress == 0);
+
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+
+ /*
+ * We have raced with task attaching. We wait until attaching
+ * is finished, so we won't attach a task to an empty cpuset.
+ */
+ if (cs->attach_in_progress) {
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ goto retry;
+ }
+
+@@ -3637,7 +3709,7 @@ update_tasks:
+ cpus_updated, mems_updated);
+
+ unlock:
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+ }
+
+ /**
+@@ -3667,7 +3739,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
+ if (on_dfl && !alloc_cpumasks(NULL, &tmp))
+ ptmp = &tmp;
+
+- percpu_down_write(&cpuset_rwsem);
++ mutex_lock(&cpuset_mutex);
+
+ /* fetch the available cpus/mems and find out which changed how */
+ cpumask_copy(&new_cpus, cpu_active_mask);
+@@ -3724,7 +3796,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
+ update_tasks_nodemask(&top_cpuset);
+ }
+
+- percpu_up_write(&cpuset_rwsem);
++ mutex_unlock(&cpuset_mutex);
+
+ /* if cpus or mems changed, we need to propagate to descendants */
+ if (cpus_updated || mems_updated) {
+@@ -4155,7 +4227,7 @@ void __cpuset_memory_pressure_bump(void)
+ * - Used for /proc/<pid>/cpuset.
+ * - No need to task_lock(tsk) on this tsk->cpuset reference, as it
+ * doesn't really matter if tsk->cpuset changes after we read it,
+- * and we take cpuset_rwsem, keeping cpuset_attach() from changing it
++ * and we take cpuset_mutex, keeping cpuset_attach() from changing it
+ * anyway.
+ */
+ int proc_cpuset_show(struct seq_file *m, struct pid_namespace *ns,
+diff --git a/kernel/sched/core.c b/kernel/sched/core.c
+index a68d1276bab00..90005760003f1 100644
+--- a/kernel/sched/core.c
++++ b/kernel/sched/core.c
+@@ -7590,6 +7590,7 @@ static int __sched_setscheduler(struct task_struct *p,
+ int reset_on_fork;
+ int queue_flags = DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK;
+ struct rq *rq;
++ bool cpuset_locked = false;
+
+ /* The pi code expects interrupts enabled */
+ BUG_ON(pi && in_interrupt());
+@@ -7639,8 +7640,14 @@ recheck:
+ return retval;
+ }
+
+- if (pi)
+- cpuset_read_lock();
++ /*
++ * SCHED_DEADLINE bandwidth accounting relies on stable cpusets
++ * information.
++ */
++ if (dl_policy(policy) || dl_policy(p->policy)) {
++ cpuset_locked = true;
++ cpuset_lock();
++ }
+
+ /*
+ * Make sure no PI-waiters arrive (or leave) while we are
+@@ -7716,8 +7723,8 @@ change:
+ if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
+ policy = oldpolicy = -1;
+ task_rq_unlock(rq, p, &rf);
+- if (pi)
+- cpuset_read_unlock();
++ if (cpuset_locked)
++ cpuset_unlock();
+ goto recheck;
+ }
+
+@@ -7784,7 +7791,8 @@ change:
+ task_rq_unlock(rq, p, &rf);
+
+ if (pi) {
+- cpuset_read_unlock();
++ if (cpuset_locked)
++ cpuset_unlock();
+ rt_mutex_adjust_pi(p);
+ }
+
+@@ -7796,8 +7804,8 @@ change:
+
+ unlock:
+ task_rq_unlock(rq, p, &rf);
+- if (pi)
+- cpuset_read_unlock();
++ if (cpuset_locked)
++ cpuset_unlock();
+ return retval;
+ }
+
+@@ -9286,8 +9294,7 @@ int cpuset_cpumask_can_shrink(const struct cpumask *cur,
+ return ret;
+ }
+
+-int task_can_attach(struct task_struct *p,
+- const struct cpumask *cs_effective_cpus)
++int task_can_attach(struct task_struct *p)
+ {
+ int ret = 0;
+
+@@ -9300,21 +9307,9 @@ int task_can_attach(struct task_struct *p,
+ * success of set_cpus_allowed_ptr() on all attached tasks
+ * before cpus_mask may be changed.
+ */
+- if (p->flags & PF_NO_SETAFFINITY) {
++ if (p->flags & PF_NO_SETAFFINITY)
+ ret = -EINVAL;
+- goto out;
+- }
+
+- if (dl_task(p) && !cpumask_intersects(task_rq(p)->rd->span,
+- cs_effective_cpus)) {
+- int cpu = cpumask_any_and(cpu_active_mask, cs_effective_cpus);
+-
+- if (unlikely(cpu >= nr_cpu_ids))
+- return -EINVAL;
+- ret = dl_cpu_busy(cpu, p);
+- }
+-
+-out:
+ return ret;
+ }
+
+@@ -9596,7 +9591,7 @@ static void cpuset_cpu_active(void)
+ static int cpuset_cpu_inactive(unsigned int cpu)
+ {
+ if (!cpuhp_tasks_frozen) {
+- int ret = dl_cpu_busy(cpu, NULL);
++ int ret = dl_bw_check_overflow(cpu);
+
+ if (ret)
+ return ret;
+diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
+index 5a9a4b81c972e..166c3e6eae617 100644
+--- a/kernel/sched/deadline.c
++++ b/kernel/sched/deadline.c
+@@ -16,6 +16,8 @@
+ * Fabio Checconi <fchecconi@gmail.com>
+ */
+
++#include <linux/cpuset.h>
++
+ /*
+ * Default limits for DL period; on the top end we guard against small util
+ * tasks still getting ridiculously long effective runtimes, on the bottom end we
+@@ -2596,6 +2598,12 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
+ if (task_on_rq_queued(p) && p->dl.dl_runtime)
+ task_non_contending(p);
+
++ /*
++ * In case a task is setscheduled out from SCHED_DEADLINE we need to
++ * keep track of that on its cpuset (for correct bandwidth tracking).
++ */
++ dec_dl_tasks_cs(p);
++
+ if (!task_on_rq_queued(p)) {
+ /*
+ * Inactive timer is armed. However, p is leaving DEADLINE and
+@@ -2636,6 +2644,12 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
+ if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
+ put_task_struct(p);
+
++ /*
++ * In case a task is setscheduled to SCHED_DEADLINE we need to keep
++ * track of that on its cpuset (for correct bandwidth tracking).
++ */
++ inc_dl_tasks_cs(p);
++
+ /* If p is not queued we will update its parameters at next wakeup. */
+ if (!task_on_rq_queued(p)) {
+ add_rq_bw(&p->dl, &rq->dl);
+@@ -3044,26 +3058,38 @@ int dl_cpuset_cpumask_can_shrink(const struct cpumask *cur,
+ return ret;
+ }
+
+-int dl_cpu_busy(int cpu, struct task_struct *p)
++enum dl_bw_request {
++ dl_bw_req_check_overflow = 0,
++ dl_bw_req_alloc,
++ dl_bw_req_free
++};
++
++static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw)
+ {
+- unsigned long flags, cap;
++ unsigned long flags;
+ struct dl_bw *dl_b;
+- bool overflow;
++ bool overflow = 0;
+
+ rcu_read_lock_sched();
+ dl_b = dl_bw_of(cpu);
+ raw_spin_lock_irqsave(&dl_b->lock, flags);
+- cap = dl_bw_capacity(cpu);
+- overflow = __dl_overflow(dl_b, cap, 0, p ? p->dl.dl_bw : 0);
+
+- if (!overflow && p) {
+- /*
+- * We reserve space for this task in the destination
+- * root_domain, as we can't fail after this point.
+- * We will free resources in the source root_domain
+- * later on (see set_cpus_allowed_dl()).
+- */
+- __dl_add(dl_b, p->dl.dl_bw, dl_bw_cpus(cpu));
++ if (req == dl_bw_req_free) {
++ __dl_sub(dl_b, dl_bw, dl_bw_cpus(cpu));
++ } else {
++ unsigned long cap = dl_bw_capacity(cpu);
++
++ overflow = __dl_overflow(dl_b, cap, 0, dl_bw);
++
++ if (req == dl_bw_req_alloc && !overflow) {
++ /*
++ * We reserve space in the destination
++ * root_domain, as we can't fail after this point.
++ * We will free resources in the source root_domain
++ * later on (see set_cpus_allowed_dl()).
++ */
++ __dl_add(dl_b, dl_bw, dl_bw_cpus(cpu));
++ }
+ }
+
+ raw_spin_unlock_irqrestore(&dl_b->lock, flags);
+@@ -3071,6 +3097,21 @@ int dl_cpu_busy(int cpu, struct task_struct *p)
+
+ return overflow ? -EBUSY : 0;
+ }
++
++int dl_bw_check_overflow(int cpu)
++{
++ return dl_bw_manage(dl_bw_req_check_overflow, cpu, 0);
++}
++
++int dl_bw_alloc(int cpu, u64 dl_bw)
++{
++ return dl_bw_manage(dl_bw_req_alloc, cpu, dl_bw);
++}
++
++void dl_bw_free(int cpu, u64 dl_bw)
++{
++ dl_bw_manage(dl_bw_req_free, cpu, dl_bw);
++}
+ #endif
+
+ #ifdef CONFIG_SCHED_DEBUG
+diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
+index 81ac605b9cd5c..ead91c1fbe75b 100644
+--- a/kernel/sched/sched.h
++++ b/kernel/sched/sched.h
+@@ -330,7 +330,7 @@ extern void __getparam_dl(struct task_struct *p, struct sched_attr *attr);
+ extern bool __checkparam_dl(const struct sched_attr *attr);
+ extern bool dl_param_changed(struct task_struct *p, const struct sched_attr *attr);
+ extern int dl_cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
+-extern int dl_cpu_busy(int cpu, struct task_struct *p);
++extern int dl_bw_check_overflow(int cpu);
+
+ #ifdef CONFIG_CGROUP_SCHED
+
+diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
+index fd051f85efd4b..f4855be6ac2b5 100644
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -4196,8 +4196,15 @@ static void *s_start(struct seq_file *m, loff_t *pos)
+ * will point to the same string as current_trace->name.
+ */
+ mutex_lock(&trace_types_lock);
+- if (unlikely(tr->current_trace && iter->trace->name != tr->current_trace->name))
++ if (unlikely(tr->current_trace && iter->trace->name != tr->current_trace->name)) {
++ /* Close iter->trace before switching to the new current tracer */
++ if (iter->trace->close)
++ iter->trace->close(iter);
+ *iter->trace = *tr->current_trace;
++ /* Reopen the new current tracer */
++ if (iter->trace->open)
++ iter->trace->open(iter);
++ }
+ mutex_unlock(&trace_types_lock);
+
+ #ifdef CONFIG_TRACER_MAX_TRACE
+@@ -5260,11 +5267,17 @@ int tracing_set_cpumask(struct trace_array *tr,
+ !cpumask_test_cpu(cpu, tracing_cpumask_new)) {
+ atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
+ ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
++#ifdef CONFIG_TRACER_MAX_TRACE
++ ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
++#endif
+ }
+ if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
+ cpumask_test_cpu(cpu, tracing_cpumask_new)) {
+ atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
+ ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu);
++#ifdef CONFIG_TRACER_MAX_TRACE
++ ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
++#endif
+ }
+ }
+ arch_spin_unlock(&tr->max_lock);
+diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
+index eee1f3ca47494..2daeac8e690a6 100644
+--- a/kernel/trace/trace.h
++++ b/kernel/trace/trace.h
+@@ -1282,6 +1282,14 @@ static inline void trace_branch_disable(void)
+ /* set ring buffers to default size if not already done so */
+ int tracing_update_buffers(void);
+
++union trace_synth_field {
++ u8 as_u8;
++ u16 as_u16;
++ u32 as_u32;
++ u64 as_u64;
++ struct trace_dynamic_info as_dynamic;
++};
++
+ struct ftrace_event_field {
+ struct list_head link;
+ const char *name;
+diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
+index d6a70aff24101..32109d092b10f 100644
+--- a/kernel/trace/trace_events_synth.c
++++ b/kernel/trace/trace_events_synth.c
+@@ -127,7 +127,7 @@ static bool synth_event_match(const char *system, const char *event,
+
+ struct synth_trace_event {
+ struct trace_entry ent;
+- u64 fields[];
++ union trace_synth_field fields[];
+ };
+
+ static int synth_event_define_fields(struct trace_event_call *call)
+@@ -321,19 +321,19 @@ static const char *synth_field_fmt(char *type)
+
+ static void print_synth_event_num_val(struct trace_seq *s,
+ char *print_fmt, char *name,
+- int size, u64 val, char *space)
++ int size, union trace_synth_field *val, char *space)
+ {
+ switch (size) {
+ case 1:
+- trace_seq_printf(s, print_fmt, name, (u8)val, space);
++ trace_seq_printf(s, print_fmt, name, val->as_u8, space);
+ break;
+
+ case 2:
+- trace_seq_printf(s, print_fmt, name, (u16)val, space);
++ trace_seq_printf(s, print_fmt, name, val->as_u16, space);
+ break;
+
+ case 4:
+- trace_seq_printf(s, print_fmt, name, (u32)val, space);
++ trace_seq_printf(s, print_fmt, name, val->as_u32, space);
+ break;
+
+ default:
+@@ -350,7 +350,7 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
+ struct trace_seq *s = &iter->seq;
+ struct synth_trace_event *entry;
+ struct synth_event *se;
+- unsigned int i, n_u64;
++ unsigned int i, j, n_u64;
+ char print_fmt[32];
+ const char *fmt;
+
+@@ -374,43 +374,28 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
+ /* parameter values */
+ if (se->fields[i]->is_string) {
+ if (se->fields[i]->is_dynamic) {
+- u32 offset, data_offset;
+- char *str_field;
+-
+- offset = (u32)entry->fields[n_u64];
+- data_offset = offset & 0xffff;
+-
+- str_field = (char *)entry + data_offset;
++ union trace_synth_field *data = &entry->fields[n_u64];
+
+ trace_seq_printf(s, print_fmt, se->fields[i]->name,
+ STR_VAR_LEN_MAX,
+- str_field,
++ (char *)entry + data->as_dynamic.offset,
+ i == se->n_fields - 1 ? "" : " ");
+ n_u64++;
+ } else {
+ trace_seq_printf(s, print_fmt, se->fields[i]->name,
+ STR_VAR_LEN_MAX,
+- (char *)&entry->fields[n_u64],
++ (char *)&entry->fields[n_u64].as_u64,
+ i == se->n_fields - 1 ? "" : " ");
+ n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
+ }
+ } else if (se->fields[i]->is_stack) {
+- u32 offset, data_offset, len;
+- unsigned long *p, *end;
+-
+- offset = (u32)entry->fields[n_u64];
+- data_offset = offset & 0xffff;
+- len = offset >> 16;
+-
+- p = (void *)entry + data_offset;
+- end = (void *)p + len - (sizeof(long) - 1);
++ union trace_synth_field *data = &entry->fields[n_u64];
++ unsigned long *p = (void *)entry + data->as_dynamic.offset;
+
+ trace_seq_printf(s, "%s=STACK:\n", se->fields[i]->name);
+-
+- for (; *p && p < end; p++)
+- trace_seq_printf(s, "=> %pS\n", (void *)*p);
++ for (j = 1; j < data->as_dynamic.len / sizeof(long); j++)
++ trace_seq_printf(s, "=> %pS\n", (void *)p[j]);
+ n_u64++;
+-
+ } else {
+ struct trace_print_flags __flags[] = {
+ __def_gfpflag_names, {-1, NULL} };
+@@ -419,13 +404,13 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
+ print_synth_event_num_val(s, print_fmt,
+ se->fields[i]->name,
+ se->fields[i]->size,
+- entry->fields[n_u64],
++ &entry->fields[n_u64],
+ space);
+
+ if (strcmp(se->fields[i]->type, "gfp_t") == 0) {
+ trace_seq_puts(s, " (");
+ trace_print_flags_seq(s, "|",
+- entry->fields[n_u64],
++ entry->fields[n_u64].as_u64,
+ __flags);
+ trace_seq_putc(s, ')');
+ }
+@@ -454,21 +439,16 @@ static unsigned int trace_string(struct synth_trace_event *entry,
+ int ret;
+
+ if (is_dynamic) {
+- u32 data_offset;
+-
+- data_offset = struct_size(entry, fields, event->n_u64);
+- data_offset += data_size;
+-
+- len = fetch_store_strlen((unsigned long)str_val);
++ union trace_synth_field *data = &entry->fields[*n_u64];
+
+- data_offset |= len << 16;
+- *(u32 *)&entry->fields[*n_u64] = data_offset;
++ data->as_dynamic.offset = struct_size(entry, fields, event->n_u64) + data_size;
++ data->as_dynamic.len = fetch_store_strlen((unsigned long)str_val);
+
+ ret = fetch_store_string((unsigned long)str_val, &entry->fields[*n_u64], entry);
+
+ (*n_u64)++;
+ } else {
+- str_field = (char *)&entry->fields[*n_u64];
++ str_field = (char *)&entry->fields[*n_u64].as_u64;
+
+ #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ if ((unsigned long)str_val < TASK_SIZE)
+@@ -492,6 +472,7 @@ static unsigned int trace_stack(struct synth_trace_event *entry,
+ unsigned int data_size,
+ unsigned int *n_u64)
+ {
++ union trace_synth_field *data = &entry->fields[*n_u64];
+ unsigned int len;
+ u32 data_offset;
+ void *data_loc;
+@@ -504,10 +485,6 @@ static unsigned int trace_stack(struct synth_trace_event *entry,
+ break;
+ }
+
+- /* Include the zero'd element if it fits */
+- if (len < HIST_STACKTRACE_DEPTH)
+- len++;
+-
+ len *= sizeof(long);
+
+ /* Find the dynamic section to copy the stack into. */
+@@ -515,8 +492,9 @@ static unsigned int trace_stack(struct synth_trace_event *entry,
+ memcpy(data_loc, stack, len);
+
+ /* Fill in the field that holds the offset/len combo */
+- data_offset |= len << 16;
+- *(u32 *)&entry->fields[*n_u64] = data_offset;
++
++ data->as_dynamic.offset = data_offset;
++ data->as_dynamic.len = len;
+
+ (*n_u64)++;
+
+@@ -550,7 +528,8 @@ static notrace void trace_event_raw_event_synth(void *__data,
+ str_val = (char *)(long)var_ref_vals[val_idx];
+
+ if (event->dynamic_fields[i]->is_stack) {
+- len = *((unsigned long *)str_val);
++ /* reserve one extra element for size */
++ len = *((unsigned long *)str_val) + 1;
+ len *= sizeof(unsigned long);
+ } else {
+ len = fetch_store_strlen((unsigned long)str_val);
+@@ -592,19 +571,19 @@ static notrace void trace_event_raw_event_synth(void *__data,
+
+ switch (field->size) {
+ case 1:
+- *(u8 *)&entry->fields[n_u64] = (u8)val;
++ entry->fields[n_u64].as_u8 = (u8)val;
+ break;
+
+ case 2:
+- *(u16 *)&entry->fields[n_u64] = (u16)val;
++ entry->fields[n_u64].as_u16 = (u16)val;
+ break;
+
+ case 4:
+- *(u32 *)&entry->fields[n_u64] = (u32)val;
++ entry->fields[n_u64].as_u32 = (u32)val;
+ break;
+
+ default:
+- entry->fields[n_u64] = val;
++ entry->fields[n_u64].as_u64 = val;
+ break;
+ }
+ n_u64++;
+@@ -1790,19 +1769,19 @@ int synth_event_trace(struct trace_event_file *file, unsigned int n_vals, ...)
+
+ switch (field->size) {
+ case 1:
+- *(u8 *)&state.entry->fields[n_u64] = (u8)val;
++ state.entry->fields[n_u64].as_u8 = (u8)val;
+ break;
+
+ case 2:
+- *(u16 *)&state.entry->fields[n_u64] = (u16)val;
++ state.entry->fields[n_u64].as_u16 = (u16)val;
+ break;
+
+ case 4:
+- *(u32 *)&state.entry->fields[n_u64] = (u32)val;
++ state.entry->fields[n_u64].as_u32 = (u32)val;
+ break;
+
+ default:
+- state.entry->fields[n_u64] = val;
++ state.entry->fields[n_u64].as_u64 = val;
+ break;
+ }
+ n_u64++;
+@@ -1883,19 +1862,19 @@ int synth_event_trace_array(struct trace_event_file *file, u64 *vals,
+
+ switch (field->size) {
+ case 1:
+- *(u8 *)&state.entry->fields[n_u64] = (u8)val;
++ state.entry->fields[n_u64].as_u8 = (u8)val;
+ break;
+
+ case 2:
+- *(u16 *)&state.entry->fields[n_u64] = (u16)val;
++ state.entry->fields[n_u64].as_u16 = (u16)val;
+ break;
+
+ case 4:
+- *(u32 *)&state.entry->fields[n_u64] = (u32)val;
++ state.entry->fields[n_u64].as_u32 = (u32)val;
+ break;
+
+ default:
+- state.entry->fields[n_u64] = val;
++ state.entry->fields[n_u64].as_u64 = val;
+ break;
+ }
+ n_u64++;
+@@ -2030,19 +2009,19 @@ static int __synth_event_add_val(const char *field_name, u64 val,
+ } else {
+ switch (field->size) {
+ case 1:
+- *(u8 *)&trace_state->entry->fields[field->offset] = (u8)val;
++ trace_state->entry->fields[field->offset].as_u8 = (u8)val;
+ break;
+
+ case 2:
+- *(u16 *)&trace_state->entry->fields[field->offset] = (u16)val;
++ trace_state->entry->fields[field->offset].as_u16 = (u16)val;
+ break;
+
+ case 4:
+- *(u32 *)&trace_state->entry->fields[field->offset] = (u32)val;
++ trace_state->entry->fields[field->offset].as_u32 = (u32)val;
+ break;
+
+ default:
+- trace_state->entry->fields[field->offset] = val;
++ trace_state->entry->fields[field->offset].as_u64 = val;
+ break;
+ }
+ }
+diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
+index 590b3d51afae9..ba37f768e2f27 100644
+--- a/kernel/trace/trace_irqsoff.c
++++ b/kernel/trace/trace_irqsoff.c
+@@ -231,7 +231,8 @@ static void irqsoff_trace_open(struct trace_iterator *iter)
+ {
+ if (is_graph(iter->tr))
+ graph_trace_open(iter);
+-
++ else
++ iter->private = NULL;
+ }
+
+ static void irqsoff_trace_close(struct trace_iterator *iter)
+diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
+index 330aee1c1a49e..0469a04a355f2 100644
+--- a/kernel/trace/trace_sched_wakeup.c
++++ b/kernel/trace/trace_sched_wakeup.c
+@@ -168,6 +168,8 @@ static void wakeup_trace_open(struct trace_iterator *iter)
+ {
+ if (is_graph(iter->tr))
+ graph_trace_open(iter);
++ else
++ iter->private = NULL;
+ }
+
+ static void wakeup_trace_close(struct trace_iterator *iter)
+diff --git a/lib/clz_ctz.c b/lib/clz_ctz.c
+index 0d3a686b5ba29..fb8c0c5c2bd27 100644
+--- a/lib/clz_ctz.c
++++ b/lib/clz_ctz.c
+@@ -28,36 +28,16 @@ int __weak __clzsi2(int val)
+ }
+ EXPORT_SYMBOL(__clzsi2);
+
+-int __weak __clzdi2(long val);
+-int __weak __ctzdi2(long val);
+-#if BITS_PER_LONG == 32
+-
+-int __weak __clzdi2(long val)
++int __weak __clzdi2(u64 val);
++int __weak __clzdi2(u64 val)
+ {
+- return 32 - fls((int)val);
++ return 64 - fls64(val);
+ }
+ EXPORT_SYMBOL(__clzdi2);
+
+-int __weak __ctzdi2(long val)
++int __weak __ctzdi2(u64 val);
++int __weak __ctzdi2(u64 val)
+ {
+- return __ffs((u32)val);
++ return __ffs64(val);
+ }
+ EXPORT_SYMBOL(__ctzdi2);
+-
+-#elif BITS_PER_LONG == 64
+-
+-int __weak __clzdi2(long val)
+-{
+- return 64 - fls64((u64)val);
+-}
+-EXPORT_SYMBOL(__clzdi2);
+-
+-int __weak __ctzdi2(long val)
+-{
+- return __ffs64((u64)val);
+-}
+-EXPORT_SYMBOL(__ctzdi2);
+-
+-#else
+-#error BITS_PER_LONG not 32 or 64
+-#endif
+diff --git a/lib/maple_tree.c b/lib/maple_tree.c
+index bb28a49d173c0..3315eaf93f563 100644
+--- a/lib/maple_tree.c
++++ b/lib/maple_tree.c
+@@ -4315,6 +4315,9 @@ static inline bool mas_wr_append(struct ma_wr_state *wr_mas)
+ struct ma_state *mas = wr_mas->mas;
+ unsigned char node_pivots = mt_pivots[wr_mas->type];
+
++ if (mt_in_rcu(mas->tree))
++ return false;
++
+ if ((mas->index != wr_mas->r_min) && (mas->last == wr_mas->r_max)) {
+ if (new_end < node_pivots)
+ wr_mas->pivots[new_end] = wr_mas->pivots[end];
+diff --git a/lib/radix-tree.c b/lib/radix-tree.c
+index 1a31065b2036a..976b9bd02a1b5 100644
+--- a/lib/radix-tree.c
++++ b/lib/radix-tree.c
+@@ -1136,7 +1136,6 @@ static void set_iter_tags(struct radix_tree_iter *iter,
+ void __rcu **radix_tree_iter_resume(void __rcu **slot,
+ struct radix_tree_iter *iter)
+ {
+- slot++;
+ iter->index = __radix_tree_iter_add(iter, 1);
+ iter->next_index = iter->index;
+ iter->tags = 0;
+diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
+index 37994fb6120cb..81068829a7e3c 100644
+--- a/mm/damon/vaddr.c
++++ b/mm/damon/vaddr.c
+@@ -384,6 +384,7 @@ out:
+ static const struct mm_walk_ops damon_mkold_ops = {
+ .pmd_entry = damon_mkold_pmd_entry,
+ .hugetlb_entry = damon_mkold_hugetlb_entry,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static void damon_va_mkold(struct mm_struct *mm, unsigned long addr)
+@@ -519,6 +520,7 @@ out:
+ static const struct mm_walk_ops damon_young_ops = {
+ .pmd_entry = damon_young_pmd_entry,
+ .hugetlb_entry = damon_young_hugetlb_entry,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static bool damon_va_young(struct mm_struct *mm, unsigned long addr,
+diff --git a/mm/gup.c b/mm/gup.c
+index e3e6c473bbc16..cdffc0edc20d0 100644
+--- a/mm/gup.c
++++ b/mm/gup.c
+@@ -551,7 +551,7 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
+ pte = *ptep;
+ if (!pte_present(pte))
+ goto no_page;
+- if (pte_protnone(pte) && !gup_can_follow_protnone(flags))
++ if (pte_protnone(pte) && !gup_can_follow_protnone(vma, flags))
+ goto no_page;
+
+ page = vm_normal_page(vma, address, pte);
+@@ -672,7 +672,7 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
+ if (likely(!pmd_trans_huge(pmdval)))
+ return follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
+
+- if (pmd_protnone(pmdval) && !gup_can_follow_protnone(flags))
++ if (pmd_protnone(pmdval) && !gup_can_follow_protnone(vma, flags))
+ return no_page_table(vma, flags);
+
+ ptl = pmd_lock(mm, pmd);
+@@ -820,6 +820,10 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
+ if (WARN_ON_ONCE(foll_flags & FOLL_PIN))
+ return NULL;
+
++ /*
++ * We never set FOLL_HONOR_NUMA_FAULT because callers don't expect
++ * to fail on PROT_NONE-mapped pages.
++ */
+ page = follow_page_mask(vma, address, foll_flags, &ctx);
+ if (ctx.pgmap)
+ put_dev_pagemap(ctx.pgmap);
+@@ -2134,6 +2138,13 @@ static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
+ gup_flags |= FOLL_UNLOCKABLE;
+ }
+
++ /*
++ * For now, always trigger NUMA hinting faults. Some GUP users like
++ * KVM require the hint to be as the calling context of GUP is
++ * functionally similar to a memory reference from task context.
++ */
++ gup_flags |= FOLL_HONOR_NUMA_FAULT;
++
+ /* FOLL_GET and FOLL_PIN are mutually exclusive. */
+ if (WARN_ON_ONCE((gup_flags & (FOLL_PIN | FOLL_GET)) ==
+ (FOLL_PIN | FOLL_GET)))
+@@ -2394,7 +2405,14 @@ static int gup_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+ struct page *page;
+ struct folio *folio;
+
+- if (pte_protnone(pte) && !gup_can_follow_protnone(flags))
++ /*
++ * Always fallback to ordinary GUP on PROT_NONE-mapped pages:
++ * pte_access_permitted() better should reject these pages
++ * either way: otherwise, GUP-fast might succeed in
++ * cases where ordinary GUP would fail due to VMA access
++ * permissions.
++ */
++ if (pte_protnone(pte))
+ goto pte_unmap;
+
+ if (!pte_access_permitted(pte, flags & FOLL_WRITE))
+@@ -2784,8 +2802,8 @@ static int gup_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, unsigned lo
+
+ if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
+ pmd_devmap(pmd))) {
+- if (pmd_protnone(pmd) &&
+- !gup_can_follow_protnone(flags))
++ /* See gup_pte_range() */
++ if (pmd_protnone(pmd))
+ return 0;
+
+ if (!gup_huge_pmd(pmd, pmdp, addr, next, flags,
+@@ -2965,7 +2983,7 @@ static int internal_get_user_pages_fast(unsigned long start,
+ if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
+ FOLL_FORCE | FOLL_PIN | FOLL_GET |
+ FOLL_FAST_ONLY | FOLL_NOFAULT |
+- FOLL_PCI_P2PDMA)))
++ FOLL_PCI_P2PDMA | FOLL_HONOR_NUMA_FAULT)))
+ return -EINVAL;
+
+ if (gup_flags & FOLL_PIN)
+diff --git a/mm/hmm.c b/mm/hmm.c
+index 6a151c09de5ee..a334c9bf00143 100644
+--- a/mm/hmm.c
++++ b/mm/hmm.c
+@@ -560,6 +560,7 @@ static const struct mm_walk_ops hmm_walk_ops = {
+ .pte_hole = hmm_vma_walk_hole,
+ .hugetlb_entry = hmm_vma_walk_hugetlb_entry,
+ .test_walk = hmm_vma_walk_test,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ /**
+diff --git a/mm/huge_memory.c b/mm/huge_memory.c
+index 624671aaa60d0..4231a720a02c8 100644
+--- a/mm/huge_memory.c
++++ b/mm/huge_memory.c
+@@ -1467,8 +1467,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
+ if ((flags & FOLL_DUMP) && is_huge_zero_pmd(*pmd))
+ return ERR_PTR(-EFAULT);
+
+- /* Full NUMA hinting faults to serialise migration in fault paths */
+- if (pmd_protnone(*pmd) && !gup_can_follow_protnone(flags))
++ if (pmd_protnone(*pmd) && !gup_can_follow_protnone(vma, flags))
+ return NULL;
+
+ if (!pmd_write(*pmd) && gup_must_unshare(vma, flags, page))
+diff --git a/mm/internal.h b/mm/internal.h
+index 68410c6d97aca..8a2b57134b970 100644
+--- a/mm/internal.h
++++ b/mm/internal.h
+@@ -994,6 +994,16 @@ static inline bool gup_must_unshare(struct vm_area_struct *vma,
+ if (IS_ENABLED(CONFIG_HAVE_FAST_GUP))
+ smp_rmb();
+
++ /*
++ * During GUP-fast we might not get called on the head page for a
++ * hugetlb page that is mapped using cont-PTE, because GUP-fast does
++ * not work with the abstracted hugetlb PTEs that always point at the
++ * head page. For hugetlb, PageAnonExclusive only applies on the head
++ * page (as it cannot be partially COW-shared), so lookup the head page.
++ */
++ if (unlikely(!PageHead(page) && PageHuge(page)))
++ page = compound_head(page);
++
+ /*
+ * Note that PageKsm() pages cannot be exclusive, and consequently,
+ * cannot get pinned.
+diff --git a/mm/ksm.c b/mm/ksm.c
+index 0156bded3a66c..8a8462037f5e8 100644
+--- a/mm/ksm.c
++++ b/mm/ksm.c
+@@ -454,6 +454,12 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
+
+ static const struct mm_walk_ops break_ksm_ops = {
+ .pmd_entry = break_ksm_pmd_entry,
++ .walk_lock = PGWALK_RDLOCK,
++};
++
++static const struct mm_walk_ops break_ksm_lock_vma_ops = {
++ .pmd_entry = break_ksm_pmd_entry,
++ .walk_lock = PGWALK_WRLOCK,
+ };
+
+ /*
+@@ -469,16 +475,17 @@ static const struct mm_walk_ops break_ksm_ops = {
+ * of the process that owns 'vma'. We also do not want to enforce
+ * protection keys here anyway.
+ */
+-static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
++static int break_ksm(struct vm_area_struct *vma, unsigned long addr, bool lock_vma)
+ {
+ vm_fault_t ret = 0;
++ const struct mm_walk_ops *ops = lock_vma ?
++ &break_ksm_lock_vma_ops : &break_ksm_ops;
+
+ do {
+ int ksm_page;
+
+ cond_resched();
+- ksm_page = walk_page_range_vma(vma, addr, addr + 1,
+- &break_ksm_ops, NULL);
++ ksm_page = walk_page_range_vma(vma, addr, addr + 1, ops, NULL);
+ if (WARN_ON_ONCE(ksm_page < 0))
+ return ksm_page;
+ if (!ksm_page)
+@@ -564,7 +571,7 @@ static void break_cow(struct ksm_rmap_item *rmap_item)
+ mmap_read_lock(mm);
+ vma = find_mergeable_vma(mm, addr);
+ if (vma)
+- break_ksm(vma, addr);
++ break_ksm(vma, addr, false);
+ mmap_read_unlock(mm);
+ }
+
+@@ -870,7 +877,7 @@ static void remove_trailing_rmap_items(struct ksm_rmap_item **rmap_list)
+ * in cmp_and_merge_page on one of the rmap_items we would be removing.
+ */
+ static int unmerge_ksm_pages(struct vm_area_struct *vma,
+- unsigned long start, unsigned long end)
++ unsigned long start, unsigned long end, bool lock_vma)
+ {
+ unsigned long addr;
+ int err = 0;
+@@ -881,7 +888,7 @@ static int unmerge_ksm_pages(struct vm_area_struct *vma,
+ if (signal_pending(current))
+ err = -ERESTARTSYS;
+ else
+- err = break_ksm(vma, addr);
++ err = break_ksm(vma, addr, lock_vma);
+ }
+ return err;
+ }
+@@ -1028,7 +1035,7 @@ static int unmerge_and_remove_all_rmap_items(void)
+ if (!(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
+ continue;
+ err = unmerge_ksm_pages(vma,
+- vma->vm_start, vma->vm_end);
++ vma->vm_start, vma->vm_end, false);
+ if (err)
+ goto error;
+ }
+@@ -2528,7 +2535,7 @@ static int __ksm_del_vma(struct vm_area_struct *vma)
+ return 0;
+
+ if (vma->anon_vma) {
+- err = unmerge_ksm_pages(vma, vma->vm_start, vma->vm_end);
++ err = unmerge_ksm_pages(vma, vma->vm_start, vma->vm_end, true);
+ if (err)
+ return err;
+ }
+@@ -2666,7 +2673,7 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
+ return 0; /* just ignore the advice */
+
+ if (vma->anon_vma) {
+- err = unmerge_ksm_pages(vma, start, end);
++ err = unmerge_ksm_pages(vma, start, end, true);
+ if (err)
+ return err;
+ }
+diff --git a/mm/madvise.c b/mm/madvise.c
+index b5ffbaf616f51..a3f72d551b5fa 100644
+--- a/mm/madvise.c
++++ b/mm/madvise.c
+@@ -227,6 +227,7 @@ static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
+
+ static const struct mm_walk_ops swapin_walk_ops = {
+ .pmd_entry = swapin_walk_pmd_entry,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static void force_shm_swapin_readahead(struct vm_area_struct *vma,
+@@ -375,7 +376,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
+ folio = pfn_folio(pmd_pfn(orig_pmd));
+
+ /* Do not interfere with other mappings of this folio */
+- if (folio_mapcount(folio) != 1)
++ if (folio_estimated_sharers(folio) != 1)
+ goto huge_unlock;
+
+ if (pageout_anon_only_filter && !folio_test_anon(folio))
+@@ -447,7 +448,7 @@ regular_folio:
+ * are sure it's worth. Split it if we are only owner.
+ */
+ if (folio_test_large(folio)) {
+- if (folio_mapcount(folio) != 1)
++ if (folio_estimated_sharers(folio) != 1)
+ break;
+ if (pageout_anon_only_filter && !folio_test_anon(folio))
+ break;
+@@ -521,6 +522,7 @@ regular_folio:
+
+ static const struct mm_walk_ops cold_walk_ops = {
+ .pmd_entry = madvise_cold_or_pageout_pte_range,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static void madvise_cold_page_range(struct mmu_gather *tlb,
+@@ -664,8 +666,8 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
+ * deactivate all pages.
+ */
+ if (folio_test_large(folio)) {
+- if (folio_mapcount(folio) != 1)
+- goto out;
++ if (folio_estimated_sharers(folio) != 1)
++ break;
+ folio_get(folio);
+ if (!folio_trylock(folio)) {
+ folio_put(folio);
+@@ -741,6 +743,7 @@ next:
+
+ static const struct mm_walk_ops madvise_free_walk_ops = {
+ .pmd_entry = madvise_free_pte_range,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static int madvise_free_single_vma(struct vm_area_struct *vma,
+diff --git a/mm/memcontrol.c b/mm/memcontrol.c
+index c823c35c2ed46..cfacd9ceccf66 100644
+--- a/mm/memcontrol.c
++++ b/mm/memcontrol.c
+@@ -6072,6 +6072,7 @@ static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
+
+ static const struct mm_walk_ops precharge_walk_ops = {
+ .pmd_entry = mem_cgroup_count_precharge_pte_range,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static unsigned long mem_cgroup_count_precharge(struct mm_struct *mm)
+@@ -6351,6 +6352,7 @@ put: /* get_mctgt_type() gets & locks the page */
+
+ static const struct mm_walk_ops charge_walk_ops = {
+ .pmd_entry = mem_cgroup_move_charge_pte_range,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ static void mem_cgroup_move_charge(void)
+diff --git a/mm/memory-failure.c b/mm/memory-failure.c
+index 244dbfe075a25..3d75a25d9a22c 100644
+--- a/mm/memory-failure.c
++++ b/mm/memory-failure.c
+@@ -836,6 +836,7 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
+ static const struct mm_walk_ops hwp_walk_ops = {
+ .pmd_entry = hwpoison_pte_range,
+ .hugetlb_entry = hwpoison_hugetlb_range,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ /*
+@@ -2743,10 +2744,13 @@ retry:
+ if (ret > 0) {
+ ret = soft_offline_in_use_page(page);
+ } else if (ret == 0) {
+- if (!page_handle_poison(page, true, false) && try_again) {
+- try_again = false;
+- flags &= ~MF_COUNT_INCREASED;
+- goto retry;
++ if (!page_handle_poison(page, true, false)) {
++ if (try_again) {
++ try_again = false;
++ flags &= ~MF_COUNT_INCREASED;
++ goto retry;
++ }
++ ret = -EBUSY;
+ }
+ }
+
+diff --git a/mm/mempolicy.c b/mm/mempolicy.c
+index d524bf8d0e90c..bf9159fb7428c 100644
+--- a/mm/mempolicy.c
++++ b/mm/mempolicy.c
+@@ -715,6 +715,14 @@ static const struct mm_walk_ops queue_pages_walk_ops = {
+ .hugetlb_entry = queue_folios_hugetlb,
+ .pmd_entry = queue_folios_pte_range,
+ .test_walk = queue_pages_test_walk,
++ .walk_lock = PGWALK_RDLOCK,
++};
++
++static const struct mm_walk_ops queue_pages_lock_vma_walk_ops = {
++ .hugetlb_entry = queue_folios_hugetlb,
++ .pmd_entry = queue_folios_pte_range,
++ .test_walk = queue_pages_test_walk,
++ .walk_lock = PGWALK_WRLOCK,
+ };
+
+ /*
+@@ -735,7 +743,7 @@ static const struct mm_walk_ops queue_pages_walk_ops = {
+ static int
+ queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
+ nodemask_t *nodes, unsigned long flags,
+- struct list_head *pagelist)
++ struct list_head *pagelist, bool lock_vma)
+ {
+ int err;
+ struct queue_pages qp = {
+@@ -746,8 +754,10 @@ queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
+ .end = end,
+ .first = NULL,
+ };
++ const struct mm_walk_ops *ops = lock_vma ?
++ &queue_pages_lock_vma_walk_ops : &queue_pages_walk_ops;
+
+- err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
++ err = walk_page_range(mm, start, end, ops, &qp);
+
+ if (!qp.first)
+ /* whole range in hole */
+@@ -1075,7 +1085,7 @@ static int migrate_to_node(struct mm_struct *mm, int source, int dest,
+ vma = find_vma(mm, 0);
+ VM_BUG_ON(!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)));
+ queue_pages_range(mm, vma->vm_start, mm->task_size, &nmask,
+- flags | MPOL_MF_DISCONTIG_OK, &pagelist);
++ flags | MPOL_MF_DISCONTIG_OK, &pagelist, false);
+
+ if (!list_empty(&pagelist)) {
+ err = migrate_pages(&pagelist, alloc_migration_target, NULL,
+@@ -1321,12 +1331,8 @@ static long do_mbind(unsigned long start, unsigned long len,
+ * Lock the VMAs before scanning for pages to migrate, to ensure we don't
+ * miss a concurrently inserted page.
+ */
+- vma_iter_init(&vmi, mm, start);
+- for_each_vma_range(vmi, vma, end)
+- vma_start_write(vma);
+-
+ ret = queue_pages_range(mm, start, end, nmask,
+- flags | MPOL_MF_INVERT, &pagelist);
++ flags | MPOL_MF_INVERT, &pagelist, true);
+
+ if (ret < 0) {
+ err = ret;
+diff --git a/mm/migrate_device.c b/mm/migrate_device.c
+index d30c9de60b0d7..dae008fec8058 100644
+--- a/mm/migrate_device.c
++++ b/mm/migrate_device.c
+@@ -286,6 +286,7 @@ next:
+ static const struct mm_walk_ops migrate_vma_walk_ops = {
+ .pmd_entry = migrate_vma_collect_pmd,
+ .pte_hole = migrate_vma_collect_hole,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ /*
+diff --git a/mm/mincore.c b/mm/mincore.c
+index 2d5be013a25a0..9750b15f66942 100644
+--- a/mm/mincore.c
++++ b/mm/mincore.c
+@@ -177,6 +177,7 @@ static const struct mm_walk_ops mincore_walk_ops = {
+ .pmd_entry = mincore_pte_range,
+ .pte_hole = mincore_unmapped_range,
+ .hugetlb_entry = mincore_hugetlb,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ /*
+diff --git a/mm/mlock.c b/mm/mlock.c
+index 39e03a37f0a98..62da0b798a137 100644
+--- a/mm/mlock.c
++++ b/mm/mlock.c
+@@ -365,6 +365,7 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
+ {
+ static const struct mm_walk_ops mlock_walk_ops = {
+ .pmd_entry = mlock_pte_range,
++ .walk_lock = PGWALK_WRLOCK_VERIFY,
+ };
+
+ /*
+diff --git a/mm/mprotect.c b/mm/mprotect.c
+index c59e7561698c8..7db20085667a8 100644
+--- a/mm/mprotect.c
++++ b/mm/mprotect.c
+@@ -611,6 +611,7 @@ static const struct mm_walk_ops prot_none_walk_ops = {
+ .pte_entry = prot_none_pte_entry,
+ .hugetlb_entry = prot_none_hugetlb_entry,
+ .test_walk = prot_none_test,
++ .walk_lock = PGWALK_WRLOCK,
+ };
+
+ int
+diff --git a/mm/pagewalk.c b/mm/pagewalk.c
+index cb23f8a15c134..cb7791d013268 100644
+--- a/mm/pagewalk.c
++++ b/mm/pagewalk.c
+@@ -384,6 +384,33 @@ static int __walk_page_range(unsigned long start, unsigned long end,
+ return err;
+ }
+
++static inline void process_mm_walk_lock(struct mm_struct *mm,
++ enum page_walk_lock walk_lock)
++{
++ if (walk_lock == PGWALK_RDLOCK)
++ mmap_assert_locked(mm);
++ else
++ mmap_assert_write_locked(mm);
++}
++
++static inline void process_vma_walk_lock(struct vm_area_struct *vma,
++ enum page_walk_lock walk_lock)
++{
++#ifdef CONFIG_PER_VMA_LOCK
++ switch (walk_lock) {
++ case PGWALK_WRLOCK:
++ vma_start_write(vma);
++ break;
++ case PGWALK_WRLOCK_VERIFY:
++ vma_assert_write_locked(vma);
++ break;
++ case PGWALK_RDLOCK:
++ /* PGWALK_RDLOCK is handled by process_mm_walk_lock */
++ break;
++ }
++#endif
++}
++
+ /**
+ * walk_page_range - walk page table with caller specific callbacks
+ * @mm: mm_struct representing the target process of page table walk
+@@ -443,7 +470,7 @@ int walk_page_range(struct mm_struct *mm, unsigned long start,
+ if (!walk.mm)
+ return -EINVAL;
+
+- mmap_assert_locked(walk.mm);
++ process_mm_walk_lock(walk.mm, ops->walk_lock);
+
+ vma = find_vma(walk.mm, start);
+ do {
+@@ -458,6 +485,7 @@ int walk_page_range(struct mm_struct *mm, unsigned long start,
+ if (ops->pte_hole)
+ err = ops->pte_hole(start, next, -1, &walk);
+ } else { /* inside vma */
++ process_vma_walk_lock(vma, ops->walk_lock);
+ walk.vma = vma;
+ next = min(end, vma->vm_end);
+ vma = find_vma(mm, vma->vm_end);
+@@ -533,7 +561,8 @@ int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start,
+ if (start < vma->vm_start || end > vma->vm_end)
+ return -EINVAL;
+
+- mmap_assert_locked(walk.mm);
++ process_mm_walk_lock(walk.mm, ops->walk_lock);
++ process_vma_walk_lock(vma, ops->walk_lock);
+ return __walk_page_range(start, end, &walk);
+ }
+
+@@ -550,7 +579,8 @@ int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops,
+ if (!walk.mm)
+ return -EINVAL;
+
+- mmap_assert_locked(walk.mm);
++ process_mm_walk_lock(walk.mm, ops->walk_lock);
++ process_vma_walk_lock(vma, ops->walk_lock);
+ return __walk_page_range(vma->vm_start, vma->vm_end, &walk);
+ }
+
+diff --git a/mm/shmem.c b/mm/shmem.c
+index 74abb97ea557b..fe208a072e594 100644
+--- a/mm/shmem.c
++++ b/mm/shmem.c
+@@ -806,14 +806,16 @@ unsigned long shmem_partial_swap_usage(struct address_space *mapping,
+ XA_STATE(xas, &mapping->i_pages, start);
+ struct page *page;
+ unsigned long swapped = 0;
++ unsigned long max = end - 1;
+
+ rcu_read_lock();
+- xas_for_each(&xas, page, end - 1) {
++ xas_for_each(&xas, page, max) {
+ if (xas_retry(&xas, page))
+ continue;
+ if (xa_is_value(page))
+ swapped++;
+-
++ if (xas.xa_index == max)
++ break;
+ if (need_resched()) {
+ xas_pause(&xas);
+ cond_resched_rcu();
+diff --git a/mm/vmalloc.c b/mm/vmalloc.c
+index 1d13d71687d73..73a0077ee3afc 100644
+--- a/mm/vmalloc.c
++++ b/mm/vmalloc.c
+@@ -2929,6 +2929,10 @@ void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot)
+ free_vm_area(area);
+ return NULL;
+ }
++
++ flush_cache_vmap((unsigned long)area->addr,
++ (unsigned long)area->addr + count * PAGE_SIZE);
++
+ return area->addr;
+ }
+ EXPORT_SYMBOL_GPL(vmap_pfn);
+diff --git a/mm/vmscan.c b/mm/vmscan.c
+index 6114a1fc6c688..7ff3389c677f9 100644
+--- a/mm/vmscan.c
++++ b/mm/vmscan.c
+@@ -4249,6 +4249,7 @@ static void walk_mm(struct lruvec *lruvec, struct mm_struct *mm, struct lru_gen_
+ static const struct mm_walk_ops mm_walk_ops = {
+ .test_walk = should_skip_vma,
+ .p4d_entry = walk_pud_range,
++ .walk_lock = PGWALK_RDLOCK,
+ };
+
+ int err;
+@@ -4817,16 +4818,17 @@ void lru_gen_release_memcg(struct mem_cgroup *memcg)
+
+ spin_lock_irq(&pgdat->memcg_lru.lock);
+
+- VM_WARN_ON_ONCE(hlist_nulls_unhashed(&lruvec->lrugen.list));
++ if (hlist_nulls_unhashed(&lruvec->lrugen.list))
++ goto unlock;
+
+ gen = lruvec->lrugen.gen;
+
+- hlist_nulls_del_rcu(&lruvec->lrugen.list);
++ hlist_nulls_del_init_rcu(&lruvec->lrugen.list);
+ pgdat->memcg_lru.nr_memcgs[gen]--;
+
+ if (!pgdat->memcg_lru.nr_memcgs[gen] && gen == get_memcg_gen(pgdat->memcg_lru.seq))
+ WRITE_ONCE(pgdat->memcg_lru.seq, pgdat->memcg_lru.seq + 1);
+-
++unlock:
+ spin_unlock_irq(&pgdat->memcg_lru.lock);
+ }
+ }
+@@ -5397,8 +5399,10 @@ restart:
+ rcu_read_lock();
+
+ hlist_nulls_for_each_entry_rcu(lrugen, pos, &pgdat->memcg_lru.fifo[gen][bin], list) {
+- if (op)
++ if (op) {
+ lru_gen_rotate_memcg(lruvec, op);
++ op = 0;
++ }
+
+ mem_cgroup_put(memcg);
+
+@@ -5406,7 +5410,7 @@ restart:
+ memcg = lruvec_memcg(lruvec);
+
+ if (!mem_cgroup_tryget(memcg)) {
+- op = 0;
++ lru_gen_release_memcg(memcg);
+ memcg = NULL;
+ continue;
+ }
+diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
+index acff565849ae9..1d704574e6bf5 100644
+--- a/net/batman-adv/bat_v_elp.c
++++ b/net/batman-adv/bat_v_elp.c
+@@ -505,7 +505,7 @@ int batadv_v_elp_packet_recv(struct sk_buff *skb,
+ struct batadv_priv *bat_priv = netdev_priv(if_incoming->soft_iface);
+ struct batadv_elp_packet *elp_packet;
+ struct batadv_hard_iface *primary_if;
+- struct ethhdr *ethhdr = (struct ethhdr *)skb_mac_header(skb);
++ struct ethhdr *ethhdr;
+ bool res;
+ int ret = NET_RX_DROP;
+
+@@ -513,6 +513,7 @@ int batadv_v_elp_packet_recv(struct sk_buff *skb,
+ if (!res)
+ goto free_skb;
+
++ ethhdr = eth_hdr(skb);
+ if (batadv_is_my_mac(bat_priv, ethhdr->h_source))
+ goto free_skb;
+
+diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
+index e710e9afe78f3..e503ee0d896bd 100644
+--- a/net/batman-adv/bat_v_ogm.c
++++ b/net/batman-adv/bat_v_ogm.c
+@@ -123,8 +123,10 @@ static void batadv_v_ogm_send_to_if(struct sk_buff *skb,
+ {
+ struct batadv_priv *bat_priv = netdev_priv(hard_iface->soft_iface);
+
+- if (hard_iface->if_status != BATADV_IF_ACTIVE)
++ if (hard_iface->if_status != BATADV_IF_ACTIVE) {
++ kfree_skb(skb);
+ return;
++ }
+
+ batadv_inc_counter(bat_priv, BATADV_CNT_MGMT_TX);
+ batadv_add_counter(bat_priv, BATADV_CNT_MGMT_TX_BYTES,
+@@ -985,7 +987,7 @@ int batadv_v_ogm_packet_recv(struct sk_buff *skb,
+ {
+ struct batadv_priv *bat_priv = netdev_priv(if_incoming->soft_iface);
+ struct batadv_ogm2_packet *ogm_packet;
+- struct ethhdr *ethhdr = eth_hdr(skb);
++ struct ethhdr *ethhdr;
+ int ogm_offset;
+ u8 *packet_pos;
+ int ret = NET_RX_DROP;
+@@ -999,6 +1001,7 @@ int batadv_v_ogm_packet_recv(struct sk_buff *skb,
+ if (!batadv_check_management_packet(skb, if_incoming, BATADV_OGM2_HLEN))
+ goto free_skb;
+
++ ethhdr = eth_hdr(skb);
+ if (batadv_is_my_mac(bat_priv, ethhdr->h_source))
+ goto free_skb;
+
+diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
+index 41c1ad33d009f..24c9c0c3f3166 100644
+--- a/net/batman-adv/hard-interface.c
++++ b/net/batman-adv/hard-interface.c
+@@ -630,7 +630,19 @@ out:
+ */
+ void batadv_update_min_mtu(struct net_device *soft_iface)
+ {
+- soft_iface->mtu = batadv_hardif_min_mtu(soft_iface);
++ struct batadv_priv *bat_priv = netdev_priv(soft_iface);
++ int limit_mtu;
++ int mtu;
++
++ mtu = batadv_hardif_min_mtu(soft_iface);
++
++ if (bat_priv->mtu_set_by_user)
++ limit_mtu = bat_priv->mtu_set_by_user;
++ else
++ limit_mtu = ETH_DATA_LEN;
++
++ mtu = min(mtu, limit_mtu);
++ dev_set_mtu(soft_iface, mtu);
+
+ /* Check if the local translate table should be cleaned up to match a
+ * new (and smaller) MTU.
+diff --git a/net/batman-adv/netlink.c b/net/batman-adv/netlink.c
+index ad5714f737be2..6efbc9275aec2 100644
+--- a/net/batman-adv/netlink.c
++++ b/net/batman-adv/netlink.c
+@@ -495,7 +495,10 @@ static int batadv_netlink_set_mesh(struct sk_buff *skb, struct genl_info *info)
+ attr = info->attrs[BATADV_ATTR_FRAGMENTATION_ENABLED];
+
+ atomic_set(&bat_priv->fragmentation, !!nla_get_u8(attr));
++
++ rtnl_lock();
+ batadv_update_min_mtu(bat_priv->soft_iface);
++ rtnl_unlock();
+ }
+
+ if (info->attrs[BATADV_ATTR_GW_BANDWIDTH_DOWN]) {
+diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
+index d3fdf82282afe..85d00dc9ce32c 100644
+--- a/net/batman-adv/soft-interface.c
++++ b/net/batman-adv/soft-interface.c
+@@ -153,11 +153,14 @@ static int batadv_interface_set_mac_addr(struct net_device *dev, void *p)
+
+ static int batadv_interface_change_mtu(struct net_device *dev, int new_mtu)
+ {
++ struct batadv_priv *bat_priv = netdev_priv(dev);
++
+ /* check ranges */
+ if (new_mtu < 68 || new_mtu > batadv_hardif_min_mtu(dev))
+ return -EINVAL;
+
+ dev->mtu = new_mtu;
++ bat_priv->mtu_set_by_user = new_mtu;
+
+ return 0;
+ }
+diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
+index 36ca31252a733..b95c36765d045 100644
+--- a/net/batman-adv/translation-table.c
++++ b/net/batman-adv/translation-table.c
+@@ -774,7 +774,6 @@ check_roaming:
+ if (roamed_back) {
+ batadv_tt_global_free(bat_priv, tt_global,
+ "Roaming canceled");
+- tt_global = NULL;
+ } else {
+ /* The global entry has to be marked as ROAMING and
+ * has to be kept for consistency purpose
+diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
+index ca9449ec9836a..cf1a0eafe3abc 100644
+--- a/net/batman-adv/types.h
++++ b/net/batman-adv/types.h
+@@ -1546,6 +1546,12 @@ struct batadv_priv {
+ /** @soft_iface: net device which holds this struct as private data */
+ struct net_device *soft_iface;
+
++ /**
++ * @mtu_set_by_user: MTU was set once by user
++ * protected by rtnl_lock
++ */
++ int mtu_set_by_user;
++
+ /**
+ * @bat_counters: mesh internal traffic statistic counters (see
+ * batadv_counters)
+diff --git a/net/can/isotp.c b/net/can/isotp.c
+index ca9d728d6d727..9d498a886a586 100644
+--- a/net/can/isotp.c
++++ b/net/can/isotp.c
+@@ -188,12 +188,6 @@ static bool isotp_register_rxid(struct isotp_sock *so)
+ return (isotp_bc_flags(so) == 0);
+ }
+
+-static bool isotp_register_txecho(struct isotp_sock *so)
+-{
+- /* all modes but SF_BROADCAST register for tx echo skbs */
+- return (isotp_bc_flags(so) != CAN_ISOTP_SF_BROADCAST);
+-}
+-
+ static enum hrtimer_restart isotp_rx_timer_handler(struct hrtimer *hrtimer)
+ {
+ struct isotp_sock *so = container_of(hrtimer, struct isotp_sock,
+@@ -1209,7 +1203,7 @@ static int isotp_release(struct socket *sock)
+ lock_sock(sk);
+
+ /* remove current filters & unregister */
+- if (so->bound && isotp_register_txecho(so)) {
++ if (so->bound) {
+ if (so->ifindex) {
+ struct net_device *dev;
+
+@@ -1332,14 +1326,12 @@ static int isotp_bind(struct socket *sock, struct sockaddr *uaddr, int len)
+ can_rx_register(net, dev, rx_id, SINGLE_MASK(rx_id),
+ isotp_rcv, sk, "isotp", sk);
+
+- if (isotp_register_txecho(so)) {
+- /* no consecutive frame echo skb in flight */
+- so->cfecho = 0;
++ /* no consecutive frame echo skb in flight */
++ so->cfecho = 0;
+
+- /* register for echo skb's */
+- can_rx_register(net, dev, tx_id, SINGLE_MASK(tx_id),
+- isotp_rcv_echo, sk, "isotpe", sk);
+- }
++ /* register for echo skb's */
++ can_rx_register(net, dev, tx_id, SINGLE_MASK(tx_id),
++ isotp_rcv_echo, sk, "isotpe", sk);
+
+ dev_put(dev);
+
+@@ -1560,7 +1552,7 @@ static void isotp_notify(struct isotp_sock *so, unsigned long msg,
+ case NETDEV_UNREGISTER:
+ lock_sock(sk);
+ /* remove current filters & unregister */
+- if (so->bound && isotp_register_txecho(so)) {
++ if (so->bound) {
+ if (isotp_register_rxid(so))
+ can_rx_unregister(dev_net(dev), dev, so->rxid,
+ SINGLE_MASK(so->rxid),
+diff --git a/net/can/raw.c b/net/can/raw.c
+index f8e3866157a33..174d16be0a95d 100644
+--- a/net/can/raw.c
++++ b/net/can/raw.c
+@@ -84,6 +84,8 @@ struct raw_sock {
+ struct sock sk;
+ int bound;
+ int ifindex;
++ struct net_device *dev;
++ netdevice_tracker dev_tracker;
+ struct list_head notifier;
+ int loopback;
+ int recv_own_msgs;
+@@ -277,21 +279,24 @@ static void raw_notify(struct raw_sock *ro, unsigned long msg,
+ if (!net_eq(dev_net(dev), sock_net(sk)))
+ return;
+
+- if (ro->ifindex != dev->ifindex)
++ if (ro->dev != dev)
+ return;
+
+ switch (msg) {
+ case NETDEV_UNREGISTER:
+ lock_sock(sk);
+ /* remove current filters & unregister */
+- if (ro->bound)
++ if (ro->bound) {
+ raw_disable_allfilters(dev_net(dev), dev, sk);
++ netdev_put(dev, &ro->dev_tracker);
++ }
+
+ if (ro->count > 1)
+ kfree(ro->filter);
+
+ ro->ifindex = 0;
+ ro->bound = 0;
++ ro->dev = NULL;
+ ro->count = 0;
+ release_sock(sk);
+
+@@ -337,6 +342,7 @@ static int raw_init(struct sock *sk)
+
+ ro->bound = 0;
+ ro->ifindex = 0;
++ ro->dev = NULL;
+
+ /* set default filter to single entry dfilter */
+ ro->dfilter.can_id = 0;
+@@ -383,18 +389,14 @@ static int raw_release(struct socket *sock)
+ list_del(&ro->notifier);
+ spin_unlock(&raw_notifier_lock);
+
++ rtnl_lock();
+ lock_sock(sk);
+
+ /* remove current filters & unregister */
+ if (ro->bound) {
+- if (ro->ifindex) {
+- struct net_device *dev;
+-
+- dev = dev_get_by_index(sock_net(sk), ro->ifindex);
+- if (dev) {
+- raw_disable_allfilters(dev_net(dev), dev, sk);
+- dev_put(dev);
+- }
++ if (ro->dev) {
++ raw_disable_allfilters(dev_net(ro->dev), ro->dev, sk);
++ netdev_put(ro->dev, &ro->dev_tracker);
+ } else {
+ raw_disable_allfilters(sock_net(sk), NULL, sk);
+ }
+@@ -405,6 +407,7 @@ static int raw_release(struct socket *sock)
+
+ ro->ifindex = 0;
+ ro->bound = 0;
++ ro->dev = NULL;
+ ro->count = 0;
+ free_percpu(ro->uniq);
+
+@@ -412,6 +415,8 @@ static int raw_release(struct socket *sock)
+ sock->sk = NULL;
+
+ release_sock(sk);
++ rtnl_unlock();
++
+ sock_put(sk);
+
+ return 0;
+@@ -422,6 +427,7 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len)
+ struct sockaddr_can *addr = (struct sockaddr_can *)uaddr;
+ struct sock *sk = sock->sk;
+ struct raw_sock *ro = raw_sk(sk);
++ struct net_device *dev = NULL;
+ int ifindex;
+ int err = 0;
+ int notify_enetdown = 0;
+@@ -431,24 +437,23 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len)
+ if (addr->can_family != AF_CAN)
+ return -EINVAL;
+
++ rtnl_lock();
+ lock_sock(sk);
+
+ if (ro->bound && addr->can_ifindex == ro->ifindex)
+ goto out;
+
+ if (addr->can_ifindex) {
+- struct net_device *dev;
+-
+ dev = dev_get_by_index(sock_net(sk), addr->can_ifindex);
+ if (!dev) {
+ err = -ENODEV;
+ goto out;
+ }
+ if (dev->type != ARPHRD_CAN) {
+- dev_put(dev);
+ err = -ENODEV;
+- goto out;
++ goto out_put_dev;
+ }
++
+ if (!(dev->flags & IFF_UP))
+ notify_enetdown = 1;
+
+@@ -456,7 +461,9 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len)
+
+ /* filters set by default/setsockopt */
+ err = raw_enable_allfilters(sock_net(sk), dev, sk);
+- dev_put(dev);
++ if (err)
++ goto out_put_dev;
++
+ } else {
+ ifindex = 0;
+
+@@ -467,26 +474,30 @@ static int raw_bind(struct socket *sock, struct sockaddr *uaddr, int len)
+ if (!err) {
+ if (ro->bound) {
+ /* unregister old filters */
+- if (ro->ifindex) {
+- struct net_device *dev;
+-
+- dev = dev_get_by_index(sock_net(sk),
+- ro->ifindex);
+- if (dev) {
+- raw_disable_allfilters(dev_net(dev),
+- dev, sk);
+- dev_put(dev);
+- }
++ if (ro->dev) {
++ raw_disable_allfilters(dev_net(ro->dev),
++ ro->dev, sk);
++ /* drop reference to old ro->dev */
++ netdev_put(ro->dev, &ro->dev_tracker);
+ } else {
+ raw_disable_allfilters(sock_net(sk), NULL, sk);
+ }
+ }
+ ro->ifindex = ifindex;
+ ro->bound = 1;
++ /* bind() ok -> hold a reference for new ro->dev */
++ ro->dev = dev;
++ if (ro->dev)
++ netdev_hold(ro->dev, &ro->dev_tracker, GFP_KERNEL);
+ }
+
+- out:
++out_put_dev:
++ /* remove potential reference from dev_get_by_index() */
++ if (dev)
++ dev_put(dev);
++out:
+ release_sock(sk);
++ rtnl_unlock();
+
+ if (notify_enetdown) {
+ sk->sk_err = ENETDOWN;
+@@ -553,9 +564,9 @@ static int raw_setsockopt(struct socket *sock, int level, int optname,
+ rtnl_lock();
+ lock_sock(sk);
+
+- if (ro->bound && ro->ifindex) {
+- dev = dev_get_by_index(sock_net(sk), ro->ifindex);
+- if (!dev) {
++ dev = ro->dev;
++ if (ro->bound && dev) {
++ if (dev->reg_state != NETREG_REGISTERED) {
+ if (count > 1)
+ kfree(filter);
+ err = -ENODEV;
+@@ -596,7 +607,6 @@ static int raw_setsockopt(struct socket *sock, int level, int optname,
+ ro->count = count;
+
+ out_fil:
+- dev_put(dev);
+ release_sock(sk);
+ rtnl_unlock();
+
+@@ -614,9 +624,9 @@ static int raw_setsockopt(struct socket *sock, int level, int optname,
+ rtnl_lock();
+ lock_sock(sk);
+
+- if (ro->bound && ro->ifindex) {
+- dev = dev_get_by_index(sock_net(sk), ro->ifindex);
+- if (!dev) {
++ dev = ro->dev;
++ if (ro->bound && dev) {
++ if (dev->reg_state != NETREG_REGISTERED) {
+ err = -ENODEV;
+ goto out_err;
+ }
+@@ -640,7 +650,6 @@ static int raw_setsockopt(struct socket *sock, int level, int optname,
+ ro->err_mask = err_mask;
+
+ out_err:
+- dev_put(dev);
+ release_sock(sk);
+ rtnl_unlock();
+
+diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
+index aa1743b2b770b..fd6d2430d40ff 100644
+--- a/net/core/rtnetlink.c
++++ b/net/core/rtnetlink.c
+@@ -2268,13 +2268,27 @@ out_err:
+ return err;
+ }
+
+-int rtnl_nla_parse_ifla(struct nlattr **tb, const struct nlattr *head, int len,
+- struct netlink_ext_ack *exterr)
++int rtnl_nla_parse_ifinfomsg(struct nlattr **tb, const struct nlattr *nla_peer,
++ struct netlink_ext_ack *exterr)
+ {
+- return nla_parse_deprecated(tb, IFLA_MAX, head, len, ifla_policy,
++ const struct ifinfomsg *ifmp;
++ const struct nlattr *attrs;
++ size_t len;
++
++ ifmp = nla_data(nla_peer);
++ attrs = nla_data(nla_peer) + sizeof(struct ifinfomsg);
++ len = nla_len(nla_peer) - sizeof(struct ifinfomsg);
++
++ if (ifmp->ifi_index < 0) {
++ NL_SET_ERR_MSG_ATTR(exterr, nla_peer,
++ "ifindex can't be negative");
++ return -EINVAL;
++ }
++
++ return nla_parse_deprecated(tb, IFLA_MAX, attrs, len, ifla_policy,
+ exterr);
+ }
+-EXPORT_SYMBOL(rtnl_nla_parse_ifla);
++EXPORT_SYMBOL(rtnl_nla_parse_ifinfomsg);
+
+ struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[])
+ {
+@@ -3546,6 +3560,9 @@ replay:
+ if (ifm->ifi_index > 0) {
+ link_specified = true;
+ dev = __dev_get_by_index(net, ifm->ifi_index);
++ } else if (ifm->ifi_index < 0) {
++ NL_SET_ERR_MSG(extack, "ifindex can't be negative");
++ return -EINVAL;
+ } else if (tb[IFLA_IFNAME] || tb[IFLA_ALT_IFNAME]) {
+ link_specified = true;
+ dev = rtnl_dev_get(net, tb);
+diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
+index 3ab68415d121c..e7b9703bd1a1a 100644
+--- a/net/dccp/ipv4.c
++++ b/net/dccp/ipv4.c
+@@ -130,7 +130,7 @@ int dccp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
+ inet->inet_daddr,
+ inet->inet_sport,
+ inet->inet_dport);
+- inet->inet_id = get_random_u16();
++ atomic_set(&inet->inet_id, get_random_u16());
+
+ err = dccp_connect(sk);
+ rt = NULL;
+@@ -432,7 +432,7 @@ struct sock *dccp_v4_request_recv_sock(const struct sock *sk,
+ RCU_INIT_POINTER(newinet->inet_opt, rcu_dereference(ireq->ireq_opt));
+ newinet->mc_index = inet_iif(skb);
+ newinet->mc_ttl = ip_hdr(skb)->ttl;
+- newinet->inet_id = get_random_u16();
++ atomic_set(&newinet->inet_id, get_random_u16());
+
+ if (dst == NULL && (dst = inet_csk_route_child_sock(sk, newsk, req)) == NULL)
+ goto put_and_exit;
+diff --git a/net/dccp/proto.c b/net/dccp/proto.c
+index 18873f2308ec8..f3494cb5fab04 100644
+--- a/net/dccp/proto.c
++++ b/net/dccp/proto.c
+@@ -315,11 +315,15 @@ EXPORT_SYMBOL_GPL(dccp_disconnect);
+ __poll_t dccp_poll(struct file *file, struct socket *sock,
+ poll_table *wait)
+ {
+- __poll_t mask;
+ struct sock *sk = sock->sk;
++ __poll_t mask;
++ u8 shutdown;
++ int state;
+
+ sock_poll_wait(file, sock, wait);
+- if (sk->sk_state == DCCP_LISTEN)
++
++ state = inet_sk_state_load(sk);
++ if (state == DCCP_LISTEN)
+ return inet_csk_listen_poll(sk);
+
+ /* Socket is not locked. We are protected from async events
+@@ -328,20 +332,21 @@ __poll_t dccp_poll(struct file *file, struct socket *sock,
+ */
+
+ mask = 0;
+- if (sk->sk_err)
++ if (READ_ONCE(sk->sk_err))
+ mask = EPOLLERR;
++ shutdown = READ_ONCE(sk->sk_shutdown);
+
+- if (sk->sk_shutdown == SHUTDOWN_MASK || sk->sk_state == DCCP_CLOSED)
++ if (shutdown == SHUTDOWN_MASK || state == DCCP_CLOSED)
+ mask |= EPOLLHUP;
+- if (sk->sk_shutdown & RCV_SHUTDOWN)
++ if (shutdown & RCV_SHUTDOWN)
+ mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;
+
+ /* Connected? */
+- if ((1 << sk->sk_state) & ~(DCCPF_REQUESTING | DCCPF_RESPOND)) {
++ if ((1 << state) & ~(DCCPF_REQUESTING | DCCPF_RESPOND)) {
+ if (atomic_read(&sk->sk_rmem_alloc) > 0)
+ mask |= EPOLLIN | EPOLLRDNORM;
+
+- if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
++ if (!(shutdown & SEND_SHUTDOWN)) {
+ if (sk_stream_is_writeable(sk)) {
+ mask |= EPOLLOUT | EPOLLWRNORM;
+ } else { /* send SIGIO later */
+@@ -359,7 +364,6 @@ __poll_t dccp_poll(struct file *file, struct socket *sock,
+ }
+ return mask;
+ }
+-
+ EXPORT_SYMBOL_GPL(dccp_poll);
+
+ int dccp_ioctl(struct sock *sk, int cmd, unsigned long arg)
+diff --git a/net/devlink/leftover.c b/net/devlink/leftover.c
+index 790e61b2a9404..6ef6090eeffe5 100644
+--- a/net/devlink/leftover.c
++++ b/net/devlink/leftover.c
+@@ -6739,6 +6739,7 @@ void devlink_notify_unregister(struct devlink *devlink)
+ struct devlink_param_item *param_item;
+ struct devlink_trap_item *trap_item;
+ struct devlink_port *devlink_port;
++ struct devlink_linecard *linecard;
+ struct devlink_rate *rate_node;
+ struct devlink_region *region;
+ unsigned long port_index;
+@@ -6767,6 +6768,8 @@ void devlink_notify_unregister(struct devlink *devlink)
+
+ xa_for_each(&devlink->ports, port_index, devlink_port)
+ devlink_port_notify(devlink_port, DEVLINK_CMD_PORT_DEL);
++ list_for_each_entry_reverse(linecard, &devlink->linecard_list, list)
++ devlink_linecard_notify(linecard, DEVLINK_CMD_LINECARD_DEL);
+ devlink_notify(devlink, DEVLINK_CMD_DEL);
+ }
+
+diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
+index 10ebe39dcc873..9dde8e842befe 100644
+--- a/net/ipv4/af_inet.c
++++ b/net/ipv4/af_inet.c
+@@ -340,7 +340,7 @@ lookup_protocol:
+ else
+ inet->pmtudisc = IP_PMTUDISC_WANT;
+
+- inet->inet_id = 0;
++ atomic_set(&inet->inet_id, 0);
+
+ sock_init_data(sock, sk);
+
+diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
+index 4d1af0cd7d99e..cb5dbee9e018f 100644
+--- a/net/ipv4/datagram.c
++++ b/net/ipv4/datagram.c
+@@ -73,7 +73,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
+ reuseport_has_conns_set(sk);
+ sk->sk_state = TCP_ESTABLISHED;
+ sk_set_txhash(sk);
+- inet->inet_id = get_random_u16();
++ atomic_set(&inet->inet_id, get_random_u16());
+
+ sk_dst_set(sk, &rt->dst);
+ err = 0;
+diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
+index 498dd4acdeec8..caecb4d1e424a 100644
+--- a/net/ipv4/tcp_ipv4.c
++++ b/net/ipv4/tcp_ipv4.c
+@@ -312,7 +312,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
+ inet->inet_daddr));
+ }
+
+- inet->inet_id = get_random_u16();
++ atomic_set(&inet->inet_id, get_random_u16());
+
+ if (tcp_fastopen_defer_connect(sk, &err))
+ return err;
+@@ -1596,7 +1596,7 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
+ inet_csk(newsk)->icsk_ext_hdr_len = 0;
+ if (inet_opt)
+ inet_csk(newsk)->icsk_ext_hdr_len = inet_opt->opt.optlen;
+- newinet->inet_id = get_random_u16();
++ atomic_set(&newinet->inet_id, get_random_u16());
+
+ /* Set ToS of the new socket based upon the value of incoming SYN.
+ * ECT bits are set later in tcp_init_transfer().
+diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
+index fc6e130364da1..3f316e52cbe43 100644
+--- a/net/mac80211/rx.c
++++ b/net/mac80211/rx.c
+@@ -1083,7 +1083,8 @@ static inline bool ieee80211_rx_reorder_ready(struct tid_ampdu_rx *tid_agg_rx,
+ struct sk_buff *tail = skb_peek_tail(frames);
+ struct ieee80211_rx_status *status;
+
+- if (tid_agg_rx->reorder_buf_filtered & BIT_ULL(index))
++ if (tid_agg_rx->reorder_buf_filtered &&
++ tid_agg_rx->reorder_buf_filtered & BIT_ULL(index))
+ return true;
+
+ if (!tail)
+@@ -1124,7 +1125,8 @@ static void ieee80211_release_reorder_frame(struct ieee80211_sub_if_data *sdata,
+ }
+
+ no_frame:
+- tid_agg_rx->reorder_buf_filtered &= ~BIT_ULL(index);
++ if (tid_agg_rx->reorder_buf_filtered)
++ tid_agg_rx->reorder_buf_filtered &= ~BIT_ULL(index);
+ tid_agg_rx->head_seq_num = ieee80211_sn_inc(tid_agg_rx->head_seq_num);
+ }
+
+@@ -4245,6 +4247,7 @@ void ieee80211_mark_rx_ba_filtered_frames(struct ieee80211_sta *pubsta, u8 tid,
+ u16 ssn, u64 filtered,
+ u16 received_mpdus)
+ {
++ struct ieee80211_local *local;
+ struct sta_info *sta;
+ struct tid_ampdu_rx *tid_agg_rx;
+ struct sk_buff_head frames;
+@@ -4262,6 +4265,11 @@ void ieee80211_mark_rx_ba_filtered_frames(struct ieee80211_sta *pubsta, u8 tid,
+
+ sta = container_of(pubsta, struct sta_info, sta);
+
++ local = sta->sdata->local;
++ WARN_ONCE(local->hw.max_rx_aggregation_subframes > 64,
++ "RX BA marker can't support max_rx_aggregation_subframes %u > 64\n",
++ local->hw.max_rx_aggregation_subframes);
++
+ if (!ieee80211_rx_data_set_sta(&rx, sta, -1))
+ return;
+
+diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
+index b280b151a9e98..ad38f84a8f11a 100644
+--- a/net/netfilter/nf_tables_api.c
++++ b/net/netfilter/nf_tables_api.c
+@@ -1372,7 +1372,7 @@ static int nf_tables_newtable(struct sk_buff *skb, const struct nfnl_info *info,
+ if (table == NULL)
+ goto err_kzalloc;
+
+- table->validate_state = NFT_VALIDATE_SKIP;
++ table->validate_state = nft_net->validate_state;
+ table->name = nla_strdup(attr, GFP_KERNEL_ACCOUNT);
+ if (table->name == NULL)
+ goto err_strdup;
+@@ -9065,9 +9065,8 @@ static int nf_tables_validate(struct net *net)
+ return -EAGAIN;
+
+ nft_validate_state_update(table, NFT_VALIDATE_SKIP);
++ break;
+ }
+-
+- break;
+ }
+
+ return 0;
+@@ -9471,9 +9470,9 @@ static void nft_trans_gc_work(struct work_struct *work)
+ struct nft_trans_gc *trans, *next;
+ LIST_HEAD(trans_gc_list);
+
+- spin_lock(&nf_tables_destroy_list_lock);
++ spin_lock(&nf_tables_gc_list_lock);
+ list_splice_init(&nf_tables_gc_list, &trans_gc_list);
+- spin_unlock(&nf_tables_destroy_list_lock);
++ spin_unlock(&nf_tables_gc_list_lock);
+
+ list_for_each_entry_safe(trans, next, &trans_gc_list, list) {
+ list_del(&trans->list);
+@@ -9813,8 +9812,10 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ }
+
+ /* 0. Validate ruleset, otherwise roll back for error reporting. */
+- if (nf_tables_validate(net) < 0)
++ if (nf_tables_validate(net) < 0) {
++ nft_net->validate_state = NFT_VALIDATE_DO;
+ return -EAGAIN;
++ }
+
+ err = nft_flow_rule_offload_commit(net);
+ if (err < 0)
+@@ -10070,6 +10071,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
+ nf_tables_commit_audit_log(&adl, nft_net->base_seq);
+
+ nft_gc_seq_end(nft_net, gc_seq);
++ nft_net->validate_state = NFT_VALIDATE_SKIP;
+ nf_tables_commit_release(net);
+
+ return 0;
+@@ -10346,8 +10348,12 @@ static int nf_tables_abort(struct net *net, struct sk_buff *skb,
+ enum nfnl_abort_action action)
+ {
+ struct nftables_pernet *nft_net = nft_pernet(net);
+- int ret = __nf_tables_abort(net, action);
++ unsigned int gc_seq;
++ int ret;
+
++ gc_seq = nft_gc_seq_begin(nft_net);
++ ret = __nf_tables_abort(net, action);
++ nft_gc_seq_end(nft_net, gc_seq);
+ mutex_unlock(&nft_net->commit_mutex);
+
+ return ret;
+@@ -11082,7 +11088,7 @@ static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event,
+ gc_seq = nft_gc_seq_begin(nft_net);
+
+ if (!list_empty(&nf_tables_destroy_list))
+- rcu_barrier();
++ nf_tables_trans_destroy_flush_work();
+ again:
+ list_for_each_entry(table, &nft_net->tables, list) {
+ if (nft_table_has_owner(table) &&
+@@ -11126,6 +11132,7 @@ static int __net_init nf_tables_init_net(struct net *net)
+ mutex_init(&nft_net->commit_mutex);
+ nft_net->base_seq = 1;
+ nft_net->gc_seq = 0;
++ nft_net->validate_state = NFT_VALIDATE_SKIP;
+
+ return 0;
+ }
+diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
+index cef5df8460009..524763659f251 100644
+--- a/net/netfilter/nft_set_hash.c
++++ b/net/netfilter/nft_set_hash.c
+@@ -326,6 +326,9 @@ static void nft_rhash_gc(struct work_struct *work)
+ nft_net = nft_pernet(net);
+ gc_seq = READ_ONCE(nft_net->gc_seq);
+
++ if (nft_set_gc_is_pending(set))
++ goto done;
++
+ gc = nft_trans_gc_alloc(set, gc_seq, GFP_KERNEL);
+ if (!gc)
+ goto done;
+diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
+index 352180b123fc7..a9da4683b8c53 100644
+--- a/net/netfilter/nft_set_pipapo.c
++++ b/net/netfilter/nft_set_pipapo.c
+@@ -902,12 +902,14 @@ static void pipapo_lt_bits_adjust(struct nft_pipapo_field *f)
+ static int pipapo_insert(struct nft_pipapo_field *f, const uint8_t *k,
+ int mask_bits)
+ {
+- int rule = f->rules++, group, ret, bit_offset = 0;
++ int rule = f->rules, group, ret, bit_offset = 0;
+
+- ret = pipapo_resize(f, f->rules - 1, f->rules);
++ ret = pipapo_resize(f, f->rules, f->rules + 1);
+ if (ret)
+ return ret;
+
++ f->rules++;
++
+ for (group = 0; group < f->groups; group++) {
+ int i, v;
+ u8 mask;
+@@ -1052,7 +1054,9 @@ static int pipapo_expand(struct nft_pipapo_field *f,
+ step++;
+ if (step >= len) {
+ if (!masks) {
+- pipapo_insert(f, base, 0);
++ err = pipapo_insert(f, base, 0);
++ if (err < 0)
++ return err;
+ masks = 1;
+ }
+ goto out;
+@@ -1235,6 +1239,9 @@ static int nft_pipapo_insert(const struct net *net, const struct nft_set *set,
+ else
+ ret = pipapo_expand(f, start, end, f->groups * f->bb);
+
++ if (ret < 0)
++ return ret;
++
+ if (f->bsize > bsize_max)
+ bsize_max = f->bsize;
+
+@@ -1543,7 +1550,7 @@ static void nft_pipapo_gc_deactivate(struct net *net, struct nft_set *set,
+
+ /**
+ * pipapo_gc() - Drop expired entries from set, destroy start and end elements
+- * @set: nftables API set representation
++ * @_set: nftables API set representation
+ * @m: Matching data
+ */
+ static void pipapo_gc(const struct nft_set *_set, struct nft_pipapo_match *m)
+diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
+index f9d4c8fcbbf82..c6435e7092319 100644
+--- a/net/netfilter/nft_set_rbtree.c
++++ b/net/netfilter/nft_set_rbtree.c
+@@ -611,6 +611,9 @@ static void nft_rbtree_gc(struct work_struct *work)
+ nft_net = nft_pernet(net);
+ gc_seq = READ_ONCE(nft_net->gc_seq);
+
++ if (nft_set_gc_is_pending(set))
++ goto done;
++
+ gc = nft_trans_gc_alloc(set, gc_seq, GFP_KERNEL);
+ if (!gc)
+ goto done;
+diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
+index aa6b1fe651519..e9eaf637220e9 100644
+--- a/net/sched/sch_api.c
++++ b/net/sched/sch_api.c
+@@ -1547,10 +1547,28 @@ static int tc_get_qdisc(struct sk_buff *skb, struct nlmsghdr *n,
+ return 0;
+ }
+
++static bool req_create_or_replace(struct nlmsghdr *n)
++{
++ return (n->nlmsg_flags & NLM_F_CREATE &&
++ n->nlmsg_flags & NLM_F_REPLACE);
++}
++
++static bool req_create_exclusive(struct nlmsghdr *n)
++{
++ return (n->nlmsg_flags & NLM_F_CREATE &&
++ n->nlmsg_flags & NLM_F_EXCL);
++}
++
++static bool req_change(struct nlmsghdr *n)
++{
++ return (!(n->nlmsg_flags & NLM_F_CREATE) &&
++ !(n->nlmsg_flags & NLM_F_REPLACE) &&
++ !(n->nlmsg_flags & NLM_F_EXCL));
++}
++
+ /*
+ * Create/change qdisc.
+ */
+-
+ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n,
+ struct netlink_ext_ack *extack)
+ {
+@@ -1644,27 +1662,35 @@ replay:
+ *
+ * We know, that some child q is already
+ * attached to this parent and have choice:
+- * either to change it or to create/graft new one.
++ * 1) change it or 2) create/graft new one.
++ * If the requested qdisc kind is different
++ * than the existing one, then we choose graft.
++ * If they are the same then this is "change"
++ * operation - just let it fallthrough..
+ *
+ * 1. We are allowed to create/graft only
+- * if CREATE and REPLACE flags are set.
++ * if the request is explicitly stating
++ * "please create if it doesn't exist".
+ *
+- * 2. If EXCL is set, requestor wanted to say,
+- * that qdisc tcm_handle is not expected
++ * 2. If the request is to exclusive create
++ * then the qdisc tcm_handle is not expected
+ * to exist, so that we choose create/graft too.
+ *
+ * 3. The last case is when no flags are set.
++ * This will happen when for example tc
++ * utility issues a "change" command.
+ * Alas, it is sort of hole in API, we
+ * cannot decide what to do unambiguously.
+- * For now we select create/graft, if
+- * user gave KIND, which does not match existing.
++ * For now we select create/graft.
+ */
+- if ((n->nlmsg_flags & NLM_F_CREATE) &&
+- (n->nlmsg_flags & NLM_F_REPLACE) &&
+- ((n->nlmsg_flags & NLM_F_EXCL) ||
+- (tca[TCA_KIND] &&
+- nla_strcmp(tca[TCA_KIND], q->ops->id))))
+- goto create_n_graft;
++ if (tca[TCA_KIND] &&
++ nla_strcmp(tca[TCA_KIND], q->ops->id)) {
++ if (req_create_or_replace(n) ||
++ req_create_exclusive(n))
++ goto create_n_graft;
++ else if (req_change(n))
++ goto create_n_graft2;
++ }
+ }
+ }
+ } else {
+@@ -1698,6 +1724,7 @@ create_n_graft:
+ NL_SET_ERR_MSG(extack, "Qdisc not found. To create specify NLM_F_CREATE flag");
+ return -ENOENT;
+ }
++create_n_graft2:
+ if (clid == TC_H_INGRESS) {
+ if (dev_ingress_queue(dev)) {
+ q = qdisc_create(dev, dev_ingress_queue(dev),
+diff --git a/net/sctp/socket.c b/net/sctp/socket.c
+index ee15eff6364ee..d77561d97a1ed 100644
+--- a/net/sctp/socket.c
++++ b/net/sctp/socket.c
+@@ -99,7 +99,7 @@ struct percpu_counter sctp_sockets_allocated;
+
+ static void sctp_enter_memory_pressure(struct sock *sk)
+ {
+- sctp_memory_pressure = 1;
++ WRITE_ONCE(sctp_memory_pressure, 1);
+ }
+
+
+@@ -9479,7 +9479,7 @@ void sctp_copy_sock(struct sock *newsk, struct sock *sk,
+ newinet->inet_rcv_saddr = inet->inet_rcv_saddr;
+ newinet->inet_dport = htons(asoc->peer.port);
+ newinet->pmtudisc = inet->pmtudisc;
+- newinet->inet_id = get_random_u16();
++ atomic_set(&newinet->inet_id, get_random_u16());
+
+ newinet->uc_ttl = inet->uc_ttl;
+ newinet->mc_loop = 1;
+diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
+index b098fde373abf..28c0771c4e8c3 100644
+--- a/net/sunrpc/xprtrdma/verbs.c
++++ b/net/sunrpc/xprtrdma/verbs.c
+@@ -935,9 +935,6 @@ struct rpcrdma_rep *rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt,
+ if (!rep->rr_rdmabuf)
+ goto out_free;
+
+- if (!rpcrdma_regbuf_dma_map(r_xprt, rep->rr_rdmabuf))
+- goto out_free_regbuf;
+-
+ rep->rr_cid.ci_completion_id =
+ atomic_inc_return(&r_xprt->rx_ep->re_completion_ids);
+
+@@ -956,8 +953,6 @@ struct rpcrdma_rep *rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt,
+ spin_unlock(&buf->rb_lock);
+ return rep;
+
+-out_free_regbuf:
+- rpcrdma_regbuf_free(rep->rr_rdmabuf);
+ out_free:
+ kfree(rep);
+ out:
+@@ -1363,6 +1358,10 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp)
+ rep = rpcrdma_rep_create(r_xprt, temp);
+ if (!rep)
+ break;
++ if (!rpcrdma_regbuf_dma_map(r_xprt, rep->rr_rdmabuf)) {
++ rpcrdma_rep_put(buf, rep);
++ break;
++ }
+
+ rep->rr_cid.ci_queue_id = ep->re_attr.recv_cq->res.id;
+ trace_xprtrdma_post_recv(rep);
+diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
+index adcfb63b3550d..6f9ff4643dcbc 100644
+--- a/security/selinux/ss/policydb.c
++++ b/security/selinux/ss/policydb.c
+@@ -2005,6 +2005,7 @@ static int filename_trans_read_helper(struct policydb *p, void *fp)
+ if (!datum)
+ goto out;
+
++ datum->next = NULL;
+ *dst = datum;
+
+ /* ebitmap_read() will at least init the bitmap */
+@@ -2017,7 +2018,6 @@ static int filename_trans_read_helper(struct policydb *p, void *fp)
+ goto out;
+
+ datum->otype = le32_to_cpu(buf[0]);
+- datum->next = NULL;
+
+ dst = &datum->next;
+ }
+diff --git a/sound/pci/ymfpci/ymfpci.c b/sound/pci/ymfpci/ymfpci.c
+index b033bd2909405..48444dda44def 100644
+--- a/sound/pci/ymfpci/ymfpci.c
++++ b/sound/pci/ymfpci/ymfpci.c
+@@ -152,8 +152,8 @@ static inline int snd_ymfpci_create_gameport(struct snd_ymfpci *chip, int dev, i
+ void snd_ymfpci_free_gameport(struct snd_ymfpci *chip) { }
+ #endif /* SUPPORT_JOYSTICK */
+
+-static int snd_card_ymfpci_probe(struct pci_dev *pci,
+- const struct pci_device_id *pci_id)
++static int __snd_card_ymfpci_probe(struct pci_dev *pci,
++ const struct pci_device_id *pci_id)
+ {
+ static int dev;
+ struct snd_card *card;
+@@ -348,6 +348,12 @@ static int snd_card_ymfpci_probe(struct pci_dev *pci,
+ return 0;
+ }
+
++static int snd_card_ymfpci_probe(struct pci_dev *pci,
++ const struct pci_device_id *pci_id)
++{
++ return snd_card_free_on_error(&pci->dev, __snd_card_ymfpci_probe(pci, pci_id));
++}
++
+ static struct pci_driver ymfpci_driver = {
+ .name = KBUILD_MODNAME,
+ .id_table = snd_ymfpci_ids,
+diff --git a/sound/soc/amd/Kconfig b/sound/soc/amd/Kconfig
+index e724cb3c70b74..57d5e342a8eb4 100644
+--- a/sound/soc/amd/Kconfig
++++ b/sound/soc/amd/Kconfig
+@@ -71,6 +71,7 @@ config SND_SOC_AMD_RENOIR_MACH
+ config SND_SOC_AMD_ACP5x
+ tristate "AMD Audio Coprocessor-v5.x I2S support"
+ depends on X86 && PCI
++ select SND_AMD_ACP_CONFIG
+ help
+ This option enables ACP v5.x support on AMD platform
+
+diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c
+index 246299a178f9b..5310ba0734b14 100644
+--- a/sound/soc/amd/yc/acp6x-mach.c
++++ b/sound/soc/amd/yc/acp6x-mach.c
+@@ -217,7 +217,7 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
+ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+- DMI_MATCH(DMI_PRODUCT_NAME, "82"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "82V2"),
+ }
+ },
+ {
+@@ -248,6 +248,13 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "M3402RA"),
+ }
+ },
++ {
++ .driver_data = &acp6x_card,
++ .matches = {
++ DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC."),
++ DMI_MATCH(DMI_PRODUCT_NAME, "M6500RC"),
++ }
++ },
+ {
+ .driver_data = &acp6x_card,
+ .matches = {
+diff --git a/sound/soc/codecs/cs35l41.c b/sound/soc/codecs/cs35l41.c
+index 6ac501f008eca..8a879b6f48290 100644
+--- a/sound/soc/codecs/cs35l41.c
++++ b/sound/soc/codecs/cs35l41.c
+@@ -168,7 +168,7 @@ static int cs35l41_get_fs_mon_config_index(int freq)
+ static const DECLARE_TLV_DB_RANGE(dig_vol_tlv,
+ 0, 0, TLV_DB_SCALE_ITEM(TLV_DB_GAIN_MUTE, 0, 1),
+ 1, 913, TLV_DB_MINMAX_ITEM(-10200, 1200));
+-static DECLARE_TLV_DB_SCALE(amp_gain_tlv, 0, 1, 1);
++static DECLARE_TLV_DB_SCALE(amp_gain_tlv, 50, 100, 0);
+
+ static const struct snd_kcontrol_new dre_ctrl =
+ SOC_DAPM_SINGLE("Switch", CS35L41_PWR_CTRL3, 20, 1, 0);
+diff --git a/sound/soc/codecs/cs35l56.c b/sound/soc/codecs/cs35l56.c
+index f3fee448d759e..6a2b0797f3c7d 100644
+--- a/sound/soc/codecs/cs35l56.c
++++ b/sound/soc/codecs/cs35l56.c
+@@ -5,7 +5,6 @@
+ // Copyright (C) 2023 Cirrus Logic, Inc. and
+ // Cirrus Logic International Semiconductor Ltd.
+
+-#include <linux/acpi.h>
+ #include <linux/completion.h>
+ #include <linux/debugfs.h>
+ #include <linux/delay.h>
+@@ -1327,26 +1326,22 @@ static int cs35l56_dsp_init(struct cs35l56_private *cs35l56)
+ return 0;
+ }
+
+-static int cs35l56_acpi_get_name(struct cs35l56_private *cs35l56)
++static int cs35l56_get_firmware_uid(struct cs35l56_private *cs35l56)
+ {
+- acpi_handle handle = ACPI_HANDLE(cs35l56->dev);
+- const char *sub;
++ struct device *dev = cs35l56->dev;
++ const char *prop;
++ int ret;
+
+- /* If there is no ACPI_HANDLE, there is no ACPI for this system, return 0 */
+- if (!handle)
++ ret = device_property_read_string(dev, "cirrus,firmware-uid", &prop);
++ /* If bad sw node property, return 0 and fallback to legacy firmware path */
++ if (ret < 0)
+ return 0;
+
+- sub = acpi_get_subsystem_id(handle);
+- if (IS_ERR(sub)) {
+- /* If bad ACPI, return 0 and fallback to legacy firmware path, otherwise fail */
+- if (PTR_ERR(sub) == -ENODATA)
+- return 0;
+- else
+- return PTR_ERR(sub);
+- }
++ cs35l56->dsp.system_name = devm_kstrdup(dev, prop, GFP_KERNEL);
++ if (cs35l56->dsp.system_name == NULL)
++ return -ENOMEM;
+
+- cs35l56->dsp.system_name = sub;
+- dev_dbg(cs35l56->dev, "Subsystem ID: %s\n", cs35l56->dsp.system_name);
++ dev_dbg(dev, "Firmware UID: %s\n", cs35l56->dsp.system_name);
+
+ return 0;
+ }
+@@ -1390,7 +1385,7 @@ int cs35l56_common_probe(struct cs35l56_private *cs35l56)
+ gpiod_set_value_cansleep(cs35l56->reset_gpio, 1);
+ }
+
+- ret = cs35l56_acpi_get_name(cs35l56);
++ ret = cs35l56_get_firmware_uid(cs35l56);
+ if (ret != 0)
+ goto err;
+
+@@ -1577,8 +1572,6 @@ void cs35l56_remove(struct cs35l56_private *cs35l56)
+
+ regcache_cache_only(cs35l56->regmap, true);
+
+- kfree(cs35l56->dsp.system_name);
+-
+ gpiod_set_value_cansleep(cs35l56->reset_gpio, 0);
+ regulator_bulk_disable(ARRAY_SIZE(cs35l56->supplies), cs35l56->supplies);
+ }
+diff --git a/sound/soc/sof/ipc4-pcm.c b/sound/soc/sof/ipc4-pcm.c
+index 9e2b6c45080dd..49eb98605518a 100644
+--- a/sound/soc/sof/ipc4-pcm.c
++++ b/sound/soc/sof/ipc4-pcm.c
+@@ -708,6 +708,9 @@ static int sof_ipc4_pcm_hw_params(struct snd_soc_component *component,
+ struct snd_sof_pcm *spcm;
+
+ spcm = snd_sof_find_spcm_dai(component, rtd);
++ if (!spcm)
++ return -EINVAL;
++
+ time_info = spcm->stream[substream->stream].private;
+ /* delay calculation is not supported by current fw_reg ABI */
+ if (!time_info)
+diff --git a/tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh b/tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh
+index 47ab90596acb2..6358df5752f90 100755
+--- a/tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh
++++ b/tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh
+@@ -57,8 +57,8 @@ ip link add name veth2-bond type veth peer name veth2-end
+
+ # add ports
+ ip link set fbond master fab-br0
+-ip link set veth1-bond down master fbond
+-ip link set veth2-bond down master fbond
++ip link set veth1-bond master fbond
++ip link set veth2-bond master fbond
+
+ # bring up
+ ip link set veth1-end up
+diff --git a/tools/testing/selftests/drivers/net/mlxsw/sharedbuffer.sh b/tools/testing/selftests/drivers/net/mlxsw/sharedbuffer.sh
+index 7d9e73a43a49b..0c47faff9274b 100755
+--- a/tools/testing/selftests/drivers/net/mlxsw/sharedbuffer.sh
++++ b/tools/testing/selftests/drivers/net/mlxsw/sharedbuffer.sh
+@@ -98,12 +98,12 @@ sb_occ_etc_check()
+
+ port_pool_test()
+ {
+- local exp_max_occ=288
++ local exp_max_occ=$(devlink_cell_size_get)
+ local max_occ
+
+ devlink sb occupancy clearmax $DEVLINK_DEV
+
+- $MZ $h1 -c 1 -p 160 -a $h1mac -b $h2mac -A 192.0.1.1 -B 192.0.1.2 \
++ $MZ $h1 -c 1 -p 10 -a $h1mac -b $h2mac -A 192.0.1.1 -B 192.0.1.2 \
+ -t ip -q
+
+ devlink sb occupancy snapshot $DEVLINK_DEV
+@@ -126,12 +126,12 @@ port_pool_test()
+
+ port_tc_ip_test()
+ {
+- local exp_max_occ=288
++ local exp_max_occ=$(devlink_cell_size_get)
+ local max_occ
+
+ devlink sb occupancy clearmax $DEVLINK_DEV
+
+- $MZ $h1 -c 1 -p 160 -a $h1mac -b $h2mac -A 192.0.1.1 -B 192.0.1.2 \
++ $MZ $h1 -c 1 -p 10 -a $h1mac -b $h2mac -A 192.0.1.1 -B 192.0.1.2 \
+ -t ip -q
+
+ devlink sb occupancy snapshot $DEVLINK_DEV
+@@ -154,16 +154,12 @@ port_tc_ip_test()
+
+ port_tc_arp_test()
+ {
+- local exp_max_occ=96
++ local exp_max_occ=$(devlink_cell_size_get)
+ local max_occ
+
+- if [[ $MLXSW_CHIP != "mlxsw_spectrum" ]]; then
+- exp_max_occ=144
+- fi
+-
+ devlink sb occupancy clearmax $DEVLINK_DEV
+
+- $MZ $h1 -c 1 -p 160 -a $h1mac -A 192.0.1.1 -t arp -q
++ $MZ $h1 -c 1 -p 10 -a $h1mac -A 192.0.1.1 -t arp -q
+
+ devlink sb occupancy snapshot $DEVLINK_DEV
+
+diff --git a/tools/testing/selftests/mm/hmm-tests.c b/tools/testing/selftests/mm/hmm-tests.c
+index 4adaad1b822f0..20294553a5dd7 100644
+--- a/tools/testing/selftests/mm/hmm-tests.c
++++ b/tools/testing/selftests/mm/hmm-tests.c
+@@ -57,9 +57,14 @@ enum {
+
+ #define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1)))
+ /* Just the flags we need, copied from mm.h: */
++
++#ifndef FOLL_WRITE
+ #define FOLL_WRITE 0x01 /* check pte is writable */
+-#define FOLL_LONGTERM 0x10000 /* mapping lifetime is indefinite */
++#endif
+
++#ifndef FOLL_LONGTERM
++#define FOLL_LONGTERM 0x100 /* mapping lifetime is indefinite */
++#endif
+ FIXTURE(hmm)
+ {
+ int fd;
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-09-02 9:55 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-09-02 9:55 UTC (permalink / raw
To: gentoo-commits
commit: 8f4f9ecc0aab883f69fd21d1be936a91c1e70fb7
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Sat Sep 2 09:55:37 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Sat Sep 2 09:55:37 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=8f4f9ecc
Linux patch 6.4.14
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1013_linux-6.4.14.patch | 326 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 330 insertions(+)
diff --git a/0000_README b/0000_README
index 38b60f4b..8a165dad 100644
--- a/0000_README
+++ b/0000_README
@@ -95,6 +95,10 @@ Patch: 1012_linux-6.4.13.patch
From: https://www.kernel.org
Desc: Linux 6.4.13
+Patch: 1013_linux-6.4.14.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.14
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1013_linux-6.4.14.patch b/1013_linux-6.4.14.patch
new file mode 100644
index 00000000..428e62bd
--- /dev/null
+++ b/1013_linux-6.4.14.patch
@@ -0,0 +1,326 @@
+diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
+index 7323911931828..2b0cea8be7408 100644
+--- a/Documentation/admin-guide/kernel-parameters.txt
++++ b/Documentation/admin-guide/kernel-parameters.txt
+@@ -6240,10 +6240,6 @@
+ -1: disable all critical trip points in all thermal zones
+ <degrees C>: override all critical trip points
+
+- thermal.nocrt= [HW,ACPI]
+- Set to disable actions on ACPI thermal zone
+- critical and hot trip points.
+-
+ thermal.off= [HW,ACPI]
+ 1: disable ACPI thermal control
+
+diff --git a/Makefile b/Makefile
+index 900e515b87cf8..97611fe99c8f0 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 13
++SUBLEVEL = 14
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/arm/kernel/module-plts.c b/arch/arm/kernel/module-plts.c
+index f5a43fd8c1639..da2ee8d6ef1a7 100644
+--- a/arch/arm/kernel/module-plts.c
++++ b/arch/arm/kernel/module-plts.c
+@@ -251,7 +251,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+ /* sort by type and symbol index */
+ sort(rels, numrels, sizeof(Elf32_Rel), cmp_rel, NULL);
+
+- if (strncmp(secstrings + dstsec->sh_name, ".init", 5) != 0)
++ if (!module_init_layout_section(secstrings + dstsec->sh_name))
+ core_plts += count_plts(syms, dstsec->sh_addr, rels,
+ numrels, s->sh_info);
+ else
+diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
+index 543493bf924d2..bd69a4e7cd605 100644
+--- a/arch/arm64/kernel/module-plts.c
++++ b/arch/arm64/kernel/module-plts.c
+@@ -7,6 +7,7 @@
+ #include <linux/ftrace.h>
+ #include <linux/kernel.h>
+ #include <linux/module.h>
++#include <linux/moduleloader.h>
+ #include <linux/sort.h>
+
+ static struct plt_entry __get_adrp_add_pair(u64 dst, u64 pc,
+@@ -338,7 +339,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+ if (nents)
+ sort(rels, nents, sizeof(Elf64_Rela), cmp_rela, NULL);
+
+- if (!str_has_prefix(secstrings + dstsec->sh_name, ".init"))
++ if (!module_init_layout_section(secstrings + dstsec->sh_name))
+ core_plts += count_plts(syms, rels, numrels,
+ sechdrs[i].sh_info, dstsec);
+ else
+diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
+index 39acccabf2ede..9915062d5243c 100644
+--- a/arch/parisc/kernel/sys_parisc.c
++++ b/arch/parisc/kernel/sys_parisc.c
+@@ -24,6 +24,7 @@
+ #include <linux/personality.h>
+ #include <linux/random.h>
+ #include <linux/compat.h>
++#include <linux/elf-randomize.h>
+
+ /*
+ * Construct an artificial page offset for the mapping based on the physical
+@@ -339,7 +340,7 @@ asmlinkage long parisc_fallocate(int fd, int mode, u32 offhi, u32 offlo,
+ ((u64)lenhi << 32) | lenlo);
+ }
+
+-long parisc_personality(unsigned long personality)
++asmlinkage long parisc_personality(unsigned long personality)
+ {
+ long err;
+
+diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
+index a6e8373a5170f..3fa87e5e11aba 100644
+--- a/arch/x86/include/asm/sections.h
++++ b/arch/x86/include/asm/sections.h
+@@ -2,8 +2,6 @@
+ #ifndef _ASM_X86_SECTIONS_H
+ #define _ASM_X86_SECTIONS_H
+
+-#define arch_is_kernel_initmem_freed arch_is_kernel_initmem_freed
+-
+ #include <asm-generic/sections.h>
+ #include <asm/extable.h>
+
+@@ -18,20 +16,4 @@ extern char __end_of_kernel_reserve[];
+
+ extern unsigned long _brk_start, _brk_end;
+
+-static inline bool arch_is_kernel_initmem_freed(unsigned long addr)
+-{
+- /*
+- * If _brk_start has not been cleared, brk allocation is incomplete,
+- * and we can not make assumptions about its use.
+- */
+- if (_brk_start)
+- return 0;
+-
+- /*
+- * After brk allocation is complete, space between _brk_end and _end
+- * is available for allocation.
+- */
+- return addr >= _brk_end && addr < (unsigned long)&_end;
+-}
+-
+ #endif /* _ASM_X86_SECTIONS_H */
+diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
+index 4720a3649a61b..dd89c28fa7368 100644
+--- a/drivers/acpi/thermal.c
++++ b/drivers/acpi/thermal.c
+@@ -59,10 +59,6 @@ static int tzp;
+ module_param(tzp, int, 0444);
+ MODULE_PARM_DESC(tzp, "Thermal zone polling frequency, in 1/10 seconds.");
+
+-static int nocrt;
+-module_param(nocrt, int, 0);
+-MODULE_PARM_DESC(nocrt, "Set to take no action upon ACPI thermal zone critical trips points.");
+-
+ static int off;
+ module_param(off, int, 0);
+ MODULE_PARM_DESC(off, "Set to disable ACPI thermal support.");
+@@ -1143,7 +1139,7 @@ static int thermal_act(const struct dmi_system_id *d) {
+ static int thermal_nocrt(const struct dmi_system_id *d) {
+ pr_notice("%s detected: disabling all critical thermal trip point actions.\n",
+ d->ident);
+- nocrt = 1;
++ crt = -1;
+ return 0;
+ }
+ static int thermal_tzp(const struct dmi_system_id *d) {
+diff --git a/drivers/thunderbolt/tmu.c b/drivers/thunderbolt/tmu.c
+index d9544600b3867..49146f97bb16e 100644
+--- a/drivers/thunderbolt/tmu.c
++++ b/drivers/thunderbolt/tmu.c
+@@ -416,6 +416,7 @@ int tb_switch_tmu_disable(struct tb_switch *sw)
+ * mode.
+ */
+ ret = tb_switch_tmu_rate_write(sw, TB_SWITCH_TMU_RATE_OFF);
++ if (ret)
+ return ret;
+
+ tb_port_tmu_time_sync_disable(up);
+diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
+index 03be088fb4396..001b2ce83832e 100644
+--- a/include/linux/moduleloader.h
++++ b/include/linux/moduleloader.h
+@@ -42,6 +42,11 @@ bool module_init_section(const char *name);
+ */
+ bool module_exit_section(const char *name);
+
++/* Describes whether within_module_init() will consider this an init section
++ * or not. This behaviour changes with CONFIG_MODULE_UNLOAD.
++ */
++bool module_init_layout_section(const char *sname);
++
+ /*
+ * Apply the given relocation to the (simplified) ELF. Return -error
+ * or 0.
+diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
+index 4874508bb950e..7c3882b9133c8 100644
+--- a/kernel/kallsyms.c
++++ b/kernel/kallsyms.c
+@@ -188,16 +188,13 @@ static bool cleanup_symbol_name(char *s)
+
+ static int compare_symbol_name(const char *name, char *namebuf)
+ {
+- int ret;
+-
+- ret = strcmp(name, namebuf);
+- if (!ret)
+- return ret;
+-
+- if (cleanup_symbol_name(namebuf) && !strcmp(name, namebuf))
+- return 0;
+-
+- return ret;
++ /* The kallsyms_seqs_of_names is sorted based on names after
++ * cleanup_symbol_name() (see scripts/kallsyms.c) if clang lto is enabled.
++ * To ensure correct bisection in kallsyms_lookup_names(), do
++ * cleanup_symbol_name(namebuf) before comparing name and namebuf.
++ */
++ cleanup_symbol_name(namebuf);
++ return strcmp(name, namebuf);
+ }
+
+ static unsigned int get_symbol_seq(int index)
+diff --git a/kernel/kallsyms_selftest.c b/kernel/kallsyms_selftest.c
+index a2e3745d15c47..e05ddc33a7529 100644
+--- a/kernel/kallsyms_selftest.c
++++ b/kernel/kallsyms_selftest.c
+@@ -196,7 +196,7 @@ static bool match_cleanup_name(const char *s, const char *name)
+ if (!IS_ENABLED(CONFIG_LTO_CLANG))
+ return false;
+
+- p = strchr(s, '.');
++ p = strstr(s, ".llvm.");
+ if (!p)
+ return false;
+
+@@ -344,27 +344,6 @@ static int test_kallsyms_basic_function(void)
+ goto failed;
+ }
+
+- /*
+- * The first '.' may be the initial letter, in which case the
+- * entire symbol name will be truncated to an empty string in
+- * cleanup_symbol_name(). Do not test these symbols.
+- *
+- * For example:
+- * cat /proc/kallsyms | awk '{print $3}' | grep -E "^\." | head
+- * .E_read_words
+- * .E_leading_bytes
+- * .E_trailing_bytes
+- * .E_write_words
+- * .E_copy
+- * .str.292.llvm.12122243386960820698
+- * .str.24.llvm.12122243386960820698
+- * .str.29.llvm.12122243386960820698
+- * .str.75.llvm.12122243386960820698
+- * .str.99.llvm.12122243386960820698
+- */
+- if (IS_ENABLED(CONFIG_LTO_CLANG) && !namebuf[0])
+- continue;
+-
+ lookup_addr = kallsyms_lookup_name(namebuf);
+
+ memset(stat, 0, sizeof(*stat));
+diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
+index 4dfd2f3e09b2e..c6fd60ceb04aa 100644
+--- a/kernel/locking/lockdep.c
++++ b/kernel/locking/lockdep.c
+@@ -817,34 +817,26 @@ static int very_verbose(struct lock_class *class)
+ * Is this the address of a static object:
+ */
+ #ifdef __KERNEL__
+-/*
+- * Check if an address is part of freed initmem. After initmem is freed,
+- * memory can be allocated from it, and such allocations would then have
+- * addresses within the range [_stext, _end].
+- */
+-#ifndef arch_is_kernel_initmem_freed
+-static int arch_is_kernel_initmem_freed(unsigned long addr)
+-{
+- if (system_state < SYSTEM_FREEING_INITMEM)
+- return 0;
+-
+- return init_section_contains((void *)addr, 1);
+-}
+-#endif
+-
+ static int static_obj(const void *obj)
+ {
+- unsigned long start = (unsigned long) &_stext,
+- end = (unsigned long) &_end,
+- addr = (unsigned long) obj;
++ unsigned long addr = (unsigned long) obj;
+
+- if (arch_is_kernel_initmem_freed(addr))
+- return 0;
++ if (is_kernel_core_data(addr))
++ return 1;
++
++ /*
++ * keys are allowed in the __ro_after_init section.
++ */
++ if (is_kernel_rodata(addr))
++ return 1;
+
+ /*
+- * static variable?
++ * in initdata section and used during bootup only?
++ * NOTE: On some platforms the initdata section is
++ * outside of the _stext ... _end range.
+ */
+- if ((addr >= start) && (addr < end))
++ if (system_state < SYSTEM_FREEING_INITMEM &&
++ init_section_contains((void *)addr, 1))
+ return 1;
+
+ /*
+diff --git a/kernel/module/decompress.c b/kernel/module/decompress.c
+index 8a5d6d63b06cb..87440f714c0ca 100644
+--- a/kernel/module/decompress.c
++++ b/kernel/module/decompress.c
+@@ -241,7 +241,7 @@ static ssize_t module_zstd_decompress(struct load_info *info,
+ }
+
+ wksp_size = zstd_dstream_workspace_bound(header.windowSize);
+- wksp = kmalloc(wksp_size, GFP_KERNEL);
++ wksp = vmalloc(wksp_size);
+ if (!wksp) {
+ retval = -ENOMEM;
+ goto out;
+@@ -284,7 +284,7 @@ static ssize_t module_zstd_decompress(struct load_info *info,
+ retval = new_size;
+
+ out:
+- kfree(wksp);
++ vfree(wksp);
+ return retval;
+ }
+ #else
+diff --git a/kernel/module/main.c b/kernel/module/main.c
+index 4e2cf784cf8ce..f1facc898a646 100644
+--- a/kernel/module/main.c
++++ b/kernel/module/main.c
+@@ -1491,7 +1491,7 @@ long module_get_offset_and_type(struct module *mod, enum mod_mem_type type,
+ return offset | mask;
+ }
+
+-static bool module_init_layout_section(const char *sname)
++bool module_init_layout_section(const char *sname)
+ {
+ #ifndef CONFIG_MODULE_UNLOAD
+ if (module_exit_section(sname))
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-09-06 22:15 Mike Pagano
0 siblings, 0 replies; 29+ messages in thread
From: Mike Pagano @ 2023-09-06 22:15 UTC (permalink / raw
To: gentoo-commits
commit: e5ee201c60a1c999c98e63c145808a9340c3c554
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Sep 6 22:15:00 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Sep 6 22:15:00 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=e5ee201c
Linux patch 6.4.15
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1014_linux-6.4.15.patch | 998 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 1002 insertions(+)
diff --git a/0000_README b/0000_README
index 8a165dad..5a91174a 100644
--- a/0000_README
+++ b/0000_README
@@ -99,6 +99,10 @@ Patch: 1013_linux-6.4.14.patch
From: https://www.kernel.org
Desc: Linux 6.4.14
+Patch: 1014_linux-6.4.15.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.15
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1014_linux-6.4.15.patch b/1014_linux-6.4.15.patch
new file mode 100644
index 00000000..0394e65f
--- /dev/null
+++ b/1014_linux-6.4.15.patch
@@ -0,0 +1,998 @@
+diff --git a/Documentation/devicetree/bindings/serial/nxp,sc16is7xx.txt b/Documentation/devicetree/bindings/serial/nxp,sc16is7xx.txt
+index 0fa8e3e43bf80..1a7e4bff0456f 100644
+--- a/Documentation/devicetree/bindings/serial/nxp,sc16is7xx.txt
++++ b/Documentation/devicetree/bindings/serial/nxp,sc16is7xx.txt
+@@ -23,6 +23,9 @@ Optional properties:
+ 1 = active low.
+ - irda-mode-ports: An array that lists the indices of the port that
+ should operate in IrDA mode.
++- nxp,modem-control-line-ports: An array that lists the indices of the port that
++ should have shared GPIO lines configured as
++ modem control lines.
+
+ Example:
+ sc16is750: sc16is750@51 {
+@@ -35,6 +38,26 @@ Example:
+ #gpio-cells = <2>;
+ };
+
++ sc16is752: sc16is752@53 {
++ compatible = "nxp,sc16is752";
++ reg = <0x53>;
++ clocks = <&clk20m>;
++ interrupt-parent = <&gpio3>;
++ interrupts = <7 IRQ_TYPE_EDGE_FALLING>;
++ nxp,modem-control-line-ports = <1>; /* Port 1 as modem control lines */
++ gpio-controller; /* Port 0 as GPIOs */
++ #gpio-cells = <2>;
++ };
++
++ sc16is752: sc16is752@54 {
++ compatible = "nxp,sc16is752";
++ reg = <0x54>;
++ clocks = <&clk20m>;
++ interrupt-parent = <&gpio3>;
++ interrupts = <7 IRQ_TYPE_EDGE_FALLING>;
++ nxp,modem-control-line-ports = <0 1>; /* Ports 0 and 1 as modem control lines */
++ };
++
+ * spi as bus
+
+ Required properties:
+@@ -59,6 +82,9 @@ Optional properties:
+ 1 = active low.
+ - irda-mode-ports: An array that lists the indices of the port that
+ should operate in IrDA mode.
++- nxp,modem-control-line-ports: An array that lists the indices of the port that
++ should have shared GPIO lines configured as
++ modem control lines.
+
+ Example:
+ sc16is750: sc16is750@0 {
+@@ -70,3 +96,23 @@ Example:
+ gpio-controller;
+ #gpio-cells = <2>;
+ };
++
++ sc16is752: sc16is752@1 {
++ compatible = "nxp,sc16is752";
++ reg = <1>;
++ clocks = <&clk20m>;
++ interrupt-parent = <&gpio3>;
++ interrupts = <7 IRQ_TYPE_EDGE_FALLING>;
++ nxp,modem-control-line-ports = <1>; /* Port 1 as modem control lines */
++ gpio-controller; /* Port 0 as GPIOs */
++ #gpio-cells = <2>;
++ };
++
++ sc16is752: sc16is752@2 {
++ compatible = "nxp,sc16is752";
++ reg = <2>;
++ clocks = <&clk20m>;
++ interrupt-parent = <&gpio3>;
++ interrupts = <7 IRQ_TYPE_EDGE_FALLING>;
++ nxp,modem-control-line-ports = <0 1>; /* Ports 0 and 1 as modem control lines */
++ };
+diff --git a/Makefile b/Makefile
+index 97611fe99c8f0..212d1c7e4a1a3 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 14
++SUBLEVEL = 15
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+diff --git a/arch/arm/mach-pxa/sharpsl_pm.c b/arch/arm/mach-pxa/sharpsl_pm.c
+index d29bdcd5270e0..72fa2e3fd3531 100644
+--- a/arch/arm/mach-pxa/sharpsl_pm.c
++++ b/arch/arm/mach-pxa/sharpsl_pm.c
+@@ -216,8 +216,6 @@ void sharpsl_battery_kick(void)
+ {
+ schedule_delayed_work(&sharpsl_bat, msecs_to_jiffies(125));
+ }
+-EXPORT_SYMBOL(sharpsl_battery_kick);
+-
+
+ static void sharpsl_battery_thread(struct work_struct *private_)
+ {
+diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c
+index 28e376e06fdc8..804d41ea2229f 100644
+--- a/arch/arm/mach-pxa/spitz.c
++++ b/arch/arm/mach-pxa/spitz.c
+@@ -9,7 +9,6 @@
+ */
+
+ #include <linux/kernel.h>
+-#include <linux/module.h> /* symbol_get ; symbol_put */
+ #include <linux/platform_device.h>
+ #include <linux/delay.h>
+ #include <linux/gpio_keys.h>
+@@ -518,17 +517,6 @@ static struct gpiod_lookup_table spitz_ads7846_gpio_table = {
+ },
+ };
+
+-static void spitz_bl_kick_battery(void)
+-{
+- void (*kick_batt)(void);
+-
+- kick_batt = symbol_get(sharpsl_battery_kick);
+- if (kick_batt) {
+- kick_batt();
+- symbol_put(sharpsl_battery_kick);
+- }
+-}
+-
+ static struct gpiod_lookup_table spitz_lcdcon_gpio_table = {
+ .dev_id = "spi2.1",
+ .table = {
+@@ -556,7 +544,7 @@ static struct corgi_lcd_platform_data spitz_lcdcon_info = {
+ .max_intensity = 0x2f,
+ .default_intensity = 0x1f,
+ .limit_mask = 0x0b,
+- .kick_battery = spitz_bl_kick_battery,
++ .kick_battery = sharpsl_battery_kick,
+ };
+
+ static struct spi_board_info spitz_spi_devices[] = {
+diff --git a/arch/mips/alchemy/devboards/db1000.c b/arch/mips/alchemy/devboards/db1000.c
+index 79d66faa84828..012da042d0a4f 100644
+--- a/arch/mips/alchemy/devboards/db1000.c
++++ b/arch/mips/alchemy/devboards/db1000.c
+@@ -14,7 +14,6 @@
+ #include <linux/interrupt.h>
+ #include <linux/leds.h>
+ #include <linux/mmc/host.h>
+-#include <linux/module.h>
+ #include <linux/platform_device.h>
+ #include <linux/pm.h>
+ #include <linux/spi/spi.h>
+@@ -167,12 +166,7 @@ static struct platform_device db1x00_audio_dev = {
+
+ static irqreturn_t db1100_mmc_cd(int irq, void *ptr)
+ {
+- void (*mmc_cd)(struct mmc_host *, unsigned long);
+- /* link against CONFIG_MMC=m */
+- mmc_cd = symbol_get(mmc_detect_change);
+- mmc_cd(ptr, msecs_to_jiffies(500));
+- symbol_put(mmc_detect_change);
+-
++ mmc_detect_change(ptr, msecs_to_jiffies(500));
+ return IRQ_HANDLED;
+ }
+
+diff --git a/arch/mips/alchemy/devboards/db1200.c b/arch/mips/alchemy/devboards/db1200.c
+index 1864eb935ca57..76080c71a2a7b 100644
+--- a/arch/mips/alchemy/devboards/db1200.c
++++ b/arch/mips/alchemy/devboards/db1200.c
+@@ -10,7 +10,6 @@
+ #include <linux/gpio.h>
+ #include <linux/i2c.h>
+ #include <linux/init.h>
+-#include <linux/module.h>
+ #include <linux/interrupt.h>
+ #include <linux/io.h>
+ #include <linux/leds.h>
+@@ -340,14 +339,7 @@ static irqreturn_t db1200_mmc_cd(int irq, void *ptr)
+
+ static irqreturn_t db1200_mmc_cdfn(int irq, void *ptr)
+ {
+- void (*mmc_cd)(struct mmc_host *, unsigned long);
+-
+- /* link against CONFIG_MMC=m */
+- mmc_cd = symbol_get(mmc_detect_change);
+- if (mmc_cd) {
+- mmc_cd(ptr, msecs_to_jiffies(200));
+- symbol_put(mmc_detect_change);
+- }
++ mmc_detect_change(ptr, msecs_to_jiffies(200));
+
+ msleep(100); /* debounce */
+ if (irq == DB1200_SD0_INSERT_INT)
+@@ -431,14 +423,7 @@ static irqreturn_t pb1200_mmc1_cd(int irq, void *ptr)
+
+ static irqreturn_t pb1200_mmc1_cdfn(int irq, void *ptr)
+ {
+- void (*mmc_cd)(struct mmc_host *, unsigned long);
+-
+- /* link against CONFIG_MMC=m */
+- mmc_cd = symbol_get(mmc_detect_change);
+- if (mmc_cd) {
+- mmc_cd(ptr, msecs_to_jiffies(200));
+- symbol_put(mmc_detect_change);
+- }
++ mmc_detect_change(ptr, msecs_to_jiffies(200));
+
+ msleep(100); /* debounce */
+ if (irq == PB1200_SD1_INSERT_INT)
+diff --git a/arch/mips/alchemy/devboards/db1300.c b/arch/mips/alchemy/devboards/db1300.c
+index e70e529ddd914..ff61901329c62 100644
+--- a/arch/mips/alchemy/devboards/db1300.c
++++ b/arch/mips/alchemy/devboards/db1300.c
+@@ -17,7 +17,6 @@
+ #include <linux/interrupt.h>
+ #include <linux/ata_platform.h>
+ #include <linux/mmc/host.h>
+-#include <linux/module.h>
+ #include <linux/mtd/mtd.h>
+ #include <linux/mtd/platnand.h>
+ #include <linux/platform_device.h>
+@@ -459,14 +458,7 @@ static irqreturn_t db1300_mmc_cd(int irq, void *ptr)
+
+ static irqreturn_t db1300_mmc_cdfn(int irq, void *ptr)
+ {
+- void (*mmc_cd)(struct mmc_host *, unsigned long);
+-
+- /* link against CONFIG_MMC=m. We can only be called once MMC core has
+- * initialized the controller, so symbol_get() should always succeed.
+- */
+- mmc_cd = symbol_get(mmc_detect_change);
+- mmc_cd(ptr, msecs_to_jiffies(200));
+- symbol_put(mmc_detect_change);
++ mmc_detect_change(ptr, msecs_to_jiffies(200));
+
+ msleep(100); /* debounce */
+ if (irq == DB1300_SD1_INSERT_INT)
+diff --git a/drivers/firmware/stratix10-svc.c b/drivers/firmware/stratix10-svc.c
+index 2d674126160fe..cab11af28c231 100644
+--- a/drivers/firmware/stratix10-svc.c
++++ b/drivers/firmware/stratix10-svc.c
+@@ -756,7 +756,7 @@ svc_create_memory_pool(struct platform_device *pdev,
+ paddr = begin;
+ size = end - begin;
+ va = devm_memremap(dev, paddr, size, MEMREMAP_WC);
+- if (!va) {
++ if (IS_ERR(va)) {
+ dev_err(dev, "fail to remap shared memory\n");
+ return ERR_PTR(-EINVAL);
+ }
+diff --git a/drivers/fsi/fsi-master-ast-cf.c b/drivers/fsi/fsi-master-ast-cf.c
+index 5f608ef8b53ca..cde281ec89d7b 100644
+--- a/drivers/fsi/fsi-master-ast-cf.c
++++ b/drivers/fsi/fsi-master-ast-cf.c
+@@ -1441,3 +1441,4 @@ static struct platform_driver fsi_master_acf = {
+
+ module_platform_driver(fsi_master_acf);
+ MODULE_LICENSE("GPL");
++MODULE_FIRMWARE(FW_FILE_NAME);
+diff --git a/drivers/hid/wacom.h b/drivers/hid/wacom.h
+index 4da50e19808ef..166a76c9bcad3 100644
+--- a/drivers/hid/wacom.h
++++ b/drivers/hid/wacom.h
+@@ -150,6 +150,7 @@ struct wacom_remote {
+ struct input_dev *input;
+ bool registered;
+ struct wacom_battery battery;
++ ktime_t active_time;
+ } remotes[WACOM_MAX_REMOTES];
+ };
+
+diff --git a/drivers/hid/wacom_sys.c b/drivers/hid/wacom_sys.c
+index 76e5353aca0c7..eb833455abd50 100644
+--- a/drivers/hid/wacom_sys.c
++++ b/drivers/hid/wacom_sys.c
+@@ -2523,6 +2523,18 @@ fail:
+ return;
+ }
+
++static void wacom_remote_destroy_battery(struct wacom *wacom, int index)
++{
++ struct wacom_remote *remote = wacom->remote;
++
++ if (remote->remotes[index].battery.battery) {
++ devres_release_group(&wacom->hdev->dev,
++ &remote->remotes[index].battery.bat_desc);
++ remote->remotes[index].battery.battery = NULL;
++ remote->remotes[index].active_time = 0;
++ }
++}
++
+ static void wacom_remote_destroy_one(struct wacom *wacom, unsigned int index)
+ {
+ struct wacom_remote *remote = wacom->remote;
+@@ -2537,9 +2549,7 @@ static void wacom_remote_destroy_one(struct wacom *wacom, unsigned int index)
+ remote->remotes[i].registered = false;
+ spin_unlock_irqrestore(&remote->remote_lock, flags);
+
+- if (remote->remotes[i].battery.battery)
+- devres_release_group(&wacom->hdev->dev,
+- &remote->remotes[i].battery.bat_desc);
++ wacom_remote_destroy_battery(wacom, i);
+
+ if (remote->remotes[i].group.name)
+ devres_release_group(&wacom->hdev->dev,
+@@ -2547,7 +2557,6 @@ static void wacom_remote_destroy_one(struct wacom *wacom, unsigned int index)
+
+ remote->remotes[i].serial = 0;
+ remote->remotes[i].group.name = NULL;
+- remote->remotes[i].battery.battery = NULL;
+ wacom->led.groups[i].select = WACOM_STATUS_UNKNOWN;
+ }
+ }
+@@ -2632,6 +2641,9 @@ static int wacom_remote_attach_battery(struct wacom *wacom, int index)
+ if (remote->remotes[index].battery.battery)
+ return 0;
+
++ if (!remote->remotes[index].active_time)
++ return 0;
++
+ if (wacom->led.groups[index].select == WACOM_STATUS_UNKNOWN)
+ return 0;
+
+@@ -2647,6 +2659,7 @@ static void wacom_remote_work(struct work_struct *work)
+ {
+ struct wacom *wacom = container_of(work, struct wacom, remote_work);
+ struct wacom_remote *remote = wacom->remote;
++ ktime_t kt = ktime_get();
+ struct wacom_remote_data data;
+ unsigned long flags;
+ unsigned int count;
+@@ -2673,6 +2686,10 @@ static void wacom_remote_work(struct work_struct *work)
+ serial = data.remote[i].serial;
+ if (data.remote[i].connected) {
+
++ if (kt - remote->remotes[i].active_time > WACOM_REMOTE_BATTERY_TIMEOUT
++ && remote->remotes[i].active_time != 0)
++ wacom_remote_destroy_battery(wacom, i);
++
+ if (remote->remotes[i].serial == serial) {
+ wacom_remote_attach_battery(wacom, i);
+ continue;
+diff --git a/drivers/hid/wacom_wac.c b/drivers/hid/wacom_wac.c
+index 174bf03908d7c..6c056f8844e70 100644
+--- a/drivers/hid/wacom_wac.c
++++ b/drivers/hid/wacom_wac.c
+@@ -1134,6 +1134,7 @@ static int wacom_remote_irq(struct wacom_wac *wacom_wac, size_t len)
+ if (index < 0 || !remote->remotes[index].registered)
+ goto out;
+
++ remote->remotes[i].active_time = ktime_get();
+ input = remote->remotes[index].input;
+
+ input_report_key(input, BTN_0, (data[9] & 0x01));
+diff --git a/drivers/hid/wacom_wac.h b/drivers/hid/wacom_wac.h
+index ee21bb260f22f..2e7cc5e7a0cb7 100644
+--- a/drivers/hid/wacom_wac.h
++++ b/drivers/hid/wacom_wac.h
+@@ -13,6 +13,7 @@
+ #define WACOM_NAME_MAX 64
+ #define WACOM_MAX_REMOTES 5
+ #define WACOM_STATUS_UNKNOWN 255
++#define WACOM_REMOTE_BATTERY_TIMEOUT 21000000000ll
+
+ /* packet length for individual models */
+ #define WACOM_PKGLEN_BBFUN 9
+diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
+index 9f793892123c2..caef6d9a13cd3 100644
+--- a/drivers/mmc/host/Kconfig
++++ b/drivers/mmc/host/Kconfig
+@@ -526,11 +526,12 @@ config MMC_ALCOR
+ of Alcor Micro PCI-E card reader
+
+ config MMC_AU1X
+- tristate "Alchemy AU1XX0 MMC Card Interface support"
++ bool "Alchemy AU1XX0 MMC Card Interface support"
+ depends on MIPS_ALCHEMY
++ depends on MMC=y
+ help
+ This selects the AMD Alchemy(R) Multimedia card interface.
+- If you have a Alchemy platform with a MMC slot, say Y or M here.
++ If you have a Alchemy platform with a MMC slot, say Y here.
+
+ If unsure, say N.
+
+diff --git a/drivers/net/ethernet/freescale/enetc/enetc_ptp.c b/drivers/net/ethernet/freescale/enetc/enetc_ptp.c
+index 17c097cef7d45..5243fc0310589 100644
+--- a/drivers/net/ethernet/freescale/enetc/enetc_ptp.c
++++ b/drivers/net/ethernet/freescale/enetc/enetc_ptp.c
+@@ -8,7 +8,7 @@
+ #include "enetc.h"
+
+ int enetc_phc_index = -1;
+-EXPORT_SYMBOL(enetc_phc_index);
++EXPORT_SYMBOL_GPL(enetc_phc_index);
+
+ static struct ptp_clock_info enetc_ptp_caps = {
+ .owner = THIS_MODULE,
+diff --git a/drivers/net/wireless/ath/ath11k/dp_tx.c b/drivers/net/wireless/ath/ath11k/dp_tx.c
+index 08a28464eb7a9..cd24488612454 100644
+--- a/drivers/net/wireless/ath/ath11k/dp_tx.c
++++ b/drivers/net/wireless/ath/ath11k/dp_tx.c
+@@ -344,7 +344,7 @@ ath11k_dp_tx_htt_tx_complete_buf(struct ath11k_base *ab,
+ dma_unmap_single(ab->dev, skb_cb->paddr, msdu->len, DMA_TO_DEVICE);
+
+ if (!skb_cb->vif) {
+- dev_kfree_skb_any(msdu);
++ ieee80211_free_txskb(ar->hw, msdu);
+ return;
+ }
+
+@@ -369,7 +369,7 @@ ath11k_dp_tx_htt_tx_complete_buf(struct ath11k_base *ab,
+ "dp_tx: failed to find the peer with peer_id %d\n",
+ ts->peer_id);
+ spin_unlock_bh(&ab->base_lock);
+- dev_kfree_skb_any(msdu);
++ ieee80211_free_txskb(ar->hw, msdu);
+ return;
+ }
+ spin_unlock_bh(&ab->base_lock);
+@@ -566,12 +566,12 @@ static void ath11k_dp_tx_complete_msdu(struct ath11k *ar,
+ dma_unmap_single(ab->dev, skb_cb->paddr, msdu->len, DMA_TO_DEVICE);
+
+ if (unlikely(!rcu_access_pointer(ab->pdevs_active[ar->pdev_idx]))) {
+- dev_kfree_skb_any(msdu);
++ ieee80211_free_txskb(ar->hw, msdu);
+ return;
+ }
+
+ if (unlikely(!skb_cb->vif)) {
+- dev_kfree_skb_any(msdu);
++ ieee80211_free_txskb(ar->hw, msdu);
+ return;
+ }
+
+@@ -624,7 +624,7 @@ static void ath11k_dp_tx_complete_msdu(struct ath11k *ar,
+ "dp_tx: failed to find the peer with peer_id %d\n",
+ ts->peer_id);
+ spin_unlock_bh(&ab->base_lock);
+- dev_kfree_skb_any(msdu);
++ ieee80211_free_txskb(ar->hw, msdu);
+ return;
+ }
+ arsta = (struct ath11k_sta *)peer->sta->drv_priv;
+diff --git a/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c b/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c
+index d39a3cc5e381f..be4d63db5f64a 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c
++++ b/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c
+@@ -495,6 +495,7 @@ void mt76_connac2_mac_write_txwi(struct mt76_dev *dev, __le32 *txwi,
+ BSS_CHANGED_BEACON_ENABLED));
+ bool inband_disc = !!(changed & (BSS_CHANGED_UNSOL_BCAST_PROBE_RESP |
+ BSS_CHANGED_FILS_DISCOVERY));
++ bool amsdu_en = wcid->amsdu;
+
+ if (vif) {
+ struct mt76_vif *mvif = (struct mt76_vif *)vif->drv_priv;
+@@ -554,12 +555,14 @@ void mt76_connac2_mac_write_txwi(struct mt76_dev *dev, __le32 *txwi,
+ txwi[4] = 0;
+
+ val = FIELD_PREP(MT_TXD5_PID, pid);
+- if (pid >= MT_PACKET_ID_FIRST)
++ if (pid >= MT_PACKET_ID_FIRST) {
+ val |= MT_TXD5_TX_STATUS_HOST;
++ amsdu_en = amsdu_en && !is_mt7921(dev);
++ }
+
+ txwi[5] = cpu_to_le32(val);
+ txwi[6] = 0;
+- txwi[7] = wcid->amsdu ? cpu_to_le32(MT_TXD7_HW_AMSDU) : 0;
++ txwi[7] = amsdu_en ? cpu_to_le32(MT_TXD7_HW_AMSDU) : 0;
+
+ if (is_8023)
+ mt76_connac2_mac_write_txwi_8023(txwi, skb, wcid);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c
+index 3b6adb29cbef1..0e3ada1e008cd 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c
+@@ -1363,7 +1363,7 @@ mt7921_set_antenna(struct ieee80211_hw *hw, u32 tx_ant, u32 rx_ant)
+ return -EINVAL;
+
+ if ((BIT(hweight8(tx_ant)) - 1) != tx_ant)
+- tx_ant = BIT(ffs(tx_ant) - 1) - 1;
++ return -EINVAL;
+
+ mt7921_mutex_acquire(dev);
+
+diff --git a/drivers/net/wireless/realtek/rtw88/usb.c b/drivers/net/wireless/realtek/rtw88/usb.c
+index 976eafa739a2d..a5d2be81933bb 100644
+--- a/drivers/net/wireless/realtek/rtw88/usb.c
++++ b/drivers/net/wireless/realtek/rtw88/usb.c
+@@ -837,7 +837,7 @@ int rtw_usb_probe(struct usb_interface *intf, const struct usb_device_id *id)
+
+ ret = rtw_core_init(rtwdev);
+ if (ret)
+- goto err_release_hw;
++ goto err_free_rx_bufs;
+
+ ret = rtw_usb_intf_init(rtwdev, intf);
+ if (ret) {
+@@ -883,6 +883,9 @@ err_destroy_usb:
+ err_deinit_core:
+ rtw_core_deinit(rtwdev);
+
++err_free_rx_bufs:
++ rtw_usb_free_rx_bufs(rtwusb);
++
+ err_release_hw:
+ ieee80211_free_hw(hw);
+
+diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
+index 3b10e0a01b1d2..f135617710619 100644
+--- a/drivers/pinctrl/pinctrl-amd.c
++++ b/drivers/pinctrl/pinctrl-amd.c
+@@ -748,7 +748,7 @@ static int amd_pinconf_get(struct pinctrl_dev *pctldev,
+ break;
+
+ default:
+- dev_err(&gpio_dev->pdev->dev, "Invalid config param %04x\n",
++ dev_dbg(&gpio_dev->pdev->dev, "Invalid config param %04x\n",
+ param);
+ return -ENOTSUPP;
+ }
+@@ -798,7 +798,7 @@ static int amd_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
+ break;
+
+ default:
+- dev_err(&gpio_dev->pdev->dev,
++ dev_dbg(&gpio_dev->pdev->dev,
+ "Invalid config param %04x\n", param);
+ ret = -ENOTSUPP;
+ }
+diff --git a/drivers/rtc/rtc-ds1685.c b/drivers/rtc/rtc-ds1685.c
+index 0f707be0eb87f..04dbf35cf3b70 100644
+--- a/drivers/rtc/rtc-ds1685.c
++++ b/drivers/rtc/rtc-ds1685.c
+@@ -1432,7 +1432,7 @@ ds1685_rtc_poweroff(struct platform_device *pdev)
+ unreachable();
+ }
+ }
+-EXPORT_SYMBOL(ds1685_rtc_poweroff);
++EXPORT_SYMBOL_GPL(ds1685_rtc_poweroff);
+ /* ----------------------------------------------------------------------- */
+
+
+diff --git a/drivers/staging/rtl8712/os_intfs.c b/drivers/staging/rtl8712/os_intfs.c
+index a2f3645be0cc8..b18e6d9c832b8 100644
+--- a/drivers/staging/rtl8712/os_intfs.c
++++ b/drivers/staging/rtl8712/os_intfs.c
+@@ -327,6 +327,7 @@ int r8712_init_drv_sw(struct _adapter *padapter)
+ mp871xinit(padapter);
+ init_default_value(padapter);
+ r8712_InitSwLeds(padapter);
++ mutex_init(&padapter->mutex_start);
+
+ return 0;
+
+diff --git a/drivers/staging/rtl8712/usb_intf.c b/drivers/staging/rtl8712/usb_intf.c
+index 37364d3101e21..df05213f922f4 100644
+--- a/drivers/staging/rtl8712/usb_intf.c
++++ b/drivers/staging/rtl8712/usb_intf.c
+@@ -567,7 +567,6 @@ static int r871xu_drv_init(struct usb_interface *pusb_intf,
+ if (rtl871x_load_fw(padapter))
+ goto deinit_drv_sw;
+ init_completion(&padapter->rx_filter_ready);
+- mutex_init(&padapter->mutex_start);
+ return 0;
+
+ deinit_drv_sw:
+diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
+index 22fe5a8ce9399..24ebdb0b63a8e 100644
+--- a/drivers/tty/serial/qcom_geni_serial.c
++++ b/drivers/tty/serial/qcom_geni_serial.c
+@@ -126,6 +126,7 @@ struct qcom_geni_serial_port {
+ dma_addr_t rx_dma_addr;
+ bool setup;
+ unsigned int baud;
++ unsigned long clk_rate;
+ void *rx_buf;
+ u32 loopback;
+ bool brk;
+@@ -1244,6 +1245,7 @@ static void qcom_geni_serial_set_termios(struct uart_port *uport,
+ baud * sampling_rate, clk_rate, clk_div);
+
+ uport->uartclk = clk_rate;
++ port->clk_rate = clk_rate;
+ dev_pm_opp_set_rate(uport->dev, clk_rate);
+ ser_clk_cfg = SER_CLK_EN;
+ ser_clk_cfg |= clk_div << CLK_DIV_SHFT;
+@@ -1508,10 +1510,13 @@ static void qcom_geni_serial_pm(struct uart_port *uport,
+
+ if (new_state == UART_PM_STATE_ON && old_state == UART_PM_STATE_OFF) {
+ geni_icc_enable(&port->se);
++ if (port->clk_rate)
++ dev_pm_opp_set_rate(uport->dev, port->clk_rate);
+ geni_se_resources_on(&port->se);
+ } else if (new_state == UART_PM_STATE_OFF &&
+ old_state == UART_PM_STATE_ON) {
+ geni_se_resources_off(&port->se);
++ dev_pm_opp_set_rate(uport->dev, 0);
+ geni_icc_disable(&port->se);
+ }
+ }
+diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
+index abad091baeeae..54c760b46da13 100644
+--- a/drivers/tty/serial/sc16is7xx.c
++++ b/drivers/tty/serial/sc16is7xx.c
+@@ -1342,9 +1342,18 @@ static int sc16is7xx_gpio_direction_output(struct gpio_chip *chip,
+ state |= BIT(offset);
+ else
+ state &= ~BIT(offset);
+- sc16is7xx_port_write(port, SC16IS7XX_IOSTATE_REG, state);
++
++ /*
++ * If we write IOSTATE first, and then IODIR, the output value is not
++ * transferred to the corresponding I/O pin.
++ * The datasheet states that each register bit will be transferred to
++ * the corresponding I/O pin programmed as output when writing to
++ * IOSTATE. Therefore, configure direction first with IODIR, and then
++ * set value after with IOSTATE.
++ */
+ sc16is7xx_port_update(port, SC16IS7XX_IODIR_REG, BIT(offset),
+ BIT(offset));
++ sc16is7xx_port_write(port, SC16IS7XX_IOSTATE_REG, state);
+
+ return 0;
+ }
+@@ -1436,6 +1445,12 @@ static int sc16is7xx_probe(struct device *dev,
+ s->p[i].port.fifosize = SC16IS7XX_FIFO_SIZE;
+ s->p[i].port.flags = UPF_FIXED_TYPE | UPF_LOW_LATENCY;
+ s->p[i].port.iobase = i;
++ /*
++ * Use all ones as membase to make sure uart_configure_port() in
++ * serial_core.c does not abort for SPI/I2C devices where the
++ * membase address is not applicable.
++ */
++ s->p[i].port.membase = (void __iomem *)~0;
+ s->p[i].port.iotype = UPIO_PORT;
+ s->p[i].port.uartclk = freq;
+ s->p[i].port.rs485_config = sc16is7xx_config_rs485;
+diff --git a/drivers/usb/chipidea/ci_hdrc_imx.c b/drivers/usb/chipidea/ci_hdrc_imx.c
+index f7577f2bd2c5d..916eae08770db 100644
+--- a/drivers/usb/chipidea/ci_hdrc_imx.c
++++ b/drivers/usb/chipidea/ci_hdrc_imx.c
+@@ -175,10 +175,12 @@ static struct imx_usbmisc_data *usbmisc_get_init_data(struct device *dev)
+ if (of_usb_get_phy_mode(np) == USBPHY_INTERFACE_MODE_ULPI)
+ data->ulpi = 1;
+
+- of_property_read_u32(np, "samsung,picophy-pre-emp-curr-control",
+- &data->emp_curr_control);
+- of_property_read_u32(np, "samsung,picophy-dc-vol-level-adjust",
+- &data->dc_vol_level_adjust);
++ if (of_property_read_u32(np, "samsung,picophy-pre-emp-curr-control",
++ &data->emp_curr_control))
++ data->emp_curr_control = -1;
++ if (of_property_read_u32(np, "samsung,picophy-dc-vol-level-adjust",
++ &data->dc_vol_level_adjust))
++ data->dc_vol_level_adjust = -1;
+
+ return data;
+ }
+diff --git a/drivers/usb/chipidea/usbmisc_imx.c b/drivers/usb/chipidea/usbmisc_imx.c
+index 681c2ddc83fa5..520fc5b026bda 100644
+--- a/drivers/usb/chipidea/usbmisc_imx.c
++++ b/drivers/usb/chipidea/usbmisc_imx.c
+@@ -660,13 +660,15 @@ static int usbmisc_imx7d_init(struct imx_usbmisc_data *data)
+ usbmisc->base + MX7D_USBNC_USB_CTRL2);
+ /* PHY tuning for signal quality */
+ reg = readl(usbmisc->base + MX7D_USB_OTG_PHY_CFG1);
+- if (data->emp_curr_control && data->emp_curr_control <=
++ if (data->emp_curr_control >= 0 &&
++ data->emp_curr_control <=
+ (TXPREEMPAMPTUNE0_MASK >> TXPREEMPAMPTUNE0_BIT)) {
+ reg &= ~TXPREEMPAMPTUNE0_MASK;
+ reg |= (data->emp_curr_control << TXPREEMPAMPTUNE0_BIT);
+ }
+
+- if (data->dc_vol_level_adjust && data->dc_vol_level_adjust <=
++ if (data->dc_vol_level_adjust >= 0 &&
++ data->dc_vol_level_adjust <=
+ (TXVREFTUNE0_MASK >> TXVREFTUNE0_BIT)) {
+ reg &= ~TXVREFTUNE0_MASK;
+ reg |= (data->dc_vol_level_adjust << TXVREFTUNE0_BIT);
+diff --git a/drivers/usb/dwc3/dwc3-meson-g12a.c b/drivers/usb/dwc3/dwc3-meson-g12a.c
+index eaea944ebd2ce..10298b91731eb 100644
+--- a/drivers/usb/dwc3/dwc3-meson-g12a.c
++++ b/drivers/usb/dwc3/dwc3-meson-g12a.c
+@@ -938,6 +938,12 @@ static int __maybe_unused dwc3_meson_g12a_resume(struct device *dev)
+ return ret;
+ }
+
++ if (priv->drvdata->usb_post_init) {
++ ret = priv->drvdata->usb_post_init(priv);
++ if (ret)
++ return ret;
++ }
++
+ return 0;
+ }
+
+diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
+index 8ac98e60fff56..7994a4549a6c8 100644
+--- a/drivers/usb/serial/option.c
++++ b/drivers/usb/serial/option.c
+@@ -259,6 +259,7 @@ static void option_instat_callback(struct urb *urb);
+ #define QUECTEL_PRODUCT_EM05G 0x030a
+ #define QUECTEL_PRODUCT_EM060K 0x030b
+ #define QUECTEL_PRODUCT_EM05G_CS 0x030c
++#define QUECTEL_PRODUCT_EM05GV2 0x030e
+ #define QUECTEL_PRODUCT_EM05CN_SG 0x0310
+ #define QUECTEL_PRODUCT_EM05G_SG 0x0311
+ #define QUECTEL_PRODUCT_EM05CN 0x0312
+@@ -1188,6 +1189,8 @@ static const struct usb_device_id option_ids[] = {
+ .driver_info = RSVD(6) | ZLP },
+ { USB_DEVICE_INTERFACE_CLASS(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM05G, 0xff),
+ .driver_info = RSVD(6) | ZLP },
++ { USB_DEVICE_INTERFACE_CLASS(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM05GV2, 0xff),
++ .driver_info = RSVD(4) | ZLP },
+ { USB_DEVICE_INTERFACE_CLASS(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM05G_CS, 0xff),
+ .driver_info = RSVD(6) | ZLP },
+ { USB_DEVICE_INTERFACE_CLASS(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EM05G_GR, 0xff),
+@@ -2232,6 +2235,10 @@ static const struct usb_device_id option_ids[] = {
+ .driver_info = RSVD(0) | RSVD(1) | RSVD(6) },
+ { USB_DEVICE_INTERFACE_CLASS(0x0489, 0xe0db, 0xff), /* Foxconn T99W265 MBIM */
+ .driver_info = RSVD(3) },
++ { USB_DEVICE_INTERFACE_CLASS(0x0489, 0xe0ee, 0xff), /* Foxconn T99W368 MBIM */
++ .driver_info = RSVD(3) },
++ { USB_DEVICE_INTERFACE_CLASS(0x0489, 0xe0f0, 0xff), /* Foxconn T99W373 MBIM */
++ .driver_info = RSVD(3) },
+ { USB_DEVICE(0x1508, 0x1001), /* Fibocom NL668 (IOT version) */
+ .driver_info = RSVD(4) | RSVD(5) | RSVD(6) },
+ { USB_DEVICE(0x1782, 0x4d10) }, /* Fibocom L610 (AT mode) */
+diff --git a/drivers/usb/typec/tcpm/tcpci.c b/drivers/usb/typec/tcpm/tcpci.c
+index 8da23240afbe4..009dddeb6e36a 100644
+--- a/drivers/usb/typec/tcpm/tcpci.c
++++ b/drivers/usb/typec/tcpm/tcpci.c
+@@ -602,6 +602,10 @@ static int tcpci_init(struct tcpc_dev *tcpc)
+ if (time_after(jiffies, timeout))
+ return -ETIMEDOUT;
+
++ ret = tcpci_write16(tcpci, TCPC_FAULT_STATUS, TCPC_FAULT_STATUS_ALL_REG_RST_TO_DEFAULT);
++ if (ret < 0)
++ return ret;
++
+ /* Handle vendor init */
+ if (tcpci->data->init) {
+ ret = tcpci->data->init(tcpci, tcpci->data);
+diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
+index dc113cbb3bed8..9c4b73d23f833 100644
+--- a/drivers/usb/typec/tcpm/tcpm.c
++++ b/drivers/usb/typec/tcpm/tcpm.c
+@@ -2753,6 +2753,13 @@ static void tcpm_pd_ctrl_request(struct tcpm_port *port,
+ port->sink_cap_done = true;
+ tcpm_set_state(port, ready_state(port), 0);
+ break;
++ /*
++ * Some port partners do not support GET_STATUS, avoid soft reset the link to
++ * prevent redundant power re-negotiation
++ */
++ case GET_STATUS_SEND:
++ tcpm_set_state(port, ready_state(port), 0);
++ break;
+ case SRC_READY:
+ case SNK_READY:
+ if (port->vdm_state > VDM_STATE_READY) {
+diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
+index 470988bb7867e..9a7c8bb0590f1 100644
+--- a/fs/erofs/zdata.c
++++ b/fs/erofs/zdata.c
+@@ -993,6 +993,8 @@ hitted:
+ cur = end - min_t(erofs_off_t, offset + end - map->m_la, end);
+ if (!(map->m_flags & EROFS_MAP_MAPPED)) {
+ zero_user_segment(page, cur, end);
++ ++spiltted;
++ tight = false;
+ goto next_part;
+ }
+ if (map->m_flags & EROFS_MAP_FRAGMENT) {
+diff --git a/fs/nilfs2/alloc.c b/fs/nilfs2/alloc.c
+index 6ce8617b562d5..7342de296ec3c 100644
+--- a/fs/nilfs2/alloc.c
++++ b/fs/nilfs2/alloc.c
+@@ -205,7 +205,8 @@ static int nilfs_palloc_get_block(struct inode *inode, unsigned long blkoff,
+ int ret;
+
+ spin_lock(lock);
+- if (prev->bh && blkoff == prev->blkoff) {
++ if (prev->bh && blkoff == prev->blkoff &&
++ likely(buffer_uptodate(prev->bh))) {
+ get_bh(prev->bh);
+ *bhp = prev->bh;
+ spin_unlock(lock);
+diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c
+index 35bc793053180..acf7a266f72f5 100644
+--- a/fs/nilfs2/inode.c
++++ b/fs/nilfs2/inode.c
+@@ -1025,7 +1025,7 @@ int nilfs_load_inode_block(struct inode *inode, struct buffer_head **pbh)
+ int err;
+
+ spin_lock(&nilfs->ns_inode_lock);
+- if (ii->i_bh == NULL) {
++ if (ii->i_bh == NULL || unlikely(!buffer_uptodate(ii->i_bh))) {
+ spin_unlock(&nilfs->ns_inode_lock);
+ err = nilfs_ifile_get_inode_block(ii->i_root->ifile,
+ inode->i_ino, pbh);
+@@ -1034,7 +1034,10 @@ int nilfs_load_inode_block(struct inode *inode, struct buffer_head **pbh)
+ spin_lock(&nilfs->ns_inode_lock);
+ if (ii->i_bh == NULL)
+ ii->i_bh = *pbh;
+- else {
++ else if (unlikely(!buffer_uptodate(ii->i_bh))) {
++ __brelse(ii->i_bh);
++ ii->i_bh = *pbh;
++ } else {
+ brelse(*pbh);
+ *pbh = ii->i_bh;
+ }
+diff --git a/fs/smb/server/auth.c b/fs/smb/server/auth.c
+index 5e5e120edcc22..15e5684e328c1 100644
+--- a/fs/smb/server/auth.c
++++ b/fs/smb/server/auth.c
+@@ -355,6 +355,9 @@ int ksmbd_decode_ntlmssp_auth_blob(struct authenticate_message *authblob,
+ if (blob_len < (u64)sess_key_off + sess_key_len)
+ return -EINVAL;
+
++ if (sess_key_len > CIFS_KEY_SIZE)
++ return -EINVAL;
++
+ ctx_arc4 = kmalloc(sizeof(*ctx_arc4), GFP_KERNEL);
+ if (!ctx_arc4)
+ return -ENOMEM;
+diff --git a/fs/smb/server/oplock.c b/fs/smb/server/oplock.c
+index 844b303baf293..90edd8522d291 100644
+--- a/fs/smb/server/oplock.c
++++ b/fs/smb/server/oplock.c
+@@ -1492,7 +1492,7 @@ struct create_context *smb2_find_context_vals(void *open_req, const char *tag, i
+ name_len < 4 ||
+ name_off + name_len > cc_len ||
+ (value_off & 0x7) != 0 ||
+- (value_off && (value_off < name_off + name_len)) ||
++ (value_len && value_off < name_off + (name_len < 8 ? 8 : name_len)) ||
+ ((u64)value_off + value_len > cc_len))
+ return ERR_PTR(-EINVAL);
+
+diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
+index 4b4764abcdffa..6d0896c76b098 100644
+--- a/fs/smb/server/smb2pdu.c
++++ b/fs/smb/server/smb2pdu.c
+@@ -4310,7 +4310,7 @@ static int smb2_get_ea(struct ksmbd_work *work, struct ksmbd_file *fp,
+ if (!strncmp(name, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN))
+ name_len -= XATTR_USER_PREFIX_LEN;
+
+- ptr = (char *)(&eainfo->name + name_len + 1);
++ ptr = eainfo->name + name_len + 1;
+ buf_free_len -= (offsetof(struct smb2_ea_info, name) +
+ name_len + 1);
+ /* bailout if xattr can't fit in buf_free_len */
+diff --git a/fs/smb/server/smb2pdu.h b/fs/smb/server/smb2pdu.h
+index 2767c08a534a3..d12cfd3b09278 100644
+--- a/fs/smb/server/smb2pdu.h
++++ b/fs/smb/server/smb2pdu.h
+@@ -361,7 +361,7 @@ struct smb2_ea_info {
+ __u8 Flags;
+ __u8 EaNameLength;
+ __le16 EaValueLength;
+- char name[1];
++ char name[];
+ /* optionally followed by value */
+ } __packed; /* level 15 Query */
+
+diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c
+index c06efc020bd95..7578200f63b1d 100644
+--- a/fs/smb/server/transport_rdma.c
++++ b/fs/smb/server/transport_rdma.c
+@@ -1366,24 +1366,35 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t,
+ LIST_HEAD(msg_list);
+ char *desc_buf;
+ int credits_needed;
+- unsigned int desc_buf_len;
+- size_t total_length = 0;
++ unsigned int desc_buf_len, desc_num = 0;
+
+ if (t->status != SMB_DIRECT_CS_CONNECTED)
+ return -ENOTCONN;
+
++ if (buf_len > t->max_rdma_rw_size)
++ return -EINVAL;
++
+ /* calculate needed credits */
+ credits_needed = 0;
+ desc_buf = buf;
+ for (i = 0; i < desc_len / sizeof(*desc); i++) {
++ if (!buf_len)
++ break;
++
+ desc_buf_len = le32_to_cpu(desc[i].length);
++ if (!desc_buf_len)
++ return -EINVAL;
++
++ if (desc_buf_len > buf_len) {
++ desc_buf_len = buf_len;
++ desc[i].length = cpu_to_le32(desc_buf_len);
++ buf_len = 0;
++ }
+
+ credits_needed += calc_rw_credits(t, desc_buf, desc_buf_len);
+ desc_buf += desc_buf_len;
+- total_length += desc_buf_len;
+- if (desc_buf_len == 0 || total_length > buf_len ||
+- total_length > t->max_rdma_rw_size)
+- return -EINVAL;
++ buf_len -= desc_buf_len;
++ desc_num++;
+ }
+
+ ksmbd_debug(RDMA, "RDMA %s, len %#x, needed credits %#x\n",
+@@ -1395,7 +1406,7 @@ static int smb_direct_rdma_xmit(struct smb_direct_transport *t,
+
+ /* build rdma_rw_ctx for each descriptor */
+ desc_buf = buf;
+- for (i = 0; i < desc_len / sizeof(*desc); i++) {
++ for (i = 0; i < desc_num; i++) {
+ msg = kzalloc(offsetof(struct smb_direct_rdma_rw_msg, sg_list) +
+ sizeof(struct scatterlist) * SG_CHUNK_SIZE, GFP_KERNEL);
+ if (!msg) {
+diff --git a/include/linux/usb/tcpci.h b/include/linux/usb/tcpci.h
+index 85e95a3251d34..83376473ac765 100644
+--- a/include/linux/usb/tcpci.h
++++ b/include/linux/usb/tcpci.h
+@@ -103,6 +103,7 @@
+ #define TCPC_POWER_STATUS_SINKING_VBUS BIT(0)
+
+ #define TCPC_FAULT_STATUS 0x1f
++#define TCPC_FAULT_STATUS_ALL_REG_RST_TO_DEFAULT BIT(7)
+
+ #define TCPC_ALERT_EXTENDED 0x21
+
+diff --git a/kernel/module/main.c b/kernel/module/main.c
+index f1facc898a646..a04e94c9f8a49 100644
+--- a/kernel/module/main.c
++++ b/kernel/module/main.c
+@@ -1302,12 +1302,20 @@ void *__symbol_get(const char *symbol)
+ };
+
+ preempt_disable();
+- if (!find_symbol(&fsa) || strong_try_module_get(fsa.owner)) {
+- preempt_enable();
+- return NULL;
++ if (!find_symbol(&fsa))
++ goto fail;
++ if (fsa.license != GPL_ONLY) {
++ pr_warn("failing symbol_get of non-GPLONLY symbol %s.\n",
++ symbol);
++ goto fail;
+ }
++ if (strong_try_module_get(fsa.owner))
++ goto fail;
+ preempt_enable();
+ return (void *)kernel_symbol_value(fsa.sym);
++fail:
++ preempt_enable();
++ return NULL;
+ }
+ EXPORT_SYMBOL_GPL(__symbol_get);
+
+diff --git a/sound/usb/stream.c b/sound/usb/stream.c
+index f10f4e6d3fb85..3d4add94e367d 100644
+--- a/sound/usb/stream.c
++++ b/sound/usb/stream.c
+@@ -1093,6 +1093,7 @@ static int __snd_usb_parse_audio_interface(struct snd_usb_audio *chip,
+ int i, altno, err, stream;
+ struct audioformat *fp = NULL;
+ struct snd_usb_power_domain *pd = NULL;
++ bool set_iface_first;
+ int num, protocol;
+
+ dev = chip->dev;
+@@ -1223,11 +1224,19 @@ static int __snd_usb_parse_audio_interface(struct snd_usb_audio *chip,
+ return err;
+ }
+
++ set_iface_first = false;
++ if (protocol == UAC_VERSION_1 ||
++ (chip->quirk_flags & QUIRK_FLAG_SET_IFACE_FIRST))
++ set_iface_first = true;
++
+ /* try to set the interface... */
+ usb_set_interface(chip->dev, iface_no, 0);
++ if (set_iface_first)
++ usb_set_interface(chip->dev, iface_no, altno);
+ snd_usb_init_pitch(chip, fp);
+ snd_usb_init_sample_rate(chip, fp, fp->rate_max);
+- usb_set_interface(chip->dev, iface_no, altno);
++ if (!set_iface_first)
++ usb_set_interface(chip->dev, iface_no, altno);
+ }
+ return 0;
+ }
* [gentoo-commits] proj/linux-patches:6.4 commit in: /
@ 2023-09-13 11:04 Mike Pagano
From: Mike Pagano @ 2023-09-13 11:04 UTC (permalink / raw
To: gentoo-commits
commit: fcec9389a083f96ad3b1a91221c1151a42c9292b
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Wed Sep 13 11:04:11 2023 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Wed Sep 13 11:04:11 2023 +0000
URL: https://gitweb.gentoo.org/proj/linux-patches.git/commit/?id=fcec9389
Linux patch 6.4.16
Signed-off-by: Mike Pagano <mpagano <AT> gentoo.org>
0000_README | 4 +
1015_linux-6.4.16.patch | 37960 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 37964 insertions(+)
diff --git a/0000_README b/0000_README
index 5a91174a..087660bb 100644
--- a/0000_README
+++ b/0000_README
@@ -103,6 +103,10 @@ Patch: 1014_linux-6.4.15.patch
From: https://www.kernel.org
Desc: Linux 6.4.15
+Patch: 1015_linux-6.4.16.patch
+From: https://www.kernel.org
+Desc: Linux 6.4.16
+
Patch: 1500_XATTR_USER_PREFIX.patch
From: https://bugs.gentoo.org/show_bug.cgi?id=470644
Desc: Support for namespace user.pax.* on tmpfs.
diff --git a/1015_linux-6.4.16.patch b/1015_linux-6.4.16.patch
new file mode 100644
index 00000000..d8f25ac8
--- /dev/null
+++ b/1015_linux-6.4.16.patch
@@ -0,0 +1,37960 @@
+diff --git a/Documentation/ABI/testing/sysfs-bus-fsi-devices-sbefifo b/Documentation/ABI/testing/sysfs-bus-fsi-devices-sbefifo
+index 531fe9d6b40aa..c7393b4dd2d88 100644
+--- a/Documentation/ABI/testing/sysfs-bus-fsi-devices-sbefifo
++++ b/Documentation/ABI/testing/sysfs-bus-fsi-devices-sbefifo
+@@ -5,6 +5,6 @@ Description:
+ Indicates whether or not this SBE device has experienced a
+ timeout; i.e. the SBE did not respond within the time allotted
+ by the driver. A value of 1 indicates that a timeout has
+- ocurred and no transfers have completed since the timeout. A
+- value of 0 indicates that no timeout has ocurred, or if one
+- has, more recent transfers have completed successful.
++ occurred and no transfers have completed since the timeout. A
++ value of 0 indicates that no timeout has occurred, or if one
++ has, more recent transfers have completed successfully.
+diff --git a/Documentation/ABI/testing/sysfs-driver-chromeos-acpi b/Documentation/ABI/testing/sysfs-driver-chromeos-acpi
+index c308926e1568a..7c8e129fc1005 100644
+--- a/Documentation/ABI/testing/sysfs-driver-chromeos-acpi
++++ b/Documentation/ABI/testing/sysfs-driver-chromeos-acpi
+@@ -134,4 +134,4 @@ KernelVersion: 5.19
+ Description:
+ Returns the verified boot data block shared between the
+ firmware verification step and the kernel verification step
+- (binary).
++ (hex dump).
+diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
+index 8140fc98f5aee..ad3d76d37c8ba 100644
+--- a/Documentation/ABI/testing/sysfs-fs-f2fs
++++ b/Documentation/ABI/testing/sysfs-fs-f2fs
+@@ -54,9 +54,9 @@ Description: Controls the in-place-update policy.
+ 0x00 DISABLE disable IPU(=default option in LFS mode)
+ 0x01 FORCE all the time
+ 0x02 SSR if SSR mode is activated
+- 0x04 UTIL if FS utilization is over threashold
++ 0x04 UTIL if FS utilization is over threshold
+ 0x08 SSR_UTIL if SSR mode is activated and FS utilization is over
+- threashold
++ threshold
+ 0x10 FSYNC activated in fsync path only for high performance
+ flash storages. IPU will be triggered only if the
+ # of dirty pages over min_fsync_blocks.
+@@ -117,7 +117,7 @@ Date: December 2021
+ Contact: "Konstantin Vyshetsky" <vkon@google.com>
+ Description: Controls the number of discards a thread will issue at a time.
+ Higher number will allow the discard thread to finish its work
+- faster, at the cost of higher latency for incomming I/O.
++ faster, at the cost of higher latency for incoming I/O.
+
+ What: /sys/fs/f2fs/<disk>/min_discard_issue_time
+ Date: December 2021
+@@ -334,7 +334,7 @@ Description: This indicates how many GC can be failed for the pinned
+ state. 2048 trials is set by default.
+
+ What: /sys/fs/f2fs/<disk>/extension_list
+-Date: Feburary 2018
++Date: February 2018
+ Contact: "Chao Yu" <yuchao0@huawei.com>
+ Description: Used to control configure extension list:
+ - Query: cat /sys/fs/f2fs/<disk>/extension_list
+diff --git a/Documentation/devicetree/bindings/clock/qcom,qdu1000-gcc.yaml b/Documentation/devicetree/bindings/clock/qcom,qdu1000-gcc.yaml
+index 767a9d03aa327..d712b1a87e25f 100644
+--- a/Documentation/devicetree/bindings/clock/qcom,qdu1000-gcc.yaml
++++ b/Documentation/devicetree/bindings/clock/qcom,qdu1000-gcc.yaml
+@@ -7,7 +7,8 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
+ title: Qualcomm Global Clock & Reset Controller for QDU1000 and QRU1000
+
+ maintainers:
+- - Melody Olvera <quic_molvera@quicinc.com>
++ - Taniya Das <quic_tdas@quicinc.com>
++ - Imran Shaik <quic_imrashai@quicinc.com>
+
+ description: |
+ Qualcomm global clock control module which supports the clocks, resets and
+diff --git a/Documentation/devicetree/bindings/extcon/maxim,max77843.yaml b/Documentation/devicetree/bindings/extcon/maxim,max77843.yaml
+index 1289605456408..55800fb0221d0 100644
+--- a/Documentation/devicetree/bindings/extcon/maxim,max77843.yaml
++++ b/Documentation/devicetree/bindings/extcon/maxim,max77843.yaml
+@@ -23,6 +23,7 @@ properties:
+
+ connector:
+ $ref: /schemas/connector/usb-connector.yaml#
++ unevaluatedProperties: false
+
+ ports:
+ $ref: /schemas/graph.yaml#/properties/ports
+diff --git a/Documentation/devicetree/bindings/regulator/qcom,rpm-regulator.yaml b/Documentation/devicetree/bindings/regulator/qcom,rpm-regulator.yaml
+index 8a08698e34846..b4eb4001eb3d2 100644
+--- a/Documentation/devicetree/bindings/regulator/qcom,rpm-regulator.yaml
++++ b/Documentation/devicetree/bindings/regulator/qcom,rpm-regulator.yaml
+@@ -49,7 +49,7 @@ patternProperties:
+ ".*-supply$":
+ description: Input supply phandle(s) for this node
+
+- "^((s|l|lvs)[0-9]*)|(s[1-2][a-b])|(ncp)|(mvs)|(usb-switch)|(hdmi-switch)$":
++ "^((s|l|lvs)[0-9]*|s[1-2][a-b]|ncp|mvs|usb-switch|hdmi-switch)$":
+ description: List of regulators and its properties
+ $ref: regulator.yaml#
+ unevaluatedProperties: false
+diff --git a/Documentation/scsi/scsi_mid_low_api.rst b/Documentation/scsi/scsi_mid_low_api.rst
+index 6fa3a62795016..022198c513506 100644
+--- a/Documentation/scsi/scsi_mid_low_api.rst
++++ b/Documentation/scsi/scsi_mid_low_api.rst
+@@ -1190,11 +1190,11 @@ Members of interest:
+ - pointer to scsi_device object that this command is
+ associated with.
+ resid
+- - an LLD should set this signed integer to the requested
++ - an LLD should set this unsigned integer to the requested
+ transfer length (i.e. 'request_bufflen') less the number
+ of bytes that are actually transferred. 'resid' is
+ preset to 0 so an LLD can ignore it if it cannot detect
+- underruns (overruns should be rare). If possible an LLD
++ underruns (overruns should not be reported). An LLD
+ should set 'resid' prior to invoking 'done'. The most
+ interesting case is data transfers from a SCSI target
+ device (e.g. READs) that underrun.
+diff --git a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
+index 2d6e3bbdd0404..72677a280cd64 100644
+--- a/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
++++ b/Documentation/userspace-api/media/v4l/vidioc-subdev-g-routing.rst
+@@ -58,6 +58,9 @@ the subdevice exposes, drivers return the ENOSPC error code and adjust the
+ value of the ``num_routes`` field. Application should then reserve enough memory
+ for all the route entries and call ``VIDIOC_SUBDEV_G_ROUTING`` again.
+
++On a successful ``VIDIOC_SUBDEV_G_ROUTING`` call the driver updates the
++``num_routes`` field to reflect the actual number of routes returned.
++
+ .. tabularcolumns:: |p{4.4cm}|p{4.4cm}|p{8.7cm}|
+
+ .. c:type:: v4l2_subdev_routing
+@@ -138,9 +141,7 @@ ENOSPC
+
+ EINVAL
+ The sink or source pad identifiers reference a non-existing pad, or reference
+- pads of different types (ie. the sink_pad identifiers refers to a source pad)
+- or the sink or source stream identifiers reference a non-existing stream on
+- the sink or source pad.
++ pads of different types (ie. the sink_pad identifiers refers to a source pad).
+
+ E2BIG
+ The application provided ``num_routes`` for ``VIDIOC_SUBDEV_S_ROUTING`` is
+diff --git a/Makefile b/Makefile
+index 212d1c7e4a1a3..34ea74d74476f 100644
+--- a/Makefile
++++ b/Makefile
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0
+ VERSION = 6
+ PATCHLEVEL = 4
+-SUBLEVEL = 15
++SUBLEVEL = 16
+ EXTRAVERSION =
+ NAME = Hurr durr I'ma ninja sloth
+
+@@ -1289,7 +1289,7 @@ prepare0: archprepare
+ # All the preparing..
+ prepare: prepare0
+ ifdef CONFIG_RUST
+- $(Q)$(CONFIG_SHELL) $(srctree)/scripts/rust_is_available.sh -v
++ $(Q)$(CONFIG_SHELL) $(srctree)/scripts/rust_is_available.sh
+ $(Q)$(MAKE) $(build)=rust
+ endif
+
+@@ -1825,7 +1825,7 @@ $(DOC_TARGETS):
+ # "Is Rust available?" target
+ PHONY += rustavailable
+ rustavailable:
+- $(Q)$(CONFIG_SHELL) $(srctree)/scripts/rust_is_available.sh -v && echo "Rust is available!"
++ $(Q)$(CONFIG_SHELL) $(srctree)/scripts/rust_is_available.sh && echo "Rust is available!"
+
+ # Documentation target
+ #
+diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
+index 59829fc903152..3cf1bf724e8ed 100644
+--- a/arch/arm/boot/dts/Makefile
++++ b/arch/arm/boot/dts/Makefile
+@@ -335,6 +335,7 @@ dtb-$(CONFIG_MACH_KIRKWOOD) += \
+ kirkwood-iconnect.dtb \
+ kirkwood-iomega_ix2_200.dtb \
+ kirkwood-is2.dtb \
++ kirkwood-km_fixedeth.dtb \
+ kirkwood-km_kirkwood.dtb \
+ kirkwood-l-50.dtb \
+ kirkwood-laplug.dtb \
+@@ -880,7 +881,10 @@ dtb-$(CONFIG_ARCH_OMAP3) += \
+ am3517-craneboard.dtb \
+ am3517-evm.dtb \
+ am3517_mt_ventoux.dtb \
++ logicpd-torpedo-35xx-devkit.dtb \
+ logicpd-torpedo-37xx-devkit.dtb \
++ logicpd-torpedo-37xx-devkit-28.dtb \
++ logicpd-som-lv-35xx-devkit.dtb \
+ logicpd-som-lv-37xx-devkit.dtb \
+ omap3430-sdp.dtb \
+ omap3-beagle.dtb \
+@@ -1561,6 +1565,8 @@ dtb-$(CONFIG_MACH_ARMADA_38X) += \
+ armada-388-helios4.dtb \
+ armada-388-rd.dtb
+ dtb-$(CONFIG_MACH_ARMADA_39X) += \
++ armada-390-db.dtb \
++ armada-395-gp.dtb \
+ armada-398-db.dtb
+ dtb-$(CONFIG_MACH_ARMADA_XP) += \
+ armada-xp-axpwifiap.dtb \
+@@ -1590,6 +1596,7 @@ dtb-$(CONFIG_MACH_DOVE) += \
+ dtb-$(CONFIG_ARCH_MEDIATEK) += \
+ mt2701-evb.dtb \
+ mt6580-evbp1.dtb \
++ mt6582-prestigio-pmt5008-3g.dtb \
+ mt6589-aquaris5.dtb \
+ mt6589-fairphone-fp1.dtb \
+ mt6592-evb.dtb \
+@@ -1645,6 +1652,7 @@ dtb-$(CONFIG_ARCH_ASPEED) += \
+ aspeed-bmc-intel-s2600wf.dtb \
+ aspeed-bmc-inspur-fp5280g2.dtb \
+ aspeed-bmc-inspur-nf5280m6.dtb \
++ aspeed-bmc-inspur-on5263m5.dtb \
+ aspeed-bmc-lenovo-hr630.dtb \
+ aspeed-bmc-lenovo-hr855xg2.dtb \
+ aspeed-bmc-microsoft-olympus.dtb \
+diff --git a/arch/arm/boot/dts/bcm47189-luxul-xap-1440.dts b/arch/arm/boot/dts/bcm47189-luxul-xap-1440.dts
+index 0734aa249b8e0..0f6d7fe30068f 100644
+--- a/arch/arm/boot/dts/bcm47189-luxul-xap-1440.dts
++++ b/arch/arm/boot/dts/bcm47189-luxul-xap-1440.dts
+@@ -26,7 +26,6 @@
+ led-wlan {
+ label = "bcm53xx:blue:wlan";
+ gpios = <&chipcommon 10 GPIO_ACTIVE_LOW>;
+- linux,default-trigger = "default-off";
+ };
+
+ led-system {
+@@ -46,3 +45,16 @@
+ };
+ };
+ };
++
++&gmac0 {
++ phy-mode = "rgmii";
++ phy-handle = <&bcm54210e>;
++
++ mdio {
++ /delete-node/ switch@1e;
++
++ bcm54210e: ethernet-phy@0 {
++ reg = <0>;
++ };
++ };
++};
+diff --git a/arch/arm/boot/dts/bcm47189-luxul-xap-810.dts b/arch/arm/boot/dts/bcm47189-luxul-xap-810.dts
+index e6fb6cbe69633..4e0ef0af726f5 100644
+--- a/arch/arm/boot/dts/bcm47189-luxul-xap-810.dts
++++ b/arch/arm/boot/dts/bcm47189-luxul-xap-810.dts
+@@ -26,7 +26,6 @@
+ led-5ghz {
+ label = "bcm53xx:blue:5ghz";
+ gpios = <&chipcommon 11 GPIO_ACTIVE_HIGH>;
+- linux,default-trigger = "default-off";
+ };
+
+ led-system {
+@@ -42,7 +41,6 @@
+ led-2ghz {
+ label = "bcm53xx:blue:2ghz";
+ gpios = <&pcie0_chipcommon 3 GPIO_ACTIVE_HIGH>;
+- linux,default-trigger = "default-off";
+ };
+ };
+
+@@ -83,3 +81,16 @@
+ };
+ };
+ };
++
++&gmac0 {
++ phy-mode = "rgmii";
++ phy-handle = <&bcm54210e>;
++
++ mdio {
++ /delete-node/ switch@1e;
++
++ bcm54210e: ethernet-phy@0 {
++ reg = <0>;
++ };
++ };
++};
+diff --git a/arch/arm/boot/dts/bcm47189-tenda-ac9.dts b/arch/arm/boot/dts/bcm47189-tenda-ac9.dts
+index dab2e5f63a727..06b1a582809ca 100644
+--- a/arch/arm/boot/dts/bcm47189-tenda-ac9.dts
++++ b/arch/arm/boot/dts/bcm47189-tenda-ac9.dts
+@@ -135,8 +135,8 @@
+ label = "lan4";
+ };
+
+- port@5 {
+- reg = <5>;
++ port@8 {
++ reg = <8>;
+ label = "cpu";
+ ethernet = <&gmac0>;
+ };
+diff --git a/arch/arm/boot/dts/bcm53573.dtsi b/arch/arm/boot/dts/bcm53573.dtsi
+index 3f03a381db0f2..eed1a6147f0bf 100644
+--- a/arch/arm/boot/dts/bcm53573.dtsi
++++ b/arch/arm/boot/dts/bcm53573.dtsi
+@@ -127,6 +127,9 @@
+
+ pcie0: pcie@2000 {
+ reg = <0x00002000 0x1000>;
++
++ #address-cells = <3>;
++ #size-cells = <2>;
+ };
+
+ usb2: usb2@4000 {
+@@ -156,8 +159,6 @@
+ };
+
+ ohci: usb@d000 {
+- #usb-cells = <0>;
+-
+ compatible = "generic-ohci";
+ reg = <0xd000 0x1000>;
+ interrupt-parent = <&gic>;
+diff --git a/arch/arm/boot/dts/bcm947189acdbmr.dts b/arch/arm/boot/dts/bcm947189acdbmr.dts
+index 3709baa2376f5..0b8727ae6f16d 100644
+--- a/arch/arm/boot/dts/bcm947189acdbmr.dts
++++ b/arch/arm/boot/dts/bcm947189acdbmr.dts
+@@ -60,9 +60,9 @@
+ spi {
+ compatible = "spi-gpio";
+ num-chipselects = <1>;
+- gpio-sck = <&chipcommon 21 0>;
+- gpio-miso = <&chipcommon 22 0>;
+- gpio-mosi = <&chipcommon 23 0>;
++ sck-gpios = <&chipcommon 21 0>;
++ miso-gpios = <&chipcommon 22 0>;
++ mosi-gpios = <&chipcommon 23 0>;
+ cs-gpios = <&chipcommon 24 0>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+diff --git a/arch/arm/boot/dts/integratorap.dts b/arch/arm/boot/dts/integratorap.dts
+index 5b52d75bc6bed..d9927d3181dce 100644
+--- a/arch/arm/boot/dts/integratorap.dts
++++ b/arch/arm/boot/dts/integratorap.dts
+@@ -158,7 +158,7 @@
+ valid-mask = <0x003fffff>;
+ };
+
+- pci: pciv3@62000000 {
++ pci: pci@62000000 {
+ compatible = "arm,integrator-ap-pci", "v3,v360epc-pci";
+ device_type = "pci";
+ #interrupt-cells = <1>;
+diff --git a/arch/arm/boot/dts/qcom-ipq4019.dtsi b/arch/arm/boot/dts/qcom-ipq4019.dtsi
+index f0ef86fadc9d9..e328216443135 100644
+--- a/arch/arm/boot/dts/qcom-ipq4019.dtsi
++++ b/arch/arm/boot/dts/qcom-ipq4019.dtsi
+@@ -230,9 +230,12 @@
+ interrupts = <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>, <GIC_SPI 138 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "hc_irq", "pwr_irq";
+ bus-width = <8>;
+- clocks = <&gcc GCC_SDCC1_AHB_CLK>, <&gcc GCC_SDCC1_APPS_CLK>,
+- <&gcc GCC_DCD_XO_CLK>;
+- clock-names = "iface", "core", "xo";
++ clocks = <&gcc GCC_SDCC1_AHB_CLK>,
++ <&gcc GCC_SDCC1_APPS_CLK>,
++ <&xo>;
++ clock-names = "iface",
++ "core",
++ "xo";
+ status = "disabled";
+ };
+
+diff --git a/arch/arm/boot/dts/qcom-sdx65-mtp.dts b/arch/arm/boot/dts/qcom-sdx65-mtp.dts
+index 57bc3b03d3aac..4264ace66b295 100644
+--- a/arch/arm/boot/dts/qcom-sdx65-mtp.dts
++++ b/arch/arm/boot/dts/qcom-sdx65-mtp.dts
+@@ -7,7 +7,7 @@
+ #include "qcom-sdx65.dtsi"
+ #include <dt-bindings/regulator/qcom,rpmh-regulator.h>
+ #include <arm64/qcom/pmk8350.dtsi>
+-#include <arm64/qcom/pm8150b.dtsi>
++#include <arm64/qcom/pm7250b.dtsi>
+ #include "qcom-pmx65.dtsi"
+
+ / {
+diff --git a/arch/arm/boot/dts/s3c6410-mini6410.dts b/arch/arm/boot/dts/s3c6410-mini6410.dts
+index 17097da36f5ed..0b07b3c319604 100644
+--- a/arch/arm/boot/dts/s3c6410-mini6410.dts
++++ b/arch/arm/boot/dts/s3c6410-mini6410.dts
+@@ -51,7 +51,7 @@
+
+ ethernet@18000000 {
+ compatible = "davicom,dm9000";
+- reg = <0x18000000 0x2 0x18000004 0x2>;
++ reg = <0x18000000 0x2>, <0x18000004 0x2>;
+ interrupt-parent = <&gpn>;
+ interrupts = <7 IRQ_TYPE_LEVEL_HIGH>;
+ davicom,no-eeprom;
+diff --git a/arch/arm/boot/dts/s5pv210-smdkv210.dts b/arch/arm/boot/dts/s5pv210-smdkv210.dts
+index fbae768d65e27..901e7197b1368 100644
+--- a/arch/arm/boot/dts/s5pv210-smdkv210.dts
++++ b/arch/arm/boot/dts/s5pv210-smdkv210.dts
+@@ -41,7 +41,7 @@
+
+ ethernet@a8000000 {
+ compatible = "davicom,dm9000";
+- reg = <0xA8000000 0x2 0xA8000002 0x2>;
++ reg = <0xa8000000 0x2>, <0xa8000002 0x2>;
+ interrupt-parent = <&gph1>;
+ interrupts = <1 IRQ_TYPE_LEVEL_HIGH>;
+ local-mac-address = [00 00 de ad be ef];
+@@ -55,6 +55,14 @@
+ default-brightness-level = <6>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&pwm3_out>;
++ power-supply = <&dc5v_reg>;
++ };
++
++ dc5v_reg: regulator-0 {
++ compatible = "regulator-fixed";
++ regulator-name = "DC5V";
++ regulator-min-microvolt = <5000000>;
++ regulator-max-microvolt = <5000000>;
+ };
+ };
+
+diff --git a/arch/arm/boot/dts/stm32mp157c-emstamp-argon.dtsi b/arch/arm/boot/dts/stm32mp157c-emstamp-argon.dtsi
+index b01470a9a3d53..fd89542c69c93 100644
+--- a/arch/arm/boot/dts/stm32mp157c-emstamp-argon.dtsi
++++ b/arch/arm/boot/dts/stm32mp157c-emstamp-argon.dtsi
+@@ -97,9 +97,11 @@
+ adc1: adc@0 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&adc1_in6_pins_a>;
+- st,min-sample-time-nsecs = <5000>;
+- st,adc-channels = <6>;
+ status = "disabled";
++ channel@6 {
++ reg = <6>;
++ st,min-sample-time-ns = <5000>;
++ };
+ };
+
+ adc2: adc@100 {
+@@ -366,8 +368,8 @@
+ &m4_rproc {
+ memory-region = <&retram>, <&mcuram>, <&mcuram2>, <&vdev0vring0>,
+ <&vdev0vring1>, <&vdev0buffer>;
+- mboxes = <&ipcc 0>, <&ipcc 1>, <&ipcc 2>;
+- mbox-names = "vq0", "vq1", "shutdown";
++ mboxes = <&ipcc 0>, <&ipcc 1>, <&ipcc 2>, <&ipcc 3>;
++ mbox-names = "vq0", "vq1", "shutdown", "detach";
+ interrupt-parent = <&exti>;
+ interrupts = <68 1>;
+ interrupt-names = "wdg";
+diff --git a/arch/arm/boot/dts/stm32mp157c-odyssey-som.dtsi b/arch/arm/boot/dts/stm32mp157c-odyssey-som.dtsi
+index e22871dc580c8..cf74852514906 100644
+--- a/arch/arm/boot/dts/stm32mp157c-odyssey-som.dtsi
++++ b/arch/arm/boot/dts/stm32mp157c-odyssey-som.dtsi
+@@ -230,8 +230,8 @@
+ &m4_rproc {
+ memory-region = <&retram>, <&mcuram>, <&mcuram2>, <&vdev0vring0>,
+ <&vdev0vring1>, <&vdev0buffer>;
+- mboxes = <&ipcc 0>, <&ipcc 1>, <&ipcc 2>;
+- mbox-names = "vq0", "vq1", "shutdown";
++ mboxes = <&ipcc 0>, <&ipcc 1>, <&ipcc 2>, <&ipcc 3>;
++ mbox-names = "vq0", "vq1", "shutdown", "detach";
+ interrupt-parent = <&exti>;
+ interrupts = <68 1>;
+ status = "okay";
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcom-som.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcom-som.dtsi
+index c06edd2eacb0c..74a11ccc5333f 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcom-som.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcom-som.dtsi
+@@ -80,17 +80,19 @@
+ vdda-supply = <&vdda>;
+ vref-supply = <&vdda>;
+ status = "okay";
++};
+
+- adc1: adc@0 {
+- st,min-sample-time-nsecs = <5000>;
+- st,adc-channels = <0>;
+- status = "okay";
++&adc1 {
++ channel@0 {
++ reg = <0>;
++ st,min-sample-time-ns = <5000>;
+ };
++};
+
+- adc2: adc@100 {
+- st,adc-channels = <1>;
+- st,min-sample-time-nsecs = <5000>;
+- status = "okay";
++&adc2 {
++ channel@1 {
++ reg = <1>;
++ st,min-sample-time-ns = <5000>;
+ };
+ };
+
+@@ -414,8 +416,8 @@
+ &m4_rproc {
+ memory-region = <&retram>, <&mcuram>, <&mcuram2>, <&vdev0vring0>,
+ <&vdev0vring1>, <&vdev0buffer>;
+- mboxes = <&ipcc 0>, <&ipcc 1>, <&ipcc 2>;
+- mbox-names = "vq0", "vq1", "shutdown";
++ mboxes = <&ipcc 0>, <&ipcc 1>, <&ipcc 2>, <&ipcc 3>;
++ mbox-names = "vq0", "vq1", "shutdown", "detach";
+ interrupt-parent = <&exti>;
+ interrupts = <68 1>;
+ status = "okay";
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
+index 7d5d6d4360385..c792dff433fc5 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcor-avenger96.dtsi
+@@ -111,17 +111,39 @@
+ vdda-supply = <&vdda>;
+ vref-supply = <&vdda>;
+ status = "okay";
++};
+
+- adc1: adc@0 {
+- st,adc-channels = <0 1 6>;
+- st,min-sample-time-nsecs = <5000>;
+- status = "okay";
++&adc1 {
++ channel@0 {
++ reg = <0>;
++ st,min-sample-time-ns = <5000>;
+ };
+
+- adc2: adc@100 {
+- st,adc-channels = <0 1 2>;
+- st,min-sample-time-nsecs = <5000>;
+- status = "okay";
++ channel@1 {
++ reg = <1>;
++ st,min-sample-time-ns = <5000>;
++ };
++
++ channel@6 {
++ reg = <6>;
++ st,min-sample-time-ns = <5000>;
++ };
++};
++
++&adc2 {
++ channel@0 {
++ reg = <0>;
++ st,min-sample-time-ns = <5000>;
++ };
++
++ channel@1 {
++ reg = <1>;
++ st,min-sample-time-ns = <5000>;
++ };
++
++ channel@2 {
++ reg = <2>;
++ st,min-sample-time-ns = <5000>;
+ };
+ };
+
+diff --git a/arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi b/arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi
+index bba19f21e5277..89881a26c6141 100644
+--- a/arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi
++++ b/arch/arm/boot/dts/stm32mp15xx-dhcor-som.dtsi
+@@ -227,8 +227,8 @@
+ &m4_rproc {
+ memory-region = <&retram>, <&mcuram>, <&mcuram2>, <&vdev0vring0>,
+ <&vdev0vring1>, <&vdev0buffer>;
+- mboxes = <&ipcc 0>, <&ipcc 1>, <&ipcc 2>;
+- mbox-names = "vq0", "vq1", "shutdown";
++ mboxes = <&ipcc 0>, <&ipcc 1>, <&ipcc 2>, <&ipcc 3>;
++ mbox-names = "vq0", "vq1", "shutdown", "detach";
+ interrupt-parent = <&exti>;
+ interrupts = <68 1>;
+ status = "okay";
+diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
+index dfeed440254a8..fe4326d938c18 100644
+--- a/arch/arm/include/asm/syscall.h
++++ b/arch/arm/include/asm/syscall.h
+@@ -25,6 +25,9 @@ static inline int syscall_get_nr(struct task_struct *task,
+ if (IS_ENABLED(CONFIG_AEABI) && !IS_ENABLED(CONFIG_OABI_COMPAT))
+ return task_thread_info(task)->abi_syscall;
+
++ if (task_thread_info(task)->abi_syscall == -1)
++ return -1;
++
+ return task_thread_info(task)->abi_syscall & __NR_SYSCALL_MASK;
+ }
+
+diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
+index 03d4c5578c5c9..b60bba3c1d516 100644
+--- a/arch/arm/kernel/entry-common.S
++++ b/arch/arm/kernel/entry-common.S
+@@ -90,6 +90,7 @@ slow_work_pending:
+ cmp r0, #0
+ beq no_work_pending
+ movlt scno, #(__NR_restart_syscall - __NR_SYSCALL_BASE)
++ str scno, [tsk, #TI_ABI_SYSCALL] @ make sure tracers see update
+ ldmia sp, {r0 - r6} @ have to reload r0 - r6
+ b local_restart @ ... and off we go
+ ENDPROC(ret_fast_syscall)
+diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
+index 2d8e2516906b6..fef32d73f9120 100644
+--- a/arch/arm/kernel/ptrace.c
++++ b/arch/arm/kernel/ptrace.c
+@@ -783,8 +783,9 @@ long arch_ptrace(struct task_struct *child, long request,
+ break;
+
+ case PTRACE_SET_SYSCALL:
+- task_thread_info(child)->abi_syscall = data &
+- __NR_SYSCALL_MASK;
++ if (data != -1)
++ data &= __NR_SYSCALL_MASK;
++ task_thread_info(child)->abi_syscall = data;
+ ret = 0;
+ break;
+
+diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
+index 777f9f8e7cd86..5e05dd1324e7b 100644
+--- a/arch/arm/mach-omap2/powerdomain.c
++++ b/arch/arm/mach-omap2/powerdomain.c
+@@ -174,7 +174,7 @@ static int _pwrdm_state_switch(struct powerdomain *pwrdm, int flag)
+ break;
+ case PWRDM_STATE_PREV:
+ prev = pwrdm_read_prev_pwrst(pwrdm);
+- if (pwrdm->state != prev)
++ if (prev >= 0 && pwrdm->state != prev)
+ pwrdm->state_counter[prev]++;
+ if (prev == PWRDM_POWER_RET)
+ _update_logic_membank_counters(pwrdm);
+diff --git a/arch/arm64/boot/dts/freescale/imx8mp-debix-model-a.dts b/arch/arm64/boot/dts/freescale/imx8mp-debix-model-a.dts
+index b4409349eb3f6..1004ab0abb131 100644
+--- a/arch/arm64/boot/dts/freescale/imx8mp-debix-model-a.dts
++++ b/arch/arm64/boot/dts/freescale/imx8mp-debix-model-a.dts
+@@ -355,28 +355,6 @@
+ >;
+ };
+
+- pinctrl_fec: fecgrp {
+- fsl,pins = <
+- MX8MP_IOMUXC_SAI1_RXD2__ENET1_MDC 0x3
+- MX8MP_IOMUXC_SAI1_RXD3__ENET1_MDIO 0x3
+- MX8MP_IOMUXC_SAI1_RXD4__ENET1_RGMII_RD0 0x91
+- MX8MP_IOMUXC_SAI1_RXD5__ENET1_RGMII_RD1 0x91
+- MX8MP_IOMUXC_SAI1_RXD6__ENET1_RGMII_RD2 0x91
+- MX8MP_IOMUXC_SAI1_RXD7__ENET1_RGMII_RD3 0x91
+- MX8MP_IOMUXC_SAI1_TXC__ENET1_RGMII_RXC 0x91
+- MX8MP_IOMUXC_SAI1_TXFS__ENET1_RGMII_RX_CTL 0x91
+- MX8MP_IOMUXC_SAI1_TXD0__ENET1_RGMII_TD0 0x1f
+- MX8MP_IOMUXC_SAI1_TXD1__ENET1_RGMII_TD1 0x1f
+- MX8MP_IOMUXC_SAI1_TXD2__ENET1_RGMII_TD2 0x1f
+- MX8MP_IOMUXC_SAI1_TXD3__ENET1_RGMII_TD3 0x1f
+- MX8MP_IOMUXC_SAI1_TXD4__ENET1_RGMII_TX_CTL 0x1f
+- MX8MP_IOMUXC_SAI1_TXD5__ENET1_RGMII_TXC 0x1f
+- MX8MP_IOMUXC_SAI1_RXD1__ENET1_1588_EVENT1_OUT 0x1f
+- MX8MP_IOMUXC_SAI1_RXD0__ENET1_1588_EVENT1_IN 0x1f
+- MX8MP_IOMUXC_SAI1_TXD7__GPIO4_IO19 0x19
+- >;
+- };
+-
+ pinctrl_gpio_led: gpioledgrp {
+ fsl,pins = <
+ MX8MP_IOMUXC_NAND_READY_B__GPIO3_IO16 0x19
+diff --git a/arch/arm64/boot/dts/nvidia/tegra210-smaug.dts b/arch/arm64/boot/dts/nvidia/tegra210-smaug.dts
+index d7d7c63e62e25..79d294c2ee199 100644
+--- a/arch/arm64/boot/dts/nvidia/tegra210-smaug.dts
++++ b/arch/arm64/boot/dts/nvidia/tegra210-smaug.dts
+@@ -1312,6 +1312,7 @@
+
+ uartd: serial@70006300 {
+ compatible = "nvidia,tegra30-hsuart";
++ reset-names = "serial";
+ status = "okay";
+
+ bluetooth {
+diff --git a/arch/arm64/boot/dts/nvidia/tegra234-p3737-0000+p3701-0000.dts b/arch/arm64/boot/dts/nvidia/tegra234-p3737-0000+p3701-0000.dts
+index caa9e952a149c..a1194c4e15f0e 100644
+--- a/arch/arm64/boot/dts/nvidia/tegra234-p3737-0000+p3701-0000.dts
++++ b/arch/arm64/boot/dts/nvidia/tegra234-p3737-0000+p3701-0000.dts
+@@ -2010,6 +2010,7 @@
+
+ serial@3100000 {
+ compatible = "nvidia,tegra194-hsuart";
++ reset-names = "serial";
+ status = "okay";
+ };
+
+diff --git a/arch/arm64/boot/dts/qcom/apq8016-sbc.dts b/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
+index 3ec449f5cab78..fa92a870cfc40 100644
+--- a/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
++++ b/arch/arm64/boot/dts/qcom/apq8016-sbc.dts
+@@ -75,7 +75,7 @@
+
+ usb_id: usb-id {
+ compatible = "linux,extcon-usb-gpio";
+- id-gpio = <&msmgpio 121 GPIO_ACTIVE_HIGH>;
++ id-gpio = <&tlmm 121 GPIO_ACTIVE_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&usb_id_default>;
+ };
+@@ -101,13 +101,13 @@
+ button {
+ label = "Volume Up";
+ linux,code = <KEY_VOLUMEUP>;
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ };
+ };
+
+ leds {
+ pinctrl-names = "default";
+- pinctrl-0 = <&msmgpio_leds>,
++ pinctrl-0 = <&tlmm_leds>,
+ <&pm8916_gpios_leds>,
+ <&pm8916_mpps_leds>;
+
+@@ -117,7 +117,7 @@
+ label = "apq8016-sbc:green:user1";
+ function = LED_FUNCTION_HEARTBEAT;
+ color = <LED_COLOR_ID_GREEN>;
+- gpios = <&msmgpio 21 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 21 GPIO_ACTIVE_HIGH>;
+ linux,default-trigger = "heartbeat";
+ default-state = "off";
+ };
+@@ -126,7 +126,7 @@
+ label = "apq8016-sbc:green:user2";
+ function = LED_FUNCTION_DISK_ACTIVITY;
+ color = <LED_COLOR_ID_GREEN>;
+- gpios = <&msmgpio 120 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 120 GPIO_ACTIVE_HIGH>;
+ linux,default-trigger = "mmc0";
+ default-state = "off";
+ };
+@@ -186,14 +186,14 @@
+ compatible = "adi,adv7533";
+ reg = <0x39>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <31 IRQ_TYPE_EDGE_FALLING>;
+
+ adi,dsi-lanes = <4>;
+ clocks = <&rpmcc RPM_SMD_BB_CLK2>;
+ clock-names = "cec";
+
+- pd-gpios = <&msmgpio 32 GPIO_ACTIVE_HIGH>;
++ pd-gpios = <&tlmm 32 GPIO_ACTIVE_HIGH>;
+
+ avdd-supply = <&pm8916_l6>;
+ v1p2-supply = <&pm8916_l6>;
+@@ -276,8 +276,8 @@
+ compatible = "ovti,ov5640";
+ reg = <0x3b>;
+
+- enable-gpios = <&msmgpio 34 GPIO_ACTIVE_HIGH>;
+- reset-gpios = <&msmgpio 35 GPIO_ACTIVE_LOW>;
++ powerdown-gpios = <&tlmm 34 GPIO_ACTIVE_HIGH>;
++ reset-gpios = <&tlmm 35 GPIO_ACTIVE_LOW>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&camera_rear_default>;
+
+@@ -285,9 +285,9 @@
+ clock-names = "xclk";
+ clock-frequency = <23880000>;
+
+- vdddo-supply = <&camera_vdddo_1v8>;
+- vdda-supply = <&camera_vdda_2v8>;
+- vddd-supply = <&camera_vddd_1v5>;
++ DOVDD-supply = <&camera_vdddo_1v8>;
++ AVDD-supply = <&camera_vdda_2v8>;
++ DVDD-supply = <&camera_vddd_1v5>;
+
+ /* No camera mezzanine by default */
+ status = "disabled";
+@@ -310,6 +310,10 @@
+ status = "okay";
+ };
+
++&lpass_codec {
++ status = "okay";
++};
++
+ &mdss {
+ status = "okay";
+ };
+@@ -325,6 +329,40 @@
+ linux,code = <KEY_VOLUMEDOWN>;
+ };
+
++&pm8916_rpm_regulators {
++ /*
++ * The 96Boards specification expects a 1.8V power rail on the low-speed
++ * expansion connector that is able to provide at least 0.18W / 100 mA.
++ * L15/L16 are connected in parallel to provide 55 mA each. A minimum load
++ * must be specified to ensure the regulators are not put in LPM where they
++ * would only provide 5 mA.
++ */
++ pm8916_l15: l15 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-system-load = <50000>;
++ regulator-allow-set-load;
++ regulator-always-on;
++ };
++ pm8916_l16: l16 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ regulator-system-load = <50000>;
++ regulator-allow-set-load;
++ regulator-always-on;
++ };
++
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
++ };
++};
++
++&pm8916_s4 {
++ regulator-always-on;
++ regulator-boot-on;
++};
++
+ &sdhc_1 {
+ status = "okay";
+
+@@ -340,7 +378,7 @@
+ pinctrl-0 = <&sdc2_clk_on &sdc2_cmd_on &sdc2_data_on &sdc2_cd_on>;
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>;
+
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+ };
+
+ &sound {
+@@ -399,6 +437,7 @@
+ };
+
+ &wcd_codec {
++ status = "okay";
+ clocks = <&gcc GCC_CODEC_DIGCODEC_CLK>;
+ clock-names = "mclk";
+ qcom,mbhc-vthreshold-low = <75 150 237 450 500>;
+@@ -441,125 +480,6 @@
+ &stm { status = "okay"; };
+ &tpiu { status = "okay"; };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1250000>;
+- regulator-max-microvolt = <1350000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1850000>;
+- regulator-max-microvolt = <2150000>;
+-
+- regulator-always-on;
+- regulator-boot-on;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2900000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2800000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- /*
+- * The 96Boards specification expects a 1.8V power rail on the low-speed
+- * expansion connector that is able to provide at least 0.18W / 100 mA.
+- * L15/L16 are connected in parallel to provide 55 mA each. A minimum load
+- * must be specified to ensure the regulators are not put in LPM where they
+- * would only provide 5 mA.
+- */
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- regulator-system-load = <50000>;
+- regulator-allow-set-load;
+- regulator-always-on;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- regulator-system-load = <50000>;
+- regulator-allow-set-load;
+- regulator-always-on;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+ /*
+ * 2mA drive strength is not enough when connecting multiple
+ * I2C devices with different pull up resistors.
+@@ -600,7 +520,7 @@
+ * ones actually used for GPIO.
+ */
+
+-&msmgpio {
++&tlmm {
+ gpio-line-names =
+ "[UART0_TX]", /* GPIO_0, LSEC pin 5 */
+ "[UART0_RX]", /* GPIO_1, LSEC pin 7 */
+@@ -725,7 +645,7 @@
+ "USR_LED_2_CTRL", /* GPIO 120 */
+ "SB_HS_ID";
+
+- msmgpio_leds: msmgpio-leds-state {
++ tlmm_leds: tlmm-leds-state {
+ pins = "gpio21", "gpio120";
+ function = "gpio";
+
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-acer-a1-724.dts b/arch/arm64/boot/dts/qcom/msm8916-acer-a1-724.dts
+index 13cd9ad167df7..7b77a80f049c6 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-acer-a1-724.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-acer-a1-724.dts
+@@ -39,14 +39,14 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+ };
+
+ usb_id: usb-id {
+ compatible = "linux,extcon-usb-gpio";
+- id-gpio = <&msmgpio 110 GPIO_ACTIVE_HIGH>;
++ id-gpio = <&tlmm 110 GPIO_ACTIVE_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&usb_id_default>;
+ };
+@@ -58,7 +58,7 @@
+ accelerometer@10 {
+ compatible = "bosch,bmc150_accel";
+ reg = <0x10>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <115 IRQ_TYPE_EDGE_RISING>;
+
+ vdd-supply = <&pm8916_l17>;
+@@ -89,10 +89,10 @@
+ compatible = "edt,edt-ft5406";
+ reg = <0x38>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_LEVEL_LOW>;
+
+- reset-gpios = <&msmgpio 12 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 12 GPIO_ACTIVE_LOW>;
+
+ vcc-supply = <&pm8916_l16>;
+ iovcc-supply = <&pm8916_l6>;
+@@ -114,6 +114,18 @@
+ status = "okay";
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l16: l16 {
++ regulator-min-microvolt = <2900000>;
++ regulator-max-microvolt = <2900000>;
++ };
++
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &pm8916_vib {
+ status = "okay";
+ };
+@@ -131,7 +143,7 @@
+ pinctrl-0 = <&sdc2_clk_on &sdc2_cmd_on &sdc2_data_on>;
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off>;
+
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_HIGH>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_HIGH>;
+
+ status = "okay";
+ };
+@@ -153,110 +165,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-system-load = <200000>;
+- regulator-allow-set-load;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <2900000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ accel_int_default: accel-int-default-state {
+ pins = "gpio115";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-alcatel-idol347.dts b/arch/arm64/boot/dts/qcom/msm8916-alcatel-idol347.dts
+index fecb69944cfa3..d2abbdec5fe68 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-alcatel-idol347.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-alcatel-idol347.dts
+@@ -30,7 +30,7 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+ };
+@@ -42,7 +42,7 @@
+ pinctrl-0 = <&gpio_leds_default>;
+
+ led-0 {
+- gpios = <&msmgpio 32 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 32 GPIO_ACTIVE_HIGH>;
+ linux,default-trigger = "torch";
+ function = LED_FUNCTION_TORCH;
+ };
+@@ -50,7 +50,7 @@
+
+ usb_id: usb-id {
+ compatible = "linux,extcon-usb-gpio";
+- id-gpio = <&msmgpio 69 GPIO_ACTIVE_HIGH>;
++ id-gpio = <&tlmm 69 GPIO_ACTIVE_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&usb_id_default>;
+ };
+@@ -66,9 +66,9 @@
+ touchscreen@26 {
+ compatible = "mstar,msg2638";
+ reg = <0x26>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+- reset-gpios = <&msmgpio 100 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 100 GPIO_ACTIVE_LOW>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&ts_int_reset_default>;
+ vdd-supply = <&pm8916_l17>;
+@@ -86,7 +86,7 @@
+ reg = <0x0c>;
+ vdd-supply = <&pm8916_l17>;
+ vid-supply = <&pm8916_l6>;
+- reset-gpios = <&msmgpio 8 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 8 GPIO_ACTIVE_LOW>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&mag_reset_default>;
+ mount-matrix = "0", "1", "0",
+@@ -99,7 +99,7 @@
+ reg = <0x0f>;
+ vdd-supply = <&pm8916_l17>;
+ vddio-supply = <&pm8916_l6>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <31 IRQ_TYPE_EDGE_RISING>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&accel_int_default>;
+@@ -111,7 +111,7 @@
+ proximity@48 {
+ compatible = "sensortek,stk3310";
+ reg = <0x48>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <12 IRQ_TYPE_EDGE_FALLING>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&proximity_int_default>;
+@@ -122,7 +122,7 @@
+ reg = <0x68>;
+ vdd-supply = <&pm8916_l17>;
+ vddio-supply = <&pm8916_l6>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <97 IRQ_TYPE_EDGE_RISING>,
+ <98 IRQ_TYPE_EDGE_RISING>;
+ pinctrl-names = "default";
+@@ -136,7 +136,7 @@
+ led-controller@68 {
+ compatible = "si-en,sn3190";
+ reg = <0x68>;
+- shutdown-gpios = <&msmgpio 89 GPIO_ACTIVE_HIGH>;
++ shutdown-gpios = <&tlmm 89 GPIO_ACTIVE_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&led_enable_default &led_shutdown_default>;
+ #address-cells = <1>;
+@@ -156,6 +156,13 @@
+ linux,code = <KEY_VOLUMEDOWN>;
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &pm8916_vib {
+ status = "okay";
+ };
+@@ -175,7 +182,7 @@
+ pinctrl-0 = <&sdc2_clk_on &sdc2_cmd_on &sdc2_data_on &sdc2_cd_on>;
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>;
+
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+ };
+
+ &usb {
+@@ -195,110 +202,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ accel_int_default: accel-int-default-state {
+ pins = "gpio31";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-asus-z00l.dts b/arch/arm64/boot/dts/qcom/msm8916-asus-z00l.dts
+index 91284a1d0966f..c58a70fdf36fb 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-asus-z00l.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-asus-z00l.dts
+@@ -30,14 +30,14 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ debounce-interval = <15>;
+ };
+
+ button-volume-down {
+ label = "Volume Down";
+- gpios = <&msmgpio 117 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 117 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEDOWN>;
+ debounce-interval = <15>;
+ };
+@@ -49,7 +49,7 @@
+ regulator-min-microvolt = <2950000>;
+ regulator-max-microvolt = <2950000>;
+
+- gpio = <&msmgpio 87 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 87 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ startup-delay-us = <200>;
+@@ -60,7 +60,7 @@
+
+ usb_id: usb-id {
+ compatible = "linux,extcon-usb-gpio";
+- id-gpios = <&msmgpio 110 GPIO_ACTIVE_HIGH>;
++ id-gpios = <&tlmm 110 GPIO_ACTIVE_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&usb_id_default>;
+ };
+@@ -76,7 +76,7 @@
+ vdd-supply = <&pm8916_l8>;
+ vid-supply = <&pm8916_l6>;
+
+- reset-gpios = <&msmgpio 112 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 112 GPIO_ACTIVE_LOW>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&mag_reset_default>;
+@@ -86,7 +86,7 @@
+ compatible = "invensense,mpu6515";
+ reg = <0x68>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <36 IRQ_TYPE_EDGE_RISING>;
+
+ vdd-supply = <&pm8916_l17>;
+@@ -108,10 +108,10 @@
+ compatible = "edt,edt-ft5306";
+ reg = <0x38>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+- reset-gpios = <&msmgpio 12 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 12 GPIO_ACTIVE_LOW>;
+
+ vcc-supply = <&pm8916_l11>;
+ iovcc-supply = <&pm8916_l6>;
+@@ -128,6 +128,13 @@
+ status = "okay";
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &sdhc_1 {
+ status = "okay";
+
+@@ -143,7 +150,7 @@
+ pinctrl-names = "default", "sleep";
+ pinctrl-0 = <&sdc2_clk_on &sdc2_cmd_on &sdc2_data_on &sdc2_cd_on>;
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>;
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+ };
+
+ &usb {
+@@ -163,110 +170,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ gpio_keys_default: gpio-keys-default-state {
+ pins = "gpio107", "gpio117";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-gplus-fl8005a.dts b/arch/arm64/boot/dts/qcom/msm8916-gplus-fl8005a.dts
+index 525ec76efeeb7..221db7edec5ef 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-gplus-fl8005a.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-gplus-fl8005a.dts
+@@ -24,8 +24,8 @@
+ flash-led-controller {
+ /* Actually qcom,leds-gpio-flash */
+ compatible = "sgmicro,sgm3140";
+- enable-gpios = <&msmgpio 31 GPIO_ACTIVE_HIGH>;
+- flash-gpios = <&msmgpio 32 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 31 GPIO_ACTIVE_HIGH>;
++ flash-gpios = <&tlmm 32 GPIO_ACTIVE_HIGH>;
+
+ pinctrl-0 = <&camera_flash_default>;
+ pinctrl-names = "default";
+@@ -45,7 +45,7 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+ };
+@@ -59,21 +59,21 @@
+ led-red {
+ function = LED_FUNCTION_CHARGING;
+ color = <LED_COLOR_ID_RED>;
+- gpios = <&msmgpio 117 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 117 GPIO_ACTIVE_HIGH>;
+ retain-state-suspended;
+ };
+
+ led-green {
+ function = LED_FUNCTION_CHARGING;
+ color = <LED_COLOR_ID_GREEN>;
+- gpios = <&msmgpio 118 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 118 GPIO_ACTIVE_HIGH>;
+ retain-state-suspended;
+ };
+ };
+
+ usb_id: usb-id {
+ compatible = "linux,extcon-usb-gpio";
+- id-gpio = <&msmgpio 110 GPIO_ACTIVE_HIGH>;
++ id-gpio = <&tlmm 110 GPIO_ACTIVE_HIGH>;
+ pinctrl-0 = <&usb_id_default>;
+ pinctrl-names = "default";
+ };
+@@ -87,10 +87,10 @@
+ compatible = "edt,edt-ft5406";
+ reg = <0x38>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+- reset-gpios = <&msmgpio 12 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 12 GPIO_ACTIVE_LOW>;
+
+ vcc-supply = <&pm8916_l17>;
+ iovcc-supply = <&pm8916_l6>;
+@@ -114,6 +114,13 @@
+ status = "okay";
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &pm8916_vib {
+ status = "okay";
+ };
+@@ -131,7 +138,7 @@
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off>;
+ pinctrl-names = "default", "sleep";
+
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+
+ status = "okay";
+ };
+@@ -153,110 +160,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-system-load = <200000>;
+- regulator-allow-set-load;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ camera_flash_default: camera-flash-default-state {
+ pins = "gpio31", "gpio32";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-huawei-g7.dts b/arch/arm64/boot/dts/qcom/msm8916-huawei-g7.dts
+index 5b1bac8f51220..b02e8f9a8ca0d 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-huawei-g7.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-huawei-g7.dts
+@@ -43,7 +43,7 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+ };
+@@ -55,21 +55,21 @@
+ pinctrl-0 = <&gpio_leds_default>;
+
+ led-0 {
+- gpios = <&msmgpio 8 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 8 GPIO_ACTIVE_HIGH>;
+ color = <LED_COLOR_ID_RED>;
+ default-state = "off";
+ function = LED_FUNCTION_INDICATOR;
+ };
+
+ led-1 {
+- gpios = <&msmgpio 9 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 9 GPIO_ACTIVE_HIGH>;
+ color = <LED_COLOR_ID_GREEN>;
+ default-state = "off";
+ function = LED_FUNCTION_INDICATOR;
+ };
+
+ led-2 {
+- gpios = <&msmgpio 10 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 10 GPIO_ACTIVE_HIGH>;
+ color = <LED_COLOR_ID_BLUE>;
+ default-state = "off";
+ function = LED_FUNCTION_INDICATOR;
+@@ -78,7 +78,7 @@
+
+ usb_id: usb-id {
+ compatible = "linux,extcon-usb-gpio";
+- id-gpio = <&msmgpio 117 GPIO_ACTIVE_HIGH>;
++ id-gpio = <&tlmm 117 GPIO_ACTIVE_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&usb_id_default>;
+ };
+@@ -94,7 +94,7 @@
+ vdd-supply = <&pm8916_l17>;
+ vid-supply = <&pm8916_l6>;
+
+- reset-gpios = <&msmgpio 36 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 36 GPIO_ACTIVE_LOW>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&mag_reset_default>;
+@@ -104,7 +104,7 @@
+ compatible = "kionix,kx023-1025";
+ reg = <0x1e>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <115 IRQ_TYPE_EDGE_RISING>;
+
+ vdd-supply = <&pm8916_l17>;
+@@ -122,7 +122,7 @@
+ compatible = "avago,apds9930";
+ reg = <0x39>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <113 IRQ_TYPE_EDGE_FALLING>;
+
+ vdd-supply = <&pm8916_l17>;
+@@ -146,7 +146,7 @@
+ regulator-name = "outp";
+ regulator-min-microvolt = <5400000>;
+ regulator-max-microvolt = <5400000>;
+- enable-gpios = <&msmgpio 97 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 97 GPIO_ACTIVE_HIGH>;
+ regulator-active-discharge = <1>;
+ };
+
+@@ -154,7 +154,7 @@
+ regulator-name = "outn";
+ regulator-min-microvolt = <5400000>;
+ regulator-max-microvolt = <5400000>;
+- enable-gpios = <&msmgpio 32 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 32 GPIO_ACTIVE_HIGH>;
+ regulator-active-discharge = <1>;
+ };
+ };
+@@ -169,7 +169,7 @@
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+ vdd-supply = <&pm8916_l17>;
+@@ -199,11 +199,11 @@
+ compatible = "nxp,pn547", "nxp,nxp-nci-i2c";
+ reg = <0x28>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <21 IRQ_TYPE_EDGE_RISING>;
+
+- enable-gpios = <&msmgpio 20 GPIO_ACTIVE_HIGH>;
+- firmware-gpios = <&msmgpio 2 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 20 GPIO_ACTIVE_HIGH>;
++ firmware-gpios = <&tlmm 2 GPIO_ACTIVE_HIGH>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&nfc_default>;
+@@ -218,11 +218,32 @@
+ status = "okay";
+ };
+
++&lpass_codec {
++ status = "okay";
++};
++
++&pm8916_l8 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++};
++
+ &pm8916_resin {
+ status = "okay";
+ linux,code = <KEY_VOLUMEDOWN>;
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l16: l16 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &pm8916_vib {
+ status = "okay";
+ };
+@@ -243,7 +264,7 @@
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdhc2_cd_default>;
+
+ /*
+- * The Huawei device tree sets cd-gpios = <&msmgpio 38 GPIO_ACTIVE_HIGH>.
++ * The Huawei device tree sets cd-gpios = <&tlmm 38 GPIO_ACTIVE_HIGH>.
+ * However, gpio38 does not change its state when inserting/removing the
+ * SD card, it's just low all the time. The Huawei kernel seems to use
+ * polling for SD card detection instead.
+@@ -255,7 +276,7 @@
+ * Maybe Huawei decided to replace the second SIM card slot with the
+ * SD card slot and forgot to re-route to gpio38.
+ */
+- cd-gpios = <&msmgpio 56 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 56 GPIO_ACTIVE_LOW>;
+ };
+
+ &sound {
+@@ -302,6 +323,7 @@
+ };
+
+ &wcd_codec {
++ status = "okay";
+ qcom,micbias-lvl = <2800>;
+ qcom,mbhc-vthreshold-low = <75 150 237 450 500>;
+ qcom,mbhc-vthreshold-high = <75 150 237 450 500>;
+@@ -316,110 +338,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ accel_irq_default: accel-irq-default-state {
+ pins = "gpio115";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-longcheer-l8150.dts b/arch/arm64/boot/dts/qcom/msm8916-longcheer-l8150.dts
+index f1dd625e18227..4aa2281bdbcaa 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-longcheer-l8150.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-longcheer-l8150.dts
+@@ -41,7 +41,7 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+ };
+@@ -53,7 +53,7 @@
+ regulator-min-microvolt = <2800000>;
+ regulator-max-microvolt = <2800000>;
+
+- gpio = <&msmgpio 17 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 17 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -62,8 +62,8 @@
+
+ flash-led-controller {
+ compatible = "sgmicro,sgm3140";
+- flash-gpios = <&msmgpio 31 GPIO_ACTIVE_HIGH>;
+- enable-gpios = <&msmgpio 32 GPIO_ACTIVE_HIGH>;
++ flash-gpios = <&tlmm 31 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 32 GPIO_ACTIVE_HIGH>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&camera_flash_default>;
+@@ -122,7 +122,7 @@
+ * to the BMC156. However, there are two pads next to the chip
+ * that can be shorted to make it work if needed.
+ *
+- * interrupt-parent = <&msmgpio>;
++ * interrupt-parent = <&tlmm>;
+ * interrupts = <116 IRQ_TYPE_EDGE_RISING>;
+ */
+
+@@ -141,7 +141,7 @@
+ compatible = "bosch,bmc156_magn";
+ reg = <0x12>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <113 IRQ_TYPE_EDGE_RISING>;
+
+ pinctrl-names = "default";
+@@ -156,21 +156,21 @@
+ reg = <0x23>;
+ proximity-near-level = <75>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <115 IRQ_TYPE_EDGE_FALLING>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&light_int_default>;
+
+ vdd-supply = <&pm8916_l17>;
+- vio-supply = <&pm8916_l6>;
++ vddio-supply = <&pm8916_l6>;
+ };
+
+ gyroscope@68 {
+ compatible = "bosch,bmg160";
+ reg = <0x68>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <23 IRQ_TYPE_EDGE_RISING>,
+ <22 IRQ_TYPE_EDGE_RISING>;
+
+@@ -191,7 +191,7 @@
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+	vdd-supply = <&reg_ctp>;
+@@ -223,6 +223,13 @@
+ linux,code = <KEY_VOLUMEDOWN>;
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &pm8916_usbin {
+ status = "okay";
+ };
+@@ -267,110 +274,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ accel_int_default: accel-int-default-state {
+ pins = "gpio116";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-longcheer-l8910.dts b/arch/arm64/boot/dts/qcom/msm8916-longcheer-l8910.dts
+index b79e80913af9f..a1208c8e06203 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-longcheer-l8910.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-longcheer-l8910.dts
+@@ -20,6 +20,21 @@
+ stdout-path = "serial0";
+ };
+
++ flash-led-controller {
++ compatible = "ocs,ocp8110";
++ enable-gpios = <&tlmm 49 GPIO_ACTIVE_HIGH>;
++ flash-gpios = <&tlmm 119 GPIO_ACTIVE_HIGH>;
++
++ pinctrl-0 = <&camera_front_flash_default>;
++ pinctrl-names = "default";
++
++ flash_led: led {
++ function = LED_FUNCTION_FLASH;
++ color = <LED_COLOR_ID_WHITE>;
++ flash-max-timeout-us = <250000>;
++ };
++ };
++
+ gpio-keys {
+ compatible = "gpio-keys";
+
+@@ -30,7 +45,7 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+ };
+@@ -39,7 +54,7 @@
+ compatible = "gpio-leds";
+
+ led-0 {
+- gpios = <&msmgpio 17 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 17 GPIO_ACTIVE_HIGH>;
+ color = <LED_COLOR_ID_WHITE>;
+ default-state = "off";
+ function = LED_FUNCTION_KBD_BACKLIGHT;
+@@ -51,7 +66,7 @@
+
+ usb_id: usb-id {
+ compatible = "linux,extcon-usb-gpio";
+- id-gpio = <&msmgpio 110 GPIO_ACTIVE_HIGH>;
++ id-gpio = <&tlmm 110 GPIO_ACTIVE_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&usb_id_default>;
+ };
+@@ -67,7 +82,7 @@
+ vdd-supply = <&pm8916_l17>;
+ vid-supply = <&pm8916_l6>;
+
+- reset-gpios = <&msmgpio 111 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 111 GPIO_ACTIVE_LOW>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&mag_reset_default>;
+@@ -95,6 +110,13 @@
+ linux,code = <KEY_VOLUMEDOWN>;
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &pm8916_vib {
+ status = "okay";
+ };
+@@ -114,7 +136,7 @@
+ pinctrl-0 = <&sdc2_clk_on &sdc2_cmd_on &sdc2_data_on &sdc2_cd_on>;
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>;
+
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+ };
+
+ &usb {
+@@ -134,110 +156,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ button_backlight_default: button-backlight-default-state {
+ pins = "gpio17";
+ function = "gpio";
+@@ -246,6 +165,13 @@
+ bias-disable;
+ };
+
++ camera_front_flash_default: camera-front-flash-default-state {
++ pins = "gpio49", "gpio119";
++ function = "gpio";
++ drive-strength = <2>;
++ bias-disable;
++ };
++
+ gpio_keys_default: gpio-keys-default-state {
+ pins = "gpio107";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-pins.dtsi b/arch/arm64/boot/dts/qcom/msm8916-pins.dtsi
+index 33dfcf318a81b..1e07f70768f41 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-pins.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916-pins.dtsi
+@@ -3,7 +3,7 @@
+ * Copyright (c) 2013-2015, The Linux Foundation. All rights reserved.
+ */
+
+-&msmgpio {
++&tlmm {
+
+ blsp1_uart1_default: blsp1-uart1-default-state {
+ /* TX, RX, CTS_N, RTS_N */
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-pm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916-pm8916.dtsi
+index 6eb5e0a395100..d83f767ac5bf4 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-pm8916.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916-pm8916.dtsi
+@@ -47,30 +47,92 @@
+ };
+
+ &rpm_requests {
+- smd_rpm_regulators: regulators {
++ pm8916_rpm_regulators: regulators {
+ compatible = "qcom,rpm-pm8916-regulators";
++ vdd_l1_l2_l3-supply = <&pm8916_s3>;
++ vdd_l4_l5_l6-supply = <&pm8916_s4>;
++ vdd_l7-supply = <&pm8916_s4>;
+
+ /* pm8916_s1 is managed by rpmpd (MSM8916_VDDCX) */
+- pm8916_s3: s3 {};
+- pm8916_s4: s4 {};
+
+- pm8916_l1: l1 {};
+- pm8916_l2: l2 {};
++ pm8916_s3: s3 {
++ regulator-min-microvolt = <1250000>;
++ regulator-max-microvolt = <1350000>;
++ };
++
++ pm8916_s4: s4 {
++ regulator-min-microvolt = <1850000>;
++ regulator-max-microvolt = <2150000>;
++ };
++
++ /*
++ * Some of the regulators are unused or managed by another
++ * processor (e.g. the modem). We should still define nodes for
++ * them to ensure the vote from the application processor can be
++ * dropped in case the regulators are already on during boot.
++ *
++ * The labels for these nodes are omitted on purpose because
++ * boards should configure a proper voltage before using them.
++ */
++ l1 {};
++
++ pm8916_l2: l2 {
++ regulator-min-microvolt = <1200000>;
++ regulator-max-microvolt = <1200000>;
++ };
++
+ /* pm8916_l3 is managed by rpmpd (MSM8916_VDDMX) */
+- pm8916_l4: l4 {};
+- pm8916_l5: l5 {};
+- pm8916_l6: l6 {};
+- pm8916_l7: l7 {};
+- pm8916_l8: l8 {};
+- pm8916_l9: l9 {};
+- pm8916_l10: l10 {};
+- pm8916_l11: l11 {};
+- pm8916_l12: l12 {};
+- pm8916_l13: l13 {};
+- pm8916_l14: l14 {};
+- pm8916_l15: l15 {};
+- pm8916_l16: l16 {};
+- pm8916_l17: l17 {};
+- pm8916_l18: l18 {};
++
++ l4 {};
++
++ pm8916_l5: l5 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8916_l6: l6 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8916_l7: l7 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <1800000>;
++ };
++
++ pm8916_l8: l8 {
++ regulator-min-microvolt = <2900000>;
++ regulator-max-microvolt = <2900000>;
++ };
++
++ pm8916_l9: l9 {
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
++ };
++
++ l10 {};
++
++ pm8916_l11: l11 {
++ regulator-min-microvolt = <2950000>;
++ regulator-max-microvolt = <2950000>;
++ regulator-allow-set-load;
++ regulator-system-load = <200000>;
++ };
++
++ pm8916_l12: l12 {
++ regulator-min-microvolt = <1800000>;
++ regulator-max-microvolt = <2950000>;
++ };
++
++ pm8916_l13: l13 {
++ regulator-min-microvolt = <3075000>;
++ regulator-max-microvolt = <3075000>;
++ };
++
++ l14 {};
++ l15 {};
++ l16 {};
++ l17 {};
++ l18 {};
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi b/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi
+index 16d67749960e0..550ba6b9d4cd8 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-a2015-common.dtsi
+@@ -44,13 +44,13 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+
+ button-home {
+ label = "Home";
+- gpios = <&msmgpio 109 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 109 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_HOMEPAGE>;
+ };
+ };
+@@ -65,7 +65,7 @@
+
+ event-hall-sensor {
+ label = "Hall Effect Sensor";
+- gpios = <&msmgpio 52 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 52 GPIO_ACTIVE_LOW>;
+ linux,input-type = <EV_SW>;
+ linux,code = <SW_LID>;
+ linux,can-disable;
+@@ -83,7 +83,7 @@
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+
+- gpio = <&msmgpio 76 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 76 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -96,7 +96,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 73 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 73 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -105,8 +105,8 @@
+
+ i2c-muic {
+ compatible = "i2c-gpio";
+- sda-gpios = <&msmgpio 105 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+- scl-gpios = <&msmgpio 106 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 105 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 106 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&muic_i2c_default>;
+@@ -118,7 +118,7 @@
+ compatible = "siliconmitus,sm5502-muic";
+
+ reg = <0x25>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <12 IRQ_TYPE_EDGE_FALLING>;
+
+ pinctrl-names = "default";
+@@ -128,8 +128,8 @@
+
+ i2c-tkey {
+ compatible = "i2c-gpio";
+- sda-gpios = <&msmgpio 16 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+- scl-gpios = <&msmgpio 17 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 16 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 17 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&tkey_i2c_default>;
+@@ -142,7 +142,7 @@
+ compatible = "coreriver,tc360-touchkey";
+ reg = <0x20>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <98 IRQ_TYPE_EDGE_FALLING>;
+
+ /* vcc/vdd-supply are board-specific */
+@@ -157,8 +157,8 @@
+
+ i2c-nfc {
+ compatible = "i2c-gpio";
+- sda-gpios = <&msmgpio 0 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+- scl-gpios = <&msmgpio 1 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 0 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 1 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&nfc_i2c_default>;
+@@ -170,11 +170,11 @@
+ compatible = "samsung,s3fwrn5-i2c";
+ reg = <0x27>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <21 IRQ_TYPE_EDGE_RISING>;
+
+- en-gpios = <&msmgpio 20 GPIO_ACTIVE_LOW>;
+- wake-gpios = <&msmgpio 49 GPIO_ACTIVE_HIGH>;
++ en-gpios = <&tlmm 20 GPIO_ACTIVE_LOW>;
++ wake-gpios = <&tlmm 49 GPIO_ACTIVE_HIGH>;
+
+ clocks = <&rpmcc RPM_SMD_BB_CLK2_PIN>;
+
+@@ -200,7 +200,7 @@
+ accelerometer: accelerometer@10 {
+ compatible = "bosch,bmc150_accel";
+ reg = <0x10>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <115 IRQ_TYPE_EDGE_RISING>;
+
+ vdd-supply = <&pm8916_l17>;
+@@ -225,7 +225,7 @@
+ battery@35 {
+ compatible = "richtek,rt5033-battery";
+ reg = <0x35>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <121 IRQ_TYPE_EDGE_BOTH>;
+
+ pinctrl-names = "default";
+@@ -252,6 +252,13 @@
+ linux,code = <KEY_VOLUMEDOWN>;
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &sdhc_1 {
+ status = "okay";
+
+@@ -267,7 +274,7 @@
+ pinctrl-0 = <&sdc2_clk_on &sdc2_cmd_on &sdc2_data_on &sdc2_cd_on>;
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>;
+
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+ };
+
+ &usb {
+@@ -279,110 +286,7 @@
+ extcon = <&muic>;
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ accel_int_default: accel-int-default-state {
+ pins = "gpio115";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-a3u-eur.dts b/arch/arm64/boot/dts/qcom/msm8916-samsung-a3u-eur.dts
+index a1ca4d8834201..9068aa6f7b293 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-a3u-eur.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-a3u-eur.dts
+@@ -15,7 +15,7 @@
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+
+- gpio = <&msmgpio 9 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 9 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -28,7 +28,7 @@
+ regulator-min-microvolt = <2800000>;
+ regulator-max-microvolt = <2800000>;
+
+- gpio = <&msmgpio 86 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 86 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -41,7 +41,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 60 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 60 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -67,7 +67,7 @@
+ compatible = "zinitix,bt541";
+
+ reg = <0x20>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+ touchscreen-size-x = <540>;
+@@ -93,7 +93,7 @@
+
+	vdd3-supply = <&reg_panel_vdd3>;
+ vci-supply = <&pm8916_l17>;
+- reset-gpios = <&msmgpio 25 GPIO_ACTIVE_HIGH>;
++ reset-gpios = <&tlmm 25 GPIO_ACTIVE_HIGH>;
+
+ port {
+ panel_in: endpoint {
+@@ -120,7 +120,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&msmgpio {
++&tlmm {
+ panel_vdd3_default: panel-vdd3-default-state {
+ pins = "gpio9";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-a5u-eur.dts b/arch/arm64/boot/dts/qcom/msm8916-samsung-a5u-eur.dts
+index 4e10b8a5e9f9c..388482a1e3d9f 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-a5u-eur.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-a5u-eur.dts
+@@ -15,7 +15,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 97 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 97 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -36,7 +36,7 @@
+ compatible = "melfas,mms345l";
+
+ reg = <0x48>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+ touchscreen-size-x = <720>;
+@@ -71,7 +71,7 @@
+ compatible = "qcom,wcn3660b";
+ };
+
+-&msmgpio {
++&tlmm {
+ tkey_en_default: tkey-en-default-state {
+ pins = "gpio97";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-e2015-common.dtsi b/arch/arm64/boot/dts/qcom/msm8916-samsung-e2015-common.dtsi
+index f6c4a011fdfd2..0cdd6af7817f4 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-e2015-common.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-e2015-common.dtsi
+@@ -18,7 +18,7 @@
+ compatible = "siliconmitus,sm5504-muic";
+ reg = <0x14>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <12 IRQ_TYPE_EDGE_FALLING>;
+
+ pinctrl-names = "default";
+@@ -32,7 +32,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 97 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 97 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -66,7 +66,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&msmgpio {
++&tlmm {
+ tkey_en_default: tkey-en-default-state {
+ pins = "gpio97";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-grandmax.dts b/arch/arm64/boot/dts/qcom/msm8916-samsung-grandmax.dts
+index 4cbd68b894481..3f145dde4059f 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-grandmax.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-grandmax.dts
+@@ -33,7 +33,7 @@
+ function = LED_FUNCTION_KBD_BACKLIGHT;
+ color = <LED_COLOR_ID_WHITE>;
+
+- gpios = <&msmgpio 60 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 60 GPIO_ACTIVE_HIGH>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&gpio_leds_default>;
+@@ -42,14 +42,14 @@
+ };
+
+ &reg_motor_vdd {
+- gpio = <&msmgpio 72 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 72 GPIO_ACTIVE_HIGH>;
+ };
+
+ &reg_touch_key {
+ status = "disabled";
+ };
+
+-&msmgpio {
++&tlmm {
+ gpio_leds_default: gpio-led-default-state {
+ pins = "gpio60";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-gt5-common.dtsi b/arch/arm64/boot/dts/qcom/msm8916-samsung-gt5-common.dtsi
+index 74ffd04db8d84..cb1b6318a246d 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-gt5-common.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-gt5-common.dtsi
+@@ -34,13 +34,13 @@
+
+ volume-up-button {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+
+ home-button {
+ label = "Home";
+- gpios = <&msmgpio 109 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 109 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_HOMEPAGE>;
+ };
+ };
+@@ -55,7 +55,7 @@
+
+ hall-sensor-switch {
+ label = "Hall Effect Sensor";
+- gpios = <&msmgpio 52 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 52 GPIO_ACTIVE_LOW>;
+ linux,input-type = <EV_SW>;
+ linux,code = <SW_LID>;
+ linux,can-disable;
+@@ -74,7 +74,7 @@
+ maxim,over-heat-temp = <600>;
+ maxim,over-volt = <4400>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <121 IRQ_TYPE_EDGE_FALLING>;
+
+ pinctrl-0 = <&fuelgauge_int_default>;
+@@ -97,7 +97,7 @@
+ vdd-supply = <&pm8916_l17>;
+ vddio-supply = <&pm8916_l5>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <115 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "INT1";
+
+@@ -120,6 +120,13 @@
+ status = "okay";
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ /* FIXME: Replace with MAX77849 MUIC when driver is available */
+ &pm8916_usbin {
+ status = "okay";
+@@ -138,7 +145,7 @@
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>;
+ pinctrl-names = "default", "sleep";
+
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+
+ status = "okay";
+ };
+@@ -162,110 +169,7 @@
+ compatible = "qcom,wcn3660b";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-system-load = <200000>;
+- regulator-allow-set-load;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ accel_int_default: accel-int-default-state {
+ pins = "gpio115";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-gt510.dts b/arch/arm64/boot/dts/qcom/msm8916-samsung-gt510.dts
+index 607a5dc8a5341..48111c6a2c78f 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-gt510.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-gt510.dts
+@@ -25,7 +25,7 @@
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+
+- gpio = <&msmgpio 76 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 76 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-0 = <&motor_en_default>;
+@@ -38,7 +38,7 @@
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+
+- gpio = <&msmgpio 73 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 73 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-0 = <&tsp_en_default>;
+@@ -51,7 +51,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 73 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 73 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+ };
+
+@@ -71,20 +71,20 @@
+ touchscreen@4a {
+ compatible = "atmel,maxtouch";
+ reg = <0x4a>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_LEVEL_LOW>;
+
+	vdd-supply = <&reg_tsp_1p8v>;
+	vdda-supply = <&reg_tsp_3p3v>;
+
+- reset-gpios = <&msmgpio 114 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 114 GPIO_ACTIVE_LOW>;
+
+ pinctrl-0 = <&tsp_int_rst_default>;
+ pinctrl-names = "default";
+ };
+ };
+
+-&msmgpio {
++&tlmm {
+ motor_en_default: motor-en-default-state {
+ pins = "gpio76";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-gt58.dts b/arch/arm64/boot/dts/qcom/msm8916-samsung-gt58.dts
+index 5d6f8383306bb..98ceaad7fcea9 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-gt58.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-gt58.dts
+@@ -15,7 +15,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 73 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 73 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+	pinctrl-0 = <&reg_tsp_en_default>;
+@@ -24,7 +24,7 @@
+
+ vibrator {
+ compatible = "gpio-vibrator";
+- enable-gpios = <&msmgpio 76 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 76 GPIO_ACTIVE_HIGH>;
+
+ pinctrl-0 = <&vibrator_en_default>;
+ pinctrl-names = "default";
+@@ -37,7 +37,7 @@
+ touchscreen@20 {
+ compatible = "zinitix,bt532";
+ reg = <0x20>;
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+ touchscreen-size-x = <768>;
+@@ -51,7 +51,7 @@
+ };
+ };
+
+-&msmgpio {
++&tlmm {
+ reg_tsp_en_default: reg-tsp-en-default-state {
+ pins = "gpio73";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-j5-common.dtsi b/arch/arm64/boot/dts/qcom/msm8916-samsung-j5-common.dtsi
+index adeee0830e768..b2d2bc205ef27 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-j5-common.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-j5-common.dtsi
+@@ -32,7 +32,7 @@
+
+ event-hall-sensor {
+ label = "Hall Effect Sensor";
+- gpios = <&msmgpio 52 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 52 GPIO_ACTIVE_LOW>;
+ linux,input-type = <EV_SW>;
+ linux,code = <SW_LID>;
+ linux,can-disable;
+@@ -49,21 +49,21 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+
+ button-home {
+ label = "Home Key";
+- gpios = <&msmgpio 109 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 109 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_HOMEPAGE>;
+ };
+ };
+
+ i2c_muic: i2c-muic {
+ compatible = "i2c-gpio";
+- sda-gpios = <&msmgpio 105 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+- scl-gpios = <&msmgpio 106 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 105 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 106 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&muic_i2c_default>;
+@@ -75,7 +75,7 @@
+ compatible = "siliconmitus,sm5703-muic";
+ reg = <0x25>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <12 IRQ_TYPE_EDGE_FALLING>;
+
+ pinctrl-names = "default";
+@@ -108,7 +108,7 @@
+ pinctrl-0 = <&sdc2_clk_on &sdc2_cmd_on &sdc2_data_on &sdc2_cd_on>;
+ pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>;
+
+- cd-gpios = <&msmgpio 38 GPIO_ACTIVE_LOW>;
++ cd-gpios = <&tlmm 38 GPIO_ACTIVE_LOW>;
+ };
+
+ &usb {
+@@ -128,110 +128,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <3000000>;
+- regulator-max-microvolt = <3000000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ gpio_hall_sensor_default: gpio-hall-sensor-default-state {
+ pins = "gpio52";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-samsung-serranove.dts b/arch/arm64/boot/dts/qcom/msm8916-samsung-serranove.dts
+index 1a41a4db874da..13a1d8828447b 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-samsung-serranove.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-samsung-serranove.dts
+@@ -53,13 +53,13 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+
+ button-home {
+ label = "Home";
+- gpios = <&msmgpio 109 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 109 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_HOMEPAGE>;
+ };
+ };
+@@ -74,7 +74,7 @@
+
+ event-hall-sensor {
+ label = "Hall Effect Sensor";
+- gpios = <&msmgpio 52 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 52 GPIO_ACTIVE_LOW>;
+ linux,input-type = <EV_SW>;
+ linux,code = <SW_LID>;
+ linux,can-disable;
+@@ -87,7 +87,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 73 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 73 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -100,7 +100,7 @@
+ regulator-min-microvolt = <2800000>;
+ regulator-max-microvolt = <2800000>;
+
+- gpio = <&msmgpio 86 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 86 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -113,7 +113,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&msmgpio 60 GPIO_ACTIVE_HIGH>;
++ gpio = <&tlmm 60 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -122,8 +122,8 @@
+
+ i2c-muic {
+ compatible = "i2c-gpio";
+- sda-gpios = <&msmgpio 105 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+- scl-gpios = <&msmgpio 106 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 105 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 106 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&muic_i2c_default>;
+@@ -135,7 +135,7 @@
+ compatible = "siliconmitus,sm5504-muic";
+ reg = <0x14>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <12 IRQ_TYPE_EDGE_FALLING>;
+
+ pinctrl-names = "default";
+@@ -145,8 +145,8 @@
+
+ i2c-tkey {
+ compatible = "i2c-gpio";
+- sda-gpios = <&msmgpio 16 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+- scl-gpios = <&msmgpio 17 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 16 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 17 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&tkey_i2c_default>;
+@@ -158,7 +158,7 @@
+ compatible = "coreriver,tc360-touchkey";
+ reg = <0x20>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <98 IRQ_TYPE_EDGE_FALLING>;
+
+ vcc-supply = <&reg_touch_key>;
+@@ -174,8 +174,8 @@
+
+ i2c-nfc {
+ compatible = "i2c-gpio";
+- sda-gpios = <&msmgpio 0 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+- scl-gpios = <&msmgpio 1 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ sda-gpios = <&tlmm 0 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
++ scl-gpios = <&tlmm 1 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&nfc_i2c_default>;
+@@ -187,11 +187,11 @@
+ compatible = "nxp,pn547", "nxp,nxp-nci-i2c";
+ reg = <0x2b>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <21 IRQ_TYPE_EDGE_RISING>;
+
+- enable-gpios = <&msmgpio 20 GPIO_ACTIVE_HIGH>;
+- firmware-gpios = <&msmgpio 49 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 20 GPIO_ACTIVE_HIGH>;
++ firmware-gpios = <&tlmm 49 GPIO_ACTIVE_HIGH>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&nfc_default>;
+@@ -206,7 +206,7 @@
+ compatible = "st,lsm6ds3";
+ reg = <0x6b>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <115 IRQ_TYPE_EDGE_RISING>;
+
+ pinctrl-names = "default";
+@@ -230,7 +230,7 @@
+ compatible = "richtek,rt5033-battery";
+ reg = <0x35>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <121 IRQ_TYPE_EDGE_FALLING>;
+
+ pinctrl-names = "default";
+@@ -245,7 +245,7 @@
+ compatible = "zinitix,bt541";
+ reg = <0x20>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+ touchscreen-size-x = <540>;
+@@ -320,110 +320,7 @@
+ compatible = "qcom,wcn3660b";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ fg_alert_default: fg-alert-default-state {
+ pins = "gpio121";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-thwc-uf896.dts b/arch/arm64/boot/dts/qcom/msm8916-thwc-uf896.dts
+index 82e260375174d..6fe1850ba20e9 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-thwc-uf896.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-thwc-uf896.dts
+@@ -10,19 +10,19 @@
+ };
+
+ &button_restart {
+- gpios = <&msmgpio 35 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 35 GPIO_ACTIVE_LOW>;
+ };
+
+ &led_r {
+- gpios = <&msmgpio 82 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 82 GPIO_ACTIVE_HIGH>;
+ };
+
+ &led_g {
+- gpios = <&msmgpio 83 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 83 GPIO_ACTIVE_HIGH>;
+ };
+
+ &led_b {
+- gpios = <&msmgpio 81 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 81 GPIO_ACTIVE_HIGH>;
+ };
+
+ &button_default {
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-thwc-ufi001c.dts b/arch/arm64/boot/dts/qcom/msm8916-thwc-ufi001c.dts
+index 978f0abcdf8ff..16d4a91022be6 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-thwc-ufi001c.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-thwc-ufi001c.dts
+@@ -10,19 +10,19 @@
+ };
+
+ &button_restart {
+- gpios = <&msmgpio 37 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 37 GPIO_ACTIVE_HIGH>;
+ };
+
+ &led_r {
+- gpios = <&msmgpio 22 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 22 GPIO_ACTIVE_HIGH>;
+ };
+
+ &led_g {
+- gpios = <&msmgpio 21 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 21 GPIO_ACTIVE_HIGH>;
+ };
+
+ &led_b {
+- gpios = <&msmgpio 20 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 20 GPIO_ACTIVE_HIGH>;
+ };
+
+ &mpss {
+@@ -40,7 +40,7 @@
+ };
+
+ /* This selects the external SIM card slot by default */
+-&msmgpio {
++&tlmm {
+ sim_ctrl_default: sim-ctrl-default-state {
+ esim-sel-pins {
+ pins = "gpio0", "gpio3";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-ufi.dtsi b/arch/arm64/boot/dts/qcom/msm8916-ufi.dtsi
+index 50bae6f214f1f..cb5c228ba9f6b 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-ufi.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916-ufi.dtsi
+@@ -126,110 +126,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-system-load = <200000>;
+- regulator-allow-set-load;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ /* pins are board-specific */
+ button_default: button-default-state {
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-wingtech-wt88047.dts b/arch/arm64/boot/dts/qcom/msm8916-wingtech-wt88047.dts
+index ac56c7595f78a..12ce4dc236c63 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-wingtech-wt88047.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-wingtech-wt88047.dts
+@@ -25,8 +25,8 @@
+
+ flash-led-controller {
+ compatible = "ocs,ocp8110";
+- enable-gpios = <&msmgpio 31 GPIO_ACTIVE_HIGH>;
+- flash-gpios = <&msmgpio 32 GPIO_ACTIVE_HIGH>;
++ enable-gpios = <&tlmm 31 GPIO_ACTIVE_HIGH>;
++ flash-gpios = <&tlmm 32 GPIO_ACTIVE_HIGH>;
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&camera_flash_default>;
+@@ -47,14 +47,14 @@
+
+ button-volume-up {
+ label = "Volume Up";
+- gpios = <&msmgpio 107 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
+ linux,code = <KEY_VOLUMEUP>;
+ };
+ };
+
+ usb_id: usb-id {
+ compatible = "linux,extcon-usb-gpio";
+- id-gpio = <&msmgpio 110 GPIO_ACTIVE_HIGH>;
++ id-gpio = <&tlmm 110 GPIO_ACTIVE_HIGH>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&usb_id_default>;
+ };
+@@ -67,7 +67,7 @@
+ compatible = "invensense,mpu6880";
+ reg = <0x68>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <115 IRQ_TYPE_EDGE_RISING>;
+
+ vdd-supply = <&pm8916_l17>;
+@@ -90,10 +90,10 @@
+ compatible = "edt,edt-ft5506";
+ reg = <0x38>;
+
+- interrupt-parent = <&msmgpio>;
++ interrupt-parent = <&tlmm>;
+ interrupts = <13 IRQ_TYPE_EDGE_FALLING>;
+
+- reset-gpios = <&msmgpio 12 GPIO_ACTIVE_LOW>;
++ reset-gpios = <&tlmm 12 GPIO_ACTIVE_LOW>;
+
+ vcc-supply = <&pm8916_l17>;
+ iovcc-supply = <&pm8916_l6>;
+@@ -149,6 +149,22 @@
+ linux,code = <KEY_VOLUMEDOWN>;
+ };
+
++&pm8916_rpm_regulators {
++ pm8916_l16: l16 {
++ /*
++ * L16 is only used for AW2013 which is fine with 2.5-3.3V.
++ * Use the recommended typical voltage of 2.8V as minimum.
++ */
++ regulator-min-microvolt = <2800000>;
++ regulator-max-microvolt = <3300000>;
++ };
++
++ pm8916_l17: l17 {
++ regulator-min-microvolt = <2850000>;
++ regulator-max-microvolt = <2850000>;
++ };
++};
++
+ &pm8916_vib {
+ status = "okay";
+ };
+@@ -188,110 +204,7 @@
+ compatible = "qcom,wcn3620";
+ };
+
+-&smd_rpm_regulators {
+- vdd_l1_l2_l3-supply = <&pm8916_s3>;
+- vdd_l4_l5_l6-supply = <&pm8916_s4>;
+- vdd_l7-supply = <&pm8916_s4>;
+-
+- s3 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1300000>;
+- };
+-
+- s4 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2100000>;
+- };
+-
+- l1 {
+- regulator-min-microvolt = <1225000>;
+- regulator-max-microvolt = <1225000>;
+- };
+-
+- l2 {
+- regulator-min-microvolt = <1200000>;
+- regulator-max-microvolt = <1200000>;
+- };
+-
+- l4 {
+- regulator-min-microvolt = <2050000>;
+- regulator-max-microvolt = <2050000>;
+- };
+-
+- l5 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l6 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l7 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <1800000>;
+- };
+-
+- l8 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2900000>;
+- };
+-
+- l9 {
+- regulator-min-microvolt = <3300000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l10 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2800000>;
+- };
+-
+- l11 {
+- regulator-min-microvolt = <2950000>;
+- regulator-max-microvolt = <2950000>;
+- regulator-allow-set-load;
+- regulator-system-load = <200000>;
+- };
+-
+- l12 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <2950000>;
+- };
+-
+- l13 {
+- regulator-min-microvolt = <3075000>;
+- regulator-max-microvolt = <3075000>;
+- };
+-
+- l14 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l15 {
+- regulator-min-microvolt = <1800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l16 {
+- regulator-min-microvolt = <2800000>;
+- regulator-max-microvolt = <3300000>;
+- };
+-
+- l17 {
+- regulator-min-microvolt = <2850000>;
+- regulator-max-microvolt = <2850000>;
+- };
+-
+- l18 {
+- regulator-min-microvolt = <2700000>;
+- regulator-max-microvolt = <2700000>;
+- };
+-};
+-
+-&msmgpio {
++&tlmm {
+ camera_flash_default: camera-flash-default-state {
+ pins = "gpio31", "gpio32";
+ function = "gpio";
+diff --git a/arch/arm64/boot/dts/qcom/msm8916-yiming-uz801v3.dts b/arch/arm64/boot/dts/qcom/msm8916-yiming-uz801v3.dts
+index 74ce6563be183..5e6ba8c58bb57 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916-yiming-uz801v3.dts
++++ b/arch/arm64/boot/dts/qcom/msm8916-yiming-uz801v3.dts
+@@ -10,19 +10,19 @@
+ };
+
+ &button_restart {
+- gpios = <&msmgpio 23 GPIO_ACTIVE_LOW>;
++ gpios = <&tlmm 23 GPIO_ACTIVE_LOW>;
+ };
+
+ &led_r {
+- gpios = <&msmgpio 7 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 7 GPIO_ACTIVE_HIGH>;
+ };
+
+ &led_g {
+- gpios = <&msmgpio 8 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 8 GPIO_ACTIVE_HIGH>;
+ };
+
+ &led_b {
+- gpios = <&msmgpio 6 GPIO_ACTIVE_HIGH>;
++ gpios = <&tlmm 6 GPIO_ACTIVE_HIGH>;
+ };
+
+ &button_default {
+diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
+index bf88c10ff55b0..309ed76ec2d87 100644
+--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
+@@ -993,12 +993,12 @@
+ };
+ };
+
+- msmgpio: pinctrl@1000000 {
++ tlmm: pinctrl@1000000 {
+ compatible = "qcom,msm8916-pinctrl";
+ reg = <0x01000000 0x300000>;
+ interrupts = <GIC_SPI 208 IRQ_TYPE_LEVEL_HIGH>;
+ gpio-controller;
+- gpio-ranges = <&msmgpio 0 0 122>;
++ gpio-ranges = <&tlmm 0 0 122>;
+ #gpio-cells = <2>;
+ interrupt-controller;
+ #interrupt-cells = <2>;
+@@ -1552,6 +1552,7 @@
+ <&gcc GCC_CODEC_DIGCODEC_CLK>;
+ clock-names = "ahbix-clk", "mclk";
+ #sound-dai-cells = <1>;
++ status = "disabled";
+ };
+
+ sdhc_1: mmc@7824900 {
+diff --git a/arch/arm64/boot/dts/qcom/msm8996-xiaomi-gemini.dts b/arch/arm64/boot/dts/qcom/msm8996-xiaomi-gemini.dts
+index 100123d514944..f1d990dd7f7c9 100644
+--- a/arch/arm64/boot/dts/qcom/msm8996-xiaomi-gemini.dts
++++ b/arch/arm64/boot/dts/qcom/msm8996-xiaomi-gemini.dts
+@@ -82,7 +82,7 @@
+ #size-cells = <0>;
+ interrupt-parent = <&tlmm>;
+ interrupts = <125 IRQ_TYPE_LEVEL_LOW>;
+- vdda-supply = <&vreg_l6a_1p8>;
++ vio-supply = <&vreg_l6a_1p8>;
+ vdd-supply = <&vdd_3v2_tp>;
+ reset-gpios = <&tlmm 89 GPIO_ACTIVE_LOW>;
+
+diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
+index 25fe2b8552fc7..61da7fc281b32 100644
+--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
+@@ -1075,7 +1075,7 @@
+ reg-names = "dsi_ctrl";
+
+ interrupt-parent = <&mdss>;
+- interrupts = <4>;
++ interrupts = <5>;
+
+ clocks = <&mmcc MDSS_MDP_CLK>,
+ <&mmcc MDSS_BYTE1_CLK>,
+@@ -3336,6 +3336,9 @@
+ #size-cells = <1>;
+ ranges;
+
++ interrupts = <GIC_SPI 352 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "hs_phy_irq";
++
+ clocks = <&gcc GCC_PERIPH_NOC_USB20_AHB_CLK>,
+ <&gcc GCC_USB20_MASTER_CLK>,
+ <&gcc GCC_USB20_MOCK_UTMI_CLK>,
+diff --git a/arch/arm64/boot/dts/qcom/msm8998.dtsi b/arch/arm64/boot/dts/qcom/msm8998.dtsi
+index 3ec941fed14fe..f7c2820b1aacb 100644
+--- a/arch/arm64/boot/dts/qcom/msm8998.dtsi
++++ b/arch/arm64/boot/dts/qcom/msm8998.dtsi
+@@ -2428,10 +2428,10 @@
+
+ clocks = <&mmcc MNOC_AHB_CLK>,
+ <&mmcc BIMC_SMMU_AHB_CLK>,
+- <&rpmcc RPM_SMD_MMAXI_CLK>,
+ <&mmcc BIMC_SMMU_AXI_CLK>;
+- clock-names = "iface-mm", "iface-smmu",
+- "bus-mm", "bus-smmu";
++ clock-names = "iface-mm",
++ "iface-smmu",
++ "bus-smmu";
+
+ #global-interrupts = <0>;
+ interrupts =
+@@ -2455,6 +2455,8 @@
+ <GIC_SPI 261 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 262 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 272 IRQ_TYPE_LEVEL_HIGH>;
++
++ power-domains = <&mmcc BIMC_SMMU_GDSC>;
+ };
+
+ remoteproc_adsp: remoteproc@17300000 {
+diff --git a/arch/arm64/boot/dts/qcom/pm6150l.dtsi b/arch/arm64/boot/dts/qcom/pm6150l.dtsi
+index 6f7aa67501e27..0fdf440596c01 100644
+--- a/arch/arm64/boot/dts/qcom/pm6150l.dtsi
++++ b/arch/arm64/boot/dts/qcom/pm6150l.dtsi
+@@ -121,8 +121,9 @@
+ pm6150l_wled: leds@d800 {
+ compatible = "qcom,pm6150l-wled";
+ reg = <0xd800>, <0xd900>;
+- interrupts = <0x5 0xd8 0x1 IRQ_TYPE_EDGE_RISING>;
+- interrupt-names = "ovp";
++ interrupts = <0x5 0xd8 0x1 IRQ_TYPE_EDGE_RISING>,
++ <0x5 0xd8 0x2 IRQ_TYPE_EDGE_RISING>;
++ interrupt-names = "ovp", "short";
+ label = "backlight";
+
+ status = "disabled";
+diff --git a/arch/arm64/boot/dts/qcom/pm660l.dtsi b/arch/arm64/boot/dts/qcom/pm660l.dtsi
+index 87b71b7205b85..6fdbf507c262a 100644
+--- a/arch/arm64/boot/dts/qcom/pm660l.dtsi
++++ b/arch/arm64/boot/dts/qcom/pm660l.dtsi
+@@ -74,8 +74,9 @@
+ pm660l_wled: leds@d800 {
+ compatible = "qcom,pm660l-wled";
+ reg = <0xd800>, <0xd900>;
+- interrupts = <0x3 0xd8 0x1 IRQ_TYPE_EDGE_RISING>;
+- interrupt-names = "ovp";
++ interrupts = <0x3 0xd8 0x1 IRQ_TYPE_EDGE_RISING>,
++ <0x3 0xd8 0x2 IRQ_TYPE_EDGE_RISING>;
++ interrupt-names = "ovp", "short";
+ label = "backlight";
+
+ status = "disabled";
+diff --git a/arch/arm64/boot/dts/qcom/pm8350.dtsi b/arch/arm64/boot/dts/qcom/pm8350.dtsi
+index 2dfeb99300d74..9ed9ba23e81e4 100644
+--- a/arch/arm64/boot/dts/qcom/pm8350.dtsi
++++ b/arch/arm64/boot/dts/qcom/pm8350.dtsi
+@@ -8,7 +8,7 @@
+
+ / {
+ thermal-zones {
+- pm8350_thermal: pm8350c-thermal {
++ pm8350_thermal: pm8350-thermal {
+ polling-delay-passive = <100>;
+ polling-delay = <0>;
+ thermal-sensors = <&pm8350_temp_alarm>;
+diff --git a/arch/arm64/boot/dts/qcom/pm8350b.dtsi b/arch/arm64/boot/dts/qcom/pm8350b.dtsi
+index f1c7bd9d079c2..05c1058988927 100644
+--- a/arch/arm64/boot/dts/qcom/pm8350b.dtsi
++++ b/arch/arm64/boot/dts/qcom/pm8350b.dtsi
+@@ -8,7 +8,7 @@
+
+ / {
+ thermal-zones {
+- pm8350b_thermal: pm8350c-thermal {
++ pm8350b_thermal: pm8350b-thermal {
+ polling-delay-passive = <100>;
+ polling-delay = <0>;
+ thermal-sensors = <&pm8350b_temp_alarm>;
+diff --git a/arch/arm64/boot/dts/qcom/pm8916.dtsi b/arch/arm64/boot/dts/qcom/pm8916.dtsi
+index f4fb1a92ab55a..33ca1002fb754 100644
+--- a/arch/arm64/boot/dts/qcom/pm8916.dtsi
++++ b/arch/arm64/boot/dts/qcom/pm8916.dtsi
+@@ -178,6 +178,7 @@
+ vdd-cdc-tx-rx-cx-supply = <&pm8916_l5>;
+ vdd-micbias-supply = <&pm8916_l13>;
+ #sound-dai-cells = <1>;
++ status = "disabled";
+ };
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/pmi8950.dtsi b/arch/arm64/boot/dts/qcom/pmi8950.dtsi
+index 4891be3cd68a3..c16adca4e93a9 100644
+--- a/arch/arm64/boot/dts/qcom/pmi8950.dtsi
++++ b/arch/arm64/boot/dts/qcom/pmi8950.dtsi
+@@ -87,8 +87,9 @@
+ pmi8950_wled: leds@d800 {
+ compatible = "qcom,pmi8950-wled";
+ reg = <0xd800>, <0xd900>;
+- interrupts = <0x3 0xd8 0x02 IRQ_TYPE_EDGE_RISING>;
+- interrupt-names = "short";
++ interrupts = <0x3 0xd8 0x1 IRQ_TYPE_EDGE_RISING>,
++ <0x3 0xd8 0x2 IRQ_TYPE_EDGE_RISING>;
++ interrupt-names = "ovp", "short";
+ label = "backlight";
+
+ status = "disabled";
+diff --git a/arch/arm64/boot/dts/qcom/pmi8994.dtsi b/arch/arm64/boot/dts/qcom/pmi8994.dtsi
+index 0192968f4d9b3..36d6a1fb553ac 100644
+--- a/arch/arm64/boot/dts/qcom/pmi8994.dtsi
++++ b/arch/arm64/boot/dts/qcom/pmi8994.dtsi
+@@ -54,8 +54,9 @@
+ pmi8994_wled: wled@d800 {
+ compatible = "qcom,pmi8994-wled";
+ reg = <0xd800>, <0xd900>;
+- interrupts = <3 0xd8 0x02 IRQ_TYPE_EDGE_RISING>;
+- interrupt-names = "short";
++ interrupts = <0x3 0xd8 0x1 IRQ_TYPE_EDGE_RISING>,
++ <0x3 0xd8 0x2 IRQ_TYPE_EDGE_RISING>;
++ interrupt-names = "ovp", "short";
+ qcom,cabc;
+ qcom,external-pfet;
+ status = "disabled";
+diff --git a/arch/arm64/boot/dts/qcom/pmk8350.dtsi b/arch/arm64/boot/dts/qcom/pmk8350.dtsi
+index f26fb7d32faf2..767ab7f284608 100644
+--- a/arch/arm64/boot/dts/qcom/pmk8350.dtsi
++++ b/arch/arm64/boot/dts/qcom/pmk8350.dtsi
+@@ -49,7 +49,7 @@
+ };
+
+ pmk8350_adc_tm: adc-tm@3400 {
+- compatible = "qcom,adc-tm7";
++ compatible = "qcom,spmi-adc-tm5-gen2";
+ reg = <0x3400>;
+ interrupts = <PMK8350_SID 0x34 0x0 IRQ_TYPE_EDGE_RISING>;
+ #address-cells = <1>;
+diff --git a/arch/arm64/boot/dts/qcom/pmr735b.dtsi b/arch/arm64/boot/dts/qcom/pmr735b.dtsi
+index ec24c4478005a..f7473e2473224 100644
+--- a/arch/arm64/boot/dts/qcom/pmr735b.dtsi
++++ b/arch/arm64/boot/dts/qcom/pmr735b.dtsi
+@@ -8,7 +8,7 @@
+
+ / {
+ thermal-zones {
+- pmr735a_thermal: pmr735a-thermal {
++ pmr735b_thermal: pmr735b-thermal {
+ polling-delay-passive = <100>;
+ polling-delay = <0>;
+ thermal-sensors = <&pmr735b_temp_alarm>;
+diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
+index 5b25d54b95911..4fa9a4f242273 100644
+--- a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
++++ b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
+@@ -167,7 +167,7 @@
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+
+- gpio = <&pmc8280_1_gpios 1 GPIO_ACTIVE_HIGH>;
++ gpio = <&pmc8280_1_gpios 2 GPIO_ACTIVE_HIGH>;
+ enable-active-high;
+
+ pinctrl-names = "default";
+@@ -696,7 +696,7 @@
+ };
+
+ misc_3p3_reg_en: misc-3p3-reg-en-state {
+- pins = "gpio1";
++ pins = "gpio2";
+ function = "normal";
+ };
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
+index bdcba719fc385..9fa9b40b41b49 100644
+--- a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
++++ b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
+@@ -1212,7 +1212,7 @@
+ };
+
+ &tlmm {
+- gpio-reserved-ranges = <70 2>, <74 6>, <83 4>, <125 2>, <128 2>, <154 7>;
++ gpio-reserved-ranges = <70 2>, <74 6>, <125 2>, <128 2>, <154 4>;
+
+ bt_default: bt-default-state {
+ hstp-bt-en-pins {
+diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
+index cc4aef21e6172..9c3fb75e06005 100644
+--- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
++++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
+@@ -296,6 +296,7 @@
+ firmware {
+ scm: scm {
+ compatible = "qcom,scm-sc8280xp", "qcom,scm";
++ interconnects = <&aggre2_noc MASTER_CRYPTO 0 &mc_virt SLAVE_EBI1 0>;
+ };
+ };
+
+diff --git a/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama.dtsi b/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama.dtsi
+index 420ffede3e804..25e06add95652 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm845-sony-xperia-tama.dtsi
+@@ -15,6 +15,15 @@
+ qcom,msm-id = <321 0x20001>; /* SDM845 v2.1 */
+ qcom,board-id = <8 0>;
+
++ aliases {
++ serial0 = &uart6;
++ serial1 = &uart9;
++ };
++
++ chosen {
++ stdout-path = "serial0:115200n8";
++ };
++
+ gpio-keys {
+ compatible = "gpio-keys";
+
+diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+index 1bfb938e284fb..b73ce14ababd1 100644
+--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
++++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
+@@ -1207,6 +1207,7 @@
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ #power-domain-cells = <1>;
++ power-domains = <&rpmhpd SDM845_CX>;
+ };
+
+ qfprom@784000 {
+@@ -2613,7 +2614,7 @@
+ <0 0>,
+ <0 0>,
+ <0 0>,
+- <0 300000000>;
++ <75000000 300000000>;
+
+ status = "disabled";
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sm6350.dtsi b/arch/arm64/boot/dts/qcom/sm6350.dtsi
+index ad34301f6cddf..18dc3119eea10 100644
+--- a/arch/arm64/boot/dts/qcom/sm6350.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm6350.dtsi
+@@ -473,11 +473,6 @@
+ no-map;
+ };
+
+- pil_gpu_mem: memory@8b715400 {
+- reg = <0 0x8b715400 0 0x2000>;
+- no-map;
+- };
+-
+ pil_modem_mem: memory@8b800000 {
+ reg = <0 0x8b800000 0 0xf800000>;
+ no-map;
+@@ -498,6 +493,11 @@
+ no-map;
+ };
+
++ pil_gpu_mem: memory@f0d00000 {
++ reg = <0 0xf0d00000 0 0x1000>;
++ no-map;
++ };
++
+ debug_region: memory@ffb00000 {
+ reg = <0 0xffb00000 0 0xc0000>;
+ no-map;
+diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
+index 27dcda0d4288f..c4d87092e6d9c 100644
+--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
+@@ -1231,7 +1231,7 @@
+ dma-names = "tx", "rx";
+ pinctrl-names = "default";
+ pinctrl-0 = <&qup_i2c7_default>;
+- interrupts = <GIC_SPI 607 IRQ_TYPE_LEVEL_HIGH>;
++ interrupts = <GIC_SPI 608 IRQ_TYPE_LEVEL_HIGH>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ status = "disabled";
+@@ -3805,7 +3805,7 @@
+ };
+
+ mdss_dsi0_phy: phy@ae94400 {
+- compatible = "qcom,dsi-phy-7nm";
++ compatible = "qcom,dsi-phy-7nm-8150";
+ reg = <0 0x0ae94400 0 0x200>,
+ <0 0x0ae94600 0 0x280>,
+ <0 0x0ae94900 0 0x260>;
+@@ -3879,7 +3879,7 @@
+ };
+
+ mdss_dsi1_phy: phy@ae96400 {
+- compatible = "qcom,dsi-phy-7nm";
++ compatible = "qcom,dsi-phy-7nm-8150";
+ reg = <0 0x0ae96400 0 0x200>,
+ <0 0x0ae96600 0 0x280>,
+ <0 0x0ae96900 0 0x260>;
+diff --git a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo-pdx203.dts b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo-pdx203.dts
+index 356a81698731a..62590c6bd3067 100644
+--- a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo-pdx203.dts
++++ b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo-pdx203.dts
+@@ -14,3 +14,236 @@
+ };
+
+ /delete-node/ &vreg_l7f_1p8;
++
++&pm8009_gpios {
++ gpio-line-names = "NC", /* GPIO_1 */
++ "CAM_PWR_LD_EN",
++ "WIDEC_PWR_EN",
++ "NC";
++};
++
++&pm8150_gpios {
++ gpio-line-names = "VOL_DOWN_N", /* GPIO_1 */
++ "OPTION_2",
++ "NC",
++ "PM_SLP_CLK_IN",
++ "OPTION_1",
++ "NC",
++ "NC",
++ "SP_ARI_PWR_ALARM",
++ "NC",
++ "NC"; /* GPIO_10 */
++};
++
++&pm8150b_gpios {
++ gpio-line-names = "SNAPSHOT_N", /* GPIO_1 */
++ "FOCUS_N",
++ "NC",
++ "NC",
++ "RF_LCD_ID_EN",
++ "NC",
++ "NC",
++ "LCD_ID",
++ "NC",
++ "WLC_EN_N", /* GPIO_10 */
++ "NC",
++ "RF_ID";
++};
++
++&pm8150l_gpios {
++ gpio-line-names = "NC", /* GPIO_1 */
++ "PM3003A_EN",
++ "NC",
++ "NC",
++ "NC",
++ "AUX2_THERM",
++ "BB_HP_EN",
++ "FP_LDO_EN",
++ "PMX_RESET_N",
++ "AUX3_THERM", /* GPIO_10 */
++ "DTV_PWR_EN",
++ "PM3003A_MODE";
++};
++
++&tlmm {
++ gpio-line-names = "AP_CTI_IN", /* GPIO_0 */
++ "MDM2AP_ERR_FATAL",
++ "AP_CTI_OUT",
++ "MDM2AP_STATUS",
++ "NFC_I2C_SDA",
++ "NFC_I2C_SCL",
++ "NFC_EN",
++ "NFC_CLK_REQ",
++ "NFC_ESE_PWR_REQ",
++ "DVDT_WRT_DET_AND",
++ "SPK_AMP_RESET_N", /* GPIO_10 */
++ "SPK_AMP_INT_N",
++ "APPS_I2C_1_SDA",
++ "APPS_I2C_1_SCL",
++ "NC",
++ "TX_GTR_THRES_IN",
++ "HST_BT_UART_CTS",
++ "HST_BT_UART_RFR",
++ "HST_BT_UART_TX",
++ "HST_BT_UART_RX",
++ "HST_WLAN_EN", /* GPIO_20 */
++ "HST_BT_EN",
++ "RGBC_IR_PWR_EN",
++ "FP_INT_N",
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "NFC_ESE_SPI_MISO",
++ "NFC_ESE_SPI_MOSI",
++ "NFC_ESE_SPI_SCLK", /* GPIO_30 */
++ "NFC_ESE_SPI_CS_N",
++ "WCD_RST_N",
++ "NC",
++ "SDM_DEBUG_UART_TX",
++ "SDM_DEBUG_UART_RX",
++ "TS_I2C_SDA",
++ "TS_I2C_SCL",
++ "TS_INT_N",
++ "FP_SPI_MISO", /* GPIO_40 */
++ "FP_SPI_MOSI",
++ "FP_SPI_SCLK",
++ "FP_SPI_CS_N",
++ "APPS_I2C_0_SDA",
++ "APPS_I2C_0_SCL",
++ "DISP_ERR_FG",
++ "UIM2_DETECT_EN",
++ "NC",
++ "NC",
++ "NC", /* GPIO_50 */
++ "NC",
++ "MDM_UART_CTS",
++ "MDM_UART_RFR",
++ "MDM_UART_TX",
++ "MDM_UART_RX",
++ "AP2MDM_STATUS",
++ "AP2MDM_ERR_FATAL",
++ "MDM_IPC_HS_UART_TX",
++ "MDM_IPC_HS_UART_RX",
++ "NC", /* GPIO_60 */
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "USB_CC_DIR",
++ "DISP_VSYNC",
++ "NC",
++ "NC",
++ "CAM_PWR_B_CS",
++ "NC", /* GPIO_70 */
++ "CAM_PWR_A_CS",
++ "SBU_SW_SEL",
++ "SBU_SW_OE",
++ "FP_RESET_N",
++ "FP_RESET_N",
++ "DISP_RESET_N",
++ "DEBUG_GPIO0",
++ "TRAY_DET",
++ "CAM2_RST_N",
++ "PCIE0_RST_N",
++ "PCIE0_CLK_REQ_N", /* GPIO_80 */
++ "PCIE0_WAKE_N",
++ "DVDT_ENABLE",
++ "DVDT_WRT_DET_OR",
++ "NC",
++ "PCIE2_RST_N",
++ "PCIE2_CLK_REQ_N",
++ "PCIE2_WAKE_N",
++ "MDM_VFR_IRQ0",
++ "MDM_VFR_IRQ1",
++ "SW_SERVICE", /* GPIO_90 */
++ "CAM_SOF",
++ "CAM1_RST_N",
++ "CAM0_RST_N",
++ "CAM0_MCLK",
++ "CAM1_MCLK",
++ "CAM2_MCLK",
++ "CAM3_MCLK",
++ "CAM4_MCLK",
++ "TOF_RST_N",
++ "NC", /* GPIO_100 */
++ "CCI0_I2C_SDA",
++ "CCI0_I2C_SCL",
++ "CCI1_I2C_SDA",
++ "CCI1_I2C_SCL_",
++ "CCI2_I2C_SDA",
++ "CCI2_I2C_SCL",
++ "CCI3_I2C_SDA",
++ "CCI3_I2C_SCL",
++ "CAM3_RST_N",
++ "NFC_DWL_REQ", /* GPIO_110 */
++ "NFC_IRQ",
++ "XVS",
++ "NC",
++ "RF_ID_EXTENSION",
++ "SPK_AMP_I2C_SDA",
++ "SPK_AMP_I2C_SCL",
++ "NC",
++ "NC",
++ "WLC_I2C_SDA",
++ "WLC_I2C_SCL", /* GPIO_120 */
++ "ACC_COVER_OPEN",
++ "ALS_PROX_INT_N",
++ "ACCEL_INT",
++ "WLAN_SW_CTRL",
++ "CAMSENSOR_I2C_SDA",
++ "CAMSENSOR_I2C_SCL",
++ "UDON_SWITCH_SEL",
++ "WDOG_DISABLE",
++ "BAROMETER_INT",
++ "NC", /* GPIO_130 */
++ "NC",
++ "FORCED_USB_BOOT",
++ "NC",
++ "NC",
++ "WLC_INT_N",
++ "NC",
++ "NC",
++ "RGBC_IR_INT",
++ "NC",
++ "NC", /* GPIO_140 */
++ "NC",
++ "BT_SLIMBUS_CLK",
++ "BT_SLIMBUS_DATA",
++ "HW_ID_0",
++ "HW_ID_1",
++ "WCD_SWR_TX_CLK",
++ "WCD_SWR_TX_DATA0",
++ "WCD_SWR_TX_DATA1",
++ "WCD_SWR_RX_CLK",
++ "WCD_SWR_RX_DATA0", /* GPIO_150 */
++ "WCD_SWR_RX_DATA1",
++ "SDM_DMIC_CLK1",
++ "SDM_DMIC_DATA1",
++ "SDM_DMIC_CLK2",
++ "SDM_DMIC_DATA2",
++ "SPK_AMP_I2S_CLK",
++ "SPK_AMP_I2S_WS",
++ "SPK_AMP_I2S_ASP_DIN",
++ "SPK_AMP_I2S_ASP_DOUT",
++ "COMPASS_I2C_SDA", /* GPIO_160 */
++ "COMPASS_I2C_SCL",
++ "NC",
++ "NC",
++ "SSC_SPI_1_MISO",
++ "SSC_SPI_1_MOSI",
++ "SSC_SPI_1_CLK",
++ "SSC_SPI_1_CS_N",
++ "NC",
++ "NC",
++ "SSC_SENSOR_I2C_SDA", /* GPIO_170 */
++ "SSC_SENSOR_I2C_SCL",
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "HST_BLE_SNS_UART6_TX",
++ "HST_BLE_SNS_UART6_RX",
++ "HST_WLAN_UART_TX",
++ "HST_WLAN_UART_RX";
++};
+diff --git a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo-pdx206.dts b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo-pdx206.dts
+index 01fe3974ee720..58a521046f5f5 100644
+--- a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo-pdx206.dts
++++ b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo-pdx206.dts
+@@ -20,6 +20,8 @@
+ };
+
+ &gpio_keys {
++ pinctrl-0 = <&focus_n &snapshot_n &vol_down_n &g_assist_n>;
++
+ g-assist-key {
+ label = "Google Assistant Key";
+ linux,code = <KEY_LEFTMETA>;
+@@ -30,6 +32,247 @@
+ };
+ };
+
++&pm8009_gpios {
++ gpio-line-names = "NC", /* GPIO_1 */
++ "NC",
++ "WIDEC_PWR_EN",
++ "NC";
++};
++
++&pm8150_gpios {
++ gpio-line-names = "VOL_DOWN_N", /* GPIO_1 */
++ "OPTION_2",
++ "NC",
++ "PM_SLP_CLK_IN",
++ "OPTION_1",
++ "G_ASSIST_N",
++ "NC",
++ "SP_ARI_PWR_ALARM",
++ "NC",
++ "NC"; /* GPIO_10 */
++
++ g_assist_n: g-assist-n-state {
++ pins = "gpio6";
++ function = "normal";
++ power-source = <1>;
++ bias-pull-up;
++ input-enable;
++ };
++};
++
++&pm8150b_gpios {
++ gpio-line-names = "SNAPSHOT_N", /* GPIO_1 */
++ "FOCUS_N",
++ "NC",
++ "NC",
++ "RF_LCD_ID_EN",
++ "NC",
++ "NC",
++ "LCD_ID",
++ "NC",
++ "NC", /* GPIO_10 */
++ "NC",
++ "RF_ID";
++};
++
++&pm8150l_gpios {
++ gpio-line-names = "NC", /* GPIO_1 */
++ "PM3003A_EN",
++ "NC",
++ "NC",
++ "NC",
++ "AUX2_THERM",
++ "BB_HP_EN",
++ "FP_LDO_EN",
++ "PMX_RESET_N",
++ "NC", /* GPIO_10 */
++ "NC",
++ "PM3003A_MODE";
++};
++
++&tlmm {
++ gpio-line-names = "AP_CTI_IN", /* GPIO_0 */
++ "MDM2AP_ERR_FATAL",
++ "AP_CTI_OUT",
++ "MDM2AP_STATUS",
++ "NFC_I2C_SDA",
++ "NFC_I2C_SCL",
++ "NFC_EN",
++ "NFC_CLK_REQ",
++ "NFC_ESE_PWR_REQ",
++ "DVDT_WRT_DET_AND",
++ "SPK_AMP_RESET_N", /* GPIO_10 */
++ "SPK_AMP_INT_N",
++ "APPS_I2C_1_SDA",
++ "APPS_I2C_1_SCL",
++ "NC",
++ "TX_GTR_THRES_IN",
++ "HST_BT_UART_CTS",
++ "HST_BT_UART_RFR",
++ "HST_BT_UART_TX",
++ "HST_BT_UART_RX",
++ "HST_WLAN_EN", /* GPIO_20 */
++ "HST_BT_EN",
++ "RGBC_IR_PWR_EN",
++ "FP_INT_N",
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "NFC_ESE_SPI_MISO",
++ "NFC_ESE_SPI_MOSI",
++ "NFC_ESE_SPI_SCLK", /* GPIO_30 */
++ "NFC_ESE_SPI_CS_N",
++ "WCD_RST_N",
++ "NC",
++ "SDM_DEBUG_UART_TX",
++ "SDM_DEBUG_UART_RX",
++ "TS_I2C_SDA",
++ "TS_I2C_SCL",
++ "TS_INT_N",
++ "FP_SPI_MISO", /* GPIO_40 */
++ "FP_SPI_MOSI",
++ "FP_SPI_SCLK",
++ "FP_SPI_CS_N",
++ "APPS_I2C_0_SDA",
++ "APPS_I2C_0_SCL",
++ "DISP_ERR_FG",
++ "UIM2_DETECT_EN",
++ "NC",
++ "NC",
++ "NC", /* GPIO_50 */
++ "NC",
++ "MDM_UART_CTS",
++ "MDM_UART_RFR",
++ "MDM_UART_TX",
++ "MDM_UART_RX",
++ "AP2MDM_STATUS",
++ "AP2MDM_ERR_FATAL",
++ "MDM_IPC_HS_UART_TX",
++ "MDM_IPC_HS_UART_RX",
++ "NC", /* GPIO_60 */
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "USB_CC_DIR",
++ "DISP_VSYNC",
++ "NC",
++ "NC",
++ "CAM_PWR_B_CS",
++ "NC", /* GPIO_70 */
++ "FRONTC_PWR_EN",
++ "SBU_SW_SEL",
++ "SBU_SW_OE",
++ "FP_RESET_N",
++ "FP_RESET_N",
++ "DISP_RESET_N",
++ "DEBUG_GPIO0",
++ "TRAY_DET",
++ "CAM2_RST_N",
++ "PCIE0_RST_N",
++ "PCIE0_CLK_REQ_N", /* GPIO_80 */
++ "PCIE0_WAKE_N",
++ "DVDT_ENABLE",
++ "DVDT_WRT_DET_OR",
++ "NC",
++ "PCIE2_RST_N",
++ "PCIE2_CLK_REQ_N",
++ "PCIE2_WAKE_N",
++ "MDM_VFR_IRQ0",
++ "MDM_VFR_IRQ1",
++ "SW_SERVICE", /* GPIO_90 */
++ "CAM_SOF",
++ "CAM1_RST_N",
++ "CAM0_RST_N",
++ "CAM0_MCLK",
++ "CAM1_MCLK",
++ "CAM2_MCLK",
++ "CAM3_MCLK",
++ "NC",
++ "NC",
++ "NC", /* GPIO_100 */
++ "CCI0_I2C_SDA",
++ "CCI0_I2C_SCL",
++ "CCI1_I2C_SDA",
++ "CCI1_I2C_SCL_",
++ "CCI2_I2C_SDA",
++ "CCI2_I2C_SCL",
++ "CCI3_I2C_SDA",
++ "CCI3_I2C_SCL",
++ "CAM3_RST_N",
++ "NFC_DWL_REQ", /* GPIO_110 */
++ "NFC_IRQ",
++ "XVS",
++ "NC",
++ "RF_ID_EXTENSION",
++ "SPK_AMP_I2C_SDA",
++ "SPK_AMP_I2C_SCL",
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "ACC_COVER_OPEN",
++ "ALS_PROX_INT_N",
++ "ACCEL_INT",
++ "WLAN_SW_CTRL",
++ "CAMSENSOR_I2C_SDA",
++ "CAMSENSOR_I2C_SCL",
++ "UDON_SWITCH_SEL",
++ "WDOG_DISABLE",
++ "BAROMETER_INT",
++ "NC", /* GPIO_130 */
++ "NC",
++ "FORCED_USB_BOOT",
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "RGBC_IR_INT",
++ "NC",
++ "NC", /* GPIO_140 */
++ "NC",
++ "BT_SLIMBUS_CLK",
++ "BT_SLIMBUS_DATA",
++ "HW_ID_0",
++ "HW_ID_1",
++ "WCD_SWR_TX_CLK",
++ "WCD_SWR_TX_DATA0",
++ "WCD_SWR_TX_DATA1",
++ "WCD_SWR_RX_CLK",
++ "WCD_SWR_RX_DATA0", /* GPIO_150 */
++ "WCD_SWR_RX_DATA1",
++ "SDM_DMIC_CLK1",
++ "SDM_DMIC_DATA1",
++ "SDM_DMIC_CLK2",
++ "SDM_DMIC_DATA2",
++ "SPK_AMP_I2S_CLK",
++ "SPK_AMP_I2S_WS",
++ "SPK_AMP_I2S_ASP_DIN",
++ "SPK_AMP_I2S_ASP_DOUT",
++ "COMPASS_I2C_SDA", /* GPIO_160 */
++ "COMPASS_I2C_SCL",
++ "NC",
++ "NC",
++ "SSC_SPI_1_MISO",
++ "SSC_SPI_1_MOSI",
++ "SSC_SPI_1_CLK",
++ "SSC_SPI_1_CS_N",
++ "NC",
++ "NC",
++ "SSC_SENSOR_I2C_SDA", /* GPIO_170 */
++ "SSC_SENSOR_I2C_SCL",
++ "NC",
++ "NC",
++ "NC",
++ "NC",
++ "HST_BLE_SNS_UART6_TX",
++ "HST_BLE_SNS_UART6_RX",
++ "HST_WLAN_UART_TX",
++ "HST_WLAN_UART_RX";
++};
++
+ &vreg_l2f_1p3 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+diff --git a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi
+index dcabb714f0f35..0268d80248e5c 100644
+--- a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi
+@@ -51,12 +51,26 @@
+ gpio_keys: gpio-keys {
+ compatible = "gpio-keys";
+
+- /*
+- * Camera focus (light press) and camera snapshot (full press)
+- * seem not to work properly.. Adding the former one stalls the CPU
+- * and the latter kills the volume down key for whatever reason. In any
+- * case, they are both on &pm8150b_gpios: camera focus(2), camera snapshot(1).
+- */
++ pinctrl-0 = <&focus_n &snapshot_n &vol_down_n>;
++ pinctrl-names = "default";
++
++ key-camera-focus {
++ label = "Camera Focus";
++ linux,code = <KEY_CAMERA_FOCUS>;
++ gpios = <&pm8150b_gpios 2 GPIO_ACTIVE_LOW>;
++ debounce-interval = <15>;
++ linux,can-disable;
++ wakeup-source;
++ };
++
++ key-camera-snapshot {
++ label = "Camera Snapshot";
++ linux,code = <KEY_CAMERA>;
++ gpios = <&pm8150b_gpios 1 GPIO_ACTIVE_LOW>;
++ debounce-interval = <15>;
++ linux,can-disable;
++ wakeup-source;
++ };
+
+ key-vol-down {
+ label = "Volume Down";
+@@ -551,6 +565,34 @@
+ vdda-pll-supply = <&vreg_l9a_1p2>;
+ };
+
++&pm8150_gpios {
++ vol_down_n: vol-down-n-state {
++ pins = "gpio1";
++ function = "normal";
++ power-source = <0>;
++ bias-pull-up;
++ input-enable;
++ };
++};
++
++&pm8150b_gpios {
++ snapshot_n: snapshot-n-state {
++ pins = "gpio1";
++ function = "normal";
++ power-source = <0>;
++ bias-pull-up;
++ input-enable;
++ };
++
++ focus_n: focus-n-state {
++ pins = "gpio2";
++ function = "normal";
++ power-source = <0>;
++ bias-pull-up;
++ input-enable;
++ };
++};
++
+ &pon_pwrkey {
+ status = "okay";
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi
+index 7bea916900e29..29f0b0381b278 100644
+--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
+@@ -100,7 +100,7 @@
+ clocks = <&cpufreq_hw 0>;
+ enable-method = "psci";
+ capacity-dmips-mhz = <448>;
+- dynamic-power-coefficient = <205>;
++ dynamic-power-coefficient = <105>;
+ next-level-cache = <&L2_0>;
+ power-domains = <&CPU_PD0>;
+ power-domain-names = "psci";
+@@ -131,7 +131,7 @@
+ clocks = <&cpufreq_hw 0>;
+ enable-method = "psci";
+ capacity-dmips-mhz = <448>;
+- dynamic-power-coefficient = <205>;
++ dynamic-power-coefficient = <105>;
+ next-level-cache = <&L2_100>;
+ power-domains = <&CPU_PD1>;
+ power-domain-names = "psci";
+@@ -156,7 +156,7 @@
+ clocks = <&cpufreq_hw 0>;
+ enable-method = "psci";
+ capacity-dmips-mhz = <448>;
+- dynamic-power-coefficient = <205>;
++ dynamic-power-coefficient = <105>;
+ next-level-cache = <&L2_200>;
+ power-domains = <&CPU_PD2>;
+ power-domain-names = "psci";
+@@ -181,7 +181,7 @@
+ clocks = <&cpufreq_hw 0>;
+ enable-method = "psci";
+ capacity-dmips-mhz = <448>;
+- dynamic-power-coefficient = <205>;
++ dynamic-power-coefficient = <105>;
+ next-level-cache = <&L2_300>;
+ power-domains = <&CPU_PD3>;
+ power-domain-names = "psci";
+@@ -1905,6 +1905,7 @@
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&pcie0_default_state>;
++ dma-coherent;
+
+ status = "disabled";
+ };
+@@ -2011,6 +2012,7 @@
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&pcie1_default_state>;
++ dma-coherent;
+
+ status = "disabled";
+ };
+@@ -2119,6 +2121,7 @@
+
+ pinctrl-names = "default";
+ pinctrl-0 = <&pcie2_default_state>;
++ dma-coherent;
+
+ status = "disabled";
+ };
+diff --git a/arch/arm64/boot/dts/qcom/sm8350.dtsi b/arch/arm64/boot/dts/qcom/sm8350.dtsi
+index 425af2c38a37f..f375d960ba85c 100644
+--- a/arch/arm64/boot/dts/qcom/sm8350.dtsi
++++ b/arch/arm64/boot/dts/qcom/sm8350.dtsi
+@@ -48,7 +48,7 @@
+
+ CPU0: cpu@0 {
+ device_type = "cpu";
+- compatible = "qcom,kryo685";
++ compatible = "arm,cortex-a55";
+ reg = <0x0 0x0>;
+ clocks = <&cpufreq_hw 0>;
+ enable-method = "psci";
+@@ -72,7 +72,7 @@
+
+ CPU1: cpu@100 {
+ device_type = "cpu";
+- compatible = "qcom,kryo685";
++ compatible = "arm,cortex-a55";
+ reg = <0x0 0x100>;
+ clocks = <&cpufreq_hw 0>;
+ enable-method = "psci";
+@@ -91,7 +91,7 @@
+
+ CPU2: cpu@200 {
+ device_type = "cpu";
+- compatible = "qcom,kryo685";
++ compatible = "arm,cortex-a55";
+ reg = <0x0 0x200>;
+ clocks = <&cpufreq_hw 0>;
+ enable-method = "psci";
+@@ -110,7 +110,7 @@
+
+ CPU3: cpu@300 {
+ device_type = "cpu";
+- compatible = "qcom,kryo685";
++ compatible = "arm,cortex-a55";
+ reg = <0x0 0x300>;
+ clocks = <&cpufreq_hw 0>;
+ enable-method = "psci";
+@@ -129,7 +129,7 @@
+
+ CPU4: cpu@400 {
+ device_type = "cpu";
+- compatible = "qcom,kryo685";
++ compatible = "arm,cortex-a78";
+ reg = <0x0 0x400>;
+ clocks = <&cpufreq_hw 1>;
+ enable-method = "psci";
+@@ -148,7 +148,7 @@
+
+ CPU5: cpu@500 {
+ device_type = "cpu";
+- compatible = "qcom,kryo685";
++ compatible = "arm,cortex-a78";
+ reg = <0x0 0x500>;
+ clocks = <&cpufreq_hw 1>;
+ enable-method = "psci";
+@@ -167,7 +167,7 @@
+
+ CPU6: cpu@600 {
+ device_type = "cpu";
+- compatible = "qcom,kryo685";
++ compatible = "arm,cortex-a78";
+ reg = <0x0 0x600>;
+ clocks = <&cpufreq_hw 1>;
+ enable-method = "psci";
+@@ -186,7 +186,7 @@
+
+ CPU7: cpu@700 {
+ device_type = "cpu";
+- compatible = "qcom,kryo685";
++ compatible = "arm,cortex-x1";
+ reg = <0x0 0x700>;
+ clocks = <&cpufreq_hw 2>;
+ enable-method = "psci";
+@@ -246,8 +246,8 @@
+ compatible = "arm,idle-state";
+ idle-state-name = "silver-rail-power-collapse";
+ arm,psci-suspend-param = <0x40000004>;
+- entry-latency-us = <355>;
+- exit-latency-us = <909>;
++ entry-latency-us = <360>;
++ exit-latency-us = <531>;
+ min-residency-us = <3934>;
+ local-timer-stop;
+ };
+@@ -256,8 +256,8 @@
+ compatible = "arm,idle-state";
+ idle-state-name = "gold-rail-power-collapse";
+ arm,psci-suspend-param = <0x40000004>;
+- entry-latency-us = <241>;
+- exit-latency-us = <1461>;
++ entry-latency-us = <702>;
++ exit-latency-us = <1061>;
+ min-residency-us = <4488>;
+ local-timer-stop;
+ };
+@@ -3339,6 +3339,13 @@
+ <0 0x18593000 0 0x1000>;
+ reg-names = "freq-domain0", "freq-domain1", "freq-domain2";
+
++ interrupts = <GIC_SPI 30 IRQ_TYPE_LEVEL_HIGH>,
++ <GIC_SPI 31 IRQ_TYPE_LEVEL_HIGH>,
++ <GIC_SPI 19 IRQ_TYPE_LEVEL_HIGH>;
++ interrupt-names = "dcvsh-irq-0",
++ "dcvsh-irq-1",
++ "dcvsh-irq-2";
++
+ clocks = <&rpmhcc RPMH_CXO_CLK>, <&gcc GCC_GPLL0>;
+ clock-names = "xo", "alternate";
+
+diff --git a/arch/arm64/boot/dts/qcom/sm8450-hdk.dts b/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
+index e931545a2cac4..50306d070883d 100644
+--- a/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
++++ b/arch/arm64/boot/dts/qcom/sm8450-hdk.dts
+@@ -14,7 +14,6 @@
+ #include "pm8450.dtsi"
+ #include "pmk8350.dtsi"
+ #include "pmr735a.dtsi"
+-#include "pmr735b.dtsi"
+
+ / {
+ model = "Qualcomm Technologies, Inc. SM8450 HDK";
+diff --git a/arch/arm64/boot/dts/qcom/sm8550-mtp.dts b/arch/arm64/boot/dts/qcom/sm8550-mtp.dts
+index e2b9bb6b1e279..ea304d1e6f8f6 100644
+--- a/arch/arm64/boot/dts/qcom/sm8550-mtp.dts
++++ b/arch/arm64/boot/dts/qcom/sm8550-mtp.dts
+@@ -79,6 +79,7 @@
+
+ vdd-bob1-supply = <&vph_pwr>;
+ vdd-bob2-supply = <&vph_pwr>;
++ vdd-l1-l4-l10-supply = <&vreg_s6g_1p8>;
+ vdd-l2-l13-l14-supply = <&vreg_bob1>;
+ vdd-l3-supply = <&vreg_s4g_1p3>;
+ vdd-l5-l16-supply = <&vreg_bob1>;
+diff --git a/arch/arm64/boot/dts/rockchip/rk3399-eaidk-610.dts b/arch/arm64/boot/dts/rockchip/rk3399-eaidk-610.dts
+index d1f343345f674..6464ef4d113dd 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3399-eaidk-610.dts
++++ b/arch/arm64/boot/dts/rockchip/rk3399-eaidk-610.dts
+@@ -773,7 +773,7 @@
+ compatible = "brcm,bcm4329-fmac";
+ reg = <1>;
+ interrupt-parent = <&gpio0>;
+- interrupts = <RK_PA3 GPIO_ACTIVE_HIGH>;
++ interrupts = <RK_PA3 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "host-wake";
+ pinctrl-names = "default";
+ pinctrl-0 = <&wifi_host_wake_l>;
+diff --git a/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4b-plus.dts b/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4b-plus.dts
+index cec3b7b1b9474..8a17c1eaae15e 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4b-plus.dts
++++ b/arch/arm64/boot/dts/rockchip/rk3399-rock-pi-4b-plus.dts
+@@ -31,7 +31,7 @@
+ compatible = "brcm,bcm4329-fmac";
+ reg = <1>;
+ interrupt-parent = <&gpio0>;
+- interrupts = <RK_PA3 GPIO_ACTIVE_HIGH>;
++ interrupts = <RK_PA3 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "host-wake";
+ pinctrl-names = "default";
+ pinctrl-0 = <&wifi_host_wake_l>;
+diff --git a/arch/arm64/boot/dts/rockchip/rk3566-box-demo.dts b/arch/arm64/boot/dts/rockchip/rk3566-box-demo.dts
+index 410cd3e5e7bca..538db870993d4 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3566-box-demo.dts
++++ b/arch/arm64/boot/dts/rockchip/rk3566-box-demo.dts
+@@ -416,7 +416,7 @@
+ compatible = "brcm,bcm4329-fmac";
+ reg = <1>;
+ interrupt-parent = <&gpio2>;
+- interrupts = <RK_PB2 GPIO_ACTIVE_HIGH>;
++ interrupts = <RK_PB2 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "host-wake";
+ pinctrl-names = "default";
+ pinctrl-0 = <&wifi_host_wake_h>;
+diff --git a/arch/arm64/boot/dts/rockchip/rk3568-radxa-e25.dts b/arch/arm64/boot/dts/rockchip/rk3568-radxa-e25.dts
+index 63c4bd873188e..72ad74c38a2b4 100644
+--- a/arch/arm64/boot/dts/rockchip/rk3568-radxa-e25.dts
++++ b/arch/arm64/boot/dts/rockchip/rk3568-radxa-e25.dts
+@@ -47,6 +47,9 @@
+ vin-supply = <&vcc5v0_sys>;
+ };
+
++ /* actually fed by vcc5v0_sys, dependent
++ * on pi6c clock generator
++ */
+ vcc3v3_minipcie: vcc3v3-minipcie-regulator {
+ compatible = "regulator-fixed";
+ enable-active-high;
+@@ -54,9 +57,9 @@
+ pinctrl-names = "default";
+ pinctrl-0 = <&minipcie_enable_h>;
+ regulator-name = "vcc3v3_minipcie";
+- regulator-min-microvolt = <5000000>;
+- regulator-max-microvolt = <5000000>;
+- vin-supply = <&vcc5v0_sys>;
++ regulator-min-microvolt = <3300000>;
++ regulator-max-microvolt = <3300000>;
++ vin-supply = <&vcc3v3_pi6c_05>;
+ };
+
+ vcc3v3_ngff: vcc3v3-ngff-regulator {
+@@ -71,9 +74,6 @@
+ vin-supply = <&vcc5v0_sys>;
+ };
+
+- /* actually fed by vcc5v0_sys, dependent
+- * on pi6c clock generator
+- */
+ vcc3v3_pcie30x1: vcc3v3-pcie30x1-regulator {
+ compatible = "regulator-fixed";
+ enable-active-high;
+@@ -83,7 +83,7 @@
+ regulator-name = "vcc3v3_pcie30x1";
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+- vin-supply = <&vcc3v3_pi6c_05>;
++ vin-supply = <&vcc5v0_sys>;
+ };
+
+ vcc3v3_pi6c_05: vcc3v3-pi6c-05-regulator {
+@@ -99,6 +99,10 @@
+ };
+ };
+
++&combphy1 {
++ phy-supply = <&vcc3v3_pcie30x1>;
++};
++
+ &pcie2x1 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&pcie20_reset_h>;
+@@ -117,7 +121,7 @@
+ pinctrl-names = "default";
+ pinctrl-0 = <&pcie30x1m0_pins>;
+ reset-gpios = <&gpio0 RK_PC3 GPIO_ACTIVE_HIGH>;
+- vpcie3v3-supply = <&vcc3v3_pcie30x1>;
++ vpcie3v3-supply = <&vcc3v3_minipcie>;
+ status = "okay";
+ };
+
+@@ -178,6 +182,10 @@
+ status = "okay";
+ };
+
++&sata1 {
++ status = "okay";
++};
++
+ &sdmmc0 {
+ bus-width = <4>;
+ cap-sd-highspeed;
+diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
+index 976f8303c84f4..5629a13d9fc43 100644
+--- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
+@@ -248,7 +248,7 @@
+ status = "okay";
+ pinctrl-names = "default";
+ pinctrl-0 = <&main_i2c1_pins_default>;
+- clock-frequency = <400000>;
++ clock-frequency = <100000>;
+
+ tlv320aic3106: audio-codec@1b {
+ #sound-dai-cells = <0>;
+diff --git a/arch/arm64/boot/dts/ti/k3-j784s4-main.dtsi b/arch/arm64/boot/dts/ti/k3-j784s4-main.dtsi
+index e9169eb358c16..320c31cba0a43 100644
+--- a/arch/arm64/boot/dts/ti/k3-j784s4-main.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j784s4-main.dtsi
+@@ -60,7 +60,7 @@
+ #interrupt-cells = <1>;
+ ti,sci = <&sms>;
+ ti,sci-dev-id = <10>;
+- ti,interrupt-ranges = <8 360 56>;
++ ti,interrupt-ranges = <8 392 56>;
+ };
+
+ main_pmx0: pinctrl@11c000 {
+diff --git a/arch/arm64/boot/dts/ti/k3-j784s4-mcu-wakeup.dtsi b/arch/arm64/boot/dts/ti/k3-j784s4-mcu-wakeup.dtsi
+index ed2b40369c59a..77208349eb22b 100644
+--- a/arch/arm64/boot/dts/ti/k3-j784s4-mcu-wakeup.dtsi
++++ b/arch/arm64/boot/dts/ti/k3-j784s4-mcu-wakeup.dtsi
+@@ -92,7 +92,7 @@
+ #interrupt-cells = <1>;
+ ti,sci = <&sms>;
+ ti,sci-dev-id = <177>;
+- ti,interrupt-ranges = <16 928 16>;
++ ti,interrupt-ranges = <16 960 16>;
+ };
+
+ mcu_conf: syscon@40f00000 {
+diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
+index a24609e14d50e..fdfe54f35cf8e 100644
+--- a/arch/arm64/configs/defconfig
++++ b/arch/arm64/configs/defconfig
+@@ -1107,7 +1107,6 @@ CONFIG_XEN_GNTDEV=y
+ CONFIG_XEN_GRANT_DEV_ALLOC=y
+ CONFIG_STAGING=y
+ CONFIG_STAGING_MEDIA=y
+-CONFIG_VIDEO_IMX_MEDIA=m
+ CONFIG_VIDEO_MAX96712=m
+ CONFIG_CHROME_PLATFORMS=y
+ CONFIG_CROS_EC=y
+@@ -1159,6 +1158,7 @@ CONFIG_IPQ_GCC_8074=y
+ CONFIG_IPQ_GCC_9574=y
+ CONFIG_MSM_GCC_8916=y
+ CONFIG_MSM_GCC_8994=y
++CONFIG_MSM_GCC_8996=y
+ CONFIG_MSM_MMCC_8994=m
+ CONFIG_MSM_MMCC_8996=m
+ CONFIG_MSM_MMCC_8998=m
+diff --git a/arch/arm64/include/asm/sdei.h b/arch/arm64/include/asm/sdei.h
+index 4292d9bafb9d2..484cb6972e99a 100644
+--- a/arch/arm64/include/asm/sdei.h
++++ b/arch/arm64/include/asm/sdei.h
+@@ -17,6 +17,9 @@
+
+ #include <asm/virt.h>
+
++DECLARE_PER_CPU(struct sdei_registered_event *, sdei_active_normal_event);
++DECLARE_PER_CPU(struct sdei_registered_event *, sdei_active_critical_event);
++
+ extern unsigned long sdei_exit_mode;
+
+ /* Software Delegated Exception entry point from firmware*/
+@@ -29,6 +32,9 @@ asmlinkage void __sdei_asm_entry_trampoline(unsigned long event_num,
+ unsigned long pc,
+ unsigned long pstate);
+
++/* Abort a running handler. Context is discarded. */
++void __sdei_handler_abort(void);
++
+ /*
+ * The above entry point does the minimum to call C code. This function does
+ * anything else, before calling the driver.
+diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
+index ab2a6e33c0528..1b4a65a33186d 100644
+--- a/arch/arm64/kernel/entry.S
++++ b/arch/arm64/kernel/entry.S
+@@ -1003,9 +1003,13 @@ SYM_CODE_START(__sdei_asm_handler)
+
+ mov x19, x1
+
+-#if defined(CONFIG_VMAP_STACK) || defined(CONFIG_SHADOW_CALL_STACK)
++ /* Store the registered-event for crash_smp_send_stop() */
+ ldrb w4, [x19, #SDEI_EVENT_PRIORITY]
+-#endif
++ cbnz w4, 1f
++ adr_this_cpu dst=x5, sym=sdei_active_normal_event, tmp=x6
++ b 2f
++1: adr_this_cpu dst=x5, sym=sdei_active_critical_event, tmp=x6
++2: str x19, [x5]
+
+ #ifdef CONFIG_VMAP_STACK
+ /*
+@@ -1072,6 +1076,14 @@ SYM_CODE_START(__sdei_asm_handler)
+
+ ldr_l x2, sdei_exit_mode
+
++ /* Clear the registered-event seen by crash_smp_send_stop() */
++ ldrb w3, [x4, #SDEI_EVENT_PRIORITY]
++ cbnz w3, 1f
++ adr_this_cpu dst=x5, sym=sdei_active_normal_event, tmp=x6
++ b 2f
++1: adr_this_cpu dst=x5, sym=sdei_active_critical_event, tmp=x6
++2: str xzr, [x5]
++
+ alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
+ sdei_handler_exit exit_mode=x2
+ alternative_else_nop_endif
+@@ -1082,4 +1094,15 @@ alternative_else_nop_endif
+ #endif
+ SYM_CODE_END(__sdei_asm_handler)
+ NOKPROBE(__sdei_asm_handler)
++
++SYM_CODE_START(__sdei_handler_abort)
++ mov_q x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
++ adr x1, 1f
++ ldr_l x2, sdei_exit_mode
++ sdei_handler_exit exit_mode=x2
++ // exit the handler and jump to the next instruction.
++ // Exit will stomp x0-x17, PSTATE, ELR_ELx, and SPSR_ELx.
++1: ret
++SYM_CODE_END(__sdei_handler_abort)
++NOKPROBE(__sdei_handler_abort)
+ #endif /* CONFIG_ARM_SDE_INTERFACE */
+diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
+index 087c05aa960ea..91e44ac7150f9 100644
+--- a/arch/arm64/kernel/fpsimd.c
++++ b/arch/arm64/kernel/fpsimd.c
+@@ -1179,9 +1179,6 @@ void sve_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
+ */
+ u64 read_zcr_features(void)
+ {
+- u64 zcr;
+- unsigned int vq_max;
+-
+ /*
+ * Set the maximum possible VL, and write zeroes to all other
+ * bits to see if they stick.
+@@ -1189,12 +1186,8 @@ u64 read_zcr_features(void)
+ sve_kernel_enable(NULL);
+ write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
+
+- zcr = read_sysreg_s(SYS_ZCR_EL1);
+- zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
+- vq_max = sve_vq_from_vl(sve_get_vl());
+- zcr |= vq_max - 1; /* set LEN field to maximum effective value */
+-
+- return zcr;
++ /* Return LEN value that would be written to get the maximum VL */
++ return sve_vq_from_vl(sve_get_vl()) - 1;
+ }
+
+ void __init sve_setup(void)
+@@ -1349,9 +1342,6 @@ void fa64_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
+ */
+ u64 read_smcr_features(void)
+ {
+- u64 smcr;
+- unsigned int vq_max;
+-
+ sme_kernel_enable(NULL);
+
+ /*
+@@ -1360,12 +1350,8 @@ u64 read_smcr_features(void)
+ write_sysreg_s(read_sysreg_s(SYS_SMCR_EL1) | SMCR_ELx_LEN_MASK,
+ SYS_SMCR_EL1);
+
+- smcr = read_sysreg_s(SYS_SMCR_EL1);
+- smcr &= ~(u64)SMCR_ELx_LEN_MASK; /* Only the LEN field */
+- vq_max = sve_vq_from_vl(sme_get_vl());
+- smcr |= vq_max - 1; /* set LEN field to maximum effective value */
+-
+- return smcr;
++ /* Return LEN value that would be written to get the maximum VL */
++ return sve_vq_from_vl(sme_get_vl()) - 1;
+ }
+
+ void __init sme_setup(void)
+diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
+index 187aa2b175b4f..20d7ef82de90a 100644
+--- a/arch/arm64/kernel/ptrace.c
++++ b/arch/arm64/kernel/ptrace.c
+@@ -891,7 +891,8 @@ static int sve_set_common(struct task_struct *target,
+ break;
+ default:
+ WARN_ON_ONCE(1);
+- return -EINVAL;
++ ret = -EINVAL;
++ goto out;
+ }
+
+ /*
+diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
+index 830be01af32db..255d12f881c26 100644
+--- a/arch/arm64/kernel/sdei.c
++++ b/arch/arm64/kernel/sdei.c
+@@ -47,6 +47,9 @@ DEFINE_PER_CPU(unsigned long *, sdei_shadow_call_stack_normal_ptr);
+ DEFINE_PER_CPU(unsigned long *, sdei_shadow_call_stack_critical_ptr);
+ #endif
+
++DEFINE_PER_CPU(struct sdei_registered_event *, sdei_active_normal_event);
++DEFINE_PER_CPU(struct sdei_registered_event *, sdei_active_critical_event);
++
+ static void _free_sdei_stack(unsigned long * __percpu *ptr, int cpu)
+ {
+ unsigned long *p;
+diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
+index d00d4cbb31b16..c6b882e589e69 100644
+--- a/arch/arm64/kernel/smp.c
++++ b/arch/arm64/kernel/smp.c
+@@ -1048,10 +1048,8 @@ void crash_smp_send_stop(void)
+ * If this cpu is the only one alive at this point in time, online or
+ * not, there are no stop messages to be sent around, so just back out.
+ */
+- if (num_other_online_cpus() == 0) {
+- sdei_mask_local_cpu();
+- return;
+- }
++ if (num_other_online_cpus() == 0)
++ goto skip_ipi;
+
+ cpumask_copy(&mask, cpu_online_mask);
+ cpumask_clear_cpu(smp_processor_id(), &mask);
+@@ -1070,7 +1068,9 @@ void crash_smp_send_stop(void)
+ pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
+ cpumask_pr_args(&mask));
+
++skip_ipi:
+ sdei_mask_local_cpu();
++ sdei_handler_abort();
+ }
+
+ bool smp_crash_stop_failed(void)
+diff --git a/arch/arm64/lib/csum.c b/arch/arm64/lib/csum.c
+index 78b87a64ca0a3..2432683e48a61 100644
+--- a/arch/arm64/lib/csum.c
++++ b/arch/arm64/lib/csum.c
+@@ -24,7 +24,7 @@ unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
+ const u64 *ptr;
+ u64 data, sum64 = 0;
+
+- if (unlikely(len == 0))
++ if (unlikely(len <= 0))
+ return 0;
+
+ offset = (unsigned long)buff & 7;
+diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
+index 95364e8bdc194..50a8e7ab5fa94 100644
+--- a/arch/arm64/mm/hugetlbpage.c
++++ b/arch/arm64/mm/hugetlbpage.c
+@@ -236,7 +236,7 @@ static void clear_flush(struct mm_struct *mm,
+ unsigned long i, saddr = addr;
+
+ for (i = 0; i < ncontig; i++, addr += pgsize, ptep++)
+- pte_clear(mm, addr, ptep);
++ ptep_clear(mm, addr, ptep);
+
+ flush_tlb_range(&vma, saddr, addr);
+ }
+diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
+index 63a637fdf6c28..95629322241f5 100644
+--- a/arch/loongarch/Makefile
++++ b/arch/loongarch/Makefile
+@@ -106,7 +106,7 @@ KBUILD_CFLAGS += -isystem $(shell $(CC) -print-file-name=include)
+
+ KBUILD_LDFLAGS += -m $(ld-emul)
+
+-ifdef CONFIG_LOONGARCH
++ifdef need-compiler
+ CHECKFLAGS += $(shell $(CC) $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \
+ grep -E -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \
+ sed -e "s/^\#define /-D'/" -e "s/ /'='/" -e "s/$$/'/" -e 's/\$$/&&/g')
+diff --git a/arch/loongarch/include/asm/fpu.h b/arch/loongarch/include/asm/fpu.h
+index 192f8e35d9126..b1dc4200ae6a4 100644
+--- a/arch/loongarch/include/asm/fpu.h
++++ b/arch/loongarch/include/asm/fpu.h
+@@ -117,16 +117,30 @@ static inline void restore_fp(struct task_struct *tsk)
+ _restore_fp(&tsk->thread.fpu);
+ }
+
+-static inline union fpureg *get_fpu_regs(struct task_struct *tsk)
++static inline void save_fpu_regs(struct task_struct *tsk)
+ {
++ unsigned int euen;
++
+ if (tsk == current) {
+ preempt_disable();
+- if (is_fpu_owner())
++
++ euen = csr_read32(LOONGARCH_CSR_EUEN);
++
++#ifdef CONFIG_CPU_HAS_LASX
++ if (euen & CSR_EUEN_LASXEN)
++ _save_lasx(&current->thread.fpu);
++ else
++#endif
++#ifdef CONFIG_CPU_HAS_LSX
++ if (euen & CSR_EUEN_LSXEN)
++ _save_lsx(&current->thread.fpu);
++ else
++#endif
++ if (euen & CSR_EUEN_FPEN)
+ _save_fp(&current->thread.fpu);
++
+ preempt_enable();
+ }
+-
+- return tsk->thread.fpu.fpr;
+ }
+
+ #endif /* _ASM_FPU_H */
+diff --git a/arch/loongarch/include/asm/local.h b/arch/loongarch/include/asm/local.h
+index 83e995b30e472..c49675852bdcd 100644
+--- a/arch/loongarch/include/asm/local.h
++++ b/arch/loongarch/include/asm/local.h
+@@ -63,8 +63,8 @@ static inline long local_cmpxchg(local_t *l, long old, long new)
+
+ static inline bool local_try_cmpxchg(local_t *l, long *old, long new)
+ {
+- typeof(l->a.counter) *__old = (typeof(l->a.counter) *) old;
+- return try_cmpxchg_local(&l->a.counter, __old, new);
++ return try_cmpxchg_local(&l->a.counter,
++ (typeof(l->a.counter) *) old, new);
+ }
+
+ #define local_xchg(l, n) (atomic_long_xchg((&(l)->a), (n)))
+diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
+index 9a9f9ff9b7098..5f71e789dcf68 100644
+--- a/arch/loongarch/include/asm/pgtable.h
++++ b/arch/loongarch/include/asm/pgtable.h
+@@ -593,6 +593,9 @@ static inline long pmd_protnone(pmd_t pmd)
+ }
+ #endif /* CONFIG_NUMA_BALANCING */
+
++#define pmd_leaf(pmd) ((pmd_val(pmd) & _PAGE_HUGE) != 0)
++#define pud_leaf(pud) ((pud_val(pud) & _PAGE_HUGE) != 0)
++
+ /*
+ * We provide our own get_unmapped area to cope with the virtual aliasing
+ * constraints placed on us by the cache architecture.
+diff --git a/arch/loongarch/kernel/ptrace.c b/arch/loongarch/kernel/ptrace.c
+index 5fcffb4523676..286c0ca39eae0 100644
+--- a/arch/loongarch/kernel/ptrace.c
++++ b/arch/loongarch/kernel/ptrace.c
+@@ -147,6 +147,8 @@ static int fpr_get(struct task_struct *target,
+ {
+ int r;
+
++ save_fpu_regs(target);
++
+ if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))
+ r = gfpr_get(target, &to);
+ else
+diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
+index 78a00359bde3c..9d830ab4e3025 100644
+--- a/arch/loongarch/kernel/setup.c
++++ b/arch/loongarch/kernel/setup.c
+@@ -332,9 +332,25 @@ static void __init bootcmdline_init(char **cmdline_p)
+ strlcat(boot_command_line, " ", COMMAND_LINE_SIZE);
+
+ strlcat(boot_command_line, init_command_line, COMMAND_LINE_SIZE);
++ goto out;
+ }
+ #endif
+
++ /*
++ * Append built-in command line to the bootloader command line if
++ * CONFIG_CMDLINE_EXTEND is enabled.
++ */
++ if (IS_ENABLED(CONFIG_CMDLINE_EXTEND) && CONFIG_CMDLINE[0]) {
++ strlcat(boot_command_line, " ", COMMAND_LINE_SIZE);
++ strlcat(boot_command_line, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
++ }
++
++ /*
++ * Use built-in command line if the bootloader command line is empty.
++ */
++ if (IS_ENABLED(CONFIG_CMDLINE_BOOTLOADER) && !boot_command_line[0])
++ strscpy(boot_command_line, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
++
+ out:
+ *cmdline_p = boot_command_line;
+ }
+diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
+index 439395aa6fb42..081922c72daaa 100644
+--- a/arch/m68k/fpsp040/skeleton.S
++++ b/arch/m68k/fpsp040/skeleton.S
+@@ -499,13 +499,13 @@ in_ea:
+ dbf %d0,morein
+ rts
+
+- .section .fixup,#alloc,#execinstr
++ .section .fixup,"ax"
+ .even
+ 1:
+ jbsr fpsp040_die
+ jbra .Lnotkern
+
+- .section __ex_table,#alloc
++ .section __ex_table,"a"
+ .align 4
+
+ .long in_ea,1b
+diff --git a/arch/m68k/ifpsp060/os.S b/arch/m68k/ifpsp060/os.S
+index 7a0d6e4280665..89e2ec224ab6c 100644
+--- a/arch/m68k/ifpsp060/os.S
++++ b/arch/m68k/ifpsp060/os.S
+@@ -379,11 +379,11 @@ _060_real_access:
+
+
+ | Execption handling for movs access to illegal memory
+- .section .fixup,#alloc,#execinstr
++ .section .fixup,"ax"
+ .even
+ 1: moveq #-1,%d1
+ rts
+-.section __ex_table,#alloc
++.section __ex_table,"a"
+ .align 4
+ .long dmrbuae,1b
+ .long dmrwuae,1b
+diff --git a/arch/m68k/kernel/relocate_kernel.S b/arch/m68k/kernel/relocate_kernel.S
+index ab0f1e7d46535..f7667079e08e9 100644
+--- a/arch/m68k/kernel/relocate_kernel.S
++++ b/arch/m68k/kernel/relocate_kernel.S
+@@ -26,7 +26,7 @@ ENTRY(relocate_new_kernel)
+ lea %pc@(.Lcopy),%a4
+ 2: addl #0x00000000,%a4 /* virt_to_phys() */
+
+- .section ".m68k_fixup","aw"
++ .section .m68k_fixup,"aw"
+ .long M68K_FIXUP_MEMOFFSET, 2b+2
+ .previous
+
+@@ -49,7 +49,7 @@ ENTRY(relocate_new_kernel)
+ lea %pc@(.Lcont040),%a4
+ 5: addl #0x00000000,%a4 /* virt_to_phys() */
+
+- .section ".m68k_fixup","aw"
++ .section .m68k_fixup,"aw"
+ .long M68K_FIXUP_MEMOFFSET, 5b+2
+ .previous
+
+diff --git a/arch/mips/include/asm/local.h b/arch/mips/include/asm/local.h
+index 5daf6fe8e3e9a..e6ae3df0349d2 100644
+--- a/arch/mips/include/asm/local.h
++++ b/arch/mips/include/asm/local.h
+@@ -101,8 +101,8 @@ static __inline__ long local_cmpxchg(local_t *l, long old, long new)
+
+ static __inline__ bool local_try_cmpxchg(local_t *l, long *old, long new)
+ {
+- typeof(l->a.counter) *__old = (typeof(l->a.counter) *) old;
+- return try_cmpxchg_local(&l->a.counter, __old, new);
++ return try_cmpxchg_local(&l->a.counter,
++ (typeof(l->a.counter) *) old, new);
+ }
+
+ #define local_xchg(l, n) (atomic_long_xchg((&(l)->a), (n)))
+diff --git a/arch/parisc/kernel/processor.c b/arch/parisc/kernel/processor.c
+index ba07e760d3c76..25db0f8836f76 100644
+--- a/arch/parisc/kernel/processor.c
++++ b/arch/parisc/kernel/processor.c
+@@ -377,10 +377,18 @@ int
+ show_cpuinfo (struct seq_file *m, void *v)
+ {
+ unsigned long cpu;
++ char cpu_name[60], *p;
++
++ /* strip PA path from CPU name to not confuse lscpu */
++ strlcpy(cpu_name, per_cpu(cpu_data, 0).dev->name, sizeof(cpu_name));
++ p = strrchr(cpu_name, '[');
++ if (p)
++ *(--p) = 0;
+
+ for_each_online_cpu(cpu) {
+- const struct cpuinfo_parisc *cpuinfo = &per_cpu(cpu_data, cpu);
+ #ifdef CONFIG_SMP
++ const struct cpuinfo_parisc *cpuinfo = &per_cpu(cpu_data, cpu);
++
+ if (0 == cpuinfo->hpa)
+ continue;
+ #endif
+@@ -425,8 +433,7 @@ show_cpuinfo (struct seq_file *m, void *v)
+
+ seq_printf(m, "model\t\t: %s - %s\n",
+ boot_cpu_data.pdc.sys_model_name,
+- cpuinfo->dev ?
+- cpuinfo->dev->name : "Unknown");
++ cpu_name);
+
+ seq_printf(m, "hversion\t: 0x%08x\n"
+ "sversion\t: 0x%08x\n",
+diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
+index 91c049d51d0e1..2edc6269b1a35 100644
+--- a/arch/powerpc/include/asm/ftrace.h
++++ b/arch/powerpc/include/asm/ftrace.h
+@@ -12,7 +12,7 @@
+
+ /* Ignore unused weak functions which will have larger offsets */
+ #ifdef CONFIG_MPROFILE_KERNEL
+-#define FTRACE_MCOUNT_MAX_OFFSET 12
++#define FTRACE_MCOUNT_MAX_OFFSET 16
+ #elif defined(CONFIG_PPC32)
+ #define FTRACE_MCOUNT_MAX_OFFSET 8
+ #endif
+diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
+index 34d44cb17c874..ee1488d38fdc1 100644
+--- a/arch/powerpc/include/asm/lppaca.h
++++ b/arch/powerpc/include/asm/lppaca.h
+@@ -45,6 +45,7 @@
+ #include <asm/types.h>
+ #include <asm/mmu.h>
+ #include <asm/firmware.h>
++#include <asm/paca.h>
+
+ /*
+ * The lppaca is the "virtual processor area" registered with the hypervisor,
+@@ -127,13 +128,23 @@ struct lppaca {
+ */
+ #define LPPACA_OLD_SHARED_PROC 2
+
+-static inline bool lppaca_shared_proc(struct lppaca *l)
++#ifdef CONFIG_PPC_PSERIES
++/*
++ * All CPUs should have the same shared proc value, so directly access the PACA
++ * to avoid false positives from DEBUG_PREEMPT.
++ */
++static inline bool lppaca_shared_proc(void)
+ {
++ struct lppaca *l = local_paca->lppaca_ptr;
++
+ if (!firmware_has_feature(FW_FEATURE_SPLPAR))
+ return false;
+ return !!(l->__old_status & LPPACA_OLD_SHARED_PROC);
+ }
+
++#define get_lppaca() (get_paca()->lppaca_ptr)
++#endif
++
+ /*
+ * SLB shadow buffer structure as defined in the PAPR. The save_area
+ * contains adjacent ESID and VSID pairs for each shadowed SLB. The
+diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
+index da0377f465973..46ac4647b5fb9 100644
+--- a/arch/powerpc/include/asm/paca.h
++++ b/arch/powerpc/include/asm/paca.h
+@@ -15,7 +15,6 @@
+ #include <linux/cache.h>
+ #include <linux/string.h>
+ #include <asm/types.h>
+-#include <asm/lppaca.h>
+ #include <asm/mmu.h>
+ #include <asm/page.h>
+ #ifdef CONFIG_PPC_BOOK3E_64
+@@ -47,14 +46,11 @@ extern unsigned int debug_smp_processor_id(void); /* from linux/smp.h */
+ #define get_paca() local_paca
+ #endif
+
+-#ifdef CONFIG_PPC_PSERIES
+-#define get_lppaca() (get_paca()->lppaca_ptr)
+-#endif
+-
+ #define get_slb_shadow() (get_paca()->slb_shadow_ptr)
+
+ struct task_struct;
+ struct rtas_args;
++struct lppaca;
+
+ /*
+ * Defines the layout of the paca.
+diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
+index f5ba1a3c41f8e..e08513d731193 100644
+--- a/arch/powerpc/include/asm/paravirt.h
++++ b/arch/powerpc/include/asm/paravirt.h
+@@ -6,6 +6,7 @@
+ #include <asm/smp.h>
+ #ifdef CONFIG_PPC64
+ #include <asm/paca.h>
++#include <asm/lppaca.h>
+ #include <asm/hvcall.h>
+ #endif
+
+diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
+index 8239c0af5eb2b..fe3d0ea0058ac 100644
+--- a/arch/powerpc/include/asm/plpar_wrappers.h
++++ b/arch/powerpc/include/asm/plpar_wrappers.h
+@@ -9,6 +9,7 @@
+
+ #include <asm/hvcall.h>
+ #include <asm/paca.h>
++#include <asm/lppaca.h>
+ #include <asm/page.h>
+
+ static inline long poll_pending(void)
+diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
+index ea0a073abd969..3ff2da7b120b5 100644
+--- a/arch/powerpc/kernel/fadump.c
++++ b/arch/powerpc/kernel/fadump.c
+@@ -654,6 +654,7 @@ int __init fadump_reserve_mem(void)
+ return ret;
+ error_out:
+ fw_dump.fadump_enabled = 0;
++ fw_dump.reserve_dump_area_size = 0;
+ return 0;
+ }
+
+diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
+index 67f0b01e6ff57..400831307cd7b 100644
+--- a/arch/powerpc/kernel/iommu.c
++++ b/arch/powerpc/kernel/iommu.c
+@@ -172,17 +172,28 @@ static int fail_iommu_bus_notify(struct notifier_block *nb,
+ return 0;
+ }
+
+-static struct notifier_block fail_iommu_bus_notifier = {
++/*
++ * PCI and VIO buses need separate notifier_block structs, since they're linked
++ * list nodes. Sharing a notifier_block would mean that any notifiers later
++ * registered for PCI buses would also get called by VIO buses and vice versa.
++ */
++static struct notifier_block fail_iommu_pci_bus_notifier = {
+ .notifier_call = fail_iommu_bus_notify
+ };
+
++#ifdef CONFIG_IBMVIO
++static struct notifier_block fail_iommu_vio_bus_notifier = {
++ .notifier_call = fail_iommu_bus_notify
++};
++#endif
++
+ static int __init fail_iommu_setup(void)
+ {
+ #ifdef CONFIG_PCI
+- bus_register_notifier(&pci_bus_type, &fail_iommu_bus_notifier);
++ bus_register_notifier(&pci_bus_type, &fail_iommu_pci_bus_notifier);
+ #endif
+ #ifdef CONFIG_IBMVIO
+- bus_register_notifier(&vio_bus_type, &fail_iommu_bus_notifier);
++ bus_register_notifier(&vio_bus_type, &fail_iommu_vio_bus_notifier);
+ #endif
+
+ return 0;
+diff --git a/arch/powerpc/kvm/book3s_hv_ras.c b/arch/powerpc/kvm/book3s_hv_ras.c
+index ccfd969656306..82be6d87514b7 100644
+--- a/arch/powerpc/kvm/book3s_hv_ras.c
++++ b/arch/powerpc/kvm/book3s_hv_ras.c
+@@ -9,6 +9,7 @@
+ #include <linux/kvm.h>
+ #include <linux/kvm_host.h>
+ #include <linux/kernel.h>
++#include <asm/lppaca.h>
+ #include <asm/opal.h>
+ #include <asm/mce.h>
+ #include <asm/machdep.h>
+diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
+index 0bd4866d98241..9383606c5e6e0 100644
+--- a/arch/powerpc/mm/book3s64/radix_tlb.c
++++ b/arch/powerpc/mm/book3s64/radix_tlb.c
+@@ -127,21 +127,6 @@ static __always_inline void __tlbie_pid(unsigned long pid, unsigned long ric)
+ trace_tlbie(0, 0, rb, rs, ric, prs, r);
+ }
+
+-static __always_inline void __tlbie_pid_lpid(unsigned long pid,
+- unsigned long lpid,
+- unsigned long ric)
+-{
+- unsigned long rb, rs, prs, r;
+-
+- rb = PPC_BIT(53); /* IS = 1 */
+- rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~(PPC_BITMASK(0, 31)));
+- prs = 1; /* process scoped */
+- r = 1; /* radix format */
+-
+- asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
+- : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
+- trace_tlbie(0, 0, rb, rs, ric, prs, r);
+-}
+ static __always_inline void __tlbie_lpid(unsigned long lpid, unsigned long ric)
+ {
+ unsigned long rb,rs,prs,r;
+@@ -202,23 +187,6 @@ static __always_inline void __tlbie_va(unsigned long va, unsigned long pid,
+ trace_tlbie(0, 0, rb, rs, ric, prs, r);
+ }
+
+-static __always_inline void __tlbie_va_lpid(unsigned long va, unsigned long pid,
+- unsigned long lpid,
+- unsigned long ap, unsigned long ric)
+-{
+- unsigned long rb, rs, prs, r;
+-
+- rb = va & ~(PPC_BITMASK(52, 63));
+- rb |= ap << PPC_BITLSHIFT(58);
+- rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~(PPC_BITMASK(0, 31)));
+- prs = 1; /* process scoped */
+- r = 1; /* radix format */
+-
+- asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
+- : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
+- trace_tlbie(0, 0, rb, rs, ric, prs, r);
+-}
+-
+ static __always_inline void __tlbie_lpid_va(unsigned long va, unsigned long lpid,
+ unsigned long ap, unsigned long ric)
+ {
+@@ -264,22 +232,6 @@ static inline void fixup_tlbie_va_range(unsigned long va, unsigned long pid,
+ }
+ }
+
+-static inline void fixup_tlbie_va_range_lpid(unsigned long va,
+- unsigned long pid,
+- unsigned long lpid,
+- unsigned long ap)
+-{
+- if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
+- asm volatile("ptesync" : : : "memory");
+- __tlbie_pid_lpid(0, lpid, RIC_FLUSH_TLB);
+- }
+-
+- if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
+- asm volatile("ptesync" : : : "memory");
+- __tlbie_va_lpid(va, pid, lpid, ap, RIC_FLUSH_TLB);
+- }
+-}
+-
+ static inline void fixup_tlbie_pid(unsigned long pid)
+ {
+ /*
+@@ -299,26 +251,6 @@ static inline void fixup_tlbie_pid(unsigned long pid)
+ }
+ }
+
+-static inline void fixup_tlbie_pid_lpid(unsigned long pid, unsigned long lpid)
+-{
+- /*
+- * We can use any address for the invalidation, pick one which is
+- * probably unused as an optimisation.
+- */
+- unsigned long va = ((1UL << 52) - 1);
+-
+- if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
+- asm volatile("ptesync" : : : "memory");
+- __tlbie_pid_lpid(0, lpid, RIC_FLUSH_TLB);
+- }
+-
+- if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
+- asm volatile("ptesync" : : : "memory");
+- __tlbie_va_lpid(va, pid, lpid, mmu_get_ap(MMU_PAGE_64K),
+- RIC_FLUSH_TLB);
+- }
+-}
+-
+ static inline void fixup_tlbie_lpid_va(unsigned long va, unsigned long lpid,
+ unsigned long ap)
+ {
+@@ -416,31 +348,6 @@ static inline void _tlbie_pid(unsigned long pid, unsigned long ric)
+ asm volatile("eieio; tlbsync; ptesync": : :"memory");
+ }
+
+-static inline void _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
+- unsigned long ric)
+-{
+- asm volatile("ptesync" : : : "memory");
+-
+- /*
+- * Workaround the fact that the "ric" argument to __tlbie_pid
+- * must be a compile-time contraint to match the "i" constraint
+- * in the asm statement.
+- */
+- switch (ric) {
+- case RIC_FLUSH_TLB:
+- __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
+- fixup_tlbie_pid_lpid(pid, lpid);
+- break;
+- case RIC_FLUSH_PWC:
+- __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
+- break;
+- case RIC_FLUSH_ALL:
+- default:
+- __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_ALL);
+- fixup_tlbie_pid_lpid(pid, lpid);
+- }
+- asm volatile("eieio; tlbsync; ptesync" : : : "memory");
+-}
+ struct tlbiel_pid {
+ unsigned long pid;
+ unsigned long ric;
+@@ -566,20 +473,6 @@ static inline void __tlbie_va_range(unsigned long start, unsigned long end,
+ fixup_tlbie_va_range(addr - page_size, pid, ap);
+ }
+
+-static inline void __tlbie_va_range_lpid(unsigned long start, unsigned long end,
+- unsigned long pid, unsigned long lpid,
+- unsigned long page_size,
+- unsigned long psize)
+-{
+- unsigned long addr;
+- unsigned long ap = mmu_get_ap(psize);
+-
+- for (addr = start; addr < end; addr += page_size)
+- __tlbie_va_lpid(addr, pid, lpid, ap, RIC_FLUSH_TLB);
+-
+- fixup_tlbie_va_range_lpid(addr - page_size, pid, lpid, ap);
+-}
+-
+ static __always_inline void _tlbie_va(unsigned long va, unsigned long pid,
+ unsigned long psize, unsigned long ric)
+ {
+@@ -660,18 +553,6 @@ static inline void _tlbie_va_range(unsigned long start, unsigned long end,
+ asm volatile("eieio; tlbsync; ptesync": : :"memory");
+ }
+
+-static inline void _tlbie_va_range_lpid(unsigned long start, unsigned long end,
+- unsigned long pid, unsigned long lpid,
+- unsigned long page_size,
+- unsigned long psize, bool also_pwc)
+-{
+- asm volatile("ptesync" : : : "memory");
+- if (also_pwc)
+- __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
+- __tlbie_va_range_lpid(start, end, pid, lpid, page_size, psize);
+- asm volatile("eieio; tlbsync; ptesync" : : : "memory");
+-}
+-
+ static inline void _tlbiel_va_range_multicast(struct mm_struct *mm,
+ unsigned long start, unsigned long end,
+ unsigned long pid, unsigned long page_size,
+@@ -1486,6 +1367,127 @@ void radix__flush_tlb_all(void)
+ }
+
+ #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
++static __always_inline void __tlbie_pid_lpid(unsigned long pid,
++ unsigned long lpid,
++ unsigned long ric)
++{
++ unsigned long rb, rs, prs, r;
++
++ rb = PPC_BIT(53); /* IS = 1 */
++ rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~(PPC_BITMASK(0, 31)));
++ prs = 1; /* process scoped */
++ r = 1; /* radix format */
++
++ asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
++ : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
++ trace_tlbie(0, 0, rb, rs, ric, prs, r);
++}
++
++static __always_inline void __tlbie_va_lpid(unsigned long va, unsigned long pid,
++ unsigned long lpid,
++ unsigned long ap, unsigned long ric)
++{
++ unsigned long rb, rs, prs, r;
++
++ rb = va & ~(PPC_BITMASK(52, 63));
++ rb |= ap << PPC_BITLSHIFT(58);
++ rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~(PPC_BITMASK(0, 31)));
++ prs = 1; /* process scoped */
++ r = 1; /* radix format */
++
++ asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
++ : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
++ trace_tlbie(0, 0, rb, rs, ric, prs, r);
++}
++
++static inline void fixup_tlbie_pid_lpid(unsigned long pid, unsigned long lpid)
++{
++ /*
++ * We can use any address for the invalidation, pick one which is
++ * probably unused as an optimisation.
++ */
++ unsigned long va = ((1UL << 52) - 1);
++
++ if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
++ asm volatile("ptesync" : : : "memory");
++ __tlbie_pid_lpid(0, lpid, RIC_FLUSH_TLB);
++ }
++
++ if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
++ asm volatile("ptesync" : : : "memory");
++ __tlbie_va_lpid(va, pid, lpid, mmu_get_ap(MMU_PAGE_64K),
++ RIC_FLUSH_TLB);
++ }
++}
++
++static inline void _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
++ unsigned long ric)
++{
++ asm volatile("ptesync" : : : "memory");
++
++ /*
++ * Workaround the fact that the "ric" argument to __tlbie_pid
++ * must be a compile-time contraint to match the "i" constraint
++ * in the asm statement.
++ */
++ switch (ric) {
++ case RIC_FLUSH_TLB:
++ __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
++ fixup_tlbie_pid_lpid(pid, lpid);
++ break;
++ case RIC_FLUSH_PWC:
++ __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
++ break;
++ case RIC_FLUSH_ALL:
++ default:
++ __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_ALL);
++ fixup_tlbie_pid_lpid(pid, lpid);
++ }
++ asm volatile("eieio; tlbsync; ptesync" : : : "memory");
++}
++
++static inline void fixup_tlbie_va_range_lpid(unsigned long va,
++ unsigned long pid,
++ unsigned long lpid,
++ unsigned long ap)
++{
++ if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
++ asm volatile("ptesync" : : : "memory");
++ __tlbie_pid_lpid(0, lpid, RIC_FLUSH_TLB);
++ }
++
++ if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
++ asm volatile("ptesync" : : : "memory");
++ __tlbie_va_lpid(va, pid, lpid, ap, RIC_FLUSH_TLB);
++ }
++}
++
++static inline void __tlbie_va_range_lpid(unsigned long start, unsigned long end,
++ unsigned long pid, unsigned long lpid,
++ unsigned long page_size,
++ unsigned long psize)
++{
++ unsigned long addr;
++ unsigned long ap = mmu_get_ap(psize);
++
++ for (addr = start; addr < end; addr += page_size)
++ __tlbie_va_lpid(addr, pid, lpid, ap, RIC_FLUSH_TLB);
++
++ fixup_tlbie_va_range_lpid(addr - page_size, pid, lpid, ap);
++}
++
++static inline void _tlbie_va_range_lpid(unsigned long start, unsigned long end,
++ unsigned long pid, unsigned long lpid,
++ unsigned long page_size,
++ unsigned long psize, bool also_pwc)
++{
++ asm volatile("ptesync" : : : "memory");
++ if (also_pwc)
++ __tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
++ __tlbie_va_range_lpid(start, end, pid, lpid, page_size, psize);
++ asm volatile("eieio; tlbsync; ptesync" : : : "memory");
++}
++
+ /*
+ * Performs process-scoped invalidations for a given LPID
+ * as part of H_RPT_INVALIDATE hcall.
+diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
+index 6956f637a38c1..f2708c8629a52 100644
+--- a/arch/powerpc/mm/book3s64/slb.c
++++ b/arch/powerpc/mm/book3s64/slb.c
+@@ -13,6 +13,7 @@
+ #include <asm/mmu.h>
+ #include <asm/mmu_context.h>
+ #include <asm/paca.h>
++#include <asm/lppaca.h>
+ #include <asm/ppc-opcode.h>
+ #include <asm/cputable.h>
+ #include <asm/cacheflush.h>
+diff --git a/arch/powerpc/perf/core-fsl-emb.c b/arch/powerpc/perf/core-fsl-emb.c
+index ee721f420a7ba..1a53ab08447cb 100644
+--- a/arch/powerpc/perf/core-fsl-emb.c
++++ b/arch/powerpc/perf/core-fsl-emb.c
+@@ -645,7 +645,6 @@ static void perf_event_interrupt(struct pt_regs *regs)
+ struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
+ struct perf_event *event;
+ unsigned long val;
+- int found = 0;
+
+ for (i = 0; i < ppmu->n_counter; ++i) {
+ event = cpuhw->event[i];
+@@ -654,7 +653,6 @@ static void perf_event_interrupt(struct pt_regs *regs)
+ if ((int)val < 0) {
+ if (event) {
+ /* event has overflowed */
+- found = 1;
+ record_and_restart(event, val, regs);
+ } else {
+ /*
+@@ -672,11 +670,13 @@ static void perf_event_interrupt(struct pt_regs *regs)
+ isync();
+ }
+
+-void hw_perf_event_setup(int cpu)
++static int fsl_emb_pmu_prepare_cpu(unsigned int cpu)
+ {
+ struct cpu_hw_events *cpuhw = &per_cpu(cpu_hw_events, cpu);
+
+ memset(cpuhw, 0, sizeof(*cpuhw));
++
++ return 0;
+ }
+
+ int register_fsl_emb_pmu(struct fsl_emb_pmu *pmu)
+@@ -689,6 +689,8 @@ int register_fsl_emb_pmu(struct fsl_emb_pmu *pmu)
+ pmu->name);
+
+ perf_pmu_register(&fsl_emb_pmu, "cpu", PERF_TYPE_RAW);
++ cpuhp_setup_state(CPUHP_PERF_POWER, "perf/powerpc:prepare",
++ fsl_emb_pmu_prepare_cpu, NULL);
+
+ return 0;
+ }
+diff --git a/arch/powerpc/platforms/powermac/time.c b/arch/powerpc/platforms/powermac/time.c
+index 4c5790aff1b54..8633891b7aa58 100644
+--- a/arch/powerpc/platforms/powermac/time.c
++++ b/arch/powerpc/platforms/powermac/time.c
+@@ -26,8 +26,8 @@
+ #include <linux/rtc.h>
+ #include <linux/of_address.h>
+
++#include <asm/early_ioremap.h>
+ #include <asm/sections.h>
+-#include <asm/io.h>
+ #include <asm/machdep.h>
+ #include <asm/time.h>
+ #include <asm/nvram.h>
+@@ -182,7 +182,7 @@ static int __init via_calibrate_decr(void)
+ return 0;
+ }
+ of_node_put(vias);
+- via = ioremap(rsrc.start, resource_size(&rsrc));
++ via = early_ioremap(rsrc.start, resource_size(&rsrc));
+ if (via == NULL) {
+ printk(KERN_ERR "Failed to map VIA for timer calibration !\n");
+ return 0;
+@@ -207,7 +207,7 @@ static int __init via_calibrate_decr(void)
+
+ ppc_tb_freq = (dstart - dend) * 100 / 6;
+
+- iounmap(via);
++ early_iounmap((void *)via, resource_size(&rsrc));
+
+ return 1;
+ }
+diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S
+index 35254ac7af5ee..ca0674b0b683e 100644
+--- a/arch/powerpc/platforms/pseries/hvCall.S
++++ b/arch/powerpc/platforms/pseries/hvCall.S
+@@ -91,6 +91,7 @@ BEGIN_FTR_SECTION; \
+ b 1f; \
+ END_FTR_SECTION(0, 1); \
+ LOAD_REG_ADDR(r12, hcall_tracepoint_refcount) ; \
++ ld r12,0(r12); \
+ std r12,32(r1); \
+ cmpdi r12,0; \
+ bne- LABEL; \
+diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
+index 2eab323f69706..cb2f1211f7ebf 100644
+--- a/arch/powerpc/platforms/pseries/lpar.c
++++ b/arch/powerpc/platforms/pseries/lpar.c
+@@ -639,16 +639,8 @@ static const struct proc_ops vcpudispatch_stats_freq_proc_ops = {
+
+ static int __init vcpudispatch_stats_procfs_init(void)
+ {
+- /*
+- * Avoid smp_processor_id while preemptible. All CPUs should have
+- * the same value for lppaca_shared_proc.
+- */
+- preempt_disable();
+- if (!lppaca_shared_proc(get_lppaca())) {
+- preempt_enable();
++ if (!lppaca_shared_proc())
+ return 0;
+- }
+- preempt_enable();
+
+ if (!proc_create("powerpc/vcpudispatch_stats", 0600, NULL,
+ &vcpudispatch_stats_proc_ops))
+diff --git a/arch/powerpc/platforms/pseries/lparcfg.c b/arch/powerpc/platforms/pseries/lparcfg.c
+index 8acc705095209..1c151d77e74b3 100644
+--- a/arch/powerpc/platforms/pseries/lparcfg.c
++++ b/arch/powerpc/platforms/pseries/lparcfg.c
+@@ -206,7 +206,7 @@ static void parse_ppp_data(struct seq_file *m)
+ ppp_data.active_system_procs);
+
+ /* pool related entries are appropriate for shared configs */
+- if (lppaca_shared_proc(get_lppaca())) {
++ if (lppaca_shared_proc()) {
+ unsigned long pool_idle_time, pool_procs;
+
+ seq_printf(m, "pool=%d\n", ppp_data.pool_num);
+@@ -560,7 +560,7 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
+ partition_potential_processors);
+
+ seq_printf(m, "shared_processor_mode=%d\n",
+- lppaca_shared_proc(get_lppaca()));
++ lppaca_shared_proc());
+
+ #ifdef CONFIG_PPC_64S_HASH_MMU
+ if (!radix_enabled())
+diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
+index e2a57cfa6c837..0ef2a7e014aa1 100644
+--- a/arch/powerpc/platforms/pseries/setup.c
++++ b/arch/powerpc/platforms/pseries/setup.c
+@@ -847,7 +847,7 @@ static void __init pSeries_setup_arch(void)
+ if (firmware_has_feature(FW_FEATURE_LPAR)) {
+ vpa_init(boot_cpuid);
+
+- if (lppaca_shared_proc(get_lppaca())) {
++ if (lppaca_shared_proc()) {
+ static_branch_enable(&shared_processor);
+ pv_spinlocks_init();
+ #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+diff --git a/arch/powerpc/sysdev/mpc5xxx_clocks.c b/arch/powerpc/sysdev/mpc5xxx_clocks.c
+index c5bf7e1b37804..58cee28e23992 100644
+--- a/arch/powerpc/sysdev/mpc5xxx_clocks.c
++++ b/arch/powerpc/sysdev/mpc5xxx_clocks.c
+@@ -25,8 +25,10 @@ unsigned long mpc5xxx_fwnode_get_bus_frequency(struct fwnode_handle *fwnode)
+
+ fwnode_for_each_parent_node(fwnode, parent) {
+ ret = fwnode_property_read_u32(parent, "bus-frequency", &bus_freq);
+- if (!ret)
++ if (!ret) {
++ fwnode_handle_put(parent);
+ return bus_freq;
++ }
+ }
+
+ return 0;
+diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
+index 70c4c59a1a8f4..70ce51b8a9291 100644
+--- a/arch/powerpc/xmon/xmon.c
++++ b/arch/powerpc/xmon/xmon.c
+@@ -58,6 +58,7 @@
+ #ifdef CONFIG_PPC64
+ #include <asm/hvcall.h>
+ #include <asm/paca.h>
++#include <asm/lppaca.h>
+ #endif
+
+ #include "nonstdio.h"
+diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
+index a01bc15dce244..5e39dcf23fdbc 100644
+--- a/arch/riscv/mm/kasan_init.c
++++ b/arch/riscv/mm/kasan_init.c
+@@ -22,9 +22,9 @@
+ * region is not and then we have to go down to the PUD level.
+ */
+
+-pgd_t tmp_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
+-p4d_t tmp_p4d[PTRS_PER_P4D] __page_aligned_bss;
+-pud_t tmp_pud[PTRS_PER_PUD] __page_aligned_bss;
++static pgd_t tmp_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
++static p4d_t tmp_p4d[PTRS_PER_P4D] __page_aligned_bss;
++static pud_t tmp_pud[PTRS_PER_PUD] __page_aligned_bss;
+
+ static void __init kasan_populate_pte(pmd_t *pmd, unsigned long vaddr, unsigned long end)
+ {
+@@ -438,7 +438,7 @@ static void __init kasan_shallow_populate(void *start, void *end)
+ kasan_shallow_populate_pgd(vaddr, vend);
+ }
+
+-static void create_tmp_mapping(void)
++static void __init create_tmp_mapping(void)
+ {
+ void *ptr;
+ p4d_t *base_p4d;
+diff --git a/arch/s390/crypto/paes_s390.c b/arch/s390/crypto/paes_s390.c
+index 29dc827e0fe81..143ae4d4284db 100644
+--- a/arch/s390/crypto/paes_s390.c
++++ b/arch/s390/crypto/paes_s390.c
+@@ -35,7 +35,7 @@
+ * and padding is also possible, the limits need to be generous.
+ */
+ #define PAES_MIN_KEYSIZE 16
+-#define PAES_MAX_KEYSIZE 320
++#define PAES_MAX_KEYSIZE MAXEP11AESKEYBLOBSIZE
+
+ static u8 *ctrblk;
+ static DEFINE_MUTEX(ctrblk_lock);
+diff --git a/arch/s390/include/uapi/asm/pkey.h b/arch/s390/include/uapi/asm/pkey.h
+index 924b876f992c1..29c6fd369761e 100644
+--- a/arch/s390/include/uapi/asm/pkey.h
++++ b/arch/s390/include/uapi/asm/pkey.h
+@@ -26,7 +26,7 @@
+ #define MAXCLRKEYSIZE 32 /* a clear key value may be up to 32 bytes */
+ #define MAXAESCIPHERKEYSIZE 136 /* our aes cipher keys have always 136 bytes */
+ #define MINEP11AESKEYBLOBSIZE 256 /* min EP11 AES key blob size */
+-#define MAXEP11AESKEYBLOBSIZE 320 /* max EP11 AES key blob size */
++#define MAXEP11AESKEYBLOBSIZE 336 /* max EP11 AES key blob size */
+
+ /* Minimum size of a key blob */
+ #define MINKEYBLOBSIZE SECKEYBLOBSIZE
+diff --git a/arch/s390/kernel/ipl.c b/arch/s390/kernel/ipl.c
+index f44f70de96611..9d74b037b3a3b 100644
+--- a/arch/s390/kernel/ipl.c
++++ b/arch/s390/kernel/ipl.c
+@@ -638,6 +638,8 @@ static struct attribute_group ipl_ccw_attr_group_lpar = {
+
+ static struct attribute *ipl_unknown_attrs[] = {
+ &sys_ipl_type_attr.attr,
++ &sys_ipl_secure_attr.attr,
++ &sys_ipl_has_secure_attr.attr,
+ NULL,
+ };
+
+diff --git a/arch/um/configs/i386_defconfig b/arch/um/configs/i386_defconfig
+index c0162286d68b7..c33a6880a437a 100644
+--- a/arch/um/configs/i386_defconfig
++++ b/arch/um/configs/i386_defconfig
+@@ -35,6 +35,7 @@ CONFIG_TTY_CHAN=y
+ CONFIG_XTERM_CHAN=y
+ CONFIG_CON_CHAN="pts"
+ CONFIG_SSL_CHAN="pts"
++CONFIG_SOUND=m
+ CONFIG_UML_SOUND=m
+ CONFIG_DEVTMPFS=y
+ CONFIG_DEVTMPFS_MOUNT=y
+diff --git a/arch/um/configs/x86_64_defconfig b/arch/um/configs/x86_64_defconfig
+index bec6e5d956873..df29f282b6ac2 100644
+--- a/arch/um/configs/x86_64_defconfig
++++ b/arch/um/configs/x86_64_defconfig
+@@ -33,6 +33,7 @@ CONFIG_TTY_CHAN=y
+ CONFIG_XTERM_CHAN=y
+ CONFIG_CON_CHAN="pts"
+ CONFIG_SSL_CHAN="pts"
++CONFIG_SOUND=m
+ CONFIG_UML_SOUND=m
+ CONFIG_DEVTMPFS=y
+ CONFIG_DEVTMPFS_MOUNT=y
+diff --git a/arch/um/drivers/Kconfig b/arch/um/drivers/Kconfig
+index 36911b1fddcf0..b94b2618e7d84 100644
+--- a/arch/um/drivers/Kconfig
++++ b/arch/um/drivers/Kconfig
+@@ -111,24 +111,14 @@ config SSL_CHAN
+
+ config UML_SOUND
+ tristate "Sound support"
++ depends on SOUND
++ select SOUND_OSS_CORE
+ help
+ This option enables UML sound support. If enabled, it will pull in
+- soundcore and the UML hostaudio relay, which acts as a intermediary
++ the UML hostaudio relay, which acts as a intermediary
+ between the host's dsp and mixer devices and the UML sound system.
+ It is safe to say 'Y' here.
+
+-config SOUND
+- tristate
+- default UML_SOUND
+-
+-config SOUND_OSS_CORE
+- bool
+- default UML_SOUND
+-
+-config HOSTAUDIO
+- tristate
+- default UML_SOUND
+-
+ endmenu
+
+ menu "UML Network Devices"
+diff --git a/arch/um/drivers/Makefile b/arch/um/drivers/Makefile
+index a461a950f0518..0e6af81096fd5 100644
+--- a/arch/um/drivers/Makefile
++++ b/arch/um/drivers/Makefile
+@@ -54,7 +54,7 @@ obj-$(CONFIG_UML_NET) += net.o
+ obj-$(CONFIG_MCONSOLE) += mconsole.o
+ obj-$(CONFIG_MMAPPER) += mmapper_kern.o
+ obj-$(CONFIG_BLK_DEV_UBD) += ubd.o
+-obj-$(CONFIG_HOSTAUDIO) += hostaudio.o
++obj-$(CONFIG_UML_SOUND) += hostaudio.o
+ obj-$(CONFIG_NULL_CHAN) += null.o
+ obj-$(CONFIG_PORT_CHAN) += port.o
+ obj-$(CONFIG_PTY_CHAN) += pty.o
+diff --git a/arch/um/drivers/virt-pci.c b/arch/um/drivers/virt-pci.c
+index 7699ca5f35d48..ffe2ee8a02465 100644
+--- a/arch/um/drivers/virt-pci.c
++++ b/arch/um/drivers/virt-pci.c
+@@ -544,6 +544,7 @@ static void um_pci_irq_vq_cb(struct virtqueue *vq)
+ }
+ }
+
++#ifdef CONFIG_OF
+ /* Copied from arch/x86/kernel/devicetree.c */
+ struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus)
+ {
+@@ -562,6 +563,7 @@ struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus)
+ }
+ return NULL;
+ }
++#endif
+
+ static int um_pci_init_vqs(struct um_pci_device *dev)
+ {
+diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
+index 03c4328a88cbd..f732426d3b483 100644
+--- a/arch/x86/boot/compressed/head_64.S
++++ b/arch/x86/boot/compressed/head_64.S
+@@ -459,11 +459,25 @@ SYM_CODE_START(startup_64)
+ /* Save the trampoline address in RCX */
+ movq %rax, %rcx
+
++ /* Set up 32-bit addressable stack */
++ leaq TRAMPOLINE_32BIT_STACK_END(%rcx), %rsp
++
++ /*
++ * Preserve live 64-bit registers on the stack: this is necessary
++ * because the architecture does not guarantee that GPRs will retain
++ * their full 64-bit values across a 32-bit mode switch.
++ */
++ pushq %rbp
++ pushq %rbx
++ pushq %rsi
++
+ /*
+- * Load the address of trampoline_return() into RDI.
+- * It will be used by the trampoline to return to the main code.
++ * Push the 64-bit address of trampoline_return() onto the new stack.
++ * It will be used by the trampoline to return to the main code. Due to
++ * the 32-bit mode switch, it cannot be kept it in a register either.
+ */
+ leaq trampoline_return(%rip), %rdi
++ pushq %rdi
+
+ /* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */
+ pushq $__KERNEL32_CS
+@@ -471,6 +485,11 @@ SYM_CODE_START(startup_64)
+ pushq %rax
+ lretq
+ trampoline_return:
++ /* Restore live 64-bit registers */
++ popq %rsi
++ popq %rbx
++ popq %rbp
++
+ /* Restore the stack, the 32-bit trampoline uses its own stack */
+ leaq rva(boot_stack_end)(%rbx), %rsp
+
+@@ -582,7 +601,7 @@ SYM_FUNC_END(.Lrelocated)
+ /*
+ * This is the 32-bit trampoline that will be copied over to low memory.
+ *
+- * RDI contains the return address (might be above 4G).
++ * Return address is at the top of the stack (might be above 4G).
+ * ECX contains the base address of the trampoline memory.
+ * Non zero RDX means trampoline needs to enable 5-level paging.
+ */
+@@ -592,9 +611,6 @@ SYM_CODE_START(trampoline_32bit_src)
+ movl %eax, %ds
+ movl %eax, %ss
+
+- /* Set up new stack */
+- leal TRAMPOLINE_32BIT_STACK_END(%ecx), %esp
+-
+ /* Disable paging */
+ movl %cr0, %eax
+ btrl $X86_CR0_PG_BIT, %eax
+@@ -671,7 +687,7 @@ SYM_CODE_END(trampoline_32bit_src)
+ .code64
+ SYM_FUNC_START_LOCAL_NOALIGN(.Lpaging_enabled)
+ /* Return from the trampoline */
+- jmp *%rdi
++ retq
+ SYM_FUNC_END(.Lpaging_enabled)
+
+ /*
+diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
+index d49e90dc04a4c..847740c08c97d 100644
+--- a/arch/x86/events/intel/uncore_snbep.c
++++ b/arch/x86/events/intel/uncore_snbep.c
+@@ -6474,8 +6474,18 @@ void spr_uncore_cpu_init(void)
+
+ type = uncore_find_type_by_id(uncore_msr_uncores, UNCORE_SPR_CHA);
+ if (type) {
++ /*
++ * The value from the discovery table (stored in the type->num_boxes
++ * of UNCORE_SPR_CHA) is incorrect on some SPR variants because of a
++ * firmware bug. Using the value from SPR_MSR_UNC_CBO_CONFIG to replace it.
++ */
+ rdmsrl(SPR_MSR_UNC_CBO_CONFIG, num_cbo);
+- type->num_boxes = num_cbo;
++ /*
++ * The MSR doesn't work on the EMR XCC, but the firmware bug doesn't impact
++ * the EMR XCC. Don't let the value from the MSR replace the existing value.
++ */
++ if (num_cbo)
++ type->num_boxes = num_cbo;
+ }
+ spr_uncore_iio_free_running.num_boxes = uncore_type_max_boxes(uncore_msr_uncores, UNCORE_SPR_IIO);
+ }
+diff --git a/arch/x86/hyperv/hv_vtl.c b/arch/x86/hyperv/hv_vtl.c
+index 85d38b9f35861..db5d2ea39fc0d 100644
+--- a/arch/x86/hyperv/hv_vtl.c
++++ b/arch/x86/hyperv/hv_vtl.c
+@@ -25,6 +25,10 @@ void __init hv_vtl_init_platform(void)
+ x86_init.irqs.pre_vector_init = x86_init_noop;
+ x86_init.timers.timer_init = x86_init_noop;
+
++ /* Avoid searching for BIOS MP tables */
++ x86_init.mpparse.find_smp_config = x86_init_noop;
++ x86_init.mpparse.get_smp_config = x86_init_uint_noop;
++
+ x86_platform.get_wallclock = get_rtc_noop;
+ x86_platform.set_wallclock = set_rtc_noop;
+ x86_platform.get_nmi_reason = hv_get_nmi_reason;
+diff --git a/arch/x86/include/asm/local.h b/arch/x86/include/asm/local.h
+index 56d4ef604b919..635132a127782 100644
+--- a/arch/x86/include/asm/local.h
++++ b/arch/x86/include/asm/local.h
+@@ -127,8 +127,8 @@ static inline long local_cmpxchg(local_t *l, long old, long new)
+
+ static inline bool local_try_cmpxchg(local_t *l, long *old, long new)
+ {
+- typeof(l->a.counter) *__old = (typeof(l->a.counter) *) old;
+- return try_cmpxchg_local(&l->a.counter, __old, new);
++ return try_cmpxchg_local(&l->a.counter,
++ (typeof(l->a.counter) *) old, new);
+ }
+
+ /* Always has a lock prefix */
+diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
+index 7f97a8a97e24a..473b16d73b471 100644
+--- a/arch/x86/include/asm/mem_encrypt.h
++++ b/arch/x86/include/asm/mem_encrypt.h
+@@ -50,8 +50,8 @@ void __init sme_enable(struct boot_params *bp);
+
+ int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size);
+ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size);
+-void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
+- bool enc);
++void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr,
++ unsigned long size, bool enc);
+
+ void __init mem_encrypt_free_decrypted_mem(void);
+
+@@ -85,7 +85,7 @@ early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0;
+ static inline int __init
+ early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
+ static inline void __init
+-early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages, bool enc) {}
++early_set_mem_enc_dec_hypercall(unsigned long vaddr, unsigned long size, bool enc) {}
+
+ static inline void mem_encrypt_free_decrypted_mem(void) { }
+
+diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
+index 447d4bee25c48..97533e6b1c61b 100644
+--- a/arch/x86/include/asm/pgtable_types.h
++++ b/arch/x86/include/asm/pgtable_types.h
+@@ -125,11 +125,12 @@
+ * instance, and is *not* included in this mask since
+ * pte_modify() does modify it.
+ */
+-#define _PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \
+- _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY | \
+- _PAGE_SOFT_DIRTY | _PAGE_DEVMAP | _PAGE_ENC | \
+- _PAGE_UFFD_WP)
+-#define _HPAGE_CHG_MASK (_PAGE_CHG_MASK | _PAGE_PSE)
++#define _COMMON_PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \
++ _PAGE_SPECIAL | _PAGE_ACCESSED | _PAGE_DIRTY |\
++ _PAGE_SOFT_DIRTY | _PAGE_DEVMAP | _PAGE_ENC | \
++ _PAGE_UFFD_WP)
++#define _PAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PAT)
++#define _HPAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PSE | _PAGE_PAT_LARGE)
+
+ /*
+ * The cache modes defined here are used to translate between pure SW usage
+diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c
+index c6c15ce1952fb..5934ee5bc087e 100644
+--- a/arch/x86/kernel/apm_32.c
++++ b/arch/x86/kernel/apm_32.c
+@@ -238,12 +238,6 @@
+ extern int (*console_blank_hook)(int);
+ #endif
+
+-/*
+- * The apm_bios device is one of the misc char devices.
+- * This is its minor number.
+- */
+-#define APM_MINOR_DEV 134
+-
+ /*
+ * Various options can be changed at boot time as follows:
+ * (We allow underscores for compatibility with the modules code)
+diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
+index 19c74e68c0a21..aa26c2bb70259 100644
+--- a/arch/x86/kernel/cpu/common.c
++++ b/arch/x86/kernel/cpu/common.c
+@@ -1282,11 +1282,11 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
+ VULNBL_INTEL_STEPPINGS(BROADWELL_G, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO),
+ VULNBL_INTEL_STEPPINGS(BROADWELL, X86_STEPPING_ANY, SRBDS),
+- VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE_X, X86_STEPPING_ANY, MMIO | RETBLEED | GDS),
+- VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED | GDS),
+- VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, SRBDS | MMIO | RETBLEED | GDS),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
++ VULNBL_INTEL_STEPPINGS(SKYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE_L, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
++ VULNBL_INTEL_STEPPINGS(KABYLAKE, X86_STEPPING_ANY, MMIO | RETBLEED | GDS | SRBDS),
+ VULNBL_INTEL_STEPPINGS(CANNONLAKE_L, X86_STEPPING_ANY, RETBLEED),
+ VULNBL_INTEL_STEPPINGS(ICELAKE_L, X86_STEPPING_ANY, MMIO | MMIO_SBDS | RETBLEED | GDS),
+ VULNBL_INTEL_STEPPINGS(ICELAKE_D, X86_STEPPING_ANY, MMIO | GDS),
+diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
+index 2eec60f50057a..00a7fdac7f89d 100644
+--- a/arch/x86/kernel/cpu/mce/core.c
++++ b/arch/x86/kernel/cpu/mce/core.c
+@@ -842,6 +842,26 @@ static noinstr bool quirk_skylake_repmov(void)
+ return false;
+ }
+
++/*
++ * Some Zen-based Instruction Fetch Units set EIPV=RIPV=0 on poison consumption
++ * errors. This means mce_gather_info() will not save the "ip" and "cs" registers.
++ *
++ * However, the context is still valid, so save the "cs" register for later use.
++ *
++ * The "ip" register is truly unknown, so don't save it or fixup EIPV/RIPV.
++ *
++ * The Instruction Fetch Unit is at MCA bank 1 for all affected systems.
++ */
++static __always_inline void quirk_zen_ifu(int bank, struct mce *m, struct pt_regs *regs)
++{
++ if (bank != 1)
++ return;
++ if (!(m->status & MCI_STATUS_POISON))
++ return;
++
++ m->cs = regs->cs;
++}
++
+ /*
+ * Do a quick check if any of the events requires a panic.
+ * This decides if we keep the events around or clear them.
+@@ -861,6 +881,9 @@ static __always_inline int mce_no_way_out(struct mce *m, char **msg, unsigned lo
+ if (mce_flags.snb_ifu_quirk)
+ quirk_sandybridge_ifu(i, m, regs);
+
++ if (mce_flags.zen_ifu_quirk)
++ quirk_zen_ifu(i, m, regs);
++
+ m->bank = i;
+ if (mce_severity(m, regs, &tmp, true) >= MCE_PANIC_SEVERITY) {
+ mce_read_aux(m, i);
+@@ -1842,6 +1865,9 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
+ if (c->x86 == 0x15 && c->x86_model <= 0xf)
+ mce_flags.overflow_recov = 1;
+
++ if (c->x86 >= 0x17 && c->x86 <= 0x1A)
++ mce_flags.zen_ifu_quirk = 1;
++
+ }
+
+ if (c->x86_vendor == X86_VENDOR_INTEL) {
+diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
+index d2412ce2d312f..d5946fcdcd5de 100644
+--- a/arch/x86/kernel/cpu/mce/internal.h
++++ b/arch/x86/kernel/cpu/mce/internal.h
+@@ -157,6 +157,9 @@ struct mce_vendor_flags {
+ */
+ smca : 1,
+
++ /* Zen IFU quirk */
++ zen_ifu_quirk : 1,
++
+ /* AMD-style error thresholding banks present. */
+ amd_threshold : 1,
+
+@@ -172,7 +175,7 @@ struct mce_vendor_flags {
+ /* Skylake, Cascade Lake, Cooper Lake REP;MOVS* quirk */
+ skx_repmov_quirk : 1,
+
+- __reserved_0 : 56;
++ __reserved_0 : 55;
+ };
+
+ extern struct mce_vendor_flags mce_flags;
+diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
+index c3e37eaec8ecd..7aaa3652e31d1 100644
+--- a/arch/x86/kernel/cpu/sgx/virt.c
++++ b/arch/x86/kernel/cpu/sgx/virt.c
+@@ -204,6 +204,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
+ continue;
+
+ xa_erase(&vepc->page_array, index);
++ cond_resched();
+ }
+
+ /*
+@@ -222,6 +223,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
+ list_add_tail(&epc_page->list, &secs_pages);
+
+ xa_erase(&vepc->page_array, index);
++ cond_resched();
+ }
+
+ /*
+@@ -243,6 +245,7 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
+
+ if (sgx_vepc_free_page(epc_page))
+ list_add_tail(&epc_page->list, &secs_pages);
++ cond_resched();
+ }
+
+ if (!list_empty(&secs_pages))
+diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
+index 1cceac5984daa..526d4da3dcd46 100644
+--- a/arch/x86/kernel/kvm.c
++++ b/arch/x86/kernel/kvm.c
+@@ -966,10 +966,8 @@ static void __init kvm_init_platform(void)
+ * Ensure that _bss_decrypted section is marked as decrypted in the
+ * shared pages list.
+ */
+- nr_pages = DIV_ROUND_UP(__end_bss_decrypted - __start_bss_decrypted,
+- PAGE_SIZE);
+ early_set_mem_enc_dec_hypercall((unsigned long)__start_bss_decrypted,
+- nr_pages, 0);
++ __end_bss_decrypted - __start_bss_decrypted, 0);
+
+ /*
+ * If not booted using EFI, enable Live migration support.
+diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
+index 83d41c2601d7b..f15fb71f280e2 100644
+--- a/arch/x86/kernel/vmlinux.lds.S
++++ b/arch/x86/kernel/vmlinux.lds.S
+@@ -156,7 +156,7 @@ SECTIONS
+ ALIGN_ENTRY_TEXT_END
+ *(.gnu.warning)
+
+- } :text =0xcccc
++ } :text = 0xcccccccc
+
+ /* End of text section, which should occupy whole number of pages */
+ _etext = .;
+diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
+index 6eaa3d6994aeb..11c050f40d828 100644
+--- a/arch/x86/kvm/mmu/mmu.c
++++ b/arch/x86/kvm/mmu/mmu.c
+@@ -58,6 +58,8 @@
+
+ extern bool itlb_multihit_kvm_mitigation;
+
++static bool nx_hugepage_mitigation_hard_disabled;
++
+ int __read_mostly nx_huge_pages = -1;
+ static uint __read_mostly nx_huge_pages_recovery_period_ms;
+ #ifdef CONFIG_PREEMPT_RT
+@@ -67,12 +69,13 @@ static uint __read_mostly nx_huge_pages_recovery_ratio = 0;
+ static uint __read_mostly nx_huge_pages_recovery_ratio = 60;
+ #endif
+
++static int get_nx_huge_pages(char *buffer, const struct kernel_param *kp);
+ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp);
+ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel_param *kp);
+
+ static const struct kernel_param_ops nx_huge_pages_ops = {
+ .set = set_nx_huge_pages,
+- .get = param_get_bool,
++ .get = get_nx_huge_pages,
+ };
+
+ static const struct kernel_param_ops nx_huge_pages_recovery_param_ops = {
+@@ -6844,6 +6847,14 @@ static void mmu_destroy_caches(void)
+ kmem_cache_destroy(mmu_page_header_cache);
+ }
+
++static int get_nx_huge_pages(char *buffer, const struct kernel_param *kp)
++{
++ if (nx_hugepage_mitigation_hard_disabled)
++ return sprintf(buffer, "never\n");
++
++ return param_get_bool(buffer, kp);
++}
++
+ static bool get_nx_auto_mode(void)
+ {
+ /* Return true when CPU has the bug, and mitigations are ON */
+@@ -6860,15 +6871,29 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
+ bool old_val = nx_huge_pages;
+ bool new_val;
+
++ if (nx_hugepage_mitigation_hard_disabled)
++ return -EPERM;
++
+ /* In "auto" mode deploy workaround only if CPU has the bug. */
+- if (sysfs_streq(val, "off"))
++ if (sysfs_streq(val, "off")) {
+ new_val = 0;
+- else if (sysfs_streq(val, "force"))
++ } else if (sysfs_streq(val, "force")) {
+ new_val = 1;
+- else if (sysfs_streq(val, "auto"))
++ } else if (sysfs_streq(val, "auto")) {
+ new_val = get_nx_auto_mode();
+- else if (kstrtobool(val, &new_val) < 0)
++ } else if (sysfs_streq(val, "never")) {
++ new_val = 0;
++
++ mutex_lock(&kvm_lock);
++ if (!list_empty(&vm_list)) {
++ mutex_unlock(&kvm_lock);
++ return -EBUSY;
++ }
++ nx_hugepage_mitigation_hard_disabled = true;
++ mutex_unlock(&kvm_lock);
++ } else if (kstrtobool(val, &new_val) < 0) {
+ return -EINVAL;
++ }
+
+ __set_nx_huge_pages(new_val);
+
+@@ -7006,6 +7031,9 @@ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel
+ uint old_period, new_period;
+ int err;
+
++ if (nx_hugepage_mitigation_hard_disabled)
++ return -EPERM;
++
+ was_recovery_enabled = calc_nx_huge_pages_recovery_period(&old_period);
+
+ err = param_set_uint(val, kp);
+@@ -7164,6 +7192,9 @@ int kvm_mmu_post_init_vm(struct kvm *kvm)
+ {
+ int err;
+
++ if (nx_hugepage_mitigation_hard_disabled)
++ return 0;
++
+ err = kvm_vm_create_worker_thread(kvm, kvm_nx_huge_page_recovery_worker, 0,
+ "kvm-nx-lpage-recovery",
+ &kvm->arch.nx_huge_page_recovery_thread);
+diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
+index 4f95c449a406e..bff4201c9db7c 100644
+--- a/arch/x86/mm/mem_encrypt_amd.c
++++ b/arch/x86/mm/mem_encrypt_amd.c
+@@ -288,11 +288,10 @@ static bool amd_enc_cache_flush_required(void)
+ return !cpu_feature_enabled(X86_FEATURE_SME_COHERENT);
+ }
+
+-static void enc_dec_hypercall(unsigned long vaddr, int npages, bool enc)
++static void enc_dec_hypercall(unsigned long vaddr, unsigned long size, bool enc)
+ {
+ #ifdef CONFIG_PARAVIRT
+- unsigned long sz = npages << PAGE_SHIFT;
+- unsigned long vaddr_end = vaddr + sz;
++ unsigned long vaddr_end = vaddr + size;
+
+ while (vaddr < vaddr_end) {
+ int psize, pmask, level;
+@@ -342,7 +341,7 @@ static bool amd_enc_status_change_finish(unsigned long vaddr, int npages, bool e
+ snp_set_memory_private(vaddr, npages);
+
+ if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
+- enc_dec_hypercall(vaddr, npages, enc);
++ enc_dec_hypercall(vaddr, npages << PAGE_SHIFT, enc);
+
+ return true;
+ }
+@@ -466,7 +465,7 @@ static int __init early_set_memory_enc_dec(unsigned long vaddr,
+
+ ret = 0;
+
+- early_set_mem_enc_dec_hypercall(start, PAGE_ALIGN(size) >> PAGE_SHIFT, enc);
++ early_set_mem_enc_dec_hypercall(start, size, enc);
+ out:
+ __flush_tlb_all();
+ return ret;
+@@ -482,9 +481,9 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size)
+ return early_set_memory_enc_dec(vaddr, size, true);
+ }
+
+-void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages, bool enc)
++void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, unsigned long size, bool enc)
+ {
+- enc_dec_hypercall(vaddr, npages, enc);
++ enc_dec_hypercall(vaddr, size, enc);
+ }
+
+ void __init sme_early_init(void)
+diff --git a/arch/xtensa/include/asm/core.h b/arch/xtensa/include/asm/core.h
+index f856d2bcb9f36..7cef85ad9741a 100644
+--- a/arch/xtensa/include/asm/core.h
++++ b/arch/xtensa/include/asm/core.h
+@@ -44,4 +44,13 @@
+ #define XTENSA_STACK_ALIGNMENT 16
+ #endif
+
++#ifndef XCHAL_HW_MIN_VERSION
++#if defined(XCHAL_HW_MIN_VERSION_MAJOR) && defined(XCHAL_HW_MIN_VERSION_MINOR)
++#define XCHAL_HW_MIN_VERSION (XCHAL_HW_MIN_VERSION_MAJOR * 100 + \
++ XCHAL_HW_MIN_VERSION_MINOR)
++#else
++#define XCHAL_HW_MIN_VERSION 0
++#endif
++#endif
++
+ #endif
+diff --git a/arch/xtensa/kernel/perf_event.c b/arch/xtensa/kernel/perf_event.c
+index a0d05c8598d0f..183618090d05b 100644
+--- a/arch/xtensa/kernel/perf_event.c
++++ b/arch/xtensa/kernel/perf_event.c
+@@ -13,17 +13,26 @@
+ #include <linux/perf_event.h>
+ #include <linux/platform_device.h>
+
++#include <asm/core.h>
+ #include <asm/processor.h>
+ #include <asm/stacktrace.h>
+
++#define XTENSA_HWVERSION_RG_2015_0 260000
++
++#if XCHAL_HW_MIN_VERSION >= XTENSA_HWVERSION_RG_2015_0
++#define XTENSA_PMU_ERI_BASE 0x00101000
++#else
++#define XTENSA_PMU_ERI_BASE 0x00001000
++#endif
++
+ /* Global control/status for all perf counters */
+-#define XTENSA_PMU_PMG 0x1000
++#define XTENSA_PMU_PMG XTENSA_PMU_ERI_BASE
+ /* Perf counter values */
+-#define XTENSA_PMU_PM(i) (0x1080 + (i) * 4)
++#define XTENSA_PMU_PM(i) (XTENSA_PMU_ERI_BASE + 0x80 + (i) * 4)
+ /* Perf counter control registers */
+-#define XTENSA_PMU_PMCTRL(i) (0x1100 + (i) * 4)
++#define XTENSA_PMU_PMCTRL(i) (XTENSA_PMU_ERI_BASE + 0x100 + (i) * 4)
+ /* Perf counter status registers */
+-#define XTENSA_PMU_PMSTAT(i) (0x1180 + (i) * 4)
++#define XTENSA_PMU_PMSTAT(i) (XTENSA_PMU_ERI_BASE + 0x180 + (i) * 4)
+
+ #define XTENSA_PMU_PMG_PMEN 0x1
+
+diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
+index 9faafcd10e177..4a42ea2972ad8 100644
+--- a/block/blk-cgroup.c
++++ b/block/blk-cgroup.c
+@@ -1511,7 +1511,7 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
+ retry:
+ spin_lock_irq(&q->queue_lock);
+
+- /* blkg_list is pushed at the head, reverse walk to allocate parents first */
++ /* blkg_list is pushed at the head, reverse walk to initialize parents first */
+ list_for_each_entry_reverse(blkg, &q->blkg_list, q_node) {
+ struct blkg_policy_data *pd;
+
+@@ -1549,21 +1549,20 @@ retry:
+ goto enomem;
+ }
+
+- blkg->pd[pol->plid] = pd;
++ spin_lock(&blkg->blkcg->lock);
++
+ pd->blkg = blkg;
+ pd->plid = pol->plid;
+- pd->online = false;
+- }
++ blkg->pd[pol->plid] = pd;
+
+- /* all allocated, init in the same order */
+- if (pol->pd_init_fn)
+- list_for_each_entry_reverse(blkg, &q->blkg_list, q_node)
+- pol->pd_init_fn(blkg->pd[pol->plid]);
++ if (pol->pd_init_fn)
++ pol->pd_init_fn(pd);
+
+- list_for_each_entry_reverse(blkg, &q->blkg_list, q_node) {
+ if (pol->pd_online_fn)
+- pol->pd_online_fn(blkg->pd[pol->plid]);
+- blkg->pd[pol->plid]->online = true;
++ pol->pd_online_fn(pd);
++ pd->online = true;
++
++ spin_unlock(&blkg->blkcg->lock);
+ }
+
+ __set_bit(pol->plid, q->blkcg_pols);
+@@ -1580,14 +1579,19 @@ out:
+ return ret;
+
+ enomem:
+- /* alloc failed, nothing's initialized yet, free everything */
++ /* alloc failed, take down everything */
+ spin_lock_irq(&q->queue_lock);
+ list_for_each_entry(blkg, &q->blkg_list, q_node) {
+ struct blkcg *blkcg = blkg->blkcg;
++ struct blkg_policy_data *pd;
+
+ spin_lock(&blkcg->lock);
+- if (blkg->pd[pol->plid]) {
+- pol->pd_free_fn(blkg->pd[pol->plid]);
++ pd = blkg->pd[pol->plid];
++ if (pd) {
++ if (pd->online && pol->pd_offline_fn)
++ pol->pd_offline_fn(pd);
++ pd->online = false;
++ pol->pd_free_fn(pd);
+ blkg->pd[pol->plid] = NULL;
+ }
+ spin_unlock(&blkcg->lock);
+diff --git a/block/blk-settings.c b/block/blk-settings.c
+index 4dd59059b788e..0046b447268f9 100644
+--- a/block/blk-settings.c
++++ b/block/blk-settings.c
+@@ -830,10 +830,13 @@ EXPORT_SYMBOL(blk_set_queue_depth);
+ */
+ void blk_queue_write_cache(struct request_queue *q, bool wc, bool fua)
+ {
+- if (wc)
++ if (wc) {
++ blk_queue_flag_set(QUEUE_FLAG_HW_WC, q);
+ blk_queue_flag_set(QUEUE_FLAG_WC, q);
+- else
++ } else {
++ blk_queue_flag_clear(QUEUE_FLAG_HW_WC, q);
+ blk_queue_flag_clear(QUEUE_FLAG_WC, q);
++ }
+ if (fua)
+ blk_queue_flag_set(QUEUE_FLAG_FUA, q);
+ else
+diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
+index a642085838531..b7fc4cf3f992c 100644
+--- a/block/blk-sysfs.c
++++ b/block/blk-sysfs.c
+@@ -517,21 +517,16 @@ static ssize_t queue_wc_show(struct request_queue *q, char *page)
+ static ssize_t queue_wc_store(struct request_queue *q, const char *page,
+ size_t count)
+ {
+- int set = -1;
+-
+- if (!strncmp(page, "write back", 10))
+- set = 1;
+- else if (!strncmp(page, "write through", 13) ||
+- !strncmp(page, "none", 4))
+- set = 0;
+-
+- if (set == -1)
+- return -EINVAL;
+-
+- if (set)
++ if (!strncmp(page, "write back", 10)) {
++ if (!test_bit(QUEUE_FLAG_HW_WC, &q->queue_flags))
++ return -EINVAL;
+ blk_queue_flag_set(QUEUE_FLAG_WC, q);
+- else
++ } else if (!strncmp(page, "write through", 13) ||
++ !strncmp(page, "none", 4)) {
+ blk_queue_flag_clear(QUEUE_FLAG_WC, q);
++ } else {
++ return -EINVAL;
++ }
+
+ return count;
+ }
+diff --git a/block/ioctl.c b/block/ioctl.c
+index 9c5f637ff153f..3c475e4166e9f 100644
+--- a/block/ioctl.c
++++ b/block/ioctl.c
+@@ -20,6 +20,8 @@ static int blkpg_do_ioctl(struct block_device *bdev,
+ struct blkpg_partition p;
+ long long start, length;
+
++ if (disk->flags & GENHD_FL_NO_PART)
++ return -EINVAL;
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
+ if (copy_from_user(&p, upart, sizeof(struct blkpg_partition)))
+diff --git a/block/mq-deadline.c b/block/mq-deadline.c
+index 5839a027e0f05..7e043d4a78f84 100644
+--- a/block/mq-deadline.c
++++ b/block/mq-deadline.c
+@@ -620,8 +620,9 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
+ struct request_queue *q = hctx->queue;
+ struct deadline_data *dd = q->elevator->elevator_data;
+ struct blk_mq_tags *tags = hctx->sched_tags;
++ unsigned int shift = tags->bitmap_tags.sb.shift;
+
+- dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
++ dd->async_depth = max(1U, 3 * (1U << shift) / 4);
+
+ sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
+ }
+diff --git a/crypto/af_alg.c b/crypto/af_alg.c
+index 5f7252a5b7b44..703fb426ff5d1 100644
+--- a/crypto/af_alg.c
++++ b/crypto/af_alg.c
+@@ -320,18 +320,21 @@ static int alg_setkey_by_key_serial(struct alg_sock *ask, sockptr_t optval,
+
+ if (IS_ERR(ret)) {
+ up_read(&key->sem);
++ key_put(key);
+ return PTR_ERR(ret);
+ }
+
+ key_data = sock_kmalloc(&ask->sk, key_datalen, GFP_KERNEL);
+ if (!key_data) {
+ up_read(&key->sem);
++ key_put(key);
+ return -ENOMEM;
+ }
+
+ memcpy(key_data, ret, key_datalen);
+
+ up_read(&key->sem);
++ key_put(key);
+
+ err = type->setkey(ask->private, key_data, key_datalen);
+
+diff --git a/crypto/algapi.c b/crypto/algapi.c
+index 5e7cd603d489c..4fe95c4480473 100644
+--- a/crypto/algapi.c
++++ b/crypto/algapi.c
+@@ -17,6 +17,7 @@
+ #include <linux/rtnetlink.h>
+ #include <linux/slab.h>
+ #include <linux/string.h>
++#include <linux/workqueue.h>
+
+ #include "internal.h"
+
+@@ -74,15 +75,26 @@ static void crypto_free_instance(struct crypto_instance *inst)
+ inst->alg.cra_type->free(inst);
+ }
+
+-static void crypto_destroy_instance(struct crypto_alg *alg)
++static void crypto_destroy_instance_workfn(struct work_struct *w)
+ {
+- struct crypto_instance *inst = (void *)alg;
++ struct crypto_instance *inst = container_of(w, struct crypto_instance,
++ free_work);
+ struct crypto_template *tmpl = inst->tmpl;
+
+ crypto_free_instance(inst);
+ crypto_tmpl_put(tmpl);
+ }
+
++static void crypto_destroy_instance(struct crypto_alg *alg)
++{
++ struct crypto_instance *inst = container_of(alg,
++ struct crypto_instance,
++ alg);
++
++ INIT_WORK(&inst->free_work, crypto_destroy_instance_workfn);
++ schedule_work(&inst->free_work);
++}
++
+ /*
+ * This function adds a spawn to the list secondary_spawns which
+ * will be used at the end of crypto_remove_spawns to unregister
+diff --git a/crypto/asymmetric_keys/x509_public_key.c b/crypto/asymmetric_keys/x509_public_key.c
+index 0b4943a4592b7..1815024bead38 100644
+--- a/crypto/asymmetric_keys/x509_public_key.c
++++ b/crypto/asymmetric_keys/x509_public_key.c
+@@ -117,6 +117,11 @@ int x509_check_for_self_signed(struct x509_certificate *cert)
+ goto out;
+ }
+
++ if (cert->unsupported_sig) {
++ ret = 0;
++ goto out;
++ }
++
+ ret = public_key_verify_signature(cert->pub, cert->sig);
+ if (ret < 0) {
+ if (ret == -ENOPKG) {
+diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c
+index e499c60c45791..ec84da6cc1bff 100644
+--- a/drivers/acpi/x86/s2idle.c
++++ b/drivers/acpi/x86/s2idle.c
+@@ -122,17 +122,16 @@ static void lpi_device_get_constraints_amd(void)
+ acpi_handle_debug(lps0_device_handle,
+ "LPI: constraints list begin:\n");
+
+- for (j = 0; j < package->package.count; ++j) {
++ for (j = 0; j < package->package.count; j++) {
+ union acpi_object *info_obj = &package->package.elements[j];
+ struct lpi_device_constraint_amd dev_info = {};
+ struct lpi_constraints *list;
+ acpi_status status;
+
+- for (k = 0; k < info_obj->package.count; ++k) {
+- union acpi_object *obj = &info_obj->package.elements[k];
++ list = &lpi_constraints_table[lpi_constraints_table_size];
+
+- list = &lpi_constraints_table[lpi_constraints_table_size];
+- list->min_dstate = -1;
++ for (k = 0; k < info_obj->package.count; k++) {
++ union acpi_object *obj = &info_obj->package.elements[k];
+
+ switch (k) {
+ case 0:
+@@ -148,27 +147,21 @@ static void lpi_device_get_constraints_amd(void)
+ dev_info.min_dstate = obj->integer.value;
+ break;
+ }
++ }
+
+- if (!dev_info.enabled || !dev_info.name ||
+- !dev_info.min_dstate)
+- continue;
++ if (!dev_info.enabled || !dev_info.name ||
++ !dev_info.min_dstate)
++ continue;
+
+- status = acpi_get_handle(NULL, dev_info.name,
+- &list->handle);
+- if (ACPI_FAILURE(status))
+- continue;
++ status = acpi_get_handle(NULL, dev_info.name, &list->handle);
++ if (ACPI_FAILURE(status))
++ continue;
+
+- acpi_handle_debug(lps0_device_handle,
+- "Name:%s\n", dev_info.name);
++ acpi_handle_debug(lps0_device_handle,
++ "Name:%s\n", dev_info.name);
+
+- list->min_dstate = dev_info.min_dstate;
++ list->min_dstate = dev_info.min_dstate;
+
+- if (list->min_dstate < 0) {
+- acpi_handle_debug(lps0_device_handle,
+- "Incomplete constraint defined\n");
+- continue;
+- }
+- }
+ lpi_constraints_table_size++;
+ }
+ }
+@@ -213,7 +206,7 @@ static void lpi_device_get_constraints(void)
+ if (!package)
+ continue;
+
+- for (j = 0; j < package->package.count; ++j) {
++ for (j = 0; j < package->package.count; j++) {
+ union acpi_object *element =
+ &(package->package.elements[j]);
+
+@@ -245,7 +238,7 @@ static void lpi_device_get_constraints(void)
+
+ constraint->min_dstate = -1;
+
+- for (j = 0; j < package_count; ++j) {
++ for (j = 0; j < package_count; j++) {
+ union acpi_object *info_obj = &info.package[j];
+ union acpi_object *cnstr_pkg;
+ union acpi_object *obj;
+diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
+index ce88af9eb562f..09e72967b8abf 100644
+--- a/drivers/amba/bus.c
++++ b/drivers/amba/bus.c
+@@ -528,6 +528,7 @@ static void amba_device_release(struct device *dev)
+ {
+ struct amba_device *d = to_amba_device(dev);
+
++ of_node_put(d->dev.of_node);
+ if (d->res.parent)
+ release_resource(&d->res);
+ mutex_destroy(&d->periphid_lock);
+diff --git a/drivers/ata/pata_arasan_cf.c b/drivers/ata/pata_arasan_cf.c
+index 6ab294322e792..314eaa1679540 100644
+--- a/drivers/ata/pata_arasan_cf.c
++++ b/drivers/ata/pata_arasan_cf.c
+@@ -529,7 +529,8 @@ static void data_xfer(struct work_struct *work)
+ /* dma_request_channel may sleep, so calling from process context */
+ acdev->dma_chan = dma_request_chan(acdev->host->dev, "data");
+ if (IS_ERR(acdev->dma_chan)) {
+- dev_err(acdev->host->dev, "Unable to get dma_chan\n");
++ dev_err_probe(acdev->host->dev, PTR_ERR(acdev->dma_chan),
++ "Unable to get dma_chan\n");
+ acdev->dma_chan = NULL;
+ goto chan_request_fail;
+ }
+diff --git a/drivers/base/core.c b/drivers/base/core.c
+index 3dff5037943e0..6ceaf50f5a671 100644
+--- a/drivers/base/core.c
++++ b/drivers/base/core.c
+@@ -3817,6 +3817,17 @@ void device_del(struct device *dev)
+ device_platform_notify_remove(dev);
+ device_links_purge(dev);
+
++ /*
++ * If a device does not have a driver attached, we need to clean
++ * up any managed resources. We do this in device_release(), but
++ * it's never called (and we leak the device) if a managed
++ * resource holds a reference to the device. So release all
++ * managed resources here, like we do in driver_detach(). We
++ * still need to do so again in device_release() in case someone
++ * adds a new resource after this point, though.
++ */
++ devres_release_all(dev);
++
+ bus_notify(dev, BUS_NOTIFY_REMOVED_DEVICE);
+ kobject_uevent(&dev->kobj, KOBJ_REMOVE);
+ glue_dir = get_glue_dir(dev);
+diff --git a/drivers/base/dd.c b/drivers/base/dd.c
+index 9c09ca5c4ab68..7145d9b940b14 100644
+--- a/drivers/base/dd.c
++++ b/drivers/base/dd.c
+@@ -693,6 +693,8 @@ re_probe:
+
+ device_remove(dev);
+ driver_sysfs_remove(dev);
++ if (dev->bus && dev->bus->dma_cleanup)
++ dev->bus->dma_cleanup(dev);
+ device_unbind_cleanup(dev);
+
+ goto re_probe;
+diff --git a/drivers/base/regmap/regcache-maple.c b/drivers/base/regmap/regcache-maple.c
+index c2e3a0f6c2183..08316d578be23 100644
+--- a/drivers/base/regmap/regcache-maple.c
++++ b/drivers/base/regmap/regcache-maple.c
+@@ -74,7 +74,7 @@ static int regcache_maple_write(struct regmap *map, unsigned int reg,
+ rcu_read_unlock();
+
+ entry = kmalloc((last - index + 1) * sizeof(unsigned long),
+- GFP_KERNEL);
++ map->alloc_flags);
+ if (!entry)
+ return -ENOMEM;
+
+@@ -92,7 +92,7 @@ static int regcache_maple_write(struct regmap *map, unsigned int reg,
+ mas_lock(&mas);
+
+ mas_set_range(&mas, index, last);
+- ret = mas_store_gfp(&mas, entry, GFP_KERNEL);
++ ret = mas_store_gfp(&mas, entry, map->alloc_flags);
+
+ mas_unlock(&mas);
+
+@@ -134,7 +134,7 @@ static int regcache_maple_drop(struct regmap *map, unsigned int min,
+
+ lower = kmemdup(entry, ((min - mas.index) *
+ sizeof(unsigned long)),
+- GFP_KERNEL);
++ map->alloc_flags);
+ if (!lower) {
+ ret = -ENOMEM;
+ goto out_unlocked;
+@@ -148,7 +148,7 @@ static int regcache_maple_drop(struct regmap *map, unsigned int min,
+ upper = kmemdup(&entry[max + 1],
+ ((mas.last - max) *
+ sizeof(unsigned long)),
+- GFP_KERNEL);
++ map->alloc_flags);
+ if (!upper) {
+ ret = -ENOMEM;
+ goto out_unlocked;
+@@ -162,7 +162,7 @@ static int regcache_maple_drop(struct regmap *map, unsigned int min,
+ /* Insert new nodes with the saved data */
+ if (lower) {
+ mas_set_range(&mas, lower_index, lower_last);
+- ret = mas_store_gfp(&mas, lower, GFP_KERNEL);
++ ret = mas_store_gfp(&mas, lower, map->alloc_flags);
+ if (ret != 0)
+ goto out;
+ lower = NULL;
+@@ -170,7 +170,7 @@ static int regcache_maple_drop(struct regmap *map, unsigned int min,
+
+ if (upper) {
+ mas_set_range(&mas, upper_index, upper_last);
+- ret = mas_store_gfp(&mas, upper, GFP_KERNEL);
++ ret = mas_store_gfp(&mas, upper, map->alloc_flags);
+ if (ret != 0)
+ goto out;
+ upper = NULL;
+@@ -242,11 +242,41 @@ static int regcache_maple_exit(struct regmap *map)
+ return 0;
+ }
+
++static int regcache_maple_insert_block(struct regmap *map, int first,
++ int last)
++{
++ struct maple_tree *mt = map->cache;
++ MA_STATE(mas, mt, first, last);
++ unsigned long *entry;
++ int i, ret;
++
++ entry = kcalloc(last - first + 1, sizeof(unsigned long), map->alloc_flags);
++ if (!entry)
++ return -ENOMEM;
++
++ for (i = 0; i < last - first + 1; i++)
++ entry[i] = map->reg_defaults[first + i].def;
++
++ mas_lock(&mas);
++
++ mas_set_range(&mas, map->reg_defaults[first].reg,
++ map->reg_defaults[last].reg);
++ ret = mas_store_gfp(&mas, entry, map->alloc_flags);
++
++ mas_unlock(&mas);
++
++ if (ret)
++ kfree(entry);
++
++ return ret;
++}
++
+ static int regcache_maple_init(struct regmap *map)
+ {
+ struct maple_tree *mt;
+ int i;
+ int ret;
++ int range_start;
+
+ mt = kmalloc(sizeof(*mt), GFP_KERNEL);
+ if (!mt)
+@@ -255,14 +285,30 @@ static int regcache_maple_init(struct regmap *map)
+
+ mt_init(mt);
+
+- for (i = 0; i < map->num_reg_defaults; i++) {
+- ret = regcache_maple_write(map,
+- map->reg_defaults[i].reg,
+- map->reg_defaults[i].def);
+- if (ret)
+- goto err;
++ if (!map->num_reg_defaults)
++ return 0;
++
++ range_start = 0;
++
++ /* Scan for ranges of contiguous registers */
++ for (i = 1; i < map->num_reg_defaults; i++) {
++ if (map->reg_defaults[i].reg !=
++ map->reg_defaults[i - 1].reg + 1) {
++ ret = regcache_maple_insert_block(map, range_start,
++ i - 1);
++ if (ret != 0)
++ goto err;
++
++ range_start = i;
++ }
+ }
+
++ /* Add the last block */
++ ret = regcache_maple_insert_block(map, range_start,
++ map->num_reg_defaults - 1);
++ if (ret != 0)
++ goto err;
++
+ return 0;
+
+ err:
+diff --git a/drivers/base/regmap/regcache-rbtree.c b/drivers/base/regmap/regcache-rbtree.c
+index fabf87058d80b..ae6b8788d5f3f 100644
+--- a/drivers/base/regmap/regcache-rbtree.c
++++ b/drivers/base/regmap/regcache-rbtree.c
+@@ -277,7 +277,7 @@ static int regcache_rbtree_insert_to_block(struct regmap *map,
+
+ blk = krealloc(rbnode->block,
+ blklen * map->cache_word_size,
+- GFP_KERNEL);
++ map->alloc_flags);
+ if (!blk)
+ return -ENOMEM;
+
+@@ -286,7 +286,7 @@ static int regcache_rbtree_insert_to_block(struct regmap *map,
+ if (BITS_TO_LONGS(blklen) > BITS_TO_LONGS(rbnode->blklen)) {
+ present = krealloc(rbnode->cache_present,
+ BITS_TO_LONGS(blklen) * sizeof(*present),
+- GFP_KERNEL);
++ map->alloc_flags);
+ if (!present)
+ return -ENOMEM;
+
+@@ -320,7 +320,7 @@ regcache_rbtree_node_alloc(struct regmap *map, unsigned int reg)
+ const struct regmap_range *range;
+ int i;
+
+- rbnode = kzalloc(sizeof(*rbnode), GFP_KERNEL);
++ rbnode = kzalloc(sizeof(*rbnode), map->alloc_flags);
+ if (!rbnode)
+ return NULL;
+
+@@ -346,13 +346,13 @@ regcache_rbtree_node_alloc(struct regmap *map, unsigned int reg)
+ }
+
+ rbnode->block = kmalloc_array(rbnode->blklen, map->cache_word_size,
+- GFP_KERNEL);
++ map->alloc_flags);
+ if (!rbnode->block)
+ goto err_free;
+
+ rbnode->cache_present = kcalloc(BITS_TO_LONGS(rbnode->blklen),
+ sizeof(*rbnode->cache_present),
+- GFP_KERNEL);
++ map->alloc_flags);
+ if (!rbnode->cache_present)
+ goto err_free_block;
+
+diff --git a/drivers/base/test/test_async_driver_probe.c b/drivers/base/test/test_async_driver_probe.c
+index 929410d0dd6fe..3465800baa6c8 100644
+--- a/drivers/base/test/test_async_driver_probe.c
++++ b/drivers/base/test/test_async_driver_probe.c
+@@ -84,7 +84,7 @@ test_platform_device_register_node(char *name, int id, int nid)
+
+ pdev = platform_device_alloc(name, id);
+ if (!pdev)
+- return NULL;
++ return ERR_PTR(-ENOMEM);
+
+ if (nid != NUMA_NO_NODE)
+ set_dev_node(&pdev->dev, nid);
+diff --git a/drivers/bluetooth/btintel.c b/drivers/bluetooth/btintel.c
+index d9349ba48281e..7ba60151a16a6 100644
+--- a/drivers/bluetooth/btintel.c
++++ b/drivers/bluetooth/btintel.c
+@@ -2658,6 +2658,9 @@ static int btintel_setup_combined(struct hci_dev *hdev)
+ set_bit(HCI_QUIRK_WIDEBAND_SPEECH_SUPPORTED,
+ &hdev->quirks);
+
++ /* These variants don't seem to support LE Coded PHY */
++ set_bit(HCI_QUIRK_BROKEN_LE_CODED, &hdev->quirks);
++
+ /* Setup MSFT Extension support */
+ btintel_set_msft_opcode(hdev, ver.hw_variant);
+
+@@ -2729,6 +2732,9 @@ static int btintel_setup_combined(struct hci_dev *hdev)
+ */
+ set_bit(HCI_QUIRK_WIDEBAND_SPEECH_SUPPORTED, &hdev->quirks);
+
++ /* These variants don't seem to support LE Coded PHY */
++ set_bit(HCI_QUIRK_BROKEN_LE_CODED, &hdev->quirks);
++
+ /* Set Valid LE States quirk */
+ set_bit(HCI_QUIRK_VALID_LE_STATES, &hdev->quirks);
+
+diff --git a/drivers/bluetooth/btrtl.c b/drivers/bluetooth/btrtl.c
+index 2915c82d719d8..03a20d617deee 100644
+--- a/drivers/bluetooth/btrtl.c
++++ b/drivers/bluetooth/btrtl.c
+@@ -101,21 +101,21 @@ static const struct id_table ic_id_table[] = {
+ { IC_INFO(RTL_ROM_LMP_8723A, 0xb, 0x6, HCI_USB),
+ .config_needed = false,
+ .has_rom_version = false,
+- .fw_name = "rtl_bt/rtl8723a_fw.bin",
++ .fw_name = "rtl_bt/rtl8723a_fw",
+ .cfg_name = NULL },
+
+ /* 8723BS */
+ { IC_INFO(RTL_ROM_LMP_8723B, 0xb, 0x6, HCI_UART),
+ .config_needed = true,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8723bs_fw.bin",
++ .fw_name = "rtl_bt/rtl8723bs_fw",
+ .cfg_name = "rtl_bt/rtl8723bs_config" },
+
+ /* 8723B */
+ { IC_INFO(RTL_ROM_LMP_8723B, 0xb, 0x6, HCI_USB),
+ .config_needed = false,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8723b_fw.bin",
++ .fw_name = "rtl_bt/rtl8723b_fw",
+ .cfg_name = "rtl_bt/rtl8723b_config" },
+
+ /* 8723CS-CG */
+@@ -126,7 +126,7 @@ static const struct id_table ic_id_table[] = {
+ .hci_bus = HCI_UART,
+ .config_needed = true,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8723cs_cg_fw.bin",
++ .fw_name = "rtl_bt/rtl8723cs_cg_fw",
+ .cfg_name = "rtl_bt/rtl8723cs_cg_config" },
+
+ /* 8723CS-VF */
+@@ -137,7 +137,7 @@ static const struct id_table ic_id_table[] = {
+ .hci_bus = HCI_UART,
+ .config_needed = true,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8723cs_vf_fw.bin",
++ .fw_name = "rtl_bt/rtl8723cs_vf_fw",
+ .cfg_name = "rtl_bt/rtl8723cs_vf_config" },
+
+ /* 8723CS-XX */
+@@ -148,28 +148,28 @@ static const struct id_table ic_id_table[] = {
+ .hci_bus = HCI_UART,
+ .config_needed = true,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8723cs_xx_fw.bin",
++ .fw_name = "rtl_bt/rtl8723cs_xx_fw",
+ .cfg_name = "rtl_bt/rtl8723cs_xx_config" },
+
+ /* 8723D */
+ { IC_INFO(RTL_ROM_LMP_8723B, 0xd, 0x8, HCI_USB),
+ .config_needed = true,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8723d_fw.bin",
++ .fw_name = "rtl_bt/rtl8723d_fw",
+ .cfg_name = "rtl_bt/rtl8723d_config" },
+
+ /* 8723DS */
+ { IC_INFO(RTL_ROM_LMP_8723B, 0xd, 0x8, HCI_UART),
+ .config_needed = true,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8723ds_fw.bin",
++ .fw_name = "rtl_bt/rtl8723ds_fw",
+ .cfg_name = "rtl_bt/rtl8723ds_config" },
+
+ /* 8821A */
+ { IC_INFO(RTL_ROM_LMP_8821A, 0xa, 0x6, HCI_USB),
+ .config_needed = false,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8821a_fw.bin",
++ .fw_name = "rtl_bt/rtl8821a_fw",
+ .cfg_name = "rtl_bt/rtl8821a_config" },
+
+ /* 8821C */
+@@ -177,7 +177,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = false,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8821c_fw.bin",
++ .fw_name = "rtl_bt/rtl8821c_fw",
+ .cfg_name = "rtl_bt/rtl8821c_config" },
+
+ /* 8821CS */
+@@ -185,14 +185,14 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = true,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8821cs_fw.bin",
++ .fw_name = "rtl_bt/rtl8821cs_fw",
+ .cfg_name = "rtl_bt/rtl8821cs_config" },
+
+ /* 8761A */
+ { IC_INFO(RTL_ROM_LMP_8761A, 0xa, 0x6, HCI_USB),
+ .config_needed = false,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8761a_fw.bin",
++ .fw_name = "rtl_bt/rtl8761a_fw",
+ .cfg_name = "rtl_bt/rtl8761a_config" },
+
+ /* 8761B */
+@@ -200,14 +200,14 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = false,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8761b_fw.bin",
++ .fw_name = "rtl_bt/rtl8761b_fw",
+ .cfg_name = "rtl_bt/rtl8761b_config" },
+
+ /* 8761BU */
+ { IC_INFO(RTL_ROM_LMP_8761A, 0xb, 0xa, HCI_USB),
+ .config_needed = false,
+ .has_rom_version = true,
+- .fw_name = "rtl_bt/rtl8761bu_fw.bin",
++ .fw_name = "rtl_bt/rtl8761bu_fw",
+ .cfg_name = "rtl_bt/rtl8761bu_config" },
+
+ /* 8822C with UART interface */
+@@ -215,7 +215,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = true,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8822cs_fw.bin",
++ .fw_name = "rtl_bt/rtl8822cs_fw",
+ .cfg_name = "rtl_bt/rtl8822cs_config" },
+
+ /* 8822C with UART interface */
+@@ -223,7 +223,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = true,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8822cs_fw.bin",
++ .fw_name = "rtl_bt/rtl8822cs_fw",
+ .cfg_name = "rtl_bt/rtl8822cs_config" },
+
+ /* 8822C with USB interface */
+@@ -231,7 +231,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = false,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8822cu_fw.bin",
++ .fw_name = "rtl_bt/rtl8822cu_fw",
+ .cfg_name = "rtl_bt/rtl8822cu_config" },
+
+ /* 8822B */
+@@ -239,7 +239,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = true,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8822b_fw.bin",
++ .fw_name = "rtl_bt/rtl8822b_fw",
+ .cfg_name = "rtl_bt/rtl8822b_config" },
+
+ /* 8852A */
+@@ -247,7 +247,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = false,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8852au_fw.bin",
++ .fw_name = "rtl_bt/rtl8852au_fw",
+ .cfg_name = "rtl_bt/rtl8852au_config" },
+
+ /* 8852B with UART interface */
+@@ -255,7 +255,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = true,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8852bs_fw.bin",
++ .fw_name = "rtl_bt/rtl8852bs_fw",
+ .cfg_name = "rtl_bt/rtl8852bs_config" },
+
+ /* 8852B */
+@@ -263,7 +263,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = false,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8852bu_fw.bin",
++ .fw_name = "rtl_bt/rtl8852bu_fw",
+ .cfg_name = "rtl_bt/rtl8852bu_config" },
+
+ /* 8852C */
+@@ -271,7 +271,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = false,
+ .has_rom_version = true,
+ .has_msft_ext = true,
+- .fw_name = "rtl_bt/rtl8852cu_fw.bin",
++ .fw_name = "rtl_bt/rtl8852cu_fw",
+ .cfg_name = "rtl_bt/rtl8852cu_config" },
+
+ /* 8851B */
+@@ -279,7 +279,7 @@ static const struct id_table ic_id_table[] = {
+ .config_needed = false,
+ .has_rom_version = true,
+ .has_msft_ext = false,
+- .fw_name = "rtl_bt/rtl8851bu_fw.bin",
++ .fw_name = "rtl_bt/rtl8851bu_fw",
+ .cfg_name = "rtl_bt/rtl8851bu_config" },
+ };
+
+@@ -967,6 +967,7 @@ struct btrtl_device_info *btrtl_initialize(struct hci_dev *hdev,
+ struct btrtl_device_info *btrtl_dev;
+ struct sk_buff *skb;
+ struct hci_rp_read_local_version *resp;
++ char fw_name[40];
+ char cfg_name[40];
+ u16 hci_rev, lmp_subver;
+ u8 hci_ver, lmp_ver, chip_type = 0;
+@@ -1079,8 +1080,26 @@ next:
+ goto err_free;
+ }
+
+- btrtl_dev->fw_len = rtl_load_file(hdev, btrtl_dev->ic_info->fw_name,
+- &btrtl_dev->fw_data);
++ if (!btrtl_dev->ic_info->fw_name) {
++ ret = -ENOMEM;
++ goto err_free;
++ }
++
++ btrtl_dev->fw_len = -EIO;
++ if (lmp_subver == RTL_ROM_LMP_8852A && hci_rev == 0x000c) {
++ snprintf(fw_name, sizeof(fw_name), "%s_v2.bin",
++ btrtl_dev->ic_info->fw_name);
++ btrtl_dev->fw_len = rtl_load_file(hdev, fw_name,
++ &btrtl_dev->fw_data);
++ }
++
++ if (btrtl_dev->fw_len < 0) {
++ snprintf(fw_name, sizeof(fw_name), "%s.bin",
++ btrtl_dev->ic_info->fw_name);
++ btrtl_dev->fw_len = rtl_load_file(hdev, fw_name,
++ &btrtl_dev->fw_data);
++ }
++
+ if (btrtl_dev->fw_len < 0) {
+ rtl_dev_err(hdev, "firmware file %s not found",
+ btrtl_dev->ic_info->fw_name);
+@@ -1180,6 +1199,10 @@ void btrtl_set_quirks(struct hci_dev *hdev, struct btrtl_device_info *btrtl_dev)
+ if (btrtl_dev->project_id == CHIP_ID_8852C)
+ btrealtek_set_flag(hdev, REALTEK_ALT6_CONTINUOUS_TX_CHIP);
+
++ if (btrtl_dev->project_id == CHIP_ID_8852A ||
++ btrtl_dev->project_id == CHIP_ID_8852C)
++ set_bit(HCI_QUIRK_USE_MSFT_EXT_ADDRESS_FILTER, &hdev->quirks);
++
+ hci_set_aosp_capable(hdev);
+ break;
+ default:
+@@ -1382,6 +1405,7 @@ MODULE_FIRMWARE("rtl_bt/rtl8852bs_config.bin");
+ MODULE_FIRMWARE("rtl_bt/rtl8852bu_fw.bin");
+ MODULE_FIRMWARE("rtl_bt/rtl8852bu_config.bin");
+ MODULE_FIRMWARE("rtl_bt/rtl8852cu_fw.bin");
++MODULE_FIRMWARE("rtl_bt/rtl8852cu_fw_v2.bin");
+ MODULE_FIRMWARE("rtl_bt/rtl8852cu_config.bin");
+ MODULE_FIRMWARE("rtl_bt/rtl8851bu_fw.bin");
+ MODULE_FIRMWARE("rtl_bt/rtl8851bu_config.bin");
+diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
+index 025e803ba55c2..5559d6d8ae8b2 100644
+--- a/drivers/bluetooth/btusb.c
++++ b/drivers/bluetooth/btusb.c
+@@ -2077,7 +2077,7 @@ static int btusb_switch_alt_setting(struct hci_dev *hdev, int new_alts)
+ * alternate setting.
+ */
+ spin_lock_irqsave(&data->rxlock, flags);
+- kfree_skb(data->sco_skb);
++ dev_kfree_skb_irq(data->sco_skb);
+ data->sco_skb = NULL;
+ spin_unlock_irqrestore(&data->rxlock, flags);
+
+diff --git a/drivers/bluetooth/hci_nokia.c b/drivers/bluetooth/hci_nokia.c
+index 05f7f6de6863d..97da0b2bfd17e 100644
+--- a/drivers/bluetooth/hci_nokia.c
++++ b/drivers/bluetooth/hci_nokia.c
+@@ -734,7 +734,11 @@ static int nokia_bluetooth_serdev_probe(struct serdev_device *serdev)
+ return err;
+ }
+
+- clk_prepare_enable(sysclk);
++ err = clk_prepare_enable(sysclk);
++ if (err) {
++ dev_err(dev, "could not enable sysclk: %d", err);
++ return err;
++ }
+ btdev->sysclk_speed = clk_get_rate(sysclk);
+ clk_disable_unprepare(sysclk);
+
+diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
+index 4cb23b9e06ea4..c95fa4335fee2 100644
+--- a/drivers/bus/ti-sysc.c
++++ b/drivers/bus/ti-sysc.c
+@@ -3106,7 +3106,7 @@ static int sysc_init_static_data(struct sysc *ddata)
+
+ match = soc_device_match(sysc_soc_match);
+ if (match && match->data)
+- sysc_soc->soc = (int)match->data;
++ sysc_soc->soc = (enum sysc_soc)(uintptr_t)match->data;
+
+ /*
+ * Check and warn about possible old incomplete dtb. We now want to see
+diff --git a/drivers/char/hw_random/iproc-rng200.c b/drivers/char/hw_random/iproc-rng200.c
+index 06bc060534d81..c0df053cbe4b2 100644
+--- a/drivers/char/hw_random/iproc-rng200.c
++++ b/drivers/char/hw_random/iproc-rng200.c
+@@ -182,6 +182,8 @@ static int iproc_rng200_probe(struct platform_device *pdev)
+ return PTR_ERR(priv->base);
+ }
+
++ dev_set_drvdata(dev, priv);
++
+ priv->rng.name = "iproc-rng200";
+ priv->rng.read = iproc_rng200_read;
+ priv->rng.init = iproc_rng200_init;
+@@ -199,6 +201,28 @@ static int iproc_rng200_probe(struct platform_device *pdev)
+ return 0;
+ }
+
++static int __maybe_unused iproc_rng200_suspend(struct device *dev)
++{
++ struct iproc_rng200_dev *priv = dev_get_drvdata(dev);
++
++ iproc_rng200_cleanup(&priv->rng);
++
++ return 0;
++}
++
++static int __maybe_unused iproc_rng200_resume(struct device *dev)
++{
++ struct iproc_rng200_dev *priv = dev_get_drvdata(dev);
++
++ iproc_rng200_init(&priv->rng);
++
++ return 0;
++}
++
++static const struct dev_pm_ops iproc_rng200_pm_ops = {
++ SET_SYSTEM_SLEEP_PM_OPS(iproc_rng200_suspend, iproc_rng200_resume)
++};
++
+ static const struct of_device_id iproc_rng200_of_match[] = {
+ { .compatible = "brcm,bcm2711-rng200", },
+ { .compatible = "brcm,bcm7211-rng200", },
+@@ -212,6 +236,7 @@ static struct platform_driver iproc_rng200_driver = {
+ .driver = {
+ .name = "iproc-rng200",
+ .of_match_table = iproc_rng200_of_match,
++ .pm = &iproc_rng200_pm_ops,
+ },
+ .probe = iproc_rng200_probe,
+ };
+diff --git a/drivers/char/hw_random/nomadik-rng.c b/drivers/char/hw_random/nomadik-rng.c
+index e8f9621e79541..3774adf903a83 100644
+--- a/drivers/char/hw_random/nomadik-rng.c
++++ b/drivers/char/hw_random/nomadik-rng.c
+@@ -13,8 +13,6 @@
+ #include <linux/clk.h>
+ #include <linux/err.h>
+
+-static struct clk *rng_clk;
+-
+ static int nmk_rng_read(struct hwrng *rng, void *data, size_t max, bool wait)
+ {
+ void __iomem *base = (void __iomem *)rng->priv;
+@@ -36,21 +34,20 @@ static struct hwrng nmk_rng = {
+
+ static int nmk_rng_probe(struct amba_device *dev, const struct amba_id *id)
+ {
++ struct clk *rng_clk;
+ void __iomem *base;
+ int ret;
+
+- rng_clk = devm_clk_get(&dev->dev, NULL);
++ rng_clk = devm_clk_get_enabled(&dev->dev, NULL);
+ if (IS_ERR(rng_clk)) {
+ dev_err(&dev->dev, "could not get rng clock\n");
+ ret = PTR_ERR(rng_clk);
+ return ret;
+ }
+
+- clk_prepare_enable(rng_clk);
+-
+ ret = amba_request_regions(dev, dev->dev.init_name);
+ if (ret)
+- goto out_clk;
++ return ret;
+ ret = -ENOMEM;
+ base = devm_ioremap(&dev->dev, dev->res.start,
+ resource_size(&dev->res));
+@@ -64,15 +61,12 @@ static int nmk_rng_probe(struct amba_device *dev, const struct amba_id *id)
+
+ out_release:
+ amba_release_regions(dev);
+-out_clk:
+- clk_disable_unprepare(rng_clk);
+ return ret;
+ }
+
+ static void nmk_rng_remove(struct amba_device *dev)
+ {
+ amba_release_regions(dev);
+- clk_disable_unprepare(rng_clk);
+ }
+
+ static const struct amba_id nmk_rng_ids[] = {
+diff --git a/drivers/char/hw_random/pic32-rng.c b/drivers/char/hw_random/pic32-rng.c
+index 99c8bd0859a14..e04a054e89307 100644
+--- a/drivers/char/hw_random/pic32-rng.c
++++ b/drivers/char/hw_random/pic32-rng.c
+@@ -36,7 +36,6 @@
+ struct pic32_rng {
+ void __iomem *base;
+ struct hwrng rng;
+- struct clk *clk;
+ };
+
+ /*
+@@ -70,6 +69,7 @@ static int pic32_rng_read(struct hwrng *rng, void *buf, size_t max,
+ static int pic32_rng_probe(struct platform_device *pdev)
+ {
+ struct pic32_rng *priv;
++ struct clk *clk;
+ u32 v;
+ int ret;
+
+@@ -81,13 +81,9 @@ static int pic32_rng_probe(struct platform_device *pdev)
+ if (IS_ERR(priv->base))
+ return PTR_ERR(priv->base);
+
+- priv->clk = devm_clk_get(&pdev->dev, NULL);
+- if (IS_ERR(priv->clk))
+- return PTR_ERR(priv->clk);
+-
+- ret = clk_prepare_enable(priv->clk);
+- if (ret)
+- return ret;
++ clk = devm_clk_get_enabled(&pdev->dev, NULL);
++ if (IS_ERR(clk))
++ return PTR_ERR(clk);
+
+ /* enable TRNG in enhanced mode */
+ v = TRNGEN | TRNGMOD;
+@@ -98,15 +94,11 @@ static int pic32_rng_probe(struct platform_device *pdev)
+
+ ret = devm_hwrng_register(&pdev->dev, &priv->rng);
+ if (ret)
+- goto err_register;
++ return ret;
+
+ platform_set_drvdata(pdev, priv);
+
+ return 0;
+-
+-err_register:
+- clk_disable_unprepare(priv->clk);
+- return ret;
+ }
+
+ static int pic32_rng_remove(struct platform_device *pdev)
+@@ -114,7 +106,6 @@ static int pic32_rng_remove(struct platform_device *pdev)
+ struct pic32_rng *rng = platform_get_drvdata(pdev);
+
+ writel(0, rng->base + RNGCON);
+- clk_disable_unprepare(rng->clk);
+ return 0;
+ }
+
+diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
+index abddd7e43a9a6..5cd031f3fc970 100644
+--- a/drivers/char/ipmi/ipmi_si_intf.c
++++ b/drivers/char/ipmi/ipmi_si_intf.c
+@@ -2082,6 +2082,11 @@ static int try_smi_init(struct smi_info *new_smi)
+ new_smi->io.io_cleanup = NULL;
+ }
+
++ if (rv && new_smi->si_sm) {
++ kfree(new_smi->si_sm);
++ new_smi->si_sm = NULL;
++ }
++
+ return rv;
+ }
+
+diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
+index 3b921c78ba083..faf1f2ad584bf 100644
+--- a/drivers/char/ipmi/ipmi_ssif.c
++++ b/drivers/char/ipmi/ipmi_ssif.c
+@@ -1400,7 +1400,7 @@ static struct ssif_addr_info *ssif_info_find(unsigned short addr,
+ restart:
+ list_for_each_entry(info, &ssif_infos, link) {
+ if (info->binfo.addr == addr) {
+- if (info->addr_src == SI_SMBIOS)
++ if (info->addr_src == SI_SMBIOS && !info->adapter_name)
+ info->adapter_name = kstrdup(adapter_name,
+ GFP_KERNEL);
+
+@@ -1600,6 +1600,11 @@ static int ssif_add_infos(struct i2c_client *client)
+ info->addr_src = SI_ACPI;
+ info->client = client;
+ info->adapter_name = kstrdup(client->adapter->name, GFP_KERNEL);
++ if (!info->adapter_name) {
++ kfree(info);
++ return -ENOMEM;
++ }
++
+ info->binfo.addr = client->addr;
+ list_add_tail(&info->link, &ssif_infos);
+ return 0;
+diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
+index 9eb1a18590123..a5dbebb1acfcf 100644
+--- a/drivers/char/tpm/tpm_crb.c
++++ b/drivers/char/tpm/tpm_crb.c
+@@ -463,28 +463,6 @@ static bool crb_req_canceled(struct tpm_chip *chip, u8 status)
+ return (cancel & CRB_CANCEL_INVOKE) == CRB_CANCEL_INVOKE;
+ }
+
+-static int crb_check_flags(struct tpm_chip *chip)
+-{
+- u32 val;
+- int ret;
+-
+- ret = crb_request_locality(chip, 0);
+- if (ret)
+- return ret;
+-
+- ret = tpm2_get_tpm_pt(chip, TPM2_PT_MANUFACTURER, &val, NULL);
+- if (ret)
+- goto release;
+-
+- if (val == 0x414D4400U /* AMD */)
+- chip->flags |= TPM_CHIP_FLAG_HWRNG_DISABLED;
+-
+-release:
+- crb_relinquish_locality(chip, 0);
+-
+- return ret;
+-}
+-
+ static const struct tpm_class_ops tpm_crb = {
+ .flags = TPM_OPS_AUTO_STARTUP,
+ .status = crb_status,
+@@ -826,9 +804,14 @@ static int crb_acpi_add(struct acpi_device *device)
+ if (rc)
+ goto out;
+
+- rc = crb_check_flags(chip);
+- if (rc)
+- goto out;
++#ifdef CONFIG_X86
++ /* A quirk for https://www.amd.com/en/support/kb/faq/pa-410 */
++ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
++ priv->sm != ACPI_TPM2_COMMAND_BUFFER_WITH_PLUTON) {
++ dev_info(dev, "Disabling hwrng\n");
++ chip->flags |= TPM_CHIP_FLAG_HWRNG_DISABLED;
++ }
++#endif /* CONFIG_X86 */
+
+ rc = tpm_chip_register(chip);
+
+diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
+index 016814e15536a..52dfbae4f361c 100644
+--- a/drivers/clk/Kconfig
++++ b/drivers/clk/Kconfig
+@@ -444,6 +444,7 @@ config COMMON_CLK_BD718XX
+ config COMMON_CLK_FIXED_MMIO
+ bool "Clock driver for Memory Mapped Fixed values"
+ depends on COMMON_CLK && OF
++ depends on HAS_IOMEM
+ help
+ Support for Memory Mapped IO Fixed clocks
+
+diff --git a/drivers/clk/imx/clk-composite-8m.c b/drivers/clk/imx/clk-composite-8m.c
+index 7a6e3ce97133b..27a08c50ac1d8 100644
+--- a/drivers/clk/imx/clk-composite-8m.c
++++ b/drivers/clk/imx/clk-composite-8m.c
+@@ -97,7 +97,7 @@ static int imx8m_clk_composite_divider_set_rate(struct clk_hw *hw,
+ int prediv_value;
+ int div_value;
+ int ret;
+- u32 val;
++ u32 orig, val;
+
+ ret = imx8m_clk_composite_compute_dividers(rate, parent_rate,
+ &prediv_value, &div_value);
+@@ -106,13 +106,15 @@ static int imx8m_clk_composite_divider_set_rate(struct clk_hw *hw,
+
+ spin_lock_irqsave(divider->lock, flags);
+
+- val = readl(divider->reg);
+- val &= ~((clk_div_mask(divider->width) << divider->shift) |
+- (clk_div_mask(PCG_DIV_WIDTH) << PCG_DIV_SHIFT));
++ orig = readl(divider->reg);
++ val = orig & ~((clk_div_mask(divider->width) << divider->shift) |
++ (clk_div_mask(PCG_DIV_WIDTH) << PCG_DIV_SHIFT));
+
+ val |= (u32)(prediv_value - 1) << divider->shift;
+ val |= (u32)(div_value - 1) << PCG_DIV_SHIFT;
+- writel(val, divider->reg);
++
++ if (val != orig)
++ writel(val, divider->reg);
+
+ spin_unlock_irqrestore(divider->lock, flags);
+
+diff --git a/drivers/clk/imx/clk-imx8mp.c b/drivers/clk/imx/clk-imx8mp.c
+index 1469249386dd8..670aa2bab3017 100644
+--- a/drivers/clk/imx/clk-imx8mp.c
++++ b/drivers/clk/imx/clk-imx8mp.c
+@@ -178,10 +178,6 @@ static const char * const imx8mp_sai3_sels[] = {"osc_24m", "audio_pll1_out", "au
+ "video_pll1_out", "sys_pll1_133m", "osc_hdmi",
+ "clk_ext3", "clk_ext4", };
+
+-static const char * const imx8mp_sai4_sels[] = {"osc_24m", "audio_pll1_out", "audio_pll2_out",
+- "video_pll1_out", "sys_pll1_133m", "osc_hdmi",
+- "clk_ext1", "clk_ext2", };
+-
+ static const char * const imx8mp_sai5_sels[] = {"osc_24m", "audio_pll1_out", "audio_pll2_out",
+ "video_pll1_out", "sys_pll1_133m", "osc_hdmi",
+ "clk_ext2", "clk_ext3", };
+@@ -567,7 +563,6 @@ static int imx8mp_clocks_probe(struct platform_device *pdev)
+ hws[IMX8MP_CLK_SAI1] = imx8m_clk_hw_composite("sai1", imx8mp_sai1_sels, ccm_base + 0xa580);
+ hws[IMX8MP_CLK_SAI2] = imx8m_clk_hw_composite("sai2", imx8mp_sai2_sels, ccm_base + 0xa600);
+ hws[IMX8MP_CLK_SAI3] = imx8m_clk_hw_composite("sai3", imx8mp_sai3_sels, ccm_base + 0xa680);
+- hws[IMX8MP_CLK_SAI4] = imx8m_clk_hw_composite("sai4", imx8mp_sai4_sels, ccm_base + 0xa700);
+ hws[IMX8MP_CLK_SAI5] = imx8m_clk_hw_composite("sai5", imx8mp_sai5_sels, ccm_base + 0xa780);
+ hws[IMX8MP_CLK_SAI6] = imx8m_clk_hw_composite("sai6", imx8mp_sai6_sels, ccm_base + 0xa800);
+ hws[IMX8MP_CLK_ENET_QOS] = imx8m_clk_hw_composite("enet_qos", imx8mp_enet_qos_sels, ccm_base + 0xa880);
+diff --git a/drivers/clk/imx/clk-imx8ulp.c b/drivers/clk/imx/clk-imx8ulp.c
+index e308c88cb801c..1b04e2fc78ad5 100644
+--- a/drivers/clk/imx/clk-imx8ulp.c
++++ b/drivers/clk/imx/clk-imx8ulp.c
+@@ -167,7 +167,7 @@ static int imx8ulp_clk_cgc1_init(struct platform_device *pdev)
+ clks[IMX8ULP_CLK_SPLL2_PRE_SEL] = imx_clk_hw_mux_flags("spll2_pre_sel", base + 0x510, 0, 1, pll_pre_sels, ARRAY_SIZE(pll_pre_sels), CLK_SET_PARENT_GATE);
+ clks[IMX8ULP_CLK_SPLL3_PRE_SEL] = imx_clk_hw_mux_flags("spll3_pre_sel", base + 0x610, 0, 1, pll_pre_sels, ARRAY_SIZE(pll_pre_sels), CLK_SET_PARENT_GATE);
+
+- clks[IMX8ULP_CLK_SPLL2] = imx_clk_hw_pllv4(IMX_PLLV4_IMX8ULP, "spll2", "spll2_pre_sel", base + 0x500);
++ clks[IMX8ULP_CLK_SPLL2] = imx_clk_hw_pllv4(IMX_PLLV4_IMX8ULP_1GHZ, "spll2", "spll2_pre_sel", base + 0x500);
+ clks[IMX8ULP_CLK_SPLL3] = imx_clk_hw_pllv4(IMX_PLLV4_IMX8ULP, "spll3", "spll3_pre_sel", base + 0x600);
+ clks[IMX8ULP_CLK_SPLL3_VCODIV] = imx_clk_hw_divider("spll3_vcodiv", "spll3", base + 0x604, 0, 6);
+
+diff --git a/drivers/clk/imx/clk-pllv4.c b/drivers/clk/imx/clk-pllv4.c
+index 6e7e34571fc8d..9b136c951762c 100644
+--- a/drivers/clk/imx/clk-pllv4.c
++++ b/drivers/clk/imx/clk-pllv4.c
+@@ -44,11 +44,15 @@ struct clk_pllv4 {
+ u32 cfg_offset;
+ u32 num_offset;
+ u32 denom_offset;
++ bool use_mult_range;
+ };
+
+ /* Valid PLL MULT Table */
+ static const int pllv4_mult_table[] = {33, 27, 22, 20, 17, 16};
+
++/* Valid PLL MULT range, (max, min) */
++static const int pllv4_mult_range[] = {54, 27};
++
+ #define to_clk_pllv4(__hw) container_of(__hw, struct clk_pllv4, hw)
+
+ #define LOCK_TIMEOUT_US USEC_PER_MSEC
+@@ -94,17 +98,30 @@ static unsigned long clk_pllv4_recalc_rate(struct clk_hw *hw,
+ static long clk_pllv4_round_rate(struct clk_hw *hw, unsigned long rate,
+ unsigned long *prate)
+ {
++ struct clk_pllv4 *pll = to_clk_pllv4(hw);
+ unsigned long parent_rate = *prate;
+ unsigned long round_rate, i;
+ u32 mfn, mfd = DEFAULT_MFD;
+ bool found = false;
+ u64 temp64;
+-
+- for (i = 0; i < ARRAY_SIZE(pllv4_mult_table); i++) {
+- round_rate = parent_rate * pllv4_mult_table[i];
+- if (rate >= round_rate) {
++ u32 mult;
++
++ if (pll->use_mult_range) {
++ temp64 = (u64)rate;
++ do_div(temp64, parent_rate);
++ mult = temp64;
++ if (mult >= pllv4_mult_range[1] &&
++ mult <= pllv4_mult_range[0]) {
++ round_rate = parent_rate * mult;
+ found = true;
+- break;
++ }
++ } else {
++ for (i = 0; i < ARRAY_SIZE(pllv4_mult_table); i++) {
++ round_rate = parent_rate * pllv4_mult_table[i];
++ if (rate >= round_rate) {
++ found = true;
++ break;
++ }
+ }
+ }
+
+@@ -138,14 +155,20 @@ static long clk_pllv4_round_rate(struct clk_hw *hw, unsigned long rate,
+ return round_rate + (u32)temp64;
+ }
+
+-static bool clk_pllv4_is_valid_mult(unsigned int mult)
++static bool clk_pllv4_is_valid_mult(struct clk_pllv4 *pll, unsigned int mult)
+ {
+ int i;
+
+ /* check if mult is in valid MULT table */
+- for (i = 0; i < ARRAY_SIZE(pllv4_mult_table); i++) {
+- if (pllv4_mult_table[i] == mult)
++ if (pll->use_mult_range) {
++ if (mult >= pllv4_mult_range[1] &&
++ mult <= pllv4_mult_range[0])
+ return true;
++ } else {
++ for (i = 0; i < ARRAY_SIZE(pllv4_mult_table); i++) {
++ if (pllv4_mult_table[i] == mult)
++ return true;
++ }
+ }
+
+ return false;
+@@ -160,7 +183,7 @@ static int clk_pllv4_set_rate(struct clk_hw *hw, unsigned long rate,
+
+ mult = rate / parent_rate;
+
+- if (!clk_pllv4_is_valid_mult(mult))
++ if (!clk_pllv4_is_valid_mult(pll, mult))
+ return -EINVAL;
+
+ if (parent_rate <= MAX_MFD)
+@@ -227,10 +250,13 @@ struct clk_hw *imx_clk_hw_pllv4(enum imx_pllv4_type type, const char *name,
+
+ pll->base = base;
+
+- if (type == IMX_PLLV4_IMX8ULP) {
++ if (type == IMX_PLLV4_IMX8ULP ||
++ type == IMX_PLLV4_IMX8ULP_1GHZ) {
+ pll->cfg_offset = IMX8ULP_PLL_CFG_OFFSET;
+ pll->num_offset = IMX8ULP_PLL_NUM_OFFSET;
+ pll->denom_offset = IMX8ULP_PLL_DENOM_OFFSET;
++ if (type == IMX_PLLV4_IMX8ULP_1GHZ)
++ pll->use_mult_range = true;
+ } else {
+ pll->cfg_offset = PLL_CFG_OFFSET;
+ pll->num_offset = PLL_NUM_OFFSET;
+diff --git a/drivers/clk/imx/clk.h b/drivers/clk/imx/clk.h
+index 1031468701d7f..6f752f07d125d 100644
+--- a/drivers/clk/imx/clk.h
++++ b/drivers/clk/imx/clk.h
+@@ -46,6 +46,7 @@ enum imx_pll14xx_type {
+ enum imx_pllv4_type {
+ IMX_PLLV4_IMX7ULP,
+ IMX_PLLV4_IMX8ULP,
++ IMX_PLLV4_IMX8ULP_1GHZ,
+ };
+
+ enum imx_pfdv2_type {
+diff --git a/drivers/clk/keystone/pll.c b/drivers/clk/keystone/pll.c
+index d59a7621bb204..ee5c72369334f 100644
+--- a/drivers/clk/keystone/pll.c
++++ b/drivers/clk/keystone/pll.c
+@@ -209,7 +209,7 @@ static void __init _of_pll_clk_init(struct device_node *node, bool pllctrl)
+ }
+
+ clk = clk_register_pll(NULL, node->name, parent_name, pll_data);
+- if (clk) {
++ if (!IS_ERR_OR_NULL(clk)) {
+ of_clk_add_provider(node, of_clk_src_simple_get, clk);
+ return;
+ }
+diff --git a/drivers/clk/qcom/dispcc-sc8280xp.c b/drivers/clk/qcom/dispcc-sc8280xp.c
+index 167470beb3691..30f636b9f0ec8 100644
+--- a/drivers/clk/qcom/dispcc-sc8280xp.c
++++ b/drivers/clk/qcom/dispcc-sc8280xp.c
+@@ -3057,7 +3057,7 @@ static struct gdsc disp0_mdss_gdsc = {
+ .name = "disp0_mdss_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = HW_CTRL,
++ .flags = HW_CTRL | RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc disp1_mdss_gdsc = {
+@@ -3069,7 +3069,7 @@ static struct gdsc disp1_mdss_gdsc = {
+ .name = "disp1_mdss_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = HW_CTRL,
++ .flags = HW_CTRL | RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc disp0_mdss_int2_gdsc = {
+@@ -3081,7 +3081,7 @@ static struct gdsc disp0_mdss_int2_gdsc = {
+ .name = "disp0_mdss_int2_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = HW_CTRL,
++ .flags = HW_CTRL | RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc disp1_mdss_int2_gdsc = {
+@@ -3093,7 +3093,7 @@ static struct gdsc disp1_mdss_int2_gdsc = {
+ .name = "disp1_mdss_int2_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = HW_CTRL,
++ .flags = HW_CTRL | RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc *disp0_cc_sc8280xp_gdscs[] = {
+diff --git a/drivers/clk/qcom/gcc-qdu1000.c b/drivers/clk/qcom/gcc-qdu1000.c
+index 5051769ad90c7..626c5afed7806 100644
+--- a/drivers/clk/qcom/gcc-qdu1000.c
++++ b/drivers/clk/qcom/gcc-qdu1000.c
+@@ -1,6 +1,6 @@
+ // SPDX-License-Identifier: GPL-2.0-only
+ /*
+- * Copyright (c) 2022, Qualcomm Innovation Center, Inc. All rights reserved.
++ * Copyright (c) 2022-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+ #include <linux/clk-provider.h>
+@@ -370,16 +370,6 @@ static const struct clk_parent_data gcc_parent_data_6[] = {
+ { .index = DT_TCXO_IDX },
+ };
+
+-static const struct parent_map gcc_parent_map_7[] = {
+- { P_PCIE_0_PIPE_CLK, 0 },
+- { P_BI_TCXO, 2 },
+-};
+-
+-static const struct clk_parent_data gcc_parent_data_7[] = {
+- { .index = DT_PCIE_0_PIPE_CLK_IDX },
+- { .index = DT_TCXO_IDX },
+-};
+-
+ static const struct parent_map gcc_parent_map_8[] = {
+ { P_BI_TCXO, 0 },
+ { P_GCC_GPLL0_OUT_MAIN, 1 },
+@@ -439,16 +429,15 @@ static struct clk_regmap_mux gcc_pcie_0_phy_aux_clk_src = {
+ },
+ };
+
+-static struct clk_regmap_mux gcc_pcie_0_pipe_clk_src = {
++static struct clk_regmap_phy_mux gcc_pcie_0_pipe_clk_src = {
+ .reg = 0x9d064,
+- .shift = 0,
+- .width = 2,
+- .parent_map = gcc_parent_map_7,
+ .clkr = {
+ .hw.init = &(const struct clk_init_data) {
+ .name = "gcc_pcie_0_pipe_clk_src",
+- .parent_data = gcc_parent_data_7,
+- .num_parents = ARRAY_SIZE(gcc_parent_data_7),
++ .parent_data = &(const struct clk_parent_data){
++ .index = DT_PCIE_0_PIPE_CLK_IDX,
++ },
++ .num_parents = 1,
+ .ops = &clk_regmap_phy_mux_ops,
+ },
+ },
+@@ -1458,14 +1447,13 @@ static struct clk_branch gcc_pcie_0_cfg_ahb_clk = {
+
+ static struct clk_branch gcc_pcie_0_clkref_en = {
+ .halt_reg = 0x9c004,
+- .halt_bit = 31,
+- .halt_check = BRANCH_HALT_ENABLE,
++ .halt_check = BRANCH_HALT,
+ .clkr = {
+ .enable_reg = 0x9c004,
+ .enable_mask = BIT(0),
+ .hw.init = &(const struct clk_init_data) {
+ .name = "gcc_pcie_0_clkref_en",
+- .ops = &clk_branch_ops,
++ .ops = &clk_branch2_ops,
+ },
+ },
+ };
+@@ -2285,14 +2273,13 @@ static struct clk_branch gcc_tsc_etu_clk = {
+
+ static struct clk_branch gcc_usb2_clkref_en = {
+ .halt_reg = 0x9c008,
+- .halt_bit = 31,
+- .halt_check = BRANCH_HALT_ENABLE,
++ .halt_check = BRANCH_HALT,
+ .clkr = {
+ .enable_reg = 0x9c008,
+ .enable_mask = BIT(0),
+ .hw.init = &(const struct clk_init_data) {
+ .name = "gcc_usb2_clkref_en",
+- .ops = &clk_branch_ops,
++ .ops = &clk_branch2_ops,
+ },
+ },
+ };
+@@ -2534,6 +2521,7 @@ static struct clk_regmap *gcc_qdu1000_clocks[] = {
+ [GCC_AGGRE_NOC_ECPRI_GSI_CLK] = &gcc_aggre_noc_ecpri_gsi_clk.clkr,
+ [GCC_PCIE_0_PHY_AUX_CLK_SRC] = &gcc_pcie_0_phy_aux_clk_src.clkr,
+ [GCC_PCIE_0_PIPE_CLK_SRC] = &gcc_pcie_0_pipe_clk_src.clkr,
++ [GCC_GPLL1_OUT_EVEN] = &gcc_gpll1_out_even.clkr,
+ };
+
+ static const struct qcom_reset_map gcc_qdu1000_resets[] = {
+diff --git a/drivers/clk/qcom/gcc-sc7180.c b/drivers/clk/qcom/gcc-sc7180.c
+index cef3c77564cfd..49f36e1df4fa8 100644
+--- a/drivers/clk/qcom/gcc-sc7180.c
++++ b/drivers/clk/qcom/gcc-sc7180.c
+@@ -651,6 +651,7 @@ static struct clk_rcg2 gcc_sdcc2_apps_clk_src = {
+ .name = "gcc_sdcc2_apps_clk_src",
+ .parent_data = gcc_parent_data_5,
+ .num_parents = ARRAY_SIZE(gcc_parent_data_5),
++ .flags = CLK_OPS_PARENT_ENABLE,
+ .ops = &clk_rcg2_floor_ops,
+ },
+ };
+diff --git a/drivers/clk/qcom/gcc-sc8280xp.c b/drivers/clk/qcom/gcc-sc8280xp.c
+index 04a99dbaa57e0..57bbd609151cd 100644
+--- a/drivers/clk/qcom/gcc-sc8280xp.c
++++ b/drivers/clk/qcom/gcc-sc8280xp.c
+@@ -6760,7 +6760,7 @@ static struct gdsc pcie_0_tunnel_gdsc = {
+ .name = "pcie_0_tunnel_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = VOTABLE,
++ .flags = VOTABLE | RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc pcie_1_tunnel_gdsc = {
+@@ -6771,7 +6771,7 @@ static struct gdsc pcie_1_tunnel_gdsc = {
+ .name = "pcie_1_tunnel_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = VOTABLE,
++ .flags = VOTABLE | RETAIN_FF_ENABLE,
+ };
+
+ /*
+@@ -6786,7 +6786,7 @@ static struct gdsc pcie_2a_gdsc = {
+ .name = "pcie_2a_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = VOTABLE | ALWAYS_ON,
++ .flags = VOTABLE | RETAIN_FF_ENABLE | ALWAYS_ON,
+ };
+
+ static struct gdsc pcie_2b_gdsc = {
+@@ -6797,7 +6797,7 @@ static struct gdsc pcie_2b_gdsc = {
+ .name = "pcie_2b_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = VOTABLE | ALWAYS_ON,
++ .flags = VOTABLE | RETAIN_FF_ENABLE | ALWAYS_ON,
+ };
+
+ static struct gdsc pcie_3a_gdsc = {
+@@ -6808,7 +6808,7 @@ static struct gdsc pcie_3a_gdsc = {
+ .name = "pcie_3a_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = VOTABLE | ALWAYS_ON,
++ .flags = VOTABLE | RETAIN_FF_ENABLE | ALWAYS_ON,
+ };
+
+ static struct gdsc pcie_3b_gdsc = {
+@@ -6819,7 +6819,7 @@ static struct gdsc pcie_3b_gdsc = {
+ .name = "pcie_3b_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = VOTABLE | ALWAYS_ON,
++ .flags = VOTABLE | RETAIN_FF_ENABLE | ALWAYS_ON,
+ };
+
+ static struct gdsc pcie_4_gdsc = {
+@@ -6830,7 +6830,7 @@ static struct gdsc pcie_4_gdsc = {
+ .name = "pcie_4_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
+- .flags = VOTABLE | ALWAYS_ON,
++ .flags = VOTABLE | RETAIN_FF_ENABLE | ALWAYS_ON,
+ };
+
+ static struct gdsc ufs_card_gdsc = {
+@@ -6839,6 +6839,7 @@ static struct gdsc ufs_card_gdsc = {
+ .name = "ufs_card_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
++ .flags = RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc ufs_phy_gdsc = {
+@@ -6847,6 +6848,7 @@ static struct gdsc ufs_phy_gdsc = {
+ .name = "ufs_phy_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
++ .flags = RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc usb30_mp_gdsc = {
+@@ -6855,6 +6857,7 @@ static struct gdsc usb30_mp_gdsc = {
+ .name = "usb30_mp_gdsc",
+ },
+ .pwrsts = PWRSTS_RET_ON,
++ .flags = RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc usb30_prim_gdsc = {
+@@ -6863,6 +6866,7 @@ static struct gdsc usb30_prim_gdsc = {
+ .name = "usb30_prim_gdsc",
+ },
+ .pwrsts = PWRSTS_RET_ON,
++ .flags = RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc usb30_sec_gdsc = {
+@@ -6871,6 +6875,7 @@ static struct gdsc usb30_sec_gdsc = {
+ .name = "usb30_sec_gdsc",
+ },
+ .pwrsts = PWRSTS_RET_ON,
++ .flags = RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc emac_0_gdsc = {
+@@ -6879,6 +6884,7 @@ static struct gdsc emac_0_gdsc = {
+ .name = "emac_0_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
++ .flags = RETAIN_FF_ENABLE,
+ };
+
+ static struct gdsc emac_1_gdsc = {
+@@ -6887,6 +6893,97 @@ static struct gdsc emac_1_gdsc = {
+ .name = "emac_1_gdsc",
+ },
+ .pwrsts = PWRSTS_OFF_ON,
++ .flags = RETAIN_FF_ENABLE,
++};
++
++static struct gdsc usb4_1_gdsc = {
++ .gdscr = 0xb8004,
++ .pd = {
++ .name = "usb4_1_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = RETAIN_FF_ENABLE,
++};
++
++static struct gdsc usb4_gdsc = {
++ .gdscr = 0x2a004,
++ .pd = {
++ .name = "usb4_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = RETAIN_FF_ENABLE,
++};
++
++static struct gdsc hlos1_vote_mmnoc_mmu_tbu_hf0_gdsc = {
++ .gdscr = 0x7d050,
++ .pd = {
++ .name = "hlos1_vote_mmnoc_mmu_tbu_hf0_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = VOTABLE,
++};
++
++static struct gdsc hlos1_vote_mmnoc_mmu_tbu_hf1_gdsc = {
++ .gdscr = 0x7d058,
++ .pd = {
++ .name = "hlos1_vote_mmnoc_mmu_tbu_hf1_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = VOTABLE,
++};
++
++static struct gdsc hlos1_vote_mmnoc_mmu_tbu_sf0_gdsc = {
++ .gdscr = 0x7d054,
++ .pd = {
++ .name = "hlos1_vote_mmnoc_mmu_tbu_sf0_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = VOTABLE,
++};
++
++static struct gdsc hlos1_vote_mmnoc_mmu_tbu_sf1_gdsc = {
++ .gdscr = 0x7d06c,
++ .pd = {
++ .name = "hlos1_vote_mmnoc_mmu_tbu_sf1_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = VOTABLE,
++};
++
++static struct gdsc hlos1_vote_turing_mmu_tbu0_gdsc = {
++ .gdscr = 0x7d05c,
++ .pd = {
++ .name = "hlos1_vote_turing_mmu_tbu0_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = VOTABLE,
++};
++
++static struct gdsc hlos1_vote_turing_mmu_tbu1_gdsc = {
++ .gdscr = 0x7d060,
++ .pd = {
++ .name = "hlos1_vote_turing_mmu_tbu1_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = VOTABLE,
++};
++
++static struct gdsc hlos1_vote_turing_mmu_tbu2_gdsc = {
++ .gdscr = 0x7d0a0,
++ .pd = {
++ .name = "hlos1_vote_turing_mmu_tbu2_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = VOTABLE,
++};
++
++static struct gdsc hlos1_vote_turing_mmu_tbu3_gdsc = {
++ .gdscr = 0x7d0a4,
++ .pd = {
++ .name = "hlos1_vote_turing_mmu_tbu3_gdsc",
++ },
++ .pwrsts = PWRSTS_OFF_ON,
++ .flags = VOTABLE,
+ };
+
+ static struct clk_regmap *gcc_sc8280xp_clocks[] = {
+@@ -7369,6 +7466,16 @@ static struct gdsc *gcc_sc8280xp_gdscs[] = {
+ [USB30_SEC_GDSC] = &usb30_sec_gdsc,
+ [EMAC_0_GDSC] = &emac_0_gdsc,
+ [EMAC_1_GDSC] = &emac_1_gdsc,
++ [USB4_1_GDSC] = &usb4_1_gdsc,
++ [USB4_GDSC] = &usb4_gdsc,
++ [HLOS1_VOTE_MMNOC_MMU_TBU_HF0_GDSC] = &hlos1_vote_mmnoc_mmu_tbu_hf0_gdsc,
++ [HLOS1_VOTE_MMNOC_MMU_TBU_HF1_GDSC] = &hlos1_vote_mmnoc_mmu_tbu_hf1_gdsc,
++ [HLOS1_VOTE_MMNOC_MMU_TBU_SF0_GDSC] = &hlos1_vote_mmnoc_mmu_tbu_sf0_gdsc,
++ [HLOS1_VOTE_MMNOC_MMU_TBU_SF1_GDSC] = &hlos1_vote_mmnoc_mmu_tbu_sf1_gdsc,
++ [HLOS1_VOTE_TURING_MMU_TBU0_GDSC] = &hlos1_vote_turing_mmu_tbu0_gdsc,
++ [HLOS1_VOTE_TURING_MMU_TBU1_GDSC] = &hlos1_vote_turing_mmu_tbu1_gdsc,
++ [HLOS1_VOTE_TURING_MMU_TBU2_GDSC] = &hlos1_vote_turing_mmu_tbu2_gdsc,
++ [HLOS1_VOTE_TURING_MMU_TBU3_GDSC] = &hlos1_vote_turing_mmu_tbu3_gdsc,
+ };
+
+ static const struct clk_rcg_dfs_data gcc_dfs_clocks[] = {
+diff --git a/drivers/clk/qcom/gcc-sm6350.c b/drivers/clk/qcom/gcc-sm6350.c
+index 9b4e4bb059635..cf4a7b6e0b23a 100644
+--- a/drivers/clk/qcom/gcc-sm6350.c
++++ b/drivers/clk/qcom/gcc-sm6350.c
+@@ -641,6 +641,7 @@ static struct clk_rcg2 gcc_sdcc2_apps_clk_src = {
+ .name = "gcc_sdcc2_apps_clk_src",
+ .parent_data = gcc_parent_data_8,
+ .num_parents = ARRAY_SIZE(gcc_parent_data_8),
++ .flags = CLK_OPS_PARENT_ENABLE,
+ .ops = &clk_rcg2_floor_ops,
+ },
+ };
+diff --git a/drivers/clk/qcom/gcc-sm7150.c b/drivers/clk/qcom/gcc-sm7150.c
+index 6b628178f62c4..6da87f0436d0c 100644
+--- a/drivers/clk/qcom/gcc-sm7150.c
++++ b/drivers/clk/qcom/gcc-sm7150.c
+@@ -739,6 +739,7 @@ static struct clk_rcg2 gcc_sdcc2_apps_clk_src = {
+ .parent_data = gcc_parent_data_6,
+ .num_parents = ARRAY_SIZE(gcc_parent_data_6),
+ .ops = &clk_rcg2_floor_ops,
++ .flags = CLK_OPS_PARENT_ENABLE,
+ },
+ };
+
+diff --git a/drivers/clk/qcom/gcc-sm8250.c b/drivers/clk/qcom/gcc-sm8250.c
+index b6cf4bc88d4d4..d3c75bb55946a 100644
+--- a/drivers/clk/qcom/gcc-sm8250.c
++++ b/drivers/clk/qcom/gcc-sm8250.c
+@@ -721,6 +721,7 @@ static struct clk_rcg2 gcc_sdcc2_apps_clk_src = {
+ .name = "gcc_sdcc2_apps_clk_src",
+ .parent_data = gcc_parent_data_4,
+ .num_parents = ARRAY_SIZE(gcc_parent_data_4),
++ .flags = CLK_OPS_PARENT_ENABLE,
+ .ops = &clk_rcg2_floor_ops,
+ },
+ };
+diff --git a/drivers/clk/qcom/gcc-sm8450.c b/drivers/clk/qcom/gcc-sm8450.c
+index 84764cc3db4ff..1d912f8a7243c 100644
+--- a/drivers/clk/qcom/gcc-sm8450.c
++++ b/drivers/clk/qcom/gcc-sm8450.c
+@@ -904,7 +904,7 @@ static struct clk_rcg2 gcc_sdcc2_apps_clk_src = {
+ .parent_data = gcc_parent_data_7,
+ .num_parents = ARRAY_SIZE(gcc_parent_data_7),
+ .flags = CLK_SET_RATE_PARENT,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_floor_ops,
+ },
+ };
+
+@@ -926,7 +926,7 @@ static struct clk_rcg2 gcc_sdcc4_apps_clk_src = {
+ .parent_data = gcc_parent_data_0,
+ .num_parents = ARRAY_SIZE(gcc_parent_data_0),
+ .flags = CLK_SET_RATE_PARENT,
+- .ops = &clk_rcg2_ops,
++ .ops = &clk_rcg2_floor_ops,
+ },
+ };
+
+diff --git a/drivers/clk/qcom/gpucc-sm6350.c b/drivers/clk/qcom/gpucc-sm6350.c
+index ef15185a99c31..0bcbba2a29436 100644
+--- a/drivers/clk/qcom/gpucc-sm6350.c
++++ b/drivers/clk/qcom/gpucc-sm6350.c
+@@ -24,6 +24,12 @@
+ #define CX_GMU_CBCR_WAKE_MASK 0xF
+ #define CX_GMU_CBCR_WAKE_SHIFT 8
+
++enum {
++ DT_BI_TCXO,
++ DT_GPLL0_OUT_MAIN,
++ DT_GPLL0_OUT_MAIN_DIV,
++};
++
+ enum {
+ P_BI_TCXO,
+ P_GPLL0_OUT_MAIN,
+@@ -61,6 +67,7 @@ static struct clk_alpha_pll gpu_cc_pll0 = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_pll0",
+ .parent_data = &(const struct clk_parent_data){
++ .index = DT_BI_TCXO,
+ .fw_name = "bi_tcxo",
+ },
+ .num_parents = 1,
+@@ -104,6 +111,7 @@ static struct clk_alpha_pll gpu_cc_pll1 = {
+ .hw.init = &(struct clk_init_data){
+ .name = "gpu_cc_pll1",
+ .parent_data = &(const struct clk_parent_data){
++ .index = DT_BI_TCXO,
+ .fw_name = "bi_tcxo",
+ },
+ .num_parents = 1,
+@@ -121,11 +129,11 @@ static const struct parent_map gpu_cc_parent_map_0[] = {
+ };
+
+ static const struct clk_parent_data gpu_cc_parent_data_0[] = {
+- { .fw_name = "bi_tcxo" },
++ { .index = DT_BI_TCXO, .fw_name = "bi_tcxo" },
+ { .hw = &gpu_cc_pll0.clkr.hw },
+ { .hw = &gpu_cc_pll1.clkr.hw },
+- { .fw_name = "gcc_gpu_gpll0_clk" },
+- { .fw_name = "gcc_gpu_gpll0_div_clk" },
++ { .index = DT_GPLL0_OUT_MAIN, .fw_name = "gcc_gpu_gpll0_clk_src" },
++ { .index = DT_GPLL0_OUT_MAIN_DIV, .fw_name = "gcc_gpu_gpll0_div_clk_src" },
+ };
+
+ static const struct parent_map gpu_cc_parent_map_1[] = {
+@@ -138,12 +146,12 @@ static const struct parent_map gpu_cc_parent_map_1[] = {
+ };
+
+ static const struct clk_parent_data gpu_cc_parent_data_1[] = {
+- { .fw_name = "bi_tcxo" },
++ { .index = DT_BI_TCXO, .fw_name = "bi_tcxo" },
+ { .hw = &crc_div.hw },
+ { .hw = &gpu_cc_pll0.clkr.hw },
+ { .hw = &gpu_cc_pll1.clkr.hw },
+ { .hw = &gpu_cc_pll1.clkr.hw },
+- { .fw_name = "gcc_gpu_gpll0_clk" },
++ { .index = DT_GPLL0_OUT_MAIN, .fw_name = "gcc_gpu_gpll0_clk_src" },
+ };
+
+ static const struct freq_tbl ftbl_gpu_cc_gmu_clk_src[] = {
+diff --git a/drivers/clk/qcom/reset.c b/drivers/clk/qcom/reset.c
+index 0e914ec7aeae1..e45e32804d2c7 100644
+--- a/drivers/clk/qcom/reset.c
++++ b/drivers/clk/qcom/reset.c
+@@ -16,7 +16,8 @@ static int qcom_reset(struct reset_controller_dev *rcdev, unsigned long id)
+ struct qcom_reset_controller *rst = to_qcom_reset_controller(rcdev);
+
+ rcdev->ops->assert(rcdev, id);
+- udelay(rst->reset_map[id].udelay ?: 1); /* use 1 us as default */
++ fsleep(rst->reset_map[id].udelay ?: 1); /* use 1 us as default */
++
+ rcdev->ops->deassert(rcdev, id);
+ return 0;
+ }
+diff --git a/drivers/clk/rockchip/clk-rk3568.c b/drivers/clk/rockchip/clk-rk3568.c
+index f85902e2590c7..2f54f630c8b65 100644
+--- a/drivers/clk/rockchip/clk-rk3568.c
++++ b/drivers/clk/rockchip/clk-rk3568.c
+@@ -81,7 +81,7 @@ static struct rockchip_pll_rate_table rk3568_pll_rates[] = {
+ RK3036_PLL_RATE(108000000, 2, 45, 5, 1, 1, 0),
+ RK3036_PLL_RATE(100000000, 1, 150, 6, 6, 1, 0),
+ RK3036_PLL_RATE(96000000, 1, 96, 6, 4, 1, 0),
+- RK3036_PLL_RATE(78750000, 1, 96, 6, 4, 1, 0),
++ RK3036_PLL_RATE(78750000, 4, 315, 6, 4, 1, 0),
+ RK3036_PLL_RATE(74250000, 2, 99, 4, 4, 1, 0),
+ { /* sentinel */ },
+ };
+diff --git a/drivers/clk/sunxi-ng/ccu_mmc_timing.c b/drivers/clk/sunxi-ng/ccu_mmc_timing.c
+index 23a8d44e2449b..78919d7843bec 100644
+--- a/drivers/clk/sunxi-ng/ccu_mmc_timing.c
++++ b/drivers/clk/sunxi-ng/ccu_mmc_timing.c
+@@ -43,7 +43,7 @@ int sunxi_ccu_set_mmc_timing_mode(struct clk *clk, bool new_mode)
+ EXPORT_SYMBOL_GPL(sunxi_ccu_set_mmc_timing_mode);
+
+ /**
+- * sunxi_ccu_set_mmc_timing_mode: Get the current MMC clock timing mode
++ * sunxi_ccu_get_mmc_timing_mode: Get the current MMC clock timing mode
+ * @clk: clock to query
+ *
+ * Return: %0 if the clock is in old timing mode, > %0 if it is in
+diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
+index 7f3fe20489818..502d494499ae8 100644
+--- a/drivers/cpufreq/amd-pstate-ut.c
++++ b/drivers/cpufreq/amd-pstate-ut.c
+@@ -64,27 +64,9 @@ static struct amd_pstate_ut_struct amd_pstate_ut_cases[] = {
+ static bool get_shared_mem(void)
+ {
+ bool result = false;
+- char path[] = "/sys/module/amd_pstate/parameters/shared_mem";
+- char buf[5] = {0};
+- struct file *filp = NULL;
+- loff_t pos = 0;
+- ssize_t ret;
+-
+- if (!boot_cpu_has(X86_FEATURE_CPPC)) {
+- filp = filp_open(path, O_RDONLY, 0);
+- if (IS_ERR(filp))
+- pr_err("%s unable to open %s file!\n", __func__, path);
+- else {
+- ret = kernel_read(filp, &buf, sizeof(buf), &pos);
+- if (ret < 0)
+- pr_err("%s read %s file fail ret=%ld!\n",
+- __func__, path, (long)ret);
+- filp_close(filp, NULL);
+- }
+
+- if ('Y' == *buf)
+- result = true;
+- }
++ if (!boot_cpu_has(X86_FEATURE_CPPC))
++ result = true;
+
+ return result;
+ }
+@@ -158,7 +140,7 @@ static void amd_pstate_ut_check_perf(u32 index)
+ if (ret) {
+ amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
+ pr_err("%s cppc_get_perf_caps ret=%d error!\n", __func__, ret);
+- return;
++ goto skip_test;
+ }
+
+ nominal_perf = cppc_perf.nominal_perf;
+@@ -169,7 +151,7 @@ static void amd_pstate_ut_check_perf(u32 index)
+ if (ret) {
+ amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
+ pr_err("%s read CPPC_CAP1 ret=%d error!\n", __func__, ret);
+- return;
++ goto skip_test;
+ }
+
+ nominal_perf = AMD_CPPC_NOMINAL_PERF(cap1);
+@@ -187,7 +169,7 @@ static void amd_pstate_ut_check_perf(u32 index)
+ nominal_perf, cpudata->nominal_perf,
+ lowest_nonlinear_perf, cpudata->lowest_nonlinear_perf,
+ lowest_perf, cpudata->lowest_perf);
+- return;
++ goto skip_test;
+ }
+
+ if (!((highest_perf >= nominal_perf) &&
+@@ -198,11 +180,15 @@ static void amd_pstate_ut_check_perf(u32 index)
+ pr_err("%s cpu%d highest=%d >= nominal=%d > lowest_nonlinear=%d > lowest=%d > 0, the formula is incorrect!\n",
+ __func__, cpu, highest_perf, nominal_perf,
+ lowest_nonlinear_perf, lowest_perf);
+- return;
++ goto skip_test;
+ }
++ cpufreq_cpu_put(policy);
+ }
+
+ amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
++ return;
++skip_test:
++ cpufreq_cpu_put(policy);
+ }
+
+ /*
+@@ -230,14 +216,14 @@ static void amd_pstate_ut_check_freq(u32 index)
+ pr_err("%s cpu%d max=%d >= nominal=%d > lowest_nonlinear=%d > min=%d > 0, the formula is incorrect!\n",
+ __func__, cpu, cpudata->max_freq, cpudata->nominal_freq,
+ cpudata->lowest_nonlinear_freq, cpudata->min_freq);
+- return;
++ goto skip_test;
+ }
+
+ if (cpudata->min_freq != policy->min) {
+ amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
+ pr_err("%s cpu%d cpudata_min_freq=%d policy_min=%d, they should be equal!\n",
+ __func__, cpu, cpudata->min_freq, policy->min);
+- return;
++ goto skip_test;
+ }
+
+ if (cpudata->boost_supported) {
+@@ -249,16 +235,20 @@ static void amd_pstate_ut_check_freq(u32 index)
+ pr_err("%s cpu%d policy_max=%d should be equal cpu_max=%d or cpu_nominal=%d !\n",
+ __func__, cpu, policy->max, cpudata->max_freq,
+ cpudata->nominal_freq);
+- return;
++ goto skip_test;
+ }
+ } else {
+ amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
+ pr_err("%s cpu%d must support boost!\n", __func__, cpu);
+- return;
++ goto skip_test;
+ }
++ cpufreq_cpu_put(policy);
+ }
+
+ amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
++ return;
++skip_test:
++ cpufreq_cpu_put(policy);
+ }
+
+ static int __init amd_pstate_ut_init(void)
+diff --git a/drivers/cpufreq/brcmstb-avs-cpufreq.c b/drivers/cpufreq/brcmstb-avs-cpufreq.c
+index ffea6402189d3..3052949aebbc7 100644
+--- a/drivers/cpufreq/brcmstb-avs-cpufreq.c
++++ b/drivers/cpufreq/brcmstb-avs-cpufreq.c
+@@ -434,7 +434,11 @@ brcm_avs_get_freq_table(struct device *dev, struct private_data *priv)
+ if (ret)
+ return ERR_PTR(ret);
+
+- table = devm_kcalloc(dev, AVS_PSTATE_MAX + 1, sizeof(*table),
++ /*
++	 * We allocate space for the 5 different AVS P-states,
++ * plus extra space for a terminating element.
++ */
++ table = devm_kcalloc(dev, AVS_PSTATE_MAX + 1 + 1, sizeof(*table),
+ GFP_KERNEL);
+ if (!table)
+ return ERR_PTR(-ENOMEM);
+diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
+index 6b52ebe5a8904..f11b01b25e8d5 100644
+--- a/drivers/cpufreq/cpufreq.c
++++ b/drivers/cpufreq/cpufreq.c
+@@ -455,8 +455,10 @@ void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
+ policy->cur,
+ policy->cpuinfo.max_freq);
+
++ spin_lock(&policy->transition_lock);
+ policy->transition_ongoing = false;
+ policy->transition_task = NULL;
++ spin_unlock(&policy->transition_lock);
+
+ wake_up(&policy->transition_wait);
+ }
+diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
+index f29182512b982..1d025bf7dc079 100644
+--- a/drivers/cpufreq/intel_pstate.c
++++ b/drivers/cpufreq/intel_pstate.c
+@@ -2570,6 +2570,11 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
+ intel_pstate_clear_update_util_hook(policy->cpu);
+ intel_pstate_hwp_set(policy->cpu);
+ }
++ /*
++ * policy->cur is never updated with the intel_pstate driver, but it
++ * is used as a stale frequency value. So, keep it within limits.
++ */
++ policy->cur = policy->min;
+
+ mutex_unlock(&intel_pstate_limits_lock);
+
+diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c
+index d289036beff23..b10f7a1b77f11 100644
+--- a/drivers/cpufreq/powernow-k8.c
++++ b/drivers/cpufreq/powernow-k8.c
+@@ -1101,7 +1101,8 @@ static int powernowk8_cpu_exit(struct cpufreq_policy *pol)
+
+ kfree(data->powernow_table);
+ kfree(data);
+- for_each_cpu(cpu, pol->cpus)
++ /* pol->cpus will be empty here, use related_cpus instead. */
++ for_each_cpu(cpu, pol->related_cpus)
+ per_cpu(powernow_data, cpu) = NULL;
+
+ return 0;
+diff --git a/drivers/cpufreq/tegra194-cpufreq.c b/drivers/cpufreq/tegra194-cpufreq.c
+index 36dad5ea59475..75f1e611d0aab 100644
+--- a/drivers/cpufreq/tegra194-cpufreq.c
++++ b/drivers/cpufreq/tegra194-cpufreq.c
+@@ -508,6 +508,32 @@ static int tegra194_cpufreq_init(struct cpufreq_policy *policy)
+ return 0;
+ }
+
++static int tegra194_cpufreq_online(struct cpufreq_policy *policy)
++{
++ /* We did light-weight tear down earlier, nothing to do here */
++ return 0;
++}
++
++static int tegra194_cpufreq_offline(struct cpufreq_policy *policy)
++{
++ /*
++ * Preserve policy->driver_data and don't free resources on light-weight
++ * tear down.
++ */
++
++ return 0;
++}
++
++static int tegra194_cpufreq_exit(struct cpufreq_policy *policy)
++{
++ struct device *cpu_dev = get_cpu_device(policy->cpu);
++
++ dev_pm_opp_remove_all_dynamic(cpu_dev);
++ dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
++
++ return 0;
++}
++
+ static int tegra194_cpufreq_set_target(struct cpufreq_policy *policy,
+ unsigned int index)
+ {
+@@ -535,6 +561,9 @@ static struct cpufreq_driver tegra194_cpufreq_driver = {
+ .target_index = tegra194_cpufreq_set_target,
+ .get = tegra194_get_speed,
+ .init = tegra194_cpufreq_init,
++ .exit = tegra194_cpufreq_exit,
++ .online = tegra194_cpufreq_online,
++ .offline = tegra194_cpufreq_offline,
+ .attr = cpufreq_generic_attr,
+ };
+
+diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
+index a7d33f3ee01e7..14db9b7d985d1 100644
+--- a/drivers/cpuidle/cpuidle-pseries.c
++++ b/drivers/cpuidle/cpuidle-pseries.c
+@@ -414,13 +414,7 @@ static int __init pseries_idle_probe(void)
+ return -ENODEV;
+
+ if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
+- /*
+- * Use local_paca instead of get_lppaca() since
+- * preemption is not disabled, and it is not required in
+- * fact, since lppaca_ptr does not need to be the value
+- * associated to the current CPU, it can be from any CPU.
+- */
+- if (lppaca_shared_proc(local_paca->lppaca_ptr)) {
++ if (lppaca_shared_proc()) {
+ cpuidle_state_table = shared_states;
+ max_idle_state = ARRAY_SIZE(shared_states);
+ } else {
+diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c
+index 987fc5f3997dc..2cdc711679a5f 100644
+--- a/drivers/cpuidle/governors/teo.c
++++ b/drivers/cpuidle/governors/teo.c
+@@ -397,13 +397,23 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
+ * the shallowest non-polling state and exit.
+ */
+ if (drv->state_count < 3 && cpu_data->utilized) {
+- for (i = 0; i < drv->state_count; ++i) {
+- if (!dev->states_usage[i].disable &&
+- !(drv->states[i].flags & CPUIDLE_FLAG_POLLING)) {
+- idx = i;
+- goto end;
+- }
+- }
++ /* The CPU is utilized, so assume a short idle duration. */
++ duration_ns = teo_middle_of_bin(0, drv);
++ /*
++ * If state 0 is enabled and it is not a polling one, select it
++ * right away unless the scheduler tick has been stopped, in
++ * which case care needs to be taken to leave the CPU in a deep
++ * enough state in case it is not woken up any time soon after
++ * all. If state 1 is disabled, though, state 0 must be used
++ * anyway.
++ */
++ if ((!idx && !(drv->states[0].flags & CPUIDLE_FLAG_POLLING) &&
++ teo_time_ok(duration_ns)) || dev->states_usage[1].disable)
++ idx = 0;
++ else /* Assume that state 1 is not a polling one and use it. */
++ idx = 1;
++
++ goto end;
+ }
+
+ /*
+@@ -539,10 +549,20 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
+
+ /*
+ * If the CPU is being utilized over the threshold, choose a shallower
+- * non-polling state to improve latency
++ * non-polling state to improve latency, unless the scheduler tick has
++ * been stopped already and the shallower state's target residency is
++ * not sufficiently large.
+ */
+- if (cpu_data->utilized)
+- idx = teo_find_shallower_state(drv, dev, idx, duration_ns, true);
++ if (cpu_data->utilized) {
++ s64 span_ns;
++
++ i = teo_find_shallower_state(drv, dev, idx, duration_ns, true);
++ span_ns = teo_middle_of_bin(i, drv);
++ if (teo_time_ok(span_ns)) {
++ idx = i;
++ duration_ns = span_ns;
++ }
++ }
+
+ end:
+ /*
+diff --git a/drivers/crypto/caam/caampkc.c b/drivers/crypto/caam/caampkc.c
+index 72afc249d42fb..7e08af751e4ea 100644
+--- a/drivers/crypto/caam/caampkc.c
++++ b/drivers/crypto/caam/caampkc.c
+@@ -225,7 +225,9 @@ static int caam_rsa_count_leading_zeros(struct scatterlist *sgl,
+ if (len && *buff)
+ break;
+
+- sg_miter_next(&miter);
++ if (!sg_miter_next(&miter))
++ break;
++
+ buff = miter.addr;
+ len = miter.length;
+
+diff --git a/drivers/crypto/intel/qat/qat_common/adf_gen4_pm.h b/drivers/crypto/intel/qat/qat_common/adf_gen4_pm.h
+index f8f8a9ee29e5b..db4326933d1c0 100644
+--- a/drivers/crypto/intel/qat/qat_common/adf_gen4_pm.h
++++ b/drivers/crypto/intel/qat/qat_common/adf_gen4_pm.h
+@@ -35,7 +35,7 @@
+ #define ADF_GEN4_PM_MSG_PENDING BIT(0)
+ #define ADF_GEN4_PM_MSG_PAYLOAD_BIT_MASK GENMASK(28, 1)
+
+-#define ADF_GEN4_PM_DEFAULT_IDLE_FILTER (0x0)
++#define ADF_GEN4_PM_DEFAULT_IDLE_FILTER (0x6)
+ #define ADF_GEN4_PM_MAX_IDLE_FILTER (0x7)
+
+ int adf_gen4_enable_pm(struct adf_accel_dev *accel_dev);
+diff --git a/drivers/crypto/stm32/stm32-hash.c b/drivers/crypto/stm32/stm32-hash.c
+index f0df32382719c..fabae6da627b9 100644
+--- a/drivers/crypto/stm32/stm32-hash.c
++++ b/drivers/crypto/stm32/stm32-hash.c
+@@ -492,7 +492,7 @@ static int stm32_hash_xmit_dma(struct stm32_hash_dev *hdev,
+
+ reg = stm32_hash_read(hdev, HASH_CR);
+
+- if (!hdev->pdata->has_mdmat) {
++ if (hdev->pdata->has_mdmat) {
+ if (mdma)
+ reg |= HASH_CR_MDMAT;
+ else
+@@ -627,9 +627,9 @@ static int stm32_hash_dma_send(struct stm32_hash_dev *hdev)
+ }
+
+ for_each_sg(rctx->sg, tsg, rctx->nents, i) {
++ sg[0] = *tsg;
+ len = sg->length;
+
+- sg[0] = *tsg;
+ if (sg_is_last(sg)) {
+ if (hdev->dma_mode == 1) {
+ len = (ALIGN(sg->length, 16) - 16);
+@@ -1705,9 +1705,7 @@ static int stm32_hash_remove(struct platform_device *pdev)
+ if (!hdev)
+ return -ENODEV;
+
+- ret = pm_runtime_resume_and_get(hdev->dev);
+- if (ret < 0)
+- return ret;
++ ret = pm_runtime_get_sync(hdev->dev);
+
+ stm32_hash_unregister_algs(hdev);
+
+@@ -1723,7 +1721,8 @@ static int stm32_hash_remove(struct platform_device *pdev)
+ pm_runtime_disable(hdev->dev);
+ pm_runtime_put_noidle(hdev->dev);
+
+- clk_disable_unprepare(hdev->clk);
++ if (ret >= 0)
++ clk_disable_unprepare(hdev->clk);
+
+ return 0;
+ }
+diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
+index e36cbb920ec88..9464f8d3cb5b4 100644
+--- a/drivers/devfreq/devfreq.c
++++ b/drivers/devfreq/devfreq.c
+@@ -763,6 +763,7 @@ static void devfreq_dev_release(struct device *dev)
+ dev_pm_opp_put_opp_table(devfreq->opp_table);
+
+ mutex_destroy(&devfreq->lock);
++ srcu_cleanup_notifier_head(&devfreq->transition_notifier_list);
+ kfree(devfreq);
+ }
+
+diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
+index f5f422f9b8507..b6221b4432fd3 100644
+--- a/drivers/dma/Kconfig
++++ b/drivers/dma/Kconfig
+@@ -211,6 +211,7 @@ config FSL_DMA
+ config FSL_EDMA
+ tristate "Freescale eDMA engine support"
+ depends on OF
++ depends on HAS_IOMEM
+ select DMA_ENGINE
+ select DMA_VIRTUAL_CHANNELS
+ help
+@@ -280,6 +281,7 @@ config IMX_SDMA
+
+ config INTEL_IDMA64
+ tristate "Intel integrated DMA 64-bit support"
++ depends on HAS_IOMEM
+ select DMA_ENGINE
+ select DMA_VIRTUAL_CHANNELS
+ help
+diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
+index 293739ac55969..a5c3eb4348325 100644
+--- a/drivers/dma/idxd/sysfs.c
++++ b/drivers/dma/idxd/sysfs.c
+@@ -1095,8 +1095,8 @@ static ssize_t wq_ats_disable_store(struct device *dev, struct device_attribute
+ if (wq->state != IDXD_WQ_DISABLED)
+ return -EPERM;
+
+- if (!idxd->hw.wq_cap.wq_ats_support)
+- return -EOPNOTSUPP;
++ if (!test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags))
++ return -EPERM;
+
+ rc = kstrtobool(buf, &ats_dis);
+ if (rc < 0)
+@@ -1131,8 +1131,8 @@ static ssize_t wq_prs_disable_store(struct device *dev, struct device_attribute
+ if (wq->state != IDXD_WQ_DISABLED)
+ return -EPERM;
+
+- if (!idxd->hw.wq_cap.wq_prs_support)
+- return -EOPNOTSUPP;
++ if (!test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags))
++ return -EPERM;
+
+ rc = kstrtobool(buf, &prs_dis);
+ if (rc < 0)
+@@ -1288,12 +1288,9 @@ static struct attribute *idxd_wq_attributes[] = {
+ NULL,
+ };
+
+-static bool idxd_wq_attr_op_config_invisible(struct attribute *attr,
+- struct idxd_device *idxd)
+-{
+- return attr == &dev_attr_wq_op_config.attr &&
+- !idxd->hw.wq_cap.op_config;
+-}
++/* A WQ attr is invisible if the feature is not supported in WQCAP. */
++#define idxd_wq_attr_invisible(name, cap_field, a, idxd) \
++ ((a) == &dev_attr_wq_##name.attr && !(idxd)->hw.wq_cap.cap_field)
+
+ static bool idxd_wq_attr_max_batch_size_invisible(struct attribute *attr,
+ struct idxd_device *idxd)
+@@ -1303,13 +1300,6 @@ static bool idxd_wq_attr_max_batch_size_invisible(struct attribute *attr,
+ idxd->data->type == IDXD_TYPE_IAX;
+ }
+
+-static bool idxd_wq_attr_wq_prs_disable_invisible(struct attribute *attr,
+- struct idxd_device *idxd)
+-{
+- return attr == &dev_attr_wq_prs_disable.attr &&
+- !idxd->hw.wq_cap.wq_prs_support;
+-}
+-
+ static umode_t idxd_wq_attr_visible(struct kobject *kobj,
+ struct attribute *attr, int n)
+ {
+@@ -1317,13 +1307,16 @@ static umode_t idxd_wq_attr_visible(struct kobject *kobj,
+ struct idxd_wq *wq = confdev_to_wq(dev);
+ struct idxd_device *idxd = wq->idxd;
+
+- if (idxd_wq_attr_op_config_invisible(attr, idxd))
++ if (idxd_wq_attr_invisible(op_config, op_config, attr, idxd))
+ return 0;
+
+ if (idxd_wq_attr_max_batch_size_invisible(attr, idxd))
+ return 0;
+
+- if (idxd_wq_attr_wq_prs_disable_invisible(attr, idxd))
++ if (idxd_wq_attr_invisible(prs_disable, wq_prs_support, attr, idxd))
++ return 0;
++
++ if (idxd_wq_attr_invisible(ats_disable, wq_ats_support, attr, idxd))
+ return 0;
+
+ return attr->mode;
+@@ -1480,7 +1473,7 @@ static ssize_t pasid_enabled_show(struct device *dev,
+ {
+ struct idxd_device *idxd = confdev_to_idxd(dev);
+
+- return sysfs_emit(buf, "%u\n", device_pasid_enabled(idxd));
++ return sysfs_emit(buf, "%u\n", device_user_pasid_enabled(idxd));
+ }
+ static DEVICE_ATTR_RO(pasid_enabled);
+
+diff --git a/drivers/dma/ste_dma40.c b/drivers/dma/ste_dma40.c
+index f093e08c23b16..3b09fdc507e04 100644
+--- a/drivers/dma/ste_dma40.c
++++ b/drivers/dma/ste_dma40.c
+@@ -3597,6 +3597,10 @@ static int __init d40_probe(struct platform_device *pdev)
+ spin_lock_init(&base->lcla_pool.lock);
+
+ base->irq = platform_get_irq(pdev, 0);
++ if (base->irq < 0) {
++ ret = base->irq;
++ goto destroy_cache;
++ }
+
+ ret = request_irq(base->irq, d40_handle_interrupt, 0, D40_NAME, base);
+ if (ret) {
+diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
+index a897b6aff3686..349ff6cfb3796 100644
+--- a/drivers/edac/i10nm_base.c
++++ b/drivers/edac/i10nm_base.c
+@@ -658,13 +658,49 @@ static struct pci_dev *get_ddr_munit(struct skx_dev *d, int i, u32 *offset, unsi
+ return mdev;
+ }
+
++/**
++ * i10nm_imc_absent() - Check whether the memory controller @imc is absent
++ *
++ * @imc : The pointer to the structure of memory controller EDAC device.
++ *
++ * RETURNS : true if the memory controller EDAC device is absent, false otherwise.
++ */
++static bool i10nm_imc_absent(struct skx_imc *imc)
++{
++ u32 mcmtr;
++ int i;
++
++ switch (res_cfg->type) {
++ case SPR:
++ for (i = 0; i < res_cfg->ddr_chan_num; i++) {
++ mcmtr = I10NM_GET_MCMTR(imc, i);
++ edac_dbg(1, "ch%d mcmtr reg %x\n", i, mcmtr);
++ if (mcmtr != ~0)
++ return false;
++ }
++
++ /*
++ * Some workstations' absent memory controllers still
++ * appear as PCIe devices, misleading the EDAC driver.
++		 * We observe that the MMIO registers of these absent
++		 * memory controllers consistently read back as ~0.
++		 *
++		 * So we identify a memory controller as absent by checking
++		 * whether its MMIO register "mcmtr" == ~0 in all its channels.
++ */
++ return true;
++ default:
++ return false;
++ }
++}
++
+ static int i10nm_get_ddr_munits(void)
+ {
+ struct pci_dev *mdev;
+ void __iomem *mbase;
+ unsigned long size;
+ struct skx_dev *d;
+- int i, j = 0;
++ int i, lmc, j = 0;
+ u32 reg, off;
+ u64 base;
+
+@@ -690,7 +726,7 @@ static int i10nm_get_ddr_munits(void)
+ edac_dbg(2, "socket%d mmio base 0x%llx (reg 0x%x)\n",
+ j++, base, reg);
+
+- for (i = 0; i < res_cfg->ddr_imc_num; i++) {
++ for (lmc = 0, i = 0; i < res_cfg->ddr_imc_num; i++) {
+ mdev = get_ddr_munit(d, i, &off, &size);
+
+ if (i == 0 && !mdev) {
+@@ -700,8 +736,6 @@ static int i10nm_get_ddr_munits(void)
+ if (!mdev)
+ continue;
+
+- d->imc[i].mdev = mdev;
+-
+ edac_dbg(2, "mc%d mmio base 0x%llx size 0x%lx (reg 0x%x)\n",
+ i, base + off, size, reg);
+
+@@ -712,7 +746,17 @@ static int i10nm_get_ddr_munits(void)
+ return -ENODEV;
+ }
+
+- d->imc[i].mbase = mbase;
++ d->imc[lmc].mbase = mbase;
++ if (i10nm_imc_absent(&d->imc[lmc])) {
++ pci_dev_put(mdev);
++ iounmap(mbase);
++ d->imc[lmc].mbase = NULL;
++ edac_dbg(2, "Skip absent mc%d\n", i);
++ continue;
++ } else {
++ d->imc[lmc].mdev = mdev;
++ lmc++;
++ }
+ }
+ }
+
+diff --git a/drivers/edac/igen6_edac.c b/drivers/edac/igen6_edac.c
+index 544dd19072eab..1a18693294db4 100644
+--- a/drivers/edac/igen6_edac.c
++++ b/drivers/edac/igen6_edac.c
+@@ -27,7 +27,7 @@
+ #include "edac_mc.h"
+ #include "edac_module.h"
+
+-#define IGEN6_REVISION "v2.5"
++#define IGEN6_REVISION "v2.5.1"
+
+ #define EDAC_MOD_STR "igen6_edac"
+ #define IGEN6_NMI_NAME "igen6_ibecc"
+@@ -1216,9 +1216,6 @@ static int igen6_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+ INIT_WORK(&ecclog_work, ecclog_work_cb);
+ init_irq_work(&ecclog_irq_work, ecclog_irq_work_cb);
+
+- /* Check if any pending errors before registering the NMI handler */
+- ecclog_handler();
+-
+ rc = register_err_handler();
+ if (rc)
+ goto fail3;
+@@ -1230,6 +1227,9 @@ static int igen6_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+ goto fail4;
+ }
+
++	/* Check for any pending errors before/during the registration of the error handler */
++ ecclog_handler();
++
+ igen6_debug_setup();
+ return 0;
+ fail4:
+diff --git a/drivers/extcon/Kconfig b/drivers/extcon/Kconfig
+index 290186e44e6bd..4dd52a6a5b48d 100644
+--- a/drivers/extcon/Kconfig
++++ b/drivers/extcon/Kconfig
+@@ -62,6 +62,7 @@ config EXTCON_INTEL_CHT_WC
+ tristate "Intel Cherrytrail Whiskey Cove PMIC extcon driver"
+ depends on INTEL_SOC_PMIC_CHTWC
+ depends on USB_SUPPORT
++ depends on POWER_SUPPLY
+ select USB_ROLE_SWITCH
+ help
+ Say Y here to enable extcon support for charger detection / control
+diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
+index f9040bd610812..285fe7ad490d1 100644
+--- a/drivers/firmware/arm_sdei.c
++++ b/drivers/firmware/arm_sdei.c
+@@ -1095,3 +1095,22 @@ int sdei_event_handler(struct pt_regs *regs,
+ return err;
+ }
+ NOKPROBE_SYMBOL(sdei_event_handler);
++
++void sdei_handler_abort(void)
++{
++ /*
++ * If the crash happened in an SDEI event handler then we need to
++ * finish the handler with the firmware so that we can have working
++ * interrupts in the crash kernel.
++ */
++ if (__this_cpu_read(sdei_active_critical_event)) {
++ pr_warn("still in SDEI critical event context, attempting to finish handler.\n");
++ __sdei_handler_abort();
++ __this_cpu_write(sdei_active_critical_event, NULL);
++ }
++ if (__this_cpu_read(sdei_active_normal_event)) {
++ pr_warn("still in SDEI normal event context, attempting to finish handler.\n");
++ __sdei_handler_abort();
++ __this_cpu_write(sdei_active_normal_event, NULL);
++ }
++}
+diff --git a/drivers/firmware/cirrus/cs_dsp.c b/drivers/firmware/cirrus/cs_dsp.c
+index ec056f6f40ce8..cc3a28f386a77 100644
+--- a/drivers/firmware/cirrus/cs_dsp.c
++++ b/drivers/firmware/cirrus/cs_dsp.c
+@@ -978,7 +978,8 @@ static int cs_dsp_create_control(struct cs_dsp *dsp,
+ ctl->alg_region.alg == alg_region->alg &&
+ ctl->alg_region.type == alg_region->type) {
+ if ((!subname && !ctl->subname) ||
+- (subname && !strncmp(ctl->subname, subname, ctl->subname_len))) {
++ (subname && (ctl->subname_len == subname_len) &&
++ !strncmp(ctl->subname, subname, ctl->subname_len))) {
+ if (!ctl->enabled)
+ ctl->enabled = 1;
+ return 0;
+diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
+index a0bfd31358ba9..9ae0d6d0c285f 100644
+--- a/drivers/firmware/efi/libstub/x86-stub.c
++++ b/drivers/firmware/efi/libstub/x86-stub.c
+@@ -61,7 +61,7 @@ preserve_pci_rom_image(efi_pci_io_protocol_t *pci, struct pci_setup_rom **__rom)
+ rom->data.type = SETUP_PCI;
+ rom->data.len = size - sizeof(struct setup_data);
+ rom->data.next = 0;
+- rom->pcilen = pci->romsize;
++ rom->pcilen = romsize;
+ *__rom = rom;
+
+ status = efi_call_proto(pci, pci.read, EfiPciIoWidthUint16,
+diff --git a/drivers/firmware/meson/meson_sm.c b/drivers/firmware/meson/meson_sm.c
+index 798bcdb05d84e..9a2656d73600b 100644
+--- a/drivers/firmware/meson/meson_sm.c
++++ b/drivers/firmware/meson/meson_sm.c
+@@ -292,6 +292,8 @@ static int __init meson_sm_probe(struct platform_device *pdev)
+ return -ENOMEM;
+
+ chip = of_match_device(meson_sm_ids, dev)->data;
++ if (!chip)
++ return -EINVAL;
+
+ if (chip->cmd_shmem_in_base) {
+ fw->sm_shmem_in_base = meson_sm_map_shmem(chip->cmd_shmem_in_base,
+diff --git a/drivers/firmware/ti_sci.c b/drivers/firmware/ti_sci.c
+index 039d92a595ec6..91aaa0ca9bde8 100644
+--- a/drivers/firmware/ti_sci.c
++++ b/drivers/firmware/ti_sci.c
+@@ -97,7 +97,6 @@ struct ti_sci_desc {
+ * @node: list head
+ * @host_id: Host ID
+ * @users: Number of users of this instance
+- * @is_suspending: Flag set to indicate in suspend path.
+ */
+ struct ti_sci_info {
+ struct device *dev;
+@@ -116,7 +115,6 @@ struct ti_sci_info {
+ u8 host_id;
+ /* protected by ti_sci_list_mutex */
+ int users;
+- bool is_suspending;
+ };
+
+ #define cl_to_ti_sci_info(c) container_of(c, struct ti_sci_info, cl)
+@@ -418,14 +416,14 @@ static inline int ti_sci_do_xfer(struct ti_sci_info *info,
+
+ ret = 0;
+
+- if (!info->is_suspending) {
++ if (system_state <= SYSTEM_RUNNING) {
+ /* And we wait for the response. */
+ timeout = msecs_to_jiffies(info->desc->max_rx_timeout_ms);
+ if (!wait_for_completion_timeout(&xfer->done, timeout))
+ ret = -ETIMEDOUT;
+ } else {
+ /*
+- * If we are suspending, we cannot use wait_for_completion_timeout
++ * If we are !running, we cannot use wait_for_completion_timeout
+ * during noirq phase, so we must manually poll the completion.
+ */
+ ret = read_poll_timeout_atomic(try_wait_for_completion, done_state,
+@@ -3281,35 +3279,6 @@ static int tisci_reboot_handler(struct notifier_block *nb, unsigned long mode,
+ return NOTIFY_BAD;
+ }
+
+-static void ti_sci_set_is_suspending(struct ti_sci_info *info, bool is_suspending)
+-{
+- info->is_suspending = is_suspending;
+-}
+-
+-static int ti_sci_suspend(struct device *dev)
+-{
+- struct ti_sci_info *info = dev_get_drvdata(dev);
+- /*
+- * We must switch operation to polled mode now as drivers and the genpd
+- * layer may make late TI SCI calls to change clock and device states
+- * from the noirq phase of suspend.
+- */
+- ti_sci_set_is_suspending(info, true);
+-
+- return 0;
+-}
+-
+-static int ti_sci_resume(struct device *dev)
+-{
+- struct ti_sci_info *info = dev_get_drvdata(dev);
+-
+- ti_sci_set_is_suspending(info, false);
+-
+- return 0;
+-}
+-
+-static DEFINE_SIMPLE_DEV_PM_OPS(ti_sci_pm_ops, ti_sci_suspend, ti_sci_resume);
+-
+ /* Description for K2G */
+ static const struct ti_sci_desc ti_sci_pmmc_k2g_desc = {
+ .default_host_id = 2,
+@@ -3516,7 +3485,6 @@ static struct platform_driver ti_sci_driver = {
+ .driver = {
+ .name = "ti-sci",
+ .of_match_table = of_match_ptr(ti_sci_of_match),
+- .pm = &ti_sci_pm_ops,
+ },
+ };
+ module_platform_driver(ti_sci_driver);
+diff --git a/drivers/fsi/fsi-master-aspeed.c b/drivers/fsi/fsi-master-aspeed.c
+index 7cec1772820d3..5eccab175e86b 100644
+--- a/drivers/fsi/fsi-master-aspeed.c
++++ b/drivers/fsi/fsi-master-aspeed.c
+@@ -454,6 +454,8 @@ static ssize_t cfam_reset_store(struct device *dev, struct device_attribute *att
+ gpiod_set_value(aspeed->cfam_reset_gpio, 1);
+ usleep_range(900, 1000);
+ gpiod_set_value(aspeed->cfam_reset_gpio, 0);
++ usleep_range(900, 1000);
++ opb_writel(aspeed, ctrl_base + FSI_MRESP0, cpu_to_be32(FSI_MRESP_RST_ALL_MASTER));
+ mutex_unlock(&aspeed->lock);
+ trace_fsi_master_aspeed_cfam_reset(false);
+
+diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
+index 5be8ad61523eb..6e7701f80929f 100644
+--- a/drivers/gpio/gpiolib.c
++++ b/drivers/gpio/gpiolib.c
+@@ -2175,12 +2175,18 @@ static bool gpiod_free_commit(struct gpio_desc *desc)
+
+ void gpiod_free(struct gpio_desc *desc)
+ {
+- if (desc && desc->gdev && gpiod_free_commit(desc)) {
+- module_put(desc->gdev->owner);
+- gpio_device_put(desc->gdev);
+- } else {
++ /*
++ * We must not use VALIDATE_DESC_VOID() as the underlying gdev->chip
++ * may already be NULL but we still want to put the references.
++ */
++ if (!desc)
++ return;
++
++ if (!gpiod_free_commit(desc))
+ WARN_ON(extra_checks);
+- }
++
++ module_put(desc->gdev->owner);
++ gpio_device_put(desc->gdev);
+ }
+
+ /**
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+index 3108f5219cf3b..f7770e9c9aaca 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+@@ -1231,6 +1231,9 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
+ u16 cmd;
+ int r;
+
++ if (!IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT))
++ return 0;
++
+ /* Bypass for VF */
+ if (amdgpu_sriov_vf(adev))
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+index 724e80c192973..cecae6c1e8935 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+@@ -556,6 +556,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+ crtc = (struct drm_crtc *)minfo->crtcs[i];
+ if (crtc && crtc->base.id == info->mode_crtc.id) {
+ struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
++
+ ui32 = amdgpu_crtc->crtc_id;
+ found = 1;
+ break;
+@@ -574,7 +575,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+ if (ret)
+ return ret;
+
+- ret = copy_to_user(out, &ip, min((size_t)size, sizeof(ip)));
++ ret = copy_to_user(out, &ip, min_t(size_t, size, sizeof(ip)));
+ return ret ? -EFAULT : 0;
+ }
+ case AMDGPU_INFO_HW_IP_COUNT: {
+@@ -720,17 +721,18 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+ ? -EFAULT : 0;
+ }
+ case AMDGPU_INFO_READ_MMR_REG: {
+- unsigned n, alloc_size;
++ unsigned int n, alloc_size;
+ uint32_t *regs;
+- unsigned se_num = (info->read_mmr_reg.instance >>
++ unsigned int se_num = (info->read_mmr_reg.instance >>
+ AMDGPU_INFO_MMR_SE_INDEX_SHIFT) &
+ AMDGPU_INFO_MMR_SE_INDEX_MASK;
+- unsigned sh_num = (info->read_mmr_reg.instance >>
++ unsigned int sh_num = (info->read_mmr_reg.instance >>
+ AMDGPU_INFO_MMR_SH_INDEX_SHIFT) &
+ AMDGPU_INFO_MMR_SH_INDEX_MASK;
+
+ /* set full masks if the userspace set all bits
+- * in the bitfields */
++ * in the bitfields
++ */
+ if (se_num == AMDGPU_INFO_MMR_SE_INDEX_MASK)
+ se_num = 0xffffffff;
+ else if (se_num >= AMDGPU_GFX_MAX_SE)
+@@ -882,7 +884,7 @@ int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
+ return ret;
+ }
+ case AMDGPU_INFO_VCE_CLOCK_TABLE: {
+- unsigned i;
++ unsigned int i;
+ struct drm_amdgpu_info_vce_clock_table vce_clk_table = {};
+ struct amd_vce_state *vce_state;
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c
+index de6d10390ab2f..9be6da37032a7 100644
+--- a/drivers/gpu/drm/amd/amdgpu/cik.c
++++ b/drivers/gpu/drm/amd/amdgpu/cik.c
+@@ -1574,17 +1574,8 @@ static void cik_pcie_gen3_enable(struct amdgpu_device *adev)
+ u16 bridge_cfg2, gpu_cfg2;
+ u32 max_lw, current_lw, tmp;
+
+- pcie_capability_read_word(root, PCI_EXP_LNKCTL,
+- &bridge_cfg);
+- pcie_capability_read_word(adev->pdev, PCI_EXP_LNKCTL,
+- &gpu_cfg);
+-
+- tmp16 = bridge_cfg | PCI_EXP_LNKCTL_HAWD;
+- pcie_capability_write_word(root, PCI_EXP_LNKCTL, tmp16);
+-
+- tmp16 = gpu_cfg | PCI_EXP_LNKCTL_HAWD;
+- pcie_capability_write_word(adev->pdev, PCI_EXP_LNKCTL,
+- tmp16);
++ pcie_capability_set_word(root, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_HAWD);
++ pcie_capability_set_word(adev->pdev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_HAWD);
+
+ tmp = RREG32_PCIE(ixPCIE_LC_STATUS1);
+ max_lw = (tmp & PCIE_LC_STATUS1__LC_DETECTED_LINK_WIDTH_MASK) >>
+@@ -1637,21 +1628,14 @@ static void cik_pcie_gen3_enable(struct amdgpu_device *adev)
+ msleep(100);
+
+ /* linkctl */
+- pcie_capability_read_word(root, PCI_EXP_LNKCTL,
+- &tmp16);
+- tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
+- tmp16 |= (bridge_cfg & PCI_EXP_LNKCTL_HAWD);
+- pcie_capability_write_word(root, PCI_EXP_LNKCTL,
+- tmp16);
+-
+- pcie_capability_read_word(adev->pdev,
+- PCI_EXP_LNKCTL,
+- &tmp16);
+- tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
+- tmp16 |= (gpu_cfg & PCI_EXP_LNKCTL_HAWD);
+- pcie_capability_write_word(adev->pdev,
+- PCI_EXP_LNKCTL,
+- tmp16);
++ pcie_capability_clear_and_set_word(root, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_HAWD,
++ bridge_cfg &
++ PCI_EXP_LNKCTL_HAWD);
++ pcie_capability_clear_and_set_word(adev->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_HAWD,
++ gpu_cfg &
++ PCI_EXP_LNKCTL_HAWD);
+
+ /* linkctl2 */
+ pcie_capability_read_word(root, PCI_EXP_LNKCTL2,
+diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
+index caee76ab71105..92f2ee412908d 100644
+--- a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
+@@ -136,14 +136,15 @@ static int psp_v13_0_wait_for_bootloader(struct psp_context *psp)
+ int ret;
+ int retry_loop;
+
++	/* Wait for the bootloader to signal that it is ready by setting bit 31
++	 * of C2PMSG_35 to 1. All other bits are expected to be cleared.
++	 * If there is an error while processing a command, bits[7:0] will be set.
++	 * This is applicable to PSP v13.0.6 and newer.
++ */
+ for (retry_loop = 0; retry_loop < 10; retry_loop++) {
+- /* Wait for bootloader to signify that is
+- ready having bit 31 of C2PMSG_35 set to 1 */
+- ret = psp_wait_for(psp,
+- SOC15_REG_OFFSET(MP0, 0, regMP0_SMN_C2PMSG_35),
+- 0x80000000,
+- 0x80000000,
+- false);
++ ret = psp_wait_for(
++ psp, SOC15_REG_OFFSET(MP0, 0, regMP0_SMN_C2PMSG_35),
++ 0x80000000, 0xffffffff, false);
+
+ if (ret == 0)
+ return 0;
+diff --git a/drivers/gpu/drm/amd/amdgpu/si.c b/drivers/gpu/drm/amd/amdgpu/si.c
+index 7f99e130acd06..fd34c2100bd96 100644
+--- a/drivers/gpu/drm/amd/amdgpu/si.c
++++ b/drivers/gpu/drm/amd/amdgpu/si.c
+@@ -2276,17 +2276,8 @@ static void si_pcie_gen3_enable(struct amdgpu_device *adev)
+ u16 bridge_cfg2, gpu_cfg2;
+ u32 max_lw, current_lw, tmp;
+
+- pcie_capability_read_word(root, PCI_EXP_LNKCTL,
+- &bridge_cfg);
+- pcie_capability_read_word(adev->pdev, PCI_EXP_LNKCTL,
+- &gpu_cfg);
+-
+- tmp16 = bridge_cfg | PCI_EXP_LNKCTL_HAWD;
+- pcie_capability_write_word(root, PCI_EXP_LNKCTL, tmp16);
+-
+- tmp16 = gpu_cfg | PCI_EXP_LNKCTL_HAWD;
+- pcie_capability_write_word(adev->pdev, PCI_EXP_LNKCTL,
+- tmp16);
++ pcie_capability_set_word(root, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_HAWD);
++ pcie_capability_set_word(adev->pdev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_HAWD);
+
+ tmp = RREG32_PCIE(PCIE_LC_STATUS1);
+ max_lw = (tmp & LC_DETECTED_LINK_WIDTH_MASK) >> LC_DETECTED_LINK_WIDTH_SHIFT;
+@@ -2331,21 +2322,14 @@ static void si_pcie_gen3_enable(struct amdgpu_device *adev)
+
+ mdelay(100);
+
+- pcie_capability_read_word(root, PCI_EXP_LNKCTL,
+- &tmp16);
+- tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
+- tmp16 |= (bridge_cfg & PCI_EXP_LNKCTL_HAWD);
+- pcie_capability_write_word(root, PCI_EXP_LNKCTL,
+- tmp16);
+-
+- pcie_capability_read_word(adev->pdev,
+- PCI_EXP_LNKCTL,
+- &tmp16);
+- tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
+- tmp16 |= (gpu_cfg & PCI_EXP_LNKCTL_HAWD);
+- pcie_capability_write_word(adev->pdev,
+- PCI_EXP_LNKCTL,
+- tmp16);
++ pcie_capability_clear_and_set_word(root, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_HAWD,
++ bridge_cfg &
++ PCI_EXP_LNKCTL_HAWD);
++ pcie_capability_clear_and_set_word(adev->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_HAWD,
++ gpu_cfg &
++ PCI_EXP_LNKCTL_HAWD);
+
+ pcie_capability_read_word(root, PCI_EXP_LNKCTL2,
+ &tmp16);
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+index 4dd9a85f5c724..e20d55edb0209 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+@@ -7993,10 +7993,12 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
+ * fast updates.
+ */
+ if (crtc->state->async_flip &&
+- acrtc_state->update_type != UPDATE_TYPE_FAST)
++ (acrtc_state->update_type != UPDATE_TYPE_FAST ||
++ get_mem_type(old_plane_state->fb) != get_mem_type(fb)))
+ drm_warn_once(state->dev,
+ "[PLANE:%d:%s] async flip with non-fast update\n",
+ plane->base.id, plane->name);
++
+ bundle->flip_addrs[planes_count].flip_immediate =
+ crtc->state->async_flip &&
+ acrtc_state->update_type == UPDATE_TYPE_FAST &&
+@@ -9953,6 +9955,11 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
+
+ /* Remove exiting planes if they are modified */
+ for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) {
++ if (old_plane_state->fb && new_plane_state->fb &&
++ get_mem_type(old_plane_state->fb) !=
++ get_mem_type(new_plane_state->fb))
++ lock_and_validation_needed = true;
++
+ ret = dm_update_plane_state(dc, state, plane,
+ old_plane_state,
+ new_plane_state,
+@@ -10200,9 +10207,20 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
+ struct dm_crtc_state *dm_new_crtc_state =
+ to_dm_crtc_state(new_crtc_state);
+
++ /*
++ * Only allow async flips for fast updates that don't change
++ * the FB pitch, the DCC state, rotation, etc.
++ */
++ if (new_crtc_state->async_flip && lock_and_validation_needed) {
++ drm_dbg_atomic(crtc->dev,
++ "[CRTC:%d:%s] async flips are only supported for fast updates\n",
++ crtc->base.id, crtc->name);
++ ret = -EINVAL;
++ goto fail;
++ }
++
+ dm_new_crtc_state->update_type = lock_and_validation_needed ?
+- UPDATE_TYPE_FULL :
+- UPDATE_TYPE_FAST;
++ UPDATE_TYPE_FULL : UPDATE_TYPE_FAST;
+ }
+
+ /* Must be success */
+diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
+index 30d4c6fd95f53..440fc0869a34b 100644
+--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
++++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
+@@ -398,18 +398,6 @@ static int dm_crtc_helper_atomic_check(struct drm_crtc *crtc,
+ return -EINVAL;
+ }
+
+- /*
+- * Only allow async flips for fast updates that don't change the FB
+- * pitch, the DCC state, rotation, etc.
+- */
+- if (crtc_state->async_flip &&
+- dm_crtc_state->update_type != UPDATE_TYPE_FAST) {
+- drm_dbg_atomic(crtc->dev,
+- "[CRTC:%d:%s] async flips are only supported for fast updates\n",
+- crtc->base.id, crtc->name);
+- return -EINVAL;
+- }
+-
+ /* In some use cases, like reset, no stream is attached */
+ if (!dm_crtc_state->stream)
+ return 0;
+diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_smu.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_smu.c
+index 925d6e13620ec..1bbf85defd611 100644
+--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_smu.c
++++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_smu.c
+@@ -32,6 +32,7 @@
+
+ #define MAX_INSTANCE 6
+ #define MAX_SEGMENT 6
++#define SMU_REGISTER_WRITE_RETRY_COUNT 5
+
+ struct IP_BASE_INSTANCE
+ {
+@@ -134,6 +135,8 @@ static int dcn315_smu_send_msg_with_param(
+ unsigned int msg_id, unsigned int param)
+ {
+ uint32_t result;
++ uint32_t i = 0;
++ uint32_t read_back_data;
+
+ result = dcn315_smu_wait_for_response(clk_mgr, 10, 200000);
+
+@@ -150,10 +153,19 @@ static int dcn315_smu_send_msg_with_param(
+ /* Set the parameter register for the SMU message, unit is Mhz */
+ REG_WRITE(MP1_SMN_C2PMSG_37, param);
+
+- /* Trigger the message transaction by writing the message ID */
+- generic_write_indirect_reg(CTX,
+- REG_NBIO(RSMU_INDEX), REG_NBIO(RSMU_DATA),
+- mmMP1_C2PMSG_3, msg_id);
++ for (i = 0; i < SMU_REGISTER_WRITE_RETRY_COUNT; i++) {
++ /* Trigger the message transaction by writing the message ID */
++ generic_write_indirect_reg(CTX,
++ REG_NBIO(RSMU_INDEX), REG_NBIO(RSMU_DATA),
++ mmMP1_C2PMSG_3, msg_id);
++ read_back_data = generic_read_indirect_reg(CTX,
++ REG_NBIO(RSMU_INDEX), REG_NBIO(RSMU_DATA),
++ mmMP1_C2PMSG_3);
++ if (read_back_data == msg_id)
++ break;
++ udelay(2);
++		smu_print("SMU msg id write failed %x times.\n", i + 1);
++ }
+
+ result = dcn315_smu_wait_for_response(clk_mgr, 10, 200000);
+
+diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+index 58e8fda04b861..ad28fdd87797f 100644
+--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
++++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+@@ -1795,10 +1795,13 @@ void dce110_enable_accelerated_mode(struct dc *dc, struct dc_state *context)
+ hws->funcs.edp_backlight_control(edp_link_with_sink, false);
+ }
+ /*resume from S3, no vbios posting, no need to power down again*/
++ clk_mgr_exit_optimized_pwr_state(dc, dc->clk_mgr);
++
+ power_down_all_hw_blocks(dc);
+ disable_vga_and_power_gate_all_controllers(dc);
+ if (edp_link_with_sink && !keep_edp_vdd_on)
+ dc->hwss.edp_power_control(edp_link_with_sink, false);
++ clk_mgr_optimize_pwr_state(dc, dc->clk_mgr);
+ }
+ bios_set_scratch_acc_mode_change(dc->ctx->dc_bios, 1);
+ }
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c
+index 6192851c59ed8..51265a812bdc8 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c
+@@ -75,6 +75,7 @@ static const struct hw_sequencer_funcs dcn301_funcs = {
+ .get_hw_state = dcn10_get_hw_state,
+ .clear_status_bits = dcn10_clear_status_bits,
+ .wait_for_mpcc_disconnect = dcn10_wait_for_mpcc_disconnect,
++ .edp_backlight_control = dce110_edp_backlight_control,
+ .edp_power_control = dce110_edp_power_control,
+ .edp_wait_for_hpd_ready = dce110_edp_wait_for_hpd_ready,
+ .set_cursor_position = dcn10_set_cursor_position,
+diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
+index 65c1d754e2d6b..01cc679ae4186 100644
+--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
++++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_dccg.c
+@@ -84,7 +84,8 @@ static enum phyd32clk_clock_source get_phy_mux_symclk(
+ struct dcn_dccg *dccg_dcn,
+ enum phyd32clk_clock_source src)
+ {
+- if (dccg_dcn->base.ctx->asic_id.hw_internal_rev == YELLOW_CARP_B0) {
++ if (dccg_dcn->base.ctx->asic_id.chip_family == FAMILY_YELLOW_CARP &&
++ dccg_dcn->base.ctx->asic_id.hw_internal_rev == YELLOW_CARP_B0) {
+ if (src == PHYD32CLKC)
+ src = PHYD32CLKF;
+ if (src == PHYD32CLKD)
+diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+index b878effa2129b..b428a343add9c 100644
+--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
++++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+@@ -33,7 +33,7 @@
+ #include "dml/display_mode_vba.h"
+
+ struct _vcs_dpi_ip_params_st dcn3_14_ip = {
+- .VBlankNomDefaultUS = 800,
++ .VBlankNomDefaultUS = 668,
+ .gpuvm_enable = 1,
+ .gpuvm_max_page_table_levels = 1,
+ .hostvm_enable = 1,
+diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+index f4f40459f22b9..a5b2a7d943f71 100644
+--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
++++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+@@ -2195,15 +2195,19 @@ static int amdgpu_device_attr_create(struct amdgpu_device *adev,
+ uint32_t mask, struct list_head *attr_list)
+ {
+ int ret = 0;
+- struct device_attribute *dev_attr = &attr->dev_attr;
+- const char *name = dev_attr->attr.name;
+ enum amdgpu_device_attr_states attr_states = ATTR_STATE_SUPPORTED;
+ struct amdgpu_device_attr_entry *attr_entry;
++ struct device_attribute *dev_attr;
++ const char *name;
+
+ int (*attr_update)(struct amdgpu_device *adev, struct amdgpu_device_attr *attr,
+ uint32_t mask, enum amdgpu_device_attr_states *states) = default_attr_update;
+
+- BUG_ON(!attr);
++ if (!attr)
++ return -EINVAL;
++
++ dev_attr = &attr->dev_attr;
++ name = dev_attr->attr.name;
+
+ attr_update = attr->attr_update ? attr->attr_update : default_attr_update;
+
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+index d7f09af2fb018..419a247dfbbf2 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
+@@ -1323,7 +1323,7 @@ static ssize_t smu_v13_0_0_get_gpu_metrics(struct smu_context *smu,
+ gpu_metrics->average_vclk1_frequency = metrics->AverageVclk1Frequency;
+ gpu_metrics->average_dclk1_frequency = metrics->AverageDclk1Frequency;
+
+- gpu_metrics->current_gfxclk = metrics->CurrClock[PPCLK_GFXCLK];
++ gpu_metrics->current_gfxclk = gpu_metrics->average_gfxclk_frequency;
+ gpu_metrics->current_socclk = metrics->CurrClock[PPCLK_SOCCLK];
+ gpu_metrics->current_uclk = metrics->CurrClock[PPCLK_UCLK];
+ gpu_metrics->current_vclk0 = metrics->CurrClock[PPCLK_VCLK_0];
+diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
+index c9093517b1bda..bfa020fe0d4fe 100644
+--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
++++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
+@@ -697,16 +697,19 @@ static int smu_v13_0_6_get_smu_metrics_data(struct smu_context *smu,
+ *value = SMUQ10_TO_UINT(metrics->SocketPower) << 8;
+ break;
+ case METRICS_TEMPERATURE_HOTSPOT:
+- *value = SMUQ10_TO_UINT(metrics->MaxSocketTemperature);
++ *value = SMUQ10_TO_UINT(metrics->MaxSocketTemperature) *
++ SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+ break;
+ case METRICS_TEMPERATURE_MEM:
+- *value = SMUQ10_TO_UINT(metrics->MaxHbmTemperature);
++ *value = SMUQ10_TO_UINT(metrics->MaxHbmTemperature) *
++ SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+ break;
+ /* This is the max of all VRs and not just SOC VR.
+ * No need to define another data type for the same.
+ */
+ case METRICS_TEMPERATURE_VRSOC:
+- *value = SMUQ10_TO_UINT(metrics->MaxVrTemperature);
++ *value = SMUQ10_TO_UINT(metrics->MaxVrTemperature) *
++ SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+ break;
+ case METRICS_THROTTLER_STATUS:
+ *value = smu_v13_0_6_get_throttler_status(smu, metrics);
+diff --git a/drivers/gpu/drm/armada/armada_overlay.c b/drivers/gpu/drm/armada/armada_overlay.c
+index f21eb8fb76d87..3b9bd8ecda137 100644
+--- a/drivers/gpu/drm/armada/armada_overlay.c
++++ b/drivers/gpu/drm/armada/armada_overlay.c
+@@ -4,6 +4,8 @@
+ * Rewritten from the dovefb driver, and Armada510 manuals.
+ */
+
++#include <linux/bitfield.h>
++
+ #include <drm/armada_drm.h>
+ #include <drm/drm_atomic.h>
+ #include <drm/drm_atomic_helper.h>
+@@ -445,8 +447,8 @@ static int armada_overlay_get_property(struct drm_plane *plane,
+ drm_to_overlay_state(state)->colorkey_ug,
+ drm_to_overlay_state(state)->colorkey_vb, 0);
+ } else if (property == priv->colorkey_mode_prop) {
+- *val = (drm_to_overlay_state(state)->colorkey_mode &
+- CFG_CKMODE_MASK) >> ffs(CFG_CKMODE_MASK);
++ *val = FIELD_GET(CFG_CKMODE_MASK,
++ drm_to_overlay_state(state)->colorkey_mode);
+ } else if (property == priv->brightness_prop) {
+ *val = drm_to_overlay_state(state)->brightness + 256;
+ } else if (property == priv->contrast_prop) {
+diff --git a/drivers/gpu/drm/ast/ast_dp.c b/drivers/gpu/drm/ast/ast_dp.c
+index 6dc1a09504e13..fdd9a493aa9c0 100644
+--- a/drivers/gpu/drm/ast/ast_dp.c
++++ b/drivers/gpu/drm/ast/ast_dp.c
+@@ -7,6 +7,17 @@
+ #include <drm/drm_print.h>
+ #include "ast_drv.h"
+
++bool ast_astdp_is_connected(struct ast_device *ast)
++{
++ if (!ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xD1, ASTDP_MCU_FW_EXECUTING))
++ return false;
++ if (!ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xDF, ASTDP_HPD))
++ return false;
++ if (!ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xDC, ASTDP_LINK_SUCCESS))
++ return false;
++ return true;
++}
++
+ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata)
+ {
+ struct ast_device *ast = to_ast_device(dev);
+diff --git a/drivers/gpu/drm/ast/ast_dp501.c b/drivers/gpu/drm/ast/ast_dp501.c
+index 1bc35a992369d..fa7442b0c2612 100644
+--- a/drivers/gpu/drm/ast/ast_dp501.c
++++ b/drivers/gpu/drm/ast/ast_dp501.c
+@@ -272,11 +272,9 @@ static bool ast_launch_m68k(struct drm_device *dev)
+ return true;
+ }
+
+-bool ast_dp501_read_edid(struct drm_device *dev, u8 *ediddata)
++bool ast_dp501_is_connected(struct ast_device *ast)
+ {
+- struct ast_device *ast = to_ast_device(dev);
+- u32 i, boot_address, offset, data;
+- u32 *pEDIDidx;
++ u32 boot_address, offset, data;
+
+ if (ast->config_mode == ast_use_p2a) {
+ boot_address = get_fw_base(ast);
+@@ -292,14 +290,6 @@ bool ast_dp501_read_edid(struct drm_device *dev, u8 *ediddata)
+ data = ast_mindwm(ast, boot_address + offset);
+ if (!(data & AST_DP501_PNP_CONNECTED))
+ return false;
+-
+- /* Read EDID */
+- offset = AST_DP501_EDID_DATA;
+- for (i = 0; i < 128; i += 4) {
+- data = ast_mindwm(ast, boot_address + offset + i);
+- pEDIDidx = (u32 *)(ediddata + i);
+- *pEDIDidx = data;
+- }
+ } else {
+ if (!ast->dp501_fw_buf)
+ return false;
+@@ -319,7 +309,30 @@ bool ast_dp501_read_edid(struct drm_device *dev, u8 *ediddata)
+ data = readl(ast->dp501_fw_buf + offset);
+ if (!(data & AST_DP501_PNP_CONNECTED))
+ return false;
++ }
++ return true;
++}
++
++bool ast_dp501_read_edid(struct drm_device *dev, u8 *ediddata)
++{
++ struct ast_device *ast = to_ast_device(dev);
++ u32 i, boot_address, offset, data;
++ u32 *pEDIDidx;
++
++ if (!ast_dp501_is_connected(ast))
++ return false;
++
++ if (ast->config_mode == ast_use_p2a) {
++ boot_address = get_fw_base(ast);
+
++ /* Read EDID */
++ offset = AST_DP501_EDID_DATA;
++ for (i = 0; i < 128; i += 4) {
++ data = ast_mindwm(ast, boot_address + offset + i);
++ pEDIDidx = (u32 *)(ediddata + i);
++ *pEDIDidx = data;
++ }
++ } else {
+ /* Read EDID */
+ offset = AST_DP501_EDID_DATA;
+ for (i = 0; i < 128; i += 4) {
+diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
+index 5498a6676f2e8..8a0ffa8b5939b 100644
+--- a/drivers/gpu/drm/ast/ast_drv.h
++++ b/drivers/gpu/drm/ast/ast_drv.h
+@@ -468,6 +468,7 @@ void ast_patch_ahb_2500(struct ast_device *ast);
+ /* ast dp501 */
+ void ast_set_dp501_video_output(struct drm_device *dev, u8 mode);
+ bool ast_backup_fw(struct drm_device *dev, u8 *addr, u32 size);
++bool ast_dp501_is_connected(struct ast_device *ast);
+ bool ast_dp501_read_edid(struct drm_device *dev, u8 *ediddata);
+ u8 ast_get_dp501_max_clk(struct drm_device *dev);
+ void ast_init_3rdtx(struct drm_device *dev);
+@@ -476,6 +477,7 @@ void ast_init_3rdtx(struct drm_device *dev);
+ struct ast_i2c_chan *ast_i2c_create(struct drm_device *dev);
+
+ /* aspeed DP */
++bool ast_astdp_is_connected(struct ast_device *ast);
+ int ast_astdp_read_edid(struct drm_device *dev, u8 *ediddata);
+ void ast_dp_launch(struct drm_device *dev);
+ void ast_dp_power_on_off(struct drm_device *dev, bool no);
+diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
+index b3c670af6ef2b..0724516f29737 100644
+--- a/drivers/gpu/drm/ast/ast_mode.c
++++ b/drivers/gpu/drm/ast/ast_mode.c
+@@ -1585,8 +1585,20 @@ err_drm_connector_update_edid_property:
+ return 0;
+ }
+
++static int ast_dp501_connector_helper_detect_ctx(struct drm_connector *connector,
++ struct drm_modeset_acquire_ctx *ctx,
++ bool force)
++{
++ struct ast_device *ast = to_ast_device(connector->dev);
++
++ if (ast_dp501_is_connected(ast))
++ return connector_status_connected;
++ return connector_status_disconnected;
++}
++
+ static const struct drm_connector_helper_funcs ast_dp501_connector_helper_funcs = {
+ .get_modes = ast_dp501_connector_helper_get_modes,
++ .detect_ctx = ast_dp501_connector_helper_detect_ctx,
+ };
+
+ static const struct drm_connector_funcs ast_dp501_connector_funcs = {
+@@ -1611,7 +1623,7 @@ static int ast_dp501_connector_init(struct drm_device *dev, struct drm_connector
+ connector->interlace_allowed = 0;
+ connector->doublescan_allowed = 0;
+
+- connector->polled = DRM_CONNECTOR_POLL_CONNECT;
++ connector->polled = DRM_CONNECTOR_POLL_CONNECT | DRM_CONNECTOR_POLL_DISCONNECT;
+
+ return 0;
+ }
+@@ -1683,8 +1695,20 @@ err_drm_connector_update_edid_property:
+ return 0;
+ }
+
++static int ast_astdp_connector_helper_detect_ctx(struct drm_connector *connector,
++ struct drm_modeset_acquire_ctx *ctx,
++ bool force)
++{
++ struct ast_device *ast = to_ast_device(connector->dev);
++
++ if (ast_astdp_is_connected(ast))
++ return connector_status_connected;
++ return connector_status_disconnected;
++}
++
+ static const struct drm_connector_helper_funcs ast_astdp_connector_helper_funcs = {
+ .get_modes = ast_astdp_connector_helper_get_modes,
++ .detect_ctx = ast_astdp_connector_helper_detect_ctx,
+ };
+
+ static const struct drm_connector_funcs ast_astdp_connector_funcs = {
+@@ -1709,7 +1733,7 @@ static int ast_astdp_connector_init(struct drm_device *dev, struct drm_connector
+ connector->interlace_allowed = 0;
+ connector->doublescan_allowed = 0;
+
+- connector->polled = DRM_CONNECTOR_POLL_CONNECT;
++ connector->polled = DRM_CONNECTOR_POLL_CONNECT | DRM_CONNECTOR_POLL_DISCONNECT;
+
+ return 0;
+ }
+@@ -1848,5 +1872,7 @@ int ast_mode_config_init(struct ast_device *ast)
+
+ drm_mode_config_reset(dev);
+
++ drm_kms_helper_poll_init(dev);
++
+ return 0;
+ }
+diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+index ddceafa7b6374..8d6c93296503e 100644
+--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
++++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+@@ -786,8 +786,13 @@ static void adv7511_mode_set(struct adv7511 *adv7511,
+ else
+ low_refresh_rate = ADV7511_LOW_REFRESH_RATE_NONE;
+
+- regmap_update_bits(adv7511->regmap, 0xfb,
+- 0x6, low_refresh_rate << 1);
++ if (adv7511->type == ADV7511)
++ regmap_update_bits(adv7511->regmap, 0xfb,
++ 0x6, low_refresh_rate << 1);
++ else
++ regmap_update_bits(adv7511->regmap, 0x4a,
++ 0xc, low_refresh_rate << 2);
++
+ regmap_update_bits(adv7511->regmap, 0x17,
+ 0x60, (vsync_polarity << 6) | (hsync_polarity << 5));
+
+diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c b/drivers/gpu/drm/bridge/analogix/anx7625.c
+index 9e387c3e9b696..666a2b5c0c5c0 100644
+--- a/drivers/gpu/drm/bridge/analogix/anx7625.c
++++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
+@@ -873,11 +873,11 @@ static int anx7625_hdcp_enable(struct anx7625_data *ctx)
+ }
+
+ /* Read downstream capability */
+- ret = anx7625_aux_trans(ctx, DP_AUX_NATIVE_READ, 0x68028, 1, &bcap);
++ ret = anx7625_aux_trans(ctx, DP_AUX_NATIVE_READ, DP_AUX_HDCP_BCAPS, 1, &bcap);
+ if (ret < 0)
+ return ret;
+
+- if (!(bcap & 0x01)) {
++ if (!(bcap & DP_BCAPS_HDCP_CAPABLE)) {
+ pr_warn("downstream not support HDCP 1.4, cap(%x).\n", bcap);
+ return 0;
+ }
+@@ -932,8 +932,8 @@ static void anx7625_dp_start(struct anx7625_data *ctx)
+
+ dev_dbg(dev, "set downstream sink into normal\n");
+ /* Downstream sink enter into normal mode */
+- data = 1;
+- ret = anx7625_aux_trans(ctx, DP_AUX_NATIVE_WRITE, 0x000600, 1, &data);
++ data = DP_SET_POWER_D0;
++ ret = anx7625_aux_trans(ctx, DP_AUX_NATIVE_WRITE, DP_SET_POWER, 1, &data);
+ if (ret < 0)
+ dev_err(dev, "IO error : set sink into normal mode fail\n");
+
+@@ -972,8 +972,8 @@ static void anx7625_dp_stop(struct anx7625_data *ctx)
+
+ dev_dbg(dev, "notify downstream enter into standby\n");
+ /* Downstream monitor enter into standby mode */
+- data = 2;
+- ret |= anx7625_aux_trans(ctx, DP_AUX_NATIVE_WRITE, 0x000600, 1, &data);
++ data = DP_SET_POWER_D3;
++ ret |= anx7625_aux_trans(ctx, DP_AUX_NATIVE_WRITE, DP_SET_POWER, 1, &data);
+ if (ret < 0)
+ DRM_DEV_ERROR(dev, "IO error : mute video fail\n");
+
+diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+index b2efecf7d1603..4291798bd70f5 100644
+--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
++++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+@@ -265,6 +265,7 @@ struct dw_mipi_dsi {
+ struct dw_mipi_dsi *master; /* dual-dsi master ptr */
+ struct dw_mipi_dsi *slave; /* dual-dsi slave ptr */
+
++ struct drm_display_mode mode;
+ const struct dw_mipi_dsi_plat_data *plat_data;
+ };
+
+@@ -332,6 +333,7 @@ static int dw_mipi_dsi_host_attach(struct mipi_dsi_host *host,
+ if (IS_ERR(bridge))
+ return PTR_ERR(bridge);
+
++ bridge->pre_enable_prev_first = true;
+ dsi->panel_bridge = bridge;
+
+ drm_bridge_add(&dsi->bridge);
+@@ -859,15 +861,6 @@ static void dw_mipi_dsi_bridge_post_atomic_disable(struct drm_bridge *bridge,
+ */
+ dw_mipi_dsi_set_mode(dsi, 0);
+
+- /*
+- * TODO Only way found to call panel-bridge post_disable &
+- * panel unprepare before the dsi "final" disable...
+- * This needs to be fixed in the drm_bridge framework and the API
+- * needs to be updated to manage our own call chains...
+- */
+- if (dsi->panel_bridge->funcs->post_disable)
+- dsi->panel_bridge->funcs->post_disable(dsi->panel_bridge);
+-
+ if (phy_ops->power_off)
+ phy_ops->power_off(dsi->plat_data->priv_data);
+
+@@ -942,15 +935,25 @@ static void dw_mipi_dsi_mode_set(struct dw_mipi_dsi *dsi,
+ phy_ops->power_on(dsi->plat_data->priv_data);
+ }
+
++static void dw_mipi_dsi_bridge_atomic_pre_enable(struct drm_bridge *bridge,
++ struct drm_bridge_state *old_bridge_state)
++{
++ struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
++
++ /* Power up the dsi ctl into a command mode */
++ dw_mipi_dsi_mode_set(dsi, &dsi->mode);
++ if (dsi->slave)
++ dw_mipi_dsi_mode_set(dsi->slave, &dsi->mode);
++}
++
+ static void dw_mipi_dsi_bridge_mode_set(struct drm_bridge *bridge,
+ const struct drm_display_mode *mode,
+ const struct drm_display_mode *adjusted_mode)
+ {
+ struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
+
+- dw_mipi_dsi_mode_set(dsi, adjusted_mode);
+- if (dsi->slave)
+- dw_mipi_dsi_mode_set(dsi->slave, adjusted_mode);
++ /* Store the display mode for later use in pre_enable callback */
++ drm_mode_copy(&dsi->mode, adjusted_mode);
+ }
+
+ static void dw_mipi_dsi_bridge_atomic_enable(struct drm_bridge *bridge,
+@@ -1004,6 +1007,7 @@ static const struct drm_bridge_funcs dw_mipi_dsi_bridge_funcs = {
+ .atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state,
+ .atomic_destroy_state = drm_atomic_helper_bridge_destroy_state,
+ .atomic_reset = drm_atomic_helper_bridge_reset,
++ .atomic_pre_enable = dw_mipi_dsi_bridge_atomic_pre_enable,
+ .atomic_enable = dw_mipi_dsi_bridge_atomic_enable,
+ .atomic_post_disable = dw_mipi_dsi_bridge_post_atomic_disable,
+ .mode_set = dw_mipi_dsi_bridge_mode_set,
+diff --git a/drivers/gpu/drm/bridge/tc358764.c b/drivers/gpu/drm/bridge/tc358764.c
+index f85654f1b1045..8e938a7480f37 100644
+--- a/drivers/gpu/drm/bridge/tc358764.c
++++ b/drivers/gpu/drm/bridge/tc358764.c
+@@ -176,7 +176,7 @@ static void tc358764_read(struct tc358764 *ctx, u16 addr, u32 *val)
+ if (ret >= 0)
+ le32_to_cpus(val);
+
+- dev_dbg(ctx->dev, "read: %d, addr: %d\n", addr, *val);
++ dev_dbg(ctx->dev, "read: addr=0x%04x data=0x%08x\n", addr, *val);
+ }
+
+ static void tc358764_write(struct tc358764 *ctx, u16 addr, u32 val)
+diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
+index 44b5f3c35aabe..898f84a0fc30c 100644
+--- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
++++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
+@@ -130,9 +130,9 @@ void etnaviv_core_dump(struct etnaviv_gem_submit *submit)
+ return;
+ etnaviv_dump_core = false;
+
+- mutex_lock(&gpu->mmu_context->lock);
++ mutex_lock(&submit->mmu_context->lock);
+
+- mmu_size = etnaviv_iommu_dump_size(gpu->mmu_context);
++ mmu_size = etnaviv_iommu_dump_size(submit->mmu_context);
+
+ /* We always dump registers, mmu, ring, hanging cmdbuf and end marker */
+ n_obj = 5;
+@@ -162,7 +162,7 @@ void etnaviv_core_dump(struct etnaviv_gem_submit *submit)
+ iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN |
+ __GFP_NORETRY);
+ if (!iter.start) {
+- mutex_unlock(&gpu->mmu_context->lock);
++ mutex_unlock(&submit->mmu_context->lock);
+ dev_warn(gpu->dev, "failed to allocate devcoredump file\n");
+ return;
+ }
+@@ -174,18 +174,18 @@ void etnaviv_core_dump(struct etnaviv_gem_submit *submit)
+ memset(iter.hdr, 0, iter.data - iter.start);
+
+ etnaviv_core_dump_registers(&iter, gpu);
+- etnaviv_core_dump_mmu(&iter, gpu->mmu_context, mmu_size);
++ etnaviv_core_dump_mmu(&iter, submit->mmu_context, mmu_size);
+ etnaviv_core_dump_mem(&iter, ETDUMP_BUF_RING, gpu->buffer.vaddr,
+ gpu->buffer.size,
+ etnaviv_cmdbuf_get_va(&gpu->buffer,
+- &gpu->mmu_context->cmdbuf_mapping));
++ &submit->mmu_context->cmdbuf_mapping));
+
+ etnaviv_core_dump_mem(&iter, ETDUMP_BUF_CMD,
+ submit->cmdbuf.vaddr, submit->cmdbuf.size,
+ etnaviv_cmdbuf_get_va(&submit->cmdbuf,
+- &gpu->mmu_context->cmdbuf_mapping));
++ &submit->mmu_context->cmdbuf_mapping));
+
+- mutex_unlock(&gpu->mmu_context->lock);
++ mutex_unlock(&submit->mmu_context->lock);
+
+ /* Reserve space for the bomap */
+ if (n_bomap_pages) {
+diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
+index f830d62a5ce60..559ce242919df 100644
+--- a/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
++++ b/drivers/gpu/drm/hyperv/hyperv_drm_drv.c
+@@ -7,6 +7,7 @@
+ #include <linux/hyperv.h>
+ #include <linux/module.h>
+ #include <linux/pci.h>
++#include <linux/screen_info.h>
+
+ #include <drm/drm_aperture.h>
+ #include <drm/drm_atomic_helper.h>
+diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c
+index c0a38f5217eee..f2f6a5c01a6d2 100644
+--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c
++++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c
+@@ -426,7 +426,7 @@ static int ovl_adaptor_comp_init(struct device *dev, struct component_match **ma
+ continue;
+ }
+
+- type = (enum mtk_ovl_adaptor_comp_type)of_id->data;
++ type = (enum mtk_ovl_adaptor_comp_type)(uintptr_t)of_id->data;
+ id = ovl_adaptor_comp_get_id(dev, node, type);
+ if (id < 0) {
+ dev_warn(dev, "Skipping unknown component %pOF\n",
+diff --git a/drivers/gpu/drm/mediatek/mtk_dp.c b/drivers/gpu/drm/mediatek/mtk_dp.c
+index 64eee77452c04..c58b775877a31 100644
+--- a/drivers/gpu/drm/mediatek/mtk_dp.c
++++ b/drivers/gpu/drm/mediatek/mtk_dp.c
+@@ -1588,7 +1588,9 @@ static int mtk_dp_parse_capabilities(struct mtk_dp *mtk_dp)
+ u8 val;
+ ssize_t ret;
+
+- drm_dp_read_dpcd_caps(&mtk_dp->aux, mtk_dp->rx_cap);
++ ret = drm_dp_read_dpcd_caps(&mtk_dp->aux, mtk_dp->rx_cap);
++ if (ret < 0)
++ return ret;
+
+ if (drm_dp_tps4_supported(mtk_dp->rx_cap))
+ mtk_dp->train_info.channel_eq_pattern = DP_TRAINING_PATTERN_4;
+@@ -1615,10 +1617,13 @@ static int mtk_dp_parse_capabilities(struct mtk_dp *mtk_dp)
+ return ret == 0 ? -EIO : ret;
+ }
+
+- if (val)
+- drm_dp_dpcd_writeb(&mtk_dp->aux,
+- DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0,
+- val);
++ if (val) {
++ ret = drm_dp_dpcd_writeb(&mtk_dp->aux,
++ DP_DEVICE_SERVICE_IRQ_VECTOR_ESI0,
++ val);
++ if (ret < 0)
++ return ret;
++ }
+ }
+
+ return 0;
+diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+index d40142842f85c..8d44f3df116fa 100644
+--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
++++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+@@ -116,10 +116,9 @@ static int mtk_drm_cmdq_pkt_create(struct cmdq_client *client, struct cmdq_pkt *
+ dma_addr_t dma_addr;
+
+ pkt->va_base = kzalloc(size, GFP_KERNEL);
+- if (!pkt->va_base) {
+- kfree(pkt);
++ if (!pkt->va_base)
+ return -ENOMEM;
+- }
++
+ pkt->buf_size = size;
+ pkt->cl = (void *)client;
+
+@@ -129,7 +128,6 @@ static int mtk_drm_cmdq_pkt_create(struct cmdq_client *client, struct cmdq_pkt *
+ if (dma_mapping_error(dev, dma_addr)) {
+ dev_err(dev, "dma map failed, size=%u\n", (u32)(u64)size);
+ kfree(pkt->va_base);
+- kfree(pkt);
+ return -ENOMEM;
+ }
+
+@@ -145,7 +143,6 @@ static void mtk_drm_cmdq_pkt_destroy(struct cmdq_pkt *pkt)
+ dma_unmap_single(client->chan->mbox->dev, pkt->pa_base, pkt->buf_size,
+ DMA_TO_DEVICE);
+ kfree(pkt->va_base);
+- kfree(pkt);
+ }
+ #endif
+
+diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+index f114da4d36a96..771f4e1733539 100644
+--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
++++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+@@ -563,14 +563,15 @@ int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp,
+ /* Not all drm components have a DTS device node, such as ovl_adaptor,
+ * which is the drm bring up sub driver
+ */
+- if (node) {
+- comp_pdev = of_find_device_by_node(node);
+- if (!comp_pdev) {
+- DRM_INFO("Waiting for device %s\n", node->full_name);
+- return -EPROBE_DEFER;
+- }
+- comp->dev = &comp_pdev->dev;
++ if (!node)
++ return 0;
++
++ comp_pdev = of_find_device_by_node(node);
++ if (!comp_pdev) {
++ DRM_INFO("Waiting for device %s\n", node->full_name);
++ return -EPROBE_DEFER;
+ }
++ comp->dev = &comp_pdev->dev;
+
+ if (type == MTK_DISP_AAL ||
+ type == MTK_DISP_BLS ||
+@@ -580,7 +581,6 @@ int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp,
+ type == MTK_DISP_MERGE ||
+ type == MTK_DISP_OVL ||
+ type == MTK_DISP_OVL_2L ||
+- type == MTK_DISP_OVL_ADAPTOR ||
+ type == MTK_DISP_PWM ||
+ type == MTK_DISP_RDMA ||
+ type == MTK_DPI ||
+diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+index 6dcb4ba2466c0..30d10f21562f4 100644
+--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
++++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+@@ -354,7 +354,7 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
+ const struct of_device_id *of_id;
+ struct device_node *node;
+ struct device *drm_dev;
+- int cnt = 0;
++ unsigned int cnt = 0;
+ int i, j;
+
+ for_each_child_of_node(phandle->parent, node) {
+@@ -375,6 +375,9 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
+ all_drm_priv[cnt] = dev_get_drvdata(drm_dev);
+ if (all_drm_priv[cnt] && all_drm_priv[cnt]->mtk_drm_bound)
+ cnt++;
++
++ if (cnt == MAX_CRTC)
++ break;
+ }
+
+ if (drm_priv->data->mmsys_dev_num == cnt) {
+@@ -829,7 +832,7 @@ static int mtk_drm_probe(struct platform_device *pdev)
+ continue;
+ }
+
+- comp_type = (enum mtk_ddp_comp_type)of_id->data;
++ comp_type = (enum mtk_ddp_comp_type)(uintptr_t)of_id->data;
+
+ if (comp_type == MTK_DISP_MUTEX) {
+ int id;
+diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+index a25b28d3ee902..9f364df52478d 100644
+--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
++++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+@@ -247,7 +247,11 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map)
+
+ mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
+ pgprot_writecombine(PAGE_KERNEL));
+-
++ if (!mtk_gem->kvaddr) {
++ kfree(sgt);
++ kfree(mtk_gem->pages);
++ return -ENOMEM;
++ }
+ out:
+ kfree(sgt);
+ iosys_map_set_vaddr(map, mtk_gem->kvaddr);
+diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
+index c67089a7ebc10..ad4570d60abf2 100644
+--- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
+@@ -540,6 +540,10 @@ struct msm_gpu *a2xx_gpu_init(struct drm_device *dev)
+ gpu->perfcntrs = perfcntrs;
+ gpu->num_perfcntrs = ARRAY_SIZE(perfcntrs);
+
++ ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
++ if (ret)
++ goto fail;
++
+ if (adreno_is_a20x(adreno_gpu))
+ adreno_gpu->registers = a200_registers;
+ else if (adreno_is_a225(adreno_gpu))
+@@ -547,10 +551,6 @@ struct msm_gpu *a2xx_gpu_init(struct drm_device *dev)
+ else
+ adreno_gpu->registers = a220_registers;
+
+- ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
+- if (ret)
+- goto fail;
+-
+ if (!gpu->aspace) {
+ dev_err(dev->dev, "No memory protection without MMU\n");
+ if (!allow_vram_carveout) {
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+index 8914992378f21..1ff2a71e1aea5 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+@@ -1472,8 +1472,15 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
+ struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+ struct platform_device *pdev = to_platform_device(gmu->dev);
+
+- if (!gmu->initialized)
++ mutex_lock(&gmu->lock);
++ if (!gmu->initialized) {
++ mutex_unlock(&gmu->lock);
+ return;
++ }
++
++ gmu->initialized = false;
++
++ mutex_unlock(&gmu->lock);
+
+ pm_runtime_force_suspend(gmu->dev);
+
+@@ -1501,8 +1508,6 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
+
+ /* Drop reference taken in of_find_device_by_node */
+ put_device(gmu->dev);
+-
+- gmu->initialized = false;
+ }
+
+ static int cxpd_notifier_cb(struct notifier_block *nb,
+diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+index 411b7a5fa2f32..bdda1a6336543 100644
+--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
++++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+@@ -1697,9 +1697,7 @@ static void a6xx_destroy(struct msm_gpu *gpu)
+
+ a6xx_llc_slices_destroy(a6xx_gpu);
+
+- mutex_lock(&a6xx_gpu->gmu.lock);
+ a6xx_gmu_remove(a6xx_gpu);
+- mutex_unlock(&a6xx_gpu->gmu.lock);
+
+ adreno_gpu_cleanup(adreno_gpu);
+
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+index ff9ccf72a4bf9..6560eeef00143 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+@@ -195,7 +195,6 @@ const struct dpu_mdss_cfg dpu_msm8998_cfg = {
+ .intf = msm8998_intf,
+ .vbif_count = ARRAY_SIZE(msm8998_vbif),
+ .vbif = msm8998_vbif,
+- .reg_dma_count = 0,
+ .perf = &msm8998_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
+index 5b9b3b99f1b5f..84159f8cbdaeb 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
+@@ -193,8 +193,6 @@ const struct dpu_mdss_cfg dpu_sdm845_cfg = {
+ .intf = sdm845_intf,
+ .vbif_count = ARRAY_SIZE(sdm845_vbif),
+ .vbif = sdm845_vbif,
+- .reg_dma_count = 1,
+- .dma_cfg = &sdm845_regdma,
+ .perf = &sdm845_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+index 074ba54d420f4..266c525f8daaf 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+@@ -220,8 +220,6 @@ const struct dpu_mdss_cfg dpu_sm8150_cfg = {
+ .intf = sm8150_intf,
+ .vbif_count = ARRAY_SIZE(sdm845_vbif),
+ .vbif = sdm845_vbif,
+- .reg_dma_count = 1,
+- .dma_cfg = &sm8150_regdma,
+ .perf = &sm8150_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
+index 0540d21810857..76c5745c2fa1f 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
+@@ -198,8 +198,6 @@ const struct dpu_mdss_cfg dpu_sc8180x_cfg = {
+ .intf = sc8180x_intf,
+ .vbif_count = ARRAY_SIZE(sdm845_vbif),
+ .vbif = sdm845_vbif,
+- .reg_dma_count = 1,
+- .dma_cfg = &sm8150_regdma,
+ .perf = &sc8180x_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+index b3284de35b8fa..8660d04d0f589 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+@@ -228,8 +228,6 @@ const struct dpu_mdss_cfg dpu_sm8250_cfg = {
+ .vbif = sdm845_vbif,
+ .wb_count = ARRAY_SIZE(sm8250_wb),
+ .wb = sm8250_wb,
+- .reg_dma_count = 1,
+- .dma_cfg = &sm8250_regdma,
+ .perf = &sm8250_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
+index 88c211876516a..9631116f99e9b 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
+@@ -147,8 +147,6 @@ const struct dpu_mdss_cfg dpu_sc7180_cfg = {
+ .wb = sc7180_wb,
+ .vbif_count = ARRAY_SIZE(sdm845_vbif),
+ .vbif = sdm845_vbif,
+- .reg_dma_count = 1,
+- .dma_cfg = &sdm845_regdma,
+ .perf = &sc7180_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
+index 4f6a965bcd90b..9e8d6632a1927 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
+@@ -211,8 +211,6 @@ const struct dpu_mdss_cfg dpu_sm8350_cfg = {
+ .intf = sm8350_intf,
+ .vbif_count = ARRAY_SIZE(sdm845_vbif),
+ .vbif = sdm845_vbif,
+- .reg_dma_count = 1,
+- .dma_cfg = &sm8350_regdma,
+ .perf = &sm8350_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
+index 706d0f13b598e..cb58b4ec97db4 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h
+@@ -202,8 +202,6 @@ const struct dpu_mdss_cfg dpu_sc8280xp_cfg = {
+ .intf = sc8280xp_intf,
+ .vbif_count = ARRAY_SIZE(sdm845_vbif),
+ .vbif = sdm845_vbif,
+- .reg_dma_count = 1,
+- .dma_cfg = &sc8280xp_regdma,
+ .perf = &sc8280xp_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
+index 8bd4bb97e639c..905b403ffb0fb 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h
+@@ -219,8 +219,6 @@ const struct dpu_mdss_cfg dpu_sm8450_cfg = {
+ .intf = sm8450_intf,
+ .vbif_count = ARRAY_SIZE(sdm845_vbif),
+ .vbif = sdm845_vbif,
+- .reg_dma_count = 1,
+- .dma_cfg = &sm8450_regdma,
+ .perf = &sm8450_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
+index d0ab351b6a8b9..a6e4763660bb0 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h
+@@ -222,10 +222,8 @@ const struct dpu_mdss_cfg dpu_sm8550_cfg = {
+ .merge_3d = sm8550_merge_3d,
+ .intf_count = ARRAY_SIZE(sm8550_intf),
+ .intf = sm8550_intf,
+- .vbif_count = ARRAY_SIZE(sdm845_vbif),
+- .vbif = sdm845_vbif,
+- .reg_dma_count = 1,
+- .dma_cfg = &sm8450_regdma,
++ .vbif_count = ARRAY_SIZE(sm8550_vbif),
++ .vbif = sm8550_vbif,
+ .perf = &sm8550_perf_data,
+ .mdss_irqs = BIT(MDP_SSPP_TOP0_INTR) | \
+ BIT(MDP_SSPP_TOP0_INTR2) | \
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+index bac4aa807b4bc..2553d6374482b 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+@@ -455,7 +455,8 @@ static int dpu_encoder_phys_wb_wait_for_commit_done(
+ wait_info.atomic_cnt = &phys_enc->pending_kickoff_cnt;
+ wait_info.timeout_ms = KICKOFF_TIMEOUT_MS;
+
+- ret = dpu_encoder_helper_wait_for_irq(phys_enc, INTR_IDX_WB_DONE,
++ ret = dpu_encoder_helper_wait_for_irq(phys_enc,
++ phys_enc->irq[INTR_IDX_WB_DONE],
+ dpu_encoder_phys_wb_done_irq, &wait_info);
+ if (ret == -ETIMEDOUT)
+ _dpu_encoder_phys_wb_handle_wbdone_timeout(phys_enc);
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+index 0b604f31197bb..23c16d25b62f5 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+@@ -642,44 +642,24 @@ static const struct dpu_vbif_cfg sdm845_vbif[] = {
+ },
+ };
+
+-static const struct dpu_reg_dma_cfg sc8280xp_regdma = {
+- .base = 0x0,
+- .version = 0x00020000,
+- .trigger_sel_off = 0x119c,
+- .xin_id = 7,
+- .clk_ctrl = DPU_CLK_CTRL_REG_DMA,
+-};
+-
+-static const struct dpu_reg_dma_cfg sdm845_regdma = {
+- .base = 0x0, .version = 0x1, .trigger_sel_off = 0x119c
+-};
+-
+-static const struct dpu_reg_dma_cfg sm8150_regdma = {
+- .base = 0x0, .version = 0x00010001, .trigger_sel_off = 0x119c
+-};
+-
+-static const struct dpu_reg_dma_cfg sm8250_regdma = {
+- .base = 0x0,
+- .version = 0x00010002,
+- .trigger_sel_off = 0x119c,
+- .xin_id = 7,
+- .clk_ctrl = DPU_CLK_CTRL_REG_DMA,
+-};
+-
+-static const struct dpu_reg_dma_cfg sm8350_regdma = {
+- .base = 0x400,
+- .version = 0x00020000,
+- .trigger_sel_off = 0x119c,
+- .xin_id = 7,
+- .clk_ctrl = DPU_CLK_CTRL_REG_DMA,
+-};
+-
+-static const struct dpu_reg_dma_cfg sm8450_regdma = {
+- .base = 0x0,
+- .version = 0x00020000,
+- .trigger_sel_off = 0x119c,
+- .xin_id = 7,
+- .clk_ctrl = DPU_CLK_CTRL_REG_DMA,
++static const struct dpu_vbif_cfg sm8550_vbif[] = {
++ {
++ .name = "vbif_rt", .id = VBIF_RT,
++ .base = 0, .len = 0x1040,
++ .features = BIT(DPU_VBIF_QOS_REMAP),
++ .xin_halt_timeout = 0x4000,
++ .qos_rp_remap_size = 0x40,
++ .qos_rt_tbl = {
++ .npriority_lvl = ARRAY_SIZE(sdm845_rt_pri_lvl),
++ .priority_lvl = sdm845_rt_pri_lvl,
++ },
++ .qos_nrt_tbl = {
++ .npriority_lvl = ARRAY_SIZE(sdm845_nrt_pri_lvl),
++ .priority_lvl = sdm845_nrt_pri_lvl,
++ },
++ .memtype_count = 16,
++ .memtype = {3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3},
++ },
+ };
+
+ /*************************************************************
+diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+index 71584cd56fd75..8d62c21b051a8 100644
+--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
++++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+@@ -720,21 +720,6 @@ struct dpu_vbif_cfg {
+ u32 memtype_count;
+ u32 memtype[MAX_XIN_COUNT];
+ };
+-/**
+- * struct dpu_reg_dma_cfg - information of lut dma blocks
+- * @id enum identifying this block
+- * @base register offset of this block
+- * @features bit mask identifying sub-blocks/features
+- * @version version of lutdma hw block
+- * @trigger_sel_off offset to trigger select registers of lutdma
+- */
+-struct dpu_reg_dma_cfg {
+- DPU_HW_BLK_INFO;
+- u32 version;
+- u32 trigger_sel_off;
+- u32 xin_id;
+- enum dpu_clk_ctrl_type clk_ctrl;
+-};
+
+ /**
+ * Define CDP use cases
+@@ -850,9 +835,6 @@ struct dpu_mdss_cfg {
+ u32 wb_count;
+ const struct dpu_wb_cfg *wb;
+
+- u32 reg_dma_count;
+- const struct dpu_reg_dma_cfg *dma_cfg;
+-
+ u32 ad_count;
+
+ u32 dspp_count;
+diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
+index bd2c4ac456017..0d5ff03cb0910 100644
+--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
++++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c
+@@ -130,8 +130,7 @@ static void mdp5_plane_destroy_state(struct drm_plane *plane,
+ {
+ struct mdp5_plane_state *pstate = to_mdp5_plane_state(state);
+
+- if (state->fb)
+- drm_framebuffer_put(state->fb);
++ __drm_atomic_helper_plane_destroy_state(state);
+
+ kfree(pstate);
+ }
+diff --git a/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c b/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c
+index acfe1b31e0792..add72bbc28b17 100644
+--- a/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c
++++ b/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c
+@@ -192,5 +192,5 @@ void msm_disp_snapshot_add_block(struct msm_disp_state *disp_state, u32 len,
+ new_blk->base_addr = base_addr;
+
+ msm_disp_state_dump_regs(&new_blk->state, new_blk->size, base_addr);
+- list_add(&new_blk->node, &disp_state->blocks);
++ list_add_tail(&new_blk->node, &disp_state->blocks);
+ }
+diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
+index cf6b146acc323..23c1b1a96df64 100644
+--- a/drivers/gpu/drm/panel/panel-simple.c
++++ b/drivers/gpu/drm/panel/panel-simple.c
+@@ -1159,7 +1159,9 @@ static const struct panel_desc auo_t215hvn01 = {
+ .delay = {
+ .disable = 5,
+ .unprepare = 1000,
+- }
++ },
++ .bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_SPWG,
++ .connector_type = DRM_MODE_CONNECTOR_LVDS,
+ };
+
+ static const struct drm_display_mode avic_tm070ddh03_mode = {
+diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
+index 5819737c21c67..a6f3c811ceb8e 100644
+--- a/drivers/gpu/drm/radeon/cik.c
++++ b/drivers/gpu/drm/radeon/cik.c
+@@ -9534,17 +9534,8 @@ static void cik_pcie_gen3_enable(struct radeon_device *rdev)
+ u16 bridge_cfg2, gpu_cfg2;
+ u32 max_lw, current_lw, tmp;
+
+- pcie_capability_read_word(root, PCI_EXP_LNKCTL,
+- &bridge_cfg);
+- pcie_capability_read_word(rdev->pdev, PCI_EXP_LNKCTL,
+- &gpu_cfg);
+-
+- tmp16 = bridge_cfg | PCI_EXP_LNKCTL_HAWD;
+- pcie_capability_write_word(root, PCI_EXP_LNKCTL, tmp16);
+-
+- tmp16 = gpu_cfg | PCI_EXP_LNKCTL_HAWD;
+- pcie_capability_write_word(rdev->pdev, PCI_EXP_LNKCTL,
+- tmp16);
++ pcie_capability_set_word(root, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_HAWD);
++ pcie_capability_set_word(rdev->pdev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_HAWD);
+
+ tmp = RREG32_PCIE_PORT(PCIE_LC_STATUS1);
+ max_lw = (tmp & LC_DETECTED_LINK_WIDTH_MASK) >> LC_DETECTED_LINK_WIDTH_SHIFT;
+@@ -9591,21 +9582,14 @@ static void cik_pcie_gen3_enable(struct radeon_device *rdev)
+ msleep(100);
+
+ /* linkctl */
+- pcie_capability_read_word(root, PCI_EXP_LNKCTL,
+- &tmp16);
+- tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
+- tmp16 |= (bridge_cfg & PCI_EXP_LNKCTL_HAWD);
+- pcie_capability_write_word(root, PCI_EXP_LNKCTL,
+- tmp16);
+-
+- pcie_capability_read_word(rdev->pdev,
+- PCI_EXP_LNKCTL,
+- &tmp16);
+- tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
+- tmp16 |= (gpu_cfg & PCI_EXP_LNKCTL_HAWD);
+- pcie_capability_write_word(rdev->pdev,
+- PCI_EXP_LNKCTL,
+- tmp16);
++ pcie_capability_clear_and_set_word(root, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_HAWD,
++ bridge_cfg &
++ PCI_EXP_LNKCTL_HAWD);
++ pcie_capability_clear_and_set_word(rdev->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_HAWD,
++ gpu_cfg &
++ PCI_EXP_LNKCTL_HAWD);
+
+ /* linkctl2 */
+ pcie_capability_read_word(root, PCI_EXP_LNKCTL2,
+diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
+index 8d5e4b25609d5..a91012447b56e 100644
+--- a/drivers/gpu/drm/radeon/si.c
++++ b/drivers/gpu/drm/radeon/si.c
+@@ -7131,17 +7131,8 @@ static void si_pcie_gen3_enable(struct radeon_device *rdev)
+ u16 bridge_cfg2, gpu_cfg2;
+ u32 max_lw, current_lw, tmp;
+
+- pcie_capability_read_word(root, PCI_EXP_LNKCTL,
+- &bridge_cfg);
+- pcie_capability_read_word(rdev->pdev, PCI_EXP_LNKCTL,
+- &gpu_cfg);
+-
+- tmp16 = bridge_cfg | PCI_EXP_LNKCTL_HAWD;
+- pcie_capability_write_word(root, PCI_EXP_LNKCTL, tmp16);
+-
+- tmp16 = gpu_cfg | PCI_EXP_LNKCTL_HAWD;
+- pcie_capability_write_word(rdev->pdev, PCI_EXP_LNKCTL,
+- tmp16);
++ pcie_capability_set_word(root, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_HAWD);
++ pcie_capability_set_word(rdev->pdev, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_HAWD);
+
+ tmp = RREG32_PCIE(PCIE_LC_STATUS1);
+ max_lw = (tmp & LC_DETECTED_LINK_WIDTH_MASK) >> LC_DETECTED_LINK_WIDTH_SHIFT;
+@@ -7188,22 +7179,14 @@ static void si_pcie_gen3_enable(struct radeon_device *rdev)
+ msleep(100);
+
+ /* linkctl */
+- pcie_capability_read_word(root, PCI_EXP_LNKCTL,
+- &tmp16);
+- tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
+- tmp16 |= (bridge_cfg & PCI_EXP_LNKCTL_HAWD);
+- pcie_capability_write_word(root,
+- PCI_EXP_LNKCTL,
+- tmp16);
+-
+- pcie_capability_read_word(rdev->pdev,
+- PCI_EXP_LNKCTL,
+- &tmp16);
+- tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
+- tmp16 |= (gpu_cfg & PCI_EXP_LNKCTL_HAWD);
+- pcie_capability_write_word(rdev->pdev,
+- PCI_EXP_LNKCTL,
+- tmp16);
++ pcie_capability_clear_and_set_word(root, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_HAWD,
++ bridge_cfg &
++ PCI_EXP_LNKCTL_HAWD);
++ pcie_capability_clear_and_set_word(rdev->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_HAWD,
++ gpu_cfg &
++ PCI_EXP_LNKCTL_HAWD);
+
+ /* linkctl2 */
+ pcie_capability_read_word(root, PCI_EXP_LNKCTL2,
+diff --git a/drivers/gpu/drm/tegra/dpaux.c b/drivers/gpu/drm/tegra/dpaux.c
+index 4d2677dcd8315..68ded2e34e1cf 100644
+--- a/drivers/gpu/drm/tegra/dpaux.c
++++ b/drivers/gpu/drm/tegra/dpaux.c
+@@ -468,7 +468,7 @@ static int tegra_dpaux_probe(struct platform_device *pdev)
+
+ dpaux->irq = platform_get_irq(pdev, 0);
+ if (dpaux->irq < 0)
+- return -ENXIO;
++ return dpaux->irq;
+
+ if (!pdev->dev.pm_domain) {
+ dpaux->rst = devm_reset_control_get(&pdev->dev, "dpaux");
+diff --git a/drivers/gpu/drm/tiny/repaper.c b/drivers/gpu/drm/tiny/repaper.c
+index c2677d081a7b6..13ae148f59b9b 100644
+--- a/drivers/gpu/drm/tiny/repaper.c
++++ b/drivers/gpu/drm/tiny/repaper.c
+@@ -533,7 +533,7 @@ static int repaper_fb_dirty(struct drm_framebuffer *fb)
+ DRM_DEBUG("Flushing [FB:%d] st=%ums\n", fb->base.id,
+ epd->factored_stage_time);
+
+- buf = kmalloc_array(fb->width, fb->height, GFP_KERNEL);
++ buf = kmalloc(fb->width * fb->height / 8, GFP_KERNEL);
+ if (!buf) {
+ ret = -ENOMEM;
+ goto out_exit;
+diff --git a/drivers/gpu/drm/xlnx/zynqmp_dpsub.c b/drivers/gpu/drm/xlnx/zynqmp_dpsub.c
+index bab862484d429..068413be65275 100644
+--- a/drivers/gpu/drm/xlnx/zynqmp_dpsub.c
++++ b/drivers/gpu/drm/xlnx/zynqmp_dpsub.c
+@@ -227,7 +227,9 @@ static int zynqmp_dpsub_probe(struct platform_device *pdev)
+ dpsub->dev = &pdev->dev;
+ platform_set_drvdata(pdev, dpsub);
+
+- dma_set_mask(dpsub->dev, DMA_BIT_MASK(ZYNQMP_DISP_MAX_DMA_BIT));
++ ret = dma_set_mask(dpsub->dev, DMA_BIT_MASK(ZYNQMP_DISP_MAX_DMA_BIT));
++ if (ret)
++ return ret;
+
+ /* Try the reserved memory. Proceed if there's none. */
+ of_reserved_mem_device_init(&pdev->dev);
+diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
+index 851ee86eff32a..40a5645f8fe81 100644
+--- a/drivers/hid/hid-input.c
++++ b/drivers/hid/hid-input.c
+@@ -988,6 +988,7 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel
+ return;
+
+ case 0x3c: /* Invert */
++ device->quirks &= ~HID_QUIRK_NOINVERT;
+ map_key_clear(BTN_TOOL_RUBBER);
+ break;
+
+@@ -1013,9 +1014,13 @@ static void hidinput_configure_usage(struct hid_input *hidinput, struct hid_fiel
+ case 0x45: /* ERASER */
+ /*
+ * This event is reported when eraser tip touches the surface.
+- * Actual eraser (BTN_TOOL_RUBBER) is set by Invert usage when
+- * tool gets in proximity.
++ * Actual eraser (BTN_TOOL_RUBBER) is set and released either
++ * by Invert if tool reports proximity or by Eraser directly.
+ */
++ if (!test_bit(BTN_TOOL_RUBBER, input->keybit)) {
++ device->quirks |= HID_QUIRK_NOINVERT;
++ set_bit(BTN_TOOL_RUBBER, input->keybit);
++ }
+ map_key_clear(BTN_TOUCH);
+ break;
+
+@@ -1580,6 +1585,15 @@ void hidinput_hid_event(struct hid_device *hid, struct hid_field *field, struct
+ else if (report->tool != BTN_TOOL_RUBBER)
+ /* value is off, tool is not rubber, ignore */
+ return;
++ else if (*quirks & HID_QUIRK_NOINVERT &&
++ !test_bit(BTN_TOUCH, input->key)) {
++ /*
++ * There is no invert to release the tool, let hid_input
++ * send BTN_TOUCH with scancode and release the tool after.
++ */
++ hid_report_release_tool(report, input, BTN_TOOL_RUBBER);
++ return;
++ }
+
+ /* let hid-input set BTN_TOUCH */
+ break;
+diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c
+index 62180414efccd..e6a8b6d8eab70 100644
+--- a/drivers/hid/hid-logitech-dj.c
++++ b/drivers/hid/hid-logitech-dj.c
+@@ -1285,6 +1285,9 @@ static int logi_dj_recv_switch_to_dj_mode(struct dj_receiver_dev *djrcv_dev,
+ * 50 msec should gives enough time to the receiver to be ready.
+ */
+ msleep(50);
++
++ if (retval)
++ return retval;
+ }
+
+ /*
+@@ -1306,7 +1309,7 @@ static int logi_dj_recv_switch_to_dj_mode(struct dj_receiver_dev *djrcv_dev,
+ buf[5] = 0x09;
+ buf[6] = 0x00;
+
+- hid_hw_raw_request(hdev, REPORT_ID_HIDPP_SHORT, buf,
++ retval = hid_hw_raw_request(hdev, REPORT_ID_HIDPP_SHORT, buf,
+ HIDPP_REPORT_SHORT_LENGTH, HID_OUTPUT_REPORT,
+ HID_REQ_SET_REPORT);
+
+diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c
+index dfe8e09a18de0..e461ddaf10e80 100644
+--- a/drivers/hid/hid-logitech-hidpp.c
++++ b/drivers/hid/hid-logitech-hidpp.c
+@@ -275,21 +275,22 @@ static int __hidpp_send_report(struct hid_device *hdev,
+ }
+
+ /*
+- * hidpp_send_message_sync() returns 0 in case of success, and something else
+- * in case of a failure.
+- * - If ' something else' is positive, that means that an error has been raised
+- * by the protocol itself.
+- * - If ' something else' is negative, that means that we had a classic error
+- * (-ENOMEM, -EPIPE, etc...)
++ * Effectively send the message to the device, waiting for its answer.
++ *
++ * Must be called with hidpp->send_mutex locked
++ *
++ * Same return protocol than hidpp_send_message_sync():
++ * - success on 0
++ * - negative error means transport error
++ * - positive value means protocol error
+ */
+-static int hidpp_send_message_sync(struct hidpp_device *hidpp,
++static int __do_hidpp_send_message_sync(struct hidpp_device *hidpp,
+ struct hidpp_report *message,
+ struct hidpp_report *response)
+ {
+- int ret = -1;
+- int max_retries = 3;
++ int ret;
+
+- mutex_lock(&hidpp->send_mutex);
++ __must_hold(&hidpp->send_mutex);
+
+ hidpp->send_receive_buf = response;
+ hidpp->answer_available = false;
+@@ -300,47 +301,74 @@ static int hidpp_send_message_sync(struct hidpp_device *hidpp,
+ */
+ *response = *message;
+
+- for (; max_retries != 0 && ret; max_retries--) {
+- ret = __hidpp_send_report(hidpp->hid_dev, message);
++ ret = __hidpp_send_report(hidpp->hid_dev, message);
++ if (ret) {
++ dbg_hid("__hidpp_send_report returned err: %d\n", ret);
++ memset(response, 0, sizeof(struct hidpp_report));
++ return ret;
++ }
+
+- if (ret) {
+- dbg_hid("__hidpp_send_report returned err: %d\n", ret);
+- memset(response, 0, sizeof(struct hidpp_report));
+- break;
+- }
++ if (!wait_event_timeout(hidpp->wait, hidpp->answer_available,
++ 5*HZ)) {
++ dbg_hid("%s:timeout waiting for response\n", __func__);
++ memset(response, 0, sizeof(struct hidpp_report));
++ return -ETIMEDOUT;
++ }
+
+- if (!wait_event_timeout(hidpp->wait, hidpp->answer_available,
+- 5*HZ)) {
+- dbg_hid("%s:timeout waiting for response\n", __func__);
+- memset(response, 0, sizeof(struct hidpp_report));
+- ret = -ETIMEDOUT;
+- break;
+- }
++ if (response->report_id == REPORT_ID_HIDPP_SHORT &&
++ response->rap.sub_id == HIDPP_ERROR) {
++ ret = response->rap.params[1];
++ dbg_hid("%s:got hidpp error %02X\n", __func__, ret);
++ return ret;
++ }
+
+- if (response->report_id == REPORT_ID_HIDPP_SHORT &&
+- response->rap.sub_id == HIDPP_ERROR) {
+- ret = response->rap.params[1];
+- dbg_hid("%s:got hidpp error %02X\n", __func__, ret);
++ if ((response->report_id == REPORT_ID_HIDPP_LONG ||
++ response->report_id == REPORT_ID_HIDPP_VERY_LONG) &&
++ response->fap.feature_index == HIDPP20_ERROR) {
++ ret = response->fap.params[1];
++ dbg_hid("%s:got hidpp 2.0 error %02X\n", __func__, ret);
++ return ret;
++ }
++
++ return 0;
++}
++
++/*
++ * hidpp_send_message_sync() returns 0 in case of success, and something else
++ * in case of a failure.
++ *
++ * See __do_hidpp_send_message_sync() for a detailed explanation of the returned
++ * value.
++ */
++static int hidpp_send_message_sync(struct hidpp_device *hidpp,
++ struct hidpp_report *message,
++ struct hidpp_report *response)
++{
++ int ret;
++ int max_retries = 3;
++
++ mutex_lock(&hidpp->send_mutex);
++
++ do {
++ ret = __do_hidpp_send_message_sync(hidpp, message, response);
++ if (ret != HIDPP20_ERROR_BUSY)
+ break;
+- }
+
+- if ((response->report_id == REPORT_ID_HIDPP_LONG ||
+- response->report_id == REPORT_ID_HIDPP_VERY_LONG) &&
+- response->fap.feature_index == HIDPP20_ERROR) {
+- ret = response->fap.params[1];
+- if (ret != HIDPP20_ERROR_BUSY) {
+- dbg_hid("%s:got hidpp 2.0 error %02X\n", __func__, ret);
+- break;
+- }
+- dbg_hid("%s:got busy hidpp 2.0 error %02X, retrying\n", __func__, ret);
+- }
+- }
++ dbg_hid("%s:got busy hidpp 2.0 error %02X, retrying\n", __func__, ret);
++ } while (--max_retries);
+
+ mutex_unlock(&hidpp->send_mutex);
+ return ret;
+
+ }
+
++/*
++ * hidpp_send_fap_command_sync() returns 0 in case of success, and something else
++ * in case of a failure.
++ *
++ * See __do_hidpp_send_message_sync() for a detailed explanation of the returned
++ * value.
++ */
+ static int hidpp_send_fap_command_sync(struct hidpp_device *hidpp,
+ u8 feat_index, u8 funcindex_clientid, u8 *params, int param_count,
+ struct hidpp_report *response)
+@@ -373,6 +401,13 @@ static int hidpp_send_fap_command_sync(struct hidpp_device *hidpp,
+ return ret;
+ }
+
++/*
++ * hidpp_send_rap_command_sync() returns 0 in case of success, and something else
++ * in case of a failure.
++ *
++ * See __do_hidpp_send_message_sync() for a detailed explanation of the returned
++ * value.
++ */
+ static int hidpp_send_rap_command_sync(struct hidpp_device *hidpp_dev,
+ u8 report_id, u8 sub_id, u8 reg_address, u8 *params, int param_count,
+ struct hidpp_report *response)
+diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c
+index e31be0cb8b850..521b2ffb42449 100644
+--- a/drivers/hid/hid-multitouch.c
++++ b/drivers/hid/hid-multitouch.c
+@@ -1594,7 +1594,6 @@ static void mt_post_parse(struct mt_device *td, struct mt_application *app)
+ static int mt_input_configured(struct hid_device *hdev, struct hid_input *hi)
+ {
+ struct mt_device *td = hid_get_drvdata(hdev);
+- char *name;
+ const char *suffix = NULL;
+ struct mt_report_data *rdata;
+ struct mt_application *mt_application = NULL;
+@@ -1645,15 +1644,9 @@ static int mt_input_configured(struct hid_device *hdev, struct hid_input *hi)
+ break;
+ }
+
+- if (suffix) {
+- name = devm_kzalloc(&hi->input->dev,
+- strlen(hdev->name) + strlen(suffix) + 2,
+- GFP_KERNEL);
+- if (name) {
+- sprintf(name, "%s %s", hdev->name, suffix);
+- hi->input->name = name;
+- }
+- }
++ if (suffix)
++ hi->input->name = devm_kasprintf(&hdev->dev, GFP_KERNEL,
++ "%s %s", hdev->name, suffix);
+
+ return 0;
+ }
+diff --git a/drivers/hid/hid-uclogic-core.c b/drivers/hid/hid-uclogic-core.c
+index f67835f9ed4cc..ad74cbc9a0aa5 100644
+--- a/drivers/hid/hid-uclogic-core.c
++++ b/drivers/hid/hid-uclogic-core.c
+@@ -85,10 +85,8 @@ static int uclogic_input_configured(struct hid_device *hdev,
+ {
+ struct uclogic_drvdata *drvdata = hid_get_drvdata(hdev);
+ struct uclogic_params *params = &drvdata->params;
+- char *name;
+ const char *suffix = NULL;
+ struct hid_field *field;
+- size_t len;
+ size_t i;
+ const struct uclogic_params_frame *frame;
+
+@@ -146,14 +144,9 @@ static int uclogic_input_configured(struct hid_device *hdev,
+ }
+ }
+
+- if (suffix) {
+- len = strlen(hdev->name) + 2 + strlen(suffix);
+- name = devm_kzalloc(&hi->input->dev, len, GFP_KERNEL);
+- if (name) {
+- snprintf(name, len, "%s %s", hdev->name, suffix);
+- hi->input->name = name;
+- }
+- }
++ if (suffix)
++ hi->input->name = devm_kasprintf(&hdev->dev, GFP_KERNEL,
++ "%s %s", hdev->name, suffix);
+
+ return 0;
+ }
+diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
+index 67f95a29aeca5..edbb38f6956b9 100644
+--- a/drivers/hv/vmbus_drv.c
++++ b/drivers/hv/vmbus_drv.c
+@@ -2287,7 +2287,8 @@ static int vmbus_acpi_add(struct platform_device *pdev)
+ * Some ancestor of the vmbus acpi device (Gen1 or Gen2
+ * firmware) is the VMOD that has the mmio ranges. Get that.
+ */
+- for (ancestor = acpi_dev_parent(device); ancestor;
++ for (ancestor = acpi_dev_parent(device);
++ ancestor && ancestor->handle != ACPI_ROOT_OBJECT;
+ ancestor = acpi_dev_parent(ancestor)) {
+ result = acpi_walk_resources(ancestor->handle, METHOD_NAME__CRS,
+ vmbus_walk_resources, NULL);
+diff --git a/drivers/hwmon/tmp513.c b/drivers/hwmon/tmp513.c
+index 0693eaee054ff..a8167a48f824c 100644
+--- a/drivers/hwmon/tmp513.c
++++ b/drivers/hwmon/tmp513.c
+@@ -434,7 +434,7 @@ static umode_t tmp51x_is_visible(const void *_data,
+
+ switch (type) {
+ case hwmon_temp:
+- if (data->id == tmp512 && channel == 4)
++ if (data->id == tmp512 && channel == 3)
+ return 0;
+ switch (attr) {
+ case hwmon_temp_input:
+diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
+index 0ab1f73c2d06a..e374b02d98be3 100644
+--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
++++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
+@@ -450,7 +450,7 @@ static int tmc_set_etf_buffer(struct coresight_device *csdev,
+ return -EINVAL;
+
+ /* wrap head around to the amount of space we have */
+- head = handle->head & ((buf->nr_pages << PAGE_SHIFT) - 1);
++ head = handle->head & (((unsigned long)buf->nr_pages << PAGE_SHIFT) - 1);
+
+ /* find the page to write to */
+ buf->cur = head / PAGE_SIZE;
+diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
+index eaa296ced1678..8ef4a2a13427e 100644
+--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
++++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
+@@ -45,7 +45,8 @@ struct etr_perf_buffer {
+ };
+
+ /* Convert the perf index to an offset within the ETR buffer */
+-#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT))
++#define PERF_IDX2OFF(idx, buf) \
++ ((idx) % ((unsigned long)(buf)->nr_pages << PAGE_SHIFT))
+
+ /* Lower limit for ETR hardware buffer */
+ #define TMC_ETR_PERF_MIN_BUF_SIZE SZ_1M
+@@ -1262,7 +1263,7 @@ alloc_etr_buf(struct tmc_drvdata *drvdata, struct perf_event *event,
+ * than the size requested via sysfs.
+ */
+ if ((nr_pages << PAGE_SHIFT) > drvdata->size) {
+- etr_buf = tmc_alloc_etr_buf(drvdata, (nr_pages << PAGE_SHIFT),
++ etr_buf = tmc_alloc_etr_buf(drvdata, ((ssize_t)nr_pages << PAGE_SHIFT),
+ 0, node, NULL);
+ if (!IS_ERR(etr_buf))
+ goto done;
+diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
+index 01c0382a29c0a..26e6356261c11 100644
+--- a/drivers/hwtracing/coresight/coresight-tmc.h
++++ b/drivers/hwtracing/coresight/coresight-tmc.h
+@@ -325,7 +325,7 @@ ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
+ static inline unsigned long
+ tmc_sg_table_buf_size(struct tmc_sg_table *sg_table)
+ {
+- return sg_table->data_pages.nr_pages << PAGE_SHIFT;
++ return (unsigned long)sg_table->data_pages.nr_pages << PAGE_SHIFT;
+ }
+
+ struct coresight_device *tmc_etr_get_catu_device(struct tmc_drvdata *drvdata);
+diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
+index 1fc4fd79a1c69..925f6c9cecff4 100644
+--- a/drivers/hwtracing/coresight/coresight-trbe.c
++++ b/drivers/hwtracing/coresight/coresight-trbe.c
+@@ -1223,6 +1223,16 @@ static void arm_trbe_enable_cpu(void *info)
+ enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
+ }
+
++static void arm_trbe_disable_cpu(void *info)
++{
++ struct trbe_drvdata *drvdata = info;
++ struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
++
++ disable_percpu_irq(drvdata->irq);
++ trbe_reset_local(cpudata);
++}
++
++
+ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
+ {
+ struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
+@@ -1324,18 +1334,12 @@ cpu_clear:
+ cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
+ }
+
+-static void arm_trbe_remove_coresight_cpu(void *info)
++static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
+ {
+- int cpu = smp_processor_id();
+- struct trbe_drvdata *drvdata = info;
+- struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
+ struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
+
+- disable_percpu_irq(drvdata->irq);
+- trbe_reset_local(cpudata);
+ if (trbe_csdev) {
+ coresight_unregister(trbe_csdev);
+- cpudata->drvdata = NULL;
+ coresight_set_percpu_sink(cpu, NULL);
+ }
+ }
+@@ -1364,8 +1368,10 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
+ {
+ int cpu;
+
+- for_each_cpu(cpu, &drvdata->supported_cpus)
+- smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
++ for_each_cpu(cpu, &drvdata->supported_cpus) {
++ smp_call_function_single(cpu, arm_trbe_disable_cpu, drvdata, 1);
++ arm_trbe_remove_coresight_cpu(drvdata, cpu);
++ }
+ free_percpu(drvdata->cpudata);
+ return 0;
+ }
+@@ -1404,12 +1410,8 @@ static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
+ {
+ struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
+
+- if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
+- struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
+-
+- disable_percpu_irq(drvdata->irq);
+- trbe_reset_local(cpudata);
+- }
++ if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
++ arm_trbe_disable_cpu(drvdata);
+ return 0;
+ }
+
+diff --git a/drivers/i2c/busses/i2c-imx-lpi2c.c b/drivers/i2c/busses/i2c-imx-lpi2c.c
+index 4d24ceb57ee74..338171f76daf7 100644
+--- a/drivers/i2c/busses/i2c-imx-lpi2c.c
++++ b/drivers/i2c/busses/i2c-imx-lpi2c.c
+@@ -209,6 +209,9 @@ static int lpi2c_imx_config(struct lpi2c_imx_struct *lpi2c_imx)
+ lpi2c_imx_set_mode(lpi2c_imx);
+
+ clk_rate = clk_get_rate(lpi2c_imx->clks[0].clk);
++ if (!clk_rate)
++ return -EINVAL;
++
+ if (lpi2c_imx->mode == HS || lpi2c_imx->mode == ULTRA_FAST)
+ filt = 0;
+ else
+diff --git a/drivers/i3c/master/svc-i3c-master.c b/drivers/i3c/master/svc-i3c-master.c
+index 79b08942a925d..964d51dd39a47 100644
+--- a/drivers/i3c/master/svc-i3c-master.c
++++ b/drivers/i3c/master/svc-i3c-master.c
+@@ -782,6 +782,10 @@ static int svc_i3c_master_do_daa_locked(struct svc_i3c_master *master,
+ */
+ break;
+ } else if (SVC_I3C_MSTATUS_NACKED(reg)) {
++ /* No I3C devices attached */
++ if (dev_nb == 0)
++ break;
++
+ /*
+ * A slave device nacked the address, this is
+ * allowed only once, DAA will be stopped and
+@@ -1251,11 +1255,17 @@ static int svc_i3c_master_send_ccc_cmd(struct i3c_master_controller *m,
+ {
+ struct svc_i3c_master *master = to_svc_i3c_master(m);
+ bool broadcast = cmd->id < 0x80;
++ int ret;
+
+ if (broadcast)
+- return svc_i3c_master_send_bdcast_ccc_cmd(master, cmd);
++ ret = svc_i3c_master_send_bdcast_ccc_cmd(master, cmd);
+ else
+- return svc_i3c_master_send_direct_ccc_cmd(master, cmd);
++ ret = svc_i3c_master_send_direct_ccc_cmd(master, cmd);
++
++ if (ret)
++ cmd->err = I3C_ERROR_M2;
++
++ return ret;
+ }
+
+ static int svc_i3c_master_priv_xfers(struct i3c_dev_desc *dev,
+diff --git a/drivers/iio/accel/adxl313_i2c.c b/drivers/iio/accel/adxl313_i2c.c
+index 99cc7fc294882..68785bd3ef2f0 100644
+--- a/drivers/iio/accel/adxl313_i2c.c
++++ b/drivers/iio/accel/adxl313_i2c.c
+@@ -40,8 +40,8 @@ static const struct regmap_config adxl31x_i2c_regmap_config[] = {
+
+ static const struct i2c_device_id adxl313_i2c_id[] = {
+ { .name = "adxl312", .driver_data = (kernel_ulong_t)&adxl31x_chip_info[ADXL312] },
+- { .name = "adxl313", .driver_data = (kernel_ulong_t)&adxl31x_chip_info[ADXL312] },
+- { .name = "adxl314", .driver_data = (kernel_ulong_t)&adxl31x_chip_info[ADXL312] },
++ { .name = "adxl313", .driver_data = (kernel_ulong_t)&adxl31x_chip_info[ADXL313] },
++ { .name = "adxl314", .driver_data = (kernel_ulong_t)&adxl31x_chip_info[ADXL314] },
+ { }
+ };
+
+diff --git a/drivers/infiniband/core/uverbs_std_types_counters.c b/drivers/infiniband/core/uverbs_std_types_counters.c
+index 999da9c798668..381aa57976417 100644
+--- a/drivers/infiniband/core/uverbs_std_types_counters.c
++++ b/drivers/infiniband/core/uverbs_std_types_counters.c
+@@ -107,6 +107,8 @@ static int UVERBS_HANDLER(UVERBS_METHOD_COUNTERS_READ)(
+ return ret;
+
+ uattr = uverbs_attr_get(attrs, UVERBS_ATTR_READ_COUNTERS_BUFF);
++ if (IS_ERR(uattr))
++ return PTR_ERR(uattr);
+ read_attr.ncounters = uattr->ptr_attr.len / sizeof(u64);
+ read_attr.counters_buff = uverbs_zalloc(
+ attrs, array_size(read_attr.ncounters, sizeof(u64)));
+diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+index eef3ef3fabb42..bacb1b6723ef8 100644
+--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
++++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+@@ -116,7 +116,6 @@ struct bnxt_re_dev {
+ struct list_head list;
+ unsigned long flags;
+ #define BNXT_RE_FLAG_NETDEV_REGISTERED 0
+-#define BNXT_RE_FLAG_GOT_MSIX 2
+ #define BNXT_RE_FLAG_HAVE_L2_REF 3
+ #define BNXT_RE_FLAG_RCFW_CHANNEL_EN 4
+ #define BNXT_RE_FLAG_QOS_WORK_REG 5
+diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
+index 4f00fb7869f8e..bfab2c83faf92 100644
+--- a/drivers/infiniband/hw/bnxt_re/main.c
++++ b/drivers/infiniband/hw/bnxt_re/main.c
+@@ -1113,8 +1113,8 @@ static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev)
+ bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type);
+ bnxt_qplib_free_rcfw_channel(&rdev->rcfw);
+ }
+- if (test_and_clear_bit(BNXT_RE_FLAG_GOT_MSIX, &rdev->flags))
+- rdev->num_msix = 0;
++
++ rdev->num_msix = 0;
+
+ bnxt_re_destroy_chip_ctx(rdev);
+ if (test_and_clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags))
+@@ -1170,7 +1170,6 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 wqe_mode)
+ ibdev_dbg(&rdev->ibdev, "Got %d MSI-X vectors\n",
+ rdev->en_dev->ulp_tbl->msix_requested);
+ rdev->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
+- set_bit(BNXT_RE_FLAG_GOT_MSIX, &rdev->flags);
+
+ bnxt_re_query_hwrm_intf_version(rdev);
+
+diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
+index 2a195c4b0f17d..3538d59521e41 100644
+--- a/drivers/infiniband/hw/efa/efa_verbs.c
++++ b/drivers/infiniband/hw/efa/efa_verbs.c
+@@ -449,12 +449,12 @@ int efa_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
+
+ ibdev_dbg(&dev->ibdev, "Destroy qp[%u]\n", ibqp->qp_num);
+
+- efa_qp_user_mmap_entries_remove(qp);
+-
+ err = efa_destroy_qp_handle(dev, qp->qp_handle);
+ if (err)
+ return err;
+
++ efa_qp_user_mmap_entries_remove(qp);
++
+ if (qp->rq_cpu_addr) {
+ ibdev_dbg(&dev->ibdev,
+ "qp->cpu_addr[0x%p] freed: size[%lu], dma[%pad]\n",
+@@ -1013,8 +1013,8 @@ int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+ "Destroy cq[%d] virt[0x%p] freed: size[%lu], dma[%pad]\n",
+ cq->cq_idx, cq->cpu_addr, cq->size, &cq->dma_addr);
+
+- efa_cq_user_mmap_entries_remove(cq);
+ efa_destroy_cq_idx(dev, cq->cq_idx);
++ efa_cq_user_mmap_entries_remove(cq);
+ if (cq->eq) {
+ xa_erase(&dev->cqs_xa, cq->cq_idx);
+ synchronize_irq(cq->eq->irq.irqn);
+diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
+index 84239b907de2a..bb94eb076858c 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_device.h
++++ b/drivers/infiniband/hw/hns/hns_roce_device.h
+@@ -97,6 +97,7 @@
+ #define HNS_ROCE_CQ_BANK_NUM 4
+
+ #define CQ_BANKID_SHIFT 2
++#define CQ_BANKID_MASK GENMASK(1, 0)
+
+ enum {
+ SERV_TYPE_RC,
+diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+index d4c6b9bc0a4ea..ec1b82ddc23dd 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
++++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+@@ -757,7 +757,8 @@ out:
+ qp->sq.head += nreq;
+ qp->next_sge = sge_idx;
+
+- if (nreq == 1 && (qp->en_flags & HNS_ROCE_QP_CAP_DIRECT_WQE))
++ if (nreq == 1 && !ret &&
++ (qp->en_flags & HNS_ROCE_QP_CAP_DIRECT_WQE))
+ write_dwqe(hr_dev, qp, wqe);
+ else
+ update_sq_db(hr_dev, qp);
+@@ -6740,14 +6741,14 @@ static int __hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
+ ret = hns_roce_init(hr_dev);
+ if (ret) {
+ dev_err(hr_dev->dev, "RoCE Engine init failed!\n");
+- goto error_failed_cfg;
++ goto error_failed_roce_init;
+ }
+
+ if (hr_dev->pci_dev->revision == PCI_REVISION_ID_HIP08) {
+ ret = free_mr_init(hr_dev);
+ if (ret) {
+ dev_err(hr_dev->dev, "failed to init free mr!\n");
+- goto error_failed_roce_init;
++ goto error_failed_free_mr_init;
+ }
+ }
+
+@@ -6755,10 +6756,10 @@ static int __hns_roce_hw_v2_init_instance(struct hnae3_handle *handle)
+
+ return 0;
+
+-error_failed_roce_init:
++error_failed_free_mr_init:
+ hns_roce_exit(hr_dev);
+
+-error_failed_cfg:
++error_failed_roce_init:
+ kfree(hr_dev->priv);
+
+ error_failed_kzalloc:
+diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
+index 485e110ca4333..9141eadf33d2a 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_main.c
++++ b/drivers/infiniband/hw/hns/hns_roce_main.c
+@@ -219,6 +219,7 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u32 port_num,
+ unsigned long flags;
+ enum ib_mtu mtu;
+ u32 port;
++ int ret;
+
+ port = port_num - 1;
+
+@@ -231,8 +232,10 @@ static int hns_roce_query_port(struct ib_device *ib_dev, u32 port_num,
+ IB_PORT_BOOT_MGMT_SUP;
+ props->max_msg_sz = HNS_ROCE_MAX_MSG_LEN;
+ props->pkey_tbl_len = 1;
+- props->active_width = IB_WIDTH_4X;
+- props->active_speed = 1;
++ ret = ib_get_eth_speed(ib_dev, port_num, &props->active_speed,
++ &props->active_width);
++ if (ret)
++ ibdev_warn(ib_dev, "failed to get speed, ret = %d.\n", ret);
+
+ spin_lock_irqsave(&hr_dev->iboe.lock, flags);
+
+diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
+index d855a917f4cfa..cdc1c6de43a17 100644
+--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
++++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
+@@ -170,14 +170,29 @@ static void hns_roce_ib_qp_event(struct hns_roce_qp *hr_qp,
+ }
+ }
+
+-static u8 get_least_load_bankid_for_qp(struct hns_roce_bank *bank)
++static u8 get_affinity_cq_bank(u8 qp_bank)
+ {
+- u32 least_load = bank[0].inuse;
++ return (qp_bank >> 1) & CQ_BANKID_MASK;
++}
++
++static u8 get_least_load_bankid_for_qp(struct ib_qp_init_attr *init_attr,
++ struct hns_roce_bank *bank)
++{
++#define INVALID_LOAD_QPNUM 0xFFFFFFFF
++ struct ib_cq *scq = init_attr->send_cq;
++ u32 least_load = INVALID_LOAD_QPNUM;
++ unsigned long cqn = 0;
+ u8 bankid = 0;
+ u32 bankcnt;
+ u8 i;
+
+- for (i = 1; i < HNS_ROCE_QP_BANK_NUM; i++) {
++ if (scq)
++ cqn = to_hr_cq(scq)->cqn;
++
++ for (i = 0; i < HNS_ROCE_QP_BANK_NUM; i++) {
++ if (scq && (get_affinity_cq_bank(i) != (cqn & CQ_BANKID_MASK)))
++ continue;
++
+ bankcnt = bank[i].inuse;
+ if (bankcnt < least_load) {
+ least_load = bankcnt;
+@@ -209,7 +224,8 @@ static int alloc_qpn_with_bankid(struct hns_roce_bank *bank, u8 bankid,
+
+ return 0;
+ }
+-static int alloc_qpn(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
++static int alloc_qpn(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
++ struct ib_qp_init_attr *init_attr)
+ {
+ struct hns_roce_qp_table *qp_table = &hr_dev->qp_table;
+ unsigned long num = 0;
+@@ -220,7 +236,7 @@ static int alloc_qpn(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
+ num = 1;
+ } else {
+ mutex_lock(&qp_table->bank_mutex);
+- bankid = get_least_load_bankid_for_qp(qp_table->bank);
++ bankid = get_least_load_bankid_for_qp(init_attr, qp_table->bank);
+
+ ret = alloc_qpn_with_bankid(&qp_table->bank[bankid], bankid,
+ &num);
+@@ -1082,7 +1098,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
+ goto err_buf;
+ }
+
+- ret = alloc_qpn(hr_dev, hr_qp);
++ ret = alloc_qpn(hr_dev, hr_qp, init_attr);
+ if (ret) {
+ ibdev_err(ibdev, "failed to alloc QPN, ret = %d.\n", ret);
+ goto err_qpn;
+diff --git a/drivers/infiniband/hw/irdma/ctrl.c b/drivers/infiniband/hw/irdma/ctrl.c
+index 45e3344daa048..ef47ec271e19e 100644
+--- a/drivers/infiniband/hw/irdma/ctrl.c
++++ b/drivers/infiniband/hw/irdma/ctrl.c
+@@ -1061,6 +1061,9 @@ static int irdma_sc_alloc_stag(struct irdma_sc_dev *dev,
+ u64 hdr;
+ enum irdma_page_size page_size;
+
++ if (!info->total_len && !info->all_memory)
++ return -EINVAL;
++
+ if (info->page_size == 0x40000000)
+ page_size = IRDMA_PAGE_SIZE_1G;
+ else if (info->page_size == 0x200000)
+@@ -1126,6 +1129,9 @@ static int irdma_sc_mr_reg_non_shared(struct irdma_sc_dev *dev,
+ u8 addr_type;
+ enum irdma_page_size page_size;
+
++ if (!info->total_len && !info->all_memory)
++ return -EINVAL;
++
+ if (info->page_size == 0x40000000)
+ page_size = IRDMA_PAGE_SIZE_1G;
+ else if (info->page_size == 0x200000)
+diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h
+index 2323962cdeacb..de2f4c0514118 100644
+--- a/drivers/infiniband/hw/irdma/main.h
++++ b/drivers/infiniband/hw/irdma/main.h
+@@ -239,7 +239,7 @@ struct irdma_qv_info {
+
+ struct irdma_qvlist_info {
+ u32 num_vectors;
+- struct irdma_qv_info qv_info[1];
++ struct irdma_qv_info qv_info[];
+ };
+
+ struct irdma_gen_ops {
+diff --git a/drivers/infiniband/hw/irdma/type.h b/drivers/infiniband/hw/irdma/type.h
+index a20709577ab0a..3b1fa5bc0a585 100644
+--- a/drivers/infiniband/hw/irdma/type.h
++++ b/drivers/infiniband/hw/irdma/type.h
+@@ -971,6 +971,7 @@ struct irdma_allocate_stag_info {
+ bool remote_access:1;
+ bool use_hmc_fcn_index:1;
+ bool use_pf_rid:1;
++ bool all_memory:1;
+ u8 hmc_fcn_index;
+ };
+
+@@ -998,6 +999,7 @@ struct irdma_reg_ns_stag_info {
+ bool use_hmc_fcn_index:1;
+ u8 hmc_fcn_index;
+ bool use_pf_rid:1;
++ bool all_memory:1;
+ };
+
+ struct irdma_fast_reg_stag_info {
+diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
+index eaa12c1245982..20d70f0d21e0f 100644
+--- a/drivers/infiniband/hw/irdma/verbs.c
++++ b/drivers/infiniband/hw/irdma/verbs.c
+@@ -2552,7 +2552,8 @@ static int irdma_hw_alloc_stag(struct irdma_device *iwdev,
+ struct irdma_mr *iwmr)
+ {
+ struct irdma_allocate_stag_info *info;
+- struct irdma_pd *iwpd = to_iwpd(iwmr->ibmr.pd);
++ struct ib_pd *pd = iwmr->ibmr.pd;
++ struct irdma_pd *iwpd = to_iwpd(pd);
+ int status;
+ struct irdma_cqp_request *cqp_request;
+ struct cqp_cmds_info *cqp_info;
+@@ -2568,6 +2569,7 @@ static int irdma_hw_alloc_stag(struct irdma_device *iwdev,
+ info->stag_idx = iwmr->stag >> IRDMA_CQPSQ_STAG_IDX_S;
+ info->pd_id = iwpd->sc_pd.pd_id;
+ info->total_len = iwmr->len;
++ info->all_memory = pd->flags & IB_PD_UNSAFE_GLOBAL_RKEY;
+ info->remote_access = true;
+ cqp_info->cqp_cmd = IRDMA_OP_ALLOC_STAG;
+ cqp_info->post_sq = 1;
+@@ -2615,6 +2617,8 @@ static struct ib_mr *irdma_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
+ iwmr->type = IRDMA_MEMREG_TYPE_MEM;
+ palloc = &iwpbl->pble_alloc;
+ iwmr->page_cnt = max_num_sg;
++ /* Use system PAGE_SIZE as the sg page sizes are unknown at this point */
++ iwmr->len = max_num_sg * PAGE_SIZE;
+ err_code = irdma_get_pble(iwdev->rf->pble_rsrc, palloc, iwmr->page_cnt,
+ false);
+ if (err_code)
+@@ -2694,7 +2698,8 @@ static int irdma_hwreg_mr(struct irdma_device *iwdev, struct irdma_mr *iwmr,
+ {
+ struct irdma_pbl *iwpbl = &iwmr->iwpbl;
+ struct irdma_reg_ns_stag_info *stag_info;
+- struct irdma_pd *iwpd = to_iwpd(iwmr->ibmr.pd);
++ struct ib_pd *pd = iwmr->ibmr.pd;
++ struct irdma_pd *iwpd = to_iwpd(pd);
+ struct irdma_pble_alloc *palloc = &iwpbl->pble_alloc;
+ struct irdma_cqp_request *cqp_request;
+ struct cqp_cmds_info *cqp_info;
+@@ -2713,6 +2718,7 @@ static int irdma_hwreg_mr(struct irdma_device *iwdev, struct irdma_mr *iwmr,
+ stag_info->total_len = iwmr->len;
+ stag_info->access_rights = irdma_get_mr_access(access);
+ stag_info->pd_id = iwpd->sc_pd.pd_id;
++ stag_info->all_memory = pd->flags & IB_PD_UNSAFE_GLOBAL_RKEY;
+ if (stag_info->access_rights & IRDMA_ACCESS_FLAGS_ZERO_BASED)
+ stag_info->addr_type = IRDMA_ADDR_TYPE_ZERO_BASED;
+ else
+@@ -4424,7 +4430,6 @@ static int irdma_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
+ ah_attr->grh.traffic_class = ah->sc_ah.ah_info.tc_tos;
+ ah_attr->grh.hop_limit = ah->sc_ah.ah_info.hop_ttl;
+ ah_attr->grh.sgid_index = ah->sgid_index;
+- ah_attr->grh.sgid_index = ah->sgid_index;
+ memcpy(&ah_attr->grh.dgid, &ah->dgid,
+ sizeof(ah_attr->grh.dgid));
+ }
+diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
+index f46c5a5fd0aea..44fece204abdd 100644
+--- a/drivers/infiniband/sw/rxe/rxe_comp.c
++++ b/drivers/infiniband/sw/rxe/rxe_comp.c
+@@ -597,6 +597,10 @@ static void flush_send_queue(struct rxe_qp *qp, bool notify)
+ struct rxe_queue *q = qp->sq.queue;
+ int err;
+
++ /* send queue never got created. nothing to do. */
++ if (!qp->sq.queue)
++ return;
++
+ while ((wqe = queue_head(q, q->type))) {
+ if (notify) {
+ err = flush_send_wqe(qp, wqe);
+diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
+index 804b15e929dd9..5b32724a95c8d 100644
+--- a/drivers/infiniband/sw/rxe/rxe_loc.h
++++ b/drivers/infiniband/sw/rxe/rxe_loc.h
+@@ -138,12 +138,6 @@ static inline int qp_mtu(struct rxe_qp *qp)
+ return IB_MTU_4096;
+ }
+
+-static inline int rcv_wqe_size(int max_sge)
+-{
+- return sizeof(struct rxe_recv_wqe) +
+- max_sge * sizeof(struct ib_sge);
+-}
+-
+ void free_rd_atomic_resource(struct resp_res *res);
+
+ static inline void rxe_advance_resp_resource(struct rxe_qp *qp)
+diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
+index a0f206431cf8e..b66afadbd7c40 100644
+--- a/drivers/infiniband/sw/rxe/rxe_qp.c
++++ b/drivers/infiniband/sw/rxe/rxe_qp.c
+@@ -183,13 +183,63 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
+ atomic_set(&qp->skb_out, 0);
+ }
+
++static int rxe_init_sq(struct rxe_qp *qp, struct ib_qp_init_attr *init,
++ struct ib_udata *udata,
++ struct rxe_create_qp_resp __user *uresp)
++{
++ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
++ int wqe_size;
++ int err;
++
++ qp->sq.max_wr = init->cap.max_send_wr;
++ wqe_size = max_t(int, init->cap.max_send_sge * sizeof(struct ib_sge),
++ init->cap.max_inline_data);
++ qp->sq.max_sge = wqe_size / sizeof(struct ib_sge);
++ qp->sq.max_inline = wqe_size;
++ wqe_size += sizeof(struct rxe_send_wqe);
++
++ qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr, wqe_size,
++ QUEUE_TYPE_FROM_CLIENT);
++ if (!qp->sq.queue) {
++ rxe_err_qp(qp, "Unable to allocate send queue");
++ err = -ENOMEM;
++ goto err_out;
++ }
++
++ /* prepare info for caller to mmap send queue if user space qp */
++ err = do_mmap_info(rxe, uresp ? &uresp->sq_mi : NULL, udata,
++ qp->sq.queue->buf, qp->sq.queue->buf_size,
++ &qp->sq.queue->ip);
++ if (err) {
++ rxe_err_qp(qp, "do_mmap_info failed, err = %d", err);
++ goto err_free;
++ }
++
++ /* return actual capabilities to caller which may be larger
++ * than requested
++ */
++ init->cap.max_send_wr = qp->sq.max_wr;
++ init->cap.max_send_sge = qp->sq.max_sge;
++ init->cap.max_inline_data = qp->sq.max_inline;
++
++ return 0;
++
++err_free:
++ vfree(qp->sq.queue->buf);
++ kfree(qp->sq.queue);
++ qp->sq.queue = NULL;
++err_out:
++ return err;
++}
++
+ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct ib_qp_init_attr *init, struct ib_udata *udata,
+ struct rxe_create_qp_resp __user *uresp)
+ {
+ int err;
+- int wqe_size;
+- enum queue_type type;
++
++ /* if we don't finish qp create make sure queue is valid */
++ skb_queue_head_init(&qp->req_pkts);
+
+ err = sock_create_kern(&init_net, AF_INET, SOCK_DGRAM, 0, &qp->sk);
+ if (err < 0)
+@@ -204,32 +254,10 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
+ * (0xc000 - 0xffff).
+ */
+ qp->src_port = RXE_ROCE_V2_SPORT + (hash_32(qp_num(qp), 14) & 0x3fff);
+- qp->sq.max_wr = init->cap.max_send_wr;
+-
+- /* These caps are limited by rxe_qp_chk_cap() done by the caller */
+- wqe_size = max_t(int, init->cap.max_send_sge * sizeof(struct ib_sge),
+- init->cap.max_inline_data);
+- qp->sq.max_sge = init->cap.max_send_sge =
+- wqe_size / sizeof(struct ib_sge);
+- qp->sq.max_inline = init->cap.max_inline_data = wqe_size;
+- wqe_size += sizeof(struct rxe_send_wqe);
+
+- type = QUEUE_TYPE_FROM_CLIENT;
+- qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr,
+- wqe_size, type);
+- if (!qp->sq.queue)
+- return -ENOMEM;
+-
+- err = do_mmap_info(rxe, uresp ? &uresp->sq_mi : NULL, udata,
+- qp->sq.queue->buf, qp->sq.queue->buf_size,
+- &qp->sq.queue->ip);
+-
+- if (err) {
+- vfree(qp->sq.queue->buf);
+- kfree(qp->sq.queue);
+- qp->sq.queue = NULL;
++ err = rxe_init_sq(qp, init, udata, uresp);
++ if (err)
+ return err;
+- }
+
+ qp->req.wqe_index = queue_get_producer(qp->sq.queue,
+ QUEUE_TYPE_FROM_CLIENT);
+@@ -248,36 +276,65 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
+ return 0;
+ }
+
++static int rxe_init_rq(struct rxe_qp *qp, struct ib_qp_init_attr *init,
++ struct ib_udata *udata,
++ struct rxe_create_qp_resp __user *uresp)
++{
++ struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
++ int wqe_size;
++ int err;
++
++ qp->rq.max_wr = init->cap.max_recv_wr;
++ qp->rq.max_sge = init->cap.max_recv_sge;
++ wqe_size = sizeof(struct rxe_recv_wqe) +
++ qp->rq.max_sge*sizeof(struct ib_sge);
++
++ qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr, wqe_size,
++ QUEUE_TYPE_FROM_CLIENT);
++ if (!qp->rq.queue) {
++ rxe_err_qp(qp, "Unable to allocate recv queue");
++ err = -ENOMEM;
++ goto err_out;
++ }
++
++ /* prepare info for caller to mmap recv queue if user space qp */
++ err = do_mmap_info(rxe, uresp ? &uresp->rq_mi : NULL, udata,
++ qp->rq.queue->buf, qp->rq.queue->buf_size,
++ &qp->rq.queue->ip);
++ if (err) {
++ rxe_err_qp(qp, "do_mmap_info failed, err = %d", err);
++ goto err_free;
++ }
++
++ /* return actual capabilities to caller which may be larger
++ * than requested
++ */
++ init->cap.max_recv_wr = qp->rq.max_wr;
++
++ return 0;
++
++err_free:
++ vfree(qp->rq.queue->buf);
++ kfree(qp->rq.queue);
++ qp->rq.queue = NULL;
++err_out:
++ return err;
++}
++
+ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
+ struct ib_qp_init_attr *init,
+ struct ib_udata *udata,
+ struct rxe_create_qp_resp __user *uresp)
+ {
+ int err;
+- int wqe_size;
+- enum queue_type type;
++
++ /* if we don't finish qp create make sure queue is valid */
++ skb_queue_head_init(&qp->resp_pkts);
+
+ if (!qp->srq) {
+- qp->rq.max_wr = init->cap.max_recv_wr;
+- qp->rq.max_sge = init->cap.max_recv_sge;
+-
+- wqe_size = rcv_wqe_size(qp->rq.max_sge);
+-
+- type = QUEUE_TYPE_FROM_CLIENT;
+- qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr,
+- wqe_size, type);
+- if (!qp->rq.queue)
+- return -ENOMEM;
+-
+- err = do_mmap_info(rxe, uresp ? &uresp->rq_mi : NULL, udata,
+- qp->rq.queue->buf, qp->rq.queue->buf_size,
+- &qp->rq.queue->ip);
+- if (err) {
+- vfree(qp->rq.queue->buf);
+- kfree(qp->rq.queue);
+- qp->rq.queue = NULL;
++ err = rxe_init_rq(qp, init, udata, uresp);
++ if (err)
+ return err;
+- }
+ }
+
+ rxe_init_task(&qp->resp.task, qp, rxe_responder);
+@@ -307,10 +364,10 @@ int rxe_qp_from_init(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_pd *pd,
+ if (srq)
+ rxe_get(srq);
+
+- qp->pd = pd;
+- qp->rcq = rcq;
+- qp->scq = scq;
+- qp->srq = srq;
++ qp->pd = pd;
++ qp->rcq = rcq;
++ qp->scq = scq;
++ qp->srq = srq;
+
+ atomic_inc(&rcq->num_wq);
+ atomic_inc(&scq->num_wq);
+diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
+index 5fe7cbae30313..1104255b7be9a 100644
+--- a/drivers/infiniband/sw/rxe/rxe_req.c
++++ b/drivers/infiniband/sw/rxe/rxe_req.c
+@@ -578,10 +578,11 @@ static void save_state(struct rxe_send_wqe *wqe,
+ struct rxe_send_wqe *rollback_wqe,
+ u32 *rollback_psn)
+ {
+- rollback_wqe->state = wqe->state;
++ rollback_wqe->state = wqe->state;
+ rollback_wqe->first_psn = wqe->first_psn;
+- rollback_wqe->last_psn = wqe->last_psn;
+- *rollback_psn = qp->req.psn;
++ rollback_wqe->last_psn = wqe->last_psn;
++ rollback_wqe->dma = wqe->dma;
++ *rollback_psn = qp->req.psn;
+ }
+
+ static void rollback_state(struct rxe_send_wqe *wqe,
+@@ -589,10 +590,11 @@ static void rollback_state(struct rxe_send_wqe *wqe,
+ struct rxe_send_wqe *rollback_wqe,
+ u32 rollback_psn)
+ {
+- wqe->state = rollback_wqe->state;
++ wqe->state = rollback_wqe->state;
+ wqe->first_psn = rollback_wqe->first_psn;
+- wqe->last_psn = rollback_wqe->last_psn;
+- qp->req.psn = rollback_psn;
++ wqe->last_psn = rollback_wqe->last_psn;
++ wqe->dma = rollback_wqe->dma;
++ qp->req.psn = rollback_psn;
+ }
+
+ static void update_state(struct rxe_qp *qp, struct rxe_pkt_info *pkt)
+@@ -797,6 +799,9 @@ int rxe_requester(struct rxe_qp *qp)
+ pkt.mask = rxe_opcode[opcode].mask;
+ pkt.wqe = wqe;
+
++ /* save wqe state before we build and send packet */
++ save_state(wqe, qp, &rollback_wqe, &rollback_psn);
++
+ av = rxe_get_av(&pkt, &ah);
+ if (unlikely(!av)) {
+ rxe_dbg_qp(qp, "Failed no address vector\n");
+@@ -829,29 +834,29 @@ int rxe_requester(struct rxe_qp *qp)
+ if (ah)
+ rxe_put(ah);
+
+- /*
+- * To prevent a race on wqe access between requester and completer,
+- * wqe members state and psn need to be set before calling
+- * rxe_xmit_packet().
+- * Otherwise, completer might initiate an unjustified retry flow.
+- */
+- save_state(wqe, qp, &rollback_wqe, &rollback_psn);
++ /* update wqe state as though we had sent it */
+ update_wqe_state(qp, wqe, &pkt);
+ update_wqe_psn(qp, wqe, &pkt, payload);
+
+ err = rxe_xmit_packet(qp, &pkt, skb);
+ if (err) {
+- qp->need_req_skb = 1;
++ if (err != -EAGAIN) {
++ wqe->status = IB_WC_LOC_QP_OP_ERR;
++ goto err;
++ }
+
++ /* the packet was dropped so reset wqe to the state
++ * before we sent it so we can try to resend
++ */
+ rollback_state(wqe, qp, &rollback_wqe, rollback_psn);
+
+- if (err == -EAGAIN) {
+- rxe_sched_task(&qp->req.task);
+- goto exit;
+- }
++ /* force a delay until the dropped packet is freed and
++ * the send queue is drained below the low water mark
++ */
++ qp->need_req_skb = 1;
+
+- wqe->status = IB_WC_LOC_QP_OP_ERR;
+- goto err;
++ rxe_sched_task(&qp->req.task);
++ goto exit;
+ }
+
+ update_state(qp, &pkt);
+diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
+index ee68306555b99..ed5af55237d9f 100644
+--- a/drivers/infiniband/sw/rxe/rxe_resp.c
++++ b/drivers/infiniband/sw/rxe/rxe_resp.c
+@@ -1452,6 +1452,10 @@ static void flush_recv_queue(struct rxe_qp *qp, bool notify)
+ if (qp->srq)
+ return;
+
++ /* recv queue not created. nothing to do. */
++ if (!qp->rq.queue)
++ return;
++
+ while ((wqe = queue_head(q, q->type))) {
+ if (notify) {
+ err = flush_recv_wqe(qp, wqe);
+diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
+index 27ca82ec0826b..3661cb627d28a 100644
+--- a/drivers/infiniband/sw/rxe/rxe_srq.c
++++ b/drivers/infiniband/sw/rxe/rxe_srq.c
+@@ -45,40 +45,41 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
+ struct ib_srq_init_attr *init, struct ib_udata *udata,
+ struct rxe_create_srq_resp __user *uresp)
+ {
+- int err;
+- int srq_wqe_size;
+ struct rxe_queue *q;
+- enum queue_type type;
++ int wqe_size;
++ int err;
+
+- srq->ibsrq.event_handler = init->event_handler;
+- srq->ibsrq.srq_context = init->srq_context;
+- srq->limit = init->attr.srq_limit;
+- srq->srq_num = srq->elem.index;
+- srq->rq.max_wr = init->attr.max_wr;
+- srq->rq.max_sge = init->attr.max_sge;
++ srq->ibsrq.event_handler = init->event_handler;
++ srq->ibsrq.srq_context = init->srq_context;
++ srq->limit = init->attr.srq_limit;
++ srq->srq_num = srq->elem.index;
++ srq->rq.max_wr = init->attr.max_wr;
++ srq->rq.max_sge = init->attr.max_sge;
+
+- srq_wqe_size = rcv_wqe_size(srq->rq.max_sge);
++ wqe_size = sizeof(struct rxe_recv_wqe) +
++ srq->rq.max_sge*sizeof(struct ib_sge);
+
+ spin_lock_init(&srq->rq.producer_lock);
+ spin_lock_init(&srq->rq.consumer_lock);
+
+- type = QUEUE_TYPE_FROM_CLIENT;
+- q = rxe_queue_init(rxe, &srq->rq.max_wr, srq_wqe_size, type);
++ q = rxe_queue_init(rxe, &srq->rq.max_wr, wqe_size,
++ QUEUE_TYPE_FROM_CLIENT);
+ if (!q) {
+ rxe_dbg_srq(srq, "Unable to allocate queue\n");
+- return -ENOMEM;
++ err = -ENOMEM;
++ goto err_out;
+ }
+
+- srq->rq.queue = q;
+-
+ err = do_mmap_info(rxe, uresp ? &uresp->mi : NULL, udata, q->buf,
+ q->buf_size, &q->ip);
+ if (err) {
+- vfree(q->buf);
+- kfree(q);
+- return err;
++ rxe_dbg_srq(srq, "Unable to init mmap info for caller\n");
++ goto err_free;
+ }
+
++ srq->rq.queue = q;
++ init->attr.max_wr = srq->rq.max_wr;
++
+ if (uresp) {
+ if (copy_to_user(&uresp->srq_num, &srq->srq_num,
+ sizeof(uresp->srq_num))) {
+@@ -88,6 +89,12 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
+ }
+
+ return 0;
++
++err_free:
++ vfree(q->buf);
++ kfree(q);
++err_out:
++ return err;
+ }
+
+ int rxe_srq_chk_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
+@@ -145,9 +152,10 @@ int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
+ struct ib_srq_attr *attr, enum ib_srq_attr_mask mask,
+ struct rxe_modify_srq_cmd *ucmd, struct ib_udata *udata)
+ {
+- int err;
+ struct rxe_queue *q = srq->rq.queue;
+ struct mminfo __user *mi = NULL;
++ int wqe_size;
++ int err;
+
+ if (mask & IB_SRQ_MAX_WR) {
+ /*
+@@ -156,12 +164,16 @@ int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
+ */
+ mi = u64_to_user_ptr(ucmd->mmap_info_addr);
+
+- err = rxe_queue_resize(q, &attr->max_wr,
+- rcv_wqe_size(srq->rq.max_sge), udata, mi,
+- &srq->rq.producer_lock,
++ wqe_size = sizeof(struct rxe_recv_wqe) +
++ srq->rq.max_sge*sizeof(struct ib_sge);
++
++ err = rxe_queue_resize(q, &attr->max_wr, wqe_size,
++ udata, mi, &srq->rq.producer_lock,
+ &srq->rq.consumer_lock);
+ if (err)
+- goto err2;
++ goto err_free;
++
++ srq->rq.max_wr = attr->max_wr;
+ }
+
+ if (mask & IB_SRQ_LIMIT)
+@@ -169,7 +181,7 @@ int rxe_srq_from_attr(struct rxe_dev *rxe, struct rxe_srq *srq,
+
+ return 0;
+
+-err2:
++err_free:
+ rxe_queue_cleanup(q);
+ srq->rq.queue = NULL;
+ return err;
+diff --git a/drivers/infiniband/sw/siw/siw.h b/drivers/infiniband/sw/siw/siw.h
+index 2f3a9cda3850f..8b4a710b82bc1 100644
+--- a/drivers/infiniband/sw/siw/siw.h
++++ b/drivers/infiniband/sw/siw/siw.h
+@@ -74,6 +74,7 @@ struct siw_device {
+
+ u32 vendor_part_id;
+ int numa_node;
++ char raw_gid[ETH_ALEN];
+
+ /* physical port state (only one port per device) */
+ enum ib_port_state state;
+diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
+index da530c0404da4..a2605178f4eda 100644
+--- a/drivers/infiniband/sw/siw/siw_cm.c
++++ b/drivers/infiniband/sw/siw/siw_cm.c
+@@ -1501,7 +1501,6 @@ error:
+
+ cep->cm_id = NULL;
+ id->rem_ref(id);
+- siw_cep_put(cep);
+
+ qp->cep = NULL;
+ siw_cep_put(cep);
+diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c
+index 65b5cda5457ba..f45600d169ae7 100644
+--- a/drivers/infiniband/sw/siw/siw_main.c
++++ b/drivers/infiniband/sw/siw/siw_main.c
+@@ -75,8 +75,7 @@ static int siw_device_register(struct siw_device *sdev, const char *name)
+ return rv;
+ }
+
+- siw_dbg(base_dev, "HWaddr=%pM\n", sdev->netdev->dev_addr);
+-
++ siw_dbg(base_dev, "HWaddr=%pM\n", sdev->raw_gid);
+ return 0;
+ }
+
+@@ -313,24 +312,19 @@ static struct siw_device *siw_device_create(struct net_device *netdev)
+ return NULL;
+
+ base_dev = &sdev->base_dev;
+-
+ sdev->netdev = netdev;
+
+- if (netdev->type != ARPHRD_LOOPBACK && netdev->type != ARPHRD_NONE) {
+- addrconf_addr_eui48((unsigned char *)&base_dev->node_guid,
+- netdev->dev_addr);
++ if (netdev->addr_len) {
++ memcpy(sdev->raw_gid, netdev->dev_addr,
++ min_t(unsigned int, netdev->addr_len, ETH_ALEN));
+ } else {
+ /*
+- * This device does not have a HW address,
+- * but connection mangagement lib expects gid != 0
++ * This device does not have a HW address, but
++ * connection mangagement requires a unique gid.
+ */
+- size_t len = min_t(size_t, strlen(base_dev->name), 6);
+- char addr[6] = { };
+-
+- memcpy(addr, base_dev->name, len);
+- addrconf_addr_eui48((unsigned char *)&base_dev->node_guid,
+- addr);
++ eth_random_addr(sdev->raw_gid);
+ }
++ addrconf_addr_eui48((u8 *)&base_dev->node_guid, sdev->raw_gid);
+
+ base_dev->uverbs_cmd_mask |= BIT_ULL(IB_USER_VERBS_CMD_POST_SEND);
+
+diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
+index 398ec13db6248..10cabc792c68e 100644
+--- a/drivers/infiniband/sw/siw/siw_verbs.c
++++ b/drivers/infiniband/sw/siw/siw_verbs.c
+@@ -157,7 +157,7 @@ int siw_query_device(struct ib_device *base_dev, struct ib_device_attr *attr,
+ attr->vendor_part_id = sdev->vendor_part_id;
+
+ addrconf_addr_eui48((u8 *)&attr->sys_image_guid,
+- sdev->netdev->dev_addr);
++ sdev->raw_gid);
+
+ return 0;
+ }
+@@ -218,7 +218,7 @@ int siw_query_gid(struct ib_device *base_dev, u32 port, int idx,
+
+ /* subnet_prefix == interface_id == 0; */
+ memset(gid, 0, sizeof(*gid));
+- memcpy(&gid->raw[0], sdev->netdev->dev_addr, 6);
++ memcpy(gid->raw, sdev->raw_gid, ETH_ALEN);
+
+ return 0;
+ }
+@@ -1494,7 +1494,7 @@ int siw_map_mr_sg(struct ib_mr *base_mr, struct scatterlist *sl, int num_sle,
+
+ if (pbl->max_buf < num_sle) {
+ siw_dbg_mem(mem, "too many SGE's: %d > %d\n",
+- mem->pbl->max_buf, num_sle);
++ num_sle, pbl->max_buf);
+ return -ENOMEM;
+ }
+ for_each_sg(sl, slp, num_sle, i) {
+diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
+index 92e1e7587af8b..00a7303c8cc60 100644
+--- a/drivers/infiniband/ulp/isert/ib_isert.c
++++ b/drivers/infiniband/ulp/isert/ib_isert.c
+@@ -2570,6 +2570,8 @@ static void isert_wait_conn(struct iscsit_conn *conn)
+ isert_put_unsol_pending_cmds(conn);
+ isert_wait4cmds(conn);
+ isert_wait4logout(isert_conn);
++
++ queue_work(isert_release_wq, &isert_conn->release_work);
+ }
+
+ static void isert_free_conn(struct iscsit_conn *conn)
+diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
+index 0e513a7e5ac80..1574218764e0a 100644
+--- a/drivers/infiniband/ulp/srp/ib_srp.c
++++ b/drivers/infiniband/ulp/srp/ib_srp.c
+@@ -1979,12 +1979,8 @@ static void srp_process_rsp(struct srp_rdma_ch *ch, struct srp_rsp *rsp)
+
+ if (unlikely(rsp->flags & SRP_RSP_FLAG_DIUNDER))
+ scsi_set_resid(scmnd, be32_to_cpu(rsp->data_in_res_cnt));
+- else if (unlikely(rsp->flags & SRP_RSP_FLAG_DIOVER))
+- scsi_set_resid(scmnd, -be32_to_cpu(rsp->data_in_res_cnt));
+ else if (unlikely(rsp->flags & SRP_RSP_FLAG_DOUNDER))
+ scsi_set_resid(scmnd, be32_to_cpu(rsp->data_out_res_cnt));
+- else if (unlikely(rsp->flags & SRP_RSP_FLAG_DOOVER))
+- scsi_set_resid(scmnd, -be32_to_cpu(rsp->data_out_res_cnt));
+
+ srp_free_req(ch, req, scmnd,
+ be32_to_cpu(rsp->req_lim_delta));
+diff --git a/drivers/input/serio/i8042-acpipnpio.h b/drivers/input/serio/i8042-acpipnpio.h
+index 028e45bd050bf..1724d6cb8649d 100644
+--- a/drivers/input/serio/i8042-acpipnpio.h
++++ b/drivers/input/serio/i8042-acpipnpio.h
+@@ -1281,6 +1281,13 @@ static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = {
+ .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS |
+ SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP)
+ },
++ /* See comment on TUXEDO InfinityBook S17 Gen6 / Clevo NS70MU above */
++ {
++ .matches = {
++ DMI_MATCH(DMI_BOARD_NAME, "PD5x_7xPNP_PNR_PNN_PNT"),
++ },
++ .driver_data = (void *)(SERIO_QUIRK_NOAUX)
++ },
+ {
+ .matches = {
+ DMI_MATCH(DMI_BOARD_NAME, "X170SM"),
+diff --git a/drivers/interconnect/qcom/bcm-voter.c b/drivers/interconnect/qcom/bcm-voter.c
+index d5f2a6b5376bd..a2d437a05a11f 100644
+--- a/drivers/interconnect/qcom/bcm-voter.c
++++ b/drivers/interconnect/qcom/bcm-voter.c
+@@ -58,6 +58,36 @@ static u64 bcm_div(u64 num, u32 base)
+ return num;
+ }
+
++/* BCMs with enable_mask use one-hot-encoding for on/off signaling */
++static void bcm_aggregate_mask(struct qcom_icc_bcm *bcm)
++{
++ struct qcom_icc_node *node;
++ int bucket, i;
++
++ for (bucket = 0; bucket < QCOM_ICC_NUM_BUCKETS; bucket++) {
++ bcm->vote_x[bucket] = 0;
++ bcm->vote_y[bucket] = 0;
++
++ for (i = 0; i < bcm->num_nodes; i++) {
++ node = bcm->nodes[i];
++
++ /* If any vote in this bucket exists, keep the BCM enabled */
++ if (node->sum_avg[bucket] || node->max_peak[bucket]) {
++ bcm->vote_x[bucket] = 0;
++ bcm->vote_y[bucket] = bcm->enable_mask;
++ break;
++ }
++ }
++ }
++
++ if (bcm->keepalive) {
++ bcm->vote_x[QCOM_ICC_BUCKET_AMC] = bcm->enable_mask;
++ bcm->vote_x[QCOM_ICC_BUCKET_WAKE] = bcm->enable_mask;
++ bcm->vote_y[QCOM_ICC_BUCKET_AMC] = bcm->enable_mask;
++ bcm->vote_y[QCOM_ICC_BUCKET_WAKE] = bcm->enable_mask;
++ }
++}
++
+ static void bcm_aggregate(struct qcom_icc_bcm *bcm)
+ {
+ struct qcom_icc_node *node;
+@@ -83,11 +113,6 @@ static void bcm_aggregate(struct qcom_icc_bcm *bcm)
+
+ temp = agg_peak[bucket] * bcm->vote_scale;
+ bcm->vote_y[bucket] = bcm_div(temp, bcm->aux_data.unit);
+-
+- if (bcm->enable_mask && (bcm->vote_x[bucket] || bcm->vote_y[bucket])) {
+- bcm->vote_x[bucket] = 0;
+- bcm->vote_y[bucket] = bcm->enable_mask;
+- }
+ }
+
+ if (bcm->keepalive && bcm->vote_x[QCOM_ICC_BUCKET_AMC] == 0 &&
+@@ -260,8 +285,12 @@ int qcom_icc_bcm_voter_commit(struct bcm_voter *voter)
+ return 0;
+
+ mutex_lock(&voter->lock);
+- list_for_each_entry(bcm, &voter->commit_list, list)
+- bcm_aggregate(bcm);
++ list_for_each_entry(bcm, &voter->commit_list, list) {
++ if (bcm->enable_mask)
++ bcm_aggregate_mask(bcm);
++ else
++ bcm_aggregate(bcm);
++ }
+
+ /*
+ * Pre sort the BCMs based on VCD for ease of generating a command list
+diff --git a/drivers/interconnect/qcom/qcm2290.c b/drivers/interconnect/qcom/qcm2290.c
+index a29cdb4fac03f..82a2698ad66b1 100644
+--- a/drivers/interconnect/qcom/qcm2290.c
++++ b/drivers/interconnect/qcom/qcm2290.c
+@@ -1355,6 +1355,7 @@ static struct platform_driver qcm2290_noc_driver = {
+ .driver = {
+ .name = "qnoc-qcm2290",
+ .of_match_table = qcm2290_noc_of_match,
++ .sync_state = icc_sync_state,
+ },
+ };
+ module_platform_driver(qcm2290_noc_driver);
+diff --git a/drivers/interconnect/qcom/sm8450.c b/drivers/interconnect/qcom/sm8450.c
+index e64c214b40209..d6e582a02e628 100644
+--- a/drivers/interconnect/qcom/sm8450.c
++++ b/drivers/interconnect/qcom/sm8450.c
+@@ -1886,6 +1886,7 @@ static struct platform_driver qnoc_driver = {
+ .driver = {
+ .name = "qnoc-sm8450",
+ .of_match_table = qnoc_of_match,
++ .sync_state = icc_sync_state,
+ },
+ };
+
+diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
+index 261352a232716..65d78d7e04408 100644
+--- a/drivers/iommu/amd/iommu_v2.c
++++ b/drivers/iommu/amd/iommu_v2.c
+@@ -262,8 +262,8 @@ static void put_pasid_state(struct pasid_state *pasid_state)
+
+ static void put_pasid_state_wait(struct pasid_state *pasid_state)
+ {
+- refcount_dec(&pasid_state->count);
+- wait_event(pasid_state->wq, !refcount_read(&pasid_state->count));
++ if (!refcount_dec_and_test(&pasid_state->count))
++ wait_event(pasid_state->wq, !refcount_read(&pasid_state->count));
+ free_pasid_state(pasid_state);
+ }
+
+diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+index a503ed758ec30..3e551ca6afdb9 100644
+--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
++++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+@@ -273,6 +273,13 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
+ ctx->secure_init = true;
+ }
+
++ /* Disable context bank before programming */
++ iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
++
++ /* Clear context bank fault address fault status registers */
++ iommu_writel(ctx, ARM_SMMU_CB_FAR, 0);
++ iommu_writel(ctx, ARM_SMMU_CB_FSR, ARM_SMMU_FSR_FAULT);
++
+ /* TTBRs */
+ iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
+ pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
+diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
+index c5d479770e12e..49fc5a038a145 100644
+--- a/drivers/iommu/intel/pasid.c
++++ b/drivers/iommu/intel/pasid.c
+@@ -129,7 +129,7 @@ int intel_pasid_alloc_table(struct device *dev)
+ info->pasid_table = pasid_table;
+
+ if (!ecap_coherent(info->iommu->ecap))
+- clflush_cache_range(pasid_table->table, size);
++ clflush_cache_range(pasid_table->table, (1 << order) * PAGE_SIZE);
+
+ return 0;
+ }
+diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
+index f1dcfa3f1a1b4..88e7154f846d3 100644
+--- a/drivers/iommu/iommu.c
++++ b/drivers/iommu/iommu.c
+@@ -3196,7 +3196,7 @@ static void __iommu_release_dma_ownership(struct iommu_group *group)
+
+ /**
+ * iommu_group_release_dma_owner() - Release DMA ownership of a group
+- * @dev: The device
++ * @group: The group
+ *
+ * Release the DMA ownership claimed by iommu_group_claim_dma_owner().
+ */
+@@ -3210,7 +3210,7 @@ EXPORT_SYMBOL_GPL(iommu_group_release_dma_owner);
+
+ /**
+ * iommu_device_release_dma_owner() - Release DMA ownership of a device
+- * @group: The device.
++ * @dev: The device.
+ *
+ * Release the DMA ownership claimed by iommu_device_claim_dma_owner().
+ */
+diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
+index ed2937a4e196f..2e43ebf1a2b5c 100644
+--- a/drivers/iommu/iommufd/device.c
++++ b/drivers/iommu/iommufd/device.c
+@@ -298,8 +298,8 @@ static int iommufd_device_auto_get_domain(struct iommufd_device *idev,
+ }
+ hwpt->auto_domain = true;
+
+- mutex_unlock(&ioas->mutex);
+ iommufd_object_finalize(idev->ictx, &hwpt->obj);
++ mutex_unlock(&ioas->mutex);
+ return 0;
+ out_unlock:
+ mutex_unlock(&ioas->mutex);
+diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
+index e93906d6e112e..c2764891a779c 100644
+--- a/drivers/iommu/mtk_iommu.c
++++ b/drivers/iommu/mtk_iommu.c
+@@ -258,6 +258,8 @@ struct mtk_iommu_data {
+ struct device *smicomm_dev;
+
+ struct mtk_iommu_bank_data *bank;
++ struct mtk_iommu_domain *share_dom; /* For 2 HWs share pgtable */
++
+ struct regmap *pericfg;
+ struct mutex mutex; /* Protect m4u_group/m4u_dom above */
+
+@@ -620,15 +622,14 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom,
+ struct mtk_iommu_data *data,
+ unsigned int region_id)
+ {
++ struct mtk_iommu_domain *share_dom = data->share_dom;
+ const struct mtk_iommu_iova_region *region;
+- struct mtk_iommu_domain *m4u_dom;
+-
+- /* Always use bank0 in sharing pgtable case */
+- m4u_dom = data->bank[0].m4u_dom;
+- if (m4u_dom) {
+- dom->iop = m4u_dom->iop;
+- dom->cfg = m4u_dom->cfg;
+- dom->domain.pgsize_bitmap = m4u_dom->cfg.pgsize_bitmap;
++
++ /* Always use share domain in sharing pgtable case */
++ if (MTK_IOMMU_HAS_FLAG(data->plat_data, SHARE_PGTABLE) && share_dom) {
++ dom->iop = share_dom->iop;
++ dom->cfg = share_dom->cfg;
++ dom->domain.pgsize_bitmap = share_dom->cfg.pgsize_bitmap;
+ goto update_iova_region;
+ }
+
+@@ -658,6 +659,9 @@ static int mtk_iommu_domain_finalise(struct mtk_iommu_domain *dom,
+ /* Update our support page sizes bitmap */
+ dom->domain.pgsize_bitmap = dom->cfg.pgsize_bitmap;
+
++ if (MTK_IOMMU_HAS_FLAG(data->plat_data, SHARE_PGTABLE))
++ data->share_dom = dom;
++
+ update_iova_region:
+ /* Update the iova region for this domain */
+ region = data->plat_data->iova_region + region_id;
+@@ -708,7 +712,9 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
+ /* Data is in the frstdata in sharing pgtable case. */
+ frstdata = mtk_iommu_get_frst_data(hw_list);
+
++ mutex_lock(&frstdata->mutex);
+ ret = mtk_iommu_domain_finalise(dom, frstdata, region_id);
++ mutex_unlock(&frstdata->mutex);
+ if (ret) {
+ mutex_unlock(&dom->mutex);
+ return ret;
+diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
+index 4054030c32379..ae42959bc4905 100644
+--- a/drivers/iommu/rockchip-iommu.c
++++ b/drivers/iommu/rockchip-iommu.c
+@@ -98,8 +98,6 @@ struct rk_iommu_ops {
+ phys_addr_t (*pt_address)(u32 dte);
+ u32 (*mk_dtentries)(dma_addr_t pt_dma);
+ u32 (*mk_ptentries)(phys_addr_t page, int prot);
+- phys_addr_t (*dte_addr_phys)(u32 addr);
+- u32 (*dma_addr_dte)(dma_addr_t dt_dma);
+ u64 dma_bit_mask;
+ };
+
+@@ -278,8 +276,8 @@ static u32 rk_mk_pte(phys_addr_t page, int prot)
+ /*
+ * In v2:
+ * 31:12 - Page address bit 31:0
+- * 11:9 - Page address bit 34:32
+- * 8:4 - Page address bit 39:35
++ * 11: 8 - Page address bit 35:32
++ * 7: 4 - Page address bit 39:36
+ * 3 - Security
+ * 2 - Writable
+ * 1 - Readable
+@@ -506,7 +504,7 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu)
+
+ /*
+ * Check if register DTE_ADDR is working by writing DTE_ADDR_DUMMY
+- * and verifying that upper 5 nybbles are read back.
++ * and verifying that upper 5 (v1) or 7 (v2) nybbles are read back.
+ */
+ for (i = 0; i < iommu->num_mmu; i++) {
+ dte_addr = rk_ops->pt_address(DTE_ADDR_DUMMY);
+@@ -531,33 +529,6 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu)
+ return 0;
+ }
+
+-static inline phys_addr_t rk_dte_addr_phys(u32 addr)
+-{
+- return (phys_addr_t)addr;
+-}
+-
+-static inline u32 rk_dma_addr_dte(dma_addr_t dt_dma)
+-{
+- return dt_dma;
+-}
+-
+-#define DT_HI_MASK GENMASK_ULL(39, 32)
+-#define DTE_BASE_HI_MASK GENMASK(11, 4)
+-#define DT_SHIFT 28
+-
+-static inline phys_addr_t rk_dte_addr_phys_v2(u32 addr)
+-{
+- u64 addr64 = addr;
+- return (phys_addr_t)(addr64 & RK_DTE_PT_ADDRESS_MASK) |
+- ((addr64 & DTE_BASE_HI_MASK) << DT_SHIFT);
+-}
+-
+-static inline u32 rk_dma_addr_dte_v2(dma_addr_t dt_dma)
+-{
+- return (dt_dma & RK_DTE_PT_ADDRESS_MASK) |
+- ((dt_dma & DT_HI_MASK) >> DT_SHIFT);
+-}
+-
+ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova)
+ {
+ void __iomem *base = iommu->bases[index];
+@@ -577,7 +548,7 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova)
+ page_offset = rk_iova_page_offset(iova);
+
+ mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR);
+- mmu_dte_addr_phys = rk_ops->dte_addr_phys(mmu_dte_addr);
++ mmu_dte_addr_phys = rk_ops->pt_address(mmu_dte_addr);
+
+ dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
+ dte_addr = phys_to_virt(dte_addr_phys);
+@@ -967,7 +938,7 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
+
+ for (i = 0; i < iommu->num_mmu; i++) {
+ rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR,
+- rk_ops->dma_addr_dte(rk_domain->dt_dma));
++ rk_ops->mk_dtentries(rk_domain->dt_dma));
+ rk_iommu_base_command(iommu->bases[i], RK_MMU_CMD_ZAP_CACHE);
+ rk_iommu_write(iommu->bases[i], RK_MMU_INT_MASK, RK_MMU_IRQ_MASK);
+ }
+@@ -1405,8 +1376,6 @@ static struct rk_iommu_ops iommu_data_ops_v1 = {
+ .pt_address = &rk_dte_pt_address,
+ .mk_dtentries = &rk_mk_dte,
+ .mk_ptentries = &rk_mk_pte,
+- .dte_addr_phys = &rk_dte_addr_phys,
+- .dma_addr_dte = &rk_dma_addr_dte,
+ .dma_bit_mask = DMA_BIT_MASK(32),
+ };
+
+@@ -1414,8 +1383,6 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
+ .pt_address = &rk_dte_pt_address_v2,
+ .mk_dtentries = &rk_mk_dte_v2,
+ .mk_ptentries = &rk_mk_pte_v2,
+- .dte_addr_phys = &rk_dte_addr_phys_v2,
+- .dma_addr_dte = &rk_dma_addr_dte_v2,
+ .dma_bit_mask = DMA_BIT_MASK(40),
+ };
+
+diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
+index 39e34fdeccda7..eb684d8807cab 100644
+--- a/drivers/iommu/sprd-iommu.c
++++ b/drivers/iommu/sprd-iommu.c
+@@ -148,6 +148,7 @@ static struct iommu_domain *sprd_iommu_domain_alloc(unsigned int domain_type)
+
+ dom->domain.geometry.aperture_start = 0;
+ dom->domain.geometry.aperture_end = SZ_256M - 1;
++ dom->domain.geometry.force_aperture = true;
+
+ return &dom->domain;
+ }
+diff --git a/drivers/irqchip/irq-loongson-eiointc.c b/drivers/irqchip/irq-loongson-eiointc.c
+index a7fcde3e3ecc7..ae28918df5c59 100644
+--- a/drivers/irqchip/irq-loongson-eiointc.c
++++ b/drivers/irqchip/irq-loongson-eiointc.c
+@@ -143,7 +143,7 @@ static int eiointc_router_init(unsigned int cpu)
+ int i, bit;
+ uint32_t data;
+ uint32_t node = cpu_to_eio_node(cpu);
+- uint32_t index = eiointc_index(node);
++ int index = eiointc_index(node);
+
+ if (index < 0) {
+ pr_err("Error: invalid nodemap!\n");
+diff --git a/drivers/leds/led-class-multicolor.c b/drivers/leds/led-class-multicolor.c
+index e317408583df9..ec62a48116135 100644
+--- a/drivers/leds/led-class-multicolor.c
++++ b/drivers/leds/led-class-multicolor.c
+@@ -6,6 +6,7 @@
+ #include <linux/device.h>
+ #include <linux/init.h>
+ #include <linux/led-class-multicolor.h>
++#include <linux/math.h>
+ #include <linux/module.h>
+ #include <linux/slab.h>
+ #include <linux/uaccess.h>
+@@ -19,9 +20,10 @@ int led_mc_calc_color_components(struct led_classdev_mc *mcled_cdev,
+ int i;
+
+ for (i = 0; i < mcled_cdev->num_colors; i++)
+- mcled_cdev->subled_info[i].brightness = brightness *
+- mcled_cdev->subled_info[i].intensity /
+- led_cdev->max_brightness;
++ mcled_cdev->subled_info[i].brightness =
++ DIV_ROUND_CLOSEST(brightness *
++ mcled_cdev->subled_info[i].intensity,
++ led_cdev->max_brightness);
+
+ return 0;
+ }
+diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
+index 4a97cb7457888..aad8bc44459fe 100644
+--- a/drivers/leds/led-core.c
++++ b/drivers/leds/led-core.c
+@@ -419,15 +419,15 @@ int led_compose_name(struct device *dev, struct led_init_data *init_data,
+ struct fwnode_handle *fwnode = init_data->fwnode;
+ const char *devicename = init_data->devicename;
+
+- /* We want to label LEDs that can produce full range of colors
+- * as RGB, not multicolor */
+- BUG_ON(props.color == LED_COLOR_ID_MULTI);
+-
+ if (!led_classdev_name)
+ return -EINVAL;
+
+ led_parse_fwnode_props(dev, fwnode, &props);
+
++ /* We want to label LEDs that can produce full range of colors
++ * as RGB, not multicolor */
++ BUG_ON(props.color == LED_COLOR_ID_MULTI);
++
+ if (props.label) {
+ /*
+ * If init_data.devicename is NULL, then it indicates that
+diff --git a/drivers/leds/leds-pwm.c b/drivers/leds/leds-pwm.c
+index 29194cc382afb..87c199242f3c8 100644
+--- a/drivers/leds/leds-pwm.c
++++ b/drivers/leds/leds-pwm.c
+@@ -146,7 +146,7 @@ static int led_pwm_create_fwnode(struct device *dev, struct led_pwm_priv *priv)
+ led.name = to_of_node(fwnode)->name;
+
+ if (!led.name) {
+- ret = EINVAL;
++ ret = -EINVAL;
+ goto err_child_out;
+ }
+
+diff --git a/drivers/leds/trigger/ledtrig-tty.c b/drivers/leds/trigger/ledtrig-tty.c
+index f62db7e520b52..8ae0d2d284aff 100644
+--- a/drivers/leds/trigger/ledtrig-tty.c
++++ b/drivers/leds/trigger/ledtrig-tty.c
+@@ -7,6 +7,8 @@
+ #include <linux/tty.h>
+ #include <uapi/linux/serial.h>
+
++#define LEDTRIG_TTY_INTERVAL 50
++
+ struct ledtrig_tty_data {
+ struct led_classdev *led_cdev;
+ struct delayed_work dwork;
+@@ -122,17 +124,19 @@ static void ledtrig_tty_work(struct work_struct *work)
+
+ if (icount.rx != trigger_data->rx ||
+ icount.tx != trigger_data->tx) {
+- led_set_brightness_sync(trigger_data->led_cdev, LED_ON);
++ unsigned long interval = LEDTRIG_TTY_INTERVAL;
++
++ led_blink_set_oneshot(trigger_data->led_cdev, &interval,
++ &interval, 0);
+
+ trigger_data->rx = icount.rx;
+ trigger_data->tx = icount.tx;
+- } else {
+- led_set_brightness_sync(trigger_data->led_cdev, LED_OFF);
+ }
+
+ out:
+ mutex_unlock(&trigger_data->mutex);
+- schedule_delayed_work(&trigger_data->dwork, msecs_to_jiffies(100));
++ schedule_delayed_work(&trigger_data->dwork,
++ msecs_to_jiffies(LEDTRIG_TTY_INTERVAL * 2));
+ }
+
+ static struct attribute *ledtrig_tty_attrs[] = {
+diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
+index ea226a37b110a..ba6b4819d37e4 100644
+--- a/drivers/md/md-bitmap.c
++++ b/drivers/md/md-bitmap.c
+@@ -2504,6 +2504,10 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
+ if (backlog > COUNTER_MAX)
+ return -EINVAL;
+
++ rv = mddev_lock(mddev);
++ if (rv)
++ return rv;
++
+ /*
+ * Without write mostly device, it doesn't make sense to set
+ * backlog for max_write_behind.
+@@ -2517,6 +2521,7 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
+ if (!has_write_mostly) {
+ pr_warn_ratelimited("%s: can't set backlog, no write mostly device available\n",
+ mdname(mddev));
++ mddev_unlock(mddev);
+ return -EINVAL;
+ }
+
+@@ -2527,13 +2532,13 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
+ mddev_destroy_serial_pool(mddev, NULL, false);
+ } else if (backlog && !mddev->serial_info_pool) {
+ /* serial_info_pool is needed since backlog is not zero */
+- struct md_rdev *rdev;
+-
+ rdev_for_each(rdev, mddev)
+ mddev_create_serial_pool(mddev, rdev, false);
+ }
+ if (old_mwb != backlog)
+ md_bitmap_update_sb(mddev->bitmap);
++
++ mddev_unlock(mddev);
+ return len;
+ }
+
+diff --git a/drivers/md/md.c b/drivers/md/md.c
+index 32d7ba8069aef..a2904e10ae35e 100644
+--- a/drivers/md/md.c
++++ b/drivers/md/md.c
+@@ -477,11 +477,13 @@ EXPORT_SYMBOL_GPL(mddev_suspend);
+
+ void mddev_resume(struct mddev *mddev)
+ {
+- /* entred the memalloc scope from mddev_suspend() */
+- memalloc_noio_restore(mddev->noio_flag);
+ lockdep_assert_held(&mddev->reconfig_mutex);
+ if (--mddev->suspended)
+ return;
++
++ /* entred the memalloc scope from mddev_suspend() */
++ memalloc_noio_restore(mddev->noio_flag);
++
+ percpu_ref_resurrect(&mddev->active_io);
+ wake_up(&mddev->sb_wait);
+ mddev->pers->quiesce(mddev, 0);
+diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
+index d1ac73fcd8529..7c6a0b4437d8f 100644
+--- a/drivers/md/raid0.c
++++ b/drivers/md/raid0.c
+@@ -557,54 +557,20 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
+ bio_endio(bio);
+ }
+
+-static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
++static void raid0_map_submit_bio(struct mddev *mddev, struct bio *bio)
+ {
+ struct r0conf *conf = mddev->private;
+ struct strip_zone *zone;
+ struct md_rdev *tmp_dev;
+- sector_t bio_sector;
+- sector_t sector;
+- sector_t orig_sector;
+- unsigned chunk_sects;
+- unsigned sectors;
+-
+- if (unlikely(bio->bi_opf & REQ_PREFLUSH)
+- && md_flush_request(mddev, bio))
+- return true;
++ sector_t bio_sector = bio->bi_iter.bi_sector;
++ sector_t sector = bio_sector;
+
+- if (unlikely((bio_op(bio) == REQ_OP_DISCARD))) {
+- raid0_handle_discard(mddev, bio);
+- return true;
+- }
++ md_account_bio(mddev, &bio);
+
+- bio_sector = bio->bi_iter.bi_sector;
+- sector = bio_sector;
+- chunk_sects = mddev->chunk_sectors;
+-
+- sectors = chunk_sects -
+- (likely(is_power_of_2(chunk_sects))
+- ? (sector & (chunk_sects-1))
+- : sector_div(sector, chunk_sects));
+-
+- /* Restore due to sector_div */
+- sector = bio_sector;
+-
+- if (sectors < bio_sectors(bio)) {
+- struct bio *split = bio_split(bio, sectors, GFP_NOIO,
+- &mddev->bio_set);
+- bio_chain(split, bio);
+- submit_bio_noacct(bio);
+- bio = split;
+- }
+-
+- if (bio->bi_pool != &mddev->bio_set)
+- md_account_bio(mddev, &bio);
+-
+- orig_sector = sector;
+ zone = find_zone(mddev->private, &sector);
+ switch (conf->layout) {
+ case RAID0_ORIG_LAYOUT:
+- tmp_dev = map_sector(mddev, zone, orig_sector, &sector);
++ tmp_dev = map_sector(mddev, zone, bio_sector, &sector);
+ break;
+ case RAID0_ALT_MULTIZONE_LAYOUT:
+ tmp_dev = map_sector(mddev, zone, sector, &sector);
+@@ -612,13 +578,13 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
+ default:
+ WARN(1, "md/raid0:%s: Invalid layout\n", mdname(mddev));
+ bio_io_error(bio);
+- return true;
++ return;
+ }
+
+ if (unlikely(is_rdev_broken(tmp_dev))) {
+ bio_io_error(bio);
+ md_error(mddev, tmp_dev);
+- return true;
++ return;
+ }
+
+ bio_set_dev(bio, tmp_dev->bdev);
+@@ -630,6 +596,40 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
+ bio_sector);
+ mddev_check_write_zeroes(mddev, bio);
+ submit_bio_noacct(bio);
++}
++
++static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
++{
++ sector_t sector;
++ unsigned chunk_sects;
++ unsigned sectors;
++
++ if (unlikely(bio->bi_opf & REQ_PREFLUSH)
++ && md_flush_request(mddev, bio))
++ return true;
++
++ if (unlikely((bio_op(bio) == REQ_OP_DISCARD))) {
++ raid0_handle_discard(mddev, bio);
++ return true;
++ }
++
++ sector = bio->bi_iter.bi_sector;
++ chunk_sects = mddev->chunk_sectors;
++
++ sectors = chunk_sects -
++ (likely(is_power_of_2(chunk_sects))
++ ? (sector & (chunk_sects-1))
++ : sector_div(sector, chunk_sects));
++
++ if (sectors < bio_sectors(bio)) {
++ struct bio *split = bio_split(bio, sectors, GFP_NOIO,
++ &mddev->bio_set);
++ bio_chain(split, bio);
++ raid0_map_submit_bio(mddev, bio);
++ bio = split;
++ }
++
++ raid0_map_submit_bio(mddev, bio);
+ return true;
+ }
+
+diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
+index ee75b058438f3..925ab30c15d49 100644
+--- a/drivers/md/raid10.c
++++ b/drivers/md/raid10.c
+@@ -1321,6 +1321,25 @@ static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
+ }
+ }
+
++static struct md_rdev *dereference_rdev_and_rrdev(struct raid10_info *mirror,
++ struct md_rdev **prrdev)
++{
++ struct md_rdev *rdev, *rrdev;
++
++ rrdev = rcu_dereference(mirror->replacement);
++ /*
++ * Read replacement first to prevent reading both rdev and
++ * replacement as NULL during replacement replace rdev.
++ */
++ smp_mb();
++ rdev = rcu_dereference(mirror->rdev);
++ if (rdev == rrdev)
++ rrdev = NULL;
++
++ *prrdev = rrdev;
++ return rdev;
++}
++
+ static void wait_blocked_dev(struct mddev *mddev, struct r10bio *r10_bio)
+ {
+ int i;
+@@ -1331,11 +1350,9 @@ retry_wait:
+ blocked_rdev = NULL;
+ rcu_read_lock();
+ for (i = 0; i < conf->copies; i++) {
+- struct md_rdev *rdev = rcu_dereference(conf->mirrors[i].rdev);
+- struct md_rdev *rrdev = rcu_dereference(
+- conf->mirrors[i].replacement);
+- if (rdev == rrdev)
+- rrdev = NULL;
++ struct md_rdev *rdev, *rrdev;
++
++ rdev = dereference_rdev_and_rrdev(&conf->mirrors[i], &rrdev);
+ if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
+ atomic_inc(&rdev->nr_pending);
+ blocked_rdev = rdev;
+@@ -1464,15 +1481,7 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
+ int d = r10_bio->devs[i].devnum;
+ struct md_rdev *rdev, *rrdev;
+
+- rrdev = rcu_dereference(conf->mirrors[d].replacement);
+- /*
+- * Read replacement first to prevent reading both rdev and
+- * replacement as NULL during replacement replace rdev.
+- */
+- smp_mb();
+- rdev = rcu_dereference(conf->mirrors[d].rdev);
+- if (rdev == rrdev)
+- rrdev = NULL;
++ rdev = dereference_rdev_and_rrdev(&conf->mirrors[d], &rrdev);
+ if (rdev && (test_bit(Faulty, &rdev->flags)))
+ rdev = NULL;
+ if (rrdev && (test_bit(Faulty, &rrdev->flags)))
+@@ -1779,10 +1788,9 @@ retry_discard:
+ */
+ rcu_read_lock();
+ for (disk = 0; disk < geo->raid_disks; disk++) {
+- struct md_rdev *rdev = rcu_dereference(conf->mirrors[disk].rdev);
+- struct md_rdev *rrdev = rcu_dereference(
+- conf->mirrors[disk].replacement);
++ struct md_rdev *rdev, *rrdev;
+
++ rdev = dereference_rdev_and_rrdev(&conf->mirrors[disk], &rrdev);
+ r10_bio->devs[disk].bio = NULL;
+ r10_bio->devs[disk].repl_bio = NULL;
+
+diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
+index 46182b955aef8..21653e1ed9384 100644
+--- a/drivers/md/raid5-cache.c
++++ b/drivers/md/raid5-cache.c
+@@ -1260,14 +1260,13 @@ static void r5l_log_flush_endio(struct bio *bio)
+
+ if (bio->bi_status)
+ md_error(log->rdev->mddev, log->rdev);
++ bio_uninit(bio);
+
+ spin_lock_irqsave(&log->io_list_lock, flags);
+ list_for_each_entry(io, &log->flushing_ios, log_sibling)
+ r5l_io_run_stripes(io);
+ list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
+ spin_unlock_irqrestore(&log->io_list_lock, flags);
+-
+- bio_uninit(bio);
+ }
+
+ /*
+@@ -3164,12 +3163,15 @@ void r5l_exit_log(struct r5conf *conf)
+ {
+ struct r5l_log *log = conf->log;
+
+- /* Ensure disable_writeback_work wakes up and exits */
+- wake_up(&conf->mddev->sb_wait);
+- flush_work(&log->disable_writeback_work);
+ md_unregister_thread(&log->reclaim_thread);
+
++ /*
++ * 'reconfig_mutex' is held by caller, set 'confg->log' to NULL to
++ * ensure disable_writeback_work wakes up and exits.
++ */
+ conf->log = NULL;
++ wake_up(&conf->mddev->sb_wait);
++ flush_work(&log->disable_writeback_work);
+
+ mempool_exit(&log->meta_pool);
+ bioset_exit(&log->bs);
+diff --git a/drivers/media/cec/core/cec-adap.c b/drivers/media/cec/core/cec-adap.c
+index 241b1621b197c..09ca83c233299 100644
+--- a/drivers/media/cec/core/cec-adap.c
++++ b/drivers/media/cec/core/cec-adap.c
+@@ -385,8 +385,8 @@ static void cec_data_cancel(struct cec_data *data, u8 tx_status, u8 rx_status)
+ cec_queue_msg_monitor(adap, &data->msg, 1);
+
+ if (!data->blocking && data->msg.sequence)
+- /* Allow drivers to process the message first */
+- call_op(adap, received, &data->msg);
++ /* Allow drivers to react to a canceled transmit */
++ call_void_op(adap, adap_nb_transmit_canceled, &data->msg);
+
+ cec_data_completed(data);
+ }
+@@ -1348,7 +1348,7 @@ static void cec_adap_unconfigure(struct cec_adapter *adap)
+ cec_flush(adap);
+ wake_up_interruptible(&adap->kthread_waitq);
+ cec_post_state_event(adap);
+- call_void_op(adap, adap_configured, false);
++ call_void_op(adap, adap_unconfigured);
+ }
+
+ /*
+@@ -1539,7 +1539,7 @@ configured:
+ adap->kthread_config = NULL;
+ complete(&adap->config_completion);
+ mutex_unlock(&adap->lock);
+- call_void_op(adap, adap_configured, true);
++ call_void_op(adap, configured);
+ return 0;
+
+ unconfigure:
+diff --git a/drivers/media/cec/usb/pulse8/pulse8-cec.c b/drivers/media/cec/usb/pulse8/pulse8-cec.c
+index 04b13cdc38d2c..ba67587bd43ec 100644
+--- a/drivers/media/cec/usb/pulse8/pulse8-cec.c
++++ b/drivers/media/cec/usb/pulse8/pulse8-cec.c
+@@ -809,8 +809,11 @@ static void pulse8_ping_eeprom_work_handler(struct work_struct *work)
+
+ mutex_lock(&pulse8->lock);
+ cmd = MSGCODE_PING;
+- pulse8_send_and_wait(pulse8, &cmd, 1,
+- MSGCODE_COMMAND_ACCEPTED, 0);
++ if (pulse8_send_and_wait(pulse8, &cmd, 1,
++ MSGCODE_COMMAND_ACCEPTED, 0)) {
++ dev_warn(pulse8->dev, "failed to ping EEPROM\n");
++ goto unlock;
++ }
+
+ if (pulse8->vers < 2)
+ goto unlock;
+diff --git a/drivers/media/dvb-frontends/ascot2e.c b/drivers/media/dvb-frontends/ascot2e.c
+index 9b00b56230b61..cf8e5f1bd1018 100644
+--- a/drivers/media/dvb-frontends/ascot2e.c
++++ b/drivers/media/dvb-frontends/ascot2e.c
+@@ -533,7 +533,7 @@ struct dvb_frontend *ascot2e_attach(struct dvb_frontend *fe,
+ priv->i2c_address, priv->i2c);
+ return fe;
+ }
+-EXPORT_SYMBOL(ascot2e_attach);
++EXPORT_SYMBOL_GPL(ascot2e_attach);
+
+ MODULE_DESCRIPTION("Sony ASCOT2E terr/cab tuner driver");
+ MODULE_AUTHOR("info@netup.ru");
+diff --git a/drivers/media/dvb-frontends/atbm8830.c b/drivers/media/dvb-frontends/atbm8830.c
+index bdd16b9c58244..778c865085bf9 100644
+--- a/drivers/media/dvb-frontends/atbm8830.c
++++ b/drivers/media/dvb-frontends/atbm8830.c
+@@ -489,7 +489,7 @@ error_out:
+ return NULL;
+
+ }
+-EXPORT_SYMBOL(atbm8830_attach);
++EXPORT_SYMBOL_GPL(atbm8830_attach);
+
+ MODULE_DESCRIPTION("AltoBeam ATBM8830/8831 GB20600 demodulator driver");
+ MODULE_AUTHOR("David T. L. Wong <davidtlwong@gmail.com>");
+diff --git a/drivers/media/dvb-frontends/au8522_dig.c b/drivers/media/dvb-frontends/au8522_dig.c
+index 78cafdf279618..230436bf6cbd9 100644
+--- a/drivers/media/dvb-frontends/au8522_dig.c
++++ b/drivers/media/dvb-frontends/au8522_dig.c
+@@ -879,7 +879,7 @@ error:
+ au8522_release_state(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(au8522_attach);
++EXPORT_SYMBOL_GPL(au8522_attach);
+
+ static const struct dvb_frontend_ops au8522_ops = {
+ .delsys = { SYS_ATSC, SYS_DVBC_ANNEX_B },
+diff --git a/drivers/media/dvb-frontends/bcm3510.c b/drivers/media/dvb-frontends/bcm3510.c
+index 68b92b4419cff..b3f5c49accafd 100644
+--- a/drivers/media/dvb-frontends/bcm3510.c
++++ b/drivers/media/dvb-frontends/bcm3510.c
+@@ -835,7 +835,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(bcm3510_attach);
++EXPORT_SYMBOL_GPL(bcm3510_attach);
+
+ static const struct dvb_frontend_ops bcm3510_ops = {
+ .delsys = { SYS_ATSC, SYS_DVBC_ANNEX_B },
+diff --git a/drivers/media/dvb-frontends/cx22700.c b/drivers/media/dvb-frontends/cx22700.c
+index b39ff516271b2..1d04c0a652b26 100644
+--- a/drivers/media/dvb-frontends/cx22700.c
++++ b/drivers/media/dvb-frontends/cx22700.c
+@@ -432,4 +432,4 @@ MODULE_DESCRIPTION("Conexant CX22700 DVB-T Demodulator driver");
+ MODULE_AUTHOR("Holger Waechtler");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(cx22700_attach);
++EXPORT_SYMBOL_GPL(cx22700_attach);
+diff --git a/drivers/media/dvb-frontends/cx22702.c b/drivers/media/dvb-frontends/cx22702.c
+index cc6acbf6393d4..61ad34b7004b5 100644
+--- a/drivers/media/dvb-frontends/cx22702.c
++++ b/drivers/media/dvb-frontends/cx22702.c
+@@ -604,7 +604,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(cx22702_attach);
++EXPORT_SYMBOL_GPL(cx22702_attach);
+
+ static const struct dvb_frontend_ops cx22702_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/dvb-frontends/cx24110.c b/drivers/media/dvb-frontends/cx24110.c
+index 6f99d6a27be2d..9aeea089756fe 100644
+--- a/drivers/media/dvb-frontends/cx24110.c
++++ b/drivers/media/dvb-frontends/cx24110.c
+@@ -653,4 +653,4 @@ MODULE_DESCRIPTION("Conexant CX24110 DVB-S Demodulator driver");
+ MODULE_AUTHOR("Peter Hettkamp");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(cx24110_attach);
++EXPORT_SYMBOL_GPL(cx24110_attach);
+diff --git a/drivers/media/dvb-frontends/cx24113.c b/drivers/media/dvb-frontends/cx24113.c
+index dd55d314bf9af..203cb6b3f941b 100644
+--- a/drivers/media/dvb-frontends/cx24113.c
++++ b/drivers/media/dvb-frontends/cx24113.c
+@@ -590,7 +590,7 @@ error:
+
+ return NULL;
+ }
+-EXPORT_SYMBOL(cx24113_attach);
++EXPORT_SYMBOL_GPL(cx24113_attach);
+
+ module_param(debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Activates frontend debugging (default:0)");
+diff --git a/drivers/media/dvb-frontends/cx24116.c b/drivers/media/dvb-frontends/cx24116.c
+index ea8264ccbb4e8..8b978a9f74a4e 100644
+--- a/drivers/media/dvb-frontends/cx24116.c
++++ b/drivers/media/dvb-frontends/cx24116.c
+@@ -1133,7 +1133,7 @@ struct dvb_frontend *cx24116_attach(const struct cx24116_config *config,
+ state->frontend.demodulator_priv = state;
+ return &state->frontend;
+ }
+-EXPORT_SYMBOL(cx24116_attach);
++EXPORT_SYMBOL_GPL(cx24116_attach);
+
+ /*
+ * Initialise or wake up device
+diff --git a/drivers/media/dvb-frontends/cx24120.c b/drivers/media/dvb-frontends/cx24120.c
+index d8acd582c7111..44515fdbe91d4 100644
+--- a/drivers/media/dvb-frontends/cx24120.c
++++ b/drivers/media/dvb-frontends/cx24120.c
+@@ -305,7 +305,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(cx24120_attach);
++EXPORT_SYMBOL_GPL(cx24120_attach);
+
+ static int cx24120_test_rom(struct cx24120_state *state)
+ {
+@@ -973,7 +973,9 @@ static void cx24120_set_clock_ratios(struct dvb_frontend *fe)
+ cmd.arg[8] = (clock_ratios_table[idx].rate >> 8) & 0xff;
+ cmd.arg[9] = (clock_ratios_table[idx].rate >> 0) & 0xff;
+
+- cx24120_message_send(state, &cmd);
++ ret = cx24120_message_send(state, &cmd);
++ if (ret != 0)
++ return;
+
+ /* Calculate ber window rates for stat work */
+ cx24120_calculate_ber_window(state, clock_ratios_table[idx].rate);
+diff --git a/drivers/media/dvb-frontends/cx24123.c b/drivers/media/dvb-frontends/cx24123.c
+index 3d84ee17e54c6..539889e638ccc 100644
+--- a/drivers/media/dvb-frontends/cx24123.c
++++ b/drivers/media/dvb-frontends/cx24123.c
+@@ -1096,7 +1096,7 @@ error:
+
+ return NULL;
+ }
+-EXPORT_SYMBOL(cx24123_attach);
++EXPORT_SYMBOL_GPL(cx24123_attach);
+
+ static const struct dvb_frontend_ops cx24123_ops = {
+ .delsys = { SYS_DVBS },
+diff --git a/drivers/media/dvb-frontends/cxd2820r_core.c b/drivers/media/dvb-frontends/cxd2820r_core.c
+index 47aa40967171d..3e7cb2756a787 100644
+--- a/drivers/media/dvb-frontends/cxd2820r_core.c
++++ b/drivers/media/dvb-frontends/cxd2820r_core.c
+@@ -536,7 +536,7 @@ struct dvb_frontend *cxd2820r_attach(const struct cxd2820r_config *config,
+
+ return pdata.get_dvb_frontend(client);
+ }
+-EXPORT_SYMBOL(cxd2820r_attach);
++EXPORT_SYMBOL_GPL(cxd2820r_attach);
+
+ static struct dvb_frontend *cxd2820r_get_dvb_frontend(struct i2c_client *client)
+ {
+diff --git a/drivers/media/dvb-frontends/cxd2841er.c b/drivers/media/dvb-frontends/cxd2841er.c
+index 5431f922f55e4..e9d1eef40c627 100644
+--- a/drivers/media/dvb-frontends/cxd2841er.c
++++ b/drivers/media/dvb-frontends/cxd2841er.c
+@@ -3930,14 +3930,14 @@ struct dvb_frontend *cxd2841er_attach_s(struct cxd2841er_config *cfg,
+ {
+ return cxd2841er_attach(cfg, i2c, SYS_DVBS);
+ }
+-EXPORT_SYMBOL(cxd2841er_attach_s);
++EXPORT_SYMBOL_GPL(cxd2841er_attach_s);
+
+ struct dvb_frontend *cxd2841er_attach_t_c(struct cxd2841er_config *cfg,
+ struct i2c_adapter *i2c)
+ {
+ return cxd2841er_attach(cfg, i2c, 0);
+ }
+-EXPORT_SYMBOL(cxd2841er_attach_t_c);
++EXPORT_SYMBOL_GPL(cxd2841er_attach_t_c);
+
+ static const struct dvb_frontend_ops cxd2841er_dvbs_s2_ops = {
+ .delsys = { SYS_DVBS, SYS_DVBS2 },
+diff --git a/drivers/media/dvb-frontends/cxd2880/cxd2880_top.c b/drivers/media/dvb-frontends/cxd2880/cxd2880_top.c
+index d5b1b3788e392..09d31c368741d 100644
+--- a/drivers/media/dvb-frontends/cxd2880/cxd2880_top.c
++++ b/drivers/media/dvb-frontends/cxd2880/cxd2880_top.c
+@@ -1950,7 +1950,7 @@ struct dvb_frontend *cxd2880_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(cxd2880_attach);
++EXPORT_SYMBOL_GPL(cxd2880_attach);
+
+ MODULE_DESCRIPTION("Sony CXD2880 DVB-T2/T tuner + demod driver");
+ MODULE_AUTHOR("Sony Semiconductor Solutions Corporation");
+diff --git a/drivers/media/dvb-frontends/dib0070.c b/drivers/media/dvb-frontends/dib0070.c
+index cafb41dba861c..9a8e7cdd2a247 100644
+--- a/drivers/media/dvb-frontends/dib0070.c
++++ b/drivers/media/dvb-frontends/dib0070.c
+@@ -762,7 +762,7 @@ free_mem:
+ fe->tuner_priv = NULL;
+ return NULL;
+ }
+-EXPORT_SYMBOL(dib0070_attach);
++EXPORT_SYMBOL_GPL(dib0070_attach);
+
+ MODULE_AUTHOR("Patrick Boettcher <patrick.boettcher@posteo.de>");
+ MODULE_DESCRIPTION("Driver for the DiBcom 0070 base-band RF Tuner");
+diff --git a/drivers/media/dvb-frontends/dib0090.c b/drivers/media/dvb-frontends/dib0090.c
+index 903da33642dff..c958bcff026ec 100644
+--- a/drivers/media/dvb-frontends/dib0090.c
++++ b/drivers/media/dvb-frontends/dib0090.c
+@@ -2634,7 +2634,7 @@ struct dvb_frontend *dib0090_register(struct dvb_frontend *fe, struct i2c_adapte
+ return NULL;
+ }
+
+-EXPORT_SYMBOL(dib0090_register);
++EXPORT_SYMBOL_GPL(dib0090_register);
+
+ struct dvb_frontend *dib0090_fw_register(struct dvb_frontend *fe, struct i2c_adapter *i2c, const struct dib0090_config *config)
+ {
+@@ -2660,7 +2660,7 @@ free_mem:
+ fe->tuner_priv = NULL;
+ return NULL;
+ }
+-EXPORT_SYMBOL(dib0090_fw_register);
++EXPORT_SYMBOL_GPL(dib0090_fw_register);
+
+ MODULE_AUTHOR("Patrick Boettcher <patrick.boettcher@posteo.de>");
+ MODULE_AUTHOR("Olivier Grenie <olivier.grenie@parrot.com>");
+diff --git a/drivers/media/dvb-frontends/dib3000mb.c b/drivers/media/dvb-frontends/dib3000mb.c
+index a6c2fc4586eb3..c598b2a633256 100644
+--- a/drivers/media/dvb-frontends/dib3000mb.c
++++ b/drivers/media/dvb-frontends/dib3000mb.c
+@@ -815,4 +815,4 @@ MODULE_AUTHOR(DRIVER_AUTHOR);
+ MODULE_DESCRIPTION(DRIVER_DESC);
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(dib3000mb_attach);
++EXPORT_SYMBOL_GPL(dib3000mb_attach);
+diff --git a/drivers/media/dvb-frontends/dib3000mc.c b/drivers/media/dvb-frontends/dib3000mc.c
+index 2e11a246aae0d..c2fca8289abae 100644
+--- a/drivers/media/dvb-frontends/dib3000mc.c
++++ b/drivers/media/dvb-frontends/dib3000mc.c
+@@ -935,7 +935,7 @@ error:
+ kfree(st);
+ return NULL;
+ }
+-EXPORT_SYMBOL(dib3000mc_attach);
++EXPORT_SYMBOL_GPL(dib3000mc_attach);
+
+ static const struct dvb_frontend_ops dib3000mc_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/dvb-frontends/dib7000m.c b/drivers/media/dvb-frontends/dib7000m.c
+index 97ce97789c9e3..fdb22f32e3a11 100644
+--- a/drivers/media/dvb-frontends/dib7000m.c
++++ b/drivers/media/dvb-frontends/dib7000m.c
+@@ -1434,7 +1434,7 @@ error:
+ kfree(st);
+ return NULL;
+ }
+-EXPORT_SYMBOL(dib7000m_attach);
++EXPORT_SYMBOL_GPL(dib7000m_attach);
+
+ static const struct dvb_frontend_ops dib7000m_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/dvb-frontends/dib7000p.c b/drivers/media/dvb-frontends/dib7000p.c
+index a90d2f51868ff..d1e53de5206ae 100644
+--- a/drivers/media/dvb-frontends/dib7000p.c
++++ b/drivers/media/dvb-frontends/dib7000p.c
+@@ -497,7 +497,7 @@ static int dib7000p_update_pll(struct dvb_frontend *fe, struct dibx000_bandwidth
+ prediv = reg_1856 & 0x3f;
+ loopdiv = (reg_1856 >> 6) & 0x3f;
+
+- if ((bw != NULL) && (bw->pll_prediv != prediv || bw->pll_ratio != loopdiv)) {
++ if (loopdiv && bw && (bw->pll_prediv != prediv || bw->pll_ratio != loopdiv)) {
+ dprintk("Updating pll (prediv: old = %d new = %d ; loopdiv : old = %d new = %d)\n", prediv, bw->pll_prediv, loopdiv, bw->pll_ratio);
+ reg_1856 &= 0xf000;
+ reg_1857 = dib7000p_read_word(state, 1857);
+@@ -2822,7 +2822,7 @@ void *dib7000p_attach(struct dib7000p_ops *ops)
+
+ return ops;
+ }
+-EXPORT_SYMBOL(dib7000p_attach);
++EXPORT_SYMBOL_GPL(dib7000p_attach);
+
+ static const struct dvb_frontend_ops dib7000p_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/dvb-frontends/dib8000.c b/drivers/media/dvb-frontends/dib8000.c
+index fe19d127abb3f..301d8eca7a6f9 100644
+--- a/drivers/media/dvb-frontends/dib8000.c
++++ b/drivers/media/dvb-frontends/dib8000.c
+@@ -4527,7 +4527,7 @@ void *dib8000_attach(struct dib8000_ops *ops)
+
+ return ops;
+ }
+-EXPORT_SYMBOL(dib8000_attach);
++EXPORT_SYMBOL_GPL(dib8000_attach);
+
+ MODULE_AUTHOR("Olivier Grenie <Olivier.Grenie@parrot.com, Patrick Boettcher <patrick.boettcher@posteo.de>");
+ MODULE_DESCRIPTION("Driver for the DiBcom 8000 ISDB-T demodulator");
+diff --git a/drivers/media/dvb-frontends/dib9000.c b/drivers/media/dvb-frontends/dib9000.c
+index 914ca820c174b..6f81890b31eeb 100644
+--- a/drivers/media/dvb-frontends/dib9000.c
++++ b/drivers/media/dvb-frontends/dib9000.c
+@@ -2546,7 +2546,7 @@ error:
+ kfree(st);
+ return NULL;
+ }
+-EXPORT_SYMBOL(dib9000_attach);
++EXPORT_SYMBOL_GPL(dib9000_attach);
+
+ static const struct dvb_frontend_ops dib9000_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/dvb-frontends/drx39xyj/drxj.c b/drivers/media/dvb-frontends/drx39xyj/drxj.c
+index 68f4e8b5a0abb..a738573c8cd7a 100644
+--- a/drivers/media/dvb-frontends/drx39xyj/drxj.c
++++ b/drivers/media/dvb-frontends/drx39xyj/drxj.c
+@@ -12372,7 +12372,7 @@ error:
+
+ return NULL;
+ }
+-EXPORT_SYMBOL(drx39xxj_attach);
++EXPORT_SYMBOL_GPL(drx39xxj_attach);
+
+ static const struct dvb_frontend_ops drx39xxj_ops = {
+ .delsys = { SYS_ATSC, SYS_DVBC_ANNEX_B },
+diff --git a/drivers/media/dvb-frontends/drxd_hard.c b/drivers/media/dvb-frontends/drxd_hard.c
+index 9860cae65f1cf..6a531937f4bbb 100644
+--- a/drivers/media/dvb-frontends/drxd_hard.c
++++ b/drivers/media/dvb-frontends/drxd_hard.c
+@@ -2939,7 +2939,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(drxd_attach);
++EXPORT_SYMBOL_GPL(drxd_attach);
+
+ MODULE_DESCRIPTION("DRXD driver");
+ MODULE_AUTHOR("Micronas");
+diff --git a/drivers/media/dvb-frontends/drxk_hard.c b/drivers/media/dvb-frontends/drxk_hard.c
+index 3301ef75d4417..1acdd204c25ce 100644
+--- a/drivers/media/dvb-frontends/drxk_hard.c
++++ b/drivers/media/dvb-frontends/drxk_hard.c
+@@ -6833,7 +6833,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(drxk_attach);
++EXPORT_SYMBOL_GPL(drxk_attach);
+
+ MODULE_DESCRIPTION("DRX-K driver");
+ MODULE_AUTHOR("Ralph Metzler");
+diff --git a/drivers/media/dvb-frontends/ds3000.c b/drivers/media/dvb-frontends/ds3000.c
+index 20fcf31af1658..515aa7c7baf2a 100644
+--- a/drivers/media/dvb-frontends/ds3000.c
++++ b/drivers/media/dvb-frontends/ds3000.c
+@@ -859,7 +859,7 @@ struct dvb_frontend *ds3000_attach(const struct ds3000_config *config,
+ ds3000_set_voltage(&state->frontend, SEC_VOLTAGE_OFF);
+ return &state->frontend;
+ }
+-EXPORT_SYMBOL(ds3000_attach);
++EXPORT_SYMBOL_GPL(ds3000_attach);
+
+ static int ds3000_set_carrier_offset(struct dvb_frontend *fe,
+ s32 carrier_offset_khz)
+diff --git a/drivers/media/dvb-frontends/dvb-pll.c b/drivers/media/dvb-frontends/dvb-pll.c
+index e35e00db7dbb3..b0199ec866705 100644
+--- a/drivers/media/dvb-frontends/dvb-pll.c
++++ b/drivers/media/dvb-frontends/dvb-pll.c
+@@ -866,7 +866,7 @@ out:
+
+ return NULL;
+ }
+-EXPORT_SYMBOL(dvb_pll_attach);
++EXPORT_SYMBOL_GPL(dvb_pll_attach);
+
+
+ static int
+diff --git a/drivers/media/dvb-frontends/ec100.c b/drivers/media/dvb-frontends/ec100.c
+index 03bd80666cf83..2ad0a3c2f7567 100644
+--- a/drivers/media/dvb-frontends/ec100.c
++++ b/drivers/media/dvb-frontends/ec100.c
+@@ -299,7 +299,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(ec100_attach);
++EXPORT_SYMBOL_GPL(ec100_attach);
+
+ static const struct dvb_frontend_ops ec100_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/dvb-frontends/helene.c b/drivers/media/dvb-frontends/helene.c
+index e4bbf6a51a2bc..685cf1c908bde 100644
+--- a/drivers/media/dvb-frontends/helene.c
++++ b/drivers/media/dvb-frontends/helene.c
+@@ -1025,7 +1025,7 @@ struct dvb_frontend *helene_attach_s(struct dvb_frontend *fe,
+ priv->i2c_address, priv->i2c);
+ return fe;
+ }
+-EXPORT_SYMBOL(helene_attach_s);
++EXPORT_SYMBOL_GPL(helene_attach_s);
+
+ struct dvb_frontend *helene_attach(struct dvb_frontend *fe,
+ const struct helene_config *config,
+@@ -1061,7 +1061,7 @@ struct dvb_frontend *helene_attach(struct dvb_frontend *fe,
+ priv->i2c_address, priv->i2c);
+ return fe;
+ }
+-EXPORT_SYMBOL(helene_attach);
++EXPORT_SYMBOL_GPL(helene_attach);
+
+ static int helene_probe(struct i2c_client *client)
+ {
+diff --git a/drivers/media/dvb-frontends/horus3a.c b/drivers/media/dvb-frontends/horus3a.c
+index 24bf5cbcc1846..0330b78a5b3f2 100644
+--- a/drivers/media/dvb-frontends/horus3a.c
++++ b/drivers/media/dvb-frontends/horus3a.c
+@@ -395,7 +395,7 @@ struct dvb_frontend *horus3a_attach(struct dvb_frontend *fe,
+ priv->i2c_address, priv->i2c);
+ return fe;
+ }
+-EXPORT_SYMBOL(horus3a_attach);
++EXPORT_SYMBOL_GPL(horus3a_attach);
+
+ MODULE_DESCRIPTION("Sony HORUS3A satellite tuner driver");
+ MODULE_AUTHOR("Sergey Kozlov <serjk@netup.ru>");
+diff --git a/drivers/media/dvb-frontends/isl6405.c b/drivers/media/dvb-frontends/isl6405.c
+index 2cd69b4ff82cb..7d28a743f97eb 100644
+--- a/drivers/media/dvb-frontends/isl6405.c
++++ b/drivers/media/dvb-frontends/isl6405.c
+@@ -141,7 +141,7 @@ struct dvb_frontend *isl6405_attach(struct dvb_frontend *fe, struct i2c_adapter
+
+ return fe;
+ }
+-EXPORT_SYMBOL(isl6405_attach);
++EXPORT_SYMBOL_GPL(isl6405_attach);
+
+ MODULE_DESCRIPTION("Driver for lnb supply and control ic isl6405");
+ MODULE_AUTHOR("Hartmut Hackmann & Oliver Endriss");
+diff --git a/drivers/media/dvb-frontends/isl6421.c b/drivers/media/dvb-frontends/isl6421.c
+index 43b0dfc6f453e..2e9f6f12f849e 100644
+--- a/drivers/media/dvb-frontends/isl6421.c
++++ b/drivers/media/dvb-frontends/isl6421.c
+@@ -213,7 +213,7 @@ struct dvb_frontend *isl6421_attach(struct dvb_frontend *fe, struct i2c_adapter
+
+ return fe;
+ }
+-EXPORT_SYMBOL(isl6421_attach);
++EXPORT_SYMBOL_GPL(isl6421_attach);
+
+ MODULE_DESCRIPTION("Driver for lnb supply and control ic isl6421");
+ MODULE_AUTHOR("Andrew de Quincey & Oliver Endriss");
+diff --git a/drivers/media/dvb-frontends/isl6423.c b/drivers/media/dvb-frontends/isl6423.c
+index 8cd1bb88ce6e7..a0d0a38340574 100644
+--- a/drivers/media/dvb-frontends/isl6423.c
++++ b/drivers/media/dvb-frontends/isl6423.c
+@@ -289,7 +289,7 @@ exit:
+ fe->sec_priv = NULL;
+ return NULL;
+ }
+-EXPORT_SYMBOL(isl6423_attach);
++EXPORT_SYMBOL_GPL(isl6423_attach);
+
+ MODULE_DESCRIPTION("ISL6423 SEC");
+ MODULE_AUTHOR("Manu Abraham");
+diff --git a/drivers/media/dvb-frontends/itd1000.c b/drivers/media/dvb-frontends/itd1000.c
+index 1b33478653d16..f8f362f50e78d 100644
+--- a/drivers/media/dvb-frontends/itd1000.c
++++ b/drivers/media/dvb-frontends/itd1000.c
+@@ -389,7 +389,7 @@ struct dvb_frontend *itd1000_attach(struct dvb_frontend *fe, struct i2c_adapter
+
+ return fe;
+ }
+-EXPORT_SYMBOL(itd1000_attach);
++EXPORT_SYMBOL_GPL(itd1000_attach);
+
+ MODULE_AUTHOR("Patrick Boettcher <pb@linuxtv.org>");
+ MODULE_DESCRIPTION("Integrant ITD1000 driver");
+diff --git a/drivers/media/dvb-frontends/ix2505v.c b/drivers/media/dvb-frontends/ix2505v.c
+index 73f27105c139d..3212e333d472b 100644
+--- a/drivers/media/dvb-frontends/ix2505v.c
++++ b/drivers/media/dvb-frontends/ix2505v.c
+@@ -302,7 +302,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(ix2505v_attach);
++EXPORT_SYMBOL_GPL(ix2505v_attach);
+
+ module_param_named(debug, ix2505v_debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+diff --git a/drivers/media/dvb-frontends/l64781.c b/drivers/media/dvb-frontends/l64781.c
+index c5106a1ea1cd0..fe5af2453d559 100644
+--- a/drivers/media/dvb-frontends/l64781.c
++++ b/drivers/media/dvb-frontends/l64781.c
+@@ -593,4 +593,4 @@ MODULE_DESCRIPTION("LSI L64781 DVB-T Demodulator driver");
+ MODULE_AUTHOR("Holger Waechtler, Marko Kohtala");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(l64781_attach);
++EXPORT_SYMBOL_GPL(l64781_attach);
+diff --git a/drivers/media/dvb-frontends/lg2160.c b/drivers/media/dvb-frontends/lg2160.c
+index f343066c297e2..fe700aa56bff3 100644
+--- a/drivers/media/dvb-frontends/lg2160.c
++++ b/drivers/media/dvb-frontends/lg2160.c
+@@ -1426,7 +1426,7 @@ struct dvb_frontend *lg2160_attach(const struct lg2160_config *config,
+
+ return &state->frontend;
+ }
+-EXPORT_SYMBOL(lg2160_attach);
++EXPORT_SYMBOL_GPL(lg2160_attach);
+
+ MODULE_DESCRIPTION("LG Electronics LG216x ATSC/MH Demodulator Driver");
+ MODULE_AUTHOR("Michael Krufky <mkrufky@linuxtv.org>");
+diff --git a/drivers/media/dvb-frontends/lgdt3305.c b/drivers/media/dvb-frontends/lgdt3305.c
+index 62d7439889196..60a97f1cc74e5 100644
+--- a/drivers/media/dvb-frontends/lgdt3305.c
++++ b/drivers/media/dvb-frontends/lgdt3305.c
+@@ -1148,7 +1148,7 @@ fail:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(lgdt3305_attach);
++EXPORT_SYMBOL_GPL(lgdt3305_attach);
+
+ static const struct dvb_frontend_ops lgdt3304_ops = {
+ .delsys = { SYS_ATSC, SYS_DVBC_ANNEX_B },
+diff --git a/drivers/media/dvb-frontends/lgdt3306a.c b/drivers/media/dvb-frontends/lgdt3306a.c
+index 6bf723b5ffad8..cfd76e0746336 100644
+--- a/drivers/media/dvb-frontends/lgdt3306a.c
++++ b/drivers/media/dvb-frontends/lgdt3306a.c
+@@ -1859,7 +1859,7 @@ fail:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(lgdt3306a_attach);
++EXPORT_SYMBOL_GPL(lgdt3306a_attach);
+
+ #ifdef DBG_DUMP
+
+diff --git a/drivers/media/dvb-frontends/lgdt330x.c b/drivers/media/dvb-frontends/lgdt330x.c
+index 1d6932d8e4978..5cd58efc38433 100644
+--- a/drivers/media/dvb-frontends/lgdt330x.c
++++ b/drivers/media/dvb-frontends/lgdt330x.c
+@@ -927,7 +927,7 @@ struct dvb_frontend *lgdt330x_attach(const struct lgdt330x_config *_config,
+
+ return lgdt330x_get_dvb_frontend(client);
+ }
+-EXPORT_SYMBOL(lgdt330x_attach);
++EXPORT_SYMBOL_GPL(lgdt330x_attach);
+
+ static const struct dvb_frontend_ops lgdt3302_ops = {
+ .delsys = { SYS_ATSC, SYS_DVBC_ANNEX_B },
+diff --git a/drivers/media/dvb-frontends/lgs8gxx.c b/drivers/media/dvb-frontends/lgs8gxx.c
+index 30014979b985b..ffaf60e16ecd4 100644
+--- a/drivers/media/dvb-frontends/lgs8gxx.c
++++ b/drivers/media/dvb-frontends/lgs8gxx.c
+@@ -1043,7 +1043,7 @@ error_out:
+ return NULL;
+
+ }
+-EXPORT_SYMBOL(lgs8gxx_attach);
++EXPORT_SYMBOL_GPL(lgs8gxx_attach);
+
+ MODULE_DESCRIPTION("Legend Silicon LGS8913/LGS8GXX DMB-TH demodulator driver");
+ MODULE_AUTHOR("David T. L. Wong <davidtlwong@gmail.com>");
+diff --git a/drivers/media/dvb-frontends/lnbh25.c b/drivers/media/dvb-frontends/lnbh25.c
+index 9ffe06cd787dd..41bec050642b5 100644
+--- a/drivers/media/dvb-frontends/lnbh25.c
++++ b/drivers/media/dvb-frontends/lnbh25.c
+@@ -173,7 +173,7 @@ struct dvb_frontend *lnbh25_attach(struct dvb_frontend *fe,
+ __func__, priv->i2c_address);
+ return fe;
+ }
+-EXPORT_SYMBOL(lnbh25_attach);
++EXPORT_SYMBOL_GPL(lnbh25_attach);
+
+ MODULE_DESCRIPTION("ST LNBH25 driver");
+ MODULE_AUTHOR("info@netup.ru");
+diff --git a/drivers/media/dvb-frontends/lnbp21.c b/drivers/media/dvb-frontends/lnbp21.c
+index e564974162d65..32593b1f75a38 100644
+--- a/drivers/media/dvb-frontends/lnbp21.c
++++ b/drivers/media/dvb-frontends/lnbp21.c
+@@ -155,7 +155,7 @@ struct dvb_frontend *lnbh24_attach(struct dvb_frontend *fe,
+ return lnbx2x_attach(fe, i2c, override_set, override_clear,
+ i2c_addr, LNBH24_TTX);
+ }
+-EXPORT_SYMBOL(lnbh24_attach);
++EXPORT_SYMBOL_GPL(lnbh24_attach);
+
+ struct dvb_frontend *lnbp21_attach(struct dvb_frontend *fe,
+ struct i2c_adapter *i2c, u8 override_set,
+@@ -164,7 +164,7 @@ struct dvb_frontend *lnbp21_attach(struct dvb_frontend *fe,
+ return lnbx2x_attach(fe, i2c, override_set, override_clear,
+ 0x08, LNBP21_ISEL);
+ }
+-EXPORT_SYMBOL(lnbp21_attach);
++EXPORT_SYMBOL_GPL(lnbp21_attach);
+
+ MODULE_DESCRIPTION("Driver for lnb supply and control ic lnbp21, lnbh24");
+ MODULE_AUTHOR("Oliver Endriss, Igor M. Liplianin");
+diff --git a/drivers/media/dvb-frontends/lnbp22.c b/drivers/media/dvb-frontends/lnbp22.c
+index b8c7145d4cefe..cb4ea5d3fad4a 100644
+--- a/drivers/media/dvb-frontends/lnbp22.c
++++ b/drivers/media/dvb-frontends/lnbp22.c
+@@ -125,7 +125,7 @@ struct dvb_frontend *lnbp22_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(lnbp22_attach);
++EXPORT_SYMBOL_GPL(lnbp22_attach);
+
+ MODULE_DESCRIPTION("Driver for lnb supply and control ic lnbp22");
+ MODULE_AUTHOR("Dominik Kuhlen");
+diff --git a/drivers/media/dvb-frontends/m88ds3103.c b/drivers/media/dvb-frontends/m88ds3103.c
+index f26508b217ee6..36eab5d254b5b 100644
+--- a/drivers/media/dvb-frontends/m88ds3103.c
++++ b/drivers/media/dvb-frontends/m88ds3103.c
+@@ -1695,7 +1695,7 @@ struct dvb_frontend *m88ds3103_attach(const struct m88ds3103_config *cfg,
+ *tuner_i2c_adapter = pdata.get_i2c_adapter(client);
+ return pdata.get_dvb_frontend(client);
+ }
+-EXPORT_SYMBOL(m88ds3103_attach);
++EXPORT_SYMBOL_GPL(m88ds3103_attach);
+
+ static const struct dvb_frontend_ops m88ds3103_ops = {
+ .delsys = {SYS_DVBS, SYS_DVBS2},
+diff --git a/drivers/media/dvb-frontends/m88rs2000.c b/drivers/media/dvb-frontends/m88rs2000.c
+index b294ba87e934f..2aa98203cd659 100644
+--- a/drivers/media/dvb-frontends/m88rs2000.c
++++ b/drivers/media/dvb-frontends/m88rs2000.c
+@@ -808,7 +808,7 @@ error:
+
+ return NULL;
+ }
+-EXPORT_SYMBOL(m88rs2000_attach);
++EXPORT_SYMBOL_GPL(m88rs2000_attach);
+
+ MODULE_DESCRIPTION("M88RS2000 DVB-S Demodulator driver");
+ MODULE_AUTHOR("Malcolm Priestley tvboxspy@gmail.com");
+diff --git a/drivers/media/dvb-frontends/mb86a16.c b/drivers/media/dvb-frontends/mb86a16.c
+index d3e29937cf4cf..460821a986e53 100644
+--- a/drivers/media/dvb-frontends/mb86a16.c
++++ b/drivers/media/dvb-frontends/mb86a16.c
+@@ -1851,6 +1851,6 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(mb86a16_attach);
++EXPORT_SYMBOL_GPL(mb86a16_attach);
+ MODULE_LICENSE("GPL");
+ MODULE_AUTHOR("Manu Abraham");
+diff --git a/drivers/media/dvb-frontends/mb86a20s.c b/drivers/media/dvb-frontends/mb86a20s.c
+index b74b9afed9a2e..9f5c61d4f23c5 100644
+--- a/drivers/media/dvb-frontends/mb86a20s.c
++++ b/drivers/media/dvb-frontends/mb86a20s.c
+@@ -2081,7 +2081,7 @@ struct dvb_frontend *mb86a20s_attach(const struct mb86a20s_config *config,
+ dev_info(&i2c->dev, "Detected a Fujitsu mb86a20s frontend\n");
+ return &state->frontend;
+ }
+-EXPORT_SYMBOL(mb86a20s_attach);
++EXPORT_SYMBOL_GPL(mb86a20s_attach);
+
+ static const struct dvb_frontend_ops mb86a20s_ops = {
+ .delsys = { SYS_ISDBT },
+diff --git a/drivers/media/dvb-frontends/mt312.c b/drivers/media/dvb-frontends/mt312.c
+index d43a67045dbe7..fb867dd8a26be 100644
+--- a/drivers/media/dvb-frontends/mt312.c
++++ b/drivers/media/dvb-frontends/mt312.c
+@@ -827,7 +827,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(mt312_attach);
++EXPORT_SYMBOL_GPL(mt312_attach);
+
+ module_param(debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+diff --git a/drivers/media/dvb-frontends/mt352.c b/drivers/media/dvb-frontends/mt352.c
+index 399d5c519027e..1b2889f5cf67d 100644
+--- a/drivers/media/dvb-frontends/mt352.c
++++ b/drivers/media/dvb-frontends/mt352.c
+@@ -593,4 +593,4 @@ MODULE_DESCRIPTION("Zarlink MT352 DVB-T Demodulator driver");
+ MODULE_AUTHOR("Holger Waechtler, Daniel Mack, Antonio Mancuso");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(mt352_attach);
++EXPORT_SYMBOL_GPL(mt352_attach);
+diff --git a/drivers/media/dvb-frontends/nxt200x.c b/drivers/media/dvb-frontends/nxt200x.c
+index 200b6dbc75f81..1c549ada6ebf9 100644
+--- a/drivers/media/dvb-frontends/nxt200x.c
++++ b/drivers/media/dvb-frontends/nxt200x.c
+@@ -1216,5 +1216,5 @@ MODULE_DESCRIPTION("NXT200X (ATSC 8VSB & ITU-T J.83 AnnexB 64/256 QAM) Demodulat
+ MODULE_AUTHOR("Kirk Lapray, Michael Krufky, Jean-Francois Thibert, and Taylor Jacob");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(nxt200x_attach);
++EXPORT_SYMBOL_GPL(nxt200x_attach);
+
+diff --git a/drivers/media/dvb-frontends/nxt6000.c b/drivers/media/dvb-frontends/nxt6000.c
+index 136918f82dda0..e8d4940370ddf 100644
+--- a/drivers/media/dvb-frontends/nxt6000.c
++++ b/drivers/media/dvb-frontends/nxt6000.c
+@@ -621,4 +621,4 @@ MODULE_DESCRIPTION("NxtWave NXT6000 DVB-T demodulator driver");
+ MODULE_AUTHOR("Florian Schirmer");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(nxt6000_attach);
++EXPORT_SYMBOL_GPL(nxt6000_attach);
+diff --git a/drivers/media/dvb-frontends/or51132.c b/drivers/media/dvb-frontends/or51132.c
+index 24de1b1151583..144a1f25dec0a 100644
+--- a/drivers/media/dvb-frontends/or51132.c
++++ b/drivers/media/dvb-frontends/or51132.c
+@@ -605,4 +605,4 @@ MODULE_AUTHOR("Kirk Lapray");
+ MODULE_AUTHOR("Trent Piepho");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(or51132_attach);
++EXPORT_SYMBOL_GPL(or51132_attach);
+diff --git a/drivers/media/dvb-frontends/or51211.c b/drivers/media/dvb-frontends/or51211.c
+index ddcaea5c9941f..dc60482162c54 100644
+--- a/drivers/media/dvb-frontends/or51211.c
++++ b/drivers/media/dvb-frontends/or51211.c
+@@ -551,5 +551,5 @@ MODULE_DESCRIPTION("Oren OR51211 VSB [pcHDTV HD-2000] Demodulator Driver");
+ MODULE_AUTHOR("Kirk Lapray");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(or51211_attach);
++EXPORT_SYMBOL_GPL(or51211_attach);
+
+diff --git a/drivers/media/dvb-frontends/s5h1409.c b/drivers/media/dvb-frontends/s5h1409.c
+index 3089cc174a6f5..28b1dca077ead 100644
+--- a/drivers/media/dvb-frontends/s5h1409.c
++++ b/drivers/media/dvb-frontends/s5h1409.c
+@@ -981,7 +981,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(s5h1409_attach);
++EXPORT_SYMBOL_GPL(s5h1409_attach);
+
+ static const struct dvb_frontend_ops s5h1409_ops = {
+ .delsys = { SYS_ATSC, SYS_DVBC_ANNEX_B },
+diff --git a/drivers/media/dvb-frontends/s5h1411.c b/drivers/media/dvb-frontends/s5h1411.c
+index 2563a72e98b70..fc48e659c2d8a 100644
+--- a/drivers/media/dvb-frontends/s5h1411.c
++++ b/drivers/media/dvb-frontends/s5h1411.c
+@@ -900,7 +900,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(s5h1411_attach);
++EXPORT_SYMBOL_GPL(s5h1411_attach);
+
+ static const struct dvb_frontend_ops s5h1411_ops = {
+ .delsys = { SYS_ATSC, SYS_DVBC_ANNEX_B },
+diff --git a/drivers/media/dvb-frontends/s5h1420.c b/drivers/media/dvb-frontends/s5h1420.c
+index 6bdec2898bc81..d700de1ea6c24 100644
+--- a/drivers/media/dvb-frontends/s5h1420.c
++++ b/drivers/media/dvb-frontends/s5h1420.c
+@@ -918,7 +918,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(s5h1420_attach);
++EXPORT_SYMBOL_GPL(s5h1420_attach);
+
+ static const struct dvb_frontend_ops s5h1420_ops = {
+ .delsys = { SYS_DVBS },
+diff --git a/drivers/media/dvb-frontends/s5h1432.c b/drivers/media/dvb-frontends/s5h1432.c
+index 956e8ee4b388e..ff5d3bdf3bc67 100644
+--- a/drivers/media/dvb-frontends/s5h1432.c
++++ b/drivers/media/dvb-frontends/s5h1432.c
+@@ -355,7 +355,7 @@ struct dvb_frontend *s5h1432_attach(const struct s5h1432_config *config,
+
+ return &state->frontend;
+ }
+-EXPORT_SYMBOL(s5h1432_attach);
++EXPORT_SYMBOL_GPL(s5h1432_attach);
+
+ static const struct dvb_frontend_ops s5h1432_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/dvb-frontends/s921.c b/drivers/media/dvb-frontends/s921.c
+index f118d8e641030..7e461ac159fc1 100644
+--- a/drivers/media/dvb-frontends/s921.c
++++ b/drivers/media/dvb-frontends/s921.c
+@@ -495,7 +495,7 @@ struct dvb_frontend *s921_attach(const struct s921_config *config,
+
+ return &state->frontend;
+ }
+-EXPORT_SYMBOL(s921_attach);
++EXPORT_SYMBOL_GPL(s921_attach);
+
+ static const struct dvb_frontend_ops s921_ops = {
+ .delsys = { SYS_ISDBT },
+diff --git a/drivers/media/dvb-frontends/si21xx.c b/drivers/media/dvb-frontends/si21xx.c
+index 2d29d2c4d434c..210ccd356e2bf 100644
+--- a/drivers/media/dvb-frontends/si21xx.c
++++ b/drivers/media/dvb-frontends/si21xx.c
+@@ -937,7 +937,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(si21xx_attach);
++EXPORT_SYMBOL_GPL(si21xx_attach);
+
+ module_param(debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+diff --git a/drivers/media/dvb-frontends/sp887x.c b/drivers/media/dvb-frontends/sp887x.c
+index 146e7f2dd3c5e..f59c0f96416b5 100644
+--- a/drivers/media/dvb-frontends/sp887x.c
++++ b/drivers/media/dvb-frontends/sp887x.c
+@@ -624,4 +624,4 @@ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+ MODULE_DESCRIPTION("Spase sp887x DVB-T demodulator driver");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(sp887x_attach);
++EXPORT_SYMBOL_GPL(sp887x_attach);
+diff --git a/drivers/media/dvb-frontends/stb0899_drv.c b/drivers/media/dvb-frontends/stb0899_drv.c
+index 4ee6c1e1e9f7d..2f4d8fb400cd6 100644
+--- a/drivers/media/dvb-frontends/stb0899_drv.c
++++ b/drivers/media/dvb-frontends/stb0899_drv.c
+@@ -1638,7 +1638,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(stb0899_attach);
++EXPORT_SYMBOL_GPL(stb0899_attach);
+ MODULE_PARM_DESC(verbose, "Set Verbosity level");
+ MODULE_AUTHOR("Manu Abraham");
+ MODULE_DESCRIPTION("STB0899 Multi-Std frontend");
+diff --git a/drivers/media/dvb-frontends/stb6000.c b/drivers/media/dvb-frontends/stb6000.c
+index 8c9800d577e03..d74e34677b925 100644
+--- a/drivers/media/dvb-frontends/stb6000.c
++++ b/drivers/media/dvb-frontends/stb6000.c
+@@ -232,7 +232,7 @@ struct dvb_frontend *stb6000_attach(struct dvb_frontend *fe, int addr,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(stb6000_attach);
++EXPORT_SYMBOL_GPL(stb6000_attach);
+
+ module_param(debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+diff --git a/drivers/media/dvb-frontends/stb6100.c b/drivers/media/dvb-frontends/stb6100.c
+index 698866c4f15a7..c5818a15a0d70 100644
+--- a/drivers/media/dvb-frontends/stb6100.c
++++ b/drivers/media/dvb-frontends/stb6100.c
+@@ -557,7 +557,7 @@ static void stb6100_release(struct dvb_frontend *fe)
+ kfree(state);
+ }
+
+-EXPORT_SYMBOL(stb6100_attach);
++EXPORT_SYMBOL_GPL(stb6100_attach);
+ MODULE_PARM_DESC(verbose, "Set Verbosity level");
+
+ MODULE_AUTHOR("Manu Abraham");
+diff --git a/drivers/media/dvb-frontends/stv0288.c b/drivers/media/dvb-frontends/stv0288.c
+index 3ae1f3a2f1420..a5581bd60f9e8 100644
+--- a/drivers/media/dvb-frontends/stv0288.c
++++ b/drivers/media/dvb-frontends/stv0288.c
+@@ -590,7 +590,7 @@ error:
+
+ return NULL;
+ }
+-EXPORT_SYMBOL(stv0288_attach);
++EXPORT_SYMBOL_GPL(stv0288_attach);
+
+ module_param(debug_legacy_dish_switch, int, 0444);
+ MODULE_PARM_DESC(debug_legacy_dish_switch,
+diff --git a/drivers/media/dvb-frontends/stv0297.c b/drivers/media/dvb-frontends/stv0297.c
+index 6d5962d5697ac..9d4dbd99a5a79 100644
+--- a/drivers/media/dvb-frontends/stv0297.c
++++ b/drivers/media/dvb-frontends/stv0297.c
+@@ -710,4 +710,4 @@ MODULE_DESCRIPTION("ST STV0297 DVB-C Demodulator driver");
+ MODULE_AUTHOR("Dennis Noermann and Andrew de Quincey");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(stv0297_attach);
++EXPORT_SYMBOL_GPL(stv0297_attach);
+diff --git a/drivers/media/dvb-frontends/stv0299.c b/drivers/media/dvb-frontends/stv0299.c
+index b5263a0ee5aa5..da7ff2c2e8e55 100644
+--- a/drivers/media/dvb-frontends/stv0299.c
++++ b/drivers/media/dvb-frontends/stv0299.c
+@@ -752,4 +752,4 @@ MODULE_DESCRIPTION("ST STV0299 DVB Demodulator driver");
+ MODULE_AUTHOR("Ralph Metzler, Holger Waechtler, Peter Schildmann, Felix Domke, Andreas Oberritter, Andrew de Quincey, Kenneth Aafly");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(stv0299_attach);
++EXPORT_SYMBOL_GPL(stv0299_attach);
+diff --git a/drivers/media/dvb-frontends/stv0367.c b/drivers/media/dvb-frontends/stv0367.c
+index 95e376f23506f..04556b77c16c9 100644
+--- a/drivers/media/dvb-frontends/stv0367.c
++++ b/drivers/media/dvb-frontends/stv0367.c
+@@ -1750,7 +1750,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(stv0367ter_attach);
++EXPORT_SYMBOL_GPL(stv0367ter_attach);
+
+ static int stv0367cab_gate_ctrl(struct dvb_frontend *fe, int enable)
+ {
+@@ -2919,7 +2919,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(stv0367cab_attach);
++EXPORT_SYMBOL_GPL(stv0367cab_attach);
+
+ /*
+ * Functions for operation on Digital Devices hardware
+@@ -3340,7 +3340,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(stv0367ddb_attach);
++EXPORT_SYMBOL_GPL(stv0367ddb_attach);
+
+ MODULE_PARM_DESC(debug, "Set debug");
+ MODULE_PARM_DESC(i2c_debug, "Set i2c debug");
+diff --git a/drivers/media/dvb-frontends/stv0900_core.c b/drivers/media/dvb-frontends/stv0900_core.c
+index 212312d20ff62..e7b9b9b11d7df 100644
+--- a/drivers/media/dvb-frontends/stv0900_core.c
++++ b/drivers/media/dvb-frontends/stv0900_core.c
+@@ -1957,7 +1957,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(stv0900_attach);
++EXPORT_SYMBOL_GPL(stv0900_attach);
+
+ MODULE_PARM_DESC(debug, "Set debug");
+
+diff --git a/drivers/media/dvb-frontends/stv090x.c b/drivers/media/dvb-frontends/stv090x.c
+index 9bde0ad6f26eb..fcf2d57fc11b9 100644
+--- a/drivers/media/dvb-frontends/stv090x.c
++++ b/drivers/media/dvb-frontends/stv090x.c
+@@ -5071,7 +5071,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(stv090x_attach);
++EXPORT_SYMBOL_GPL(stv090x_attach);
+
+ static const struct i2c_device_id stv090x_id_table[] = {
+ {"stv090x", 0},
+diff --git a/drivers/media/dvb-frontends/stv6110.c b/drivers/media/dvb-frontends/stv6110.c
+index 963f6a896102a..1cf9c095dbff0 100644
+--- a/drivers/media/dvb-frontends/stv6110.c
++++ b/drivers/media/dvb-frontends/stv6110.c
+@@ -427,7 +427,7 @@ struct dvb_frontend *stv6110_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(stv6110_attach);
++EXPORT_SYMBOL_GPL(stv6110_attach);
+
+ module_param(debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+diff --git a/drivers/media/dvb-frontends/stv6110x.c b/drivers/media/dvb-frontends/stv6110x.c
+index b2f456116c60f..3d522965f7293 100644
+--- a/drivers/media/dvb-frontends/stv6110x.c
++++ b/drivers/media/dvb-frontends/stv6110x.c
+@@ -467,7 +467,7 @@ const struct stv6110x_devctl *stv6110x_attach(struct dvb_frontend *fe,
+ dev_info(&stv6110x->i2c->dev, "Attaching STV6110x\n");
+ return stv6110x->devctl;
+ }
+-EXPORT_SYMBOL(stv6110x_attach);
++EXPORT_SYMBOL_GPL(stv6110x_attach);
+
+ static const struct i2c_device_id stv6110x_id_table[] = {
+ {"stv6110x", 0},
+diff --git a/drivers/media/dvb-frontends/tda10021.c b/drivers/media/dvb-frontends/tda10021.c
+index faa6e54b33729..462e12ab6bd14 100644
+--- a/drivers/media/dvb-frontends/tda10021.c
++++ b/drivers/media/dvb-frontends/tda10021.c
+@@ -523,4 +523,4 @@ MODULE_DESCRIPTION("Philips TDA10021 DVB-C demodulator driver");
+ MODULE_AUTHOR("Ralph Metzler, Holger Waechtler, Markus Schulz");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(tda10021_attach);
++EXPORT_SYMBOL_GPL(tda10021_attach);
+diff --git a/drivers/media/dvb-frontends/tda10023.c b/drivers/media/dvb-frontends/tda10023.c
+index 8f32edf6b700e..4c2541ecd7433 100644
+--- a/drivers/media/dvb-frontends/tda10023.c
++++ b/drivers/media/dvb-frontends/tda10023.c
+@@ -594,4 +594,4 @@ MODULE_DESCRIPTION("Philips TDA10023 DVB-C demodulator driver");
+ MODULE_AUTHOR("Georg Acher, Hartmut Birr");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(tda10023_attach);
++EXPORT_SYMBOL_GPL(tda10023_attach);
+diff --git a/drivers/media/dvb-frontends/tda10048.c b/drivers/media/dvb-frontends/tda10048.c
+index 0b3f6999515e3..f6d8a64762b99 100644
+--- a/drivers/media/dvb-frontends/tda10048.c
++++ b/drivers/media/dvb-frontends/tda10048.c
+@@ -1138,7 +1138,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(tda10048_attach);
++EXPORT_SYMBOL_GPL(tda10048_attach);
+
+ static const struct dvb_frontend_ops tda10048_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/dvb-frontends/tda1004x.c b/drivers/media/dvb-frontends/tda1004x.c
+index 83a798ca9b002..6f306db6c615f 100644
+--- a/drivers/media/dvb-frontends/tda1004x.c
++++ b/drivers/media/dvb-frontends/tda1004x.c
+@@ -1378,5 +1378,5 @@ MODULE_DESCRIPTION("Philips TDA10045H & TDA10046H DVB-T Demodulator");
+ MODULE_AUTHOR("Andrew de Quincey & Robert Schlabbach");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(tda10045_attach);
+-EXPORT_SYMBOL(tda10046_attach);
++EXPORT_SYMBOL_GPL(tda10045_attach);
++EXPORT_SYMBOL_GPL(tda10046_attach);
+diff --git a/drivers/media/dvb-frontends/tda10086.c b/drivers/media/dvb-frontends/tda10086.c
+index cdcf97664bba8..b449514ae5854 100644
+--- a/drivers/media/dvb-frontends/tda10086.c
++++ b/drivers/media/dvb-frontends/tda10086.c
+@@ -764,4 +764,4 @@ MODULE_DESCRIPTION("Philips TDA10086 DVB-S Demodulator");
+ MODULE_AUTHOR("Andrew de Quincey");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(tda10086_attach);
++EXPORT_SYMBOL_GPL(tda10086_attach);
+diff --git a/drivers/media/dvb-frontends/tda665x.c b/drivers/media/dvb-frontends/tda665x.c
+index 13e8969da7f89..346be5011fb73 100644
+--- a/drivers/media/dvb-frontends/tda665x.c
++++ b/drivers/media/dvb-frontends/tda665x.c
+@@ -227,7 +227,7 @@ struct dvb_frontend *tda665x_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(tda665x_attach);
++EXPORT_SYMBOL_GPL(tda665x_attach);
+
+ MODULE_DESCRIPTION("TDA665x driver");
+ MODULE_AUTHOR("Manu Abraham");
+diff --git a/drivers/media/dvb-frontends/tda8083.c b/drivers/media/dvb-frontends/tda8083.c
+index e3e1c3db2c856..44f53624557bc 100644
+--- a/drivers/media/dvb-frontends/tda8083.c
++++ b/drivers/media/dvb-frontends/tda8083.c
+@@ -481,4 +481,4 @@ MODULE_DESCRIPTION("Philips TDA8083 DVB-S Demodulator");
+ MODULE_AUTHOR("Ralph Metzler, Holger Waechtler");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(tda8083_attach);
++EXPORT_SYMBOL_GPL(tda8083_attach);
+diff --git a/drivers/media/dvb-frontends/tda8261.c b/drivers/media/dvb-frontends/tda8261.c
+index 0d576d41c67d8..8b06f92745dca 100644
+--- a/drivers/media/dvb-frontends/tda8261.c
++++ b/drivers/media/dvb-frontends/tda8261.c
+@@ -188,7 +188,7 @@ exit:
+ return NULL;
+ }
+
+-EXPORT_SYMBOL(tda8261_attach);
++EXPORT_SYMBOL_GPL(tda8261_attach);
+
+ MODULE_AUTHOR("Manu Abraham");
+ MODULE_DESCRIPTION("TDA8261 8PSK/QPSK Tuner");
+diff --git a/drivers/media/dvb-frontends/tda826x.c b/drivers/media/dvb-frontends/tda826x.c
+index f9703a1dd758c..eafcf5f7da3dc 100644
+--- a/drivers/media/dvb-frontends/tda826x.c
++++ b/drivers/media/dvb-frontends/tda826x.c
+@@ -164,7 +164,7 @@ struct dvb_frontend *tda826x_attach(struct dvb_frontend *fe, int addr, struct i2
+
+ return fe;
+ }
+-EXPORT_SYMBOL(tda826x_attach);
++EXPORT_SYMBOL_GPL(tda826x_attach);
+
+ module_param(debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+diff --git a/drivers/media/dvb-frontends/ts2020.c b/drivers/media/dvb-frontends/ts2020.c
+index c28fee7509cdd..9fa0bf6b95e4d 100644
+--- a/drivers/media/dvb-frontends/ts2020.c
++++ b/drivers/media/dvb-frontends/ts2020.c
+@@ -525,7 +525,7 @@ struct dvb_frontend *ts2020_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(ts2020_attach);
++EXPORT_SYMBOL_GPL(ts2020_attach);
+
+ /*
+ * We implement own regmap locking due to legacy DVB attach which uses frontend
+diff --git a/drivers/media/dvb-frontends/tua6100.c b/drivers/media/dvb-frontends/tua6100.c
+index 2483f614d0e7d..41dd9b6d31908 100644
+--- a/drivers/media/dvb-frontends/tua6100.c
++++ b/drivers/media/dvb-frontends/tua6100.c
+@@ -186,7 +186,7 @@ struct dvb_frontend *tua6100_attach(struct dvb_frontend *fe, int addr, struct i2
+ fe->tuner_priv = priv;
+ return fe;
+ }
+-EXPORT_SYMBOL(tua6100_attach);
++EXPORT_SYMBOL_GPL(tua6100_attach);
+
+ MODULE_DESCRIPTION("DVB tua6100 driver");
+ MODULE_AUTHOR("Andrew de Quincey");
+diff --git a/drivers/media/dvb-frontends/ves1820.c b/drivers/media/dvb-frontends/ves1820.c
+index 9df14d0be1c1a..ee5620e731e9b 100644
+--- a/drivers/media/dvb-frontends/ves1820.c
++++ b/drivers/media/dvb-frontends/ves1820.c
+@@ -434,4 +434,4 @@ MODULE_DESCRIPTION("VLSI VES1820 DVB-C Demodulator driver");
+ MODULE_AUTHOR("Ralph Metzler, Holger Waechtler");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(ves1820_attach);
++EXPORT_SYMBOL_GPL(ves1820_attach);
+diff --git a/drivers/media/dvb-frontends/ves1x93.c b/drivers/media/dvb-frontends/ves1x93.c
+index b747272863025..c60e21d26b881 100644
+--- a/drivers/media/dvb-frontends/ves1x93.c
++++ b/drivers/media/dvb-frontends/ves1x93.c
+@@ -540,4 +540,4 @@ MODULE_DESCRIPTION("VLSI VES1x93 DVB-S Demodulator driver");
+ MODULE_AUTHOR("Ralph Metzler");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(ves1x93_attach);
++EXPORT_SYMBOL_GPL(ves1x93_attach);
+diff --git a/drivers/media/dvb-frontends/zl10036.c b/drivers/media/dvb-frontends/zl10036.c
+index d392c7cce2ce0..7ba575e9c55f4 100644
+--- a/drivers/media/dvb-frontends/zl10036.c
++++ b/drivers/media/dvb-frontends/zl10036.c
+@@ -496,7 +496,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(zl10036_attach);
++EXPORT_SYMBOL_GPL(zl10036_attach);
+
+ module_param_named(debug, zl10036_debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+diff --git a/drivers/media/dvb-frontends/zl10039.c b/drivers/media/dvb-frontends/zl10039.c
+index 1335bf78d5b7f..a3e4d219400ce 100644
+--- a/drivers/media/dvb-frontends/zl10039.c
++++ b/drivers/media/dvb-frontends/zl10039.c
+@@ -295,7 +295,7 @@ error:
+ kfree(state);
+ return NULL;
+ }
+-EXPORT_SYMBOL(zl10039_attach);
++EXPORT_SYMBOL_GPL(zl10039_attach);
+
+ module_param(debug, int, 0644);
+ MODULE_PARM_DESC(debug, "Turn on/off frontend debugging (default:off).");
+diff --git a/drivers/media/dvb-frontends/zl10353.c b/drivers/media/dvb-frontends/zl10353.c
+index 2a2cf20a73d61..8849d05475c27 100644
+--- a/drivers/media/dvb-frontends/zl10353.c
++++ b/drivers/media/dvb-frontends/zl10353.c
+@@ -665,4 +665,4 @@ MODULE_DESCRIPTION("Zarlink ZL10353 DVB-T demodulator driver");
+ MODULE_AUTHOR("Chris Pascoe");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(zl10353_attach);
++EXPORT_SYMBOL_GPL(zl10353_attach);
+diff --git a/drivers/media/i2c/Kconfig b/drivers/media/i2c/Kconfig
+index 76d1ee3cc1bab..dbcc2222ddd33 100644
+--- a/drivers/media/i2c/Kconfig
++++ b/drivers/media/i2c/Kconfig
+@@ -25,8 +25,15 @@ config VIDEO_IR_I2C
+ # V4L2 I2C drivers that are related with Camera support
+ #
+
+-menu "Camera sensor devices"
+- visible if MEDIA_CAMERA_SUPPORT
++menuconfig VIDEO_CAMERA_SENSOR
++ bool "Camera sensor devices"
++ depends on MEDIA_CAMERA_SUPPORT && I2C
++ select MEDIA_CONTROLLER
++ select V4L2_FWNODE
++ select VIDEO_V4L2_SUBDEV_API
++ default y
++
++if VIDEO_CAMERA_SENSOR
+
+ config VIDEO_APTINA_PLL
+ tristate
+@@ -797,7 +804,7 @@ config VIDEO_ST_VGXY61
+ source "drivers/media/i2c/ccs/Kconfig"
+ source "drivers/media/i2c/et8ek8/Kconfig"
+
+-endmenu
++endif
+
+ menu "Lens drivers"
+ visible if MEDIA_CAMERA_SUPPORT
+diff --git a/drivers/media/i2c/ad5820.c b/drivers/media/i2c/ad5820.c
+index 44c26af49071c..786f38b4cbc0e 100644
+--- a/drivers/media/i2c/ad5820.c
++++ b/drivers/media/i2c/ad5820.c
+@@ -349,7 +349,6 @@ static void ad5820_remove(struct i2c_client *client)
+ static const struct i2c_device_id ad5820_id_table[] = {
+ { "ad5820", 0 },
+ { "ad5821", 0 },
+- { "ad5823", 0 },
+ { }
+ };
+ MODULE_DEVICE_TABLE(i2c, ad5820_id_table);
+@@ -357,7 +356,6 @@ MODULE_DEVICE_TABLE(i2c, ad5820_id_table);
+ static const struct of_device_id ad5820_of_table[] = {
+ { .compatible = "adi,ad5820" },
+ { .compatible = "adi,ad5821" },
+- { .compatible = "adi,ad5823" },
+ { }
+ };
+ MODULE_DEVICE_TABLE(of, ad5820_of_table);
+diff --git a/drivers/media/i2c/ccs/ccs-data.c b/drivers/media/i2c/ccs/ccs-data.c
+index 45f2b2f55ec5c..08400edf77ced 100644
+--- a/drivers/media/i2c/ccs/ccs-data.c
++++ b/drivers/media/i2c/ccs/ccs-data.c
+@@ -464,8 +464,7 @@ static int ccs_data_parse_rules(struct bin_container *bin,
+ rule_payload = __rule_type + 1;
+ rule_plen2 = rule_plen - sizeof(*__rule_type);
+
+- switch (*__rule_type) {
+- case CCS_DATA_BLOCK_RULE_ID_IF: {
++ if (*__rule_type == CCS_DATA_BLOCK_RULE_ID_IF) {
+ const struct __ccs_data_block_rule_if *__if_rules =
+ rule_payload;
+ const size_t __num_if_rules =
+@@ -514,49 +513,61 @@ static int ccs_data_parse_rules(struct bin_container *bin,
+ rules->if_rules = if_rule;
+ rules->num_if_rules = __num_if_rules;
+ }
+- break;
+- }
+- case CCS_DATA_BLOCK_RULE_ID_READ_ONLY_REGS:
+- rval = ccs_data_parse_reg_rules(bin, &rules->read_only_regs,
+- &rules->num_read_only_regs,
+- rule_payload,
+- rule_payload + rule_plen2,
+- dev);
+- if (rval)
+- return rval;
+- break;
+- case CCS_DATA_BLOCK_RULE_ID_FFD:
+- rval = ccs_data_parse_ffd(bin, &rules->frame_format,
+- rule_payload,
+- rule_payload + rule_plen2,
+- dev);
+- if (rval)
+- return rval;
+- break;
+- case CCS_DATA_BLOCK_RULE_ID_MSR:
+- rval = ccs_data_parse_reg_rules(bin,
+- &rules->manufacturer_regs,
+- &rules->num_manufacturer_regs,
+- rule_payload,
+- rule_payload + rule_plen2,
+- dev);
+- if (rval)
+- return rval;
+- break;
+- case CCS_DATA_BLOCK_RULE_ID_PDAF_READOUT:
+- rval = ccs_data_parse_pdaf_readout(bin,
+- &rules->pdaf_readout,
+- rule_payload,
+- rule_payload + rule_plen2,
+- dev);
+- if (rval)
+- return rval;
+- break;
+- default:
+- dev_dbg(dev,
+- "Don't know how to handle rule type %u!\n",
+- *__rule_type);
+- return -EINVAL;
++ } else {
++ /* Check there was an if rule before any other rules */
++ if (bin->base && !rules)
++ return -EINVAL;
++
++ switch (*__rule_type) {
++ case CCS_DATA_BLOCK_RULE_ID_READ_ONLY_REGS:
++ rval = ccs_data_parse_reg_rules(bin,
++ rules ?
++ &rules->read_only_regs : NULL,
++ rules ?
++ &rules->num_read_only_regs : NULL,
++ rule_payload,
++ rule_payload + rule_plen2,
++ dev);
++ if (rval)
++ return rval;
++ break;
++ case CCS_DATA_BLOCK_RULE_ID_FFD:
++ rval = ccs_data_parse_ffd(bin, rules ?
++ &rules->frame_format : NULL,
++ rule_payload,
++ rule_payload + rule_plen2,
++ dev);
++ if (rval)
++ return rval;
++ break;
++ case CCS_DATA_BLOCK_RULE_ID_MSR:
++ rval = ccs_data_parse_reg_rules(bin,
++ rules ?
++ &rules->manufacturer_regs : NULL,
++ rules ?
++ &rules->num_manufacturer_regs : NULL,
++ rule_payload,
++ rule_payload + rule_plen2,
++ dev);
++ if (rval)
++ return rval;
++ break;
++ case CCS_DATA_BLOCK_RULE_ID_PDAF_READOUT:
++ rval = ccs_data_parse_pdaf_readout(bin,
++ rules ?
++ &rules->pdaf_readout : NULL,
++ rule_payload,
++ rule_payload + rule_plen2,
++ dev);
++ if (rval)
++ return rval;
++ break;
++ default:
++ dev_dbg(dev,
++ "Don't know how to handle rule type %u!\n",
++ *__rule_type);
++ return -EINVAL;
++ }
+ }
+ __next_rule = __next_rule + rule_hlen + rule_plen;
+ }
+diff --git a/drivers/media/i2c/imx290.c b/drivers/media/i2c/imx290.c
+index 5ea25b7acc55f..a84b581682a21 100644
+--- a/drivers/media/i2c/imx290.c
++++ b/drivers/media/i2c/imx290.c
+@@ -902,7 +902,6 @@ static const char * const imx290_test_pattern_menu[] = {
+ };
+
+ static void imx290_ctrl_update(struct imx290 *imx290,
+- const struct v4l2_mbus_framefmt *format,
+ const struct imx290_mode *mode)
+ {
+ unsigned int hblank_min = mode->hmax_min - mode->width;
+@@ -1195,7 +1194,7 @@ static int imx290_set_fmt(struct v4l2_subdev *sd,
+ if (fmt->which == V4L2_SUBDEV_FORMAT_ACTIVE) {
+ imx290->current_mode = mode;
+
+- imx290_ctrl_update(imx290, &fmt->format, mode);
++ imx290_ctrl_update(imx290, mode);
+ imx290_exposure_update(imx290, mode);
+ }
+
+@@ -1300,7 +1299,6 @@ static const struct media_entity_operations imx290_subdev_entity_ops = {
+ static int imx290_subdev_init(struct imx290 *imx290)
+ {
+ struct i2c_client *client = to_i2c_client(imx290->dev);
+- const struct v4l2_mbus_framefmt *format;
+ struct v4l2_subdev_state *state;
+ int ret;
+
+@@ -1335,8 +1333,7 @@ static int imx290_subdev_init(struct imx290 *imx290)
+ }
+
+ state = v4l2_subdev_lock_and_get_active_state(&imx290->sd);
+- format = v4l2_subdev_get_pad_format(&imx290->sd, state, 0);
+- imx290_ctrl_update(imx290, format, imx290->current_mode);
++ imx290_ctrl_update(imx290, imx290->current_mode);
+ v4l2_subdev_unlock_state(state);
+
+ return 0;
+diff --git a/drivers/media/i2c/ov2680.c b/drivers/media/i2c/ov2680.c
+index 54153bf66bddc..8943e4e78a0df 100644
+--- a/drivers/media/i2c/ov2680.c
++++ b/drivers/media/i2c/ov2680.c
+@@ -54,6 +54,9 @@
+ #define OV2680_WIDTH_MAX 1600
+ #define OV2680_HEIGHT_MAX 1200
+
++#define OV2680_DEFAULT_WIDTH 800
++#define OV2680_DEFAULT_HEIGHT 600
++
+ enum ov2680_mode_id {
+ OV2680_MODE_QUXGA_800_600,
+ OV2680_MODE_720P_1280_720,
+@@ -85,15 +88,8 @@ struct ov2680_mode_info {
+
+ struct ov2680_ctrls {
+ struct v4l2_ctrl_handler handler;
+- struct {
+- struct v4l2_ctrl *auto_exp;
+- struct v4l2_ctrl *exposure;
+- };
+- struct {
+- struct v4l2_ctrl *auto_gain;
+- struct v4l2_ctrl *gain;
+- };
+-
++ struct v4l2_ctrl *exposure;
++ struct v4l2_ctrl *gain;
+ struct v4l2_ctrl *hflip;
+ struct v4l2_ctrl *vflip;
+ struct v4l2_ctrl *test_pattern;
+@@ -143,6 +139,7 @@ static const struct reg_value ov2680_setting_30fps_QUXGA_800_600[] = {
+ {0x380e, 0x02}, {0x380f, 0x84}, {0x3811, 0x04}, {0x3813, 0x04},
+ {0x3814, 0x31}, {0x3815, 0x31}, {0x3820, 0xc0}, {0x4008, 0x00},
+ {0x4009, 0x03}, {0x4837, 0x1e}, {0x3501, 0x4e}, {0x3502, 0xe0},
++ {0x3503, 0x03},
+ };
+
+ static const struct reg_value ov2680_setting_30fps_720P_1280_720[] = {
+@@ -321,70 +318,62 @@ static void ov2680_power_down(struct ov2680_dev *sensor)
+ usleep_range(5000, 10000);
+ }
+
+-static int ov2680_bayer_order(struct ov2680_dev *sensor)
++static void ov2680_set_bayer_order(struct ov2680_dev *sensor,
++ struct v4l2_mbus_framefmt *fmt)
+ {
+- u32 format1;
+- u32 format2;
+- u32 hv_flip;
+- int ret;
+-
+- ret = ov2680_read_reg(sensor, OV2680_REG_FORMAT1, &format1);
+- if (ret < 0)
+- return ret;
+-
+- ret = ov2680_read_reg(sensor, OV2680_REG_FORMAT2, &format2);
+- if (ret < 0)
+- return ret;
++ int hv_flip = 0;
+
+- hv_flip = (format2 & BIT(2) << 1) | (format1 & BIT(2));
++ if (sensor->ctrls.vflip && sensor->ctrls.vflip->val)
++ hv_flip += 1;
+
+- sensor->fmt.code = ov2680_hv_flip_bayer_order[hv_flip];
++ if (sensor->ctrls.hflip && sensor->ctrls.hflip->val)
++ hv_flip += 2;
+
+- return 0;
++ fmt->code = ov2680_hv_flip_bayer_order[hv_flip];
+ }
+
+-static int ov2680_vflip_enable(struct ov2680_dev *sensor)
++static void ov2680_fill_format(struct ov2680_dev *sensor,
++ struct v4l2_mbus_framefmt *fmt,
++ unsigned int width, unsigned int height)
+ {
+- int ret;
+-
+- ret = ov2680_mod_reg(sensor, OV2680_REG_FORMAT1, BIT(2), BIT(2));
+- if (ret < 0)
+- return ret;
+-
+- return ov2680_bayer_order(sensor);
++ memset(fmt, 0, sizeof(*fmt));
++ fmt->width = width;
++ fmt->height = height;
++ fmt->field = V4L2_FIELD_NONE;
++ fmt->colorspace = V4L2_COLORSPACE_SRGB;
++ ov2680_set_bayer_order(sensor, fmt);
+ }
+
+-static int ov2680_vflip_disable(struct ov2680_dev *sensor)
++static int ov2680_set_vflip(struct ov2680_dev *sensor, s32 val)
+ {
+ int ret;
+
+- ret = ov2680_mod_reg(sensor, OV2680_REG_FORMAT1, BIT(2), BIT(0));
+- if (ret < 0)
+- return ret;
+-
+- return ov2680_bayer_order(sensor);
+-}
+-
+-static int ov2680_hflip_enable(struct ov2680_dev *sensor)
+-{
+- int ret;
++ if (sensor->is_streaming)
++ return -EBUSY;
+
+- ret = ov2680_mod_reg(sensor, OV2680_REG_FORMAT2, BIT(2), BIT(2));
++ ret = ov2680_mod_reg(sensor, OV2680_REG_FORMAT1,
++ BIT(2), val ? BIT(2) : 0);
+ if (ret < 0)
+ return ret;
+
+- return ov2680_bayer_order(sensor);
++ ov2680_set_bayer_order(sensor, &sensor->fmt);
++ return 0;
+ }
+
+-static int ov2680_hflip_disable(struct ov2680_dev *sensor)
++static int ov2680_set_hflip(struct ov2680_dev *sensor, s32 val)
+ {
+ int ret;
+
+- ret = ov2680_mod_reg(sensor, OV2680_REG_FORMAT2, BIT(2), BIT(0));
++ if (sensor->is_streaming)
++ return -EBUSY;
++
++ ret = ov2680_mod_reg(sensor, OV2680_REG_FORMAT2,
++ BIT(2), val ? BIT(2) : 0);
+ if (ret < 0)
+ return ret;
+
+- return ov2680_bayer_order(sensor);
++ ov2680_set_bayer_order(sensor, &sensor->fmt);
++ return 0;
+ }
+
+ static int ov2680_test_pattern_set(struct ov2680_dev *sensor, int value)
+@@ -405,69 +394,15 @@ static int ov2680_test_pattern_set(struct ov2680_dev *sensor, int value)
+ return 0;
+ }
+
+-static int ov2680_gain_set(struct ov2680_dev *sensor, bool auto_gain)
++static int ov2680_gain_set(struct ov2680_dev *sensor, u32 gain)
+ {
+- struct ov2680_ctrls *ctrls = &sensor->ctrls;
+- u32 gain;
+- int ret;
+-
+- ret = ov2680_mod_reg(sensor, OV2680_REG_R_MANUAL, BIT(1),
+- auto_gain ? 0 : BIT(1));
+- if (ret < 0)
+- return ret;
+-
+- if (auto_gain || !ctrls->gain->is_new)
+- return 0;
+-
+- gain = ctrls->gain->val;
+-
+- ret = ov2680_write_reg16(sensor, OV2680_REG_GAIN_PK, gain);
+-
+- return 0;
+-}
+-
+-static int ov2680_gain_get(struct ov2680_dev *sensor)
+-{
+- u32 gain;
+- int ret;
+-
+- ret = ov2680_read_reg16(sensor, OV2680_REG_GAIN_PK, &gain);
+- if (ret)
+- return ret;
+-
+- return gain;
+-}
+-
+-static int ov2680_exposure_set(struct ov2680_dev *sensor, bool auto_exp)
+-{
+- struct ov2680_ctrls *ctrls = &sensor->ctrls;
+- u32 exp;
+- int ret;
+-
+- ret = ov2680_mod_reg(sensor, OV2680_REG_R_MANUAL, BIT(0),
+- auto_exp ? 0 : BIT(0));
+- if (ret < 0)
+- return ret;
+-
+- if (auto_exp || !ctrls->exposure->is_new)
+- return 0;
+-
+- exp = (u32)ctrls->exposure->val;
+- exp <<= 4;
+-
+- return ov2680_write_reg24(sensor, OV2680_REG_EXPOSURE_PK_HIGH, exp);
++ return ov2680_write_reg16(sensor, OV2680_REG_GAIN_PK, gain);
+ }
+
+-static int ov2680_exposure_get(struct ov2680_dev *sensor)
++static int ov2680_exposure_set(struct ov2680_dev *sensor, u32 exp)
+ {
+- int ret;
+- u32 exp;
+-
+- ret = ov2680_read_reg24(sensor, OV2680_REG_EXPOSURE_PK_HIGH, &exp);
+- if (ret)
+- return ret;
+-
+- return exp >> 4;
++ return ov2680_write_reg24(sensor, OV2680_REG_EXPOSURE_PK_HIGH,
++ exp << 4);
+ }
+
+ static int ov2680_stream_enable(struct ov2680_dev *sensor)
+@@ -482,33 +417,17 @@ static int ov2680_stream_disable(struct ov2680_dev *sensor)
+
+ static int ov2680_mode_set(struct ov2680_dev *sensor)
+ {
+- struct ov2680_ctrls *ctrls = &sensor->ctrls;
+ int ret;
+
+- ret = ov2680_gain_set(sensor, false);
+- if (ret < 0)
+- return ret;
+-
+- ret = ov2680_exposure_set(sensor, false);
++ ret = ov2680_load_regs(sensor, sensor->current_mode);
+ if (ret < 0)
+ return ret;
+
+- ret = ov2680_load_regs(sensor, sensor->current_mode);
++ /* Restore value of all ctrls */
++ ret = __v4l2_ctrl_handler_setup(&sensor->ctrls.handler);
+ if (ret < 0)
+ return ret;
+
+- if (ctrls->auto_gain->val) {
+- ret = ov2680_gain_set(sensor, true);
+- if (ret < 0)
+- return ret;
+- }
+-
+- if (ctrls->auto_exp->val == V4L2_EXPOSURE_AUTO) {
+- ret = ov2680_exposure_set(sensor, true);
+- if (ret < 0)
+- return ret;
+- }
+-
+ sensor->mode_pending_changes = false;
+
+ return 0;
+@@ -556,7 +475,7 @@ static int ov2680_power_on(struct ov2680_dev *sensor)
+ ret = ov2680_write_reg(sensor, OV2680_REG_SOFT_RESET, 0x01);
+ if (ret != 0) {
+ dev_err(dev, "sensor soft reset failed\n");
+- return ret;
++ goto err_disable_regulators;
+ }
+ usleep_range(1000, 2000);
+ } else {
+@@ -566,7 +485,7 @@ static int ov2680_power_on(struct ov2680_dev *sensor)
+
+ ret = clk_prepare_enable(sensor->xvclk);
+ if (ret < 0)
+- return ret;
++ goto err_disable_regulators;
+
+ sensor->is_enabled = true;
+
+@@ -576,6 +495,10 @@ static int ov2680_power_on(struct ov2680_dev *sensor)
+ ov2680_stream_disable(sensor);
+
+ return 0;
++
++err_disable_regulators:
++ regulator_bulk_disable(OV2680_NUM_SUPPLIES, sensor->supplies);
++ return ret;
+ }
+
+ static int ov2680_s_power(struct v4l2_subdev *sd, int on)
+@@ -590,15 +513,10 @@ static int ov2680_s_power(struct v4l2_subdev *sd, int on)
+ else
+ ret = ov2680_power_off(sensor);
+
+- mutex_unlock(&sensor->lock);
+-
+- if (on && ret == 0) {
+- ret = v4l2_ctrl_handler_setup(&sensor->ctrls.handler);
+- if (ret < 0)
+- return ret;
+-
++ if (on && ret == 0)
+ ret = ov2680_mode_restore(sensor);
+- }
++
++ mutex_unlock(&sensor->lock);
+
+ return ret;
+ }
+@@ -664,7 +582,6 @@ static int ov2680_get_fmt(struct v4l2_subdev *sd,
+ {
+ struct ov2680_dev *sensor = to_ov2680_dev(sd);
+ struct v4l2_mbus_framefmt *fmt = NULL;
+- int ret = 0;
+
+ if (format->pad != 0)
+ return -EINVAL;
+@@ -672,22 +589,17 @@ static int ov2680_get_fmt(struct v4l2_subdev *sd,
+ mutex_lock(&sensor->lock);
+
+ if (format->which == V4L2_SUBDEV_FORMAT_TRY) {
+-#ifdef CONFIG_VIDEO_V4L2_SUBDEV_API
+ fmt = v4l2_subdev_get_try_format(&sensor->sd, sd_state,
+ format->pad);
+-#else
+- ret = -EINVAL;
+-#endif
+ } else {
+ fmt = &sensor->fmt;
+ }
+
+- if (fmt)
+- format->format = *fmt;
++ format->format = *fmt;
+
+ mutex_unlock(&sensor->lock);
+
+- return ret;
++ return 0;
+ }
+
+ static int ov2680_set_fmt(struct v4l2_subdev *sd,
+@@ -695,43 +607,35 @@ static int ov2680_set_fmt(struct v4l2_subdev *sd,
+ struct v4l2_subdev_format *format)
+ {
+ struct ov2680_dev *sensor = to_ov2680_dev(sd);
+- struct v4l2_mbus_framefmt *fmt = &format->format;
+-#ifdef CONFIG_VIDEO_V4L2_SUBDEV_API
+ struct v4l2_mbus_framefmt *try_fmt;
+-#endif
+ const struct ov2680_mode_info *mode;
+ int ret = 0;
+
+ if (format->pad != 0)
+ return -EINVAL;
+
+- mutex_lock(&sensor->lock);
+-
+- if (sensor->is_streaming) {
+- ret = -EBUSY;
+- goto unlock;
+- }
+-
+ mode = v4l2_find_nearest_size(ov2680_mode_data,
+- ARRAY_SIZE(ov2680_mode_data), width,
+- height, fmt->width, fmt->height);
+- if (!mode) {
+- ret = -EINVAL;
+- goto unlock;
+- }
++ ARRAY_SIZE(ov2680_mode_data),
++ width, height,
++ format->format.width,
++ format->format.height);
++ if (!mode)
++ return -EINVAL;
++
++ ov2680_fill_format(sensor, &format->format, mode->width, mode->height);
+
+ if (format->which == V4L2_SUBDEV_FORMAT_TRY) {
+-#ifdef CONFIG_VIDEO_V4L2_SUBDEV_API
+ try_fmt = v4l2_subdev_get_try_format(sd, sd_state, 0);
+- format->format = *try_fmt;
+-#endif
+- goto unlock;
++ *try_fmt = format->format;
++ return 0;
+ }
+
+- fmt->width = mode->width;
+- fmt->height = mode->height;
+- fmt->code = sensor->fmt.code;
+- fmt->colorspace = sensor->fmt.colorspace;
++ mutex_lock(&sensor->lock);
++
++ if (sensor->is_streaming) {
++ ret = -EBUSY;
++ goto unlock;
++ }
+
+ sensor->current_mode = mode;
+ sensor->fmt = format->format;
+@@ -746,16 +650,11 @@ unlock:
+ static int ov2680_init_cfg(struct v4l2_subdev *sd,
+ struct v4l2_subdev_state *sd_state)
+ {
+- struct v4l2_subdev_format fmt = {
+- .which = sd_state ? V4L2_SUBDEV_FORMAT_TRY
+- : V4L2_SUBDEV_FORMAT_ACTIVE,
+- .format = {
+- .width = 800,
+- .height = 600,
+- }
+- };
++ struct ov2680_dev *sensor = to_ov2680_dev(sd);
+
+- return ov2680_set_fmt(sd, sd_state, &fmt);
++ ov2680_fill_format(sensor, &sd_state->pads[0].try_fmt,
++ OV2680_DEFAULT_WIDTH, OV2680_DEFAULT_HEIGHT);
++ return 0;
+ }
+
+ static int ov2680_enum_frame_size(struct v4l2_subdev *sd,
+@@ -794,66 +693,23 @@ static int ov2680_enum_frame_interval(struct v4l2_subdev *sd,
+ return 0;
+ }
+
+-static int ov2680_g_volatile_ctrl(struct v4l2_ctrl *ctrl)
+-{
+- struct v4l2_subdev *sd = ctrl_to_sd(ctrl);
+- struct ov2680_dev *sensor = to_ov2680_dev(sd);
+- struct ov2680_ctrls *ctrls = &sensor->ctrls;
+- int val;
+-
+- if (!sensor->is_enabled)
+- return 0;
+-
+- switch (ctrl->id) {
+- case V4L2_CID_GAIN:
+- val = ov2680_gain_get(sensor);
+- if (val < 0)
+- return val;
+- ctrls->gain->val = val;
+- break;
+- case V4L2_CID_EXPOSURE:
+- val = ov2680_exposure_get(sensor);
+- if (val < 0)
+- return val;
+- ctrls->exposure->val = val;
+- break;
+- }
+-
+- return 0;
+-}
+-
+ static int ov2680_s_ctrl(struct v4l2_ctrl *ctrl)
+ {
+ struct v4l2_subdev *sd = ctrl_to_sd(ctrl);
+ struct ov2680_dev *sensor = to_ov2680_dev(sd);
+- struct ov2680_ctrls *ctrls = &sensor->ctrls;
+
+ if (!sensor->is_enabled)
+ return 0;
+
+ switch (ctrl->id) {
+- case V4L2_CID_AUTOGAIN:
+- return ov2680_gain_set(sensor, !!ctrl->val);
+ case V4L2_CID_GAIN:
+- return ov2680_gain_set(sensor, !!ctrls->auto_gain->val);
+- case V4L2_CID_EXPOSURE_AUTO:
+- return ov2680_exposure_set(sensor, !!ctrl->val);
++ return ov2680_gain_set(sensor, ctrl->val);
+ case V4L2_CID_EXPOSURE:
+- return ov2680_exposure_set(sensor, !!ctrls->auto_exp->val);
++ return ov2680_exposure_set(sensor, ctrl->val);
+ case V4L2_CID_VFLIP:
+- if (sensor->is_streaming)
+- return -EBUSY;
+- if (ctrl->val)
+- return ov2680_vflip_enable(sensor);
+- else
+- return ov2680_vflip_disable(sensor);
++ return ov2680_set_vflip(sensor, ctrl->val);
+ case V4L2_CID_HFLIP:
+- if (sensor->is_streaming)
+- return -EBUSY;
+- if (ctrl->val)
+- return ov2680_hflip_enable(sensor);
+- else
+- return ov2680_hflip_disable(sensor);
++ return ov2680_set_hflip(sensor, ctrl->val);
+ case V4L2_CID_TEST_PATTERN:
+ return ov2680_test_pattern_set(sensor, ctrl->val);
+ default:
+@@ -864,7 +720,6 @@ static int ov2680_s_ctrl(struct v4l2_ctrl *ctrl)
+ }
+
+ static const struct v4l2_ctrl_ops ov2680_ctrl_ops = {
+- .g_volatile_ctrl = ov2680_g_volatile_ctrl,
+ .s_ctrl = ov2680_s_ctrl,
+ };
+
+@@ -898,11 +753,8 @@ static int ov2680_mode_init(struct ov2680_dev *sensor)
+ const struct ov2680_mode_info *init_mode;
+
+ /* set initial mode */
+- sensor->fmt.code = MEDIA_BUS_FMT_SBGGR10_1X10;
+- sensor->fmt.width = 800;
+- sensor->fmt.height = 600;
+- sensor->fmt.field = V4L2_FIELD_NONE;
+- sensor->fmt.colorspace = V4L2_COLORSPACE_SRGB;
++ ov2680_fill_format(sensor, &sensor->fmt,
++ OV2680_DEFAULT_WIDTH, OV2680_DEFAULT_HEIGHT);
+
+ sensor->frame_interval.denominator = OV2680_FRAME_RATE;
+ sensor->frame_interval.numerator = 1;
+@@ -926,9 +778,7 @@ static int ov2680_v4l2_register(struct ov2680_dev *sensor)
+ v4l2_i2c_subdev_init(&sensor->sd, sensor->i2c_client,
+ &ov2680_subdev_ops);
+
+-#ifdef CONFIG_VIDEO_V4L2_SUBDEV_API
+ sensor->sd.flags = V4L2_SUBDEV_FL_HAS_DEVNODE;
+-#endif
+ sensor->pad.flags = MEDIA_PAD_FL_SOURCE;
+ sensor->sd.entity.function = MEDIA_ENT_F_CAM_SENSOR;
+
+@@ -936,7 +786,7 @@ static int ov2680_v4l2_register(struct ov2680_dev *sensor)
+ if (ret < 0)
+ return ret;
+
+- v4l2_ctrl_handler_init(hdl, 7);
++ v4l2_ctrl_handler_init(hdl, 5);
+
+ hdl->lock = &sensor->lock;
+
+@@ -948,16 +798,9 @@ static int ov2680_v4l2_register(struct ov2680_dev *sensor)
+ ARRAY_SIZE(test_pattern_menu) - 1,
+ 0, 0, test_pattern_menu);
+
+- ctrls->auto_exp = v4l2_ctrl_new_std_menu(hdl, ops,
+- V4L2_CID_EXPOSURE_AUTO,
+- V4L2_EXPOSURE_MANUAL, 0,
+- V4L2_EXPOSURE_AUTO);
+-
+ ctrls->exposure = v4l2_ctrl_new_std(hdl, ops, V4L2_CID_EXPOSURE,
+ 0, 32767, 1, 0);
+
+- ctrls->auto_gain = v4l2_ctrl_new_std(hdl, ops, V4L2_CID_AUTOGAIN,
+- 0, 1, 1, 1);
+ ctrls->gain = v4l2_ctrl_new_std(hdl, ops, V4L2_CID_GAIN, 0, 2047, 1, 0);
+
+ if (hdl->error) {
+@@ -965,14 +808,9 @@ static int ov2680_v4l2_register(struct ov2680_dev *sensor)
+ goto cleanup_entity;
+ }
+
+- ctrls->gain->flags |= V4L2_CTRL_FLAG_VOLATILE;
+- ctrls->exposure->flags |= V4L2_CTRL_FLAG_VOLATILE;
+ ctrls->vflip->flags |= V4L2_CTRL_FLAG_MODIFY_LAYOUT;
+ ctrls->hflip->flags |= V4L2_CTRL_FLAG_MODIFY_LAYOUT;
+
+- v4l2_ctrl_auto_cluster(2, &ctrls->auto_gain, 0, true);
+- v4l2_ctrl_auto_cluster(2, &ctrls->auto_exp, 1, true);
+-
+ sensor->sd.ctrl_handler = hdl;
+
+ ret = v4l2_async_register_subdev(&sensor->sd);
+diff --git a/drivers/media/i2c/ov5640.c b/drivers/media/i2c/ov5640.c
+index 1536649b9e90f..bf50705ad6d36 100644
+--- a/drivers/media/i2c/ov5640.c
++++ b/drivers/media/i2c/ov5640.c
+@@ -568,9 +568,7 @@ static const struct reg_value ov5640_init_setting[] = {
+ {0x4001, 0x02, 0, 0}, {0x4004, 0x02, 0, 0}, {0x3000, 0x00, 0, 0},
+ {0x3002, 0x1c, 0, 0}, {0x3004, 0xff, 0, 0}, {0x3006, 0xc3, 0, 0},
+ {0x302e, 0x08, 0, 0}, {0x4300, 0x3f, 0, 0},
+- {0x501f, 0x00, 0, 0}, {0x4407, 0x04, 0, 0},
+- {0x440e, 0x00, 0, 0}, {0x460b, 0x35, 0, 0}, {0x460c, 0x22, 0, 0},
+- {0x4837, 0x0a, 0, 0}, {0x3824, 0x02, 0, 0},
++ {0x501f, 0x00, 0, 0}, {0x440e, 0x00, 0, 0}, {0x4837, 0x0a, 0, 0},
+ {0x5000, 0xa7, 0, 0}, {0x5001, 0xa3, 0, 0}, {0x5180, 0xff, 0, 0},
+ {0x5181, 0xf2, 0, 0}, {0x5182, 0x00, 0, 0}, {0x5183, 0x14, 0, 0},
+ {0x5184, 0x25, 0, 0}, {0x5185, 0x24, 0, 0}, {0x5186, 0x09, 0, 0},
+@@ -634,7 +632,8 @@ static const struct reg_value ov5640_setting_low_res[] = {
+ {0x3a0a, 0x00, 0, 0}, {0x3a0b, 0xf6, 0, 0}, {0x3a0e, 0x03, 0, 0},
+ {0x3a0d, 0x04, 0, 0}, {0x3a14, 0x03, 0, 0}, {0x3a15, 0xd8, 0, 0},
+ {0x4001, 0x02, 0, 0}, {0x4004, 0x02, 0, 0},
+- {0x4407, 0x04, 0, 0}, {0x5001, 0xa3, 0, 0},
++ {0x4407, 0x04, 0, 0}, {0x460b, 0x35, 0, 0}, {0x460c, 0x22, 0, 0},
++ {0x3824, 0x02, 0, 0}, {0x5001, 0xa3, 0, 0},
+ };
+
+ static const struct reg_value ov5640_setting_720P_1280_720[] = {
+@@ -2453,16 +2452,13 @@ static void ov5640_power(struct ov5640_dev *sensor, bool enable)
+ static void ov5640_powerup_sequence(struct ov5640_dev *sensor)
+ {
+ if (sensor->pwdn_gpio) {
+- gpiod_set_value_cansleep(sensor->reset_gpio, 0);
++ gpiod_set_value_cansleep(sensor->reset_gpio, 1);
+
+ /* camera power cycle */
+ ov5640_power(sensor, false);
+- usleep_range(5000, 10000);
++ usleep_range(5000, 10000); /* t2 */
+ ov5640_power(sensor, true);
+- usleep_range(5000, 10000);
+-
+- gpiod_set_value_cansleep(sensor->reset_gpio, 1);
+- usleep_range(1000, 2000);
++ usleep_range(1000, 2000); /* t3 */
+
+ gpiod_set_value_cansleep(sensor->reset_gpio, 0);
+ } else {
+@@ -2470,7 +2466,7 @@ static void ov5640_powerup_sequence(struct ov5640_dev *sensor)
+ ov5640_write_reg(sensor, OV5640_REG_SYS_CTRL0,
+ OV5640_REG_SYS_CTRL0_SW_RST);
+ }
+- usleep_range(20000, 25000);
++ usleep_range(20000, 25000); /* t4 */
+
+ /*
+ * software standby: allows registers programming;
+@@ -2543,9 +2539,9 @@ static int ov5640_set_power_mipi(struct ov5640_dev *sensor, bool on)
+ * "ov5640_set_stream_mipi()")
+ * [4] = 0 : Power up MIPI HS Tx
+ * [3] = 0 : Power up MIPI LS Rx
+- * [2] = 0 : MIPI interface disabled
++ * [2] = 1 : MIPI interface enabled
+ */
+- ret = ov5640_write_reg(sensor, OV5640_REG_IO_MIPI_CTRL00, 0x40);
++ ret = ov5640_write_reg(sensor, OV5640_REG_IO_MIPI_CTRL00, 0x44);
+ if (ret)
+ return ret;
+
+diff --git a/drivers/media/i2c/rdacm21.c b/drivers/media/i2c/rdacm21.c
+index 9ccc56c30d3b0..d269c541ebe4c 100644
+--- a/drivers/media/i2c/rdacm21.c
++++ b/drivers/media/i2c/rdacm21.c
+@@ -351,7 +351,7 @@ static void ov10640_power_up(struct rdacm21_device *dev)
+ static int ov10640_check_id(struct rdacm21_device *dev)
+ {
+ unsigned int i;
+- u8 val;
++ u8 val = 0;
+
+ /* Read OV10640 ID to test communications. */
+ for (i = 0; i < OV10640_PID_TIMEOUT; ++i) {
+diff --git a/drivers/media/i2c/tvp5150.c b/drivers/media/i2c/tvp5150.c
+index 859f1cb2fa744..84f87c016f9b5 100644
+--- a/drivers/media/i2c/tvp5150.c
++++ b/drivers/media/i2c/tvp5150.c
+@@ -2068,6 +2068,10 @@ static int tvp5150_parse_dt(struct tvp5150 *decoder, struct device_node *np)
+ tvpc->ent.name = devm_kasprintf(dev, GFP_KERNEL, "%s %s",
+ v4l2c->name, v4l2c->label ?
+ v4l2c->label : "");
++ if (!tvpc->ent.name) {
++ ret = -ENOMEM;
++ goto err_free;
++ }
+ }
+
+ ep_np = of_graph_get_endpoint_by_regs(np, TVP5150_PAD_VID_OUT, 0);
+diff --git a/drivers/media/pci/Kconfig b/drivers/media/pci/Kconfig
+index 480194543d055..ee095bde0b686 100644
+--- a/drivers/media/pci/Kconfig
++++ b/drivers/media/pci/Kconfig
+@@ -73,7 +73,7 @@ config VIDEO_PCI_SKELETON
+ Enable build of the skeleton PCI driver, used as a reference
+ when developing new drivers.
+
+-source "drivers/media/pci/intel/ipu3/Kconfig"
++source "drivers/media/pci/intel/Kconfig"
+
+ endif #MEDIA_PCI_SUPPORT
+ endif #PCI
+diff --git a/drivers/media/pci/bt8xx/dst.c b/drivers/media/pci/bt8xx/dst.c
+index 3e52a51982d76..110651e478314 100644
+--- a/drivers/media/pci/bt8xx/dst.c
++++ b/drivers/media/pci/bt8xx/dst.c
+@@ -1722,7 +1722,7 @@ struct dst_state *dst_attach(struct dst_state *state, struct dvb_adapter *dvb_ad
+ return state; /* Manu (DST is a card not a frontend) */
+ }
+
+-EXPORT_SYMBOL(dst_attach);
++EXPORT_SYMBOL_GPL(dst_attach);
+
+ static const struct dvb_frontend_ops dst_dvbt_ops = {
+ .delsys = { SYS_DVBT },
+diff --git a/drivers/media/pci/bt8xx/dst_ca.c b/drivers/media/pci/bt8xx/dst_ca.c
+index 85fcdc59f0d18..571392d80ccc6 100644
+--- a/drivers/media/pci/bt8xx/dst_ca.c
++++ b/drivers/media/pci/bt8xx/dst_ca.c
+@@ -668,7 +668,7 @@ struct dvb_device *dst_ca_attach(struct dst_state *dst, struct dvb_adapter *dvb_
+ return NULL;
+ }
+
+-EXPORT_SYMBOL(dst_ca_attach);
++EXPORT_SYMBOL_GPL(dst_ca_attach);
+
+ MODULE_DESCRIPTION("DST DVB-S/T/C Combo CA driver");
+ MODULE_AUTHOR("Manu Abraham");
+diff --git a/drivers/media/pci/cx23885/cx23885-dvb.c b/drivers/media/pci/cx23885/cx23885-dvb.c
+index 8fd5b6ef24282..7551ca4a322a4 100644
+--- a/drivers/media/pci/cx23885/cx23885-dvb.c
++++ b/drivers/media/pci/cx23885/cx23885-dvb.c
+@@ -2459,16 +2459,10 @@ static int dvb_register(struct cx23885_tsport *port)
+ request_module("%s", info.type);
+ client_tuner = i2c_new_client_device(&dev->i2c_bus[1].i2c_adap, &info);
+ if (!i2c_client_has_driver(client_tuner)) {
+- module_put(client_demod->dev.driver->owner);
+- i2c_unregister_device(client_demod);
+- port->i2c_client_demod = NULL;
+ goto frontend_detach;
+ }
+ if (!try_module_get(client_tuner->dev.driver->owner)) {
+ i2c_unregister_device(client_tuner);
+- module_put(client_demod->dev.driver->owner);
+- i2c_unregister_device(client_demod);
+- port->i2c_client_demod = NULL;
+ goto frontend_detach;
+ }
+ port->i2c_client_tuner = client_tuner;
+@@ -2505,16 +2499,10 @@ static int dvb_register(struct cx23885_tsport *port)
+ request_module("%s", info.type);
+ client_tuner = i2c_new_client_device(&dev->i2c_bus[1].i2c_adap, &info);
+ if (!i2c_client_has_driver(client_tuner)) {
+- module_put(client_demod->dev.driver->owner);
+- i2c_unregister_device(client_demod);
+- port->i2c_client_demod = NULL;
+ goto frontend_detach;
+ }
+ if (!try_module_get(client_tuner->dev.driver->owner)) {
+ i2c_unregister_device(client_tuner);
+- module_put(client_demod->dev.driver->owner);
+- i2c_unregister_device(client_demod);
+- port->i2c_client_demod = NULL;
+ goto frontend_detach;
+ }
+ port->i2c_client_tuner = client_tuner;
+diff --git a/drivers/media/pci/ddbridge/ddbridge-dummy-fe.c b/drivers/media/pci/ddbridge/ddbridge-dummy-fe.c
+index 6868a0c4fc82a..520ebd16b0c44 100644
+--- a/drivers/media/pci/ddbridge/ddbridge-dummy-fe.c
++++ b/drivers/media/pci/ddbridge/ddbridge-dummy-fe.c
+@@ -112,7 +112,7 @@ struct dvb_frontend *ddbridge_dummy_fe_qam_attach(void)
+ state->frontend.demodulator_priv = state;
+ return &state->frontend;
+ }
+-EXPORT_SYMBOL(ddbridge_dummy_fe_qam_attach);
++EXPORT_SYMBOL_GPL(ddbridge_dummy_fe_qam_attach);
+
+ static const struct dvb_frontend_ops ddbridge_dummy_fe_qam_ops = {
+ .delsys = { SYS_DVBC_ANNEX_A },
+diff --git a/drivers/media/pci/intel/Kconfig b/drivers/media/pci/intel/Kconfig
+new file mode 100644
+index 0000000000000..51b18fce6a1de
+--- /dev/null
++++ b/drivers/media/pci/intel/Kconfig
+@@ -0,0 +1,10 @@
++# SPDX-License-Identifier: GPL-2.0-only
++config IPU_BRIDGE
++ tristate
++ depends on I2C && ACPI
++ help
++ This is a helper module for the IPU bridge, which can be
++ used by ipu3 and other drivers. In order to handle module
++ dependencies, this is selected by each driver that needs it.
++
++source "drivers/media/pci/intel/ipu3/Kconfig"
+diff --git a/drivers/media/pci/intel/Makefile b/drivers/media/pci/intel/Makefile
+index 0b4236c4db49a..951191a7e4011 100644
+--- a/drivers/media/pci/intel/Makefile
++++ b/drivers/media/pci/intel/Makefile
+@@ -1,6 +1,6 @@
+ # SPDX-License-Identifier: GPL-2.0-only
+ #
+-# Makefile for the IPU3 cio2 and ImGU drivers
++# Makefile for the IPU drivers
+ #
+-
++obj-$(CONFIG_IPU_BRIDGE) += ipu-bridge.o
+ obj-y += ipu3/
+diff --git a/drivers/media/pci/intel/ipu-bridge.c b/drivers/media/pci/intel/ipu-bridge.c
+new file mode 100644
+index 0000000000000..c5c44fb43c97a
+--- /dev/null
++++ b/drivers/media/pci/intel/ipu-bridge.c
+@@ -0,0 +1,502 @@
++// SPDX-License-Identifier: GPL-2.0
++/* Author: Dan Scally <djrscally@gmail.com> */
++
++#include <linux/acpi.h>
++#include <linux/device.h>
++#include <linux/i2c.h>
++#include <linux/pci.h>
++#include <linux/property.h>
++#include <media/v4l2-fwnode.h>
++
++#include "ipu-bridge.h"
++
++/*
++ * Extend this array with ACPI Hardware IDs of devices known to be working
++ * plus the number of link-frequencies expected by their drivers, along with
++ * the frequency values in hertz. This is somewhat opportunistic way of adding
++ * support for this for now in the hopes of a better source for the information
++ * (possibly some encoded value in the SSDB buffer that we're unaware of)
++ * becoming apparent in the future.
++ *
++ * Do not add an entry for a sensor that is not actually supported.
++ */
++static const struct ipu_sensor_config ipu_supported_sensors[] = {
++ /* Omnivision OV5693 */
++ IPU_SENSOR_CONFIG("INT33BE", 1, 419200000),
++ /* Omnivision OV8865 */
++ IPU_SENSOR_CONFIG("INT347A", 1, 360000000),
++ /* Omnivision OV7251 */
++ IPU_SENSOR_CONFIG("INT347E", 1, 319200000),
++ /* Omnivision OV2680 */
++ IPU_SENSOR_CONFIG("OVTI2680", 0),
++ /* Omnivision ov8856 */
++ IPU_SENSOR_CONFIG("OVTI8856", 3, 180000000, 360000000, 720000000),
++ /* Omnivision ov2740 */
++ IPU_SENSOR_CONFIG("INT3474", 1, 360000000),
++ /* Hynix hi556 */
++ IPU_SENSOR_CONFIG("INT3537", 1, 437000000),
++ /* Omnivision ov13b10 */
++ IPU_SENSOR_CONFIG("OVTIDB10", 1, 560000000),
++};
++
++static const struct ipu_property_names prop_names = {
++ .clock_frequency = "clock-frequency",
++ .rotation = "rotation",
++ .orientation = "orientation",
++ .bus_type = "bus-type",
++ .data_lanes = "data-lanes",
++ .remote_endpoint = "remote-endpoint",
++ .link_frequencies = "link-frequencies",
++};
++
++static const char * const ipu_vcm_types[] = {
++ "ad5823",
++ "dw9714",
++ "ad5816",
++ "dw9719",
++ "dw9718",
++ "dw9806b",
++ "wv517s",
++ "lc898122xa",
++ "lc898212axb",
++};
++
++static int ipu_bridge_read_acpi_buffer(struct acpi_device *adev, char *id,
++ void *data, u32 size)
++{
++ struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
++ union acpi_object *obj;
++ acpi_status status;
++ int ret = 0;
++
++ status = acpi_evaluate_object(adev->handle, id, NULL, &buffer);
++ if (ACPI_FAILURE(status))
++ return -ENODEV;
++
++ obj = buffer.pointer;
++ if (!obj) {
++ dev_err(&adev->dev, "Couldn't locate ACPI buffer\n");
++ return -ENODEV;
++ }
++
++ if (obj->type != ACPI_TYPE_BUFFER) {
++ dev_err(&adev->dev, "Not an ACPI buffer\n");
++ ret = -ENODEV;
++ goto out_free_buff;
++ }
++
++ if (obj->buffer.length > size) {
++ dev_err(&adev->dev, "Given buffer is too small\n");
++ ret = -EINVAL;
++ goto out_free_buff;
++ }
++
++ memcpy(data, obj->buffer.pointer, obj->buffer.length);
++
++out_free_buff:
++ kfree(buffer.pointer);
++ return ret;
++}
++
++static u32 ipu_bridge_parse_rotation(struct ipu_sensor *sensor)
++{
++ switch (sensor->ssdb.degree) {
++ case IPU_SENSOR_ROTATION_NORMAL:
++ return 0;
++ case IPU_SENSOR_ROTATION_INVERTED:
++ return 180;
++ default:
++ dev_warn(&sensor->adev->dev,
++ "Unknown rotation %d. Assume 0 degree rotation\n",
++ sensor->ssdb.degree);
++ return 0;
++ }
++}
++
++static enum v4l2_fwnode_orientation ipu_bridge_parse_orientation(struct ipu_sensor *sensor)
++{
++ switch (sensor->pld->panel) {
++ case ACPI_PLD_PANEL_FRONT:
++ return V4L2_FWNODE_ORIENTATION_FRONT;
++ case ACPI_PLD_PANEL_BACK:
++ return V4L2_FWNODE_ORIENTATION_BACK;
++ case ACPI_PLD_PANEL_TOP:
++ case ACPI_PLD_PANEL_LEFT:
++ case ACPI_PLD_PANEL_RIGHT:
++ case ACPI_PLD_PANEL_UNKNOWN:
++ return V4L2_FWNODE_ORIENTATION_EXTERNAL;
++ default:
++ dev_warn(&sensor->adev->dev, "Unknown _PLD panel value %d\n",
++ sensor->pld->panel);
++ return V4L2_FWNODE_ORIENTATION_EXTERNAL;
++ }
++}
++
++static void ipu_bridge_create_fwnode_properties(
++ struct ipu_sensor *sensor,
++ struct ipu_bridge *bridge,
++ const struct ipu_sensor_config *cfg)
++{
++ u32 rotation;
++ enum v4l2_fwnode_orientation orientation;
++
++ rotation = ipu_bridge_parse_rotation(sensor);
++ orientation = ipu_bridge_parse_orientation(sensor);
++
++ sensor->prop_names = prop_names;
++
++ sensor->local_ref[0] = SOFTWARE_NODE_REFERENCE(&sensor->swnodes[SWNODE_IPU_ENDPOINT]);
++ sensor->remote_ref[0] = SOFTWARE_NODE_REFERENCE(&sensor->swnodes[SWNODE_SENSOR_ENDPOINT]);
++
++ sensor->dev_properties[0] = PROPERTY_ENTRY_U32(
++ sensor->prop_names.clock_frequency,
++ sensor->ssdb.mclkspeed);
++ sensor->dev_properties[1] = PROPERTY_ENTRY_U32(
++ sensor->prop_names.rotation,
++ rotation);
++ sensor->dev_properties[2] = PROPERTY_ENTRY_U32(
++ sensor->prop_names.orientation,
++ orientation);
++ if (sensor->ssdb.vcmtype) {
++ sensor->vcm_ref[0] =
++ SOFTWARE_NODE_REFERENCE(&sensor->swnodes[SWNODE_VCM]);
++ sensor->dev_properties[3] =
++ PROPERTY_ENTRY_REF_ARRAY("lens-focus", sensor->vcm_ref);
++ }
++
++ sensor->ep_properties[0] = PROPERTY_ENTRY_U32(
++ sensor->prop_names.bus_type,
++ V4L2_FWNODE_BUS_TYPE_CSI2_DPHY);
++ sensor->ep_properties[1] = PROPERTY_ENTRY_U32_ARRAY_LEN(
++ sensor->prop_names.data_lanes,
++ bridge->data_lanes,
++ sensor->ssdb.lanes);
++ sensor->ep_properties[2] = PROPERTY_ENTRY_REF_ARRAY(
++ sensor->prop_names.remote_endpoint,
++ sensor->local_ref);
++
++ if (cfg->nr_link_freqs > 0)
++ sensor->ep_properties[3] = PROPERTY_ENTRY_U64_ARRAY_LEN(
++ sensor->prop_names.link_frequencies,
++ cfg->link_freqs,
++ cfg->nr_link_freqs);
++
++ sensor->ipu_properties[0] = PROPERTY_ENTRY_U32_ARRAY_LEN(
++ sensor->prop_names.data_lanes,
++ bridge->data_lanes,
++ sensor->ssdb.lanes);
++ sensor->ipu_properties[1] = PROPERTY_ENTRY_REF_ARRAY(
++ sensor->prop_names.remote_endpoint,
++ sensor->remote_ref);
++}
++
++static void ipu_bridge_init_swnode_names(struct ipu_sensor *sensor)
++{
++ snprintf(sensor->node_names.remote_port,
++ sizeof(sensor->node_names.remote_port),
++ SWNODE_GRAPH_PORT_NAME_FMT, sensor->ssdb.link);
++ snprintf(sensor->node_names.port,
++ sizeof(sensor->node_names.port),
++ SWNODE_GRAPH_PORT_NAME_FMT, 0); /* Always port 0 */
++ snprintf(sensor->node_names.endpoint,
++ sizeof(sensor->node_names.endpoint),
++ SWNODE_GRAPH_ENDPOINT_NAME_FMT, 0); /* And endpoint 0 */
++}
++
++static void ipu_bridge_init_swnode_group(struct ipu_sensor *sensor)
++{
++ struct software_node *nodes = sensor->swnodes;
++
++ sensor->group[SWNODE_SENSOR_HID] = &nodes[SWNODE_SENSOR_HID];
++ sensor->group[SWNODE_SENSOR_PORT] = &nodes[SWNODE_SENSOR_PORT];
++ sensor->group[SWNODE_SENSOR_ENDPOINT] = &nodes[SWNODE_SENSOR_ENDPOINT];
++ sensor->group[SWNODE_IPU_PORT] = &nodes[SWNODE_IPU_PORT];
++ sensor->group[SWNODE_IPU_ENDPOINT] = &nodes[SWNODE_IPU_ENDPOINT];
++ if (sensor->ssdb.vcmtype)
++ sensor->group[SWNODE_VCM] = &nodes[SWNODE_VCM];
++}
++
++static void ipu_bridge_create_connection_swnodes(struct ipu_bridge *bridge,
++ struct ipu_sensor *sensor)
++{
++ struct software_node *nodes = sensor->swnodes;
++
++ ipu_bridge_init_swnode_names(sensor);
++
++ nodes[SWNODE_SENSOR_HID] = NODE_SENSOR(sensor->name,
++ sensor->dev_properties);
++ nodes[SWNODE_SENSOR_PORT] = NODE_PORT(sensor->node_names.port,
++ &nodes[SWNODE_SENSOR_HID]);
++ nodes[SWNODE_SENSOR_ENDPOINT] = NODE_ENDPOINT(
++ sensor->node_names.endpoint,
++ &nodes[SWNODE_SENSOR_PORT],
++ sensor->ep_properties);
++ nodes[SWNODE_IPU_PORT] = NODE_PORT(sensor->node_names.remote_port,
++ &bridge->ipu_hid_node);
++ nodes[SWNODE_IPU_ENDPOINT] = NODE_ENDPOINT(
++ sensor->node_names.endpoint,
++ &nodes[SWNODE_IPU_PORT],
++ sensor->ipu_properties);
++ if (sensor->ssdb.vcmtype) {
++ /* append ssdb.link to distinguish VCM nodes with same HID */
++ snprintf(sensor->node_names.vcm, sizeof(sensor->node_names.vcm),
++ "%s-%u", ipu_vcm_types[sensor->ssdb.vcmtype - 1],
++ sensor->ssdb.link);
++ nodes[SWNODE_VCM] = NODE_VCM(sensor->node_names.vcm);
++ }
++
++ ipu_bridge_init_swnode_group(sensor);
++}
++
++static void ipu_bridge_instantiate_vcm_i2c_client(struct ipu_sensor *sensor)
++{
++ struct i2c_board_info board_info = { };
++ char name[16];
++
++ if (!sensor->ssdb.vcmtype)
++ return;
++
++ snprintf(name, sizeof(name), "%s-VCM", acpi_dev_name(sensor->adev));
++ board_info.dev_name = name;
++ strscpy(board_info.type, ipu_vcm_types[sensor->ssdb.vcmtype - 1],
++ ARRAY_SIZE(board_info.type));
++ board_info.swnode = &sensor->swnodes[SWNODE_VCM];
++
++ sensor->vcm_i2c_client =
++ i2c_acpi_new_device_by_fwnode(acpi_fwnode_handle(sensor->adev),
++ 1, &board_info);
++ if (IS_ERR(sensor->vcm_i2c_client)) {
++ dev_warn(&sensor->adev->dev, "Error instantiation VCM i2c-client: %ld\n",
++ PTR_ERR(sensor->vcm_i2c_client));
++ sensor->vcm_i2c_client = NULL;
++ }
++}
++
++static void ipu_bridge_unregister_sensors(struct ipu_bridge *bridge)
++{
++ struct ipu_sensor *sensor;
++ unsigned int i;
++
++ for (i = 0; i < bridge->n_sensors; i++) {
++ sensor = &bridge->sensors[i];
++ software_node_unregister_node_group(sensor->group);
++ ACPI_FREE(sensor->pld);
++ acpi_dev_put(sensor->adev);
++ i2c_unregister_device(sensor->vcm_i2c_client);
++ }
++}
++
++static int ipu_bridge_connect_sensor(const struct ipu_sensor_config *cfg,
++ struct ipu_bridge *bridge,
++ struct pci_dev *ipu)
++{
++ struct fwnode_handle *fwnode, *primary;
++ struct ipu_sensor *sensor;
++ struct acpi_device *adev;
++ acpi_status status;
++ int ret;
++
++ for_each_acpi_dev_match(adev, cfg->hid, NULL, -1) {
++ if (!adev->status.enabled)
++ continue;
++
++ if (bridge->n_sensors >= CIO2_NUM_PORTS) {
++ acpi_dev_put(adev);
++ dev_err(&ipu->dev, "Exceeded available IPU ports\n");
++ return -EINVAL;
++ }
++
++ sensor = &bridge->sensors[bridge->n_sensors];
++ /*
++ * Borrow our adev ref to the sensor for now, on success
++ * acpi_dev_get(adev) is done further below.
++ */
++ sensor->adev = adev;
++
++ ret = ipu_bridge_read_acpi_buffer(adev, "SSDB",
++ &sensor->ssdb,
++ sizeof(sensor->ssdb));
++ if (ret)
++ goto err_put_adev;
++
++ snprintf(sensor->name, sizeof(sensor->name), "%s-%u",
++ cfg->hid, sensor->ssdb.link);
++
++ if (sensor->ssdb.vcmtype > ARRAY_SIZE(ipu_vcm_types)) {
++ dev_warn(&adev->dev, "Unknown VCM type %d\n",
++ sensor->ssdb.vcmtype);
++ sensor->ssdb.vcmtype = 0;
++ }
++
++ status = acpi_get_physical_device_location(adev->handle, &sensor->pld);
++ if (ACPI_FAILURE(status)) {
++ ret = -ENODEV;
++ goto err_put_adev;
++ }
++
++ if (sensor->ssdb.lanes > IPU_MAX_LANES) {
++ dev_err(&adev->dev,
++ "Number of lanes in SSDB is invalid\n");
++ ret = -EINVAL;
++ goto err_free_pld;
++ }
++
++ ipu_bridge_create_fwnode_properties(sensor, bridge, cfg);
++ ipu_bridge_create_connection_swnodes(bridge, sensor);
++
++ ret = software_node_register_node_group(sensor->group);
++ if (ret)
++ goto err_free_pld;
++
++ fwnode = software_node_fwnode(&sensor->swnodes[
++ SWNODE_SENSOR_HID]);
++ if (!fwnode) {
++ ret = -ENODEV;
++ goto err_free_swnodes;
++ }
++
++ sensor->adev = acpi_dev_get(adev);
++
++ primary = acpi_fwnode_handle(adev);
++ primary->secondary = fwnode;
++
++ ipu_bridge_instantiate_vcm_i2c_client(sensor);
++
++ dev_info(&ipu->dev, "Found supported sensor %s\n",
++ acpi_dev_name(adev));
++
++ bridge->n_sensors++;
++ }
++
++ return 0;
++
++err_free_swnodes:
++ software_node_unregister_node_group(sensor->group);
++err_free_pld:
++ ACPI_FREE(sensor->pld);
++err_put_adev:
++ acpi_dev_put(adev);
++ return ret;
++}
++
++static int ipu_bridge_connect_sensors(struct ipu_bridge *bridge,
++ struct pci_dev *ipu)
++{
++ unsigned int i;
++ int ret;
++
++ for (i = 0; i < ARRAY_SIZE(ipu_supported_sensors); i++) {
++ const struct ipu_sensor_config *cfg =
++ &ipu_supported_sensors[i];
++
++ ret = ipu_bridge_connect_sensor(cfg, bridge, ipu);
++ if (ret)
++ goto err_unregister_sensors;
++ }
++
++ return 0;
++
++err_unregister_sensors:
++ ipu_bridge_unregister_sensors(bridge);
++ return ret;
++}
++
++/*
++ * The VCM cannot be probed until the PMIC is completely setup. We cannot rely
++ * on -EPROBE_DEFER for this, since the consumer<->supplier relations between
++ * the VCM and regulators/clks are not described in ACPI, instead they are
++ * passed as board-data to the PMIC drivers. Since -PROBE_DEFER does not work
++ * for the clks/regulators the VCM i2c-clients must not be instantiated until
++ * the PMIC is fully setup.
++ *
++ * The sensor/VCM ACPI device has an ACPI _DEP on the PMIC, check this using the
++ * acpi_dev_ready_for_enumeration() helper, like the i2c-core-acpi code does
++ * for the sensors.
++ */
++static int ipu_bridge_sensors_are_ready(void)
++{
++ struct acpi_device *adev;
++ bool ready = true;
++ unsigned int i;
++
++ for (i = 0; i < ARRAY_SIZE(ipu_supported_sensors); i++) {
++ const struct ipu_sensor_config *cfg =
++ &ipu_supported_sensors[i];
++
++ for_each_acpi_dev_match(adev, cfg->hid, NULL, -1) {
++ if (!adev->status.enabled)
++ continue;
++
++ if (!acpi_dev_ready_for_enumeration(adev))
++ ready = false;
++ }
++ }
++
++ return ready;
++}
++
++int ipu_bridge_init(struct pci_dev *ipu)
++{
++ struct device *dev = &ipu->dev;
++ struct fwnode_handle *fwnode;
++ struct ipu_bridge *bridge;
++ unsigned int i;
++ int ret;
++
++ if (!ipu_bridge_sensors_are_ready())
++ return -EPROBE_DEFER;
++
++ bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
++ if (!bridge)
++ return -ENOMEM;
++
++ strscpy(bridge->ipu_node_name, IPU_HID,
++ sizeof(bridge->ipu_node_name));
++ bridge->ipu_hid_node.name = bridge->ipu_node_name;
++
++ ret = software_node_register(&bridge->ipu_hid_node);
++ if (ret < 0) {
++ dev_err(dev, "Failed to register the IPU HID node\n");
++ goto err_free_bridge;
++ }
++
++ /*
++ * Map the lane arrangement, which is fixed for the IPU3 (meaning we
++ * only need one, rather than one per sensor). We include it as a
++ * member of the struct ipu_bridge rather than a global variable so
++ * that it survives if the module is unloaded along with the rest of
++ * the struct.
++ */
++ for (i = 0; i < IPU_MAX_LANES; i++)
++ bridge->data_lanes[i] = i + 1;
++
++ ret = ipu_bridge_connect_sensors(bridge, ipu);
++ if (ret || bridge->n_sensors == 0)
++ goto err_unregister_ipu;
++
++ dev_info(dev, "Connected %d cameras\n", bridge->n_sensors);
++
++ fwnode = software_node_fwnode(&bridge->ipu_hid_node);
++ if (!fwnode) {
++ dev_err(dev, "Error getting fwnode from ipu software_node\n");
++ ret = -ENODEV;
++ goto err_unregister_sensors;
++ }
++
++ set_secondary_fwnode(dev, fwnode);
++
++ return 0;
++
++err_unregister_sensors:
++ ipu_bridge_unregister_sensors(bridge);
++err_unregister_ipu:
++ software_node_unregister(&bridge->ipu_hid_node);
++err_free_bridge:
++ kfree(bridge);
++
++ return ret;
++}
++EXPORT_SYMBOL_NS_GPL(ipu_bridge_init, INTEL_IPU_BRIDGE);
++
++MODULE_LICENSE("GPL");
++MODULE_DESCRIPTION("Intel IPU Sensors Bridge driver");
+diff --git a/drivers/media/pci/intel/ipu-bridge.h b/drivers/media/pci/intel/ipu-bridge.h
+new file mode 100644
+index 0000000000000..1ff0b2d04d929
+--- /dev/null
++++ b/drivers/media/pci/intel/ipu-bridge.h
+@@ -0,0 +1,153 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++/* Author: Dan Scally <djrscally@gmail.com> */
++#ifndef __IPU_BRIDGE_H
++#define __IPU_BRIDGE_H
++
++#include <linux/property.h>
++#include <linux/types.h>
++
++#include "ipu3/ipu3-cio2.h"
++
++struct i2c_client;
++
++#define IPU_HID "INT343E"
++#define IPU_MAX_LANES 4
++#define MAX_NUM_LINK_FREQS 3
++
++/* Values are educated guesses as we don't have a spec */
++#define IPU_SENSOR_ROTATION_NORMAL 0
++#define IPU_SENSOR_ROTATION_INVERTED 1
++
++#define IPU_SENSOR_CONFIG(_HID, _NR, ...) \
++ (const struct ipu_sensor_config) { \
++ .hid = _HID, \
++ .nr_link_freqs = _NR, \
++ .link_freqs = { __VA_ARGS__ } \
++ }
++
++#define NODE_SENSOR(_HID, _PROPS) \
++ (const struct software_node) { \
++ .name = _HID, \
++ .properties = _PROPS, \
++ }
++
++#define NODE_PORT(_PORT, _SENSOR_NODE) \
++ (const struct software_node) { \
++ .name = _PORT, \
++ .parent = _SENSOR_NODE, \
++ }
++
++#define NODE_ENDPOINT(_EP, _PORT, _PROPS) \
++ (const struct software_node) { \
++ .name = _EP, \
++ .parent = _PORT, \
++ .properties = _PROPS, \
++ }
++
++#define NODE_VCM(_TYPE) \
++ (const struct software_node) { \
++ .name = _TYPE, \
++ }
++
++enum ipu_sensor_swnodes {
++ SWNODE_SENSOR_HID,
++ SWNODE_SENSOR_PORT,
++ SWNODE_SENSOR_ENDPOINT,
++ SWNODE_IPU_PORT,
++ SWNODE_IPU_ENDPOINT,
++ /* Must be last because it is optional / maybe empty */
++ SWNODE_VCM,
++ SWNODE_COUNT
++};
++
++/* Data representation as it is in ACPI SSDB buffer */
++struct ipu_sensor_ssdb {
++ u8 version;
++ u8 sku;
++ u8 guid_csi2[16];
++ u8 devfunction;
++ u8 bus;
++ u32 dphylinkenfuses;
++ u32 clockdiv;
++ u8 link;
++ u8 lanes;
++ u32 csiparams[10];
++ u32 maxlanespeed;
++ u8 sensorcalibfileidx;
++ u8 sensorcalibfileidxInMBZ[3];
++ u8 romtype;
++ u8 vcmtype;
++ u8 platforminfo;
++ u8 platformsubinfo;
++ u8 flash;
++ u8 privacyled;
++ u8 degree;
++ u8 mipilinkdefined;
++ u32 mclkspeed;
++ u8 controllogicid;
++ u8 reserved1[3];
++ u8 mclkport;
++ u8 reserved2[13];
++} __packed;
++
++struct ipu_property_names {
++ char clock_frequency[16];
++ char rotation[9];
++ char orientation[12];
++ char bus_type[9];
++ char data_lanes[11];
++ char remote_endpoint[16];
++ char link_frequencies[17];
++};
++
++struct ipu_node_names {
++ char port[7];
++ char endpoint[11];
++ char remote_port[7];
++ char vcm[16];
++};
++
++struct ipu_sensor_config {
++ const char *hid;
++ const u8 nr_link_freqs;
++ const u64 link_freqs[MAX_NUM_LINK_FREQS];
++};
++
++struct ipu_sensor {
++ /* append ssdb.link(u8) in "-%u" format as suffix of HID */
++ char name[ACPI_ID_LEN + 4];
++ struct acpi_device *adev;
++ struct i2c_client *vcm_i2c_client;
++
++ /* SWNODE_COUNT + 1 for terminating NULL */
++ const struct software_node *group[SWNODE_COUNT + 1];
++ struct software_node swnodes[SWNODE_COUNT];
++ struct ipu_node_names node_names;
++
++ struct ipu_sensor_ssdb ssdb;
++ struct acpi_pld_info *pld;
++
++ struct ipu_property_names prop_names;
++ struct property_entry ep_properties[5];
++ struct property_entry dev_properties[5];
++ struct property_entry ipu_properties[3];
++ struct software_node_ref_args local_ref[1];
++ struct software_node_ref_args remote_ref[1];
++ struct software_node_ref_args vcm_ref[1];
++};
++
++struct ipu_bridge {
++ char ipu_node_name[ACPI_ID_LEN];
++ struct software_node ipu_hid_node;
++ u32 data_lanes[4];
++ unsigned int n_sensors;
++ struct ipu_sensor sensors[CIO2_NUM_PORTS];
++};
++
++#if IS_ENABLED(CONFIG_IPU_BRIDGE)
++int ipu_bridge_init(struct pci_dev *ipu);
++#else
++static inline int ipu_bridge_init(struct pci_dev *ipu) { return 0; }
++#endif
++
++#endif
+diff --git a/drivers/media/pci/intel/ipu3/Kconfig b/drivers/media/pci/intel/ipu3/Kconfig
+index 65b0c1598fbf1..0951545eab21a 100644
+--- a/drivers/media/pci/intel/ipu3/Kconfig
++++ b/drivers/media/pci/intel/ipu3/Kconfig
+@@ -8,6 +8,7 @@ config VIDEO_IPU3_CIO2
+ select VIDEO_V4L2_SUBDEV_API
+ select V4L2_FWNODE
+ select VIDEOBUF2_DMA_SG
++ select IPU_BRIDGE if CIO2_BRIDGE
+
+ help
+ This is the Intel IPU3 CIO2 CSI-2 receiver unit, found in Intel
+diff --git a/drivers/media/pci/intel/ipu3/Makefile b/drivers/media/pci/intel/ipu3/Makefile
+index 933777e6ea8ab..429d516452e42 100644
+--- a/drivers/media/pci/intel/ipu3/Makefile
++++ b/drivers/media/pci/intel/ipu3/Makefile
+@@ -2,4 +2,3 @@
+ obj-$(CONFIG_VIDEO_IPU3_CIO2) += ipu3-cio2.o
+
+ ipu3-cio2-y += ipu3-cio2-main.o
+-ipu3-cio2-$(CONFIG_CIO2_BRIDGE) += cio2-bridge.o
+diff --git a/drivers/media/pci/intel/ipu3/cio2-bridge.c b/drivers/media/pci/intel/ipu3/cio2-bridge.c
+deleted file mode 100644
+index 3c2accfe54551..0000000000000
+--- a/drivers/media/pci/intel/ipu3/cio2-bridge.c
++++ /dev/null
+@@ -1,494 +0,0 @@
+-// SPDX-License-Identifier: GPL-2.0
+-/* Author: Dan Scally <djrscally@gmail.com> */
+-
+-#include <linux/acpi.h>
+-#include <linux/device.h>
+-#include <linux/i2c.h>
+-#include <linux/pci.h>
+-#include <linux/property.h>
+-#include <media/v4l2-fwnode.h>
+-
+-#include "cio2-bridge.h"
+-
+-/*
+- * Extend this array with ACPI Hardware IDs of devices known to be working
+- * plus the number of link-frequencies expected by their drivers, along with
+- * the frequency values in hertz. This is somewhat opportunistic way of adding
+- * support for this for now in the hopes of a better source for the information
+- * (possibly some encoded value in the SSDB buffer that we're unaware of)
+- * becoming apparent in the future.
+- *
+- * Do not add an entry for a sensor that is not actually supported.
+- */
+-static const struct cio2_sensor_config cio2_supported_sensors[] = {
+- /* Omnivision OV5693 */
+- CIO2_SENSOR_CONFIG("INT33BE", 1, 419200000),
+- /* Omnivision OV8865 */
+- CIO2_SENSOR_CONFIG("INT347A", 1, 360000000),
+- /* Omnivision OV7251 */
+- CIO2_SENSOR_CONFIG("INT347E", 1, 319200000),
+- /* Omnivision OV2680 */
+- CIO2_SENSOR_CONFIG("OVTI2680", 0),
+- /* Omnivision ov8856 */
+- CIO2_SENSOR_CONFIG("OVTI8856", 3, 180000000, 360000000, 720000000),
+- /* Omnivision ov2740 */
+- CIO2_SENSOR_CONFIG("INT3474", 1, 360000000),
+- /* Hynix hi556 */
+- CIO2_SENSOR_CONFIG("INT3537", 1, 437000000),
+- /* Omnivision ov13b10 */
+- CIO2_SENSOR_CONFIG("OVTIDB10", 1, 560000000),
+-};
+-
+-static const struct cio2_property_names prop_names = {
+- .clock_frequency = "clock-frequency",
+- .rotation = "rotation",
+- .orientation = "orientation",
+- .bus_type = "bus-type",
+- .data_lanes = "data-lanes",
+- .remote_endpoint = "remote-endpoint",
+- .link_frequencies = "link-frequencies",
+-};
+-
+-static const char * const cio2_vcm_types[] = {
+- "ad5823",
+- "dw9714",
+- "ad5816",
+- "dw9719",
+- "dw9718",
+- "dw9806b",
+- "wv517s",
+- "lc898122xa",
+- "lc898212axb",
+-};
+-
+-static int cio2_bridge_read_acpi_buffer(struct acpi_device *adev, char *id,
+- void *data, u32 size)
+-{
+- struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
+- union acpi_object *obj;
+- acpi_status status;
+- int ret = 0;
+-
+- status = acpi_evaluate_object(adev->handle, id, NULL, &buffer);
+- if (ACPI_FAILURE(status))
+- return -ENODEV;
+-
+- obj = buffer.pointer;
+- if (!obj) {
+- dev_err(&adev->dev, "Couldn't locate ACPI buffer\n");
+- return -ENODEV;
+- }
+-
+- if (obj->type != ACPI_TYPE_BUFFER) {
+- dev_err(&adev->dev, "Not an ACPI buffer\n");
+- ret = -ENODEV;
+- goto out_free_buff;
+- }
+-
+- if (obj->buffer.length > size) {
+- dev_err(&adev->dev, "Given buffer is too small\n");
+- ret = -EINVAL;
+- goto out_free_buff;
+- }
+-
+- memcpy(data, obj->buffer.pointer, obj->buffer.length);
+-
+-out_free_buff:
+- kfree(buffer.pointer);
+- return ret;
+-}
+-
+-static u32 cio2_bridge_parse_rotation(struct cio2_sensor *sensor)
+-{
+- switch (sensor->ssdb.degree) {
+- case CIO2_SENSOR_ROTATION_NORMAL:
+- return 0;
+- case CIO2_SENSOR_ROTATION_INVERTED:
+- return 180;
+- default:
+- dev_warn(&sensor->adev->dev,
+- "Unknown rotation %d. Assume 0 degree rotation\n",
+- sensor->ssdb.degree);
+- return 0;
+- }
+-}
+-
+-static enum v4l2_fwnode_orientation cio2_bridge_parse_orientation(struct cio2_sensor *sensor)
+-{
+- switch (sensor->pld->panel) {
+- case ACPI_PLD_PANEL_FRONT:
+- return V4L2_FWNODE_ORIENTATION_FRONT;
+- case ACPI_PLD_PANEL_BACK:
+- return V4L2_FWNODE_ORIENTATION_BACK;
+- case ACPI_PLD_PANEL_TOP:
+- case ACPI_PLD_PANEL_LEFT:
+- case ACPI_PLD_PANEL_RIGHT:
+- case ACPI_PLD_PANEL_UNKNOWN:
+- return V4L2_FWNODE_ORIENTATION_EXTERNAL;
+- default:
+- dev_warn(&sensor->adev->dev, "Unknown _PLD panel value %d\n",
+- sensor->pld->panel);
+- return V4L2_FWNODE_ORIENTATION_EXTERNAL;
+- }
+-}
+-
+-static void cio2_bridge_create_fwnode_properties(
+- struct cio2_sensor *sensor,
+- struct cio2_bridge *bridge,
+- const struct cio2_sensor_config *cfg)
+-{
+- u32 rotation;
+- enum v4l2_fwnode_orientation orientation;
+-
+- rotation = cio2_bridge_parse_rotation(sensor);
+- orientation = cio2_bridge_parse_orientation(sensor);
+-
+- sensor->prop_names = prop_names;
+-
+- sensor->local_ref[0] = SOFTWARE_NODE_REFERENCE(&sensor->swnodes[SWNODE_CIO2_ENDPOINT]);
+- sensor->remote_ref[0] = SOFTWARE_NODE_REFERENCE(&sensor->swnodes[SWNODE_SENSOR_ENDPOINT]);
+-
+- sensor->dev_properties[0] = PROPERTY_ENTRY_U32(
+- sensor->prop_names.clock_frequency,
+- sensor->ssdb.mclkspeed);
+- sensor->dev_properties[1] = PROPERTY_ENTRY_U32(
+- sensor->prop_names.rotation,
+- rotation);
+- sensor->dev_properties[2] = PROPERTY_ENTRY_U32(
+- sensor->prop_names.orientation,
+- orientation);
+- if (sensor->ssdb.vcmtype) {
+- sensor->vcm_ref[0] =
+- SOFTWARE_NODE_REFERENCE(&sensor->swnodes[SWNODE_VCM]);
+- sensor->dev_properties[3] =
+- PROPERTY_ENTRY_REF_ARRAY("lens-focus", sensor->vcm_ref);
+- }
+-
+- sensor->ep_properties[0] = PROPERTY_ENTRY_U32(
+- sensor->prop_names.bus_type,
+- V4L2_FWNODE_BUS_TYPE_CSI2_DPHY);
+- sensor->ep_properties[1] = PROPERTY_ENTRY_U32_ARRAY_LEN(
+- sensor->prop_names.data_lanes,
+- bridge->data_lanes,
+- sensor->ssdb.lanes);
+- sensor->ep_properties[2] = PROPERTY_ENTRY_REF_ARRAY(
+- sensor->prop_names.remote_endpoint,
+- sensor->local_ref);
+-
+- if (cfg->nr_link_freqs > 0)
+- sensor->ep_properties[3] = PROPERTY_ENTRY_U64_ARRAY_LEN(
+- sensor->prop_names.link_frequencies,
+- cfg->link_freqs,
+- cfg->nr_link_freqs);
+-
+- sensor->cio2_properties[0] = PROPERTY_ENTRY_U32_ARRAY_LEN(
+- sensor->prop_names.data_lanes,
+- bridge->data_lanes,
+- sensor->ssdb.lanes);
+- sensor->cio2_properties[1] = PROPERTY_ENTRY_REF_ARRAY(
+- sensor->prop_names.remote_endpoint,
+- sensor->remote_ref);
+-}
+-
+-static void cio2_bridge_init_swnode_names(struct cio2_sensor *sensor)
+-{
+- snprintf(sensor->node_names.remote_port,
+- sizeof(sensor->node_names.remote_port),
+- SWNODE_GRAPH_PORT_NAME_FMT, sensor->ssdb.link);
+- snprintf(sensor->node_names.port,
+- sizeof(sensor->node_names.port),
+- SWNODE_GRAPH_PORT_NAME_FMT, 0); /* Always port 0 */
+- snprintf(sensor->node_names.endpoint,
+- sizeof(sensor->node_names.endpoint),
+- SWNODE_GRAPH_ENDPOINT_NAME_FMT, 0); /* And endpoint 0 */
+-}
+-
+-static void cio2_bridge_init_swnode_group(struct cio2_sensor *sensor)
+-{
+- struct software_node *nodes = sensor->swnodes;
+-
+- sensor->group[SWNODE_SENSOR_HID] = &nodes[SWNODE_SENSOR_HID];
+- sensor->group[SWNODE_SENSOR_PORT] = &nodes[SWNODE_SENSOR_PORT];
+- sensor->group[SWNODE_SENSOR_ENDPOINT] = &nodes[SWNODE_SENSOR_ENDPOINT];
+- sensor->group[SWNODE_CIO2_PORT] = &nodes[SWNODE_CIO2_PORT];
+- sensor->group[SWNODE_CIO2_ENDPOINT] = &nodes[SWNODE_CIO2_ENDPOINT];
+- if (sensor->ssdb.vcmtype)
+- sensor->group[SWNODE_VCM] = &nodes[SWNODE_VCM];
+-}
+-
+-static void cio2_bridge_create_connection_swnodes(struct cio2_bridge *bridge,
+- struct cio2_sensor *sensor)
+-{
+- struct software_node *nodes = sensor->swnodes;
+- char vcm_name[ACPI_ID_LEN + 4];
+-
+- cio2_bridge_init_swnode_names(sensor);
+-
+- nodes[SWNODE_SENSOR_HID] = NODE_SENSOR(sensor->name,
+- sensor->dev_properties);
+- nodes[SWNODE_SENSOR_PORT] = NODE_PORT(sensor->node_names.port,
+- &nodes[SWNODE_SENSOR_HID]);
+- nodes[SWNODE_SENSOR_ENDPOINT] = NODE_ENDPOINT(
+- sensor->node_names.endpoint,
+- &nodes[SWNODE_SENSOR_PORT],
+- sensor->ep_properties);
+- nodes[SWNODE_CIO2_PORT] = NODE_PORT(sensor->node_names.remote_port,
+- &bridge->cio2_hid_node);
+- nodes[SWNODE_CIO2_ENDPOINT] = NODE_ENDPOINT(
+- sensor->node_names.endpoint,
+- &nodes[SWNODE_CIO2_PORT],
+- sensor->cio2_properties);
+- if (sensor->ssdb.vcmtype) {
+- /* append ssdb.link to distinguish VCM nodes with same HID */
+- snprintf(vcm_name, sizeof(vcm_name), "%s-%u",
+- cio2_vcm_types[sensor->ssdb.vcmtype - 1],
+- sensor->ssdb.link);
+- nodes[SWNODE_VCM] = NODE_VCM(vcm_name);
+- }
+-
+- cio2_bridge_init_swnode_group(sensor);
+-}
+-
+-static void cio2_bridge_instantiate_vcm_i2c_client(struct cio2_sensor *sensor)
+-{
+- struct i2c_board_info board_info = { };
+- char name[16];
+-
+- if (!sensor->ssdb.vcmtype)
+- return;
+-
+- snprintf(name, sizeof(name), "%s-VCM", acpi_dev_name(sensor->adev));
+- board_info.dev_name = name;
+- strscpy(board_info.type, cio2_vcm_types[sensor->ssdb.vcmtype - 1],
+- ARRAY_SIZE(board_info.type));
+- board_info.swnode = &sensor->swnodes[SWNODE_VCM];
+-
+- sensor->vcm_i2c_client =
+- i2c_acpi_new_device_by_fwnode(acpi_fwnode_handle(sensor->adev),
+- 1, &board_info);
+- if (IS_ERR(sensor->vcm_i2c_client)) {
+- dev_warn(&sensor->adev->dev, "Error instantiation VCM i2c-client: %ld\n",
+- PTR_ERR(sensor->vcm_i2c_client));
+- sensor->vcm_i2c_client = NULL;
+- }
+-}
+-
+-static void cio2_bridge_unregister_sensors(struct cio2_bridge *bridge)
+-{
+- struct cio2_sensor *sensor;
+- unsigned int i;
+-
+- for (i = 0; i < bridge->n_sensors; i++) {
+- sensor = &bridge->sensors[i];
+- software_node_unregister_node_group(sensor->group);
+- ACPI_FREE(sensor->pld);
+- acpi_dev_put(sensor->adev);
+- i2c_unregister_device(sensor->vcm_i2c_client);
+- }
+-}
+-
+-static int cio2_bridge_connect_sensor(const struct cio2_sensor_config *cfg,
+- struct cio2_bridge *bridge,
+- struct pci_dev *cio2)
+-{
+- struct fwnode_handle *fwnode, *primary;
+- struct cio2_sensor *sensor;
+- struct acpi_device *adev;
+- acpi_status status;
+- int ret;
+-
+- for_each_acpi_dev_match(adev, cfg->hid, NULL, -1) {
+- if (!adev->status.enabled)
+- continue;
+-
+- if (bridge->n_sensors >= CIO2_NUM_PORTS) {
+- acpi_dev_put(adev);
+- dev_err(&cio2->dev, "Exceeded available CIO2 ports\n");
+- return -EINVAL;
+- }
+-
+- sensor = &bridge->sensors[bridge->n_sensors];
+-
+- ret = cio2_bridge_read_acpi_buffer(adev, "SSDB",
+- &sensor->ssdb,
+- sizeof(sensor->ssdb));
+- if (ret)
+- goto err_put_adev;
+-
+- snprintf(sensor->name, sizeof(sensor->name), "%s-%u",
+- cfg->hid, sensor->ssdb.link);
+-
+- if (sensor->ssdb.vcmtype > ARRAY_SIZE(cio2_vcm_types)) {
+- dev_warn(&adev->dev, "Unknown VCM type %d\n",
+- sensor->ssdb.vcmtype);
+- sensor->ssdb.vcmtype = 0;
+- }
+-
+- status = acpi_get_physical_device_location(adev->handle, &sensor->pld);
+- if (ACPI_FAILURE(status)) {
+- ret = -ENODEV;
+- goto err_put_adev;
+- }
+-
+- if (sensor->ssdb.lanes > CIO2_MAX_LANES) {
+- dev_err(&adev->dev,
+- "Number of lanes in SSDB is invalid\n");
+- ret = -EINVAL;
+- goto err_free_pld;
+- }
+-
+- cio2_bridge_create_fwnode_properties(sensor, bridge, cfg);
+- cio2_bridge_create_connection_swnodes(bridge, sensor);
+-
+- ret = software_node_register_node_group(sensor->group);
+- if (ret)
+- goto err_free_pld;
+-
+- fwnode = software_node_fwnode(&sensor->swnodes[
+- SWNODE_SENSOR_HID]);
+- if (!fwnode) {
+- ret = -ENODEV;
+- goto err_free_swnodes;
+- }
+-
+- sensor->adev = acpi_dev_get(adev);
+-
+- primary = acpi_fwnode_handle(adev);
+- primary->secondary = fwnode;
+-
+- cio2_bridge_instantiate_vcm_i2c_client(sensor);
+-
+- dev_info(&cio2->dev, "Found supported sensor %s\n",
+- acpi_dev_name(adev));
+-
+- bridge->n_sensors++;
+- }
+-
+- return 0;
+-
+-err_free_swnodes:
+- software_node_unregister_node_group(sensor->group);
+-err_free_pld:
+- ACPI_FREE(sensor->pld);
+-err_put_adev:
+- acpi_dev_put(adev);
+- return ret;
+-}
+-
+-static int cio2_bridge_connect_sensors(struct cio2_bridge *bridge,
+- struct pci_dev *cio2)
+-{
+- unsigned int i;
+- int ret;
+-
+- for (i = 0; i < ARRAY_SIZE(cio2_supported_sensors); i++) {
+- const struct cio2_sensor_config *cfg =
+- &cio2_supported_sensors[i];
+-
+- ret = cio2_bridge_connect_sensor(cfg, bridge, cio2);
+- if (ret)
+- goto err_unregister_sensors;
+- }
+-
+- return 0;
+-
+-err_unregister_sensors:
+- cio2_bridge_unregister_sensors(bridge);
+- return ret;
+-}
+-
+-/*
+- * The VCM cannot be probed until the PMIC is completely setup. We cannot rely
+- * on -EPROBE_DEFER for this, since the consumer<->supplier relations between
+- * the VCM and regulators/clks are not described in ACPI, instead they are
+- * passed as board-data to the PMIC drivers. Since -PROBE_DEFER does not work
+- * for the clks/regulators the VCM i2c-clients must not be instantiated until
+- * the PMIC is fully setup.
+- *
+- * The sensor/VCM ACPI device has an ACPI _DEP on the PMIC, check this using the
+- * acpi_dev_ready_for_enumeration() helper, like the i2c-core-acpi code does
+- * for the sensors.
+- */
+-static int cio2_bridge_sensors_are_ready(void)
+-{
+- struct acpi_device *adev;
+- bool ready = true;
+- unsigned int i;
+-
+- for (i = 0; i < ARRAY_SIZE(cio2_supported_sensors); i++) {
+- const struct cio2_sensor_config *cfg =
+- &cio2_supported_sensors[i];
+-
+- for_each_acpi_dev_match(adev, cfg->hid, NULL, -1) {
+- if (!adev->status.enabled)
+- continue;
+-
+- if (!acpi_dev_ready_for_enumeration(adev))
+- ready = false;
+- }
+- }
+-
+- return ready;
+-}
+-
+-int cio2_bridge_init(struct pci_dev *cio2)
+-{
+- struct device *dev = &cio2->dev;
+- struct fwnode_handle *fwnode;
+- struct cio2_bridge *bridge;
+- unsigned int i;
+- int ret;
+-
+- if (!cio2_bridge_sensors_are_ready())
+- return -EPROBE_DEFER;
+-
+- bridge = kzalloc(sizeof(*bridge), GFP_KERNEL);
+- if (!bridge)
+- return -ENOMEM;
+-
+- strscpy(bridge->cio2_node_name, CIO2_HID,
+- sizeof(bridge->cio2_node_name));
+- bridge->cio2_hid_node.name = bridge->cio2_node_name;
+-
+- ret = software_node_register(&bridge->cio2_hid_node);
+- if (ret < 0) {
+- dev_err(dev, "Failed to register the CIO2 HID node\n");
+- goto err_free_bridge;
+- }
+-
+- /*
+- * Map the lane arrangement, which is fixed for the IPU3 (meaning we
+- * only need one, rather than one per sensor). We include it as a
+- * member of the struct cio2_bridge rather than a global variable so
+- * that it survives if the module is unloaded along with the rest of
+- * the struct.
+- */
+- for (i = 0; i < CIO2_MAX_LANES; i++)
+- bridge->data_lanes[i] = i + 1;
+-
+- ret = cio2_bridge_connect_sensors(bridge, cio2);
+- if (ret || bridge->n_sensors == 0)
+- goto err_unregister_cio2;
+-
+- dev_info(dev, "Connected %d cameras\n", bridge->n_sensors);
+-
+- fwnode = software_node_fwnode(&bridge->cio2_hid_node);
+- if (!fwnode) {
+- dev_err(dev, "Error getting fwnode from cio2 software_node\n");
+- ret = -ENODEV;
+- goto err_unregister_sensors;
+- }
+-
+- set_secondary_fwnode(dev, fwnode);
+-
+- return 0;
+-
+-err_unregister_sensors:
+- cio2_bridge_unregister_sensors(bridge);
+-err_unregister_cio2:
+- software_node_unregister(&bridge->cio2_hid_node);
+-err_free_bridge:
+- kfree(bridge);
+-
+- return ret;
+-}
+diff --git a/drivers/media/pci/intel/ipu3/cio2-bridge.h b/drivers/media/pci/intel/ipu3/cio2-bridge.h
+deleted file mode 100644
+index b76ed8a641e20..0000000000000
+--- a/drivers/media/pci/intel/ipu3/cio2-bridge.h
++++ /dev/null
+@@ -1,146 +0,0 @@
+-/* SPDX-License-Identifier: GPL-2.0 */
+-/* Author: Dan Scally <djrscally@gmail.com> */
+-#ifndef __CIO2_BRIDGE_H
+-#define __CIO2_BRIDGE_H
+-
+-#include <linux/property.h>
+-#include <linux/types.h>
+-
+-#include "ipu3-cio2.h"
+-
+-struct i2c_client;
+-
+-#define CIO2_HID "INT343E"
+-#define CIO2_MAX_LANES 4
+-#define MAX_NUM_LINK_FREQS 3
+-
+-/* Values are educated guesses as we don't have a spec */
+-#define CIO2_SENSOR_ROTATION_NORMAL 0
+-#define CIO2_SENSOR_ROTATION_INVERTED 1
+-
+-#define CIO2_SENSOR_CONFIG(_HID, _NR, ...) \
+- (const struct cio2_sensor_config) { \
+- .hid = _HID, \
+- .nr_link_freqs = _NR, \
+- .link_freqs = { __VA_ARGS__ } \
+- }
+-
+-#define NODE_SENSOR(_HID, _PROPS) \
+- (const struct software_node) { \
+- .name = _HID, \
+- .properties = _PROPS, \
+- }
+-
+-#define NODE_PORT(_PORT, _SENSOR_NODE) \
+- (const struct software_node) { \
+- .name = _PORT, \
+- .parent = _SENSOR_NODE, \
+- }
+-
+-#define NODE_ENDPOINT(_EP, _PORT, _PROPS) \
+- (const struct software_node) { \
+- .name = _EP, \
+- .parent = _PORT, \
+- .properties = _PROPS, \
+- }
+-
+-#define NODE_VCM(_TYPE) \
+- (const struct software_node) { \
+- .name = _TYPE, \
+- }
+-
+-enum cio2_sensor_swnodes {
+- SWNODE_SENSOR_HID,
+- SWNODE_SENSOR_PORT,
+- SWNODE_SENSOR_ENDPOINT,
+- SWNODE_CIO2_PORT,
+- SWNODE_CIO2_ENDPOINT,
+- /* Must be last because it is optional / maybe empty */
+- SWNODE_VCM,
+- SWNODE_COUNT
+-};
+-
+-/* Data representation as it is in ACPI SSDB buffer */
+-struct cio2_sensor_ssdb {
+- u8 version;
+- u8 sku;
+- u8 guid_csi2[16];
+- u8 devfunction;
+- u8 bus;
+- u32 dphylinkenfuses;
+- u32 clockdiv;
+- u8 link;
+- u8 lanes;
+- u32 csiparams[10];
+- u32 maxlanespeed;
+- u8 sensorcalibfileidx;
+- u8 sensorcalibfileidxInMBZ[3];
+- u8 romtype;
+- u8 vcmtype;
+- u8 platforminfo;
+- u8 platformsubinfo;
+- u8 flash;
+- u8 privacyled;
+- u8 degree;
+- u8 mipilinkdefined;
+- u32 mclkspeed;
+- u8 controllogicid;
+- u8 reserved1[3];
+- u8 mclkport;
+- u8 reserved2[13];
+-} __packed;
+-
+-struct cio2_property_names {
+- char clock_frequency[16];
+- char rotation[9];
+- char orientation[12];
+- char bus_type[9];
+- char data_lanes[11];
+- char remote_endpoint[16];
+- char link_frequencies[17];
+-};
+-
+-struct cio2_node_names {
+- char port[7];
+- char endpoint[11];
+- char remote_port[7];
+-};
+-
+-struct cio2_sensor_config {
+- const char *hid;
+- const u8 nr_link_freqs;
+- const u64 link_freqs[MAX_NUM_LINK_FREQS];
+-};
+-
+-struct cio2_sensor {
+- /* append ssdb.link(u8) in "-%u" format as suffix of HID */
+- char name[ACPI_ID_LEN + 4];
+- struct acpi_device *adev;
+- struct i2c_client *vcm_i2c_client;
+-
+- /* SWNODE_COUNT + 1 for terminating NULL */
+- const struct software_node *group[SWNODE_COUNT + 1];
+- struct software_node swnodes[SWNODE_COUNT];
+- struct cio2_node_names node_names;
+-
+- struct cio2_sensor_ssdb ssdb;
+- struct acpi_pld_info *pld;
+-
+- struct cio2_property_names prop_names;
+- struct property_entry ep_properties[5];
+- struct property_entry dev_properties[5];
+- struct property_entry cio2_properties[3];
+- struct software_node_ref_args local_ref[1];
+- struct software_node_ref_args remote_ref[1];
+- struct software_node_ref_args vcm_ref[1];
+-};
+-
+-struct cio2_bridge {
+- char cio2_node_name[ACPI_ID_LEN];
+- struct software_node cio2_hid_node;
+- u32 data_lanes[4];
+- unsigned int n_sensors;
+- struct cio2_sensor sensors[CIO2_NUM_PORTS];
+-};
+-
+-#endif
+diff --git a/drivers/media/pci/intel/ipu3/ipu3-cio2-main.c b/drivers/media/pci/intel/ipu3/ipu3-cio2-main.c
+index 3c84cb1216320..03a7ab4d2e693 100644
+--- a/drivers/media/pci/intel/ipu3/ipu3-cio2-main.c
++++ b/drivers/media/pci/intel/ipu3/ipu3-cio2-main.c
+@@ -29,6 +29,7 @@
+ #include <media/v4l2-ioctl.h>
+ #include <media/videobuf2-dma-sg.h>
+
++#include "../ipu-bridge.h"
+ #include "ipu3-cio2.h"
+
+ struct ipu3_cio2_fmt {
+@@ -1727,7 +1728,7 @@ static int cio2_pci_probe(struct pci_dev *pci_dev,
+ return -EINVAL;
+ }
+
+- r = cio2_bridge_init(pci_dev);
++ r = ipu_bridge_init(pci_dev);
+ if (r)
+ return r;
+ }
+@@ -2060,3 +2061,4 @@ MODULE_AUTHOR("Yuning Pu <yuning.pu@intel.com>");
+ MODULE_AUTHOR("Yong Zhi <yong.zhi@intel.com>");
+ MODULE_LICENSE("GPL v2");
+ MODULE_DESCRIPTION("IPU3 CIO2 driver");
++MODULE_IMPORT_NS(INTEL_IPU_BRIDGE);
+diff --git a/drivers/media/pci/intel/ipu3/ipu3-cio2.h b/drivers/media/pci/intel/ipu3/ipu3-cio2.h
+index 3a1f394e05aa7..d731ce8adbe31 100644
+--- a/drivers/media/pci/intel/ipu3/ipu3-cio2.h
++++ b/drivers/media/pci/intel/ipu3/ipu3-cio2.h
+@@ -459,10 +459,4 @@ static inline struct cio2_queue *vb2q_to_cio2_queue(struct vb2_queue *vq)
+ return container_of(vq, struct cio2_queue, vbq);
+ }
+
+-#if IS_ENABLED(CONFIG_CIO2_BRIDGE)
+-int cio2_bridge_init(struct pci_dev *cio2);
+-#else
+-static inline int cio2_bridge_init(struct pci_dev *cio2) { return 0; }
+-#endif
+-
+ #endif
+diff --git a/drivers/media/platform/amphion/vdec.c b/drivers/media/platform/amphion/vdec.c
+index 6515f3cdb7a74..133d77d1ea0c3 100644
+--- a/drivers/media/platform/amphion/vdec.c
++++ b/drivers/media/platform/amphion/vdec.c
+@@ -299,7 +299,8 @@ static int vdec_update_state(struct vpu_inst *inst, enum vpu_codec_state state,
+ vdec->state = VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE;
+
+ if (inst->state != pre_state)
+- vpu_trace(inst->dev, "[%d] %d -> %d\n", inst->id, pre_state, inst->state);
++ vpu_trace(inst->dev, "[%d] %s -> %s\n", inst->id,
++ vpu_codec_state_name(pre_state), vpu_codec_state_name(inst->state));
+
+ if (inst->state == VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE)
+ vdec_handle_resolution_change(inst);
+@@ -741,6 +742,21 @@ static int vdec_frame_decoded(struct vpu_inst *inst, void *arg)
+ dev_info(inst->dev, "[%d] buf[%d] has been decoded\n", inst->id, info->id);
+ vpu_set_buffer_state(vbuf, VPU_BUF_STATE_DECODED);
+ vdec->decoded_frame_count++;
++ if (vdec->params.display_delay_enable) {
++ struct vpu_format *cur_fmt;
++
++ cur_fmt = vpu_get_format(inst, inst->cap_format.type);
++ vpu_set_buffer_state(vbuf, VPU_BUF_STATE_READY);
++ for (int i = 0; i < vbuf->vb2_buf.num_planes; i++)
++ vb2_set_plane_payload(&vbuf->vb2_buf,
++ i, vpu_get_fmt_plane_size(cur_fmt, i));
++ vbuf->field = cur_fmt->field;
++ vbuf->sequence = vdec->sequence++;
++ dev_dbg(inst->dev, "[%d][OUTPUT TS]%32lld\n", inst->id, vbuf->vb2_buf.timestamp);
++
++ v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_DONE);
++ vdec->display_frame_count++;
++ }
+ exit:
+ vpu_inst_unlock(inst);
+
+@@ -768,14 +784,14 @@ static void vdec_buf_done(struct vpu_inst *inst, struct vpu_frame_info *frame)
+ struct vpu_format *cur_fmt;
+ struct vpu_vb2_buffer *vpu_buf;
+ struct vb2_v4l2_buffer *vbuf;
+- u32 sequence;
+ int i;
+
+ if (!frame)
+ return;
+
+ vpu_inst_lock(inst);
+- sequence = vdec->sequence++;
++ if (!vdec->params.display_delay_enable)
++ vdec->sequence++;
+ vpu_buf = vdec_find_buffer(inst, frame->luma);
+ vpu_inst_unlock(inst);
+ if (!vpu_buf) {
+@@ -794,13 +810,17 @@ static void vdec_buf_done(struct vpu_inst *inst, struct vpu_frame_info *frame)
+ dev_err(inst->dev, "[%d] buffer id(%d, %d) dismatch\n",
+ inst->id, vbuf->vb2_buf.index, frame->id);
+
++ if (vpu_get_buffer_state(vbuf) == VPU_BUF_STATE_READY && vdec->params.display_delay_enable)
++ return;
++
+ if (vpu_get_buffer_state(vbuf) != VPU_BUF_STATE_DECODED)
+ dev_err(inst->dev, "[%d] buffer(%d) ready without decoded\n", inst->id, frame->id);
++
+ vpu_set_buffer_state(vbuf, VPU_BUF_STATE_READY);
+ for (i = 0; i < vbuf->vb2_buf.num_planes; i++)
+ vb2_set_plane_payload(&vbuf->vb2_buf, i, vpu_get_fmt_plane_size(cur_fmt, i));
+ vbuf->field = cur_fmt->field;
+- vbuf->sequence = sequence;
++ vbuf->sequence = vdec->sequence;
+ dev_dbg(inst->dev, "[%d][OUTPUT TS]%32lld\n", inst->id, vbuf->vb2_buf.timestamp);
+
+ v4l2_m2m_buf_done(vbuf, VB2_BUF_STATE_DONE);
+@@ -999,6 +1019,7 @@ static int vdec_response_frame_abnormal(struct vpu_inst *inst)
+ {
+ struct vdec_t *vdec = inst->priv;
+ struct vpu_fs_info info;
++ int ret;
+
+ if (!vdec->req_frame_count)
+ return 0;
+@@ -1006,7 +1027,9 @@ static int vdec_response_frame_abnormal(struct vpu_inst *inst)
+ memset(&info, 0, sizeof(info));
+ info.type = MEM_RES_FRAME;
+ info.tag = vdec->seq_tag + 0xf0;
+- vpu_session_alloc_fs(inst, &info);
++ ret = vpu_session_alloc_fs(inst, &info);
++ if (ret)
++ return ret;
+ vdec->req_frame_count--;
+
+ return 0;
+@@ -1037,8 +1060,8 @@ static int vdec_response_frame(struct vpu_inst *inst, struct vb2_v4l2_buffer *vb
+ return -EINVAL;
+ }
+
+- dev_dbg(inst->dev, "[%d] state = %d, alloc fs %d, tag = 0x%x\n",
+- inst->id, inst->state, vbuf->vb2_buf.index, vdec->seq_tag);
++ dev_dbg(inst->dev, "[%d] state = %s, alloc fs %d, tag = 0x%x\n",
++ inst->id, vpu_codec_state_name(inst->state), vbuf->vb2_buf.index, vdec->seq_tag);
+ vpu_buf = to_vpu_vb2_buffer(vbuf);
+
+ memset(&info, 0, sizeof(info));
+@@ -1400,7 +1423,7 @@ static void vdec_abort(struct vpu_inst *inst)
+ struct vpu_rpc_buffer_desc desc;
+ int ret;
+
+- vpu_trace(inst->dev, "[%d] state = %d\n", inst->id, inst->state);
++ vpu_trace(inst->dev, "[%d] state = %s\n", inst->id, vpu_codec_state_name(inst->state));
+
+ vdec->aborting = true;
+ vpu_iface_add_scode(inst, SCODE_PADDING_ABORT);
+@@ -1453,9 +1476,7 @@ static void vdec_release(struct vpu_inst *inst)
+ {
+ if (inst->id != VPU_INST_NULL_ID)
+ vpu_trace(inst->dev, "[%d]\n", inst->id);
+- vpu_inst_lock(inst);
+ vdec_stop(inst, true);
+- vpu_inst_unlock(inst);
+ }
+
+ static void vdec_cleanup(struct vpu_inst *inst)
+diff --git a/drivers/media/platform/amphion/venc.c b/drivers/media/platform/amphion/venc.c
+index 58480e2755ec4..4eb57d793a9c0 100644
+--- a/drivers/media/platform/amphion/venc.c
++++ b/drivers/media/platform/amphion/venc.c
+@@ -268,7 +268,7 @@ static int venc_g_parm(struct file *file, void *fh, struct v4l2_streamparm *parm
+ {
+ struct vpu_inst *inst = to_inst(file);
+ struct venc_t *venc = inst->priv;
+- struct v4l2_fract *timeperframe = &parm->parm.capture.timeperframe;
++ struct v4l2_fract *timeperframe;
+
+ if (!parm)
+ return -EINVAL;
+@@ -279,6 +279,7 @@ static int venc_g_parm(struct file *file, void *fh, struct v4l2_streamparm *parm
+ if (!vpu_helper_check_type(inst, parm->type))
+ return -EINVAL;
+
++ timeperframe = &parm->parm.capture.timeperframe;
+ parm->parm.capture.capability = V4L2_CAP_TIMEPERFRAME;
+ parm->parm.capture.readbuffers = 0;
+ timeperframe->numerator = venc->params.frame_rate.numerator;
+@@ -291,7 +292,7 @@ static int venc_s_parm(struct file *file, void *fh, struct v4l2_streamparm *parm
+ {
+ struct vpu_inst *inst = to_inst(file);
+ struct venc_t *venc = inst->priv;
+- struct v4l2_fract *timeperframe = &parm->parm.capture.timeperframe;
++ struct v4l2_fract *timeperframe;
+ unsigned long n, d;
+
+ if (!parm)
+@@ -303,6 +304,7 @@ static int venc_s_parm(struct file *file, void *fh, struct v4l2_streamparm *parm
+ if (!vpu_helper_check_type(inst, parm->type))
+ return -EINVAL;
+
++ timeperframe = &parm->parm.capture.timeperframe;
+ if (!timeperframe->numerator)
+ timeperframe->numerator = venc->params.frame_rate.numerator;
+ if (!timeperframe->denominator)
+diff --git a/drivers/media/platform/amphion/vpu.h b/drivers/media/platform/amphion/vpu.h
+index 3bfe193722af4..5a701f64289ef 100644
+--- a/drivers/media/platform/amphion/vpu.h
++++ b/drivers/media/platform/amphion/vpu.h
+@@ -355,6 +355,9 @@ void vpu_inst_record_flow(struct vpu_inst *inst, u32 flow);
+ int vpu_core_driver_init(void);
+ void vpu_core_driver_exit(void);
+
++const char *vpu_id_name(u32 id);
++const char *vpu_codec_state_name(enum vpu_codec_state state);
++
+ extern bool debug;
+ #define vpu_trace(dev, fmt, arg...) \
+ do { \
+diff --git a/drivers/media/platform/amphion/vpu_cmds.c b/drivers/media/platform/amphion/vpu_cmds.c
+index fa581ba6bab2d..235b71398d403 100644
+--- a/drivers/media/platform/amphion/vpu_cmds.c
++++ b/drivers/media/platform/amphion/vpu_cmds.c
+@@ -98,7 +98,7 @@ static struct vpu_cmd_t *vpu_alloc_cmd(struct vpu_inst *inst, u32 id, void *data
+ cmd->id = id;
+ ret = vpu_iface_pack_cmd(inst->core, cmd->pkt, inst->id, id, data);
+ if (ret) {
+- dev_err(inst->dev, "iface pack cmd(%d) fail\n", id);
++ dev_err(inst->dev, "iface pack cmd %s fail\n", vpu_id_name(id));
+ vfree(cmd->pkt);
+ vfree(cmd);
+ return NULL;
+@@ -125,14 +125,14 @@ static int vpu_session_process_cmd(struct vpu_inst *inst, struct vpu_cmd_t *cmd)
+ {
+ int ret;
+
+- dev_dbg(inst->dev, "[%d]send cmd(0x%x)\n", inst->id, cmd->id);
++ dev_dbg(inst->dev, "[%d]send cmd %s\n", inst->id, vpu_id_name(cmd->id));
+ vpu_iface_pre_send_cmd(inst);
+ ret = vpu_cmd_send(inst->core, cmd->pkt);
+ if (!ret) {
+ vpu_iface_post_send_cmd(inst);
+ vpu_inst_record_flow(inst, cmd->id);
+ } else {
+- dev_err(inst->dev, "[%d] iface send cmd(0x%x) fail\n", inst->id, cmd->id);
++ dev_err(inst->dev, "[%d] iface send cmd %s fail\n", inst->id, vpu_id_name(cmd->id));
+ }
+
+ return ret;
+@@ -149,7 +149,8 @@ static void vpu_process_cmd_request(struct vpu_inst *inst)
+ list_for_each_entry_safe(cmd, tmp, &inst->cmd_q, list) {
+ list_del_init(&cmd->list);
+ if (vpu_session_process_cmd(inst, cmd))
+- dev_err(inst->dev, "[%d] process cmd(%d) fail\n", inst->id, cmd->id);
++ dev_err(inst->dev, "[%d] process cmd %s fail\n",
++ inst->id, vpu_id_name(cmd->id));
+ if (cmd->request) {
+ inst->pending = (void *)cmd;
+ break;
+@@ -305,7 +306,8 @@ static void vpu_core_keep_active(struct vpu_core *core)
+
+ dev_dbg(core->dev, "try to wake up\n");
+ mutex_lock(&core->cmd_lock);
+- vpu_cmd_send(core, &pkt);
++ if (vpu_cmd_send(core, &pkt))
++ dev_err(core->dev, "fail to keep active\n");
+ mutex_unlock(&core->cmd_lock);
+ }
+
+@@ -313,7 +315,7 @@ static int vpu_session_send_cmd(struct vpu_inst *inst, u32 id, void *data)
+ {
+ unsigned long key;
+ int sync = false;
+- int ret = -EINVAL;
++ int ret;
+
+ if (inst->id < 0)
+ return -EINVAL;
+@@ -339,7 +341,7 @@ static int vpu_session_send_cmd(struct vpu_inst *inst, u32 id, void *data)
+
+ exit:
+ if (ret)
+- dev_err(inst->dev, "[%d] send cmd(0x%x) fail\n", inst->id, id);
++ dev_err(inst->dev, "[%d] send cmd %s fail\n", inst->id, vpu_id_name(id));
+
+ return ret;
+ }
+diff --git a/drivers/media/platform/amphion/vpu_core.c b/drivers/media/platform/amphion/vpu_core.c
+index 82bf8b3be66a2..bfdebf2449a5c 100644
+--- a/drivers/media/platform/amphion/vpu_core.c
++++ b/drivers/media/platform/amphion/vpu_core.c
+@@ -88,6 +88,8 @@ static int vpu_core_boot_done(struct vpu_core *core)
+
+ core->supported_instance_count = min(core->supported_instance_count, count);
+ }
++ if (core->supported_instance_count >= BITS_PER_TYPE(core->instance_mask))
++ core->supported_instance_count = BITS_PER_TYPE(core->instance_mask);
+ core->fw_version = fw_version;
+ vpu_core_set_state(core, VPU_CORE_ACTIVE);
+
+diff --git a/drivers/media/platform/amphion/vpu_dbg.c b/drivers/media/platform/amphion/vpu_dbg.c
+index 44b830ae01d8c..982c2c777484c 100644
+--- a/drivers/media/platform/amphion/vpu_dbg.c
++++ b/drivers/media/platform/amphion/vpu_dbg.c
+@@ -50,6 +50,13 @@ static char *vpu_stat_name[] = {
+ [VPU_BUF_STATE_ERROR] = "error",
+ };
+
++static inline const char *to_vpu_stat_name(int state)
++{
++ if (state <= VPU_BUF_STATE_ERROR)
++ return vpu_stat_name[state];
++ return "unknown";
++}
++
+ static int vpu_dbg_instance(struct seq_file *s, void *data)
+ {
+ struct vpu_inst *inst = s->private;
+@@ -67,7 +74,7 @@ static int vpu_dbg_instance(struct seq_file *s, void *data)
+ num = scnprintf(str, sizeof(str), "tgig = %d,pid = %d\n", inst->tgid, inst->pid);
+ if (seq_write(s, str, num))
+ return 0;
+- num = scnprintf(str, sizeof(str), "state = %d\n", inst->state);
++ num = scnprintf(str, sizeof(str), "state = %s\n", vpu_codec_state_name(inst->state));
+ if (seq_write(s, str, num))
+ return 0;
+ num = scnprintf(str, sizeof(str),
+@@ -141,7 +148,7 @@ static int vpu_dbg_instance(struct seq_file *s, void *data)
+ num = scnprintf(str, sizeof(str),
+ "output [%2d] state = %10s, %8s\n",
+ i, vb2_stat_name[vb->state],
+- vpu_stat_name[vpu_get_buffer_state(vbuf)]);
++ to_vpu_stat_name(vpu_get_buffer_state(vbuf)));
+ if (seq_write(s, str, num))
+ return 0;
+ }
+@@ -156,7 +163,7 @@ static int vpu_dbg_instance(struct seq_file *s, void *data)
+ num = scnprintf(str, sizeof(str),
+ "capture[%2d] state = %10s, %8s\n",
+ i, vb2_stat_name[vb->state],
+- vpu_stat_name[vpu_get_buffer_state(vbuf)]);
++ to_vpu_stat_name(vpu_get_buffer_state(vbuf)));
+ if (seq_write(s, str, num))
+ return 0;
+ }
+@@ -188,9 +195,9 @@ static int vpu_dbg_instance(struct seq_file *s, void *data)
+
+ if (!inst->flows[idx])
+ continue;
+- num = scnprintf(str, sizeof(str), "\t[%s]0x%x\n",
++ num = scnprintf(str, sizeof(str), "\t[%s] %s\n",
+ inst->flows[idx] >= VPU_MSG_ID_NOOP ? "M" : "C",
+- inst->flows[idx]);
++ vpu_id_name(inst->flows[idx]));
+ if (seq_write(s, str, num)) {
+ mutex_unlock(&inst->core->cmd_lock);
+ return 0;
+diff --git a/drivers/media/platform/amphion/vpu_helpers.c b/drivers/media/platform/amphion/vpu_helpers.c
+index 019c77e84514c..af3b336e5dc32 100644
+--- a/drivers/media/platform/amphion/vpu_helpers.c
++++ b/drivers/media/platform/amphion/vpu_helpers.c
+@@ -11,6 +11,7 @@
+ #include <linux/module.h>
+ #include <linux/platform_device.h>
+ #include "vpu.h"
++#include "vpu_defs.h"
+ #include "vpu_core.h"
+ #include "vpu_rpc.h"
+ #include "vpu_helpers.h"
+@@ -447,3 +448,63 @@ int vpu_find_src_by_dst(struct vpu_pair *pairs, u32 cnt, u32 dst)
+
+ return -EINVAL;
+ }
++
++const char *vpu_id_name(u32 id)
++{
++ switch (id) {
++ case VPU_CMD_ID_NOOP: return "noop";
++ case VPU_CMD_ID_CONFIGURE_CODEC: return "configure codec";
++ case VPU_CMD_ID_START: return "start";
++ case VPU_CMD_ID_STOP: return "stop";
++ case VPU_CMD_ID_ABORT: return "abort";
++ case VPU_CMD_ID_RST_BUF: return "reset buf";
++ case VPU_CMD_ID_SNAPSHOT: return "snapshot";
++ case VPU_CMD_ID_FIRM_RESET: return "reset firmware";
++ case VPU_CMD_ID_UPDATE_PARAMETER: return "update parameter";
++ case VPU_CMD_ID_FRAME_ENCODE: return "encode frame";
++ case VPU_CMD_ID_SKIP: return "skip";
++ case VPU_CMD_ID_FS_ALLOC: return "alloc fb";
++ case VPU_CMD_ID_FS_RELEASE: return "release fb";
++ case VPU_CMD_ID_TIMESTAMP: return "timestamp";
++ case VPU_CMD_ID_DEBUG: return "debug";
++ case VPU_MSG_ID_RESET_DONE: return "reset done";
++ case VPU_MSG_ID_START_DONE: return "start done";
++ case VPU_MSG_ID_STOP_DONE: return "stop done";
++ case VPU_MSG_ID_ABORT_DONE: return "abort done";
++ case VPU_MSG_ID_BUF_RST: return "buf reset done";
++ case VPU_MSG_ID_MEM_REQUEST: return "mem request";
++ case VPU_MSG_ID_PARAM_UPD_DONE: return "param upd done";
++ case VPU_MSG_ID_FRAME_INPUT_DONE: return "frame input done";
++ case VPU_MSG_ID_ENC_DONE: return "encode done";
++ case VPU_MSG_ID_DEC_DONE: return "frame display";
++ case VPU_MSG_ID_FRAME_REQ: return "fb request";
++ case VPU_MSG_ID_FRAME_RELEASE: return "fb release";
++ case VPU_MSG_ID_SEQ_HDR_FOUND: return "seq hdr found";
++ case VPU_MSG_ID_RES_CHANGE: return "resolution change";
++ case VPU_MSG_ID_PIC_HDR_FOUND: return "pic hdr found";
++ case VPU_MSG_ID_PIC_DECODED: return "picture decoded";
++ case VPU_MSG_ID_PIC_EOS: return "eos";
++ case VPU_MSG_ID_FIFO_LOW: return "fifo low";
++ case VPU_MSG_ID_BS_ERROR: return "bs error";
++ case VPU_MSG_ID_UNSUPPORTED: return "unsupported";
++ case VPU_MSG_ID_FIRMWARE_XCPT: return "exception";
++ case VPU_MSG_ID_PIC_SKIPPED: return "skipped";
++ }
++ return "<unknown>";
++}
++
++const char *vpu_codec_state_name(enum vpu_codec_state state)
++{
++ switch (state) {
++ case VPU_CODEC_STATE_DEINIT: return "initialization";
++ case VPU_CODEC_STATE_CONFIGURED: return "configured";
++ case VPU_CODEC_STATE_START: return "start";
++ case VPU_CODEC_STATE_STARTED: return "started";
++ case VPU_CODEC_STATE_ACTIVE: return "active";
++ case VPU_CODEC_STATE_SEEK: return "seek";
++ case VPU_CODEC_STATE_STOP: return "stop";
++ case VPU_CODEC_STATE_DRAIN: return "drain";
++ case VPU_CODEC_STATE_DYAMIC_RESOLUTION_CHANGE: return "resolution change";
++ }
++ return "<unknown>";
++}
+diff --git a/drivers/media/platform/amphion/vpu_mbox.c b/drivers/media/platform/amphion/vpu_mbox.c
+index bf759eb2fd46d..b6d5b4844f672 100644
+--- a/drivers/media/platform/amphion/vpu_mbox.c
++++ b/drivers/media/platform/amphion/vpu_mbox.c
+@@ -46,11 +46,10 @@ static int vpu_mbox_request_channel(struct device *dev, struct vpu_mbox *mbox)
+ cl->rx_callback = vpu_mbox_rx_callback;
+
+ ch = mbox_request_channel_byname(cl, mbox->name);
+- if (IS_ERR(ch)) {
+- dev_err(dev, "Failed to request mbox chan %s, ret : %ld\n",
+- mbox->name, PTR_ERR(ch));
+- return PTR_ERR(ch);
+- }
++ if (IS_ERR(ch))
++ return dev_err_probe(dev, PTR_ERR(ch),
++ "Failed to request mbox chan %s\n",
++ mbox->name);
+
+ mbox->ch = ch;
+ return 0;
+diff --git a/drivers/media/platform/amphion/vpu_msgs.c b/drivers/media/platform/amphion/vpu_msgs.c
+index 92672a802b492..d0ead051f7d18 100644
+--- a/drivers/media/platform/amphion/vpu_msgs.c
++++ b/drivers/media/platform/amphion/vpu_msgs.c
+@@ -32,7 +32,7 @@ static void vpu_session_handle_start_done(struct vpu_inst *inst, struct vpu_rpc_
+
+ static void vpu_session_handle_mem_request(struct vpu_inst *inst, struct vpu_rpc_event *pkt)
+ {
+- struct vpu_pkt_mem_req_data req_data;
++ struct vpu_pkt_mem_req_data req_data = { 0 };
+
+ vpu_iface_unpack_msg_data(inst->core, pkt, (void *)&req_data);
+ vpu_trace(inst->dev, "[%d] %d:%d %d:%d %d:%d\n",
+@@ -80,7 +80,7 @@ static void vpu_session_handle_resolution_change(struct vpu_inst *inst, struct v
+
+ static void vpu_session_handle_enc_frame_done(struct vpu_inst *inst, struct vpu_rpc_event *pkt)
+ {
+- struct vpu_enc_pic_info info;
++ struct vpu_enc_pic_info info = { 0 };
+
+ vpu_iface_unpack_msg_data(inst->core, pkt, (void *)&info);
+ dev_dbg(inst->dev, "[%d] frame id = %d, wptr = 0x%x, size = %d\n",
+@@ -90,7 +90,7 @@ static void vpu_session_handle_enc_frame_done(struct vpu_inst *inst, struct vpu_
+
+ static void vpu_session_handle_frame_request(struct vpu_inst *inst, struct vpu_rpc_event *pkt)
+ {
+- struct vpu_fs_info fs;
++ struct vpu_fs_info fs = { 0 };
+
+ vpu_iface_unpack_msg_data(inst->core, pkt, &fs);
+ call_void_vop(inst, event_notify, VPU_MSG_ID_FRAME_REQ, &fs);
+@@ -107,7 +107,7 @@ static void vpu_session_handle_frame_release(struct vpu_inst *inst, struct vpu_r
+ info.type = inst->out_format.type;
+ call_void_vop(inst, buf_done, &info);
+ } else if (inst->core->type == VPU_CORE_TYPE_DEC) {
+- struct vpu_fs_info fs;
++ struct vpu_fs_info fs = { 0 };
+
+ vpu_iface_unpack_msg_data(inst->core, pkt, &fs);
+ call_void_vop(inst, event_notify, VPU_MSG_ID_FRAME_RELEASE, &fs);
+@@ -122,7 +122,7 @@ static void vpu_session_handle_input_done(struct vpu_inst *inst, struct vpu_rpc_
+
+ static void vpu_session_handle_pic_decoded(struct vpu_inst *inst, struct vpu_rpc_event *pkt)
+ {
+- struct vpu_dec_pic_info info;
++ struct vpu_dec_pic_info info = { 0 };
+
+ vpu_iface_unpack_msg_data(inst->core, pkt, (void *)&info);
+ call_void_vop(inst, get_one_frame, &info);
+@@ -130,7 +130,7 @@ static void vpu_session_handle_pic_decoded(struct vpu_inst *inst, struct vpu_rpc
+
+ static void vpu_session_handle_pic_done(struct vpu_inst *inst, struct vpu_rpc_event *pkt)
+ {
+- struct vpu_dec_pic_info info;
++ struct vpu_dec_pic_info info = { 0 };
+ struct vpu_frame_info frame;
+
+ memset(&frame, 0, sizeof(frame));
+@@ -210,7 +210,7 @@ static int vpu_session_handle_msg(struct vpu_inst *inst, struct vpu_rpc_event *m
+ return -EINVAL;
+
+ msg_id = ret;
+- dev_dbg(inst->dev, "[%d] receive event(0x%x)\n", inst->id, msg_id);
++ dev_dbg(inst->dev, "[%d] receive event(%s)\n", inst->id, vpu_id_name(msg_id));
+
+ for (i = 0; i < ARRAY_SIZE(handlers); i++) {
+ if (handlers[i].id == msg_id) {
+diff --git a/drivers/media/platform/amphion/vpu_v4l2.c b/drivers/media/platform/amphion/vpu_v4l2.c
+index 810e93d2c954a..8c9028df3bf42 100644
+--- a/drivers/media/platform/amphion/vpu_v4l2.c
++++ b/drivers/media/platform/amphion/vpu_v4l2.c
+@@ -489,6 +489,11 @@ static int vpu_vb2_queue_setup(struct vb2_queue *vq,
+ for (i = 0; i < cur_fmt->mem_planes; i++)
+ psize[i] = vpu_get_fmt_plane_size(cur_fmt, i);
+
++ if (V4L2_TYPE_IS_OUTPUT(vq->type) && inst->state == VPU_CODEC_STATE_SEEK) {
++ vpu_trace(inst->dev, "reinit when VIDIOC_REQBUFS(OUTPUT, 0)\n");
++ call_void_vop(inst, release);
++ }
++
+ return 0;
+ }
+
+@@ -773,9 +778,9 @@ int vpu_v4l2_close(struct file *file)
+ v4l2_m2m_ctx_release(inst->fh.m2m_ctx);
+ inst->fh.m2m_ctx = NULL;
+ }
++ call_void_vop(inst, release);
+ vpu_inst_unlock(inst);
+
+- call_void_vop(inst, release);
+ vpu_inst_unregister(inst);
+ vpu_inst_put(inst);
+
+diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+index 60425c99a2b8b..7194f88edc0fb 100644
+--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
++++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+@@ -1403,6 +1403,7 @@ static void mtk_jpeg_remove(struct platform_device *pdev)
+ {
+ struct mtk_jpeg_dev *jpeg = platform_get_drvdata(pdev);
+
++ cancel_delayed_work_sync(&jpeg->job_timeout_work);
+ pm_runtime_disable(&pdev->dev);
+ video_unregister_device(jpeg->vdev);
+ v4l2_m2m_release(jpeg->m2m_dev);
+diff --git a/drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_if.c b/drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_if.c
+index 70b8383f7c8ec..a27a109d8d144 100644
+--- a/drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_if.c
++++ b/drivers/media/platform/mediatek/vcodec/vdec/vdec_vp9_if.c
+@@ -226,10 +226,11 @@ static struct vdec_fb *vp9_rm_from_fb_use_list(struct vdec_vp9_inst
+ if (fb->base_y.va == addr) {
+ list_move_tail(&node->list,
+ &inst->available_fb_node_list);
+- break;
++ return fb;
+ }
+ }
+- return fb;
++
++ return NULL;
+ }
+
+ static void vp9_add_to_fb_free_list(struct vdec_vp9_inst *inst,
+diff --git a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
+index 03f8d7cd8eddc..a81212c0ade9d 100644
+--- a/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
++++ b/drivers/media/platform/mediatek/vcodec/vdec_msg_queue.c
+@@ -246,6 +246,7 @@ void vdec_msg_queue_deinit(struct vdec_msg_queue *msg_queue,
+ mtk_vcodec_mem_free(ctx, mem);
+
+ kfree(lat_buf->private_data);
++ lat_buf->private_data = NULL;
+ }
+ }
+
+@@ -312,6 +313,7 @@ int vdec_msg_queue_init(struct vdec_msg_queue *msg_queue,
+ err = mtk_vcodec_mem_alloc(ctx, &msg_queue->wdma_addr);
+ if (err) {
+ mtk_v4l2_err("failed to allocate wdma_addr buf");
++ msg_queue->wdma_addr.size = 0;
+ return -ENOMEM;
+ }
+ msg_queue->wdma_rptr_addr = msg_queue->wdma_addr.dma_addr;
+diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
+index ed15ea348f97b..a2b4fb9e29e7d 100644
+--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
++++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg-hw.h
+@@ -58,7 +58,6 @@
+ #define CAST_OFBSIZE_LO CAST_STATUS18
+ #define CAST_OFBSIZE_HI CAST_STATUS19
+
+-#define MXC_MAX_SLOTS 1 /* TODO use all 4 slots*/
+ /* JPEG-Decoder Wrapper Slot Registers 0..3 */
+ #define SLOT_BASE 0x10000
+ #define SLOT_STATUS 0x0
+diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
+index c0e49be42450a..9512c0a619667 100644
+--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
++++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c
+@@ -745,87 +745,77 @@ static void notify_src_chg(struct mxc_jpeg_ctx *ctx)
+ v4l2_event_queue_fh(&ctx->fh, &ev);
+ }
+
+-static int mxc_get_free_slot(struct mxc_jpeg_slot_data slot_data[], int n)
++static int mxc_get_free_slot(struct mxc_jpeg_slot_data *slot_data)
+ {
+- int free_slot = 0;
+-
+- while (slot_data[free_slot].used && free_slot < n)
+- free_slot++;
+-
+- return free_slot; /* >=n when there are no more free slots */
++ if (!slot_data->used)
++ return slot_data->slot;
++ return -1;
+ }
+
+-static bool mxc_jpeg_alloc_slot_data(struct mxc_jpeg_dev *jpeg,
+- unsigned int slot)
++static bool mxc_jpeg_alloc_slot_data(struct mxc_jpeg_dev *jpeg)
+ {
+ struct mxc_jpeg_desc *desc;
+ struct mxc_jpeg_desc *cfg_desc;
+ void *cfg_stm;
+
+- if (jpeg->slot_data[slot].desc)
++ if (jpeg->slot_data.desc)
+ goto skip_alloc; /* already allocated, reuse it */
+
+ /* allocate descriptor for decoding/encoding phase */
+ desc = dma_alloc_coherent(jpeg->dev,
+ sizeof(struct mxc_jpeg_desc),
+- &jpeg->slot_data[slot].desc_handle,
++ &jpeg->slot_data.desc_handle,
+ GFP_ATOMIC);
+ if (!desc)
+ goto err;
+- jpeg->slot_data[slot].desc = desc;
++ jpeg->slot_data.desc = desc;
+
+ /* allocate descriptor for configuration phase (encoder only) */
+ cfg_desc = dma_alloc_coherent(jpeg->dev,
+ sizeof(struct mxc_jpeg_desc),
+- &jpeg->slot_data[slot].cfg_desc_handle,
++ &jpeg->slot_data.cfg_desc_handle,
+ GFP_ATOMIC);
+ if (!cfg_desc)
+ goto err;
+- jpeg->slot_data[slot].cfg_desc = cfg_desc;
++ jpeg->slot_data.cfg_desc = cfg_desc;
+
+ /* allocate configuration stream */
+ cfg_stm = dma_alloc_coherent(jpeg->dev,
+ MXC_JPEG_MAX_CFG_STREAM,
+- &jpeg->slot_data[slot].cfg_stream_handle,
++ &jpeg->slot_data.cfg_stream_handle,
+ GFP_ATOMIC);
+ if (!cfg_stm)
+ goto err;
+- jpeg->slot_data[slot].cfg_stream_vaddr = cfg_stm;
++ jpeg->slot_data.cfg_stream_vaddr = cfg_stm;
+
+ skip_alloc:
+- jpeg->slot_data[slot].used = true;
++ jpeg->slot_data.used = true;
+
+ return true;
+ err:
+- dev_err(jpeg->dev, "Could not allocate descriptors for slot %d", slot);
++ dev_err(jpeg->dev, "Could not allocate descriptors for slot %d", jpeg->slot_data.slot);
+
+ return false;
+ }
+
+-static void mxc_jpeg_free_slot_data(struct mxc_jpeg_dev *jpeg,
+- unsigned int slot)
++static void mxc_jpeg_free_slot_data(struct mxc_jpeg_dev *jpeg)
+ {
+- if (slot >= MXC_MAX_SLOTS) {
+- dev_err(jpeg->dev, "Invalid slot %d, nothing to free.", slot);
+- return;
+- }
+-
+ /* free descriptor for decoding/encoding phase */
+ dma_free_coherent(jpeg->dev, sizeof(struct mxc_jpeg_desc),
+- jpeg->slot_data[slot].desc,
+- jpeg->slot_data[slot].desc_handle);
++ jpeg->slot_data.desc,
++ jpeg->slot_data.desc_handle);
+
+ /* free descriptor for encoder configuration phase / decoder DHT */
+ dma_free_coherent(jpeg->dev, sizeof(struct mxc_jpeg_desc),
+- jpeg->slot_data[slot].cfg_desc,
+- jpeg->slot_data[slot].cfg_desc_handle);
++ jpeg->slot_data.cfg_desc,
++ jpeg->slot_data.cfg_desc_handle);
+
+ /* free configuration stream */
+ dma_free_coherent(jpeg->dev, MXC_JPEG_MAX_CFG_STREAM,
+- jpeg->slot_data[slot].cfg_stream_vaddr,
+- jpeg->slot_data[slot].cfg_stream_handle);
++ jpeg->slot_data.cfg_stream_vaddr,
++ jpeg->slot_data.cfg_stream_handle);
+
+- jpeg->slot_data[slot].used = false;
++ jpeg->slot_data.used = false;
+ }
+
+ static void mxc_jpeg_check_and_set_last_buffer(struct mxc_jpeg_ctx *ctx,
+@@ -855,7 +845,7 @@ static void mxc_jpeg_job_finish(struct mxc_jpeg_ctx *ctx, enum vb2_buffer_state
+ v4l2_m2m_buf_done(dst_buf, state);
+
+ mxc_jpeg_disable_irq(reg, ctx->slot);
+- ctx->mxc_jpeg->slot_data[ctx->slot].used = false;
++ jpeg->slot_data.used = false;
+ if (reset)
+ mxc_jpeg_sw_reset(reg);
+ }
+@@ -919,7 +909,7 @@ static irqreturn_t mxc_jpeg_dec_irq(int irq, void *priv)
+ goto job_unlock;
+ }
+
+- if (!jpeg->slot_data[slot].used)
++ if (!jpeg->slot_data.used)
+ goto job_unlock;
+
+ dec_ret = readl(reg + MXC_SLOT_OFFSET(slot, SLOT_STATUS));
+@@ -1179,13 +1169,13 @@ static void mxc_jpeg_config_dec_desc(struct vb2_buffer *out_buf,
+ struct mxc_jpeg_dev *jpeg = ctx->mxc_jpeg;
+ void __iomem *reg = jpeg->base_reg;
+ unsigned int slot = ctx->slot;
+- struct mxc_jpeg_desc *desc = jpeg->slot_data[slot].desc;
+- struct mxc_jpeg_desc *cfg_desc = jpeg->slot_data[slot].cfg_desc;
+- dma_addr_t desc_handle = jpeg->slot_data[slot].desc_handle;
+- dma_addr_t cfg_desc_handle = jpeg->slot_data[slot].cfg_desc_handle;
+- dma_addr_t cfg_stream_handle = jpeg->slot_data[slot].cfg_stream_handle;
+- unsigned int *cfg_size = &jpeg->slot_data[slot].cfg_stream_size;
+- void *cfg_stream_vaddr = jpeg->slot_data[slot].cfg_stream_vaddr;
++ struct mxc_jpeg_desc *desc = jpeg->slot_data.desc;
++ struct mxc_jpeg_desc *cfg_desc = jpeg->slot_data.cfg_desc;
++ dma_addr_t desc_handle = jpeg->slot_data.desc_handle;
++ dma_addr_t cfg_desc_handle = jpeg->slot_data.cfg_desc_handle;
++ dma_addr_t cfg_stream_handle = jpeg->slot_data.cfg_stream_handle;
++ unsigned int *cfg_size = &jpeg->slot_data.cfg_stream_size;
++ void *cfg_stream_vaddr = jpeg->slot_data.cfg_stream_vaddr;
+ struct mxc_jpeg_src_buf *jpeg_src_buf;
+
+ jpeg_src_buf = vb2_to_mxc_buf(src_buf);
+@@ -1245,18 +1235,18 @@ static void mxc_jpeg_config_enc_desc(struct vb2_buffer *out_buf,
+ struct mxc_jpeg_dev *jpeg = ctx->mxc_jpeg;
+ void __iomem *reg = jpeg->base_reg;
+ unsigned int slot = ctx->slot;
+- struct mxc_jpeg_desc *desc = jpeg->slot_data[slot].desc;
+- struct mxc_jpeg_desc *cfg_desc = jpeg->slot_data[slot].cfg_desc;
+- dma_addr_t desc_handle = jpeg->slot_data[slot].desc_handle;
+- dma_addr_t cfg_desc_handle = jpeg->slot_data[slot].cfg_desc_handle;
+- void *cfg_stream_vaddr = jpeg->slot_data[slot].cfg_stream_vaddr;
++ struct mxc_jpeg_desc *desc = jpeg->slot_data.desc;
++ struct mxc_jpeg_desc *cfg_desc = jpeg->slot_data.cfg_desc;
++ dma_addr_t desc_handle = jpeg->slot_data.desc_handle;
++ dma_addr_t cfg_desc_handle = jpeg->slot_data.cfg_desc_handle;
++ void *cfg_stream_vaddr = jpeg->slot_data.cfg_stream_vaddr;
+ struct mxc_jpeg_q_data *q_data;
+ enum mxc_jpeg_image_format img_fmt;
+ int w, h;
+
+ q_data = mxc_jpeg_get_q_data(ctx, src_buf->vb2_queue->type);
+
+- jpeg->slot_data[slot].cfg_stream_size =
++ jpeg->slot_data.cfg_stream_size =
+ mxc_jpeg_setup_cfg_stream(cfg_stream_vaddr,
+ q_data->fmt->fourcc,
+ q_data->crop.width,
+@@ -1265,7 +1255,7 @@ static void mxc_jpeg_config_enc_desc(struct vb2_buffer *out_buf,
+ /* chain the config descriptor with the encoding descriptor */
+ cfg_desc->next_descpt_ptr = desc_handle | MXC_NXT_DESCPT_EN;
+
+- cfg_desc->buf_base0 = jpeg->slot_data[slot].cfg_stream_handle;
++ cfg_desc->buf_base0 = jpeg->slot_data.cfg_stream_handle;
+ cfg_desc->buf_base1 = 0;
+ cfg_desc->line_pitch = 0;
+ cfg_desc->stm_bufbase = 0; /* no output expected */
+@@ -1408,7 +1398,7 @@ static void mxc_jpeg_device_run_timeout(struct work_struct *work)
+ unsigned long flags;
+
+ spin_lock_irqsave(&ctx->mxc_jpeg->hw_lock, flags);
+- if (ctx->slot < MXC_MAX_SLOTS && ctx->mxc_jpeg->slot_data[ctx->slot].used) {
++ if (ctx->mxc_jpeg->slot_data.used) {
+ dev_warn(jpeg->dev, "%s timeout, cancel it\n",
+ ctx->mxc_jpeg->mode == MXC_JPEG_DECODE ? "decode" : "encode");
+ mxc_jpeg_job_finish(ctx, VB2_BUF_STATE_ERROR, true);
+@@ -1476,12 +1466,12 @@ static void mxc_jpeg_device_run(void *priv)
+ mxc_jpeg_enable(reg);
+ mxc_jpeg_set_l_endian(reg, 1);
+
+- ctx->slot = mxc_get_free_slot(jpeg->slot_data, MXC_MAX_SLOTS);
+- if (ctx->slot >= MXC_MAX_SLOTS) {
++ ctx->slot = mxc_get_free_slot(&jpeg->slot_data);
++ if (ctx->slot < 0) {
+ dev_err(dev, "No more free slots\n");
+ goto end;
+ }
+- if (!mxc_jpeg_alloc_slot_data(jpeg, ctx->slot)) {
++ if (!mxc_jpeg_alloc_slot_data(jpeg)) {
+ dev_err(dev, "Cannot allocate slot data\n");
+ goto end;
+ }
+@@ -2101,7 +2091,7 @@ static int mxc_jpeg_open(struct file *file)
+ }
+ ctx->fh.ctrl_handler = &ctx->ctrl_handler;
+ mxc_jpeg_set_default_params(ctx);
+- ctx->slot = MXC_MAX_SLOTS; /* slot not allocated yet */
++ ctx->slot = -1; /* slot not allocated yet */
+ INIT_DELAYED_WORK(&ctx->task_timer, mxc_jpeg_device_run_timeout);
+
+ if (mxc_jpeg->mode == MXC_JPEG_DECODE)
+@@ -2677,6 +2667,11 @@ static int mxc_jpeg_attach_pm_domains(struct mxc_jpeg_dev *jpeg)
+ dev_err(dev, "No power domains defined for jpeg node\n");
+ return jpeg->num_domains;
+ }
++ if (jpeg->num_domains == 1) {
++ /* genpd_dev_pm_attach() attach automatically if power domains count is 1 */
++ jpeg->num_domains = 0;
++ return 0;
++ }
+
+ jpeg->pd_dev = devm_kmalloc_array(dev, jpeg->num_domains,
+ sizeof(*jpeg->pd_dev), GFP_KERNEL);
+@@ -2718,7 +2713,6 @@ static int mxc_jpeg_probe(struct platform_device *pdev)
+ int ret;
+ int mode;
+ const struct of_device_id *of_id;
+- unsigned int slot;
+
+ of_id = of_match_node(mxc_jpeg_match, dev->of_node);
+ if (!of_id)
+@@ -2742,19 +2736,22 @@ static int mxc_jpeg_probe(struct platform_device *pdev)
+ if (IS_ERR(jpeg->base_reg))
+ return PTR_ERR(jpeg->base_reg);
+
+- for (slot = 0; slot < MXC_MAX_SLOTS; slot++) {
+- dec_irq = platform_get_irq(pdev, slot);
+- if (dec_irq < 0) {
+- ret = dec_irq;
+- goto err_irq;
+- }
+- ret = devm_request_irq(&pdev->dev, dec_irq, mxc_jpeg_dec_irq,
+- 0, pdev->name, jpeg);
+- if (ret) {
+- dev_err(&pdev->dev, "Failed to request irq %d (%d)\n",
+- dec_irq, ret);
+- goto err_irq;
+- }
++ ret = of_property_read_u32_index(pdev->dev.of_node, "slot", 0, &jpeg->slot_data.slot);
++ if (ret)
++ jpeg->slot_data.slot = 0;
++ dev_info(&pdev->dev, "choose slot %d\n", jpeg->slot_data.slot);
++ dec_irq = platform_get_irq(pdev, 0);
++ if (dec_irq < 0) {
++ dev_err(&pdev->dev, "Failed to get irq %d\n", dec_irq);
++ ret = dec_irq;
++ goto err_irq;
++ }
++ ret = devm_request_irq(&pdev->dev, dec_irq, mxc_jpeg_dec_irq,
++ 0, pdev->name, jpeg);
++ if (ret) {
++ dev_err(&pdev->dev, "Failed to request irq %d (%d)\n",
++ dec_irq, ret);
++ goto err_irq;
+ }
+
+ jpeg->pdev = pdev;
+@@ -2914,11 +2911,9 @@ static const struct dev_pm_ops mxc_jpeg_pm_ops = {
+
+ static void mxc_jpeg_remove(struct platform_device *pdev)
+ {
+- unsigned int slot;
+ struct mxc_jpeg_dev *jpeg = platform_get_drvdata(pdev);
+
+- for (slot = 0; slot < MXC_MAX_SLOTS; slot++)
+- mxc_jpeg_free_slot_data(jpeg, slot);
++ mxc_jpeg_free_slot_data(jpeg);
+
+ pm_runtime_disable(&pdev->dev);
+ video_unregister_device(jpeg->dec_vdev);
+diff --git a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
+index 87157db780826..d80e94cc9d992 100644
+--- a/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
++++ b/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.h
+@@ -97,7 +97,7 @@ struct mxc_jpeg_ctx {
+ struct mxc_jpeg_q_data cap_q;
+ struct v4l2_fh fh;
+ enum mxc_jpeg_enc_state enc_state;
+- unsigned int slot;
++ int slot;
+ unsigned int source_change;
+ bool header_parsed;
+ struct v4l2_ctrl_handler ctrl_handler;
+@@ -106,6 +106,7 @@ struct mxc_jpeg_ctx {
+ };
+
+ struct mxc_jpeg_slot_data {
++ int slot;
+ bool used;
+ struct mxc_jpeg_desc *desc; // enc/dec descriptor
+ struct mxc_jpeg_desc *cfg_desc; // configuration descriptor
+@@ -128,7 +129,7 @@ struct mxc_jpeg_dev {
+ struct v4l2_device v4l2_dev;
+ struct v4l2_m2m_dev *m2m_dev;
+ struct video_device *dec_vdev;
+- struct mxc_jpeg_slot_data slot_data[MXC_MAX_SLOTS];
++ struct mxc_jpeg_slot_data slot_data;
+ int num_domains;
+ struct device **pd_dev;
+ struct device_link **pd_link;
+diff --git a/drivers/media/platform/nxp/imx8-isi/imx8-isi-crossbar.c b/drivers/media/platform/nxp/imx8-isi/imx8-isi-crossbar.c
+index b5ffde46f31b6..641c802adcc33 100644
+--- a/drivers/media/platform/nxp/imx8-isi/imx8-isi-crossbar.c
++++ b/drivers/media/platform/nxp/imx8-isi/imx8-isi-crossbar.c
+@@ -483,7 +483,7 @@ int mxc_isi_crossbar_init(struct mxc_isi_dev *isi)
+
+ xbar->inputs = kcalloc(xbar->num_sinks, sizeof(*xbar->inputs),
+ GFP_KERNEL);
+- if (!xbar->pads) {
++ if (!xbar->inputs) {
+ ret = -ENOMEM;
+ goto err_free;
+ }
+diff --git a/drivers/media/platform/qcom/venus/hfi_venus.c b/drivers/media/platform/qcom/venus/hfi_venus.c
+index 2ad40b3945b0b..8fc8f46dc3908 100644
+--- a/drivers/media/platform/qcom/venus/hfi_venus.c
++++ b/drivers/media/platform/qcom/venus/hfi_venus.c
+@@ -131,7 +131,6 @@ struct venus_hfi_device {
+
+ static bool venus_pkt_debug;
+ int venus_fw_debug = HFI_DEBUG_MSG_ERROR | HFI_DEBUG_MSG_FATAL;
+-static bool venus_sys_idle_indicator;
+ static bool venus_fw_low_power_mode = true;
+ static int venus_hw_rsp_timeout = 1000;
+ static bool venus_fw_coverage;
+@@ -454,7 +453,6 @@ static int venus_boot_core(struct venus_hfi_device *hdev)
+ void __iomem *wrapper_base = hdev->core->wrapper_base;
+ int ret = 0;
+
+- writel(BIT(VIDC_CTRL_INIT_CTRL_SHIFT), cpu_cs_base + VIDC_CTRL_INIT);
+ if (IS_V6(hdev->core)) {
+ mask_val = readl(wrapper_base + WRAPPER_INTR_MASK);
+ mask_val &= ~(WRAPPER_INTR_MASK_A2HWD_BASK_V6 |
+@@ -465,6 +463,7 @@ static int venus_boot_core(struct venus_hfi_device *hdev)
+ writel(mask_val, wrapper_base + WRAPPER_INTR_MASK);
+ writel(1, cpu_cs_base + CPU_CS_SCIACMDARG3);
+
++ writel(BIT(VIDC_CTRL_INIT_CTRL_SHIFT), cpu_cs_base + VIDC_CTRL_INIT);
+ while (!ctrl_status && count < max_tries) {
+ ctrl_status = readl(cpu_cs_base + CPU_CS_SCIACMDARG0);
+ if ((ctrl_status & CPU_CS_SCIACMDARG0_ERROR_STATUS_MASK) == 4) {
+@@ -947,17 +946,12 @@ static int venus_sys_set_default_properties(struct venus_hfi_device *hdev)
+ if (ret)
+ dev_warn(dev, "setting fw debug msg ON failed (%d)\n", ret);
+
+- /*
+- * Idle indicator is disabled by default on some 4xx firmware versions,
+- * enable it explicitly in order to make suspend functional by checking
+- * WFI (wait-for-interrupt) bit.
+- */
+- if (IS_V4(hdev->core) || IS_V6(hdev->core))
+- venus_sys_idle_indicator = true;
+-
+- ret = venus_sys_set_idle_message(hdev, venus_sys_idle_indicator);
+- if (ret)
+- dev_warn(dev, "setting idle response ON failed (%d)\n", ret);
++ /* HFI_PROPERTY_SYS_IDLE_INDICATOR is not supported beyond 8916 (HFI V1) */
++ if (IS_V1(hdev->core)) {
++ ret = venus_sys_set_idle_message(hdev, false);
++ if (ret)
++ dev_warn(dev, "setting idle response ON failed (%d)\n", ret);
++ }
+
+ ret = venus_sys_set_power_control(hdev, venus_fw_low_power_mode);
+ if (ret)
+diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
+index 61cfaaf4e927b..e56d58fe28022 100644
+--- a/drivers/media/platform/verisilicon/hantro_v4l2.c
++++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
+@@ -276,6 +276,7 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
+ enum v4l2_buf_type type)
+ {
+ const struct hantro_fmt *fmt;
++ const struct hantro_fmt *vpu_fmt;
+ bool capture = V4L2_TYPE_IS_CAPTURE(type);
+ bool coded;
+
+@@ -295,19 +296,23 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
+
+ if (coded) {
+ pix_mp->num_planes = 1;
+- } else if (!ctx->is_encoder) {
++ vpu_fmt = fmt;
++ } else if (ctx->is_encoder) {
++ vpu_fmt = hantro_find_format(ctx, ctx->dst_fmt.pixelformat);
++ } else {
+ /*
+ * Width/height on the CAPTURE end of a decoder are ignored and
+ * replaced by the OUTPUT ones.
+ */
+ pix_mp->width = ctx->src_fmt.width;
+ pix_mp->height = ctx->src_fmt.height;
++ vpu_fmt = fmt;
+ }
+
+ pix_mp->field = V4L2_FIELD_NONE;
+
+ v4l2_apply_frmsize_constraints(&pix_mp->width, &pix_mp->height,
+- &fmt->frmsize);
++ &vpu_fmt->frmsize);
+
+ if (!coded) {
+ /* Fill remaining fields */
+diff --git a/drivers/media/tuners/fc0011.c b/drivers/media/tuners/fc0011.c
+index eaa3bbc903d7e..3d3b54be29557 100644
+--- a/drivers/media/tuners/fc0011.c
++++ b/drivers/media/tuners/fc0011.c
+@@ -499,7 +499,7 @@ struct dvb_frontend *fc0011_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(fc0011_attach);
++EXPORT_SYMBOL_GPL(fc0011_attach);
+
+ MODULE_DESCRIPTION("Fitipower FC0011 silicon tuner driver");
+ MODULE_AUTHOR("Michael Buesch <m@bues.ch>");
+diff --git a/drivers/media/tuners/fc0012.c b/drivers/media/tuners/fc0012.c
+index 4429d5e8c5796..81e65acbdb170 100644
+--- a/drivers/media/tuners/fc0012.c
++++ b/drivers/media/tuners/fc0012.c
+@@ -495,7 +495,7 @@ err:
+
+ return fe;
+ }
+-EXPORT_SYMBOL(fc0012_attach);
++EXPORT_SYMBOL_GPL(fc0012_attach);
+
+ MODULE_DESCRIPTION("Fitipower FC0012 silicon tuner driver");
+ MODULE_AUTHOR("Hans-Frieder Vogt <hfvogt@gmx.net>");
+diff --git a/drivers/media/tuners/fc0013.c b/drivers/media/tuners/fc0013.c
+index 29dd9b55ff333..1006a2798eefc 100644
+--- a/drivers/media/tuners/fc0013.c
++++ b/drivers/media/tuners/fc0013.c
+@@ -608,7 +608,7 @@ struct dvb_frontend *fc0013_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(fc0013_attach);
++EXPORT_SYMBOL_GPL(fc0013_attach);
+
+ MODULE_DESCRIPTION("Fitipower FC0013 silicon tuner driver");
+ MODULE_AUTHOR("Hans-Frieder Vogt <hfvogt@gmx.net>");
+diff --git a/drivers/media/tuners/max2165.c b/drivers/media/tuners/max2165.c
+index 1c746bed51fee..1575ab94e1c8b 100644
+--- a/drivers/media/tuners/max2165.c
++++ b/drivers/media/tuners/max2165.c
+@@ -410,7 +410,7 @@ struct dvb_frontend *max2165_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(max2165_attach);
++EXPORT_SYMBOL_GPL(max2165_attach);
+
+ MODULE_AUTHOR("David T. L. Wong <davidtlwong@gmail.com>");
+ MODULE_DESCRIPTION("Maxim MAX2165 silicon tuner driver");
+diff --git a/drivers/media/tuners/mc44s803.c b/drivers/media/tuners/mc44s803.c
+index 0c9161516abdf..ed8bdf7ebd99d 100644
+--- a/drivers/media/tuners/mc44s803.c
++++ b/drivers/media/tuners/mc44s803.c
+@@ -356,7 +356,7 @@ error:
+ kfree(priv);
+ return NULL;
+ }
+-EXPORT_SYMBOL(mc44s803_attach);
++EXPORT_SYMBOL_GPL(mc44s803_attach);
+
+ MODULE_AUTHOR("Jochen Friedrich");
+ MODULE_DESCRIPTION("Freescale MC44S803 silicon tuner driver");
+diff --git a/drivers/media/tuners/mt2060.c b/drivers/media/tuners/mt2060.c
+index e5d86874adb34..1a1635238b6ca 100644
+--- a/drivers/media/tuners/mt2060.c
++++ b/drivers/media/tuners/mt2060.c
+@@ -440,7 +440,7 @@ struct dvb_frontend * mt2060_attach(struct dvb_frontend *fe, struct i2c_adapter
+
+ return fe;
+ }
+-EXPORT_SYMBOL(mt2060_attach);
++EXPORT_SYMBOL_GPL(mt2060_attach);
+
+ static int mt2060_probe(struct i2c_client *client)
+ {
+diff --git a/drivers/media/tuners/mt2131.c b/drivers/media/tuners/mt2131.c
+index 37f50ff6c0bd2..eebc060883414 100644
+--- a/drivers/media/tuners/mt2131.c
++++ b/drivers/media/tuners/mt2131.c
+@@ -274,7 +274,7 @@ struct dvb_frontend * mt2131_attach(struct dvb_frontend *fe,
+ fe->tuner_priv = priv;
+ return fe;
+ }
+-EXPORT_SYMBOL(mt2131_attach);
++EXPORT_SYMBOL_GPL(mt2131_attach);
+
+ MODULE_AUTHOR("Steven Toth");
+ MODULE_DESCRIPTION("Microtune MT2131 silicon tuner driver");
+diff --git a/drivers/media/tuners/mt2266.c b/drivers/media/tuners/mt2266.c
+index 6136f20fa9b7f..2e92885a6bcb9 100644
+--- a/drivers/media/tuners/mt2266.c
++++ b/drivers/media/tuners/mt2266.c
+@@ -336,7 +336,7 @@ struct dvb_frontend * mt2266_attach(struct dvb_frontend *fe, struct i2c_adapter
+ mt2266_calibrate(priv);
+ return fe;
+ }
+-EXPORT_SYMBOL(mt2266_attach);
++EXPORT_SYMBOL_GPL(mt2266_attach);
+
+ MODULE_AUTHOR("Olivier DANET");
+ MODULE_DESCRIPTION("Microtune MT2266 silicon tuner driver");
+diff --git a/drivers/media/tuners/mxl5005s.c b/drivers/media/tuners/mxl5005s.c
+index 06dfab9fb8cbc..d9bfa257a0054 100644
+--- a/drivers/media/tuners/mxl5005s.c
++++ b/drivers/media/tuners/mxl5005s.c
+@@ -4120,7 +4120,7 @@ struct dvb_frontend *mxl5005s_attach(struct dvb_frontend *fe,
+ fe->tuner_priv = state;
+ return fe;
+ }
+-EXPORT_SYMBOL(mxl5005s_attach);
++EXPORT_SYMBOL_GPL(mxl5005s_attach);
+
+ MODULE_DESCRIPTION("MaxLinear MXL5005S silicon tuner driver");
+ MODULE_AUTHOR("Steven Toth");
+diff --git a/drivers/media/tuners/qt1010.c b/drivers/media/tuners/qt1010.c
+index 3853a3d43d4f2..60931367b82ca 100644
+--- a/drivers/media/tuners/qt1010.c
++++ b/drivers/media/tuners/qt1010.c
+@@ -440,7 +440,7 @@ struct dvb_frontend * qt1010_attach(struct dvb_frontend *fe,
+ fe->tuner_priv = priv;
+ return fe;
+ }
+-EXPORT_SYMBOL(qt1010_attach);
++EXPORT_SYMBOL_GPL(qt1010_attach);
+
+ MODULE_DESCRIPTION("Quantek QT1010 silicon tuner driver");
+ MODULE_AUTHOR("Antti Palosaari <crope@iki.fi>");
+diff --git a/drivers/media/tuners/tda18218.c b/drivers/media/tuners/tda18218.c
+index 4ed94646116fa..7d8d84dcb2459 100644
+--- a/drivers/media/tuners/tda18218.c
++++ b/drivers/media/tuners/tda18218.c
+@@ -336,7 +336,7 @@ struct dvb_frontend *tda18218_attach(struct dvb_frontend *fe,
+
+ return fe;
+ }
+-EXPORT_SYMBOL(tda18218_attach);
++EXPORT_SYMBOL_GPL(tda18218_attach);
+
+ MODULE_DESCRIPTION("NXP TDA18218HN silicon tuner driver");
+ MODULE_AUTHOR("Antti Palosaari <crope@iki.fi>");
+diff --git a/drivers/media/tuners/xc2028.c b/drivers/media/tuners/xc2028.c
+index 69c2e1b99bf17..5a967edceca93 100644
+--- a/drivers/media/tuners/xc2028.c
++++ b/drivers/media/tuners/xc2028.c
+@@ -1512,7 +1512,7 @@ fail:
+ return NULL;
+ }
+
+-EXPORT_SYMBOL(xc2028_attach);
++EXPORT_SYMBOL_GPL(xc2028_attach);
+
+ MODULE_DESCRIPTION("Xceive xc2028/xc3028 tuner driver");
+ MODULE_AUTHOR("Michel Ludwig <michel.ludwig@gmail.com>");
+diff --git a/drivers/media/tuners/xc4000.c b/drivers/media/tuners/xc4000.c
+index d59b4ab774302..57ded9ff3f043 100644
+--- a/drivers/media/tuners/xc4000.c
++++ b/drivers/media/tuners/xc4000.c
+@@ -1742,7 +1742,7 @@ fail2:
+ xc4000_release(fe);
+ return NULL;
+ }
+-EXPORT_SYMBOL(xc4000_attach);
++EXPORT_SYMBOL_GPL(xc4000_attach);
+
+ MODULE_AUTHOR("Steven Toth, Davide Ferri");
+ MODULE_DESCRIPTION("Xceive xc4000 silicon tuner driver");
+diff --git a/drivers/media/tuners/xc5000.c b/drivers/media/tuners/xc5000.c
+index 7b7d9fe4f9453..2182e5b7b6064 100644
+--- a/drivers/media/tuners/xc5000.c
++++ b/drivers/media/tuners/xc5000.c
+@@ -1460,7 +1460,7 @@ fail:
+ xc5000_release(fe);
+ return NULL;
+ }
+-EXPORT_SYMBOL(xc5000_attach);
++EXPORT_SYMBOL_GPL(xc5000_attach);
+
+ MODULE_AUTHOR("Steven Toth");
+ MODULE_DESCRIPTION("Xceive xc5000 silicon tuner driver");
+diff --git a/drivers/media/usb/dvb-usb/m920x.c b/drivers/media/usb/dvb-usb/m920x.c
+index fea5bcf72a31a..c88a202daf5fc 100644
+--- a/drivers/media/usb/dvb-usb/m920x.c
++++ b/drivers/media/usb/dvb-usb/m920x.c
+@@ -277,7 +277,6 @@ static int m920x_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msg[], int nu
+ char *read = kmalloc(1, GFP_KERNEL);
+ if (!read) {
+ ret = -ENOMEM;
+- kfree(read);
+ goto unlock;
+ }
+
+@@ -288,8 +287,10 @@ static int m920x_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msg[], int nu
+
+ if ((ret = m920x_read(d->udev, M9206_I2C, 0x0,
+ 0x20 | stop,
+- read, 1)) != 0)
++ read, 1)) != 0) {
++ kfree(read);
+ goto unlock;
++ }
+ msg[i].buf[j] = read[0];
+ }
+
+diff --git a/drivers/media/usb/go7007/go7007-i2c.c b/drivers/media/usb/go7007/go7007-i2c.c
+index 38339dd2f83f7..2880370e45c8b 100644
+--- a/drivers/media/usb/go7007/go7007-i2c.c
++++ b/drivers/media/usb/go7007/go7007-i2c.c
+@@ -165,8 +165,6 @@ static int go7007_i2c_master_xfer(struct i2c_adapter *adapter,
+ } else if (msgs[i].len == 3) {
+ if (msgs[i].flags & I2C_M_RD)
+ return -EIO;
+- if (msgs[i].len != 3)
+- return -EIO;
+ if (go7007_i2c_xfer(go, msgs[i].addr, 0,
+ (msgs[i].buf[0] << 8) | msgs[i].buf[1],
+ 0x01, &msgs[i].buf[2]) < 0)
+diff --git a/drivers/media/usb/siano/smsusb.c b/drivers/media/usb/siano/smsusb.c
+index 640737d3b8aeb..8a39cac76c585 100644
+--- a/drivers/media/usb/siano/smsusb.c
++++ b/drivers/media/usb/siano/smsusb.c
+@@ -455,12 +455,7 @@ static int smsusb_init_device(struct usb_interface *intf, int board_id)
+ rc = smscore_register_device(&params, &dev->coredev, 0, mdev);
+ if (rc < 0) {
+ pr_err("smscore_register_device(...) failed, rc %d\n", rc);
+- smsusb_term_device(intf);
+-#ifdef CONFIG_MEDIA_CONTROLLER_DVB
+- media_device_unregister(mdev);
+-#endif
+- kfree(mdev);
+- return rc;
++ goto err_unregister_device;
+ }
+
+ smscore_set_board_id(dev->coredev, board_id);
+@@ -477,8 +472,7 @@ static int smsusb_init_device(struct usb_interface *intf, int board_id)
+ rc = smsusb_start_streaming(dev);
+ if (rc < 0) {
+ pr_err("smsusb_start_streaming(...) failed\n");
+- smsusb_term_device(intf);
+- return rc;
++ goto err_unregister_device;
+ }
+
+ dev->state = SMSUSB_ACTIVE;
+@@ -486,13 +480,20 @@ static int smsusb_init_device(struct usb_interface *intf, int board_id)
+ rc = smscore_start_device(dev->coredev);
+ if (rc < 0) {
+ pr_err("smscore_start_device(...) failed\n");
+- smsusb_term_device(intf);
+- return rc;
++ goto err_unregister_device;
+ }
+
+ pr_debug("device 0x%p created\n", dev);
+
+ return rc;
++
++err_unregister_device:
++ smsusb_term_device(intf);
++#ifdef CONFIG_MEDIA_CONTROLLER_DVB
++ media_device_unregister(mdev);
++#endif
++ kfree(mdev);
++ return rc;
+ }
+
+ static int smsusb_probe(struct usb_interface *intf,
+diff --git a/drivers/media/v4l2-core/v4l2-fwnode.c b/drivers/media/v4l2-core/v4l2-fwnode.c
+index 049c2f2001eaa..4fa9225aa3d93 100644
+--- a/drivers/media/v4l2-core/v4l2-fwnode.c
++++ b/drivers/media/v4l2-core/v4l2-fwnode.c
+@@ -568,19 +568,29 @@ int v4l2_fwnode_parse_link(struct fwnode_handle *fwnode,
+ link->local_id = fwep.id;
+ link->local_port = fwep.port;
+ link->local_node = fwnode_graph_get_port_parent(fwnode);
++ if (!link->local_node)
++ return -ENOLINK;
+
+ fwnode = fwnode_graph_get_remote_endpoint(fwnode);
+- if (!fwnode) {
+- fwnode_handle_put(fwnode);
+- return -ENOLINK;
+- }
++ if (!fwnode)
++ goto err_put_local_node;
+
+ fwnode_graph_parse_endpoint(fwnode, &fwep);
+ link->remote_id = fwep.id;
+ link->remote_port = fwep.port;
+ link->remote_node = fwnode_graph_get_port_parent(fwnode);
++ if (!link->remote_node)
++ goto err_put_remote_endpoint;
+
+ return 0;
++
++err_put_remote_endpoint:
++ fwnode_handle_put(fwnode);
++
++err_put_local_node:
++ fwnode_handle_put(link->local_node);
++
++ return -ENOLINK;
+ }
+ EXPORT_SYMBOL_GPL(v4l2_fwnode_parse_link);
+
+diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
+index 9666d28037e18..5a134fa8a174c 100644
+--- a/drivers/misc/fastrpc.c
++++ b/drivers/misc/fastrpc.c
+@@ -1322,13 +1322,18 @@ static int fastrpc_init_create_static_process(struct fastrpc_user *fl,
+ return 0;
+ err_invoke:
+ if (fl->cctx->vmcount) {
+- struct qcom_scm_vmperm perm;
++ u64 src_perms = 0;
++ struct qcom_scm_vmperm dst_perms;
++ u32 i;
+
+- perm.vmid = QCOM_SCM_VMID_HLOS;
+- perm.perm = QCOM_SCM_PERM_RWX;
++ for (i = 0; i < fl->cctx->vmcount; i++)
++ src_perms |= BIT(fl->cctx->vmperms[i].vmid);
++
++ dst_perms.vmid = QCOM_SCM_VMID_HLOS;
++ dst_perms.perm = QCOM_SCM_PERM_RWX;
+ err = qcom_scm_assign_mem(fl->cctx->remote_heap->phys,
+ (u64)fl->cctx->remote_heap->size,
+- &fl->cctx->perms, &perm, 1);
++ &src_perms, &dst_perms, 1);
+ if (err)
+ dev_err(fl->sctx->dev, "Failed to assign memory phys 0x%llx size 0x%llx err %d",
+ fl->cctx->remote_heap->phys, fl->cctx->remote_heap->size, err);
+diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
+index 345934e4f59e6..2d5ef9c37d769 100644
+--- a/drivers/mmc/host/renesas_sdhi_core.c
++++ b/drivers/mmc/host/renesas_sdhi_core.c
+@@ -1006,6 +1006,8 @@ int renesas_sdhi_probe(struct platform_device *pdev,
+ host->sdcard_irq_setbit_mask = TMIO_STAT_ALWAYS_SET_27;
+ host->sdcard_irq_mask_all = TMIO_MASK_ALL_RCAR2;
+ host->reset = renesas_sdhi_reset;
++ } else {
++ host->sdcard_irq_mask_all = TMIO_MASK_ALL;
+ }
+
+ /* Orginally registers were 16 bit apart, could be 32 or 64 nowadays */
+@@ -1100,9 +1102,7 @@ int renesas_sdhi_probe(struct platform_device *pdev,
+ host->ops.hs400_complete = renesas_sdhi_hs400_complete;
+ }
+
+- ret = tmio_mmc_host_probe(host);
+- if (ret < 0)
+- goto edisclk;
++ sd_ctrl_write32_as_16_and_16(host, CTL_IRQ_MASK, host->sdcard_irq_mask_all);
+
+ num_irqs = platform_irq_count(pdev);
+ if (num_irqs < 0) {
+@@ -1129,6 +1129,10 @@ int renesas_sdhi_probe(struct platform_device *pdev,
+ goto eirq;
+ }
+
++ ret = tmio_mmc_host_probe(host);
++ if (ret < 0)
++ goto edisclk;
++
+ dev_info(&pdev->dev, "%s base at %pa, max clock rate %u MHz\n",
+ mmc_hostname(host->mmc), &res->start, host->mmc->f_max / 1000000);
+
+diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+index 2e9c2e2d9c9f7..d8418d7fcc372 100644
+--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
++++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+@@ -2612,6 +2612,8 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
+ struct nand_chip *chip = &host->chip;
+ const struct nand_ecc_props *requirements =
+ nanddev_get_ecc_requirements(&chip->base);
++ struct nand_memory_organization *memorg =
++ nanddev_get_memorg(&chip->base);
+ struct brcmnand_controller *ctrl = host->ctrl;
+ struct brcmnand_cfg *cfg = &host->hwcfg;
+ char msg[128];
+@@ -2633,10 +2635,11 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
+ if (cfg->spare_area_size > ctrl->max_oob)
+ cfg->spare_area_size = ctrl->max_oob;
+ /*
+- * Set oobsize to be consistent with controller's spare_area_size, as
+- * the rest is inaccessible.
++ * Set mtd and memorg oobsize to be consistent with controller's
++ * spare_area_size, as the rest is inaccessible.
+ */
+ mtd->oobsize = cfg->spare_area_size * (mtd->writesize >> FC_SHIFT);
++ memorg->oobsize = mtd->oobsize;
+
+ cfg->device_size = mtd->size;
+ cfg->block_size = mtd->erasesize;
+diff --git a/drivers/mtd/nand/raw/fsmc_nand.c b/drivers/mtd/nand/raw/fsmc_nand.c
+index 7b4742420dfcb..2e33ae77502a0 100644
+--- a/drivers/mtd/nand/raw/fsmc_nand.c
++++ b/drivers/mtd/nand/raw/fsmc_nand.c
+@@ -1200,9 +1200,14 @@ static int fsmc_nand_suspend(struct device *dev)
+ static int fsmc_nand_resume(struct device *dev)
+ {
+ struct fsmc_nand_data *host = dev_get_drvdata(dev);
++ int ret;
+
+ if (host) {
+- clk_prepare_enable(host->clk);
++ ret = clk_prepare_enable(host->clk);
++ if (ret) {
++ dev_err(dev, "failed to enable clk\n");
++ return ret;
++ }
+ if (host->dev_timings)
+ fsmc_nand_setup(host, host->dev_timings);
+ nand_reset(&host->nand, 0);
+diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
+index 5f29fac8669a3..55f4a902b8be9 100644
+--- a/drivers/mtd/spi-nor/core.c
++++ b/drivers/mtd/spi-nor/core.c
+@@ -870,21 +870,22 @@ static int spi_nor_write_16bit_sr_and_check(struct spi_nor *nor, u8 sr1)
+ ret = spi_nor_read_cr(nor, &sr_cr[1]);
+ if (ret)
+ return ret;
+- } else if (nor->params->quad_enable) {
++ } else if (spi_nor_get_protocol_width(nor->read_proto) == 4 &&
++ spi_nor_get_protocol_width(nor->write_proto) == 4 &&
++ nor->params->quad_enable) {
+ /*
+ * If the Status Register 2 Read command (35h) is not
+ * supported, we should at least be sure we don't
+ * change the value of the SR2 Quad Enable bit.
+ *
+- * We can safely assume that when the Quad Enable method is
+- * set, the value of the QE bit is one, as a consequence of the
+- * nor->params->quad_enable() call.
++ * When the Quad Enable method is set and the buswidth is 4, we
++ * can safely assume that the value of the QE bit is one, as a
++ * consequence of the nor->params->quad_enable() call.
+ *
+- * We can safely assume that the Quad Enable bit is present in
+- * the Status Register 2 at BIT(1). According to the JESD216
+- * revB standard, BFPT DWORDS[15], bits 22:20, the 16-bit
+- * Write Status (01h) command is available just for the cases
+- * in which the QE bit is described in SR2 at BIT(1).
++ * According to the JESD216 revB standard, BFPT DWORDS[15],
++ * bits 22:20, the 16-bit Write Status (01h) command is
++ * available just for the cases in which the QE bit is
++ * described in SR2 at BIT(1).
+ */
+ sr_cr[1] = SR2_QUAD_EN_BIT1;
+ } else {
+diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c
+index 1bad1866ae462..a48220f91a2df 100644
+--- a/drivers/net/arcnet/arcnet.c
++++ b/drivers/net/arcnet/arcnet.c
+@@ -468,7 +468,7 @@ static void arcnet_reply_tasklet(struct tasklet_struct *t)
+
+ ret = sock_queue_err_skb(sk, ackskb);
+ if (ret)
+- kfree_skb(ackskb);
++ dev_kfree_skb_irq(ackskb);
+
+ local_irq_enable();
+ };
+diff --git a/drivers/net/can/m_can/tcan4x5x-regmap.c b/drivers/net/can/m_can/tcan4x5x-regmap.c
+index 2b218ce04e9f2..fafa6daa67e69 100644
+--- a/drivers/net/can/m_can/tcan4x5x-regmap.c
++++ b/drivers/net/can/m_can/tcan4x5x-regmap.c
+@@ -95,7 +95,6 @@ static const struct regmap_range tcan4x5x_reg_table_wr_range[] = {
+ regmap_reg_range(0x000c, 0x0010),
+ /* Device configuration registers and Interrupt Flags*/
+ regmap_reg_range(0x0800, 0x080c),
+- regmap_reg_range(0x0814, 0x0814),
+ regmap_reg_range(0x0820, 0x0820),
+ regmap_reg_range(0x0830, 0x0830),
+ /* M_CAN */
+diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
+index bd9eb066ecf15..129ef60a577c8 100644
+--- a/drivers/net/can/usb/gs_usb.c
++++ b/drivers/net/can/usb/gs_usb.c
+@@ -633,6 +633,9 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
+ }
+
+ if (hf->flags & GS_CAN_FLAG_OVERFLOW) {
++ stats->rx_over_errors++;
++ stats->rx_errors++;
++
+ skb = alloc_can_err_skb(netdev, &cf);
+ if (!skb)
+ goto resubmit_urb;
+@@ -640,8 +643,6 @@ static void gs_usb_receive_bulk_callback(struct urb *urb)
+ cf->can_id |= CAN_ERR_CRTL;
+ cf->len = CAN_ERR_DLC;
+ cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
+- stats->rx_over_errors++;
+- stats->rx_errors++;
+ netif_rx(skb);
+ }
+
+diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
+index a0ba2605bb620..f87ed14fa2ab2 100644
+--- a/drivers/net/dsa/microchip/ksz_common.c
++++ b/drivers/net/dsa/microchip/ksz_common.c
+@@ -635,10 +635,9 @@ static const struct regmap_range ksz9477_valid_regs[] = {
+ regmap_reg_range(0x1030, 0x1030),
+ regmap_reg_range(0x1100, 0x1115),
+ regmap_reg_range(0x111a, 0x111f),
+- regmap_reg_range(0x1122, 0x1127),
+- regmap_reg_range(0x112a, 0x112b),
+- regmap_reg_range(0x1136, 0x1139),
+- regmap_reg_range(0x113e, 0x113f),
++ regmap_reg_range(0x1120, 0x112b),
++ regmap_reg_range(0x1134, 0x113b),
++ regmap_reg_range(0x113c, 0x113f),
+ regmap_reg_range(0x1400, 0x1401),
+ regmap_reg_range(0x1403, 0x1403),
+ regmap_reg_range(0x1410, 0x1417),
+@@ -669,10 +668,9 @@ static const struct regmap_range ksz9477_valid_regs[] = {
+ regmap_reg_range(0x2030, 0x2030),
+ regmap_reg_range(0x2100, 0x2115),
+ regmap_reg_range(0x211a, 0x211f),
+- regmap_reg_range(0x2122, 0x2127),
+- regmap_reg_range(0x212a, 0x212b),
+- regmap_reg_range(0x2136, 0x2139),
+- regmap_reg_range(0x213e, 0x213f),
++ regmap_reg_range(0x2120, 0x212b),
++ regmap_reg_range(0x2134, 0x213b),
++ regmap_reg_range(0x213c, 0x213f),
+ regmap_reg_range(0x2400, 0x2401),
+ regmap_reg_range(0x2403, 0x2403),
+ regmap_reg_range(0x2410, 0x2417),
+@@ -703,10 +701,9 @@ static const struct regmap_range ksz9477_valid_regs[] = {
+ regmap_reg_range(0x3030, 0x3030),
+ regmap_reg_range(0x3100, 0x3115),
+ regmap_reg_range(0x311a, 0x311f),
+- regmap_reg_range(0x3122, 0x3127),
+- regmap_reg_range(0x312a, 0x312b),
+- regmap_reg_range(0x3136, 0x3139),
+- regmap_reg_range(0x313e, 0x313f),
++ regmap_reg_range(0x3120, 0x312b),
++ regmap_reg_range(0x3134, 0x313b),
++ regmap_reg_range(0x313c, 0x313f),
+ regmap_reg_range(0x3400, 0x3401),
+ regmap_reg_range(0x3403, 0x3403),
+ regmap_reg_range(0x3410, 0x3417),
+@@ -737,10 +734,9 @@ static const struct regmap_range ksz9477_valid_regs[] = {
+ regmap_reg_range(0x4030, 0x4030),
+ regmap_reg_range(0x4100, 0x4115),
+ regmap_reg_range(0x411a, 0x411f),
+- regmap_reg_range(0x4122, 0x4127),
+- regmap_reg_range(0x412a, 0x412b),
+- regmap_reg_range(0x4136, 0x4139),
+- regmap_reg_range(0x413e, 0x413f),
++ regmap_reg_range(0x4120, 0x412b),
++ regmap_reg_range(0x4134, 0x413b),
++ regmap_reg_range(0x413c, 0x413f),
+ regmap_reg_range(0x4400, 0x4401),
+ regmap_reg_range(0x4403, 0x4403),
+ regmap_reg_range(0x4410, 0x4417),
+@@ -771,10 +767,9 @@ static const struct regmap_range ksz9477_valid_regs[] = {
+ regmap_reg_range(0x5030, 0x5030),
+ regmap_reg_range(0x5100, 0x5115),
+ regmap_reg_range(0x511a, 0x511f),
+- regmap_reg_range(0x5122, 0x5127),
+- regmap_reg_range(0x512a, 0x512b),
+- regmap_reg_range(0x5136, 0x5139),
+- regmap_reg_range(0x513e, 0x513f),
++ regmap_reg_range(0x5120, 0x512b),
++ regmap_reg_range(0x5134, 0x513b),
++ regmap_reg_range(0x513c, 0x513f),
+ regmap_reg_range(0x5400, 0x5401),
+ regmap_reg_range(0x5403, 0x5403),
+ regmap_reg_range(0x5410, 0x5417),
+diff --git a/drivers/net/ethernet/amd/pds_core/core.c b/drivers/net/ethernet/amd/pds_core/core.c
+index 483a070d96fa9..d06934edc265e 100644
+--- a/drivers/net/ethernet/amd/pds_core/core.c
++++ b/drivers/net/ethernet/amd/pds_core/core.c
+@@ -464,7 +464,8 @@ void pdsc_teardown(struct pdsc *pdsc, bool removing)
+ {
+ int i;
+
+- pdsc_devcmd_reset(pdsc);
++ if (!pdsc->pdev->is_virtfn)
++ pdsc_devcmd_reset(pdsc);
+ pdsc_qcq_free(pdsc, &pdsc->notifyqcq);
+ pdsc_qcq_free(pdsc, &pdsc->adminqcq);
+
+@@ -524,7 +525,8 @@ static void pdsc_fw_down(struct pdsc *pdsc)
+ }
+
+ /* Notify clients of fw_down */
+- devlink_health_report(pdsc->fw_reporter, "FW down reported", pdsc);
++ if (pdsc->fw_reporter)
++ devlink_health_report(pdsc->fw_reporter, "FW down reported", pdsc);
+ pdsc_notify(PDS_EVENT_RESET, &reset_event);
+
+ pdsc_stop(pdsc);
+@@ -554,8 +556,9 @@ static void pdsc_fw_up(struct pdsc *pdsc)
+
+ /* Notify clients of fw_up */
+ pdsc->fw_recoveries++;
+- devlink_health_reporter_state_update(pdsc->fw_reporter,
+- DEVLINK_HEALTH_REPORTER_STATE_HEALTHY);
++ if (pdsc->fw_reporter)
++ devlink_health_reporter_state_update(pdsc->fw_reporter,
++ DEVLINK_HEALTH_REPORTER_STATE_HEALTHY);
+ pdsc_notify(PDS_EVENT_RESET, &reset_event);
+
+ return;
+diff --git a/drivers/net/ethernet/amd/pds_core/dev.c b/drivers/net/ethernet/amd/pds_core/dev.c
+index debe5216fe29e..f77cd9f5a2fda 100644
+--- a/drivers/net/ethernet/amd/pds_core/dev.c
++++ b/drivers/net/ethernet/amd/pds_core/dev.c
+@@ -121,7 +121,7 @@ static const char *pdsc_devcmd_str(int opcode)
+ }
+ }
+
+-static int pdsc_devcmd_wait(struct pdsc *pdsc, int max_seconds)
++static int pdsc_devcmd_wait(struct pdsc *pdsc, u8 opcode, int max_seconds)
+ {
+ struct device *dev = pdsc->dev;
+ unsigned long start_time;
+@@ -131,9 +131,6 @@ static int pdsc_devcmd_wait(struct pdsc *pdsc, int max_seconds)
+ int done = 0;
+ int err = 0;
+ int status;
+- int opcode;
+-
+- opcode = ioread8(&pdsc->cmd_regs->cmd.opcode);
+
+ start_time = jiffies;
+ max_wait = start_time + (max_seconds * HZ);
+@@ -180,10 +177,10 @@ int pdsc_devcmd_locked(struct pdsc *pdsc, union pds_core_dev_cmd *cmd,
+
+ memcpy_toio(&pdsc->cmd_regs->cmd, cmd, sizeof(*cmd));
+ pdsc_devcmd_dbell(pdsc);
+- err = pdsc_devcmd_wait(pdsc, max_seconds);
++ err = pdsc_devcmd_wait(pdsc, cmd->opcode, max_seconds);
+ memcpy_fromio(comp, &pdsc->cmd_regs->comp, sizeof(*comp));
+
+- if (err == -ENXIO || err == -ETIMEDOUT)
++ if ((err == -ENXIO || err == -ETIMEDOUT) && pdsc->wq)
+ queue_work(pdsc->wq, &pdsc->health_work);
+
+ return err;
+diff --git a/drivers/net/ethernet/amd/pds_core/devlink.c b/drivers/net/ethernet/amd/pds_core/devlink.c
+index 9c6b3653c1c7c..d9607033bbf21 100644
+--- a/drivers/net/ethernet/amd/pds_core/devlink.c
++++ b/drivers/net/ethernet/amd/pds_core/devlink.c
+@@ -10,6 +10,9 @@ pdsc_viftype *pdsc_dl_find_viftype_by_id(struct pdsc *pdsc,
+ {
+ int vt;
+
++ if (!pdsc->viftype_status)
++ return NULL;
++
+ for (vt = 0; vt < PDS_DEV_TYPE_MAX; vt++) {
+ if (pdsc->viftype_status[vt].dl_id == dl_id)
+ return &pdsc->viftype_status[vt];
+diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+index 4a288799633f8..940c5d1ff9cfc 100644
+--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
++++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+@@ -2094,8 +2094,11 @@ static int atl1c_tso_csum(struct atl1c_adapter *adapter,
+ real_len = (((unsigned char *)ip_hdr(skb) - skb->data)
+ + ntohs(ip_hdr(skb)->tot_len));
+
+- if (real_len < skb->len)
+- pskb_trim(skb, real_len);
++ if (real_len < skb->len) {
++ err = pskb_trim(skb, real_len);
++ if (err)
++ return err;
++ }
+
+ hdr_len = skb_tcp_all_headers(skb);
+ if (unlikely(skb->len == hdr_len)) {
+diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
+index 392ec09a1d8a6..3e4fb3c3e8342 100644
+--- a/drivers/net/ethernet/broadcom/b44.c
++++ b/drivers/net/ethernet/broadcom/b44.c
+@@ -1793,11 +1793,9 @@ static int b44_nway_reset(struct net_device *dev)
+ b44_readphy(bp, MII_BMCR, &bmcr);
+ b44_readphy(bp, MII_BMCR, &bmcr);
+ r = -EINVAL;
+- if (bmcr & BMCR_ANENABLE) {
+- b44_writephy(bp, MII_BMCR,
+- bmcr | BMCR_ANRESTART);
+- r = 0;
+- }
++ if (bmcr & BMCR_ANENABLE)
++ r = b44_writephy(bp, MII_BMCR,
++ bmcr | BMCR_ANRESTART);
+ spin_unlock_irq(&bp->lock);
+
+ return r;
+diff --git a/drivers/net/ethernet/hisilicon/hns3/Makefile b/drivers/net/ethernet/hisilicon/hns3/Makefile
+index 6efea46628587..e214bfaece1f3 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/Makefile
++++ b/drivers/net/ethernet/hisilicon/hns3/Makefile
+@@ -17,11 +17,11 @@ hns3-$(CONFIG_HNS3_DCB) += hns3_dcbnl.o
+
+ obj-$(CONFIG_HNS3_HCLGEVF) += hclgevf.o
+
+-hclgevf-objs = hns3vf/hclgevf_main.o hns3vf/hclgevf_mbx.o hns3vf/hclgevf_devlink.o \
++hclgevf-objs = hns3vf/hclgevf_main.o hns3vf/hclgevf_mbx.o hns3vf/hclgevf_devlink.o hns3vf/hclgevf_regs.o \
+ hns3_common/hclge_comm_cmd.o hns3_common/hclge_comm_rss.o hns3_common/hclge_comm_tqp_stats.o
+
+ obj-$(CONFIG_HNS3_HCLGE) += hclge.o
+-hclge-objs = hns3pf/hclge_main.o hns3pf/hclge_mdio.o hns3pf/hclge_tm.o \
++hclge-objs = hns3pf/hclge_main.o hns3pf/hclge_mdio.o hns3pf/hclge_tm.o hns3pf/hclge_regs.o \
+ hns3pf/hclge_mbx.o hns3pf/hclge_err.o hns3pf/hclge_debugfs.o hns3pf/hclge_ptp.o hns3pf/hclge_devlink.o \
+ hns3_common/hclge_comm_cmd.o hns3_common/hclge_comm_rss.o hns3_common/hclge_comm_tqp_stats.o
+
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+index 06f29e80104c0..e9c108128bb3b 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+@@ -102,6 +102,7 @@ enum HNAE3_DEV_CAP_BITS {
+ HNAE3_DEV_SUPPORT_FEC_STATS_B,
+ HNAE3_DEV_SUPPORT_LANE_NUM_B,
+ HNAE3_DEV_SUPPORT_WOL_B,
++ HNAE3_DEV_SUPPORT_TM_FLUSH_B,
+ };
+
+ #define hnae3_ae_dev_fd_supported(ae_dev) \
+@@ -173,6 +174,9 @@ enum HNAE3_DEV_CAP_BITS {
+ #define hnae3_ae_dev_wol_supported(ae_dev) \
+ test_bit(HNAE3_DEV_SUPPORT_WOL_B, (ae_dev)->caps)
+
++#define hnae3_ae_dev_tm_flush_supported(hdev) \
++ test_bit(HNAE3_DEV_SUPPORT_TM_FLUSH_B, (hdev)->ae_dev->caps)
++
+ enum HNAE3_PF_CAP_BITS {
+ HNAE3_PF_SUPPORT_VLAN_FLTR_MDF_B = 0,
+ };
+@@ -378,6 +382,7 @@ struct hnae3_dev_specs {
+ u16 umv_size;
+ u16 mc_mac_size;
+ u32 mac_stats_num;
++ u8 tnl_num;
+ };
+
+ struct hnae3_client_ops {
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
+index 16ba98ff2c9b1..dcecb23daac6e 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
+@@ -156,6 +156,7 @@ static const struct hclge_comm_caps_bit_map hclge_pf_cmd_caps[] = {
+ {HCLGE_COMM_CAP_FEC_STATS_B, HNAE3_DEV_SUPPORT_FEC_STATS_B},
+ {HCLGE_COMM_CAP_LANE_NUM_B, HNAE3_DEV_SUPPORT_LANE_NUM_B},
+ {HCLGE_COMM_CAP_WOL_B, HNAE3_DEV_SUPPORT_WOL_B},
++ {HCLGE_COMM_CAP_TM_FLUSH_B, HNAE3_DEV_SUPPORT_TM_FLUSH_B},
+ };
+
+ static const struct hclge_comm_caps_bit_map hclge_vf_cmd_caps[] = {
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.h
+index 18f1b4bf362da..2b7197ce0ae8f 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.h
+@@ -153,6 +153,7 @@ enum hclge_opcode_type {
+ HCLGE_OPC_TM_INTERNAL_STS = 0x0850,
+ HCLGE_OPC_TM_INTERNAL_CNT = 0x0851,
+ HCLGE_OPC_TM_INTERNAL_STS_1 = 0x0852,
++ HCLGE_OPC_TM_FLUSH = 0x0872,
+
+ /* Packet buffer allocate commands */
+ HCLGE_OPC_TX_BUFF_ALLOC = 0x0901,
+@@ -349,6 +350,7 @@ enum HCLGE_COMM_CAP_BITS {
+ HCLGE_COMM_CAP_FEC_STATS_B = 25,
+ HCLGE_COMM_CAP_LANE_NUM_B = 27,
+ HCLGE_COMM_CAP_WOL_B = 28,
++ HCLGE_COMM_CAP_TM_FLUSH_B = 31,
+ };
+
+ enum HCLGE_COMM_API_CAP_BITS {
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+index 207b2e3f3fc2b..dce158d4aeef6 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+@@ -411,6 +411,9 @@ static struct hns3_dbg_cap_info hns3_dbg_cap[] = {
+ }, {
+ .name = "support wake on lan",
+ .cap_bit = HNAE3_DEV_SUPPORT_WOL_B,
++ }, {
++ .name = "support tm flush",
++ .cap_bit = HNAE3_DEV_SUPPORT_TM_FLUSH_B,
+ }
+ };
+
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+index 91c173f40701a..d5cfdc4c082d8 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+@@ -826,7 +826,9 @@ struct hclge_dev_specs_1_cmd {
+ u8 rsv0[2];
+ __le16 umv_size;
+ __le16 mc_mac_size;
+- u8 rsv1[12];
++ u8 rsv1[6];
++ u8 tnl_num;
++ u8 rsv2[5];
+ };
+
+ /* mac speed type defined in firmware command */
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
+index 09362823140d5..fad5a5ff3cda5 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_dcb.c
+@@ -227,6 +227,10 @@ static int hclge_notify_down_uinit(struct hclge_dev *hdev)
+ if (ret)
+ return ret;
+
++ ret = hclge_tm_flush_cfg(hdev, true);
++ if (ret)
++ return ret;
++
+ return hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
+ }
+
+@@ -238,6 +242,10 @@ static int hclge_notify_init_up(struct hclge_dev *hdev)
+ if (ret)
+ return ret;
+
++ ret = hclge_tm_flush_cfg(hdev, false);
++ if (ret)
++ return ret;
++
+ return hclge_notify_client(hdev, HNAE3_UP_CLIENT);
+ }
+
+@@ -324,6 +332,7 @@ static int hclge_ieee_setpfc(struct hnae3_handle *h, struct ieee_pfc *pfc)
+ struct net_device *netdev = h->kinfo.netdev;
+ struct hclge_dev *hdev = vport->back;
+ u8 i, j, pfc_map, *prio_tc;
++ int last_bad_ret = 0;
+ int ret;
+
+ if (!(hdev->dcbx_cap & DCB_CAP_DCBX_VER_IEEE))
+@@ -361,13 +370,28 @@ static int hclge_ieee_setpfc(struct hnae3_handle *h, struct ieee_pfc *pfc)
+ if (ret)
+ return ret;
+
+- ret = hclge_buffer_alloc(hdev);
+- if (ret) {
+- hclge_notify_client(hdev, HNAE3_UP_CLIENT);
++ ret = hclge_tm_flush_cfg(hdev, true);
++ if (ret)
+ return ret;
+- }
+
+- return hclge_notify_client(hdev, HNAE3_UP_CLIENT);
++ /* No matter whether the following operations are performed
++ * successfully or not, disabling the tm flush and notify
++ * the network status to up are necessary.
++ * Do not return immediately.
++ */
++ ret = hclge_buffer_alloc(hdev);
++ if (ret)
++ last_bad_ret = ret;
++
++ ret = hclge_tm_flush_cfg(hdev, false);
++ if (ret)
++ last_bad_ret = ret;
++
++ ret = hclge_notify_client(hdev, HNAE3_UP_CLIENT);
++ if (ret)
++ last_bad_ret = ret;
++
++ return last_bad_ret;
+ }
+
+ static int hclge_ieee_setapp(struct hnae3_handle *h, struct dcb_app *app)
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+index 0fb2eaee3e8a0..f01a7a9ee02ca 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c
+@@ -7,6 +7,7 @@
+ #include "hclge_debugfs.h"
+ #include "hclge_err.h"
+ #include "hclge_main.h"
++#include "hclge_regs.h"
+ #include "hclge_tm.h"
+ #include "hnae3.h"
+
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+index c3e94598f3983..0d56dc2e9960e 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+@@ -20,6 +20,7 @@
+ #include "hclge_main.h"
+ #include "hclge_mbx.h"
+ #include "hclge_mdio.h"
++#include "hclge_regs.h"
+ #include "hclge_tm.h"
+ #include "hclge_err.h"
+ #include "hnae3.h"
+@@ -40,20 +41,6 @@
+ #define HCLGE_PF_RESET_SYNC_TIME 20
+ #define HCLGE_PF_RESET_SYNC_CNT 1500
+
+-/* Get DFX BD number offset */
+-#define HCLGE_DFX_BIOS_BD_OFFSET 1
+-#define HCLGE_DFX_SSU_0_BD_OFFSET 2
+-#define HCLGE_DFX_SSU_1_BD_OFFSET 3
+-#define HCLGE_DFX_IGU_BD_OFFSET 4
+-#define HCLGE_DFX_RPU_0_BD_OFFSET 5
+-#define HCLGE_DFX_RPU_1_BD_OFFSET 6
+-#define HCLGE_DFX_NCSI_BD_OFFSET 7
+-#define HCLGE_DFX_RTC_BD_OFFSET 8
+-#define HCLGE_DFX_PPP_BD_OFFSET 9
+-#define HCLGE_DFX_RCB_BD_OFFSET 10
+-#define HCLGE_DFX_TQP_BD_OFFSET 11
+-#define HCLGE_DFX_SSU_2_BD_OFFSET 12
+-
+ #define HCLGE_LINK_STATUS_MS 10
+
+ static int hclge_set_mac_mtu(struct hclge_dev *hdev, int new_mps);
+@@ -94,62 +81,6 @@ static const struct pci_device_id ae_algo_pci_tbl[] = {
+
+ MODULE_DEVICE_TABLE(pci, ae_algo_pci_tbl);
+
+-static const u32 cmdq_reg_addr_list[] = {HCLGE_COMM_NIC_CSQ_BASEADDR_L_REG,
+- HCLGE_COMM_NIC_CSQ_BASEADDR_H_REG,
+- HCLGE_COMM_NIC_CSQ_DEPTH_REG,
+- HCLGE_COMM_NIC_CSQ_TAIL_REG,
+- HCLGE_COMM_NIC_CSQ_HEAD_REG,
+- HCLGE_COMM_NIC_CRQ_BASEADDR_L_REG,
+- HCLGE_COMM_NIC_CRQ_BASEADDR_H_REG,
+- HCLGE_COMM_NIC_CRQ_DEPTH_REG,
+- HCLGE_COMM_NIC_CRQ_TAIL_REG,
+- HCLGE_COMM_NIC_CRQ_HEAD_REG,
+- HCLGE_COMM_VECTOR0_CMDQ_SRC_REG,
+- HCLGE_COMM_CMDQ_INTR_STS_REG,
+- HCLGE_COMM_CMDQ_INTR_EN_REG,
+- HCLGE_COMM_CMDQ_INTR_GEN_REG};
+-
+-static const u32 common_reg_addr_list[] = {HCLGE_MISC_VECTOR_REG_BASE,
+- HCLGE_PF_OTHER_INT_REG,
+- HCLGE_MISC_RESET_STS_REG,
+- HCLGE_MISC_VECTOR_INT_STS,
+- HCLGE_GLOBAL_RESET_REG,
+- HCLGE_FUN_RST_ING,
+- HCLGE_GRO_EN_REG};
+-
+-static const u32 ring_reg_addr_list[] = {HCLGE_RING_RX_ADDR_L_REG,
+- HCLGE_RING_RX_ADDR_H_REG,
+- HCLGE_RING_RX_BD_NUM_REG,
+- HCLGE_RING_RX_BD_LENGTH_REG,
+- HCLGE_RING_RX_MERGE_EN_REG,
+- HCLGE_RING_RX_TAIL_REG,
+- HCLGE_RING_RX_HEAD_REG,
+- HCLGE_RING_RX_FBD_NUM_REG,
+- HCLGE_RING_RX_OFFSET_REG,
+- HCLGE_RING_RX_FBD_OFFSET_REG,
+- HCLGE_RING_RX_STASH_REG,
+- HCLGE_RING_RX_BD_ERR_REG,
+- HCLGE_RING_TX_ADDR_L_REG,
+- HCLGE_RING_TX_ADDR_H_REG,
+- HCLGE_RING_TX_BD_NUM_REG,
+- HCLGE_RING_TX_PRIORITY_REG,
+- HCLGE_RING_TX_TC_REG,
+- HCLGE_RING_TX_MERGE_EN_REG,
+- HCLGE_RING_TX_TAIL_REG,
+- HCLGE_RING_TX_HEAD_REG,
+- HCLGE_RING_TX_FBD_NUM_REG,
+- HCLGE_RING_TX_OFFSET_REG,
+- HCLGE_RING_TX_EBD_NUM_REG,
+- HCLGE_RING_TX_EBD_OFFSET_REG,
+- HCLGE_RING_TX_BD_ERR_REG,
+- HCLGE_RING_EN_REG};
+-
+-static const u32 tqp_intr_reg_addr_list[] = {HCLGE_TQP_INTR_CTRL_REG,
+- HCLGE_TQP_INTR_GL0_REG,
+- HCLGE_TQP_INTR_GL1_REG,
+- HCLGE_TQP_INTR_GL2_REG,
+- HCLGE_TQP_INTR_RL_REG};
+-
+ static const char hns3_nic_test_strs[][ETH_GSTRING_LEN] = {
+ "External Loopback test",
+ "App Loopback test",
+@@ -375,36 +306,6 @@ static const struct hclge_mac_mgr_tbl_entry_cmd hclge_mgr_table[] = {
+ },
+ };
+
+-static const u32 hclge_dfx_bd_offset_list[] = {
+- HCLGE_DFX_BIOS_BD_OFFSET,
+- HCLGE_DFX_SSU_0_BD_OFFSET,
+- HCLGE_DFX_SSU_1_BD_OFFSET,
+- HCLGE_DFX_IGU_BD_OFFSET,
+- HCLGE_DFX_RPU_0_BD_OFFSET,
+- HCLGE_DFX_RPU_1_BD_OFFSET,
+- HCLGE_DFX_NCSI_BD_OFFSET,
+- HCLGE_DFX_RTC_BD_OFFSET,
+- HCLGE_DFX_PPP_BD_OFFSET,
+- HCLGE_DFX_RCB_BD_OFFSET,
+- HCLGE_DFX_TQP_BD_OFFSET,
+- HCLGE_DFX_SSU_2_BD_OFFSET
+-};
+-
+-static const enum hclge_opcode_type hclge_dfx_reg_opcode_list[] = {
+- HCLGE_OPC_DFX_BIOS_COMMON_REG,
+- HCLGE_OPC_DFX_SSU_REG_0,
+- HCLGE_OPC_DFX_SSU_REG_1,
+- HCLGE_OPC_DFX_IGU_EGU_REG,
+- HCLGE_OPC_DFX_RPU_REG_0,
+- HCLGE_OPC_DFX_RPU_REG_1,
+- HCLGE_OPC_DFX_NCSI_REG,
+- HCLGE_OPC_DFX_RTC_REG,
+- HCLGE_OPC_DFX_PPP_REG,
+- HCLGE_OPC_DFX_RCB_REG,
+- HCLGE_OPC_DFX_TQP_REG,
+- HCLGE_OPC_DFX_SSU_REG_2
+-};
+-
+ static const struct key_info meta_data_key_info[] = {
+ { PACKET_TYPE_ID, 6 },
+ { IP_FRAGEMENT, 1 },
+@@ -1426,6 +1327,7 @@ static void hclge_set_default_dev_specs(struct hclge_dev *hdev)
+ ae_dev->dev_specs.max_frm_size = HCLGE_MAC_MAX_FRAME;
+ ae_dev->dev_specs.max_qset_num = HCLGE_MAX_QSET_NUM;
+ ae_dev->dev_specs.umv_size = HCLGE_DEFAULT_UMV_SPACE_PER_PF;
++ ae_dev->dev_specs.tnl_num = 0;
+ }
+
+ static void hclge_parse_dev_specs(struct hclge_dev *hdev,
+@@ -1449,6 +1351,7 @@ static void hclge_parse_dev_specs(struct hclge_dev *hdev,
+ ae_dev->dev_specs.max_frm_size = le16_to_cpu(req1->max_frm_size);
+ ae_dev->dev_specs.umv_size = le16_to_cpu(req1->umv_size);
+ ae_dev->dev_specs.mc_mac_size = le16_to_cpu(req1->mc_mac_size);
++ ae_dev->dev_specs.tnl_num = req1->tnl_num;
+ }
+
+ static void hclge_check_dev_specs(struct hclge_dev *hdev)
+@@ -10936,9 +10839,12 @@ int hclge_cfg_flowctrl(struct hclge_dev *hdev)
+ u32 rx_pause, tx_pause;
+ u8 flowctl;
+
+- if (!phydev->link || !phydev->autoneg)
++ if (!phydev->link)
+ return 0;
+
++ if (!phydev->autoneg)
++ return hclge_mac_pause_setup_hw(hdev);
++
+ local_advertising = linkmode_adv_to_lcl_adv_t(phydev->advertising);
+
+ if (phydev->pause)
+@@ -12389,463 +12295,6 @@ out:
+ return ret;
+ }
+
+-static int hclge_get_regs_num(struct hclge_dev *hdev, u32 *regs_num_32_bit,
+- u32 *regs_num_64_bit)
+-{
+- struct hclge_desc desc;
+- u32 total_num;
+- int ret;
+-
+- hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_QUERY_REG_NUM, true);
+- ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Query register number cmd failed, ret = %d.\n", ret);
+- return ret;
+- }
+-
+- *regs_num_32_bit = le32_to_cpu(desc.data[0]);
+- *regs_num_64_bit = le32_to_cpu(desc.data[1]);
+-
+- total_num = *regs_num_32_bit + *regs_num_64_bit;
+- if (!total_num)
+- return -EINVAL;
+-
+- return 0;
+-}
+-
+-static int hclge_get_32_bit_regs(struct hclge_dev *hdev, u32 regs_num,
+- void *data)
+-{
+-#define HCLGE_32_BIT_REG_RTN_DATANUM 8
+-#define HCLGE_32_BIT_DESC_NODATA_LEN 2
+-
+- struct hclge_desc *desc;
+- u32 *reg_val = data;
+- __le32 *desc_data;
+- int nodata_num;
+- int cmd_num;
+- int i, k, n;
+- int ret;
+-
+- if (regs_num == 0)
+- return 0;
+-
+- nodata_num = HCLGE_32_BIT_DESC_NODATA_LEN;
+- cmd_num = DIV_ROUND_UP(regs_num + nodata_num,
+- HCLGE_32_BIT_REG_RTN_DATANUM);
+- desc = kcalloc(cmd_num, sizeof(struct hclge_desc), GFP_KERNEL);
+- if (!desc)
+- return -ENOMEM;
+-
+- hclge_cmd_setup_basic_desc(&desc[0], HCLGE_OPC_QUERY_32_BIT_REG, true);
+- ret = hclge_cmd_send(&hdev->hw, desc, cmd_num);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Query 32 bit register cmd failed, ret = %d.\n", ret);
+- kfree(desc);
+- return ret;
+- }
+-
+- for (i = 0; i < cmd_num; i++) {
+- if (i == 0) {
+- desc_data = (__le32 *)(&desc[i].data[0]);
+- n = HCLGE_32_BIT_REG_RTN_DATANUM - nodata_num;
+- } else {
+- desc_data = (__le32 *)(&desc[i]);
+- n = HCLGE_32_BIT_REG_RTN_DATANUM;
+- }
+- for (k = 0; k < n; k++) {
+- *reg_val++ = le32_to_cpu(*desc_data++);
+-
+- regs_num--;
+- if (!regs_num)
+- break;
+- }
+- }
+-
+- kfree(desc);
+- return 0;
+-}
+-
+-static int hclge_get_64_bit_regs(struct hclge_dev *hdev, u32 regs_num,
+- void *data)
+-{
+-#define HCLGE_64_BIT_REG_RTN_DATANUM 4
+-#define HCLGE_64_BIT_DESC_NODATA_LEN 1
+-
+- struct hclge_desc *desc;
+- u64 *reg_val = data;
+- __le64 *desc_data;
+- int nodata_len;
+- int cmd_num;
+- int i, k, n;
+- int ret;
+-
+- if (regs_num == 0)
+- return 0;
+-
+- nodata_len = HCLGE_64_BIT_DESC_NODATA_LEN;
+- cmd_num = DIV_ROUND_UP(regs_num + nodata_len,
+- HCLGE_64_BIT_REG_RTN_DATANUM);
+- desc = kcalloc(cmd_num, sizeof(struct hclge_desc), GFP_KERNEL);
+- if (!desc)
+- return -ENOMEM;
+-
+- hclge_cmd_setup_basic_desc(&desc[0], HCLGE_OPC_QUERY_64_BIT_REG, true);
+- ret = hclge_cmd_send(&hdev->hw, desc, cmd_num);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Query 64 bit register cmd failed, ret = %d.\n", ret);
+- kfree(desc);
+- return ret;
+- }
+-
+- for (i = 0; i < cmd_num; i++) {
+- if (i == 0) {
+- desc_data = (__le64 *)(&desc[i].data[0]);
+- n = HCLGE_64_BIT_REG_RTN_DATANUM - nodata_len;
+- } else {
+- desc_data = (__le64 *)(&desc[i]);
+- n = HCLGE_64_BIT_REG_RTN_DATANUM;
+- }
+- for (k = 0; k < n; k++) {
+- *reg_val++ = le64_to_cpu(*desc_data++);
+-
+- regs_num--;
+- if (!regs_num)
+- break;
+- }
+- }
+-
+- kfree(desc);
+- return 0;
+-}
+-
+-#define MAX_SEPARATE_NUM 4
+-#define SEPARATOR_VALUE 0xFDFCFBFA
+-#define REG_NUM_PER_LINE 4
+-#define REG_LEN_PER_LINE (REG_NUM_PER_LINE * sizeof(u32))
+-#define REG_SEPARATOR_LINE 1
+-#define REG_NUM_REMAIN_MASK 3
+-
+-int hclge_query_bd_num_cmd_send(struct hclge_dev *hdev, struct hclge_desc *desc)
+-{
+- int i;
+-
+- /* initialize command BD except the last one */
+- for (i = 0; i < HCLGE_GET_DFX_REG_TYPE_CNT - 1; i++) {
+- hclge_cmd_setup_basic_desc(&desc[i], HCLGE_OPC_DFX_BD_NUM,
+- true);
+- desc[i].flag |= cpu_to_le16(HCLGE_COMM_CMD_FLAG_NEXT);
+- }
+-
+- /* initialize the last command BD */
+- hclge_cmd_setup_basic_desc(&desc[i], HCLGE_OPC_DFX_BD_NUM, true);
+-
+- return hclge_cmd_send(&hdev->hw, desc, HCLGE_GET_DFX_REG_TYPE_CNT);
+-}
+-
+-static int hclge_get_dfx_reg_bd_num(struct hclge_dev *hdev,
+- int *bd_num_list,
+- u32 type_num)
+-{
+- u32 entries_per_desc, desc_index, index, offset, i;
+- struct hclge_desc desc[HCLGE_GET_DFX_REG_TYPE_CNT];
+- int ret;
+-
+- ret = hclge_query_bd_num_cmd_send(hdev, desc);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get dfx bd num fail, status is %d.\n", ret);
+- return ret;
+- }
+-
+- entries_per_desc = ARRAY_SIZE(desc[0].data);
+- for (i = 0; i < type_num; i++) {
+- offset = hclge_dfx_bd_offset_list[i];
+- index = offset % entries_per_desc;
+- desc_index = offset / entries_per_desc;
+- bd_num_list[i] = le32_to_cpu(desc[desc_index].data[index]);
+- }
+-
+- return ret;
+-}
+-
+-static int hclge_dfx_reg_cmd_send(struct hclge_dev *hdev,
+- struct hclge_desc *desc_src, int bd_num,
+- enum hclge_opcode_type cmd)
+-{
+- struct hclge_desc *desc = desc_src;
+- int i, ret;
+-
+- hclge_cmd_setup_basic_desc(desc, cmd, true);
+- for (i = 0; i < bd_num - 1; i++) {
+- desc->flag |= cpu_to_le16(HCLGE_COMM_CMD_FLAG_NEXT);
+- desc++;
+- hclge_cmd_setup_basic_desc(desc, cmd, true);
+- }
+-
+- desc = desc_src;
+- ret = hclge_cmd_send(&hdev->hw, desc, bd_num);
+- if (ret)
+- dev_err(&hdev->pdev->dev,
+- "Query dfx reg cmd(0x%x) send fail, status is %d.\n",
+- cmd, ret);
+-
+- return ret;
+-}
+-
+-static int hclge_dfx_reg_fetch_data(struct hclge_desc *desc_src, int bd_num,
+- void *data)
+-{
+- int entries_per_desc, reg_num, separator_num, desc_index, index, i;
+- struct hclge_desc *desc = desc_src;
+- u32 *reg = data;
+-
+- entries_per_desc = ARRAY_SIZE(desc->data);
+- reg_num = entries_per_desc * bd_num;
+- separator_num = REG_NUM_PER_LINE - (reg_num & REG_NUM_REMAIN_MASK);
+- for (i = 0; i < reg_num; i++) {
+- index = i % entries_per_desc;
+- desc_index = i / entries_per_desc;
+- *reg++ = le32_to_cpu(desc[desc_index].data[index]);
+- }
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+-
+- return reg_num + separator_num;
+-}
+-
+-static int hclge_get_dfx_reg_len(struct hclge_dev *hdev, int *len)
+-{
+- u32 dfx_reg_type_num = ARRAY_SIZE(hclge_dfx_bd_offset_list);
+- int data_len_per_desc, bd_num, i;
+- int *bd_num_list;
+- u32 data_len;
+- int ret;
+-
+- bd_num_list = kcalloc(dfx_reg_type_num, sizeof(int), GFP_KERNEL);
+- if (!bd_num_list)
+- return -ENOMEM;
+-
+- ret = hclge_get_dfx_reg_bd_num(hdev, bd_num_list, dfx_reg_type_num);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get dfx reg bd num fail, status is %d.\n", ret);
+- goto out;
+- }
+-
+- data_len_per_desc = sizeof_field(struct hclge_desc, data);
+- *len = 0;
+- for (i = 0; i < dfx_reg_type_num; i++) {
+- bd_num = bd_num_list[i];
+- data_len = data_len_per_desc * bd_num;
+- *len += (data_len / REG_LEN_PER_LINE + 1) * REG_LEN_PER_LINE;
+- }
+-
+-out:
+- kfree(bd_num_list);
+- return ret;
+-}
+-
+-static int hclge_get_dfx_reg(struct hclge_dev *hdev, void *data)
+-{
+- u32 dfx_reg_type_num = ARRAY_SIZE(hclge_dfx_bd_offset_list);
+- int bd_num, bd_num_max, buf_len, i;
+- struct hclge_desc *desc_src;
+- int *bd_num_list;
+- u32 *reg = data;
+- int ret;
+-
+- bd_num_list = kcalloc(dfx_reg_type_num, sizeof(int), GFP_KERNEL);
+- if (!bd_num_list)
+- return -ENOMEM;
+-
+- ret = hclge_get_dfx_reg_bd_num(hdev, bd_num_list, dfx_reg_type_num);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get dfx reg bd num fail, status is %d.\n", ret);
+- goto out;
+- }
+-
+- bd_num_max = bd_num_list[0];
+- for (i = 1; i < dfx_reg_type_num; i++)
+- bd_num_max = max_t(int, bd_num_max, bd_num_list[i]);
+-
+- buf_len = sizeof(*desc_src) * bd_num_max;
+- desc_src = kzalloc(buf_len, GFP_KERNEL);
+- if (!desc_src) {
+- ret = -ENOMEM;
+- goto out;
+- }
+-
+- for (i = 0; i < dfx_reg_type_num; i++) {
+- bd_num = bd_num_list[i];
+- ret = hclge_dfx_reg_cmd_send(hdev, desc_src, bd_num,
+- hclge_dfx_reg_opcode_list[i]);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get dfx reg fail, status is %d.\n", ret);
+- break;
+- }
+-
+- reg += hclge_dfx_reg_fetch_data(desc_src, bd_num, reg);
+- }
+-
+- kfree(desc_src);
+-out:
+- kfree(bd_num_list);
+- return ret;
+-}
+-
+-static int hclge_fetch_pf_reg(struct hclge_dev *hdev, void *data,
+- struct hnae3_knic_private_info *kinfo)
+-{
+-#define HCLGE_RING_REG_OFFSET 0x200
+-#define HCLGE_RING_INT_REG_OFFSET 0x4
+-
+- int i, j, reg_num, separator_num;
+- int data_num_sum;
+- u32 *reg = data;
+-
+- /* fetching per-PF registers valus from PF PCIe register space */
+- reg_num = ARRAY_SIZE(cmdq_reg_addr_list);
+- separator_num = MAX_SEPARATE_NUM - (reg_num & REG_NUM_REMAIN_MASK);
+- for (i = 0; i < reg_num; i++)
+- *reg++ = hclge_read_dev(&hdev->hw, cmdq_reg_addr_list[i]);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+- data_num_sum = reg_num + separator_num;
+-
+- reg_num = ARRAY_SIZE(common_reg_addr_list);
+- separator_num = MAX_SEPARATE_NUM - (reg_num & REG_NUM_REMAIN_MASK);
+- for (i = 0; i < reg_num; i++)
+- *reg++ = hclge_read_dev(&hdev->hw, common_reg_addr_list[i]);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+- data_num_sum += reg_num + separator_num;
+-
+- reg_num = ARRAY_SIZE(ring_reg_addr_list);
+- separator_num = MAX_SEPARATE_NUM - (reg_num & REG_NUM_REMAIN_MASK);
+- for (j = 0; j < kinfo->num_tqps; j++) {
+- for (i = 0; i < reg_num; i++)
+- *reg++ = hclge_read_dev(&hdev->hw,
+- ring_reg_addr_list[i] +
+- HCLGE_RING_REG_OFFSET * j);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+- }
+- data_num_sum += (reg_num + separator_num) * kinfo->num_tqps;
+-
+- reg_num = ARRAY_SIZE(tqp_intr_reg_addr_list);
+- separator_num = MAX_SEPARATE_NUM - (reg_num & REG_NUM_REMAIN_MASK);
+- for (j = 0; j < hdev->num_msi_used - 1; j++) {
+- for (i = 0; i < reg_num; i++)
+- *reg++ = hclge_read_dev(&hdev->hw,
+- tqp_intr_reg_addr_list[i] +
+- HCLGE_RING_INT_REG_OFFSET * j);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+- }
+- data_num_sum += (reg_num + separator_num) * (hdev->num_msi_used - 1);
+-
+- return data_num_sum;
+-}
+-
+-static int hclge_get_regs_len(struct hnae3_handle *handle)
+-{
+- int cmdq_lines, common_lines, ring_lines, tqp_intr_lines;
+- struct hnae3_knic_private_info *kinfo = &handle->kinfo;
+- struct hclge_vport *vport = hclge_get_vport(handle);
+- struct hclge_dev *hdev = vport->back;
+- int regs_num_32_bit, regs_num_64_bit, dfx_regs_len;
+- int regs_lines_32_bit, regs_lines_64_bit;
+- int ret;
+-
+- ret = hclge_get_regs_num(hdev, ®s_num_32_bit, ®s_num_64_bit);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get register number failed, ret = %d.\n", ret);
+- return ret;
+- }
+-
+- ret = hclge_get_dfx_reg_len(hdev, &dfx_regs_len);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get dfx reg len failed, ret = %d.\n", ret);
+- return ret;
+- }
+-
+- cmdq_lines = sizeof(cmdq_reg_addr_list) / REG_LEN_PER_LINE +
+- REG_SEPARATOR_LINE;
+- common_lines = sizeof(common_reg_addr_list) / REG_LEN_PER_LINE +
+- REG_SEPARATOR_LINE;
+- ring_lines = sizeof(ring_reg_addr_list) / REG_LEN_PER_LINE +
+- REG_SEPARATOR_LINE;
+- tqp_intr_lines = sizeof(tqp_intr_reg_addr_list) / REG_LEN_PER_LINE +
+- REG_SEPARATOR_LINE;
+- regs_lines_32_bit = regs_num_32_bit * sizeof(u32) / REG_LEN_PER_LINE +
+- REG_SEPARATOR_LINE;
+- regs_lines_64_bit = regs_num_64_bit * sizeof(u64) / REG_LEN_PER_LINE +
+- REG_SEPARATOR_LINE;
+-
+- return (cmdq_lines + common_lines + ring_lines * kinfo->num_tqps +
+- tqp_intr_lines * (hdev->num_msi_used - 1) + regs_lines_32_bit +
+- regs_lines_64_bit) * REG_LEN_PER_LINE + dfx_regs_len;
+-}
+-
+-static void hclge_get_regs(struct hnae3_handle *handle, u32 *version,
+- void *data)
+-{
+- struct hnae3_knic_private_info *kinfo = &handle->kinfo;
+- struct hclge_vport *vport = hclge_get_vport(handle);
+- struct hclge_dev *hdev = vport->back;
+- u32 regs_num_32_bit, regs_num_64_bit;
+- int i, reg_num, separator_num, ret;
+- u32 *reg = data;
+-
+- *version = hdev->fw_version;
+-
+- ret = hclge_get_regs_num(hdev, ®s_num_32_bit, ®s_num_64_bit);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get register number failed, ret = %d.\n", ret);
+- return;
+- }
+-
+- reg += hclge_fetch_pf_reg(hdev, reg, kinfo);
+-
+- ret = hclge_get_32_bit_regs(hdev, regs_num_32_bit, reg);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get 32 bit register failed, ret = %d.\n", ret);
+- return;
+- }
+- reg_num = regs_num_32_bit;
+- reg += reg_num;
+- separator_num = MAX_SEPARATE_NUM - (reg_num & REG_NUM_REMAIN_MASK);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+-
+- ret = hclge_get_64_bit_regs(hdev, regs_num_64_bit, reg);
+- if (ret) {
+- dev_err(&hdev->pdev->dev,
+- "Get 64 bit register failed, ret = %d.\n", ret);
+- return;
+- }
+- reg_num = regs_num_64_bit * 2;
+- reg += reg_num;
+- separator_num = MAX_SEPARATE_NUM - (reg_num & REG_NUM_REMAIN_MASK);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+-
+- ret = hclge_get_dfx_reg(hdev, reg);
+- if (ret)
+- dev_err(&hdev->pdev->dev,
+- "Get dfx register failed, ret = %d.\n", ret);
+-}
+-
+ static int hclge_set_led_status(struct hclge_dev *hdev, u8 locate_led_status)
+ {
+ struct hclge_set_led_state_cmd *req;
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+index 81aa6b0facf5a..e292a1253dd72 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+@@ -1147,8 +1147,6 @@ int hclge_push_vf_port_base_vlan_info(struct hclge_vport *vport, u8 vfid,
+ u16 state,
+ struct hclge_vlan_info *vlan_info);
+ void hclge_task_schedule(struct hclge_dev *hdev, unsigned long delay_time);
+-int hclge_query_bd_num_cmd_send(struct hclge_dev *hdev,
+- struct hclge_desc *desc);
+ void hclge_report_hw_error(struct hclge_dev *hdev,
+ enum hnae3_hw_error_type type);
+ void hclge_inform_vf_promisc_info(struct hclge_vport *vport);
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_regs.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_regs.c
+new file mode 100644
+index 0000000000000..43c1c18fa81f8
+--- /dev/null
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_regs.c
+@@ -0,0 +1,668 @@
++// SPDX-License-Identifier: GPL-2.0+
++// Copyright (c) 2023 Hisilicon Limited.
++
++#include "hclge_cmd.h"
++#include "hclge_main.h"
++#include "hclge_regs.h"
++#include "hnae3.h"
++
++static const u32 cmdq_reg_addr_list[] = {HCLGE_COMM_NIC_CSQ_BASEADDR_L_REG,
++ HCLGE_COMM_NIC_CSQ_BASEADDR_H_REG,
++ HCLGE_COMM_NIC_CSQ_DEPTH_REG,
++ HCLGE_COMM_NIC_CSQ_TAIL_REG,
++ HCLGE_COMM_NIC_CSQ_HEAD_REG,
++ HCLGE_COMM_NIC_CRQ_BASEADDR_L_REG,
++ HCLGE_COMM_NIC_CRQ_BASEADDR_H_REG,
++ HCLGE_COMM_NIC_CRQ_DEPTH_REG,
++ HCLGE_COMM_NIC_CRQ_TAIL_REG,
++ HCLGE_COMM_NIC_CRQ_HEAD_REG,
++ HCLGE_COMM_VECTOR0_CMDQ_SRC_REG,
++ HCLGE_COMM_CMDQ_INTR_STS_REG,
++ HCLGE_COMM_CMDQ_INTR_EN_REG,
++ HCLGE_COMM_CMDQ_INTR_GEN_REG};
++
++static const u32 common_reg_addr_list[] = {HCLGE_MISC_VECTOR_REG_BASE,
++ HCLGE_PF_OTHER_INT_REG,
++ HCLGE_MISC_RESET_STS_REG,
++ HCLGE_MISC_VECTOR_INT_STS,
++ HCLGE_GLOBAL_RESET_REG,
++ HCLGE_FUN_RST_ING,
++ HCLGE_GRO_EN_REG};
++
++static const u32 ring_reg_addr_list[] = {HCLGE_RING_RX_ADDR_L_REG,
++ HCLGE_RING_RX_ADDR_H_REG,
++ HCLGE_RING_RX_BD_NUM_REG,
++ HCLGE_RING_RX_BD_LENGTH_REG,
++ HCLGE_RING_RX_MERGE_EN_REG,
++ HCLGE_RING_RX_TAIL_REG,
++ HCLGE_RING_RX_HEAD_REG,
++ HCLGE_RING_RX_FBD_NUM_REG,
++ HCLGE_RING_RX_OFFSET_REG,
++ HCLGE_RING_RX_FBD_OFFSET_REG,
++ HCLGE_RING_RX_STASH_REG,
++ HCLGE_RING_RX_BD_ERR_REG,
++ HCLGE_RING_TX_ADDR_L_REG,
++ HCLGE_RING_TX_ADDR_H_REG,
++ HCLGE_RING_TX_BD_NUM_REG,
++ HCLGE_RING_TX_PRIORITY_REG,
++ HCLGE_RING_TX_TC_REG,
++ HCLGE_RING_TX_MERGE_EN_REG,
++ HCLGE_RING_TX_TAIL_REG,
++ HCLGE_RING_TX_HEAD_REG,
++ HCLGE_RING_TX_FBD_NUM_REG,
++ HCLGE_RING_TX_OFFSET_REG,
++ HCLGE_RING_TX_EBD_NUM_REG,
++ HCLGE_RING_TX_EBD_OFFSET_REG,
++ HCLGE_RING_TX_BD_ERR_REG,
++ HCLGE_RING_EN_REG};
++
++static const u32 tqp_intr_reg_addr_list[] = {HCLGE_TQP_INTR_CTRL_REG,
++ HCLGE_TQP_INTR_GL0_REG,
++ HCLGE_TQP_INTR_GL1_REG,
++ HCLGE_TQP_INTR_GL2_REG,
++ HCLGE_TQP_INTR_RL_REG};
++
++/* Get DFX BD number offset */
++#define HCLGE_DFX_BIOS_BD_OFFSET 1
++#define HCLGE_DFX_SSU_0_BD_OFFSET 2
++#define HCLGE_DFX_SSU_1_BD_OFFSET 3
++#define HCLGE_DFX_IGU_BD_OFFSET 4
++#define HCLGE_DFX_RPU_0_BD_OFFSET 5
++#define HCLGE_DFX_RPU_1_BD_OFFSET 6
++#define HCLGE_DFX_NCSI_BD_OFFSET 7
++#define HCLGE_DFX_RTC_BD_OFFSET 8
++#define HCLGE_DFX_PPP_BD_OFFSET 9
++#define HCLGE_DFX_RCB_BD_OFFSET 10
++#define HCLGE_DFX_TQP_BD_OFFSET 11
++#define HCLGE_DFX_SSU_2_BD_OFFSET 12
++
++static const u32 hclge_dfx_bd_offset_list[] = {
++ HCLGE_DFX_BIOS_BD_OFFSET,
++ HCLGE_DFX_SSU_0_BD_OFFSET,
++ HCLGE_DFX_SSU_1_BD_OFFSET,
++ HCLGE_DFX_IGU_BD_OFFSET,
++ HCLGE_DFX_RPU_0_BD_OFFSET,
++ HCLGE_DFX_RPU_1_BD_OFFSET,
++ HCLGE_DFX_NCSI_BD_OFFSET,
++ HCLGE_DFX_RTC_BD_OFFSET,
++ HCLGE_DFX_PPP_BD_OFFSET,
++ HCLGE_DFX_RCB_BD_OFFSET,
++ HCLGE_DFX_TQP_BD_OFFSET,
++ HCLGE_DFX_SSU_2_BD_OFFSET
++};
++
++static const enum hclge_opcode_type hclge_dfx_reg_opcode_list[] = {
++ HCLGE_OPC_DFX_BIOS_COMMON_REG,
++ HCLGE_OPC_DFX_SSU_REG_0,
++ HCLGE_OPC_DFX_SSU_REG_1,
++ HCLGE_OPC_DFX_IGU_EGU_REG,
++ HCLGE_OPC_DFX_RPU_REG_0,
++ HCLGE_OPC_DFX_RPU_REG_1,
++ HCLGE_OPC_DFX_NCSI_REG,
++ HCLGE_OPC_DFX_RTC_REG,
++ HCLGE_OPC_DFX_PPP_REG,
++ HCLGE_OPC_DFX_RCB_REG,
++ HCLGE_OPC_DFX_TQP_REG,
++ HCLGE_OPC_DFX_SSU_REG_2
++};
++
++enum hclge_reg_tag {
++ HCLGE_REG_TAG_CMDQ = 0,
++ HCLGE_REG_TAG_COMMON,
++ HCLGE_REG_TAG_RING,
++ HCLGE_REG_TAG_TQP_INTR,
++ HCLGE_REG_TAG_QUERY_32_BIT,
++ HCLGE_REG_TAG_QUERY_64_BIT,
++ HCLGE_REG_TAG_DFX_BIOS_COMMON,
++ HCLGE_REG_TAG_DFX_SSU_0,
++ HCLGE_REG_TAG_DFX_SSU_1,
++ HCLGE_REG_TAG_DFX_IGU_EGU,
++ HCLGE_REG_TAG_DFX_RPU_0,
++ HCLGE_REG_TAG_DFX_RPU_1,
++ HCLGE_REG_TAG_DFX_NCSI,
++ HCLGE_REG_TAG_DFX_RTC,
++ HCLGE_REG_TAG_DFX_PPP,
++ HCLGE_REG_TAG_DFX_RCB,
++ HCLGE_REG_TAG_DFX_TQP,
++ HCLGE_REG_TAG_DFX_SSU_2,
++ HCLGE_REG_TAG_RPU_TNL,
++};
++
++#pragma pack(4)
++struct hclge_reg_tlv {
++ u16 tag;
++ u16 len;
++};
++
++struct hclge_reg_header {
++ u64 magic_number;
++ u8 is_vf;
++ u8 rsv[7];
++};
++
++#pragma pack()
++
++#define HCLGE_REG_TLV_SIZE sizeof(struct hclge_reg_tlv)
++#define HCLGE_REG_HEADER_SIZE sizeof(struct hclge_reg_header)
++#define HCLGE_REG_TLV_SPACE (sizeof(struct hclge_reg_tlv) / sizeof(u32))
++#define HCLGE_REG_HEADER_SPACE (sizeof(struct hclge_reg_header) / sizeof(u32))
++#define HCLGE_REG_MAGIC_NUMBER 0x686e733372656773 /* meaning is hns3regs */
++
++#define HCLGE_REG_RPU_TNL_ID_0 1
++
++static u32 hclge_reg_get_header(void *data)
++{
++ struct hclge_reg_header *header = data;
++
++ header->magic_number = HCLGE_REG_MAGIC_NUMBER;
++ header->is_vf = 0x0;
++
++ return HCLGE_REG_HEADER_SPACE;
++}
++
++static u32 hclge_reg_get_tlv(u32 tag, u32 regs_num, void *data)
++{
++ struct hclge_reg_tlv *tlv = data;
++
++ tlv->tag = tag;
++ tlv->len = regs_num * sizeof(u32) + HCLGE_REG_TLV_SIZE;
++
++ return HCLGE_REG_TLV_SPACE;
++}
++
++static int hclge_get_32_bit_regs(struct hclge_dev *hdev, u32 regs_num,
++ void *data)
++{
++#define HCLGE_32_BIT_REG_RTN_DATANUM 8
++#define HCLGE_32_BIT_DESC_NODATA_LEN 2
++
++ struct hclge_desc *desc;
++ u32 *reg_val = data;
++ __le32 *desc_data;
++ int nodata_num;
++ int cmd_num;
++ int i, k, n;
++ int ret;
++
++ if (regs_num == 0)
++ return 0;
++
++ nodata_num = HCLGE_32_BIT_DESC_NODATA_LEN;
++ cmd_num = DIV_ROUND_UP(regs_num + nodata_num,
++ HCLGE_32_BIT_REG_RTN_DATANUM);
++ desc = kcalloc(cmd_num, sizeof(struct hclge_desc), GFP_KERNEL);
++ if (!desc)
++ return -ENOMEM;
++
++ hclge_cmd_setup_basic_desc(&desc[0], HCLGE_OPC_QUERY_32_BIT_REG, true);
++ ret = hclge_cmd_send(&hdev->hw, desc, cmd_num);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Query 32 bit register cmd failed, ret = %d.\n", ret);
++ kfree(desc);
++ return ret;
++ }
++
++ for (i = 0; i < cmd_num; i++) {
++ if (i == 0) {
++ desc_data = (__le32 *)(&desc[i].data[0]);
++ n = HCLGE_32_BIT_REG_RTN_DATANUM - nodata_num;
++ } else {
++ desc_data = (__le32 *)(&desc[i]);
++ n = HCLGE_32_BIT_REG_RTN_DATANUM;
++ }
++ for (k = 0; k < n; k++) {
++ *reg_val++ = le32_to_cpu(*desc_data++);
++
++ regs_num--;
++ if (!regs_num)
++ break;
++ }
++ }
++
++ kfree(desc);
++ return 0;
++}
++
++static int hclge_get_64_bit_regs(struct hclge_dev *hdev, u32 regs_num,
++ void *data)
++{
++#define HCLGE_64_BIT_REG_RTN_DATANUM 4
++#define HCLGE_64_BIT_DESC_NODATA_LEN 1
++
++ struct hclge_desc *desc;
++ u64 *reg_val = data;
++ __le64 *desc_data;
++ int nodata_len;
++ int cmd_num;
++ int i, k, n;
++ int ret;
++
++ if (regs_num == 0)
++ return 0;
++
++ nodata_len = HCLGE_64_BIT_DESC_NODATA_LEN;
++ cmd_num = DIV_ROUND_UP(regs_num + nodata_len,
++ HCLGE_64_BIT_REG_RTN_DATANUM);
++ desc = kcalloc(cmd_num, sizeof(struct hclge_desc), GFP_KERNEL);
++ if (!desc)
++ return -ENOMEM;
++
++ hclge_cmd_setup_basic_desc(&desc[0], HCLGE_OPC_QUERY_64_BIT_REG, true);
++ ret = hclge_cmd_send(&hdev->hw, desc, cmd_num);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Query 64 bit register cmd failed, ret = %d.\n", ret);
++ kfree(desc);
++ return ret;
++ }
++
++ for (i = 0; i < cmd_num; i++) {
++ if (i == 0) {
++ desc_data = (__le64 *)(&desc[i].data[0]);
++ n = HCLGE_64_BIT_REG_RTN_DATANUM - nodata_len;
++ } else {
++ desc_data = (__le64 *)(&desc[i]);
++ n = HCLGE_64_BIT_REG_RTN_DATANUM;
++ }
++ for (k = 0; k < n; k++) {
++ *reg_val++ = le64_to_cpu(*desc_data++);
++
++ regs_num--;
++ if (!regs_num)
++ break;
++ }
++ }
++
++ kfree(desc);
++ return 0;
++}
++
++int hclge_query_bd_num_cmd_send(struct hclge_dev *hdev, struct hclge_desc *desc)
++{
++ int i;
++
++ /* initialize command BD except the last one */
++ for (i = 0; i < HCLGE_GET_DFX_REG_TYPE_CNT - 1; i++) {
++ hclge_cmd_setup_basic_desc(&desc[i], HCLGE_OPC_DFX_BD_NUM,
++ true);
++ desc[i].flag |= cpu_to_le16(HCLGE_COMM_CMD_FLAG_NEXT);
++ }
++
++ /* initialize the last command BD */
++ hclge_cmd_setup_basic_desc(&desc[i], HCLGE_OPC_DFX_BD_NUM, true);
++
++ return hclge_cmd_send(&hdev->hw, desc, HCLGE_GET_DFX_REG_TYPE_CNT);
++}
++
++static int hclge_get_dfx_reg_bd_num(struct hclge_dev *hdev,
++ int *bd_num_list,
++ u32 type_num)
++{
++ u32 entries_per_desc, desc_index, index, offset, i;
++ struct hclge_desc desc[HCLGE_GET_DFX_REG_TYPE_CNT];
++ int ret;
++
++ ret = hclge_query_bd_num_cmd_send(hdev, desc);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get dfx bd num fail, status is %d.\n", ret);
++ return ret;
++ }
++
++ entries_per_desc = ARRAY_SIZE(desc[0].data);
++ for (i = 0; i < type_num; i++) {
++ offset = hclge_dfx_bd_offset_list[i];
++ index = offset % entries_per_desc;
++ desc_index = offset / entries_per_desc;
++ bd_num_list[i] = le32_to_cpu(desc[desc_index].data[index]);
++ }
++
++ return ret;
++}
++
++static int hclge_dfx_reg_cmd_send(struct hclge_dev *hdev,
++ struct hclge_desc *desc_src, int bd_num,
++ enum hclge_opcode_type cmd)
++{
++ struct hclge_desc *desc = desc_src;
++ int i, ret;
++
++ hclge_cmd_setup_basic_desc(desc, cmd, true);
++ for (i = 0; i < bd_num - 1; i++) {
++ desc->flag |= cpu_to_le16(HCLGE_COMM_CMD_FLAG_NEXT);
++ desc++;
++ hclge_cmd_setup_basic_desc(desc, cmd, true);
++ }
++
++ desc = desc_src;
++ ret = hclge_cmd_send(&hdev->hw, desc, bd_num);
++ if (ret)
++ dev_err(&hdev->pdev->dev,
++ "Query dfx reg cmd(0x%x) send fail, status is %d.\n",
++ cmd, ret);
++
++ return ret;
++}
++
++/* tnl_id = 0 means get sum of all tnl reg's value */
++static int hclge_dfx_reg_rpu_tnl_cmd_send(struct hclge_dev *hdev, u32 tnl_id,
++ struct hclge_desc *desc, int bd_num)
++{
++ int i, ret;
++
++ for (i = 0; i < bd_num; i++) {
++ hclge_cmd_setup_basic_desc(&desc[i], HCLGE_OPC_DFX_RPU_REG_0,
++ true);
++ if (i != bd_num - 1)
++ desc[i].flag |= cpu_to_le16(HCLGE_COMM_CMD_FLAG_NEXT);
++ }
++
++ desc[0].data[0] = cpu_to_le32(tnl_id);
++ ret = hclge_cmd_send(&hdev->hw, desc, bd_num);
++ if (ret)
++ dev_err(&hdev->pdev->dev,
++ "failed to query dfx rpu tnl reg, ret = %d\n",
++ ret);
++ return ret;
++}
++
++static int hclge_dfx_reg_fetch_data(struct hclge_desc *desc_src, int bd_num,
++ void *data)
++{
++ int entries_per_desc, reg_num, desc_index, index, i;
++ struct hclge_desc *desc = desc_src;
++ u32 *reg = data;
++
++ entries_per_desc = ARRAY_SIZE(desc->data);
++ reg_num = entries_per_desc * bd_num;
++ for (i = 0; i < reg_num; i++) {
++ index = i % entries_per_desc;
++ desc_index = i / entries_per_desc;
++ *reg++ = le32_to_cpu(desc[desc_index].data[index]);
++ }
++
++ return reg_num;
++}
++
++static int hclge_get_dfx_reg_len(struct hclge_dev *hdev, int *len)
++{
++ u32 dfx_reg_type_num = ARRAY_SIZE(hclge_dfx_bd_offset_list);
++ struct hnae3_ae_dev *ae_dev = pci_get_drvdata(hdev->pdev);
++ int data_len_per_desc;
++ int *bd_num_list;
++ int ret;
++ u32 i;
++
++ bd_num_list = kcalloc(dfx_reg_type_num, sizeof(int), GFP_KERNEL);
++ if (!bd_num_list)
++ return -ENOMEM;
++
++ ret = hclge_get_dfx_reg_bd_num(hdev, bd_num_list, dfx_reg_type_num);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get dfx reg bd num fail, status is %d.\n", ret);
++ goto out;
++ }
++
++ data_len_per_desc = sizeof_field(struct hclge_desc, data);
++ *len = 0;
++ for (i = 0; i < dfx_reg_type_num; i++)
++ *len += bd_num_list[i] * data_len_per_desc + HCLGE_REG_TLV_SIZE;
++
++	/*
++	 * The BD count of dfx_rpu_0 is reused by each dfx_rpu_tnl.
++	 * HCLGE_DFX_BD_OFFSET starts at 1, but the array subscript starts
++	 * at 0, so the offset needs '- 1'.
++	 */
++ *len += (bd_num_list[HCLGE_DFX_RPU_0_BD_OFFSET - 1] * data_len_per_desc +
++ HCLGE_REG_TLV_SIZE) * ae_dev->dev_specs.tnl_num;
++
++out:
++ kfree(bd_num_list);
++ return ret;
++}
++
++static int hclge_get_dfx_rpu_tnl_reg(struct hclge_dev *hdev, u32 *reg,
++ struct hclge_desc *desc_src,
++ int bd_num)
++{
++ struct hnae3_ae_dev *ae_dev = pci_get_drvdata(hdev->pdev);
++ int ret = 0;
++ u8 i;
++
++ for (i = HCLGE_REG_RPU_TNL_ID_0; i <= ae_dev->dev_specs.tnl_num; i++) {
++ ret = hclge_dfx_reg_rpu_tnl_cmd_send(hdev, i, desc_src, bd_num);
++ if (ret)
++ break;
++
++ reg += hclge_reg_get_tlv(HCLGE_REG_TAG_RPU_TNL,
++ ARRAY_SIZE(desc_src->data) * bd_num,
++ reg);
++ reg += hclge_dfx_reg_fetch_data(desc_src, bd_num, reg);
++ }
++
++ return ret;
++}
++
++static int hclge_get_dfx_reg(struct hclge_dev *hdev, void *data)
++{
++ u32 dfx_reg_type_num = ARRAY_SIZE(hclge_dfx_bd_offset_list);
++ int bd_num, bd_num_max, buf_len;
++ struct hclge_desc *desc_src;
++ int *bd_num_list;
++ u32 *reg = data;
++ int ret;
++ u32 i;
++
++ bd_num_list = kcalloc(dfx_reg_type_num, sizeof(int), GFP_KERNEL);
++ if (!bd_num_list)
++ return -ENOMEM;
++
++ ret = hclge_get_dfx_reg_bd_num(hdev, bd_num_list, dfx_reg_type_num);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get dfx reg bd num fail, status is %d.\n", ret);
++ goto out;
++ }
++
++ bd_num_max = bd_num_list[0];
++ for (i = 1; i < dfx_reg_type_num; i++)
++ bd_num_max = max_t(int, bd_num_max, bd_num_list[i]);
++
++ buf_len = sizeof(*desc_src) * bd_num_max;
++ desc_src = kzalloc(buf_len, GFP_KERNEL);
++ if (!desc_src) {
++ ret = -ENOMEM;
++ goto out;
++ }
++
++ for (i = 0; i < dfx_reg_type_num; i++) {
++ bd_num = bd_num_list[i];
++ ret = hclge_dfx_reg_cmd_send(hdev, desc_src, bd_num,
++ hclge_dfx_reg_opcode_list[i]);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get dfx reg fail, status is %d.\n", ret);
++ goto free;
++ }
++
++ reg += hclge_reg_get_tlv(HCLGE_REG_TAG_DFX_BIOS_COMMON + i,
++ ARRAY_SIZE(desc_src->data) * bd_num,
++ reg);
++ reg += hclge_dfx_reg_fetch_data(desc_src, bd_num, reg);
++ }
++
++	/*
++	 * HCLGE_DFX_BD_OFFSET starts at 1, but the array subscript starts
++	 * at 0, so the offset needs '- 1'.
++	 */
++ bd_num = bd_num_list[HCLGE_DFX_RPU_0_BD_OFFSET - 1];
++ ret = hclge_get_dfx_rpu_tnl_reg(hdev, reg, desc_src, bd_num);
++
++free:
++ kfree(desc_src);
++out:
++ kfree(bd_num_list);
++ return ret;
++}
++
++static int hclge_fetch_pf_reg(struct hclge_dev *hdev, void *data,
++ struct hnae3_knic_private_info *kinfo)
++{
++#define HCLGE_RING_REG_OFFSET 0x200
++#define HCLGE_RING_INT_REG_OFFSET 0x4
++
++ int i, j, reg_num;
++ int data_num_sum;
++ u32 *reg = data;
++
++	/* fetching per-PF register values from PF PCIe register space */
++ reg_num = ARRAY_SIZE(cmdq_reg_addr_list);
++ reg += hclge_reg_get_tlv(HCLGE_REG_TAG_CMDQ, reg_num, reg);
++ for (i = 0; i < reg_num; i++)
++ *reg++ = hclge_read_dev(&hdev->hw, cmdq_reg_addr_list[i]);
++ data_num_sum = reg_num + HCLGE_REG_TLV_SPACE;
++
++ reg_num = ARRAY_SIZE(common_reg_addr_list);
++ reg += hclge_reg_get_tlv(HCLGE_REG_TAG_COMMON, reg_num, reg);
++ for (i = 0; i < reg_num; i++)
++ *reg++ = hclge_read_dev(&hdev->hw, common_reg_addr_list[i]);
++ data_num_sum += reg_num + HCLGE_REG_TLV_SPACE;
++
++ reg_num = ARRAY_SIZE(ring_reg_addr_list);
++ for (j = 0; j < kinfo->num_tqps; j++) {
++ reg += hclge_reg_get_tlv(HCLGE_REG_TAG_RING, reg_num, reg);
++ for (i = 0; i < reg_num; i++)
++ *reg++ = hclge_read_dev(&hdev->hw,
++ ring_reg_addr_list[i] +
++ HCLGE_RING_REG_OFFSET * j);
++ }
++ data_num_sum += (reg_num + HCLGE_REG_TLV_SPACE) * kinfo->num_tqps;
++
++ reg_num = ARRAY_SIZE(tqp_intr_reg_addr_list);
++ for (j = 0; j < hdev->num_msi_used - 1; j++) {
++ reg += hclge_reg_get_tlv(HCLGE_REG_TAG_TQP_INTR, reg_num, reg);
++ for (i = 0; i < reg_num; i++)
++ *reg++ = hclge_read_dev(&hdev->hw,
++ tqp_intr_reg_addr_list[i] +
++ HCLGE_RING_INT_REG_OFFSET * j);
++ }
++ data_num_sum += (reg_num + HCLGE_REG_TLV_SPACE) *
++ (hdev->num_msi_used - 1);
++
++ return data_num_sum;
++}
++
++static int hclge_get_regs_num(struct hclge_dev *hdev, u32 *regs_num_32_bit,
++ u32 *regs_num_64_bit)
++{
++ struct hclge_desc desc;
++ u32 total_num;
++ int ret;
++
++ hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_QUERY_REG_NUM, true);
++ ret = hclge_cmd_send(&hdev->hw, &desc, 1);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Query register number cmd failed, ret = %d.\n", ret);
++ return ret;
++ }
++
++ *regs_num_32_bit = le32_to_cpu(desc.data[0]);
++ *regs_num_64_bit = le32_to_cpu(desc.data[1]);
++
++ total_num = *regs_num_32_bit + *regs_num_64_bit;
++ if (!total_num)
++ return -EINVAL;
++
++ return 0;
++}
++
++int hclge_get_regs_len(struct hnae3_handle *handle)
++{
++ struct hnae3_knic_private_info *kinfo = &handle->kinfo;
++ struct hclge_vport *vport = hclge_get_vport(handle);
++ int regs_num_32_bit, regs_num_64_bit, dfx_regs_len;
++ int cmdq_len, common_len, ring_len, tqp_intr_len;
++ int regs_len_32_bit, regs_len_64_bit;
++ struct hclge_dev *hdev = vport->back;
++ int ret;
++
++ ret = hclge_get_regs_num(hdev, ®s_num_32_bit, ®s_num_64_bit);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get register number failed, ret = %d.\n", ret);
++ return ret;
++ }
++
++ ret = hclge_get_dfx_reg_len(hdev, &dfx_regs_len);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get dfx reg len failed, ret = %d.\n", ret);
++ return ret;
++ }
++
++ cmdq_len = HCLGE_REG_TLV_SIZE + sizeof(cmdq_reg_addr_list);
++ common_len = HCLGE_REG_TLV_SIZE + sizeof(common_reg_addr_list);
++ ring_len = HCLGE_REG_TLV_SIZE + sizeof(ring_reg_addr_list);
++ tqp_intr_len = HCLGE_REG_TLV_SIZE + sizeof(tqp_intr_reg_addr_list);
++ regs_len_32_bit = HCLGE_REG_TLV_SIZE + regs_num_32_bit * sizeof(u32);
++ regs_len_64_bit = HCLGE_REG_TLV_SIZE + regs_num_64_bit * sizeof(u64);
++
++ /* return the total length of all register values */
++ return HCLGE_REG_HEADER_SIZE + cmdq_len + common_len + ring_len *
++ kinfo->num_tqps + tqp_intr_len * (hdev->num_msi_used - 1) +
++ regs_len_32_bit + regs_len_64_bit + dfx_regs_len;
++}
++
++void hclge_get_regs(struct hnae3_handle *handle, u32 *version,
++ void *data)
++{
++#define HCLGE_REG_64_BIT_SPACE_MULTIPLE 2
++
++ struct hnae3_knic_private_info *kinfo = &handle->kinfo;
++ struct hclge_vport *vport = hclge_get_vport(handle);
++ struct hclge_dev *hdev = vport->back;
++ u32 regs_num_32_bit, regs_num_64_bit;
++ u32 *reg = data;
++ int ret;
++
++ *version = hdev->fw_version;
++
++ ret = hclge_get_regs_num(hdev, ®s_num_32_bit, ®s_num_64_bit);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get register number failed, ret = %d.\n", ret);
++ return;
++ }
++
++ reg += hclge_reg_get_header(reg);
++ reg += hclge_fetch_pf_reg(hdev, reg, kinfo);
++
++ reg += hclge_reg_get_tlv(HCLGE_REG_TAG_QUERY_32_BIT,
++ regs_num_32_bit, reg);
++ ret = hclge_get_32_bit_regs(hdev, regs_num_32_bit, reg);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get 32 bit register failed, ret = %d.\n", ret);
++ return;
++ }
++ reg += regs_num_32_bit;
++
++ reg += hclge_reg_get_tlv(HCLGE_REG_TAG_QUERY_64_BIT,
++ regs_num_64_bit *
++ HCLGE_REG_64_BIT_SPACE_MULTIPLE, reg);
++ ret = hclge_get_64_bit_regs(hdev, regs_num_64_bit, reg);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "Get 64 bit register failed, ret = %d.\n", ret);
++ return;
++ }
++ reg += regs_num_64_bit * HCLGE_REG_64_BIT_SPACE_MULTIPLE;
++
++ ret = hclge_get_dfx_reg(hdev, reg);
++ if (ret)
++ dev_err(&hdev->pdev->dev,
++ "Get dfx register failed, ret = %d.\n", ret);
++}
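The new hclge_regs.c above frames the PF register dump as a packed header (magic number 0x686e733372656773, "hns3regs", plus an is_vf flag) followed by tag/length/value records, where each TLV's len field covers the TLV itself plus the u32 register payload that follows. A hedged, userspace-compilable sketch of that layout, assuming the same 4-byte packing; the tag value and register contents below are illustrative, not the driver's:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#pragma pack(4)
struct reg_tlv {           /* mirrors the shape of struct hclge_reg_tlv */
	uint16_t tag;
	uint16_t len;      /* bytes: TLV header + payload */
};

struct reg_header {        /* mirrors the shape of struct hclge_reg_header */
	uint64_t magic_number;
	uint8_t  is_vf;
	uint8_t  rsv[7];
};
#pragma pack()

#define DUMP_MAGIC 0x686e733372656773ULL   /* "hns3regs", as in the patch */

/* Emit one TLV-framed group of 32-bit register values into buf,
 * returning how many u32 slots were consumed. */
static size_t emit_group(uint32_t *buf, uint16_t tag,
			 const uint32_t *vals, uint16_t nvals)
{
	struct reg_tlv tlv = {
		.tag = tag,
		.len = (uint16_t)(nvals * sizeof(uint32_t) + sizeof(tlv)),
	};

	memcpy(buf, &tlv, sizeof(tlv));
	memcpy(buf + sizeof(tlv) / sizeof(uint32_t), vals,
	       nvals * sizeof(uint32_t));
	return sizeof(tlv) / sizeof(uint32_t) + nvals;
}

int main(void)
{
	uint32_t dump[32] = { 0 };
	const uint32_t cmdq_vals[] = { 0x11, 0x22, 0x33 };
	struct reg_header hdr = { .magic_number = DUMP_MAGIC, .is_vf = 0 };
	size_t off = sizeof(hdr) / sizeof(uint32_t);

	memcpy(dump, &hdr, sizeof(hdr));
	off += emit_group(dump + off, 0 /* illustrative tag */, cmdq_vals, 3);

	printf("dump uses %zu u32 words\n", off);
	return 0;
}

A consumer of the ethtool dump can walk the buffer header-first and skip unknown tags by advancing len bytes, which is presumably the point of switching to TLVs.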
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_regs.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_regs.h
+new file mode 100644
+index 0000000000000..b6bc1ecb8054e
+--- /dev/null
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_regs.h
+@@ -0,0 +1,17 @@
++/* SPDX-License-Identifier: GPL-2.0+ */
++// Copyright (c) 2023 Hisilicon Limited.
++
++#ifndef __HCLGE_REGS_H
++#define __HCLGE_REGS_H
++#include <linux/types.h>
++#include "hclge_comm_cmd.h"
++
++struct hnae3_handle;
++struct hclge_dev;
++
++int hclge_query_bd_num_cmd_send(struct hclge_dev *hdev,
++ struct hclge_desc *desc);
++int hclge_get_regs_len(struct hnae3_handle *handle);
++void hclge_get_regs(struct hnae3_handle *handle, u32 *version,
++ void *data);
++#endif
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+index 150f146fa24fb..c58c312217628 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+@@ -1485,7 +1485,11 @@ int hclge_tm_schd_setup_hw(struct hclge_dev *hdev)
+ return ret;
+
+ /* Cfg schd mode for each level schd */
+- return hclge_tm_schd_mode_hw(hdev);
++ ret = hclge_tm_schd_mode_hw(hdev);
++ if (ret)
++ return ret;
++
++ return hclge_tm_flush_cfg(hdev, false);
+ }
+
+ static int hclge_pause_param_setup_hw(struct hclge_dev *hdev)
+@@ -1549,7 +1553,7 @@ static int hclge_bp_setup_hw(struct hclge_dev *hdev, u8 tc)
+ return 0;
+ }
+
+-static int hclge_mac_pause_setup_hw(struct hclge_dev *hdev)
++int hclge_mac_pause_setup_hw(struct hclge_dev *hdev)
+ {
+ bool tx_en, rx_en;
+
+@@ -2114,3 +2118,28 @@ int hclge_tm_get_port_shaper(struct hclge_dev *hdev,
+
+ return 0;
+ }
++
++int hclge_tm_flush_cfg(struct hclge_dev *hdev, bool enable)
++{
++ struct hclge_desc desc;
++ int ret;
++
++ if (!hnae3_ae_dev_tm_flush_supported(hdev))
++ return 0;
++
++ hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_TM_FLUSH, false);
++
++ desc.data[0] = cpu_to_le32(enable ? HCLGE_TM_FLUSH_EN_MSK : 0);
++
++ ret = hclge_cmd_send(&hdev->hw, &desc, 1);
++ if (ret) {
++ dev_err(&hdev->pdev->dev,
++ "failed to config tm flush, ret = %d\n", ret);
++ return ret;
++ }
++
++ if (enable)
++ msleep(HCLGE_TM_FLUSH_TIME_MS);
++
++ return ret;
++}
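hclge_tm_flush_cfg() above issues a single firmware command (HCLGE_OPC_TM_FLUSH) with bit 0 set or cleared and, when enabling, sleeps HCLGE_TM_FLUSH_TIME_MS so in-flight traffic can drain; the PFC path earlier in the patch enables the flush before reallocating buffers and disables it afterwards. A small sketch of that enable/reconfigure/disable bracket, with set_flush() as a hypothetical stand-in for the firmware call:

#include <stdio.h>

/* Hypothetical stand-ins: set_flush() would issue the firmware command
 * and sleep ~10 ms when enabling. */
static int set_flush(int enable)
{
	printf("tm flush %s\n", enable ? "on" : "off");
	return 0;
}

static int reconfigure_schedulers(void) { return 0; }

static int safe_reconfigure(void)
{
	int ret;

	ret = set_flush(1);          /* quiesce traffic scheduling first */
	if (ret)
		return ret;

	ret = reconfigure_schedulers();

	/* Always turn the flush back off, even if reconfiguration failed,
	 * so the port does not stay wedged. */
	set_flush(0);
	return ret;
}

int main(void)
{
	return safe_reconfigure();
}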
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h
+index dd6f1fd486cf2..53eec6df51946 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.h
+@@ -33,6 +33,9 @@ enum hclge_opcode_type;
+ #define HCLGE_DSCP_MAP_TC_BD_NUM 2
+ #define HCLGE_DSCP_TC_SHIFT(n) (((n) & 1) * 4)
+
++#define HCLGE_TM_FLUSH_TIME_MS 10
++#define HCLGE_TM_FLUSH_EN_MSK BIT(0)
++
+ struct hclge_pg_to_pri_link_cmd {
+ u8 pg_id;
+ u8 rsvd1[3];
+@@ -242,6 +245,7 @@ int hclge_pfc_pause_en_cfg(struct hclge_dev *hdev, u8 tx_rx_bitmap,
+ u8 pfc_bitmap);
+ int hclge_mac_pause_en_cfg(struct hclge_dev *hdev, bool tx, bool rx);
+ int hclge_pause_addr_cfg(struct hclge_dev *hdev, const u8 *mac_addr);
++int hclge_mac_pause_setup_hw(struct hclge_dev *hdev);
+ void hclge_pfc_rx_stats_get(struct hclge_dev *hdev, u64 *stats);
+ void hclge_pfc_tx_stats_get(struct hclge_dev *hdev, u64 *stats);
+ int hclge_tm_qs_shaper_cfg(struct hclge_vport *vport, int max_tx_rate);
+@@ -272,4 +276,5 @@ int hclge_tm_get_port_shaper(struct hclge_dev *hdev,
+ struct hclge_tm_shaper_para *para);
+ int hclge_up_to_tc_map(struct hclge_dev *hdev);
+ int hclge_dscp_to_tc_map(struct hclge_dev *hdev);
++int hclge_tm_flush_cfg(struct hclge_dev *hdev, bool enable);
+ #endif
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+index dd08989a4c7c1..de42a0e1b54b8 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+@@ -6,6 +6,7 @@
+ #include <net/rtnetlink.h>
+ #include "hclgevf_cmd.h"
+ #include "hclgevf_main.h"
++#include "hclgevf_regs.h"
+ #include "hclge_mbx.h"
+ #include "hnae3.h"
+ #include "hclgevf_devlink.h"
+@@ -33,58 +34,6 @@ static const struct pci_device_id ae_algovf_pci_tbl[] = {
+
+ MODULE_DEVICE_TABLE(pci, ae_algovf_pci_tbl);
+
+-static const u32 cmdq_reg_addr_list[] = {HCLGE_COMM_NIC_CSQ_BASEADDR_L_REG,
+- HCLGE_COMM_NIC_CSQ_BASEADDR_H_REG,
+- HCLGE_COMM_NIC_CSQ_DEPTH_REG,
+- HCLGE_COMM_NIC_CSQ_TAIL_REG,
+- HCLGE_COMM_NIC_CSQ_HEAD_REG,
+- HCLGE_COMM_NIC_CRQ_BASEADDR_L_REG,
+- HCLGE_COMM_NIC_CRQ_BASEADDR_H_REG,
+- HCLGE_COMM_NIC_CRQ_DEPTH_REG,
+- HCLGE_COMM_NIC_CRQ_TAIL_REG,
+- HCLGE_COMM_NIC_CRQ_HEAD_REG,
+- HCLGE_COMM_VECTOR0_CMDQ_SRC_REG,
+- HCLGE_COMM_VECTOR0_CMDQ_STATE_REG,
+- HCLGE_COMM_CMDQ_INTR_EN_REG,
+- HCLGE_COMM_CMDQ_INTR_GEN_REG};
+-
+-static const u32 common_reg_addr_list[] = {HCLGEVF_MISC_VECTOR_REG_BASE,
+- HCLGEVF_RST_ING,
+- HCLGEVF_GRO_EN_REG};
+-
+-static const u32 ring_reg_addr_list[] = {HCLGEVF_RING_RX_ADDR_L_REG,
+- HCLGEVF_RING_RX_ADDR_H_REG,
+- HCLGEVF_RING_RX_BD_NUM_REG,
+- HCLGEVF_RING_RX_BD_LENGTH_REG,
+- HCLGEVF_RING_RX_MERGE_EN_REG,
+- HCLGEVF_RING_RX_TAIL_REG,
+- HCLGEVF_RING_RX_HEAD_REG,
+- HCLGEVF_RING_RX_FBD_NUM_REG,
+- HCLGEVF_RING_RX_OFFSET_REG,
+- HCLGEVF_RING_RX_FBD_OFFSET_REG,
+- HCLGEVF_RING_RX_STASH_REG,
+- HCLGEVF_RING_RX_BD_ERR_REG,
+- HCLGEVF_RING_TX_ADDR_L_REG,
+- HCLGEVF_RING_TX_ADDR_H_REG,
+- HCLGEVF_RING_TX_BD_NUM_REG,
+- HCLGEVF_RING_TX_PRIORITY_REG,
+- HCLGEVF_RING_TX_TC_REG,
+- HCLGEVF_RING_TX_MERGE_EN_REG,
+- HCLGEVF_RING_TX_TAIL_REG,
+- HCLGEVF_RING_TX_HEAD_REG,
+- HCLGEVF_RING_TX_FBD_NUM_REG,
+- HCLGEVF_RING_TX_OFFSET_REG,
+- HCLGEVF_RING_TX_EBD_NUM_REG,
+- HCLGEVF_RING_TX_EBD_OFFSET_REG,
+- HCLGEVF_RING_TX_BD_ERR_REG,
+- HCLGEVF_RING_EN_REG};
+-
+-static const u32 tqp_intr_reg_addr_list[] = {HCLGEVF_TQP_INTR_CTRL_REG,
+- HCLGEVF_TQP_INTR_GL0_REG,
+- HCLGEVF_TQP_INTR_GL1_REG,
+- HCLGEVF_TQP_INTR_GL2_REG,
+- HCLGEVF_TQP_INTR_RL_REG};
+-
+ /* hclgevf_cmd_send - send command to command queue
+ * @hw: pointer to the hw struct
+ * @desc: prefilled descriptor for describing the command
+@@ -111,7 +60,7 @@ void hclgevf_arq_init(struct hclgevf_dev *hdev)
+ spin_unlock(&cmdq->crq.lock);
+ }
+
+-static struct hclgevf_dev *hclgevf_ae_get_hdev(struct hnae3_handle *handle)
++struct hclgevf_dev *hclgevf_ae_get_hdev(struct hnae3_handle *handle)
+ {
+ if (!handle->client)
+ return container_of(handle, struct hclgevf_dev, nic);
+@@ -3262,72 +3211,6 @@ static void hclgevf_get_link_mode(struct hnae3_handle *handle,
+ *advertising = hdev->hw.mac.advertising;
+ }
+
+-#define MAX_SEPARATE_NUM 4
+-#define SEPARATOR_VALUE 0xFDFCFBFA
+-#define REG_NUM_PER_LINE 4
+-#define REG_LEN_PER_LINE (REG_NUM_PER_LINE * sizeof(u32))
+-
+-static int hclgevf_get_regs_len(struct hnae3_handle *handle)
+-{
+- int cmdq_lines, common_lines, ring_lines, tqp_intr_lines;
+- struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
+-
+- cmdq_lines = sizeof(cmdq_reg_addr_list) / REG_LEN_PER_LINE + 1;
+- common_lines = sizeof(common_reg_addr_list) / REG_LEN_PER_LINE + 1;
+- ring_lines = sizeof(ring_reg_addr_list) / REG_LEN_PER_LINE + 1;
+- tqp_intr_lines = sizeof(tqp_intr_reg_addr_list) / REG_LEN_PER_LINE + 1;
+-
+- return (cmdq_lines + common_lines + ring_lines * hdev->num_tqps +
+- tqp_intr_lines * (hdev->num_msi_used - 1)) * REG_LEN_PER_LINE;
+-}
+-
+-static void hclgevf_get_regs(struct hnae3_handle *handle, u32 *version,
+- void *data)
+-{
+- struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
+- int i, j, reg_um, separator_num;
+- u32 *reg = data;
+-
+- *version = hdev->fw_version;
+-
+- /* fetching per-VF registers values from VF PCIe register space */
+- reg_um = sizeof(cmdq_reg_addr_list) / sizeof(u32);
+- separator_num = MAX_SEPARATE_NUM - reg_um % REG_NUM_PER_LINE;
+- for (i = 0; i < reg_um; i++)
+- *reg++ = hclgevf_read_dev(&hdev->hw, cmdq_reg_addr_list[i]);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+-
+- reg_um = sizeof(common_reg_addr_list) / sizeof(u32);
+- separator_num = MAX_SEPARATE_NUM - reg_um % REG_NUM_PER_LINE;
+- for (i = 0; i < reg_um; i++)
+- *reg++ = hclgevf_read_dev(&hdev->hw, common_reg_addr_list[i]);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+-
+- reg_um = sizeof(ring_reg_addr_list) / sizeof(u32);
+- separator_num = MAX_SEPARATE_NUM - reg_um % REG_NUM_PER_LINE;
+- for (j = 0; j < hdev->num_tqps; j++) {
+- for (i = 0; i < reg_um; i++)
+- *reg++ = hclgevf_read_dev(&hdev->hw,
+- ring_reg_addr_list[i] +
+- HCLGEVF_TQP_REG_SIZE * j);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+- }
+-
+- reg_um = sizeof(tqp_intr_reg_addr_list) / sizeof(u32);
+- separator_num = MAX_SEPARATE_NUM - reg_um % REG_NUM_PER_LINE;
+- for (j = 0; j < hdev->num_msi_used - 1; j++) {
+- for (i = 0; i < reg_um; i++)
+- *reg++ = hclgevf_read_dev(&hdev->hw,
+- tqp_intr_reg_addr_list[i] +
+- 4 * j);
+- for (i = 0; i < separator_num; i++)
+- *reg++ = SEPARATOR_VALUE;
+- }
+-}
+-
+ void hclgevf_update_port_base_vlan_info(struct hclgevf_dev *hdev, u16 state,
+ struct hclge_mbx_port_base_vlan *port_base_vlan)
+ {
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h
+index 59ca6c794d6db..81c16b8c8da29 100644
+--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h
+@@ -294,4 +294,5 @@ void hclgevf_reset_task_schedule(struct hclgevf_dev *hdev);
+ void hclgevf_mbx_task_schedule(struct hclgevf_dev *hdev);
+ void hclgevf_update_port_base_vlan_info(struct hclgevf_dev *hdev, u16 state,
+ struct hclge_mbx_port_base_vlan *port_base_vlan);
++struct hclgevf_dev *hclgevf_ae_get_hdev(struct hnae3_handle *handle);
+ #endif
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_regs.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_regs.c
+new file mode 100644
+index 0000000000000..197ab733306b5
+--- /dev/null
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_regs.c
+@@ -0,0 +1,127 @@
++// SPDX-License-Identifier: GPL-2.0+
++// Copyright (c) 2023 Hisilicon Limited.
++
++#include "hclgevf_main.h"
++#include "hclgevf_regs.h"
++#include "hnae3.h"
++
++static const u32 cmdq_reg_addr_list[] = {HCLGE_COMM_NIC_CSQ_BASEADDR_L_REG,
++ HCLGE_COMM_NIC_CSQ_BASEADDR_H_REG,
++ HCLGE_COMM_NIC_CSQ_DEPTH_REG,
++ HCLGE_COMM_NIC_CSQ_TAIL_REG,
++ HCLGE_COMM_NIC_CSQ_HEAD_REG,
++ HCLGE_COMM_NIC_CRQ_BASEADDR_L_REG,
++ HCLGE_COMM_NIC_CRQ_BASEADDR_H_REG,
++ HCLGE_COMM_NIC_CRQ_DEPTH_REG,
++ HCLGE_COMM_NIC_CRQ_TAIL_REG,
++ HCLGE_COMM_NIC_CRQ_HEAD_REG,
++ HCLGE_COMM_VECTOR0_CMDQ_SRC_REG,
++ HCLGE_COMM_VECTOR0_CMDQ_STATE_REG,
++ HCLGE_COMM_CMDQ_INTR_EN_REG,
++ HCLGE_COMM_CMDQ_INTR_GEN_REG};
++
++static const u32 common_reg_addr_list[] = {HCLGEVF_MISC_VECTOR_REG_BASE,
++ HCLGEVF_RST_ING,
++ HCLGEVF_GRO_EN_REG};
++
++static const u32 ring_reg_addr_list[] = {HCLGEVF_RING_RX_ADDR_L_REG,
++ HCLGEVF_RING_RX_ADDR_H_REG,
++ HCLGEVF_RING_RX_BD_NUM_REG,
++ HCLGEVF_RING_RX_BD_LENGTH_REG,
++ HCLGEVF_RING_RX_MERGE_EN_REG,
++ HCLGEVF_RING_RX_TAIL_REG,
++ HCLGEVF_RING_RX_HEAD_REG,
++ HCLGEVF_RING_RX_FBD_NUM_REG,
++ HCLGEVF_RING_RX_OFFSET_REG,
++ HCLGEVF_RING_RX_FBD_OFFSET_REG,
++ HCLGEVF_RING_RX_STASH_REG,
++ HCLGEVF_RING_RX_BD_ERR_REG,
++ HCLGEVF_RING_TX_ADDR_L_REG,
++ HCLGEVF_RING_TX_ADDR_H_REG,
++ HCLGEVF_RING_TX_BD_NUM_REG,
++ HCLGEVF_RING_TX_PRIORITY_REG,
++ HCLGEVF_RING_TX_TC_REG,
++ HCLGEVF_RING_TX_MERGE_EN_REG,
++ HCLGEVF_RING_TX_TAIL_REG,
++ HCLGEVF_RING_TX_HEAD_REG,
++ HCLGEVF_RING_TX_FBD_NUM_REG,
++ HCLGEVF_RING_TX_OFFSET_REG,
++ HCLGEVF_RING_TX_EBD_NUM_REG,
++ HCLGEVF_RING_TX_EBD_OFFSET_REG,
++ HCLGEVF_RING_TX_BD_ERR_REG,
++ HCLGEVF_RING_EN_REG};
++
++static const u32 tqp_intr_reg_addr_list[] = {HCLGEVF_TQP_INTR_CTRL_REG,
++ HCLGEVF_TQP_INTR_GL0_REG,
++ HCLGEVF_TQP_INTR_GL1_REG,
++ HCLGEVF_TQP_INTR_GL2_REG,
++ HCLGEVF_TQP_INTR_RL_REG};
++
++#define MAX_SEPARATE_NUM 4
++#define SEPARATOR_VALUE 0xFDFCFBFA
++#define REG_NUM_PER_LINE 4
++#define REG_LEN_PER_LINE (REG_NUM_PER_LINE * sizeof(u32))
++
++int hclgevf_get_regs_len(struct hnae3_handle *handle)
++{
++ int cmdq_lines, common_lines, ring_lines, tqp_intr_lines;
++ struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
++
++ cmdq_lines = sizeof(cmdq_reg_addr_list) / REG_LEN_PER_LINE + 1;
++ common_lines = sizeof(common_reg_addr_list) / REG_LEN_PER_LINE + 1;
++ ring_lines = sizeof(ring_reg_addr_list) / REG_LEN_PER_LINE + 1;
++ tqp_intr_lines = sizeof(tqp_intr_reg_addr_list) / REG_LEN_PER_LINE + 1;
++
++ return (cmdq_lines + common_lines + ring_lines * hdev->num_tqps +
++ tqp_intr_lines * (hdev->num_msi_used - 1)) * REG_LEN_PER_LINE;
++}
++
++void hclgevf_get_regs(struct hnae3_handle *handle, u32 *version,
++ void *data)
++{
++#define HCLGEVF_RING_REG_OFFSET 0x200
++#define HCLGEVF_RING_INT_REG_OFFSET 0x4
++
++ struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
++ int i, j, reg_um, separator_num;
++ u32 *reg = data;
++
++ *version = hdev->fw_version;
++
++ /* fetching per-VF registers values from VF PCIe register space */
++ reg_um = sizeof(cmdq_reg_addr_list) / sizeof(u32);
++ separator_num = MAX_SEPARATE_NUM - reg_um % REG_NUM_PER_LINE;
++ for (i = 0; i < reg_um; i++)
++ *reg++ = hclgevf_read_dev(&hdev->hw, cmdq_reg_addr_list[i]);
++ for (i = 0; i < separator_num; i++)
++ *reg++ = SEPARATOR_VALUE;
++
++ reg_um = sizeof(common_reg_addr_list) / sizeof(u32);
++ separator_num = MAX_SEPARATE_NUM - reg_um % REG_NUM_PER_LINE;
++ for (i = 0; i < reg_um; i++)
++ *reg++ = hclgevf_read_dev(&hdev->hw, common_reg_addr_list[i]);
++ for (i = 0; i < separator_num; i++)
++ *reg++ = SEPARATOR_VALUE;
++
++ reg_um = sizeof(ring_reg_addr_list) / sizeof(u32);
++ separator_num = MAX_SEPARATE_NUM - reg_um % REG_NUM_PER_LINE;
++ for (j = 0; j < hdev->num_tqps; j++) {
++ for (i = 0; i < reg_um; i++)
++ *reg++ = hclgevf_read_dev(&hdev->hw,
++ ring_reg_addr_list[i] +
++ HCLGEVF_RING_REG_OFFSET * j);
++ for (i = 0; i < separator_num; i++)
++ *reg++ = SEPARATOR_VALUE;
++ }
++
++ reg_um = sizeof(tqp_intr_reg_addr_list) / sizeof(u32);
++ separator_num = MAX_SEPARATE_NUM - reg_um % REG_NUM_PER_LINE;
++ for (j = 0; j < hdev->num_msi_used - 1; j++) {
++ for (i = 0; i < reg_um; i++)
++ *reg++ = hclgevf_read_dev(&hdev->hw,
++ tqp_intr_reg_addr_list[i] +
++ HCLGEVF_RING_INT_REG_OFFSET * j);
++ for (i = 0; i < separator_num; i++)
++ *reg++ = SEPARATOR_VALUE;
++ }
++}
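Unlike the PF side, hclgevf_regs.c above keeps the older unframed dump: each register group is followed by 0xFDFCFBFA separator words, MAX_SEPARATE_NUM minus the group size modulo REG_NUM_PER_LINE of them, so every group ends on a four-word boundary. A small sketch of that padding arithmetic:

#include <stdint.h>
#include <stdio.h>

#define MAX_SEPARATE_NUM  4
#define SEPARATOR_VALUE   0xFDFCFBFAu
#define REG_NUM_PER_LINE  4

/* Append a group of register values followed by separator words, the way
 * hclgevf_get_regs() pads each group; returns the number of u32s written. */
static size_t emit_padded_group(uint32_t *out, const uint32_t *vals, size_t n)
{
	size_t sep = MAX_SEPARATE_NUM - n % REG_NUM_PER_LINE;
	size_t i, k = 0;

	for (i = 0; i < n; i++)
		out[k++] = vals[i];
	for (i = 0; i < sep; i++)
		out[k++] = SEPARATOR_VALUE;   /* marks the end of the group */

	return k;
}

int main(void)
{
	uint32_t out[16];
	const uint32_t group[3] = { 0xa, 0xb, 0xc };

	/* 3 values + 1 separator -> 4 words, i.e. one full dump line */
	printf("words written: %zu\n", emit_padded_group(out, group, 3));
	return 0;
}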
+diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_regs.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_regs.h
+new file mode 100644
+index 0000000000000..77bdcf60a1afe
+--- /dev/null
++++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_regs.h
+@@ -0,0 +1,13 @@
++/* SPDX-License-Identifier: GPL-2.0+ */
++/* Copyright (c) 2023 Hisilicon Limited. */
++
++#ifndef __HCLGEVF_REGS_H
++#define __HCLGEVF_REGS_H
++#include <linux/types.h>
++
++struct hnae3_handle;
++
++int hclgevf_get_regs_len(struct hnae3_handle *handle);
++void hclgevf_get_regs(struct hnae3_handle *handle, u32 *version,
++ void *data);
++#endif
+diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
+index cfb76612bd2f9..66bd06ea0a703 100644
+--- a/drivers/net/ethernet/intel/ice/ice_main.c
++++ b/drivers/net/ethernet/intel/ice/ice_main.c
+@@ -1346,6 +1346,7 @@ int ice_aq_wait_for_event(struct ice_pf *pf, u16 opcode, unsigned long timeout,
+ static void ice_aq_check_events(struct ice_pf *pf, u16 opcode,
+ struct ice_rq_event_info *event)
+ {
++ struct ice_rq_event_info *task_ev;
+ struct ice_aq_task *task;
+ bool found = false;
+
+@@ -1354,15 +1355,15 @@ static void ice_aq_check_events(struct ice_pf *pf, u16 opcode,
+ if (task->state || task->opcode != opcode)
+ continue;
+
+- memcpy(&task->event->desc, &event->desc, sizeof(event->desc));
+- task->event->msg_len = event->msg_len;
++ task_ev = task->event;
++ memcpy(&task_ev->desc, &event->desc, sizeof(event->desc));
++ task_ev->msg_len = event->msg_len;
+
+ /* Only copy the data buffer if a destination was set */
+- if (task->event->msg_buf &&
+- task->event->buf_len > event->buf_len) {
+- memcpy(task->event->msg_buf, event->msg_buf,
++ if (task_ev->msg_buf && task_ev->buf_len >= event->buf_len) {
++ memcpy(task_ev->msg_buf, event->msg_buf,
+ event->buf_len);
+- task->event->buf_len = event->buf_len;
++ task_ev->buf_len = event->buf_len;
+ }
+
+ task->state = ICE_AQ_TASK_COMPLETE;
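The ice_aq_check_events() hunk above relaxes the copy condition from task->event->buf_len > event->buf_len to task_ev->buf_len >= event->buf_len: the caller's buffer only has to be at least as large as the received message, so an exact-size buffer is no longer skipped. A standalone C sketch of the corrected bounds check; the names are illustrative:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy a received message into a caller-supplied buffer only when the
 * buffer can hold it; an exact fit (dst_cap == msg_len) is acceptable,
 * which is what the '>' to '>=' change in the patch allows. */
static int copy_event_msg(uint8_t *dst, size_t dst_cap,
			  const uint8_t *msg, size_t msg_len)
{
	if (!dst || dst_cap < msg_len)
		return -1;             /* would also have been skipped before */

	memcpy(dst, msg, msg_len);
	return 0;
}

int main(void)
{
	uint8_t msg[8] = "payload";
	uint8_t buf[8];                /* exact-size destination */

	printf("exact fit copy: %d\n",
	       copy_event_msg(buf, sizeof(buf), msg, sizeof(msg)));
	return 0;
}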
+diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+index a38614d21ea8f..de1d83300481d 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
++++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+@@ -131,6 +131,8 @@ static void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
+ case READ_TIME:
+ cmd_val |= GLTSYN_CMD_READ_TIME;
+ break;
++ case ICE_PTP_NOP:
++ break;
+ }
+
+ wr32(hw, GLTSYN_CMD, cmd_val);
+@@ -1226,18 +1228,18 @@ ice_ptp_read_port_capture(struct ice_hw *hw, u8 port, u64 *tx_ts, u64 *rx_ts)
+ }
+
+ /**
+- * ice_ptp_one_port_cmd - Prepare a single PHY port for a timer command
++ * ice_ptp_write_port_cmd_e822 - Prepare a single PHY port for a timer command
+ * @hw: pointer to HW struct
+ * @port: Port to which cmd has to be sent
+ * @cmd: Command to be sent to the port
+ *
+ * Prepare the requested port for an upcoming timer sync command.
+ *
+- * Note there is no equivalent of this operation on E810, as that device
+- * always handles all external PHYs internally.
++ * Do not use this function directly. If you want to configure exactly one
++ * port, use ice_ptp_one_port_cmd() instead.
+ */
+ static int
+-ice_ptp_one_port_cmd(struct ice_hw *hw, u8 port, enum ice_ptp_tmr_cmd cmd)
++ice_ptp_write_port_cmd_e822(struct ice_hw *hw, u8 port, enum ice_ptp_tmr_cmd cmd)
+ {
+ u32 cmd_val, val;
+ u8 tmr_idx;
+@@ -1261,6 +1263,8 @@ ice_ptp_one_port_cmd(struct ice_hw *hw, u8 port, enum ice_ptp_tmr_cmd cmd)
+ case ADJ_TIME_AT_TIME:
+ cmd_val |= PHY_CMD_ADJ_TIME_AT_TIME;
+ break;
++ case ICE_PTP_NOP:
++ break;
+ }
+
+ /* Tx case */
+@@ -1306,6 +1310,39 @@ ice_ptp_one_port_cmd(struct ice_hw *hw, u8 port, enum ice_ptp_tmr_cmd cmd)
+ return 0;
+ }
+
++/**
++ * ice_ptp_one_port_cmd - Prepare one port for a timer command
++ * @hw: pointer to the HW struct
++ * @configured_port: the port to configure with configured_cmd
++ * @configured_cmd: timer command to prepare on the configured_port
++ *
++ * Prepare the configured_port for the configured_cmd, and prepare all other
++ * ports for ICE_PTP_NOP. This causes the configured_port to execute the
++ * desired command while all other ports perform no operation.
++ */
++static int
++ice_ptp_one_port_cmd(struct ice_hw *hw, u8 configured_port,
++ enum ice_ptp_tmr_cmd configured_cmd)
++{
++ u8 port;
++
++ for (port = 0; port < ICE_NUM_EXTERNAL_PORTS; port++) {
++ enum ice_ptp_tmr_cmd cmd;
++ int err;
++
++ if (port == configured_port)
++ cmd = configured_cmd;
++ else
++ cmd = ICE_PTP_NOP;
++
++ err = ice_ptp_write_port_cmd_e822(hw, port, cmd);
++ if (err)
++ return err;
++ }
++
++ return 0;
++}
++
+ /**
+ * ice_ptp_port_cmd_e822 - Prepare all ports for a timer command
+ * @hw: pointer to the HW struct
+@@ -1322,7 +1359,7 @@ ice_ptp_port_cmd_e822(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
+ for (port = 0; port < ICE_NUM_EXTERNAL_PORTS; port++) {
+ int err;
+
+- err = ice_ptp_one_port_cmd(hw, port, cmd);
++ err = ice_ptp_write_port_cmd_e822(hw, port, cmd);
+ if (err)
+ return err;
+ }
+@@ -2252,6 +2289,9 @@ static int ice_sync_phy_timer_e822(struct ice_hw *hw, u8 port)
+ if (err)
+ goto err_unlock;
+
++ /* Do not perform any action on the main timer */
++ ice_ptp_src_cmd(hw, ICE_PTP_NOP);
++
+ /* Issue the sync to activate the time adjustment */
+ ice_ptp_exec_tmr_cmd(hw);
+
+@@ -2372,6 +2412,9 @@ int ice_start_phy_timer_e822(struct ice_hw *hw, u8 port)
+ if (err)
+ return err;
+
++ /* Do not perform any action on the main timer */
++ ice_ptp_src_cmd(hw, ICE_PTP_NOP);
++
+ ice_ptp_exec_tmr_cmd(hw);
+
+ err = ice_read_phy_reg_e822(hw, port, P_REG_PS, &val);
+@@ -2847,6 +2890,8 @@ static int ice_ptp_port_cmd_e810(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
+ case ADJ_TIME_AT_TIME:
+ cmd_val = GLTSYN_CMD_ADJ_INIT_TIME;
+ break;
++ case ICE_PTP_NOP:
++ return 0;
+ }
+
+ /* Read, modify, write */
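The ice_ptp_hw.c changes above split the old helper into a raw ice_ptp_write_port_cmd_e822() and a new ice_ptp_one_port_cmd() that programs the requested command on one port while writing the new ICE_PTP_NOP command to every other port (and leaves the main timer alone via ice_ptp_src_cmd(hw, ICE_PTP_NOP)), so a subsequent SYNC only fires on the intended port. A hedged sketch of that "one active port, NOP the rest" loop; the port count and writer below are stand-ins:

#include <stdio.h>

enum tmr_cmd { CMD_INIT_TIME, CMD_ADJ_TIME, CMD_NOP };

#define NUM_PORTS 4

/* Stand-in for the per-port register write done by the driver. */
static int write_port_cmd(int port, enum tmr_cmd cmd)
{
	printf("port %d <- %s\n", port, cmd == CMD_NOP ? "NOP" : "cmd");
	return 0;
}

/* Prepare exactly one port for a timer command and NOP all others,
 * mirroring the structure of the new ice_ptp_one_port_cmd(). */
static int one_port_cmd(int configured_port, enum tmr_cmd configured_cmd)
{
	for (int port = 0; port < NUM_PORTS; port++) {
		enum tmr_cmd cmd =
			(port == configured_port) ? configured_cmd : CMD_NOP;
		int err = write_port_cmd(port, cmd);

		if (err)
			return err;
	}
	return 0;
}

int main(void)
{
	return one_port_cmd(2, CMD_ADJ_TIME);
}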
+diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
+index 3b68cb91bd819..096685237ca61 100644
+--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
++++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h
+@@ -9,7 +9,8 @@ enum ice_ptp_tmr_cmd {
+ INIT_INCVAL,
+ ADJ_TIME,
+ ADJ_TIME_AT_TIME,
+- READ_TIME
++ READ_TIME,
++ ICE_PTP_NOP,
+ };
+
+ enum ice_ptp_serdes {
+diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
+index ba5e1d1320f67..c7da6cfc00a33 100644
+--- a/drivers/net/ethernet/intel/igb/igb_main.c
++++ b/drivers/net/ethernet/intel/igb/igb_main.c
+@@ -4812,6 +4812,10 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
+ static void igb_set_rx_buffer_len(struct igb_adapter *adapter,
+ struct igb_ring *rx_ring)
+ {
++#if (PAGE_SIZE < 8192)
++ struct e1000_hw *hw = &adapter->hw;
++#endif
++
+ /* set build_skb and buffer size flags */
+ clear_ring_build_skb_enabled(rx_ring);
+ clear_ring_uses_large_buffer(rx_ring);
+@@ -4822,10 +4826,9 @@ static void igb_set_rx_buffer_len(struct igb_adapter *adapter,
+ set_ring_build_skb_enabled(rx_ring);
+
+ #if (PAGE_SIZE < 8192)
+- if (adapter->max_frame_size <= IGB_MAX_FRAME_BUILD_SKB)
+- return;
+-
+- set_ring_uses_large_buffer(rx_ring);
++ if (adapter->max_frame_size > IGB_MAX_FRAME_BUILD_SKB ||
++ rd32(E1000_RCTL) & E1000_RCTL_SBP)
++ set_ring_uses_large_buffer(rx_ring);
+ #endif
+ }
+
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rpm.c b/drivers/net/ethernet/marvell/octeontx2/af/rpm.c
+index b4fcb20c3f4fd..af21e2030cff2 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rpm.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rpm.c
+@@ -355,8 +355,8 @@ int rpm_lmac_enadis_pause_frm(void *rpmd, int lmac_id, u8 tx_pause,
+
+ void rpm_lmac_pause_frm_config(void *rpmd, int lmac_id, bool enable)
+ {
++ u64 cfg, pfc_class_mask_cfg;
+ rpm_t *rpm = rpmd;
+- u64 cfg;
+
+ /* ALL pause frames received are completely ignored */
+ cfg = rpm_read(rpm, lmac_id, RPMX_MTI_MAC100X_COMMAND_CONFIG);
+@@ -380,9 +380,11 @@ void rpm_lmac_pause_frm_config(void *rpmd, int lmac_id, bool enable)
+ rpm_write(rpm, 0, RPMX_CMR_CHAN_MSK_OR, ~0ULL);
+
+ /* Disable all PFC classes */
+- cfg = rpm_read(rpm, lmac_id, RPMX_CMRX_PRT_CBFC_CTL);
++ pfc_class_mask_cfg = is_dev_rpm2(rpm) ? RPM2_CMRX_PRT_CBFC_CTL :
++ RPMX_CMRX_PRT_CBFC_CTL;
++ cfg = rpm_read(rpm, lmac_id, pfc_class_mask_cfg);
+ cfg = FIELD_SET(RPM_PFC_CLASS_MASK, 0, cfg);
+- rpm_write(rpm, lmac_id, RPMX_CMRX_PRT_CBFC_CTL, cfg);
++ rpm_write(rpm, lmac_id, pfc_class_mask_cfg, cfg);
+ }
+
+ int rpm_get_rx_stats(void *rpmd, int lmac_id, int idx, u64 *rx_stat)
+@@ -605,8 +607,11 @@ int rpm_lmac_pfc_config(void *rpmd, int lmac_id, u8 tx_pause, u8 rx_pause, u16 p
+ if (!is_lmac_valid(rpm, lmac_id))
+ return -ENODEV;
+
++ pfc_class_mask_cfg = is_dev_rpm2(rpm) ? RPM2_CMRX_PRT_CBFC_CTL :
++ RPMX_CMRX_PRT_CBFC_CTL;
++
+ cfg = rpm_read(rpm, lmac_id, RPMX_MTI_MAC100X_COMMAND_CONFIG);
+- class_en = rpm_read(rpm, lmac_id, RPMX_CMRX_PRT_CBFC_CTL);
++ class_en = rpm_read(rpm, lmac_id, pfc_class_mask_cfg);
+ pfc_en |= FIELD_GET(RPM_PFC_CLASS_MASK, class_en);
+
+ if (rx_pause) {
+@@ -635,10 +640,6 @@ int rpm_lmac_pfc_config(void *rpmd, int lmac_id, u8 tx_pause, u8 rx_pause, u16 p
+ cfg |= RPMX_MTI_MAC100X_COMMAND_CONFIG_PFC_MODE;
+
+ rpm_write(rpm, lmac_id, RPMX_MTI_MAC100X_COMMAND_CONFIG, cfg);
+-
+- pfc_class_mask_cfg = is_dev_rpm2(rpm) ? RPM2_CMRX_PRT_CBFC_CTL :
+- RPMX_CMRX_PRT_CBFC_CTL;
+-
+ rpm_write(rpm, lmac_id, pfc_class_mask_cfg, class_en);
+
+ return 0;
+diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+index 49c1dbe5ec788..43f6d1b50d2ad 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
++++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c
+@@ -1691,6 +1691,42 @@ exit:
+ return true;
+ }
+
++static void nix_reset_tx_schedule(struct rvu *rvu, int blkaddr,
++ int lvl, int schq)
++{
++ u64 tlx_parent = 0, tlx_schedule = 0;
++
++ switch (lvl) {
++ case NIX_TXSCH_LVL_TL2:
++ tlx_parent = NIX_AF_TL2X_PARENT(schq);
++ tlx_schedule = NIX_AF_TL2X_SCHEDULE(schq);
++ break;
++ case NIX_TXSCH_LVL_TL3:
++ tlx_parent = NIX_AF_TL3X_PARENT(schq);
++ tlx_schedule = NIX_AF_TL3X_SCHEDULE(schq);
++ break;
++ case NIX_TXSCH_LVL_TL4:
++ tlx_parent = NIX_AF_TL4X_PARENT(schq);
++ tlx_schedule = NIX_AF_TL4X_SCHEDULE(schq);
++ break;
++ case NIX_TXSCH_LVL_MDQ:
++ /* no need to reset SMQ_CFG as HW clears this CSR
++ * on SMQ flush
++ */
++ tlx_parent = NIX_AF_MDQX_PARENT(schq);
++ tlx_schedule = NIX_AF_MDQX_SCHEDULE(schq);
++ break;
++ default:
++ return;
++ }
++
++ if (tlx_parent)
++ rvu_write64(rvu, blkaddr, tlx_parent, 0x0);
++
++ if (tlx_schedule)
++ rvu_write64(rvu, blkaddr, tlx_schedule, 0x0);
++}
++
+ /* Disable shaping of pkts by a scheduler queue
+ * at a given scheduler level.
+ */
+@@ -2040,6 +2076,7 @@ int rvu_mbox_handler_nix_txsch_alloc(struct rvu *rvu,
+ pfvf_map[schq] = TXSCH_MAP(pcifunc, 0);
+ nix_reset_tx_linkcfg(rvu, blkaddr, lvl, schq);
+ nix_reset_tx_shaping(rvu, blkaddr, nixlf, lvl, schq);
++ nix_reset_tx_schedule(rvu, blkaddr, lvl, schq);
+ }
+
+ for (idx = 0; idx < req->schq[lvl]; idx++) {
+@@ -2049,6 +2086,7 @@ int rvu_mbox_handler_nix_txsch_alloc(struct rvu *rvu,
+ pfvf_map[schq] = TXSCH_MAP(pcifunc, 0);
+ nix_reset_tx_linkcfg(rvu, blkaddr, lvl, schq);
+ nix_reset_tx_shaping(rvu, blkaddr, nixlf, lvl, schq);
++ nix_reset_tx_schedule(rvu, blkaddr, lvl, schq);
+ }
+ }
+
+@@ -2144,6 +2182,7 @@ static int nix_txschq_free(struct rvu *rvu, u16 pcifunc)
+ continue;
+ nix_reset_tx_linkcfg(rvu, blkaddr, lvl, schq);
+ nix_clear_tx_xoff(rvu, blkaddr, lvl, schq);
++ nix_reset_tx_shaping(rvu, blkaddr, nixlf, lvl, schq);
+ }
+ }
+ nix_clear_tx_xoff(rvu, blkaddr, NIX_TXSCH_LVL_TL1,
+@@ -2182,6 +2221,7 @@ static int nix_txschq_free(struct rvu *rvu, u16 pcifunc)
+ for (schq = 0; schq < txsch->schq.max; schq++) {
+ if (TXSCH_MAP_FUNC(txsch->pfvf_map[schq]) != pcifunc)
+ continue;
++ nix_reset_tx_schedule(rvu, blkaddr, lvl, schq);
+ rvu_free_rsrc(&txsch->schq, schq);
+ txsch->pfvf_map[schq] = TXSCH_MAP(0, NIX_TXSCHQ_FREE);
+ }
+@@ -2241,6 +2281,9 @@ static int nix_txschq_free_one(struct rvu *rvu,
+ */
+ nix_clear_tx_xoff(rvu, blkaddr, lvl, schq);
+
++ nix_reset_tx_linkcfg(rvu, blkaddr, lvl, schq);
++ nix_reset_tx_shaping(rvu, blkaddr, nixlf, lvl, schq);
++
+ /* Flush if it is a SMQ. Onus of disabling
+ * TL2/3 queue links before SMQ flush is on user
+ */
+@@ -2250,6 +2293,8 @@ static int nix_txschq_free_one(struct rvu *rvu,
+ goto err;
+ }
+
++ nix_reset_tx_schedule(rvu, blkaddr, lvl, schq);
++
+ /* Free the resource */
+ rvu_free_rsrc(&txsch->schq, schq);
+ txsch->pfvf_map[schq] = TXSCH_MAP(0, NIX_TXSCHQ_FREE);
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
+index 8a41ad8ca04f1..011355e73696e 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
+@@ -716,7 +716,8 @@ EXPORT_SYMBOL(otx2_smq_flush);
+ int otx2_txsch_alloc(struct otx2_nic *pfvf)
+ {
+ struct nix_txsch_alloc_req *req;
+- int lvl;
++ struct nix_txsch_alloc_rsp *rsp;
++ int lvl, schq, rc;
+
+ /* Get memory to put this msg */
+ req = otx2_mbox_alloc_msg_nix_txsch_alloc(&pfvf->mbox);
+@@ -726,33 +727,69 @@ int otx2_txsch_alloc(struct otx2_nic *pfvf)
+ /* Request one schq per level */
+ for (lvl = 0; lvl < NIX_TXSCH_LVL_CNT; lvl++)
+ req->schq[lvl] = 1;
++ rc = otx2_sync_mbox_msg(&pfvf->mbox);
++ if (rc)
++ return rc;
+
+- return otx2_sync_mbox_msg(&pfvf->mbox);
++ rsp = (struct nix_txsch_alloc_rsp *)
++ otx2_mbox_get_rsp(&pfvf->mbox.mbox, 0, &req->hdr);
++ if (IS_ERR(rsp))
++ return PTR_ERR(rsp);
++
++ /* Setup transmit scheduler list */
++ for (lvl = 0; lvl < NIX_TXSCH_LVL_CNT; lvl++)
++ for (schq = 0; schq < rsp->schq[lvl]; schq++)
++ pfvf->hw.txschq_list[lvl][schq] =
++ rsp->schq_list[lvl][schq];
++
++ pfvf->hw.txschq_link_cfg_lvl = rsp->link_cfg_lvl;
++
++ return 0;
+ }
+
+-int otx2_txschq_stop(struct otx2_nic *pfvf)
++void otx2_txschq_free_one(struct otx2_nic *pfvf, u16 lvl, u16 schq)
+ {
+ struct nix_txsch_free_req *free_req;
+- int lvl, schq, err;
++ int err;
+
+ mutex_lock(&pfvf->mbox.lock);
+- /* Free the transmit schedulers */
++
+ free_req = otx2_mbox_alloc_msg_nix_txsch_free(&pfvf->mbox);
+ if (!free_req) {
+ mutex_unlock(&pfvf->mbox.lock);
+- return -ENOMEM;
++ netdev_err(pfvf->netdev,
++ "Failed alloc txschq free req\n");
++ return;
+ }
+
+- free_req->flags = TXSCHQ_FREE_ALL;
++ free_req->schq_lvl = lvl;
++ free_req->schq = schq;
++
+ err = otx2_sync_mbox_msg(&pfvf->mbox);
++ if (err) {
++ netdev_err(pfvf->netdev,
++ "Failed stop txschq %d at level %d\n", schq, lvl);
++ }
++
+ mutex_unlock(&pfvf->mbox.lock);
++}
++EXPORT_SYMBOL(otx2_txschq_free_one);
++
++void otx2_txschq_stop(struct otx2_nic *pfvf)
++{
++ int lvl, schq;
++
++ /* free non QOS TLx nodes */
++ for (lvl = 0; lvl < NIX_TXSCH_LVL_CNT; lvl++)
++ otx2_txschq_free_one(pfvf, lvl,
++ pfvf->hw.txschq_list[lvl][0]);
+
+ /* Clear the txschq list */
+ for (lvl = 0; lvl < NIX_TXSCH_LVL_CNT; lvl++) {
+ for (schq = 0; schq < MAX_TXSCHQ_PER_FUNC; schq++)
+ pfvf->hw.txschq_list[lvl][schq] = 0;
+ }
+- return err;
++
+ }
+
+ void otx2_sqb_flush(struct otx2_nic *pfvf)
+@@ -1629,21 +1666,6 @@ void mbox_handler_cgx_fec_stats(struct otx2_nic *pfvf,
+ pfvf->hw.cgx_fec_uncorr_blks += rsp->fec_uncorr_blks;
+ }
+
+-void mbox_handler_nix_txsch_alloc(struct otx2_nic *pf,
+- struct nix_txsch_alloc_rsp *rsp)
+-{
+- int lvl, schq;
+-
+- /* Setup transmit scheduler list */
+- for (lvl = 0; lvl < NIX_TXSCH_LVL_CNT; lvl++)
+- for (schq = 0; schq < rsp->schq[lvl]; schq++)
+- pf->hw.txschq_list[lvl][schq] =
+- rsp->schq_list[lvl][schq];
+-
+- pf->hw.txschq_link_cfg_lvl = rsp->link_cfg_lvl;
+-}
+-EXPORT_SYMBOL(mbox_handler_nix_txsch_alloc);
+-
+ void mbox_handler_npa_lf_alloc(struct otx2_nic *pfvf,
+ struct npa_lf_alloc_rsp *rsp)
+ {
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
+index 0c8fc66ade82d..53cf964fc3e14 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
+@@ -920,7 +920,8 @@ int otx2_config_nix(struct otx2_nic *pfvf);
+ int otx2_config_nix_queues(struct otx2_nic *pfvf);
+ int otx2_txschq_config(struct otx2_nic *pfvf, int lvl, int prio, bool pfc_en);
+ int otx2_txsch_alloc(struct otx2_nic *pfvf);
+-int otx2_txschq_stop(struct otx2_nic *pfvf);
++void otx2_txschq_stop(struct otx2_nic *pfvf);
++void otx2_txschq_free_one(struct otx2_nic *pfvf, u16 lvl, u16 schq);
+ void otx2_sqb_flush(struct otx2_nic *pfvf);
+ int __otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool,
+ dma_addr_t *dma);
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c
+index ccaf97bb1ce03..bfddbff7bcdfb 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c
+@@ -70,7 +70,7 @@ static int otx2_pfc_txschq_alloc_one(struct otx2_nic *pfvf, u8 prio)
+ * link config level. These rest of the scheduler can be
+ * same as hw.txschq_list.
+ */
+- for (lvl = 0; lvl < pfvf->hw.txschq_link_cfg_lvl; lvl++)
++ for (lvl = 0; lvl <= pfvf->hw.txschq_link_cfg_lvl; lvl++)
+ req->schq[lvl] = 1;
+
+ rc = otx2_sync_mbox_msg(&pfvf->mbox);
+@@ -83,7 +83,7 @@ static int otx2_pfc_txschq_alloc_one(struct otx2_nic *pfvf, u8 prio)
+ return PTR_ERR(rsp);
+
+ /* Setup transmit scheduler list */
+- for (lvl = 0; lvl < pfvf->hw.txschq_link_cfg_lvl; lvl++) {
++ for (lvl = 0; lvl <= pfvf->hw.txschq_link_cfg_lvl; lvl++) {
+ if (!rsp->schq[lvl])
+ return -ENOSPC;
+
+@@ -125,19 +125,12 @@ int otx2_pfc_txschq_alloc(struct otx2_nic *pfvf)
+
+ static int otx2_pfc_txschq_stop_one(struct otx2_nic *pfvf, u8 prio)
+ {
+- struct nix_txsch_free_req *free_req;
++ int lvl;
+
+- mutex_lock(&pfvf->mbox.lock);
+ /* free PFC TLx nodes */
+- free_req = otx2_mbox_alloc_msg_nix_txsch_free(&pfvf->mbox);
+- if (!free_req) {
+- mutex_unlock(&pfvf->mbox.lock);
+- return -ENOMEM;
+- }
+-
+- free_req->flags = TXSCHQ_FREE_ALL;
+- otx2_sync_mbox_msg(&pfvf->mbox);
+- mutex_unlock(&pfvf->mbox.lock);
++ for (lvl = 0; lvl <= pfvf->hw.txschq_link_cfg_lvl; lvl++)
++ otx2_txschq_free_one(pfvf, lvl,
++ pfvf->pfc_schq_list[lvl][prio]);
+
+ pfvf->pfc_alloc_status[prio] = false;
+ return 0;
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+index 384d26bee9b23..fb951e953df34 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c
+@@ -791,10 +791,6 @@ static void otx2_process_pfaf_mbox_msg(struct otx2_nic *pf,
+ case MBOX_MSG_NIX_LF_ALLOC:
+ mbox_handler_nix_lf_alloc(pf, (struct nix_lf_alloc_rsp *)msg);
+ break;
+- case MBOX_MSG_NIX_TXSCH_ALLOC:
+- mbox_handler_nix_txsch_alloc(pf,
+- (struct nix_txsch_alloc_rsp *)msg);
+- break;
+ case MBOX_MSG_NIX_BP_ENABLE:
+ mbox_handler_nix_bp_enable(pf, (struct nix_bp_cfg_rsp *)msg);
+ break;
+@@ -1517,8 +1513,7 @@ err_free_nix_queues:
+ otx2_free_cq_res(pf);
+ otx2_ctx_disable(mbox, NIX_AQ_CTYPE_RQ, false);
+ err_free_txsch:
+- if (otx2_txschq_stop(pf))
+- dev_err(pf->dev, "%s failed to stop TX schedulers\n", __func__);
++ otx2_txschq_stop(pf);
+ err_free_sq_ptrs:
+ otx2_sq_free_sqbs(pf);
+ err_free_rq_ptrs:
+@@ -1553,15 +1548,13 @@ static void otx2_free_hw_resources(struct otx2_nic *pf)
+ struct mbox *mbox = &pf->mbox;
+ struct otx2_cq_queue *cq;
+ struct msg_req *req;
+- int qidx, err;
++ int qidx;
+
+ /* Ensure all SQE are processed */
+ otx2_sqb_flush(pf);
+
+ /* Stop transmission */
+- err = otx2_txschq_stop(pf);
+- if (err)
+- dev_err(pf->dev, "RVUPF: Failed to stop/free TX schedulers\n");
++ otx2_txschq_stop(pf);
+
+ #ifdef CONFIG_DCB
+ if (pf->pfc_en)
+diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
+index 53366dbfbf27c..f8f0c01f62a14 100644
+--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
++++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_vf.c
+@@ -70,10 +70,6 @@ static void otx2vf_process_vfaf_mbox_msg(struct otx2_nic *vf,
+ case MBOX_MSG_NIX_LF_ALLOC:
+ mbox_handler_nix_lf_alloc(vf, (struct nix_lf_alloc_rsp *)msg);
+ break;
+- case MBOX_MSG_NIX_TXSCH_ALLOC:
+- mbox_handler_nix_txsch_alloc(vf,
+- (struct nix_txsch_alloc_rsp *)msg);
+- break;
+ case MBOX_MSG_NIX_BP_ENABLE:
+ mbox_handler_nix_bp_enable(vf, (struct nix_bp_cfg_rsp *)msg);
+ break;
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+index 50022e7565f14..f202150a5093c 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+@@ -332,16 +332,11 @@ static int mlx5_pci_link_toggle(struct mlx5_core_dev *dev)
+ pci_cfg_access_lock(sdev);
+ }
+ /* PCI link toggle */
+-	err = pci_read_config_word(bridge, cap + PCI_EXP_LNKCTL, &reg16);
+- if (err)
+- return err;
+- reg16 |= PCI_EXP_LNKCTL_LD;
+- err = pci_write_config_word(bridge, cap + PCI_EXP_LNKCTL, reg16);
++ err = pcie_capability_set_word(bridge, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_LD);
+ if (err)
+ return err;
+ msleep(500);
+- reg16 &= ~PCI_EXP_LNKCTL_LD;
+- err = pci_write_config_word(bridge, cap + PCI_EXP_LNKCTL, reg16);
++ err = pcie_capability_clear_word(bridge, PCI_EXP_LNKCTL, PCI_EXP_LNKCTL_LD);
+ if (err)
+ return err;
+
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+index dba4c5e2f7667..94a1635ecdd47 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+@@ -32,16 +32,13 @@
+
+ #include <linux/clocksource.h>
+ #include <linux/highmem.h>
++#include <linux/log2.h>
+ #include <linux/ptp_clock_kernel.h>
+ #include <rdma/mlx5-abi.h>
+ #include "lib/eq.h"
+ #include "en.h"
+ #include "clock.h"
+
+-enum {
+- MLX5_CYCLES_SHIFT = 31
+-};
+-
+ enum {
+ MLX5_PIN_MODE_IN = 0x0,
+ MLX5_PIN_MODE_OUT = 0x1,
+@@ -93,6 +90,31 @@ static bool mlx5_modify_mtutc_allowed(struct mlx5_core_dev *mdev)
+ return MLX5_CAP_MCAM_FEATURE(mdev, ptpcyc2realtime_modify);
+ }
+
++static u32 mlx5_ptp_shift_constant(u32 dev_freq_khz)
++{
++ /* Optimal shift constant leads to corrections above just 1 scaled ppm.
++ *
++ * Two sets of equations are needed to derive the optimal shift
++ * constant for the cyclecounter.
++ *
++ * dev_freq_khz * 1000 / 2^shift_constant = 1 scaled_ppm
++ * ppb = scaled_ppm * 1000 / 2^16
++ *
++ * Using the two equations together
++ *
++ * dev_freq_khz * 1000 / 1 scaled_ppm = 2^shift_constant
++ * dev_freq_khz * 2^16 / 1 ppb = 2^shift_constant
++ * dev_freq_khz = 2^(shift_constant - 16)
++ *
++ * then yields
++ *
++ * shift_constant = ilog2(dev_freq_khz) + 16
++ */
++
++ return min(ilog2(dev_freq_khz) + 16,
++ ilog2((U32_MAX / NSEC_PER_MSEC) * dev_freq_khz));
++}
++
+ static bool mlx5_is_mtutc_time_adj_cap(struct mlx5_core_dev *mdev, s64 delta)
+ {
+ s64 min = MLX5_MTUTC_OPERATION_ADJUST_TIME_MIN;
+@@ -910,7 +932,7 @@ static void mlx5_timecounter_init(struct mlx5_core_dev *mdev)
+
+ dev_freq = MLX5_CAP_GEN(mdev, device_frequency_khz);
+ timer->cycles.read = read_internal_timer;
+- timer->cycles.shift = MLX5_CYCLES_SHIFT;
++ timer->cycles.shift = mlx5_ptp_shift_constant(dev_freq);
+ timer->cycles.mult = clocksource_khz2mult(dev_freq,
+ timer->cycles.shift);
+ timer->nominal_c_mult = timer->cycles.mult;
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
+index 70735068cf292..0fd290d776ffe 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/core_hwmon.c
+@@ -405,7 +405,8 @@ mlxsw_hwmon_module_temp_label_show(struct device *dev,
+ container_of(attr, struct mlxsw_hwmon_attr, dev_attr);
+
+ return sprintf(buf, "front panel %03u\n",
+- mlxsw_hwmon_attr->type_index);
++ mlxsw_hwmon_attr->type_index + 1 -
++ mlxsw_hwmon_attr->mlxsw_hwmon_dev->sensor_count);
+ }
+
+ static ssize_t
+diff --git a/drivers/net/ethernet/mellanox/mlxsw/i2c.c b/drivers/net/ethernet/mellanox/mlxsw/i2c.c
+index 2c586c2308aef..4fac27c36ad85 100644
+--- a/drivers/net/ethernet/mellanox/mlxsw/i2c.c
++++ b/drivers/net/ethernet/mellanox/mlxsw/i2c.c
+@@ -48,6 +48,7 @@
+ #define MLXSW_I2C_MBOX_SIZE_BITS 12
+ #define MLXSW_I2C_ADDR_BUF_SIZE 4
+ #define MLXSW_I2C_BLK_DEF 32
++#define MLXSW_I2C_BLK_MAX 100
+ #define MLXSW_I2C_RETRY 5
+ #define MLXSW_I2C_TIMEOUT_MSECS 5000
+ #define MLXSW_I2C_MAX_DATA_SIZE 256
+@@ -444,7 +445,7 @@ mlxsw_i2c_cmd(struct device *dev, u16 opcode, u32 in_mod, size_t in_mbox_size,
+ } else {
+ /* No input mailbox is case of initialization query command. */
+ reg_size = MLXSW_I2C_MAX_DATA_SIZE;
+- num = reg_size / mlxsw_i2c->block_size;
++ num = DIV_ROUND_UP(reg_size, mlxsw_i2c->block_size);
+
+ if (mutex_lock_interruptible(&mlxsw_i2c->cmd.lock) < 0) {
+ dev_err(&client->dev, "Could not acquire lock");
+@@ -653,7 +654,7 @@ static int mlxsw_i2c_probe(struct i2c_client *client)
+ return -EOPNOTSUPP;
+ }
+
+- mlxsw_i2c->block_size = max_t(u16, MLXSW_I2C_BLK_DEF,
++ mlxsw_i2c->block_size = min_t(u16, MLXSW_I2C_BLK_MAX,
+ min_t(u16, quirks->max_read_len,
+ quirks->max_write_len));
+ } else {
+diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c b/drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c
+index 266a21a2d1246..1da2b1f82ae93 100644
+--- a/drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c
++++ b/drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c
+@@ -59,7 +59,7 @@ static int lan966x_ptp_add_trap(struct lan966x_port *port,
+ int err;
+
+ vrule = vcap_get_rule(lan966x->vcap_ctrl, rule_id);
+- if (vrule) {
++ if (!IS_ERR(vrule)) {
+ u32 value, mask;
+
+ /* Just modify the ingress port mask and exit */
+@@ -106,7 +106,7 @@ static int lan966x_ptp_del_trap(struct lan966x_port *port,
+ int err;
+
+ vrule = vcap_get_rule(lan966x->vcap_ctrl, rule_id);
+- if (!vrule)
++ if (IS_ERR(vrule))
+ return -EEXIST;
+
+ vcap_rule_get_key_u32(vrule, VCAP_KF_IF_IGR_PORT_MASK, &value, &mask);
+diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
+index 31fdecb414b6f..d813613a670b6 100644
+--- a/drivers/net/ethernet/realtek/r8169_main.c
++++ b/drivers/net/ethernet/realtek/r8169_main.c
+@@ -5241,13 +5241,9 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
+
+ /* Disable ASPM L1 as that cause random device stop working
+ * problems as well as full system hangs for some PCIe devices users.
+- * Chips from RTL8168h partially have issues with L1.2, but seem
+- * to work fine with L1 and L1.1.
+ */
+ if (rtl_aspm_is_safe(tp))
+ rc = 0;
+- else if (tp->mac_version >= RTL_GIGA_MAC_VER_46)
+- rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1_2);
+ else
+ rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1);
+ tp->aspm_manageable = !rc;
+diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c
+index 0c40571133cb9..00cf6de3bb2be 100644
+--- a/drivers/net/ethernet/sfc/ptp.c
++++ b/drivers/net/ethernet/sfc/ptp.c
+@@ -1485,7 +1485,9 @@ static int efx_ptp_insert_multicast_filters(struct efx_nic *efx)
+ goto fail;
+
+ rc = efx_ptp_insert_eth_multicast_filter(efx);
+- if (rc < 0)
++
++ /* Not all firmware variants support this filter */
++ if (rc < 0 && rc != -EPROTONOSUPPORT)
+ goto fail;
+ }
+
+diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
+index 144ec756c796a..2d64650f4eb3c 100644
+--- a/drivers/net/macsec.c
++++ b/drivers/net/macsec.c
+@@ -1341,8 +1341,7 @@ static struct crypto_aead *macsec_alloc_tfm(char *key, int key_len, int icv_len)
+ struct crypto_aead *tfm;
+ int ret;
+
+- /* Pick a sync gcm(aes) cipher to ensure order is preserved. */
+- tfm = crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
++ tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
+
+ if (IS_ERR(tfm))
+ return tfm;
+diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
+index 9372e5a4cadcf..5093fc82a0248 100644
+--- a/drivers/net/phy/sfp-bus.c
++++ b/drivers/net/phy/sfp-bus.c
+@@ -258,6 +258,16 @@ void sfp_parse_support(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
+ switch (id->base.extended_cc) {
+ case SFF8024_ECC_UNSPEC:
+ break;
++ case SFF8024_ECC_100G_25GAUI_C2M_AOC:
++ if (br_min <= 28000 && br_max >= 25000) {
++ /* 25GBASE-R, possibly with FEC */
++ __set_bit(PHY_INTERFACE_MODE_25GBASER, interfaces);
++ /* There is currently no link mode for 25000base
++ * with unspecified range, reuse SR.
++ */
++ phylink_set(modes, 25000baseSR_Full);
++ }
++ break;
+ case SFF8024_ECC_100GBASE_SR4_25GBASE_SR:
+ phylink_set(modes, 100000baseSR4_Full);
+ phylink_set(modes, 25000baseSR_Full);
+diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
+index 2e7c7b0cdc549..c1bcd2ab1488e 100644
+--- a/drivers/net/usb/qmi_wwan.c
++++ b/drivers/net/usb/qmi_wwan.c
+@@ -1423,6 +1423,7 @@ static const struct usb_device_id products[] = {
+ {QMI_QUIRK_SET_DTR(0x2c7c, 0x0191, 4)}, /* Quectel EG91 */
+ {QMI_QUIRK_SET_DTR(0x2c7c, 0x0195, 4)}, /* Quectel EG95 */
+ {QMI_FIXED_INTF(0x2c7c, 0x0296, 4)}, /* Quectel BG96 */
++ {QMI_QUIRK_SET_DTR(0x2c7c, 0x030e, 4)}, /* Quectel EM05GV2 */
+ {QMI_QUIRK_SET_DTR(0x2cb7, 0x0104, 4)}, /* Fibocom NL678 series */
+ {QMI_FIXED_INTF(0x0489, 0xe0b4, 0)}, /* Foxconn T77W968 LTE */
+ {QMI_FIXED_INTF(0x0489, 0xe0b5, 0)}, /* Foxconn T77W968 LTE with eSIM support*/
+diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
+index a7f44f6335fb8..9275a672f90cb 100644
+--- a/drivers/net/wireless/ath/ath10k/pci.c
++++ b/drivers/net/wireless/ath/ath10k/pci.c
+@@ -1963,8 +1963,9 @@ static int ath10k_pci_hif_start(struct ath10k *ar)
+ ath10k_pci_irq_enable(ar);
+ ath10k_pci_rx_post(ar);
+
+- pcie_capability_write_word(ar_pci->pdev, PCI_EXP_LNKCTL,
+- ar_pci->link_ctl);
++ pcie_capability_clear_and_set_word(ar_pci->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_ASPMC,
++ ar_pci->link_ctl & PCI_EXP_LNKCTL_ASPMC);
+
+ return 0;
+ }
+@@ -2821,8 +2822,8 @@ static int ath10k_pci_hif_power_up(struct ath10k *ar,
+
+ pcie_capability_read_word(ar_pci->pdev, PCI_EXP_LNKCTL,
+ &ar_pci->link_ctl);
+- pcie_capability_write_word(ar_pci->pdev, PCI_EXP_LNKCTL,
+- ar_pci->link_ctl & ~PCI_EXP_LNKCTL_ASPMC);
++ pcie_capability_clear_word(ar_pci->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_ASPMC);
+
+ /*
+ * Bring the target up cleanly.
+diff --git a/drivers/net/wireless/ath/ath11k/dp_rx.c b/drivers/net/wireless/ath/ath11k/dp_rx.c
+index f67ce62b2b48d..c5ff1bc02999e 100644
+--- a/drivers/net/wireless/ath/ath11k/dp_rx.c
++++ b/drivers/net/wireless/ath/ath11k/dp_rx.c
+@@ -2408,7 +2408,7 @@ static void ath11k_dp_rx_h_ppdu(struct ath11k *ar, struct hal_rx_desc *rx_desc,
+ rx_status->freq = center_freq;
+ } else if (channel_num >= 1 && channel_num <= 14) {
+ rx_status->band = NL80211_BAND_2GHZ;
+- } else if (channel_num >= 36 && channel_num <= 173) {
++ } else if (channel_num >= 36 && channel_num <= 177) {
+ rx_status->band = NL80211_BAND_5GHZ;
+ } else {
+ spin_lock_bh(&ar->data_lock);
+diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c
+index 7b33731a50ee7..6ba4cef6b1c7d 100644
+--- a/drivers/net/wireless/ath/ath11k/pci.c
++++ b/drivers/net/wireless/ath/ath11k/pci.c
+@@ -581,8 +581,8 @@ static void ath11k_pci_aspm_disable(struct ath11k_pci *ab_pci)
+ u16_get_bits(ab_pci->link_ctl, PCI_EXP_LNKCTL_ASPM_L1));
+
+ /* disable L0s and L1 */
+- pcie_capability_write_word(ab_pci->pdev, PCI_EXP_LNKCTL,
+- ab_pci->link_ctl & ~PCI_EXP_LNKCTL_ASPMC);
++ pcie_capability_clear_word(ab_pci->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_ASPMC);
+
+ set_bit(ATH11K_PCI_ASPM_RESTORE, &ab_pci->flags);
+ }
+@@ -590,8 +590,10 @@ static void ath11k_pci_aspm_disable(struct ath11k_pci *ab_pci)
+ static void ath11k_pci_aspm_restore(struct ath11k_pci *ab_pci)
+ {
+ if (test_and_clear_bit(ATH11K_PCI_ASPM_RESTORE, &ab_pci->flags))
+- pcie_capability_write_word(ab_pci->pdev, PCI_EXP_LNKCTL,
+- ab_pci->link_ctl);
++ pcie_capability_clear_and_set_word(ab_pci->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_ASPMC,
++ ab_pci->link_ctl &
++ PCI_EXP_LNKCTL_ASPMC);
+ }
+
+ static int ath11k_pci_power_up(struct ath11k_base *ab)
+diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c
+index 58acfe8fdf8c0..faccea2d8148c 100644
+--- a/drivers/net/wireless/ath/ath12k/mac.c
++++ b/drivers/net/wireless/ath/ath12k/mac.c
+@@ -1634,9 +1634,9 @@ static void ath12k_peer_assoc_h_he(struct ath12k *ar,
+ arg->peer_nss = min(sta->deflink.rx_nss, max_nss);
+
+ memcpy(&arg->peer_he_cap_macinfo, he_cap->he_cap_elem.mac_cap_info,
+- sizeof(arg->peer_he_cap_macinfo));
++ sizeof(he_cap->he_cap_elem.mac_cap_info));
+ memcpy(&arg->peer_he_cap_phyinfo, he_cap->he_cap_elem.phy_cap_info,
+- sizeof(arg->peer_he_cap_phyinfo));
++ sizeof(he_cap->he_cap_elem.phy_cap_info));
+ arg->peer_he_ops = vif->bss_conf.he_oper.params;
+
+ /* the top most byte is used to indicate BSS color info */
+diff --git a/drivers/net/wireless/ath/ath12k/pci.c b/drivers/net/wireless/ath/ath12k/pci.c
+index 9f174daf324c9..e1e45eb50f3e3 100644
+--- a/drivers/net/wireless/ath/ath12k/pci.c
++++ b/drivers/net/wireless/ath/ath12k/pci.c
+@@ -794,8 +794,8 @@ static void ath12k_pci_aspm_disable(struct ath12k_pci *ab_pci)
+ u16_get_bits(ab_pci->link_ctl, PCI_EXP_LNKCTL_ASPM_L1));
+
+ /* disable L0s and L1 */
+- pcie_capability_write_word(ab_pci->pdev, PCI_EXP_LNKCTL,
+- ab_pci->link_ctl & ~PCI_EXP_LNKCTL_ASPMC);
++ pcie_capability_clear_word(ab_pci->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_ASPMC);
+
+ set_bit(ATH12K_PCI_ASPM_RESTORE, &ab_pci->flags);
+ }
+@@ -803,8 +803,10 @@ static void ath12k_pci_aspm_disable(struct ath12k_pci *ab_pci)
+ static void ath12k_pci_aspm_restore(struct ath12k_pci *ab_pci)
+ {
+ if (test_and_clear_bit(ATH12K_PCI_ASPM_RESTORE, &ab_pci->flags))
+- pcie_capability_write_word(ab_pci->pdev, PCI_EXP_LNKCTL,
+- ab_pci->link_ctl);
++ pcie_capability_clear_and_set_word(ab_pci->pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_ASPMC,
++ ab_pci->link_ctl &
++ PCI_EXP_LNKCTL_ASPMC);
+ }
+
+ static void ath12k_pci_kill_tasklets(struct ath12k_base *ab)
+diff --git a/drivers/net/wireless/ath/ath12k/wmi.c b/drivers/net/wireless/ath/ath12k/wmi.c
+index 7ae0bb78b2b53..1e65e35b5f3a6 100644
+--- a/drivers/net/wireless/ath/ath12k/wmi.c
++++ b/drivers/net/wireless/ath/ath12k/wmi.c
+@@ -2144,8 +2144,7 @@ int ath12k_wmi_send_scan_start_cmd(struct ath12k *ar,
+ struct wmi_tlv *tlv;
+ void *ptr;
+ int i, ret, len;
+- u32 *tmp_ptr;
+- u8 extraie_len_with_pad = 0;
++ u32 *tmp_ptr, extraie_len_with_pad = 0;
+ struct ath12k_wmi_hint_short_ssid_arg *s_ssid = NULL;
+ struct ath12k_wmi_hint_bssid_arg *hint_bssid = NULL;
+
+diff --git a/drivers/net/wireless/ath/ath6kl/Makefile b/drivers/net/wireless/ath/ath6kl/Makefile
+index a75bfa9fd1cfd..dc2b3b46781e1 100644
+--- a/drivers/net/wireless/ath/ath6kl/Makefile
++++ b/drivers/net/wireless/ath/ath6kl/Makefile
+@@ -36,11 +36,6 @@ ath6kl_core-y += wmi.o
+ ath6kl_core-y += core.o
+ ath6kl_core-y += recovery.o
+
+-# FIXME: temporarily silence -Wdangling-pointer on non W=1+ builds
+-ifndef KBUILD_EXTRA_WARN
+-CFLAGS_htc_mbox.o += $(call cc-disable-warning, dangling-pointer)
+-endif
+-
+ ath6kl_core-$(CONFIG_NL80211_TESTMODE) += testmode.o
+ ath6kl_core-$(CONFIG_ATH6KL_TRACING) += trace.o
+
+diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_debug.c b/drivers/net/wireless/ath/ath9k/htc_drv_debug.c
+index b3ed65e5c4da8..c55aab01fff5d 100644
+--- a/drivers/net/wireless/ath/ath9k/htc_drv_debug.c
++++ b/drivers/net/wireless/ath/ath9k/htc_drv_debug.c
+@@ -491,7 +491,7 @@ int ath9k_htc_init_debug(struct ath_hw *ah)
+
+ priv->debug.debugfs_phy = debugfs_create_dir(KBUILD_MODNAME,
+ priv->hw->wiphy->debugfsdir);
+- if (!priv->debug.debugfs_phy)
++ if (IS_ERR(priv->debug.debugfs_phy))
+ return -ENOMEM;
+
+ ath9k_cmn_spectral_init_debug(&priv->spec_priv, priv->debug.debugfs_phy);
+diff --git a/drivers/net/wireless/ath/ath9k/wmi.c b/drivers/net/wireless/ath/ath9k/wmi.c
+index d652c647d56b5..1476b42b52a91 100644
+--- a/drivers/net/wireless/ath/ath9k/wmi.c
++++ b/drivers/net/wireless/ath/ath9k/wmi.c
+@@ -242,10 +242,10 @@ static void ath9k_wmi_ctrl_rx(void *priv, struct sk_buff *skb,
+ spin_unlock_irqrestore(&wmi->wmi_lock, flags);
+ goto free_skb;
+ }
+- spin_unlock_irqrestore(&wmi->wmi_lock, flags);
+
+ /* WMI command response */
+ ath9k_wmi_rsp_callback(wmi, skb);
++ spin_unlock_irqrestore(&wmi->wmi_lock, flags);
+
+ free_skb:
+ kfree_skb(skb);
+@@ -283,7 +283,8 @@ int ath9k_wmi_connect(struct htc_target *htc, struct wmi *wmi,
+
+ static int ath9k_wmi_cmd_issue(struct wmi *wmi,
+ struct sk_buff *skb,
+- enum wmi_cmd_id cmd, u16 len)
++ enum wmi_cmd_id cmd, u16 len,
++ u8 *rsp_buf, u32 rsp_len)
+ {
+ struct wmi_cmd_hdr *hdr;
+ unsigned long flags;
+@@ -293,6 +294,11 @@ static int ath9k_wmi_cmd_issue(struct wmi *wmi,
+ hdr->seq_no = cpu_to_be16(++wmi->tx_seq_id);
+
+ spin_lock_irqsave(&wmi->wmi_lock, flags);
++
++ /* record the rsp buffer and length */
++ wmi->cmd_rsp_buf = rsp_buf;
++ wmi->cmd_rsp_len = rsp_len;
++
+ wmi->last_seq_id = wmi->tx_seq_id;
+ spin_unlock_irqrestore(&wmi->wmi_lock, flags);
+
+@@ -308,8 +314,8 @@ int ath9k_wmi_cmd(struct wmi *wmi, enum wmi_cmd_id cmd_id,
+ struct ath_common *common = ath9k_hw_common(ah);
+ u16 headroom = sizeof(struct htc_frame_hdr) +
+ sizeof(struct wmi_cmd_hdr);
++ unsigned long time_left, flags;
+ struct sk_buff *skb;
+- unsigned long time_left;
+ int ret = 0;
+
+ if (ah->ah_flags & AH_UNPLUGGED)
+@@ -333,11 +339,7 @@ int ath9k_wmi_cmd(struct wmi *wmi, enum wmi_cmd_id cmd_id,
+ goto out;
+ }
+
+- /* record the rsp buffer and length */
+- wmi->cmd_rsp_buf = rsp_buf;
+- wmi->cmd_rsp_len = rsp_len;
+-
+- ret = ath9k_wmi_cmd_issue(wmi, skb, cmd_id, cmd_len);
++ ret = ath9k_wmi_cmd_issue(wmi, skb, cmd_id, cmd_len, rsp_buf, rsp_len);
+ if (ret)
+ goto out;
+
+@@ -345,7 +347,9 @@ int ath9k_wmi_cmd(struct wmi *wmi, enum wmi_cmd_id cmd_id,
+ if (!time_left) {
+ ath_dbg(common, WMI, "Timeout waiting for WMI command: %s\n",
+ wmi_cmd_to_name(cmd_id));
++ spin_lock_irqsave(&wmi->wmi_lock, flags);
+ wmi->last_seq_id = 0;
++ spin_unlock_irqrestore(&wmi->wmi_lock, flags);
+ mutex_unlock(&wmi->op_mutex);
+ return -ETIMEDOUT;
+ }
+diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h
+index 792adaf880b44..bece26741d3a3 100644
+--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h
++++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwil_types.h
+@@ -398,7 +398,12 @@ struct brcmf_scan_params_le {
+ * fixed parameter portion is assumed, otherwise
+ * ssid in the fixed portion is ignored
+ */
+- __le16 channel_list[1]; /* list of chanspecs */
++ union {
++ __le16 padding; /* Reserve space for at least 1 entry for abort
++ * which uses an on stack brcmf_scan_params_le
++ */
++ DECLARE_FLEX_ARRAY(__le16, channel_list); /* chanspecs */
++ };
+ };
+
+ struct brcmf_scan_params_v2_le {
+diff --git a/drivers/net/wireless/marvell/mwifiex/debugfs.c b/drivers/net/wireless/marvell/mwifiex/debugfs.c
+index 52b18f4a774b7..0cdd6c50c1c08 100644
+--- a/drivers/net/wireless/marvell/mwifiex/debugfs.c
++++ b/drivers/net/wireless/marvell/mwifiex/debugfs.c
+@@ -253,8 +253,11 @@ mwifiex_histogram_read(struct file *file, char __user *ubuf,
+ if (!p)
+ return -ENOMEM;
+
+- if (!priv || !priv->hist_data)
+- return -EFAULT;
++ if (!priv || !priv->hist_data) {
++ ret = -EFAULT;
++ goto free_and_exit;
++ }
++
+ phist_data = priv->hist_data;
+
+ p += sprintf(p, "\n"
+@@ -309,6 +312,8 @@ mwifiex_histogram_read(struct file *file, char __user *ubuf,
+ ret = simple_read_from_buffer(ubuf, count, ppos, (char *)page,
+ (unsigned long)p - page);
+
++free_and_exit:
++ free_page(page);
+ return ret;
+ }
+
+diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
+index 9a698a16a8f38..6697132ecc977 100644
+--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
++++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
+@@ -189,6 +189,8 @@ static int mwifiex_pcie_probe_of(struct device *dev)
+ }
+
+ static void mwifiex_pcie_work(struct work_struct *work);
++static int mwifiex_pcie_delete_rxbd_ring(struct mwifiex_adapter *adapter);
++static int mwifiex_pcie_delete_evtbd_ring(struct mwifiex_adapter *adapter);
+
+ static int
+ mwifiex_map_pci_memory(struct mwifiex_adapter *adapter, struct sk_buff *skb,
+@@ -792,14 +794,15 @@ static int mwifiex_init_rxq_ring(struct mwifiex_adapter *adapter)
+ if (!skb) {
+ mwifiex_dbg(adapter, ERROR,
+ "Unable to allocate skb for RX ring.\n");
+- kfree(card->rxbd_ring_vbase);
+ return -ENOMEM;
+ }
+
+ if (mwifiex_map_pci_memory(adapter, skb,
+ MWIFIEX_RX_DATA_BUF_SIZE,
+- DMA_FROM_DEVICE))
+- return -1;
++ DMA_FROM_DEVICE)) {
++ kfree_skb(skb);
++ return -ENOMEM;
++ }
+
+ buf_pa = MWIFIEX_SKB_DMA_ADDR(skb);
+
+@@ -849,7 +852,6 @@ static int mwifiex_pcie_init_evt_ring(struct mwifiex_adapter *adapter)
+ if (!skb) {
+ mwifiex_dbg(adapter, ERROR,
+ "Unable to allocate skb for EVENT buf.\n");
+- kfree(card->evtbd_ring_vbase);
+ return -ENOMEM;
+ }
+ skb_put(skb, MAX_EVENT_SIZE);
+@@ -857,8 +859,7 @@ static int mwifiex_pcie_init_evt_ring(struct mwifiex_adapter *adapter)
+ if (mwifiex_map_pci_memory(adapter, skb, MAX_EVENT_SIZE,
+ DMA_FROM_DEVICE)) {
+ kfree_skb(skb);
+- kfree(card->evtbd_ring_vbase);
+- return -1;
++ return -ENOMEM;
+ }
+
+ buf_pa = MWIFIEX_SKB_DMA_ADDR(skb);
+@@ -1058,6 +1059,7 @@ static int mwifiex_pcie_delete_txbd_ring(struct mwifiex_adapter *adapter)
+ */
+ static int mwifiex_pcie_create_rxbd_ring(struct mwifiex_adapter *adapter)
+ {
++ int ret;
+ struct pcie_service_card *card = adapter->card;
+ const struct mwifiex_pcie_card_reg *reg = card->pcie.reg;
+
+@@ -1096,7 +1098,10 @@ static int mwifiex_pcie_create_rxbd_ring(struct mwifiex_adapter *adapter)
+ (u32)((u64)card->rxbd_ring_pbase >> 32),
+ card->rxbd_ring_size);
+
+- return mwifiex_init_rxq_ring(adapter);
++ ret = mwifiex_init_rxq_ring(adapter);
++ if (ret)
++ mwifiex_pcie_delete_rxbd_ring(adapter);
++ return ret;
+ }
+
+ /*
+@@ -1127,6 +1132,7 @@ static int mwifiex_pcie_delete_rxbd_ring(struct mwifiex_adapter *adapter)
+ */
+ static int mwifiex_pcie_create_evtbd_ring(struct mwifiex_adapter *adapter)
+ {
++ int ret;
+ struct pcie_service_card *card = adapter->card;
+ const struct mwifiex_pcie_card_reg *reg = card->pcie.reg;
+
+@@ -1161,7 +1167,10 @@ static int mwifiex_pcie_create_evtbd_ring(struct mwifiex_adapter *adapter)
+ (u32)((u64)card->evtbd_ring_pbase >> 32),
+ card->evtbd_ring_size);
+
+- return mwifiex_pcie_init_evt_ring(adapter);
++ ret = mwifiex_pcie_init_evt_ring(adapter);
++ if (ret)
++ mwifiex_pcie_delete_evtbd_ring(adapter);
++ return ret;
+ }
+
+ /*
+diff --git a/drivers/net/wireless/marvell/mwifiex/sta_rx.c b/drivers/net/wireless/marvell/mwifiex/sta_rx.c
+index 13659b02ba882..65420ad674167 100644
+--- a/drivers/net/wireless/marvell/mwifiex/sta_rx.c
++++ b/drivers/net/wireless/marvell/mwifiex/sta_rx.c
+@@ -86,6 +86,15 @@ int mwifiex_process_rx_packet(struct mwifiex_private *priv,
+ rx_pkt_len = le16_to_cpu(local_rx_pd->rx_pkt_length);
+ rx_pkt_hdr = (void *)local_rx_pd + rx_pkt_off;
+
++ if (sizeof(*rx_pkt_hdr) + rx_pkt_off > skb->len) {
++ mwifiex_dbg(priv->adapter, ERROR,
++ "wrong rx packet offset: len=%d, rx_pkt_off=%d\n",
++ skb->len, rx_pkt_off);
++ priv->stats.rx_dropped++;
++ dev_kfree_skb_any(skb);
++ return -1;
++ }
++
+ if ((!memcmp(&rx_pkt_hdr->rfc1042_hdr, bridge_tunnel_header,
+ sizeof(bridge_tunnel_header))) ||
+ (!memcmp(&rx_pkt_hdr->rfc1042_hdr, rfc1042_header,
+@@ -194,7 +203,8 @@ int mwifiex_process_sta_rx_packet(struct mwifiex_private *priv,
+
+ rx_pkt_hdr = (void *)local_rx_pd + rx_pkt_offset;
+
+- if ((rx_pkt_offset + rx_pkt_length) > (u16) skb->len) {
++ if ((rx_pkt_offset + rx_pkt_length) > skb->len ||
++ sizeof(rx_pkt_hdr->eth803_hdr) + rx_pkt_offset > skb->len) {
+ mwifiex_dbg(adapter, ERROR,
+ "wrong rx packet: len=%d, rx_pkt_offset=%d, rx_pkt_length=%d\n",
+ skb->len, rx_pkt_offset, rx_pkt_length);
+diff --git a/drivers/net/wireless/marvell/mwifiex/uap_txrx.c b/drivers/net/wireless/marvell/mwifiex/uap_txrx.c
+index e495f7eaea033..b8b9a0fcb19cd 100644
+--- a/drivers/net/wireless/marvell/mwifiex/uap_txrx.c
++++ b/drivers/net/wireless/marvell/mwifiex/uap_txrx.c
+@@ -103,6 +103,16 @@ static void mwifiex_uap_queue_bridged_pkt(struct mwifiex_private *priv,
+ return;
+ }
+
++ if (sizeof(*rx_pkt_hdr) +
++ le16_to_cpu(uap_rx_pd->rx_pkt_offset) > skb->len) {
++ mwifiex_dbg(adapter, ERROR,
++ "wrong rx packet offset: len=%d,rx_pkt_offset=%d\n",
++ skb->len, le16_to_cpu(uap_rx_pd->rx_pkt_offset));
++ priv->stats.rx_dropped++;
++ dev_kfree_skb_any(skb);
++ return;
++ }
++
+ if ((!memcmp(&rx_pkt_hdr->rfc1042_hdr, bridge_tunnel_header,
+ sizeof(bridge_tunnel_header))) ||
+ (!memcmp(&rx_pkt_hdr->rfc1042_hdr, rfc1042_header,
+@@ -243,7 +253,15 @@ int mwifiex_handle_uap_rx_forward(struct mwifiex_private *priv,
+
+ if (is_multicast_ether_addr(ra)) {
+ skb_uap = skb_copy(skb, GFP_ATOMIC);
+- mwifiex_uap_queue_bridged_pkt(priv, skb_uap);
++ if (likely(skb_uap)) {
++ mwifiex_uap_queue_bridged_pkt(priv, skb_uap);
++ } else {
++ mwifiex_dbg(adapter, ERROR,
++ "failed to copy skb for uAP\n");
++ priv->stats.rx_dropped++;
++ dev_kfree_skb_any(skb);
++ return -1;
++ }
+ } else {
+ if (mwifiex_get_sta_entry(priv, ra)) {
+ /* Requeue Intra-BSS packet */
+@@ -367,6 +385,16 @@ int mwifiex_process_uap_rx_packet(struct mwifiex_private *priv,
+ rx_pkt_type = le16_to_cpu(uap_rx_pd->rx_pkt_type);
+ rx_pkt_hdr = (void *)uap_rx_pd + le16_to_cpu(uap_rx_pd->rx_pkt_offset);
+
++ if (le16_to_cpu(uap_rx_pd->rx_pkt_offset) +
++ sizeof(rx_pkt_hdr->eth803_hdr) > skb->len) {
++ mwifiex_dbg(adapter, ERROR,
++ "wrong rx packet for struct ethhdr: len=%d, offset=%d\n",
++ skb->len, le16_to_cpu(uap_rx_pd->rx_pkt_offset));
++ priv->stats.rx_dropped++;
++ dev_kfree_skb_any(skb);
++ return 0;
++ }
++
+ ether_addr_copy(ta, rx_pkt_hdr->eth803_hdr.h_source);
+
+ if ((le16_to_cpu(uap_rx_pd->rx_pkt_offset) +
+diff --git a/drivers/net/wireless/marvell/mwifiex/util.c b/drivers/net/wireless/marvell/mwifiex/util.c
+index 94c2d219835da..745b1d925b217 100644
+--- a/drivers/net/wireless/marvell/mwifiex/util.c
++++ b/drivers/net/wireless/marvell/mwifiex/util.c
+@@ -393,11 +393,15 @@ mwifiex_process_mgmt_packet(struct mwifiex_private *priv,
+ }
+
+ rx_pd = (struct rxpd *)skb->data;
++ pkt_len = le16_to_cpu(rx_pd->rx_pkt_length);
++ if (pkt_len < sizeof(struct ieee80211_hdr) + sizeof(pkt_len)) {
++ mwifiex_dbg(priv->adapter, ERROR, "invalid rx_pkt_length");
++ return -1;
++ }
+
+ skb_pull(skb, le16_to_cpu(rx_pd->rx_pkt_offset));
+ skb_pull(skb, sizeof(pkt_len));
+-
+- pkt_len = le16_to_cpu(rx_pd->rx_pkt_length);
++ pkt_len -= sizeof(pkt_len);
+
+ ieee_hdr = (void *)skb->data;
+ if (ieee80211_is_mgmt(ieee_hdr->frame_control)) {
+@@ -410,7 +414,7 @@ mwifiex_process_mgmt_packet(struct mwifiex_private *priv,
+ skb->data + sizeof(struct ieee80211_hdr),
+ pkt_len - sizeof(struct ieee80211_hdr));
+
+- pkt_len -= ETH_ALEN + sizeof(pkt_len);
++ pkt_len -= ETH_ALEN;
+ rx_pd->rx_pkt_length = cpu_to_le16(pkt_len);
+
+ cfg80211_rx_mgmt(&priv->wdev, priv->roc_cfg.chan.center_freq,
+diff --git a/drivers/net/wireless/mediatek/mt76/mt76.h b/drivers/net/wireless/mediatek/mt76/mt76.h
+index 6b07b8fafec2f..0e9f4197213a3 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt76.h
++++ b/drivers/net/wireless/mediatek/mt76/mt76.h
+@@ -277,7 +277,7 @@ struct mt76_sta_stats {
+ u64 tx_mcs[16]; /* mcs idx */
+ u64 tx_bytes;
+ /* WED TX */
+- u32 tx_packets;
++ u32 tx_packets; /* unit: MSDU */
+ u32 tx_retries;
+ u32 tx_failed;
+ /* WED RX */
+diff --git a/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c b/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c
+index be4d63db5f64a..e415ac5e321f1 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c
++++ b/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c
+@@ -522,9 +522,9 @@ void mt76_connac2_mac_write_txwi(struct mt76_dev *dev, __le32 *txwi,
+ q_idx = wmm_idx * MT76_CONNAC_MAX_WMM_SETS +
+ mt76_connac_lmac_mapping(skb_get_queue_mapping(skb));
+
+- /* counting non-offloading skbs */
+- wcid->stats.tx_bytes += skb->len;
+- wcid->stats.tx_packets++;
++ /* mt7915 WA only counts WED path */
++ if (is_mt7915(dev) && mtk_wed_device_active(&dev->mmio.wed))
++ wcid->stats.tx_packets++;
+ }
+
+ val = FIELD_PREP(MT_TXD0_TX_BYTES, skb->len + sz_txd) |
+@@ -609,12 +609,11 @@ bool mt76_connac2_mac_fill_txs(struct mt76_dev *dev, struct mt76_wcid *wcid,
+ txs = le32_to_cpu(txs_data[0]);
+
+ /* PPDU based reporting */
+- if (FIELD_GET(MT_TXS0_TXS_FORMAT, txs) > 1) {
++ if (mtk_wed_device_active(&dev->mmio.wed) &&
++ FIELD_GET(MT_TXS0_TXS_FORMAT, txs) > 1) {
+ stats->tx_bytes +=
+ le32_get_bits(txs_data[5], MT_TXS5_MPDU_TX_BYTE) -
+ le32_get_bits(txs_data[7], MT_TXS7_MPDU_RETRY_BYTE);
+- stats->tx_packets +=
+- le32_get_bits(txs_data[5], MT_TXS5_MPDU_TX_CNT);
+ stats->tx_failed +=
+ le32_get_bits(txs_data[6], MT_TXS6_MPDU_FAIL_CNT);
+ stats->tx_retries +=
+diff --git a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h
+index ca1ce97a6d2fd..7a52b68491b6e 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h
++++ b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h
+@@ -998,6 +998,7 @@ enum {
+ MCU_EXT_EVENT_ASSERT_DUMP = 0x23,
+ MCU_EXT_EVENT_RDD_REPORT = 0x3a,
+ MCU_EXT_EVENT_CSA_NOTIFY = 0x4f,
++ MCU_EXT_EVENT_WA_TX_STAT = 0x74,
+ MCU_EXT_EVENT_BCC_NOTIFY = 0x75,
+ MCU_EXT_EVENT_MURU_CTRL = 0x9f,
+ };
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/init.c b/drivers/net/wireless/mediatek/mt76/mt7915/init.c
+index ac2049f49bb38..9defd2b3c2f8d 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/init.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/init.c
+@@ -414,7 +414,6 @@ mt7915_init_wiphy(struct mt7915_phy *phy)
+ if (!dev->dbdc_support)
+ vht_cap->cap |=
+ IEEE80211_VHT_CAP_SHORT_GI_160 |
+- IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_160MHZ |
+ FIELD_PREP(IEEE80211_VHT_CAP_EXT_NSS_BW_MASK, 1);
+ } else {
+ vht_cap->cap |=
+@@ -499,6 +498,12 @@ mt7915_mac_init_band(struct mt7915_dev *dev, u8 band)
+ set = FIELD_PREP(MT_WTBLOFF_TOP_RSCR_RCPI_MODE, 0) |
+ FIELD_PREP(MT_WTBLOFF_TOP_RSCR_RCPI_PARAM, 0x3);
+ mt76_rmw(dev, MT_WTBLOFF_TOP_RSCR(band), mask, set);
++
++ /* MT_TXD5_TX_STATUS_HOST (MPDU format) has higher priority than
++ * MT_AGG_ACR_PPDU_TXS2H (PPDU format) even though ACR bit is set.
++ */
++ if (mtk_wed_device_active(&dev->mt76.mmio.wed))
++ mt76_set(dev, MT_AGG_ACR4(band), MT_AGG_ACR_PPDU_TXS2H);
+ }
+
+ static void
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/main.c b/drivers/net/wireless/mediatek/mt76/mt7915/main.c
+index 1b361199c0616..42a983e40ade9 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/main.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/main.c
+@@ -269,6 +269,7 @@ static int mt7915_add_interface(struct ieee80211_hw *hw,
+ vif->offload_flags |= IEEE80211_OFFLOAD_ENCAP_4ADDR;
+
+ mt7915_init_bitrate_mask(vif);
++ memset(&mvif->cap, -1, sizeof(mvif->cap));
+
+ mt7915_mcu_add_bss_info(phy, vif, true);
+ mt7915_mcu_add_sta(dev, vif, NULL, true);
+@@ -470,7 +471,8 @@ static int mt7915_config(struct ieee80211_hw *hw, u32 changed)
+ ieee80211_wake_queues(hw);
+ }
+
+- if (changed & IEEE80211_CONF_CHANGE_POWER) {
++ if (changed & (IEEE80211_CONF_CHANGE_POWER |
++ IEEE80211_CONF_CHANGE_CHANNEL)) {
+ ret = mt7915_mcu_set_txpower_sku(phy);
+ if (ret)
+ return ret;
+@@ -599,6 +601,7 @@ static void mt7915_bss_info_changed(struct ieee80211_hw *hw,
+ {
+ struct mt7915_phy *phy = mt7915_hw_phy(hw);
+ struct mt7915_dev *dev = mt7915_hw_dev(hw);
++ int set_bss_info = -1, set_sta = -1;
+
+ mutex_lock(&dev->mt76.mutex);
+
+@@ -607,15 +610,18 @@ static void mt7915_bss_info_changed(struct ieee80211_hw *hw,
+ * and then peer references bss_info_rfch to set bandwidth cap.
+ */
+ if (changed & BSS_CHANGED_BSSID &&
+- vif->type == NL80211_IFTYPE_STATION) {
+- bool join = !is_zero_ether_addr(info->bssid);
+-
+- mt7915_mcu_add_bss_info(phy, vif, join);
+- mt7915_mcu_add_sta(dev, vif, NULL, join);
+- }
+-
++ vif->type == NL80211_IFTYPE_STATION)
++ set_bss_info = set_sta = !is_zero_ether_addr(info->bssid);
+ if (changed & BSS_CHANGED_ASSOC)
+- mt7915_mcu_add_bss_info(phy, vif, vif->cfg.assoc);
++ set_bss_info = vif->cfg.assoc;
++ if (changed & BSS_CHANGED_BEACON_ENABLED &&
++ vif->type != NL80211_IFTYPE_AP)
++ set_bss_info = set_sta = info->enable_beacon;
++
++ if (set_bss_info == 1)
++ mt7915_mcu_add_bss_info(phy, vif, true);
++ if (set_sta == 1)
++ mt7915_mcu_add_sta(dev, vif, NULL, true);
+
+ if (changed & BSS_CHANGED_ERP_CTS_PROT)
+ mt7915_mac_enable_rtscts(dev, vif, info->use_cts_prot);
+@@ -629,11 +635,6 @@ static void mt7915_bss_info_changed(struct ieee80211_hw *hw,
+ }
+ }
+
+- if (changed & BSS_CHANGED_BEACON_ENABLED && info->enable_beacon) {
+- mt7915_mcu_add_bss_info(phy, vif, true);
+- mt7915_mcu_add_sta(dev, vif, NULL, true);
+- }
+-
+ /* ensure that enable txcmd_mode after bss_info */
+ if (changed & (BSS_CHANGED_QOS | BSS_CHANGED_BEACON_ENABLED))
+ mt7915_mcu_set_tx(dev, vif);
+@@ -650,6 +651,62 @@ static void mt7915_bss_info_changed(struct ieee80211_hw *hw,
+ BSS_CHANGED_FILS_DISCOVERY))
+ mt7915_mcu_add_beacon(hw, vif, info->enable_beacon, changed);
+
++ if (set_bss_info == 0)
++ mt7915_mcu_add_bss_info(phy, vif, false);
++ if (set_sta == 0)
++ mt7915_mcu_add_sta(dev, vif, NULL, false);
++
++ mutex_unlock(&dev->mt76.mutex);
++}
++
++static void
++mt7915_vif_check_caps(struct mt7915_phy *phy, struct ieee80211_vif *vif)
++{
++ struct mt7915_vif *mvif = (struct mt7915_vif *)vif->drv_priv;
++ struct mt7915_vif_cap *vc = &mvif->cap;
++
++ vc->ht_ldpc = vif->bss_conf.ht_ldpc;
++ vc->vht_ldpc = vif->bss_conf.vht_ldpc;
++ vc->vht_su_ebfer = vif->bss_conf.vht_su_beamformer;
++ vc->vht_su_ebfee = vif->bss_conf.vht_su_beamformee;
++ vc->vht_mu_ebfer = vif->bss_conf.vht_mu_beamformer;
++ vc->vht_mu_ebfee = vif->bss_conf.vht_mu_beamformee;
++ vc->he_ldpc = vif->bss_conf.he_ldpc;
++ vc->he_su_ebfer = vif->bss_conf.he_su_beamformer;
++ vc->he_su_ebfee = vif->bss_conf.he_su_beamformee;
++ vc->he_mu_ebfer = vif->bss_conf.he_mu_beamformer;
++}
++
++static int
++mt7915_start_ap(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
++ struct ieee80211_bss_conf *link_conf)
++{
++ struct mt7915_phy *phy = mt7915_hw_phy(hw);
++ struct mt7915_dev *dev = mt7915_hw_dev(hw);
++ int err;
++
++ mutex_lock(&dev->mt76.mutex);
++
++ mt7915_vif_check_caps(phy, vif);
++
++ err = mt7915_mcu_add_bss_info(phy, vif, true);
++ if (err)
++ goto out;
++ err = mt7915_mcu_add_sta(dev, vif, NULL, true);
++out:
++ mutex_unlock(&dev->mt76.mutex);
++
++ return err;
++}
++
++static void
++mt7915_stop_ap(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
++ struct ieee80211_bss_conf *link_conf)
++{
++ struct mt7915_dev *dev = mt7915_hw_dev(hw);
++
++ mutex_lock(&dev->mt76.mutex);
++ mt7915_mcu_add_sta(dev, vif, NULL, false);
+ mutex_unlock(&dev->mt76.mutex);
+ }
+
+@@ -1042,8 +1099,10 @@ static void mt7915_sta_statistics(struct ieee80211_hw *hw,
+ sinfo->tx_bytes = msta->wcid.stats.tx_bytes;
+ sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_BYTES64);
+
+- sinfo->tx_packets = msta->wcid.stats.tx_packets;
+- sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_PACKETS);
++ if (!mt7915_mcu_wed_wa_tx_stats(phy->dev, msta->wcid.idx)) {
++ sinfo->tx_packets = msta->wcid.stats.tx_packets;
++ sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_PACKETS);
++ }
+
+ sinfo->tx_failed = msta->wcid.stats.tx_failed;
+ sinfo->filled |= BIT_ULL(NL80211_STA_INFO_TX_FAILED);
+@@ -1526,6 +1585,8 @@ const struct ieee80211_ops mt7915_ops = {
+ .conf_tx = mt7915_conf_tx,
+ .configure_filter = mt7915_configure_filter,
+ .bss_info_changed = mt7915_bss_info_changed,
++ .start_ap = mt7915_start_ap,
++ .stop_ap = mt7915_stop_ap,
+ .sta_add = mt7915_sta_add,
+ .sta_remove = mt7915_sta_remove,
+ .sta_pre_rcu_remove = mt76_sta_pre_rcu_remove,
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
+index 9fcb22fa1f97e..1a8611c6b684d 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
+@@ -164,7 +164,9 @@ mt7915_mcu_parse_response(struct mt76_dev *mdev, int cmd,
+ }
+
+ rxd = (struct mt76_connac2_mcu_rxd *)skb->data;
+- if (seq != rxd->seq)
++ if (seq != rxd->seq &&
++ !(rxd->eid == MCU_CMD_EXT_CID &&
++ rxd->ext_eid == MCU_EXT_EVENT_WA_TX_STAT))
+ return -EAGAIN;
+
+ if (cmd == MCU_CMD(PATCH_SEM_CONTROL)) {
+@@ -274,7 +276,7 @@ mt7915_mcu_rx_radar_detected(struct mt7915_dev *dev, struct sk_buff *skb)
+
+ r = (struct mt7915_mcu_rdd_report *)skb->data;
+
+- if (r->band_idx > MT_BAND1)
++ if (r->band_idx > MT_RX_SEL2)
+ return;
+
+ if ((r->band_idx && !dev->phy.mt76->band_idx) &&
+@@ -395,12 +397,14 @@ void mt7915_mcu_rx_event(struct mt7915_dev *dev, struct sk_buff *skb)
+ struct mt76_connac2_mcu_rxd *rxd;
+
+ rxd = (struct mt76_connac2_mcu_rxd *)skb->data;
+- if (rxd->ext_eid == MCU_EXT_EVENT_THERMAL_PROTECT ||
+- rxd->ext_eid == MCU_EXT_EVENT_FW_LOG_2_HOST ||
+- rxd->ext_eid == MCU_EXT_EVENT_ASSERT_DUMP ||
+- rxd->ext_eid == MCU_EXT_EVENT_PS_SYNC ||
+- rxd->ext_eid == MCU_EXT_EVENT_BCC_NOTIFY ||
+- !rxd->seq)
++ if ((rxd->ext_eid == MCU_EXT_EVENT_THERMAL_PROTECT ||
++ rxd->ext_eid == MCU_EXT_EVENT_FW_LOG_2_HOST ||
++ rxd->ext_eid == MCU_EXT_EVENT_ASSERT_DUMP ||
++ rxd->ext_eid == MCU_EXT_EVENT_PS_SYNC ||
++ rxd->ext_eid == MCU_EXT_EVENT_BCC_NOTIFY ||
++ !rxd->seq) &&
++ !(rxd->eid == MCU_CMD_EXT_CID &&
++ rxd->ext_eid == MCU_EXT_EVENT_WA_TX_STAT))
+ mt7915_mcu_rx_unsolicited_event(dev, skb);
+ else
+ mt76_mcu_rx_event(&dev->mt76, skb);
+@@ -706,6 +710,7 @@ static void
+ mt7915_mcu_sta_he_tlv(struct sk_buff *skb, struct ieee80211_sta *sta,
+ struct ieee80211_vif *vif)
+ {
++ struct mt7915_vif *mvif = (struct mt7915_vif *)vif->drv_priv;
+ struct ieee80211_he_cap_elem *elem = &sta->deflink.he_cap.he_cap_elem;
+ struct ieee80211_he_mcs_nss_supp mcs_map;
+ struct sta_rec_he *he;
+@@ -739,7 +744,7 @@ mt7915_mcu_sta_he_tlv(struct sk_buff *skb, struct ieee80211_sta *sta,
+ IEEE80211_HE_PHY_CAP0_CHANNEL_WIDTH_SET_RU_MAPPING_IN_5G))
+ cap |= STA_REC_HE_CAP_BW20_RU242_SUPPORT;
+
+- if (vif->bss_conf.he_ldpc &&
++ if (mvif->cap.he_ldpc &&
+ (elem->phy_cap_info[1] &
+ IEEE80211_HE_PHY_CAP1_LDPC_CODING_IN_PAYLOAD))
+ cap |= STA_REC_HE_CAP_LDPC;
+@@ -848,6 +853,7 @@ static void
+ mt7915_mcu_sta_muru_tlv(struct mt7915_dev *dev, struct sk_buff *skb,
+ struct ieee80211_sta *sta, struct ieee80211_vif *vif)
+ {
++ struct mt7915_vif *mvif = (struct mt7915_vif *)vif->drv_priv;
+ struct ieee80211_he_cap_elem *elem = &sta->deflink.he_cap.he_cap_elem;
+ struct sta_rec_muru *muru;
+ struct tlv *tlv;
+@@ -860,9 +866,9 @@ mt7915_mcu_sta_muru_tlv(struct mt7915_dev *dev, struct sk_buff *skb,
+
+ muru = (struct sta_rec_muru *)tlv;
+
+- muru->cfg.mimo_dl_en = vif->bss_conf.he_mu_beamformer ||
+- vif->bss_conf.vht_mu_beamformer ||
+- vif->bss_conf.vht_mu_beamformee;
++ muru->cfg.mimo_dl_en = mvif->cap.he_mu_ebfer ||
++ mvif->cap.vht_mu_ebfer ||
++ mvif->cap.vht_mu_ebfee;
+ if (!is_mt7915(&dev->mt76))
+ muru->cfg.mimo_ul_en = true;
+ muru->cfg.ofdma_dl_en = true;
+@@ -995,8 +1001,8 @@ mt7915_mcu_sta_wtbl_tlv(struct mt7915_dev *dev, struct sk_buff *skb,
+ mt76_connac_mcu_wtbl_hdr_trans_tlv(skb, vif, wcid, tlv, wtbl_hdr);
+ if (sta)
+ mt76_connac_mcu_wtbl_ht_tlv(&dev->mt76, skb, sta, tlv,
+- wtbl_hdr, vif->bss_conf.ht_ldpc,
+- vif->bss_conf.vht_ldpc);
++ wtbl_hdr, mvif->cap.ht_ldpc,
++ mvif->cap.vht_ldpc);
+
+ return 0;
+ }
+@@ -1005,6 +1011,7 @@ static inline bool
+ mt7915_is_ebf_supported(struct mt7915_phy *phy, struct ieee80211_vif *vif,
+ struct ieee80211_sta *sta, bool bfee)
+ {
++ struct mt7915_vif *mvif = (struct mt7915_vif *)vif->drv_priv;
+ int tx_ant = hweight8(phy->mt76->chainmask) - 1;
+
+ if (vif->type != NL80211_IFTYPE_STATION &&
+@@ -1018,10 +1025,10 @@ mt7915_is_ebf_supported(struct mt7915_phy *phy, struct ieee80211_vif *vif,
+ struct ieee80211_he_cap_elem *pe = &sta->deflink.he_cap.he_cap_elem;
+
+ if (bfee)
+- return vif->bss_conf.he_su_beamformee &&
++ return mvif->cap.he_su_ebfee &&
+ HE_PHY(CAP3_SU_BEAMFORMER, pe->phy_cap_info[3]);
+ else
+- return vif->bss_conf.he_su_beamformer &&
++ return mvif->cap.he_su_ebfer &&
+ HE_PHY(CAP4_SU_BEAMFORMEE, pe->phy_cap_info[4]);
+ }
+
+@@ -1029,10 +1036,10 @@ mt7915_is_ebf_supported(struct mt7915_phy *phy, struct ieee80211_vif *vif,
+ u32 cap = sta->deflink.vht_cap.cap;
+
+ if (bfee)
+- return vif->bss_conf.vht_su_beamformee &&
++ return mvif->cap.vht_su_ebfee &&
+ (cap & IEEE80211_VHT_CAP_SU_BEAMFORMER_CAPABLE);
+ else
+- return vif->bss_conf.vht_su_beamformer &&
++ return mvif->cap.vht_su_ebfer &&
+ (cap & IEEE80211_VHT_CAP_SU_BEAMFORMEE_CAPABLE);
+ }
+
+@@ -1527,7 +1534,7 @@ mt7915_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb, struct mt7915_dev *dev,
+ cap |= STA_CAP_TX_STBC;
+ if (sta->deflink.ht_cap.cap & IEEE80211_HT_CAP_RX_STBC)
+ cap |= STA_CAP_RX_STBC;
+- if (vif->bss_conf.ht_ldpc &&
++ if (mvif->cap.ht_ldpc &&
+ (sta->deflink.ht_cap.cap & IEEE80211_HT_CAP_LDPC_CODING))
+ cap |= STA_CAP_LDPC;
+
+@@ -1553,7 +1560,7 @@ mt7915_mcu_sta_rate_ctrl_tlv(struct sk_buff *skb, struct mt7915_dev *dev,
+ cap |= STA_CAP_VHT_TX_STBC;
+ if (sta->deflink.vht_cap.cap & IEEE80211_VHT_CAP_RXSTBC_1)
+ cap |= STA_CAP_VHT_RX_STBC;
+- if (vif->bss_conf.vht_ldpc &&
++ if (mvif->cap.vht_ldpc &&
+ (sta->deflink.vht_cap.cap & IEEE80211_VHT_CAP_RXLDPC))
+ cap |= STA_CAP_VHT_LDPC;
+
+@@ -2993,7 +3000,7 @@ int mt7915_mcu_get_chan_mib_info(struct mt7915_phy *phy, bool chan_switch)
+ }
+
+ ret = mt76_mcu_send_and_get_msg(&dev->mt76, MCU_EXT_CMD(GET_MIB_INFO),
+- req, sizeof(req), true, &skb);
++ req, len * sizeof(req[0]), true, &skb);
+ if (ret)
+ return ret;
+
+@@ -3733,6 +3740,62 @@ int mt7915_mcu_twt_agrt_update(struct mt7915_dev *dev,
+ &req, sizeof(req), true);
+ }
+
++int mt7915_mcu_wed_wa_tx_stats(struct mt7915_dev *dev, u16 wlan_idx)
++{
++ struct {
++ __le32 cmd;
++ __le32 num;
++ __le32 __rsv;
++ __le16 wlan_idx;
++ } req = {
++ .cmd = cpu_to_le32(0x15),
++ .num = cpu_to_le32(1),
++ .wlan_idx = cpu_to_le16(wlan_idx),
++ };
++ struct mt7915_mcu_wa_tx_stat {
++ __le16 wlan_idx;
++ u8 __rsv[2];
++
++ /* tx_bytes is deprecated since WA byte counter uses u32,
++ * which easily leads to overflow.
++ */
++ __le32 tx_bytes;
++ __le32 tx_packets;
++ } *res;
++ struct mt76_wcid *wcid;
++ struct sk_buff *skb;
++ int ret;
++
++ ret = mt76_mcu_send_and_get_msg(&dev->mt76, MCU_WA_PARAM_CMD(QUERY),
++ &req, sizeof(req), true, &skb);
++ if (ret)
++ return ret;
++
++ if (!is_mt7915(&dev->mt76))
++ skb_pull(skb, 4);
++
++ res = (struct mt7915_mcu_wa_tx_stat *)skb->data;
++
++ if (le16_to_cpu(res->wlan_idx) != wlan_idx) {
++ ret = -EINVAL;
++ goto out;
++ }
++
++ rcu_read_lock();
++
++ wcid = rcu_dereference(dev->mt76.wcid[wlan_idx]);
++ if (wcid)
++ wcid->stats.tx_packets += le32_to_cpu(res->tx_packets);
++ else
++ ret = -EINVAL;
++
++ rcu_read_unlock();
++out:
++ dev_kfree_skb(skb);
++
++ return ret;
++}
++
+ int mt7915_mcu_rf_regval(struct mt7915_dev *dev, u32 regidx, u32 *val, bool set)
+ {
+ struct {
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mmio.c b/drivers/net/wireless/mediatek/mt76/mt7915/mmio.c
+index 45f3558bf31c1..2fa059af23ded 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/mmio.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/mmio.c
+@@ -545,8 +545,6 @@ static u32 mt7915_rmw(struct mt76_dev *mdev, u32 offset, u32 mask, u32 val)
+ static int mt7915_mmio_wed_offload_enable(struct mtk_wed_device *wed)
+ {
+ struct mt7915_dev *dev;
+- struct mt7915_phy *phy;
+- int ret;
+
+ dev = container_of(wed, struct mt7915_dev, mt76.mmio.wed);
+
+@@ -554,43 +552,19 @@ static int mt7915_mmio_wed_offload_enable(struct mtk_wed_device *wed)
+ dev->mt76.token_size = wed->wlan.token_start;
+ spin_unlock_bh(&dev->mt76.token_lock);
+
+- ret = wait_event_timeout(dev->mt76.tx_wait,
+- !dev->mt76.wed_token_count, HZ);
+- if (!ret)
+- return -EAGAIN;
+-
+- phy = &dev->phy;
+- mt76_set(dev, MT_AGG_ACR4(phy->mt76->band_idx), MT_AGG_ACR_PPDU_TXS2H);
+-
+- phy = dev->mt76.phys[MT_BAND1] ? dev->mt76.phys[MT_BAND1]->priv : NULL;
+- if (phy)
+- mt76_set(dev, MT_AGG_ACR4(phy->mt76->band_idx),
+- MT_AGG_ACR_PPDU_TXS2H);
+-
+- return 0;
++ return !wait_event_timeout(dev->mt76.tx_wait,
++ !dev->mt76.wed_token_count, HZ);
+ }
+
+ static void mt7915_mmio_wed_offload_disable(struct mtk_wed_device *wed)
+ {
+ struct mt7915_dev *dev;
+- struct mt7915_phy *phy;
+
+ dev = container_of(wed, struct mt7915_dev, mt76.mmio.wed);
+
+ spin_lock_bh(&dev->mt76.token_lock);
+ dev->mt76.token_size = MT7915_TOKEN_SIZE;
+ spin_unlock_bh(&dev->mt76.token_lock);
+-
+- /* MT_TXD5_TX_STATUS_HOST (MPDU format) has higher priority than
+- * MT_AGG_ACR_PPDU_TXS2H (PPDU format) even though ACR bit is set.
+- */
+- phy = &dev->phy;
+- mt76_clear(dev, MT_AGG_ACR4(phy->mt76->band_idx), MT_AGG_ACR_PPDU_TXS2H);
+-
+- phy = dev->mt76.phys[MT_BAND1] ? dev->mt76.phys[MT_BAND1]->priv : NULL;
+- if (phy)
+- mt76_clear(dev, MT_AGG_ACR4(phy->mt76->band_idx),
+- MT_AGG_ACR_PPDU_TXS2H);
+ }
+
+ static void mt7915_mmio_wed_release_rx_buf(struct mtk_wed_device *wed)
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h b/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
+index b3ead35307406..0f76733c9c1ac 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
++++ b/drivers/net/wireless/mediatek/mt76/mt7915/mt7915.h
+@@ -147,9 +147,23 @@ struct mt7915_sta {
+ } twt;
+ };
+
++struct mt7915_vif_cap {
++ bool ht_ldpc:1;
++ bool vht_ldpc:1;
++ bool he_ldpc:1;
++ bool vht_su_ebfer:1;
++ bool vht_su_ebfee:1;
++ bool vht_mu_ebfer:1;
++ bool vht_mu_ebfee:1;
++ bool he_su_ebfer:1;
++ bool he_su_ebfee:1;
++ bool he_mu_ebfer:1;
++};
++
+ struct mt7915_vif {
+ struct mt76_vif mt76; /* must be first */
+
++ struct mt7915_vif_cap cap;
+ struct mt7915_sta sta;
+ struct mt7915_phy *phy;
+
+@@ -539,6 +553,7 @@ int mt7915_mcu_get_rx_rate(struct mt7915_phy *phy, struct ieee80211_vif *vif,
+ struct ieee80211_sta *sta, struct rate_info *rate);
+ int mt7915_mcu_rdd_background_enable(struct mt7915_phy *phy,
+ struct cfg80211_chan_def *chandef);
++int mt7915_mcu_wed_wa_tx_stats(struct mt7915_dev *dev, u16 wcid);
+ int mt7915_mcu_rf_regval(struct mt7915_dev *dev, u32 regidx, u32 *val, bool set);
+ int mt7915_mcu_wa_cmd(struct mt7915_dev *dev, int cmd, u32 a1, u32 a2, u32 a3);
+ int mt7915_mcu_fw_log_2_host(struct mt7915_dev *dev, u8 type, u8 ctrl);
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/init.c b/drivers/net/wireless/mediatek/mt76/mt7921/init.c
+index bf1da9fddfaba..f41975e37d06a 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7921/init.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7921/init.c
+@@ -113,7 +113,8 @@ mt7921_init_wiphy(struct ieee80211_hw *hw)
+ wiphy->max_sched_scan_ssids = MT76_CONNAC_MAX_SCHED_SCAN_SSID;
+ wiphy->max_match_sets = MT76_CONNAC_MAX_SCAN_MATCH;
+ wiphy->max_sched_scan_reqs = 1;
+- wiphy->flags |= WIPHY_FLAG_HAS_CHANNEL_SWITCH;
++ wiphy->flags |= WIPHY_FLAG_HAS_CHANNEL_SWITCH |
++ WIPHY_FLAG_SPLIT_SCAN_6GHZ;
+ wiphy->reg_notifier = mt7921_regd_notifier;
+
+ wiphy->features |= NL80211_FEATURE_SCHED_SCAN_RANDOM_MAC_ADDR |
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/dma.c b/drivers/net/wireless/mediatek/mt76/mt7996/dma.c
+index 534143465d9b3..fbedaacffbba5 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/dma.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/dma.c
+@@ -293,7 +293,7 @@ int mt7996_dma_init(struct mt7996_dev *dev)
+ /* event from WA */
+ ret = mt76_queue_alloc(dev, &dev->mt76.q_rx[MT_RXQ_MCU_WA],
+ MT_RXQ_ID(MT_RXQ_MCU_WA),
+- MT7996_RX_MCU_RING_SIZE,
++ MT7996_RX_MCU_RING_SIZE_WA,
+ MT_RX_BUF_SIZE,
+ MT_RXQ_RING_BASE(MT_RXQ_MCU_WA));
+ if (ret)
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mac.c b/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
+index 9b0f6053e0fa6..25c5deb15d213 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/mac.c
+@@ -836,14 +836,19 @@ mt7996_mac_fill_rx(struct mt7996_dev *dev, struct sk_buff *skb)
+ skb_pull(skb, hdr_gap);
+ if (!hdr_trans && status->amsdu && !(ieee80211_has_a4(fc) && is_mesh)) {
+ pad_start = ieee80211_get_hdrlen_from_skb(skb);
+- } else if (hdr_trans && (rxd2 & MT_RXD2_NORMAL_HDR_TRANS_ERROR) &&
+- get_unaligned_be16(skb->data + pad_start) == ETH_P_8021Q) {
++ } else if (hdr_trans && (rxd2 & MT_RXD2_NORMAL_HDR_TRANS_ERROR)) {
+ /* When header translation failure is indicated,
+ * the hardware will insert an extra 2-byte field
+ * containing the data length after the protocol
+- * type field.
++ * type field. This happens either when the LLC-SNAP
++ * pattern did not match, or if a VLAN header was
++ * detected.
+ */
+- pad_start = 16;
++ pad_start = 12;
++ if (get_unaligned_be16(skb->data + pad_start) == ETH_P_8021Q)
++ pad_start += 4;
++ else
++ pad_start = 0;
+ }
+
+ if (pad_start) {
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+index 88e2f9d0e5130..62a02b03d83ba 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+@@ -339,7 +339,11 @@ mt7996_mcu_rx_radar_detected(struct mt7996_dev *dev, struct sk_buff *skb)
+ if (r->band_idx >= ARRAY_SIZE(dev->mt76.phys))
+ return;
+
+- mphy = dev->mt76.phys[r->band_idx];
++ if (dev->rdd2_phy && r->band_idx == MT_RX_SEL2)
++ mphy = dev->rdd2_phy->mt76;
++ else
++ mphy = dev->mt76.phys[r->band_idx];
++
+ if (!mphy)
+ return;
+
+@@ -712,6 +716,7 @@ mt7996_mcu_bss_basic_tlv(struct sk_buff *skb,
+ struct cfg80211_chan_def *chandef = &phy->chandef;
+ struct mt76_connac_bss_basic_tlv *bss;
+ u32 type = CONNECTION_INFRA_AP;
++ u16 sta_wlan_idx = wlan_idx;
+ struct tlv *tlv;
+ int idx;
+
+@@ -731,7 +736,7 @@ mt7996_mcu_bss_basic_tlv(struct sk_buff *skb,
+ struct mt76_wcid *wcid;
+
+ wcid = (struct mt76_wcid *)sta->drv_priv;
+- wlan_idx = wcid->idx;
++ sta_wlan_idx = wcid->idx;
+ }
+ rcu_read_unlock();
+ }
+@@ -751,7 +756,7 @@ mt7996_mcu_bss_basic_tlv(struct sk_buff *skb,
+ bss->bcn_interval = cpu_to_le16(vif->bss_conf.beacon_int);
+ bss->dtim_period = vif->bss_conf.dtim_period;
+ bss->bmc_tx_wlan_idx = cpu_to_le16(wlan_idx);
+- bss->sta_idx = cpu_to_le16(wlan_idx);
++ bss->sta_idx = cpu_to_le16(sta_wlan_idx);
+ bss->conn_type = cpu_to_le32(type);
+ bss->omac_idx = mvif->omac_idx;
+ bss->band_idx = mvif->band_idx;
+diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mt7996.h b/drivers/net/wireless/mediatek/mt76/mt7996/mt7996.h
+index 4d7dcb95a620a..b8bcad717d89f 100644
+--- a/drivers/net/wireless/mediatek/mt76/mt7996/mt7996.h
++++ b/drivers/net/wireless/mediatek/mt76/mt7996/mt7996.h
+@@ -26,6 +26,7 @@
+
+ #define MT7996_RX_RING_SIZE 1536
+ #define MT7996_RX_MCU_RING_SIZE 512
++#define MT7996_RX_MCU_RING_SIZE_WA 1024
+
+ #define MT7996_FIRMWARE_WA "mediatek/mt7996/mt7996_wa.bin"
+ #define MT7996_FIRMWARE_WM "mediatek/mt7996/mt7996_wm.bin"
+diff --git a/drivers/net/wireless/mediatek/mt76/testmode.c b/drivers/net/wireless/mediatek/mt76/testmode.c
+index 0accc71a91c9a..4644dace9bb34 100644
+--- a/drivers/net/wireless/mediatek/mt76/testmode.c
++++ b/drivers/net/wireless/mediatek/mt76/testmode.c
+@@ -8,6 +8,7 @@ const struct nla_policy mt76_tm_policy[NUM_MT76_TM_ATTRS] = {
+ [MT76_TM_ATTR_RESET] = { .type = NLA_FLAG },
+ [MT76_TM_ATTR_STATE] = { .type = NLA_U8 },
+ [MT76_TM_ATTR_TX_COUNT] = { .type = NLA_U32 },
++ [MT76_TM_ATTR_TX_LENGTH] = { .type = NLA_U32 },
+ [MT76_TM_ATTR_TX_RATE_MODE] = { .type = NLA_U8 },
+ [MT76_TM_ATTR_TX_RATE_NSS] = { .type = NLA_U8 },
+ [MT76_TM_ATTR_TX_RATE_IDX] = { .type = NLA_U8 },
+diff --git a/drivers/net/wireless/mediatek/mt76/tx.c b/drivers/net/wireless/mediatek/mt76/tx.c
+index 72b3ec715e47a..e9b9728458a9b 100644
+--- a/drivers/net/wireless/mediatek/mt76/tx.c
++++ b/drivers/net/wireless/mediatek/mt76/tx.c
+@@ -121,6 +121,7 @@ int
+ mt76_tx_status_skb_add(struct mt76_dev *dev, struct mt76_wcid *wcid,
+ struct sk_buff *skb)
+ {
++ struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data;
+ struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
+ struct mt76_tx_cb *cb = mt76_tx_skb_cb(skb);
+ int pid;
+@@ -134,8 +135,14 @@ mt76_tx_status_skb_add(struct mt76_dev *dev, struct mt76_wcid *wcid,
+ return MT_PACKET_ID_NO_ACK;
+
+ if (!(info->flags & (IEEE80211_TX_CTL_REQ_TX_STATUS |
+- IEEE80211_TX_CTL_RATE_CTRL_PROBE)))
++ IEEE80211_TX_CTL_RATE_CTRL_PROBE))) {
++ if (mtk_wed_device_active(&dev->mmio.wed) &&
++ ((info->flags & IEEE80211_TX_CTL_HW_80211_ENCAP) ||
++ ieee80211_is_data(hdr->frame_control)))
++ return MT_PACKET_ID_WED;
++
+ return MT_PACKET_ID_NO_SKB;
++ }
+
+ spin_lock_bh(&dev->status_lock);
+
+diff --git a/drivers/net/wireless/realtek/rtw89/debug.c b/drivers/net/wireless/realtek/rtw89/debug.c
+index 858494ddfb12e..9bb09fdf931ab 100644
+--- a/drivers/net/wireless/realtek/rtw89/debug.c
++++ b/drivers/net/wireless/realtek/rtw89/debug.c
+@@ -3165,12 +3165,14 @@ static ssize_t rtw89_debug_priv_btc_manual_set(struct file *filp,
+ struct rtw89_dev *rtwdev = debugfs_priv->rtwdev;
+ struct rtw89_btc *btc = &rtwdev->btc;
+ bool btc_manual;
++ int ret;
+
+- if (kstrtobool_from_user(user_buf, count, &btc_manual))
+- goto out;
++ ret = kstrtobool_from_user(user_buf, count, &btc_manual);
++ if (ret)
++ return ret;
+
+ btc->ctrl.manual = btc_manual;
+-out:
++
+ return count;
+ }
+
+diff --git a/drivers/net/wireless/realtek/rtw89/fw.c b/drivers/net/wireless/realtek/rtw89/fw.c
+index b9b675bf9d050..60b201b24332f 100644
+--- a/drivers/net/wireless/realtek/rtw89/fw.c
++++ b/drivers/net/wireless/realtek/rtw89/fw.c
+@@ -305,31 +305,17 @@ rtw89_early_fw_feature_recognize(struct device *device,
+ struct rtw89_fw_info *early_fw,
+ int *used_fw_format)
+ {
+- union rtw89_compat_fw_hdr buf = {};
+ const struct firmware *firmware;
+- bool full_req = false;
+ char fw_name[64];
+ int fw_format;
+ u32 ver_code;
+ int ret;
+
+- /* If SECURITY_LOADPIN_ENFORCE is enabled, reading partial files will
+- * be denied (-EPERM). Then, we don't get right firmware things as
+- * expected. So, in this case, we have to request full firmware here.
+- */
+- if (IS_ENABLED(CONFIG_SECURITY_LOADPIN_ENFORCE))
+- full_req = true;
+-
+ for (fw_format = chip->fw_format_max; fw_format >= 0; fw_format--) {
+ rtw89_fw_get_filename(fw_name, sizeof(fw_name),
+ chip->fw_basename, fw_format);
+
+- if (full_req)
+- ret = request_firmware(&firmware, fw_name, device);
+- else
+- ret = request_partial_firmware_into_buf(&firmware, fw_name,
+- device, &buf, sizeof(buf),
+- 0);
++ ret = request_firmware(&firmware, fw_name, device);
+ if (!ret) {
+ dev_info(device, "loaded firmware %s\n", fw_name);
+ *used_fw_format = fw_format;
+@@ -342,10 +328,7 @@ rtw89_early_fw_feature_recognize(struct device *device,
+ return NULL;
+ }
+
+- if (full_req)
+- ver_code = rtw89_compat_fw_hdr_ver_code(firmware->data);
+- else
+- ver_code = rtw89_compat_fw_hdr_ver_code(&buf);
++ ver_code = rtw89_compat_fw_hdr_ver_code(firmware->data);
+
+ if (!ver_code)
+ goto out;
+@@ -353,11 +336,7 @@ rtw89_early_fw_feature_recognize(struct device *device,
+ rtw89_fw_iterate_feature_cfg(early_fw, chip, ver_code);
+
+ out:
+- if (full_req)
+- return firmware;
+-
+- release_firmware(firmware);
+- return NULL;
++ return firmware;
+ }
+
+ int rtw89_fw_recognize(struct rtw89_dev *rtwdev)
+diff --git a/drivers/net/wireless/realtek/rtw89/rtw8852b_rfk.c b/drivers/net/wireless/realtek/rtw89/rtw8852b_rfk.c
+index 722ae34b09c1f..b6fccb1cb7a5c 100644
+--- a/drivers/net/wireless/realtek/rtw89/rtw8852b_rfk.c
++++ b/drivers/net/wireless/realtek/rtw89/rtw8852b_rfk.c
+@@ -846,7 +846,7 @@ static bool _iqk_one_shot(struct rtw89_dev *rtwdev, enum rtw89_phy_idx phy_idx,
+ case ID_NBTXK:
+ rtw89_phy_write32_mask(rtwdev, R_P0_RFCTM, B_P0_RFCTM_EN, 0x0);
+ rtw89_phy_write32_mask(rtwdev, R_IQK_DIF4, B_IQK_DIF4_TXT, 0x011);
+- iqk_cmd = 0x308 | (1 << (4 + path));
++ iqk_cmd = 0x408 | (1 << (4 + path));
+ break;
+ case ID_NBRXK:
+ rtw89_phy_write32_mask(rtwdev, R_P0_RFCTM, B_P0_RFCTM_EN, 0x1);
+@@ -1078,7 +1078,7 @@ static bool _iqk_nbtxk(struct rtw89_dev *rtwdev, enum rtw89_phy_idx phy_idx, u8
+ {
+ struct rtw89_iqk_info *iqk_info = &rtwdev->iqk;
+ bool kfail;
+- u8 gp = 0x3;
++ u8 gp = 0x2;
+
+ switch (iqk_info->iqk_band[path]) {
+ case RTW89_BAND_2G:
+diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
+index 2abd2235bbcab..9532108d2dce1 100644
+--- a/drivers/ntb/ntb_transport.c
++++ b/drivers/ntb/ntb_transport.c
+@@ -909,7 +909,7 @@ static int ntb_set_mw(struct ntb_transport_ctx *nt, int num_mw,
+ return 0;
+ }
+
+-static void ntb_qp_link_down_reset(struct ntb_transport_qp *qp)
++static void ntb_qp_link_context_reset(struct ntb_transport_qp *qp)
+ {
+ qp->link_is_up = false;
+ qp->active = false;
+@@ -932,6 +932,13 @@ static void ntb_qp_link_down_reset(struct ntb_transport_qp *qp)
+ qp->tx_async = 0;
+ }
+
++static void ntb_qp_link_down_reset(struct ntb_transport_qp *qp)
++{
++ ntb_qp_link_context_reset(qp);
++ if (qp->remote_rx_info)
++ qp->remote_rx_info->entry = qp->rx_max_entry - 1;
++}
++
+ static void ntb_qp_link_cleanup(struct ntb_transport_qp *qp)
+ {
+ struct ntb_transport_ctx *nt = qp->transport;
+@@ -1174,7 +1181,7 @@ static int ntb_transport_init_queue(struct ntb_transport_ctx *nt,
+ qp->ndev = nt->ndev;
+ qp->client_ready = false;
+ qp->event_handler = NULL;
+- ntb_qp_link_down_reset(qp);
++ ntb_qp_link_context_reset(qp);
+
+ if (mw_num < qp_count % mw_count)
+ num_qps_mw = qp_count / mw_count + 1;
+@@ -2276,9 +2283,13 @@ int ntb_transport_tx_enqueue(struct ntb_transport_qp *qp, void *cb, void *data,
+ struct ntb_queue_entry *entry;
+ int rc;
+
+- if (!qp || !qp->link_is_up || !len)
++ if (!qp || !len)
+ return -EINVAL;
+
++ /* If the qp link is down already, just ignore. */
++ if (!qp->link_is_up)
++ return 0;
++
+ entry = ntb_list_rm(&qp->ntb_tx_free_q_lock, &qp->tx_free_q);
+ if (!entry) {
+ qp->tx_err_no_buf++;
+@@ -2418,7 +2429,7 @@ unsigned int ntb_transport_tx_free_entry(struct ntb_transport_qp *qp)
+ unsigned int head = qp->tx_index;
+ unsigned int tail = qp->remote_rx_info->entry;
+
+- return tail > head ? tail - head : qp->tx_max_entry + tail - head;
++ return tail >= head ? tail - head : qp->tx_max_entry + tail - head;
+ }
+ EXPORT_SYMBOL_GPL(ntb_transport_tx_free_entry);
+
+diff --git a/drivers/nvdimm/nd_perf.c b/drivers/nvdimm/nd_perf.c
+index 433bbb68ae641..2b6dc80d8fb5b 100644
+--- a/drivers/nvdimm/nd_perf.c
++++ b/drivers/nvdimm/nd_perf.c
+@@ -308,8 +308,8 @@ int register_nvdimm_pmu(struct nvdimm_pmu *nd_pmu, struct platform_device *pdev)
+
+ rc = perf_pmu_register(&nd_pmu->pmu, nd_pmu->pmu.name, -1);
+ if (rc) {
+- kfree(nd_pmu->pmu.attr_groups);
+ nvdimm_pmu_free_hotplug_memory(nd_pmu);
++ kfree(nd_pmu->pmu.attr_groups);
+ return rc;
+ }
+
+@@ -324,6 +324,7 @@ void unregister_nvdimm_pmu(struct nvdimm_pmu *nd_pmu)
+ {
+ perf_pmu_unregister(&nd_pmu->pmu);
+ nvdimm_pmu_free_hotplug_memory(nd_pmu);
++ kfree(nd_pmu->pmu.attr_groups);
+ kfree(nd_pmu);
+ }
+ EXPORT_SYMBOL_GPL(unregister_nvdimm_pmu);
+diff --git a/drivers/nvdimm/nd_virtio.c b/drivers/nvdimm/nd_virtio.c
+index c6a648fd8744a..1f8c667c6f1ee 100644
+--- a/drivers/nvdimm/nd_virtio.c
++++ b/drivers/nvdimm/nd_virtio.c
+@@ -105,7 +105,8 @@ int async_pmem_flush(struct nd_region *nd_region, struct bio *bio)
+ * parent bio. Otherwise directly call nd_region flush.
+ */
+ if (bio && bio->bi_iter.bi_sector != -1) {
+- struct bio *child = bio_alloc(bio->bi_bdev, 0, REQ_PREFLUSH,
++ struct bio *child = bio_alloc(bio->bi_bdev, 0,
++ REQ_OP_WRITE | REQ_PREFLUSH,
+ GFP_ATOMIC);
+
+ if (!child)
+diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
+index 7feb643f13707..28b479afd506f 100644
+--- a/drivers/of/overlay.c
++++ b/drivers/of/overlay.c
+@@ -752,8 +752,6 @@ static int init_overlay_changeset(struct overlay_changeset *ovcs)
+ if (!of_node_is_root(ovcs->overlay_root))
+ pr_debug("%s() ovcs->overlay_root is not root\n", __func__);
+
+- of_changeset_init(&ovcs->cset);
+-
+ cnt = 0;
+
+ /* fragment nodes */
+@@ -1013,6 +1011,7 @@ int of_overlay_fdt_apply(const void *overlay_fdt, u32 overlay_fdt_size,
+
+ INIT_LIST_HEAD(&ovcs->ovcs_list);
+ list_add_tail(&ovcs->ovcs_list, &ovcs_list);
++ of_changeset_init(&ovcs->cset);
+
+ /*
+ * Must create permanent copy of FDT because of_fdt_unflatten_tree()
+diff --git a/drivers/of/property.c b/drivers/of/property.c
+index ddc75cd50825e..cf8dacf3e3b84 100644
+--- a/drivers/of/property.c
++++ b/drivers/of/property.c
+@@ -1266,6 +1266,7 @@ DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
+ DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
+ DEFINE_SIMPLE_PROP(leds, "leds", NULL)
+ DEFINE_SIMPLE_PROP(backlight, "backlight", NULL)
++DEFINE_SIMPLE_PROP(panel, "panel", NULL)
+ DEFINE_SUFFIX_PROP(regulators, "-supply", NULL)
+ DEFINE_SUFFIX_PROP(gpio, "-gpio", "#gpio-cells")
+
+@@ -1354,6 +1355,7 @@ static const struct supplier_bindings of_supplier_bindings[] = {
+ { .parse_prop = parse_resets, },
+ { .parse_prop = parse_leds, },
+ { .parse_prop = parse_backlight, },
++ { .parse_prop = parse_panel, },
+ { .parse_prop = parse_gpio_compat, },
+ { .parse_prop = parse_interrupts, },
+ { .parse_prop = parse_regulators, },
+diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
+index 4fe02e9f7dcdd..03a1de841d3b7 100644
+--- a/drivers/of/unittest.c
++++ b/drivers/of/unittest.c
+@@ -77,7 +77,7 @@ static void __init of_unittest_find_node_by_name(void)
+
+ np = of_find_node_by_path("/testcase-data");
+ name = kasprintf(GFP_KERNEL, "%pOF", np);
+- unittest(np && !strcmp("/testcase-data", name),
++ unittest(np && name && !strcmp("/testcase-data", name),
+ "find /testcase-data failed\n");
+ of_node_put(np);
+ kfree(name);
+@@ -88,14 +88,14 @@ static void __init of_unittest_find_node_by_name(void)
+
+ np = of_find_node_by_path("/testcase-data/phandle-tests/consumer-a");
+ name = kasprintf(GFP_KERNEL, "%pOF", np);
+- unittest(np && !strcmp("/testcase-data/phandle-tests/consumer-a", name),
++ unittest(np && name && !strcmp("/testcase-data/phandle-tests/consumer-a", name),
+ "find /testcase-data/phandle-tests/consumer-a failed\n");
+ of_node_put(np);
+ kfree(name);
+
+ np = of_find_node_by_path("testcase-alias");
+ name = kasprintf(GFP_KERNEL, "%pOF", np);
+- unittest(np && !strcmp("/testcase-data", name),
++ unittest(np && name && !strcmp("/testcase-data", name),
+ "find testcase-alias failed\n");
+ of_node_put(np);
+ kfree(name);
+@@ -106,7 +106,7 @@ static void __init of_unittest_find_node_by_name(void)
+
+ np = of_find_node_by_path("testcase-alias/phandle-tests/consumer-a");
+ name = kasprintf(GFP_KERNEL, "%pOF", np);
+- unittest(np && !strcmp("/testcase-data/phandle-tests/consumer-a", name),
++ unittest(np && name && !strcmp("/testcase-data/phandle-tests/consumer-a", name),
+ "find testcase-alias/phandle-tests/consumer-a failed\n");
+ of_node_put(np);
+ kfree(name);
+@@ -1533,6 +1533,8 @@ static void attach_node_and_children(struct device_node *np)
+ const char *full_name;
+
+ full_name = kasprintf(GFP_KERNEL, "%pOF", np);
++ if (!full_name)
++ return;
+
+ if (!strcmp(full_name, "/__local_fixups__") ||
+ !strcmp(full_name, "/__fixups__")) {
+@@ -2208,7 +2210,7 @@ static int __init of_unittest_apply_revert_overlay_check(int overlay_nr,
+ of_unittest_untrack_overlay(save_ovcs_id);
+
+ /* unittest device must be again in before state */
+- if (of_unittest_device_exists(unittest_nr, PDEV_OVERLAY) != before) {
++ if (of_unittest_device_exists(unittest_nr, ovtype) != before) {
+ unittest(0, "%s with device @\"%s\" %s\n",
+ overlay_name_from_nr(overlay_nr),
+ unittest_path(unittest_nr, ovtype),
+diff --git a/drivers/opp/core.c b/drivers/opp/core.c
+index b5973fefdfd83..75b43c6c7031c 100644
+--- a/drivers/opp/core.c
++++ b/drivers/opp/core.c
+@@ -2382,7 +2382,7 @@ static int _opp_attach_genpd(struct opp_table *opp_table, struct device *dev,
+
+ virt_dev = dev_pm_domain_attach_by_name(dev, *name);
+ if (IS_ERR_OR_NULL(virt_dev)) {
+- ret = PTR_ERR(virt_dev) ? : -ENODEV;
++ ret = virt_dev ? PTR_ERR(virt_dev) : -ENODEV;
+ dev_err(dev, "Couldn't attach to pm_domain: %d\n", ret);
+ goto err;
+ }
+diff --git a/drivers/pci/access.c b/drivers/pci/access.c
+index 3c230ca3de584..0b2e90d2f04f2 100644
+--- a/drivers/pci/access.c
++++ b/drivers/pci/access.c
+@@ -497,8 +497,8 @@ int pcie_capability_write_dword(struct pci_dev *dev, int pos, u32 val)
+ }
+ EXPORT_SYMBOL(pcie_capability_write_dword);
+
+-int pcie_capability_clear_and_set_word(struct pci_dev *dev, int pos,
+- u16 clear, u16 set)
++int pcie_capability_clear_and_set_word_unlocked(struct pci_dev *dev, int pos,
++ u16 clear, u16 set)
+ {
+ int ret;
+ u16 val;
+@@ -512,7 +512,21 @@ int pcie_capability_clear_and_set_word(struct pci_dev *dev, int pos,
+
+ return ret;
+ }
+-EXPORT_SYMBOL(pcie_capability_clear_and_set_word);
++EXPORT_SYMBOL(pcie_capability_clear_and_set_word_unlocked);
++
++int pcie_capability_clear_and_set_word_locked(struct pci_dev *dev, int pos,
++ u16 clear, u16 set)
++{
++ unsigned long flags;
++ int ret;
++
++ spin_lock_irqsave(&dev->pcie_cap_lock, flags);
++ ret = pcie_capability_clear_and_set_word_unlocked(dev, pos, clear, set);
++ spin_unlock_irqrestore(&dev->pcie_cap_lock, flags);
++
++ return ret;
++}
++EXPORT_SYMBOL(pcie_capability_clear_and_set_word_locked);
+
+ int pcie_capability_clear_and_set_dword(struct pci_dev *dev, int pos,
+ u32 clear, u32 set)
+diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
+index 19b32839ea261..043b356d7d72d 100644
+--- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
++++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
+@@ -415,7 +415,7 @@ static int qcom_pcie_perst_deassert(struct dw_pcie *pci)
+ /* Gate Master AXI clock to MHI bus during L1SS */
+ val = readl_relaxed(pcie_ep->parf + PARF_MHI_CLOCK_RESET_CTRL);
+ val &= ~PARF_MSTR_AXI_CLK_EN;
+- val = readl_relaxed(pcie_ep->parf + PARF_MHI_CLOCK_RESET_CTRL);
++ writel_relaxed(val, pcie_ep->parf + PARF_MHI_CLOCK_RESET_CTRL);
+
+ dw_pcie_ep_init_notify(&pcie_ep->pci.ep);
+
+diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
+index e6eec85480ca9..15c8077f503f0 100644
+--- a/drivers/pci/controller/dwc/pcie-tegra194.c
++++ b/drivers/pci/controller/dwc/pcie-tegra194.c
+@@ -883,11 +883,6 @@ static int tegra_pcie_dw_host_init(struct dw_pcie_rp *pp)
+ pcie->pcie_cap_base = dw_pcie_find_capability(&pcie->pci,
+ PCI_CAP_ID_EXP);
+
+- val_16 = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_DEVCTL);
+- val_16 &= ~PCI_EXP_DEVCTL_PAYLOAD;
+- val_16 |= PCI_EXP_DEVCTL_PAYLOAD_256B;
+- dw_pcie_writew_dbi(pci, pcie->pcie_cap_base + PCI_EXP_DEVCTL, val_16);
+-
+ val = dw_pcie_readl_dbi(pci, PCI_IO_BASE);
+ val &= ~(IO_BASE_IO_DECODE | IO_BASE_IO_DECODE_BIT8);
+ dw_pcie_writel_dbi(pci, PCI_IO_BASE, val);
+@@ -1876,11 +1871,6 @@ static void pex_ep_event_pex_rst_deassert(struct tegra_pcie_dw *pcie)
+ pcie->pcie_cap_base = dw_pcie_find_capability(&pcie->pci,
+ PCI_CAP_ID_EXP);
+
+- val_16 = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_DEVCTL);
+- val_16 &= ~PCI_EXP_DEVCTL_PAYLOAD;
+- val_16 |= PCI_EXP_DEVCTL_PAYLOAD_256B;
+- dw_pcie_writew_dbi(pci, pcie->pcie_cap_base + PCI_EXP_DEVCTL, val_16);
+-
+ /* Clear Slot Clock Configuration bit if SRNS configuration */
+ if (pcie->enable_srns) {
+ val_16 = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base +
+diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
+index 2d93d0c4f10db..bed3cefdaf198 100644
+--- a/drivers/pci/controller/pci-hyperv.c
++++ b/drivers/pci/controller/pci-hyperv.c
+@@ -3983,6 +3983,9 @@ static int hv_pci_restore_msi_msg(struct pci_dev *pdev, void *arg)
+ struct msi_desc *entry;
+ int ret = 0;
+
++ if (!pdev->msi_enabled && !pdev->msix_enabled)
++ return 0;
++
+ msi_lock_descs(&pdev->dev);
+ msi_for_each_desc(entry, &pdev->dev, MSI_DESC_ASSOCIATED) {
+ irq_data = irq_get_irq_data(entry->irq);
+diff --git a/drivers/pci/controller/pcie-apple.c b/drivers/pci/controller/pcie-apple.c
+index 66f37e403a09c..2340dab6cd5bd 100644
+--- a/drivers/pci/controller/pcie-apple.c
++++ b/drivers/pci/controller/pcie-apple.c
+@@ -783,6 +783,10 @@ static int apple_pcie_init(struct pci_config_window *cfg)
+ cfg->priv = pcie;
+ INIT_LIST_HEAD(&pcie->ports);
+
++ ret = apple_msi_init(pcie);
++ if (ret)
++ return ret;
++
+ for_each_child_of_node(dev->of_node, of_port) {
+ ret = apple_pcie_setup_port(pcie, of_port);
+ if (ret) {
+@@ -792,7 +796,7 @@ static int apple_pcie_init(struct pci_config_window *cfg)
+ }
+ }
+
+- return apple_msi_init(pcie);
++ return 0;
+ }
+
+ static int apple_pcie_probe(struct platform_device *pdev)
+diff --git a/drivers/pci/controller/pcie-microchip-host.c b/drivers/pci/controller/pcie-microchip-host.c
+index 5e710e4854646..dd5245904c874 100644
+--- a/drivers/pci/controller/pcie-microchip-host.c
++++ b/drivers/pci/controller/pcie-microchip-host.c
+@@ -167,12 +167,12 @@
+ #define EVENT_PCIE_DLUP_EXIT 2
+ #define EVENT_SEC_TX_RAM_SEC_ERR 3
+ #define EVENT_SEC_RX_RAM_SEC_ERR 4
+-#define EVENT_SEC_AXI2PCIE_RAM_SEC_ERR 5
+-#define EVENT_SEC_PCIE2AXI_RAM_SEC_ERR 6
++#define EVENT_SEC_PCIE2AXI_RAM_SEC_ERR 5
++#define EVENT_SEC_AXI2PCIE_RAM_SEC_ERR 6
+ #define EVENT_DED_TX_RAM_DED_ERR 7
+ #define EVENT_DED_RX_RAM_DED_ERR 8
+-#define EVENT_DED_AXI2PCIE_RAM_DED_ERR 9
+-#define EVENT_DED_PCIE2AXI_RAM_DED_ERR 10
++#define EVENT_DED_PCIE2AXI_RAM_DED_ERR 9
++#define EVENT_DED_AXI2PCIE_RAM_DED_ERR 10
+ #define EVENT_LOCAL_DMA_END_ENGINE_0 11
+ #define EVENT_LOCAL_DMA_END_ENGINE_1 12
+ #define EVENT_LOCAL_DMA_ERROR_ENGINE_0 13
+diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
+index fe0333778fd93..6111de35f84ca 100644
+--- a/drivers/pci/controller/pcie-rockchip.h
++++ b/drivers/pci/controller/pcie-rockchip.h
+@@ -158,7 +158,9 @@
+ #define PCIE_RC_CONFIG_THP_CAP (PCIE_RC_CONFIG_BASE + 0x274)
+ #define PCIE_RC_CONFIG_THP_CAP_NEXT_MASK GENMASK(31, 20)
+
+-#define PCIE_ADDR_MASK 0xffffff00
++#define MAX_AXI_IB_ROOTPORT_REGION_NUM 3
++#define MIN_AXI_ADDR_BITS_PASSED 8
++#define PCIE_ADDR_MASK GENMASK_ULL(63, MIN_AXI_ADDR_BITS_PASSED)
+ #define PCIE_CORE_AXI_CONF_BASE 0xc00000
+ #define PCIE_CORE_OB_REGION_ADDR0 (PCIE_CORE_AXI_CONF_BASE + 0x0)
+ #define PCIE_CORE_OB_REGION_ADDR0_NUM_BITS 0x3f
+@@ -185,8 +187,6 @@
+ #define AXI_WRAPPER_TYPE1_CFG 0xb
+ #define AXI_WRAPPER_NOR_MSG 0xc
+
+-#define MAX_AXI_IB_ROOTPORT_REGION_NUM 3
+-#define MIN_AXI_ADDR_BITS_PASSED 8
+ #define PCIE_RC_SEND_PME_OFF 0x11960
+ #define ROCKCHIP_VENDOR_ID 0x1d87
+ #define PCIE_LINK_IS_L2(x) \
+diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
+index 1b97a5ab71a96..e3aab5edaf706 100644
+--- a/drivers/pci/doe.c
++++ b/drivers/pci/doe.c
+@@ -293,8 +293,8 @@ static int pci_doe_recv_resp(struct pci_doe_mb *doe_mb, struct pci_doe_task *tas
+ static void signal_task_complete(struct pci_doe_task *task, int rv)
+ {
+ task->rv = rv;
+- task->complete(task);
+ destroy_work_on_stack(&task->work);
++ task->complete(task);
+ }
+
+ static void signal_task_abort(struct pci_doe_task *task, int rv)
+diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
+index f8c70115b6917..5deb45d79f9de 100644
+--- a/drivers/pci/hotplug/pciehp_hpc.c
++++ b/drivers/pci/hotplug/pciehp_hpc.c
+@@ -332,17 +332,11 @@ int pciehp_check_link_status(struct controller *ctrl)
+ static int __pciehp_link_set(struct controller *ctrl, bool enable)
+ {
+ struct pci_dev *pdev = ctrl_dev(ctrl);
+- u16 lnk_ctrl;
+
+- pcie_capability_read_word(pdev, PCI_EXP_LNKCTL, &lnk_ctrl);
++ pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_LD,
++ enable ? 0 : PCI_EXP_LNKCTL_LD);
+
+- if (enable)
+- lnk_ctrl &= ~PCI_EXP_LNKCTL_LD;
+- else
+- lnk_ctrl |= PCI_EXP_LNKCTL_LD;
+-
+- pcie_capability_write_word(pdev, PCI_EXP_LNKCTL, lnk_ctrl);
+- ctrl_dbg(ctrl, "%s: lnk_ctrl = %x\n", __func__, lnk_ctrl);
+ return 0;
+ }
+
+diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
+index c779eb4d7fb84..499c7ffa4e3b2 100644
+--- a/drivers/pci/pci.c
++++ b/drivers/pci/pci.c
+@@ -1200,6 +1200,10 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
+ *
+ * On success, return 0 or 1, depending on whether or not it is necessary to
+ * restore the device's BARs subsequently (1 is returned in that case).
++ *
++ * On failure, return a negative error code. Always return failure if @dev
++ * lacks a Power Management Capability, even if the platform was able to
++ * put the device in D0 via non-PCI means.
+ */
+ int pci_power_up(struct pci_dev *dev)
+ {
+@@ -1216,9 +1220,6 @@ int pci_power_up(struct pci_dev *dev)
+ else
+ dev->current_state = state;
+
+- if (state == PCI_D0)
+- return 0;
+-
+ return -EIO;
+ }
+
+@@ -1276,8 +1277,12 @@ static int pci_set_full_power_state(struct pci_dev *dev)
+ int ret;
+
+ ret = pci_power_up(dev);
+- if (ret < 0)
++ if (ret < 0) {
++ if (dev->current_state == PCI_D0)
++ return 0;
++
+ return ret;
++ }
+
+ pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
+ dev->current_state = pmcsr & PCI_PM_CTRL_STATE_MASK;
+diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
+index 998e26de2ad76..75e51b965c0b7 100644
+--- a/drivers/pci/pcie/aspm.c
++++ b/drivers/pci/pcie/aspm.c
+@@ -250,7 +250,7 @@ static int pcie_retrain_link(struct pcie_link_state *link)
+ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
+ {
+ int same_clock = 1;
+- u16 reg16, parent_reg, child_reg[8];
++ u16 reg16, ccc, parent_old_ccc, child_old_ccc[8];
+ struct pci_dev *child, *parent = link->pdev;
+ struct pci_bus *linkbus = parent->subordinate;
+ /*
+@@ -272,6 +272,7 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
+
+ /* Port might be already in common clock mode */
+ pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
++ parent_old_ccc = reg16 & PCI_EXP_LNKCTL_CCC;
+ if (same_clock && (reg16 & PCI_EXP_LNKCTL_CCC)) {
+ bool consistent = true;
+
+@@ -288,34 +289,29 @@ static void pcie_aspm_configure_common_clock(struct pcie_link_state *link)
+ pci_info(parent, "ASPM: current common clock configuration is inconsistent, reconfiguring\n");
+ }
+
++ ccc = same_clock ? PCI_EXP_LNKCTL_CCC : 0;
+ /* Configure downstream component, all functions */
+ list_for_each_entry(child, &linkbus->devices, bus_list) {
+ pcie_capability_read_word(child, PCI_EXP_LNKCTL, &reg16);
+- child_reg[PCI_FUNC(child->devfn)] = reg16;
+- if (same_clock)
+- reg16 |= PCI_EXP_LNKCTL_CCC;
+- else
+- reg16 &= ~PCI_EXP_LNKCTL_CCC;
+- pcie_capability_write_word(child, PCI_EXP_LNKCTL, reg16);
++ child_old_ccc[PCI_FUNC(child->devfn)] = reg16 & PCI_EXP_LNKCTL_CCC;
++ pcie_capability_clear_and_set_word(child, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_CCC, ccc);
+ }
+
+ /* Configure upstream component */
+- pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &reg16);
+- parent_reg = reg16;
+- if (same_clock)
+- reg16 |= PCI_EXP_LNKCTL_CCC;
+- else
+- reg16 &= ~PCI_EXP_LNKCTL_CCC;
+- pcie_capability_write_word(parent, PCI_EXP_LNKCTL, reg16);
++ pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_CCC, ccc);
+
+ if (pcie_retrain_link(link)) {
+
+ /* Training failed. Restore common clock configurations */
+ pci_err(parent, "ASPM: Could not configure common clock\n");
+ list_for_each_entry(child, &linkbus->devices, bus_list)
+- pcie_capability_write_word(child, PCI_EXP_LNKCTL,
+- child_reg[PCI_FUNC(child->devfn)]);
+- pcie_capability_write_word(parent, PCI_EXP_LNKCTL, parent_reg);
++ pcie_capability_clear_and_set_word(child, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_CCC,
++ child_old_ccc[PCI_FUNC(child->devfn)]);
++ pcie_capability_clear_and_set_word(parent, PCI_EXP_LNKCTL,
++ PCI_EXP_LNKCTL_CCC, parent_old_ccc);
+ }
+ }
+
+diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
+index 00ed20ac0dd61..a7af7ece98f07 100644
+--- a/drivers/pci/probe.c
++++ b/drivers/pci/probe.c
+@@ -999,6 +999,7 @@ static int pci_register_host_bridge(struct pci_host_bridge *bridge)
+ res = window->res;
+ if (!res->flags && !res->start && !res->end) {
+ release_resource(res);
++ resource_list_destroy_entry(window);
+ continue;
+ }
+
+@@ -2320,6 +2321,7 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
+ .end = -1,
+ };
+
++ spin_lock_init(&dev->pcie_cap_lock);
+ #ifdef CONFIG_PCI_MSI
+ raw_spin_lock_init(&dev->msi_lock);
+ #endif
+diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c
+index 5222ba1e79d0e..c684aab407f86 100644
+--- a/drivers/perf/fsl_imx8_ddr_perf.c
++++ b/drivers/perf/fsl_imx8_ddr_perf.c
+@@ -101,6 +101,7 @@ struct ddr_pmu {
+ const struct fsl_ddr_devtype_data *devtype_data;
+ int irq;
+ int id;
++ int active_counter;
+ };
+
+ static ssize_t ddr_perf_identifier_show(struct device *dev,
+@@ -495,6 +496,10 @@ static void ddr_perf_event_start(struct perf_event *event, int flags)
+
+ ddr_perf_counter_enable(pmu, event->attr.config, counter, true);
+
++ if (!pmu->active_counter++)
++ ddr_perf_counter_enable(pmu, EVENT_CYCLES_ID,
++ EVENT_CYCLES_COUNTER, true);
++
+ hwc->state = 0;
+ }
+
+@@ -548,6 +553,10 @@ static void ddr_perf_event_stop(struct perf_event *event, int flags)
+ ddr_perf_counter_enable(pmu, event->attr.config, counter, false);
+ ddr_perf_event_update(event);
+
++ if (!--pmu->active_counter)
++ ddr_perf_counter_enable(pmu, EVENT_CYCLES_ID,
++ EVENT_CYCLES_COUNTER, false);
++
+ hwc->state |= PERF_HES_STOPPED;
+ }
+
+@@ -565,25 +574,10 @@ static void ddr_perf_event_del(struct perf_event *event, int flags)
+
+ static void ddr_perf_pmu_enable(struct pmu *pmu)
+ {
+- struct ddr_pmu *ddr_pmu = to_ddr_pmu(pmu);
+-
+- /* enable cycle counter if cycle is not active event list */
+- if (ddr_pmu->events[EVENT_CYCLES_COUNTER] == NULL)
+- ddr_perf_counter_enable(ddr_pmu,
+- EVENT_CYCLES_ID,
+- EVENT_CYCLES_COUNTER,
+- true);
+ }
+
+ static void ddr_perf_pmu_disable(struct pmu *pmu)
+ {
+- struct ddr_pmu *ddr_pmu = to_ddr_pmu(pmu);
+-
+- if (ddr_pmu->events[EVENT_CYCLES_COUNTER] == NULL)
+- ddr_perf_counter_enable(ddr_pmu,
+- EVENT_CYCLES_ID,
+- EVENT_CYCLES_COUNTER,
+- false);
+ }
+
+ static int ddr_perf_init(struct ddr_pmu *pmu, void __iomem *base,
+diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
+index 6170f8fd118e2..d0319bee01c0f 100644
+--- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
++++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
+@@ -214,8 +214,7 @@ static int __maybe_unused qcom_snps_hsphy_runtime_suspend(struct device *dev)
+ if (!hsphy->phy_initialized)
+ return 0;
+
+- qcom_snps_hsphy_suspend(hsphy);
+- return 0;
++ return qcom_snps_hsphy_suspend(hsphy);
+ }
+
+ static int __maybe_unused qcom_snps_hsphy_runtime_resume(struct device *dev)
+@@ -225,8 +224,7 @@ static int __maybe_unused qcom_snps_hsphy_runtime_resume(struct device *dev)
+ if (!hsphy->phy_initialized)
+ return 0;
+
+- qcom_snps_hsphy_resume(hsphy);
+- return 0;
++ return qcom_snps_hsphy_resume(hsphy);
+ }
+
+ static int qcom_snps_hsphy_set_mode(struct phy *phy, enum phy_mode mode,
+diff --git a/drivers/phy/rockchip/phy-rockchip-inno-hdmi.c b/drivers/phy/rockchip/phy-rockchip-inno-hdmi.c
+index 1e1563f5fffc4..fbdc23953b52e 100644
+--- a/drivers/phy/rockchip/phy-rockchip-inno-hdmi.c
++++ b/drivers/phy/rockchip/phy-rockchip-inno-hdmi.c
+@@ -745,10 +745,12 @@ unsigned long inno_hdmi_phy_rk3328_clk_recalc_rate(struct clk_hw *hw,
+ do_div(vco, (nd * (no_a == 1 ? no_b : no_a) * no_d * 2));
+ }
+
+- inno->pixclock = vco;
+- dev_dbg(inno->dev, "%s rate %lu\n", __func__, inno->pixclock);
++ inno->pixclock = DIV_ROUND_CLOSEST((unsigned long)vco, 1000) * 1000;
+
+- return vco;
++ dev_dbg(inno->dev, "%s rate %lu vco %llu\n",
++ __func__, inno->pixclock, vco);
++
++ return inno->pixclock;
+ }
+
+ static long inno_hdmi_phy_rk3328_clk_round_rate(struct clk_hw *hw,
+@@ -790,8 +792,8 @@ static int inno_hdmi_phy_rk3328_clk_set_rate(struct clk_hw *hw,
+ RK3328_PRE_PLL_POWER_DOWN);
+
+ /* Configure pre-pll */
+- inno_update_bits(inno, 0xa0, RK3228_PCLK_VCO_DIV_5_MASK,
+- RK3228_PCLK_VCO_DIV_5(cfg->vco_div_5_en));
++ inno_update_bits(inno, 0xa0, RK3328_PCLK_VCO_DIV_5_MASK,
++ RK3328_PCLK_VCO_DIV_5(cfg->vco_div_5_en));
+ inno_write(inno, 0xa1, RK3328_PRE_PLL_PRE_DIV(cfg->prediv));
+
+ val = RK3328_SPREAD_SPECTRUM_MOD_DISABLE;
+@@ -1021,9 +1023,10 @@ inno_hdmi_phy_rk3328_power_on(struct inno_hdmi_phy *inno,
+
+ inno_write(inno, 0xac, RK3328_POST_PLL_FB_DIV_7_0(cfg->fbdiv));
+ if (cfg->postdiv == 1) {
+- inno_write(inno, 0xaa, RK3328_POST_PLL_REFCLK_SEL_TMDS);
+ inno_write(inno, 0xab, RK3328_POST_PLL_FB_DIV_8(cfg->fbdiv) |
+ RK3328_POST_PLL_PRE_DIV(cfg->prediv));
++ inno_write(inno, 0xaa, RK3328_POST_PLL_REFCLK_SEL_TMDS |
++ RK3328_POST_PLL_POWER_DOWN);
+ } else {
+ v = (cfg->postdiv / 2) - 1;
+ v &= RK3328_POST_PLL_POST_DIV_MASK;
+@@ -1031,7 +1034,8 @@ inno_hdmi_phy_rk3328_power_on(struct inno_hdmi_phy *inno,
+ inno_write(inno, 0xab, RK3328_POST_PLL_FB_DIV_8(cfg->fbdiv) |
+ RK3328_POST_PLL_PRE_DIV(cfg->prediv));
+ inno_write(inno, 0xaa, RK3328_POST_PLL_POST_DIV_ENABLE |
+- RK3328_POST_PLL_REFCLK_SEL_TMDS);
++ RK3328_POST_PLL_REFCLK_SEL_TMDS |
++ RK3328_POST_PLL_POWER_DOWN);
+ }
+
+ for (v = 0; v < 14; v++)
+diff --git a/drivers/pinctrl/mediatek/pinctrl-mt7981.c b/drivers/pinctrl/mediatek/pinctrl-mt7981.c
+index 18abc57800111..0fd2c0c451f95 100644
+--- a/drivers/pinctrl/mediatek/pinctrl-mt7981.c
++++ b/drivers/pinctrl/mediatek/pinctrl-mt7981.c
+@@ -457,37 +457,15 @@ static const unsigned int mt7981_pull_type[] = {
+ MTK_PULL_PUPD_R1R0_TYPE,/*34*/ MTK_PULL_PUPD_R1R0_TYPE,/*35*/
+ MTK_PULL_PUPD_R1R0_TYPE,/*36*/ MTK_PULL_PUPD_R1R0_TYPE,/*37*/
+ MTK_PULL_PUPD_R1R0_TYPE,/*38*/ MTK_PULL_PUPD_R1R0_TYPE,/*39*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*40*/ MTK_PULL_PUPD_R1R0_TYPE,/*41*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*42*/ MTK_PULL_PUPD_R1R0_TYPE,/*43*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*44*/ MTK_PULL_PUPD_R1R0_TYPE,/*45*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*46*/ MTK_PULL_PUPD_R1R0_TYPE,/*47*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*48*/ MTK_PULL_PUPD_R1R0_TYPE,/*49*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*50*/ MTK_PULL_PUPD_R1R0_TYPE,/*51*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*52*/ MTK_PULL_PUPD_R1R0_TYPE,/*53*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*54*/ MTK_PULL_PUPD_R1R0_TYPE,/*55*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*56*/ MTK_PULL_PUPD_R1R0_TYPE,/*57*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*58*/ MTK_PULL_PUPD_R1R0_TYPE,/*59*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*60*/ MTK_PULL_PUPD_R1R0_TYPE,/*61*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*62*/ MTK_PULL_PUPD_R1R0_TYPE,/*63*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*64*/ MTK_PULL_PUPD_R1R0_TYPE,/*65*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*66*/ MTK_PULL_PUPD_R1R0_TYPE,/*67*/
+- MTK_PULL_PUPD_R1R0_TYPE,/*68*/ MTK_PULL_PU_PD_TYPE,/*69*/
+- MTK_PULL_PU_PD_TYPE,/*70*/ MTK_PULL_PU_PD_TYPE,/*71*/
+- MTK_PULL_PU_PD_TYPE,/*72*/ MTK_PULL_PU_PD_TYPE,/*73*/
+- MTK_PULL_PU_PD_TYPE,/*74*/ MTK_PULL_PU_PD_TYPE,/*75*/
+- MTK_PULL_PU_PD_TYPE,/*76*/ MTK_PULL_PU_PD_TYPE,/*77*/
+- MTK_PULL_PU_PD_TYPE,/*78*/ MTK_PULL_PU_PD_TYPE,/*79*/
+- MTK_PULL_PU_PD_TYPE,/*80*/ MTK_PULL_PU_PD_TYPE,/*81*/
+- MTK_PULL_PU_PD_TYPE,/*82*/ MTK_PULL_PU_PD_TYPE,/*83*/
+- MTK_PULL_PU_PD_TYPE,/*84*/ MTK_PULL_PU_PD_TYPE,/*85*/
+- MTK_PULL_PU_PD_TYPE,/*86*/ MTK_PULL_PU_PD_TYPE,/*87*/
+- MTK_PULL_PU_PD_TYPE,/*88*/ MTK_PULL_PU_PD_TYPE,/*89*/
+- MTK_PULL_PU_PD_TYPE,/*90*/ MTK_PULL_PU_PD_TYPE,/*91*/
+- MTK_PULL_PU_PD_TYPE,/*92*/ MTK_PULL_PU_PD_TYPE,/*93*/
+- MTK_PULL_PU_PD_TYPE,/*94*/ MTK_PULL_PU_PD_TYPE,/*95*/
+- MTK_PULL_PU_PD_TYPE,/*96*/ MTK_PULL_PU_PD_TYPE,/*97*/
+- MTK_PULL_PU_PD_TYPE,/*98*/ MTK_PULL_PU_PD_TYPE,/*99*/
+- MTK_PULL_PU_PD_TYPE,/*100*/
++ MTK_PULL_PU_PD_TYPE,/*40*/ MTK_PULL_PU_PD_TYPE,/*41*/
++ MTK_PULL_PU_PD_TYPE,/*42*/ MTK_PULL_PU_PD_TYPE,/*43*/
++ MTK_PULL_PU_PD_TYPE,/*44*/ MTK_PULL_PU_PD_TYPE,/*45*/
++ MTK_PULL_PU_PD_TYPE,/*46*/ MTK_PULL_PU_PD_TYPE,/*47*/
++ MTK_PULL_PU_PD_TYPE,/*48*/ MTK_PULL_PU_PD_TYPE,/*49*/
++ MTK_PULL_PU_PD_TYPE,/*50*/ MTK_PULL_PU_PD_TYPE,/*51*/
++ MTK_PULL_PU_PD_TYPE,/*52*/ MTK_PULL_PU_PD_TYPE,/*53*/
++ MTK_PULL_PU_PD_TYPE,/*54*/ MTK_PULL_PU_PD_TYPE,/*55*/
++ MTK_PULL_PU_PD_TYPE,/*56*/
+ };
+
+ static const struct mtk_pin_reg_calc mt7981_reg_cals[] = {
+@@ -1014,6 +992,10 @@ static struct mtk_pin_soc mt7981_data = {
+ .ies_present = false,
+ .base_names = mt7981_pinctrl_register_base_names,
+ .nbase_names = ARRAY_SIZE(mt7981_pinctrl_register_base_names),
++ .bias_disable_set = mtk_pinconf_bias_disable_set,
++ .bias_disable_get = mtk_pinconf_bias_disable_get,
++ .bias_set = mtk_pinconf_bias_set,
++ .bias_get = mtk_pinconf_bias_get,
+ .pull_type = mt7981_pull_type,
+ .bias_set_combo = mtk_pinconf_bias_set_combo,
+ .bias_get_combo = mtk_pinconf_bias_get_combo,
+diff --git a/drivers/pinctrl/mediatek/pinctrl-mt7986.c b/drivers/pinctrl/mediatek/pinctrl-mt7986.c
+index aa0ccd67f4f4e..acaac9b38aa8a 100644
+--- a/drivers/pinctrl/mediatek/pinctrl-mt7986.c
++++ b/drivers/pinctrl/mediatek/pinctrl-mt7986.c
+@@ -922,6 +922,10 @@ static struct mtk_pin_soc mt7986a_data = {
+ .ies_present = false,
+ .base_names = mt7986_pinctrl_register_base_names,
+ .nbase_names = ARRAY_SIZE(mt7986_pinctrl_register_base_names),
++ .bias_disable_set = mtk_pinconf_bias_disable_set,
++ .bias_disable_get = mtk_pinconf_bias_disable_get,
++ .bias_set = mtk_pinconf_bias_set,
++ .bias_get = mtk_pinconf_bias_get,
+ .pull_type = mt7986_pull_type,
+ .bias_set_combo = mtk_pinconf_bias_set_combo,
+ .bias_get_combo = mtk_pinconf_bias_get_combo,
+@@ -944,6 +948,10 @@ static struct mtk_pin_soc mt7986b_data = {
+ .ies_present = false,
+ .base_names = mt7986_pinctrl_register_base_names,
+ .nbase_names = ARRAY_SIZE(mt7986_pinctrl_register_base_names),
++ .bias_disable_set = mtk_pinconf_bias_disable_set,
++ .bias_disable_get = mtk_pinconf_bias_disable_get,
++ .bias_set = mtk_pinconf_bias_set,
++ .bias_get = mtk_pinconf_bias_get,
+ .pull_type = mt7986_pull_type,
+ .bias_set_combo = mtk_pinconf_bias_set_combo,
+ .bias_get_combo = mtk_pinconf_bias_get_combo,
+diff --git a/drivers/pinctrl/pinctrl-mcp23s08_spi.c b/drivers/pinctrl/pinctrl-mcp23s08_spi.c
+index 9ae10318f6f35..ea059b9c5542e 100644
+--- a/drivers/pinctrl/pinctrl-mcp23s08_spi.c
++++ b/drivers/pinctrl/pinctrl-mcp23s08_spi.c
+@@ -91,18 +91,28 @@ static int mcp23s08_spi_regmap_init(struct mcp23s08 *mcp, struct device *dev,
+ mcp->reg_shift = 0;
+ mcp->chip.ngpio = 8;
+ mcp->chip.label = devm_kasprintf(dev, GFP_KERNEL, "mcp23s08.%d", addr);
++ if (!mcp->chip.label)
++ return -ENOMEM;
+
+ config = &mcp23x08_regmap;
+ name = devm_kasprintf(dev, GFP_KERNEL, "%d", addr);
++ if (!name)
++ return -ENOMEM;
++
+ break;
+
+ case MCP_TYPE_S17:
+ mcp->reg_shift = 1;
+ mcp->chip.ngpio = 16;
+ mcp->chip.label = devm_kasprintf(dev, GFP_KERNEL, "mcp23s17.%d", addr);
++ if (!mcp->chip.label)
++ return -ENOMEM;
+
+ config = &mcp23x17_regmap;
+ name = devm_kasprintf(dev, GFP_KERNEL, "%d", addr);
++ if (!name)
++ return -ENOMEM;
++
+ break;
+
+ case MCP_TYPE_S18:
+diff --git a/drivers/platform/chrome/chromeos_acpi.c b/drivers/platform/chrome/chromeos_acpi.c
+index 50d8a4d4352d6..1312aaaa8750b 100644
+--- a/drivers/platform/chrome/chromeos_acpi.c
++++ b/drivers/platform/chrome/chromeos_acpi.c
+@@ -90,7 +90,36 @@ static int chromeos_acpi_handle_package(struct device *dev, union acpi_object *o
+ case ACPI_TYPE_STRING:
+ return sysfs_emit(buf, "%s\n", element->string.pointer);
+ case ACPI_TYPE_BUFFER:
+- return sysfs_emit(buf, "%s\n", element->buffer.pointer);
++ {
++ int i, r, at, room_left;
++ const int byte_per_line = 16;
++
++ at = 0;
++ room_left = PAGE_SIZE - 1;
++ for (i = 0; i < element->buffer.length && room_left; i += byte_per_line) {
++ r = hex_dump_to_buffer(element->buffer.pointer + i,
++ element->buffer.length - i,
++ byte_per_line, 1, buf + at, room_left,
++ false);
++ if (r > room_left)
++ goto truncating;
++ at += r;
++ room_left -= r;
++
++ r = sysfs_emit_at(buf, at, "\n");
++ if (!r)
++ goto truncating;
++ at += r;
++ room_left -= r;
++ }
++
++ buf[at] = 0;
++ return at;
++truncating:
++ dev_info_once(dev, "truncating sysfs content for %s\n", name);
++ sysfs_emit_at(buf, PAGE_SIZE - 4, "..\n");
++ return PAGE_SIZE - 1;
++ }
+ default:
+ dev_err(dev, "element type %d not supported\n", element->type);
+ return -EINVAL;
+diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c b/drivers/platform/mellanox/mlxbf-tmfifo.c
+index a79318e90a139..b600b77d91ef2 100644
+--- a/drivers/platform/mellanox/mlxbf-tmfifo.c
++++ b/drivers/platform/mellanox/mlxbf-tmfifo.c
+@@ -887,6 +887,7 @@ static bool mlxbf_tmfifo_virtio_notify(struct virtqueue *vq)
+ tm_vdev = fifo->vdev[VIRTIO_ID_CONSOLE];
+ mlxbf_tmfifo_console_output(tm_vdev, vring);
+ spin_unlock_irqrestore(&fifo->spin_lock[0], flags);
++ set_bit(MLXBF_TM_TX_LWM_IRQ, &fifo->pend_events);
+ } else if (test_and_set_bit(MLXBF_TM_TX_LWM_IRQ,
+ &fifo->pend_events)) {
+ return true;
+diff --git a/drivers/platform/x86/amd/pmf/core.c b/drivers/platform/x86/amd/pmf/core.c
+index a022325161273..6fa33a05c5a75 100644
+--- a/drivers/platform/x86/amd/pmf/core.c
++++ b/drivers/platform/x86/amd/pmf/core.c
+@@ -322,7 +322,8 @@ static void amd_pmf_init_features(struct amd_pmf_dev *dev)
+
+ static void amd_pmf_deinit_features(struct amd_pmf_dev *dev)
+ {
+- if (is_apmf_func_supported(dev, APMF_FUNC_STATIC_SLIDER_GRANULAR)) {
++ if (is_apmf_func_supported(dev, APMF_FUNC_STATIC_SLIDER_GRANULAR) ||
++ is_apmf_func_supported(dev, APMF_FUNC_OS_POWER_SLIDER_UPDATE)) {
+ power_supply_unreg_notifier(&dev->pwr_src_notifier);
+ amd_pmf_deinit_sps(dev);
+ }
+diff --git a/drivers/platform/x86/amd/pmf/sps.c b/drivers/platform/x86/amd/pmf/sps.c
+index fd448844de206..b2cf62937227c 100644
+--- a/drivers/platform/x86/amd/pmf/sps.c
++++ b/drivers/platform/x86/amd/pmf/sps.c
+@@ -121,7 +121,8 @@ int amd_pmf_get_pprof_modes(struct amd_pmf_dev *pmf)
+
+ int amd_pmf_power_slider_update_event(struct amd_pmf_dev *dev)
+ {
+- u8 mode, flag = 0;
++ u8 flag = 0;
++ int mode;
+ int src;
+
+ mode = amd_pmf_get_pprof_modes(dev);
+diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
+index 1038dfdcdd325..8bef66a2f0ce7 100644
+--- a/drivers/platform/x86/asus-wmi.c
++++ b/drivers/platform/x86/asus-wmi.c
+@@ -738,13 +738,23 @@ static ssize_t kbd_rgb_mode_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+ {
+- u32 cmd, mode, r, g, b, speed;
++ u32 cmd, mode, r, g, b, speed;
+ int err;
+
+ if (sscanf(buf, "%d %d %d %d %d %d", &cmd, &mode, &r, &g, &b, &speed) != 6)
+ return -EINVAL;
+
+- cmd = !!cmd;
++ /* B3 is set and B4 is save to BIOS */
++ switch (cmd) {
++ case 0:
++ cmd = 0xb3;
++ break;
++ case 1:
++ cmd = 0xb4;
++ break;
++ default:
++ return -EINVAL;
++ }
+
+ /* These are the known usable modes across all TUF/ROG */
+ if (mode >= 12 || mode == 9)
+diff --git a/drivers/platform/x86/dell/dell-wmi-sysman/sysman.c b/drivers/platform/x86/dell/dell-wmi-sysman/sysman.c
+index 0285b47d99d13..1dd5aa37ecc8b 100644
+--- a/drivers/platform/x86/dell/dell-wmi-sysman/sysman.c
++++ b/drivers/platform/x86/dell/dell-wmi-sysman/sysman.c
+@@ -396,6 +396,7 @@ static int init_bios_attributes(int attr_type, const char *guid)
+ struct kobject *attr_name_kobj; //individual attribute names
+ union acpi_object *obj = NULL;
+ union acpi_object *elements;
++ struct kobject *duplicate;
+ struct kset *tmp_set;
+ int min_elements;
+
+@@ -454,9 +455,11 @@ static int init_bios_attributes(int attr_type, const char *guid)
+ else
+ tmp_set = wmi_priv.main_dir_kset;
+
+- if (kset_find_obj(tmp_set, elements[ATTR_NAME].string.pointer)) {
+- pr_debug("duplicate attribute name found - %s\n",
+- elements[ATTR_NAME].string.pointer);
++ duplicate = kset_find_obj(tmp_set, elements[ATTR_NAME].string.pointer);
++ if (duplicate) {
++ pr_debug("Duplicate attribute name found - %s\n",
++ elements[ATTR_NAME].string.pointer);
++ kobject_put(duplicate);
+ goto nextobj;
+ }
+
+diff --git a/drivers/platform/x86/huawei-wmi.c b/drivers/platform/x86/huawei-wmi.c
+index 70e5c4c0574d5..0ef1c46b617b6 100644
+--- a/drivers/platform/x86/huawei-wmi.c
++++ b/drivers/platform/x86/huawei-wmi.c
+@@ -85,6 +85,8 @@ static const struct key_entry huawei_wmi_keymap[] = {
+ { KE_IGNORE, 0x293, { KEY_KBDILLUMTOGGLE } },
+ { KE_IGNORE, 0x294, { KEY_KBDILLUMUP } },
+ { KE_IGNORE, 0x295, { KEY_KBDILLUMUP } },
++ // Ignore Ambient Light Sensoring
++ { KE_KEY, 0x2c1, { KEY_RESERVED } },
+ { KE_END, 0 }
+ };
+
+diff --git a/drivers/platform/x86/intel/hid.c b/drivers/platform/x86/intel/hid.c
+index 5632bd3c534a3..7457ca2b27a60 100644
+--- a/drivers/platform/x86/intel/hid.c
++++ b/drivers/platform/x86/intel/hid.c
+@@ -150,6 +150,12 @@ static const struct dmi_system_id dmi_vgbs_allow_list[] = {
+ DMI_MATCH(DMI_PRODUCT_NAME, "Surface Go"),
+ },
+ },
++ {
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "HP"),
++ DMI_MATCH(DMI_PRODUCT_NAME, "HP Elite Dragonfly G2 Notebook PC"),
++ },
++ },
+ { }
+ };
+
+@@ -620,7 +626,7 @@ static bool button_array_present(struct platform_device *device)
+ static int intel_hid_probe(struct platform_device *device)
+ {
+ acpi_handle handle = ACPI_HANDLE(&device->dev);
+- unsigned long long mode;
++ unsigned long long mode, dummy;
+ struct intel_hid_priv *priv;
+ acpi_status status;
+ int err;
+@@ -692,18 +698,15 @@ static int intel_hid_probe(struct platform_device *device)
+ if (err)
+ goto err_remove_notify;
+
+- if (priv->array) {
+- unsigned long long dummy;
++ intel_button_array_enable(&device->dev, true);
+
+- intel_button_array_enable(&device->dev, true);
+-
+- /* Call button load method to enable HID power button */
+- if (!intel_hid_evaluate_method(handle, INTEL_HID_DSM_BTNL_FN,
+- &dummy)) {
+- dev_warn(&device->dev,
+- "failed to enable HID power button\n");
+- }
+- }
++ /*
++ * Call button load method to enable HID power button
++ * Always do this since it activates events on some devices without
++ * a button array too.
++ */
++ if (!intel_hid_evaluate_method(handle, INTEL_HID_DSM_BTNL_FN, &dummy))
++ dev_warn(&device->dev, "failed to enable HID power button\n");
+
+ device_init_wakeup(&device->dev, true);
+ /*
+diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c
+index e4047ee0a7546..63eca13fd882f 100644
+--- a/drivers/platform/x86/think-lmi.c
++++ b/drivers/platform/x86/think-lmi.c
+@@ -719,12 +719,12 @@ static ssize_t cert_to_password_store(struct kobject *kobj,
+ /* Format: 'Password,Signature' */
+ auth_str = kasprintf(GFP_KERNEL, "%s,%s", passwd, setting->signature);
+ if (!auth_str) {
+- kfree(passwd);
++ kfree_sensitive(passwd);
+ return -ENOMEM;
+ }
+ ret = tlmi_simple_call(LENOVO_CERT_TO_PASSWORD_GUID, auth_str);
+ kfree(auth_str);
+- kfree(passwd);
++ kfree_sensitive(passwd);
+
+ return ret ?: count;
+ }
+diff --git a/drivers/powercap/arm_scmi_powercap.c b/drivers/powercap/arm_scmi_powercap.c
+index 05d0e516176a5..5d7330280bd83 100644
+--- a/drivers/powercap/arm_scmi_powercap.c
++++ b/drivers/powercap/arm_scmi_powercap.c
+@@ -12,6 +12,7 @@
+ #include <linux/module.h>
+ #include <linux/powercap.h>
+ #include <linux/scmi_protocol.h>
++#include <linux/slab.h>
+
+ #define to_scmi_powercap_zone(z) \
+ container_of(z, struct scmi_powercap_zone, zone)
+@@ -19,6 +20,8 @@
+ static const struct scmi_powercap_proto_ops *powercap_ops;
+
+ struct scmi_powercap_zone {
++ bool registered;
++ bool invalid;
+ unsigned int height;
+ struct device *dev;
+ struct scmi_protocol_handle *ph;
+@@ -32,6 +35,7 @@ struct scmi_powercap_root {
+ unsigned int num_zones;
+ struct scmi_powercap_zone *spzones;
+ struct list_head *registered_zones;
++ struct list_head scmi_zones;
+ };
+
+ static struct powercap_control_type *scmi_top_pcntrl;
+@@ -255,12 +259,6 @@ static void scmi_powercap_unregister_all_zones(struct scmi_powercap_root *pr)
+ }
+ }
+
+-static inline bool
+-scmi_powercap_is_zone_registered(struct scmi_powercap_zone *spz)
+-{
+- return !list_empty(&spz->node);
+-}
+-
+ static inline unsigned int
+ scmi_powercap_get_zone_height(struct scmi_powercap_zone *spz)
+ {
+@@ -279,11 +277,46 @@ scmi_powercap_get_parent_zone(struct scmi_powercap_zone *spz)
+ return &spz->spzones[spz->info->parent_id];
+ }
+
++static int scmi_powercap_register_zone(struct scmi_powercap_root *pr,
++ struct scmi_powercap_zone *spz,
++ struct scmi_powercap_zone *parent)
++{
++ int ret = 0;
++ struct powercap_zone *z;
++
++ if (spz->invalid) {
++ list_del(&spz->node);
++ return -EINVAL;
++ }
++
++ z = powercap_register_zone(&spz->zone, scmi_top_pcntrl, spz->info->name,
++ parent ? &parent->zone : NULL,
++ &zone_ops, 1, &constraint_ops);
++ if (!IS_ERR(z)) {
++ spz->height = scmi_powercap_get_zone_height(spz);
++ spz->registered = true;
++ list_move(&spz->node, &pr->registered_zones[spz->height]);
++ dev_dbg(spz->dev, "Registered node %s - parent %s - height:%d\n",
++ spz->info->name, parent ? parent->info->name : "ROOT",
++ spz->height);
++ } else {
++ list_del(&spz->node);
++ ret = PTR_ERR(z);
++ dev_err(spz->dev,
++ "Error registering node:%s - parent:%s - h:%d - ret:%d\n",
++ spz->info->name,
++ parent ? parent->info->name : "ROOT",
++ spz->height, ret);
++ }
++
++ return ret;
++}
++
+ /**
+- * scmi_powercap_register_zone - Register an SCMI powercap zone recursively
++ * scmi_zones_register- Register SCMI powercap zones starting from parent zones
+ *
++ * @dev: A reference to the SCMI device
+ * @pr: A reference to the root powercap zones descriptors
+- * @spz: A reference to the SCMI powercap zone to register
+ *
+ * When registering SCMI powercap zones with the powercap framework we should
+ * take care to always register zones starting from the root ones and to
+@@ -293,10 +326,10 @@ scmi_powercap_get_parent_zone(struct scmi_powercap_zone *spz)
+ * zones provided by the SCMI platform firmware is built to comply with such
+ * requirement.
+ *
+- * This function, given an SCMI powercap zone to register, takes care to walk
+- * the SCMI powercap zones tree up to the root looking recursively for
+- * unregistered parent zones before registering the provided zone; at the same
+- * time each registered zone height in such a tree is accounted for and each
++ * This function, given the set of SCMI powercap zones to register, takes care
++ * to walk the SCMI powercap zones trees up to the root registering any
++ * unregistered parent zone before registering the child zones; at the same
++ * time each registered-zone height in such a tree is accounted for and each
+ * zone, once registered, is stored in the @registered_zones array that is
+ * indexed by zone height: this way will be trivial, at unregister time, to walk
+ * the @registered_zones array backward and unregister all the zones starting
+@@ -314,57 +347,55 @@ scmi_powercap_get_parent_zone(struct scmi_powercap_zone *spz)
+ *
+ * Return: 0 on Success
+ */
+-static int scmi_powercap_register_zone(struct scmi_powercap_root *pr,
+- struct scmi_powercap_zone *spz)
++static int scmi_zones_register(struct device *dev,
++ struct scmi_powercap_root *pr)
+ {
+ int ret = 0;
+- struct scmi_powercap_zone *parent;
+-
+- if (!spz->info)
+- return ret;
++ unsigned int sp = 0, reg_zones = 0;
++ struct scmi_powercap_zone *spz, **zones_stack;
+
+- parent = scmi_powercap_get_parent_zone(spz);
+- if (parent && !scmi_powercap_is_zone_registered(parent)) {
+- /*
+- * Bail out if a parent domain was marked as unsupported:
+- * only domains participating as leaves can be skipped.
+- */
+- if (!parent->info)
+- return -ENODEV;
++ zones_stack = kcalloc(pr->num_zones, sizeof(spz), GFP_KERNEL);
++ if (!zones_stack)
++ return -ENOMEM;
+
+- ret = scmi_powercap_register_zone(pr, parent);
+- if (ret)
+- return ret;
+- }
++ spz = list_first_entry_or_null(&pr->scmi_zones,
++ struct scmi_powercap_zone, node);
++ while (spz) {
++ struct scmi_powercap_zone *parent;
+
+- if (!scmi_powercap_is_zone_registered(spz)) {
+- struct powercap_zone *z;
+-
+- z = powercap_register_zone(&spz->zone,
+- scmi_top_pcntrl,
+- spz->info->name,
+- parent ? &parent->zone : NULL,
+- &zone_ops, 1, &constraint_ops);
+- if (!IS_ERR(z)) {
+- spz->height = scmi_powercap_get_zone_height(spz);
+- list_add(&spz->node,
+- &pr->registered_zones[spz->height]);
+- dev_dbg(spz->dev,
+- "Registered node %s - parent %s - height:%d\n",
+- spz->info->name,
+- parent ? parent->info->name : "ROOT",
+- spz->height);
+- ret = 0;
++ parent = scmi_powercap_get_parent_zone(spz);
++ if (parent && !parent->registered) {
++ zones_stack[sp++] = spz;
++ spz = parent;
+ } else {
+- ret = PTR_ERR(z);
+- dev_err(spz->dev,
+- "Error registering node:%s - parent:%s - h:%d - ret:%d\n",
+- spz->info->name,
+- parent ? parent->info->name : "ROOT",
+- spz->height, ret);
++ ret = scmi_powercap_register_zone(pr, spz, parent);
++ if (!ret) {
++ reg_zones++;
++ } else if (sp) {
++ /* Failed to register a non-leaf zone.
++ * Bail-out.
++ */
++ dev_err(dev,
++ "Failed to register non-leaf zone - ret:%d\n",
++ ret);
++ scmi_powercap_unregister_all_zones(pr);
++ reg_zones = 0;
++ goto out;
++ }
++ /* Pick next zone to process */
++ if (sp)
++ spz = zones_stack[--sp];
++ else
++ spz = list_first_entry_or_null(&pr->scmi_zones,
++ struct scmi_powercap_zone,
++ node);
+ }
+ }
+
++out:
++ kfree(zones_stack);
++ dev_info(dev, "Registered %d SCMI Powercap domains !\n", reg_zones);
++
+ return ret;
+ }
+
+@@ -408,6 +439,8 @@ static int scmi_powercap_probe(struct scmi_device *sdev)
+ if (!pr->registered_zones)
+ return -ENOMEM;
+
++ INIT_LIST_HEAD(&pr->scmi_zones);
++
+ for (i = 0, spz = pr->spzones; i < pr->num_zones; i++, spz++) {
+ /*
+ * Powercap domains are validate by the protocol layer, i.e.
+@@ -422,6 +455,7 @@ static int scmi_powercap_probe(struct scmi_device *sdev)
+ INIT_LIST_HEAD(&spz->node);
+ INIT_LIST_HEAD(&pr->registered_zones[i]);
+
++ list_add_tail(&spz->node, &pr->scmi_zones);
+ /*
+ * Forcibly skip powercap domains using an abstract scale.
+ * Note that only leaves domains can be skipped, so this could
+@@ -432,7 +466,7 @@ static int scmi_powercap_probe(struct scmi_device *sdev)
+ dev_warn(dev,
+ "Abstract power scale not supported. Skip %s.\n",
+ spz->info->name);
+- spz->info = NULL;
++ spz->invalid = true;
+ continue;
+ }
+ }
+@@ -441,21 +475,12 @@ static int scmi_powercap_probe(struct scmi_device *sdev)
+ * Scan array of retrieved SCMI powercap domains and register them
+ * recursively starting from the root domains.
+ */
+- for (i = 0, spz = pr->spzones; i < pr->num_zones; i++, spz++) {
+- ret = scmi_powercap_register_zone(pr, spz);
+- if (ret) {
+- dev_err(dev,
+- "Failed to register powercap zone %s - ret:%d\n",
+- spz->info->name, ret);
+- scmi_powercap_unregister_all_zones(pr);
+- return ret;
+- }
+- }
++ ret = scmi_zones_register(dev, pr);
++ if (ret)
++ return ret;
+
+ dev_set_drvdata(dev, pr);
+
+- dev_info(dev, "Registered %d SCMI Powercap domains !\n", pr->num_zones);
+-
+ return ret;
+ }
+
+diff --git a/drivers/rpmsg/qcom_glink_native.c b/drivers/rpmsg/qcom_glink_native.c
+index 1beb40a1d3df2..e4015db99899d 100644
+--- a/drivers/rpmsg/qcom_glink_native.c
++++ b/drivers/rpmsg/qcom_glink_native.c
+@@ -221,6 +221,10 @@ static struct glink_channel *qcom_glink_alloc_channel(struct qcom_glink *glink,
+
+ channel->glink = glink;
+ channel->name = kstrdup(name, GFP_KERNEL);
++ if (!channel->name) {
++ kfree(channel);
++ return ERR_PTR(-ENOMEM);
++ }
+
+ init_completion(&channel->open_req);
+ init_completion(&channel->open_ack);
+diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
+index 9fbfce735d565..50c48a48fcae3 100644
+--- a/drivers/s390/block/dasd.c
++++ b/drivers/s390/block/dasd.c
+@@ -2938,41 +2938,32 @@ static void _dasd_wake_block_flush_cb(struct dasd_ccw_req *cqr, void *data)
+ * Requeue a request back to the block request queue
+ * only works for block requests
+ */
+-static int _dasd_requeue_request(struct dasd_ccw_req *cqr)
++static void _dasd_requeue_request(struct dasd_ccw_req *cqr)
+ {
+- struct dasd_block *block = cqr->block;
+ struct request *req;
+
+- if (!block)
+- return -EINVAL;
+ /*
+ * If the request is an ERP request there is nothing to requeue.
+ * This will be done with the remaining original request.
+ */
+ if (cqr->refers)
+- return 0;
++ return;
+ spin_lock_irq(&cqr->dq->lock);
+ req = (struct request *) cqr->callback_data;
+ blk_mq_requeue_request(req, true);
+ spin_unlock_irq(&cqr->dq->lock);
+
+- return 0;
++ return;
+ }
+
+-/*
+- * Go through all request on the dasd_block request queue, cancel them
+- * on the respective dasd_device, and return them to the generic
+- * block layer.
+- */
+-static int dasd_flush_block_queue(struct dasd_block *block)
++static int _dasd_requests_to_flushqueue(struct dasd_block *block,
++ struct list_head *flush_queue)
+ {
+ struct dasd_ccw_req *cqr, *n;
+- int rc, i;
+- struct list_head flush_queue;
+ unsigned long flags;
++ int rc, i;
+
+- INIT_LIST_HEAD(&flush_queue);
+- spin_lock_bh(&block->queue_lock);
++ spin_lock_irqsave(&block->queue_lock, flags);
+ rc = 0;
+ restart:
+ list_for_each_entry_safe(cqr, n, &block->ccw_queue, blocklist) {
+@@ -2987,13 +2978,32 @@ restart:
+ * is returned from the dasd_device layer.
+ */
+ cqr->callback = _dasd_wake_block_flush_cb;
+- for (i = 0; cqr != NULL; cqr = cqr->refers, i++)
+- list_move_tail(&cqr->blocklist, &flush_queue);
++ for (i = 0; cqr; cqr = cqr->refers, i++)
++ list_move_tail(&cqr->blocklist, flush_queue);
+ if (i > 1)
+ /* moved more than one request - need to restart */
+ goto restart;
+ }
+- spin_unlock_bh(&block->queue_lock);
++ spin_unlock_irqrestore(&block->queue_lock, flags);
++
++ return rc;
++}
++
++/*
++ * Go through all request on the dasd_block request queue, cancel them
++ * on the respective dasd_device, and return them to the generic
++ * block layer.
++ */
++static int dasd_flush_block_queue(struct dasd_block *block)
++{
++ struct dasd_ccw_req *cqr, *n;
++ struct list_head flush_queue;
++ unsigned long flags;
++ int rc;
++
++ INIT_LIST_HEAD(&flush_queue);
++ rc = _dasd_requests_to_flushqueue(block, &flush_queue);
++
+ /* Now call the callback function of flushed requests */
+ restart_cb:
+ list_for_each_entry_safe(cqr, n, &flush_queue, blocklist) {
+@@ -3878,75 +3888,36 @@ EXPORT_SYMBOL_GPL(dasd_generic_space_avail);
+ */
+ int dasd_generic_requeue_all_requests(struct dasd_device *device)
+ {
++ struct dasd_block *block = device->block;
+ struct list_head requeue_queue;
+ struct dasd_ccw_req *cqr, *n;
+- struct dasd_ccw_req *refers;
+ int rc;
+
+- INIT_LIST_HEAD(&requeue_queue);
+- spin_lock_irq(get_ccwdev_lock(device->cdev));
+- rc = 0;
+- list_for_each_entry_safe(cqr, n, &device->ccw_queue, devlist) {
+- /* Check status and move request to flush_queue */
+- if (cqr->status == DASD_CQR_IN_IO) {
+- rc = device->discipline->term_IO(cqr);
+- if (rc) {
+- /* unable to terminate requeust */
+- dev_err(&device->cdev->dev,
+- "Unable to terminate request %p "
+- "on suspend\n", cqr);
+- spin_unlock_irq(get_ccwdev_lock(device->cdev));
+- dasd_put_device(device);
+- return rc;
+- }
+- }
+- list_move_tail(&cqr->devlist, &requeue_queue);
+- }
+- spin_unlock_irq(get_ccwdev_lock(device->cdev));
+-
+- list_for_each_entry_safe(cqr, n, &requeue_queue, devlist) {
+- wait_event(dasd_flush_wq,
+- (cqr->status != DASD_CQR_CLEAR_PENDING));
++ if (!block)
++ return 0;
+
+- /*
+- * requeue requests to blocklayer will only work
+- * for block device requests
+- */
+- if (_dasd_requeue_request(cqr))
+- continue;
++ INIT_LIST_HEAD(&requeue_queue);
++ rc = _dasd_requests_to_flushqueue(block, &requeue_queue);
+
+- /* remove requests from device and block queue */
+- list_del_init(&cqr->devlist);
+- while (cqr->refers != NULL) {
+- refers = cqr->refers;
+- /* remove the request from the block queue */
+- list_del(&cqr->blocklist);
+- /* free the finished erp request */
+- dasd_free_erp_request(cqr, cqr->memdev);
+- cqr = refers;
++ /* Now call the callback function of flushed requests */
++restart_cb:
++ list_for_each_entry_safe(cqr, n, &requeue_queue, blocklist) {
++ wait_event(dasd_flush_wq, (cqr->status < DASD_CQR_QUEUED));
++ /* Process finished ERP request. */
++ if (cqr->refers) {
++ spin_lock_bh(&block->queue_lock);
++ __dasd_process_erp(block->base, cqr);
++ spin_unlock_bh(&block->queue_lock);
++ /* restart list_for_xx loop since dasd_process_erp
++ * might remove multiple elements
++ */
++ goto restart_cb;
+ }
+-
+- /*
+- * _dasd_requeue_request already checked for a valid
+- * blockdevice, no need to check again
+- * all erp requests (cqr->refers) have a cqr->block
+- * pointer copy from the original cqr
+- */
++ _dasd_requeue_request(cqr);
+ list_del_init(&cqr->blocklist);
+ cqr->block->base->discipline->free_cp(
+ cqr, (struct request *) cqr->callback_data);
+ }
+-
+- /*
+- * if requests remain then they are internal request
+- * and go back to the device queue
+- */
+- if (!list_empty(&requeue_queue)) {
+- /* move freeze_queue to start of the ccw_queue */
+- spin_lock_irq(get_ccwdev_lock(device->cdev));
+- list_splice_tail(&requeue_queue, &device->ccw_queue);
+- spin_unlock_irq(get_ccwdev_lock(device->cdev));
+- }
+ dasd_schedule_device_bh(device);
+ return rc;
+ }
+diff --git a/drivers/s390/block/dasd_3990_erp.c b/drivers/s390/block/dasd_3990_erp.c
+index f0f210627cadf..89957bb7244d2 100644
+--- a/drivers/s390/block/dasd_3990_erp.c
++++ b/drivers/s390/block/dasd_3990_erp.c
+@@ -2441,7 +2441,7 @@ static struct dasd_ccw_req *dasd_3990_erp_add_erp(struct dasd_ccw_req *cqr)
+ erp->block = cqr->block;
+ erp->magic = cqr->magic;
+ erp->expires = cqr->expires;
+- erp->retries = 256;
++ erp->retries = device->default_retries;
+ erp->buildclk = get_tod_clock();
+ erp->status = DASD_CQR_FILLED;
+
+diff --git a/drivers/s390/block/dasd_devmap.c b/drivers/s390/block/dasd_devmap.c
+index 620fab01b710b..c4e36650c4264 100644
+--- a/drivers/s390/block/dasd_devmap.c
++++ b/drivers/s390/block/dasd_devmap.c
+@@ -1378,16 +1378,12 @@ static ssize_t dasd_vendor_show(struct device *dev,
+
+ static DEVICE_ATTR(vendor, 0444, dasd_vendor_show, NULL);
+
+-#define UID_STRLEN ( /* vendor */ 3 + 1 + /* serial */ 14 + 1 +\
+- /* SSID */ 4 + 1 + /* unit addr */ 2 + 1 +\
+- /* vduit */ 32 + 1)
+-
+ static ssize_t
+ dasd_uid_show(struct device *dev, struct device_attribute *attr, char *buf)
+ {
++ char uid_string[DASD_UID_STRLEN];
+ struct dasd_device *device;
+ struct dasd_uid uid;
+- char uid_string[UID_STRLEN];
+ char ua_string[3];
+
+ device = dasd_device_from_cdev(to_ccwdev(dev));
+diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
+index 113c509bf6d05..718ac726ffced 100644
+--- a/drivers/s390/block/dasd_eckd.c
++++ b/drivers/s390/block/dasd_eckd.c
+@@ -1079,12 +1079,12 @@ static void dasd_eckd_get_uid_string(struct dasd_conf *conf,
+
+ create_uid(conf, &uid);
+ if (strlen(uid.vduit) > 0)
+- snprintf(print_uid, sizeof(*print_uid),
++ snprintf(print_uid, DASD_UID_STRLEN,
+ "%s.%s.%04x.%02x.%s",
+ uid.vendor, uid.serial, uid.ssid,
+ uid.real_unit_addr, uid.vduit);
+ else
+- snprintf(print_uid, sizeof(*print_uid),
++ snprintf(print_uid, DASD_UID_STRLEN,
+ "%s.%s.%04x.%02x",
+ uid.vendor, uid.serial, uid.ssid,
+ uid.real_unit_addr);
+@@ -1093,8 +1093,8 @@ static void dasd_eckd_get_uid_string(struct dasd_conf *conf,
+ static int dasd_eckd_check_cabling(struct dasd_device *device,
+ void *conf_data, __u8 lpm)
+ {
++ char print_path_uid[DASD_UID_STRLEN], print_device_uid[DASD_UID_STRLEN];
+ struct dasd_eckd_private *private = device->private;
+- char print_path_uid[60], print_device_uid[60];
+ struct dasd_conf path_conf;
+
+ path_conf.data = conf_data;
+@@ -1293,9 +1293,9 @@ static void dasd_eckd_path_available_action(struct dasd_device *device,
+ __u8 path_rcd_buf[DASD_ECKD_RCD_DATA_SIZE];
+ __u8 lpm, opm, npm, ppm, epm, hpfpm, cablepm;
+ struct dasd_conf_data *conf_data;
++ char print_uid[DASD_UID_STRLEN];
+ struct dasd_conf path_conf;
+ unsigned long flags;
+- char print_uid[60];
+ int rc, pos;
+
+ opm = 0;
+@@ -5855,8 +5855,8 @@ static void dasd_eckd_dump_sense(struct dasd_device *device,
+ static int dasd_eckd_reload_device(struct dasd_device *device)
+ {
+ struct dasd_eckd_private *private = device->private;
++ char print_uid[DASD_UID_STRLEN];
+ int rc, old_base;
+- char print_uid[60];
+ struct dasd_uid uid;
+ unsigned long flags;
+
+diff --git a/drivers/s390/block/dasd_int.h b/drivers/s390/block/dasd_int.h
+index 33f812f0e5150..966b915ef22cb 100644
+--- a/drivers/s390/block/dasd_int.h
++++ b/drivers/s390/block/dasd_int.h
+@@ -259,6 +259,10 @@ struct dasd_uid {
+ char vduit[33];
+ };
+
++#define DASD_UID_STRLEN ( /* vendor */ 3 + 1 + /* serial */ 14 + 1 + \
++ /* SSID */ 4 + 1 + /* unit addr */ 2 + 1 + \
++ /* vduit */ 32 + 1)
++
+ /*
+ * PPRC Status data
+ */
+diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
+index c09f2e053bf86..cbf03d68283bb 100644
+--- a/drivers/s390/block/dcssblk.c
++++ b/drivers/s390/block/dcssblk.c
+@@ -411,6 +411,7 @@ removeseg:
+ }
+ list_del(&dev_info->lh);
+
++ dax_remove_host(dev_info->gd);
+ kill_dax(dev_info->dax_dev);
+ put_dax(dev_info->dax_dev);
+ del_gendisk(dev_info->gd);
+@@ -706,9 +707,9 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char
+ goto out;
+
+ out_dax_host:
++ put_device(&dev_info->dev);
+ dax_remove_host(dev_info->gd);
+ out_dax:
+- put_device(&dev_info->dev);
+ kill_dax(dev_info->dax_dev);
+ put_dax(dev_info->dax_dev);
+ put_dev:
+@@ -788,6 +789,7 @@ dcssblk_remove_store(struct device *dev, struct device_attribute *attr, const ch
+ }
+
+ list_del(&dev_info->lh);
++ dax_remove_host(dev_info->gd);
+ kill_dax(dev_info->dax_dev);
+ put_dax(dev_info->dax_dev);
+ del_gendisk(dev_info->gd);
+diff --git a/drivers/s390/crypto/pkey_api.c b/drivers/s390/crypto/pkey_api.c
+index a8def50c149bd..2b92ec20ed68e 100644
+--- a/drivers/s390/crypto/pkey_api.c
++++ b/drivers/s390/crypto/pkey_api.c
+@@ -565,6 +565,11 @@ static int pkey_genseckey2(const struct pkey_apqn *apqns, size_t nr_apqns,
+ if (*keybufsize < MINEP11AESKEYBLOBSIZE)
+ return -EINVAL;
+ break;
++ case PKEY_TYPE_EP11_AES:
++ if (*keybufsize < (sizeof(struct ep11kblob_header) +
++ MINEP11AESKEYBLOBSIZE))
++ return -EINVAL;
++ break;
+ default:
+ return -EINVAL;
+ }
+@@ -581,9 +586,10 @@ static int pkey_genseckey2(const struct pkey_apqn *apqns, size_t nr_apqns,
+ for (i = 0, rc = -ENODEV; i < nr_apqns; i++) {
+ card = apqns[i].card;
+ dom = apqns[i].domain;
+- if (ktype == PKEY_TYPE_EP11) {
++ if (ktype == PKEY_TYPE_EP11 ||
++ ktype == PKEY_TYPE_EP11_AES) {
+ rc = ep11_genaeskey(card, dom, ksize, kflags,
+- keybuf, keybufsize);
++ keybuf, keybufsize, ktype);
+ } else if (ktype == PKEY_TYPE_CCA_DATA) {
+ rc = cca_genseckey(card, dom, ksize, keybuf);
+ *keybufsize = (rc ? 0 : SECKEYBLOBSIZE);
+@@ -747,7 +753,7 @@ static int pkey_verifykey2(const u8 *key, size_t keylen,
+ if (ktype)
+ *ktype = PKEY_TYPE_EP11;
+ if (ksize)
+- *ksize = kb->head.keybitlen;
++ *ksize = kb->head.bitlen;
+
+ rc = ep11_findcard2(&_apqns, &_nr_apqns, *cardnr, *domain,
+ ZCRYPT_CEX7, EP11_API_V, kb->wkvp);
+@@ -1313,7 +1319,7 @@ static long pkey_unlocked_ioctl(struct file *filp, unsigned int cmd,
+ apqns = _copy_apqns_from_user(kgs.apqns, kgs.apqn_entries);
+ if (IS_ERR(apqns))
+ return PTR_ERR(apqns);
+- kkey = kmalloc(klen, GFP_KERNEL);
++ kkey = kzalloc(klen, GFP_KERNEL);
+ if (!kkey) {
+ kfree(apqns);
+ return -ENOMEM;
+@@ -1941,7 +1947,7 @@ static struct attribute_group ccacipher_attr_group = {
+ * (i.e. off != 0 or count < key blob size) -EINVAL is returned.
+ * This function and the sysfs attributes using it provide EP11 key blobs
+ * padded to the upper limit of MAXEP11AESKEYBLOBSIZE which is currently
+- * 320 bytes.
++ * 336 bytes.
+ */
+ static ssize_t pkey_ep11_aes_attr_read(enum pkey_key_size keybits,
+ bool is_xts, char *buf, loff_t off,
+@@ -1969,7 +1975,8 @@ static ssize_t pkey_ep11_aes_attr_read(enum pkey_key_size keybits,
+ for (i = 0, rc = -ENODEV; i < nr_apqns; i++) {
+ card = apqns[i] >> 16;
+ dom = apqns[i] & 0xFFFF;
+- rc = ep11_genaeskey(card, dom, keybits, 0, buf, &keysize);
++ rc = ep11_genaeskey(card, dom, keybits, 0, buf, &keysize,
++ PKEY_TYPE_EP11_AES);
+ if (rc == 0)
+ break;
+ }
+@@ -1979,7 +1986,8 @@ static ssize_t pkey_ep11_aes_attr_read(enum pkey_key_size keybits,
+ if (is_xts) {
+ keysize = MAXEP11AESKEYBLOBSIZE;
+ buf += MAXEP11AESKEYBLOBSIZE;
+- rc = ep11_genaeskey(card, dom, keybits, 0, buf, &keysize);
++ rc = ep11_genaeskey(card, dom, keybits, 0, buf, &keysize,
++ PKEY_TYPE_EP11_AES);
+ if (rc == 0)
+ return 2 * MAXEP11AESKEYBLOBSIZE;
+ }
+diff --git a/drivers/s390/crypto/zcrypt_ep11misc.c b/drivers/s390/crypto/zcrypt_ep11misc.c
+index f67d19d08571b..0cd0395fa17fc 100644
+--- a/drivers/s390/crypto/zcrypt_ep11misc.c
++++ b/drivers/s390/crypto/zcrypt_ep11misc.c
+@@ -113,6 +113,50 @@ static void __exit card_cache_free(void)
+ spin_unlock_bh(&card_list_lock);
+ }
+
++static int ep11_kb_split(const u8 *kb, size_t kblen, u32 kbver,
++ struct ep11kblob_header **kbhdr, size_t *kbhdrsize,
++ u8 **kbpl, size_t *kbplsize)
++{
++ struct ep11kblob_header *hdr = NULL;
++ size_t hdrsize, plsize = 0;
++ int rc = -EINVAL;
++ u8 *pl = NULL;
++
++ if (kblen < sizeof(struct ep11kblob_header))
++ goto out;
++ hdr = (struct ep11kblob_header *)kb;
++
++ switch (kbver) {
++ case TOKVER_EP11_AES:
++ /* header overlays the payload */
++ hdrsize = 0;
++ break;
++ case TOKVER_EP11_ECC_WITH_HEADER:
++ case TOKVER_EP11_AES_WITH_HEADER:
++ /* payload starts after the header */
++ hdrsize = sizeof(struct ep11kblob_header);
++ break;
++ default:
++ goto out;
++ }
++
++ plsize = kblen - hdrsize;
++ pl = (u8 *)kb + hdrsize;
++
++ if (kbhdr)
++ *kbhdr = hdr;
++ if (kbhdrsize)
++ *kbhdrsize = hdrsize;
++ if (kbpl)
++ *kbpl = pl;
++ if (kbplsize)
++ *kbplsize = plsize;
++
++ rc = 0;
++out:
++ return rc;
++}
++
+ /*
+ * Simple check if the key blob is a valid EP11 AES key blob with header.
+ */
+@@ -664,8 +708,9 @@ EXPORT_SYMBOL(ep11_get_domain_info);
+ */
+ #define KEY_ATTR_DEFAULTS 0x00200c00
+
+-int ep11_genaeskey(u16 card, u16 domain, u32 keybitsize, u32 keygenflags,
+- u8 *keybuf, size_t *keybufsize)
++static int _ep11_genaeskey(u16 card, u16 domain,
++ u32 keybitsize, u32 keygenflags,
++ u8 *keybuf, size_t *keybufsize)
+ {
+ struct keygen_req_pl {
+ struct pl_head head;
+@@ -701,7 +746,6 @@ int ep11_genaeskey(u16 card, u16 domain, u32 keybitsize, u32 keygenflags,
+ struct ep11_cprb *req = NULL, *rep = NULL;
+ struct ep11_target_dev target;
+ struct ep11_urb *urb = NULL;
+- struct ep11keyblob *kb;
+ int api, rc = -ENOMEM;
+
+ switch (keybitsize) {
+@@ -780,14 +824,9 @@ int ep11_genaeskey(u16 card, u16 domain, u32 keybitsize, u32 keygenflags,
+ goto out;
+ }
+
+- /* copy key blob and set header values */
++ /* copy key blob */
+ memcpy(keybuf, rep_pl->data, rep_pl->data_len);
+ *keybufsize = rep_pl->data_len;
+- kb = (struct ep11keyblob *)keybuf;
+- kb->head.type = TOKTYPE_NON_CCA;
+- kb->head.len = rep_pl->data_len;
+- kb->head.version = TOKVER_EP11_AES;
+- kb->head.keybitlen = keybitsize;
+
+ out:
+ kfree(req);
+@@ -795,6 +834,43 @@ out:
+ kfree(urb);
+ return rc;
+ }
++
++int ep11_genaeskey(u16 card, u16 domain, u32 keybitsize, u32 keygenflags,
++ u8 *keybuf, size_t *keybufsize, u32 keybufver)
++{
++ struct ep11kblob_header *hdr;
++ size_t hdr_size, pl_size;
++ u8 *pl;
++ int rc;
++
++ switch (keybufver) {
++ case TOKVER_EP11_AES:
++ case TOKVER_EP11_AES_WITH_HEADER:
++ break;
++ default:
++ return -EINVAL;
++ }
++
++ rc = ep11_kb_split(keybuf, *keybufsize, keybufver,
++ &hdr, &hdr_size, &pl, &pl_size);
++ if (rc)
++ return rc;
++
++ rc = _ep11_genaeskey(card, domain, keybitsize, keygenflags,
++ pl, &pl_size);
++ if (rc)
++ return rc;
++
++ *keybufsize = hdr_size + pl_size;
++
++ /* update header information */
++ hdr->type = TOKTYPE_NON_CCA;
++ hdr->len = *keybufsize;
++ hdr->version = keybufver;
++ hdr->bitlen = keybitsize;
++
++ return 0;
++}
+ EXPORT_SYMBOL(ep11_genaeskey);
+
+ static int ep11_cryptsingle(u16 card, u16 domain,
+@@ -1055,7 +1131,7 @@ static int ep11_unwrapkey(u16 card, u16 domain,
+ kb->head.type = TOKTYPE_NON_CCA;
+ kb->head.len = rep_pl->data_len;
+ kb->head.version = TOKVER_EP11_AES;
+- kb->head.keybitlen = keybitsize;
++ kb->head.bitlen = keybitsize;
+
+ out:
+ kfree(req);
+@@ -1201,7 +1277,6 @@ int ep11_clr2keyblob(u16 card, u16 domain, u32 keybitsize, u32 keygenflags,
+ const u8 *clrkey, u8 *keybuf, size_t *keybufsize)
+ {
+ int rc;
+- struct ep11keyblob *kb;
+ u8 encbuf[64], *kek = NULL;
+ size_t clrkeylen, keklen, encbuflen = sizeof(encbuf);
+
+@@ -1223,17 +1298,15 @@ int ep11_clr2keyblob(u16 card, u16 domain, u32 keybitsize, u32 keygenflags,
+ }
+
+ /* Step 1: generate AES 256 bit random kek key */
+- rc = ep11_genaeskey(card, domain, 256,
+- 0x00006c00, /* EN/DECRYPT, WRAP/UNWRAP */
+- kek, &keklen);
++ rc = _ep11_genaeskey(card, domain, 256,
++ 0x00006c00, /* EN/DECRYPT, WRAP/UNWRAP */
++ kek, &keklen);
+ if (rc) {
+ DEBUG_ERR(
+ "%s generate kek key failed, rc=%d\n",
+ __func__, rc);
+ goto out;
+ }
+- kb = (struct ep11keyblob *)kek;
+- memset(&kb->head, 0, sizeof(kb->head));
+
+ /* Step 2: encrypt clear key value with the kek key */
+ rc = ep11_cryptsingle(card, domain, 0, 0, def_iv, kek, keklen,
+diff --git a/drivers/s390/crypto/zcrypt_ep11misc.h b/drivers/s390/crypto/zcrypt_ep11misc.h
+index 07445041869fe..ed328c354bade 100644
+--- a/drivers/s390/crypto/zcrypt_ep11misc.h
++++ b/drivers/s390/crypto/zcrypt_ep11misc.h
+@@ -29,14 +29,7 @@ struct ep11keyblob {
+ union {
+ u8 session[32];
+ /* only used for PKEY_TYPE_EP11: */
+- struct {
+- u8 type; /* 0x00 (TOKTYPE_NON_CCA) */
+- u8 res0; /* unused */
+- u16 len; /* total length in bytes of this blob */
+- u8 version; /* 0x03 (TOKVER_EP11_AES) */
+- u8 res1; /* unused */
+- u16 keybitlen; /* clear key bit len, 0 for unknown */
+- } head;
++ struct ep11kblob_header head;
+ };
+ u8 wkvp[16]; /* wrapping key verification pattern */
+ u64 attr; /* boolean key attributes */
+@@ -114,7 +107,7 @@ int ep11_get_domain_info(u16 card, u16 domain, struct ep11_domain_info *info);
+ * Generate (random) EP11 AES secure key.
+ */
+ int ep11_genaeskey(u16 card, u16 domain, u32 keybitsize, u32 keygenflags,
+- u8 *keybuf, size_t *keybufsize);
++ u8 *keybuf, size_t *keybufsize, u32 keybufver);
+
+ /*
+ * Generate EP11 AES secure key with given clear key value.
+diff --git a/drivers/scsi/be2iscsi/be_iscsi.c b/drivers/scsi/be2iscsi/be_iscsi.c
+index 8aeaddc93b167..8d374ae863ba2 100644
+--- a/drivers/scsi/be2iscsi/be_iscsi.c
++++ b/drivers/scsi/be2iscsi/be_iscsi.c
+@@ -450,6 +450,10 @@ int beiscsi_iface_set_param(struct Scsi_Host *shost,
+ }
+
+ nla_for_each_attr(attrib, data, dt_len, rm_len) {
++ /* ignore nla_type as it is never used */
++ if (nla_len(attrib) < sizeof(*iface_param))
++ return -EINVAL;
++
+ iface_param = nla_data(attrib);
+
+ if (iface_param->param_type != ISCSI_NET_PARAM)
+diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c
+index 5c8d1ba3f8f3c..19eee108db021 100644
+--- a/drivers/scsi/fcoe/fcoe_ctlr.c
++++ b/drivers/scsi/fcoe/fcoe_ctlr.c
+@@ -319,16 +319,17 @@ static void fcoe_ctlr_announce(struct fcoe_ctlr *fip)
+ {
+ struct fcoe_fcf *sel;
+ struct fcoe_fcf *fcf;
++ unsigned long flags;
+
+ mutex_lock(&fip->ctlr_mutex);
+- spin_lock_bh(&fip->ctlr_lock);
++ spin_lock_irqsave(&fip->ctlr_lock, flags);
+
+ kfree_skb(fip->flogi_req);
+ fip->flogi_req = NULL;
+ list_for_each_entry(fcf, &fip->fcfs, list)
+ fcf->flogi_sent = 0;
+
+- spin_unlock_bh(&fip->ctlr_lock);
++ spin_unlock_irqrestore(&fip->ctlr_lock, flags);
+ sel = fip->sel_fcf;
+
+ if (sel && ether_addr_equal(sel->fcf_mac, fip->dest_addr))
+@@ -699,6 +700,7 @@ int fcoe_ctlr_els_send(struct fcoe_ctlr *fip, struct fc_lport *lport,
+ {
+ struct fc_frame *fp;
+ struct fc_frame_header *fh;
++ unsigned long flags;
+ u16 old_xid;
+ u8 op;
+ u8 mac[ETH_ALEN];
+@@ -732,11 +734,11 @@ int fcoe_ctlr_els_send(struct fcoe_ctlr *fip, struct fc_lport *lport,
+ op = FIP_DT_FLOGI;
+ if (fip->mode == FIP_MODE_VN2VN)
+ break;
+- spin_lock_bh(&fip->ctlr_lock);
++ spin_lock_irqsave(&fip->ctlr_lock, flags);
+ kfree_skb(fip->flogi_req);
+ fip->flogi_req = skb;
+ fip->flogi_req_send = 1;
+- spin_unlock_bh(&fip->ctlr_lock);
++ spin_unlock_irqrestore(&fip->ctlr_lock, flags);
+ schedule_work(&fip->timer_work);
+ return -EINPROGRESS;
+ case ELS_FDISC:
+@@ -1705,10 +1707,11 @@ static int fcoe_ctlr_flogi_send_locked(struct fcoe_ctlr *fip)
+ static int fcoe_ctlr_flogi_retry(struct fcoe_ctlr *fip)
+ {
+ struct fcoe_fcf *fcf;
++ unsigned long flags;
+ int error;
+
+ mutex_lock(&fip->ctlr_mutex);
+- spin_lock_bh(&fip->ctlr_lock);
++ spin_lock_irqsave(&fip->ctlr_lock, flags);
+ LIBFCOE_FIP_DBG(fip, "re-sending FLOGI - reselect\n");
+ fcf = fcoe_ctlr_select(fip);
+ if (!fcf || fcf->flogi_sent) {
+@@ -1719,7 +1722,7 @@ static int fcoe_ctlr_flogi_retry(struct fcoe_ctlr *fip)
+ fcoe_ctlr_solicit(fip, NULL);
+ error = fcoe_ctlr_flogi_send_locked(fip);
+ }
+- spin_unlock_bh(&fip->ctlr_lock);
++ spin_unlock_irqrestore(&fip->ctlr_lock, flags);
+ mutex_unlock(&fip->ctlr_mutex);
+ return error;
+ }
+@@ -1736,8 +1739,9 @@ static int fcoe_ctlr_flogi_retry(struct fcoe_ctlr *fip)
+ static void fcoe_ctlr_flogi_send(struct fcoe_ctlr *fip)
+ {
+ struct fcoe_fcf *fcf;
++ unsigned long flags;
+
+- spin_lock_bh(&fip->ctlr_lock);
++ spin_lock_irqsave(&fip->ctlr_lock, flags);
+ fcf = fip->sel_fcf;
+ if (!fcf || !fip->flogi_req_send)
+ goto unlock;
+@@ -1764,7 +1768,7 @@ static void fcoe_ctlr_flogi_send(struct fcoe_ctlr *fip)
+ } else /* XXX */
+ LIBFCOE_FIP_DBG(fip, "No FCF selected - defer send\n");
+ unlock:
+- spin_unlock_bh(&fip->ctlr_lock);
++ spin_unlock_irqrestore(&fip->ctlr_lock, flags);
+ }
+
+ /**
+diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+index cd78e4c983aa8..38a91e227842c 100644
+--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
++++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+@@ -2026,6 +2026,11 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba,
+ u16 dma_tx_err_type = le16_to_cpu(err_record->dma_tx_err_type);
+ u16 sipc_rx_err_type = le16_to_cpu(err_record->sipc_rx_err_type);
+ u32 dma_rx_err_type = le32_to_cpu(err_record->dma_rx_err_type);
++ struct hisi_sas_complete_v2_hdr *complete_queue =
++ hisi_hba->complete_hdr[slot->cmplt_queue];
++ struct hisi_sas_complete_v2_hdr *complete_hdr =
++ &complete_queue[slot->cmplt_queue_slot];
++ u32 dw0 = le32_to_cpu(complete_hdr->dw0);
+ int error = -1;
+
+ if (err_phase == 1) {
+@@ -2310,7 +2315,8 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba,
+ break;
+ }
+ }
+- hisi_sas_sata_done(task, slot);
++ if (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)
++ hisi_sas_sata_done(task, slot);
+ }
+ break;
+ default:
+@@ -2443,7 +2449,8 @@ static void slot_complete_v2_hw(struct hisi_hba *hisi_hba,
+ case SAS_PROTOCOL_SATA | SAS_PROTOCOL_STP:
+ {
+ ts->stat = SAS_SAM_STAT_GOOD;
+- hisi_sas_sata_done(task, slot);
++ if (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)
++ hisi_sas_sata_done(task, slot);
+ break;
+ }
+ default:
+diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+index 12d588454f5de..1794ad709183e 100644
+--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
++++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+@@ -2206,6 +2206,7 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
+ u32 trans_tx_fail_type = le32_to_cpu(record->trans_tx_fail_type);
+ u16 sipc_rx_err_type = le16_to_cpu(record->sipc_rx_err_type);
+ u32 dw3 = le32_to_cpu(complete_hdr->dw3);
++ u32 dw0 = le32_to_cpu(complete_hdr->dw0);
+
+ switch (task->task_proto) {
+ case SAS_PROTOCOL_SSP:
+@@ -2215,8 +2216,8 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
+ * but I/O information has been written to the host memory, we examine
+ * response IU.
+ */
+- if (!(complete_hdr->dw0 & CMPLT_HDR_RSPNS_GOOD_MSK) &&
+- (complete_hdr->dw0 & CMPLT_HDR_RSPNS_XFRD_MSK))
++ if (!(dw0 & CMPLT_HDR_RSPNS_GOOD_MSK) &&
++ (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK))
+ return false;
+
+ ts->residual = trans_tx_fail_type;
+@@ -2232,7 +2233,7 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
+ case SAS_PROTOCOL_SATA:
+ case SAS_PROTOCOL_STP:
+ case SAS_PROTOCOL_SATA | SAS_PROTOCOL_STP:
+- if ((complete_hdr->dw0 & CMPLT_HDR_RSPNS_XFRD_MSK) &&
++ if ((dw0 & CMPLT_HDR_RSPNS_XFRD_MSK) &&
+ (sipc_rx_err_type & RX_FIS_STATUS_ERR_MSK)) {
+ ts->stat = SAS_PROTO_RESPONSE;
+ } else if (dma_rx_err_type & RX_DATA_LEN_UNDERFLOW_MSK) {
+@@ -2246,7 +2247,8 @@ slot_err_v3_hw(struct hisi_hba *hisi_hba, struct sas_task *task,
+ ts->stat = SAS_OPEN_REJECT;
+ ts->open_rej_reason = SAS_OREJ_RSVD_RETRY;
+ }
+- hisi_sas_sata_done(task, slot);
++ if (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)
++ hisi_sas_sata_done(task, slot);
+ break;
+ case SAS_PROTOCOL_SMP:
+ ts->stat = SAS_SAM_STAT_CHECK_CONDITION;
+@@ -2373,7 +2375,8 @@ static void slot_complete_v3_hw(struct hisi_hba *hisi_hba,
+ case SAS_PROTOCOL_STP:
+ case SAS_PROTOCOL_SATA | SAS_PROTOCOL_STP:
+ ts->stat = SAS_SAM_STAT_GOOD;
+- hisi_sas_sata_done(task, slot);
++ if (dw0 & CMPLT_HDR_RSPNS_XFRD_MSK)
++ hisi_sas_sata_done(task, slot);
+ break;
+ default:
+ ts->stat = SAS_SAM_STAT_CHECK_CONDITION;
+diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
+index f0bc8bbb39381..13ee3453e56a1 100644
+--- a/drivers/scsi/hosts.c
++++ b/drivers/scsi/hosts.c
+@@ -536,7 +536,7 @@ EXPORT_SYMBOL(scsi_host_alloc);
+ static int __scsi_host_match(struct device *dev, const void *data)
+ {
+ struct Scsi_Host *p;
+- const unsigned short *hostnum = data;
++ const unsigned int *hostnum = data;
+
+ p = class_to_shost(dev);
+ return p->host_no == *hostnum;
+@@ -553,7 +553,7 @@ static int __scsi_host_match(struct device *dev, const void *data)
+ * that scsi_host_get() took. The put_device() below dropped
+ * the reference from class_find_device().
+ **/
+-struct Scsi_Host *scsi_host_lookup(unsigned short hostnum)
++struct Scsi_Host *scsi_host_lookup(unsigned int hostnum)
+ {
+ struct device *cdev;
+ struct Scsi_Host *shost = NULL;
+diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
+index e989f130434e4..e34a41fb3e1cb 100644
+--- a/drivers/scsi/lpfc/lpfc_scsi.c
++++ b/drivers/scsi/lpfc/lpfc_scsi.c
+@@ -109,8 +109,6 @@ lpfc_sli4_set_rsp_sgl_last(struct lpfc_hba *phba,
+ }
+ }
+
+-#define LPFC_INVALID_REFTAG ((u32)-1)
+-
+ /**
+ * lpfc_rampdown_queue_depth - Post RAMP_DOWN_QUEUE event to worker thread
+ * @phba: The Hba for which this call is being executed.
+@@ -978,8 +976,6 @@ lpfc_bg_err_inject(struct lpfc_hba *phba, struct scsi_cmnd *sc,
+
+ sgpe = scsi_prot_sglist(sc);
+ lba = scsi_prot_ref_tag(sc);
+- if (lba == LPFC_INVALID_REFTAG)
+- return 0;
+
+ /* First check if we need to match the LBA */
+ if (phba->lpfc_injerr_lba != LPFC_INJERR_LBA_OFF) {
+@@ -1560,8 +1556,6 @@ lpfc_bg_setup_bpl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
+
+ /* extract some info from the scsi command for pde*/
+ reftag = scsi_prot_ref_tag(sc);
+- if (reftag == LPFC_INVALID_REFTAG)
+- goto out;
+
+ #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
+ rc = lpfc_bg_err_inject(phba, sc, &reftag, NULL, 1);
+@@ -1723,8 +1717,6 @@ lpfc_bg_setup_bpl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
+ /* extract some info from the scsi command */
+ blksize = scsi_prot_interval(sc);
+ reftag = scsi_prot_ref_tag(sc);
+- if (reftag == LPFC_INVALID_REFTAG)
+- goto out;
+
+ #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
+ rc = lpfc_bg_err_inject(phba, sc, &reftag, NULL, 1);
+@@ -1953,8 +1945,6 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
+
+ /* extract some info from the scsi command for pde*/
+ reftag = scsi_prot_ref_tag(sc);
+- if (reftag == LPFC_INVALID_REFTAG)
+- goto out;
+
+ #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
+ rc = lpfc_bg_err_inject(phba, sc, &reftag, NULL, 1);
+@@ -2154,8 +2144,6 @@ lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
+ /* extract some info from the scsi command */
+ blksize = scsi_prot_interval(sc);
+ reftag = scsi_prot_ref_tag(sc);
+- if (reftag == LPFC_INVALID_REFTAG)
+- goto out;
+
+ #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
+ rc = lpfc_bg_err_inject(phba, sc, &reftag, NULL, 1);
+@@ -2746,8 +2734,6 @@ lpfc_calc_bg_err(struct lpfc_hba *phba, struct lpfc_io_buf *lpfc_cmd)
+
+ src = (struct scsi_dif_tuple *)sg_virt(sgpe);
+ start_ref_tag = scsi_prot_ref_tag(cmd);
+- if (start_ref_tag == LPFC_INVALID_REFTAG)
+- goto out;
+ start_app_tag = src->app_tag;
+ len = sgpe->length;
+ while (src && protsegcnt) {
+@@ -3493,11 +3479,11 @@ err:
+ scsi_cmnd->sc_data_direction);
+
+ lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT,
+- "9084 Cannot setup S/G List for HBA"
+- "IO segs %d/%d SGL %d SCSI %d: %d %d\n",
++ "9084 Cannot setup S/G List for HBA "
++ "IO segs %d/%d SGL %d SCSI %d: %d %d %d\n",
+ lpfc_cmd->seg_cnt, lpfc_cmd->prot_seg_cnt,
+ phba->cfg_total_seg_cnt, phba->cfg_sg_seg_cnt,
+- prot_group_type, num_sge);
++ prot_group_type, num_sge, ret);
+
+ lpfc_cmd->seg_cnt = 0;
+ lpfc_cmd->prot_seg_cnt = 0;
+diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
+index 53f5492579cb7..5284584e4cd2b 100644
+--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
++++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
+@@ -138,6 +138,9 @@ _base_get_ioc_facts(struct MPT3SAS_ADAPTER *ioc);
+ static void
+ _base_clear_outstanding_commands(struct MPT3SAS_ADAPTER *ioc);
+
++static u32
++_base_readl_ext_retry(const volatile void __iomem *addr);
++
+ /**
+ * mpt3sas_base_check_cmd_timeout - Function
+ * to check timeout and command termination due
+@@ -213,6 +216,20 @@ _base_readl_aero(const volatile void __iomem *addr)
+ return ret_val;
+ }
+
++static u32
++_base_readl_ext_retry(const volatile void __iomem *addr)
++{
++ u32 i, ret_val;
++
++ for (i = 0 ; i < 30 ; i++) {
++ ret_val = readl(addr);
++ if (ret_val == 0)
++ continue;
++ }
++
++ return ret_val;
++}
++
+ static inline u32
+ _base_readl(const volatile void __iomem *addr)
+ {
+@@ -940,7 +957,7 @@ mpt3sas_halt_firmware(struct MPT3SAS_ADAPTER *ioc)
+
+ dump_stack();
+
+- doorbell = ioc->base_readl(&ioc->chip->Doorbell);
++ doorbell = ioc->base_readl_ext_retry(&ioc->chip->Doorbell);
+ if ((doorbell & MPI2_IOC_STATE_MASK) == MPI2_IOC_STATE_FAULT) {
+ mpt3sas_print_fault_code(ioc, doorbell &
+ MPI2_DOORBELL_DATA_MASK);
+@@ -6686,7 +6703,7 @@ mpt3sas_base_get_iocstate(struct MPT3SAS_ADAPTER *ioc, int cooked)
+ {
+ u32 s, sc;
+
+- s = ioc->base_readl(&ioc->chip->Doorbell);
++ s = ioc->base_readl_ext_retry(&ioc->chip->Doorbell);
+ sc = s & MPI2_IOC_STATE_MASK;
+ return cooked ? sc : s;
+ }
+@@ -6831,7 +6848,7 @@ _base_wait_for_doorbell_ack(struct MPT3SAS_ADAPTER *ioc, int timeout)
+ __func__, count, timeout));
+ return 0;
+ } else if (int_status & MPI2_HIS_IOC2SYS_DB_STATUS) {
+- doorbell = ioc->base_readl(&ioc->chip->Doorbell);
++ doorbell = ioc->base_readl_ext_retry(&ioc->chip->Doorbell);
+ if ((doorbell & MPI2_IOC_STATE_MASK) ==
+ MPI2_IOC_STATE_FAULT) {
+ mpt3sas_print_fault_code(ioc, doorbell);
+@@ -6871,7 +6888,7 @@ _base_wait_for_doorbell_not_used(struct MPT3SAS_ADAPTER *ioc, int timeout)
+ count = 0;
+ cntdn = 1000 * timeout;
+ do {
+- doorbell_reg = ioc->base_readl(&ioc->chip->Doorbell);
++ doorbell_reg = ioc->base_readl_ext_retry(&ioc->chip->Doorbell);
+ if (!(doorbell_reg & MPI2_DOORBELL_USED)) {
+ dhsprintk(ioc,
+ ioc_info(ioc, "%s: successful count(%d), timeout(%d)\n",
+@@ -7019,7 +7036,7 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes,
+ __le32 *mfp;
+
+ /* make sure doorbell is not in use */
+- if ((ioc->base_readl(&ioc->chip->Doorbell) & MPI2_DOORBELL_USED)) {
++ if ((ioc->base_readl_ext_retry(&ioc->chip->Doorbell) & MPI2_DOORBELL_USED)) {
+ ioc_err(ioc, "doorbell is in use (line=%d)\n", __LINE__);
+ return -EFAULT;
+ }
+@@ -7068,7 +7085,7 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes,
+ }
+
+ /* read the first two 16-bits, it gives the total length of the reply */
+- reply[0] = le16_to_cpu(ioc->base_readl(&ioc->chip->Doorbell)
++ reply[0] = le16_to_cpu(ioc->base_readl_ext_retry(&ioc->chip->Doorbell)
+ & MPI2_DOORBELL_DATA_MASK);
+ writel(0, &ioc->chip->HostInterruptStatus);
+ if ((_base_wait_for_doorbell_int(ioc, 5))) {
+@@ -7076,7 +7093,7 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes,
+ __LINE__);
+ return -EFAULT;
+ }
+- reply[1] = le16_to_cpu(ioc->base_readl(&ioc->chip->Doorbell)
++ reply[1] = le16_to_cpu(ioc->base_readl_ext_retry(&ioc->chip->Doorbell)
+ & MPI2_DOORBELL_DATA_MASK);
+ writel(0, &ioc->chip->HostInterruptStatus);
+
+@@ -7087,10 +7104,10 @@ _base_handshake_req_reply_wait(struct MPT3SAS_ADAPTER *ioc, int request_bytes,
+ return -EFAULT;
+ }
+ if (i >= reply_bytes/2) /* overflow case */
+- ioc->base_readl(&ioc->chip->Doorbell);
++ ioc->base_readl_ext_retry(&ioc->chip->Doorbell);
+ else
+ reply[i] = le16_to_cpu(
+- ioc->base_readl(&ioc->chip->Doorbell)
++ ioc->base_readl_ext_retry(&ioc->chip->Doorbell)
+ & MPI2_DOORBELL_DATA_MASK);
+ writel(0, &ioc->chip->HostInterruptStatus);
+ }
+@@ -7949,7 +7966,7 @@ _base_diag_reset(struct MPT3SAS_ADAPTER *ioc)
+ goto out;
+ }
+
+- host_diagnostic = ioc->base_readl(&ioc->chip->HostDiagnostic);
++ host_diagnostic = ioc->base_readl_ext_retry(&ioc->chip->HostDiagnostic);
+ drsprintk(ioc,
+ ioc_info(ioc, "wrote magic sequence: count(%d), host_diagnostic(0x%08x)\n",
+ count, host_diagnostic));
+@@ -7969,7 +7986,7 @@ _base_diag_reset(struct MPT3SAS_ADAPTER *ioc)
+ for (count = 0; count < (300000000 /
+ MPI2_HARD_RESET_PCIE_SECOND_READ_DELAY_MICRO_SEC); count++) {
+
+- host_diagnostic = ioc->base_readl(&ioc->chip->HostDiagnostic);
++ host_diagnostic = ioc->base_readl_ext_retry(&ioc->chip->HostDiagnostic);
+
+ if (host_diagnostic == 0xFFFFFFFF) {
+ ioc_info(ioc,
+@@ -8359,10 +8376,13 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
+ ioc->rdpq_array_enable_assigned = 0;
+ ioc->use_32bit_dma = false;
+ ioc->dma_mask = 64;
+- if (ioc->is_aero_ioc)
++ if (ioc->is_aero_ioc) {
+ ioc->base_readl = &_base_readl_aero;
+- else
++ ioc->base_readl_ext_retry = &_base_readl_ext_retry;
++ } else {
+ ioc->base_readl = &_base_readl;
++ ioc->base_readl_ext_retry = &_base_readl;
++ }
+ r = mpt3sas_base_map_resources(ioc);
+ if (r)
+ goto out_free_resources;
+diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
+index 05364aa15ecdb..10055c7e4a9f7 100644
+--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
++++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
+@@ -1618,6 +1618,7 @@ struct MPT3SAS_ADAPTER {
+ u8 diag_trigger_active;
+ u8 atomic_desc_capable;
+ BASE_READ_REG base_readl;
++ BASE_READ_REG base_readl_ext_retry;
+ struct SL_WH_MASTER_TRIGGER_T diag_trigger_master;
+ struct SL_WH_EVENT_TRIGGERS_T diag_trigger_event;
+ struct SL_WH_SCSI_TRIGGERS_T diag_trigger_scsi;
+diff --git a/drivers/scsi/qedf/qedf_dbg.h b/drivers/scsi/qedf/qedf_dbg.h
+index f4d81127239eb..5ec2b817c694a 100644
+--- a/drivers/scsi/qedf/qedf_dbg.h
++++ b/drivers/scsi/qedf/qedf_dbg.h
+@@ -59,6 +59,8 @@ extern uint qedf_debug;
+ #define QEDF_LOG_NOTICE 0x40000000 /* Notice logs */
+ #define QEDF_LOG_WARN 0x80000000 /* Warning logs */
+
++#define QEDF_DEBUGFS_LOG_LEN (2 * PAGE_SIZE)
++
+ /* Debug context structure */
+ struct qedf_dbg_ctx {
+ unsigned int host_no;
+diff --git a/drivers/scsi/qedf/qedf_debugfs.c b/drivers/scsi/qedf/qedf_debugfs.c
+index a3ed681c8ce3f..451fd236bfd05 100644
+--- a/drivers/scsi/qedf/qedf_debugfs.c
++++ b/drivers/scsi/qedf/qedf_debugfs.c
+@@ -8,6 +8,7 @@
+ #include <linux/uaccess.h>
+ #include <linux/debugfs.h>
+ #include <linux/module.h>
++#include <linux/vmalloc.h>
+
+ #include "qedf.h"
+ #include "qedf_dbg.h"
+@@ -98,7 +99,9 @@ static ssize_t
+ qedf_dbg_fp_int_cmd_read(struct file *filp, char __user *buffer, size_t count,
+ loff_t *ppos)
+ {
++ ssize_t ret;
+ size_t cnt = 0;
++ char *cbuf;
+ int id;
+ struct qedf_fastpath *fp = NULL;
+ struct qedf_dbg_ctx *qedf_dbg =
+@@ -108,19 +111,25 @@ qedf_dbg_fp_int_cmd_read(struct file *filp, char __user *buffer, size_t count,
+
+ QEDF_INFO(qedf_dbg, QEDF_LOG_DEBUGFS, "entered\n");
+
+- cnt = sprintf(buffer, "\nFastpath I/O completions\n\n");
++ cbuf = vmalloc(QEDF_DEBUGFS_LOG_LEN);
++ if (!cbuf)
++ return 0;
++
++ cnt += scnprintf(cbuf + cnt, QEDF_DEBUGFS_LOG_LEN - cnt, "\nFastpath I/O completions\n\n");
+
+ for (id = 0; id < qedf->num_queues; id++) {
+ fp = &(qedf->fp_array[id]);
+ if (fp->sb_id == QEDF_SB_ID_NULL)
+ continue;
+- cnt += sprintf((buffer + cnt), "#%d: %lu\n", id,
+- fp->completions);
++ cnt += scnprintf(cbuf + cnt, QEDF_DEBUGFS_LOG_LEN - cnt,
++ "#%d: %lu\n", id, fp->completions);
+ }
+
+- cnt = min_t(int, count, cnt - *ppos);
+- *ppos += cnt;
+- return cnt;
++ ret = simple_read_from_buffer(buffer, count, ppos, cbuf, cnt);
++
++ vfree(cbuf);
++
++ return ret;
+ }
+
+ static ssize_t
+@@ -138,15 +147,14 @@ qedf_dbg_debug_cmd_read(struct file *filp, char __user *buffer, size_t count,
+ loff_t *ppos)
+ {
+ int cnt;
++ char cbuf[32];
+ struct qedf_dbg_ctx *qedf_dbg =
+ (struct qedf_dbg_ctx *)filp->private_data;
+
+ QEDF_INFO(qedf_dbg, QEDF_LOG_DEBUGFS, "debug mask=0x%x\n", qedf_debug);
+- cnt = sprintf(buffer, "debug mask = 0x%x\n", qedf_debug);
++ cnt = scnprintf(cbuf, sizeof(cbuf), "debug mask = 0x%x\n", qedf_debug);
+
+- cnt = min_t(int, count, cnt - *ppos);
+- *ppos += cnt;
+- return cnt;
++ return simple_read_from_buffer(buffer, count, ppos, cbuf, cnt);
+ }
+
+ static ssize_t
+@@ -185,18 +193,17 @@ qedf_dbg_stop_io_on_error_cmd_read(struct file *filp, char __user *buffer,
+ size_t count, loff_t *ppos)
+ {
+ int cnt;
++ char cbuf[7];
+ struct qedf_dbg_ctx *qedf_dbg =
+ (struct qedf_dbg_ctx *)filp->private_data;
+ struct qedf_ctx *qedf = container_of(qedf_dbg,
+ struct qedf_ctx, dbg_ctx);
+
+ QEDF_INFO(qedf_dbg, QEDF_LOG_DEBUGFS, "entered\n");
+- cnt = sprintf(buffer, "%s\n",
++ cnt = scnprintf(cbuf, sizeof(cbuf), "%s\n",
+ qedf->stop_io_on_error ? "true" : "false");
+
+- cnt = min_t(int, count, cnt - *ppos);
+- *ppos += cnt;
+- return cnt;
++ return simple_read_from_buffer(buffer, count, ppos, cbuf, cnt);
+ }
+
+ static ssize_t
+diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
+index ef62dbbc1868e..1106d26113888 100644
+--- a/drivers/scsi/qedi/qedi_main.c
++++ b/drivers/scsi/qedi/qedi_main.c
+@@ -1977,8 +1977,9 @@ static int qedi_cpu_offline(unsigned int cpu)
+ struct qedi_percpu_s *p = this_cpu_ptr(&qedi_percpu);
+ struct qedi_work *work, *tmp;
+ struct task_struct *thread;
++ unsigned long flags;
+
+- spin_lock_bh(&p->p_work_lock);
++ spin_lock_irqsave(&p->p_work_lock, flags);
+ thread = p->iothread;
+ p->iothread = NULL;
+
+@@ -1989,7 +1990,7 @@ static int qedi_cpu_offline(unsigned int cpu)
+ kfree(work);
+ }
+
+- spin_unlock_bh(&p->p_work_lock);
++ spin_unlock_irqrestore(&p->p_work_lock, flags);
+ if (thread)
+ kthread_stop(thread);
+ return 0;
+diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
+index e6b857d7b3dd4..00cf7b81cfc0f 100644
+--- a/drivers/scsi/qla2xxx/qla_init.c
++++ b/drivers/scsi/qla2xxx/qla_init.c
+@@ -5549,7 +5549,7 @@ static void qla_get_login_template(scsi_qla_host_t *vha)
+ __be32 *q;
+
+ memset(ha->init_cb, 0, ha->init_cb_size);
+- sz = min_t(int, sizeof(struct fc_els_csp), ha->init_cb_size);
++ sz = min_t(int, sizeof(struct fc_els_flogi), ha->init_cb_size);
+ rval = qla24xx_get_port_login_templ(vha, ha->init_cb_dma,
+ ha->init_cb, sz);
+ if (rval != QLA_SUCCESS) {
+diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c
+index ee6d784c095c9..b6959470ddc04 100644
+--- a/drivers/scsi/qla4xxx/ql4_os.c
++++ b/drivers/scsi/qla4xxx/ql4_os.c
+@@ -968,6 +968,11 @@ static int qla4xxx_set_chap_entry(struct Scsi_Host *shost, void *data, int len)
+ memset(&chap_rec, 0, sizeof(chap_rec));
+
+ nla_for_each_attr(attr, data, len, rem) {
++ if (nla_len(attr) < sizeof(*param_info)) {
++ rc = -EINVAL;
++ goto exit_set_chap;
++ }
++
+ param_info = nla_data(attr);
+
+ switch (param_info->param) {
+@@ -2750,6 +2755,11 @@ qla4xxx_iface_set_param(struct Scsi_Host *shost, void *data, uint32_t len)
+ }
+
+ nla_for_each_attr(attr, data, len, rem) {
++ if (nla_len(attr) < sizeof(*iface_param)) {
++ rval = -EINVAL;
++ goto exit_init_fw_cb;
++ }
++
+ iface_param = nla_data(attr);
+
+ if (iface_param->param_type == ISCSI_NET_PARAM) {
+@@ -8104,6 +8114,11 @@ qla4xxx_sysfs_ddb_set_param(struct iscsi_bus_flash_session *fnode_sess,
+
+ memset((void *)&chap_tbl, 0, sizeof(chap_tbl));
+ nla_for_each_attr(attr, data, len, rem) {
++ if (nla_len(attr) < sizeof(*fnode_param)) {
++ rc = -EINVAL;
++ goto exit_set_param;
++ }
++
+ fnode_param = nla_data(attr);
+
+ switch (fnode_param->param) {
+diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
+index b9b97300e3b3c..49dbcd67579aa 100644
+--- a/drivers/scsi/scsi_transport_iscsi.c
++++ b/drivers/scsi/scsi_transport_iscsi.c
+@@ -3013,14 +3013,15 @@ iscsi_if_destroy_conn(struct iscsi_transport *transport, struct iscsi_uevent *ev
+ }
+
+ static int
+-iscsi_if_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev)
++iscsi_if_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev, u32 rlen)
+ {
+ char *data = (char*)ev + sizeof(*ev);
+ struct iscsi_cls_conn *conn;
+ struct iscsi_cls_session *session;
+ int err = 0, value = 0, state;
+
+- if (ev->u.set_param.len > PAGE_SIZE)
++ if (ev->u.set_param.len > rlen ||
++ ev->u.set_param.len > PAGE_SIZE)
+ return -EINVAL;
+
+ session = iscsi_session_lookup(ev->u.set_param.sid);
+@@ -3028,6 +3029,10 @@ iscsi_if_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev)
+ if (!conn || !session)
+ return -EINVAL;
+
++ /* data will be regarded as NULL-ended string, do length check */
++ if (strlen(data) > ev->u.set_param.len)
++ return -EINVAL;
++
+ switch (ev->u.set_param.param) {
+ case ISCSI_PARAM_SESS_RECOVERY_TMO:
+ sscanf(data, "%d", &value);
+@@ -3117,7 +3122,7 @@ put_ep:
+
+ static int
+ iscsi_if_transport_ep(struct iscsi_transport *transport,
+- struct iscsi_uevent *ev, int msg_type)
++ struct iscsi_uevent *ev, int msg_type, u32 rlen)
+ {
+ struct iscsi_endpoint *ep;
+ int rc = 0;
+@@ -3125,7 +3130,10 @@ iscsi_if_transport_ep(struct iscsi_transport *transport,
+ switch (msg_type) {
+ case ISCSI_UEVENT_TRANSPORT_EP_CONNECT_THROUGH_HOST:
+ case ISCSI_UEVENT_TRANSPORT_EP_CONNECT:
+- rc = iscsi_if_ep_connect(transport, ev, msg_type);
++ if (rlen < sizeof(struct sockaddr))
++ rc = -EINVAL;
++ else
++ rc = iscsi_if_ep_connect(transport, ev, msg_type);
+ break;
+ case ISCSI_UEVENT_TRANSPORT_EP_POLL:
+ if (!transport->ep_poll)
+@@ -3149,12 +3157,15 @@ iscsi_if_transport_ep(struct iscsi_transport *transport,
+
+ static int
+ iscsi_tgt_dscvr(struct iscsi_transport *transport,
+- struct iscsi_uevent *ev)
++ struct iscsi_uevent *ev, u32 rlen)
+ {
+ struct Scsi_Host *shost;
+ struct sockaddr *dst_addr;
+ int err;
+
++ if (rlen < sizeof(*dst_addr))
++ return -EINVAL;
++
+ if (!transport->tgt_dscvr)
+ return -EINVAL;
+
+@@ -3175,7 +3186,7 @@ iscsi_tgt_dscvr(struct iscsi_transport *transport,
+
+ static int
+ iscsi_set_host_param(struct iscsi_transport *transport,
+- struct iscsi_uevent *ev)
++ struct iscsi_uevent *ev, u32 rlen)
+ {
+ char *data = (char*)ev + sizeof(*ev);
+ struct Scsi_Host *shost;
+@@ -3184,7 +3195,8 @@ iscsi_set_host_param(struct iscsi_transport *transport,
+ if (!transport->set_host_param)
+ return -ENOSYS;
+
+- if (ev->u.set_host_param.len > PAGE_SIZE)
++ if (ev->u.set_host_param.len > rlen ||
++ ev->u.set_host_param.len > PAGE_SIZE)
+ return -EINVAL;
+
+ shost = scsi_host_lookup(ev->u.set_host_param.host_no);
+@@ -3194,6 +3206,10 @@ iscsi_set_host_param(struct iscsi_transport *transport,
+ return -ENODEV;
+ }
+
++ /* see similar check in iscsi_if_set_param() */
++ if (strlen(data) > ev->u.set_host_param.len)
++ return -EINVAL;
++
+ err = transport->set_host_param(shost, ev->u.set_host_param.param,
+ data, ev->u.set_host_param.len);
+ scsi_host_put(shost);
+@@ -3201,12 +3217,15 @@ iscsi_set_host_param(struct iscsi_transport *transport,
+ }
+
+ static int
+-iscsi_set_path(struct iscsi_transport *transport, struct iscsi_uevent *ev)
++iscsi_set_path(struct iscsi_transport *transport, struct iscsi_uevent *ev, u32 rlen)
+ {
+ struct Scsi_Host *shost;
+ struct iscsi_path *params;
+ int err;
+
++ if (rlen < sizeof(*params))
++ return -EINVAL;
++
+ if (!transport->set_path)
+ return -ENOSYS;
+
+@@ -3266,12 +3285,15 @@ iscsi_set_iface_params(struct iscsi_transport *transport,
+ }
+
+ static int
+-iscsi_send_ping(struct iscsi_transport *transport, struct iscsi_uevent *ev)
++iscsi_send_ping(struct iscsi_transport *transport, struct iscsi_uevent *ev, u32 rlen)
+ {
+ struct Scsi_Host *shost;
+ struct sockaddr *dst_addr;
+ int err;
+
++ if (rlen < sizeof(*dst_addr))
++ return -EINVAL;
++
+ if (!transport->send_ping)
+ return -ENOSYS;
+
+@@ -3769,13 +3791,12 @@ exit_host_stats:
+ }
+
+ static int iscsi_if_transport_conn(struct iscsi_transport *transport,
+- struct nlmsghdr *nlh)
++ struct nlmsghdr *nlh, u32 pdu_len)
+ {
+ struct iscsi_uevent *ev = nlmsg_data(nlh);
+ struct iscsi_cls_session *session;
+ struct iscsi_cls_conn *conn = NULL;
+ struct iscsi_endpoint *ep;
+- uint32_t pdu_len;
+ int err = 0;
+
+ switch (nlh->nlmsg_type) {
+@@ -3860,8 +3881,6 @@ static int iscsi_if_transport_conn(struct iscsi_transport *transport,
+
+ break;
+ case ISCSI_UEVENT_SEND_PDU:
+- pdu_len = nlh->nlmsg_len - sizeof(*nlh) - sizeof(*ev);
+-
+ if ((ev->u.send_pdu.hdr_size > pdu_len) ||
+ (ev->u.send_pdu.data_size > (pdu_len - ev->u.send_pdu.hdr_size))) {
+ err = -EINVAL;
+@@ -3891,6 +3910,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
+ struct iscsi_internal *priv;
+ struct iscsi_cls_session *session;
+ struct iscsi_endpoint *ep = NULL;
++ u32 rlen;
+
+ if (!netlink_capable(skb, CAP_SYS_ADMIN))
+ return -EPERM;
+@@ -3910,6 +3930,13 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
+
+ portid = NETLINK_CB(skb).portid;
+
++ /*
++ * Even though the remaining payload may not be regarded as nlattr,
++ * (like address or something else), calculate the remaining length
++ * here to ease following length checks.
++ */
++ rlen = nlmsg_attrlen(nlh, sizeof(*ev));
++
+ switch (nlh->nlmsg_type) {
+ case ISCSI_UEVENT_CREATE_SESSION:
+ err = iscsi_if_create_session(priv, ep, ev,
+@@ -3966,7 +3993,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
+ err = -EINVAL;
+ break;
+ case ISCSI_UEVENT_SET_PARAM:
+- err = iscsi_if_set_param(transport, ev);
++ err = iscsi_if_set_param(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_CREATE_CONN:
+ case ISCSI_UEVENT_DESTROY_CONN:
+@@ -3974,7 +4001,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
+ case ISCSI_UEVENT_START_CONN:
+ case ISCSI_UEVENT_BIND_CONN:
+ case ISCSI_UEVENT_SEND_PDU:
+- err = iscsi_if_transport_conn(transport, nlh);
++ err = iscsi_if_transport_conn(transport, nlh, rlen);
+ break;
+ case ISCSI_UEVENT_GET_STATS:
+ err = iscsi_if_get_stats(transport, nlh);
+@@ -3983,23 +4010,22 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
+ case ISCSI_UEVENT_TRANSPORT_EP_POLL:
+ case ISCSI_UEVENT_TRANSPORT_EP_DISCONNECT:
+ case ISCSI_UEVENT_TRANSPORT_EP_CONNECT_THROUGH_HOST:
+- err = iscsi_if_transport_ep(transport, ev, nlh->nlmsg_type);
++ err = iscsi_if_transport_ep(transport, ev, nlh->nlmsg_type, rlen);
+ break;
+ case ISCSI_UEVENT_TGT_DSCVR:
+- err = iscsi_tgt_dscvr(transport, ev);
++ err = iscsi_tgt_dscvr(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_SET_HOST_PARAM:
+- err = iscsi_set_host_param(transport, ev);
++ err = iscsi_set_host_param(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_PATH_UPDATE:
+- err = iscsi_set_path(transport, ev);
++ err = iscsi_set_path(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_SET_IFACE_PARAMS:
+- err = iscsi_set_iface_params(transport, ev,
+- nlmsg_attrlen(nlh, sizeof(*ev)));
++ err = iscsi_set_iface_params(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_PING:
+- err = iscsi_send_ping(transport, ev);
++ err = iscsi_send_ping(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_GET_CHAP:
+ err = iscsi_get_chap(transport, nlh);
+@@ -4008,13 +4034,10 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
+ err = iscsi_delete_chap(transport, ev);
+ break;
+ case ISCSI_UEVENT_SET_FLASHNODE_PARAMS:
+- err = iscsi_set_flashnode_param(transport, ev,
+- nlmsg_attrlen(nlh,
+- sizeof(*ev)));
++ err = iscsi_set_flashnode_param(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_NEW_FLASHNODE:
+- err = iscsi_new_flashnode(transport, ev,
+- nlmsg_attrlen(nlh, sizeof(*ev)));
++ err = iscsi_new_flashnode(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_DEL_FLASHNODE:
+ err = iscsi_del_flashnode(transport, ev);
+@@ -4029,8 +4052,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
+ err = iscsi_logout_flashnode_sid(transport, ev);
+ break;
+ case ISCSI_UEVENT_SET_CHAP:
+- err = iscsi_set_chap(transport, ev,
+- nlmsg_attrlen(nlh, sizeof(*ev)));
++ err = iscsi_set_chap(transport, ev, rlen);
+ break;
+ case ISCSI_UEVENT_GET_HOST_STATS:
+ err = iscsi_get_host_stats(transport, nlh);
+diff --git a/drivers/soc/qcom/ocmem.c b/drivers/soc/qcom/ocmem.c
+index 199fe98720350..ef7c1748242ac 100644
+--- a/drivers/soc/qcom/ocmem.c
++++ b/drivers/soc/qcom/ocmem.c
+@@ -76,8 +76,12 @@ struct ocmem {
+ #define OCMEM_REG_GFX_MPU_START 0x00001004
+ #define OCMEM_REG_GFX_MPU_END 0x00001008
+
+-#define OCMEM_HW_PROFILE_NUM_PORTS(val) FIELD_PREP(0x0000000f, (val))
+-#define OCMEM_HW_PROFILE_NUM_MACROS(val) FIELD_PREP(0x00003f00, (val))
++#define OCMEM_HW_VERSION_MAJOR(val) FIELD_GET(GENMASK(31, 28), val)
++#define OCMEM_HW_VERSION_MINOR(val) FIELD_GET(GENMASK(27, 16), val)
++#define OCMEM_HW_VERSION_STEP(val) FIELD_GET(GENMASK(15, 0), val)
++
++#define OCMEM_HW_PROFILE_NUM_PORTS(val) FIELD_GET(0x0000000f, (val))
++#define OCMEM_HW_PROFILE_NUM_MACROS(val) FIELD_GET(0x00003f00, (val))
+
+ #define OCMEM_HW_PROFILE_LAST_REGN_HALFSIZE 0x00010000
+ #define OCMEM_HW_PROFILE_INTERLEAVING 0x00020000
+@@ -355,6 +359,12 @@ static int ocmem_dev_probe(struct platform_device *pdev)
+ }
+ }
+
++ reg = ocmem_read(ocmem, OCMEM_REG_HW_VERSION);
++ dev_dbg(dev, "OCMEM hardware version: %lu.%lu.%lu\n",
++ OCMEM_HW_VERSION_MAJOR(reg),
++ OCMEM_HW_VERSION_MINOR(reg),
++ OCMEM_HW_VERSION_STEP(reg));
++
+ reg = ocmem_read(ocmem, OCMEM_REG_HW_PROFILE);
+ ocmem->num_ports = OCMEM_HW_PROFILE_NUM_PORTS(reg);
+ ocmem->num_macros = OCMEM_HW_PROFILE_NUM_MACROS(reg);
+diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
+index 6be7ea93c78cf..1e08bb3b1679a 100644
+--- a/drivers/soc/qcom/smem.c
++++ b/drivers/soc/qcom/smem.c
+@@ -723,7 +723,7 @@ EXPORT_SYMBOL(qcom_smem_get_free_space);
+
+ static bool addr_in_range(void __iomem *base, size_t size, void *addr)
+ {
+- return base && (addr >= base && addr < base + size);
++ return base && ((void __iomem *)addr >= base && (void __iomem *)addr < base + size);
+ }
+
+ /**
+diff --git a/drivers/spi/spi-mpc512x-psc.c b/drivers/spi/spi-mpc512x-psc.c
+index 99aeef28a4774..5cecca1bef026 100644
+--- a/drivers/spi/spi-mpc512x-psc.c
++++ b/drivers/spi/spi-mpc512x-psc.c
+@@ -53,7 +53,7 @@ struct mpc512x_psc_spi {
+ int type;
+ void __iomem *psc;
+ struct mpc512x_psc_fifo __iomem *fifo;
+- unsigned int irq;
++ int irq;
+ u8 bits_per_word;
+ u32 mclk_rate;
+
+diff --git a/drivers/spi/spi-tegra20-sflash.c b/drivers/spi/spi-tegra20-sflash.c
+index 4286310628a2b..0c5507473f972 100644
+--- a/drivers/spi/spi-tegra20-sflash.c
++++ b/drivers/spi/spi-tegra20-sflash.c
+@@ -455,7 +455,11 @@ static int tegra_sflash_probe(struct platform_device *pdev)
+ goto exit_free_master;
+ }
+
+- tsd->irq = platform_get_irq(pdev, 0);
++ ret = platform_get_irq(pdev, 0);
++ if (ret < 0)
++ goto exit_free_master;
++ tsd->irq = ret;
++
+ ret = request_irq(tsd->irq, tegra_sflash_isr, 0,
+ dev_name(&pdev->dev), tsd);
+ if (ret < 0) {
+diff --git a/drivers/staging/fbtft/fb_ili9341.c b/drivers/staging/fbtft/fb_ili9341.c
+index 9ccd0823c3ab3..47e72b87d76d9 100644
+--- a/drivers/staging/fbtft/fb_ili9341.c
++++ b/drivers/staging/fbtft/fb_ili9341.c
+@@ -145,7 +145,7 @@ static struct fbtft_display display = {
+ },
+ };
+
+-FBTFT_REGISTER_DRIVER(DRVNAME, "ilitek,ili9341", &display);
++FBTFT_REGISTER_SPI_DRIVER(DRVNAME, "ilitek", "ili9341", &display);
+
+ MODULE_ALIAS("spi:" DRVNAME);
+ MODULE_ALIAS("platform:" DRVNAME);
+diff --git a/drivers/staging/media/av7110/sp8870.c b/drivers/staging/media/av7110/sp8870.c
+index 9767159aeb9b2..abf5c72607b64 100644
+--- a/drivers/staging/media/av7110/sp8870.c
++++ b/drivers/staging/media/av7110/sp8870.c
+@@ -606,4 +606,4 @@ MODULE_DESCRIPTION("Spase SP8870 DVB-T Demodulator driver");
+ MODULE_AUTHOR("Juergen Peitz");
+ MODULE_LICENSE("GPL");
+
+-EXPORT_SYMBOL(sp8870_attach);
++EXPORT_SYMBOL_GPL(sp8870_attach);
+diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
+index 134e2b9fa7d9a..84a41792cb4b8 100644
+--- a/drivers/staging/media/rkvdec/rkvdec.c
++++ b/drivers/staging/media/rkvdec/rkvdec.c
+@@ -120,7 +120,7 @@ static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = {
+ .max_width = 4096,
+ .step_width = 16,
+ .min_height = 48,
+- .max_height = 2304,
++ .max_height = 2560,
+ .step_height = 16,
+ },
+ .ctrls = &rkvdec_h264_ctrls,
+diff --git a/drivers/thermal/imx8mm_thermal.c b/drivers/thermal/imx8mm_thermal.c
+index d8005e9ec992b..1f780c4a1c890 100644
+--- a/drivers/thermal/imx8mm_thermal.c
++++ b/drivers/thermal/imx8mm_thermal.c
+@@ -179,10 +179,8 @@ static int imx8mm_tmu_probe_set_calib_v1(struct platform_device *pdev,
+ int ret;
+
+ ret = nvmem_cell_read_u32(&pdev->dev, "calib", &ana0);
+- if (ret) {
+- dev_warn(dev, "Failed to read OCOTP nvmem cell (%d).\n", ret);
+- return ret;
+- }
++ if (ret)
++ return dev_err_probe(dev, ret, "Failed to read OCOTP nvmem cell\n");
+
+ writel(FIELD_PREP(TASR_BUF_VREF_MASK,
+ FIELD_GET(ANA0_BUF_VREF_MASK, ana0)) |
+diff --git a/drivers/thermal/mediatek/lvts_thermal.c b/drivers/thermal/mediatek/lvts_thermal.c
+index d0a3f95b7884b..d4f160e8c7dbc 100644
+--- a/drivers/thermal/mediatek/lvts_thermal.c
++++ b/drivers/thermal/mediatek/lvts_thermal.c
+@@ -63,7 +63,12 @@
+ #define LVTS_HW_FILTER 0x2
+ #define LVTS_TSSEL_CONF 0x13121110
+ #define LVTS_CALSCALE_CONF 0x300
+-#define LVTS_MONINT_CONF 0x9FBF7BDE
++#define LVTS_MONINT_CONF 0x8300318C
++
++#define LVTS_MONINT_OFFSET_SENSOR0 0xC
++#define LVTS_MONINT_OFFSET_SENSOR1 0x180
++#define LVTS_MONINT_OFFSET_SENSOR2 0x3000
++#define LVTS_MONINT_OFFSET_SENSOR3 0x3000000
+
+ #define LVTS_INT_SENSOR0 0x0009001F
+ #define LVTS_INT_SENSOR1 0x001203E0
+@@ -81,6 +86,8 @@
+
+ #define LVTS_HW_SHUTDOWN_MT8195 105000
+
++#define LVTS_MINIMUM_THRESHOLD 20000
++
+ static int golden_temp = LVTS_GOLDEN_TEMP_DEFAULT;
+ static int coeff_b = LVTS_COEFF_B;
+
+@@ -108,6 +115,8 @@ struct lvts_sensor {
+ void __iomem *base;
+ int id;
+ int dt_id;
++ int low_thresh;
++ int high_thresh;
+ };
+
+ struct lvts_ctrl {
+@@ -117,6 +126,8 @@ struct lvts_ctrl {
+ int num_lvts_sensor;
+ int mode;
+ void __iomem *base;
++ int low_thresh;
++ int high_thresh;
+ };
+
+ struct lvts_domain {
+@@ -288,32 +299,84 @@ static int lvts_get_temp(struct thermal_zone_device *tz, int *temp)
+ return 0;
+ }
+
++static void lvts_update_irq_mask(struct lvts_ctrl *lvts_ctrl)
++{
++ u32 masks[] = {
++ LVTS_MONINT_OFFSET_SENSOR0,
++ LVTS_MONINT_OFFSET_SENSOR1,
++ LVTS_MONINT_OFFSET_SENSOR2,
++ LVTS_MONINT_OFFSET_SENSOR3,
++ };
++ u32 value = 0;
++ int i;
++
++ value = readl(LVTS_MONINT(lvts_ctrl->base));
++
++ for (i = 0; i < ARRAY_SIZE(masks); i++) {
++ if (lvts_ctrl->sensors[i].high_thresh == lvts_ctrl->high_thresh
++ && lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh)
++ value |= masks[i];
++ else
++ value &= ~masks[i];
++ }
++
++ writel(value, LVTS_MONINT(lvts_ctrl->base));
++}
++
++static bool lvts_should_update_thresh(struct lvts_ctrl *lvts_ctrl, int high)
++{
++ int i;
++
++ if (high > lvts_ctrl->high_thresh)
++ return true;
++
++ for (i = 0; i < lvts_ctrl->num_lvts_sensor; i++)
++ if (lvts_ctrl->sensors[i].high_thresh == lvts_ctrl->high_thresh
++ && lvts_ctrl->sensors[i].low_thresh == lvts_ctrl->low_thresh)
++ return false;
++
++ return true;
++}
++
+ static int lvts_set_trips(struct thermal_zone_device *tz, int low, int high)
+ {
+ struct lvts_sensor *lvts_sensor = thermal_zone_device_priv(tz);
++ struct lvts_ctrl *lvts_ctrl = container_of(lvts_sensor, struct lvts_ctrl, sensors[lvts_sensor->id]);
+ void __iomem *base = lvts_sensor->base;
+- u32 raw_low = lvts_temp_to_raw(low);
++ u32 raw_low = lvts_temp_to_raw(low != -INT_MAX ? low : LVTS_MINIMUM_THRESHOLD);
+ u32 raw_high = lvts_temp_to_raw(high);
++ bool should_update_thresh;
++
++ lvts_sensor->low_thresh = low;
++ lvts_sensor->high_thresh = high;
++
++ should_update_thresh = lvts_should_update_thresh(lvts_ctrl, high);
++ if (should_update_thresh) {
++ lvts_ctrl->high_thresh = high;
++ lvts_ctrl->low_thresh = low;
++ }
++ lvts_update_irq_mask(lvts_ctrl);
++
++ if (!should_update_thresh)
++ return 0;
+
+ /*
+- * Hot to normal temperature threshold
++ * Low offset temperature threshold
+ *
+- * LVTS_H2NTHRE
++ * LVTS_OFFSETL
+ *
+ * Bits:
+ *
+ * 14-0 : Raw temperature for threshold
+ */
+- if (low != -INT_MAX) {
+- pr_debug("%s: Setting low limit temperature interrupt: %d\n",
+- thermal_zone_device_type(tz), low);
+- writel(raw_low, LVTS_H2NTHRE(base));
+- }
++ pr_debug("%s: Setting low limit temperature interrupt: %d\n",
++ thermal_zone_device_type(tz), low);
++ writel(raw_low, LVTS_OFFSETL(base));
+
+ /*
+- * Hot temperature threshold
++ * High offset temperature threshold
+ *
+- * LVTS_HTHRE
++ * LVTS_OFFSETH
+ *
+ * Bits:
+ *
+@@ -321,7 +384,7 @@ static int lvts_set_trips(struct thermal_zone_device *tz, int low, int high)
+ */
+ pr_debug("%s: Setting high limit temperature interrupt: %d\n",
+ thermal_zone_device_type(tz), high);
+- writel(raw_high, LVTS_HTHRE(base));
++ writel(raw_high, LVTS_OFFSETH(base));
+
+ return 0;
+ }
+@@ -449,7 +512,7 @@ static irqreturn_t lvts_irq_handler(int irq, void *data)
+
+ for (i = 0; i < lvts_td->num_lvts_ctrl; i++) {
+
+- aux = lvts_ctrl_irq_handler(lvts_td->lvts_ctrl);
++ aux = lvts_ctrl_irq_handler(&lvts_td->lvts_ctrl[i]);
+ if (aux != IRQ_HANDLED)
+ continue;
+
+@@ -519,6 +582,9 @@ static int lvts_sensor_init(struct device *dev, struct lvts_ctrl *lvts_ctrl,
+ */
+ lvts_sensor[i].msr = lvts_ctrl_data->mode == LVTS_MSR_IMMEDIATE_MODE ?
+ imm_regs[i] : msr_regs[i];
++
++ lvts_sensor[i].low_thresh = INT_MIN;
++ lvts_sensor[i].high_thresh = INT_MIN;
+ };
+
+ lvts_ctrl->num_lvts_sensor = lvts_ctrl_data->num_lvts_sensor;
+@@ -686,6 +752,9 @@ static int lvts_ctrl_init(struct device *dev, struct lvts_domain *lvts_td,
+ */
+ lvts_ctrl[i].hw_tshut_raw_temp =
+ lvts_temp_to_raw(lvts_data->lvts_ctrl[i].hw_tshut_temp);
++
++ lvts_ctrl[i].low_thresh = INT_MIN;
++ lvts_ctrl[i].high_thresh = INT_MIN;
+ }
+
+ /*
+@@ -894,24 +963,6 @@ static int lvts_ctrl_configure(struct device *dev, struct lvts_ctrl *lvts_ctrl)
+ LVTS_HW_FILTER << 3 | LVTS_HW_FILTER;
+ writel(value, LVTS_MSRCTL0(lvts_ctrl->base));
+
+- /*
+- * LVTS_MSRCTL1 : Measurement control
+- *
+- * Bits:
+- *
+- * 9: Ignore MSRCTL0 config and do immediate measurement on sensor3
+- * 6: Ignore MSRCTL0 config and do immediate measurement on sensor2
+- * 5: Ignore MSRCTL0 config and do immediate measurement on sensor1
+- * 4: Ignore MSRCTL0 config and do immediate measurement on sensor0
+- *
+- * That configuration will ignore the filtering and the delays
+- * introduced below in MONCTL1 and MONCTL2
+- */
+- if (lvts_ctrl->mode == LVTS_MSR_IMMEDIATE_MODE) {
+- value = BIT(9) | BIT(6) | BIT(5) | BIT(4);
+- writel(value, LVTS_MSRCTL1(lvts_ctrl->base));
+- }
+-
+ /*
+ * LVTS_MONCTL1 : Period unit and group interval configuration
+ *
+@@ -977,6 +1028,15 @@ static int lvts_ctrl_start(struct device *dev, struct lvts_ctrl *lvts_ctrl)
+ struct thermal_zone_device *tz;
+ u32 sensor_map = 0;
+ int i;
++ /*
++ * Bitmaps to enable each sensor on immediate and filtered modes, as
++ * described in MSRCTL1 and MONCTL0 registers below, respectively.
++ */
++ u32 sensor_imm_bitmap[] = { BIT(4), BIT(5), BIT(6), BIT(9) };
++ u32 sensor_filt_bitmap[] = { BIT(0), BIT(1), BIT(2), BIT(3) };
++
++ u32 *sensor_bitmap = lvts_ctrl->mode == LVTS_MSR_IMMEDIATE_MODE ?
++ sensor_imm_bitmap : sensor_filt_bitmap;
+
+ for (i = 0; i < lvts_ctrl->num_lvts_sensor; i++) {
+
+@@ -1012,20 +1072,38 @@ static int lvts_ctrl_start(struct device *dev, struct lvts_ctrl *lvts_ctrl)
+ * map, so we can enable the temperature monitoring in
+ * the hardware thermal controller.
+ */
+- sensor_map |= BIT(i);
++ sensor_map |= sensor_bitmap[i];
+ }
+
+ /*
+- * Bits:
+- * 9: Single point access flow
+- * 0-3: Enable sensing point 0-3
+- *
+ * The initialization of the thermal zones give us
+ * which sensor point to enable. If any thermal zone
+ * was not described in the device tree, it won't be
+ * enabled here in the sensor map.
+ */
+- writel(sensor_map | BIT(9), LVTS_MONCTL0(lvts_ctrl->base));
++ if (lvts_ctrl->mode == LVTS_MSR_IMMEDIATE_MODE) {
++ /*
++ * LVTS_MSRCTL1 : Measurement control
++ *
++ * Bits:
++ *
++ * 9: Ignore MSRCTL0 config and do immediate measurement on sensor3
++ * 6: Ignore MSRCTL0 config and do immediate measurement on sensor2
++ * 5: Ignore MSRCTL0 config and do immediate measurement on sensor1
++ * 4: Ignore MSRCTL0 config and do immediate measurement on sensor0
++ *
++ * That configuration will ignore the filtering and the delays
++ * introduced in MONCTL1 and MONCTL2
++ */
++ writel(sensor_map, LVTS_MSRCTL1(lvts_ctrl->base));
++ } else {
++ /*
++ * Bits:
++ * 9: Single point access flow
++ * 0-3: Enable sensing point 0-3
++ */
++ writel(sensor_map | BIT(9), LVTS_MONCTL0(lvts_ctrl->base));
++ }
+
+ return 0;
+ }
+diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
+index 842f678c1c3e1..cc2b5e81c6205 100644
+--- a/drivers/thermal/thermal_core.c
++++ b/drivers/thermal/thermal_core.c
+@@ -1203,7 +1203,7 @@ EXPORT_SYMBOL_GPL(thermal_zone_get_crit_temp);
+ struct thermal_zone_device *
+ thermal_zone_device_register_with_trips(const char *type, struct thermal_trip *trips, int num_trips, int mask,
+ void *devdata, struct thermal_zone_device_ops *ops,
+- struct thermal_zone_params *tzp, int passive_delay,
++ const struct thermal_zone_params *tzp, int passive_delay,
+ int polling_delay)
+ {
+ struct thermal_zone_device *tz;
+@@ -1371,7 +1371,7 @@ EXPORT_SYMBOL_GPL(thermal_zone_device_register_with_trips);
+
+ struct thermal_zone_device *thermal_zone_device_register(const char *type, int ntrips, int mask,
+ void *devdata, struct thermal_zone_device_ops *ops,
+- struct thermal_zone_params *tzp, int passive_delay,
++ const struct thermal_zone_params *tzp, int passive_delay,
+ int polling_delay)
+ {
+ return thermal_zone_device_register_with_trips(type, NULL, ntrips, mask,
+diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
+index bc07ae1c284cf..22272f9c5934a 100644
+--- a/drivers/thermal/thermal_of.c
++++ b/drivers/thermal/thermal_of.c
+@@ -292,13 +292,13 @@ static int __thermal_of_unbind(struct device_node *map_np, int index, int trip_i
+ ret = of_parse_phandle_with_args(map_np, "cooling-device", "#cooling-cells",
+ index, &cooling_spec);
+
+- of_node_put(cooling_spec.np);
+-
+ if (ret < 0) {
+ pr_err("Invalid cooling-device entry\n");
+ return ret;
+ }
+
++ of_node_put(cooling_spec.np);
++
+ if (cooling_spec.args_count < 2) {
+ pr_err("wrong reference to cooling device, missing limits\n");
+ return -EINVAL;
+@@ -325,13 +325,13 @@ static int __thermal_of_bind(struct device_node *map_np, int index, int trip_id,
+ ret = of_parse_phandle_with_args(map_np, "cooling-device", "#cooling-cells",
+ index, &cooling_spec);
+
+- of_node_put(cooling_spec.np);
+-
+ if (ret < 0) {
+ pr_err("Invalid cooling-device entry\n");
+ return ret;
+ }
+
++ of_node_put(cooling_spec.np);
++
+ if (cooling_spec.args_count < 2) {
+ pr_err("wrong reference to cooling device, missing limits\n");
+ return -EINVAL;
+diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
+index 24ebdb0b63a8e..ae632a9d4f3ae 100644
+--- a/drivers/tty/serial/qcom_geni_serial.c
++++ b/drivers/tty/serial/qcom_geni_serial.c
+@@ -592,7 +592,6 @@ static void qcom_geni_serial_stop_tx_dma(struct uart_port *uport)
+ {
+ struct qcom_geni_serial_port *port = to_dev_port(uport);
+ bool done;
+- u32 m_irq_en;
+
+ if (!qcom_geni_serial_main_active(uport))
+ return;
+@@ -604,12 +603,10 @@ static void qcom_geni_serial_stop_tx_dma(struct uart_port *uport)
+ port->tx_remaining = 0;
+ }
+
+- m_irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
+- writel(m_irq_en, uport->membase + SE_GENI_M_IRQ_EN);
+ geni_se_cancel_m_cmd(&port->se);
+
+- done = qcom_geni_serial_poll_bit(uport, SE_GENI_S_IRQ_STATUS,
+- S_CMD_CANCEL_EN, true);
++ done = qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
++ M_CMD_CANCEL_EN, true);
+ if (!done) {
+ geni_se_abort_m_cmd(&port->se);
+ done = qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
+diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
+index 54c760b46da13..8845301c16058 100644
+--- a/drivers/tty/serial/sc16is7xx.c
++++ b/drivers/tty/serial/sc16is7xx.c
+@@ -236,7 +236,8 @@
+
+ /* IOControl register bits (Only 750/760) */
+ #define SC16IS7XX_IOCONTROL_LATCH_BIT (1 << 0) /* Enable input latching */
+-#define SC16IS7XX_IOCONTROL_MODEM_BIT (1 << 1) /* Enable GPIO[7:4] as modem pins */
++#define SC16IS7XX_IOCONTROL_MODEM_A_BIT (1 << 1) /* Enable GPIO[7:4] as modem A pins */
++#define SC16IS7XX_IOCONTROL_MODEM_B_BIT (1 << 2) /* Enable GPIO[3:0] as modem B pins */
+ #define SC16IS7XX_IOCONTROL_SRESET_BIT (1 << 3) /* Software Reset */
+
+ /* EFCR register bits */
+@@ -301,12 +302,12 @@
+ /* Misc definitions */
+ #define SC16IS7XX_FIFO_SIZE (64)
+ #define SC16IS7XX_REG_SHIFT 2
++#define SC16IS7XX_GPIOS_PER_BANK 4
+
+ struct sc16is7xx_devtype {
+ char name[10];
+ int nr_gpio;
+ int nr_uart;
+- int has_mctrl;
+ };
+
+ #define SC16IS7XX_RECONF_MD (1 << 0)
+@@ -336,7 +337,9 @@ struct sc16is7xx_port {
+ struct clk *clk;
+ #ifdef CONFIG_GPIOLIB
+ struct gpio_chip gpio;
++ unsigned long gpio_valid_mask;
+ #endif
++ u8 mctrl_mask;
+ unsigned char buf[SC16IS7XX_FIFO_SIZE];
+ struct kthread_worker kworker;
+ struct task_struct *kworker_task;
+@@ -447,35 +450,30 @@ static const struct sc16is7xx_devtype sc16is74x_devtype = {
+ .name = "SC16IS74X",
+ .nr_gpio = 0,
+ .nr_uart = 1,
+- .has_mctrl = 0,
+ };
+
+ static const struct sc16is7xx_devtype sc16is750_devtype = {
+ .name = "SC16IS750",
+- .nr_gpio = 4,
++ .nr_gpio = 8,
+ .nr_uart = 1,
+- .has_mctrl = 1,
+ };
+
+ static const struct sc16is7xx_devtype sc16is752_devtype = {
+ .name = "SC16IS752",
+- .nr_gpio = 0,
++ .nr_gpio = 8,
+ .nr_uart = 2,
+- .has_mctrl = 1,
+ };
+
+ static const struct sc16is7xx_devtype sc16is760_devtype = {
+ .name = "SC16IS760",
+- .nr_gpio = 4,
++ .nr_gpio = 8,
+ .nr_uart = 1,
+- .has_mctrl = 1,
+ };
+
+ static const struct sc16is7xx_devtype sc16is762_devtype = {
+ .name = "SC16IS762",
+- .nr_gpio = 0,
++ .nr_gpio = 8,
+ .nr_uart = 2,
+- .has_mctrl = 1,
+ };
+
+ static bool sc16is7xx_regmap_volatile(struct device *dev, unsigned int reg)
+@@ -1357,8 +1355,98 @@ static int sc16is7xx_gpio_direction_output(struct gpio_chip *chip,
+
+ return 0;
+ }
++
++static int sc16is7xx_gpio_init_valid_mask(struct gpio_chip *chip,
++ unsigned long *valid_mask,
++ unsigned int ngpios)
++{
++ struct sc16is7xx_port *s = gpiochip_get_data(chip);
++
++ *valid_mask = s->gpio_valid_mask;
++
++ return 0;
++}
++
++static int sc16is7xx_setup_gpio_chip(struct sc16is7xx_port *s)
++{
++ struct device *dev = s->p[0].port.dev;
++
++ if (!s->devtype->nr_gpio)
++ return 0;
++
++ switch (s->mctrl_mask) {
++ case 0:
++ s->gpio_valid_mask = GENMASK(7, 0);
++ break;
++ case SC16IS7XX_IOCONTROL_MODEM_A_BIT:
++ s->gpio_valid_mask = GENMASK(3, 0);
++ break;
++ case SC16IS7XX_IOCONTROL_MODEM_B_BIT:
++ s->gpio_valid_mask = GENMASK(7, 4);
++ break;
++ default:
++ break;
++ }
++
++ if (s->gpio_valid_mask == 0)
++ return 0;
++
++ s->gpio.owner = THIS_MODULE;
++ s->gpio.parent = dev;
++ s->gpio.label = dev_name(dev);
++ s->gpio.init_valid_mask = sc16is7xx_gpio_init_valid_mask;
++ s->gpio.direction_input = sc16is7xx_gpio_direction_input;
++ s->gpio.get = sc16is7xx_gpio_get;
++ s->gpio.direction_output = sc16is7xx_gpio_direction_output;
++ s->gpio.set = sc16is7xx_gpio_set;
++ s->gpio.base = -1;
++ s->gpio.ngpio = s->devtype->nr_gpio;
++ s->gpio.can_sleep = 1;
++
++ return gpiochip_add_data(&s->gpio, s);
++}
+ #endif
+
++/*
++ * Configure ports designated to operate as modem control lines.
++ */
++static int sc16is7xx_setup_mctrl_ports(struct sc16is7xx_port *s)
++{
++ int i;
++ int ret;
++ int count;
++ u32 mctrl_port[2];
++ struct device *dev = s->p[0].port.dev;
++
++ count = device_property_count_u32(dev, "nxp,modem-control-line-ports");
++ if (count < 0 || count > ARRAY_SIZE(mctrl_port))
++ return 0;
++
++ ret = device_property_read_u32_array(dev, "nxp,modem-control-line-ports",
++ mctrl_port, count);
++ if (ret)
++ return ret;
++
++ s->mctrl_mask = 0;
++
++ for (i = 0; i < count; i++) {
++ /* Use GPIO lines as modem control lines */
++ if (mctrl_port[i] == 0)
++ s->mctrl_mask |= SC16IS7XX_IOCONTROL_MODEM_A_BIT;
++ else if (mctrl_port[i] == 1)
++ s->mctrl_mask |= SC16IS7XX_IOCONTROL_MODEM_B_BIT;
++ }
++
++ if (s->mctrl_mask)
++ regmap_update_bits(
++ s->regmap,
++ SC16IS7XX_IOCONTROL_REG << SC16IS7XX_REG_SHIFT,
++ SC16IS7XX_IOCONTROL_MODEM_A_BIT |
++ SC16IS7XX_IOCONTROL_MODEM_B_BIT, s->mctrl_mask);
++
++ return 0;
++}
++
+ static const struct serial_rs485 sc16is7xx_rs485_supported = {
+ .flags = SER_RS485_ENABLED | SER_RS485_RTS_AFTER_SEND,
+ .delay_rts_before_send = 1,
+@@ -1471,12 +1559,6 @@ static int sc16is7xx_probe(struct device *dev,
+ SC16IS7XX_EFCR_RXDISABLE_BIT |
+ SC16IS7XX_EFCR_TXDISABLE_BIT);
+
+- /* Use GPIO lines as modem status registers */
+- if (devtype->has_mctrl)
+- sc16is7xx_port_write(&s->p[i].port,
+- SC16IS7XX_IOCONTROL_REG,
+- SC16IS7XX_IOCONTROL_MODEM_BIT);
+-
+ /* Initialize kthread work structs */
+ kthread_init_work(&s->p[i].tx_work, sc16is7xx_tx_proc);
+ kthread_init_work(&s->p[i].reg_work, sc16is7xx_reg_proc);
+@@ -1514,23 +1596,14 @@ static int sc16is7xx_probe(struct device *dev,
+ s->p[u].irda_mode = true;
+ }
+
++ ret = sc16is7xx_setup_mctrl_ports(s);
++ if (ret)
++ goto out_ports;
++
+ #ifdef CONFIG_GPIOLIB
+- if (devtype->nr_gpio) {
+- /* Setup GPIO cotroller */
+- s->gpio.owner = THIS_MODULE;
+- s->gpio.parent = dev;
+- s->gpio.label = dev_name(dev);
+- s->gpio.direction_input = sc16is7xx_gpio_direction_input;
+- s->gpio.get = sc16is7xx_gpio_get;
+- s->gpio.direction_output = sc16is7xx_gpio_direction_output;
+- s->gpio.set = sc16is7xx_gpio_set;
+- s->gpio.base = -1;
+- s->gpio.ngpio = devtype->nr_gpio;
+- s->gpio.can_sleep = 1;
+- ret = gpiochip_add_data(&s->gpio, s);
+- if (ret)
+- goto out_thread;
+- }
++ ret = sc16is7xx_setup_gpio_chip(s);
++ if (ret)
++ goto out_ports;
+ #endif
+
+ /*
+@@ -1553,10 +1626,8 @@ static int sc16is7xx_probe(struct device *dev,
+ return 0;
+
+ #ifdef CONFIG_GPIOLIB
+- if (devtype->nr_gpio)
++ if (s->gpio_valid_mask)
+ gpiochip_remove(&s->gpio);
+-
+-out_thread:
+ #endif
+
+ out_ports:
+@@ -1579,7 +1650,7 @@ static void sc16is7xx_remove(struct device *dev)
+ int i;
+
+ #ifdef CONFIG_GPIOLIB
+- if (s->devtype->nr_gpio)
++ if (s->gpio_valid_mask)
+ gpiochip_remove(&s->gpio);
+ #endif
+
+diff --git a/drivers/tty/serial/serial-tegra.c b/drivers/tty/serial/serial-tegra.c
+index 1cf08b33456c9..37e1e05bc87e6 100644
+--- a/drivers/tty/serial/serial-tegra.c
++++ b/drivers/tty/serial/serial-tegra.c
+@@ -998,7 +998,11 @@ static int tegra_uart_hw_init(struct tegra_uart_port *tup)
+ tup->ier_shadow = 0;
+ tup->current_baud = 0;
+
+- clk_prepare_enable(tup->uart_clk);
++ ret = clk_prepare_enable(tup->uart_clk);
++ if (ret) {
++ dev_err(tup->uport.dev, "could not enable clk\n");
++ return ret;
++ }
+
+ /* Reset the UART controller to clear all previous status.*/
+ reset_control_assert(tup->rst);
+diff --git a/drivers/tty/serial/sprd_serial.c b/drivers/tty/serial/sprd_serial.c
+index b58f51296ace2..99da964e8bd44 100644
+--- a/drivers/tty/serial/sprd_serial.c
++++ b/drivers/tty/serial/sprd_serial.c
+@@ -364,7 +364,7 @@ static void sprd_rx_free_buf(struct sprd_uart_port *sp)
+ if (sp->rx_dma.virt)
+ dma_free_coherent(sp->port.dev, SPRD_UART_RX_SIZE,
+ sp->rx_dma.virt, sp->rx_dma.phys_addr);
+-
++ sp->rx_dma.virt = NULL;
+ }
+
+ static int sprd_rx_dma_config(struct uart_port *port, u32 burst)
+@@ -1106,7 +1106,7 @@ static bool sprd_uart_is_console(struct uart_port *uport)
+ static int sprd_clk_init(struct uart_port *uport)
+ {
+ struct clk *clk_uart, *clk_parent;
+- struct sprd_uart_port *u = sprd_port[uport->line];
++ struct sprd_uart_port *u = container_of(uport, struct sprd_uart_port, port);
+
+ clk_uart = devm_clk_get(uport->dev, "uart");
+ if (IS_ERR(clk_uart)) {
+@@ -1149,22 +1149,22 @@ static int sprd_probe(struct platform_device *pdev)
+ {
+ struct resource *res;
+ struct uart_port *up;
++ struct sprd_uart_port *sport;
+ int irq;
+ int index;
+ int ret;
+
+ index = of_alias_get_id(pdev->dev.of_node, "serial");
+- if (index < 0 || index >= ARRAY_SIZE(sprd_port)) {
++ if (index < 0 || index >= UART_NR_MAX) {
+ dev_err(&pdev->dev, "got a wrong serial alias id %d\n", index);
+ return -EINVAL;
+ }
+
+- sprd_port[index] = devm_kzalloc(&pdev->dev, sizeof(*sprd_port[index]),
+- GFP_KERNEL);
+- if (!sprd_port[index])
++ sport = devm_kzalloc(&pdev->dev, sizeof(*sport), GFP_KERNEL);
++ if (!sport)
+ return -ENOMEM;
+
+- up = &sprd_port[index]->port;
++ up = &sport->port;
+ up->dev = &pdev->dev;
+ up->line = index;
+ up->type = PORT_SPRD;
+@@ -1195,7 +1195,7 @@ static int sprd_probe(struct platform_device *pdev)
+ * Allocate one dma buffer to prepare for receive transfer, in case
+ * memory allocation failure at runtime.
+ */
+- ret = sprd_rx_alloc_buf(sprd_port[index]);
++ ret = sprd_rx_alloc_buf(sport);
+ if (ret)
+ return ret;
+
+@@ -1203,17 +1203,27 @@ static int sprd_probe(struct platform_device *pdev)
+ ret = uart_register_driver(&sprd_uart_driver);
+ if (ret < 0) {
+ pr_err("Failed to register SPRD-UART driver\n");
+- return ret;
++ goto free_rx_buf;
+ }
+ }
++
+ sprd_ports_num++;
++ sprd_port[index] = sport;
+
+ ret = uart_add_one_port(&sprd_uart_driver, up);
+ if (ret)
+- sprd_remove(pdev);
++ goto clean_port;
+
+ platform_set_drvdata(pdev, up);
+
++ return 0;
++
++clean_port:
++ sprd_port[index] = NULL;
++ if (--sprd_ports_num == 0)
++ uart_unregister_driver(&sprd_uart_driver);
++free_rx_buf:
++ sprd_rx_free_buf(sport);
+ return ret;
+ }
+
+diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
+index 6d8ef80d9cbc4..b9177182e5a7b 100644
+--- a/drivers/ufs/core/ufshcd.c
++++ b/drivers/ufs/core/ufshcd.c
+@@ -5255,9 +5255,17 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp,
+ int result = 0;
+ int scsi_status;
+ enum utp_ocs ocs;
++ u8 upiu_flags;
++ u32 resid;
+
+- scsi_set_resid(lrbp->cmd,
+- be32_to_cpu(lrbp->ucd_rsp_ptr->sr.residual_transfer_count));
++ upiu_flags = be32_to_cpu(lrbp->ucd_rsp_ptr->header.dword_0) >> 16;
++ resid = be32_to_cpu(lrbp->ucd_rsp_ptr->sr.residual_transfer_count);
++ /*
++ * Test !overflow instead of underflow to support UFS devices that do
++ * not set either flag.
++ */
++ if (resid && !(upiu_flags & UPIU_RSP_FLAG_OVERFLOW))
++ scsi_set_resid(lrbp->cmd, resid);
+
+ /* overall command status of utrd */
+ ocs = ufshcd_get_tr_ocs(lrbp, cqe);
+diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
+index 8300baedafd20..6af0a31ff1475 100644
+--- a/drivers/usb/core/hcd.c
++++ b/drivers/usb/core/hcd.c
+@@ -983,6 +983,7 @@ static int register_root_hub(struct usb_hcd *hcd)
+ {
+ struct device *parent_dev = hcd->self.controller;
+ struct usb_device *usb_dev = hcd->self.root_hub;
++ struct usb_device_descriptor *descr;
+ const int devnum = 1;
+ int retval;
+
+@@ -994,13 +995,16 @@ static int register_root_hub(struct usb_hcd *hcd)
+ mutex_lock(&usb_bus_idr_lock);
+
+ usb_dev->ep0.desc.wMaxPacketSize = cpu_to_le16(64);
+- retval = usb_get_device_descriptor(usb_dev, USB_DT_DEVICE_SIZE);
+- if (retval != sizeof usb_dev->descriptor) {
++ descr = usb_get_device_descriptor(usb_dev);
++ if (IS_ERR(descr)) {
++ retval = PTR_ERR(descr);
+ mutex_unlock(&usb_bus_idr_lock);
+ dev_dbg (parent_dev, "can't read %s device descriptor %d\n",
+ dev_name(&usb_dev->dev), retval);
+- return (retval < 0) ? retval : -EMSGSIZE;
++ return retval;
+ }
++ usb_dev->descriptor = *descr;
++ kfree(descr);
+
+ if (le16_to_cpu(usb_dev->descriptor.bcdUSB) >= 0x0201) {
+ retval = usb_get_bos_descriptor(usb_dev);
+diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
+index 97a0f8faea6e5..8a576326b3593 100644
+--- a/drivers/usb/core/hub.c
++++ b/drivers/usb/core/hub.c
+@@ -2656,12 +2656,17 @@ int usb_authorize_device(struct usb_device *usb_dev)
+ }
+
+ if (usb_dev->wusb) {
+- result = usb_get_device_descriptor(usb_dev, sizeof(usb_dev->descriptor));
+- if (result < 0) {
++ struct usb_device_descriptor *descr;
++
++ descr = usb_get_device_descriptor(usb_dev);
++ if (IS_ERR(descr)) {
++ result = PTR_ERR(descr);
+ dev_err(&usb_dev->dev, "can't re-read device descriptor for "
+ "authorization: %d\n", result);
+ goto error_device_descriptor;
+ }
++ usb_dev->descriptor = *descr;
++ kfree(descr);
+ }
+
+ usb_dev->authorized = 1;
+@@ -4703,6 +4708,67 @@ static int hub_enable_device(struct usb_device *udev)
+ return hcd->driver->enable_device(hcd, udev);
+ }
+
++/*
++ * Get the bMaxPacketSize0 value during initialization by reading the
++ * device's device descriptor. Since we don't already know this value,
++ * the transfer is unsafe and it ignores I/O errors, only testing for
++ * reasonable received values.
++ *
++ * For "old scheme" initialization, size will be 8 so we read just the
++ * start of the device descriptor, which should work okay regardless of
++ * the actual bMaxPacketSize0 value. For "new scheme" initialization,
++ * size will be 64 (and buf will point to a sufficiently large buffer),
++ * which might not be kosher according to the USB spec but it's what
++ * Windows does and what many devices expect.
++ *
++ * Returns: bMaxPacketSize0 or a negative error code.
++ */
++static int get_bMaxPacketSize0(struct usb_device *udev,
++ struct usb_device_descriptor *buf, int size, bool first_time)
++{
++ int i, rc;
++
++ /*
++ * Retry on all errors; some devices are flakey.
++ * 255 is for WUSB devices, we actually need to use
++ * 512 (WUSB1.0[4.8.1]).
++ */
++ for (i = 0; i < GET_MAXPACKET0_TRIES; ++i) {
++ /* Start with invalid values in case the transfer fails */
++ buf->bDescriptorType = buf->bMaxPacketSize0 = 0;
++ rc = usb_control_msg(udev, usb_rcvaddr0pipe(),
++ USB_REQ_GET_DESCRIPTOR, USB_DIR_IN,
++ USB_DT_DEVICE << 8, 0,
++ buf, size,
++ initial_descriptor_timeout);
++ switch (buf->bMaxPacketSize0) {
++ case 8: case 16: case 32: case 64: case 9:
++ if (buf->bDescriptorType == USB_DT_DEVICE) {
++ rc = buf->bMaxPacketSize0;
++ break;
++ }
++ fallthrough;
++ default:
++ if (rc >= 0)
++ rc = -EPROTO;
++ break;
++ }
++
++ /*
++ * Some devices time out if they are powered on
++ * when already connected. They need a second
++ * reset, so return early. But only on the first
++ * attempt, lest we get into a time-out/reset loop.
++ */
++ if (rc > 0 || (rc == -ETIMEDOUT && first_time &&
++ udev->speed > USB_SPEED_FULL))
++ break;
++ }
++ return rc;
++}
++
++#define GET_DESCRIPTOR_BUFSIZE 64
++
+ /* Reset device, (re)assign address, get device descriptor.
+ * Device connection must be stable, no more debouncing needed.
+ * Returns device in USB_STATE_ADDRESS, except on error.
+@@ -4712,10 +4778,17 @@ static int hub_enable_device(struct usb_device *udev)
+ * the port lock. For a newly detected device that is not accessible
+ * through any global pointers, it's not necessary to lock the device,
+ * but it is still necessary to lock the port.
++ *
++ * For a newly detected device, @dev_descr must be NULL. The device
++ * descriptor retrieved from the device will then be stored in
++ * @udev->descriptor. For an already existing device, @dev_descr
++ * must be non-NULL. The device descriptor will be stored there,
++ * not in @udev->descriptor, because descriptors for registered
++ * devices are meant to be immutable.
+ */
+ static int
+ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+- int retry_counter)
++ int retry_counter, struct usb_device_descriptor *dev_descr)
+ {
+ struct usb_device *hdev = hub->hdev;
+ struct usb_hcd *hcd = bus_to_hcd(hdev->bus);
+@@ -4727,6 +4800,13 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+ int devnum = udev->devnum;
+ const char *driver_name;
+ bool do_new_scheme;
++ const bool initial = !dev_descr;
++ int maxp0;
++ struct usb_device_descriptor *buf, *descr;
++
++ buf = kmalloc(GET_DESCRIPTOR_BUFSIZE, GFP_NOIO);
++ if (!buf)
++ return -ENOMEM;
+
+ /* root hub ports have a slightly longer reset period
+ * (from USB 2.0 spec, section 7.1.7.5)
+@@ -4759,32 +4839,34 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+ }
+ oldspeed = udev->speed;
+
+- /* USB 2.0 section 5.5.3 talks about ep0 maxpacket ...
+- * it's fixed size except for full speed devices.
+- * For Wireless USB devices, ep0 max packet is always 512 (tho
+- * reported as 0xff in the device descriptor). WUSB1.0[4.8.1].
+- */
+- switch (udev->speed) {
+- case USB_SPEED_SUPER_PLUS:
+- case USB_SPEED_SUPER:
+- case USB_SPEED_WIRELESS: /* fixed at 512 */
+- udev->ep0.desc.wMaxPacketSize = cpu_to_le16(512);
+- break;
+- case USB_SPEED_HIGH: /* fixed at 64 */
+- udev->ep0.desc.wMaxPacketSize = cpu_to_le16(64);
+- break;
+- case USB_SPEED_FULL: /* 8, 16, 32, or 64 */
+- /* to determine the ep0 maxpacket size, try to read
+- * the device descriptor to get bMaxPacketSize0 and
+- * then correct our initial guess.
++ if (initial) {
++ /* USB 2.0 section 5.5.3 talks about ep0 maxpacket ...
++ * it's fixed size except for full speed devices.
++ * For Wireless USB devices, ep0 max packet is always 512 (tho
++ * reported as 0xff in the device descriptor). WUSB1.0[4.8.1].
+ */
+- udev->ep0.desc.wMaxPacketSize = cpu_to_le16(64);
+- break;
+- case USB_SPEED_LOW: /* fixed at 8 */
+- udev->ep0.desc.wMaxPacketSize = cpu_to_le16(8);
+- break;
+- default:
+- goto fail;
++ switch (udev->speed) {
++ case USB_SPEED_SUPER_PLUS:
++ case USB_SPEED_SUPER:
++ case USB_SPEED_WIRELESS: /* fixed at 512 */
++ udev->ep0.desc.wMaxPacketSize = cpu_to_le16(512);
++ break;
++ case USB_SPEED_HIGH: /* fixed at 64 */
++ udev->ep0.desc.wMaxPacketSize = cpu_to_le16(64);
++ break;
++ case USB_SPEED_FULL: /* 8, 16, 32, or 64 */
++ /* to determine the ep0 maxpacket size, try to read
++ * the device descriptor to get bMaxPacketSize0 and
++ * then correct our initial guess.
++ */
++ udev->ep0.desc.wMaxPacketSize = cpu_to_le16(64);
++ break;
++ case USB_SPEED_LOW: /* fixed at 8 */
++ udev->ep0.desc.wMaxPacketSize = cpu_to_le16(8);
++ break;
++ default:
++ goto fail;
++ }
+ }
+
+ if (udev->speed == USB_SPEED_WIRELESS)
+@@ -4807,22 +4889,24 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+ if (udev->speed < USB_SPEED_SUPER)
+ dev_info(&udev->dev,
+ "%s %s USB device number %d using %s\n",
+- (udev->config) ? "reset" : "new", speed,
++ (initial ? "new" : "reset"), speed,
+ devnum, driver_name);
+
+- /* Set up TT records, if needed */
+- if (hdev->tt) {
+- udev->tt = hdev->tt;
+- udev->ttport = hdev->ttport;
+- } else if (udev->speed != USB_SPEED_HIGH
+- && hdev->speed == USB_SPEED_HIGH) {
+- if (!hub->tt.hub) {
+- dev_err(&udev->dev, "parent hub has no TT\n");
+- retval = -EINVAL;
+- goto fail;
++ if (initial) {
++ /* Set up TT records, if needed */
++ if (hdev->tt) {
++ udev->tt = hdev->tt;
++ udev->ttport = hdev->ttport;
++ } else if (udev->speed != USB_SPEED_HIGH
++ && hdev->speed == USB_SPEED_HIGH) {
++ if (!hub->tt.hub) {
++ dev_err(&udev->dev, "parent hub has no TT\n");
++ retval = -EINVAL;
++ goto fail;
++ }
++ udev->tt = &hub->tt;
++ udev->ttport = port1;
+ }
+- udev->tt = &hub->tt;
+- udev->ttport = port1;
+ }
+
+ /* Why interleave GET_DESCRIPTOR and SET_ADDRESS this way?
+@@ -4846,9 +4930,6 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+ }
+
+ if (do_new_scheme) {
+- struct usb_device_descriptor *buf;
+- int r = 0;
+-
+ retval = hub_enable_device(udev);
+ if (retval < 0) {
+ dev_err(&udev->dev,
+@@ -4857,52 +4938,14 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+ goto fail;
+ }
+
+-#define GET_DESCRIPTOR_BUFSIZE 64
+- buf = kmalloc(GET_DESCRIPTOR_BUFSIZE, GFP_NOIO);
+- if (!buf) {
+- retval = -ENOMEM;
+- continue;
+- }
+-
+- /* Retry on all errors; some devices are flakey.
+- * 255 is for WUSB devices, we actually need to use
+- * 512 (WUSB1.0[4.8.1]).
+- */
+- for (operations = 0; operations < GET_MAXPACKET0_TRIES;
+- ++operations) {
+- buf->bMaxPacketSize0 = 0;
+- r = usb_control_msg(udev, usb_rcvaddr0pipe(),
+- USB_REQ_GET_DESCRIPTOR, USB_DIR_IN,
+- USB_DT_DEVICE << 8, 0,
+- buf, GET_DESCRIPTOR_BUFSIZE,
+- initial_descriptor_timeout);
+- switch (buf->bMaxPacketSize0) {
+- case 8: case 16: case 32: case 64: case 255:
+- if (buf->bDescriptorType ==
+- USB_DT_DEVICE) {
+- r = 0;
+- break;
+- }
+- fallthrough;
+- default:
+- if (r == 0)
+- r = -EPROTO;
+- break;
+- }
+- /*
+- * Some devices time out if they are powered on
+- * when already connected. They need a second
+- * reset. But only on the first attempt,
+- * lest we get into a time out/reset loop
+- */
+- if (r == 0 || (r == -ETIMEDOUT &&
+- retries == 0 &&
+- udev->speed > USB_SPEED_FULL))
+- break;
++ maxp0 = get_bMaxPacketSize0(udev, buf,
++ GET_DESCRIPTOR_BUFSIZE, retries == 0);
++ if (maxp0 > 0 && !initial &&
++ maxp0 != udev->descriptor.bMaxPacketSize0) {
++ dev_err(&udev->dev, "device reset changed ep0 maxpacket size!\n");
++ retval = -ENODEV;
++ goto fail;
+ }
+- udev->descriptor.bMaxPacketSize0 =
+- buf->bMaxPacketSize0;
+- kfree(buf);
+
+ retval = hub_port_reset(hub, port1, udev, delay, false);
+ if (retval < 0) /* error or disconnect */
+@@ -4913,14 +4956,13 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+ retval = -ENODEV;
+ goto fail;
+ }
+- if (r) {
+- if (r != -ENODEV)
++ if (maxp0 < 0) {
++ if (maxp0 != -ENODEV)
+ dev_err(&udev->dev, "device descriptor read/64, error %d\n",
+- r);
+- retval = -EMSGSIZE;
++ maxp0);
++ retval = maxp0;
+ continue;
+ }
+-#undef GET_DESCRIPTOR_BUFSIZE
+ }
+
+ /*
+@@ -4966,18 +5008,22 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+ break;
+ }
+
+- retval = usb_get_device_descriptor(udev, 8);
+- if (retval < 8) {
++ /* !do_new_scheme || wusb */
++ maxp0 = get_bMaxPacketSize0(udev, buf, 8, retries == 0);
++ if (maxp0 < 0) {
++ retval = maxp0;
+ if (retval != -ENODEV)
+ dev_err(&udev->dev,
+ "device descriptor read/8, error %d\n",
+ retval);
+- if (retval >= 0)
+- retval = -EMSGSIZE;
+ } else {
+ u32 delay;
+
+- retval = 0;
++ if (!initial && maxp0 != udev->descriptor.bMaxPacketSize0) {
++ dev_err(&udev->dev, "device reset changed ep0 maxpacket size!\n");
++ retval = -ENODEV;
++ goto fail;
++ }
+
+ delay = udev->parent->hub_delay;
+ udev->hub_delay = min_t(u32, delay,
+@@ -4996,48 +5042,61 @@ hub_port_init(struct usb_hub *hub, struct usb_device *udev, int port1,
+ goto fail;
+
+ /*
+- * Some superspeed devices have finished the link training process
+- * and attached to a superspeed hub port, but the device descriptor
+- * got from those devices show they aren't superspeed devices. Warm
+- * reset the port attached by the devices can fix them.
++ * Check the ep0 maxpacket guess and correct it if necessary.
++ * maxp0 is the value stored in the device descriptor;
++ * i is the value it encodes (logarithmic for SuperSpeed or greater).
+ */
+- if ((udev->speed >= USB_SPEED_SUPER) &&
+- (le16_to_cpu(udev->descriptor.bcdUSB) < 0x0300)) {
+- dev_err(&udev->dev, "got a wrong device descriptor, "
+- "warm reset device\n");
+- hub_port_reset(hub, port1, udev,
+- HUB_BH_RESET_TIME, true);
+- retval = -EINVAL;
+- goto fail;
+- }
+-
+- if (udev->descriptor.bMaxPacketSize0 == 0xff ||
+- udev->speed >= USB_SPEED_SUPER)
+- i = 512;
+- else
+- i = udev->descriptor.bMaxPacketSize0;
+- if (usb_endpoint_maxp(&udev->ep0.desc) != i) {
+- if (udev->speed == USB_SPEED_LOW ||
+- !(i == 8 || i == 16 || i == 32 || i == 64)) {
+- dev_err(&udev->dev, "Invalid ep0 maxpacket: %d\n", i);
+- retval = -EMSGSIZE;
+- goto fail;
+- }
++ i = maxp0;
++ if (udev->speed >= USB_SPEED_SUPER) {
++ if (maxp0 <= 16)
++ i = 1 << maxp0;
++ else
++ i = 0; /* Invalid */
++ }
++ if (usb_endpoint_maxp(&udev->ep0.desc) == i) {
++ ; /* Initial ep0 maxpacket guess is right */
++ } else if ((udev->speed == USB_SPEED_FULL ||
++ udev->speed == USB_SPEED_HIGH) &&
++ (i == 8 || i == 16 || i == 32 || i == 64)) {
++ /* Initial guess is wrong; use the descriptor's value */
+ if (udev->speed == USB_SPEED_FULL)
+ dev_dbg(&udev->dev, "ep0 maxpacket = %d\n", i);
+ else
+ dev_warn(&udev->dev, "Using ep0 maxpacket: %d\n", i);
+ udev->ep0.desc.wMaxPacketSize = cpu_to_le16(i);
+ usb_ep0_reinit(udev);
++ } else {
++ /* Initial guess is wrong and descriptor's value is invalid */
++ dev_err(&udev->dev, "Invalid ep0 maxpacket: %d\n", maxp0);
++ retval = -EMSGSIZE;
++ goto fail;
+ }
+
+- retval = usb_get_device_descriptor(udev, USB_DT_DEVICE_SIZE);
+- if (retval < (signed)sizeof(udev->descriptor)) {
++ descr = usb_get_device_descriptor(udev);
++ if (IS_ERR(descr)) {
++ retval = PTR_ERR(descr);
+ if (retval != -ENODEV)
+ dev_err(&udev->dev, "device descriptor read/all, error %d\n",
+ retval);
+- if (retval >= 0)
+- retval = -ENOMSG;
++ goto fail;
++ }
++ if (initial)
++ udev->descriptor = *descr;
++ else
++ *dev_descr = *descr;
++ kfree(descr);
++
++ /*
++ * Some superspeed devices have finished the link training process
++ * and attached to a superspeed hub port, but the device descriptor
++ * got from those devices show they aren't superspeed devices. Warm
++ * reset the port attached by the devices can fix them.
++ */
++ if ((udev->speed >= USB_SPEED_SUPER) &&
++ (le16_to_cpu(udev->descriptor.bcdUSB) < 0x0300)) {
++ dev_err(&udev->dev, "got a wrong device descriptor, warm reset device\n");
++ hub_port_reset(hub, port1, udev, HUB_BH_RESET_TIME, true);
++ retval = -EINVAL;
+ goto fail;
+ }
+
+@@ -5063,6 +5122,7 @@ fail:
+ hub_port_disable(hub, port1, 0);
+ update_devnum(udev, devnum); /* for disconnect processing */
+ }
++ kfree(buf);
+ return retval;
+ }
+
+@@ -5143,7 +5203,7 @@ hub_power_remaining(struct usb_hub *hub)
+
+
+ static int descriptors_changed(struct usb_device *udev,
+- struct usb_device_descriptor *old_device_descriptor,
++ struct usb_device_descriptor *new_device_descriptor,
+ struct usb_host_bos *old_bos)
+ {
+ int changed = 0;
+@@ -5154,8 +5214,8 @@ static int descriptors_changed(struct usb_device *udev,
+ int length;
+ char *buf;
+
+- if (memcmp(&udev->descriptor, old_device_descriptor,
+- sizeof(*old_device_descriptor)) != 0)
++ if (memcmp(&udev->descriptor, new_device_descriptor,
++ sizeof(*new_device_descriptor)) != 0)
+ return 1;
+
+ if ((old_bos && !udev->bos) || (!old_bos && udev->bos))
+@@ -5333,7 +5393,7 @@ static void hub_port_connect(struct usb_hub *hub, int port1, u16 portstatus,
+ }
+
+ /* reset (non-USB 3.0 devices) and get descriptor */
+- status = hub_port_init(hub, udev, port1, i);
++ status = hub_port_init(hub, udev, port1, i, NULL);
+ if (status < 0)
+ goto loop;
+
+@@ -5480,9 +5540,8 @@ static void hub_port_connect_change(struct usb_hub *hub, int port1,
+ {
+ struct usb_port *port_dev = hub->ports[port1 - 1];
+ struct usb_device *udev = port_dev->child;
+- struct usb_device_descriptor descriptor;
++ struct usb_device_descriptor *descr;
+ int status = -ENODEV;
+- int retval;
+
+ dev_dbg(&port_dev->dev, "status %04x, change %04x, %s\n", portstatus,
+ portchange, portspeed(hub, portstatus));
+@@ -5509,23 +5568,20 @@ static void hub_port_connect_change(struct usb_hub *hub, int port1,
+ * changed device descriptors before resuscitating the
+ * device.
+ */
+- descriptor = udev->descriptor;
+- retval = usb_get_device_descriptor(udev,
+- sizeof(udev->descriptor));
+- if (retval < 0) {
++ descr = usb_get_device_descriptor(udev);
++ if (IS_ERR(descr)) {
+ dev_dbg(&udev->dev,
+- "can't read device descriptor %d\n",
+- retval);
++ "can't read device descriptor %ld\n",
++ PTR_ERR(descr));
+ } else {
+- if (descriptors_changed(udev, &descriptor,
++ if (descriptors_changed(udev, descr,
+ udev->bos)) {
+ dev_dbg(&udev->dev,
+ "device descriptor has changed\n");
+- /* for disconnect() calls */
+- udev->descriptor = descriptor;
+ } else {
+ status = 0; /* Nothing to do */
+ }
++ kfree(descr);
+ }
+ #ifdef CONFIG_PM
+ } else if (udev->state == USB_STATE_SUSPENDED &&
+@@ -5967,7 +6023,7 @@ static int usb_reset_and_verify_device(struct usb_device *udev)
+ struct usb_device *parent_hdev = udev->parent;
+ struct usb_hub *parent_hub;
+ struct usb_hcd *hcd = bus_to_hcd(udev->bus);
+- struct usb_device_descriptor descriptor = udev->descriptor;
++ struct usb_device_descriptor descriptor;
+ struct usb_host_bos *bos;
+ int i, j, ret = 0;
+ int port1 = udev->portnum;
+@@ -6003,7 +6059,7 @@ static int usb_reset_and_verify_device(struct usb_device *udev)
+ /* ep0 maxpacket size may change; let the HCD know about it.
+ * Other endpoints will be handled by re-enumeration. */
+ usb_ep0_reinit(udev);
+- ret = hub_port_init(parent_hub, udev, port1, i);
++ ret = hub_port_init(parent_hub, udev, port1, i, &descriptor);
+ if (ret >= 0 || ret == -ENOTCONN || ret == -ENODEV)
+ break;
+ }
+@@ -6015,7 +6071,6 @@ static int usb_reset_and_verify_device(struct usb_device *udev)
+ /* Device might have changed firmware (DFU or similar) */
+ if (descriptors_changed(udev, &descriptor, bos)) {
+ dev_info(&udev->dev, "device firmware changed\n");
+- udev->descriptor = descriptor; /* for disconnect() calls */
+ goto re_enumerate;
+ }
+
+diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
+index b5811620f1de1..1da8e7ff39830 100644
+--- a/drivers/usb/core/message.c
++++ b/drivers/usb/core/message.c
+@@ -1040,40 +1040,35 @@ char *usb_cache_string(struct usb_device *udev, int index)
+ EXPORT_SYMBOL_GPL(usb_cache_string);
+
+ /*
+- * usb_get_device_descriptor - (re)reads the device descriptor (usbcore)
+- * @dev: the device whose device descriptor is being updated
+- * @size: how much of the descriptor to read
++ * usb_get_device_descriptor - read the device descriptor
++ * @udev: the device whose device descriptor should be read
+ *
+ * Context: task context, might sleep.
+ *
+- * Updates the copy of the device descriptor stored in the device structure,
+- * which dedicates space for this purpose.
+- *
+ * Not exported, only for use by the core. If drivers really want to read
+ * the device descriptor directly, they can call usb_get_descriptor() with
+ * type = USB_DT_DEVICE and index = 0.
+ *
+- * This call is synchronous, and may not be used in an interrupt context.
+- *
+- * Return: The number of bytes received on success, or else the status code
+- * returned by the underlying usb_control_msg() call.
++ * Returns: a pointer to a dynamically allocated usb_device_descriptor
++ * structure (which the caller must deallocate), or an ERR_PTR value.
+ */
+-int usb_get_device_descriptor(struct usb_device *dev, unsigned int size)
++struct usb_device_descriptor *usb_get_device_descriptor(struct usb_device *udev)
+ {
+ struct usb_device_descriptor *desc;
+ int ret;
+
+- if (size > sizeof(*desc))
+- return -EINVAL;
+ desc = kmalloc(sizeof(*desc), GFP_NOIO);
+ if (!desc)
+- return -ENOMEM;
++ return ERR_PTR(-ENOMEM);
++
++ ret = usb_get_descriptor(udev, USB_DT_DEVICE, 0, desc, sizeof(*desc));
++ if (ret == sizeof(*desc))
++ return desc;
+
+- ret = usb_get_descriptor(dev, USB_DT_DEVICE, 0, desc, size);
+ if (ret >= 0)
+- memcpy(&dev->descriptor, desc, size);
++ ret = -EMSGSIZE;
+ kfree(desc);
+- return ret;
++ return ERR_PTR(ret);
+ }
+
+ /*
+diff --git a/drivers/usb/core/usb.h b/drivers/usb/core/usb.h
+index ffe3f6818e9cf..4a16d559d3bff 100644
+--- a/drivers/usb/core/usb.h
++++ b/drivers/usb/core/usb.h
+@@ -43,8 +43,8 @@ extern bool usb_endpoint_is_ignored(struct usb_device *udev,
+ struct usb_endpoint_descriptor *epd);
+ extern int usb_remove_device(struct usb_device *udev);
+
+-extern int usb_get_device_descriptor(struct usb_device *dev,
+- unsigned int size);
++extern struct usb_device_descriptor *usb_get_device_descriptor(
++ struct usb_device *udev);
+ extern int usb_set_isoch_delay(struct usb_device *dev);
+ extern int usb_get_bos_descriptor(struct usb_device *dev);
+ extern void usb_release_bos_descriptor(struct usb_device *dev);
+diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
+index 3a30feb47073f..e927d82f0890d 100644
+--- a/drivers/usb/gadget/function/f_mass_storage.c
++++ b/drivers/usb/gadget/function/f_mass_storage.c
+@@ -927,7 +927,7 @@ static void invalidate_sub(struct fsg_lun *curlun)
+ {
+ struct file *filp = curlun->filp;
+ struct inode *inode = file_inode(filp);
+- unsigned long rc;
++ unsigned long __maybe_unused rc;
+
+ rc = invalidate_mapping_pages(inode->i_mapping, 0, -1);
+ VLDBG(curlun, "invalidate_mapping_pages -> %ld\n", rc);
+diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
+index d5bc2892184ca..5ec47757167e8 100644
+--- a/drivers/usb/gadget/udc/core.c
++++ b/drivers/usb/gadget/udc/core.c
+@@ -40,6 +40,7 @@ static const struct bus_type gadget_bus_type;
+ * @allow_connect: Indicates whether UDC is allowed to be pulled up.
+ * Set/cleared by gadget_(un)bind_driver() after gadget driver is bound or
+ * unbound.
++ * @vbus_work: work routine to handle VBUS status change notifications.
+ * @connect_lock: protects udc->started, gadget->connect,
+ * gadget->allow_connect and gadget->deactivate. The routines
+ * usb_gadget_connect_locked(), usb_gadget_disconnect_locked(),
+diff --git a/drivers/usb/phy/phy-mxs-usb.c b/drivers/usb/phy/phy-mxs-usb.c
+index e1a2b2ea098b5..cceabb9d37e98 100644
+--- a/drivers/usb/phy/phy-mxs-usb.c
++++ b/drivers/usb/phy/phy-mxs-usb.c
+@@ -388,14 +388,8 @@ static void __mxs_phy_disconnect_line(struct mxs_phy *mxs_phy, bool disconnect)
+
+ static bool mxs_phy_is_otg_host(struct mxs_phy *mxs_phy)
+ {
+- void __iomem *base = mxs_phy->phy.io_priv;
+- u32 phyctrl = readl(base + HW_USBPHY_CTRL);
+-
+- if (IS_ENABLED(CONFIG_USB_OTG) &&
+- !(phyctrl & BM_USBPHY_CTRL_OTG_ID_VALUE))
+- return true;
+-
+- return false;
++ return IS_ENABLED(CONFIG_USB_OTG) &&
++ mxs_phy->phy.last_event == USB_EVENT_ID;
+ }
+
+ static void mxs_phy_disconnect_line(struct mxs_phy *mxs_phy, bool on)
+diff --git a/drivers/usb/typec/bus.c b/drivers/usb/typec/bus.c
+index fe5b9a2e61f58..e95ec7e382bb7 100644
+--- a/drivers/usb/typec/bus.c
++++ b/drivers/usb/typec/bus.c
+@@ -183,12 +183,20 @@ EXPORT_SYMBOL_GPL(typec_altmode_exit);
+ *
+ * Notifies the partner of @adev about Attention command.
+ */
+-void typec_altmode_attention(struct typec_altmode *adev, u32 vdo)
++int typec_altmode_attention(struct typec_altmode *adev, u32 vdo)
+ {
+- struct typec_altmode *pdev = &to_altmode(adev)->partner->adev;
++ struct altmode *partner = to_altmode(adev)->partner;
++ struct typec_altmode *pdev;
++
++ if (!partner)
++ return -ENODEV;
++
++ pdev = &partner->adev;
+
+ if (pdev->ops && pdev->ops->attention)
+ pdev->ops->attention(pdev, vdo);
++
++ return 0;
+ }
+ EXPORT_SYMBOL_GPL(typec_altmode_attention);
+
+diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
+index 9c4b73d23f833..43d0519683db9 100644
+--- a/drivers/usb/typec/tcpm/tcpm.c
++++ b/drivers/usb/typec/tcpm/tcpm.c
+@@ -1877,7 +1877,8 @@ static void tcpm_handle_vdm_request(struct tcpm_port *port,
+ }
+ break;
+ case ADEV_ATTENTION:
+- typec_altmode_attention(adev, p[1]);
++ if (typec_altmode_attention(adev, p[1]))
++ tcpm_log(port, "typec_altmode_attention no port partner altmode");
+ break;
+ }
+ }
+@@ -3935,6 +3936,29 @@ static enum typec_cc_status tcpm_pwr_opmode_to_rp(enum typec_pwr_opmode opmode)
+ }
+ }
+
++static void tcpm_set_initial_svdm_version(struct tcpm_port *port)
++{
++ switch (port->negotiated_rev) {
++ case PD_REV30:
++ break;
++ /*
++ * 6.4.4.2.3 Structured VDM Version
++ * 2.0 states "At this time, there is only one version (1.0) defined.
++ * This field Shall be set to zero to indicate Version 1.0."
++ * 3.0 states "This field Shall be set to 01b to indicate Version 2.0."
++ * To ensure that we follow the Power Delivery revision we are currently
++ * operating on, downgrade the SVDM version to the highest one supported
++ * by the Power Delivery revision.
++ */
++ case PD_REV20:
++ typec_partner_set_svdm_version(port->partner, SVDM_VER_1_0);
++ break;
++ default:
++ typec_partner_set_svdm_version(port->partner, SVDM_VER_1_0);
++ break;
++ }
++}
++
+ static void run_state_machine(struct tcpm_port *port)
+ {
+ int ret;
+@@ -4172,10 +4196,12 @@ static void run_state_machine(struct tcpm_port *port)
+ * For now, this driver only supports SOP for DISCOVER_IDENTITY, thus using
+ * port->explicit_contract to decide whether to send the command.
+ */
+- if (port->explicit_contract)
++ if (port->explicit_contract) {
++ tcpm_set_initial_svdm_version(port);
+ mod_send_discover_delayed_work(port, 0);
+- else
++ } else {
+ port->send_discover = false;
++ }
+
+ /*
+ * 6.3.5
+@@ -4462,10 +4488,12 @@ static void run_state_machine(struct tcpm_port *port)
+ * For now, this driver only supports SOP for DISCOVER_IDENTITY, thus using
+ * port->explicit_contract.
+ */
+- if (port->explicit_contract)
++ if (port->explicit_contract) {
++ tcpm_set_initial_svdm_version(port);
+ mod_send_discover_delayed_work(port, 0);
+- else
++ } else {
+ port->send_discover = false;
++ }
+
+ power_supply_changed(port->psy);
+ break;
+diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
+index f18a9301ab94e..6b79ae746ab93 100644
+--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
++++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
+@@ -2447,7 +2447,15 @@ static int mlx5_vdpa_set_driver_features(struct vdpa_device *vdev, u64 features)
+ else
+ ndev->rqt_size = 1;
+
+- ndev->cur_num_vqs = 2 * ndev->rqt_size;
++ /* Device must start with 1 queue pair, as per VIRTIO v1.2 spec, section
++ * 5.1.6.5.5 "Device operation in multiqueue mode":
++ *
++ * Multiqueue is disabled by default.
++ * The driver enables multiqueue by sending a command using class
++ * VIRTIO_NET_CTRL_MQ. The command selects the mode of multiqueue
++ * operation, as follows: ...
++ */
++ ndev->cur_num_vqs = 2;
+
+ update_cvq_info(mvdev);
+ return err;
+diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
+index 0d2f805468e19..cd2113be632b9 100644
+--- a/drivers/vfio/vfio_iommu_type1.c
++++ b/drivers/vfio/vfio_iommu_type1.c
+@@ -2729,7 +2729,7 @@ static int vfio_iommu_iova_build_caps(struct vfio_iommu *iommu,
+ static int vfio_iommu_migration_build_caps(struct vfio_iommu *iommu,
+ struct vfio_info_cap *caps)
+ {
+- struct vfio_iommu_type1_info_cap_migration cap_mig;
++ struct vfio_iommu_type1_info_cap_migration cap_mig = {};
+
+ cap_mig.header.id = VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION;
+ cap_mig.header.version = 1;
+diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
+index bb10fa4bb4f6e..fa19c3f043b12 100644
+--- a/drivers/vhost/scsi.c
++++ b/drivers/vhost/scsi.c
+@@ -25,6 +25,8 @@
+ #include <linux/fs.h>
+ #include <linux/vmalloc.h>
+ #include <linux/miscdevice.h>
++#include <linux/blk_types.h>
++#include <linux/bio.h>
+ #include <asm/unaligned.h>
+ #include <scsi/scsi_common.h>
+ #include <scsi/scsi_proto.h>
+@@ -75,6 +77,9 @@ struct vhost_scsi_cmd {
+ u32 tvc_prot_sgl_count;
+ /* Saved unpacked SCSI LUN for vhost_scsi_target_queue_cmd() */
+ u32 tvc_lun;
++ u32 copied_iov:1;
++ const void *saved_iter_addr;
++ struct iov_iter saved_iter;
+ /* Pointer to the SGL formatted memory from virtio-scsi */
+ struct scatterlist *tvc_sgl;
+ struct scatterlist *tvc_prot_sgl;
+@@ -328,8 +333,13 @@ static void vhost_scsi_release_cmd_res(struct se_cmd *se_cmd)
+ int i;
+
+ if (tv_cmd->tvc_sgl_count) {
+- for (i = 0; i < tv_cmd->tvc_sgl_count; i++)
+- put_page(sg_page(&tv_cmd->tvc_sgl[i]));
++ for (i = 0; i < tv_cmd->tvc_sgl_count; i++) {
++ if (tv_cmd->copied_iov)
++ __free_page(sg_page(&tv_cmd->tvc_sgl[i]));
++ else
++ put_page(sg_page(&tv_cmd->tvc_sgl[i]));
++ }
++ kfree(tv_cmd->saved_iter_addr);
+ }
+ if (tv_cmd->tvc_prot_sgl_count) {
+ for (i = 0; i < tv_cmd->tvc_prot_sgl_count; i++)
+@@ -502,6 +512,28 @@ static void vhost_scsi_evt_work(struct vhost_work *work)
+ mutex_unlock(&vq->mutex);
+ }
+
++static int vhost_scsi_copy_sgl_to_iov(struct vhost_scsi_cmd *cmd)
++{
++ struct iov_iter *iter = &cmd->saved_iter;
++ struct scatterlist *sg = cmd->tvc_sgl;
++ struct page *page;
++ size_t len;
++ int i;
++
++ for (i = 0; i < cmd->tvc_sgl_count; i++) {
++ page = sg_page(&sg[i]);
++ len = sg[i].length;
++
++ if (copy_page_to_iter(page, 0, len, iter) != len) {
++ pr_err("Could not copy data while handling misaligned cmd. Error %zu\n",
++ len);
++ return -1;
++ }
++ }
++
++ return 0;
++}
++
+ /* Fill in status and signal that we are done processing this command
+ *
+ * This is scheduled in the vhost work queue so we are called with the owner
+@@ -525,15 +557,20 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
+
+ pr_debug("%s tv_cmd %p resid %u status %#02x\n", __func__,
+ cmd, se_cmd->residual_count, se_cmd->scsi_status);
+-
+ memset(&v_rsp, 0, sizeof(v_rsp));
+- v_rsp.resid = cpu_to_vhost32(cmd->tvc_vq, se_cmd->residual_count);
+- /* TODO is status_qualifier field needed? */
+- v_rsp.status = se_cmd->scsi_status;
+- v_rsp.sense_len = cpu_to_vhost32(cmd->tvc_vq,
+- se_cmd->scsi_sense_length);
+- memcpy(v_rsp.sense, cmd->tvc_sense_buf,
+- se_cmd->scsi_sense_length);
++
++ if (cmd->saved_iter_addr && vhost_scsi_copy_sgl_to_iov(cmd)) {
++ v_rsp.response = VIRTIO_SCSI_S_BAD_TARGET;
++ } else {
++ v_rsp.resid = cpu_to_vhost32(cmd->tvc_vq,
++ se_cmd->residual_count);
++ /* TODO is status_qualifier field needed? */
++ v_rsp.status = se_cmd->scsi_status;
++ v_rsp.sense_len = cpu_to_vhost32(cmd->tvc_vq,
++ se_cmd->scsi_sense_length);
++ memcpy(v_rsp.sense, cmd->tvc_sense_buf,
++ se_cmd->scsi_sense_length);
++ }
+
+ iov_iter_init(&iov_iter, ITER_DEST, cmd->tvc_resp_iov,
+ cmd->tvc_in_iovs, sizeof(v_rsp));
+@@ -615,12 +652,12 @@ static int
+ vhost_scsi_map_to_sgl(struct vhost_scsi_cmd *cmd,
+ struct iov_iter *iter,
+ struct scatterlist *sgl,
+- bool write)
++ bool is_prot)
+ {
+ struct page **pages = cmd->tvc_upages;
+ struct scatterlist *sg = sgl;
+- ssize_t bytes;
+- size_t offset;
++ ssize_t bytes, mapped_bytes;
++ size_t offset, mapped_offset;
+ unsigned int npages = 0;
+
+ bytes = iov_iter_get_pages2(iter, pages, LONG_MAX,
+@@ -629,13 +666,53 @@ vhost_scsi_map_to_sgl(struct vhost_scsi_cmd *cmd,
+ if (bytes <= 0)
+ return bytes < 0 ? bytes : -EFAULT;
+
++ mapped_bytes = bytes;
++ mapped_offset = offset;
++
+ while (bytes) {
+ unsigned n = min_t(unsigned, PAGE_SIZE - offset, bytes);
++ /*
++ * The block layer requires bios/requests to be a multiple of
++ * 512 bytes, but Windows can send us vecs that are misaligned.
++ * This can result in bios and later requests with misaligned
++ * sizes if we have to break up a cmd/scatterlist into multiple
++ * bios.
++ *
++ * We currently only break up a command into multiple bios if
++ * we hit the vec/seg limit, so check if our sgl_count is
++ * greater than the max and if a vec in the cmd has a
++ * misaligned offset/size.
++ */
++ if (!is_prot &&
++ (offset & (SECTOR_SIZE - 1) || n & (SECTOR_SIZE - 1)) &&
++ cmd->tvc_sgl_count > BIO_MAX_VECS) {
++ WARN_ONCE(true,
++ "vhost-scsi detected misaligned IO. Performance may be degraded.");
++ goto revert_iter_get_pages;
++ }
++
+ sg_set_page(sg++, pages[npages++], n, offset);
+ bytes -= n;
+ offset = 0;
+ }
++
+ return npages;
++
++revert_iter_get_pages:
++ iov_iter_revert(iter, mapped_bytes);
++
++ npages = 0;
++ while (mapped_bytes) {
++ unsigned int n = min_t(unsigned int, PAGE_SIZE - mapped_offset,
++ mapped_bytes);
++
++ put_page(pages[npages++]);
++
++ mapped_bytes -= n;
++ mapped_offset = 0;
++ }
++
++ return -EINVAL;
+ }
+
+ static int
+@@ -659,25 +736,80 @@ vhost_scsi_calc_sgls(struct iov_iter *iter, size_t bytes, int max_sgls)
+ }
+
+ static int
+-vhost_scsi_iov_to_sgl(struct vhost_scsi_cmd *cmd, bool write,
+- struct iov_iter *iter,
+- struct scatterlist *sg, int sg_count)
++vhost_scsi_copy_iov_to_sgl(struct vhost_scsi_cmd *cmd, struct iov_iter *iter,
++ struct scatterlist *sg, int sg_count)
++{
++ size_t len = iov_iter_count(iter);
++ unsigned int nbytes = 0;
++ struct page *page;
++ int i;
++
++ if (cmd->tvc_data_direction == DMA_FROM_DEVICE) {
++ cmd->saved_iter_addr = dup_iter(&cmd->saved_iter, iter,
++ GFP_KERNEL);
++ if (!cmd->saved_iter_addr)
++ return -ENOMEM;
++ }
++
++ for (i = 0; i < sg_count; i++) {
++ page = alloc_page(GFP_KERNEL);
++ if (!page) {
++ i--;
++ goto err;
++ }
++
++ nbytes = min_t(unsigned int, PAGE_SIZE, len);
++ sg_set_page(&sg[i], page, nbytes, 0);
++
++ if (cmd->tvc_data_direction == DMA_TO_DEVICE &&
++ copy_page_from_iter(page, 0, nbytes, iter) != nbytes)
++ goto err;
++
++ len -= nbytes;
++ }
++
++ cmd->copied_iov = 1;
++ return 0;
++
++err:
++ pr_err("Could not read %u bytes while handling misaligned cmd\n",
++ nbytes);
++
++ for (; i >= 0; i--)
++ __free_page(sg_page(&sg[i]));
++ kfree(cmd->saved_iter_addr);
++ return -ENOMEM;
++}
++
++static int
++vhost_scsi_iov_to_sgl(struct vhost_scsi_cmd *cmd, struct iov_iter *iter,
++ struct scatterlist *sg, int sg_count, bool is_prot)
+ {
+ struct scatterlist *p = sg;
++ size_t revert_bytes;
+ int ret;
+
+ while (iov_iter_count(iter)) {
+- ret = vhost_scsi_map_to_sgl(cmd, iter, sg, write);
++ ret = vhost_scsi_map_to_sgl(cmd, iter, sg, is_prot);
+ if (ret < 0) {
++ revert_bytes = 0;
++
+ while (p < sg) {
+- struct page *page = sg_page(p++);
+- if (page)
++ struct page *page = sg_page(p);
++
++ if (page) {
+ put_page(page);
++ revert_bytes += p->length;
++ }
++ p++;
+ }
++
++ iov_iter_revert(iter, revert_bytes);
+ return ret;
+ }
+ sg += ret;
+ }
++
+ return 0;
+ }
+
+@@ -687,7 +819,6 @@ vhost_scsi_mapal(struct vhost_scsi_cmd *cmd,
+ size_t data_bytes, struct iov_iter *data_iter)
+ {
+ int sgl_count, ret;
+- bool write = (cmd->tvc_data_direction == DMA_FROM_DEVICE);
+
+ if (prot_bytes) {
+ sgl_count = vhost_scsi_calc_sgls(prot_iter, prot_bytes,
+@@ -700,9 +831,8 @@ vhost_scsi_mapal(struct vhost_scsi_cmd *cmd,
+ pr_debug("%s prot_sg %p prot_sgl_count %u\n", __func__,
+ cmd->tvc_prot_sgl, cmd->tvc_prot_sgl_count);
+
+- ret = vhost_scsi_iov_to_sgl(cmd, write, prot_iter,
+- cmd->tvc_prot_sgl,
+- cmd->tvc_prot_sgl_count);
++ ret = vhost_scsi_iov_to_sgl(cmd, prot_iter, cmd->tvc_prot_sgl,
++ cmd->tvc_prot_sgl_count, true);
+ if (ret < 0) {
+ cmd->tvc_prot_sgl_count = 0;
+ return ret;
+@@ -718,8 +848,14 @@ vhost_scsi_mapal(struct vhost_scsi_cmd *cmd,
+ pr_debug("%s data_sg %p data_sgl_count %u\n", __func__,
+ cmd->tvc_sgl, cmd->tvc_sgl_count);
+
+- ret = vhost_scsi_iov_to_sgl(cmd, write, data_iter,
+- cmd->tvc_sgl, cmd->tvc_sgl_count);
++ ret = vhost_scsi_iov_to_sgl(cmd, data_iter, cmd->tvc_sgl,
++ cmd->tvc_sgl_count, false);
++ if (ret == -EINVAL) {
++ sg_init_table(cmd->tvc_sgl, cmd->tvc_sgl_count);
++ ret = vhost_scsi_copy_iov_to_sgl(cmd, data_iter, cmd->tvc_sgl,
++ cmd->tvc_sgl_count);
++ }
++
+ if (ret < 0) {
+ cmd->tvc_sgl_count = 0;
+ return ret;
+diff --git a/drivers/video/backlight/bd6107.c b/drivers/video/backlight/bd6107.c
+index f4db6c064635b..e3410444ea235 100644
+--- a/drivers/video/backlight/bd6107.c
++++ b/drivers/video/backlight/bd6107.c
+@@ -104,7 +104,7 @@ static int bd6107_backlight_check_fb(struct backlight_device *backlight,
+ {
+ struct bd6107 *bd = bl_get_data(backlight);
+
+- return bd->pdata->fbdev == NULL || bd->pdata->fbdev == info->dev;
++ return bd->pdata->fbdev == NULL || bd->pdata->fbdev == info->device;
+ }
+
+ static const struct backlight_ops bd6107_backlight_ops = {
+diff --git a/drivers/video/backlight/gpio_backlight.c b/drivers/video/backlight/gpio_backlight.c
+index 6f78d928f054a..5c5c99f7979e3 100644
+--- a/drivers/video/backlight/gpio_backlight.c
++++ b/drivers/video/backlight/gpio_backlight.c
+@@ -35,7 +35,7 @@ static int gpio_backlight_check_fb(struct backlight_device *bl,
+ {
+ struct gpio_backlight *gbl = bl_get_data(bl);
+
+- return gbl->fbdev == NULL || gbl->fbdev == info->dev;
++ return gbl->fbdev == NULL || gbl->fbdev == info->device;
+ }
+
+ static const struct backlight_ops gpio_backlight_ops = {
+diff --git a/drivers/video/backlight/lv5207lp.c b/drivers/video/backlight/lv5207lp.c
+index 00673c8b66ac5..99ba4bc0a500d 100644
+--- a/drivers/video/backlight/lv5207lp.c
++++ b/drivers/video/backlight/lv5207lp.c
+@@ -67,7 +67,7 @@ static int lv5207lp_backlight_check_fb(struct backlight_device *backlight,
+ {
+ struct lv5207lp *lv = bl_get_data(backlight);
+
+- return lv->pdata->fbdev == NULL || lv->pdata->fbdev == info->dev;
++ return lv->pdata->fbdev == NULL || lv->pdata->fbdev == info->device;
+ }
+
+ static const struct backlight_ops lv5207lp_backlight_ops = {
+diff --git a/drivers/video/fbdev/goldfishfb.c b/drivers/video/fbdev/goldfishfb.c
+index 6fa2108fd912d..e41c9fef4a3b6 100644
+--- a/drivers/video/fbdev/goldfishfb.c
++++ b/drivers/video/fbdev/goldfishfb.c
+@@ -203,8 +203,8 @@ static int goldfish_fb_probe(struct platform_device *pdev)
+ }
+
+ fb->irq = platform_get_irq(pdev, 0);
+- if (fb->irq <= 0) {
+- ret = -ENODEV;
++ if (fb->irq < 0) {
++ ret = fb->irq;
+ goto err_no_irq;
+ }
+
+diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
+index 835f6cc2fb664..fa5226c198cc6 100644
+--- a/drivers/virtio/virtio_mem.c
++++ b/drivers/virtio/virtio_mem.c
+@@ -38,11 +38,6 @@ module_param(bbm_block_size, ulong, 0444);
+ MODULE_PARM_DESC(bbm_block_size,
+ "Big Block size in bytes. Default is 0 (auto-detection).");
+
+-static bool bbm_safe_unplug = true;
+-module_param(bbm_safe_unplug, bool, 0444);
+-MODULE_PARM_DESC(bbm_safe_unplug,
+- "Use a safe unplug mechanism in BBM, avoiding long/endless loops");
+-
+ /*
+ * virtio-mem currently supports the following modes of operation:
+ *
+@@ -173,6 +168,13 @@ struct virtio_mem {
+ /* The number of subblocks per Linux memory block. */
+ uint32_t sbs_per_mb;
+
++ /*
++ * Some of the Linux memory blocks tracked as "partially
++ * plugged" are completely unplugged and can be offlined
++ * and removed -- which previously failed.
++ */
++ bool have_unplugged_mb;
++
+ /* Summary of all memory block states. */
+ unsigned long mb_count[VIRTIO_MEM_SBM_MB_COUNT];
+
+@@ -746,11 +748,15 @@ static int virtio_mem_offline_and_remove_memory(struct virtio_mem *vm,
+ * immediately instead of waiting.
+ */
+ virtio_mem_retry(vm);
+- } else {
+- dev_dbg(&vm->vdev->dev,
+- "offlining and removing memory failed: %d\n", rc);
++ return 0;
+ }
+- return rc;
++ dev_dbg(&vm->vdev->dev, "offlining and removing memory failed: %d\n", rc);
++ /*
++ * We don't really expect this to fail, because we fake-offlined all
++ * memory already. But it could fail in corner cases.
++ */
++ WARN_ON_ONCE(rc != -ENOMEM && rc != -EBUSY);
++ return rc == -ENOMEM ? -ENOMEM : -EBUSY;
+ }
+
+ /*
+@@ -766,6 +772,34 @@ static int virtio_mem_sbm_offline_and_remove_mb(struct virtio_mem *vm,
+ return virtio_mem_offline_and_remove_memory(vm, addr, size);
+ }
+
++/*
++ * Try (offlining and) removing memory from Linux in case all subblocks are
++ * unplugged. Can be called on online and offline memory blocks.
++ *
++ * May modify the state of memory blocks in virtio-mem.
++ */
++static int virtio_mem_sbm_try_remove_unplugged_mb(struct virtio_mem *vm,
++ unsigned long mb_id)
++{
++ int rc;
++
++ /*
++ * Once all subblocks of a memory block were unplugged, offline and
++ * remove it.
++ */
++ if (!virtio_mem_sbm_test_sb_unplugged(vm, mb_id, 0, vm->sbm.sbs_per_mb))
++ return 0;
++
++ /* offline_and_remove_memory() works for online and offline memory. */
++ mutex_unlock(&vm->hotplug_mutex);
++ rc = virtio_mem_sbm_offline_and_remove_mb(vm, mb_id);
++ mutex_lock(&vm->hotplug_mutex);
++ if (!rc)
++ virtio_mem_sbm_set_mb_state(vm, mb_id,
++ VIRTIO_MEM_SBM_MB_UNUSED);
++ return rc;
++}
++
+ /*
+ * See virtio_mem_offline_and_remove_memory(): Try to offline and remove a
+ * all Linux memory blocks covered by the big block.
+@@ -1155,7 +1189,8 @@ static void virtio_mem_fake_online(unsigned long pfn, unsigned long nr_pages)
+ * Try to allocate a range, marking pages fake-offline, effectively
+ * fake-offlining them.
+ */
+-static int virtio_mem_fake_offline(unsigned long pfn, unsigned long nr_pages)
++static int virtio_mem_fake_offline(struct virtio_mem *vm, unsigned long pfn,
++ unsigned long nr_pages)
+ {
+ const bool is_movable = is_zone_movable_page(pfn_to_page(pfn));
+ int rc, retry_count;
+@@ -1168,6 +1203,14 @@ static int virtio_mem_fake_offline(unsigned long pfn, unsigned long nr_pages)
+ * some guarantees.
+ */
+ for (retry_count = 0; retry_count < 5; retry_count++) {
++ /*
++ * If the config changed, stop immediately and go back to the
++ * main loop: avoid trying to keep unplugging if the device
++ * might have decided to not remove any more memory.
++ */
++ if (atomic_read(&vm->config_changed))
++ return -EAGAIN;
++
+ rc = alloc_contig_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE,
+ GFP_KERNEL);
+ if (rc == -ENOMEM)
+@@ -1917,7 +1960,7 @@ static int virtio_mem_sbm_unplug_sb_online(struct virtio_mem *vm,
+ start_pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
+ sb_id * vm->sbm.sb_size);
+
+- rc = virtio_mem_fake_offline(start_pfn, nr_pages);
++ rc = virtio_mem_fake_offline(vm, start_pfn, nr_pages);
+ if (rc)
+ return rc;
+
+@@ -1989,20 +2032,10 @@ static int virtio_mem_sbm_unplug_any_sb_online(struct virtio_mem *vm,
+ }
+
+ unplugged:
+- /*
+- * Once all subblocks of a memory block were unplugged, offline and
+- * remove it. This will usually not fail, as no memory is in use
+- * anymore - however some other notifiers might NACK the request.
+- */
+- if (virtio_mem_sbm_test_sb_unplugged(vm, mb_id, 0, vm->sbm.sbs_per_mb)) {
+- mutex_unlock(&vm->hotplug_mutex);
+- rc = virtio_mem_sbm_offline_and_remove_mb(vm, mb_id);
+- mutex_lock(&vm->hotplug_mutex);
+- if (!rc)
+- virtio_mem_sbm_set_mb_state(vm, mb_id,
+- VIRTIO_MEM_SBM_MB_UNUSED);
+- }
+-
++ rc = virtio_mem_sbm_try_remove_unplugged_mb(vm, mb_id);
++ if (rc)
++ vm->sbm.have_unplugged_mb = 1;
++ /* Ignore errors, this is not critical. We'll retry later. */
+ return 0;
+ }
+
+@@ -2111,38 +2144,32 @@ static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm,
+ VIRTIO_MEM_BBM_BB_ADDED))
+ return -EINVAL;
+
+- if (bbm_safe_unplug) {
+- /*
+- * Start by fake-offlining all memory. Once we marked the device
+- * block as fake-offline, all newly onlined memory will
+- * automatically be kept fake-offline. Protect from concurrent
+- * onlining/offlining until we have a consistent state.
+- */
+- mutex_lock(&vm->hotplug_mutex);
+- virtio_mem_bbm_set_bb_state(vm, bb_id,
+- VIRTIO_MEM_BBM_BB_FAKE_OFFLINE);
++ /*
++ * Start by fake-offlining all memory. Once we marked the device
++ * block as fake-offline, all newly onlined memory will
++ * automatically be kept fake-offline. Protect from concurrent
++ * onlining/offlining until we have a consistent state.
++ */
++ mutex_lock(&vm->hotplug_mutex);
++ virtio_mem_bbm_set_bb_state(vm, bb_id, VIRTIO_MEM_BBM_BB_FAKE_OFFLINE);
+
+- for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+- page = pfn_to_online_page(pfn);
+- if (!page)
+- continue;
++ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
++ page = pfn_to_online_page(pfn);
++ if (!page)
++ continue;
+
+- rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION);
+- if (rc) {
+- end_pfn = pfn;
+- goto rollback_safe_unplug;
+- }
++ rc = virtio_mem_fake_offline(vm, pfn, PAGES_PER_SECTION);
++ if (rc) {
++ end_pfn = pfn;
++ goto rollback;
+ }
+- mutex_unlock(&vm->hotplug_mutex);
+ }
++ mutex_unlock(&vm->hotplug_mutex);
+
+ rc = virtio_mem_bbm_offline_and_remove_bb(vm, bb_id);
+ if (rc) {
+- if (bbm_safe_unplug) {
+- mutex_lock(&vm->hotplug_mutex);
+- goto rollback_safe_unplug;
+- }
+- return rc;
++ mutex_lock(&vm->hotplug_mutex);
++ goto rollback;
+ }
+
+ rc = virtio_mem_bbm_unplug_bb(vm, bb_id);
+@@ -2154,7 +2181,7 @@ static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm,
+ VIRTIO_MEM_BBM_BB_UNUSED);
+ return rc;
+
+-rollback_safe_unplug:
++rollback:
+ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+ page = pfn_to_online_page(pfn);
+ if (!page)
+@@ -2260,12 +2287,13 @@ static int virtio_mem_unplug_request(struct virtio_mem *vm, uint64_t diff)
+
+ /*
+ * Try to unplug all blocks that couldn't be unplugged before, for example,
+- * because the hypervisor was busy.
++ * because the hypervisor was busy. Further, offline and remove any memory
++ * blocks where we previously failed.
+ */
+-static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm)
++static int virtio_mem_cleanup_pending_mb(struct virtio_mem *vm)
+ {
+ unsigned long id;
+- int rc;
++ int rc = 0;
+
+ if (!vm->in_sbm) {
+ virtio_mem_bbm_for_each_bb(vm, id,
+@@ -2287,6 +2315,27 @@ static int virtio_mem_unplug_pending_mb(struct virtio_mem *vm)
+ VIRTIO_MEM_SBM_MB_UNUSED);
+ }
+
++ if (!vm->sbm.have_unplugged_mb)
++ return 0;
++
++ /*
++ * Let's retry (offlining and) removing completely unplugged Linux
++ * memory blocks.
++ */
++ vm->sbm.have_unplugged_mb = false;
++
++ mutex_lock(&vm->hotplug_mutex);
++ virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_MOVABLE_PARTIAL)
++ rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
++ virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_KERNEL_PARTIAL)
++ rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
++ virtio_mem_sbm_for_each_mb(vm, id, VIRTIO_MEM_SBM_MB_OFFLINE_PARTIAL)
++ rc |= virtio_mem_sbm_try_remove_unplugged_mb(vm, id);
++ mutex_unlock(&vm->hotplug_mutex);
++
++ if (rc)
++ vm->sbm.have_unplugged_mb = true;
++ /* Ignore errors, this is not critical. We'll retry later. */
+ return 0;
+ }
+
+@@ -2368,9 +2417,9 @@ retry:
+ virtio_mem_refresh_config(vm);
+ }
+
+- /* Unplug any leftovers from previous runs */
++ /* Cleanup any leftovers from previous runs */
+ if (!rc)
+- rc = virtio_mem_unplug_pending_mb(vm);
++ rc = virtio_mem_cleanup_pending_mb(vm);
+
+ if (!rc && vm->requested_size != vm->plugged_size) {
+ if (vm->requested_size > vm->plugged_size) {
+@@ -2382,6 +2431,13 @@ retry:
+ }
+ }
+
++ /*
++ * Keep retrying to offline and remove completely unplugged Linux
++ * memory blocks.
++ */
++ if (!rc && vm->in_sbm && vm->sbm.have_unplugged_mb)
++ rc = -EBUSY;
++
+ switch (rc) {
+ case 0:
+ vm->retry_timer_ms = VIRTIO_MEM_RETRY_TIMER_MIN_MS;
+diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
+index c5310eaf8b468..da1150d127c24 100644
+--- a/drivers/virtio/virtio_ring.c
++++ b/drivers/virtio/virtio_ring.c
+@@ -1461,7 +1461,7 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
+ }
+ }
+
+- if (i < head)
++ if (i <= head)
+ vq->packed.avail_wrap_counter ^= 1;
+
+ /* We're using some buffers from the free list. */
+diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
+index 961161da59000..06ce6d8c2e004 100644
+--- a/drivers/virtio/virtio_vdpa.c
++++ b/drivers/virtio/virtio_vdpa.c
+@@ -366,11 +366,14 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
+ struct irq_affinity default_affd = { 0 };
+ struct cpumask *masks;
+ struct vdpa_callback cb;
++ bool has_affinity = desc && ops->set_vq_affinity;
+ int i, err, queue_idx = 0;
+
+- masks = create_affinity_masks(nvqs, desc ? desc : &default_affd);
+- if (!masks)
+- return -ENOMEM;
++ if (has_affinity) {
++ masks = create_affinity_masks(nvqs, desc ? desc : &default_affd);
++ if (!masks)
++ return -ENOMEM;
++ }
+
+ for (i = 0; i < nvqs; ++i) {
+ if (!names[i]) {
+@@ -386,20 +389,22 @@ static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
+ goto err_setup_vq;
+ }
+
+- if (ops->set_vq_affinity)
++ if (has_affinity)
+ ops->set_vq_affinity(vdpa, i, &masks[i]);
+ }
+
+ cb.callback = virtio_vdpa_config_cb;
+ cb.private = vd_dev;
+ ops->set_config_cb(vdpa, &cb);
+- kfree(masks);
++ if (has_affinity)
++ kfree(masks);
+
+ return 0;
+
+ err_setup_vq:
+ virtio_vdpa_del_vqs(vdev);
+- kfree(masks);
++ if (has_affinity)
++ kfree(masks);
+ return err;
+ }
+
+diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
+index 70a4752ed913a..fd603e06d07fe 100644
+--- a/fs/dlm/plock.c
++++ b/fs/dlm/plock.c
+@@ -456,7 +456,8 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
+ }
+ } else {
+ list_for_each_entry(iter, &recv_list, list) {
+- if (!iter->info.wait) {
++ if (!iter->info.wait &&
++ iter->info.fsid == info.fsid) {
+ op = iter;
+ break;
+ }
+@@ -468,8 +469,7 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
+ if (info.wait)
+ WARN_ON(op->info.optype != DLM_PLOCK_OP_LOCK);
+ else
+- WARN_ON(op->info.fsid != info.fsid ||
+- op->info.number != info.number ||
++ WARN_ON(op->info.number != info.number ||
+ op->info.owner != info.owner ||
+ op->info.optype != info.optype);
+
+diff --git a/fs/eventfd.c b/fs/eventfd.c
+index 95850a13ce8d0..1ffbf7c1cd16d 100644
+--- a/fs/eventfd.c
++++ b/fs/eventfd.c
+@@ -189,7 +189,7 @@ void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
+ {
+ lockdep_assert_held(&ctx->wqh.lock);
+
+- *cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
++ *cnt = ((ctx->flags & EFD_SEMAPHORE) && ctx->count) ? 1 : ctx->count;
+ ctx->count -= *cnt;
+ }
+ EXPORT_SYMBOL_GPL(eventfd_ctx_do_read);
+diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
+index 3fa5de892d89d..333439b3ac146 100644
+--- a/fs/ext4/mballoc.c
++++ b/fs/ext4/mballoc.c
+@@ -966,8 +966,9 @@ static inline int should_optimize_scan(struct ext4_allocation_context *ac)
+ * Return next linear group for allocation. If linear traversal should not be
+ * performed, this function just returns the same group
+ */
+-static int
+-next_linear_group(struct ext4_allocation_context *ac, int group, int ngroups)
++static ext4_group_t
++next_linear_group(struct ext4_allocation_context *ac, ext4_group_t group,
++ ext4_group_t ngroups)
+ {
+ if (!should_optimize_scan(ac))
+ goto inc_and_return;
+@@ -2414,7 +2415,7 @@ static bool ext4_mb_good_group(struct ext4_allocation_context *ac,
+
+ BUG_ON(cr < 0 || cr >= 4);
+
+- if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp) || !grp))
++ if (unlikely(!grp || EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
+ return false;
+
+ free = grp->bb_free;
+diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
+index 0caf6c730ce34..6bcc3770ee19f 100644
+--- a/fs/ext4/namei.c
++++ b/fs/ext4/namei.c
+@@ -2799,6 +2799,7 @@ static int ext4_add_nondir(handle_t *handle,
+ return err;
+ }
+ drop_nlink(inode);
++ ext4_mark_inode_dirty(handle, inode);
+ ext4_orphan_add(handle, inode);
+ unlock_new_inode(inode);
+ return err;
+@@ -3436,6 +3437,7 @@ retry:
+
+ err_drop_inode:
+ clear_nlink(inode);
++ ext4_mark_inode_dirty(handle, inode);
+ ext4_orphan_add(handle, inode);
+ unlock_new_inode(inode);
+ if (handle)
+@@ -4021,6 +4023,7 @@ end_rename:
+ ext4_resetent(handle, &old,
+ old.inode->i_ino, old_file_type);
+ drop_nlink(whiteout);
++ ext4_mark_inode_dirty(handle, whiteout);
+ ext4_orphan_add(handle, whiteout);
+ }
+ unlock_new_inode(whiteout);
+diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
+index 8fd3b7f9fb88e..b0597a539fc54 100644
+--- a/fs/f2fs/checkpoint.c
++++ b/fs/f2fs/checkpoint.c
+@@ -1701,9 +1701,9 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
+ }
+
+ f2fs_restore_inmem_curseg(sbi);
++ stat_inc_cp_count(sbi);
+ stop:
+ unblock_operations(sbi);
+- stat_inc_cp_count(sbi->stat_info);
+
+ if (cpc->reason & CP_RECOVERY)
+ f2fs_notice(sbi, "checkpoint: version = %llx", ckpt_ver);
+diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
+index 61c35b59126ec..fdbf994f12718 100644
+--- a/fs/f2fs/debug.c
++++ b/fs/f2fs/debug.c
+@@ -215,6 +215,9 @@ static void update_general_status(struct f2fs_sb_info *sbi)
+ si->valid_blks[type] += blks;
+ }
+
++ for (i = 0; i < MAX_CALL_TYPE; i++)
++ si->cp_call_count[i] = atomic_read(&sbi->cp_call_count[i]);
++
+ for (i = 0; i < 2; i++) {
+ si->segment_count[i] = sbi->segment_count[i];
+ si->block_count[i] = sbi->block_count[i];
+@@ -497,7 +500,9 @@ static int stat_show(struct seq_file *s, void *v)
+ seq_printf(s, " - Prefree: %d\n - Free: %d (%d)\n\n",
+ si->prefree_count, si->free_segs, si->free_secs);
+ seq_printf(s, "CP calls: %d (BG: %d)\n",
+- si->cp_count, si->bg_cp_count);
++ si->cp_call_count[TOTAL_CALL],
++ si->cp_call_count[BACKGROUND]);
++ seq_printf(s, "CP count: %d\n", si->cp_count);
+ seq_printf(s, " - cp blocks : %u\n", si->meta_count[META_CP]);
+ seq_printf(s, " - sit blocks : %u\n",
+ si->meta_count[META_SIT]);
+@@ -511,12 +516,24 @@ static int stat_show(struct seq_file *s, void *v)
+ seq_printf(s, " - Total : %4d\n", si->nr_total_ckpt);
+ seq_printf(s, " - Cur time : %4d(ms)\n", si->cur_ckpt_time);
+ seq_printf(s, " - Peak time : %4d(ms)\n", si->peak_ckpt_time);
+- seq_printf(s, "GC calls: %d (BG: %d)\n",
+- si->call_count, si->bg_gc);
+- seq_printf(s, " - data segments : %d (%d)\n",
+- si->data_segs, si->bg_data_segs);
+- seq_printf(s, " - node segments : %d (%d)\n",
+- si->node_segs, si->bg_node_segs);
++ seq_printf(s, "GC calls: %d (gc_thread: %d)\n",
++ si->gc_call_count[BACKGROUND] +
++ si->gc_call_count[FOREGROUND],
++ si->gc_call_count[BACKGROUND]);
++ if (__is_large_section(sbi)) {
++ seq_printf(s, " - data sections : %d (BG: %d)\n",
++ si->gc_secs[DATA][BG_GC] + si->gc_secs[DATA][FG_GC],
++ si->gc_secs[DATA][BG_GC]);
++ seq_printf(s, " - node sections : %d (BG: %d)\n",
++ si->gc_secs[NODE][BG_GC] + si->gc_secs[NODE][FG_GC],
++ si->gc_secs[NODE][BG_GC]);
++ }
++ seq_printf(s, " - data segments : %d (BG: %d)\n",
++ si->gc_segs[DATA][BG_GC] + si->gc_segs[DATA][FG_GC],
++ si->gc_segs[DATA][BG_GC]);
++ seq_printf(s, " - node segments : %d (BG: %d)\n",
++ si->gc_segs[NODE][BG_GC] + si->gc_segs[NODE][FG_GC],
++ si->gc_segs[NODE][BG_GC]);
+ seq_puts(s, " - Reclaimed segs :\n");
+ seq_printf(s, " - Normal : %d\n", sbi->gc_reclaimed_segs[GC_NORMAL]);
+ seq_printf(s, " - Idle CB : %d\n", sbi->gc_reclaimed_segs[GC_IDLE_CB]);
+@@ -687,6 +704,8 @@ int f2fs_build_stats(struct f2fs_sb_info *sbi)
+ atomic_set(&sbi->inplace_count, 0);
+ for (i = META_CP; i < META_MAX; i++)
+ atomic_set(&sbi->meta_count[i], 0);
++ for (i = 0; i < MAX_CALL_TYPE; i++)
++ atomic_set(&sbi->cp_call_count[i], 0);
+
+ atomic_set(&sbi->max_aw_cnt, 0);
+
+diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
+index 271d4e7b22c91..cfb8e274c0699 100644
+--- a/fs/f2fs/f2fs.h
++++ b/fs/f2fs/f2fs.h
+@@ -1377,6 +1377,13 @@ enum errors_option {
+ MOUNT_ERRORS_PANIC, /* panic on errors */
+ };
+
++enum {
++ BACKGROUND,
++ FOREGROUND,
++ MAX_CALL_TYPE,
++ TOTAL_CALL = FOREGROUND,
++};
++
+ static inline int f2fs_test_bit(unsigned int nr, char *addr);
+ static inline void f2fs_set_bit(unsigned int nr, char *addr);
+ static inline void f2fs_clear_bit(unsigned int nr, char *addr);
+@@ -1687,6 +1694,7 @@ struct f2fs_sb_info {
+ unsigned int io_skip_bggc; /* skip background gc for in-flight IO */
+ unsigned int other_skip_bggc; /* skip background gc for other reasons */
+ unsigned int ndirty_inode[NR_INODE_TYPE]; /* # of dirty inodes */
++ atomic_t cp_call_count[MAX_CALL_TYPE]; /* # of cp call */
+ #endif
+ spinlock_t stat_lock; /* lock for stat operations */
+
+@@ -3873,7 +3881,7 @@ struct f2fs_stat_info {
+ int nats, dirty_nats, sits, dirty_sits;
+ int free_nids, avail_nids, alloc_nids;
+ int total_count, utilization;
+- int bg_gc, nr_wb_cp_data, nr_wb_data;
++ int nr_wb_cp_data, nr_wb_data;
+ int nr_rd_data, nr_rd_node, nr_rd_meta;
+ int nr_dio_read, nr_dio_write;
+ unsigned int io_skip_bggc, other_skip_bggc;
+@@ -3893,9 +3901,11 @@ struct f2fs_stat_info {
+ int rsvd_segs, overp_segs;
+ int dirty_count, node_pages, meta_pages, compress_pages;
+ int compress_page_hit;
+- int prefree_count, call_count, cp_count, bg_cp_count;
+- int tot_segs, node_segs, data_segs, free_segs, free_secs;
+- int bg_node_segs, bg_data_segs;
++ int prefree_count, free_segs, free_secs;
++ int cp_call_count[MAX_CALL_TYPE], cp_count;
++ int gc_call_count[MAX_CALL_TYPE];
++ int gc_segs[2][2];
++ int gc_secs[2][2];
+ int tot_blks, data_blks, node_blks;
+ int bg_data_blks, bg_node_blks;
+ int curseg[NR_CURSEG_TYPE];
+@@ -3917,10 +3927,9 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
+ return (struct f2fs_stat_info *)sbi->stat_info;
+ }
+
+-#define stat_inc_cp_count(si) ((si)->cp_count++)
+-#define stat_inc_bg_cp_count(si) ((si)->bg_cp_count++)
+-#define stat_inc_call_count(si) ((si)->call_count++)
+-#define stat_inc_bggc_count(si) ((si)->bg_gc++)
++#define stat_inc_cp_call_count(sbi, foreground) \
++ atomic_inc(&sbi->cp_call_count[(foreground)])
++#define stat_inc_cp_count(si) (F2FS_STAT(sbi)->cp_count++)
+ #define stat_io_skip_bggc_count(sbi) ((sbi)->io_skip_bggc++)
+ #define stat_other_skip_bggc_count(sbi) ((sbi)->other_skip_bggc++)
+ #define stat_inc_dirty_inode(sbi, type) ((sbi)->ndirty_inode[type]++)
+@@ -4005,18 +4014,12 @@ static inline struct f2fs_stat_info *F2FS_STAT(struct f2fs_sb_info *sbi)
+ if (cur > max) \
+ atomic_set(&F2FS_I_SB(inode)->max_aw_cnt, cur); \
+ } while (0)
+-#define stat_inc_seg_count(sbi, type, gc_type) \
+- do { \
+- struct f2fs_stat_info *si = F2FS_STAT(sbi); \
+- si->tot_segs++; \
+- if ((type) == SUM_TYPE_DATA) { \
+- si->data_segs++; \
+- si->bg_data_segs += (gc_type == BG_GC) ? 1 : 0; \
+- } else { \
+- si->node_segs++; \
+- si->bg_node_segs += (gc_type == BG_GC) ? 1 : 0; \
+- } \
+- } while (0)
++#define stat_inc_gc_call_count(sbi, foreground) \
++ (F2FS_STAT(sbi)->gc_call_count[(foreground)]++)
++#define stat_inc_gc_sec_count(sbi, type, gc_type) \
++ (F2FS_STAT(sbi)->gc_secs[(type)][(gc_type)]++)
++#define stat_inc_gc_seg_count(sbi, type, gc_type) \
++ (F2FS_STAT(sbi)->gc_segs[(type)][(gc_type)]++)
+
+ #define stat_inc_tot_blk_count(si, blks) \
+ ((si)->tot_blks += (blks))
+@@ -4043,10 +4046,8 @@ void __init f2fs_create_root_stats(void);
+ void f2fs_destroy_root_stats(void);
+ void f2fs_update_sit_info(struct f2fs_sb_info *sbi);
+ #else
+-#define stat_inc_cp_count(si) do { } while (0)
+-#define stat_inc_bg_cp_count(si) do { } while (0)
+-#define stat_inc_call_count(si) do { } while (0)
+-#define stat_inc_bggc_count(si) do { } while (0)
++#define stat_inc_cp_call_count(sbi, foreground) do { } while (0)
++#define stat_inc_cp_count(sbi) do { } while (0)
+ #define stat_io_skip_bggc_count(sbi) do { } while (0)
+ #define stat_other_skip_bggc_count(sbi) do { } while (0)
+ #define stat_inc_dirty_inode(sbi, type) do { } while (0)
+@@ -4074,7 +4075,9 @@ void f2fs_update_sit_info(struct f2fs_sb_info *sbi);
+ #define stat_inc_seg_type(sbi, curseg) do { } while (0)
+ #define stat_inc_block_count(sbi, curseg) do { } while (0)
+ #define stat_inc_inplace_blocks(sbi) do { } while (0)
+-#define stat_inc_seg_count(sbi, type, gc_type) do { } while (0)
++#define stat_inc_gc_call_count(sbi, foreground) do { } while (0)
++#define stat_inc_gc_sec_count(sbi, type, gc_type) do { } while (0)
++#define stat_inc_gc_seg_count(sbi, type, gc_type) do { } while (0)
+ #define stat_inc_tot_blk_count(si, blks) do { } while (0)
+ #define stat_inc_data_blk_count(sbi, blks, gc_type) do { } while (0)
+ #define stat_inc_node_blk_count(sbi, blks, gc_type) do { } while (0)
+@@ -4469,7 +4472,8 @@ static inline bool f2fs_low_mem_mode(struct f2fs_sb_info *sbi)
+ static inline bool f2fs_may_compress(struct inode *inode)
+ {
+ if (IS_SWAPFILE(inode) || f2fs_is_pinned_file(inode) ||
+- f2fs_is_atomic_file(inode) || f2fs_has_inline_data(inode))
++ f2fs_is_atomic_file(inode) || f2fs_has_inline_data(inode) ||
++ f2fs_is_mmap_file(inode))
+ return false;
+ return S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode);
+ }
+diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
+index ead75c4e833d2..094b047d246c9 100644
+--- a/fs/f2fs/file.c
++++ b/fs/f2fs/file.c
+@@ -528,7 +528,11 @@ static int f2fs_file_mmap(struct file *file, struct vm_area_struct *vma)
+
+ file_accessed(file);
+ vma->vm_ops = &f2fs_file_vm_ops;
++
++ f2fs_down_read(&F2FS_I(inode)->i_sem);
+ set_inode_flag(inode, FI_MMAP_FILE);
++ f2fs_up_read(&F2FS_I(inode)->i_sem);
++
+ return 0;
+ }
+
+@@ -1725,6 +1729,7 @@ next_alloc:
+ if (has_not_enough_free_secs(sbi, 0,
+ GET_SEC_FROM_SEG(sbi, overprovision_segments(sbi)))) {
+ f2fs_down_write(&sbi->gc_lock);
++ stat_inc_gc_call_count(sbi, FOREGROUND);
+ err = f2fs_gc(sbi, &gc_control);
+ if (err && err != -ENODATA)
+ goto out_err;
+@@ -1920,12 +1925,19 @@ static int f2fs_setflags_common(struct inode *inode, u32 iflags, u32 mask)
+ int err = f2fs_convert_inline_inode(inode);
+ if (err)
+ return err;
+- if (!f2fs_may_compress(inode))
+- return -EINVAL;
+- if (S_ISREG(inode->i_mode) && F2FS_HAS_BLOCKS(inode))
++
++ f2fs_down_write(&F2FS_I(inode)->i_sem);
++ if (!f2fs_may_compress(inode) ||
++ (S_ISREG(inode->i_mode) &&
++ F2FS_HAS_BLOCKS(inode))) {
++ f2fs_up_write(&F2FS_I(inode)->i_sem);
+ return -EINVAL;
+- if (set_compress_context(inode))
+- return -EOPNOTSUPP;
++ }
++ err = set_compress_context(inode);
++ f2fs_up_write(&F2FS_I(inode)->i_sem);
++
++ if (err)
++ return err;
+ }
+ }
+
+@@ -2466,6 +2478,7 @@ static int f2fs_ioc_gc(struct file *filp, unsigned long arg)
+
+ gc_control.init_gc_type = sync ? FG_GC : BG_GC;
+ gc_control.err_gc_skipped = sync;
++ stat_inc_gc_call_count(sbi, FOREGROUND);
+ ret = f2fs_gc(sbi, &gc_control);
+ out:
+ mnt_drop_write_file(filp);
+@@ -2509,6 +2522,7 @@ do_more:
+ }
+
+ gc_control.victim_segno = GET_SEGNO(sbi, range->start);
++ stat_inc_gc_call_count(sbi, FOREGROUND);
+ ret = f2fs_gc(sbi, &gc_control);
+ if (ret) {
+ if (ret == -EBUSY)
+@@ -2980,6 +2994,7 @@ static int f2fs_ioc_flush_device(struct file *filp, unsigned long arg)
+ sm->last_victim[ALLOC_NEXT] = end_segno + 1;
+
+ gc_control.victim_segno = start_segno;
++ stat_inc_gc_call_count(sbi, FOREGROUND);
+ ret = f2fs_gc(sbi, &gc_control);
+ if (ret == -EAGAIN)
+ ret = 0;
+@@ -3953,6 +3968,7 @@ static int f2fs_ioc_set_compress_option(struct file *filp, unsigned long arg)
+ file_start_write(filp);
+ inode_lock(inode);
+
++ f2fs_down_write(&F2FS_I(inode)->i_sem);
+ if (f2fs_is_mmap_file(inode) || get_dirty_pages(inode)) {
+ ret = -EBUSY;
+ goto out;
+@@ -3972,6 +3988,7 @@ static int f2fs_ioc_set_compress_option(struct file *filp, unsigned long arg)
+ f2fs_warn(sbi, "compression algorithm is successfully set, "
+ "but current kernel doesn't support this algorithm.");
+ out:
++ f2fs_up_write(&F2FS_I(inode)->i_sem);
+ inode_unlock(inode);
+ file_end_write(filp);
+
+diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
+index 719b1ba32a78b..7d2736b66c702 100644
+--- a/fs/f2fs/gc.c
++++ b/fs/f2fs/gc.c
+@@ -121,8 +121,8 @@ static int gc_thread_func(void *data)
+ else
+ increase_sleep_time(gc_th, &wait_ms);
+ do_gc:
+- if (!foreground)
+- stat_inc_bggc_count(sbi->stat_info);
++ stat_inc_gc_call_count(sbi, foreground ?
++ FOREGROUND : BACKGROUND);
+
+ sync_mode = F2FS_OPTION(sbi).bggc_mode == BGGC_MODE_SYNC;
+
+@@ -1685,6 +1685,7 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
+ int seg_freed = 0, migrated = 0;
+ unsigned char type = IS_DATASEG(get_seg_entry(sbi, segno)->type) ?
+ SUM_TYPE_DATA : SUM_TYPE_NODE;
++ unsigned char data_type = (type == SUM_TYPE_DATA) ? DATA : NODE;
+ int submitted = 0;
+
+ if (__is_large_section(sbi))
+@@ -1766,7 +1767,7 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
+ segno, gc_type,
+ force_migrate);
+
+- stat_inc_seg_count(sbi, type, gc_type);
++ stat_inc_gc_seg_count(sbi, data_type, gc_type);
+ sbi->gc_reclaimed_segs[sbi->gc_mode]++;
+ migrated++;
+
+@@ -1783,12 +1784,12 @@ skip:
+ }
+
+ if (submitted)
+- f2fs_submit_merged_write(sbi,
+- (type == SUM_TYPE_NODE) ? NODE : DATA);
++ f2fs_submit_merged_write(sbi, data_type);
+
+ blk_finish_plug(&plug);
+
+- stat_inc_call_count(sbi->stat_info);
++ if (migrated)
++ stat_inc_gc_sec_count(sbi, data_type, gc_type);
+
+ return seg_freed;
+ }
+@@ -1839,6 +1840,7 @@ gc_more:
+ * secure free segments which doesn't need fggc any more.
+ */
+ if (prefree_segments(sbi)) {
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ ret = f2fs_write_checkpoint(sbi, &cpc);
+ if (ret)
+ goto stop;
+@@ -1883,6 +1885,7 @@ retry:
+ round++;
+ if (skipped_round > MAX_SKIP_GC_COUNT &&
+ skipped_round * 2 >= round) {
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ ret = f2fs_write_checkpoint(sbi, &cpc);
+ goto stop;
+ }
+@@ -1898,6 +1901,7 @@ retry:
+ */
+ if (free_sections(sbi) <= upper_secs + NR_GC_CHECKPOINT_SECS &&
+ prefree_segments(sbi)) {
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ ret = f2fs_write_checkpoint(sbi, &cpc);
+ if (ret)
+ goto stop;
+@@ -2023,6 +2027,7 @@ static int free_segment_range(struct f2fs_sb_info *sbi,
+ if (gc_only)
+ goto out;
+
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ err = f2fs_write_checkpoint(sbi, &cpc);
+ if (err)
+ goto out;
+@@ -2215,6 +2220,7 @@ out_drop_write:
+ clear_sbi_flag(sbi, SBI_IS_RESIZEFS);
+ set_sbi_flag(sbi, SBI_IS_DIRTY);
+
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ err = f2fs_write_checkpoint(sbi, &cpc);
+ if (err) {
+ update_fs_metadata(sbi, secs);
+diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
+index cf4327ad106c0..a99c151e3bdcf 100644
+--- a/fs/f2fs/inode.c
++++ b/fs/f2fs/inode.c
+@@ -396,6 +396,12 @@ static int do_read_inode(struct inode *inode)
+ fi->i_inline_xattr_size = 0;
+ }
+
++ if (!sanity_check_inode(inode, node_page)) {
++ f2fs_put_page(node_page, 1);
++ f2fs_handle_error(sbi, ERROR_CORRUPTED_INODE);
++ return -EFSCORRUPTED;
++ }
++
+ /* check data exist */
+ if (f2fs_has_inline_data(inode) && !f2fs_exist_data(inode))
+ __recover_inline_status(inode, node_page);
+@@ -465,12 +471,6 @@ static int do_read_inode(struct inode *inode)
+ f2fs_init_read_extent_tree(inode, node_page);
+ f2fs_init_age_extent_tree(inode);
+
+- if (!sanity_check_inode(inode, node_page)) {
+- f2fs_put_page(node_page, 1);
+- f2fs_handle_error(sbi, ERROR_CORRUPTED_INODE);
+- return -EFSCORRUPTED;
+- }
+-
+ if (!sanity_check_extent_cache(inode)) {
+ f2fs_put_page(node_page, 1);
+ f2fs_handle_error(sbi, ERROR_CORRUPTED_INODE);
+diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
+index 58c1a0096f7de..5b632d2641d49 100644
+--- a/fs/f2fs/recovery.c
++++ b/fs/f2fs/recovery.c
+@@ -895,6 +895,7 @@ skip:
+ struct cp_control cpc = {
+ .reason = CP_RECOVERY,
+ };
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ err = f2fs_write_checkpoint(sbi, &cpc);
+ }
+ }
+diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
+index 6db410f1bb8ce..37196d7a9685c 100644
+--- a/fs/f2fs/segment.c
++++ b/fs/f2fs/segment.c
+@@ -433,6 +433,7 @@ void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need)
+ .err_gc_skipped = false,
+ .nr_free_secs = 1 };
+ f2fs_down_write(&sbi->gc_lock);
++ stat_inc_gc_call_count(sbi, FOREGROUND);
+ f2fs_gc(sbi, &gc_control);
+ }
+ }
+@@ -510,8 +511,8 @@ do_sync:
+
+ mutex_unlock(&sbi->flush_lock);
+ }
++ stat_inc_cp_call_count(sbi, BACKGROUND);
+ f2fs_sync_fs(sbi->sb, 1);
+- stat_inc_bg_cp_count(sbi->stat_info);
+ }
+
+ static int __submit_flush_wait(struct f2fs_sb_info *sbi,
+@@ -3150,6 +3151,7 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
+ goto out;
+
+ f2fs_down_write(&sbi->gc_lock);
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ err = f2fs_write_checkpoint(sbi, &cpc);
+ f2fs_up_write(&sbi->gc_lock);
+ if (err)
+diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
+index 3d91b5313947f..a6f283d34cd49 100644
+--- a/fs/f2fs/super.c
++++ b/fs/f2fs/super.c
+@@ -860,11 +860,6 @@ static int parse_options(struct super_block *sb, char *options, bool is_remount)
+ if (!name)
+ return -ENOMEM;
+ if (!strcmp(name, "adaptive")) {
+- if (f2fs_sb_has_blkzoned(sbi)) {
+- f2fs_warn(sbi, "adaptive mode is not allowed with zoned block device feature");
+- kfree(name);
+- return -EINVAL;
+- }
+ F2FS_OPTION(sbi).fs_mode = FS_MODE_ADAPTIVE;
+ } else if (!strcmp(name, "lfs")) {
+ F2FS_OPTION(sbi).fs_mode = FS_MODE_LFS;
+@@ -1329,6 +1324,11 @@ default_check:
+ F2FS_OPTION(sbi).discard_unit =
+ DISCARD_UNIT_SECTION;
+ }
++
++ if (F2FS_OPTION(sbi).fs_mode != FS_MODE_LFS) {
++ f2fs_info(sbi, "Only lfs mode is allowed with zoned block device feature");
++ return -EINVAL;
++ }
+ #else
+ f2fs_err(sbi, "Zoned block device support is not enabled");
+ return -EINVAL;
+@@ -1571,6 +1571,7 @@ static void f2fs_put_super(struct super_block *sb)
+ {
+ struct f2fs_sb_info *sbi = F2FS_SB(sb);
+ int i;
++ int err = 0;
+ bool done;
+
+ /* unregister procfs/sysfs entries in advance to avoid race case */
+@@ -1597,7 +1598,8 @@ static void f2fs_put_super(struct super_block *sb)
+ struct cp_control cpc = {
+ .reason = CP_UMOUNT,
+ };
+- f2fs_write_checkpoint(sbi, &cpc);
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
++ err = f2fs_write_checkpoint(sbi, &cpc);
+ }
+
+ /* be sure to wait for any on-going discard commands */
+@@ -1606,7 +1608,8 @@ static void f2fs_put_super(struct super_block *sb)
+ struct cp_control cpc = {
+ .reason = CP_UMOUNT | CP_TRIMMED,
+ };
+- f2fs_write_checkpoint(sbi, &cpc);
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
++ err = f2fs_write_checkpoint(sbi, &cpc);
+ }
+
+ /*
+@@ -1623,6 +1626,19 @@ static void f2fs_put_super(struct super_block *sb)
+
+ f2fs_wait_on_all_pages(sbi, F2FS_WB_CP_DATA);
+
++ if (err) {
++ truncate_inode_pages_final(NODE_MAPPING(sbi));
++ truncate_inode_pages_final(META_MAPPING(sbi));
++ }
++
++ for (i = 0; i < NR_COUNT_TYPE; i++) {
++ if (!get_pages(sbi, i))
++ continue;
++ f2fs_err(sbi, "detect filesystem reference count leak during "
++ "umount, type: %d, count: %lld", i, get_pages(sbi, i));
++ f2fs_bug_on(sbi, 1);
++ }
++
+ f2fs_bug_on(sbi, sbi->fsync_node_num);
+
+ f2fs_destroy_compress_inode(sbi);
+@@ -1689,8 +1705,10 @@ int f2fs_sync_fs(struct super_block *sb, int sync)
+ if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
+ return -EAGAIN;
+
+- if (sync)
++ if (sync) {
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ err = f2fs_issue_checkpoint(sbi);
++ }
+
+ return err;
+ }
+@@ -2189,6 +2207,7 @@ static int f2fs_disable_checkpoint(struct f2fs_sb_info *sbi)
+ .nr_free_secs = 1 };
+
+ f2fs_down_write(&sbi->gc_lock);
++ stat_inc_gc_call_count(sbi, FOREGROUND);
+ err = f2fs_gc(sbi, &gc_control);
+ if (err == -ENODATA) {
+ err = 0;
+@@ -2214,6 +2233,7 @@ skip_gc:
+ f2fs_down_write(&sbi->gc_lock);
+ cpc.reason = CP_PAUSE;
+ set_sbi_flag(sbi, SBI_CP_DISABLED);
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ err = f2fs_write_checkpoint(sbi, &cpc);
+ if (err)
+ goto out_unlock;
+@@ -4833,6 +4853,7 @@ static void kill_f2fs_super(struct super_block *sb)
+ struct cp_control cpc = {
+ .reason = CP_UMOUNT,
+ };
++ stat_inc_cp_call_count(sbi, TOTAL_CALL);
+ f2fs_write_checkpoint(sbi, &cpc);
+ }
+
+diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
+index 8ea05340bad90..8e18c2d742ca9 100644
+--- a/fs/f2fs/sysfs.c
++++ b/fs/f2fs/sysfs.c
+@@ -356,6 +356,16 @@ static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
+ if (!strcmp(a->attr.name, "revoked_atomic_block"))
+ return sysfs_emit(buf, "%llu\n", sbi->revoked_atomic_block);
+
++#ifdef CONFIG_F2FS_STAT_FS
++ if (!strcmp(a->attr.name, "cp_foreground_calls"))
++ return sysfs_emit(buf, "%d\n",
++ atomic_read(&sbi->cp_call_count[TOTAL_CALL]) -
++ atomic_read(&sbi->cp_call_count[BACKGROUND]));
++ if (!strcmp(a->attr.name, "cp_background_calls"))
++ return sysfs_emit(buf, "%d\n",
++ atomic_read(&sbi->cp_call_count[BACKGROUND]));
++#endif
++
+ ui = (unsigned int *)(ptr + a->offset);
+
+ return sysfs_emit(buf, "%u\n", *ui);
+@@ -842,68 +852,160 @@ static struct f2fs_attr f2fs_attr_##_name = { \
+ #define F2FS_GENERAL_RO_ATTR(name) \
+ static struct f2fs_attr f2fs_attr_##name = __ATTR(name, 0444, name##_show, NULL)
+
+-#define F2FS_STAT_ATTR(_struct_type, _struct_name, _name, _elname) \
+-static struct f2fs_attr f2fs_attr_##_name = { \
+- .attr = {.name = __stringify(_name), .mode = 0444 }, \
+- .show = f2fs_sbi_show, \
+- .struct_type = _struct_type, \
+- .offset = offsetof(struct _struct_name, _elname), \
+-}
++#ifdef CONFIG_F2FS_STAT_FS
++#define STAT_INFO_RO_ATTR(name, elname) \
++ F2FS_RO_ATTR(STAT_INFO, f2fs_stat_info, name, elname)
++#endif
+
+-F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_urgent_sleep_time,
+- urgent_sleep_time);
+-F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_min_sleep_time, min_sleep_time);
+-F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_max_sleep_time, max_sleep_time);
+-F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_no_gc_sleep_time, no_gc_sleep_time);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_idle, gc_mode);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_urgent, gc_mode);
+-F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, reclaim_segments, rec_prefree_segments);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_small_discards, max_discards);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_discard_request, max_discard_request);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, min_discard_issue_time, min_discard_issue_time);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, mid_discard_issue_time, mid_discard_issue_time);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_discard_issue_time, max_discard_issue_time);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, discard_io_aware_gran, discard_io_aware_gran);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, discard_urgent_util, discard_urgent_util);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, discard_granularity, discard_granularity);
+-F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_ordered_discard, max_ordered_discard);
+-F2FS_RW_ATTR(RESERVED_BLOCKS, f2fs_sb_info, reserved_blocks, reserved_blocks);
+-F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, ipu_policy, ipu_policy);
+-F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, min_ipu_util, min_ipu_util);
+-F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, min_fsync_blocks, min_fsync_blocks);
+-F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, min_seq_blocks, min_seq_blocks);
+-F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, min_hot_blocks, min_hot_blocks);
+-F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, min_ssr_sections, min_ssr_sections);
+-F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, ram_thresh, ram_thresh);
+-F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, ra_nid_pages, ra_nid_pages);
+-F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, dirty_nats_ratio, dirty_nats_ratio);
+-F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, max_roll_forward_node_blocks, max_rf_node_blocks);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_victim_search, max_victim_search);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, migration_granularity, migration_granularity);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, dir_level, dir_level);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, cp_interval, interval_time[CP_TIME]);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, idle_interval, interval_time[REQ_TIME]);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, discard_idle_interval,
+- interval_time[DISCARD_TIME]);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_idle_interval, interval_time[GC_TIME]);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info,
+- umount_discard_timeout, interval_time[UMOUNT_DISCARD_TIMEOUT]);
+-#ifdef CONFIG_F2FS_IOSTAT
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, iostat_enable, iostat_enable);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, iostat_period_ms, iostat_period_ms);
++#define GC_THREAD_RW_ATTR(name, elname) \
++ F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, name, elname)
++
++#define SM_INFO_RW_ATTR(name, elname) \
++ F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, name, elname)
++
++#define SM_INFO_GENERAL_RW_ATTR(elname) \
++ SM_INFO_RW_ATTR(elname, elname)
++
++#define DCC_INFO_RW_ATTR(name, elname) \
++ F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, name, elname)
++
++#define DCC_INFO_GENERAL_RW_ATTR(elname) \
++ DCC_INFO_RW_ATTR(elname, elname)
++
++#define NM_INFO_RW_ATTR(name, elname) \
++ F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, name, elname)
++
++#define NM_INFO_GENERAL_RW_ATTR(elname) \
++ NM_INFO_RW_ATTR(elname, elname)
++
++#define F2FS_SBI_RW_ATTR(name, elname) \
++ F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, name, elname)
++
++#define F2FS_SBI_GENERAL_RW_ATTR(elname) \
++ F2FS_SBI_RW_ATTR(elname, elname)
++
++#define F2FS_SBI_GENERAL_RO_ATTR(elname) \
++ F2FS_RO_ATTR(F2FS_SBI, f2fs_sb_info, elname, elname)
++
++#ifdef CONFIG_F2FS_FAULT_INJECTION
++#define FAULT_INFO_GENERAL_RW_ATTR(type, elname) \
++ F2FS_RW_ATTR(type, f2fs_fault_info, elname, elname)
+ #endif
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, readdir_ra, readdir_ra);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_io_bytes, max_io_bytes);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_pin_file_thresh, gc_pin_file_threshold);
++
++#define RESERVED_BLOCKS_GENERAL_RW_ATTR(elname) \
++ F2FS_RW_ATTR(RESERVED_BLOCKS, f2fs_sb_info, elname, elname)
++
++#define CPRC_INFO_GENERAL_RW_ATTR(elname) \
++ F2FS_RW_ATTR(CPRC_INFO, ckpt_req_control, elname, elname)
++
++#define ATGC_INFO_RW_ATTR(name, elname) \
++ F2FS_RW_ATTR(ATGC_INFO, atgc_management, name, elname)
++
++/* GC_THREAD ATTR */
++GC_THREAD_RW_ATTR(gc_urgent_sleep_time, urgent_sleep_time);
++GC_THREAD_RW_ATTR(gc_min_sleep_time, min_sleep_time);
++GC_THREAD_RW_ATTR(gc_max_sleep_time, max_sleep_time);
++GC_THREAD_RW_ATTR(gc_no_gc_sleep_time, no_gc_sleep_time);
++
++/* SM_INFO ATTR */
++SM_INFO_RW_ATTR(reclaim_segments, rec_prefree_segments);
++SM_INFO_GENERAL_RW_ATTR(ipu_policy);
++SM_INFO_GENERAL_RW_ATTR(min_ipu_util);
++SM_INFO_GENERAL_RW_ATTR(min_fsync_blocks);
++SM_INFO_GENERAL_RW_ATTR(min_seq_blocks);
++SM_INFO_GENERAL_RW_ATTR(min_hot_blocks);
++SM_INFO_GENERAL_RW_ATTR(min_ssr_sections);
++
++/* DCC_INFO ATTR */
++DCC_INFO_RW_ATTR(max_small_discards, max_discards);
++DCC_INFO_GENERAL_RW_ATTR(max_discard_request);
++DCC_INFO_GENERAL_RW_ATTR(min_discard_issue_time);
++DCC_INFO_GENERAL_RW_ATTR(mid_discard_issue_time);
++DCC_INFO_GENERAL_RW_ATTR(max_discard_issue_time);
++DCC_INFO_GENERAL_RW_ATTR(discard_io_aware_gran);
++DCC_INFO_GENERAL_RW_ATTR(discard_urgent_util);
++DCC_INFO_GENERAL_RW_ATTR(discard_granularity);
++DCC_INFO_GENERAL_RW_ATTR(max_ordered_discard);
++
++/* NM_INFO ATTR */
++NM_INFO_RW_ATTR(max_roll_forward_node_blocks, max_rf_node_blocks);
++NM_INFO_GENERAL_RW_ATTR(ram_thresh);
++NM_INFO_GENERAL_RW_ATTR(ra_nid_pages);
++NM_INFO_GENERAL_RW_ATTR(dirty_nats_ratio);
++
++/* F2FS_SBI ATTR */
+ F2FS_RW_ATTR(F2FS_SBI, f2fs_super_block, extension_list, extension_list);
++F2FS_SBI_RW_ATTR(gc_idle, gc_mode);
++F2FS_SBI_RW_ATTR(gc_urgent, gc_mode);
++F2FS_SBI_RW_ATTR(cp_interval, interval_time[CP_TIME]);
++F2FS_SBI_RW_ATTR(idle_interval, interval_time[REQ_TIME]);
++F2FS_SBI_RW_ATTR(discard_idle_interval, interval_time[DISCARD_TIME]);
++F2FS_SBI_RW_ATTR(gc_idle_interval, interval_time[GC_TIME]);
++F2FS_SBI_RW_ATTR(umount_discard_timeout, interval_time[UMOUNT_DISCARD_TIMEOUT]);
++F2FS_SBI_RW_ATTR(gc_pin_file_thresh, gc_pin_file_threshold);
++F2FS_SBI_RW_ATTR(gc_reclaimed_segments, gc_reclaimed_segs);
++F2FS_SBI_GENERAL_RW_ATTR(max_victim_search);
++F2FS_SBI_GENERAL_RW_ATTR(migration_granularity);
++F2FS_SBI_GENERAL_RW_ATTR(dir_level);
++#ifdef CONFIG_F2FS_IOSTAT
++F2FS_SBI_GENERAL_RW_ATTR(iostat_enable);
++F2FS_SBI_GENERAL_RW_ATTR(iostat_period_ms);
++#endif
++F2FS_SBI_GENERAL_RW_ATTR(readdir_ra);
++F2FS_SBI_GENERAL_RW_ATTR(max_io_bytes);
++F2FS_SBI_GENERAL_RW_ATTR(data_io_flag);
++F2FS_SBI_GENERAL_RW_ATTR(node_io_flag);
++F2FS_SBI_GENERAL_RW_ATTR(gc_remaining_trials);
++F2FS_SBI_GENERAL_RW_ATTR(seq_file_ra_mul);
++F2FS_SBI_GENERAL_RW_ATTR(gc_segment_mode);
++F2FS_SBI_GENERAL_RW_ATTR(max_fragment_chunk);
++F2FS_SBI_GENERAL_RW_ATTR(max_fragment_hole);
++#ifdef CONFIG_F2FS_FS_COMPRESSION
++F2FS_SBI_GENERAL_RW_ATTR(compr_written_block);
++F2FS_SBI_GENERAL_RW_ATTR(compr_saved_block);
++F2FS_SBI_GENERAL_RW_ATTR(compr_new_inode);
++F2FS_SBI_GENERAL_RW_ATTR(compress_percent);
++F2FS_SBI_GENERAL_RW_ATTR(compress_watermark);
++#endif
++/* atomic write */
++F2FS_SBI_GENERAL_RO_ATTR(current_atomic_write);
++F2FS_SBI_GENERAL_RW_ATTR(peak_atomic_write);
++F2FS_SBI_GENERAL_RW_ATTR(committed_atomic_block);
++F2FS_SBI_GENERAL_RW_ATTR(revoked_atomic_block);
++/* block age extent cache */
++F2FS_SBI_GENERAL_RW_ATTR(hot_data_age_threshold);
++F2FS_SBI_GENERAL_RW_ATTR(warm_data_age_threshold);
++F2FS_SBI_GENERAL_RW_ATTR(last_age_weight);
++#ifdef CONFIG_BLK_DEV_ZONED
++F2FS_SBI_GENERAL_RO_ATTR(unusable_blocks_per_sec);
++#endif
++
++/* STAT_INFO ATTR */
++#ifdef CONFIG_F2FS_STAT_FS
++STAT_INFO_RO_ATTR(cp_foreground_calls, cp_call_count[FOREGROUND]);
++STAT_INFO_RO_ATTR(cp_background_calls, cp_call_count[BACKGROUND]);
++STAT_INFO_RO_ATTR(gc_foreground_calls, gc_call_count[FOREGROUND]);
++STAT_INFO_RO_ATTR(gc_background_calls, gc_call_count[BACKGROUND]);
++#endif
++
++/* FAULT_INFO ATTR */
+ #ifdef CONFIG_F2FS_FAULT_INJECTION
+-F2FS_RW_ATTR(FAULT_INFO_RATE, f2fs_fault_info, inject_rate, inject_rate);
+-F2FS_RW_ATTR(FAULT_INFO_TYPE, f2fs_fault_info, inject_type, inject_type);
++FAULT_INFO_GENERAL_RW_ATTR(FAULT_INFO_RATE, inject_rate);
++FAULT_INFO_GENERAL_RW_ATTR(FAULT_INFO_TYPE, inject_type);
+ #endif
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, data_io_flag, data_io_flag);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, node_io_flag, node_io_flag);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_remaining_trials, gc_remaining_trials);
+-F2FS_RW_ATTR(CPRC_INFO, ckpt_req_control, ckpt_thread_ioprio, ckpt_thread_ioprio);
++
++/* RESERVED_BLOCKS ATTR */
++RESERVED_BLOCKS_GENERAL_RW_ATTR(reserved_blocks);
++
++/* CPRC_INFO ATTR */
++CPRC_INFO_GENERAL_RW_ATTR(ckpt_thread_ioprio);
++
++/* ATGC_INFO ATTR */
++ATGC_INFO_RW_ATTR(atgc_candidate_ratio, candidate_ratio);
++ATGC_INFO_RW_ATTR(atgc_candidate_count, max_candidate_count);
++ATGC_INFO_RW_ATTR(atgc_age_weight, age_weight);
++ATGC_INFO_RW_ATTR(atgc_age_threshold, age_threshold);
++
+ F2FS_GENERAL_RO_ATTR(dirty_segments);
+ F2FS_GENERAL_RO_ATTR(free_segments);
+ F2FS_GENERAL_RO_ATTR(ovp_segments);
+@@ -917,10 +1019,6 @@ F2FS_GENERAL_RO_ATTR(main_blkaddr);
+ F2FS_GENERAL_RO_ATTR(pending_discard);
+ F2FS_GENERAL_RO_ATTR(gc_mode);
+ #ifdef CONFIG_F2FS_STAT_FS
+-F2FS_STAT_ATTR(STAT_INFO, f2fs_stat_info, cp_foreground_calls, cp_count);
+-F2FS_STAT_ATTR(STAT_INFO, f2fs_stat_info, cp_background_calls, bg_cp_count);
+-F2FS_STAT_ATTR(STAT_INFO, f2fs_stat_info, gc_foreground_calls, call_count);
+-F2FS_STAT_ATTR(STAT_INFO, f2fs_stat_info, gc_background_calls, bg_gc);
+ F2FS_GENERAL_RO_ATTR(moved_blocks_background);
+ F2FS_GENERAL_RO_ATTR(moved_blocks_foreground);
+ F2FS_GENERAL_RO_ATTR(avg_vblocks);
+@@ -935,8 +1033,6 @@ F2FS_FEATURE_RO_ATTR(encrypted_casefold);
+ #endif /* CONFIG_FS_ENCRYPTION */
+ #ifdef CONFIG_BLK_DEV_ZONED
+ F2FS_FEATURE_RO_ATTR(block_zoned);
+-F2FS_RO_ATTR(F2FS_SBI, f2fs_sb_info, unusable_blocks_per_sec,
+- unusable_blocks_per_sec);
+ #endif
+ F2FS_FEATURE_RO_ATTR(atomic_write);
+ F2FS_FEATURE_RO_ATTR(extra_attr);
+@@ -956,37 +1052,9 @@ F2FS_FEATURE_RO_ATTR(casefold);
+ F2FS_FEATURE_RO_ATTR(readonly);
+ #ifdef CONFIG_F2FS_FS_COMPRESSION
+ F2FS_FEATURE_RO_ATTR(compression);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, compr_written_block, compr_written_block);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, compr_saved_block, compr_saved_block);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, compr_new_inode, compr_new_inode);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, compress_percent, compress_percent);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, compress_watermark, compress_watermark);
+ #endif
+ F2FS_FEATURE_RO_ATTR(pin_file);
+
+-/* For ATGC */
+-F2FS_RW_ATTR(ATGC_INFO, atgc_management, atgc_candidate_ratio, candidate_ratio);
+-F2FS_RW_ATTR(ATGC_INFO, atgc_management, atgc_candidate_count, max_candidate_count);
+-F2FS_RW_ATTR(ATGC_INFO, atgc_management, atgc_age_weight, age_weight);
+-F2FS_RW_ATTR(ATGC_INFO, atgc_management, atgc_age_threshold, age_threshold);
+-
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, seq_file_ra_mul, seq_file_ra_mul);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_segment_mode, gc_segment_mode);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, gc_reclaimed_segments, gc_reclaimed_segs);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_fragment_chunk, max_fragment_chunk);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_fragment_hole, max_fragment_hole);
+-
+-/* For atomic write */
+-F2FS_RO_ATTR(F2FS_SBI, f2fs_sb_info, current_atomic_write, current_atomic_write);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, peak_atomic_write, peak_atomic_write);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, committed_atomic_block, committed_atomic_block);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, revoked_atomic_block, revoked_atomic_block);
+-
+-/* For block age extent cache */
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, hot_data_age_threshold, hot_data_age_threshold);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, warm_data_age_threshold, warm_data_age_threshold);
+-F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, last_age_weight, last_age_weight);
+-
+ #define ATTR_LIST(name) (&f2fs_attr_##name.attr)
+ static struct attribute *f2fs_attrs[] = {
+ ATTR_LIST(gc_urgent_sleep_time),
+diff --git a/fs/fs_context.c b/fs/fs_context.c
+index 851214d1d013d..375023e40161d 100644
+--- a/fs/fs_context.c
++++ b/fs/fs_context.c
+@@ -315,10 +315,31 @@ struct fs_context *fs_context_for_reconfigure(struct dentry *dentry,
+ }
+ EXPORT_SYMBOL(fs_context_for_reconfigure);
+
++/**
++ * fs_context_for_submount: allocate a new fs_context for a submount
++ * @type: file_system_type of the new context
++ * @reference: reference dentry from which to copy relevant info
++ *
++ * Allocate a new fs_context suitable for a submount. This also ensures that
++ * the fc->security object is inherited from @reference (if needed).
++ */
+ struct fs_context *fs_context_for_submount(struct file_system_type *type,
+ struct dentry *reference)
+ {
+- return alloc_fs_context(type, reference, 0, 0, FS_CONTEXT_FOR_SUBMOUNT);
++ struct fs_context *fc;
++ int ret;
++
++ fc = alloc_fs_context(type, reference, 0, 0, FS_CONTEXT_FOR_SUBMOUNT);
++ if (IS_ERR(fc))
++ return fc;
++
++ ret = security_fs_context_submount(fc, reference->d_sb);
++ if (ret) {
++ put_fs_context(fc);
++ return ERR_PTR(ret);
++ }
++
++ return fc;
+ }
+ EXPORT_SYMBOL(fs_context_for_submount);
+
+diff --git a/fs/fuse/file.c b/fs/fuse/file.c
+index 89d97f6188e05..acee575f9dc8b 100644
+--- a/fs/fuse/file.c
++++ b/fs/fuse/file.c
+@@ -19,7 +19,6 @@
+ #include <linux/uio.h>
+ #include <linux/fs.h>
+ #include <linux/filelock.h>
+-#include <linux/file.h>
+
+ static int fuse_send_open(struct fuse_mount *fm, u64 nodeid,
+ unsigned int open_flags, int opcode,
+@@ -479,36 +478,48 @@ static void fuse_sync_writes(struct inode *inode)
+ fuse_release_nowrite(inode);
+ }
+
+-struct fuse_flush_args {
+- struct fuse_args args;
+- struct fuse_flush_in inarg;
+- struct work_struct work;
+- struct file *file;
+-};
+-
+-static int fuse_do_flush(struct fuse_flush_args *fa)
++static int fuse_flush(struct file *file, fl_owner_t id)
+ {
+- int err;
+- struct inode *inode = file_inode(fa->file);
++ struct inode *inode = file_inode(file);
+ struct fuse_mount *fm = get_fuse_mount(inode);
++ struct fuse_file *ff = file->private_data;
++ struct fuse_flush_in inarg;
++ FUSE_ARGS(args);
++ int err;
++
++ if (fuse_is_bad(inode))
++ return -EIO;
++
++ if (ff->open_flags & FOPEN_NOFLUSH && !fm->fc->writeback_cache)
++ return 0;
+
+ err = write_inode_now(inode, 1);
+ if (err)
+- goto out;
++ return err;
+
+ inode_lock(inode);
+ fuse_sync_writes(inode);
+ inode_unlock(inode);
+
+- err = filemap_check_errors(fa->file->f_mapping);
++ err = filemap_check_errors(file->f_mapping);
+ if (err)
+- goto out;
++ return err;
+
+ err = 0;
+ if (fm->fc->no_flush)
+ goto inval_attr_out;
+
+- err = fuse_simple_request(fm, &fa->args);
++ memset(&inarg, 0, sizeof(inarg));
++ inarg.fh = ff->fh;
++ inarg.lock_owner = fuse_lock_owner_id(fm->fc, id);
++ args.opcode = FUSE_FLUSH;
++ args.nodeid = get_node_id(inode);
++ args.in_numargs = 1;
++ args.in_args[0].size = sizeof(inarg);
++ args.in_args[0].value = &inarg;
++ args.force = true;
++
++ err = fuse_simple_request(fm, &args);
+ if (err == -ENOSYS) {
+ fm->fc->no_flush = 1;
+ err = 0;
+@@ -521,57 +532,9 @@ inval_attr_out:
+ */
+ if (!err && fm->fc->writeback_cache)
+ fuse_invalidate_attr_mask(inode, STATX_BLOCKS);
+-
+-out:
+- fput(fa->file);
+- kfree(fa);
+ return err;
+ }
+
+-static void fuse_flush_async(struct work_struct *work)
+-{
+- struct fuse_flush_args *fa = container_of(work, typeof(*fa), work);
+-
+- fuse_do_flush(fa);
+-}
+-
+-static int fuse_flush(struct file *file, fl_owner_t id)
+-{
+- struct fuse_flush_args *fa;
+- struct inode *inode = file_inode(file);
+- struct fuse_mount *fm = get_fuse_mount(inode);
+- struct fuse_file *ff = file->private_data;
+-
+- if (fuse_is_bad(inode))
+- return -EIO;
+-
+- if (ff->open_flags & FOPEN_NOFLUSH && !fm->fc->writeback_cache)
+- return 0;
+-
+- fa = kzalloc(sizeof(*fa), GFP_KERNEL);
+- if (!fa)
+- return -ENOMEM;
+-
+- fa->inarg.fh = ff->fh;
+- fa->inarg.lock_owner = fuse_lock_owner_id(fm->fc, id);
+- fa->args.opcode = FUSE_FLUSH;
+- fa->args.nodeid = get_node_id(inode);
+- fa->args.in_numargs = 1;
+- fa->args.in_args[0].size = sizeof(fa->inarg);
+- fa->args.in_args[0].value = &fa->inarg;
+- fa->args.force = true;
+- fa->file = get_file(file);
+-
+- /* Don't wait if the task is exiting */
+- if (current->flags & PF_EXITING) {
+- INIT_WORK(&fa->work, fuse_flush_async);
+- schedule_work(&fa->work);
+- return 0;
+- }
+-
+- return fuse_do_flush(fa);
+-}
+-
+ int fuse_fsync_common(struct file *file, loff_t start, loff_t end,
+ int datasync, int opcode)
+ {
+diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
+index 063133ec77f49..08ee293c4117c 100644
+--- a/fs/iomap/buffered-io.c
++++ b/fs/iomap/buffered-io.c
+@@ -508,11 +508,6 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
+ WARN_ON_ONCE(folio_test_writeback(folio));
+ folio_cancel_dirty(folio);
+ iomap_page_release(folio);
+- } else if (folio_test_large(folio)) {
+- /* Must release the iop so the page can be split */
+- WARN_ON_ONCE(!folio_test_uptodate(folio) &&
+- folio_test_dirty(folio));
+- iomap_page_release(folio);
+ }
+ }
+ EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
+diff --git a/fs/jfs/jfs_extent.c b/fs/jfs/jfs_extent.c
+index ae99a7e232eeb..a82751e6c47f9 100644
+--- a/fs/jfs/jfs_extent.c
++++ b/fs/jfs/jfs_extent.c
+@@ -311,6 +311,11 @@ extBalloc(struct inode *ip, s64 hint, s64 * nblocks, s64 * blkno)
+ * blocks in the map. in that case, we'll start off with the
+ * maximum free.
+ */
++
++ /* give up if no space left */
++ if (bmp->db_maxfreebud == -1)
++ return -ENOSPC;
++
+ max = (s64) 1 << bmp->db_maxfreebud;
+ if (*nblocks >= max && *nblocks > nbperpage)
+ nb = nblks = (max > nbperpage) ? max : nbperpage;
+diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
+index 1d9488cf05348..87a0f207df0b9 100644
+--- a/fs/lockd/mon.c
++++ b/fs/lockd/mon.c
+@@ -276,6 +276,9 @@ static struct nsm_handle *nsm_create_handle(const struct sockaddr *sap,
+ {
+ struct nsm_handle *new;
+
++ if (!hostname)
++ return NULL;
++
+ new = kzalloc(sizeof(*new) + hostname_len + 1, GFP_KERNEL);
+ if (unlikely(new == NULL))
+ return NULL;
+diff --git a/fs/namei.c b/fs/namei.c
+index 7e5cb92feab3f..e18c8c9f1d9c6 100644
+--- a/fs/namei.c
++++ b/fs/namei.c
+@@ -2890,7 +2890,7 @@ int path_pts(struct path *path)
+ dput(path->dentry);
+ path->dentry = parent;
+ child = d_hash_and_lookup(parent, &this);
+- if (!child)
++ if (IS_ERR_OR_NULL(child))
+ return -ENOENT;
+
+ path->dentry = child;
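[Illustrative aside, not part of the patch] The fs/namei.c hunk above depends on the kernel's ERR_PTR convention: d_hash_and_lookup() can return either NULL or an errno encoded in the pointer (when the filesystem's ->d_hash hook fails), so a bare NULL test lets error pointers through. A minimal sketch of the pattern; lookup_child() is a hypothetical helper, not kernel code:

    #include <linux/err.h>
    #include <linux/dcache.h>

    static struct dentry *lookup_child(struct dentry *parent, struct qstr *name)
    {
            struct dentry *child = d_hash_and_lookup(parent, name);

            /* IS_ERR_OR_NULL() covers both NULL and ERR_PTR(-E...) */
            if (IS_ERR_OR_NULL(child))
                    return NULL;    /* caller maps this to -ENOENT */
            return child;
    }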
+diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c
+index fea5f8821da5e..ce2ea62397972 100644
+--- a/fs/nfs/blocklayout/dev.c
++++ b/fs/nfs/blocklayout/dev.c
+@@ -402,7 +402,7 @@ bl_parse_concat(struct nfs_server *server, struct pnfs_block_dev *d,
+ int ret, i;
+
+ d->children = kcalloc(v->concat.volumes_count,
+- sizeof(struct pnfs_block_dev), GFP_KERNEL);
++ sizeof(struct pnfs_block_dev), gfp_mask);
+ if (!d->children)
+ return -ENOMEM;
+
+@@ -431,7 +431,7 @@ bl_parse_stripe(struct nfs_server *server, struct pnfs_block_dev *d,
+ int ret, i;
+
+ d->children = kcalloc(v->stripe.volumes_count,
+- sizeof(struct pnfs_block_dev), GFP_KERNEL);
++ sizeof(struct pnfs_block_dev), gfp_mask);
+ if (!d->children)
+ return -ENOMEM;
+
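[Illustrative aside, not part of the patch] The two fs/nfs/blocklayout/dev.c hunks above replace a hard-coded GFP_KERNEL with the gfp_mask the caller already passes down, so the caller's reclaim restrictions (GFP_NOFS and friends) are honoured. A hedged sketch of the idea only; alloc_children() is an illustrative helper, not the NFS code:

    static int alloc_children(struct pnfs_block_dev *d, unsigned int count,
                              gfp_t gfp_mask)
    {
            /* forward the caller's mask instead of assuming GFP_KERNEL */
            d->children = kcalloc(count, sizeof(*d->children), gfp_mask);
            return d->children ? 0 : -ENOMEM;
    }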
+diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
+index 3cc027d3bd588..1607c23f68d41 100644
+--- a/fs/nfs/internal.h
++++ b/fs/nfs/internal.h
+@@ -489,6 +489,7 @@ extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
+ extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
+ struct inode *inode, bool force_mds,
+ const struct nfs_pgio_completion_ops *compl_ops);
++extern bool nfs_read_alloc_scratch(struct nfs_pgio_header *hdr, size_t size);
+ extern int nfs_read_add_folio(struct nfs_pageio_descriptor *pgio,
+ struct nfs_open_context *ctx,
+ struct folio *folio);
+diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
+index 05c3b4b2b3dd8..c190938142960 100644
+--- a/fs/nfs/nfs2xdr.c
++++ b/fs/nfs/nfs2xdr.c
+@@ -949,7 +949,7 @@ int nfs2_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry,
+
+ error = decode_filename_inline(xdr, &entry->name, &entry->len);
+ if (unlikely(error))
+- return -EAGAIN;
++ return error == -ENAMETOOLONG ? -ENAMETOOLONG : -EAGAIN;
+
+ /*
+ * The type (size and byte order) of nfscookie isn't defined in
+diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
+index 3b0b650c9c5ab..60f032be805ae 100644
+--- a/fs/nfs/nfs3xdr.c
++++ b/fs/nfs/nfs3xdr.c
+@@ -1991,7 +1991,7 @@ int nfs3_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry,
+
+ error = decode_inline_filename3(xdr, &entry->name, &entry->len);
+ if (unlikely(error))
+- return -EAGAIN;
++ return error == -ENAMETOOLONG ? -ENAMETOOLONG : -EAGAIN;
+
+ error = decode_cookie3(xdr, &new_cookie);
+ if (unlikely(error))
+diff --git a/fs/nfs/nfs42.h b/fs/nfs/nfs42.h
+index 0fe5aacbcfdf1..b59876b01a1e3 100644
+--- a/fs/nfs/nfs42.h
++++ b/fs/nfs/nfs42.h
+@@ -13,6 +13,7 @@
+ * more? Need to consider not to pre-alloc too much for a compound.
+ */
+ #define PNFS_LAYOUTSTATS_MAXDEV (4)
++#define READ_PLUS_SCRATCH_SIZE (16)
+
+ /* nfs4.2proc.c */
+ #ifdef CONFIG_NFS_V4_2
+diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
+index 5d7e0511f3513..d5ec3d5568da5 100644
+--- a/fs/nfs/nfs42proc.c
++++ b/fs/nfs/nfs42proc.c
+@@ -471,8 +471,9 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
+ continue;
+ }
+ break;
+- } else if (err == -NFS4ERR_OFFLOAD_NO_REQS && !args.sync) {
+- args.sync = true;
++ } else if (err == -NFS4ERR_OFFLOAD_NO_REQS &&
++ args.sync != res.synchronous) {
++ args.sync = res.synchronous;
+ dst_exception.retry = 1;
+ continue;
+ } else if ((err == -ESTALE ||
+diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
+index a6df815a140c7..20aa5e746497d 100644
+--- a/fs/nfs/nfs42xdr.c
++++ b/fs/nfs/nfs42xdr.c
+@@ -51,10 +51,16 @@
+ (1 /* data_content4 */ + \
+ 2 /* data_info4.di_offset */ + \
+ 1 /* data_info4.di_length */)
++#define NFS42_READ_PLUS_HOLE_SEGMENT_SIZE \
++ (1 /* data_content4 */ + \
++ 2 /* data_info4.di_offset */ + \
++ 2 /* data_info4.di_length */)
++#define READ_PLUS_SEGMENT_SIZE_DIFF (NFS42_READ_PLUS_HOLE_SEGMENT_SIZE - \
++ NFS42_READ_PLUS_DATA_SEGMENT_SIZE)
+ #define decode_read_plus_maxsz (op_decode_hdr_maxsz + \
+ 1 /* rpr_eof */ + \
+ 1 /* rpr_contents count */ + \
+- NFS42_READ_PLUS_DATA_SEGMENT_SIZE)
++ NFS42_READ_PLUS_HOLE_SEGMENT_SIZE)
+ #define encode_seek_maxsz (op_encode_hdr_maxsz + \
+ encode_stateid_maxsz + \
+ 2 /* offset */ + \
+@@ -781,8 +787,8 @@ static void nfs4_xdr_enc_read_plus(struct rpc_rqst *req,
+ encode_putfh(xdr, args->fh, &hdr);
+ encode_read_plus(xdr, args, &hdr);
+
+- rpc_prepare_reply_pages(req, args->pages, args->pgbase,
+- args->count, hdr.replen);
++ rpc_prepare_reply_pages(req, args->pages, args->pgbase, args->count,
++ hdr.replen - READ_PLUS_SEGMENT_SIZE_DIFF);
+ encode_nops(&hdr);
+ }
+
+@@ -1136,13 +1142,12 @@ static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
+ res->eof = be32_to_cpup(p++);
+ segments = be32_to_cpup(p++);
+ if (segments == 0)
+- return status;
++ return 0;
+
+ segs = kmalloc_array(segments, sizeof(*segs), GFP_KERNEL);
+ if (!segs)
+ return -ENOMEM;
+
+- status = -EIO;
+ for (i = 0; i < segments; i++) {
+ status = decode_read_plus_segment(xdr, &segs[i]);
+ if (status < 0)
+@@ -1346,7 +1351,7 @@ static int nfs4_xdr_dec_read_plus(struct rpc_rqst *rqstp,
+ struct compound_hdr hdr;
+ int status;
+
+- xdr_set_scratch_buffer(xdr, res->scratch, sizeof(res->scratch));
++ xdr_set_scratch_buffer(xdr, res->scratch, READ_PLUS_SCRATCH_SIZE);
+
+ status = decode_compound_hdr(xdr, &hdr);
+ if (status)
+diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
+index fd752e0c4ec24..43d458965330f 100644
+--- a/fs/nfs/nfs4proc.c
++++ b/fs/nfs/nfs4proc.c
+@@ -5438,18 +5438,8 @@ static bool nfs4_read_plus_not_supported(struct rpc_task *task,
+ return false;
+ }
+
+-static inline void nfs4_read_plus_scratch_free(struct nfs_pgio_header *hdr)
+-{
+- if (hdr->res.scratch) {
+- kfree(hdr->res.scratch);
+- hdr->res.scratch = NULL;
+- }
+-}
+-
+ static int nfs4_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
+ {
+- nfs4_read_plus_scratch_free(hdr);
+-
+ if (!nfs4_sequence_done(task, &hdr->res.seq_res))
+ return -EAGAIN;
+ if (nfs4_read_stateid_changed(task, &hdr->args))
+@@ -5469,8 +5459,7 @@ static bool nfs42_read_plus_support(struct nfs_pgio_header *hdr,
+ /* Note: We don't use READ_PLUS with pNFS yet */
+ if (nfs_server_capable(hdr->inode, NFS_CAP_READ_PLUS) && !hdr->ds_clp) {
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ_PLUS];
+- hdr->res.scratch = kmalloc(32, GFP_KERNEL);
+- return hdr->res.scratch != NULL;
++ return nfs_read_alloc_scratch(hdr, READ_PLUS_SCRATCH_SIZE);
+ }
+ return false;
+ }
+diff --git a/fs/nfs/pnfs_nfs.c b/fs/nfs/pnfs_nfs.c
+index a0112ad4937aa..2e14ce2f82191 100644
+--- a/fs/nfs/pnfs_nfs.c
++++ b/fs/nfs/pnfs_nfs.c
+@@ -943,7 +943,7 @@ static int _nfs4_pnfs_v4_ds_connect(struct nfs_server *mds_srv,
+ * Test this address for session trunking and
+ * add as an alias
+ */
+- xprtdata.cred = nfs4_get_clid_cred(clp),
++ xprtdata.cred = nfs4_get_clid_cred(clp);
+ rpc_clnt_add_xprt(clp->cl_rpcclient, &xprt_args,
+ rpc_clnt_setup_test_and_add_xprt,
+ &rpcdata);
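[Illustrative aside, not part of the patch] The one-character fs/nfs/pnfs_nfs.c change above replaces a stray comma operator with a semicolon. Here the comma appears to be harmless, since both expressions still execute, but the construct is easy to misread and behaves differently under a conditional. A small standalone C illustration, not kernel code:

    #include <stdio.h>

    static int f(void) { return 1; }
    static int g(void) { puts("g() ran"); return 2; }

    int main(void)
    {
            int x = 0;

            if (0)
                    x = f(),        /* comma: g() is also guarded by the if */
                    g();            /* nothing prints */

            if (0)
                    x = f();        /* semicolon: the next call is unconditional */
            g();                    /* "g() ran" prints */

            printf("x=%d\n", x);
            return 0;
    }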
+diff --git a/fs/nfs/read.c b/fs/nfs/read.c
+index f71eeee67e201..7dc21a48e3e7b 100644
+--- a/fs/nfs/read.c
++++ b/fs/nfs/read.c
+@@ -47,6 +47,8 @@ static struct nfs_pgio_header *nfs_readhdr_alloc(void)
+
+ static void nfs_readhdr_free(struct nfs_pgio_header *rhdr)
+ {
++ if (rhdr->res.scratch != NULL)
++ kfree(rhdr->res.scratch);
+ kmem_cache_free(nfs_rdata_cachep, rhdr);
+ }
+
+@@ -108,6 +110,14 @@ void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio)
+ }
+ EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);
+
++bool nfs_read_alloc_scratch(struct nfs_pgio_header *hdr, size_t size)
++{
++ WARN_ON(hdr->res.scratch != NULL);
++ hdr->res.scratch = kmalloc(size, GFP_KERNEL);
++ return hdr->res.scratch != NULL;
++}
++EXPORT_SYMBOL_GPL(nfs_read_alloc_scratch);
++
+ static void nfs_readpage_release(struct nfs_page *req, int error)
+ {
+ struct folio *folio = nfs_page_to_folio(req);
+diff --git a/fs/nfsd/blocklayoutxdr.c b/fs/nfsd/blocklayoutxdr.c
+index 8e9c1a0f8d380..1ed2f691ebb90 100644
+--- a/fs/nfsd/blocklayoutxdr.c
++++ b/fs/nfsd/blocklayoutxdr.c
+@@ -83,6 +83,15 @@ nfsd4_block_encode_getdeviceinfo(struct xdr_stream *xdr,
+ int len = sizeof(__be32), ret, i;
+ __be32 *p;
+
++ /*
++ * See paragraph 5 of RFC 8881 S18.40.3.
++ */
++ if (!gdp->gd_maxcount) {
++ if (xdr_stream_encode_u32(xdr, 0) != XDR_UNIT)
++ return nfserr_resource;
++ return nfs_ok;
++ }
++
+ p = xdr_reserve_space(xdr, len + sizeof(__be32));
+ if (!p)
+ return nfserr_resource;
+diff --git a/fs/nfsd/flexfilelayoutxdr.c b/fs/nfsd/flexfilelayoutxdr.c
+index e81d2a5cf381e..bb205328e043d 100644
+--- a/fs/nfsd/flexfilelayoutxdr.c
++++ b/fs/nfsd/flexfilelayoutxdr.c
+@@ -85,6 +85,15 @@ nfsd4_ff_encode_getdeviceinfo(struct xdr_stream *xdr,
+ int addr_len;
+ __be32 *p;
+
++ /*
++ * See paragraph 5 of RFC 8881 S18.40.3.
++ */
++ if (!gdp->gd_maxcount) {
++ if (xdr_stream_encode_u32(xdr, 0) != XDR_UNIT)
++ return nfserr_resource;
++ return nfs_ok;
++ }
++
+ /* len + padding for two strings */
+ addr_len = 16 + da->netaddr.netid_len + da->netaddr.addr_len;
+ ver_len = 20;
+diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
+index ee1a24debd60c..679d9be4abd4d 100644
+--- a/fs/nfsd/nfs4xdr.c
++++ b/fs/nfsd/nfs4xdr.c
+@@ -4681,20 +4681,17 @@ nfsd4_encode_getdeviceinfo(struct nfsd4_compoundres *resp, __be32 nfserr,
+
+ *p++ = cpu_to_be32(gdev->gd_layout_type);
+
+- /* If maxcount is 0 then just update notifications */
+- if (gdev->gd_maxcount != 0) {
+- ops = nfsd4_layout_ops[gdev->gd_layout_type];
+- nfserr = ops->encode_getdeviceinfo(xdr, gdev);
+- if (nfserr) {
+- /*
+- * We don't bother to burden the layout drivers with
+- * enforcing gd_maxcount, just tell the client to
+- * come back with a bigger buffer if it's not enough.
+- */
+- if (xdr->buf->len + 4 > gdev->gd_maxcount)
+- goto toosmall;
+- return nfserr;
+- }
++ ops = nfsd4_layout_ops[gdev->gd_layout_type];
++ nfserr = ops->encode_getdeviceinfo(xdr, gdev);
++ if (nfserr) {
++ /*
++ * We don't bother to burden the layout drivers with
++ * enforcing gd_maxcount, just tell the client to
++ * come back with a bigger buffer if it's not enough.
++ */
++ if (xdr->buf->len + 4 > gdev->gd_maxcount)
++ goto toosmall;
++ return nfserr;
+ }
+
+ if (gdev->gd_notify_types) {
+diff --git a/fs/nls/nls_base.c b/fs/nls/nls_base.c
+index 52ccd34b1e792..a026dbd3593f6 100644
+--- a/fs/nls/nls_base.c
++++ b/fs/nls/nls_base.c
+@@ -272,7 +272,7 @@ int unregister_nls(struct nls_table * nls)
+ return -EINVAL;
+ }
+
+-static struct nls_table *find_nls(char *charset)
++static struct nls_table *find_nls(const char *charset)
+ {
+ struct nls_table *nls;
+ spin_lock(&nls_lock);
+@@ -288,7 +288,7 @@ static struct nls_table *find_nls(char *charset)
+ return nls;
+ }
+
+-struct nls_table *load_nls(char *charset)
++struct nls_table *load_nls(const char *charset)
+ {
+ return try_then_request_module(find_nls(charset), "nls_%s", charset);
+ }
+diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
+index 17c52225b87d4..03bccfd183f3c 100644
+--- a/fs/ocfs2/namei.c
++++ b/fs/ocfs2/namei.c
+@@ -1535,6 +1535,10 @@ static int ocfs2_rename(struct mnt_idmap *idmap,
+ status = ocfs2_add_entry(handle, new_dentry, old_inode,
+ OCFS2_I(old_inode)->ip_blkno,
+ new_dir_bh, &target_insert);
++ if (status < 0) {
++ mlog_errno(status);
++ goto bail;
++ }
+ }
+
+ old_inode->i_ctime = current_time(old_inode);
+diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
+index ae1058fbfb5b2..8c60da7b4afd8 100644
+--- a/fs/overlayfs/super.c
++++ b/fs/overlayfs/super.c
+@@ -2052,7 +2052,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
+ ovl_trusted_xattr_handlers;
+ sb->s_fs_info = ofs;
+ sb->s_flags |= SB_POSIXACL;
+- sb->s_iflags |= SB_I_SKIP_SYNC;
++ sb->s_iflags |= SB_I_SKIP_SYNC | SB_I_IMA_UNVERIFIABLE_SIGNATURE;
+
+ err = -ENOMEM;
+ root_dentry = ovl_get_root(sb, upperpath.dentry, oe);
+diff --git a/fs/proc/base.c b/fs/proc/base.c
+index 05452c3b9872b..7394229816f37 100644
+--- a/fs/proc/base.c
++++ b/fs/proc/base.c
+@@ -3583,7 +3583,8 @@ static int proc_tid_comm_permission(struct mnt_idmap *idmap,
+ }
+
+ static const struct inode_operations proc_tid_comm_inode_operations = {
+- .permission = proc_tid_comm_permission,
++ .setattr = proc_setattr,
++ .permission = proc_tid_comm_permission,
+ };
+
+ /*
+diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
+index 85aaf0fc6d7d1..eb6df190d7523 100644
+--- a/fs/pstore/ram_core.c
++++ b/fs/pstore/ram_core.c
+@@ -519,7 +519,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
+ sig ^= PERSISTENT_RAM_SIG;
+
+ if (prz->buffer->sig == sig) {
+- if (buffer_size(prz) == 0) {
++ if (buffer_size(prz) == 0 && buffer_start(prz) == 0) {
+ pr_debug("found existing empty buffer\n");
+ return 0;
+ }
+diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
+index e3e4f40476579..c7afe433d991a 100644
+--- a/fs/quota/dquot.c
++++ b/fs/quota/dquot.c
+@@ -225,13 +225,22 @@ static void put_quota_format(struct quota_format_type *fmt)
+
+ /*
+ * Dquot List Management:
+- * The quota code uses four lists for dquot management: the inuse_list,
+- * free_dquots, dqi_dirty_list, and dquot_hash[] array. A single dquot
+- * structure may be on some of those lists, depending on its current state.
++ * The quota code uses five lists for dquot management: the inuse_list,
++ * releasing_dquots, free_dquots, dqi_dirty_list, and dquot_hash[] array.
++ * A single dquot structure may be on some of those lists, depending on
++ * its current state.
+ *
+ * All dquots are placed to the end of inuse_list when first created, and this
+ * list is used for invalidate operation, which must look at every dquot.
+ *
++ * When the last reference of a dquot will be dropped, the dquot will be
++ * added to releasing_dquots. We'd then queue work item which would call
++ * synchronize_srcu() and after that perform the final cleanup of all the
++ * dquots on the list. Both releasing_dquots and free_dquots use the
++ * dq_free list_head in the dquot struct. When a dquot is removed from
++ * releasing_dquots, a reference count is always subtracted, and if
++ * dq_count == 0 at that point, the dquot will be added to the free_dquots.
++ *
+ * Unused dquots (dq_count == 0) are added to the free_dquots list when freed,
+ * and this list is searched whenever we need an available dquot. Dquots are
+ * removed from the list as soon as they are used again, and
+@@ -250,6 +259,7 @@ static void put_quota_format(struct quota_format_type *fmt)
+
+ static LIST_HEAD(inuse_list);
+ static LIST_HEAD(free_dquots);
++static LIST_HEAD(releasing_dquots);
+ static unsigned int dq_hash_bits, dq_hash_mask;
+ static struct hlist_head *dquot_hash;
+
+@@ -260,6 +270,9 @@ static qsize_t inode_get_rsv_space(struct inode *inode);
+ static qsize_t __inode_get_rsv_space(struct inode *inode);
+ static int __dquot_initialize(struct inode *inode, int type);
+
++static void quota_release_workfn(struct work_struct *work);
++static DECLARE_DELAYED_WORK(quota_release_work, quota_release_workfn);
++
+ static inline unsigned int
+ hashfn(const struct super_block *sb, struct kqid qid)
+ {
+@@ -305,12 +318,18 @@ static inline void put_dquot_last(struct dquot *dquot)
+ dqstats_inc(DQST_FREE_DQUOTS);
+ }
+
++static inline void put_releasing_dquots(struct dquot *dquot)
++{
++ list_add_tail(&dquot->dq_free, &releasing_dquots);
++}
++
+ static inline void remove_free_dquot(struct dquot *dquot)
+ {
+ if (list_empty(&dquot->dq_free))
+ return;
+ list_del_init(&dquot->dq_free);
+- dqstats_dec(DQST_FREE_DQUOTS);
++ if (!atomic_read(&dquot->dq_count))
++ dqstats_dec(DQST_FREE_DQUOTS);
+ }
+
+ static inline void put_inuse(struct dquot *dquot)
+@@ -336,6 +355,11 @@ static void wait_on_dquot(struct dquot *dquot)
+ mutex_unlock(&dquot->dq_lock);
+ }
+
++static inline int dquot_active(struct dquot *dquot)
++{
++ return test_bit(DQ_ACTIVE_B, &dquot->dq_flags);
++}
++
+ static inline int dquot_dirty(struct dquot *dquot)
+ {
+ return test_bit(DQ_MOD_B, &dquot->dq_flags);
+@@ -351,14 +375,14 @@ int dquot_mark_dquot_dirty(struct dquot *dquot)
+ {
+ int ret = 1;
+
+- if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
++ if (!dquot_active(dquot))
+ return 0;
+
+ if (sb_dqopt(dquot->dq_sb)->flags & DQUOT_NOLIST_DIRTY)
+ return test_and_set_bit(DQ_MOD_B, &dquot->dq_flags);
+
+ /* If quota is dirty already, we don't have to acquire dq_list_lock */
+- if (test_bit(DQ_MOD_B, &dquot->dq_flags))
++ if (dquot_dirty(dquot))
+ return 1;
+
+ spin_lock(&dq_list_lock);
+@@ -440,7 +464,7 @@ int dquot_acquire(struct dquot *dquot)
+ smp_mb__before_atomic();
+ set_bit(DQ_READ_B, &dquot->dq_flags);
+ /* Instantiate dquot if needed */
+- if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && !dquot->dq_off) {
++ if (!dquot_active(dquot) && !dquot->dq_off) {
+ ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
+ /* Write the info if needed */
+ if (info_dirty(&dqopt->info[dquot->dq_id.type])) {
+@@ -482,7 +506,7 @@ int dquot_commit(struct dquot *dquot)
+ goto out_lock;
+ /* Inactive dquot can be only if there was error during read/init
+ * => we have better not writing it */
+- if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
++ if (dquot_active(dquot))
+ ret = dqopt->ops[dquot->dq_id.type]->commit_dqblk(dquot);
+ else
+ ret = -EIO;
+@@ -547,6 +571,8 @@ static void invalidate_dquots(struct super_block *sb, int type)
+ struct dquot *dquot, *tmp;
+
+ restart:
++ flush_delayed_work(&quota_release_work);
++
+ spin_lock(&dq_list_lock);
+ list_for_each_entry_safe(dquot, tmp, &inuse_list, dq_inuse) {
+ if (dquot->dq_sb != sb)
+@@ -555,6 +581,12 @@ restart:
+ continue;
+ /* Wait for dquot users */
+ if (atomic_read(&dquot->dq_count)) {
++ /* dquot in releasing_dquots, flush and retry */
++ if (!list_empty(&dquot->dq_free)) {
++ spin_unlock(&dq_list_lock);
++ goto restart;
++ }
++
+ atomic_inc(&dquot->dq_count);
+ spin_unlock(&dq_list_lock);
+ /*
+@@ -597,7 +629,7 @@ int dquot_scan_active(struct super_block *sb,
+
+ spin_lock(&dq_list_lock);
+ list_for_each_entry(dquot, &inuse_list, dq_inuse) {
+- if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
++ if (!dquot_active(dquot))
+ continue;
+ if (dquot->dq_sb != sb)
+ continue;
+@@ -612,7 +644,7 @@ int dquot_scan_active(struct super_block *sb,
+ * outstanding call and recheck the DQ_ACTIVE_B after that.
+ */
+ wait_on_dquot(dquot);
+- if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
++ if (dquot_active(dquot)) {
+ ret = fn(dquot, priv);
+ if (ret < 0)
+ goto out;
+@@ -628,6 +660,18 @@ out:
+ }
+ EXPORT_SYMBOL(dquot_scan_active);
+
++static inline int dquot_write_dquot(struct dquot *dquot)
++{
++ int ret = dquot->dq_sb->dq_op->write_dquot(dquot);
++ if (ret < 0) {
++ quota_error(dquot->dq_sb, "Can't write quota structure "
++ "(error %d). Quota may get out of sync!", ret);
++ /* Clear dirty bit anyway to avoid infinite loop. */
++ clear_dquot_dirty(dquot);
++ }
++ return ret;
++}
++
+ /* Write all dquot structures to quota files */
+ int dquot_writeback_dquots(struct super_block *sb, int type)
+ {
+@@ -651,23 +695,16 @@ int dquot_writeback_dquots(struct super_block *sb, int type)
+ dquot = list_first_entry(&dirty, struct dquot,
+ dq_dirty);
+
+- WARN_ON(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags));
++ WARN_ON(!dquot_active(dquot));
+
+ /* Now we have active dquot from which someone is
+ * holding reference so we can safely just increase
+ * use count */
+ dqgrab(dquot);
+ spin_unlock(&dq_list_lock);
+- err = sb->dq_op->write_dquot(dquot);
+- if (err) {
+- /*
+- * Clear dirty bit anyway to avoid infinite
+- * loop here.
+- */
+- clear_dquot_dirty(dquot);
+- if (!ret)
+- ret = err;
+- }
++ err = dquot_write_dquot(dquot);
++ if (err && !ret)
++ ret = err;
+ dqput(dquot);
+ spin_lock(&dq_list_lock);
+ }
+@@ -760,13 +797,54 @@ static struct shrinker dqcache_shrinker = {
+ .seeks = DEFAULT_SEEKS,
+ };
+
++/*
++ * Safely release dquot and put reference to dquot.
++ */
++static void quota_release_workfn(struct work_struct *work)
++{
++ struct dquot *dquot;
++ struct list_head rls_head;
++
++ spin_lock(&dq_list_lock);
++ /* Exchange the list head to avoid livelock. */
++ list_replace_init(&releasing_dquots, &rls_head);
++ spin_unlock(&dq_list_lock);
++
++restart:
++ synchronize_srcu(&dquot_srcu);
++ spin_lock(&dq_list_lock);
++ while (!list_empty(&rls_head)) {
++ dquot = list_first_entry(&rls_head, struct dquot, dq_free);
++ /* Dquot got used again? */
++ if (atomic_read(&dquot->dq_count) > 1) {
++ remove_free_dquot(dquot);
++ atomic_dec(&dquot->dq_count);
++ continue;
++ }
++ if (dquot_dirty(dquot)) {
++ spin_unlock(&dq_list_lock);
++ /* Commit dquot before releasing */
++ dquot_write_dquot(dquot);
++ goto restart;
++ }
++ if (dquot_active(dquot)) {
++ spin_unlock(&dq_list_lock);
++ dquot->dq_sb->dq_op->release_dquot(dquot);
++ goto restart;
++ }
++ /* Dquot is inactive and clean, now move it to free list */
++ remove_free_dquot(dquot);
++ atomic_dec(&dquot->dq_count);
++ put_dquot_last(dquot);
++ }
++ spin_unlock(&dq_list_lock);
++}
++
+ /*
+ * Put reference to dquot
+ */
+ void dqput(struct dquot *dquot)
+ {
+- int ret;
+-
+ if (!dquot)
+ return;
+ #ifdef CONFIG_QUOTA_DEBUG
+@@ -778,7 +856,7 @@ void dqput(struct dquot *dquot)
+ }
+ #endif
+ dqstats_inc(DQST_DROPS);
+-we_slept:
++
+ spin_lock(&dq_list_lock);
+ if (atomic_read(&dquot->dq_count) > 1) {
+ /* We have more than one user... nothing to do */
+@@ -790,35 +868,15 @@ we_slept:
+ spin_unlock(&dq_list_lock);
+ return;
+ }
++
+ /* Need to release dquot? */
+- if (dquot_dirty(dquot)) {
+- spin_unlock(&dq_list_lock);
+- /* Commit dquot before releasing */
+- ret = dquot->dq_sb->dq_op->write_dquot(dquot);
+- if (ret < 0) {
+- quota_error(dquot->dq_sb, "Can't write quota structure"
+- " (error %d). Quota may get out of sync!",
+- ret);
+- /*
+- * We clear dirty bit anyway, so that we avoid
+- * infinite loop here
+- */
+- clear_dquot_dirty(dquot);
+- }
+- goto we_slept;
+- }
+- if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
+- spin_unlock(&dq_list_lock);
+- dquot->dq_sb->dq_op->release_dquot(dquot);
+- goto we_slept;
+- }
+- atomic_dec(&dquot->dq_count);
+ #ifdef CONFIG_QUOTA_DEBUG
+ /* sanity check */
+ BUG_ON(!list_empty(&dquot->dq_free));
+ #endif
+- put_dquot_last(dquot);
++ put_releasing_dquots(dquot);
+ spin_unlock(&dq_list_lock);
++ queue_delayed_work(system_unbound_wq, &quota_release_work, 1);
+ }
+ EXPORT_SYMBOL(dqput);
+
+@@ -908,7 +966,7 @@ we_slept:
+ * already finished or it will be canceled due to dq_count > 1 test */
+ wait_on_dquot(dquot);
+ /* Read the dquot / allocate space in quota file */
+- if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
++ if (!dquot_active(dquot)) {
+ int err;
+
+ err = sb->dq_op->acquire_dquot(dquot);
+@@ -1425,7 +1483,7 @@ static int info_bdq_free(struct dquot *dquot, qsize_t space)
+ return QUOTA_NL_NOWARN;
+ }
+
+-static int dquot_active(const struct inode *inode)
++static int inode_quota_active(const struct inode *inode)
+ {
+ struct super_block *sb = inode->i_sb;
+
+@@ -1448,7 +1506,7 @@ static int __dquot_initialize(struct inode *inode, int type)
+ qsize_t rsv;
+ int ret = 0;
+
+- if (!dquot_active(inode))
++ if (!inode_quota_active(inode))
+ return 0;
+
+ dquots = i_dquot(inode);
+@@ -1556,7 +1614,7 @@ bool dquot_initialize_needed(struct inode *inode)
+ struct dquot **dquots;
+ int i;
+
+- if (!dquot_active(inode))
++ if (!inode_quota_active(inode))
+ return false;
+
+ dquots = i_dquot(inode);
+@@ -1667,7 +1725,7 @@ int __dquot_alloc_space(struct inode *inode, qsize_t number, int flags)
+ int reserve = flags & DQUOT_SPACE_RESERVE;
+ struct dquot **dquots;
+
+- if (!dquot_active(inode)) {
++ if (!inode_quota_active(inode)) {
+ if (reserve) {
+ spin_lock(&inode->i_lock);
+ *inode_reserved_space(inode) += number;
+@@ -1737,7 +1795,7 @@ int dquot_alloc_inode(struct inode *inode)
+ struct dquot_warn warn[MAXQUOTAS];
+ struct dquot * const *dquots;
+
+- if (!dquot_active(inode))
++ if (!inode_quota_active(inode))
+ return 0;
+ for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+ warn[cnt].w_type = QUOTA_NL_NOWARN;
+@@ -1780,7 +1838,7 @@ int dquot_claim_space_nodirty(struct inode *inode, qsize_t number)
+ struct dquot **dquots;
+ int cnt, index;
+
+- if (!dquot_active(inode)) {
++ if (!inode_quota_active(inode)) {
+ spin_lock(&inode->i_lock);
+ *inode_reserved_space(inode) -= number;
+ __inode_add_bytes(inode, number);
+@@ -1822,7 +1880,7 @@ void dquot_reclaim_space_nodirty(struct inode *inode, qsize_t number)
+ struct dquot **dquots;
+ int cnt, index;
+
+- if (!dquot_active(inode)) {
++ if (!inode_quota_active(inode)) {
+ spin_lock(&inode->i_lock);
+ *inode_reserved_space(inode) += number;
+ __inode_sub_bytes(inode, number);
+@@ -1866,7 +1924,7 @@ void __dquot_free_space(struct inode *inode, qsize_t number, int flags)
+ struct dquot **dquots;
+ int reserve = flags & DQUOT_SPACE_RESERVE, index;
+
+- if (!dquot_active(inode)) {
++ if (!inode_quota_active(inode)) {
+ if (reserve) {
+ spin_lock(&inode->i_lock);
+ *inode_reserved_space(inode) -= number;
+@@ -1921,7 +1979,7 @@ void dquot_free_inode(struct inode *inode)
+ struct dquot * const *dquots;
+ int index;
+
+- if (!dquot_active(inode))
++ if (!inode_quota_active(inode))
+ return;
+
+ dquots = i_dquot(inode);
+@@ -2093,7 +2151,7 @@ int dquot_transfer(struct mnt_idmap *idmap, struct inode *inode,
+ struct super_block *sb = inode->i_sb;
+ int ret;
+
+- if (!dquot_active(inode))
++ if (!inode_quota_active(inode))
+ return 0;
+
+ if (i_uid_needs_update(idmap, iattr, inode)) {
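[Illustrative aside, not part of the patch] The fs/quota/dquot.c hunks above defer the final release of a dquot: the last dqput() parks it on releasing_dquots and a delayed work item waits out an SRCU grace period before doing the cleanup. The sketch below shows only that general pattern, with made-up names (struct obj, releasing, release_work); it is not the dquot implementation:

    #include <linux/list.h>
    #include <linux/spinlock.h>
    #include <linux/srcu.h>
    #include <linux/workqueue.h>

    struct obj { struct list_head free; /* ... */ };

    static LIST_HEAD(releasing);            /* parked objects */
    static DEFINE_SPINLOCK(releasing_lock);
    DEFINE_STATIC_SRCU(obj_srcu);

    static void release_workfn(struct work_struct *work)
    {
            LIST_HEAD(head);

            spin_lock(&releasing_lock);
            list_replace_init(&releasing, &head);   /* grab the whole batch */
            spin_unlock(&releasing_lock);

            synchronize_srcu(&obj_srcu);            /* wait out SRCU readers */

            while (!list_empty(&head)) {
                    struct obj *o = list_first_entry(&head, struct obj, free);

                    list_del_init(&o->free);
                    /* final teardown of o goes here */
            }
    }
    static DECLARE_DELAYED_WORK(release_work, release_workfn);

    static void obj_put_last_ref(struct obj *o)
    {
            spin_lock(&releasing_lock);
            list_add_tail(&o->free, &releasing);
            spin_unlock(&releasing_lock);
            queue_delayed_work(system_unbound_wq, &release_work, 1);
    }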
+diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
+index 4d11d60f493c1..dd58e0dca5e5a 100644
+--- a/fs/reiserfs/journal.c
++++ b/fs/reiserfs/journal.c
+@@ -2326,7 +2326,7 @@ static struct buffer_head *reiserfs_breada(struct block_device *dev,
+ int i, j;
+
+ bh = __getblk(dev, block, bufsize);
+- if (buffer_uptodate(bh))
++ if (!bh || buffer_uptodate(bh))
+ return (bh);
+
+ if (block + BUFNR > max_block) {
+@@ -2336,6 +2336,8 @@ static struct buffer_head *reiserfs_breada(struct block_device *dev,
+ j = 1;
+ for (i = 1; i < blocks; i++) {
+ bh = __getblk(dev, block + i, bufsize);
++ if (!bh)
++ break;
+ if (buffer_uptodate(bh)) {
+ brelse(bh);
+ break;
+diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
+index ca2da713c5fe9..87c6ce54c72d0 100644
+--- a/fs/smb/client/cifsglob.h
++++ b/fs/smb/client/cifsglob.h
+@@ -1062,6 +1062,7 @@ struct cifs_ses {
+ unsigned long chans_need_reconnect;
+ /* ========= end: protected by chan_lock ======== */
+ struct cifs_ses *dfs_root_ses;
++ struct nls_table *local_nls;
+ };
+
+ static inline bool
+diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c
+index a0c4e9874b010..a49f95ea7cf6f 100644
+--- a/fs/smb/client/cifssmb.c
++++ b/fs/smb/client/cifssmb.c
+@@ -129,7 +129,7 @@ again:
+ }
+ spin_unlock(&server->srv_lock);
+
+- nls_codepage = load_nls_default();
++ nls_codepage = ses->local_nls;
+
+ /*
+ * need to prevent multiple threads trying to simultaneously
+@@ -200,7 +200,6 @@ out:
+ rc = -EAGAIN;
+ }
+
+- unload_nls(nls_codepage);
+ return rc;
+ }
+
+diff --git a/fs/smb/client/connect.c b/fs/smb/client/connect.c
+index 853209268f507..e965196e4f746 100644
+--- a/fs/smb/client/connect.c
++++ b/fs/smb/client/connect.c
+@@ -1837,6 +1837,10 @@ static int match_session(struct cifs_ses *ses, struct smb3_fs_context *ctx)
+ CIFS_MAX_PASSWORD_LEN))
+ return 0;
+ }
++
++ if (strcmp(ctx->local_nls->charset, ses->local_nls->charset))
++ return 0;
++
+ return 1;
+ }
+
+@@ -2280,6 +2284,7 @@ cifs_get_smb_ses(struct TCP_Server_Info *server, struct smb3_fs_context *ctx)
+
+ ses->sectype = ctx->sectype;
+ ses->sign = ctx->sign;
++ ses->local_nls = load_nls(ctx->local_nls->charset);
+
+ /* add server as first channel */
+ spin_lock(&ses->chan_lock);
+diff --git a/fs/smb/client/misc.c b/fs/smb/client/misc.c
+index 70dbfe6584f9e..d7e85d9a26553 100644
+--- a/fs/smb/client/misc.c
++++ b/fs/smb/client/misc.c
+@@ -95,6 +95,7 @@ sesInfoFree(struct cifs_ses *buf_to_free)
+ return;
+ }
+
++ unload_nls(buf_to_free->local_nls);
+ atomic_dec(&sesInfoAllocCount);
+ kfree(buf_to_free->serverOS);
+ kfree(buf_to_free->serverDomain);
+diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c
+index e04766fe6f803..a457f07f820dc 100644
+--- a/fs/smb/client/smb2pdu.c
++++ b/fs/smb/client/smb2pdu.c
+@@ -242,7 +242,7 @@ again:
+ }
+ spin_unlock(&server->srv_lock);
+
+- nls_codepage = load_nls_default();
++ nls_codepage = ses->local_nls;
+
+ /*
+ * need to prevent multiple threads trying to simultaneously
+@@ -324,7 +324,6 @@ out:
+ rc = -EAGAIN;
+ }
+ failed:
+- unload_nls(nls_codepage);
+ return rc;
+ }
+
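[Illustrative aside, not part of the patch] The cifs client hunks above stop loading the default NLS table on every reconnect and instead keep one table per session: it is taken with load_nls(ctx->local_nls->charset) when the session is created and dropped with unload_nls() in sesInfoFree(). A minimal usage sketch of that pairing; the charset string and the fallback are illustrative only:

    struct nls_table *nls;

    nls = load_nls("utf8");         /* takes a module reference, NULL on failure */
    if (!nls)
            nls = load_nls_default();
    /* ... use nls for on-the-wire string conversion while the owner lives ... */
    unload_nls(nls);                /* drop the reference when the owner is freed */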
+diff --git a/fs/smb/server/server.c b/fs/smb/server/server.c
+index ced7a9e916f01..9df121bdf3492 100644
+--- a/fs/smb/server/server.c
++++ b/fs/smb/server/server.c
+@@ -286,6 +286,7 @@ static void handle_ksmbd_work(struct work_struct *wk)
+ static int queue_ksmbd_work(struct ksmbd_conn *conn)
+ {
+ struct ksmbd_work *work;
++ int err;
+
+ work = ksmbd_alloc_work_struct();
+ if (!work) {
+@@ -297,7 +298,11 @@ static int queue_ksmbd_work(struct ksmbd_conn *conn)
+ work->request_buf = conn->request_buf;
+ conn->request_buf = NULL;
+
+- ksmbd_init_smb_server(work);
++ err = ksmbd_init_smb_server(work);
++ if (err) {
++ ksmbd_free_work_struct(work);
++ return 0;
++ }
+
+ ksmbd_conn_enqueue_request(work);
+ atomic_inc(&conn->r_count);
+diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
+index 6d0896c76b098..a61bc3a2649cb 100644
+--- a/fs/smb/server/smb2pdu.c
++++ b/fs/smb/server/smb2pdu.c
+@@ -87,9 +87,9 @@ struct channel *lookup_chann_list(struct ksmbd_session *sess, struct ksmbd_conn
+ */
+ int smb2_get_ksmbd_tcon(struct ksmbd_work *work)
+ {
+- struct smb2_hdr *req_hdr = smb2_get_msg(work->request_buf);
++ struct smb2_hdr *req_hdr = ksmbd_req_buf_next(work);
+ unsigned int cmd = le16_to_cpu(req_hdr->Command);
+- int tree_id;
++ unsigned int tree_id;
+
+ if (cmd == SMB2_TREE_CONNECT_HE ||
+ cmd == SMB2_CANCEL_HE ||
+@@ -114,7 +114,7 @@ int smb2_get_ksmbd_tcon(struct ksmbd_work *work)
+ pr_err("The first operation in the compound does not have tcon\n");
+ return -EINVAL;
+ }
+- if (work->tcon->id != tree_id) {
++ if (tree_id != UINT_MAX && work->tcon->id != tree_id) {
+ pr_err("tree id(%u) is different with id(%u) in first operation\n",
+ tree_id, work->tcon->id);
+ return -EINVAL;
+@@ -559,9 +559,9 @@ int smb2_allocate_rsp_buf(struct ksmbd_work *work)
+ */
+ int smb2_check_user_session(struct ksmbd_work *work)
+ {
+- struct smb2_hdr *req_hdr = smb2_get_msg(work->request_buf);
++ struct smb2_hdr *req_hdr = ksmbd_req_buf_next(work);
+ struct ksmbd_conn *conn = work->conn;
+- unsigned int cmd = conn->ops->get_cmd_val(work);
++ unsigned int cmd = le16_to_cpu(req_hdr->Command);
+ unsigned long long sess_id;
+
+ /*
+@@ -587,7 +587,7 @@ int smb2_check_user_session(struct ksmbd_work *work)
+ pr_err("The first operation in the compound does not have sess\n");
+ return -EINVAL;
+ }
+- if (work->sess->id != sess_id) {
++ if (sess_id != ULLONG_MAX && work->sess->id != sess_id) {
+ pr_err("session id(%llu) is different with the first operation(%lld)\n",
+ sess_id, work->sess->id);
+ return -EINVAL;
+@@ -6223,6 +6223,11 @@ int smb2_read(struct ksmbd_work *work)
+ unsigned int max_read_size = conn->vals->max_read_size;
+
+ WORK_BUFFERS(work, req, rsp);
++ if (work->next_smb2_rcv_hdr_off) {
++ work->send_no_response = 1;
++ err = -EOPNOTSUPP;
++ goto out;
++ }
+
+ if (test_share_config_flag(work->tcon->share_conf,
+ KSMBD_SHARE_FLAG_PIPE)) {
+@@ -8623,7 +8628,8 @@ int smb3_decrypt_req(struct ksmbd_work *work)
+ struct smb2_transform_hdr *tr_hdr = smb2_get_msg(buf);
+ int rc = 0;
+
+- if (buf_data_size < sizeof(struct smb2_hdr)) {
++ if (pdu_length < sizeof(struct smb2_transform_hdr) ||
++ buf_data_size < sizeof(struct smb2_hdr)) {
+ pr_err("Transform message is too small (%u)\n",
+ pdu_length);
+ return -ECONNABORTED;
+diff --git a/fs/smb/server/smb_common.c b/fs/smb/server/smb_common.c
+index 3e391a7d5a3ab..27b8bd039791e 100644
+--- a/fs/smb/server/smb_common.c
++++ b/fs/smb/server/smb_common.c
+@@ -388,26 +388,29 @@ static struct smb_version_cmds smb1_server_cmds[1] = {
+ [SMB_COM_NEGOTIATE_EX] = { .proc = smb1_negotiate, },
+ };
+
+-static void init_smb1_server(struct ksmbd_conn *conn)
++static int init_smb1_server(struct ksmbd_conn *conn)
+ {
+ conn->ops = &smb1_server_ops;
+ conn->cmds = smb1_server_cmds;
+ conn->max_cmds = ARRAY_SIZE(smb1_server_cmds);
++ return 0;
+ }
+
+-void ksmbd_init_smb_server(struct ksmbd_work *work)
++int ksmbd_init_smb_server(struct ksmbd_work *work)
+ {
+ struct ksmbd_conn *conn = work->conn;
+ __le32 proto;
+
+- if (conn->need_neg == false)
+- return;
+-
+ proto = *(__le32 *)((struct smb_hdr *)work->request_buf)->Protocol;
++ if (conn->need_neg == false) {
++ if (proto == SMB1_PROTO_NUMBER)
++ return -EINVAL;
++ return 0;
++ }
++
+ if (proto == SMB1_PROTO_NUMBER)
+- init_smb1_server(conn);
+- else
+- init_smb3_11_server(conn);
++ return init_smb1_server(conn);
++ return init_smb3_11_server(conn);
+ }
+
+ int ksmbd_populate_dot_dotdot_entries(struct ksmbd_work *work, int info_level,
+diff --git a/fs/smb/server/smb_common.h b/fs/smb/server/smb_common.h
+index 6b0d5f1fe85ca..f0134d16067fb 100644
+--- a/fs/smb/server/smb_common.h
++++ b/fs/smb/server/smb_common.h
+@@ -427,7 +427,7 @@ bool ksmbd_smb_request(struct ksmbd_conn *conn);
+
+ int ksmbd_lookup_dialect_by_id(__le16 *cli_dialects, __le16 dialects_count);
+
+-void ksmbd_init_smb_server(struct ksmbd_work *work);
++int ksmbd_init_smb_server(struct ksmbd_work *work);
+
+ struct ksmbd_kstat;
+ int ksmbd_populate_dot_dotdot_entries(struct ksmbd_work *work,
+diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
+index 911cb3d294b86..93f73c35a9c5c 100644
+--- a/fs/smb/server/vfs.c
++++ b/fs/smb/server/vfs.c
+@@ -423,7 +423,8 @@ static int ksmbd_vfs_stream_write(struct ksmbd_file *fp, char *buf, loff_t *pos,
+ {
+ char *stream_buf = NULL, *wbuf;
+ struct mnt_idmap *idmap = file_mnt_idmap(fp->filp);
+- size_t size, v_len;
++ size_t size;
++ ssize_t v_len;
+ int err = 0;
+
+ ksmbd_debug(VFS, "write stream data pos : %llu, count : %zd\n",
+@@ -440,9 +441,9 @@ static int ksmbd_vfs_stream_write(struct ksmbd_file *fp, char *buf, loff_t *pos,
+ fp->stream.name,
+ fp->stream.size,
+ &stream_buf);
+- if ((int)v_len < 0) {
++ if (v_len < 0) {
+ pr_err("not found stream in xattr : %zd\n", v_len);
+- err = (int)v_len;
++ err = v_len;
+ goto out;
+ }
+
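[Illustrative aside, not part of the patch] The ksmbd_vfs_stream_write() hunk above changes v_len from size_t to ssize_t so a negative error returned by the xattr lookup survives the "< 0" test without casts. A tiny standalone illustration of why the unsigned type hides the error; fake_getxattr() is made up:

    #include <stdio.h>
    #include <sys/types.h>

    static long fake_getxattr(void) { return -61; /* e.g. -ENODATA */ }

    int main(void)
    {
            size_t  u = fake_getxattr();    /* negative value wraps to a huge positive */
            ssize_t s = fake_getxattr();    /* sign is preserved */

            printf("unsigned check: %s\n", u < 0 ? "caught" : "missed");  /* missed */
            printf("signed check:   %s\n", s < 0 ? "caught" : "missed");  /* caught */
            return 0;
    }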
+diff --git a/fs/splice.c b/fs/splice.c
+index 030e162985b5d..3ae2de263e806 100644
+--- a/fs/splice.c
++++ b/fs/splice.c
+@@ -1153,10 +1153,8 @@ long do_splice(struct file *in, loff_t *off_in, struct file *out,
+ if ((in->f_flags | out->f_flags) & O_NONBLOCK)
+ flags |= SPLICE_F_NONBLOCK;
+
+- return splice_pipe_to_pipe(ipipe, opipe, len, flags);
+- }
+-
+- if (ipipe) {
++ ret = splice_pipe_to_pipe(ipipe, opipe, len, flags);
++ } else if (ipipe) {
+ if (off_in)
+ return -ESPIPE;
+ if (off_out) {
+@@ -1181,18 +1179,11 @@ long do_splice(struct file *in, loff_t *off_in, struct file *out,
+ ret = do_splice_from(ipipe, out, &offset, len, flags);
+ file_end_write(out);
+
+- if (ret > 0)
+- fsnotify_modify(out);
+-
+ if (!off_out)
+ out->f_pos = offset;
+ else
+ *off_out = offset;
+-
+- return ret;
+- }
+-
+- if (opipe) {
++ } else if (opipe) {
+ if (off_out)
+ return -ESPIPE;
+ if (off_in) {
+@@ -1208,18 +1199,25 @@ long do_splice(struct file *in, loff_t *off_in, struct file *out,
+
+ ret = splice_file_to_pipe(in, opipe, &offset, len, flags);
+
+- if (ret > 0)
+- fsnotify_access(in);
+-
+ if (!off_in)
+ in->f_pos = offset;
+ else
+ *off_in = offset;
++ } else {
++ ret = -EINVAL;
++ }
+
+- return ret;
++ if (ret > 0) {
++ /*
++ * Generate modify out before access in:
++ * do_splice_from() may've already sent modify out,
++ * and this ensures the events get merged.
++ */
++ fsnotify_modify(out);
++ fsnotify_access(in);
+ }
+
+- return -EINVAL;
++ return ret;
+ }
+
+ static long __do_splice(struct file *in, loff_t __user *off_in,
+@@ -1348,6 +1346,9 @@ static long vmsplice_to_user(struct file *file, struct iov_iter *iter,
+ pipe_unlock(pipe);
+ }
+
++ if (ret > 0)
++ fsnotify_access(file);
++
+ return ret;
+ }
+
+@@ -1377,8 +1378,10 @@ static long vmsplice_to_pipe(struct file *file, struct iov_iter *iter,
+ if (!ret)
+ ret = iter_to_pipe(iter, pipe, buf_flag);
+ pipe_unlock(pipe);
+- if (ret > 0)
++ if (ret > 0) {
+ wakeup_pipe_readers(pipe);
++ fsnotify_modify(file);
++ }
+ return ret;
+ }
+
+@@ -1812,6 +1815,11 @@ long do_tee(struct file *in, struct file *out, size_t len, unsigned int flags)
+ }
+ }
+
++ if (ret > 0) {
++ fsnotify_access(in);
++ fsnotify_modify(out);
++ }
++
+ return ret;
+ }
+
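[Illustrative aside, not part of the patch] The fs/splice.c hunk above folds the three do_splice() branches into a single exit path so the fsnotify events are generated in a fixed order, modify on the output before access on the input, and can merge with any MODIFY event do_splice_from() already emitted. Condensed shape of the resulting flow, with offset handling and early returns omitted:

    if (ipipe && opipe)
            ret = splice_pipe_to_pipe(ipipe, opipe, len, flags);
    else if (ipipe)
            ret = do_splice_from(ipipe, out, &offset, len, flags);      /* pipe -> file */
    else if (opipe)
            ret = splice_file_to_pipe(in, opipe, &offset, len, flags);  /* file -> pipe */
    else
            ret = -EINVAL;

    if (ret > 0) {
            fsnotify_modify(out);
            fsnotify_access(in);
    }
    return ret;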
+diff --git a/fs/verity/signature.c b/fs/verity/signature.c
+index b8c51ad40d3a3..5694daf378e78 100644
+--- a/fs/verity/signature.c
++++ b/fs/verity/signature.c
+@@ -54,6 +54,22 @@ int fsverity_verify_signature(const struct fsverity_info *vi,
+ return 0;
+ }
+
++ if (fsverity_keyring->keys.nr_leaves_on_tree == 0) {
++ /*
++ * The ".fs-verity" keyring is empty, due to builtin signatures
++ * being supported by the kernel but not actually being used.
++ * In this case, verify_pkcs7_signature() would always return an
++ * error, usually ENOKEY. It could also be EBADMSG if the
++ * PKCS#7 is malformed, but that isn't very important to
++ * distinguish. So, just skip to ENOKEY to avoid the attack
++ * surface of the PKCS#7 parser, which would otherwise be
++ * reachable by any task able to execute FS_IOC_ENABLE_VERITY.
++ */
++ fsverity_err(inode,
++ "fs-verity keyring is empty, rejecting signed file!");
++ return -ENOKEY;
++ }
++
+ d = kzalloc(sizeof(*d) + hash_alg->digest_size, GFP_KERNEL);
+ if (!d)
+ return -ENOMEM;
+diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
+index 016d5a302b84a..bd3a99b0106bf 100644
+--- a/include/crypto/algapi.h
++++ b/include/crypto/algapi.h
+@@ -12,6 +12,7 @@
+ #include <linux/cache.h>
+ #include <linux/crypto.h>
+ #include <linux/types.h>
++#include <linux/workqueue.h>
+
+ /*
+ * Maximum values for blocksize and alignmask, used to allocate
+@@ -83,6 +84,8 @@ struct crypto_instance {
+ struct crypto_spawn *spawns;
+ };
+
++ struct work_struct free_work;
++
+ void *__ctx[] CRYPTO_MINALIGN_ATTR;
+ };
+
+diff --git a/include/dt-bindings/clock/qcom,gcc-sc8280xp.h b/include/dt-bindings/clock/qcom,gcc-sc8280xp.h
+index 721105ea4fad8..8454915917849 100644
+--- a/include/dt-bindings/clock/qcom,gcc-sc8280xp.h
++++ b/include/dt-bindings/clock/qcom,gcc-sc8280xp.h
+@@ -494,5 +494,15 @@
+ #define USB30_SEC_GDSC 11
+ #define EMAC_0_GDSC 12
+ #define EMAC_1_GDSC 13
++#define USB4_1_GDSC 14
++#define USB4_GDSC 15
++#define HLOS1_VOTE_MMNOC_MMU_TBU_HF0_GDSC 16
++#define HLOS1_VOTE_MMNOC_MMU_TBU_HF1_GDSC 17
++#define HLOS1_VOTE_MMNOC_MMU_TBU_SF0_GDSC 18
++#define HLOS1_VOTE_MMNOC_MMU_TBU_SF1_GDSC 19
++#define HLOS1_VOTE_TURING_MMU_TBU0_GDSC 20
++#define HLOS1_VOTE_TURING_MMU_TBU1_GDSC 21
++#define HLOS1_VOTE_TURING_MMU_TBU2_GDSC 22
++#define HLOS1_VOTE_TURING_MMU_TBU3_GDSC 23
+
+ #endif
+diff --git a/include/dt-bindings/clock/qcom,qdu1000-gcc.h b/include/dt-bindings/clock/qcom,qdu1000-gcc.h
+index ddbc6b825e80c..2fd36cbfddbb2 100644
+--- a/include/dt-bindings/clock/qcom,qdu1000-gcc.h
++++ b/include/dt-bindings/clock/qcom,qdu1000-gcc.h
+@@ -1,6 +1,6 @@
+ /* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
+ /*
+- * Copyright (c) 2021-2022, Qualcomm Innovation Center, Inc. All rights reserved.
++ * Copyright (c) 2021-2023, Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+ #ifndef _DT_BINDINGS_CLK_QCOM_GCC_QDU1000_H
+@@ -138,6 +138,8 @@
+ #define GCC_AGGRE_NOC_ECPRI_GSI_CLK 128
+ #define GCC_PCIE_0_PIPE_CLK_SRC 129
+ #define GCC_PCIE_0_PHY_AUX_CLK_SRC 130
++#define GCC_GPLL1_OUT_EVEN 131
++#define GCC_DDRSS_ECPRI_GSI_CLK 132
+
+ /* GCC resets */
+ #define GCC_ECPRI_CC_BCR 0
+diff --git a/include/linux/arm_sdei.h b/include/linux/arm_sdei.h
+index 14dc461b0e829..255701e1251b4 100644
+--- a/include/linux/arm_sdei.h
++++ b/include/linux/arm_sdei.h
+@@ -47,10 +47,12 @@ int sdei_unregister_ghes(struct ghes *ghes);
+ int sdei_mask_local_cpu(void);
+ int sdei_unmask_local_cpu(void);
+ void __init sdei_init(void);
++void sdei_handler_abort(void);
+ #else
+ static inline int sdei_mask_local_cpu(void) { return 0; }
+ static inline int sdei_unmask_local_cpu(void) { return 0; }
+ static inline void sdei_init(void) { }
++static inline void sdei_handler_abort(void) { }
+ #endif /* CONFIG_ARM_SDE_INTERFACE */
+
+
+diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
+index 67e942d776bd8..2e2cd4b824e7f 100644
+--- a/include/linux/blkdev.h
++++ b/include/linux/blkdev.h
+@@ -546,6 +546,7 @@ struct request_queue {
+ #define QUEUE_FLAG_ADD_RANDOM 10 /* Contributes to random pool */
+ #define QUEUE_FLAG_SYNCHRONOUS 11 /* always completes in submit context */
+ #define QUEUE_FLAG_SAME_FORCE 12 /* force complete on same CPU */
++#define QUEUE_FLAG_HW_WC 18 /* Write back caching supported */
+ #define QUEUE_FLAG_INIT_DONE 14 /* queue is initialized */
+ #define QUEUE_FLAG_STABLE_WRITES 15 /* don't modify blks until WB is done */
+ #define QUEUE_FLAG_POLL 16 /* IO polling enabled if set */
+diff --git a/include/linux/hid.h b/include/linux/hid.h
+index 4e4c4fe369118..7cbc10073a1fe 100644
+--- a/include/linux/hid.h
++++ b/include/linux/hid.h
+@@ -360,6 +360,7 @@ struct hid_item {
+ #define HID_QUIRK_NO_OUTPUT_REPORTS_ON_INTR_EP BIT(18)
+ #define HID_QUIRK_HAVE_SPECIAL_DRIVER BIT(19)
+ #define HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE BIT(20)
++#define HID_QUIRK_NOINVERT BIT(21)
+ #define HID_QUIRK_FULLSPEED_INTERVAL BIT(28)
+ #define HID_QUIRK_NO_INIT_REPORTS BIT(29)
+ #define HID_QUIRK_NO_IGNORE BIT(30)
+diff --git a/include/linux/if_arp.h b/include/linux/if_arp.h
+index 1ed52441972f9..10a1e81434cb9 100644
+--- a/include/linux/if_arp.h
++++ b/include/linux/if_arp.h
+@@ -53,6 +53,10 @@ static inline bool dev_is_mac_header_xmit(const struct net_device *dev)
+ case ARPHRD_NONE:
+ case ARPHRD_RAWIP:
+ case ARPHRD_PIMREG:
++ /* PPP adds its l2 header automatically in ppp_start_xmit().
++ * This makes it look like an l3 device to __bpf_redirect() and tcf_mirred_init().
++ */
++ case ARPHRD_PPP:
+ return false;
+ default:
+ return true;
+diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
+index 73f5c120def88..2a36f3218b510 100644
+--- a/include/linux/kernfs.h
++++ b/include/linux/kernfs.h
+@@ -550,6 +550,10 @@ static inline int kernfs_setattr(struct kernfs_node *kn,
+ const struct iattr *iattr)
+ { return -ENOSYS; }
+
++static inline __poll_t kernfs_generic_poll(struct kernfs_open_file *of,
++ struct poll_table_struct *pt)
++{ return -ENOSYS; }
++
+ static inline void kernfs_notify(struct kernfs_node *kn) { }
+
+ static inline int kernfs_xattr_get(struct kernfs_node *kn, const char *name,
+diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
+index 6bb55e61e8e87..fb9f5f00a7789 100644
+--- a/include/linux/lsm_hook_defs.h
++++ b/include/linux/lsm_hook_defs.h
+@@ -54,6 +54,7 @@ LSM_HOOK(int, 0, bprm_creds_from_file, struct linux_binprm *bprm, struct file *f
+ LSM_HOOK(int, 0, bprm_check_security, struct linux_binprm *bprm)
+ LSM_HOOK(void, LSM_RET_VOID, bprm_committing_creds, struct linux_binprm *bprm)
+ LSM_HOOK(void, LSM_RET_VOID, bprm_committed_creds, struct linux_binprm *bprm)
++LSM_HOOK(int, 0, fs_context_submount, struct fs_context *fc, struct super_block *reference)
+ LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
+ struct fs_context *src_sc)
+ LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
+diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
+index 222d7370134c7..7f2921217f50b 100644
+--- a/include/linux/memcontrol.h
++++ b/include/linux/memcontrol.h
+@@ -284,6 +284,11 @@ struct mem_cgroup {
+ atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
+ atomic_long_t memory_events_local[MEMCG_NR_MEMORY_EVENTS];
+
++ /*
++ * Hint of reclaim pressure for socket memory management. Note
++ * that this indicator should NOT be used in legacy cgroup mode
++ * where socket memory is accounted/charged separately.
++ */
+ unsigned long socket_pressure;
+
+ /* Legacy tcp memory accounting */
+@@ -1743,8 +1748,8 @@ void mem_cgroup_sk_alloc(struct sock *sk);
+ void mem_cgroup_sk_free(struct sock *sk);
+ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
+ {
+- if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure)
+- return true;
++ if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
++ return !!memcg->tcpmem_pressure;
+ do {
+ if (time_before(jiffies, READ_ONCE(memcg->socket_pressure)))
+ return true;
+diff --git a/include/linux/nls.h b/include/linux/nls.h
+index 499e486b3722d..e0bf8367b274a 100644
+--- a/include/linux/nls.h
++++ b/include/linux/nls.h
+@@ -47,7 +47,7 @@ enum utf16_endian {
+ /* nls_base.c */
+ extern int __register_nls(struct nls_table *, struct module *);
+ extern int unregister_nls(struct nls_table *);
+-extern struct nls_table *load_nls(char *);
++extern struct nls_table *load_nls(const char *charset);
+ extern void unload_nls(struct nls_table *);
+ extern struct nls_table *load_nls_default(void);
+ #define register_nls(nls) __register_nls((nls), THIS_MODULE)
+diff --git a/include/linux/nvmem-consumer.h b/include/linux/nvmem-consumer.h
+index fa030d93b768e..27373024856dc 100644
+--- a/include/linux/nvmem-consumer.h
++++ b/include/linux/nvmem-consumer.h
+@@ -256,7 +256,7 @@ static inline struct nvmem_device *of_nvmem_device_get(struct device_node *np,
+ static inline struct device_node *
+ of_nvmem_layout_get_container(struct nvmem_device *nvmem)
+ {
+- return ERR_PTR(-EOPNOTSUPP);
++ return NULL;
+ }
+ #endif /* CONFIG_NVMEM && CONFIG_OF */
+
+diff --git a/include/linux/pci.h b/include/linux/pci.h
+index c69a2cc1f4123..7ee498cd1f374 100644
+--- a/include/linux/pci.h
++++ b/include/linux/pci.h
+@@ -467,6 +467,7 @@ struct pci_dev {
+ pci_dev_flags_t dev_flags;
+ atomic_t enable_cnt; /* pci_enable_device has been called */
+
++ spinlock_t pcie_cap_lock; /* Protects RMW ops in capability accessors */
+ u32 saved_config_space[16]; /* Config space saved at suspend time */
+ struct hlist_head saved_cap_space;
+ int rom_attr_enabled; /* Display of ROM attribute enabled? */
+@@ -1217,11 +1218,40 @@ int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val);
+ int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val);
+ int pcie_capability_write_word(struct pci_dev *dev, int pos, u16 val);
+ int pcie_capability_write_dword(struct pci_dev *dev, int pos, u32 val);
+-int pcie_capability_clear_and_set_word(struct pci_dev *dev, int pos,
+- u16 clear, u16 set);
++int pcie_capability_clear_and_set_word_unlocked(struct pci_dev *dev, int pos,
++ u16 clear, u16 set);
++int pcie_capability_clear_and_set_word_locked(struct pci_dev *dev, int pos,
++ u16 clear, u16 set);
+ int pcie_capability_clear_and_set_dword(struct pci_dev *dev, int pos,
+ u32 clear, u32 set);
+
++/**
++ * pcie_capability_clear_and_set_word - RMW accessor for PCI Express Capability Registers
++ * @dev: PCI device structure of the PCI Express device
++ * @pos: PCI Express Capability Register
++ * @clear: Clear bitmask
++ * @set: Set bitmask
++ *
++ * Perform a Read-Modify-Write (RMW) operation using @clear and @set
++ * bitmasks on PCI Express Capability Register at @pos. Certain PCI Express
++ * Capability Registers are accessed concurrently in RMW fashion, hence
++ * require locking which is handled transparently to the caller.
++ */
++static inline int pcie_capability_clear_and_set_word(struct pci_dev *dev,
++ int pos,
++ u16 clear, u16 set)
++{
++ switch (pos) {
++ case PCI_EXP_LNKCTL:
++ case PCI_EXP_RTCTL:
++ return pcie_capability_clear_and_set_word_locked(dev, pos,
++ clear, set);
++ default:
++ return pcie_capability_clear_and_set_word_unlocked(dev, pos,
++ clear, set);
++ }
++}
++
+ static inline int pcie_capability_set_word(struct pci_dev *dev, int pos,
+ u16 set)
+ {
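[Illustrative aside, not part of the patch] The include/linux/pci.h hunk above turns pcie_capability_clear_and_set_word() into a dispatcher: Link Control and Root Control, which are modified concurrently in read-modify-write fashion, go through the new locked variant, while everything else keeps the unlocked path. Callers do not change; an illustrative call only (pdev and the chosen bits are examples, not taken from this patch):

    /* clear the ASPM Control field of Link Control, leaving other bits intact */
    pcie_capability_clear_and_set_word(pdev, PCI_EXP_LNKCTL,
                                       PCI_EXP_LNKCTL_ASPMC, 0);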
+diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
+index c758809d5bcf3..53974d79d98e8 100644
+--- a/include/linux/pid_namespace.h
++++ b/include/linux/pid_namespace.h
+@@ -17,18 +17,10 @@
+ struct fs_pin;
+
+ #if defined(CONFIG_SYSCTL) && defined(CONFIG_MEMFD_CREATE)
+-/*
+- * sysctl for vm.memfd_noexec
+- * 0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL
+- * acts like MFD_EXEC was set.
+- * 1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL
+- * acts like MFD_NOEXEC_SEAL was set.
+- * 2: memfd_create() without MFD_NOEXEC_SEAL will be
+- * rejected.
+- */
+-#define MEMFD_NOEXEC_SCOPE_EXEC 0
+-#define MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL 1
+-#define MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED 2
++/* modes for vm.memfd_noexec sysctl */
++#define MEMFD_NOEXEC_SCOPE_EXEC 0 /* MFD_EXEC implied if unset */
++#define MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL 1 /* MFD_NOEXEC_SEAL implied if unset */
++#define MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED 2 /* same as 1, except MFD_EXEC rejected */
+ #endif
+
+ struct pid_namespace {
+diff --git a/include/linux/security.h b/include/linux/security.h
+index e2734e9e44d5c..274c75fa2e272 100644
+--- a/include/linux/security.h
++++ b/include/linux/security.h
+@@ -293,6 +293,7 @@ int security_bprm_creds_from_file(struct linux_binprm *bprm, struct file *file);
+ int security_bprm_check(struct linux_binprm *bprm);
+ void security_bprm_committing_creds(struct linux_binprm *bprm);
+ void security_bprm_committed_creds(struct linux_binprm *bprm);
++int security_fs_context_submount(struct fs_context *fc, struct super_block *reference);
+ int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc);
+ int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param);
+ int security_sb_alloc(struct super_block *sb);
+@@ -629,6 +630,11 @@ static inline void security_bprm_committed_creds(struct linux_binprm *bprm)
+ {
+ }
+
++static inline int security_fs_context_submount(struct fs_context *fc,
++ struct super_block *reference)
++{
++ return 0;
++}
+ static inline int security_fs_context_dup(struct fs_context *fc,
+ struct fs_context *src_fc)
+ {
+diff --git a/include/linux/thermal.h b/include/linux/thermal.h
+index 87837094d549f..dee66ade89a03 100644
+--- a/include/linux/thermal.h
++++ b/include/linux/thermal.h
+@@ -301,14 +301,14 @@ int thermal_acpi_critical_trip_temp(struct acpi_device *adev, int *ret_temp);
+ #ifdef CONFIG_THERMAL
+ struct thermal_zone_device *thermal_zone_device_register(const char *, int, int,
+ void *, struct thermal_zone_device_ops *,
+- struct thermal_zone_params *, int, int);
++ const struct thermal_zone_params *, int, int);
+
+ void thermal_zone_device_unregister(struct thermal_zone_device *);
+
+ struct thermal_zone_device *
+ thermal_zone_device_register_with_trips(const char *, struct thermal_trip *, int, int,
+ void *, struct thermal_zone_device_ops *,
+- struct thermal_zone_params *, int, int);
++ const struct thermal_zone_params *, int, int);
+
+ void *thermal_zone_device_priv(struct thermal_zone_device *tzd);
+ const char *thermal_zone_device_type(struct thermal_zone_device *tzd);
+@@ -348,7 +348,7 @@ void thermal_zone_device_critical(struct thermal_zone_device *tz);
+ static inline struct thermal_zone_device *thermal_zone_device_register(
+ const char *type, int trips, int mask, void *devdata,
+ struct thermal_zone_device_ops *ops,
+- struct thermal_zone_params *tzp,
++ const struct thermal_zone_params *tzp,
+ int passive_delay, int polling_delay)
+ { return ERR_PTR(-ENODEV); }
+ static inline void thermal_zone_device_unregister(
+diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
+index c55fc453e33b5..6a41ad2ca84cc 100644
+--- a/include/linux/trace_events.h
++++ b/include/linux/trace_events.h
+@@ -875,7 +875,8 @@ extern int perf_uprobe_init(struct perf_event *event,
+ extern void perf_uprobe_destroy(struct perf_event *event);
+ extern int bpf_get_uprobe_info(const struct perf_event *event,
+ u32 *fd_type, const char **filename,
+- u64 *probe_offset, bool perf_type_tracepoint);
++ u64 *probe_offset, u64 *probe_addr,
++ bool perf_type_tracepoint);
+ #endif
+ extern int ftrace_profile_set_filter(struct perf_event *event, int event_id,
+ char *filter_str);
+diff --git a/include/linux/usb/typec_altmode.h b/include/linux/usb/typec_altmode.h
+index 350d49012659b..28aeef8f9e7b5 100644
+--- a/include/linux/usb/typec_altmode.h
++++ b/include/linux/usb/typec_altmode.h
+@@ -67,7 +67,7 @@ struct typec_altmode_ops {
+
+ int typec_altmode_enter(struct typec_altmode *altmode, u32 *vdo);
+ int typec_altmode_exit(struct typec_altmode *altmode);
+-void typec_altmode_attention(struct typec_altmode *altmode, u32 vdo);
++int typec_altmode_attention(struct typec_altmode *altmode, u32 vdo);
+ int typec_altmode_vdm(struct typec_altmode *altmode,
+ const u32 header, const u32 *vdo, int count);
+ int typec_altmode_notify(struct typec_altmode *altmode, unsigned long conf,
+diff --git a/include/media/cec.h b/include/media/cec.h
+index abee41ae02d0e..9c007f83569aa 100644
+--- a/include/media/cec.h
++++ b/include/media/cec.h
+@@ -113,22 +113,25 @@ struct cec_fh {
+ #define CEC_FREE_TIME_TO_USEC(ft) ((ft) * 2400)
+
+ struct cec_adap_ops {
+- /* Low-level callbacks */
++ /* Low-level callbacks, called with adap->lock held */
+ int (*adap_enable)(struct cec_adapter *adap, bool enable);
+ int (*adap_monitor_all_enable)(struct cec_adapter *adap, bool enable);
+ int (*adap_monitor_pin_enable)(struct cec_adapter *adap, bool enable);
+ int (*adap_log_addr)(struct cec_adapter *adap, u8 logical_addr);
+- void (*adap_configured)(struct cec_adapter *adap, bool configured);
++ void (*adap_unconfigured)(struct cec_adapter *adap);
+ int (*adap_transmit)(struct cec_adapter *adap, u8 attempts,
+ u32 signal_free_time, struct cec_msg *msg);
++ void (*adap_nb_transmit_canceled)(struct cec_adapter *adap,
++ const struct cec_msg *msg);
+ void (*adap_status)(struct cec_adapter *adap, struct seq_file *file);
+ void (*adap_free)(struct cec_adapter *adap);
+
+- /* Error injection callbacks */
++ /* Error injection callbacks, called without adap->lock held */
+ int (*error_inj_show)(struct cec_adapter *adap, struct seq_file *sf);
+ bool (*error_inj_parse_line)(struct cec_adapter *adap, char *line);
+
+- /* High-level CEC message callback */
++ /* High-level CEC message callback, called without adap->lock held */
++ void (*configured)(struct cec_adapter *adap);
+ int (*received)(struct cec_adapter *adap, struct cec_msg *msg);
+ };
+
+diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
+index 872dcb91a540e..3ff822ebb3a47 100644
+--- a/include/net/bluetooth/hci.h
++++ b/include/net/bluetooth/hci.h
+@@ -309,6 +309,26 @@ enum {
+ * to support it.
+ */
+ HCI_QUIRK_BROKEN_SET_RPA_TIMEOUT,
++
++ /* When this quirk is set, MSFT extension monitor tracking by
++ * address filter is supported. Since tracking quantity of each
++ * pattern is limited, this feature supports tracking multiple
++ * devices concurrently if controller supports multiple
++ * address filters.
++ *
++ * This quirk must be set before hci_register_dev is called.
++ */
++ HCI_QUIRK_USE_MSFT_EXT_ADDRESS_FILTER,
++
++ /*
++ * When this quirk is set, LE Coded PHY shall not be used. This is
++ * required for some Intel controllers which erroneously claim to
++ * support it but it causes problems with extended scanning.
++ *
++ * This quirk can be set before hci_register_dev is called or
++ * during the hdev->setup vendor callback.
++ */
++ HCI_QUIRK_BROKEN_LE_CODED,
+ };
+
+ /* HCI device flags */
+diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
+index 870b6d3c5146b..3190ca493bd18 100644
+--- a/include/net/bluetooth/hci_core.h
++++ b/include/net/bluetooth/hci_core.h
+@@ -321,8 +321,8 @@ struct adv_monitor {
+
+ #define HCI_MAX_SHORT_NAME_LENGTH 10
+
+-#define HCI_CONN_HANDLE_UNSET 0xffff
+ #define HCI_CONN_HANDLE_MAX 0x0eff
++#define HCI_CONN_HANDLE_UNSET(_handle) (_handle > HCI_CONN_HANDLE_MAX)
+
+ /* Min encryption key size to match with SMP */
+ #define HCI_MIN_ENC_KEY_SIZE 7
+@@ -741,6 +741,7 @@ struct hci_conn {
+ unsigned long flags;
+
+ enum conn_reasons conn_reason;
++ __u8 abort_reason;
+
+ __u32 clock;
+ __u16 clock_accuracy;
+@@ -760,7 +761,6 @@ struct hci_conn {
+ struct delayed_work auto_accept_work;
+ struct delayed_work idle_work;
+ struct delayed_work le_conn_timeout;
+- struct work_struct le_scan_cleanup;
+
+ struct device dev;
+ struct dentry *debugfs;
+@@ -976,6 +976,10 @@ enum {
+ HCI_CONN_SCANNING,
+ HCI_CONN_AUTH_FAILURE,
+ HCI_CONN_PER_ADV,
++ HCI_CONN_BIG_CREATED,
++ HCI_CONN_CREATE_CIS,
++ HCI_CONN_BIG_SYNC,
++ HCI_CONN_BIG_SYNC_FAILED,
+ };
+
+ static inline bool hci_conn_ssp_enabled(struct hci_conn *conn)
+@@ -1117,6 +1121,32 @@ static inline struct hci_conn *hci_conn_hash_lookup_bis(struct hci_dev *hdev,
+ return NULL;
+ }
+
++static inline struct hci_conn *
++hci_conn_hash_lookup_per_adv_bis(struct hci_dev *hdev,
++ bdaddr_t *ba,
++ __u8 big, __u8 bis)
++{
++ struct hci_conn_hash *h = &hdev->conn_hash;
++ struct hci_conn *c;
++
++ rcu_read_lock();
++
++ list_for_each_entry_rcu(c, &h->list, list) {
++ if (bacmp(&c->dst, ba) || c->type != ISO_LINK ||
++ !test_bit(HCI_CONN_PER_ADV, &c->flags))
++ continue;
++
++ if (c->iso_qos.bcast.big == big &&
++ c->iso_qos.bcast.bis == bis) {
++ rcu_read_unlock();
++ return c;
++ }
++ }
++ rcu_read_unlock();
++
++ return NULL;
++}
++
+ static inline struct hci_conn *hci_conn_hash_lookup_handle(struct hci_dev *hdev,
+ __u16 handle)
+ {
+@@ -1261,6 +1291,29 @@ static inline struct hci_conn *hci_conn_hash_lookup_big(struct hci_dev *hdev,
+ return NULL;
+ }
+
++static inline struct hci_conn *hci_conn_hash_lookup_big_any_dst(struct hci_dev *hdev,
++ __u8 handle)
++{
++ struct hci_conn_hash *h = &hdev->conn_hash;
++ struct hci_conn *c;
++
++ rcu_read_lock();
++
++ list_for_each_entry_rcu(c, &h->list, list) {
++ if (c->type != ISO_LINK)
++ continue;
++
++ if (handle == c->iso_qos.bcast.big) {
++ rcu_read_unlock();
++ return c;
++ }
++ }
++
++ rcu_read_unlock();
++
++ return NULL;
++}
++
+ static inline struct hci_conn *hci_conn_hash_lookup_state(struct hci_dev *hdev,
+ __u8 type, __u16 state)
+ {
+@@ -1326,7 +1379,8 @@ int hci_disconnect(struct hci_conn *conn, __u8 reason);
+ bool hci_setup_sync(struct hci_conn *conn, __u16 handle);
+ void hci_sco_setup(struct hci_conn *conn, __u8 status);
+ bool hci_iso_setup_path(struct hci_conn *conn);
+-int hci_le_create_cis(struct hci_conn *conn);
++int hci_le_create_cis_pending(struct hci_dev *hdev);
++int hci_conn_check_create_cis(struct hci_conn *conn);
+
+ struct hci_conn *hci_conn_add(struct hci_dev *hdev, int type, bdaddr_t *dst,
+ u8 role);
+@@ -1353,6 +1407,9 @@ struct hci_conn *hci_connect_sco(struct hci_dev *hdev, int type, bdaddr_t *dst,
+ __u16 setting, struct bt_codec *codec);
+ struct hci_conn *hci_bind_cis(struct hci_dev *hdev, bdaddr_t *dst,
+ __u8 dst_type, struct bt_iso_qos *qos);
++struct hci_conn *hci_bind_bis(struct hci_dev *hdev, bdaddr_t *dst,
++ struct bt_iso_qos *qos,
++ __u8 base_len, __u8 *base);
+ struct hci_conn *hci_connect_cis(struct hci_dev *hdev, bdaddr_t *dst,
+ __u8 dst_type, struct bt_iso_qos *qos);
+ struct hci_conn *hci_connect_bis(struct hci_dev *hdev, bdaddr_t *dst,
+@@ -1715,7 +1772,9 @@ void hci_conn_del_sysfs(struct hci_conn *conn);
+ #define scan_2m(dev) (((dev)->le_tx_def_phys & HCI_LE_SET_PHY_2M) || \
+ ((dev)->le_rx_def_phys & HCI_LE_SET_PHY_2M))
+
+-#define le_coded_capable(dev) (((dev)->le_features[1] & HCI_LE_PHY_CODED))
++#define le_coded_capable(dev) (((dev)->le_features[1] & HCI_LE_PHY_CODED) && \
++ !test_bit(HCI_QUIRK_BROKEN_LE_CODED, \
++ &(dev)->quirks))
+
+ #define scan_coded(dev) (((dev)->le_tx_def_phys & HCI_LE_SET_PHY_CODED) || \
+ ((dev)->le_rx_def_phys & HCI_LE_SET_PHY_CODED))
+diff --git a/include/net/bluetooth/hci_sync.h b/include/net/bluetooth/hci_sync.h
+index 2495be4d8b828..b516a0f4a55b8 100644
+--- a/include/net/bluetooth/hci_sync.h
++++ b/include/net/bluetooth/hci_sync.h
+@@ -124,7 +124,7 @@ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason);
+
+ int hci_le_create_conn_sync(struct hci_dev *hdev, struct hci_conn *conn);
+
+-int hci_le_create_cis_sync(struct hci_dev *hdev, struct hci_conn *conn);
++int hci_le_create_cis_sync(struct hci_dev *hdev);
+
+ int hci_le_remove_cig_sync(struct hci_dev *hdev, u8 handle);
+
+diff --git a/include/net/lwtunnel.h b/include/net/lwtunnel.h
+index 6f15e6fa154e6..53bd2d02a4f0d 100644
+--- a/include/net/lwtunnel.h
++++ b/include/net/lwtunnel.h
+@@ -16,9 +16,12 @@
+ #define LWTUNNEL_STATE_INPUT_REDIRECT BIT(1)
+ #define LWTUNNEL_STATE_XMIT_REDIRECT BIT(2)
+
++/* LWTUNNEL_XMIT_CONTINUE should be distinguishable from dst_output return
++ * values (NET_XMIT_xxx and NETDEV_TX_xxx in linux/netdevice.h) for safety.
++ */
+ enum {
+ LWTUNNEL_XMIT_DONE,
+- LWTUNNEL_XMIT_CONTINUE,
++ LWTUNNEL_XMIT_CONTINUE = 0x100,
+ };
+
+
+diff --git a/include/net/mac80211.h b/include/net/mac80211.h
+index 67d81f7186660..52b336ada480c 100644
+--- a/include/net/mac80211.h
++++ b/include/net/mac80211.h
+@@ -1192,9 +1192,11 @@ struct ieee80211_tx_info {
+ u8 ampdu_ack_len;
+ u8 ampdu_len;
+ u8 antenna;
++ u8 pad;
+ u16 tx_time;
+ u8 flags;
+- void *status_driver_data[18 / sizeof(void *)];
++ u8 pad2;
++ void *status_driver_data[16 / sizeof(void *)];
+ } status;
+ struct {
+ struct ieee80211_tx_rate driver_rates[
+diff --git a/include/net/tcp.h b/include/net/tcp.h
+index 182337a8cf94a..ca6435f5d821e 100644
+--- a/include/net/tcp.h
++++ b/include/net/tcp.h
+@@ -355,7 +355,6 @@ ssize_t tcp_splice_read(struct socket *sk, loff_t *ppos,
+ struct sk_buff *tcp_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp,
+ bool force_schedule);
+
+-void tcp_enter_quickack_mode(struct sock *sk, unsigned int max_quickacks);
+ static inline void tcp_dec_quickack_mode(struct sock *sk,
+ const unsigned int pkts)
+ {
+diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
+index 0f29799efa021..6b90b476a03cb 100644
+--- a/include/scsi/scsi_host.h
++++ b/include/scsi/scsi_host.h
+@@ -763,7 +763,7 @@ extern void scsi_remove_host(struct Scsi_Host *);
+ extern struct Scsi_Host *scsi_host_get(struct Scsi_Host *);
+ extern int scsi_host_busy(struct Scsi_Host *shost);
+ extern void scsi_host_put(struct Scsi_Host *t);
+-extern struct Scsi_Host *scsi_host_lookup(unsigned short);
++extern struct Scsi_Host *scsi_host_lookup(unsigned int hostnum);
+ extern const char *scsi_host_state_name(enum scsi_host_state);
+ extern void scsi_host_complete_all_commands(struct Scsi_Host *shost,
+ enum scsi_host_status status);
+diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
+index 7e42a5b7558bf..ff0a931833e25 100644
+--- a/include/uapi/linux/sync_file.h
++++ b/include/uapi/linux/sync_file.h
+@@ -56,7 +56,7 @@ struct sync_fence_info {
+ * @name: name of fence
+ * @status: status of fence. 1: signaled 0:active <0:error
+ * @flags: sync_file_info flags
+- * @num_fences number of fences in the sync_file
++ * @num_fences: number of fences in the sync_file
+ * @pad: padding for 64-bit alignment, should always be zero
+ * @sync_fence_info: pointer to array of struct &sync_fence_info with all
+ * fences in the sync_file
+diff --git a/include/ufs/ufs.h b/include/ufs/ufs.h
+index 4e8d6240e589b..af5f5e588d5f4 100644
+--- a/include/ufs/ufs.h
++++ b/include/ufs/ufs.h
+@@ -102,6 +102,12 @@ enum {
+ UPIU_CMD_FLAGS_READ = 0x40,
+ };
+
++/* UPIU response flags */
++enum {
++ UPIU_RSP_FLAG_UNDERFLOW = 0x20,
++ UPIU_RSP_FLAG_OVERFLOW = 0x40,
++};
++
+ /* UPIU Task Attributes */
+ enum {
+ UPIU_TASK_ATTR_SIMPLE = 0x00,
+diff --git a/init/Kconfig b/init/Kconfig
+index 32c24950c4ced..c70617066a4c8 100644
+--- a/init/Kconfig
++++ b/init/Kconfig
+@@ -629,6 +629,7 @@ config TASK_IO_ACCOUNTING
+
+ config PSI
+ bool "Pressure stall information tracking"
++ select KERNFS
+ help
+ Collect metrics that indicate how overcommitted the CPU, memory,
+ and IO capacity are in the system.
+diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
+index 399e9a15c38d6..2c03bc881edfd 100644
+--- a/io_uring/io-wq.c
++++ b/io_uring/io-wq.c
+@@ -174,6 +174,16 @@ static void io_worker_ref_put(struct io_wq *wq)
+ complete(&wq->worker_done);
+ }
+
++bool io_wq_worker_stopped(void)
++{
++ struct io_worker *worker = current->worker_private;
++
++ if (WARN_ON_ONCE(!io_wq_current_is_worker()))
++ return true;
++
++ return test_bit(IO_WQ_BIT_EXIT, &worker->wq->state);
++}
++
+ static void io_worker_cancel_cb(struct io_worker *worker)
+ {
+ struct io_wq_acct *acct = io_wq_get_acct(worker);
+@@ -1285,13 +1295,16 @@ static int io_wq_cpu_offline(unsigned int cpu, struct hlist_node *node)
+ return __io_wq_cpu_online(wq, cpu, false);
+ }
+
+-int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask)
++int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask)
+ {
++ if (!tctx || !tctx->io_wq)
++ return -EINVAL;
++
+ rcu_read_lock();
+ if (mask)
+- cpumask_copy(wq->cpu_mask, mask);
++ cpumask_copy(tctx->io_wq->cpu_mask, mask);
+ else
+- cpumask_copy(wq->cpu_mask, cpu_possible_mask);
++ cpumask_copy(tctx->io_wq->cpu_mask, cpu_possible_mask);
+ rcu_read_unlock();
+
+ return 0;
+diff --git a/io_uring/io-wq.h b/io_uring/io-wq.h
+index 31228426d1924..2b2a6406dd8ee 100644
+--- a/io_uring/io-wq.h
++++ b/io_uring/io-wq.h
+@@ -50,8 +50,9 @@ void io_wq_put_and_exit(struct io_wq *wq);
+ void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work);
+ void io_wq_hash_work(struct io_wq_work *work, void *val);
+
+-int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
++int io_wq_cpu_affinity(struct io_uring_task *tctx, cpumask_var_t mask);
+ int io_wq_max_workers(struct io_wq *wq, int *new_count);
++bool io_wq_worker_stopped(void);
+
+ static inline bool io_wq_is_hashed(struct io_wq_work *work)
+ {
+diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
+index a57bdf336ca8a..d3b36197087a5 100644
+--- a/io_uring/io_uring.c
++++ b/io_uring/io_uring.c
+@@ -231,7 +231,6 @@ static inline void req_fail_link_node(struct io_kiocb *req, int res)
+ static inline void io_req_add_to_cache(struct io_kiocb *req, struct io_ring_ctx *ctx)
+ {
+ wq_stack_add_head(&req->comp_list, &ctx->submit_state.free_list);
+- kasan_poison_object_data(req_cachep, req);
+ }
+
+ static __cold void io_ring_ctx_ref_free(struct percpu_ref *ref)
+@@ -1690,6 +1689,9 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
+ break;
+ nr_events += ret;
+ ret = 0;
++
++ if (task_sigpending(current))
++ return -EINTR;
+ } while (nr_events < min && !need_resched());
+
+ return ret;
+@@ -2048,6 +2050,8 @@ fail:
+ if (!needs_poll) {
+ if (!(req->ctx->flags & IORING_SETUP_IOPOLL))
+ break;
++ if (io_wq_worker_stopped())
++ break;
+ cond_resched();
+ continue;
+ }
+@@ -2468,7 +2472,9 @@ static bool io_get_sqe(struct io_ring_ctx *ctx, const struct io_uring_sqe **sqe)
+ }
+
+ /* drop invalid entries */
++ spin_lock(&ctx->completion_lock);
+ ctx->cq_extra--;
++ spin_unlock(&ctx->completion_lock);
+ WRITE_ONCE(ctx->rings->sq_dropped,
+ READ_ONCE(ctx->rings->sq_dropped) + 1);
+ return false;
+@@ -4173,16 +4179,28 @@ static int io_register_enable_rings(struct io_ring_ctx *ctx)
+ return 0;
+ }
+
++static __cold int __io_register_iowq_aff(struct io_ring_ctx *ctx,
++ cpumask_var_t new_mask)
++{
++ int ret;
++
++ if (!(ctx->flags & IORING_SETUP_SQPOLL)) {
++ ret = io_wq_cpu_affinity(current->io_uring, new_mask);
++ } else {
++ mutex_unlock(&ctx->uring_lock);
++ ret = io_sqpoll_wq_cpu_affinity(ctx, new_mask);
++ mutex_lock(&ctx->uring_lock);
++ }
++
++ return ret;
++}
++
+ static __cold int io_register_iowq_aff(struct io_ring_ctx *ctx,
+ void __user *arg, unsigned len)
+ {
+- struct io_uring_task *tctx = current->io_uring;
+ cpumask_var_t new_mask;
+ int ret;
+
+- if (!tctx || !tctx->io_wq)
+- return -EINVAL;
+-
+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
+ return -ENOMEM;
+
+@@ -4203,19 +4221,14 @@ static __cold int io_register_iowq_aff(struct io_ring_ctx *ctx,
+ return -EFAULT;
+ }
+
+- ret = io_wq_cpu_affinity(tctx->io_wq, new_mask);
++ ret = __io_register_iowq_aff(ctx, new_mask);
+ free_cpumask_var(new_mask);
+ return ret;
+ }
+
+ static __cold int io_unregister_iowq_aff(struct io_ring_ctx *ctx)
+ {
+- struct io_uring_task *tctx = current->io_uring;
+-
+- if (!tctx || !tctx->io_wq)
+- return -EINVAL;
+-
+- return io_wq_cpu_affinity(tctx->io_wq, NULL);
++ return __io_register_iowq_aff(ctx, NULL);
+ }
+
+ static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx,
+diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
+index 259bf798a390e..97cfb3f2f06d0 100644
+--- a/io_uring/io_uring.h
++++ b/io_uring/io_uring.h
+@@ -361,7 +361,6 @@ static inline struct io_kiocb *io_extract_req(struct io_ring_ctx *ctx)
+ struct io_kiocb *req;
+
+ req = container_of(ctx->submit_state.free_list.next, struct io_kiocb, comp_list);
+- kasan_unpoison_object_data(req_cachep, req);
+ wq_stack_extract(&ctx->submit_state.free_list);
+ return req;
+ }
+diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
+index 5e329e3cd4706..bd6c2c7959a5b 100644
+--- a/io_uring/sqpoll.c
++++ b/io_uring/sqpoll.c
+@@ -421,3 +421,20 @@ err:
+ io_sq_thread_finish(ctx);
+ return ret;
+ }
++
++__cold int io_sqpoll_wq_cpu_affinity(struct io_ring_ctx *ctx,
++ cpumask_var_t mask)
++{
++ struct io_sq_data *sqd = ctx->sq_data;
++ int ret = -EINVAL;
++
++ if (sqd) {
++ io_sq_thread_park(sqd);
++ /* Don't set affinity for a dying thread */
++ if (sqd->thread)
++ ret = io_wq_cpu_affinity(sqd->thread->io_uring, mask);
++ io_sq_thread_unpark(sqd);
++ }
++
++ return ret;
++}
+diff --git a/io_uring/sqpoll.h b/io_uring/sqpoll.h
+index e1b8d508d22d1..8df37e8c91493 100644
+--- a/io_uring/sqpoll.h
++++ b/io_uring/sqpoll.h
+@@ -27,3 +27,4 @@ void io_sq_thread_park(struct io_sq_data *sqd);
+ void io_sq_thread_unpark(struct io_sq_data *sqd);
+ void io_put_sq_data(struct io_sq_data *sqd);
+ void io_sqpoll_wait_sq(struct io_ring_ctx *ctx);
++int io_sqpoll_wq_cpu_affinity(struct io_ring_ctx *ctx, cpumask_var_t mask);
+diff --git a/kernel/auditsc.c b/kernel/auditsc.c
+index addeed3df15d3..8dfd581cd5543 100644
+--- a/kernel/auditsc.c
++++ b/kernel/auditsc.c
+@@ -2456,6 +2456,8 @@ void __audit_inode_child(struct inode *parent,
+ }
+ }
+
++ cond_resched();
++
+ /* is there a matching child entry? */
+ list_for_each_entry(n, &context->names_list, list) {
+ /* can only match entries that have a name */
+diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
+index 8b4e92439d1d6..9976149ec42f0 100644
+--- a/kernel/bpf/btf.c
++++ b/kernel/bpf/btf.c
+@@ -6126,7 +6126,6 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf,
+ const char *tname, *mname, *tag_value;
+ u32 vlen, elem_id, mid;
+
+- *flag = 0;
+ again:
+ tname = __btf_name_by_offset(btf, t->name_off);
+ if (!btf_type_is_struct(t)) {
+@@ -6135,6 +6134,14 @@ again:
+ }
+
+ vlen = btf_type_vlen(t);
++ if (BTF_INFO_KIND(t->info) == BTF_KIND_UNION && vlen != 1 && !(*flag & PTR_UNTRUSTED))
++ /*
++ * walking unions yields untrusted pointers
++ * with exception of __bpf_md_ptr and other
++ * unions with a single member
++ */
++ *flag |= PTR_UNTRUSTED;
++
+ if (off + size > t->size) {
+ /* If the last element is a variable size array, we may
+ * need to relax the rule.
+@@ -6295,15 +6302,6 @@ error:
+ * of this field or inside of this struct
+ */
+ if (btf_type_is_struct(mtype)) {
+- if (BTF_INFO_KIND(mtype->info) == BTF_KIND_UNION &&
+- btf_type_vlen(mtype) != 1)
+- /*
+- * walking unions yields untrusted pointers
+- * with exception of __bpf_md_ptr and other
+- * unions with a single member
+- */
+- *flag |= PTR_UNTRUSTED;
+-
+ /* our field must be inside that union or struct */
+ t = mtype;
+
+@@ -6361,7 +6359,7 @@ error:
+ * that also allows using an array of int as a scratch
+ * space. e.g. skb->cb[].
+ */
+- if (off + size > mtrue_end) {
++ if (off + size > mtrue_end && !(*flag & PTR_UNTRUSTED)) {
+ bpf_log(log,
+ "access beyond the end of member %s (mend:%u) in struct %s with off %u size %u\n",
+ mname, mtrue_end, tname, off, size);
+@@ -6469,7 +6467,7 @@ bool btf_struct_ids_match(struct bpf_verifier_log *log,
+ bool strict)
+ {
+ const struct btf_type *type;
+- enum bpf_type_flag flag;
++ enum bpf_type_flag flag = 0;
+ int err;
+
+ /* Are we already done? */
+diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
+index f12565ba136b0..8c5daa841704b 100644
+--- a/kernel/bpf/helpers.c
++++ b/kernel/bpf/helpers.c
+@@ -2218,7 +2218,7 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
+ case BPF_DYNPTR_TYPE_XDP:
+ {
+ void *xdp_ptr = bpf_xdp_pointer(ptr->data, ptr->offset + offset, len);
+- if (xdp_ptr)
++ if (!IS_ERR_OR_NULL(xdp_ptr))
+ return xdp_ptr;
+
+ bpf_xdp_copy_buf(ptr->data, ptr->offset + offset, buffer, len, false);
+diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
+index 4fbfe1d086467..ae391e8256551 100644
+--- a/kernel/bpf/verifier.c
++++ b/kernel/bpf/verifier.c
+@@ -4790,20 +4790,22 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
+ struct bpf_reg_state *reg, u32 regno)
+ {
+ const char *targ_name = btf_type_name(kptr_field->kptr.btf, kptr_field->kptr.btf_id);
+- int perm_flags = PTR_MAYBE_NULL | PTR_TRUSTED | MEM_RCU;
++ int perm_flags;
+ const char *reg_name = "";
+
+- /* Only unreferenced case accepts untrusted pointers */
+- if (kptr_field->type == BPF_KPTR_UNREF)
+- perm_flags |= PTR_UNTRUSTED;
++ if (btf_is_kernel(reg->btf)) {
++ perm_flags = PTR_MAYBE_NULL | PTR_TRUSTED | MEM_RCU;
++
++ /* Only unreferenced case accepts untrusted pointers */
++ if (kptr_field->type == BPF_KPTR_UNREF)
++ perm_flags |= PTR_UNTRUSTED;
++ } else {
++ perm_flags = PTR_MAYBE_NULL | MEM_ALLOC;
++ }
+
+ if (base_type(reg->type) != PTR_TO_BTF_ID || (type_flag(reg->type) & ~perm_flags))
+ goto bad_type;
+
+- if (!btf_is_kernel(reg->btf)) {
+- verbose(env, "R%d must point to kernel BTF\n", regno);
+- return -EINVAL;
+- }
+ /* We need to verify reg->type and reg->btf, before accessing reg->btf */
+ reg_name = btf_type_name(reg->btf, reg->btf_id);
+
+@@ -4816,7 +4818,7 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
+ if (__check_ptr_off_reg(env, reg, regno, true))
+ return -EACCES;
+
+- /* A full type match is needed, as BTF can be vmlinux or module BTF, and
++ /* A full type match is needed, as BTF can be vmlinux, module or prog BTF, and
+ * we also need to take into account the reg->off.
+ *
+ * We want to support cases like:
+@@ -5893,6 +5895,11 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
+ type_is_rcu_or_null(env, reg, field_name, btf_id)) {
+ /* __rcu tagged pointers can be NULL */
+ flag |= MEM_RCU | PTR_MAYBE_NULL;
++
++ /* We always trust them */
++ if (type_is_rcu_or_null(env, reg, field_name, btf_id) &&
++ flag & PTR_UNTRUSTED)
++ flag &= ~PTR_UNTRUSTED;
+ } else if (flag & (MEM_PERCPU | MEM_USER)) {
+ /* keep as-is */
+ } else {
+@@ -7549,7 +7556,10 @@ found:
+ verbose(env, "verifier internal error: unimplemented handling of MEM_ALLOC\n");
+ return -EFAULT;
+ }
+- /* Handled by helper specific checks */
++ if (meta->func_id == BPF_FUNC_kptr_xchg) {
++ if (map_kptr_match_type(env, meta->kptr_field, reg, regno))
++ return -EACCES;
++ }
+ break;
+ case PTR_TO_BTF_ID | MEM_PERCPU:
+ case PTR_TO_BTF_ID | MEM_PERCPU | PTR_TRUSTED:
+@@ -7601,17 +7611,6 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
+ if (arg_type_is_dynptr(arg_type) && type == PTR_TO_STACK)
+ return 0;
+
+- if ((type_is_ptr_alloc_obj(type) || type_is_non_owning_ref(type)) && reg->off) {
+- if (reg_find_field_offset(reg, reg->off, BPF_GRAPH_NODE_OR_ROOT))
+- return __check_ptr_off_reg(env, reg, regno, true);
+-
+- verbose(env, "R%d must have zero offset when passed to release func\n",
+- regno);
+- verbose(env, "No graph node or root found at R%d type:%s off:%d\n", regno,
+- btf_type_name(reg->btf, reg->btf_id), reg->off);
+- return -EINVAL;
+- }
+-
+ /* Doing check_ptr_off_reg check for the offset will catch this
+ * because fixed_off_ok is false, but checking here allows us
+ * to give the user a better error message.
+@@ -13598,6 +13597,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
+ return -EINVAL;
+ }
+
++ /* check src2 operand */
++ err = check_reg_arg(env, insn->dst_reg, SRC_OP);
++ if (err)
++ return err;
++
++ dst_reg = &regs[insn->dst_reg];
+ if (BPF_SRC(insn->code) == BPF_X) {
+ if (insn->imm != 0) {
+ verbose(env, "BPF_JMP/JMP32 uses reserved fields\n");
+@@ -13609,12 +13614,13 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
+ if (err)
+ return err;
+
+- if (is_pointer_value(env, insn->src_reg)) {
++ src_reg = &regs[insn->src_reg];
++ if (!(reg_is_pkt_pointer_any(dst_reg) && reg_is_pkt_pointer_any(src_reg)) &&
++ is_pointer_value(env, insn->src_reg)) {
+ verbose(env, "R%d pointer comparison prohibited\n",
+ insn->src_reg);
+ return -EACCES;
+ }
+- src_reg = &regs[insn->src_reg];
+ } else {
+ if (insn->src_reg != BPF_REG_0) {
+ verbose(env, "BPF_JMP/JMP32 uses reserved fields\n");
+@@ -13622,12 +13628,6 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
+ }
+ }
+
+- /* check src2 operand */
+- err = check_reg_arg(env, insn->dst_reg, SRC_OP);
+- if (err)
+- return err;
+-
+- dst_reg = &regs[insn->dst_reg];
+ is_jmp32 = BPF_CLASS(insn->code) == BPF_JMP32;
+
+ if (BPF_SRC(insn->code) == BPF_K) {
+diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
+index 2c76fcd9f0bcb..7c5153273c1a8 100644
+--- a/kernel/cgroup/cpuset.c
++++ b/kernel/cgroup/cpuset.c
+@@ -1611,11 +1611,16 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
+ }
+
+ /*
+- * Skip the whole subtree if the cpumask remains the same
+- * and has no partition root state and force flag not set.
++ * Skip the whole subtree if
++ * 1) the cpumask remains the same,
++ * 2) has no partition root state,
++ * 3) force flag not set, and
++ * 4) for v2 load balance state same as its parent.
+ */
+ if (!cp->partition_root_state && !force &&
+- cpumask_equal(tmp->new_cpus, cp->effective_cpus)) {
++ cpumask_equal(tmp->new_cpus, cp->effective_cpus) &&
++ (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) ||
++ (is_sched_load_balance(parent) == is_sched_load_balance(cp)))) {
+ pos_css = css_rightmost_descendant(pos_css);
+ continue;
+ }
+@@ -1698,6 +1703,20 @@ update_parent_subparts:
+
+ update_tasks_cpumask(cp, tmp->new_cpus);
+
++ /*
++ * On default hierarchy, inherit the CS_SCHED_LOAD_BALANCE
++ * from parent if current cpuset isn't a valid partition root
++ * and their load balance states differ.
++ */
++ if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
++ !is_partition_valid(cp) &&
++ (is_sched_load_balance(parent) != is_sched_load_balance(cp))) {
++ if (is_sched_load_balance(parent))
++ set_bit(CS_SCHED_LOAD_BALANCE, &cp->flags);
++ else
++ clear_bit(CS_SCHED_LOAD_BALANCE, &cp->flags);
++ }
++
+ /*
+ * On legacy hierarchy, if the effective cpumask of any non-
+ * empty cpuset is changed, we need to rebuild sched domains.
+@@ -3245,6 +3264,14 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
+ cs->use_parent_ecpus = true;
+ parent->child_ecpus_count++;
+ }
++
++ /*
++ * For v2, clear CS_SCHED_LOAD_BALANCE if parent is isolated
++ */
++ if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
++ !is_sched_load_balance(parent))
++ clear_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
++
+ spin_unlock_irq(&callback_lock);
+
+ if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
+diff --git a/kernel/cpu.c b/kernel/cpu.c
+index f4a2c5845bcbd..20b3817cdf56c 100644
+--- a/kernel/cpu.c
++++ b/kernel/cpu.c
+@@ -1215,8 +1215,22 @@ out:
+ return ret;
+ }
+
++struct cpu_down_work {
++ unsigned int cpu;
++ enum cpuhp_state target;
++};
++
++static long __cpu_down_maps_locked(void *arg)
++{
++ struct cpu_down_work *work = arg;
++
++ return _cpu_down(work->cpu, 0, work->target);
++}
++
+ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
+ {
++ struct cpu_down_work work = { .cpu = cpu, .target = target, };
++
+ /*
+ * If the platform does not support hotplug, report it explicitly to
+ * differentiate it from a transient offlining failure.
+@@ -1225,7 +1239,15 @@ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
+ return -EOPNOTSUPP;
+ if (cpu_hotplug_disabled)
+ return -EBUSY;
+- return _cpu_down(cpu, 0, target);
++
++ /*
++ * Ensure that the control task does not run on the to be offlined
++ * CPU to prevent a deadlock against cfs_b->period_timer.
++ */
++ cpu = cpumask_any_but(cpu_online_mask, cpu);
++ if (cpu >= nr_cpu_ids)
++ return -EBUSY;
++ return work_on_cpu(cpu, __cpu_down_maps_locked, &work);
+ }
+
+ static int cpu_down(unsigned int cpu, enum cpuhp_state target)
+diff --git a/kernel/kprobes.c b/kernel/kprobes.c
+index 00e177de91ccd..3da9726232ff9 100644
+--- a/kernel/kprobes.c
++++ b/kernel/kprobes.c
+@@ -1545,6 +1545,17 @@ static int check_ftrace_location(struct kprobe *p)
+ return 0;
+ }
+
++static bool is_cfi_preamble_symbol(unsigned long addr)
++{
++ char symbuf[KSYM_NAME_LEN];
++
++ if (lookup_symbol_name(addr, symbuf))
++ return false;
++
++ return str_has_prefix("__cfi_", symbuf) ||
++ str_has_prefix("__pfx_", symbuf);
++}
++
+ static int check_kprobe_address_safe(struct kprobe *p,
+ struct module **probed_mod)
+ {
+@@ -1563,7 +1574,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
+ within_kprobe_blacklist((unsigned long) p->addr) ||
+ jump_label_text_reserved(p->addr, p->addr) ||
+ static_call_text_reserved(p->addr, p->addr) ||
+- find_bug((unsigned long)p->addr)) {
++ find_bug((unsigned long)p->addr) ||
++ is_cfi_preamble_symbol((unsigned long)p->addr)) {
+ ret = -EINVAL;
+ goto out;
+ }
+diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
+index 2dc4d5a1f1ff8..fde338606ce83 100644
+--- a/kernel/printk/printk_ringbuffer.c
++++ b/kernel/printk/printk_ringbuffer.c
+@@ -1735,7 +1735,7 @@ static bool copy_data(struct prb_data_ring *data_ring,
+ if (!buf || !buf_size)
+ return true;
+
+- data_size = min_t(u16, buf_size, len);
++ data_size = min_t(unsigned int, buf_size, len);
+
+ memcpy(&buf[0], data, data_size); /* LMM(copy_data:A) */
+ return true;
+diff --git a/kernel/rcu/refscale.c b/kernel/rcu/refscale.c
+index 1970ce5f22d40..71d138573856f 100644
+--- a/kernel/rcu/refscale.c
++++ b/kernel/rcu/refscale.c
+@@ -1107,12 +1107,11 @@ ref_scale_init(void)
+ VERBOSE_SCALEOUT("Starting %d reader threads", nreaders);
+
+ for (i = 0; i < nreaders; i++) {
++ init_waitqueue_head(&reader_tasks[i].wq);
+ firsterr = torture_create_kthread(ref_scale_reader, (void *)i,
+ reader_tasks[i].task);
+ if (torture_init_error(firsterr))
+ goto unwind;
+-
+- init_waitqueue_head(&(reader_tasks[i].wq));
+ }
+
+ // Main Task
+diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
+index 00e0e50741153..185d3d749f6b6 100644
+--- a/kernel/sched/rt.c
++++ b/kernel/sched/rt.c
+@@ -25,7 +25,7 @@ unsigned int sysctl_sched_rt_period = 1000000;
+ int sysctl_sched_rt_runtime = 950000;
+
+ #ifdef CONFIG_SYSCTL
+-static int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
++static int sysctl_sched_rr_timeslice = (MSEC_PER_SEC * RR_TIMESLICE) / HZ;
+ static int sched_rt_handler(struct ctl_table *table, int write, void *buffer,
+ size_t *lenp, loff_t *ppos);
+ static int sched_rr_handler(struct ctl_table *table, int write, void *buffer,
+diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
+index 91836b727cef5..0600e16dbafef 100644
+--- a/kernel/time/clocksource.c
++++ b/kernel/time/clocksource.c
+@@ -473,8 +473,8 @@ static void clocksource_watchdog(struct timer_list *unused)
+ /* Check the deviation from the watchdog clocksource. */
+ md = cs->uncertainty_margin + watchdog->uncertainty_margin;
+ if (abs(cs_nsec - wd_nsec) > md) {
+- u64 cs_wd_msec;
+- u64 wd_msec;
++ s64 cs_wd_msec;
++ s64 wd_msec;
+ u32 wd_rem;
+
+ pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
+@@ -483,8 +483,8 @@ static void clocksource_watchdog(struct timer_list *unused)
+ watchdog->name, wd_nsec, wdnow, wdlast, watchdog->mask);
+ pr_warn(" '%s' cs_nsec: %lld cs_now: %llx cs_last: %llx mask: %llx\n",
+ cs->name, cs_nsec, csnow, cslast, cs->mask);
+- cs_wd_msec = div_u64_rem(cs_nsec - wd_nsec, 1000U * 1000U, &wd_rem);
+- wd_msec = div_u64_rem(wd_nsec, 1000U * 1000U, &wd_rem);
++ cs_wd_msec = div_s64_rem(cs_nsec - wd_nsec, 1000 * 1000, &wd_rem);
++ wd_msec = div_s64_rem(wd_nsec, 1000 * 1000, &wd_rem);
+ pr_warn(" Clocksource '%s' skewed %lld ns (%lld ms) over watchdog '%s' interval of %lld ns (%lld ms)\n",
+ cs->name, cs_nsec - wd_nsec, cs_wd_msec, watchdog->name, wd_nsec, wd_msec);
+ if (curr_clocksource == cs)
+diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
+index 4df14db4da490..87015e9deacc9 100644
+--- a/kernel/time/tick-sched.c
++++ b/kernel/time/tick-sched.c
+@@ -1045,7 +1045,7 @@ static bool report_idle_softirq(void)
+ return false;
+
+ /* On RT, softirqs handling may be waiting on some lock */
+- if (!local_bh_blocked())
++ if (local_bh_blocked())
+ return false;
+
+ pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",
+diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
+index a53524f3f7d82..3d8d5c383dfe5 100644
+--- a/kernel/trace/bpf_trace.c
++++ b/kernel/trace/bpf_trace.c
+@@ -2391,7 +2391,7 @@ int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
+ #ifdef CONFIG_UPROBE_EVENTS
+ if (flags & TRACE_EVENT_FL_UPROBE)
+ err = bpf_get_uprobe_info(event, fd_type, buf,
+- probe_offset,
++ probe_offset, probe_addr,
+ event->attr.type == PERF_TYPE_TRACEPOINT);
+ #endif
+ }
+diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
+index f4855be6ac2b5..146b1c8d7e449 100644
+--- a/kernel/trace/trace.c
++++ b/kernel/trace/trace.c
+@@ -6690,10 +6690,36 @@ tracing_max_lat_write(struct file *filp, const char __user *ubuf,
+
+ #endif
+
++static int open_pipe_on_cpu(struct trace_array *tr, int cpu)
++{
++ if (cpu == RING_BUFFER_ALL_CPUS) {
++ if (cpumask_empty(tr->pipe_cpumask)) {
++ cpumask_setall(tr->pipe_cpumask);
++ return 0;
++ }
++ } else if (!cpumask_test_cpu(cpu, tr->pipe_cpumask)) {
++ cpumask_set_cpu(cpu, tr->pipe_cpumask);
++ return 0;
++ }
++ return -EBUSY;
++}
++
++static void close_pipe_on_cpu(struct trace_array *tr, int cpu)
++{
++ if (cpu == RING_BUFFER_ALL_CPUS) {
++ WARN_ON(!cpumask_full(tr->pipe_cpumask));
++ cpumask_clear(tr->pipe_cpumask);
++ } else {
++ WARN_ON(!cpumask_test_cpu(cpu, tr->pipe_cpumask));
++ cpumask_clear_cpu(cpu, tr->pipe_cpumask);
++ }
++}
++
+ static int tracing_open_pipe(struct inode *inode, struct file *filp)
+ {
+ struct trace_array *tr = inode->i_private;
+ struct trace_iterator *iter;
++ int cpu;
+ int ret;
+
+ ret = tracing_check_open_get_tr(tr);
+@@ -6701,13 +6727,16 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
+ return ret;
+
+ mutex_lock(&trace_types_lock);
++ cpu = tracing_get_cpu(inode);
++ ret = open_pipe_on_cpu(tr, cpu);
++ if (ret)
++ goto fail_pipe_on_cpu;
+
+ /* create a buffer to store the information to pass to userspace */
+ iter = kzalloc(sizeof(*iter), GFP_KERNEL);
+ if (!iter) {
+ ret = -ENOMEM;
+- __trace_array_put(tr);
+- goto out;
++ goto fail_alloc_iter;
+ }
+
+ trace_seq_init(&iter->seq);
+@@ -6730,7 +6759,7 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
+
+ iter->tr = tr;
+ iter->array_buffer = &tr->array_buffer;
+- iter->cpu_file = tracing_get_cpu(inode);
++ iter->cpu_file = cpu;
+ mutex_init(&iter->mutex);
+ filp->private_data = iter;
+
+@@ -6740,12 +6769,15 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
+ nonseekable_open(inode, filp);
+
+ tr->trace_ref++;
+-out:
++
+ mutex_unlock(&trace_types_lock);
+ return ret;
+
+ fail:
+ kfree(iter);
++fail_alloc_iter:
++ close_pipe_on_cpu(tr, cpu);
++fail_pipe_on_cpu:
+ __trace_array_put(tr);
+ mutex_unlock(&trace_types_lock);
+ return ret;
+@@ -6762,7 +6794,7 @@ static int tracing_release_pipe(struct inode *inode, struct file *file)
+
+ if (iter->trace->pipe_close)
+ iter->trace->pipe_close(iter);
+-
++ close_pipe_on_cpu(tr, iter->cpu_file);
+ mutex_unlock(&trace_types_lock);
+
+ free_cpumask_var(iter->started);
+@@ -7558,6 +7590,11 @@ out:
+ return ret;
+ }
+
++static void tracing_swap_cpu_buffer(void *tr)
++{
++ update_max_tr_single((struct trace_array *)tr, current, smp_processor_id());
++}
++
+ static ssize_t
+ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
+ loff_t *ppos)
+@@ -7616,13 +7653,15 @@ tracing_snapshot_write(struct file *filp, const char __user *ubuf, size_t cnt,
+ ret = tracing_alloc_snapshot_instance(tr);
+ if (ret < 0)
+ break;
+- local_irq_disable();
+ /* Now, we're going to swap */
+- if (iter->cpu_file == RING_BUFFER_ALL_CPUS)
++ if (iter->cpu_file == RING_BUFFER_ALL_CPUS) {
++ local_irq_disable();
+ update_max_tr(tr, current, smp_processor_id(), NULL);
+- else
+- update_max_tr_single(tr, current, iter->cpu_file);
+- local_irq_enable();
++ local_irq_enable();
++ } else {
++ smp_call_function_single(iter->cpu_file, tracing_swap_cpu_buffer,
++ (void *)tr, 1);
++ }
+ break;
+ default:
+ if (tr->allocated_snapshot) {
+@@ -9426,6 +9465,9 @@ static struct trace_array *trace_array_create(const char *name)
+ if (!alloc_cpumask_var(&tr->tracing_cpumask, GFP_KERNEL))
+ goto out_free_tr;
+
++ if (!zalloc_cpumask_var(&tr->pipe_cpumask, GFP_KERNEL))
++ goto out_free_tr;
++
+ tr->trace_flags = global_trace.trace_flags & ~ZEROED_TRACE_FLAGS;
+
+ cpumask_copy(tr->tracing_cpumask, cpu_all_mask);
+@@ -9467,6 +9509,7 @@ static struct trace_array *trace_array_create(const char *name)
+ out_free_tr:
+ ftrace_free_ftrace_ops(tr);
+ free_trace_buffers(tr);
++ free_cpumask_var(tr->pipe_cpumask);
+ free_cpumask_var(tr->tracing_cpumask);
+ kfree(tr->name);
+ kfree(tr);
+@@ -9569,6 +9612,7 @@ static int __remove_instance(struct trace_array *tr)
+ }
+ kfree(tr->topts);
+
++ free_cpumask_var(tr->pipe_cpumask);
+ free_cpumask_var(tr->tracing_cpumask);
+ kfree(tr->name);
+ kfree(tr);
+@@ -10366,12 +10410,14 @@ __init static int tracer_alloc_buffers(void)
+ if (trace_create_savedcmd() < 0)
+ goto out_free_temp_buffer;
+
++ if (!zalloc_cpumask_var(&global_trace.pipe_cpumask, GFP_KERNEL))
++ goto out_free_savedcmd;
++
+ /* TODO: make the number of buffers hot pluggable with CPUS */
+ if (allocate_trace_buffers(&global_trace, ring_buf_size) < 0) {
+ MEM_FAIL(1, "tracer: failed to allocate ring buffer!\n");
+- goto out_free_savedcmd;
++ goto out_free_pipe_cpumask;
+ }
+-
+ if (global_trace.buffer_disabled)
+ tracing_off();
+
+@@ -10424,6 +10470,8 @@ __init static int tracer_alloc_buffers(void)
+
+ return 0;
+
++out_free_pipe_cpumask:
++ free_cpumask_var(global_trace.pipe_cpumask);
+ out_free_savedcmd:
+ free_saved_cmdlines_buffer(savedcmd);
+ out_free_temp_buffer:
+diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
+index 2daeac8e690a6..b577f65a63f11 100644
+--- a/kernel/trace/trace.h
++++ b/kernel/trace/trace.h
+@@ -366,6 +366,8 @@ struct trace_array {
+ struct list_head events;
+ struct trace_event_file *trace_marker_file;
+ cpumask_var_t tracing_cpumask; /* only trace on set CPUs */
++ /* one per_cpu trace_pipe can be opened by only one user */
++ cpumask_var_t pipe_cpumask;
+ int ref;
+ int trace_ref;
+ #ifdef CONFIG_FUNCTION_TRACER
+diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
+index 2f37a6e68aa9f..b791524a6536a 100644
+--- a/kernel/trace/trace_hwlat.c
++++ b/kernel/trace/trace_hwlat.c
+@@ -635,7 +635,7 @@ static int s_mode_show(struct seq_file *s, void *v)
+ else
+ seq_printf(s, "%s", thread_mode_str[mode]);
+
+- if (mode != MODE_MAX)
++ if (mode < MODE_MAX - 1) /* if mode is any but last */
+ seq_puts(s, " ");
+
+ return 0;
+diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
+index 7b47e9a2c0102..9173fcfc03820 100644
+--- a/kernel/trace/trace_uprobe.c
++++ b/kernel/trace/trace_uprobe.c
+@@ -1416,7 +1416,7 @@ static void uretprobe_perf_func(struct trace_uprobe *tu, unsigned long func,
+
+ int bpf_get_uprobe_info(const struct perf_event *event, u32 *fd_type,
+ const char **filename, u64 *probe_offset,
+- bool perf_type_tracepoint)
++ u64 *probe_addr, bool perf_type_tracepoint)
+ {
+ const char *pevent = trace_event_name(event->tp_event);
+ const char *group = event->tp_event->class->system;
+@@ -1433,6 +1433,7 @@ int bpf_get_uprobe_info(const struct perf_event *event, u32 *fd_type,
+ : BPF_FD_TYPE_UPROBE;
+ *filename = tu->filename;
+ *probe_offset = tu->offset;
++ *probe_addr = 0;
+ return 0;
+ }
+ #endif /* CONFIG_PERF_EVENTS */
+diff --git a/lib/iov_iter.c b/lib/iov_iter.c
+index 061cc3ed58f5b..dac0ec7b9436e 100644
+--- a/lib/iov_iter.c
++++ b/lib/iov_iter.c
+@@ -2086,14 +2086,14 @@ static ssize_t iov_iter_extract_bvec_pages(struct iov_iter *i,
+ size_t *offset0)
+ {
+ struct page **p, *page;
+- size_t skip = i->iov_offset, offset;
++ size_t skip = i->iov_offset, offset, size;
+ int k;
+
+ for (;;) {
+ if (i->nr_segs == 0)
+ return 0;
+- maxsize = min(maxsize, i->bvec->bv_len - skip);
+- if (maxsize)
++ size = min(maxsize, i->bvec->bv_len - skip);
++ if (size)
+ break;
+ i->iov_offset = 0;
+ i->nr_segs--;
+@@ -2106,16 +2106,16 @@ static ssize_t iov_iter_extract_bvec_pages(struct iov_iter *i,
+ offset = skip % PAGE_SIZE;
+ *offset0 = offset;
+
+- maxpages = want_pages_array(pages, maxsize, offset, maxpages);
++ maxpages = want_pages_array(pages, size, offset, maxpages);
+ if (!maxpages)
+ return -ENOMEM;
+ p = *pages;
+ for (k = 0; k < maxpages; k++)
+ p[k] = page + k;
+
+- maxsize = min_t(size_t, maxsize, maxpages * PAGE_SIZE - offset);
+- iov_iter_advance(i, maxsize);
+- return maxsize;
++ size = min_t(size_t, size, maxpages * PAGE_SIZE - offset);
++ iov_iter_advance(i, size);
++ return size;
+ }
+
+ /*
+@@ -2130,14 +2130,14 @@ static ssize_t iov_iter_extract_kvec_pages(struct iov_iter *i,
+ {
+ struct page **p, *page;
+ const void *kaddr;
+- size_t skip = i->iov_offset, offset, len;
++ size_t skip = i->iov_offset, offset, len, size;
+ int k;
+
+ for (;;) {
+ if (i->nr_segs == 0)
+ return 0;
+- maxsize = min(maxsize, i->kvec->iov_len - skip);
+- if (maxsize)
++ size = min(maxsize, i->kvec->iov_len - skip);
++ if (size)
+ break;
+ i->iov_offset = 0;
+ i->nr_segs--;
+@@ -2149,13 +2149,13 @@ static ssize_t iov_iter_extract_kvec_pages(struct iov_iter *i,
+ offset = (unsigned long)kaddr & ~PAGE_MASK;
+ *offset0 = offset;
+
+- maxpages = want_pages_array(pages, maxsize, offset, maxpages);
++ maxpages = want_pages_array(pages, size, offset, maxpages);
+ if (!maxpages)
+ return -ENOMEM;
+ p = *pages;
+
+ kaddr -= offset;
+- len = offset + maxsize;
++ len = offset + size;
+ for (k = 0; k < maxpages; k++) {
+ size_t seg = min_t(size_t, len, PAGE_SIZE);
+
+@@ -2169,9 +2169,9 @@ static ssize_t iov_iter_extract_kvec_pages(struct iov_iter *i,
+ kaddr += PAGE_SIZE;
+ }
+
+- maxsize = min_t(size_t, maxsize, maxpages * PAGE_SIZE - offset);
+- iov_iter_advance(i, maxsize);
+- return maxsize;
++ size = min_t(size_t, size, maxpages * PAGE_SIZE - offset);
++ iov_iter_advance(i, size);
++ return size;
+ }
+
+ /*
+diff --git a/lib/sbitmap.c b/lib/sbitmap.c
+index eff4e42c425a4..d0a5081dfd122 100644
+--- a/lib/sbitmap.c
++++ b/lib/sbitmap.c
+@@ -550,7 +550,7 @@ EXPORT_SYMBOL_GPL(sbitmap_queue_min_shallow_depth);
+
+ static void __sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
+ {
+- int i, wake_index;
++ int i, wake_index, woken;
+
+ if (!atomic_read(&sbq->ws_active))
+ return;
+@@ -567,13 +567,12 @@ static void __sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
+ */
+ wake_index = sbq_index_inc(wake_index);
+
+- /*
+- * It is sufficient to wake up at least one waiter to
+- * guarantee forward progress.
+- */
+- if (waitqueue_active(&ws->wait) &&
+- wake_up_nr(&ws->wait, nr))
+- break;
++ if (waitqueue_active(&ws->wait)) {
++ woken = wake_up_nr(&ws->wait, nr);
++ if (woken == nr)
++ break;
++ nr -= woken;
++ }
+ }
+
+ if (wake_index != atomic_read(&sbq->wake_index))
+diff --git a/lib/xarray.c b/lib/xarray.c
+index 2071a3718f4ed..142e36f9dfda1 100644
+--- a/lib/xarray.c
++++ b/lib/xarray.c
+@@ -206,7 +206,7 @@ static void *xas_descend(struct xa_state *xas, struct xa_node *node)
+ void *entry = xa_entry(xas->xa, node, offset);
+
+ xas->xa_node = node;
+- if (xa_is_sibling(entry)) {
++ while (xa_is_sibling(entry)) {
+ offset = xa_to_sibling(entry);
+ entry = xa_entry(xas->xa, node, offset);
+ if (node->shift && xa_is_node(entry))
+diff --git a/mm/memfd.c b/mm/memfd.c
+index e763e76f11064..d65485c762def 100644
+--- a/mm/memfd.c
++++ b/mm/memfd.c
+@@ -268,11 +268,32 @@ long memfd_fcntl(struct file *file, unsigned int cmd, unsigned int arg)
+
+ #define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | MFD_NOEXEC_SEAL | MFD_EXEC)
+
++static int check_sysctl_memfd_noexec(unsigned int *flags)
++{
++#ifdef CONFIG_SYSCTL
++ int sysctl = task_active_pid_ns(current)->memfd_noexec_scope;
++
++ if (!(*flags & (MFD_EXEC | MFD_NOEXEC_SEAL))) {
++ if (sysctl >= MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL)
++ *flags |= MFD_NOEXEC_SEAL;
++ else
++ *flags |= MFD_EXEC;
++ }
++
++ if (!(*flags & MFD_NOEXEC_SEAL) && sysctl >= MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED) {
++ pr_err_ratelimited(
++ "%s[%d]: memfd_create() requires MFD_NOEXEC_SEAL with vm.memfd_noexec=%d\n",
++ current->comm, task_pid_nr(current), sysctl);
++ return -EACCES;
++ }
++#endif
++ return 0;
++}
++
+ SYSCALL_DEFINE2(memfd_create,
+ const char __user *, uname,
+ unsigned int, flags)
+ {
+- char comm[TASK_COMM_LEN];
+ unsigned int *file_seals;
+ struct file *file;
+ int fd, error;
+@@ -294,35 +315,15 @@ SYSCALL_DEFINE2(memfd_create,
+ return -EINVAL;
+
+ if (!(flags & (MFD_EXEC | MFD_NOEXEC_SEAL))) {
+-#ifdef CONFIG_SYSCTL
+- int sysctl = MEMFD_NOEXEC_SCOPE_EXEC;
+- struct pid_namespace *ns;
+-
+- ns = task_active_pid_ns(current);
+- if (ns)
+- sysctl = ns->memfd_noexec_scope;
+-
+- switch (sysctl) {
+- case MEMFD_NOEXEC_SCOPE_EXEC:
+- flags |= MFD_EXEC;
+- break;
+- case MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL:
+- flags |= MFD_NOEXEC_SEAL;
+- break;
+- default:
+- pr_warn_once(
+- "memfd_create(): MFD_NOEXEC_SEAL is enforced, pid=%d '%s'\n",
+- task_pid_nr(current), get_task_comm(comm, current));
+- return -EINVAL;
+- }
+-#else
+- flags |= MFD_EXEC;
+-#endif
+ pr_warn_once(
+- "memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=%d '%s'\n",
+- task_pid_nr(current), get_task_comm(comm, current));
++ "%s[%d]: memfd_create() called without MFD_EXEC or MFD_NOEXEC_SEAL set\n",
++ current->comm, task_pid_nr(current));
+ }
+
++ error = check_sysctl_memfd_noexec(&flags);
++ if (error < 0)
++ return error;
++
+ /* length includes terminating zero */
+ len = strnlen_user(uname, MFD_NAME_MAX_LEN + 1);
+ if (len <= 0)
+diff --git a/mm/shmem.c b/mm/shmem.c
+index fe208a072e594..87cc98a9a014a 100644
+--- a/mm/shmem.c
++++ b/mm/shmem.c
+@@ -3506,6 +3506,8 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
+ unsigned long long size;
+ char *rest;
+ int opt;
++ kuid_t kuid;
++ kgid_t kgid;
+
+ opt = fs_parse(fc, shmem_fs_parameters, param, &result);
+ if (opt < 0)
+@@ -3541,14 +3543,32 @@ static int shmem_parse_one(struct fs_context *fc, struct fs_parameter *param)
+ ctx->mode = result.uint_32 & 07777;
+ break;
+ case Opt_uid:
+- ctx->uid = make_kuid(current_user_ns(), result.uint_32);
+- if (!uid_valid(ctx->uid))
++ kuid = make_kuid(current_user_ns(), result.uint_32);
++ if (!uid_valid(kuid))
+ goto bad_value;
++
++ /*
++ * The requested uid must be representable in the
++ * filesystem's idmapping.
++ */
++ if (!kuid_has_mapping(fc->user_ns, kuid))
++ goto bad_value;
++
++ ctx->uid = kuid;
+ break;
+ case Opt_gid:
+- ctx->gid = make_kgid(current_user_ns(), result.uint_32);
+- if (!gid_valid(ctx->gid))
++ kgid = make_kgid(current_user_ns(), result.uint_32);
++ if (!gid_valid(kgid))
+ goto bad_value;
++
++ /*
++ * The requested gid must be representable in the
++ * filesystem's idmapping.
++ */
++ if (!kgid_has_mapping(fc->user_ns, kgid))
++ goto bad_value;
++
++ ctx->gid = kgid;
+ break;
+ case Opt_huge:
+ ctx->huge = result.uint_32;
+diff --git a/mm/util.c b/mm/util.c
+index dd12b9531ac4c..406634f26918c 100644
+--- a/mm/util.c
++++ b/mm/util.c
+@@ -1071,7 +1071,9 @@ void mem_dump_obj(void *object)
+ if (vmalloc_dump_obj(object))
+ return;
+
+- if (virt_addr_valid(object))
++ if (is_vmalloc_addr(object))
++ type = "vmalloc memory";
++ else if (virt_addr_valid(object))
+ type = "non-slab/vmalloc memory";
+ else if (object == NULL)
+ type = "NULL pointer";
+diff --git a/mm/vmalloc.c b/mm/vmalloc.c
+index 73a0077ee3afc..d78dfb071f89d 100644
+--- a/mm/vmalloc.c
++++ b/mm/vmalloc.c
+@@ -4228,14 +4228,32 @@ void pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
+ #ifdef CONFIG_PRINTK
+ bool vmalloc_dump_obj(void *object)
+ {
+- struct vm_struct *vm;
+ void *objp = (void *)PAGE_ALIGN((unsigned long)object);
++ const void *caller;
++ struct vm_struct *vm;
++ struct vmap_area *va;
++ unsigned long addr;
++ unsigned int nr_pages;
+
+- vm = find_vm_area(objp);
+- if (!vm)
++ if (!spin_trylock(&vmap_area_lock))
++ return false;
++ va = __find_vmap_area((unsigned long)objp, &vmap_area_root);
++ if (!va) {
++ spin_unlock(&vmap_area_lock);
+ return false;
++ }
++
++ vm = va->vm;
++ if (!vm) {
++ spin_unlock(&vmap_area_lock);
++ return false;
++ }
++ addr = (unsigned long)vm->addr;
++ caller = vm->caller;
++ nr_pages = vm->nr_pages;
++ spin_unlock(&vmap_area_lock);
+ pr_cont(" %u-page vmalloc region starting at %#lx allocated at %pS\n",
+- vm->nr_pages, (unsigned long)vm->addr, vm->caller);
++ nr_pages, addr, caller);
+ return true;
+ }
+ #endif
+diff --git a/mm/vmpressure.c b/mm/vmpressure.c
+index b52644771cc43..22c6689d93027 100644
+--- a/mm/vmpressure.c
++++ b/mm/vmpressure.c
+@@ -244,6 +244,14 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
+ if (mem_cgroup_disabled())
+ return;
+
++ /*
++ * The in-kernel users only care about the reclaim efficiency
++ * for this @memcg rather than the whole subtree, and there
++ * isn't and won't be any in-kernel user in a legacy cgroup.
++ */
++ if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && !tree)
++ return;
++
+ vmpr = memcg_to_vmpressure(memcg);
+
+ /*
+diff --git a/mm/vmscan.c b/mm/vmscan.c
+index 7ff3389c677f9..caf17b7c1b9f0 100644
+--- a/mm/vmscan.c
++++ b/mm/vmscan.c
+@@ -4853,7 +4853,8 @@ static int lru_gen_memcg_seg(struct lruvec *lruvec)
+ * the eviction
+ ******************************************************************************/
+
+-static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
++static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_control *sc,
++ int tier_idx)
+ {
+ bool success;
+ int gen = folio_lru_gen(folio);
+@@ -4904,6 +4905,13 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
+ return true;
+ }
+
++ /* ineligible */
++ if (zone > sc->reclaim_idx) {
++ gen = folio_inc_gen(lruvec, folio, false);
++ list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
++ return true;
++ }
++
+ /* waiting for writeback */
+ if (folio_test_locked(folio) || folio_test_writeback(folio) ||
+ (type == LRU_GEN_FILE && folio_test_dirty(folio))) {
+@@ -4952,7 +4960,8 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca
+ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
+ int type, int tier, struct list_head *list)
+ {
+- int gen, zone;
++ int i;
++ int gen;
+ enum vm_event_item item;
+ int sorted = 0;
+ int scanned = 0;
+@@ -4968,9 +4977,10 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
+
+ gen = lru_gen_from_seq(lrugen->min_seq[type]);
+
+- for (zone = sc->reclaim_idx; zone >= 0; zone--) {
++ for (i = MAX_NR_ZONES; i > 0; i--) {
+ LIST_HEAD(moved);
+ int skipped = 0;
++ int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
+ struct list_head *head = &lrugen->folios[gen][type][zone];
+
+ while (!list_empty(head)) {
+@@ -4984,7 +4994,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
+
+ scanned += delta;
+
+- if (sort_folio(lruvec, folio, tier))
++ if (sort_folio(lruvec, folio, sc, tier))
+ sorted += delta;
+ else if (isolate_folio(lruvec, folio, sc)) {
+ list_add(&folio->lru, list);
+diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
+index 3c27ffb781e3e..f3f6782894239 100644
+--- a/net/9p/trans_virtio.c
++++ b/net/9p/trans_virtio.c
+@@ -384,7 +384,7 @@ static void handle_rerror(struct p9_req_t *req, int in_hdr_len,
+ void *to = req->rc.sdata + in_hdr_len;
+
+ // Fits entirely into the static data? Nothing to do.
+- if (req->rc.size < in_hdr_len)
++ if (req->rc.size < in_hdr_len || !pages)
+ return;
+
+ // Really long error message? Tough, truncate the reply. Might get
+@@ -428,7 +428,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
+ struct page **in_pages = NULL, **out_pages = NULL;
+ struct virtio_chan *chan = client->trans;
+ struct scatterlist *sgs[4];
+- size_t offs;
++ size_t offs = 0;
+ int need_drop = 0;
+ int kicked = 0;
+
+diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
+index 31c115b225e7e..eb2802ef34bfe 100644
+--- a/net/bluetooth/hci_conn.c
++++ b/net/bluetooth/hci_conn.c
+@@ -178,57 +178,6 @@ static void hci_conn_cleanup(struct hci_conn *conn)
+ hci_conn_put(conn);
+ }
+
+-static void le_scan_cleanup(struct work_struct *work)
+-{
+- struct hci_conn *conn = container_of(work, struct hci_conn,
+- le_scan_cleanup);
+- struct hci_dev *hdev = conn->hdev;
+- struct hci_conn *c = NULL;
+-
+- BT_DBG("%s hcon %p", hdev->name, conn);
+-
+- hci_dev_lock(hdev);
+-
+- /* Check that the hci_conn is still around */
+- rcu_read_lock();
+- list_for_each_entry_rcu(c, &hdev->conn_hash.list, list) {
+- if (c == conn)
+- break;
+- }
+- rcu_read_unlock();
+-
+- if (c == conn) {
+- hci_connect_le_scan_cleanup(conn, 0x00);
+- hci_conn_cleanup(conn);
+- }
+-
+- hci_dev_unlock(hdev);
+- hci_dev_put(hdev);
+- hci_conn_put(conn);
+-}
+-
+-static void hci_connect_le_scan_remove(struct hci_conn *conn)
+-{
+- BT_DBG("%s hcon %p", conn->hdev->name, conn);
+-
+- /* We can't call hci_conn_del/hci_conn_cleanup here since that
+- * could deadlock with another hci_conn_del() call that's holding
+- * hci_dev_lock and doing cancel_delayed_work_sync(&conn->disc_work).
+- * Instead, grab temporary extra references to the hci_dev and
+- * hci_conn and perform the necessary cleanup in a separate work
+- * callback.
+- */
+-
+- hci_dev_hold(conn->hdev);
+- hci_conn_get(conn);
+-
+- /* Even though we hold a reference to the hdev, many other
+- * things might get cleaned up meanwhile, including the hdev's
+- * own workqueue, so we can't use that for scheduling.
+- */
+- schedule_work(&conn->le_scan_cleanup);
+-}
+-
+ static void hci_acl_create_connection(struct hci_conn *conn)
+ {
+ struct hci_dev *hdev = conn->hdev;
+@@ -679,13 +628,6 @@ static void hci_conn_timeout(struct work_struct *work)
+ if (refcnt > 0)
+ return;
+
+- /* LE connections in scanning state need special handling */
+- if (conn->state == BT_CONNECT && conn->type == LE_LINK &&
+- test_bit(HCI_CONN_SCANNING, &conn->flags)) {
+- hci_connect_le_scan_remove(conn);
+- return;
+- }
+-
+ hci_abort_conn(conn, hci_proto_disconn_ind(conn));
+ }
+
+@@ -791,7 +733,8 @@ struct iso_list_data {
+ u16 sync_handle;
+ };
+ int count;
+- struct iso_cig_params pdu;
++ bool big_term;
++ bool big_sync_term;
+ };
+
+ static void bis_list(struct hci_conn *conn, void *data)
+@@ -809,17 +752,6 @@ static void bis_list(struct hci_conn *conn, void *data)
+ d->count++;
+ }
+
+-static void find_bis(struct hci_conn *conn, void *data)
+-{
+- struct iso_list_data *d = data;
+-
+- /* Ignore unicast */
+- if (bacmp(&conn->dst, BDADDR_ANY))
+- return;
+-
+- d->count++;
+-}
+-
+ static int terminate_big_sync(struct hci_dev *hdev, void *data)
+ {
+ struct iso_list_data *d = data;
+@@ -828,11 +760,8 @@ static int terminate_big_sync(struct hci_dev *hdev, void *data)
+
+ hci_remove_ext_adv_instance_sync(hdev, d->bis, NULL);
+
+- /* Check if ISO connection is a BIS and terminate BIG if there are
+- * no other connections using it.
+- */
+- hci_conn_hash_list_state(hdev, find_bis, ISO_LINK, BT_CONNECTED, d);
+- if (d->count)
++ /* Only terminate BIG if it has been created */
++ if (!d->big_term)
+ return 0;
+
+ return hci_le_terminate_big_sync(hdev, d->big,
+@@ -844,19 +773,21 @@ static void terminate_big_destroy(struct hci_dev *hdev, void *data, int err)
+ kfree(data);
+ }
+
+-static int hci_le_terminate_big(struct hci_dev *hdev, u8 big, u8 bis)
++static int hci_le_terminate_big(struct hci_dev *hdev, struct hci_conn *conn)
+ {
+ struct iso_list_data *d;
+ int ret;
+
+- bt_dev_dbg(hdev, "big 0x%2.2x bis 0x%2.2x", big, bis);
++ bt_dev_dbg(hdev, "big 0x%2.2x bis 0x%2.2x", conn->iso_qos.bcast.big,
++ conn->iso_qos.bcast.bis);
+
+ d = kzalloc(sizeof(*d), GFP_KERNEL);
+ if (!d)
+ return -ENOMEM;
+
+- d->big = big;
+- d->bis = bis;
++ d->big = conn->iso_qos.bcast.big;
++ d->bis = conn->iso_qos.bcast.bis;
++ d->big_term = test_and_clear_bit(HCI_CONN_BIG_CREATED, &conn->flags);
+
+ ret = hci_cmd_sync_queue(hdev, terminate_big_sync, d,
+ terminate_big_destroy);
+@@ -873,31 +804,26 @@ static int big_terminate_sync(struct hci_dev *hdev, void *data)
+ bt_dev_dbg(hdev, "big 0x%2.2x sync_handle 0x%4.4x", d->big,
+ d->sync_handle);
+
+- /* Check if ISO connection is a BIS and terminate BIG if there are
+- * no other connections using it.
+- */
+- hci_conn_hash_list_state(hdev, find_bis, ISO_LINK, BT_CONNECTED, d);
+- if (d->count)
+- return 0;
+-
+- hci_le_big_terminate_sync(hdev, d->big);
++ if (d->big_sync_term)
++ hci_le_big_terminate_sync(hdev, d->big);
+
+ return hci_le_pa_terminate_sync(hdev, d->sync_handle);
+ }
+
+-static int hci_le_big_terminate(struct hci_dev *hdev, u8 big, u16 sync_handle)
++static int hci_le_big_terminate(struct hci_dev *hdev, u8 big, struct hci_conn *conn)
+ {
+ struct iso_list_data *d;
+ int ret;
+
+- bt_dev_dbg(hdev, "big 0x%2.2x sync_handle 0x%4.4x", big, sync_handle);
++ bt_dev_dbg(hdev, "big 0x%2.2x sync_handle 0x%4.4x", big, conn->sync_handle);
+
+ d = kzalloc(sizeof(*d), GFP_KERNEL);
+ if (!d)
+ return -ENOMEM;
+
+ d->big = big;
+- d->sync_handle = sync_handle;
++ d->sync_handle = conn->sync_handle;
++ d->big_sync_term = test_and_clear_bit(HCI_CONN_BIG_SYNC, &conn->flags);
+
+ ret = hci_cmd_sync_queue(hdev, big_terminate_sync, d,
+ terminate_big_destroy);
+@@ -916,6 +842,7 @@ static int hci_le_big_terminate(struct hci_dev *hdev, u8 big, u16 sync_handle)
+ static void bis_cleanup(struct hci_conn *conn)
+ {
+ struct hci_dev *hdev = conn->hdev;
++ struct hci_conn *bis;
+
+ bt_dev_dbg(hdev, "conn %p", conn);
+
+@@ -923,11 +850,25 @@ static void bis_cleanup(struct hci_conn *conn)
+ if (!test_and_clear_bit(HCI_CONN_PER_ADV, &conn->flags))
+ return;
+
+- hci_le_terminate_big(hdev, conn->iso_qos.bcast.big,
+- conn->iso_qos.bcast.bis);
++ /* Check if ISO connection is a BIS and terminate advertising
++ * set and BIG if there are no other connections using it.
++ */
++ bis = hci_conn_hash_lookup_bis(hdev, BDADDR_ANY,
++ conn->iso_qos.bcast.big,
++ conn->iso_qos.bcast.bis);
++ if (bis)
++ return;
++
++ hci_le_terminate_big(hdev, conn);
+ } else {
++ bis = hci_conn_hash_lookup_big_any_dst(hdev,
++ conn->iso_qos.bcast.big);
++
++ if (bis)
++ return;
++
+ hci_le_big_terminate(hdev, conn->iso_qos.bcast.big,
+- conn->sync_handle);
++ conn);
+ }
+ }
+
+@@ -983,6 +924,25 @@ static void cis_cleanup(struct hci_conn *conn)
+ hci_le_remove_cig(hdev, conn->iso_qos.ucast.cig);
+ }
+
++static u16 hci_conn_hash_alloc_unset(struct hci_dev *hdev)
++{
++ struct hci_conn_hash *h = &hdev->conn_hash;
++ struct hci_conn *c;
++ u16 handle = HCI_CONN_HANDLE_MAX + 1;
++
++ rcu_read_lock();
++
++ list_for_each_entry_rcu(c, &h->list, list) {
++ /* Find the first unused handle */
++ if (handle == 0xffff || c->handle != handle)
++ break;
++ handle++;
++ }
++ rcu_read_unlock();
++
++ return handle;
++}
++
+ struct hci_conn *hci_conn_add(struct hci_dev *hdev, int type, bdaddr_t *dst,
+ u8 role)
+ {
+@@ -996,7 +956,7 @@ struct hci_conn *hci_conn_add(struct hci_dev *hdev, int type, bdaddr_t *dst,
+
+ bacpy(&conn->dst, dst);
+ bacpy(&conn->src, &hdev->bdaddr);
+- conn->handle = HCI_CONN_HANDLE_UNSET;
++ conn->handle = hci_conn_hash_alloc_unset(hdev);
+ conn->hdev = hdev;
+ conn->type = type;
+ conn->role = role;
+@@ -1059,7 +1019,6 @@ struct hci_conn *hci_conn_add(struct hci_dev *hdev, int type, bdaddr_t *dst,
+ INIT_DELAYED_WORK(&conn->auto_accept_work, hci_conn_auto_accept);
+ INIT_DELAYED_WORK(&conn->idle_work, hci_conn_idle);
+ INIT_DELAYED_WORK(&conn->le_conn_timeout, le_conn_timeout);
+- INIT_WORK(&conn->le_scan_cleanup, le_scan_cleanup);
+
+ atomic_set(&conn->refcnt, 0);
+
+@@ -1081,6 +1040,29 @@ struct hci_conn *hci_conn_add(struct hci_dev *hdev, int type, bdaddr_t *dst,
+ return conn;
+ }
+
++static void hci_conn_cleanup_child(struct hci_conn *conn, u8 reason)
++{
++ if (!reason)
++ reason = HCI_ERROR_REMOTE_USER_TERM;
++
++ /* Due to race, SCO/ISO conn might be not established yet at this point,
++ * and nothing else will clean it up. In other cases it is done via HCI
++ * events.
++ */
++ switch (conn->type) {
++ case SCO_LINK:
++ case ESCO_LINK:
++ if (HCI_CONN_HANDLE_UNSET(conn->handle))
++ hci_conn_failed(conn, reason);
++ break;
++ case ISO_LINK:
++ if (conn->state != BT_CONNECTED &&
++ !test_bit(HCI_CONN_CREATE_CIS, &conn->flags))
++ hci_conn_failed(conn, reason);
++ break;
++ }
++}
++
+ static void hci_conn_unlink(struct hci_conn *conn)
+ {
+ struct hci_dev *hdev = conn->hdev;
+@@ -1103,14 +1085,7 @@ static void hci_conn_unlink(struct hci_conn *conn)
+ if (!test_bit(HCI_UP, &hdev->flags))
+ continue;
+
+- /* Due to race, SCO connection might be not established
+- * yet at this point. Delete it now, otherwise it is
+- * possible for it to be stuck and can't be deleted.
+- */
+- if ((child->type == SCO_LINK ||
+- child->type == ESCO_LINK) &&
+- child->handle == HCI_CONN_HANDLE_UNSET)
+- hci_conn_del(child);
++ hci_conn_cleanup_child(child, conn->abort_reason);
+ }
+
+ return;
+@@ -1495,10 +1470,10 @@ static int qos_set_bis(struct hci_dev *hdev, struct bt_iso_qos *qos)
+
+ /* This function requires the caller holds hdev->lock */
+ static struct hci_conn *hci_add_bis(struct hci_dev *hdev, bdaddr_t *dst,
+- struct bt_iso_qos *qos)
++ struct bt_iso_qos *qos, __u8 base_len,
++ __u8 *base)
+ {
+ struct hci_conn *conn;
+- struct iso_list_data data;
+ int err;
+
+ /* Let's make sure that le is enabled.*/
+@@ -1516,24 +1491,27 @@ static struct hci_conn *hci_add_bis(struct hci_dev *hdev, bdaddr_t *dst,
+ if (err)
+ return ERR_PTR(err);
+
+- data.big = qos->bcast.big;
+- data.bis = qos->bcast.bis;
+- data.count = 0;
+-
+- /* Check if there is already a matching BIG/BIS */
+- hci_conn_hash_list_state(hdev, bis_list, ISO_LINK, BT_BOUND, &data);
+- if (data.count)
++ /* Check if the LE Create BIG command has already been sent */
++ conn = hci_conn_hash_lookup_per_adv_bis(hdev, dst, qos->bcast.big,
++ qos->bcast.big);
++ if (conn)
+ return ERR_PTR(-EADDRINUSE);
+
+- conn = hci_conn_hash_lookup_bis(hdev, dst, qos->bcast.big, qos->bcast.bis);
+- if (conn)
++ /* Check BIS settings against other bound BISes, since all
++ * BISes in a BIG must have the same value for all parameters
++ */
++ conn = hci_conn_hash_lookup_bis(hdev, dst, qos->bcast.big,
++ qos->bcast.bis);
++
++ if (conn && (memcmp(qos, &conn->iso_qos, sizeof(*qos)) ||
++ base_len != conn->le_per_adv_data_len ||
++ memcmp(conn->le_per_adv_data, base, base_len)))
+ return ERR_PTR(-EADDRINUSE);
+
+ conn = hci_conn_add(hdev, ISO_LINK, dst, HCI_ROLE_MASTER);
+ if (!conn)
+ return ERR_PTR(-ENOMEM);
+
+- set_bit(HCI_CONN_PER_ADV, &conn->flags);
+ conn->state = BT_CONNECT;
+
+ hci_conn_hold(conn);
+@@ -1707,52 +1685,25 @@ struct hci_conn *hci_connect_sco(struct hci_dev *hdev, int type, bdaddr_t *dst,
+ return sco;
+ }
+
+-static void cis_add(struct iso_list_data *d, struct bt_iso_qos *qos)
+-{
+- struct hci_cis_params *cis = &d->pdu.cis[d->pdu.cp.num_cis];
+-
+- cis->cis_id = qos->ucast.cis;
+- cis->c_sdu = cpu_to_le16(qos->ucast.out.sdu);
+- cis->p_sdu = cpu_to_le16(qos->ucast.in.sdu);
+- cis->c_phy = qos->ucast.out.phy ? qos->ucast.out.phy : qos->ucast.in.phy;
+- cis->p_phy = qos->ucast.in.phy ? qos->ucast.in.phy : qos->ucast.out.phy;
+- cis->c_rtn = qos->ucast.out.rtn;
+- cis->p_rtn = qos->ucast.in.rtn;
+-
+- d->pdu.cp.num_cis++;
+-}
+-
+-static void cis_list(struct hci_conn *conn, void *data)
+-{
+- struct iso_list_data *d = data;
+-
+- /* Skip if broadcast/ANY address */
+- if (!bacmp(&conn->dst, BDADDR_ANY))
+- return;
+-
+- if (d->cig != conn->iso_qos.ucast.cig || d->cis == BT_ISO_QOS_CIS_UNSET ||
+- d->cis != conn->iso_qos.ucast.cis)
+- return;
+-
+- d->count++;
+-
+- if (d->pdu.cp.cig_id == BT_ISO_QOS_CIG_UNSET ||
+- d->count >= ARRAY_SIZE(d->pdu.cis))
+- return;
+-
+- cis_add(d, &conn->iso_qos);
+-}
+-
+ static int hci_le_create_big(struct hci_conn *conn, struct bt_iso_qos *qos)
+ {
+ struct hci_dev *hdev = conn->hdev;
+ struct hci_cp_le_create_big cp;
++ struct iso_list_data data;
+
+ memset(&cp, 0, sizeof(cp));
+
++ data.big = qos->bcast.big;
++ data.bis = qos->bcast.bis;
++ data.count = 0;
++
++ /* Create a BIS for each bound connection */
++ hci_conn_hash_list_state(hdev, bis_list, ISO_LINK,
++ BT_BOUND, &data);
++
+ cp.handle = qos->bcast.big;
+ cp.adv_handle = qos->bcast.bis;
+- cp.num_bis = 0x01;
++ cp.num_bis = data.count;
+ hci_cpu_to_le24(qos->bcast.out.interval, cp.bis.sdu_interval);
+ cp.bis.sdu = cpu_to_le16(qos->bcast.out.sdu);
+ cp.bis.latency = cpu_to_le16(qos->bcast.out.latency);
+@@ -1766,25 +1717,62 @@ static int hci_le_create_big(struct hci_conn *conn, struct bt_iso_qos *qos)
+ return hci_send_cmd(hdev, HCI_OP_LE_CREATE_BIG, sizeof(cp), &cp);
+ }
+
+-static void set_cig_params_complete(struct hci_dev *hdev, void *data, int err)
++static int set_cig_params_sync(struct hci_dev *hdev, void *data)
+ {
+- struct iso_cig_params *pdu = data;
++ u8 cig_id = PTR_ERR(data);
++ struct hci_conn *conn;
++ struct bt_iso_qos *qos;
++ struct iso_cig_params pdu;
++ u8 cis_id;
+
+- bt_dev_dbg(hdev, "");
++ conn = hci_conn_hash_lookup_cig(hdev, cig_id);
++ if (!conn)
++ return 0;
+
+- if (err)
+- bt_dev_err(hdev, "Unable to set CIG parameters: %d", err);
++ memset(&pdu, 0, sizeof(pdu));
+
+- kfree(pdu);
+-}
++ qos = &conn->iso_qos;
++ pdu.cp.cig_id = cig_id;
++ hci_cpu_to_le24(qos->ucast.out.interval, pdu.cp.c_interval);
++ hci_cpu_to_le24(qos->ucast.in.interval, pdu.cp.p_interval);
++ pdu.cp.sca = qos->ucast.sca;
++ pdu.cp.packing = qos->ucast.packing;
++ pdu.cp.framing = qos->ucast.framing;
++ pdu.cp.c_latency = cpu_to_le16(qos->ucast.out.latency);
++ pdu.cp.p_latency = cpu_to_le16(qos->ucast.in.latency);
++
++	/* Reprogram all CIS(s) with the same CIG; the valid ranges are:
++ * num_cis: 0x00 to 0x1F
++ * cis_id: 0x00 to 0xEF
++ */
++ for (cis_id = 0x00; cis_id < 0xf0 &&
++ pdu.cp.num_cis < ARRAY_SIZE(pdu.cis); cis_id++) {
++ struct hci_cis_params *cis;
+
+-static int set_cig_params_sync(struct hci_dev *hdev, void *data)
+-{
+- struct iso_cig_params *pdu = data;
+- u32 plen;
++ conn = hci_conn_hash_lookup_cis(hdev, NULL, 0, cig_id, cis_id);
++ if (!conn)
++ continue;
++
++ qos = &conn->iso_qos;
++
++ cis = &pdu.cis[pdu.cp.num_cis++];
++ cis->cis_id = cis_id;
++ cis->c_sdu = cpu_to_le16(conn->iso_qos.ucast.out.sdu);
++ cis->p_sdu = cpu_to_le16(conn->iso_qos.ucast.in.sdu);
++ cis->c_phy = qos->ucast.out.phy ? qos->ucast.out.phy :
++ qos->ucast.in.phy;
++ cis->p_phy = qos->ucast.in.phy ? qos->ucast.in.phy :
++ qos->ucast.out.phy;
++ cis->c_rtn = qos->ucast.out.rtn;
++ cis->p_rtn = qos->ucast.in.rtn;
++ }
++
++ if (!pdu.cp.num_cis)
++ return 0;
+
+- plen = sizeof(pdu->cp) + pdu->cp.num_cis * sizeof(pdu->cis[0]);
+- return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_CIG_PARAMS, plen, pdu,
++ return __hci_cmd_sync_status(hdev, HCI_OP_LE_SET_CIG_PARAMS,
++ sizeof(pdu.cp) +
++ pdu.cp.num_cis * sizeof(pdu.cis[0]), &pdu,
+ HCI_CMD_TIMEOUT);
+ }
+
+@@ -1792,7 +1780,6 @@ static bool hci_le_set_cig_params(struct hci_conn *conn, struct bt_iso_qos *qos)
+ {
+ struct hci_dev *hdev = conn->hdev;
+ struct iso_list_data data;
+- struct iso_cig_params *pdu;
+
+ memset(&data, 0, sizeof(data));
+
+@@ -1819,60 +1806,31 @@ static bool hci_le_set_cig_params(struct hci_conn *conn, struct bt_iso_qos *qos)
+ qos->ucast.cig = data.cig;
+ }
+
+- data.pdu.cp.cig_id = qos->ucast.cig;
+- hci_cpu_to_le24(qos->ucast.out.interval, data.pdu.cp.c_interval);
+- hci_cpu_to_le24(qos->ucast.in.interval, data.pdu.cp.p_interval);
+- data.pdu.cp.sca = qos->ucast.sca;
+- data.pdu.cp.packing = qos->ucast.packing;
+- data.pdu.cp.framing = qos->ucast.framing;
+- data.pdu.cp.c_latency = cpu_to_le16(qos->ucast.out.latency);
+- data.pdu.cp.p_latency = cpu_to_le16(qos->ucast.in.latency);
+-
+ if (qos->ucast.cis != BT_ISO_QOS_CIS_UNSET) {
+- data.count = 0;
+- data.cig = qos->ucast.cig;
+- data.cis = qos->ucast.cis;
+-
+- hci_conn_hash_list_state(hdev, cis_list, ISO_LINK, BT_BOUND,
+- &data);
+- if (data.count)
++ if (hci_conn_hash_lookup_cis(hdev, NULL, 0, qos->ucast.cig,
++ qos->ucast.cis))
+ return false;
+-
+- cis_add(&data, qos);
++ goto done;
+ }
+
+- /* Reprogram all CIS(s) with the same CIG */
+- for (data.cig = qos->ucast.cig, data.cis = 0x00; data.cis < 0x11;
++ /* Allocate first available CIS if not set */
++ for (data.cig = qos->ucast.cig, data.cis = 0x00; data.cis < 0xf0;
+ data.cis++) {
+- data.count = 0;
+-
+- hci_conn_hash_list_state(hdev, cis_list, ISO_LINK, BT_BOUND,
+- &data);
+- if (data.count)
+- continue;
+-
+- /* Allocate a CIS if not set */
+- if (qos->ucast.cis == BT_ISO_QOS_CIS_UNSET) {
++ if (!hci_conn_hash_lookup_cis(hdev, NULL, 0, data.cig,
++ data.cis)) {
+ /* Update CIS */
+ qos->ucast.cis = data.cis;
+- cis_add(&data, qos);
++ break;
+ }
+ }
+
+- if (qos->ucast.cis == BT_ISO_QOS_CIS_UNSET || !data.pdu.cp.num_cis)
+- return false;
+-
+- pdu = kzalloc(sizeof(*pdu), GFP_KERNEL);
+- if (!pdu)
++ if (qos->ucast.cis == BT_ISO_QOS_CIS_UNSET)
+ return false;
+
+- memcpy(pdu, &data.pdu, sizeof(*pdu));
+-
+- if (hci_cmd_sync_queue(hdev, set_cig_params_sync, pdu,
+- set_cig_params_complete) < 0) {
+- kfree(pdu);
++done:
++ if (hci_cmd_sync_queue(hdev, set_cig_params_sync,
++ ERR_PTR(qos->ucast.cig), NULL) < 0)
+ return false;
+- }
+
+ return true;
+ }
+@@ -1971,59 +1929,47 @@ bool hci_iso_setup_path(struct hci_conn *conn)
+ return true;
+ }
+
+-static int hci_create_cis_sync(struct hci_dev *hdev, void *data)
++int hci_conn_check_create_cis(struct hci_conn *conn)
+ {
+- return hci_le_create_cis_sync(hdev, data);
+-}
++ if (conn->type != ISO_LINK || !bacmp(&conn->dst, BDADDR_ANY))
++ return -EINVAL;
+
+-int hci_le_create_cis(struct hci_conn *conn)
+-{
+- struct hci_conn *cis;
+- struct hci_link *link, *t;
+- struct hci_dev *hdev = conn->hdev;
+- int err;
++ if (!conn->parent || conn->parent->state != BT_CONNECTED ||
++ conn->state != BT_CONNECT || HCI_CONN_HANDLE_UNSET(conn->handle))
++ return 1;
+
+- bt_dev_dbg(hdev, "hcon %p", conn);
++ return 0;
++}
+
+- switch (conn->type) {
+- case LE_LINK:
+- if (conn->state != BT_CONNECTED || list_empty(&conn->link_list))
+- return -EINVAL;
++static int hci_create_cis_sync(struct hci_dev *hdev, void *data)
++{
++ return hci_le_create_cis_sync(hdev);
++}
+
+- cis = NULL;
++int hci_le_create_cis_pending(struct hci_dev *hdev)
++{
++ struct hci_conn *conn;
++ bool pending = false;
+
+- /* hci_conn_link uses list_add_tail_rcu so the list is in
+- * the same order as the connections are requested.
+- */
+- list_for_each_entry_safe(link, t, &conn->link_list, list) {
+- if (link->conn->state == BT_BOUND) {
+- err = hci_le_create_cis(link->conn);
+- if (err)
+- return err;
++ rcu_read_lock();
+
+- cis = link->conn;
+- }
++ list_for_each_entry_rcu(conn, &hdev->conn_hash.list, list) {
++ if (test_bit(HCI_CONN_CREATE_CIS, &conn->flags)) {
++ rcu_read_unlock();
++ return -EBUSY;
+ }
+
+- return cis ? 0 : -EINVAL;
+- case ISO_LINK:
+- cis = conn;
+- break;
+- default:
+- return -EINVAL;
++ if (!hci_conn_check_create_cis(conn))
++ pending = true;
+ }
+
+- if (cis->state == BT_CONNECT)
++ rcu_read_unlock();
++
++ if (!pending)
+ return 0;
+
+ /* Queue Create CIS */
+- err = hci_cmd_sync_queue(hdev, hci_create_cis_sync, cis, NULL);
+- if (err)
+- return err;
+-
+- cis->state = BT_CONNECT;
+-
+- return 0;
++ return hci_cmd_sync_queue(hdev, hci_create_cis_sync, NULL, NULL);
+ }
+
+ static void hci_iso_qos_setup(struct hci_dev *hdev, struct hci_conn *conn,
+@@ -2053,16 +1999,6 @@ static void hci_iso_qos_setup(struct hci_dev *hdev, struct hci_conn *conn,
+ qos->latency = conn->le_conn_latency;
+ }
+
+-static void hci_bind_bis(struct hci_conn *conn,
+- struct bt_iso_qos *qos)
+-{
+- /* Update LINK PHYs according to QoS preference */
+- conn->le_tx_phy = qos->bcast.out.phy;
+- conn->le_tx_phy = qos->bcast.out.phy;
+- conn->iso_qos = *qos;
+- conn->state = BT_BOUND;
+-}
+-
+ static int create_big_sync(struct hci_dev *hdev, void *data)
+ {
+ struct hci_conn *conn = data;
+@@ -2185,27 +2121,80 @@ static void create_big_complete(struct hci_dev *hdev, void *data, int err)
+ }
+ }
+
+-struct hci_conn *hci_connect_bis(struct hci_dev *hdev, bdaddr_t *dst,
+- __u8 dst_type, struct bt_iso_qos *qos,
+- __u8 base_len, __u8 *base)
++struct hci_conn *hci_bind_bis(struct hci_dev *hdev, bdaddr_t *dst,
++ struct bt_iso_qos *qos,
++ __u8 base_len, __u8 *base)
+ {
+ struct hci_conn *conn;
+- int err;
++ __u8 eir[HCI_MAX_PER_AD_LENGTH];
++
++ if (base_len && base)
++ base_len = eir_append_service_data(eir, 0, 0x1851,
++ base, base_len);
+
+ /* We need hci_conn object using the BDADDR_ANY as dst */
+- conn = hci_add_bis(hdev, dst, qos);
++ conn = hci_add_bis(hdev, dst, qos, base_len, eir);
+ if (IS_ERR(conn))
+ return conn;
+
+- hci_bind_bis(conn, qos);
++ /* Update LINK PHYs according to QoS preference */
++ conn->le_tx_phy = qos->bcast.out.phy;
++ conn->le_tx_phy = qos->bcast.out.phy;
+
+ /* Add Basic Announcement into Peridic Adv Data if BASE is set */
+ if (base_len && base) {
+- base_len = eir_append_service_data(conn->le_per_adv_data, 0,
+- 0x1851, base, base_len);
++ memcpy(conn->le_per_adv_data, eir, sizeof(eir));
+ conn->le_per_adv_data_len = base_len;
+ }
+
++ hci_iso_qos_setup(hdev, conn, &qos->bcast.out,
++ conn->le_tx_phy ? conn->le_tx_phy :
++ hdev->le_tx_def_phys);
++
++ conn->iso_qos = *qos;
++ conn->state = BT_BOUND;
++
++ return conn;
++}
++
++static void bis_mark_per_adv(struct hci_conn *conn, void *data)
++{
++ struct iso_list_data *d = data;
++
++ /* Skip if not broadcast/ANY address */
++ if (bacmp(&conn->dst, BDADDR_ANY))
++ return;
++
++ if (d->big != conn->iso_qos.bcast.big ||
++ d->bis == BT_ISO_QOS_BIS_UNSET ||
++ d->bis != conn->iso_qos.bcast.bis)
++ return;
++
++ set_bit(HCI_CONN_PER_ADV, &conn->flags);
++}
++
++struct hci_conn *hci_connect_bis(struct hci_dev *hdev, bdaddr_t *dst,
++ __u8 dst_type, struct bt_iso_qos *qos,
++ __u8 base_len, __u8 *base)
++{
++ struct hci_conn *conn;
++ int err;
++ struct iso_list_data data;
++
++ conn = hci_bind_bis(hdev, dst, qos, base_len, base);
++ if (IS_ERR(conn))
++ return conn;
++
++ data.big = qos->bcast.big;
++ data.bis = qos->bcast.bis;
++
++ /* Set HCI_CONN_PER_ADV for all bound connections, to mark that
++ * the start periodic advertising and create BIG commands have
++ * been queued
++ */
++ hci_conn_hash_list_state(hdev, bis_mark_per_adv, ISO_LINK,
++ BT_BOUND, &data);
++
+ /* Queue start periodic advertising and create BIG */
+ err = hci_cmd_sync_queue(hdev, create_big_sync, conn,
+ create_big_complete);
+@@ -2214,10 +2203,6 @@ struct hci_conn *hci_connect_bis(struct hci_dev *hdev, bdaddr_t *dst,
+ return ERR_PTR(err);
+ }
+
+- hci_iso_qos_setup(hdev, conn, &qos->bcast.out,
+- conn->le_tx_phy ? conn->le_tx_phy :
+- hdev->le_tx_def_phys);
+-
+ return conn;
+ }
+
+@@ -2259,11 +2244,9 @@ struct hci_conn *hci_connect_cis(struct hci_dev *hdev, bdaddr_t *dst,
+ return ERR_PTR(-ENOLINK);
+ }
+
+- /* If LE is already connected and CIS handle is already set proceed to
+- * Create CIS immediately.
+- */
+- if (le->state == BT_CONNECTED && cis->handle != HCI_CONN_HANDLE_UNSET)
+- hci_le_create_cis(cis);
++ cis->state = BT_CONNECT;
++
++ hci_le_create_cis_pending(hdev);
+
+ return cis;
+ }
+@@ -2850,81 +2833,46 @@ u32 hci_conn_get_phy(struct hci_conn *conn)
+ return phys;
+ }
+
+-int hci_abort_conn(struct hci_conn *conn, u8 reason)
++static int abort_conn_sync(struct hci_dev *hdev, void *data)
+ {
+- int r = 0;
++ struct hci_conn *conn;
++ u16 handle = PTR_ERR(data);
+
+- if (test_and_set_bit(HCI_CONN_CANCEL, &conn->flags))
++ conn = hci_conn_hash_lookup_handle(hdev, handle);
++ if (!conn)
+ return 0;
+
+- switch (conn->state) {
+- case BT_CONNECTED:
+- case BT_CONFIG:
+- if (conn->type == AMP_LINK) {
+- struct hci_cp_disconn_phy_link cp;
++ return hci_abort_conn_sync(hdev, conn, conn->abort_reason);
++}
+
+- cp.phy_handle = HCI_PHY_HANDLE(conn->handle);
+- cp.reason = reason;
+- r = hci_send_cmd(conn->hdev, HCI_OP_DISCONN_PHY_LINK,
+- sizeof(cp), &cp);
+- } else {
+- struct hci_cp_disconnect dc;
++int hci_abort_conn(struct hci_conn *conn, u8 reason)
++{
++ struct hci_dev *hdev = conn->hdev;
+
+- dc.handle = cpu_to_le16(conn->handle);
+- dc.reason = reason;
+- r = hci_send_cmd(conn->hdev, HCI_OP_DISCONNECT,
+- sizeof(dc), &dc);
+- }
++ /* If abort_reason has already been set it means the connection is
++ * already being aborted so don't attempt to overwrite it.
++ */
++ if (conn->abort_reason)
++ return 0;
+
+- conn->state = BT_DISCONN;
++ bt_dev_dbg(hdev, "handle 0x%2.2x reason 0x%2.2x", conn->handle, reason);
+
+- break;
+- case BT_CONNECT:
+- if (conn->type == LE_LINK) {
+- if (test_bit(HCI_CONN_SCANNING, &conn->flags))
+- break;
+- r = hci_send_cmd(conn->hdev,
+- HCI_OP_LE_CREATE_CONN_CANCEL, 0, NULL);
+- } else if (conn->type == ACL_LINK) {
+- if (conn->hdev->hci_ver < BLUETOOTH_VER_1_2)
+- break;
+- r = hci_send_cmd(conn->hdev,
+- HCI_OP_CREATE_CONN_CANCEL,
+- 6, &conn->dst);
+- }
+- break;
+- case BT_CONNECT2:
+- if (conn->type == ACL_LINK) {
+- struct hci_cp_reject_conn_req rej;
+-
+- bacpy(&rej.bdaddr, &conn->dst);
+- rej.reason = reason;
+-
+- r = hci_send_cmd(conn->hdev,
+- HCI_OP_REJECT_CONN_REQ,
+- sizeof(rej), &rej);
+- } else if (conn->type == SCO_LINK || conn->type == ESCO_LINK) {
+- struct hci_cp_reject_sync_conn_req rej;
+-
+- bacpy(&rej.bdaddr, &conn->dst);
+-
+- /* SCO rejection has its own limited set of
+- * allowed error values (0x0D-0x0F) which isn't
+- * compatible with most values passed to this
+- * function. To be safe hard-code one of the
+- * values that's suitable for SCO.
+- */
+- rej.reason = HCI_ERROR_REJ_LIMITED_RESOURCES;
++ conn->abort_reason = reason;
+
+- r = hci_send_cmd(conn->hdev,
+- HCI_OP_REJECT_SYNC_CONN_REQ,
+- sizeof(rej), &rej);
++	/* If the connection is pending, check the command opcode, since it
++	 * might be blocking on hci_cmd_sync_work while waiting for its
++	 * respective event, so hci_cmd_sync_cancel is needed to cancel it.
++ */
++ if (conn->state == BT_CONNECT && hdev->req_status == HCI_REQ_PEND) {
++ switch (hci_skb_event(hdev->sent_cmd)) {
++ case HCI_EV_LE_CONN_COMPLETE:
++ case HCI_EV_LE_ENHANCED_CONN_COMPLETE:
++ case HCI_EVT_LE_CIS_ESTABLISHED:
++ hci_cmd_sync_cancel(hdev, -ECANCELED);
++ break;
+ }
+- break;
+- default:
+- conn->state = BT_CLOSED;
+- break;
+ }
+
+- return r;
++ return hci_cmd_sync_queue(hdev, abort_conn_sync, ERR_PTR(conn->handle),
++ NULL);
+ }
+diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
+index 1ec83985f1ab0..2c845c9a26be0 100644
+--- a/net/bluetooth/hci_core.c
++++ b/net/bluetooth/hci_core.c
+@@ -1074,9 +1074,9 @@ void hci_uuids_clear(struct hci_dev *hdev)
+
+ void hci_link_keys_clear(struct hci_dev *hdev)
+ {
+- struct link_key *key;
++ struct link_key *key, *tmp;
+
+- list_for_each_entry(key, &hdev->link_keys, list) {
++ list_for_each_entry_safe(key, tmp, &hdev->link_keys, list) {
+ list_del_rcu(&key->list);
+ kfree_rcu(key, rcu);
+ }
+@@ -1084,9 +1084,9 @@ void hci_link_keys_clear(struct hci_dev *hdev)
+
+ void hci_smp_ltks_clear(struct hci_dev *hdev)
+ {
+- struct smp_ltk *k;
++ struct smp_ltk *k, *tmp;
+
+- list_for_each_entry(k, &hdev->long_term_keys, list) {
++ list_for_each_entry_safe(k, tmp, &hdev->long_term_keys, list) {
+ list_del_rcu(&k->list);
+ kfree_rcu(k, rcu);
+ }
+@@ -1094,9 +1094,9 @@ void hci_smp_ltks_clear(struct hci_dev *hdev)
+
+ void hci_smp_irks_clear(struct hci_dev *hdev)
+ {
+- struct smp_irk *k;
++ struct smp_irk *k, *tmp;
+
+- list_for_each_entry(k, &hdev->identity_resolving_keys, list) {
++ list_for_each_entry_safe(k, tmp, &hdev->identity_resolving_keys, list) {
+ list_del_rcu(&k->list);
+ kfree_rcu(k, rcu);
+ }
+@@ -1104,9 +1104,9 @@ void hci_smp_irks_clear(struct hci_dev *hdev)
+
+ void hci_blocked_keys_clear(struct hci_dev *hdev)
+ {
+- struct blocked_key *b;
++ struct blocked_key *b, *tmp;
+
+- list_for_each_entry(b, &hdev->blocked_keys, list) {
++ list_for_each_entry_safe(b, tmp, &hdev->blocked_keys, list) {
+ list_del_rcu(&b->list);
+ kfree_rcu(b, rcu);
+ }
+@@ -1949,15 +1949,15 @@ int hci_add_adv_monitor(struct hci_dev *hdev, struct adv_monitor *monitor)
+
+ switch (hci_get_adv_monitor_offload_ext(hdev)) {
+ case HCI_ADV_MONITOR_EXT_NONE:
+- bt_dev_dbg(hdev, "%s add monitor %d status %d", hdev->name,
++ bt_dev_dbg(hdev, "add monitor %d status %d",
+ monitor->handle, status);
+ /* Message was not forwarded to controller - not an error */
+ break;
+
+ case HCI_ADV_MONITOR_EXT_MSFT:
+ status = msft_add_monitor_pattern(hdev, monitor);
+- bt_dev_dbg(hdev, "%s add monitor %d msft status %d", hdev->name,
+- monitor->handle, status);
++ bt_dev_dbg(hdev, "add monitor %d msft status %d",
++ handle, status);
+ break;
+ }
+
+@@ -1976,15 +1976,15 @@ static int hci_remove_adv_monitor(struct hci_dev *hdev,
+
+ switch (hci_get_adv_monitor_offload_ext(hdev)) {
+ case HCI_ADV_MONITOR_EXT_NONE: /* also goes here when powered off */
+- bt_dev_dbg(hdev, "%s remove monitor %d status %d", hdev->name,
++ bt_dev_dbg(hdev, "remove monitor %d status %d",
+ monitor->handle, status);
+ goto free_monitor;
+
+ case HCI_ADV_MONITOR_EXT_MSFT:
+ handle = monitor->handle;
+ status = msft_remove_monitor(hdev, monitor);
+- bt_dev_dbg(hdev, "%s remove monitor %d msft status %d",
+- hdev->name, handle, status);
++ bt_dev_dbg(hdev, "remove monitor %d msft status %d",
++ handle, status);
+ break;
+ }
+
+diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
+index cb0b5fe7a6f8c..5f4af2cfd21d8 100644
+--- a/net/bluetooth/hci_event.c
++++ b/net/bluetooth/hci_event.c
+@@ -3173,7 +3173,7 @@ static void hci_conn_complete_evt(struct hci_dev *hdev, void *data,
+ * As the connection handle is set here for the first time, it indicates
+ * whether the connection is already set up.
+ */
+- if (conn->handle != HCI_CONN_HANDLE_UNSET) {
++ if (!HCI_CONN_HANDLE_UNSET(conn->handle)) {
+ bt_dev_err(hdev, "Ignoring HCI_Connection_Complete for existing connection");
+ goto unlock;
+ }
+@@ -3803,6 +3803,22 @@ static u8 hci_cc_le_read_buffer_size_v2(struct hci_dev *hdev, void *data,
+ return rp->status;
+ }
+
++static void hci_unbound_cis_failed(struct hci_dev *hdev, u8 cig, u8 status)
++{
++ struct hci_conn *conn, *tmp;
++
++ lockdep_assert_held(&hdev->lock);
++
++ list_for_each_entry_safe(conn, tmp, &hdev->conn_hash.list, list) {
++ if (conn->type != ISO_LINK || !bacmp(&conn->dst, BDADDR_ANY) ||
++ conn->state == BT_OPEN || conn->iso_qos.ucast.cig != cig)
++ continue;
++
++ if (HCI_CONN_HANDLE_UNSET(conn->handle))
++ hci_conn_failed(conn, status);
++ }
++}
++
+ static u8 hci_cc_le_set_cig_params(struct hci_dev *hdev, void *data,
+ struct sk_buff *skb)
+ {
+@@ -3810,6 +3826,7 @@ static u8 hci_cc_le_set_cig_params(struct hci_dev *hdev, void *data,
+ struct hci_cp_le_set_cig_params *cp;
+ struct hci_conn *conn;
+ u8 status = rp->status;
++ bool pending = false;
+ int i;
+
+ bt_dev_dbg(hdev, "status 0x%2.2x", rp->status);
+@@ -3822,12 +3839,15 @@ static u8 hci_cc_le_set_cig_params(struct hci_dev *hdev, void *data,
+
+ hci_dev_lock(hdev);
+
++ /* BLUETOOTH CORE SPECIFICATION Version 5.4 | Vol 4, Part E page 2554
++ *
++ * If the Status return parameter is non-zero, then the state of the CIG
++ * and its CIS configurations shall not be changed by the command. If
++ * the CIG did not already exist, it shall not be created.
++ */
+ if (status) {
+- while ((conn = hci_conn_hash_lookup_cig(hdev, rp->cig_id))) {
+- conn->state = BT_CLOSED;
+- hci_connect_cfm(conn, status);
+- hci_conn_del(conn);
+- }
++ /* Keep current configuration, fail only the unbound CIS */
++ hci_unbound_cis_failed(hdev, rp->cig_id, status);
+ goto unlock;
+ }
+
+@@ -3851,13 +3871,15 @@ static u8 hci_cc_le_set_cig_params(struct hci_dev *hdev, void *data,
+
+ bt_dev_dbg(hdev, "%p handle 0x%4.4x parent %p", conn,
+ conn->handle, conn->parent);
+-
+- /* Create CIS if LE is already connected */
+- if (conn->parent && conn->parent->state == BT_CONNECTED)
+- hci_le_create_cis(conn);
++
++ if (conn->state == BT_CONNECT)
++ pending = true;
+ }
+
+ unlock:
++ if (pending)
++ hci_le_create_cis_pending(hdev);
++
+ hci_dev_unlock(hdev);
+
+ return rp->status;
+@@ -4223,6 +4245,7 @@ static void hci_cmd_complete_evt(struct hci_dev *hdev, void *data,
+ static void hci_cs_le_create_cis(struct hci_dev *hdev, u8 status)
+ {
+ struct hci_cp_le_create_cis *cp;
++ bool pending = false;
+ int i;
+
+ bt_dev_dbg(hdev, "status 0x%2.2x", status);
+@@ -4245,12 +4268,18 @@ static void hci_cs_le_create_cis(struct hci_dev *hdev, u8 status)
+
+ conn = hci_conn_hash_lookup_handle(hdev, handle);
+ if (conn) {
++ if (test_and_clear_bit(HCI_CONN_CREATE_CIS,
++ &conn->flags))
++ pending = true;
+ conn->state = BT_CLOSED;
+ hci_connect_cfm(conn, status);
+ hci_conn_del(conn);
+ }
+ }
+
++ if (pending)
++ hci_le_create_cis_pending(hdev);
++
+ hci_dev_unlock(hdev);
+ }
+
+@@ -4998,7 +5027,7 @@ static void hci_sync_conn_complete_evt(struct hci_dev *hdev, void *data,
+ * As the connection handle is set here for the first time, it indicates
+ * whether the connection is already set up.
+ */
+- if (conn->handle != HCI_CONN_HANDLE_UNSET) {
++ if (!HCI_CONN_HANDLE_UNSET(conn->handle)) {
+ bt_dev_err(hdev, "Ignoring HCI_Sync_Conn_Complete event for existing connection");
+ goto unlock;
+ }
+@@ -5862,7 +5891,7 @@ static void le_conn_complete_evt(struct hci_dev *hdev, u8 status,
+ * As the connection handle is set here for the first time, it indicates
+ * whether the connection is already set up.
+ */
+- if (conn->handle != HCI_CONN_HANDLE_UNSET) {
++ if (!HCI_CONN_HANDLE_UNSET(conn->handle)) {
+ bt_dev_err(hdev, "Ignoring HCI_Connection_Complete for existing connection");
+ goto unlock;
+ }
+@@ -6788,6 +6817,8 @@ static void hci_le_cis_estabilished_evt(struct hci_dev *hdev, void *data,
+ {
+ struct hci_evt_le_cis_established *ev = data;
+ struct hci_conn *conn;
++ struct bt_iso_qos *qos;
++ bool pending = false;
+ u16 handle = __le16_to_cpu(ev->handle);
+
+ bt_dev_dbg(hdev, "status 0x%2.2x", ev->status);
+@@ -6809,21 +6840,41 @@ static void hci_le_cis_estabilished_evt(struct hci_dev *hdev, void *data,
+ goto unlock;
+ }
+
+- if (conn->role == HCI_ROLE_SLAVE) {
+- __le32 interval;
++ qos = &conn->iso_qos;
+
+- memset(&interval, 0, sizeof(interval));
++ pending = test_and_clear_bit(HCI_CONN_CREATE_CIS, &conn->flags);
+
+- memcpy(&interval, ev->c_latency, sizeof(ev->c_latency));
+- conn->iso_qos.ucast.in.interval = le32_to_cpu(interval);
+- memcpy(&interval, ev->p_latency, sizeof(ev->p_latency));
+- conn->iso_qos.ucast.out.interval = le32_to_cpu(interval);
+- conn->iso_qos.ucast.in.latency = le16_to_cpu(ev->interval);
+- conn->iso_qos.ucast.out.latency = le16_to_cpu(ev->interval);
+- conn->iso_qos.ucast.in.sdu = le16_to_cpu(ev->c_mtu);
+- conn->iso_qos.ucast.out.sdu = le16_to_cpu(ev->p_mtu);
+- conn->iso_qos.ucast.in.phy = ev->c_phy;
+- conn->iso_qos.ucast.out.phy = ev->p_phy;
++ /* Convert ISO Interval (1.25 ms slots) to SDU Interval (us) */
++ qos->ucast.in.interval = le16_to_cpu(ev->interval) * 1250;
++ qos->ucast.out.interval = qos->ucast.in.interval;
++
++ switch (conn->role) {
++ case HCI_ROLE_SLAVE:
++ /* Convert Transport Latency (us) to Latency (msec) */
++ qos->ucast.in.latency =
++ DIV_ROUND_CLOSEST(get_unaligned_le24(ev->c_latency),
++ 1000);
++ qos->ucast.out.latency =
++ DIV_ROUND_CLOSEST(get_unaligned_le24(ev->p_latency),
++ 1000);
++ qos->ucast.in.sdu = le16_to_cpu(ev->c_mtu);
++ qos->ucast.out.sdu = le16_to_cpu(ev->p_mtu);
++ qos->ucast.in.phy = ev->c_phy;
++ qos->ucast.out.phy = ev->p_phy;
++ break;
++ case HCI_ROLE_MASTER:
++ /* Convert Transport Latency (us) to Latency (msec) */
++ qos->ucast.out.latency =
++ DIV_ROUND_CLOSEST(get_unaligned_le24(ev->c_latency),
++ 1000);
++ qos->ucast.in.latency =
++ DIV_ROUND_CLOSEST(get_unaligned_le24(ev->p_latency),
++ 1000);
++ qos->ucast.out.sdu = le16_to_cpu(ev->c_mtu);
++ qos->ucast.in.sdu = le16_to_cpu(ev->p_mtu);
++ qos->ucast.out.phy = ev->c_phy;
++ qos->ucast.in.phy = ev->p_phy;
++ break;
+ }
+
+ if (!ev->status) {
+@@ -6834,10 +6885,14 @@ static void hci_le_cis_estabilished_evt(struct hci_dev *hdev, void *data,
+ goto unlock;
+ }
+
++ conn->state = BT_CLOSED;
+ hci_connect_cfm(conn, ev->status);
+ hci_conn_del(conn);
+
+ unlock:
++ if (pending)
++ hci_le_create_cis_pending(hdev);
++
+ hci_dev_unlock(hdev);
+ }
+
+@@ -6916,6 +6971,7 @@ static void hci_le_create_big_complete_evt(struct hci_dev *hdev, void *data,
+ {
+ struct hci_evt_le_create_big_complete *ev = data;
+ struct hci_conn *conn;
++ __u8 bis_idx = 0;
+
+ BT_DBG("%s status 0x%2.2x", hdev->name, ev->status);
+
+@@ -6924,33 +6980,44 @@ static void hci_le_create_big_complete_evt(struct hci_dev *hdev, void *data,
+ return;
+
+ hci_dev_lock(hdev);
++ rcu_read_lock();
+
+- conn = hci_conn_hash_lookup_big(hdev, ev->handle);
+- if (!conn)
+- goto unlock;
++ /* Connect all BISes that are bound to the BIG */
++ list_for_each_entry_rcu(conn, &hdev->conn_hash.list, list) {
++ if (bacmp(&conn->dst, BDADDR_ANY) ||
++ conn->type != ISO_LINK ||
++ conn->iso_qos.bcast.big != ev->handle)
++ continue;
+
+- if (conn->type != ISO_LINK) {
+- bt_dev_err(hdev,
+- "Invalid connection link type handle 0x%2.2x",
+- ev->handle);
+- goto unlock;
+- }
++ conn->handle = __le16_to_cpu(ev->bis_handle[bis_idx++]);
+
+- if (ev->num_bis)
+- conn->handle = __le16_to_cpu(ev->bis_handle[0]);
++ if (!ev->status) {
++ conn->state = BT_CONNECTED;
++ set_bit(HCI_CONN_BIG_CREATED, &conn->flags);
++ rcu_read_unlock();
++ hci_debugfs_create_conn(conn);
++ hci_conn_add_sysfs(conn);
++ hci_iso_setup_path(conn);
++ rcu_read_lock();
++ continue;
++ }
+
+- if (!ev->status) {
+- conn->state = BT_CONNECTED;
+- hci_debugfs_create_conn(conn);
+- hci_conn_add_sysfs(conn);
+- hci_iso_setup_path(conn);
+- goto unlock;
++ hci_connect_cfm(conn, ev->status);
++ rcu_read_unlock();
++ hci_conn_del(conn);
++ rcu_read_lock();
+ }
+
+- hci_connect_cfm(conn, ev->status);
+- hci_conn_del(conn);
++ if (!ev->status && !bis_idx)
++ /* If no BISes have been connected for the BIG,
++ * terminate. This is in case all bound connections
++ * have been closed before the BIG creation
++ * has completed.
++ */
++ hci_le_terminate_big_sync(hdev, ev->handle,
++ HCI_ERROR_LOCAL_HOST_TERM);
+
+-unlock:
++ rcu_read_unlock();
+ hci_dev_unlock(hdev);
+ }
+
+@@ -6967,9 +7034,6 @@ static void hci_le_big_sync_established_evt(struct hci_dev *hdev, void *data,
+ flex_array_size(ev, bis, ev->num_bis)))
+ return;
+
+- if (ev->status)
+- return;
+-
+ hci_dev_lock(hdev);
+
+ for (i = 0; i < ev->num_bis; i++) {
+@@ -6993,9 +7057,25 @@ static void hci_le_big_sync_established_evt(struct hci_dev *hdev, void *data,
+ bis->iso_qos.bcast.in.latency = le16_to_cpu(ev->interval) * 125 / 100;
+ bis->iso_qos.bcast.in.sdu = le16_to_cpu(ev->max_pdu);
+
+- hci_iso_setup_path(bis);
++ if (!ev->status) {
++ set_bit(HCI_CONN_BIG_SYNC, &bis->flags);
++ hci_iso_setup_path(bis);
++ }
+ }
+
++	/* In case the BIG sync failed, notify the user of each failed
++	 * connection after all hci connections have been added.
++ */
++ if (ev->status)
++ for (i = 0; i < ev->num_bis; i++) {
++ u16 handle = le16_to_cpu(ev->bis[i]);
++
++ bis = hci_conn_hash_lookup_handle(hdev, handle);
++
++ set_bit(HCI_CONN_BIG_SYNC_FAILED, &bis->flags);
++ hci_connect_cfm(bis, ev->status);
++ }
++
+ hci_dev_unlock(hdev);
+ }
+
+diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
+index 1bcb54272dc67..3177a38ef4d60 100644
+--- a/net/bluetooth/hci_sync.c
++++ b/net/bluetooth/hci_sync.c
+@@ -4684,7 +4684,10 @@ static const struct {
+ "advertised, but not supported."),
+ HCI_QUIRK_BROKEN(SET_RPA_TIMEOUT,
+ "HCI LE Set Random Private Address Timeout command is "
+- "advertised, but not supported.")
++ "advertised, but not supported."),
++ HCI_QUIRK_BROKEN(LE_CODED,
++ "HCI LE Coded PHY feature bit is set, "
++ "but its usage is not supported.")
+ };
+
+ /* This function handles hdev setup stage:
+@@ -5271,22 +5274,27 @@ static int hci_disconnect_sync(struct hci_dev *hdev, struct hci_conn *conn,
+ }
+
+ static int hci_le_connect_cancel_sync(struct hci_dev *hdev,
+- struct hci_conn *conn)
++ struct hci_conn *conn, u8 reason)
+ {
++	/* Return the reason if scanning, since the connection will probably
++	 * be cleaned up directly.
++ */
+ if (test_bit(HCI_CONN_SCANNING, &conn->flags))
+- return 0;
++ return reason;
+
+- if (test_and_set_bit(HCI_CONN_CANCEL, &conn->flags))
++ if (conn->role == HCI_ROLE_SLAVE ||
++ test_and_set_bit(HCI_CONN_CANCEL, &conn->flags))
+ return 0;
+
+ return __hci_cmd_sync_status(hdev, HCI_OP_LE_CREATE_CONN_CANCEL,
+ 0, NULL, HCI_CMD_TIMEOUT);
+ }
+
+-static int hci_connect_cancel_sync(struct hci_dev *hdev, struct hci_conn *conn)
++static int hci_connect_cancel_sync(struct hci_dev *hdev, struct hci_conn *conn,
++ u8 reason)
+ {
+ if (conn->type == LE_LINK)
+- return hci_le_connect_cancel_sync(hdev, conn);
++ return hci_le_connect_cancel_sync(hdev, conn, reason);
+
+ if (hdev->hci_ver < BLUETOOTH_VER_1_2)
+ return 0;
+@@ -5332,43 +5340,81 @@ static int hci_reject_conn_sync(struct hci_dev *hdev, struct hci_conn *conn,
+
+ int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
+ {
+- int err;
++ int err = 0;
++ u16 handle = conn->handle;
++ struct hci_conn *c;
+
+ switch (conn->state) {
+ case BT_CONNECTED:
+ case BT_CONFIG:
+- return hci_disconnect_sync(hdev, conn, reason);
++ err = hci_disconnect_sync(hdev, conn, reason);
++ break;
+ case BT_CONNECT:
+- err = hci_connect_cancel_sync(hdev, conn);
+- /* Cleanup hci_conn object if it cannot be cancelled as it
+- * likelly means the controller and host stack are out of sync.
+- */
+- if (err) {
++ err = hci_connect_cancel_sync(hdev, conn, reason);
++ break;
++ case BT_CONNECT2:
++ err = hci_reject_conn_sync(hdev, conn, reason);
++ break;
++ case BT_OPEN:
++		/* Clean up BISes that failed to be established */
++ if (test_and_clear_bit(HCI_CONN_BIG_SYNC_FAILED, &conn->flags)) {
+ hci_dev_lock(hdev);
+- hci_conn_failed(conn, err);
++ hci_conn_failed(conn, reason);
+ hci_dev_unlock(hdev);
+ }
+- return err;
+- case BT_CONNECT2:
+- return hci_reject_conn_sync(hdev, conn, reason);
++ break;
+ default:
++ hci_dev_lock(hdev);
+ conn->state = BT_CLOSED;
+- break;
++ hci_disconn_cfm(conn, reason);
++ hci_conn_del(conn);
++ hci_dev_unlock(hdev);
++ return 0;
+ }
+
+- return 0;
++ hci_dev_lock(hdev);
++
++	/* Check that the connection hasn't been cleaned up while waiting
++	 * for commands to complete.
++ */
++ c = hci_conn_hash_lookup_handle(hdev, handle);
++ if (!c || c != conn) {
++ err = 0;
++ goto unlock;
++ }
++
++	/* Clean up the hci_conn object if it cannot be cancelled, as that
++	 * likely means the controller and host stack are out of sync, or,
++	 * in the LE case, it was still scanning, so it can be cleaned up
++	 * safely.
++ */
++ hci_conn_failed(conn, reason);
++
++unlock:
++ hci_dev_unlock(hdev);
++ return err;
+ }
+
+ static int hci_disconnect_all_sync(struct hci_dev *hdev, u8 reason)
+ {
+- struct hci_conn *conn, *tmp;
+- int err;
++ struct list_head *head = &hdev->conn_hash.list;
++ struct hci_conn *conn;
+
+- list_for_each_entry_safe(conn, tmp, &hdev->conn_hash.list, list) {
+- err = hci_abort_conn_sync(hdev, conn, reason);
+- if (err)
+- return err;
++ rcu_read_lock();
++ while ((conn = list_first_or_null_rcu(head, struct hci_conn, list))) {
++ /* Make sure the connection is not freed while unlocking */
++ conn = hci_conn_get(conn);
++ rcu_read_unlock();
++		/* Disregard possible errors since hci_conn_del shall have been
++		 * called even if an error had occurred, because it would
++		 * then cause hci_conn_failed to be called, which calls
++		 * hci_conn_del internally.
++ */
++ hci_abort_conn_sync(hdev, conn, reason);
++ hci_conn_put(conn);
++ rcu_read_lock();
+ }
++ rcu_read_unlock();
+
+ return 0;
+ }
+@@ -6255,63 +6301,99 @@ int hci_le_create_conn_sync(struct hci_dev *hdev, struct hci_conn *conn)
+
+ done:
+ if (err == -ETIMEDOUT)
+- hci_le_connect_cancel_sync(hdev, conn);
++ hci_le_connect_cancel_sync(hdev, conn, 0x00);
+
+ /* Re-enable advertising after the connection attempt is finished. */
+ hci_resume_advertising_sync(hdev);
+ return err;
+ }
+
+-int hci_le_create_cis_sync(struct hci_dev *hdev, struct hci_conn *conn)
++int hci_le_create_cis_sync(struct hci_dev *hdev)
+ {
+ struct {
+ struct hci_cp_le_create_cis cp;
+ struct hci_cis cis[0x1f];
+ } cmd;
+- u8 cig;
+- struct hci_conn *hcon = conn;
++ struct hci_conn *conn;
++ u8 cig = BT_ISO_QOS_CIG_UNSET;
++
++ /* The spec allows only one pending LE Create CIS command at a time. If
++ * the command is pending now, don't do anything. We check for pending
++ * connections after each CIS Established event.
++ *
++ * BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E
++ * page 2566:
++ *
++ * If the Host issues this command before all the
++ * HCI_LE_CIS_Established events from the previous use of the
++ * command have been generated, the Controller shall return the
++ * error code Command Disallowed (0x0C).
++ *
++ * BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E
++ * page 2567:
++ *
++ * When the Controller receives the HCI_LE_Create_CIS command, the
++ * Controller sends the HCI_Command_Status event to the Host. An
++ * HCI_LE_CIS_Established event will be generated for each CIS when it
++ * is established or if it is disconnected or considered lost before
++ * being established; until all the events are generated, the command
++ * remains pending.
++ */
+
+ memset(&cmd, 0, sizeof(cmd));
+- cmd.cis[0].acl_handle = cpu_to_le16(conn->parent->handle);
+- cmd.cis[0].cis_handle = cpu_to_le16(conn->handle);
+- cmd.cp.num_cis++;
+- cig = conn->iso_qos.ucast.cig;
+
+ hci_dev_lock(hdev);
+
+ rcu_read_lock();
+
++ /* Wait until previous Create CIS has completed */
+ list_for_each_entry_rcu(conn, &hdev->conn_hash.list, list) {
+- struct hci_cis *cis = &cmd.cis[cmd.cp.num_cis];
++ if (test_bit(HCI_CONN_CREATE_CIS, &conn->flags))
++ goto done;
++ }
+
+- if (conn == hcon || conn->type != ISO_LINK ||
+- conn->state == BT_CONNECTED ||
+- conn->iso_qos.ucast.cig != cig)
++ /* Find CIG with all CIS ready */
++ list_for_each_entry_rcu(conn, &hdev->conn_hash.list, list) {
++ struct hci_conn *link;
++
++ if (hci_conn_check_create_cis(conn))
+ continue;
+
+- /* Check if all CIS(s) belonging to a CIG are ready */
+- if (!conn->parent || conn->parent->state != BT_CONNECTED ||
+- conn->state != BT_CONNECT) {
+- cmd.cp.num_cis = 0;
+- break;
++ cig = conn->iso_qos.ucast.cig;
++
++ list_for_each_entry_rcu(link, &hdev->conn_hash.list, list) {
++ if (hci_conn_check_create_cis(link) > 0 &&
++ link->iso_qos.ucast.cig == cig &&
++ link->state != BT_CONNECTED) {
++ cig = BT_ISO_QOS_CIG_UNSET;
++ break;
++ }
+ }
+
+- /* Group all CIS with state BT_CONNECT since the spec don't
+- * allow to send them individually:
+- *
+- * BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E
+- * page 2566:
+- *
+- * If the Host issues this command before all the
+- * HCI_LE_CIS_Established events from the previous use of the
+- * command have been generated, the Controller shall return the
+- * error code Command Disallowed (0x0C).
+- */
++ if (cig != BT_ISO_QOS_CIG_UNSET)
++ break;
++ }
++
++ if (cig == BT_ISO_QOS_CIG_UNSET)
++ goto done;
++
++ list_for_each_entry_rcu(conn, &hdev->conn_hash.list, list) {
++ struct hci_cis *cis = &cmd.cis[cmd.cp.num_cis];
++
++ if (hci_conn_check_create_cis(conn) ||
++ conn->iso_qos.ucast.cig != cig)
++ continue;
++
++ set_bit(HCI_CONN_CREATE_CIS, &conn->flags);
+ cis->acl_handle = cpu_to_le16(conn->parent->handle);
+ cis->cis_handle = cpu_to_le16(conn->handle);
+ cmd.cp.num_cis++;
++
++ if (cmd.cp.num_cis >= ARRAY_SIZE(cmd.cis))
++ break;
+ }
+
++done:
+ rcu_read_unlock();
+
+ hci_dev_unlock(hdev);
+diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c
+index 94d5bc104fede..4f2443e1aab3c 100644
+--- a/net/bluetooth/iso.c
++++ b/net/bluetooth/iso.c
+@@ -48,6 +48,11 @@ static void iso_sock_kill(struct sock *sk);
+ #define EIR_SERVICE_DATA_LENGTH 4
+ #define BASE_MAX_LENGTH (HCI_MAX_PER_AD_LENGTH - EIR_SERVICE_DATA_LENGTH)
+
++/* iso_pinfo flags values */
++enum {
++ BT_SK_BIG_SYNC,
++};
++
+ struct iso_pinfo {
+ struct bt_sock bt;
+ bdaddr_t src;
+@@ -58,7 +63,7 @@ struct iso_pinfo {
+ __u8 bc_num_bis;
+ __u8 bc_bis[ISO_MAX_NUM_BIS];
+ __u16 sync_handle;
+- __u32 flags;
++ unsigned long flags;
+ struct bt_iso_qos qos;
+ bool qos_user_set;
+ __u8 base_len;
+@@ -287,13 +292,24 @@ static int iso_connect_bis(struct sock *sk)
+ goto unlock;
+ }
+
+- hcon = hci_connect_bis(hdev, &iso_pi(sk)->dst,
+- le_addr_type(iso_pi(sk)->dst_type),
+- &iso_pi(sk)->qos, iso_pi(sk)->base_len,
+- iso_pi(sk)->base);
+- if (IS_ERR(hcon)) {
+- err = PTR_ERR(hcon);
+- goto unlock;
++ /* Just bind if DEFER_SETUP has been set */
++ if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) {
++ hcon = hci_bind_bis(hdev, &iso_pi(sk)->dst,
++ &iso_pi(sk)->qos, iso_pi(sk)->base_len,
++ iso_pi(sk)->base);
++ if (IS_ERR(hcon)) {
++ err = PTR_ERR(hcon);
++ goto unlock;
++ }
++ } else {
++ hcon = hci_connect_bis(hdev, &iso_pi(sk)->dst,
++ le_addr_type(iso_pi(sk)->dst_type),
++ &iso_pi(sk)->qos, iso_pi(sk)->base_len,
++ iso_pi(sk)->base);
++ if (IS_ERR(hcon)) {
++ err = PTR_ERR(hcon);
++ goto unlock;
++ }
+ }
+
+ conn = iso_conn_add(hcon);
+@@ -317,6 +333,9 @@ static int iso_connect_bis(struct sock *sk)
+ if (hcon->state == BT_CONNECTED) {
+ iso_sock_clear_timer(sk);
+ sk->sk_state = BT_CONNECTED;
++ } else if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(sk)->flags)) {
++ iso_sock_clear_timer(sk);
++ sk->sk_state = BT_CONNECT;
+ } else {
+ sk->sk_state = BT_CONNECT;
+ iso_sock_set_timer(sk, sk->sk_sndtimeo);
+@@ -1202,6 +1221,12 @@ static bool check_io_qos(struct bt_iso_io_qos *qos)
+
+ static bool check_ucast_qos(struct bt_iso_qos *qos)
+ {
++ if (qos->ucast.cig > 0xef && qos->ucast.cig != BT_ISO_QOS_CIG_UNSET)
++ return false;
++
++ if (qos->ucast.cis > 0xef && qos->ucast.cis != BT_ISO_QOS_CIS_UNSET)
++ return false;
++
+ if (qos->ucast.sca > 0x07)
+ return false;
+
+@@ -1466,7 +1491,7 @@ static int iso_sock_release(struct socket *sock)
+
+ iso_sock_close(sk);
+
+- if (sock_flag(sk, SOCK_LINGER) && sk->sk_lingertime &&
++ if (sock_flag(sk, SOCK_LINGER) && READ_ONCE(sk->sk_lingertime) &&
+ !(current->flags & PF_EXITING)) {
+ lock_sock(sk);
+ err = bt_sock_wait_state(sk, BT_CLOSED, sk->sk_lingertime);
+@@ -1563,6 +1588,12 @@ static void iso_conn_ready(struct iso_conn *conn)
+ hci_conn_hold(hcon);
+ iso_chan_add(conn, sk, parent);
+
++ if (ev && ((struct hci_evt_le_big_sync_estabilished *)ev)->status) {
++ /* Trigger error signal on child socket */
++ sk->sk_err = ECONNREFUSED;
++ sk->sk_error_report(sk);
++ }
++
+ if (test_bit(BT_SK_DEFER_SETUP, &bt_sk(parent)->flags))
+ sk->sk_state = BT_CONNECT2;
+ else
+@@ -1631,15 +1662,17 @@ int iso_connect_ind(struct hci_dev *hdev, bdaddr_t *bdaddr, __u8 *flags)
+ if (ev2->num_bis < iso_pi(sk)->bc_num_bis)
+ iso_pi(sk)->bc_num_bis = ev2->num_bis;
+
+- err = hci_le_big_create_sync(hdev,
+- &iso_pi(sk)->qos,
+- iso_pi(sk)->sync_handle,
+- iso_pi(sk)->bc_num_bis,
+- iso_pi(sk)->bc_bis);
+- if (err) {
+- bt_dev_err(hdev, "hci_le_big_create_sync: %d",
+- err);
+- sk = NULL;
++ if (!test_and_set_bit(BT_SK_BIG_SYNC, &iso_pi(sk)->flags)) {
++ err = hci_le_big_create_sync(hdev,
++ &iso_pi(sk)->qos,
++ iso_pi(sk)->sync_handle,
++ iso_pi(sk)->bc_num_bis,
++ iso_pi(sk)->bc_bis);
++ if (err) {
++ bt_dev_err(hdev, "hci_le_big_create_sync: %d",
++ err);
++ sk = NULL;
++ }
+ }
+ }
+ } else {
+@@ -1676,13 +1709,18 @@ static void iso_connect_cfm(struct hci_conn *hcon, __u8 status)
+ }
+
+ /* Create CIS if pending */
+- hci_le_create_cis(hcon);
++ hci_le_create_cis_pending(hcon->hdev);
+ return;
+ }
+
+ BT_DBG("hcon %p bdaddr %pMR status %d", hcon, &hcon->dst, status);
+
+- if (!status) {
++ /* Similar to the success case, if HCI_CONN_BIG_SYNC_FAILED is set,
++	 * queue the failed BIS connection into the accept queue of the
++ * listening socket and wake up userspace, to inform the user about
++ * the BIG sync failed event.
++ */
++ if (!status || test_bit(HCI_CONN_BIG_SYNC_FAILED, &hcon->flags)) {
+ struct iso_conn *conn;
+
+ conn = iso_conn_add(hcon);
+diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
+index d4498037fadc6..6240b20f020a8 100644
+--- a/net/bluetooth/mgmt.c
++++ b/net/bluetooth/mgmt.c
+@@ -3580,18 +3580,6 @@ unlock:
+ return err;
+ }
+
+-static int abort_conn_sync(struct hci_dev *hdev, void *data)
+-{
+- struct hci_conn *conn;
+- u16 handle = PTR_ERR(data);
+-
+- conn = hci_conn_hash_lookup_handle(hdev, handle);
+- if (!conn)
+- return 0;
+-
+- return hci_abort_conn_sync(hdev, conn, HCI_ERROR_REMOTE_USER_TERM);
+-}
+-
+ static int cancel_pair_device(struct sock *sk, struct hci_dev *hdev, void *data,
+ u16 len)
+ {
+@@ -3642,8 +3630,7 @@ static int cancel_pair_device(struct sock *sk, struct hci_dev *hdev, void *data,
+ le_addr_type(addr->type));
+
+ if (conn->conn_reason == CONN_REASON_PAIR_DEVICE)
+- hci_cmd_sync_queue(hdev, abort_conn_sync, ERR_PTR(conn->handle),
+- NULL);
++ hci_abort_conn(conn, HCI_ERROR_REMOTE_USER_TERM);
+
+ unlock:
+ hci_dev_unlock(hdev);
+diff --git a/net/bluetooth/msft.c b/net/bluetooth/msft.c
+index bf5cee48916c7..b80a2162a5c33 100644
+--- a/net/bluetooth/msft.c
++++ b/net/bluetooth/msft.c
+@@ -91,6 +91,33 @@ struct msft_ev_le_monitor_device {
+ struct msft_monitor_advertisement_handle_data {
+ __u8 msft_handle;
+ __u16 mgmt_handle;
++ __s8 rssi_high;
++ __s8 rssi_low;
++ __u8 rssi_low_interval;
++ __u8 rssi_sampling_period;
++ __u8 cond_type;
++ struct list_head list;
++};
++
++enum monitor_addr_filter_state {
++ AF_STATE_IDLE,
++ AF_STATE_ADDING,
++ AF_STATE_ADDED,
++ AF_STATE_REMOVING,
++};
++
++#define MSFT_MONITOR_ADVERTISEMENT_TYPE_ADDR 0x04
++struct msft_monitor_addr_filter_data {
++ __u8 msft_handle;
++ __u8 pattern_handle; /* address filters pertain to */
++ __u16 mgmt_handle;
++ int state;
++ __s8 rssi_high;
++ __s8 rssi_low;
++ __u8 rssi_low_interval;
++ __u8 rssi_sampling_period;
++ __u8 addr_type;
++ bdaddr_t bdaddr;
+ struct list_head list;
+ };
+
+@@ -99,9 +126,12 @@ struct msft_data {
+ __u8 evt_prefix_len;
+ __u8 *evt_prefix;
+ struct list_head handle_map;
++ struct list_head address_filters;
+ __u8 resuming;
+ __u8 suspending;
+ __u8 filter_enabled;
++	/* To synchronize add/remove address filter and monitor device events. */
++ struct mutex filter_lock;
+ };
+
+ bool msft_monitor_supported(struct hci_dev *hdev)
+@@ -180,6 +210,24 @@ static struct msft_monitor_advertisement_handle_data *msft_find_handle_data
+ return NULL;
+ }
+
++/* This function requires the caller holds msft->filter_lock */
++static struct msft_monitor_addr_filter_data *msft_find_address_data
++ (struct hci_dev *hdev, u8 addr_type, bdaddr_t *addr,
++ u8 pattern_handle)
++{
++ struct msft_monitor_addr_filter_data *entry;
++ struct msft_data *msft = hdev->msft_data;
++
++ list_for_each_entry(entry, &msft->address_filters, list) {
++ if (entry->pattern_handle == pattern_handle &&
++ addr_type == entry->addr_type &&
++ !bacmp(addr, &entry->bdaddr))
++ return entry;
++ }
++
++ return NULL;
++}
++
+ /* This function requires the caller holds hdev->lock */
+ static int msft_monitor_device_del(struct hci_dev *hdev, __u16 mgmt_handle,
+ bdaddr_t *bdaddr, __u8 addr_type,
+@@ -240,6 +288,7 @@ static int msft_le_monitor_advertisement_cb(struct hci_dev *hdev, u16 opcode,
+
+ handle_data->mgmt_handle = monitor->handle;
+ handle_data->msft_handle = rp->handle;
++ handle_data->cond_type = MSFT_MONITOR_ADVERTISEMENT_TYPE_PATTERN;
+ INIT_LIST_HEAD(&handle_data->list);
+ list_add(&handle_data->list, &msft->handle_map);
+
+@@ -254,6 +303,70 @@ unlock:
+ return status;
+ }
+
++/* This function requires the caller holds hci_req_sync_lock */
++static void msft_remove_addr_filters_sync(struct hci_dev *hdev, u8 handle)
++{
++ struct msft_monitor_addr_filter_data *address_filter, *n;
++ struct msft_cp_le_cancel_monitor_advertisement cp;
++ struct msft_data *msft = hdev->msft_data;
++ struct list_head head;
++ struct sk_buff *skb;
++
++ INIT_LIST_HEAD(&head);
++
++ /* Cancel all corresponding address monitors */
++ mutex_lock(&msft->filter_lock);
++
++ list_for_each_entry_safe(address_filter, n, &msft->address_filters,
++ list) {
++ if (address_filter->pattern_handle != handle)
++ continue;
++
++ list_del(&address_filter->list);
++
++ /* Keep the address filter and let
++ * msft_add_address_filter_sync() remove and free the address
++ * filter.
++ */
++ if (address_filter->state == AF_STATE_ADDING) {
++ address_filter->state = AF_STATE_REMOVING;
++ continue;
++ }
++
++ /* Keep the address filter and let
++ * msft_cancel_address_filter_sync() remove and free the address
++ * filter
++ */
++ if (address_filter->state == AF_STATE_REMOVING)
++ continue;
++
++ list_add_tail(&address_filter->list, &head);
++ }
++
++ mutex_unlock(&msft->filter_lock);
++
++ list_for_each_entry_safe(address_filter, n, &head, list) {
++ list_del(&address_filter->list);
++
++ cp.sub_opcode = MSFT_OP_LE_CANCEL_MONITOR_ADVERTISEMENT;
++ cp.handle = address_filter->msft_handle;
++
++ skb = __hci_cmd_sync(hdev, hdev->msft_opcode, sizeof(cp), &cp,
++ HCI_CMD_TIMEOUT);
++ if (IS_ERR_OR_NULL(skb)) {
++ kfree(address_filter);
++ continue;
++ }
++
++ kfree_skb(skb);
++
++ bt_dev_dbg(hdev, "MSFT: Canceled device %pMR address filter",
++ &address_filter->bdaddr);
++
++ kfree(address_filter);
++ }
++}
++
+ static int msft_le_cancel_monitor_advertisement_cb(struct hci_dev *hdev,
+ u16 opcode,
+ struct adv_monitor *monitor,
+@@ -263,6 +376,7 @@ static int msft_le_cancel_monitor_advertisement_cb(struct hci_dev *hdev,
+ struct msft_monitor_advertisement_handle_data *handle_data;
+ struct msft_data *msft = hdev->msft_data;
+ int status = 0;
++ u8 msft_handle;
+
+ rp = (struct msft_rp_le_cancel_monitor_advertisement *)skb->data;
+ if (skb->len < sizeof(*rp)) {
+@@ -293,11 +407,17 @@ static int msft_le_cancel_monitor_advertisement_cb(struct hci_dev *hdev,
+ NULL, 0, false);
+ }
+
++ msft_handle = handle_data->msft_handle;
++
+ list_del(&handle_data->list);
+ kfree(handle_data);
+- }
+
+- hci_dev_unlock(hdev);
++ hci_dev_unlock(hdev);
++
++ msft_remove_addr_filters_sync(hdev, msft_handle);
++ } else {
++ hci_dev_unlock(hdev);
++ }
+
+ done:
+ return status;
+@@ -394,12 +514,14 @@ static int msft_add_monitor_sync(struct hci_dev *hdev,
+ {
+ struct msft_cp_le_monitor_advertisement *cp;
+ struct msft_le_monitor_advertisement_pattern_data *pattern_data;
++ struct msft_monitor_advertisement_handle_data *handle_data;
+ struct msft_le_monitor_advertisement_pattern *pattern;
+ struct adv_pattern *entry;
+ size_t total_size = sizeof(*cp) + sizeof(*pattern_data);
+ ptrdiff_t offset = 0;
+ u8 pattern_count = 0;
+ struct sk_buff *skb;
++ int err;
+
+ if (!msft_monitor_pattern_valid(monitor))
+ return -EINVAL;
+@@ -436,16 +558,31 @@ static int msft_add_monitor_sync(struct hci_dev *hdev,
+
+ skb = __hci_cmd_sync(hdev, hdev->msft_opcode, total_size, cp,
+ HCI_CMD_TIMEOUT);
+- kfree(cp);
+
+ if (IS_ERR_OR_NULL(skb)) {
+- if (!skb)
+- return -EIO;
+- return PTR_ERR(skb);
++ err = PTR_ERR(skb);
++ goto out_free;
+ }
+
+- return msft_le_monitor_advertisement_cb(hdev, hdev->msft_opcode,
+- monitor, skb);
++ err = msft_le_monitor_advertisement_cb(hdev, hdev->msft_opcode,
++ monitor, skb);
++ if (err)
++ goto out_free;
++
++ handle_data = msft_find_handle_data(hdev, monitor->handle, true);
++ if (!handle_data) {
++ err = -ENODATA;
++ goto out_free;
++ }
++
++ handle_data->rssi_high = cp->rssi_high;
++ handle_data->rssi_low = cp->rssi_low;
++ handle_data->rssi_low_interval = cp->rssi_low_interval;
++ handle_data->rssi_sampling_period = cp->rssi_sampling_period;
++
++out_free:
++ kfree(cp);
++ return err;
+ }
+
+ /* This function requires the caller holds hci_req_sync_lock */
+@@ -538,6 +675,7 @@ void msft_do_close(struct hci_dev *hdev)
+ {
+ struct msft_data *msft = hdev->msft_data;
+ struct msft_monitor_advertisement_handle_data *handle_data, *tmp;
++ struct msft_monitor_addr_filter_data *address_filter, *n;
+ struct adv_monitor *monitor;
+
+ if (!msft)
+@@ -559,6 +697,14 @@ void msft_do_close(struct hci_dev *hdev)
+ kfree(handle_data);
+ }
+
++ mutex_lock(&msft->filter_lock);
++ list_for_each_entry_safe(address_filter, n, &msft->address_filters,
++ list) {
++ list_del(&address_filter->list);
++ kfree(address_filter);
++ }
++ mutex_unlock(&msft->filter_lock);
++
+ hci_dev_lock(hdev);
+
+ /* Clear any devices that are being monitored and notify device lost */
+@@ -568,6 +714,49 @@ void msft_do_close(struct hci_dev *hdev)
+ hci_dev_unlock(hdev);
+ }
+
++static int msft_cancel_address_filter_sync(struct hci_dev *hdev, void *data)
++{
++ struct msft_monitor_addr_filter_data *address_filter = data;
++ struct msft_cp_le_cancel_monitor_advertisement cp;
++ struct msft_data *msft = hdev->msft_data;
++ struct sk_buff *skb;
++ int err = 0;
++
++ if (!msft) {
++ bt_dev_err(hdev, "MSFT: msft data is freed");
++ return -EINVAL;
++ }
++
++ /* The address filter has been removed by hci dev close */
++ if (!test_bit(HCI_UP, &hdev->flags))
++ return 0;
++
++ mutex_lock(&msft->filter_lock);
++ list_del(&address_filter->list);
++ mutex_unlock(&msft->filter_lock);
++
++ cp.sub_opcode = MSFT_OP_LE_CANCEL_MONITOR_ADVERTISEMENT;
++ cp.handle = address_filter->msft_handle;
++
++ skb = __hci_cmd_sync(hdev, hdev->msft_opcode, sizeof(cp), &cp,
++ HCI_CMD_TIMEOUT);
++ if (IS_ERR_OR_NULL(skb)) {
++ bt_dev_err(hdev, "MSFT: Failed to cancel address (%pMR) filter",
++ &address_filter->bdaddr);
++ err = EIO;
++ goto done;
++ }
++ kfree_skb(skb);
++
++ bt_dev_dbg(hdev, "MSFT: Canceled device %pMR address filter",
++ &address_filter->bdaddr);
++
++done:
++ kfree(address_filter);
++
++ return err;
++}
++
+ void msft_register(struct hci_dev *hdev)
+ {
+ struct msft_data *msft = NULL;
+@@ -581,7 +770,9 @@ void msft_register(struct hci_dev *hdev)
+ }
+
+ INIT_LIST_HEAD(&msft->handle_map);
++ INIT_LIST_HEAD(&msft->address_filters);
+ hdev->msft_data = msft;
++ mutex_init(&msft->filter_lock);
+ }
+
+ void msft_unregister(struct hci_dev *hdev)
+@@ -596,6 +787,7 @@ void msft_unregister(struct hci_dev *hdev)
+ hdev->msft_data = NULL;
+
+ kfree(msft->evt_prefix);
++ mutex_destroy(&msft->filter_lock);
+ kfree(msft);
+ }
+
+@@ -645,11 +837,149 @@ static void *msft_skb_pull(struct hci_dev *hdev, struct sk_buff *skb,
+ return data;
+ }
+
++static int msft_add_address_filter_sync(struct hci_dev *hdev, void *data)
++{
++ struct msft_monitor_addr_filter_data *address_filter = data;
++ struct msft_rp_le_monitor_advertisement *rp;
++ struct msft_cp_le_monitor_advertisement *cp;
++ struct msft_data *msft = hdev->msft_data;
++ struct sk_buff *skb = NULL;
++ bool remove = false;
++ size_t size;
++
++ if (!msft) {
++ bt_dev_err(hdev, "MSFT: msft data is freed");
++ return -EINVAL;
++ }
++
++ /* The address filter has been removed by hci dev close */
++ if (!test_bit(HCI_UP, &hdev->flags))
++ return -ENODEV;
++
++	/* We are safe to use the address filter from now on.
++	 * msft_monitor_device_evt() won't delete this filter because it
++	 * hasn't been added yet.
++	 * And all other functions that require hci_req_sync_lock won't
++	 * touch this filter before this function completes, because it is
++	 * protected by hci_req_sync_lock.
++ */
++
++ if (address_filter->state == AF_STATE_REMOVING) {
++ mutex_lock(&msft->filter_lock);
++ list_del(&address_filter->list);
++ mutex_unlock(&msft->filter_lock);
++ kfree(address_filter);
++ return 0;
++ }
++
++ size = sizeof(*cp) +
++ sizeof(address_filter->addr_type) +
++ sizeof(address_filter->bdaddr);
++ cp = kzalloc(size, GFP_KERNEL);
++ if (!cp) {
++ bt_dev_err(hdev, "MSFT: Alloc cmd param err");
++ remove = true;
++ goto done;
++ }
++ cp->sub_opcode = MSFT_OP_LE_MONITOR_ADVERTISEMENT;
++ cp->rssi_high = address_filter->rssi_high;
++ cp->rssi_low = address_filter->rssi_low;
++ cp->rssi_low_interval = address_filter->rssi_low_interval;
++ cp->rssi_sampling_period = address_filter->rssi_sampling_period;
++ cp->cond_type = MSFT_MONITOR_ADVERTISEMENT_TYPE_ADDR;
++ cp->data[0] = address_filter->addr_type;
++ memcpy(&cp->data[1], &address_filter->bdaddr,
++ sizeof(address_filter->bdaddr));
++
++ skb = __hci_cmd_sync(hdev, hdev->msft_opcode, size, cp,
++ HCI_CMD_TIMEOUT);
++ if (IS_ERR_OR_NULL(skb)) {
++ bt_dev_err(hdev, "Failed to enable address %pMR filter",
++ &address_filter->bdaddr);
++ skb = NULL;
++ remove = true;
++ goto done;
++ }
++
++ rp = skb_pull_data(skb, sizeof(*rp));
++ if (!rp || rp->sub_opcode != MSFT_OP_LE_MONITOR_ADVERTISEMENT ||
++ rp->status)
++ remove = true;
++
++done:
++ mutex_lock(&msft->filter_lock);
++
++ if (remove) {
++ bt_dev_warn(hdev, "MSFT: Remove address (%pMR) filter",
++ &address_filter->bdaddr);
++ list_del(&address_filter->list);
++ kfree(address_filter);
++ } else {
++ address_filter->state = AF_STATE_ADDED;
++ address_filter->msft_handle = rp->handle;
++ bt_dev_dbg(hdev, "MSFT: Address %pMR filter enabled",
++ &address_filter->bdaddr);
++ }
++ mutex_unlock(&msft->filter_lock);
++
++ kfree_skb(skb);
++
++ return 0;
++}
++
++/* This function requires the caller holds msft->filter_lock */
++static struct msft_monitor_addr_filter_data *msft_add_address_filter
++ (struct hci_dev *hdev, u8 addr_type, bdaddr_t *bdaddr,
++ struct msft_monitor_advertisement_handle_data *handle_data)
++{
++ struct msft_monitor_addr_filter_data *address_filter = NULL;
++ struct msft_data *msft = hdev->msft_data;
++ int err;
++
++ address_filter = kzalloc(sizeof(*address_filter), GFP_KERNEL);
++ if (!address_filter)
++ return NULL;
++
++ address_filter->state = AF_STATE_ADDING;
++ address_filter->msft_handle = 0xff;
++ address_filter->pattern_handle = handle_data->msft_handle;
++ address_filter->mgmt_handle = handle_data->mgmt_handle;
++ address_filter->rssi_high = handle_data->rssi_high;
++ address_filter->rssi_low = handle_data->rssi_low;
++ address_filter->rssi_low_interval = handle_data->rssi_low_interval;
++ address_filter->rssi_sampling_period = handle_data->rssi_sampling_period;
++ address_filter->addr_type = addr_type;
++ bacpy(&address_filter->bdaddr, bdaddr);
++
++ /* Setting AF_STATE_ADDING above avoids adding duplicate address
++ * filters when monitor device events (found/lost) arrive frequently
++ * for the same device.
++ */
++ list_add_tail(&address_filter->list, &msft->address_filters);
++
++ err = hci_cmd_sync_queue(hdev, msft_add_address_filter_sync,
++ address_filter, NULL);
++ if (err < 0) {
++ bt_dev_err(hdev, "MSFT: Add address %pMR filter err", bdaddr);
++ list_del(&address_filter->list);
++ kfree(address_filter);
++ return NULL;
++ }
++
++ bt_dev_dbg(hdev, "MSFT: Add device %pMR address filter",
++ &address_filter->bdaddr);
++
++ return address_filter;
++}
++
+ /* This function requires the caller holds hdev->lock */
+ static void msft_monitor_device_evt(struct hci_dev *hdev, struct sk_buff *skb)
+ {
++ struct msft_monitor_addr_filter_data *n, *address_filter = NULL;
+ struct msft_ev_le_monitor_device *ev;
+ struct msft_monitor_advertisement_handle_data *handle_data;
++ struct msft_data *msft = hdev->msft_data;
++ u16 mgmt_handle = 0xffff;
+ u8 addr_type;
+
+ ev = msft_skb_pull(hdev, skb, MSFT_EV_LE_MONITOR_DEVICE, sizeof(*ev));
+@@ -662,9 +992,53 @@ static void msft_monitor_device_evt(struct hci_dev *hdev, struct sk_buff *skb)
+ ev->monitor_state, &ev->bdaddr);
+
+ handle_data = msft_find_handle_data(hdev, ev->monitor_handle, false);
+- if (!handle_data)
++
++ if (!test_bit(HCI_QUIRK_USE_MSFT_EXT_ADDRESS_FILTER, &hdev->quirks)) {
++ if (!handle_data)
++ return;
++ mgmt_handle = handle_data->mgmt_handle;
++ goto report_state;
++ }
++
++ if (handle_data) {
++ /* Don't report any device found/lost event from pattern
++ * monitors. A pattern monitor always has its own address filters
++ * for tracking devices.
++ */
++
++ address_filter = msft_find_address_data(hdev, ev->addr_type,
++ &ev->bdaddr,
++ handle_data->msft_handle);
++ if (address_filter)
++ return;
++
++ if (ev->monitor_state && handle_data->cond_type ==
++ MSFT_MONITOR_ADVERTISEMENT_TYPE_PATTERN)
++ msft_add_address_filter(hdev, ev->addr_type,
++ &ev->bdaddr, handle_data);
++
+ return;
++ }
+
++ /* This device event is not from a pattern monitor.
++ * Report it if there is a corresponding address_filter for it.
++ */
++ list_for_each_entry(n, &msft->address_filters, list) {
++ if (n->state == AF_STATE_ADDED &&
++ n->msft_handle == ev->monitor_handle) {
++ mgmt_handle = n->mgmt_handle;
++ address_filter = n;
++ break;
++ }
++ }
++
++ if (!address_filter) {
++ bt_dev_warn(hdev, "MSFT: Unexpected device event %pMR, %u, %u",
++ &ev->bdaddr, ev->monitor_handle, ev->monitor_state);
++ return;
++ }
++
++report_state:
+ switch (ev->addr_type) {
+ case ADDR_LE_DEV_PUBLIC:
+ addr_type = BDADDR_LE_PUBLIC;
+@@ -681,12 +1055,18 @@ static void msft_monitor_device_evt(struct hci_dev *hdev, struct sk_buff *skb)
+ return;
+ }
+
+- if (ev->monitor_state)
+- msft_device_found(hdev, &ev->bdaddr, addr_type,
+- handle_data->mgmt_handle);
+- else
+- msft_device_lost(hdev, &ev->bdaddr, addr_type,
+- handle_data->mgmt_handle);
++ if (ev->monitor_state) {
++ msft_device_found(hdev, &ev->bdaddr, addr_type, mgmt_handle);
++ } else {
++ if (address_filter && address_filter->state == AF_STATE_ADDED) {
++ address_filter->state = AF_STATE_REMOVING;
++ hci_cmd_sync_queue(hdev,
++ msft_cancel_address_filter_sync,
++ address_filter,
++ NULL);
++ }
++ msft_device_lost(hdev, &ev->bdaddr, addr_type, mgmt_handle);
++ }
+ }
+
+ void msft_vendor_evt(struct hci_dev *hdev, void *data, struct sk_buff *skb)
+@@ -724,7 +1104,9 @@ void msft_vendor_evt(struct hci_dev *hdev, void *data, struct sk_buff *skb)
+
+ switch (*evt) {
+ case MSFT_EV_LE_MONITOR_DEVICE:
++ mutex_lock(&msft->filter_lock);
+ msft_monitor_device_evt(hdev, skb);
++ mutex_unlock(&msft->filter_lock);
+ break;
+
+ default:
+diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
+index 7762604ddfc05..99b149261949a 100644
+--- a/net/bluetooth/sco.c
++++ b/net/bluetooth/sco.c
+@@ -1267,7 +1267,7 @@ static int sco_sock_release(struct socket *sock)
+
+ sco_sock_close(sk);
+
+- if (sock_flag(sk, SOCK_LINGER) && sk->sk_lingertime &&
++ if (sock_flag(sk, SOCK_LINGER) && READ_ONCE(sk->sk_lingertime) &&
+ !(current->flags & PF_EXITING)) {
+ lock_sock(sk);
+ err = bt_sock_wait_state(sk, BT_CLOSED, sk->sk_lingertime);
+diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
+index b65962682771f..75204d36d7f90 100644
+--- a/net/bridge/br_stp_if.c
++++ b/net/bridge/br_stp_if.c
+@@ -201,9 +201,6 @@ int br_stp_set_enabled(struct net_bridge *br, unsigned long val,
+ {
+ ASSERT_RTNL();
+
+- if (!net_eq(dev_net(br->dev), &init_net))
+- NL_SET_ERR_MSG_MOD(extack, "STP does not work in non-root netns");
+-
+ if (br_mrp_enabled(br)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "STP can't be enabled if MRP is already enabled");
+diff --git a/net/core/filter.c b/net/core/filter.c
+index f15ae393c2767..a9e93d528869f 100644
+--- a/net/core/filter.c
++++ b/net/core/filter.c
+@@ -7337,6 +7337,8 @@ BPF_CALL_3(bpf_sk_assign, struct sk_buff *, skb, struct sock *, sk, u64, flags)
+ return -ENETUNREACH;
+ if (unlikely(sk_fullsock(sk) && sk->sk_reuseport))
+ return -ESOCKTNOSUPPORT;
++ if (sk_unhashed(sk))
++ return -EOPNOTSUPP;
+ if (sk_is_refcounted(sk) &&
+ unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
+ return -ENOENT;
+diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
+index 8b6b5e72b2179..4a0797f0a154b 100644
+--- a/net/core/lwt_bpf.c
++++ b/net/core/lwt_bpf.c
+@@ -60,9 +60,8 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
+ ret = BPF_OK;
+ } else {
+ skb_reset_mac_header(skb);
+- ret = skb_do_redirect(skb);
+- if (ret == 0)
+- ret = BPF_REDIRECT;
++ skb_do_redirect(skb);
++ ret = BPF_REDIRECT;
+ }
+ break;
+
+@@ -255,7 +254,7 @@ static int bpf_lwt_xmit_reroute(struct sk_buff *skb)
+
+ err = dst_output(dev_net(skb_dst(skb)->dev), skb->sk, skb);
+ if (unlikely(err))
+- return err;
++ return net_xmit_errno(err);
+
+ /* ip[6]_finish_output2 understand LWTUNNEL_XMIT_DONE */
+ return LWTUNNEL_XMIT_DONE;
+diff --git a/net/core/skbuff.c b/net/core/skbuff.c
+index 593ec18e3f007..0c0fef73be709 100644
+--- a/net/core/skbuff.c
++++ b/net/core/skbuff.c
+@@ -559,7 +559,7 @@ static void *kmalloc_reserve(unsigned int *size, gfp_t flags, int node,
+ bool *pfmemalloc)
+ {
+ bool ret_pfmemalloc = false;
+- unsigned int obj_size;
++ size_t obj_size;
+ void *obj;
+
+ obj_size = SKB_HEAD_ALIGN(*size);
+@@ -578,7 +578,13 @@ static void *kmalloc_reserve(unsigned int *size, gfp_t flags, int node,
+ goto out;
+ }
+ #endif
+- *size = obj_size = kmalloc_size_roundup(obj_size);
++
++ obj_size = kmalloc_size_roundup(obj_size);
++ /* The following cast might truncate high-order bits of obj_size, this
++ * is harmless because kmalloc(obj_size >= 2^32) will fail anyway.
++ */
++ *size = (unsigned int)obj_size;
++
+ /*
+ * Try a regular allocation, when that fails and we're not entitled
+ * to the reserves, fail.
+@@ -4364,21 +4370,20 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
+ struct sk_buff *segs = NULL;
+ struct sk_buff *tail = NULL;
+ struct sk_buff *list_skb = skb_shinfo(head_skb)->frag_list;
+- skb_frag_t *frag = skb_shinfo(head_skb)->frags;
+ unsigned int mss = skb_shinfo(head_skb)->gso_size;
+ unsigned int doffset = head_skb->data - skb_mac_header(head_skb);
+- struct sk_buff *frag_skb = head_skb;
+ unsigned int offset = doffset;
+ unsigned int tnl_hlen = skb_tnl_header_len(head_skb);
+ unsigned int partial_segs = 0;
+ unsigned int headroom;
+ unsigned int len = head_skb->len;
++ struct sk_buff *frag_skb;
++ skb_frag_t *frag;
+ __be16 proto;
+ bool csum, sg;
+- int nfrags = skb_shinfo(head_skb)->nr_frags;
+ int err = -ENOMEM;
+ int i = 0;
+- int pos;
++ int nfrags, pos;
+
+ if ((skb_shinfo(head_skb)->gso_type & SKB_GSO_DODGY) &&
+ mss != GSO_BY_FRAGS && mss != skb_headlen(head_skb)) {
+@@ -4455,6 +4460,13 @@ normal:
+ headroom = skb_headroom(head_skb);
+ pos = skb_headlen(head_skb);
+
++ if (skb_orphan_frags(head_skb, GFP_ATOMIC))
++ return ERR_PTR(-ENOMEM);
++
++ nfrags = skb_shinfo(head_skb)->nr_frags;
++ frag = skb_shinfo(head_skb)->frags;
++ frag_skb = head_skb;
++
+ do {
+ struct sk_buff *nskb;
+ skb_frag_t *nskb_frag;
+@@ -4475,6 +4487,10 @@ normal:
+ (skb_headlen(list_skb) == len || sg)) {
+ BUG_ON(skb_headlen(list_skb) > len);
+
++ nskb = skb_clone(list_skb, GFP_ATOMIC);
++ if (unlikely(!nskb))
++ goto err;
++
+ i = 0;
+ nfrags = skb_shinfo(list_skb)->nr_frags;
+ frag = skb_shinfo(list_skb)->frags;
+@@ -4493,12 +4509,8 @@ normal:
+ frag++;
+ }
+
+- nskb = skb_clone(list_skb, GFP_ATOMIC);
+ list_skb = list_skb->next;
+
+- if (unlikely(!nskb))
+- goto err;
+-
+ if (unlikely(pskb_trim(nskb, len))) {
+ kfree_skb(nskb);
+ goto err;
+@@ -4574,12 +4586,16 @@ normal:
+ skb_shinfo(nskb)->flags |= skb_shinfo(head_skb)->flags &
+ SKBFL_SHARED_FRAG;
+
+- if (skb_orphan_frags(frag_skb, GFP_ATOMIC) ||
+- skb_zerocopy_clone(nskb, frag_skb, GFP_ATOMIC))
++ if (skb_zerocopy_clone(nskb, frag_skb, GFP_ATOMIC))
+ goto err;
+
+ while (pos < offset + len) {
+ if (i >= nfrags) {
++ if (skb_orphan_frags(list_skb, GFP_ATOMIC) ||
++ skb_zerocopy_clone(nskb, list_skb,
++ GFP_ATOMIC))
++ goto err;
++
+ i = 0;
+ nfrags = skb_shinfo(list_skb)->nr_frags;
+ frag = skb_shinfo(list_skb)->frags;
+@@ -4593,10 +4609,6 @@ normal:
+ i--;
+ frag--;
+ }
+- if (skb_orphan_frags(frag_skb, GFP_ATOMIC) ||
+- skb_zerocopy_clone(nskb, frag_skb,
+- GFP_ATOMIC))
+- goto err;
+
+ list_skb = list_skb->next;
+ }
+diff --git a/net/core/sock.c b/net/core/sock.c
+index 8451a95266bf0..b2083a359ec10 100644
+--- a/net/core/sock.c
++++ b/net/core/sock.c
+@@ -425,6 +425,7 @@ static int sock_set_timeout(long *timeo_p, sockptr_t optval, int optlen,
+ {
+ struct __kernel_sock_timeval tv;
+ int err = sock_copy_user_timeval(&tv, optval, optlen, old_timeval);
++ long val;
+
+ if (err)
+ return err;
+@@ -435,7 +436,7 @@ static int sock_set_timeout(long *timeo_p, sockptr_t optval, int optlen,
+ if (tv.tv_sec < 0) {
+ static int warned __read_mostly;
+
+- *timeo_p = 0;
++ WRITE_ONCE(*timeo_p, 0);
+ if (warned < 10 && net_ratelimit()) {
+ warned++;
+ pr_info("%s: `%s' (pid %d) tries to set negative timeout\n",
+@@ -443,11 +444,12 @@ static int sock_set_timeout(long *timeo_p, sockptr_t optval, int optlen,
+ }
+ return 0;
+ }
+- *timeo_p = MAX_SCHEDULE_TIMEOUT;
+- if (tv.tv_sec == 0 && tv.tv_usec == 0)
+- return 0;
+- if (tv.tv_sec < (MAX_SCHEDULE_TIMEOUT / HZ - 1))
+- *timeo_p = tv.tv_sec * HZ + DIV_ROUND_UP((unsigned long)tv.tv_usec, USEC_PER_SEC / HZ);
++ val = MAX_SCHEDULE_TIMEOUT;
++ if ((tv.tv_sec || tv.tv_usec) &&
++ (tv.tv_sec < (MAX_SCHEDULE_TIMEOUT / HZ - 1)))
++ val = tv.tv_sec * HZ + DIV_ROUND_UP((unsigned long)tv.tv_usec,
++ USEC_PER_SEC / HZ);
++ WRITE_ONCE(*timeo_p, val);
+ return 0;
+ }
+
+@@ -791,7 +793,7 @@ EXPORT_SYMBOL(sock_set_reuseport);
+ void sock_no_linger(struct sock *sk)
+ {
+ lock_sock(sk);
+- sk->sk_lingertime = 0;
++ WRITE_ONCE(sk->sk_lingertime, 0);
+ sock_set_flag(sk, SOCK_LINGER);
+ release_sock(sk);
+ }
+@@ -809,9 +811,9 @@ void sock_set_sndtimeo(struct sock *sk, s64 secs)
+ {
+ lock_sock(sk);
+ if (secs && secs < MAX_SCHEDULE_TIMEOUT / HZ - 1)
+- sk->sk_sndtimeo = secs * HZ;
++ WRITE_ONCE(sk->sk_sndtimeo, secs * HZ);
+ else
+- sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
++ WRITE_ONCE(sk->sk_sndtimeo, MAX_SCHEDULE_TIMEOUT);
+ release_sock(sk);
+ }
+ EXPORT_SYMBOL(sock_set_sndtimeo);
+@@ -1224,15 +1226,15 @@ set_sndbuf:
+ ret = -EFAULT;
+ break;
+ }
+- if (!ling.l_onoff)
++ if (!ling.l_onoff) {
+ sock_reset_flag(sk, SOCK_LINGER);
+- else {
+-#if (BITS_PER_LONG == 32)
+- if ((unsigned int)ling.l_linger >= MAX_SCHEDULE_TIMEOUT/HZ)
+- sk->sk_lingertime = MAX_SCHEDULE_TIMEOUT;
++ } else {
++ unsigned long t_sec = ling.l_linger;
++
++ if (t_sec >= MAX_SCHEDULE_TIMEOUT / HZ)
++ WRITE_ONCE(sk->sk_lingertime, MAX_SCHEDULE_TIMEOUT);
+ else
+-#endif
+- sk->sk_lingertime = (unsigned int)ling.l_linger * HZ;
++ WRITE_ONCE(sk->sk_lingertime, t_sec * HZ);
+ sock_set_flag(sk, SOCK_LINGER);
+ }
+ break;
+@@ -1678,7 +1680,7 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+ case SO_LINGER:
+ lv = sizeof(v.ling);
+ v.ling.l_onoff = sock_flag(sk, SOCK_LINGER);
+- v.ling.l_linger = sk->sk_lingertime / HZ;
++ v.ling.l_linger = READ_ONCE(sk->sk_lingertime) / HZ;
+ break;
+
+ case SO_BSDCOMPAT:
+@@ -1710,12 +1712,14 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
+
+ case SO_RCVTIMEO_OLD:
+ case SO_RCVTIMEO_NEW:
+- lv = sock_get_timeout(sk->sk_rcvtimeo, &v, SO_RCVTIMEO_OLD == optname);
++ lv = sock_get_timeout(READ_ONCE(sk->sk_rcvtimeo), &v,
++ SO_RCVTIMEO_OLD == optname);
+ break;
+
+ case SO_SNDTIMEO_OLD:
+ case SO_SNDTIMEO_NEW:
+- lv = sock_get_timeout(sk->sk_sndtimeo, &v, SO_SNDTIMEO_OLD == optname);
++ lv = sock_get_timeout(READ_ONCE(sk->sk_sndtimeo), &v,
++ SO_SNDTIMEO_OLD == optname);
+ break;
+
+ case SO_RCVLOWAT:
+diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
+index e7b9703bd1a1a..1097a13b07a06 100644
+--- a/net/dccp/ipv4.c
++++ b/net/dccp/ipv4.c
+@@ -255,12 +255,17 @@ static int dccp_v4_err(struct sk_buff *skb, u32 info)
+ int err;
+ struct net *net = dev_net(skb->dev);
+
+- /* Only need dccph_dport & dccph_sport which are the first
+- * 4 bytes in dccp header.
++ /* For the first __dccp_basic_hdr_len() check, we only need dh->dccph_x,
++ * which is in byte 7 of the dccp header.
+ * Our caller (icmp_socket_deliver()) already pulled 8 bytes for us.
++ *
++ * Later on, we want to access the sequence number fields, which are
++ * beyond 8 bytes, so we have to pskb_may_pull() ourselves.
+ */
+- BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_sport) > 8);
+- BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_dport) > 8);
++ dh = (struct dccp_hdr *)(skb->data + offset);
++ if (!pskb_may_pull(skb, offset + __dccp_basic_hdr_len(dh)))
++ return -EINVAL;
++ iph = (struct iphdr *)skb->data;
+ dh = (struct dccp_hdr *)(skb->data + offset);
+
+ sk = __inet_lookup_established(net, &dccp_hashinfo,
+diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
+index 94b69a50c8b50..278bd6dc043f9 100644
+--- a/net/dccp/ipv6.c
++++ b/net/dccp/ipv6.c
+@@ -74,7 +74,7 @@ static inline __u64 dccp_v6_init_sequence(struct sk_buff *skb)
+ static int dccp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
+ u8 type, u8 code, int offset, __be32 info)
+ {
+- const struct ipv6hdr *hdr = (const struct ipv6hdr *)skb->data;
++ const struct ipv6hdr *hdr;
+ const struct dccp_hdr *dh;
+ struct dccp_sock *dp;
+ struct ipv6_pinfo *np;
+@@ -83,12 +83,17 @@ static int dccp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
+ __u64 seq;
+ struct net *net = dev_net(skb->dev);
+
+- /* Only need dccph_dport & dccph_sport which are the first
+- * 4 bytes in dccp header.
++ /* For the first __dccp_basic_hdr_len() check, we only need dh->dccph_x,
++ * which is in byte 7 of the dccp header.
+ * Our caller (icmpv6_notify()) already pulled 8 bytes for us.
++ *
++ * Later on, we want to access the sequence number fields, which are
++ * beyond 8 bytes, so we have to pskb_may_pull() ourselves.
+ */
+- BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_sport) > 8);
+- BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_dport) > 8);
++ dh = (struct dccp_hdr *)(skb->data + offset);
++ if (!pskb_may_pull(skb, offset + __dccp_basic_hdr_len(dh)))
++ return -EINVAL;
++ hdr = (const struct ipv6hdr *)skb->data;
+ dh = (struct dccp_hdr *)(skb->data + offset);
+
+ sk = __inet6_lookup_established(net, &dccp_hashinfo,
+diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
+index 48ff5f13e7979..193d8362efe2e 100644
+--- a/net/ipv4/igmp.c
++++ b/net/ipv4/igmp.c
+@@ -353,8 +353,9 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu)
+ struct flowi4 fl4;
+ int hlen = LL_RESERVED_SPACE(dev);
+ int tlen = dev->needed_tailroom;
+- unsigned int size = mtu;
++ unsigned int size;
+
++ size = min(mtu, IP_MAX_MTU);
+ while (1) {
+ skb = alloc_skb(size + hlen + tlen,
+ GFP_ATOMIC | __GFP_NOWARN);
+diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
+index 6f6f63cf9224f..625da48741a4f 100644
+--- a/net/ipv4/ip_output.c
++++ b/net/ipv4/ip_output.c
+@@ -216,7 +216,7 @@ static int ip_finish_output2(struct net *net, struct sock *sk, struct sk_buff *s
+ if (lwtunnel_xmit_redirect(dst->lwtstate)) {
+ int res = lwtunnel_xmit(skb);
+
+- if (res < 0 || res == LWTUNNEL_XMIT_DONE)
++ if (res != LWTUNNEL_XMIT_CONTINUE)
+ return res;
+ }
+
+diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
+index 57f1e4883b761..094b3e266bbea 100644
+--- a/net/ipv4/tcp_input.c
++++ b/net/ipv4/tcp_input.c
+@@ -287,7 +287,7 @@ static void tcp_incr_quickack(struct sock *sk, unsigned int max_quickacks)
+ icsk->icsk_ack.quick = quickacks;
+ }
+
+-void tcp_enter_quickack_mode(struct sock *sk, unsigned int max_quickacks)
++static void tcp_enter_quickack_mode(struct sock *sk, unsigned int max_quickacks)
+ {
+ struct inet_connection_sock *icsk = inet_csk(sk);
+
+@@ -295,7 +295,6 @@ void tcp_enter_quickack_mode(struct sock *sk, unsigned int max_quickacks)
+ inet_csk_exit_pingpong_mode(sk);
+ icsk->icsk_ack.ato = TCP_ATO_MIN;
+ }
+-EXPORT_SYMBOL(tcp_enter_quickack_mode);
+
+ /* Send ACKs quickly, if "quick" count is not exhausted
+ * and the session is not interactive.
+diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
+index 366c3c25ebe20..db90bd2d4ed66 100644
+--- a/net/ipv4/tcp_timer.c
++++ b/net/ipv4/tcp_timer.c
+@@ -441,6 +441,22 @@ static void tcp_fastopen_synack_timer(struct sock *sk, struct request_sock *req)
+ req->timeout << req->num_timeout, TCP_RTO_MAX);
+ }
+
++static bool tcp_rtx_probe0_timed_out(const struct sock *sk,
++ const struct sk_buff *skb)
++{
++ const struct tcp_sock *tp = tcp_sk(sk);
++ const int timeout = TCP_RTO_MAX * 2;
++ u32 rcv_delta, rtx_delta;
++
++ rcv_delta = inet_csk(sk)->icsk_timeout - tp->rcv_tstamp;
++ if (rcv_delta <= timeout)
++ return false;
++
++ rtx_delta = (u32)msecs_to_jiffies(tcp_time_stamp(tp) -
++ (tp->retrans_stamp ?: tcp_skb_timestamp(skb)));
++
++ return rtx_delta > timeout;
++}
+
+ /**
+ * tcp_retransmit_timer() - The TCP retransmit timeout handler
+@@ -506,7 +522,7 @@ void tcp_retransmit_timer(struct sock *sk)
+ tp->snd_una, tp->snd_nxt);
+ }
+ #endif
+- if (tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX) {
++ if (tcp_rtx_probe0_timed_out(sk, skb)) {
+ tcp_write_err(sk);
+ goto out;
+ }
+diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
+index 6d327d6d978c5..a3302136ce92e 100644
+--- a/net/ipv4/udp.c
++++ b/net/ipv4/udp.c
+@@ -452,14 +452,24 @@ static struct sock *udp4_lib_lookup2(struct net *net,
+ score = compute_score(sk, net, saddr, sport,
+ daddr, hnum, dif, sdif);
+ if (score > badness) {
+- result = lookup_reuseport(net, sk, skb,
+- saddr, sport, daddr, hnum);
++ badness = score;
++ result = lookup_reuseport(net, sk, skb, saddr, sport, daddr, hnum);
++ if (!result) {
++ result = sk;
++ continue;
++ }
++
+ /* Fall back to scoring if group has connections */
+- if (result && !reuseport_has_conns(sk))
++ if (!reuseport_has_conns(sk))
+ return result;
+
+- result = result ? : sk;
+- badness = score;
++ /* Reuseport logic returned an error, keep original score. */
++ if (IS_ERR(result))
++ continue;
++
++ badness = compute_score(result, net, saddr, sport,
++ daddr, hnum, dif, sdif);
++
+ }
+ }
+ return result;
+diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
+index 4a27fab1d09a3..50f8d2ac4e246 100644
+--- a/net/ipv6/ip6_output.c
++++ b/net/ipv6/ip6_output.c
+@@ -113,7 +113,7 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
+ if (lwtunnel_xmit_redirect(dst->lwtstate)) {
+ int res = lwtunnel_xmit(skb);
+
+- if (res < 0 || res == LWTUNNEL_XMIT_DONE)
++ if (res != LWTUNNEL_XMIT_CONTINUE)
+ return res;
+ }
+
+diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
+index 8521729fb2375..9c7457823eb97 100644
+--- a/net/ipv6/udp.c
++++ b/net/ipv6/udp.c
+@@ -195,14 +195,23 @@ static struct sock *udp6_lib_lookup2(struct net *net,
+ score = compute_score(sk, net, saddr, sport,
+ daddr, hnum, dif, sdif);
+ if (score > badness) {
+- result = lookup_reuseport(net, sk, skb,
+- saddr, sport, daddr, hnum);
++ badness = score;
++ result = lookup_reuseport(net, sk, skb, saddr, sport, daddr, hnum);
++ if (!result) {
++ result = sk;
++ continue;
++ }
++
+ /* Fall back to scoring if group has connections */
+- if (result && !reuseport_has_conns(sk))
++ if (!reuseport_has_conns(sk))
+ return result;
+
+- result = result ? : sk;
+- badness = score;
++ /* Reuseport logic returned an error, keep original score. */
++ if (IS_ERR(result))
++ continue;
++
++ badness = compute_score(sk, net, saddr, sport,
++ daddr, hnum, dif, sdif);
+ }
+ }
+ return result;
+diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
+index f2d08dbccfb7d..30d69091064fe 100644
+--- a/net/mac80211/cfg.c
++++ b/net/mac80211/cfg.c
+@@ -3640,12 +3640,6 @@ static int __ieee80211_csa_finalize(struct ieee80211_sub_if_data *sdata)
+ lockdep_assert_held(&local->mtx);
+ lockdep_assert_held(&local->chanctx_mtx);
+
+- if (sdata->vif.bss_conf.eht_puncturing != sdata->vif.bss_conf.csa_punct_bitmap) {
+- sdata->vif.bss_conf.eht_puncturing =
+- sdata->vif.bss_conf.csa_punct_bitmap;
+- changed |= BSS_CHANGED_EHT_PUNCTURING;
+- }
+-
+ /*
+ * using reservation isn't immediate as it may be deferred until later
+ * with multi-vif. once reservation is complete it will re-schedule the
+@@ -3675,6 +3669,12 @@ static int __ieee80211_csa_finalize(struct ieee80211_sub_if_data *sdata)
+ if (err)
+ return err;
+
++ if (sdata->vif.bss_conf.eht_puncturing != sdata->vif.bss_conf.csa_punct_bitmap) {
++ sdata->vif.bss_conf.eht_puncturing =
++ sdata->vif.bss_conf.csa_punct_bitmap;
++ changed |= BSS_CHANGED_EHT_PUNCTURING;
++ }
++
+ ieee80211_link_info_change_notify(sdata, &sdata->deflink, changed);
+
+ if (sdata->deflink.csa_block_tx) {
+diff --git a/net/netfilter/ipset/ip_set_hash_netportnet.c b/net/netfilter/ipset/ip_set_hash_netportnet.c
+index 005a7ce87217e..bf4f91b78e1dc 100644
+--- a/net/netfilter/ipset/ip_set_hash_netportnet.c
++++ b/net/netfilter/ipset/ip_set_hash_netportnet.c
+@@ -36,6 +36,7 @@ MODULE_ALIAS("ip_set_hash:net,port,net");
+ #define IP_SET_HASH_WITH_PROTO
+ #define IP_SET_HASH_WITH_NETS
+ #define IPSET_NET_COUNT 2
++#define IP_SET_HASH_WITH_NET0
+
+ /* IPv4 variant */
+
+diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
+index a54a7f772cec2..7effc7260fec8 100644
+--- a/net/netfilter/nft_exthdr.c
++++ b/net/netfilter/nft_exthdr.c
+@@ -237,7 +237,12 @@ static void nft_exthdr_tcp_set_eval(const struct nft_expr *expr,
+ if (!tcph)
+ goto err;
+
++ if (skb_ensure_writable(pkt->skb, nft_thoff(pkt) + tcphdr_len))
++ goto err;
++
++ tcph = (struct tcphdr *)(pkt->skb->data + nft_thoff(pkt));
+ opt = (u8 *)tcph;
++
+ for (i = sizeof(*tcph); i < tcphdr_len - 1; i += optl) {
+ union {
+ __be16 v16;
+@@ -252,15 +257,6 @@ static void nft_exthdr_tcp_set_eval(const struct nft_expr *expr,
+ if (i + optl > tcphdr_len || priv->len + priv->offset > optl)
+ goto err;
+
+- if (skb_ensure_writable(pkt->skb,
+- nft_thoff(pkt) + i + priv->len))
+- goto err;
+-
+- tcph = nft_tcp_header_pointer(pkt, sizeof(buff), buff,
+- &tcphdr_len);
+- if (!tcph)
+- goto err;
+-
+ offset = i + priv->offset;
+
+ switch (priv->len) {
+@@ -324,9 +320,9 @@ static void nft_exthdr_tcp_strip_eval(const struct nft_expr *expr,
+ if (skb_ensure_writable(pkt->skb, nft_thoff(pkt) + tcphdr_len))
+ goto drop;
+
+- opt = (u8 *)nft_tcp_header_pointer(pkt, sizeof(buff), buff, &tcphdr_len);
+- if (!opt)
+- goto err;
++ tcph = (struct tcphdr *)(pkt->skb->data + nft_thoff(pkt));
++ opt = (u8 *)tcph;
++
+ for (i = sizeof(*tcph); i < tcphdr_len - 1; i += optl) {
+ unsigned int j;
+
+diff --git a/net/netfilter/xt_sctp.c b/net/netfilter/xt_sctp.c
+index e8961094a2822..b46a6a5120583 100644
+--- a/net/netfilter/xt_sctp.c
++++ b/net/netfilter/xt_sctp.c
+@@ -149,6 +149,8 @@ static int sctp_mt_check(const struct xt_mtchk_param *par)
+ {
+ const struct xt_sctp_info *info = par->matchinfo;
+
++ if (info->flag_count > ARRAY_SIZE(info->flag_info))
++ return -EINVAL;
+ if (info->flags & ~XT_SCTP_VALID_FLAGS)
+ return -EINVAL;
+ if (info->invflags & ~XT_SCTP_VALID_FLAGS)
+diff --git a/net/netfilter/xt_u32.c b/net/netfilter/xt_u32.c
+index 177b40d08098b..117d4615d6684 100644
+--- a/net/netfilter/xt_u32.c
++++ b/net/netfilter/xt_u32.c
+@@ -96,11 +96,32 @@ static bool u32_mt(const struct sk_buff *skb, struct xt_action_param *par)
+ return ret ^ data->invert;
+ }
+
++static int u32_mt_checkentry(const struct xt_mtchk_param *par)
++{
++ const struct xt_u32 *data = par->matchinfo;
++ const struct xt_u32_test *ct;
++ unsigned int i;
++
++ if (data->ntests > ARRAY_SIZE(data->tests))
++ return -EINVAL;
++
++ for (i = 0; i < data->ntests; ++i) {
++ ct = &data->tests[i];
++
++ if (ct->nnums > ARRAY_SIZE(ct->location) ||
++ ct->nvalues > ARRAY_SIZE(ct->value))
++ return -EINVAL;
++ }
++
++ return 0;
++}
++
+ static struct xt_match xt_u32_mt_reg __read_mostly = {
+ .name = "u32",
+ .revision = 0,
+ .family = NFPROTO_UNSPEC,
+ .match = u32_mt,
++ .checkentry = u32_mt_checkentry,
+ .matchsize = sizeof(struct xt_u32),
+ .me = THIS_MODULE,
+ };
+diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
+index 5a4cb796150f5..ec5747969f964 100644
+--- a/net/netrom/af_netrom.c
++++ b/net/netrom/af_netrom.c
+@@ -660,6 +660,11 @@ static int nr_connect(struct socket *sock, struct sockaddr *uaddr,
+ goto out_release;
+ }
+
++ if (sock->state == SS_CONNECTING) {
++ err = -EALREADY;
++ goto out_release;
++ }
++
+ sk->sk_state = TCP_CLOSE;
+ sock->state = SS_UNCONNECTED;
+
+diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
+index af85a73c4c545..da34fd4c92695 100644
+--- a/net/sched/em_meta.c
++++ b/net/sched/em_meta.c
+@@ -502,7 +502,7 @@ META_COLLECTOR(int_sk_lingertime)
+ *err = -1;
+ return;
+ }
+- dst->value = sk->sk_lingertime / HZ;
++ dst->value = READ_ONCE(sk->sk_lingertime) / HZ;
+ }
+
+ META_COLLECTOR(int_sk_err_qlen)
+@@ -568,7 +568,7 @@ META_COLLECTOR(int_sk_rcvtimeo)
+ *err = -1;
+ return;
+ }
+- dst->value = sk->sk_rcvtimeo / HZ;
++ dst->value = READ_ONCE(sk->sk_rcvtimeo) / HZ;
+ }
+
+ META_COLLECTOR(int_sk_sndtimeo)
+@@ -579,7 +579,7 @@ META_COLLECTOR(int_sk_sndtimeo)
+ *err = -1;
+ return;
+ }
+- dst->value = sk->sk_sndtimeo / HZ;
++ dst->value = READ_ONCE(sk->sk_sndtimeo) / HZ;
+ }
+
+ META_COLLECTOR(int_sk_sendmsg_off)
+diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
+index 70b0c5873d326..61d52594ff6d8 100644
+--- a/net/sched/sch_hfsc.c
++++ b/net/sched/sch_hfsc.c
+@@ -1012,6 +1012,10 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
+ if (parent == NULL)
+ return -ENOENT;
+ }
++ if (!(parent->cl_flags & HFSC_FSC) && parent != &q->root) {
++ NL_SET_ERR_MSG(extack, "Invalid parent - parent class must have FSC");
++ return -EINVAL;
++ }
+
+ if (classid == 0 || TC_H_MAJ(classid ^ sch->handle) != 0)
+ return -EINVAL;
+diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
+index f94e7a04e33d0..462ece6bb1802 100644
+--- a/net/smc/af_smc.c
++++ b/net/smc/af_smc.c
+@@ -1820,7 +1820,7 @@ void smc_close_non_accepted(struct sock *sk)
+ lock_sock(sk);
+ if (!sk->sk_lingertime)
+ /* wait for peer closing */
+- sk->sk_lingertime = SMC_MAX_STREAM_WAIT_TIMEOUT;
++ WRITE_ONCE(sk->sk_lingertime, SMC_MAX_STREAM_WAIT_TIMEOUT);
+ __smc_release(smc);
+ release_sock(sk);
+ sock_put(sk); /* sock_hold above */
+diff --git a/net/socket.c b/net/socket.c
+index b7e01d0fe0824..5da8bb1ff8eab 100644
+--- a/net/socket.c
++++ b/net/socket.c
+@@ -3528,7 +3528,11 @@ EXPORT_SYMBOL(kernel_accept);
+ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
+ int flags)
+ {
+- return sock->ops->connect(sock, addr, addrlen, flags);
++ struct sockaddr_storage address;
++
++ memcpy(&address, addr, addrlen);
++
++ return sock->ops->connect(sock, (struct sockaddr *)&address, addrlen, flags);
+ }
+ EXPORT_SYMBOL(kernel_connect);
+
+diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
+index 1b688745ce0a1..be798ce8a20ff 100644
+--- a/net/wireless/nl80211.c
++++ b/net/wireless/nl80211.c
+@@ -323,6 +323,7 @@ nl80211_pmsr_ftm_req_attr_policy[NL80211_PMSR_FTM_REQ_ATTR_MAX + 1] = {
+ [NL80211_PMSR_FTM_REQ_ATTR_TRIGGER_BASED] = { .type = NLA_FLAG },
+ [NL80211_PMSR_FTM_REQ_ATTR_NON_TRIGGER_BASED] = { .type = NLA_FLAG },
+ [NL80211_PMSR_FTM_REQ_ATTR_LMR_FEEDBACK] = { .type = NLA_FLAG },
++ [NL80211_PMSR_FTM_REQ_ATTR_BSS_COLOR] = { .type = NLA_U8 },
+ };
+
+ static const struct nla_policy
+diff --git a/samples/bpf/tracex3_kern.c b/samples/bpf/tracex3_kern.c
+index bde6591cb20c5..af235bd6615b1 100644
+--- a/samples/bpf/tracex3_kern.c
++++ b/samples/bpf/tracex3_kern.c
+@@ -11,6 +11,12 @@
+ #include <bpf/bpf_helpers.h>
+ #include <bpf/bpf_tracing.h>
+
++struct start_key {
++ dev_t dev;
++ u32 _pad;
++ sector_t sector;
++};
++
+ struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __type(key, long);
+@@ -18,16 +24,17 @@ struct {
+ __uint(max_entries, 4096);
+ } my_map SEC(".maps");
+
+-/* kprobe is NOT a stable ABI. If kernel internals change this bpf+kprobe
+- * example will no longer be meaningful
+- */
+-SEC("kprobe/blk_mq_start_request")
+-int bpf_prog1(struct pt_regs *ctx)
++/* from /sys/kernel/tracing/events/block/block_io_start/format */
++SEC("tracepoint/block/block_io_start")
++int bpf_prog1(struct trace_event_raw_block_rq *ctx)
+ {
+- long rq = PT_REGS_PARM1(ctx);
+ u64 val = bpf_ktime_get_ns();
++ struct start_key key = {
++ .dev = ctx->dev,
++ .sector = ctx->sector
++ };
+
+- bpf_map_update_elem(&my_map, &rq, &val, BPF_ANY);
++ bpf_map_update_elem(&my_map, &key, &val, BPF_ANY);
+ return 0;
+ }
+
+@@ -49,21 +56,26 @@ struct {
+ __uint(max_entries, SLOTS);
+ } lat_map SEC(".maps");
+
+-SEC("kprobe/__blk_account_io_done")
+-int bpf_prog2(struct pt_regs *ctx)
++/* from /sys/kernel/tracing/events/block/block_io_done/format */
++SEC("tracepoint/block/block_io_done")
++int bpf_prog2(struct trace_event_raw_block_rq *ctx)
+ {
+- long rq = PT_REGS_PARM1(ctx);
++ struct start_key key = {
++ .dev = ctx->dev,
++ .sector = ctx->sector
++ };
++
+ u64 *value, l, base;
+ u32 index;
+
+- value = bpf_map_lookup_elem(&my_map, &rq);
++ value = bpf_map_lookup_elem(&my_map, &key);
+ if (!value)
+ return 0;
+
+ u64 cur_time = bpf_ktime_get_ns();
+ u64 delta = cur_time - *value;
+
+- bpf_map_delete_elem(&my_map, &rq);
++ bpf_map_delete_elem(&my_map, &key);
+
+ /* the lines below are computing index = log10(delta)*10
+ * using integer arithmetic
+diff --git a/samples/bpf/tracex6_kern.c b/samples/bpf/tracex6_kern.c
+index acad5712d8b4f..fd602c2774b8b 100644
+--- a/samples/bpf/tracex6_kern.c
++++ b/samples/bpf/tracex6_kern.c
+@@ -2,6 +2,8 @@
+ #include <linux/version.h>
+ #include <uapi/linux/bpf.h>
+ #include <bpf/bpf_helpers.h>
++#include <bpf/bpf_tracing.h>
++#include <bpf/bpf_core_read.h>
+
+ struct {
+ __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+@@ -45,13 +47,24 @@ int bpf_prog1(struct pt_regs *ctx)
+ return 0;
+ }
+
+-SEC("kprobe/htab_map_lookup_elem")
+-int bpf_prog2(struct pt_regs *ctx)
++/*
++ * Since *_map_lookup_elem can't be expected to trigger bpf programs
++ * due to potential deadlocks (bpf_disable_instrumentation), this bpf
++ * program will be attached to bpf_map_copy_value (which is called
++ * from map_lookup_elem) and will only filter the hashtable type.
++ */
++SEC("kprobe/bpf_map_copy_value")
++int BPF_KPROBE(bpf_prog2, struct bpf_map *map)
+ {
+ u32 key = bpf_get_smp_processor_id();
+ struct bpf_perf_event_value *val, buf;
++ enum bpf_map_type type;
+ int error;
+
++ type = BPF_CORE_READ(map, map_type);
++ if (type != BPF_MAP_TYPE_HASH)
++ return 0;
++
+ error = bpf_perf_event_read_value(&counters, key, &buf, sizeof(buf));
+ if (error)
+ return 0;
+diff --git a/scripts/gdb/linux/constants.py.in b/scripts/gdb/linux/constants.py.in
+index 50a92c4e9984e..fab74ca9df6fc 100644
+--- a/scripts/gdb/linux/constants.py.in
++++ b/scripts/gdb/linux/constants.py.in
+@@ -64,6 +64,9 @@ LX_GDBPARSED(IRQ_HIDDEN)
+
+ /* linux/module.h */
+ LX_GDBPARSED(MOD_TEXT)
++LX_GDBPARSED(MOD_DATA)
++LX_GDBPARSED(MOD_RODATA)
++LX_GDBPARSED(MOD_RO_AFTER_INIT)
+
+ /* linux/mount.h */
+ LX_VALUE(MNT_NOSUID)
+diff --git a/scripts/gdb/linux/modules.py b/scripts/gdb/linux/modules.py
+index 261f28640f4cd..f76a43bfa15fc 100644
+--- a/scripts/gdb/linux/modules.py
++++ b/scripts/gdb/linux/modules.py
+@@ -73,11 +73,17 @@ class LxLsmod(gdb.Command):
+ " " if utils.get_long_type().sizeof == 8 else ""))
+
+ for module in module_list():
+- layout = module['mem'][constants.LX_MOD_TEXT]
++ text = module['mem'][constants.LX_MOD_TEXT]
++ text_addr = str(text['base']).split()[0]
++ total_size = 0
++
++ for i in range(constants.LX_MOD_TEXT, constants.LX_MOD_RO_AFTER_INIT + 1):
++ total_size += module['mem'][i]['size']
++
+ gdb.write("{address} {name:<19} {size:>8} {ref}".format(
+- address=str(layout['base']).split()[0],
++ address=text_addr,
+ name=module['name'].string(),
+- size=str(layout['size']),
++ size=str(total_size),
+ ref=str(module['refcnt']['counter'] - 1)))
+
+ t = self._module_use_type.get_type().pointer()
+diff --git a/scripts/rust_is_available.sh b/scripts/rust_is_available.sh
+index aebbf19139709..7a925d2b20fc7 100755
+--- a/scripts/rust_is_available.sh
++++ b/scripts/rust_is_available.sh
+@@ -2,8 +2,6 @@
+ # SPDX-License-Identifier: GPL-2.0
+ #
+ # Tests whether a suitable Rust toolchain is available.
+-#
+-# Pass `-v` for human output and more checks (as warnings).
+
+ set -e
+
+@@ -23,21 +21,17 @@ get_canonical_version()
+
+ # Check that the Rust compiler exists.
+ if ! command -v "$RUSTC" >/dev/null; then
+- if [ "$1" = -v ]; then
+- echo >&2 "***"
+- echo >&2 "*** Rust compiler '$RUSTC' could not be found."
+- echo >&2 "***"
+- fi
++ echo >&2 "***"
++ echo >&2 "*** Rust compiler '$RUSTC' could not be found."
++ echo >&2 "***"
+ exit 1
+ fi
+
+ # Check that the Rust bindings generator exists.
+ if ! command -v "$BINDGEN" >/dev/null; then
+- if [ "$1" = -v ]; then
+- echo >&2 "***"
+- echo >&2 "*** Rust bindings generator '$BINDGEN' could not be found."
+- echo >&2 "***"
+- fi
++ echo >&2 "***"
++ echo >&2 "*** Rust bindings generator '$BINDGEN' could not be found."
++ echo >&2 "***"
+ exit 1
+ fi
+
+@@ -53,16 +47,14 @@ rust_compiler_min_version=$($min_tool_version rustc)
+ rust_compiler_cversion=$(get_canonical_version $rust_compiler_version)
+ rust_compiler_min_cversion=$(get_canonical_version $rust_compiler_min_version)
+ if [ "$rust_compiler_cversion" -lt "$rust_compiler_min_cversion" ]; then
+- if [ "$1" = -v ]; then
+- echo >&2 "***"
+- echo >&2 "*** Rust compiler '$RUSTC' is too old."
+- echo >&2 "*** Your version: $rust_compiler_version"
+- echo >&2 "*** Minimum version: $rust_compiler_min_version"
+- echo >&2 "***"
+- fi
++ echo >&2 "***"
++ echo >&2 "*** Rust compiler '$RUSTC' is too old."
++ echo >&2 "*** Your version: $rust_compiler_version"
++ echo >&2 "*** Minimum version: $rust_compiler_min_version"
++ echo >&2 "***"
+ exit 1
+ fi
+-if [ "$1" = -v ] && [ "$rust_compiler_cversion" -gt "$rust_compiler_min_cversion" ]; then
++if [ "$rust_compiler_cversion" -gt "$rust_compiler_min_cversion" ]; then
+ echo >&2 "***"
+ echo >&2 "*** Rust compiler '$RUSTC' is too new. This may or may not work."
+ echo >&2 "*** Your version: $rust_compiler_version"
+@@ -82,16 +74,14 @@ rust_bindings_generator_min_version=$($min_tool_version bindgen)
+ rust_bindings_generator_cversion=$(get_canonical_version $rust_bindings_generator_version)
+ rust_bindings_generator_min_cversion=$(get_canonical_version $rust_bindings_generator_min_version)
+ if [ "$rust_bindings_generator_cversion" -lt "$rust_bindings_generator_min_cversion" ]; then
+- if [ "$1" = -v ]; then
+- echo >&2 "***"
+- echo >&2 "*** Rust bindings generator '$BINDGEN' is too old."
+- echo >&2 "*** Your version: $rust_bindings_generator_version"
+- echo >&2 "*** Minimum version: $rust_bindings_generator_min_version"
+- echo >&2 "***"
+- fi
++ echo >&2 "***"
++ echo >&2 "*** Rust bindings generator '$BINDGEN' is too old."
++ echo >&2 "*** Your version: $rust_bindings_generator_version"
++ echo >&2 "*** Minimum version: $rust_bindings_generator_min_version"
++ echo >&2 "***"
+ exit 1
+ fi
+-if [ "$1" = -v ] && [ "$rust_bindings_generator_cversion" -gt "$rust_bindings_generator_min_cversion" ]; then
++if [ "$rust_bindings_generator_cversion" -gt "$rust_bindings_generator_min_cversion" ]; then
+ echo >&2 "***"
+ echo >&2 "*** Rust bindings generator '$BINDGEN' is too new. This may or may not work."
+ echo >&2 "*** Your version: $rust_bindings_generator_version"
+@@ -100,23 +90,39 @@ if [ "$1" = -v ] && [ "$rust_bindings_generator_cversion" -gt "$rust_bindings_ge
+ fi
+
+ # Check that the `libclang` used by the Rust bindings generator is suitable.
++#
++# In order to do that, first invoke `bindgen` to get the `libclang` version
++# found by `bindgen`. This step may already fail if, for instance, `libclang`
++# is not found; in that case, inform the user.
++bindgen_libclang_output=$( \
++ LC_ALL=C "$BINDGEN" $(dirname $0)/rust_is_available_bindgen_libclang.h 2>&1 >/dev/null
++) || bindgen_libclang_code=$?
++if [ -n "$bindgen_libclang_code" ]; then
++ echo >&2 "***"
++ echo >&2 "*** Running '$BINDGEN' to check the libclang version (used by the Rust"
++ echo >&2 "*** bindings generator) failed with code $bindgen_libclang_code. This may be caused by"
++ echo >&2 "*** a failure to locate libclang. See output and docs below for details:"
++ echo >&2 "***"
++ echo >&2 "$bindgen_libclang_output"
++ echo >&2 "***"
++ exit 1
++fi
++
++# `bindgen` returned successfully, thus use the output to check that the version
++# of the `libclang` found by the Rust bindings generator is suitable.
+ bindgen_libclang_version=$( \
+- LC_ALL=C "$BINDGEN" $(dirname $0)/rust_is_available_bindgen_libclang.h 2>&1 >/dev/null \
+- | grep -F 'clang version ' \
+- | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
+- | head -n 1 \
++ echo "$bindgen_libclang_output" \
++ | sed -nE 's:.*clang version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
+ )
+ bindgen_libclang_min_version=$($min_tool_version llvm)
+ bindgen_libclang_cversion=$(get_canonical_version $bindgen_libclang_version)
+ bindgen_libclang_min_cversion=$(get_canonical_version $bindgen_libclang_min_version)
+ if [ "$bindgen_libclang_cversion" -lt "$bindgen_libclang_min_cversion" ]; then
+- if [ "$1" = -v ]; then
+- echo >&2 "***"
+- echo >&2 "*** libclang (used by the Rust bindings generator '$BINDGEN') is too old."
+- echo >&2 "*** Your version: $bindgen_libclang_version"
+- echo >&2 "*** Minimum version: $bindgen_libclang_min_version"
+- echo >&2 "***"
+- fi
++ echo >&2 "***"
++ echo >&2 "*** libclang (used by the Rust bindings generator '$BINDGEN') is too old."
++ echo >&2 "*** Your version: $bindgen_libclang_version"
++ echo >&2 "*** Minimum version: $bindgen_libclang_min_version"
++ echo >&2 "***"
+ exit 1
+ fi
+
+@@ -125,21 +131,19 @@ fi
+ #
+ # In the future, we might be able to perform a full version check, see
+ # https://github.com/rust-lang/rust-bindgen/issues/2138.
+-if [ "$1" = -v ]; then
+- cc_name=$($(dirname $0)/cc-version.sh "$CC" | cut -f1 -d' ')
+- if [ "$cc_name" = Clang ]; then
+- clang_version=$( \
+- LC_ALL=C "$CC" --version 2>/dev/null \
+- | sed -nE '1s:.*version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
+- )
+- if [ "$clang_version" != "$bindgen_libclang_version" ]; then
+- echo >&2 "***"
+- echo >&2 "*** libclang (used by the Rust bindings generator '$BINDGEN')"
+- echo >&2 "*** version does not match Clang's. This may be a problem."
+- echo >&2 "*** libclang version: $bindgen_libclang_version"
+- echo >&2 "*** Clang version: $clang_version"
+- echo >&2 "***"
+- fi
++cc_name=$($(dirname $0)/cc-version.sh $CC | cut -f1 -d' ')
++if [ "$cc_name" = Clang ]; then
++ clang_version=$( \
++ LC_ALL=C $CC --version 2>/dev/null \
++ | sed -nE '1s:.*version ([0-9]+\.[0-9]+\.[0-9]+).*:\1:p'
++ )
++ if [ "$clang_version" != "$bindgen_libclang_version" ]; then
++ echo >&2 "***"
++ echo >&2 "*** libclang (used by the Rust bindings generator '$BINDGEN')"
++ echo >&2 "*** version does not match Clang's. This may be a problem."
++ echo >&2 "*** libclang version: $bindgen_libclang_version"
++ echo >&2 "*** Clang version: $clang_version"
++ echo >&2 "***"
+ fi
+ fi
+
+@@ -150,11 +154,9 @@ rustc_sysroot=$("$RUSTC" $KRUSTFLAGS --print sysroot)
+ rustc_src=${RUST_LIB_SRC:-"$rustc_sysroot/lib/rustlib/src/rust/library"}
+ rustc_src_core="$rustc_src/core/src/lib.rs"
+ if [ ! -e "$rustc_src_core" ]; then
+- if [ "$1" = -v ]; then
+- echo >&2 "***"
+- echo >&2 "*** Source code for the 'core' standard library could not be found"
+- echo >&2 "*** at '$rustc_src_core'."
+- echo >&2 "***"
+- fi
++ echo >&2 "***"
++ echo >&2 "*** Source code for the 'core' standard library could not be found"
++ echo >&2 "*** at '$rustc_src_core'."
++ echo >&2 "***"
+ exit 1
+ fi
+diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
+index 60a511c6b583e..c17660bf5f347 100644
+--- a/security/integrity/ima/Kconfig
++++ b/security/integrity/ima/Kconfig
+@@ -248,18 +248,6 @@ config IMA_APPRAISE_MODSIG
+ The modsig keyword can be used in the IMA policy to allow a hook
+ to accept such signatures.
+
+-config IMA_TRUSTED_KEYRING
+- bool "Require all keys on the .ima keyring be signed (deprecated)"
+- depends on IMA_APPRAISE && SYSTEM_TRUSTED_KEYRING
+- depends on INTEGRITY_ASYMMETRIC_KEYS
+- select INTEGRITY_TRUSTED_KEYRING
+- default y
+- help
+- This option requires that all keys added to the .ima
+- keyring be signed by a key on the system trusted keyring.
+-
+- This option is deprecated in favor of INTEGRITY_TRUSTED_KEYRING
+-
+ config IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY
+ bool "Permit keys validly signed by a built-in or secondary CA cert (EXPERIMENTAL)"
+ depends on SYSTEM_TRUSTED_KEYRING
+diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
+index d54f73c558f72..19be69fa4d052 100644
+--- a/security/keys/keyctl.c
++++ b/security/keys/keyctl.c
+@@ -980,14 +980,19 @@ long keyctl_chown_key(key_serial_t id, uid_t user, gid_t group)
+ ret = -EACCES;
+ down_write(&key->sem);
+
+- if (!capable(CAP_SYS_ADMIN)) {
++ {
++ bool is_privileged_op = false;
++
+ /* only the sysadmin can chown a key to some other UID */
+ if (user != (uid_t) -1 && !uid_eq(key->uid, uid))
+- goto error_put;
++ is_privileged_op = true;
+
+ /* only the sysadmin can set the key's GID to a group other
+ * than one of those that the current process subscribes to */
+ if (group != (gid_t) -1 && !gid_eq(gid, key->gid) && !in_group_p(gid))
++ is_privileged_op = true;
++
++ if (is_privileged_op && !capable(CAP_SYS_ADMIN))
+ goto error_put;
+ }
+
+@@ -1088,7 +1093,7 @@ long keyctl_setperm_key(key_serial_t id, key_perm_t perm)
+ down_write(&key->sem);
+
+ /* if we're not the sysadmin, we can only change a key that we own */
+- if (capable(CAP_SYS_ADMIN) || uid_eq(key->uid, current_fsuid())) {
++ if (uid_eq(key->uid, current_fsuid()) || capable(CAP_SYS_ADMIN)) {
+ key->perm = perm;
+ notify_key(key, NOTIFY_KEY_SETATTR, 0);
+ ret = 0;
+diff --git a/security/security.c b/security/security.c
+index d5ff7ff45b776..521f74e77dd15 100644
+--- a/security/security.c
++++ b/security/security.c
+@@ -1138,6 +1138,20 @@ void security_bprm_committed_creds(struct linux_binprm *bprm)
+ call_void_hook(bprm_committed_creds, bprm);
+ }
+
++/**
++ * security_fs_context_submount() - Initialise fc->security
++ * @fc: new filesystem context
++ * @reference: dentry reference for submount/remount
++ *
++ * Fill out the ->security field for a new fs_context.
++ *
++ * Return: Returns 0 on success or negative error code on failure.
++ */
++int security_fs_context_submount(struct fs_context *fc, struct super_block *reference)
++{
++ return call_int_hook(fs_context_submount, 0, fc, reference);
++}
++
+ /**
+ * security_fs_context_dup() - Duplicate a fs_context LSM blob
+ * @fc: destination filesystem context
+diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
+index 79b4890e9936d..674f43372f490 100644
+--- a/security/selinux/hooks.c
++++ b/security/selinux/hooks.c
+@@ -2721,6 +2721,27 @@ static int selinux_umount(struct vfsmount *mnt, int flags)
+ FILESYSTEM__UNMOUNT, NULL);
+ }
+
++static int selinux_fs_context_submount(struct fs_context *fc,
++ struct super_block *reference)
++{
++ const struct superblock_security_struct *sbsec;
++ struct selinux_mnt_opts *opts;
++
++ opts = kzalloc(sizeof(*opts), GFP_KERNEL);
++ if (!opts)
++ return -ENOMEM;
++
++ sbsec = selinux_superblock(reference);
++ if (sbsec->flags & FSCONTEXT_MNT)
++ opts->fscontext_sid = sbsec->sid;
++ if (sbsec->flags & CONTEXT_MNT)
++ opts->context_sid = sbsec->mntpoint_sid;
++ if (sbsec->flags & DEFCONTEXT_MNT)
++ opts->defcontext_sid = sbsec->def_sid;
++ fc->security = opts;
++ return 0;
++}
++
+ static int selinux_fs_context_dup(struct fs_context *fc,
+ struct fs_context *src_fc)
+ {
+@@ -7142,6 +7163,7 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = {
+ /*
+ * PUT "CLONING" (ACCESSING + ALLOCATING) HOOKS HERE
+ */
++ LSM_HOOK_INIT(fs_context_submount, selinux_fs_context_submount),
+ LSM_HOOK_INIT(fs_context_dup, selinux_fs_context_dup),
+ LSM_HOOK_INIT(fs_context_parse_param, selinux_fs_context_parse_param),
+ LSM_HOOK_INIT(sb_eat_lsm_opts, selinux_sb_eat_lsm_opts),
+diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
+index 7a3e9ab137d85..6bdc01600aa74 100644
+--- a/security/smack/smack_lsm.c
++++ b/security/smack/smack_lsm.c
+@@ -614,6 +614,56 @@ out_opt_err:
+ return -EINVAL;
+ }
+
++/**
++ * smack_fs_context_submount - Initialise security data for a filesystem context
++ * @fc: The filesystem context.
++ * @reference: reference superblock
++ *
++ * Returns 0 on success or -ENOMEM on error.
++ */
++static int smack_fs_context_submount(struct fs_context *fc,
++ struct super_block *reference)
++{
++ struct superblock_smack *sbsp;
++ struct smack_mnt_opts *ctx;
++ struct inode_smack *isp;
++
++ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
++ if (!ctx)
++ return -ENOMEM;
++ fc->security = ctx;
++
++ sbsp = smack_superblock(reference);
++ isp = smack_inode(reference->s_root->d_inode);
++
++ if (sbsp->smk_default) {
++ ctx->fsdefault = kstrdup(sbsp->smk_default->smk_known, GFP_KERNEL);
++ if (!ctx->fsdefault)
++ return -ENOMEM;
++ }
++
++ if (sbsp->smk_floor) {
++ ctx->fsfloor = kstrdup(sbsp->smk_floor->smk_known, GFP_KERNEL);
++ if (!ctx->fsfloor)
++ return -ENOMEM;
++ }
++
++ if (sbsp->smk_hat) {
++ ctx->fshat = kstrdup(sbsp->smk_hat->smk_known, GFP_KERNEL);
++ if (!ctx->fshat)
++ return -ENOMEM;
++ }
++
++ if (isp->smk_flags & SMK_INODE_TRANSMUTE) {
++ if (sbsp->smk_root) {
++ ctx->fstransmute = kstrdup(sbsp->smk_root->smk_known, GFP_KERNEL);
++ if (!ctx->fstransmute)
++ return -ENOMEM;
++ }
++ }
++ return 0;
++}
++
+ /**
+ * smack_fs_context_dup - Duplicate the security data on fs_context duplication
+ * @fc: The new filesystem context.
+@@ -4845,6 +4895,7 @@ static struct security_hook_list smack_hooks[] __ro_after_init = {
+ LSM_HOOK_INIT(ptrace_traceme, smack_ptrace_traceme),
+ LSM_HOOK_INIT(syslog, smack_syslog),
+
++ LSM_HOOK_INIT(fs_context_submount, smack_fs_context_submount),
+ LSM_HOOK_INIT(fs_context_dup, smack_fs_context_dup),
+ LSM_HOOK_INIT(fs_context_parse_param, smack_fs_context_parse_param),
+
+diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
+index 5590eaad241bb..25f67d1b5c73e 100644
+--- a/security/smack/smackfs.c
++++ b/security/smack/smackfs.c
+@@ -896,7 +896,7 @@ static ssize_t smk_set_cipso(struct file *file, const char __user *buf,
+ }
+
+ ret = sscanf(rule, "%d", &catlen);
+- if (ret != 1 || catlen > SMACK_CIPSO_MAXCATNUM)
++ if (ret != 1 || catlen < 0 || catlen > SMACK_CIPSO_MAXCATNUM)
+ goto out;
+
+ if (format == SMK_FIXED24_FMT &&
+diff --git a/sound/Kconfig b/sound/Kconfig
+index 0ddfb717b81dc..466e848689bd1 100644
+--- a/sound/Kconfig
++++ b/sound/Kconfig
+@@ -1,7 +1,7 @@
+ # SPDX-License-Identifier: GPL-2.0-only
+ menuconfig SOUND
+ tristate "Sound card support"
+- depends on HAS_IOMEM
++ depends on HAS_IOMEM || UML
+ help
+ If you have a sound card in your computer, i.e. if it can say more
+ than an occasional beep, say Y.
+diff --git a/sound/core/pcm_compat.c b/sound/core/pcm_compat.c
+index 42c2ada8e8887..c96483091f30a 100644
+--- a/sound/core/pcm_compat.c
++++ b/sound/core/pcm_compat.c
+@@ -253,10 +253,14 @@ static int snd_pcm_ioctl_hw_params_compat(struct snd_pcm_substream *substream,
+ goto error;
+ }
+
+- if (refine)
++ if (refine) {
+ err = snd_pcm_hw_refine(substream, data);
+- else
++ if (err < 0)
++ goto error;
++ err = fixup_unreferenced_params(substream, data);
++ } else {
+ err = snd_pcm_hw_params(substream, data);
++ }
+ if (err < 0)
+ goto error;
+ if (copy_to_user(data32, data, sizeof(*data32)) ||
+diff --git a/sound/pci/ac97/ac97_codec.c b/sound/pci/ac97/ac97_codec.c
+index 80a65b8ad7b9b..25f93e56cfc7a 100644
+--- a/sound/pci/ac97/ac97_codec.c
++++ b/sound/pci/ac97/ac97_codec.c
+@@ -2069,10 +2069,9 @@ int snd_ac97_mixer(struct snd_ac97_bus *bus, struct snd_ac97_template *template,
+ .dev_disconnect = snd_ac97_dev_disconnect,
+ };
+
+- if (!rac97)
+- return -EINVAL;
+- if (snd_BUG_ON(!bus || !template))
++ if (snd_BUG_ON(!bus || !template || !rac97))
+ return -EINVAL;
++ *rac97 = NULL;
+ if (snd_BUG_ON(template->num >= 4))
+ return -EINVAL;
+ if (bus->codec[template->num])
+diff --git a/sound/pci/hda/patch_cs8409-tables.c b/sound/pci/hda/patch_cs8409-tables.c
+index b288874e401e5..36b411d1a9609 100644
+--- a/sound/pci/hda/patch_cs8409-tables.c
++++ b/sound/pci/hda/patch_cs8409-tables.c
+@@ -550,6 +550,10 @@ const struct snd_pci_quirk cs8409_fixup_tbl[] = {
+ SND_PCI_QUIRK(0x1028, 0x0C50, "Dolphin", CS8409_DOLPHIN),
+ SND_PCI_QUIRK(0x1028, 0x0C51, "Dolphin", CS8409_DOLPHIN),
+ SND_PCI_QUIRK(0x1028, 0x0C52, "Dolphin", CS8409_DOLPHIN),
++ SND_PCI_QUIRK(0x1028, 0x0C73, "Dolphin", CS8409_DOLPHIN),
++ SND_PCI_QUIRK(0x1028, 0x0C75, "Dolphin", CS8409_DOLPHIN),
++ SND_PCI_QUIRK(0x1028, 0x0C7D, "Dolphin", CS8409_DOLPHIN),
++ SND_PCI_QUIRK(0x1028, 0x0C7F, "Dolphin", CS8409_DOLPHIN),
+ {} /* terminator */
+ };
+
+diff --git a/sound/pci/hda/patch_cs8409.c b/sound/pci/hda/patch_cs8409.c
+index 0ba1fbcbb21e4..627899959ffe8 100644
+--- a/sound/pci/hda/patch_cs8409.c
++++ b/sound/pci/hda/patch_cs8409.c
+@@ -888,7 +888,7 @@ static void cs42l42_resume(struct sub_codec *cs42l42)
+
+ /* Initialize CS42L42 companion codec */
+ cs8409_i2c_bulk_write(cs42l42, cs42l42->init_seq, cs42l42->init_seq_num);
+- usleep_range(30000, 35000);
++ msleep(CS42L42_INIT_TIMEOUT_MS);
+
+ /* Clear interrupts, by reading interrupt status registers */
+ cs8409_i2c_bulk_read(cs42l42, irq_regs, ARRAY_SIZE(irq_regs));
+diff --git a/sound/pci/hda/patch_cs8409.h b/sound/pci/hda/patch_cs8409.h
+index 2a8dfb4ff046b..937e9387abdc7 100644
+--- a/sound/pci/hda/patch_cs8409.h
++++ b/sound/pci/hda/patch_cs8409.h
+@@ -229,6 +229,7 @@ enum cs8409_coefficient_index_registers {
+ #define CS42L42_I2C_SLEEP_US (2000)
+ #define CS42L42_PDN_TIMEOUT_US (250000)
+ #define CS42L42_PDN_SLEEP_US (2000)
++#define CS42L42_INIT_TIMEOUT_MS (45)
+ #define CS42L42_FULL_SCALE_VOL_MASK (2)
+ #define CS42L42_FULL_SCALE_VOL_0DB (1)
+ #define CS42L42_FULL_SCALE_VOL_MINUS6DB (0)
+diff --git a/sound/soc/atmel/atmel-i2s.c b/sound/soc/atmel/atmel-i2s.c
+index 49930baf5e4d6..69a88dc651652 100644
+--- a/sound/soc/atmel/atmel-i2s.c
++++ b/sound/soc/atmel/atmel-i2s.c
+@@ -163,11 +163,14 @@ struct atmel_i2s_gck_param {
+
+ #define I2S_MCK_12M288 12288000UL
+ #define I2S_MCK_11M2896 11289600UL
++#define I2S_MCK_6M144 6144000UL
+
+ /* mck = (32 * (imckfs+1) / (imckdiv+1)) * fs */
+ static const struct atmel_i2s_gck_param gck_params[] = {
++ /* mck = 6.144Mhz */
++ { 8000, I2S_MCK_6M144, 1, 47}, /* mck = 768 fs */
++
+ /* mck = 12.288MHz */
+- { 8000, I2S_MCK_12M288, 0, 47}, /* mck = 1536 fs */
+ { 16000, I2S_MCK_12M288, 1, 47}, /* mck = 768 fs */
+ { 24000, I2S_MCK_12M288, 3, 63}, /* mck = 512 fs */
+ { 32000, I2S_MCK_12M288, 3, 47}, /* mck = 384 fs */
+diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig
+index 1b50b2d66beb2..3840dd6844974 100644
+--- a/sound/soc/codecs/Kconfig
++++ b/sound/soc/codecs/Kconfig
+@@ -1673,6 +1673,7 @@ config SND_SOC_STA529
+ config SND_SOC_STAC9766
+ tristate
+ depends on SND_SOC_AC97_BUS
++ select REGMAP_AC97
+
+ config SND_SOC_STI_SAS
+ tristate "codec Audio support for STI SAS codec"
+diff --git a/sound/soc/codecs/cs35l56-i2c.c b/sound/soc/codecs/cs35l56-i2c.c
+index 295caad262243..c613a2554fa31 100644
+--- a/sound/soc/codecs/cs35l56-i2c.c
++++ b/sound/soc/codecs/cs35l56-i2c.c
+@@ -62,10 +62,19 @@ static const struct i2c_device_id cs35l56_id_i2c[] = {
+ };
+ MODULE_DEVICE_TABLE(i2c, cs35l56_id_i2c);
+
++#ifdef CONFIG_ACPI
++static const struct acpi_device_id cs35l56_asoc_acpi_match[] = {
++ { "CSC355C", 0 },
++ {},
++};
++MODULE_DEVICE_TABLE(acpi, cs35l56_asoc_acpi_match);
++#endif
++
+ static struct i2c_driver cs35l56_i2c_driver = {
+ .driver = {
+ .name = "cs35l56",
+ .pm = &cs35l56_pm_ops_i2c_spi,
++ .acpi_match_table = ACPI_PTR(cs35l56_asoc_acpi_match),
+ },
+ .id_table = cs35l56_id_i2c,
+ .probe_new = cs35l56_i2c_probe,
+diff --git a/sound/soc/codecs/cs35l56-spi.c b/sound/soc/codecs/cs35l56-spi.c
+index 996aab10500ee..302f9c47407a4 100644
+--- a/sound/soc/codecs/cs35l56-spi.c
++++ b/sound/soc/codecs/cs35l56-spi.c
+@@ -59,10 +59,19 @@ static const struct spi_device_id cs35l56_id_spi[] = {
+ };
+ MODULE_DEVICE_TABLE(spi, cs35l56_id_spi);
+
++#ifdef CONFIG_ACPI
++static const struct acpi_device_id cs35l56_asoc_acpi_match[] = {
++ { "CSC355C", 0 },
++ {},
++};
++MODULE_DEVICE_TABLE(acpi, cs35l56_asoc_acpi_match);
++#endif
++
+ static struct spi_driver cs35l56_spi_driver = {
+ .driver = {
+ .name = "cs35l56",
+ .pm = &cs35l56_pm_ops_i2c_spi,
++ .acpi_match_table = ACPI_PTR(cs35l56_asoc_acpi_match),
+ },
+ .id_table = cs35l56_id_spi,
+ .probe = cs35l56_spi_probe,
+diff --git a/sound/soc/codecs/cs43130.h b/sound/soc/codecs/cs43130.h
+index 1dd8936743132..90e8895275e77 100644
+--- a/sound/soc/codecs/cs43130.h
++++ b/sound/soc/codecs/cs43130.h
+@@ -381,88 +381,88 @@ struct cs43130_clk_gen {
+
+ /* frm_size = 16 */
+ static const struct cs43130_clk_gen cs43130_16_clk_gen[] = {
+- { 22579200, 32000, .v = { 441, 10, }, },
+- { 22579200, 44100, .v = { 32, 1, }, },
+- { 22579200, 48000, .v = { 147, 5, }, },
+- { 22579200, 88200, .v = { 16, 1, }, },
+- { 22579200, 96000, .v = { 147, 10, }, },
+- { 22579200, 176400, .v = { 8, 1, }, },
+- { 22579200, 192000, .v = { 147, 20, }, },
+- { 22579200, 352800, .v = { 4, 1, }, },
+- { 22579200, 384000, .v = { 147, 40, }, },
+- { 24576000, 32000, .v = { 48, 1, }, },
+- { 24576000, 44100, .v = { 5120, 147, }, },
+- { 24576000, 48000, .v = { 32, 1, }, },
+- { 24576000, 88200, .v = { 2560, 147, }, },
+- { 24576000, 96000, .v = { 16, 1, }, },
+- { 24576000, 176400, .v = { 1280, 147, }, },
+- { 24576000, 192000, .v = { 8, 1, }, },
+- { 24576000, 352800, .v = { 640, 147, }, },
+- { 24576000, 384000, .v = { 4, 1, }, },
++ { 22579200, 32000, .v = { 10, 441, }, },
++ { 22579200, 44100, .v = { 1, 32, }, },
++ { 22579200, 48000, .v = { 5, 147, }, },
++ { 22579200, 88200, .v = { 1, 16, }, },
++ { 22579200, 96000, .v = { 10, 147, }, },
++ { 22579200, 176400, .v = { 1, 8, }, },
++ { 22579200, 192000, .v = { 20, 147, }, },
++ { 22579200, 352800, .v = { 1, 4, }, },
++ { 22579200, 384000, .v = { 40, 147, }, },
++ { 24576000, 32000, .v = { 1, 48, }, },
++ { 24576000, 44100, .v = { 147, 5120, }, },
++ { 24576000, 48000, .v = { 1, 32, }, },
++ { 24576000, 88200, .v = { 147, 2560, }, },
++ { 24576000, 96000, .v = { 1, 16, }, },
++ { 24576000, 176400, .v = { 147, 1280, }, },
++ { 24576000, 192000, .v = { 1, 8, }, },
++ { 24576000, 352800, .v = { 147, 640, }, },
++ { 24576000, 384000, .v = { 1, 4, }, },
+ };
+
+ /* frm_size = 32 */
+ static const struct cs43130_clk_gen cs43130_32_clk_gen[] = {
+- { 22579200, 32000, .v = { 441, 20, }, },
+- { 22579200, 44100, .v = { 16, 1, }, },
+- { 22579200, 48000, .v = { 147, 10, }, },
+- { 22579200, 88200, .v = { 8, 1, }, },
+- { 22579200, 96000, .v = { 147, 20, }, },
+- { 22579200, 176400, .v = { 4, 1, }, },
+- { 22579200, 192000, .v = { 147, 40, }, },
+- { 22579200, 352800, .v = { 2, 1, }, },
+- { 22579200, 384000, .v = { 147, 80, }, },
+- { 24576000, 32000, .v = { 24, 1, }, },
+- { 24576000, 44100, .v = { 2560, 147, }, },
+- { 24576000, 48000, .v = { 16, 1, }, },
+- { 24576000, 88200, .v = { 1280, 147, }, },
+- { 24576000, 96000, .v = { 8, 1, }, },
+- { 24576000, 176400, .v = { 640, 147, }, },
+- { 24576000, 192000, .v = { 4, 1, }, },
+- { 24576000, 352800, .v = { 320, 147, }, },
+- { 24576000, 384000, .v = { 2, 1, }, },
++ { 22579200, 32000, .v = { 20, 441, }, },
++ { 22579200, 44100, .v = { 1, 16, }, },
++ { 22579200, 48000, .v = { 10, 147, }, },
++ { 22579200, 88200, .v = { 1, 8, }, },
++ { 22579200, 96000, .v = { 20, 147, }, },
++ { 22579200, 176400, .v = { 1, 4, }, },
++ { 22579200, 192000, .v = { 40, 147, }, },
++ { 22579200, 352800, .v = { 1, 2, }, },
++ { 22579200, 384000, .v = { 80, 147, }, },
++ { 24576000, 32000, .v = { 1, 24, }, },
++ { 24576000, 44100, .v = { 147, 2560, }, },
++ { 24576000, 48000, .v = { 1, 16, }, },
++ { 24576000, 88200, .v = { 147, 1280, }, },
++ { 24576000, 96000, .v = { 1, 8, }, },
++ { 24576000, 176400, .v = { 147, 640, }, },
++ { 24576000, 192000, .v = { 1, 4, }, },
++ { 24576000, 352800, .v = { 147, 320, }, },
++ { 24576000, 384000, .v = { 1, 2, }, },
+ };
+
+ /* frm_size = 48 */
+ static const struct cs43130_clk_gen cs43130_48_clk_gen[] = {
+- { 22579200, 32000, .v = { 147, 100, }, },
+- { 22579200, 44100, .v = { 32, 3, }, },
+- { 22579200, 48000, .v = { 49, 5, }, },
+- { 22579200, 88200, .v = { 16, 3, }, },
+- { 22579200, 96000, .v = { 49, 10, }, },
+- { 22579200, 176400, .v = { 8, 3, }, },
+- { 22579200, 192000, .v = { 49, 20, }, },
+- { 22579200, 352800, .v = { 4, 3, }, },
+- { 22579200, 384000, .v = { 49, 40, }, },
+- { 24576000, 32000, .v = { 16, 1, }, },
+- { 24576000, 44100, .v = { 5120, 441, }, },
+- { 24576000, 48000, .v = { 32, 3, }, },
+- { 24576000, 88200, .v = { 2560, 441, }, },
+- { 24576000, 96000, .v = { 16, 3, }, },
+- { 24576000, 176400, .v = { 1280, 441, }, },
+- { 24576000, 192000, .v = { 8, 3, }, },
+- { 24576000, 352800, .v = { 640, 441, }, },
+- { 24576000, 384000, .v = { 4, 3, }, },
++ { 22579200, 32000, .v = { 100, 147, }, },
++ { 22579200, 44100, .v = { 3, 32, }, },
++ { 22579200, 48000, .v = { 5, 49, }, },
++ { 22579200, 88200, .v = { 3, 16, }, },
++ { 22579200, 96000, .v = { 10, 49, }, },
++ { 22579200, 176400, .v = { 3, 8, }, },
++ { 22579200, 192000, .v = { 20, 49, }, },
++ { 22579200, 352800, .v = { 3, 4, }, },
++ { 22579200, 384000, .v = { 40, 49, }, },
++ { 24576000, 32000, .v = { 1, 16, }, },
++ { 24576000, 44100, .v = { 441, 5120, }, },
++ { 24576000, 48000, .v = { 3, 32, }, },
++ { 24576000, 88200, .v = { 441, 2560, }, },
++ { 24576000, 96000, .v = { 3, 16, }, },
++ { 24576000, 176400, .v = { 441, 1280, }, },
++ { 24576000, 192000, .v = { 3, 8, }, },
++ { 24576000, 352800, .v = { 441, 640, }, },
++ { 24576000, 384000, .v = { 3, 4, }, },
+ };
+
+ /* frm_size = 64 */
+ static const struct cs43130_clk_gen cs43130_64_clk_gen[] = {
+- { 22579200, 32000, .v = { 441, 40, }, },
+- { 22579200, 44100, .v = { 8, 1, }, },
+- { 22579200, 48000, .v = { 147, 20, }, },
+- { 22579200, 88200, .v = { 4, 1, }, },
+- { 22579200, 96000, .v = { 147, 40, }, },
+- { 22579200, 176400, .v = { 2, 1, }, },
+- { 22579200, 192000, .v = { 147, 80, }, },
++ { 22579200, 32000, .v = { 40, 441, }, },
++ { 22579200, 44100, .v = { 1, 8, }, },
++ { 22579200, 48000, .v = { 20, 147, }, },
++ { 22579200, 88200, .v = { 1, 4, }, },
++ { 22579200, 96000, .v = { 40, 147, }, },
++ { 22579200, 176400, .v = { 1, 2, }, },
++ { 22579200, 192000, .v = { 80, 147, }, },
+ { 22579200, 352800, .v = { 1, 1, }, },
+- { 24576000, 32000, .v = { 12, 1, }, },
+- { 24576000, 44100, .v = { 1280, 147, }, },
+- { 24576000, 48000, .v = { 8, 1, }, },
+- { 24576000, 88200, .v = { 640, 147, }, },
+- { 24576000, 96000, .v = { 4, 1, }, },
+- { 24576000, 176400, .v = { 320, 147, }, },
+- { 24576000, 192000, .v = { 2, 1, }, },
+- { 24576000, 352800, .v = { 160, 147, }, },
++ { 24576000, 32000, .v = { 1, 12, }, },
++ { 24576000, 44100, .v = { 147, 1280, }, },
++ { 24576000, 48000, .v = { 1, 8, }, },
++ { 24576000, 88200, .v = { 147, 640, }, },
++ { 24576000, 96000, .v = { 1, 4, }, },
++ { 24576000, 176400, .v = { 147, 320, }, },
++ { 24576000, 192000, .v = { 1, 2, }, },
++ { 24576000, 352800, .v = { 147, 160, }, },
+ { 24576000, 384000, .v = { 1, 1, }, },
+ };
+
+diff --git a/sound/soc/codecs/da7219-aad.c b/sound/soc/codecs/da7219-aad.c
+index 993a0d00bc48d..70c175744772c 100644
+--- a/sound/soc/codecs/da7219-aad.c
++++ b/sound/soc/codecs/da7219-aad.c
+@@ -361,11 +361,15 @@ static irqreturn_t da7219_aad_irq_thread(int irq, void *data)
+ struct da7219_priv *da7219 = snd_soc_component_get_drvdata(component);
+ u8 events[DA7219_AAD_IRQ_REG_MAX];
+ u8 statusa;
+- int i, report = 0, mask = 0;
++ int i, ret, report = 0, mask = 0;
+
+ /* Read current IRQ events */
+- regmap_bulk_read(da7219->regmap, DA7219_ACCDET_IRQ_EVENT_A,
+- events, DA7219_AAD_IRQ_REG_MAX);
++ ret = regmap_bulk_read(da7219->regmap, DA7219_ACCDET_IRQ_EVENT_A,
++ events, DA7219_AAD_IRQ_REG_MAX);
++ if (ret) {
++ dev_warn_ratelimited(component->dev, "Failed to read IRQ events: %d\n", ret);
++ return IRQ_NONE;
++ }
+
+ if (!events[DA7219_AAD_IRQ_REG_A] && !events[DA7219_AAD_IRQ_REG_B])
+ return IRQ_NONE;
+@@ -910,6 +914,8 @@ void da7219_aad_suspend(struct snd_soc_component *component)
+ }
+ }
+ }
++
++ synchronize_irq(da7219_aad->irq);
+ }
+
+ void da7219_aad_resume(struct snd_soc_component *component)
+diff --git a/sound/soc/codecs/es8316.c b/sound/soc/codecs/es8316.c
+index ccecfdf700649..9f5522dee501d 100644
+--- a/sound/soc/codecs/es8316.c
++++ b/sound/soc/codecs/es8316.c
+@@ -153,7 +153,7 @@ static const char * const es8316_dmic_txt[] = {
+ "dmic data at high level",
+ "dmic data at low level",
+ };
+-static const unsigned int es8316_dmic_values[] = { 0, 1, 2 };
++static const unsigned int es8316_dmic_values[] = { 0, 2, 3 };
+ static const struct soc_enum es8316_dmic_src_enum =
+ SOC_VALUE_ENUM_SINGLE(ES8316_ADC_DMIC, 0, 3,
+ ARRAY_SIZE(es8316_dmic_txt),
+diff --git a/sound/soc/codecs/nau8821.c b/sound/soc/codecs/nau8821.c
+index fee970427a243..42de7588fdb68 100644
+--- a/sound/soc/codecs/nau8821.c
++++ b/sound/soc/codecs/nau8821.c
+@@ -10,6 +10,7 @@
+ #include <linux/acpi.h>
+ #include <linux/clk.h>
+ #include <linux/delay.h>
++#include <linux/dmi.h>
+ #include <linux/init.h>
+ #include <linux/i2c.h>
+ #include <linux/module.h>
+@@ -25,6 +26,13 @@
+ #include <sound/tlv.h>
+ #include "nau8821.h"
+
++#define NAU8821_JD_ACTIVE_HIGH BIT(0)
++
++static int nau8821_quirk;
++static int quirk_override = -1;
++module_param_named(quirk, quirk_override, uint, 0444);
++MODULE_PARM_DESC(quirk, "Board-specific quirk override");
++
+ #define NAU_FREF_MAX 13500000
+ #define NAU_FVCO_MAX 100000000
+ #define NAU_FVCO_MIN 90000000
+@@ -1792,6 +1800,33 @@ static int nau8821_setup_irq(struct nau8821 *nau8821)
+ return 0;
+ }
+
++/* Please keep this list alphabetically sorted */
++static const struct dmi_system_id nau8821_quirk_table[] = {
++ {
++ /* Positivo CW14Q01P-V2 */
++ .matches = {
++ DMI_MATCH(DMI_SYS_VENDOR, "Positivo Tecnologia SA"),
++ DMI_MATCH(DMI_BOARD_NAME, "CW14Q01P-V2"),
++ },
++ .driver_data = (void *)(NAU8821_JD_ACTIVE_HIGH),
++ },
++ {}
++};
++
++static void nau8821_check_quirks(void)
++{
++ const struct dmi_system_id *dmi_id;
++
++ if (quirk_override != -1) {
++ nau8821_quirk = quirk_override;
++ return;
++ }
++
++ dmi_id = dmi_first_match(nau8821_quirk_table);
++ if (dmi_id)
++ nau8821_quirk = (unsigned long)dmi_id->driver_data;
++}
++
+ static int nau8821_i2c_probe(struct i2c_client *i2c)
+ {
+ struct device *dev = &i2c->dev;
+@@ -1812,6 +1847,12 @@ static int nau8821_i2c_probe(struct i2c_client *i2c)
+
+ nau8821->dev = dev;
+ nau8821->irq = i2c->irq;
++
++ nau8821_check_quirks();
++
++ if (nau8821_quirk & NAU8821_JD_ACTIVE_HIGH)
++ nau8821->jkdet_polarity = 0;
++
+ nau8821_print_device_properties(nau8821);
+
+ nau8821_reset_chip(nau8821->regmap);
+diff --git a/sound/soc/codecs/rt1308-sdw.c b/sound/soc/codecs/rt1308-sdw.c
+index 1797af824f60b..e2699c0b117be 100644
+--- a/sound/soc/codecs/rt1308-sdw.c
++++ b/sound/soc/codecs/rt1308-sdw.c
+@@ -52,6 +52,7 @@ static bool rt1308_volatile_register(struct device *dev, unsigned int reg)
+ case 0x300a:
+ case 0xc000:
+ case 0xc710:
++ case 0xcf01:
+ case 0xc860 ... 0xc863:
+ case 0xc870 ... 0xc873:
+ return true;
+@@ -213,7 +214,7 @@ static int rt1308_io_init(struct device *dev, struct sdw_slave *slave)
+ {
+ struct rt1308_sdw_priv *rt1308 = dev_get_drvdata(dev);
+ int ret = 0;
+- unsigned int tmp;
++ unsigned int tmp, hibernation_flag;
+
+ if (rt1308->hw_init)
+ return 0;
+@@ -242,6 +243,10 @@ static int rt1308_io_init(struct device *dev, struct sdw_slave *slave)
+
+ pm_runtime_get_noresume(&slave->dev);
+
++ regmap_read(rt1308->regmap, 0xcf01, &hibernation_flag);
++ if ((hibernation_flag != 0x00) && rt1308->first_hw_init)
++ goto _preset_ready_;
++
+ /* sw reset */
+ regmap_write(rt1308->regmap, RT1308_SDW_RESET, 0);
+
+@@ -282,6 +287,12 @@ static int rt1308_io_init(struct device *dev, struct sdw_slave *slave)
+ regmap_write(rt1308->regmap, 0xc100, 0xd7);
+ regmap_write(rt1308->regmap, 0xc101, 0xd7);
+
++ /* apply BQ params */
++ rt1308_apply_bq_params(rt1308);
++
++ regmap_write(rt1308->regmap, 0xcf01, 0x01);
++
++_preset_ready_:
+ if (rt1308->first_hw_init) {
+ regcache_cache_bypass(rt1308->regmap, false);
+ regcache_mark_dirty(rt1308->regmap);
+diff --git a/sound/soc/codecs/rt5682-sdw.c b/sound/soc/codecs/rt5682-sdw.c
+index 23f17f70d7e9b..9622aaf1b3e63 100644
+--- a/sound/soc/codecs/rt5682-sdw.c
++++ b/sound/soc/codecs/rt5682-sdw.c
+@@ -753,8 +753,15 @@ static int __maybe_unused rt5682_dev_resume(struct device *dev)
+ if (!rt5682->first_hw_init)
+ return 0;
+
+- if (!slave->unattach_request)
++ if (!slave->unattach_request) {
++ if (rt5682->disable_irq == true) {
++ mutex_lock(&rt5682->disable_irq_lock);
++ sdw_write_no_pm(slave, SDW_SCP_INTMASK1, SDW_SCP_INT1_IMPL_DEF);
++ rt5682->disable_irq = false;
++ mutex_unlock(&rt5682->disable_irq_lock);
++ }
+ goto regmap_sync;
++ }
+
+ time = wait_for_completion_timeout(&slave->initialization_complete,
+ msecs_to_jiffies(RT5682_PROBE_TIMEOUT));
+diff --git a/sound/soc/codecs/rt711-sdca-sdw.c b/sound/soc/codecs/rt711-sdca-sdw.c
+index 51f3335343e08..76ed61e47316d 100644
+--- a/sound/soc/codecs/rt711-sdca-sdw.c
++++ b/sound/soc/codecs/rt711-sdca-sdw.c
+@@ -441,8 +441,16 @@ static int __maybe_unused rt711_sdca_dev_resume(struct device *dev)
+ if (!rt711->first_hw_init)
+ return 0;
+
+- if (!slave->unattach_request)
++ if (!slave->unattach_request) {
++ if (rt711->disable_irq == true) {
++ mutex_lock(&rt711->disable_irq_lock);
++ sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK1, SDW_SCP_SDCA_INTMASK_SDCA_0);
++ sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK2, SDW_SCP_SDCA_INTMASK_SDCA_8);
++ rt711->disable_irq = false;
++ mutex_unlock(&rt711->disable_irq_lock);
++ }
+ goto regmap_sync;
++ }
+
+ time = wait_for_completion_timeout(&slave->initialization_complete,
+ msecs_to_jiffies(RT711_PROBE_TIMEOUT));
+diff --git a/sound/soc/codecs/rt711-sdw.c b/sound/soc/codecs/rt711-sdw.c
+index 4fe68bcf2a7c2..9545b8a7eb192 100644
+--- a/sound/soc/codecs/rt711-sdw.c
++++ b/sound/soc/codecs/rt711-sdw.c
+@@ -541,8 +541,15 @@ static int __maybe_unused rt711_dev_resume(struct device *dev)
+ if (!rt711->first_hw_init)
+ return 0;
+
+- if (!slave->unattach_request)
++ if (!slave->unattach_request) {
++ if (rt711->disable_irq == true) {
++ mutex_lock(&rt711->disable_irq_lock);
++ sdw_write_no_pm(slave, SDW_SCP_INTMASK1, SDW_SCP_INT1_IMPL_DEF);
++ rt711->disable_irq = false;
++ mutex_unlock(&rt711->disable_irq_lock);
++ }
+ goto regmap_sync;
++ }
+
+ time = wait_for_completion_timeout(&slave->initialization_complete,
+ msecs_to_jiffies(RT711_PROBE_TIMEOUT));
+diff --git a/sound/soc/codecs/rt712-sdca-sdw.c b/sound/soc/codecs/rt712-sdca-sdw.c
+index 3f319459dfec3..1c9e10fea3ddd 100644
+--- a/sound/soc/codecs/rt712-sdca-sdw.c
++++ b/sound/soc/codecs/rt712-sdca-sdw.c
+@@ -441,8 +441,16 @@ static int __maybe_unused rt712_sdca_dev_resume(struct device *dev)
+ if (!rt712->first_hw_init)
+ return 0;
+
+- if (!slave->unattach_request)
++ if (!slave->unattach_request) {
++ if (rt712->disable_irq == true) {
++ mutex_lock(&rt712->disable_irq_lock);
++ sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK1, SDW_SCP_SDCA_INTMASK_SDCA_0);
++ sdw_write_no_pm(slave, SDW_SCP_SDCA_INTMASK2, SDW_SCP_SDCA_INTMASK_SDCA_8);
++ rt712->disable_irq = false;
++ mutex_unlock(&rt712->disable_irq_lock);
++ }
+ goto regmap_sync;
++ }
+
+ time = wait_for_completion_timeout(&slave->initialization_complete,
+ msecs_to_jiffies(RT712_PROBE_TIMEOUT));
+diff --git a/sound/soc/fsl/fsl_qmc_audio.c b/sound/soc/fsl/fsl_qmc_audio.c
+index 7cbb8e4758ccc..56d6b0b039a2e 100644
+--- a/sound/soc/fsl/fsl_qmc_audio.c
++++ b/sound/soc/fsl/fsl_qmc_audio.c
+@@ -372,8 +372,8 @@ static int qmc_dai_hw_rule_format_by_channels(struct qmc_dai *qmc_dai,
+ struct snd_mask *f_old = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT);
+ unsigned int channels = params_channels(params);
+ unsigned int slot_width;
++ snd_pcm_format_t format;
+ struct snd_mask f_new;
+- unsigned int i;
+
+ if (!channels || channels > nb_ts) {
+ dev_err(qmc_dai->dev, "channels %u not supported\n",
+@@ -384,10 +384,10 @@ static int qmc_dai_hw_rule_format_by_channels(struct qmc_dai *qmc_dai,
+ slot_width = (nb_ts / channels) * 8;
+
+ snd_mask_none(&f_new);
+- for (i = 0; i <= SNDRV_PCM_FORMAT_LAST; i++) {
+- if (snd_mask_test(f_old, i)) {
+- if (snd_pcm_format_physical_width(i) <= slot_width)
+- snd_mask_set(&f_new, i);
++ pcm_for_each_format(format) {
++ if (snd_mask_test_format(f_old, format)) {
++ if (snd_pcm_format_physical_width(format) <= slot_width)
++ snd_mask_set_format(&f_new, format);
+ }
+ }
+
+@@ -551,26 +551,26 @@ static const struct snd_soc_dai_ops qmc_dai_ops = {
+
+ static u64 qmc_audio_formats(u8 nb_ts)
+ {
+- u64 formats;
+- unsigned int chan_width;
+ unsigned int format_width;
+- int i;
++ unsigned int chan_width;
++ snd_pcm_format_t format;
++ u64 formats_mask;
+
+ if (!nb_ts)
+ return 0;
+
+- formats = 0;
++ formats_mask = 0;
+ chan_width = nb_ts * 8;
+- for (i = 0; i <= SNDRV_PCM_FORMAT_LAST; i++) {
++ pcm_for_each_format(format) {
+ /*
+ * Support format other than little-endian (ie big-endian or
+ * without endianness such as 8bit formats)
+ */
+- if (snd_pcm_format_little_endian(i) == 1)
++ if (snd_pcm_format_little_endian(format) == 1)
+ continue;
+
+ /* Support physical width multiple of 8bit */
+- format_width = snd_pcm_format_physical_width(i);
++ format_width = snd_pcm_format_physical_width(format);
+ if (format_width == 0 || format_width % 8)
+ continue;
+
+@@ -581,9 +581,9 @@ static u64 qmc_audio_formats(u8 nb_ts)
+ if (format_width > chan_width || chan_width % format_width)
+ continue;
+
+- formats |= (1ULL << i);
++ formats_mask |= pcm_format_to_bits(format);
+ }
+- return formats;
++ return formats_mask;
+ }
+
+ static int qmc_audio_dai_parse(struct qmc_audio *qmc_audio, struct device_node *np,
+diff --git a/sound/soc/soc-compress.c b/sound/soc/soc-compress.c
+index d8715db5e415e..2117fd61cf8f3 100644
+--- a/sound/soc/soc-compress.c
++++ b/sound/soc/soc-compress.c
+@@ -194,6 +194,7 @@ open_err:
+ snd_soc_dai_compr_shutdown(cpu_dai, cstream, 1);
+ out:
+ dpcm_path_put(&list);
++ snd_soc_dpcm_mutex_unlock(fe);
+ be_err:
+ fe->dpcm[stream].runtime_update = SND_SOC_DPCM_UPDATE_NO;
+ snd_soc_card_mutex_unlock(fe->card);
+diff --git a/sound/soc/sof/amd/acp.c b/sound/soc/sof/amd/acp.c
+index 2ae76bcd3590c..973bd81059852 100644
+--- a/sound/soc/sof/amd/acp.c
++++ b/sound/soc/sof/amd/acp.c
+@@ -351,9 +351,9 @@ static irqreturn_t acp_irq_handler(int irq, void *dev_id)
+ unsigned int val;
+
+ val = snd_sof_dsp_read(sdev, ACP_DSP_BAR, base + DSP_SW_INTR_STAT_OFFSET);
+- if (val) {
+- val |= ACP_DSP_TO_HOST_IRQ;
+- snd_sof_dsp_write(sdev, ACP_DSP_BAR, base + DSP_SW_INTR_STAT_OFFSET, val);
++ if (val & ACP_DSP_TO_HOST_IRQ) {
++ snd_sof_dsp_write(sdev, ACP_DSP_BAR, base + DSP_SW_INTR_STAT_OFFSET,
++ ACP_DSP_TO_HOST_IRQ);
+ return IRQ_WAKE_THREAD;
+ }
+
+diff --git a/sound/soc/sof/intel/hda-mlink.c b/sound/soc/sof/intel/hda-mlink.c
+index b7cbf66badf5b..df87b3791c23e 100644
+--- a/sound/soc/sof/intel/hda-mlink.c
++++ b/sound/soc/sof/intel/hda-mlink.c
+@@ -331,14 +331,14 @@ static bool hdaml_link_check_cmdsync(u32 __iomem *lsync, u32 cmdsync_mask)
+ return !!(val & cmdsync_mask);
+ }
+
+-static void hdaml_link_set_lsdiid(u32 __iomem *lsdiid, int dev_num)
++static void hdaml_link_set_lsdiid(u16 __iomem *lsdiid, int dev_num)
+ {
+- u32 val;
++ u16 val;
+
+- val = readl(lsdiid);
++ val = readw(lsdiid);
+ val |= BIT(dev_num);
+
+- writel(val, lsdiid);
++ writew(val, lsdiid);
+ }
+
+ static void hdaml_shim_map_stream_ch(u16 __iomem *pcmsycm, int lchan, int hchan,
+@@ -781,6 +781,8 @@ int hdac_bus_eml_sdw_map_stream_ch(struct hdac_bus *bus, int sublink, int y,
+ {
+ struct hdac_ext2_link *h2link;
+ u16 __iomem *pcmsycm;
++ int hchan;
++ int lchan;
+ u16 val;
+
+ h2link = find_ext2_link(bus, true, AZX_REG_ML_LEPTR_ID_SDW);
+@@ -791,9 +793,17 @@ int hdac_bus_eml_sdw_map_stream_ch(struct hdac_bus *bus, int sublink, int y,
+ h2link->instance_offset * sublink +
+ AZX_REG_SDW_SHIM_PCMSyCM(y);
+
++ if (channel_mask) {
++ hchan = __fls(channel_mask);
++ lchan = __ffs(channel_mask);
++ } else {
++ hchan = 0;
++ lchan = 0;
++ }
++
+ mutex_lock(&h2link->eml_lock);
+
+- hdaml_shim_map_stream_ch(pcmsycm, 0, hweight32(channel_mask),
++ hdaml_shim_map_stream_ch(pcmsycm, lchan, hchan,
+ stream_id, dir);
+
+ mutex_unlock(&h2link->eml_lock);
+diff --git a/sound/usb/mixer_maps.c b/sound/usb/mixer_maps.c
+index f4bd1e8ae4b6c..23260aa1919d3 100644
+--- a/sound/usb/mixer_maps.c
++++ b/sound/usb/mixer_maps.c
+@@ -374,6 +374,15 @@ static const struct usbmix_name_map corsair_virtuoso_map[] = {
+ { 0 }
+ };
+
++/* Microsoft USB Link headset */
++/* a guess work: raw playback volume values are from 2 to 129 */
++static const struct usbmix_dB_map ms_usb_link_dB = { -3225, 0, true };
++static const struct usbmix_name_map ms_usb_link_map[] = {
++ { 9, NULL, .dB = &ms_usb_link_dB },
++ { 10, NULL }, /* Headset Capture volume; seems non-working, disabled */
++ { 0 } /* terminator */
++};
++
+ /* ASUS ROG Zenith II with Realtek ALC1220-VB */
+ static const struct usbmix_name_map asus_zenith_ii_map[] = {
+ { 19, NULL, 12 }, /* FU, Input Gain Pad - broken response, disabled */
+@@ -668,6 +677,11 @@ static const struct usbmix_ctl_map usbmix_ctl_maps[] = {
+ .id = USB_ID(0x1395, 0x0025),
+ .map = sennheiser_pc8_map,
+ },
++ {
++ /* Microsoft USB Link headset */
++ .id = USB_ID(0x045e, 0x083c),
++ .map = ms_usb_link_map,
++ },
+ { 0 } /* terminator */
+ };
+
+diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
+index 6cf55b7f7a041..4667d543f7481 100644
+--- a/sound/usb/quirks.c
++++ b/sound/usb/quirks.c
+@@ -1874,8 +1874,10 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip,
+
+ /* XMOS based USB DACs */
+ switch (chip->usb_id) {
+- case USB_ID(0x1511, 0x0037): /* AURALiC VEGA */
+- case USB_ID(0x21ed, 0xd75a): /* Accuphase DAC-60 option card */
++ case USB_ID(0x139f, 0x5504): /* Nagra DAC */
++ case USB_ID(0x20b1, 0x3089): /* Mola-Mola DAC */
++ case USB_ID(0x2522, 0x0007): /* LH Labs Geek Out 1V5 */
++ case USB_ID(0x2522, 0x0009): /* LH Labs Geek Pulse X Inifinity 2V0 */
+ case USB_ID(0x2522, 0x0012): /* LH Labs VI DAC Infinity */
+ case USB_ID(0x2772, 0x0230): /* Pro-Ject Pre Box S2 Digital */
+ if (fp->altsetting == 2)
+@@ -1885,14 +1887,18 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip,
+ case USB_ID(0x0d8c, 0x0316): /* Hegel HD12 DSD */
+ case USB_ID(0x10cb, 0x0103): /* The Bit Opus #3; with fp->dsd_raw */
+ case USB_ID(0x16d0, 0x06b2): /* NuPrime DAC-10 */
+- case USB_ID(0x16d0, 0x09dd): /* Encore mDSD */
++ case USB_ID(0x16d0, 0x06b4): /* NuPrime Audio HD-AVP/AVA */
+ case USB_ID(0x16d0, 0x0733): /* Furutech ADL Stratos */
++ case USB_ID(0x16d0, 0x09d8): /* NuPrime IDA-8 */
+ case USB_ID(0x16d0, 0x09db): /* NuPrime Audio DAC-9 */
++ case USB_ID(0x16d0, 0x09dd): /* Encore mDSD */
+ case USB_ID(0x1db5, 0x0003): /* Bryston BDA3 */
++ case USB_ID(0x20a0, 0x4143): /* WaveIO USB Audio 2.0 */
+ case USB_ID(0x22e1, 0xca01): /* HDTA Serenade DSD */
+ case USB_ID(0x249c, 0x9326): /* M2Tech Young MkIII */
+ case USB_ID(0x2616, 0x0106): /* PS Audio NuWave DAC */
+ case USB_ID(0x2622, 0x0041): /* Audiolab M-DAC+ */
++ case USB_ID(0x278b, 0x5100): /* Rotel RC-1590 */
+ case USB_ID(0x27f7, 0x3002): /* W4S DAC-2v2SE */
+ case USB_ID(0x29a2, 0x0086): /* Mutec MC3+ USB */
+ case USB_ID(0x6b42, 0x0042): /* MSB Technology */
+@@ -1902,9 +1908,6 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip,
+
+ /* Amanero Combo384 USB based DACs with native DSD support */
+ case USB_ID(0x16d0, 0x071a): /* Amanero - Combo384 */
+- case USB_ID(0x2ab6, 0x0004): /* T+A DAC8DSD-V2.0, MP1000E-V2.0, MP2000R-V2.0, MP2500R-V2.0, MP3100HV-V2.0 */
+- case USB_ID(0x2ab6, 0x0005): /* T+A USB HD Audio 1 */
+- case USB_ID(0x2ab6, 0x0006): /* T+A USB HD Audio 2 */
+ if (fp->altsetting == 2) {
+ switch (le16_to_cpu(chip->dev->descriptor.bcdDevice)) {
+ case 0x199:
+@@ -2011,6 +2014,9 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_IGNORE_CTL_ERROR),
+ DEVICE_FLG(0x041e, 0x4080, /* Creative Live Cam VF0610 */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
++ DEVICE_FLG(0x045e, 0x083c, /* MS USB Link headset */
++ QUIRK_FLAG_GET_SAMPLE_RATE | QUIRK_FLAG_CTL_MSG_DELAY |
++ QUIRK_FLAG_DISABLE_AUTOSUSPEND),
+ DEVICE_FLG(0x046d, 0x084c, /* Logitech ConferenceCam Connect */
+ QUIRK_FLAG_GET_SAMPLE_RATE | QUIRK_FLAG_CTL_MSG_DELAY_1M),
+ DEVICE_FLG(0x046d, 0x0991, /* Logitech QuickCam Pro */
+@@ -2046,6 +2052,9 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_IFACE_DELAY),
+ DEVICE_FLG(0x0644, 0x805f, /* TEAC Model 12 */
+ QUIRK_FLAG_FORCE_IFACE_RESET),
++ DEVICE_FLG(0x0644, 0x806b, /* TEAC UD-701 */
++ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY |
++ QUIRK_FLAG_IFACE_DELAY),
+ DEVICE_FLG(0x06f8, 0xb000, /* Hercules DJ Console (Windows Edition) */
+ QUIRK_FLAG_IGNORE_CTL_ERROR),
+ DEVICE_FLG(0x06f8, 0xd002, /* Hercules DJ Console (Macintosh Edition) */
+@@ -2084,6 +2093,8 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY),
+ DEVICE_FLG(0x154e, 0x3006, /* Marantz SA-14S1 */
+ QUIRK_FLAG_ITF_USB_DSD_DAC | QUIRK_FLAG_CTL_MSG_DELAY),
++ DEVICE_FLG(0x154e, 0x300b, /* Marantz SA-KI RUBY / SA-12 */
++ QUIRK_FLAG_DSD_RAW),
+ DEVICE_FLG(0x154e, 0x500e, /* Denon DN-X1600 */
+ QUIRK_FLAG_IGNORE_CLOCK_SOURCE),
+ DEVICE_FLG(0x1686, 0x00dd, /* Zoom R16/24 */
+@@ -2128,6 +2139,10 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER),
+ DEVICE_FLG(0x21b4, 0x0081, /* AudioQuest DragonFly */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
++ DEVICE_FLG(0x21b4, 0x0230, /* Ayre QB-9 Twenty */
++ QUIRK_FLAG_DSD_RAW),
++ DEVICE_FLG(0x21b4, 0x0232, /* Ayre QX-5 Twenty */
++ QUIRK_FLAG_DSD_RAW),
+ DEVICE_FLG(0x2522, 0x0007, /* LH Labs Geek Out HD Audio 1V5 */
+ QUIRK_FLAG_SET_IFACE_FIRST),
+ DEVICE_FLG(0x2708, 0x0002, /* Audient iD14 */
+@@ -2170,12 +2185,18 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_VALIDATE_RATES),
+ VENDOR_FLG(0x1235, /* Focusrite Novation */
+ QUIRK_FLAG_VALIDATE_RATES),
++ VENDOR_FLG(0x1511, /* AURALiC */
++ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0x152a, /* Thesycon devices */
+ QUIRK_FLAG_DSD_RAW),
++ VENDOR_FLG(0x18d1, /* iBasso devices */
++ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0x1de7, /* Phoenix Audio */
+ QUIRK_FLAG_GET_SAMPLE_RATE),
+ VENDOR_FLG(0x20b1, /* XMOS based devices */
+ QUIRK_FLAG_DSD_RAW),
++ VENDOR_FLG(0x21ed, /* Accuphase Laboratory */
++ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0x22d9, /* Oppo */
+ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0x23ba, /* Playback Design */
+@@ -2191,10 +2212,14 @@ static const struct usb_audio_quirk_flags_table quirk_flags_table[] = {
+ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0x2ab6, /* T+A devices */
+ QUIRK_FLAG_DSD_RAW),
++ VENDOR_FLG(0x2d87, /* Cayin device */
++ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0x3336, /* HEM devices */
+ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0x3353, /* Khadas devices */
+ QUIRK_FLAG_DSD_RAW),
++ VENDOR_FLG(0x35f4, /* MSB Technology */
++ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0x3842, /* EVGA */
+ QUIRK_FLAG_DSD_RAW),
+ VENDOR_FLG(0xc502, /* HiBy devices */
+diff --git a/tools/bpf/bpftool/skeleton/pid_iter.bpf.c b/tools/bpf/bpftool/skeleton/pid_iter.bpf.c
+index eb05ea53afb12..26004f0c5a6ae 100644
+--- a/tools/bpf/bpftool/skeleton/pid_iter.bpf.c
++++ b/tools/bpf/bpftool/skeleton/pid_iter.bpf.c
+@@ -15,6 +15,19 @@ enum bpf_obj_type {
+ BPF_OBJ_BTF,
+ };
+
++struct bpf_perf_link___local {
++ struct bpf_link link;
++ struct file *perf_file;
++} __attribute__((preserve_access_index));
++
++struct perf_event___local {
++ u64 bpf_cookie;
++} __attribute__((preserve_access_index));
++
++enum bpf_link_type___local {
++ BPF_LINK_TYPE_PERF_EVENT___local = 7,
++};
++
+ extern const void bpf_link_fops __ksym;
+ extern const void bpf_map_fops __ksym;
+ extern const void bpf_prog_fops __ksym;
+@@ -41,10 +54,10 @@ static __always_inline __u32 get_obj_id(void *ent, enum bpf_obj_type type)
+ /* could be used only with BPF_LINK_TYPE_PERF_EVENT links */
+ static __u64 get_bpf_cookie(struct bpf_link *link)
+ {
+- struct bpf_perf_link *perf_link;
+- struct perf_event *event;
++ struct bpf_perf_link___local *perf_link;
++ struct perf_event___local *event;
+
+- perf_link = container_of(link, struct bpf_perf_link, link);
++ perf_link = container_of(link, struct bpf_perf_link___local, link);
+ event = BPF_CORE_READ(perf_link, perf_file, private_data);
+ return BPF_CORE_READ(event, bpf_cookie);
+ }
+@@ -84,10 +97,13 @@ int iter(struct bpf_iter__task_file *ctx)
+ e.pid = task->tgid;
+ e.id = get_obj_id(file->private_data, obj_type);
+
+- if (obj_type == BPF_OBJ_LINK) {
++ if (obj_type == BPF_OBJ_LINK &&
++ bpf_core_enum_value_exists(enum bpf_link_type___local,
++ BPF_LINK_TYPE_PERF_EVENT___local)) {
+ struct bpf_link *link = (struct bpf_link *) file->private_data;
+
+- if (BPF_CORE_READ(link, type) == BPF_LINK_TYPE_PERF_EVENT) {
++ if (link->type == bpf_core_enum_value(enum bpf_link_type___local,
++ BPF_LINK_TYPE_PERF_EVENT___local)) {
+ e.has_bpf_cookie = true;
+ e.bpf_cookie = get_bpf_cookie(link);
+ }
+diff --git a/tools/bpf/bpftool/skeleton/profiler.bpf.c b/tools/bpf/bpftool/skeleton/profiler.bpf.c
+index ce5b65e07ab10..2f80edc682f11 100644
+--- a/tools/bpf/bpftool/skeleton/profiler.bpf.c
++++ b/tools/bpf/bpftool/skeleton/profiler.bpf.c
+@@ -4,6 +4,12 @@
+ #include <bpf/bpf_helpers.h>
+ #include <bpf/bpf_tracing.h>
+
++struct bpf_perf_event_value___local {
++ __u64 counter;
++ __u64 enabled;
++ __u64 running;
++} __attribute__((preserve_access_index));
++
+ /* map of perf event fds, num_cpu * num_metric entries */
+ struct {
+ __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
+@@ -15,14 +21,14 @@ struct {
+ struct {
+ __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+ __uint(key_size, sizeof(u32));
+- __uint(value_size, sizeof(struct bpf_perf_event_value));
++ __uint(value_size, sizeof(struct bpf_perf_event_value___local));
+ } fentry_readings SEC(".maps");
+
+ /* accumulated readings */
+ struct {
+ __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+ __uint(key_size, sizeof(u32));
+- __uint(value_size, sizeof(struct bpf_perf_event_value));
++ __uint(value_size, sizeof(struct bpf_perf_event_value___local));
+ } accum_readings SEC(".maps");
+
+ /* sample counts, one per cpu */
+@@ -39,7 +45,7 @@ const volatile __u32 num_metric = 1;
+ SEC("fentry/XXX")
+ int BPF_PROG(fentry_XXX)
+ {
+- struct bpf_perf_event_value *ptrs[MAX_NUM_MATRICS];
++ struct bpf_perf_event_value___local *ptrs[MAX_NUM_MATRICS];
+ u32 key = bpf_get_smp_processor_id();
+ u32 i;
+
+@@ -53,10 +59,10 @@ int BPF_PROG(fentry_XXX)
+ }
+
+ for (i = 0; i < num_metric && i < MAX_NUM_MATRICS; i++) {
+- struct bpf_perf_event_value reading;
++ struct bpf_perf_event_value___local reading;
+ int err;
+
+- err = bpf_perf_event_read_value(&events, key, &reading,
++ err = bpf_perf_event_read_value(&events, key, (void *)&reading,
+ sizeof(reading));
+ if (err)
+ return 0;
+@@ -68,14 +74,14 @@ int BPF_PROG(fentry_XXX)
+ }
+
+ static inline void
+-fexit_update_maps(u32 id, struct bpf_perf_event_value *after)
++fexit_update_maps(u32 id, struct bpf_perf_event_value___local *after)
+ {
+- struct bpf_perf_event_value *before, diff;
++ struct bpf_perf_event_value___local *before, diff;
+
+ before = bpf_map_lookup_elem(&fentry_readings, &id);
+ /* only account samples with a valid fentry_reading */
+ if (before && before->counter) {
+- struct bpf_perf_event_value *accum;
++ struct bpf_perf_event_value___local *accum;
+
+ diff.counter = after->counter - before->counter;
+ diff.enabled = after->enabled - before->enabled;
+@@ -93,7 +99,7 @@ fexit_update_maps(u32 id, struct bpf_perf_event_value *after)
+ SEC("fexit/XXX")
+ int BPF_PROG(fexit_XXX)
+ {
+- struct bpf_perf_event_value readings[MAX_NUM_MATRICS];
++ struct bpf_perf_event_value___local readings[MAX_NUM_MATRICS];
+ u32 cpu = bpf_get_smp_processor_id();
+ u32 i, zero = 0;
+ int err;
+@@ -102,7 +108,8 @@ int BPF_PROG(fexit_XXX)
+ /* read all events before updating the maps, to reduce error */
+ for (i = 0; i < num_metric && i < MAX_NUM_MATRICS; i++) {
+ err = bpf_perf_event_read_value(&events, cpu + i * num_cpu,
+- readings + i, sizeof(*readings));
++ (void *)(readings + i),
++ sizeof(*readings));
+ if (err)
+ return 0;
+ }
+diff --git a/tools/hv/vmbus_testing b/tools/hv/vmbus_testing
+index e7212903dd1d9..4467979d8f699 100755
+--- a/tools/hv/vmbus_testing
++++ b/tools/hv/vmbus_testing
+@@ -164,7 +164,7 @@ def recursive_file_lookup(path, file_map):
+ def get_all_devices_test_status(file_map):
+
+ for device in file_map:
+- if (get_test_state(locate_state(device, file_map)) is 1):
++ if (get_test_state(locate_state(device, file_map)) == 1):
+ print("Testing = ON for: {}"
+ .format(device.split("/")[5]))
+ else:
+@@ -203,7 +203,7 @@ def write_test_files(path, value):
+ def set_test_state(state_path, state_value, quiet):
+
+ write_test_files(state_path, state_value)
+- if (get_test_state(state_path) is 1):
++ if (get_test_state(state_path) == 1):
+ if (not quiet):
+ print("Testing = ON for device: {}"
+ .format(state_path.split("/")[5]))
+diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
+index a27f6e9ccce75..2a4dbe7d9b3d4 100644
+--- a/tools/lib/bpf/libbpf.c
++++ b/tools/lib/bpf/libbpf.c
+@@ -6136,7 +6136,11 @@ static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_progra
+ if (main_prog == subprog)
+ return 0;
+ relos = libbpf_reallocarray(main_prog->reloc_desc, new_cnt, sizeof(*relos));
+- if (!relos)
++ /* if new count is zero, reallocarray can return a valid NULL result;
++ * in this case the previous pointer will be freed, so we *have to*
++ * reassign old pointer to the new value (even if it's NULL)
++ */
++ if (!relos && new_cnt)
+ return -ENOMEM;
+ if (subprog->nr_reloc)
+ memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc,
+@@ -8504,7 +8508,8 @@ int bpf_program__set_insns(struct bpf_program *prog,
+ return -EBUSY;
+
+ insns = libbpf_reallocarray(prog->insns, new_insn_cnt, sizeof(*insns));
+- if (!insns) {
++ /* NULL is a valid return from reallocarray if the new count is zero */
++ if (!insns && new_insn_cnt) {
+ pr_warn("prog '%s': failed to realloc prog code\n", prog->name);
+ return -ENOMEM;
+ }
+@@ -8534,13 +8539,31 @@ enum bpf_prog_type bpf_program__type(const struct bpf_program *prog)
+ return prog->type;
+ }
+
++static size_t custom_sec_def_cnt;
++static struct bpf_sec_def *custom_sec_defs;
++static struct bpf_sec_def custom_fallback_def;
++static bool has_custom_fallback_def;
++static int last_custom_sec_def_handler_id;
++
+ int bpf_program__set_type(struct bpf_program *prog, enum bpf_prog_type type)
+ {
+ if (prog->obj->loaded)
+ return libbpf_err(-EBUSY);
+
++ /* if type is not changed, do nothing */
++ if (prog->type == type)
++ return 0;
++
+ prog->type = type;
+- prog->sec_def = NULL;
++
++ /* If a program type was changed, we need to reset associated SEC()
++ * handler, as it will be invalid now. The only exception is a generic
++ * fallback handler, which by definition is program type-agnostic and
++ * is a catch-all custom handler, optionally set by the application,
++ * so should be able to handle any type of BPF program.
++ */
++ if (prog->sec_def != &custom_fallback_def)
++ prog->sec_def = NULL;
+ return 0;
+ }
+
+@@ -8716,13 +8739,6 @@ static const struct bpf_sec_def section_defs[] = {
+ SEC_DEF("netfilter", NETFILTER, BPF_NETFILTER, SEC_NONE),
+ };
+
+-static size_t custom_sec_def_cnt;
+-static struct bpf_sec_def *custom_sec_defs;
+-static struct bpf_sec_def custom_fallback_def;
+-static bool has_custom_fallback_def;
+-
+-static int last_custom_sec_def_handler_id;
+-
+ int libbpf_register_prog_handler(const char *sec,
+ enum bpf_prog_type prog_type,
+ enum bpf_attach_type exp_attach_type,
+@@ -8802,7 +8818,11 @@ int libbpf_unregister_prog_handler(int handler_id)
+
+ /* try to shrink the array, but it's ok if we couldn't */
+ sec_defs = libbpf_reallocarray(custom_sec_defs, custom_sec_def_cnt, sizeof(*sec_defs));
+- if (sec_defs)
++ /* if new count is zero, reallocarray can return a valid NULL result;
++ * in this case the previous pointer will be freed, so we *have to*
++ * reassign old pointer to the new value (even if it's NULL)
++ */
++ if (sec_defs || custom_sec_def_cnt == 0)
+ custom_sec_defs = sec_defs;
+
+ return 0;
+diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
+index 086eef355ab3d..1af77f9935833 100644
+--- a/tools/lib/bpf/usdt.c
++++ b/tools/lib/bpf/usdt.c
+@@ -852,8 +852,11 @@ static int bpf_link_usdt_detach(struct bpf_link *link)
+ * system is so exhausted on memory, it's the least of user's
+ * concerns, probably.
+ * So just do our best here to return those IDs to usdt_manager.
++ * Another edge case when we can legitimately get NULL is when
++ * new_cnt is zero, which can happen in some edge cases, so we
++ * need to be careful about that.
+ */
+- if (new_free_ids) {
++ if (new_free_ids || new_cnt == 0) {
+ memcpy(new_free_ids + man->free_spec_cnt, usdt_link->spec_ids,
+ usdt_link->spec_cnt * sizeof(*usdt_link->spec_ids));
+ man->free_spec_ids = new_free_ids;
+diff --git a/tools/testing/radix-tree/multiorder.c b/tools/testing/radix-tree/multiorder.c
+index e00520cc63498..cffaf2245d4f1 100644
+--- a/tools/testing/radix-tree/multiorder.c
++++ b/tools/testing/radix-tree/multiorder.c
+@@ -159,7 +159,7 @@ void multiorder_tagged_iteration(struct xarray *xa)
+ item_kill_tree(xa);
+ }
+
+-bool stop_iteration = false;
++bool stop_iteration;
+
+ static void *creator_func(void *ptr)
+ {
+@@ -201,6 +201,7 @@ static void multiorder_iteration_race(struct xarray *xa)
+ pthread_t worker_thread[num_threads];
+ int i;
+
++ stop_iteration = false;
+ pthread_create(&worker_thread[0], NULL, &creator_func, xa);
+ for (i = 1; i < num_threads; i++)
+ pthread_create(&worker_thread[i], NULL, &iterator_func, xa);
+@@ -211,6 +212,61 @@ static void multiorder_iteration_race(struct xarray *xa)
+ item_kill_tree(xa);
+ }
+
++static void *load_creator(void *ptr)
++{
++ /* 'order' is set up to ensure we have sibling entries */
++ unsigned int order;
++ struct radix_tree_root *tree = ptr;
++ int i;
++
++ rcu_register_thread();
++ item_insert_order(tree, 3 << RADIX_TREE_MAP_SHIFT, 0);
++ item_insert_order(tree, 2 << RADIX_TREE_MAP_SHIFT, 0);
++ for (i = 0; i < 10000; i++) {
++ for (order = 1; order < RADIX_TREE_MAP_SHIFT; order++) {
++ unsigned long index = (3 << RADIX_TREE_MAP_SHIFT) -
++ (1 << order);
++ item_insert_order(tree, index, order);
++ item_delete_rcu(tree, index);
++ }
++ }
++ rcu_unregister_thread();
++
++ stop_iteration = true;
++ return NULL;
++}
++
++static void *load_worker(void *ptr)
++{
++ unsigned long index = (3 << RADIX_TREE_MAP_SHIFT) - 1;
++
++ rcu_register_thread();
++ while (!stop_iteration) {
++ struct item *item = xa_load(ptr, index);
++ assert(!xa_is_internal(item));
++ }
++ rcu_unregister_thread();
++
++ return NULL;
++}
++
++static void load_race(struct xarray *xa)
++{
++ const int num_threads = sysconf(_SC_NPROCESSORS_ONLN) * 4;
++ pthread_t worker_thread[num_threads];
++ int i;
++
++ stop_iteration = false;
++ pthread_create(&worker_thread[0], NULL, &load_creator, xa);
++ for (i = 1; i < num_threads; i++)
++ pthread_create(&worker_thread[i], NULL, &load_worker, xa);
++
++ for (i = 0; i < num_threads; i++)
++ pthread_join(worker_thread[i], NULL);
++
++ item_kill_tree(xa);
++}
++
+ static DEFINE_XARRAY(array);
+
+ void multiorder_checks(void)
+@@ -218,12 +274,20 @@ void multiorder_checks(void)
+ multiorder_iteration(&array);
+ multiorder_tagged_iteration(&array);
+ multiorder_iteration_race(&array);
++ load_race(&array);
+
+ radix_tree_cpu_dead(0);
+ }
+
+-int __weak main(void)
++int __weak main(int argc, char **argv)
+ {
++ int opt;
++
++ while ((opt = getopt(argc, argv, "ls:v")) != -1) {
++ if (opt == 'v')
++ test_verbose++;
++ }
++
+ rcu_register_thread();
+ radix_tree_init();
+ multiorder_checks();
+diff --git a/tools/testing/selftests/bpf/benchs/run_bench_rename.sh b/tools/testing/selftests/bpf/benchs/run_bench_rename.sh
+index 16f774b1cdbed..7b281dbe41656 100755
+--- a/tools/testing/selftests/bpf/benchs/run_bench_rename.sh
++++ b/tools/testing/selftests/bpf/benchs/run_bench_rename.sh
+@@ -2,7 +2,7 @@
+
+ set -eufo pipefail
+
+-for i in base kprobe kretprobe rawtp fentry fexit fmodret
++for i in base kprobe kretprobe rawtp fentry fexit
+ do
+ summary=$(sudo ./bench -w2 -d5 -a rename-$i | tail -n1 | cut -d'(' -f1 | cut -d' ' -f3-)
+ printf "%-10s: %s\n" $i "$summary"
+diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
+index c8ba4009e4ab9..b30ff6b3b81ae 100644
+--- a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
++++ b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
+@@ -123,12 +123,13 @@ static void test_bpf_nf_ct(int mode)
+ ASSERT_EQ(skel->data->test_snat_addr, 0, "Test for source natting");
+ ASSERT_EQ(skel->data->test_dnat_addr, 0, "Test for destination natting");
+ end:
+- if (srv_client_fd != -1)
+- close(srv_client_fd);
+ if (client_fd != -1)
+ close(client_fd);
++ if (srv_client_fd != -1)
++ close(srv_client_fd);
+ if (srv_fd != -1)
+ close(srv_fd);
++
+ snprintf(cmd, sizeof(cmd), iptables, "-D");
+ system(cmd);
+ test_bpf_nf__destroy(skel);
+diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
+index a543742cd7bd1..2eb71559713c9 100644
+--- a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
++++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
+@@ -173,8 +173,8 @@ static void verify_fail(struct kfunc_test_params *param)
+ case tc_test:
+ topts.data_in = &pkt_v4;
+ topts.data_size_in = sizeof(pkt_v4);
+- break;
+ topts.repeat = 1;
++ break;
+ }
+
+ skel = kfunc_call_fail__open_opts(&opts);
+diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect.h b/tools/testing/selftests/bpf/progs/test_cls_redirect.h
+index 76eab0aacba0c..233b089d1fbac 100644
+--- a/tools/testing/selftests/bpf/progs/test_cls_redirect.h
++++ b/tools/testing/selftests/bpf/progs/test_cls_redirect.h
+@@ -12,6 +12,15 @@
+ #include <linux/ipv6.h>
+ #include <linux/udp.h>
+
++/* offsetof() is used in static asserts, and the libbpf-redefined CO-RE
++ * friendly version breaks compilation for older clang versions <= 15
++ * when invoked in a static assert. Restore original here.
++ */
++#ifdef offsetof
++#undef offsetof
++#define offsetof(type, member) __builtin_offsetof(type, member)
++#endif
++
+ struct gre_base_hdr {
+ uint16_t flags;
+ uint16_t protocol;
+diff --git a/tools/testing/selftests/futex/functional/futex_wait_timeout.c b/tools/testing/selftests/futex/functional/futex_wait_timeout.c
+index 3651ce17beeb9..d183f878360bc 100644
+--- a/tools/testing/selftests/futex/functional/futex_wait_timeout.c
++++ b/tools/testing/selftests/futex/functional/futex_wait_timeout.c
+@@ -24,6 +24,7 @@
+
+ static long timeout_ns = 100000; /* 100us default timeout */
+ static futex_t futex_pi;
++static pthread_barrier_t barrier;
+
+ void usage(char *prog)
+ {
+@@ -48,6 +49,8 @@ void *get_pi_lock(void *arg)
+ if (ret != 0)
+ error("futex_lock_pi failed\n", ret);
+
++ pthread_barrier_wait(&barrier);
++
+ /* Blocks forever */
+ ret = futex_wait(&lock, 0, NULL, 0);
+ error("futex_wait failed\n", ret);
+@@ -130,6 +133,7 @@ int main(int argc, char *argv[])
+ basename(argv[0]));
+ ksft_print_msg("\tArguments: timeout=%ldns\n", timeout_ns);
+
++ pthread_barrier_init(&barrier, NULL, 2);
+ pthread_create(&thread, NULL, get_pi_lock, NULL);
+
+ /* initialize relative timeout */
+@@ -163,6 +167,9 @@ int main(int argc, char *argv[])
+ res = futex_wait_requeue_pi(&f1, f1, &futex_pi, &to, 0);
+ test_timeout(res, &ret, "futex_wait_requeue_pi monotonic", ETIMEDOUT);
+
++ /* Wait until the other thread calls futex_lock_pi() */
++ pthread_barrier_wait(&barrier);
++ pthread_barrier_destroy(&barrier);
+ /*
+ * FUTEX_LOCK_PI with CLOCK_REALTIME
+ * Due to historical reasons, FUTEX_LOCK_PI supports only realtime
+diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
+index 5fd49ad0c696f..e05ac82610467 100644
+--- a/tools/testing/selftests/kselftest_harness.h
++++ b/tools/testing/selftests/kselftest_harness.h
+@@ -938,7 +938,11 @@ void __wait_for_test(struct __test_metadata *t)
+ fprintf(TH_LOG_STREAM,
+ "# %s: Test terminated by timeout\n", t->name);
+ } else if (WIFEXITED(status)) {
+- if (t->termsig != -1) {
++ if (WEXITSTATUS(status) == 255) {
++ /* SKIP */
++ t->passed = 1;
++ t->skip = 1;
++ } else if (t->termsig != -1) {
+ t->passed = 0;
+ fprintf(TH_LOG_STREAM,
+ "# %s: Test exited normally instead of by signal (code: %d)\n",
+@@ -950,11 +954,6 @@ void __wait_for_test(struct __test_metadata *t)
+ case 0:
+ t->passed = 1;
+ break;
+- /* SKIP */
+- case 255:
+- t->passed = 1;
+- t->skip = 1;
+- break;
+ /* Other failure, assume step report. */
+ default:
+ t->passed = 0;
+diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
+index dba0e8ba002f8..8b7390ad81d11 100644
+--- a/tools/testing/selftests/memfd/memfd_test.c
++++ b/tools/testing/selftests/memfd/memfd_test.c
+@@ -1145,8 +1145,25 @@ static void test_sysctl_child(void)
+
+ printf("%s sysctl 2\n", memfd_str);
+ sysctl_assert_write("2");
+- mfd_fail_new("kern_memfd_sysctl_2",
+- MFD_CLOEXEC | MFD_ALLOW_SEALING);
++ mfd_fail_new("kern_memfd_sysctl_2_exec",
++ MFD_EXEC | MFD_CLOEXEC | MFD_ALLOW_SEALING);
++
++ fd = mfd_assert_new("kern_memfd_sysctl_2_dfl",
++ mfd_def_size,
++ MFD_CLOEXEC | MFD_ALLOW_SEALING);
++ mfd_assert_mode(fd, 0666);
++ mfd_assert_has_seals(fd, F_SEAL_EXEC);
++ mfd_fail_chmod(fd, 0777);
++ close(fd);
++
++ fd = mfd_assert_new("kern_memfd_sysctl_2_noexec_seal",
++ mfd_def_size,
++ MFD_NOEXEC_SEAL | MFD_CLOEXEC | MFD_ALLOW_SEALING);
++ mfd_assert_mode(fd, 0666);
++ mfd_assert_has_seals(fd, F_SEAL_EXEC);
++ mfd_fail_chmod(fd, 0777);
++ close(fd);
++
+ sysctl_fail_write("0");
+ sysctl_fail_write("1");
+ }
+@@ -1202,7 +1219,24 @@ static pid_t spawn_newpid_thread(unsigned int flags, int (*fn)(void *))
+
+ static void join_newpid_thread(pid_t pid)
+ {
+- waitpid(pid, NULL, 0);
++ int wstatus;
++
++ if (waitpid(pid, &wstatus, 0) < 0) {
++ printf("newpid thread: waitpid() failed: %m\n");
++ abort();
++ }
++
++ if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != 0) {
++ printf("newpid thread: exited with non-zero error code %d\n",
++ WEXITSTATUS(wstatus));
++ abort();
++ }
++
++ if (WIFSIGNALED(wstatus)) {
++ printf("newpid thread: killed by signal %d\n",
++ WTERMSIG(wstatus));
++ abort();
++ }
+ }
+
+ /*
+diff --git a/tools/testing/selftests/resctrl/Makefile b/tools/testing/selftests/resctrl/Makefile
+index 73d53257df42f..5073dbc961258 100644
+--- a/tools/testing/selftests/resctrl/Makefile
++++ b/tools/testing/selftests/resctrl/Makefile
+@@ -7,4 +7,4 @@ TEST_GEN_PROGS := resctrl_tests
+
+ include ../lib.mk
+
+-$(OUTPUT)/resctrl_tests: $(wildcard *.c)
++$(OUTPUT)/resctrl_tests: $(wildcard *.[ch])
+diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
+index 8a4fe8693be63..289b619116fec 100644
+--- a/tools/testing/selftests/resctrl/cache.c
++++ b/tools/testing/selftests/resctrl/cache.c
+@@ -87,21 +87,19 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
+ static int get_llc_perf(unsigned long *llc_perf_miss)
+ {
+ __u64 total_misses;
++ int ret;
+
+ /* Stop counters after one span to get miss rate */
+
+ ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
+
+- if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
++ ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
++ if (ret == -1) {
+ perror("Could not get llc misses through perf");
+-
+ return -1;
+ }
+
+ total_misses = rf_cqm.values[0].value;
+-
+- close(fd_lm);
+-
+ *llc_perf_miss = total_misses;
+
+ return 0;
+@@ -253,19 +251,25 @@ int cat_val(struct resctrl_val_param *param)
+ memflush, operation, resctrl_val)) {
+ fprintf(stderr, "Error-running fill buffer\n");
+ ret = -1;
+- break;
++ goto pe_close;
+ }
+
+ sleep(1);
+ ret = measure_cache_vals(param, bm_pid);
+ if (ret)
+- break;
++ goto pe_close;
++
++ close(fd_lm);
+ } else {
+ break;
+ }
+ }
+
+ return ret;
++
++pe_close:
++ close(fd_lm);
++ return ret;
+ }
+
+ /*
+diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
+index 341cc93ca84c4..3b328c8448964 100644
+--- a/tools/testing/selftests/resctrl/fill_buf.c
++++ b/tools/testing/selftests/resctrl/fill_buf.c
+@@ -177,12 +177,13 @@ fill_cache(unsigned long long buf_size, int malloc_and_init, int memflush,
+ else
+ ret = fill_cache_write(start_ptr, end_ptr, resctrl_val);
+
++ free(startptr);
++
+ if (ret) {
+ printf("\n Error in fill cache read/write...\n");
+ return -1;
+ }
+
+- free(startptr);
+
+ return 0;
+ }
+diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
+index 87e39456dee08..f455f0b7e314b 100644
+--- a/tools/testing/selftests/resctrl/resctrl.h
++++ b/tools/testing/selftests/resctrl/resctrl.h
+@@ -43,6 +43,7 @@
+ do { \
+ perror(err_msg); \
+ kill(ppid, SIGKILL); \
++ umount_resctrlfs(); \
+ exit(EXIT_FAILURE); \
+ } while (0)
+
+diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
+index 9584eb57e0eda..365d30779768a 100644
+--- a/virt/kvm/vfio.c
++++ b/virt/kvm/vfio.c
+@@ -21,7 +21,7 @@
+ #include <asm/kvm_ppc.h>
+ #endif
+
+-struct kvm_vfio_group {
++struct kvm_vfio_file {
+ struct list_head node;
+ struct file *file;
+ #ifdef CONFIG_SPAPR_TCE_IOMMU
+@@ -30,7 +30,7 @@ struct kvm_vfio_group {
+ };
+
+ struct kvm_vfio {
+- struct list_head group_list;
++ struct list_head file_list;
+ struct mutex lock;
+ bool noncoherent;
+ };
+@@ -98,34 +98,35 @@ static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file)
+ }
+
+ static void kvm_spapr_tce_release_vfio_group(struct kvm *kvm,
+- struct kvm_vfio_group *kvg)
++ struct kvm_vfio_file *kvf)
+ {
+- if (WARN_ON_ONCE(!kvg->iommu_group))
++ if (WARN_ON_ONCE(!kvf->iommu_group))
+ return;
+
+- kvm_spapr_tce_release_iommu_group(kvm, kvg->iommu_group);
+- iommu_group_put(kvg->iommu_group);
+- kvg->iommu_group = NULL;
++ kvm_spapr_tce_release_iommu_group(kvm, kvf->iommu_group);
++ iommu_group_put(kvf->iommu_group);
++ kvf->iommu_group = NULL;
+ }
+ #endif
+
+ /*
+- * Groups can use the same or different IOMMU domains. If the same then
+- * adding a new group may change the coherency of groups we've previously
+- * been told about. We don't want to care about any of that so we retest
+- * each group and bail as soon as we find one that's noncoherent. This
+- * means we only ever [un]register_noncoherent_dma once for the whole device.
++ * Groups/devices can use the same or different IOMMU domains. If the same
++ * then adding a new group/device may change the coherency of groups/devices
++ * we've previously been told about. We don't want to care about any of
++ * that so we retest each group/device and bail as soon as we find one that's
++ * noncoherent. This means we only ever [un]register_noncoherent_dma once
++ * for the whole device.
+ */
+ static void kvm_vfio_update_coherency(struct kvm_device *dev)
+ {
+ struct kvm_vfio *kv = dev->private;
+ bool noncoherent = false;
+- struct kvm_vfio_group *kvg;
++ struct kvm_vfio_file *kvf;
+
+ mutex_lock(&kv->lock);
+
+- list_for_each_entry(kvg, &kv->group_list, node) {
+- if (!kvm_vfio_file_enforced_coherent(kvg->file)) {
++ list_for_each_entry(kvf, &kv->file_list, node) {
++ if (!kvm_vfio_file_enforced_coherent(kvf->file)) {
+ noncoherent = true;
+ break;
+ }
+@@ -143,10 +144,10 @@ static void kvm_vfio_update_coherency(struct kvm_device *dev)
+ mutex_unlock(&kv->lock);
+ }
+
+-static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
++static int kvm_vfio_file_add(struct kvm_device *dev, unsigned int fd)
+ {
+ struct kvm_vfio *kv = dev->private;
+- struct kvm_vfio_group *kvg;
++ struct kvm_vfio_file *kvf;
+ struct file *filp;
+ int ret;
+
+@@ -162,27 +163,27 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd)
+
+ mutex_lock(&kv->lock);
+
+- list_for_each_entry(kvg, &kv->group_list, node) {
+- if (kvg->file == filp) {
++ list_for_each_entry(kvf, &kv->file_list, node) {
++ if (kvf->file == filp) {
+ ret = -EEXIST;
+ goto err_unlock;
+ }
+ }
+
+- kvg = kzalloc(sizeof(*kvg), GFP_KERNEL_ACCOUNT);
+- if (!kvg) {
++ kvf = kzalloc(sizeof(*kvf), GFP_KERNEL_ACCOUNT);
++ if (!kvf) {
+ ret = -ENOMEM;
+ goto err_unlock;
+ }
+
+- kvg->file = filp;
+- list_add_tail(&kvg->node, &kv->group_list);
++ kvf->file = filp;
++ list_add_tail(&kvf->node, &kv->file_list);
+
+ kvm_arch_start_assignment(dev->kvm);
++ kvm_vfio_file_set_kvm(kvf->file, dev->kvm);
+
+ mutex_unlock(&kv->lock);
+
+- kvm_vfio_file_set_kvm(kvg->file, dev->kvm);
+ kvm_vfio_update_coherency(dev);
+
+ return 0;
+@@ -193,10 +194,10 @@ err_fput:
+ return ret;
+ }
+
+-static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
++static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd)
+ {
+ struct kvm_vfio *kv = dev->private;
+- struct kvm_vfio_group *kvg;
++ struct kvm_vfio_file *kvf;
+ struct fd f;
+ int ret;
+
+@@ -208,18 +209,18 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
+
+ mutex_lock(&kv->lock);
+
+- list_for_each_entry(kvg, &kv->group_list, node) {
+- if (kvg->file != f.file)
++ list_for_each_entry(kvf, &kv->file_list, node) {
++ if (kvf->file != f.file)
+ continue;
+
+- list_del(&kvg->node);
++ list_del(&kvf->node);
+ kvm_arch_end_assignment(dev->kvm);
+ #ifdef CONFIG_SPAPR_TCE_IOMMU
+- kvm_spapr_tce_release_vfio_group(dev->kvm, kvg);
++ kvm_spapr_tce_release_vfio_group(dev->kvm, kvf);
+ #endif
+- kvm_vfio_file_set_kvm(kvg->file, NULL);
+- fput(kvg->file);
+- kfree(kvg);
++ kvm_vfio_file_set_kvm(kvf->file, NULL);
++ fput(kvf->file);
++ kfree(kvf);
+ ret = 0;
+ break;
+ }
+@@ -234,12 +235,12 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd)
+ }
+
+ #ifdef CONFIG_SPAPR_TCE_IOMMU
+-static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev,
+- void __user *arg)
++static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev,
++ void __user *arg)
+ {
+ struct kvm_vfio_spapr_tce param;
+ struct kvm_vfio *kv = dev->private;
+- struct kvm_vfio_group *kvg;
++ struct kvm_vfio_file *kvf;
+ struct fd f;
+ int ret;
+
+@@ -254,20 +255,20 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev,
+
+ mutex_lock(&kv->lock);
+
+- list_for_each_entry(kvg, &kv->group_list, node) {
+- if (kvg->file != f.file)
++ list_for_each_entry(kvf, &kv->file_list, node) {
++ if (kvf->file != f.file)
+ continue;
+
+- if (!kvg->iommu_group) {
+- kvg->iommu_group = kvm_vfio_file_iommu_group(kvg->file);
+- if (WARN_ON_ONCE(!kvg->iommu_group)) {
++ if (!kvf->iommu_group) {
++ kvf->iommu_group = kvm_vfio_file_iommu_group(kvf->file);
++ if (WARN_ON_ONCE(!kvf->iommu_group)) {
+ ret = -EIO;
+ goto err_fdput;
+ }
+ }
+
+ ret = kvm_spapr_tce_attach_iommu_group(dev->kvm, param.tablefd,
+- kvg->iommu_group);
++ kvf->iommu_group);
+ break;
+ }
+
+@@ -278,8 +279,8 @@ err_fdput:
+ }
+ #endif
+
+-static int kvm_vfio_set_group(struct kvm_device *dev, long attr,
+- void __user *arg)
++static int kvm_vfio_set_file(struct kvm_device *dev, long attr,
++ void __user *arg)
+ {
+ int32_t __user *argp = arg;
+ int32_t fd;
+@@ -288,16 +289,16 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr,
+ case KVM_DEV_VFIO_GROUP_ADD:
+ if (get_user(fd, argp))
+ return -EFAULT;
+- return kvm_vfio_group_add(dev, fd);
++ return kvm_vfio_file_add(dev, fd);
+
+ case KVM_DEV_VFIO_GROUP_DEL:
+ if (get_user(fd, argp))
+ return -EFAULT;
+- return kvm_vfio_group_del(dev, fd);
++ return kvm_vfio_file_del(dev, fd);
+
+ #ifdef CONFIG_SPAPR_TCE_IOMMU
+ case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE:
+- return kvm_vfio_group_set_spapr_tce(dev, arg);
++ return kvm_vfio_file_set_spapr_tce(dev, arg);
+ #endif
+ }
+
+@@ -309,8 +310,8 @@ static int kvm_vfio_set_attr(struct kvm_device *dev,
+ {
+ switch (attr->group) {
+ case KVM_DEV_VFIO_GROUP:
+- return kvm_vfio_set_group(dev, attr->attr,
+- u64_to_user_ptr(attr->addr));
++ return kvm_vfio_set_file(dev, attr->attr,
++ u64_to_user_ptr(attr->addr));
+ }
+
+ return -ENXIO;
+@@ -339,16 +340,16 @@ static int kvm_vfio_has_attr(struct kvm_device *dev,
+ static void kvm_vfio_release(struct kvm_device *dev)
+ {
+ struct kvm_vfio *kv = dev->private;
+- struct kvm_vfio_group *kvg, *tmp;
++ struct kvm_vfio_file *kvf, *tmp;
+
+- list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) {
++ list_for_each_entry_safe(kvf, tmp, &kv->file_list, node) {
+ #ifdef CONFIG_SPAPR_TCE_IOMMU
+- kvm_spapr_tce_release_vfio_group(dev->kvm, kvg);
++ kvm_spapr_tce_release_vfio_group(dev->kvm, kvf);
+ #endif
+- kvm_vfio_file_set_kvm(kvg->file, NULL);
+- fput(kvg->file);
+- list_del(&kvg->node);
+- kfree(kvg);
++ kvm_vfio_file_set_kvm(kvf->file, NULL);
++ fput(kvf->file);
++ list_del(&kvf->node);
++ kfree(kvf);
+ kvm_arch_end_assignment(dev->kvm);
+ }
+
+@@ -382,7 +383,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type)
+ if (!kv)
+ return -ENOMEM;
+
+- INIT_LIST_HEAD(&kv->group_list);
++ INIT_LIST_HEAD(&kv->file_list);
+ mutex_init(&kv->lock);
+
+ dev->private = kv;
^ permalink raw reply related [flat|nested] 29+ messages in thread